# Probability Distributions in Python

## Introduction

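• Learn probability terminology such as random variables, density curves, and probability functions.
• Learn about different probability distributions, their distribution functions, and some of their properties.
• Learn to create and plot these distributions in Python.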

## Random Variables

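For a discrete random variable that takes $k$ possible values with probabilities $p_1, p_2, \dots, p_k$, the probabilities must satisfy two requirements:

1. $0 \le p_i \le 1$ for each $i$.
2. $p_1 + p_2 + \dots + p_k = 1$.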
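
The two requirements on discrete probabilities can be checked numerically. The following is a minimal sketch; the choice of a binomial distribution with `n=10` and `p=0.8` is only an illustrative assumption, and any discrete distribution with finite support would serve equally well.

```python
import numpy as np
from scipy.stats import binom

# All values a Binomial(n=10, p=0.8) variable can take: 0, 1, ..., 10
values = np.arange(0, 11)
probs = binom.pmf(values, n=10, p=0.8)

# Requirement 1: every probability lies between 0 and 1
all_valid = bool(np.all((probs >= 0) & (probs <= 1)))

# Requirement 2: the probabilities sum to 1 (up to floating-point error)
total = probs.sum()

print(all_valid, np.isclose(total, 1.0))
```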

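For a continuous random variable, the density curve $p(x)$ must satisfy two requirements:

1. The curve has no negative values ($p(x) \ge 0$ for all $x$).
2. The total area under the curve equals $1$.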
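
Both density-curve requirements can also be verified numerically. This sketch assumes the standard normal density as the example curve and uses `scipy.integrate.quad` for the area; the grid bounds of -10 and 10 are an arbitrary choice for the non-negativity check.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

# Requirement 1: the density is never negative (checked on a grid of points)
x = np.linspace(-10, 10, 2001)
non_negative = bool(np.all(norm.pdf(x) >= 0))

# Requirement 2: the area under the curve is 1 (numerical integration
# over the whole real line; quad also returns an error estimate)
area, abs_err = quad(norm.pdf, -np.inf, np.inf)

print(non_negative, area)
```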

```python
# for inline plots in Jupyter
%matplotlib inline
# import matplotlib
import matplotlib.pyplot as plt
# for LaTeX equations
from IPython.display import Math, Latex
# for displaying images (IPython.core.display is deprecated for this import)
from IPython.display import Image
```
```python
# import seaborn
import seaborn as sns
# settings for seaborn plotting style
sns.set(color_codes=True)
# settings for seaborn plot sizes
sns.set(rc={'figure.figsize': (5, 5)})
```

## 1. Uniform Distribution

```python
# import uniform distribution
from scipy.stats import uniform
```

```python
# random numbers from uniform distribution
n = 10000
start = 10
width = 20
data_uniform = uniform.rvs(size=n, loc=start, scale=width)
```

```python
# distplot is deprecated since seaborn 0.11; histplot is its replacement
ax = sns.histplot(data_uniform, bins=100, kde=True, color='skyblue', alpha=1)
ax.set(xlabel='Uniform Distribution ', ylabel='Frequency')
```
```
[Text(0, 0.5, u'Frequency'), Text(0.5, 0, u'Uniform Distribution ')]
```
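
In SciPy, `loc` and `scale` for `uniform` describe the interval `[loc, loc + scale]`, so the values above correspond to a uniform distribution on `[10, 30]`. The sketch below checks this on a fresh sample; the `random_state=0` seed is an assumption added only to make the run reproducible.

```python
from scipy.stats import uniform

start, width = 10, 20  # same values as in the example above
sample = uniform.rvs(size=10000, loc=start, scale=width, random_state=0)

# Every draw falls inside [loc, loc + scale] = [10, 30]
in_range = bool((sample.min() >= start) and (sample.max() <= start + width))

# Theoretical mean and variance of Uniform(a, b): (a + b) / 2 and (b - a)**2 / 12
mean, var = uniform.stats(loc=start, scale=width, moments='mv')

print(in_range, float(mean), float(var), sample.mean())
```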

## 2. Normal Distribution

```python
from scipy.stats import norm
# generate random numbers from N(0, 1)
data_normal = norm.rvs(size=10000, loc=0, scale=1)
```

```python
# distplot is deprecated since seaborn 0.11; histplot is its replacement
ax = sns.histplot(data_normal, bins=100, kde=True, color='skyblue', alpha=1)
ax.set(xlabel='Normal Distribution', ylabel='Frequency')
```
```
[Text(0, 0.5, u'Frequency'), Text(0.5, 0, u'Normal Distribution')]
```

## 3. Gamma Distribution

```python
from scipy.stats import gamma
data_gamma = gamma.rvs(a=5, size=10000)
```

```python
# distplot is deprecated since seaborn 0.11; histplot is its replacement
ax = sns.histplot(data_gamma, kde=True, bins=100, color='skyblue', alpha=1)
ax.set(xlabel='Gamma Distribution', ylabel='Frequency')
```
```
[Text(0, 0.5, u'Frequency'), Text(0.5, 0, u'Gamma Distribution')]
```

## 4. Exponential Distribution

```python
from scipy.stats import expon
data_expon = expon.rvs(scale=1, loc=0, size=1000)
```

```python
# distplot is deprecated since seaborn 0.11; histplot is its replacement
ax = sns.histplot(data_expon, kde=True, bins=100, color='skyblue', alpha=1)
ax.set(xlabel='Exponential Distribution', ylabel='Frequency')
```
```
[Text(0, 0.5, u'Frequency'), Text(0.5, 0, u'Exponential Distribution')]
```
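
SciPy parameterizes `expon` by `scale` rather than by the rate $\lambda$, with `scale = 1 / λ`. The sketch below illustrates the relationship; the rate of `0.5` and the `random_state=0` seed are hypothetical values chosen only for illustration.

```python
from scipy.stats import expon

rate = 0.5          # hypothetical rate parameter (lambda), for illustration only
scale = 1 / rate    # SciPy's `scale` is the reciprocal of the rate

# The theoretical mean of an exponential distribution is 1 / lambda = scale
theoretical_mean = expon.mean(scale=scale)

# The sample mean of many draws should be close to the theoretical mean
sample = expon.rvs(scale=scale, size=100000, random_state=0)
sample_mean = sample.mean()

print(theoretical_mean, sample_mean)
```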

## 5. Poisson Distribution

```python
from scipy.stats import poisson
data_poisson = poisson.rvs(mu=3, size=10000)
```

```python
# distplot is deprecated since seaborn 0.11; histplot is its replacement
ax = sns.histplot(data_poisson, bins=30, kde=False, color='skyblue', alpha=1)
ax.set(xlabel='Poisson Distribution', ylabel='Frequency')
```
```
[Text(0, 0.5, u'Frequency'), Text(0.5, 0, u'Poisson Distribution')]
```

## 6. Binomial Distribution

```python
from scipy.stats import binom
data_binom = binom.rvs(n=10, p=0.8, size=10000)
```

```python
# distplot is deprecated since seaborn 0.11; histplot is its replacement
ax = sns.histplot(data_binom, kde=False, color='skyblue', alpha=1)
ax.set(xlabel='Binomial Distribution', ylabel='Frequency')
```
```
[Text(0, 0.5, u'Frequency'), Text(0.5, 0, u'Binomial Distribution')]
```

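The Poisson distribution is a limiting case of the binomial distribution under the following conditions:

1. The number of trials is indefinitely large, $n \to \infty$.
2. The probability of success is the same for each trial and indefinitely small, $p \to 0$.
3. $np = \lambda$ is finite.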
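
The three conditions above can be illustrated numerically: holding $np = \lambda$ fixed while $n$ grows, the binomial probabilities approach the Poisson probabilities. This is a minimal sketch; the value `lam = 3`, the grid of `n` values, and the range of counts compared are assumptions chosen for illustration.

```python
import numpy as np
from scipy.stats import binom, poisson

lam = 3                   # fixed lambda = n * p
k = np.arange(0, 21)      # counts at which the two probability functions are compared

max_gaps = []
for n in (10, 100, 1000, 10000):
    p = lam / n           # p shrinks as n grows, keeping n * p = lam
    # Largest pointwise difference between Binomial(n, p) and Poisson(lam)
    gap = np.max(np.abs(binom.pmf(k, n, p) - poisson.pmf(k, lam)))
    max_gaps.append(gap)
    print(n, gap)
```

The printed gap should shrink as `n` increases, which is what the limiting relationship predicts.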

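The normal distribution is another limiting form of the binomial distribution under the following conditions:

1. The number of trials is indefinitely large, $n \to \infty$.
2. Neither $p$ nor $q = 1 - p$ is indefinitely small.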
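
A similar numerical check works for the normal limit. The sketch below compares the binomial CDF with the CDF of a normal distribution that has the same mean $np$ and standard deviation $\sqrt{npq}$. The value `p = 0.5`, the grid of `n` values, and the continuity correction of `0.5` are assumptions made for this illustration.

```python
import numpy as np
from scipy.stats import binom, norm

p = 0.5          # hypothetical success probability, not close to 0 or 1
q = 1 - p

max_gaps = []
for n in (10, 100, 1000):
    k = np.arange(0, n + 1)
    mean, std = n * p, np.sqrt(n * p * q)
    # Largest difference between the binomial CDF and the matching normal CDF,
    # evaluated at k + 0.5 (continuity correction)
    gap = np.max(np.abs(binom.cdf(k, n, p) - norm.cdf(k + 0.5, loc=mean, scale=std)))
    max_gaps.append(gap)
    print(n, gap)
```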

## 7. Bernoulli Distribution

```python
from scipy.stats import bernoulli
data_bern = bernoulli.rvs(size=10000, p=0.6)
```

```python
# distplot is deprecated since seaborn 0.11; histplot is its replacement
ax = sns.histplot(data_bern, kde=False, color='skyblue', alpha=1)
ax.set(xlabel='Bernoulli Distribution', ylabel='Frequency')
```
```
[Text(0, 0.5, u'Frequency'), Text(0.5, 0, u'Bernoulli Distribution')]
```
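
A Bernoulli variable takes only the values 0 and 1, and it coincides with a binomial variable that has a single trial. The following sketch checks this by comparing the two probability functions, reusing `p = 0.6` from the example above.

```python
import numpy as np
from scipy.stats import bernoulli, binom

p = 0.6                       # same success probability as in the example above
outcomes = np.array([0, 1])   # the only two values a Bernoulli variable can take

bern_pmf = bernoulli.pmf(outcomes, p)    # probabilities of failure and success
binom_pmf = binom.pmf(outcomes, 1, p)    # binomial with a single trial (n=1)

same = bool(np.allclose(bern_pmf, binom_pmf))
print(bern_pmf, binom_pmf, same)
```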
