Distributions and Confidence Intervals
Normal Distribution
  - a.k.a. Gauss distribution, a.k.a. Bell Curve
 
  - many real-world measurements approximately follow a normal distribution
 
  - defined entirely by its mean and standard deviation
 

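For reference, the probability density function of a normal distribution with mean $\mu$ and standard deviation $\sigma$ is:

$$ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} $$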
Generating Normal Distribution
see notebook
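Since the notebook itself is not reproduced here, a minimal sketch of generating a normal distribution with NumPy (the mean, standard deviation, and seed below are arbitrary choices):

```python
import numpy as np

# draw 10,000 samples from a normal distribution with mean 5 and std 2
rng = np.random.default_rng(seed=0)  # seeded so the run is reproducible
samples = rng.normal(loc=5, scale=2, size=10_000)

# the sample statistics land close to the chosen parameters
print(samples.mean())  # ~5
print(samples.std())   # ~2
```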
Confidence Interval
  - A range of values defined so that there is a specified probability that a value drawn from the distribution lies within it.
 
  - e.g. a 95% confidence interval is a range of values that we are 95% certain contains the value.
 
  - A confidence interval can be calculated from the mean and standard deviation of a distribution.
 
For example, in Python the following code calculates a 95% confidence interval for a single draw from a distribution with a given mean and standard deviation:
import numpy as np
from scipy import stats
test = np.array([1, 2, 3, 2, 1, 4])
stats.norm.interval(0.95, loc=test.mean(), scale=test.std())
Confidence Interval of the Mean
  - A mean calculated from sample data has an error of its own
 
  - Often we are interested in quantifying this error with a confidence interval
 
  - In this case we need a statistic that estimates the deviation of the mean
 
  - This statistic is called the standard error.
    
Standard deviation vs standard error
 
  - standard deviation measures how far individual data points fall from the average
 
  - standard error measures how far the sample mean is likely to fall from the true mean of the distribution
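The difference can be checked numerically. A short sketch using an arbitrary example array; `scipy.stats.sem` computes the sample standard deviation (ddof=1) divided by the square root of n:

```python
import numpy as np
from scipy import stats

a = np.array([1, 2, 3, 2, 1, 4])  # arbitrary example data

std = a.std(ddof=1)   # sample standard deviation: spread of the data points
sem = stats.sem(a)    # standard error: expected deviation of the sample mean

# sem equals std / sqrt(n), so it shrinks as the sample grows
print(np.isclose(sem, std / np.sqrt(len(a))))  # True
```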
 
Confidence Interval of the Mean in Practice
  - A range of values defined so that there is a specified probability that the mean of the distribution lies within it.
 
  - e.g. a 95% confidence interval is a range of values that we are 95% certain contains the mean.
 
  - This confidence interval can be calculated from the mean and standard error of a distribution.
 
In Python:
import numpy as np
from scipy import stats
# assume a is a 1-D array of sample data
sample_mean = np.mean(a)
standard_error = stats.sem(a)
stats.t.interval(0.95, len(a) - 1, loc=sample_mean, scale=standard_error)