Probability Theory and Statistics

Mean Value or Expected Value



The mean or expected value $$\mu$$ of a random variable is the long-run average of its outcomes over many independent repetitions of an experiment.

If a discrete random variable $$X$$ takes the values $$x_i$$ with probabilities $$P(x_i)$$, then its expected value $$\mu = E[X]$$ is:

\begin{equation}
\mu = E[X] = \sum_i x_i P(x_i)
\end{equation}
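As a minimal sketch of the sum above, the following Python snippet computes the expected value of a fair six-sided die. The die, the helper name `expected_value`, and the use of exact fractions are assumptions made for illustration only.

```python
from fractions import Fraction


def expected_value(values, probs):
    """Return sum_i x_i * P(x_i) for a discrete distribution."""
    if sum(probs) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in zip(values, probs))


# Assumed example: a fair six-sided die, each face with probability 1/6.
faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

mu = expected_value(faces, probs)
print(mu)  # 7/2
```

Exact fractions are used here so that the check on the probabilities is not affected by floating-point rounding.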

Variance

The variance of a random variable $$X$$ is the expected value of the squared deviation of $$X$$ from its expected value $$\mu$$:

\begin{equation}
Var(X) = E[(X - \mu)^2]
\end{equation}
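The definition can be sketched in Python for a discrete distribution by averaging the squared deviations from the mean, weighted by their probabilities. The fair die and the helper name `variance` are assumptions chosen for illustration.

```python
from fractions import Fraction


def variance(values, probs):
    """Return E[(X - mu)^2] for a discrete distribution."""
    mu = sum(x * p for x, p in zip(values, probs))
    return sum((x - mu) ** 2 * p for x, p in zip(values, probs))


# Assumed example: a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

var = variance(faces, probs)
print(var)  # 35/12
```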

Standard Deviation

The standard deviation $$\sigma$$ quantifies the amount of variation or dispersion of a set of data values $$X$$. It is the square root of the variance of $$X$$.

\begin{equation}
\sigma = \sqrt{E[(X - \mu)^2]} = \sqrt{Var(X)}
\end{equation}
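For a finite set of data values, this relationship can be checked with Python's standard `statistics` module. The sketch below assumes the data set is the whole population (each value equally likely), so it uses the population functions `pvariance` and `pstdev` rather than the sample versions; the data values are made up for illustration.

```python
import math
import statistics

# Assumed example data, treated as the complete population.
data = [1, 2, 3, 4, 5, 6]

var = statistics.pvariance(data)   # population variance E[(X - mu)^2]
sigma = statistics.pstdev(data)    # population standard deviation

# The standard deviation is the square root of the variance.
print(math.isclose(sigma, math.sqrt(var)))  # True
```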

Covariance

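The covariance measures the joint variability of two random variables $$X$$ and $$Y$$. Its sign shows the direction of their linear relationship: it is positive when the variables tend to move together and negative when they tend to move in opposite directions.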

\begin{equation}
cov(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]
\end{equation}
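A minimal Python sketch of this definition for paired data follows. It assumes that each observed pair is equally likely, so the expectation becomes a plain average (division by the number of pairs); the helper name `covariance` and the data are illustrative assumptions.

```python
def covariance(xs, ys):
    """Return E[(X - mu_X)(Y - mu_Y)], treating each pair as equally likely."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    return sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / n


# Assumed example data: ys grows with xs, ys_reversed shrinks as xs grows.
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]
ys_reversed = [8, 6, 4, 2]

print(covariance(xs, ys))           # 2.5
print(covariance(xs, ys_reversed))  # -2.5
```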

The normalized version of the covariance is the correlation coefficient.

Correlation of two random variables

The correlation coefficient (Pearson’s correlation coefficient) measures the linear relationship between two random variables $$X$$ and $$Y$$. It is defined as the covariance of $$X$$ and $$Y$$ divided by the product of the standard deviations of $$X$$ and $$Y$$:

\begin{equation}
corr(X, Y) = \frac{cov(X, Y)}{\sigma_X \sigma_Y} = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X \sigma_Y}
\end{equation}

The correlation is only defined if both standard deviations are finite and nonzero.

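The correlation is $$+1$$ in the case of a perfect direct linear relationship and $$-1$$ in the case of a perfect inverse linear relationship of the two random variables. If the variables are independent, the correlation is $$0$$. The converse does not hold in general: a correlation of $$0$$ only rules out a linear relationship, not dependence.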
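The following Python sketch computes the correlation coefficient for paired data, again assuming each pair is equally likely. The helper name `correlation` and the data sets are illustrative assumptions; the third data set is a dependent pair ($$Y = X^2$$) whose correlation is nevertheless zero.

```python
import math


def correlation(xs, ys):
    """Return cov(X, Y) / (sigma_X * sigma_Y) for equally likely pairs."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / n
    sigma_x = math.sqrt(sum((x - mu_x) ** 2 for x in xs) / n)
    sigma_y = math.sqrt(sum((y - mu_y) ** 2 for y in ys) / n)
    if sigma_x == 0 or sigma_y == 0:
        raise ValueError("correlation is undefined for zero standard deviation")
    return cov / (sigma_x * sigma_y)


# Assumed example data.
r_direct = correlation([1, 2, 3, 4], [2, 4, 6, 8])              # perfect direct line
r_inverse = correlation([1, 2, 3, 4], [8, 6, 4, 2])             # perfect inverse line
r_parabola = correlation([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])    # y = x^2

print(round(r_direct, 6), round(r_inverse, 6), round(r_parabola, 6))
```

The function raises an error when either standard deviation is zero, which mirrors the condition under which the correlation is defined.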