Probability Theory and Statistics

Mean Value or Expected Value

The mean or expected value \( \mu \) is the long-run average outcome of an experiment repeated many times.

If a random variable \( X \) can take the values \( x_i \) with probabilities \( P(x_i) \), then the expected value \( \mu = E[X] \) of the random variable is:

\[
\mu = E[X] = \sum_i x_i P(x_i)
\]
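As a small sketch, the sum above can be evaluated numerically; the fair six-sided die used here is an illustrative assumption, not an example from the text:

```python
# Expected value of a fair six-sided die: E[X] = sum_i x_i * P(x_i).
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6  # uniform probabilities: P(x_i) = 1/6 for each face

mu = sum(x * p for x, p in zip(values, probs))
print(mu)  # ≈ 3.5
```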

Variance

The variance of a random variable \( X \) is the expected value of the squared deviation from the expected value of \( X \):

\begin{equation}
Var(X) = E[(X - \mu)^2]
\end{equation}
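Continuing the hypothetical die example, the variance can be computed directly from the definition:

```python
# Variance of a fair six-sided die: Var(X) = E[(X - mu)^2].
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

mu = sum(x * p for x, p in zip(values, probs))               # E[X] = 3.5
var = sum((x - mu) ** 2 * p for x, p in zip(values, probs))  # E[(X - mu)^2]
print(var)  # ≈ 2.9167 (exactly 35/12)
```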

Standard Deviation

The standard deviation \( \sigma \) quantifies the amount of variation or dispersion of a set of data values \( X \). It is the square root of the variance of \( X \).

\begin{equation}
\sigma = \sqrt{E[(X - \mu)^2]} = \sqrt{Var(X)}
\end{equation}
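For a concrete data set (the values below are an illustrative assumption, weighted equally with \( 1/n \)), the standard deviation follows from the variance:

```python
import math

# Standard deviation of a small data set (population form, equal weights 1/n).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mu = sum(data) / len(data)                            # mean: 5.0
var = sum((x - mu) ** 2 for x in data) / len(data)    # Var(X) = 4.0
sigma = math.sqrt(var)                                # sigma = sqrt(Var(X))
print(sigma)  # 2.0
```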

Covariance

The covariance is a measure of the joint variability of two random variables. Its sign indicates the direction of the linear relationship between the two random variables.

\begin{equation}
cov(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]
\end{equation}
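A minimal sketch of this definition on paired data (the two hypothetical samples below are weighted equally with \( 1/n \)):

```python
# Population covariance of two paired samples: E[(X - mu_X)(Y - mu_Y)].
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # ys grows with xs, so the covariance is positive

n = len(xs)
mu_x = sum(xs) / n
mu_y = sum(ys) / n
cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / n
print(cov)  # 2.5
```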

The normalized version of the covariance is the correlation coefficient.

Correlation of two random variables

The correlation coefficient (Pearson's correlation coefficient) measures the linear relationship between two random variables \( X \) and \( Y \). It is defined as the covariance of \( X \) and \( Y \) divided by the product of the standard deviations of \( X \) and \( Y \):

\begin{equation}
corr(X, Y) = \frac{cov(X, Y)}{\sigma_X \sigma_Y} = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X \sigma_Y}
\end{equation}

The correlation is only defined if both standard deviations are finite and nonzero.

The correlation is \( +1 \) in the case of a perfect direct linear relationship and \( -1 \) in the case of a perfect inverse linear relationship of the two random variables. If the variables are independent, the correlation is \( 0 \); the converse does not hold in general, since the correlation only captures linear dependence.
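Putting the pieces together, the coefficient can be sketched on the same hypothetical paired samples used for the covariance; since \( y = 2x \) exactly, a correlation of \( +1 \) is expected:

```python
import math

# Pearson correlation: corr(X, Y) = cov(X, Y) / (sigma_X * sigma_Y).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # perfect direct linear relationship: y = 2x

n = len(xs)
mu_x = sum(xs) / n
mu_y = sum(ys) / n
cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / n
sigma_x = math.sqrt(sum((x - mu_x) ** 2 for x in xs) / n)
sigma_y = math.sqrt(sum((y - mu_y) ** 2 for y in ys) / n)

corr = cov / (sigma_x * sigma_y)
print(corr)  # ≈ 1.0
```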