# Statistics

Statistics is the discipline concerned with the collection, analysis, interpretation and presentation of data. The data describe some underlying phenomenon and are usually gathered by sampling observable values from a larger set, called the population. In finance, the data are often the returns of a security or the prices of an asset.

- Descriptive statistics aims to summarize a sample and is solely concerned with properties of the observed data. The indexes calculated most often concern two sets of properties of a distribution (sample or population):
  - Central tendency (location) seeks to characterize the distribution's central or typical value.
  - Dispersion (variability) characterizes the extent to which members of the distribution depart from its center and from each other.

- Inferential statistics uses the data to learn about the larger population that the observed sample is assumed to represent. Data analysis is used to deduce properties of an underlying probability distribution, which makes it possible to draw conclusions from data that are subject to random variation.

For the statistics that follow, let:

- \( X, X_{j}, X_{k} \) random variables
- \( x_{i} \) value of the \( i^{th} \) element in the sample
- \( n \) sample size
- \( N \) population size

## Single variable

### Mean

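The mean of a set of numbers is their arithmetic average: the sum of the values divided by their count. One distinguishes between the sample mean, computed from the \( n \) observed values, and the population mean, computed from all \( N \) members of the population.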
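
As a minimal sketch, the sample mean \( \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_{i} \) and the population mean \( \mu = \frac{1}{N} \sum_{i=1}^{N} x_{i} \) can be computed with the same function; only the data passed in differs. The symbols \( \bar{x} \) and \( \mu \), the function name `mean` and the example numbers are assumptions made for illustration and do not come from the original text.

```python
def mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    values = list(values)
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


# Hypothetical population of N = 6 daily returns, in percent.
population = [1.0, -2.0, 3.0, 0.5, -0.5, 4.0]

# A sample of n = 3 of those values (here simply the first three).
sample = population[:3]

population_mean = mean(population)  # uses all N values
sample_mean = mean(sample)          # uses only the n sampled values

print(population_mean, sample_mean)
```

The sample mean generally differs from the population mean; it is an estimate whose quality depends on how the sample was drawn.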

### Variance, Standard deviation

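Variance and standard deviation measure how widely data are scattered, that is, the dispersion of a dataset relative to its mean. The standard deviation is the square root of the variance and is expressed in the same units as the data.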
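
The following sketch computes both quantities from scratch. It assumes the common conventions of dividing by \( N \) for a population, \( \sigma^{2} = \frac{1}{N} \sum_{i=1}^{N} (x_{i} - \mu)^{2} \), and by \( n - 1 \) for a sample, \( s^{2} = \frac{1}{n-1} \sum_{i=1}^{n} (x_{i} - \bar{x})^{2} \); the original text does not say which divisor it intends. The function names and the data are illustrative.

```python
import math


def variance(values, sample=True):
    """Average squared deviation from the mean.

    sample=True divides by n - 1 (assumed sample convention),
    sample=False divides by N (population convention).
    """
    values = list(values)
    count = len(values)
    if count < (2 if sample else 1):
        raise ValueError("not enough values")
    center = sum(values) / count
    squared_deviations = sum((x - center) ** 2 for x in values)
    return squared_deviations / (count - 1 if sample else count)


def std_dev(values, sample=True):
    """Standard deviation: square root of the variance."""
    return math.sqrt(variance(values, sample))


# Illustrative data with mean 5.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print(variance(data, sample=False))  # 4.0
print(std_dev(data, sample=False))   # 2.0
print(variance(data))                # sample variance, slightly larger
```

Python's standard library offers the same calculations as `statistics.pvariance`, `statistics.variance`, `statistics.pstdev` and `statistics.stdev`.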

In finance, the dispersion of returns for a given asset, security or market index is called volatility. There are several ways to measure it. Historical volatility (statistical volatility) gauges the fluctuations of an underlying asset by measuring price changes over predetermined periods of time. Statistically, it is based on the variance of returns and is commonly quoted as their standard deviation. It gives an estimate of the uncertainty of future returns: the larger this value, the larger the risk or uncertainty associated with the asset, and the less predictable its price.
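
One possible way to compute historical volatility is sketched below. It makes several assumptions that the text does not specify: returns are logarithmic, volatility is the sample standard deviation of those returns, and the result is annualized by multiplying by the square root of 252 trading periods per year. The function names, the `periods_per_year` parameter and the price series are hypothetical.

```python
import math
import statistics


def log_returns(prices):
    """Logarithmic return between each pair of consecutive prices."""
    return [math.log(later / earlier)
            for earlier, later in zip(prices, prices[1:])]


def historical_volatility(prices, periods_per_year=252):
    """Annualized sample standard deviation of log returns.

    periods_per_year=252 is an assumed convention for daily data.
    """
    returns = log_returns(prices)
    if len(returns) < 2:
        raise ValueError("need at least three prices")
    return statistics.stdev(returns) * math.sqrt(periods_per_year)


# Two hypothetical daily closing-price series.
calm = [100.0, 101.0, 100.0, 101.0, 100.0]
wild = [100.0, 110.0, 100.0, 110.0, 100.0]

print(historical_volatility(calm))
print(historical_volatility(wild))  # larger swings, larger volatility
```

Other choices, such as simple instead of logarithmic returns or a different annualization factor, are equally valid as long as they are applied consistently.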

## Multiple variables

### Covariance

Covariance is a measure of the joint variability of two random variables. Its sign shows the direction of the linear relationship between the variables: a positive covariance means they tend to move in the same direction, a negative covariance means they tend to move in opposite directions. The magnitude of the covariance is not easy to interpret because it is not normalized and therefore depends on the magnitudes of the variables. The normalized version of the covariance, the correlation coefficient, does show by its magnitude the strength of the linear relationship.
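
The sketch below illustrates why the magnitude of the covariance is hard to read while the correlation coefficient is not. It assumes the sample covariance with an \( n - 1 \) divisor and the Pearson correlation, that is, the covariance divided by the product of the two standard deviations. The function names and the data are illustrative.

```python
import math


def covariance(xs, ys):
    """Sample covariance (n - 1 divisor) of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return products / (len(xs) - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance normalized to the range [-1, 1]."""
    return covariance(xs, ys) / math.sqrt(
        covariance(xs, xs) * covariance(ys, ys))


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]      # moves with xs
ys_scaled = [y * 100.0 for y in ys]  # same relationship, other units
zs = list(reversed(ys))              # moves against xs

print(covariance(xs, ys), correlation(xs, ys))
print(covariance(xs, ys_scaled), correlation(xs, ys_scaled))
print(covariance(xs, zs), correlation(xs, zs))
```

Rescaling one series multiplies the covariance by the same factor but leaves the correlation unchanged, and reversing the direction of the relationship flips the sign of both.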