0. Dictionary

| English | Chinese | Symbol |
| --- | --- | --- |
| Moment | 矩 | |
| $n^{th}$ | n 阶 | |
| Raw Moment | 原点矩 | $\mu'_n$ |
| Central Moment | 中心矩 | $\mu_n$ |
| Standardized Moment | 标准矩 | $\alpha_n$ |
| Mean | 平均值 | $\mu$ |
| Median | 中位数 | |
| Mode | 众数 | |
| Variance | 方差 | $\sigma^2$ |
| Standard Deviation | 标准差 | $\sigma$ |
| Expectation Operator | 期望算子 | $E[X]$ |
| Skewness [sk’ju:nes] | 偏度 | $\gamma_1$ |
| Kurtosis [kɜ:’təʊsɪs] | 峰度 | $\gamma_2$ |

1. Moment

1.1 Definition in Physics

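The mathematical concept of a moment comes from physics. In physics, a moment is a quantity used to describe the shape of an object.

The $n^{th}$ moment of a real-valued continuous function $f(x)$ of a real variable $x$ (that is, a function whose domain and range are both the real numbers) about a value $c$ is: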

\[\mu'_n = \int^\infty_{-\infty} (x-c)^n f(x) dx\]
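The integral above can be approximated numerically. The following is a minimal sketch, not a definitive implementation: the helper `moment_about`, the midpoint rule, and the example function `uniform_density` are all assumptions chosen for illustration.

```python
def moment_about(f, c, n, lo, hi, steps=10_000):
    """Approximate the integral of (x - c)**n * f(x) over [lo, hi] by the midpoint rule."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h  # midpoint of the i-th subinterval
        total += (x - c) ** n * f(x)
    return total * h


def uniform_density(x):
    """Illustrative f: equal to 1 on [0, 1] and 0 elsewhere."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


# f vanishes outside [0, 1], so integrating over [0, 1] matches the infinite integral.
about_zero = moment_about(uniform_density, 0.0, 2, 0.0, 1.0)  # exact value is 1/3
about_half = moment_about(uniform_density, 0.5, 2, 0.0, 1.0)  # exact value is 1/12
print(about_zero, about_half)
```

Changing `c` changes the result even though `f` and `n` stay the same, which is why the value the moment is taken about is always part of the definition.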

1.2 Raw Moment

Main reference: Raw Moment

In statistics, the $n^{th}$ raw moment of a univariate continuous random variable $X$ is the $n^{th}$ moment of its probability density function (pdf) $f(x)$ taken about 0 (i.e. $c = 0$):

\[\mu'_n = \int^\infty_{-\infty} x^n f(x) dx\]

For a discrete random variable $X$ that takes the values $x_1, \dots, x_k$:

\[\mu'_n = \sum_{i=1}^k x_i^n P(X = x_i)\]

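When $n = 1$, this reads: multiply each value $x_i$ that $X$ can take by the probability that $X$ takes $x_i$, then sum the products.

In particular, $\mu'_0 = 1$, because the probabilities sum to 1.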
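The discrete formula translates almost directly into code. This is a small sketch; the function name `raw_moment` and the fair six-sided die are assumptions made for the example, and `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction


def raw_moment(values, probs, n):
    """n-th raw moment of a discrete variable: sum of x_i**n * P(X = x_i)."""
    return sum(x ** n * p for x, p in zip(values, probs))


# Hypothetical example: a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

mu0 = raw_moment(faces, probs, 0)  # the probabilities themselves, summed
mu1 = raw_moment(faces, probs, 1)  # each face times its probability, summed
mu2 = raw_moment(faces, probs, 2)
print(mu0, mu1, mu2)  # prints: 1 7/2 91/6
```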

1.3 Central Moment

Main reference: Central Moment

A central moment of a univariate continuous random variable $X$ is a moment of its probability density function $f(x)$ taken about the mean $\mu$. The expectation (i.e. the mean) is also called the “center” of a random variable, so the moment with $c = \mu$ is named the central moment:

\[\mu_n = \int^\infty_{-\infty} (x-\mu)^n f(x) dx\]

In particular, $\mu_0 = 1$ and $\mu_1 = 0$.
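Both identities can be checked in one line each. This is a sketch that assumes $f$ is a pdf (so it integrates to 1) and that the mean $\mu = \int^\infty_{-\infty} x f(x) dx$ exists:

\[\mu_0 = \int^\infty_{-\infty} f(x) dx = 1, \qquad \mu_1 = \int^\infty_{-\infty} (x-\mu) f(x) dx = \int^\infty_{-\infty} x f(x) dx - \mu \int^\infty_{-\infty} f(x) dx = \mu - \mu = 0\]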

1.4 Standardized Moment

\[\alpha_n = \frac{\mu_n}{\sigma^n}\]

In particular, $\alpha_1 = 0$ (because $\mu_1 = 0$) and $\alpha_2 = 1$ (because $\mu_2 = \sigma^2$).
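A quick numerical check of these two values, as a sketch only: the helpers `central_moment` and `standardized_moment` and the three-point distribution are assumptions for illustration, and the results are floats, so they should be compared with a tolerance.

```python
import math


def central_moment(values, probs, n):
    """n-th central moment of a discrete variable: sum of (x_i - mu)**n * P(X = x_i)."""
    mu = sum(x * p for x, p in zip(values, probs))
    return sum((x - mu) ** n * p for x, p in zip(values, probs))


def standardized_moment(values, probs, n):
    """n-th standardized moment: the n-th central moment divided by sigma**n."""
    sigma = math.sqrt(central_moment(values, probs, 2))
    return central_moment(values, probs, n) / sigma ** n


# Hypothetical, deliberately lopsided distribution.
values = [0, 1, 5]
probs = [0.5, 0.3, 0.2]

alpha1 = standardized_moment(values, probs, 1)
alpha2 = standardized_moment(values, probs, 2)
print(math.isclose(alpha1, 0.0, abs_tol=1e-12), math.isclose(alpha2, 1.0))
```

The first two standardized moments carry no information about a particular distribution, which is why skewness and kurtosis start at $n = 3$ and $n = 4$.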

2. Expectation

2.1 Expectation Equals Arithmetic Mean

Expectation is defined as the $1^{st}$ raw moment:

\[\mu = \mu'_1 = \int^\infty_{-\infty} x f(x) dx\]

The expectation is the arithmetic mean of a random variable, whatever probability distribution it comes from. For the reasoning, see “Why is expectation the same as the arithmetic mean?”
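One way to see the connection is to treat a finite data set as a distribution in which each distinct value has probability equal to its relative frequency. The sketch below does exactly that; the data values are made up for illustration.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical observations.
data = [2, 3, 3, 5, 7, 7, 7, 10]
n = len(data)

# Arithmetic mean: add everything up and divide by the number of observations.
arithmetic_mean = Fraction(sum(data), n)

# Expectation: each distinct value times its relative frequency, summed.
counts = Counter(data)
expectation = sum(x * Fraction(count, n) for x, count in counts.items())

print(arithmetic_mean, expectation)  # prints: 11/2 11/2
```

Grouping equal observations turns the plain average into the weighted sum from the definition of the first raw moment.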

2.2 Expectation Operator

The operator notation simply treats $\mu$ as a function of the random variable $X$:

\[E[X] = \mu = \mu'_1 = \int^\infty_{-\infty} x f(x) dx\]

If $Y = g(X)$, then:

\[E[Y] = E[g(X)] = \int^\infty_{-\infty} g(x) f(x) dx\]

This $E$ is called the expectation operator.

It follows that:

  • $E[X^n] = \mu'_n$
  • $E[(X-\mu)^n] = \mu_n$
  • $E \left [ \big(\frac{X-\mu}{\sigma} \big)^n \right ] = \frac{E[(X-\mu)^n]}{\sigma^n} = \alpha_n$
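For a discrete variable the same operator can be sketched as a function that takes $g$ as an argument. The name `expect` and the fair-die distribution are assumptions for illustration; the integral becomes a sum over the possible values.

```python
from fractions import Fraction


def expect(g, values, probs):
    """E[g(X)] for a discrete variable: sum of g(x_i) * P(X = x_i)."""
    return sum(g(x) * p for x, p in zip(values, probs))


# Hypothetical example: a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

mu = expect(lambda x: x, faces, probs)                    # E[X], the mean
raw2 = expect(lambda x: x ** 2, faces, probs)             # E[X^2], 2nd raw moment
central2 = expect(lambda x: (x - mu) ** 2, faces, probs)  # E[(X - mu)^2], 2nd central moment
print(mu, raw2, central2)  # prints: 7/2 91/6 35/12
```

Every moment in this post is then just a different choice of `g`.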

3. Variance

Variance is defined as the $2^{nd}$ central moment:

\[\sigma^2 = \mu_2 = \int^\infty_{-\infty} (x-\mu)^2 f(x) dx = E[(X-\mu)^2] = E[X^2] - \mu^2\]
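The last equality can be checked by expanding the square and using the linearity of $E$ (a standard property that is not stated above) together with $E[X] = \mu$:

\[E[(X-\mu)^2] = E[X^2 - 2\mu X + \mu^2] = E[X^2] - 2\mu E[X] + \mu^2 = E[X^2] - 2\mu^2 + \mu^2 = E[X^2] - \mu^2\]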

4. Skewness

Skewness is defined as the $3^{rd}$ standardized moment:

\[\gamma_1 = \alpha_3 = \frac{\mu_3}{\sigma^3}\]

Skewness is a measure of asymmetry [əˈsɪmɪtri]:

  • If a distribution is “pulled out” towards higher values (to the right), then it has positive skewness ($\gamma_1 > 0$, called positive skew or right skew).
  • If it is pulled out towards lower values (to the left), then it has negative skewness ($\gamma_1 < 0$, called negative skew or left skew).
  • A symmetric [sɪ’metrɪk] distribution, e.g., the Gaussian distribution, has zero skewness ($\gamma_1 = 0$).
    • It further follows that mean == median.
      • If the distribution is symmetric and unimodal, then also mean == median == mode.
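The three cases can be reproduced with small discrete distributions. This is a sketch under assumptions: the function `skewness` and the three example distributions are invented for illustration.

```python
import math


def skewness(values, probs):
    """gamma_1 = mu_3 / sigma**3 for a discrete variable."""
    mu = sum(x * p for x, p in zip(values, probs))
    mu2 = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
    mu3 = sum((x - mu) ** 3 * p for x, p in zip(values, probs))
    return mu3 / mu2 ** 1.5


# Most of the mass near 0, with a long tail towards higher values.
right_tailed = skewness([0, 1, 10], [0.5, 0.4, 0.1])

# The mirror image: the long tail now points towards lower values.
left_tailed = skewness([0, -1, -10], [0.5, 0.4, 0.1])

# Symmetric about 0.
symmetric = skewness([-1, 0, 1], [0.25, 0.5, 0.25])

print(right_tailed > 0, left_tailed < 0, symmetric == 0)  # prints: True True True
```

Mirroring a distribution flips the sign of $\gamma_1$ and leaves its magnitude unchanged, because the third power keeps the sign of each deviation from the mean.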

Skewness is a very confusing concept when reading plots:

  • Left plot: Negative skew ($\gamma_1 < 0$) == The distribution is skewed to the LEFT == The mean is typically on the left side of the peak
    • while the peak is pulled towards the RIGHT
  • Right plot: Positive skew ($\gamma_1 > 0$) == The distribution is skewed to the RIGHT == The mean is typically on the right side of the peak
    • while the peak is pulled towards the LEFT

So it is better to remember skewness from the one-dimensional number line than from the shape of the plot. As D H at ScienceForums.Net puts it:

One way to remember the left/right stuff is that it corresponds with the orientation of the numberline. Since negative numbers are to the left of zero, negative skewness is the same as left-skewed. The same goes for positive skewness and right-skewed.

5. Kurtosis

Kurtosis, from the Greek word “kyrtos” for convex (related to the word “curve”), is mainly defined via the $4^{th}$ standardized moment:

\[\gamma_2 = \alpha_4 - 3 = \frac{\mu_4}{\sigma^4} - 3\]

This quantity $\gamma_2$ is also known as excess kurtosis. The “minus 3” at the end of the formula is often explained as a correction that makes the kurtosis of the normal distribution equal to zero, since $\alpha_4 = 3$ for any normal distribution.

  • If $\gamma_2 > 0$, the distribution is called leptokurtic [leptəʊ’kɜ:tɪk] (sharply peaked).
  • If $\gamma_2 < 0$, the distribution is called platykurtic [plæ’ti:kɜ:tɪk] (flat-topped).
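Both signs can be illustrated with small discrete distributions. The sketch below is only an example: the function `excess_kurtosis` and the two distributions are assumptions, and `Fraction` keeps the results exact.

```python
from fractions import Fraction


def excess_kurtosis(values, probs):
    """gamma_2 = mu_4 / sigma**4 - 3 for a discrete variable."""
    mu = sum(x * p for x, p in zip(values, probs))
    mu2 = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
    mu4 = sum((x - mu) ** 4 * p for x, p in zip(values, probs))
    return mu4 / mu2 ** 2 - 3


# A fair coin: all of the mass sits at equal distance from the mean.
coin = excess_kurtosis([0, 1], [Fraction(1, 2), Fraction(1, 2)])

# A sharp central spike with a little mass far out on each side.
spike = excess_kurtosis([-3, 0, 3], [Fraction(1, 20), Fraction(9, 10), Fraction(1, 20)])

print(coin, spike)  # prints: -2 7
```

The coin is platykurtic ($\gamma_2 < 0$) and the spiked distribution is leptokurtic ($\gamma_2 > 0$).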
