Andrew Ng's Machine Learning: Week 3

Time: 2024-04-07 21:50:47

title: "Andrew Ng's Machine Learning: Week 3"
date: 2019-11-20 15:37:28
mathjax: true
categories:
  • Machine Learning
tags:
  • Machine Learning

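Linear Algebra Review

3.1 Matrices and Vectors

Reference video: 3 - 1 - Matrices and Vectors (9 min).mkv

A matrix with 4 rows and 2 columns is a 4×2 matrix. In general, if $m$ is the number of rows and $n$ is the number of columns, the matrix is $m×n$; here $m×n$ is 4×2.

The dimension of a matrix is (number of rows) × (number of columns).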

Matrix entries: let
$A=\left[ \begin{matrix} 1402 & 191 \\ 1371 & 821 \\ 949 & 1437 \\ 147 & 1448 \\\end{matrix} \right]$

$A_{ij}$ denotes the entry in row $i$, column $j$.

A vector is a special kind of matrix. Vectors in these notes are generally column vectors, for example:
$y=\left[ \begin{matrix} 460 \\ 232 \\ 315 \\ 178 \\\end{matrix} \right]$

is a four-dimensional column vector (4×1).

A vector can be 1-indexed or 0-indexed. Of the two vectors below, the first is 1-indexed and the second is 0-indexed. These notes generally use 1-indexed vectors.

$y=\left[ \begin{matrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\\end{matrix} \right]$, $y=\left[ \begin{matrix} y_0 \\ y_1 \\ y_2 \\ y_3 \\\end{matrix} \right]$

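3.2 Addition and Scalar Multiplication

Reference video: 3 - 2 - Addition and Scalar Multiplication (7 min).mkv

Matrix addition: two matrices can be added only if they have the same number of rows and the same number of columns. The sum is taken entry by entry.

Scalar multiplication: multiplying a matrix by a scalar multiplies every entry by that scalar.

Expressions that combine addition and scalar multiplication are evaluated the same way, entry by entry.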

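3.3 Matrix-Vector Multiplication

Reference video: 3 - 3 - Matrix Vector Multiplication (14 min).mkv

Matrix-vector multiplication: an $m×n$ matrix multiplied by an $n×1$ vector gives an $m×1$ vector. Entry $i$ of the result is the sum of the products of the entries of row $i$ of the matrix with the corresponding entries of the vector.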
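The dimension rule above can be sketched in a few lines of Python. This is a minimal illustration rather than code from the course; the function name `mat_vec` and the numeric values are assumptions chosen for the example.

```python
def mat_vec(A, x):
    """Multiply an m x n matrix (a list of m rows) by a length-n vector."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("every row of A must have len(x) entries")
    # Entry i of the result is the dot product of row i of A with x.
    return [sum(a * b for a, b in zip(row, x)) for row in A]


A = [[1, 3], [4, 0], [2, 1]]  # 3 x 2 matrix (illustrative values)
x = [1, 5]                    # 2 x 1 vector
print(mat_vec(A, x))          # a 3 x 1 vector: [16, 4, 7]
```

The length check mirrors the requirement that the number of columns of the matrix equals the number of entries of the vector.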

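3.4 Matrix Multiplication

Reference video: 3 - 4 - Matrix Matrix Multiplication (11 min).mkv

Matrix multiplication:

An $m×n$ matrix multiplied by an $n×o$ matrix gives an $m×o$ matrix.

Concretely, for two matrices $A$ and $B$, the entry in row $i$, column $j$ of the product $AB$ is the sum of the products of the entries of row $i$ of $A$ with the corresponding entries of column $j$ of $B$.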
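As a small sketch of the $m×n$ times $n×o$ rule, the following Python function multiplies two matrices stored as lists of rows. The name `mat_mul` and the sample matrices are illustrative assumptions, not taken from the lecture.

```python
def mat_mul(A, B):
    """Multiply an m x n matrix A by an n x o matrix B (both lists of rows)."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    cols = list(zip(*B))  # columns of B
    # Entry (i, j) is the dot product of row i of A with column j of B.
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


A = [[1, 3, 2], [4, 0, 1]]    # 2 x 3
B = [[1, 3], [0, 1], [5, 2]]  # 3 x 2
print(mat_mul(A, B))          # 2 x 2 result: [[11, 10], [9, 14]]
```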

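3.5 Properties of Matrix Multiplication

Reference video: 3 - 5 - Matrix Multiplication Properties (9 min).mkv

Properties of matrix multiplication:

Matrix multiplication is not commutative: in general $A×B≠B×A$.

Matrix multiplication is associative: $A×(B×C)=(A×B)×C$.

Identity matrix: one matrix plays a special role in matrix multiplication, the same role that 1 plays in the multiplication of numbers. It is called the identity matrix. It is a square matrix, usually written $I$ or $E$; these notes always use $I$. Its entries on the main diagonal (from the upper left to the lower right) are all 1, and all other entries are 0.

For an invertible matrix $A$: $A{{A}^{-1}}={{A}^{-1}}A=I$

For the identity matrix, $AI=IA=A$.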
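The three properties above can be checked numerically on small matrices. The sketch below is only an illustration: the helper `mat_mul` and the 2×2 matrices are assumptions picked so that $AB$ and $BA$ visibly differ.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


A = [[1, 1], [0, 0]]
B = [[0, 0], [2, 0]]
C = [[1, 2], [3, 4]]
I = [[1, 0], [0, 1]]  # 2 x 2 identity matrix

print(mat_mul(A, B))  # [[2, 0], [0, 0]]
print(mat_mul(B, A))  # [[0, 0], [2, 2]]  (differs from A x B)
print(mat_mul(A, mat_mul(B, C)) == mat_mul(mat_mul(A, B), C))  # True: associative
print(mat_mul(A, I) == A and mat_mul(I, A) == A)               # True: AI = IA = A
```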

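3.6 Inverse and Transpose

Reference video: 3 - 6 - Inverse and Transpose (11 min).mkv

Matrix inverse: if $A$ is an $m×m$ (square) matrix and it has an inverse ${{A}^{-1}}$, then $A{{A}^{-1}}={{A}^{-1}}A=I$.

In practice, the inverse is usually computed in Octave or MATLAB.

Matrix transpose: let $A$ be an $m×n$ matrix ($m$ rows, $n$ columns) whose entry in row $i$, column $j$ is $a(i,j)$, that is, $A=a(i,j)$.

The transpose of $A$ is the $n×m$ matrix $B$ with $b(i,j)=a(j,i)$ (the entry in row $i$, column $j$ of $B$ is the entry in row $j$, column $i$ of $A$). It is written ${{A}^{T}}=B$ (some books write $A'=B$).

Intuitively, the transpose is obtained by reflecting all entries of $A$ across the 45-degree line that runs down and to the right from the entry in row 1, column 1.

Example:

${{\left| \begin{matrix} a& b \\ c& d \\ e& f \\\end{matrix} \right|}^{T}}=\left|\begin{matrix} a& c & e \\ b& d & f \\\end{matrix} \right|$

Basic properties of the transpose:

${{\left( A\pm B \right)}^{T}}={{A}^{T}}\pm {{B}^{T}}$
${{\left( A\times B \right)}^{T}}={{B}^{T}}\times {{A}^{T}}$
${{\left( {{A}^{T}} \right)}^{T}}=A$
${{\left( KA \right)}^{T}}=K{{A}^{T}}$

Transpose in MATLAB: append an apostrophe, for example x=y'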
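For readers without MATLAB or Octave, the transpose and the product rule ${{(A\times B)}^{T}}={{B}^{T}}\times {{A}^{T}}$ can be sketched in plain Python. The helper names and matrices here are illustrative assumptions.

```python
def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


A = [[1, 2], [3, 4], [5, 6]]  # 3 x 2
B = [[1, 0, 2], [0, 1, 3]]    # 2 x 3

print(transpose(A))           # 2 x 3: [[1, 3, 5], [2, 4, 6]]
lhs = transpose(mat_mul(A, B))
rhs = mat_mul(transpose(B), transpose(A))
print(lhs == rhs)             # True: (AB)^T equals B^T A^T
```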

Mathematical Foundations of Machine Learning

Calculus

1. Definition of the derivative

The derivative and the differential

$f'(x_0)=\lim_{\Delta x\to 0}\frac{f(x_0+\Delta x)-f(x_0)}{\Delta x}$ (1)

or, equivalently:

$f'(x_0)=\lim_{x\to x_0}\frac{f(x)-f(x_0)}{x-x_0}$ (2)
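Definition (1) can be explored numerically by evaluating the difference quotient for shrinking $\Delta x$. The sketch below assumes $f(x)=x^2$ and $x_0=3$ purely for illustration; the exact derivative there is 6.

```python
def difference_quotient(f, x0, dx):
    """Evaluate (f(x0 + dx) - f(x0)) / dx, the quotient inside the limit in (1)."""
    return (f(x0 + dx) - f(x0)) / dx


def square(x):
    return x * x


# As dx shrinks, the quotient approaches the derivative f'(3) = 6.
for dx in (1e-1, 1e-3, 1e-5):
    print(dx, difference_quotient(square, 3.0, dx))
```

Very small values of `dx` eventually suffer from floating-point cancellation, so this is a way to build intuition rather than a robust differentiation routine.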

2. Left and right derivatives; geometric and physical meaning of the derivative

The left and right derivatives of $f(x)$ at $x_0$ are defined as:

Left derivative: $f'_{-}(x_0)=\lim_{\Delta x\to 0^{-}}\frac{f(x_0+\Delta x)-f(x_0)}{\Delta x}=\lim_{x\to x_0^{-}}\frac{f(x)-f(x_0)}{x-x_0},\quad (x=x_0+\Delta x)$

Right derivative: $f'_{+}(x_0)=\lim_{\Delta x\to 0^{+}}\frac{f(x_0+\Delta x)-f(x_0)}{\Delta x}=\lim_{x\to x_0^{+}}\frac{f(x)-f(x_0)}{x-x_0}$

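3. Relationship between differentiability and continuity

Th1: $f(x)$ has a differential at $x_0$ $\Leftrightarrow$ $f(x)$ has a derivative at $x_0$.

Th2: If $y=f(x)$ has a derivative at $x_0$, then it is continuous at $x_0$. The converse does not hold: a continuous function need not be differentiable.

Th3: $f'(x_0)$ exists $\Leftrightarrow$ $f'_{-}(x_0)=f'_{+}(x_0)$ (both one-sided derivatives exist and are equal).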

4. Tangent and normal lines of a plane curve

Tangent line: $y-y_0=f'(x_0)(x-x_0)$
Normal line: $y-y_0=-\frac{1}{f'(x_0)}(x-x_0),\quad f'(x_0)\ne 0$

5. Arithmetic rules
Let $u=u(x)$ and $v=v(x)$ be differentiable at $x$. Then
(1) $(u\pm v)'=u'\pm v'$, $d(u\pm v)=du\pm dv$
(2) $(uv)'=uv'+vu'$, $d(uv)=u\,dv+v\,du$
(3) $\left(\frac{u}{v}\right)'=\frac{vu'-uv'}{v^2}\ (v\ne 0)$, $d\left(\frac{u}{v}\right)=\frac{v\,du-u\,dv}{v^2}$

6. Table of basic derivatives and differentials
(1) $y=c$ (constant): $y'=0$, $dy=0$
(2) $y=x^{\alpha}$ ($\alpha$ real): $y'=\alpha x^{\alpha-1}$, $dy=\alpha x^{\alpha-1}dx$
(3) $y=a^{x}$: $y'=a^{x}\ln a$, $dy=a^{x}\ln a\,dx$
Special case: $(e^{x})'=e^{x}$, $d(e^{x})=e^{x}dx$
(4) $y=\log_{a}x$: $y'=\frac{1}{x\ln a}$, $dy=\frac{1}{x\ln a}dx$
Special case: $y=\ln x$: $(\ln x)'=\frac{1}{x}$, $d(\ln x)=\frac{1}{x}dx$
(5) $y=\sin x$: $y'=\cos x$, $d(\sin x)=\cos x\,dx$
(6) $y=\cos x$: $y'=-\sin x$, $d(\cos x)=-\sin x\,dx$
(7) $y=\tan x$: $y'=\frac{1}{\cos^{2}x}=\sec^{2}x$, $d(\tan x)=\sec^{2}x\,dx$
(8) $y=\cot x$: $y'=-\frac{1}{\sin^{2}x}=-\csc^{2}x$, $d(\cot x)=-\csc^{2}x\,dx$
(9) $y=\sec x$: $y'=\sec x\tan x$, $d(\sec x)=\sec x\tan x\,dx$
(10) $y=\csc x$: $y'=-\csc x\cot x$, $d(\csc x)=-\csc x\cot x\,dx$
(11) $y=\arcsin x$: $y'=\frac{1}{\sqrt{1-x^{2}}}$, $d(\arcsin x)=\frac{1}{\sqrt{1-x^{2}}}dx$
(12) $y=\arccos x$: $y'=-\frac{1}{\sqrt{1-x^{2}}}$, $d(\arccos x)=-\frac{1}{\sqrt{1-x^{2}}}dx$
(13) $y=\arctan x$: $y'=\frac{1}{1+x^{2}}$, $d(\arctan x)=\frac{1}{1+x^{2}}dx$
(14) $y=\operatorname{arccot}x$: $y'=-\frac{1}{1+x^{2}}$, $d(\operatorname{arccot}x)=-\frac{1}{1+x^{2}}dx$
(15) $y=\sinh x$: $y'=\cosh x$, $d(\sinh x)=\cosh x\,dx$
(16) $y=\cosh x$: $y'=\sinh x$, $d(\cosh x)=\sinh x\,dx$

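7. Differentiation of composite functions, inverse functions, implicit functions, and functions defined by parametric equations

(1) Inverse function rule: let $y=f(x)$ be monotonic and continuous in a neighborhood of $x$, and differentiable at $x$ with $f'(x)\ne 0$. Then its inverse function is differentiable at the point $y$ corresponding to $x$, and $\frac{dy}{dx}=\frac{1}{\frac{dx}{dy}}$
(2) Chain rule for composite functions: if $\mu =\varphi (x)$ is differentiable at $x$ and $y=f(\mu )$ is differentiable at the corresponding point $\mu$ ($\mu =\varphi (x)$), then the composite function $y=f(\varphi (x))$ is differentiable at $x$, and $y'=f'(\mu )\cdot \varphi '(x)$
(3) There are generally three ways to find the derivative $\frac{dy}{dx}$ of an implicit function:
1) Differentiate both sides of the equation with respect to $x$, remembering that $y$ is a function of $x$, so any function of $y$ is a composite function of $x$. For example, $\frac{1}{y}$, $y^{2}$, $\ln y$, and $e^{y}$ are all composite functions of $x$, and differentiating them with respect to $x$ requires the chain rule.
2) Formula method: from $F(x,y)=0$ we get $\frac{dy}{dx}=-\frac{F'_{x}(x,y)}{F'_{y}(x,y)}$, where $F'_{x}(x,y)$ and $F'_{y}(x,y)$ are the partial derivatives of $F(x,y)$ with respect to $x$ and $y$.
3) Use the invariance of the form of the differential.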

8. Common higher-order derivative formulas

(1) $(a^{x})^{(n)}=a^{x}\ln^{n}a\quad (a>0)$, $\quad (e^{x})^{(n)}=e^{x}$
(2) $(\sin kx)^{(n)}=k^{n}\sin \left(kx+n\cdot \frac{\pi}{2}\right)$
(3) $(\cos kx)^{(n)}=k^{n}\cos \left(kx+n\cdot \frac{\pi}{2}\right)$
(4) $(x^{m})^{(n)}=m(m-1)\cdots (m-n+1)x^{m-n}$
(5) $(\ln x)^{(n)}=(-1)^{n-1}\frac{(n-1)!}{x^{n}}$
(6) Leibniz formula: if $u(x)$ and $v(x)$ are both $n$ times differentiable, then
$(uv)^{(n)}=\sum\limits_{i=0}^{n}C_{n}^{i}u^{(i)}v^{(n-i)}$, where $u^{(0)}=u$ and $v^{(0)}=v$

9. Mean value theorems and Taylor's formula

Th1: (Fermat's theorem)

Suppose $f(x)$ satisfies:
(1) $f(x)$ is defined in some neighborhood of $x_0$, and throughout that neighborhood either $f(x)\le f(x_0)$ or $f(x)\ge f(x_0)$;

(2) $f(x)$ is differentiable at $x_0$.

Then $f'(x_0)=0$.

Th2: (Rolle's theorem)

Suppose $f(x)$ satisfies:
(1) it is continuous on the closed interval $[a,b]$;

(2) it is differentiable on $(a,b)$;

(3) $f(a)=f(b)$.

Then there exists a $\xi$ in $(a,b)$ such that $f'(\xi )=0$.

Th3: (Lagrange mean value theorem)

Suppose $f(x)$ satisfies:
(1) it is continuous on $[a,b]$;

(2) it is differentiable on $(a,b)$.

Then there exists a $\xi$ in $(a,b)$ such that $\frac{f(b)-f(a)}{b-a}=f'(\xi )$.

Th4: (Cauchy mean value theorem)

Suppose $f(x)$ and $g(x)$ satisfy:
(1) they are continuous on $[a,b]$;

(2) they are differentiable on $(a,b)$, and $g'(x)\ne 0$ there.

Then there exists a $\xi$ in $(a,b)$ such that $\frac{f(b)-f(a)}{g(b)-g(a)}=\frac{f'(\xi )}{g'(\xi )}$.

10. L'Hôpital's rule
Rule I ($\frac{0}{0}$ form)
Suppose $f(x)$ and $g(x)$ satisfy:
(1) $\lim_{x\to x_0}f(x)=0$ and $\lim_{x\to x_0}g(x)=0$;

(2) $f(x)$ and $g(x)$ are differentiable in a neighborhood of $x_0$ (possibly excluding $x_0$ itself), with $g'(x)\ne 0$;

(3) $\lim_{x\to x_0}\frac{f'(x)}{g'(x)}$ exists (or is $\infty$).

Then:
$\lim_{x\to x_0}\frac{f(x)}{g(x)}=\lim_{x\to x_0}\frac{f'(x)}{g'(x)}$

Rule I' ($\frac{0}{0}$ form, $x\to \infty$)
Suppose $f(x)$ and $g(x)$ satisfy:
(1) $\lim_{x\to \infty}f(x)=0$ and $\lim_{x\to \infty}g(x)=0$;

(2) there is an $X>0$ such that $f(x)$ and $g(x)$ are differentiable for $|x|>X$, with $g'(x)\ne 0$;

(3) $\lim_{x\to \infty}\frac{f'(x)}{g'(x)}$ exists (or is $\infty$).

Then:
$\lim_{x\to \infty}\frac{f(x)}{g(x)}=\lim_{x\to \infty}\frac{f'(x)}{g'(x)}$

Rule II ($\frac{\infty}{\infty}$ form)
Suppose $f(x)$ and $g(x)$ satisfy:
(1) $\lim_{x\to x_0}f(x)=\infty$ and $\lim_{x\to x_0}g(x)=\infty$;

(2) $f(x)$ and $g(x)$ are differentiable in a neighborhood of $x_0$ (possibly excluding $x_0$ itself), with $g'(x)\ne 0$;

(3) $\lim_{x\to x_0}\frac{f'(x)}{g'(x)}$ exists (or is $\infty$).

Then:
$\lim_{x\to x_0}\frac{f(x)}{g(x)}=\lim_{x\to x_0}\frac{f'(x)}{g'(x)}$

Rule II' ($\frac{\infty}{\infty}$ form, $x\to \infty$) can be stated in the same way, following the pattern of Rule I'.
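A short worked example may help here. It is our own illustration, not part of the original notes: a $\frac{0}{0}$ limit where Rule I is applied twice, since after the first step the numerator and denominator still both tend to 0.

$$\lim_{x\to 0}\frac{e^{x}-1-x}{x^{2}}=\lim_{x\to 0}\frac{e^{x}-1}{2x}=\lim_{x\to 0}\frac{e^{x}}{2}=\frac{1}{2}$$

At each step the denominator's derivative ($2x$, then $2$) is nonzero near 0 except possibly at 0 itself, and the final limit exists, so the conditions of the rule are met.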

11. Taylor's formula

Suppose $f(x)$ has derivatives up to order $n+1$ in some neighborhood of $x_0$. Then for every point $x\ne x_0$ in that neighborhood there is at least one $\xi$ between $x_0$ and $x$ such that:
$f(x)=f(x_0)+f'(x_0)(x-x_0)+\frac{1}{2!}f''(x_0)(x-x_0)^{2}+\cdots +\frac{f^{(n)}(x_0)}{n!}(x-x_0)^{n}+R_{n}(x)$
where $R_{n}(x)=\frac{f^{(n+1)}(\xi )}{(n+1)!}(x-x_0)^{n+1}$ is called the $n$th-order Taylor remainder of $f(x)$ at $x_0$.

Setting $x_0=0$ gives the $n$th-order Taylor formula
$f(x)=f(0)+f'(0)x+\frac{1}{2!}f''(0)x^{2}+\cdots +\frac{f^{(n)}(0)}{n!}x^{n}+R_{n}(x)$ …… (1)
where $R_{n}(x)=\frac{f^{(n+1)}(\xi )}{(n+1)!}x^{n+1}$ and $\xi$ lies between 0 and $x$. Formula (1) is called the Maclaurin formula.

Taylor formulas at $x_0=0$ for five common functions

(1) $e^{x}=1+x+\frac{1}{2!}x^{2}+\cdots +\frac{1}{n!}x^{n}+\frac{x^{n+1}}{(n+1)!}e^{\xi }$

or $e^{x}=1+x+\frac{1}{2!}x^{2}+\cdots +\frac{1}{n!}x^{n}+o(x^{n})$

(2) $\sin x=x-\frac{1}{3!}x^{3}+\cdots +\frac{x^{n}}{n!}\sin \frac{n\pi }{2}+\frac{x^{n+1}}{(n+1)!}\sin \left(\xi +\frac{n+1}{2}\pi \right)$

or $\sin x=x-\frac{1}{3!}x^{3}+\cdots +\frac{x^{n}}{n!}\sin \frac{n\pi }{2}+o(x^{n})$

(3) $\cos x=1-\frac{1}{2!}x^{2}+\cdots +\frac{x^{n}}{n!}\cos \frac{n\pi }{2}+\frac{x^{n+1}}{(n+1)!}\cos \left(\xi +\frac{n+1}{2}\pi \right)$

or $\cos x=1-\frac{1}{2!}x^{2}+\cdots +\frac{x^{n}}{n!}\cos \frac{n\pi }{2}+o(x^{n})$

(4) $\ln (1+x)=x-\frac{1}{2}x^{2}+\frac{1}{3}x^{3}-\cdots +(-1)^{n-1}\frac{x^{n}}{n}+\frac{(-1)^{n}x^{n+1}}{(n+1)(1+\xi )^{n+1}}$

or $\ln (1+x)=x-\frac{1}{2}x^{2}+\frac{1}{3}x^{3}-\cdots +(-1)^{n-1}\frac{x^{n}}{n}+o(x^{n})$

(5) $(1+x)^{m}=1+mx+\frac{m(m-1)}{2!}x^{2}+\cdots +\frac{m(m-1)\cdots (m-n+1)}{n!}x^{n}+\frac{m(m-1)\cdots (m-n)}{(n+1)!}x^{n+1}(1+\xi )^{m-n-1}$

or $(1+x)^{m}=1+mx+\frac{m(m-1)}{2!}x^{2}+\cdots +\frac{m(m-1)\cdots (m-n+1)}{n!}x^{n}+o(x^{n})$
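The expansions above can be sanity-checked numerically. The following sketch sums the Maclaurin polynomials for $e^{x}$ and $\ln (1+x)$ and compares them with the standard library; the function names and the chosen orders are assumptions made for this illustration.

```python
import math


def exp_maclaurin(x, n):
    """Sum 1 + x + x^2/2! + ... + x^n/n!."""
    term, total = 1.0, 1.0
    for k in range(1, n + 1):
        term *= x / k  # turns x^(k-1)/(k-1)! into x^k/k!
        total += term
    return total


def log1p_maclaurin(x, n):
    """Sum x - x^2/2 + x^3/3 - ... + (-1)^(n-1) x^n/n."""
    return sum((-1) ** (k - 1) * x ** k / k for k in range(1, n + 1))


print(exp_maclaurin(1.0, 15), math.e)
print(log1p_maclaurin(0.5, 40), math.log(1.5))
```

The $\ln (1+x)$ polynomial only converges for $-1 < x \le 1$, so the comparison uses $x=0.5$.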

12. Monotonicity and extrema
Th1: Let $f(x)$ be differentiable on $(a,b)$. If $f'(x)>0$ (or $f'(x)<0$) for all $x\in (a,b)$, then $f(x)$ is monotonically increasing (or decreasing) on $(a,b)$.

Th2: (Necessary condition for an extremum) If $f(x)$ is differentiable at $x_0$ and has an extremum there, then $f'(x_0)=0$.

Th3: (First sufficient condition for an extremum) Let $f(x)$ be differentiable in a neighborhood of $x_0$ with $f'(x_0)=0$ (or let $f(x)$ be continuous at $x_0$ while $f'(x_0)$ does not exist).
(1) If $f'(x)$ changes from "+" to "-" as $x$ passes through $x_0$, then $f(x_0)$ is a local maximum;
(2) If $f'(x)$ changes from "-" to "+" as $x$ passes through $x_0$, then $f(x_0)$ is a local minimum;
(3) If $f'(x)$ does not change sign on the two sides of $x=x_0$, then $f(x_0)$ is not an extremum.

Th4: (Second sufficient condition for an extremum) Suppose $f''(x_0)\ne 0$ and $f'(x_0)=0$. Then $f(x_0)$ is a local maximum when $f''(x_0)<0$, and a local minimum when $f''(x_0)>0$.
Note: if $f''(x_0)=0$, this test is inconclusive.

13. Finding asymptotes
(1) Horizontal asymptote: if $\lim_{x\to +\infty}f(x)=b$ or $\lim_{x\to -\infty}f(x)=b$, then $y=b$ is a horizontal asymptote of $y=f(x)$.

(2) Vertical asymptote: if $\lim_{x\to x_0^{-}}f(x)=\infty$ or $\lim_{x\to x_0^{+}}f(x)=\infty$, then $x=x_0$ is a vertical asymptote of $y=f(x)$.

(3) Oblique asymptote: if $a=\lim_{x\to \infty}\frac{f(x)}{x}$ and $b=\lim_{x\to \infty}[f(x)-ax]$ both exist, then $y=ax+b$ is an oblique asymptote of $y=f(x)$.

14. Concavity
Th1: (Concavity test) If $f''(x)<0$ (or $f''(x)>0$) on an interval $I$, then $f(x)$ is concave down (or concave up) on $I$.

Th2: (Inflection point test 1) If $f''(x_0)=0$ (or $f''(x_0)$ does not exist) and $f''(x)$ changes sign as $x$ passes through $x_0$, then $(x_0,f(x_0))$ is an inflection point.

Th3: (Inflection point test 2) If $f(x)$ has a third derivative in a neighborhood of $x_0$, with $f''(x_0)=0$ and $f'''(x_0)\ne 0$, then $(x_0,f(x_0))$ is an inflection point.

15. Arc differential

$dS=\sqrt{1+y'^{2}}\,dx$

16. Curvature

The curvature of the curve $y=f(x)$ at the point $(x,y)$ is $k=\frac{\left| y'' \right|}{(1+y'^{2})^{\tfrac{3}{2}}}$
For a curve given by the parametric equations $\begin{cases} x=\varphi (t) \\ y=\psi (t) \end{cases}$, the curvature is $k=\frac{\left| \varphi '(t)\psi ''(t)-\varphi ''(t)\psi '(t) \right|}{[\varphi '^{2}(t)+\psi '^{2}(t)]^{\tfrac{3}{2}}}$

17. Radius of curvature

The curvature $k$ ($k\ne 0$) of a curve at a point $M$ and the radius of curvature $\rho$ at $M$ are related by $\rho =\frac{1}{k}$.

Linear Algebra

Determinants

1. Expansion of a determinant along a row (column)

(1) Let $A = ( a_{ij} )_{n \times n}$ and let $A_{ij}$ denote the cofactor of $a_{ij}$. Then $a_{i1}A_{j1} +a_{i2}A_{j2} + \cdots + a_{in}A_{jn} = \begin{cases}|A|, & i=j\\ 0, & i \neq j\end{cases}$

or $a_{1i}A_{1j} + a_{2i}A_{2j} + \cdots + a_{ni}A_{nj} = \begin{cases}|A|, & i=j\\ 0, & i \neq j\end{cases}$

Equivalently, $AA^{*} = A^{*}A = \left| A \right|E$, where $A^{*} = \begin{pmatrix} A_{11} & A_{21} & \ldots & A_{n1} \\ A_{12} & A_{22} & \ldots & A_{n2} \\ \ldots & \ldots & \ldots & \ldots \\ A_{1n} & A_{2n} & \ldots & A_{nn} \\ \end{pmatrix} = (A_{ji}) = {(A_{ij})}^{T}$

(2) If $A,B$ are square matrices of order $n$, then $\left| AB \right| = \left| A \right|\left| B \right| = \left| B \right|\left| A \right| = \left| BA \right|$, but $\left| A \pm B \right| = \left| A \right| \pm \left| B \right|$ does not hold in general.

(3) $\left| kA \right| = k^{n}\left| A \right|$, where $A$ is a square matrix of order $n$.

(4) If $A$ is a square matrix of order $n$, then $|A^{T}| = |A|$; $|A^{- 1}| = |A|^{- 1}$ (if $A$ is invertible); $|A^{*}| = |A|^{n - 1}$ for $n \geq 2$.

(5) $\begin{vmatrix} A & O \\ O & B \end{vmatrix} = \begin{vmatrix} A & C \\ O & B \end{vmatrix} = \begin{vmatrix} A & O \\ C & B \end{vmatrix} = |A||B|$, where $A,B$ are square matrices, but $\begin{vmatrix} O & A_{m \times m} \\ B_{n \times n} & O \end{vmatrix} = (-1)^{mn}|A||B|$

(6) Vandermonde determinant: $D_{n} = \begin{vmatrix} 1 & 1 & \ldots & 1 \\ x_{1} & x_{2} & \ldots & x_{n} \\ \ldots & \ldots & \ldots & \ldots \\ x_{1}^{n - 1} & x_{2}^{n - 1} & \ldots & x_{n}^{n - 1} \\ \end{vmatrix} = \prod_{1 \leq j < i \leq n}(x_{i} - x_{j})$

(7) If $A$ is a square matrix of order $n$ and $\lambda_{i}\ (i = 1,2,\cdots,n)$ are its $n$ eigenvalues, then
$|A| = \prod_{i = 1}^{n}\lambda_{i}$
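The row expansion in (1) and the Vandermonde formula in (6) can be cross-checked with a small script. This is a sketch under our own naming (`det`, `vandermonde`, `vandermonde_product` are hypothetical helpers), and the recursive expansion is meant for tiny matrices only, since its cost grows factorially.

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total


def vandermonde(xs):
    """Matrix whose row k holds the k-th powers of x_1, ..., x_n."""
    return [[x ** k for x in xs] for k in range(len(xs))]


def vandermonde_product(xs):
    """Product of (x_i - x_j) over all pairs with j < i."""
    prod = 1
    for i in range(len(xs)):
        for j in range(i):
            prod *= xs[i] - xs[j]
    return prod


xs = [2, 3, 5]
print(det(vandermonde(xs)), vandermonde_product(xs))  # 6 6
```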

Matrices

Matrix: an array of $m \times n$ numbers $a_{ij}$ arranged in $m$ rows and $n$ columns, $\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \\ \end{bmatrix}$, is called a matrix, written $A$ or $\left( a_{ij} \right)_{m \times n}$ for short. If $m = n$, $A$ is called a matrix of order $n$, or a square matrix of order $n$.

Linear operations on matrices

1. Matrix addition

Let $A = (a_{ij})$ and $B = (b_{ij})$ be two $m \times n$ matrices. The $m \times n$ matrix $C = (c_{ij}) = (a_{ij} + b_{ij})$ is called the sum of $A$ and $B$, written $A + B = C$.

2. Scalar multiplication

Let $A = (a_{ij})$ be an $m \times n$ matrix and $k$ a constant. The $m \times n$ matrix $(ka_{ij})$ is called the scalar multiple of $A$ by $k$, written $kA$.

3. Matrix multiplication

Let $A = (a_{ij})$ be an $m \times n$ matrix and $B = (b_{ij})$ an $n \times s$ matrix. The $m \times s$ matrix $C = (c_{ij})$ with $c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj} = \sum_{k =1}^{n}a_{ik}b_{kj}$ is called the product of $A$ and $B$, written $C = AB$.

4. Relationships among $\mathbf{A}^{\mathbf{T}}$, $\mathbf{A}^{\mathbf{-1}}$, and $\mathbf{A}^{\mathbf{*}}$

(1) ${(A^{T})}^{T} = A,\ {(AB)}^{T} = B^{T}A^{T},\ {(kA)}^{T} = kA^{T},\ {(A \pm B)}^{T} = A^{T} \pm B^{T}$

(2) $\left( A^{- 1} \right)^{- 1} = A,\ \left( AB \right)^{- 1} = B^{- 1}A^{- 1},\ \left( kA \right)^{- 1} = \frac{1}{k}A^{- 1}\ (k \neq 0)$,

but ${(A \pm B)}^{- 1} = A^{- 1} \pm B^{- 1}$ does not hold in general.

(3) $\left( A^{*} \right)^{*} = |A|^{n - 2}A\ \ (n \geq 3)$, $\left( AB \right)^{*} = B^{*}A^{*}$, $\left( kA \right)^{*} = k^{n -1}A^{*}\ \ (n \geq 2)$,

but $\left( A \pm B \right)^{*} = A^{*} \pm B^{*}$ does not hold in general.

(4) ${(A^{- 1})}^{T} = {(A^{T})}^{- 1},\ \left( A^{- 1} \right)^{*} = {(A^{*})}^{- 1},\ {(A^{*})}^{T} = \left( A^{T} \right)^{*}$

5. Results about $\mathbf{A}^{\mathbf{*}}$

(1) $AA^{*} = A^{*}A = |A|E$

(2) $|A^{*}| = |A|^{n - 1}\ (n \geq 2)$, $\quad {(kA)}^{*} = k^{n -1}A^{*}$, $\quad \left( A^{*} \right)^{*} = |A|^{n - 2}A\ (n \geq 3)$

(3) If $A$ is invertible, then $A^{*} = |A|A^{- 1}$ and ${(A^{*})}^{-1} = \frac{1}{|A|}A$

(4) If $A$ is a square matrix of order $n$, then:

$r(A^*)=\begin{cases}n, & r(A)=n\\ 1, & r(A)=n-1\\ 0, & r(A) < n-1\end{cases}$

6. Results about $\mathbf{A}^{\mathbf{- 1}}$

$A$ is invertible $\Leftrightarrow$ there is a matrix $B$ with $AB = E$; $\Leftrightarrow |A| \neq 0$; $\Leftrightarrow r(A) = n$;

$\Leftrightarrow A$ can be written as a product of elementary matrices; $\Leftrightarrow Ax = 0$ has only the zero solution.

7. Results about matrix rank

(1) Rank $r(A)$ = row rank = column rank;

(2) $r(A_{m \times n}) \leq \min(m,n)$;

(3) $A \neq 0 \Rightarrow r(A) \geq 1$;

(4) $r(A \pm B) \leq r(A) + r(B)$;

(5) Elementary transformations do not change the rank of a matrix;

(6) $r(A) + r(B) - n \leq r(AB) \leq \min(r(A),r(B))$, where $n$ is the number of columns of $A$. In particular, if $AB = O$, then $r(A) + r(B) \leq n$;

(7) If $A^{- 1}$ exists, then $r(AB) = r(B)$; if $B^{- 1}$ exists, then $r(AB) = r(A)$;

if $r(A_{m \times n}) = n$, then $r(AB) = r(B)$; if $r(B_{n \times s}) = n$, then $r(AB) = r(A)$;

(8) $r(A_{m \times n}) = n \Leftrightarrow Ax = 0$ has only the zero solution.

8. Block inverse formulas

$\begin{pmatrix} A & O \\ O & B \\ \end{pmatrix}^{- 1} = \begin{pmatrix} A^{-1} & O \\ O & B^{- 1} \\ \end{pmatrix}$; $\quad \begin{pmatrix} A & C \\ O & B \\\end{pmatrix}^{- 1} = \begin{pmatrix} A^{- 1}& - A^{- 1}CB^{- 1} \\ O & B^{- 1} \\ \end{pmatrix}$;

$\begin{pmatrix} A & O \\ C & B \\ \end{pmatrix}^{- 1} = \begin{pmatrix} A^{- 1}& O \\ - B^{- 1}CA^{- 1} & B^{- 1} \\\end{pmatrix}$; $\quad \begin{pmatrix} O & A \\ B & O \\ \end{pmatrix}^{- 1} =\begin{pmatrix} O & B^{- 1} \\ A^{- 1} & O \\ \end{pmatrix}$

Here $A$ and $B$ are both invertible square matrices.

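Vectors

1. Linear representation of a set of vectors

(1) $\alpha_{1},\alpha_{2},\cdots,\alpha_{s}$ are linearly dependent $\Leftrightarrow$ at least one of them can be written as a linear combination of the others.

(2) If $\alpha_{1},\alpha_{2},\cdots,\alpha_{s}$ are linearly independent and $\alpha_{1},\alpha_{2},\cdots,\alpha_{s},\beta$ are linearly dependent, then $\beta$ can be written as a linear combination of $\alpha_{1},\alpha_{2},\cdots,\alpha_{s}$ in exactly one way.

(3) $\beta$ can be written as a linear combination of $\alpha_{1},\alpha_{2},\cdots,\alpha_{s}$
$\Leftrightarrow r(\alpha_{1},\alpha_{2},\cdots,\alpha_{s}) =r(\alpha_{1},\alpha_{2},\cdots,\alpha_{s},\beta)$

2. Linear dependence of a set of vectors

(1) If a subset is linearly dependent, so is the whole set; if the whole set is linearly independent, so is every subset.

(2) ① $n$ vectors of dimension $n$, $\alpha_{1},\alpha_{2},\cdots,\alpha_{n}$, are linearly independent $\Leftrightarrow \left| \lbrack \alpha_{1},\alpha_{2},\cdots,\alpha_{n} \rbrack \right| \neq 0$; they are linearly dependent $\Leftrightarrow \left| \lbrack \alpha_{1},\alpha_{2},\cdots,\alpha_{n} \rbrack \right| = 0$.

② Any $n + 1$ vectors of dimension $n$ are linearly dependent.

③ If $\alpha_{1},\alpha_{2},\cdots,\alpha_{s}$ are linearly independent, they remain linearly independent after extra components are appended; if a set of vectors is linearly dependent, it remains linearly dependent after some components are removed.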

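4. Relationship between the rank of a set of vectors and the rank of a matrix

Let $r(A_{m \times n}) =r$. The rank $r(A)$ is related to the linear dependence of the row and column vectors of $A$ as follows:

(1) If $r(A_{m \times n}) = r = m$, the row vectors of $A$ are linearly independent.

(2) If $r(A_{m \times n}) = r < m$, the row vectors of $A$ are linearly dependent.

(3) If $r(A_{m \times n}) = r = n$, the column vectors of $A$ are linearly independent.

(4) If $r(A_{m \times n}) = r < n$, the column vectors of $A$ are linearly dependent.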

5. Change of basis in an $\mathbf{n}$-dimensional vector space and the transition matrix

If $\alpha_{1},\alpha_{2},\cdots,\alpha_{n}$ and $\beta_{1},\beta_{2},\cdots,\beta_{n}$ are two bases of a vector space $V$, the change-of-basis formula is:

$(\beta_{1},\beta_{2},\cdots,\beta_{n}) = (\alpha_{1},\alpha_{2},\cdots,\alpha_{n})\begin{bmatrix} c_{11}& c_{12}& \cdots & c_{1n} \\ c_{21}& c_{22}&\cdots & c_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ c_{n1}& c_{n2} & \cdots & c_{nn} \\\end{bmatrix} = (\alpha_{1},\alpha_{2},\cdots,\alpha_{n})C$

where $C$ is an invertible matrix, called the transition matrix from the basis $\alpha_{1},\alpha_{2},\cdots,\alpha_{n}$ to the basis $\beta_{1},\beta_{2},\cdots,\beta_{n}$.

6. Coordinate transformation formula

Suppose the coordinates of a vector $\gamma$ in the bases $\alpha_{1},\alpha_{2},\cdots,\alpha_{n}$ and $\beta_{1},\beta_{2},\cdots,\beta_{n}$ are $X = {(x_{1},x_{2},\cdots,x_{n})}^{T}$ and $Y = \left( y_{1},y_{2},\cdots,y_{n} \right)^{T}$ respectively, that is, $\gamma =x_{1}\alpha_{1} + x_{2}\alpha_{2} + \cdots + x_{n}\alpha_{n} = y_{1}\beta_{1} +y_{2}\beta_{2} + \cdots + y_{n}\beta_{n}$. Then the coordinate transformation formula is $X = CY$, or $Y = C^{- 1}X$, where $C$ is the transition matrix from the basis $\alpha_{1},\alpha_{2},\cdots,\alpha_{n}$ to the basis $\beta_{1},\beta_{2},\cdots,\beta_{n}$.

7. Inner product of vectors

$(\alpha,\beta) = a_{1}b_{1} + a_{2}b_{2} + \cdots + a_{n}b_{n} = \alpha^{T}\beta = \beta^{T}\alpha$

8. Schmidt orthogonalization

If $\alpha_{1},\alpha_{2},\cdots,\alpha_{s}$ are linearly independent, one can construct pairwise orthogonal vectors $\beta_{1},\beta_{2},\cdots,\beta_{s}$ such that each $\beta_{i}$ is a linear combination of $\alpha_{1},\alpha_{2},\cdots,\alpha_{i}$ only $(i= 1,2,\cdots,s)$. Normalizing each $\beta_{i}$ by setting $\gamma_{i} =\frac{\beta_{i}}{\left| \beta_{i}\right|}$ then gives an orthonormal set $\gamma_{1},\gamma_{2},\cdots,\gamma_{s}$. Here
$\beta_{1} = \alpha_{1}$, $\quad \beta_{2} = \alpha_{2} -\frac{(\alpha_{2},\beta_{1})}{(\beta_{1},\beta_{1})}\beta_{1}$, $\quad \beta_{3} =\alpha_{3} - \frac{(\alpha_{3},\beta_{1})}{(\beta_{1},\beta_{1})}\beta_{1} -\frac{(\alpha_{3},\beta_{2})}{(\beta_{2},\beta_{2})}\beta_{2}$, $\ \cdots$

$\beta_{s} = \alpha_{s} - \frac{(\alpha_{s},\beta_{1})}{(\beta_{1},\beta_{1})}\beta_{1} - \frac{(\alpha_{s},\beta_{2})}{(\beta_{2},\beta_{2})}\beta_{2} - \cdots - \frac{(\alpha_{s},\beta_{s - 1})}{(\beta_{s - 1},\beta_{s - 1})}\beta_{s - 1}$
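The recurrence for $\beta_{1},\cdots,\beta_{s}$ translates almost line by line into code. The sketch below uses exact fractions so that orthogonality can be checked without rounding; the function names and the three input vectors are assumptions for illustration, and the normalization step to $\gamma_{i}$ is left out because it introduces square roots.

```python
from fractions import Fraction


def dot(u, v):
    """Inner product (u, v) = u_1 v_1 + ... + u_n v_n."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors):
    """Return pairwise orthogonal beta_1..beta_s built from independent alphas."""
    basis = []
    for alpha in vectors:
        beta = [Fraction(c) for c in alpha]
        for prev in basis:
            # Subtract the projection of alpha onto each earlier beta.
            coeff = dot(alpha, prev) / dot(prev, prev)
            beta = [b - coeff * p for b, p in zip(beta, prev)]
        basis.append(beta)
    return basis


alphas = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]
betas = gram_schmidt(alphas)
for beta in betas:
    print([str(c) for c in beta])
```

For these inputs the third vector comes out as $(-\frac{2}{3},\frac{2}{3},\frac{2}{3})$, and every pair of outputs has inner product zero.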

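9. Orthogonal and orthonormal bases

A basis of a vector space whose vectors are pairwise orthogonal is called an orthogonal basis. If, in addition, every vector in the basis is a unit vector, it is called an orthonormal basis.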

Systems of Linear Equations

1. Cramer's rule

For the linear system $\begin{cases} a_{11}x_{1} + a_{12}x_{2} + \cdots +a_{1n}x_{n} = b_{1} \\ a_{21}x_{1} + a_{22}x_{2} + \cdots + a_{2n}x_{n} =b_{2} \\ \quad\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots \\ a_{n1}x_{1} + a_{n2}x_{2} + \cdots + a_{nn}x_{n} = b_{n} \\ \end{cases}$, if the coefficient determinant $D = \left| A \right| \neq 0$, the system has a unique solution $x_{1} = \frac{D_{1}}{D},x_{2} = \frac{D_{2}}{D},\cdots,x_{n} =\frac{D_{n}}{D}$, where $D_{j}$ is the determinant obtained from $D$ by replacing its $j$-th column with the column of constants on the right-hand side.
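Cramer's rule can be sketched directly from the statement above. The helpers `det` and `cramer` are hypothetical names, the determinant uses naive cofactor expansion, and integer coefficients are assumed so that exact fractions can be returned; for anything beyond small systems, Gaussian elimination would be the practical choice.

```python
from fractions import Fraction


def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum(
        (-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
        for j in range(len(M))
    )


def cramer(A, b):
    """Solve Ax = b for a square integer matrix A with nonzero determinant."""
    D = det(A)
    if D == 0:
        raise ValueError("coefficient determinant is zero; Cramer's rule does not apply")
    solution = []
    for j in range(len(A)):
        # D_j: replace column j of A with the right-hand side b.
        A_j = [row[:j] + [bi] + row[j + 1:] for row, bi in zip(A, b)]
        solution.append(Fraction(det(A_j), D))
    return solution


# 2x + y = 5, x + 3y = 10  has the solution x = 1, y = 3
print(cramer([[2, 1], [1, 3]], [5, 10]))
```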

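2. A square matrix $A$ of order $n$ is invertible $\Leftrightarrow Ax = 0$ has only the zero solution $\Leftrightarrow$ for every $b$, $Ax = b$ has a unique solution. More generally, $r(A_{m \times n}) = n \Leftrightarrow Ax= 0$ has only the zero solution.

3. Necessary and sufficient conditions for a nonhomogeneous linear system to have a solution; properties and structure of solutions

(1) Let $A$ be an $m \times n$ matrix. If $r(A_{m \times n}) = m$, then for $Ax =b$ we necessarily have $r(A) = r(A \vdots b) = m$, so $Ax = b$ has a solution.

(2) Let $x_{1},x_{2},\cdots, x_{s}$ be solutions of $Ax = b$. Then $k_{1}x_{1} + k_{2}x_{2} + \cdots + k_{s}x_{s}$ is again a solution of $Ax =b$ when $k_{1} + k_{2} + \cdots + k_{s} = 1$, and is a solution of $Ax =0$ when $k_{1} + k_{2} + \cdots + k_{s} = 0$. In particular, $\frac{x_{1} + x_{2}}{2}$ is a solution of $Ax = b$, and $2x_{3} - (x_{1} +x_{2})$ is a solution of $Ax = 0$.

(3) The nonhomogeneous system $Ax = b$ has no solution $\Leftrightarrow r(A) + 1 =r(\overline{A})$, where $\overline{A} = (A \vdots b)$ is the augmented matrix, $\Leftrightarrow b$ cannot be written as a linear combination of the column vectors $\alpha_{1},\alpha_{2},\cdots,\alpha_{n}$ of $A$.

4. Fundamental system of solutions and general solution of a homogeneous system; solution space; general solution of a nonhomogeneous system

(1) The homogeneous system $Ax = 0$ always has a solution (the zero solution). When it has nonzero solutions, every linear combination of solution vectors is again a solution, so the set of all solutions of $Ax= 0$ forms a vector space, called the solution space of the system. Its dimension is $n - r(A)$, and a basis of the solution space is called a fundamental system of solutions.

(2) $\eta_{1},\eta_{2},\cdots,\eta_{t}$ is a fundamental system of solutions of $Ax = 0$ if:

  1. $\eta_{1},\eta_{2},\cdots,\eta_{t}$ are solutions of $Ax = 0$;

  2. $\eta_{1},\eta_{2},\cdots,\eta_{t}$ are linearly independent;

  3. every solution of $Ax = 0$ can be written as a linear combination of $\eta_{1},\eta_{2},\cdots,\eta_{t}$.
    Then $k_{1}\eta_{1} + k_{2}\eta_{2} + \cdots + k_{t}\eta_{t}$ is the general solution of $Ax = 0$, where $k_{1},k_{2},\cdots,k_{t}$ are arbitrary constants.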

Eigenvalues and Eigenvectors of a Matrix

1. Definitions and properties of eigenvalues and eigenvectors

(1) If $\lambda$ is an eigenvalue of $A$, then $kA,\ aA + bE,\ A^{2},\ A^{m},\ f(A),\ A^{T},\ A^{- 1},\ A^{*}$ have the eigenvalues
$k\lambda,\ a\lambda + b,\ \lambda^{2},\ \lambda^{m},\ f(\lambda),\ \lambda,\ \lambda^{- 1},\ \frac{|A|}{\lambda}$ respectively (the last two assume $A$ is invertible), with the same corresponding eigenvectors (except for $A^{T}$).

(2) If $\lambda_{1},\lambda_{2},\cdots,\lambda_{n}$ are the $n$ eigenvalues of $A$, then $\sum_{i= 1}^{n}\lambda_{i} = \sum_{i = 1}^{n}a_{ii}$ and $\prod_{i = 1}^{n}\lambda_{i}= |A|$, so $|A| \neq 0 \Leftrightarrow A$ has no zero eigenvalue.

(3) Let $\lambda_{1},\lambda_{2},\cdots,\lambda_{s}$ be $s$ eigenvalues of $A$ with corresponding eigenvectors $\alpha_{1},\alpha_{2},\cdots,\alpha_{s}$.

If $\alpha = k_{1}\alpha_{1} + k_{2}\alpha_{2} + \cdots + k_{s}\alpha_{s}$,

then $A^{n}\alpha = k_{1}A^{n}\alpha_{1} + k_{2}A^{n}\alpha_{2} + \cdots +k_{s}A^{n}\alpha_{s} = k_{1}\lambda_{1}^{n}\alpha_{1} +k_{2}\lambda_{2}^{n}\alpha_{2} + \cdots + k_{s}\lambda_{s}^{n}\alpha_{s}$.

2. Similarity transformations and similar matrices: definitions and properties

(1) If $A \sim B$, then

  1. $A^{T} \sim B^{T},\ A^{- 1} \sim B^{- 1},\ A^{*} \sim B^{*}$

  2. $|A| = |B|,\ \sum_{i = 1}^{n}a_{ii} = \sum_{i =1}^{n}b_{ii},\ r(A) = r(B)$

  3. $|\lambda E - A| = |\lambda E - B|$ for all $\lambda$

3. Necessary and sufficient conditions for a matrix to be diagonalizable by similarity

(1) Let $A$ be a square matrix of order $n$. Then $A$ is diagonalizable $\Leftrightarrow$ for every eigenvalue $\lambda_{i}$ of multiplicity $k_{i}$, $n-r(\lambda_{i}E - A) = k_{i}$.

(2) If $A$ is diagonalizable, then from $P^{- 1}AP = \Lambda$ we get $A = P\Lambda P^{-1}$, and hence $A^{n} = P\Lambda^{n}P^{- 1}$.

(3) Important results

  1. If $A \sim B$ and $C \sim D$, then $\begin{bmatrix} A & O \\ O & C \\\end{bmatrix} \sim \begin{bmatrix} B & O \\ O & D \\\end{bmatrix}$.

  2. If $A \sim B$, then $f(A) \sim f(B)$ and $\left| f(A) \right| = \left| f(B)\right|$, where $f(A)$ is a polynomial in the square matrix $A$ of order $n$.

  3. If $A$ is diagonalizable, the number of its nonzero eigenvalues (counted with multiplicity) equals the rank of $A$.

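4. Eigenvalues, eigenvectors, and similar diagonal matrices of real symmetric matrices

(1) Similar matrices: let $A,B$ be two square matrices of order $n$. If there is an invertible matrix $P$ such that $B =P^{- 1}AP$, then $A$ and $B$ are said to be similar, written $A \sim B$.

(2) Properties of similar matrices: if $A \sim B$, then:

  1. $A^{T} \sim B^{T}$

  2. $A^{- 1} \sim B^{- 1}$ (if $A$ and $B$ are both invertible)

  3. $A^{k} \sim B^{k}$ ($k$ a positive integer)

  4. $\left| \lambda E - A \right| = \left| \lambda E - B \right|$, so $A$ and $B$ have the same eigenvalues

  5. $\left| A \right| = \left| B \right|$, so $A$ and $B$ are either both invertible or both non-invertible

  6. $r(A) = r(B)$ and $\left| \lambda E - A \right| =\left| \lambda E - B \right|$. Conversely, these equalities alone do not guarantee that $A$ and $B$ are similar.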

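Quadratic Forms

1. A homogeneous quadratic function of $\mathbf{n}$ variables $\mathbf{x}_{\mathbf{1}}\mathbf{,}\mathbf{x}_{\mathbf{2}}\mathbf{,\cdots,}\mathbf{x}_{\mathbf{n}}$

$f(x_{1},x_{2},\cdots,x_{n}) = \sum_{i = 1}^{n}\sum_{j =1}^{n}a_{ij}x_{i}x_{j}$, where $a_{ij} = a_{ji}\ (i,j =1,2,\cdots,n)$, is called a quadratic form in $n$ variables, or simply a quadratic form. Setting $x = \begin{bmatrix}x_{1} \\ x_{2} \\ \vdots \\ x_{n} \\ \end{bmatrix}$, $A = \begin{bmatrix} a_{11}& a_{12}& \cdots & a_{1n} \\ a_{21}& a_{22}& \cdots & a_{2n} \\ \cdots &\cdots &\cdots &\cdots \\ a_{n1}& a_{n2} & \cdots & a_{nn} \\\end{bmatrix}$, the quadratic form can be written in matrix-vector form as $f =x^{T}Ax$. Here $A$ is called the matrix of the quadratic form. Because $a_{ij} =a_{ji}\ (i,j =1,2,\cdots,n)$, the matrix of a quadratic form is always symmetric, and quadratic forms correspond one-to-one with symmetric matrices. The rank of $A$ is called the rank of the quadratic form.

2. The law of inertia; standard form and canonical form of a quadratic form

(1) Law of inertia

For any quadratic form, whichever congruence transformation is used to reduce it to a standard form containing only squared terms, the positive and negative indices of inertia do not depend on the transformation chosen. This is the law of inertia.

(2) Standard form

Under a congruence transformation $x = Cy$, the quadratic form $f\left( x_{1},x_{2},\cdots,x_{n} \right) =x^{T}Ax$ becomes $f = x^{T}Ax =y^{T}C^{T}ACy = \sum_{i = 1}^{r}d_{i}y_{i}^{2}$, which is called a standard form of $f$ $(r \leq n)$. Over a general number field the standard form is not unique and depends on the congruence transformation used, but the number of squared terms with nonzero coefficients is uniquely determined by $r(A)$.

(3) Canonical form

Every real quadratic form $f$ can be reduced by a congruence transformation to the canonical form $f = z_{1}^{2} + z_{2}^{2} + \cdots + z_{p}^{2} - z_{p + 1}^{2} - \cdots -z_{r}^{2}$, where $r$ is the rank of $A$, $p$ is the positive index of inertia, and $r -p$ is the negative index of inertia. The canonical form is unique.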

3. Reducing a quadratic form to standard form by orthogonal transformation or completing the square; positive definiteness of a quadratic form and its matrix

If $A$ is positive definite, then $kA\ (k > 0),\ A^{T},\ A^{- 1},\ A^{*}$ are positive definite; $|A| >0$ and $A$ is invertible; $a_{ii} > 0$ and $|A_{ii}| > 0$.

If $A$ and $B$ are positive definite, then $A +B$ is positive definite, but $AB$ and $BA$ need not be.

$A$ is positive definite $\Leftrightarrow f(x) = x^{T}Ax > 0$ for all $x \neq 0$

$\Leftrightarrow$ all leading principal minors of $A$ are greater than zero

$\Leftrightarrow$ all eigenvalues of $A$ are greater than zero

$\Leftrightarrow$ the positive index of inertia of $A$ is $n$

$\Leftrightarrow$ there is an invertible matrix $P$ such that $A = P^{T}P$

$\Leftrightarrow$ there is an orthogonal matrix $Q$ such that $Q^{T}AQ = Q^{- 1}AQ =\begin{pmatrix} \lambda_{1} & & \\ & \ddots & \\ & & \lambda_{n} \\ \end{pmatrix}$,

where $\lambda_{i} > 0,\ i = 1,2,\cdots,n$.

Probability and Mathematical Statistics

Random Events and Probability

1. Relations and operations on events

(1) Sub-event: $A \subset B$; if $A$ occurs, then $B$ occurs.

(2) Equal events: $A = B$, that is, $A \subset B$ and $B \subset A$.

(3) Union: $A\bigcup B$ (or $A + B$); at least one of $A$ and $B$ occurs.

(4) Difference: $A - B$; $A$ occurs but $B$ does not.

(5) Intersection: $A\bigcap B$ (or $AB$); $A$ and $B$ both occur.

(6) Mutually exclusive (incompatible) events: $A\bigcap B=\varnothing$.

(7) Complementary (opposite) events:
$A\bigcap B=\varnothing ,\ A\bigcup B=\Omega ,\ A=\bar{B},\ B=\bar{A}$

2. Laws of operation
(1) Commutative laws: $A\bigcup B=B\bigcup A,\ A\bigcap B=B\bigcap A$
(2) Associative laws: $(A\bigcup B)\bigcup C=A\bigcup (B\bigcup C),\ (A\bigcap B)\bigcap C=A\bigcap (B\bigcap C)$
(3) Distributive law: $(A\bigcup B)\bigcap C=(A\bigcap C)\bigcup (B\bigcap C)$

3. De Morgan's laws

$\overline{A\bigcup B}=\bar{A}\bigcap \bar{B}$, $\quad \overline{A\bigcap B}=\bar{A}\bigcup \bar{B}$

4. Complete group of events

$A_{1},A_{2},\cdots ,A_{n}$ are pairwise mutually exclusive and their union is the certain event, that is, $A_{i}\bigcap A_{j}=\varnothing \ (i\ne j)$ and $\bigcup\limits_{i=1}^{n}A_{i}=\Omega$

5. Basic probability formulas
(1) Conditional probability:
$P(B|A)=\frac{P(AB)}{P(A)}$, the probability that $B$ occurs given that $A$ has occurred (with $P(A)>0$).
(2) Law of total probability:
$P(A)=\sum\limits_{i=1}^{n}P(A|B_{i})P(B_{i})$, where $B_{i}B_{j}=\varnothing,\ i\ne j$, and $\bigcup\limits_{i=1}^{n}B_{i}=\Omega$
(3) Bayes' formula:

$P(B_{j}|A)=\frac{P(A|B_{j})P(B_{j})}{\sum\limits_{i=1}^{n}P(A|B_{i})P(B_{i})},\ j=1,2,\cdots ,n$
Note: the number of events $B_{i}$ in the formulas above may be countably infinite.
(4) Multiplication formula:
$P(A_{1}A_{2})=P(A_{1})P(A_{2}|A_{1})=P(A_{2})P(A_{1}|A_{2})$
$P(A_{1}A_{2}\cdots A_{n})=P(A_{1})P(A_{2}|A_{1})P(A_{3}|A_{1}A_{2})\cdots P(A_{n}|A_{1}A_{2}\cdots A_{n-1})$
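
The law of total probability and Bayes' formula can be worked through with exact fractions. The following is a minimal sketch: the numbers are invented purely for illustration (three hypothetical machines $B_{1},B_{2},B_{3}$ with made-up production shares and defect rates), and the helper names are assumptions, not part of the original notes.

```python
from fractions import Fraction

def total_probability(priors, likelihoods):
    """P(A) = sum_i P(A|B_i) P(B_i) for a complete group B_1..B_n."""
    return sum(l * p for l, p in zip(likelihoods, priors))

def bayes_posterior(priors, likelihoods):
    """P(B_j|A) for every j, using the total probability as denominator."""
    evidence = total_probability(priors, likelihoods)
    return [l * p / evidence for l, p in zip(likelihoods, priors)]

# Hypothetical data: P(B_i) is each machine's share, P(A|B_i) its defect rate.
priors = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
likelihoods = [Fraction(1, 100), Fraction(2, 100), Fraction(3, 100)]

p_a = total_probability(priors, likelihoods)
posterior = bayes_posterior(priors, likelihoods)
print(p_a)         # 17/1000
print(posterior)   # [Fraction(5, 17), Fraction(6, 17), Fraction(6, 17)]
```

The posterior probabilities sum to 1, as they must, because the $B_{i}$ form a complete group of events.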

6. Independence of events
(1) $A$ and $B$ are independent $\Leftrightarrow P(AB)=P(A)P(B)$
(2) $A$, $B$, $C$ are pairwise independent
$\Leftrightarrow P(AB)=P(A)P(B)$; $P(BC)=P(B)P(C)$; $P(AC)=P(A)P(C)$
(3) $A$, $B$, $C$ are mutually independent
$\Leftrightarrow P(AB)=P(A)P(B)$; $P(BC)=P(B)P(C)$;
$P(AC)=P(A)P(C)$; $P(ABC)=P(A)P(B)P(C)$

7. Independent repeated trials

If a trial is repeated independently $n$ times and the event $A$ occurs with probability $p$ in each trial, then the probability that $A$ occurs exactly $k$ times in the $n$ trials (writing $X$ for the number of occurrences) is:
$P(X=k)=C_{n}^{k}p^{k}(1-p)^{n-k}$
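
As a small worked instance of this formula, the sketch below computes the probability of exactly two sixes in four throws of a fair die. The scenario and the function name `binomial_pmf` are illustrative assumptions; `math.comb` supplies $C_{n}^{k}$.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p, k):
    """P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

p = Fraction(1, 6)                      # probability of a six in one throw
prob_two_sixes = binomial_pmf(4, p, 2)  # exactly two sixes in four throws
total = sum(binomial_pmf(4, p, k) for k in range(5))
print(prob_two_sixes)  # 25/216
print(total)           # 1
```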
8. Important formulas and conclusions
(1) $P(\bar{A})=1-P(A)$
(2) $P(A\bigcup B)=P(A)+P(B)-P(AB)$
$P(A\bigcup B\bigcup C)=P(A)+P(B)+P(C)-P(AB)-P(BC)-P(AC)+P(ABC)$
(3) $P(A-B)=P(A)-P(AB)$
(4) $P(A\bar{B})=P(A)-P(AB),\ P(A)=P(AB)+P(A\bar{B}),$
$P(A\bigcup B)=P(A)+P(\bar{A}B)=P(AB)+P(A\bar{B})+P(\bar{A}B)$
(5) The conditional probability $P(\cdot |B)$ satisfies all the properties of probability,
for example: $P(\bar{A}_{1}|B)=1-P(A_{1}|B)$
$P(A_{1}\bigcup A_{2}|B)=P(A_{1}|B)+P(A_{2}|B)-P(A_{1}A_{2}|B)$
$P(A_{1}A_{2}|B)=P(A_{1}|B)P(A_{2}|A_{1}B)$
(6) If $A_{1},A_{2},\cdots ,A_{n}$ are mutually independent, then $P(\bigcap\limits_{i=1}^{n}A_{i})=\prod\limits_{i=1}^{n}P(A_{i}),$
$P(\bigcup\limits_{i=1}^{n}A_{i})=1-\prod\limits_{i=1}^{n}(1-P(A_{i}))$
(7) Relationship among mutual exclusivity, complementarity and independence:
$A$ and $B$ complementary $\Rightarrow$ $A$ and $B$ mutually exclusive, but the converse does not hold; $A$ and $B$ mutually exclusive (or complementary) with both of nonzero probability $\Rightarrow$ $A$ and $B$ are not independent.
(8) If $A_{1},A_{2},\cdots ,A_{m},B_{1},B_{2},\cdots ,B_{n}$ are mutually independent, then $f(A_{1},A_{2},\cdots ,A_{m})$ and $g(B_{1},B_{2},\cdots ,B_{n})$ are also independent, where $f(\cdot ),g(\cdot )$ denote the events obtained by applying arbitrary event operations to the respective events. In addition, an event with probability 1 (or 0) is independent of every event.
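
Formula (6) for the union of mutually independent events says that at least one event occurs unless none of them does. The sketch below compares that closed form with a brute-force sum over all outcomes; the probabilities and function names are assumptions chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

def union_probability_independent(probs):
    """P(A_1 u ... u A_n) = 1 - prod(1 - P(A_i)) for independent events."""
    none_occur = Fraction(1)
    for p in probs:
        none_occur *= 1 - p
    return 1 - none_occur

def union_probability_bruteforce(probs):
    """Sum the probabilities of all outcomes in which at least one event occurs."""
    total = Fraction(0)
    for outcome in product([True, False], repeat=len(probs)):
        if not any(outcome):
            continue
        weight = Fraction(1)
        for occurs, p in zip(outcome, probs):
            weight *= p if occurs else 1 - p
        total += weight
    return total

probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)]
closed_form = union_probability_independent(probs)
brute_force = union_probability_bruteforce(probs)
print(closed_form, brute_force)  # 3/4 3/4
```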

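Random Variables and Their Probability Distributions

1. Random variables and probability distributions

A random variable is a variable whose value is random; strictly speaking, it is a real-valued function defined on the sample space. Its probability distribution usually refers to the distribution function or the distribution law.

2. Concept and properties of the distribution function

Definition: $F(x) = P(X \leq x),\ -\infty < x < +\infty$

Properties: (1) $0 \leq F(x) \leq 1$

(2) $F(x)$ is monotonically non-decreasing

(3) Right-continuity: $F(x + 0) = F(x)$

(4) $F(-\infty) = 0,\ F(+\infty) = 1$

3. Probability distribution of a discrete random variable

$P(X = x_{i}) = p_{i},\ i = 1,2,\cdots,n,\cdots \quad\quad p_{i} \geq 0,\ \sum_{i=1}^{\infty}p_{i} = 1$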

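4. Probability density of a continuous random variable

The probability density $f(x)$ is non-negative and integrable, and:

(1) $f(x) \geq 0,$

(2) $\int_{-\infty}^{+\infty}f(x)dx = 1$

(3) If $x$ is a point of continuity of $f(x)$, then:

$f(x) = F'(x)$, where the distribution function is $F(x) = \int_{-\infty}^{x}f(t)dt$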

5. Common distributions

(1) 0-1 distribution: $P(X = k) = p^{k}(1 - p)^{1 - k},\ k = 0,1$

(2) Binomial distribution $B(n,p)$: $P(X = k) = C_{n}^{k}p^{k}(1 - p)^{n - k},\ k = 0,1,\cdots,n$

(3) Poisson distribution $P(\lambda)$: $P(X = k) = \frac{\lambda^{k}}{k!}e^{-\lambda},\ \lambda > 0,\ k = 0,1,2,\cdots$

(4) Uniform distribution $U(a,b)$: $f(x) = \begin{cases} \frac{1}{b - a}, & a < x < b \\ 0, & \text{otherwise} \end{cases}$

(5) Normal distribution $N(\mu,\sigma^{2})$: $\varphi(x) = \frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x - \mu)^{2}}{2\sigma^{2}}},\ \sigma > 0,\ -\infty < x < +\infty$

(6) Exponential distribution $E(\lambda)$: $f(x) = \begin{cases} \lambda e^{-\lambda x}, & x > 0 \\ 0, & \text{otherwise} \end{cases},\ \lambda > 0$

(7) Geometric distribution $G(p)$: $P(X = k) = (1 - p)^{k - 1}p,\ 0 < p < 1,\ k = 1,2,\cdots$

(8) Hypergeometric distribution $H(N,M,n)$: $P(X = k) = \frac{C_{M}^{k}C_{N - M}^{n - k}}{C_{N}^{n}},\ k = 0,1,\cdots,\min(n,M)$

6. Probability distribution of a function of a random variable

(1) Discrete case: $P(X = x_{i}) = p_{i},\ Y = g(X)$

Then: $P(Y = y_{j}) = \sum_{g(x_{i}) = y_{j}}P(X = x_{i})$

(2) Continuous case: $X\sim f_{X}(x),\ Y = g(X)$

Then: $F_{Y}(y) = P(Y \leq y) = P(g(X) \leq y) = \int_{g(x) \leq y}f_{X}(x)dx$, $f_{Y}(y) = F'_{Y}(y)$

7. Important formulas and conclusions

(1) $X\sim N(0,1) \Rightarrow \varphi(0) = \frac{1}{\sqrt{2\pi}},\ \Phi(0) = \frac{1}{2},\ \Phi(-a) = P(X \leq -a) = 1 - \Phi(a)$

(2) $X\sim N\left( \mu,\sigma^{2} \right) \Rightarrow \frac{X - \mu}{\sigma}\sim N\left( 0,1 \right),\ P(X \leq a) = \Phi\left(\frac{a - \mu}{\sigma}\right)$

(3) $X\sim E(\lambda) \Rightarrow P(X > s + t|X > s) = P(X > t)$

(4) $X\sim G(p) \Rightarrow P(X = m + k|X > m) = P(X = k)$

(5) The distribution function of a discrete random variable is a step function with jump discontinuities; the distribution function of a continuous random variable is continuous, but not necessarily differentiable everywhere.

(6) There exist random variables that are neither discrete nor continuous.
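
Conclusions (1) to (3) can be checked numerically. The sketch below expresses $\Phi$ through the error function, $\Phi(x)=\frac{1}{2}\left(1+\operatorname{erf}(x/\sqrt{2})\right)$, which is a standard identity rather than something stated in the notes; the function names and the sample values of $a$, $s$, $t$, $\lambda$ are assumptions.

```python
import math

def standard_normal_cdf(x):
    """Phi(x) for N(0, 1), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_cdf(a, mu, sigma):
    """P(X <= a) for X ~ N(mu, sigma^2), by standardizing to N(0, 1)."""
    return standard_normal_cdf((a - mu) / sigma)

def exponential_tail(t, lam):
    """P(X > t) for X ~ E(lam), t >= 0."""
    return math.exp(-lam * t)

a = 1.3
symmetry_gap = standard_normal_cdf(-a) - (1.0 - standard_normal_cdf(a))

s, t, lam = 2.0, 0.7, 1.5
conditional_tail = exponential_tail(s + t, lam) / exponential_tail(s, lam)
memoryless_gap = conditional_tail - exponential_tail(t, lam)

print(standard_normal_cdf(0.0), symmetry_gap, memoryless_gap)
```

Both gaps are zero up to floating-point rounding, which is what the symmetry of $\Phi$ and the memoryless property of the exponential distribution predict.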

Multidimensional Random Variables and Their Distributions

1. Two-dimensional random variables and their joint distribution

For a random vector $(X,Y)$ formed from two random variables, the joint distribution function is $F(x,y) = P(X \leq x,Y \leq y)$

2. Distribution of two-dimensional discrete random variables

(1) Joint distribution law: $P\{ X = x_{i},Y = y_{j}\} = p_{ij};\ i,j = 1,2,\cdots$

(2) Marginal distribution laws: $p_{i \cdot} = \sum_{j = 1}^{\infty}p_{ij},\ i = 1,2,\cdots$; $p_{\cdot j} = \sum_{i = 1}^{\infty}p_{ij},\ j = 1,2,\cdots$

(3) Conditional distribution laws: $P\{ X = x_{i}|Y = y_{j}\} = \frac{p_{ij}}{p_{\cdot j}}$
$P\{ Y = y_{j}|X = x_{i}\} = \frac{p_{ij}}{p_{i \cdot}}$
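
The marginal and conditional laws can be read off a joint table mechanically. A minimal sketch, assuming a made-up $2\times 2$ joint distribution stored as a dictionary keyed by $(x_{i},y_{j})$; the table and the helper names are illustrative only.

```python
from fractions import Fraction

# Hypothetical joint distribution law p_ij, keyed by (x_i, y_j).
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(2, 8), (1, 1): Fraction(2, 8),
}

def marginal_x(joint):
    """p_i. = sum over j of p_ij."""
    result = {}
    for (x, _), p in joint.items():
        result[x] = result.get(x, Fraction(0)) + p
    return result

def marginal_y(joint):
    """p_.j = sum over i of p_ij."""
    result = {}
    for (_, y), p in joint.items():
        result[y] = result.get(y, Fraction(0)) + p
    return result

def conditional_x_given_y(joint, y):
    """P{X = x_i | Y = y_j} = p_ij / p_.j."""
    p_y = marginal_y(joint)[y]
    return {x: p / p_y for (x, yy), p in joint.items() if yy == y}

print(marginal_x(joint))                # X marginal: 1/2, 1/2
print(marginal_y(joint))                # Y marginal: 3/8, 5/8
print(conditional_x_given_y(joint, 1))  # X given Y=1: 3/5, 2/5
```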

3. Density of two-dimensional continuous random variables

(1) Joint probability density $f(x,y)$:

  1. $f(x,y) \geq 0$

  2. $\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}f(x,y)dxdy = 1$

(2) Distribution function: $F(x,y) = \int_{-\infty}^{x}\int_{-\infty}^{y}f(u,v)dvdu$

(3) Marginal probability densities: $f_{X}(x) = \int_{-\infty}^{+\infty}f(x,y)dy$, $f_{Y}(y) = \int_{-\infty}^{+\infty}f(x,y)dx$

(4) Conditional probability densities: $f_{X|Y}(x|y) = \frac{f(x,y)}{f_{Y}(y)}$, $f_{Y|X}(y|x) = \frac{f(x,y)}{f_{X}(x)}$

4. Common joint distributions of two-dimensional random variables

(1) Two-dimensional uniform distribution: $(X,Y) \sim U(D)$, $f(x,y) = \begin{cases} \frac{1}{S(D)}, & (x,y) \in D \\ 0, & \text{otherwise} \end{cases}$, where $S(D)$ is the area of the region $D$

(2) Two-dimensional normal distribution: $(X,Y)\sim N(\mu_{1},\mu_{2},\sigma_{1}^{2},\sigma_{2}^{2},\rho)$,

$f(x,y) = \frac{1}{2\pi\sigma_{1}\sigma_{2}\sqrt{1 - \rho^{2}}}\exp\left\{ \frac{-1}{2(1 - \rho^{2})}\left\lbrack \frac{(x - \mu_{1})^{2}}{\sigma_{1}^{2}} - 2\rho\frac{(x - \mu_{1})(y - \mu_{2})}{\sigma_{1}\sigma_{2}} + \frac{(y - \mu_{2})^{2}}{\sigma_{2}^{2}} \right\rbrack \right\}$

5. Independence and correlation of random variables

$X$ and $Y$ are independent $\Leftrightarrow F(x,y) = F_{X}(x)F_{Y}(y)$

$\Leftrightarrow p_{ij} = p_{i \cdot} \cdot p_{\cdot j}$ (discrete case)
$\Leftrightarrow f(x,y) = f_{X}(x)f_{Y}(y)$ (continuous case)

Correlation of $X$ and $Y$:

If the correlation coefficient $\rho_{XY} = 0$, $X$ and $Y$ are said to be uncorrelated;
otherwise $X$ and $Y$ are said to be correlated.

6. Probability distribution of simple functions of two random variables

Discrete case: $P\left( X = x_{i},Y = y_{j} \right) = p_{ij},\ Z = g\left( X,Y \right)$. Then:

$P(Z = z_{k}) = P\left\{ g\left( X,Y \right) = z_{k} \right\} = \sum_{g\left( x_{i},y_{j} \right) = z_{k}}P\left( X = x_{i},Y = y_{j} \right)$

Continuous case: $\left( X,Y \right) \sim f\left( x,y \right),\ Z = g\left( X,Y \right)$.
Then:

$F_{Z}\left( z \right) = P\left\{ g\left( X,Y \right) \leq z \right\} = \iint_{g(x,y) \leq z}f(x,y)dxdy$, $f_{Z}(z) = F'_{Z}(z)$

7. Important formulas and conclusions

(1) Marginal density formulas: $f_{X}(x) = \int_{-\infty}^{+\infty}f(x,y)dy,$
$f_{Y}(y) = \int_{-\infty}^{+\infty}f(x,y)dx$

(2) $P\left\{ \left( X,Y \right) \in D \right\} = \iint_{D}f\left( x,y \right)dxdy$

(3) If $(X,Y)$ follows the two-dimensional normal distribution $N(\mu_{1},\mu_{2},\sigma_{1}^{2},\sigma_{2}^{2},\rho)$,
then:

  1. $X\sim N\left( \mu_{1},\sigma_{1}^{2} \right),\ Y\sim N(\mu_{2},\sigma_{2}^{2}).$

  2. $X$ and $Y$ are independent $\Leftrightarrow \rho = 0$, i.e. $X$ and $Y$ are uncorrelated.

  3. $C_{1}X + C_{2}Y\sim N(C_{1}\mu_{1} + C_{2}\mu_{2},\ C_{1}^{2}\sigma_{1}^{2} + C_{2}^{2}\sigma_{2}^{2} + 2C_{1}C_{2}\sigma_{1}\sigma_{2}\rho)$

  4. The conditional distribution of $X$ given $Y=y$ is: $N(\mu_{1} + \rho\frac{\sigma_{1}}{\sigma_{2}}(y - \mu_{2}),\ \sigma_{1}^{2}(1 - \rho^{2}))$

  5. The conditional distribution of $Y$ given $X = x$ is: $N(\mu_{2} + \rho\frac{\sigma_{2}}{\sigma_{1}}(x - \mu_{1}),\ \sigma_{2}^{2}(1 - \rho^{2}))$

(4) If $X$ and $Y$ are independent and follow $N(\mu_{1},\sigma_{1}^{2})$ and $N(\mu_{2},\sigma_{2}^{2})$ respectively,
then: $\left( X,Y \right)\sim N(\mu_{1},\mu_{2},\sigma_{1}^{2},\sigma_{2}^{2},0),$

$C_{1}X + C_{2}Y\sim N(C_{1}\mu_{1} + C_{2}\mu_{2},\ C_{1}^{2}\sigma_{1}^{2} + C_{2}^{2}\sigma_{2}^{2}).$

(5) If $X$ and $Y$ are independent and $f\left( x \right)$ and $g\left( x \right)$ are continuous functions, then $f\left( X \right)$ and $g(Y)$ are also independent.

Numerical Characteristics of Random Variables

1. Mathematical expectation

Discrete case: $P\left\{ X = x_{i} \right\} = p_{i},\ E(X) = \sum_{i}x_{i}p_{i}$

Continuous case: $X\sim f(x),\ E(X) = \int_{-\infty}^{+\infty}xf(x)dx$

Properties:

(1) $E(C) = C,\ E\lbrack E(X)\rbrack = E(X)$

(2) $E(C_{1}X + C_{2}Y) = C_{1}E(X) + C_{2}E(Y)$

(3) If $X$ and $Y$ are independent, then $E(XY) = E(X)E(Y)$

(4) $\left\lbrack E(XY) \right\rbrack^{2} \leq E(X^{2})E(Y^{2})$

2. Variance: $D(X) = E\left\{ \left\lbrack X - E(X) \right\rbrack^{2} \right\} = E(X^{2}) - \left\lbrack E(X) \right\rbrack^{2}$

3. Standard deviation: $\sqrt{D(X)}$

4. Discrete case: $D(X) = \sum_{i}\left\lbrack x_{i} - E(X) \right\rbrack^{2}p_{i}$

5. Continuous case: $D(X) = \int_{-\infty}^{+\infty}\left\lbrack x - E(X) \right\rbrack^{2}f(x)dx$

Properties:

(1) $D(C) = 0,\ D\lbrack E(X)\rbrack = 0,\ D\lbrack D(X)\rbrack = 0$

(2) If $X$ and $Y$ are independent, then $D(X \pm Y) = D(X) + D(Y)$

(3) $D\left( C_{1}X + C_{2} \right) = C_{1}^{2}D\left( X \right)$

(4) In general, $D(X \pm Y) = D(X) + D(Y) \pm 2Cov(X,Y) = D(X) + D(Y) \pm 2\rho\sqrt{D(X)}\sqrt{D(Y)}$

(5) $D\left( X \right) < E\left( X - C \right)^{2},\ C \neq E\left( X \right)$

(6) $D(X) = 0 \Leftrightarrow P\left\{ X = C \right\} = 1$ for some constant $C$
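
The definitions of expectation and variance for a discrete distribution, the identity $D(X)=E(X^{2})-[E(X)]^{2}$, and property (3) can be checked on a fair die. The sketch below is an illustration under that assumed example; the helper names `expectation` and `variance` are not from the original notes.

```python
from fractions import Fraction

def expectation(dist, g=lambda x: x):
    """E[g(X)] = sum_i g(x_i) p_i for a distribution given as {x_i: p_i}."""
    return sum(g(x) * p for x, p in dist.items())

def variance(dist):
    """D(X) = sum_i (x_i - E(X))^2 p_i."""
    mean = expectation(dist)
    return expectation(dist, lambda x: (x - mean) ** 2)

die = {face: Fraction(1, 6) for face in range(1, 7)}   # fair six-sided die
mean = expectation(die)
var = variance(die)
second_moment = expectation(die, lambda x: x * x)

# Distribution of 3X + 2, used to check D(C1*X + C2) = C1^2 * D(X).
scaled = {3 * x + 2: p for x, p in die.items()}

print(mean, var)                  # 7/2 35/12
print(second_moment - mean ** 2)  # 35/12
print(variance(scaled))           # 105/4
```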

6. Mathematical expectation of a function of random variables

(1) For the function $Y = g(X)$:

$X$ discrete: $P\{ X = x_{i}\} = p_{i},\ E(Y) = \sum_{i}g(x_{i})p_{i}$;

$X$ continuous: $X\sim f(x),\ E(Y) = \int_{-\infty}^{+\infty}g(x)f(x)dx$

(2) For $Z = g(X,Y)$:

$(X,Y)$ discrete: $P\{ X = x_{i},Y = y_{j}\} = p_{ij},\ E(Z) = \sum_{i}\sum_{j}g(x_{i},y_{j})p_{ij}$;

$(X,Y)$ continuous: $\left( X,Y \right)\sim f(x,y),\ E(Z) = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}g(x,y)f(x,y)dxdy$

7. Covariance

$Cov(X,Y) = E\left\lbrack (X - E(X))(Y - E(Y)) \right\rbrack$

8. Correlation coefficient

$\rho_{XY} = \frac{Cov(X,Y)}{\sqrt{D(X)}\sqrt{D(Y)}}$

$k$-th origin moment: $E(X^{k})$;
$k$-th central moment: $E\left\{ \lbrack X - E(X)\rbrack^{k} \right\}$

Properties:

(1) $Cov(X,Y) = Cov(Y,X)$

(2) $Cov(aX,bY) = abCov(X,Y)$

(3) $Cov(X_{1} + X_{2},Y) = Cov(X_{1},Y) + Cov(X_{2},Y)$

(4) $\left| \rho\left( X,Y \right) \right| \leq 1$

(5) $\rho\left( X,Y \right) = 1 \Leftrightarrow P\left( Y = aX + b \right) = 1$, where $a > 0$

$\rho\left( X,Y \right) = -1 \Leftrightarrow P\left( Y = aX + b \right) = 1$, where $a < 0$

9. Important formulas and conclusions

(1) $D(X) = E(X^{2}) - E^{2}(X)$

(2) $Cov(X,Y) = E(XY) - E(X)E(Y)$

(3) $\left| \rho\left( X,Y \right) \right| \leq 1$, and $\rho\left( X,Y \right) = 1 \Leftrightarrow P\left( Y = aX + b \right) = 1$, where $a > 0$

$\rho\left( X,Y \right) = -1 \Leftrightarrow P\left( Y = aX + b \right) = 1$, where $a < 0$

(4) The following 5 conditions are equivalent to one another:

$\rho(X,Y) = 0$ $\Leftrightarrow Cov(X,Y) = 0$ $\Leftrightarrow E(XY) = E(X)E(Y)$ $\Leftrightarrow D(X + Y) = D(X) + D(Y)$ $\Leftrightarrow D(X - Y) = D(X) + D(Y)$

Note: independence of $X$ and $Y$ is a sufficient condition for any one of these 5 conditions to hold, but not a necessary one.
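
A standard counterexample makes the note concrete: take $X$ uniform on $\{-1,0,1\}$ and $Y=X^{2}$. The sketch below, with this assumed example and illustrative helper names, shows that $Cov(X,Y)=0$ although $X$ and $Y$ are clearly dependent.

```python
from fractions import Fraction

# Joint law of (X, Y) with X uniform on {-1, 0, 1} and Y = X^2.
joint = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}

def expect(joint, g):
    """E[g(X, Y)] = sum over (x, y) of g(x, y) * p(x, y)."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

def prob(joint, event):
    """Probability of the set of (x, y) pairs for which event(x, y) is true."""
    return sum(p for (x, y), p in joint.items() if event(x, y))

e_x = expect(joint, lambda x, y: x)
e_y = expect(joint, lambda x, y: y)
e_xy = expect(joint, lambda x, y: x * y)
cov = e_xy - e_x * e_y            # Cov(X, Y) = E(XY) - E(X)E(Y)

p_x0 = prob(joint, lambda x, y: x == 0)
p_y0 = prob(joint, lambda x, y: y == 0)
p_x0_y0 = prob(joint, lambda x, y: x == 0 and y == 0)

print(cov)                   # 0, so X and Y are uncorrelated
print(p_x0_y0, p_x0 * p_y0)  # 1/3 versus 1/9, so X and Y are not independent
```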

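Basic Concepts of Mathematical Statistics

1. Basic concepts

Population: the entirety of the objects under study; it is a random variable, denoted by $X$.

Individual: each basic element that makes up the population.

Simple random sample: $n$ mutually independent random variables $X_{1},X_{2},\cdots,X_{n}$ drawn from the population $X$ and identically distributed with it are called a simple random sample of size $n$, or simply a sample.

Statistic: let $X_{1},X_{2},\cdots,X_{n}$ be a sample from the population $X$ and let $g(X_{1},X_{2},\cdots,X_{n})$ be a continuous function of the sample; if $g$ contains no unknown parameters, then $g(X_{1},X_{2},\cdots,X_{n})$ is called a statistic.

Sample mean: $\overline{X} = \frac{1}{n}\sum_{i = 1}^{n}X_{i}$

Sample variance: $S^{2} = \frac{1}{n - 1}\sum_{i = 1}^{n}(X_{i} - \overline{X})^{2}$

Sample moments: sample $k$-th origin moment: $A_{k} = \frac{1}{n}\sum_{i = 1}^{n}X_{i}^{k},\ k = 1,2,\cdots$

Sample $k$-th central moment: $B_{k} = \frac{1}{n}\sum_{i = 1}^{n}(X_{i} - \overline{X})^{k},\ k = 1,2,\cdots$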
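
These statistics are easy to compute with Python's standard `statistics` module, whose `variance` function uses the $n-1$ denominator of $S^{2}$. The data set and the two moment helpers below are illustrative assumptions, not part of the original notes.

```python
import statistics

def sample_raw_moment(data, k):
    """A_k = (1/n) * sum of X_i^k."""
    return sum(x ** k for x in data) / len(data)

def sample_central_moment(data, k):
    """B_k = (1/n) * sum of (X_i - mean)^k."""
    mean = statistics.mean(data)
    return sum((x - mean) ** k for x in data) / len(data)

data = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical observations
n = len(data)

mean = statistics.mean(data)             # sample mean
s2 = statistics.variance(data)           # S^2, divides by n - 1
a1 = sample_raw_moment(data, 1)          # A_1 equals the sample mean
b2 = sample_central_moment(data, 2)      # B_2, divides by n

print(mean, s2, b2)
```

Note that $B_{2}=\frac{n-1}{n}S^{2}$: the second central sample moment and the sample variance differ only in the denominator.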

2. Distributions

$\chi^{2}$ distribution: $\chi^{2} = X_{1}^{2} + X_{2}^{2} + \cdots + X_{n}^{2}\sim\chi^{2}(n)$, where $X_{1},X_{2},\cdots,X_{n}$ are mutually independent and all follow $N(0,1)$

$t$ distribution: $T = \frac{X}{\sqrt{Y/n}}\sim t(n)$, where $X\sim N\left( 0,1 \right),\ Y\sim\chi^{2}(n)$, and $X$, $Y$ are independent.

$F$ distribution: $F = \frac{X/n_{1}}{Y/n_{2}}\sim F(n_{1},n_{2})$, where $X\sim\chi^{2}\left( n_{1} \right),\ Y\sim\chi^{2}(n_{2})$, and $X$, $Y$ are independent.

Quantile: if $P(X \leq x_{\alpha}) = \alpha$, then $x_{\alpha}$ is called the $\alpha$ quantile of $X$

3. Common sampling distributions for a normal population

(1) Let $X_{1},X_{2},\cdots,X_{n}$ be a sample from the normal population $N(\mu,\sigma^{2})$,

$\overline{X} = \frac{1}{n}\sum_{i = 1}^{n}X_{i},\ S^{2} = \frac{1}{n - 1}\sum_{i = 1}^{n}(X_{i} - \overline{X})^{2}$. Then:

  1. $\overline{X}\sim N\left( \mu,\frac{\sigma^{2}}{n} \right)$, or equivalently $\frac{\overline{X} - \mu}{\sigma/\sqrt{n}}\sim N(0,1)$

  2. $\frac{(n - 1)S^{2}}{\sigma^{2}} = \frac{1}{\sigma^{2}}\sum_{i = 1}^{n}(X_{i} - \overline{X})^{2}\sim\chi^{2}(n - 1)$

  3. $\frac{1}{\sigma^{2}}\sum_{i = 1}^{n}(X_{i} - \mu)^{2}\sim\chi^{2}(n)$

  4. $\frac{\overline{X} - \mu}{S/\sqrt{n}}\sim t(n - 1)$

4. Important formulas and conclusions

(1) For $\chi^{2}\sim\chi^{2}(n)$: $E(\chi^{2}(n)) = n,\ D(\chi^{2}(n)) = 2n;$

(2) For $T\sim t(n)$: $E(T) = 0\ (n > 1),\ D(T) = \frac{n}{n - 2}\ (n > 2)$;

(3) For $F\sim F(m,n)$: $\frac{1}{F}\sim F(n,m),\ F_{\alpha/2}(m,n) = \frac{1}{F_{1 - \alpha/2}(n,m)};$

(4) For any population $X$: $E(\overline{X}) = E(X),\ E(S^{2}) = D(X),\ D(\overline{X}) = \frac{D(X)}{n}$
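
Conclusion (4) can be verified exactly on a tiny example by enumerating every possible sample. The sketch below assumes a hypothetical population taking the values 0 and 1 with probability $\frac{1}{2}$ each (so $E(X)=\frac{1}{2}$, $D(X)=\frac{1}{4}$) and a sample size of $n=3$; all names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population: X = 0 or 1 with probability 1/2 each.
population = {0: Fraction(1, 2), 1: Fraction(1, 2)}
n = 3  # sample size

def sample_mean(xs):
    return Fraction(sum(xs), len(xs))

def sample_variance(xs):
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

e_mean = e_mean_sq = e_s2 = Fraction(0)
for xs in product(population, repeat=n):   # all 2^n possible samples
    weight = Fraction(1)
    for x in xs:
        weight *= population[x]            # independence: probabilities multiply
    m = sample_mean(xs)
    e_mean += weight * m
    e_mean_sq += weight * m * m
    e_s2 += weight * sample_variance(xs)

d_mean = e_mean_sq - e_mean ** 2           # D(mean) = E(mean^2) - E(mean)^2

print(e_mean)   # 1/2  = E(X)
print(e_s2)     # 1/4  = D(X)
print(d_mean)   # 1/12 = D(X) / n
```

Dividing by $n-1$ rather than $n$ in $S^{2}$ is what makes $E(S^{2})$ equal to $D(X)$ exactly in this enumeration.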