首页 > 其他分享 >Simulation-计算统计-随机数生成

Simulation-计算统计-随机数生成

时间:2022-10-15 14:56:01浏览次数:74  
标签:frac 生成 随机数 distribution Simulation Sigma operatorname sim left

library('ggplot2')
library('dplyr')

Lecture 6 Methods for generating random numbers

Goal: Use U(0, 1) numbers to generate observations (variates) from other
distributions, and even stochastic processes.

  • Discrete: Bernoulli, Binomial, Poisson, empirical
  • Continuous: Exponential, Normal (many ways), empirical
  • Multivariate normal
  • Auto-regressive moving average time series
  • Waiting times

Linear congruential generator LCG (线性同余法)

\[X_{n+1} = (aX_n+c) \text{ mod } m, n \geq 0 \]

sample(0:1, size=10,replace=TRUE)
sample(letters)
sample(1:4, size=100, replace=TRUE, prob=1:4/10) %>% 
  table()

Drawbacks: 1. There may be a loop, so not all possible values will be
generated.

uniform pseudo-random generator

u <- runif(10)

Inverse Transform Method

Theorem

If \(X\) is a continuous random variable with cdf \(F(x)\),i.e. \(X\sim F\),
then $$F(X) \sim \operatorname{Uniform}(0,1)$$ Proof. Let $F(X) = U $

Because \(X\) is continuous, then \(F\) is an increasing function, and thus

\[ F(X) \leq u \Longleftrightarrow X \leq F^{-1}(u) . $$ Since $X \sim F$, it follows that for any $0<u<1$. $$ P(U \leq u)=P(F(X) \leq u)=P\left(X \leq F^{-1}(u)\right)=F\left(F^{-1}(u)\right)=u .\]

Def. Inverse transformation of F

\[F^{-1}(u)=\operatorname{inf}\{x:F(x) \geq u\},\quad 0 < u < 1 \]

证明看老师ppt。还有一些题目可以看"Random Variate Generation"

Continuous Case

逆一般比较好求。

求密度分布为F_X(x)的X的随机样本。

  1. 生成\(U\sim \mathcal{U}(0,1)\)

  2. 推导出逆函数\(F^{− 1}(u)\)

  3. 写一个命令或函数来计算\(F^{-1}(u)\)

  4. 对每一随机变量都要求:

(1)生成的随机数\(u\)来自于Uniform(0, 1)

(2)得到的\(x = F^{-1}(u)\), 此处的x就是服从\(F(x)\)

Eg.服从密度分布为\(f(x)=3x^2 (0<x<1)\)的random sample。

set.seed(1)
u <- runif(1000)
x <- u^(1/3) 
hist(x, prob = TRUE)
y <- seq(0, 1, .01)
lines(y, 3 * y^2)


f <- function(x) 3 * x^2
ggplot(as.data.frame(x), aes(x = x)) +
geom_histogram(aes(y = ..density..), color = "black", alpha = 0.7) +
geom_density(color = "red") +
stat_function(fun = f, color = "blue")

结论:生成的随机数直方图与理论密度分布曲线保持一致。符合预期。

Discrete Case

注意一下求逆的过程。

  1. 生成的随机样本\(u\)服从\(\mathcal{U}(0,1)\)
  2. 计算\(x = F^{-1}(u)\)
  3. 当\(F(x_{i-1})<u<F(x_{i})\),输出\(x_i\)

Eg.(Two point distribution) Generate a random sample of Bernoulli
variables with p = 0.4.

u <- runif(1000)
x <- as.integer(u > 0.6)
table(x)

Eg.(几何分布) the probability of dropping the dice n times succeed 1
time.

n <- 1000
p <- 0.25
set.seed(308)
u <- runif(n)
x <- (log(1 - u)/log(1 - p)) %>% ceiling()
hist(x,probability = TRUE)

Sol2:generate Fk, then calculate u is larger than how many Fk. That
figure is F-1(u).

[For distributions that have no analytical notation, Use this Method!!]

n <- 1000
p <- 0.25
x <- numeric(n) #def a numeric vecor of length n
set.seed(308)
u <- runif(1000)
K <- 100 ## set the initial length of cdf vector,in fact this could be infinity
k <- 1:K
Fk <- 1 - (1 - p)^(k - 1)
for(i in 1:n){
  x[i] = sum(u[i] > Fk) # u[i]>Fk is a list of [T,T,T,...,F,F,F], we calculate the number of k's that u is greater than Fk.
}
mean(x)

\text{Note} that not all X are [1,2,3,...], THEY MAY BE [102,304,-3].
Another dict is needed.

g <- c(102,304,03)
#g(x) # (mapping) y=g(x) is what we needed Y~F

\text{Note} that HERE we use k <- 100, however, practically it may be
larger then 100, so we need to use max(x) to check whether K <- 100
is exceeded. For distributions that does not have the analytical cdf, we
can use Fk <- cumsum(pk) to calculate.

The Acceptance-Rejection Method

Motivation: The majority of cdf's cannot be inverted efficiently. A-R
samples form a distribution that is "almost" the one we want, and then
adjusts by "accepting" only a certain proportion of those samples.

假定X与Y是服从密度函数 f 和 g
的随机变量,存在一个常数\(c\)使$$ \frac{f(t)}{g(t)}\le c $$对所有 t
都满足\(f(t)>0\),则接受拒绝法可以用来生成X的随机样本。

目标:生成服从\(f(x)\)的随机样本X

  1. 找到一个随机样本Y,该随机变量的密度函数为g,且满足\(\frac{f(y)}{g(y)} \le c\),同时对所有y,都满足\(f(y)>0\),生成Y的随机样本。

  2. 生成随机样本U来自于\(U(0,1)\)

  3. 如果\(u<\frac{f(y)}{cg(y)}\),则接受y并传递给x,令\(x =y\),
    否则:拒绝y,重复操作2(a).

Note:

1.每成功一次的概率是1/c,所以成功n次的概率是nc,所以上界c越小越好。

  1. g怎么选:可以根据f的support来选。

证明看老师ppt。

Eg. Generate Beta(2, 2) random variables. 即\(f(x) = 6x(1-x)\), \((0<x<1)\)
so let \(g(x) \sim \operatorname{Uniform}(0,1)\), then $g(x) = 1 $, for\(0<x<1\), \(c=6\)

n <- 1000
k <- 0 # counter for acceptance
count <- 0 # count the number of iterations
y <- numeric(n)
while(k < n){
  u <- runif(1)
  count <- count + 1
  x <- runif(1) # random variable from g
  if (x * (1 - x) > u){
    # we accept x
    k <- k + 1
    y[k] <- x
  }
}
j

Transformation Methods

(0) If \(Z \sim N(0,1)\), then \(Z^2 \sim \chi^2(1)\).

  • If \(U \sim \chi^2(m)\) and \(V \sim \chi^2(n)\) are independent, then

\[F=\frac{U / m}{V / n} \sim F(m, n) . \]

  • If \(Z \sim N(0,1)\) and \(V \sim \chi^2(n)\) are independent, then

\[T=\frac{Z}{\sqrt{V / n}} \sim t(n). \]

  • If \(U, V \sim\) Uniform \((0,1)\) are independent, then

\[Z_1=\sqrt{-2 \log U} \cos (2 \pi V), \quad Z_2=\sqrt{-2 \log U} \sin (2 \pi V) \]

are independent \(N(0,1)\) random variables.

  • If \(U \sim \Gamma(r, \lambda)\) and \(V \sim \Gamma(s, \lambda)\) are independent, then

\[X=\frac{U}{U+V} \sim \operatorname{Beta}(r, s) . \]

Sums and Mixtures

Convolutions

Let \(X_1, \ldots, X_n\) be i.i.d. with distribution \(X_i \sim X\) (the distribution function is \(F_X\) ), and let

\[S=X_1+\cdots+X_n . \]

The distribution function of the sum \(S\) is called the \(n\)-fold convolution of \(X\) and denoted by \(F_X^{*(n)}\).

Example

  • \(Z_1^2+Z_2^2 \cdots+Z_m^2 \sim \chi_m^2\)
  • \(X_1+\cdots+X_m \sim\) Negative Binomial \((m, p)\), where \(X \sim \operatorname{Geometric~}(p)\).
  • \(X_1+\cdots+X_m \sim \Gamma(m, \lambda)\), where \(X \sim \operatorname{Exp}(\lambda)\).

Sums (Composition Method)

Eg: Chi-squared distribution \(\chi^2(m)\)

n <- 1000
m <- 3
X <- matrix(rnorm(n * m), ncol = m)^2
y <- rowSums(X)

Mixtures

Using theta

\(X ~ \frac{1}{3} N (\mu_1,\sigma_1^2) + \frac{2}{3} N (\mu_2,\sigma_2^2)\)
的意思是X有1/3的概率取第一个,有2/3的概率取第二个。不能直接用两个分布的相加。!!!!
对比下图x,y有着明显的差异 WHY????

n <- 1000
x1 <- rnorm(n, 0, 1)
x2 <- rnorm(n, 3, 1)
u <- runif(n)
k <- as.integer(u < 1/3)
x <- k * x1 + (1 - k) * x2
y <- 1/3*x1 + 2/3*x2

hist(x)
hist(y)

Continuous mixture
A random variable \(X\) is a continuous mixture if the distribution of \(X\) is

\[F_X(x)=\int_{-\infty}^{\infty} F_{X \mid Y=y}(x) f_Y(y) d y \]

for a family \(X \mid Y=y\) indexed by the real numbers \(y\) and weighting function \(f_Y\) such that \(\int f_Y(y) d y=1\)
Example: The Poisson-Gamma mixture
The negative binomial distribution is a mixture of Poisson \((\Lambda)\) distribution, where \(\Lambda\) has a gamma distribution \(\Gamma(r, \beta)\). That is, if

\[(X \mid \Lambda=\lambda) \sim \operatorname{Poission}(\lambda) \text { and } \Lambda \sim \Gamma(r, \beta), \]

then

\[X \sim \operatorname{Negative} \operatorname{Binomial}\left(r, \frac{\beta}{1+\beta}\right) . \]

Ex:负二项分布==Poisson-Gamma mixture

n <- 1000
r <- 4
beta <- 3
lambda <- rgamma(n, r, beta)
x <- rpois(n, lambda) # 每一个x相当于(对每一个lambda生成一个x)
print(lambda)
print(x)

Multivariate Distribution

如何生成多元正态分布 \(AZ+b ~ N_p(b, AA^T)\)

Theorem
Let \(Z=\left(Z_1, \ldots, Z_p\right)^T\) be a vector of i.i.d. \(N(0,1)\) random variables. Then,

\[Z \sim N_p\left(0, I_p\right) . \]

Assume that \(Z \sim N_p(\mu, \Sigma)\). For any \(A \in \mathbb{R}^{q \times p}\) and \(b \in \mathbb{R}^q\), then

\[A Z+b \sim N_q\left(A \mu+b, A \Sigma A^T\right) . \]

SO, in order to generate \(X\), let's start with a vector \(Z=\left(Z_1, \ldots, Z_k\right)\) of iid \(\operatorname{Nor}(0,1)\) RV's. That is, suppose \(Z \sim \operatorname{Nor}_k(0, I)\), where \(I\) is the \(k \times k\) identity matrix, and 0 is simply a vector of 0 's.

Suppose we can find a (lower triangular) matrix \(C\) such that \(\Sigma=C C^{\mathrm{T}}\).
Then it can be shown that \(X=\mu+C Z\) is multivariate normal with mean \(\mathrm{E}[X]\) and covariance matrix

\[\Sigma \equiv \mathrm{E}\left[(C Z)(C Z)^T\right]=\mathrm{E}\left[C Z Z^T C^T\right]=C\left(\mathrm{E}\left[\boldsymbol{Z} \boldsymbol{Z}^T\right]\right) C^T=C C^T . \]

That is, \(X \sim \operatorname{Nor}_k(\boldsymbol{\mu}, \Sigma)\).

want to find: \(A = \Sigma^{\frac{1}{2}}\)

Choleski factorization method

The Choleski factorization of a real symmetric positive-definite matrix is
\(\Sigma = Q^T Q\), where Q is an upper triangular matrix.

Sigma <- matrix(c(4, 12, 12, 37), nrow = 2)
chol(Sigma)

Spectral decomposition method

\(Q = U\Lambda^{\frac{1}{2}}U^T = \Sigma\)

library(knitr)
library(dplyr)
eigen_Sigma <- eigen(Sigma)
lam <- eigen_Sigma$values
U <- eigen_Sigma$vectors
Lam <- diag(lam)
Lam

U %*% Lam %*% t(U) %>% round()

U %*% sqrt(Lam) %*% t(U) %>% round(digits = 3)

Singular value decomposition (SVD) method

参考:https://blog.csdn.net/Yeeyi_max?type=blog 的一些文章
https://www2.isye.gatech.edu/~sman/courses/Mexico2010/Module07-RandomVariateGeneration.pdf

标签:frac,生成,随机数,distribution,Simulation,Sigma,operatorname,sim,left
From: https://www.cnblogs.com/pny01/p/16794114.html

相关文章

  • django seed模型生成测试数据
     安装要安装djangoseed,请使用pip:pipinstalldjango-seed或从源安装:pythonsetup.pyinstall配置将其添加到settings.py:中已安装的应用程序中INSTALLED_A......
  • springboot MP代码生成器
    1、需要的依赖和版本号(我这个是项目完成后的全部依赖,只参照需要的依赖即可)<?xmlversion="1.0"encoding="UTF-8"?><projectxmlns="http://maven.apache.org/POM/4.0.0......
  • P4180 [BJWC2010] 严格次小生成树
    #include<bits/stdc++.h>usingnamespacestd;#defineintlonglongintn,m;longlongq=2000000000000000000;longlongsum=0;namespacekt{ structedge{ intx......
  • Excel如何生成2个随机值,相加始终为指定的固定值?
    Excel情报局职场联盟Excel生产挖掘分享Excel基础技能Excel爱好者大本营用1%的Excel基础搞定99%的职场问题做一个超级实用的Excel公众号Excel是门手艺玩转需要勇气数万Excel......
  • openssl 生成密钥
    1、下载OpenSSL工具地址:http://slproweb.com/products/Win32OpenSSL.html2、配置OpenSSL的环境变量3、生成rsa密钥命令opensslgenrsa-outrsa_private_key.pem2......
  • 内表生成XML简单实例
    REPORTzlm_xml_02.*&---------------------------------------------------------------------**&声明及定义部分*&---------------------------------------------------......
  • 使用Keras生成可变尺寸输入数据的神经网络
    本教程发布于博客园,转载请注明出处!问题:在使用神经网络处理实际数据时,往往遇到数据尺寸不相同的情况。例如:训练得到一个图片去雾模型后,需要对不同尺寸的照片去雾。解决方......
  • 生成会计凭证 ACC_DOCUMENT 增强可能忽略一个问题
    帮人解决一个BAPI_ACC_DOCUMENT_POST创建会计凭证增强的问题,然后整理了以下内容:  "创建凭证的时候经常会用到  extension2 传一些标准bapi接口未提供的值  CALLFU......
  • [译] PEP 255--简单的生成器
     我正打算写写Python的生成器,然而查资料时发现,引入生成器的PEP没人翻译过,因此就花了点时间翻译出来。如果在阅读时,你有读不懂的地方,不用怀疑,极有可能是我译得不到位。......
  • Python 之父的解析器系列之三:生成一个 PEG 解析器
    原题|GeneratingaPEGParser作者|GuidovanRossum(Python之父)译者|豌豆花下猫(“Python猫”公众号作者)声明|本翻译是出于交流学习的目的,基于​​CCBY-NC-SA4.0......