Выборка из нормального распределения, но игнорирование всех случайных значений, выходящих за пределы указанного диапазона, до моделирования.
Этот метод является правильным, но, как упомянул @ Xi'an в своем ответе, потребуется много времени, когда диапазон мал (точнее, когда его мера мала при нормальном распределении).
F−1(U)FU∼Unif(0,1)FG(a,b)G−1(U)U∼Unif(G(a),G(b)) .
G−1G−1GG−1abG .
Имитация усеченного распределения с использованием выборки по важности
N(0,1)GGG(q)=arctan(q)π+12 and G−1(q)=tan(π(q−12)). Therefore, the truncated Cauchy distribution is easy to sample by the inversion method and it is a good choice of the instrumental variable for importance sampling of the truncated normal distribution.
After a bit of simplifications, sampling U∼Unif(G(a),G(b)) and taking G−1(U) is equivalent to take tan(U′) with U′∼Unif(arctan(a),arctan(b)):
a <- 1
b <- 5
nsims <- 10^5
sims <- tan(runif(nsims, atan(a), atan(b)))
Now one has to calculate the weight for each sampled value xi, defined as the ratio ϕ(x)/g(x) of the two densities up to normalization, hence we can take
w(x)=exp(−x2/2)(1+x2),
but it could be safer to take the log-weights:
log_w <- -sims^2/2 + log1p(sims^2)
w <- exp(log_w) # unnormalized weights
w <- w/sum(w)
The weighted sample (xi,w(xi)) allows to estimate the measure of every interval [u,v] under the target distribution, by summing the weights of each sampled value falling inside the interval:
u <- 2; v<- 4
sum(w[sims>u & sims<v])
## [1] 0.1418
This provides an estimate of the target cumulative function.
We can quickly get and plot it with the spatsat
package:
F <- spatstat::ewcdf(sims,w)
# estimated F:
curve(F(x), from=a-0.1, to=b+0.1)
# true F:
curve((pnorm(x)-pnorm(a))/(pnorm(b)-pnorm(a)), add=TRUE, col="red")
# approximate probability of u<x<v:
F(v)-F(u)
## [1] 0.1418
Of course, the sample (xi) is definitely not a sample of the target distribution, but of the instrumental Cauchy distribution, and one gets a sample of the target distribution by performing weighted resampling, for instance using the multinomial sampling:
msample <- rmultinom(1, nsims, w)[,1]
resims <- rep(sims, times=msample)
hist(resims)
mean(resims>u & resims<v)
## [1] 0.1446
Another method: fast inverse transform sampling
Olver and Townsend developed a sampling method for a broad class of continuous distribution. It is implemented in the chebfun2 library for Matlab as well as the ApproxFun library for Julia. I have recently discovered this library and it sounds very promising (not only for random sampling). Basically this is the inversion method but using powerful approximations of the cdf and the inverse cdf. The input is the target density function up to normalization.
The sample is simply generated by the following code:
using ApproxFun
f = Fun(x -> exp(-x.^2./2), [1,5]);
nsims = 10^5;
x = sample(f,nsims);
As checked below, it yields an estimated measure of the interval [2,4] close to the one previously obtained by importance sampling:
sum((x.>2) & (x.<4))/nsims
## 0.14191