# Simulation

## Distributions in R

1. Density function (p.d.f.): dDistribution(x,...)
2. Probability function (c.d.f.): pDistribution(q,...)
3. Quantile function (inverse c.d.f.): qDistribution(p,...)
4. Random function (random variate generator): rDistribution(n,...)

where

• x is a numeric vector of values at which the p.d.f. f(x) is evaluated;
• p is a numeric vector of probabilities;
• q is a numeric vector of quantiles;
• n is the number of observations, a numeric vector of length 1. If length(n)>1, then length(n) is taken to be the number of observations.
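
As a small illustration of the four function types, here is a sketch using the normal distribution (norm). The evaluation points, parameter values, and seed are arbitrary choices made for this example only.

```r
# Normal distribution, used purely as an illustration
x <- c(-1, 0, 1)

dens <- dnorm(x)        # density f(x) at each element of x (standard normal)
prob <- pnorm(1.96)     # c.d.f.: P(X <= 1.96)
quan <- qnorm(0.975)    # quantile whose lower-tail probability is 0.975

set.seed(1)             # arbitrary seed, only for reproducibility
draws <- rnorm(5, mean = 10, sd = 2)   # five random variates from N(10, 2^2)

# The quantile function inverts the probability function
all.equal(qnorm(pnorm(1.96)), 1.96)

# When n is a vector, its length (not its values) sets the sample size
length(rnorm(c(10, 20, 30)))
```

The last line returns 3, because n has three elements.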

Note that

• for Probability functions and Quantile functions, an additional argument lower.tail (default TRUE) indicates whether probabilities are lower-tail, P(X ≤ x); lower.tail=FALSE means they are upper-tail probabilities, P(X > x);
• for Density functions, an additional argument log (default FALSE) indicates whether the logarithm of the density is returned, which is convenient for log-likelihood computations;
• for Probability functions and Quantile functions, an additional argument log.p (default FALSE) indicates whether probabilities are given (or returned) as log(p);
• in most cases where applicable, R uses the recycling rule to match the first argument (x, q, p) of the first three types of function with the parameter(s) of the distribution, so these functions are vectorized.
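
The optional arguments and the recycling rule can be sketched as follows. The observations in y and all parameter values are made up for illustration.

```r
# Upper-tail probability: P(X > 1.96) for a standard normal
upper <- pnorm(1.96, lower.tail = FALSE)
all.equal(upper, 1 - pnorm(1.96))

# Upper-tail quantile: the value exceeded with probability 0.025
q_upper <- qnorm(0.025, lower.tail = FALSE)
all.equal(q_upper, qnorm(0.975))

# log = TRUE returns log densities, which sum to a log-likelihood
y <- c(0.3, -1.2, 0.8)                       # made-up observations
loglik <- sum(dnorm(y, mean = 0, sd = 1, log = TRUE))
all.equal(loglik, log(prod(dnorm(y))))

# log.p = TRUE keeps far-tail probabilities representable
p_nat <- pnorm(-40)                # underflows on the natural scale
p_log <- pnorm(-40, log.p = TRUE)  # finite on the log scale

# Recycling: the first argument and the parameters are matched element-wise
centred <- pnorm(c(0, 1, 2), mean = c(0, 1, 2))  # each value sits at its own mean
medians <- qexp(0.5, rate = c(1, 2, 4))          # one median per rate
```

Working on the log scale is mainly useful in the far tails, where probabilities on the natural scale lose all precision.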

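Distributions in package:stats (✓ = available, × = not available)

| Distribution | Density | Probability | Quantile | Random |
|---|---|---|---|---|
| beta | ✓ | ✓ | ✓ | ✓ |
| binom | ✓ | ✓ | ✓ | ✓ |
| birthday | × | ✓ | ✓ | × |
| cauchy | ✓ | ✓ | ✓ | ✓ |
| chisq | ✓ | ✓ | ✓ | ✓ |
| exp | ✓ | ✓ | ✓ | ✓ |
| f | ✓ | ✓ | ✓ | ✓ |
| gamma | ✓ | ✓ | ✓ | ✓ |
| geom | ✓ | ✓ | ✓ | ✓ |
| hyper | ✓ | ✓ | ✓ | ✓ |
| lnorm | ✓ | ✓ | ✓ | ✓ |
| logis | ✓ | ✓ | ✓ | ✓ |
| multinom | ✓ | × | × | ✓ |
| nbinom | ✓ | ✓ | ✓ | ✓ |
| norm | ✓ | ✓ | ✓ | ✓ |
| pois | ✓ | ✓ | ✓ | ✓ |
| signrank | ✓ | ✓ | ✓ | ✓ |
| t | ✓ | ✓ | ✓ | ✓ |
| tukey | × | ✓ | ✓ | × |
| unif | ✓ | ✓ | ✓ | ✓ |
| weibull | ✓ | ✓ | ✓ | ✓ |
| wilcox | ✓ | ✓ | ✓ | ✓ |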

Many add-on packages provide other distributions; for instance, package mvtnorm provides the multivariate normal and multivariate t distributions. Users can also write their own distribution functions, and are encouraged to follow the above convention: prefix the function names with d, p, q, and r, and use x, q, p, and n, respectively, as the first argument.
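
A user-written family following this convention might look like the sketch below. The Laplace (double exponential) distribution is chosen only as an example; the function names dlaplace, plaplace, qlaplace, and rlaplace and the parameter names mu and b are hypothetical and are not part of package:stats.

```r
# Hypothetical Laplace(location = mu, scale = b) family, written for illustration
dlaplace <- function(x, mu = 0, b = 1, log = FALSE) {
  logd <- -log(2 * b) - abs(x - mu) / b      # log density
  if (log) logd else exp(logd)
}

plaplace <- function(q, mu = 0, b = 1, lower.tail = TRUE) {
  z <- (q - mu) / b
  p <- ifelse(z < 0, 0.5 * exp(z), 1 - 0.5 * exp(-z))
  if (lower.tail) p else 1 - p
}

qlaplace <- function(p, mu = 0, b = 1, lower.tail = TRUE) {
  if (!lower.tail) p <- 1 - p
  mu - b * sign(p - 0.5) * log(1 - 2 * abs(p - 0.5))
}

rlaplace <- function(n, mu = 0, b = 1) {
  if (length(n) > 1) n <- length(n)          # same rule as the built-in r* functions
  qlaplace(runif(n), mu, b)                  # inverse-c.d.f. method
}

# Quick consistency check: the quantile function should invert the c.d.f.
all.equal(qlaplace(plaplace(c(-2, 0, 3))), c(-2, 0, 3))
```

Because every function is built from vectorized operations, the first argument and the parameters are recycled in the same way as for the built-in distributions.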

## Sampling

sample(x, size, replace = FALSE, prob = NULL)
where
• x is the vector from which the sample is taken; if given as a single positive integer, the vector of integers 1, ..., x is implied;
• size is a non-negative integer specifying the sample size; if unspecified, a size of length(x) is implied;
• replace indicates whether sampling is with replacement or not;
• prob gives a vector of probability weights associated with the elements of x. With replace=FALSE the weights are applied sequentially to the elements that remain, so set replace=TRUE when independent draws with exactly these probabilities are wanted.
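
The behaviour of the x argument can be sketched as follows; the vectors and the seed are arbitrary. The last lines show a common pitfall: a numeric x of length one is expanded to 1, ..., x, so sampling "from a single value" needs index-based sampling, here with sample.int.

```r
set.seed(1)                                # arbitrary seed, only for reproducibility

perm  <- sample(5)                         # a permutation of 1:5
two   <- sample(c(10, 20, 30), 2)          # two distinct elements, without replacement
coins <- sample(c("H", "T"), 10, replace = TRUE)  # size > length(x) requires replace = TRUE

# Pitfall: a length-one numeric x is treated as 1:x
x <- 10
from_range <- sample(x, 3)                 # three values drawn from 1:10
from_value <- x[sample.int(length(x), 1)]  # index-based sampling: always 10 here
```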

### Examples

1. Random permutation: sample(x)
2. Bootstrap sample: bx <- sample(x, replace=TRUE)
3. Double bootstrap sample: with bx as above, sample(bx, replace=TRUE)
4. Sampling with unequal probabilities: sample(0:2, 20, replace=TRUE, prob=c(.2,.5,.3))
5. Bootstrap of multivariate data (assuming a data matrix X with one observation per row): X[sample(nrow(X), replace=TRUE), ]
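
To show how such bootstrap samples are typically used, the sketch below estimates the standard error of a sample mean. The data, the seed, and the number of replicates B are assumptions made for this example.

```r
set.seed(2008)                          # arbitrary seed, only for reproducibility
x <- rexp(50, rate = 1)                 # made-up data: 50 exponential observations

B <- 1000                               # number of bootstrap replicates (arbitrary)
boot_means <- replicate(B, mean(sample(x, replace = TRUE)))

se_boot    <- sd(boot_means)            # bootstrap standard error of the mean
se_formula <- sd(x) / sqrt(length(x))   # textbook standard error, for comparison

# Multivariate data: resample row indices so each observation stays intact
X  <- cbind(a = rnorm(10), b = runif(10))   # made-up 10 x 2 data matrix
Xb <- X[sample(nrow(X), replace = TRUE), ]
```

The two standard errors should be close; the bootstrap version carries some Monte Carlo error that shrinks as B grows.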

2008-09-26