{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Probability distributions and Random Number Generation\n", "====" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Probability distributions\n", "----\n", "\n", "To some extent, the foundation of statistics is an understanding of probability distributions. In addition, drawing random samples from specific probability distributions is ubiquitous in applied statistics and useful in many contexts, not least for building an appreciation of how different probability distributions behave." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", "
Distributions {stats} | R Documentation |
Density, cumulative distribution function, quantile function and random\n", "variate generation for many standard probability distributions are\n", "available in the stats package.\n", "
\n", "\n", "\n", "The functions for the density/mass function, cumulative distribution\n",
"function, quantile function and random variate generation are named in the\n",
"form dxxx
, pxxx
, qxxx
and rxxx
respectively.\n",
"
For the beta distribution see dbeta
.\n",
"
For the binomial (including Bernoulli) distribution see\n",
"dbinom
.\n",
"
For the Cauchy distribution see dcauchy
.\n",
"
For the chi-squared distribution see dchisq
.\n",
"
For the exponential distribution see dexp
.\n",
"
For the F distribution see df
.\n",
"
For the gamma distribution see dgamma
.\n",
"
For the geometric distribution see dgeom
. (This is also\n",
"a special case of the negative binomial.)\n",
"
For the hypergeometric distribution see dhyper
.\n",
"
For the log-normal distribution see dlnorm
.\n",
"
For the multinomial distribution see dmultinom
.\n",
"
For the negative binomial distribution see dnbinom
.\n",
"
For the normal distribution see dnorm
.\n",
"
For the Poisson distribution see dpois
.\n",
"
For the Student's t distribution see dt
.\n",
"
For the uniform distribution see dunif
.\n",
"
For the Weibull distribution see dweibull
.\n",
"
For less common distributions of test statistics see\n",
"pbirthday
, dsignrank
,\n",
"ptukey
and dwilcox
(and see the\n",
"‘See Also’ section of cor.test
).\n",
"
RNG
about random number generation in R.\n",
"
The CRAN task view on distributions,\n", "http://cran.r-project.org/web/views/Distributions.html,\n", "mentioning several CRAN packages for additional distributions.\n", "
\n", "\n", "Binomial {stats} | R Documentation |
Density, distribution function, quantile function and random\n",
"generation for the binomial distribution with parameters size
\n",
"and prob
.\n",
"
This is conventionally interpreted as the number of ‘successes’\n",
"in size
trials.\n",
"
\n", "dbinom(x, size, prob, log = FALSE)\n", "pbinom(q, size, prob, lower.tail = TRUE, log.p = FALSE)\n", "qbinom(p, size, prob, lower.tail = TRUE, log.p = FALSE)\n", "rbinom(n, size, prob)\n", "\n", "\n", "\n", "
x, q | \n",
"\n",
" vector of quantiles. \n", " |
p | \n",
"\n",
" vector of probabilities. \n", " |
n | \n",
"\n",
" number of observations. If |
size | \n",
"\n",
" number of trials (zero or more). \n", " |
prob | \n",
"\n",
" probability of success on each trial. \n", " |
log, log.p | \n",
"\n",
" logical; if TRUE, probabilities p are given as log(p). \n", " |
lower.tail | \n",
"\n",
" logical; if TRUE (default), probabilities are\n", "P[X ≤ x], otherwise, P[X > x]. \n", " |
The binomial distribution with size
= n and\n",
"prob
= p has density\n",
"
\n", " p(x) = choose(n, x) p^x (1-p)^(n-x)
\n", "\n", "for x = 0, …, n.\n",
"Note that binomial coefficients can be computed by\n",
"choose
in R.\n",
"
If an element of x
is not integer, the result of dbinom
\n",
"is zero, with a warning.\n",
"
p(x) is computed using Loader's algorithm; see the reference below.\n", "
\n", "The quantile is defined as the smallest value x such that\n", "F(x) ≥ p, where F is the distribution function.\n", "
\n", "\n", "\n", "dbinom
gives the density, pbinom
gives the distribution\n",
"function, qbinom
gives the quantile function and rbinom
\n",
"generates random deviates.\n",
"
If size
is not an integer, NaN
is returned.\n",
"
The length of the result is determined by n
for\n",
"rbinom
, and is the maximum of the lengths of the\n",
"numerical arguments for the other functions. \n",
"
The numerical arguments other than n
are recycled to the\n",
"length of the result. Only the first elements of the logical\n",
"arguments are used.\n",
"
For dbinom
a saddle-point expansion is used: see\n",
"
Catherine Loader (2000). Fast and Accurate Computation of\n", "Binomial Probabilities; available from\n", "http://www.herine.net/stat/software/dbinom.html.\n", "
\n", "pbinom
uses pbeta
.\n",
"
qbinom
uses the Cornish–Fisher Expansion to include a skewness\n",
"correction to a normal approximation, followed by a search.\n",
"
rbinom
(for size < .Machine$integer.max
) is based on\n",
"
Kachitvichyanukul, V. and Schmeiser, B. W. (1988)\n", "Binomial random variate generation.\n", "Communications of the ACM, 31, 216–222.\n", "
\n", "For larger values it uses inversion.\n", "
\n", "\n", "\n", "Distributions for other standard distributions, including\n",
"dnbinom
for the negative binomial, and\n",
"dpois
for the Poisson distribution.\n",
"
\n", "require(graphics)\n", "# Compute P(45 < X < 55) for X Binomial(100,0.5)\n", "sum(dbinom(46:54, 100, 0.5))\n", "\n", "## Using \"log = TRUE\" for an extended range :\n", "n <- 2000\n", "k <- seq(0, n, by = 20)\n", "plot (k, dbinom(k, n, pi/10, log = TRUE), type = \"l\", ylab = \"log density\",\n", " main = \"dbinom(*, log=TRUE) is better than log(dbinom(*))\")\n", "lines(k, log(dbinom(k, n, pi/10)), col = \"red\", lwd = 2)\n", "## extreme points are omitted since dbinom gives 0.\n", "mtext(\"dbinom(k, log=TRUE)\", adj = 0)\n", "mtext(\"extended range\", adj = 0, line = -1, font = 4)\n", "mtext(\"log(dbinom(k))\", col = \"red\", adj = 1)\n", "\n", "\n", "