{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Review" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**1**. Generate a sequence of the numbers `10,9,8,7,6,5,4,3,2,1`" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**2**. Extract only numbers divisible by 3 from the sequence generated in Q1." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**3**. Generate the sequence `1,2,3,4,1,2,3,4,1,2,3,4`" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**4**. Generate the sequence `1,1,1,1,2,2,2,2,3,3,3,3`" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**5**. Replace all odd numbers in Q1 with their squares and leave the even numbers unchanged." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**6**. Generate all sliding windows of length 4 from the sequence in Q1. The first vector is `10,9,8,7` and the last one is `4,3,2,1`" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**7**. 
Generate the matrix shown below and find the row and column sums.\n", "\n", "| | | | |\n", "|-|-|-|-|\n", "|1|2|3|4|\n", "|5|1|7|8|\n", "|9|10|NA|12|\n", "|13|14|15|1|" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**8**. Scale the matrix from Q7 so each **row** has a mean of 0 and a standard deviation of 1." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**9**. Generate and assign row names `pt-1, pt-2, pt-3, pt-4` and column names `gene.1, gene.2, gene.3, gene.4` to the matrix from Q7." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**10**. Convert the matrix from Q7 to a `data.frame` and add a column `group` with values `A,B,A,B`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**11**. Remove the row with a missing value (NA) from the `data.frame` in Q10." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**12**. Find the average value of each gene by group using the `data.frame` from Q11." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**13**. Reshape the `data.frame` from Q10 to have only 3 columns `group`, `gene`, `value`. The `gene` column should have entries such as `gene:1`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**14**. Sort the `data.frame` from Q11 in decreasing order of `gene.1`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**15**. Replace all missing values with the column mean for the `data.frame` from Q10. Create a new `data.frame` that contains only the `log` values for all genes and the `group` column." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**16**. Create a new `data.frame` with columns for `genes.5`, `genes.6`, rows for `pt-2, pt-3, pt-1, pt-4` (in this order) and values drawn from a Poisson distribution with rate 10. Merge this with the `data.frame` from Q10 to get a new `data.frame` with 4 rows and 7 columns." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**17**. Generate data using the code shown below. Then fit a linear model to the data and print the coefficients and associated p-values for `x1, x2, x3`.\n", "\n", "```R\n", "set.seed(123)\n", "n <- 10\n", "x1 <- runif(n, 0, 10)\n", "x2 <- runif(n, 0, 10)\n", "x3 <- runif(n, 0, 10)\n", "y <- 2 + 0.5*x1 + 0.05*x3 + rnorm(n)\n", "df <- data.frame(y=y, x1=x1, x2=x2, x3=x3)\n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**18**. Fit a linear model to the data below. 
Explain the results.\n", "\n", "```R\n", "set.seed(123)\n", "n <- 10\n", "x1 <- 1:10\n", "x2 <- seq(2,20,by=2)\n", "x3 <- seq(3,30,by=3)\n", "y <- 2 + 0.5*x1 + 0.05*x3 + rnorm(n)\n", "df <- data.frame(y=y, x1=x1, x2=x2, x3=x3)\n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**19**. Load the data set at https://www.openintro.org/stat/data/bdims.csv into a `data.frame` using `read.csv`. Use `knn` with 5 neighbors on the `wgt` and `hgt` columns to predict the `sex` and generate a classification table of true and predicted values from LOOCV." ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**20**. Using the same data set from Q19, perform a linear regression to predict the age from the 5 variables most correlated with the age. Use LOOCV correctly to get predicted age for each subject. Calculate the root mean square (RMS) error (square root of the mean of squared residuals) from the LOOCV predictions. Plot the predicted against the observed ages." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "R", "language": "R", "name": "ir" }, "language_info": { "codemirror_mode": "r", "file_extension": ".r", "mimetype": "text/x-r-source", "name": "R", "pygments_lexer": "r", "version": "3.4.0" } }, "nbformat": 4, "nbformat_minor": 2 }