{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Practice THREE" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(1) Make a plot of the standard normal curve on the interval [-4, 4]. Give the plot the title \"Standard normal curve\", an x label of \"Normal deviate\" and a y label of \"Density\"." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "Plot with title “Standard normal curve”" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "x <- pretty(-4:4, n=100)\n", "y <- dnorm(x)\n", "plot(x, y, type=\"l\", main=\"Standard normal curve\", xlab=\"Normal deviate\", ylab=\"Density\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(2) What is the area under the curve to the right of x=3? In other words, what is the probability of drawing a random number from the normal distribution that is 3 or more standard deviations above the mean?" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "0.0013498980316301" ], "text/latex": [ "0.0013498980316301" ], "text/markdown": [ "0.0013498980316301" ], "text/plain": [ "[1] 0.001349898" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "1 - pnorm(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(3) If the expression values for a gene are normally distributed with mean 10 and standard deviation 2, what is the expression value at the 95th percentile?" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "13.2897072539029" ], "text/latex": [ "13.2897072539029" ], "text/markdown": [ "13.2897072539029" ], "text/plain": [ "[1] 13.28971" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "qnorm(0.95, mean=10, sd=2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Generate 50 numbers from a normal distribution with mean=10 and sd=2. Now transform this vector so that the numbers have a standard normal distribution with mean=0 and sd=1."
] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": true }, "outputs": [], "source": [ "x <- rnorm(50, 10, 2)\n", "z <- (x - mean(x))/sd(x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(4) A t-test with 6 degrees of freedom has a score of 3.5. Using only the dt, pt, qt or rt probability functions, what is the p-value for a two-sided test? Recall that a p-value is the probability of seeing a value as extreme as or more extreme than the observed score, assuming the score was drawn from the specified distribution." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "0.0128263383328053" ], "text/latex": [ "0.0128263383328053" ], "text/markdown": [ "0.0128263383328053" ], "text/plain": [ "[1] 0.01282634" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "2*(1 - pt(3.5, df = 6))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(5) Draw 1 million random numbers from the t-distribution with 6 degrees of freedom. How many times is the number less than -3.5 or greater than 3.5?" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "1310" ], "text/latex": [ "1310" ], "text/markdown": [ "1310" ], "text/plain": [ "[1] 1310" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# As run here, this draws 100000 (1e5) values; set n to 1e6 for the full 1 million\n", "x <- rt(100000, df=6)\n", "sum(abs(x) > 3.5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(6) Find the mean value of all numeric variables for the mtcars data, grouping by number of gears and automatic or manual transmission. (Hint: Use the aggregate function)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<table>\n",
"<thead><tr><th></th><th>gear</th><th>transmission</th><th>mpg</th><th>cyl</th><th>disp</th><th>hp</th><th>drat</th><th>wt</th><th>qsec</th><th>vs</th><th>am</th><th>gear</th><th>carb</th></tr></thead>\n",
"<tbody>\n",
"\t<tr><th>1</th><td>3</td><td>0</td><td>16.10667</td><td>7.466667</td><td>326.3</td><td>176.1333</td><td>3.132667</td><td>3.8926</td><td>17.692</td><td>0.2</td><td>0</td><td>3</td><td>2.666667</td></tr>\n",
"\t<tr><th>2</th><td>4</td><td>0</td><td>21.05</td><td>5</td><td>155.675</td><td>100.75</td><td>3.8625</td><td>3.305</td><td>20.025</td><td>1</td><td>0</td><td>4</td><td>3</td></tr>\n",
"\t<tr><th>3</th><td>4</td><td>1</td><td>26.275</td><td>4.5</td><td>106.6875</td><td>83.875</td><td>4.13375</td><td>2.2725</td><td>18.435</td><td>0.75</td><td>1</td><td>4</td><td>2</td></tr>\n",
"\t<tr><th>4</th><td>5</td><td>1</td><td>21.38</td><td>6</td><td>202.48</td><td>195.6</td><td>3.916</td><td>2.6326</td><td>15.64</td><td>0.2</td><td>1</td><td>5</td><td>4.4</td></tr>\n",
"</tbody>\n", "</table>
\n" ], "text/latex": [ "\\begin{tabular}{r|lllllllllllll}\n", " & gear & transmission & mpg & cyl & disp & hp & drat & wt & qsec & vs & am & gear & carb\\\\\n", "\\hline\n", "\t1 & 3 & 0 & 16.10667 & 7.466667 & 326.3 & 176.1333 & 3.132667 & 3.8926 & 17.692 & 0.2 & 0 & 3 & 2.666667\\\\\n", "\t2 & 4 & 0 & 21.05 & 5 & 155.675 & 100.75 & 3.8625 & 3.305 & 20.025 & 1 & 0 & 4 & 3\\\\\n", "\t3 & 4 & 1 & 26.275 & 4.5 & 106.6875 & 83.875 & 4.13375 & 2.2725 & 18.435 & 0.75 & 1 & 4 & 2\\\\\n", "\t4 & 5 & 1 & 21.38 & 6 & 202.48 & 195.6 & 3.916 & 2.6326 & 15.64 & 0.2 & 1 & 5 & 4.4\\\\\n", "\\end{tabular}\n" ], "text/plain": [ " gear transmission mpg cyl disp hp drat wt qsec\n", "1 3 0 16.10667 7.466667 326.3000 176.1333 3.132667 3.8926 17.692\n", "2 4 0 21.05000 5.000000 155.6750 100.7500 3.862500 3.3050 20.025\n", "3 4 1 26.27500 4.500000 106.6875 83.8750 4.133750 2.2725 18.435\n", "4 5 1 21.38000 6.000000 202.4800 195.6000 3.916000 2.6326 15.640\n", " vs am gear carb\n", "1 0.20 0 3 2.666667\n", "2 1.00 0 4 3.000000\n", "3 0.75 1 4 2.000000\n", "4 0.20 1 5 4.400000" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "with(mtcars, aggregate(mtcars, by=list(gear=gear, transmission=am), FUN=mean))" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [], "source": [ "library(plyr)\n", "library(reshape2)\n", "data(airquality)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<table>\n",
"<thead><tr><th></th><th>Ozone</th><th>Solar.R</th><th>Wind</th><th>Temp</th><th>Month</th><th>Day</th></tr></thead>\n",
"<tbody>\n",
"\t<tr><th>1</th><td>41</td><td>190</td><td>7.4</td><td>67</td><td>5</td><td>1</td></tr>\n",
"\t<tr><th>2</th><td>36</td><td>118</td><td>8</td><td>72</td><td>5</td><td>2</td></tr>\n",
"\t<tr><th>3</th><td>12</td><td>149</td><td>12.6</td><td>74</td><td>5</td><td>3</td></tr>\n",
"\t<tr><th>4</th><td>18</td><td>313</td><td>11.5</td><td>62</td><td>5</td><td>4</td></tr>\n",
"\t<tr><th>5</th><td>NA</td><td>NA</td><td>14.3</td><td>56</td><td>5</td><td>5</td></tr>\n",
"\t<tr><th>6</th><td>28</td><td>NA</td><td>14.9</td><td>66</td><td>5</td><td>6</td></tr>\n",
"</tbody>\n", "</table>
\n" ], "text/latex": [ "\\begin{tabular}{r|llllll}\n", " & Ozone & Solar.R & Wind & Temp & Month & Day\\\\\n", "\\hline\n", "\t1 & 41 & 190 & 7.4 & 67 & 5 & 1\\\\\n", "\t2 & 36 & 118 & 8 & 72 & 5 & 2\\\\\n", "\t3 & 12 & 149 & 12.6 & 74 & 5 & 3\\\\\n", "\t4 & 18 & 313 & 11.5 & 62 & 5 & 4\\\\\n", "\t5 & NA & NA & 14.3 & 56 & 5 & 5\\\\\n", "\t6 & 28 & NA & 14.9 & 66 & 5 & 6\\\\\n", "\\end{tabular}\n" ], "text/plain": [ " Ozone Solar.R Wind Temp Month Day\n", "1 41 190 7.4 67 5 1\n", "2 36 118 8.0 72 5 2\n", "3 12 149 12.6 74 5 3\n", "4 18 313 11.5 62 5 4\n", "5 NA NA 14.3 56 5 5\n", "6 28 NA 14.9 66 5 6" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "head(airquality)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(7) Use `melt` to convert the airquality dataframe into a \"tall\" format using Month and Day as the id variables, saving it as a new dataframe. Print the first 6 rows." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<table>\n",
"<thead><tr><th></th><th>Month</th><th>Day</th><th>variable</th><th>value</th></tr></thead>\n",
"<tbody>\n",
"\t<tr><th>1</th><td>5</td><td>1</td><td>Ozone</td><td>41</td></tr>\n",
"\t<tr><th>2</th><td>5</td><td>2</td><td>Ozone</td><td>36</td></tr>\n",
"\t<tr><th>3</th><td>5</td><td>3</td><td>Ozone</td><td>12</td></tr>\n",
"\t<tr><th>4</th><td>5</td><td>4</td><td>Ozone</td><td>18</td></tr>\n",
"\t<tr><th>5</th><td>5</td><td>5</td><td>Ozone</td><td>NA</td></tr>\n",
"\t<tr><th>6</th><td>5</td><td>6</td><td>Ozone</td><td>28</td></tr>\n",
"</tbody>\n", "</table>
\n" ], "text/latex": [ "\\begin{tabular}{r|llll}\n", " & Month & Day & variable & value\\\\\n", "\\hline\n", "\t1 & 5 & 1 & Ozone & 41\\\\\n", "\t2 & 5 & 2 & Ozone & 36\\\\\n", "\t3 & 5 & 3 & Ozone & 12\\\\\n", "\t4 & 5 & 4 & Ozone & 18\\\\\n", "\t5 & 5 & 5 & Ozone & NA\\\\\n", "\t6 & 5 & 6 & Ozone & 28\\\\\n", "\\end{tabular}\n" ], "text/plain": [ " Month Day variable value\n", "1 5 1 Ozone 41\n", "2 5 2 Ozone 36\n", "3 5 3 Ozone 12\n", "4 5 4 Ozone 18\n", "5 5 5 Ozone NA\n", "6 5 6 Ozone 28" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "md <- melt(airquality, id=c(\"Month\", \"Day\"))\n", "head(md)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(8) Find the average values of Ozone, Solar.R, Wind and Temp for each month using `dcast`. Hint: Give an extra argument `na.rm = TRUE` to ignore missing data." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<table>\n",
"<thead><tr><th></th><th>Month</th><th>Ozone</th><th>Solar.R</th><th>Wind</th><th>Temp</th></tr></thead>\n",
"<tbody>\n",
"\t<tr><th>1</th><td>5</td><td>23.61538</td><td>181.2963</td><td>11.62258</td><td>65.54839</td></tr>\n",
"\t<tr><th>2</th><td>6</td><td>29.44444</td><td>190.1667</td><td>10.26667</td><td>79.1</td></tr>\n",
"\t<tr><th>3</th><td>7</td><td>59.11538</td><td>216.4839</td><td>8.941935</td><td>83.90323</td></tr>\n",
"\t<tr><th>4</th><td>8</td><td>59.96154</td><td>171.8571</td><td>8.793548</td><td>83.96774</td></tr>\n",
"\t<tr><th>5</th><td>9</td><td>31.44828</td><td>167.4333</td><td>10.18</td><td>76.9</td></tr>\n",
"</tbody>\n", "</table>
\n" ], "text/latex": [ "\\begin{tabular}{r|lllll}\n", " & Month & Ozone & Solar.R & Wind & Temp\\\\\n", "\\hline\n", "\t1 & 5 & 23.61538 & 181.2963 & 11.62258 & 65.54839\\\\\n", "\t2 & 6 & 29.44444 & 190.1667 & 10.26667 & 79.1\\\\\n", "\t3 & 7 & 59.11538 & 216.4839 & 8.941935 & 83.90323\\\\\n", "\t4 & 8 & 59.96154 & 171.8571 & 8.793548 & 83.96774\\\\\n", "\t5 & 9 & 31.44828 & 167.4333 & 10.18 & 76.9\\\\\n", "\\end{tabular}\n" ], "text/plain": [ " Month Ozone Solar.R Wind Temp\n", "1 5 23.61538 181.2963 11.622581 65.54839\n", "2 6 29.44444 190.1667 10.266667 79.10000\n", "3 7 59.11538 216.4839 8.941935 83.90323\n", "4 8 59.96154 171.8571 8.793548 83.96774\n", "5 9 31.44828 167.4333 10.180000 76.90000" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dcast(md, Month ~ variable, mean, na.rm = TRUE)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(9) Find the average values of Ozone, Solar.R, Wind and Temp for each month using `dcast`, but only for the first 2 weeks of each month. Hints: Give an extra argument `na.rm = TRUE` to ignore missing data, and use the `subset` argument." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<table>\n",
"<thead><tr><th></th><th>Month</th><th>Ozone</th><th>Solar.R</th><th>Wind</th><th>Temp</th></tr></thead>\n",
"<tbody>\n",
"\t<tr><th>1</th><td>5</td><td>19.41667</td><td>200.0909</td><td>11.17857</td><td>66.28571</td></tr>\n",
"\t<tr><th>2</th><td>6</td><td>40.5</td><td>249.1429</td><td>10.73571</td><td>82.85714</td></tr>\n",
"\t<tr><th>3</th><td>7</td><td>64.81818</td><td>228.7143</td><td>9.007143</td><td>84.85714</td></tr>\n",
"\t<tr><th>4</th><td>8</td><td>58.41667</td><td>168.7273</td><td>8.721429</td><td>85.5</td></tr>\n",
"\t<tr><th>5</th><td>9</td><td>43.35714</td><td>188.6429</td><td>9.407143</td><td>82.21429</td></tr>\n",
"</tbody>\n", "</table>
\n" ], "text/latex": [ "\\begin{tabular}{r|lllll}\n", " & Month & Ozone & Solar.R & Wind & Temp\\\\\n", "\\hline\n", "\t1 & 5 & 19.41667 & 200.0909 & 11.17857 & 66.28571\\\\\n", "\t2 & 6 & 40.5 & 249.1429 & 10.73571 & 82.85714\\\\\n", "\t3 & 7 & 64.81818 & 228.7143 & 9.007143 & 84.85714\\\\\n", "\t4 & 8 & 58.41667 & 168.7273 & 8.721429 & 85.5\\\\\n", "\t5 & 9 & 43.35714 & 188.6429 & 9.407143 & 82.21429\\\\\n", "\\end{tabular}\n" ], "text/plain": [ " Month Ozone Solar.R Wind Temp\n", "1 5 19.41667 200.0909 11.178571 66.28571\n", "2 6 40.50000 249.1429 10.735714 82.85714\n", "3 7 64.81818 228.7143 9.007143 84.85714\n", "4 8 58.41667 168.7273 8.721429 85.50000\n", "5 9 43.35714 188.6429 9.407143 82.21429" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dcast(md, Month ~ variable, mean, subset = .(Day < 15), na.rm = TRUE)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Questions below use the day.1 and day.2 dataframes**" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [], "source": [ "set.seed(123)\n", "pid.1 <- c(1,1,2,2)\n", "gid.1 <- c(1,2,1,2)\n", "val.1 <- rnorm(4)\n", "day.1 <- data.frame(pid=pid.1, gid=gid.1, val=val.1)\n", "\n", "pid.2 <- c(1,1,2,2)\n", "gid.2 <- c(1,2,1,2)\n", "val.2 <- 1 + rnorm(4)\n", "day.2 <- data.frame(pid=pid.2, gid=gid.2, val=val.2)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<table>\n",
"<thead><tr><th></th><th>pid</th><th>gid</th><th>val</th></tr></thead>\n",
"<tbody>\n",
"\t<tr><th>1</th><td>1</td><td>1</td><td>-0.5604756</td></tr>\n",
"\t<tr><th>2</th><td>1</td><td>2</td><td>-0.2301775</td></tr>\n",
"\t<tr><th>3</th><td>2</td><td>1</td><td>1.558708</td></tr>\n",
"\t<tr><th>4</th><td>2</td><td>2</td><td>0.07050839</td></tr>\n",
"</tbody>\n", "</table>
\n" ], "text/latex": [ "\\begin{tabular}{r|lll}\n", " & pid & gid & val\\\\\n", "\\hline\n", "\t1 & 1 & 1 & -0.5604756\\\\\n", "\t2 & 1 & 2 & -0.2301775\\\\\n", "\t3 & 2 & 1 & 1.558708\\\\\n", "\t4 & 2 & 2 & 0.07050839\\\\\n", "\\end{tabular}\n" ], "text/plain": [ " pid gid val\n", "1 1 1 -0.56047565\n", "2 1 2 -0.23017749\n", "3 2 1 1.55870831\n", "4 2 2 0.07050839" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "day.1" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<table>\n",
"<thead><tr><th></th><th>pid</th><th>gid</th><th>val</th></tr></thead>\n",
"<tbody>\n",
"\t<tr><th>1</th><td>1</td><td>1</td><td>1.129288</td></tr>\n",
"\t<tr><th>2</th><td>1</td><td>2</td><td>2.715065</td></tr>\n",
"\t<tr><th>3</th><td>2</td><td>1</td><td>1.460916</td></tr>\n",
"\t<tr><th>4</th><td>2</td><td>2</td><td>-0.2650612</td></tr>\n",
"</tbody>\n", "</table>
\n" ], "text/latex": [ "\\begin{tabular}{r|lll}\n", " & pid & gid & val\\\\\n", "\\hline\n", "\t1 & 1 & 1 & 1.129288\\\\\n", "\t2 & 1 & 2 & 2.715065\\\\\n", "\t3 & 2 & 1 & 1.460916\\\\\n", "\t4 & 2 & 2 & -0.2650612\\\\\n", "\\end{tabular}\n" ], "text/plain": [ " pid gid val\n", "1 1 1 1.1292877\n", "2 1 2 2.7150650\n", "3 2 1 1.4609162\n", "4 2 2 -0.2650612" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "day.2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(10) Suppose day.1 and day.2 are results from experiments performed on different days. Merge the data from day.1 and day.2 into a single dataframe called `days` to combine the data sets." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [], "source": [ "days <- merge(day.1, day.2, by=c(\"pid\", \"gid\"), suffixes = c(\"1\", \"2\"))" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<table>\n",
"<thead><tr><th></th><th>pid</th><th>gid</th><th>val1</th><th>val2</th></tr></thead>\n",
"<tbody>\n",
"\t<tr><th>1</th><td>1</td><td>1</td><td>-0.5604756</td><td>1.129288</td></tr>\n",
"\t<tr><th>2</th><td>1</td><td>2</td><td>-0.2301775</td><td>2.715065</td></tr>\n",
"\t<tr><th>3</th><td>2</td><td>1</td><td>1.558708</td><td>1.460916</td></tr>\n",
"\t<tr><th>4</th><td>2</td><td>2</td><td>0.07050839</td><td>-0.2650612</td></tr>\n",
"</tbody>\n", "</table>
\n" ], "text/latex": [ "\\begin{tabular}{r|llll}\n", " & pid & gid & val1 & val2\\\\\n", "\\hline\n", "\t1 & 1 & 1 & -0.5604756 & 1.129288\\\\\n", "\t2 & 1 & 2 & -0.2301775 & 2.715065\\\\\n", "\t3 & 2 & 1 & 1.558708 & 1.460916\\\\\n", "\t4 & 2 & 2 & 0.07050839 & -0.2650612\\\\\n", "\\end{tabular}\n" ], "text/plain": [ " pid gid val1 val2\n", "1 1 1 -0.56047565 1.1292877\n", "2 1 2 -0.23017749 2.7150650\n", "3 2 1 1.55870831 1.4609162\n", "4 2 2 0.07050839 -0.2650612" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "days" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(11) Sort the `days` dataframe by val1 in decreasing order." ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<table>\n",
"<thead><tr><th></th><th>pid</th><th>gid</th><th>val1</th><th>val2</th></tr></thead>\n",
"<tbody>\n",
"\t<tr><th>3</th><td>2</td><td>1</td><td>1.558708</td><td>1.460916</td></tr>\n",
"\t<tr><th>4</th><td>2</td><td>2</td><td>0.07050839</td><td>-0.2650612</td></tr>\n",
"\t<tr><th>2</th><td>1</td><td>2</td><td>-0.2301775</td><td>2.715065</td></tr>\n",
"\t<tr><th>1</th><td>1</td><td>1</td><td>-0.5604756</td><td>1.129288</td></tr>\n",
"</tbody>\n", "</table>
\n" ], "text/latex": [ "\\begin{tabular}{r|llll}\n", " & pid & gid & val1 & val2\\\\\n", "\\hline\n", "\t3 & 2 & 1 & 1.558708 & 1.460916\\\\\n", "\t4 & 2 & 2 & 0.07050839 & -0.2650612\\\\\n", "\t2 & 1 & 2 & -0.2301775 & 2.715065\\\\\n", "\t1 & 1 & 1 & -0.5604756 & 1.129288\\\\\n", "\\end{tabular}\n" ], "text/plain": [ " pid gid val1 val2\n", "3 2 1 1.55870831 1.4609162\n", "4 2 2 0.07050839 -0.2650612\n", "2 1 2 -0.23017749 2.7150650\n", "1 1 1 -0.56047565 1.1292877" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "days[order(-days$val1),]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(12) Remove duplicate rows from the following dataframe." ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<table>\n",
"<thead><tr><th></th><th>pid</th><th>gid</th><th>val1</th><th>val2</th></tr></thead>\n",
"<tbody>\n",
"\t<tr><th>1</th><td>1</td><td>1</td><td>-0.5604756</td><td>1.129288</td></tr>\n",
"\t<tr><th>2</th><td>1</td><td>1</td><td>-0.5604756</td><td>1.129288</td></tr>\n",
"\t<tr><th>3</th><td>1</td><td>2</td><td>-0.2301775</td><td>2.715065</td></tr>\n",
"\t<tr><th>4</th><td>2</td><td>2</td><td>0.07050839</td><td>-0.2650612</td></tr>\n",
"\t<tr><th>5</th><td>2</td><td>2</td><td>0.07050839</td><td>-0.2650612</td></tr>\n",
"\t<tr><th>6</th><td>2</td><td>1</td><td>1.558708</td><td>1.460916</td></tr>\n",
"</tbody>\n", "</table>
\n" ], "text/latex": [ "\\begin{tabular}{r|llll}\n", " & pid & gid & val1 & val2\\\\\n", "\\hline\n", "\t1 & 1 & 1 & -0.5604756 & 1.129288\\\\\n", "\t2 & 1 & 1 & -0.5604756 & 1.129288\\\\\n", "\t3 & 1 & 2 & -0.2301775 & 2.715065\\\\\n", "\t4 & 2 & 2 & 0.07050839 & -0.2650612\\\\\n", "\t5 & 2 & 2 & 0.07050839 & -0.2650612\\\\\n", "\t6 & 2 & 1 & 1.558708 & 1.460916\\\\\n", "\\end{tabular}\n" ], "text/plain": [ " pid gid val1 val2\n", "1 1 1 -0.56047565 1.1292877\n", "2 1 1 -0.56047565 1.1292877\n", "3 1 2 -0.23017749 2.7150650\n", "4 2 2 0.07050839 -0.2650612\n", "5 2 2 0.07050839 -0.2650612\n", "6 2 1 1.55870831 1.4609162" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df <- read.csv(\"df.csv\")\n", "df" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<table>\n",
"<thead><tr><th></th><th>pid</th><th>gid</th><th>val1</th><th>val2</th></tr></thead>\n",
"<tbody>\n",
"\t<tr><th>1</th><td>1</td><td>1</td><td>-0.5604756</td><td>1.129288</td></tr>\n",
"\t<tr><th>3</th><td>1</td><td>2</td><td>-0.2301775</td><td>2.715065</td></tr>\n",
"\t<tr><th>4</th><td>2</td><td>2</td><td>0.07050839</td><td>-0.2650612</td></tr>\n",
"\t<tr><th>6</th><td>2</td><td>1</td><td>1.558708</td><td>1.460916</td></tr>\n",
"</tbody>\n", "</table>
\n" ], "text/latex": [ "\\begin{tabular}{r|llll}\n", " & pid & gid & val1 & val2\\\\\n", "\\hline\n", "\t1 & 1 & 1 & -0.5604756 & 1.129288\\\\\n", "\t3 & 1 & 2 & -0.2301775 & 2.715065\\\\\n", "\t4 & 2 & 2 & 0.07050839 & -0.2650612\\\\\n", "\t6 & 2 & 1 & 1.558708 & 1.460916\\\\\n", "\\end{tabular}\n" ], "text/plain": [ " pid gid val1 val2\n", "1 1 1 -0.56047565 1.1292877\n", "3 1 2 -0.23017749 2.7150650\n", "4 2 2 0.07050839 -0.2650612\n", "6 2 1 1.55870831 1.4609162" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "unique(df)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "R", "language": "R", "name": "ir" }, "language_info": { "codemirror_mode": "r", "file_extension": ".r", "mimetype": "text/x-r-source", "name": "R", "pygments_lexer": "r", "version": "3.2.3" } }, "nbformat": 4, "nbformat_minor": 0 }