{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Exercises with `pandas` 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**1**. Warm up with the `iris` data frame.\n", "\n", "- Show the first 3 rows\n", "- Show the last 3 rows\n", "- Show 3 random rows without repetition" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**2**. Using the `iris` data set,\n", "\n", "- Find the mean value of all 4 measurements \n", "- Find the mean value of all 4 measurements for each Species" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**3**. Using the `iris` data set,\n", "\n", "- Sort the observations by Sepal.Width in decreasing order." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**4**. Using the `iris` data set,\n", "\n", "- Count the number of flowers of each Species" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**5**. Using the `iris` data set,\n", "\n", "- Count the number of observations where Petal.Length is longer than Sepal.Width" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**6**. 
Using the `iris` data set, \n", "\n", "- Find the Species with the largest number of observations where the Sepal.Length is less than the mean Sepal.Length of all observations" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**7**. Using the `iris` data set,\n", "\n", "- Convert the data frame from the current wide format to a tall format, with just 3 columns: Species, Measurement, Value." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**8**. Using the `mtcars` data set, \n", "\n", "- Find the mean weight of all cars with mpg > 20 and cyl = 4." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**9**. Using the `mtcars` data set,\n", "\n", "- Add a new column named `bmi` that is equal to (hp*mpg/wt) " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**10**. Using the `mtcars` data set,\n", "\n", "- Find all rows whose car names have numbers in them." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**11**. Using the `iris` data set,\n", "\n", "- Create a new data frame `df` that has only 3 columns (`Species`, `Measure`, `Value`) where `Measure` takes on the values `Sepal.Length`, `Sepal.Width`, `Petal.Length` or `Petal.Width`. 
Show the first 5 rows.\n", "- Show the mean value and counts for each Species and Measure of `df`" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**12**. Using the `df` data set defined below,\n", "\n", "- Give each different `treatment` its own column." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "\n", "# 4 subjects, each measured under treatments A, B and C\n", "df = pd.DataFrame({'subject': np.tile(np.arange(1, 5), 3),\n", "                   'treatment': np.repeat(['A', 'B', 'C'], 4),\n", "                   'value': np.random.normal(size=12)})" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**13**. Using the `expt` data set,\n", "\n", "- Find the average blood pressure for each treatment group (A, B or C).\n", "\n", "Note: Assume that you do not have access to the `pid` and `treat` values separately." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "\n", "pid = np.tile(np.arange(1, 5), 3)\n", "treat = np.repeat(['A', 'B', 'C'], 4)\n", "bp = np.random.normal(120, 25, size=12)\n", "# name combines patient id and treatment, e.g. '1-A'\n", "expt = pd.DataFrame({'name': [f'{p}-{t}' for p, t in zip(pid, treat)], 'bp': bp})\n", "del pid, treat\n", "expt" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.1" } }, "nbformat": 4, "nbformat_minor": 2 }