Exercises with pandas 2

1. Warm up with the iris data frame.

  • Show the first 3 rows
  • Show the last 3 rows
  • Show 3 random rows without repetition
In [1]:
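Hint: the three requests map onto three DataFrame methods. The sketch below uses a small made-up frame named `toy` (an assumption, not the iris data) so that the exercise itself is left open.

```python
import pandas as pd

# Hypothetical toy frame standing in for iris
toy = pd.DataFrame({"x": range(10), "y": list("abcdefghij")})

first3 = toy.head(3)   # first 3 rows
last3 = toy.tail(3)    # last 3 rows
# sample() draws without replacement by default (replace=False);
# random_state is fixed here only to make the sketch repeatable
random3 = toy.sample(n=3, random_state=0)
```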

2. Using the iris data set,

  • Find the mean value of all 4 measurements
  • Find the mean value of all 4 measurements for each Species
In [1]:
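Hint: column means and per-group means can be sketched as follows. The frame `toy` and its values are made up for illustration; only the column naming style follows iris.

```python
import pandas as pd

# Hypothetical toy frame with one label column and two measurements
toy = pd.DataFrame({
    "Species": ["a", "a", "b", "b"],
    "Sepal.Length": [1.0, 3.0, 5.0, 7.0],
    "Petal.Length": [2.0, 4.0, 6.0, 10.0],
})

# Mean of every numeric column (the string column is skipped)
overall = toy.mean(numeric_only=True)

# Mean of every measurement within each Species
by_species = toy.groupby("Species").mean()
```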

3. Using the iris data set,

  • Sort the observations by Sepal.Width in decreasing order.
In [1]:

4. Using the iris data set,

  • Count the number of flowers of each Species
In [1]:

5. Using the iris data set,

  • Count the number of observations where Petal.Length is longer than Sepal.Width
In [1]:
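Hint: comparing two columns gives a boolean Series, and summing it counts the True values. A minimal sketch on made-up numbers (the frame `toy` is an assumption):

```python
import pandas as pd

# Hypothetical toy frame; values are invented for illustration
toy = pd.DataFrame({
    "Petal.Length": [1.4, 4.7, 6.0, 1.3],
    "Sepal.Width": [3.5, 3.2, 3.3, 3.0],
})

# Element-wise comparison yields one True/False per row
mask = toy["Petal.Length"] > toy["Sepal.Width"]

# True counts as 1, so the sum is the number of matching rows
n_longer = int(mask.sum())
```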

6. Using the iris data set,

  • Find the Species with the largest number of observations where Sepal.Length is less than the mean Sepal.Length of all observations
In [1]:
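Hint: one possible route is filter, count, then take the label with the highest count. The frame `toy` below is made up and is not the iris data.

```python
import pandas as pd

# Hypothetical toy frame; values are invented for illustration
toy = pd.DataFrame({
    "Species": ["a", "a", "a", "b", "b", "c"],
    "Sepal.Length": [4.0, 4.5, 7.0, 5.0, 8.0, 4.8],
})

# Keep rows below the overall mean of the column
below = toy[toy["Sepal.Length"] < toy["Sepal.Length"].mean()]

# Count the remaining rows per Species and pick the largest count
counts = below["Species"].value_counts()
top = counts.idxmax()
```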

7. Using the iris data set,

  • Convert the data frame from the current wide format to a tall format, with just 3 columns: Species, Measurement, Value.
In [1]:
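Hint: `DataFrame.melt` is one way to go from wide to tall. The sketch below reshapes a made-up two-row frame (`toy` is an assumption) into the three requested columns.

```python
import pandas as pd

# Hypothetical wide frame: one row per flower, one column per measurement
toy = pd.DataFrame({
    "Species": ["a", "b"],
    "Sepal.Length": [5.1, 7.0],
    "Sepal.Width": [3.5, 3.2],
})

# Keep Species as the identifier; stack the other columns into
# a (Measurement, Value) pair per original cell
tall = toy.melt(id_vars="Species", var_name="Measurement", value_name="Value")
```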

8. Using the mtcars data set,

  • Find the mean weight of all cars with mpg > 20 and cyl = 4.
In [1]:
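Hint: two conditions are combined with `&`, with each condition in its own parentheses. The numbers in `toy` are made up and are not taken from mtcars.

```python
import pandas as pd

# Hypothetical toy frame using mtcars-style column names
toy = pd.DataFrame({
    "mpg": [25.0, 30.0, 15.0, 22.0],
    "cyl": [4, 4, 8, 6],
    "wt": [2.0, 3.0, 4.0, 2.5],
})

# Parentheses are required because & binds tighter than > and ==
subset = toy[(toy["mpg"] > 20) & (toy["cyl"] == 4)]
mean_wt = subset["wt"].mean()
```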

9. Using the mtcars data set,

  • Add a new column named bmi that is equal to (hp*mpg/wt)
In [1]:

10. Using the mtcars data set,

  • Find all rows whose car names have numbers in them.
In [1]:
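Hint: the `.str.contains` accessor accepts a regular expression. The sketch assumes the car names live in the index, as R row names often do after conversion; if they are in an ordinary column, the same call applies to that column. The names in `toy` are invented.

```python
import pandas as pd

# Hypothetical toy frame with invented car names as the index
toy = pd.DataFrame(
    {"mpg": [1.0, 2.0, 3.0]},
    index=["Alpha 100", "Beta", "Gamma X2"],
)

# \d matches any digit; the result is one boolean per index label
has_digit = toy.index.str.contains(r"\d")
with_numbers = toy[has_digit]
```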

11. Using the iris data set,

  • Create a new data frame df that has only 3 columns (Species, Measure, Value) where Measure takes on the values Sepal.Length, Sepal.Width, Petal.Length or Petal.Width. Show the first 5 rows.
  • Show the mean value and counts for each Species and Measure of df
In [1]:
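Hint: for the second bullet, grouping by two keys and passing a list to `agg` gives several summaries at once. The tall frame `long_df` below is made up for illustration.

```python
import pandas as pd

# Hypothetical frame already in tall (Species, Measure, Value) form
long_df = pd.DataFrame({
    "Species": ["a", "a", "a", "b"],
    "Measure": ["Sepal.Length", "Sepal.Length", "Petal.Length", "Sepal.Length"],
    "Value": [1.0, 3.0, 5.0, 7.0],
})

# One row per (Species, Measure) pair, one column per summary
summary = long_df.groupby(["Species", "Measure"])["Value"].agg(["mean", "count"])
```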

12. Using the df data set,

  • Give each different treatment its own column.
In [1]:
import numpy as np
import pandas as pd

# 4 subjects, each measured once under treatments A, B and C
df = pd.DataFrame({"subject": np.tile(np.arange(1, 5), 3),
                   "treatment": np.repeat(["A", "B", "C"], 4),
                   "value": np.random.normal(size=12)})

In [2]:
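Hint: `DataFrame.pivot` spreads the values of one column across new columns. The sketch uses a smaller made-up frame (`toy`, two subjects and two treatments) rather than the data set defined above.

```python
import pandas as pd

# Hypothetical tall frame: one row per (subject, treatment) pair
toy = pd.DataFrame({
    "subject": [1, 2, 1, 2],
    "treatment": ["A", "A", "B", "B"],
    "value": [0.1, 0.2, 0.3, 0.4],
})

# Rows are subjects, columns are treatments, cells are values
wide = toy.pivot(index="subject", columns="treatment", values="value")
```

`pivot` requires each (index, columns) pair to be unique; with repeated pairs, `pivot_table` with an aggregation function is the usual alternative.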

13. Using the expt data set

  • Find the average blood pressure for each treatment group (A, B or C).

Note: Assume that you do not have separate access to the pid and treat values.

In [2]:
import numpy as np
import pandas as pd

pid = np.tile(np.arange(1, 5), 3)
treat = np.repeat(["A", "B", "C"], 4)
bp = np.random.normal(120, 25, size=12)
# name encodes patient id and treatment group, e.g. "1-A"
expt = pd.DataFrame({"name": [f"{p}-{t}" for p, t in zip(pid, treat)], "bp": bp})
del pid, treat  # only the combined name column remains available
expt
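Hint: since the treatment is only available inside the combined name, one approach is to split the string and group on the extracted part. The frame `toy` and its blood pressure values are made up; the "id-treatment" naming pattern is assumed to match the data set above.

```python
import pandas as pd

# Hypothetical frame with names of the form "<pid>-<treatment>"
toy = pd.DataFrame({
    "name": ["1-A", "2-A", "1-B", "2-B"],
    "bp": [100.0, 120.0, 130.0, 150.0],
})

# Split on "-" and keep the second piece as the treatment label
toy["treat"] = toy["name"].str.split("-").str[1]

# Average blood pressure within each treatment group
mean_bp = toy.groupby("treat")["bp"].mean()
```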