Exercises with pandas 2
1. Warm up with the iris data frame.
- Show the first 3 rows
- Show the last 3 rows
- Show 3 random rows without repetition
In [1]:
2. Using the iris data set,
- Find the mean value of all 4 measurements
- Find the mean value of all 4 measurements for each Species
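One possible approach to both parts, as a minimal sketch. The six-row iris frame below is a hypothetical stand-in (the exercise does not say how iris is loaded), and the column names are assumed to follow the R data set.

```python
import pandas as pd

# Hypothetical six-row stand-in for iris; column names assumed from the R data set.
iris = pd.DataFrame({
    "Sepal.Length": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    "Sepal.Width": [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
    "Petal.Length": [1.4, 1.4, 4.7, 4.5, 6.0, 5.1],
    "Petal.Width": [0.2, 0.2, 1.4, 1.5, 2.5, 1.9],
    "Species": ["setosa", "setosa", "versicolor",
                "versicolor", "virginica", "virginica"],
})

# Mean of each of the 4 numeric measurements over all rows.
overall_means = iris.mean(numeric_only=True)

# Mean of each measurement within each Species.
species_means = iris.groupby("Species").mean()

print(overall_means)
print(species_means)
```

Here `numeric_only=True` skips the Species column, which has no meaningful mean.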
In [1]:
3. Using the iris data set,
- Sort the observations by Sepal.Width in decreasing order.
In [1]:
4. Using the iris data set,
- Count the number of flowers of each Species
In [1]:
5. Using the iris data set,
- Count the number of observations where Petal.Length is longer than Sepal.Width
In [1]:
6. Using the iris data set,
- Find the Species with the largest number of observations where the Sepal.Length is less than the mean Sepal.Length of all observations
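A hedged sketch of one way to break this into steps: compute the overall mean, filter, count per Species, then take the label with the largest count. The small iris frame is a hypothetical stand-in with assumed column names.

```python
import pandas as pd

# Hypothetical stand-in for iris with only the columns this exercise needs.
iris = pd.DataFrame({
    "Sepal.Length": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    "Species": ["setosa", "setosa", "versicolor",
                "versicolor", "virginica", "virginica"],
})

# Step 1: mean Sepal.Length over all observations.
mean_length = iris["Sepal.Length"].mean()

# Step 2: keep only the rows below that mean.
below_mean = iris[iris["Sepal.Length"] < mean_length]

# Step 3: count the remaining rows per Species.
counts = below_mean["Species"].value_counts()

# Step 4: the index label with the largest count.
top_species = counts.idxmax()
print(top_species)
```

Note that `idxmax` returns a single label, so ties between Species would need separate handling.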
In [1]:
7. Using the iris data set,
- Convert the data frame from the current wide format to a tall format, with just 3 columns: Species, Measurement, Value.
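Wide-to-tall reshaping is commonly done with `melt`. The sketch below assumes a hypothetical three-row iris frame with the R-style column names; it is one possible approach, not the only one.

```python
import pandas as pd

# Hypothetical three-row stand-in for iris; column names assumed from the R data set.
iris = pd.DataFrame({
    "Sepal.Length": [5.1, 7.0, 6.3],
    "Sepal.Width": [3.5, 3.2, 3.3],
    "Petal.Length": [1.4, 4.7, 6.0],
    "Petal.Width": [0.2, 1.4, 2.5],
    "Species": ["setosa", "versicolor", "virginica"],
})

# Keep Species as the identifier; every other column becomes a
# (Measurement, Value) pair, one row per original cell.
tall = iris.melt(id_vars="Species", var_name="Measurement", value_name="Value")
print(tall.head())
```

Each original row contributes one tall row per measurement, so 3 rows and 4 measurements give 12 rows.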
In [1]:
8. Using the mtcars data set,
- Find the mean weight of all cars with mpg > 20 and cyl = 4.
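Both conditions can be combined into a single boolean mask. The sketch below uses a small illustrative subset of mtcars-style rows (an assumption, since the exercise does not show how mtcars is loaded) and assumes weight is stored in the wt column.

```python
import pandas as pd

# Small illustrative mtcars-style frame; car names are assumed to be the index.
mtcars = pd.DataFrame(
    {
        "mpg": [21.0, 22.8, 18.7, 24.4, 30.4],
        "cyl": [6, 4, 8, 4, 4],
        "wt": [2.620, 2.320, 3.440, 3.190, 1.615],
    },
    index=["Mazda RX4", "Datsun 710", "Hornet Sportabout",
           "Merc 240D", "Honda Civic"],
)

# Each comparison needs its own parentheses because & binds tighter than > and ==.
mask = (mtcars["mpg"] > 20) & (mtcars["cyl"] == 4)

# Mean weight over the rows that satisfy both conditions.
mean_wt = mtcars.loc[mask, "wt"].mean()
print(mean_wt)
```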
In [1]:
9. Using the mtcars data set,
- Add a new column named bmi that is equal to (hp*mpg/wt)
In [1]:
10. Using the mtcars data set,
- Find all rows whose car names have numbers in them.
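A possible sketch using a regular expression on the car names. It assumes the names are stored in the index, as in the R data set; if they sit in an ordinary column, the same `.str.contains` call applies to that column instead. The frame is a small hypothetical stand-in.

```python
import pandas as pd

# Hypothetical stand-in for mtcars; car names are assumed to be the index.
mtcars = pd.DataFrame(
    {"mpg": [21.0, 22.8, 18.7, 24.4, 30.4]},
    index=["Mazda RX4", "Datsun 710", "Hornet Sportabout",
           "Merc 240D", "Honda Civic"],
)

# r"\d" matches any single digit; contains() is True where a match exists.
has_digit = mtcars.index.str.contains(r"\d")

# Boolean indexing keeps only the rows whose name contains a digit.
with_numbers = mtcars[has_digit]
print(with_numbers)
```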
In [1]:
11. Using the iris data set,
- Create a new data frame df that has only 3 columns (Species, Measure, Value) where Measure takes on the values Sepal.Length, Sepal.Width, Petal.Length or Petal.Width. Show the first 5 rows.
- Show the mean value and counts for each Species and Measure of df
In [1]:
12. Using the df data set,
- Give each different treatment its own column.
In [1]:
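import numpy as np
import pandas as pd

# pandas equivalent of the R data.frame: 4 subjects, each given treatments A, B and C
df = pd.DataFrame({
    "subject": np.tile(np.arange(1, 5), 3),      # 1..4 repeated 3 times
    "treatment": np.repeat(["A", "B", "C"], 4),  # each label repeated 4 times
    "value": np.random.normal(size=12),          # 12 standard normal draws
})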
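Giving each treatment its own column is a tall-to-wide reshape, which `pivot` can express. The sketch below rebuilds a comparable frame so that it runs on its own; the seeded generator is an assumption added only for reproducibility.

```python
import numpy as np
import pandas as pd

# Rebuild a frame with the same layout: subject, treatment, value.
rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility
df = pd.DataFrame({
    "subject": np.tile(np.arange(1, 5), 3),
    "treatment": np.repeat(["A", "B", "C"], 4),
    "value": rng.normal(size=12),
})

# One row per subject, one column per treatment, cells filled from value.
wide = df.pivot(index="subject", columns="treatment", values="value")
print(wide)
```

`pivot` requires each (subject, treatment) pair to be unique, which holds here; with duplicates, `pivot_table` with an aggregation function would be the usual alternative.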
In [2]:
13. Using the expt data set,
- Find the average blood pressure for each treatment group (A, B or C).
Note: Assume that you do not have access to the pid and treat values separately.
In [2]:
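import numpy as np
import pandas as pd

pid = np.tile(np.arange(1, 5), 3)
treat = np.repeat(["A", "B", "C"], 4)
bp = np.random.normal(120, 25, size=12)  # mean 120, standard deviation 25
# name joins pid and treat with a hyphen, e.g. "1-A"
expt = pd.DataFrame({"name": [f"{p}-{t}" for p, t in zip(pid, treat)], "bp": bp})
# remove the separate variables so only the combined name remains
del pid, treat
expt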
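Since the treatment is only available inside the combined name, one possible approach is to split it back out before grouping. The sketch below builds its own expt frame with fixed, made-up blood pressure values (an assumption, so the result is easy to check by hand).

```python
import pandas as pd

# Stand-in for expt: names of the form "pid-treat" plus made-up bp values.
names = [f"{p}-{t}" for t in ["A", "B", "C"] for p in range(1, 5)]
bp = [110.0, 120.0, 130.0, 140.0,
      100.0, 100.0, 100.0, 100.0,
      150.0, 150.0, 160.0, 160.0]
expt = pd.DataFrame({"name": names, "bp": bp})

# Split "1-A" into ["1", "A"] and keep the second piece as the treatment label.
expt["treat"] = expt["name"].str.split("-").str[1]

# Average blood pressure within each treatment group.
mean_bp = expt.groupby("treat")["bp"].mean()
print(mean_bp)
```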