Exercises with pandas 1

We will work with the Puromycin data set (available in R) in this exercise.

Reaction Velocity of an Enzymatic Reaction

Description:

     The ‘Puromycin’ data frame has 23 rows and 3 columns of the
     reaction velocity versus substrate concentration in an enzymatic
     reaction involving untreated cells or cells treated with
     Puromycin.

Usage:

     Puromycin

Format:

     This data frame contains the following columns:

     ‘conc’ a numeric vector of substrate concentrations (ppm)

     ‘rate’ a numeric vector of instantaneous reaction rates
          (counts/min/min)

     ‘state’ a factor with levels ‘treated’ ‘untreated’

Details:

     Data on the velocity of an enzymatic reaction were obtained by
     Treloar (1974).  The number of counts per minute of radioactive
     product from the reaction was measured as a function of substrate
     concentration in parts per million (ppm) and from these counts the
     initial rate (or velocity) of the reaction was calculated
     (counts/min/min).  The experiment was conducted once with the
     enzyme treated with Puromycin, and once with the enzyme untreated.

Source:

     Bates, D.M. and Watts, D.G. (1988), _Nonlinear Regression Analysis
     and Its Applications_, Wiley, Appendix A1.3.

     Treloar, M. A. (1974), _Effects of Puromycin on
     Galactosyltransferase in Golgi Membranes_, M.Sc. Thesis, U. of
     Toronto.

Load the Puromycin data set into a Python DataFrame

In [ ]:
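One possible approach, sketched under assumptions: export the data frame from R with write.csv and read it with pandas.read_csv. The file name in the comment is hypothetical, and the inline text below stands in for that file with only a few rows so that the sketch runs on its own.

```python
import io

import pandas as pd

# Stand-in for a file written in R with
#   write.csv(Puromycin, "puromycin.csv", row.names = FALSE)
# (hypothetical file name; only a few rows are typed here for illustration).
csv_text = """conc,rate,state
0.02,76,treated
0.02,47,treated
0.06,97,treated
0.02,67,untreated
0.02,51,untreated
0.06,84,untreated
"""

# read_csv accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)   # (6, 3) for this stand-in, not the full 23 rows
print(df.dtypes)
```

With a real export, replacing io.StringIO(csv_text) with the file path should be enough. If statsmodels is installed, its get_rdataset helper is another option, although it downloads the data and so needs network access.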

How many rows and columns are there?

In [ ]:

What is the type of each column?

In [ ]:

Show all unique values for the state column

In [ ]:

Show the first 5 rows

In [ ]:

Show the last 5 rows

In [ ]:

Show 5 randomly sampled rows

In [ ]:

Show rows 5 to 10 (inclusive)

In [ ]:
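"Rows 5 to 10" can be read as positions or as index labels, and the two slicing styles treat the end point differently. The sketch below uses a 12-row stand-in frame (typed inline for illustration) with the default integer index, and assumes the rows are counted from 0.

```python
import pandas as pd

# Illustrative stand-in with 12 rows and a default 0..11 integer index.
df = pd.DataFrame({
    "conc": [0.02, 0.02, 0.06, 0.06, 0.11, 0.11,
             0.22, 0.22, 0.56, 0.56, 1.10, 1.10],
    "rate": [76, 47, 97, 107, 123, 139, 159, 152, 191, 201, 207, 200],
    "state": ["treated"] * 12,
})

# Position-based slicing excludes the stop, so 11 is needed to include row 10.
by_position = df.iloc[5:11]

# Label-based slicing includes both end points.
by_label = df.loc[5:10]

print(by_position.equals(by_label))  # True with the default integer index
```

If the rows are meant to be counted from 1, as in R, the position-based slice would be df.iloc[4:10] instead.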

Show only rows where the state is untreated

In [ ]:

Show only rows where the conc is 0.11

In [ ]:

Show only rows where the conc is less than 0.1

In [ ]:

Show only rows where the state is treated and the rate is more than 100

In [ ]:

Show only rows where the conc is less than 0.1 or the rate is more than 200

In [ ]:
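The filtering exercises above combine conditions with "and" and "or". A minimal sketch, using a small stand-in frame with illustrative values: pandas uses & and | on boolean Series, and each comparison needs its own parentheses because these operators bind more tightly than == and <.

```python
import pandas as pd

# Small stand-in frame with the same column names (values for illustration).
df = pd.DataFrame({
    "conc": [0.02, 0.06, 0.11, 1.10, 0.02, 0.22],
    "rate": [76, 97, 123, 207, 67, 131],
    "state": ["treated", "treated", "treated", "treated",
              "untreated", "untreated"],
})

# "and": both conditions must hold for a row to be kept.
treated_fast = df[(df["state"] == "treated") & (df["rate"] > 100)]

# "or": a row is kept if at least one condition holds.
low_or_fast = df[(df["conc"] < 0.1) | (df["rate"] > 200)]

print(treated_fast)
print(low_or_fast)
```

The Python keywords and / or do not work element-wise on Series and raise an error here.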

Show only the conc and rate columns

In [ ]:

Show only the columns whose type is numeric

In [ ]:

Show only the columns whose names end with the letter e

In [ ]:
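Selecting columns by type or by name pattern can be sketched as follows, again on a small stand-in frame with illustrative values. select_dtypes filters on dtype, and the .str accessor on df.columns yields a boolean mask that .loc can use for columns.

```python
import pandas as pd

# Small stand-in frame with the same column names (values for illustration).
df = pd.DataFrame({
    "conc": [0.02, 0.06, 0.11],
    "rate": [76, 97, 123],
    "state": ["treated", "treated", "untreated"],
})

# Keep only columns with a numeric dtype (integers and floats).
numeric = df.select_dtypes(include="number")

# Keep only columns whose name ends with the letter e.
ends_with_e = df.loc[:, df.columns.str.endswith("e")]

print(list(numeric.columns))      # ['conc', 'rate']
print(list(ends_with_e.columns))  # ['rate', 'state']
```

df.filter(regex="e$") is an alternative for the name-based selection.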

Convert all column names to UPPERCASE

In [ ]:

Rearrange the columns in the order state, conc, rate

In [ ]:

Drop the state column

In [ ]:

Create a new column rate2 that is the square of rate

In [ ]:

Create a new data frame with exactly 3 columns holding the values of conc, conc^2 and conc^3. Name the columns conc, conc2 and conc3.

In [ ]:
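A minimal sketch of one way to build such a frame, assuming a small stand-in frame with illustrative values: constructing a new DataFrame from a dict of Series leaves the original untouched and fixes the column order.

```python
import pandas as pd

# Small stand-in frame with the same column names (values for illustration).
df = pd.DataFrame({
    "conc": [0.02, 0.06, 0.11],
    "rate": [76, 97, 123],
    "state": ["treated", "treated", "untreated"],
})

# ** is applied element-wise to the whole column.
powers = pd.DataFrame({
    "conc": df["conc"],
    "conc2": df["conc"] ** 2,
    "conc3": df["conc"] ** 3,
})

print(powers)
```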

Replace each value of all numeric columns with the square root of the value

In [ ]:
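One way to transform only the numeric columns, sketched on a small stand-in frame with illustrative values: pick the numeric column names with select_dtypes, then overwrite those columns one at a time on a copy. Working on a copy is a choice made here so that the original frame stays available.

```python
import numpy as np
import pandas as pd

# Small stand-in frame with the same column names (values for illustration).
df = pd.DataFrame({
    "conc": [0.02, 0.06, 0.11],
    "rate": [76, 97, 123],
    "state": ["treated", "treated", "untreated"],
})

# Names of the numeric columns only; state is left out.
num_cols = df.select_dtypes(include="number").columns

out = df.copy()
for col in num_cols:
    # np.sqrt works element-wise; integer columns become float.
    out[col] = np.sqrt(out[col])

print(out)
```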

Sort in ascending rate order

In [ ]:

Sort in descending rate order

In [ ]:

Sort first on conc in ascending order, then on rate in ascending order

In [ ]:

Sort in ascending order of the number of characters in the state column

In [ ]:
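Sorting on a derived value such as string length can be sketched with the key argument of sort_values, which receives the column as a Series and returns the values to sort by. The frame below is a small stand-in with illustrative values, and kind="stable" is an optional choice made here so that rows with equal lengths keep their original order.

```python
import pandas as pd

# Small stand-in frame with the same column names (values for illustration).
df = pd.DataFrame({
    "conc": [0.02, 0.02, 0.06, 0.06],
    "rate": [67, 76, 84, 97],
    "state": ["untreated", "treated", "untreated", "treated"],
})

# Sort by the number of characters in state: "treated" (7) before "untreated" (9).
by_len = df.sort_values("state", key=lambda s: s.str.len(), kind="stable")

print(by_len)
```

A list of column names, as in df.sort_values(["conc", "rate"]), sorts on several columns in turn.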

Find the mean value of numeric columns

In [ ]:

Find the mean length (number of characters) of the values in the state column

In [ ]:

Find the min, median and max of the rate column

In [ ]:
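Several summary statistics of one column can be requested in a single call by passing a list of function names to agg. A minimal sketch on a small stand-in frame with illustrative values:

```python
import pandas as pd

# Small stand-in frame with the same column names (values for illustration).
df = pd.DataFrame({
    "conc": [0.02, 0.06, 0.11, 1.10, 0.02, 0.22],
    "rate": [76, 97, 123, 207, 67, 131],
    "state": ["treated", "treated", "treated", "treated",
              "untreated", "untreated"],
})

# The result is a Series indexed by the statistic names.
stats = df["rate"].agg(["min", "median", "max"])

print(stats)
```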

Find the average rate for each state

In [ ]:

Find the number of rows for each state (treated and untreated) and store it in a new column named count

In [ ]:

Find the number of rows with the same conc and state in a new column count and only show rows where the count is an even number.

In [ ]:
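Counting rows per group and naming the result column can be sketched with groupby, size and reset_index(name=...). The frame below is a small stand-in with illustrative values, so the counts differ from those of the full data set.

```python
import pandas as pd

# Small stand-in frame with the same column names (values for illustration).
df = pd.DataFrame({
    "conc": [0.02, 0.02, 0.06, 0.02, 0.02, 1.10],
    "rate": [76, 47, 97, 67, 51, 160],
    "state": ["treated", "treated", "treated",
              "untreated", "untreated", "untreated"],
})

# One row per (conc, state) pair, with the group size in a column named count.
counts = df.groupby(["conc", "state"]).size().reset_index(name="count")

# Keep only the pairs that occur an even number of times.
even = counts[counts["count"] % 2 == 0]

print(even)
```

Grouping on "state" alone in the same way gives the per-state counts.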

Find the mean and standard deviation of rate for each state and conc. Remove any rows with an NA value for the rate standard deviation.

In [ ]:
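A group with a single row has no sample standard deviation, so pandas reports NaN for it; this is where the NA values in this exercise come from. A minimal sketch on a small stand-in frame with illustrative values, in which two groups have only one row:

```python
import pandas as pd

# Small stand-in frame with the same column names (values for illustration).
df = pd.DataFrame({
    "conc": [0.02, 0.02, 0.06, 0.02, 0.02, 1.10],
    "rate": [76, 47, 97, 67, 51, 160],
    "state": ["treated", "treated", "treated",
              "untreated", "untreated", "untreated"],
})

# Mean and standard deviation of rate for each (state, conc) pair.
summary = (
    df.groupby(["state", "conc"])["rate"]
      .agg(["mean", "std"])
      .reset_index()
)

# Drop the groups whose standard deviation is NaN (single-row groups).
summary = summary.dropna(subset=["std"])

print(summary)
```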