Numpy and Seaborn

Q1. (15 pts) Exercise 1. Write code that builds the 12 by 12 times table matrix shown below. Do this in three ways:

using nested for loops
using the numpy fromfunction array constructor
using numpy broadcasting

array([[  1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12],
       [  2,   4,   6,   8,  10,  12,  14,  16,  18,  20,  22,  24],
       [  3,   6,   9,  12,  15,  18,  21,  24,  27,  30,  33,  36],
       [  4,   8,  12,  16,  20,  24,  28,  32,  36,  40,  44,  48],
       [  5,  10,  15,  20,  25,  30,  35,  40,  45,  50,  55,  60],
       [  6,  12,  18,  24,  30,  36,  42,  48,  54,  60,  66,  72],
       [  7,  14,  21,  28,  35,  42,  49,  56,  63,  70,  77,  84],
       [  8,  16,  24,  32,  40,  48,  56,  64,  72,  80,  88,  96],
       [  9,  18,  27,  36,  45,  54,  63,  72,  81,  90,  99, 108],
       [ 10,  20,  30,  40,  50,  60,  70,  80,  90, 100, 110, 120],
       [ 11,  22,  33,  44,  55,  66,  77,  88,  99, 110, 121, 132],
       [ 12,  24,  36,  48,  60,  72,  84,  96, 108, 120, 132, 144]])
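
As a rough illustration of the three approaches named above, the sketch below builds the same table each way and checks that the results agree. The variable names (`n`, `loops`, `ff`, `bc`) are chosen here for illustration and are not part of the assignment.

```python
import numpy as np

n = 12  # table size

# 1. Nested for loops: fill a preallocated integer array entry by entry.
loops = np.zeros((n, n), dtype=int)
for i in range(n):
    for j in range(n):
        loops[i, j] = (i + 1) * (j + 1)

# 2. np.fromfunction: the callable receives 0-based index arrays i and j.
ff = np.fromfunction(lambda i, j: (i + 1) * (j + 1), (n, n), dtype=int)

# 3. Broadcasting: an (n, 1) column times an (n,) row gives an (n, n) array.
k = np.arange(1, n + 1)
bc = k[:, None] * k

# All three constructions should give the same matrix.
assert np.array_equal(loops, ff) and np.array_equal(ff, bc)
```

The broadcasting version avoids any Python-level loop, which is usually why it is preferred for larger arrays.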

In [ ]:

Q2. (25 points) In this exercise, we will practice using Pandas dataframes to explore and summarize the heart data set.

This data set contains the survival time after receiving a heart transplant, the age of the patient, and whether or not the survival time was censored.

Number of Observations - 69
Number of Variables - 3

Variable name definitions:

-  survival - Days after surgery until death
-  censors - indicates if an observation is censored. 1 is uncensored
-  age - age at the time of surgery

Answer the following questions with respect to the heart data set:

How many patients were censored?
What is the correlation coefficient between age and survival for uncensored patients?
What is the average age for censored and uncensored patients?
What is the average survival time for censored and uncensored patients under the age of 45?
What are the survival times of the youngest and oldest uncensored patients?

In [2]:

import statsmodels.api as sm
heart = sm.datasets.heart.load_pandas().data
heart.head(n=6)

Out[2]:

	survival	censors	age
0	15	1	54.3
1	3	1	40.4
2	624	1	51.0
3	46	1	42.5
4	127	1	48.0
5	64	1	54.6
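
The questions above mostly come down to boolean masks and groupby on the `censors` column. The sketch below shows the pattern on a small made-up frame, `toy`, which has the same column names as `heart` but entirely hypothetical values. It is not the real data, and the numbers it produces are not answers to the exercise.

```python
import pandas as pd

# Hypothetical toy rows with the same columns as `heart`; NOT the real data.
toy = pd.DataFrame({
    "survival": [10, 20, 300, 40, 500, 1400],
    "censors":  [1, 1, 1, 1, 0, 0],
    "age":      [54.0, 40.0, 51.0, 42.0, 48.0, 36.0],
})

# censors == 1 means uncensored, so censored patients have censors == 0.
n_censored = int((toy["censors"] == 0).sum())

# Correlation between age and survival among uncensored patients only.
uncensored = toy[toy["censors"] == 1]
corr = uncensored["age"].corr(uncensored["survival"])

# Mean age per censoring group (index 0 = censored, 1 = uncensored).
mean_age = toy.groupby("censors")["age"].mean()

# Mean survival per group, restricted to patients under 45.
mean_surv_u45 = toy[toy["age"] < 45].groupby("censors")["survival"].mean()

# Survival time of the youngest and oldest uncensored patient.
youngest = uncensored.loc[uncensored["age"].idxmin(), "survival"]
oldest = uncensored.loc[uncensored["age"].idxmax(), "survival"]
```

The same expressions should carry over to the real data by replacing `toy` with `heart`.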

In [ ]:

Q3. (40 pts)

Given matrix \(M\)

[[7, 8, 8],
 [1, 3, 8],
 [9, 2, 1]]

(10 pts) Normalize the given matrix \(M\) so that all rows sum to 1.0. This can then be considered as a transition matrix \(P\) for a Markov chain.
Find the stationary distribution of this matrix in the following ways using numpy and numpy.linalg (or scipy.linalg):
(10 pts) By raising the matrix \(P\) to some large power until it doesn’t change with higher powers (see np.linalg.matrix_power) and then calculating \(vP\) with that power in place of \(P\), where \(v\) is an arbitrary starting probability vector
(10 pts) From the equation for stationarity \(wP = w\), we can see that \(w\) must be a left eigenvector of \(P\) with eigenvalue \(1\) (Note: np.linalg.eig returns the right eigenvectors, but the left eigenvector of a matrix is the right eigenvector of the transposed matrix). Use this to find \(w\) using np.linalg.eig.
(10 pts) Suppose \(w = (w_1, w_2, w_3)\). Then from \(wP = w\), we have:

\[
\begin{aligned}
w_1 P_{11} + w_2 P_{21} + w_3 P_{31} &= w_1 \\
w_1 P_{12} + w_2 P_{22} + w_3 P_{32} &= w_2 \\
w_1 P_{13} + w_2 P_{23} + w_3 P_{33} &= w_3
\end{aligned}
\]

This is a singular system, but we also know that \(w_1 + w_2 + w_3 = 1\). Use these facts to set up a linear system of equations that can be solved with np.linalg.solve to find \(w\).
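
One possible way to set up such a system, offered as a sketch rather than the intended solution, is to write \(wP = w\) as \((P^T - I)w = 0\) and overwrite one of the three dependent equations with the normalization constraint. Which equation to overwrite is an arbitrary choice made here. The names `A` and `b` are introduced only for this example.

```python
import numpy as np

M = np.array([[7, 8, 8], [1, 3, 8], [9, 2, 1]], dtype=float)
P = M / M.sum(axis=1, keepdims=True)  # each row now sums to 1

# w P = w is equivalent to (P.T - I) w = 0 with w as a column vector.
A = P.T - np.eye(3)

# The three equations are linearly dependent, so overwrite one of them
# (here the last, an arbitrary choice) with w_1 + w_2 + w_3 = 1.
A[-1, :] = 1.0
b = np.array([0.0, 0.0, 1.0])

w = np.linalg.solve(A, b)
```

Checking `np.allclose(w @ P, w)` afterwards is a quick way to confirm that the solution really is stationary.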

In [34]:

import numpy as np
M = np.array([7,8,8,1,3,8,9,2,1.0]).reshape(3,3)
M

Out[34]:

array([[ 7.,  8.,  8.],
       [ 1.,  3.,  8.],
       [ 9.,  2.,  1.]])
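
Starting from this `M`, the row normalization, the matrix-power approach and the eigenvector approach might look roughly like the sketch below. The exponent 50, the starting vector `v`, and the names `P50`, `w_power` and `w_eig` are assumptions made for illustration.

```python
import numpy as np

M = np.array([7, 8, 8, 1, 3, 8, 9, 2, 1.0]).reshape(3, 3)

# Divide each row by its sum so that every row of P sums to 1.
P = M / M.sum(axis=1, keepdims=True)

# Matrix power: for this chain the rows of P^k converge to the same vector.
# The exponent 50 is an assumption; compare with a higher power to check.
P50 = np.linalg.matrix_power(P, 50)
assert np.allclose(P50, np.linalg.matrix_power(P, 100))
v = np.array([1.0, 0.0, 0.0])  # any starting probability vector
w_power = v @ P50

# Left eigenvector: right eigenvectors of P.T are left eigenvectors of P.
vals, vecs = np.linalg.eig(P.T)
k = np.argmin(np.abs(vals - 1.0))  # column whose eigenvalue is closest to 1
w_eig = np.real(vecs[:, k])
w_eig = w_eig / w_eig.sum()        # rescale so the entries sum to 1
```

Rescaling by the sum matters because np.linalg.eig returns eigenvectors normalized to unit Euclidean length, possibly with a flipped sign, not as probability vectors.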

In [ ]:

Q4. Write code to replicate the following plot using seaborn.

In [41]:

from IPython.display import Image
Image('../images/hw2_q4.png')

Out[41]:

[Figure: target seaborn plot for Q4 (hw2_q4.png)]

In [ ]: