Numpy and Seaborn
Q1. (15 pts) Exercise 1. Write the 12 by 12 times table matrix shown below. Do this
- using nested for loops
- using the numpy fromfunction array constructor
- using numpy broadcasting
array([[  1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12],
       [  2,   4,   6,   8,  10,  12,  14,  16,  18,  20,  22,  24],
       [  3,   6,   9,  12,  15,  18,  21,  24,  27,  30,  33,  36],
       [  4,   8,  12,  16,  20,  24,  28,  32,  36,  40,  44,  48],
       [  5,  10,  15,  20,  25,  30,  35,  40,  45,  50,  55,  60],
       [  6,  12,  18,  24,  30,  36,  42,  48,  54,  60,  66,  72],
       [  7,  14,  21,  28,  35,  42,  49,  56,  63,  70,  77,  84],
       [  8,  16,  24,  32,  40,  48,  56,  64,  72,  80,  88,  96],
       [  9,  18,  27,  36,  45,  54,  63,  72,  81,  90,  99, 108],
       [ 10,  20,  30,  40,  50,  60,  70,  80,  90, 100, 110, 120],
       [ 11,  22,  33,  44,  55,  66,  77,  88,  99, 110, 121, 132],
       [ 12,  24,  36,  48,  60,  72,  84,  96, 108, 120, 132, 144]])
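The three approaches listed above can be sketched as follows. This is one possible outline rather than a reference solution; the variable names (`n`, `table_loops`, `table_ff`, `table_bc`) are illustrative choices and do not come from the assignment.

```python
import numpy as np

n = 12  # size of the times table

# 1. Nested for loops: fill a preallocated integer array entry by entry.
table_loops = np.zeros((n, n), dtype=int)
for i in range(n):
    for j in range(n):
        table_loops[i, j] = (i + 1) * (j + 1)

# 2. np.fromfunction: the callable receives arrays of row and column indices
#    (0-based), so 1 is added to each before multiplying.
table_ff = np.fromfunction(lambda i, j: (i + 1) * (j + 1), (n, n), dtype=int)

# 3. Broadcasting: a (12, 1) column times a (1, 12) row expands to (12, 12).
k = np.arange(1, n + 1)
table_bc = k[:, None] * k[None, :]

print(table_bc)
```

All three arrays should agree element by element, which `np.array_equal` can confirm.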
In [ ]:
Q2. (25 points) In this exercise, we will practice using Pandas dataframes to explore and summarize the heart data set.
This data contains the survival time after receiving a heart transplant, the age of the patient, and whether or not the survival time was censored.
- Number of Observations - 69
- Number of Variables - 3
Variable name definitions:
- survival - Days after surgery until death
- censors - indicates if an observation is censored. 1 is uncensored
- age - age at the time of surgery
Answer the following questions with respect to the heart data set:
- How many patients were censored?
- What is the correlation coefficient between age and survival for uncensored patients?
- What is the average age for censored and uncensored patients?
- What is the average survival time for censored and uncensored patients under the age of 45?
- What is the survival time of the youngest and oldest uncensored patient?
In [2]:
import statsmodels.api as sm
heart = sm.datasets.heart.load_pandas().data
heart.head(n=6)
Out[2]:
| | survival | censors | age |
|---|---|---|---|
| 0 | 15 | 1 | 54.3 |
| 1 | 3 | 1 | 40.4 |
| 2 | 624 | 1 | 51.0 |
| 3 | 46 | 1 | 42.5 |
| 4 | 127 | 1 | 48.0 |
| 5 | 64 | 1 | 54.6 |
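The kinds of pandas operations these questions call for can be tried on a small frame first. The sketch below rebuilds only the six rows shown in the output above, so its numbers do not answer the questions for the full 69-row data set. The name `sample` and the other variable names are illustrative, and treating `censors == 0` as "censored" is an assumption based on the variable definition ("1 is uncensored").

```python
import pandas as pd

# The six rows displayed by heart.head(n=6); NOT the full data set.
sample = pd.DataFrame({
    "survival": [15, 3, 624, 46, 127, 64],
    "censors": [1, 1, 1, 1, 1, 1],
    "age": [54.3, 40.4, 51.0, 42.5, 48.0, 54.6],
})

# Assumption: censors == 1 means uncensored, censors == 0 means censored.
uncensored = sample[sample["censors"] == 1]

# Count of censored rows (boolean mask summed).
n_censored = int((sample["censors"] == 0).sum())

# Pearson correlation between age and survival among uncensored rows.
corr = uncensored["age"].corr(uncensored["survival"])

# Group means: average age per censoring status.
mean_age = sample.groupby("censors")["age"].mean()

# Filter first, then group: average survival for patients under 45.
mean_surv_under_45 = (
    sample[sample["age"] < 45].groupby("censors")["survival"].mean()
)

# idxmin / idxmax give the row labels of the youngest and oldest patient.
youngest_survival = uncensored.loc[uncensored["age"].idxmin(), "survival"]
oldest_survival = uncensored.loc[uncensored["age"].idxmax(), "survival"]

print(n_censored, corr, youngest_survival, oldest_survival)
print(mean_age)
print(mean_surv_under_45)
```

The same expressions should carry over to the full `heart` frame by replacing `sample` with `heart`.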
In [ ]:
Q3. (40 pts)
Given matrix \(M\)
[[7, 8, 8],
[1, 3, 8],
[9, 2, 1]]
(10 pts) Normalize the given matrix \(M\) so that all rows sum to 1.0. This can then be considered as a transition matrix \(P\) for a Markov chain. Find the stationary distribution of this matrix in the following ways using numpy and numpy.linalg (or scipy.linalg):
- (10 pts) By raising the matrix \(P\) to some large power \(n\) until it doesn't change with higher powers (see np.linalg.matrix_power) and then calculating \(vP^n\), where \(v\) is any initial probability vector.
- (10 pts) From the equation for stationarity \(wP = w\), we can see that \(w\) must be a left eigenvector of \(P\) with eigenvalue \(1\) (Note: np.linalg.eig returns the right eigenvectors, but the left eigenvector of a matrix is the right eigenvector of the transposed matrix). Use this to find \(w\) using np.linalg.eig.
- (10 pts) Suppose \(w = (w_1, w_2, w_3)\). Then from \(wP = w\), we have:
\[ \begin{aligned} w_1 P_{11} + w_2 P_{21} + w_3 P_{31} &= w_1 \\ w_1 P_{12} + w_2 P_{22} + w_3 P_{32} &= w_2 \\ w_1 P_{13} + w_2 P_{23} + w_3 P_{33} &= w_3 \end{aligned} \]
This is a singular system, but we also know that \(w_1 + w_2 + w_3 = 1\). Use these facts to set up a linear system of equations that can be solved with np.linalg.solve to find \(w\).
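The three routes to the stationary distribution described above might be outlined as below. This is a sketch under stated assumptions, not the expected solution: the exponent 50, the starting vector `v`, and the choice to replace the last equation of the singular system with the normalization constraint are all illustrative choices.

```python
import numpy as np

M = np.array([[7, 8, 8],
              [1, 3, 8],
              [9, 2, 1]], dtype=float)

# Normalize rows: divide each row by its sum (kept as a column for broadcasting).
P = M / M.sum(axis=1, keepdims=True)

# 1. Matrix power: for large n every row of P^n approaches w, so v @ P^n
#    approaches w for any probability vector v. The exponent 50 is an assumption.
Pn = np.linalg.matrix_power(P, 50)
v = np.array([1.0, 0.0, 0.0])
w_power = v @ Pn

# 2. Left eigenvector: right eigenvectors of P.T are left eigenvectors of P.
vals, vecs = np.linalg.eig(P.T)
idx = np.argmin(np.abs(vals - 1.0))   # eigenvalue closest to 1
w_eig = np.real(vecs[:, idx])
w_eig = w_eig / w_eig.sum()           # rescale so the entries sum to 1

# 3. Linear system: (P.T - I) w = 0 is singular, so one equation is replaced
#    by w_1 + w_2 + w_3 = 1 (here the last row; this choice is an assumption).
A = P.T - np.eye(3)
A[-1, :] = 1.0
b = np.array([0.0, 0.0, 1.0])
w_solve = np.linalg.solve(A, b)

print(w_power)
print(w_eig)
print(w_solve)
```

If the three vectors agree and each satisfies `w @ P` approximately equal to `w`, that is a useful consistency check on the implementation.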
In [34]:
import numpy as np
M = np.array([7,8,8,1,3,8,9,2,1.0]).reshape(3,3)
M
Out[34]:
array([[ 7., 8., 8.],
[ 1., 3., 8.],
[ 9., 2., 1.]])
In [ ]:
Q4. Write code to replicate the following plot using seaborn.
In [41]:
from IPython.display import Image
Image('../images/hw2_q4.png')
Out[41]:
![homework/../_build/doctrees/nbsphinx/homework_Homework02_10_0.png](homework/../_build/doctrees/nbsphinx/homework_Homework02_10_0.png)
In [ ]: