{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction to Python (Part 2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Getting help\n", "\n", "Use ?, ? or ?? to get help. Or use the help function." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on class range in module builtins:\n", "\n", "class range(object)\n", " | range(stop) -> range object\n", " | range(start, stop[, step]) -> range object\n", " | \n", " | Return an object that produces a sequence of integers from start (inclusive)\n", " | to stop (exclusive) by step. range(i, j) produces i, i+1, i+2, ..., j-1.\n", " | start defaults to 0, and stop is omitted! range(4) produces 0, 1, 2, 3.\n", " | These are exactly the valid indices for a list of 4 elements.\n", " | When step is given, it specifies the increment (or decrement).\n", " | \n", " | Methods defined here:\n", " | \n", " | __contains__(self, key, /)\n", " | Return key in self.\n", " | \n", " | __eq__(self, value, /)\n", " | Return self==value.\n", " | \n", " | __ge__(self, value, /)\n", " | Return self>=value.\n", " | \n", " | __getattribute__(self, name, /)\n", " | Return getattr(self, name).\n", " | \n", " | __getitem__(self, key, /)\n", " | Return self[key].\n", " | \n", " | __gt__(self, value, /)\n", " | Return self>value.\n", " | \n", " | __hash__(self, /)\n", " | Return hash(self).\n", " | \n", " | __iter__(self, /)\n", " | Implement iter(self).\n", " | \n", " | __le__(self, value, /)\n", " | Return self<=value.\n", " | \n", " | __len__(self, /)\n", " | Return len(self).\n", " | \n", " | __lt__(self, value, /)\n", " | Return self integer -- return number of occurrences of value\n", " | \n", " | index(...)\n", " | rangeobject.index(value, [start, [stop]]) -> integer -- return index of value.\n", " | Raise ValueError if the value is not present.\n", " | \n", " | 
----------------------------------------------------------------------\n", " | Data descriptors defined here:\n", " | \n", " | start\n", " | \n", " | step\n", " | \n", " | stop\n", "\n" ] } ], "source": [ "help(range)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## [Magics](http://ipython.readthedocs.io/en/stable/interactive/magics.html)\n", "\n", "See link for details of magic functions. Perhaps the most useful for data science is the ability to easily go between Python and R." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%load_ext rpy2.ipython" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "x = %R 1:15" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15], dtype=int32)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": true }, "outputs": [], "source": [ "y = %R -i x x^2 + 3 + rnorm(length(x))" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 3.9395204 , 5.34805803, 11.25033399, 19.58789804,\n", " 28.24917345, 39.13577363, 52.51709934, 67.27538519,\n", " 87.26379034, 102.63916992, 123.36500302, 146.54990275,\n", " 171.79786825, 199.2817583 , 228.31390622])" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\n", "Call:\n", "lm(formula = y ~ x)\n", "\n", "Residuals:\n", " Min 1Q Median 3Q Max \n", "-18.492 -14.547 -3.388 10.827 30.470 \n", "\n", "Coefficients:\n", " Estimate Std. 
Error t value Pr(>|t|) \n", "(Intercept) -42.573 9.490 -4.486 0.000613 ***\n", "x 16.043 1.044 15.370 1.02e-09 ***\n", "---\n", "Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1\n", "\n", "Residual standard error: 17.47 on 13 degrees of freedom\n", "Multiple R-squared: 0.9478,\tAdjusted R-squared: 0.9438 \n", "F-statistic: 236.2 on 1 and 13 DF, p-value: 1.022e-09\n", "\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%R -i x,y -o m\n", "\n", "m <- lm(y ~ x)\n", "summary(m)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(Intercept) x \n", " -42.57263 16.04253 \n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%R\n", "\n", "coef(m)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[-42.57263055, 16.04253416]])" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.array(m.rx('coefficients'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Algorithmic complexity" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## More on data collections" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tuple and namedtuple" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": true }, "outputs": [], "source": [ "x = ('Tom', 'Jones', 'chemistry', 'statistics', '3.5')" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from collections import namedtuple" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": true }, "outputs": [], "source": [ "student = 
namedtuple('student', 'first last major minor gpa')" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "collapsed": true }, "outputs": [], "source": [ "x = student('Tom', 'Jones', 'chemistry', 'statistics', '3.5')" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "student(first='Tom', last='Jones', major='chemistry', minor='statistics', gpa='3.5')" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "('Tom', 'Jones')" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x.first, x.last" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'3.5'" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x.gpa" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [], "source": [ "y = student(last='Rice', first='Anne', major='Philosophy', minor='English', gpa=4.0)" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "student(first='Anne', last='Rice', major='Philosophy', minor='English', gpa=4.0)" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Unpacking" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "collapsed": true }, "outputs": [], "source": [ "a, b, c, d = range(1, 5)" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1, 2, 3, 4)" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a, b, c, d" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "collapsed": true }, "outputs": [], 
"source": [ "a, *b, c, d = range(1, 10)" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1, [2, 3, 4, 5, 6, 7], 8, 9)" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a, b, c, d" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1, [2, 3, 4, 5, 6, 7])" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a, b" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "collapsed": true }, "outputs": [], "source": [ "a, b = b, a" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "([2, 3, 4, 5, 6, 7], 1)" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a, b" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Appending and extending lists" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "collapsed": true }, "outputs": [], "source": [ "xs = []" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "collapsed": true }, "outputs": [], "source": [ "xs.append(1)" ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "collapsed": true }, "outputs": [], "source": [ "xs.append(2)" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1, 2]" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "xs" ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "collapsed": true }, "outputs": [], "source": [ "xs.extend([3,4,5])" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1, 2, 3, 4, 5]" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "xs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 
Dictionary creation and idioms" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'a': 1, 'b': 2, 'c': 3}" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "{'a': 1, 'b': 2, 'c': 3}" ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'a': 1, 'b': 2, 'c': 3}" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dict(a=1, b=2, c=3)" ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'a': 1, 'b': 2, 'c': 3}" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dict(zip('abc', [1,2,3]))" ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'a': 0, 'b': 0, 'c': 0}" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dict.fromkeys('abc', 0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Flow control" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### If-elif-else" ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1\n", "2\n", "fizz\n", "4\n", "buzz\n", "fizz\n", "7\n", "8\n", "fizz\n", "buzz\n", "11\n", "fizz\n", "13\n", "14\n", "fizzbuzz\n", "16\n", "17\n", "fizz\n", "19\n", "buzz\n", "fizz\n", "22\n", "23\n", "fizz\n", "buzz\n", "26\n", "fizz\n", "28\n", "29\n", "fizzbuzz\n" ] } ], "source": [ "for i in range(1, 31):\n", "    if i % 15 == 0:\n", "        print('fizzbuzz')\n", "    elif i % 3 == 0:\n", "        print('fizz')\n", "    elif i % 5 == 0:\n", "        print('buzz')\n", "    else:\n", "        print(i)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Ternary if operator" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [ { "data": { 
"text/plain": [ "'fizzbuzz'" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "i = 45\n", "x = 'fizzbuzz' if i % 15 == 0 else i\n", "x" ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "42" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "i = 42\n", "x = 'fizzbuzz' if i % 15 == 0 else i\n", "x" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The for loop" ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0\n", "1\n", "4\n" ] } ], "source": [ "for x in range(3):\n", " print(x**2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using `enumerate`" ] }, { "cell_type": "code", "execution_count": 86, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 10\n", "1 11\n", "2 12\n", "3 13\n", "4 14\n" ] } ], "source": [ "for i, x in enumerate(range(10, 15)):\n", " print(i, x)" ] }, { "cell_type": "code", "execution_count": 87, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 10\n", "2 11\n", "3 12\n", "4 13\n", "5 14\n" ] } ], "source": [ "for i, x in enumerate(range(10, 15), start=1):\n", " print(i, x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Nested loops" ] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 3\n", "0 4\n", "1 3\n", "1 4\n", "2 3\n", "2 4\n" ] } ], "source": [ "for x in range(3):\n", " for y in range(3, 5):\n", " print(x, y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The while loop" ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0\n", "1\n", "2\n", "3\n", "4\n" ] } ], "source": [ "i = 0\n", "while i < 
5:\n", "    print(i)\n", "    i += 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### `pass`, `continue` and `break`" ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0\n", "1\n", "2\n", "3\n", "4\n" ] } ], "source": [ "for i in range(5):\n", "    if i == 3:\n", "        pass\n", "    print(i)" ] }, { "cell_type": "code", "execution_count": 65, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0\n", "1\n", "2\n", "4\n" ] } ], "source": [ "for i in range(5):\n", "    if i == 3:\n", "        continue\n", "    print(i)" ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0\n", "1\n", "2\n" ] } ], "source": [ "for i in range(5):\n", "    if i == 3:\n", "        break\n", "    print(i)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Iterable objects and iterators" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Iterators" ] }, { "cell_type": "code", "execution_count": 67, "metadata": { "collapsed": true }, "outputs": [], "source": [ "xs = range(5)" ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [ { "ename": "TypeError", "evalue": "'range' object is not an iterator", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mnext\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mxs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: 'range' object is not an iterator" ] } ], "source": [ "next(xs)" ] }, { "cell_type": "code", "execution_count": 69, "metadata": { "collapsed": true }, "outputs": [], "source": [ "it = iter(xs)" ] }, { "cell_type": 
"code", "execution_count": 70, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(0, 1, 2, 3, 4)" ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(it), next(it), next(it), next(it), next(it)" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [ { "ename": "StopIteration", "evalue": "", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mStopIteration\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mnext\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mit\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mStopIteration\u001b[0m: " ] } ], "source": [ "next(it)" ] }, { "cell_type": "code", "execution_count": 72, "metadata": { "collapsed": true }, "outputs": [], "source": [ "for x in it:\n", " print(x)" ] }, { "cell_type": "code", "execution_count": 73, "metadata": { "collapsed": true }, "outputs": [], "source": [ "it = iter(xs)" ] }, { "cell_type": "code", "execution_count": 76, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0\n", "1\n", "2\n", "3\n", "4\n" ] } ], "source": [ "it = iter(xs)\n", "for x in it:\n", " print(x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Generators" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Comprehensions" ] }, { "cell_type": "code", "execution_count": 80, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def squares(i=1):\n", " while True:\n", " yield i**2\n", " i = i + 1" ] }, { "cell_type": "code", "execution_count": 84, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1\n", "4\n", "9\n", "16\n", "25\n", "36\n", "49\n", "64\n", "81\n", "100\n" ] } ], "source": [ "for x in squares():\n", " if x > 
100:\n", " break\n", " print(x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Generator comprehensions" ] }, { "cell_type": "code", "execution_count": 77, "metadata": {}, "outputs": [ { "data": { "text/plain": [ " at 0x113e32c50>" ] }, "execution_count": 77, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(i**2 for i in range(5))" ] }, { "cell_type": "code", "execution_count": 78, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[0, 1, 4, 9, 16]" ] }, "execution_count": 78, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(i**2 for i in range(5))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### List comprehensions" ] }, { "cell_type": "code", "execution_count": 88, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[0, 1, 4, 9, 16]" ] }, "execution_count": 88, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[i**2 for i in range(5)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Dictionary comprehensions" ] }, { "cell_type": "code", "execution_count": 91, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'a': 1, 'b': 2, 'c': 3}" ] }, "execution_count": 91, "metadata": {}, "output_type": "execute_result" } ], "source": [ "{k: v for (v, k) in enumerate('abc', 1)}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Set comprehensions" ] }, { "cell_type": "code", "execution_count": 92, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'d', 'e', 'l', 'm', 't', 'u', 'w'}" ] }, "execution_count": 92, "metadata": {}, "output_type": "execute_result" } ], "source": [ "{char for char in 'tweedledum'}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Nested comprehensions" ] }, { "cell_type": "code", "execution_count": 93, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[('a', 0),\n", " ('a', 1),\n", " ('a', 2),\n", " ('b', 0),\n", " ('b', 1),\n", " ('b', 2),\n", " ('c', 0),\n", " ('c', 1),\n", 
" ('c', 2)]" ] }, "execution_count": 93, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[(i, j) for i in 'abc' for j in range(3)]" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.1" } }, "nbformat": 4, "nbformat_minor": 2 }