{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Functional programming in Python (operator, functional, itertoools, toolz)\n",
    "\n",
    "- Pure functions\n",
    "- Recursive functions\n",
    "- Anonymous functions\n",
    "- Lazy evaluation\n",
    "- Higher-order functions\n",
    "- Decorators\n",
    "- Partial application\n",
    "- Using `operator`\n",
    "- Using `functional`\n",
    "- Using `itertools`\n",
    "- Pipelines with `toolz`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pure functions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Deterministic"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Pure"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(148.4131591025766, 148.4131591025766)"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.exp(5), np.exp(5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Not pure"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(-0.07465802214544887, 1.0829226355933053)"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.random.randn(), np.random.randn()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### No side effects"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Pure"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "def f(xs):\n",
    "    '''Modify value at first index.'''\n",
    "    if len(xs) > 0:\n",
    "        xs = list(xs)\n",
    "        xs[0] = '@'\n",
    "    return xs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(['@', 2, 3], [1, 2, 3])"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "xs = [1,2,3]\n",
    "f(xs), xs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Not pure"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "def g(xs):\n",
    "    '''Modify value at first index.'''\n",
    "    if len(xs) > 0:\n",
    "        xs[0] = '@'\n",
    "    return xs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(['@', 2, 3], ['@', 2, 3])"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "xs = [1,2,3]\n",
    "g(xs), xs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise\n",
    "\n",
    "Is the function `h` pure or impure?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "def h(n, xs=[]):\n",
    "    for i in range(n):\n",
    "        xs.append(i)\n",
    "    return xs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The function is not deterministic, and it has side effects!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "n = 5\n",
    "xs = [1,2,3]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Non-deterministic"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0, 1, 2, 3, 4]"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "h(n)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0, 1, 2, 3, 4, 0, 1, 2, 3, 4]"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "h(n)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To avoid non-determinism, do not set default mutable arguments. The usaal Python idiom is\n",
    "\n",
    "```python\n",
    "def func(xs=None):\n",
    "    \"\"\"Docstring.\"\"\"\n",
    "    \n",
    "    if xs is None:\n",
    "        xs = []\n",
    "    do_something(xs)\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Side effects"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "xs = [1,2,3]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 2, 3, 0, 1, 2, 3, 4]"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "h(n, xs)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 2, 3, 0, 1, 2, 3, 4]"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "xs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Recursive functions\n",
    "\n",
    "A recursive function is one that calls itself. Python supports recursion, but recursive functions in Python are not efficient and iterative algorithms are preferred in general."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "def rec_sum(xs):\n",
    "    \"\"\"Recursive sum.\"\"\"\n",
    "    \n",
    "    if len(xs) == 0:\n",
    "        return 0\n",
    "    else:\n",
    "        return xs[0] + rec_sum(xs[1:])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "10"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rec_sum([1,2,3,4])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Anonymous functions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<function __main__.<lambda>(x, y)>"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lambda x, y: x + y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
    "add = lambda x, y: x + y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "7"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "add(3, 4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<function __main__.<lambda>(x, y)>"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lambda x, y: x if x < y else y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "smaller = lambda x, y: x if x < y else y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "smaller(9,1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Lazy evaluation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "range(0, 10)"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "range(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(range(10))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Generators\n",
    "\n",
    "Generators behave like functions that retain the last state and can be re-entered. Results are returned with `yield` rather than `return`. Generators are used extensively in Python, and almost exclusively in the `itertools` and `toolz` packages we will review later in this  notebook."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Differences between a function and a generator"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
    "def fib_eager(n):\n",
    "    \"\"\"Eager Fibonacci function.\"\"\"\n",
    "    \n",
    "    xs = []\n",
    "    a, b = 1,1\n",
    "    for i in range(n):\n",
    "        xs.append(a)\n",
    "        a, b = b, a + b\n",
    "    return xs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "fib_eager(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
    "def fib_lazy(n):\n",
    "    \"\"\"Lazy Fibonacci generator.\"\"\"\n",
    "    \n",
    "    a, b = 1,1\n",
    "    for i in range(n):\n",
    "        yield a\n",
    "        a, b = b, a + b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<generator object fib_lazy at 0x117053250>"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "fib_lazy(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(fib_lazy(10))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [],
   "source": [
    "fibs1 = fib_eager(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,1,2,3,5,8,13,21,34,55,"
     ]
    }
   ],
   "source": [
    "for i in fibs1:\n",
    "    print(i, end=',')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,1,2,3,5,8,13,21,34,55,"
     ]
    }
   ],
   "source": [
    "for i in fibs1:\n",
    "    print(i, end=',')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [],
   "source": [
    "fibs2 = fib_lazy(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,1,2,3,5,8,13,21,34,55,"
     ]
    }
   ],
   "source": [
    "for i in fibs2:\n",
    "    print(i, end=',')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [],
   "source": [
    "for i in fibs2:\n",
    "    print(i, end=',')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Generators can return infinite sequences"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [],
   "source": [
    "def iota(start = 1):\n",
    "    \"\"\"An infinite incrementing genrator.\"\"\"\n",
    "    \n",
    "    while True:\n",
    "        yield start\n",
    "        start += 1    "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,"
     ]
    }
   ],
   "source": [
    "for i in iota():\n",
    "    print(i, end=',')\n",
    "    if i > 25:\n",
    "        break"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Higher order functions\n",
    "\n",
    "- Take a function as an argument \n",
    "- Returns a function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [],
   "source": [
    "def dist(x, y, measure):\n",
    "    \"\"\"Returns distance between x and y using given measure.\n",
    "    \n",
    "    measure is a function that takes two arguments x and y.\n",
    "    \"\"\"\n",
    "    \n",
    "    return measure(x, y)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [],
   "source": [
    "def euclid(x, y):\n",
    "    \"\"\"Returns Euclidean distance between x and y.\"\"\"\n",
    "    \n",
    "    return np.sqrt(np.sum(x**2 + y**2))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5.0"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = np.array([0,0])\n",
    "y = np.array([3,4])\n",
    "\n",
    "dist(x, y, measure=euclid)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Standard HOFs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [],
   "source": [
    "from functools import reduce"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(map(lambda x: x**2, range(10)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0, 2, 4, 6, 8]"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(filter(lambda x: x % 2 == 0, range(10)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "45"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "reduce(lambda x, y: x + y, range(10))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "55"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "reduce(lambda x, y: x + y, range(10), 10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example: Flattening a nested list"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[['the', 'quick', 'brown', 'fox'], ['jumps', 'over', 'the', 'dog']]"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "s1 = 'the quick brown fox'\n",
    "s2 = 'jumps over the dog'\n",
    "xs = [s.split() for s in [s1, s2]]\n",
    "xs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using a nested for loop"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['the', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'dog']"
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ys = []\n",
    "for x in xs:\n",
    "    for y in x:\n",
    "        ys.append(y)\n",
    "ys"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using a list comprehension"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['the', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'dog']"
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "[y for x in xs for y in x]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using `reduce`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['the', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'dog']"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "reduce(lambda x, y: x + y, xs)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Closure"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [],
   "source": [
    "def f(a):\n",
    "    \"\"\"The argument given to f is visible in g when g is called.\"\"\"\n",
    "    \n",
    "    def g(b):\n",
    "        return a + b\n",
    "    return g"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [],
   "source": [
    "g3 = f(3)\n",
    "g5 = f(5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(7, 9)"
      ]
     },
     "execution_count": 52,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "g3(4), g5(4)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Decorators\n",
    "\n",
    "A decorator is a higher-order function that takes a function as input and returns a *decorated* version of the original function with additional capabilities. There is syntactical sugar for decorators in Python using the `@decorator_function` notation as shown below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [],
   "source": [
    "import time"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {},
   "outputs": [],
   "source": [
    "def timer(f):\n",
    "    \"\"\"Times how long f takes.\"\"\"\n",
    "    \n",
    "    def g(*args, **kwargs):\n",
    "        \n",
    "        start = time.time()     \n",
    "        res = f(*args, **kwargs)\n",
    "        elapsed = time.time() - start\n",
    "        return res, elapsed\n",
    "    return g"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [],
   "source": [
    "def slow(n=1):\n",
    "    time.sleep(n)\n",
    "    return n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Use as a function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {},
   "outputs": [],
   "source": [
    "slow_t1 = timer(slow)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(0.5, 0.500197172164917)"
      ]
     },
     "execution_count": 57,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "slow_t1(0.5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Use as a decorator"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {},
   "outputs": [],
   "source": [
    "@timer\n",
    "def slow_t2(n=1):\n",
    "    time.sleep(n)\n",
    "    return n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(0.5, 0.5045151710510254)"
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "slow_t2(0.5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Partial application\n",
    "\n",
    "Partial application takes a function with two or more parameters, and returns the function with some parameters *filled in*. Partial application is very useful when constructing pipelines that transform data in stages."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "metadata": {},
   "outputs": [],
   "source": [
    "from functools import partial"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "metadata": {},
   "outputs": [],
   "source": [
    "def add(a, b):\n",
    "    \"\"\"Add a and b.\"\"\"\n",
    "    \n",
    "    return a + b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[10, 11, 12, 13, 14]"
      ]
     },
     "execution_count": 62,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(map(partial(add, b=10), range(5)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using `operator`\n",
    "\n",
    "Operator provides named version of Python operators, and is typically used in place of anonymous functions in higher order functions to improve readability."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "metadata": {},
   "outputs": [],
   "source": [
    "import operator as op"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "metadata": {},
   "outputs": [],
   "source": [
    "xs = [('a', 3), ('b', 1), ('c', 2)]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 65,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('b', 1), ('c', 2), ('a', 3)]"
      ]
     },
     "execution_count": 65,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sorted(xs, key=op.itemgetter(1))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('b', 1), ('c', 2), ('a', 3)]"
      ]
     },
     "execution_count": 66,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sorted(xs, key=lambda x: x[1])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "10"
      ]
     },
     "execution_count": 67,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "reduce(op.add, range(1,5))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "10"
      ]
     },
     "execution_count": 68,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "reduce(lambda a, b: a + b, range(1,5))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using `functional`\n",
    "\n",
    "We have already seen the use of `reduce` and `partial` from `functools`. Another useful function from the package is the `lrucache` decorator."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "metadata": {},
   "outputs": [],
   "source": [
    "def fib(n):\n",
    "    \"\"\"Recursive version of Fibonacci.\"\"\"\n",
    "    \n",
    "    print('Call fib(%d)' % n)\n",
    "    \n",
    "    if n == 0:\n",
    "        return 0\n",
    "    elif n == 1:\n",
    "        return 1\n",
    "    else:\n",
    "        return fib(n-2) + fib(n-1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Notice the inefficiency from repeatedly calling the function with the same arguments."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Call fib(6)\n",
      "Call fib(4)\n",
      "Call fib(2)\n",
      "Call fib(0)\n",
      "Call fib(1)\n",
      "Call fib(3)\n",
      "Call fib(1)\n",
      "Call fib(2)\n",
      "Call fib(0)\n",
      "Call fib(1)\n",
      "Call fib(5)\n",
      "Call fib(3)\n",
      "Call fib(1)\n",
      "Call fib(2)\n",
      "Call fib(0)\n",
      "Call fib(1)\n",
      "Call fib(4)\n",
      "Call fib(2)\n",
      "Call fib(0)\n",
      "Call fib(1)\n",
      "Call fib(3)\n",
      "Call fib(1)\n",
      "Call fib(2)\n",
      "Call fib(0)\n",
      "Call fib(1)\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "8"
      ]
     },
     "execution_count": 70,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "fib(6)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "metadata": {},
   "outputs": [],
   "source": [
    "from functools import lru_cache"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {},
   "outputs": [],
   "source": [
    "@lru_cache(maxsize=None)\n",
    "def fib_cache(n):\n",
    "    \"\"\"Recursive version of Fibonacci.\"\"\"\n",
    "    \n",
    "    print('Call fib(%d)' % n)\n",
    "    \n",
    "    if n == 0:\n",
    "        return 0\n",
    "    elif n == 1:\n",
    "        return 1\n",
    "    else:\n",
    "        return fib_cache(n-2) + fib_cache(n-1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Call fib(6)\n",
      "Call fib(4)\n",
      "Call fib(2)\n",
      "Call fib(0)\n",
      "Call fib(1)\n",
      "Call fib(3)\n",
      "Call fib(5)\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "8"
      ]
     },
     "execution_count": 73,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "fib_cache(6)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using `itertools`\n",
    "\n",
    "The `itertools` package provides tools for efficient looping."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 74,
   "metadata": {},
   "outputs": [],
   "source": [
    "import itertools as it"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Generators"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]"
      ]
     },
     "execution_count": 75,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.islice(it.count(), 10))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 76,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 2, 3, 1, 2, 3, 1, 2, 3, 1]"
      ]
     },
     "execution_count": 76,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.islice(it.cycle([1,2,3]), 10))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 77,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[[1, 2, 3],\n",
       " [1, 2, 3],\n",
       " [1, 2, 3],\n",
       " [1, 2, 3],\n",
       " [1, 2, 3],\n",
       " [1, 2, 3],\n",
       " [1, 2, 3],\n",
       " [1, 2, 3],\n",
       " [1, 2, 3],\n",
       " [1, 2, 3]]"
      ]
     },
     "execution_count": 77,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.islice(it.repeat([1,2,3]), 10))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 78,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[(0, 'a'), (1, 'b'), (2, 'c'), (3, None), (4, None)]"
      ]
     },
     "execution_count": 78,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.zip_longest(range(5), 'abc'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 79,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[(0, 'a'), (1, 'b'), (2, 'c'), (3, 'out of donuts'), (4, 'out of donuts')]"
      ]
     },
     "execution_count": 79,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.zip_longest(range(5), 'abc', fillvalue='out of donuts'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Permutations and combinations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 80,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('a', 'b', 'c'),\n",
       " ('a', 'c', 'b'),\n",
       " ('b', 'a', 'c'),\n",
       " ('b', 'c', 'a'),\n",
       " ('c', 'a', 'b'),\n",
       " ('c', 'b', 'a')]"
      ]
     },
     "execution_count": 80,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.permutations('abc'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 81,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('a', 'b'), ('a', 'c'), ('b', 'a'), ('b', 'c'), ('c', 'a'), ('c', 'b')]"
      ]
     },
     "execution_count": 81,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.permutations('abc', 2))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 82,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('a', 'b'), ('a', 'c'), ('b', 'c')]"
      ]
     },
     "execution_count": 82,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.combinations('abc', 2))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 83,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('a', 'a'), ('a', 'b'), ('a', 'c'), ('b', 'b'), ('b', 'c'), ('c', 'c')]"
      ]
     },
     "execution_count": 83,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.combinations_with_replacement('abc', 2))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 84,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('a', 'a'),\n",
       " ('a', 'b'),\n",
       " ('a', 'c'),\n",
       " ('b', 'a'),\n",
       " ('b', 'b'),\n",
       " ('b', 'c'),\n",
       " ('c', 'a'),\n",
       " ('c', 'b'),\n",
       " ('c', 'c')]"
      ]
     },
     "execution_count": 84,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.product('abc', repeat=2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Miscellaneous"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 85,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 2, 3, 4, 5, 6, 7, 8, 9]"
      ]
     },
     "execution_count": 85,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.chain([1,2,3], [4,5,6], [7,8,9]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 86,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 2, 3, 4, 5, 6, 7, 8, 9]"
      ]
     },
     "execution_count": 86,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.chain.from_iterable([[1,2,3], [4,5,6], [7,8,9]]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 87,
   "metadata": {},
   "outputs": [],
   "source": [
    "nums = [1,3,5,7,2,4,6,1,3,5,7,9,2,2,2]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 88,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "False [1, 3, 5, 7]\n",
      "True [2, 4, 6]\n",
      "False [1, 3, 5, 7, 9]\n",
      "True [2, 2, 2]\n"
     ]
    }
   ],
   "source": [
    "for i, g in it.groupby(nums, key=lambda x: x % 2 ==0):\n",
    "    print(i, list(g))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 89,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 3, 5, 7]"
      ]
     },
     "execution_count": 89,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.takewhile(lambda x: x % 2 == 1, nums))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 90,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[2, 4, 6, 1, 3, 5, 7, 9, 2, 2, 2]"
      ]
     },
     "execution_count": 90,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.dropwhile(lambda x: x % 2 == 1, nums))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using `toolz`\n",
    "\n",
    "The `toolz` package provides a very rich set of functional operators, and is recommended if you want to program in the functional style using Python."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 91,
   "metadata": {},
   "outputs": [],
   "source": [
    "import toolz"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 92,
   "metadata": {},
   "outputs": [],
   "source": [
    "from toolz import sliding_window, pipe, frequencies, concat"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 93,
   "metadata": {},
   "outputs": [],
   "source": [
    "import toolz.curried as c"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Example: Read in some documents, remove punctuation, split into words and then into individual characters, and find the most commonly occurring sliding windows containing 3 characters.\n",
    "\n",
    "This is most naturally done by piping in a data set through a series of transforms."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 94,
   "metadata": {},
   "outputs": [],
   "source": [
    "d1 = 'the doo doo doo the dah dah dah'\n",
    "d2 = 'every breath she takes, every move she makes'\n",
    "d3 = 'another brick in the wall'\n",
    "d4 = 'another one bites the dust and another one gone'\n",
    "docs = [d1,d2,d3,d4]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 95,
   "metadata": {},
   "outputs": [],
   "source": [
    "import string"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 126,
   "metadata": {},
   "outputs": [],
   "source": [
    "triple_freqs = pipe(\n",
    "    docs,\n",
    "    c.map(lambda x: x.translate(str.maketrans('', '', string.punctuation))),\n",
    "    c.map(lambda x: x.split()),\n",
    "    concat,\n",
    "    concat,\n",
    "    c.sliding_window(3),\n",
    "    c.map(lambda x: ''.join(x)),\n",
    "    frequencies,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 127,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('the', 7), ('oth', 4), ('hed', 3), ('doo', 3), ('dah', 3)]"
      ]
     },
     "execution_count": 127,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sorted(triple_freqs.items(), key=lambda x: x[1], reverse=True)[:5]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step by step"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 130,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'the': 7,\n",
       " 'hed': 3,\n",
       " 'edo': 1,\n",
       " 'doo': 3,\n",
       " 'ood': 2,\n",
       " 'odo': 2,\n",
       " 'oot': 1,\n",
       " 'oth': 4,\n",
       " 'eda': 1,\n",
       " 'dah': 3,\n",
       " 'ahd': 2,\n",
       " 'hda': 2,\n",
       " 'ahe': 1,\n",
       " 'hev': 1,\n",
       " 'eve': 2,\n",
       " 'ver': 2,\n",
       " 'ery': 2,\n",
       " 'ryb': 1,\n",
       " 'ybr': 1,\n",
       " 'bre': 1,\n",
       " 'rea': 1,\n",
       " 'eat': 1,\n",
       " 'ath': 1,\n",
       " 'ths': 1,\n",
       " 'hsh': 1,\n",
       " 'she': 2,\n",
       " 'het': 1,\n",
       " 'eta': 1,\n",
       " 'tak': 1,\n",
       " 'ake': 2,\n",
       " 'kes': 2,\n",
       " 'ese': 1,\n",
       " 'sev': 1,\n",
       " 'rym': 1,\n",
       " 'ymo': 1,\n",
       " 'mov': 1,\n",
       " 'ove': 1,\n",
       " 'ves': 1,\n",
       " 'esh': 1,\n",
       " 'hem': 1,\n",
       " 'ema': 1,\n",
       " 'mak': 1,\n",
       " 'esa': 1,\n",
       " 'san': 1,\n",
       " 'ano': 3,\n",
       " 'not': 3,\n",
       " 'her': 3,\n",
       " 'erb': 1,\n",
       " 'rbr': 1,\n",
       " 'bri': 1,\n",
       " 'ric': 1,\n",
       " 'ick': 1,\n",
       " 'cki': 1,\n",
       " 'kin': 1,\n",
       " 'int': 1,\n",
       " 'nth': 1,\n",
       " 'hew': 1,\n",
       " 'ewa': 1,\n",
       " 'wal': 1,\n",
       " 'all': 1,\n",
       " 'lla': 1,\n",
       " 'lan': 1,\n",
       " 'ero': 2,\n",
       " 'ron': 2,\n",
       " 'one': 3,\n",
       " 'neb': 1,\n",
       " 'ebi': 1,\n",
       " 'bit': 1,\n",
       " 'ite': 1,\n",
       " 'tes': 1,\n",
       " 'est': 1,\n",
       " 'sth': 1,\n",
       " 'edu': 1,\n",
       " 'dus': 1,\n",
       " 'ust': 1,\n",
       " 'sta': 1,\n",
       " 'tan': 1,\n",
       " 'and': 1,\n",
       " 'nda': 1,\n",
       " 'dan': 1,\n",
       " 'neg': 1,\n",
       " 'ego': 1,\n",
       " 'gon': 1}"
      ]
     },
     "execution_count": 130,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pipe(\n",
    "    docs,\n",
    "    c.map(lambda x: x.translate(str.maketrans('', '', string.punctuation))),\n",
    "    c.map(lambda x: x.split()),\n",
    "    concat,\n",
    "    concat,\n",
    "    c.sliding_window(3),\n",
    "    c.map(lambda x: ''.join(x)),\n",
    "    frequencies,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 131,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['the doo doo doo the dah dah dah',\n",
       " 'every breath she takes, every move she makes',\n",
       " 'another brick in the wall',\n",
       " 'another one bites the dust and another one gone']"
      ]
     },
     "execution_count": 131,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pipe(\n",
    "    docs,\n",
    "    list\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 132,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['the doo doo doo the dah dah dah',\n",
       " 'every breath she takes every move she makes',\n",
       " 'another brick in the wall',\n",
       " 'another one bites the dust and another one gone']"
      ]
     },
     "execution_count": 132,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pipe(\n",
    "    docs,\n",
    "    c.map(lambda x: x.translate(str.maketrans('', '', string.punctuation))),\n",
    "    list\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 133,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[['the', 'doo', 'doo', 'doo', 'the', 'dah', 'dah', 'dah'],\n",
       " ['every', 'breath', 'she', 'takes', 'every', 'move', 'she', 'makes'],\n",
       " ['another', 'brick', 'in', 'the', 'wall'],\n",
       " ['another', 'one', 'bites', 'the', 'dust', 'and', 'another', 'one', 'gone']]"
      ]
     },
     "execution_count": 133,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pipe(\n",
    "    docs,\n",
    "    c.map(lambda x: x.translate(str.maketrans('', '', string.punctuation))),\n",
    "    c.map(lambda x: x.split()),\n",
    "    list\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 134,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['the', 'doo', 'doo', 'doo', 'the']"
      ]
     },
     "execution_count": 134,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pipe(\n",
    "    docs,\n",
    "    c.map(lambda x: x.translate(str.maketrans('', '', string.punctuation))),\n",
    "    c.map(lambda x: x.split()),\n",
    "    concat,\n",
    "    c.take(5),\n",
    "    list\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 135,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['t', 'h', 'e', 'd', 'o']"
      ]
     },
     "execution_count": 135,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pipe(\n",
    "    docs,\n",
    "    c.map(lambda x: x.translate(str.maketrans('', '', string.punctuation))),\n",
    "    c.map(lambda x: x.split()),\n",
    "    concat,\n",
    "    concat,\n",
    "    c.take(5),\n",
    "    list\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 136,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('t', 'h', 'e'),\n",
       " ('h', 'e', 'd'),\n",
       " ('e', 'd', 'o'),\n",
       " ('d', 'o', 'o'),\n",
       " ('o', 'o', 'd')]"
      ]
     },
     "execution_count": 136,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pipe(\n",
    "    docs,\n",
    "    c.map(lambda x: x.translate(str.maketrans('', '', string.punctuation))),\n",
    "    c.map(lambda x: x.split()),\n",
    "    concat,\n",
    "    concat,\n",
    "    c.sliding_window(3),\n",
    "    c.take(5),\n",
    "    list\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 140,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['the', 'hed', 'edo', 'doo', 'ood', 'odo', 'doo', 'ood', 'odo', 'doo']"
      ]
     },
     "execution_count": 140,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pipe(\n",
    "    docs,\n",
    "    c.map(lambda x: x.translate(str.maketrans('', '', string.punctuation))),\n",
    "    c.map(lambda x: x.split()),\n",
    "    concat,\n",
    "    concat,\n",
    "    c.sliding_window(3),\n",
    "    c.map(lambda x: ''.join(x)),\n",
    "    c.take(10),\n",
    "    list\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 139,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'the': 7,\n",
       " 'hed': 3,\n",
       " 'edo': 1,\n",
       " 'doo': 3,\n",
       " 'ood': 2,\n",
       " 'odo': 2,\n",
       " 'oot': 1,\n",
       " 'oth': 4,\n",
       " 'eda': 1,\n",
       " 'dah': 3,\n",
       " 'ahd': 2,\n",
       " 'hda': 2,\n",
       " 'ahe': 1,\n",
       " 'hev': 1,\n",
       " 'eve': 2,\n",
       " 'ver': 2,\n",
       " 'ery': 2,\n",
       " 'ryb': 1,\n",
       " 'ybr': 1,\n",
       " 'bre': 1,\n",
       " 'rea': 1,\n",
       " 'eat': 1,\n",
       " 'ath': 1,\n",
       " 'ths': 1,\n",
       " 'hsh': 1,\n",
       " 'she': 2,\n",
       " 'het': 1,\n",
       " 'eta': 1,\n",
       " 'tak': 1,\n",
       " 'ake': 2,\n",
       " 'kes': 2,\n",
       " 'ese': 1,\n",
       " 'sev': 1,\n",
       " 'rym': 1,\n",
       " 'ymo': 1,\n",
       " 'mov': 1,\n",
       " 'ove': 1,\n",
       " 'ves': 1,\n",
       " 'esh': 1,\n",
       " 'hem': 1,\n",
       " 'ema': 1,\n",
       " 'mak': 1,\n",
       " 'esa': 1,\n",
       " 'san': 1,\n",
       " 'ano': 3,\n",
       " 'not': 3,\n",
       " 'her': 3,\n",
       " 'erb': 1,\n",
       " 'rbr': 1,\n",
       " 'bri': 1,\n",
       " 'ric': 1,\n",
       " 'ick': 1,\n",
       " 'cki': 1,\n",
       " 'kin': 1,\n",
       " 'int': 1,\n",
       " 'nth': 1,\n",
       " 'hew': 1,\n",
       " 'ewa': 1,\n",
       " 'wal': 1,\n",
       " 'all': 1,\n",
       " 'lla': 1,\n",
       " 'lan': 1,\n",
       " 'ero': 2,\n",
       " 'ron': 2,\n",
       " 'one': 3,\n",
       " 'neb': 1,\n",
       " 'ebi': 1,\n",
       " 'bit': 1,\n",
       " 'ite': 1,\n",
       " 'tes': 1,\n",
       " 'est': 1,\n",
       " 'sth': 1,\n",
       " 'edu': 1,\n",
       " 'dus': 1,\n",
       " 'ust': 1,\n",
       " 'sta': 1,\n",
       " 'tan': 1,\n",
       " 'and': 1,\n",
       " 'nda': 1,\n",
       " 'dan': 1,\n",
       " 'neg': 1,\n",
       " 'ego': 1,\n",
       " 'gon': 1}"
      ]
     },
     "execution_count": 139,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pipe(\n",
    "    docs,\n",
    "    c.map(lambda x: x.translate(str.maketrans('', '', string.punctuation))),\n",
    "    c.map(lambda x: x.split()),\n",
    "    concat,\n",
    "    concat,\n",
    "    c.sliding_window(3),\n",
    "    c.map(lambda x: ''.join(x)),\n",
    "    frequencies\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}