{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Functional programming in Python (operator, functional, itertoools, toolz)\n",
    "\n",
    "- Pure functions\n",
    "- Recursive functions\n",
    "- Anonymous functions\n",
    "- Lazy evaluation\n",
    "- Higher-order functions\n",
    "- Decorators\n",
    "- Partial application\n",
    "- Using `operator`\n",
    "- Using `functional`\n",
    "- Using `itertools`\n",
    "- Pipelines with `toolz`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pure functions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Deterministic"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Pure"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(148.4131591025766, 148.4131591025766)"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.exp(5), np.exp(5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Not pure"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(0.1539109537552415, 0.25636728267414016)"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.random.randn(), np.random.randn()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### No side effects"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Pure"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "def f(xs):\n",
    "    '''Modify value at first index.'''\n",
    "    if len(xs) > 0:\n",
    "        xs = list(xs)\n",
    "        xs[0] = '@'\n",
    "    return xs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(['@', 2, 3], [1, 2, 3])"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "xs = [1,2,3]\n",
    "f(xs), xs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Not pure"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "def g(xs):\n",
    "    '''Modify value at first index.'''\n",
    "    if len(xs) > 0:\n",
    "        xs[0] = '@'\n",
    "    return xs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(['@', 2, 3], ['@', 2, 3])"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "xs = [1,2,3]\n",
    "g(xs), xs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise\n",
    "\n",
    "Is the function `h` pure or impure?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "def h(n, xs=[]):\n",
    "    for i in range(n):\n",
    "        xs.append(i)\n",
    "    return xs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The function is not deterministic, and it has side effects!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "n = 5\n",
    "xs = [1,2,3]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Non-deterministic"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0, 1, 2, 3, 4]"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "h(n)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0, 1, 2, 3, 4, 0, 1, 2, 3, 4]"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "h(n)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To avoid non-determinism, do not set default mutable arguments. The usaal Python idiom is\n",
    "\n",
    "```python\n",
    "def func(xs=None):\n",
    "    \"\"\"Docstring.\"\"\"\n",
    "    \n",
    "    if xs is None:\n",
    "        xs = []\n",
    "    do_something(xs)\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Side effects"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "xs = [1,2,3]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 2, 3, 0, 1, 2, 3, 4]"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "h(n, xs)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 2, 3, 0, 1, 2, 3, 4]"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "xs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Recursive functions\n",
    "\n",
    "A recursive function is one that calls itself. Python supports recursion, but recursive functions in Python are not efficient and iterative algorithms are preferred in general."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "def rec_sum(xs):\n",
    "    \"\"\"Recursive sum.\"\"\"\n",
    "    \n",
    "    if len(xs) == 0:\n",
    "        return 0\n",
    "    else:\n",
    "        return xs[0] + rec_sum(xs[1:])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "10"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rec_sum([1,2,3,4])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Anonymous functions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<function __main__.<lambda>(x, y)>"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lambda x, y: x + y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
    "add = lambda x, y: x + y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "7"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "add(3, 4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<function __main__.<lambda>(x, y)>"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lambda x, y: x if x < y else y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "smaller = lambda x, y: x if x < y else y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "smaller(9,1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Lazy evaluation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "range(0, 10)"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "range(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(range(10))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Generators\n",
    "\n",
    "Generators behave like functions that retain the last state and can be re-entered. Results are returned with `yield` rather than `return`. Generators are used extensively in Python, and almost exclusively in the `itertools` and `toolz` packages we will review later in this  notebook."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Differences between a function and a generator"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
    "def fib_eager(n):\n",
    "    \"\"\"Eager Fibonacci function.\"\"\"\n",
    "    \n",
    "    xs = []\n",
    "    a, b = 1,1\n",
    "    for i in range(n):\n",
    "        xs.append(a)\n",
    "        a, b = b, a + b\n",
    "    return xs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "fib_eager(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
    "def fib_lazy(n):\n",
    "    \"\"\"Lazy Fibonacci generator.\"\"\"\n",
    "    \n",
    "    a, b = 1,1\n",
    "    for i in range(n):\n",
    "        yield a\n",
    "        a, b = b, a + b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<generator object fib_lazy at 0x116874c80>"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "fib_lazy(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(fib_lazy(10))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [],
   "source": [
    "fibs1 = fib_eager(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,1,2,3,5,8,13,21,34,55,"
     ]
    }
   ],
   "source": [
    "for i in fibs1:\n",
    "    print(i, end=',')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,1,2,3,5,8,13,21,34,55,"
     ]
    }
   ],
   "source": [
    "for i in fibs1:\n",
    "    print(i, end=',')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [],
   "source": [
    "fibs2 = fib_lazy(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,1,2,3,5,8,13,21,34,55,"
     ]
    }
   ],
   "source": [
    "for i in fibs2:\n",
    "    print(i, end=',')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [],
   "source": [
    "for i in fibs2:\n",
    "    print(i, end=',')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Generators can return infinite sequences"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [],
   "source": [
    "def iota(start = 1):\n",
    "    \"\"\"An infinite incrementing genrator.\"\"\"\n",
    "    \n",
    "    while True:\n",
    "        yield start\n",
    "        start += 1    "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,"
     ]
    }
   ],
   "source": [
    "for i in iota():\n",
    "    print(i, end=',')\n",
    "    if i > 25:\n",
    "        break"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Higher order functions\n",
    "\n",
    "- Take a function as an argument \n",
    "- Returns a function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [],
   "source": [
    "def dist(x, y, measure):\n",
    "    \"\"\"Returns distance between x and y using given measure.\n",
    "    \n",
    "    measure is a function that takes two arguments x and y.\n",
    "    \"\"\"\n",
    "    \n",
    "    return measure(x, y)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [],
   "source": [
    "def euclid(x, y):\n",
    "    \"\"\"Returns Euclidean distance between x and y.\"\"\"\n",
    "    \n",
    "    return np.sqrt(np.sum(x**2 + y**2))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5.0"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = np.array([0,0])\n",
    "y = np.array([3,4])\n",
    "\n",
    "dist(x, y, measure=euclid)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Standard HOFs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [],
   "source": [
    "from functools import reduce"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(map(lambda x: x**2, range(10)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0, 2, 4, 6, 8]"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(filter(lambda x: x % 2 == 0, range(10)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "45"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "reduce(lambda x, y: x + y, range(10))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "55"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "reduce(lambda x, y: x + y, range(10), 10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example: Flattening a nested list"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[['the', 'quick', 'brown', 'fox'], ['jumps', 'over', 'the', 'dog']]"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "s1 = 'the quick brown fox'\n",
    "s2 = 'jumps over the dog'\n",
    "xs = [s.split() for s in [s1, s2]]\n",
    "xs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using a nested for loop"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['the', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'dog']"
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ys = []\n",
    "for x in xs:\n",
    "    for y in x:\n",
    "        ys.append(y)\n",
    "ys"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using a list comprehension"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['the', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'dog']"
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "[y for x in xs for y in x]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using `reduce`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['the', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'dog']"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "reduce(lambda x, y: x + y, xs)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Closure"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [],
   "source": [
    "def f(a):\n",
    "    \"\"\"The argument given to f is visible in g when g is called.\"\"\"\n",
    "    \n",
    "    def g(b):\n",
    "        return a + b\n",
    "    return g"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [],
   "source": [
    "g3 = f(3)\n",
    "g5 = f(5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(7, 9)"
      ]
     },
     "execution_count": 52,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "g3(4), g5(4)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Decorators\n",
    "\n",
    "A decorator is a higher-order function that takes a function as input and returns a *decorated* version of the original function with additional capabilities. There is syntactical sugar for decorators in Python using the `@decorator_function` notation as shown below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [],
   "source": [
    "import time"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {},
   "outputs": [],
   "source": [
    "def timer(f):\n",
    "    \"\"\"Times how long f takes.\"\"\"\n",
    "    \n",
    "    def g(*args, **kwargs):\n",
    "        \n",
    "        start = time.time()     \n",
    "        res = f(*args, **kwargs)\n",
    "        elapsed = time.time() - start\n",
    "        return res, elapsed\n",
    "    return g"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [],
   "source": [
    "def slow(n=1):\n",
    "    time.sleep(n)\n",
    "    return n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Use as a function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {},
   "outputs": [],
   "source": [
    "slow_t1 = timer(slow)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(0.5, 0.5037732124328613)"
      ]
     },
     "execution_count": 57,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "slow_t1(0.5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Use as a decorator"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {},
   "outputs": [],
   "source": [
    "@timer\n",
    "def slow_t2(n=1):\n",
    "    time.sleep(n)\n",
    "    return n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(0.5, 0.5038459300994873)"
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "slow_t2(0.5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Partial application\n",
    "\n",
    "Partial application takes a function with two or more parameters, and returns the function with some parameters *filled in*. Partial application is very useful when constructing pipelines that transform data in stages."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "metadata": {},
   "outputs": [],
   "source": [
    "from functools import partial"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "metadata": {},
   "outputs": [],
   "source": [
    "def add(a, b):\n",
    "    \"\"\"Add a and b.\"\"\"\n",
    "    \n",
    "    return a + b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[10, 11, 12, 13, 14]"
      ]
     },
     "execution_count": 62,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(map(partial(add, b=10), range(5)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using `operator`\n",
    "\n",
    "Operator provides named version of Python operators, and is typically used in place of anonymous functions in higher order functions to improve readability."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "metadata": {},
   "outputs": [],
   "source": [
    "import operator as op"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "metadata": {},
   "outputs": [],
   "source": [
    "xs = [('a', 3), ('b', 1), ('c', 2)]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 65,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('b', 1), ('c', 2), ('a', 3)]"
      ]
     },
     "execution_count": 65,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sorted(xs, key=op.itemgetter(1))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('b', 1), ('c', 2), ('a', 3)]"
      ]
     },
     "execution_count": 66,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sorted(xs, key=lambda x: x[1])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "10"
      ]
     },
     "execution_count": 67,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "reduce(op.add, range(1,5))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "10"
      ]
     },
     "execution_count": 68,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "reduce(lambda a, b: a + b, range(1,5))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using `functional`\n",
    "\n",
    "We have already seen the use of `reduce` and `partial` from `functools`. Another useful function from the package is the `lrucache` decorator."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "metadata": {},
   "outputs": [],
   "source": [
    "def fib(n):\n",
    "    \"\"\"Recursive version of Fibonacci.\"\"\"\n",
    "    \n",
    "    print('Call fib(%d)' % n)\n",
    "    \n",
    "    if n == 0:\n",
    "        return 0\n",
    "    elif n == 1:\n",
    "        return 1\n",
    "    else:\n",
    "        return fib(n-2) + fib(n-1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Notice the inefficiency from repeatedly calling the function with the same arguments."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Call fib(6)\n",
      "Call fib(4)\n",
      "Call fib(2)\n",
      "Call fib(0)\n",
      "Call fib(1)\n",
      "Call fib(3)\n",
      "Call fib(1)\n",
      "Call fib(2)\n",
      "Call fib(0)\n",
      "Call fib(1)\n",
      "Call fib(5)\n",
      "Call fib(3)\n",
      "Call fib(1)\n",
      "Call fib(2)\n",
      "Call fib(0)\n",
      "Call fib(1)\n",
      "Call fib(4)\n",
      "Call fib(2)\n",
      "Call fib(0)\n",
      "Call fib(1)\n",
      "Call fib(3)\n",
      "Call fib(1)\n",
      "Call fib(2)\n",
      "Call fib(0)\n",
      "Call fib(1)\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "8"
      ]
     },
     "execution_count": 70,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "fib(6)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "metadata": {},
   "outputs": [],
   "source": [
    "from functools import lru_cache"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {},
   "outputs": [],
   "source": [
    "@lru_cache(maxsize=None)\n",
    "def fib_cache(n):\n",
    "    \"\"\"Recursive version of Fibonacci.\"\"\"\n",
    "    \n",
    "    print('Call fib(%d)' % n)\n",
    "    \n",
    "    if n == 0:\n",
    "        return 0\n",
    "    elif n == 1:\n",
    "        return 1\n",
    "    else:\n",
    "        return fib_cache(n-2) + fib_cache(n-1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Call fib(6)\n",
      "Call fib(4)\n",
      "Call fib(2)\n",
      "Call fib(0)\n",
      "Call fib(1)\n",
      "Call fib(3)\n",
      "Call fib(5)\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "8"
      ]
     },
     "execution_count": 73,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "fib_cache(6)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using `itertools`\n",
    "\n",
    "The `itertools` package provides tools for efficient looping."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 74,
   "metadata": {},
   "outputs": [],
   "source": [
    "import itertools as it"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Generators"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]"
      ]
     },
     "execution_count": 75,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.islice(it.count(), 10))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 76,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 2, 3, 1, 2, 3, 1, 2, 3, 1]"
      ]
     },
     "execution_count": 76,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.islice(it.cycle([1,2,3]), 10))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 77,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[[1, 2, 3],\n",
       " [1, 2, 3],\n",
       " [1, 2, 3],\n",
       " [1, 2, 3],\n",
       " [1, 2, 3],\n",
       " [1, 2, 3],\n",
       " [1, 2, 3],\n",
       " [1, 2, 3],\n",
       " [1, 2, 3],\n",
       " [1, 2, 3]]"
      ]
     },
     "execution_count": 77,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.islice(it.repeat([1,2,3]), 10))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 78,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[(0, 'a'), (1, 'b'), (2, 'c'), (3, None), (4, None)]"
      ]
     },
     "execution_count": 78,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.zip_longest(range(5), 'abc'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 79,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[(0, 'a'), (1, 'b'), (2, 'c'), (3, 'out of donuts'), (4, 'out of donuts')]"
      ]
     },
     "execution_count": 79,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.zip_longest(range(5), 'abc', fillvalue='out of donuts'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Permutations and combinations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 80,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('a', 'b', 'c'),\n",
       " ('a', 'c', 'b'),\n",
       " ('b', 'a', 'c'),\n",
       " ('b', 'c', 'a'),\n",
       " ('c', 'a', 'b'),\n",
       " ('c', 'b', 'a')]"
      ]
     },
     "execution_count": 80,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.permutations('abc'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 81,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('a', 'b'), ('a', 'c'), ('b', 'a'), ('b', 'c'), ('c', 'a'), ('c', 'b')]"
      ]
     },
     "execution_count": 81,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.permutations('abc', 2))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 82,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('a', 'b'), ('a', 'c'), ('b', 'c')]"
      ]
     },
     "execution_count": 82,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.combinations('abc', 2))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 83,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('a', 'a'), ('a', 'b'), ('a', 'c'), ('b', 'b'), ('b', 'c'), ('c', 'c')]"
      ]
     },
     "execution_count": 83,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.combinations_with_replacement('abc', 2))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 84,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('a', 'a'),\n",
       " ('a', 'b'),\n",
       " ('a', 'c'),\n",
       " ('b', 'a'),\n",
       " ('b', 'b'),\n",
       " ('b', 'c'),\n",
       " ('c', 'a'),\n",
       " ('c', 'b'),\n",
       " ('c', 'c')]"
      ]
     },
     "execution_count": 84,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.product('abc', repeat=2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Miscellaneous"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 85,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 2, 3, 4, 5, 6, 7, 8, 9]"
      ]
     },
     "execution_count": 85,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.chain([1,2,3], [4,5,6], [7,8,9]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 86,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 2, 3, 4, 5, 6, 7, 8, 9]"
      ]
     },
     "execution_count": 86,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.chain.from_iterable([[1,2,3], [4,5,6], [7,8,9]]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 87,
   "metadata": {},
   "outputs": [],
   "source": [
    "nums = [1,3,5,7,2,4,6,1,3,5,7,9,2,2,2]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 88,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "False [1, 3, 5, 7]\n",
      "True [2, 4, 6]\n",
      "False [1, 3, 5, 7, 9]\n",
      "True [2, 2, 2]\n"
     ]
    }
   ],
   "source": [
    "for i, g in it.groupby(nums, key=lambda x: x % 2 ==0):\n",
    "    print(i, list(g))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 89,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1, 3, 5, 7]"
      ]
     },
     "execution_count": 89,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.takewhile(lambda x: x % 2 == 1, nums))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 90,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[2, 4, 6, 1, 3, 5, 7, 9, 2, 2, 2]"
      ]
     },
     "execution_count": 90,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(it.dropwhile(lambda x: x % 2 == 1, nums))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example: Classic word-count using map-reduce\n",
    "\n",
    "There are two basic steps - first we create a tuple (word, 1), then group by word, then reduce the grouping to sum up the ones. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 91,
   "metadata": {},
   "outputs": [],
   "source": [
    "rhyme = 'humpty dumpty sat on a wall humpty dumpty had a great fall'\n",
    "words = rhyme.split()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Map to create key-value pair"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 92,
   "metadata": {},
   "outputs": [],
   "source": [
    "x1 = map(lambda x: (x, 1), words)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Group similar keys together"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 93,
   "metadata": {},
   "outputs": [],
   "source": [
    "x2 = it.groupby(sorted(x1), key=lambda x: x[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Reduce on value of key-value pair"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 94,
   "metadata": {},
   "outputs": [],
   "source": [
    "x3 = map(lambda x: (x[0], reduce(lambda a, b: a[1] + b[1], x[1])), x2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Clean-up because the nested reduce stops when the list has only one element"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 95,
   "metadata": {},
   "outputs": [],
   "source": [
    "x4 = map(lambda x: (x[0], x[1] if isinstance(x[1], int) else x[1][1]), x3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 96,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('a', 2),\n",
       " ('dumpty', 2),\n",
       " ('fall', 1),\n",
       " ('great', 1),\n",
       " ('had', 1),\n",
       " ('humpty', 2),\n",
       " ('on', 1),\n",
       " ('sat', 1),\n",
       " ('wall', 1)]"
      ]
     },
     "execution_count": 96,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list(x4)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using `toolz`\n",
    "\n",
    "The `toolz` package provides a very rich set of functional operators, and is recommended if you want to program in the functional style using Python."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 97,
   "metadata": {},
   "outputs": [],
   "source": [
    "import toolz"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 98,
   "metadata": {},
   "outputs": [],
   "source": [
    "from toolz import sliding_window, pipe, frequencies, concat"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 99,
   "metadata": {},
   "outputs": [],
   "source": [
    "import toolz.curried as c"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example word count revisited"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 100,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'humpty': 2,\n",
       " 'dumpty': 2,\n",
       " 'sat': 1,\n",
       " 'on': 1,\n",
       " 'a': 2,\n",
       " 'wall': 1,\n",
       " 'had': 1,\n",
       " 'great': 1,\n",
       " 'fall': 1}"
      ]
     },
     "execution_count": 100,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "frequencies(words)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example\n",
    "\n",
    "Read in some documents, remove punctuation, split into words and then into individual characters, and find the most commonly occurring sliding windows containing 3 characters.\n",
    "\n",
    "This is most naturally done by piping in a data set through a series of transforms."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 101,
   "metadata": {},
   "outputs": [],
   "source": [
    "d1 = 'the doo doo doo the dah dah dah'\n",
    "d2 = 'every breath she takes, every move she makes'\n",
    "d3 = 'another brick in the wall'\n",
    "d4 = 'another one bites the dust and another one gone'\n",
    "docs = [d1,d2,d3,d4]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 102,
   "metadata": {},
   "outputs": [],
   "source": [
    "import string"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 103,
   "metadata": {},
   "outputs": [],
   "source": [
    "triple_freqs = pipe(\n",
    "    docs,\n",
    "    c.map(lambda x: x.translate(str.maketrans('', '', string.punctuation))),\n",
    "    c.map(lambda x: x.split()),\n",
    "    concat,\n",
    "    concat,\n",
    "    c.sliding_window(3),\n",
    "    frequencies,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 104,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[(('t', 'h', 'e'), 7),\n",
       " (('o', 't', 'h'), 4),\n",
       " (('h', 'e', 'd'), 3),\n",
       " (('d', 'o', 'o'), 3),\n",
       " (('d', 'a', 'h'), 3)]"
      ]
     },
     "execution_count": 104,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sorted(triple_freqs.items(), key=lambda x: x[1], reverse=True)[:5]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step by step"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 105,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{('t', 'h', 'e'): 7,\n",
       " ('h', 'e', 'd'): 3,\n",
       " ('e', 'd', 'o'): 1,\n",
       " ('d', 'o', 'o'): 3,\n",
       " ('o', 'o', 'd'): 2,\n",
       " ('o', 'd', 'o'): 2,\n",
       " ('o', 'o', 't'): 1,\n",
       " ('o', 't', 'h'): 4,\n",
       " ('e', 'd', 'a'): 1,\n",
       " ('d', 'a', 'h'): 3,\n",
       " ('a', 'h', 'd'): 2,\n",
       " ('h', 'd', 'a'): 2,\n",
       " ('a', 'h', 'e'): 1,\n",
       " ('h', 'e', 'v'): 1,\n",
       " ('e', 'v', 'e'): 2,\n",
       " ('v', 'e', 'r'): 2,\n",
       " ('e', 'r', 'y'): 2,\n",
       " ('r', 'y', 'b'): 1,\n",
       " ('y', 'b', 'r'): 1,\n",
       " ('b', 'r', 'e'): 1,\n",
       " ('r', 'e', 'a'): 1,\n",
       " ('e', 'a', 't'): 1,\n",
       " ('a', 't', 'h'): 1,\n",
       " ('t', 'h', 's'): 1,\n",
       " ('h', 's', 'h'): 1,\n",
       " ('s', 'h', 'e'): 2,\n",
       " ('h', 'e', 't'): 1,\n",
       " ('e', 't', 'a'): 1,\n",
       " ('t', 'a', 'k'): 1,\n",
       " ('a', 'k', 'e'): 2,\n",
       " ('k', 'e', 's'): 2,\n",
       " ('e', 's', 'e'): 1,\n",
       " ('s', 'e', 'v'): 1,\n",
       " ('r', 'y', 'm'): 1,\n",
       " ('y', 'm', 'o'): 1,\n",
       " ('m', 'o', 'v'): 1,\n",
       " ('o', 'v', 'e'): 1,\n",
       " ('v', 'e', 's'): 1,\n",
       " ('e', 's', 'h'): 1,\n",
       " ('h', 'e', 'm'): 1,\n",
       " ('e', 'm', 'a'): 1,\n",
       " ('m', 'a', 'k'): 1,\n",
       " ('e', 's', 'a'): 1,\n",
       " ('s', 'a', 'n'): 1,\n",
       " ('a', 'n', 'o'): 3,\n",
       " ('n', 'o', 't'): 3,\n",
       " ('h', 'e', 'r'): 3,\n",
       " ('e', 'r', 'b'): 1,\n",
       " ('r', 'b', 'r'): 1,\n",
       " ('b', 'r', 'i'): 1,\n",
       " ('r', 'i', 'c'): 1,\n",
       " ('i', 'c', 'k'): 1,\n",
       " ('c', 'k', 'i'): 1,\n",
       " ('k', 'i', 'n'): 1,\n",
       " ('i', 'n', 't'): 1,\n",
       " ('n', 't', 'h'): 1,\n",
       " ('h', 'e', 'w'): 1,\n",
       " ('e', 'w', 'a'): 1,\n",
       " ('w', 'a', 'l'): 1,\n",
       " ('a', 'l', 'l'): 1,\n",
       " ('l', 'l', 'a'): 1,\n",
       " ('l', 'a', 'n'): 1,\n",
       " ('e', 'r', 'o'): 2,\n",
       " ('r', 'o', 'n'): 2,\n",
       " ('o', 'n', 'e'): 3,\n",
       " ('n', 'e', 'b'): 1,\n",
       " ('e', 'b', 'i'): 1,\n",
       " ('b', 'i', 't'): 1,\n",
       " ('i', 't', 'e'): 1,\n",
       " ('t', 'e', 's'): 1,\n",
       " ('e', 's', 't'): 1,\n",
       " ('s', 't', 'h'): 1,\n",
       " ('e', 'd', 'u'): 1,\n",
       " ('d', 'u', 's'): 1,\n",
       " ('u', 's', 't'): 1,\n",
       " ('s', 't', 'a'): 1,\n",
       " ('t', 'a', 'n'): 1,\n",
       " ('a', 'n', 'd'): 1,\n",
       " ('n', 'd', 'a'): 1,\n",
       " ('d', 'a', 'n'): 1,\n",
       " ('n', 'e', 'g'): 1,\n",
       " ('e', 'g', 'o'): 1,\n",
       " ('g', 'o', 'n'): 1}"
      ]
     },
     "execution_count": 105,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pipe(\n",
    "    docs,\n",
    "    c.map(lambda x: x.translate(str.maketrans('', '', string.punctuation))),\n",
    "    c.map(lambda x: x.split()),\n",
    "    concat,\n",
    "    concat,\n",
    "    c.sliding_window(3),\n",
    "    frequencies,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 106,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{('t', 'h', 'e'): 7,\n",
       " ('h', 'e', 'd'): 3,\n",
       " ('e', 'd', 'o'): 1,\n",
       " ('d', 'o', 'o'): 3,\n",
       " ('o', 'o', 'd'): 2,\n",
       " ('o', 'd', 'o'): 2,\n",
       " ('o', 'o', 't'): 1,\n",
       " ('o', 't', 'h'): 4,\n",
       " ('e', 'd', 'a'): 1,\n",
       " ('d', 'a', 'h'): 3,\n",
       " ('a', 'h', 'd'): 2,\n",
       " ('h', 'd', 'a'): 2,\n",
       " ('a', 'h', 'e'): 1,\n",
       " ('h', 'e', 'v'): 1,\n",
       " ('e', 'v', 'e'): 2,\n",
       " ('v', 'e', 'r'): 2,\n",
       " ('e', 'r', 'y'): 2,\n",
       " ('r', 'y', 'b'): 1,\n",
       " ('y', 'b', 'r'): 1,\n",
       " ('b', 'r', 'e'): 1,\n",
       " ('r', 'e', 'a'): 1,\n",
       " ('e', 'a', 't'): 1,\n",
       " ('a', 't', 'h'): 1,\n",
       " ('t', 'h', 's'): 1,\n",
       " ('h', 's', 'h'): 1,\n",
       " ('s', 'h', 'e'): 2,\n",
       " ('h', 'e', 't'): 1,\n",
       " ('e', 't', 'a'): 1,\n",
       " ('t', 'a', 'k'): 1,\n",
       " ('a', 'k', 'e'): 2,\n",
       " ('k', 'e', 's'): 2,\n",
       " ('e', 's', 'e'): 1,\n",
       " ('s', 'e', 'v'): 1,\n",
       " ('r', 'y', 'm'): 1,\n",
       " ('y', 'm', 'o'): 1,\n",
       " ('m', 'o', 'v'): 1,\n",
       " ('o', 'v', 'e'): 1,\n",
       " ('v', 'e', 's'): 1,\n",
       " ('e', 's', 'h'): 1,\n",
       " ('h', 'e', 'm'): 1,\n",
       " ('e', 'm', 'a'): 1,\n",
       " ('m', 'a', 'k'): 1,\n",
       " ('e', 's', 'a'): 1,\n",
       " ('s', 'a', 'n'): 1,\n",
       " ('a', 'n', 'o'): 3,\n",
       " ('n', 'o', 't'): 3,\n",
       " ('h', 'e', 'r'): 3,\n",
       " ('e', 'r', 'b'): 1,\n",
       " ('r', 'b', 'r'): 1,\n",
       " ('b', 'r', 'i'): 1,\n",
       " ('r', 'i', 'c'): 1,\n",
       " ('i', 'c', 'k'): 1,\n",
       " ('c', 'k', 'i'): 1,\n",
       " ('k', 'i', 'n'): 1,\n",
       " ('i', 'n', 't'): 1,\n",
       " ('n', 't', 'h'): 1,\n",
       " ('h', 'e', 'w'): 1,\n",
       " ('e', 'w', 'a'): 1,\n",
       " ('w', 'a', 'l'): 1,\n",
       " ('a', 'l', 'l'): 1,\n",
       " ('l', 'l', 'a'): 1,\n",
       " ('l', 'a', 'n'): 1,\n",
       " ('e', 'r', 'o'): 2,\n",
       " ('r', 'o', 'n'): 2,\n",
       " ('o', 'n', 'e'): 3,\n",
       " ('n', 'e', 'b'): 1,\n",
       " ('e', 'b', 'i'): 1,\n",
       " ('b', 'i', 't'): 1,\n",
       " ('i', 't', 'e'): 1,\n",
       " ('t', 'e', 's'): 1,\n",
       " ('e', 's', 't'): 1,\n",
       " ('s', 't', 'h'): 1,\n",
       " ('e', 'd', 'u'): 1,\n",
       " ('d', 'u', 's'): 1,\n",
       " ('u', 's', 't'): 1,\n",
       " ('s', 't', 'a'): 1,\n",
       " ('t', 'a', 'n'): 1,\n",
       " ('a', 'n', 'd'): 1,\n",
       " ('n', 'd', 'a'): 1,\n",
       " ('d', 'a', 'n'): 1,\n",
       " ('n', 'e', 'g'): 1,\n",
       " ('e', 'g', 'o'): 1,\n",
       " ('g', 'o', 'n'): 1}"
      ]
     },
     "execution_count": 106,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pipe(\n",
    "    docs,\n",
    "    c.map(lambda x: x.translate(str.maketrans('', '', string.punctuation))),\n",
    "    c.map(lambda x: x.split()),\n",
    "    concat,\n",
    "    concat,\n",
    "    c.sliding_window(3),\n",
    "    frequencies,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 107,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{('t', 'h', 'e'): 7,\n",
       " ('h', 'e', 'd'): 3,\n",
       " ('e', 'd', 'o'): 1,\n",
       " ('d', 'o', 'o'): 3,\n",
       " ('o', 'o', 'd'): 2,\n",
       " ('o', 'd', 'o'): 2,\n",
       " ('o', 'o', 't'): 1,\n",
       " ('o', 't', 'h'): 4,\n",
       " ('e', 'd', 'a'): 1,\n",
       " ('d', 'a', 'h'): 3,\n",
       " ('a', 'h', 'd'): 2,\n",
       " ('h', 'd', 'a'): 2,\n",
       " ('a', 'h', 'e'): 1,\n",
       " ('h', 'e', 'v'): 1,\n",
       " ('e', 'v', 'e'): 2,\n",
       " ('v', 'e', 'r'): 2,\n",
       " ('e', 'r', 'y'): 2,\n",
       " ('r', 'y', 'b'): 1,\n",
       " ('y', 'b', 'r'): 1,\n",
       " ('b', 'r', 'e'): 1,\n",
       " ('r', 'e', 'a'): 1,\n",
       " ('e', 'a', 't'): 1,\n",
       " ('a', 't', 'h'): 1,\n",
       " ('t', 'h', 's'): 1,\n",
       " ('h', 's', 'h'): 1,\n",
       " ('s', 'h', 'e'): 2,\n",
       " ('h', 'e', 't'): 1,\n",
       " ('e', 't', 'a'): 1,\n",
       " ('t', 'a', 'k'): 1,\n",
       " ('a', 'k', 'e'): 2,\n",
       " ('k', 'e', 's'): 2,\n",
       " ('e', 's', 'e'): 1,\n",
       " ('s', 'e', 'v'): 1,\n",
       " ('r', 'y', 'm'): 1,\n",
       " ('y', 'm', 'o'): 1,\n",
       " ('m', 'o', 'v'): 1,\n",
       " ('o', 'v', 'e'): 1,\n",
       " ('v', 'e', 's'): 1,\n",
       " ('e', 's', 'h'): 1,\n",
       " ('h', 'e', 'm'): 1,\n",
       " ('e', 'm', 'a'): 1,\n",
       " ('m', 'a', 'k'): 1,\n",
       " ('e', 's', 'a'): 1,\n",
       " ('s', 'a', 'n'): 1,\n",
       " ('a', 'n', 'o'): 3,\n",
       " ('n', 'o', 't'): 3,\n",
       " ('h', 'e', 'r'): 3,\n",
       " ('e', 'r', 'b'): 1,\n",
       " ('r', 'b', 'r'): 1,\n",
       " ('b', 'r', 'i'): 1,\n",
       " ('r', 'i', 'c'): 1,\n",
       " ('i', 'c', 'k'): 1,\n",
       " ('c', 'k', 'i'): 1,\n",
       " ('k', 'i', 'n'): 1,\n",
       " ('i', 'n', 't'): 1,\n",
       " ('n', 't', 'h'): 1,\n",
       " ('h', 'e', 'w'): 1,\n",
       " ('e', 'w', 'a'): 1,\n",
       " ('w', 'a', 'l'): 1,\n",
       " ('a', 'l', 'l'): 1,\n",
       " ('l', 'l', 'a'): 1,\n",
       " ('l', 'a', 'n'): 1,\n",
       " ('e', 'r', 'o'): 2,\n",
       " ('r', 'o', 'n'): 2,\n",
       " ('o', 'n', 'e'): 3,\n",
       " ('n', 'e', 'b'): 1,\n",
       " ('e', 'b', 'i'): 1,\n",
       " ('b', 'i', 't'): 1,\n",
       " ('i', 't', 'e'): 1,\n",
       " ('t', 'e', 's'): 1,\n",
       " ('e', 's', 't'): 1,\n",
       " ('s', 't', 'h'): 1,\n",
       " ('e', 'd', 'u'): 1,\n",
       " ('d', 'u', 's'): 1,\n",
       " ('u', 's', 't'): 1,\n",
       " ('s', 't', 'a'): 1,\n",
       " ('t', 'a', 'n'): 1,\n",
       " ('a', 'n', 'd'): 1,\n",
       " ('n', 'd', 'a'): 1,\n",
       " ('d', 'a', 'n'): 1,\n",
       " ('n', 'e', 'g'): 1,\n",
       " ('e', 'g', 'o'): 1,\n",
       " ('g', 'o', 'n'): 1}"
      ]
     },
     "execution_count": 107,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pipe(\n",
    "    docs,\n",
    "    c.map(lambda x: x.translate(str.maketrans('', '', string.punctuation))),\n",
    "    c.map(lambda x: x.split()),\n",
    "    concat,\n",
    "    concat,\n",
    "    c.sliding_window(3),\n",
    "    frequencies,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 108,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['the doo doo doo the dah dah dah',\n",
       " 'every breath she takes, every move she makes',\n",
       " 'another brick in the wall',\n",
       " 'another one bites the dust and another one gone']"
      ]
     },
     "execution_count": 108,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pipe(\n",
    "    docs,\n",
    "    list\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 109,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['the doo doo doo the dah dah dah',\n",
       " 'every breath she takes every move she makes',\n",
       " 'another brick in the wall',\n",
       " 'another one bites the dust and another one gone']"
      ]
     },
     "execution_count": 109,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pipe(\n",
    "    docs,\n",
    "    c.map(lambda x: x.translate(str.maketrans('', '', string.punctuation))),\n",
    "    list\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 110,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[['the', 'doo', 'doo', 'doo', 'the', 'dah', 'dah', 'dah'],\n",
       " ['every', 'breath', 'she', 'takes', 'every', 'move', 'she', 'makes'],\n",
       " ['another', 'brick', 'in', 'the', 'wall'],\n",
       " ['another', 'one', 'bites', 'the', 'dust', 'and', 'another', 'one', 'gone']]"
      ]
     },
     "execution_count": 110,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pipe(\n",
    "    docs,\n",
    "    c.map(lambda x: x.translate(str.maketrans('', '', string.punctuation))),\n",
    "    c.map(lambda x: x.split()),\n",
    "    list\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 111,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['the',\n",
       " 'doo',\n",
       " 'doo',\n",
       " 'doo',\n",
       " 'the',\n",
       " 'dah',\n",
       " 'dah',\n",
       " 'dah',\n",
       " 'every',\n",
       " 'breath',\n",
       " 'she',\n",
       " 'takes',\n",
       " 'every',\n",
       " 'move',\n",
       " 'she',\n",
       " 'makes',\n",
       " 'another',\n",
       " 'brick',\n",
       " 'in',\n",
       " 'the',\n",
       " 'wall',\n",
       " 'another',\n",
       " 'one',\n",
       " 'bites',\n",
       " 'the',\n",
       " 'dust',\n",
       " 'and',\n",
       " 'another',\n",
       " 'one',\n",
       " 'gone']"
      ]
     },
     "execution_count": 111,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pipe(\n",
    "    docs,\n",
    "    c.map(lambda x: x.translate(str.maketrans('', '', string.punctuation))),\n",
    "    c.map(lambda x: x.split()),\n",
    "    concat,\n",
    "    list\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 112,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['t',\n",
       " 'h',\n",
       " 'e',\n",
       " 'd',\n",
       " 'o',\n",
       " 'o',\n",
       " 'd',\n",
       " 'o',\n",
       " 'o',\n",
       " 'd',\n",
       " 'o',\n",
       " 'o',\n",
       " 't',\n",
       " 'h',\n",
       " 'e',\n",
       " 'd',\n",
       " 'a',\n",
       " 'h',\n",
       " 'd',\n",
       " 'a',\n",
       " 'h',\n",
       " 'd',\n",
       " 'a',\n",
       " 'h',\n",
       " 'e',\n",
       " 'v',\n",
       " 'e',\n",
       " 'r',\n",
       " 'y',\n",
       " 'b',\n",
       " 'r',\n",
       " 'e',\n",
       " 'a',\n",
       " 't',\n",
       " 'h',\n",
       " 's',\n",
       " 'h',\n",
       " 'e',\n",
       " 't',\n",
       " 'a',\n",
       " 'k',\n",
       " 'e',\n",
       " 's',\n",
       " 'e',\n",
       " 'v',\n",
       " 'e',\n",
       " 'r',\n",
       " 'y',\n",
       " 'm',\n",
       " 'o',\n",
       " 'v',\n",
       " 'e',\n",
       " 's',\n",
       " 'h',\n",
       " 'e',\n",
       " 'm',\n",
       " 'a',\n",
       " 'k',\n",
       " 'e',\n",
       " 's',\n",
       " 'a',\n",
       " 'n',\n",
       " 'o',\n",
       " 't',\n",
       " 'h',\n",
       " 'e',\n",
       " 'r',\n",
       " 'b',\n",
       " 'r',\n",
       " 'i',\n",
       " 'c',\n",
       " 'k',\n",
       " 'i',\n",
       " 'n',\n",
       " 't',\n",
       " 'h',\n",
       " 'e',\n",
       " 'w',\n",
       " 'a',\n",
       " 'l',\n",
       " 'l',\n",
       " 'a',\n",
       " 'n',\n",
       " 'o',\n",
       " 't',\n",
       " 'h',\n",
       " 'e',\n",
       " 'r',\n",
       " 'o',\n",
       " 'n',\n",
       " 'e',\n",
       " 'b',\n",
       " 'i',\n",
       " 't',\n",
       " 'e',\n",
       " 's',\n",
       " 't',\n",
       " 'h',\n",
       " 'e',\n",
       " 'd',\n",
       " 'u',\n",
       " 's',\n",
       " 't',\n",
       " 'a',\n",
       " 'n',\n",
       " 'd',\n",
       " 'a',\n",
       " 'n',\n",
       " 'o',\n",
       " 't',\n",
       " 'h',\n",
       " 'e',\n",
       " 'r',\n",
       " 'o',\n",
       " 'n',\n",
       " 'e',\n",
       " 'g',\n",
       " 'o',\n",
       " 'n',\n",
       " 'e']"
      ]
     },
     "execution_count": 112,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pipe(\n",
    "    docs,\n",
    "    c.map(lambda x: x.translate(str.maketrans('', '', string.punctuation))),\n",
    "    c.map(lambda x: x.split()),\n",
    "    concat,\n",
    "    concat,\n",
    "    list\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 113,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('t', 'h', 'e'),\n",
       " ('h', 'e', 'd'),\n",
       " ('e', 'd', 'o'),\n",
       " ('d', 'o', 'o'),\n",
       " ('o', 'o', 'd'),\n",
       " ('o', 'd', 'o'),\n",
       " ('d', 'o', 'o'),\n",
       " ('o', 'o', 'd'),\n",
       " ('o', 'd', 'o'),\n",
       " ('d', 'o', 'o'),\n",
       " ('o', 'o', 't'),\n",
       " ('o', 't', 'h'),\n",
       " ('t', 'h', 'e'),\n",
       " ('h', 'e', 'd'),\n",
       " ('e', 'd', 'a'),\n",
       " ('d', 'a', 'h'),\n",
       " ('a', 'h', 'd'),\n",
       " ('h', 'd', 'a'),\n",
       " ('d', 'a', 'h'),\n",
       " ('a', 'h', 'd'),\n",
       " ('h', 'd', 'a'),\n",
       " ('d', 'a', 'h'),\n",
       " ('a', 'h', 'e'),\n",
       " ('h', 'e', 'v'),\n",
       " ('e', 'v', 'e'),\n",
       " ('v', 'e', 'r'),\n",
       " ('e', 'r', 'y'),\n",
       " ('r', 'y', 'b'),\n",
       " ('y', 'b', 'r'),\n",
       " ('b', 'r', 'e'),\n",
       " ('r', 'e', 'a'),\n",
       " ('e', 'a', 't'),\n",
       " ('a', 't', 'h'),\n",
       " ('t', 'h', 's'),\n",
       " ('h', 's', 'h'),\n",
       " ('s', 'h', 'e'),\n",
       " ('h', 'e', 't'),\n",
       " ('e', 't', 'a'),\n",
       " ('t', 'a', 'k'),\n",
       " ('a', 'k', 'e'),\n",
       " ('k', 'e', 's'),\n",
       " ('e', 's', 'e'),\n",
       " ('s', 'e', 'v'),\n",
       " ('e', 'v', 'e'),\n",
       " ('v', 'e', 'r'),\n",
       " ('e', 'r', 'y'),\n",
       " ('r', 'y', 'm'),\n",
       " ('y', 'm', 'o'),\n",
       " ('m', 'o', 'v'),\n",
       " ('o', 'v', 'e'),\n",
       " ('v', 'e', 's'),\n",
       " ('e', 's', 'h'),\n",
       " ('s', 'h', 'e'),\n",
       " ('h', 'e', 'm'),\n",
       " ('e', 'm', 'a'),\n",
       " ('m', 'a', 'k'),\n",
       " ('a', 'k', 'e'),\n",
       " ('k', 'e', 's'),\n",
       " ('e', 's', 'a'),\n",
       " ('s', 'a', 'n'),\n",
       " ('a', 'n', 'o'),\n",
       " ('n', 'o', 't'),\n",
       " ('o', 't', 'h'),\n",
       " ('t', 'h', 'e'),\n",
       " ('h', 'e', 'r'),\n",
       " ('e', 'r', 'b'),\n",
       " ('r', 'b', 'r'),\n",
       " ('b', 'r', 'i'),\n",
       " ('r', 'i', 'c'),\n",
       " ('i', 'c', 'k'),\n",
       " ('c', 'k', 'i'),\n",
       " ('k', 'i', 'n'),\n",
       " ('i', 'n', 't'),\n",
       " ('n', 't', 'h'),\n",
       " ('t', 'h', 'e'),\n",
       " ('h', 'e', 'w'),\n",
       " ('e', 'w', 'a'),\n",
       " ('w', 'a', 'l'),\n",
       " ('a', 'l', 'l'),\n",
       " ('l', 'l', 'a'),\n",
       " ('l', 'a', 'n'),\n",
       " ('a', 'n', 'o'),\n",
       " ('n', 'o', 't'),\n",
       " ('o', 't', 'h'),\n",
       " ('t', 'h', 'e'),\n",
       " ('h', 'e', 'r'),\n",
       " ('e', 'r', 'o'),\n",
       " ('r', 'o', 'n'),\n",
       " ('o', 'n', 'e'),\n",
       " ('n', 'e', 'b'),\n",
       " ('e', 'b', 'i'),\n",
       " ('b', 'i', 't'),\n",
       " ('i', 't', 'e'),\n",
       " ('t', 'e', 's'),\n",
       " ('e', 's', 't'),\n",
       " ('s', 't', 'h'),\n",
       " ('t', 'h', 'e'),\n",
       " ('h', 'e', 'd'),\n",
       " ('e', 'd', 'u'),\n",
       " ('d', 'u', 's'),\n",
       " ('u', 's', 't'),\n",
       " ('s', 't', 'a'),\n",
       " ('t', 'a', 'n'),\n",
       " ('a', 'n', 'd'),\n",
       " ('n', 'd', 'a'),\n",
       " ('d', 'a', 'n'),\n",
       " ('a', 'n', 'o'),\n",
       " ('n', 'o', 't'),\n",
       " ('o', 't', 'h'),\n",
       " ('t', 'h', 'e'),\n",
       " ('h', 'e', 'r'),\n",
       " ('e', 'r', 'o'),\n",
       " ('r', 'o', 'n'),\n",
       " ('o', 'n', 'e'),\n",
       " ('n', 'e', 'g'),\n",
       " ('e', 'g', 'o'),\n",
       " ('g', 'o', 'n'),\n",
       " ('o', 'n', 'e')]"
      ]
     },
     "execution_count": 113,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pipe(\n",
    "    docs,\n",
    "    c.map(lambda x: x.translate(str.maketrans('', '', string.punctuation))),\n",
    "    c.map(lambda x: x.split()),\n",
    "    concat,\n",
    "    concat,\n",
    "    c.sliding_window(3),\n",
    "    list\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 114,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{('t', 'h', 'e'): 7,\n",
       " ('h', 'e', 'd'): 3,\n",
       " ('e', 'd', 'o'): 1,\n",
       " ('d', 'o', 'o'): 3,\n",
       " ('o', 'o', 'd'): 2,\n",
       " ('o', 'd', 'o'): 2,\n",
       " ('o', 'o', 't'): 1,\n",
       " ('o', 't', 'h'): 4,\n",
       " ('e', 'd', 'a'): 1,\n",
       " ('d', 'a', 'h'): 3,\n",
       " ('a', 'h', 'd'): 2,\n",
       " ('h', 'd', 'a'): 2,\n",
       " ('a', 'h', 'e'): 1,\n",
       " ('h', 'e', 'v'): 1,\n",
       " ('e', 'v', 'e'): 2,\n",
       " ('v', 'e', 'r'): 2,\n",
       " ('e', 'r', 'y'): 2,\n",
       " ('r', 'y', 'b'): 1,\n",
       " ('y', 'b', 'r'): 1,\n",
       " ('b', 'r', 'e'): 1,\n",
       " ('r', 'e', 'a'): 1,\n",
       " ('e', 'a', 't'): 1,\n",
       " ('a', 't', 'h'): 1,\n",
       " ('t', 'h', 's'): 1,\n",
       " ('h', 's', 'h'): 1,\n",
       " ('s', 'h', 'e'): 2,\n",
       " ('h', 'e', 't'): 1,\n",
       " ('e', 't', 'a'): 1,\n",
       " ('t', 'a', 'k'): 1,\n",
       " ('a', 'k', 'e'): 2,\n",
       " ('k', 'e', 's'): 2,\n",
       " ('e', 's', 'e'): 1,\n",
       " ('s', 'e', 'v'): 1,\n",
       " ('r', 'y', 'm'): 1,\n",
       " ('y', 'm', 'o'): 1,\n",
       " ('m', 'o', 'v'): 1,\n",
       " ('o', 'v', 'e'): 1,\n",
       " ('v', 'e', 's'): 1,\n",
       " ('e', 's', 'h'): 1,\n",
       " ('h', 'e', 'm'): 1,\n",
       " ('e', 'm', 'a'): 1,\n",
       " ('m', 'a', 'k'): 1,\n",
       " ('e', 's', 'a'): 1,\n",
       " ('s', 'a', 'n'): 1,\n",
       " ('a', 'n', 'o'): 3,\n",
       " ('n', 'o', 't'): 3,\n",
       " ('h', 'e', 'r'): 3,\n",
       " ('e', 'r', 'b'): 1,\n",
       " ('r', 'b', 'r'): 1,\n",
       " ('b', 'r', 'i'): 1,\n",
       " ('r', 'i', 'c'): 1,\n",
       " ('i', 'c', 'k'): 1,\n",
       " ('c', 'k', 'i'): 1,\n",
       " ('k', 'i', 'n'): 1,\n",
       " ('i', 'n', 't'): 1,\n",
       " ('n', 't', 'h'): 1,\n",
       " ('h', 'e', 'w'): 1,\n",
       " ('e', 'w', 'a'): 1,\n",
       " ('w', 'a', 'l'): 1,\n",
       " ('a', 'l', 'l'): 1,\n",
       " ('l', 'l', 'a'): 1,\n",
       " ('l', 'a', 'n'): 1,\n",
       " ('e', 'r', 'o'): 2,\n",
       " ('r', 'o', 'n'): 2,\n",
       " ('o', 'n', 'e'): 3,\n",
       " ('n', 'e', 'b'): 1,\n",
       " ('e', 'b', 'i'): 1,\n",
       " ('b', 'i', 't'): 1,\n",
       " ('i', 't', 'e'): 1,\n",
       " ('t', 'e', 's'): 1,\n",
       " ('e', 's', 't'): 1,\n",
       " ('s', 't', 'h'): 1,\n",
       " ('e', 'd', 'u'): 1,\n",
       " ('d', 'u', 's'): 1,\n",
       " ('u', 's', 't'): 1,\n",
       " ('s', 't', 'a'): 1,\n",
       " ('t', 'a', 'n'): 1,\n",
       " ('a', 'n', 'd'): 1,\n",
       " ('n', 'd', 'a'): 1,\n",
       " ('d', 'a', 'n'): 1,\n",
       " ('n', 'e', 'g'): 1,\n",
       " ('e', 'g', 'o'): 1,\n",
       " ('g', 'o', 'n'): 1}"
      ]
     },
     "execution_count": 114,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pipe(\n",
    "    docs,\n",
    "    c.map(lambda x: x.translate(str.maketrans('', '', string.punctuation))),\n",
    "    c.map(lambda x: x.split()),\n",
    "    concat,\n",
    "    concat,\n",
    "    c.sliding_window(3),\n",
    "    frequencies\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}