{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Python: Classes and Object-Oriented Design\n", "\n", "Python supports object-oriented programming via classes. Objects are instances of classes, and they can possess both data (known as attributes) and class-specific functions (known as methods). Classes can therefore be used to model real-world structures that have properties and behaviors." ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## Self\n", "\n", "Instance methods have a special first parameter, conventionally called `self`, that refers to the instance *itself*. When we have an instance of a class, say `a = A()`, and we call an instance method, `a.func(x, y, z)`, the Python interpreter executes `A.func(a, x, y, z)`. Hence the method definition has an extra first parameter, `self`, which is a *placeholder* for the instance calling the method. The examples below will make this concept clearer." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Everything in Python is an object" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Even a number is an object\n", "\n", "We can list all the attributes and methods of an object with the built-in `dir` function. Notice that some are special attributes whose names begin and end with double underscores, while regular ones do not. We will learn more about special methods later."
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'__abs__, __add__, __and__, __bool__, __ceil__, __class__, __delattr__, __dir__, __divmod__, __doc__, __eq__, __float__, __floor__, __floordiv__, __format__, __ge__, __getattribute__, __getnewargs__, __gt__, __hash__, __index__, __init__, __init_subclass__, __int__, __invert__, __le__, __lshift__, __lt__, __mod__, __mul__, __ne__, __neg__, __new__, __or__, __pos__, __pow__, __radd__, __rand__, __rdivmod__, __reduce__, __reduce_ex__, __repr__, __rfloordiv__, __rlshift__, __rmod__, __rmul__, __ror__, __round__, __rpow__, __rrshift__, __rshift__, __rsub__, __rtruediv__, __rxor__, __setattr__, __sizeof__, __str__, __sub__, __subclasshook__, __truediv__, __trunc__, __xor__, bit_length, conjugate, denominator, from_bytes, imag, numerator, real, to_bytes'" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "', '.join(dir(2))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Define a class" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "class A:\n", "    \"\"\"Describe class.\"\"\"\n", "    pass" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create an instance of the class" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "a = A()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<__main__.A object at 0x104999710>\n" ] } ], "source": [ "print(a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Special and regular methods" ] },
{ "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "class B:\n", "    \"\"\"Describe the class.\"\"\"\n", "    def __init__(self, val):\n", "        \"\"\"Class initializer.\"\"\"\n", "        self.val = val\n", "\n", "    def __str__(self):\n", "        \"\"\"String representation of the class instance.\"\"\"\n", "        return \"{}: {}\".format(self.__class__.__name__, self.val)\n", "\n", "    def add(self, val):\n", "        \"\"\"Add val to self.val\"\"\"\n", "        self.val += val" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This uses the `__init__` special method." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "b = B('hello')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This uses the `__str__` special method." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "B: hello\n" ] } ], "source": [ "print(b)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This uses the `add` regular method." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "b.add(' world')" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "B: hello world\n" ] } ], "source": [ "print(b)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Inheritance" ] },
{ "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": true }, "outputs": [], "source": [ "class C(B):\n", "    \"\"\"Subclass of B that adds a display name.\"\"\"\n", "    def __init__(self, val, name='C'):\n", "        super().__init__(val)\n", "        self.name = name\n", "\n", "    def __str__(self):\n", "        \"\"\"String representation of the class instance.\"\"\"\n", "        return \"{}: {}\".format(self.name, self.val)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This uses the `__init__` method of C. Note that assignment of `val` is delegated to the parent class."
] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": true }, "outputs": [], "source": [ "c = C('hello', 'Rumpelstiltskin')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This uses the `add` regular method of B, since C has no `add` method defined." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": true }, "outputs": [], "source": [ "c.add(' Rapunzel')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This uses the `__str__` special method of C, not B." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Rumpelstiltskin: hello Rapunzel\n" ] } ], "source": [ "print(c)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Extended Example\n", "\n", "We will design a class to store and manipulate biological sequence data. By subclassing `collections.abc.Mapping` and implementing `__getitem__`, `__iter__`, and `__len__`, the class gains dictionary-like methods such as `keys` and `items` even though it consists of only a few lines of code." ] },
{ "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from collections.abc import Mapping\n", "\n", "class BioSequence(Mapping):\n", "    \"\"\"A class that contains one or more biological sequences.\"\"\"\n", "\n", "    def __init__(self, fasta):\n", "        \"\"\"Constructor from a FASTA-format string.\"\"\"\n", "        chunks = [x for x in fasta.strip().split('>') if x]\n", "        names = []\n", "        seqs = []\n", "        for chunk in chunks:\n", "            lines = chunk.splitlines()\n", "            names.append(lines[0].strip())\n", "            seqs.append(''.join(lines[1:]))\n", "        self._data = dict(zip(names, seqs))\n", "\n", "    def __getitem__(self, key):\n", "        return self._data[key]\n", "\n", "    def __iter__(self):\n", "        return iter(self._data)\n", "\n", "    def __len__(self):\n", "        return len(self._data)\n", "\n", "    def __str__(self):\n", "        return str(self._data)" ] },
{ "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "short = ['Ala', 'Arg', 'Asn', 'Asp', 'Cys', 'Glu', 'Gln',\n", "         'Gly', 'His', 'Ile', 'Leu', 'Lys', 'Met', 'Phe',\n", "         'Pro', 'Ser', 'Thr', 'Trp', 'Tyr', 'Val',\n", "         'ANY', 'GAP', 'STP']\n", "\n", "letters = ['A', 'R', 'N', 'D', 'C', 'E', 'Q', 'G', 'H',\n", "           'I', 'L', 'K', 'M', 'F', 'P', 'S', 'T', 'W',\n", "           'Y', 'V', 'X', '-', '*']\n", "\n", "class PeptideSequence(BioSequence):\n", "    \"\"\"Specialized class for peptide sequences.\"\"\"\n", "\n", "    def __init__(self, fasta):\n", "        super().__init__(fasta)\n", "        self.mapper = dict(zip(letters, short))\n", "\n", "    def __str__(self):\n", "        s = []\n", "        for k, v in self._data.items():\n", "            s.append(k)\n", "            s.append('-'.join(self.mapper[c] for c in v))\n", "            s.append('')\n", "        return '\\n'.join(s)\n", "\n", "class DNASequence(BioSequence):\n", "    \"\"\"Specialized class for DNA sequences.\"\"\"\n", "\n", "    def __init__(self, fasta):\n", "        super().__init__(fasta)\n", "        self.table = str.maketrans('ACTG', 'TGAC')\n", "\n", "    def reverse_complement(self):\n", "        \"\"\"Return the reverse complement of each sequence, keyed by name.\"\"\"\n", "        return {k: v.translate(self.table)[::-1] for\n", "                k, v in self._data.items()}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating a class instance for a peptide sequence" ] },
{ "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": true }, "outputs": [], "source": [ "peptide_fasta = '''>MCHU - Calmodulin - Human, rabbit, bovine, rat, and chicken\n", "ADQLTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAELQDMINEVDADGNGTID\n", "FPEFLTMMARKMKDTDSEEEIREAFRVFDKDGNGYISAAELRHVMTNLGEKLTDEEVDEMIREA\n", "DIDGDGQVNYEEFVQMMTAK*\n", "\n", ">gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]\n", "LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV\n", "EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG\n", "LLILILLLLLLALLSPDMLGDPDNHMPADPLNTPLHIKPEWYFLFAYAILRSVPNKLGGVLALFLSIVIL\n", "GLMPFLHTSKHRSMMLRPLSQALFWTLTMDLLTLTWIGSQPVEYPYTIIGQMASILYFSIILAFLPIAGX\n", "IENY'''" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "peptides = PeptideSequence(peptide_fasta)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(peptides)" ] },
{ "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['MCHU - Calmodulin - Human, rabbit, bovine, rat, and chicken',\n", " 'gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]']" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(peptides.keys())" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'ADQLTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAELQDMINEVDADGNGTIDFPEFLTMMARKMKDTDSEEEIREAFRVFDKDGNGYISAAELRHVMTNLGEKLTDEEVDEMIREADIDGDGQVNYEEFVQMMTAK*'" ] }, "execution_count": 20,
"metadata": {}, "output_type": "execute_result" } ], "source": [ "peptides['MCHU - Calmodulin - Human, rabbit, bovine, rat, and chicken']" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "MCHU - Calmodulin - Human, rabbit, bovine, rat, and chicken ADQLTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAELQDMINEVDADGNGTIDFPEFLTMMARKMKDTDSEEEIREAFRVFDKDGNGYISAAELRHVMTNLGEKLTDEEVDEMIREADIDGDGQVNYEEFVQMMTAK*\n", "\n", "gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus] LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLVEWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLGLLILILLLLLLALLSPDMLGDPDNHMPADPLNTPLHIKPEWYFLFAYAILRSVPNKLGGVLALFLSIVILGLMPFLHTSKHRSMMLRPLSQALFWTLTMDLLTLTWIGSQPVEYPYTIIGQMASILYFSIILAFLPIAGXIENY\n", "\n" ] } ], "source": [ "for k, v in peptides.items():\n", " print(k, v)\n", " print()" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "MCHU - Calmodulin - Human, rabbit, bovine, rat, and chicken\n", "Ala-Asp-Gln-Leu-Thr-Glu-Glu-Gln-Ile-Ala-Glu-Phe-Lys-Glu-Ala-Phe-Ser-Leu-Phe-Asp-Lys-Asp-Gly-Asp-Gly-Thr-Ile-Thr-Thr-Lys-Glu-Leu-Gly-Thr-Val-Met-Arg-Ser-Leu-Gly-Gln-Asn-Pro-Thr-Glu-Ala-Glu-Leu-Gln-Asp-Met-Ile-Asn-Glu-Val-Asp-Ala-Asp-Gly-Asn-Gly-Thr-Ile-Asp-Phe-Pro-Glu-Phe-Leu-Thr-Met-Met-Ala-Arg-Lys-Met-Lys-Asp-Thr-Asp-Ser-Glu-Glu-Glu-Ile-Arg-Glu-Ala-Phe-Arg-Val-Phe-Asp-Lys-Asp-Gly-Asn-Gly-Tyr-Ile-Ser-Ala-Ala-Glu-Leu-Arg-His-Val-Met-Thr-Asn-Leu-Gly-Glu-Lys-Leu-Thr-Asp-Glu-Glu-Val-Asp-Glu-Met-Ile-Arg-Glu-Ala-Asp-Ile-Asp-Gly-Asp-Gly-Gln-Val-Asn-Tyr-Glu-Glu-Phe-Val-Gln-Met-Met-Thr-Ala-Lys-STP\n", "\n", "gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]\n", 
"Leu-Cys-Leu-Tyr-Thr-His-Ile-Gly-Arg-Asn-Ile-Tyr-Tyr-Gly-Ser-Tyr-Leu-Tyr-Ser-Glu-Thr-Trp-Asn-Thr-Gly-Ile-Met-Leu-Leu-Leu-Ile-Thr-Met-Ala-Thr-Ala-Phe-Met-Gly-Tyr-Val-Leu-Pro-Trp-Gly-Gln-Met-Ser-Phe-Trp-Gly-Ala-Thr-Val-Ile-Thr-Asn-Leu-Phe-Ser-Ala-Ile-Pro-Tyr-Ile-Gly-Thr-Asn-Leu-Val-Glu-Trp-Ile-Trp-Gly-Gly-Phe-Ser-Val-Asp-Lys-Ala-Thr-Leu-Asn-Arg-Phe-Phe-Ala-Phe-His-Phe-Ile-Leu-Pro-Phe-Thr-Met-Val-Ala-Leu-Ala-Gly-Val-His-Leu-Thr-Phe-Leu-His-Glu-Thr-Gly-Ser-Asn-Asn-Pro-Leu-Gly-Leu-Thr-Ser-Asp-Ser-Asp-Lys-Ile-Pro-Phe-His-Pro-Tyr-Tyr-Thr-Ile-Lys-Asp-Phe-Leu-Gly-Leu-Leu-Ile-Leu-Ile-Leu-Leu-Leu-Leu-Leu-Leu-Ala-Leu-Leu-Ser-Pro-Asp-Met-Leu-Gly-Asp-Pro-Asp-Asn-His-Met-Pro-Ala-Asp-Pro-Leu-Asn-Thr-Pro-Leu-His-Ile-Lys-Pro-Glu-Trp-Tyr-Phe-Leu-Phe-Ala-Tyr-Ala-Ile-Leu-Arg-Ser-Val-Pro-Asn-Lys-Leu-Gly-Gly-Val-Leu-Ala-Leu-Phe-Leu-Ser-Ile-Val-Ile-Leu-Gly-Leu-Met-Pro-Phe-Leu-His-Thr-Ser-Lys-His-Arg-Ser-Met-Met-Leu-Arg-Pro-Leu-Ser-Gln-Ala-Leu-Phe-Trp-Thr-Leu-Thr-Met-Asp-Leu-Leu-Thr-Leu-Thr-Trp-Ile-Gly-Ser-Gln-Pro-Val-Glu-Tyr-Pro-Tyr-Thr-Ile-Ile-Gly-Gln-Met-Ala-Ser-Ile-Leu-Tyr-Phe-Ser-Ile-Ile-Leu-Ala-Phe-Leu-Pro-Ile-Ala-Gly-ANY-Ile-Glu-Asn-Tyr\n", "\n" ] } ], "source": [ "print(peptides)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating a class instance for a DNA sequence" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "dna_fasta ='''\n", ">SRR001666_1.\n", "GGGTGATGGCCGCTGCCGATGGCGTCAAATCCCACC\n", "GTTCAGGGATACGACGTTTGTATTTTAAGAATCTGA\n", "\n", ">SRR001666_2\n", "AAGTTACCCTTAACAACTTAAGGGTTTTCAAATAGA\n", "AGCAGAAGTCGATGATAATACGCGTCGTTTTATCAT\n", "'''" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": true }, "outputs": [], "source": [ "dnas = DNASequence(dna_fasta)" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'SRR001666_1.': 'GGGTGATGGCCGCTGCCGATGGCGTCAAATCCCACCGTTCAGGGATACGACGTTTGTATTTTAAGAATCTGA', 
'SRR001666_2': 'AAGTTACCCTTAACAACTTAAGGGTTTTCAAATAGAAGCAGAAGTCGATGATAATACGCGTCGTTTTATCAT'}\n" ] } ], "source": [ "print(dnas)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'SRR001666_1.': 'TCAGATTCTTAAAATACAAACGTCGTATCCCTGAACGGTGGGATTTGACGCCATCGGCAGCGGCCATCACCC',\n", " 'SRR001666_2': 'ATGATAAAACGACGCGTATTATCATCGACTTCTGCTTCTATTTGAAAACCCTTAAGTTGTTAAGGGTAACTT'}" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dnas.reverse_complement()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.1" } }, "nbformat": 4, "nbformat_minor": 2 }