Python: Classes and Object-Oriented Design¶
Python supports object-oriented programming via classes. Objects are instances of classes, and they can hold both data (known as attributes) and class-specific functions (known as methods). Classes can therefore be used to model real-world structures that have properties and behaviors.
Self¶
Instance methods have a special first parameter, traditionally called
self, that refers to the instance itself. When we have an instance of
a class, say a = A(), and we call an instance method,
a.func(x, y, z), the Python interpreter effectively executes
A.func(a, x, y, z). The extra first argument, self, is therefore bound
to the instance on which the method was called. The examples below
will make this concept clearer.
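As a minimal sketch of this equivalence (the class Greeter and its greet method are hypothetical, chosen only for illustration), calling a method on an instance and calling the same function through the class with the instance passed explicitly should give the same result:

```python
class Greeter:
    """Hypothetical class used only to illustrate how self is bound."""

    def __init__(self, name):
        self.name = name

    def greet(self, greeting, punctuation):
        # self is the instance on which the method was called
        return "{}, {}{}".format(greeting, self.name, punctuation)


g = Greeter("Ada")

# The usual call: Python passes g as the first argument automatically.
bound_result = g.greet("Hello", "!")

# The equivalent explicit call through the class.
explicit_result = Greeter.greet(g, "Hello", "!")

print(bound_result)
print(bound_result == explicit_result)
```

The name self is a convention rather than a keyword, but following it keeps code readable to other Python programmers.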
Everything in Python is an object¶
Even a number is an object¶
We can show all the attributes and methods with the built-in dir
function. Notice that there are special attributes that begin and end
with double underscores, and regular ones that do not. We will learn
more about special methods later.
In [1]:
', '.join(dir(2))
Out[1]:
'__abs__, __add__, __and__, __bool__, __ceil__, __class__, __delattr__, __dir__, __divmod__, __doc__, __eq__, __float__, __floor__, __floordiv__, __format__, __ge__, __getattribute__, __getnewargs__, __gt__, __hash__, __index__, __init__, __init_subclass__, __int__, __invert__, __le__, __lshift__, __lt__, __mod__, __mul__, __ne__, __neg__, __new__, __or__, __pos__, __pow__, __radd__, __rand__, __rdivmod__, __reduce__, __reduce_ex__, __repr__, __rfloordiv__, __rlshift__, __rmod__, __rmul__, __ror__, __round__, __rpow__, __rrshift__, __rshift__, __rsub__, __rtruediv__, __rxor__, __setattr__, __sizeof__, __str__, __sub__, __subclasshook__, __truediv__, __trunc__, __xor__, bit_length, conjugate, denominator, from_bytes, imag, numerator, real, to_bytes'
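A short sketch of what these names mean in practice: the integer 2 is an instance of the class int, the + operator is backed by the __add__ special method listed above, and regular methods such as bit_length are called by name. The variable n is introduced here only for illustration.

```python
n = 2

# Every value has a class; for an integer literal it is int.
print(type(n))                  # <class 'int'>
print(isinstance(n, object))    # True

# The + operator is implemented by the __add__ special method.
print(n + 3 == n.__add__(3))    # True

# Regular methods are called by name.
print((5).bit_length())         # 3, since 5 is 0b101
```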
Define a class¶
In [2]:
class A:
    """Describe class."""
    pass
Create an instance of the class¶
In [3]:
a = A()
In [4]:
print(a)
<__main__.A object at 0x104999710>
Special and regular methods¶
In [5]:
class B:
    """Describe the class."""
    def __init__(self, val):
        """Class initializer."""
        self.val = val
    def __str__(self):
        """String representation of class instance."""
        return "{}: {}".format(self.__class__.__name__, self.val)
    def add(self, val):
        """Add val to self.val"""
        self.val += val
This uses the __init__ special method
In [6]:
b = B('hello')
This uses the __str__ special method
In [7]:
print(b)
B: hello
In [8]:
b.add(' world')
In [9]:
print(b)
B: hello world
Inheritance¶
In [10]:
class C(B):
    def __init__(self, val, name='C'):
        super().__init__(val)
        self.name = name
    def __str__(self):
        """String representation of class instance."""
        return "{}: {}".format(self.name, self.val)
This uses the __init__ method of C. Note that assignment of val
is delegated to the parent class.
In [11]:
c = C('hello', 'Rumpelstiltskin')
This uses the add regular method of B, since C has no add method
defined.
In [12]:
c.add(' Rapunzel')
This uses the __str__ special method of C, not B.
In [13]:
print(c)
Rumpelstiltskin: hello Rapunzel
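The lookup behavior seen above (an inherited add, an overridden __str__) follows the class's method resolution order. The sketch below uses two hypothetical classes, Base and Child, rather than B and C, so that it runs on its own:

```python
class Base:
    """Hypothetical parent class."""

    def __init__(self, val):
        self.val = val

    def add(self, val):
        self.val += val

    def describe(self):
        return "Base: {}".format(self.val)


class Child(Base):
    """Hypothetical subclass that overrides only describe."""

    def describe(self):
        return "Child: {}".format(self.val)


child = Child(1)
child.add(2)  # not defined on Child, so Base.add is used

# Attribute lookup walks the method resolution order from left to right.
mro_names = [cls.__name__ for cls in Child.__mro__]
print(mro_names)                 # ['Child', 'Base', 'object']
print(child.describe())          # Child: 3
print(isinstance(child, Base))   # True
```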
Extended Example¶
We will design a class to store and manipulate biological sequence data. This class has fairly rich functionality using some advanced Python features even though it consists of only a few lines of code.
In [14]:
from collections.abc import Mapping
class BioSequence(Mapping):
    """A class that contains one or more biological sequences."""
    def __init__(self, fasta):
        """Constructor from a FASTA format string."""
        chunks = [x for x in fasta.strip().split('>') if x]
        names = []
        seqs = []
        for chunk in chunks:
            lines = chunk.splitlines()
            names.append(lines[0].strip())
            seqs.append(''.join(lines[1:]))
        self._data = dict(zip(names, seqs))
    def __getitem__(self, key):
        return self._data[key]
    def __iter__(self):
        return iter(self._data)
    def __len__(self):
        return len(self._data)
    def __str__(self):
        return str(self._data)
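Much of the functionality comes from subclassing collections.abc.Mapping: once __getitem__, __iter__ and __len__ are defined, the abstract base class supplies keys, values, items, get, the in operator and equality as mixin methods. A minimal sketch with a hypothetical TinyMapping class (not part of the example above) illustrates this:

```python
from collections.abc import Mapping


class TinyMapping(Mapping):
    """Hypothetical read-only mapping that wraps a dict."""

    def __init__(self, data):
        self._data = dict(data)

    # The three abstract methods that Mapping requires.
    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


m = TinyMapping({"seq1": "ACGT", "seq2": "TTGA"})

# None of the following are defined above; Mapping supplies them.
print(list(m.keys()))            # ['seq1', 'seq2']
print(list(m.values()))          # ['ACGT', 'TTGA']
print("seq1" in m)               # True
print(m.get("missing", "N/A"))   # N/A
print(m == TinyMapping({"seq1": "ACGT", "seq2": "TTGA"}))  # True
```

Because Mapping describes a read-only container, item assignment such as m["seq3"] = "GG" raises a TypeError unless __setitem__ is also defined.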
In [15]:
short = ['Ala', 'Arg', 'Asn', 'Asp', 'Cys', 'Glu', 'Gln',
'Gly', 'His', 'Ile', 'Leu', 'Lys', 'Met', 'Phe',
'Pro', 'Ser', 'Thr', 'Trp', 'Tyr', 'Val',
'ANY', 'GAP', 'STP']
letters = ['A', 'R', 'N', 'D', 'C', 'E', 'Q', 'G', 'H',
'I', 'L', 'K', 'M', 'F', 'P', 'S', 'T', 'W',
'Y', 'V', 'X', '-', '*']
class PeptideSequence(BioSequence):
    """Specialized class for peptide sequences."""
    def __init__(self, fasta):
        super().__init__(fasta)
        self.mapper = dict(zip(letters, short))
    def __str__(self):
        s = []
        for k, v in self._data.items():
            s.append(k)
            s.append('-'.join(self.mapper[c] for c in v))
            s.append('')
        return '\n'.join(s)
class DNASequence(BioSequence):
    """Specialized class for DNA sequences."""
    def __init__(self, fasta):
        super().__init__(fasta)
        self.table = str.maketrans('ACTG', 'TGAC')
    def reverse_complement(self):
        return {k: v.translate(self.table)[::-1] for
                k, v in self._data.items()}
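The reverse_complement method relies on str.maketrans and str.translate. The sketch below isolates that step in a standalone function; the names COMPLEMENT and reverse_complement at module level are introduced here for illustration only, and the function assumes uppercase input limited to the bases A, C, G and T.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACTG", "TGAC")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    # translate() swaps each base; [::-1] then reverses the result.
    return seq.translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))  # GCAT
```

Characters that are not in the table (for example N or lowercase bases) pass through translate unchanged, so a more defensive version might validate its input first.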
Creating a class instance for a peptide sequence¶
In [16]:
peptide_fasta = '''>MCHU - Calmodulin - Human, rabbit, bovine, rat, and chicken
ADQLTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAELQDMINEVDADGNGTID
FPEFLTMMARKMKDTDSEEEIREAFRVFDKDGNGYISAAELRHVMTNLGEKLTDEEVDEMIREA
DIDGDGQVNYEEFVQMMTAK*
>gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]
LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV
EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG
LLILILLLLLLALLSPDMLGDPDNHMPADPLNTPLHIKPEWYFLFAYAILRSVPNKLGGVLALFLSIVIL
GLMPFLHTSKHRSMMLRPLSQALFWTLTMDLLTLTWIGSQPVEYPYTIIGQMASILYFSIILAFLPIAGX
IENY'''
In [17]:
peptides = PeptideSequence(peptide_fasta)
In [18]:
len(peptides)
Out[18]:
2
In [19]:
list(peptides.keys())
Out[19]:
['MCHU - Calmodulin - Human, rabbit, bovine, rat, and chicken',
'gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]']
In [20]:
peptides['MCHU - Calmodulin - Human, rabbit, bovine, rat, and chicken']
Out[20]:
'ADQLTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAELQDMINEVDADGNGTIDFPEFLTMMARKMKDTDSEEEIREAFRVFDKDGNGYISAAELRHVMTNLGEKLTDEEVDEMIREADIDGDGQVNYEEFVQMMTAK*'
In [21]:
for k, v in peptides.items():
    print(k, v)
    print()
MCHU - Calmodulin - Human, rabbit, bovine, rat, and chicken ADQLTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAELQDMINEVDADGNGTIDFPEFLTMMARKMKDTDSEEEIREAFRVFDKDGNGYISAAELRHVMTNLGEKLTDEEVDEMIREADIDGDGQVNYEEFVQMMTAK*
gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus] LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLVEWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLGLLILILLLLLLALLSPDMLGDPDNHMPADPLNTPLHIKPEWYFLFAYAILRSVPNKLGGVLALFLSIVILGLMPFLHTSKHRSMMLRPLSQALFWTLTMDLLTLTWIGSQPVEYPYTIIGQMASILYFSIILAFLPIAGXIENY
In [22]:
print(peptides)
MCHU - Calmodulin - Human, rabbit, bovine, rat, and chicken
Ala-Asp-Gln-Leu-Thr-Glu-Glu-Gln-Ile-Ala-Glu-Phe-Lys-Glu-Ala-Phe-Ser-Leu-Phe-Asp-Lys-Asp-Gly-Asp-Gly-Thr-Ile-Thr-Thr-Lys-Glu-Leu-Gly-Thr-Val-Met-Arg-Ser-Leu-Gly-Gln-Asn-Pro-Thr-Glu-Ala-Glu-Leu-Gln-Asp-Met-Ile-Asn-Glu-Val-Asp-Ala-Asp-Gly-Asn-Gly-Thr-Ile-Asp-Phe-Pro-Glu-Phe-Leu-Thr-Met-Met-Ala-Arg-Lys-Met-Lys-Asp-Thr-Asp-Ser-Glu-Glu-Glu-Ile-Arg-Glu-Ala-Phe-Arg-Val-Phe-Asp-Lys-Asp-Gly-Asn-Gly-Tyr-Ile-Ser-Ala-Ala-Glu-Leu-Arg-His-Val-Met-Thr-Asn-Leu-Gly-Glu-Lys-Leu-Thr-Asp-Glu-Glu-Val-Asp-Glu-Met-Ile-Arg-Glu-Ala-Asp-Ile-Asp-Gly-Asp-Gly-Gln-Val-Asn-Tyr-Glu-Glu-Phe-Val-Gln-Met-Met-Thr-Ala-Lys-STP
gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]
Leu-Cys-Leu-Tyr-Thr-His-Ile-Gly-Arg-Asn-Ile-Tyr-Tyr-Gly-Ser-Tyr-Leu-Tyr-Ser-Glu-Thr-Trp-Asn-Thr-Gly-Ile-Met-Leu-Leu-Leu-Ile-Thr-Met-Ala-Thr-Ala-Phe-Met-Gly-Tyr-Val-Leu-Pro-Trp-Gly-Gln-Met-Ser-Phe-Trp-Gly-Ala-Thr-Val-Ile-Thr-Asn-Leu-Phe-Ser-Ala-Ile-Pro-Tyr-Ile-Gly-Thr-Asn-Leu-Val-Glu-Trp-Ile-Trp-Gly-Gly-Phe-Ser-Val-Asp-Lys-Ala-Thr-Leu-Asn-Arg-Phe-Phe-Ala-Phe-His-Phe-Ile-Leu-Pro-Phe-Thr-Met-Val-Ala-Leu-Ala-Gly-Val-His-Leu-Thr-Phe-Leu-His-Glu-Thr-Gly-Ser-Asn-Asn-Pro-Leu-Gly-Leu-Thr-Ser-Asp-Ser-Asp-Lys-Ile-Pro-Phe-His-Pro-Tyr-Tyr-Thr-Ile-Lys-Asp-Phe-Leu-Gly-Leu-Leu-Ile-Leu-Ile-Leu-Leu-Leu-Leu-Leu-Leu-Ala-Leu-Leu-Ser-Pro-Asp-Met-Leu-Gly-Asp-Pro-Asp-Asn-His-Met-Pro-Ala-Asp-Pro-Leu-Asn-Thr-Pro-Leu-His-Ile-Lys-Pro-Glu-Trp-Tyr-Phe-Leu-Phe-Ala-Tyr-Ala-Ile-Leu-Arg-Ser-Val-Pro-Asn-Lys-Leu-Gly-Gly-Val-Leu-Ala-Leu-Phe-Leu-Ser-Ile-Val-Ile-Leu-Gly-Leu-Met-Pro-Phe-Leu-His-Thr-Ser-Lys-His-Arg-Ser-Met-Met-Leu-Arg-Pro-Leu-Ser-Gln-Ala-Leu-Phe-Trp-Thr-Leu-Thr-Met-Asp-Leu-Leu-Thr-Leu-Thr-Trp-Ile-Gly-Ser-Gln-Pro-Val-Glu-Tyr-Pro-Tyr-Thr-Ile-Ile-Gly-Gln-Met-Ala-Ser-Ile-Leu-Tyr-Phe-Ser-Ile-Ile-Leu-Ala-Phe-Leu-Pro-Ile-Ala-Gly-ANY-Ile-Glu-Asn-Tyr
Creating a class instance for a DNA sequence¶
In [23]:
dna_fasta = '''
>SRR001666_1.
GGGTGATGGCCGCTGCCGATGGCGTCAAATCCCACC
GTTCAGGGATACGACGTTTGTATTTTAAGAATCTGA
>SRR001666_2
AAGTTACCCTTAACAACTTAAGGGTTTTCAAATAGA
AGCAGAAGTCGATGATAATACGCGTCGTTTTATCAT
'''
In [24]:
dnas = DNASequence(dna_fasta)
In [25]:
print(dnas)
{'SRR001666_1.': 'GGGTGATGGCCGCTGCCGATGGCGTCAAATCCCACCGTTCAGGGATACGACGTTTGTATTTTAAGAATCTGA', 'SRR001666_2': 'AAGTTACCCTTAACAACTTAAGGGTTTTCAAATAGAAGCAGAAGTCGATGATAATACGCGTCGTTTTATCAT'}
In [26]:
dnas.reverse_complement()
Out[26]:
{'SRR001666_1.': 'TCAGATTCTTAAAATACAAACGTCGTATCCCTGAACGGTGGGATTTGACGCCATCGGCAGCGGCCATCACCC',
'SRR001666_2': 'ATGATAAAACGACGCGTATTATCATCGACTTCTGCTTCTATTTGAAAACCCTTAAGTTGTTAAGGGTAACTT'}