William H. Majoros, Ph.D.
Assistant Professor

Center for Statistical Genetics & Genomics
Center for Genomic & Computational Biology
Center for Advanced Genomic Technologies
Department of Biostatistics & Bioinformatics
Duke University

Research Interests

  • Deep learning and language modeling for genomics
  • Convolutional neural networks, attention models, and transformers
  • Probabilistic machine learning
  • Bayesian inference in probabilistic graphical models
  • Structured prediction
  • Interpretation/prioritization of genetic variants in disease
  • RNA splicing
  • Gene regulation
  • Computational linguistics - formal grammars and parsing
  • Syntactic structure and memetic evolution in complex bird song
  • Single-cell CRISPR screens



Books:

  1. Methods for Computational Gene Prediction (2007). Majoros WH (foreword by Steven Salzberg).  Cambridge University Press.  430 pages.

Book Chapters:

  1. Gene Prediction Methods. (2009). Majoros WH, Korf I, Ohler U.  In: Bioinformatics, Tools and Applications. Springer.  pp. 99-120.
  2. Dynamic programming for gene finders. (2005). Majoros WH.  In: Encyclopedia of Genetics, Genomics, Proteomics, and Bioinformatics.  Wiley. 
  3. Automatic concept identification in biomedical literature. (2005). Majoros WH.  In: Encyclopedia of Genetics, Genomics, Proteomics, and Bioinformatics.  Wiley.

Research Articles:

  1. Pickar-Oliver A, et al. (2021) Full-length dystrophin restoration via targeted exon integration by AAV-CRISPR in a humanized mouse model of Duchenne muscular dystrophy. Molecular Therapy 29:3243-3257.
  2. Miller DE, et al. (2021) Targeted long-read sequencing identifies missing disease-causing variation. American Journal of Human Genetics 108:1436-1449.
  3. Kim Y-S, Johnson GD, Seo J, Barrera A, Cowart TN, Majoros WH, Ochoa A, Allen AS, Reddy TE (2021) Correcting signal biases and detecting regulatory elements in STARR-seq data. Genome Research 10.1101/gr.269209.120.
  4. Majoros WH, Barrera A, Kim Y-S, Li F, Wang X, Cunningham SJ, Johnson GD, Guo C, Lowe WL, Scholtens DM, Hayes MG, Reddy TE, Allen AS (2019) Bayesian estimation of genetic regulatory effects in high-throughput reporter assays.  Bioinformatics (Accepted).
  5. Edsall LE, Berrio A, Majoros WH, Swain-Lenz D, Morrow S, Shibata Y, Safi A, Wray GA, Crawford GE, Allen AS (2019) Evaluating Chromatin Accessibility Differences across Multiple Primate Species Using a Joint Modeling Approach.  Genome Biology and Evolution 11:3035-3053.
  6. Johnson G, Barrera A, McDowell I, D'Ippolito A, Majoros WH, Vockley C, Wang X, Allen A, Reddy T (2018) Human genome-wide measurement of drug-responsive regulatory activity.  Nature Communications 9:5317.
  7. McDowell IC, Barrera A, D'Ippolito AM, Vockley CM, Hong LK, Leichter SM, Bartelt LC, Majoros WH, Song L, Safi A, Kocak DD, Gersbach CA, Hartemink AJ, Crawford GE, Engelhardt BE, Reddy TE (2018) Glucocorticoid receptor recruits to enhancers and drives activation by motif-directed binding.  Genome Research 28:1272-1284.
  8. D'Ippolito AM, McDowell I, Barrera A, Hong L, Leichter S, Bartelt L, Vockley C, Majoros WH, Safi A, Song L, Gersbach CA, Crawford GE, Reddy TE (2018) Glucocorticoid treatment mediates chromatin interaction dynamics concordant with gene expression responses.  Cell Systems 7:146-160.
  9. Majoros WH, Holt C, Campbell MS, Ware D, Yandell M, Reddy TE (2018) Predicting gene structure changes resulting from genetic variants via exon definition features.  Bioinformatics 34:3616-3623.
  10. Gussow AB, Copeland BR, Dhindsa RS, Wang Q, Petrovski S, Majoros WH, Allen AS, Goldstein DB. (2017) Orion: Detecting Regions of the Human Non-Coding Genome that are Intolerant to Variation Using Population Genetics.  PLoS One 12(8):e0181604.
  11. Majoros WH, Campbell MS, Holt C, DeNardo EK, Ware D, Allen AS, Yandell M, Reddy TE (2017) High-throughput interpretation of gene structure changes in human and nonhuman resequencing data, using ACE.  Bioinformatics 33:1437-1446.
  12. Vockley CM, D'Ippolito AM, McDowell IC, Majoros WH, Safi AS, Song L, Crawford GE, Reddy TE (2016) Direct GR binding sites potentiate clusters of TF binding across the human genome.  Cell 166:1269-1281.
  13. Snyder-Mackler N, Majoros WH, Yuan M, Shaver A, Gordon J, Kopp G, Schlebusch S, Wall JD, Alberts SC, Mukherjee S, Xiang Z, Tung J (2016) Genome-wide sequencing and low coverage pedigree analysis from low quality, noninvasively collected samples. Genetics 203:699-714.
  14. Vockley CM*, Guo C*, Majoros WH* (co-1st author), Nodzenski M, Scholtens DM, Hayes MG, Lowe WL, Reddy TE (2015) Massively parallel quantification of the regulatory effects of non-coding genetic variation in a human cohort. Genome Research 25:1206-1214.
  15. Ousterout DG, Kabadi AM, Thakore PI, Majoros WH, Reddy TE, Gersbach CA (2015) Multiplex CRISPR/Cas9-based genome editing for correction of dystrophin mutations that cause Duchenne muscular dystrophy. Nature Communications 6:6244.
  16. Ousterout DG, Kabadi AM, Thakore PI, Perez-Pinera P, Brown MT, Majoros WH, Reddy TE, Gersbach CA (2015) Correction of dystrophin expression in cells from Duchenne muscular dystrophy patients through genomic excision of exon 51 by zinc finger nucleases. Molecular Therapy 23:523-32.
  17. Majoros WH, Lebeck N, Ohler U, Li S (2014) Improved transcript isoform discovery using ORF graphs. Bioinformatics 30:1958-64.
  18. Majoros WH, Lekprasert P, Mukherjee N, Skalsky RL, Corcoran DL, Cullen BR, Ohler U (2013) MicroRNA target site identification by integrating sequence and binding information. Nature Methods 10:630-3.
  19. Pruteanu-Malinici I, Majoros WH, Ohler U (2013) Automated annotation of gene expression image sequences via non-parametric factor analysis and conditional random fields. Bioinformatics 29:i27-35.
  20. LaMonte G, Philip N, Reardon J, Lacsina JR, Majoros W, Chapman L, Thornburg CD, Telen MJ, Ohler U, Nicchitta CV, Haystead T, Chi JT (2012) Translocation of sickle cell erythrocyte microRNAs into Plasmodium falciparum inhibits parasite translation and contributes to malaria resistance. Cell Host and Microbe 12:187-99.
  21. Majoros WH, Ohler U (2010) Modeling the Evolution of Regulatory Elements by Simultaneous Detection and Alignment with Phylogenetic Pair HMMs. PLoS Computational Biology 6(12):e1001037.
  22. Rach EA, Yuan HY, Majoros WH, Tomancak P, Ohler U (2009) Motif composition, conservation and condition-specificity of single and alternative transcription start sites in the Drosophila genome. Genome Biology 10:R73.
  23. Majoros WH, Ohler U (2008) Complexity Reduction in Context-dependent DNA Substitution Models. Bioinformatics 25:175-82.
  24. Gottwein E, Mukherjee N, Sachse C, Frenzel C, Majoros WH, Chi J-T, Braich R, Manoharan M, Soutschek J, Ohler U, Cullen BR (2007) A viral microRNA functions as an ortholog of cellular miR-155. Nature 450:1096-1099.
  25. Majoros W, Ohler U (2007) Spatial preferences of microRNA targets in 3' untranslated regions. BMC Genomics 8:152.
  26. Majoros W, Ohler U (2007) Advancing the State of the Art in Computational Gene Prediction. Proceedings of KDECB / Benelearn '06, Springer-Verlag.
  27. Allen J, Majoros W, Pertea M, Salzberg SL (2006) JIGSAW, GeneZilla, and GlimmerHMM: puzzling out the features of human genes in the ENCODE regions. Genome Biology 7(Suppl 1):S9.
  28. Eisen JA, Coyne RS, Wu M, Wu D, Thiagarajan M, Wortman JR, Badger JH, Ren Q, Amedeo P, Jones KM, Tallon LJ, Delcher AL, Salzberg SL, Silva JC, Haas BJ, Majoros WH, and 37 others (2006) Macronuclear genome sequence of the ciliate Tetrahymena thermophila, a model eukaryote. PLoS Biology 4(9).
  29. Nierman W, ... , Majoros WH (author 51 of 97), et al. (2005) Genomic sequence of the pathogenic and allergenic filamentous fungus Aspergillus fumigatus. Nature 438:1151-1156.
  30. Majoros WH, Pertea M, Salzberg SL (2005) Efficient implementation of a generalized pair hidden Markov model for comparative gene finding. Bioinformatics 21:1782-1788.
  31. Majoros W, Pertea M, Delcher A, Salzberg SL (2005) Efficient decoding algorithms for generalized hidden Markov model gene finders. BMC Bioinformatics 6:16.
  32. Anderson IJ, Watkins RF, Samuelson J, Spencer DF, Majoros WH, Gray MW, Loftus BJ (2005) Gene discovery in the Acanthamoeba castellanii genome. Protist 156:203-14.
  33. Majoros WH, Salzberg SL (2004) An empirical analysis of training protocols for probabilistic gene finders. BMC Bioinformatics 5:206.
  34. Majoros WH, Pertea M, Salzberg SL (2004) TIGRscan and GlimmerHMM: two open source ab initio eukaryotic gene finders. Bioinformatics 20:2878-2879.
  35. The ENCODE Project Consortium (2004) The ENCODE (ENCyclopedia Of DNA Elements) Project. Science 306:636-640.
  36. Majoros WH, Pertea M, Antonescu C, Salzberg SL (2003) GlimmerM, Exonomy, and Unveil: Three ab initio Eukaryotic Genefinders. Nucleic Acids Research 31:3601-3604.
  37. Mi H, Vandergriff J, Campbell M, Marechania A, Majoros WH, Lewis S, Thomas PD, Ashburner M (2003) Assessment of genome-wide protein function classification for Drosophila melanogaster. Genome Research 13:2118-2128.
  38. Majoros WH, Subramanian GM, Yandell MD (2002) Identification of Key Concepts in Biomedical Literature using a Modified Markov Heuristic. Bioinformatics 19:402-407.
  39. Yandell MD, Majoros WH (2002) Genomics and Natural Language Processing. Nature Reviews Genetics 3:601.
  40. Holt RA, ... , Majoros WH (author 14 of 123), et al. (2002) The Genome Sequence of the Malaria Mosquito Anopheles gambiae. Science 298:129-149.
  41. Mural RJ, ... , Majoros WH (author 20 of 24), et al. (2002) A preliminary comparison of the mouse and human genomes. Int Congr Ser 2002 1246:169-181.
  42. Mural RJ, ... , Majoros WH (author 87 of 180), et al. (2002) A Comparison of Whole-Genome Shotgun-Derived Mouse Chromosome 16 and the Human Genome. Science 296:1661-1671.
  43. Majoros WH (2002) Syntactic Structure in Birdsong: Memetic Evolution of Songs or Grammars? Journal of Memetics, vol 6.
  44. Venter JC, ..., Majoros WH (author 249 of 273), et al. (2001) The sequence of the human genome. Science 291:1304-1351.

Teaching - Computer Science / Computational Biology

  • Computational Sequence Biology (COMPSCI 561, Spring 2023) - entire course
  • Graphical Models and Structured Prediction for Biological Data (BIOSTAT 914, Fall 2022) - entire course
  • Genome Tools and Technologies (CBB 520, Fall 2022) - one guest lecture
  • Genetic Approaches to the Solution of Biological Problems (UPGEN778, Fall 2018) - two lectures
  • Computational Sequence Biology (CS561 Duke University, Spring 2018) - two lectures
  • Genetic and Genomic Solutions to Biological Problems (UPGEN778, Fall 2016) - one lecture
  • Introduction to Computational Genomics (CS260 Duke University, Spring 2016) - two lectures
  • Computational and Regulatory Genomics (CBB561 Duke University, Fall 2015) - one lecture
  • Sequencing-based Genomics (CBB720 Duke University, Fall 2014) - two lectures
  • Computational Sequence Biology (CS261 Duke University, Spring 2014) - seven lectures
  • Computational Sequence Biology (CS261 Duke University, Spring 2012) - seven lectures
  • Computational Sequence Biology (CS261 Duke University, Spring 2010) - seven lectures
  • Computational Biology of Gene Regulation (CS261 Duke University, Spring 2008) - five lectures
  • Computational Biology of Gene Regulation (CS261 Duke University, Fall 2006) - five lectures


Education

Duke University - Ph.D., Computational Biology and Bioinformatics
Penn State University - B.Sc. in Computer Science, Magna Cum Laude.
