The B73 Maize Genome: Complexity, Diversity, and Dynamics.
Patrick S. Schnable, Doreen Ware, Robert S. Fulton, Joshua C. Stein, Fusheng Wei, Shiran Pasternak, Chengzhi Liang, Jianwei Zhang, Lucinda Fulton, Tina A. Graves, Patrick Minx, Amy Denise Reily, Laura Courtney, Scott S. Kruchowski, Chad Tomlinson, Cindy Strong, Kim Delehaunty, Catrina Fronick, Bill Courtney, Susan M. Rock, Eddie Belter, Feiyu Du, Kyung Kim, Rachel M. Abbott, Marc Cotton, Andy Levy, Pamela Marchetto, Kerri Ochoa, Stephanie M. Jackson, Barbara Gillam, Weizu Chen, Le Yan, Jamey Higginbotham, Marco Cardenas, Jason Waligorski, Elizabeth Applebaum, Lindsey Phelps, Jason Falcone, Krishna Kanchi, Thynn Thane, Adam Scimone, Nay Thane, Jessica Henke, Tom Wang, Jessica Ruppert, Neha Shah, Kelsi Rotter, Jennifer Hodges, Elizabeth Ingenthron, Matt Cordes, Sara Kohlberg, Jennifer Sgro, Brandon Delgado, Kelly Mead, Asif Chinwalla, Shawn Leonard, Kevin Crouse, Kristi Collura, Dave Kudrna, Jennifer Currie, Ruifeng He, Angelina Angelova, Shanmugam Rajasekar, Teri Mueller, Rene Lomeli, Gabriel Scara, Ara Ko, Krista Delaney, Marina Wissotski, Georgina Lopez, David Campos, Michele Braidotti, Elizabeth Ashley, Wolfgang Golser, HyeRan Kim, Seunghee Lee, Jinke Lin, Zeljko Dujmic, Woojin Kim, Jayson Talag, Andrea Zuccolo, Chuanzhu Fan, Aswathy Sebastian, Melissa Kramer, Lori Spiegel, Lidia Nascimento, Theresa Zutavern, Beth Miller, Claude Ambroise, Stephanie Muller, Will Spooner, Apurva Narechania, Liya Ren, Sharon Wei, Sunita Kumari, Ben Faga, Michael J. Levy, Linda McMahan, Peter Van Buren, Matthew W. Vaughn, Kai Ying, Cheng-Ting Yeh, Scott J. Emrich, Yi Jia, Ananth Kalyanaraman, An-Ping Hsia, W. Brad Barbazuk, Regina S. Baucom, Thomas P. Brutnell, Nicholas C. Carpita, Cristian Chaparro, Jer-Ming Chia, Jean-Marc Deragon, James C. Estill, Yan Fu, Jeffrey A. Jeddeloh, Yujun Han, Hyeran Lee, Pinghua Li, Damon R. Lisch, Sanzhen Liu, Zhijie Liu, Dawn Holligan Nagel, Maureen C. McCann, Phillip SanMiguel, Alan M. Myers, Dan Nettleton, John Nguyen, Bryan W. 
Penning, Lalit Ponnala, Kevin L. Schneider, David C. Schwartz, Anupma Sharma, Carol Soderlund, Nathan M. Springer, Qi Sun, Hao Wang, Michael Waterman, Richard Westerman, Thomas K. Wolfgruber, Lixing Yang, Yeisoo Yu, Lifang Zhang, Shiguo Zhou, Qihui Zhu, Jeffrey L. Bennetzen, R. Kelly Dawe, Jiming Jiang, Ning Jiang, Gernot G. Presting, Susan R. Wessler, Srinivas Aluru, Robert A. Martienssen, Sandra W. Clifton, W. Richard McCombie, Rod A. Wing, Richard K. Wilson
Stock and Biosample Information
Coe PI 550473
Zea mays ssp. mays (maize)
The source for the inbred line B73 used to make the BAC libraries that were sequenced is available from the North Central Regional Plant Introduction Station through the U.S. National Plant Germplasm System under the accession PI 550473. When requesting seed from the North Central Regional Plant Introduction Station, ask for any lot descended from the Coe PI 550473 lines.
The scaffold N50 is the length of the scaffold that takes the cumulative length (summing from the longest scaffold to the shortest) past 50% of the total assembly size.
The scaffold L50 is the number of scaffolds counted in reaching the N50 threshold.
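As a rough illustration of the two scaffold statistics defined above, the following sketch computes both values from a list of scaffold lengths. The function name `n50_l50` and the example lengths are hypothetical and are not taken from the assembly itself. The sketch assumes the threshold counts as reached once the running sum is at least half of the total, which is one common convention.

```python
from typing import Iterable, Tuple


def n50_l50(lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50 length, number of scaffolds counted to reach it)."""
    # Sum from the longest scaffold to the shortest.
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # Assumption: the threshold is met once the running sum
        # reaches or passes 50% of the total assembly size.
        if 2 * running >= total:
            return length, count
    raise ValueError("no scaffold lengths given")


# Hypothetical scaffold lengths (bp), for illustration only.
example_lengths = [80, 70, 50, 40, 30, 20]
n50, l50 = n50_l50(example_lengths)
print(n50, l50)
```

In this toy example the total is 290, so the threshold is 145; the two longest scaffolds (80 + 70 = 150) reach it, so the N50 is 70 and two scaffolds are counted.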
A contig is a contiguous consensus sequence that is derived from a collection of overlapping reads.
A scaffold is a set of ordered and oriented contigs that are linked to one another by mate pairs of sequencing reads.
Maize Genome Sequencing Consortium
Gramene evidence-based gene build pipeline and FGENESH
Genes were predicted from the assembled contigs of 16,006 BAC clones using a combination of the Gramene evidence-based gene build pipeline and FGENESH. For a small subset (506 BACs), only FGENESH was used. Prior to annotation, sequences were masked with the MIPS REdat v4.3 library, which masked 78.4% of the genome sequence.

The gene build incorporated sequence evidence from both maize and other plant sources available as of October 2008, as follows:
a) Maize full-length cDNAs (FLcDNA): 14,097 from the Arizona Maize Full-length cDNA Project; 36,430 from Ceres
b) ESTs: 2,000,333 maize; 1,217,859 rice; 2,448,641 other monocots
c) mRNAs: 18,181 maize; 72,919 rice; 14,015 other monocots
d) Proteins: 359,942 from Swiss-Prot (all species); 494,444 non-maize plant proteins from TrEMBL; 94,734 GenBank proteins from plant species; 52,177 rice proteins from rice gene annotations; 36,338 sorghum proteins from sorghum gene annotations

For many genes, multiple spliced transcripts were preserved where they had high-confidence cDNA/EST support (at least 99% sequence alignment identity). The resulting gene set was filtered by a minimum translation length: 50 amino acid residues for cDNA- or multiple-EST-supported genes, 25 residues for protein-supported genes, and 100 residues for single-EST-supported genes. FGENESH models were incorporated into evidence-based predictions when they could extend the open reading frame of an otherwise incomplete coding sequence. FGENESH models that did not overlap an evidence-based prediction were used “as-is.”

The resulting BAC-level annotations were projected onto the reference chromosome sequence on the basis of coordinates in the Accessioned Golden Path (AGP). This removed redundant annotations due to overlap between adjacent clones in the tiling path. Some genes failed to project because their models were disrupted by assembly breakpoints. These were re-annotated directly on the chromosome assembly.
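The projection of BAC-level annotations onto chromosome coordinates can be sketched roughly as follows. This is only an illustration under stated assumptions, not the consortium's pipeline: it assumes each AGP row records which slice of a clone was placed at which chromosome interval, with 1-based inclusive coordinates and an orientation. The names `AgpComponent` and `project` and all coordinates are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AgpComponent:
    """One AGP-style row: a clone slice placed on a chromosome (hypothetical type)."""
    chrom: str
    chrom_start: int   # 1-based, inclusive
    chrom_end: int
    clone_id: str
    clone_start: int   # slice of the clone used in the tiling path
    clone_end: int
    orientation: str   # "+" or "-"


def project(
    comp: AgpComponent, feat_start: int, feat_end: int, feat_strand: str
) -> Optional[Tuple[str, int, int, str]]:
    """Map a feature from clone coordinates to chromosome coordinates.

    Returns None when the feature is not fully inside the slice used in
    the path, e.g. it lies in a trimmed overlap or spans a breakpoint.
    """
    if feat_start < comp.clone_start or feat_end > comp.clone_end:
        return None
    if comp.orientation == "+":
        start = comp.chrom_start + (feat_start - comp.clone_start)
        end = comp.chrom_start + (feat_end - comp.clone_start)
        strand = feat_strand
    else:
        # Reverse-complemented clone: coordinates mirror, strand flips.
        start = comp.chrom_start + (comp.clone_end - feat_end)
        end = comp.chrom_start + (comp.clone_end - feat_start)
        strand = "-" if feat_strand == "+" else "+"
    return (comp.chrom, start, end, strand)


# Hypothetical placement: clone bases 501-1500 cover chr1:1001-2000.
forward = AgpComponent("chr1", 1001, 2000, "BAC_A", 501, 1500, "+")
print(project(forward, 601, 700, "+"))
print(project(forward, 400, 600, "+"))  # extends outside the slice
```

Under these assumptions, a feature that falls only in the part of a clone left out of the tiling path, or that extends past the slice boundary, returns `None`. That mirrors the two outcomes described above: redundant annotations in clone overlaps drop out, and models disrupted by breakpoints need separate re-annotation.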