Ames, Iowa
Iowa State University
Lincoln, Nebraska
University of Nebraska
High-throughput mapping and gene discovery tools for maize genomics
--Wen, TJ, Qiu, F, Guo, L, Fu, Y, Liu, F, Lee, M, Russell, K, Ashlock, D, Schnable, PS

EST libraries. Two normalized B73 cDNA libraries with complexities >10⁶ were prepared using mRNA from seedlings and silk (ISUM4-TN), and from a wide variety of organs, stages of development, and conditions (ISUM5-RN). More than 27,000 3’ ESTs from these libraries have been submitted to GenBank. Mean and modal trimmed sequence lengths are 545 and 575 bp, respectively. Because these ESTs are 3’ sequences from a single inbred (B73), unlike EST projects composed of 5’ sequences derived from a variety of genetic backgrounds, they can be assembled into a set of unique genes with a very high degree of confidence. Two independent methods have been used to estimate the number of genes defined by this collection of ESTs. In the first assay, ESTs were assembled into contigs using CAP3 (Huang et al., Genome Res 9:868-877, 1999), and representative contigs were manually quality-checked using Consed (Gordon et al., Genome Res 8:195-202, 1998). This analysis defined 4,666 singletons and 3,540 clusters for a total of 8,206 genes, an overall gene discovery rate of 30%. The second assay relies on preliminary grouping with a novel bioinformatic tool developed during this project (the Cosine Homology Tool), followed by local Smith-Waterman alignment. Based on this analysis, these 27,020 3’ B73 ESTs define 13,328 genes. Hence, one gene was defined for every two to three ESTs sequenced from these normalized libraries.
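
The two gene counts above translate into discovery rates as follows. This is only an arithmetic sketch in Python; it assumes that both assays were run on the same 27,020 ESTs, which the note states explicitly only for the second assay. The function name `discovery_rate` is illustrative.

```python
# Figures quoted in the note.
TOTAL_ESTS = 27_020  # assumption: the same EST set was used for both assays


def discovery_rate(genes: int, ests: int = TOTAL_ESTS) -> float:
    """Fraction of sequenced ESTs that define a distinct gene."""
    return genes / ests


cap3_genes = 4_666 + 3_540   # singletons + clusters from the CAP3/Consed assay
cosine_sw_genes = 13_328     # Cosine Homology Tool + Smith-Waterman assay

for label, genes in [
    ("CAP3/Consed", cap3_genes),
    ("Cosine Homology Tool + Smith-Waterman", cosine_sw_genes),
]:
    rate = discovery_rate(genes)
    # 1 / rate is the number of ESTs sequenced per gene defined.
    print(f"{label}: {genes} genes, rate {rate:.1%}, {1 / rate:.2f} ESTs per gene")
```

Under that assumption the two assays give roughly 3.3 and 2.0 ESTs per gene, which is the "two to three" range quoted above.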

An additional B73 library (ISUM6) was prepared using the same mRNAs as ISUM5-RN, but each mRNA was tagged with one of 21 specific 6-bp "bar-code" sequences prior to cloning (Qiu et al., submitted). Although this library was prepared from multiple mRNA sources, the tags allow the particular source from which a given EST was derived to be determined. Over 3,600 3’ ESTs from ISUM6 have been sequenced and submitted to GenBank.
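
Assigning an EST to its mRNA source then amounts to a table lookup on the tag. The Python sketch below is illustrative only: the three tag sequences and source names are made up (the note does not list the 21 real tags), and the position of the tag within the read (`tag_start`) is an assumption.

```python
from typing import Optional

TAG_LENGTH = 6  # from the note: each bar-code tag is 6 bp long

# Hypothetical tag table; the real tags and mRNA sources are not given in the note.
TAG_TO_SOURCE = {
    "ACGTAC": "source_A",
    "TGCATG": "source_B",
    "GATCGA": "source_C",
}


def assign_source(read: str, tag_start: int = 0) -> Optional[str]:
    """Return the mRNA source for a read, or None if no known tag is found.

    Assumes the tag sits at a fixed offset (tag_start) in the read; where the
    tag actually lies in an ISUM6 EST is not stated in the note.
    """
    tag = read[tag_start:tag_start + TAG_LENGTH].upper()
    return TAG_TO_SOURCE.get(tag)


print(assign_source("ACGTACTTTTGG"))   # tag at the start of the read
print(assign_source("CCCCCCAAAA"))     # no known tag
```

An exact-match lookup like this ignores sequencing errors in the tag; a real pipeline would need some policy for reads whose tag does not match any entry.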

Over 1,000 3’ ESTs from a Mo17 cDNA library (ISUM7) were sequenced and aligned with the 3’ UTRs of the corresponding B73 alleles. At least one indel is present in 48% (277/572) of these alignments.
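
Given a gapped pairwise alignment, indels can be counted as runs of gap characters in either sequence. The following Python sketch illustrates that idea; it is not the authors' pipeline, and the function name, the `-` gap symbol, and the example sequences are assumptions.

```python
import re
from typing import List


def indel_lengths(aligned_a: str, aligned_b: str) -> List[int]:
    """Return the length of every gap run in a gapped pairwise alignment.

    Gap runs are collected from the first sequence, then from the second.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    lengths = []
    for seq in (aligned_a, aligned_b):
        # Each run of '-' in one sequence is an insertion in the other.
        lengths.extend(len(m.group()) for m in re.finditer(r"-+", seq))
    return lengths


# Made-up alignment of a B73 and a Mo17 3' UTR fragment.
b73 = "ACGTTTGACCA--GT"
mo17 = "ACG---GACCATTGT"
print(indel_lengths(b73, mo17))

# Fraction of alignments with at least one indel, from the note.
print(f"{277 / 572:.1%}")
```

This simple version also counts gaps at the ends of an alignment, which usually reflect differing read lengths rather than true indels, so end gaps would need trimming first.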

Development of IDP Markers and Genetic Mapping of ESTs. IDPs (InDel Polymorphisms) are a class of PCR-based, allele-specific genetic markers that detect insertions and deletions (indels) among maize alleles. Primer pairs were designed based on the 3’ UTRs of ESTs and used to PCR amplify B73 and Mo17 alleles from genomic DNA. A useful primer pair can distinguish B73 and Mo17 alleles either because it amplifies one allele but not the other, or because the PCR products from the two inbreds display a size polymorphism. A high-throughput primer design tool was developed. Of the 7,702 primers designed with this tool, 531 (6.9%) yielded validated +/- results or size polymorphisms between B73 and Mo17. About one third of the IDPs are polymorphic between any pair of inbreds tested, demonstrating the widespread utility of these genetic markers. The 531 IDP markers were genetically mapped using a panel of 94 RIs from the Intermated B73 X Mo17 (IBM) population. A set of 22 inbreds has been surveyed with 3,884 primer pairs. About 31% of these primer pairs detect polymorphisms between at least one inbred and B73 or between at least one inbred and Mo17. Eighty-eight F1BC populations are being developed and will be used to map an additional 3,000 IDPs and the corresponding cDNAs.
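
The two ways a primer pair can be informative, as described above, can be written as a small decision rule. The Python sketch below is illustrative: the function name, the use of `None` for "no product", and the `min_size_diff` threshold (a stand-in for whatever size difference can be resolved on a gel) are assumptions and are not taken from the note.

```python
from typing import Optional


def classify_primer_pair(
    b73_size: Optional[int],
    mo17_size: Optional[int],
    min_size_diff: int = 10,  # hypothetical resolution limit, in bp
) -> str:
    """Classify a PCR result; sizes are in bp, None means no product."""
    if b73_size is None and mo17_size is None:
        return "no amplification"
    if b73_size is None or mo17_size is None:
        return "plus/minus"          # amplifies one allele but not the other
    if abs(b73_size - mo17_size) >= min_size_diff:
        return "size polymorphism"   # both amplify, products differ in length
    return "monomorphic"             # not useful for B73 vs. Mo17


print(classify_primer_pair(420, None))
print(classify_primer_pair(420, 455))
print(classify_primer_pair(420, 422))

# Success rate reported in the note.
print(f"{531 / 7702:.1%}")
```

Only the "plus/minus" and "size polymorphism" outcomes correspond to the 531 validated IDPs.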

Rescue: a new tool for gene discovery. Because genes are present at equimolar concentrations in genomic DNA, genomic sequencing provides a means to uncover genes that are missed by EST projects because they are expressed only under unusual conditions or at very low levels. Unfortunately, it is not currently feasible or cost-effective to sequence entire crop genomes. Hence, biochemical approaches need to be developed to filter out the non-coding regions of the genome so that limited sequencing resources can be focused on genic DNA. A novel expression vector system has been developed to directly rescue open reading frames (ORFs) from genomic DNA. In a preliminary experiment, 250 maize genomic fragments cloned into this vector were selected based on a colorimetric screen. Sequence analysis of these clones revealed that 93.6% (234 out of 250) contain an uninterrupted ORF, and 55% of these (129 out of 234) exhibit significant sequence similarity to entries in protein and EST databases. Many of the remaining clones are thought to contain ORFs that have not yet been discovered via EST sequencing.
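
Whether a sequenced insert contains an uninterrupted ORF can be checked computationally by scanning one reading frame for stop codons. The Python sketch below shows such a check; it assumes the standard stop codons and a single fixed reading frame, and it does not model the vector or the colorimetric screen, neither of which is described in the note. The function name and example sequences are made up.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def is_open_in_frame(fragment: str, frame: int = 0) -> bool:
    """Return True if reading the fragment from offset `frame` meets no stop codon."""
    seq = fragment.upper()
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return False
    return True


print(is_open_in_frame("ATGGCTGAA"))           # ATG GCT GAA: open
print(is_open_in_frame("ATGTAAGCT"))           # ATG TAA ...: interrupted
print(is_open_in_frame("ATGTAAGCT", frame=1))  # TGT AAG: open in frame 1

# Proportions reported in the note.
print(f"{234 / 250:.1%} with an uninterrupted ORF")
print(f"{129 / 234:.1%} of those with database similarity")
```

A short random fragment can be stop-free by chance, which is one reason the database similarity figure is a useful second line of evidence.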

Please Note: Notes submitted to the Maize Genetics Cooperation Newsletter may be cited only with consent of the authors.
