These data sets represent subsets of the data that may be of particular interest. Each data set can be downloaded as a simple list of intron/exon identifiers, as a table, or as a flatfile (see the pages on the intron/exon flat and table files for descriptions).
GC-AG type introns
This data set contains 171 GC-AG introns that were identified in our analysis.
U12-type GT-AG introns
This data set contains 60 U12-type GT-AG introns that were identified in our analysis. This list of 45 introns constitutes the manually curated subset after examination of branch point signals. In addition, here is a list of the 10 AT-AC introns that, although not part of the data set, were observed during its construction.
AG-dependent introns
The data set contains 523 introns in which no polypyrimidine tract (PPT) was identified that came within 20 nucleotides of the acceptor splice site, and in which the acceptor site bit score (based on positions -20 to +4) was less than 5 bits (thus reducing the number of false positives occurring as a result of a PPT signal that was too weak to be specifically identified). In 280 of these introns, no polypyrimidine tracts were identified in the 100 nucleotides upstream of the acceptor site. Please note that this data set is a preliminary set of probable AG-dependent introns, and that further examination is required.
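The two selection criteria described above can be illustrated with a minimal sketch. It assumes that the PPT search and the acceptor site bit score have already been computed for each intron. The Intron record, its field names and the example values are hypothetical and are not taken from the flat or table file formats. Whether a tract at exactly 20 nucleotides counts as "within 20" is also an assumption here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Intron:
    # Hypothetical record; the field names are illustrative only.
    identifier: str
    # Distance (nt) from the nearest polypyrimidine tract to the acceptor
    # site, or None if no tract was found in the 100 nt upstream.
    ppt_distance: Optional[int]
    # Acceptor site bit score (positions -20 to +4).
    acceptor_bits: float


def is_probable_ag_dependent(intron: Intron,
                             max_ppt_distance: int = 20,
                             max_bits: float = 5.0) -> bool:
    """True if no PPT lies within max_ppt_distance nt of the acceptor
    site and the acceptor bit score is below max_bits."""
    # Assumption: a tract at exactly 20 nt counts as "within 20".
    no_close_ppt = (intron.ppt_distance is None
                    or intron.ppt_distance > max_ppt_distance)
    return no_close_ppt and intron.acceptor_bits < max_bits


def lacks_any_ppt(intron: Intron) -> bool:
    """True if no PPT was found anywhere in the 100 nt upstream."""
    return intron.ppt_distance is None


# Made-up example records, not real data.
introns = [
    Intron("intron_A", None, 3.2),
    Intron("intron_B", 35, 4.1),
    Intron("intron_C", 12, 2.0),
    Intron("intron_D", None, 7.5),
]

candidates = [i for i in introns if is_probable_ag_dependent(i)]
candidate_ids = [i.identifier for i in candidates]
no_ppt_ids = [i.identifier for i in candidates if lacks_any_ppt(i)]
```

In this sketch, candidate_ids corresponds to the full set of probable AG-dependent introns, and no_ppt_ids to the stricter subset in which no tract was found in the 100 nucleotides upstream.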
Short introns, short exons and long exons
The data set contains 572 exons shorter than 50 nt, 177 exons longer than 400 nt, and 48 introns shorter than 70 nt. These extreme subsets may be of interest in studies of the mechanisms that define exons and introns.
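The length thresholds above can be applied to any list of features with a short filter such as the sketch below. The function name, the "exon"/"intron" labels and the example lengths are hypothetical; only the three cut-offs (50, 400 and 70 nt) come from the description above.

```python
from typing import Optional


def classify_by_length(kind: str, length: int) -> Optional[str]:
    """Return the extreme subset a feature falls into, or None.

    kind is "exon" or "intron"; length is in nucleotides.
    """
    if kind == "exon":
        if length < 50:
            return "short exon"
        if length > 400:
            return "long exon"
    elif kind == "intron":
        if length < 70:
            return "short intron"
    return None


# Made-up examples, not real data.
features = [("exon", 42), ("exon", 150), ("exon", 612), ("intron", 65)]
labels = [classify_by_length(kind, length) for kind, length in features]
```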
fc - 16/8/2001