PPT.tag_pos_seq.all (912 Kb)
PPT.tag_pos_seq.clean (847 Kb)
The file 'PPT.tag_pos_seq.all' contains a listing of putative polypyrimidine tract (PPT) sequences (up to 100 nucleotides upstream from the acceptor splice site) for all the introns in our data set for which a putative PPT was determined. Where more than one PPT was identified, only the PPT closest to the acceptor site is given (for a full listing of all the PPTs identified, please see the "introns.flat" file).
The first field is the primary key identifying the intron, the second field describes the position of the PPT with respect to the 3' end of the intron, and the third field contains the PPT sequence.
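As an illustration of the three-field layout described above, the following is a minimal Python sketch for reading records. It assumes the fields are separated by whitespace, which is not stated here, and it keeps the position as text because its exact format is not specified. The names PPTRecord and parse_ppt_lines are hypothetical and not part of the data set.

```python
from typing import Iterable, Iterator, NamedTuple


class PPTRecord(NamedTuple):
    """One line of a PPT.tag_pos_seq file (hypothetical container)."""
    intron_id: str   # primary key identifying the intron
    position: str    # position of the PPT relative to the 3' end of the intron
    sequence: str    # the PPT sequence


def parse_ppt_lines(lines: Iterable[str]) -> Iterator[PPTRecord]:
    """Yield one record per well-formed line.

    Assumption: fields are whitespace-separated. Lines that do not
    split into exactly three fields (including blank lines) are skipped.
    """
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue
        yield PPTRecord(*fields)
```

To read a file, pass an open file handle, for example: with open("PPT.tag_pos_seq.all") as fh: records = list(parse_ppt_lines(fh)). If the files turn out to be tab- or comma-delimited, only the split call needs to change.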
The file 'PPT.tag_pos_seq.clean' contains a subset of the data in the .all file. Firstly, duplicate PPTs (caused by intron isoforms with the same PPT) have been removed. Secondly, introns that were categorised as being of 'GC' or 'U12' type have been removed.
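The two filtering steps can be sketched in Python as follows. This is not the script that produced the .clean file, only an illustration under stated assumptions: a duplicate is taken to mean an identical (position, sequence) pair, and the intron keys classified as 'GC' or 'U12' are supplied by the caller in a hypothetical excluded_ids set, since that classification is not stored in the three-field files.

```python
from typing import Iterable, List, Set, Tuple

Record = Tuple[str, str, str]  # (intron key, position, PPT sequence)


def clean_records(records: Iterable[Record], excluded_ids: Set[str]) -> List[Record]:
    """Drop excluded introns, then keep the first record of each duplicate PPT.

    Assumptions: excluded_ids holds the keys of 'GC' and 'U12' introns,
    and two PPTs are duplicates when position and sequence both match.
    """
    seen = set()
    kept = []
    for intron_id, position, sequence in records:
        if intron_id in excluded_ids:
            continue  # intron categorised as 'GC' or 'U12'
        key = (position, sequence)
        if key in seen:
            continue  # same PPT already kept, e.g. from another isoform
        seen.add(key)
        kept.append((intron_id, position, sequence))
    return kept
```

If duplicates were instead defined by sequence alone, the key would reduce to the sequence field.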
fc - 13/9/2001