I am aligning a large number of ESTs, and poly-A tails seem to show up in many different forms. Besides occurring at the very end of the read, they can be flanked by cloning sequence on one end, or contain mismatches/errors. What is a good rule, or which available tools, will handle the usual cases?
A few examples of the non-trivial cases I found, with their GenBank accessions:
>EE409337
... AAAAAAAAAAAAAAAAAAAAAAAAAGGAAAAAAAAAAAAAAAAAAAAAAAAAAAACCTTGTC
>EE409340
... TTTCTACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACTTGTC
>EE409361
... TTGTTAAACTGAAAAAAAAAAAAAAAAAAAAAAAAAAAACCATGTCGGC
TTACTGAATTGAA
>EE420306
.... AAAAAAAGTTATGTTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGGGAAAAAAA
AAAAAAAAAAAAAAAAA
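To make the kind of rule I am after concrete, here is a rough Python sketch. It is not an established method: it scores A as +1 and any other base as a penalty, finds the best-scoring A-rich stretch, and cuts there only if that stretch is long enough and sits close to the 3' end. The function name `trim_poly_a` and the thresholds `min_score`, `max_flank` and `mismatch_penalty` are hypothetical values I picked for illustration, not tuned on real data.

```python
def trim_poly_a(seq, min_score=15, max_flank=30, mismatch_penalty=2):
    """Trim a 3' poly-A tail, tolerating a few mismatches and a short flank.

    Hypothetical heuristic: all thresholds are illustrative assumptions.
    Returns the sequence upstream of the tail, or seq unchanged if no
    convincing tail is found near the 3' end.
    """
    best_score, best_start, best_end = 0, 0, 0
    score, start = 0, 0
    for i, base in enumerate(seq.upper()):
        if base == "A":
            if score == 0:
                start = i  # a new candidate A-rich stretch begins here
            score += 1
            if score > best_score:
                best_score, best_start, best_end = score, start, i + 1
        else:
            # a non-A base costs points; the stretch is dropped at zero
            score = max(0, score - mismatch_penalty)

    long_enough = best_score >= min_score
    near_three_prime = len(seq) - best_end <= max_flank
    if long_enough and near_three_prime:
        return seq[:best_start]  # drops the tail and anything after it
    return seq


# Clean tail followed by a short flank, similar in shape to EE409340
print(trim_poly_a("TTTCTAC" + "A" * 30 + "CTTGTC"))
# Tail interrupted by two mismatches, similar in shape to EE409337
print(trim_poly_a("GATTACAGT" + "A" * 25 + "GG" + "A" * 28 + "CCTTGTC"))
```

With these settings a short A-run separated from the main tail by several non-A bases (as in EE420306) would be left in the read, which may or may not be the desired behaviour.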
Cross-posted on SeqAnswers and Biostars.