There are a bunch of different alignment tools out there, and I don't want to get bogged down in the maths behind them, as this varies not only between programs but also from version to version.
The programs fall into two main camps: some use local alignments and others use global alignments. My question is threefold:
- What are the fundamental differences between the two?
- What are the advantages and disadvantages of each?
- When should one use either a global or local sequence alignment?
Answer
The basic difference between a local and a global alignment is that in a local alignment you try to match your query with a substring (a portion) of your subject (reference), whereas in a global alignment you perform an end-to-end alignment with the subject. As von mises said, you may therefore end up with a lot of gaps in a global alignment if the query and subject differ in size. A local alignment may also contain gaps.
Local Alignment
5' ACTACTAGATTACTTACGGATCAGGTACTTTAGAGGCTTGCAACCA 3'
             |||| |||||| |||||||||||||||
5'           TACTCACGGATGAGGTACTTTAGAGGC 3'
Global Alignment
5' ACTACTAGATTACTTACGGATCAGGTACTTTAGAGGCTTGCAACCA 3'
   |||||||||||    |||||||  |||||||||||||| |||||||
5' ACTACTAGATT----ACGGATC--GTACTTTAGAGGCTAGCAACCA 3'
Take the well-known dynamic programming algorithms as an example. In the Needleman-Wunsch (global) algorithm, the traceback starts from the (m,n) cell in the bottom-right corner of the scoring matrix (i.e. the end of both sequences), whereas in the Smith-Waterman (local) algorithm it starts from the cell with the highest score in the matrix (i.e. the end of the highest-scoring pair of subsequences). You can look up these algorithms for details.
You can adopt any scoring scheme; there is no fixed rule for it.
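To make the difference concrete, here is a minimal Python sketch of the two dynamic programming recurrences. It only computes the optimal score and does not do the traceback. The function name alignment_score and the default scores (match=1, mismatch=-1, and a linear gap penalty of -2) are my own assumptions for illustration, not values taken from any particular tool.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    """Optimal alignment score: Needleman-Wunsch (global) by default,
    Smith-Waterman (local) when local=True. Illustrative scoring only."""
    m, n = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j]
    score = [[0] * (n + 1) for _ in range(m + 1)]
    if not local:
        # Global: gaps at the start are penalised, so the first row
        # and column hold cumulative gap penalties.
        for i in range(1, m + 1):
            score[i][0] = i * gap
        for j in range(1, n + 1):
            score[0][j] = j * gap
    best = 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            cell = max(diag, up, left)
            if local:
                # Local: a cell never drops below zero, so an alignment
                # can start afresh anywhere in the matrix.
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell
    # Global reads the bottom-right cell; local reads the highest cell.
    return best if local else score[m][n]


subject = "ACTACTAGATTACTTACGGATCAGGTACTTTAGAGGCTTGCAACCA"
query = "TACTCACGGATGAGGTACTTTAGAGGC"
print("global:", alignment_score(subject, query))
print("local: ", alignment_score(subject, query, local=True))
```

With the same scoring scheme the local score can never be lower than the global one, because the end-to-end alignment is just one of the candidates a local alignment is allowed to pick. With sequences of very different lengths, as in the example above, the global score is dragged down by the gaps needed to span the whole subject.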
Global alignments are usually used to compare homologous genes, whereas local alignments can be used to find homologous domains in otherwise non-homologous genes.