Plant Microbiology/Fun. Mol. Evol.

Chapter 3, Evolutionary Change in Nucleotide Sequences

케이든 2014. 10. 15. 17:54

 

 

Changes in nucleotide sequences are used for:

1. Estimating the rate of evolution

2. Reconstructing the evolutionary history of organisms

 

NUCLEOTIDE SUBSTITUTION IN A DNA SEQUENCE

 

Jukes and Cantor's one-parameter model (1969)

Substitutions occur with equal probability among the four nucleotide types

one-parameter model (α)
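
As a rough sketch of what the one-parameter model implies: if each nucleotide changes to each of the other three at rate α per unit time, the standard Jukes-Cantor result gives the probability that a site still shows (or again shows) its starting nucleotide after time t, and the probability that it shows one particular other nucleotide. The function name and arguments below are illustrative, not from the book.

```python
import math


def jc69_probabilities(alpha, t):
    """Jukes-Cantor one-parameter model.

    alpha: substitution rate toward each of the three other nucleotides.
    t: elapsed time (same time unit as alpha).
    Returns (p_same, p_diff): the probability that the site shows its
    starting nucleotide, and the probability that it shows one specific
    other nucleotide.
    """
    decay = math.exp(-4.0 * alpha * t)
    p_same = 0.25 + 0.75 * decay
    p_diff = 0.25 - 0.25 * decay
    return p_same, p_diff


if __name__ == "__main__":
    for t in (0, 10, 100, 1000):
        print(t, jc69_probabilities(alpha=0.001, t=t))
```

As t grows, both probabilities approach 1/4, which is why very divergent sequences carry little information.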

 

Kimura's two-parameter model (1980)

two-parameter model (Ts/Tv ratio)

transitional substitution rate (α) - more frequent

transversional substitution rate (β)
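
A minimal sketch of the two-parameter model, assuming α is the rate of the single possible transition and β is the rate of each of the two possible transversions at a site. The formulas are the standard Kimura results; the function name is made up for illustration.

```python
import math


def k2p_probabilities(alpha, beta, t):
    """Kimura two-parameter model.

    alpha: transition rate; beta: rate of each of the two transversion types.
    Returns (same, transition, transversion): the probability that after
    time t the site shows its starting nucleotide, the nucleotide that
    differs from it by a transition, or one specific nucleotide that
    differs from it by a transversion.
    """
    e1 = math.exp(-4.0 * beta * t)
    e2 = math.exp(-2.0 * (alpha + beta) * t)
    same = 0.25 + 0.25 * e1 + 0.5 * e2
    transition = 0.25 + 0.25 * e1 - 0.5 * e2
    transversion = 0.25 - 0.25 * e1
    return same, transition, transversion


if __name__ == "__main__":
    print(k2p_probabilities(alpha=0.002, beta=0.0005, t=100))
```

Setting alpha equal to beta collapses this to the one-parameter case.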

 

NUMBER OF NUCLEOTIDE SUBSTITUTIONS BETWEEN TWO DNA SEQUENCES

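Hamming distance (n) or degree of divergence (n/N x 100%) - N: sequence length, n: number of sites that differ between the two sequences

Multiple substitutions (multiple hits) at the same site - the observed difference then underestimates the actual number of substitutions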
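
To make the multiple-hits problem concrete, here is a hedged sketch that counts observed differences between two aligned, gap-free sequences and then applies the usual Jukes-Cantor and Kimura two-parameter corrections. The function names are hypothetical, and the code assumes equal-length sequences containing only A, C, G, T.

```python
import math

PURINES = {"A", "G"}


def observed_differences(seq1, seq2):
    """Return (P, Q): proportions of transitional and transversional differences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    n_sites = len(seq1)
    return transitions / n_sites, transversions / n_sites


def jc69_distance(p):
    """Substitutions per site under the one-parameter model (needs p < 0.75)."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def k2p_distance(P, Q):
    """Substitutions per site under the two-parameter model."""
    return (0.5 * math.log(1.0 / (1.0 - 2.0 * P - Q))
            + 0.25 * math.log(1.0 / (1.0 - 2.0 * Q)))


if __name__ == "__main__":
    P, Q = observed_differences("ACGTACGTAC", "ACGTACGTGT")
    print("observed p:", P + Q)
    print("JC69 K:", jc69_distance(P + Q))
    print("K2P K:", k2p_distance(P, Q))
```

Both corrected values come out larger than the observed proportion, which is the point of correcting for multiple hits.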

 

Number of substitutions between two noncoding sequences

pass...

 

Substitution schemes with more than two parameters

Blaisdell's four-parameter model (1985)

The more parameters a model has, the larger the estimation error

 

Violation of assumptions

The models assume that the probability of a certain substitution occurring at a site is not affected by

1. the context of surrounding nucleotides

2. the occurrence of a substitution at a different site

3. the history of substitutions at the site in question

 

Number of substitutions between two protein-coding genes

Harder than computing the number of substitutions between two noncoding sequences,

because a distinction must be made between synonymous (Ns) and nonsynonymous (NA) substitutions

Unweighted method - when two codons differ by more than one substitution, all pathways between them are treated as equally probable

Weighted method - employs a priori criteria to decide which pathway is more probable
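
A small sketch of the unweighted idea, assuming the standard genetic code and uppercase DNA codons: every order in which the differing positions could have changed is one pathway, each pathway is scored for synonymous and nonsynonymous steps, and the pathways are averaged with equal weight. Skipping pathways that pass through a stop codon is an extra assumption made here, and the names are illustrative.

```python
from itertools import permutations

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order; "*" marks a stop codon.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def unweighted_pathway_counts(codon1, codon2):
    """Average (synonymous, nonsynonymous) changes over equally likely pathways."""
    diff_positions = [i for i in range(3) if codon1[i] != codon2[i]]
    per_path = []
    for order in permutations(diff_positions):
        current = codon1
        syn = nonsyn = 0
        valid = True
        for pos in order:
            nxt = current[:pos] + codon2[pos] + current[pos + 1:]
            # Assumption: discard pathways with an intermediate stop codon.
            if CODON_TABLE[nxt] == "*" and nxt != codon2:
                valid = False
                break
            if CODON_TABLE[nxt] == CODON_TABLE[current]:
                syn += 1
            else:
                nonsyn += 1
            current = nxt
        if valid:
            per_path.append((syn, nonsyn))
    if not per_path:
        raise ValueError("no pathway avoids a stop codon")
    n_paths = len(per_path)
    return (sum(s for s, _ in per_path) / n_paths,
            sum(a for _, a in per_path) / n_paths)


if __name__ == "__main__":
    # TTT (Phe) vs GTA (Val): two pathways, via GTT (Val) or via TTA (Leu).
    print(unweighted_pathway_counts("TTT", "GTA"))
```

A weighted method would replace the plain average with pathway probabilities chosen beforehand.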

Calculating Ks and KA - three types of nucleotide sites

1. nondegenerate (L0)

2. twofold degenerate (L2)

3. fourfold degenerate (L4)
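
The three site classes can be sketched by checking, for each codon position, how many of the three possible changes leave the amino acid unchanged: none means nondegenerate, all three means fourfold degenerate. Treating every other case as twofold (including the threefold third position of the isoleucine codons) is a simplification assumed here. The code uses the standard genetic code and expects uppercase sense codons; the function names are hypothetical.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order; "*" marks a stop codon.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def site_degeneracy(codon):
    """Return the class (0, 2 or 4) of each of the three codon positions."""
    amino_acid = CODON_TABLE[codon]
    classes = []
    for pos in range(3):
        synonymous = sum(
            1
            for base in BASES
            if base != codon[pos]
            and CODON_TABLE[codon[:pos] + base + codon[pos + 1:]] == amino_acid
        )
        if synonymous == 0:
            classes.append(0)  # nondegenerate: every change alters the amino acid
        elif synonymous == 3:
            classes.append(4)  # fourfold: no change alters the amino acid
        else:
            classes.append(2)  # twofold (threefold sites lumped in here)
    return tuple(classes)


def count_site_classes(coding_seq):
    """Count L0, L2 and L4 sites over the complete codons of a coding sequence."""
    counts = {0: 0, 2: 0, 4: 0}
    usable = len(coding_seq) - len(coding_seq) % 3
    for start in range(0, usable, 3):
        for degeneracy in site_degeneracy(coding_seq[start:start + 3]):
            counts[degeneracy] += 1
    return counts


if __name__ == "__main__":
    print(site_degeneracy("CTA"))
    print(count_site_classes("ATGGGTTTT"))
```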

 

Indirect estimations of the number of nucleotide substitutions

K can also be obtained indirectly from other kinds of molecular data.

However, the sampling error can be large.

 

NUMBER OF AMINO ACID REPLACEMENTS BETWEEN TWO PROTEINS

 

p = n/L, n: the number of AA differences between 2 sequences, L: the length of aligned seqs

d = -ln(1-p), d: the number of AA replacements per site
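
The two formulas above can be sketched as follows, assuming two aligned protein sequences of equal length with no gaps. The function names are illustrative.

```python
import math


def aa_p_distance(prot1, prot2):
    """p = n / L: proportion of aligned positions with different amino acids."""
    if len(prot1) != len(prot2) or not prot1:
        raise ValueError("sequences must be aligned, equal in length, and non-empty")
    n_diff = sum(1 for a, b in zip(prot1, prot2) if a != b)
    return n_diff / len(prot1)


def aa_replacements_per_site(p):
    """d = -ln(1 - p): estimated amino acid replacements per site."""
    return -math.log(1.0 - p)


if __name__ == "__main__":
    p = aa_p_distance("MKTAYIAKQR", "MKTAYIGKQR")
    print("p =", p, "d =", aa_replacements_per_site(p))
```

For small p the two values are nearly equal; d grows faster than p as the sequences diverge.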

 

ALIGNMENT OF NUCLEOTIDE AND AMINO ACID SEQUENCES

 

Sequence alignment - comparison of two homologous sequences

Amino acid alignments are more reliable than DNA alignments because

1. AAs change less frequently during evolution than NTs

2. 20 AAs vs. 4 NTs

3 Types of aligned pairs: matches, mismatches, gaps (with null base[-])

Terminal gaps / Internal gaps

Positional homology - a claim to the effect that the two members of the pair descended from a common ancestral nucleotide

 

Manual alignment by visual inspection

Advantage: uses the brain / domain knowledge can be applied

Disadvantage: subjective / results cannot be compared with other alignments

 

The dot matrix method

The two sequences to be aligned are written out as the column and row headings of a two-dimensional matrix

Dot matrix plot - window size & stringency
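
A minimal dot matrix sketch, assuming that "window size" is the number of consecutive positions compared at once and "stringency" is the minimum number of matches inside that window needed to place a dot. The function names and the text rendering are illustrative only.

```python
def dot_matrix(seq1, seq2, window=1, stringency=1):
    """Build a dot matrix with seq1 along the rows and seq2 along the columns.

    Cell (i, j) is True when the windows starting at seq1[i] and seq2[j]
    share at least `stringency` matching positions.
    """
    n_rows = len(seq1) - window + 1
    n_cols = len(seq2) - window + 1
    matrix = []
    for i in range(n_rows):
        row = []
        for j in range(n_cols):
            matches = sum(seq1[i + k] == seq2[j + k] for k in range(window))
            row.append(matches >= stringency)
        matrix.append(row)
    return matrix


def render(matrix):
    """Draw dots as '*' and empty cells as '.'."""
    return "\n".join("".join("*" if cell else "." for cell in row) for row in matrix)


if __name__ == "__main__":
    print(render(dot_matrix("GATTACAGATTACA", "GATTACA", window=3, stringency=3)))
```

Raising the window size and stringency removes isolated chance matches, so that similar regions stand out as diagonal runs.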

 

Distance and similarity methods

Optimal alignment - the best possible alignment between two sequences

Trying to reduce the number of mismatches increases the number of gaps

The optimal alignment is the one that minimizes both gaps and mismatches

Gap penalty (Gap cost) - gap-opening penalty, gap-extension penalty

1. fixed gap penalty system

2. affine or linear gap penalty system

3. logarithmic gap penalty system
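
The three systems can be sketched as functions of the gap length k. This assumes the usual reading: a fixed penalty ignores gap length, an affine (linear) penalty adds an extension cost proportional to k, and a logarithmic penalty adds an extension cost proportional to ln(k). The default penalty values are arbitrary placeholders, not values from the book.

```python
import math


def fixed_gap_penalty(k, opening=2.0):
    """Same cost for any gap, regardless of its length k."""
    return opening


def affine_gap_penalty(k, opening=2.0, extension=0.5):
    """Opening cost plus an extension cost that grows linearly with k."""
    return opening + extension * k


def logarithmic_gap_penalty(k, opening=2.0, extension=0.5):
    """Opening cost plus an extension cost that grows with ln(k)."""
    return opening + extension * math.log(k)


if __name__ == "__main__":
    for k in (1, 2, 5, 10):
        print(k, fixed_gap_penalty(k), affine_gap_penalty(k),
              round(logarithmic_gap_penalty(k), 3))
```

Under the logarithmic system one long gap is penalized much less than several short ones, which is the practical difference between the three choices.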

Mismatch penalties

Distance (dissimilarity index, D)

Similarity index, S
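
A hedged sketch of how D and S could be computed for one given alignment. It assumes a common form of the two indices: D is the mismatch penalty times the number of mismatches plus the summed gap penalties, and S is the number of matches minus the summed gap penalties. The single mismatch penalty, the "-" gap character, and the function name are assumptions made for illustration.

```python
def alignment_scores(aligned1, aligned2, gap_penalty, mismatch_penalty=1.0):
    """Return (D, S) for two aligned rows that use '-' as the null base.

    gap_penalty: function mapping a gap length k to its cost.
    Assumes no column has a null base in both rows.
    """
    if len(aligned1) != len(aligned2):
        raise ValueError("aligned rows must have the same length")
    matches = mismatches = 0
    gap_lengths = []
    current_owner = None  # row holding the gap run in progress (1, 2 or None)
    for a, b in zip(aligned1, aligned2):
        owner = 1 if a == "-" else 2 if b == "-" else None
        if owner is None:
            if a == b:
                matches += 1
            else:
                mismatches += 1
        elif owner == current_owner:
            gap_lengths[-1] += 1  # extend the gap already open in this row
        else:
            gap_lengths.append(1)  # open a new gap
        current_owner = owner
    total_gap_cost = sum(gap_penalty(k) for k in gap_lengths)
    distance = mismatch_penalty * mismatches + total_gap_cost
    similarity = matches - total_gap_cost
    return distance, similarity


if __name__ == "__main__":
    affine = lambda k: 2.0 + 0.5 * k
    print(alignment_scores("ACG-T", "AGGCT", affine))
```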

 

Alignment algorithms

Goal: to choose the alignment associated with the smallest D (or the largest S) from among all possible alignments

Needleman-Wunsch algorithm (use dynamic programming)

Pointer

Traceback

path graph
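
A compact sketch of Needleman-Wunsch global alignment, showing the pointer matrix and the traceback. It maximizes a similarity score rather than minimizing D, and it uses a simple per-position gap cost; the scoring values and names are assumptions chosen for illustration.

```python
def needleman_wunsch(seq1, seq2, match=1, mismatch=-1, gap=-2):
    """Global alignment by dynamic programming.

    Returns (aligned1, aligned2, score), with '-' as the null base.
    """
    n, m = len(seq1), len(seq2)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    pointer = [[None] * (m + 1) for _ in range(n + 1)]

    # First column and row: a prefix aligned entirely against gaps.
    for i in range(1, n + 1):
        score[i][0] = i * gap
        pointer[i][0] = "up"
    for j in range(1, m + 1):
        score[0][j] = j * gap
        pointer[0][j] = "left"

    # Fill each cell with the best of its three predecessors and
    # store a pointer to the predecessor that was used.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if seq1[i - 1] == seq2[j - 1] else mismatch
            diag = score[i - 1][j - 1] + pair
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            best = max(diag, up, left)
            score[i][j] = best
            pointer[i][j] = "diag" if best == diag else "up" if best == up else "left"

    # Traceback: follow the pointers from the last cell to the origin.
    row1, row2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        move = pointer[i][j]
        if move == "diag":
            row1.append(seq1[i - 1])
            row2.append(seq2[j - 1])
            i, j = i - 1, j - 1
        elif move == "up":
            row1.append(seq1[i - 1])
            row2.append("-")
            i -= 1
        else:
            row1.append("-")
            row2.append(seq2[j - 1])
            j -= 1
    return "".join(reversed(row1)), "".join(reversed(row2)), score[n][m]


if __name__ == "__main__":
    for line in needleman_wunsch("ACGT", "AGT"):
        print(line)
```

When several alignments tie for the best score, this version reports only one of them (diagonal moves are preferred).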

 

Multiple alignments

MACAW

CLUSTAL

MASH

Such alignments can frequently be improved by visual inspection