CAT : Empirical profile mixture models for phylogenetic reconstruction.
Le S.Q., Gascuel O., Lartillot N. Bioinformatics. 2008 Oct 15;24(20):2317-23.
Please cite these papers if you use CAT.
CAT (Lartillot and Philippe 2004) is a model specifically designed to account for site-specific features of protein evolution. In general, each position of a protein is under a very specific selective constraint, and as a result, only a subset of the 20 amino acids is likely to be accepted at this position over evolutionary time. As we have shown in previous work, accounting for such site-specific features is crucial, both to obtain a better statistical fit (Lartillot and Philippe 2006) and to alleviate phylogenetic artefacts due to long-branch attraction (Lartillot et al. 2007).
Technically, CAT is a mixture model with a given number K of components (or site classes). Each component specifies a biochemical profile, which is a probability vector over the 20 amino acids. Such a profile in turn defines a very simple amino-acid replacement process: each time a substitution event occurs, a new amino acid is chosen at random, according to the probabilities defined by the profile. We call this a Poisson process, although it is also known as a Felsenstein 1981, or proportional, amino-acid replacement process. The likelihood at each site of the alignment is then an average over all available Poisson processes defined by the mixture.
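The replacement process and the mixture average described above can be illustrated with a minimal Python sketch. It is not the CAT implementation: it assumes a toy two-sequence "tree" joined by a single branch, substitution events arriving at rate 1 (so the branch length t is counted in events, not in expected substitutions), and explicit mixture weights that sum to 1. The function names and the two example profiles are hypothetical and chosen only for illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 amino acids, indexed 0..19


def poisson_transition_prob(profile, a, b, t):
    """Probability of ending in state b after time t, starting from state a.

    Assumption: events occur at rate 1, and at each event the new amino acid
    is drawn from `profile` (it may be the same as the current one).
    With probability exp(-t) no event occurs and the state is unchanged;
    otherwise the final state is a draw from the profile.
    """
    no_event = math.exp(-t)
    return no_event * (1.0 if a == b else 0.0) + (1.0 - no_event) * profile[b]


def pair_site_likelihood(profile, a, b, t):
    """Likelihood of observing (a, b) at one site of a two-sequence alignment."""
    return profile[a] * poisson_transition_prob(profile, a, b, t)


def mixture_site_likelihood(profiles, weights, a, b, t):
    """Weighted average of the site likelihood over all mixture components."""
    return sum(
        w * pair_site_likelihood(p, a, b, t) for p, w in zip(profiles, weights)
    )


# Two hypothetical profiles (K = 2), for illustration only.
uniform = [1.0 / 20] * 20
favoured = {AMINO_ACIDS.index(x) for x in "ILV"}
restricted = [0.3 if i in favoured else 0.1 / 17 for i in range(20)]

profiles = [uniform, restricted]
weights = [0.5, 0.5]  # assumed component weights

i_idx = AMINO_ACIDS.index("I")
l_idx = AMINO_ACIDS.index("L")
d_idx = AMINO_ACIDS.index("D")

# An I/L pair is far more probable under the restricted profile than an I/D pair.
print(mixture_site_likelihood(profiles, weights, i_idx, l_idx, 0.5))
print(mixture_site_likelihood(profiles, weights, i_idx, d_idx, 0.5))
```

On a real tree, the per-component site likelihood would be computed with the usual pruning algorithm rather than with this two-sequence shortcut; only the final weighted average over components carries over unchanged.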
See also :