CAT: Empirical profile mixture models for phylogenetic reconstruction.

Le S.Q., Gascuel O., Lartillot N. Bioinformatics. 2008 Oct 15;24(20):2317-23.

Please cite THESE papers if you use CAT.

CAT (Lartillot and Philippe 2004) is a model devised specifically to account for site-specific features of protein evolution. In general, each position of a protein is under a very specific selective constraint; as a result, only a subset of the 20 amino acids is likely to be accepted at that position over evolutionary time. As we have shown in previous work, accounting for such site-specific features is crucial, both to obtain a better statistical fit (Lartillot and Philippe 2006) and to alleviate phylogenetic artefacts due to long-branch attraction (Lartillot et al. 2007).

Technically, CAT is a mixture model with a given number K of components (or site classes). Each component specifies a biochemical profile, which is a probability vector over the 20 amino acids. Such a profile in turn defines a very simple amino-acid replacement process: each time a substitution event occurs, a new amino acid is chosen at random, according to the probabilities defined by the profile. We call this a Poisson process, although it is also known as a Felsenstein 1981 (F81), or proportional, amino-acid replacement process. The likelihood at each site of the alignment is then an average over all the Poisson processes defined by the mixture.
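The mixture likelihood described above can be illustrated with a minimal Python sketch. It is not the CAT implementation: it handles only two sequences separated by a branch of length t rather than a full tree, it assumes substitution events arrive at rate 1 (actual programs typically rescale rates so that branch lengths are in expected substitutions per site), and it assumes each component k has a weight w_k with the weights summing to 1. The function names, the two profiles and the weights are hypothetical and chosen only for illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 states; the order is an arbitrary choice


def poisson_transition_prob(profile, i, j, t):
    """P(state j after time t | state i at time 0) under the Poisson process.

    Assumption: events arrive at rate 1. At each event the new amino acid
    is drawn from `profile`, independently of the current state.
    """
    no_event = math.exp(-t)  # probability that no event occurred
    return (no_event if i == j else 0.0) + (1.0 - no_event) * profile[j]


def pair_site_likelihood(weights, profiles, a, b, t):
    """Likelihood of amino acids a and b at one site of two sequences
    separated by branch length t, averaged over the mixture components."""
    i, j = AMINO_ACIDS.index(a), AMINO_ACIDS.index(b)
    total = 0.0
    for w, profile in zip(weights, profiles):
        # stationary probability of a, times probability of a -> b
        total += w * profile[i] * poisson_transition_prob(profile, i, j, t)
    return total


def make_profile(favoured, mass=0.9):
    """Hypothetical profile: `mass` shared by the favoured residues,
    the remainder spread uniformly over all other amino acids."""
    rest = (1.0 - mass) / (len(AMINO_ACIDS) - len(favoured))
    return [mass / len(favoured) if aa in favoured else rest
            for aa in AMINO_ACIDS]


# Two illustrative components: a hydrophobic site class and an acidic one.
profiles = [make_profile("ILVM"), make_profile("DE")]
weights = [0.5, 0.5]  # assumed component weights

print(pair_site_likelihood(weights, profiles, "I", "L", 0.5))
print(pair_site_likelihood(weights, profiles, "I", "D", 0.5))
```

With these illustrative profiles, a site showing I in one sequence and L in the other receives a higher likelihood than a site showing I and D, because both I and L are favoured by the same component. This is the sense in which a profile mixture captures site-specific constraints.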

See also: