PhyML 3.0 Benchmarks

Large-size data sets

Comparison of PhyML 3.0 tree search options and RAxML, using 20 alignments (10 DNA and 10 protein) extracted from Treebase.

See DNA benchmark - See protein benchmark - Download DNA large-size data sets - Download protein large-size data sets


Distribution of relative computing times: for each of the two sets of alignments (10 DNA and 10 protein large-size alignments), we measured the base-2 logarithm of the ratio between the computing time of a given method and that of the fastest approach on the same alignment. Thus, a log-ratio equal to X corresponds to a method being 2^X times slower than the fastest approach; e.g., with DNA alignments PhyML 2.4.5 NNI is roughly twice as fast as PhyML 3.0 NNI, whereas the two have nearly the same computing times with protein alignments.
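The log-ratio described above can be sketched in a few lines of Python. This is only an illustration: the function name `log2_time_ratios`, the method labels and the timings are hypothetical and are not benchmark measurements.

```python
import math

def log2_time_ratios(times):
    """Return log2(time / fastest time) for each method on one alignment."""
    fastest = min(times.values())
    return {method: math.log2(t / fastest) for method, t in times.items()}

# Hypothetical computing times in seconds for a single alignment.
example_times = {"method_A": 120.0, "method_B": 240.0, "method_C": 960.0}
ratios = log2_time_ratios(example_times)

for method, x in ratios.items():
    # A log-ratio of x means the method is 2**x times slower than the fastest.
    print(f"{method}: log-ratio {x:.1f}, {2 ** x:.0f}x the fastest time")
```

The fastest method always gets a log-ratio of 0, so the distribution for each method starts at 0 and grows by one unit for every doubling of computing time.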

DNA             Av. LogLk rank   Delta>5   P-value<0.05   Av. RF distance
PhyML 3.0 NNI   3.5              10        7              0.46
PhyML 3.0 SPR   1.4              3         0              0.15

PROTEIN         Av. LogLk rank   Delta>5   P-value<0.05   Av. RF distance
PhyML 3.0 NNI   2.65             6         3              0.20
PhyML 3.0 SPR   2.75             7         0              0.18

Comparison of log-likelihoods on 10 DNA and 10 protein large-size data sets. The column ‘Av. LogLk rank’ gives the average log-likelihood rank of each method. These ranks are corrected by taking into account information on tree topologies. ‘Delta>5’ gives the number of cases (out of 10) for which the difference in log-likelihood between the method of interest and the highest log-likelihood for the corresponding data set is greater than 5. The column ‘P-value<0.05’ displays the number of cases for which the difference in log-likelihood between the method of interest and the corresponding highest log-likelihood is statistically significant (SH test). The ‘Av. RF distance’ values are the average Robinson and Foulds topological distances between the trees estimated by the method of interest and the corresponding most likely trees (0 corresponds to identical trees, while 1 means that the two trees do not have any clade in common).
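Two of the columns described above are simple enough to sketch in Python. The sketch below is illustrative only and rests on assumptions: trees are represented as sets of non-trivial bipartitions, each stored as the frozenset of taxa on the side that excludes a fixed reference taxon (here A), and the RF distance is normalised by the total number of bipartitions in both trees, which is one way to obtain the 0-to-1 scale mentioned in the caption. The function names and all values are hypothetical. The topology-corrected ranks and the SH test are not covered.

```python
def count_delta_gt(method_loglks, best_loglks, threshold=5.0):
    """Count data sets where the best log-likelihood exceeds the method's by more than threshold."""
    return sum(1 for m, b in zip(method_loglks, best_loglks) if b - m > threshold)

def normalized_rf(splits_a, splits_b):
    """Normalised Robinson-Foulds distance between two sets of bipartitions."""
    a, b = set(splits_a), set(splits_b)
    total = len(a) + len(b)
    if total == 0:
        return 0.0  # two star trees share every (i.e. no) internal split
    # Symmetric difference: bipartitions present in exactly one of the trees.
    return len(a ^ b) / total

# Hypothetical log-likelihoods on three data sets (differences: 0, 10, 3).
delta_count = count_delta_gt([-1000.0, -2010.0, -3003.0],
                             [-1000.0, -2000.0, -3000.0])

# Hypothetical unrooted trees on taxa A..E, with A as the reference taxon.
tree1 = {frozenset("CDE"), frozenset("CD")}   # ((A,B),(C,D),E)
tree2 = {frozenset("CDE"), frozenset("CE")}   # ((A,B),(C,E),D)
tree3 = {frozenset("BDE"), frozenset("BD")}   # ((A,C),(B,D),E)

print(delta_count)                  # data sets with Delta > 5
print(normalized_rf(tree1, tree1))  # identical trees
print(normalized_rf(tree1, tree2))  # one shared bipartition out of two
print(normalized_rf(tree1, tree3))  # no bipartition in common
```

Averaging `normalized_rf` over the 10 data sets of a category would give a value comparable in spirit to the ‘Av. RF distance’ column, provided the same normalisation was used for the benchmark.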

Data sets

The benchmark contains 10 protein alignments and 10 DNA alignments.


Four combinations of programs and options were compared. All programs were configured with the GTR model for DNA sequences, with the WAG model for proteins, and with 4 discrete gamma rate categories (alpha estimated from the data).
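As a rough illustration of the settings above, the following Python sketch assembles a PhyML 3.0 command line for one alignment. It is not the script used for the benchmark. The option letters (`-i`, `-d`, `-m`, `-c`, `-a`, `-s`) are those of the PhyML 3.0 command-line interface as we understand it and should be checked against `phyml --help` for the installed version; the executable name `phyml`, the helper `phyml_command` and the file names are assumptions.

```python
# Substitution model per data type, as stated in the benchmark description.
MODELS = {"nt": "GTR", "aa": "WAG"}

def phyml_command(alignment, datatype, search):
    """Build an argument list for PhyML (hypothetical helper, not part of PhyML)."""
    if datatype not in MODELS:
        raise ValueError("datatype must be 'nt' (DNA) or 'aa' (protein)")
    if search not in ("NNI", "SPR"):
        raise ValueError("search must be 'NNI' or 'SPR'")
    return [
        "phyml",            # assumed executable name
        "-i", alignment,    # input alignment file
        "-d", datatype,     # data type: nucleotides or amino acids
        "-m", MODELS[datatype],
        "-c", "4",          # 4 discrete gamma rate categories
        "-a", "e",          # alpha estimated from the data
        "-s", search,       # tree search option
    ]

# Hypothetical file names, for illustration only.
dna_cmd = phyml_command("dna_alignment.phy", "nt", "SPR")
protein_cmd = phyml_command("protein_alignment.phy", "aa", "NNI")
print(" ".join(dna_cmd))
print(" ".join(protein_cmd))
```

Returning a list rather than a single string lets the command be passed directly to `subprocess.run` without shell quoting issues.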


The resulting trees are compared in terms of topology, log-likelihood and computing time, using the same criteria as for the medium-size data sets.