PhyML 3.0 Benchmarks

Efficiency of filtering strategies

Comparison of the distance-based PhyML SPR program (W. Hordijk and O. Gascuel, 2005) and PhyML 3.0 SPR, using parsimony filtering at various intensities.


SPR data sets

Distribution of relative computing times. For each of the two sets of alignments (50 DNA and 50 protein medium-size alignments), we measured the base-2 logarithm of the ratio between the computing time of the given method and that of the fastest approach on the same alignment. Thus, a log-ratio equal to X corresponds to a method being 2^X times slower than the fastest approach. The distance-based method is much faster than the others, and the unfiltered 3.0 version (PT=infinity) is by far the slowest. The PT=0 and PT=5 settings are close to each other and appear to be a good compromise.
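The log-ratio described above can be sketched in a few lines of Python. This is a minimal illustration, not the script used for the benchmark: the function name log2_time_ratios and the CPU times are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math


def log2_time_ratios(times):
    """Return log2(time / fastest time) for each method on one alignment.

    `times` maps a method name to its CPU time in seconds.
    The fastest method gets 0; a value of X means 2**X times slower.
    """
    fastest = min(times.values())
    return {method: math.log2(t / fastest) for method, t in times.items()}


# Hypothetical CPU times (seconds) for a single alignment, NOT benchmark data.
example_times = {
    "PhyML SPR": 10.0,
    "PhyML 3.0 SPR (PT=0)": 40.0,
    "PhyML 3.0 SPR (PT=5)": 80.0,
    "PhyML 3.0 SPR (PT=infinity)": 320.0,
}

ratios = log2_time_ratios(example_times)
for method, ratio in ratios.items():
    print(f"{method}: log2 ratio = {ratio:.2f}")
```

With these made-up times, the fastest method has a log-ratio of 0 and a method that is 32 times slower has a log-ratio of 5. Collecting these values over all 50 alignments of a set would give the distribution referred to in the caption.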

DNA                           Av. LogLk rank   Delta>5   P-value<0.05   Av. RF distance
PhyML SPR                     3.49             15        1              0.24
PhyML 3.0 SPR (PT=0)          2.43             3         0              0.09
PhyML 3.0 SPR (PT=5)          2.17             2         0              0.06
PhyML 3.0 SPR (PT=infinity)   1.91             2         0              0.05

PROTEIN                       Av. LogLk rank   Delta>5   P-value<0.05   Av. RF distance
PhyML SPR                     2.85             7         2              0.18
PhyML 3.0 SPR (PT=0)          2.69             2         0              0.12
PhyML 3.0 SPR (PT=5)          2.25             2         0              0.06
PhyML 3.0 SPR (PT=infinity)   2.12             1         0              0.04

Performance of the parsimony filter. The PhyML-SPR filter (Hordijk and Gascuel, 2005) uses the distance-based minimum evolution principle, while the PhyML 3.0 SPR filter uses parsimony. When PT=infinity, all SPRs are evaluated with likelihood, without any preliminary filtering. Conversely, PT=0 corresponds to strong filtering (see text). The column ‘Av. LogLk rank’ gives the average log-likelihood rank of each method. These ranks are corrected by taking into account information on tree topologies (see text). ‘Delta>5’ gives the number of cases (out of 50) in which the log-likelihood of the method of interest is more than 5 units below the highest log-likelihood obtained for the corresponding data set. The column ‘P-value<0.05’ gives the number of cases in which the difference in log-likelihood between the method of interest and the corresponding highest log-likelihood is statistically significant (SH test). The ‘Av. RF distance’ values are the average Robinson and Foulds topological distances between the trees estimated by the method of interest and the corresponding most likely trees (0 corresponds to identical trees, while 1 means that the two trees do not have any clade in common).
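Two of the columns above, ‘Delta>5’ and ‘Av. RF distance’, are simple enough to illustrate with a short Python sketch. The code below is an assumption-laden illustration rather than the benchmark's actual tooling: the function names, the representation of a tree as a set of non-trivial bipartitions, and all numbers are hypothetical. The normalization (dividing by the total number of bipartitions in both trees) is one common convention that yields 0 for identical trees and 1 for trees sharing no bipartition. The rank correction and the SH test are not covered.

```python
def count_delta_above(loglks_by_dataset, method, threshold=5.0):
    """Count data sets where `method` falls more than `threshold`
    log-likelihood units below the best method on that data set."""
    count = 0
    for loglks in loglks_by_dataset:
        best = max(loglks.values())
        if best - loglks[method] > threshold:
            count += 1
    return count


def normalized_rf(splits_a, splits_b):
    """Normalized Robinson-Foulds distance between two trees.

    Each tree is a set of non-trivial bipartitions; each bipartition is
    a frozenset holding the taxa on the side that excludes a fixed
    reference taxon, so equal splits compare equal.
    """
    total = len(splits_a) + len(splits_b)
    if total == 0:
        return 0.0
    # Symmetric difference: bipartitions present in exactly one tree.
    return len(splits_a ^ splits_b) / total


# Hypothetical log-likelihoods for two data sets, NOT benchmark data.
example_loglks = [
    {"fast": -1010.0, "slow": -1000.0},  # "fast" is 10 units below the best
    {"fast": -2001.0, "slow": -2000.0},  # "fast" is 1 unit below the best
]
delta_fast = count_delta_above(example_loglks, "fast")
delta_slow = count_delta_above(example_loglks, "slow")

# Two hypothetical 5-taxon trees, with taxon "A" as the reference:
# ((A,B),C,(D,E)) and ((A,C),B,(D,E)).
tree_1 = {frozenset("CDE"), frozenset("DE")}
tree_2 = {frozenset("BDE"), frozenset("DE")}
rf_12 = normalized_rf(tree_1, tree_2)
rf_11 = normalized_rf(tree_1, tree_1)

print(delta_fast, delta_slow, rf_12, rf_11)
```

In this toy example the two trees share one of their two internal bipartitions, so the normalized distance is 0.5. Averaging such distances over the 50 data sets of a set would give a value comparable to the ‘Av. RF distance’ column.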

Data sets

The benchmark contains 50 protein alignments and 50 DNA alignments.


All programs were run on a cluster of 24 computing nodes with Intel(R) Xeon(R) 5140 CPUs @ 2.33GHz and 8GB of RAM per bi-dual-core unit. Times are comparable because only effective CPU computing time was counted.


Four program and option combinations were compared. All programs were configured with the GTR model for DNA sequences, the WAG model for proteins, and 4 discrete gamma rate categories (alpha estimated from the data).


The resulting trees are compared in terms of topology, log-likelihood, and computing time.