Quartet-based phylogenetic inference: improvements and limits.

Ranwez V., Gascuel O. Molecular Biology and Evolution. 2001 18: 1103-1116.

Please cite THIS paper if you use these datasets.

Key words:

Phylogenetic reconstruction, quartet methods, tree consensus, maximum-likelihood, parsimony, distance methods, computer simulations.

This page provides the benchmarks used in this paper.

Our experimental tests followed a protocol used within a similar framework by Kumar and later by Gascuel. Six model trees were considered, each consisting of 12 taxa:
[Figure: the six model trees]
The first three (AA, BB, AB) satisfy the molecular-clock hypothesis, while the other three (CC, DD, CD) have substitution rates that vary among lineages. Each interior branch is one unit long, where the unit is a for the constant-rate trees and b for the variable-rate trees; the lengths of external branches are given as multiples of a or b. For each of these model trees, we studied four evolutionary conditions (evolution rates).

For each tree T and evolutionary condition, we used Seq-Gen (v1.06) to generate 1000 data files with sequences of length 300 and 1000 data files with sequences of length 600. These sequences were obtained by simulating evolution along T under the Kimura two-parameter model with a transition/transversion ratio of 2. We thus tested the different methods on 48,000 test sets, corresponding to 2 sequence lengths, 4 evolution rates and 6 model trees, with 1000 data files for each combination.

These test files are available in one zip file (62 MB) or in separate zip files (about 700 KB for 300 nucleotides).
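The total number of test sets follows directly from the simulation design. The short Python sketch below enumerates that design to make the arithmetic explicit. The four evolutionary conditions are not named on this page, so they appear as placeholder indices 1 to 4; the variable and function names are illustrative only and do not reflect the naming of the distributed files.

```python
from itertools import product

# Factors of the simulation design, as described in the text.
MODEL_TREES = ["AA", "BB", "AB", "CC", "DD", "CD"]
SEQUENCE_LENGTHS = [300, 600]
# Assumption: the four evolutionary conditions are represented by
# placeholder indices, since their values are not listed here.
CONDITIONS = [1, 2, 3, 4]
REPLICATES = 1000  # data files per (tree, condition, length) combination


def design_grid():
    """Return every (tree, condition, sequence length) combination."""
    return list(product(MODEL_TREES, CONDITIONS, SEQUENCE_LENGTHS))


grid = design_grid()
total_files = len(grid) * REPLICATES
print(len(grid), "combinations,", total_files, "test sets")
```

Each of the 48 combinations contributes 1000 data files, which gives the 48,000 test sets mentioned above.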

Trees satisfying the molecular-clock hypothesis:

Tree AA: (((((T1:4a, T2:4a):a, T3:5a):a, T4:6a):a, T5:7a):a, T6:8a, (((((T7:4a, T8:4a):a, T9:5a):a, T10:6a):a, T11:7a):a, T12:8a):a);

Tree BB: (((T1:6a, T2:6a):a, T3:7a):a, (T4:7a, (T5:6a, T6:6a):a):a, (((T7:6a, T8:6a):a, T9:7a):a, (T10:7a, (T11:6a, T12:6a):a):a):a);

Tree AB: (((((T1:4a, T2:4a):a, T3:5a):a, T4:6a):a, T5:7a):a, T6:8a, (((T7:6a, T8:6a):a, T9:7a):a, (T10:7a, (T11:6a, T12:6a):a):a):a);
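The clock property of these trees can be checked mechanically. The Python sketch below is an illustration, not part of the original protocol: it parses the symbolic Newick string of tree AA, computes each taxon's distance from the top-level node in multiples of a, and then re-roots at the midpoint of the central branch. Placing the root at that midpoint is an assumption made here, because the trees are written with a trifurcation at the top; the helper names `parse_length` and `tip_depths` are hypothetical.

```python
import re

TREE_AA = ("(((((T1:4a, T2:4a):a, T3:5a):a, T4:6a):a, T5:7a):a, T6:8a, "
           "(((((T7:4a, T8:4a):a, T9:5a):a, T10:6a):a, T11:7a):a, T12:8a):a);")


def parse_length(text, unit):
    """Convert '4a' to 4.0 and 'a' to 1.0 (lengths in multiples of `unit`)."""
    coeff = text[:-len(unit)] if text.endswith(unit) else text
    return float(coeff) if coeff else 1.0


def tip_depths(newick, unit="a"):
    """Map each taxon to its distance from the top-level Newick node."""
    tokens = re.findall(r"[(),;]|[^(),;\s]+", newick)
    stack = [[]]  # stack[-1] collects (taxon, depth) pairs of the open clade
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":
            stack.append([])
        elif tok == ")":
            clade = stack.pop()
            # An optional ':length' token after ')' is the clade's own branch.
            if i + 1 < len(tokens) and tokens[i + 1].startswith(":"):
                extra = parse_length(tokens[i + 1][1:], unit)
                clade = [(name, depth + extra) for name, depth in clade]
                i += 1
            stack[-1].extend(clade)
        elif tok not in (",", ";"):
            name, length = tok.split(":")
            stack[-1].append((name, parse_length(length, unit)))
        i += 1
    return dict(stack[0])


depths = tip_depths(TREE_AA)
# Assumption: root at the midpoint of the central branch (length a), which
# adds a/2 to T1..T6 and removes a/2 from T7..T12.
left_side = {f"T{i}" for i in range(1, 7)}
rooted = {t: d + 0.5 if t in left_side else d - 0.5 for t, d in depths.items()}
print(sorted(set(rooted.values())))
```

A single distinct root-to-tip distance means that every taxon is equidistant from the chosen root, which is what the molecular-clock hypothesis requires.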

Trees with varying substitution rates among lineages:

Tree CC: ((((T1:b, T2:9b):b, T3:b):b, (T4:b, T5:9b):b):b, T6:8b, ((((T7:b, T8:9b):b, T9:b):b, (T10:b, T11:9b):b):b, T12:8b):b);

Tree DD: ((((T1:b, T2:9b):b, (T3:b, T4:9b):b):b, T5:b):b, T6:8b, ((((T7:b, T8:9b):b, (T9:b, T10:9b):b):b, T11:b):b, T12:8b):b);

Tree CD: ((((T1:b, T2:9b):b, T3:b):b, (T4:b, T5:9b):b):b, T6:8b, ((((T7:b, T8:9b):b, (T9:b, T10:9b):b):b, T11:b):b, T12:8b):b);
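The branch lengths above are symbolic (multiples of a or b), whereas standard Newick readers expect numeric lengths. The Python sketch below shows one way to substitute a number for the unit. The value 0.01 is purely hypothetical, since the actual values of a and b for each evolutionary condition are not listed on this page, and `instantiate` is an illustrative helper name.

```python
import re

TREE_CC = ("((((T1:b, T2:9b):b, T3:b):b, (T4:b, T5:9b):b):b, T6:8b, "
           "((((T7:b, T8:9b):b, T9:b):b, (T10:b, T11:9b):b):b, T12:8b):b);")


def instantiate(newick, unit, value):
    """Replace symbolic lengths such as ':9b' or ':b' with numeric ones."""
    pattern = re.compile(r":(\d*)" + re.escape(unit) + r"\b")

    def repl(match):
        # A missing coefficient (':b') means one unit.
        coeff = int(match.group(1)) if match.group(1) else 1
        return ":" + format(coeff * value, "g")

    return pattern.sub(repl, newick)


# Hypothetical unit length, chosen only to illustrate the substitution.
numeric_cc = instantiate(TREE_CC, "b", 0.01)
print(numeric_cc)
```

The same function applies to the clock-like trees by passing "a" as the unit.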