Quartet based phylogenetic inference: improvement and limits.
Ranwez V., Gascuel O. Molecular Biology and Evolution. 2001 18: 1103-1116.
Please cite this paper if you use these datasets.
Key words:
Phylogenetic reconstruction, quartet methods, tree consensus, maximum-likelihood, parsimony, distance methods, computer simulations.
This page provides the benchmarks used in this paper.
Our experimental tests followed a protocol used within a similar framework by Kumar and later by Gascuel. Six model trees were considered, each consisting of 12 taxa (their Newick descriptions are given at the end of this page).

The first three (AA, BB, AB) satisfy the molecular-clock hypothesis, while the other three (CC, DD, CD) present varying substitution rates among lineages. Each interior branch is one unit long, where the unit is a for the constant-rate trees and b for the variable-rate trees, and the lengths of external branches are given as multiples of a or b. For each of these model trees, we studied four evolutionary conditions:
- a low evolutionary rate, for which the Maximum Pairwise Divergence (MD) is about 0.1 substitutions per site (a=0.00625 and b=0.005)
- a medium evolutionary rate, for which MD is about 0.3 substitutions per site (a=0.0185 and b=0.015)
- a fast evolutionary rate, for which MD is about 1.0 substitutions per site (a=0.0625 and b=0.05)
- a very fast evolutionary rate, for which MD is about 2.0 substitutions per site (a=0.125 and b=0.1)
For each tree T and evolutionary condition, we used Seq-Gen (v1.06) to generate 1000 data files with sequences of length 300, and 1000 data files with sequences of length 600. These sequences were obtained by simulating an evolutionary process along T according to the Kimura two-parameter model with a transition/transversion ratio of 2. We thus tested the different methods on 48000 test sets, corresponding to 2 sequence lengths, 4 evolutionary rates and 6 model trees, with 1000 replicates each. These test files are available in one zip file (62 MB) or in separate zip files (about 700 KB for 300 nucleotides).
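As a rough cross-check of the count above, the experimental grid can be enumerated in a few lines of Python. This is only an illustrative sketch: the names `TREES`, `RATES`, `LENGTHS`, `REPLICATES` and `conditions` are hypothetical and do not correspond to anything in the distributed files; the numeric values are those listed on this page.

```python
from itertools import product

# Model trees; the first three are clock-like (unit a), the others are not (unit b).
TREES = ["AA", "BB", "AB", "CC", "DD", "CD"]
CLOCK_TREES = {"AA", "BB", "AB"}

# Branch-length units for each evolutionary condition, as listed on this page.
RATES = {
    "low": {"a": 0.00625, "b": 0.005},
    "medium": {"a": 0.0185, "b": 0.015},
    "fast": {"a": 0.0625, "b": 0.05},
    "very fast": {"a": 0.125, "b": 0.1},
}
LENGTHS = [300, 600]   # sequence lengths in nucleotides
REPLICATES = 1000      # data files per (tree, rate, length) combination


def conditions():
    """Yield one record per (tree, rate, sequence length) combination."""
    for tree, rate, length in product(TREES, RATES, LENGTHS):
        unit_name = "a" if tree in CLOCK_TREES else "b"
        yield {
            "tree": tree,
            "rate": rate,
            "length": length,
            "unit": RATES[rate][unit_name],
        }


grid = list(conditions())
total_datasets = len(grid) * REPLICATES
print(len(grid), total_datasets)  # 48 conditions, 48000 data sets
```

Each record carries the branch-length unit that applies to its tree, which is the value to substitute for a or b in the tree descriptions.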
Trees satisfying molecular-clock hypothesis:
Tree AA: (((((T1:4a, T2:4a):a, T3:5a):a, T4:6a):a, T5:7a):a, T6:8a, (((((T7:4a, T8:4a):a, T9:5a):a, T10:6a):a, T11:7a):a, T12:8a):a);
Tree BB: (((T1:6a, T2:6a):a, T3:7a):a, (T4:7a, (T5:6a, T6:6a):a):a, (((T7:6a, T8:6a):a, T9:7a):a, (T10:7a, (T11:6a, T12:6a):a):a):a);
Tree AB: (((((T1:4a, T2:4a):a, T3:5a):a, T4:6a):a, T5:7a):a, T6:8a, (((T7:6a, T8:6a):a, T9:7a):a, (T10:7a, (T11:6a, T12:6a):a):a):a);
Trees with varying substitution rates among lineages:
Tree CC: ((((T1:b, T2:9b):b, T3:b):b, (T4:b, T5:9b):b):b, T6:8b, ((((T7:b, T8:9b):b, T9:b):b, (T10:b, T11:9b):b):b, T12:8b):b);
Tree DD: ((((T1:b, T2:9b):b, (T3:b, T4:9b):b):b, T5:b):b, T6:8b, ((((T7:b, T8:9b):b, (T9:b, T10:9b):b):b, T11:b):b, T12:8b):b);
Tree CD: ((((T1:b, T2:9b):b, T3:b):b, (T4:b, T5:9b):b):b, T6:8b, ((((T7:b, T8:9b):b, (T9:b, T10:9b):b):b, T11:b):b, T12:8b):b);
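The tree descriptions above use symbolic branch lengths (multiples of a or b), so they need a small amount of processing before most tools can read them. The following is a minimal sketch, not the code used in the paper: it parses such a string with a hand-written parser and computes the longest leaf-to-leaf path in branch-length units. The function names `parse_newick` and `max_pairwise_units` are hypothetical, and the longest path length is used here only as a proxy for the maximum pairwise divergence.

```python
import re

TREE_AA = ("(((((T1:4a, T2:4a):a, T3:5a):a, T4:6a):a, T5:7a):a, T6:8a, "
           "(((((T7:4a, T8:4a):a, T9:5a):a, T10:6a):a, T11:7a):a, T12:8a):a);")


def parse_newick(text):
    """Parse a Newick string with lengths like '4a' or 'b'.

    Returns nested tuples (children, name, length), where length is the
    integer multiple of the unit (0 for the root, which has no branch).
    """
    s = re.sub(r"\s+", "", text).rstrip(";")
    pos = 0

    def length():
        nonlocal pos
        if pos < len(s) and s[pos] == ":":
            m = re.match(r":(\d*)[ab]", s[pos:])
            pos += m.end()
            return int(m.group(1) or 1)  # a bare 'a' or 'b' means one unit
        return 0

    def node():
        nonlocal pos
        children, name = [], None
        if s[pos] == "(":
            pos += 1
            children.append(node())
            while s[pos] == ",":
                pos += 1
                children.append(node())
            pos += 1  # skip the closing ')'
        else:
            m = re.match(r"[^:,()]+", s[pos:])
            name = m.group(0)
            pos += m.end()
        return (children, name, length())

    return node()


def max_pairwise_units(tree):
    """Return the longest leaf-to-leaf path length, in branch-length units."""
    best = 0

    def deepest(node):
        nonlocal best
        children = node[0]
        if not children:
            return 0
        # Depth of the deepest leaf reachable through each child branch.
        depths = sorted((deepest(c) + c[2] for c in children), reverse=True)
        best = max(best, depths[0] + depths[1])
        return depths[0]

    deepest(tree)
    return best


units = max_pairwise_units(parse_newick(TREE_AA))
md = units * 0.00625  # low evolutionary rate, a = 0.00625
print(units, md)
```

For tree AA the longest path is 17 units, so the low-rate condition gives 17 x 0.00625, roughly 0.106 substitutions per site, in line with the stated MD of about 0.1.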