PhyML 3.0: new algorithms, methods and utilities
Guindon S., Gascuel O.
Systematic Biology, 52(5):696-704, 2003.
Frequently Asked Questions
- Does PhyML handle outgroup sequences ?
No, PhyML does not distinguish between outgroup and ingroup sequences.
The best way to take outgroup sequences into account is to run two separate analyses.
The first analysis should be conducted on the set of aligned sequences excluding the outgroup sequences.
This data set is used to estimate the ingroup phylogeny.
The second analysis includes the whole set of sequences.
The tree corresponds to the ingroup+outgroup phylogeny.
The third step is to manually position the root on the ingroup phylogeny using the ingroup+outgroup phylogeny.
The advantage of this technique is that the ingroup phylogeny is estimated without the distantly related outgroup sequences, which avoids the long-branch attraction they can cause.
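The first two steps above amount to preparing two input files from the same alignment. The sketch below is only an illustration: the function names `split_outgroup` and `to_phylip` are hypothetical, the alignment is assumed to be already loaded as a dictionary of aligned sequences, and the simple sequential PHYLIP-like layout written here is an assumption that should be checked against the User guide.

```python
def split_outgroup(alignment, outgroup_names):
    """Return (ingroup_only, full) alignments.

    alignment: dict mapping sequence name -> aligned sequence (assumed input).
    outgroup_names: names of the sequences to treat as outgroup.
    """
    outgroup = set(outgroup_names)
    missing = outgroup - set(alignment)
    if missing:
        raise ValueError(f"unknown outgroup sequences: {sorted(missing)}")
    # Analysis 1: ingroup sequences only.
    ingroup = {name: seq for name, seq in alignment.items() if name not in outgroup}
    # Analysis 2: the whole set of sequences, unchanged.
    return ingroup, dict(alignment)


def to_phylip(alignment):
    """Format an alignment as simple sequential PHYLIP-like text (assumed layout)."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must be aligned to the same length")
    n_sites = lengths.pop()
    lines = [f"{len(alignment)} {n_sites}"]
    lines += [f"{name}  {seq}" for name, seq in alignment.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy alignment with made-up names; "Out" plays the role of the outgroup.
    aln = {"A": "ACGT", "B": "ACGA", "Out": "TCGA"}
    ingroup, full = split_outgroup(aln, ["Out"])
    print(to_phylip(ingroup))  # input for the ingroup analysis
    print(to_phylip(full))     # input for the ingroup+outgroup analysis
```

The rooting step itself is still done by hand, by comparing the two resulting trees.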
- Does PhyML estimate clock-constrained trees ?
No, PhyML cannot do that at the moment. However, future releases of the program will include this feature.
- Can PhyML analyse partitioned data, such as multiple gene sequences ?
We are currently working on this topic. Future releases of the program will provide options to estimate trees from phylogenomic data sets, with the opportunity to use different substitution models on the different data partitions (e.g., different genes).
PhyML will also include specific algorithms to search the space of tree topologies for this type of data.
- Is there a way to estimate the PhyML computation time ?
Yes. A rough estimate can be obtained with the formula: T * T * L * A * B * I
T (Taxa) : number of taxa
L (Length) : sequence length
A (Alphabet) : 1 for nucleotides, 12 for amino acids
B (Bootstrap): number of bootstrap replicates + number of random starting trees
I (tree Improvement) : 1 for NNI, 4 for SPR
The result is proportional to the PhyML computation time but should be interpreted with care.
Indeed, the PhyML computation time also depends on the number of tree improvement iterations.
This number depends directly on the input data and cannot be estimated in advance.
When using the PhyML online server, you can divide the resulting value by 100 000 to get a rough estimate of the computation time in seconds.
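As an illustration, the formula can be written as a small Python function. This is a minimal sketch: the function and parameter names are hypothetical, and treating B as at least 1 (so that a single run without bootstrap gives a non-zero value) is an assumption not stated above.

```python
def phyml_cost(taxa, length, amino_acids=False,
               bootstrap=0, random_starts=0, spr=False):
    """Rough cost T * T * L * A * B * I, proportional to computation time."""
    alphabet = 12 if amino_acids else 1          # A: 1 nucleotides, 12 amino acids
    # B: bootstrap replicates + random starting trees.
    # Assumption: counted as at least 1 for a plain single run.
    runs = max(1, bootstrap + random_starts)
    improvement = 4 if spr else 1                # I: 1 for NNI, 4 for SPR
    return taxa * taxa * length * alphabet * runs * improvement


def online_server_seconds(cost):
    """Rough time in seconds on the online server (cost divided by 100 000)."""
    return cost / 100_000


if __name__ == "__main__":
    # Example: 50 taxa, 1000 nucleotide sites, 100 bootstrap replicates, NNI.
    cost = phyml_cost(50, 1000, bootstrap=100)
    print(cost, online_server_seconds(cost))
```

For 50 taxa, 1000 nucleotide sites, 100 bootstrap replicates and NNI, the formula gives 50 * 50 * 1000 * 1 * 100 * 1 = 250 000 000, i.e. roughly 2500 seconds on the online server. As noted above, this is only an order of magnitude.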
- There is no tree attached to the result e-mail I received.
Carefully read the e-mail content.
The PhyML output should indicate what happened during PhyML execution.
This problem is most often due to a badly formatted input file. See the "Option" section in the User guide.
- I ran PhyML online several days ago and I still have not received a result e-mail.
The PhyML execution was probably interrupted before it finished.
This is likely due to a badly formatted input file.
See the "Option" section in the User guide.
- Where can I get the sources for PhyML ?
PhyML was developed under the GNU General Public License (GPL).
The sources are freely available.
However, we like to know what people do with our code.
Hence, to get the files, send an e-mail to Stéphane Guindon
with just a few words about the reasons why you need the sources.