ERaBLE: Evolutionary Rates and Branch Lengths Estimation

Fast and accurate branch lengths estimation for phylogenomic trees.

Manuel Binet, Olivier Gascuel, Celine Scornavacca, Emmanuel J.P. Douzery and Fabio Pardi. BMC Bioinformatics. 2016, 17:23.

Please cite THIS paper if you use ERaBLE.

User guide:

1 Availability

Binaries and source code: Downloads

Bug reports: If you come across an issue, please report it by sending an email to manuel.binet AT lirmm DOT fr

2 Installing ERaBLE

2.1 Compilation for UNIX/Linux/Mac
In a command-line window, go to the directory that contains the archive erable*.tar.gz and type:
2.2 Installation for Windows
Copy the files erable.exe and erable.bat into the same directory. Edit the command line in erable.bat with a text editor. To launch ERaBLE, click on the icon corresponding to erable.bat.

Alternatively, you can compile the sources with ./configure and make.

3 Input files

ERaBLE requires at least two input files: a matrices file and a tree topology file.

The matrices file contains a list of m distance matrices in PHYLIP format (lower triangular or square); the value of m must be given at the beginning of the file (see the example below). For each matrix, the sequence length (called N_k in the following) must follow the number of taxa, as in PHYLIP alignment files. If the sequence lengths are not available, you can set the same length value (e.g. 1) for all the matrices.

The tree topology file contains an unrooted, binary tree topology in NEWICK format; its branch lengths, if specified, are ignored.

In both the Windows and Unix versions, comments inside the input files start with the # character. The Unix version also accepts the // characters.

An example input matrices file containing square distance matrices (input_matrices.txt):
			
//comment: Gene names "%names" are optional

3

%Gene 43
4 381
Tax10      0.00000000  0.63692700  1.18140000  1.07080000
Tax3       0.63692700  0.00000000  0.70195800  0.59136000  
Tax20      1.18140000  0.70195800  0.00000000  0.61216600  
Tax4       1.07080000  0.59136000  0.61216600  0.00000000   

%Gene 60
4 210
Tax14      0.00000000  0.11026300  0.29855000  0.27683700  
Tax3       0.11026300  0.00000000  0.21525100  0.19353800  
Tax20      0.29855000  0.21525100  0.00000000  0.15977400  
Tax4       0.27683700  0.19353800  0.15977400  0.00000000  

%Gene 69
4 590
Tax34      0.00000000  0.39659600  0.42909900  0.42292600  
Tax14      0.39659600  0.00000000  0.46365800  0.45748400  
Tax20      0.42909900  0.46365800  0.00000000  0.28217300  
Tax4       0.42292600  0.45748400  0.28217300  0.00000000 			
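
To make the layout described above concrete, here is a minimal Python sketch that reads a matrices file of this form. It is not part of ERaBLE: the function name parse_matrices and the returned dictionary keys are hypothetical, and it assumes that comments occupy whole lines and that lower-triangular rows simply list fewer values than square rows.

```python
def parse_matrices(text):
    """Parse an ERaBLE-style matrices file into a list of per-gene dicts."""
    # Keep only meaningful lines: skip blanks and whole-line comments (# or //).
    lines = [
        ln.strip()
        for ln in text.splitlines()
        if ln.strip() and not ln.strip().startswith(("#", "//"))
    ]
    it = iter(lines)
    m = int(next(it))  # number of matrices, given at the beginning of the file
    genes = []
    for _ in range(m):
        line = next(it)
        name = None
        if line.startswith("%"):  # optional gene name
            name = line[1:].strip()
            line = next(it)
        n_taxa, seq_len = (int(x) for x in line.split())
        taxa = []
        dist = [[0.0] * n_taxa for _ in range(n_taxa)]
        for i in range(n_taxa):
            fields = next(it).split()
            taxa.append(fields[0])
            # Every value read is mirrored across the diagonal, so square rows
            # and (assumed) lower-triangular rows are handled the same way.
            for j, value in enumerate(fields[1:]):
                dist[i][j] = dist[j][i] = float(value)
        genes.append({"name": name, "seq_len": seq_len, "taxa": taxa, "dist": dist})
    return genes


# Two of the matrices from the example above, with m adjusted to 2.
SAMPLE = """\
//comment: Gene names "%names" are optional

2

%Gene 43
4 381
Tax10      0.00000000  0.63692700  1.18140000  1.07080000
Tax3       0.63692700  0.00000000  0.70195800  0.59136000
Tax20      1.18140000  0.70195800  0.00000000  0.61216600
Tax4       1.07080000  0.59136000  0.61216600  0.00000000

%Gene 60
4 210
Tax14      0.00000000  0.11026300  0.29855000  0.27683700
Tax3       0.11026300  0.00000000  0.21525100  0.19353800
Tax20      0.29855000  0.21525100  0.00000000  0.15977400
Tax4       0.27683700  0.19353800  0.15977400  0.00000000
"""

genes = parse_matrices(SAMPLE)
for g in genes:
    print(g["name"], g["seq_len"], g["taxa"])
```

A parser of this kind can be used to check a matrices file (number of matrices, taxa per matrix, sequence lengths) before passing it to ERaBLE.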
			
			

An example input tree topology file with branch lengths (ignored): input_tree_topology.nwk
 
			
(Tax10:0.51144,(Tax34:0.19411,(Tax20:0.19064,Tax4:0.15564):0.09995):0.09856,(Tax14:0.10233,Tax3:0.03293):0.01914);				
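
Since the branch lengths in the topology file are ignored, it can be handy to look at the bare topology and at the list of taxa it contains. The following Python sketch does this with two regular expressions. The helper names leaf_names and strip_branch_lengths are hypothetical, and the sketch assumes simple unquoted taxon names such as those in the example; it is not a full NEWICK parser.

```python
import re


def leaf_names(newick):
    """Return the leaf (taxon) names in the order they appear in the string."""
    # A leaf name directly follows "(" or ","; internal nodes start with "(".
    return re.findall(r"[(,]\s*([^():,;\s]+)", newick)


def strip_branch_lengths(newick):
    """Remove every ':<number>' so that only the topology remains."""
    return re.sub(r":\s*[0-9.eE+-]+", "", newick)


TREE = ("(Tax10:0.51144,(Tax34:0.19411,(Tax20:0.19064,Tax4:0.15564):0.09995)"
        ":0.09856,(Tax14:0.10233,Tax3:0.03293):0.01914);")

print(leaf_names(TREE))
print(strip_branch_lengths(TREE))
```

The list of leaf names can then be compared with the taxon names used in the matrices file, for instance to spot spelling differences between the two inputs.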
			
			

4 Output files

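ERaBLE writes two output files: one containing the tree with its estimated branch lengths, and one containing the gene rates. The output files for the command: erable -i input_matrices.txt -t input_tree_topology.nwk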
			
((Tax10:0.338333578449,(Tax34:0.233436677569,(Tax20:0.197509459236,Tax4:0.16370288841):0.134789424385):0.0372630086088):0.0329947660905,Tax3:0.0194998053449,Tax14:0.20724052087);
			
			

			
gene                 rate

Gene 43              1.62969
Gene 60              0.486296
Gene 69              0.776216	
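
The rates file shown above can be loaded for further analysis with a few lines of Python. This is a minimal sketch based only on the layout of the example (a header line, then one gene name and one rate per line). The function name parse_rates is hypothetical, and the splitting rule (the rate is the last whitespace-separated field) is an assumption chosen because gene names may contain spaces.

```python
def parse_rates(text):
    """Return a dict mapping each gene name to its estimated rate."""
    rates = {}
    for line in text.splitlines():
        # Split off the last field only, so names like "Gene 43" stay whole.
        parts = line.rsplit(None, 1)
        if len(parts) != 2:
            continue  # blank or malformed line
        name, value = parts
        try:
            rates[name.strip()] = float(value)
        except ValueError:
            continue  # header line ("gene  rate")
    return rates


RATES_TEXT = """\
gene                 rate

Gene 43              1.62969
Gene 60              0.486296
Gene 69              0.776216
"""

rates = parse_rates(RATES_TEXT)
for gene, rate in rates.items():
    print(gene, rate)
```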
		
			

5 Program usage

ERaBLE's usage is:
				 			
erable -i <infile(matrices)> -t <infile(tree)> <options> 

options: 
	-o <outfile>;
	-w <weights>: ols, wseq (default), or a user infile containing the weights;
	-m <system solving method>: ldlt (Cholesky without square root computation, default), llt (Cholesky), QR, LU;
	-p <integer>: output precision (default 12 significant digits);
	-c <constraint>: ols, cseq, nksumdist (default);
				
			

An example of syntax:
				 	
$ ./erable -i input_matrices.txt -t input_tree_topology.nwk
					
			
A run with this syntax estimates the branch lengths of the reference topology given in the file "input_tree_topology.nwk", and the gene rates, using the collection of distance matrices given in the file "input_matrices.txt". The branch lengths are given in the outfile with the suffix ".lengths.nwk" and the gene rates in the outfile with the suffix ".rates.txt".
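
When ERaBLE is called from a script, the command line above can be assembled programmatically. The Python sketch below only builds the argument list from the options listed in this section; the function name build_erable_command and its keyword arguments are hypothetical, and the default executable path ./erable is taken from the example and may differ on your system.

```python
def build_erable_command(matrices, tree, out=None, weights=None, method=None,
                         precision=None, constraint=None, exe="./erable"):
    """Build the argument list for an ERaBLE run (the program is not executed)."""
    cmd = [exe, "-i", matrices, "-t", tree]
    if out is not None:
        cmd += ["-o", out]
    if weights is not None:
        # "ols", "wseq", or the name of a user file containing the weights.
        cmd += ["-w", weights]
    if method is not None:
        if method not in {"ldlt", "llt", "QR", "LU"}:
            raise ValueError("unknown solving method: " + method)
        cmd += ["-m", method]
    if precision is not None:
        cmd += ["-p", str(int(precision))]
    if constraint is not None:
        if constraint not in {"ols", "cseq", "nksumdist"}:
            raise ValueError("unknown constraint: " + constraint)
        cmd += ["-c", constraint]
    return cmd


cmd = build_erable_command("input_matrices.txt", "input_tree_topology.nwk")
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the Python standard library to launch the program.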

The parameters can be specified as follows: