ATGC: PEWO

PEWO: Placement Evaluation WOrkflows

Benjamin Linard, Nikolai Romashchenko, Fabio Pardi, Eric Rivals

Bioinformatics, 2020.

Motivation

Metagenomic and metabarcoding projects, whether related to ecological studies, biodiversity or medical diagnostics, rely on the critical step of species identification. Based on sequence clustering or alignment to reference databases, the initial identification is often refined by contextualization in a taxonomy [1], but this ignores the potential resolution brought by more advanced phylogenetic models. An alternative is phylogenetic placement (PP) [2], a process in which query reads are “placed” on the branches of a reference phylogeny.

In 2019 and 2020, new implementations [3] and novel algorithms [4,5,6] resulted in PP tools scalable to current sequencing volumes (>10⁶ reads placed on a phylogeny in <30 min). With a total of 8 proposed algorithms, 6 different implementations and 4 fundamentally different approaches (distance-based, alignment-based, alignment-free), it may be hard to evaluate which PP solution best fits the needs of a particular study.

Standardized benchmarking of phylogenetic placement

Starting from our own evaluation pipeline developed in the context of the RAPPAS project (alignment-free phylogenetic placement), we developed PEWO (Placement Evaluation Workflows) [7], a set of automated experimental procedures that ensure the reproducibility of PP benchmarking in terms of both accuracy and computational resources (github.com/phylo42/PEWO). Beyond benchmarking, PEWO procedures also help to answer common questions that arise during metabarcoding or metagenomic experimental design:

Should I consider phylogenetic placement as part of my experimental design?
Which taxonomic marker (16S rRNA, cox1, ...) is likely to produce the most accurate identifications?
Which tool and parameters should be chosen to optimise the accuracy / computational cost trade-off?

Figure 1: Overview of PEWO inputs and outputs

Finally, PEWO is intended to become a community effort, and we call on PP developers and users to join us in this standardisation effort. The first community collaborations rapidly followed its publication: a novel placement tool [6] based its performance evaluation entirely on PEWO procedures, and several authors became involved in extending the PEWO modules related to their respective tools.

Downloads

Instructions to download and install PEWO are available on its GitHub repository page.

There is also a Wiki with detailed tutorials, runnable examples and instructions for potential contributors ...