Metagenomic and metabarcoding projects, whether related to ecological studies, biodiversity or medical diagnostics, rely on the critical step of species identification. Based on sequence clustering or alignment to reference databases, the initial identification is often refined by placing it in the context of a taxonomy [1], but this ignores the potential resolution brought by more advanced phylogenetic models. An alternative is phylogenetic placement (PP) [2], a process in which query reads are “placed” on the branches of a reference phylogeny.
In 2019 and 2020, new implementations [3] and novel algorithms [4,5,6] resulted in PP tools scalable to current sequencing volumes (>10⁶ reads placed on a phylogeny in <30 min). With a total of 8 proposed algorithms, 6 different implementations and 4 fundamentally different approaches (including distance-based, alignment-based and alignment-free), it may be hard to evaluate which PP solution best fits the needs of a particular study.
Building on the evaluation pipeline we developed in the context of the RAPPAS project (alignment-free phylogenetic placement), we created PEWO (Placement Evaluation Workflows) [7], a set of automated experimental procedures that make PP benchmarking reproducible in terms of both accuracy and computational resources (github.com/phylo42/PEWO). Beyond benchmarking, PEWO procedures also help to answer common questions that arise during metabarcoding or metagenomic experimental design:
Finally, PEWO is intended to become a community effort, and we have called on PP developers and users to join us in this standardisation effort. The first community collaborations rapidly followed its publication: a novel placement tool [6] based its performance evaluation entirely on PEWO procedures, and several authors have become involved in extending the PEWO modules related to their respective tools.