Improving the Efficiency of SPR Moves in Phylogenetic Tree Search Methods Based on Maximum Likelihood
Wim Hordijk and Olivier Gascuel
Bioinformatics 21(24), 4338-4347, 2005.
Maximum likelihood methods have become very popular for constructing phylogenetic trees from sequence data. However, despite noticeable recent progress, with large and difficult data sets (e.g. multiple genes with conflicting signals) current ML programs require huge computing times and can be trapped in bad local optima of the likelihood function. When this occurs, the resulting trees may still show some of the defects (e.g. long branch attraction) of starting trees obtained using fast distance or parsimony programs.
Subtree Pruning and Regrafting (SPR) topological rearrangements are usually sufficient to intensively search the tree space. Here, we propose two new methods to make SPR moves more efficient. The first method uses a fast distance-based approach to detect the least promising candidate SPR moves, which are then simply discarded. The second method locally estimates the change in likelihood for any remaining potential SPRs, as opposed to globally evaluating the entire tree for each possible move. These two methods are implemented in a new algorithm with a sophisticated filtering strategy, which efficiently selects potential SPRs and concentrates most of the likelihood computation on the promising moves.