Output details

8 - Chemistry

University of York

Output 27 of 191 in the submission
Name of software

BUCCANEER

Type
G - Software
Name of software house
STFC Collaborative Computational Project Number 4
Year
2010
Number of additional authors
0
Additional information

Since 2008 I have undertaken the following research, enabling new functionality in the Buccaneer software:

1. Auto-built protein models typically contain fragments separated by poorly resolved flexible loop regions. There may be hundreds of fragments belonging to tens of chains of different types, so allocation of fragments to chains is a massive combinatorial problem with potentially billions of solutions. I developed a tree search algorithm to evaluate possible assemblies of the fragments and then to score them for compactness and consistency, along with a pruning method to eliminate unproductive branches and make the problem computationally tractable. I tested the resulting method against 24 known structures with starting data of varying quality to establish that the method is robust and significantly reduces the manual rebuilding required.

2. Automated model building is frequently a rate-limiting step in structure solution. I therefore developed code to allow the calculation to be spread across multiple processor cores. Automatic parallelisation procedures can lead to differing and unreproducible results, so I implemented the threading algorithms by hand, with careful ordering of the computational steps to ensure exact reproducibility. In addition, I developed provisional caching strategies to eliminate recalculation of intermediate results. The resulting methods were benchmarked against competing software and gave an order-of-magnitude improvement, so that refinement is now the rate-limiting step.

3. Side chains are hard to see at low resolution, so additional sources of sequence information, used as a 'prior' probability, are helpful. I developed a statistical scoring scheme in which known selenium atom positions from selenomethionine phasing increase the probability of placing a methionine near those positions. A similar approach allows a partial or molecular replacement model to be used as a restraint on probable residue types. Both methods were tested against a library of 55 known structures with starting data of varying quality.
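
The tree search with pruning described in item 1 can be illustrated with a minimal branch-and-bound sketch. This is not the Buccaneer code. The function name `assign_fragments`, the data layout, and the scoring are assumptions made for illustration: each fragment is a centroid plus a docked sequence range, all chains are treated as identical copies, "consistency" means that fragments in one chain may not overlap in sequence, and "compactness" is the summed centroid distance between fragments sharing a chain. Because every added distance is non-negative, the cost of a partial assignment is a lower bound on any completion, so a branch that already costs as much as the best full assignment can be discarded.

```python
from math import dist, inf


def assign_fragments(fragments, n_chains):
    """Assign each fragment to a chain by depth-first branch and bound.

    fragments: list of (centroid, (seq_start, seq_end)) tuples (hypothetical layout).
    Returns (best_cost, assignment), where assignment[i] is the chain index
    of fragment i, or (inf, None) if no consistent assignment exists.
    """
    best = {"cost": inf, "assignment": None}
    chains = [[] for _ in range(n_chains)]  # fragment indices per chain
    assignment = []                         # chain index per placed fragment

    def overlaps(a, b):
        # Two inclusive sequence ranges clash if they share any residue.
        return a[0] <= b[1] and b[0] <= a[1]

    def search(i, cost):
        # Prune: the partial cost can only grow, so this branch cannot win.
        if cost >= best["cost"]:
            return
        if i == len(fragments):
            best["cost"] = cost
            best["assignment"] = list(assignment)
            return
        centroid, seq = fragments[i]
        for c, members in enumerate(chains):
            # Consistency: skip chains where the sequence range is taken.
            if any(overlaps(seq, fragments[j][1]) for j in members):
                continue
            # Compactness: distance to every fragment already in the chain.
            extra = sum(dist(centroid, fragments[j][0]) for j in members)
            members.append(i)
            assignment.append(c)
            search(i + 1, cost + extra)
            members.pop()
            assignment.pop()
            # Symmetry: empty chains are interchangeable (identical copies
            # assumed), so trying one empty chain is enough.
            if not members:
                break

    search(0, 0.0)
    return best["cost"], best["assignment"]


# Toy input: two copies of a molecule, each built as two fragments.
fragments = [
    ((0.0, 0.0, 0.0), (1, 10)),
    ((1.0, 0.0, 0.0), (11, 20)),
    ((50.0, 0.0, 0.0), (1, 10)),
    ((51.0, 0.0, 0.0), (11, 20)),
]
cost, assignment = assign_fragments(fragments, n_chains=2)
print(cost, assignment)
```

In the toy input the two nearby fragments of each copy end up in the same chain. A realistic scoring function and chains of different types would need a different bound and no symmetry shortcut.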

Interdisciplinary
-
Cross-referral requested
-
Research group
B - Chemical and Structural Biology (YSBL)
Proposed double-weighted
No
Double-weighted statement
-
Reserve for a double-weighted output
No
Non-English
No
English abstract
-