By John Timmer, Ars Technica
Chemically, the proteins that run most of a cell’s functions are little more than a string of amino acids. Their ability to perform structural and catalytic functions is primarily dependent upon the fact that, when in solution, that string adopts a complex, three-dimensional shape. Understanding how that three-dimensional structure forms has been a serious challenge; even if you know the order of the amino acids in the string, it’s generally been impossible to predict how they’ll fold up into the final product. But now, gamers are giving scientists some insight into the algorithms that predict protein structures. Earlier work with the FoldIt game had shown that gamers could top the best algorithms.
Given the gamers’ success, the scientists behind FoldIt started to wonder if it might be possible to produce algorithms that did some of the things that people did right. In their new paper, they describe how they decided to go about it. “One way to arrive at algorithmic methods underlying successful human Foldit play would be to apply machine learning techniques to the detailed logs of expert Foldit players,” they wrote. “We chose instead to rely on a superior learning machine: Foldit players themselves. As the players themselves understand their strategies better than anyone, we decided to allow them to codify their algorithms directly, rather than attempting to automatically learn approximations.”
Essentially, what they put in place was a scripting engine that allowed users to create an automated series of steps they could apply to a protein, speeding up the process of folding it; they called the scripts “recipes.” But the team didn’t stop there: players were allowed to share their recipes, and could modify any recipes they obtained from other users. This enabled a form of social evolution as recipes with names like “tlaloc Contract 3.00” and “Aotearoas_Romance” got passed around the community.
The recipes were a big success. In under four months, about 5,500 were created, and in several of those weeks over 10,000 individual recipes were run. Users came up with four general classes of script that modified the protein structure in distinct ways. For example, some recipes would let the user select a region of the protein, distort it, and then search for the lowest energy form of that region, essentially letting them do a partial reset of part of the structure. Another set of recipes allowed users to do an aggressive rebuild of part of the structure.
Nobody came up with a script that performed the whole folding process. Instead, experienced users built up a toolbox of recipes that they’d apply at different parts of the optimization process, allowing them to speed up parts of the process that they might otherwise have to do manually.
By the end of three months, two recipes (called Quake and Blue Fuse) accounted for about a third of the total scripting activity. Both took similar approaches to optimizing a local part of the protein’s structure: in essence, letting it breathe a bit, then settle down into a new energy minimum. Quake did this by alternately squeezing and relaxing the structure using a set of virtual rubber bands applied by the user. Blue Fuse did a similar thing by changing the strength of the attraction/repulsion among the atoms in the protein, causing the structure to repeatedly expand and contract. Both of them would successfully pack the protein more densely when applied to a partially completed structure.
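The idea behind a Blue Fuse-style recipe can be illustrated with a toy model. The Python sketch below is not the players' recipe and does not use the FoldIt scripting language; the bead-chain energy function, the greedy minimizer, and names such as `fuse_like_relax` are all assumptions made up for illustration. It only shows the general pattern described above: weaken the repulsion between atoms, let the structure settle, restore the repulsion in stages, and keep the result if it scores better.

```python
import math
import random


def energy(coords, repulsion_weight=1.0):
    """Toy energy for a 2D bead chain (hypothetical, not FoldIt's score)."""
    total = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            if j == i + 1:
                # Spring keeps chain neighbours about one unit apart.
                total += (d - 1.0) ** 2
            else:
                # Weak attraction rewards compact shapes.
                total -= 1.0 / (1.0 + d * d)
                if d < 1.0:
                    # Soft clash penalty, scaled by the adjustable weight.
                    total += repulsion_weight * 10.0 * (1.0 - d) ** 2
    return total


def minimize(coords, repulsion_weight, rng, steps=300, step_size=0.2):
    """Greedy local search: nudge one bead, keep the move only if energy drops."""
    coords = list(coords)
    current = energy(coords, repulsion_weight)
    for _ in range(steps):
        i = rng.randrange(len(coords))
        x, y = coords[i]
        moved = (x + rng.uniform(-step_size, step_size),
                 y + rng.uniform(-step_size, step_size))
        trial = coords[:i] + [moved] + coords[i + 1:]
        trial_energy = energy(trial, repulsion_weight)
        if trial_energy < current:
            coords, current = trial, trial_energy
    return coords


def fuse_like_relax(coords, cycles=3, seed=0, weights=(0.1, 0.5, 1.0)):
    """Repeatedly soften and restore repulsion, keeping the best structure."""
    rng = random.Random(seed)
    best = minimize(coords, 1.0, rng)
    for _ in range(cycles):
        candidate = best
        for weight in weights:  # ramp repulsion from weak back to full strength
            candidate = minimize(candidate, weight, rng)
        # Always judge candidates at full repulsion strength.
        if energy(candidate) < energy(best):
            best = candidate
    return best


start = [(float(i), 0.0) for i in range(8)]  # fully extended chain
packed = fuse_like_relax(start, cycles=3, seed=0)
print(energy(start), energy(packed))
```

Because each candidate is scored at full repulsion strength and only kept when it improves on the current best, the sketch can never return a structure that scores worse than the one it started with, which loosely mirrors how players used such recipes to tighten a partially completed structure.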
At the same time, one of the labs behind the FoldIt project was working on an algorithm called Fast Relax that did essentially the same thing. The people working on Fast Relax reimplemented it using the FoldIt scripting language, and found that it had a somewhat different performance profile than Blue Fuse: it took about four minutes to reach the same level of optimization, but did better than the users’ creation after that. FoldIt players rarely ran the recipe for more than two minutes, so they’d never have seen its performance plateau.
The coders behind Fast Relax were ultimately able to provide a higher level of optimization because they had access to more features of the software than the scripting language exposed. Because of the players’ success, however, the people behind FoldIt are going back and expanding its scripting capabilities, giving players greater control over the environment’s variables. They say they “look forward to learning what Foldit player ingenuity can do with these additional capabilities.”
Image: Foldit team/University of Washington
Source: Ars Technica
Citation: “Algorithm discovery by protein folding game players.” By Firas Khatib, Seth Cooper, Michael D. Tyka, Kefan Xu, Ilya Makedon, Zoran Popović, David Baker and Foldit Players. Proceedings of the National Academy of Sciences, published online Nov. 7, 2011. DOI: 10.1073/pnas.1115898108