You may have heard of Folding@Home. It’s a very creative way in which a bioengineering team, based at Washington University in St. Louis, is modeling the process of protein folding. Greg Bowman, an associate professor of biochemistry and biophysics at the university, directs the project and presented at Arm DevSummit this year. Proteins mediate pretty much everything to do with life, including pathologies. They’re long chains of amino acids, which fold up into energetically favorable shapes within milliseconds. The shape is critical. If a good protein folds incorrectly you get a disease, maybe Alzheimer’s. Conversely, the infamous spike protein on the COVID virus hides the site that binds to a cell, to protect it from antiviral therapies. Then it opens up as it approaches a tasty cell target. Proteins, in his words, are molecular machines, dynamically changing structure as needed.
Modeling this behavior to search for effective therapies had been extremely difficult, even on supercomputers. A single protein may contain a thousand or more atoms, and the COVID spike contains three proteins. Simulations of folding toward a favorable energy state had been limited to nanoseconds or microseconds of protein time, not nearly long enough to cover a complete fold, which takes milliseconds. The process is also stochastic, so it takes many simulations to build a range of samples from which to model the dynamic behavior of these structures. Useful simulations could take hundreds or thousands of years to complete.
The Folding@Home innovation took advantage of a known characteristic of folding. It’s not a continuously changing dynamic system. Instead the protein evolves in small steps, moving from one local energy minimum to the next. It spends the majority of its time in these minima, where nothing interesting happens. So the simulation can be decomposed into a Markov (statistical) sequence, in which the computation only has to model the short transitions between minima.
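To make the Markov idea concrete, here is a minimal sketch in Python. It is a toy, not the project's software: the three states, the matrix `TRUE_P`, and every function name are invented for illustration. Each state stands for one local energy minimum. Many short, independent runs are used only to count hops between minima, and the resulting transition matrix is then used to predict behavior over a far longer timescale than any single run covers.

```python
import random

# Hypothetical 3-state toy model: each state stands for a local energy minimum.
# These probabilities are made up for illustration only.
TRUE_P = [
    [0.90, 0.10, 0.00],
    [0.05, 0.90, 0.05],
    [0.00, 0.10, 0.90],
]

def short_trajectory(start, steps, rng):
    """Simulate a short sequence of hops between minima, beginning in `start`."""
    state, path = start, [start]
    for _ in range(steps):
        state = rng.choices(range(len(TRUE_P)), weights=TRUE_P[state])[0]
        path.append(state)
    return path

def estimate_transition_matrix(trajectories, n_states):
    """Count observed transitions and normalize each row into probabilities."""
    counts = [[0] * n_states for _ in range(n_states)]
    for path in trajectories:
        for a, b in zip(path, path[1:]):
            counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

def propagate(dist, matrix, steps):
    """Push a probability distribution over states forward by `steps` Markov steps."""
    n = len(matrix)
    for _ in range(steps):
        dist = [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return dist

rng = random.Random(0)
# Many short, independent runs started from different states.
trajectories = [short_trajectory(rng.randrange(3), 50, rng) for _ in range(2000)]
P_est = estimate_transition_matrix(trajectories, 3)
# Long-timescale prediction built from short simulations only.
long_run = propagate([1.0, 0.0, 0.0], P_est, 5000)
```

The point of the sketch is that no single run needs to be long. Because each hop depends only on the current state, short runs started in different places can be pooled into one model.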
Massive parallelism through citizen science
This observation makes massive parallelism a realistic method of attack. Just have each engine model a transition, or a bunch of transitions in sequence, and orchestrate the whole process from a central site. But there’s no budget for a research group to push this into the cloud. And anyway, it’s not clear that any one cloud, no matter how big, could provide enough parallelism for this problem.
Instead this team created Folding@Home to take advantage of all the millions of home computers around the world. Enthusiasts can sign up to become a part of the folding network. The software runs in the background on your system when you’re not busy doing something else. You run a little bit of the Markov sequence, then maybe another little bit, and so on.
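The division of labor described above can be sketched as follows. This is an assumption-laden toy, not the actual Folding@Home protocol: `WorkUnit`, `run_work_unit`, and `orchestrate` are hypothetical names, a thread pool stands in for volunteer machines, and a random walk over integers stands in for the molecular simulation. The central site issues one short segment per trajectory, each "volunteer" simulates it, and the results are stitched back onto the trajectory they extend.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
import random

@dataclass
class WorkUnit:
    trajectory_id: int  # which independent trajectory this segment extends
    start_state: int    # configuration where the previous segment ended
    steps: int          # how many steps the volunteer should simulate
    seed: int           # makes the toy segment reproducible

def run_work_unit(unit):
    """Stand-in for a volunteer machine: simulate one short segment.

    A random walk over integers replaces real molecular dynamics here.
    """
    rng = random.Random(unit.seed)
    state, segment = unit.start_state, []
    for _ in range(unit.steps):
        state += rng.choice((-1, 0, 1))
        segment.append(state)
    return unit.trajectory_id, segment

def orchestrate(n_trajectories, rounds, steps_per_unit, workers=4):
    """Central site: issue one unit per trajectory per round, then stitch results."""
    trajectories = {i: [0] for i in range(n_trajectories)}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for round_no in range(rounds):
            units = [
                WorkUnit(i, path[-1], steps_per_unit, seed=round_no * n_trajectories + i)
                for i, path in trajectories.items()
            ]
            # Units within a round are independent, so they can run in parallel.
            for traj_id, segment in pool.map(run_work_unit, units):
                trajectories[traj_id].extend(segment)
    return trajectories

results = orchestrate(n_trajectories=8, rounds=5, steps_per_unit=20)
```

Each unit carries everything a worker needs (start state, length, seed), so a worker never has to talk to other workers, only to the central site. That independence is what lets the approach spread across loosely connected home machines.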
What makes Folding@Home so impressive is how many adopters have signed up. Greg showed a map with lots of support across the world. He talked about tens of thousands of Folding@Home volunteers before COVID hit. That number has since grown rapidly to over a million. He estimates that together they are now computing at 5X the performance of the world’s fastest supercomputer (he cites IBM Summit), making this the first compute system to break the exaflops barrier.
Modeling COVID spike behavior
With this computational power they are now able to see the tip of the spike protein opening to reveal the binding site that the virus would use to attach to a cell. Remember, this is a level of modeling that isn’t even possible on a supercomputer. Of course a lot more work has to happen to get from that simulation to a therapy, but now they can provide experimenters with a much more detailed view of what’s happening, from which those experimenters can plan much more targeted attacks.
Why did Greg present this at the Arm conference? Because the software is also available on Android and will no doubt become available on other Arm-based platforms. And what better use could any of us imagine for all that compute technology around the world?
You can learn more about Folding@Home on the project’s website.