Can AI give us more effective drugs and better batteries?
Slinkys, fans, jelly rolls and crystals
With the disheartening news this week that Google decided it was a good idea to alter substantive parts of the pre-launch video of Gemini, I went looking to reawaken wonder and joy in the positive potential of data. I found it in some fascinating ways that AI is making waves in materials science.
I’ve written before about how AI is helping speed up drug discovery and I was recently delighted to read about a robotic chemistry lab that paves the way to doing chemistry on Mars.
But you would have to have been living under a rock in the AI world in 2020 not to have known about AlphaFold, an AI system from Google DeepMind that accurately predicts the 3D structures of proteins. It floored the scientific world and blew previous approaches out of the water. This week, an article about the biotech/AI startup Cradle, which is building ‘an AI powered protein programming platform’, convinced me it was again time to take a deeper look at AI-assisted protein design.
Why is the folding of proteins important?
While most of us probably think about protein as ‘the macronutrient food group that we eat’, there can be (slightly mind-bogglingly) upwards of 10,000 unique proteins within a single human cell. And the function of a protein depends critically on its decidedly three-dimensional shape.
Proteins are the workhorses of the cell. Each expertly performs a specific task. This wealth of diversity and specificity in function is made possible by a seemingly simple property of proteins: they fold.
The backbone of any protein is a long chain of amino acids, typically around 100 to 300 of them (there are 20 or 22 distinct amino acids, depending on who you ask). The ordering of these amino acids in the chain determines how the protein will fold.
When folding, two types of structures usually form first. Some regions of the protein chain coil up into slinky-like formations called “alpha helices,” while other regions fold into zigzag patterns called “beta sheets,” which resemble the folds of a paper fan. These two structures can interact to form more complex structures … with descriptive names include the “beta barrel,” the “beta propeller,” the “alpha/beta horseshoe,” and the “jelly-roll fold.”
By folding into distinct shapes, proteins can perform very different roles despite being composed of the same basic building blocks. To draw an analogy, all vehicles are made from steel, but a racecar’s sleek shape wins races, while a bus, dump truck, crane, or zamboni are each shaped to perform their own unique tasks.
Once properly folded, proteins perform very diverse roles inside cells and even a small change in the shape of a protein can render it useless, with potentially lethal consequences. So how come we think it’s valuable (or even wise!) to tinker with them?
Why is it useful to be able to create new proteins?
Proteins are ubiquitous in the natural environment and they perform a huge variety of jobs with astounding finesse.
Inside living organisms, proteins provide structural support, function as enzymes, are involved in transport and storage of nutrients around the body, signal change across cells and across organs, are involved in the immune response and regulate physiological processes (like blood clotting). Oh, and they help us move our limbs around 😎.
Proteins are also the building blocks of natural fibres like silk, wool and spider webs, each of which has a unique blend of flexibility, strength, and biodegradability. We’ve been keen to improve on and scale the manufacturing of many of these natural fibres for centuries.
In the pharmaceutical industry, custom proteins are of interest as drugs, vaccines and targeted drug delivery vehicles.
In industrial biotechnology, enzymes (which are almost always proteins) are often used as catalysts. Novel protein structures could be valuable in the production of biofuels, the creation of fully biodegradable plastics and in waste management more generally (maybe even in a truly circular economy).
Custom proteins that are designed to bind and neutralise pollutants could assist with environmental cleanup, bioremediation and the decontamination of industrial sites.
There are also applications for custom protein design in the full farm-to-plate cycle of the food and agriculture industry. These range from crop resistance to pests, diseases or environmental stresses, to enzymes that promote growth or improve the nutritional value or shelf life of food items.
So next time you hear about AI saving the world, forget personal assistants and think designer proteins!
How does AI speed up the process?
At this point in my reading, I had certainly convinced myself that being able to tweak the properties of proteins can be valuable in a much wider range of places than I had previously realised, but how does AI play a role in that tweaking?
Basically, without AI, the process of designing and manufacturing a new protein is slow. Like years and years, heck call it a decade or more, slow. AI can be used to both broaden and accelerate the process.
Design: Protein design involves the use of different approaches to come up with a potential functional protein. Large language- and diffusion-based models can now be used to design new proteins, either singly or in combination.
In-silico validation: In-silico validation of proteins includes the use of computational methods and simulations to assess and validate the properties and behaviors of the generated protein candidates. This approach allows researchers to predict and analyze various aspects of a protein's structure, stability, function, and interactions before conducting actual experimental work in the lab.
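The two steps above amount to a funnel: generate lots of candidates cheaply, score them on a computer, and only send a handful to the lab. Here is a minimal Python sketch of that idea. Everything in it is a stand-in that I have made up for illustration: `propose_variants` uses random single-letter mutations where a real system would use a generative model, `toy_score` counts hydrophobic residues where a real system would predict structure, stability or binding, and the parent sequence is hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def propose_variants(parent, n, rng):
    """Stand-in for a generative model: make n single-point mutants of parent."""
    variants = []
    for _ in range(n):
        pos = rng.randrange(len(parent))
        # Pick any amino acid other than the one already at this position.
        new_residue = rng.choice(AMINO_ACIDS.replace(parent[pos], ""))
        variants.append(parent[:pos] + new_residue + parent[pos + 1:])
    return variants


def toy_score(seq):
    """Stand-in for in-silico validation: fraction of hydrophobic residues.

    This is NOT a real stability measure; it only gives the funnel
    something to rank by.
    """
    hydrophobic = set("AVILMFWY")
    return sum(res in hydrophobic for res in seq) / len(seq)


def shortlist(parent, n_candidates=1000, keep=10, seed=0):
    """Generate many candidates cheaply, keep only the best few for the lab."""
    rng = random.Random(seed)
    candidates = set(propose_variants(parent, n_candidates, rng))
    # Highest score first; ties broken alphabetically so results are repeatable.
    return sorted(candidates, key=lambda s: (-toy_score(s), s))[:keep]


if __name__ == "__main__":
    parent = "MKTAYIAKQRQISFVKSHFSRQ"  # hypothetical sequence, illustration only
    for seq in shortlist(parent, keep=5):
        print(seq, round(toy_score(seq), 3))
```

The point is the shape of the loop rather than the scoring: the expensive step (real synthesis and testing) only ever sees the short list that comes out the bottom.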
This is where companies like Cradle and Profluent come in. Still early stage, they’re betting they can wrap robust, user-friendly software tools around generative AI models (language- or diffusion-based) for non-AI experts to leverage.
Cradle’s design platform makes it easy for everyone to start building products with biology instead of oil or animals, leveraging generative machine learning models to transform how biologists design and optimize proteins
Cradle blog, Nov 2022
At Profluent, our machine learning models excel at protein design and have been trained on billions of curated biological sequences. Our versatile technology is ready to operate on multiple distinct modalities across enzymes, antibodies, gene editors, and peptides.
Profluent website, Dec 2023
Then, in a slightly different twist, companies like A-Alpha Bio are using AI to predict protein-protein interactions. This stuff gets out of my competence zone really fast, but one of the neat things I believe they’re looking to do is speed up the process of finding / making molecular glues: small molecules that stick two proteins together, which could open a route to currently ‘undruggable’ proteins.
Disease-relevant molecules that cannot be pharmacologically targeted are sometimes referred to as undruggable, and in cancer, a number of proteins fall into this category.
Targeting the undruggable, David S Hong
From proteins to decarbonisation
And it’s not only in the world of protein engineering that AI is helping out with the design and build of materials.
At the back end of November, Google DeepMind once again showed that they do work on some extraordinarily interesting stuff when they’re not getting caught up in their own hype, launching GNoME (Graph Networks for Materials Exploration). From the Nature paper abstract (slightly edited for length, emphasis is mine):
Here we show that graph networks trained at scale can reach unprecedented levels of generalization, improving the efficiency of materials discovery by an order of magnitude
Building on 48,000 stable crystals identified in continuing studies, improved efficiency enables the discovery of 2.2 million structures … many of which escaped previous human chemical intuition.
Our work represents an order-of-magnitude expansion in stable materials known to humanity.
Scaling deep learning for materials discovery, Merchant et al, Nature, 29 Nov 2023
Critically, for a material to be useful, it pretty much has to be stable. GNoME predicts stability with impressive and increasing accuracy, allowing very rapid ‘in-silico’ experimentation and exploration. As in our designer proteins story, only the most promising compounds pass through to the slower and more labour- and resource-intensive step of actual synthesis.
And the link to decarbonisation? The DeepMind blog flags the discovery of ‘52,000 new layered compounds similar to graphene’, which have the potential to impact superconductor research.
With GNoME, we’ve multiplied the number of technologically viable materials known to humanity. Of its 2.2 million predictions, 380,000 are the most stable, making them promising candidates for experimental synthesis.
Among these candidates are materials that have the potential to develop future transformative technologies ranging from superconductors, powering supercomputers, and next-generation batteries to boost the efficiency of electric vehicles.
Millions of new materials discovered with deep learning, DeepMind blog, Nov 2023
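To put the quoted numbers side by side, here is a quick back-of-the-envelope check in Python. It only uses the three figures quoted above. The one assumption, which is mine and not stated in the quotes, is that the 380,000 most stable predictions are all additional to the 48,000 stable crystals known beforehand.

```python
# Figures quoted above from the DeepMind blog and the Nature abstract.
predicted_structures = 2_200_000
most_stable = 380_000
previously_known_stable = 48_000

# What share of GNoME's predictions made the "most stable" cut?
share_stable = most_stable / predicted_structures

# Assumption: the 380,000 are all new, so they add to the 48,000 known before.
expansion = (previously_known_stable + most_stable) / previously_known_stable

print(f"Share of predictions flagged most stable: {share_stable:.0%}")
print(f"Growth in known stable crystals: about {expansion:.1f}x")
```

Under that assumption, roughly one in six predictions makes the most stable cut, and the pool of known stable crystals grows about ninefold, which squares with the paper’s ‘order-of-magnitude’ wording.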
And perhaps most exciting, because it feels a little closer to feasible (based really only on my deep suspicion of superconductor announcements in general 😎), are 528 new potential lithium ion conductors that could be used to improve the performance of batteries 💖💖💖💖💖💖💖💖
AI for tangible, practical good!
There, that’s better. AI isn’t only a productivity hack for busy office workers. Next time someone hand-waves about how ‘generative AI will solve climate change’ I might have to be just ever so slightly less sceptical!
I’m taking a break from writing this newsletter over Christmas but I can’t imagine that I’m going to stop thinking about this stuff. I foresee some deep pondering of AI regulation rumbling along in the back of my brain while I cut firebreaks and water my baby trees.
Wherever the holiday break finds you, I wish you at least a little peaceful unplugged time to watch the clouds / snow / birds / waves. Thank you for reading and I look forward to learning with you again in 2024.