Cradle is a protein engineering software platform aimed at a wide range of scientists, from computational experts to bench scientists. Eli Bixby, Co-Founder & Head of Machine Learning at Cradle, argued that ML shines most after de novo design, particularly during the lead optimization stage.
Although de novo methods are effective at producing initial hits against a target, they struggle to deliver strong binders, and the resulting hits require significant improvement in developability. Cradle has tested its models in public competitions, where they consistently achieved better binding results than competitors relying on more traditional de novo structure-based approaches.
The main motivation for improving lead optimization is that it is one of the most costly stages of drug discovery and development. Bixby introduced Cradle's lab-in-the-loop machine learning platform, which tackles the challenges of lead optimization by homing in on multi-property optimization. Bixby stated: “Lead optimization is fundamentally a multi-property optimization problem. You need to hit all of your developability targets in order to move into the clinic.”
The team at Cradle primarily works with enzyme projects. They rely on two main components: their predictors, which turn sequences into scores, and their generators, which propose new sequences. Bixby discussed the problems that arise from optimizing only one property. For example, when the team optimized only for thermostability, it lost 85% of samples because they did not express well enough for thermostability to be measured. To stop generators from focusing on a single property at the expense of the others, multi-property conditioning is needed to balance them.
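To make the single-property pitfall concrete, here is a minimal sketch of ranking candidates against several targets at once. It is not Cradle's method: the variant names, the predicted scores, the target values, and the helper functions are all hypothetical, and the scores are assumed to be normalized to a common 0 to 1 scale.

```python
# Illustrative only: all names, scores, and targets below are hypothetical.

def meets_all_targets(scores, targets):
    """Return True if every property reaches its minimum target."""
    return all(
        scores.get(prop, float("-inf")) >= minimum
        for prop, minimum in targets.items()
    )


def rank_single_property(candidates, prop):
    """Rank candidates by one predicted property, ignoring all others."""
    return sorted(candidates, key=lambda name: candidates[name][prop], reverse=True)


def rank_multi_property(candidates, targets):
    """Keep candidates that hit every target, then rank them by their
    worst margin above target so that no single property dominates."""
    passing = {
        name: scores
        for name, scores in candidates.items()
        if meets_all_targets(scores, targets)
    }
    return sorted(
        passing,
        key=lambda name: min(passing[name][p] - targets[p] for p in targets),
        reverse=True,
    )


# Hypothetical predictor outputs (assumed normalized to 0..1).
candidates = {
    "variant_a": {"thermostability": 0.95, "expression": 0.10},
    "variant_b": {"thermostability": 0.70, "expression": 0.60},
    "variant_c": {"thermostability": 0.80, "expression": 0.75},
}
targets = {"thermostability": 0.6, "expression": 0.5}

single = rank_single_property(candidates, "thermostability")
multi = rank_multi_property(candidates, targets)
print(single)
print(multi)
```

In this toy data, ranking on thermostability alone puts `variant_a` first even though its predicted expression is far below target, which mirrors the failure mode described above. Requiring every target to be met removes it from the ranking.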
Bixby pointed out that ML, particularly generative ML, tends to favour small areas of sequence space, narrowing in on a single solution. Biology happens in batches, so running plates of nearly identical sequences is inefficient. However, down-selection and multi-property conditioning can overcome these challenges.
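One common way to keep a plate from filling up with near-duplicates is a greedy, diversity-aware down-selection. The sketch below is an assumption for illustration, not a description of Cradle's pipeline: the sequences, scores, `plate_size`, and `min_distance` are hypothetical, and Hamming distance is used only because the toy sequences have equal length (sequences with insertions or deletions would need an alignment-based or edit distance instead).

```python
# Illustrative only: sequences, scores, and thresholds are hypothetical.

def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def down_select(scored, plate_size, min_distance):
    """Greedily pick the highest-scoring sequences, skipping any that lie
    closer than min_distance to a sequence already selected."""
    selected = []
    by_score = sorted(scored.items(), key=lambda kv: kv[1], reverse=True)
    for seq, _score in by_score:
        if all(hamming(seq, chosen) >= min_distance for chosen in selected):
            selected.append(seq)
        if len(selected) == plate_size:
            break
    return selected


# Hypothetical generator output: sequence -> predicted score.
scored = {
    "MKTAYIAK": 0.95,
    "MKTAYIAR": 0.94,  # one mutation away from the top sequence
    "MKTGYLAK": 0.90,
    "MRTAYIAK": 0.89,  # one mutation away from the top sequence
    "MQSAYLVK": 0.80,
}

plate = down_select(scored, plate_size=3, min_distance=2)
print(plate)
```

Picking the top three by score alone would include `MKTAYIAR`, which differs from the best sequence at a single position. The distance constraint trades a little predicted score for broader coverage of sequence space.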
Cradle’s generator allows for insertions, deletions, and substitutions across the entire sequence. Furthermore, the generators can be conditioned on observed data, improving correlation. This resulted in a statistically significant improvement in hit rates.
In the future, Cradle aims to deploy customer-wide base models for developability properties and integrate third-party predictors into their pipeline. Bixby is focused on designing multi-chain complexes and generating large libraries for early discovery. A demo is worth 1,000 slides, so Bixby ended his talk by giving the audience a live demonstration of the Cradle software, showcasing its abilities in project creation, assay addition, and data import.