AI and ML Are Rapidly Coming Into Focus for Cryo-EM

Adam Marko | May 17, 2022

Estimated reading time: 3 minutes

Everyone knows by now that tech vendors love to hype artificial intelligence (AI) and machine learning (ML).

The tendency is understandable when you think about the revolutionary ways these technologies can enhance the work that humans do. But scientists are typically a more skeptical and methodical crowd, less likely to get swept up in vendors' periodic bursts of enthusiasm. And I say all this as someone with experience from both perspectives: I’ve spent half my career as a researcher and the other half helping life sciences organizations navigate their complex IT needs.

But progress in high-performance computing (HPC) data storage, alongside promising developments in AI/ML, is ushering in a new frontier. As laboratories become increasingly reliant on these technologies to keep up with data generation and analysis, the life sciences community is starting to register groundbreaking research advances.

I recently attended the 4th International Symposium on Cryo-3D Image Analysis, which brought researchers from around the world together to talk about how these changes are manifesting themselves in cryogenic electron microscopy (cryo-EM). COVID-19 had forced the cancellation of the previous conference, originally planned for 2020, so this was the first time in several years that these scientists could gather in person to compare notes.

Since much of the research covered is still unpublished, I can’t divulge specifics. But suffice it to say that these technologies are driving progress across the entire scientific pipeline, from sample preparation to data transmission and analysis, and the excitement about the potential for new breakthroughs was palpable.

Figure 1. ATP synthase molecule with the motor domain structure highlighted
Image Source: Sobti et al., “Cryo-EM reveals distinct conformations of E. coli ATP synthase on exposure to ATP,” eLife (2019). Acquired at UOW Molecular Horizons cryo-EM facility.


A few key takeaways: 
  • We’re at a tipping point where the ability to crunch vast amounts of data with the help of AI and ML promises to automate many processes in cryo-EM labs and speed time to discovery. The previously labor- and time-intensive steps required to distinguish good datasets from bad ones, such as particle picking, noise reduction, and error correction, may soon benefit from these tools.
  • Cryo-EM holds immense opportunity for the pharmaceutical industry, since researchers can now visualize binding sites in proteins (a binding site is a region on a macromolecule that binds to another molecule with specificity) in record time. This is also good news for nanotechnology, as chip makers continue to use cryo-EM to analyze silicon and reduce the time to bring products to market.
  • The event reinforced that data storage and management remain pain points for many researchers, especially as new generations of microscopes and detectors generate more data than ever before. Some scientists don’t want to share their data. Others struggle to manage it at all. In many cases, researchers can’t share even if they want to, because the data is scattered across multiple storage systems. This also makes it difficult, if not impossible, to develop training datasets for ML workflows.
  • There’s an overarching urgency to fix that situation. In the future, researchers will require more flexible and performant storage systems. I anticipate higher demand for parallel file systems specifically, because they can scale performance and capacity together while maintaining data protection. Additionally, keeping data spread across silos is simply no longer feasible. Consolidating all the data onto one central, easy-to-manage system will enable collaboration by making it easier to share that data with external institutions, as well as allowing for data reuse and ML training.
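
To make the silo problem a little more concrete, here is a minimal Python sketch of one possible first step toward consolidation: inventorying image files across several storage locations and flagging copies that exist in more than one place. It is an illustration only, not a tool discussed at the symposium. The function names, the list of file extensions, and the idea of passing in a list of mount points are all assumptions made for the example.

```python
import hashlib
from pathlib import Path

# Assumed set of cryo-EM image extensions; adjust for your facility's formats.
MICROGRAPH_SUFFIXES = {".mrc", ".mrcs", ".tif", ".tiff", ".eer"}


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(roots, suffixes=MICROGRAPH_SUFFIXES):
    """Map content digest -> list of paths for matching files under each root.

    `roots` is a hypothetical list of directories, one per storage silo.
    """
    manifest = {}
    for root in roots:
        for path in sorted(Path(root).rglob("*")):
            if path.is_file() and path.suffix.lower() in suffixes:
                manifest.setdefault(file_digest(path), []).append(str(path))
    return manifest


def duplicates(manifest):
    """Return only the entries whose content appears at more than one path."""
    return {digest: paths for digest, paths in manifest.items() if len(paths) > 1}
```

Calling `build_manifest` with the directories where each silo is mounted would yield a single inventory keyed by file content, and `duplicates` would show which files are stored more than once. Hashing every file in full is slow at cryo-EM data volumes, so a real inventory would likely compare file sizes first or rely on the storage system's own metadata.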

We’re still in the early innings of this historic transition; however, as researchers equip themselves with better storage and data management tools and discover new AI/ML applications, we can expect rapid and exciting advancements throughout the life sciences.

It’s taken a while to reach this point. But we’re on the cusp of witnessing something big here, and it’s not just hype.