šŸ‰
cryoDRGN
  • CryoDRGN User Guide
    • News and Release Notes
  • Installation
  • CryoDRGN EMPIAR-10076 Tutorial
  • CryoDRGN2 Ab Initio Reconstruction
  • CryoDRGN-ET Subtomogram Analysis
  • CryoDRGN Conformational Landscape Analysis
  • Making long trajectories with cryoDRGN graph_traversal
  • FAQ & Troubleshooting
  • Helpful Resources
    • ChimeraX tips
  • Deprecated Workflows
    • Handling large datasets with cryoDRGN preprocess
Powered by GitBook
On this page
  • Quick Start
  • Recent Highlights
  • Background
  • Input data requirements
  • Tutorial overview
  • References

CryoDRGN User Guide

Welcome to cryoDRGN's detailed documentation!


This page contains user guides for the cryoDRGN 🐉 ❄️ open-source software package, including in-depth tutorials for training and analyzing cryoDRGN models.

Quick Start

CryoDRGN can be installed with pip, and supports Python 3.10, 3.11 and 3.12. We recommend installing cryodrgn in a separate anaconda environment.

$ conda create --name cryodrgn-env python=3.12
$ conda activate cryodrgn-env
(cryodrgn-env) $ pip install cryodrgn

All cryoDRGN commands are accessed through the cryodrgn and cryodrgn_utils executables. Use the -h flag to display all available subcommands, and cryodrgn <command> -h to see the parameters for each command.

(cryodrgn-env) $ cryodrgn -h
(cryodrgn-env) $ cryodrgn_utils -h

See Installation for more details and advanced installation instructions.

Recent Highlights

  • Sep 2024: cryodrgn version 3.4.0 released! This version includes a new plot_classes utility, full support for RELION 3.1 .star files, and cryoSPARC-style phase-randomization applied to FSCs

  • April 2024: cryodrgn version 3.3.0 released: first non-beta release of cryoDRGN-ET, new command cryodrgn direct_traversal, improved testing, and cleaned-up Jupyter notebooks

  • Sep 2023: cryoDRGN-ET for subtomogram analysis is now available in beta, as cryodrgn version 3.0.0-beta

  • Jun 2023: Documentation cleanup, available here on GitBook

  • May 2023: cryodrgn version 2.3.0 released with improvements and fixes to ab initio tools.

  • Jan 2023: cryodrgn version 2.2.0 released with new ab initio reconstruction tools and more.

  • July 2022: cryodrgn version 1.1.0 released with updated default architecture

  • May 2022: cryodrgn version 1.0.0 released with new landscape analysis, cryodrgn_utils, and more

See News and Release Notes for additional details.

Background

CryoDRGN is a neural network-based method for heterogeneous reconstruction. Unlike discrete methods such as 3D classification, which produce an ensemble of K density maps, cryoDRGN performs heterogeneous reconstruction by learning a continuous distribution of density maps parameterized by a coordinate-based neural network.

The inputs to a cryoDRGN training run are 1) extracted particle images, 2) the CTF parameters associated with each particle, and, optionally, 3) poses for each particle from a C1 (asymmetric) 3D refinement. CryoDRGN2's ab initio reconstruction algorithms do not require input poses.

The final result of the software will be 1) latent embeddings for each particle image in the form of a real-valued vector (usually denoted with z, and output as a z.pkl file by the software), and 2) neural network weights modeling the distribution of density maps (parameterizing the function from z→V). Once trained, the software can reconstruct a 3D density map given a value of z.
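As a minimal sketch of what working with the latent embeddings can look like, the snippet below assumes that a z.pkl file is a Python pickle holding a NumPy array of shape (number of particles, latent dimension). The helper name load_latents, the file location, and the array sizes are hypothetical; a synthetic file is written first so that the example runs on its own.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np


def load_latents(path):
    """Load latent embeddings, assumed to be a pickled (n_particles, zdim) array."""
    with open(path, "rb") as f:
        z = np.asarray(pickle.load(f))
    if z.ndim != 2:
        raise ValueError(f"expected a 2D array, got shape {z.shape}")
    return z


# Stand-in for a real training output: 1000 particles, 8-dimensional latent space.
rng = np.random.default_rng(0)
z_path = Path(tempfile.mkdtemp()) / "z.pkl"
with open(z_path, "wb") as f:
    pickle.dump(rng.normal(size=(1000, 8)).astype(np.float32), f)

z = load_latents(z_path)
print(z.shape)  # (n_particles, zdim)
```

Each row of the array is the embedding of one particle image, so row i can be matched back to particle i in the input stack.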

How do you interpret the resulting distribution of structures? Since different datasets have diverse sources of heterogeneity (e.g. discrete vs. continuous), cryoDRGN contains a variety of automated and interactive tools to analyze the reconstructed distribution of structures. The starting point for analysis is the cryodrgn analyze pipeline, which generates a sample of 3D density maps and visualizations of the latent space. Specifically, the cryodrgn analyze pipeline will produce 1) N density maps sampled from different regions of the latent space (N=20, by default), 2) continuous trajectories along the principal component axes of the latent space embeddings, and 3) visualizations of the latent space with PCA and UMAP.
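The idea of a trajectory along a principal component axis can be pictured with a small NumPy sketch. This is not the code used by cryodrgn analyze; it assumes the latent embeddings are an (n_particles, zdim) array, uses synthetic data, and samples 10 points between the 5th and 95th percentiles of the first principal component. The function names and these numeric choices are illustrative assumptions.

```python
import numpy as np


def pca(z):
    """Return (mean, components, projections) of z via SVD; components are rows."""
    mean = z.mean(axis=0)
    centered = z - mean
    # Rows of vt are unit-length principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt, centered @ vt.T


def pc_trajectory(z, pc_index=0, n_points=10, lo=5.0, hi=95.0):
    """Latent-space points evenly spaced along one principal axis."""
    mean, components, projections = pca(z)
    start, stop = np.percentile(projections[:, pc_index], [lo, hi])
    steps = np.linspace(start, stop, n_points)
    # Move along the chosen axis while holding all other PCs at the mean.
    return mean + steps[:, None] * components[pc_index][None, :]


rng = np.random.default_rng(0)
# Synthetic embeddings: 500 particles, 8 latent dims, most variance in dim 0.
z = rng.normal(size=(500, 8)) * np.array([5.0, 1, 1, 1, 1, 1, 1, 1])
traj = pc_trajectory(z)
print(traj.shape)  # one latent vector per trajectory point
```

Each row of traj is a value of z; decoding the rows in order with a trained model would give a sequence of density maps along that axis.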

CryoDRGN also provides interactive tools to further explore the learned ensemble, implemented as Jupyter notebooks with interactive widgets for visualizing the dataset, extracting particles, and generating more volumes. Additional tools are also available that can generate trajectories given user-defined endpoints and convert particle selections to .star files for further refinement in other tools. An overview of these functionalities will be demonstrated in the tutorial.

Furthermore, because the model is trained to reconstruct image heterogeneity, any non-structural image heterogeneity that is not captured by the image formation model (e.g. junk particles and artifacts) can be reflected in the latent embeddings. In practice, junk particles are often easily identified in the latent embeddings and can then be filtered out. A Jupyter notebook is provided to filter particle stacks.
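To make the filtering idea concrete, here is a hedged sketch that flags outlying embeddings and keeps the indices of the remaining particles. It is not the logic of the provided notebook: the distance-from-median criterion, the cutoff of 6 median absolute deviations, the synthetic data, and the output file name are all assumptions chosen for illustration, as is storing the selection as a pickled integer array.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np

rng = np.random.default_rng(0)
# Synthetic embeddings: 950 "good" particles near the origin, 50 "junk" far away.
good = rng.normal(size=(950, 8))
junk = rng.normal(loc=10.0, size=(50, 8))
z = np.vstack([good, junk])

# Hypothetical criterion: distance from the median embedding, with a robust cutoff.
dist = np.linalg.norm(z - np.median(z, axis=0), axis=1)
mad = np.median(np.abs(dist - np.median(dist)))
cutoff = np.median(dist) + 6.0 * mad
keep = np.flatnonzero(dist <= cutoff)  # indices of particles to retain

# Save the selection (assumed format: pickled integer index array).
out_path = Path(tempfile.mkdtemp()) / "ind_keep.pkl"
with open(out_path, "wb") as f:
    pickle.dump(keep, f)

print(f"kept {keep.size} of {len(z)} particles")
```

On real data, junk rarely separates this cleanly, so inspecting the latent space visually before choosing a criterion is usually the safer route.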

What settings should I use for training cryoDRGN networks? Common hyper-parameters when training a cryoDRGN model are: 1) the size of the neural network, which controls the capacity of the model, 2) the input image size, which bounds the resolution information and greatly impacts the training speed, and 3) the latent variable dimension, which is the bottleneck layer that bounds the expressiveness of the model. Together, these three parameters determine the expressiveness and complexity of the learned model. Based on experiments with many real datasets, we provide reasonable defaults and recommended settings of these parameters for training.

Input data requirements

  • Extracted single particle images (in .mrcs/.cs/.star/.txt format), ideally free of edge, ice, or hot pixel artifacts

  • For cryoDRGN1, a C1 consensus reconstruction with:

    • High-quality CTF parameters

    • High-quality image poses (also called particle alignments)

  • Image poses are not required for cryoDRGN2's ab initio reconstruction tools

Tutorial overview

See CryoDRGN EMPIAR-10076 Tutorial for a step-by-step guide to running cryoDRGN. This walkthrough of cryoDRGN analysis of the assembling ribosome dataset (EMPIAR-10076) covers all steps used to reproduce the analysis in Zhong et al., including:

  1. preprocessing of inputs,

  2. initial cryoDRGN training and explanation of outputs,

  3. particle filtering to remove junk particles,

  4. high-resolution cryoDRGN training,

  5. extracting particle subsets for traditional refinement, and

  6. generation of trajectories.

For an abbreviated overview of the steps for running cryoDRGN, see the GitHub README.

A protocols paper that describes the analysis of the assembling ribosome dataset is now published. See Kinman*, Powell*, Zhong* et al.

References

For a complete description of the method, see:

CryoDRGN: reconstruction of heterogeneous cryo-EM structures using neural networks
Ellen D. Zhong, Tristan Bepler, Bonnie Berger*, Joseph H Davis*
Nature Methods 2021, https://doi.org/10.1038/s41592-020-01049-4

An earlier version of this work appeared at the International Conference on Learning Representations (ICLR):

Reconstructing continuous distributions of protein structure from cryo-EM images
Ellen D. Zhong, Tristan Bepler, Joseph H. Davis*, Bonnie Berger*
ICLR 2020, Spotlight, https://arxiv.org/abs/1909.05215

CryoDRGN's ab initio reconstruction algorithms are described here:

CryoDRGN2: Ab Initio Neural Reconstruction of 3D Protein Structures From Real Cryo-EM Images
Ellen D. Zhong, Adam Lerer, Joseph H Davis, and Bonnie Berger
International Conference on Computer Vision (ICCV) 2021

A protocols paper that describes the analysis of the EMPIAR-10076 assembling ribosome dataset:

Uncovering structural ensembles from single particle cryo-EM data using cryoDRGN
Laurel Kinman, Barrett Powell, Ellen D. Zhong*, Bonnie Berger*, Joseph H Davis*
Nature Protocols 2023, https://doi.org/10.1038/s41596-022-00763-x

CryoDRGN-ET for heterogeneous subtomogram analysis:

Deep reconstructing generative networks for visualizing dynamic biomolecules inside cells
Ramya Rangan, Sagar Khavnekar, Adam Lerer, Jake Johnston, Ron Kelley, Martin Obr, Abhay Kotecha, Ellen D. Zhong
bioRxiv 2023, https://www.biorxiv.org/content/10.1101/2023.08.18.553799v1
Cover image: principal component trajectories and graph traversal trajectories of the pre-catalytic spliceosome (SI Video 4 and SI Video 3 from Zhong et al. 2021).