CryoDRGN2 Ab Initio Reconstruction

How to use the commands cryodrgn abinit_homo and cryodrgn abinit_het

There are two commands for ab initio reconstruction: cryodrgn abinit_homo for homogeneous reconstruction and cryodrgn abinit_het for heterogeneous reconstruction:

# homogeneous ab initio reconstruction
(cryodrgn) $ cryodrgn abinit_homo -h

# heterogeneous ab initio reconstruction
(cryodrgn) $ cryodrgn abinit_het -h
  • Downsample your particles to a box size of 128 either with cryodrgn downsample or with other tools.

  • If you have a large dataset (>500k images), we recommend training on a subset of particles for initial testing. Use cryodrgn_utils select_random to select a random subset of particles.

    # get a random selection of 200k particles from a dataset of 1,423,124 particles
    (cryodrgn) $ cryodrgn_utils select_random 1423124 -n 200000 -o ind200k.pkl
    • You can then train on only this random subset by passing the argument --ind ind200k.pkl.

  • For reference, ab initio heterogeneous reconstruction on a dataset containing 218k 128x128 particles took 20 hours to train on a single V100 GPU.
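The preprocessing steps above can be chained into a short script. The following is a minimal sketch, not an official workflow: the file names (particles.mrcs, particles.128.mrcs, ctf.pkl, ind_subset.pkl), the output directory abinit_subset, and the helper functions run and prepare_subset are all hypothetical placeholders. It assumes that -D is the box-size flag of cryodrgn downsample; check cryodrgn downsample -h for your installed version. By default the script only prints the commands so you can review them first.

```shell
#!/usr/bin/env bash
# Sketch only: all file names and directories below are placeholders.
# DRY_RUN=1 (default) prints each command; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: prepare_subset <particles> <total_particle_count> <subset_size>
prepare_subset() {
    # 1. Downsample to a box size of 128 (assumes -D sets the new box size)
    run cryodrgn downsample "$1" -D 128 -o particles.128.mrcs
    # 2. Select a random subset of particle indices
    run cryodrgn_utils select_random "$2" -n "$3" -o ind_subset.pkl
    # 3. Train on the random subset only
    run cryodrgn abinit_het particles.128.mrcs --ctf ctf.pkl --zdim 8 \
        --ind ind_subset.pkl -o abinit_subset
}
```

For example, prepare_subset particles.mrcs 1423124 200000 prints the three commands for the dataset size used above. Rerun with DRY_RUN=0 once the printed commands look right.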

Example usage

# homogeneous reconstruction
(cryodrgn) $ cryodrgn abinit_homo [particles] --ctf [ctf.pkl] -o [output_directory]  >> output.log

# heterogeneous reconstruction
(cryodrgn) $ cryodrgn abinit_het [particles] --ctf [ctf.pkl] --zdim 8 -o [output_directory]  >> output.log

Note on training settings

  • The default translational search extent is +/- 10 pixels (--t-extent 10). If your particles are not well-centered, you can use a wider search extent, e.g. +/- 40 pixels (--t-extent 40).

  • Poses are updated every 5 epochs (--ps-freq 5) to alternate between pose search (slow) and standard cryodrgn1 training (fast) using the last iteration's poses.

  • The default pose search settings are not tuned for high accuracy alignments (a tradeoff of accuracy vs. compute speed). You can increase the resolution of the pose search with `

  • The default training time is 30 epochs. A typical use case is to run for 30 epochs, check the results (cryodrgn analyze), then extend training to 60 epochs. You can extend training by rerunning the same command with -n 60 --load latest. If your dataset is very large, you may want to reduce the pose search frequency (--ps-freq) and the number of epochs (-n).

  • During training, pose search epochs get successively slower. This is because the parameter --l-ramp-epochs 25 increases the maximum resolution used in pose search from a Fourier radius of 12 pixels (--l-start) to 32 pixels (--l-end) over the first 25 epochs of training.

    • Example training time course (1 V100 GPU)
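To get a feel for the resolution ramp described above, the sketch below estimates the maximum Fourier radius at a given epoch. It assumes a linear ramp with integer truncation and 0-indexed epochs. The text only states the start value, end value, and ramp length, so the exact schedule inside cryoDRGN may differ. The function lmax_at_epoch is a hypothetical helper, not part of the cryoDRGN command line tools.

```shell
# Hypothetical helper: estimated maximum Fourier radius (pixels) at an epoch.
# ASSUMPTION: linear ramp from l_start to l_end over ramp_epochs epochs,
# truncated to an integer; the real schedule may differ.
# Usage: lmax_at_epoch <epoch> [l_start] [l_end] [ramp_epochs]
lmax_at_epoch() {
    epoch="$1"
    l_start="${2:-12}"      # default of --l-start quoted in the text
    l_end="${3:-32}"        # default of --l-end quoted in the text
    ramp_epochs="${4:-25}"  # default of --l-ramp-epochs quoted in the text
    if [ "$epoch" -ge "$ramp_epochs" ]; then
        echo "$l_end"
    else
        echo $(( l_start + (l_end - l_start) * epoch / ramp_epochs ))
    fi
}

# Print the estimated radius every 5 epochs
for e in 0 5 10 15 20 25; do
    echo "epoch $e: Lmax = $(lmax_at_epoch "$e")"
done
```

Under this linear assumption the radius grows by 4 pixels every 5 epochs, which is consistent with pose search epochs becoming slower until the ramp ends at epoch 25.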

Questions and contact

If you have any questions about the method or software, please file a GitHub issue:

https://github.com/zhonge/cryodrgn/issues

Or post in the cryoDRGN Google Group: https://groups.google.com/g/cryodrgn.

Reference

CryoDRGN2 software was developed by Ellen Zhong & Adam Lerer with software support from Vineet Bansal. If you find the ab initio tools in cryoDRGN useful, please cite:

Zhong, Lerer, Davis, Berger. ICCV 2021.

https://openaccess.thecvf.com/content/ICCV2021/html/Zhong_CryoDRGN2_Ab_Initio_Neural_Reconstruction_of_3D_Protein_Structures_From_ICCV_2021_paper.html
