CryoDRGN-ET Subtomogram Analysis
How to perform heterogeneous reconstruction using cryo-ET subtomograms
CryoDRGN-ET for subtomogram analysis is available in cryoDRGN version 3.0.0 and later through additional flags passed to the train_vae command:
cryodrgn train_vae particles_from_M.star --encode-mode tilt --dose-per-tilt 2.93 --angle-per-tilt 3.0 --ctf ctf.pkl --poses pose.pkl --datadir /data/subtiltstacks/ --zdim 8 -n 50 --beta 0.025 -o my_output_directory/
Note that --encode-mode tilt as well as a given value for --dose-per-tilt are required to activate subtomogram analysis, while --angle-per-tilt is optional with a default value of 3.
We describe here a typical workflow for preparing tilt series inputs for use with cryoDRGN heterogeneous reconstruction, training a reconstruction model, and analyzing its outputs. See our preprint here for a description of the cryoDRGN-ET method and associated results.
Preprocessing
Export particles
CryoDRGN-ET expects 2D particle tilt series images in a .star file exported from Windows Warp/M or RELION5 --tomo. From RELION5, 2D particle tilt series images without CTF premultiplication should be extracted using the --no_ctf option. We are actively working to support CTF-corrected images exported from WarpTools, but this data type is not currently supported.
Windows Warp/M
If you intend to perform subvolume refinement or tomogram visualization, we recommend also exporting highly binned subvolumes to generate a 3D star file at this stage, since additional Warp/M processing could result in particle reordering (!), which would affect downstream refinement if filtering with cryoDRGN-ET.


Example 2D Warp/M star file:
RELION5
Export refined particles as 2D stacks using the additional --no_ctf argument. Consider using a larger box size to prevent CTF aliasing.

Although RELION5 exports 2D particle stacks rather than subvolumes, the resulting star file refers to the 3D tomogram coordinates. Following RELION5 extraction, 3D particle star files need to be converted to the 2D format used for cryoDRGN-ET.
Run the RELION5 3D-to-2D conversion utility, parse_relion, in your RELION5 project directory. The utility requires a particles.star file, the associated tomograms.star file, and the raw tilt dimensions as inputs. The tiltseries.star files are read from the relative paths provided in the tomograms.star file. To run the conversion outside of a project directory, create a symlink to make the tiltseries.star relative paths accessible.
(cryodrgn) $ cryodrgn_utils parse_relion -t Polish/jobxxx/tomograms.star -p Extract/jobxxx/particles.star --tilt-dim 4096 4096 -o particles_2d.star
Prepare cryoDRGN-ET input files
Here we will assume our tilt series images have been exported to the file particles_2d.star, and that we have already loaded a conda environment named cryodrgn with cryoDRGN v3.5.1+ installed. We will need to extract separate .pkl files containing CTF parameters and pose estimates from this .star file for use with cryoDRGN commands.
- Downsample images (if necessary) 
To optimize cryoDRGN training, you may first want to consider downsampling your images to a more manageable size. The downsampled pixel size should be chosen based on the quality of the consensus reconstruction and the resolution required to observe the types of changes you anticipate in your data. For example, a 6 Å Nyquist limit should be sufficient to observe changes in secondary structure. For cryo-ET data, a pixel size of up to 10 Å may be useful for particle classification based on low-resolution features.
Downsampled particle images need to maintain the same stack organization and ordering to prevent metadata mismatch. This is handled automatically by cryodrgn downsample when a star file is used as the input. 
(cryodrgn) $ cryodrgn downsample particles_2d.star -D 128 -o particles_2d.128.star --outdir downsampled_128
For subsequent commands, the star file will contain relative paths to the provided outdir location. Alternatively, you can use the full path to downsampled_128/ with the --datadir argument (and the same .star file) to use these downsampled subtilts from a different working directory.
Particles may also be downsampled at the time of extraction in RELION5 or Warp/M.
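As a rough guide to choosing -D, the sketch below relates box size, pixel size, and Nyquist limit for Fourier-cropped images. It is an illustration only and not part of cryoDRGN: the function names are hypothetical, the 1.54 Å/pixel and 280-pixel values are borrowed from the parse_star example in this guide, and rounding up to an even box size is an assumption about what the downsampling step expects.

```python
import math


def downsampled_apix(orig_apix: float, orig_box: int, new_box: int) -> float:
    """Pixel size (Å/pixel) after cropping a box from orig_box to new_box pixels in Fourier space."""
    return orig_apix * orig_box / new_box


def nyquist_resolution(apix: float) -> float:
    """Finest resolution (Å) representable at a given pixel size."""
    return 2.0 * apix


def box_for_nyquist(orig_apix: float, orig_box: int, target_nyquist: float) -> int:
    """Smallest even box size whose Nyquist limit is at or below target_nyquist (Å)."""
    # Require 2 * orig_apix * orig_box / new_box <= target_nyquist.
    min_box = 2.0 * orig_apix * orig_box / target_nyquist
    box = math.ceil(min_box)
    if box % 2:
        box += 1  # assumption: even box sizes are expected
    return min(box, orig_box)  # never upsample beyond the original box


apix_128 = downsampled_apix(1.54, 280, 128)
print(f"{apix_128:.2f} Å/pixel")                      # about 3.37 Å/pixel
print(f"{nyquist_resolution(apix_128):.2f} Å Nyquist")  # about 6.74 Å
print(box_for_nyquist(1.54, 280, 6.0))                # 144
```

With these example values, a box size of 128 gives a Nyquist limit slightly coarser than 6 Å, while 144 would reach it. Which one is appropriate depends on the resolution of your consensus reconstruction.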
- Parse additional pose and CTF information 
Obtaining the additional CTF parameter and pose estimate files can be done using the utility command installed as part of cryoDRGN:
(cryodrgn) $ cryodrgn parse_star particles_2d.star --ctf ctf.pkl --poses pose.pkl --Apix 1.54 -D 280
The command should be run on the Warp/M or converted RELION5 2D star file. Do not parse pose and CTF information from the star file written by cryodrgn downsample.
Note that this command requires you to specify the original box size and Å/pixel values if these are not listed in the .star file under fields such as _rlnImageSize and _rlnImagePixelSize. The values should match the output of particle extraction. If particles were downsampled during extraction, the downsampled pixel and box size should be used; however, if cryodrgn downsample was used instead, the original box and pixel size should be provided.
- Perform a sanity check using backprojection 
We can confirm our inputs were correctly parsed using traditional homogeneous reconstruction. This step will run 10x faster on a GPU compute node! Note that this command also requires extra metadata on how the tilts were collected.
Each particle must have at least --ntilts tilt images (10 by default). If this is not the case, you can generate an indices file that excludes particles with fewer than --ntilts tilts by providing the --force-ntilts argument at this stage. An indices .pkl file will be written to the backprojection output directory, and it can also be used for the subsequent training job.
(cryodrgn) $ cryodrgn backproject_voxel particles_2d.128.star --ctf ctf.pkl --poses pose.pkl --dose-per-tilt 3 -o 00_backproject_128
Note: For large datasets or limited GPU RAM, the --lazy argument will prevent out-of-memory errors.
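To make the --ntilts requirement concrete, here is a minimal sketch of the selection logic: count the tilt images belonging to each particle and keep the particles that have enough. This is not cryoDRGN's implementation. The function name and the toy labels are hypothetical, and the sketch assumes that every tilt image carries a per-particle label (for example, a shared group name in the 2D star file). How the resulting indices are stored on disk is handled by cryoDRGN itself when --force-ntilts is used.

```python
from collections import Counter


def particles_with_enough_tilts(group_labels, ntilts=10):
    """Return indices of particles (in order of first appearance) with >= ntilts images.

    group_labels: one label per tilt image; images of the same particle share a label.
    """
    counts = Counter(group_labels)
    # dict.fromkeys keeps the first-appearance order of each unique label
    particle_order = list(dict.fromkeys(group_labels))
    return [i for i, label in enumerate(particle_order) if counts[label] >= ntilts]


# Toy example: three particles with 12, 7, and 10 tilt images respectively
labels = ["p0"] * 12 + ["p1"] * 7 + ["p2"] * 10
print(particles_with_enough_tilts(labels, ntilts=10))  # [0, 2]
```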
CryoDRGN-ET training
Once we have obtained our input files, we are ready to train the reconstruction model on our tilt series. Here we use an example command using the files described above:
(cryodrgn) $ cryodrgn train_vae particles_2d.128.star --ctf ctf.pkl --poses pose.pkl --encode-mode tilt --dose-per-tilt 3.0 --zdim 8 -n 50 --beta 0.025 -o 01_trainvae_128/
In particular:
- --encode-mode tilt is required to properly treat tilt series data
- In our current experiments, we use a KL regularization weight of --beta 0.025. We recommend this setting as a starting point for all tilt series experiments!
- --dose-per-tilt and --angle-per-tilt are used for dose exposure correction. The default value of --angle-per-tilt is 3 degrees and is left off of the example command.
Training a model on 16,655 particles for 50 epochs on 1 A100 GPU took 3h, 38min.
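For intuition about why --dose-per-tilt matters, the sketch below evaluates one widely used form of exposure weighting, the filter of Grant and Grigorieff (2015), in which higher spatial frequencies are down-weighted as electron dose accumulates. Whether cryoDRGN-ET uses exactly this form and these constants is an assumption. The constants below are the published fit for 300 kV data, the function names are hypothetical, and the angle-dependent part controlled by --angle-per-tilt is not modeled.

```python
import math


def critical_exposure(spatial_freq: float) -> float:
    """Critical exposure (e-/Å^2) at a spatial frequency (1/Å); 300 kV fit from Grant & Grigorieff 2015."""
    return 0.245 * spatial_freq ** -1.665 + 2.81


def exposure_weight(spatial_freq: float, cumulative_dose: float) -> float:
    """Relative weight of a spatial frequency after a given cumulative dose (e-/Å^2)."""
    return math.exp(-cumulative_dose / (2.0 * critical_exposure(spatial_freq)))


dose_per_tilt = 3.0  # e-/Å^2 per tilt, as in the example command
for tilt_number in (1, 10, 20):  # hypothetical positions in the acquisition order
    dose = tilt_number * dose_per_tilt
    w_low = exposure_weight(1 / 20.0, dose)   # 20 Å features
    w_high = exposure_weight(1 / 6.0, dose)   # 6 Å features
    print(f"tilt {tilt_number:2d}: dose {dose:5.1f}  weight@20Å {w_low:.2f}  weight@6Å {w_high:.2f}")
```

In this picture, late tilts still contribute low-resolution information but carry little weight at high resolution, which is why the per-tilt dose needs to reflect how the data were collected.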
Analysis
Once a cryoDRGN-ET model has finished training, use cryodrgn analyze to visualize the latent space and generate volumes.
(cryodrgn) $ cryodrgn analyze output_directory 50 # or replace with a different epoch number
This portion of the analysis is similar to the workflow in single particle cryodrgn. See the EMPIAR-10076 tutorial for further documentation.
Additional tools
Landscape analysis: The commands cryodrgn analyze_landscape and cryodrgn analyze_landscape_full can be used to further analyze the landscape of reconstructed volumes (as opposed to the landscape of latent space coordinates). See the cryoDRGN landscape analysis tutorial for more information.
Particle selection: We implemented a standalone filtering tool to enable lasso selection from the UMAP representation outside of the Jupyter notebook. Run cryodrgn filter . from your results directory to launch an interactive plot in an X11 window. If running remotely, you must be connected with ssh -Y.

Star file filtering: Particle indices identified by cryoDRGN-ET can be used to filter 2D subtilt star files from Warp/M or 3D subvolume star files for downstream processing or visualization in ArtiaX. Filtering a 3D subvolume star file with cryodrgn_utils filter_star using the --micrograph-files or -m option produces a directory containing one star file per tomogram.
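Conceptually, index-based filtering keeps the rows whose positions appear in the selected indices, and the per-tomogram option then groups the survivors by tomogram. The sketch below illustrates that idea on toy data. It is not how cryodrgn_utils filter_star is implemented: the function names, the "tomogram" and "particle" keys, and the toy rows are all hypothetical, and it assumes one row per particle as in a 3D subvolume star file.

```python
import pickle
from collections import defaultdict


def filter_rows(rows, indices):
    """Keep the rows whose position is listed in indices, preserving the original order."""
    keep = {int(i) for i in indices}
    return [row for i, row in enumerate(rows) if i in keep]


def split_by_tomogram(rows, key="tomogram"):
    """Group rows by their tomogram name (one output group per tomogram)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key]].append(row)
    return dict(grouped)


# Toy stand-in for a parsed star file: one dict per particle row
rows = [
    {"tomogram": "tomo_01", "particle": "a"},
    {"tomogram": "tomo_01", "particle": "b"},
    {"tomogram": "tomo_02", "particle": "c"},
    {"tomogram": "tomo_02", "particle": "d"},
]

# Round-trip through pickle to mimic loading a saved selection
selected = pickle.loads(pickle.dumps([0, 2, 3]))

per_tomogram = split_by_tomogram(filter_rows(rows, selected))
for name, members in per_tomogram.items():
    print(name, [m["particle"] for m in members])
```

Because the selection is positional, it only makes sense against a star file whose particle order matches the one used for training, which is why the export step above warns about particle reordering.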
Experiment Workflow
Similarly to the reconstruction workflow for single particle analysis, we recommend an iterative process for training successive cryoDRGN models on a new dataset:
- First, train on downsampled images (e.g. D=128) as an initial pass to sanity check results and remove junk particles: 
(cryodrgn) $ cryodrgn train_vae particles_2d.128.star --ctf ctf.pkl --poses pose.pkl --datadir downsampled_128/ --encode-mode tilt --dose-per-tilt 3.0 --zdim 8 -n 50 --beta 0.025 -o my_output_directory/01_trainvae_128
- Validate selection and junk classifications with backprojections:
(cryodrgn) $ cryodrgn backproject_voxel particles_2d.128.star --poses pose.pkl --ctf ctf.pkl --tilt --ntilts 10 --dose-per-tilt 3 -o 02_backproject_128_selected_particles --lazy --ind 01_trainvae_128/selected_particles.pkl
- After validating selected particles (--ind selected_particles.pkl), train a new model excluding any junk:
(cryodrgn) $ cryodrgn train_vae particles_2d.128.star --ctf ctf.pkl --poses pose.pkl --datadir downsampled_128/ --encode-mode tilt --ind selected_particles.pkl --dose-per-tilt 3 --zdim 8 -n 50 --beta 0.025 -o my_output_directory/03_trainvae_selected.128
- Analyze heterogeneity and separate discrete states using:
  - Clustering methods available in output_dir/analyze.50/cryoDRGN_ET_viz.ipynb
  - Standalone lasso GUI: cryodrgn filter
- Optional - Filter the original 3D star file with the particles selected in cryoDRGN for further refinement in RELION/M:
(cryodrgn) $ cryodrgn_utils filter_star Extract/jobxxx/particles.star --ind 01_trainvae_128/sel_ind.pkl -o cryodrgn_particles.star
- Optional - Train a new cryoDRGN model on higher resolution images to better resolve conformational heterogeneity:
(cryodrgn) $ cryodrgn train_vae particles_2d.star --ctf ctf.pkl --poses pose.pkl --datadir Extract/jobxxx/Micrographs/ --encode-mode tilt --dose-per-tilt 3.0 --zdim 8 -n 50 --beta 0.025 -o my_output_directory/04_trainvae_256
Feedback
Please file a GitHub issue or contact Ellen (zhonge@princeton.edu) with any questions or feedback!