Running reconstruction experiments

Executing and analyzing a particle reconstruction model with cryoDRGN-AI

Using a submission script on a compute cluster

Training a reconstruction neural network is usually computationally intensive; we thus recommend using a high-performance compute cluster to run cryoDRGN-AI experiments.

For example, a submission script for a cluster that uses the Slurm job scheduler might look like:

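#!/bin/bash
# Partition, job name, and wall-time limit (8 hours)
#SBATCH --partition=cryoem
#SBATCH --job-name=cryodrgnai_abinit
#SBATCH -t 8:00:00
# One A100 GPU, one task with 16 CPU cores and 16 GB of memory per core
#SBATCH --gres="gpu:a100:1"
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --mem-per-cpu=16G

# Run from the scratch directory so that relative output paths resolve there
cd /scratch_dir/my_name/

# Run the reconstruction; results are written to 001_abinit/
cryodrgn abinit /data_dir/empiar_benchmark/particles.128.mrcs \
                --ctf /data_dir/empiar_benchmark/ctf.pkl -o 001_abinit/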

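The script, saved here as cryodrgnai_slurm.sh, would then be submitted as follows. Options given on the sbatch command line take precedence over the matching #SBATCH directives in the script: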

(drgnai-env) $ sbatch -t 8:00:00 -p cryoem -J cryodrgnai_test -o cryodrgnai_test.out cryodrgnai_slurm.sh

Examining model output

Epochs in cryoDRGN-AI are 1-indexed, as in the rest of cryoDRGN (but not the original implementation of cryoDRGN-AI, in which epochs were 0-indexed). This means that epochs are labelled 1...n instead of 0...n-1.

Outputs are structured in the same way as in other cryoDRGN reconstruction commands. The model produces the following types of files:

  • weights.<epoch>.pkl: checkpoint of model weights

  • pose.<epoch>.pkl: reconstructed image poses

  • z.<epoch>.pkl: reconstructed z-latent-space coordinates

  • reconstruct.<epoch>.pkl: a single reconstructed volume, using the image closest to the centroid in the z-latent-space in the case of a heterogeneous model

  • analyze.<epoch>/: subfolder containing the outputs of post-analyses produced by running the analyze command; the final training epoch is analyzed by default
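Because every checkpoint carries its epoch number in the filename, the most recent epoch in an output folder can be recovered from the filenames alone. The following is a minimal sketch, not part of cryoDRGN itself: latest_epoch is a hypothetical helper name, and it assumes only the weights.<epoch>.pkl naming pattern listed above. Since epochs are 1-indexed, a result of 0 means that no checkpoint was found.

```shell
#!/bin/bash
# Hypothetical helper (not a cryoDRGN command): print the highest epoch
# number for which a weights.<epoch>.pkl file exists in the given folder.
latest_epoch() {
    local outdir="$1" best=0 f epoch
    for f in "$outdir"/weights.*.pkl; do
        [ -e "$f" ] || continue        # the glob matched no files
        epoch="${f##*/weights.}"       # drop the path and the "weights." prefix
        epoch="${epoch%.pkl}"          # drop the ".pkl" suffix
        case "$epoch" in
            ''|*[!0-9]*) continue ;;   # skip names whose middle part is not a number
        esac
        if [ "$epoch" -gt "$best" ]; then
            best="$epoch"
        fi
    done
    echo "$best"
}
```

For example, latest_epoch 001_abinit/ prints the number of the last epoch saved in that folder, which is also the epoch that a default analyze.<epoch>/ subfolder would correspond to once training has finished.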

Restarting finished experiments

Experiments that have finished running can be continued for further epochs using the --load flag, which instructs abinit to use a weights.* model checkpoint as a starting point. Use -n to specify the new total number of epochs for the reloaded model, not the number of additional epochs.
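As an illustration of the epoch arithmetic, the sketch below builds a restart command for a run that stopped after 30 epochs and should train for 20 more. All values are hypothetical placeholders. It also assumes that --load takes the path to the checkpoint file and that the restarted run writes to the same output folder; check both against the abinit help text before relying on them. The script only prints the command so that it can be inspected first.

```shell
#!/bin/bash
# All values below are hypothetical placeholders; adjust them to your own run.
outdir="001_abinit"
finished_epochs=30   # last epoch of the finished run (epochs are 1-indexed)
extra_epochs=20      # number of further epochs to train

# -n expects the new TOTAL number of epochs, not the increment.
total_epochs=$((finished_epochs + extra_epochs))

# Assumption: --load is given the path of the weights.<epoch>.pkl checkpoint.
restart_cmd="cryodrgn abinit /data_dir/empiar_benchmark/particles.128.mrcs \
--ctf /data_dir/empiar_benchmark/ctf.pkl -o ${outdir}/ \
--load ${outdir}/weights.${finished_epochs}.pkl -n ${total_epochs}"

# Print the command for inspection instead of running it directly.
echo "$restart_cmd"
```

With these values the printed command ends in -n 50, not -n 20. The printed line can be pasted into a submission script in place of the original cryodrgn abinit call.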
