We provide installation instructions assuming an Anaconda environment for managing dependencies. Anaconda is a python package/environment manager which can handle complex dependencies between Python packages through the creation of python environments. We recommended creating a separate environment for cryodrgn to prevent any conflicts among dependencies with other software packages.
Compute/hardware requirements:
High performance linux workstation or cluster
NVIDIA GPUs
Dependencies:
python
pytorch
cudatoolkit
numpy
pandas
Additional dependencies for visualization:
matplotlib
seaborn
scipy 1.4.0+
scikit-learn
umap
jupterlab
ipywidgets
plotly and cufflinks
The software has been tested on Python 3.7-3.9 and pytorch 1.0-1.7, 1.12.
1) Install anaconda
Once your anaconda environment is activated, your anaconda environment should be indicated on the command line, e.g.:
(base) $
2) Setting up the cryoDRGN environment
First, create a new conda environment named cryodrgn (or renamed as appropriate):
(base) $ conda create --name cryodrgn python=3.11
Activate the environment. Your command prompt will usually indicate the environment you are in with (environment name) before the prompt:
(base) $ conda activate cryodrgn
(cryodrgn) $
3) Install cryoDRGN with pip
Option 1: Install with pip
(cryodrgn) $ pip install cryodrgn
Option 2: Install from the source code
Install pytorch and cudatoolkit into your new cryodrgn environment:
Replace the cudatoolkit version with the appropriate version of CUDA installed with the GPU drivers. You can check the CUDA version with nvidia-smi.
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01 Driver Version: 515.65.01 CUDA Version: 11.7 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:01:00.0 Off | N/A |
| N/A 41C P0 N/A / N/A | 5MiB / 4096MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1420 G /usr/lib/xorg/Xorg 4MiB |
+-----------------------------------------------------------------------------+
Don't forget to include -c pytorch to get the software from the official pytorch channel
Obtain cryodrgn source code by cloning the git repository, and then doing a pip install . in the checkout folder. This will also install dependencies that cryoDRGN depends on.
# Clone source code and install
(cryodrgn) $ git clone https://github.com/ml-struct-bio/cryodrgn.git
(cryodrgn) $ cd cryodrgn
(cryodrgn) $ pip install .
Once installed, you should be able to call the cryodrgn executable and see a list of commands:
(cryodrgn) $ cryodrgn -h
There is a small testing dataset in the source code that you can use to run cryodrgn and verify that all the dependencies were installed correctly:
(cryodrgn) $ cd [sourcecode directory]/tests
(cryodrgn) $ ./quicktest.sh
It should take ~20 seconds to run and reach a final loss around 0.08 in version 1.0 and 0.03 in version 1.1+. The output should look something like:
+ cryodrgn train_vae data/hand.mrcs -o output/toy_recon_vae --lr .0001 --seed 0 --poses data/hand_rot.pkl --zdim 10 --pe-type gaussian
2022-09-20 16:30:05 /home/vineetb/.conda/envs/cryodrgn/bin/cryodrgn train_vae data/hand.mrcs -o output/toy_recon_vae --lr .0001 --seed 0 --poses data/hand_rot.pkl --zdim 10 --pe-type gaussian
2022-09-20 16:30:05 Namespace(particles='/home/vineetb/cryodrgn/cryodrgn/tests/data/hand.mrcs', outdir='/home/vineetb/cryodrgn/cryodrgn/tests/output/toy_recon_vae', zdim=10, poses='/home/vineetb/cryodrgn/cryodrgn/tests/data/hand_rot.pkl', ctf=None, load=None, checkpoint=1, log_interval=1000, verbose=False, seed=0, ind=None, invert_data=True, window=True, window_r=0.85, datadir=None, lazy=False, preprocessed=False, max_threads=16, tilt=None, tilt_deg=45, num_epochs=20, batch_size=8, wd=0, lr=0.0001, beta=None, beta_control=None, norm=None, amp=True, multigpu=False, do_pose_sgd=False, pretrain=1, emb_type='quat', pose_lr=0.0003, qlayers=3, qdim=1024, encode_mode='resid', enc_mask=None, use_real=False, players=3, pdim=1024, pe_type='gaussian', feat_sigma=0.5, pe_dim=None, domain='fourier', activation='relu', func=<function main at 0x7f47d8676790>)
2022-09-20 16:30:06 Use cuda True
2022-09-20 16:30:06 Loading dataset from /home/vineetb/cryodrgn/cryodrgn/tests/data/hand.mrcs
2022-09-20 16:30:06 Loaded 100 64x64 images
2022-09-20 16:30:06 Windowing images with radius 0.85
2022-09-20 16:30:06 Computing FFT
2022-09-20 16:30:06 Spawning 16 processes
2022-09-20 16:30:06 Symmetrizing image data
2022-09-20 16:30:06 Normalized HT by 0 +/- 94.426513671875
2022-09-20 16:30:06 WARNING: No translations provided
2022-09-20 16:30:07 Using circular lattice with radius 32
2022-09-20 16:30:07 HetOnlyVAE(
(encoder): ResidLinearMLP(
(main): Sequential(
(0): Linear(in_features=3208, out_features=1024, bias=True)
(1): ReLU()
(2): ResidLinear(
(linear): Linear(in_features=1024, out_features=1024, bias=True)
)
(3): ReLU()
(4): ResidLinear(
(linear): Linear(in_features=1024, out_features=1024, bias=True)
)
(5): ReLU()
(6): ResidLinear(
(linear): Linear(in_features=1024, out_features=1024, bias=True)
)
(7): ReLU()
(8): Linear(in_features=1024, out_features=20, bias=True)
)
)
(decoder): FTPositionalDecoder(
(decoder): ResidLinearMLP(
(main): Sequential(
(0): Linear(in_features=202, out_features=1024, bias=True)
(1): ReLU()
(2): ResidLinear(
(linear): Linear(in_features=1024, out_features=1024, bias=True)
)
(3): ReLU()
(4): ResidLinear(
(linear): Linear(in_features=1024, out_features=1024, bias=True)
)
(5): ReLU()
(6): ResidLinear(
(linear): Linear(in_features=1024, out_features=1024, bias=True)
)
(7): ReLU()
(8): Linear(in_features=1024, out_features=2, bias=True)
)
)
)
)
2022-09-20 16:30:07 9814038 parameters in model
2022-09-20 16:30:07 6455316 parameters in encoder
2022-09-20 16:30:07 3358722 parameters in decoder
2022-09-20 16:30:07 Warning: z dimension is not a multiple of 8 -- AMP training speedup is not optimized
2022-09-20 16:30:08 # =====> Epoch: 1 Average gen loss = 1.10873, KLD = 3.386924, total loss = 1.108840; Finished in 0:00:01.346893
2022-09-20 16:30:09 # =====> Epoch: 2 Average gen loss = 0.740441, KLD = 8.101010, total loss = 0.740694; Finished in 0:00:00.542031
2022-09-20 16:30:09 # =====> Epoch: 3 Average gen loss = 0.535575, KLD = 10.675920, total loss = 0.535908; Finished in 0:00:00.541637
2022-09-20 16:30:10 # =====> Epoch: 4 Average gen loss = 0.368407, KLD = 13.592397, total loss = 0.368831; Finished in 0:00:00.541102
2022-09-20 16:30:11 # =====> Epoch: 5 Average gen loss = 0.234344, KLD = 16.974737, total loss = 0.234873; Finished in 0:00:00.544139
2022-09-20 16:30:11 # =====> Epoch: 6 Average gen loss = 0.140454, KLD = 19.307134, total loss = 0.141056; Finished in 0:00:00.541751
2022-09-20 16:30:12 # =====> Epoch: 7 Average gen loss = 0.0899037, KLD = 20.093284, total loss = 0.090530; Finished in 0:00:00.544378
2022-09-20 16:30:13 # =====> Epoch: 8 Average gen loss = 0.0641149, KLD = 20.714445, total loss = 0.064761; Finished in 0:00:00.543761
2022-09-20 16:30:13 # =====> Epoch: 9 Average gen loss = 0.0496119, KLD = 20.798030, total loss = 0.050260; Finished in 0:00:00.545478
2022-09-20 16:30:14 # =====> Epoch: 10 Average gen loss = 0.0408589, KLD = 21.089181, total loss = 0.041516; Finished in 0:00:00.548600
2022-09-20 16:30:15 # =====> Epoch: 11 Average gen loss = 0.0338546, KLD = 21.053582, total loss = 0.034511; Finished in 0:00:00.560902
2022-09-20 16:30:15 # =====> Epoch: 12 Average gen loss = 0.0290218, KLD = 21.509603, total loss = 0.029692; Finished in 0:00:00.549171
2022-09-20 16:30:16 # =====> Epoch: 13 Average gen loss = 0.0252569, KLD = 21.402734, total loss = 0.025924; Finished in 0:00:00.554416
2022-09-20 16:30:17 # =====> Epoch: 14 Average gen loss = 0.0222708, KLD = 21.686829, total loss = 0.022947; Finished in 0:00:00.549451
2022-09-20 16:30:17 # =====> Epoch: 15 Average gen loss = 0.0196031, KLD = 21.829715, total loss = 0.020284; Finished in 0:00:00.547152
2022-09-20 16:30:18 # =====> Epoch: 16 Average gen loss = 0.0175662, KLD = 21.648027, total loss = 0.018241; Finished in 0:00:00.548684
2022-09-20 16:30:19 # =====> Epoch: 17 Average gen loss = 0.0159719, KLD = 21.876881, total loss = 0.016654; Finished in 0:00:00.540562
2022-09-20 16:30:19 # =====> Epoch: 18 Average gen loss = 0.0147737, KLD = 21.754937, total loss = 0.015452; Finished in 0:00:00.541675
2022-09-20 16:30:20 # =====> Epoch: 19 Average gen loss = 0.0133148, KLD = 21.684366, total loss = 0.013991; Finished in 0:00:00.543656
2022-09-20 16:30:20 # =====> Epoch: 20 Average gen loss = 0.0124398, KLD = 21.621814, total loss = 0.013114; Finished in 0:00:00.543454
2022-09-20 16:30:21 Finished in 0:00:15.839427 (0:00:00.791971 per epoch)
You will want to verify that the output contains Use cuda True in the first few lines to ensure that cryoDRGN will be using your GPU for training.
Updating cryoDRGN
(cryodrgn) $ cd /path/to/git/repo/with/source/code
(cryodrgn) $ git checkout 3.4.3 # or `git pull origin main` to get the latest sw
(cryodrgn) $ pip install .
To keep multiple versions of cryoDRGN in parallel, you will need to create a new anaconda environment and re-install all the dependencies.
For most platforms, the installation typically consists of a shell script (e.g. ) that you execute on the command line which will prompt you to install and choose a base directory where all the downloaded software and environments will go.
See the official Anaconda documentation and follow their installation instructions .
To customize the installation line depending on your situation, look at Pytorch's .
Alternatively, if you prefer not to use git, you can directly download a ZIP file of the latest release from . For example, if the latest release is cryodrgn-3.4.0.zip:
To update to a later version, you need to obtain the updated software either with git checkout <version> or direct download from , then rerun $ pip install . in your cryodrgn anaconda environment:
If you run into any issues getting cryoDRGN installed, please file a , including all the commands you used and their output. This Issues page is also a good place to check first if you encounter problems running our software!