Getting started
How to get up and running with DRGN-AI
Installation
We recommend installing DRGN-AI in a clean conda environment. First clone the git repository, then use pip to install the package from source:
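The install commands here might look like the following sketch; the environment name, Python version, and repository URL are assumptions, so substitute your own clone path:

```shell
# Create and activate a clean conda environment (name is an arbitrary choice)
conda create --name drgnai python=3.9
conda activate drgnai

# Clone the repository and install the package from source
# (repository URL is an assumption -- use the URL of your own clone)
git clone https://github.com/ml-struct-bio/drgnai.git
cd drgnai
pip install .
```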
You can also install the latest development version of DRGN-AI if you have access to the drgnai repository:
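One way to do this, sketched here under the assumption that pip can reach the repository over git (URL and branch name are illustrative):

```shell
# Install the latest development version directly from the repository
# (URL and branch name are assumptions)
pip install "git+https://github.com/ml-struct-bio/drgnai.git@develop"
```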
To confirm that the package was installed successfully, use `drgnai test`:
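```shell
drgnai test
```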
You can also perform a more comprehensive test of the package and its installation using `pytest`. This will take the better part of an hour, so we recommend running it in the background, or on a CPU (not a GPU) compute node, e.g. if using a Slurm cluster:
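A sketch of both options; the Slurm partition name is an assumption, so use whatever CPU partition your cluster provides:

```shell
# Run the full test suite from the repository root (takes close to an hour)
pytest -v

# Or submit it to a CPU node on a Slurm cluster
# (partition name "cpu" is an assumption)
sbatch --partition=cpu --wrap="pytest -v"
```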
Setting up an experiment
To run an experiment, we first need to set up an experiment folder. For DRGN-AI to recognize a folder as an experiment folder, all it needs to contain is a `configs.yaml` file listing the parameters used by the reconstruction model for dataset acquisition and model training:
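A minimal `configs.yaml` might look like the following sketch. The file paths are illustrative assumptions; the `quick_config` keys and values are those described in the Quick Config section below:

```yaml
# configs.yaml -- example experiment configuration
# (particle and CTF file paths below are illustrative assumptions)
particles: /data/my_dataset/particles.mrcs
ctf: /data/my_dataset/ctf.pkl
quick_config:
  capture_setup: spa
  reconstruction_type: het
  conf_estimation: autodecoder
  pose_estimation: abinit
```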
For those unfamiliar with creating files and folders in a command-line setting, we have created the `drgnai setup` utility to assist you with the steps needed to set up a DRGN-AI experiment. For example, the above config file can be created and placed in a new experiment folder `your_workdir` using:
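For instance, a sketch using the `--particles` and `--ctf` flags described below (file paths are illustrative; run `drgnai setup -h` for the full list of flags):

```shell
# Create a new experiment folder containing a configs.yaml
# (input file paths are illustrative assumptions)
drgnai setup your_workdir \
    --particles /data/my_dataset/particles.mrcs \
    --ctf /data/my_dataset/ctf.pkl
```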
To begin an experiment, it is sufficient to specify an input dataset and the four `quick_config` parameters described below. Major parameters have their own flags for `drgnai setup`; the remainder can be added to `configs.yaml` using the `--cfgs` flag:
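A sketch of passing extra parameters through `--cfgs`, assuming it accepts `KEY=VALUE` strings; the exact syntax and the parameter names used here are assumptions, so check `drgnai setup -h`:

```shell
# Pass parameters without their own setup flags via --cfgs
# (the "z_dim" and "num_epochs" parameter names are hypothetical examples)
drgnai setup your_workdir \
    --particles /data/my_dataset/particles.mrcs \
    --ctf /data/my_dataset/ctf.pkl \
    --cfgs "z_dim=8" "num_epochs=30"
```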
Use `drgnai setup -h` to get a list of all parameters that have their own `setup` flags.
Quick Config
As a shortcut for the most important parameter settings, we have introduced the `quick_config` parameter for use in `configs.yaml`, which, uniquely amongst DRGN-AI parameters, contains four sub-parameters:
- `capture_setup`: For the moment, only single-particle imaging (`spa`) is supported.
- `reconstruction_type`: Whether we want to model a latent space for conformations (`het`) or instead do homogeneous reconstruction (`homo`).
- `conf_estimation`: If doing heterogeneous reconstruction, what type of model to use for conformations (`autodecoder` or `encoder`).
- `pose_estimation`: Whether to model poses from scratch (`abinit`), use known poses (`fixed`), or refine known poses (`refine`).
These are listed in a nested manner under the `quick_config` entry in a `configs.yaml` file, as demonstrated in the example above.
Input Datasets
We also have to specify the input dataset. A DRGN-AI experiment relies upon a stack of particles picked from a cryoEM imaging run; an input dataset for DRGN-AI thus consists of, at minimum, a file with the picked particles and the CTF parameters. There are multiple ways of telling DRGN-AI where these files are located:
1. Add `particles` and `ctf` entries to the `configs.yaml` file in your experiment folder, as well as a `datadir` entry if necessary when using a .star or .cs file particle stack.
2. Use the `--particles` and `--ctf` arguments (and `--datadir` if necessary) with the `drgnai setup` tool, which will place these entries in the configs file for you.
3. Set the `DRGNAI_DATASETS` environment variable to point to a file with an entry for the dataset.
We have already seen examples of the first two approaches; in the third approach, we create a file called e.g. `/home/drgnai-paths.yaml` that will contain dataset entries:
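A sketch of such a paths file; the dataset label and the file paths are illustrative assumptions:

```yaml
# /home/drgnai-paths.yaml -- dataset labels mapped to their input files
# (the "my_dataset" label and the paths are illustrative assumptions)
my_dataset:
  particles: /data/my_dataset/particles.mrcs
  ctf: /data/my_dataset/ctf.pkl
```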
We then set the environment variable:
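```shell
# Point DRGN-AI at the dataset paths file
export DRGNAI_DATASETS=/home/drgnai-paths.yaml
```

To make the setting persistent, you can add this line to your shell startup file (e.g. `~/.bashrc`).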
Now we can use the dataset labels defined in the paths file as shortcuts when using either `drgnai setup` or `configs.yaml`:
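For example, with the `my_dataset` label defined above, a `configs.yaml` sketch; the `dataset` key name is an assumption, and the label must match an entry in the paths file:

```yaml
# configs.yaml -- refer to the dataset by its label instead of full paths
# (the "dataset" key name and "my_dataset" label are assumptions)
dataset: my_dataset
```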