AlphaFold
Introduction
In short, AlphaFold is a groundbreaking AI system that is speeding up research in bioinformatics. It takes an amino acid sequence as input and predicts the three-dimensional structure of the corresponding protein, and it does so very efficiently.
Read more on the AlphaFold official website.
This section explains how to use AlphaFold on Elja.
Getting started
Due to Nvidia compatibility issues, Elja now requires you to run AlphaFold in a Conda environment.
Setting up the Conda environment
We start by initializing the Conda environment. These are the same steps as in the Conda section:
$ module use /hpcapps/lib-mimir/modules/all
$ module load Anaconda3/2022.05
$ conda config --add channels defaults
$ conda config --add channels bioconda
$ conda config --add channels conda-forge
$ conda config --set auto_activate_base false
$ conda init
$ bash # You can also log out and in again.
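To confirm that the channel configuration took effect, you can inspect the channel list. The sketch below wraps that check in a small function. `check_channels` is a hypothetical helper name, not part of Conda or Elja, and it assumes that `conda config --show channels` lists each channel on its own line as `- name`.

```shell
# check_channels is a hypothetical helper, not part of Conda or Elja.
# It reports any of the three channels added above that is missing
# from the output of `conda config --show channels`.
check_channels() {
    configured="$(conda config --show channels)" || return 1
    status=0
    for channel in defaults bioconda conda-forge; do
        # Assumes each configured channel is listed on its own line as "- name".
        if ! printf '%s\n' "$configured" |
            grep -E "^[[:space:]]*-[[:space:]]+${channel}[[:space:]]*\$" >/dev/null; then
            echo "missing channel: ${channel}"
            status=1
        fi
    done
    return "$status"
}
```

If the setup above succeeded, running `check_channels` should print nothing and return a zero exit status.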
Load AlphaFold
Once Conda is initialized and ready to use, we can load the AlphaFold module.
$ ml use /hpcapps/libbio-gpu/modules/all
$ ml load AlphaFold/2.3.1
Run AlphaFold on Elja
To run AlphaFold on Elja you can either run an interactive session or run a batch job.
Starting an interactive session
You can start an interactive session on a GPU node with the srun command. You can use the screen command or tmux to create a secondary terminal where your interactive session keeps running in the background.
$ srun --job-name "AlphaFold" --partition gpu-1xA100 --time 01:00:00 --pty bash
$ conda activate $env_path
$ run_alphafold.sh -d /AlphaFoldData/AlphaFold/data -o /hpcapps/source/alphafold_non_docker/dummy_test/ -f /hpcapps/source/alphafold_non_docker/example/query.fasta -t 2020-05-14
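Before starting a run, it can help to check the inputs. The following is a minimal sketch, not part of AlphaFold or Elja: `check_alphafold_inputs` is a hypothetical helper, and it assumes that `-f` expects a readable FASTA file whose first non-blank line starts with `>` and that `-t` expects a date in `YYYY-MM-DD` form, as in the example above.

```shell
# check_alphafold_inputs is a hypothetical helper, not part of AlphaFold or Elja.
# Usage: check_alphafold_inputs <fasta_file> <date>
check_alphafold_inputs() {
    fasta="$1"
    date_arg="$2"

    if [ ! -r "$fasta" ]; then
        echo "cannot read FASTA file: $fasta" >&2
        return 1
    fi

    # Assumption: a FASTA file begins with a header line starting with '>'.
    first_line="$(sed -n '/[^[:space:]]/{p;q;}' "$fasta")"
    case "$first_line" in
        '>'*) ;;
        *) echo "no FASTA header found in: $fasta" >&2; return 1 ;;
    esac

    # Assumption: the date has the form YYYY-MM-DD, as in the example (2020-05-14).
    case "$date_arg" in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
        *) echo "date is not in YYYY-MM-DD form: $date_arg" >&2; return 1 ;;
    esac
}
```

Call it with the same values you pass to `-f` and `-t` before invoking `run_alphafold.sh`. A non-zero return status means the run should not be started.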
Running AlphaFold with SBATCH
$ cat submit.slurm
#!/bin/bash
#SBATCH --mail-type=ALL
#SBATCH --mail-user=<MAIL> # for example uname@hi.is
#SBATCH --nodes=1 # number of nodes
#SBATCH --partition=gpu-1xA100
#SBATCH --time=1-00:00:00 # run for 1 day maximum
#SBATCH --output=slurm_job_output.log
#SBATCH --error=slurm_job_errors.log # Logs if job crashes
module use /hpcapps/libbio-gpu/modules/all
module load AlphaFold/2.3.1
conda activate $env_path
# Run the command
run_alphafold.sh -d /AlphaFoldData/AlphaFold/data -o /hpcapps/source/alphafold_non_docker/dummy_test/ -f /hpcapps/source/alphafold_non_docker/example/query.fasta -t 2020-05-14
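Once the script is saved, it can be submitted with `sbatch` and followed with `squeue`. The sketch below shows one way to capture the job ID at submission time. `submit_job` is a hypothetical helper name, while `sbatch --parsable` is a standard Slurm option that prints only the job ID (followed by `;cluster` on multi-cluster setups).

```shell
# submit_job is a hypothetical helper, not part of Slurm or Elja.
# It submits a batch script and prints the job ID.
submit_job() {
    script="$1"
    # --parsable prints "jobid" or "jobid;cluster" instead of a full sentence.
    submitted="$(sbatch --parsable "$script")" || return 1
    # Drop the optional ";cluster" suffix.
    echo "${submitted%%;*}"
}
```

For example, `job_id="$(submit_job submit.slurm)"` followed by `squeue -j "$job_id"` shows the state of the job. The log files named in the `--output` and `--error` lines of the script can be followed with `tail -f` while the job runs.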