Eco-Evolutionary dynamics in Microbial Communities

Basecalling with albacore on the compute cluster

The albacore base caller is installed on the braid computing cluster at /sw/qbio/bin/read_fast5_basecaller.py. Since base calling requires substantial computational resources, it is best done on the cluster. A suitable submit script is located at /home/neher/submit_albacore.sh:

```bash
#!/bin/bash
# specify number of nodes and cores per node
#PBS -l nodes=1:ppn=8
# specify the time you expect the job to run hh:mm:ss
#PBS -l walltime=2:00:00
# specify the amount of memory needed
#PBS -l mem=16G
# output and error files
#PBS -o myout.o$PBS_JOBID
#PBS -e myout.e$PBS_JOBID

# load paths
source /home/neher/.bashrc

# move to current working directory
cd "$PBS_O_WORKDIR"

# call albacore: $1 = flow cell, $2 = kit, $3 = directory with reads
# -t 8 matches the 8 cores requested above
read_fast5_basecaller.py -f "$1" -k "$2" -i "$3" -s new_calls -t 8 -o fastq
#read_fast5_basecaller.py -f FLO-MIN107 -k SQK-RAD002 -i reads/fail/0 -s new_calls -t 8 -o fastq
```
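
The submit script hands its three positional arguments straight to albacore, so a missing argument or a mistyped read directory only shows up once the job has been scheduled. A minimal sketch of a check that could be run on the login node beforehand, assuming a hypothetical shell function named `check_albacore_args` (it is not part of submit_albacore.sh):

```bash
#!/bin/bash
# Hypothetical helper, not part of submit_albacore.sh.
# Checks that exactly three arguments are given (flow cell, kit, read
# directory) and that the read directory exists.
check_albacore_args() {
    if [ "$#" -ne 3 ]; then
        echo "usage: check_albacore_args FLOWCELL KIT READ_DIR" >&2
        return 1
    fi
    if [ ! -d "$3" ]; then
        echo "read directory not found: $3" >&2
        return 2
    fi
    return 0
}
```

The function only verifies the argument count and the directory. It does not know which flow cell or kit names albacore accepts, so those are still checked by albacore itself when the job runs.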

The submit script can be called via qsub as

qsub ../submit_albacore.sh -F "FLO-MIN107 SQK-RAD002 reads/pass/0"

where the three arguments specify the flow cell (FLO-MIN107), the sequencing kit (SQK-RAD002), and the directory with the reads to be called. My preliminary tests indicate that the cluster can call between 3 and 10 MB of fastq per minute when run on 16 CPUs.
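
The measured throughput can be turned into a rough walltime estimate for the `#PBS -l walltime` line. The sketch below is only back-of-the-envelope arithmetic: the function name `estimate_minutes` and the 600 MB figure are made up for illustration, and the rates come from runs on 16 CPUs, whereas the submit script requests 8, so real jobs may well be slower.

```bash
#!/bin/bash
# Rough estimate only: minutes needed to call a given amount of fastq
# (in whole MB) at a given throughput (in whole MB per minute), rounded up.
estimate_minutes() {
    local fastq_mb=$1
    local mb_per_min=$2
    echo $(( (fastq_mb + mb_per_min - 1) / mb_per_min ))
}

# Hypothetical batch expected to yield about 600 MB of fastq
estimate_minutes 600 3    # slow end of the measured range, prints 200
estimate_minutes 600 10   # fast end of the measured range, prints 60
```

At the slow end such a batch would take 200 minutes, which exceeds the 2:00:00 walltime in the submit script, so either the walltime would have to be raised or the reads split into smaller batches.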
