Daniel Vaulot
2024-01-24
Introduction to Roscoff ABIMS server
Launch MobaXterm
Enter information in new session
Rename session
The password is saved in the session (save passwords = always)
Launch WinSCP
Create new site
Our project is located in /shared/projects/geek_simple_laby
Scripts go in the script folder, in a subfolder under your own name (script/sandra).
# Project working directory usage at ABiMS
ABiMS provides a backup on your project directory.
To take advantage of this process, you have to follow some rules:
- Only the subdirectories ‘archive’, ‘script’ and ‘finalresult’ are backed up.
- You must place these subdirectories at the root of your project folder.
- Please be smart in your backups for our finances and the planet.
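The backup rules above can be sketched as a few commands that create the three backed-up subdirectories at the project root. This is only an illustration: it works in a temporary directory (the `PROJECT_ROOT` value is a stand-in), whereas on the server the root would be /shared/projects/geek_simple_laby, and `USER_NAME` is simply the example name used earlier.

```bash
#!/bin/bash
# Stand-in for the project root (assumption for this demo only).
# On the server it would be /shared/projects/geek_simple_laby
PROJECT_ROOT="$(mktemp -d)"
USER_NAME="sandra"   # example user folder from the text

# Only these three subdirectories, placed at the project root, are backed up
for SUB in archive script finalresult; do
    mkdir -p "${PROJECT_ROOT}/${SUB}"
done

# Personal folder for your own scripts
mkdir -p "${PROJECT_ROOT}/script/${USER_NAME}"

ls "${PROJECT_ROOT}"
```

Anything stored outside these three folders would not be covered by the backup.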
What we are going to do:
>pacbio;3d4a51f4d1;Aplanochytrium_sp.;size=120;
AGCTCCAATAGCGTATATTAAAGTTGTTGCAGTTAAAAAGCTCGTAGTTGGATTTCTGGTAGGAGCGACCGTGCCGAACTTGATTGTTCGTGTATTGTGTTGTCTTCAGCCATCCTCGT
GGAGAACTTTTCTAACATTAACTTGTTGGGATTGGGACCCGCGTCGTTTACTGTGAAAAAATTAGAGTGTTTAAAGCAGGCATTAGCTTGAATACATTAGCATGGAATAATAAGATAGG
ACTTTGGTACTATTTTGTTGGTTTGCATACCAAATTAATGATCAACAGGAACAGTTTGAGGATATTCGTATGAACATGTCAGAGGTGAAATTCTTGGATTTTGATCAGACGAACTACTG
CGAAAGCATTTATCAAGGATGTTTTCATTAATCAAGAACGAAAGTTAGGGGATCGAA...
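Before clustering, it can help to check how many sequences a FASTA file contains and how many reads they represent (the `size=` field in the header). A minimal sketch, using a tiny demo file: the first header is taken from the example above, the second header and both truncated sequences are made up for illustration.

```bash
#!/bin/bash
# Build a small demo FASTA file (second record is invented for the example)
FASTA="$(mktemp)"
cat > "${FASTA}" <<'EOF'
>pacbio;3d4a51f4d1;Aplanochytrium_sp.;size=120;
AGCTCCAATAGCGTATATTAAAG
>pacbio;0000000000;Example_sp.;size=5;
AGCTTCAATAGCATATACTAACG
EOF

# Each sequence starts with a header line beginning with ">"
N_SEQS=$(grep -c '^>' "${FASTA}")

# Extract the number after "size=" from each header and add them up
TOTAL_READS=$(grep '^>' "${FASTA}" \
    | sed -E 's/.*size=([0-9]+).*/\1/' \
    | awk '{s += $1} END {print s}')

echo "${N_SEQS} sequences representing ${TOTAL_READS} reads"
```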
Two parts:
#!/bin/bash
#SBATCH -p fast # Partition: fast, long or bigmem
#SBATCH --cpus-per-task 4
#SBATCH --mem-per-cpu 4GB # RAM per CPU core (4 cores x 4GB = 16GB in total)
#SBATCH -t 6-0:00 # maximum job duration (D-HH:MM)
#SBATCH -o slurm.%N.%j.out # STDOUT
#SBATCH -e slurm.%N.%j.err # STDERR
#SBATCH --mail-user=vaulot@sb-roscoff.fr # ! Replace with uio email
#SBATCH --mail-type=BEGIN,END,FAIL
# Submitted with
# cd /shared/home/csim/daniel # ! Change to your directory
# sbatch sbatch_cluster_01.sh
module load vsearch
cd /shared/home/csim/daniel # ! Change to your directory
vsearch --cluster_fast "Labyrinthulomycetes.pacbio.fasta" \
--threads 4 \
--id 0.99 \
--uc clusters_0.99_Labyrinthulomycetes.pacbio.tsv \
--sizeout \
--centroids clusters_0.99_Labyrinthulomycetes.pacbio.centroids.fasta \
--clusterout_sort \
--clusterout_id
Need to edit: email and directories
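Once the email and directories are edited, submission and monitoring could look like the sketch below. It assumes the script is saved as sbatch_cluster_01.sh (the name used in the script comments) and that you are on a machine where SLURM is installed; the `STATUS` variable is only there so that the sketch does something sensible elsewhere.

```bash
#!/bin/bash
JOB_SCRIPT="sbatch_cluster_01.sh"   # name taken from the script comments

if ! command -v sbatch >/dev/null 2>&1; then
    # Not on a SLURM cluster: nothing can be submitted from here
    echo "sbatch not found: run this on the cluster"
    STATUS="no_slurm"
elif [ ! -f "${JOB_SCRIPT}" ]; then
    echo "${JOB_SCRIPT} not found: cd to your script directory first"
    STATUS="no_script"
else
    # --parsable makes sbatch print the job ID instead of a sentence
    JOB_ID="$(sbatch --parsable "${JOB_SCRIPT}")"
    echo "Submitted job ${JOB_ID}"
    # List your own pending and running jobs
    squeue -u "${USER}"
    STATUS="submitted"
fi
```

A job that was submitted by mistake can be stopped with `scancel` followed by the job ID.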
Three states:
You should also get two emails:
slurm.cpu-node-050.36263527.err
slurm.cpu-node-050.36263527.out
clusters_0.99_Labyrinthulomycetes.pacbio.centroids.fasta
clusters_0.99_Labyrinthulomycetes.pacbio.tsv
Output of the program vsearch
vsearch v2.22.1_linux_x86_64, 251.3GB RAM, 256 cores
https://github.com/torognes/vsearch
Reading file Labyrinthulomycetes.pacbio.fasta 100%
1327126 nt in 289 seqs, min 4174, max 5443, avg 4592
Masking 100%
Sorting by length 100%
Counting k-mers 100%
Clustering 100%
Sorting clusters 100%
Writing clusters 100%
Clusters: 78 Size min 1, max 29, avg 3.7
Singletons: 41, 14.2% of seqs, 52.6% of clusters
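The summary figures in this log can be checked by hand. The sketch below recomputes them from the three counts reported by vsearch (289 sequences, 78 clusters, 41 singletons); the variable names are only illustrative.

```bash
#!/bin/bash
# Counts copied from the vsearch log above
N_SEQS=289
N_CLUSTERS=78
N_SINGLETONS=41

# awk is used because plain shell arithmetic only handles integers
AVG_SIZE=$(awk -v n="${N_SEQS}" -v c="${N_CLUSTERS}" \
    'BEGIN {printf "%.1f", n / c}')
PCT_SEQS=$(awk -v s="${N_SINGLETONS}" -v n="${N_SEQS}" \
    'BEGIN {printf "%.1f", 100 * s / n}')
PCT_CLUSTERS=$(awk -v s="${N_SINGLETONS}" -v c="${N_CLUSTERS}" \
    'BEGIN {printf "%.1f", 100 * s / c}')

echo "Average cluster size: ${AVG_SIZE}"
echo "Singletons: ${PCT_SEQS}% of seqs, ${PCT_CLUSTERS}% of clusters"
```

A singleton is a cluster containing a single sequence, which is why more than half of the clusters account for only about 14% of the sequences.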
>pacbio;1a9244a2e6;Thraustochytriaceae_X_sp.;clusterid=74;size=29
AGCTTCAATAGCATATACTAACGTTGTCGCAGTTAAAAAGTTCGTAGTTGAATTTCTGGTAGGAGTGACCTGGCCTTTTA
CGTTTGTAATTGTATGCTGTGTGTTATCTCTGGCCATCCTGAATCTGCTTTGTTGTAGATTCTCACATACTGTAAAAAAA
TTAGAGTGTTTAAAGCATTTCGTATGAAAAGAATACATCTTATGGGATATCAAAATAGGATTTTGGTGCTATTTTGTTGG
TTTGCACACCAAAATAATGATTAACAGGGACAGTTGGGGGTATTTGTATTTAATTGTCAGAGGTGAAATTCTTGGATTTA
TGAAAGACAAACTACTGCGAAAGCATTTATCAAGGATGTTTTCATTAATCATGAACGAAAGTTAGGGGATCGAAGATGAT
CAGATACCATCGTAGTCTTAACAGTAAACTATACCAACTTGCGATTATTCCATGGTGTTTTTTGCCAGGAGTAGCAGCAC
S 0 5443 * * * * * pacbio;4b0f43a6e4;Labyrinthulomycetes_LAB8_sp.;size=3; *
H 0 5436 99.4 + 0 0 1120MI12M7I673MI289M3D174M2D1534M3I1629M pacbio;e3f28aa0af;Labyrinthulomycetes_LAB8_sp.;size=4; pacbio;4b0f43a6e4;Labyrinthulomycetes_LAB8_sp.;size=3;
H 0 5436 99.4 + 0 0 1120MI12M7I673MI289M3D174M2D1534M3I1629M pacbio;70ee1479b8;Labyrinthulomycetes_LAB8_sp.;size=3; pacbio;4b0f43a6e4;Labyrinthulomycetes_LAB8_sp.;size=3;
S 1 5230 * * * * * pacbio;86192ef156;Labyrinthulomycetes_LAB8_sp.;size=15; *
H 1 5229 99.8 + 0 0 1267MI530MD301MI3130M pacbio;e54d7a85c5;Labyrinthulomycetes_LAB8_sp.;size=4; pacbio;86192ef156;Labyrinthulomycetes_LAB8_sp.;size=15;
H 1 5227 99.5 + 0 0 1120MI12M7I634M3D24MD305MD3127M pacbio;970a95784e;Labyrinthulomycetes_LAB8_sp.;size=7; pacbio;86192ef156;Labyrinthulomycetes_LAB8_sp.;size=15;
H 1 5227 99.5 + 0 0 1120MI12M7I634M3D24MD305MD3127M pacbio;d23a7e7d59;Labyrinthulomycetes_LAB8_sp.;size=2; pacbio;86192ef156;Labyrinthulomycetes_LAB8_sp.;size=15;
H 1 5226 99.7 + 0 0 1267MI555MI275MI39MI3090M pacbio;abfdcca79c;Labyrinthulomycetes_LAB8_sp.;size=9; pacbio;86192ef156;Labyrinthulomycetes_LAB8_sp.;size=15;
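In this tab-separated file, the first column is the record type (S for a cluster centroid, H for a sequence assigned to a cluster) and the second column is the cluster number. A minimal sketch of how the members of each cluster could be counted with awk follows; the demo file mimics the layout above with shortened, made-up labels (seqA to seqE) and `*` in place of the alignment strings.

```bash
#!/bin/bash
# Small demo file in the same column layout (labels are placeholders)
UC="$(mktemp)"
{
    printf 'S\t0\t5443\t*\t*\t*\t*\t*\tseqA\t*\n'
    printf 'H\t0\t5436\t99.4\t+\t0\t0\t*\tseqB\tseqA\n'
    printf 'H\t0\t5436\t99.4\t+\t0\t0\t*\tseqC\tseqA\n'
    printf 'S\t1\t5230\t*\t*\t*\t*\t*\tseqD\t*\n'
    printf 'H\t1\t5229\t99.8\t+\t0\t0\t*\tseqE\tseqD\n'
} > "${UC}"

# Count S and H records per cluster number (column 2)
CLUSTER_COUNTS=$(awk -F'\t' \
    '$1 == "S" || $1 == "H" {n[$2]++} END {for (c in n) print c "\t" n[c]}' \
    "${UC}" | sort -n)

echo "${CLUSTER_COUNTS}"
```

On a real output file, `${UC}` would be replaced by clusters_0.99_Labyrinthulomycetes.pacbio.tsv.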
If something goes wrong:
slurm.cpu-node-050.36263532.err
module load vsearch
DIR="/shared/home/csim/daniel/" # ! Change to your directory
FILE_HEAD="Labyrinthulomycetes.pacbio"
IDENTITY="0.99"
THREADS=4
cd "${DIR}"
vsearch --cluster_fast "${FILE_HEAD}.fasta" \
--threads "${THREADS}" \
--id "${IDENTITY}" \
--uc clusters_${IDENTITY}_${FILE_HEAD}.tsv \
--msaout clusters_${IDENTITY}_${FILE_HEAD}.align.fasta \
--sizeout \
--centroids clusters_${IDENTITY}_${FILE_HEAD}.centroids.fasta \
--clusterout_sort \
--clusterout_id
Loop through files
Loop through parameters
Work directly on files
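Looping through files and through parameters can be combined with nested `for` loops. The sketch below is a dry run: it only prints the vsearch commands rather than running them, it works on empty demo files in a temporary directory, and the second file name and the 0.97 identity value are invented for the example.

```bash
#!/bin/bash
# Demo directory with two empty FASTA files (second name is made up)
WORK_DIR="$(mktemp -d)"
touch "${WORK_DIR}/Labyrinthulomycetes.pacbio.fasta" \
      "${WORK_DIR}/Example_group.pacbio.fasta"

N_RUNS=0
# Outer loop: every FASTA file in the directory
for FASTA in "${WORK_DIR}"/*.fasta; do
    # Strip the directory and the .fasta extension
    FILE_HEAD="$(basename "${FASTA}" .fasta)"
    # Inner loop: every identity threshold to test
    for IDENTITY in 0.97 0.99; do
        UC_FILE="clusters_${IDENTITY}_${FILE_HEAD}.tsv"
        # Dry run: remove "echo" to actually launch vsearch
        echo vsearch --cluster_fast "${FASTA}" \
            --id "${IDENTITY}" --uc "${UC_FILE}"
        N_RUNS=$((N_RUNS + 1))
    done
done

echo "${N_RUNS} commands printed"
```

Printing the commands first is a cheap way to check file names and parameters before submitting a long job.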
tar/zip/unzip: compress files/directories
wget: download files from internet
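As an illustration of the tar entry above, here is a minimal sketch that compresses a directory, lists the archive content and extracts it again. The directory and file names are placeholders created in a temporary location.

```bash
#!/bin/bash
# Demo directory with one small file (names are placeholders)
WORK_DIR="$(mktemp -d)"
mkdir -p "${WORK_DIR}/results"
echo "demo" > "${WORK_DIR}/results/clusters.tsv"

# c = create, z = gzip compression, f = archive file name
# -C changes to WORK_DIR first so paths in the archive stay relative
tar -czf "${WORK_DIR}/results.tar.gz" -C "${WORK_DIR}" results

# t = list the content without extracting
tar -tzf "${WORK_DIR}/results.tar.gz"

# x = extract, here into a separate folder
mkdir -p "${WORK_DIR}/restored"
tar -xzf "${WORK_DIR}/results.tar.gz" -C "${WORK_DIR}/restored"
```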