2025-10-30
Frédéric Jarlier, Julien Roméjon, Philippe Hupé, Laurent Jourdren, Quentin Duvert
Environment
Nextflow files
Goal of the pipeline:
loadData: runs wget to fetch fasta sequences from the UniProt website
fusionFasta: runs cat to concatenate 2 fasta files
mafft: runs mafft to align the 2 sequences within the fused output of fusionFasta
main.nf is responsible for chaining the processes
In main.nf you include or write your processes
In the workflow block you initialize the inputs, chain the processes, and set the output
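The three processes described above could be written in main.nf roughly as follows. This is only a sketch, not the authors' actual code: the output file names (fusion.fasta, aligned.fasta), the constant grouping key 'all', and the publishDir setting are assumptions made for illustration. The constant key is there so that a later groupTuple() can gather every downloaded file into a single list.

```nextflow
// Sketch only: file names, the 'all' key and publishDir are assumptions.

process loadData {
    tag "${sample}"                       // label each task with its sample name

    input:
    tuple val(sample), val(url)

    output:
    // constant key so groupTuple() can collect all fasta files together
    tuple val('all'), path("${sample}.fasta")

    script:
    """
    wget -O ${sample}.fasta "${url}"
    """
}

process fusionFasta {
    input:
    tuple val(key), path(fastas)          // fastas is the list of downloaded files

    output:
    path "fusion.fasta"

    script:
    """
    cat ${fastas} > fusion.fasta
    """
}

process mafft {
    publishDir params.outdir, mode: 'copy'   // copy the alignment into the results folder

    input:
    path fused

    output:
    path "aligned.fasta"

    script:
    """
    mafft ${fused} > aligned.fasta
    """
}
```

Each process only declares its inputs, outputs, and the command to run; the order of execution is decided by how the channels are connected in the workflow block.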
sample.csv (tab-separated):
sample	url
P10415	https://www.uniprot.org/uniprot/P10415.fasta
P01308	https://www.uniprot.org/uniprot/P01308.fasta
workflow {
    main:
    Channel.fromPath(params.input)
        | splitCsv(header: true, sep: '\t')
        | map { row -> [row.sample, row.url] }
        | set { input_ch }

    // loadData runs in parallel for all samples listed in sample.csv
    loadData(input_ch)
    loadData.out | groupTuple() | set { fastaList }

    /* Pass the list of files to concatenate to fusionFasta */
    fusionFasta(fastaList)

    /* Run the multiple alignment with mafft */
    mafft(fusionFasta.out)
}
nextflow.config:
/* Parameters for the pipeline */
params {
    input = 'sample.csv'
    outdir = "results"
}
/* Report rules */
timeline {
    enabled = true
    overwrite = true
    file = "${params.outdir}/pipeline_info/execution_timeline.html"
}
report {
    enabled = true
    overwrite = true
    file = "${params.outdir}/pipeline_info/execution_report.html"
}
nextflow run main.nf --input=sample.csv --outdir=results
Configuration reference: https://nextflow.io/docs/latest/reference/config.html
(base) fjarlier@clust-slurm-client:~/TP_intro$ module load mafft
(base) fjarlier@clust-slurm-client:~/TP_intro$ nextflow run main.nf
N E X T F L O W ~ version 25.04.7
Launching `main.nf` [condescending_liskov] DSL2 - revision: 33d6a8bc68
executor > local (4)
[5d/ec6219] loadData (P01308) [100%] 2 of 2 ✔
[a8/d2c870] fusionFasta (1) [100%] 1 of 1 ✔
[5e/0caee7] mafft (1) [100%] 1 of 1 ✔
work is an intermediate folder
results contains the published results
(base) fjarlier@clust-slurm-client:~/TP_intro$ cd pipeline_info
(base) fjarlier@clust-slurm-client:~/TP_intro$ ls
execution_report.html execution_timeline.html execution_trace.txt pipeline_dag.html
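The listing above also shows execution_trace.txt and pipeline_dag.html, which the timeline and report scopes alone do not produce. One plausible way to obtain them is to enable the trace and dag scopes in the configuration file. The fragment below is a sketch under that assumption; the file paths are chosen to match the listing and are not taken from the original configuration.

```nextflow
/* Hypothetical additions to the config; paths are assumptions */
trace {
    enabled = true
    overwrite = true
    file = "${params.outdir}/pipeline_info/execution_trace.txt"
}
dag {
    enabled = true
    overwrite = true
    file = "${params.outdir}/pipeline_info/pipeline_dag.html"
}
```

The trace file is a tab-separated table with one row per task, and the DAG file is a rendering of the process graph, so the two complement the timeline and report pages.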