Maxime Garcia

[@gau](https://twitter.com/gau)
[@MaxUlysse](https://github.com/MaxUlysse)
[maxulysse.github.io/jobim2020](https://maxulysse.github.io/jobim2020)

[JOBIM 2020](https://jobim2020.sciencesconf.org/) - Montpellier, France [virtual] - 2020/07/01

---

[![Barntumörbanken](https://maxulysse.github.io/assets/img/svg/barntumorbanken_logo.svg "Barntumörbanken")](https://ki.se/forskning/barntumorbanken)

The Swedish Childhood Tumor Biobank

[![KI](https://maxulysse.github.io/assets/img/svg/ki_logo.svg "KI")](https://ki.se)

Note:
* Working for The Swedish Childhood Tumor Biobank located at KI

---

[![NGI](https://maxulysse.github.io/assets/img/svg/ngi_logo.svg "NGI")](https://ngisweden.scilifelab.se/)

* State-of-the-art infrastructure
* Sequencing (DNA, RNA ...)
* Guidelines and support
* Sample collection, study design, protocol selection
* Bioinformatics analysis

Note:
* NGI is a sequencing facility used by researchers all over Sweden

===

[![SciLifeLab](https://maxulysse.github.io/assets/img/svg/scilifelab_logo.svg "SciLifeLab")](https://scilifelab.se/)

National centre for molecular biosciences with focus on health and environmental research

[![KI](https://maxulysse.github.io/assets/img/svg/ki_logo.svg)](https://ki.se/) | [![KTH](https://maxulysse.github.io/assets/img/svg/kth_logo.svg)](https://www.kth.se/) | [![SU](https://maxulysse.github.io/assets/img/svg/su_logo.svg)](https://www.su.se/) | [![UU](https://maxulysse.github.io/assets/img/svg/uu_logo.svg)](https://www.uu.se/)
:-:|:-:|:-:|:-:

Note:
* SciLifeLab is several infrastructures
* NGI collaborates a lot with NBIS the National Bioinformatics Infrastructure Sweden which is the local Elixir node

---

## Reproducibility is central

[![Figure 1](https://maxulysse.github.io/assets/img/slides/gigascience_giy077_fig1.jpg "figure 1")](https://academic.oup.com/view-large/figure/118918033/giy077fig1.jpg)

[10.1093/gigascience/giy077](https://doi.org/10.1093/gigascience/giy077)

Note:
* For me, as a bioinformatician it is a crucial matter

---

[![Nextflow](https://maxulysse.github.io/assets/img/slides/nextflow.png "Nextflow")](https://www.nextflow.io/)

* Workflow manager
* Data driven language
* Portable
  * executable on multiple platforms
* Shareable and reproducible
  * with containers or virtual environments

Note:
* Early adopters of Nextflow for its portability, shareability and of course reproducibility

===

## Data driven language

The execution graph depends on the input data, and is calculated on the go
In `snakemake` it's the other way around
The execution graph depends on the final target, and is calculated before launch
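
A toy Python sketch of this contrast, under loud assumptions: it is neither Nextflow nor Snakemake code, and the names `data_driven_tasks`, `target_driven_plan` and the `align`/`call`/`report` steps are hypothetical. In the first function, tasks appear as inputs arrive; in the second, the whole plan is resolved backwards from the requested target before anything runs.

```python
# Toy illustration only: not Nextflow or Snakemake code.
# All function, step and sample names are hypothetical.


def data_driven_tasks(samples):
    """Data driven: yield tasks as each input arrives (graph built on the go)."""
    for sample in samples:
        # The graph grows with the data: one align and one call task per sample.
        yield f"align:{sample}"
        yield f"call:{sample}"


def target_driven_plan(target, rules):
    """Target driven: resolve the whole plan from the final target before launch."""
    plan = []

    def visit(step):
        # Depth-first walk so that dependencies come before the step itself.
        for dependency in rules.get(step, []):
            visit(dependency)
        if step not in plan:
            plan.append(step)

    visit(target)
    return plan


if __name__ == "__main__":
    print(list(data_driven_tasks(["sampleA", "sampleB"])))
    rules = {"report": ["call"], "call": ["align"], "align": []}
    print(target_driven_plan("report", rules))
```

In the data driven function the number of tasks is only known once the inputs have been seen, whereas the target driven function needs the full rule set up front.
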
Note:
* Execution graph = the way to link all the different tools within the pipeline
* For me, this is the main difference between Snakemake and Nextflow

===

## Portability

[www.nextflow.io/docs/latest/executor.html](https://www.nextflow.io/docs/latest/executor.html)

* Sun Grid Engine, SLURM, PBS/Torque, OAR ...
* AWS Batch, Kubernetes, Google Life Sciences

Note:
* Nextflow supports main schedulers on HPCs or in the cloud

===

## Reproducibility

[![Conda](https://maxulysse.github.io/assets/img/svg/conda_logo.svg)](https://docs.conda.io/) | [![Docker](https://maxulysse.github.io/assets/img/svg/docker_logo.svg)](https://www.docker.com/) | [![Singularity](https://maxulysse.github.io/assets/img/svg/singularity_logo.svg)](https://sylabs.io/singularity/)
:-:|:-:|:-:

Note:
* Nextflow supports container engines and virtual environments

---

[![nf-core](https://maxulysse.github.io/assets/img/svg/nf-core_logo.svg "nf-core")](https://nf-co.re/)

---
Note: 3 - 12 - 27 - 39
* NGI had been developing analysis pipelines for years and using a set of standards
* This helped other groups run the pipelines on their own
* Pipelines began to outgrow the SciLifeLab/NGI branding
* In late 2017, nf-core was created, thanks to Phil Ewels (NGI), Alex Peltzer (QBiC), Sven Fillinger (QBiC) and Andreas Wilm (A*STAR)
* All relevant pipelines were moved to this new GitHub organisation

===

[![nf-core](https://maxulysse.github.io/assets/img/svg/nf-core_logo.svg "nf-core")](https://nf-co.re/)

[https://nf-co.re/join](https://nf-co.re/join)

[@nf_core](https://twitter.com/nf_core)

[@nf-core](https://www.youtube.com/c/nf-core)

[nfcore.slack.com](https://nfcore.slack.com/)

[@nf-core](https://github.com/nf-core/)

Note:
* As a community we are active on Twitter
* We're using YouTube more and more, especially nowadays
* Always using Slack a lot (probably too much in my case)
* And of course GitHub

===

[https://nf-co.re/pipelines](https://nf-co.re/pipelines)
Note:
* Our website has a page for each pipeline that renders the documentation available on GitHub
* 24 released pipelines, and 14 in development
* Most recently released
* Most starred

---

## Pipeline requirements

[https://nf-co.re/developers/guidelines](https://nf-co.re/developers/guidelines)

* Nextflow based
* Common structure (based on the nf-core template)
* Stable release tags
* MIT license (can be used even in commercial settings)
* Software bundled for reproducibility
* Continuous Integration testing (e.g. GitHub Actions)

Note:
* Our community is closely tied to Nextflow

---

[![nf-core tools](https://maxulysse.github.io/assets/img/svg/nf-core-tools_logo.svg "nf-core tools")](https://github.com/nf-core/tools)

===

## A companion tool

[https://nf-co.re/tools](https://nf-co.re/tools)

* [nf-core list](https://nf-co.re/tools#listing-pipelines) - List available pipelines
* [nf-core launch](https://nf-co.re/tools#launch-a-pipeline) - Run pipeline with interactive prompts
* [nf-core download](https://nf-co.re/tools#downloading-pipelines-for-offline-use) - Download pipeline for offline use
* [nf-core licences](https://nf-co.re/tools#pipeline-software-licences) - List software licences in a pipeline
* [nf-core create](https://nf-co.re/tools#creating-a-new-workflow) - Create a new pipeline from the template
* [nf-core lint](https://nf-co.re/tools#linting-a-workflow) - Check pipeline code against guidelines
* ...

Note:
* We provide a companion tool to help with common tasks

===

## Making a pipeline

[https://nf-co.re/tools#creating-a-new-workflow](https://nf-co.re/tools#creating-a-new-workflow)

[![nf-core create](https://maxulysse.github.io/assets/img/slides/nf-core_create_jobim.png "nf-core create")](https://nf-co.re/tools#creating-a-new-workflow)

Note:
* nf-core create can also be used to create non-nf-core pipelines
* It helps to start from a minimal skeleton
* with at least a MultiQC process for reporting
* We're working on making it less specific to nf-core to help the whole Nextflow community

===

## Software dependencies

[![Bioconda](https://maxulysse.github.io/assets/img/svg/bioconda_logo.svg)](https://bioconda.github.io/) | [![Docker](https://maxulysse.github.io/assets/img/svg/docker_logo.svg)](https://www.docker.com/) | [![Singularity](https://maxulysse.github.io/assets/img/svg/singularity_logo.svg)](https://sylabs.io/singularity/)
:-:|:-:|:-:

* All tools are installed with Conda/Bioconda
  * Allows set up of a new environment
* Bundled into a Docker container
  * Nextflow can automatically download from DockerHub
* Built from the Docker container
  * Singularity images can solve HPC container problems

Note:
* We recommend publishing tools through Bioconda, and are involved in that community
* This setup allows for easy updates of tools, and easy use of all the different technologies depending on the system

---

## Configurations

All pipelines come with a sensible default configuration for a regular-sized HPC
[github.com/nf-core/configs](https://github.com/nf-core/configs/) allows configurations to be shared between pipelines for a specific HPC

* cpus, time and memory requirements
* scheduler
* queues
* environments
* path to common reference files
* ...
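
A minimal Python sketch of the layering idea, not the actual Nextflow configuration mechanism: pipeline defaults are overridden by a shared site profile, then by any per-run overrides. Every name and value here (`PIPELINE_DEFAULTS`, `SITE_PROFILE`, `resolve_config`, the queue name) is hypothetical.

```python
# Toy illustration only: real nf-core configs are Nextflow config files.
# All keys and values below are hypothetical.

PIPELINE_DEFAULTS = {"cpus": 2, "memory": "8 GB", "time": "4h", "executor": "local"}

# One shared profile per infrastructure, reusable by every pipeline.
SITE_PROFILE = {"executor": "slurm", "queue": "core", "memory": "16 GB"}


def resolve_config(defaults, site_profile, overrides=None):
    """Merge settings; later layers take precedence over earlier ones."""
    merged = dict(defaults)          # copy so the defaults stay untouched
    merged.update(site_profile)      # site-wide settings override defaults
    merged.update(overrides or {})   # per-run settings override everything
    return merged


if __name__ == "__main__":
    print(resolve_config(PIPELINE_DEFAULTS, SITE_PROFILE, {"cpus": 8}))
```

Because the site profile is kept separate from the pipeline defaults, the same profile can be passed to any pipeline's defaults without editing either.
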
Note:
* This allows anyone using an nf-core pipeline on an infrastructure, or anyone else on the same infrastructure, to easily run any nf-core pipeline

---

## Coming soon

* Full-sized dataset testing for pipeline releases on AWS
* Graphical user interface to launch pipelines
* Modules with `Nextflow DSL 2`

Note:
* We collaborate with the Nextflow developers, and are up to date with the latest developments
* Nextflow DSL 2 will allow for more modular pipelines, similar to Snakemake rules

---

## Core team

[![@alneberg](https://maxulysse.github.io/assets/img/slides/alneberg.jpeg)](https://github.com/alneberg) | [![@apeltzer](https://maxulysse.github.io/assets/img/slides/apeltzer.jpeg)](https://github.com/apeltzer) | [![@drpatelh](https://maxulysse.github.io/assets/img/slides/drpatelh.jpeg)](https://github.com/drpatelh) | [![@ewels](https://maxulysse.github.io/assets/img/slides/ewels.jpeg)](https://github.com/ewels)
:-:|:-:|:-:|:-:
Johannes Alneberg | Alexander Peltzer | Harshil Patel | Phil Ewels

[![@maxulysse](https://maxulysse.github.io/assets/img/slides/maxulysse.jpeg)](https://github.com/maxulysse) | [![@olgabot](https://maxulysse.github.io/assets/img/slides/olgabot.jpeg)](https://github.com/olgabot) | [![@sven1103](https://maxulysse.github.io/assets/img/slides/sven1103.jpeg)](https://github.com/sven1103) | [![@ggabernet](https://maxulysse.github.io/assets/img/slides/ggabernet.jpeg)](https://github.com/ggabernet)
:-:|:-:|:-:|:-:
Maxime Garcia | Olga Botvinnik | Sven Fillinger | Gisela Gabernet

The nf-core framework for community-curated bioinformatics pipelines [Nat Biotechnology (2020)](https://www.nature.com/articles/s41587-020-0439-x); [10.1038/s41587-020-0439-x](https://doi.org/10.1038/s41587-020-0439-x)

---

## Extensive statistics

[https://nf-co.re/stats](https://nf-co.re/stats)

[![nf-core stats](https://maxulysse.github.io/assets/img/slides/nf-core_stats_jobim.png "nf-core stats")](https://nf-co.re/stats)

Note:
* Phil loves to make stats

---

## Hackathons

[https://nf-co.re/events](https://nf-co.re/events)

[![Hackathon at Crick 2020](https://maxulysse.github.io/assets/img/slides/nf-core_hackathon_crick2020.jpg "Hackathon at Crick 2020")](https://nf-co.re/events/2020/hackathon-francis-crick-2020)

Next one is online from July 13th to 17th: [https://nf-co.re/events/2020/hackathon-july-2020](https://nf-co.re/events/2020/hackathon-july-2020)

---

## Stay at home message

===

* Facilities
  * Highly optimised pipelines with excellent reporting
  * Validated releases ensure reproducibility
* Users
  * Portable, documented and easy to use pipelines
  * Easy to share between different collaborators
* Developers
  * Companion templates and tools help to validate your code and simplify common tasks

---
## Any questions

* [nf-co.re](https://nf-co.re/)
* [maxulysse.github.io/jobim2020](https://maxulysse.github.io/jobim2020)
* [nf-co.re/join](https://nf-co.re/join)
* [nf-co.re/events/2020/hackathon-july-2020](https://nf-co.re/events/2020/hackathon-july-2020)
* [@nf-core](https://github.com/nf-core)
* [@nf_core](https://twitter.com/nf_core)
* [@nf-core](https://www.youtube.com/c/nf-core)
* [nfcore.slack.com](https://nfcore.slack.com/)