* [Ruotes](https://github.com/SciLifeLab/Sarek/releases/tag/2.1.0)
* [Skårki](https://github.com/SciLifeLab/Sarek/releases/tag/2.2.0)
* [Äpar](https://github.com/SciLifeLab/Sarek/releases/tag/2.3)
* [Ålkatj](https://github.com/nf-core/sarek/releases/tag/2.5)
* [Årjep-Ålkatjjekna](https://github.com/nf-core/sarek/releases/tag/2.5.1)
* [Jåkkåtjkaskajekna](https://github.com/nf-core/sarek/releases/tag/2.5.2)
* [Piellorieppe](https://github.com/nf-core/sarek/releases/tag/2.6)
* [Gådokgaskatjåhkkå](https://github.com/nf-core/sarek/releases/tag/2.6.1)
---
[![Sarek](https://maxulysse.github.io/assets/img/svg/nf-core_sarek_logo.svg "Sarek")](https://nf-co.re/sarek)
* Open-Source Nextflow Pipeline
* Started at NGI
* In collaboration with NBIS
* Support from BTB
* In collaboration with QBiC (Tübingen)
===
[![Nextflow](https://maxulysse.github.io/assets/img/slides/nextflow.png "Nextflow")](https://www.nextflow.io/)
* Workflow manager
* Data-driven language
* Portable
  * Executable on multiple platforms
* Shareable and reproducible
  * With containers or virtual environments
Note:
* Early adopters of Nextflow for its portability, shareability and of course reproducibility
---
## Multiple flavors
![Sarek](https://maxulysse.github.io/assets/img/svg/sarek_logo.svg "Sarek")
![Sarek](https://maxulysse.github.io/assets/img/svg/sarek-germline.svg "Sarek") | ![Sarek](https://maxulysse.github.io/assets/img/svg/sarek-somatic.svg "Sarek")
:-:|:-:
===
## WES and Targeted Sequencing
![](https://maxulysse.github.io/assets/img/svg/appleseq.svg "WGS, WES, and Targeted")
===
## Reference genomes
[AWS iGenomes](https://registry.opendata.aws/aws-igenomes/)
* GRCh37
* GRCh38
* GRCm38
---
## Preprocessing
[![GATKBP](https://maxulysse.github.io/assets/img/svg/gatk-bp_logo.svg "GATK Best Practices")](https://software.broadinstitute.org/gatk/best-practices/)
Based on GATK Best Practices (GATK 4.1.7.0)
* Reads mapped to reference genome with `bwa`
* Duplicates marked with `picard MarkDuplicates`
* Base quality scores recalibrated with `GATK BaseRecalibrator`
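
The three steps above could be chained roughly as follows. This is only a sketch that builds the command lines without running them; the helper name `preprocessing_commands`, the file names, and the thread count are assumptions for illustration and not how Sarek wires the steps internally.

```python
from typing import List


def preprocessing_commands(sample: str, ref: str, fq1: str, fq2: str,
                           known_sites: str, threads: int = 4) -> List[List[str]]:
    """Build (but do not run) one command per preprocessing step."""
    # Hypothetical intermediate file names, chosen only for this sketch.
    sorted_bam = f"{sample}.sorted.bam"
    marked_bam = f"{sample}.md.bam"
    return [
        # 1. Map reads; bwa mem writes SAM to stdout (sorting is omitted here).
        ["bwa", "mem", "-t", str(threads), ref, fq1, fq2],
        # 2. Mark duplicates and write a metrics file.
        ["gatk", "MarkDuplicates", "-I", sorted_bam,
         "-O", marked_bam, "-M", f"{sample}.md.metrics"],
        # 3. Build the base quality recalibration table from known variant sites.
        ["gatk", "BaseRecalibrator", "-I", marked_bam, "-R", ref,
         "--known-sites", known_sites, "-O", f"{sample}.recal.table"],
    ]


for cmd in preprocessing_commands("sample1", "genome.fasta",
                                  "r1.fastq.gz", "r2.fastq.gz", "dbsnp.vcf.gz"):
    print(" ".join(cmd))
```

Keeping the commands as lists makes each step easy to inspect or hand to `subprocess.run` later.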
===
## Germline Variant Calling
* SNVs and small indels
  * HaplotypeCaller
  * Strelka2
  * FreeBayes
  * mpileup
* Structural variants
  * Manta
  * TIDDIT
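
As a rough illustration of how a germline run with some of these callers might be launched, the sketch below assembles a Nextflow command line. The helper `sarek_command` and the sample sheet name are hypothetical; the `--input`, `--genome` and `--tools` parameters are assumed from the Sarek 2.x usage documentation and should be checked against the release actually used.

```python
from typing import List, Sequence


def sarek_command(input_tsv: str, genome: str, tools: Sequence[str],
                  profile: str = "docker", revision: str = "2.6.1") -> List[str]:
    """Assemble (but do not run) a Sarek launch command."""
    return [
        "nextflow", "run", "nf-core/sarek",
        "-r", revision,          # pin the pipeline release
        "-profile", profile,     # container or environment profile
        "--input", input_tsv,    # sample sheet (hypothetical file name below)
        "--genome", genome,      # reference genome key, e.g. GRCh38
        "--tools", ",".join(tools),
    ]


cmd = sarek_command("samples.tsv", "GRCh38",
                    ["HaplotypeCaller", "Strelka", "Manta", "TIDDIT"])
print(" ".join(cmd))
```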
===
## Somatic Variant Calling
* SNVs and small indels
  * Mutect2
  * Strelka2
  * FreeBayes
* Structural variants
  * Manta
* Sample heterogeneity, ploidy and CNVs
  * ASCAT
  * Control-FREEC
* Microsatellite instability
  * MSIsensor
===
## Annotation
* VEP and SnpEff
* ClinVar, COSMIC, dbSNP, GENCODE, gnomAD, PolyPhen, SIFT, etc.
===
## Reports
[![MultiQC](https://maxulysse.github.io/assets/img/svg/multiqc_logo.svg "MultiQC")](https://multiqc.info/)
---
## Workflow
[![Sarek Workflow](https://maxulysse.github.io/assets/img/svg/sarek_workflow_2.6.1.svg "Sarek Workflow 2.6.1")](https://github.com/nf-core/sarek/releases/tag/2.6.1)
---
## Prioritization
* First step towards clinical use
* Rank scores are computed for all variants
  * COSMIC, ClinVar, SweFreq and MSK-IMPACT (cancerhotspots.org)
* Findings are ranked
  * Well-known, high-impact variants
  * Variants in known cancer-related genes
  * Remaining variants
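
The three-tier ranking above can be pictured with a small sketch. It is not the scoring Sarek actually uses: the `Variant` fields, the `tier` rule, and the example variants are all hypothetical and only illustrate the ordering idea.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Variant:
    name: str
    well_known: bool       # hypothetical flag: listed in a curated database
    high_impact: bool      # hypothetical flag: predicted high impact
    in_cancer_gene: bool   # hypothetical flag: gene is cancer-related


def tier(v: Variant) -> int:
    """Return 1, 2 or 3, matching the three groups on the slide."""
    if v.well_known and v.high_impact:
        return 1
    if v.in_cancer_gene:
        return 2
    return 3


def rank(variants: List[Variant]) -> List[Variant]:
    # sorted() is stable, so variants keep their input order within a tier.
    return sorted(variants, key=tier)


findings = rank([
    Variant("variant_a", False, False, False),
    Variant("variant_b", False, True, True),
    Variant("variant_c", True, True, True),
])
print([(v.name, tier(v)) for v in findings])
```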
---
## What is coming soon
* `@sarek-team` to mention the core sarek developers on Slack
* `@nf-core/sarek` to mention the core sarek developers on GitHub
* BWA-MEM2 ([dev](https://github.com/nf-core/sarek/tree/dev))
* Bug-fixes ([dev](https://github.com/nf-core/sarek/tree/dev))
* DSL 2 ([dsl2](https://github.com/nf-core/sarek/tree/dsl2)) with [@ggabernet](https://github.com/ggabernet) and [@FriederikeHanssen](https://github.com/FriederikeHanssen)
===
## What is coming next
* Validation tests
* More tools
* Sub-workflows for specific usage
* Improved cloud usage
* Improved usage for non-model organisms
* Joint Variant Calling
* More downstream processing of the final VCF files
* Easier connection to [Scout](https://www.clinicalgenomics.se/scout/)
Note:
* Scout is a tool developed by SciLifeLab Clinical Genomics to analyse VCF files
---
## Sarek usage
* Within BTB
  * Tumor/normal pairs
* In production at NGI
  * All normal samples
  * Tumor/normal pairs
* The whole SweGen dataset
  * 1 000 normal samples (GRCh38)
* Genome Medicine Sweden
Note:
* GMS is an initiative to implement Precision Medicine at a national level
---
## Publication in F1000Research
Sarek: A portable workflow for whole-genome sequencing
analysis of germline and somatic variants
[version 2; peer review: 2 approved]
Maxime Garcia, Szilveszter Juhos, Malin Larsson, Pall I. Olason, Marcel Martin,
Jesper Eisfeldt, Sebastian DiLorenzo, Johanna Sandgren, Teresita Díaz De Ståhl,
Philip Ewels, Valtteri Wirta, Monica Nistér, Max Käller, Björn Nystedt
[doi.org/10.12688/f1000research.16665.2](https://doi.org/10.12688/f1000research.16665.2)
---
## Get involved
* Our code is hosted on GitHub
  * [github.com/nf-core](https://github.com/nf-core)
  * [github.com/nf-core/sarek](https://github.com/nf-core/sarek)
* We are on Slack
  * [nfcore.slack.com](https://nfcore.slack.com/)
  * [nfcore.slack.com/channels/sarek](https://nfcore.slack.com/channels/sarek)
---
## Any questions?
* [nf-co.re/sarek](https://nf-co.re/sarek)
* [github.com/nf-core/sarek](https://github.com/nf-core/sarek)
* [nfcore.slack.com/channels/sarek](https://nfcore.slack.com/channels/sarek)