How to define a project
1. Start with a basic PEP
You need a project defined in the standard Portable Encapsulated Project (PEP) format, so begin by creating a PEP.
2. Specify the Sample Annotation
The sample annotation table is referenced with the sample_table attribute, which generally lives in a project_config.yaml file.
Simplest example:
pep_version: 2.0.0
sample_table: sample_annotation.csv
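
To make the link between the config and the annotation table concrete, here is a minimal Python sketch of what a reader of this file has to do: parse the two keys, then open the CSV named by sample_table. The function names read_flat_config and load_samples are hypothetical, the parser only handles flat "key: value" lines (it is not a YAML parser), and resolving a relative sample_table path against the config file's folder is an assumption of this sketch. In practice a PEP library such as peppy would handle this step.

```python
import csv
from pathlib import Path


def read_flat_config(path):
    """Parse top-level 'key: value' lines only; NOT a general YAML parser."""
    config = {}
    for line in Path(path).read_text().splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        config[key.strip()] = value.strip().strip('"')
    return config


def load_samples(config_path):
    """Return one dict per row of the table named by sample_table."""
    config_path = Path(config_path)
    config = read_flat_config(config_path)
    # Assumption: a relative sample_table path is resolved against
    # the folder that contains the config file.
    table_path = config_path.parent / config["sample_table"]
    with open(table_path, newline="") as handle:
        return list(csv.DictReader(handle))
```

Calling load_samples("project_config.yaml") would return a list with one dictionary per sample, keyed by the column names of the CSV.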
A more complicated example taken from PEPATAC:
pep_version: 2.0.0
sample_table: tutorial.csv
sample_modifiers:
  derive:
    attributes: [read1, read2]
    sources:
      # Obtain tutorial data from http://big.databio.org/pepatac/ then set
      # path to your local saved files
      R1: "${TUTORIAL}/tools/pepatac/examples/data/{sample_name}_r1.fastq.gz"
      R2: "${TUTORIAL}/tools/pepatac/examples/data/{sample_name}_r2.fastq.gz"
  imply:
    - if:
        organism: ["human", "Homo sapiens", "Human", "Homo_sapiens"]
      then:
        genome: hg38
        prealignment_names: ["rCRSd"]
        deduplicator: samblaster # Default. [options: picard]
        trimmer: skewer # Default. [options: pyadapt, trimmomatic]
        peak_type: fixed # Default. [options: variable]
        extend: "250" # Default. For fixed-width peaks, extend this distance up- and down-stream.
        frip_ref_peaks: None # Default. Use an external reference set of peaks instead of the peaks called from this run
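
The effect of the derive and imply modifiers on a single sample can be sketched in plain Python. This is an illustration of the semantics only, not the code any PEP library actually runs: apply_imply and apply_derive are hypothetical helper names, the rule set is abbreviated to two of the implied attributes, and the imply-then-derive order used in the usage note is an assumption of this sketch. The sample's read1 and read2 values (R1, R2) act as keys into sources, {sample_name} is filled from the sample's own attributes, and ${TUTORIAL} is expanded from the environment.

```python
import os
import re

DERIVE_ATTRIBUTES = ["read1", "read2"]
SOURCES = {
    "R1": "${TUTORIAL}/tools/pepatac/examples/data/{sample_name}_r1.fastq.gz",
    "R2": "${TUTORIAL}/tools/pepatac/examples/data/{sample_name}_r2.fastq.gz",
}
# Abbreviated version of the imply rule from the config above.
IMPLY_RULES = [
    {
        "if": {"organism": ["human", "Homo sapiens", "Human", "Homo_sapiens"]},
        "then": {"genome": "hg38", "prealignment_names": ["rCRSd"]},
    }
]

# Matches {attribute} but not the ${VAR} environment-variable form.
ATTRIBUTE_FIELD = re.compile(r"(?<!\$)\{(\w+)\}")


def apply_imply(sample):
    """Add the 'then' attributes when every 'if' condition matches."""
    out = dict(sample)
    for rule in IMPLY_RULES:
        if all(out.get(key) in allowed for key, allowed in rule["if"].items()):
            out.update(rule["then"])
    return out


def apply_derive(sample):
    """Replace source keys (e.g. 'R1') with the filled-in path template."""
    out = dict(sample)
    for attr in DERIVE_ATTRIBUTES:
        template = SOURCES.get(sample.get(attr))
        if template is None:
            continue  # value is not a known source key; leave it alone
        filled = ATTRIBUTE_FIELD.sub(lambda m: str(sample[m.group(1)]), template)
        out[attr] = os.path.expandvars(filled)  # expands ${TUTORIAL} if set
    return out
```

For a row such as {"sample_name": "tutorial1", "organism": "human", "read1": "R1", "read2": "R2"}, calling apply_derive(apply_imply(row)) would add genome and prealignment_names and turn read1 and read2 into full file paths under the directory that TUTORIAL points to. If TUTORIAL is unset, the ${TUTORIAL} text is left in place.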