Getting Started
This guide will walk you through setting up OSCAR for single-cell sequencing analysis.
Prerequisites
- Access to a Linux/Unix system (BIH cluster recommended)
- Apptainer/Singularity installed
- Basic command-line knowledge
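The container prerequisite can be checked before going further. The sketch below is a minimal check, not part of OSCAR; the helper name first_available_command is hypothetical, and it only tests whether a command is on PATH, not which version is installed.

```shell
# Print the first command from the argument list that is found on PATH.
# Returns 1 (and prints nothing) if none of them is available.
first_available_command() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Prefer apptainer and fall back to singularity.
if runtime=$(first_available_command apptainer singularity); then
    echo "Container runtime found: ${runtime}"
else
    echo "Neither apptainer nor singularity is on PATH" >&2
fi
```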
Installation
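1. Clone the Repository
Clone the OSCAR repository to ${HOME}/work/bin/OSCAR, the location the pipeline commands in this guide assume.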
2. Download Container Images
OSCAR uses two Apptainer containers for different stages of the analysis:
oscar-count.sif
Used for alignment and counting:
mkdir -p "${TMPDIR}/OSCAR"
apptainer pull --dir "${TMPDIR}/OSCAR" library://romagnanilab/oscar/oscar-count:latest
oscar-qc.sif
Used for quality control and downstream analysis:
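The pull command for this image is not shown here. By analogy with oscar-count, it might look like the sketch below. The image name oscar-qc under the same library path is an assumption, and both helper functions are hypothetical, so check the result against the OSCAR repository before relying on it.

```shell
# Build the library URI for an OSCAR image from its short name.
# ASSUMPTION: every image follows the naming pattern of oscar-count.
oscar_image_uri() {
    printf 'library://romagnanilab/oscar/oscar-%s:latest\n' "$1"
}

# Pull one image into ${TMPDIR}/OSCAR. Defined only; nothing runs until
# the function is called.
pull_oscar_image() {
    mkdir -p "${TMPDIR}/OSCAR" &&
        apptainer pull --dir "${TMPDIR}/OSCAR" "$(oscar_image_uri "$1")"
}
```

With these defined, `pull_oscar_image qc` would attempt the download.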
3. Set Up Reference Genomes
Reference genomes are required for alignment. Build scripts for the following references are provided in the reference/ directory:
Human Reference (GRCh38)
Mouse Reference (GRCm39)
Kallisto Index (for ASAP-seq)
Project Setup
1. Create Project Directory
PROJECT_ID="my_experiment"
DIR_PREFIX="${HOME}/scratch/ngs"
mkdir -p "${DIR_PREFIX}/${PROJECT_ID}"
cd "${DIR_PREFIX}/${PROJECT_ID}"
2. Prepare Metadata File
Create a metadata.csv file or use the Metadata Generator tool.
Example metadata structure:
assay,experiment_id,historical_number,replicate,modality,chemistry,index_type,index,species,n_donors,adt_file
CITE,EXP001,H001,R1,RNA,3prime,dual,SI-TT-A1,Human,1,adt_totalseq_a.csv
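A malformed row is easy to miss in a hand-written metadata.csv. The sketch below checks that every row has as many comma-separated fields as the header (11 in the example above). The function name check_metadata_columns is hypothetical. It assumes no field contains a quoted comma, and it does not validate the values themselves.

```shell
# Report rows whose field count differs from the header's.
# Exits 0 if all rows match and 1 otherwise. Blank lines are skipped.
check_metadata_columns() (
    awk -F',' '
        NR == 1 { expected = NF; next }
        NF == 0 { next }
        NF != expected {
            printf "line %d: expected %d fields, found %d\n", NR, expected, NF
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
)
```

For example, `check_metadata_columns metadata.csv && echo "column counts OK"` prints the message only when every row is consistent.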
3. Add FASTQ Files
Place your FASTQ files in:
Running the Pipeline
Step 1: Process Metadata
bash "${HOME}/work/bin/OSCAR/bash/01_process_metadata.sh" \
--project-id "${PROJECT_ID}" \
--dir-prefix "${DIR_PREFIX}"
Step 2: Process FASTQ Files
bash "${HOME}/work/bin/OSCAR/bash/02_fastq.sh" \
--project-id "${PROJECT_ID}" \
--dir-prefix "${DIR_PREFIX}"
Step 3: Process Libraries
bash "${HOME}/work/bin/OSCAR/bash/03_process_libraries.sh" \
--project-id "${PROJECT_ID}" \
--dir-prefix "${DIR_PREFIX}"
Step 4: Count
bash "${HOME}/work/bin/OSCAR/bash/04_count.sh" \
--project-id "${PROJECT_ID}" \
--dir-prefix "${DIR_PREFIX}"
Step 5: Quality Control
bash "${HOME}/work/bin/OSCAR/bash/05_quality_control.sh" \
--project-id "${PROJECT_ID}" \
--dir-prefix "${DIR_PREFIX}"
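The five steps take the same two arguments, so they can be chained in a small wrapper that stops at the first failure. This is a sketch, not an OSCAR script. It assumes each step runs to completion before returning; if a step submits cluster jobs, wait for them to finish before starting the next one. The function name run_oscar_pipeline and the OSCAR_BASH_DIR override are hypothetical.

```shell
# Run steps 01 to 05 in order, stopping at the first one that fails.
# OSCAR_BASH_DIR (hypothetical) overrides the script location used above.
run_oscar_pipeline() (
    bash_dir="${OSCAR_BASH_DIR:-${HOME}/work/bin/OSCAR/bash}"
    for step in 01_process_metadata 02_fastq 03_process_libraries \
                04_count 05_quality_control; do
        echo "Running ${step}.sh"
        bash "${bash_dir}/${step}.sh" \
            --project-id "${PROJECT_ID}" \
            --dir-prefix "${DIR_PREFIX}" || {
            echo "Step ${step}.sh failed; stopping." >&2
            exit 1
        }
    done
)
```

Set PROJECT_ID and DIR_PREFIX as in Project Setup, then call `run_oscar_pipeline`.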
Configuration
Custom Settings
Edit bash/config.sh to customize:
- Reference genome paths
- Container image locations
- Default parameters
Environment Variables
Troubleshooting
Common Issues
FASTQ files not found
Ensure the FASTQ files are in the project's FASTQ directory and that their names follow the convention produced by your sequencing platform.
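One way to spot misnamed files is to compare them against a known pattern. The sketch below assumes bcl2fastq-style Illumina names such as SampleA_S1_L001_R1_001.fastq.gz. Whether OSCAR requires exactly this pattern is an assumption, and both function names are hypothetical.

```shell
# Return 0 if the file name looks like bcl2fastq-style Illumina output,
# e.g. SampleA_S1_L001_R1_001.fastq.gz, and 1 otherwise.
is_illumina_fastq_name() {
    printf '%s\n' "${1##*/}" |
        grep -Eq '^.+_S[0-9]+_L[0-9]{3}_(R[1-3]|I[12])_001\.fastq\.gz$'
}

# Print every .gz file in a directory whose name does not match.
list_nonmatching_fastqs() (
    for f in "$1"/*.gz; do
        [ -e "$f" ] || continue
        is_illumina_fastq_name "$f" || printf '%s\n' "$f"
    done
)
```

Calling `list_nonmatching_fastqs` on your FASTQ directory prints nothing when every file matches.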
Memory errors
Increase the memory requested in your job submission script, then rerun the step that failed.
Checking logs
Log files are created in ${DIR_PREFIX}/${PROJECT_ID}/logs/
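A quick first pass over the logs is to list the files that mention an error. The sketch below assumes failures are logged with the word "error" in some capitalization, which may not hold for every tool in the pipeline. The function name logs_mentioning_errors is hypothetical.

```shell
# List log files under a directory that contain "error" (case-insensitive).
# Exits non-zero when no file matches.
logs_mentioning_errors() {
    grep -rli -e 'error' "$1"
}
```

For example: `logs_mentioning_errors "${DIR_PREFIX}/${PROJECT_ID}/logs"`.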
Next Steps
- Metadata Generator: Create properly formatted metadata
- Feature Barcode Generator: Generate ADT/HTO reference files
- Functions Reference: Detailed script documentation
Support
For help or questions:
- Email: oliver.knight@charite.de
- GitHub Issues: Submit an issue