Report Ribo-Seq Pipeline: 20241118_225323

Sequencing and mapping quality control (QC)
MACS peak calling
Bioinformatics pipeline methods
Links to results
Acknowledgments

Sequencing and mapping quality control (QC)

Figure 1: Average quality score at each base position across all reads. A Phred quality of 30 or higher is considered good (predicted error rate of 1 in 1,000).
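
The threshold quoted for Figure 1 follows from the standard Phred definition, Q = -10 * log10(P), where P is the predicted probability that a base call is wrong. A minimal sketch of the conversion is given below; the function name is illustrative and is not part of the pipeline.

```python
def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality score Q to the predicted base-call error probability."""
    return 10 ** (-q / 10)


# Q30 corresponds to roughly 1 error in 1,000 base calls, Q20 to 1 in 100.
for q in (20, 30):
    print(f"Q{q}: predicted error probability {phred_to_error_probability(q):.4f}")
```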

Figure 2: Histogram showing the number of reads for each sample in raw data. The sequencing depth statistics for all samples were as follows:

the median depth was 53,583,236 reads,

the mean depth was 55,238,538 reads,

the standard deviation was 4,703,440 reads.

Figure 3: Percentage of reads discarded after trimming.

No figure presented since the percentage of reads discarded after trimming for all samples is lower than 1%.

Figure 4: Histogram with the number of reads for each sample in each step of the pipeline.

MACS peak calling

MACS results for each sample

Sample Type	Sample	Total Tags (Reads)
Treatment	RNC1	3513170
Treatment	RNC2	3300693
Treatment	RKD1	3924299
Treatment	RKD2	3433104

MACS results for each comparison

Comparison	tag (read) size (bp)	MACS model d length	Total number of peaks
RNC1	28	1	178147
RNC2	28	1	172091
RKD1	28	1	39898
RKD2	27	1	29716

The final number of peaks for all comparisons is: 419,852
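
The reported total can be cross-checked against the per-comparison table. The sketch below assumes the final number is the plain sum of the four per-comparison peak counts, which is consistent with the values in the table; the variable names are illustrative.

```python
# Peak counts copied from the "MACS results for each comparison" table.
peaks_per_comparison = {
    "RNC1": 178147,
    "RNC2": 172091,
    "RKD1": 39898,
    "RKD2": 29716,
}

total_peaks = sum(peaks_per_comparison.values())
print(f"Total peaks: {total_peaks:,}")
```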

Figure 5: Number of peaks for all samples

Figure 6: Peaks distribution in genomic regions

Figure 7: Peaks distribution around TSS

Figure 8: Overlap of peaks among the first 4 samples

Venn plot legend

samples	mark
RNC1	1
RNC2	2
RKD1	3
RKD2	4

Bioinformatics pipeline methods

Reads were trimmed using cutadapt (DOI: 10.14806/ej.17.1.200) with the parameters: -a CTGTAGGCACCATCAATAGATCGGAAGAGCACACGTCTGAACTCCAGTCAC --times TIMES -q 20 -m 25.

Reads that aligned to rRNA were removed using Bowtie1 (DOI: 10.1186/gb-2009-10-3-r25); only reads that did not align to rRNA were retained.

Reads with a length of 25 to 32 bases were retained using cutadapt; shorter and longer reads were discarded.
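
The length filter described above keeps reads in the size range expected for ribosome-protected fragments. The pipeline performs this step with cutadapt (whose --minimum-length and --maximum-length options express this kind of filter; the exact invocation is not given in this report). The sketch below only illustrates the selection logic in plain Python, with hypothetical read names and sequences.

```python
from typing import Iterable, Iterator, Tuple

MIN_LEN, MAX_LEN = 25, 32  # bounds stated in the methods


def filter_by_length(
    reads: Iterable[Tuple[str, str]],
    min_len: int = MIN_LEN,
    max_len: int = MAX_LEN,
) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs whose length lies in [min_len, max_len]."""
    for name, seq in reads:
        if min_len <= len(seq) <= max_len:
            yield name, seq


# Hypothetical toy reads, named after their lengths.
toy_reads = [("len24", "A" * 24), ("len25", "A" * 25), ("len32", "A" * 32), ("len33", "A" * 33)]
kept_names = [name for name, _ in filter_by_length(toy_reads)]
print(kept_names)
```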

Reads were mapped to the mm10 genome using TopHat (DOI: 10.1093/bioinformatics/btp120) with the parameters: -N 1, --no-novel-juncs, --library-type fr-firststrand, and -p 20.

Uniquely mapped reads were extracted using samtools with the -q 10 parameter.
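
The -q option of samtools view skips alignments whose mapping quality (MAPQ, the fifth column of a SAM record) is below the given value. The sketch below mimics that selection on SAM text lines in plain Python. It is an illustration of the criterion only, not the pipeline's code; the function name and toy records are hypothetical, and header lines are passed through here for convenience.

```python
def passes_mapq(sam_line: str, min_mapq: int = 10) -> bool:
    """Return True for header lines and for alignments with MAPQ >= min_mapq."""
    if sam_line.startswith("@"):
        return True  # header lines carry no MAPQ; keep them
    fields = sam_line.rstrip("\n").split("\t")
    return int(fields[4]) >= min_mapq  # column 5 (index 4) is MAPQ


# Hypothetical toy records: one confidently placed read, one ambiguous read.
seq, qual = "A" * 28, "I" * 28
toy_sam = [
    "@SQ\tSN:chr1\tLN:195471971",
    f"read1\t0\tchr1\t100\t50\t28M\t*\t0\t0\t{seq}\t{qual}",
    f"read2\t0\tchr1\t200\t3\t28M\t*\t0\t0\t{seq}\t{qual}",
]
kept = [line for line in toy_sam if passes_mapq(line)]
print(len(kept))
```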

Only 5’ UTR and CDS fragments were counted using HTSeq-count v2.0.2 (DOI: 10.1093/bioinformatics/btu638) in intersection-nonempty mode.

Significant regions (peaks) were identified using MACS2 callpeak (DOI: 10.1186/gb-2008-9-9-r137) with the parameters: --keep-dup all, --nomodel, and --extsize=1.
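
With --nomodel, MACS2 skips fragment-size modeling and extends each read from its 5' end toward the 3' end by the --extsize value, so --extsize=1 effectively reduces every read to the single base at its 5' end (consistent with the "MACS model d length" of 1 in the table above). The sketch below shows how that position depends on strand. It assumes 0-based, half-open alignment coordinates, and the function name is illustrative.

```python
def five_prime_position(start: int, end: int, strand: str) -> int:
    """Return the 0-based position of a read's 5' end.

    Assumes the alignment covers the half-open interval [start, end).
    """
    if strand == "+":
        return start      # 5' end is the leftmost aligned base
    if strand == "-":
        return end - 1    # 5' end is the rightmost aligned base
    raise ValueError(f"unknown strand: {strand!r}")


# A hypothetical 28-base read aligned at [100, 128) on either strand.
print(five_prime_position(100, 128, "+"), five_prime_position(100, 128, "-"))
```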

The summits output from peak calling were shifted by 13 bases and extended by 3 bases according to gene orientation.
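
The report does not state the coordinate convention or the exact direction of the shift, so the sketch below is one plausible reading rather than the pipeline's implementation: the summit is moved 13 bases downstream in the direction of transcription and the resulting position is extended to a 3-base window in the same direction. It assumes 0-based summit positions and returns half-open intervals; all names are hypothetical.

```python
from typing import Tuple

SHIFT = 13   # bases, from the methods
EXTEND = 3   # bases, from the methods


def shift_and_extend(summit: int, strand: str) -> Tuple[int, int]:
    """Return a half-open [start, end) window derived from a 0-based summit.

    Assumption: "according to gene orientation" means downstream is toward
    higher coordinates on "+" genes and toward lower coordinates on "-" genes.
    """
    if strand == "+":
        start = summit + SHIFT
        end = start + EXTEND
    elif strand == "-":
        end = summit - SHIFT + 1
        start = end - EXTEND
    else:
        raise ValueError(f"unknown strand: {strand!r}")
    return max(start, 0), max(end, 0)  # do not run past the chromosome start


print(shift_and_extend(100, "+"), shift_and_extend(100, "-"))
```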

The distribution of peaks in genomic regions and their proximity to transcription start sites (TSS) were examined using ChIPseeker (DOI: 10.18129/B9.bioc.ChIPseeker). The overlap of peaks among the first 4 samples, shown in the Venn diagram, was also analyzed using ChIPseeker.

The pipeline was constructed using Snakemake (DOI: 10.1093/bioinformatics/bts480).

Links to results

Sequences from folder: ~/example_and_data_for_testing_ribo-seq_mm10/fastq

Output folder: ~/example_and_data_for_testing_ribo-seq_mm10/20241118_225323_demo_Ribo-Seq

FastQC folder: ~/example_and_data_for_testing_ribo-seq_mm10/20241118_225323_demo_Ribo-Seq/2_fastqc

MultiQC folder: ~/example_and_data_for_testing_ribo-seq_mm10/20241118_225323_demo_Ribo-Seq/2_fastqc/multiQC

Report output folder: data_for_demo_production_20241118_225323

Statistics regarding the number of reads for each sample for various steps of the pipeline can be downloaded from: here.

MACS peak calling statistics for each sample and comparison can be downloaded from: here.

Commands log can be downloaded from: here.

R package versions can be found at: sessionInfo.txt

UTAP version 2.1

General information about the run can be found at: ~/reports/20241118_225323_demo_analysis_parameters.yaml

Treatment and control samples are found at: here.

Acknowledgments

Citing UTAP:

Lindner, J., Dassa, B., Wigoda, N. et al. UTAP2: an enhanced user-friendly transcriptome and epigenome analysis pipeline. BMC Bioinformatics 26, 79 (2025). https://doi.org/10.1186/s12859-025-06090-8.

Report Ribo-Seq Pipeline: 20241118_225323_demo

18-11-2024
