Sample_Report

Published: January 13, 2025


This R Markdown report provides a brief summary of the diagnostic plots for your TREx experiment.
FASTQ files are available for download upon request.

Users receiving files from RSCshare are advised to delete all files once they have securely copied them to their own drives.
A copy of your data will be securely archived on our end; future retrievals may incur a fee.

Sample Tracking

Pre-Processing

The raw FASTQ reads were first processed with the fastp package to:

  • Trim low-quality reads;
  • Trim for 2-color chemistry bias (NextSeq);
  • Trim noisy short fragments;
  • Trim adapter sequences.

The filtered reads were then aligned to the GRCh38 reference genome with Ensembl annotations.

The MultiQC HTML report (a separate file) summarises the alignment statistics along with a summary of the raw counts generated via STAR.
For the default parameters used within the TREx pipeline, please refer to the code modules available on GitHub at this link.

Counts Distribution

Post-normalization, the medians should be consistent across samples and more similar between biological replicates.
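
As a rough illustration of why medians line up after normalization, the Python sketch below applies median-of-ratios scaling (the idea behind DESeq2's size factors) to a made-up count table. The counts and the helper names `size_factors` and `normalize` are hypothetical and are not taken from the TREx pipeline.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors for a genes x samples count table."""
    n_samples = len(counts[0])
    # Keep only genes with non-zero counts in every sample (log is undefined at 0).
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Per-gene log of the geometric mean across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors

def normalize(counts):
    """Divide each sample's counts by that sample's size factor."""
    factors = size_factors(counts)
    return [[c / f for c, f in zip(row, factors)] for row in counts]

# Hypothetical counts: sample B was sequenced exactly twice as deeply as sample A.
counts = [
    [10, 20],
    [100, 200],
    [50, 100],
    [0, 0],
    [30, 60],
]
factors = size_factors(counts)
normalized = normalize(counts)
raw_medians = [median(col) for col in zip(*counts)]
norm_medians = [median(col) for col in zip(*normalized)]
print(raw_medians, norm_medians)
```

In this toy table the raw medians differ by a factor of two, while the normalized medians coincide, which is the pattern the boxplots are expected to show.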

geneBodyCoverage

A good library should show little to no coverage bias across the entire gene body.

Sample Clustering

A Euclidean distance is computed between samples, and the dendrogram is built using the Ward criterion. We expect this dendrogram to group replicates and separate biological conditions.
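
The clustering idea can be sketched in a few lines of Python. This is a naive implementation of Ward's minimum-variance criterion on Euclidean distances, not the R code that produced the dendrogram; the sample names and expression values are invented for illustration.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Coordinate-wise mean of a list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    na, nb = len(a), len(b)
    return na * nb / (na + nb) * euclidean(centroid(a), centroid(b)) ** 2

def ward_merges(samples):
    """Return the merge order as a list of (members_a, members_b) name tuples."""
    clusters = {(name,): [vec] for name, vec in samples.items()}
    merges = []
    while len(clusters) > 1:
        # Merge the pair of clusters whose union adds the least variance.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: ward_cost(clusters[pair[0]], clusters[pair[1]]))
        merges.append((a, b))
        clusters[a + b] = clusters.pop(a) + clusters.pop(b)
    return merges

# Hypothetical transformed expression values for three genes.
samples = {
    "ctrl_1": [5.0, 1.0, 3.0],
    "ctrl_2": [5.2, 1.1, 2.9],
    "treat_1": [1.0, 6.0, 3.1],
    "treat_2": [0.9, 6.3, 3.0],
}
merges = ward_merges(samples)
for left, right in merges:
    print(left, "+", right)
```

With these values the replicates merge first and the two conditions are joined only at the final step, which is the structure a well-behaved experiment should show.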

Principal Components Analysis

Another way of visualizing the experimental variability is to look at the first principal components of the PCA. In this figure, the first principal component (PC1) is expected to separate samples from the different biological conditions, meaning that biological variability is the main source of variance in the data.
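
To make the PC1 expectation concrete, here is a small Python sketch that finds the first principal component by power iteration on the gene-by-gene covariance matrix. The data, the function name `pc1`, and the iteration count are assumptions made for illustration; the report's own PCA is computed in R.

```python
import math

def pc1(samples, iterations=200):
    """Return (PC1 scores per sample, fraction of variance explained by PC1)."""
    n, p = len(samples), len(samples[0])
    means = [sum(col) / n for col in zip(*samples)]
    centered = [[x - m for x, m in zip(row, means)] for row in samples]
    # Covariance matrix between features (p x p).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Arbitrary starting direction, assumed not orthogonal to PC1.
    vec = [float(i + 1) for i in range(p)]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    scores = [sum(x * v for x, v in zip(row, vec)) for row in centered]
    total_variance = sum(cov[i][i] for i in range(p))
    explained = (sum(s * s for s in scores) / (n - 1)) / total_variance
    return scores, explained

# Hypothetical samples (rows) by genes (columns): two controls, then two treated.
data = [
    [5.0, 1.0, 3.0],
    [5.2, 1.1, 2.9],
    [1.0, 6.0, 3.1],
    [0.9, 6.3, 3.0],
]
scores, explained = pc1(data)
print(scores, explained)
```

Here the two controls fall on one side of PC1 and the two treated samples on the other, and PC1 carries most of the variance. The sign of the scores is arbitrary; only the separation matters.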

MA-Plots

The figure shows the MA-plot of the data for each comparison performed, with differentially expressed features highlighted in red. An MA-plot represents the log ratio of differential expression as a function of the mean intensity for each feature. Triangles correspond to features whose log2(FC) is too low or too high to be displayed on the plot.
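
The coordinates of an MA-plot can be sketched as follows. This simplified Python example uses a raw log2 ratio of two condition means, whereas DESeq2 estimates log2(FC) from a fitted model. The counts, adjusted p-values, significance threshold, axis limit, and pseudocount are all hypothetical.

```python
import math

def ma_points(cond_a, cond_b, padj, alpha=0.05, y_limit=2.0, pseudocount=1.0):
    """Compute MA-plot coordinates and display attributes for each feature."""
    points = []
    for a, b, p in zip(cond_a, cond_b, padj):
        mean_intensity = (a + b) / 2                      # x axis (A)
        log2_fc = math.log2((b + pseudocount) / (a + pseudocount))  # y axis (M)
        # Values beyond the axis limits are drawn at the edge as triangles.
        clipped = max(-y_limit, min(y_limit, log2_fc))
        points.append({
            "A": mean_intensity,
            "M": clipped,
            "marker": "triangle" if clipped != log2_fc else "dot",
            "color": "red" if p < alpha else "black",
        })
    return points

# Hypothetical mean normalized counts per condition and adjusted p-values.
cond_a = [100, 7, 50]
cond_b = [100, 63, 55]
padj = [0.9, 0.001, 0.4]
points = ma_points(cond_a, cond_b, padj)
for point in points:
    print(point)
```

The second feature has a log2 ratio of 3, beyond the assumed limit of 2, so it is drawn as a red triangle at the plot edge; the other two remain black dots near zero.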

Citations

citation("DESeq2")
To cite package 'DESeq2' in publications use:

  Love, M.I., Huber, W., Anders, S. Moderated estimation of fold change
  and dispersion for RNA-seq data with DESeq2 Genome Biology 15(12):550
  (2014)

A BibTeX entry for LaTeX users is

  @Article{,
    title = {Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2},
    author = {Michael I. Love and Wolfgang Huber and Simon Anders},
    year = {2014},
    journal = {Genome Biology},
    doi = {10.1186/s13059-014-0550-8},
    volume = {15},
    issue = {12},
    pages = {550},
  }
citation("SARTools")
To cite package 'SARTools' in publications use:

  Hugo Varet, Loraine Brillet-Guéguen, Jean-Yves Coppée and
  Marie-Agnès Dillies (2016): SARTools: A DESeq2- and EdgeR-Based R
  Pipeline for Comprehensive Differential Analysis of RNA-Seq Data.
  PLoS One, 2016, doi: http://dx.doi.org/10.1371/journal.pone.0157022

A BibTeX entry for LaTeX users is

  @Article{,
    title = {SARTools: A DESeq2- and EdgeR-Based R Pipeline for Comprehensive Differential Analysis of RNA-Seq Data},
    author = {Hugo Varet and Loraine Brillet-Guéguen and Jean-Yves Coppée and Marie-Agnès Dillies},
    year = {2016},
    journal = {PLoS One},
    doi = {10.1371/journal.pone.0157022},
    url = {http://dx.doi.org/10.1371/journal.pone.0157022},
  }