Pipelines for Circular RNA Identification and Characterization

CIRCexplorer2 contains two main pipelines for circular RNA identification and characterization:

  • Circular RNA annotating pipeline (annotating pipeline)
  • Circular RNA characterization pipeline (characterization pipeline)

Annotating pipeline

This pipeline is derived from CIRCexplorer, which was used in our previous Cell paper and has been shown to be one of the most reliable bioinformatic tools for circRNA prediction (Hansen, et al., Nucleic Acids Res, 2015). It is an integrated strategy that identifies fusion junction reads from back-spliced exons and intron lariats, and then assigns these reads to the correct gene annotations using a dedicated realignment script. In CIRCexplorer2, we extended this pipeline to support more aligners (including STAR, segemehl and MapSplice) to meet different requirements for circular RNA alignment and data mining.

Schematic flow

[Figure: schematic flow of the annotating pipeline]

Features

  • It relies on existing gene annotations and reports only circular RNAs whose boundaries exactly match those in existing gene annotations. This criterion gives the pipeline high accuracy in circular RNA prediction. To identify circular RNAs whose boundaries do not match the annotation exactly, see the documentation of the annotate module; note that doing so may introduce many false positives.
  • It supports multiple aligners (TopHat2/TopHat-Fusion, STAR, segemehl and MapSplice).
  • It is convenient. Only two simple commands are needed to complete this pipeline, with no additional manipulation, and CIRCexplorer2 prepares everything needed for the subsequent circular RNA analysis.
  • It is sufficient for general circular RNA identification.
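
The exact-boundary criterion described in the features above can be illustrated with a minimal Python sketch. This is not CIRCexplorer2's implementation: the `Junction` type, the helper names and the toy coordinates are all hypothetical, and the sketch only checks boundaries per chromosome, whereas a real annotation step would also consider gene, transcript and strand.

```python
from typing import NamedTuple


class Junction(NamedTuple):
    """A candidate back-splice junction (hypothetical structure for illustration)."""
    chrom: str
    start: int  # left boundary of the back-spliced region
    end: int    # right boundary of the back-spliced region


def build_boundary_index(exons):
    """Collect annotated exon starts and ends for each chromosome."""
    starts, ends = {}, {}
    for chrom, exon_start, exon_end in exons:
        starts.setdefault(chrom, set()).add(exon_start)
        ends.setdefault(chrom, set()).add(exon_end)
    return starts, ends


def matches_annotation(junction, starts, ends):
    """Keep a junction only if both boundaries coincide with annotated exon boundaries."""
    return (junction.start in starts.get(junction.chrom, set())
            and junction.end in ends.get(junction.chrom, set()))


# Toy annotation: three exons on chr1 (made-up coordinates).
exons = [("chr1", 100, 200), ("chr1", 300, 400), ("chr1", 500, 600)]
starts, ends = build_boundary_index(exons)

candidates = [
    Junction("chr1", 100, 400),  # both boundaries annotated
    Junction("chr1", 105, 400),  # left boundary is off by 5 nt
    Junction("chr2", 100, 400),  # chromosome has no annotation
]
kept = [j for j in candidates if matches_annotation(j, starts, ends)]
print(kept)
```

Only the first candidate survives the filter. Relaxing the check (for example, allowing a tolerance window around each boundary) would recover junctions like the second one, at the cost of more false positives, which is the trade-off noted in the features above.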

Steps

  • Circular RNA fusion junction read alignment and parsing (Alignment)
  • Circular RNA fusion junction read annotating and realignment (Annotating)

Characterization pipeline

This pipeline aims to comprehensively and systematically characterize the landscape of alternative back-splicing and alternative splicing of circular RNAs by integrating de novo assembly of circular RNA transcripts.

Schematic flow

[Figure: schematic flow of the characterization pipeline]

Features

  • The Cufflinks reference annotation based transcript (RABT) assembly method is employed to better identify novel transcripts of circular RNAs.
  • Besides circular RNAs composed of annotated exons, it can identify hundreds of novel circular RNA-specific exons, which are not expressed in linear RNAs.
  • It can identify two types of alternative back-splicing events (alternative 5' back-splice site and alternative 3' back-splice site) and four types of alternative splicing events (cassette exon, intron retention, alternative 5' splice site and alternative 3' splice site).
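
The two alternative back-splicing event types listed above can be sketched as a simple grouping of junctions that share one back-splice site. This is an illustrative sketch, not CIRCexplorer2's code: the function name and input tuples are hypothetical. It assumes that on the plus strand the genomic start of a junction is the upstream 3' back-splice site and the genomic end is the downstream 5' back-splice site, with the roles swapped on the minus strand.

```python
from collections import defaultdict


def alternative_back_splicing(junctions):
    """Group (chrom, start, end, strand) junctions into A5BS / A3BS events.

    A5BS: one 3' back-splice site paired with several 5' back-splice sites.
    A3BS: one 5' back-splice site paired with several 3' back-splice sites.
    """
    by_start = defaultdict(set)
    by_end = defaultdict(set)
    for chrom, start, end, strand in junctions:
        by_start[(chrom, strand, start)].add(end)
        by_end[(chrom, strand, end)].add(start)

    events = {"A5BS": [], "A3BS": []}
    # Shared genomic start: the alternatives are the genomic ends.
    for (chrom, strand, start), partner_ends in by_start.items():
        if len(partner_ends) > 1:
            kind = "A5BS" if strand == "+" else "A3BS"
            events[kind].append((chrom, strand, start, sorted(partner_ends)))
    # Shared genomic end: the alternatives are the genomic starts.
    for (chrom, strand, end), partner_starts in by_end.items():
        if len(partner_starts) > 1:
            kind = "A3BS" if strand == "+" else "A5BS"
            events[kind].append((chrom, strand, end, sorted(partner_starts)))
    return events


# Toy plus-strand junctions (made-up coordinates).
junctions = [
    ("chr1", 100, 400, "+"),
    ("chr1", 100, 600, "+"),
    ("chr1", 300, 600, "+"),
]
events = alternative_back_splicing(junctions)
print(events)
```

In this toy input, the junctions starting at 100 form an alternative 5' back-splice event and the junctions ending at 600 form an alternative 3' back-splice event. The four alternative splicing types inside circular RNAs need assembled transcript structures and are not covered by this sketch.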

Steps

  • Circular RNA fusion junction read alignment and parsing (Alignment)
  • De novo assembly for circular RNA transcripts (Assembly)
  • Characterization of alternative back-splicing and alternative splicing (Alternative Splicing)