assemble
CIRCexplorer2 assemble carries out de novo assembly of circular RNA transcripts based on the RABT (Reference Annotation Based Transcript assembly) method.
Usage and option summary
Usage:
    CIRCexplorer2 assemble [options] -r REF <circ_dir>

Options:
    -h --help                      Show help message.
    -v --version                   Show version.
    -r REF --ref=REF               Gene annotation file.
    -p THREAD --thread=THREAD      Running threads. [default: 10]
    --bb                           Convert assembly results to BigBed.
    --tophat-dir=TOPHAT_DIR        TopHat mapping directory.
    --chrom-size=CHROM_SIZE        Chrom size file for converting to BigBed.
    --remove-rRNA                  Ignore rRNA during assembling (only for human hg19).
    --max-bundle-frags=FRAGMENTS   Cufflinks --max-bundle-frags option.
Notes about options
- CIRCexplorer2 assemble requires TopHat2 alignment results to assemble circular RNA transcripts de novo. If the TopHat2 result folder is not under <circ_dir>, specify its path with --tophat-dir.
- CIRCexplorer2 assemble searches for the chrom size file in the TopHat2 result folder; you can also specify its path with --chrom-size. If you used CIRCexplorer2 align to align the RNA-seq data, the chrom size file already exists in the TopHat2 result folder.
- Assembling rRNA can be very time-consuming. Setting the --remove-rRNA option skips assembly for rRNA. Note that this option is only suitable for hg19. If the assembly step is still very slow, set --max-bundle-frags to a small number. See the Cufflinks protocol for more details about the --max-bundle-frags option.
- If you set the --bb option, a BigBed file of the assembled transcripts is created.
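The notes above can be turned into a small command builder. This is only a sketch: the helper name build_assemble_command and the file and folder names in the demo (ref.txt, circ_out, tophat_out) are hypothetical, and it uses only the options listed in the usage summary above. It builds the argument list without running anything.

```python
import shlex


def build_assemble_command(ref, circ_dir, tophat_dir=None, chrom_size=None,
                           threads=None, remove_rrna=False, bigbed=False,
                           max_bundle_frags=None):
    """Return the argument list for one `CIRCexplorer2 assemble` run.

    Hypothetical helper: it only assembles documented options into a list
    and does not check that any of the paths exist.
    """
    cmd = ["CIRCexplorer2", "assemble", "-r", ref]
    if threads is not None:
        cmd += ["-p", str(threads)]
    if tophat_dir is not None:
        # Needed only when the TopHat2 result folder is not under <circ_dir>.
        cmd.append(f"--tophat-dir={tophat_dir}")
    if chrom_size is not None:
        # Overrides the chrom size file looked up in the TopHat2 result folder.
        cmd.append(f"--chrom-size={chrom_size}")
    if remove_rrna:
        # Only suitable for human hg19.
        cmd.append("--remove-rRNA")
    if max_bundle_frags is not None:
        cmd.append(f"--max-bundle-frags={max_bundle_frags}")
    if bigbed:
        cmd.append("--bb")
    cmd.append(circ_dir)
    return cmd


# Demo with made-up file and folder names.
demo = build_assemble_command("ref.txt", "circ_out",
                              tophat_dir="tophat_out", remove_rrna=True)
print(shlex.join(demo))
```

To actually launch the run, the returned list could be passed to subprocess.run(cmd, check=True), assuming CIRCexplorer2 is installed and on the PATH.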
Output
CIRCexplorer2 assemble creates a cufflinks folder and a cufflinks.log file under the <circ_dir> folder. The transcripts_ref.txt file is used for alternative splicing analysis of circular RNAs, and it has the same format as refFlat.
cufflinks
├── filtered_junction.gtf
├── filtered_junction.txt
├── genes.fpkm_tracking
├── isoforms.fpkm_tracking
├── skipped.gtf
├── transcripts.gtf
├── transcripts.txt
├── transcripts_ref.txt
├── transcripts_ref.bed
├── transcripts_ref_sorted.bed
└── transcripts_ref_sorted.bb
- filtered_junction.gtf and filtered_junction.txt: Filtered gene annotation files.
- genes.fpkm_tracking, isoforms.fpkm_tracking, skipped.gtf and transcripts.gtf: Cufflinks result files.
- transcripts_ref.txt, transcripts_ref.bed and transcripts_ref_sorted.bed: Circular RNA transcript files.
- transcripts_ref_sorted.bb: BigBed file of circular RNA transcripts.
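Because transcripts_ref.txt is described as refFlat-formatted, its records can be read with a few lines of Python. The sketch below assumes the standard UCSC refFlat layout: 11 tab-separated columns (geneName, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds), with comma-separated exon coordinates. The Transcript type, the parse_refflat_line function, and the sample record are hypothetical illustrations, not part of CIRCexplorer2.

```python
from typing import List, NamedTuple, Tuple


class Transcript(NamedTuple):
    gene: str
    name: str
    chrom: str
    strand: str
    tx_start: int
    tx_end: int
    cds_start: int
    cds_end: int
    exons: List[Tuple[int, int]]  # (start, end) pairs in file order


def parse_refflat_line(line: str) -> Transcript:
    """Parse one refFlat-style record (assumed 11 tab-separated columns)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError(f"expected 11 columns, got {len(fields)}")
    gene, name, chrom, strand = fields[:4]
    tx_start, tx_end, cds_start, cds_end = (int(x) for x in fields[4:8])
    exon_count = int(fields[8])
    # refFlat exon lists usually end with a trailing comma; drop it first.
    starts = [int(x) for x in fields[9].rstrip(",").split(",")]
    ends = [int(x) for x in fields[10].rstrip(",").split(",")]
    if not (len(starts) == len(ends) == exon_count):
        raise ValueError("exonCount does not match the exon coordinate lists")
    return Transcript(gene, name, chrom, strand, tx_start, tx_end,
                      cds_start, cds_end, list(zip(starts, ends)))


# Made-up record for illustration only; not real CIRCexplorer2 output.
sample = "GENE1\tTX1\tchr1\t+\t100\t500\t100\t500\t2\t100,300,\t200,500,\n"
record = parse_refflat_line(sample)
print(record.name, record.exons)
```

Applying parse_refflat_line to each line of cufflinks/transcripts_ref.txt would give one Transcript per assembled circular RNA transcript, provided the file follows the layout assumed here.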