CRAC: an integrated approach to the analysis of RNA-seq reads.
CRAC is a tool for analyzing High Throughput Sequencing (HTS) data, also called Next Generation Sequencing (NGS) data, by comparison with a reference genome. It is intended for transcriptomic and genomic sequencing reads.
More precisely, with transcriptomic reads as input, it predicts point mutations, indels, splice junctions, and chimeric RNAs (i.e., non-colinear splice junctions). CRAC can also output the positions and nature of the sequence errors that it detects in the reads.
CRAC uses a genome index, which must be computed before running the read analysis. To build it, run the command "crac-index" on your genome files.
You can then process the reads using the command "crac". To read the CRAC man page (help file), type "man crac".
CRAC requires a large amount of main memory. For example, processing 50 million reads of 100 nucleotides each against the human genome requires about 40 gigabytes of main memory. Check that your computing server has enough memory before launching an analysis.
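The memory check suggested above can be scripted. The following is a minimal sketch, not part of CRAC: it assumes a Unix-like system where Python's os.sysconf exposes "SC_PAGE_SIZE" and "SC_PHYS_PAGES" (as on Linux), and it uses the 40 gigabyte figure quoted above as an approximate threshold. The function names are hypothetical.

```python
import os

# Approximate requirement quoted above (human genome, 50 million reads of 100 nt).
REQUIRED_GB = 40


def total_memory_gb():
    """Return installed physical memory in gigabytes (10**9 bytes).

    Relies on POSIX sysconf values, so it only works on Unix-like systems
    that expose SC_PAGE_SIZE and SC_PHYS_PAGES (an assumption of this sketch).
    """
    page_size = os.sysconf("SC_PAGE_SIZE")
    page_count = os.sysconf("SC_PHYS_PAGES")
    return page_size * page_count / 10**9


def has_enough_memory(required_gb=REQUIRED_GB, installed_gb=None):
    """Return True if the installed memory meets the required amount."""
    if installed_gb is None:
        installed_gb = total_memory_gb()
    return installed_gb >= required_gb


if __name__ == "__main__":
    installed = total_memory_gb()
    verdict = "sufficient" if has_enough_memory(installed_gb=installed) else "insufficient"
    print(f"Installed memory: {installed:.1f} GB ({verdict} for ~{REQUIRED_GB} GB)")
```

Note that this reports installed memory, not memory currently free; on a shared server, other running jobs may leave less available than the total suggests.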
Source code
CRAC is distributed under the CeCILL license and can be used free of charge for non-commercial purposes. The distribution comes as an archive (a gzipped tarball, .tar.gz) containing the source code.
Features
- predicts SNVs, splice junctions, and chimeric RNAs from RNA-seq reads and a reference genome
- does not require any mapping step
- does not require annotations and is independent of them
- achieves good sensitivity and comparatively high specificity
- accepts standard input formats for the reads: FASTA, FastQ
- delivers output in SAM format
- makes multiple predictions in a single analysis
- efficient
- integrates both the genomic location and the coverage to analyse each read
- runs on all Unix-like systems
- can use multiple processors in parallel to speed up the analysis
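
Since the output is delivered in SAM format, it can be read with any SAM-aware tool or with a few lines of code. The sketch below is an illustration only: it splits one alignment line into the eleven mandatory fields defined by the SAM specification and decodes the CIGAR string. The helper names and the example record are invented for this illustration and do not come from CRAC; which optional tags CRAC writes is not covered here.

```python
import re
from typing import NamedTuple


class SamRecord(NamedTuple):
    """The eleven mandatory SAM fields, plus optional tags as a dict."""
    qname: str
    flag: int
    rname: str
    pos: int
    mapq: int
    cigar: str
    rnext: str
    pnext: int
    tlen: int
    seq: str
    qual: str
    tags: dict


def parse_sam_line(line):
    """Parse one tab-separated SAM alignment line (not a '@' header line)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line needs at least 11 fields")
    tags = {}
    for item in fields[11:]:
        # Optional fields have the form TAG:TYPE:VALUE; type 'i' is an integer.
        tag, value_type, value = item.split(":", 2)
        tags[tag] = int(value) if value_type == "i" else value
    return SamRecord(
        fields[0], int(fields[1]), fields[2], int(fields[3]), int(fields[4]),
        fields[5], fields[6], int(fields[7]), int(fields[8]),
        fields[9], fields[10], tags,
    )


CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_ops(cigar):
    """Decode a CIGAR string into (length, operation) pairs; '*' means unavailable."""
    if cigar == "*":
        return []
    return [(int(length), op) for length, op in CIGAR_RE.findall(cigar)]


if __name__ == "__main__":
    # Invented example record, for illustration only.
    example = "read1\t0\tchr1\t100\t60\t5M200N5M\t*\t0\t0\tACGTACGTAC\t*\tNM:i:0"
    record = parse_sam_line(example)
    print(record.rname, record.pos, cigar_ops(record.cigar))
```

In the SAM specification, the CIGAR operation N denotes a region skipped on the reference, which is how spliced alignments are commonly represented.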
For more information about the CRAC project, or to download the latest update of CRAC, see the CRAC project website.