Published Pages | vimalkumarvelayudhan | Plot and output Ribo-Seq read counts using riboplot and ribocount

Plot and output Ribo-Seq read counts using riboplot and ribocount - sample workflow

This sample workflow shows how to use the riboplot and ribocount programs, which are part of the RiboPlot suite.
Both tools are available under "RiboSeq Analysis --> Riboplot" on RiboGalaxy.

Sample data

For this tutorial, sample data (NCBI GEO: GSE53693) from Bazzini et al. will be used.

  1. 5h RPFs column replicate 1 run 1 - NCBI GEO: GSM1299080
  2. 5h input mRNA run 1 - NCBI GEO: GSM1299084

For each sample dataset above, the SRA files were downloaded using FileZilla (an FTP client) and converted individually to FASTQ format using the fastq-dump command. The resulting FASTQ files for each sample were then concatenated using the cat command.
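The concatenation step can also be sketched in Python for readers who prefer not to use cat. This is a minimal sketch rather than the commands used for the tutorial data; the function name `concatenate_fastq` and any file names passed to it are hypothetical.

```python
import shutil
from pathlib import Path


def concatenate_fastq(parts, destination):
    """Write the FASTQ files in `parts` to `destination`, in the order given.

    Files are copied byte for byte, as cat would do, so each input
    should already end with a newline.
    """
    destination = Path(destination)
    with destination.open("wb") as out:
        for part in parts:
            with Path(part).open("rb") as src:
                # Stream the file in chunks so large FASTQ files are not
                # loaded into memory at once.
                shutil.copyfileobj(src, out)
    return destination
```

For example, `concatenate_fastq(["run1.fastq", "run2.fastq"], "sample.fastq")` would produce one FASTQ file per sample, assuming those (hypothetical) per-run files exist.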

The FASTQ files were then transferred to the RiboGalaxy FTP server and imported into RiboGalaxy using the "Upload File" tool under "Get Data", with the format set to fastqsanger.

riboplot and ribocount require Ribo-Seq data to be aligned to a transcriptome using Bowtie version 1.

This can be done on RiboGalaxy itself by following the steps below.

1. Pre-process reads to remove adaptor and rRNA

This can be done using the "Cutadapt" and "Remove rRNA using Bowtie" tools on RiboGalaxy (under Pre-processing tools).

Example:
The FASTQ reads in the sample data were pre-processed using Cutadapt to remove the
adaptor sequence (ATCTCGTATGCCGTCTTCTGCTTG), and rRNA reads were then removed.
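To illustrate what adaptor removal does to a single read, here is a simplified sketch. It is not how Cutadapt is implemented: Cutadapt tolerates mismatches and also trims the quality string, whereas this version handles only exact matches on the sequence. The function name `trim_3prime_adapter` and the `min_overlap` parameter are assumptions made for illustration.

```python
ADAPTER = "ATCTCGTATGCCGTCTTCTGCTTG"  # adaptor sequence quoted in the example above


def trim_3prime_adapter(read, adapter=ADAPTER, min_overlap=3):
    """Remove an exactly matching 3' adaptor (or adaptor prefix) from `read`."""
    # Case 1: the whole adaptor occurs in the read; keep what precedes it.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # Case 2: the read ends partway through the adaptor; look for the
    # longest adaptor prefix (at least `min_overlap` bases) that is a
    # suffix of the read.
    for k in range(min(len(adapter) - 1, len(read)), min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:len(read) - k]
    # No adaptor found: return the read unchanged.
    return read
```

In a real FASTQ record the quality line would have to be shortened to the same length as the trimmed sequence, which this sketch leaves out.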

2. Align reads to a transcriptome.

Upload a FASTA format file of the transcriptome.

If you do not have a FASTA format file of the transcriptome, please follow the
"Alignments to a transcriptome" section of the RiboGalaxy Help page to obtain one.

Then use "Transcriptome Mapping --> Align to transcriptome using Bowtie" to align reads
to the transcriptome.

Optional for riboplot: If you wish to plot RNA coverage, you will need to align the RNA-Seq
data to the transcriptome in the same way as the Ribo-Seq data.

Example:
The zebrafish transcriptome was downloaded and the reads were aligned to it using the
steps above.

3. Convert Alignment format from SAM to BAM

This can be done by clicking on the pencil icon corresponding to the alignment dataset
and then selecting the "Convert Format" tab. Click "Convert" using the default option
"Convert SAM to BAM".

4. Sort the BAM file

Use the "Sort Data --> Sort BAM dataset" tool with the default "sort by" option.

riboplot

To plot and output read counts for a single transcript, select the riboplot tool from the menu and provide the following inputs (Figure 1).

Figure 1: Input options for riboplot


Description of options:

  1. Ribo-Seq alignment file in BAM format - Select the sorted Ribo-Seq alignment file from step 4.
  2. FASTA format file of the transcriptome - This is the transcriptome FASTA file obtained in step 2 or uploaded manually.
  3. Name of the transcript to plot (as in the FASTA and SAM/BAM).
  4. If RNA coverage is desired, select the 'Include RNA coverage' option and provide a sorted BAM file of RNA-Seq data.
  5. Input read lengths to consider. If 0 is input, all read lengths will be considered.
  6. An offset can be specified if necessary.
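The read length and offset options can be pictured with a small sketch that counts reads per position and per frame. It rests on several assumptions that are not confirmed by the tool: alignments are given as hypothetical (start position, read length) pairs with 1-based positions, the offset is added to each start position, and position 1 of the transcript is treated as frame 1. The function name `count_reads` is also an assumption.

```python
from collections import Counter


def count_reads(alignments, read_lengths=(0,), offset=0):
    """Count reads per position and per frame for one transcript.

    alignments   -- iterable of (start, length) pairs, 1-based starts
    read_lengths -- lengths to keep; a 0 in this tuple keeps all lengths
    offset       -- value added to every start position
    """
    keep_all = 0 in read_lengths
    per_position = Counter()
    for start, length in alignments:
        if keep_all or length in read_lengths:
            per_position[start + offset] += 1

    per_frame = {1: 0, 2: 0, 3: 0}
    for position, n in per_position.items():
        # Assumption: positions 1, 4, 7, ... fall in frame 1.
        frame = (position - 1) % 3 + 1
        per_frame[frame] += n
    return dict(per_position), per_frame
```

With `read_lengths=(28,)` only 28-nucleotide reads would be counted, while the default of 0 keeps every read regardless of length.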

Output (Figures 2 and 3)


Figure 2: Plot of Ribo-Seq read counts and RNA-Seq coverage with ORF architecture

Figure 3: A CSV file will also be included in the output containing Ribo-Seq read counts in 3 frames for the given transcript


ribocount

To output read counts for all transcripts in an alignment, select the ribocount tool from the RiboPlot suite (Figure 4).

Figure 4: Input options for ribocount


Description of options

  1. Ribo-Seq alignment file in BAM format - Select the sorted Ribo-Seq alignment file from step 4.
  2. FASTA format file of the transcriptome - This is the transcriptome FASTA file obtained in step 2 or uploaded manually.
  3. Input read lengths to consider. If 0 is input, all read lengths will be considered.
  4. An offset can be specified if necessary. If this option is provided, this offset is added to the read alignment positions.
  5. Choose whether to output read counts for the entire transcript or restrict read counts to the 5' or 3' region of the longest ORF. Default start (ATG) and stop codons ('TAG', 'TGA', 'TAA') are used to identify the longest ORF in 3 frames.
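The longest-ORF search described in the last option can be sketched as follows. This is an illustration of the general idea, not ribocount's own code: it scans the three forward frames for an ATG followed by the first in-frame stop codon and keeps the longest such stretch. Whether the tool includes the stop codon in the ORF, and how it breaks ties, is not stated here, so those choices (stop codon included, first longest ORF wins) are assumptions, as is the function name `longest_orf`.

```python
START = "ATG"
STOPS = {"TAG", "TGA", "TAA"}


def longest_orf(sequence):
    """Return (start, end) of the longest ORF as 0-based, end-exclusive
    indices, or None if no ATG...stop pair exists in any of the 3 frames."""
    seq = sequence.upper()
    best = None
    for frame in range(3):
        orf_start = None
        # Walk this frame codon by codon.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if orf_start is None and codon == START:
                orf_start = i
            elif orf_start is not None and codon in STOPS:
                candidate = (orf_start, i + 3)  # stop codon included
                if best is None or candidate[1] - candidate[0] > best[1] - best[0]:
                    best = candidate
                orf_start = None  # look for the next ORF in this frame
    return best
```

Under these assumptions, the 5' region would be `sequence[:start]` and the 3' region `sequence[end:]`, and read counts could then be restricted to positions falling in either slice.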

Output (Figures 5 and 6)


Figure 5: Summary of the ribocount run


As indicated, please download and extract the ribocount_output.zip file and open the index.html in a browser.
Total reads for each transcript will be displayed in a table along with the name of the transcript and a link to
the CSV file containing the read counts in 3 frames for each position in the transcript (Figure 6).


Figure 6: HTML page with results of the ribocount run for all transcripts in an alignment.


For additional information or questions, please refer to the RiboPlot documentation, the RiboGalaxy Help page or the Forums.