GRanges
The tRNAdb and mttRNAdb (Jühling et al. 2009) are compilations of tRNA sequences and tRNA genes. They are a follow-up to the database of Sprinzl et al. (Sprinzl and Vassilenko 2005).
Using tRNAdbImport, the tRNAdb can be accessed as outlined on the website http://trna.bioinf.uni-leipzig.de/, and the results are returned as a GRanges object.
The tRNAdb server is currently not available, so some of the code chunks in this vignette cannot currently be evaluated. See https://www.bioinf.uni-leipzig.de/services/webservices for more information.
GRanges
library(tRNAdbImport)
# accessing tRNAdb
# tRNA from yeast for Alanine and Phenylalanine
gr <- import.tRNAdb(organism = "Saccharomyces cerevisiae",
                    aminoacids = c("Phe","Ala"))
## Warning: tRNAdb Server seems to be not available.
# get a Phenylalanine tRNA from yeast
gr <- import.tRNAdb.id(tdbID = gr[gr$tRNA_type == "Phe",][1L]$tRNAdb_ID)
# find the same tRNA via blast
gr <- import.tRNAdb.blast(blastSeq = gr$tRNA_seq)
# accessing mttRNAdb
# get the mitochondrial tRNA for Alanine in Bos taurus
gr <- import.mttRNAdb(organism = "Bos taurus",
                      aminoacids = "Ala")
## Warning: tRNAdb Server seems to be not available.
# get one mitochondrial tRNA in Bos taurus
gr <- import.mttRNAdb.id(mtdbID = gr[1L]$tRNAdb_ID)
# check that the result has the appropriate columns
istRNAdbGRanges(gr)
## Warning: Input GRanges object does not meet the requirements of the function. The following columns are expected:
## 'tRNA_length', 'tRNA_type', 'tRNA_anticodon', 'tRNA_seq', 'tRNA_str', 'tRNA_CCA.end', 'tRNAdb_ID', 'tRNAdb', 'tRNAdb_organism', 'tRNAdb_strain', 'tRNAdb_taxonomyID', 'tRNAdb_verified'.
## [1] FALSE
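The warning above lists the metadata columns that the check expects. As a rough illustration of what such a check involves, the following base R sketch compares a set of column names against that list. The helper `missing_tRNAdb_columns()` is hypothetical and is not part of tRNAdbImport; the column names are copied from the warning message, and the `present` vector is a made-up stand-in for the column names of an incomplete result.

```r
# Column names copied from the warning emitted by istRNAdbGRanges() above
required_columns <- c(
  "tRNA_length", "tRNA_type", "tRNA_anticodon", "tRNA_seq", "tRNA_str",
  "tRNA_CCA.end", "tRNAdb_ID", "tRNAdb", "tRNAdb_organism", "tRNAdb_strain",
  "tRNAdb_taxonomyID", "tRNAdb_verified"
)

# Hypothetical helper (not part of tRNAdbImport): return the expected
# columns that are absent from a vector of column names
missing_tRNAdb_columns <- function(column_names) {
  setdiff(required_columns, column_names)
}

# Made-up stand-in for the metadata column names of an incomplete result
present <- c("tRNA_length", "tRNA_type", "tRNA_seq")
absent <- missing_tRNAdb_columns(present)

# Show which columns would still be needed
absent
```

For a real GRanges object, the column names could be obtained with `colnames(mcols(gr))` and passed to the helper in the same way.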
GRanges from the RNA database
The tRNAdb offers two different sets of data, one containing DNA sequences and one containing RNA sequences. Depending on the database selected (DNA by default), the GRanges will contain a DNAStringSet or a ModRNAStringSet as the tRNA_seq column. Because the RNA sequences can contain modified nucleotides, the ModRNAStringSet class is used instead of the RNAStringSet class to store the sequences correctly with all information intact.
gr <- import.tRNAdb(organism = "Saccharomyces cerevisiae",
                    aminoacids = c("Phe","Ala"),
                    database = "RNA")
gr$tRNA_seq
The special characters in the sequence might not exactly match the ones shown on the website, since they are sanitized internally to a unified dictionary defined in the Modstrings package. However, the type of modification encoded will remain the same (see the Modstrings package for more details).
The information on the position and type of the modifications can also be converted into a tabular format using the separate function from the Modstrings package.
separate(gr$tRNA_seq)
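Since the server may not respond, the same idea can be tried on a small hand-made sequence. The sketch below assumes the Modstrings package is installed; the sequence `ACGUG` and the choice of 7-methylguanosine at position 5 are made up for illustration and are not taken from the tRNAdb.

```r
library(Modstrings)

# Hand-made example sequence, not taken from the tRNAdb
rna <- RNAString("ACGUG")

# Mark position 5 as 7-methylguanosine; "m7G" is a short name from the
# Modstrings RNA modification alphabet
mod_rna <- modifyNucleotides(rna, 5L, "m7G")
mod_rna

# Tabular view of the modifications: one range per modified position,
# with the modification type stored in the 'mod' column
mod_table <- separate(mod_rna)
mod_table
```

A plain RNAString has no letter for the modified position, which is why the import uses the ModRNAStringSet class for sequences from the RNA database.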
The output can be saved or directly used for further analysis.
library(Biostrings)
library(rtracklayer)
# saving the tRNA sequences as fasta file
writeXStringSet(gr$tRNA_seq, filepath = tempfile())
# converting tRNAdb information to GFF compatible values
gff <- tRNAdb2GFF(gr)
gff
# Saving the information as gff3 file
export.gff3(gff, con = tempfile())
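The calls above write to an anonymous `tempfile()`, so the path is not kept. A minimal sketch of a round trip, assuming the Biostrings package is installed, stores the path first and reads the file back for a check. The two sequences and their names are hand-made stand-ins for `gr$tRNA_seq` from the DNA database.

```r
library(Biostrings)

# Hand-made stand-in for gr$tRNA_seq from the DNA database
seqs <- DNAStringSet(c(tRNA_example_1 = "GCGGATTTAGCTCAGTTGGGAGAGC",
                       tRNA_example_2 = "GGGCGTGTGGCGTAGTCGGTAGCGC"))

# Keep the path so the file can be found again later
fasta_file <- tempfile(fileext = ".fasta")
writeXStringSet(seqs, filepath = fasta_file)

# Read the sequences back and compare names and sequences with the input
seqs_back <- readDNAStringSet(fasta_file)
identical(as.character(seqs), as.character(seqs_back))
```

Sequences from the RNA database contain modified nucleotides, so this DNA-based round trip should not be assumed to carry over unchanged to a ModRNAStringSet.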
Please have a look at the tRNA package for further analysis of the tRNA sequences.
## R Under development (unstable) (2024-03-24 r86185)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.4 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] rtracklayer_1.63.1 tRNAdbImport_1.21.1 tRNA_1.21.2
## [4] Structstrings_1.19.1 Modstrings_1.19.0 Biostrings_2.71.5
## [7] XVector_0.43.1 GenomicRanges_1.55.4 GenomeInfoDb_1.39.9
## [10] IRanges_2.37.1 S4Vectors_0.41.5 BiocGenerics_0.49.1
## [13] BiocStyle_2.31.0
##
## loaded via a namespace (and not attached):
## [1] SummarizedExperiment_1.33.3 gtable_0.3.4
## [3] rjson_0.2.21 xfun_0.43
## [5] bslib_0.6.2 ggplot2_3.5.0
## [7] httr2_1.0.0 lattice_0.22-6
## [9] Biobase_2.63.0 vctrs_0.6.5
## [11] tools_4.4.0 bitops_1.0-7
## [13] generics_0.1.3 parallel_4.4.0
## [15] curl_5.2.1 tibble_3.2.1
## [17] fansi_1.0.6 pkgconfig_2.0.3
## [19] Matrix_1.7-0 desc_1.4.3
## [21] lifecycle_1.0.4 GenomeInfoDbData_1.2.11
## [23] compiler_4.4.0 stringr_1.5.1
## [25] Rsamtools_2.19.4 textshaping_0.3.7
## [27] munsell_0.5.0 codetools_0.2-19
## [29] htmltools_0.5.8 sass_0.4.9
## [31] RCurl_1.98-1.14 yaml_2.3.8
## [33] pkgdown_2.0.7 pillar_1.9.0
## [35] crayon_1.5.2 jquerylib_0.1.4
## [37] BiocParallel_1.37.1 DelayedArray_0.29.9
## [39] cachem_1.0.8 abind_1.4-5
## [41] tidyselect_1.2.1 digest_0.6.35
## [43] stringi_1.8.3 restfulr_0.0.15
## [45] dplyr_1.1.4 purrr_1.0.2
## [47] bookdown_0.38 fastmap_1.1.1
## [49] grid_4.4.0 SparseArray_1.3.4
## [51] colorspace_2.1-0 cli_3.6.2
## [53] magrittr_2.0.3 S4Arrays_1.3.6
## [55] XML_3.99-0.16.1 utf8_1.2.4
## [57] scales_1.3.0 rappdirs_0.3.3
## [59] httr_1.4.7 rmarkdown_2.26
## [61] matrixStats_1.2.0 ragg_1.3.0
## [63] memoise_2.0.1 evaluate_0.23
## [65] knitr_1.45 BiocIO_1.13.0
## [67] rlang_1.1.3 glue_1.7.0
## [69] BiocManager_1.30.22 xml2_1.3.6
## [71] jsonlite_1.8.8 R6_2.5.1
## [73] MatrixGenerics_1.15.0 GenomicAlignments_1.39.4
## [75] systemfonts_1.0.6 fs_1.6.3
## [77] zlibbioc_1.49.3