R/Modstrings-sanitize.R
sanitizeInput.Rd
Since the one-letter nomenclature for RNA and DNA modifications differs depending on the source, a translation to a common alphabet is necessary.
sanitizeInput exchanges characters based on a dictionary. The dictionary is expected to be a DataFrame with two columns, mods_abbrev and short_name. Based on the short_name, the characters in the input are converted from the values of mods_abbrev into the corresponding ones from the alphabet.
Only values that differ between the dictionary and the alphabet are searched for and exchanged.
sanitizeFromModomics and sanitizeFromtRNAdb use a predefined, built-in dictionary.
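The dictionary-based exchange described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the short names modX and modY, the target table, and the helper sanitize_sketch are hypothetical and serve only to show how codes are matched on short_name and how identical codes are skipped. The "@" to "÷" pair mirrors the Modomics example in the examples section.

```r
# Hypothetical source dictionary: one-letter codes keyed by short name.
dictionary <- data.frame(
  mods_abbrev = c("@", "7"),
  short_name  = c("modX", "modY"),  # hypothetical short names
  stringsAsFactors = FALSE
)

# Hypothetical target alphabet using the same short names.
target <- data.frame(
  abbrev     = c("÷", "7"),
  short_name = c("modX", "modY"),
  stringsAsFactors = FALSE
)

# Match both tables on short_name and keep only the codes that differ.
idx  <- match(dictionary$short_name, target$short_name)
from <- dictionary$mods_abbrev
to   <- target$abbrev[idx]
keep <- !is.na(to) & from != to
from <- from[keep]
to   <- to[keep]

# Replace every character of each input string via a lookup
# (hypothetical helper, not part of Modstrings).
sanitize_sketch <- function(input, from, to) {
  vapply(input, function(s) {
    chars <- strsplit(s, "")[[1]]
    hit <- match(chars, from)
    chars[!is.na(hit)] <- to[hit[!is.na(hit)]]
    paste(chars, collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

sanitize_sketch(c("AGC@", "AG7C"), from, to)
#> [1] "AGC÷" "AG7C"
```

Here "7" carries the same code in both tables, so it is left untouched, while "@" is exchanged for "÷".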
sanitizeInput(input, dictionary)
sanitizeFromModomics(input)
sanitizeFromtRNAdb(input)
The modified character vector, suitable for constructing a ModString object.
# Modomics
chr <- "AGC@"
# Error since the @ is not in the alphabet
if (FALSE) { # \dontrun{
seq <- ModRNAString(chr)
} # }
seq <- ModRNAString(sanitizeFromModomics(chr))
seq
#> 4-letter ModRNAString object
#> seq: AGC÷
# tRNAdb
chr <- "AGC+"
# No error but the + has a different meaning in the alphabet
if (FALSE) { # \dontrun{
seq <- ModRNAString(chr)
} # }
seq <- ModRNAString(sanitizeFromtRNAdb(chr))
seq
#> 4-letter ModRNAString object
#> seq: AGCΘ