Nucleotide modifications in a sequence are usually represented by special characters drawn from a number of sources. This makes such sequences challenging to work with in R and the Biostrings package. The Modstrings package implements this functionality for RNA and DNA sequences containing modified nucleotides by translating the characters internally so that they work with the infrastructure of the Biostrings package. To this end, it provides the ModRNAString and ModDNAString classes and their derivatives, along with functions to construct and modify these objects despite the encoding issues. In addition, conversion from sequences to list-like location information (and the reverse operation) is implemented.
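
As a rough illustration of the workflow described above, the following sketch constructs a modified RNA sequence, modifies an unmodified one, and converts between the sequence and a location-based representation. It assumes that the character "7" and the short name "m7G" both denote 7-methylguanosine in the package's RNA alphabet; the sequences themselves are made up for the example.

```r
library(Modstrings)

# Construct a modified RNA sequence directly from a string.
# Assumption: "7" is the one-letter code for m7G in the RNA alphabet.
mr <- ModRNAString("ACGU7")

# Alternatively, start from an unmodified Biostrings object and
# mark position 5 (a G) as m7G using its short name.
r <- RNAString("ACGUG")
mr2 <- modifyNucleotides(r, 5L, "m7G")

# Convert the sequence into location information
# (a GRanges object with one range per modification).
gr <- separate(mr2)

# Reverse operation: apply the locations to the unmodified sequence.
mr3 <- combineIntoModstrings(r, gr)
```

Here mr3 should describe the same modified sequence as mr2, since it is rebuilt from the unmodified sequence and the extracted locations.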

A good place to start is the package vignette and the man page for the ModStringSet objects.

The alphabets for the modifications used in this package are based on the compilation of RNA modifications at http://modomics.genesilico.pl by the Bujnicki lab and the compilation of DNA modifications at https://dnamod.hoffmanlab.org by the Hoffman lab. Both alphabets were modified to remove some incompatible characters.

Author

Felix G M Ernst [aut,cre] and Denis L.J. Lafontaine [ctb]