The Modifier class is a virtual class that provides the central functionality for searching for post-transcriptional RNA modification patterns in high-throughput sequencing data.
Each subclass has to implement the following:
Slot nucleotide: either "RNA" or "DNA". For convenience, the subclasses RNAModifier and DNAModifier are already available and can be inherited from.
Function aggregateData: used for the class-specific data aggregation
Function findMod: used for the class-specific search for modifications
Optionally, the function settings<- can be implemented to store additional arguments that the base class does not recognize.
Modifier objects are constructed centrally by calling Modifier() with a className matching the specific class to be constructed. This triggers the analysis immediately, unless find.mod is set to FALSE.
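The contract and the central constructor described above can be illustrated with a small toy model in plain S4. This is only a sketch of the pattern, not the package's implementation: the names ToyModifier, ToyPeakModifier, toyAggregate, toyFindMod and toyModifier are hypothetical stand-ins, and the "aggregation" and "search" steps are deliberately trivial.

```r
library(methods)

# Hypothetical virtual base class: it defines the slots, but no objects
# can be created from it directly.
setClass("ToyModifier",
         representation = representation("VIRTUAL",
                                         nucleotide = "character",
                                         data = "numeric",
                                         aggregate = "numeric",
                                         modifications = "integer"))

# Hypothetical generics standing in for the two functions every subclass
# has to implement.
setGeneric("toyAggregate", function(x) standardGeneric("toyAggregate"))
setGeneric("toyFindMod", function(x) standardGeneric("toyFindMod"))

# A concrete subclass fixes the nucleotide type and supplies both methods.
setClass("ToyPeakModifier", contains = "ToyModifier",
         prototype = prototype(nucleotide = "RNA"))

setMethod("toyAggregate", "ToyPeakModifier", function(x) {
  # Toy aggregation: scale the raw values to the maximum.
  x@aggregate <- x@data / max(x@data)
  x
})

setMethod("toyFindMod", "ToyPeakModifier", function(x) {
  # Toy search: report positions whose scaled value exceeds 0.5.
  x@modifications <- which(x@aggregate > 0.5)
  x
})

# Central constructor: className selects the class to build; the analysis
# runs immediately unless find.mod is FALSE.
toyModifier <- function(className, x, find.mod = TRUE) {
  obj <- new(className, data = x)
  if (find.mod) {
    obj <- toyFindMod(toyAggregate(obj))
  }
  obj
}

now <- toyModifier("ToyPeakModifier", c(1, 10, 2, 9))
later <- toyModifier("ToyPeakModifier", c(1, 10, 2, 9), find.mod = FALSE)
```

In this toy model, now already carries aggregated data and found positions, whereas later only holds the raw data until the two steps are run explicitly.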
Modifier(className, x, annotation, sequences, seqinfo, ...)
# S4 method for SequenceData
Modifier(
className,
x,
annotation = NULL,
sequences = NULL,
seqinfo = NULL,
...
)
# S4 method for SequenceDataSet
Modifier(
className,
x,
annotation = NULL,
sequences = NULL,
seqinfo = NULL,
...
)
# S4 method for SequenceDataList
Modifier(
className,
x,
annotation = NULL,
sequences = NULL,
seqinfo = NULL,
...
)
# S4 method for character
Modifier(
className,
x,
annotation = NULL,
sequences = NULL,
seqinfo = NULL,
...
)
# S4 method for list
Modifier(
className,
x,
annotation = NULL,
sequences = NULL,
seqinfo = NULL,
...
)
# S4 method for BamFileList
Modifier(
className,
x,
annotation = NULL,
sequences = NULL,
seqinfo = NULL,
...
)
className
The name of the class which should be constructed.
x
The input, which can be of the following types:
SequenceData: a single SequenceData or a list containing only SequenceData objects. The input will just be used to fill the data slot of the Modifier and must match the requirements of the specific Modifier class.
BamFileList: a named BamFileList
character: a character vector, which must be coercible to a named BamFileList referencing existing BAM files. Valid names are control and treated to define conditions and replicates.
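As an illustration of the named character input, the following base R sketch builds such a vector and derives a condition and a replicate number per file. The file paths are hypothetical, and numbering replicates in order of appearance within each condition is an assumption made for this example, not a statement about the package internals.

```r
# Hypothetical BAM file paths; each name marks the file as "treated" or
# "control". Repeated names stand for replicates of the same condition.
files <- c(treated = "sample1.bam",
           treated = "sample2.bam",
           control = "sample3.bam")

# Only the two documented names are valid.
stopifnot(all(names(files) %in% c("control", "treated")))

# Condition per file, taken directly from the names.
condition <- factor(names(files), levels = c("control", "treated"))

# Replicate number per file, counted within each condition
# (assumption: numbering follows the order of the input).
replicate_no <- ave(seq_along(files), names(files), FUN = seq_along)

data.frame(file = unname(files), condition, replicate = replicate_no)
```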
annotation
Annotation data, which must match the information contained in the BAM files. This parameter is only required if x is not a SequenceData object or a list of SequenceData objects.
sequences
Sequences matching the target sequences the reads were mapped onto. This must match the information contained in the BAM files. This parameter is only required if x is not a SequenceData object or a list of SequenceData objects.
seqinfo
An optional Seqinfo argument or character vector, which can be coerced to one, to subset the sequences to be analyzed on a per chromosome basis.
...
Additional optional parameters:
find.mod: TRUE or FALSE: should the search for modifications be triggered upon construction? If not, the search can be started by calling the modify() function.
additional parameters depending on the specific Modifier class
All additional options must be named and will be passed to the settings function and onto the SequenceData objects, if x is not a SequenceData object or a list of SequenceData objects.
a Modifier object of type className
nucleotide
a character value, which needs to contain "RNA" or "DNA"
mod
a character value, which needs to contain one or more elements from the alphabet of a ModRNAString or ModDNAString class.
score
the main score identifier used for visualizations
dataType
the class name(s) of the SequenceData class used
bamfiles
the input BAM files as a BamFileList
condition
conditions along the BamFileList: either control or treated
replicate
replicate number along the BamFileList for each of the condition types.
data
The sequence data object: either a SequenceData, a SequenceDataSet or a SequenceDataList object, if more than one dataType is used.
aggregate
the aggregated data as a SplitDataFrameList
modifications
the found modifications as a GRanges object
settings
arguments used for the analysis as a list
aggregateValidForCurrentArguments
TRUE or FALSE, whether the aggregate data was constructed with the current arguments
modificationsValidForCurrentArguments
TRUE or FALSE, whether the modifications were found with the current arguments
Modifier objects can be created in two ways: either by providing a list of BAM files, or by providing SequenceData/SequenceDataSet/SequenceDataList objects that match the structure in dataType().
dataType() can be a character vector or a list of character vectors, and the input has to follow the corresponding structure:
a single character: a SequenceData is constructed/expected.
a character vector: a SequenceDataSet is constructed/expected.
a list of character vectors: a SequenceDataList is constructed/expected.
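The three cases above can be written down as a small lookup in base R. The helper expectedContainer() is hypothetical and exists only for this illustration; the class names passed to it are example values.

```r
# Hypothetical helper: map the structure of a dataType value to the name of
# the container that is constructed/expected for it.
expectedContainer <- function(dataType) {
  if (is.list(dataType)) {
    "SequenceDataList"
  } else if (is.character(dataType) && length(dataType) == 1L) {
    "SequenceData"
  } else if (is.character(dataType)) {
    "SequenceDataSet"
  } else {
    stop("dataType must be a character vector or a list of character vectors")
  }
}

# Example values for each of the three cases.
single <- expectedContainer("PileupSequenceData")
set <- expectedContainer(c("PileupSequenceData", "CoverageSequenceData"))
nested <- expectedContainer(list("PileupSequenceData",
                                 c("End5SequenceData", "CoverageSequenceData")))
```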
The cases for a SequenceData or SequenceDataSet are straightforward, since the input remains the same. The last case is special: it is a hypothetical option in which BAM files from two or more different methods have to be combined to reliably detect a single modification. (The elements of a SequenceDataList do not have to be created from the same BAM files, whereas the elements of a SequenceDataSet do.)
In this case, a list of character vectors is expected. Each element must be named according to the names of dataType() and contain a character vector for creating a SequenceData object.
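A minimal sketch of what such a nested input could look like is given below. The element names methodA and methodB, the file paths and the helper checkListInput() are all hypothetical; the check only mirrors the two requirements stated above and is not the validation performed by the package.

```r
# Hypothetical dataType with one named element per method.
dataType <- list(methodA = "PileupSequenceData",
                 methodB = c("End5SequenceData", "CoverageSequenceData"))

# Matching input: one named character vector of (hypothetical) BAM files
# per element of dataType.
x <- list(methodA = c(treated = "a_treated.bam", control = "a_control.bam"),
          methodB = c(treated = "b_treated.bam", control = "b_control.bam"))

# Hypothetical check: x is a named list, its names match those of dataType,
# and every element is a character vector.
checkListInput <- function(x, dataType) {
  is.list(x) &&
    !is.null(names(x)) &&
    setequal(names(x), names(dataType)) &&
    all(vapply(x, is.character, logical(1)))
}

ok <- checkListInput(x, dataType)
bad <- checkListInput(x["methodA"], dataType)
```

Here ok is TRUE, while bad is FALSE because the element for methodB is missing.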