RNAmodR 1.0.2
Post-transcriptional modifications are abundant in rRNA and tRNA and can be detected classically via several strategies. However, difficulties arise if the identity and the position of the modified nucleotides are to be determined at the same time. Classically, primer extension, a form of reverse transcription (RT), allows certain modifications to be detected through blocks during the RT or through changes in the cDNA sequence. Other modifications need to be selectively treated by chemical reactions to influence the outcome of the reverse transcription.
With the increased availability of high throughput sequencing, these classical methods were adapted to high throughput methods, allowing more RNA molecules to be assessed at the same time. With these advances, post-transcriptional modifications were also detected on mRNA. Among these high throughput techniques are, for example, Pseudo-Seq [Carlile et al. (2014)], RiboMethSeq [Birkedal et al. (2015)] and AlkAnilineSeq [Marchand et al. (2018)], each able to detect a specific type of modification from footprints in RNA-Seq data prepared with the selected method.
Since similar patterns can be observed with some of these techniques, overlaps between the bioinformatic pipelines already exist and will become more frequent as new sequencing techniques emerge.
RNAmodR implements classes and a workflow to detect post-transcriptional RNA modifications in high throughput sequencing data. It is easily adaptable to new methods and can help during the phase of initial method development as well as more complex screenings.
Briefly, from the SequenceData class, specific subclasses are derived for accessing specific aspects of aligned reads, e.g. 5’-end positions or pileup data. With this a Modifier class can be used to detect specific patterns for individual types of modifications. The SequenceData classes can be shared by different Modifier classes, allowing easy adaptation to new methods.
## snapshotDate(): 2019-10-22
library(rtracklayer)
library(Rsamtools)
library(GenomicFeatures)
library(RNAmodR.Data)
library(RNAmodR)
Each SequenceData object is created with a named character vector, which can be coerced to a BamFileList, or with a named BamFileList. The names must be either “treated” or “control” describing the condition the data file belongs to. Multiple files can be given per condition and are used as replicates.
annotation <- GFF3File(RNAmodR.Data.example.gff3())
sequences <- RNAmodR.Data.example.fasta()
files <- c(Treated = RNAmodR.Data.example.bam.1(),
Treated = RNAmodR.Data.example.bam.2(),
Treated = RNAmodR.Data.example.bam.3())
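The naming rule for the input files can be checked before any data is loaded. The following is only a minimal sketch in base R: `check_conditions()` is a hypothetical helper and not part of RNAmodR, the file paths are placeholders, and it assumes that capitalisation of the names is ignored, as the capitalised names in the example suggest.

```r
# Hypothetical helper (not part of RNAmodR): validate the condition names
# of a named character vector of BAM file paths.
check_conditions <- function(files) {
  conditions <- tolower(names(files))
  if (is.null(names(files)) ||
      any(!conditions %in% c("treated", "control"))) {
    stop("all files must be named 'treated' or 'control'")
  }
  # number of replicates per condition
  table(conditions)
}

# Placeholder paths, for illustration only
example_files <- c(Treated = "rep1.bam", Treated = "rep2.bam",
                   Control = "ctrl1.bam")
replicates <- check_conditions(example_files)
replicates
```

Files sharing a name are counted as replicates of the same condition, which mirrors the description given for the input.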
For annotation and sequences several inputs are accepted. annotation can be a GRangesList, a GFF3File or a TxDb object. Internally, a GFF3File is converted to a TxDb object and a GRangesList is retrieved using the exonsBy function.
seqdata <- End5SequenceData(files, annotation = annotation,
sequences = sequences)
## Import genomic features from the file as a GRanges object ... OK
## Prepare the 'metadata' data frame ... OK
## Make the TxDb object ... OK
## Loading 5'-end position data from BAM files ... OK
seqdata
## End5SequenceData with 60 elements containing 3 data columns and 3 metadata columns
## - Data columns:
## end5.treated.1 end5.treated.2 end5.treated.3
## <integer> <integer> <integer>
## - Seqinfo object with 84 sequences from an unspecified genome; no seqlengths:
SequenceData extends from a CompressedSplitDataFrameList and contains the data per transcript alongside the annotation information and the sequence. The additional data stored within the SequenceData can be accessed by several functions.
names(seqdata) # matches the transcript names as returned by a TxDb object
colnames(seqdata) # returns a CharacterList of all column names
bamfiles(seqdata)
ranges(seqdata) # generated from a TxDb object
sequences(seqdata)
seqinfo(seqdata)
Currently the following SequenceData classes are implemented:
End5SequenceData
End3SequenceData
EndSequenceData
ProtectedEndSequenceData
CoverageSequenceData
PileupSequenceData
NormEnd5SequenceData
NormEnd3SequenceData
The data types and names of the columns are different for most of the SequenceData classes. As a naming convention, a descriptor is combined with the condition as defined in the files input and the replicate number. For more details please have a look at the man pages, e.g. ?End5SequenceData.
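The naming convention can be reproduced in a few lines of base R. This is a sketch only: `make_column_names()` is a hypothetical helper, not an RNAmodR function, and it assumes that the condition is lower-cased and that replicates are numbered consecutively within each condition.

```r
# Hypothetical helper (not part of RNAmodR): combine a descriptor, the
# condition and a per-condition replicate number into column names.
make_column_names <- function(descriptor, conditions) {
  conditions <- tolower(conditions)
  # running replicate number within each condition
  replicate <- ave(seq_along(conditions), conditions, FUN = seq_along)
  paste(descriptor, conditions, replicate, sep = ".")
}

column_names <- make_column_names("end5", c("Treated", "Treated", "Treated"))
column_names
```

For three treated files and the descriptor end5 this gives the same column names as shown for the End5SequenceData object.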
SequenceData objects can be subset like a CompressedSplitDataFrameList. Elements are returned as a SequenceDataFrame depending on the type of SequenceData used. For each SequenceData class a matching SequenceDataFrame is implemented.
seqdata[1]
## End5SequenceData with 1 elements containing 3 data columns and 3 metadata columns
## - Data columns:
## end5.treated.1 end5.treated.2 end5.treated.3
## <integer> <integer> <integer>
## - Seqinfo object with 84 sequences from an unspecified genome; no seqlengths:
sdf <- seqdata[[1]]
sdf
## End5SequenceDataFrame with 1649 rows and 3 columns
## end5.treated.1 end5.treated.2 end5.treated.3
## <integer> <integer> <integer>
## 1 1 4 0
## 2 0 2 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## ... ... ... ...
## 1645 0 0 0
## 1646 0 0 0
## 1647 0 0 0
## 1648 0 0 0
## 1649 0 0 0
##
## containing a GRanges object with 1 range and 3 metadata columns:
## seqnames ranges strand | exon_id exon_name exon_rank
## <Rle> <IRanges> <Rle> | <integer> <character> <integer>
## [1] Q0020_15S_RRNA 1-1649 + | 1 Q0020 1
## -------
## seqinfo: 60 sequences from an unspecified genome; no seqlengths
##
## and a 1649-letter "RNAString" instance
## seq: GUAAAAAAUUUAUAAGAAUAUGAUGUUGGUUCAGAU...UGCGGUGGGCUUAUAAAUAUCUUAAAUAUUCUUACA
The SequenceDataFrame objects retain some accessor functions from the SequenceData class.
names(sdf) # this returns the column names of the data
ranges(sdf)
sequences(sdf)
Subsetting of a SequenceDataFrame returns a SequenceDataFrame or DataFrame, depending on whether it is subset by a column or row, respectively. The drop argument is ignored for column subsetting.
sdf[,1:2]
## End5SequenceDataFrame with 1649 rows and 2 columns
## end5.treated.1 end5.treated.2
## <integer> <integer>
## 1 1 4
## 2 0 2
## 3 0 0
## 4 0 0
## 5 0 0
## ... ... ...
## 1645 0 0
## 1646 0 0
## 1647 0 0
## 1648 0 0
## 1649 0 0
##
## containing a GRanges object with 1 range and 3 metadata columns:
## seqnames ranges strand | exon_id exon_name exon_rank
## <Rle> <IRanges> <Rle> | <integer> <character> <integer>
## [1] Q0020_15S_RRNA 1-1649 + | 1 Q0020 1
## -------
## seqinfo: 60 sequences from an unspecified genome; no seqlengths
##
## and a 1649-letter "RNAString" instance
## seq: GUAAAAAAUUUAUAAGAAUAUGAUGUUGGUUCAGAU...UGCGGUGGGCUUAUAAAUAUCUUAAAUAUUCUUACA
sdf[1:3,]
## DataFrame with 3 rows and 3 columns
## end5.treated.1 end5.treated.2 end5.treated.3
## <integer> <integer> <integer>
## 1 1 4 0
## 2 0 2 0
## 3 0 0 0
Whereas the SequenceData classes are used to hold the data, Modifier classes are used to detect certain features within high throughput sequencing data to assign the presence of specific modifications for an established pattern. The Modifier class (and its nucleotide specific subclasses RNAModifier and DNAModifier) is virtual and can be adapted to individual methods. For example, mapped reads can be analyzed using the ModInosine class to reveal the presence of I by detecting an A to G conversion in normal RNA-Seq data. Therefore, ModInosine inherits from RNAModifier.
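The idea behind detecting an A to G conversion can be illustrated on toy pileup counts. The sketch below is plain base R with made-up numbers and assumed thresholds; it only counts reads supporting A or G and should not be read as the scoring implemented in ModInosine.

```r
# Toy pileup counts for four positions whose reference base is A
# (made-up numbers, for illustration only)
pileup <- data.frame(
  ref = c("A", "A", "A", "A"),
  A   = c(95, 5, 3, 40),
  G   = c(5, 95, 2, 60)
)

# Simplification: coverage only counts reads supporting A or G
coverage <- pileup$A + pileup$G
g_fraction <- pileup$G / coverage   # share of reads supporting G

# Assumed thresholds, chosen for this sketch
min_coverage <- 10
min_score <- 0.4

candidate <- pileup$ref == "A" &
  coverage >= min_coverage &
  g_fraction >= min_score
candidate
```

The third position has a high share of G reads but too little coverage, so it is not reported as a candidate.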
To fix the data processing and detection strategy, for each type of sequencing method a Modifier class can be developed alongside to detect modifications. For more information on how to develop such a class and potentially a new corresponding SequenceData class, please have a look at the vignette for creating a new analysis.
For now three Modifier classes are available:
ModInosine
ModRiboMethSeq from the RNAmodR.RiboMethSeq package
ModAlkAnilineSeq from the RNAmodR.AlkAnilineSeq package
Modifier objects can use and wrap multiple SequenceData objects as elements of a SequenceDataSet class. The elements of this class are different types of SequenceData, which are required by the specific Modifier class. However, they are required to contain data for the same annotation and sequence data.
Modifier objects are created with the same arguments as SequenceData objects and will start loading the necessary SequenceData objects from these. In addition, they will automatically calculate any additional scores (aggregation) and then search for modifications, unless the optional argument find.mod is set to FALSE.
mi <- ModInosine(files, annotation = annotation, sequences = sequences)
## Import genomic features from the file as a GRanges object ... OK
## Prepare the 'metadata' data frame ... OK
## Make the TxDb object ... OK
## Loading Pileup data from BAM files ... OK
## Aggregating data and calculating scores ... Starting to search for 'Inosine' ... done.
(Hint: If you use an artificial genome, name the chromosomes chr1-chrN. It will make some things easier for subsequent visualization, which relies on the Gviz package.)
Since the Modifier class wraps a SequenceData object, the accessors to data contained within work similarly to the SequenceData accessors described above. What type of conditions the Modifier class expects/supports is usually described in the man pages of the Modifier class.
names(mi) # matches the transcript names as returned by a TxDb object
bamfiles(mi)
ranges(mi) # generated from a TxDb object
sequences(mi)
seqinfo(mi)
sequenceData(mi) # returns the SequenceData
The behavior of a Modifier class can be fine tuned using settings. The settings() function is a getter/setter for arguments used in the analysis. The available settings may differ between Modifier classes, depending on the particular strategy and whether they are implemented as flexible settings.
settings(mi)
## $minCoverage
## [1] 10
##
## $minReplicate
## [1] 1
##
## $find.mod
## [1] TRUE
##
## $minScore
## [1] 0.4
settings(mi,"minScore")
## [1] 0.4
settings(mi) <- list(minScore = 0.5)
settings(mi,"minScore")
## [1] 0.5
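The setter is called with a list that names only the setting to be changed. A plain list can be used to sketch this kind of partial update with modifyList() from base R. This is an analogy under the assumption that entries not named in the list are left untouched; it is not how settings() is implemented.

```r
# Plain list standing in for the settings of a Modifier object
current <- list(minCoverage = 10L, minReplicate = 1L,
                find.mod = TRUE, minScore = 0.4)

# Only the named entries are replaced; everything else is kept
updated <- modifyList(current, list(minScore = 0.5))
updated$minScore
updated$minCoverage
```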
Each Modifier object is able to represent one sample set with multiple replicates of data. To easily compare multiple sample sets the ModifierSet class is implemented.
The ModifierSet object is created from a named list of named character vectors or BamFileList objects. Each element in the list is a sample type with a corresponding name. Each entry in the character vector/BamFileList is a replicate. (Alternatively, a ModifierSet can also be created from a list of Modifier objects, if they are of the same type.)
sequences <- RNAmodR.Data.example.AAS.fasta()
annotation <- GFF3File(RNAmodR.Data.example.AAS.gff3())
files <- list("SampleSet1" = c(treated = RNAmodR.Data.example.wt.1(),
treated = RNAmodR.Data.example.wt.2(),
treated = RNAmodR.Data.example.wt.3()),
"SampleSet2" = c(treated = RNAmodR.Data.example.bud23.1(),
treated = RNAmodR.Data.example.bud23.2()),
"SampleSet3" = c(treated = RNAmodR.Data.example.trm8.1(),
treated = RNAmodR.Data.example.trm8.2()))
msi <- ModSetInosine(files, annotation = annotation, sequences = sequences)
## Import genomic features from the file as a GRanges object ... OK
## Prepare the 'metadata' data frame ... OK
## Make the TxDb object ... OK
The creation of the ModifierSet will itself trigger the creation of one Modifier object per sample set, each containing the data from that sample set. This step is parallelized using the functions from the BiocParallel package. If a Modifier class itself uses parallel computing for its analysis, it is switched off unless internalBP = TRUE is set. In this case each Modifier object is created in sequence, allowing parallel computing during the creation of each object.
names(msi)
## [1] "SampleSet1" "SampleSet2" "SampleSet3"
msi[[1]]
## A ModInosine object containing PileupSequenceData with 11 elements.
## | Input files:
## - treated: /home/biocbuild/.cache/ExperimentHub/1fd24c10d892_2544
## - treated: /home/biocbuild/.cache/ExperimentHub/1fd25d13f72a_2546
## - treated: /home/biocbuild/.cache/ExperimentHub/1fd230ca865e_2548
## | Nucleotide - Modification type(s): RNA - I
## | Modifications found: yes (6)
## | Settings:
## minCoverage minReplicate find.mod minScore
## <integer> <integer> <logical> <numeric>
## 10 1 TRUE 0.4
Again, accessors remain mostly the same as described above for the Modifier class, returning a list of results, one element for each Modifier object.
bamfiles(msi)
ranges(msi) # generated from a TxDb object
sequences(msi)
seqinfo(msi)
Found modifications can be retrieved from a Modifier or ModifierSet object via the modifications() function. The function returns a GRanges or GRangesList object, respectively, which contains the coordinates of the modifications with respect to the genome used. For example, if a transcript starts at position 100 and contains a modified nucleotide at position 50, the returned coordinate will be 149.
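The arithmetic behind this example can be written down as a small base R function. `to_genomic()` is a hypothetical helper for illustration, not part of RNAmodR; it assumes 1-based coordinates and a transcript consisting of a single exon, and the transcript end of 300 is made up.

```r
# Hypothetical helper (not part of RNAmodR): convert a 1-based position
# within a single-exon transcript to a genomic coordinate.
to_genomic <- function(pos, tx_start, tx_end, strand = "+") {
  if (strand == "+") {
    tx_start + pos - 1L   # count from the transcript start
  } else {
    tx_end - pos + 1L     # on the minus strand, count back from the end
  }
}

plus_coord  <- to_genomic(50L, tx_start = 100L, tx_end = 300L)
minus_coord <- to_genomic(50L, tx_start = 100L, tx_end = 300L, strand = "-")
plus_coord
minus_coord
```

The offset of one arises because position 1 of the transcript coincides with the transcript start itself.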
mod <- modifications(msi)
mod[[1]]
## GRanges object with 6 ranges and 5 metadata columns:
## seqnames ranges strand | mod source type
## <Rle> <IRanges> <Rle> | <character> <character> <character>
## [1] chr2 34 + | I RNAmodR RNAMOD
## [2] chr4 35 + | I RNAmodR RNAMOD
## [3] chr6 34 + | I RNAmodR RNAMOD
## [4] chr7 67 + | I RNAmodR RNAMOD
## [5] chr9 7 + | I RNAmodR RNAMOD
## [6] chr11 35 + | I RNAmodR RNAMOD
## score Parent
## <numeric> <character>
## [1] 0.900932010777216 2
## [2] 0.899621561783724 4
## [3] 0.984035468313303 6
## [4] 0.934553142420191 7
## [5] 0.709758144689451 9
## [6] 0.874027388733271 11
## -------
## seqinfo: 11 sequences from an unspecified genome; no seqlengths
To retrieve the coordinates with respect to the transcript boundaries, use the optional argument perTranscript = TRUE. In the example provided here, this will yield the same coordinates, since a custom genome was used for mapping of the example, which contains one chromosome per transcript and no transcripts on the negative strand.
mod <- modifications(msi, perTranscript = TRUE)
mod[[1]]
## GRanges object with 6 ranges and 5 metadata columns:
## seqnames ranges strand | mod source type
## <Rle> <IRanges> <Rle> | <character> <character> <character>
## [1] chr2 34 * | I RNAmodR RNAMOD
## [2] chr4 35 * | I RNAmodR RNAMOD
## [3] chr6 34 * | I RNAmodR RNAMOD
## [4] chr7 67 * | I RNAmodR RNAMOD
## [5] chr9 7 * | I RNAmodR RNAMOD
## [6] chr11 35 * | I RNAmodR RNAMOD
## score Parent
## <numeric> <character>
## [1] 0.900932010777216 2
## [2] 0.899621561783724 4
## [3] 0.984035468313303 6
## [4] 0.934553142420191 7
## [5] 0.709758144689451 9
## [6] 0.874027388733271 11
## -------
## seqinfo: 11 sequences from an unspecified genome; no seqlengths
To compare results between samples, a ModifierSet as well as a definition of positions to compare are required. To construct a set of positions, we will use all unique positions at which a modification was found in any of the sample sets as an example.
mod <- modifications(msi)
coord <- unique(unlist(mod))
coord$score <- NULL
coord$sd <- NULL
compareByCoord(msi,coord)
## DataFrame with 6 rows and 6 columns
## SampleSet1 SampleSet2 SampleSet3 names positions
## <numeric> <numeric> <numeric> <factor> <factor>
## 1 0.900932010777216 0.998134328358209 0.953650793650794 2 34
## 2 0.899621561783724 0.856240838876987 0.976927536231884 4 35
## 3 0.984035468313303 0.992011716149647 0.993127802959445 6 34
## 4 0.934553142420191 0.942904664532954 0.943773271556888 7 67
## 5 0.709758144689451 0.766483635920979 0.681450587234341 9 7
## 6 0.874027388733271 0.971474358974359 0.954782082324455 11 35
## mod
## <character>
## 1 I
## 2 I
## 3 I
## 4 I
## 5 I
## 6 I
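How the set of positions is built determines what is compared. The base R sketch below uses made-up calls for two sample sets and contrasts two options: all positions called in at least one sample set, which is the analogue of unique(unlist(mod)), and only those positions called in every sample set. The data frames stand in for the GRanges objects and are assumptions for illustration.

```r
# Toy modification calls for two sample sets (made-up positions)
calls <- list(
  SampleSet1 = data.frame(names = c("2", "4", "6"),
                          positions = c(34, 35, 34)),
  SampleSet2 = data.frame(names = c("2", "6", "9"),
                          positions = c(34, 34, 7))
)

# Every position called in at least one sample set
all_calls <- unique(do.call(rbind, calls))
rownames(all_calls) <- NULL

# Only positions called in every sample set
shared_calls <- Reduce(
  function(x, y) merge(x, y, by = c("names", "positions")),
  calls
)

nrow(all_calls)
nrow(shared_calls)
```

Using all positions keeps sites that were called in a single sample set only, so their scores in the other sample sets can still be inspected.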
The result can also be plotted using plotCompareByCoord, which accepts an optional argument alias to allow transcript ids to be converted to other identifiers. For this step it is probably helpful to construct a TxDb object right at the beginning and use it for constructing the Modifier/ModifierSet object as the annotation argument.
txdb <- makeTxDbFromGFF(annotation)
## Import genomic features from the file as a GRanges object ... OK
## Prepare the 'metadata' data frame ... OK
## Make the TxDb object ... OK
alias <- data.frame(tx_id = names(id2name(txdb)),
name = id2name(txdb))
plotCompareByCoord(msi, coord, alias = alias)
Additionally, the order of sample sets can be adjusted, the scores can be normalized to any of the sample sets and the numbering of positions can be shown per transcript.
plotCompareByCoord(msi[c(3,1,2)], coord, alias = alias, normalize = "SampleSet3",
perTranscript = TRUE)
The calculated scores and data can be visualized along the transcripts or chunks of the transcript. With the optional argument showSequenceData the plotting of the sequence data in addition to the score data can be triggered by setting it to TRUE.
plotData(msi, "2", from = 10L, to = 45L, alias = alias) # showSequenceData = FALSE
plotData(msi[1:2], "2", from = 10L, to = 45L, showSequenceData = TRUE, alias = alias)
Since the detection of modifications from high throughput sequencing data usually relies on thresholds for calling modifications, there is considerable interest in analyzing the performance of the method based on the scores chosen and the available samples. To analyse the performance, the function plotROC() is implemented, which is a wrapper around the functionality of the ROCR package [Sing et al. (2005)].
For the example data used in this vignette, the information gained is rather limited and the following figure should be regarded just as a proof of concept. In addition, the use of found modification sites as an input for plotROC is strongly discouraged, since it defeats the purpose of the test. Therefore, please regard this aspect of the next chunk as proof of concept as well.
plotROC(msi, coord)
Please have a look at ?plotROC for additional details. Most of the functionality from the ROCR package is available via additional arguments, thus the output of plotROC can be heavily customized.
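What a ROC analysis measures can be sketched without any package. The example below uses made-up scores and truth labels and a hypothetical helper `roc_points()`; it computes the true and false positive rate for each score threshold in base R and does not reproduce the internals of plotROC or ROCR.

```r
# Made-up scores and truth labels (TRUE = position is known to be modified)
scores <- c(0.95, 0.90, 0.70, 0.40, 0.30, 0.10)
truth  <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE)

# Hypothetical helper: true and false positive rate for each threshold
roc_points <- function(scores, truth) {
  thresholds <- sort(unique(scores), decreasing = TRUE)
  rates <- vapply(thresholds, function(th) {
    called <- scores >= th   # positions called at this threshold
    c(tpr = sum(called & truth) / sum(truth),
      fpr = sum(called & !truth) / sum(!truth))
  }, c(tpr = 0, fpr = 0))
  data.frame(threshold = thresholds,
             tpr = rates["tpr", ],
             fpr = rates["fpr", ])
}

roc <- roc_points(scores, truth)
roc
```

The truth labels have to come from an independent source. Deriving them from the calls under evaluation would make every called position a true positive by construction.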
The development of RNAmodR will continue. General aspects of the analysis workflow will be addressed in the RNAmodR package, whereas additional classes for new sequencing techniques targeted at detecting post-transcriptional modifications will be wrapped in individual packages. This will allow general improvements to propagate upstream, but not hinder individual requirements of each detection strategy.
For an example have a look at the RNAmodR.RiboMethSeq and RNAmodR.AlkAnilineSeq packages.
Features, which might be added in the future:
We welcome contributions of any sort.
sessionInfo()
## R version 3.6.2 (2019-12-12)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.3 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.10-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.10-bioc/R/lib/libRlapack.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] parallel stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] RNAmodR_1.0.2 Modstrings_1.2.0 RNAmodR.Data_1.0.0
## [4] ExperimentHubData_1.12.0 AnnotationHubData_1.16.0 futile.logger_1.4.3
## [7] ExperimentHub_1.12.0 AnnotationHub_2.18.0 BiocFileCache_1.10.2
## [10] dbplyr_1.4.2 GenomicFeatures_1.38.0 AnnotationDbi_1.48.0
## [13] Biobase_2.46.0 Rsamtools_2.2.1 Biostrings_2.54.0
## [16] XVector_0.26.0 rtracklayer_1.46.0 GenomicRanges_1.38.0
## [19] GenomeInfoDb_1.22.0 IRanges_2.20.1 S4Vectors_0.24.1
## [22] BiocGenerics_0.32.0 BiocStyle_2.14.4
##
## loaded via a namespace (and not attached):
## [1] RUnit_0.4.32 tidyselect_0.2.5
## [3] RSQLite_2.2.0 htmlwidgets_1.5.1
## [5] grid_3.6.2 BiocParallel_1.20.1
## [7] munsell_0.5.0 codetools_0.2-16
## [9] colorspace_1.4-1 OrganismDbi_1.28.0
## [11] highr_0.8 knitr_1.26
## [13] rstudioapi_0.10 ROCR_1.0-7
## [15] assertive.base_0.0-7 labeling_0.3
## [17] optparse_1.6.4 GenomeInfoDbData_1.2.2
## [19] farver_2.0.2 bit64_0.9-7
## [21] vctrs_0.2.1 lambda.r_1.2.4
## [23] xfun_0.11 biovizBase_1.34.1
## [25] R6_2.4.1 assertive.sets_0.0-3
## [27] AnnotationFilter_1.10.0 bitops_1.0-6
## [29] DelayedArray_0.12.2 assertthat_0.2.1
## [31] promises_1.1.0 scales_1.1.0
## [33] nnet_7.3-12 gtable_0.3.0
## [35] biocViews_1.54.0 ensembldb_2.10.2
## [37] rlang_0.4.2 zeallot_0.1.0
## [39] splines_3.6.2 lazyeval_0.2.2
## [41] acepack_1.4.1 dichromat_2.0-0
## [43] checkmate_1.9.4 BiocManager_1.30.10
## [45] yaml_2.2.0 reshape2_1.4.3
## [47] backports_1.1.5 httpuv_1.5.2
## [49] Hmisc_4.3-0 RBGL_1.62.1
## [51] tools_3.6.2 bookdown_0.17
## [53] ggplot2_3.2.1 gplots_3.0.1.2
## [55] assertive.strings_0.0-3 RColorBrewer_1.1-2
## [57] assertive.reflection_0.0-4 Rcpp_1.0.3
## [59] plyr_1.8.5 base64enc_0.1-3
## [61] progress_1.2.2 zlibbioc_1.32.0
## [63] purrr_0.3.3 RCurl_1.95-4.12
## [65] prettyunits_1.1.0 rpart_4.1-15
## [67] openssl_1.4.1 SummarizedExperiment_1.16.1
## [69] cluster_2.1.0 colorRamps_2.3
## [71] assertive.models_0.0-2 magrittr_1.5
## [73] data.table_1.12.8 assertive.data_0.0-3
## [75] futile.options_1.0.1 ProtGenerics_1.18.0
## [77] matrixStats_0.55.0 hms_0.5.3
## [79] mime_0.8 evaluate_0.14
## [81] xtable_1.8-4 XML_3.98-1.20
## [83] jpeg_0.1-8.1 gridExtra_2.3
## [85] compiler_3.6.2 biomaRt_2.42.0
## [87] tibble_2.1.3 assertive.datetimes_0.0-2
## [89] KernSmooth_2.23-16 crayon_1.3.4
## [91] htmltools_0.4.0 later_1.0.0
## [93] assertive_0.3-5 Formula_1.2-3
## [95] DBI_1.1.0 formatR_1.7
## [97] assertive.files_0.0-2 rappdirs_0.3.1
## [99] assertive.numbers_0.0-2 Matrix_1.2-18
## [101] getopt_1.20.3 assertive.types_0.0-3
## [103] assertive.matrices_0.0-2 assertive.data.uk_0.0-2
## [105] gdata_2.18.0 Gviz_1.30.0
## [107] pkgconfig_2.0.3 GenomicAlignments_1.22.1
## [109] foreign_0.8-74 assertive.data.us_0.0-2
## [111] stringdist_0.9.5.5 AnnotationForge_1.28.0
## [113] BiocCheck_1.22.0 stringr_1.4.0
## [115] VariantAnnotation_1.32.0 digest_0.6.23
## [117] graph_1.64.0 assertive.code_0.0-3
## [119] rmarkdown_2.0 htmlTable_1.13.3
## [121] curl_4.3 shiny_1.4.0
## [123] gtools_3.8.1 lifecycle_0.1.0
## [125] jsonlite_1.6 askpass_1.1
## [127] BSgenome_1.54.0 pillar_1.4.3
## [129] lattice_0.20-38 fastmap_1.0.1
## [131] httr_1.4.1 survival_3.1-8
## [133] interactiveDisplayBase_1.24.0 glue_1.3.1
## [135] png_0.1-7 BiocVersion_3.10.1
## [137] bit_1.1-14 assertive.properties_0.0-4
## [139] stringi_1.4.5 blob_1.2.0
## [141] latticeExtra_0.6-29 caTools_1.17.1.3
## [143] memoise_1.1.0 rBiopaxParser_2.26.0
## [145] dplyr_0.8.3
Birkedal, Ulf, Mikkel Christensen-Dalsgaard, Nicolai Krogh, Radhakrishnan Sabarinathan, Jan Gorodkin, and Henrik Nielsen. 2015. “Profiling of Ribose Methylations in RNA by High-Throughput Sequencing.” Angewandte Chemie (International Ed. In English) 54 (2):451–55. https://doi.org/10.1002/anie.201408362.
Carlile, Thomas M., Maria F. Rojas-Duran, Boris Zinshteyn, Hakyung Shin, Kristen M. Bartoli, and Wendy V. Gilbert. 2014. “Pseudouridine Profiling Reveals Regulated mRNA Pseudouridylation in Yeast and Human Cells.” Nature 515 (7525):143–46.
Marchand, Virginie, Lilia Ayadi, Felix G. M. Ernst, Jasmin Hertler, Valérie Bourguignon-Igel, Adeline Galvanin, Annika Kotter, Mark Helm, Denis L. J. Lafontaine, and Yuri Motorin. 2018. “AlkAniline-Seq: Profiling of m7G and m3C RNA Modifications at Single Nucleotide Resolution.” Angewandte Chemie International Edition 57 (51):16785–90. https://doi.org/10.1002/anie.201810946.
Sing, Tobias, Oliver Sander, Niko Beerenwinkel, and Thomas Lengauer. 2005. “ROCR: Visualizing Classifier Performance in R.” Bioinformatics (Oxford, England) 21 (20):3940–1. https://doi.org/10.1093/bioinformatics/bti623.