vignettes/RLHub.Rmd
RLHub (“R-Loop Hub”) provides processed data sets for the RLSuite toolchain. It is an ExperimentHub package containing annotations of R-loop consensus regions, genomic features directly relevant to R-loops (such as R-loop-forming sequences (RLFS) and G-or-C skew regions), and other data of relevance to RLSuite.
All data were generated via the protocol in the RLBase-data repository.
RLHub can be installed from Bioconductor via the following command:
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
BiocManager::install("RLHub")
RLHub may also be installed from GitHub:
remotes::install_github("Bishop-Laboratory/RLHub")
Data can be conveniently accessed through ExperimentHub functions or with the built-in accessors available through RLHub.
A summary of the data can also be found by running the following:
?`RLHub-package`
The full manifest of the available data is found here:
DT::datatable(
  read.csv(system.file("extdata", "metadata.csv", package = "RLHub")),
  options = list(
    scrollX = TRUE,
    pageLength = 5
  )
)
The Tags column lists the function names used to access each data set. This method of access is detailed below.
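To see the accessor names without the interactive table, the manifest can be subset directly. This is a minimal sketch, assuming that `metadata.csv` carries the standard ExperimentHub `Title` column alongside `Tags`:

```r
# Read the manifest shipped with the installed RLHub package
meta <- read.csv(system.file("extdata", "metadata.csv", package = "RLHub"))

# Show each data set's title next to the tag that names its accessor
# (assumption: the manifest has "Title" and "Tags" columns)
meta[, c("Title", "Tags")]
```

Printing a plain data frame like this avoids the DT dependency when only the tag names are needed.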
Data can be accessed with the built-in accessor functions. The name of each accessor is the value in the Tags column for that data set in the metadata.csv table. For example, “rlbps” is the tag corresponding to entry #5: “R-loop-binding proteins discovered from mass-spec studies.” The function to access these data is therefore RLHub::rlbps().
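The accessor call described above can be sketched as follows. `head()` is used only to preview the result; no particular column layout is assumed, and the note about caching assumes the accessor retrieves its data through ExperimentHub.

```r
# Retrieve the R-loop-binding proteins data set via its accessor.
# Assumption: the accessor fetches through ExperimentHub, so the first
# call downloads the resource and later calls reuse the local cache.
rlbps <- RLHub::rlbps()

# Preview the first few rows
head(rlbps)
```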
For examples of all accessors, please run the following:
?`RLHub-package`
library(ExperimentHub)
In this example, we show how to access RLHub data using the ExperimentHub object.
eh <- ExperimentHub()
rlhub <- query(eh, "RLHub")
rlhub
## ExperimentHub with 16 records
## # snapshotDate(): 2021-10-13
## # $dataprovider: Multiple
## # $species: Homo sapiens, Mus musculus
## # $rdataclass: tbl, list, SummarizedExperiment, preProcess, caretStack
## # additional mcols(): taxonomyid, genome, description,
## # coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## # rdatapath, sourceurl, sourcetype
## # retrieve records with, e.g., 'object[["EH6793"]]'
##
## title
## EH6793 | Primary Genomic Annotations (hg38)
## EH6794 | Primary Genomic Annotations (mm10)
## EH6795 | Full Genomic Annotations (hg38)
## EH6796 | Full Genomic Annotations (mm10)
## EH6797 | R-loop Binding Proteins
## ... ...
## EH6804 | RLFS-Test Results
## EH6805 | RLRegion Annotations
## EH6806 | RLRegion Metadata
## EH6807 | RLRegion Read Counts
## EH6808 | RLBase Sample Manifest
To obtain the R-loop-binding proteins, for example, we can use the corresponding ExperimentHub ID.
rlbps <- rlhub[["EH6797"]]
DT::datatable(rlbps)
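Because ExperimentHub IDs are easy to mistype, it can help to look a record up by its title instead. The following is a sketch under the assumption that the title "R-loop Binding Proteins" shown in the listing above is still present in the current hub snapshot; the variable names `ids` and `rlbps_id` are illustrative.

```r
library(ExperimentHub)

eh <- ExperimentHub()
rlhub <- query(eh, "RLHub")

# Named character vector mapping record titles to ExperimentHub IDs
ids <- setNames(names(rlhub), rlhub$title)

# Look up the ID by title, then fetch the resource with that ID
# (assumption: this title exists in the current snapshot)
rlbps_id <- ids[["R-loop Binding Proteins"]]
rlbps <- rlhub[[rlbps_id]]
```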
Finally, all package resources may be loaded as a list using loadResources().
rlhublst <- loadResources(rlhub, package = "RLHub")
names(rlhublst) <- listResources(rlhub, package = "RLHub")
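Loading every resource downloads all of the records, which may be more than is needed. As a sketch, the `filterBy` argument of `loadResources()` and `listResources()` can restrict the set; this assumes that `filterBy` is matched against resource titles, so that "hg38" selects the hg38 annotation records. The name `hg38_lst` is illustrative.

```r
library(ExperimentHub)

eh <- ExperimentHub()
rlhub <- query(eh, "RLHub")

# Load only the resources matching "hg38" rather than the full set
# (assumption: filterBy is matched against resource titles)
hg38_lst <- loadResources(rlhub, package = "RLHub", filterBy = "hg38")
names(hg38_lst) <- listResources(rlhub, package = "RLHub", filterBy = "hg38")

# Report the primary class of each loaded object
vapply(hg38_lst, function(x) class(x)[1], character(1))
```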
## R version 4.1.1 (2021-08-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.3 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] ExperimentHub_2.1.4 AnnotationHub_3.1.5 BiocFileCache_2.1.1
## [4] dbplyr_2.1.1 BiocGenerics_0.39.2 RLHub_0.99.5
## [7] BiocStyle_2.21.3
##
## loaded via a namespace (and not attached):
## [1] Biobase_2.53.0 httr_1.4.2
## [3] sass_0.4.0 bit64_4.0.5
## [5] jsonlite_1.7.2 bslib_0.3.0
## [7] shiny_1.7.0 assertthat_0.2.1
## [9] interactiveDisplayBase_1.31.2 BiocManager_1.30.16
## [11] stats4_4.1.1 blob_1.2.2
## [13] GenomeInfoDbData_1.2.7 yaml_2.2.1
## [15] BiocVersion_3.14.0 pillar_1.6.2
## [17] RSQLite_2.2.8 glue_1.4.2
## [19] digest_0.6.28 promises_1.2.0.1
## [21] XVector_0.33.0 htmltools_0.5.2
## [23] httpuv_1.6.3 pkgconfig_2.0.3
## [25] bookdown_0.24 zlibbioc_1.39.0
## [27] purrr_0.3.4 xtable_1.8-4
## [29] later_1.3.0 tibble_3.1.4
## [31] KEGGREST_1.33.0 generics_0.1.0
## [33] IRanges_2.27.2 ellipsis_0.3.2
## [35] DT_0.19 cachem_1.0.6
## [37] withr_2.4.2 magrittr_2.0.1
## [39] crayon_1.4.1 mime_0.12
## [41] memoise_2.0.0 evaluate_0.14
## [43] fs_1.5.0 fansi_0.5.0
## [45] textshaping_0.3.5 tools_4.1.1
## [47] lifecycle_1.0.1 stringr_1.4.0
## [49] S4Vectors_0.31.3 AnnotationDbi_1.55.1
## [51] Biostrings_2.61.2 compiler_4.1.1
## [53] pkgdown_1.6.1 jquerylib_0.1.4
## [55] GenomeInfoDb_1.29.8 systemfonts_1.0.2
## [57] rlang_0.4.11 RCurl_1.98-1.5
## [59] rappdirs_0.3.3 htmlwidgets_1.5.4
## [61] crosstalk_1.1.1 bitops_1.0-7
## [63] rmarkdown_2.11 DBI_1.1.1
## [65] curl_4.3.2 R6_2.5.1
## [67] knitr_1.34 dplyr_1.0.7
## [69] fastmap_1.1.0 bit_4.0.4
## [71] utf8_1.2.2 filelock_1.0.2
## [73] rprojroot_2.0.2 ragg_1.1.3
## [75] desc_1.3.0 stringi_1.7.4
## [77] Rcpp_1.0.7 vctrs_0.3.8
## [79] png_0.1-7 tidyselect_1.1.1
## [81] xfun_0.26