Test Genomic Feature Enrichment — featureEnrich • RLSeq

Tests the enrichment of genomic features in supplied peaks. See details.

featureEnrich(
  object,
  annotype = c("primary", "full"),
  annotations = NULL,
  downsample = 10000,
  quiet = FALSE
)

Arguments

object: An RLRanges object.
annotype: The type of annotations to use. Can be one of "primary" or "full". Default: "primary". See RLHub::annotations for greater detail.
annotations: A custom annotation list of the same structure described in RLHub::annotations.
downsample: If numeric, the data will be downsampled to the requested number of peaks. This speeds up genomic shuffling and helps prevent p-value inflation. If FALSE, downsampling is not performed. Default: 10000.
quiet: If TRUE, messages will be suppressed. Default: FALSE.
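
The effect of the downsample argument can be pictured with a small base R sketch. This is only an illustration under stated assumptions: downsample_peaks is a hypothetical helper that is not part of RLSeq, the toy data frame stands in for real peaks, and uniform random sampling of rows is an assumption about how the downsampling might be done.

```r
# Hypothetical helper (not an RLSeq function): keep at most `downsample`
# randomly chosen rows, or leave the data untouched when downsample = FALSE.
downsample_peaks <- function(peaks, downsample = 10000) {
  if (is.numeric(downsample) && nrow(peaks) > downsample) {
    keep <- sample(nrow(peaks), downsample)
    peaks <- peaks[keep, , drop = FALSE]
  }
  peaks
}

# Toy peak table (simulated coordinates, not real data)
set.seed(1)
toy_peaks <- data.frame(
  chrom = "chr1",
  start = seq_len(25000) * 100,
  end   = seq_len(25000) * 100 + 50
)

n_default <- nrow(downsample_peaks(toy_peaks))                     # default cap
n_all     <- nrow(downsample_peaks(toy_peaks, downsample = FALSE)) # no downsampling
n_small   <- nrow(downsample_peaks(toy_peaks[1:500, ]))            # below the cap
```

Peak sets smaller than the requested size pass through unchanged in this sketch, which is why the small example dataset used further down would not be affected by the default setting.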

Value

An RLRanges object containing the results of the enrichment test, which can be accessed via rlresult(object, "featureEnrichment"). The results are in tbl format. For a full description of all columns in the output table, see RLHub::feat_enrich_samples.

Details

Method

Annotations relevant to R-loops were curated as part of the RLBase-data workflow and are provided via RLHub::annotations.

In featureEnrich, each annotation "type" (e.g., "Exons", "Introns", etc.) is compared to the supplied RLRanges, yielding enrichment statistics with the following procedure:

1. For each annotation type, the peaks are overlapped with the annotations.
2. valr::bed_reldist is used to find the relative distance distribution between the peaks and the annotations, both for the supplied RLRanges and for shuffled RLRanges (generated via valr::bed_shuffle). The significance of the relative distance is calculated via stats::ks.test.
3. Fisher’s exact test is performed via valr::bed_fisher to obtain the significance of the overlap and the odds ratio.
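
The statistics behind the procedure above can be sketched in base R. This is a toy illustration, not the RLSeq or valr implementation: rel_dist is a hypothetical helper that mimics the idea of a relative distance (distance from a peak midpoint to its nearest flanking feature, divided by the gap between the two flanking features), all positions are simulated, and the counts in the 2x2 table are made up for the example.

```r
# Hypothetical helper: relative distance of each peak midpoint to the two
# features flanking it. Values fall between 0 (on a feature) and 0.5 (midway).
rel_dist <- function(x_mid, y_mid) {
  y_mid <- sort(y_mid)
  idx <- findInterval(x_mid, y_mid)        # index of the flanking feature on the left
  keep <- idx >= 1 & idx < length(y_mid)   # require a feature on both sides
  left <- y_mid[idx[keep]]
  right <- y_mid[idx[keep] + 1]
  pmin(x_mid[keep] - left, right - x_mid[keep]) / (right - left)
}

set.seed(1)
features <- seq(1000, 100000, by = 1000)   # simulated feature midpoints

# Simulated "observed" peaks sit close to features; "shuffled" peaks are random
peaks_observed <- sample(features, 200, replace = TRUE) + rnorm(200, sd = 50)
peaks_shuffled <- runif(200, min(features), max(features))

rd_observed <- rel_dist(peaks_observed, features)
rd_shuffled <- rel_dist(peaks_shuffled, features)

# Two-sample Kolmogorov-Smirnov test comparing the two distributions
ks_res <- stats::ks.test(rd_observed, rd_shuffled)

# Fisher's exact test on a 2x2 table of interval counts (made-up numbers):
# rows = in peaks / not in peaks, columns = in annotation / not in annotation
overlap_tbl <- matrix(
  c(40,  60,
    160, 9740),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    peaks = c("in", "out"),
    annotation = c("in", "out")
  )
)
fisher_res <- stats::fisher.test(overlap_tbl)

ks_res$p.value       # significance of the relative distance shift
fisher_res$estimate  # odds ratio of the overlap
fisher_res$p.value   # significance of the overlap
```

In this sketch, a relative distance distribution skewed toward 0 compared with the shuffled peaks, together with an odds ratio above 1, would point to enrichment of peaks at the annotation.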

Examples


# Example RLRanges dataset
rlr <- readRDS(system.file("extdata", "rlrsmall.rds", package = "RLSeq"))

# Feature enrichment test
featureEnrich(rlr)
#> see ?RLHub and browseVignettes('RLHub') for documentation
#>  - Calculating enrichment...
#> Warning: Not enough observations for interval tests...
#> (the warning above is repeated 49 more times)
#>  - Done
#> GRanges object with 7 ranges and 6 metadata columns:
#>                   seqnames            ranges strand |                     V4
#>                      <Rle>         <IRanges>  <Rle> |            <character>
#>   1                  chr14 20343155-20343313      * | /home/UTHSCSA/miller..
#>   2                  chr21   8214588-8214866      * | /home/UTHSCSA/miller..
#>   3                  chr21   8396653-8396835      * | /home/UTHSCSA/miller..
#>   4 chr22_KI270733v1_ran..     129919-130080      * | /home/UTHSCSA/miller..
#>   5 chr22_KI270733v1_ran..     174979-175184      * | /home/UTHSCSA/miller..
#>   6                   chr6 52995648-52995831      * | /home/UTHSCSA/miller..
#>   7                   chr9 35657778-35657976      * | /home/UTHSCSA/miller..
#>            V5          V6        V7        V8      qval
#>     <integer> <character> <numeric> <numeric> <numeric>
#>   1        56           .   7.40428   12.1836   5.63076
#>   2        56           .   7.27639   12.2524   5.66372
#>   3        15           .   5.19728    7.7825   1.52856
#>   4       256           .  16.31970   32.6965  25.65290
#>   5      1085           .  30.29480  116.1250 108.58400
#>   6        95           .   9.63809   16.3088   9.58453
#>   7       100           .   9.84017   16.7326  10.04910
#>   -------
#>   seqinfo: 640 sequences (1 circular) from hg38 genome
#> 
#> RDIP-Seq +RNH1: 
#>   Mode: RDIP 
#>   Genome: hg38 
#>   Label: NEG
#> 
#> RLSeq Results Available: 
#>   featureEnrichment, txFeatureOverlap, correlationMat, rlfsRes, noiseAnalysis, geneAnnoRes, predictRes, rlRegionRes 
#> 
#> prediction: NEG
#> 

# With custom annotations
small_anno <- list(
    "Centromeres" = readr::read_csv(
        system.file("extdata", "Centromeres.csv.gz", package = "RLSeq"),
        show_col_types = FALSE
    )
)
featureEnrich(rlr, annotations = small_anno)
#>  - Calculating enrichment...
#>  - Done
#> GRanges object with 7 ranges and 6 metadata columns:
#>                   seqnames            ranges strand |                     V4
#>                      <Rle>         <IRanges>  <Rle> |            <character>
#>   1                  chr14 20343155-20343313      * | /home/UTHSCSA/miller..
#>   2                  chr21   8214588-8214866      * | /home/UTHSCSA/miller..
#>   3                  chr21   8396653-8396835      * | /home/UTHSCSA/miller..
#>   4 chr22_KI270733v1_ran..     129919-130080      * | /home/UTHSCSA/miller..
#>   5 chr22_KI270733v1_ran..     174979-175184      * | /home/UTHSCSA/miller..
#>   6                   chr6 52995648-52995831      * | /home/UTHSCSA/miller..
#>   7                   chr9 35657778-35657976      * | /home/UTHSCSA/miller..
#>            V5          V6        V7        V8      qval
#>     <integer> <character> <numeric> <numeric> <numeric>
#>   1        56           .   7.40428   12.1836   5.63076
#>   2        56           .   7.27639   12.2524   5.66372
#>   3        15           .   5.19728    7.7825   1.52856
#>   4       256           .  16.31970   32.6965  25.65290
#>   5      1085           .  30.29480  116.1250 108.58400
#>   6        95           .   9.63809   16.3088   9.58453
#>   7       100           .   9.84017   16.7326  10.04910
#>   -------
#>   seqinfo: 640 sequences (1 circular) from hg38 genome
#> 
#> RDIP-Seq +RNH1: 
#>   Mode: RDIP 
#>   Genome: hg38 
#>   Label: NEG
#> 
#> RLSeq Results Available: 
#>   txFeatureOverlap, correlationMat, rlfsRes, noiseAnalysis, geneAnnoRes, predictRes, rlRegionRes 
#> 
#> prediction: NEG
#>