Models for predicting a sample's label ("POS": robustly maps R-loops; "NEG": poorly maps R-loops) from R-loop-forming sequences (RLFS) analysis (RLSeq::analyzeRLFS()). These models are used by RLSeq::predictCondition().

fft_model(quiet = FALSE)

prep_features(quiet = FALSE)

## Arguments

quiet If TRUE, messages are suppressed. Default: FALSE.

## Value

A model object: a caret preProcess object for prep_features(), or a caretEnsemble caretStack object for fft_model().

## Details

### Source

The models were developed as part of a semi-automated online learning scheme found in the RLBase-data protocol here. Briefly, R-loop-forming sequences (RLFS) analysis was performed using RLSeq::analyzeRLFS() for every sample peakset in RLBase (see rlfs_res for full results). The samples were then manually inspected, and any that starkly differed from their label were removed. Of 693 possible samples, 135 were excluded because of a mismatch with their label. The remaining steps were performed automatically.

• First, the non-discarded samples were partitioned 50:25:25 (train:test:discovery). Feature transformation was performed on the full data set using the "YeoJohnson" transform along with typical standardization via caret::preProcess().

• Then, feature selection was performed in the discovery set using Boruta::Boruta().

• Then, a stacked ensemble model was trained on the training set:

• The ensemble model is a Random Forest and the 5 base models in the stack are:

• Linear discriminant analysis

• Recursive partitioning

• Generalized linear model (logit)

• K-nearest neighbors

• Repeated 10-fold cross-validation (5 repeats) was used during training.

• Finally, the model was evaluated on the test set, where it achieved an accuracy of 0.9043. For more details, see the HTML report here.
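The steps above can be sketched on synthetic data as follows. This is a minimal illustration, not the RLBase-data protocol itself: the feature names (f1, f2, f3), the sample size, and the seed are made up, and the caret method codes "lda", "rpart", "glm" and "knn" are assumed stand-ins for the base learners listed above. It requires the caret, caretEnsemble and Boruta packages plus the model packages they call (MASS, rpart, randomForest).

```r
# Hypothetical sketch with synthetic data; not the actual RLBase-data code.
library(caret)
library(caretEnsemble)
library(Boruta)

set.seed(42)

# Synthetic stand-in for RLFS-derived features (names are made up)
n <- 400
label <- factor(rep(c("POS", "NEG"), each = n / 2), levels = c("POS", "NEG"))
shift <- ifelse(label == "POS", 1.5, 0)
feats <- data.frame(
  f1 = rexp(n) + shift,   # informative, skewed
  f2 = rnorm(n) + shift,  # informative
  f3 = rnorm(n)           # pure noise
)

# Transform the full data set: Yeo-Johnson, then center and scale
pp <- preProcess(feats, method = c("YeoJohnson", "center", "scale"))
feats_t <- predict(pp, feats)

# 50:25:25 split into train, test and discovery sets (stratified by label)
idx_train <- createDataPartition(label, p = 0.5, list = FALSE)[, 1]
rest <- setdiff(seq_len(n), idx_train)
idx_test <- rest[createDataPartition(label[rest], p = 0.5, list = FALSE)[, 1]]
idx_disc <- setdiff(rest, idx_test)

# Feature selection on the discovery set only
bor <- Boruta(x = feats_t[idx_disc, ], y = label[idx_disc])
selected <- getSelectedAttributes(bor, withTentative = FALSE)
if (length(selected) == 0) selected <- names(feats_t)  # fallback for toy data

# Repeated 10-fold cross-validation with 5 repeats
ctrl <- trainControl(
  method = "repeatedcv", number = 10, repeats = 5,
  savePredictions = "final", classProbs = TRUE
)

# Base models; in caret, "lda" is linear discriminant analysis (MASS)
base_models <- caretList(
  x = feats_t[idx_train, selected, drop = FALSE],
  y = label[idx_train],
  trControl = ctrl,
  methodList = c("lda", "rpart", "glm", "knn")
)

# Random Forest meta-model stacked on the base model predictions
stack <- caretStack(base_models, method = "rf", trControl = ctrl)
print(stack)
```

Evaluation on the held-out rows in idx_test would follow with predict() and caret::confusionMatrix(). That step is left out here because the return type of predict() on a caretStack object can differ between caretEnsemble versions.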

### Structure

• prep_features()

• A feature-transform model which prepares the data for classification.

• It is an object of class preProcess from the caret::preProcess() function call.

• fft_model()

• A binary classifier which returns "POS" or "NEG".

• It is an object of class caretStack from the caretEnsemble::caretStack() function call.

### Usage

These models are used internally by RLSeq::predictCondition().

## Examples

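# Retrieve the stacked classifier (class caretStack)
fftModel <- fft_model()

# Retrieve the feature-transform model (class preProcess)
pfModel <- prep_features()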