Pipeline to eliminate false positive predictions of retrotransposons

This function takes an LTRpred output table as input and eliminates false positive predictions.

quality.filter(pred, sim, n.orfs, strategy = "default")

Arguments

pred	`LTRpred.tbl` generated with `LTRpred`
sim	LTR similarity threshold in percent (e.g. `sim = 70`). Only putative LTR transposons whose LTR similarity meets this threshold are retained.
n.orfs	minimum number of ORFs that must be detected in the putative LTR transposon.
strategy	quality filter strategy. Options are:
`strategy = "default"`: see section `Quality Control`
`strategy = "stringent"`: in addition to the filter criteria specified in section `Quality Control`, the filter criterion `(!is.na(protein_domain)) | (dfam_target_name != "unknown")` is applied

Value

A quality-filtered `LTRpred.tbl`.

Details

Quality Control

ltr.similarity: minimum similarity between LTRs (set via `sim`). All TEs not matching this criterion are discarded.
n.orfs: minimum number of Open Reading Frames that must be found between the LTRs. All TEs not matching this criterion are discarded.
PBS or Protein Match: elements must have either a predicted Primer Binding Site or a match to at least one protein (Gag, Pol, Rve, ...) between their LTRs. All TEs not matching this criterion are discarded.
Relative number of N's (= unknown nucleotides) in the TE must be <= 0.1. The relative number of N's is computed as: absolute number of N's in TE / width of TE. All TEs exceeding this value are discarded.
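
The four criteria above, plus the extra `"stringent"` condition, can be sketched in base R on a toy table. This is only an illustration of the filter logic as described here, not the package's actual implementation: the helper `sketch_filter()` and the column names `ltr_similarity`, `orfs`, `PBS_start`, `TE_N_abs` and `width` are assumptions made for the example (only `protein_domain` and `dfam_target_name` appear in this documentation), and the toy values are invented.

```r
# Toy prediction table. All values are made up; column names other than
# protein_domain and dfam_target_name are hypothetical stand-ins.
toy <- data.frame(
  id               = c("TE1", "TE2", "TE3", "TE4", "TE5", "TE6"),
  ltr_similarity   = c(95, 65, 88, 92, 99, 90),     # percent
  orfs             = c(2, 3, 0, 1, 1, 1),
  PBS_start        = c(120, 80, NA, NA, NA, 50),    # NA = no PBS predicted
  protein_domain   = c("RVT_1", NA, "rve", NA, "rve", NA),
  dfam_target_name = c("Copia-1", "unknown", "unknown",
                       "Gypsy-2", "unknown", "unknown"),
  TE_N_abs         = c(10, 0, 5, 0, 900, 0),        # absolute number of N's
  width            = c(5000, 4000, 6000, 4500, 5000, 3000),
  stringsAsFactors = FALSE
)

# Hypothetical helper mirroring the documented criteria.
sketch_filter <- function(tbl, sim, n.orfs, strategy = "default") {
  keep <- tbl$ltr_similarity >= sim &                        # LTR similarity
    tbl$orfs >= n.orfs &                                     # minimum ORFs
    (!is.na(tbl$PBS_start) | !is.na(tbl$protein_domain)) &   # PBS or protein match
    (tbl$TE_N_abs / tbl$width) <= 0.1                        # relative number of N's
  if (strategy == "stringent") {
    # additional criterion applied only by the stringent strategy
    keep <- keep &
      ((!is.na(tbl$protein_domain)) | (tbl$dfam_target_name != "unknown"))
  }
  tbl[keep, , drop = FALSE]
}

res_default   <- sketch_filter(toy, sim = 70, n.orfs = 1)
res_stringent <- sketch_filter(toy, sim = 70, n.orfs = 1, strategy = "stringent")
print(res_default$id)    # "TE1" "TE6"
print(res_stringent$id)  # "TE1"
```

In this toy data, TE2 fails the similarity threshold, TE3 has no ORF, TE4 has neither a PBS nor a protein match, and TE5 has 900 / 5000 = 0.18 relative N's. TE6 passes the default filter through its PBS alone but is dropped by the stringent strategy because it has no protein domain and its Dfam target is `"unknown"`.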

Author

Hajk-Georg Drost

Examples

# example prediction file generated by LTRpred 
pred.file <- system.file("Athaliana_TAIR10_chr_all_LTRpred_DataSheet.csv", package = "LTRpred")
# read LTRpred generated prediction file (data sheet)
pred <- read.ltrpred(pred.file)
# apply quality filter
pred <- quality.filter(pred, sim = 70, n.orfs = 1)
#> The LTRpred prediction table has been filtered (default) to remove potential false positives. Predicted LTRs must have an PBS or Protein Domain and must fulfill thresholds: sim = 70%; #orfs = 1. Furthermore, TEs having more than 10% of N's in their sequence have also been removed.
#> Input #TEs: 458
#> Output #TEs: 202
