R/quality.filter.meta.R
quality.filter.meta.Rd
This function takes the file paths to the genomes folder and LTRpred.meta
output folder as input and eliminates false positive retrotransposon predictions on a metagenomic scale.
quality.filter.meta( kingdom, genome.folder, ltrpred.meta.folder, sim, cut.range = 2, n.orfs, strategy, update = FALSE )
kingdom | a character string specifying the kingdom of life to which genomes annotated with |
---|---|
genome.folder | path to folder storing the genome assembly files that were used for |
ltrpred.meta.folder | path to folder storing the |
sim | LTR similarity threshold. Only putative LTR transposons that fulfill this LTR similarity threshold will be retained. |
cut.range | a numeric value indicating the interval size for binning LTR similarities. |
n.orfs | minimum number of ORFs detected in the putative LTR transposon. |
strategy | quality filter strategy. Options are
|
update | shall already existing |
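
The interplay of sim and cut.range can be illustrated with a small base R sketch. It is only an assumption about how the binning might look, not the package's actual implementation: the similarity values are made up, sim is assumed to be given in percent, and the bins are assumed to run from sim up to 100.

```r
# Hypothetical LTR similarity values (in percent) for six predictions.
ltr_similarity <- c(80.5, 83.2, 91.0, 97.4, 99.9, 100)

sim <- 80        # similarity threshold (assumed to be in percent)
cut.range <- 2   # interval size for binning

# Bin boundaries from `sim` up to 100 in steps of `cut.range`.
# Note: this only ends exactly at 100 if (100 - sim) is a multiple of cut.range.
breaks <- seq(sim, 100, by = cut.range)

# Assign each value to its interval; include.lowest keeps values equal to `sim`.
sim_bins <- cut(ltr_similarity, breaks = breaks,
                include.lowest = TRUE, right = TRUE)

# Count predictions per similarity bin.
bin_counts <- table(sim_bins)
print(bin_counts)
```

With cut.range = 2 and sim = 80 this yields ten intervals of width 2; a smaller cut.range gives a finer resolution of the similarity distribution.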
A list with two list elements, sim_file and gm_file. Each list element stores a data.frame:
sim_file (similarity file)
gm_file (genome metrics file)
Quality Control
ltr.similarity: minimum similarity between LTRs. TEs that do not meet this criterion are discarded.
n.orfs: minimum number of Open Reading Frames that must be found between the LTRs. TEs that do not meet this criterion are discarded.
PBS or Protein Match: elements must have either a predicted Primer Binding Site or a match to at least one protein (Gag, Pol, Rve, ...) between their LTRs. TEs that do not meet this criterion are discarded.
The relative number of N's (= nucleotide not known) in the TE must be <= 0.1. The relative number of N's is computed as follows: absolute number of N's in TE / width of TE.
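
The four quality control criteria above can be sketched in base R on a toy table. This is a minimal illustration under stated assumptions, not the function's internal code: all column names (ltr_similarity, n_orfs, has_pbs, n_protein_hits, n_count, width) and all values are hypothetical, and the similarity threshold is assumed to be expressed in percent.

```r
# Toy table of predicted elements; all column names and values are hypothetical.
pred <- data.frame(
  id             = c("te1", "te2", "te3", "te4", "te5"),
  ltr_similarity = c(99.1, 85.0, 98.0, 97.5, 99.0),
  n_orfs         = c(2, 1, 0, 1, 3),
  has_pbs        = c(TRUE, FALSE, TRUE, FALSE, TRUE),
  n_protein_hits = c(0, 2, 1, 0, 1),
  n_count        = c(10, 0, 5, 0, 900),   # absolute number of N's in the element
  width          = c(5000, 6000, 4000, 7000, 6000),
  stringsAsFactors = FALSE
)

sim <- 95      # minimum LTR similarity (assumed to be in percent)
n.orfs <- 1    # minimum number of ORFs between the LTRs

# Relative number of N's = absolute number of N's / width of the element.
pred$n_prop <- pred$n_count / pred$width

# An element is kept only if it passes all four criteria.
keep <- pred$ltr_similarity >= sim &
  pred$n_orfs >= n.orfs &
  (pred$has_pbs | pred$n_protein_hits >= 1) &
  pred$n_prop <= 0.1

filtered <- pred[keep, ]
print(filtered$id)
```

In this toy data only te1 passes: te2 fails the similarity threshold, te3 has no ORF, te4 has neither a PBS nor a protein match, and te5 exceeds the allowed proportion of N's (900 / 6000 = 0.15).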
Hajk-Georg Drost