Filters the output of the repbase.query function. It quantifies the number of hits for each query LTR transposon (duplicates) and retains only those Repbase hits that span at least a given fraction (scope) of the corresponding annotated sequence in Repbase.

repbase.filter(query.output, scope.value = 0.7, verbose = TRUE)

Arguments

query.output

a data.frame returned by the repbase.query function.

scope.value

a value in the interval [0,1] quantifying the minimum proportion of sequence similarity between the LTR transposon and the corresponding annotated sequence found in Repbase.

verbose

a logical value indicating whether or not additional information shall be printed to the console while executing this function.

Value

A data.frame storing the filtered output returned by repbase.query.

Author

Hajk-Georg Drost

Examples

if (FALSE) {
# PreProcess Repbase: A thaliana
# and save the output into the file "Athaliana_repbase.ref"
repbase.clean(repbase.file = "athrep.ref",
              output.file  = "Athaliana_repbase.ref")
             
# perform blastn search against A thaliana repbase annotation
AthalianaRepBaseAnnotation <- repbase.query(ltr.seqs     = "TAIR10_chr_all-ltrdigest_complete.fas", 
                                           repbase.path = "Athaliana_repbase.ref", 
                                           cores        = 1)
# filter the annotation query output
AthalianaAnnot.HighMatches <- repbase.filter(AthalianaRepBaseAnnotation,
                                             scope.value = 0.9)

# count the matched Repbase entries per family label, built from the
# 2nd and 3rd "_"-separated fields of each unique subject_id
Ath.TE.Matches.Families <- sort(table(
                           unlist(lapply(stringr::str_split(
                           names(table(AthalianaAnnot.HighMatches$subject_id)), "_"),
                           function(x) paste0(x[2:3], collapse = ".")))),
                           decreasing = TRUE)

# visualize the hits found to have a scope of at least 90%
barplot(Ath.TE.Matches.Families,
        las       = 3,
        cex.names = 0.8,
        col       = bcolor(length(Ath.TE.Matches.Families)),
        main      = "RepBase Annotation: A. thaliana")

}
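
The effect of a scope threshold can be illustrated on a small toy table, without running BLAST or having Repbase available. The following is only a sketch in base R under stated assumptions: the column names (query_id, subject_id, alig_length, s_len), the toy values, and the definition of scope as alignment length divided by the length of the annotated sequence are all hypothetical, and may differ from what repbase.query returns and from how repbase.filter computes scope internally.

```r
# Toy hit table. All column names and values are assumptions made for
# illustration; they are not the documented output of repbase.query.
hits <- data.frame(
  query_id    = c("LTR_1", "LTR_1", "LTR_2", "LTR_3"),
  subject_id  = c("A_Copia_I", "B_Gypsy_I", "A_Copia_I", "C_Gypsy_LTR"),
  alig_length = c(950, 400, 760, 300),
  s_len       = c(1000, 1000, 800, 1000),
  stringsAsFactors = FALSE
)

# Assumed definition of scope: the fraction of the annotated (subject)
# sequence that is covered by the alignment.
hits$scope <- hits$alig_length / hits$s_len

# Number of hits per query LTR transposon (the "duplicates" count).
hits$n_hits <- as.integer(table(hits$query_id)[hits$query_id])

# Retain only hits whose scope reaches the threshold.
scope.value <- 0.9
filtered <- hits[hits$scope >= scope.value, ]
print(filtered)
```

In this toy table only the first hit of LTR_1 and the single hit of LTR_2 pass a threshold of 0.9. The n_hits column still records how many hits each query had before filtering.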