R/read.blast6out.R
read.blast6out.Rd
Read a file in blast6out format generated by either USEARCH or VSEARCH.
read.blast6out(blast6out.file)
blast6out.file | path to the blast6out file
---|---
A dataframe storing the following columns:
query:
query id.
subject:
subject id.
perc_ident:
percent identity between query and subject.
align_len:
alignment length between query and subject.
n_mismatch:
number of mismatches between query and subject.
n_gap_open:
number of gap openings between query and subject.
start_q:
start position in query. Query coordinates start with 1 at the
first base in the sequence as it appears in the input file. For translated searches (nucleotide queries, protein targets), query start < end for +ve frame and start > end for -ve frame.
end_q:
end position in query.
start_s:
start position in subject. Subject coordinates start with 1 at
the first base in the sequence as it appears in the database. For untranslated
nucleotide searches, subject start < end for plus strand, start > end for a reverse-complement alignment.
end_s:
end position in subject.
evalue:
E-value calculated using Karlin-Altschul statistics.
bit_score:
bit score calculated using Karlin-Altschul statistics.
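The column layout above can be illustrated with a minimal base R sketch. It is not the package's implementation: `read_blast6out_sketch()` is a hypothetical helper, the input is assumed to be a headerless, tab-separated file with exactly these twelve fields, and the single record written to the temporary file is made up for illustration.

```r
# Column names in the order listed above
blast6out_cols <- c("query", "subject", "perc_ident", "align_len",
                    "n_mismatch", "n_gap_open", "start_q", "end_q",
                    "start_s", "end_s", "evalue", "bit_score")

# Hypothetical helper (not part of LTRpred): read a headerless,
# tab-separated file and attach the twelve column names
read_blast6out_sketch <- function(path) {
  utils::read.delim(path, header = FALSE, col.names = blast6out_cols,
                    stringsAsFactors = FALSE, quote = "")
}

# Write one made-up record to a temporary file, then read it back
tmp <- tempfile(fileext = ".blast6out")
record <- c("q1", "s1", "99.6", "100", "1", "0",
            "1", "100", "5", "104", "1e-50", "185")
writeLines(paste(record, collapse = "\t"), tmp)

hits <- read_blast6out_sketch(tmp)
str(hits)
```

Unlike `read.blast6out()`, which returns a tibble in the example on this page, this sketch returns a plain data.frame.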
Hajk-Georg Drost
# read example *.blast6out file
test.blast6out <- read.blast6out(system.file("test.blast6out", package = "LTRpred"))
# look at the format in R
head(test.blast6out)
#> # A tibble: 6 × 12
#>   query subject perc_ident align_len n_mismatch n_gap_open start_q end_q start_s
#>   <chr> <chr>        <dbl>     <int>      <int>      <int>   <int> <int>   <int>
#> 1 2_CH… mitoch…       99.6     18868         22         19   18831     1       1
#> 2 mito… 2_CHRO…       99.3     18390         29         35       1 18354       1
#> 3 3_CH… 1_CHRO…       91.3      9070        317         20       1  8652       1
#> 4 4_CH… 3_CHRO…       92.2      8301        429         19       1  8152       1
#> 5 3_CH… 3_CHRO…       92        8337        292         15       1  8080       1
#> 6 4_CH… 3_CHRO…       93        8292        284         12       1  8053       1
#> # … with 3 more variables: end_s <int>, evalue <dbl>, bit_score <dbl>
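
The subject-coordinate convention described for start_s and end_s can be used to label the strand of each hit. The following is a sketch only. It assumes an untranslated nucleotide search, the data frame is made up for illustration, and the `strand` and `span_s` columns are hypothetical additions that `read.blast6out()` does not return.

```r
# Made-up hits using the column names documented on this page
hits <- data.frame(
  query   = c("q1", "q2"),
  subject = c("s1", "s1"),
  start_s = c(1L, 900L),
  end_s   = c(850L, 120L),
  stringsAsFactors = FALSE
)

# start_s < end_s: plus strand; start_s > end_s: reverse-complement alignment
hits$strand <- ifelse(hits$start_s <= hits$end_s, "plus", "minus")

# Number of subject positions covered, independent of orientation
hits$span_s <- abs(hits$end_s - hits$start_s) + 1L

print(hits)
```

For translated searches the same comparison on start_q and end_q would indicate the sign of the frame rather than the strand.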