-outfmt "6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore qlen slen"
1. qseqid Query Seq-id
2. sseqid Subject Seq-id
3. pident Percentage of identical matches
4. length Alignment length
5. mismatch Number of mismatches
6. gapopen Number of gap openings
7. qstart Start of alignment in query
8. qend End of alignment in query
9. sstart Start of alignment in subject
10. send End of alignment in subject
11. evalue Expect value
12. bitscore Bit score
*13. qlen Query sequence length
*14. slen Subject sequence length
(* = not part of the default outfmt 6 columns; requested explicitly in the -outfmt string above)
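Because the filters further down divide by columns 13 and 14, it may be worth checking that every row really has 14 fields before filtering. The following is only a minimal sketch: the file name demo_cols.tab and its two rows are made up for illustration, and the second row deliberately has only the 12 default columns.

```shell
# Two made-up rows: the first has all 14 fields, the second only the default 12.
printf 'q1\ts1\t98.0\t90\t2\t0\t1\t90\t1\t90\t1e-40\t160\t100\t100\n' > demo_cols.tab
printf 'q2\ts2\t60.0\t40\t16\t0\t1\t40\t1\t40\t1e-20\t50\n' >> demo_cols.tab

# Report the line number and field count of every row that is not 14 columns wide.
bad_rows=$(awk -F'\t' 'NF != 14 { print NR ": " NF " fields" }' demo_cols.tab)
echo "$bad_rows"
```

An empty result means every row has the 14 expected columns; any reported row would make $13 or $14 empty and the coverage division fail.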
Query coverage
Use the following awk command on the blast tabular output:
awk '{if ($4/$13 > 0.75 && $4/$14 > 0.75 && $3>55 && $11<0.000000000000001) print $0}' blast_out.tab
$4/$13 > 0.75 = Alignment length must be greater than 75% of the query length;
$4/$14 > 0.75 = Alignment length must be greater than 75% of the subject length;
$3>55 = Percent identity must be greater than 55%;
$11<0.000000000000001 = E-value must be less than 1e-15.
https://www.biostars.org/p/57602/
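The same filter can also be written with the thresholds passed in as awk variables, which makes them easier to change. This is a sketch under assumptions, not the command from the linked thread: the variable names cov, id and ev, the file demo_blast.tab and its three rows are all invented for the demonstration.

```shell
# Made-up 14-column BLAST table: only the first row should pass every threshold.
printf 'q1\ts1\t98.0\t90\t2\t0\t1\t90\t1\t90\t1e-40\t160\t100\t100\n' > demo_blast.tab
printf 'q2\ts2\t60.0\t40\t16\t0\t1\t40\t1\t40\t1e-20\t50\t100\t100\n' >> demo_blast.tab
printf 'q3\ts3\t99.0\t95\t1\t0\t1\t95\t1\t95\t1e-5\t170\t100\t100\n' >> demo_blast.tab

# cov = minimum alignment length / sequence length (query and subject),
# id  = minimum percent identity, ev = maximum E-value.
filtered=$(awk -F'\t' -v cov=0.75 -v id=55 -v ev=1e-15 '
    $4/$13 > cov && $4/$14 > cov && $3 > id && $11 < ev
' demo_blast.tab)
echo "$filtered"
```

Here q2 is dropped because its alignment covers only 40% of both sequences, and q3 is dropped because its E-value of 1e-5 is above the cutoff.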
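Query coverage relative to the subject
Keep the hits whose aligned subject span (send - sstart + 1) is more than 70% of the subject length, and append that fraction as an extra column: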
awk '{if ((($10-($9-1))/$14)>0.7) print $0"\t"($10-($9-1))/$14}' blast_out.tab
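For nucleotide searches, hits on the minus strand are reported with sstart greater than send, so the span computed above comes out negative and those hits are silently dropped. A minimal sketch that takes the span in either direction is shown below, assuming the table may contain such hits. The file demo_cov.tab, its rows and the variable names min, span and cov are made up for illustration.

```shell
# Made-up rows: q1 is a plus-strand hit, q2 a minus-strand hit (sstart > send),
# q3 covers only half of its subject.
printf 'q1\ts1\t98.0\t80\t2\t0\t1\t80\t11\t90\t1e-40\t150\t100\t100\n' > demo_cov.tab
printf 'q2\ts1\t97.0\t80\t2\t0\t1\t80\t90\t11\t1e-38\t145\t100\t100\n' >> demo_cov.tab
printf 'q3\ts2\t95.0\t50\t2\t0\t1\t50\t1\t50\t1e-20\t90\t100\t100\n' >> demo_cov.tab

# Compute the subject span regardless of strand, then append the coverage
# fraction as a 15th column for rows above the threshold.
covered=$(awk -F'\t' -v OFS='\t' -v min=0.7 '{
    span = ($10 >= $9) ? $10 - $9 + 1 : $9 - $10 + 1
    cov = span / $14
    if (cov > min) print $0, cov
}' demo_cov.tab)
echo "$covered"
```

With this sample, both q1 and q2 are kept with a coverage of 0.8, while q3 (0.5) is filtered out.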