Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences (2006)

by W Li, A Godzik
Venue: Bioinformatics