## Improving Quality of Search Results Clustering with Approximate Matrix Factorisations

### Abstract

In this paper we show how approximate matrix factorisations can be used to organise document summaries returned by a search engine into meaningful thematic categories. We compare four different factorisations (SVD, NMF, LNMF and K-Means/Concept Decomposition) with respect to topic separation capability, outlier detection and label quality. We also compare our approach with two other clustering algorithms: Suffix Tree Clustering (STC) and Tolerance Rough Set Clustering (TRC). For our experiments we use the standard merge-then-cluster approach based on the Open Directory Project web catalogue as a source of human-clustered document summaries.

