Abstract:
Information is essential to creating knowledge. We live at a time when vast amounts of information are created, but much of it is redundant. A huge amount of online information takes the form of news articles that cover similar stories, and the number of articles is projected to grow. This growth makes it difficult for a person to process all of that information and extract knowledge from it, which affects the quality of the resulting knowledge. There is a need for a solution that can organize similar information into specific themes. One such solution is clustering, a technique from machine learning (ML), which is a branch of artificial intelligence (AI). Clustering algorithms group similar information into containers (clusters). Once the information is clustered, people can be presented with the information that matches their interests, grouped together. The information in a group can even be summarized for easier processing. One of the most widely used and studied clustering algorithms is K-Means. K-Means is chosen because of its simplicity and ease of implementation. However, many variations of K-Means have been proposed, which makes it difficult to pick a variation to use for a given clustering problem. This paper presents a systematic literature review conducted with the aim of identifying applications of K-Means and other clustering algorithms, guided by the study's hypothesis. Studies that use clustering algorithms in different contexts, the techniques they used, and a summary of their outcomes are discussed. The results of the systematic literature review are presented in tabular and textual form.