Enhanced Multi-View Point Non-Negative Matrix Factorization Clustering for Clinical Documents Analysis
Divya Sharma1, Surya Prasath V. B2,3,4, Vijayarajan V5, Mohan Kubendiran6 and Padmapriya R7

1,5,6,7SCOPE (School of Computing Science and Engineering), VIT University, Vellore, India.

2Computational Imaging and Vis Analysis (CIVA) Lab, Department of Computer Science, University of Missouri-Columbia, Columbia MO 65211 USA.

3Biomedical Informatics, Cincinnati Children's Hospital Medical Center (CCHMC), Cincinnati, OH 45229 USA.

4Department of Biomedical Informatics, College of Medicine, University of Cincinnati, OH USA.

Corresponding Author E-mail: vijayarajan.v@vit.ac.in

Abstract: Clustering of clinical documents is a major research area in machine learning and artificial intelligence. It aims to impose some form of structure on the information so that relevant examples and patterns stand out. The rich corpus of clinical notes contains a large amount of unprocessed data that needs to be mined with appropriate techniques to improve and augment the existing healthcare system. Biomedical information mining is a general research strategy that aims to retrieve, break down, and analyze clinical information from a collection of medical and/or medicinal records. This paper presents a novel approach that uses Non-Negative Matrix Factorization (NMF) clustering to mine medication names based on the age of the patients. Pharmaceutical data in clinical notes is typically expressed through prescription names and other medication information, which needs to be mined based on the similarity between documents. Although clustering is a highly effective solution, it is not yet deployed in major search engines. The basic difficulty is to determine cluster values quickly and accurately even after the complexity of the technique has been reduced. This paper presents an enhanced multi-viewpoint similarity measure that uses many distinct viewpoints to measure the similarity between documents, so that similarity can be extracted more accurately.

Keywords: Clinical Documents; Clustering Algorithms; Cluster Analysis; Heuristic; Text Mining; Semantics
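
The idea of measuring similarity from many viewpoints, as mentioned in the abstract, can be illustrated with a minimal sketch. It assumes the commonly used formulation in which the similarity of two documents d_i and d_j is the average of (d_i - d_h) . (d_j - d_h) over a set of viewpoint documents d_h. The enhanced measure of this paper is not specified in the abstract, so the function names, the toy term-weight vectors, and the choice of viewpoints below are all hypothetical and serve only as an illustration.

```python
from math import sqrt


def normalize(vec):
    """Scale a term-weight vector to unit Euclidean length."""
    norm = sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def multi_viewpoint_similarity(d_i, d_j, viewpoints):
    """Average of (d_i - d_h) . (d_j - d_h) over every viewpoint d_h.

    Assumed formulation, not necessarily the enhanced measure of the paper.
    """
    if not viewpoints:
        raise ValueError("at least one viewpoint document is required")
    total = 0.0
    for d_h in viewpoints:
        diff_i = [x - h for x, h in zip(d_i, d_h)]
        diff_j = [x - h for x, h in zip(d_j, d_h)]
        total += dot(diff_i, diff_j)
    return total / len(viewpoints)


# Hypothetical toy term-weight vectors: the first two documents share
# vocabulary, and the last two share a different vocabulary.
docs = [normalize(v) for v in ([3, 1, 0, 0], [2, 2, 0, 0],
                               [0, 0, 1, 3], [0, 1, 2, 2])]

# Two related documents, viewed from the two unrelated ones.
sim_same = multi_viewpoint_similarity(docs[0], docs[1], docs[2:])
# Two unrelated documents, viewed from the remaining ones.
sim_cross = multi_viewpoint_similarity(docs[0], docs[2], [docs[1], docs[3]])

print(sim_same > sim_cross)
```

With the origin as the only viewpoint, this quantity reduces to the plain dot product, which equals cosine similarity for unit-length vectors. Using many viewpoints therefore generalizes the single-viewpoint cosine measure.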
