<?xml version="1.0" encoding="UTF-8"?>



<records>

  <record>
    <language>eng</language>
    <publisher>Oriental Scientific Publishing Company</publisher>
    <journalTitle>Biomedical and Pharmacology Journal</journalTitle>
    <issn>0974-6242</issn>
    <publicationDate>2025-09-30</publicationDate>
    <volume>18</volume>
    <issue>3</issue>
    <startPage>2050</startPage>
    <endPage>2060</endPage>
    <doi>10.13005/bpj/3236</doi>
    <publisherRecordId>67371</publisherRecordId>
    <documentType>article</documentType>
    <title language="eng">Clustering Medical Conditions in Patient Records Using Unsupervised Learning Techniques: A Comparative Study</title>

    <authors>
	 


      <author>
       <name>Bhupesh Rawat</name>

 
		
	<affiliationId>1</affiliationId>
      </author>
    

	 


      <author>
       <name>Himanshu Pant</name>


		
	<affiliationId>1</affiliationId>

      </author>
    

	 


      <author>
       <name>Ankur Bist</name>

		
	<affiliationId>2</affiliationId>
      </author>
    </authors>
    
	    <affiliationsList>
	    
		
		<affiliationName affiliationId="1">Department of School of Computing, Graphic Era Hill University, Bhimtal, India.</affiliationName>
    

		
		<affiliationName affiliationId="2">Department of Computer Science and Engineering (CSE), Graphic Era Hill University, Bhimtal, India.</affiliationName>
    
		
		
		
		
	  </affiliationsList>






    <abstract language="eng">The expansion of electronic health records (EHRs) presents an unparalleled opportunity to identify clinically significant patient trends via unsupervised learning. This study assesses three clustering methodologies (K-Means, DBSCAN, and Hierarchical Clustering) applied to EHR data with PCA for dimensionality reduction, evaluating performance through the Silhouette Score (0.183 for K-Means), Davies-Bouldin Index (1.594), and Calinski-Harabasz Index (245.7). K-Means identified four distinct clusters, including a high-risk group comprising 25% of patients, characterized by increased tumor size (1262 mm) and mitotic activity (0.20/HPF), with SHAP analysis indicating tumor morphology as the principal factor influencing clustering. Although DBSCAN was ineffective in identifying density-based clusters and Hierarchical Clustering exhibited inadequate separation (Silhouette: 0.130), K-Means demonstrated superior efficacy, enabling data-driven patient stratification for personalized treatment strategies and optimized resource allocation. These findings highlight the promise of unsupervised learning in revolutionizing healthcare analytics; however, subsequent research should incorporate temporal data and clinical ontologies to improve interpretability.</abstract>

    <fullTextUrl format="html">https://biomedpharmajournal.org/vol18no3/clustering-medical-conditions-in-patient-records-using-unsupervised-learning-techniques-a-comparative-study/</fullTextUrl>

<keywords language="eng">

      
        <keyword>Clustering</keyword>
        <keyword>DBSCAN</keyword>
        <keyword>EHR</keyword>
        <keyword>K-Means</keyword>
        <keyword>Medical Records</keyword>
        <keyword>Patient Profiling</keyword>
        <keyword>PCA</keyword>
        <keyword>Unsupervised Learning</keyword>
      
</keywords>
  </record>
</records>