<?xml version="1.0" encoding="UTF-8"?>



<records>

  <record>
    <language>eng</language>
          <publisher>Oriental Scientific Publishing Company</publisher>
        <journalTitle>Biomedical and Pharmacology Journal</journalTitle>
          <issn>0974-6242</issn>
            <publicationDate>2026-04-21</publicationDate>
    
        <volume>19</volume>
        <issue>2</issue>

 
    <startPage></startPage>
    <endPage></endPage>

	    <publisherRecordId>71474</publisherRecordId>
    <documentType>article</documentType>
    <title language="eng">A Two-Stage Ensemble Approach for Diabetes Prediction: Early Diagnosis with CatBoost and Advanced Diagnosis with LightGBM</title>

    <authors>
	 


      <author>
       <name>Smita Kulkarni</name>

 
		
	<affiliationId>1</affiliationId>
      </author>
    

	 


      <author>
       <name>Dnyanda Hire</name>


		
	<affiliationId>2</affiliationId>

      </author>
    

	 


      <author>
       <name>Priya Charles</name>

		
	<affiliationId>2</affiliationId>
      </author>
    

	 


      <author>
       <name>Shweta Suryawanshi</name>

		
	<affiliationId>2</affiliationId>
      </author>
    


	


	
    </authors>
    
	    <affiliationsList>
	    
		
		<affiliationName affiliationId="1">Department of E and TC Engineering, MIT Academy of Engineering, Pune, India</affiliationName>
    

		
		<affiliationName affiliationId="2">Department of Semiconductor Engineering, School of Engineering, Management and Research, D. Y. Patil International University, Pune, India</affiliationName>
    
		
		
		
		
	  </affiliationsList>






    <abstract language="eng">Diabetes is a common chronic disease defined by elevated blood sugar levels resulting from impaired insulin action, faulty insulin secretion, or both. The condition can cause long-term damage and dysfunction in various tissues, including the kidneys, heart, blood vessels, eyes, and nerves. As living standards rise, diabetes is becoming increasingly common, making early and accurate detection crucial. This research implements a two-stage approach to diabetes prediction using machine learning ensemble models. The first stage focuses on early diagnosis, giving individuals advance notice of their health status; it uses the Sylhet dataset, which contains comprehensive information for detecting prediabetes. The second stage uses the Frankfurt dataset, which includes numerical parameters for further diagnosis, to predict diabetes from pathological parameters so that appropriate treatment can be provided to prevent further health issues. The article falls under the domain of Biomedical Signals and Medical Sciences; its scope covers early and advanced diagnosis of diabetes using machine learning ensemble models, involving medical data analysis, preprocessing, and the use of biomedical signals or patient health parameters. Various ensemble models were employed in both stages. In stage one, the Categorical Boosting (CatBoost) algorithm demonstrated superior performance for early diagnosis on the Sylhet dataset, while in stage two, the Light Gradient Boosting Machine (LightGBM) algorithm proved most effective for diabetes prediction on the Frankfurt dataset. Selecting the appropriate classifier and the correct features is a critical challenge for machine learning techniques in this domain. The findings suggest that ML models are beneficial for diabetes prediction and can contribute significantly to improving human health. Future research could compare these ensemble models against deep learning techniques, such as CNNs or RNNs, on both datasets to enhance accuracy for early and advanced diabetes prediction.</abstract>

    <fullTextUrl format="html">https://biomedpharmajournal.org/vol19no2/a-two-stage-ensemble-approach-for-diabetes-prediction-early-diagnosis-with-catboost-and-advanced-diagnosis-with-lightgbm/</fullTextUrl>

<keywords language="eng">

      
        <keyword>Categorical Boosting (CatBoost)</keyword>
      

      
        <keyword>Diabetes Prediction</keyword>
      

      
        <keyword>Ensemble Learning</keyword>
      

      
        <keyword>Light Gradient Boosting Machine (LightGBM)</keyword>
      

      
        <keyword>Machine Learning</keyword>
      
</keywords>
  </record>
</records>