<?xml version="1.0" encoding="UTF-8"?>



<records>

  <record>
    <language>eng</language>
          <publisher>Oriental Scientific Publishing Company</publisher>
        <journalTitle>Biomedical and Pharmacology Journal</journalTitle>
          <issn>0974-6242</issn>
            <publicationDate>2026-05-06</publicationDate>
    
        <volume>19</volume>
        <issue>2</issue>

 
    <startPage></startPage>
    <endPage></endPage>

	    <publisherRecordId>71646</publisherRecordId>
    <documentType>article</documentType>
    <title language="eng">Predictive Modelling of Hepatitis C Virus Disease Progression Using PCA and Machine Learning</title>

    <authors>
	 


      <author>
       <name>Sandeep Kumar Sunori</name>

 
		
	<affiliationId>1</affiliationId>
      </author>
    

	 


      <author>
       <name>Shilpa Jain</name>


		
	<affiliationId>2</affiliationId>

      </author>
    

	 


      <author>
       <name>Govind Singh Jethi</name>

		
	<affiliationId>2</affiliationId>
      </author>
    

	 


      <author>
       <name>Pradeep Juneja</name>

		
	<affiliationId>1</affiliationId>
      </author>
    


	


	
    </authors>
    
	    <affiliationsList>
	    
		
		<affiliationName affiliationId="1">Department of ECE, Graphic Era Hill University, Bhimtal Campus, India</affiliationName>
    

		
		<affiliationName affiliationId="2">Department of CSE, Graphic Era Hill University, Bhimtal Campus, India</affiliationName>
    
		
		
		
		
	  </affiliationsList>






    <abstract language="eng"><p style="margin-top: 14.15pt; text-align: justify;"><span style="font-weight: normal;">Chronic hepatitis C virus (HCV) infection progresses through Hepatitis and Fibrosis to Cirrhosis, and staging must be non-invasive to be practical in clinical settings. This article develops a computational approach to multi-class HCV staging using standard serum laboratory biomarkers. The dataset comprises 12 clinical biomarkers and demographic variables for 615 subjects. Because the biomarker panel is intrinsically correlated and high-dimensional, Principal Component Analysis (PCA) was applied as a feature-engineering step, retaining 95% of the total data variance. Three supervised machine learning classifiers were trained and compared on the resulting low-dimensional feature set: Naive Bayes (NB), K-Nearest Neighbors (KNN, k=5), and a multi-class Support Vector Machine (SVM) with a linear kernel wrapped in Error-Correcting Output Codes (ECOC). The SVM-ECOC model showed the best overall predictive performance, with the highest Accuracy (91%), a Macro-Averaged Precision of 0.745, and a Macro-Averaged Recall (Sensitivity) of 0.61. Its translational usefulness was further assessed with the Multi-Category Net Reclassification Improvement (MCNRI) measure, which indicated a net improvement in correct risk stratification of 8.13% over Naive Bayes and 4.88% over K-Nearest Neighbors. These results support the feasibility of using PCA to reduce multidimensional biological data to a linearly separable feature space, which substantially improves classification.
Nevertheless, the study has a significant limitation: the gap between the high overall accuracy and the moderate Macro-Averaged Recall indicates low sensitivity (a high False Negative Rate) for the key minority disease classes (Hepatitis, Fibrosis, Cirrhosis), caused by class imbalance in the dataset. All models were simulated in MATLAB. Future research should focus on data-level methods, such as oversampling, to reduce class bias and achieve ethically acceptable, reliable diagnostic sensitivity at all stages of HCV progression, as required for clinical use.</span></p></abstract>

    <fullTextUrl format="html">https://biomedpharmajournal.org/vol19no2/predictive-modelling-of-hepatitis-c-virus-disease-progression-using-pca-and-machine-learning/</fullTextUrl>

<keywords language="eng">

      
        <keyword>HCV (Hepatitis C Virus)</keyword>
        <keyword>Liver Fibrosis</keyword>
        <keyword>MCNRI (Multi-Category Net Reclassification Improvement)</keyword>
        <keyword>Multi-class Classification</keyword>
        <keyword>PCA (Principal Component Analysis)</keyword>
        <keyword>Predictive modelling</keyword>
        <keyword>SVM (Support Vector Machine)</keyword>
        <keyword>Serum Biomarkers</keyword>
      
</keywords>
  </record>
</records>