1 Introduction

Cervical intraepithelial neoplasia (CIN) is a precursor to cervical cancer and is classified as CIN1 (low-grade) or CIN2/3 (high-grade) according to the risk of progression to malignancy [1]. The primary aim of managing CIN is to prevent progression to invasive cancer while avoiding overtreatment of lesions that may regress [2]. Current guidelines recommend observation for patients with CIN1 and for patients with CIN2 who have fertility requirements. Loop electrosurgical excision procedure (LEEP) or cone resection is performed for patients with CIN2 who do not have fertility requirements and for those with CIN3 [3]. Cervical cancer usually progresses slowly, which makes follow-up and treatment of patients with CIN possible over time. Roughly half of CIN2 cases regress spontaneously if left untreated [4]. In one meta-analysis of 36 studies, including randomized trials and observational research, 3160 patients with CIN2 were followed up at 24 months; 50% experienced regression, 32% persistence, and 18% progression to CIN3+ [5]. When the follow-up was extended to 36 months, the progression rate of CIN2 increased from 5 to 24%, suggesting that most patients spontaneously regress and the remaining subset may have a higher risk of progression [6]. Moore et al. indicated that 65% of adolescents and young women with biopsy-confirmed CIN2 revert to normal within 18 months, and that the likelihood of regression was higher in young women. Based on these findings, the American Society for Colposcopy and Cervical Pathology guideline recommends considering observation for CIN2 in adolescents and suggests colposcopy and cytology, with re-evaluation every 4–6 months, for young women [7]. This conservative approach allows for spontaneous resolution of CIN2 and avoids LEEP or cold knife conization.

Beyond the psychological and physical trauma it can cause, treatment involving cervical excision has been shown in recent years to be associated with obstetric complications [8, 9], such as preterm birth and preterm premature rupture of membranes (PPROM) [10]. At present, research on conservative treatment of CIN2 mainly focuses on traditional risk factors, such as HPV genotype, cytology results, and lesion characteristics under colposcopy. However, these studies often lack a comprehensive prediction model [11, 12]. This study used follow-up data from 2019 to 2022 from a cervical disease cohort at the Shanxi Province colposcopy unit to establish a prediction model for CIN2 regression. Such a model would allow for accurate triage of patients with CIN2, with conservative treatment for those with a high likelihood of regression and standardized treatment for those likely to show progression or persistence.

2 Methods

2.1 Research participants

This study included women aged 19 to 65 years who were diagnosed with CIN2 on colposcopic cervical biopsy between 2019 and 2022 at the Outpatient Department of the Second Hospital of Shanxi Medical University. During this period, patients with CIN2 underwent LEEP after 6 months of follow-up. Patients with CIN1 or normal pathology within 6 months were classified as regression, while all other patients were classified as persistence or progression (Fig. 1). Demographic information, liquid-based cytology (LBC) results, HPV typing results, and colposcopic examination data were also collected. The inclusion criteria were histopathologically confirmed CIN2 on the original diagnosis; age 19–65 years; and complete visualization of the transformation zone on colposcopy. The exclusion criteria were pregnancy; a history of hysterectomy; previous treatment for CIN2+ lesions; other malignancies; cardiovascular, blood, or digestive system diseases; invasive carcinoma confirmed by cone biopsy; glandular lesions of the cervix; a choice of conservative treatment; and immunodeficiency. This study was approved by the Ethics Committee of the Second Hospital of Shanxi Medical University and was conducted after obtaining written informed consent from the participants (approval number: 2021YX136).

Fig. 1 Flowchart of the distribution of women with CIN2 lesions according to inclusion/exclusion criteria

2.2 Data collection

Clinical data were collected through questionnaire interviews, physical examinations, laboratory tests, and biological specimen collection. Trained interviewers used standardized, structured questionnaires to conduct interviews on-site. Demographic information was collected, including age, alcohol consumption, reproductive history (such as parity, childbirth, abortion, age at first sexual intercourse, and age at first pregnancy), and menopausal status. Clinical data were acquired through gynecological examinations and laboratory tests, including LBC, HPV testing, colposcopic examination (lesion area and acetowhite changes), and cervical biopsy.

2.3 Liquid-based cytology and colposcopic biopsy

All Pap tests were performed using the LBC method. Two cytopathologists from the Second Hospital of Shanxi Medical University performed cytological evaluation following the Bethesda system 2001 terminology. Obstetricians and gynecologists from the Second Hospital of Shanxi Medical University performed the colposcopic examination, and biopsy specimens were taken from abnormal or suspicious lesion sites identified with acetic acid and iodine tests [13,14,15]. If a quadrant showed no abnormality, biopsy was performed at the squamocolumnar junction at the 2, 3, 8, or 10 o’clock positions of the cervix, followed by endocervical curettage (ECC). The final histological diagnosis was based on the quadrant biopsy and ECC.

2.4 Human papilloma virus typing test

The HPV typing test was performed employing the KaiPu HPV 21 gene typing test kit for flow-through hybridization detection. This test can detect 21 types of HPV, with 15 high-risk types (16, 18, 31, 33, 35, 39, 45, 51, 52, 53, 56, 58, 59, 66, 68) and 6 low-risk types (6, 11, 41, 42, 44, CP8304).

2.5 Cervical cone biopsy

Cervical cone biopsy involved LEEP or cold knife cone biopsy. Women with CIN2 as per biopsy pathology results underwent LEEP or cold knife cone biopsy. Results of pathological analysis after cervical cone biopsy included chronic cervicitis, CIN1, CIN2, CIN3, squamous cell carcinoma, and cervical adenocarcinoma.

2.6 Statistical methods

Continuous data were expressed as means ± standard deviations, and the two groups were compared using t-tests. In case of homogeneous variance, one-way analysis of variance was performed for inter-group comparison and the least significant difference (LSD) t-test was used for pairwise comparison. In case of heterogeneous variance, the Mann–Whitney U test was performed for pairwise comparison. Univariate logistic regression analysis using the rms package [16] in R 4.0.2 was employed to identify factors influencing the LEEP or cone biopsy pathology results of patients with CIN2. Least absolute shrinkage and selection operator (LASSO) regression was then performed to further screen the influencing factors, and multivariate logistic regression was performed to establish a prediction model for CIN2 regression. The model was internally validated by bootstrap resampling, and calibration plots were used to assess its accuracy. Receiver operating characteristic (ROC) curves were prepared to verify the discriminative power of the model [17]. All P-values were based on two-sided tests, with P < 0.05 considered statistically significant.
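As an illustration of the discrimination analysis described above, the following minimal Python sketch computes an AUC from its rank-based definition and a percentile bootstrap confidence interval. It is not the R code used in this study; the function names, the toy scores, and the toy labels are hypothetical and serve only to show the mechanics.

```python
import random

def auc(scores, labels):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical toy data: predicted regression probabilities and outcomes
# (1 = regression, 0 = persistence/progression). Not study data.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
toy_ci = bootstrap_auc_ci(toy_scores, toy_labels)
```

With real data, `toy_scores` would be replaced by the fitted model's predicted probabilities and `toy_labels` by the observed pathology outcomes.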

3 Results

3.1 Demographic characteristics

This study prospectively analyzed the pathology results of patients with CIN2 who underwent LEEP or conization during 2019–2022. Data were collected from a total of 521 women aged 19–65 years, comprising 185 cases of lesion regression and 336 cases of lesion persistence or progression. The candidate variables considered for the CIN2 regression prediction model were age, alcohol consumption, reproductive factors (such as parity, childbirth, abortion history, age at first sexual intercourse, and age at first pregnancy), menopausal status, LBC results, HPV status, and colposcopic findings (lesion area and acetowhite changes).

There was no statistically significant difference in age (P = 0.562), age at menarche (P = 0.214), age at first pregnancy (P = 0.256), number of pregnancies (P = 0.951), number of childbirths (P = 0.529), or proportion of menopausal women (P = 0.602) between the lesion regression group and the lesion persistence/progression group. The differences in abortion history (P = 0.709) and number of sexual partners (P = 0.074) were also not statistically significant, whereas the age at first sexual intercourse differed significantly between the two groups (P = 0.004). Compared with the lesion regression group, the lesion persistence/progression group had a higher incidence of LBC results of ASC-US or greater (P < 0.001), a higher incidence of HPV types 16, 18, and 58 (P = 0.003), larger lesion areas at the time of colposcopy (P < 0.001), and thicker acetowhite changes (P = 0.01), with all of these differences being statistically significant (Table 1).

Table 1 Comparison of baseline data

3.2 Selection of factors influencing CIN2 regression prediction model

Univariate logistic regression was used to screen for factors associated with downgrading of pathological outcomes after LEEP or cone biopsy in CIN2 patients. No statistically significant association was found between pathological outcome and age (P = 0.561), menarche age (P = 0.214), age at first pregnancy (P = 0.256), number of pregnancies (P = 0.951), number of births (P = 0.528), number of abortions (P = 0.708), or menopausal status (P = 0.602). A later onset of sexual activity (95% CI: 1.128–1.926, P = 0.005) was identified as a protective factor, that is, it was associated with a higher likelihood of CIN2 regression. Abnormal ThinPrep cytologic test (TCT) results (ASC-US or above) (95% CI: 0.440–0.717, P < 0.0001), HPV 16/18/58 infection (95% CI: 0.227–0.729, P = 0.0025), larger lesion area (95% CI: 0.166–0.396, P < 0.0001), and thick acetowhite (95% CI: 0.118–0.382, P < 0.0001) were identified as risk factors, that is, they were associated with a lower likelihood of CIN2 regression after LEEP or cone biopsy; multiple sexual partners showed the same direction of effect but did not reach statistical significance (95% CI: 0.041–1.171, P = 0.076) (Fig. 2a).
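The intervals above are reported without point estimates. Assuming they are Wald 95% confidence intervals for odds ratios (the text does not state this explicitly), the implied odds ratio, standard error, and P-value can be recovered as in the sketch below; the function name `wald_from_ci` is hypothetical.

```python
import math

def wald_from_ci(lower, upper, z_crit=1.96):
    """Recover the odds ratio, SE of the log-odds ratio, z statistic, and
    two-sided P-value from a Wald CI that is symmetric on the log scale."""
    log_or = (math.log(lower) + math.log(upper)) / 2
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P-value
    return math.exp(log_or), se, z, p

# Interval reported for later onset of sexual activity (univariate analysis)
odds_ratio, se, z, p = wald_from_ci(1.128, 1.926)
```

For this interval the implied odds ratio is about 1.47, and the recovered P-value is of the same order as the reported P = 0.005, which is at least consistent with the Wald assumption.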

Fig. 2 Confounding factors related to the regression of the cervical intraepithelial neoplasia 2 (CIN2) prediction model. A Univariate logistic regression screening of risk factors for downgrading pathological results after Loop Electrosurgical Excision Procedure (LEEP) or cone biopsy in patients with CIN2; B and C Least absolute shrinkage and selection operator regression screening of risk factors for downgrading pathological results post-surgery in patients with CIN2; D Multivariate logistic screening of risk factors for downgrading pathological results in patients with CIN2 post-surgery

Next, LASSO regression was performed to further select variables. At λ = 0.0170571, seven predictive factors were identified (Fig. 2b and c), namely menarche age, age at first sexual activity, number of sexual partners, TCT results, HPV infection type, lesion area determined by colposcopy, and acetowhitening thickness. Subsequently, stepwise multivariate logistic regression was performed to further screen for factors related to regression in CIN2 patients undergoing LEEP or cone biopsy. Later onset of sexual activity (95% CI: 1.130–2.021, P = 0.005) was identified as a protective factor for CIN2 regression, while abnormal TCT results (ASC-US or above) (95% CI: 0.482–0.801, P = 0.0002), HPV 16/18/58 infection (95% CI: 0.243–0.845, P = 0.013), and larger lesion area (95% CI: 0.213–0.557, P < 0.0001) were identified as risk factors that lowered the likelihood of CIN2 regression; thick acetowhite showed the same direction of effect with borderline significance (95% CI: 0.265–1.030, P = 0.061) (Fig. 2d).
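To illustrate why a LASSO penalty removes some variables entirely, the sketch below applies the soft-thresholding operator that underlies LASSO fitting. In the simplified case of a linear model with orthonormal predictors, this operator gives the LASSO solution directly; penalized logistic regression instead applies it repeatedly inside an iterative algorithm. The coefficient values and predictor names are hypothetical, and only the λ value is taken from the text.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values with |z| <= lam become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

LAMBDA = 0.0170571  # penalty value reported in the text

# Hypothetical unpenalized coefficients, for illustration only
raw = {"predictor_a": 0.50, "predictor_b": -0.30, "predictor_c": 0.01}

shrunk = {name: soft_threshold(b, LAMBDA) for name, b in raw.items()}
selected = [name for name, b in shrunk.items() if b != 0.0]
```

Here the weakest predictor is set exactly to zero and dropped, while the others are retained with slightly shrunken coefficients.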

Finally, the constructed prediction model for CIN2 regression included five predictive factors: age at first sexual activity, TCT results, HPV infection type, lesion area detected by colposcopy, and acetowhitening thickness. Each predictive factor was assigned a specific score, and the probability of CIN2 regression was obtained by adding up these scores (Fig. 3).
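The way a nomogram turns a logistic model into additive points can be sketched as follows. All coefficients and the intercept are hypothetical placeholders, not the fitted values from this study, and the 0/1 coding of each predictor is likewise an assumption made for illustration.

```python
import math

# Hypothetical values for illustration only; NOT the fitted model.
INTERCEPT = 0.5
COEFS = {
    "later_first_intercourse": 0.40,
    "abnormal_tct": -0.47,
    "hpv_16_18_58": -0.79,
    "large_lesion_area": -1.07,
    "thick_acetowhite": -0.65,
}
MAX_ABS = max(abs(b) for b in COEFS.values())

def regression_probability(patient):
    """Direct logistic prediction from 0/1-coded predictors."""
    lp = INTERCEPT + sum(COEFS[k] * patient[k] for k in COEFS)
    return 1.0 / (1.0 + math.exp(-lp))

def nomogram_points(patient):
    """Points per predictor: the least favorable level scores 0 and the
    strongest predictor spans the full 0 to 100 range."""
    return {k: 100.0 * (b * patient[k] - min(0.0, b)) / MAX_ABS
            for k, b in COEFS.items()}

def probability_from_points(total_points):
    """Map the total points back to a predicted probability."""
    baseline = INTERCEPT + sum(min(0.0, b) for b in COEFS.values())
    lp = baseline + total_points * MAX_ABS / 100.0
    return 1.0 / (1.0 + math.exp(-lp))

# Hypothetical patient: later sexual debut, abnormal TCT, no other risk factors
patient = {
    "later_first_intercourse": 1,
    "abnormal_tct": 1,
    "hpv_16_18_58": 0,
    "large_lesion_area": 0,
    "thick_acetowhite": 0,
}
points = nomogram_points(patient)
total = sum(points.values())
```

Because the points are a linear rescaling of the model's linear predictor, `probability_from_points(total)` equals `regression_probability(patient)`; the nomogram is a graphical form of the same calculation.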

Fig. 3 The nomogram model, five-factor predictive model for cervical intraepithelial neoplasia 2 (CIN2) regression

3.3 Model discrimination and accuracy

ROC analysis was used to evaluate the model's predictions, yielding a sensitivity of 0.827 and a specificity of 0.708. The area under the curve (AUC) was 0.832 (95% CI: 0.797–0.865) (Fig. 4a), indicating good discrimination of the predictive model. Subsequently, internal repeated sampling showed good calibration of the model (Fig. 4b). Further, an HPV and TCT combined prediction model was constructed, with an AUC of 0.6805 (95% CI: 0.629–0.735); the five-factor CIN2 regression prediction model demonstrated significantly higher predictive value than this combined model (P < 0.05) (Fig. 4c). Finally, decision curve analysis (DCA) showed that the predictive value of the five-factor CIN2 regression prediction model was higher than that of the HPV and TCT combined prediction model (Fig. 4d). The Net Reclassification Improvement (NRI) of the five-factor CIN2 regression prediction model relative to the HPV and TCT combined prediction model was 0.348 (95% CI: 0.241–0.455) and the Integrated Discrimination Improvement (IDI) was 0.348 (95% CI: 0.241–0.456); the difference was statistically significant (P < 0.0001), which further supported the predictive efficiency of the five-factor CIN2 regression prediction model.
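The reported sensitivity and specificity can be translated into approximate patient counts. The sketch below assumes that regression is the positive class and that both values refer to the full cohort of 185 regression and 336 persistence/progression cases; neither assumption is stated explicitly in the text, and the counts are obtained by rounding.

```python
def implied_confusion_matrix(sensitivity, specificity, n_pos, n_neg):
    """Approximate counts implied by sensitivity and specificity (rounded)."""
    tp = round(sensitivity * n_pos)
    tn = round(specificity * n_neg)
    return {"tp": tp, "fn": n_pos - tp, "tn": tn, "fp": n_neg - tn}

def summary_metrics(cm):
    """Youden's J and predictive values from a confusion matrix."""
    tp, fn, tn, fp = cm["tp"], cm["fn"], cm["tn"], cm["fp"]
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "youden_j": sens + spec - 1,
        "ppv": tp / (tp + fp),  # share of predicted regressions that regressed
        "npv": tn / (tn + fn),  # share of predicted non-regressions that did not
    }

# Reported sensitivity and specificity with the cohort sizes from Section 3.1
cm = implied_confusion_matrix(0.827, 0.708, n_pos=185, n_neg=336)
metrics = summary_metrics(cm)
```

Under these assumptions, about 153 of 185 regressing lesions would be correctly identified, while about 98 of 336 persistent or progressing lesions would be misclassified as likely to regress.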

Fig. 4 Discrimination and accuracy of the five-factor predictive model. A Area under the curve (AUC) of the five-factor predictive model; B Internal resampling validation of the five-factor predictive model; C Comparison of the receiver operating characteristic curve (ROC) AUCs between the human papilloma virus (HPV) and ThinPrep cytologic test (TCT) combined predictive model and the five-factor predictive model; D Clinical decision curve comparing the predictive value of the HPV and TCT combined predictive model with the five-factor predictive model

4 Discussion

The incidence of CIN2 is high; nearly half of CIN2 cases resolve within 2 years, and fewer than one-fifth progress [18]. Additionally, some localized CIN2 lesions can be completely removed by biopsy during colposcopy. Consequently, traditional treatment often yields a downgraded postoperative pathology, meaning that patients whose excision specimens prove negative have undergone invasive treatment, with physical or psychological trauma and potential overtreatment [19].

Currently, there is limited research on post-biopsy pathological downgrading in patients with CIN2. Our study collected clinical information, LBC, HPV, and colposcopy findings, such as lesion area and acetowhite changes, to establish a predictive model for post-biopsy pathological downgrading in patients with CIN2. Univariate logistic regression, LASSO regression, and multivariate logistic regression were employed to identify five factors linked to post-biopsy downgrading of CIN2, and a prediction score model was then constructed. The factors in the prediction scoring model were age at first intercourse, TCT, HPV, lesion area observed on colposcopy, and acetowhite changes. The prediction scoring model was internally validated, and the results showed that the model had good discriminative ability and calibration.

Several studies indicate that conservative treatment of CIN2 is safe in women younger than 25 years [5, 20]. Among them, the meta-analysis by Zhang et al. of 1,481 female patients who underwent conservative treatment of CIN2+ for 15 months demonstrated a significant negative correlation between age and regression [21]. Boulch observed 2,408 women with cervical cytological abnormalities for 24 months and reported that older women had more persistent HPV infections (P = 0.008), but did not identify any correlation between new HPV infections and age [22]. However, Mancebo et al. studied CIN2 patients and found no relationship between the safety of conservative CIN2 treatment and age at diagnosis [23]. Our study likewise did not find any statistically significant association between age and post-operative pathological upgrade or downgrade in patients with CIN2 who underwent LEEP or conization. Several cross-sectional studies have indicated that an early age of sexual debut is a risk factor for HPV infection [24, 25]. Our study also shows that an older age at first intercourse was associated with post-operative pathological downgrading in patients with CIN2 who underwent LEEP or conization.

TCT and HPV testing are routine screening methods for cervical cancer and are often employed as predictive factors for cervical lesion progression. The global positive predictive value of cytological high-grade squamous intraepithelial lesion (HSIL) for diagnosing CIN2+ is 77.5% [26]. Sedeno et al. analyzed 143 patients with low-grade squamous intraepithelial lesions (LSIL) and found that HPV16-positive LSIL is more likely to progress to CIN2+ and that HPV genotype is a risk indicator for patients with ASC-US or LSIL cytology [27]. A prior cytological result of HSIL was an independent risk factor for progression in women with CIN2+, and cytological examination results are helpful for more effective and personalized management of CIN2 [28, 29]. HPV16 and 18 are factors affecting post-operative pathological downgrading after LEEP or conization [30,31,32]. Persistent HPV16 infection is significantly more related to recurrence. Accordingly, CIN2+ caused by HPV16-positive infections is less likely to resolve than CIN2+ due to other high-risk HPV types or HPV16-negative CIN2+. Other studies have further shown that, after HPV16/18, the third most common genotype may be HPV58. HPV58 positivity was 12.5% at 4–6 months after conization but 0% at 8–12 months post-conization, indicating that HPV58 is not necessarily cleared and that HPV58 clearance may be just as ineffective as that of HPV16 [33,34,35]. Similarly, our study also reports that cytology (P < 0.0001) and HPV genotyping (P = 0.0121) are independent factors affecting post-operative pathological downgrading in CIN2 patients who underwent LEEP or conization.

Colposcopy is an important diagnostic approach for evaluating cervical lesions and performing biopsies following abnormal cytology and HPV genotyping [36, 37]. The lesion area and acetowhite epithelium are the primary colposcopic indicators for evaluating cervical lesions [38, 39]. Although acetowhite epithelium may not be identical to tumor tissue, nearly all cervical lesions exhibit variable, transient, opaque white areas after the application of 3–5% acetic acid. Therefore, lesion area and the thickness of acetowhite epithelium under colposcopy may help in predicting the severity of cervical lesions. However, because the examination is subjective, the specificity of colposcopy depends mainly on the experience and professional knowledge of the physician performing it. Our study shows that lesion area (P < 0.0001) and the thickness of acetowhite epithelium (P = 0.061) under colposcopy were critical factors affecting the post-operative pathological downgrade or upgrade in patients with CIN2.

Cervical cancer screening strategies need to consider overall costs, benefits, and cost-effectiveness [40]. According to estimates from the World Health Organization (WHO), between 1 and 2% of women are diagnosed with CIN2+ each year. In the United States, the incidence rate of CIN2/3 ranges from 120 to 160 cases per 100,000 women, while South Korea reports an incidence rate of CIN3 of 39.8 per 100,000. In China, the prevalence of histologically confirmed CIN2+ stands at 3%, with significantly higher rates observed in rural areas than in urban centers; this figure also exceeds the prevalence seen in other Asian countries. The elevated incidence of CIN2+ clearly positions it as a critical public health concern, necessitating substantial resources for screening, diagnosis, and treatment. There is an urgent need to develop a predictive model for cervical lesions that is tailored for low- and middle-income countries. Our regression prediction model for patients with CIN2 incorporates several key factors, including TCT results, HPV infection types, the area of lesions detected through colposcopy, and the thickness of acetowhitening. These parameters can be obtained via cytology, HPV typing, and colposcopy, eliminating the requirement for additional tests such as immunohistochemistry, methylation analysis, or DNA ploidy analysis. This approach significantly reduces the economic burden on both patients and society. Moreover, by leveraging information technology, our disease prediction model can be effectively deployed in resource-limited settings. Primary healthcare providers can enter patient data into the model to estimate the likelihood of CIN2 regression. Low- and middle-income countries currently bear a substantial burden of cervical lesions. Our CIN2 prediction model, by avoiding the need for extra testing, offers a reliable means to assess the regression probability of CIN2 patients and can enhance the accuracy of treatment planning. This will ultimately contribute to reducing the incidence and mortality associated with cervical lesions.

This study followed the recommendations of the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) statement and used internal validation methods in calculating the C-index, time-dependent AUC, and calibration curves [41]. Overall, the prediction scoring model we propose may facilitate objective prediction of pathologic regression in patients with CIN2 and help identify patients with CIN2 suitable for conservative treatment more conveniently, objectively, and practically in a clinical setting.

Nonetheless, despite the good performance of the prediction scoring model, this study had some limitations. The research design excluded patients with blood and digestive system diseases, immunocompromised patients, patients with cervical glandular lesions, etc. The sample size and follow-up time need to be increased to study the predictive value of the model in other women with CIN2. In addition, multi-center external validation is needed to further evaluate how well the model predicts regression in patients with CIN2.

5 Conclusions

This study established and internally validated a nomogram model for predicting the likelihood of CIN2 regression. The model combined age at first sexual intercourse, TCT results, HPV infection status, lesion area observed by colposcopy, and acetowhite thickness, and showed high accuracy and reliability. It can provide clinicians with an objective tool to assess the likelihood of CIN2 regression, thereby facilitating decision-making for personalized treatment plans. Further validation in larger multicenter studies is necessary to confirm the generalizability and robustness of our findings, which will ultimately help guide the conservative treatment of CIN2 patients.