Journal of Clinical Medicine

MDPI

Article

Ki-67 as a Predictor of Metastasis in Adrenocortical Carcinoma: Artificial Intelligence Insights from Retrospective Imaging Data

Andrew J. Goulian 1,* and David S. Yee 1,2

1 College of Medicine, California Northstate University, Elk Grove, CA 95757, USA

2 Department of Urology and Genitourinary Oncology, Sutter Health, Roseville, CA 95661, USA

* Correspondence: andrew.goulian9114@cnsu.edu

Academic Editors: Francesco Ziglioli and Umberto Vittorio Maestroni

Received: 13 April 2025
Revised: 15 June 2025
Accepted: 1 July 2025
Published: 8 July 2025

Citation: Goulian, A.J.; Yee, D.S. Ki-67 as a Predictor of Metastasis in Adrenocortical Carcinoma: Artificial Intelligence Insights from Retrospective Imaging Data. J. Clin. Med. 2025, 14, 4829. https://doi.org/10.3390/jcm14144829

Copyright: © 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Abstract

Background/Objectives: Adrenocortical carcinoma (ACC) is a rare, aggressive malignancy with poor prognosis, particularly in metastatic cases. The Ki-67 proliferation index is a recognized marker of tumor aggressiveness, yet its role in guiding diagnostic imaging and surgical decision-making remains underexplored. This study evaluates Ki-67’s predictive value for metastasis at diagnosis, leveraging artificial intelligence (AI) to inform personalized, minimally invasive strategies for ACC management. Methods: We retrospectively analyzed 53 patients with histologically confirmed ACC from the Adrenal-ACC-Ki67-Seg dataset in The Cancer Imaging Archive. All patients had Ki-67 indices from surgical specimens and preoperative contrast-enhanced CT scans. Descriptive statistics, t-tests, ANOVA, and multivariable logistic regression evaluated associations between Ki-67, tumor size, age, and metastasis. Random Forest classifiers, with and without the Synthetic Minority Oversampling Technique (SMOTE), were developed to predict metastasis. A Ki-67-only model served as a baseline comparator. Model performance was assessed using the area under the curve (AUC) and DeLong’s test. Results: Patients with metastatic disease had significantly higher Ki-67 indices (mean 39.4% vs. 21.6%, p < 0.05). Logistic regression identified Ki-67 as the sole significant predictor (OR = 1.06, 95% CI: 1.01-1.12). The Ki-67-only model achieved an AUC of 0.637, while the SMOTE-enhanced Random Forest achieved an AUC of 0.994, significantly outperforming all others (p < 0.001). Conclusions: Ki-67 is significantly associated with metastasis at ACC diagnosis and demonstrates independent predictive value in regression analysis. However, integration with machine learning models incorporating tumor size and age significantly improves overall predictive accuracy, supporting AI-assisted risk stratification and precision imaging strategies in adrenal cancer care.

Keywords: adrenocortical carcinoma; Ki-67; metastasis prediction; artificial intelligence; random forest; risk stratification; minimally invasive surgery; radiomics; precision oncology

1. Introduction

Adrenocortical carcinoma (ACC) is a rare and aggressive malignancy of the adrenal cortex, with an annual incidence of 0.7-2 cases per million and a poor five-year survival rate, particularly in patients with metastatic disease at diagnosis [1,2]. Despite advances in surgical techniques, such as minimally invasive and robotic adrenalectomy, and systemic therapies, early detection and accurate risk stratification remain critical challenges for improving patient outcomes [3,4]. Emerging evidence highlights the prognostic value of comprehensive histopathological assessments, including Ki-67, though challenges persist in standardizing its evaluation across diverse ACC cohorts [5,6]. Additionally, refined surgical strategies and diagnostic innovations, such as urine steroid metabolomics, underscore the need for integrated approaches to enhance risk stratification and early detection [6,7].

Among histopathological markers, the Ki-67 proliferation index has emerged as a key prognostic indicator in ACC, and is consistently linked to tumor recurrence, overall survival, and metastatic potential [8-11]. Studies have reported Ki-67 cutoffs of 10-20% as predictive of adverse outcomes, underscoring its clinical relevance [9,11].

However, the standalone predictive utility of Ki-67 is constrained by interobserver variability in immunohistochemical scoring, inconsistent tissue sampling, and limited integration into standardized staging systems [12]. Comprehensive evaluations of Ki-67 alongside clinical variables, such as age, tumor size, laterality, or resection margin status, are scarce, with most prior studies focusing on recurrence or survival rather than metastasis at initial diagnosis [13,14]. Moreover, while meta-analyses have established associations between Ki-67 expression and clinical features such as age and tumor size, they do not assess the relative predictive contribution of Ki-67 compared to these variables within multivariable models [15]. To date, comparative predictive analyses evaluating Ki-67 alongside basic clinical features in models specific to ACC remain largely absent from the literature [15,16]. The role of Ki-67 in predicting metastasis at initial ACC diagnosis therefore remains underexplored, particularly in multivariable predictive models. At the same time, the increasing availability of open-access imaging repositories, such as The Cancer Imaging Archive (TCIA), combined with advanced computational methods, provides an opportunity to validate Ki-67’s predictive capacity through novel analytical approaches [17,18].

Radiomic methodologies have shown promise in predicting Ki-67 expression preoperatively using contrast-enhanced CT imaging, yet translating these predictions into actionable clinical decisions remains challenging [19]. This difficulty arises due to variability in Ki-67 scoring, lack of consensus on clinical cutoff thresholds, and uncertainty about how Ki-67 should guide imaging surveillance or treatment decisions, such as adjuvant therapy or surgical planning [12,19]. Integrating Ki-67-based predictions with imaging and interventional strategies could optimize preoperative risk assessment and guide minimally invasive procedures in ACC management [20,21].

Recent advancements in artificial intelligence (AI) and machine learning offer significant potential to enhance diagnostic precision and personalize risk stratification in oncology [22,23]. Ensemble learning algorithms, such as Random Forest classifiers paired with resampling techniques like the Synthetic Minority Oversampling Technique (SMOTE), provide robust classification capabilities, particularly for the imbalanced datasets common for rare cancers like ACC [24-26]. These methods address the limitations of traditional statistical models, which often lack power due to small sample sizes [25].

In this study, we analyzed clinical and histopathological data from the publicly accessible Adrenal-ACC-Ki67-Seg dataset on TCIA [18]. Our primary objective was to evaluate the Ki-67 index’s predictive capacity for metastatic disease at initial diagnosis and assess its incremental value when combined with tumor size and age, using both logistic regression and Random Forest machine learning techniques. We aim to develop data-driven, reproducible risk stratification models for ACC, contributing to multimodal approaches that integrate imaging, histopathological, and computational tools to guide minimally invasive strategies in urological oncology [20,21].

2. Materials and Methods

2.1. Study Design

This retrospective study utilized de-identified clinical and imaging data obtained from The Cancer Imaging Archive (TCIA), a publicly funded resource supported by the National Cancer Institute (Bethesda, MD, USA) and maintained by the Department of Biomedical Informatics within the College of Medicine at the University of Arkansas for Medical Sciences (UAMS) (Little Rock, AR, USA) in collaboration with the UAMS Information Technology department and the Department of Biomedical Informatics at Emory University (Atlanta, GA, USA). As all TCIA data are de-identified in compliance with the HIPAA Safe Harbor Method, and the de-identification process is conducted under protocols approved by the Institutional Review Board of the hosting institution, this study was exempt from additional IRB review and the requirement for informed consent was waived [17].

Clinical and imaging data were extracted from the Adrenal-ACC-Ki67-Seg dataset hosted on TCIA [18]. Patients were included if they met the following criteria: a confirmed diagnosis of adrenocortical carcinoma (ACC); surgical resection of the tumor; a Ki-67 index determined from histopathological analysis of the resected specimen; and contrast-enhanced abdominal CT imaging performed prior to surgery. Patients were excluded if the Ki-67 index was determined via biopsy rather than resected tumor tissue, based on prior studies indicating that whole-tumor evaluation provides more reliable quantification [18].

A total of 53 patients diagnosed with ACC between 2006 and 2018 were included. Among them, 7 patients (13.2%) had metastatic disease at initial diagnosis, while 46 (86.8%) did not. The dataset included clinical, demographic, and histopathologic variables such as age, sex, race, tumor laterality, tumor size (in cm), Ki-67 index (%), resection margin status, T and N staging, and time from imaging to diagnosis (in days). All patients underwent surgical resection, and Ki-67 indices were quantified from the full tumor specimen.

2.2. Statistical Analysis

All statistical analyses were conducted in R version 4.5.0. Descriptive statistics, including means, standard deviations (SDs), medians, and counts, were calculated for all clinical and histopathologic variables. Variables were stratified by metastatic status at diagnosis (yes vs. no) to facilitate group-level comparisons.

For continuous variables including age, tumor size, Ki-67 index, and days from imaging to diagnosis, group comparisons were performed using Welch’s two-sample t-tests. One-way analysis of variance (ANOVA) was used to assess differences in the Ki-67 index across categories of resection margin, laterality, T stage, N stage, sex, and race. Pearson correlation was used to evaluate the association between Ki-67 index and tumor size.
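As a hedged illustration of the Welch comparison described above, the sketch below computes the test statistic and the Welch-Satterthwaite degrees of freedom from summary statistics alone. It is written in Python for illustration (the analysis itself was run in R), the function name is hypothetical, and the inputs are the rounded Ki-67 means, SDs, and group sizes reported in Table 1, so the result is approximate.

```python
import math

def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for Welch's unequal-variance t-test from summary statistics."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Ki-67 index: metastatic (n = 7) vs. non-metastatic (n = 46), rounded values from Table 1
t_stat, df = welch_t_from_summary(39.4, 14.1, 7, 21.6, 16.1, 46)
print(f"t = {t_stat:.2f}, df = {df:.2f}")  # t = 3.05, df = 8.57
```

Converting the statistic to a p-value requires the cumulative distribution function of the t-distribution, which the Python standard library does not provide.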

A chi-square test was performed to assess the association between resection margin status and metastasis at diagnosis. A multivariable linear regression model was constructed to evaluate the relationship between Ki-67 index and potential predictors, including tumor size, age, and N stage.

Finally, a multivariable logistic regression model was constructed using age, tumor size, and Ki-67 index to explore their joint association with metastatic disease at diagnosis. This reduced model was chosen based on clinical relevance and model stability [27]. A separate full model including all available predictors was attempted but failed to converge due to complete separation among certain categorical features. All statistical tests were two-sided, and p-values < 0.05 were considered statistically significant.

2.3. Artificial Intelligence

To develop a clinically relevant predictive model of metastatic disease at diagnosis, we trained supervised classifiers using the Random Forest algorithm [28]. This approach was selected due to its resilience to multicollinearity, ability to model nonlinear interactions, and effectiveness in small-to-moderate sample sizes typical of rare cancer datasets [24,26].

Ten predictor variables were used: age, sex, race, tumor laterality, tumor size (cm), resection margin status, T stage, N stage, Ki-67 index (%), and days from imaging to diagnosis. All categorical variables were one-hot encoded, and the dataset was complete with no missing values [29].

Given the moderate class imbalance (7 metastatic vs. 46 non-metastatic cases), the Synthetic Minority Oversampling Technique (SMOTE) was employed to synthetically balance the dataset prior to model training. SMOTE interpolates new examples from the minority class to improve classifier performance and mitigate overfitting [26].
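The interpolation step that SMOTE performs can be sketched as follows. This is a minimal illustration in Python and not the implementation used in the study (which was carried out in R): the function name, the choice of k, the random seed, and the two-feature example rows are all assumptions, and the feature values are made up rather than taken from the dataset.

```python
import random

def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between minority-class rows."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Sort the remaining minority rows by squared Euclidean distance to the base row.
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)))
        neighbor = rng.choice(others[:k])  # one of the k nearest minority neighbors
        gap = rng.random()  # random position on the segment from base to neighbor
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbor)))
    return synthetic

# Hypothetical (Ki-67 %, tumor size cm) rows for 7 minority-class patients; NOT study data.
minority_rows = [(55.0, 8.1), (30.0, 12.4), (42.0, 9.0), (25.0, 6.5),
                 (60.0, 10.2), (35.0, 11.0), (28.0, 7.7)]

# 46 - 7 = 39 synthetic rows would bring the minority class up to the majority count.
synthetic_rows = smote_sketch(minority_rows, n_new=46 - 7)
print(len(synthetic_rows), "synthetic rows generated")
```

Each synthetic row lies on a line segment between two existing minority rows, so no value falls outside the range already observed in the minority class.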

Model training was conducted using repeated 10-fold cross-validation with five repetitions. Random Forest models were trained with 500 trees, and Gini impurity was used as the splitting criterion. Model performance was primarily assessed using the area under the receiver operating characteristic curve (AUC-ROC). Secondary performance metrics included sensitivity, specificity, and balanced accuracy. Variable importance was computed using permutation-based importance scores from the SMOTE-enhanced Random Forest model [30]. To directly assess the standalone predictive performance of Ki-67, we trained an additional Random Forest model using Ki-67 as the sole predictor. The same cross-validation strategy and performance metrics were applied as in the multivariable models. This single-variable model served as a baseline comparator to evaluate whether the inclusion of additional clinical and pathological features meaningfully improved discrimination.
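The resampling scheme (10 folds, five repetitions) can be sketched as an index generator. The Python sketch below is illustrative only: the function name and seed are hypothetical, and stratifying the folds by outcome is an assumption, since the text does not state whether folds were stratified.

```python
import random

def repeated_stratified_kfold(labels, k=10, repeats=5, seed=0):
    """Return a list of (train_indices, test_indices) pairs, k per repetition."""
    rng = random.Random(seed)
    splits = []
    for _ in range(repeats):
        folds = [[] for _ in range(k)]
        slot = 0
        for cls in sorted(set(labels)):
            idx = [i for i, y in enumerate(labels) if y == cls]
            rng.shuffle(idx)
            # Deal indices round-robin so each class is spread across the folds.
            for i in idx:
                folds[slot % k].append(i)
                slot += 1
        for f in range(k):
            test = sorted(folds[f])
            train = sorted(i for g in range(k) if g != f for i in folds[g])
            splits.append((train, test))
    return splits

# Label vector with the cohort's class counts: 46 non-metastatic (0), 7 metastatic (1).
cohort_labels = [0] * 46 + [1] * 7
cv_splits = repeated_stratified_kfold(cohort_labels)
print(len(cv_splits), "train/test splits")  # 50 train/test splits
```

A model would be fitted on each training index set and scored on the matching held-out set, with performance summarized over all 50 splits.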

2.4. Model Performance Comparison

To statistically compare the discriminatory performance of each classifier, we conducted DeLong’s test for paired receiver operating characteristic (ROC) curves. Pairwise comparisons were performed between the logistic regression model, the Random Forest model trained without class balancing, and the Random Forest model trained on SMOTE-balanced data. This nonparametric method evaluates whether observed differences in area under the ROC curve (AUC) are statistically significant, accounting for the paired nature of predictions on the same dataset [31]. All AUC comparisons were two-sided, with p-values less than 0.05 considered statistically significant.
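For readers unfamiliar with the mechanics of DeLong’s test, a compact sketch is given below. It is a plain-Python illustration of the standard paired formulation (structural components, their covariance, and a normal approximation), not the code used in the study. All function names are hypothetical and the labels and scores are toy values.

```python
import math

def _psi(x, y):
    """Pairwise kernel: 1 if the positive outranks the negative, 0.5 on ties, else 0."""
    if x > y:
        return 1.0
    return 0.5 if x == y else 0.0

def _components(pos, neg):
    """Structural components: one value per positive case and one per negative case."""
    v10 = [sum(_psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(_psi(x, y) for x in pos) / len(pos) for y in neg]
    return v10, v01

def _cov(a, b):
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)

def delong_test(labels, scores_a, scores_b):
    """Return (auc_a, auc_b, z, two-sided p) for two models scored on the same cases."""
    def split(scores):
        pos = [s for s, y in zip(scores, labels) if y == 1]
        neg = [s for s, y in zip(scores, labels) if y == 0]
        return pos, neg
    v10_a, v01_a = _components(*split(scores_a))
    v10_b, v01_b = _components(*split(scores_b))
    m, n = len(v10_a), len(v01_a)
    auc_a, auc_b = sum(v10_a) / m, sum(v10_b) / m
    # Variance of the AUC difference, including the covariance between the paired curves.
    var_diff = ((_cov(v10_a, v10_a) + _cov(v10_b, v10_b) - 2 * _cov(v10_a, v10_b)) / m
                + (_cov(v01_a, v01_a) + _cov(v01_b, v01_b) - 2 * _cov(v01_a, v01_b)) / n)
    if var_diff <= 0:
        raise ValueError("AUC difference has zero estimated variance")
    z = (auc_a - auc_b) / math.sqrt(var_diff)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return auc_a, auc_b, z, p_value

# Toy labels and predicted probabilities from two hypothetical models; NOT study data.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_a = [0.9, 0.8, 0.7, 0.35, 0.6, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_b = [0.6, 0.3, 0.8, 0.2, 0.5, 0.7, 0.4, 0.1, 0.25, 0.15]
auc_a, auc_b, z, p_value = delong_test(toy_labels, toy_a, toy_b)
print(f"AUC A = {auc_a:.3f}, AUC B = {auc_b:.3f}")  # AUC A = 0.917, AUC B = 0.667
```

The covariance terms are what make the comparison paired: both models are scored on the same cases, so their AUC estimates are correlated.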

2.5. Model Evaluation Metrics

For each model, we evaluated discrimination using the area under the receiver operating characteristic curve (AUC). Optimal thresholds were determined using Youden’s Index. To estimate model performance at the chosen threshold, we calculated sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV).
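Threshold selection by Youden’s Index (J = sensitivity + specificity - 1) and the four threshold-dependent metrics can be sketched as below. This is an illustrative Python version with hypothetical function names and toy scores, assuming that a case is called positive when its score is at or above the threshold.

```python
def confusion_metrics(labels, scores, threshold):
    """Sensitivity, specificity, PPV, and NPV when score >= threshold is called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    nan = float("nan")  # returned when a denominator is zero
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else nan,
        "specificity": tn / (tn + fp) if tn + fp else nan,
        "ppv": tp / (tp + fp) if tp + fp else nan,
        "npv": tn / (tn + fn) if tn + fn else nan,
    }

def youden_threshold(labels, scores):
    """Return the observed score that maximizes J, together with J itself."""
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        m = confusion_metrics(labels, scores, t)
        j = m["sensitivity"] + m["specificity"] - 1
        if j > best_j:  # ties keep the lowest threshold
            best_t, best_j = t, j
    return best_t, best_j

# Toy labels and predicted probabilities; NOT study data.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.35, 0.6, 0.4, 0.3, 0.2, 0.1, 0.05]
best_t, best_j = youden_threshold(toy_labels, toy_scores)
metrics_at_best = confusion_metrics(toy_labels, toy_scores, best_t)
print(best_t, round(best_j, 2), metrics_at_best)
```

On the toy data the selected threshold is 0.7 (J = 0.75), where sensitivity is 0.75 and specificity is 1.0.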

To derive 95% confidence intervals (CIs) for sensitivity, specificity, PPV, and NPV, we employed a nonparametric bootstrap procedure with 1000 resamples [32]. For each bootstrap iteration, predictions and true labels were sampled with replacement, binary predictions were generated at the fixed threshold from Youden’s Index, and the metric of interest was calculated. Final point estimates represent the mean across all iterations, with the 2.5th and 97.5th percentiles used to construct the confidence intervals [33]. This approach allowed for robust interval estimation across all four models: logistic regression, random forest without SMOTE, random forest with SMOTE, and random forest using Ki-67 alone.
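The percentile bootstrap described above can be sketched in a few lines. The Python version below is illustrative rather than the study’s code: function names, the seed, and the toy data are hypothetical, resamples with no positive cases are skipped (an assumption of this sketch), and percentiles use a simple nearest-rank rule.

```python
import random

def sensitivity_at(labels, scores, threshold):
    """Sensitivity at a fixed threshold; None when the sample has no positive cases."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    if not positives:
        return None
    return sum(1 for s in positives if s >= threshold) / len(positives)

def bootstrap_ci(labels, scores, threshold, n_boot=1000, seed=0):
    """Return (mean, 2.5th percentile, 97.5th percentile) of bootstrapped sensitivity."""
    rng = random.Random(seed)
    n = len(labels)
    values = []
    for _ in range(n_boot):
        # Resample cases with replacement, keeping each label paired with its score.
        idx = [rng.randrange(n) for _ in range(n)]
        value = sensitivity_at([labels[i] for i in idx],
                               [scores[i] for i in idx], threshold)
        if value is not None:
            values.append(value)
    values.sort()
    mean = sum(values) / len(values)
    lower = values[int(0.025 * len(values))]
    upper = values[int(0.975 * len(values)) - 1]
    return mean, lower, upper

# Toy data: 6 positive and 14 negative cases with made-up scores; NOT study data.
toy_labels = [1] * 6 + [0] * 14
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2] + [0.1] * 12 + [0.65, 0.75]
boot_mean, boot_lower, boot_upper = bootstrap_ci(toy_labels, toy_scores, threshold=0.5)
print(f"sensitivity = {boot_mean:.3f} (95% CI: {boot_lower:.3f}-{boot_upper:.3f})")
```

The same routine applies to specificity, PPV, or NPV by swapping in the corresponding metric function.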

3. Results

3.1. Descriptive Characteristics

This study included 53 patients with histopathologically confirmed adrenocortical carcinoma (ACC), each of whom underwent surgical resection. The study aimed to assess whether Ki-67 index and other clinical and pathological variables could predict the presence of metastasis at diagnosis.

Among the cohort, 7 patients (13.2%) presented with metastatic disease, and 46 (86.8%) were non-metastatic at the time of diagnosis. The mean age was 52.8 years (SD: 13.5), and 58.5% of the cohort were female.

The mean tumor size was 11.5 cm (SD: 6.5), ranging from 3.2 to 33.0 cm. Patients with metastatic disease tended to have slightly smaller tumors (mean: 9.4 cm) compared to those without metastasis (mean: 11.9 cm). The mean Ki-67 proliferation index was 23.9% (SD: 18.4), with significantly higher levels in metastatic patients (mean: 39.4%) than in non-metastatic patients (mean: 21.6%).

In terms of resection margins, 35 patients (66.0%) had R0 (negative) margins, 10 (18.9%) had R1 (microscopically positive) margins, and 8 (15.1%) had RX margins. The RX category indicates cases in which the resection margin status was not clearly documented in the pathology report and was therefore classified as unknown. Tumors were right-sided in 24 patients (45.3%) and left-sided in 29 (54.7%), with no significant laterality difference between groups.

T-stage distribution favored T2 and T3 tumors overall. Nodal staging revealed N1 disease in 3 of 7 metastatic patients (42.9%) compared to just 1 of 46 non-metastatic patients (2.2%). Descriptive characteristics stratified by metastasis status are summarized below in Table 1.

Table 1. Baseline demographic, clinical, and tumor characteristics of 53 patients with adrenocortical carcinoma (ACC), stratified by the presence of metastatic disease at diagnosis. Values are reported as mean ± standard deviation for continuous variables and counts with percentages for categorical variables.

| Variable | Non-Metastatic (n = 46) | Metastatic (n = 7) | Total (n = 53) |
|---|---|---|---|
| Age, years (mean ± SD) | 53.4 ± 13.6 | 48.9 ± 13.2 | 52.8 ± 13.5 |
| Sex, n (%) | | | |
| Female | 27 (58.7) | 4 (57.1) | 31 (58.5) |
| Male | 19 (41.3) | 3 (42.9) | 22 (41.5) |
| Race, n (%) | | | |
| White | 36 (78.3) | 5 (71.4) | 41 (77.4) |
| Black | 3 (6.5) | 1 (14.3) | 4 (7.5) |
| Hispanic or Latino | 5 (10.9) | 1 (14.3) | 6 (11.3) |
| Asian | 2 (4.3) | 0 (0.0) | 2 (3.8) |
| Laterality, n (%) | | | |
| Right | 19 (41.3) | 5 (71.4) | 24 (45.3) |
| Left | 27 (58.7) | 2 (28.6) | 29 (54.7) |
| Tumor Size, cm (mean ± SD) | 11.9 ± 6.8 | 9.4 ± 2.8 | 11.5 ± 6.5 |
| Ki-67 Index, % (mean ± SD) | 21.6 ± 16.1 | 39.4 ± 14.1 | 23.9 ± 18.4 |
| Resection Margin, n (%) | | | |
| R0 (Negative) | 33 (71.7) | 2 (28.6) | 35 (66.0) |
| R1 (Positive) | 7 (15.2) | 3 (42.9) | 10 (18.9) |
| RX (Unknown) | 6 (13.0) | 2 (28.6) | 8 (15.1) |
| T Staging, n (%) | | | |
| T1 | 4 (8.7) | 0 (0.0) | 4 (7.5) |
| T2 | 20 (43.5) | 0 (0.0) | 20 (37.7) |
| T3 | 19 (41.3) | 5 (71.4) | 24 (45.3) |
| T4 | 3 (6.5) | 2 (28.6) | 5 (9.4) |
| N Staging, n (%) | | | |
| N0 | 43 (93.5) | 3 (42.9) | 46 (86.8) |
| N1 | 1 (2.2) | 3 (42.9) | 4 (7.5) |
| NX | 2 (4.3) | 1 (14.3) | 3 (5.7) |
| Days to Diagnosis (mean ± SD) | 34.7 ± 41.7 | 34.9 ± 17.1 | 34.7 ± 39.4 |

3.2. Association Between Ki-67 and Clinical Variables

To assess whether the Ki-67 proliferation index was associated with demographic or anatomical variables, a multivariable linear regression model was constructed that included tumor size, patient age, and N stage as predictors. The model was not statistically significant overall (adjusted R² < 0), and none of the included variables showed a significant association with Ki-67 expression (p > 0.05 for all). This suggests that Ki-67 may reflect a biologically distinct feature of tumor aggressiveness, independent of tumor size or patient age.

Pearson correlation analysis revealed no significant relationship between tumor size and Ki-67 index (r = -0.02, p = 0.90), further supporting the independence of proliferative activity from anatomical tumor burden.

3.3. Comparison of Ki-67 Index by Metastasis and Other Clinical Features

A Welch’s two-sample t-test demonstrated that patients with metastatic disease at diagnosis had significantly higher Ki-67 indices (mean: 39.4%) compared to non-metastatic patients (mean: 21.6%) (p = 0.014). This finding supports Ki-67 as a potential marker of systemic disease at presentation. Figure 1 illustrates the distribution of Ki-67 by metastatic status.

[Figure 1 image: bar chart titled "Ki-67 Proliferation Index by Metastasis Status". x-axis: Metastasis at Diagnosis (No Metastasis, n = 46; Metastasis, n = 7); y-axis: Mean Ki-67 Index (%), 0 to 60; * marks the significant difference.]

Figure 1. Mean Ki-67 proliferation index in patients with and without metastasis at diagnosis. Patients with metastatic disease had significantly higher Ki-67 indices (mean 39.4%, SD 14.1%) than those without metastasis (mean 21.6%, SD 16.1%) (p = 0.014, Welch’s t-test). Bars represent means ± 1 standard deviation. Note: * p < 0.05 indicates a statistically significant difference.

One-way ANOVA showed no statistically significant differences in Ki-67 expression by resection margin status (p = 0.703), laterality (p = 0.925), T staging (p = 0.76), N staging (p = 0.484), sex (p = 0.821), or race (p = 0.457). A chi-square test examining the association between resection margin status and metastasis approached statistical significance (p = 0.076), suggesting a potential trend that warrants further investigation in larger cohorts.
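The chi-square result for resection margin and metastasis can be checked from the counts in Table 1. The sketch below is an illustrative Python calculation (the study used R) of the Pearson statistic for the 3 × 2 table. With 2 degrees of freedom the chi-square survival function reduces to exp(-x/2), so no statistics library is needed.

```python
import math

# Observed counts from Table 1: margin status -> (non-metastatic, metastatic)
observed = {"R0": (33, 2), "R1": (7, 3), "RX": (6, 2)}

row_totals = {k: sum(v) for k, v in observed.items()}
col_totals = [sum(v[c] for v in observed.values()) for c in range(2)]
grand_total = sum(col_totals)

chi2 = 0.0
for margin, counts in observed.items():
    for c, obs in enumerate(counts):
        expected = row_totals[margin] * col_totals[c] / grand_total
        chi2 += (obs - expected) ** 2 / expected

dof = (len(observed) - 1) * (2 - 1)  # (rows - 1) * (columns - 1) = 2
p_value = math.exp(-chi2 / 2)        # exact chi-square upper tail when dof == 2

print(f"chi2 = {chi2:.3f}, dof = {dof}, p = {p_value:.3f}")  # chi2 = 5.145, dof = 2, p = 0.076
```

Under these assumptions the calculation appears to reproduce the reported p = 0.076.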

3.4. Logistic Regression Model for Metastasis

A multivariable logistic regression model including the Ki-67 index, tumor size, and age was constructed to evaluate predictors of metastatic disease at diagnosis. Among these, only Ki-67 was statistically significant (OR = 1.06, 95% CI: 1.01-1.12, p < 0.05), indicating that each one-percentage-point increase in Ki-67 was associated with a 6% increase in the odds of presenting with metastasis. Tumor size and age were not significant predictors (p > 0.05). This association is visually supported by Figure 1, which illustrates markedly higher Ki-67 indices in patients with metastatic disease. The logistic model demonstrated good discriminatory performance with an AUC of 0.84.
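Because a logistic regression coefficient acts multiplicatively on the odds, the per-point odds ratio can be rescaled to larger Ki-67 differences. The short Python sketch below illustrates this using the rounded OR of 1.06 reported above, so the outputs are approximate; the function name is hypothetical.

```python
import math

or_per_point = 1.06             # reported odds ratio per 1-percentage-point increase in Ki-67
beta = math.log(or_per_point)   # implied log-odds coefficient (about 0.058)

def odds_ratio_for_increase(points):
    """Odds ratio implied by a Ki-67 increase of `points` percentage points."""
    return math.exp(beta * points)

or_10 = odds_ratio_for_increase(10)
print(f"OR for +10 points = {or_10:.2f}")  # OR for +10 points = 1.79
```

Applying the same rescaling to the reported confidence limits (1.01 and 1.12) gives roughly 1.10 to 3.11 for a 10-point increase.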

A full logistic regression model including all available predictors was attempted but failed to converge due to quasi-complete separation in several categorical variables, consistent with limitations in small-sample binary modeling.

3.5. Random Forest Classifier with and Without SMOTE

To improve prediction and account for class imbalance, a supervised classification model was trained using Random Forest. The final model incorporated ten predictors: age, sex, race, tumor laterality, tumor size, resection margin, T stage, N stage, Ki-67 index, and time from imaging to diagnosis. Categorical variables were one-hot encoded [27].

A Random Forest model trained without oversampling achieved an AUC of 0.793, sensitivity of 57.2%, and specificity of 83.6%. After applying SMOTE, performance improved substantially, with an AUC of 0.994, sensitivity of 94.3%, and specificity of 97.4%. In comparison, the logistic regression model yielded an AUC of 0.722, sensitivity of 45.8%, and specificity of 81.8%.

To further evaluate whether the inclusion of clinical and pathological variables improved model performance beyond Ki-67 alone, we trained a baseline Random Forest model using only the Ki-67 index. This single-variable Ki-67 model yielded an AUC of 0.660, with a sensitivity of 37.4% and specificity of 78.9%. These results demonstrate that Ki-67 alone provides limited predictive accuracy, supporting the added discriminatory value of including features such as staging, resection margin, and tumor laterality alongside Ki-67.

Receiver operating characteristic (ROC) curves for all four models are displayed in Figure 2. Pairwise DeLong tests confirmed that the SMOTE-enhanced Random Forest significantly outperformed all other models, including the non-SMOTE Random Forest (p < 0.001), logistic regression (p < 0.001), and the Ki-67-only model (p < 0.001). The non-SMOTE Random Forest significantly outperformed the Ki-67-only model (p < 0.05), but not the logistic regression model (p = 0.297). The performance difference between logistic regression and the Ki-67-only model approached statistical significance (p = 0.062), suggesting marginally better discrimination by the logistic model. All model performance metrics, including AUC variability, PPV, and NPV, are summarized in Table 2.

The SMOTE-enhanced Random Forest model not only improved predictive performance but also provided insights into the relative importance of predictors. As shown in Figure 3, the Ki-67 index was the most influential variable, contributing the highest mean decrease in Gini impurity, followed by N staging (N1), resection margin (R1), laterality (right), and T staging (T2). Variables such as tumor size, age, sex, and race had lower importance scores, indicating a lesser role in distinguishing metastatic from non-metastatic cases. These findings underscore the critical role of Ki-67 as a marker of metastatic potential in this cohort.
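To make the "decrease in Gini impurity" concrete, the sketch below computes it for a single hypothetical split. It is a Python illustration with hypothetical function names. The counts are the unbalanced N1 versus non-N1 counts from Table 1, used only as an example; a Random Forest averages such decreases over many trees and splits, and the reported model was trained on SMOTE-balanced data.

```python
def gini(counts):
    """Gini impurity of a node given its class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def gini_decrease(parent, children):
    """Parent impurity minus the size-weighted impurity of the child nodes."""
    n = sum(parent)
    weighted = sum(sum(child) / n * gini(child) for child in children)
    return gini(parent) - weighted

# Counts are (non-metastatic, metastatic), taken from Table 1.
parent_node = (46, 7)
child_nodes = [(1, 3), (45, 4)]  # N1 patients vs. all other patients (N0 or NX)

decrease = gini_decrease(parent_node, child_nodes)
print(f"parent impurity = {gini(parent_node):.3f}, decrease = {decrease:.3f}")
# parent impurity = 0.229, decrease = 0.062
```

A predictor that produces large impurity decreases across many splits receives a high importance score.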

Figure 2. Receiver operating characteristic (ROC) curves for metastasis prediction using four models: Logistic Regression (blue), Random Forest using Ki-67 only (purple), Random Forest without SMOTE (orange), and Random Forest with SMOTE (green). The SMOTE-enhanced Random Forest model achieved the highest AUC (0.994), followed by RF without SMOTE (0.708), Logistic Regression (0.637), and RF with Ki-67 alone (0.509). DeLong tests showed statistically significant improvements in AUC for the SMOTE-enhanced Random Forest model over RF without SMOTE, logistic regression, and the Ki-67-only model (p < 0.001 for all). RF without SMOTE also outperformed the Ki-67-only model (p < 0.05), whereas its difference from logistic regression was not statistically significant (p = 0.297). The diagonal dashed line represents the line of no-discrimination (AUC = 0.5), corresponding to random chance.

[Figure 2 image: line plot titled "ROC Curve Comparison for Metastasis Prediction Models". x-axis: False Positive Rate (1 - Specificity); y-axis: True Positive Rate (Sensitivity). Legend: Logistic Regression (AUC = 0.637); RF (Ki-67 Only) (AUC = 0.509); RF (No SMOTE) (AUC = 0.708); RF (SMOTE) (AUC = 0.994).]

Table 2. Performance metrics of predictive models for metastasis in ACC. Each model's discrimination ability is summarized using area under the ROC curve (AUC), along with sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), all reported with bootstrapped 95% confidence intervals. The models evaluated include logistic regression, Random Forest using only Ki-67 as a predictor, Random Forest without class balancing, and Random Forest with SMOTE. The SMOTE-enhanced Random Forest demonstrated superior predictive performance across all metrics. Conversely, the Ki-67 only model showed limited discrimination, emphasizing the benefit of integrating additional clinical and pathological variables. Sensitivity, specificity, PPV, and NPV are shown as proportions.

| Model | AUC (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) | PPV (95% CI) | NPV (95% CI) |
|---|---|---|---|---|---|
| Logistic Regression | 0.722 (0.644-0.790) | 0.458 (0.286-0.625) | 0.818 (0.767-0.869) | 0.275 (0.162-0.385) | 0.909 (0.867-0.945) |
| RF (Ki-67 Only) | 0.660 (0.577-0.743) | 0.374 (0.229-0.531) | 0.789 (0.734-0.845) | 0.240 (0.138-0.343) | 0.876 (0.829-0.923) |
| RF (no SMOTE) | 0.793 (0.726-0.861) | 0.572 (0.405-0.731) | 0.836 (0.788-0.882) | 0.345 (0.229-0.468) | 0.928 (0.892-0.959) |
| RF + SMOTE | 0.994 (0.990-0.998) | 0.943 (0.911-0.972) | 0.974 (0.951-0.991) | 0.971 (0.946-0.990) | 0.949 (0.920-0.975) |

Figure 3. Variable importance in a SMOTE-Enhanced Random Forest Model for Metastasis Prediction. The bar plot displays the mean decrease in Gini impurity for each predictor, with the Ki-67 index showing the highest importance, followed by N staging (N1), resection margin (R1), laterality (right), and T staging (T2). Lower importance scores were observed for variables such as tumor size, age, sex, and race. The top three predictors are shown in a darker shade to highlight their relatively greater contribution to the model.

[Figure 3 image: horizontal bar chart titled "Variable Importance in Random Forest Model: Mean Decrease in Gini Impurity". x-axis: Importance Score (0 to 15); y-axis: Predictor Variable. Predictors from highest to lowest importance: Ki-67 Index; N Staging (N1); Resection Margin (R1); Laterality (Right); T Staging (T2); Days to Diagnosis; T Staging (T3); Resection Margin (RX); Tumor Size; T Staging (T4); Race (Black); N Staging (NX); Race (White); Age; Sex (Male); Race (Hispanic/Latino).]

4. Discussion

4.1. Principal Findings

This study underscores the pivotal role of the Ki-67 proliferation index in predicting metastatic disease at the time of adrenocortical carcinoma (ACC) diagnosis, offering a transformative approach to risk stratification through the integration of statistical and machine learning methodologies. Analysis of the dataset revealed that Ki-67 is a dominant biomarker for identifying patients at high risk of metastasis, surpassing the predictive utility of clinical variables such as tumor size and patient age [18,27]. Traditional logistic regression confirmed Ki-67’s independent association with metastatic status, while a SMOTE-enhanced Random Forest model demonstrated exceptional discriminatory power, highlighting Ki-67’s primacy among predictors [34-36]. This dual approach not only validates Ki-67’s prognostic significance but also showcases the potential of advanced computational techniques to refine clinical decision-making in rare cancers [37,38]. These findings pave the way for personalized ACC management, enabling tailored imaging surveillance and minimally invasive interventional strategies to optimize patient outcomes [21].

4.2. Comparison with Prior Literature

Previous research has consistently identified the Ki-67 proliferation index as a critical prognostic marker in adrenocortical carcinoma (ACC), with thresholds of 10-20% linked to increased recurrence and reduced survival [8,9,11,14,39]. Studies have shown that Ki-67 levels ≥10% are associated with poorer outcomes in localized ACC following surgical resection [10], while its histoprognostic significance extends to oncocytic adrenal tumors [14]. Additionally, Ki-67’s prognostic impact has been demonstrated across both adult and pediatric ACC populations, as well as in other endocrine neoplasms, underscoring its broader relevance [11,39]. However, these studies predominantly employed Cox regression or Kaplan-Meier analyses, focusing on long-term outcomes like recurrence or overall survival rather than metastasis at initial diagnosis [13].

In contrast, radiomic approaches have explored preoperative prediction of Ki-67 expression using contrast-enhanced CT imaging, achieving moderate accuracy but facing challenges in clinical application due to variability in Ki-67 scoring and lack of standardized cutoff thresholds [19]. The present study advances this field by specifically evaluating Ki-67’s predictive capacity for metastatic disease at ACC presentation. By employing a Random Forest classifier with Synthetic Minority Oversampling Technique (SMOTE) to address class imbalance (7 metastatic vs. 46 non-metastatic cases), our approach integrates clinical variables such as age and tumor size, enhancing risk stratification beyond traditional statistical models [24,38].

A key barrier to translating Ki-67-based predictions into clinical practice is the variability in immunohistochemical scoring, driven by inconsistent cutoffs and differences in morphometric techniques [12,19]. This variability limits the integration of Ki-67 into standardized imaging or interventional protocols. Our study mitigates these challenges by leveraging machine learning to achieve exceptional predictive accuracy, providing a robust framework to inform imaging surveillance strategies and guide minimally invasive interventions, such as robotic adrenalectomy, for high-risk ACC patients [21]. This data-driven approach marks a significant step toward precision oncology, distinguishing our work from prior efforts focused on prognostic rather than predictive applications of Ki-67.

4.3. Methodological Considerations

The inclusion of tumor size and age alongside Ki-67 in predictive models, despite their non-significant associations with metastasis, was guided by their clinical relevance in ACC staging and surgical planning [3,13]. Incorporating these routinely collected variables allowed us to evaluate Ki-67’s incremental predictive value and ensured model robustness across diverse clinical contexts [27]. Pearson correlation analysis confirmed no significant relationship between Ki-67 and tumor size (r = -0.02, p = 0.90) or age (p > 0.05), supporting Ki-67’s independent biological role as a marker of tumor aggressiveness [35].

To address the class imbalance in the dataset (7 metastatic vs. 46 non-metastatic cases), the Synthetic Minority Oversampling Technique (SMOTE) was applied, improving the Random Forest model’s area under the receiver operating characteristic curve (AUC) from 0.793 (95% CI: 0.726-0.861) to 0.994 (95% CI: 0.990-0.998) across 10-fold cross-validation repeated five times (p < 0.001, DeLong’s test) [24,35,37]. A non-SMOTE Random Forest model was also evaluated to confirm that SMOTE’s resampling, rather than the Random Forest algorithm alone, drove the performance gain [37]. The Random Forest classifier was constructed with 500 trees, using Gini impurity as the splitting criterion and no constraints on tree depth to maximize flexibility given the small sample size [37]. AUC was selected as the primary performance metric due to its robustness in evaluating binary classifiers on imbalanced datasets [36]. To capture clinical relevance, secondary metrics included sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), each reported with bootstrapped 95% confidence intervals. This approach provides a comprehensive and reliable assessment of model performance variability across key diagnostic thresholds [40].

4.4. Clinical Implications

Our findings highlight Ki-67’s integrative role in optimizing imaging and interventional strategies for ACC. Patients with high Ki-67 indices (≥20%) are at elevated risk of metastasis, warranting intensified preoperative imaging, such as PET-CT or contrast-enhanced CT, to detect occult metastatic disease and guide surgical planning [9,18]. Postoperatively, Ki-67 can inform surveillance protocols, with high-risk patients benefiting from more frequent imaging to monitor for recurrence [17]. In terms of interventions, identifying high-risk patients early enables tailored approaches, such as minimally invasive or robotic adrenalectomy, which reduce morbidity compared to open surgery [21]. Moreover, patients with elevated Ki-67 may benefit from early adjuvant therapies, such as mitotane, radiation therapy, or enrollment in clinical trials, to improve prognosis [4,9]. The superior performance of the SMOTE-enhanced Random Forest model suggests that machine learning can augment traditional staging systems, enabling data-driven patient stratification [12,23]. This approach aligns with precision oncology by integrating histopathological data with clinical decision-making, potentially reducing unnecessary interventions in low-risk patients while prioritizing aggressive management in high-risk cases [41,42].

4.5. Limitations

The study’s sample size (n = 53) limits its generalizability, although SMOTE was used to address class imbalance and repeated cross-validation to reduce the risk of overfitting [24]. Variability in Ki-67 scoring across institutions may introduce measurement error, necessitating standardized protocols [12]. While the Adrenal-ACC-Ki67-Seg dataset includes preoperative CT imaging, we did not incorporate radiomic features, which could enhance predictive accuracy [19,20]. External validation in larger, multi-institutional cohorts is needed to confirm our findings before clinical adoption [43].

4.6. Future Directions

Future research should validate Ki-67’s predictive role in diverse cohorts and integrate it with radiomic and genomic data to develop multimodal models [8,41,44]. For the Special Issue, combining Ki-67 with imaging biomarkers from contrast-enhanced CT could improve preoperative risk assessment, guiding the selection of minimally invasive techniques [20]. Longitudinal studies examining Ki-67’s association with treatment response and recurrence will further elucidate its role across the ACC continuum [4]. Ultimately, integrating histologic, radiomic, and molecular data into AI-driven models could personalize ACC management, optimizing imaging, interventions, and adjuvant therapies [12,23].

5. Conclusions

The Ki-67 proliferation index is a powerful, independent predictor of metastatic disease at ACC diagnosis, offering actionable insights for imaging and interventional strategies. Its significant association with metastasis (OR = 1.06, p < 0.05) and dominance in the SMOTE-enhanced Random Forest model (AUC = 0.994) highlight its potential to refine risk stratification beyond traditional markers such as tumor size and age [36,38]. For the Special Issue, Ki-67 can guide intensified imaging surveillance, such as PET-CT, for high-risk patients and inform the selection of minimally invasive or robotic adrenalectomy to minimize morbidity [18,21]. Early identification of high Ki-67 levels (≥20%) also supports timely initiation of adjuvant therapies, such as mitotane or radiation, and enrollment in clinical trials to improve outcomes [4,9].

Machine learning enhances predictive accuracy, with the SMOTE-enhanced Random Forest model significantly outperforming logistic regression (p < 0.001), highlighting its utility in rare cancers like ACC [37]. The model’s high sensitivity (94.3%) and specificity (97.4%) support its potential clinical applicability for early risk stratification, aligning with the goals of precision oncology [41]. Future models integrating Ki-67 with radiomic and genomic data could further personalize care, optimizing imaging protocols and interventional techniques [20,44]. By reimagining how biomarkers like Ki-67 are integrated into clinical pathways, this study paves the way for data-driven, minimally invasive management of ACC, potentially transforming patient outcomes.

Author Contributions: A.J.G. and D.S.Y. contributed to the conception and design of the study. A.J.G. was responsible for data curation, formal analysis, methodology, software, investigation, and visualization. A.J.G. also drafted the original manuscript and was responsible for the references. A.J.G. and D.S.Y. contributed to manuscript review and editing. D.S.Y. provided project supervision and critical review of the final manuscript. All authors have read and agreed to the published version of the manuscript.

Funding: This research received no external funding.

Institutional Review Board Statement: This study utilized fully de-identified clinical and imaging data from The Cancer Imaging Archive (TCIA), a publicly funded repository supported by the U.S. National Cancer Institute. All TCIA datasets are de-identified in accordance with the HIPAA Safe Harbor Method and collected under protocols approved by the Institutional Review Boards (IRBs) of the contributing institutions. As such, this secondary analysis of de-identified data was exempt from additional IRB review.

Informed Consent Statement: Patient consent was waived because all data used in this study were fully de-identified and obtained from The Cancer Imaging Archive (TCIA).

Data Availability Statement: The data supporting the findings of this study are publicly available through The Cancer Imaging Archive (TCIA) under the dataset entitled Adrenal-ACC-Ki67-Seg | Voxel-level segmentation of pathologically-proven Adrenocortical carcinoma with Ki-67 expression. The dataset can be accessed via the following DOI: https://doi.org/10.7937/1FPG-VM46 [18].

Acknowledgments: The authors would like to acknowledge The Cancer Imaging Archive (TCIA) and the institutions involved in curating and hosting the Adrenal-ACC-Ki67-Seg dataset. We also thank the faculty and staff at California Northstate University College of Medicine for their support throughout the study.

Conflicts of Interest: The authors declare no conflicts of interest.

References

1. Sharma, E.; Dahal, S.; Sharma, P.; Bhandari, A.; Gupta, V.; Amgai, B.; Dahal, S. The characteristics and trends in adrenocortical carcinoma: A United States population-based study. J. Clin. Med. Res. 2018, 10, 636-640. [CrossRef] [PubMed]

2. Rowell, N.P. Oncological Management of Adrenocortical Carcinoma: An Update and Critical Review. Oncol. Ther. 2025, 13, 307-323. [CrossRef] [PubMed]

3. Clay, M.R.; Pinto, E.M.; Fishbein, L.; Else, T.; Kiseljak-Vassiliades, K. Pathological and genetic stratification for management of adrenocortical carcinoma. J. Clin. Endocrinol. Metab. 2022, 107, 1159-1169. [CrossRef] [PubMed]

4. Thampi, A.; Shah, E.; Elshimy, G.; Correa, R. Adrenocortical carcinoma: A literature review. Transl. Cancer Res. 2020, 9, 1253-1264. [CrossRef]

5. Fassnacht, M.; Puglisi, S.; Kimpel, O.; Terzolo, M. Adrenocortical Carcinoma: A Practical Guide for Clinicians. Lancet Diabetes Endocrinol. 2025, 13, 438-452. [CrossRef]

6. Shariq, O.A.; Mckenzie, T.J. Adrenocortical Carcinoma: Current State of the Art, Ongoing Controversies, and Future Directions in Diagnosis and Treatment. Ther. Adv. Chronic Dis. 2021, 12, 20406223211033103. [CrossRef]

7. Rossi, L.; Becucci, C.; Ambrosini, C.E.; Puccini, M.; Vasquez, M.C.; Gjeloshi, B.; Materazzi, G. Surgical Management of Adrenocortical Carcinoma: A Literature Review. J. Clin. Med. 2022, 11, 5754. [CrossRef]

8. Angelousi, A.; Kyriakopoulos, G.; Athanasouli, F.; Dimitriadi, A.; Kassi, E.; Aggeli, C.; Zografos, G.; Kaltsas, G. The Role of Immunohistochemical Markers for the Diagnosis and Prognosis of Adrenocortical Neoplasms. J. Pers. Med. 2021, 11, 208. [CrossRef]

9. Dojcinovic, T.; Tomsic, K.Z.; Vodanovic, I.D.; Dusek, T.; Kraljevic, I.; Nekic, A.B.; Polovina, T.S.; Knezevic, N.; Alduk, A.M.; Golubic, Z.A.; et al. Treatment outcomes in patients with recurrent adrenocortical carcinoma. Endocr. Res. 2025, 50, 43-49. [CrossRef]

10. Beuschlein, F.; Weigel, J.; Saeger, W.; Kroiss, M.; Wild, V.; Daffara, F.; Libé, R.; Ardito, A.; Al Ghuzlan, A.; Papotti, M.; et al. Major Prognostic Role of Ki-67 in Localized Adrenocortical Carcinoma After Complete Resection. J. Clin. Endocrinol. Metab. 2015, 100, 841-849. [CrossRef]

11. Martins-Filho, S.N.; Almeida, M.Q.; Soares, I.; Wakamatsu, A.; Alves, V.A.F.; Fragoso, M.C.B.V.; Zerbini, M.C.N. Clinical Impact of Pathological Features Including the Ki-67 Labeling Index on Diagnosis and Prognosis of Adult and Pediatric Adrenocortical Tumors. Endocr. Pathol. 2021, 32, 288-300. [CrossRef] [PubMed]

12. Ciaramella, P.D.; Vertemati, M.; Petrella, D.; Bonacina, E.; Grossrubatscher, E.; Duregon, E.; Volante, M.; Papotti, M.; Loli, P. Analysis of histological and immunohistochemical patterns of benign and malignant adrenocortical tumors by computerized morphometry. Pathol. Res. Pract. 2017, 213, 815-823. [CrossRef] [PubMed]

13. Libé, R.; Borget, I.; Ronchi, C.L.; Zaggia, B.; Kroiss, M.; Kerkhofs, T.; Bertherat, J.; Volante, M.; Quinkler, M.; Chabre, O.; et al. Prognostic factors in stage III-IV adrenocortical carcinomas (ACC): An European Network for the Study of Adrenal Tumor (ENSAT) study. Ann. Oncol. 2015, 26, 2119-2125. [CrossRef] [PubMed]

14. Renaudin, K.; Smati, S.; Wargny, M.; Al Ghuzlan, A.; Aubert, S.; Leteurtre, E.; Patey, M.; Sibony, M.; Sturm, N.; Tissier, F.; et al. Clinicopathological Description of 43 Oncocytic Adrenocortical Tumors: Importance of Ki-67 in Histoprognostic Evaluation. Mod. Pathol. 2018, 31, 1708-1716. [CrossRef]

15. Wei, D.-M.; Chen, W.-J.; Meng, R.-M.; Zhao, N.; Zhang, X.-Y.; Liao, D.-Y.; Chen, G. Augmented Expression of Ki-67 Is Correlated with Clinicopathological Characteristics and Prognosis for Lung Cancer Patients: An Up-Dated Systematic Review and Meta-Analysis with 108 Studies and 14,732 Patients. Respir. Res. 2018, 19, 150. [CrossRef]

16. Song, C.; Chen, J.; Zhao, C.; Song, S.; Yang, T.; Huang, A.; Liu, R.; Pan, Y.; Xu, C.; Chen, C.; et al. Prediction of Ki-67 Expression in HIV-Associated Lung Adenocarcinoma Patients Using Multiple Machine Learning Models Based on CT Imaging Radiomics. Cancer Manag. Res. 2025, 17, 881-892. [CrossRef]

17. Prior, F.; Smith, K.; Sharma, A.; Kirby, J.; Tarbox, L.; Clark, K.; Bennett, W.; Nolan, T.; Freymann, J. The public cancer radiology imaging collections of The Cancer Imaging Archive. Sci. Data 2017, 4, 170124. [CrossRef]

18. Moawad, A.W.; Ahmed, A.A.; ElMohr, M.; Eltaher, M.; Habra, M.A.; Fisher, S.; Perrier, N.; Zhang, M.; Fuentes, D.; Elsayes, K. Voxel-Level Segmentation of Pathologically-Proven Adrenocortical Carcinoma with Ki-67 Expression (Adrenal-ACC-Ki67-Seg) [Data Set]. Cancer Imaging Arch. 2023. [CrossRef]

19. Ahmed, A.A.; Elmohr, M.M.; Fuentes, D.; Boldrini, L.; Cusumano, D.; Issa, M.Y.; Elsayes, K.M.; Elshafie, M.L. Radiomic Mapping Model for Prediction of Ki-67 Expression in Adrenocortical Carcinoma. Clin. Radiol. 2020, 75, 479.e17-479.e22. [CrossRef]

20. Li, C.; Fu, Y.; Yi, X.; Zhang, Y.; Zhang, X.; Wu, P. Application of radiomics in adrenal incidentaloma: A literature review. Discov. Oncol. 2022, 13, 112. [CrossRef]

21. Ferro, A.; Bottosso, M.; Dieci, M.V.; Scagliori, E.; Miglietta, F.; Aldegheri, V.; Bonanno, L.; Caumo, F.; Guarneri, V.; Griguolo, G.; et al. Clinical Applications of Radiomics and Deep Learning in Breast and Lung Cancer: A Narrative Literature Review on Current Evidence and Future Perspectives. Crit. Rev. Oncol. Hematol. 2024, 203, 104479. [CrossRef] [PubMed]

22. Esteva, A.; Robicquet, A.; Ramsundar, B.; Kuleshov, V.; DePristo, M.; Chou, K.; Cui, C.; Corrado, G.; Thrun, S.; Dean, J. A guide to deep learning in healthcare. Nat. Med. 2019, 25, 24-29. [CrossRef] [PubMed]

23. Hosny, A.; Parmar, C.; Quackenbush, J.; Schwartz, L.H.; Aerts, H.J.W.L. Artificial intelligence in radiology. Nat. Rev. Cancer 2018, 18, 500-510. [CrossRef]

24. Imani, M.; Beikmohammadi, A.; Arabnia, H.R. Comprehensive Analysis of Random Forest and XGBoost Performance with SMOTE, ADASYN, and GNUS Under Varying Imbalance Levels. Technologies 2025, 13, 88. [CrossRef]

25. Wang, J.; Zeng, Z.; Li, Z.; Liu, G.; Zhang, S.; Luo, C.; Hu, S.; Wan, S.; Zhao, L. The clinical application of artificial intelligence in cancer precision treatment. J. Transl. Med. 2025, 23, 120. [CrossRef]

26. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic Minority Over-Sampling Technique. J. Artif. Intell. Res. 2002, 16, 321-357. [CrossRef]

27. Zhang, Z. Variable selection with stepwise and best subset approaches. Ann. Transl. Med. 2016, 4, 136. [CrossRef]

28. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5-32. [CrossRef]

29. Dahouda, M.K.; Joe, I. A Deep-Learned Embedding Technique for Categorical Features Encoding. IEEE Access 2021, 9, 114381-114391. [CrossRef]

30. Disha, R.A.; Waheed, S. Performance Analysis of Machine Learning Models for Intrusion Detection System Using Gini Impurity-Based Weighted Random Forest. Cybersecurity 2022, 5, 1. [CrossRef]

31. DeLong, E.R.; DeLong, D.M.; Clarke-Pearson, D.L. Comparing the Areas under Two or More Correlated Receiver Operating Characteristic Curves: A Nonparametric Approach. Biometrics 1988, 44, 837-845. [CrossRef] [PubMed]

32. Ying, G.-S.; Maguire, M.G.; Glynn, R.J.; Rosner, B. Calculating Sensitivity, Specificity, and Predictive Values for Correlated Eye Data. Investig. Ophthalmol. Vis. Sci. 2020, 61, 29. [CrossRef] [PubMed]

33. Carpenter, J.; Bithell, J. Bootstrap Confidence Intervals: When, Which, What? A Practical Guide for Medical Statisticians. Stat. Med. 2000, 19, 1141-1164. [CrossRef]

34. Kim, T.K. T test as a parametric statistic. Korean J. Anesthesiol. 2015, 68, 540-546. [CrossRef]

35. Mishra, P.; Pandey, C.M.; Singh, U.; Gupta, A.; Sahu, C.; Keshri, A. Descriptive statistics and normality tests for statistical data. Ann. Card. Anaesth. 2019, 22, 67-72. [CrossRef]

36. Mandrekar, J.N. Receiver operating characteristic curve in diagnostic test assessment. J. Thorac. Oncol. 2010, 5, 1315-1316. [CrossRef]

37. Salman, H.A.; Kalakech, A.; Steiti, A. Random Forest Algorithm Overview. Babylon. J. Mach. Learn. 2024, 2024, 69-79. [CrossRef]

38. Salehi, M.; Khosravi, A.; Saeed, A.; Nahavandi, S. CSBBoost: Cluster-based Synthetic Boosting Framework for Class-Imbalanced Data Classification. Sci. Rep. 2024, 14, 5152. [CrossRef]

39. La Rosa, S. Diagnostic, Prognostic, and Predictive Role of Ki67 Proliferative Index in Neuroendocrine and Endocrine Neoplasms: Past, Present, and Future. Endocr. Pathol. 2023, 34, 79-97. [CrossRef]

40. Trevethan, R. Sensitivity, Specificity, and Predictive Values: Foundations, Pliabilities, and Pitfalls in Research and Practice. Front. Public Health 2017, 5, 307. [CrossRef]

41. Passaro, A.; Al Bakir, M.; Hamilton, E.G.; Kim, R.; Lopes, G.; Rolfo, C.; Andre, F. Cancer Biomarkers: Emerging Trends and Clinical Implications for Personalized Treatment. Cell 2024, 187, 1617-1635. [CrossRef] [PubMed]

42. Abualigah, L.; Alomari, S.A.; Almomani, M.H.; Abu Zitar, R.; Saleem, K.; Migdady, H.; Snasel, V.; Smerat, A.; Ezugwu, A.E. Medicine: A machine learning framework for predicting disease outcomes and optimizing patient-centric care. J. Transl. Med. 2025, 23, 302. [CrossRef] [PubMed]

43. Santos, C.S.; Amorim-Lopes, M. Externally Validated and Clinically Useful Machine Learning Algorithms to Support Patient-Related Decision-Making in Oncology: A Scoping Review. BMC Med. Res. Methodol. 2025, 25, 45. [CrossRef] [PubMed]

44. Lerario, A.M.; Mohan, D.R.; Hammer, G.D. Update on biology and genomics of adrenocortical carcinomas: Rationale for emerging therapies. Endocr. Rev. 2022, 43, 1051-1073. [CrossRef]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.