ELSEVIER
Clinica Chimica Acta
journal homepage: www.elsevier.com/locate/cca
Simplified urinary steroid profiling by LC-MS as diagnostic tool for malignancy in adrenocortical tumors
Nora Vogg a,b, Tobias Müller c, Andreas Floren c, Thomas Dandekar c, Anna Riester d, Ulrich Dischinger a, Max Kurlbaum a,b, Matthias Kroiss a,d,1, Martin Fassnacht a,b,1,*
a Department of Internal Medicine I, Division of Endocrinology and Diabetes, University Hospital, University of Würzburg, Germany
b Central Laboratory, Core Unit Clinical Mass Spectrometry, University Hospital Würzburg, Germany
c Department of Bioinformatics, Biocenter, Am Hubland, University of Würzburg, Germany
d Department of Internal Medicine IV, University Hospital Munich, Ludwig-Maximilians-Universität München, Munich, Germany
ARTICLE INFO
Keywords:
Adrenal tumors
Adrenocortical carcinoma
LC-MS/MS
Mass spectrometry
Steroid profiling
ABSTRACT
Objectives: Preoperative identification of malignant adrenal tumors is challenging. 24-h urinary steroid profiling by LC-MS/MS and machine learning has demonstrated high diagnostic power, but the unavailability of bioinformatic models for public use has limited its routine application. We here aimed to increase usability with a novel classification model for the differentiation of adrenocortical adenoma (ACA) and adrenocortical carcinoma (ACC).
Methods: Eleven steroids (5-pregnenetriol, dehydroepiandrosterone, cortisone, cortisol, α-cortolone, tetrahydro-11-deoxycortisol, etiocholanolone, pregnenolone, pregnanetriol, pregnanediol, and 5-pregnenediol) were quantified by LC-MS/MS in 24-h urine samples from 352 patients with adrenal tumor (281 ACA, 71 ACC). Random forest modelling and decision tree algorithms were applied in training (n = 188) and test sets (n = 80) and independently validated in 84 patients with paired 24-h and spot urine.
Results: After examining different models, a decision tree using excretions of only 5-pregnenetriol and tetrahydro-11-deoxycortisol classified three groups with low, intermediate, and high risk for malignancy. 148/217 ACA were classified as being at low, 67 at intermediate, and 2 at high risk of malignancy. Conversely, none of the ACC demonstrated a low-risk profile, leading to a negative predictive value of 100% for malignancy. In the independent validation cohort, the negative predictive value was again 100% in both 24-h urine and spot urine, with a positive predictive value of 87.5% and 86.7%, respectively.
Conclusions: This simplified LC-MS/MS-based classification model using 24-h urine provided excellent results for exclusion of ACC and can help to avoid unnecessary surgeries. Analysis of spot urine led to similarly satisfactory results, suggesting that cumbersome 24-h urine collection might be dispensable after future validation.
1. Introduction
Adrenal tumors are among the most common neoplasms in humans, with a prevalence between 3 and 10% that increases with age [1-4]. The now widespread use of cross-sectional imaging has led to a rise in incidentally detected adrenal tumors. Thus, the need for a reliable diagnostic workup of these incidentalomas has increased substantially [5]. Current guidelines aim at determining (i) the malignant potential of a given lesion and (ii) autonomous hormone secretion [6]. The differentiation between benign adrenocortical adenoma (ACA) and malignant adrenocortical carcinoma (ACC) can be challenging with current diagnostic methods. Most ACA, which are frequent, require no therapeutic intervention if relevant autonomous hormone excess is excluded [6]. In contrast, ACC are very rare, with an annual incidence of 0.5-2 per 1,000,000 [7], and treatment options are particularly limited in advanced stages, resulting in a 5-year survival of <15% in patients with metastatic disease [8]. Hence, early diagnosis of ACC in localized stages might be lifesaving by enabling complete surgical tumor resection [9]. Delayed surgery has been linked to time-consuming hormonal workup [10]. Therefore, a simplified diagnostic test strategy in clinical routine is urgently needed.
Abbreviations: ACA, adrenocortical adenoma; ACC, adrenocortical carcinoma; CT, computed tomography; HU, Hounsfield units; MRI, magnetic resonance imaging; THS, tetrahydro-11-deoxycortisol; PPV, positive predictive value; NPV, negative predictive value; ENSAT, European Network for the Study of Adrenal Tumors; 5-PT, 5-pregnenetriol.
* Corresponding author. Address: Department of Internal Medicine I, Division of Endocrinology and Diabetes, University Hospital, University of Würzburg, Oberdürrbacher Str. 6, 97080 Würzburg, Germany.
E-mail address: fassnacht_m@ukw.de (M. Fassnacht).
1 Matthias Kroiss and Martin Fassnacht contributed equally to this work.
https://doi.org/10.1016/j.cca.2023.117301
Current international guidelines for the diagnostic workup of adrenal tumors recommend imaging and biochemical testing for hormone excess [6]. Unenhanced abdominal computed tomography (CT) is the imaging method of choice, whereby a tumor tissue attenuation of ≤10 [11] or ≤20 Hounsfield units (HU) [12] indicates absence of malignancy with high specificity but poor sensitivity. Magnetic resonance imaging (MRI) with chemical shift is probably similarly accurate, but the number of sound studies is limited [11]. The additional value of delayed washout CT has recently been found to be moderate [13,14], and fluorodeoxyglucose positron emission tomography (FDG-PET) likewise failed to identify 7 of 47 malignant tumors in a series of 117 indeterminate adrenal masses [15].
Urinary steroid profiling by mass spectrometry-based techniques has previously demonstrated its value in the differential diagnostics of adrenal tumors [16]. Gas chromatography mass spectrometry (GC-MS) revealed increased urinary tetrahydro-11-deoxycortisol (THS) in ACC compared to benign adrenal tumors or controls [17-22]. In a prospective study using the more readily available LC-MS/MS technique, the quantification of 15 steroids in 24-h urine samples and risk classification by machine learning together with imaging resulted in a positive predictive value (PPV) of 76.4% and a negative predictive value (NPV) of 99.7% for the diagnosis of malignancy [12].
Steroid quantification in 24-h urine is the current standard and is considered to usefully exploit the circadian rhythmicity of steroidogenesis in healthy subjects compared to patients with autonomous steroid secretion. All previous studies in the research field have used this traditional sampling [12,17-23]. However, the collection of 24-h urine is cumbersome for patients and prone to sampling errors, and 30% or more of routine collections are incomplete [24,25].
We here aimed to overcome the drawbacks of the use of proprietary algorithms and sampling that until now have limited the application of urinary steroid profiling for the differential diagnosis of malignancy in adrenal tumors outside very specialized institutions.
2. Materials and methods
2.1. Study design and population
This is a retrospective study of prospectively collected urine samples from adult patients treated at two German referral centers (University Hospitals Würzburg and Munich). The study was conducted as part of the European Network for the Study of Adrenal Tumors (ENSAT) registry, which has been approved by the local ethics committee (#88/11 and 379/10). All patients provided written informed consent. The study follows the STARD (Standards for Reporting of Diagnostic Accuracy) reporting guideline. The inclusion criterion was the presence of an adrenal tumor with a diameter >2 cm. Exclusion criteria were previous treatment of adrenal disease and diagnoses of pheochromocytoma, myelolipoma, or adrenal metastases from other malignancies based on clinical workup. The final diagnosis was based on current clinical practice guidelines for the management of adrenal incidentalomas and ACC [6,8,9], with post-operative histopathology and/or follow-up investigations as gold standards. Patients were instructed to discard the first morning urine and then collect their urine over a period of 24 h, including the following morning urine. Spot urine was sampled within 14 days before or after 24-h urine collection. Spot urine samples were collected between November 2017 and March 2022, and 24-h urine samples were collected between March 2010 and March 2022 and stored at -20 °C until analysis.
We followed the method outlined by Buderer [26] to compute sample sizes for the expected sensitivity and specificity and took the larger of the two required sample sizes, which in our case was the one for sensitivity. Assuming a proportion of 20% ACC at specialized centers, a minimum sample size of 173 samples was required to yield a 10% width of a 2-sided 95% CI for an expected sensitivity of 90% [17].
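The sample size stated above can be reproduced with a short calculation. The sketch below is not the authors' code; it assumes that the stated 10% enters Buderer's formula as the precision term and that the required number of ACC cases is scaled by the assumed 20% ACC proportion. The function and parameter names are illustrative.

```python
import math
from statistics import NormalDist


def buderer_sample_size(expected, precision, prevalence,
                        conf_level=0.95, for_sensitivity=True):
    """Total sample size for estimating sensitivity (or specificity).

    Assumption: `precision` is used as the precision term of the
    confidence interval in Buderer's formula.
    """
    # Two-sided normal quantile, about 1.96 for a 95% CI
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    # Number of diseased (or non-diseased) subjects needed
    n_group = z ** 2 * expected * (1 - expected) / precision ** 2
    # Scale up by the share of that group in the study population
    share = prevalence if for_sensitivity else 1 - prevalence
    return math.ceil(n_group / share)


n_total = buderer_sample_size(expected=0.90, precision=0.10, prevalence=0.20)
print(n_total)  # 173
```

Under these assumptions the result matches the 173 samples reported in the text.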
A cohort of 268 patients provided one 24-h urine sample each, and these samples were used for classification model establishment (training/test cohort). Samples from ACA and ACC patients were randomly split into a training set (n = 188, 70%) and a test set (n = 80, 30%), with the only prerequisite being an equal distribution of ACA and ACC in the training and test sets. An independent cohort of 84 patients provided both 24-h urine and corresponding spot urine; it served as the validation cohort and was also used to compare the performance of spot urine with that of 24-h urine. Fig. 1 visualizes the composition of the study cohort. In total, 71 patients had an ACC and 281 were diagnosed with ACA. These diagnoses were based on histology in all patients with ACC and in 145 patients with ACA. In the remaining patients, strict follow-up criteria [13] were applied to prove the benign nature of the lesion.
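A stratified random split of this kind can be sketched as follows. This is an illustration in Python rather than the R workflow used in the study; the function name, the seed, and the rounding of the per-class training size are assumptions. With 217 ACA and 51 ACC it yields the set sizes reported above.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices so each class keeps the same train/test ratio."""
    rng = random.Random(seed)  # fixed seed only for reproducibility
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        rng.shuffle(idx)
        # Assumption: per-class training size is rounded to the nearest integer
        n_train = round(train_fraction * len(idx))
        train_idx += idx[:n_train]
        test_idx += idx[n_train:]
    return sorted(train_idx), sorted(test_idx)


# Class counts of the training/test cohort as reported in the text
labels = ["ACA"] * 217 + ["ACC"] * 51
train, test = stratified_split(labels)
train_acc = sum(labels[i] == "ACC" for i in train)
test_acc = sum(labels[i] == "ACC" for i in test)
print(len(train), train_acc, len(test), test_acc)  # 188 36 80 15
```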
Age, sex, and tumor diameter were recorded and imaging characteristics were classified as unsuspicious or suspicious. For this purpose, unsuspicious was defined as HU ≤20 in unenhanced CT [12], relative contrast washout >58% in delayed washout CT [13], loss of signal intensity between in- and out-of-phase images in MRI chemical shift analysis [27], or absence of FDG uptake or uptake less than the liver [6,28].
2.2. Laboratory methods
The urinary concentrations of the following 11 steroids were quantified by LC-MS/MS as described in detail elsewhere [29]: 5-pregnenetriol (5-PT), dehydroepiandrosterone, cortisone, cortisol, α-cortolone, tetrahydro-11-deoxycortisol (THS), etiocholanolone, pregnenolone, pregnanetriol, pregnanediol, and 5-pregnenediol. Briefly, 150 µl urine underwent enzymatic hydrolysis of steroid conjugates with arylsulfatase/glucuronidase from Helix pomatia, and steroids were extracted via solid-phase extraction before LC-MS/MS analysis on a 1290 Infinity HPLC (Agilent, Waldbronn, Germany) coupled to a QTRAP 6500+ mass spectrometer (Sciex, Darmstadt, Germany). Intra- and inter-day coefficients of variation of quality control samples within the batches of study samples were ≤12.9% for all analytes (Supplementary Table S1). Study samples with steroid concentrations above the upper limit of quantification were diluted into the calibration range to ensure linear responses.
Creatinine measurements were performed using a Cobas® 8000 immunoassay (Roche Diagnostics GmbH).
2.3. Establishment of classification models and statistical analysis
Statistical analyses were performed using IBM SPSS 28 and R version 4.0.2 [30]. Groups were compared using non-parametric tests and p-values were adjusted for multiple testing according to Benjamini-Hochberg [31], with adjusted p < 0.05 considered significant.
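For readers unfamiliar with the Benjamini-Hochberg adjustment, the following minimal Python sketch shows the step-up procedure on a set of hypothetical p-values. It illustrates the general method only and is not the code used in the study, where the analyses were run in SPSS and R.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values, for illustration only
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
print([round(p, 3) for p in adj])  # [0.02, 0.04, 0.04, 0.02]
```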
For the classification task, we used the “ctree” function from the R package “partykit” [32] to construct conditional inference trees. These are nonparametric regression models that recursively partition the feature space into smaller regions, with each region having its own regression function. These models rely on a conditional inference framework to perform hypothesis tests at each split point in the tree, which keeps variable selection unbiased and helps to avoid overfitting [33]. The procedure begins by constructing an initial tree with a single node that contains all data (root node). Next, each predictor variable is considered in turn, searching for the best split point that maximizes the difference in the response variable between the two resulting subgroups. The split is evaluated using a hypothesis test based on a permutation test statistic (N = 99,999). If a significant split is found, the node is split into two child nodes and the split variable and split point are stored in the decision node. If not, the node is marked as a leaf node. This procedure is performed recursively for each child node until all nodes are either leaf nodes or the minimum node size (10% of data size) is reached. The structure of a conditional inference tree with root node, decision nodes, and leaf nodes is shown in Supplementary Fig. S1. Once the tree is constructed, the function can be used to predict the response variable for new data. To do this, the function navigates down the tree by comparing the new data with the split variables and split points stored in each node until it reaches a leaf node. The predicted value for the new data is then the mean of the response variable in the corresponding leaf node.

[Fig. 1 (flowchart of the study cohort). Training/test cohort for classification model establishment: 268 patients, 268 samples of 24-h urine, split into a training set of 188 samples (152 ACA, 36 ACC) and a test set of 80 samples (65 ACA, 15 ACC). Independent validation cohort: 84 patients with paired samples, 84 samples of 24-h urine (64 ACA, 20 ACC) and 84 samples of spot urine (64 ACA, 20 ACC).]
The random forest algorithm (as implemented in the cforest function) was likewise based on conditional inference decision trees. We applied both functions with default settings, but with an increased number of resamplings (n = 99,999), and set the proportion of observations needed to establish a terminal node to minprob = 0.1.
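The idea of testing a candidate split variable by permutation can be illustrated with a simplified Monte Carlo test. The sketch below is a stand-in written in Python, not the partykit implementation: it uses the absolute difference in group means as the test statistic, which is an assumption made for illustration, and the data values are hypothetical.

```python
import random


def permutation_p_value(values, labels, n_perm=999, seed=0):
    """Monte Carlo permutation p-value for a difference in group means.

    `labels` must contain both 0 and 1. The statistic is a simplified
    stand-in for the test statistic used by conditional inference trees.
    """
    def mean_diff(lab):
        pos = [v for v, l in zip(values, lab) if l == 1]
        neg = [v for v, l in zip(values, lab) if l == 0]
        return abs(sum(pos) / len(pos) - sum(neg) / len(neg))

    rng = random.Random(seed)
    observed = mean_diff(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any association with the predictor
        if mean_diff(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero
    return (hits + 1) / (n_perm + 1)


# Hypothetical steroid excretions (0 = ACA, 1 = ACC), for illustration only
values = [12, 30, 25, 41, 18, 22, 35, 15, 900, 1500, 700, 2100, 1300, 800]
labels = [0] * 8 + [1] * 6
p = permutation_p_value(values, labels)
print(p < 0.05)
```

A node would be split on a variable only if such a test indicated a significant association with the response; the study used 99,999 resamplings rather than the 999 chosen here for speed.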
3. Results
3.1. Patient characteristics and steroid excretions
Urine samples of 352 patients with adrenal tumors were included in this study and comprised a training (n = 188), test (n = 80), and validation (n = 84) cohort. Demographic and clinical characteristics of the patient cohorts are summarized in Table 1.
We first compared the power of two normalization approaches for urinary steroid concentrations to discriminate ACA from ACC: (i) total mass of the given steroid excreted in 24 h, and (ii) steroid concentration normalized to creatinine concentration. Wilcoxon tests and ROC analyses shown in Supplementary Fig. S2 revealed comparable results, but the difference in steroid quantity between ACC and ACA reached higher levels of significance using steroid-to-creatinine ratios. Since we also aimed at applying this classification model to spot urine, we decided to use creatinine-normalized steroid excretion for further analyses.
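The two normalization approaches amount to simple unit conversions, sketched below. The input units (steroid in µg/l, creatinine in g/l, urine volume in l) and the example values are assumptions for illustration; the source only states that results are expressed per 24 h or as µg steroid per g creatinine.

```python
def excretion_per_24h(steroid_ug_per_l, urine_volume_l):
    """Approach (i): total steroid mass excreted in 24 h, in µg."""
    return steroid_ug_per_l * urine_volume_l


def creatinine_normalized(steroid_ug_per_l, creatinine_g_per_l):
    """Approach (ii): steroid excretion in µg per g creatinine."""
    if creatinine_g_per_l <= 0:
        raise ValueError("creatinine concentration must be positive")
    return steroid_ug_per_l / creatinine_g_per_l


# Hypothetical sample: 500 µg/l steroid, 1.25 g/l creatinine, 2.0 l urine
total_ug = excretion_per_24h(500.0, 2.0)
ratio_ug_per_g = creatinine_normalized(500.0, 1.25)
print(total_ug, ratio_ug_per_g)  # 1000.0 400.0
```

Only the second quantity can be computed for a spot urine sample, because no 24-h volume is available.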
Between groups, we observed significantly higher (p < 0.001) excretion of 8/11 steroids in patients with ACC compared to patients with ACA after correction for multiple testing (Supplementary Fig. S3).
3.2. Establishment and application of the classification model
We applied both a decision tree strategy and random forest classification to differentiate between ACA and ACC using all 11 steroids. The ctree algorithm automatically selected the excretion of the two steroids 5-PT and THS for the decision tree, which performed surprisingly well for clinical diagnosis. Even the threshold of 5-PT at 276 µg/g creatinine alone resulted in a total training error of only 6.4% for the binary classification ACC vs. ACA: 7/36 patients with ACC and 5/152 patients with ACA were misclassified (Fig. 2). In comparison, the training error of the more complex random forest model based on all 11 steroid excretions was 5.3%. We found strong correlations among the steroids within a 5-PT-dominated and a THS-dominated cluster, respectively (Supplementary Fig. S4).
The decision tree revealed a valuable substructure regarding classification accuracy in its branches (Supplementary Fig. S1). Both the left and the right leaf nodes were accurate with 0.0% training error. However, the three intermediate branches together exhibited a training error of 20.7%, including 15 ACC and 43 ACA. Therefore, we defined the three classes of low (5-PT ≤ 70 µg/g creatinine and THS ≤ 278 µg/g creatinine), intermediate, and high risk of ACC (5-PT > 276 µg/g creatinine and THS > 1062 µg/g creatinine).
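The three risk classes defined above can be written as a small rule. The Python sketch below uses the cutoffs given in the text; the function and argument names are illustrative, and inputs are assumed to be creatinine-normalized excretions in µg/g creatinine.

```python
def classify_acc_risk(pt5_ug_per_g, ths_ug_per_g):
    """Risk class of ACC from 5-PT and THS (both in µg/g creatinine)."""
    if pt5_ug_per_g <= 70 and ths_ug_per_g <= 278:
        return "low"
    if pt5_ug_per_g > 276 and ths_ug_per_g > 1062:
        return "high"
    # Everything between the two confident leaves
    return "intermediate"


# Hypothetical excretion values, for illustration only
print(classify_acc_risk(50, 200))     # low
print(classify_acc_risk(150, 200))    # intermediate
print(classify_acc_risk(400, 900))    # intermediate
print(classify_acc_risk(400, 5000))   # high
```

Values exactly at a cutoff fall on the "≤" side, as in the definitions above.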
Subsequently the decision tree was applied to the 24-h urine samples of the test set (n = 80). 48 samples were classified with either high or low risk, 32 samples had an intermediate risk of ACC. Two of the nine samples with high risk had the final diagnosis of ACA and all 39 samples with low risk were ACA (Table 2).
The independent cohort of 84 patients with adrenal tumors was used to validate the decision tree. Classification between ACA and ACC resulted in a total accuracy of 66.7% when considering correct classification in the high risk class for ACC and in the low risk class for ACA, respectively. The true positive rate was 70.0% and the true negative rate was 65.6%. A proportion of 87.5% of patients classified with high risk had the final diagnosis of ACC, whereas low risk classification excluded ACC with 100.0% NPV.
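These performance figures follow from the three-class cross tabulation of the validation cohort (Fig. 4B: 14, 6, and 0 ACC and 2, 20, and 42 ACA in the high, intermediate, and low risk class). The Python sketch below recomputes them under the convention stated above, in which only high-risk ACC and low-risk ACA count as correct; the function name and dictionary layout are illustrative.

```python
def three_class_performance(acc, aca):
    """Performance in % from counts per risk class for ACC and ACA.

    Intermediate-risk samples count as neither correct nor predictive.
    Assumes at least one sample in the high and in the low risk class.
    """
    n_acc, n_aca = sum(acc.values()), sum(aca.values())
    return {
        "TPR": 100 * acc["high"] / n_acc,
        "TNR": 100 * aca["low"] / n_aca,
        "accuracy": 100 * (acc["high"] + aca["low"]) / (n_acc + n_aca),
        "PPV": 100 * acc["high"] / (acc["high"] + aca["high"]),
        "NPV": 100 * aca["low"] / (aca["low"] + acc["low"]),
    }


# Counts for 24-h urine in the validation cohort (Fig. 4B)
result = three_class_performance(
    acc={"high": 14, "intermediate": 6, "low": 0},
    aca={"high": 2, "intermediate": 20, "low": 42},
)
print({k: round(v, 1) for k, v in result.items()})
# {'TPR': 70.0, 'TNR': 65.6, 'accuracy': 66.7, 'PPV': 87.5, 'NPV': 100.0}
```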
3.3. Combined classification by imaging characteristics and urinary steroids
Table 1. Demographic and clinical characteristics of the patient cohorts.

| | Training/test cohort: ACC (n = 51) | Training/test cohort: ACA (n = 217) | Validation cohort: ACC (n = 20) | Validation cohort: ACA (n = 64) |
|---|---|---|---|---|
| Sex | | | | |
| Men | 16 (31.4%) | 63 (29.0%) | 6 (30.0%) | 21 (32.8%) |
| Women | 35 (68.6%) | 154 (71.0%) | 14 (70.0%) | 43 (67.2%) |
| Age, years | 53 (46-65) | 59 (51-68) | 53 (47-58) | 59 (52-65) |
| Tumor diameter, cm | 10.0 (7.5-14.0) | 3.2 (2.7-4.1) | 10.3 (8.9-12.1) | 3.0 (2.4-3.9) |
| 2 to ≤ 4 cm | 2 (3.9%) | 160 (73.7%) | 1 (5.0%) | 54 (84.4%) |
| > 4 cm | 49 (96.1%) | 57 (26.3%) | 19 (95.0%) | 10 (15.6%) |
| Imaging modality (a) | | | | |
| Unenhanced CT | 10 (19.6%) | 129 (60.6%) | 6 (30.0%) | 38 (60.3%) |
| Delayed washout CT | 0 | 6 (2.8%) | 0 | 2 (3.2%) |
| Contrast enhanced CT (b) | 11 (21.6%) | 3 (1.4%) | 4 (20.0%) | 1 (1.6%) |
| FDG-PET/CT | 25 (49.0%) | 22 (10.3%) | 9 (45.0%) | 4 (6.3%) |
| MRT | 5 (9.8%) | 52 (24.4%) | 1 (5.0%) | 19 (30.2%) |
| Imaging characteristic | | | | |
| Unsuspicious | 0 | 157 (72.4%) | 0 | 42 (65.6%) |
| Suspicious | 51 (100.0%) | 56 (25.8%) | 20 (100.0%) | 21 (32.8%) |
| Not specified | 0 | 4 (1.8%) | 0 | 1 (1.6%) |
| Biochemical/clinical evidence of hormone excess | | | | |
| Cushing's syndrome | 14 (27.5%) | 60 (27.6%) | 5 (25.0%) | 7 (10.9%) |
| Autonomous cortisol secretion | 26 (51.0%) | 91 (41.9%) | 11 (55.0%) | 31 (48.4%) |
| Primary hyperaldosteronism | 3 (5.9%) | 9 (4.1%) | 2 (10.0%) | 2 (3.1%) |
| Sex hormone excess | 30 (58.8%) | 9 (4.1%) | 12 (60.0%) | 7 (10.9%) |
| Nonfunctioning | 6 (11.8%) | 59 (27.2%) | 2 (10.0%) | 22 (34.4%) |

Data presented as n (%) or median (IQR). (a) If several imaging modalities were used, only the method that was used to determine the imaging characteristics is mentioned. (b) Final diagnosis based on histology or long-term follow-up.

All 352 patients who provided a 24-h urine sample were included in a cross tabulation combining urinary steroid analysis and cross-sectional imaging characteristics for ACC and ACA (Fig. 3A). Forty-five patients exhibited suspicious imaging characteristics and were at high risk of ACC according to the urinary steroid decision tree. Of these, 42 patients had an ACC and 3 patients with ACA were misclassified. Typically benign imaging features or a low risk classification by urine analysis identified 249 of 281 patients with ACA. Moreover, this combination was able to exclude ACC with high accuracy in our patient cohort. The combination of suspicious imaging characteristics and intermediate risk included 29 patients with ACC and 29 patients with ACA.
While adrenal cross-sectional imaging was suspicious in 100% of ACC, 77 of 276 patients with ACA likewise had a suspicious tumor appearance on imaging (Fig. 3B). When applying the proposed urinary steroid risk classification to the 148 patients with suspicious imaging characteristics, 45 of 77 patients with ACA had a low risk classification, while 42 of 71 patients with ACC were classified with a high risk of ACC (Fig. 3C and D). With this approach, in up to 18 of 26 operated patients (69%) with ACA and suspicious imaging but without clinically relevant hormone excess, the necessity of an adrenalectomy might have been questioned before surgery, because the urinary steroid classification had pointed clearly towards a benign lesion.
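The proportions discussed for patients with suspicious imaging can be recomputed from the counts in Fig. 3C. The following Python sketch only restates that arithmetic; the dictionary layout and variable names are illustrative.

```python
# Urinary steroid risk classes among the 148 patients with suspicious
# imaging (counts from Fig. 3C)
suspicious = {
    "ACC": {"high": 42, "intermediate": 29, "low": 0},
    "ACA": {"high": 3, "intermediate": 29, "low": 45},
}

n_acc = sum(suspicious["ACC"].values())  # 71
n_aca = sum(suspicious["ACA"].values())  # 77

# Share of ACA with suspicious imaging that urine analysis rates as low risk
aca_low_share = suspicious["ACA"]["low"] / n_aca
# Share of ACC whose suspicious imaging is supported by a high-risk profile
acc_high_share = suspicious["ACC"]["high"] / n_acc
# Proportion of ACC among suspicious tumors with intermediate risk
n_intermediate = sum(group["intermediate"] for group in suspicious.values())
acc_in_intermediate = suspicious["ACC"]["intermediate"] / n_intermediate

print(f"{aca_low_share:.1%} {acc_high_share:.1%} {acc_in_intermediate:.1%}")
# 58.4% 59.2% 50.0%
```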
3.4. Evaluation of spot urine as surrogate of 24-h urine
The independent validation cohort of 84 patients provided paired samples of 24-h urine and spot urine, and we directly compared steroid excretions in both sample types. No significant differences in creatinine-normalized excretions of 5-PT and THS were detectable between the two collection approaches using Wilcoxon testing after adjustment for multiple testing (Fig. 4A). Creatinine-normalized excretions of 5-PT and THS were correlated between 24-h urine and spot urine samples, with a Pearson's r of 0.992 for 5-PT and 0.999 for THS; both Passing-Bablok and Deming regression were used (Supplementary Fig. S5).
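As a reminder of what the reported correlation coefficient measures, a Pearson's r for paired measurements can be computed as sketched below. The paired values are hypothetical and serve only to show the calculation; the Passing-Bablok and Deming regressions mentioned above are not reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical paired excretions in µg/g creatinine, for illustration only
urine_24h = [40.0, 65.0, 120.0, 310.0, 2500.0]
urine_spot = [45.0, 60.0, 130.0, 290.0, 2600.0]
r = pearson_r(urine_24h, urine_spot)
print(r > 0.99)
```

Because steroid excretions span several orders of magnitude, a few very high values can dominate r, which is one reason why regression-based method comparisons are reported alongside it.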
The decision tree model was applied to both sample types. Direct comparison of the classification using 24-h urine and spot urine samples showed a lower performance of spot urine, with a total accuracy of 56.0% compared to 66.7% in 24-h urine. Twenty-six samples of 24-h urine and 35 samples of spot urine were classified with intermediate risk. However, PPV in spot urine was almost as high as in 24-h urine samples (86.7% vs. 87.5%) and NPV was 100.0% in both urine types (Fig. 4B-E).
Urinary steroid classification by spot urine likewise increased the diagnostic performance in patients of the validation cohort with suspicious imaging. A total of 41 patients had a suspicious tumor without typically benign features on imaging. Of these, 21 patients had an ACA, and eleven of them were classified with a low risk after spot urine analysis. For 13 of 20 patients with ACC, the suspicious imaging characteristics were supported by a high risk classification using spot urine (Supplementary Fig. S6).
4. Discussion
This study demonstrates the performance of a simple and transparent classification model using the excretion of only two urinary steroids for the differentiation between ACA and ACC. Importantly, the application of a more complex random forest algorithm improved overall accuracy only slightly, by 1-2%, compared to a decision tree. This supports our aim to propose a relatively straightforward model with the perspective of facilitating interpretation and reducing complexity for clinical routine implementation. After normalization of steroid concentrations to creatinine, a direct interpretation can be carried out considering only 5-PT, the urinary metabolite of 17-hydroxypregnenolone, and THS, the urinary metabolite of 11-deoxycortisol. The good performance of these two metabolites is in accordance with the results of previously published studies using urinary steroid profiling for the differentiation of ACA and ACC, in which 5-PT and THS were also among the most discriminative analytes [17-23,34].
A major strength of the reported decision tree is its substructure of five classification branches, which allows the three classes of high, intermediate, and low risk of ACC to be defined. The high risk and low risk classes both have high diagnostic accuracy with very low error rates. Not unexpectedly, the 0.0% training error of the high-risk class could be reproduced neither in the test set nor in the independent validation cohort, but only two patients with ACA were classified with high risk in each of the test and validation sets. In all four of these cases, ACC was suspected preoperatively but postoperative histopathology showed ACA. Two of these ACA were oncocytic. However, as all four cases had clinically significant hormone excess, adrenalectomy was indicated anyway [6]. In contrast, tumors classified as “low risk” had a 0.0% error rate, which was reproducible in the test set as well as in the independent validation cohort for both 24-h urine and spot urine. With an NPV of 100%, ACC could be excluded with high certainty in more than half of patients with ACA, even using spot urine.
[Fig. 2 (decision tree, training set: n = 188, ACC = 36, ACA = 152).
Root split on 5-PT: ≤ 276 µg/g creatinine (n = 154; ACC = 7, ACA = 147) or > 276 µg/g creatinine (n = 34; ACC = 29, ACA = 5).
5-PT ≤ 276: second split on 5-PT at 70 µg/g creatinine. 5-PT ≤ 70 (n = 122; ACC = 1, ACA = 121) is split further on THS: ≤ 278 µg/g creatinine, 109 ACA (low risk of ACC); > 278 µg/g creatinine, 12 ACA and 1 ACC (intermediate risk of ACC). 5-PT > 70: 26 ACA and 6 ACC (intermediate risk of ACC).
5-PT > 276: split on THS. THS ≤ 1062 µg/g creatinine: 8 ACC and 5 ACA (intermediate risk of ACC). THS > 1062 µg/g creatinine: 21 ACC (high risk of ACC).]

Table 2. Classification of the 24-h urine samples of the test set (n = 80) by the decision tree.

| | ACC | ACA | Total | Predictive values, % (95% CI) |
|---|---|---|---|---|
| High risk of ACC | 7 | 2 | 9 | 77.8 (PPV) (40.2-96.1) |
| Intermediate risk of ACC | 8 | 24 | 32 | |
| Low risk of ACC | 0 | 39 | 39 | 100.0 (NPV) (88.8-100.0) |
| Total | 15 | 65 | 80 | |
| Accuracy, % (95% CI) | 46.7 (TPR) (22.3-72.6) | 60.0 (TNR) (47.1-71.7) | 57.5 (45.9-68.5) | |

TPR: true positive rate, TNR: true negative rate, PPV: positive predictive value, NPV: negative predictive value.

Previous studies with LC-MS/MS quantified 26 [23] and 15 steroids [12] in 24-h urine. Our LC-MS/MS method is validated to quantify 11 urinary steroids [29], but the finally selected decision tree-based classification requires the excretion of only two steroids, which reduces the time and effort needed for data interpretation. Inclusion of additional steroids is unlikely to deliver a relevant improvement of classification because the steroids are highly correlated with each other within a THS-dominated and a 5-PT-dominated cluster. Our data therefore suggest that more complex classification algorithms such as random forest can be dispensed with in clinical routine.
Imaging characteristics are a key tool in adrenal tumor diagnostics [6] and some authors consider even a tumor diameter >4 cm as generally suspicious [35,36]. However, both small ACC and large ACA occur with measurable frequency [37], indicating that the tumor diameter alone has limited diagnostic accuracy. In our entire cohort, 67 ACA had a diameter >4 cm and three ACC measured ≤4 cm. These three ACC did not show typical features of benign tumors on imaging and all urine samples were classified with intermediate risk.
Although the sensitivity of adrenal cross-sectional imaging for ACC is high, specificity is lacking and a considerable number of patients with suspicious imaging characteristics are likely to have a benign tumor [11]. Urinary steroid risk classification provides a particular benefit for these patients, as more than half of the patients with ACA and suspicious imaging characteristics in our cohorts are classified with a low risk of ACC after urine analysis. Moreover, the majority of patients with ACC have a high risk according to urinary steroid classification, supporting the suspicion of malignancy from imaging.
To our knowledge, this study is the first to systematically compare 24-h urine samples with corresponding spot urine in patients with adrenal tumors. Bileck et al. previously found poor correlation of 40 steroids normalized to creatinine between 24-h urine and spot urine but did not study the value of spot urine for the differential diagnosis of adrenal masses [38]. A paired Wilcoxon test of 5-PT and THS excretions did not reveal significant differences between the two sample types in our validation cohort. More samples were classified as intermediate risk in spot urine, though, which is likely caused by circadian variations of steroid excretion in spot urine samples. However, with the stringent cutoffs for 5-PT and THS in the classification model, NPV and PPV were hardly affected. Depending on patient factors and practical considerations, spot urine might serve as an easily accessible sample for urinary steroid analysis and has the potential to replace 24-h urine in the future. In case of classification with intermediate risk after spot urine analysis, a subsequent 24-h urine collection may be considered to reduce the intermediate risk proportion. Patients with unsuspicious characteristics on imaging or biochemical evidence of low risk of ACC require no adrenalectomy (especially if the tumor is rather small, e.g. < 6 cm). In contrast, patients with a high risk urinary steroid profile and a suspicious tumor appearance on imaging that is still localized should undergo adrenalectomy without any further delay. Patients with suspicious tumor appearance and intermediate risk of ACC should be managed individually, but in case of doubt should rather undergo surgery, as the ACC proportion of these cases was 50% in our study cohort. However, with our proposed approach the number of unnecessary adrenalectomies could be significantly reduced.

Fig. 3A. Cross tabulation of urinary steroid risk class and imaging characteristics for ACC and ACA.

| Urinary steroids | ACC: Suspicious | ACC: Unsuspicious | ACC: n/a | ACC: Total | ACA: Suspicious | ACA: Unsuspicious | ACA: n/a | ACA: Total |
|---|---|---|---|---|---|---|---|---|
| High risk of ACC | 42 | 0 | 0 | 42 | 3 | 1 | 0 | 4 |
| Intermediate risk of ACC | 29 | 0 | 0 | 29 | 29 | 58 | 0 | 87 |
| Low risk of ACC | 0 | 0 | 0 | 0 | 45 | 140 | 5 | 190 |
| Total | 71 | 0 | 0 | 71 | 77 | 199 | 5 | 281 |

Fig. 3B. Imaging characteristics.

| Imaging | ACC | ACA | Total |
|---|---|---|---|
| Suspicious | 71 | 77 | 148 |
| Unsuspicious | 0 | 199 | 199 |
| Total | 71 | 276 | 347 |

Fig. 3C. Urinary steroid risk classification in the 148 patients with suspicious imaging characteristics.

| Urinary steroids | ACC | ACA | Total |
|---|---|---|---|
| High risk of ACC | 42 | 3 | 45 |
| Intermediate risk of ACC | 29 | 29 | 58 |
| Low risk of ACC | 0 | 45 | 45 |
| Total | 71 | 77 | 148 |

[Fig. 3D (flowchart). Imaging (n = 347): suspicious (71 ACC, 77 ACA) or unsuspicious (199 ACA). Urinary steroids in patients with suspicious imaging: high risk (42 ACC, 3 ACA), intermediate risk (29 ACC, 29 ACA), low risk (45 ACA).]
Several limitations apply to our study: the sample size is still limited, in particular regarding spot urines. In our series, ACC account for 20% of all tumors. This is a clear over-representation that could lead to an overestimation of the performance of our diagnostic test. On the other hand, patients were thoroughly characterized, including available histopathology in 216 patients and reliable diagnostic criteria in patients who have not undergone surgery. Imaging being performed as part of routine workup comes with the limitation of variable imaging methodologies. Due to the small sample size of the various methods, the characteristics could not be individually assessed in the evaluation of diagnostic performance. Furthermore, other malignant adrenal masses such as metastases of non-adrenal tumors pose a specific diagnostic dilemma which cannot be solved using urine steroid metabolomics. Whereas these tumors are relatively frequent in patients with a history of extra-adrenal malignancy, they are rather rare without such a history [39]. Therefore, our proposed decision tree is restricted to patients with “true” incidentalomas and no history of any cancer in the past. Moreover, quantification of 5-PT and THS by LC-MS/MS is not available in every laboratory. Over the years, however, LC-MS/MS has increasingly replaced alternative techniques and is currently available at many centers.
[Fig. 4, panels A, C, and E (plots). (A) Excretion of 5-PT and THS in 24-h urine and spot urine for ACA (n = 64) and ACC (n = 20) on a logarithmic axis; ns, not significant. (C) Scatterplot of THS versus 5-PT for ACA and ACC in 24-h urine, logarithmic axes. (E) Scatterplot of THS versus 5-PT for ACA and ACC in spot urine, logarithmic axes. Steroid excretion in µg steroid/g creatinine.]
Fig. 4B.

| 24-h urine | ACC | ACA | Total | Predictive values (%) |
|---|---|---|---|---|
| High risk of ACC | 14 | 2 | 16 | 87.5 (PPV) |
| Intermediate risk of ACC | 6 | 20 | 26 | |
| Low risk of ACC | 0 | 42 | 42 | 100.0 (NPV) |
| Total | 20 | 64 | 84 | |
| Accuracy (%) | 70.0 (TPR) | 65.6 (TNR) | 66.7 | |

Fig. 4D.

| Spot urine | ACC | ACA | Total | Predictive values (%) |
|---|---|---|---|---|
| High risk of ACC | 13 | 2 | 15 | 86.7 (PPV) |
| Intermediate risk of ACC | 7 | 28 | 35 | |
| Low risk of ACC | 0 | 34 | 34 | 100.0 (NPV) |
| Total | 20 | 64 | 84 | |
| Accuracy (%) | 65.0 (TPR) | 53.1 (TNR) | 56.0 | |
Fig. 4. (A) Comparison of 5-PT and THS excretion in 24-h urine and spot urine within ACA and ACC patients showed no significant differences. (B-E) Diagnostic performance of the independent validation cohort. Cross tabulation (B, D) and visualization as scatterplot (C, E) of the 24-h urine sample set (B, C) and the spot urine sample set (D, E).
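The values in the cross tabulations can be recomputed from the raw counts. The sketch below is illustrative only; the class and variable names are hypothetical, and the metric definitions are inferred from the tabulated numbers (PPV from the high-risk row, NPV from the low-risk row, TPR as high-risk ACC over all ACC, TNR as low-risk ACA over all ACA, and overall accuracy counting intermediate-risk results as not correctly classified).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskTable:
    """Counts per risk group as (ACC, ACA), copied from Fig. 4B and 4D."""
    high: tuple[int, int]
    intermediate: tuple[int, int]
    low: tuple[int, int]

    def metrics(self) -> dict[str, float]:
        """Return PPV, NPV, TPR, TNR and overall accuracy in percent."""
        acc_total = self.high[0] + self.intermediate[0] + self.low[0]
        aca_total = self.high[1] + self.intermediate[1] + self.low[1]
        correct = self.high[0] + self.low[1]  # high-risk ACC + low-risk ACA
        return {
            "PPV": 100 * self.high[0] / sum(self.high),
            "NPV": 100 * self.low[1] / sum(self.low),
            "TPR": 100 * self.high[0] / acc_total,
            "TNR": 100 * self.low[1] / aca_total,
            "accuracy": 100 * correct / (acc_total + aca_total),
        }


urine_24h = RiskTable(high=(14, 2), intermediate=(6, 20), low=(0, 42))
spot_urine = RiskTable(high=(13, 2), intermediate=(7, 28), low=(0, 34))

for name, table in (("24-h urine", urine_24h), ("spot urine", spot_urine)):
    rounded = {key: round(value, 1) for key, value in table.metrics().items()}
    print(name, rounded)
```

With these definitions the rounded output matches the tabulated PPV, NPV, TPR, TNR and accuracy for both sample types.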
In conclusion, we present a classification model for the diagnostic workup of adrenal tumors that combines simple, non-invasive sample acquisition with a straightforward interpretation of steroid excretion data and is therefore highly applicable in practice. The exclusion of ACC with high certainty using only two urinary steroids, even in spot urine, is a major advance of our study and highly relevant in clinical practice. Whereas the results using 24-h urine are already quite robust, a prospective validation study with a larger sample size is required before spot urine can be used as sample matrix.
Research funding:
This work was supported by the Bayerische Forschungsstiftung, Forschungsverbund FORTiTher (AZ-1365-18) and by the Deutsche Forschungsgemeinschaft project number 314061271 (CRC/Transregio 205/1 “The Adrenal: Central relay of health and disease”).
Ethical approval:
The study was conducted according to the Declaration of Helsinki and was part of the European Network for the Study of Adrenal Tumors (ENSAT) registry, which has been approved by the local ethics committee of the University of Würzburg (#88/11 and 379/10). All patients included in this study provided written informed consent.
CRediT authorship contribution statement
Nora Vogg: Methodology, Investigation, Formal analysis, Writing - original draft. Tobias Müller: Methodology, Formal analysis, Writing - review & editing. Andreas Floren: Methodology, Formal analysis, Writing - review & editing. Thomas Dandekar: Funding acquisition, Writing - review & editing. Anna Riester: Resources, Writing - review & editing. Ulrich Dischinger: Resources, Writing - review & editing. Max Kurlbaum: Supervision, Funding acquisition, Writing - review & editing. Matthias Kroiss: Resources, Supervision, Funding acquisition, Writing - review & editing. Martin Fassnacht: Resources, Supervision, Funding acquisition, Writing - review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Data availability
Data will be made available on request.
Acknowledgements
The authors thank Chromsystems Instruments & Chemicals GmbH for supporting the FORTiTher consortium, for analytical advice and fruitful discussions. We are grateful to Robert Matern for performing creatinine measurements and Martina Zink for sample management. Moreover, thanks to Irina Chifu and Otilia Kimpel for supporting the classification of imaging data.
Appendix A. Supplementary material
Supplementary data to this article can be found online at https://doi.org/10.1016/j.cca.2023.117301.
References
[1] M. Terzolo, A. Stigliano, I. Chiodini, et al., AME position statement on adrenal incidentaloma, Eur. J. Endocrinol. 164 (6) (2011) 851-870, https://doi.org/10.1530/EJE-10-1147.
[2] G. Mansmann, J. Lau, E. Balk, M. Rothberg, Y. Miyachi, S.R. Bornstein, The clinically inapparent adrenal mass: update in diagnosis and management, Endocr. Rev. 25 (2) (2004) 309-340, https://doi.org/10.1210/er.2002-0031.
[3] A. Ebbehoj, D. Li, R.J. Kaur, et al., Epidemiology of adrenal tumours in Olmsted County, Minnesota, USA: a population-based cohort study, Lancet Diab. Endocrinol. 8 (11) (2020) 894-902, https://doi.org/10.1016/S2213-8587(20)30314-4.
[4] M. Sherlock, A. Scarsbrook, A. Abbas, et al., Adrenal incidentaloma, Endocr. Rev. 41 (6) (2020), https://doi.org/10.1210/endrev/bnaa008.
[5] W.F. Young Jr., Clinical practice. The incidentally discovered adrenal mass, N. Engl. J. Med. 356 (6) (2007) 601-610, https://doi.org/10.1056/NEJMcp065470.
[6] M. Fassnacht, W. Arlt, I. Bancos, et al., Management of adrenal incidentalomas: European Society of Endocrinology Clinical Practice Guideline in collaboration with the European Network for the Study of Adrenal Tumors, Eur. J. Endocrinol. 175 (2) (2016) G1-G34, https://doi.org/10.1530/EJE-16-0467.
[7] T.M. Kerkhofs, R.H. Verhoeven, J.M. Van der Zwan, et al., Adrenocortical carcinoma: a population-based study on incidence and survival in the Netherlands since 1993, Eur. J. Cancer 49 (11) (2013) 2579-2586, https://doi.org/10.1016/j.ejca.2013.02.034.
[8] M. Fassnacht, O.M. Dekkers, T. Else, et al., European Society of Endocrinology Clinical Practice Guidelines on the management of adrenocortical carcinoma in adults, in collaboration with the European Network for the Study of Adrenal Tumors, Eur. J. Endocrinol. 179 (4) (2018) G1-G46, https://doi.org/10.1530/EJE-18-0608.
[9] M. Fassnacht, G. Assie, E. Baudin, et al., Adrenocortical carcinomas and malignant phaeochromocytomas: ESMO-EURACAN Clinical Practice Guidelines for diagnosis, treatment and follow-up, Ann. Oncol. 31 (11) (2020) 1476-1490, https://doi.org/10.1016/j.annonc.2020.08.2099.
[10] K.I. Makris, D.L. Clark, A.W. Buffie, E.H. Steen, D.J. Ramsey, H. Singh, Missed opportunities to promptly diagnose and treat adrenal tumors, J. Surg. Res. 276 (2022) 174-181, https://doi.org/10.1016/j.jss.2022.02.049.
[11] J. Dinnes, I. Bancos, L. Ferrante di Ruffano, et al., Management of endocrine disease: Imaging for the diagnosis of malignancy in incidentally discovered adrenal masses: a systematic review and meta-analysis, Eur. J. Endocrinol. 175 (2) (2016) R51-R64, https://doi.org/10.1530/EJE-16-0461.
[12] I. Bancos, A.E. Taylor, V. Chortis, et al., Urine steroid metabolomics for the differential diagnosis of adrenal incidentalomas in the EURINE-ACT study: a prospective test validation study, Lancet Diab. Endocrinol. 8 (9) (2020) 773-781, https://doi.org/10.1016/S2213-8587(20)30218-7.
[13] W. Schloetelburg, I. Ebert, B. Petritsch, et al., Adrenal wash-out CT: moderate diagnostic value in distinguishing benign from malignant adrenal masses, Eur. J. Endocrinol. 186 (2) (2021) 183-193, https://doi.org/10.1530/EJE-21-0650.
[14] M. Marty, D. Gaye, P. Perez, et al., Diagnostic accuracy of computed tomography to identify adenomas among adrenal incidentalomas in an endocrinological population, Eur. J. Endocrinol. 178 (5) (2018) 439-446.
[15] X. He, E.M. Caoili, A.M. Avram, B.S. Miller, T. Else, 18F-FDG-PET/CT evaluation of indeterminate adrenal masses in noncancer patients, J. Clin. Endocrinol. Metab. 106 (5) (2021) 1448-1459, https://doi.org/10.1210/clinem/dgab005.
[16] M. Araujo-Castro, P. Valderrabano, H.F. Escobar-Morreale, F.A. Hanzu, G. Casals, Urine steroid profile as a new promising tool for the evaluation of adrenal tumors. Literature review, Endocrine 72 (1) (2021) 40-48, https://doi.org/10.1007/s12020-020-02544-6.
[17] W. Arlt, M. Biehl, A.E. Taylor, et al., Urine steroid metabolomics as a biomarker tool for detecting malignancy in adrenal tumors, J. Clin. Endocrinol. Metab. 96 (12) (2011) 3775-3784, https://doi.org/10.1210/jc.2011-1565.
[18] S.C. Tiu, A.O. Chan, N.F. Taylor, et al., Use of urinary steroid profiling for diagnosing and monitoring adrenocortical tumours, Hong Kong Med. J. 15 (6) (2009) 463-470.
[19] T.M. Kerkhofs, M.N. Kerstens, I.P. Kema, T.P. Willems, H.R. Haak, Diagnostic value of urinary steroid profiling in the evaluation of adrenal tumors, Horm. Cancer 6 (4) (2015) 168-175, https://doi.org/10.1007/s12672-015-0224-3.
[20] L.I. Velikanova, Z.R. Shafigullina, A.A. Lisitsin, et al., Different types of urinary steroid profiling obtained by high-performance liquid chromatography and gas chromatography-mass spectrometry in patients with adrenocortical carcinoma, Horm. Cancer 7 (5-6) (2016) 327-335, https://doi.org/10.1007/s12672-016-0267-0.
[21] S. Grondal, B. Eriksson, L. Hagenas, S. Werner, T. Curstedt, Steroid profile in urine: a useful tool in the diagnosis and follow up of adrenocortical carcinoma, Acta Endocrinol. (Copenh) 122 (5) (1990) 656-663, https://doi.org/10.1530/acta.0.1220656.
[22] Z.R. Shafigullina, L.I. Velikanova, N.V. Vorokhobina, et al., Urinary steroid profiling by gas chromatography mass spectrometry: Early features of malignancy in patients with adrenal incidentalomas, Steroids 135 (2018) 31-35, https://doi.org/10.1016/j.steroids.2018.04.006.
[23] J.M. Hines, I. Bancos, C. Bancos, et al., High-resolution, accurate-mass (HRAM) mass spectrometry urine steroid profiling in the diagnosis of adrenal disorders, Clin. Chem. 63 (12) (2017) 1824-1835, https://doi.org/10.1373/clinchem.2017.271106.
[24] S.J. Mann, L.M. Gerber, Addressing the problem of inaccuracy of measured 24-hour urine collections due to incomplete collection, J. Clin. Hypertens. (Greenwich) 21 (11) (2019) 1626-1634, https://doi.org/10.1111/jch.13696.
[25] L. Christopher-Stine, M. Petri, B.C. Astor, D. Fine, Urine protein-to-creatinine ratio is a reliable measure of proteinuria in lupus nephritis, J. Rheumatol. 31 (8) (2004) 1557-1559.
[26] N.M. Buderer, Statistical methodology: I. Incorporating the prevalence of disease into the sample size calculation for sensitivity and specificity, Acad. Emerg. Med. 3 (9) (1996) 895-900, https://doi.org/10.1111/j.1553-2712.1996.tb03538.x.
[27] F.V. d’Amuri, U. Maestroni, F. Pagnini, et al., Magnetic resonance imaging of adrenal gland: state of the art, Gland. Surg. 8 (Suppl 3) (2019) S223-S232, https://doi.org/10.21037/gs.2019.06.02.
[28] J.J. Park, B.K. Park, C.K. Kim, Adrenal imaging for adenoma characterization: imaging features, diagnostic accuracies and differential diagnoses, Br. J. Radiol. 89 (1062) (2016) 20151018, https://doi.org/10.1259/bjr.20151018.
[29] N. Vogg, T. Müller, A. Floren, et al., Targeted metabolic profiling of urinary steroids with a focus on analytical accuracy and sample stability, J. Mass Spectrom. Adv. Clin. Lab. 25 (2022) 44-52, https://doi.org/10.1016/j.jmsacl.2022.07.006.
[30] R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2020.
[31] Y. Benjamini, Y. Hochberg, Controlling the false discovery rate: a practical and powerful approach to multiple testing, J. R. Stat. Soc. Series B Stat. Methodol. 57 (1) (1995) 289-300.
[32] T. Hothorn, A. Zeileis, partykit: a modular toolkit for recursive partytioning in R, J. Mach. Learn. Res. 16 (1) (2015) 3905-3909.
[33] T. Hothorn, K. Hornik, A. Zeileis, Unbiased recursive partitioning: a conditional inference framework, J. Comput. Graph. Stat. 15 (3) (2006) 651-674.
[34] S. Minowada, K. Kinoshita, M. Hara, K. Isurugi, T. Uchikawa, T. Niijima, Measurement of urinary steroid profile in patients with adrenal tumor as a screening method for carcinoma, Endocrinol. Jpn. 32 (1) (1985) 29-37, https://doi.org/10.1507/endocrj1954.32.29.
[35] F. Mantero, M. Terzolo, G. Arnaldi, et al., A survey on adrenal incidentaloma in Italy. Study group on adrenal tumors of the Italian society of endocrinology, J. Clin. Endocrinol. Metab. 85 (2) (2000) 637-644, https://doi.org/10.1210/jcem.85.2.6372.
[36] N. Ballian, J.T. Adler, R.S. Sippel, H. Chen, Revisiting adrenal mass size as an indication for adrenalectomy, J. Surg. Res. 156 (1) (2009) 16-20.
[37] C.C. Barnett Jr., D.G. Varma, A.K. El-Naggar, et al., Limitations of size as a criterion in the evaluation of adrenal tumors, Surgery 128 (6) (2000) 973-982, discussion 982-983, https://doi.org/10.1067/msy.2000.110237.
[38] A. Bileck, S. Frei, B. Vogt, M. Groessl, Urinary steroid profiles: comparison of spot and 24-hour collections, J. Steroid Biochem. Mol. Biol. 200 (2020), 105662, https://doi.org/10.1016/j.jsbmb.2020.105662.
[39] Y. Jing, J. Hu, R. Luo, et al., Prevalence and characteristics of adrenal tumors in an unselected screening population: a cross-sectional study, Ann. Intern. Med. 175 (10) (2022) 1383-1391, https://doi.org/10.7326/M22-1619.