AGENCY . cal

VIN3WNOSIANS . UA MENTAL PROTECTION

EPA Public Access Author manuscript Comput Toxicol. Author manuscript; available in PMC 2023 November 01.

About author manuscripts

Submit a manuscript

Published in final edited form as: Comput Toxicol. 2022 November 01; 24: 1-23. doi:10.1016/j.comtox.2022.100245.

Evaluating structure-based activity in a high-throughput assay for steroid biosynthesis

MJ Foster1,2, G Patlewicz1, I Shah1, DE Haggard1, RS Judson1, K Paul Friedman1

1Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, 27711, USA

2National Student Services Contractor, Oak Ridge Associated Universities

Abstract

Data from a high-throughput human adrenocortical carcinoma assay (HT-H295R) for steroid hormone biosynthesis are available for >2000 chemicals in single concentration and 654 chemicals in multi-concentration (mc). Previously, a metric describing the effect size of a chemical on the biosynthesis of 11 hormones was derived using mc data referred to as the maximum mean Mahalanobis distance (maxmMd). However, mc HT-H295R assay data remain unavailable for many chemicals. This work leverages existing HT-H295R assay data by constructing structure- activity relationships to make predictions for data-poor chemicals, including: (1) identification of individual structural descriptors, known as ToxPrint chemotypes, associated with increased odds of affecting estrogen or androgen synthesis; (2) a random forest (RF) classifier using physicochemical property descriptors to predict HT-H295R maxmMd binary (positive or negative) outcomes; and, (3) a local approach to predict maxmMd binary outcomes using nearest neighbors (NNs) based on two types of chemical fingerprints (chemotype or Morgan). Individual chemotypes demonstrated high specificity (85-98%) for modulators of estrogen and androgen synthesis but with low sensitivity. The best RF model for maxmMd classification included 13 predicted physicochemical descriptors, yielding a balanced accuracy (BA) of 71% with only modest improvement when hundreds of structural features were added. The best two NN models for binary maxmMd prediction demonstrated BAs of 85 and 81% using chemotype and Morgan fingerprints, respectively. Using an external test set of 6302 chemicals (lacking HT-H295R data), 1241 were identified as putative estrogen and androgen modulators. Combined results across the three classification models (global RF model and two local NN models) predict that 1033 of the 6302 chemicals would be more likely to affect HT-H295R bioactivity. Together, these in silico approaches can efficiently prioritize thousands of untested chemicals for screening to further evaluate their effects on steroid biosynthesis.

Keywords

Steroidogenesis; structure-activity relationships; read-across; chemotypes; in silico screening

Introduction

Screening for endocrine bioactivity to address the regulatory data needs of different agencies throughout the world requires rapid methods to address the thousands of chemicals subject to endocrine screening requirements (ECCC/HC, 2016; EFSA, 2018; EPA, 2011). In the US, the number of substances of regulatory interest for endocrine bioactivity screening may approach as many as 10,000 (EPA, 2012). With the success of high throughput screening (HTS) assays for endocrine bioactivity, large amounts of assay data that may be relevant to regulatory toxicology are readily accessible for hundreds to thousands of chemicals via programs including the US EPA Toxicity ForeCaster or ToxCast (Dix et al., 2007; Kavlock et al., 2012), and Tox21 (Thomas et al., 2018). The data are publicly disseminated through the EPA CompTox Chemicals Dashboard1 (Williams et al., 2017) and ICE Tool (Bell et al., 2021). The ToxCast and Tox21 programs also inform a larger international effort to advocate for the use of new approach methodologies (NAMs) in safety assessment (Commission, 2007; ECCC/HC, 2016; ECHA, 2016; ECHA, 2017; EPA, 2021; Lautenberg, 2016; Sakuratani et al., 2018) and make progress in reducing the use of vertebrates in regulatory toxicology (EPA, 2021; Wheeler, 2019). The ToxCast and Tox21 programs include HTS assays to evaluate molecular initiating events and key events relevant to endocrine bioactivity, including the estrogen and androgen receptors, steroid biosynthesis, and thyroid hormone homeostasis (Judson et al., 2018). Moving forward, a tiered strategy (Thomas et al., 2019) will be needed that employs not only in vitro bioactivity screening, but also models to integrate and interpret these screening data, as well as structure-informed models to extend what is learned from in vitro screening to chemicals lacking screening data using in silico approaches (such as (Quantitative) Structure-Activity Relationships (Q)SARs and read-across).

Indeed, the approach of in vitro screening for bioactivity, integration of assay data into a model, and extension of these results on the basis of structure to data-poor chemicals has been demonstrated for estrogen and androgen receptor activity. ToxCast and Tox21 HTS in vitro data have been used previously in computational models of estrogen and androgen receptor agonist and antagonist activity (Browne et al., 2015; Judson et al., 2020; Judson et al., 2015; Kleinstreuer et al., 2017; Watt and Judson, 2018) and a statistical model of steroidogenesis activity (Haggard et al., 2018; Haggard et al., 2019). The first of these models, the ToxCast ER pathway model, developed using the data from 1811 substances in 18 distinct in vitro ER assays, has been accepted for use as part of an integrated approach to testing and assessment (IATA) on the basis of performance evaluation using in vitro and in vivo reference chemicals (ICCVAM, 2011; Kleinstreuer et al., 2016). Acceptance of the ToxCast ER pathway model in agonist mode as a fit-for-purpose alternative to the three estrogen receptor-related assays in the EDSP Tier 1 screening battery:

1Comp Tox Chemicals Dashboard: https://comptox.epa.gov/dashboard/

the in vitro ER transactivation (OPPTS 890.1300, OECD TG455), in vitro ER binding (OPPTS 890.1250 OECD TG493), and in vivo uterotrophic EDSP Tier 1 screening assays (OPPTS 890.1600/OECD TG440) (EPA, 2015), has proceeded based on validation efforts and recommendations of two Scientific Advisory Panels (SAPs) (EPA, 2013; EPA, 2014) conducted under the authority of the Federal Insecticide, Fungicide, and Rodenticide Act (FIFRA). The ToxCast AR pathway model (Judson et al., 2020; Kleinstreuer et al., 2017) has followed a similar trajectory, with validation efforts using in vitro and in vivo (Browne et al., 2018; Kleinstreuer et al., 2018) reference chemicals, and FIFRA SAPs to evaluate its fit-for-purpose utility (EPA, 2014; EPA, 2017). The latest iteration of the full ToxCast AR pathway model is comprised of 14 AR-related assays in ToxCast/Tox21 used to screen 1820 chemicals, and with minimal assay subsets, may extend to provide antagonist predictions for another 1000 chemicals (Judson et al., 2020). Given that thousands of chemicals may be of interest for ER and AR activity screening, large efforts to create consensus QSAR models for ER (the collaborative estrogen receptor activity prediction project, known as CERAPP) and AR (the collaborative modeling project for androgen receptor activity, known as COMPARA) have been developed to provide in silico predictions based on chemical structure as input (Mansouri et al., 2016; Mansouri et al., 2020). The approximately 1800 substances with full ToxCast ER and AR model datasets form the basis of a training set for the consensus CERAPP and COMPARA binding, agonist, and antagonist models.

Similarly, in this work, the feasibility of developing in silico models to identify chemicals that may perturb steroid biosynthesis in the high-throughput implementation of the H295R assay (HT-H295R) in ToxCast (Haggard et al., 2018; Karmaus et al., 2016) is examined. Perturbation of steroid biosynthesis in the HT-H295R system can be summarized using the previously published statistical model that integrates 11 steroid hormone measurements based on the Mahalanobis distance or on fold changes in the steroid hormones themselves. The Mahalanobis distance-based evaluation of the ToxCast HT-H295R assay in concentration-response quantitatively and qualitatively distinguishes active substances in the HT-H295R assay using a single metric that integrates all 11 hormone measures (Haggard et al., 2018; Haggard et al., 2019). This statistical model was reviewed by a FIFRA SAP in 2017 (EPA, 2017). More recently, the robustness and reproducibility of the Mahalanobis distance-based analysis were evaluated (Haggard et al., 2019). In Haggard et al. (2019), we observed consistent covariation between the hormones measured, sufficient power of the mean Mahalanobis distance to capture 1.5 to 2-fold changes in hormones and hormone combinations, and positive behavior of reference aromatase inhibitors. However, just like the ToxCast ER and AR models, the Mahalanobis distance-based approach is limited to the substances screened in vitro (2012 total in single and multi-concentration combined, 654 chemicals in multi-concentration only), which leaves thousands of substances with no data or prediction regarding in vitro steroid biosynthesis disruption.

In vitro and in silico chemical screening for possible disruption of steroid biosynthesis present some unique biological and practical challenges that necessitate a different approach than the strategies previously employed for the ToxCast ER and AR models. The first biological challenge for modeling steroidogenesis disrupting activity is that multiple mechanisms of steroidogenesis perturbation may occur within the H295R cell line, which expresses several steroid hormone receptors (including AR, ER, and the progesterone

receptor, PR) (de Cremoux et al., 2008; Montanaro et al., 2005; Robitaille et al., 2015), and a set of enzymes necessary for the synthesis of steroid hormones including progestogens, corticosteroids, androgens, and estrogens (Gazdar et al., 1990). Thus, the precise target or mechanism of steroid biosynthesis disruption is obscured in the H295R model. The use of H295R adrenocortical carcinoma cells in the HT-H295R assay (Karmaus et al., 2016) was based on the US EPA Endocrine Disruptor Screening Program (EDSP) and the Organization of Economic Cooperation and Development (OECD) testing guidelines for an in vitro steroidogenesis assay using the H295R cell line to measure chemical-induced perturbations of testosterone or estradiol synthesis (Hecker et al., 2011; OECD, 2011; USEPA, 2009). Despite the screening utility of this cell line, delineating a precise mechanism based on changes in steroid hormone levels remains elusive. Indeed, previous work demonstrated that 608 of the 766 unique chemical samples screened in multi-concentration produced unique patterns, with increased, decreased, or no change in each of the 11 hormones as possible outcomes (113 possible patterns representing different combinations of activities). Due to the use of a tiered screening design, of the 654 unique chemicals with a maxmMd based on multi-concentration screening, 596 were positive. Only 10 chemicals affected only estrogen and/or androgen synthesis, and 67 chemicals affected only progestogen and/or corticosteroid synthesis. A possible conclusion from these findings was that most of the chemicals screened affected multiple steroid hormone classes and that using all 11 hormones in aggregate, as defined by the mean Mahalanobis distance at each concentration, to evaluate the magnitude and potency of changes across the pathway appeared useful. An additional practical difference between the HT-H295R model and the ToxCast ER and AR pathway models is the availability of orthogonal and confirmatory assay data. For the ToxCast ER and AR pathway models, the high balanced accuracy of the model prediction likely related, at least in part, to the existence of assays that targeted specific sequential events within the activation or inactivation pathways for these receptors, enabling detection of signal less hampered by sources of assay interference. In contrast, the HT-H295R model for steroidogenesis encompasses many mechanisms and pathways. Many molecular initiating events and key events are present in tandem, and the system is dynamic, i.e. changing in response to time (Breen et al., 2010; Saito et al., 2016). Nonetheless, the HT-H295R assay may be useful in identifying substances in a first screen that can perturb the H295R steroid biosynthesis system. Thus, in silico prediction of in vitro bioactivity in the HT-H295R assay represents a promising prioritization step for identifying chemicals for further in vitro evaluation for effects on steroidogenesis.

Our objective in this work was to infer the putative activity of thousands of untested chemicals through predictive in silico models of in vitro bioactivity in HTS assays based on chemical descriptors. Therefore we developed a multi-strategy approach to use the HT- H295R assay data in ToxCast to construct structure-activity relationships (SARs) to make binary predictions of steroid biosynthesis disrupting activity. This approach included: (1) as part of an initial data exploration exercise, identification of individual structural descriptors, known as chemotypes (also referred to as ToxPrints) (Yang et al., 2015) associated with increased odds of affecting estrogen or androgen synthesis; (2) a “global” approach employing random forest (RF) classification for binary positive or negative effects in the HT-H295R assay, with structural descriptors and physicochemical property predictions; and,

(3) a “local” approach to classify binary positive or negative chemicals in the HT-H295R assay using nearest neighbors (NNs) characterized by chemotype and Morgan chemical fingerprints. “Local” makes reference to more homogenous groups of chemicals to inform SARs (Shah et al., 2016), whereas “global” makes reference to the use of descriptors from a heterogenous group of chemicals to which a machine learning approach such as random forest could be applied. An external set of chemicals for predictions comprising 6302 chemicals from the EDSP Universe of Chemicals (EDSPUOC), lacking any HT-H295R bioactivity data, were translated as chemical fingerprints and all three approaches (individual features associated with estrogen and androgen perturbation, global RF model for HT- H295R bioactivity, and local NN models for HT-H295R bioactivity) were used to output predictions. The presence of individual features associated with estrogen and/or androgen perturbation serves as a preliminary structural alert, and the results from the global and local models are integrated into a heuristic score to reflect the degree of model agreement, and thus, the overall confidence in the ability of a chemical to perturb the HT-H295R 11-hormone system. Importantly, this work provides the basis for a data-informed selection of chemicals for any further in vitro bioactivity screening for steroid biosynthesis disruption.

Methods

Overview of approach

In this work, we attempted to develop SARs for two different kinds of bioactivity in the HT-H295R assay, i.e. effects on estrogen or androgen synthesis specifically or effects on the synthesis of any of the 11 hormones in the pathway,. We used three different approaches for SARs for effects on the synthesis of the 11 hormones in the pathway: (1) identification of individual chemical structural features associated with increased chances of positive bioactivity; (2) a global machine learning approach involving random forest (RF) classification of positive (active) or negative (inactive) outcomes in the HT-H295R assay using both chemical fingerprint and physicochemical descriptors; and, (3) a local approach defining neighborhoods of structurally-related chemicals based on chemical fingerprints (chemotype or Morgan) with prediction of the HT-H295R assay outcome based on the similarity weighted average of those neighbors (Shah et al., 2016) (see Figure 1 to accompany this overview). Both the global and local approaches employ a classification approach to predicting positive (active) or negative (inactive) chemicals in the HT-H295R assay. The combination of these approaches was necessary to flag chemicals most likely to perturb estrogen and androgen synthesis in HT-H295R, identify chemicals with properties similar to those that affected HT-H295R bioactivity, and identify chemicals for which there is sufficient similarity to chemicals with existing HT-H295R data such that a prediction of HT-H295R activity can be made with greater confidence. A single approach was insufficient to address all three critical areas for prioritization, data gap filling, and data-informed selection of chemicals that might be best screened in assay technologies that evaluate chemical-mediated disruption of steroid biosynthesis. The aggregate results provide the most informative view for data-poor chemicals that do not have steroidogenesis screening data available.

It is important to understand the two types of bioactivity we attempted to model. First, the synthesis of individual hormones in the HT-H295R assay was analyzed as fold-change from vehicle control in Haggard et al. (2018) compared to the OECD test guideline using H295R cells. For the work herein, we included chemicals that resulted in ± 1.5-fold changes (Hecker et al., 2011) (based on in 17ß-estradiol and/or estrone synthesis in HT-H295R as chemicals that caused increases or decreases in estrogens (“Estrogen up” and “Estrogen dn”). Similarly, chemicals that resulted in ± 1.5-fold changes in androstenedione and/or testosterone were included as positives for “Androgen up” and/or “Androgen dn.” It should be noted that in previous work, we observed strong correlation of the covariances for estradiol and estrone (0.75) as well as androstenedione and testosterone (0.66) (Haggard et al., 2018), such that combining these hormone outputs into “estrogens” and “androgens” respectively is a logical approach. The other type of bioactivity we attempted to model was a metric that summarizes chemical effect size across the whole 11-hormone pathway: the maximum mean Mahalanobis distance (maxmMd), derived in Haggard et al. (2018). Here for classification modeling, we binarized the maxmMd as positive/active or negative/inactive based on the maxmMd being above or below the critical value (defined in detail below).

The performance of each approach to characterizing the SAR for chemicals in the HT- H295R assay was evaluated several ways. For the individual feature enrichment evaluation, we used a Fisher’s test and a minimum odds ratio of 3 to demonstrate enrichment for each chemotype. The sensitivity, specificity, and balanced accuracy of using small sets of enriched chemotypes for estrogen or androgen-related bioactivity to predict perturbation of estrogens or androgens (with the threshold for a change at ±1.5-fold change) in the HT-H295R assay was evaluated for the training set. For the global and local modeling approaches, performance evaluation involved a leave-one-out cross-validation. For the global approach, additional elements of performance evaluation included recursive feature elimination and feature importance to facilitate interpretation of the descriptors in the resultant models.

Finally, following the steps described in detail below in the Methods and Results, a preliminary structure alert for individual chemotype features associated with estrogen and/or androgen perturbation, the best RF model for HT-H295R maxmMd response, and the NN models (chemotypes and Morgan fingerprints) for HT-H295R maxmMd response were applied to an external test set of chemicals comprised of 6302 chemicals of known structure on the EDSPUOC list available on the CompTox Chemicals Dashboard. The presence of specific chemotypes associated with estrogen and/or androgen perturbation functions as a preliminary structural alert. For the global and local approaches, a model agreement score was developed using the model outputs as heuristics. An optimal model agreement score between the RF and NN models was demonstrated using a leave-one-out cross-validation for the training set of chemicals with known binary HT-H295R bioactivity outcomes.

Chemical descriptors

All chemical descriptors were related to chemicals as defined by their Distributed Structure- Searchable Toxicity (DSSTox) identifiers, or DTXSIDs (Grulke et al., 2019). The complete library of DTXSIDs is approaching one million unique substances and is fully available

Comput Toxicol. Author manuscript; available in PMC 2023 November 01.

for download at the CompTox Chemicals Dashboard (Williams et al., 2017). The CompTox Chemicals Dashboard (version 3.5, 2020) was also the source of SMILES (Simplified Molecular Input Line Entry System) strings linked to DTXSIDs.

Bioactivity data

All of the data used in this study are publicly available (EPA, 2020; Haggard et al., 2018). As summarized in Figure 1A, there were two main designations of bioactivity data used in this work: effects on estrogens or androgens and effects on HT-H295R activity overall, represented by a binarized maxmMd. The HT-H295R assay itself has been previously described in detail (Karmaus et al. 2016, Haggard et al. 2018, USEPA 2017), and the data previously collected using the ToxCast HT-H295R assay as analyzed in Haggard et al. 2018 were used in this work, both for HT-H295R bioactivity (represented by positive or negative maxmMd) and for significant effects on estrogen and/or androgen biosynthesis (±1.5 fold-change).

Due to the tiered design of the HT-H295R assay screening, only 654 chemicals out of 2012 unique chemicals screened had multi-concentration data available for calculation of the maxmMd, and the majority of these 654 chemicals were positive. However, single concentration data exist for the remainder of the chemicals (using invitrodb version 3.3) (EPA, 2020). The total training set for HT-H295R activity was 1400 chemicals (555 positive and 845 negative, Figure 1A). To bolster the number of negative (inactive) chemicals in our training set for global and local classification models for HT-H295R bioactivity, we used chemicals that, when screened using a high single concentration (most often 100 µM, with some chemicals screened at lower concentrations to avoid extreme cytotoxicity or due to solubility limitations), had no effect on all 11 steroid hormones. After identifying the chemicals that failed to affect all 11 steroid hormones in single concentration, combining these with chemicals with a maxmMd value from multi-concentration screening, and filtering to keep chemicals with physicochemical and environmental fate predictions and structural descriptors (chemotypes and Morgan fingerprints), a training set of 1400 chemicals could be constructed. Of these 1400, 845 are negative (inactive) and 555 are positive (active) based on maxmMd (Figure 1A). In total, 599 chemicals from multi- concentration screening (555 positives and 44 negatives) and 801 chemicals from single concentration screening, added as inferred negatives, comprise the 1400 chemical training set. The entire training dataset, based on the bioactivity data from HT-H295R, are available as Supplemental File 2, which contains all input data for this work.

Chemicals that demonstrated ± 1.5-fold changes in 170-estradiol (E2) and/or estrone (E1) synthesis in HT-H295R were classified as chemicals that caused increases or decreases in estrogens (“Estrogen up” and “Estrogen dn”). Similarly, chemicals that resulted in ± 1.5-fold changes in androstenedione and/or testosterone were included as positives for “Androgen up” and/or “Androgen dn.” The number of chemicals that increased or decreased estrogens or androgens by 1.5-fold are reported in Figure 1A (304, 143, 84, 267 chemicals were positive for estrogen up, estrogen dn, androgen up, and androgen dn, respectively). Only the 642 chemicals that could be described by chemotypes and were associated with multi-

concentration screening were used to compile the training set for effects on estrogen and androgen synthesis (Figure 1A).

For the global and local models, the maxmMd values from Haggard et al. (2018) analysis were used to define the “positive” or “negative” chemicals for overall steroidogenesis perturbation in the HT-H295R assay, resulting in a total training set of 1400 chemicals (Figure 1A). The maxmMd represents the maximum of the series of mean Mahalanobis distance (mMd) calculated at each concentration of screened chemicals, and has been shown to be a reproducible and sensitive representation of the magnitude of the effect size in the 11-hormone system, as well as being generally inversely proportional to potency (Haggard et al., 2018; Haggard et al., 2019). The mMd quantifies the magnitude of the difference in all 11 hormones compared to the vehicle control response. A Mahalanobis distanced-based approach was used due to the covariances of residuals of the hormones evaluated in the HT-H295R assay; Mahalanobis distance corrected for correlation and covariance in multi- variate data by including the inverse of the covariance matrix of all of the multivariate measures in the calculation of the distance values. The calculation presented previously in Haggard et al. (2018, 2019) for the mean Mahalanobis distance at each concentration can be understood conceptually as rotating and scaling the hormone multi-concentration response data to a set of new variables that are not correlated with one another and have the same standard deviation, followed by calculation of the Euclidean distance in this new scaled space (Haggard et al., 2018; Haggard et al., 2019). Herein, we used the published maxmMd values. Chemicals with maxmMd values greater than the critical value (the statistical threshold for a significant mean Mahalanobis distance) were classified as being positive (or active) in the HT-H295R assay. When the maxmMd was less than the critical value, the chemical was classified as being negative (or inactive) in the HT-H295R assay. The critical value was computed using a multiple comparisons procedure and approximates a nominal Type I error rate of 0.01, as previously published (Haggard et al., 2019).

Structure descriptors

Chemotypes-The public ToxPrint Chemotype structural feature set was developed by Altamira (Altamira, Columbus, OH USA) and Molecular Networks (Molecular Networks, Erlangen, GmbH) under contract from the U.S. Food and Drug Administration (Yang et al., 2015). Chemotypes are designed to capture a broad diversity of chemical atom, bond, chain and scaffold types, as well as represent chemical patterns and properties especially relevant to various toxicity and safety assessment concerns. Chemotypes consist of 729 features to allow for the representation of substances in binary fingerprints (1 for presence or 0 for absence of the chemotype). These fingerprints can be downloaded from the CompTox Chemicals Dashboard using the batch search function and a list of DTXSIDs. All of the chemotype fingerprints used are available in Supplemental File 2.

Morgan fingerprints-Two dimensional Morgan fingerprints (circular, ECFP6) (Rogers and Hahn, 2010) were developed for substructure and structure similarity searching in structure-activity modeling, and have been used previously in Generalized Read-across (GenRA) (Shah et al., 2016). We generated these 1024-bit fingerprints with R library redk

(Guha, 2007) (version 3.5.0, March 2020), using the function get.fingerprint() and the QSAR ready SMILES corresponding to the DTXSID of each chemical.

Predicted physicochemical and environmental fate properties

OPEn structure-activity/property Relationship App (OPERA) predictions of physicochemical and environmental fate properties (Mansouri et al., 2018) were obtained from the CompTox Chemicals Dashboard. These predictions include 13 descriptors: (1) atmospheric hydroxylation rate (AOH), (2) bioconcentration factor (BCF), (3) biodegradability half-life, (4) boiling point, (5) Henry’s Law constant, (6) fish biotransformation half-life (KM), (7) octanol: air partition coefficient (KOA), (8) soil adsorption constant (KOC), (9) octanol: water partition coefficient (logP), (10) melting point, (11) vapor pressure, (12) water solubility, and (13) average mass. Some of the predicted properties are not independent of one another, but feature reduction approaches were used in the global modeling approach to demonstrate the input feature importance.

Multi-strategy approach to predicting bioactivity

Individual feature enrichment-Individual structural feature enrichment, using chemotypes (Paul Friedman et al., 2020; Wang et al., 2019), was included as a component of exploratory data analysis for estrogen and androgen-related bioactivity as well as HT- H295R activity as defined by maxmMd (Figure 1B). The existence of enriched chemotypes provided some initial indication that SARs may be present in the bioactivity dataset. In the case of the estrogen and androgen-related bioactivity, the individual feature enrichment went on to inform a preliminary structure alert for identifying likely modulators of estrogen and/or androgen synthesis in HT-H295R. The specific steps for chemotype enrichment calculations are presented in the next two subsections.

Chemotype enrichment for HT-H295R activity

1. Chemicals with available maxmMd value (including the inferred negatives added from single concentration screening) to represent HT-H295R bioactivity and available chemotypes were identified, for a total of 1400 chemicals.

2. Chemotypes were removed from the analysis if they were not present in the chemical set identified in step one, resulting in 455 chemotypes represented in this dataset.

3. Two-by-two tables that classify positive and negative maxmMd as related to presence and absence of each chemotype for each chemical were constructed. A conceptual example of the 2×2 table for a given chemotype is displayed below.

	Chemotype presence	Chemotype absence
Positive HT-H295R maxmMd	355	200
Negative HT-H295R maxmMd	545	300

4. A Fisher’s test with a false discovery rate correction was used to determine the p- value for significance on enrichment of all 455 chemotypes; this hypergeometric test tells us if there was a significant increase in the odds of positive bioactivity. In tandem, odds ratios, and confidence intervals around these odds ratios, were computed based on the two-by-two tables. Any odds ratio with an upper confidence bound approaching infinity, having resulted from a cell value of zero (i.e., the chemotype was not associated with a single chemical associated with a positive maxmMd), were set to 0. Chemotypes were considered “enriched” if they had both significant p-values after using a false discovery rate correction (p<0.01) and a threshold of an odds ratio > 3 (Paul Friedman et al., 2020; Wang et al., 2019).

1. Chemicals with multi-concentration data and chemotype fingerprints were used, for a total of 642 chemicals. More chemicals from multi-concentration screening are included in this training set (642 rather than the 599 used for maxmMd modeling) because OPERA property predictions were not available for all 642 chemicals, and these predicted properties are not required for chemotype enrichment.

2. Chemotypes were removed from the analysis if they were not present in the chemical set identified in step five, resulting in 403 chemotypes represented in this dataset.

3. Using the data for individual hormone fold change (Haggard et al., 2018) (provided here in Supplemental File 2) with a cut-off of ±1.5-fold change was used to categorize four distinct modes: estrogen up, estrogen down (dn), androgen up, androgen dn, defined in the previous section describing bioactivity.

4. Two-by-two tables that classify positive and negative activity for each estrogen/ androgen mode as related to presence and absence of each chemotype for each chemical were constructed. A conceptual example of the 2×2 table for a given chemotype is displayed below.

	Chemotype presence	Chemotype absence
Positive	10	42
Estrogen up	10	42
Negative Estrogen up	190	400

5. Repeat step 4 from above for Fisher’s and odds ratio testing. For androgen up, significant chemotype enrichments were identified with a false discovery rate (FDR)-adjusted p-value < 0.01 and odds ratio >3. Due to the structural diversity of chemicals that were associated with fold changes greater than ± 1.5-fold in estrogens or androgens, an unadjusted p-value of p<0.01 and odds ratio > 3 were used for androgen dn, estrogen up, and estrogen dn. Use of a more stringent

set of criteria for androgen dn, estrogen up, and estrogen dn resulted in no significant enrichments.

6. A preliminary structure alert was developed based on the chemotypes enriched in the positive sets of chemicals for all four modes (estrogen up, estrogen dn, androgen up, and androgen dn), requiring at least one of the enriched chemotypes to classify a chemical as a positive in the particular mode. Performance of these preliminary structure alerts was evaluated, wherein a prediction for each of the 654 chemicals in the training set was made as positive (if at least one enriched chemotype for that mode was present) or negative (no enriched chemotypes for that mode present). The sensitivity, specificity, Kappa, balanced accuracy, and out of bag error (OOB) were calculated for each mode using the R library caret using the confusionMatrix() function.

Global modeling approach

A global modeling approach was pursued to use large sets of descriptors from a heterogenous group of chemicals, without the requirement for information about nearest chemical neighbors on the basis of structure. In this approach, there is no requirement for chemical fingerprint similarity; instead, predictions are inferred based on shared input features to the machine learning model. In the subsequent interpretation of the modeling outputs, the global approach becomes important for chemicals for which a local approach is uninformative due to lack of neighbors and/or lack of certainty in the bioactivity response for a neighborhood as well as for building confidence in local model predictions when available. In this global approach, random forest (RF) was used to train multiple models that predicted HT-H295R bioactivity using several different sets of input features. This multiple- model approach allowed investigation of the importance of specific input features and was computationally feasible because RF models are relatively quick and easy to tune and train. A performance comparison of the best RF model to a support vector machine modeling approach was also investigated to understand if other machine learning approaches may have yielded improved results (see Supplemental Results). The specific global modeling steps employed are described in detail below.

1. Identify chemicals screened in HT-H295R for which structural descriptors (chemotypes and Morgan fingerprints) and physicochemical (OPERA predictions) properties are available for use as model input features.

2. Train random forest models for prediction of HT-H295R bioactivity outcomes using the traincontrol() function in library(caret), using leave one out cross validation as the method, and setting metric to both ROC and Accuracy to output sensitivity and specificity, and accuracy and Kappa respectively. Input data are described below in 2a-e.

a. OPERA property predictions only

b. Chemotype fingerprints only

c. OPERA and chemotypes

d. Morgan fingerprints only

e. OPERA and Morgan fingerprints

3. Model performance was evaluated using leave one out cross-validation (LOOCV) to estimate sensitivity, specificity, accuracy, and Kappa of models 2(a)-2(e).

4. Feature importance for models 2(a)-2(e), based on permuted out-of-bag error for mean decreased accuracy, was evaluated using 100 iterations of the model to identify the most deterministic features.

5. A recursive feature elimination (RFE) analysis was also performed to determine if a smaller subset of the features from models in 2a-2e would yield optimal model performance. This was run using a custom for-loop built around a model from library(randomForest) which easily outputs the importance of model features, using 500 trees and the number of variables randomly sampled as candidates in each split, mtry, as the default value (square root of the number of features). This analysis ranked importance of features for each iteration, and then cut out the least important feature until only 1 feature is left. The performance of each model run in the RFE was evaluated using sensitivity and specificity of the training set using the confusionMatrix() function from the R library, caret.

6. The model with the highest balanced accuracy that also minimized the number of features used was selected. To determine the tuning on the final model using the most deterministic features, the best model accuracy over a range of ntree (total number of trees in the random forest model) from 500 to 1500 by increments of 100, and a variety of mtry variables (1, 2, 3, sqrt(13) = 3.6, 4, 5, 6, 7, 10 , 15, 20, 25, 30). We then calculated the balanced accuracy of each iteration using the same traincontrol() function in library(caret) using the method=“LOOCV”, as above. The balanced accuracies varied little when changing values of ntree and mtry, as confirmed by the standard deviation of the vector of balanced accuracy values, which were both smaller than a fraction of a percent (0.0029 and 0.0043 for ntree and mtry, respectively). Based on this small standard deviation, we used the default values (ntree = 500 and mtry = the square root of the number of features).

7. The best RF model, using OPERA predictions as input (2a, above) was compared to results from a support vector machines (SVM) model with a support vector machine (SVM) approach using library(caret) and the function traincontrol(), specifying LOOCV and function training using method = “svmLinear”, preProcess = c(“center”,“‘scale”) for preprocessing with center and scaling. However, performance was not improved with SVM (Supplemental File 7), and as such, only the RF approach is reported in the main text.

8. Additional global RF models were attempted using the same feature sets described in 2a-e above for classification of positive or negative effects on estrogen up, estrogen dn, androgen up, and androgen dn. Unfortunately these models were insufficiently sensitive and had balanced accuracies approaching random chance, regardless of the feature set used as input; as such we do not

report the results in the main text. The results of this failed modeling attempt are reported in Supplemental File 7, and subsequently the preliminary structure alert approach based on the presence of informative chemotypes was applied for prediction of chemicals that might specifically impact estrogen or androgen synthesis in HT-H295R.

Nearest neighbor (NN) modeling approach

The modeling approach is based on the chemical-biological read-across approaches used previously (Low et al., 2013; Shah et al., 2016) to implement an automated approach to assign HT-H295R activity prediction on the basis of the weighted aggregate activities of the k-nearest neighbors for which HT-H295R in vitro data are available. The quantitative methods were adapted from Shah et al. 2016, known as Generalised Read-across or GenRA, with addition of tuning on the number of NNs and required neighbor similarity to output predictions. Following this tuning, the optimal NN model using chemotypes and the optimal NN model using Morgan fingerprints were used for prediction (2 local NN models in total). The analysis steps are described in detail below.

1. Jaccard similarities were calculated between all 1400 chemicals with maxmMd values available based on their chemotype and Morgan fingerprints (each of these yields a 1400 x 1400 matrix of Jaccard similarity). Higher Jaccard similarity indicates a stronger relationship between chemicals. The Jaccard distance, ji, between two chemicals, Ci and Cj, was calculated using their “fingerprints” as a vector (x¡ and Xj), per the equation 1:

ji = Σι (xix;) ΣΙ(ΧΙ΄X11)

Eq (1)

2. For each chemical, n number of nearest neighbors with the highest Jaccard similarities were selected to create bioactivity predictions.

3. The resultant Jaccard distance matrices, one each for chemotype and Morgan fingerprints, were then used to derive HT-H295R activity predictions using the equation 2:

PK =

jik ΣjINXi Eq (2)

Where jik is the Jaccard similarity between chemical k (prediction chemical) and i (nearest neighbor chemical). x; represents the binary maxmMd outcome (1 or 0) for chemical i, representing the i of n nearest neighbors selected. Finally, px is the prediction for chemical k.

4. The predictions (px), which are proportional to the neighbors in the neighborhood that are positive, were translated into binary positive or negative predictions by creating appropriate upper and lower cutoffs that both maximize the balanced accuracy and maximize the number of chemicals with predictions.

5. Nearest neighbor model tuning used a LOOCV approach, creating a prediction for each target chemical in the training set while ignoring its existing data. Three parameters were adjusted to tune the nearest neighbor models: (1) the number of neighbors for each neighborhood; (2) the proportion of positive neighbors for positive predictions; and, (3) the proportion of negative neighbors for negative prediction. Therefore, LOOCV was performed for each combination of the number of neighbors, the upper cutoff to define positive neighborhoods, and the lower cutoff to define negative neighborhoods. The optimal number of neighbors for neighborhoods defined using chemotype or Morgan fingerprints were determined by testing 1, 2, 3, 4, 5, 7, 10, 15, 20, 25, 30, 35, 40 neighbors. For an upper cutoff to establish positive behavior for a neighborhood, proportions of 0.7, 0.8, and 0.9 were tested; similarly, the lower cutoff to establish negative behavior for a neighborhood was tested at 0.1 and 0.2. Testing all of these potential limits in combination required 180 model combinations to test with LOOCV to evaluate nearest neighbor model performance using chemotype and Morgan fingerprints. Setting highly stringent thresholds on the average activity would result in highly accurate predictions but only for a very limited number of chemicals, leaving many neighborhoods as “equivocal;” thus, an optimal model would have good balanced accuracy and include a high number of training chemicals for positive and negative forward prediction.

6. A number of additional NN modeling approaches were attempted, including the combination of binary structural fingerprint metrics and continuous metrics (i.e., OPERA physicochemical predictions) to define neighborhoods using Jaccard similarity and Euclidean distance-based approaches; using only Euclidean distance, based on OPERA physicochemical predictions, to define neighborhoods; and, the use of heterogeneous value difference metric (HVDM) distances to attempt combination of binary and continuous features in the NN modeling approach. However, none of these approaches led to NN models that approached the performance of the chemotype and Morgan Jaccard similarity-based models reported here. Therefore, the summary results from these unsuccessful modeling attempts are reported in Supplemental File 7 for interest.

Evaluating preliminary structure alerts and model agreement

No single modeling approach predicting HT-H295R activity was sufficient to address the data gaps and questions that one might ask when predicting chemical effects on steroid biosynthesis from this assay, necessitating a multi-step approach for making external predictions.

Estrogen and androgen preliminary structure alerts

First, we applied a positive prediction for estrogen up, estrogen dn, androgen up, and/or androgen dn when at least one of the enriched chemotypes per mode were present in the target chemical. This approach to identifying chemicals that may perturb estrogen or androgen synthesis specifically in HT-H295R serves as a preliminary structure alert for

Comput Toxicol. Author manuscript; available in PMC 2023 November 01.

these specific modes of bioactivity. Each of the 4 modes (estrogen and androgen, up and down) were evaluated for classification performance within the 642 chemical training set for estrogen and androgen related bioactivity (Figure 1A), with computation of sensitivity, specificity, balanced accuracy, and Kappa.

Model agreement score for the global and local approaches

Although the accuracy tends to be higher, the NN models require cutoffs to avoid making predictions for chemicals that are highly dissimilar from those screened and to avoid making predictions using neighborhoods with mixed results. This level of stringency yields equivocal results for chemicals without a reliable NN. This served as our motivation for creating a model agreement score that combines the results of the two best NN models (which provide structure-based predictions and demonstrated higher balanced accuracy values) with the results of the global RF model (which uses physicochemical and environmental fate predictions available for all chemicals, but perhaps with only moderate confidence). A higher model agreement score indicates a greater likelihood of a positive (active) response on the HT-H295R assay as represented by the maxmMd for given chemical. The details of the model agreement score calculation and performance evaluation are described here.

1. The model agreement score is essentially a heuristic, calculated as a sum of the binary outcomes of the three models. As the nearest neighbor models demonstrate higher balanced accuracy, they are weighted more heavily (0.25 higher than a positive in the global RF model).

2. Model agreement scores are inherently discrete, with the following possible scores: 0, 0.25, 0.5, 0.75, 1, 1.25, 1.5, 1.75, 2, 2.25, 2.75.

a. If positive:

i. Local NN model (chemotypes or Morgan fingerprints) = 1

ii. Global RF model = 0.75

b. If equivocal:

i. Nearest neighbor model = 0.5

ii. Random forest (OPERA predictions unavailable) = 0.25

c. If negative:

i. Nearest neighbor model = 0

ii. Random forest = 0

3. Performance evaluation of the model agreement scores was conducted using the 1400 training set chemicals with known HT-H295R maxmMd results. The classification sensitivity, specificity, accuracy, and Kappa for positive or negative maxmMd was evaluated for different minimum threshold levels of model agreement (0.75, 1, 1.25, 1.5, and 1.75). Additionally, the proportion of chemicals with each model agreement score that had a positive maxmMd value were also evaluated. These two elements of performance evaluation aid the user

in defining a threshold for model agreement to fit their specific use case, i.e. what the tolerance might be for false positives or false negatives.

External prediction set

The universe of chemicals potentially relevant for EDSP (EDSPUOC) prioritization and screening approaches 10,000 unique DTXSIDs, and is available for batch download on the CompTox Chemicals Dashboard (https://comptox.epa.gov/dashboard/chemical_lists/ EDSPUOC). Predictions from the multi-strategy modeling approach (including preliminary structure alerts, global RF modeling, and 2 local NN models, Figure 1C) were made for the chemicals on the EDSPUOC. Of the 9700 substances on this list, 7700 are chemicals easily represented by structural descriptors, i.e. chemotype or Morgan fingerprints. Removing chemicals that already have HT-H295R data yielded an external prediction set of 6302 chemicals. The estrogen and androgen preliminary structure alerts, global RF model for maxmMd, and local NN models for maxmMd were all applied to this external prediction set of chemicals, model agreement scores were calculated, and all of these results were output into Supplemental File 5.

Following generation of NN predictions for these chemicals, a greater understanding of the types of substances that corresponded to positive, negative, and/or equivocal NN model bioactivity predictions was pursued via taxonomic groupings from ClassyFire (Djoumbou Feunang et al., 2016), which are freely available via webserver (http:// classyfire.wishartlab.com/). Data were downloaded from the ClassyFire webserver as a json file, and this file was parsed to extract the hierarchical taxonomic terms, including kingdom, superclass, class, subclass, and substituent information for each substance. ClassyFire classes were assigned to chemicals to facilitate interpretation of which structural groupings of chemicals resulted in positive, negative, or equivocal behavior in the NN models.

Software

The work described herein was conducted in R software (version 4.0.2) and is available at GitHub (https://github.com/USEPA/CompTox-HTH295R-SAR ) and the US EPA Clowder repository (https://clowder.edap-cluster.com/datasets/62680e42e4b0e232136663ff), as well as in Supplemental File 1 as an html file including code and resultant output.

Results

Per the workflow summarized in Figure 1, herein we report the results of a multi-step approach to predicting bioactivity in the HT-H295R assay, including the results of: (1) as a precursor to any SAR, the individual structure descriptors, known as chemotypes, that were enriched among chemicals that affected estrogen or androgen synthesis specifically and among chemicals that affected the overall HT-H295R 11 hormone system, represented by a positive maxmMd; (2) a global model, employing RF machine learning and various sets of input descriptors, to predict positive or negative maxmMd values; and (3) a “local” model approach, based on the GenRA approach (Shah et al., 2016), employed to provide similarity-weighted activity prediction on the basis of the NNs, using chemotype or Morgan fingerprints. Finally, the results from combining the global and local model approaches,

using a model agreement score to summarize, is also reported. Together these results inform which chemicals are most likely to affect estrogen or androgen synthesis specifically, on the basis of individual chemotype features, and which chemicals are likely to affect the overall HT-H295R assay result based on a multi-strategy approach.

Training data available

Estrogen and androgen bioactivity-Training set data were compiled for all four modes of binary effect on estrogen or androgen biosynthesis in HT-H295R (estrogen up, estrogen dn, androgen up, androgen dn) as well as for effects on the maxmMd (Figure 1A). The training data available for estrogen and androgen synthesis all came from the multi-concentration screening data, represented as fold-change (Haggard et al., 2018), for a total of 642 unique chemicals with structure information. Positive rates were not balanced for all modes with respect to effects that exceeded 1.5-fold change, with positive rates of 47%, 22%, 13%, and 41.5% for estrogen up, estrogen dn, androgen up, and androgen dn, respectively. The imbalances present in the estrogen dn and androgen up bioactivity datasets, due to lack of active chemicals that cause clear changes (±1.5-fold), may have limited the success of any subsequent modeling attempts.

HT-H295R bioactivity-The bioactivity in HT-H295R for all 11 hormones was represented by maxmMd, and a positive maxmMd that exceeded a threshold cutoff for a positive based on a multiple comparisons procedure (described in Methods). Though single-concentration data were available for >2000 chemicals, only 654 were advanced to multi-concentration screening due to positive results in single-concentration screening, as part of a tiered screening design for resource efficiency. To ameliorate the effects of tiered screening on training set balance, i.e. multi-concentration screening of chemicals that were positive in a single concentration screening, “inferred” negatives that demonstrated no effects in single concentration screening were added to the multi-concentration data set for which maxmMd was previously calculated, creating a dataset with 555 positive chemicals and 845 negative chemicals (total of 1400 chemicals) on the basis of a binarized maxmMd value (Figure 1A). Of these 845 negative chemicals, 801 were “inferred.”

Individual feature enrichment

A chemotype enrichment analysis (Figure 1B) was performed initially to understand the feasibility of developing SARs for estrogen and androgen biosynthesis disruption as well as effects on maxmMd (representing an effect on HT-H295R across one or more of the 11 steroid hormones).

Estrogen and androgen bioactivity and feature enrichment-Chemotype enrichments were considered significant for androgen dn, estrogen dn, and estrogen up when the unadjusted p-value < 0.01 for the Fisher’s test and the odds ratio > 3. After correcting using FDR, estrogen up, estrogen dn, and androgen dn did not have any enriched chemotypes, so stringency was reduced by using an uncorrected p < 0.01 and an odds ratio ≥ 3. This resulted in 6, 4, and 1 enriched chemotypes for estrogen up, estrogen down, and androgen dn, respectively. Largely, different chemotypes were enriched by mode for chemicals with effects on estrogen and androgen synthesis

Comput Toxicol. Author manuscript; available in PMC 2023 November 01.

(Figure 2); only one chemotype (group.ligand_path_4_bidentate_aminoethanol, represented simply as O-C-C-N) was enriched for two modes (androgen dn and estrogen dn). For androgen dn, group.ligand_path_4_bidentate_aminoethanol was the only chemotype that met the significance criteria; this chemotype was present in 14 of the 267 chemicals that decreased androgens, including azole fungicides and anti-estrogenic drugs such as tamoxifen citrate and clomiphene citrate, the latter of which is also an androgen receptor antagonist (Judson et al., 2020; Judson et al., 2015). This chemotype group.ligand_path_4_bidentate_aminoethanol was also present in 10 of the 143 chemicals that decreased estrogens in the training set, including chemicals with known ability to decrease steroid synthesis, such as azole (triazole and imidazole) fungicides (Goetz et al., 2009; Hecker et al., 2006), as well as a member of the beta-blocker drug family, metoprolol (Gracia et al., 2007; Selvaraj et al., 2000). The other three chemotypes enriched for estrogen dn are highly related heterocyclic rings that also describe imidazole and triazole fungicides: ring.hetero _. 5 ._ N_imidazole, ring.hetero _. 5 ._ N_triazole_1_2_4, and ring.hetero _. 5 ._ N_triazole_1_3_4. This finding is unsurprising as heterocyclic ring structures with 5-6 atoms comprised of carbons and nitrogens, including triazines often used in production of herbicides and dyes, have been associated with decreased estrogen and/or androgen synthesis in HT-H295R and other models (Draskau et al., 2021; Goetz and Dix, 2009; Goetz et al., 2007; Goetz et al., 2009; Karmaus et al., 2016; Skolness et al., 2013).

Six chemotypes were enriched among the chemicals that increased estrogen synthesis in the HT-H295R assay: bond.CN_amine_sec.NH_alkyl, bond.CN_amine.NH_aromatic, bond.CN_amine.NH_aromatic_aliphatic, bond.P.O_phosphate_thioate, ring.hetero_6.N_triazine_1_3_5, and ring.hetero _. 6 ._ Z_1_3_5. The three amine chemotypes enriched among chemicals that increased estrogens are highly analogous structures that are common to triazine herbicides, such as atrazine, propazine, and simazine, which have been associated with induction of aromatase, the enzyme responsible for conversion of androgens to estrogens, in the H295R cell line (Sanderson et al., 2000). The chemotypes ring.hetero_6.N_triazine_1_3_5 and ring.hetero _. 6 ._ Z_1_3_5 are analogous heterocyclic rings also common to the triazines class. The chemotype bond.P.O_phosphate_thioate is common to organophosphate insecticides; ten organophosphates increased estrogen biosynthesis in the HT-H295R assay and contained this chemotype. It is important to note the odds ratios for a 1.5-fold change given the chemotype presence (x-axis in Figure 2). For estrogen up, the mean odds ratios were 3.1, 5.3, 3.1, 4.2, 11.4, and 16.2, with very wide upper confidence limits for chemotypes bond.P.O_phosphate_thioate and ring.hetero_6.N_triazine_1_3_5.

Four chemotypes were enriched among chemicals that increased androgens (Figure 2) based on a Fisher’s test p-value < 0.01 after adjusting for false discovery rate and an odds ratio ≥ 3, including ring: fused_steroid_generic_5_6_6_6 (Figure 2). As demonstrated by the enriched chemotypes, there was a strong relationship between steroid ring presence and effects on androgen up, among other enriched chemotypes (Figure 2, with odds ratios of 5.3, 3.7, 5.2, and 7.7 for chemotypes 1 (bond.CC .. O.C_ketone_aliphatic_generic), 6 (chain.alkaneCyclic_hexyl_C6), 7 (chain.alkaneCyclic_pentyl_C5), and 9 (ring:fused_steroid_generic_5_6_6_6) with respect to androgen up response).

Comput Toxicol. Author manuscript; available in PMC 2023 November 01.

The classification performance of using the presence of any enriched chemotype per mode, i.e. a preliminary structure alert for activity in estrogen up, estrogen dn, androgen up, and/or androgen dn, was evaluated using the training set of 642 chemicals for estrogen/androgen synthesis effects, and reported in Table 1. The sensitivity of these preliminary structure alerts to identifying chemicals that modulate estrogen or androgen biosynthesis in HT-H295R is extremely low across all modes (ranging from 3-12%), indicating that the enriched chemotypes fail to capture the structural diversity of chemicals that can modulate these specific bioactivities in HT-H295R. The balanced accuracies and Kappa values are also very low, indicating low predictivity for positive or negative effects on each mode. However, the specificity of this approach is very high, ranging from 93 to 98%, indicating that these preliminary structural feature alerts may flag a relatively limited set of chemical structures that appear highly associated with perturbance of estrogen or androgen biosynthesis in HT-H295R.

HT-H295R maxmMd and feature enrichment-The 10 chemotypes most associated with positive activity in the HT-H295R assay, as indicated by a maxmMd greater than the critical value, are listed in Table 2. Unsurprisingly, the chemotype corresponding to the highest odds ratio (43.1, 95% confidence interval ranging from 7, adjusted p-value < 3.8E-8) was for a steroid backbone (ring:fused_steroid_generic_[5_6_6_6]). Twenty-eight substances contain this chemotype, including steroid hormones themselves (i.e., 17alpha- hydroxyprogesterone, 5alpha-dihydrotestosterone, 4-androstene-3,17-dione, progesterone, 17alpha-ethinylestradiol, 17beta-estradiol, 17-methyltestosterone, estrone, corticosterone, estriol, dehydroepiandrosterone, androsterone) as well as steroid drugs intended to act within steroid hormone pathways. Examples of these types of steroid drugs include aldosterone receptor antagonists such as spironolactone; therapeutics for ER-responsive breast cancer such as fulvestrant; glucocorticoid drugs used to treat various types of inflammatory processes such as triamcinolone, dexamethasone sodium, and prednisone; contraception medications such as levonorgestrel, norethindrone, and lynestrenol; the anti-progestagen drug mifepristone, and androgenic drugs such as danazol and anti-androgen drugs such as cyproterone acetate. These drugs are likely to interact with the steroid hormone receptors that are present, act as substrates for the Phase I biotransformation enzymes present, and/or may be directly measured by the HPLC analytical detection of the 11 steroid hormones measured as the readout for the HT-H295R assay when they are identical to the hormone being detected.

The presence of a steroid ring is clearly the most informative structure descriptor associated with positive maxmMd values, as evidenced by the high odds ratios in Table 2 versus lower bounds on all other enriched chemotypes that are all less than 3. Beyond the steroid ring, other chemotypes enriched in the estrogen and/or androgen bioactivity were also enriched with respect to effects on maxmMd, including: a six-membered heterocyclic triazine ring structure (ring.hetero _. 6 ._ N_triazine_1_3_5 .. ) and a 5-membered carbon ring (chain.alkaneCyclic_pentyl_C5). Additionally, 5 and 6 membered aromatic rings were enriched with respect to positive maxmMd. Though the presence of a ring structure, whether heterocyclic or aromatic, appears to increase the odds of effects on HT-H295R biosynthesis, this is unlikely to be a highly specific indicator of activity; for example, the

presence of an aromatic benzene ring is very common in the training set data, with 845 chemicals containing an aromatic benzene, and so would not be a very specific indicator, as indicated by the marginal odds ratios around 3 for this chemotype. Enriched chemotypes for positive maxmMd may reflect available toxicophores for Phase I metabolism by cytochrome P450 enzymes (CYP450s) that might have cross-reactivity to CYP450s in the HT-H295R biosynthetic system. Other enriched chemotypes include monothiophosphate or dithiophosphate bonds characteristic of organophosphate insecticides that undergo P450- mediated desulfuration and detoxification by CYP450-mediated hydrolysis (Ellison et al., 2012). The cyclic ketone alkane (bond:CC(=O)C_ketone_alkane_cyclic) that is enriched is a moiety that may be available for Phase I-mediated monooxygenation and is contained by several estrogen and androgen hormones, drugs, and natural products. The lack of specificity for the enriched chemotypes suggests that while there may be an opportunity for SARs, based on the presence of enriched structural descriptors, a modeling approach that incorporates more than the presence of singular chemotypes would be ideal, leading to the global and local modeling approaches undertaken herein.

HT-H295R bioactivity modeling

The results of global and local modeling approaches are reported herein only for HT-H295R bioactivity, as represented by a binarized maxmMd value (positive or negative). Both the global and local models used the same training set of 1400 chemicals (Figure 1A) that were structurable (chemotype and Morgan fingerprints available) and compatible with generation of OPERA physicochemical predictions and also had maxmMd values available.

Though attempts were made to construct global and nearest neighbor models for increased and decreased estrogen and androgen synthesis, the results had limited value because of the few number of chemicals that increased or decreased estrogen and/or androgen specifically by greater than ±1.5-fold. Preliminary work suggested that sufficiently sensitive models (neither global random forest nor nearest neighbor models) could not be constructed to accurately classify chemicals that increased or decreased estrogen and/or androgen. The code for the unsuccessful modeling attempts for effects on estrogen and androgen synthesis in HT-H295R are provided (Supplemental File 1, html output of RMarkdown code), with a brief summary of results in Supplemental File 7 (word doc containing additional results).

Global random forest models

LOOCV for 5 different RF models: The results for LOOCV for the global RF models using 5 different input feature sets, based on various combinations of OPERA property predictions and structural descriptors (chemotypes and Morgan fingerprints) (Figure 1B), are provided in Table 3. Evident from this analysis is that the balanced accuracy among all global RF models was moderate (approaching 70% accuracy), with sensitivities in the range of 56-65% range and specificities in the 77-82% range, demonstrating little difference in classification performance regardless of which of the 5 sets of input features (OPERA predictions, chemotypes, OPERA predictions and chemotypes, Morgan fingerprints, OPERA predictions and Morgan fingerprints). Kappa was also calculated as a means of adjusting accuracy by accounting for the correct prediction by chance alone; the Kappa values on all models could be characterized as fair to moderate agreement, in

the range of 0.32-0.46 (Lantz, 2019). The out of bag error (OOB) was also estimated for each RF model as an estimate of the mean prediction error on each training sample that did not contain the sample row (as RF uses subsampling or bagging with replacement to create training samples from which to learn); OOB was fairly constant across all 5 RF models, ranging from 0.27-0.33, indicating an approximate 30% error rate for all 5 RF models. The RF model informed by OPERA predictions and chemotypes appeared to perform slightly better than the OPERA predictions alone (65% vs 61% sensitivity, 82% vs 80% specificity, 73% vs. 71% balanced accuracy), but with the addition of 455 input features for very minor improvement.

Recursive feature elimination: Following the results that the model performance was similar across all 5 input feature sets, and that chemotypes may have added only a minor contribution to overall model accuracy, a recursive feature elimination (RFE) analysis was performed to identify the minimum number of input features that might lead to an optimally informative model, as illustrated in Figure 3. RFE proceeds via iterating model runs of the RF model with one more feature per iteration. In Figure 3A, the RFE suggests that the OPERA property predictions each add very little to the performance beyond the most 1-3 important features, with optimal performance with inclusion of 10-13 of these property predictions, noting that these property predictions are not all independent from one another. In Figure 3B and 3C (chemotypes and OPERA property predictions plus chemotypes, respectively), the RFE analysis suggests that there is little to no improvement in model sensitivity or specificity when chemotypes descriptors are used in a random forest model. Switching from chemotype to Morgan fingerprints (Figure 3D) also failed to provide improvement in model sensitivity and specificity when compared to using the OPERA property predictions alone. The results in Figure 3 suggest that a random forest model using OPERA property predictions alone may provide a reasonable baseline model for HT-H295R activity prediction.

Feature importance: The top 13-16 features for each random forest model using OPERA and/or chemotype descriptors, based on the feature importance of the features used in the model, are reported in Table 4. The value to indicate feature importance is the permuted out-of-bag error for mean decreased accuracy, evaluated using 100 iterations; as such, lower values suggest greater importance. The OPERA predictions, as descriptors, appeared to be the most informative descriptors on the basis of their relative feature importance in the models that included any chemotypes. In this analysis, it is evident that when present in the random forest model, typically 12-13 of the OPERA physicochemical predictions were within the top 13-15 features in the model. The most informative feature from any model containing OPERA predictions was the octanol: water partition coefficient, or logP, which may have been important due to the impact of logP on chemical bioavailability to cells (Armitage et al., 2014). Interestingly, some of the chemotypes with the greatest feature importance in the chemotypes-only RF model were the same chemotypes significantly enriched for positive maxmMd, including the aromatic benzene ring and the steroid ring, among others.

RF model tuning-Examining the RF model performance in Table 2, the RFE from Figure 3, and the feature importance in Table 4 suggests that the effects on model performance of including any structural descriptors was minor to perhaps negligible. As such, the model with the fewest features, the RF model with the 13 OPERA predictions as features, was tuned for application to the external prediction set. The default tuning parameters for RF in the R package caret are 500 trees and an mtry (the number of variables to randomly sample as candidates at each split) equal to the square root of the number of input descriptors (\13 = 3.6). The results of resamping across tuning parameters are illustrated in Figure 4, and show that tuning beyond the default number of trees and mtry fails to yield performance improvements. The final RF model was also compared to the results of another machine learner, support vector machine, but with the latter providing no gains in performance (Supplemental Result File 7).

Local nearest neighbor models-Following a “global” approach to predicting HT- H295R activity, a “local” approach based on using nearest neighbors to predict HT-H295R activity was pursued with two different chemical structure fingerprints: ToxPrint and Morgan fingerprints. A local approach was pursued because the maximum balanced accuracy for the global models (near 70%) could underestimate prediction success in localized chemical neighborhoods of more homogeneous chemicals, e.g. steroids or triazole fungicides, where a stronger SAR for effects in the HT-H295R assay may have been present.

As indicated in Figure 1B, positive and negative maxmMd predictions were pursued using chemical similarity as defined by chemotypes, Morgan fingerprints, and/or OPERA predictions. However, the combination of continuous OPERA predictions with binary Morgan and chemotype fingerprints failed to yield improved performance (Supplemental File 7 with additional results). As such, the NN model development described here is only for the two NN models that used chemotypes and Morgan fingerprints.

A LOOCV procedure was used to report the performance of NN models that had different numbers of neighbors per neighborhood, different upper bounds on the proportion of positive neighbors required for a positive prediction, and different lower bounds on the proportion of negative neighbors required for a negative prediction (Figure 5; full results in Supplemental File 4). Optimal performance for the chemotype fingerprint model (sensitivity = 0.83, specificity = 0.86, balanced accuracy =0.85 for 250 training chemicals) was obtained with 12 of the closest neighbors and a positive activity cutoff for neighborhoods at 0.8 and a negative activity cutoff for neighborhoods at 0.1. For the Morgan fingerprint model we achieved optimal performance (sensitivity = 0.84, specificity of 0.78, balanced accuracy = 0.81 for 305 training chemicals) using 12 of the closest neighbors with a positive cut off proportion of 0.7 and negative cutoff proportion of 0.1 (Figure 5, Supplemental File 4). It was possible to tune the NN models to provide predictions for fewer numbers of chemicals increase the balanced accuracy on the NN model performance. For instance, including only 94 training chemicals in the chemotype NN model, rather than 250, would result in a balanced accuracy of 89%, but using 250 chemicals would best achieve the goal to simultaneously maximize the number of chemicals included and also achieve relatively high balanced accuracy. Increasing the number of training chemicals, in principle, would increase the chances of being able to make NN predictions for the external test set chemicals.

External prediction set

The EDSPUOC was used as an external prediction set of chemicals because these chemicals may be of interest for further data gathering with respect to chemical modulation of steroid biosynthesis. All three modeling approaches (Figure 1C) were applied to this external set, comprised of 6302 chemicals that were structurable and lacked existing in vitro HT-H295R screening data. Below the results for the predictions of these three approaches are presented.

Estrogen and androgen preliminary structure alert-A preliminary structure alert was used to flag external chemicals as positive for estrogen or androgen synthesis modulation in HT-H295R, requiring one enriched chemotype per mode to be considered positive. The full results for the presence of an enriched chemotype per mode are provided in Supplemental File 5, and the number of chemicals flagged are summarized in Table 5. Note that based on the classification performance for the training set, this approach by itself has somewhat limited confidence, particularly with respect to sensitivity.

Global RF model-Out of the 6302 chemicals in the external prediction set, 842 were predicted positive and 4716 were predicted negative for effects on HT-H295R by the global RF model (full results reported in Supplemental File 5, 744 chemicals of 6302 did not have OPERA predictions and were assigned equivocal for this model). Given that sensitivity was lower than specificity (61% < 80%) for this model (Table 2), resulting in a balanced accuracy approaching 70%, this global RF model should be viewed as having moderate performance. Despite modest sensitivity for identifying true positives, a positive in this model and in the absence of other information, e.g. no local NN model outputs due to a lack of informative neighborhoods, would indicate a chemical of interest for additional data gathering with reasonable confidence since the false positive rate is low (with a model specificity of 80%). Conversely, a higher false negative rate (sensitivity of 61%) yields lower confidence in prioritizing negative predictions.

Of interest may be the 3429 of the total 6302 prediction set chemicals for which no local NN predictions could be made; for these 3429, the global RF model indicated 609 positives and 2820 negatives with respect to predicted effects on maxmMd.

Local NN models-As previously noted, chemicals (3429 of 6302) in the external prediction set were considered “equivocal” with respect to both of the local NN models due to the mean neighborhood activity falling between the upper and lower thresholds for positive and negative maxmMd, leaving 2262 chemicals for which local NN models could make predictions. A summary of the positive and negative predictions from the chemotype and Morgan fingerprint models for both the training and external prediction sets of chemicals is shown in Figure 6. In Figure 6A, for the training set chemicals, 58 chemicals appear positive in both local NN models, representing a subset of chemicals with the highest confidence prediction of a positive maxmMd. There were 79 chemicals from the training set that were negative in both local NN models. Note that a number of training chemicals are positive in one local NN model, based on one structural fingerprint, and not positive in the other local NN model (i.e., positive in one local NN model and equivocal in the other). Thus, a sense that levels of confidence may be possible based on the layering

of multiple modeling approaches begins to emerge. For the prediction set (Figure 6B), containing 2262 chemicals for which some prediction could be made with at least one of the local NN models, 429 chemicals were positive in at least one local NN model, with 69 chemicals positive in both and likely representing the chemicals with the highest likelihood of perturbing maxmMd. There were 1839 chemicals that were negative in at least one NN model, with only six chemicals in disagreement, meaning they were positive in one model and negative in the other. Five hundred twenty-four chemicals were negative in both NN models, indicating greater confidence that those chemicals would be negative for effects on maxmMd in HT-H295R.

Two additional evaluations of how the local NN models may have performed with the external prediction set of chemicals were performed: evaluation of the Jaccard similarities based on chemotypes and Morgan fingerprints between a target chemical and the neighborhood used for prediction (Figure 7) and taxonomic grouping, based on similar structure, of chemicals associated with positive and negative NN model predictions (Figure 8). Histograms of the mean Jaccard similarities between a given target chemical and the neighborhood used for prediction for chemotype (Figure 7A) and Morgan (Figure 7B) fingerprint models indicate that the Jaccard similarities between target chemical and the nearest neighborhood used are relatively low for either structural fingerprint, with the median Jaccard similarity approaching 0.5. The Jaccard similarities are similar for the chemicals screened in the HT-H295R assay and the EDSPUOC external prediction set, suggesting chemical structural diversity in both chemical sets. No limits were placed on minimum Jaccard similarity in order to output local NN model predictions, and as such, if the prediction for a particular chemical was important for an application, one could examine the Jaccard similarity for the neighborhood used (Supplemental File 5).

The external test set predictions from the local NN models were further described by taxonomic grouping from ClassyFire (Djoumbou Feunang et al., 2016) using the “class” groupings (Figure 8). ClassyFire is a structure-based chemical taxonomy that consistently describes chemicals in terms of structural category. Comparing the results for class across the chemotype and Morgan fingerprint-based predictions demonstrates a few key findings. First, almost every class is associated with a wide range of chemical prediction on the basis of the NN models, suggesting that ClassyFire class would not be a good predictor on its own. Even steroids and steroid derivatives, triazines, and benzene and substituted derivatives, with the highest numbers of positive predictions based on chemotype or Morgan fingerprints, demonstrated a broad range of neighborhood activities. Minor differences between the chemotype and Morgan fingerprint predictions separated by ClassyFire class are apparent as shifts in the distribution of the predicted values for the chemical neighborhoods between Figure 8A and 8B for the training set and between Figure 8C and 8D for the prediction set. The “steroid and steroid derivatives,” expectedly, labeled the chemical group most associated with positive nearest neighbor predictions in the HT-H295R screening and EDSPUOC chemical sets, with no negatives in the training set and very few chemical neighborhoods being predicted as negative. Prenol lipids, phenols, organooxygen compounds, and benzene and substituted derivatives were classes that contained many training set and predicted positives for both the chemotype and Morgan fingerprint defined

neighborhoods, but these ClassyFire classes also encompassed many positive and equivocal values.

Model agreement

The resultant predictions for all three models developed in this work (global RF model with OPERA predictions as input and two local NN models, with chemotypes or Morgan fingerprints as inputs) for the EDSPUOC were combined into a heuristic to indicate the degree of model agreement for this multi-step approach to HT-H295R bioactivity prediction. This model agreement value was intended to highlight the chemicals associated with the highest and lowest possibilities of positive bioactivity in the HT-H295R assay (Figure 9). A positive result from each model was assigned an arbitrary value that was weighted according to the balanced accuracy of each model. These arbitrary vaues were summed and evaluated using the 1400 chemical training set. The local NN models received greater weight in this sum because the balanced accuracies of the NN models were higher than that of the RF model, suggesting greater confidence in the NN model predictions. Model agreement values were tabulated with the following values: 1 = predicted positives in either NN model; 0.75 = predicted positive in RF model; 0.5 = equivocal in either NN model; 0.25 = equivocal in RF model; 0 = negative in any model. The final model agreement value is the sum of the three scores. For example, if the chemical is predicted positive in the chemotype-informed NN model (1), equivocal in the Morgan fingerprint NN model (0.5), and negative in the RF model (0), it would receive a model agreement value of 1.5. Forty-four chemicals from the EDSPUOC were positive in all models (model agreement = 2.75) suggesting that these 44 chemicals are of greatest interest for putative steroidogenesis-disrupting activity in the HT-H295R assay (see Table 6 and Supplemental File 5). One thousand thirty-three substances were positive in at least one nearest neighbor model, with equivocal or better results in one additional model (model agreement ≥ 1.5), and subsets of these chemicals may be interesting for further evaluation of some kind. A large number (4115 chemicals) of the total EDSPUOC that were not previously screened in the HT-H295R assay and were “structure-able” (6302 chemicals) had essentially equivocal results (model agreement scores of 0.5-1). If these equivocal chemicals are grouped with negative chemicals, 4631 of 6302 chemicals become of secondary interest for prioritization. Many chemicals (3429) of the 6302 total chemicals had RF model predictions in the absence of NN model predictions, as so few neighborhoods in the NN models were considered informative.

Due to the importance of understanding which chemicals might be more likely to affect estrogen and/or androgen synthesis, alongside a reporting of model agreement in Supplemental File 5 is an indication of whether each chemical contained chemotypes that were overrepresented amongst chemicals that significantly increased or decreased estrogen and/or androgen synthesis in the HT-H295R assay. In total, of the 6302 prediction set, 338, 274, 695, and 210 chemicals were flagged with preliminary structure alerts for estrogen up, estrogen dn, androgen up, and androgen dn (Supplemental File 5). However, when examining only the chemicals with a model agreement score of 2.75 (positive in all models), 5, 3, 13, and 2 chemicals were flagged with preliminary structure alerts for estrogen up, estrogen dn, androgen up, and androgen dn, demonstrating how the model agreement score value can be used as a threshold depending on the use case for further screening.

Finally, as a complement to Table 6, example chemical performance across models, along with their model agreement scores, is provided in Figure 10. Two sets of example chemicals were chosen, where in each set, the chemicals are structurally similar (containing enriched chemotypes), but one chemical has empirical screening data and the other does not. First, androsterone and corticosterone, which both share the fused steroid generic ring (enriched within androgen up) as highlighted in their structures, were compared. Androsterone decreased estrone and estradiol and increased testerone in vitro in the HT-H295R assay and had a positive maxmMd. Preliminary structure alerts for estrogens and androgens flagged both androsterone and cortisone for estrogens dn and androgens up. The RF model and the chemotype NN model were positive for both androsterone and cortisone, but the Morgan NN was equivocal for androsterone and positive for cortisone, leading to model agreement scores of 2.25 for androsterone and 2.75 for cortisone. Propiconazole and ipconazole are both triazole fungicides containing the heterocyclic 1,2,4-triazole ring (enriched for estrogen dn), and as a class generally decrease steroid biosynthesis in the in vitro HT-H295R assay (Goetz et al., 2009; Karmaus et al., 2016; Paul Friedman et al., 2016). Both propiconazole (previously screened in vitro with a positive maxmMd) and ipconazole (not previously screened in vitro) were flagged for estrogen dn and were positive in the RF as well as chemotype and Morgan NN models, resulting in model agreement scores of 2.75.

Discussion

This work presents a multi-step approach to understanding the SAR for chemicals and bioactivity in the HT-H295R assay implemented for the ToxCast program, with the goal of identifying chemicals of greatest interest for any further consideration of bioactivity in a model of steroid biosynthesis. The results of our work led us to apply a multi-step approach to address key data gaps, including whether a chemical shares structural features with chemicals that disrupt estrogen or androgen synthesis in HT-H295R; whether a chemical is predicted to perturb hormone biosynthesis in HT-H295R using a global RF approach; and whether a chemical shares structural features with chemicals that perturb HT-H295R bioactivity using a local NN approach. A heuristic model agreement score for HT-H295R bioactivity prediction was developed to easily communicate overall confidence in a chemical’s positive or negative prediction for HT-H295R activity. Critically, this work extends the bioactivity screening information that was previously obtained for 2012 chemicals in the HT-H295R assay to address data gaps using an in silico method for thousands of substances of potential interest on the EDSPUOC list. The primary impact of this work is to assist with selection of chemicals for further evaluation of chemical effects on steroidogenesis, but it may also be informative as part of a weight-of-evidence approach for chemicals that have other sources of information regarding reproduction and hormone synthesis. Of the 6302 chemicals on the EDSPUOC lacking HT-H295R data, 222 have a model agreement score greater than or equal to 2 (with a subset of 44 chemicals with the maximum model agreement score of 2.75 and positive in both NN models and the global RF model), representing perhaps the most interesting chemicals for follow-up consideration depending on the specific application. However, included among these 44 chemicals on the EDSPUOC that did not have HT-H295R screening data are several steroid hormones and steroid drugs, as well as additional conazole fungicides, which have been

previously observed to suppress steroid hormone production in in vitro models like the H295R assay (Goetz et al., 2009; Hecker et al., 2006; Karmaus et al., 2016); therefore, for some chemicals with high model agreement values, such as steroids or conazoles, existing information outside of HT-H295R predictions may be available to suggest whether additional information is of value or not for particular toxicology applications.

In a first feasibility and data exploration analysis, chemicals were described using chemotypes and enrichment of these individual structural features associated with either (1) effects on estrogen and/or androgen synthesis or (2) bioactivity in the HT-H295R assay, as represented by maxmMd. Chemotypes are an advantageous two-dimensional structural feature descriptor for this analysis because they are easily understood visually by toxicologists (Paul Friedman et al., 2020; Wang et al., 2019; Yang et al., 2015). For effects on estrogen and/or androgen synthesis, the extreme imbalance in training data contributed to unsuccessful modeling attempts, and led us to develop a preliminary structure alert approach, i.e. the use of enriched chemotypes as “structural alerts” for putative estrogen and androgen synthesis disruption. This approach is insensitive and identifies a relatively narrow chemical feature space as relevant to estrogen and androgen synthesis in HT-H295R; however, the use of this preliminary structure alert is specific, helping to identify chemicals that share structural features with chemicals that produce greater effects on estrogen and androgen synthesis, e.g. chemicals that resemble steroids. As noted, attempts to apply global modeling to estrogen and androgen synthesis failed to provide sufficient sensitivity (Supplemental File 7), likely due to limitations in the training data; as such, only preliminary structure alerts were for the prediction set of chemicals.

A global modeling approach was attempted to predict HT-H295R bioactivity with the hopes that heterogeneous chemical structure information could be used to easily make predictions for a broad chemical feature space and a larger, more balanced training data set. Given the enrichment of several structural features, including the expected enrichment of a steroid ring based on the bioactivity of steroids in the screening chemical library, a series of RF models that would apply globally to the entire chemical set were developed based on inclusion of different feature sets, including structural descriptors as features. The best RF model included only physicochemical and environmental fate property predictions from OPERA and had only moderate performance (balanced accuracy of 71%, with sensitivity of 80%). Surprisingly, in evaluating different sets of input features for RF modeling, we found that though chemotype enrichments were evident for some substructural features (with odds ratio >3 and significant p-values even after adjusting for false discovery rate), adding chemotypes or Morgan fingerprints failed to substantially improve model performance. It is possible the failure of 2D structural descriptors to significantly improve model performance may be attributable to a methodological limitation: OPERA physicochemical and environmental fate property predictions are continuous and available for all chemicals in training and test sets, whereas the chemotypes are binary descriptors that are present somewhat sparsely in the training and test sets. Additionally, as indicated in Figure 3A and Table 4, logP is the most informative input for RF modeling, and this may be due to the inherent relationships and correlations among the physicochemical features themselves. For instance, logP may suffice for describing hydrophobicity, and thus including both water solubility and melting point (which is correlated with water solubility) (Katritzky et al., 2010) would be unlikely to add

much additional predictive value. Boiling point and vapor pressure are inversely proportional and correlated with mass, and so it’s unlikely that including both of those features adds predictive value; similarly, Henry’s Law constant can be estimated using vapor pressure and water solubility. Thus, it is unsurprising to see in Figure 3A that RF model performance is optimized with only 3-5 OPERA predictions as inputs. The 5 most informative OPERA predictions used as input features in the OPERA prediction-only RF model were logP, boiling point, biodegradation half-life, soil adsorption coefficient (logKoc), and bioconcentration factor, all of which may be indicators of cellular bioavailability in aqueous cell-based assays. Interestingly, logP appears to be the most informative feature of any of the RF models developed, perhaps due to the known relationship between logP and in vitro potency (Gleeson et al., 2011) and/or cellular bioavailability (Armitage et al., 2014). Despite these limitations and it’s moderate performance, this RF model serves a couple of roles in the multi-step approach: it provides a baseline for any future global modeling attempts, and it also provides predictions for any chemical that has a structure that can be described by a SMILES and fits the domain of the OPERA modeling procedure (Mansouri et al., 2018). The RF model becomes particularly informative when it is either combined with the local NN models, using a model agreement score to indicate overall confidence in HT-H295R bioactivity prediction, or for chemicals that do not have an informative local neighborhood (and thus have no NN model predictions). For the external prediction set of chemicals used here, 6302 EDSPUOC chemicals, 3429 had only a RF model prediction for maxmMd, and of these, 609 had positive RF prediction for a positive maxmMd.

A local approach, utilizing NNs and based on previously published methodology for GenRA (Shah et al., 2016), was evaluated with the hopes of improved overall model performance (e.g., improved balanced accuracy). Indeed, the optimized nearest neighbor models for chemotypes or Morgan fingerprints had balanced accuracies approaching 81- 85%, depending on Morgan or chemotype fingerprints, suggesting better performance with either local NN model than the global RF model. This local NN approach appeared to better address the significant variability in chemical structural features associated with HT-H295R bioactivity by parsing the chemicals into more homogeneous groups, or neighborhoods, but at the cost of providing more confident predictions for a more limited number of chemicals. Many neighborhoods became uninformative due to our use of limits on the proportion of chemicals positive or negative to produce positive or negative predictions for the neighborhood. Further demonstrating unexplained variability in HT-H295R bioactivity based on chemical structure were the results of ClassyFire grouping: no grouping of chemicals by structural class seemed to correspond to 100% positive or negative bioactivity.

To best communicate the results of the three models that predict HT-H295R bioactivity, model agreement scores were used to identify the EDSPUOC associated with higher confidence in positive or negative bioactivity predictions for the HT-H295R. Additionally, the presence of chemotypes associated with increased or decreased estrogen or androgen synthesis in HT-H295R were identified. The NN models rely on the degree of structural similarity to chemicals in localized neighborhoods with empirical HT-H295R data, whereas the global RF model relies on OPERA physicochemical and environmental fate predictions as input, which while reliant on chemical structure information encoded in SMILES, is likely less specifically associated with HT-H295R activity. Further, the local NN models

performed with greater balanced accuracy than the global RF model. Thus, within the model agreement scoring, the local NN model positive results are weighted as “1” whereas the global RF model positive results are weighted as “0.75,” reflecting model prediction confidence. Using the model agreement score, and presence of chemotypes associated with estrogen and/or androgen synthesis disruption, can assist the user in a rapid, model-informed, fit-for-purpose approach to triaging available in silico information for the EDSPUOC with respect to in vitro steroidogenesis.

Prediction of in vitro bioactivity for chemicals lacking high-throughput screening data on the basis of structure has been reported for biological targets with datasets in ToxCast and Tox21 previously (Garcia de Lomana et al., 2021; Mansouri et al., 2016; Mansouri and Judson, 2016; Mansouri et al., 2020; Rosenberg et al., 2017). In silico prediction of H295R bioactivity on the basis of ToxCast data, to our knowledge, has not been reported to date. However, there have been previous efforts to establish the utility of the H295R cell line and/or steroid biosynthesis measurements in the cell line for prediction of in vivo outcomes. Gene expression of key enzymes in the steroid biosynthetic pathway present in H295R cells was previously used in random forest modeling of in vivo reproductive toxicity, with balanced accuracy approaching 74% (Maglich et al., 2014). A physiologically-based toxicokinetic model was applied to a dataset of lowest observable effect concentrations for in vitro H295R steroidogenesis screening data and yeast estrogen and androgen screening assays to derive administered equivalent doses to compare to in vivo doses at which reproductive and/or endocrine effects were observed for 10 reference substances, with limited results suggesting H295R bioactivity may be relevant for prediction of in vivo lowest effect levels (Fabian et al., 2019). However, published comparisons of H295R data to in vivo reproductive and developmental outcomes for a large number of substances are unavailable.

The work herein to assist with identification of substances that may be of interest for further screening relies on the HT-H295R assay implementation and analysis, but also on the biology represented in the assay system. H295R cells, derived from human adenocarcinoma and demonstrating similar phenotype to zonally undifferentiated adrenal cells, are a useful screening modality as they contain adrenal and other reproductive tissues steroidogenic enzymes in a single system (Gazdar et al., 1990; Gracia et al., 2006). However, H295R cells are not expected to synthesize estrogens and androgens at the same concentrations as gonadal tissues in vivo, as normal human adult adrenal is responsible for the synthesis of mineralocorticoids, glucocorticoids, and weak androgens (Botteri Principato et al., 2018; Gracia et al., 2006; Sanderson, 2006). Though H295R responses to chemical exposure have in some cases been similar to in vivo responses, there have been some differences, possibly due to the reproductive status of the animal models compared or the interaction of multiple tissues in vivo to regulate steroid hormones (Hecker et al., 2006), leading to some limitations interpreting the relevance of H295R results for in vivo steroidogenesis where hypothalamic-pituitary-gonadal axes are intact. However, H295R cells are known to produce limited 17ß-estradiol, necessitating stimulation with forskolin (Hecker et al., 2011; Hecker et al., 2006) in order to sensitively detect estradiol synthesis inhibitors. And, H295R cells have limited responsivity to luteinizing hormone and more limited testosterone production when compared to cells such as primary Leydig cells (Botteri Principato et al., 2018). Thus, the context for use for the H295R cell line may be screening for

possible effects on steroid biosynthesis, including progestogens, corticosteroids, estrogens, and/androgens, and additional mechanistic assays in other cell and/or organoid models may be necessary to elucidate putative mechanisms and outcomes within specific reproductive tissues (Botteri Principato et al., 2018; Forgacs et al., 2012; Heidari-Khoei et al., 2020; Heremans et al., 2021; Lipskind et al., 2018; Pendergraft et al., 2017; Robitaille et al., 2015; Sakib et al., 2019; Thibeault et al., 2018). Substances may affect steroid biosynthesis through a number of putative mechanisms, including inhibition of steroidogenic enzymes (e.g., steroidogenic acute regulatory protein, CYP11A1, CYP17A1, HSD3B2, CYP21A2, CYP11B1, CYP11B2, HSD17B3, CYP19A1)(Saito et al., 2016), effects on cholesterol and its transport, mitochondrial toxicity, or effects on estrogen, glucocorticoid, and androgen receptors directly (Robitaille et al., 2015; Sanderson, 2006). As such, a limitation in our work is that positives in the HT-H295R assay as indicated by a measure of perturbation in the system of steroid hormones synthesized, such as a the mean Mahalanobis distance approach, do not directly indicate a specific putative mechanism. In the future, structure alerts or other predictive models for mechanisms present in H295R cells could be combined with the output of the SAR approach described here to develop an in silico dataset for putative chemical effects on steroidogenesis; such an extension might provide support for one or more mechanisms present in H295R cells.

The SARs explored in this work may be a useful data gap filling technique for chemicals lacking data relevant to disruption of steroidogenesis, or for chemicals where this in silico approach can add to the weight of evidence. Indeed, when considering the agreement of three models (one global random forest model and two nearest neighbor models), the majority (4489 chemicals) of the total EDSPU chemicals that were not previously screened in the HT-H295R assay and were “structure-able” (6302 chemicals) had equivocal performance in both NN models and negative performance in the RF model. Of the 6302 chemicals, 1033 have a model agreement score greater than or equal to 1.5, and 222 have a model agreement score greater than or equal to 2, representing perhaps the most promising chemicals for follow-up consideration. This approach of using SARs to extend the learnings from in vitro screening data to chemical lists of interest may contribute to a cheminformatics-driven “Tier 0” in an applied new approach methodology framework, preceding Tier 1 broad screening and/or Tier 2 targeted screening (Thomas et al., 2019). Future work to build a Tier 0 within a next generation risk assessment toolbox should combine the work described herein with other structure alerts and (Q)SARs to predict chemicals that may disrupt endocrine signaling, including steroidogenesis, estrogen and androgen signaling, as well as chemicals that may disrupt developmental and reproductive processes. A cheminformatic Tier 0 applying a number of methods in these biological domains would likely indicate which chemicals were of greatest interest for additional screening in models of higher biological complexity; regions of chemical space with limited information; and, potentially putative mechanisms for chemical perturbation of endocrine, developmental, and reproductive systems.

Supplementary Material

Refer to Web version on PubMed Central for supplementary material.

Comput Toxicol. Author manuscript; available in PMC 2023 November 01.

Acknowledgments

The authors wish to thank Daniel Chang for support on the use of ClassyFire descriptors as well as Caroline Ring, Chad Deisenroth, and John Cowden (all from the US EPA Office of Research and Development) for useful comments on a previous version of this manuscript.

References

Armitage JM, et al. , 2014. Application of mass balance models and the chemical activity concept to facilitate the use of in vitro toxicity data for risk assessment. Environ Sci Technol. 48, 9770-9. [PubMed: 25014875]

Botteri Principato NL, et al. , 2018. The use of purified rat Leydig cells complements the H295R screen to detect chemical-induced alterations in testosterone production. Biol Reprod. 98, 239-249. [PubMed: 29272331]

Breen MS, et al. , 2010. Computational model of steroidogenesis in human H295R cells to predict biochemical response to endocrine-active chemicals: model development for metyrapone. Environ Health Perspect. 118, 265-72. [PubMed: 20123619]

Browne P, et al. , 2015. Screening Chemicals for Estrogen Receptor Bioactivity Using a Computational Model. Environ Sci Technol. 49, 8804-14. [PubMed: 26066997]

Browne P, et al. , 2018. Development of a curated Hershberger database. Reprod Toxicol. 81, 259-271. [PubMed: 30205136]

Commission, E., Registration, Evaluation, Authorisation and Restriction of Chemicals. Regulation (EC) No 1907/2006 of the European Parliament and of the Council, 2007.

de Cremoux P, et al. , 2008. Expression of progesterone and estradiol receptors in normal adrenal cortex, adrenocortical tumors, and primary pigmented nodular adrenocortical disease. Endocr Relat Cancer. 15, 465-74. [PubMed: 18508999]

Dix DJ, et al. , 2007. The ToxCast program for prioritizing toxicity testing of environmental chemicals. Toxicol Sci. 95, 5-12. [PubMed: 16963515]

Djoumbou Feunang Y, et al. , 2016. ClassyFire: automated chemical classification with a comprehensive, computable taxonomy. J Cheminform. 8, 61. [PubMed: 27867422]

Draskau MK, et al. , 2021. Human-relevant concentrations of the antifungal drug clotrimazole disrupt maternal and fetal steroid hormone profiles in rats. Toxicol Appl Pharmacol. 422, 115554. [PubMed: 33910022]

ECCC/HC, Chemicals Management Plan (CMP) Science Committee Objectives Paper Meeting no. 5 - Integrating New Approach Methodologies within the CMP: Identifying Priorities for Risk Assessment, Existing Substances Risk Assessment Program., Vol. Retrieved from http:// www.ec.gc.ca/ese-ees/default.asp?lang=En&n=172614CE-1. Government of Canada, Ottawa (ON), 2016.

ECHA, New Approach Methodologies in Regulatory Science, Proceedings of a scientific workshop. In: Agency, E. C., (Ed.), Helsinki, Finland, 2016.

ECHA, ECuHA Strategic Plan 2019-2023. 2017.

EFSA, E. a., Guidance for the Identification of Endocrine Disruptors in the Context of Regulations (EU) No 528/2012 and (EC) No 1107/2009., 2018.

Ellison CA, et al. , 2012. Human hepatic cytochrome P450-specific metabolism of the organophosphorus pesticides methyl parathion and diazinon. Drug Metab Dispos. 40, 1-5. [PubMed: 21969518]

EPA, U. S., EDSP21 Workplan. In: Prevention, O. o. C. S. a. P., (Ed.), Washington, DC, 2011.

EPA, U. S., U.S. EPA EDSP Universe of Chemicals. In: Office of Chemical Safety and Pollution Prevention, O. o. W., and Office of Research and Development, (Ed.), 2012.

EPA, U. S., FIFRA Scientific Advisory Panel: Prioritizing the Universe of Endocrine Disruptor Screening Program (EDSP) Chemicals using Computational Toxicology Tools. Vol. EPA-HQ- OPP-2012-0818, 2013.

EPA, U. S., 2014. FIFRA SAP Meeting on Integrated Endocrine Activity and Exposure-based Prioritization and Screening. EPA-HQ-OPP-2014-0614.

Comput Toxicol. Author manuscript; available in PMC 2023 November 01.

EPA, U. S., Endocrine Disruptor Screening Program: Use of High Throughput Assays and Computational Tools. In: Prevention, O. o. C. S. a. P., (Ed.), Vol. EPA-HQ-OPPT-2015-0305-0001. Federal Register, Washington, DC, 2015.

EPA, U. S., FIFRA Scientific Advisory Panel: Continuing Development of Alternative High- Throughput Screens to Determine Endocrine Disruption, Focusing on Androgen Receptor, Steroidogenesis, and Thyroid Pathways. Vol. EPA-HQ-OPP-2017-0214, 2017.

EPA, U. S., Invitrodb version 3.3. In: Exposure, C. f. C. T. a., (Ed.), 2020.

EPA, U. S., New Approach Methods Work Plan (v2). In: Prevention, O. o. R. a. D. a. O. o. C. S. a. P., (Ed.), Washington, DC, 2021.

Fabian E, et al. , 2019. In vitro-to-in vivo extrapolation (IVIVE) by PBTK modeling for animal-free risk assessment approaches of potential endocrine-disrupting compounds. Arch Toxicol. 93, 401- 416. [PubMed: 30552464]

Forgacs AL, et al. , 2012. BLTK1 murine Leydig cells: a novel steroidogenic model for evaluating the effects of reproductive and developmental toxicants. Toxicol Sci. 127, 391-402. [PubMed: 22461451]

Garcia de Lomana M, et al., 2021. In Silico Models to Predict the Perturbation of Molecular Initiating Events Related to Thyroid Hormone Homeostasis. Chem Res Toxicol. 34, 396-411. [PubMed: 33185102]

Gazdar AF, et al. , 1990. Establishment and characterization of a human adrenocortical carcinoma cell line that expresses multiple pathways of steroid biosynthesis. Cancer Res. 50, 5488-96. [PubMed: 2386954]

Gleeson MP, et al. , 2011. Probing the links between in vitro potency, ADMET and physicochemical parameters. Nat Rev Drug Discov. 10, 197-208. [PubMed: 21358739]

Goetz AK, Dix DJ, 2009. Mode of action for reproductive and hepatic toxicity inferred from a genomic study of triazole antifungals. Toxicol Sci. 110, 449-62. [PubMed: 19423681]

Goetz AK, et al. , 2007. Disruption of testosterone homeostasis as a mode of action for the reproductive toxicity of triazole fungicides in the male rat. Toxicol Sci. 95, 227-39. [PubMed: 17018648]

Goetz AK, et al. , 2009. Inhibition of rat and human steroidogenesis by triazole antifungals. Syst Biol Reprod Med. 55, 214-26. [PubMed: 19938956]

Gracia T, et al. , 2007. Modulation of steroidogenic gene expression and hormone production of H295R cells by pharmaceuticals and other environmentally active compounds. Toxicol Appl Pharmacol. 225, 142-53. [PubMed: 17822730]

Gracia T, et al. , 2006. The H295R system for evaluation of endocrine-disrupting effects. Ecotoxicol Environ Saf. 65, 293-305. [PubMed: 16935330]

Grulke CM, et al. , 2019. EPA’s DSSTox database: History of development of a curated chemistry resource supporting computation toxicology research. Computational Toxicology. 12.

Guha R, 2007. Chemical Informatics Functionality in R. Journal of Statistical Software. 18.

Haggard DE, et al. , 2018. High-Throughput H295R Steroidogenesis Assay: Utility as an Alternative and a Statistical Approach to Characterize Effects on Steroidogenesis. Toxicol Sci. 162, 509-534. [PubMed: 29216406]

Haggard DE, et al. , 2019. Development of a prioritization method for chemical-mediated effects on steroidogenesis using an integrated statistical analysis of high-throughput H295R data. Regul Toxicol Pharmacol. 109, 104510. [PubMed: 31676319]

Hecker M, et al. , 2011. The OECD validation program of the H295R steroidogenesis assay: Phase 3. Final inter-laboratory validation study. Environ Sci Pollut Res Int. 18, 503-15. [PubMed: 20890769]

Hecker M, et al. , 2006. Human adrenocarcinoma (H295R) cells for rapid in vitro determination of effects on steroidogenesis: hormone production. Toxicol Appl Pharmacol. 217, 114-24. [PubMed: 16962624]

Heidari-Khoei H, et al. , 2020. Organoid technology in female reproductive biomedicine. Reprod Biol Endocrinol. 18, 64. [PubMed: 32552764]

Heremans R, et al. , 2021. Organoids of the Female Reproductive Tract: Innovative Tools to Study Desired to Unwelcome Processes. Front Cell Dev Biol. 9, 661472. [PubMed: 33959613]

Comput Toxicol. Author manuscript; available in PMC 2023 November 01.

ICCVAM, ICCVAM Test Method Evaluation Report: The LUMI-CELL ER (BG1Luc ER TA) Test Method: An In Vitro Assay for Identifying Human Estrogen Receptor Agonist and Antagonist Activity of Chemicals. In: Interagency Coordinating Committee on the Validation of Alternative Methods; National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods; National Institute of Environmental Health Sciences, N. I. o., (Ed.), Research Triangle Park, NC, 27709, 2011.

Judson R, et al. , 2020. Selecting a minimal set of androgen receptor assays for screening chemicals. Regul Toxicol Pharmacol. 117, 104764. [PubMed: 32798611]

Judson RS, et al. , 2015. Integrated Model of Chemical Perturbations of a Biological Pathway Using 18 In Vitro High-Throughput Screening Assays for the Estrogen Receptor. Toxicol Sci. 148, 137- 54. [PubMed: 26272952]

Judson RS, et al. , 2018. New approach methods for testing chemicals for endocrine disruption potential. Current Opinion in Toxicology. 9, 40-47.

Karmaus AL, et al. , 2016. High-Throughput Screening of Chemical Effects on Steroidogenesis Using H295R Human Adrenocortical Carcinoma Cells. Toxicol Sci. 150, 323-32. [PubMed: 26781511]

Katritzky AR, et al. , 2010. Quantitative correlation of physical and chemical properties with chemical structure: utility for prediction. Chem Rev. 110, 5714-89. [PubMed: 20731377]

Kavlock R, et al. , 2012. Update on EPA’s ToxCast program: providing high throughput decision support tools for chemical risk management. Chem Res Toxicol. 25, 1287-302. [PubMed: 22519603]

Kleinstreuer NC, et al. , 2018. Evaluation of androgen assay results using a curated Hershberger database. Reprod Toxicol. 81, 272-280. [PubMed: 30205137]

Kleinstreuer NC, et al. , 2017. Development and Validation of a Computational Model for Androgen Receptor Activity. Chem Res Toxicol. 30, 946-964. [PubMed: 27933809]

Kleinstreuer NC, et al. , 2016. A Curated Database of Rodent Uterotrophic Bioactivity. Environ Health Perspect. 124, 556-62. [PubMed: 26431337]

Lantz B, 2019. Machine Learning with R: Expert techniques for predictive modeling, 3rd Edition. Birmingham, UK.

Lautenberg FR, Frank R Lautenberg Chemical Safety for the 21st Century Act. In: Congress, t. U., (Ed.). Public Law, 2016, pp. 114-182.

Lipskind S, et al. , 2018. An Embryonic and Induced Pluripotent Stem Cell Model for Ovarian Granulosa Cell Development and Steroidogenesis. Reprod Sci. 25, 712-726. [PubMed: 28854867]

Low Y, et al. , 2013. Integrative chemical-biological read-across approach for chemical hazard classification. Chem Res Toxicol. 26, 1199-208. [PubMed: 23848138]

Maglich JM, et al. , 2014. More than just hormones: H295R cells as predictors of reproductive toxicity. Reprod Toxicol. 45, 77-86. [PubMed: 24434083]

Mansouri K, et al. , 2016. CERAPP: Collaborative Estrogen Receptor Activity Prediction Project. Environ Health Perspect. 124, 1023-33. [PubMed: 26908244]

Mansouri K, et al. , 2018. OPERA models for predicting physicochemical properties and environmental fate endpoints. J Cheminform. 10, 10. [PubMed: 29520515]

Mansouri K, Judson RS, 2016. In Silico Study of In Vitro GPCR Assays by QSAR Modeling. Methods Mol Biol. 1425, 361-81. [PubMed: 27311474]

Mansouri K, et al. , 2020. CoMPARA: Collaborative Modeling Project for Androgen Receptor Activity. Environ Health Perspect. 128, 27002. [PubMed: 32074470]

Montanaro D, et al. , 2005. Antiestrogens upregulate estrogen receptor beta expression and inhibit adrenocortical H295R cell proliferation. J Mol Endocrinol. 35, 245-56. [PubMed: 16216906] OECD, 2011. Test No. 456: H295R Steroidogenesis Assay.

Paul Friedman K, et al. , 2020. Utility of In Vitro Bioactivity as a Lower Bound Estimate of In Vivo Adverse Effect Levels and in Risk-Based Prioritization. Toxicol Sci. 173, 202-225. [PubMed: 31532525]

Paul Friedman K, et al. , 2016. A predictive data-driven framework for endocrine prioritization: a triazole fungicide case study. Crit Rev Toxicol. 46, 785-833. [PubMed: 27347635]

Pendergraft SS, et al. , 2017. Three-dimensional testicular organoid: a novel tool for the study of human spermatogenesis and gonadotoxicity in vitro. Biol Reprod. 96, 720-732. [PubMed: 28339648]

Robitaille CN, et al. , 2015. Antiandrogenic mechanisms of pesticides in human LNCaP prostate and H295R adrenocortical carcinoma cells. Toxicol Sci. 143, 126-35. [PubMed: 25324206]

Rogers D, Hahn M, 2010. Extended-connectivity fingerprints. J Chem Inf Model. 50, 742-54. [PubMed: 20426451]

Rosenberg SA, et al. , 2017. QSAR models for thyroperoxidase inhibition and screening of U.S. and EU chemical inventories. Computational Toxicology. 4, 11-21.

Saito R, et al. , 2016. Estimation of the Mechanism of Adrenal Action of Endocrine-Disrupting Compounds Using a Computational Model of Adrenal Steroidogenesis in NCI-H295R Cells. J Toxicol. 2016, 4041827. [PubMed: 27057163]

Sakib S, et al. , 2019. Three-dimensional testicular organoids as novel in vitro models of testicular biology and toxicology. Environ Epigenet. 5, dvz011.

Sakuratani Y, et al. , 2018. Integrated Approaches to Testing and Assessment: OECD Activities on the Development and Use of Adverse Outcome Pathways and Case Studies. Basic Clin Pharmacol Toxicol. 123 Suppl 5, 20-28. [PubMed: 29316278]

Sanderson JT, 2006. The steroid hormone biosynthesis pathway as a target for endocrine-disrupting chemicals. Toxicol Sci. 94, 3-21. [PubMed: 16807284]

Sanderson JT, et al. , 2000. 2-Chloro-s-triazine herbicides induce aromatase (CYP19) activity in H295R human adrenocortical carcinoma cells: a novel mechanism for estrogenicity? Toxicol Sci. 54, 121-7. [PubMed: 10746939]

Selvaraj N, et al. , 2000. Establishment and characterization of steroidogenic granulosa cells expressing beta(2)-adrenergic receptor: regulation of adrenodoxin and steroidogenic acute regulatory protein by adrenergic agents. Mol Cell Endocrinol. 168, 53-63. [PubMed: 11064152]

Shah I, et al. , 2016. Systematically evaluating read-across prediction and performance using a local validity approach characterized by chemical structure and bioactivity information. Regul Toxicol Pharmacol. 79, 12-24. [PubMed: 27174420]

Skolness SY, et al. , 2013. Propiconazole inhibits steroidogenesis and reproduction in the fathead minnow (Pimephales promelas). Toxicol Sci. 132, 284-97. [PubMed: 23339182]

Thibeault AH, et al. , 2018. Co-culture of H295R Adrenocortical Carcinoma and BeWo Choriocarcinoma Cells to Study Feto-placental Interactions: Focus on Estrogen Biosynthesis. Methods Mol Biol. 1710, 295-304. [PubMed: 29197012]

Thomas RS, et al. , 2019. The Next Generation Blueprint of Computational Toxicology at the U.S. Environmental Protection Agency. Toxicol Sci. 169, 317-332. [PubMed: 30835285]

Thomas RS, et al. , 2018. The US Federal Tox21 Program: A strategic and operational plan for continued leadership. ALTEX. 35, 163-168. [PubMed: 29529324]

USEPA, Endocrine Disruptor Screening Program Test Guidelines: OPPTS 890.1550 : Steroidogenesis (human Cell Line - H295R). U.S. Environmental Protection Agency, Office of Chemical Safety and Pollution Prevention, 2009.

Wang J, et al. , 2019. High-throughput screening and chemotype-enrichment analysis of ToxCast phase II chemicals evaluated for human sodium-iodide symporter (NIS) inhibition. Environ Int. 126, 377-386. [PubMed: 30826616]

Watt ED, Judson RS, 2018. Uncertainty quantification in ToxCast high throughput screening. PLoS One. 13, e0196963. [PubMed: 30044784]

Wheeler AR, Directive to Prioritize Efforts to Reduce Animal Testing. Washington, D.C. 20460, 2019. Williams AJ, et al. , 2017. The CompTox Chemistry Dashboard: a community data resource for environmental chemistry. J Cheminform. 9, 61. [PubMed: 29185060]

Yang C, et al. , 2015. New publicly available chemical query language, CSRML, to support chemotype representations for application to data mining and modeling. J Chem Inf Model. 55, 510-28. [PubMed: 25647539]

Highlights

· Structure-activity relationships were constructed with HT-H295R assay data.

· Chemotypes associated with altered estrogen or androgen synthesis were identified.

· Random forest modeling used physicochemical descriptors to predict HT- H295R outcomes.

· Nearest neighbor models used structural similarity to infer HT-H295R outcomes.

· Together all three approaches inform priority chemicals for screening.

Bioactivity Modeled	(+) chems	(-) chems
Estrogen up	52	590
Estrogen dn	40	602
Androgen up	4	638
Androgen dn	99	543
HT-H295R maxmMd	555	845

Figure 1. Multi-strategy approach to predicting bioactivity.

A. Assay data for model building

B. Multi-strategy approach to predicting bioactivity

C. Model agreement

Individual features

inputs

methods

performance evaluation

E and A, ±1.5-fold change

Enrichment: Fisher’s test and odds ratios for each chemotype

Balanced accuracy of prelim structure alert

Presence of enriched chemotypes per E or A mode

Chemotypes

HT-H295R maxmMd

Local approach Global approach

Leave one out cross-validation

Chemotypes

Optimized random forest model for HT- H295R maxmMd

HT-H295R maxmMd

Morgan fingerprints

Machine learning: random forest Models

Recursive feature elimination

OPERA predictions

Feature importance

Nearest neighbor models (chemotypes and Morgan) for HT- H295R maxmMd

Leave one out cross-validation

Chemotypes

HT-H295R maxmMd

Morgan fingerprints

Jaccard similarity-based predictions (GenRA)

# of neighbors/ n’hood

Evaluation of model agreement with leave one out cross-validation

OPERA predictions

upper bound probability for +

lower bound probability for +

External prediction set

In part A, the available datasets for model building are presented, including the positive (+) and negative (-) number of chemicals (chems) available for each type of bioactivity. In (B), the multi-strategy approach to predicting bioactivity is summarized by the input, methods, and performance evaluation for enrichment of individual structural features, a global modeling approach, and a local modeling approach. In (C), the final approach for evaluating individual features, global approach, and local approach were all applied to an external prediction set of chemicals. The global and local approaches were combined in a model agreement score, and the performance of this model agreement score, using training set chemicals, was also evaluated.

Figure 2. Odds ratios for enriched chemotypes for estrogen and androgen synthesis modulators Chemotypes significantly enriched for androgen dn (p<0.01 and odds ratios > 3), androgen up, estrogen dn, and estrogen up (FDR adjusted p<0.05 and odds ratios > 3). The odds ratio lower bound, mean, and upper bound are illustrated as connected dots for each chemotype, with the chemotypes in the legend ordered 1-14.

Androgen Down

Androgen Up

Chemotype

2.5

5.0

7.5

10.0

12.5

1000

2000

3000

Estrogen Down

Estrogen Up

Odds Ratio of a 1.5 Fold Change given Chemotype Presence

Chemotype

bond.CC .. O.C_ketone_alkane_cyclic (1

- bond.CC .. O.C_ketone_alkane_cyclic _. C5. (2

bond.COH_alcohol_sec.alkyl (3

- bond.CS_sulfide_dialkyl (4

- chain.alkaneBranch_neopentyl_C5 (5

- chain.alkaneCyclic_pentyl_C5 (6

- chain.alkeneCyclic_diene_cyclohexene (7

- ring.fused_steroid_generic _. 5_6_6_6. (8

ring.hetero _. 5 ._ N_imidazole (9

- ring.hetero _. 5 ._ N_triazole_1_2_4 .. (10

- ring.hetero _. 5 ._ N_triazole_1_3_4 .. (11

- ring.hetero _. 5 ._ Z_1_2_4_1_3_4.Z (12

ring.hetero _. 6 ._ N_triazine_1_3_5.(13

ring.hetero _. 6 ._ Z_1_3_5. (14

Figure 3. RFE results for different RF models Different RF models were built using (A) OPERA predictions and chemotypes; (B) chemotypes only; (C) OPERA predictions only; and (D) Morgan fingerprints only. The specificity (salmon circles) and sensitivity (blue triangles) are illustrated for the number of features included for each RFE iteration.

(A) OPERA Only

(B) Chemotypes Only

1.00

0.75

Rate

0.50

Rate

0.50

0.25

0.00

Number of Features

100

200

300

400

Number of Features

variable

(D) Morgan Fingerprints Only

Specificity Sensitivity

1.00

0.75

Rate

0.50

Rate

0.50

0.25

0.00

100

200

300

400

250

500

750

1000

Number of Features

Figure 4. Tuning parameters for the RF model The RF model based on 13 OPERA predictions as the input features was iterated to demonstrate the lack of effect from tuning the (A) number of trees and (B) the mtry from the default values of 750 and 3.6, respectively, on sensitivity and specificity. The default mtry value is illustrated by the dotted vertical line in B.

Rate

Specificity

Sensitivity

1.00

☒

0.75

☒

Rate

0.50

0.25

0.00

500

750

1000

1250

1500

Number of Trees

mtry

Figure 5. Summary of LOOCV results for NN model development The chemotype and Morgan fingerprint NN models required simultaneous tuning of the number of neighbors in each neighborhood and the upper and lower cutoffs on the proportion of positive and negative neighbors for prediction. In gray boxes across the top are the (upper, lower) cutoffs, and on the x-axis are the number of neighbors per neighborhood in each model iteration. The number of training chemicals included is indicated by the size of the point. Specificity (salmon circles) and sensitivity (blue triangles) demonstrate the LOOCV results.

0.7,0.1

0.7,0.2

0.8,0.1

0.8,0.2

0.9,0.1

0.9,0.2

1.00

0.75

Chemotypes

0.50

Number of Chemicals

0.25

· 100

· 200

· 300

0.00

· 400

· 500

1.00

· 600

0.75

Rate

Morgan

· Specificity

0.50

Sensitivity

0.25

0.00

Number of Neighbors

Figure 6. Summary local NN model results from training and prediction sets In A, the local NN model results for the 1400 chemicals in the training set (with HT-H295R maxmMd values) are provided; in B, the local NN model results for the 6302 chemicals in the external prediction set are provided. Mpos = Morgan NN model positive; Cpos = chemotype NN model positive; Mneg = Morgan NN model negative; Cneg = chemotype NN model negative.

Cpos

Mneg

Cpos

Mneg

153

730

524

201

584

Mpos

Cneg

Mpos

Cneg

Figure 7. Histograms of Jaccard similarities between target chemical and nearest neighborhood. The mean Jaccard similarities between target chemicals and the nearest neighbood based on chemotypes (A) and Morgan fingerprints (B) are shown. The dotted line indicates the histogram for the training set chemicals with HT-H295R bioactivity data, and the solid line indicates the histogram for external prediction chemicals from the EDSPUOC. The solid red vertical line indicates a Jaccard similarity of 0.8.

Density

0.00

0.25

0.50

0.75

0.25

0.50

0.75

1.00

Jaccard Mean

Line ☐ EDSP

☐ HTH295R

EPA Author Manuscript

steroids and steroid derivatives

pyridines and derivatives

prenol lipids

phenols

Class

organooxygen compounds

Class

organooxygen compounds

organonitrogen compounds

naphthalenes

fatty acyls-

carboxylic acids and derivatives

benzene and substituted derivatives

0.00

0.25

0.50

0.75

1.00

0.00

0.25

0.50

0.75

1.00

Predicted Value

Prediction . Equivocal ^ Negative . Positive

Figure 8. ClassyFire grouping of external test set results.

vinyl halides

unsaturated hydrocarbons

transition metal oxoanionic compounds

unsaturated hydrocarbons

stilbenes

steroids and steroid derivatives

saturated hydrocarbons

pyridines and derivatives

prenol lipids

phenols

phenol ethers

organooxygen compounds

Class

organonitrogen compounds

organometalloid compounds

Class

organometalloid compounds

organic sulfuric acids and derivatives

organic phosphoric acids and derivatives

organic metal salts

organic carbonic acids and derivatives

organic metal salts

naphthalenes

organic carbonic acids and derivatives

fatty acyls

naphthalenes

diazines

fatty acyls

carboxylic acids and derivatives

diazines

benzene and substituted derivatives

carboxylic acids and derivatives

azoles

benzene and substituted derivatives

anthracenes

azoles

alkyl halides

anthracenes

alkali metal oxoanionic compounds

alkyl halides

0.00

0.25

0.50

0.75

1.00

0.00

0.25

0.50

0.75

1.00

Predicted Value

Prediction . Equivocal - Negative . Positive

ClassyFire class groupings of chemical results from the local NN models are illustrated for (A) chemotype NN model and HT-H295R training set; (B) Morgan fingerprint NN model and HT-H295R training set; (C) chemotype NN model and EDSPUOC prediction set; and (D) Morgan fingerprint NN model and EDSPUOC prediction set. Orange triangle = negative; green circles = equivocal; purple square = positive. The size of the symbol relates to the number of chemicals with that predicted nearest neighbor model value.

3000

2825

Model Agreement

2000

0.25

0.5

Count

0.75

1154

1.25

1.5

1000

1.75

638

515

614

2.25

2.75

136

197

153

0.25

0.5

0.75

1.25

1.5

1.75

2.25

2.75

Model Agreement

Model Agreement	Sensitivity	Specificity	Accuracy	Kappa
0.75	0.9712	0.2284	0.5998	0.1668
1	0.6793	0.7846	0.7319	0.4633
1.25	0.6721	0.7941	0.7331	0.4673
1.5	0.6234	0.8166	0.72	0.4475
1.75	0.2468	0.9763	0.6116	0.2545

Figure 9. Model agreement values.
Model Agreement	Proportion	Sample Size
2.75	1	51
2.25	0.82	98
2	0.75	8
1.75	0.61	344
1.5	0.59	46
1.25	0.33	12
1	0.26	632
0.75	0	1
0.5	0.12	130
0	0	78

In (A), the model agreement values for the EDSPUOC are illustrated as a bar chart with the counts of chemicals associated with score annotated above each bar. Model agreement classification performance, using different thresholds of model agreement to indicate a positive (greater than 0.75-1.75, as rows in this table), is presented in (B). A summary of model agreement results for the 1400 chemical training set are provided in (C), including the proportion of chemicals positive and sample size by model agreement value.

Androsterone

H-0

Cortisone

Propiconazole

Ipconazole

Figure 10. Example chemicals
Androgen Estrogen and	E and A fold changes	E1 dn, E2 dn, TESTO up	Not screened	E2 dn	Not screened
Androgen Estrogen and	E and A chemotype predictions	Estrogens dn, Androgens up	Estrogens dn, Androgens up	Estrogens dn	Estrogens dn
Steroid Synthesis	maxmMd	Positive	Not screened	Positive	Not screened
	RF	0.75	0.75	0.75	0.75
	chemotype NN	1	1	1	1
	Morgan NN	0.5	1	1	1
	Model Agreement	2.25	2.75	2.75	2.75

Performance in the HT-H295R assay (only for androsterone and propiconazole in this figure), models developed in this work, and model agreement scores, are provided for four chemicals. Androsterone and propiconazole were screened in the HT-H295R assay and cortisone and ipconazole are on the EDSPUOC but lack bioactivity data from the HT-H295R assay.

Table 1. Classification performance of enriched chemotypes for estrogen and androgen effects

Each of the 4 modes (estrogen and androgen, up and down) were evaluated for classification performance based on the presence of at least one enriched chemotype per mode, with computation of sensitivity, specificity, balanced accuracy, and Kappa.

Mode	Sensitivity	Specificity	Balanced accuracy	Kappa
Estrogen up	0.05	0.97	0.51	0.02
Estrogen down	0.2	0.88	0.54	0.1
Androgen up	0.11	0.95	0.53	0.07
Androgen down	0.08	0.94	0.51	0.02

Table 2. 10 most enriched chemotypes for positive maxmMd.

The results from the Fisher’s test (p-value and false discovery rated-adjusted p-value) and computation of the odds ratio, including “Lower” and “Upper” bounds, are listed for each enriched chemotype, along with the

number of training chemicals containing the chemotype, the chemotype name, and two-dimensional illustration of the chemotype.
P-value	P-value Adjusted	Odds Ratio	Lower	Upper	chemicals # training	Chemotype Name
1.7E- 10	3.8E- 08	43.1	7.1	1753. 9	28	ring.fused_steroid_generic _. 5_6_6_6.
5.2E- 04	8.5E- 03	9.3	2.1	86.0	14	bond.P.O_phosphate_dithio
5.5E- 04	8.6E- 03	7.3	2.0	39.6	17	ring.hetero _. 6 ._ N_triazine _. 1_3_5 ..
3.7E- 07	4.2E- 05	6.8	2.9	18.6	37	bond.P.S_generic
4.6E- 06	1.9E- 04	3.7	2.0	7.0	56	bond.CC .. O.C_ketone_alkane_cyclic
3.1E- 05	8.7E- 04	3.3	1.8	6.2	55	chain.alkaneCyclic_pentyl_C5
4.2E- 04	8.0E- 03	3.3	1.6	7.0	40	bond.COH_alcohol_ter.alkyl
1.5E- 06	9.5E- 05	3.1	1.9	5.3	79	chain.alkaneCyclic_hexyl_C6

4.6E- 04	8.0E- 03	2.7	1.5	5.2	52	bond.CC .. O.C_ketone_alkene_cyclic_2.en.1.one_gen eric
2.4E- 17	1.1E- 14	2.7	2.1 Comput	Toxicol. Author 3.4	845	manuscript; available in PMC 2023 November 01. ring.aromatic_benzene

Table 3. LOOCV results for global random forest models. Five different sets of input features were used in this analysis, with computation of sensitivity, specificity, balanced accuracy, Kappa, and out-of-bag error (OOB), and annotation of the number of features included.
Feature Set	Sensitivity	Specificity	Balanced Accuracy	Kappa	OOB	Number of Features
OPERA predictions	0.61	0.80	0.71	0.42	0.29	13
Chemotypes	0.59	0.79	0.69	0.36	0.31	455
OPERA predictions and Chemotypes	0.65	0.82	0.73	0.46	0.27	468
Morgan fingerprints	0.56	0.77	0.67	0.32	0.33	1024
OPERA predictions and Morgan	0.61	0.81	0.71	0.43	0.29	1037

Table 4. Feature importance for RF models

The type of feature, feature name, and then columns for different global RF models are presented, including OPERA predictions only, chemotypes only, and OPERA and chemotypes. The value for feature importance is based off 100 replications of the RF model and represents the average rank of importance based on mean decrease accuracy.

Type	Feature	OPERA only	Chemo- types only	Chemo- types and OPERA
OPERA FEATURES	OCTANOL_WATER_PARTITION_LOGP_OPERA_PRED	1	NP	1.01
	BOILING_POINT_DEGC_OPERA_PRED	2.79	NP	6.35
	BIODEGRADATION_HALF_LIFE_DAYS_DAYS_OPERA_PRED	3.38	NP	2.15
	SOIL_ADSORPTION_COEFFICIENT_KOC_L.KG_OPERA_PRED	3.81	NP	9.27
	BIOCONCENTRATION_FACTOR_OPERA_PRED	5.41	NP	3.65
	AVERAGE_MASS	5.62	NP	7.81
	OCTANOL_AIR_PARTITION_COEFF_LOGKOA_OPERA_PRED	7.25	NP	5.7
	VAPOR_PRESSURE_MMHG_OPERA_PRED	7.54	NP	5.38
	WATER_SOLUBILITY_MOL.L_OPERA_PRED	8.59	NP	10.79
	HENRYS_LAW_ATM.M3.MOLE_OPERA_PRED	9.69	NP	4.57
	MELTING_POINT_DEGC_OPERA_PRED	11.1	NP	9.33
	ATMOSPHERIC_HYDROXYLATION_RATE _. AOH ._ CM3.MOLECULE.SEC_OPERA_PRED	11.72	NP	12.8
	OPERA_KM_DAYS_OPERA_PRED	13	NP	16.17
CHEMOTYPES	ring.aromatic_benzene	NP	1	15.45
	bond.P.S_generic	NP	2	14.69
	bond.CX_halide_aromatic.X_generic	NP	3	NA
	bond.C .. O.O_carboxylicAcid_generic	NP	5.38	NA
	bond.C .. O.O_carboxylicEster_aromatic	NP	5.39	NA
	ring.fused_steroid_generic _. 5_6_6_6.	NP	5.69	14.85
	bond.S.O_sulfonyl_generic	NP	6.21	14.84
	bond.CC .. O.C_ketone_alkane_cyclic	NP	8.8	NA
	bond.CN_amine_aromatic_generic	NP	9.27	NA
	bond.COC_ether_aliphatic_aromatic	NP	9.85	NA
	bond.COH_alcohol_aromatic_phenol	NP	13.66	NA
	bond.C .. O.N_carbamate	NP	13.94	NA
	bond.P.O_phosphate_dithio	NP	15.13	NA
	bond.CN_amine_ter.N_generic	NP	16.2	NA

Table 5. Summary of estrogen and androgen preliminary structure alerts

The number of chemicals that were flagged as positive (+) or negative (-) by mode based on the preliminary structure alert approach (at least one enriched chemotype per mode must be present for a positive).

Mode	Chemicals
Mode	(+)	(-)
Androgen dn	97	1303
Androgen up	108	1292
Estrogen dn	210	1190
Estrogen up	50	1350

Table 6. Chemicals positive in global RF and local NN models.

44 chemicals were positive in all three models (global RF, chemotype NN, and Morgan NN) of HT-H295R bioactivity. This is an excerpt from Supplemental File 5, which shows the results of all the models, model agreement score, and presence of preliminary structure alerts for effects on estrogen and/or androgen synthesis in the HT-H295R assay (up and down) for the 1400 chemical training set and the 6302 chemicals from the EDSPUOC for which predictions were made.

Chemical	DTXSID	Chemotype NN model	Morgan NN model	Global RF model	Model Agreement Score	Estrogen up	Estrogen dn	Andrgen up
Benzo(a)pyrene	DTXSID2020139	Positive	Positive	positive	2.75	0	0	0
4-Chloro-1,3- diaminobenzene	DTXSID0020282	Positive	Positive	positive	2.75	0	0	0
2-Chloro-1,4- diaminobenzene sulfate (1:1)	DTXSID0020284	Positive	Positive	positive	2.75	0	0	0
2,4-Diaminoanisole sulfate	DTXSID7020398	Positive	Positive	positive	2.75	0	0	0
4,4'- Diaminoazobenzene	DTXSID2020399	Positive	Positive	positive	2.75	0	0	0
3,3'-Dichlorobenzidine	DTXSID6020432	Positive	Positive	positive	2.75	0	0	0
Zearalenone	DTXSID0021460	Positive	Positive	positive	2.75	0	0	0
Fensulfothion	DTXSID6021953	Positive	Positive	positive	2.75	0	0	0
4-Hydroxyazobenzene	DTXSID3022160	Positive	Positive	positive	2.75	0	0	0
Testosterone	DTXSID8022371	Positive	Positive	positive	2.75	0	1	1
Aldosterone	DTXSID7022419	Positive	Positive	positive	2.75	0	1	1
Chrysene	DTXSID0022432	Positive	Positive	positive	2.75	0	0	0
Cortisone	DTXSID5022857	Positive	Positive	positive	2.75	0	1	1
Ethisterone	DTXSID2023016	Positive	Positive	positive	2.75	0	1	1
Fludrocortisone	DTXSID7023061	Positive	Positive	positive	2.75	0	1	1
Benzo(e)pyrene	DTXSID3023764	Positive	Positive	positive	2.75	0	0	0
Terbutryn	DTXSID3024318	Positive	Positive	positive	2.75	1	0	0
6-Methoxy-2- benzothiazolamine	DTXSID9024485	Positive	Positive	positive	2.75	0	0	0
2,4-Dichloroaniline	DTXSID1024966	Positive	Positive	positive	2.75	0	0	0
Uniconazole	DTXSID7032505	Positive	Positive	positive	2.75	0	1	0
Desmedipham	DTXSID0034518	Positive	Positive	positive	2.75	0	0	0
Diniconazole-M	DTXSID2034548	Positive	Positive	positive	2.75	0	1	0
Ipconazole	DTXSID7034674	Positive	Positive	positive	2.75	0	1	0
Levonorgestrel	DTXSID3036496	Positive	Positive	positive	2.75	0	1	1
3,3'-Diaminobenzidine	DTXSID2036827	Positive	Positive	positive	2.75	0	0	0
trans-Permethrin	DTXSID4037611	Positive	Positive	positive	2.75	0	0	0
2-Hydroxyatrazine	DTXSID6037807	Positive	Positive	positive	2.75	1	0	0
Methazole	DTXSID1040293	Positive	Positive	positive	2.75	0	0	0

Chemical	DTXSID	Chemotype NN model	Morgan NN model	Global RF model	Model Agreement Score	Estrogen up	Estrogen dn	Andrgen up
2,4-Diaminoanisole	DTXSID5041358	Positive	Positive	positive	2.75	0	0	0
Chlorfluazuron	DTXSID5041772	Positive	Positive	positive	2.75	0	0	0
Isoxathion	DTXSID0042080	Positive	Positive	positive	2.75	0	0	0
Swep	DTXSID7042437	Positive	Positive	positive	2.75	0	0	0
Terbumeton	DTXSID1042445	Positive	Positive	positive	2.75	1	0	0
21- Hydroxyprogesterone	DTXSID0045254	Positive	Positive	positive	2.75	0	1	1
Betamethasone 21- phosphate disodium	DTXSID8047137	Positive	Positive	positive	2.75	0	1	0
Flurandrenolide	DTXSID2047434	Positive	Positive	positive	2.75	0	1	1
Fluorometholone	DTXSID7047435	Positive	Positive	positive	2.75	0	1	0
dl-Norgestrel	DTXSID3047477	Positive	Positive	positive	2.75	0	1	1
Perylene	DTXSID4047753	Positive	Positive	positive	2.75	0	0	0
9-Fluorenone	DTXSID6049307	Positive	Positive	positive	2.75	0	0	0
3,3,3',3'- Tetramethyl-2,2',3,3'- tetrahydro-1,1'- spirobi[indene]-6,6'- diol	DTXSID3051753	Positive	Positive	positive	2.75	0	0	0
C.I.Solvent Orange 4	DTXSID4052136	Positive	Positive	positive	2.75	0	0	0
(+)-cis-Permethrin	DTXSID5052208	Positive	Positive	positive	2.75	0	0	0
Phenyl (4- cyanophenyl)carbamate	DTXSID2072251	Positive	Positive	positive	2.75	0	0	0

Quartz 4

Explorer

37841081

Evaluating structure-based activity in a high-throughput assay for steroid biosynthesis

Abstract

Keywords

Introduction

Methods

Overview of approach

Chemical descriptors

Bioactivity data

Structure descriptors

Predicted physicochemical and environmental fate properties

Multi-strategy approach to predicting bioactivity

Chemotype enrichment for HT-H295R activity

Global modeling approach

e. OPERA and Morgan fingerprints

Nearest neighbor (NN) modeling approach

Evaluating preliminary structure alerts and model agreement

Estrogen and androgen preliminary structure alerts

Model agreement score for the global and local approaches

External prediction set

Software

Results

Training data available

Individual feature enrichment

HT-H295R bioactivity modeling

Global random forest models

External prediction set

Model agreement

Discussion

Supplementary Material

Acknowledgments

References

Highlights

Graph View

Table of Contents

Backlinks

Quartz 4

Explorer

37841081

Evaluating structure-based activity in a high-throughput assay for steroid biosynthesis

Abstract

Keywords

Introduction

Methods

Overview of approach

Chemical descriptors

Bioactivity data

Structure descriptors

Predicted physicochemical and environmental fate properties

Multi-strategy approach to predicting bioactivity

Chemotype enrichment for HT-H295R activity

Chemotype enrichment for estrogen and androgen related bioactivity

Global modeling approach

e. OPERA and Morgan fingerprints

Nearest neighbor (NN) modeling approach

Evaluating preliminary structure alerts and model agreement

Estrogen and androgen preliminary structure alerts

Model agreement score for the global and local approaches

External prediction set

Software

Results

Training data available

Individual feature enrichment

HT-H295R bioactivity modeling

Global random forest models

External prediction set

Model agreement

Discussion

Supplementary Material

Acknowledgments

References

Highlights

Graph View

Table of Contents

Backlinks