International Journal of Molecular Sciences

MDPI

Article

Intratumoral Microbiota Correlates with AP-2 Expression: A Pan-Cancer Map with Cohort-Specific Prognostic and Molecular Footprints

Damian Kołat 1,2,*DD, Piotr Gromek 10D, Lin-Yong Zhao 3, Żaneta Kałuzińska-Kołat 1,2,4 (D, Mateusz Kciuk 1,2[D, Renata Kontek 20 and Elżbieta Płuciennik 1D

1 Department of Functional Genomics, Medical University of Lodz, 90-752 Lodz, Poland; piotr.gromek@stud.umed.lodz.pl (P.G.); zaneta.kaluzinska@umed.lodz.pl (Ż.K .- K.); mateusz.kciuk@biol.uni.lodz.pl (M.K.); elzbieta.pluciennik@umed.lodz.pl (E.P.)

2 Department of Molecular Biotechnology and Genetics, University of Lodz, 90-237 Lodz, Poland; renata.kontek@biol.uni.lodz.pl

3 Department of General Surgery, West China Hospital, Sichuan University, Chengdu 610041, China; 153795352@scu.edu.cn

4 Department of Biomedicine and Experimental Surgery, Medical University of Lodz, 90-136 Lodz, Poland

* Correspondence: damian.kolat@umed.lodz.pl

check for updates

Academic Editor: Apostolos Zaravinos

Received: 4 November 2025 Revised: 27 November 2025

Accepted: 28 November 2025 Published: 29 November 2025

Citation: Kołat, D .; Gromek, P .; Zhao, L .- Y .; Kałuzińska-Kołat, Ż .; Kciuk, M .; Kontek, R .; Płuciennik, E. Intratumoral Microbiota Correlates with AP-2 Expression: A Pan-Cancer Map with Cohort-Specific Prognostic and Molecular Footprints. Int. J. Mol. Sci. 2025, 26, 11587. https://doi.org/ 10.3390/ijms262311587

Copyright: @ 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/ licenses/by/4.0/).

Abstract

The AP-2 family is a group of key regulators in cancer, yet their relationship with in- tratumoral microbes remains undefined. The present pan-cancer workflow leveraged TCGA transcriptomic data to correlate expression of AP-2 representatives with bacterial abundance on the genus and species level, followed by cohort-specific survival modeling, clinical profiling, differential expression, weighted co-expression analysis, and chromatin proximity tests with AP-2 enrichment. Significant correlations between microbiota and AP- 2 were observed in 18 of 33 analyzed tumor types; TFAP2E-AS1 was most recurrent among AP-2 members, and Halomonas was most recurrent among genera. Further species-level verification and prognostic importance nominated three promising pairs: Paraburkholderia fungorum-TFAP2E in adrenocortical carcinoma (ACC), Actinomyces oris-TFAP2E in diffuse large B-cell lymphoma (DLBC), and Cutibacterium granulosum-TFAP2B in stomach ade- nocarcinoma (STAD). An attempt to define a consensus expression signature driven by microbiota and AP-2, yet independent of the specific species or family member, revealed genes regulating various biological processes and pathways. ACC and DLBC shared a consensus expression program, whereas STAD diverged; chromatin analysis showed AP-2 motifs near microbe-responsive genes in ACC and DLBC but not STAD, supporting cohort- specific regulation. Collectively, AP-2 family members emerge as plausible mediators of tumor microbiota-host interplay, warranting further mechanistic and translational research.

Keywords: microbiota; AP-2; pan-cancer; bioinformatics; prognosis; TFAP2E; TFAP2B; adrenocortical carcinoma; diffuse large B-cell lymphoma; stomach adenocarcinoma

1. Introduction

The human microbiome, a complex ecosystem of bacteria, viruses, fungi, and proto- zoa, colonizes diverse anatomical niches (e.g., gut, oral cavity, skin), orchestrating both homeostasis and disease [1,2]. While the majority of research has predominantly focused on the influence of gut microbiota on the efficacy of anti-cancer therapies (particularly immunotherapy) [3-5], emerging evidence reveals that microbial communities within

non-gastrointestinal tumors actively participate in disease development, progression, and clinical outcomes [6,7]. Advanced sequencing technologies have uncovered distinct micro- biomes across cancers, with bacterial composition varying not only between malignancies but also among patients with the same tumor type [8]. Intratumoral bacteria can influ- ence drug metabolism and alter the efficacy of checkpoint inhibitors, highlighting their therapeutic relevance [9,10]. Dysbiosis in tumors correlates with advanced disease stage, metastasis, and poor survival [11,12], suggesting that microbiome-targeted strategies could complement existing cancer therapies.

Microbes are recognized as integral components of the tumor microenvironment, where they modulate inflammation, angiogenesis, and immune evasion through direct microbial-host cell interactions or the secretion of metabolites [13-15]. In fact, the regu- lation of metabolic processes or the immune response is driven by transcriptional mech- anisms [16]. The microbiome regulates host gene expression through bidirectional in- teractions in cellular/ extracellular compartments [17]. Microbiome-derived stimuli (e.g., transcripts, metabolites, enzymes, pH changes) are recognized by host cells via extracellular receptors or intracellular entry. These stimuli modulate transcription factor (TF) binding, DNA methylation, chromatin accessibility, transcription, and alternative splicing, driving differential gene/protein expression. Conversely, host-derived proteins mutually shape microbial growth and transcriptional activity, establishing a feedback loop [17]. The mi- crobiota can regulate host gene transcription independently of chromatin accessibility, primarily through differential expression (DE) of TFs and enrichment of their binding sites in nucleosome-depleted cis-regulatory regions [18]. Beyond TF expression, the microbiota also alters TF binding activity and induces epigenetic modifications. Notably, microbial components can directly interact with TFs, as demonstrated for specific host-microbe molecular interfaces [18,19].

For instance, the microbiome inhibits the HNF4« transcription factor, impairing the regulation of inflammatory pathways and promoting pro-inflammatory states [19]. Simi- larly, the microbiota modulates key transcription factors (e.g., KLF4, AP-1, SP-1), underscor- ing systemic microbial-TF interactions [20]. Interestingly, one of the TF families important in carcinogenesis, but yet to be explored in relation to the microbiota, is the Activating enhancer-binding protein-2 (AP-2). Microbiome-related research on AP-2 members should be initiated; one of the premises behind this is H. pylori infection, which has been shown to disrupt cellular processes by activating the AP-1 family, a member of the same superclass as the AP-2 family [21,22]. Our recent works highlight the broader role of the AP-2 TFs in cancer and gastroenterological disorders, including gastric/colorectal tumors, metabolic diseases, and gut dysmotility [23,24]. The AP-2 family regulates embryogenesis (e.g., limb, eye, and facial development) under homeostatic conditions but exhibits dysregulated functionality in cancer, serving as a prognostic marker [25,26]. AP-2 encompasses five proteins, i.e., AP-2x, AP-26, AP-2y, AP-28, and AP-2€, which are encoded, respectively, by TFAP2A, TFAP2B, TFAP2C, TFAP2D, and TFAP2E genes. In addition, three genes affiliated with the long non-coding RNA (lncRNA) class have been identified, i.e., TFAP2A-AS1, TFAP2A-AS2, and TFAP2E-AS1 [27,28]. Structurally, AP-2 proteins belong to the basic Helix-Span-Helix class (bHSH) of Superclass-1, with a transactivation domain at the amino terminus and DNA-binding/dimerization motifs at the carboxyl terminus [27]. The HSH motif facilitates homo- and heterodimerization but requires post-translational modifica- tions (phosphorylation, sumoylation) and interactions with other proteins to modulate subcellular localization, DNA binding, or degradation [25,29]. AP-2 TFs bind GC-rich DNA motifs, with all members except AP-28 containing a proline-rich activation domain [30]. Previously, we indicated that AP-2 members regulate biological processes that underlie cancer hallmarks [31]. Moreover, these TFs are known to affect the immune system and

the effectiveness of related therapies [32], as well as invasiveness [33] or proliferation [23], i.e., aspects frequently discussed in the microbiota literature [11,34]. The importance of AP-2 in the diagnosis of selected tumors, its implication in a network of long non-coding RNAs and microRNAs, as well as the prognostic significance of its targets, certifies that the complexity of the entire family is worth investigating [27,35,36].

As mentioned above, no study has investigated the potential relationship between intratumoral microbiota abundance and AP-2 expression, which could lay the groundwork for future mechanistic research on relevant tumor types. Thus, we present the first analysis of the relationships between microbial abundance and the expression of AP-2 family members, which was preceded by a general pan-cancer assessment of bacterial composition and diversity. By integrating host-microbiome data across a human pan-cancer spectrum, we identified significant correlations between bacterial genera/species and AP-2 family members, as well as evaluated their prognostic and clinical relevance. Furthermore, gene expression patterns associated with these relationships were examined through differential expression and co-expression network analyses. By bridging microbiome and transcription factor research, this work reveals the biological outcomes underlying the microbiota-AP-2 relationship, which may influence cancer development and progression.

2. Results

2.1. Despite Each Tumor Type Having Its Own Bacterial Composition, Some Commonalities Were Noted

Prior to the investigation focused on the AP-2 family, each cohort was summarized in terms of its bacterial composition (Figure 1). Although heterogeneity between tumors or between patients within a single cohort was expected, it is worth noting that the most preva- lent genus in a given cohort also predominates in others, e.g., Bacillus is the foremost genus in BRCA, KIRC, LUAD, LUSC, OV, PCPG, PRAD, SKCM, THCA, and UCEC. Other prevail- ing genera are Pseudomonas (CHOL, KICH, LGG, PADD, TGCT, THYM, UVM), Paenibacillus (GBM, BLCA, CESC, KIRP, LIHC), Prevotella (ESCA, HNSC, STAD), Bacteroides (COAD, READ), and Actinoplanes (DLBC, MESO). In contrast, the most prevalent genera in ACC, SARC, and UCS are, respectively, Acinetobacter, Peptoclostridium, and Saccharomonospora. In some tumors, mutual exclusivity in bacterial composition can be observed, particularly in KIRP and LIHC, and to a lesser extent in DLBC and SKCM.

Cohorts were also investigated for their microbiota abundance in a pan-cancer spec- trum, as well as in terms of diversity indices (Figure 2). The top 50 most prevalent genera included those mentioned earlier (Bacillus, Pseudomonas, Paenibacillus, Prevotella, Bacteroides, and Acinetobacter), as well as bacteria that are less noticeable when analyzing each tumor separately, such as Escherichia, Klebsiella, Yersinia, and Vibrio. A prevalent characteristic of the tumor microbiome is a subtle decline in the Shannon index compared to control tissues. This finding is often accompanied by a decrease in the Simpson index, indicating domina- tion by a few taxa. Still, some tissues exhibit similar interquartile ranges, independent of the sample type, suggesting that certain tumors have a microbial composition resembling that of normal specimens. As for richness, some tumors, such as BRCA, COAD, OV, and READ, harbor two to three times more taxa than others.

Figure 1. Bacterial composition in tumor cohorts included in the study. The top 15 most abundant genera are specified for each disease.

ACC

BLCA

BRCA

CESC

CHOL

COAD

DLBC

ESCA

GBM

HNSC

KICH

KIRC

KIRP

LGG

LIHC

LUAD

LUSC

MESO

OV

PAAD

PCPG

PRAD

READ

SARC

SKCM

STAD

TGCT

THCA

THYM

UCEC

UCS

UVM

Int. J. Mol. Sci. 2025, 26, 11587

Relative abundance (%)

D

100

TCMbio

CH

Legend for Figure 2A -+

S

25

0

0

Shannon

100

g_Escherichia

BLC

g_Pseudomonas

ARCA

@_Staphylococcus

CESC

g_Salmonella

CHO

g_Klebsiella

m

0

caso

g_Ralstonia

OLBe

@_Pasteurella

1.00 -

100

@_Porphyrobacter

Coca

Simpson

CA

0.76-

GBar

g_Cutbacterium

OFPca

MASC

g_Burkholderia

0.50-

g_Corynebacterium

Eis

NON

Se

0.25-

APRO

g_Streptococcus

.. ..

COMO

@_Bradyrhizobium

0

Ao

G

0.00

OLBC

CARL

COCA

400

Log (Observed)

g_Yersinia g_Fusobacterium

A00

GBM

Chp

g_Clostridium

OUÇA

MASO

440

@_Halomonas

2

DACA

NICH

400

G_Lactococcus

0

GÃsp

ARO

MESO

g_Shigella

g_Enterobacter

g_Hydrogenophaga

90%

1/8

9

00

COND

LAty

440

-.. ..

980

g_Bacteroides

600

PCPO


COCA

BRAD

g_Prevotella

Log (Chao1)

900

Che

SLCA

004.

QUAD

1640

9_Vibrio

g_Sphingomonas

4

MAISO

LUST

-..

FAR

MARCA

g_Shinelia

May

MESO

SCH

@_Alicycliphilus

9

PESO

WHO1

TIPO

-

2700

… .

0

tipo

20

Boy

Evenness

=

g_Lactobacillus

40

-DA0

DLBC

LAME

CA

INCA

g_Priestia

-.. ..

g_Geobacillus

ESCA

40G

HM

CHO

“RO

2

g_Streptomy ces

g_Actinobacillus

GBM

16

Sample_type

ACC

ELCA

440

440

non_tumor

UC’s

=

MINSC

9_Actinomyces

g_Anaerococcus

NICH

4g

OKCHA

tumor

ORCA

UM

·

CESC

RESO

STAD

g_Aeromonas

TRO

2

g_Clostridioides

CHOL

Tpo

0%

TGCY

Shannon index

V

COMO

AMAL

140

PACA

8

g_Agrobacterium

CBC

50

SMY

2

9_Kinneretia

100

20

Sample

CE

g_Paenbacillus

ESCA

CANC

GBM

READ

B

CUAD

CC

90

g_Bacilus

..

ANSO

SARA

LUSC

Im

non_tumor

NACH

2

type

g_Moracella

g_Acinatobacter

NICH

MESO

STAD

0

KIRO

-

TI

g_Rothia

0%

E

g_Phocaeicola

A/go

For

A10

g_Cupriavidus

CAMI

PCPG

CA

8

g_Methylobacterium

LGG

3

G

PRAD

LINC

CEO

Figure 2. Pan-cancer view of bacterial abundance and diversity indices. (A) Top 50 most prevalent genera in a pan-cancer spectrum according to TCMbio data. (B) Evenness according to BIC. (C,D) Shannon

Simpson index of diversity

READ

Sample_type

9 8

q

g_Niala

CUAD

..

8

MARC

-

g_Ideonella

1

LUSc

tumor

_tumor

UM

40

%

SKCM

MESO

8

g_Neisseria g_Diaphorobacter

STAD

0%

I

0

8

$

.

IGCT

Simpson reciprocal in

lex

0

others

PAAD

THICA

PCPG

8

9h

m

8

PRAD

a

8

Sample_type

CEC

5

1

READ

S

VCS

0

2

MARC

non_tumor

8

8

BIC

6

5KCM

tumor

Un

18

8

STAD

0

8

8

0

IGCT

2%

8

8

THCA

Richness

500

B

8

400

0

THYM

8

V 9

CEO

2

a

20

48

0

kg

5 of 27

OCS

200

90

va

2

8

OVM

3

38

100

0

0

0

3 2

280

5g

De

9

8

28

20

8

de

9

40

8

4

q

Ve

%

CECA

20

188

8

VICA

8,

2

30

g

I

40

2%

2

38

00

42

40

I

ch

€40

F

4

100

20,

40

8

280

8

40

1980

30

8

3

8

26

00

de

96

0

4

ty

m

MESO

UD

9

Co

200

30

hy

20

6

Me

2

00

DE

80

0

indices according to TCMbio and BIC. (E,F) Simpson indices according to TCMbio and BIC. (G) log2-transformed abundances according to TCMbio. (H) Simpson’s reciprocal index accord- ing to BIC. (I) Chao1 index according to TCMbio. (J) Richness according to BIC. Data from TCMbio enables comparison between tumors and non-tumor specimens, whereas BIC data focus on differ- ences between tumor types.

2.2. Correlation of Microbiota Abundance and AP-2 Expression Revealed That More than Half of the Cohorts Contain Significant Relationships, with Some of Them Having Prognostic Importance

Focusing on the correlation between bacterial genus abundance and AP-2 family member expression, potential associations were observed in more than half of the cohorts, i.e., 18 of 33 tumor types (Table 1). Between cohorts, the most duplicative AP-2 family member was TFAP2E-AS1 (DLBC, ESCA, LAML, MESO, PCPG, READ, SKCM, UCS, UVM), while regarding genera, it was Halomonas (CHOL, DLBC, GBM, KICH, LIHC, SARC, SKCM). Others found in at least three cohorts were TFAP2B (CESC, CHOL, KICH, SARC, STAD, UCS), TFAP2D (GBM, LAML, LIHC, MESO, PCPG, SARC), TFAP2E (ACC, COAD, DLBC, KICH, READ), TFAP2A-AS1 (CHOL, DLBC, KICH, PCPG, READ), Meiothermus (CESC, DLBC, MESO), and Corynebacterium (DLBC, KICH, UCS). The cohort with the highest number of identified relationships was DLBC, while the fewest significant correlations were identified ex aequo in ACC, CESC, COAD, ESCA, LIHC, SKCM, STAD, and UVM.

Table 1. Correlation between the abundance of bacterial genera and the expression of AP-2 family members in a pan-cancer view.
TCGA Tumor CohortMicrobiota (Genus-Level)AP-2 TFCorrelation Coefficient and Statistical Significance
ACCParaburkholderiaTFAP2Er = 0.55; p < 0.0001 ( **** )
CESCMeiothermusTFAP2Br = 0.36; p < 0.0001 ( **** )
CHOLHalomonasTFAP2Br = 0.41; p < 0.05 (*)
Liberibacterr = 0.45; p < 0.01 ( ** )
MoraxellaTFAP2A-AS1r = 0.38; p < 0.05 (*)
Sphingomonasr = 0.35; p < 0.05 (*)
COADBrucellaTFAP2Er = 0.38; p < 0.0001 ( **** )
DLBCActinomycesTFAP2Er = 0.65; p < 0.0001 ( **** )
TFAP2E-AS1r = 0.67; p < 0.0001 ( **** )
AlcaligenesTFAP2A-AS2r = 0.49; p < 0.001 ( *** )
BrucellaTFAP2A-AS2r = 0.31; p < 0.05 (*)
CorynebacteriumTFAP2E-AS1r = 0.33; p < 0.05 (*)
CutibacteriumTFAP2A-AS2r = 0.32; p < 0.05 (*)
GordoniaTFAP2Er = 0.39; p < 0.01 ( ** )
TFAP2E-AS1r = 0.41; p < 0.01 ( ** )
TFAP2A-AS1r = 0.31; p < 0.05 (*)
TFAP2A-AS2r = 0.37; p < 0.05 (*)
HalomonasTFAP2E-AS1r = 0.47; p < 0.001 ( *** )
MeiothermusTFAP2Er = 0.32; p < 0.05 (*)
TFAP2A-AS1r = 0.38; p < 0.01 ( ** )
TFAP2A-AS2r = 0.51; p < 0.001 ( *** )
ParaburkholderiaTFAP2E-AS1r = 0.36; p < 0.05 (*)
TFAP2A-AS2r = 0.51; p < 0.001 ( *** )
SphingomonasTFAP2Ar = 0.47; p < 0.001 ( *** )
ESCAShewanellaTFAP2E-AS1r = 0.38; p < 0.0001 ( **** )
Table 1. Cont.
TCGA Tumor CohortMicrobiota (Genus-Level)AP-2 TFCorrelation Coefficient and Statistical Significance
GBMClostridiumTFAP2Dr = 0.79; p < 0.0001 ( **** )
HaemophilusTFAP2Cr = 0.39; p < 0.0001 ( **** )
HalomonasTFAP2Dr = 0.48; p < 0.0001 ( **** )
KlebsiellaTFAP2Dr = 0.57; p < 0.0001 ( **** )
KICHCorynebacteriumTFAP2Br = 0.97; p < 0.0001 ( **** )
HalomonasTFAP2Er = 0.43; p < 0.001 ( *** )
TFAP2A-AS2r = 0.50; p < 0.0001 ( **** )
VibrioTFAP2Er = 0.32; p < 0.01 ( ** )
TFAP2A-AS1r = 0.72; p < 0.0001 ( **** )
TFAP2A-AS2r = 0.47; p < 0.0001 ( **** )
LAMLAquitaleaTFAP2E-AS1r = 0.36; p < 0.0001 ( **** )
PorphyrobacterTFAP2Dr = 0.30; p < 0.001 ( *** )
ThiopseudomonasTFAP2Dr = 0.33; p < 0.0001 ( **** )
LIHCHalomonasTFAP2Dr = 0.42; p < 0.0001 ( **** )
MESOMeiothermusTFAP2Dr = 0.35; p < 0.01 ( ** )
TFAP2E-AS1r = 0.31; p < 0.01 ( ** )
PCPGEscherichiaTFAP2Dr = 0.34; p < 0.0001 ( **** )
GordoniaTFAP2A-AS1r = 0.35; p < 0.0001 ( **** )
MitsuariaTFAP2E-AS1r = 0.33; p < 0.0001 ( **** )
READAquirufaTFAP2Er = 0.35; p < 0.0001 ( **** )
MoraxellaTFAP2A-AS1r = 0.32; p < 0.0001 ( **** )
LimnohabitansTFAP2E-AS1r = 0.34; p < 0.0001 ( **** )
PolynucleobacterTFAP2E-AS1r = 0.30; p < 0.0001 ( **** )
SARCHalomonasTFAP2Dr = 0.37; p < 0.0001 ( **** )
StaphylococcusTFAP2Br = 0.38; p < 0.0001 ( **** )
SKCMHalomonasTFAP2E-AS1r = 0.31; p < 0.01 ( ** )
STADCutibacteriumTFAP2Br = 0.51; p < 0.0001 ( **** )
UCSCorynebacteriumTFAP2Br = 0.85; p < 0.0001 ( **** )
PriestiaTFAP2Ar = 0.38; p < 0.01 ( ** )
SolimonasTFAP2E-AS1r = 0.32; p < 0.05 (*)
UVMStaphylococcusTFAP2E-AS1r = 0.50; p < 0.0001 ( **** )

p < 0.05 (*), p < 0.01 (*), p < 0.001 (*), p < 0.0001 (*).

Evaluating the influence of given TFAP2-microbiota pairs on patient prognosis re- vealed that the following bacterial genera or AP-2 family members affect at least one survival endpoint: (1) Paraburkholderia and TFAP2E in ACC; (2) Meiothermus and TFAP2B in CESC; (3) Sphingomonas and TFAP2A-AS1 in CHOL; (4) Brucella and TFAP2E in COAD; (5) Actinomyces and TFAP2E or TFAP2E-AS1 in DLBC; (6) Halomonas and TFAP2E-AS1 in DLBC; (7) Clostridium and TFAP2D in GBM; (8) Haemophilus and TFAP2C in GBM; (9) Vibrio and TFAP2A-AS1 or TFAP2A-AS2 in KICH; (10) Halomonas and TFAP2A-AS2 in KICH; (11) Halomonas and TFAP2D in LIHC; (12) Mitsuaria and TFAP2E-AS1 in PCPG; (13) Limnohabitans and TFAP2E-AS1 in READ; (14) Polynucleaobacter and TFAP2E-AS1 in READ; (15) Halomonas and TFAP2D in SARC; (16) Staphylococcus and TFAP2B in SARC; (17) Cutibacterium and TFAP2B in STAD; (18) Solimonas and TFAP2E-AS1 in UCS; as well as (19) Staphylococcus and TFAP2E-AS1 in UVM. Corresponding genus-level survival analysis is summarized in Supplementary File S1.

2.3. Species-Level Correlation Analysis and Assessment of Prognostic Significance Have Enabled the Selection of Promising Relationships in Four Cohorts

The aforementioned relationships were investigated at the species level of a given mi- crobiota, confirming that the majority of these relationships remained significant (Table 2). Occasionally, a single TFAP2-microbiota pair on the genus level yielded two or three pairs for investigation on the species level because specific bacteria were sufficiently abundant in a given cohort. For instance, the correlation between TFAP2E-AS1 and Staphylococcus in UVM yielded two relationships at the species level (incorporating S. aureus and S. epider- midis), whereas for TFAP2B and Cutibacterium in STAD, it yielded C. acnes, C. granulosum, and C. modestum. Some pairs were not found at the species level (Meiothermus and TFAP2B in CESC, as well as Sphingomonas and TFAP2A-AS1 in CHOL), making further analysis impossible. A few relationships, although present, did not meet an established threshold for the correlation coefficient and were thus excluded from subsequent study stages. Ex- cluded relationships were: (1) Brucella anthropi and TFAP2E in COAD; (2) Halomonas sp. JS92-SW72 and TFAP2E-AS1 in DLBC; (3) Haemophilus parainfluenzae and TFAP2C in GBM; (4) Halomonas sp. JS92-SW72 and TFAP2D in LIHC; (5) Polynucleobacter sp. Adler-ghost and TFAP2E-AS1 in READ; as well as (6) Halomonas sp. JS92-SW72 and TFAP2D in SARC.

Table 2. Correlation between the abundance of bacterial species and the expression of AP-2 family members in a pan-cancer view.
TCGA Tumor CohortMicrobiota (Species-Level)AP-2 TFCorrelation Coefficient and Statistical Significance
ACCParaburkholderia fungorumTFAP2Er = 0.50; p < 0.0001 ( **** )
COADBrucella anthropiTFAP2Er = 0.20; p < 0.0001 ( **** )
DLBCActinomyces orisTFAP2Er = 0.47; p < 0.001 ( *** )
TFAP2E-AS1r = 0.46; p < 0.01 ( ** )
Halomonas sp. JS92-SW72TFAP2E-AS1r = 0.29; p < 0.05 (*)
GBMClostridium botulinumTFAP2Dr = 0.32; p < 0.0001 ( **** )
Haemophilus parainfluenzaeTFAP2Cr = 0.18; p < 0.05 (*)
KICHHalomonas sp. JS92-SW72TFAP2A-AS2r = 0.35; p < 0.01 ( ** )
Vibrio anguillarumTFAP2A-AS1r = 0.48; p < 0.0001 ( **** )
TFAP2A-AS2r = 0.39; p < 0.01 ( ** )
LIHCHalomonas sp. JS92-SW72TFAP2Dr = 0.18; p < 0.001 ( *** )
PCPGMitsuaria sp. 7TFAP2E-AS1r = 0.38; p < 0.0001 ( **** )
READLimnohabitans sp. 103DPR2TFAP2E-AS1r = 0.37; p < 0.0001 ( **** )
Limnohabitans sp. 63ED37-2r = 0.35; p < 0.0001 ( **** )
Polynucleobacter sp. Adler-ghostr = 0.22; p < 0.01 ( ** )
SARCHalomonas sp. JS92-SW72TFAP2Dr = 0.25; p < 0.0001 ( **** )
Staphylococcus aureusTFAP2Br = 0.33; p < 0.0001 ( **** )
STADCutibacterium acnesTFAP2Br = 0.44; p < 0.0001 ( **** )
Cutibacterium granulosumr = 0.37; p < 0.0001 ( **** )
Cutibacterium modestumr = 0.63; p < 0.0001 ( **** )
UCSSolimonas sp. K1W22B-7TFAP2E-AS1r = 0.34; p < 0.05 (*)
UVMStaphylococcus aureusTFAP2E-AS1r = 0.38; p < 0.001 ( *** )
Staphylococcus epidermidisr = 0.72; p < 0.0001 ( **** )

p< 0.05 (*), p < 0.01 ( ** ), p < 0.001 ( *** ), p < 0.0001 ( **** ).

Subsequently, the impact of promising bacterial species on patient prognosis was evalu- ated (Supplementary File S2). Since all TFAP2-microbiota pairs showed a positive correlation (Tables 1 and 2), it was decided to select those exerting a significant influence on at least one

survival endpoint, with both bacterial abundance and AP-2 expression having the same effect on survival. Only the following five relationships met the requirements: (1) Paraburkholderia fungorum and TFAP2E in ACC; (2) Actinomyces oris and TFAP2E in DLBC; (3) Actinomyces oris and TFAP2E-AS1 in DLBC; (4) Halomonas sp. JS92-SW72 and TFAP2A-AS2 in KICH; as well as (5) Cutibacterium granulosum and TFAP2B in STAD. Since two relationships involved A. oris in KICH (paired with either TFAP2E or TFAP2E-AS1), it was decided to proceed with TFAP2E, given its superior statistical significance and correlation coefficient. Corresponding species-level survival analysis is summarized in Supplementary File S2.

2.4. Establishment of Representative Patient Groups Reaffirmed Survival Outcomes and Revealed Distinct Clinical Features

Patients from the four selected cohorts underwent intersection analysis to identify representative groups (with higher or lower microbiota abundance and AP-2 expression), which were subsequently evaluated for their prognostic outcomes (Figure 3). Although no significant results were found in KICH, the remaining three cohorts indicated that individuals with simultaneously higher microbiota abundance and AP-2 expression have unfavorable survival relative to those with concurrently lower abundance/expression, based on at least three significant endpoints.

Figure 3. Establishment of representative patient groups in four cohorts with promising TFAP2- microbiota relationships, alongside the assessment of their prognostic significance. The upper part

ACC

52

Investigation

DLBC

25

Investigation

KICH

30

Investigation

of P. fungorum and TEAPE

al A. oris and TEAPZE

of H.sp. J592-SW72 and TFAPZA-AS2

STAD

172

Investigation

of C. granulosum

Intarsection Si20

Intersection Size

Intersection 8i20

Intersection Size

and TFAP2B

19

11

13

85

13

9

7

57

51

,

Sic Sian

Sat Sice

Hazard ratio

0.12 (3.00. 0.000 095

4

Ratonunca

0.18:(0:20, 1.06) 0.06

· 13

.

329 13 37, 29:54 03

Overall survival (OS)

1.00-

1.004

1.00

Survival probability

Survival probability

Survival probability

Survival probability

.50

sa

150

2

Log-rank p =0.018

Q.25

Log-tank

ASS

Log-rank

p=0.26

0.95

Log-rank p=0.001

p=0.033

0.00

D.DO

0.00

LantY AP-3 dni

2000

1000

2000

3000 Tits&

4000

1000

9000

0

1000

4500

5000

1000

3000

Strom

Number at risk

Number at risk

Number at risk

Girata

Number at risk

15

a

2

C

2

:

9

1

10

10

3

50

0

32

8

Schoenfeld Individual Test p: 0.342

0

Schoenfeld Individual Test p: 0.1503

Schoenfeld Individual Test p: 0.0993

Schoenfeld Individual Test p: 0.8453

Beta[]]

Bata[D)

Betalt)

Betalt)

300

400

Time

1200

$800 2100

350

570

$300 2200 210

460

570

Disease-specific survival (DSS)

Time

150 800

880

110

390

THỦ0 3500

Hazard satio

PIHR

13

Masardi sarka

Harand ratko

PİHA

0.06 (0.01. 0:57) 0.02

·12

.

50

1.00

1.74 |0.16, 19.21)

7

1.00-

Survival probability

Fa.75

Survival probability

0.75

Survival probability

0.75

Survival probability

0.76

50

60

Log-rank

Log-rank

p= 0.018

1.25

Log-rank

Log-nank

higher AO-de supression ant higher micsobida abundance

p = 0.002

0.25

p= 0.85

0.25

+ higher ko-2 mpmission and higher microbios abundance

p = 0.02

higher ko- 2 epression antihigher micsoblou abundance

a.Da

0.00

0.00

1000

8000

4000

5000

1000

Number at risk

5000

9000

000

2000

3000

4000

5000

Number at risk

1000

3000

4000

Number at risk

Number at risk

12

2

:

6

25

11

2

6

U

O

19

10

TO

1

a

Strata

ER

0

7

0

BL

Schoenfeld Individual Test p: 0.342

Schoenfeld test unavailable due to data scarcity

Schoenfeld test unavailable due to data scarcity

Schoenfeld Individual Test p: 0.403

Bata(!)

… ..

-

1600

2100

14D

370

500

1500

1900

Disease-free interval (DFI)

Hazard ratio

P(HP)

Hazard ratio cannot be calculated because one of the groups has no events

Hazard ratio cannot be calculated because one of the groups has no events

Hscsd matia

sin sin ie is i

0.09 (0.01, 0 #7) 0.04

: 0.14 10.06. 9.34) =0.001

1./00

1.00-

Survival probabilny

0.75

urvival probability

4.75

Survival probabilny

Survival probabili

0.75

.50

1.50

54

50

S

Log-rank

12

Log-sank

Log-tank

Log-rank

p = 0.000

p = 0.77

1.25

p =0.22

0.29

p < 0.0001

D

0.00

1000

8000 Time 200

4000

5000

1000

4000

Number at risk

3000

Number at risk

Number at risk

2000

Tima

3000

1000

Number at risk

2000 Tìma

2000

5

7

2

9

Strat

15

9

2

5

2

A

9

a

00

Stala

2

?

,

g

0

Schoenfeld Individual Test p: 0.4887

Betalt)

Schoenfeld test unavailable due to data scarcity

Schoenfeld test unavailable due to data scarcity

5

Schoenfeld Individual Test p: 0.1011

Beta()

Progression-free interval (PFI)

190

210

440

710 98

130

220

250

Time

720

1000

Hazard ratio

Harari sofia

1.00-

1

1.00

Pokerunca

1.00

· 13

1.004

Survival probability

Survival probability

0.7%

Survival probability

Survival probability

0.75

50

66

50

025

25

21

Log-rank p=0.66

8

Log-tank

Log-rank

0.00

p = 0.00099

D.OD

p= 0.00031

6

1500

Time

4000

à

1000

3000

1000

1000

3900

4000

1000

Number at risk

Time

0

2000

3000

4000

Strata

Tima

13

1

2

0

Numbor at risk

Numbor at rick

e

11

8

2

8

;

5

Number at rie

19

10

0.

Strato

28

8

1

0

Schoenfeld Individual Test p: 0.8054

25

Schoenfeld Individual Test p: 0.7382

Schoenfeld Individual Test p: 0.1018

Schoenfeld Individual Test p: 0.1932

Bata(!)

Betalt)

Beta(!)

Betalt)


43

TT

100

150

0

300

470

850

330

1830

3000 8400

3800

-

150

170

1400 2009

Time

2200

97

190

270

370 440

Time

550

1000

1800

depicts the results of intersection analysis, while the remaining part concerns survival analysis, with specific endpoints (OS, DSS, DFI, PFI) in rows and tumor cohorts in columns.

ACC, DLBC, and STAD patients were further evaluated for their clinical picture, re- vealing that, among the included characteristics, there were more differences in qualitative than quantitative features (Figure 4). In the first cohort, individuals with simultaneously higher Paraburkholderia fungorum abundance and TFAP2E expression were more frequently characterized by higher tumor stage, C1A molecular subtype, metastasis, COC3 molecular subtype, and steroid phenotype with proliferation pattern. In DLBC, the group with higher Actinomyces oris abundance and TFAP2E expression consisted solely of females and tended to have less germinal-center B-cell-like (GCB) expression subtype, although the amount of unavailable data or unclassified expression subtype complicates the inference. Regarding STAD, patients with higher Cutibacterium granulosum abundance and TFAP2B expression showed more homozygous deletions and were less frequently characterized by MSI molec- ular subtype or more commonly had microsatellite stability; however, this is inconclusive due to the unavailability of data for some individuals. The remaining clinical features with less evident differences between patient groups are depicted in Figure S1.

Figure 4. Investigation of clinical differences in representative groups of ACC, DLBC, and STAD patients. Qualitative data were visualized as filled stackbars, while quantitative information was

TFAP2E & P. fungorum 4

TFAP2E \ A. oris !

TFAP2B + C. granulosum !

TFAP2E 1 P. fungorum t

ACC

TFAP2E 1 A. oris 1

DLBC

TFAP2B + C. granulosum t

STAD

Percentage

100

Percentage

100

ACC

80

80

DLBC

tumor stage

60

60

Stage II

40

40

Stage I

gender

Stage IV

20

20

female

Stage III

male

0

0

Percentage

100

Percentage

100

ACC

80

80

DLBC

gene

60

60

expression

molecular subtype

subgroup

40

40

NA

NA

C1B

20

20

Unclass

GCB

C1A

ABC

0

0

Percentage

100

Percentage

100

STAD

ACC

80

80

molecular

tumor classification

subtype

60

60

HM_SNV

40

40

NA

Prior primary

GS

metastasis

20

EBV

primary

20

MSI

CIN

0

0

ACC COC

Percentage

100

Percentage

100

80

80

STAD

subtyping

60

60

MSI

NA

status

COC3

40

40

COC1

NA

COC2

20

20

MSI_L

MSI_H

ACC

MSS

0

0

K4 cluster

Percentage

100

40

** (p=0.00605)

STAD

80

8

steroid_phenotype_low

steroid_phenotype_high+prolif. steroid_phenotype_high

Amount of deletions

60

20

Homozygous deletions

40

2

20

NA

o

0

depicted as a beanplot (p < 0.01 ( ** )), with small lines representing individual data points and a larger horizontal line indicating the median per group. Data acquired from the Genomic Data Commons (GDC) portal.

2.5. An Attempt to Identify a Consensus Expression Profile Dependent on AP-2 and Microbiota Uncovered Genes Regulating Various Biological Processes and Pathways

In addition to clinical picture comparison, it was decided to establish a consensus expression profile across cohorts, related to both microbiota and AP-2, but independent of specific species or family members. Simultaneous evaluation of three cohorts using WGCNA revealed one statistically significant gene module associated with microbiota and AP-2 status (Figure 5A). However, the correlation was less satisfactory than that observed in the joint analysis of ACC and DLBC without STAD (Figure 5B), suggesting that the expression profile changes in STAD are generally distinct from those of the other two tumors, and that some essential genes might not constitute a module. Thus, it was decided to employ DEA between patient groups in a given tumor (Supplementary File S3), followed by the intersection analyses between: (1) WGCNA modules from both approaches, or (2) WGCNA modules and DEA-derived differentially expressed genes. Overlapping significant modules identified 116 genes found in the “tan” module from the three-cohort WGCNA and the “midnightblue” module from the two-cohort approach (Figure 5C). In parallel, intersecting WGCNA and DEA data for ACC and DLBC revealed 223 upregulated and 226 downregulated genes that were considered essential for the manifestation of consensus expression profile (Figure 5D). Comparing the lists of 116 and 223 genes, only 18 were found in the “tan” module from the three-cohort WGCNA (Figure 5E), suggesting that expression changes in STAD are devoid of essential genes to the extent that they only partly resemble the consensus profile found in ACC and DLBC.

Figure 5. Identification of consensus expression profile dependent on AP-2 and microbiota, with intersection of WGCNA and DEA results. (A) Three-cohort WGCNA. (B) Two-cohort WGCNA.

A

ACC & DLBC & STAD

Consensus module-trait relationships

B

ACC & DLBC

0.24*

-0.24*

tan

-0.59 **

0.59 **

greenyellow

-0.06

0.06

darkgreen

0.47*

-0.47*

midnightblue

-0.064

0.064

red

0.049

-0.049

lightcyan

grey

grey

Correlation

1

red

grey60

0.5

blue

0

cyan

-0.5

Module

green

-1

magenta

Module

midnightblue

pink

AP-2 1 Microbiota 1

AP-2 4

AP-2 1 Microbiota 1

AP-2 4 Microbiota 4

Microbiota 4

51

1952

223

0

226

1769

116

12

1662

0

0

95

0

98

18

205

0

0

0

0

0

0

0

C

2178

D

E

0

Module “midnightblue” from two-cohort WGCNA Module “greenyellow” from two-cohort WGCNA Module “tan” from three-cohort WGCNA

Module “midnightblue” from two-cohort WGCNA Module “greenyellow” from two-cohort WGCNA Genes upregulated in “AP-2 1 Microbiota t” group identified via limma found in ACC and DLBC cohorts Genes downregulated in “AP-2 t Microbiota t” group identified via limma found in ACC and DLBC cohorts

116 genes overlapping between “midnightblue” module of two-cohort WGCNA and “tan” module of three-cohort WGCNA

223 genes overlapping between “midnightblue” module of two-cohort WGCNA and genes upregulated in “AP-2 1 Microbiota t” group identified via limma found in ACC and DLBC cohorts

(C) Intersection analysis of significant gene modules from both WGCNA approaches. (D) Overlap of WGCNA and DEA data for ACC and DLBC cohorts. (E) Identification of a common part between the 116-gene and 223-gene sets found in two previous subpanels. p < 0.05 (*), p < 0.01 ( ** ).

To assess biological outcomes of such disparity, the above modules and overlapping genes were functionally annotated (Figures 6 and S2). Based on the “WikiPathways cancer” resource, genes residing in the “greenyellow” module are responsible for, e.g., angiogenesis, Wnt signaling, cell adhesion, and interleukin-1 signaling: phenomena that were not found in other gene sets. In contrast, the “midnightblue” module had mutual annotations with the “tan” module and a list of 116 overlapping genes, although with subtle differences. For instance, all three gene sets were linked to ATM signaling, as well as glycolysis and gluconeogenesis. However, the “midnightblue” module is more closely related to DNA mismatch repair and Notch signaling, rather than non-homologous end-joining and Ras signaling, as indicated in the ontology of the other two gene sets. This suggests that the expression profile alterations across three cohorts might entail a few similar biological outcomes, but differences between STAD patient groups do not involve phenomena unique to the “midnightblue” module specific to ACC and DLBC, e.g., metabolic reprogramming or TP53 regulatory network. Extended functional annotation of the same gene sets based on other ontological resources is summarized in Figure S2.

Figure 6. Functional annotation of WGCNA results and genes overlapping between modules according to the "WikiPathways cancer" resource. (A) Module "greenyelow" from the two-cohort WGCNA. (B) Module

A

Regulation of Wnt B catenin signaling by small molecule compounds

Angiogenesis

Type II interferon signaling

PDGFR beta pathway

PI3K AKT mTOR signaling pathway and therapeutic opportunities in prostate cancer

Transcriptional activation by NRF2 in response to phytochemicals

EPO receptor signaling

NRF2 ARE regulation

Imatinib and chronic myeloid leukemia

Integrin mediated cell adhesion

IL 1 signaling pathway

Focal adhesion PI3K Akt mTOR signaling pathway

TGF beta signaling pathway

Non small cell lung cancer

2178 genes

Chemokine signaling pathway

0.0

Enrichment ratio

0.2

0.4

0.6

0.8

1.0

1.2

1.4

1.6

1.8

2.0

2.2

B

DNA mismatch repair

Metabolic reprogramming in colon cancer

Retinoblastoma gene in cancer

DNA damage response

G1 to S cell cycle control

ATM signaling in development and disease

4 hydroxytamoxifen dexamethasone and retinoic acids regulation of p27 expression

TP53 network

ATM signaling pathway

Cell cycle

DNA IR double strand breaks and cellular response via ATM

Glycolysis and gluconeogenesis

Fluoropyrimidine activity

Notch signaling

1885 genes

Non small cell lung cancer

Enrichment ratio

0.0

0.5

1.0

1.5

2.0

2.5

3.0

3.5

4.0

4.5

5.0

C

Non homologous end joining

ATM signaling in development and disease

Glycolysis and gluconeogenesis

DNA IR double strand breaks and cellular response via ATM

Clear cell renal cell carcinoma pathways

128 genes

Ras signaling

Enrichment ratio

0

5

10

15

20

25

30

35

10

15

50

D

Non homologous end joining

ATM signaling in development and disease

Glycolysis and gluconeogenesis

DNA IR double strand breaks and cellular response via ATM

Clear cell renal cell carcinoma pathways

116 genes

Ras signaling

Enrichment ratio

0

5

10

15

20

25

30

35

40

45

50

“midnightblue” from the two-cohort WGCNA. (C) Module “tan” from the three-cohort WGCNA. (D) Genes overlapping between “midnightblue” and “tan” modules. The rectangles on the left are colored according to the WGCNA palette. The last subfigure contains a two-color rectangle because it represents the overlap of the “midnightblue” and “tan” modules.

2.6. AP-2 Engagement of Microbiota-Responsive Chromatin in ACC and DLBC Contrasts with a Null Pattern in STAD

Ultimately, it was evaluated whether microbe-responsive genes show AP-2 motif and ChIP-overlap enrichment in nearby accessible chromatin, implying AP-2-mediated regulation of the microbiota-host transcriptional program. Across radii, ACC and DLBC displayed consistent enrichment of AP-2 motif sites that intersect the ChIP union in FG se- quences relative to GC-matched BG, with the strongest signal at ±750 kb (ACC OR = 1.281, p = 2.4 × 10-5; DLBC OR = 1.258, p = 7.1 × 10-6; empirical one-sided p ~ 1 × 10-4). Enrichment remained positive at ±1000 kb but with smaller effect sizes (ACC OR = 1.219, p = 1.2 × 10-4; DLBC OR=1.106, p=1.6 x 10-2). In STAD, the AP-2 family signal was uniformly absent (OR < 1 and p > 0.05 at all radii), indicating cohort-specific regulatory divergence (Table 3). These patterns were further verified (Table 4): overlaps between motif-assigned nearest genes and AP-2 regulon targets stabilized around ~12-14% (ACC, AP-2€ regulon) and ~10-11% (DLBC, AP-2€ regulon), whereas STAD (AP-2ß regulon) remained low (~2-3%).

Table 3. AP-2 family motif-ChIP enrichment across radii.
CohortRadius (kb)Odds RatioFisher p-ValueEmpirical p-Value
ACC2501.1388.915 ×10-28.139 × 10-2
5001.1562.079 × 10-22.37 × 10-2
7501.2812.403 × 10-51.00 × 10-4
10001.2191.227 × 10-42.00 × 10-4
DLBC2501.2413.496 × 10-32.90 × 10-3
5001.1674.839 × 10-34.40 × 10-3
7501.2587.099 × 10-61.00 × 10-4
10001.1061.643 × 10-21.58 ×10-2
STAD2500.8358.373 × 10-12.167 × 10-1
5000.8319.068 ×10-11.195 × 10-1
7500.8279.524 × 10-16.129 × 10-2
10000.8559.378 × 10-17.909 ×10-2

Odds ratios (FG vs. GC-matched BG) and two-sided Fisher p-values are shown alongside directional empirical one-sided p-values from 10,000 hypergeometric draws.

Table 4. Overlap between nearest-gene assignments from motif windows and AP-2 regulon targets.
CohortTF250 kb500 kb750 kb1000 kb
ACCAP-2€84/622 (13.5%)135/1058 (12.8%)184/1444 (12.7%)223/1822 (12.2%)
DLBCAP-2€60/544 (11.0%)100/939 (10.6%)137/1255 (10.9%)163/1538 (10.6%)
STADAP-263/134 (2.24%)8/239 (3.35%)12/374 (3.21%)17/469 (3.62%)

3. Discussion

The present study establishes the first-ever characterization of the relationship between intratumoral microbiota abundance and AP-2 transcription factor expression across human neoplasms. Through systematic pan-cancer analysis, significant correlations were identified in 18 of 33 tumor types, with subsequent species-level investigation revealing prognostically relevant relationships in ACC, DLBC, and STAD. These findings pave the way for a novel investigative path in cancer biology, suggesting that microbial communities may influence tumor development through previously unexplored transcriptional regulatory mechanisms related to the AP-2 family.

The identification of TFAP2E-AS1 as the most frequently correlated AP-2 family mem- ber represents an encouraging discovery, particularly given its documented role in mi- croRNA regulation [37]. While direct evidence linking TFAP2E-AS1 to the microbiota remains absent, the established involvement of microRNAs in host-microbe interactions suggests potential indirect regulatory mechanisms that require further investigation [38,39]. The prominence of TFAP2E among frequently correlated family members aligns with its documented pleiotropic functions in gastrointestinal diseases, in which the gut microbiota plays an established pathogenic role [24]. This convergence suggests that TFAP2E may serve as a transcriptional mediator of microbiota-host interactions in gastrointestinal ma- lignancies. Though mechanistic insights are still lacking, the location of TFAP2E-AS1 and TFAP2E on the same cytogenetic band (1p34.3) suggests a feedback loop between them and a potential dependency of their expression, which may explain the prominence of both genes among frequently correlated family members. The absence of prior literature on the microbiota-TFAP2D relationship, despite correlations observed across six cohorts, under- scores the novelty of our findings. Given the limited functional characterization of AP-28 relative to other family members, these observations may guide future research on this understudied transcription factor. Previous studies have noted the involvement of TFAP2D in prostate and lung cancer [35,40], suggesting broader relevance across malignancies.

While the predominance of TFAP2C expression in ACC has been previously docu- mented [41], our work represents the first evidence suggesting that TFAP2E may also play a role in this tumor type. The association of TFAP2E and P. fungorum levels with aggres- sive clinical features (including C1A molecular subtype, COC3 classification, and steroid phenotype) aligns with molecular stratification systems, in which these ACC subtypes are associated with worse outcomes [42]. The DLBC findings are particularly compelling, given the extensive evidence for the prognostic significance of microbiota in this malig- nancy [43-48]. While the presence of Actinomyces in DLBC has been documented in the literature [49], its correlation with TFAP2E represents a novel observation. The potential association with reduced frequency of GCB subtype carries clinical relevance, as microbiota composition influences treatment response and survival in DLBC patients [44,45]. The increased number of females observed in a group with higher Actinomyces oris abundance and TFAP2E expression may reflect gender-specific immune-microbiome interactions, con- sistent with emerging evidence of gender-dependent effects of the microbiota on lymphoma progression [46]. For STAD, our findings conform with substantial literature demonstrating the critical role of microbiota in gastric cancer pathogenesis [50-53]. Although specific data on Cutibacterium granulosum are lacking, the putative relationship observed in our study between microsatellite stability and homozygous deletions suggests involvement in genomic instability mechanisms. Previous studies have indicated that microbiota composi- tion influences genomic stability in gastric cancer [52], and AP-2 family members regulate DNA damage response pathways [54], supporting potential mechanistic connections.

The divergent expression profiles between STAD and ACC/DLBC cohorts provide insights into heterogeneous interactions between AP-2 and microbiota across tumor types.

The unique STAD profile, involving TFAP2B rather than TFAP2E, suggests that different AP-2 family members mediate distinct transcriptional programs in response to microbial stimuli. Remarkably, the consensus between ACC and DLBC persists despite fundamental differences in bacterial species (Paraburkholderia vs. Actinomyces) and tumor origins (endocrine vs. hematologic). This convergence, in which TFAP2E was consistently involved in both cohorts and maintained a mutual WGCNA pattern across the specific bacterium, suggests that the identity of the responding AP-2 factor may be a more critical determinant of transcriptional outcomes than either tumor type or bacterial species. These observations, while derived from our most significant findings in only three cohorts with highly refined patient groups, warrant future investigation into whether the primary determinant of microbiota-induced transcriptional changes is not the tumor context or bacterial identity per se, but rather the AP-2 family member engaged in response to microbial stimuli. The FG-BG enrichment and gene-set intersections suggest that AP-2-dependent regulatory programs are selectively engaged in ACC and DLBC within the genomic neighborhoods of microbiota- linked genes, while such coupling is not evident in STAD. The stable OR pattern across radii in ACC and DLBC implies that relevant AP-2-associated elements are distributed across hundreds of kilobases rather than confined to immediate promoters, consistent with enhancer-rich landscapes and distal enhancer-promoter communication. The lack of enrichment in STAD suggests that AP-2 occupancy is either less central to gastric tumor regulatory wiring or operates in contexts not captured by the current microbe-associated gene sets.

Apart from the above most essential relationships, others identified in our initial screening could be elaborated using the literature. The correlation between Haemophilus and TFAP2C in GBM gains context from evidence linking Haemophilus to IDH1 status and glioma grade [55]. For Sphingomonas and TFAP2A-AS1 in CHOL, existing literature suggests potential roles of the aforementioned genus in cholangiocarcinoma pathogenesis [56]. The identification of Staphylococcus as a correlate of SARC and UVM parallels findings in lung cancer, where Staphylococcus levels are notably elevated in tumor tissues compared to healthy controls [57,58]. Staphylococcus exhibits bidirectional effects on carcinogenesis, with pathogenic strains potentially modulating immune responses through virulence factors [59]. Importantly, AP-2 transcription factors in lung cancer show significant correlations with immune cell infiltration, including CD8+ T cells, CD4+ T cells, and other tumor-infiltrating immune populations [60], suggesting potential mechanistic links between transcriptional regulation and immune modulation. This is consistent with Staphylococcus influencing CD8+ T cell infiltration and the immune microenvironment in breast cancer [61], indicating that Staphylococcus may exert universal immunomodulatory functions across multiple malignancies, potentially through interactions with transcriptional programs.

Transcriptomic insights, followed by functional annotation, revealed that microbiota- AP-2 relationships influence key signaling pathways and fundamental cellular processes, such as the DNA damage response and metabolic reprogramming. AP-2« directly regulates transcription of critical DNA repair genes such as TOP2A, NUDT1, POLD1, and PARP1 in hepatocellular carcinoma, thereby facilitating repair of oxidized DNA lesions [62]. AP-2y inhibits GADD45B expression, with overexpressed TFAP2C suppressing GADD45B and PMAIP1 at both mRNA and protein levels in non-small cell lung cancer cells [63]. The identification of ATM signaling and DNA repair pathways aligns with emerging evidence that intratumoral bacteria alter genomic stability by inducing DNA damage and muta- tions [48,64,65]. Regarding metabolic alterations, the microbiota influences tumor metabolic reprogramming, with changes in glycolysis and gluconeogenesis consistent with its estab- lished role in regulating metabolic pathways, including fatty acid synthesis and glutamate metabolism [51]. Moreover, AP-2 family members regulate metabolic processes through

their downstream targets, with TFAP2A shown to increase EZH2 expression by perturbing the activity of the nucleosome remodeling complex, contributing to the metabolic rewiring characteristic of cancer cells [23]. The divergence in pathway enrichment between cohorts (with STAD showing distinct patterns compared with ACC and DLBC) may reflect differ- ences in the AP-2 family members involved and thus distinct transcriptional programs. For instance, phenomena unique to ACC and DLBC (incorporating TFAP2E-related rela- tionships) included Notch signaling and the TP53 network, whereas alternative pathways concerned STAD (in which a TFAP2B-related relationship was noted). This specificity has implications for therapeutic targeting, as different AP-2 members may require distinct intervention strategies.

The consistent observation that elevated bacterial abundance and AP-2 expression correlate with unfavorable survival across multiple endpoints (OS, DSS, DFI, PFI) implies clinical utility. These markers could enhance existing prognostic systems, particularly in ACC, where molecular classification already guides therapy [66], and in DLBC, where microbiota profiling shows promise for patient stratification [43,44]. The species-specific nature of identified correlations opens possibilities for targeted interventions. While direct manipulation of intratumoral microbiota remains challenging, the documented effects of microbiota on treatment response in various cancers suggest potential therapeutic avenues [45,47]. The prominence of TFAP2E across multiple tumor types positions it as a putative target, particularly given its presence in the most promising relationships identified throughout this study, as well as its role in gastrointestinal malignancies, where microbiome-based interventions are advancing rapidly [24,50].

Some study limitations and key takeaways are worth mentioning. Despite being the first investigation to incorporate both microbiota and AP-2 data, the correlative nature of the analysis cannot establish causality, necessitating experimental validation through so- phisticated approaches such as co-culture systems, animal models, and relevant sequencing methods (e.g., Dual-Seq). The genus-level correlations inevitably aggregate heterogeneous species and therefore served only as an initial, broad screening step. It also explains why many PCC values are modest yet significant, which is why only relationships that remained consistent at the species level and in downstream analyses were further interpreted. The high number of significant correlations in some tumors during genus-level analysis (espe- cially in DLBC) may be due to a relatively small cohort size, and these findings require additional verification in larger cohorts in the future. An absence of significant survival effects in KICH, despite initially promising correlations, highlights the importance of mul- tilayered analysis in microbiome studies. Technically, one should consider the potential variability between sequencing methodologies and taxonomic classification systems across databases. In addition, strain-level variations within species, which can dramatically in- fluence host interactions, are challenging to assess in silico. Finally, TFAP2B/E ChIP-seq data are not available in public repositories at the time of analysis; thus, results rely on TFAP2A/C and their union as a proxy for family binding. This limitation affects all cohorts equally and is unlikely to create a divergence between ACC/DLBC and STAD.

All things considered, this work establishes a foundation for investigating microbiota- AP-2 interactions in cancer biology. Priority areas for future research include: (1) mech- anistic validation of identified relationships, particularly the TFAP2E-microbiota axis; (2) characterization of AP-2 antisense transcripts, which are mentioned prominently but lack precise functional studies; (3) investigation of indirect mechanisms, such as microRNA- mediated regulation; (4) expansion to additional tumors or cancer subtypes, particularly those with established microbiome involvement; and (5) development of microbiome- based biomarker signatures incorporating AP-2 members for clinical implementation. The potential for AP-2 family members to serve as therapeutic targets mediating microbiota

effects warrants investigation. Given the established role of microbiota in modulating im- munotherapy response, understanding how AP-2 transcription factors integrate microbial signals could lead to the development of novel therapeutic strategies. The influence of the microbiome on transcriptional programs and host gene expression opens new avenues for precision oncology, enabling integrated profiling of microbial communities and host transcriptional states to guide personalized interventions and improve the selection and effectiveness of combination therapies.

4. Materials and Methods

4.1. Assessment of Bacterial Composition and Diversity in a Pan-Cancer View

Bacterial composition data across all tumor types in The Cancer Genome Atlas (TCGA) [67] were obtained from the Bacteria in Cancer (BIC) database [68], and the relative abundance of the top 15 genera was visualized in composition barplots. BIC also served as a source of genus-level diversity data, presented as boxplots for all available indices: evenness, richness, Shannon index, Simpson index of diversity, and Simpson reciprocal index. In addition, bacterial composition was assessed across a pan-cancer spectrum for the top 50 genera using The Cancer Microbiota (TCMbio) database [3], which was also employed to compare alpha-diversity (Shannon, Simpson, Observed, Chao1) across tumors and non-tumor specimens.

4.2. Acquisition and Processing of Gene Expression and Microbiota Abundance Data

Data for all tumor cohorts available in TCGA (Table 5) were obtained from the Ge- nomic Data Commons repository [69] using the GDCquery() function of the TCGAbiolinks v2.34.1 R-package. Transcriptomic data were downloaded as raw counts generated by the Spliced Transcripts Alignment to a Reference (STAR) protocol, with the 38th major build of the human genome defined by the Genome Reference Consortium (hg38) selected as the reference. Corresponding clinical profiles were merged from both GDC (as part of GDCquery) and TCGA Clinical Data Resource (TCGA-CDR) [70], with the former used to derive general clinical features while the latter utilized as a source of prognostic end- points, i.e., overall survival (OS), disease-specific survival (DSS), disease-free interval (DFI), and progression-free interval (PFI). Regarding microbiota, genus-level and species-level abundance data of intratumoral bacteria were downloaded for all TCGA tumors from TCMbio, which utilizes the Kraken2 taxonomic classification system in its pipeline [3]. Data were filtered to investigate genera/species present in at least 20% of a given cohort and in patients having at least two non-zero counts. Among individuals with available gene expression and bacterial abundance data, counts were subjected to Trimmed Mean of M-values (TMM) normalization using the edgeR v4.4.2 R-package. Patients with values acquired through the TMM method were investigated in consecutive stages of the study.

Table 5. Investigated tumors included in the study, with an indication of each cohort size.
TCGA Cohort AbbreviationFull Disease Name/DescriptionNumber of Samples Included in This Study, Considering Microbiota Data Availability
ACCAdrenocortical carcinoma76
BLCABladder urothelial carcinoma395
BRCABreast invasive carcinoma1090
CESCCervical and endocervical cancers293
CHOLCholangiocarcinoma33
COADColon adenocarcinoma456
Table 5. Cont.
TCGA Cohort AbbreviationFull Disease Name/DescriptionNumber of Samples Included in This Study, Considering Microbiota Data Availability
DLBCDiffuse large B-cell lymphoma47
ESCAEsophageal carcinoma162
GBMGlioblastoma multiforme154
HNSCHead and neck squamous cell carcinoma500
KICHKidney chromophobe65
KIRCKidney renal clear cell carcinoma532
KIRPKidney renal papillary cell carcinoma288
LAMLAcute myeloid leukemia146
LGGLower grade glioma510
LIHCLiver hepatocellular carcinoma361
LUADLung adenocarcinoma496
LUSCLung squamous cell carcinoma495
MESOMesothelioma85
OVOvarian serous cystadenocarcinoma373
PAADPancreatic adenocarcinoma176
PCPGPheochromocytoma and Paraganglioma179
PRADProstate adenocarcinoma484
READRectum adenocarcinoma166
SARCSarcoma253
SKCMSkin cutaneous melanoma103
STADStomach adenocarcinoma375
TGCTTesticular germ cell tumors135
THCAThyroid carcinoma502
THYMThymoma116
UCECUterine corpus endometrial carcinoma545
UCSUterine carcinosarcoma56
UVMUveal melanoma79

4.3. Correlation Analysis

The Pearson correlation coefficient (PCC) between bacterial abundance and gene expression of AP-2 family members was assessed using the rcorr() function of the Hmisc v5.2-3 R-package. A threshold of p < 0.05 was applied via the which() function of the base v4.4.3 R-package to retain only statistically significant relationships. Since TFAP2- microbiota pairs were later subjected to various analyses, a PCC threshold of at least |0.3| was considered acceptable at this stage. Initial analysis focused on the correlation between genus-level abundance and AP-2 expression; however, the scope of the species- level analysis (see Section 2.3) was outlined following survival analysis (described below) and the interpretation of results for promising genera (see Section 2.2).

4.4. Survival Analysis

Optimal gene expression or bacterial abundance cutpoints to stratify patients into two groups based on OS, DSS, DFI, and PFI were established using the EvaluateCutpoints tool [71], and corresponding Kaplan-Meier curves were generated using the ggsurvplot() function of the survminer v0.5.0 R-package. As mentioned in the previous section, the scope of the survival analysis for bacterial data at both the genus and species levels depended on the results of the correlation analysis, with genus-level information utilized first (see Section 2.2) and species-level data used afterwards (see Section 2.3). During the genus-level approach, it was acceptable to proceed with relationships in which only one component (i.e., microbiota or an AP-2 member) evidently affected survival, with the other component showing at least one result approaching statistical significance. At the species-level stage, it was decided to select relationships that exerted a significant influence on at least one survival endpoint, with both bacterial abundance and AP-2 expression having the same effect on survival.

4.5. Intersection Analysis Followed by Assessment of Prognostic Outcomes

To establish representative groups (with higher or lower microbiota abundance and AP- 2 expression), an intersection analysis was performed using the UpSetR v1.4.0 R-package. Such groups of patients in selected cohorts (see Section 2.4) were subjected to extended survival analysis. In addition to Kaplan-Meier curves generated as in the above section, hazard ratios were assessed using the coxph() function of forestmodel v0.6.2 R-package, whereas the proportional hazards assumption in a Cox model was evaluated using the Schoenfeld residual test performed via the ggcoxdiagnostics() function of survminer v0.5.0 R-package.

4.6. Investigating Clinical Profile and Performing Differential Expression Analysis (DEA) Among Representative Groups of Patients

Representative groups were compared in terms of the clinical profile acquired from GDC, encompassing both qualitative and quantitative features, with the latter detailed in Section 4.11. The distribution of qualitative data was visualized as filled stackbars using the ggplot2 v3.5.1 R-package, while quantitative information was depicted as beanplots generated using the beanplot v1.2 R-package. The same groups of patients were subjected to differential expression analysis (DEA), which was conducted separately for each selected cohort using the limma v3.62.2 R-package, employing the limma-voom method [72,73]. The workflow incorporated normalization through the calcNormFactors() function and the elimination of lowly expressed transcripts (those with ≥5 counts per million in ≥1 library), followed by variance modeling via the voom() transformation. Model fitting was performed in limma using weighted least squares for individual genes through lmFit(), and log2FC values comparing the case group (characterized by higher microbiota abundance and AP-2 expression) against the control group (characterized by lower microbiota abundance and AP-2 expression) were calculated using the makeContrasts() function with default parameters. Empirical Bayes smoothing of standard errors was applied before identifying differentially expressed genes (defined by p < 0.05 and log2FC > 0.58 or 0.58) using the topTable() function.

4.7. Bootstrapping and Weighted Gene Co-Expression Network Analysis (WGCNA)

Given the data scarcity in one of the representative groups (only four patients in the DLBC cohort representing higher microbiota abundance and AP-2 expression), this group was subjected twice to the resampling bootstrap method [74,75] to acquire a relevant quantity of samples prior to Weighted gene co-expression network analysis (WGCNA) [76]. The latter was performed on all selected cohorts to obtain consensus modules across tu-

mors using the Biological network reconstruction omnibus (BioNERO) [77]. Read counts were subjected to variance stabilizing transformation [78] via the “vstransform” param- eter set to “TRUE” and filtered by variance using “variance_filter” set to “TRUE” with n = 10,000 during the data preprocessing. Consensus modules were identified using the consensus_trait_cor() and consensus_modules() functions, with the latter executed with the correlation method set to Pearson and a module merging threshold of 0.2 for a signed hybrid type of network. The transformation into an adjacency matrix was performed using BioNERO-estimated optimal power via the consensus_SFT_fit() function and with the scale-free topology fitting index (R2) > 0.8. Established gene-containing modules were correlated to a binary trait representing microbiota abundance and AP-2 expression, with samples denoted as “AP-2 1 Microbiota 1” or “AP-2 Į Microbiota Į”.

4.8. Overlapping DEA and WGCNA Data Alongside Gene Ontology

Given the expression profile disparity between one of the selected cohorts versus the others (see Section 2.5), both WGCNA modules and DEA-derived differentially expressed genes were overlapped using the draw.pairwise.venn() function of the VennDiagram v1.7.3 R-package. The aforementioned modules and overlapping genes were independently subjected to Over-Representation Analysis (ORA) via the WebGestaltR v0.4.6 R-package employing a set of functional annotation resources: Biological Process noRedundant, KEGG, PANTHER, Reactome, and WikiPathways cancer. After setting the reference at “genome” and redundancy removal at “weighted set cover”, up to the top 15 annotations were acquired for each input list of genes.

4.9. Proximity Analysis Around Microbiota-Responsive Genes and FG/BG Set Construction

Following identification of ACC, DLBC, and STAD as the most promising cohorts, it was assessed whether microbe-responsive genes show AP-2 motif and ChIP-overlap enrichment in nearby accessible chromatin, suggesting AP-2-mediated regulation of the microbiota-host transcriptional program. Microbiota-responsive gene sets (utilized to center TSS windows) were derived from various sources (more details in Appendix A.1). Analyses were performed at the cohort level across genomic windows centered on the transcription start sites (TSS) of these genes using four radii (±250, ±500, ±750, ±1000 kb). Accessible chromatin peak sets for ACC and STAD were obtained from the TCGA Pan-Cancer ATAC compendium [79], and the DLBC-oriented analysis utilized ENCODE GM12878 ATAC peaks [80] owing to the lack of DLBC data in the TCGA compendium. Files were converted to BED3 format prior to processing. TSS coordinates were taken from UCSC refFlat on hg38 [81,82]. For each cohort and radius, ATAC peaks were intersected with TSS-centered windows to define foreground (FG) peak windows near upregulated genes, as well as background (BG) peak windows near genes not upregulated. Foreground windows were length/GC matched to background windows to control base composition. GC content was computed with BEDTools nuc [83,84], and adaptive with-replacement matching (tolerances ~2-10% GC and ±20-100 bp) produced a BG set equal in size to FG.

4.10. AP-2 Motif Scanning, ChIP-Seq Integration, and Enrichment Testing

To score AP-2 binding motifs, 300-bp windows centered on peak midpoints (pm150) were converted to FASTA using UCSC twoBitToFa [85] against hg38. Position-weight ma- trices for TFAP2B and TFAP2E were taken from HOCOMOCO v13 [86]. Motif scanning was performed with FIMO (MEME Suite) [87] using default background and a per-site p-value threshold of 1 x 10-3; per-sequence maximum scores and motif-hit BEDs were retained for downstream analysis. AP-2 family binding tracks were compiled from the Cistrome Data Browser v3.0 [88]; ChIP-seq peaks were downloaded, converted to BED3, and harmonized to hg38 (where needed) using UCSC liftOver [89]. No peak sets were available for TFAP2B

or TFAP2E; therefore, analyses relied on TFAP2A and TFAP2C individually and on their union as a surrogate for shared AP-2 occupancy. Motif-ChIP overlap was quantified by intersecting FG and GC-matched BG motif hits with AP-2 ChIP union peaks (bedtools intersect -u [90]). Enrichment of FG relative to BG was tested with a 2 x 2 Fisher’s exact test and summarized as odds ratio (OR) and p-value; directional empirical one-sided p-values were computed by Monte-Carlo draws from the hypergeometric distribution (N = 10,000). Lists of genes related to AP-20 and AP-2€ were obtained from various sources (more details in Appendix A.2) and further used for overlap summaries. Included references were UCSC hg38.2bit and refFlat.txt [91,92].

4.11. Statistical Analysis

All analyses were performed in RStudio 2024.12.1 build 563 with R 4.4.3 (Posit Soft- ware, Boston, MA, USA). Correlations between microbiota abundance and AP-2 expression were quantified using Pearson’s correlation coefficient. Survival endpoints were evaluated using Kaplan-Meier analysis with data-driven cutpoints and Cox proportional hazards models. For quantitative clinical data, normality of distribution was determined by the Shapiro-Wilk test, followed by an unpaired t-test or a Wilcoxon test. DEA between rep- resentative patient groups was assessed using the limma-voom framework, as well as integrated with consensus WGCNA and ORA. Enrichment near microbiota-responsive genes was tested using matched foreground and background peak sets. Results with a p- value less than 0.05 were considered statistically significant. Details on the implementation of specific methods are provided in the above subsections.

5. Conclusions

This study provides the first-ever assessment of the relationships between intratumoral microbiota and AP-2 transcription factors, and their widespread identification suggests a novel regulatory axis in human cancer. Initial screening revealed significant correlations in 18 of 33 tumor types, with TFAP2E-AS1 emerging as the most frequently correlated AP-2 member, and Halomonas as the most prevalent genus. Other frequently correlated compo- nents included TFAP2B, TFAP2D, TFAP2E, TFAP2A-AS1, Meiothermus, and Corynebacterium, with the DLBC tumor cohort exhibiting the highest number of identified relationships. Subsequent species-level analysis and prognostic evaluation identified three most promis- ing relationships: Paraburkholderia fungorum and TFAP2E in ACC, Actinomyces oris and TFAP2E in DLBC, as well as Cutibacterium granulosum and TFAP2B in STAD. Across mul- tiple prognostic endpoints, patients with simultaneously elevated microbiota abundance and AP-2 expression had unfavorable survival compared with those with concurrently lower microbiota and AP-2 levels, which was accompanied by distinct clinical feature patterns. Consensus expression profiling, as well as proximity and enrichment analyses across these three cohorts, has revealed important biological insights. While ACC and DLBC exhibited enrichment of AP-2 motif sites around microbiota-responsive genes and shared a consensus transcriptional profile, STAD displayed uniformly absent AP-2 family signals and showed expression changes that only partially resembled the ACC/DLBC consensus. This divergence manifested functionally, with metabolic reprogramming and TP53 regulation network involvement evident in ACC and DLBC but notably absent in STAD, suggesting that different AP-2 family members (TFAP2E versus TFAP2B) mediate distinct transcriptional programs in response to microbial stimuli. While the mechanistic landscape remains to be elucidated, our findings position AP-2 family members as poten- tial mediators of microbiota-host interactions in cancer, opening new investigative and therapeutic avenues in the rapidly evolving field of cancer microbiome research.

Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/ijms262311587/s1.

Author Contributions: Conceptualization, D.K .; methodology, D.K .; software, D.K .; writing-original draft preparation, D.K., P.G., and E.P .; writing-review and editing, D.K., P.G., L .- Y.Z., Ż.K .- K., M.K., R.K., E.P .; visualization, D.K. and P.G .; supervision, D.K .; project administration, D.K. All authors have read and agreed to the published version of the manuscript.

Funding: This research received no external funding.

Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.

Data Availability Statement: The original contributions presented in this study are included in the article/Supplementary Materials. Further inquiries can be directed to the corresponding author.

Conflicts of Interest: The authors declare no conflicts of interest.

Appendix A

Appendix A.1. Establishment of Microbiota-Responsive Gene Sets

Three cohort-specific gene sets were assembled to represent host programs plausibly modulated by the focal microbes. For ACC, a published adrenocortical carcinoma tran- scriptomic study was used to define a panel of high-confidence genes altered following treatment with P. fungorum-containing bacterial supernatant [93]. Differentially expressed genes were filtered at q < 0.05, and the top 30 upregulated plus top 30 down-regulated symbols were retained (n = 60). For STAD, a literature-anchored, Cutibacterium-responsive host panel was curated and then restricted to tumor-related genes: (i) genes implicated in Cutibacterium (Propionibacterium)-driven signaling and M2-macrophage polarization (centered on TLR4-PI3K-AKT and immunomodulatory markers such as IL10, MRC1, CD163, ARG1, TGFB1) were drawn from gastric cancer study [94], and (ii) tumor genes positively associated with Propionibacteriaceae abundance in gastric mucosa were taken from an integrative microbiome-transcriptome study [95]; the union was intersected with genes upregulated in gastric carcinoma versus inflamed non-tumor mucosa, yielding a concise, tumor-relevant set (n = 16). For DLBC, due to the lack of an Actinomyces-related host DE catalog, a mechanism-anchored panel (n = 47) was curated to reflect the expected B-lineage response to Gram-positive, lipoteichoic-acid-bearing stimuli. The panel was centered on TLR2/TLR1/TLR6 MYD88 IRAK4/IRAK1 TRAF6 NF-KB/MAPK, with complementary sensors (TLR9, NOD2), downstream transcription factors (RELA/RELB/ /NFKB1/2; JUN/FOS; MAPK1/3/14; MAP2K3/6), B-cell activation/costimulation and BCR-PI3K nodes (CD69, CD86, CD40, CD19; SYK, LYN, BLNK, PLCG2, PIK3CD), as well as cytokines/chemokines (IL6, TNF, IL1B, CXCL8, CCL2/3/4, CXCL1), consistent with established TLR2-MyD88 biology and B-cell TLR literature [96,97].

Appendix A.2. Acquisition of Genes Related to AP-23 and AP-2€

Candidate gene sets related to AP-26 /TFAP2B and AP-2€/TFAP2E were assembled us- ing a unified multi-source approach that integrated TF-target annotations and co-expression data. Repositories comprised Enrichr [98], cBioPortal [99], CorrelationAnalyzeR [100], UALCAN [101], TFLink [102], GeneCards [103], Harmonizome [104], TFBSDB [105], and TFTG [106]. Lists from sources with available data for specific TF were merged and dedupli- cated, yielding 928 TFAP2B-associated and 2700 TFAP2E-associated genes.

References

1. Chen, B.Y .; Lin, W.Z .; Li, Y.L .; Bi, C .; Du, L.J .; Liu, Y .; Zhou, L.J .; Liu, T .; Xu, S .; Shi, C.J .; et al. Roles of oral microbiota and oral-gut microbial transmission in hypertension. J. Adv. Res. 2023, 43, 147-161. [CrossRef] [PubMed]

2. Adu-Oppong, B .; Thanert, R .; Wallace, M.A .; Burnham, C.D .; Dantas, G. Substantial overlap between symptomatic and asymp- tomatic genitourinary microbiota states. Microbiome 2022, 10, 6. [CrossRef] [PubMed]

3. Sheng, D .; Jin, C .; Yue, K .; Yue, M .; Liang, Y .; Xue, X .; Li, P .; Zhao, G .; Zhang, L. Pan-cancer atlas of tumor-resident microbiome, immunity and prognosis. Cancer Lett. 2024, 598, 217077. [CrossRef]

4. Thomas, A.M .; Fidelle, M .; Routy, B .; Kroemer, G .; Wargo, J.A .; Segata, N .; Zitvogel, L. Gut OncoMicrobiome Signatures (GOMS) as next-generation biomarkers for cancer immunotherapy. Nat. Rev. Clin. Oncol. 2023, 20, 583-603. [CrossRef]

5. Overacre-Delgoffe, A.E .; Bumgarner, H.J .; Cillo, A.R .; Burr, A.H.P .; Tometich, J.T .; Bhattacharjee, A .; Bruno, T.C .; Vignali, D.A.A .; Hand, T.W. Microbiota-specific T follicular helper cells drive tertiary lymphoid structures and anti-tumor immunity against colorectal cancer. Immunity 2021, 54, 2812-2824.e2814. [CrossRef] [PubMed]

6. Pevsner-Fischer, M .; Tuganbaev, T .; Meijer, M .; Zhang, S.H .; Zeng, Z.R .; Chen, M.H .; Elinav, E. Role of the microbiome in non-gastrointestinal cancers. World J. Clin. Oncol. 2016, 7, 200-213. [CrossRef]

7. Galeano Nino, J.L .; Wu, H .; LaCourse, K.D .; Kempchinsky, A.G .; Baryiames, A .; Barber, B .; Futran, N .; Houlton, J .; Sather, C .; Sicinska, E .; et al. Effect of the intratumoral microbiota on spatial and cellular heterogeneity in cancer. Nature 2022, 611, 810-817. [CrossRef]

8. Heymann, C.J.F .; Bard, J.M .; Heymann, M.F .; Heymann, D .; Bobin-Dubigeon, C. The intratumoral microbiome: Characterization methods and functional impact. Cancer Lett. 2021, 522, 63-79. [CrossRef]

9. Yousefi, Y .; Baines, K.J .; Maleki Vareki, S. Microbiome bacterial influencers of host immunity and response to immunotherapy. Cell Rep. Med. 2024, 5, 101487. [CrossRef]

10. Panebianco, C .; Andriulli, A .; Pazienza, V. Pharmacomicrobiomics: Exploiting the drug-microbiota interactions in anticancer therapies. Microbiome 2018, 6, 92. [CrossRef]

11. Zhao, L.Y .; Mei, J.X .; Yu, G .; Lei, L .; Zhang, W.H .; Liu, K .; Chen, X.L .; Kolat, D .; Yang, K .; Hu, J.K. Role of the gut microbiota in anticancer therapy: From molecular mechanisms to clinical applications. Signal Transduct. Target. Ther. 2023, 8, 201. [CrossRef] [PubMed]

12. Said, S.S .; Ibrahim, W.N. Gut Microbiota-Tumor Microenvironment Interactions: Mechanisms and Clinical Implications for Immune Checkpoint Inhibitor Efficacy in Cancer. Cancer Manag. Res. 2025, 17, 171-192. [CrossRef]

13. Fu, Y .; Li, J .; Cai, W .; Huang, Y .; Liu, X .; Ma, Z .; Tang, Z .; Bian, X .; Zheng, J .; Jiang, J .; et al. The emerging tumor microbe microenvironment: From delineation to multidisciplinary approach-based interventions. Acta Pharm. Sin. B 2024, 14, 1560-1591. [CrossRef] [PubMed]

14. Fujisaka, S .; Watanabe, Y .; Tobe, K. The gut microbiome: A core regulator of metabolism. J. Endocrinol. 2023, 256, e220111. [CrossRef]

15. Lambring, C.B .; Siraj, S .; Patel, K .; Sankpal, U.T .; Mathew, S .; Basha, R. Impact of the Microbiome on the Immune System. Crit. Rev. Immunol. 2019, 39, 313-328. [CrossRef]

6. Jiang, C .; Wan, S .; Hu, P .; Li, Y .; Li, S. Editorial: Transcriptional Regulation in Metabolism and Immunology. Front. Genet. 2022, 13, 845697. [CrossRef]

17. Nichols, R.G .; Davenport, E.R. The relationship between the gut microbiome and host gene expression: A review. Hum. Genet. 2021, 140, 747-760. [CrossRef]

18. Camp, J.G .; Frank, C.L .; Lickwar, C.R .; Guturu, H .; Rube, T .; Wenger, A.M .; Chen, J .; Bejerano, G .; Crawford, G.E .; Rawls, J.F. Microbiota modulate transcription in the intestinal epithelium without remodeling the accessible chromatin landscape. Genome Res. 2014, 24, 1504-1516. [CrossRef]

19. Davison, J.M .; Lickwar, C.R .; Song, L .; Breton, G .; Crawford, G.E .; Rawls, J.F. Microbiota regulate intestinal epithelial gene expression by suppressing the transcription factor Hepatocyte nuclear factor 4 alpha. Genome Res. 2017, 27, 1195-1206. [CrossRef] [PubMed]

20. Meisel, J.S .; Sfyroera, G .; Bartow-McKenney, C .; Gimblet, C .; Bugayev, J .; Horwinski, J .; Kim, B .; Brestoff, J.R .; Tyldsley, A.S .; Zheng, Q .; et al. Commensal microbiota modulate gene expression in the skin. Microbiome 2018, 6, 20. [CrossRef]

21. Kazmierczak-Siedlecka, K .; Ruszkowski, J .; Skonieczna-Zydecka, K .; Jedrzejczak, J .; Folwarski, M .; Makarewicz, W. Gastrointesti- nal cancers: The role of microbiota in carcinogenesis and the role of probiotics and microbiota in anti-cancer therapy efficacy. Cent. Eur. J. Immunol. 2020, 45, 476-487. [CrossRef]

22. Wingender, E .; Schoeps, T .; Haubrock, M .; Krull, M .; Donitz, J. TFClass: Expanding the classification of human transcription factors to their mammalian orthologs. Nucleic Acids Res. 2018, 46, D343-D347. [CrossRef]

23. Jin, C .; Luo, Y .; Liang, Z .; Li, X .; Kolat, D .; Zhao, L .; Xiong, W. Crucial role of the transcription factors family activator protein 2 in cancer: Current clue and views. J. Transl. Med. 2023, 21, 371. [CrossRef]

24. Yu, Y.J .; Kolat, D .; Kaluzinska-Kolat, Z .; Liang, Z .; Peng, B.Q .; Zhu, Y.F .; Liu, K .; Mei, J.X .; Yu, G .; Zhang, W.H .; et al. The AP-2 Family of Transcription Factors-Still Undervalued Regulators in Gastroenterological Disorders. Int. J. Mol. Sci. 2024, 25, 9138. [CrossRef]

25. Eckert, D .; Buhl, S .; Weber, S .; Jager, R .; Schorle, H. The AP-2 family of transcription factors. Genome Biol. 2005, 6, 246. [CrossRef] [PubMed]

26. Wu, H.R .; Zhang, J. AP-2x expression in papillary thyroid carcinoma predicts tumor progression and poor prognosis. Cancer Manag. Res. 2018, 10, 2615-2625. [CrossRef]

27. . Kolat, D .; Kaluzinska, Z .; Bednarek, A.K .; Pluciennik, E. The biological characteristics of transcription factors AP-2x and AP-2y and their importance in various types of cancers. Biosci. Rep. 2019, 39, BSR20181928. [CrossRef] [PubMed]

28. Gao, Q.; Shi, Y.; Sun, Y.; Zhou, S.; Liu, Z.; Sun, X.; Di, X. Identification and verification of aging-related lncRNAs for prognosis prediction and immune microenvironment in patients with head and neck squamous carcinoma. Oncol. Res. 2023, 31, 35-61. [CrossRef] [PubMed]

29. Zhong, L .; Wang, Y .; Kannan, P .; Tainsky, M.A. Functional characterization of the interacting domains of the positive coactivator PC4 with the transcription factor AP-2x. Gene 2003, 320, 155-164. [CrossRef]

30. Zhao, F .; Satoda, M .; Licht, J.D .; Hayashizaki, Y .; Gelb, B.D. Cloning and characterization of a novel mouse AP-2 transcription factor, AP-26, with unique DNA binding and transactivation properties. J. Biol. Chem. 2001, 276, 40755-40760. [CrossRef]

31. Kolat, D .; Kaluzinska, Z .; Orzechowska, M .; Bednarek, A.K .; Pluciennik, E. Functional genomics of AP-2x and AP-2y in cancers: In silico study. BMC Med. Genom. 2020, 13, 174. [CrossRef]

32. Niu, C .; Wen, H .; Wang, S .; Shu, G .; Wang, M .; Yi, H .; Guo, K .; Pan, Q .; Yin, G. Potential prognosis and immunotherapy predictor TFAP2A in pan-cancer. Aging 2024, 16, 1021-1048. [CrossRef] [PubMed]

33. Zhang, Z .; Zhang, L .; Jia, L .; Cui, S .; Shi, Y .; Chang, A .; Zeng, X .; Wang, P. AP-2« suppresses invasion in BeWo cells by repression of matrix metalloproteinase-2 and -9 and up-regulation of E-cadherin. Mol. Cell. Biochem. 2013, 381, 31-39. [CrossRef]

34. Li, Z .; Xiong, W .; Liang, Z .; Wang, J .; Zeng, Z .; Kolat, D .; Li, X .; Zhou, D .; Xu, X .; Zhao, L. Critical role of the gut microbiota in immune responses and cancer immunotherapy. J. Hematol. Oncol. 2024, 17, 33. [CrossRef]

35. Kolat, D .; Zhao, L.Y .; Kciuk, M .; Pluciennik, E .; Kaluzinska-Kolat, Z. AP-26 Is the Most Relevant Target of AP-2 Family-Focused Cancer Therapy and Affects Genome Organization. Cells 2022, 11, 4124. [CrossRef]

36. Kolat, D .; Kaluzinska, Z .; Bednarek, A.K .; Pluciennik, E. Prognostic significance of AP-2x/y targets as cancer therapeutics. Sci. Rep. 2022, 12, 5497. [CrossRef]

37. Liu, M.; Jia, W.; Bai, L.; Lin, Q. Dysregulation of lncRNA TFAP2A-AS1 is involved in the pathogenesis of pulpitis by the regulation of microRNA-32-5p. Immun. Inflamm. Dis. 2024, 12, e1312. [CrossRef]

8. Li, M .; Chen, W.D .; Wang, Y.D. The roles of the gut microbiota-miRNA interaction in the host pathophysiology. Mol. Med. 2020, 26, 101. [CrossRef]

39. Prukpitikul, P .; Sirivarasai, J .; Sutjarit, N. The molecular mechanisms underlying gut microbiota-miRNA interaction in metabolic disorders. Benef. Microbes. 2024, 15, 83-96. [CrossRef] [PubMed]

40. Fraune, C .; Harms, L .; Buscheck, F .; Hoflmayer, D .; Tsourlakis, M.C .; Clauditz, T.S .; Simon, R .; Moller, K .; Luebke, A.M .; Moller- Koop, C .; et al. Upregulation of the transcription factor TFAP2D is associated with aggressive tumor phenotype in prostate cancer lacking the TMPRSS2:ERG fusion. Mol. Med. 2020, 26, 24. [CrossRef] [PubMed]

41. Frierson, H.F., Jr .; El-Naggar, A.K .; Welsh, J.B .; Sapinoso, L.M .; Su, A.I .; Cheng, J .; Saku, T .; Moskaluk, C.A .; Hampton, G.M. Large scale molecular analysis identifies genes with altered expression in salivary adenoid cystic carcinoma. Am. J. Pathol. 2002, 161, 1315-1323. [CrossRef]

2. Assie, G .; Letouze, E .; Fassnacht, M .; Jouinot, A .; Luscap, W .; Barreau, O .; Omeiri, H .; Rodriguez, S .; Perlemoine, K .; Rene-Corail, F .; et al. Integrated genomic characterization of adrenocortical carcinoma. Nat. Genet. 2014, 46, 607-612. [CrossRef]

43. Li, Z .; Shi, J .; Wu, X .; Yan, S .; Gao, Z .; Liu, Y. Gut microbiota is associated with the disease characteristics of patients with newly diagnosed diffuse large B-cell lymphoma. Am. J. Cancer Res. 2025, 15, 2285-2300. [CrossRef]

44. Yoon, S.E .; Kang, W .; Choi, S .; Park, Y .; Chalita, M .; Kim, H .; Lee, J.H .; Hyun, D.W .; Ryu, K.J .; Sung, H .; et al. The influence of microbial dysbiosis on immunochemotherapy-related efficacy and safety in diffuse large B-cell lymphoma. Blood 2023, 141, 2224-2238. [CrossRef]

45. Lin, Z .; Mao, D .; Jin, C .; Wang, J .; Lai, Y .; Zhang, Y .; Zhou, M .; Ge, Q .; Zhang, P .; Sun, Y .; et al. The gut microbiota correlate with the disease characteristics and immune status of patients with untreated diffuse large B-cell lymphoma. Front. Immunol. 2023, 14, 1105293. [CrossRef]

46. Xu, Y .; Shi, C .; Qian, J .; Yu, X .; Wang, S .; Shao, L .; Yu, W. The gut microbiota is altered significantly in primary diffuse large b-cell lymphoma patients and relapse refractory diffuse large b-cell lymphoma patients. Clin. Transl. Oncol. 2025, 27, 2347-2353. [CrossRef] [PubMed]

47. Yijia, Z .; Li, X .; Ma, L .; Wang, S .; Du, H .; Wu, Y .; Yu, J .; Xiang, Y .; Xiong, D .; Shan, H .; et al. Identification of intratumoral microbiome-driven immune modulation and therapeutic implications in diffuse large B-cell lymphoma. Cancer Immunol. Immunother. 2025, 74, 131. [CrossRef] [PubMed]

48. Yoon, S.E .; Kang, W .; Chalita, M .; Lim, J .; Kim, W.S .; Kim, S.J. Comprehensive Understanding of Gut Microbiota in Treatment Naïve Diffuse Large B Cell Lymphoma Patients. Blood 2021, 138, 2409. [CrossRef]

49. Zhang, Y .; Han, S .; Xiao, X .; Zheng, L .; Chen, Y .; Zhang, Z .; Gao, X .; Zhou, S .; Yu, K .; Huang, L .; et al. Integration analysis of tumor metagenome and peripheral immunity data of diffuse large-B cell lymphoma. Front. Immunol. 2023, 14, 1146861. [CrossRef]

50. Zhou, D .; Xiong, S .; Xiong, J .; Deng, X .; Long, Q .; Li, Y. Integrated analysis of the microbiome and transcriptome in stomach adenocarcinoma. Open Life Sci. 2023, 18, 20220528. [CrossRef]

51. Rodriguez, R.M .; Menor, M .; Hernandez, B.Y .; Deng, Y .; Khadka, V.S. Bacterial Diversity Correlates with Overall Survival in Cancers of the Head and Neck, Liver, and Stomach. Molecules 2021, 26, 5659. [CrossRef] [PubMed]

52. Wen, Y .; Wang, Y .; Huang, Y .; Liu, Z .; Hui, C. PLVAP protein expression correlated with microbial composition, clinicopathological features, and prognosis of patients with stomach adenocarcinoma. J. Cancer Res. Clin. Oncol. 2023, 149, 7139-7153. [CrossRef]

53. Hu, Y.L .; Pang, W .; Huang, Y .; Zhang, Y .; Zhang, C.J. The Gastric Microbiome Is Perturbed in Advanced Gastric Adenocarcinoma Identified Through Shotgun Metagenomics. Front. Cell. Infect. Microbiol. 2018, 8, 433. [CrossRef]

54. Wang, W .; Lv, L .; Pan, K .; Zhang, Y .; Zhao, J.J .; Chen, J.G .; Chen, Y.B .; Li, Y.Q .; Wang, Q.J .; He, J .; et al. Reduced expression of transcription factor AP-2« is associated with gastric adenocarcinoma prognosis. PLOS ONE 2011, 6, e24897. [CrossRef]

55. Wen, Y .; Feng, L .; Wang, H .; Zhou, H .; Li, Q .; Zhang, W .; Wang, M .; Li, Y .; Luan, X .; Jiang, Z .; et al. Association Between Oral Microbiota and Human Brain Glioma Grade: A Case-Control Study. Front. Microbiol. 2021, 12, 746568. [CrossRef]

56. Elvevi, A .; Laffusa, A .; Gallo, C .; Invernizzi, P .; Massironi, S. Any Role for Microbiota in Cholangiocarcinoma? A Comprehensive Review. Cells 2023, 12, 370. [CrossRef]

57. Yu, H .; Du, Y .; He, Y .; Sun, Y .; Li, J .; Jia, B .; Chen, J .; Peng, X .; An, T .; Li, J .; et al. Lactate production by tumor-resident Staphylococcus promotes metastatic colonization in lung adenocarcinoma. Cell Host Microbe 2025, 33, 1089-1105.e7. [CrossRef]

58. Li, R .; Li, J .; Zhou, X. Lung microbiome: New insights into the pathogenesis of respiratory diseases. Signal Transduct. Target. Ther. 2024, 9, 19. [CrossRef]

59. Wei, Y .; Sandhu, E .; Yang, X .; Yang, J .; Ren, Y .; Gao, X. Bidirectional Functional Effects of Staphylococcus on Carcinogenesis. Microorganisms 2022, 10, 2353. [CrossRef] [PubMed]

60. Szmajda-Krygier, D .; Krygier, A .; Zebrowska-Nawrocka, M .; Pietrzak, J .; Swiechowski, R .; Wosiak, A .; Jelen, A .; Balcerczak, E. Differential Expression of AP-2 Transcription Factors Family in Lung Adenocarcinoma and Lung Squamous Cell Carcinoma-A Bioinformatics Study. Cells 2023, 12, 667. [CrossRef] [PubMed]

61. Liu, C.C .; Grencewicz, D .; Chakravarthy, K .; Li, L .; Liepold, R .; Wolf, M .; Sangwan, N .; Tzeng, A .; Hoyd, R .; Jhawar, S.R .; et al. Breast tumor microbiome regulates anti-tumor immunity and T cell-associated metabolites. bioRxiv 2024. [CrossRef]

62. Wang, C .; Zhao, Z .; Zhao, Y .; Zhao, J .; Xia, L .; Xia, Q. Macroscopic inhibition of DNA damage repair pathways by targeting AP-2x with LEI110 eradicates hepatocellular carcinoma. Commun. Biol. 2024, 7, 342. [CrossRef]

63. Do, H .; Kim, D .; Kang, J .; Son, B .; Seo, D .; Youn, H .; Youn, B .; Kim, W. TFAP2C increases cell proliferation by downregulating GADD45B and PMAIP1 in non-small cell lung cancer cells. Biol. Res. 2019, 52, 35. [CrossRef]

64. Cao, Y .; Xia, H .; Tan, X .; Shi, C .; Ma, Y .; Meng, D .; Zhou, M .; Lv, Z .; Wang, S .; Jin, Y. Intratumoural microbiota: A new frontier in cancer development and therapy. Signal Transduct. Target. Ther. 2024, 9, 15. [CrossRef]

65. Yang, L .; Li, A .; Wang, Y .; Zhang, Y. Intratumoral microbiota: Roles in cancer initiation, development and therapeutic efficacy. Signal Transduct. Target Ther. 2023, 8, 35. [CrossRef]

66. Li, Y .; Zhang, D .; Wang, M .; Jiang, H .; Feng, C .; Li, Y.X. Intratumoral microbiota is associated with prognosis in patients with adrenocortical carcinoma. iMeta 2023, 2, e102. [CrossRef]

67. The Cancer Genome Atlas Research Network; Weinstein, J.N .; Collisson, E.A .; Mills, G.B .; Shaw, K.R .; Ozenberger, B.A .; Ellrott, K .; Shmulevich, I .; Sander, C .; Stuart, J.M. The Cancer Genome Atlas Pan-Cancer analysis project. Nat. Genet. 2013, 45, 1113-1120. [CrossRef]

68. Chen, K.P .; Hsu, C.L .; Oyang, Y.J .; Huang, H.C .; Juan, H.F. BIC: A database for the transcriptional landscape of bacteria in cancer. Nucleic Acids Res. 2023, 51, D1205-D1211. [CrossRef] [PubMed]

69. Jensen, M.A .; Ferretti, V .; Grossman, R.L .; Staudt, L.M. The NCI Genomic Data Commons as an engine for precision medicine. Blood 2017, 130, 453-459. [CrossRef]

70. Liu, J .; Lichtenberg, T .; Hoadley, K.A .; Poisson, L.M .; Lazar, A.J .; Cherniack, A.D .; Kovatich, A.J .; Benz, C.C .; Levine, D.A .; Lee, A.V .; et al. An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics. Cell 2018, 173, 400-416.e411. [CrossRef] [PubMed]

71. Ogluszka, M .; Orzechowska, M .; Jedroszka, D .; Witas, P .; Bednarek, A.K. Evaluate Cutpoints: Adaptable continuous data distribution system for determining survival in Kaplan-Meier estimator. Comput. Methods Programs Biomed. 2019, 177, 133-139. [CrossRef] [PubMed]

72. Ritchie, M.E .; Phipson, B .; Wu, D .; Hu, Y .; Law, C.W .; Shi, W .; Smyth, G.K. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015, 43, e47. [CrossRef] [PubMed]

73. Law, C.W .; Chen, Y .; Shi, W .; Smyth, G.K. voom: Precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol. 2014, 15, R29. [CrossRef] [PubMed]

74. Al Seesi, S .; Tiagueu, Y.T .; Zelikovsky, A .; Mandoiu, I.I. Bootstrap-based differential gene expression analysis for RNA-Seq data with and without replicates. BMC Genom. 2014, 15 (Suppl. 8), S2. [CrossRef]

75. Kulesa, A .; Krzywinski, M .; Blainey, P .; Altman, N. Sampling distributions and the bootstrap. Nat. Methods 2015, 12, 477-478. [CrossRef]

76. Qu, X .; Wu, S .; Gao, J .; Qin, Z .; Zhou, Z .; Liu, J. Weighted gene co expression network analysis (WGCNA) with key pathways and hub-genes related to micro RNAs in ischemic stroke. IET Syst. Biol. 2021, 15, 93-100. [CrossRef]

77. Almeida-Silva, F .; Venancio, T.M. BioNERO: An all-in-one R/Bioconductor package for comprehensive and easy biological network reconstruction. Funct. Integr. Genom. 2022, 22, 131-136. [CrossRef]

78. Love, M.I .; Huber, W .; Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014, 15, 550. [CrossRef]

79. Corces, M.R .; Granja, J.M .; Shams, S .; Louie, B.H .; Seoane, J.A .; Zhou, W .; Silva, T.C .; Groeneveld, C .; Wong, C.K .; Cho, S.W .; et al. The chromatin accessibility landscape of primary human cancers. Science 2018, 362, eaav1898. [CrossRef]

80. Consortium, E.P. An integrated encyclopedia of DNA elements in the human genome. Nature 2012, 489, 57-74. [CrossRef]

81. Perez, G .; Barber, G.P .; Benet-Pages, A .; Casper, J .; Clawson, H .; Diekhans, M .; Fischer, C .; Gonzalez, J.N .; Hinrichs, A.S .; Lee, C.M .; et al. The UCSC Genome Browser database: 2025 update. Nucleic Acids Res. 2025, 53, D1243-D1249. [CrossRef] [PubMed]

82. Bai, Y .; Kinne, J .; Ding, L .; Rath, E.C .; Cox, A .; Naidu, S.D. Identification of genome-wide non-canonical spliced regions and analysis of biological functions for spliced sequences using Read-Split-Fly. BMC Bioinform. 2017, 18, 382. [CrossRef]

83. Quinlan, A.R. BEDTools: The Swiss-Army Tool for Genome Feature Analysis. Curr. Protoc. Bioinform. 2014, 47, 11.12.11-11.12.34. [CrossRef]

84. Heilbrun, E.E .; Tseitline, D .; Wasserman, H .; Kirshenbaum, A .; Cohen, Y .; Gordan, R .; Adar, S. The epigenetic landscape shapes smoking-induced mutagenesis by modulating DNA damage susceptibility and repair efficiency. Nucleic Acids Res. 2025, 53, gkaf048. [CrossRef]

85. Capriotti, E .; Fariselli, P. PhD-SNPg: Updating a webserver and lightweight tool for scoring nucleotide variants. Nucleic Acids Res. 2023, 51, W451-W458. [CrossRef]

86. Vorontsov, I.E .; Eliseeva, I.A .; Zinkevich, A .; Nikonov, M .; Abramov, S .; Boytsov, A .; Kamenets, V .; Kasianova, A .; Kolmykov, S .; Yevshin, I.S .; et al. HOCOMOCO in 2024: A rebuild of the curated collection of binding models for human and mouse transcription factors. Nucleic Acids Res. 2024, 52, D154-D163. [CrossRef] [PubMed]

87. Grant, C.E .; Bailey, T.L .; Noble, W.S. FIMO: Scanning for occurrences of a given motif. Bioinformatics 2011, 27, 1017-1018. [CrossRef]

88. Taing, L .; Dandawate, A .; L’Yi, S .; Gehlenborg, N .; Brown, M .; Meyer, C.A. Cistrome Data Browser: Integrated search, analysis and visualization of chromatin data. Nucleic Acids Res. 2024, 52, D61-D66. [CrossRef] [PubMed]

89. Park, K.J .; Yoon, Y.A .; Park, J.H. Evaluation of Liftover Tools for the Conversion of Genome Reference Consortium Human Build 37 to Build 38 Using ClinVar Variants. Genes 2023, 14, 1875. [CrossRef]

90. Layer, R.M .; Skadron, K .; Robins, G .; Hall, I.M .; Quinlan, A.R. Binary Interval Search: A scalable algorithm for counting interval intersections. Bioinformatics 2013, 29, 1-7. [CrossRef]

91. Schmid-Burgk, J.L .; Gao, L .; Li, D .; Gardner, Z .; Strecker, J .; Lash, B .; Zhang, F. Highly Parallel Profiling of Cas9 Variant Specificity. Mol. Cell 2020, 78, 794-800.e8. [CrossRef] [PubMed]

92. Hsu, F.M .; Wang, C.R .; Chen, P.Y. Reduced Representation Bisulfite Sequencing in Maize. Bio-Protocol 2018, 8, e2778. [CrossRef]

93. Chai, X .; Wang, J .; Li, H .; Gao, C .; Li, S .; Wei, C .; Huang, J .; Tian, Y .; Yuan, J .; Lu, J .; et al. Intratumor microbiome features reveal antitumor potentials of intrahepatic cholangiocarcinoma. Gut Microbes 2023, 15, 2156255. [CrossRef]

94. Li, Q .; Wu, W .; Gong, D .; Shang, R .; Wang, J .; Yu, H. Propionibacterium acnes overabundance in gastric cancer promote M2 polarization of macrophages via a TLR4/PI3K/ Akt signaling. Gastric Cancer 2021, 24, 1242-1253. [CrossRef]

95. Park, C.H .; Hong, C .; Lee, A.R .; Sung, J .; Hwang, T.H. Multi-omics reveals microbiome, host gene expression, and immune landscape in gastric carcinogenesis. iScience 2022, 25, 103956. [CrossRef]

96. de Groen, R.A.L .; Schrader, A.M.R .; Kersten, M.J .; Pals, S.T .; Vermaat, J.S.P. MYD88 in the driver’s seat of B-cell lymphomagenesis: From molecular mechanisms to clinical implications. Haematologica 2019, 104, 2337-2348. [CrossRef]

97. Isaza-Correa, J.M .; Liang, Z .; van den Berg, A .; Diepstra, A .; Visser, L. Toll-like receptors in the pathogenesis of human B cell malignancies. J. Hematol. Oncol. 2014, 7, 57. [CrossRef]

98. Xie, Z .; Bailey, A .; Kuleshov, M.V .; Clarke, D.J.B .; Evangelista, J.E .; Jenkins, S.L .; Lachmann, A .; Wojciechowicz, M.L .; Kropiwnicki, E .; Jagodnik, K.M .; et al. Gene Set Knowledge Discovery with Enrichr. Curr. Protoc. 2021, 1, e90. [CrossRef]

99. Gao, J .; Aksoy, B.A .; Dogrusoz, U .; Dresdner, G .; Gross, B .; Sumer, S.O .; Sun, Y .; Jacobsen, A .; Sinha, R .; Larsson, E .; et al. Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci. Signal 2013, 6, pl1. [CrossRef] [PubMed]

100. Miller, H.E .; Bishop, A.J.R. Correlation AnalyzeR: Functional predictions from gene co-expression correlations. BMC Bioinform. 2021, 22, 206. [CrossRef] [PubMed]

101. Chandrashekar, D.S .; Karthikeyan, S.K .; Korla, P.K .; Patel, H .; Shovon, A.R .; Athar, M .; Netto, G.J .; Qin, Z.S .; Kumar, S .; Manne, U .; et al. UALCAN: An update to the integrated cancer data analysis platform. Neoplasia 2022, 25, 18-27. [CrossRef]

102. Liska, O .; Bohar, B .; Hidas, A .; Korcsmaros, T .; Papp, B .; Fazekas, D .; Ari, E. TFLink: An integrated gateway to access transcription factor-target gene interactions for multiple species. Database 2022, 2022, baac083. [CrossRef] [PubMed]

103. Stelzer, G .; Rosen, N .; Plaschkes, I .; Zimmerman, S .; Twik, M .; Fishilevich, S .; Stein, T.I .; Nudel, R .; Lieder, I .; Mazor, Y .; et al. The GeneCards Suite: From Gene Data Mining to Disease Genome Sequence Analyses. Curr. Protoc. Bioinform. 2016, 54, 1.30.1-1.30.33. [CrossRef] [PubMed]

104. Diamant, I .; Clarke, D.J.B .; Evangelista, J.E .; Lingam, N .; Ma’ayan, A. Harmonizome 3.0: Integrated knowledge about genes and proteins from diverse multi-omics resources. Nucleic Acids Res. 2025, 53, D1016-D1028. [CrossRef] [PubMed]

105. Plaisier, C.L .; O’Brien, S .; Bernard, B .; Reynolds, S .; Simon, Z .; Toledo, C.M .; Ding, Y .; Reiss, D.J .; Paddison, P.J .; Baliga, N.S. Causal Mechanistic Regulatory Network for Glioblastoma Deciphered Using Systems Genetics Network Analysis. Cell Syst. 2016, 3, 172-186. [CrossRef]

106. Zhou, X .; Zhou, L .; Qian, F .; Chen, J .; Zhang, Y .; Yu, Z .; Zhang, J .; Yang, Y .; Li, Y .; Song, C .; et al. TFTG: A comprehensive database for human transcription factors and their targets. Comput. Struct. Biotechnol. J. 2024, 23, 1877-1885. [CrossRef]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.