ELSEVIER

Contents lists available at ScienceDirect

Computational and Structural Biotechnology Journal

journal homepage: www.elsevier.com/locate/csbj


Research article

Identification of the H3K36me3 reader LEDGF/p75 in the pancancer landscape and functional exploration in clear cell renal cell carcinoma


Yuwei Zhang a,b,1, Wei Guo b,1, Yangkun Feng a,1, Longfei Yang a,b, Hao Lin b, Pengcheng Zhou, Kejie Zhao, Lin Jiang d,*, Bing Yao e,**, Ninghan Feng a,b,c,***

a Nantong University Medical School, Nantong, China

b Department of Urology, Jiangnan University Medical Center, Wuxi, China

c Wuxi School of Medicine, Jiangnan University, Wuxi, China

d Department of Endocrinology and Metabolism, First Affiliated Hospital of Nanjing Medical University, Nanjing, China

e Department of Medical Genetics, Nanjing Medical University, Nanjing, China

ARTICLE INFO

Keywords:

LEDGF/p75; H3K36me3; Pancancer; SETD2; Clear cell renal cell carcinoma

ABSTRACT

Lens epithelium-derived growth factor (LEDGF/p75) is a reader of epigenetic marks and a potential target for therapeutic intervention. Its involvement in human immunodeficiency virus (HIV) integration and in the development of leukemia driven by MLL (also known as KMT2A) gene fusions makes it an attractive candidate for drug development. However, exploration of LEDGF/p75 as an epigenetic reader of H3K36me3 in tumors is limited. Here, for the first time, we analyze the role of LEDGF/p75 in multiple cancers via multiple online databases and in vitro experiments. We used pancancer bulk sequencing data and online tools to analyze correlations of LEDGF/p75 with prognosis, genomic instability, DNA damage repair, prognostic alternative splicing, protein interactions, and tumor immunity. In summary, the present study identified that LEDGF/p75 may serve as a prognostic predictor for tumors such as adrenocortical carcinoma, kidney chromophobe, liver hepatocellular carcinoma, pancreatic adenocarcinoma, skin cutaneous melanoma, and clear cell renal cell carcinoma (ccRCC). In addition, in vitro experiments and gene microarray analysis were performed to explore the function of LEDGF/p75 in ccRCC, providing new insights into the pathogenesis of the nonmutated SETD2 ccRCC subtype.

1. Introduction

LEDGF/p75, encoded by PSIP1, was originally identified as a protein copurifying with the general transcriptional coactivator PC4 and described as a transcriptional coactivator related to stress and autoimmune responses[1]. PSIP1 also encodes an alternatively spliced isoform referred to as p52. Compared with p52, LEDGF/p75 has an integrase binding domain (IBD) in addition to the common PWWP domain[2]. LEDGF/p75 has been reported to play a key role in human immunodeficiency virus (HIV) infection and in the development of MLL leukemia[3-5]. HIV integrase can recognize and bind to the IBD of LEDGF/p75 and hijack it to transcriptionally active regions of the genome, allowing the virus to replicate in large numbers[6,7]. Similarly, the MLL/MENIN complex, an important player in the development and progression of MLL leukemia, can bind to the IBD of LEDGF/p75 to promote development and progression of the disease[8,9]. In fact, as a chromatin-binding protein, LEDGF/p75 mediates chromatin localization of several nuclear proteins [10-13].

Posttranslational modification of histones is an important branch of epigenetic inheritance that has been widely reported in various diseases, especially cancer. H3K36me3, the trimethylation of lysine 36 of histone H3, mediates several key tumor-related processes, such as transcriptional elongation, DNA methylation, and DNA damage repair [14]. Studies have reported that LEDGF/p75 reads the H3K36me3 mark via its PWWP domain to recruit functional proteins to regions of actively transcribed genes[15,16]. However, its role in tumors is poorly understood. Therefore, the present study was performed to characterize the landscape of LEDGF/p75 across cancers for the first time.

* Correspondence to: Department of Endocrinology and Metabolism, First Affiliated Hospital of Nanjing Medical University, 300 Guangzhou Road, Nanjing 210029, China.

** Correspondence to: Department of Medical Genetics, Nanjing Medical University, 101 Longmian Road, Nanjing 211166, China.

*** Correspondence to: Nantong University Medical School, 9 Qiangyuan Road, Nantong 226019, China.

E-mail addresses: jlinna0000@163.com (L. Jiang), byao@njmu.edu.cn (B. Yao), n.feng@njmu.edu.cn (N. Feng).

1 These authors contributed equally to this work.

https://doi.org/10.1016/j.csbj.2023.08.023

Renal cell carcinoma includes more than 10 histological and molecular subtypes, of which clear cell renal cell carcinoma (ccRCC) is the most common and accounts for the majority of deaths associated with kidney cancer[17]. Genetically, ccRCC results from high-frequency mutation or even deletion of multiple tumor-suppressor genes (VHL, 80%; PBRM1, 29-46%; BAP1, 6-19%; and SETD2, 8-30%), which leads to genomic instability and promotes defects in DNA repair pathways [18]. ccRCC can be classified into clinically and therapeutically relevant subtypes based on the molecular characteristics caused by these defects [19]. SETD2 is an RNA polymerase II-associated histone methyltransferase that catalyzes H3K36me3, a marker of transcriptional activity. Previous studies indicated that H3K36me3 is added only by SETD2. Although SMYD5 was recently reported to play a role in this methylation, SETD2 remains the dominant specific methyltransferase [20].

Research on the presence or absence of SETD2 is of key clinical significance for personalized treatment. Therefore, the present study was performed as a preliminary functional exploration of LEDGF/p75, the main reader of H3K36me3, in ccRCC.

2. Materials and Methods

2.1. Acquisition of basic information about LEDGF/p75

The genomic view of the LEDGF/p75 gene was obtained from the GeneCards database[21]. The features of the domains and regions of LEDGF/p75 were obtained from the UniProt database[22]. A three-dimensional structure of LEDGF/p75 was obtained from AlphaFold[23]. Immunofluorescence images of the intracellular location of LEDGF/p75 in U-251MG cells (HPA019697) were obtained from the ATLAS database[24].

2.2. Interaction network of LEDGF/p75 and functional enrichment analyses

The LEDGF/p75 protein-protein interaction network with physical interactions was predicted via the GeneMANIA database[25], and the potential pathways are marked in colors. The LEDGF/p75 protein-protein interaction network with known experimental validations was also explored via the STRING database[26]. We further predicted scores in cancers and other diseases based on the LEDGF/p75 protein-protein interaction network via the canSAR database[27].

Gene Ontology (GO) analyses, including biological process, cellular component and molecular function analyses, along with Reactome pathways, were predicted via the TISIDB database[28]. Gene set enrichment analysis (GSEA) was carried out via R software with canonical pathway gene sets derived from the KEGG pathway database [29]. All R programs used in the present study were uploaded to GitHub (https://github.com/melondoctor/LEDGF/tree/master).

We obtained pancancer expression profiles from the UCSC Xena database[30]. Associations between 5 histone modification genes, 5 DNA mismatch repair genes, tumor environment, immune cell infiltration, immune-related genes and LEDGF/p75 expression in pancancer were visualized using R software.

2.4. Differential expression of LEDGF/p75 in normal and tumor groups

We processed expression data from the UCSC Xena database, excluded tumor types with fewer than three normal samples, and analyzed the remaining data for 21 tumor types via R software. For the remaining 12 tumor types, we added data from GTEx through GEPIA[31] for further analysis and identified differences in LEDGF/p75 expression between three tumor groups and corresponding normal groups. We further explored differential expression of LEDGF/p75 in different kinds of cells and obtained results from the ATLAS database.
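The sample-count filter described above can be sketched as follows. This is a minimal illustration in Python rather than the authors' R code; the function name, the (tumor type, sample kind) input layout, and the example counts are hypothetical.

```python
from collections import Counter


def split_by_normal_count(samples, min_normals=3):
    """Separate tumor types that have enough normal samples from those that do not.

    samples     -- list of (tumor_type, sample_kind) pairs, sample_kind being
                   "tumor" or "normal" (hypothetical layout)
    min_normals -- minimum number of normal samples required to keep a tumor type
    """
    normal_counts = Counter(t for t, kind in samples if kind == "normal")
    all_types = {t for t, _ in samples}
    kept = sorted(t for t in all_types if normal_counts[t] >= min_normals)
    # Types that fail the threshold would need normal tissue from another source.
    needs_external = sorted(all_types.difference(kept))
    return kept, needs_external


# Hypothetical counts, for illustration only.
example = (
    [("KIRC", "tumor")] * 5 + [("KIRC", "normal")] * 4
    + [("DLBC", "tumor")] * 5
    + [("OV", "tumor")] * 5 + [("OV", "normal")] * 2
)
kept, needs_external = split_by_normal_count(example)
print(kept, needs_external)
```

In this sketch the tumor types returned in `needs_external` correspond to those for which the text above supplements GTEx normal tissues.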

For the LEDGF/p75 protein expression of tumor and normal groups, we searched the UALCAN database[32] and found 7 tumor types with different expression. Subsequently, we searched the ATLAS database and found corresponding immunohistochemical diagrams to show LEDGF/p75 protein expression.

2.5. Analyses between LEDGF/p75 expression and patient prognosis

Data on expression and survival in pancancer were obtained from the UCSC Xena database and analyzed via R software. Overall survival (OS), disease-specific survival (DSS), disease-free interval (DFI), and progression-free interval (PFI) were analyzed to indicate patient prognosis. Cox regression analysis and Kaplan-Meier (K-M) analysis were used for forest plots and K-M plots, respectively. In addition, we analyzed LEDGF/p75 expression and clinical stages of patients across cancers.
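For readers unfamiliar with the K-M estimator, the following is a minimal pure-Python sketch of how a survival curve is computed for one patient group. It is not the authors' R code. The median split into low and high expression groups is an assumption, since the text does not state the cutoff, and both function names are hypothetical.

```python
from collections import Counter
from statistics import median


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed event.

    times  -- follow-up time per patient
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving this time.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


def split_by_median(expression, times, events):
    """Assumed stratification: 'high' if expression exceeds the median, else 'low'."""
    cutoff = median(expression)
    groups = {"low": ([], []), "high": ([], [])}
    for x, t, e in zip(expression, times, events):
        key = "high" if x > cutoff else "low"
        groups[key][0].append(t)
        groups[key][1].append(e)
    return groups


curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
print(curve)
```

Each group returned by `split_by_median` can be passed to `kaplan_meier` to obtain the two curves of a K-M plot; comparing them (for example with a log-rank test) is outside this sketch.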

2.6. Genomic alterations of LEDGF/p75 across cancers

We searched the cBioPortal database[33] for information about genomic alterations in LEDGF/p75 across cancers. We first identified the landscape of alteration frequency across cancers, including mutations, structural variants, amplifications, deep deletions and multiple alterations. Then, we searched the ratio of the alteration group in pancancer. We further obtained detailed information on copy-number alterations and mutations in LEDGF/p75. Finally, we analyzed the correlation between genomic alterations of LEDGF/p75 and patient survival.

We processed the mutation data of pancancer from the UCSC Xena database and analyzed them via R software to identify information about the tumor mutational burden (TMB) and microsatellite instability (MSI).

2.7. Clinically relevant alternative splicing analyses of LEDGF/p75

To identify clinically relevant alternative splicing (AS) events, the OncoSplicing database[34] was searched for AS events for LEDGF/p75. We chose project 247053 for subsequent analyses, which was the only known splice type in SplAdder methodology according to the OncoSplicing database. Pan plots indicate the reads in, reads out and percent spliced-in (PSI) values in pancancer and normal tissues. PanDiff plots compared the PSI differences of queried AS events (detected in more than 3 cancers) between cancers and adjacent or GTEx normal tissues. Finally, we explored the prognostic significance of LEDGF/p75 AS events across cancers via K-M plots.
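As a brief illustration of the PSI value mentioned above, the sketch below uses the common definition of PSI as the fraction of reads supporting inclusion of the event. Whether OncoSplicing computes its "reads in" and "reads out" values in exactly this way is an assumption, and the function name and read counts are hypothetical.

```python
def percent_spliced_in(reads_in, reads_out):
    """PSI = inclusion reads / (inclusion reads + exclusion reads).

    Returns None when no reads cover the event, because PSI is undefined there.
    """
    total = reads_in + reads_out
    if total == 0:
        return None
    return reads_in / total


# Hypothetical read counts: 30 reads support inclusion, 10 support exclusion.
psi = percent_spliced_in(30, 10)
print(psi)
```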

2.8. Exploration of the immunological roles of LEDGF/p75 across cancers

We first explored the association between LEDGF/p75 expression and immune subtypes across cancers via the TISIDB database. Subsequently, we analyzed detailed subtypes, including C1 (wound healing), C2 (IFN-gamma dominant), C3 (inflammatory), C4 (lymphocyte depleted), C5 (immunologically quiet), and C6 (TGF-β dominant). We visualized the distribution across subtypes for the top six tumor types with the highest LEDGF/p75 expression. Heatmaps showed Spearman correlations between immunoinhibitors, chemokines, tumor infiltrating lymphocytes (TILs), and LEDGF/p75 expression across cancers.
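The Spearman correlation used for these heatmaps is the Pearson correlation of ranks. The following is a minimal pure-Python sketch with average ranks for ties; it is not the TISIDB implementation, and the function names and example vectors are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the block of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # 1-based average rank for the tie block
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression values for two genes across four samples.
rho = spearman([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0])
print(rho)
```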

We further searched the most confident results using gene expression data in the TIDE database[35], including the cancer type, subtype, Pearson correlation with cytotoxic T lymphocyte level (CTL Cor), T-cell dysfunction score (T Dysfunction), survival risk score (Risk), survival risk score adjusted for the effect of cytotoxic T lymphocytes (Risk.adj), and sample count in the dataset (count).

We then compared LEDGF/p75 expression levels across cell lines between pre- and postcytokine-treated samples with the TISMO database[36]; IFNγ, IFNβ, TNFα, and TGFβ1 are included as cytokine treatments in the module.

Finally, we searched ROC Plotter[37] to explore the correlation between LEDGF/p75 expression and immunotherapy. Receiver operating characteristic (ROC) curves indicated the high diagnostic value of LEDGF/p75 for assessing immunotherapy outcomes.
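The area under a ROC curve (AUC) summarizes how well an expression value separates two outcome groups. The sketch below computes it through the equivalent pairwise-comparison form: the fraction of (responder, non-responder) pairs in which the responder has the higher score, with ties counted as one half. It is not the ROC Plotter implementation; the function name and the example scores are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive sample outscores a negative one.

    scores -- numeric predictor, e.g. gene expression (hypothetical input)
    labels -- 1 for the positive class (e.g. responder), 0 for the negative class
    """
    positives = [s for s, label in zip(scores, labels) if label == 1]
    negatives = [s for s, label in zip(scores, labels) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical scores: the two responders have the two highest values.
auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
print(auc)
```

An AUC of 0.5 corresponds to no discrimination, so values should be read relative to that baseline.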

2.9. Cell culture

Human kidney cancer cell lines (i.e., Caki-1, 786-O, A498) and normal human kidney epithelial cells (HK-2) were acquired from Procell Life Science & Technology Company (Wuhan, China). All cell lines used in this study were tested and authenticated by DNA sequencing using the STR method (ABI 3730XL Genetic Analyzer) and tested for the absence of mycoplasma contamination (MycoAlert). All cell lines were cultured in commercial cell culture medium at 37 ℃ in a 5% CO2 atmosphere.

2.10. RNA isolation and quantitative real-time PCR (qRT-PCR)

TRIzol Reagent (Invitrogen, USA) was used to isolate total RNA. HiScript III SuperMix (Vazyme, China) was used to perform reverse transcription. qRT-PCR was performed with a SYBR Green Kit (Yeasen, China) on the LightCycler® 96 SW 1.1 system (Roche, Switzerland) to analyze mRNA expression levels.

2.11. Western blotting

RIPA buffer (Beyotime, China) mixed with protease inhibitor (Beyotime) was used to extract total cell protein. The proteins were separated and then transferred to a polyvinylidene difluoride membrane, which was incubated in 10% milk for 2 h at room temperature (RT). Subsequently, the membrane was incubated in the primary antibody (1:1000, anti-LEDGF/p75: Abcam#ab177159; anti-H3K36me3: Cell Signaling Technology#4909S; anti-H3: Proteintech#17168-1-AP; anti-β-Actin: Proteintech#81115-1-RR) for 12 h at 4 ℃ and then treated with a matched secondary antibody (1:5000, Proteintech#SA00001-2) at RT for 2 h. Enhanced chemiluminescence (Tanon, China) was used for detection. β-Actin and histone H3 were used as endogenous controls.

2.12. siRNA transfection

The siRNA sequences used to target LEDGF/p75, which were designed and synthesized by RiboBio (Guangzhou, China), were GGAAGATACCGACCATGAA (5’-3’, KD1) and GCAGCAACTAAACAATCAA (5’-3’, KD2). The siRNAs were transfected with Lipo3000 (Invitrogen) according to the instruction manual.

2.13. Cell counting kit-8 (CCK-8) and clone formation

A total of 2000 cells were seeded into 96-well plates and cultured for the indicated times. At each time point, 10 μL of CCK-8 reagent (Yeasen) was mixed with the cells for 1 h. Optical density (OD) values were measured at 450 nm.

For the clone formation experiment, 1000 of the indicated cells per well were cultured in a 6-well plate for 10 days. Methanol was used to fix the cells for 30 min, followed by crystal violet staining for 30 min.

2.14. Transwell assay

Cell migration assays were performed with 24-well Transwell chambers without Matrigel (Corning, USA). A total of 3 × 10^4 cells suspended in 200 μL of medium without fetal bovine serum (FBS) were cultured in the upper chamber, and 600 μL of medium containing 10% FBS was added to the bottom chamber. After overnight incubation, crystal violet was used to stain the cells on the lower surface of the chamber for 30 min. Images of three random fields were acquired using a fluorescence microscope, and the cells were counted.

2.15. Gene microarray analysis

786-O cells were selected for LEDGF/p75 knockdown treatment, and three biological replicates were prepared for gene microarray detection. Agilent SurePrint G3 Human Gene Expression v3 8×60K Microarray (DesignID:072363) chip experiments and data analysis of 6 samples were performed at Shanghai Ouyi Biomedical Technology Co., Ltd. China.

Feature Extraction software version 10.7.1.1 (Agilent Technologies) was used to process original images and extract original data. The original data were then standardized. Differential genes were screened according to fold change > 1.5 and P value < 0.05. Then, GO and KEGG enrichment analyses of differentially expressed genes were performed to determine biological functions and pathways.
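The screening rule above can be sketched as a simple filter. The text does not say whether the fold-change cutoff is applied in both directions; the sketch assumes it is symmetric (up by more than 1.5-fold or down by more than 1.5-fold), which is an assumption. The function name, gene labels, and values are hypothetical.

```python
import math


def is_differential(fold_change, p_value, fc_cutoff=1.5, p_cutoff=0.05):
    """Apply the fold change > 1.5 and P < 0.05 rule.

    fold_change -- expression ratio (knockdown / control), must be positive
    Assumption: the cutoff is symmetric, so a ratio below 1/1.5 also passes.
    """
    if fold_change <= 0:
        raise ValueError("fold change must be a positive ratio")
    exceeds_fc = abs(math.log2(fold_change)) > math.log2(fc_cutoff)
    return exceeds_fc and p_value < p_cutoff


# Hypothetical genes: (fold change, P value).
genes = {
    "GENE_A": (2.0, 0.01),   # up-regulated and significant
    "GENE_B": (0.5, 0.01),   # down-regulated and significant
    "GENE_C": (1.2, 0.001),  # significant but change too small
    "GENE_D": (3.0, 0.20),   # large change but not significant
}
selected = sorted(g for g, (fc, p) in genes.items() if is_differential(fc, p))
print(selected)
```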

2.16. Online databases

Information on all online databases used in the present study can be found in Table 2.

2.17. Statistical analyses

All bioinformatics analyses were conducted via R software (version 4.2.2), except for the results obtained from the online databases mentioned in this study. Independent t tests were used to compare normally distributed continuous variables and Mann-Whitney U tests to compare skewed continuous variables. All statistical tests were two-sided. P values less than 0.05 (* P < 0.05) were considered significant.

3. Results

3.1. Biological information of LEDGF/p75

PSIP1 is a protein-coding gene located on the short arm of chromosome 9 (Supplementary Fig. 1A). LEDGF/p75, encoded by PSIP1, has two functional domains: the PWWP domain (aa 1-91) and the IBD (aa 347-454) (Supplementary Fig. 1B-C)[38]. The PWWP domain reads the H3K36me3 mark, which is an important regulatory mode in epigenetics[39]. For the IBD, a protein binding hub, previous studies have reported several interacting proteins, such as HIV integrase 1 and 2, MENIN-MLL, CDCA7L, PogZ, CDC7-DBF4, and IWS1 [38].

We further explored localization of LEDGF/p75 in U-251MG cells and found that almost all of the protein was in the nucleus, which was consistent with its chromatin recognition function (Supplementary Fig. 1D).

3.2. LEDGF/p75 is involved in several diseases, especially cancer

We first focused on protein interactions of LEDGF/p75 and predicted proteins that physically bind to it via online tools. As displayed in Fig. 1A, LEDGF/p75 is mainly involved in viral infections and DNA repair, such as the viral life cycle, DNA repair complex and DNA binding. In addition, previous experiments confirmed that LEDGF/p75 binds to histones and histone-modification proteins (Fig. 1B). Finally, we predicted the scores of LEDGF/p75 in cancer and other diseases based on protein-protein interactions, which highlighted the importance of LEDGF/p75 in cancer (Fig. 1C). As LEDGF/p75 was reported as a reader of histone modification marks such as H3K36me3, we further explored the correlation between LEDGF/p75 expression and several common histone modification-related genes. As shown in Fig. 1D, SETD2, a classic writer of H3K36me3, showed a strong positive correlation with LEDGF/p75 expression in all 33 tumor types. Interestingly, apart from the previously reported lysine methylation factors (SETD2, KMT2A, KMT2B), our results showed that LEDGF/p75 expression also correlated with arginine methylation factors (PRMT1, PRMT5) in some tumor types, suggesting a potential biological role.

Fig. 1. LEDGF/p75 plays a vital role in tumors and other diseases. (A) Physical interactions of LEDGF/p75 and other key proteins indicate potential functions of LEDGF/p75. (B) LEDGF/p75 binds to histones and several histone modification proteins. The purple lines represent experimentally determined interactions, and the blue lines show the results from curated databases. (C) Predicted scores of LEDGF/p75 in cancers and other diseases based on protein-protein interactions. Different colors represent different rating levels of the canSAR database. (D) Correlations between LEDGF/p75 expression and 5 histone modification genes in pancancer. (E-G) Results of GSEA-KEGG analyses. * P < 0.05, ** P < 0.01, *** P < 0.001.

GO analyses, including biological process, cellular component, and molecular function, indicated vital roles for LEDGF/p75 in both HIV infection and cancers (Supplementary Table 1). For instance, LEDGF/p75 functions in viral latency, response to oxidative stress, and chromatin binding. The Reactome pathway of LEDGF/p75 illustrates the details of LEDGF/p75 in HIV integration and the viral life cycle (Supplementary Table 2). Then, we performed GSEA-KEGG analyses to further explore LEDGF/p75 function. LEDGF/p75 expression correlated with regulation of autophagy in GBM and OV and with cell adhesion molecules in PRAD, which indicates the potential function of LEDGF/p75 in tumor metastasis and development (Fig. 1E-G). All TCGA abbreviations are provided in Table 1.

3.3. LEDGF/p75 is differentially expressed across cancers

We analyzed LEDGF/p75 expression data from the UCSC Xena database and found LEDGF/p75 mRNA to be significantly differentially expressed in 15 of 21 tumor types. Among the 15 types, LEDGF/p75 was downregulated in 11 types (BLCA, BRCA, KICH, KIRC, KIRP, LUAD, LUSC, PRAD, READ, THCA, and UCEC) and highly expressed in another 4 (CHOL, HNSC, LIHC, and PCPG) (Fig. 2A). We supplemented data from GTEx normal tissues and performed the analyses for another 12 tumor types. We found LEDGF/p75 to be highly expressed in DLBC and THYM but reduced in OV (Fig. 2B-D).

Next, we explored LEDGF/p75 expression in various cell lines and found LEDGF/p75 to be differentially expressed in different cell lines (Fig. 2E). The top two cell lines with the highest LEDGF/p75 expression were HEL and NTERA-2, and the expression level of HEL cells was more than twice that of NTERA-2 cells. HEL, a cancer cell line derived from myeloid cells, is an erythroleukemia cell line (AML M6 in relapse after treatment for Hodgkin’s disease). Such high expression of LEDGF/p75 in HEL cells suggests that it may play a role in erythroleukemia. Previous studies have reported that LEDGF/p75 is essential for MLL-rearranged leukemogenesis[40]. Whether there is a deeper connection between the two diseases other than both being blood cancers and whether this connection is related to LEDGF/p75 remains unclear.

Then, we analyzed LEDGF/p75 protein expression across cancers via online tools. LEDGF/p75 was significantly reduced in 5 tumor types (BRCA, COAD, HNSC, LUAD, and UCEC) but markedly increased in another 2 (LIHC and OV) (Fig. 2F-L). As expected, immunohistochemical results from the ATLAS database showed trends similar to the above results.

These findings indicate significant differences in LEDGF/p75 expression across cancers. After integrated analysis of differences in LEDGF/p75 mRNA and protein expression, we found the same trend for LIHC, BRCA, LUAD and UCEC. Notably, LEDGF/p75 expression was opposite at the mRNA and protein levels in HNSC, which requires experimental verification.

Table 1 Abbreviations of 33 tumors.

| Abbreviation | Full name |
| --- | --- |
| ACC | Adrenocortical Carcinoma |
| BLCA | Bladder Urothelial Carcinoma |
| BRCA | Breast Invasive Carcinoma |
| CESC | Cervical Squamous Cell Carcinoma |
| CHOL | Cholangiocarcinoma |
| COAD | Colon Adenocarcinoma |
| DLBC | Diffuse Large B-cell Lymphoma |
| ESCA | Esophageal Carcinoma |
| GBM | Glioblastoma Multiforme |
| HNSC | Head and Neck Squamous Cell Carcinoma |
| KICH | Kidney Chromophobe |
| KIRC | Kidney Renal Clear Cell Carcinoma |
| KIRP | Kidney Renal Papillary Cell Carcinoma |
| LAML | Acute Myeloid Leukemia |
| LGG | Lower Grade Glioma |
| LIHC | Liver Hepatocellular Carcinoma |
| LUAD | Lung Adenocarcinoma |
| LUSC | Lung Squamous Cell Carcinoma |
| MESO | Mesothelioma |
| OV | Ovarian Serous Cystadenocarcinoma |
| PAAD | Pancreatic Adenocarcinoma |
| PCPG | Pheochromocytoma and Paraganglioma |
| PRAD | Prostate Adenocarcinoma |
| READ | Rectum Adenocarcinoma |
| SARC | Sarcoma |
| SKCM | Skin Cutaneous Melanoma |
| STAD | Stomach Adenocarcinoma |
| TGCT | Testicular Germ Cell Tumors |
| THCA | Thyroid Carcinoma |
| THYM | Thymoma |
| UCEC | Uterine Corpus Endometrial Carcinoma |
| UCS | Uterine Carcinosarcoma |
| UVM | Uveal Melanoma |

Table 2 Information of databases used in the present study.

| Database | Online link |
| --- | --- |
| GeneCards | https://www.genecards.org/ |
| UniProt | https://www.uniprot.org/ |
| AlphaFold | https://alphafold.ebi.ac.uk/ |
| ATLAS | https://www.proteinatlas.org/ |
| GeneMANIA | http://genemania.org/ |
| STRING | https://cn.string-db.org/ |
| canSAR | https://cansar.ai/ |
| TISIDB | http://cis.hku.hk/TISIDB/index.php |
| KEGG pathway | http://www.gsea-msigdb.org/gsea/msigdb/human/collections.jsp |
| UCSC Xena | https://xena.ucsc.edu/ |
| GEPIA | http://gepia.cancer-pku.cn/index.html |
| UALCAN | http://ualcan.path.uab.edu/index.html |
| cBioPortal | https://www.cbioportal.org/ |
| OncoSplicing | http://www.oncosplicing.com/ |
| TIDE | http://tide.dfci.harvard.edu/ |
| TISMO | http://tismo.cistrome.org/ |
| ROC Plotter | https://www.rocplot.org/ |
| DepMap portal | https://depmap.org/portal/ |

3.4. LEDGF/p75 expression correlates with patient prognosis

To investigate the correlation between LEDGF/p75 expression and patient prognosis, we performed comprehensive analysis of expression and clinical data from the UCSC Xena database. Cox regression and K-M analyses were used to explore OS, DSS, DFI and PFI. As shown in the forest plot, LEDGF/p75 expression correlated negatively with OS in 4 tumor types (ACC, KICH, LIHC, and UCEC) but positively in 5 (CESC, KIRC, LGG, PAAD, and SKCM) (Fig. 3A). Regarding K-M plots, there was a negative correlation between LEDGF/p75 expression and OS in ACC but a positive correlation in 3 other tumor types (KIRC, LGG, and OV) (Fig. 3B-E). The results for DSS, DFI and PFI are shown in Supplementary Fig. 2 and Supplementary Fig. 3.

Furthermore, we analyzed the correlation between LEDGF/p75 expression and the clinical stage of patients. The results indicated that patients with advanced ACC generally expressed high levels of LEDGF/p75; this phenomenon was reversed in patients with PAAD, SKCM, THCA, or KIRC (Fig. 3F-L). All these analyses indicate that LEDGF/p75 expression is closely associated with patient prognosis across cancers. Specifically, LEDGF/p75 is a potential oncogene in ACC, KICH and LIHC but a potential protective gene in PAAD and SKCM.
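The reading of the forest plot used in this section can be made explicit with a small sketch: a hazard ratio above 1 with P < 0.05 marks a potential risk factor, and a hazard ratio below 1 with P < 0.05 marks a potential protective factor. The P values and hazard ratios below are copied from the Fig. 3A data for a subset of tumor types; the function name is hypothetical, and entries reported as "<0.001" are represented by the bound 0.001.

```python
def classify_hazard(hazard_ratio, p_value, alpha=0.05):
    """Label a Cox regression result as risk, protective, or not significant."""
    if p_value >= alpha:
        return "not significant"
    return "risk" if hazard_ratio > 1.0 else "protective"


# (P value, hazard ratio) per tumor type, taken from the Fig. 3A data.
forest_rows = {
    "ACC": (0.007, 2.252),
    "CESC": (0.019, 0.672),
    "KICH": (0.027, 3.031),
    "KIRC": (0.001, 0.601),  # reported as <0.001; 0.001 is used as an upper bound
    "LGG": (0.001, 0.329),   # reported as <0.001; 0.001 is used as an upper bound
    "LIHC": (0.034, 1.259),
    "OV": (0.053, 0.830),
    "PAAD": (0.019, 0.662),
    "SKCM": (0.023, 0.826),
    "UCEC": (0.049, 1.303),
}
labels = {cancer: classify_hazard(hr, p) for cancer, (p, hr) in forest_rows.items()}
risk = sorted(c for c, label in labels.items() if label == "risk")
protective = sorted(c for c, label in labels.items() if label == "protective")
print(risk, protective)
```

Under this rule OV falls just outside significance in the Cox analysis (P = 0.053), which is consistent with it appearing only among the K-M results.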

3.5. LEDGF/p75 genomic alterations in pancancer and correlation with patient prognosis

As genomic alterations may cause tumorigenesis, we explored LEDGF/p75 gene alterations in the cBioPortal database, which is a multidimensional cancer genomics dataset. We illustrate the landscape of alteration frequency in pancancer in Supplementary Fig. 4A, including mutation, structural variant, amplification, deep deletion, and multiple alterations. We then studied the percentage of altered groups in pancancer. Notably, the LEDGF/p75 gene was altered in nearly 50% of patients with 3 tumor types (COAD_POLE, ESCA_POLE, and UCEC_POLE) (Supplementary Fig. 4B). Putative copy number alterations from GISTIC showed that diploid and shallow deletions were the top two most common types (Fig. 4A). LEDGF/p75 mutation occurred at a low frequency (Fig. 4B), and specific mutation sites are shown in Fig. 4C.

Subsequently, we analyzed the TMB and MSI of LEDGF/p75 across cancers. The TMB results showed that LEDGF/p75 correlated negatively with 8 tumor types (KIRC, KIRP, LGG, PAAD, PRAD, THCA, UVM, and

Fig. 2. Differential expression of LEDGF/p75 in normal and tumor groups. (A) Differential expression of LEDGF/p75 across cancers from the TCGA database. (B-D) Box plots of LEDGF/p75 expression in tumor and normal (matched TCGA normal and GTEx data) groups from the GEPIA database. (E) Differential expression of LEDGF/p75 in different cells (classified according to tissue of origin). The cells of normal tissues are marked in black, and the tumor cells are marked in red. (F-L) Box plots and immunohistochemical diagrams show the differential expression of LEDGF/p75 in 7 tumor types. * P < 0.05, ** P < 0.01, *** P < 0.001.
| Cancer type | P value | Hazard ratio |
| --- | --- | --- |
| ACC | 0.007 | 2.252 (1.249-4.061) |
| BLCA | 0.728 | 0.972 (0.828-1.141) |
| BRCA | 0.720 | 0.963 (0.781-1.186) |
| CESC | 0.019 | 0.672 (0.482-0.937) |
| CHOL | 0.739 | 0.863 (0.362-2.055) |
| COAD | 0.891 | 0.980 (0.729-1.316) |
| DLBC | 0.362 | 1.740 (0.528-5.735) |
| ESCA | 0.400 | 1.183 (0.800-1.751) |
| GBM | 0.530 | 0.923 (0.718-1.186) |
| HNSC | 0.664 | 0.966 (0.825-1.130) |
| KICH | 0.027 | 3.031 (1.137-8.077) |
| KIRC | <0.001 | 0.601 (0.453-0.798) |
| KIRP | 0.375 | 1.299 (0.729-2.317) |
| LAML | 0.805 | 0.935 (0.546-1.600) |
| LGG | <0.001 | 0.329 (0.223-0.486) |
| LIHC | 0.034 | 1.259 (1.017-1.559) |
| LUAD | 0.616 | 0.943 (0.750-1.186) |
| LUSC | 0.285 | 0.906 (0.756-1.086) |
| MESO | 0.293 | 1.297 (0.799-2.105) |
| OV | 0.053 | 0.830 (0.688-1.003) |
| PAAD | 0.019 | 0.662 (0.469-0.934) |
| PCPG | 0.549 | 0.722 (0.249-2.096) |
| PRAD | 0.073 | 3.835 (0.883-16.650) |
| READ | 0.424 | 0.769 (0.404-1.465) |
| SARC | 0.607 | 1.057 (0.857-1.303) |
| SKCM | 0.023 | 0.826 (0.700-0.974) |
| STAD | 0.356 | 0.875 (0.658-1.162) |
| TGCT | 0.893 | 1.154 (0.143-9.329) |
| THCA | 0.347 | 0.529 (0.140-1.997) |
| THYM | 0.190 | 0.651 (0.342-1.238) |
| UCEC | 0.049 | 1.303 (1.001-1.697) |
| UCS | 0.535 | 1.190 (0.687-2.061) |
| UVM | 0.873 | 1.043 (0.624-1.742) |

Fig. 3. Differential expression of LEDGF/p75 correlates with patient prognosis. (A) The forest plot shows correlations between LEDGF/p75 expression and the overall survival of patients. (B-E) K-M plots show correlations between LEDGF/p75 expression and the overall survival of patients. (F-L) Box plots show correlations between LEDGF/p75 expression and different stages of patients. Red indicates LEDGF/p75 as a potential oncogene, and blue indicates LEDGF/p75 as a potential protective gene.

[Fig. 3 panel residue. (B-E) Kaplan-Meier curves of overall survival (y-axis: survival probability; x-axis: time in years) for low vs. high LEDGF/p75 expression: ACC, HR = 2.39 (1.10-5.19), P = 0.028; KIRC, HR = 0.61 (0.45-0.83), P = 0.002; LGG, HR = 0.51 (0.36-0.72), P < 0.001; OV, HR = 0.77 (0.59-0.99), P = 0.043. (F-L) Box plots of LEDGF/p75 expression by stage (Stage I to Stage IV) for ACC, BLCA, ESCA, PAAD, SKCM, THCA, and KIRC.]

Fig. 4. Genomic instability of LEDGF/p75 is associated with patient prognosis. (A) Putative copy-number alterations of LEDGF/p75. (B) Mutations in LEDGF/p75. (C) Schematic diagram of LEDGF/p75 mutation sites. (D) The tumor mutation burden (TMB) of LEDGF/p75 across cancers. (E) Microsatellite instability (MSI) of LEDGF/p75 across cancers. (F) The heatmap shows correlations between LEDGF/p75 expression and 5 MMR genes in pancancer. (G-K) K-M plots show that alterations in LEDGF/p75 are associated with the overall survival of patients. *P < 0.05, **P < 0.01, ***P < 0.001.

[Fig. 4 panel residue. (A) LEDGF/p75 putative copy-number alterations from GISTIC (amplification, gain, diploid, shallow deletion, deep deletion). (B) LEDGF/p75 mutations (missense, inframe, truncating, splice, multiple, no mutation, not profiled). (C) Mutation sites along the 530-aa protein, showing the PWWP and LEDGF domains, E278del, exons, and dbPTM post-translational modifications (phosphorylation, acetylation, methylation, glutathionylation, malonylation, sumoylation). (D, E) TMB and MSI correlation plots across cancer types. (F) Heatmap of correlations (Cor, P value) with MLH1, MSH2, MSH6, PMS2, and EPCAM. (G-K) Kaplan-Meier curves for altered vs. unaltered groups (y-axis: probability of overall survival; x-axis: overall survival in months) in pan-cancer, TGCT, BRCA, HNSC, and LIHC, each with a log-rank test P-value.]

THYM) but positively with 8 types (ACC, BLCA, COAD, LAML, LUAD, READ, SKCM, and UCS) (Fig. 4D). The MSI results showed a positive correlation between LEDGF/p75 and 7 tumor types (ACC, BRCA, COAD, LUAD, READ, STAD, and UCEC) but a negative correlation only with SKCM (Fig. 4E). Next, we investigated the potential function of LEDGF/p75 in DNA mismatch repair (MMR). As displayed in Fig. 4F, LEDGF/p75 expression correlated highly with MMR genes across cancers, which is consistent with a previous report that LEDGF/p75 is involved in DNA damage repair[41].

To further explore the clinical value of LEDGF/p75 gene alterations across cancers, we analyzed their association with patient OS. Our pancancer analysis showed that patients in the altered group had significantly shorter OS (Fig. 4G). We then performed a separate analysis of 33 tumor types and found that LEDGF/p75 gene alterations predicted poor patient prognosis in TGCT, BRCA, HNSC, and LIHC (Fig. 4H-K). In conclusion, LEDGF/p75 genomic instability is widespread across cancers and suggests poor prognosis.

3.6. Pancancer view of LEDGF/p75 alternative splicing and correlation with patient survival

Alternative splicing (AS) regulates the generation of multiple mRNA and protein products from a single gene. AS plays a crucial role in cancer progression, and cancer cells show general as well as cancer-type-specific and subtype-specific splicing changes that may have prognostic value and contribute to cancer development and progression[42]. We chose the item PSIP1_alt_3prime_247053 for subsequent analyses in the OncoSplicing database because it is the only known splice type in SplAdder, a bioinformatics tool for the analysis and quantification of alternative splicing events in RNA sequencing data. The read-in, read-out and PSI values are shown in Fig. 5A, and there were significant differences in LEDGF/p75 AS between tumor and normal tissues. We then visualized the PSI difference between tumor and adjacent normal tissues (Fig. 5B), in which LUSC ranked first. However, the top three changed to LGG, GBM and TGCT when we compared the PSI difference between tumor and GTEx normal tissues (Fig. 5C).
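For readers less familiar with the metric, PSI is conventionally the fraction of reads that support inclusion of the alternative segment among all reads informative for the event. The sketch below only illustrates that arithmetic and the tumor-minus-normal difference plotted in Fig. 5B-C. The function names and read counts are hypothetical, and this is not the OncoSplicing or SplAdder implementation.

```python
def psi(reads_in, reads_out):
    """Percent spliced in: inclusion reads over all informative reads."""
    total = reads_in + reads_out
    if total == 0:
        raise ValueError("no informative reads for this event")
    return reads_in / total


def psi_difference(tumor, normal):
    """PSI(tumor) - PSI(normal); each argument is (reads_in, reads_out)."""
    return psi(*tumor) - psi(*normal)


# Hypothetical counts, chosen only to illustrate the arithmetic.
example_tumor = (30, 270)    # 30 of 300 informative reads are "in"
example_normal = (60, 240)   # 60 of 300 informative reads are "in"
example_delta = psi_difference(example_tumor, example_normal)
```

A negative difference, as in this toy case, would mean lower inclusion in tumor than in normal tissue.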

We performed K-M analysis to explore the clinical value of LEDGF/p75 AS. Consistent with our predictions, LEDGF/p75 AS was associated with differences in patient prognosis in several tumor types. K-M curves of OS, DSS and PFI showed that a high LEDGF/p75 PSI indicated good prognosis in patients with SKCM (Fig. 5D-F). The same trend was also observed in patients with THCA and CESC (Fig. 5H-I). However, the opposite trend was observed in patients with COAD and LUAD (Fig. 5G, J-K). All of the above results imply the biological importance of LEDGF/p75 AS events across cancers.
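The panels in Fig. 5D-K are labeled with a median PSI cutoff. A minimal sketch of that kind of analysis is given below, assuming patients are split at the median PSI (values equal to the median go to the low group) and survival is then estimated with the Kaplan-Meier product-limit formula. The helper names and the toy inputs in the usage note are hypothetical; the database's own procedure may differ.

```python
from statistics import median


def split_by_median(psi_values):
    """Return (cutoff, low_idx, high_idx); 'high' means PSI > median."""
    cutoff = median(psi_values)
    low = [i for i, v in enumerate(psi_values) if v <= cutoff]
    high = [i for i, v in enumerate(psi_values) if v > cutoff]
    return cutoff, low, high


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  -- follow-up time per patient
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve
```

For example, `kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])` steps down at times 1, 3 and 4, while the patient censored at time 2 only shrinks the risk set. Comparing the two groups would additionally need a log-rank test, which is not shown here.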

3.7. LEDGF/p75 is involved in cancer immune infiltration and immunotherapy

To further explore the immunomodulatory effects of LEDGF/p75, we analyzed correlations between immune cells, stromal cells, 22 immune cells, immune checkpoints and LEDGF/p75 expression across cancers (Supplementary Fig. 6); the results showed a tight association between LEDGF/p75 and tumor immunity. Subsequently, we explored whether LEDGF/p75 is differentially expressed in diverse cancer immune subtypes via the TISIDB database. Fig. 6A shows that LEDGF/p75 was significantly associated with immune subtypes in several tumors, and the top six are presented in Supplementary Fig. 5D-I. The detailed subtypes include C1 (wound healing), C2 (IFN-gamma dominant), C3 (inflammatory), C4 (lymphocyte depleted), C5 (immunologically quiet), and C6 (TGF-β dominant). In addition, we analyzed the association between immunoinhibitors, chemokines, TILs and LEDGF/p75 expression (Supplementary Fig. 5A-C). As visualized in heatmaps, LEDGF/p75 expression correlated with several immunoinhibitors (CTLA4, IL10, PDCD1, etc.), chemokines (CXCL9, 10, 11, etc.) and TILs (activated CD4, Th2, CD56, etc.) in pancancer.

We searched the TIDE database to evaluate multiple published transcriptomic biomarkers to predict patient response. We list the most confident results about the correlation between LEDGF/p75 expression and CTLs, T dysfunction, and risks in Fig. 6C. The results showed a positive correlation between LEDGF/p75 expression and CTLs in breast cancer but a negative correlation in brain cancer. Moreover, high LEDGF/p75 expression indicated short overall survival for endometrial cancer patients.

We then explored the correlation between LEDGF/p75 and immunotherapy. We compared LEDGF/p75 expression levels across cell lines before and after cytokine treatment via the TISMO database, and the box plot is presented in Fig. 6B. Finally, we searched the ROC Plotter database to investigate the correlation between LEDGF/p75 expression and immunotherapy efficacy. Interestingly, high LEDGF/p75 expression indicated effective immunotherapy results in ESCA PD-L1, STAD PD-1, SKCM CTLA-4, and SKCM PD-1 (Fig. 6D-G). Notably, the AUC was higher than 0.65 for ESCA PD-L1 and STAD PD-1, illustrating the high value of the prediction model. In summary, LEDGF/p75 is involved in cancer immune infiltration and immunotherapy, which might guide personalized treatment of tumor patients.
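To make the AUC figures easier to interpret: the area under the ROC curve equals the probability that a randomly chosen responder has higher expression than a randomly chosen non-responder, with ties counted as one half. The sketch below computes that quantity directly, together with the true-positive and true-negative rates at a given cutoff. It is an illustration with hypothetical function names, not the ROC Plotter implementation.

```python
def roc_auc(responders, non_responders):
    """AUC as the probability that a responder outranks a non-responder.

    Ties count one half (Mann-Whitney U statistic divided by n1 * n2).
    """
    if not responders or not non_responders:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for r in responders:
        for n in non_responders:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5
    return wins / (len(responders) * len(non_responders))


def rates_at_cutoff(responders, non_responders, cutoff):
    """TPR and TNR when 'expression > cutoff' is taken to predict response."""
    tpr = sum(1 for r in responders if r > cutoff) / len(responders)
    tnr = sum(1 for n in non_responders if n <= cutoff) / len(non_responders)
    return tpr, tnr
```

An AUC of 0.5 corresponds to no discrimination between the two groups and 1.0 to perfect separation, which is the scale on which the reported values should be read.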

3.8. LEDGF/p75 is highly expressed in ccRCC cells and significantly promotes proliferation and metastatic ability

Renal cell carcinoma includes more than 10 histological and molecular subtypes, of which ccRCC is the most common and accounts for the majority of cancer-related deaths[17]. Reduction or even deletion of H3K36me3 occurs due to the high mutation rate of SETD2 in patients with advanced ccRCC[43]. This phenomenon naturally stratifies ccRCC patients into subgroups; thus, the study of LEDGF/p75, the reader of H3K36me3, is of great significance. Although studies have reported the vital function of LEDGF/p75 in prostate cancer, leukemia and other kinds of tumors [44-46], there has been no study related to kidney cancer. Therefore, we are the first to conduct preliminary functional exploration of LEDGF/p75 in ccRCC.

Considering the presence of SETD2 mutations in ccRCC cell lines, we searched the DepMap portal database for relevant information. The results revealed SETD2 mutations in A498 and Caki-1 cells: p.V2536fs and p.R400R, respectively. To determine expression of H3K36me3 in ccRCC cell lines, 786-O, A498 and Caki-1 cells were selected for western blotting. As shown in Fig. 7A, H3K36me3 was highly expressed in 786-O and Caki-1 cells but absent in A498 cells, as predicted by the database. For LEDGF/p75, all three ccRCC cell lines expressed higher levels than HK-2 cells. Therefore, we chose 786-O and Caki-1 cells for further studies.

To detect the impact of LEDGF/p75 on ccRCC cell characteristics, we first attempted to knock down LEDGF/p75. Both siRNAs tested achieved > 50% knockdown of the expression level of LEDGF/p75 (Fig. 7B-C). When LEDGF/p75 was knocked down, proliferation and migration ability of ccRCC cells were significantly reduced (Fig. 7D-G), which indicated that LEDGF/p75 is a potential oncogene in ccRCC.
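The knockdown efficiency above was measured by qRT-PCR (Fig. 7B), but this section does not state how relative expression was calculated. A common choice is the 2^(-ΔΔCt) method, and the sketch below assumes it. The reference gene and the Ct values in the usage note are hypothetical and are not taken from the study.

```python
def relative_expression(ct_target_kd, ct_ref_kd, ct_target_nc, ct_ref_nc):
    """Relative expression of knockdown vs. negative control via 2^(-ddCt).

    Each Ct is first normalized to a reference gene in the same sample
    (dCt), then the knockdown dCt is compared with the control dCt (ddCt).
    """
    delta_kd = ct_target_kd - ct_ref_kd
    delta_nc = ct_target_nc - ct_ref_nc
    return 2.0 ** (-(delta_kd - delta_nc))


def knockdown_percent(rel_expr):
    """Percentage reduction relative to the negative control."""
    return (1.0 - rel_expr) * 100.0
```

With invented Ct values of 26.0 (target) and 18.0 (reference) in the knockdown sample and 24.5 and 18.0 in the control, ΔΔCt is 1.5 and the relative expression is 2^(-1.5), i.e. a knockdown above the 50% threshold mentioned in the text.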

To further explore the role of LEDGF/p75 in ccRCC, we knocked down LEDGF/p75 in 786-O cells and performed gene microarray analysis (Fig. 8A). After LEDGF/p75 was knocked down, 655 genes were upregulated and 512 downregulated (Fig. 8B, Supplementary Table 3). Among them, the most downregulated protein-coding gene was ERO1L. We performed GO and KEGG analyses for all regulated genes. Consistent with our expectations, the transcriptional activity of cells was significantly changed after LEDGF/p75 knockdown (Figs. 8C, 8E). As H3K36me3 is a marker of active transcription[14], knockdown of LEDGF/p75, a reader of H3K36me3, is likely to cause transcriptional inhibition of some downstream genes. Therefore, we reperformed the GO and KEGG analyses of all downregulated genes.
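The up/down counts above come from thresholding each gene on fold change and P value. The volcano plot in Fig. 8B is labeled with "Fold change = 1.50", "Fold change = -1.50" and "pValue = 0.05"; the sketch below assumes these are a linear fold change of 1.5 in either direction (so |log2FC| >= log2(1.5)) and a raw P value cutoff. That reading of the labels, the function names, and the gene tuples in the usage note are assumptions, not the authors' pipeline.

```python
import math

FOLD_CHANGE_CUTOFF = 1.5   # assumed linear fold change, read from the Fig. 8B labels
P_VALUE_CUTOFF = 0.05      # assumed raw P value cutoff, read from the Fig. 8B labels


def classify(log2_fc, p_value):
    """Label one gene as 'up', 'down', or 'ns' (not significant)."""
    if p_value >= P_VALUE_CUTOFF:
        return "ns"
    threshold = math.log2(FOLD_CHANGE_CUTOFF)
    if log2_fc >= threshold:
        return "up"
    if log2_fc <= -threshold:
        return "down"
    return "ns"


def summarize(genes):
    """genes: iterable of (name, log2_fc, p_value). Returns counts per label."""
    counts = {"up": 0, "down": 0, "ns": 0}
    for _name, log2_fc, p_value in genes:
        counts[classify(log2_fc, p_value)] += 1
    return counts
```

Calling `summarize` on a list such as `[("geneA", 1.0, 0.01), ("geneB", -1.0, 0.01), ("geneC", 0.1, 0.5)]` (placeholder names) returns one gene per category; the downregulated subset is what would then be passed to the second round of GO and KEGG analysis.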

[Fig. 5 panel residue. (A) PSI value, Reads-Out, and Reads-In for each TCGA tumor (T) group, adjacent normal (N) group, and GTEx tissue. (B, C) PSI difference (Tumor-Normal and Tumor-GTEx) plotted against -log10(FDR), with dot size showing tumor PSI. (D-K) Kaplan-Meier curves (x-axis: years) for overall survival, disease-specific survival, progression-free interval, and disease-free interval, split at the median PSI cutoff, each with a log-rank p value.]

Fig. 5. LEDGF/p75 alternative splicing correlates with patient prognosis. (A) Read-in, read-out, and percent spliced in (PSI) values of LEDGF/p75 in pancancer and normal tissues. The red and gray labels represent cancers and adjacent normal tissues, respectively; black labels represent normal tissues. The parts labeled with “Reads-In” and “Reads-Out” on the Y-axis represent read count values, indicating exon splicing in or splicing out, respectively. (B-C) PSI differences between tumor and adjacent normal tissues and between tumor and GTEx normal tissues. The red line refers to 0.05, the dot size represents the tumor PSI value, and different cancers are marked in different colors. (D-K) The PSI value of LEDGF/p75 correlates with patient prognosis in several kinds of cancer.

[Fig. 6 panel residue. (A) Kruskal-Wallis test (-log10 p value) of LEDGF/p75 across immune subtypes for each TCGA cancer type. (B) Box plots of LEDGF/p75 log(TPM) at baseline and after IFNb, IFNg, TGFb1, or TNFa treatment in TISMO cell-line datasets (4T1, B16, CT26, E0771, EMT6, KPC, LLC, MC38, MOC1, MOC2, MOC22, Panc02, Renca).]

Fig. 6. LEDGF/p75 is involved in cancer immunity. (A) Correlations between LEDGF/p75 and immune subtypes were obtained from the TISIDB database. (B) The box plot retrieved from the TISMO database shows LEDGF/p75 expression levels across cell lines between pre- and postcytokine-treated samples. (C) The table shows the correlation between LEDGF/p75 expression and CTLs, T dysfunction, and risks. The bottom graphs show detailed information on corresponding data in the table. (D-G) Box plots show differential expression of LEDGF/p75 in the indicated groups. ROC curves illustrate the feasibility of LEDGF/p75 expression as an indicator of the effectiveness of immunotherapy. *P < 0.05, **P < 0.01, ***P < 0.001.


Cancer        Subtype         CTL Cor   T Dysfunction   Risk     Risk.adj   Count
Breast        TN              0.245     0.387           0.380    0.962      233
Melanoma      Metastatic      -0.004    -1.091          -0.456   -0.471     317
Endometrial   (blank)         -0.076    1.663           2.213    2.118      541
Leukemia      AML             -0.128    -0.235          0.691    0.477      79
Brain         Neuroblastoma   -0.202    -0.856          0.957    0.697      389

[Fig. 6 panel residue. (C, bottom) Scatter plots of LEDGF/p75 against CTL (r = 0.245, p = 0.000159; r = -0.202, p = 6.2e-05) and a survival plot (y-axis: survival fraction; x-axis: OS in months; LEDGF/p75 Top n = 11, Bottom n = 530; continuous z = 2.21, p = 0.0269). (D-G) Box plots of LEDGF/p75 gene expression in non-responders vs. responders and ROC curves (true positive rate vs. false positive rate) for ESCA PD-L1 (AUC 0.68), STAD PD-1 (AUC 0.755), SKCM CTLA-4 (AUC 0.596), and SKCM PD-1 (AUC 0.582).]

Fig. 7. LEDGF/p75 significantly promotes the proliferation and metastatic ability of kidney cancer cells. (A) Expression of LEDGF/p75 and H3K36me3 in kidney cancer cell lines. (B-C) qRT-PCR and western blot experiments show the high efficiency of LEDGF/p75 knockdown. (D-F) CCK-8 and colony formation assays show that knockdown of LEDGF/p75 significantly inhibits proliferation of kidney cancer cells. (G) Transwell experiments show that knockdown of LEDGF/p75 significantly inhibits metastasis of kidney cancer cells.

[Fig. 7 panel residue. (A) Western blot of HK-2, 786-O, A498, and Caki-1 cells with bands labeled LEDGF/p75, p52, H3K36me3, β-Actin, and H3. (B-C) Relative LEDGF/p75 expression and western blot in NC, KD1, and KD2 groups. (D-F) CCK-8 curves (y-axis: OD value at 450 nm; x-axis: time, 0-96 h) and colony numbers for 786-O and Caki-1 cells. (G) Migration cell numbers for 786-O and Caki-1 cells in NC, KD1, and KD2 groups.]

Interestingly, the results of GO analysis showed that after LEDGF/p75 knockdown, the activity of several functional proteins, such as the Wnt protein, was changed (Fig. 8D). The p53 signaling pathway was also affected by changes in LEDGF/p75 (Fig. 8F).

4. Discussion

Studies to date on LEDGF/p75 have mainly focused on HIV and MLL diseases, but there are few studies on its role in tumors. In fact, as a reader of histone modification marks, LEDGF/p75 mediates chromatin binding of many nuclear proteins, thus playing an important biological function in tumors[14]. As a main reader of H3K36me3, its potential role in cancer is worthy of further study. Here, for the first time, we comprehensively analyze the LEDGF/p75 landscape across cancers using multiple online databases, in vitro experiments and gene microarray sequencing. The present study detailed the basic information, clinical significance, genomic instability, alternative splicing, and cancer immunity related to LEDGF/p75.

The PWWP and IBD of LEDGF/p75 perform the functions of chromatin recognition and protein binding, respectively; specifically, processes of chromatin and DNA binding, transcription regulation, protein-protein interactions, epitope recognition, HIV integration, and stress survival are involved, and homology to the hepatoma-derived growth factor protein family is notable[38]. Accordingly, LEDGF/p75 plays an important role in a variety of biological processes, consistent with the results of our GO analysis, protein interaction prediction, and GSEA across cancers.

LEDGF/p75 expression varied widely among normal groups, tumor groups and cell lines, suggesting a disease-specific nature of LEDGF/p75. It is worth noting that LEDGF/p75 expression in HEL cells was much higher than that in dozens of other cells. As a cancer cell line derived from myeloid cells, HEL is an erythroleukemia cell line (AML M6 in relapse after treatment for Hodgkin’s disease). Whether there is a deeper connection between erythroleukemia and MLL (known to be dependent on LEDGF/p75) other than both being blood cancers and whether that connection is related to LEDGF/p75 is a question that has not yet been answered.

Fig. 8. LEDGF/p75 knockdown in ccRCC causes changes in cancer-related genes and pathways. (A) Heatmap of differentially expressed genes in different groups. (B) Volcano plot of differentially expressed genes after knockdown of LEDGF/p75. (C) Gene Ontology (GO) analysis of all regulated genes. (D) GO analysis of downregulated genes. (E) Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis of all regulated genes. (F) KEGG analysis of all downregulated genes.

[Fig. 8 panel residue. (A) Heatmap of Z score expression for samples NC1-NC3 and KD1-KD3. (B) Volcano plot (x-axis: log2 fold change, LEDGF/p75-KD vs. NC; y-axis: -log10(P value)) with threshold lines at fold change = 1.50, fold change = -1.50, and pValue = 0.05; up regulation: 655, down regulation: 512, no significance: 57033; ERO1L is labeled. (C, D) GO bar charts grouped by molecular function, cellular component, and biological process (y-axis: -log10(P value)). (E, F) KEGG bubble plots (x-axis: enrichment score; legends: pValue, ListHit), with terms including "Transcriptional misregulation in cancer" and "p53 signaling pathway".]

Compared with normal tissues, LEDGF/p75 transcription and protein expression levels were lower in BRCA, LUAD, and UCEC and higher in LIHC. However, LEDGF/p75 transcriptional and protein levels in COAD, HNSC, OV, and KIRC were not consistent with each other. Several factors may explain this. TCGA is a database based on tumor data, and the number of normal tissues is far smaller than the number of tumor tissues, which may cause a degree of error. In addition, posttranscriptional regulation and posttranslational modification have a great impact on transcription and translation levels. Therefore, experiments need to be carried out to verify the findings.

Furthermore, we comprehensively analyzed the OS, DSS, DFI, PFI and clinical stages of patients. High expression of LEDGF/p75 in ACC, KICH, and LIHC suggested poor prognosis, with the opposite in PAAD and SKCM.
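As a reading aid for these prognostic calls, a hazard ratio above 1 with a 95% confidence interval that excludes 1 points to higher risk with high expression, while one below 1 with an interval that excludes 1 points to a protective association. The sketch below applies that rule to a few rows copied from the forest plot in Fig. 3A. The function and variable names are hypothetical, and the rule is a standard interpretation rather than the authors' stated procedure.

```python
# (cancer, p_value, hazard_ratio, ci_low, ci_high), copied from the Fig. 3A forest plot
FOREST_ROWS = [
    ("ACC", 0.007, 2.252, 1.249, 4.061),
    ("KICH", 0.027, 3.031, 1.137, 8.077),
    ("LIHC", 0.034, 1.259, 1.017, 1.559),
    ("PAAD", 0.019, 0.662, 0.469, 0.934),
    ("SKCM", 0.023, 0.826, 0.700, 0.974),
    ("BLCA", 0.728, 0.972, 0.828, 1.141),
]


def interpret(ci_low, ci_high):
    """'risk' if the whole CI lies above 1, 'protective' if below, else 'unclear'."""
    if ci_low > 1.0:
        return "risk"
    if ci_high < 1.0:
        return "protective"
    return "unclear"


labels = {name: interpret(lo, hi) for name, _p, _hr, lo, hi in FOREST_ROWS}
```

Under this rule ACC, KICH, and LIHC are labeled as risk and PAAD and SKCM as protective, matching the text, whereas BLCA is labeled unclear because its interval spans 1.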

Genomic instability, including mutations, structural variants, amplifications, deep deletions, and multiple alterations, contributes to the development of tumors. Our study indicates that genomic alterations in LEDGF/p75 occur in multiple cancers and are associated with poor prognosis in patients with TGCT, BRCA, HNSC, and LIHC. We also analyzed the relationship between LEDGF/p75 and tumor immunity. Our study reports the immune subtypes of LEDGF/p75, the relationship between LEDGF/p75 and immunoinhibitors, chemokines and TILs, and the efficacy of immunotherapy. Our analysis may provide additional treatment options for patients with cancer.

Previous studies have reported that LEDGF/p75 plays different key roles in cancers such as cervical cancer[47], breast cancer[48], ovarian cancer[49], and prostate cancer[50]. However, there is still no report about the function of LEDGF/p75 as a reader of H3K36me3 in kidney cancer.

A high proportion of SETD2 mutations in patients with ccRCC resulted in substantial reduction or even deletion of H3K36me3, naturally grouping patients into clinically and therapeutically relevant subtypes. Therefore, studying LEDGF/p75, a key reader of H3K36me3, is of great significance for clinical diagnosis and treatment of ccRCC patients. We selected 786-O and Caki-1 cells with high H3K36me3 expression to perform functional experiments after LEDGF/p75 was knocked down. The results suggested that LEDGF/p75 is a potential oncogene in ccRCC. Interestingly, LEDGF/p75 is significantly protected from mutations in the ccRCC cohort, which is consistent with its role as an oncogene. Therefore, interfering with LEDGF/p75 may be a feasible personalized therapy for ccRCC patients without SETD2 mutation. However, the current experimental results are only a preliminary exploration of this hypothesis, and further experiments are needed for rigorous verification.

To further explore the role of LEDGF/p75 in ccRCC, we knocked down LEDGF/p75 and performed gene microarray analysis. Considering that H3K36me3 is a transcriptional activation mark, we focused on the gene functions that were downregulated after LEDGF/p75 knockdown. Interestingly, we found that the p53 signaling pathway changed, which is worthy of further research. After LEDGF/p75 was knocked down, ERO1L was the most significantly decreased protein-coding gene among 512 downregulated genes. Overexpression of ERO1L, an endoplasmic reticulum oxidase, is related to the development and progression of many cancers, such as lung adenocarcinoma, glioblastoma and low-grade glioma, pancreatic ductal adenocarcinoma, and kidney renal papillary cell carcinoma[51]. Whether there is a regulatory axis of LEDGF/p75-ERO1L in SETD2 nonmutant ccRCC is worth exploring.

In summary, we performed multidirectional analysis of LEDGF/p75 across cancers and identified it as a prognostic biomarker. The present study preliminarily explored its potential function in kidney cancer, especially for personalized treatment of patients with SETD2 nonmutant ccRCC.

Funding

This study was funded by Wuxi Taihu Lake Talent Plan, Leading Talents in Medical and Health Profession Project: Research and application of early screening and accurate diagnosis and treatment of prostate cancer (THRCJH20200104).

CRediT authorship contribution statement

Ninghan Feng, Bing Yao, and Lin Jiang designed this study and provided clinical guidance as well as data interpretation. Yuwei Zhang, Wei Guo, and Yangkun Feng performed the analyses and experiments. Yuwei Zhang and Wei Guo prepared the figures for this study. Longfei Yang, Hao Lin, Pengcheng Zhou and Kejie Zhao checked the data. Yuwei Zhang drafted the article. All authors reviewed the manuscript, provided comments and approved the final version.

Declaration of Competing Interest

The authors declare that they have no conflicts of interest.

Data Availability

The data of this study are available from the corresponding author on reasonable request.

Appendix A. Supporting information

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.csbj.2023.08.023.

References

[1] Ge H, Si Y, Roeder RG. Isolation of cDNAs encoding novel transcription coactivators p52 and p75 reveals an alternate regulatory mechanism of transcriptional activation. EMBO J 1998;17:6723-9.

[2] Nishizawa Y, Usukura J, Singh DP, Chylack Jr LT, Shinohara T. Spatial and temporal dynamics of two alternatively spliced regulatory factors, lens epithelium-derived growth factor (ledgf/p75) and p52, in the nucleus. Cell Tissue Res 2001;305:107-14.

[3] Llano M, Saenz DT, Meehan A, Wongthida P, Peretz M, Walker WH, et al. An essential role for LEDGF/p75 in HIV integration. Science 2006;314:461-4.

[4] Okuda H, Kanai A, Ito S, Matsui H, Yokoyama A. AF4 uses the SL1 components of RNAP1 machinery to initiate MLL fusion- and AEP-dependent transcription. Nat Commun 2015;6:8869.

[5] Schroder AR, Shinn P, Chen H, Berry C, Ecker JR, Bushman F. HIV-1 integration in the human genome favors active genes and local hotspots. Cell 2002;110:521-9.

[6] Ciuffi A, Bushman FD. Retroviral DNA integration: HIV and the role of LEDGF/p75. Trends Genet 2006;22:388-95.

[7] Wang GP, Ciuffi A, Leipzig J, Berry CC, Bushman FD. HIV integration site selection: analysis by massively parallel pyrosequencing reveals association with epigenetic modifications. Genome Res 2007;17:1186-94.

[8] Canella A, Van Belle S, Brouns T, Nigita G, Carlon MS, Christ F, et al. LEDGF/p75-mediated chemoresistance of mixed-lineage leukemia involves cell survival pathways and super enhancer activators. Cancer Gene Ther 2022;29:133-40.

[9] Yokoyama A, Cleary ML. Menin critically links MLL proteins with LEDGF on cancer-associated target genes. Cancer Cell 2008;14:36-46.

[10] De Rijck J, Bartholomeeusen K, Ceulemans H, Debyser Z, Gijsbers R. High-resolution profiling of the LEDGF/p75 chromatin interaction in the ENCODE region. Nucleic Acids Res 2010;38:6135-47.

[11] Sharma S, Cermakova K, De Rijck J, Demeulemeester J, Fabry M, El Ashkar S, et al. Affinity switching of the LEDGF/p75 IBD interactome is governed by kinase-dependent phosphorylation. Proc Natl Acad Sci USA 2018;115:E7053-62.

[12] Huang J, Gurung B, Wan B, Matkar S, Veniaminova NA, Wan K, et al. The same pocket in menin binds both MLL and JUND but has opposite effects on transcription. Nature 2012;482:542-6.

[13] Cermakova K, Demeulemeester J, Lux V, Nedomova M, Goldman SR, Smith EA, et al. A ubiquitous disordered protein interaction module orchestrates transcription elongation. Science 2021;374:1113-21.

[14] Xiao C, Fan T, Tian H, Zheng Y, Zhou Z, Li S, et al. H3K36 trimethylation-mediated biological functions in cancer. Clin Epigenetics 2021;13:199.

[15] Pradeepa MM, Sutherland HG, Ule J, Grimes GR, Bickmore WA. Psip1/Ledgf p52 binds methylated histone H3K36 and splicing factors and contributes to the regulation of alternative splicing. PLoS Genet 2012;8:e1002717.

[16] Turlure F, Maertens G, Rahman S, Cherepanov P, Engelman A. A tripartite DNA-binding element, comprised of the nuclear localization signal and two AT-hook motifs, mediates the association of LEDGF/p75 with chromatin in vivo. Nucleic Acids Res 2006;34:1653-65.

[17] Hsieh JJ, Purdue MP, Signoretti S, Swanton C, Albiges L, Schmidinger M, et al. Renal cell carcinoma. Nat Rev Dis Prim 2017;3:17009.

[18] Hsieh JJ, Le VH, Oyama T, Ricketts CJ, Ho TH, Cheng EH. Chromosome 3p Loss-Orchestrated VHL, HIF, and Epigenetic Deregulation in Clear Cell Renal Cell Carcinoma. J Clin Oncol 2018;36. JCO2018792549.

[19] Jonasch E, Walker CL, Rathmell WK. Clear cell renal cell carcinoma ontogeny and mechanisms of lethality. Nat Rev Nephrol 2021;17:245-61.

[20] Zhang Y, Fang Y, Tang Y, Han S, Jia J, Wan X, et al. SMYD5 catalyzes histone H3 lysine 36 trimethylation at promoters. Nat Commun 2022;13:3190.

[21] Stelzer G, Rosen N, Plaschkes I, Zimmerman S, Twik M, Fishilevich S, et al. The GeneCards Suite: From Gene Data Mining to Disease Genome Sequence Analyses. Curr Protoc Bioinforma 2016;54:1.30.1-1.30.33.

[22] UniProt Consortium. UniProt: the Universal Protein Knowledgebase in 2023. Nucleic Acids Res 2022.

[23] Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, et al. Highly accurate protein structure prediction with AlphaFold. Nature 2021;596:583-9.

[24] Uhlen M, Fagerberg L, Hallstrom BM, Lindskog C, Oksvold P, Mardinoglu A, et al. Proteomics. Tissue-based map of the human proteome. Science 2015;347:1260419.

[25] Warde-Farley D, Donaldson SL, Comes O, Zuberi K, Badrawi R, Chao P, et al. The GeneMANIA prediction server: biological network integration for gene prioritization and predicting gene function. Nucleic Acids Res 2010;38:W214-20.

[26] Szklarczyk D, Gable AL, Nastou KC, Lyon D, Kirsch R, Pyysalo S, et al. The STRING database in 2021: customizable protein-protein networks, and functional characterization of user-uploaded gene/measurement sets. Nucleic Acids Res 2021;49:D605-12.

[27] Mitsopoulos C, Di Micco P, Fernandez EV, Dolciami D, Holt E, Mica IL, et al. canSAR: update to the cancer translational research and drug discovery knowledgebase. Nucleic Acids Res 2021;49:D1074-82.

[28] Ru B, Wong CN, Tong Y, Zhong JY, Zhong SSW, Wu WC, et al. TISIDB: an integrated repository portal for tumor-immune system interactions. Bioinformatics 2019;35:4200-2.

[29] Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA 2005;102:15545-50.

[30] Goldman MJ, Craft B, Hastie M, Repecka K, McDade F, Kamath A, et al. Visualizing and interpreting cancer genomics data via the Xena platform. Nat Biotechnol 2020;38:675-8.

[31] Tang Z, Li C, Kang B, Gao G, Li C, Zhang Z. GEPIA: a web server for cancer and normal gene expression profiling and interactive analyses. Nucleic Acids Res 2017;45:W98-102.

[32] Chandrashekar DS, Karthikeyan SK, Korla PK, Patel H, Shovon AR, Athar M, et al. UALCAN: an update to the integrated cancer data analysis platform. Neoplasia 2022;25:18-27.

[33] Cerami E, Gao J, Dogrusoz U, Gross BE, Sumer SO, Aksoy BA, et al. The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov 2012;2:401-4.

[34] Zhang Y, Yao X, Zhou H, Wu X, Tian J, Zeng J, et al. OncoSplicing: an updated database for clinically relevant alternative splicing in 33 human cancers. Nucleic Acids Res 2022;50:D1340-7.

[35] Fu J, Li K, Zhang W, Wan C, Zhang J, Jiang P, et al. Large-scale public data reuse to model immunotherapy response and resistance. Genome Med 2020;12:21.

[36] Zeng Z, Wong CJ, Yang L, Ouardaoui N, Li D, Zhang W, et al. TISMO: syngeneic mouse tumor database to model tumor immunity and immunotherapy response. Nucleic Acids Res 2022;50:D1391-7.

[37] Fekete JT, Gyorffy B. ROCplot.org: Validating predictive biomarkers of chemotherapy/hormonal therapy/anti-HER2 therapy using transcriptomic data of 3,104 breast cancer patients. Int J Cancer 2019;145:3140-51.

[38] Cermakova K, Weydert C, Christ F, De Rijck J, Debyser Z. Lessons Learned: HIV Points the Way Towards Precision Treatment of Mixed-Lineage Leukemia. Trends Pharm Sci 2016;37:660-71.

[39] Wang H, Farnung L, Dienemann C, Cramer P. Structure of H3K36-methylated nucleosome-PWWP complex reveals multivalent cross-gyre binding. Nat Struct Mol Biol 2020;27:8-13.

[40] El Ashkar S, Schwaller J, Pieters T, Goossens S, Demeulemeester J, Christ F, et al. LEDGF/p75 is dispensable for hematopoiesis but essential for MLL-rearranged leukemogenesis. Blood 2018;131:95-107.

[41] Ui A, Chiba N, Yasui A. Relationship among DNA double-strand break (DSB), DSB repair, and transcription prevents genome instability and cancer. Cancer Sci 2020;111:1443-51.

[42] Bonnal SC, Lopez-Oreja I, Valcarcel J. Roles and mechanisms of alternative splicing in cancer - implications for care. Nat Rev Clin Oncol 2020;17:457-74.

[43] Xie Y, Sahin M, Sinha S, Wang Y, Nargund AM, Lyu Y, et al. SETD2 loss perturbs the kidney cancer epigenetic landscape to promote metastasis and engenders actionable dependencies on histone chaperone complexes. Nat Cancer 2022;3:188-202.

[44] Ortiz-Hernandez GL, Sanchez-Hernandez ES, Ochoa PT, Elix CC, Alkashgari HR, McMullen JRW, et al. The LEDGF/p75 Integrase Binding Domain Interactome Contributes to the Survival, Clonogenicity, and Tumorsphere Formation of Docetaxel-Resistant Prostate Cancer Cells. Cells 2021;10.

[45] Roudaia L, Speck NA. A MENage a Trois in leukemia. Cancer Cell 2008;14:3-5.

[46] Daugaard M, Kirkegaard-Sorensen T, Ostenfeld MS, Aaboe M, Hoyer-Hansen M, Orntoft TF, et al. Lens epithelium-derived growth factor is an Hsp70-2 regulated guardian of lysosomal stability in human cancer. Cancer Res 2007;67:2559-67.

[47] Leitz J, Reuschenbach M, Lohrey C, Honegger A, Accardi R, Tommasino M, et al. Oncogenic human papillomaviruses activate the tumor-associated lens epithelial-derived growth factor (LEDGF) gene. PLoS Pathog 2014;10:e1003957.

[48] Singh DK, Gholamalamdari O, Jadaliha M, Li XL, Lin YC, Zhang Y, et al. PSIP1/p75 promotes tumorigenicity in breast cancer cells by promoting the transcription of cell cycle genes. Carcinogenesis 2017;38:966-75.

[49] Sapoznik S, Cohen B, Tzuman Y, Meir G, Ben-Dor S, Harmelin A, et al. Gonadotropin-regulated lymphangiogenesis in ovarian cancer is mediated by LEDGF-induced expression of VEGF-C. Cancer Res 2009;69:9306-14.

[50] Liedtke V, Rose L, Hiemann R, Nasser A, Rodiger S, Bonaventura A, et al. Over-expression of LEDGF/p75 in HEp-2 cells enhances autoimmune IgG response in patients with benign prostatic hyperplasia-a novel diagnostic approach with therapeutic consequence? Int J Mol Sci 2023;24.

[51] Shergalis AG, Hu S, Bankhead 3rd A, Neamati N. Role of the ERO1-PDI interaction in oxidative protein folding and disease. Pharm Ther 2020;210:107525.