Alternate promoters and alternate splicing of human tenascin-X, a gene with 5’ and 3’ ends buried in other genes

Mart Speek+, Floyd Barry and Walter L. Miller*

Department of Pediatrics and the Metabolic Research Unit, University of California, San Francisco, CA 94143-0978, USA

Received May 29, 1996; Revised and Accepted August 19, 1996

Tenascin-X (TN-X) is an extracellular matrix protein encoded by a large gene that overlaps the steroid 21-hydroxylase (P450c21) gene in the HLA locus on chromosome 6p21.3. This may be the most complex locus in the human genome identified to date, containing 13 overlapping transcription units in 160 kb of DNA. Previous studies determined the sequence of 39 TN-X exons, encoding a 12 kb open reading frame, but the promoter(s) of the gene had not been located. We identify the principal TN-X promoter and a previously unknown 5’ untranslated exon that lies more than 10 kb upstream from the previously known exons. This promoter, which is substantially different from the promoter for TN-C, initiates transcription in human fetal adrenal and muscle, but expression in human NCI-H295 adrenocortical carcinoma cells is initiated by two other promoters lying further upstream. One of these is the same as the promoter for a recently identified Creb-related protein (Creb-rp), but transcripts initiated from this promoter in human adrenal NCI-H295 tumor cells are spliced differently from Creb-rp, and are largely retained in the nuclei of these cells. By analogy with the other two members of the tenascin family, TN-C and TN-R, it has been predicted that TN-X should undergo alternate splicing in its fibronectin-like domains. RACE cloning and RNase protection experiments reveal no such alternate splicing. The TN-X gene appears to be unique in having both its 5’ and 3’ ends buried in other genes.

INTRODUCTION

The genomic locus containing the gene for steroid 21-hydroxylase (P450c21) may be the most complex locus in the human genome. Studies in the mid-1980s showed that the human (1,2), rodent (3,4) and bovine (5-7) genomes have a pair of duplicated loci termed A and B containing closely linked genes for P450c21 and complement component C4 residing in the major histocompatibility locus. In the class III region of the human leukocyte antigen locus on chromosome 6p21.3, the duplicated A and B units are ~35 kb long and have precisely defined boundaries (8). The gene for tenascin-X was identified as a previously unknown transcript overlapping the last exon of P450c21B on the opposite strand of DNA (9). The A and B loci also harbor the novel, paired transcripts XA and XB-short (XB-S) (8,10), YA and YB (11), and ZA and ZB (12). In addition, the G11 gene, also termed RP, lies only 611 bp upstream from C4A, but only a short segment of its 3’ end is duplicated in the B locus, and this is not transcribed (13,14). The XB gene encoding TN-X is over 65 kb long, and hence is too large to have been wholly duplicated in this region (15); nevertheless, the duplicated segment of XB, termed XA, is abundantly transcribed in an adrenal-specific fashion (8). Thus the arrangement of genes in this locus from telomere to centromere is G11-C4A-ZA-21A-YA-XA-C4B-ZB-21B-YB-XBS-XB.

The tenascins are a family of at least three extracellular matrix proteins, termed tenascin-C (TN-C, also known as tenascin or cytotactin), tenascin-R (TN-R, also known as restrictin), and tenascin-X (TN-X). All three tenascins share the same modular structure, consisting of an amino-terminal heptad repeat domain, which permits tenascin chains to dimerize or trimerize, followed by a series of fibronectin type III (FnIII) repeats, and a carboxy-terminal fibrinogen-like domain (15-21). The tenascins exert anti-adhesive effects, whereas fibronectin is adhesive (22,23). However, TN-X does not bind to TN-C, TN-R or fibronectin, whereas these other proteins will bind to one another (24). TN-C is expressed in a wide variety of fetal tissues and has been thought to play an important role in development (25-27), but tenascin-C knockout mice have no phenotype other than diminished gliosis in response to brain injury (28,29). The expression of TN-R is restricted to the brain (20,21) and hence probably cannot substitute for TN-C. However, TN-X is widely expressed with its greatest expression in fetal adrenal, testis and muscle (8,15) and in developing connective tissues (24,30) where it appears to participate in connective tissue cell migration and to inhibit chondrogenesis (30). This distribution sometimes appears to be reciprocal to that of TN-C (24) but in a developmental pattern that is related to TN-C (30). Thus there is substantial interest in the genetics, cellular activity and expression of TN-X, but no TN-X promoters have been identified or characterized.

*To whom correspondence should be addressed

+Present address: Department of Biochemistry, Tartu University, Tartu, EE 2400, Estonia

Figure 1. Northern blot. Total RNA samples (20 µg) from human fetal skeletal muscle, fetal adrenal, and NCI-H295 adrenocortical carcinoma cells were analyzed with a 282 base antisense riboprobe specific for the extreme 3' end of TN-X. Size markers are BRL 1 kb ladder mixed with HindIII-digested bacteriophage λ. A large TN-X transcript of ~16 kb is seen more abundantly in muscle than in adrenal RNA. The additional 3.5 kb band seen in adrenal RNA represents XB-S mRNA (10) and the 2.5 kb band represents XA RNA (8).

[Figure 1 lane labels: MUSCLE, ADRENAL, NCI-H295]

Sequencing of large segments of the human TN-X gene showed that the gene spanned at least 65 kb and encoded a 12 kb open reading frame that predicted a protein monomer of 3816 amino acids (15). This predicted size is consistent with the apparent size of the protein monomers seen on Western blots of mouse (24) and human (J.D. Bristow and W.L. Miller, unpublished) TN-X. Mouse and human TN-C and TN-R undergo alternative splicing in their FnIII domains (20,21,27,31), and mouse TN-X mRNA is found as ~11 and 13 kb bands, suggesting similar alternate splicing (24); such alternate splicing has also been predicted for TN-X (32,33). We now identify three alternate transcriptional start sites of human TN-X and show that TN-X mRNA does not undergo the pattern of alternate splicing used by TN-C and TN-R, and that normal human fetal adrenal tissue and human adrenocortical NCI-H295 tumor cells (34), which bear many characteristics of the human fetal adrenal (35), use different TN-X transcriptional start sites. At least two of the TN-X transcriptional start sites are buried within the recently described Creb-rp gene (36), while the 3’ end of TN-X lies within the P450c21 gene. The intimate overlapping arrangement of transcription units in this locus is reminiscent of a prokaryote genome, but is unique in the human genome.

RESULTS

Size of the human TN-X transcript

Previous sequencing of six overlapping human genomic DNA clones encompassing the TN-X gene predicted an mRNA of ~12 kb with an open reading frame encoding a TN-X monomer of 3816 amino acids (15), consistent with northern blotting studies that showed a human TN-X mRNA of ~12 kb (8), and two mouse TN-X mRNAs of ~11 and 13 kb (24). Northern blotting of RNA

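[Figure 2 sequence panel: nucleotide sequence numbered from -1348 to +257, with the 5' ends of RACE clones M1, M2, A5, M3 and M4 marked, the exon/intron boundaries at +194 and +195 indicated, and the start of the predicted TN-X protein sequence shown; the sequence itself is not recoverable from the extracted text.]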

Figure 2. Sequence of the principal promoter of the human TN-X gene. Base #1 is the 5’-most base in RACE clone M1, and corresponds to the 5’-most transcriptional start site identified by RNase protection experiments; the other transcriptional start sites identified in Figure 3 are also underlined. The underlined PstI site corresponds to the site at bases 20-25 of the sequence as numbered in (15). Following base 194 is a large intron; the 567 bp shown downstream from base 194 extend to the 3’ MboI cloning site of cTNX:7; the 458 bp directly upstream from base 195 are from cTNX:5. The cytosine at 198 was not reported in Bristow et al. (15). The dashes following base 194 and preceding base 195 indicate exon/intron boundaries, and do not represent missing bases. Note that an uncloned/unsequenced region of >10 kb lies between bases 194 and 195. GenBank sequences are U52697 for the 2709 bp HindIII fragment of cTNX:7 upstream from this region and U52698 for the 458 bp region from cTNX:5 lying directly upstream of base 195.

from human fetal adrenal and skeletal muscle detected a single TN-X band that migrated slightly more slowly than the ~12 kb DNA size marker. This band was readily detected in muscle and was barely visible in the adrenal sample in the original gel (Fig. 1), but could not be detected in human NCI-H295 adrenocortical carcinoma cells (34,35). Thus multiple sizes of human TN-X mRNA were not found, even though multiple sizes of mouse TN-X mRNA and human, chicken and rodent TN-C RNA are readily detectable. In Figure 1, human TN-X mRNA appeared to be larger than 12 kb, possibly as large as 16 kb. The apparent discrepancy with the 12 kb size reported previously (8) could be due to the difficulty in evaluating large RNA sizes by northern blotting, or might reflect the presence of additional sequences that were not detected in our sequencing studies (15). Therefore we evaluated the unexplored 5’ end of the human TN-X gene.

Structure of the principal promoter of the human TN-X gene

We previously used 5’ RACE to identify the 5’ untranslated region in the human TN-X sequence, but the number and location of the transcriptional start site(s) were not determined (15). To identify the transcriptional start site(s) of the full-length TN-X mRNA and to locate the 5’ portion of the TN-X gene, we used antisense primers corresponding to bases 252-271 and 206-225 as numbered in Figure 2 (bases 98-117 and 52-71 in reference 15) to perform primary and nested RACE. Parallel RACE reactions were done with adrenal, muscle, and NCI-H295 cell RNA. Cloning and sequencing of the RACE products identified three different classes of sequences, termed classes I-III.

The class I sequences consisted of four muscle (M1-M4) and one adrenal (A5) RACE clone. These clones contained the 5’-most sequences identified previously, including the CTGCAG PstI site found at bases 20-25 as numbered by Bristow et al. (15) (bases 173-178 in Fig. 2), and hence appear to represent the 5’ untranslated region of the TN-X gene. However, probing of the genomic DNA in our 5’-most TN-X cosmid, cTNX:5 (15), failed to detect the corresponding genomic DNA. Therefore, to isolate the 5’ end of the TN-X gene, we re-screened a human genomic library in cosmid pWE15 using RACE clone M1 as the probe. Three cosmid clones with identical restriction maps were identified, and one of these, termed cTNX:7, was characterized further. RACE clones M1-M4 hybridized to a 2709 bp HindIII fragment at the extreme 3’ end of cTNX:7 (Fig. 2). This sequence contains the M1-M4 and A5 RACE sequences and the hallmark PstI site, but also shows that there is an intron beginning 16 bp 3’ to the PstI site. The sequences in cTNX:5 (15) and cTNX:7 do not overlap; Southern blotting of genomic DNA (not shown) suggests that there are at least 10 kb of uncloned DNA between cTNX:7 and cTNX:5. Thus the 5’ RACE cDNA sequence reported previously (15) is interrupted by a very large, previously unknown intron that lies between the eighth and ninth bases upstream from the first ATG codon that is presumed to initiate translation.
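The relationship between the two numbering schemes used above (reference 15 versus Figure 2) can be sketched as a small coordinate conversion. The sketch below is illustrative only: the function and constant names are hypothetical, and the single-base shift at position 198 is an inference from the Figure 2 legend (the cytosine at 198 was not reported in reference 15) together with the paired coordinates quoted in the text.

```python
# Hypothetical helper: map base numbers from reference 15 onto Figure 2 numbering.
# Assumption: the PstI site at 20-25 (ref. 15) corresponds to 173-178 (Fig. 2),
# giving a base offset of 153.
BASE_OFFSET = 153
# Assumption: Fig. 2 contains one extra cytosine at position 198 that is absent
# from the reference 15 numbering, so later positions shift by one more base.
EXTRA_BASE_FIG2 = 198


def ref15_to_fig2(pos: int) -> int:
    """Convert a single reference 15 position to Figure 2 numbering."""
    shifted = pos + BASE_OFFSET
    # Positions that would land on or after the extra cytosine move one base 3'.
    return shifted if shifted < EXTRA_BASE_FIG2 else shifted + 1


def convert_range(start: int, end: int) -> tuple[int, int]:
    """Convert an inclusive reference 15 range to Figure 2 numbering."""
    return ref15_to_fig2(start), ref15_to_fig2(end)


if __name__ == "__main__":
    features = {
        "PstI site": (20, 25),
        "nested RACE primer": (52, 71),
        "primary RACE primer": (98, 117),
    }
    for name, (start, end) in features.items():
        print(name, (start, end), "->", convert_range(start, end))
```

Under these assumptions the conversion reproduces the three coordinate pairs quoted in the text (20-25 to 173-178, 52-71 to 206-225, and 98-117 to 252-271).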

Transcriptional start sites of the TN-X gene

To locate the transcriptional start site in the sequence shown in Figure 2, we performed RNase protection experiments. RNA from muscle and, to a minor extent, adrenal protected a full-length 224 bp band from a vector containing exonic bases +1 to +224 in Figure 2, but the most prominent bands were at ~195,

Figure 3. RNase protection. Left lanes: A 302 base cDNA riboprobe, which encompasses bases +1 to +194 and +195 to +224 in Figure 2 (excluding the intron) plus 78 bases of vector sequences, was generated from RACE clone M1 cut with SalI and transcribed with T7 RNA polymerase. Right lanes: protection of a 420 base genomic riboprobe generated from the 2.1 kb genomic HindIII fragment, cut with StyI and transcribed with T7 RNA polymerase. Markers: 100 bp ladder (BRL); P, undigested probe; T, 20 µg tRNA; adrenal, 15 µg human fetal adrenal RNA; muscle, 10 µg human fetal striated muscle RNA.

[Figure 3 gel image: panels cDNA (left) and GENOMIC (right); lanes P, T, A, M; 100 bp ladder marked from 100 to 600.]

180 and 160 bp (Fig. 3, left). This confirms that the long genomic sequence between bases +194 and +195 is a single intron and identifies four closely clustered transcriptional start sites. A 420 base genomic antisense riboprobe extending from -223 to +142 (as shown in Fig. 2) generated the same pattern of bands seen with the cDNA probe (Fig. 3, right), confirming that there are four TN-X transcriptional start sites within about 75 bp that are used in fetal adrenal and muscle tissue. These four sites account for the class I RACE clones and for most adrenal and virtually all muscle transcription of TN-X; we therefore refer to the DNA that lies upstream from base +1 in Fig. 2 as the principal promoter of TN-X. The relative intensities of the protected bands and the amounts of RNA used in Figure 3 suggest that the principal promoter may be more powerful in muscle than in adrenal cells, although variations in RNA stability and rates of elongation or termination cannot be discounted.
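The way protected band sizes translate into approximate start-site positions can be illustrated with a short calculation. This is a sketch under stated assumptions, not the authors' analysis: it assumes the cDNA riboprobe is protected up to exonic base +224 of Figure 2, treats the band sizes as exact even though three of them are only estimates from the gel, and uses hypothetical names throughout.

```python
# Assumption: the 3' end of the protected exonic region is base +224 (Fig. 2),
# as stated for the cDNA riboprobe; names below are illustrative only.
PROBE_EXONIC_END = 224


def start_site(protected_length: int, probe_end: int = PROBE_EXONIC_END) -> int:
    """Estimate the 5'-most transcribed base from a protected fragment length.

    A transcript starting at base s protects bases s..probe_end, i.e.
    probe_end - s + 1 bases, so s = probe_end - protected_length + 1.
    """
    return probe_end - protected_length + 1


# Band sizes reported in the text; all but the full-length band are approximate.
BANDS = [224, 195, 180, 160]
START_SITES = [start_site(length) for length in BANDS]
# Distance covered by the cluster of start sites, counted inclusively.
CLUSTER_SPAN = max(START_SITES) - min(START_SITES) + 1

if __name__ == "__main__":
    for length, site in zip(BANDS, START_SITES):
        print(f"{length} base band -> start near +{site}")
    print("cluster spans about", CLUSTER_SPAN, "bases")
```

With these inputs the estimated start sites fall near +1, +30, +45 and +65, which is in line with the statement that the four sites cluster within about 75 bp; the exact positions depend on how precisely the band sizes can be read from the gel.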

Figure 4. RNase protection. (A) A 584-base riboprobe (P) encompassing the 49 bases of TN-X 5' UTR (beginning at base 154 in Fig. 2), 397 bases of TN-X coding region (extending beyond the sequence shown in Fig. 2), and 138 bases of vector sequences was transcribed with T3 RNA polymerase from BamHI-cleaved RACE clone M4, and hybridized with RNAs from various sources: t, tRNA (10 µg); M, human fetal striated muscle (2.5 µg); A, human fetal adrenal (10 µg); L, human fetal liver (5 µg). NCI: NCI-H295 human adrenocortical carcinoma cells grown in suspension; C, cytoplasmic RNA (10 µg); N, nuclear RNA from the same number of cells used to prepare the 10 µg cytoplasmic RNA sample. NCI-A: NCI-H295A cells grown in monolayer; amounts of RNA as for NCI cells. Markers are a 100 bp DNA ladder. (B) A 490 base riboprobe encompassing 8 bases of TN-X 5' UTR (beginning at base 195 in Fig. 2), 397 bases of TN-X coding region, and 85 bases of vector sequences was transcribed with T3 RNA polymerase from SfaNI-cleaved RACE clone M4 and hybridized with RNAs designated as in panel A. Adrenal, 10 µg; NCI cytoplasmic, 20 µg; NCI nuclear, from same number of cells as cytoplasmic.

[Figure 4 gel images: panels A and B; lane groups NCI and NCI-A; 100 bp ladder size marks.]

Altered transcription of the TN-X gene in NCI-H295 carcinoma cells

Human NCI-H295 adrenocortical carcinoma cells express many highly differentiated functions of the human fetal adrenal (35); however, no 12-16 kb TN-X mRNA was seen in our northern blot of NCI-H295 cell RNA (Fig. 1). To determine if these cells express TN-X and are suitable for the study of adrenal expression of TN-X, we performed a series of additional RNase protection experiments. Human fetal muscle and adrenal RNAs protected the expected 446 base band from a riboprobe encompassing the 5’ end of TN-X, but a comparable band was not protected by RNA from fetal liver or from NCI-H295 cells (Fig. 4A). However, RNA from NCI-H295 cells, grown either in suspension or as adherent NCI-H295A cells, protected a smaller band of about 405 bases, and most of this RNA was found in nuclei rather than in cytoplasm. RNase protection with a 5’ truncated probe (Fig. 4B) showed that fetal adrenal RNA and both nuclear and cytoplasmic NCI-H295 cell RNAs protected the full-length 405-base region, indicating that the truncation leading to the 405 base band seen in Figure 4A encompasses sequences upstream from base 196. Thus TN-X mRNA is expressed in NCI-H295 cells, but its transcription is initiated at a location other than the principal promoter, and its RNA is then spliced to the standard TN-X coding cassette beginning at base 195. This protection experiment also confirms that most TN-X mRNA is retained in NCI-H295 cell nuclei.
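The reasoning that connects the 446 and 405 base bands to the splice junction at base 195 can be followed with a brief bookkeeping sketch. It is an illustration under assumptions rather than the authors' procedure: the probe compositions are taken from the Figure 4 legend, the class and function names are hypothetical, and band sizes are treated as exact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Riboprobe:
    """Composition of an antisense riboprobe (values from the Fig. 4 legend)."""
    name: str
    utr_start: int     # first TN-X 5' UTR base covered (Fig. 2 numbering)
    utr_bases: int     # TN-X 5' UTR bases in the probe
    coding_bases: int  # TN-X coding bases in the probe
    vector_bases: int  # vector-derived bases, never protected by TN-X RNA

    @property
    def total_length(self) -> int:
        return self.utr_bases + self.coding_bases + self.vector_bases

    @property
    def full_protection(self) -> int:
        # Longest fragment a fully matching TN-X transcript can protect.
        return self.utr_bases + self.coding_bases


def first_protected_base(probe: Riboprobe, observed_band: int) -> int:
    """Fig. 2 position where the RNA begins to match the probe.

    Assumption: any shortfall relative to full protection lies at the
    5' (UTR) end of the TN-X portion of the probe.
    """
    unprotected = probe.full_protection - observed_band
    return probe.utr_start + unprotected


PROBE_A = Riboprobe("Fig. 4A", utr_start=154, utr_bases=49,
                    coding_bases=397, vector_bases=138)
PROBE_B = Riboprobe("Fig. 4B", utr_start=195, utr_bases=8,
                    coding_bases=397, vector_bases=85)

if __name__ == "__main__":
    for probe in (PROBE_A, PROBE_B):
        print(probe.name, probe.total_length, probe.full_protection)
    print("NCI-H295 match begins at base",
          first_protected_base(PROBE_A, observed_band=405))
```

Under these assumptions the 405 base band from the longer probe leaves 41 probe bases unprotected, so the NCI-H295 transcripts begin to match at base 195, which is the first base of the shorter probe and the start of the TN-X coding cassette described in the text.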

Upstream alternative promoters

The class II and III RACE clones indicated the existence of alternative upstream promoters. The class II RACE clones consisted of two identical 371 base clones, termed A2 and N1,

Figure 5. Diagram of the DNA 5' to TN-X. Top lines: scale (in kb) and SmaI map of cTNX:7 with the sizes of the SmaI fragments in kb. The arrowheads and asterisk designate the PvuII and HindIII sites, respectively, and the ends of the pWE15 vector are the open-ended boxes. The sequenced regions in Figures 2 and 7 are shown as bold bars and the three alternate promoters (I, II and III) are designated with rightward arrows. The >10 kb intron separating the DNA in cTNX:7 from the DNA in cTNX:5 is indicated by a space between the cTNX:7 sequences and the rightward arrow designated TN-X, which indicates the beginning of the 3816 amino acid TN-X open reading frame. The alternately spliced 5' TN-X cDNA structures shown below are labeled according to the classes of RACE clones. Class I clones (M1-M4, A5) are initiated by the principal TN-X promoter within the 2.7 kb HindIII fragment at the 3' end of cTNX:7. Class II clones (N1, A2) are initiated within the 1.5 kb SmaI/PvuII fragment that comprises the 3' portion of the 2.4 kb PvuII fragment. Class III clones (N2, A3) are initiated from the Creb-rp promoter within the 5.5 kb SmaI fragment. These transcripts are alternately spliced in the two fashions shown above the array of eight exons designated class III; the splicing of authentic Creb-rp is shown below that line. Between the regions shown as exons 2 and 3 lies a 1191 bp region of Creb-rp mRNA encoded by an unknown number of Creb-rp exons that lie in the unsequenced DNA between the class III and II promoters. The location of the 3' end of the Creb-rp gene lies in unsequenced DNA either between the class II and I promoter, or between the class I promoter and the principal TN-X coding cassette. Mapping studies (not shown) (36) suggest that the 3' end of Creb-rp lies between promoter I and the TN-X coding cassette.

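[Figure 5 diagram labels: HindIII (asterisk) and PvuII (arrowhead) sites; scale in kb; SmaI map of cTNX:7 (fragment sizes legible in the extraction: 7.5, 5.5, 1.5, 3.2 and 5.7 kb); TN-X; Creb-rp.]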

derived from human fetal adrenal and NCI-H295 cell RNA, respectively. These clones contain the sequence from base +195 to +272 in Figure 2, but the 293 bp at the 5’ ends of these clones were not found in Figure 2. Southern blotting showed that cosmid cTNX:7 contained the corresponding genomic sequences on a 2.4 kb PvuII fragment. Southern blotting of partial SmaI digests of cTNX:7 permitted us to order its ten SmaI fragments and locate the 2.4 kb PvuII fragment (Fig. 5). Sequencing the 2.4 kb PvuII fragment located the upstream 293 bp of the class II RACE clones as a single contiguous sequence, designated as bases 1-293 following the class II transcriptional start site in Figure 6. The class II RACE clones have a potential ATG translational start beginning at base 95, but this ATG initiates an open reading frame (ORF) of only 111 codons, extends into the TN-X coding region by 32 bases, and is not in frame with TN-X. We have no evidence that the predicted 111 amino acid peptide, which bears no homology to sequences in the databases, is synthesized; hence this reading frame is not shown in Figure 6.

The class III RACE clones A3 and N2 contained identical 5’ sequences. In N2 these were directly linked to the standard TN-X coding cassette beginning at base 195 in Figure 2; but A3 contained additional sequences before base 195. The sequence of the 2.4 kb PvuII fragment and an adjacent upstream 1.3 kb PvuII

[Figure 6 sequence panel: genomic sequence of cTNX:7 numbered from -326, showing the class III and class II transcriptional start sites, exons 1-8 in upper case with the predicted ORF above, and gaps of about 0.9 kb and about 8 kb; the sequence itself is not recoverable from the extracted text.]

Figure 6. Alternate upstream promoters. Three genomic fragments from cTNX:7 containing the exons of Creb-rp were identified by hybridization to RACE clones and sequenced. Sequences identified as exons from RACE clones are shown in upper case letters. The class III cap site identified by RACE and RNase protection corresponds to the Creb-rp cap site (36). The ORF shown directly above the second base of each codon is that predicted by the RACE clones, but does not correspond exactly to the Creb-rp ORF, which is spliced at the locations indicated by the arrowheads (e.g. bases 2-4, etc., following the second arrowhead correspond to Ile 403 of Creb-rp), and also contains the 1191 base coding region not found in our RACE clones. The exons predicted by the class III RACE clones are numbered 1-8 at the right, and the corresponding base numbers are shown on the left (exon 2 is contained in the 0.9 kb genomic gap). The 5’ splice donor site following exon 7 is shown as the non-canonical sequence AGggt, consistent with the sequence of RACE clone A3. The class II cap site is nine lines from the bottom; the PvuII and SmaI sites are underlined. Note that there are gaps of about 0.9 and 8.0 kb in the seventh and eleventh lines. The 3717 bp sequence of the adjacent 1.3 and 2.4 kb PvuII fragments containing promoter II is GenBank sequence U52693. The sequence of the 725 bp fragment containing promoter III and of the 455 bp intron lying 0.9 kb downstream are GenBank sequences U52694 and U52695, respectively.

fragment (Fig. 6) showed that the 5’ ends of A3 and N2 were encoded by four small exons (termed 4-7 in Figs 5 and 6), and that the additional sequences in A3 were due to an additional exon, termed 8. All five of these exons shared the same ORF shown in Figure 6, but these were not in frame with the TN-X coding cassette beginning at base 195 in Figure 2.

To determine whether the class II and class III RACE clones represented major species of RNA transcripts, we prepared riboprobes from RACE clones N1 (class II) and N2 (class III). Nuclear, but not cytoplasmic RNA from NCI-H295 cells protected the full-length 371 base region from probe N1 (Fig. 7A, left), while both nuclear and cytoplasmic RNA protected the full-length 320 base coding region of probe N2 (Fig. 7A, right). Thus in NCI-H295 cells, TN-X mRNA is alternately spliced at its 5’ end according to the two patterns predicted by the class II and III RACE clones, and both classes of TN-X mRNA tend to be

Figure 7. RNase protection experiment. (A) Left, a 449-base riboprobe from SalI-cut RACE clone N1 contained 340 bases of novel, upstream class II sequences linked to 31 bases from the TN-X leader peptide (bases 195-226 in Fig. 2), plus 78 bases of vector sequences. Right, a 398 base riboprobe from SalI-cut RACE clone N2 contained 289 bases from Creb-rp-like exons 4-7 (Fig. 5), and the same 31 bases of TN-X leader and 78 bases of vector sequences found in N1 probe. Both probes were transcribed with T7 RNA polymerase. RNA sources are designated as in Figure 4; M, muscle 3 µg; A, adrenal 5 µg; C, NCI cytoplasmic 10 µg; N, NCI nuclear, from same number of cells as cytoplasmic sample. (B) A 509 base riboprobe transcribed with T7 RNA polymerase from NaeI-cut RACE clone A3, encompassing 377 bases of the upstream Creb-rp-like exons 4 to 8 (bases +362 to +738 in Fig. 6), bases 195 to 272 of TN-X (Fig. 2), and 55 bases of vector was hybridized with 5 µg RNA samples from human fetal adrenal (A) and muscle (M) and from human JEG-3 choriocarcinoma cells (J). The full-length protected fragment linking the upstream exons to TN-X is indicated by an arrowhead.

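[Figure 7 gel images: panel A, lanes P, T, M, A, N, C for each of the two probes, with N and C grouped under NCI; panel B, lanes P, T, A, M, J; 100 bp ladder size marks.]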

retained in the nuclei. Adrenal and muscle RNAs protected only minuscule amounts of these probes, indicating that these upstream promoters are rarely used in these tissues. Hence RACE clone A2 represents a rare transcript.

Because the two class III RACE clones, A3 and N2, had different downstream splicing, we sought to determine the relative abundance of these two patterns. Protection of an antisense riboprobe generated from RACE clone A3 yielded a 395 base band with adrenal, muscle and JEG-3 cell RNA, showing that all five of these exonic sequences were found in a single, correctly spliced RNA (Fig. 7B). The expected 454 base band was very faint, showing that few of these upstream sequences were spliced to TN-X. Thus RACE clone A3 also represents a rare transcript. The high abundance of the 395 base protected band coupled with the low abundance of the full-length 454 base band suggested that the upstream exons represented by the 395 base band may represent a separate transcription unit. This splicing of the upstream reading frame to the 5’ UTR of TN-X was confirmed in three separate RNase protection experiments using human fetal adrenal RNA, but was not detected in human fetal muscle or human choriocarcinoma JEG-3 cells (not shown).

Expression of the TN-X gene in NCI-H295 cells is linked to the Creb-rp gene

Analysis by the BLAST and FASTA programs showed that the sequences of RACE clones A3 and N2 were similar to members of the bZip superfamily of transcription factors. While we were pursuing further analysis of these sequences, Min et al. described a gene encoding a novel bZip protein called Creb-rp immediately upstream from TN-X (36). The sequences of the A3 and N2

Figure 8. Northern blot. Total RNA samples (20 µg) from human fetal adrenal (A) and muscle (M) and from JEG-3 choriocarcinoma cells (J) were analyzed with a 551 base probe from RACE clone A3 encompassing Creb-rp-like exons 4-8 (Fig. 5). The abundant 2.6 kb transcript found in all tissues is indistinguishable from Creb-rp. Precursor and alternatively spliced forms are also seen on this overexposed autoradiogram.

[Figure 8 lane labels: M, A, J]

RACE clones correspond to only the 3’-half of the Creb-rp sequence; to determine if the full-length Creb-rp sequence was expressed in the fetal adrenal, we performed an additional round of RACE starting from exon 4. We identified two identical Creb-rp-like clones with sequences identical to the Creb-rp cDNA sequence (36), but these lacked 42 bases at the 5’ end of Creb-rp and lacked a large 1191 bp region of the Creb-rp gene sequence that encompasses intron 2 and part of exon 3 of the Creb-rp-like sequence (Fig. 5). Screening the DNA in cosmid cTNX:7 located the genomic DNA corresponding to Creb-rp exon 1 in a 5.5 kb SmaI fragment (Fig. 5). The sequence of the genomic DNA encompassing the cDNA in our RACE clones is presented in Figure 6; the genomic DNA encoding the 1191 bp of Creb-rp not found in our RACE clones presumably lies in the ~8 kb of unsequenced DNA lying between exons 2 and 3 of the Creb-rp-like sequence (Fig. 6).

To identify a possible cap site for adrenal Creb-rp mRNA, we synthesized a genomic anti-sense RNA probe encompassing bases -178 to +137 (Fig. 6). Both adrenal and JEG-3 cell RNAs protected a fragment of ~130 bases (not shown), indicating that the Creb-rp mRNA in human fetal adrenal and JEG-3 cells is initiated from the same transcriptional cap site as that described by Min et al. (36). Northern blotting identified the expected 2.6 kb Creb-rp mRNA in human fetal muscle and adrenal and in JEG-3 choriocarcinoma cells and a large ~16 kb species of RNA hybridizing to the Creb-rp probe in JEG-3 cells (Fig. 8); the nature of this large transcript is unknown.

No alternate splicing in the FN-III repeats of TN-X

The structures of TN-X, TN-C and TN-R all consist of similar modular structures, containing varying numbers of FnIII repeats. Both TN-C and TN-R undergo alternate splicing that can delete the middle cluster of FnIII repeats, preserving five repeats at the amino terminus and the three at the carboxy-terminus. Because the sequences of these eight retained FnIII repeats are highly conserved in TN-X, TN-C and TN-R, we (15) and others (32) have suggested that the 23 FN-III domains 6 to 26 of TN-X may be alternatively spliced in a similar fashion. To test this hypothesis, we probed 2.7 kb of the 3’ end of the TN-X coding region by RNase protection using riboprobes generated from three different RACE clones and eight different subclones of the

Figure 9. Alternative splicing of the 3' portion of human TN-X gene. (A) Exons 25-39 of TN-X, encoded by the 2.7 kb cDNA fragment (9), are shown with the corresponding FnIII repeats shown in lower case Roman numerals. Three RACE probes (marked by R) and eight cDNA probes derived from the 2.7 kb cDNA were used to generate the riboprobes shown by horizontal bars. The promoters of XA (8) and XB-S (10) are located upstream from exon 27 in the XA and XB genes, respectively. The XA gene contains an internal deletion, shown by the alternate splicing pattern between exons 30 and 31. (B) Analysis of the possible alternative splicing pattern between FN-III repeats 26 and 27 by RNase protection using a RACE probe covering 181 bases of exon 28 and 40 bases of exon 29. Total RNA isolated from M (muscle) and A (adrenal) protected the full 221 base coding region.


2.7 kb cDNA (9) (Fig. 9A). Together, these probes encompassed TN-X exons 25 to 39, which encode FN-III repeats 23-29 and the fibrinogen-like knob at the carboxy-terminus of the protein. We did not detect any alternatively spliced transcripts at the predicted site between FN-III repeats 26 and 27 (Fig. 9B) or at any other site between exons 25 and 39, in either adrenal or muscle RNA. However, both RNase protection and RACE experiments did re-confirm the presence of the XA transcript (8) and the XB-S transcript (10) in adrenal but not in muscle RNA. This lack of alternate splicing in the FnIII repeats is consistent with our detection of only one major size species of TN-X mRNA on Northern blots (Fig. 1 and fig. 5 of reference 8). Thus TN-X appears to be unique among the family of tenascins in not undergoing alternate splicing in the FnIII domain.
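The logic of the protection assay in Fig. 9B can be illustrated with a small calculation. The following is a minimal sketch, not the authors' analysis: it assumes an idealized assay in which RNase cuts the probe wherever two adjacent probe segments are not contiguous in the mRNA, and it ignores vector-derived probe sequence and partial digestion. The function name and the hypothetical "alternative" transcript (in which exon 28 is not joined to exon 29) are illustrative only.

```python
def protected_fragments(probe_segments, joined_in_mrna):
    """Return the lengths of probe fragments protected by an mRNA.

    probe_segments: list of (exon_name, bases) in 5'->3' order along the probe.
    joined_in_mrna: set of (exon_a, exon_b) pairs that are contiguous in the mRNA.
    Adjacent probe segments stay in one protected fragment only if their
    junction is present in the mRNA; otherwise the probe is cut between them.
    """
    fragments = []
    current = probe_segments[0][1]
    for (prev_name, _), (name, bases) in zip(probe_segments, probe_segments[1:]):
        if (prev_name, name) in joined_in_mrna:
            current += bases
        else:
            fragments.append(current)
            current = bases
    fragments.append(current)
    return sorted(fragments, reverse=True)


# Probe from Fig. 9B: 181 bases of exon 28 followed by 40 bases of exon 29.
probe = [("exon28", 181), ("exon29", 40)]

# Transcript in which exon 28 is spliced directly to exon 29 (as observed).
constitutive = protected_fragments(probe, {("exon28", "exon29")})

# Hypothetical transcript in which exon 28 is NOT joined to exon 29.
alternative = protected_fragments(probe, set())

print(constitutive)  # [221]
print(alternative)   # [181, 40]
```

Under these assumptions, a single 221 base protected fragment is the signature of uninterrupted exon 28 to exon 29 splicing, whereas an alternatively spliced transcript would yield shorter fragments.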

DISCUSSION

The human TN-X gene spans about 100 kb, overlapping the Creb-rp and P450c21B genes at its 5’ and 3’ ends, respectively. The C4/P450c21/TN-X/Creb-rp gene locus may be the most complex locus in the human genome (Fig. 10). At least 13 transcriptional units lie within only 160 kb, including several genes that overlap in the same orientation (Z within C4, P450c21 within Y, XB-S within XB) and in the opposite orientation (XA, XB-S, and XB within P450c21). This organization appears to be conserved in rodents (24,30) and cattle (6,7), although these species have not been studied in detail. While many human overlapping gene-within-a-gene systems have been described, none to date approaches the complexity and intricacy of this locus.

Transcription of the human TN-X gene can be initiated from three different promoters. The principal (class I) promoter defines a 5’ untranslated exon located more than 10 kb upstream from the exon containing the translational initiation signal. Two additional promoters and alternate initial series of exons were used in adrenocortical carcinoma NCI-H295 cells, where transcripts are initiated from the promoter of Creb-rp or from a region at the 3’ end of Creb-rp and are spliced to the TN-X coding cassette. No mutations were found in the intron-exon boundaries of PCR-amplified DNA from NCI-H295 cells, and RNase protection experiments with probes containing the introns separating exons 6-8 of the Creb-rp-like sequence did not detect RNA species in the expected amounts. Hence the alternate promoter choice and RNA splicing of TN-X in NCI-H295 cells are not due to mutations in these tumor cells. Tenascin-C, a protein related to TN-X, has anti-adhesive actions in vitro (23), suggesting that tenascin-like molecules might play a part in tumor metastases. TN-X is expressed from different promoters in normal and malignant human adrenal cells, but we have no information about the activity of TN-X in adrenal malignancy. Small amounts of class I transcripts derived from the principal promoter were found in NCI-H295 nuclear RNA, suggesting that the principal promoter has much lower activity in NCI-H295 adrenal carcinoma cells than in normal fetal adrenal cells. RNase protection experiments showed that the ratio of nuclear to cytoplasmic localization of 5’ ends of TN-X mRNA in NCI-H295 cells was about 10:1; by contrast, the nuclear to cytoplasmic ratio for 3’ ends is about 1:3 (37).
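The nuclear to cytoplasmic ratios quoted above can be restated as the fraction of signal found in the nucleus. This is a small illustrative calculation only; it treats the approximate ratios (10:1 and 1:3) as exact, and the function name is hypothetical.

```python
from fractions import Fraction


def nuclear_fraction(nuclear, cytoplasmic):
    """Convert a nuclear:cytoplasmic ratio into the fraction that is nuclear."""
    return Fraction(nuclear, nuclear + cytoplasmic)


five_prime = nuclear_fraction(10, 1)   # 5' ends of TN-X mRNA, ratio about 10:1
three_prime = nuclear_fraction(1, 3)   # 3' ends of TN-X mRNA, ratio about 1:3

print(five_prime, round(float(five_prime), 2))    # 10/11 0.91
print(three_prime, round(float(three_prime), 2))  # 1/4 0.25
```

On these figures, roughly nine-tenths of the 5’ end signal is nuclear, whereas only about one quarter of the 3’ end signal is.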

Figure 10. Structure of the C4/P450c21/TN-X/Creb-rp locus on 6p21.3. The centromere is to the right and the telomere is to the left, the top line shows the scale in kilobases, the boundaries of the duplication of the A and B regions (8) are shown as vertical dotted lines, and the arrows indicate transcriptional orientation. The XB gene is truncated in the enlarged lower diagram, as shown, but its 5' and 3' ends are shown at the same scale. I, II and III designate the three alternate promoters of TN-X as defined by the three classes of RACE clones shown in Figures 5 and 6; I is the principal TN-X promoter. The location of the 3' end of the Creb-rp gene has not been determined precisely, but appears to lie downstream from the principal TN-X promoter.


It is not clear why TN-X 5’ ends are retained in the nuclei of these cells. Interaction of transcripts with splicing machinery can cause nuclear retention of pre-mRNAs (38,39); however, we did not detect retained intronic sequences in alternatively spliced transcripts. Thus it is most likely that the abundant TN-X 3’ sequences seen in cytoplasmic RNA represent mainly the XA (8) and XB-S (10) sequences. The low level of expression from the principal promoter, plus the retention of most class II and III TN-X transcripts in the nucleus, render NCI-H295 cells unsuitable for transfection studies to analyze the activity of the TN-X principal promoter.

The TN-X promoter has not been characterized in other species, hence evolutionary comparisons cannot be made. However, the human (40), mouse (41), and chicken (42) TN-C promoters are highly conserved (33). A 250 base proximal promoter region of these genes contains a TATATAA box, the octamer motif ATGCAAAT, two homopolymeric dA-dT tracts and the homeodomain binding site TAAT. Only the two homopolymeric dA-dT tracts at about position -250 and a homeodomain binding site at -210 were found in the principal promoter of the TN-X gene. Scanning for transcription factor binding sites did not identify other known motifs in ~2 kb of 5’ flanking DNA (Fig. 2). As the expression of TN-X and that of TN-C are often reciprocal in different tissues (24,30), it may not be surprising that these two promoters contain different regulatory elements.
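The kind of motif comparison described above can be sketched as a simple exact-match scan. This is an illustration under stated assumptions, not the TFSEARCH procedure used in the paper: it searches one strand only, uses exact string matches for the motifs named in the text, and defines a homopolymeric dA-dT tract with an assumed minimum length (`min_len`). The toy sequence is invented and is not TN-X or TN-C promoter sequence.

```python
import re

# Motifs named in the text for the conserved TN-C proximal promoter.
MOTIFS = {
    "TATA-like box": "TATATAA",
    "octamer": "ATGCAAAT",
    "homeodomain site": "TAAT",
}


def find_motifs(seq, motifs=MOTIFS):
    """Return {motif name: [0-based start positions]} on the given strand."""
    seq = seq.upper()
    hits = {}
    for name, motif in motifs.items():
        # A lookahead reports overlapping occurrences as well.
        hits[name] = [m.start() for m in re.finditer("(?=" + motif + ")", seq)]
    return hits


def find_da_dt_tracts(seq, min_len=8):
    """Return (start, length) of homopolymeric A or T runs of at least min_len.

    min_len is an assumed threshold, not a value taken from the paper.
    """
    pattern = re.compile("A{%d,}|T{%d,}" % (min_len, min_len))
    return [(m.start(), len(m.group())) for m in pattern.finditer(seq.upper())]


# Toy sequence (invented): two dA-dT tracts, one TAAT, one octamer, no TATA box.
toy = "GGC" + "A" * 10 + "CGC" + "TAAT" + "GGC" + "T" * 9 + "GC" + "ATGCAAAT" + "CC"

motif_hits = find_motifs(toy)
tract_hits = find_da_dt_tracts(toy)
print(motif_hits)
print(tract_hits)
```

A real analysis would also scan the reverse complement and allow degenerate matches, which is what matrix-based tools such as TFSEARCH are designed to do.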

Our class III RACE clones were 99% identical to the Creb-rp transcript (36), but major differences were observed in the splicing pattern of these mRNAs. A 1191 base region of Creb-rp mRNA encoding the leucine-zipper and basic amino acid domains was spliced out from adrenal transcript A3, and the first exon of A3 used an alternative 5’ splice site. The third exon also used an alternative 3’ splice site and all three clones used different splice donor sequences for exon 7. Thus these alternately spliced forms of TN-X will not encode Creb-rp activity.

In contrast with predictions based on homology with TN-C and TN-R, no alternatively spliced transcripts were detected in the 3’ end of TN-X mRNA in either muscle or adrenal RNA. The presence of a single 12-16 kb TN-X mRNA in muscle (Fig. 4) and adrenal (8) is also consistent with the lack of alternate splicing in the FnIII domains of human TN-X. By contrast, mouse TN-X mRNA is seen as two discrete 13 and 11 kb bands of approximately equal abundance (24). The presence of a single size-class of TN-X mRNAs, and presumably of TN-X protein monomers, is not a general characteristic of all mammalian TN-X and suggests that all 29 FnIII repeats may be needed for the function of human TN-X, despite its apparent evolutionary redundancy with other tenascins.

MATERIALS AND METHODS

Cosmid screening, mapping and subcloning

A cosmid library in pWE15 (Stratagene, San Diego, CA) was plated at (0.5-2.0) × 10⁵ colonies/plate and colony-lifted filters were processed as described (43). Filters were probed in 0.1 M Na phosphate (pH 7.5), 5 mM EDTA, 7% SDS, 100 µg/ml sonicated salmon sperm DNA and 50 µg/ml E. coli tRNA. Hybridization with ~10⁷ c.p.m. of riboprobe (S.A. ~10⁹ c.p.m./µg) was done at 68℃ for 2 h. Filters were washed in 0.1× SSC, 0.1% SDS at 65℃ for 30 min. Hybridization-positive colonies were purified by two additional rounds of plating and probing. Cosmid DNA was prepared by the alkaline lysis method (44). Complete and partial SmaI digestions of cosmid clone cTNX:7, containing an insert of ~34 kb, were analyzed by Southern blotting from a 0.6% agarose gel. Riboprobes were generated from 5’ and 3’ portions of the cosmid clone by T3 and T7 RNA polymerases. Ordering of the internal SmaI fragments was done with riboprobes transcribed from SalI-digested RACE clones.

SmaI fragments were shotgun subcloned into a SmaI-digested pBluescript KS vector (Stratagene), and individual subclones were identified by colony hybridization to various riboprobes. Further in situ subcloning of selected clones was done by digestion with a single restriction enzyme (or with two different enzymes, followed by blunting the ends with Klenow polymerase) using enzymes having sites in the multicloning site of the vector. Following ligation and transformation into E. coli, recombinant KS clones were prepared by alkaline lysis (44).

DNA sequencing and analysis

Genomic and cDNA fragments subcloned into pBluescript vector were sequenced as double-stranded templates using Sequenase (United States Biochemical Corp), T3 and T7 primers, and [35S]dATP as recommended by the manufacturer. All RACE clones were sequenced on both strands. Sequence analysis was by DNA Inspector (TEXTCO, West Lebanon, NH). The DNA sequence analysis programs BLAST (45), FASTA (46), BLITZ (47), BLOCKS (48) and TFSEARCH (49) were accessed by internet (e-mail/www) and used for analysis of all newly determined sequences. BLITZ is based on the algorithm of Smith and Waterman (50) and TFSEARCH is based on the database of Wingender (51). DNA sequences are deposited with GenBank, under accession numbers U52693-U52701.

RNA isolation

Total RNA was isolated (52) from frozen human fetal adrenal and muscle tissues, and cytoplasmic and nuclear RNA fractions were purified from human adrenocortical carcinoma NCI-H295 cells grown in suspension (35). A sub-line of confluent, adherent cells termed NCI-H295A cells (53) was lysed directly in the flask with 4 ml of lysis buffer containing 10 mM Tris-HCl (pH 8.0), 140 mM NaCl, 1.5 mM MgCl2, 0.5% Nonidet P-40 and 1 mM DTT. Nuclei were pelleted at 4℃ by centrifugation at 3000 g, washed twice, and suspended in 2 ml of lysis buffer. RNA fractions were treated with DNase and proteinase K, and extracted with phenol. Human JEG-3 choriocarcinoma cells were grown and used to prepare RNA as described (54).

Rapid amplification of cDNA ends (RACE)

RACE (55) was done with the following modifications. First strand cDNA synthesis was initiated from a 200-400 base antisense RNA annealed to the mRNA of interest in 10 µl of 80% formamide, 50 mM PIPES (pH 6.4), 0.4 M NaCl, and 1 mM EDTA-Na2 overnight at 45℃. Before adding reverse transcriptase (Superscript II, BRL), annealed RNAs were precipitated with 3 volumes of ethanol and dissolved in 10 µl of the first strand synthesis buffer (Gibco-BRL). cDNA synthesis was carried out at 42℃ for 1 h. The reaction was stopped by adding 90 µl of RNase digestion cocktail containing 10 mM Tris-HCl (pH 7.4), 300 mM NaCl, 5 mM EDTA-Na2 (pH 7.5) and 50 µg/ml RNase A. Proteinase K and phenol treatments were as described for RNase protections (8). Free deoxynucleotides were removed by two cycles of ethanol precipitation. cDNA was tailed with dCTP and terminal transferase (Gibco-BRL) according to the manufacturer’s protocol. Subsequent PCR reactions were carried out using sense primers corresponding to the 5’ portion of the newly synthesized cDNA and the antisense anchor primer GCCACGCGTCGACTAGTAC(G)12 using the PCR program 94℃ for 30 s, 50℃ for 30 s, 72℃ for 1 min for 35 cycles. PCR products were treated with Klenow polymerase to remove 3’ overhangs and to complete strand synthesis before size-selection on a 1% NuSieve (FMC) agarose gel in Tris-borate-EDTA buffer. cDNA fragments of 200-600 bp were isolated from gel slices by extraction with hot phenol, concentrated by butanol extraction, and precipitated with ethanol. Where necessary, nested PCR was carried out to increase the pool of specific cDNA products. All RACE products were digested with SalI and cloned into a SmaI-SalI double digested pBluescript KS vector. Screening and colony selection of cDNA clones was done by hybridization and direct colony PCR with T3 and T7 primers.

RACE clones are named according to the source of RNA used: M, human fetal striated muscle; A, human fetal adrenal; N, human NCI-H295 adrenocortical carcinoma cells. The sequences of RACE clones A3 (U52696), M1 (U52699), N1 (U52700), and N2 (U52701) have been deposited with GenBank.
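Two details of the RACE protocol above lend themselves to a quick check: the anchor primer written as GCCACGCGTCGACTAGTAC(G)12, and the 35-cycle PCR program. The sketch below expands the primer, locates the SalI recognition sequence (GTCGAC) within it, which is consistent with the SalI digestion used before cloning, and sums the programmed hold times. Variable and function names are illustrative, and the time estimate ignores ramping and any initial denaturation or final extension steps, which the text does not specify.

```python
# Anchor primer from the text: a 19 base core followed by (G)12,
# which can pair with the dC tail added by terminal transferase.
ANCHOR_CORE = "GCCACGCGTCGACTAGTAC"
ANCHOR_PRIMER = ANCHOR_CORE + "G" * 12

SALI_SITE = "GTCGAC"  # SalI recognition sequence


def site_positions(seq, site):
    """Return all 0-based start positions of site in seq (overlaps allowed)."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


# One PCR cycle as (temperature in degrees C, hold time in seconds).
CYCLE = [(94, 30), (50, 30), (72, 60)]
N_CYCLES = 35


def cycling_minutes(cycle, n_cycles):
    """Total programmed hold time in minutes, excluding ramp times."""
    return n_cycles * sum(seconds for _, seconds in cycle) / 60


primer_length = len(ANCHOR_PRIMER)
sali_hits = site_positions(ANCHOR_PRIMER, SALI_SITE)
total_minutes = cycling_minutes(CYCLE, N_CYCLES)

print(primer_length)   # 31
print(sali_hits)       # [7]
print(total_minutes)   # 70.0
```

The single SalI site in the anchor primer means that every anchored product carries a SalI end, while the other end, blunted by Klenow treatment, is compatible with the SmaI-cut side of the vector.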

RNase protection

RACE and genomic DNA fragments subcloned into pBluescript vector were linearized with different restriction enzymes and transcribed with T3 and T7 RNA polymerase (Promega, Madison, WI) in the presence of [α-32P]UTP. If not stated otherwise, UTP concentrations of 1 µM and 5 µM were used to generate riboprobes of <400 and >400 bases, respectively. Processing of RNA samples and analysis of protected fragments was done as described (8,15). These riboprobes were also used for colony screening and Northern blotting. Quantitation of the 32P-RNA species was done by scintillation counting and by direct comparison of the RNA band intensities with prepared standards.

Northern blotting

Total RNA samples prepared from tissues or cell lines were loaded on a formaldehyde-agarose gel and electrophoresed at 100 V for 3 h. The gel was soaked in 50 mM NaOH and 1 mM EDTA-Na2 for 30 min, followed by soaking in 20× SSC and transfer for 25 h. UV cross-linked blots were hybridized with riboprobes in the presence of about 1000-fold excess of a competitor RNA derived from a linearized KS vector. Hybridization cocktail and conditions were as described above for cosmid library screening. The 3’ TN-X cDNA probe was a 282 base riboprobe carrying 268 bases from the 3’ end of the 2.7 kb TN-X cDNA (9), generated by XmnI digestion, treatment with Klenow polymerase and dATP, and transcribed with T3 RNA polymerase in the presence of 0.5-10 µM [32P]UTP.

ACKNOWLEDGEMENTS

We thank Meng Kian Tee and James Bristow for helpful discussions and for review of the manuscript. This work was supported by NIH Grant DK37922 (WLM), by a minority supplement to DK37922 (FB), and by March of Dimes Grant 6-0098 (WLM).

REFERENCES

1. Carroll, M.C., Campbell, R.D. and Porter, R.R. (1985) Mapping of steroid 21-hydroxylase genes to complement component C4 genes in HLA, the major histocompatibility locus in man. Proc. Natl Acad. Sci. USA 82, 521-525.

2. White, P.C., Grossberger, D., Onufer, B.J., Chaplin, D.D., New, M.I., Dupont, B. and Strominger, J.L. (1985) Two genes encoding steroid 21-hydroxylase are located near the genes encoding the fourth component of complement in man. Proc. Natl Acad. Sci. USA 82, 1089-1093.

3. White, P.C., New, M.I. and Dupont, B. (1984) HLA-linked congenital adrenal hyperplasia results from a defective gene encoding a cytochrome P450 specific for steroid 21-hydroxylation. Proc. Natl Acad. Sci. USA 81, 7505.

4. Amor, M., Tosi, M., Duponchel, C., Steinmetz, M. and Meo, T. (1985) Liver cDNA probes disclose two cytochrome P450 genes duplicated in tandem with the complement C4 loci of the mouse H-2S region. Proc. Natl Acad. Sci. USA 82, 4453-4457.

5. Chung, B., Matteson, K.J. and Miller, W.L. (1985) Cloning and characterization of the bovine gene for steroid 21-hydroxylase (P450c21). DNA 4, 211-219.

6. Chung, B., Matteson, K.J. and Miller, W.L. (1986) Structure of the bovine gene for P450c21 (steroid 21-hydroxylase) defines a novel cytochrome P450 gene family. Proc. Natl Acad. Sci. USA 83, 4243-4247.

7. Skow, L.E., Womack, J.E., Petresh, J.M. and Miller, W.L. (1988) Synteny mapping of the genes for steroid 21-hydroxylase, alpha-A-crystallin, and class I bovine leukocyte antigen (BoLA) in cattle. DNA 7, 143-149.

8. Gitelman, S.E., Bristow, J. and Miller, W.L. (1992) Mechanism and consequences of the duplication of the human C4/P450c21/Gene X locus. Mol. Cell. Biol. 12, 2124-2134.

9. Morel, Y., Bristow, J., Gitelman, S.E. and Miller, W.L. (1989) Transcript encoded on the opposite strand of the human steroid 21-hydroxylase/complement component/C4 gene locus. Proc. Natl Acad. Sci. USA 86, 6582-6586.

10. Tee, M.K., Thomson, A.A., Bristow, J. and Miller, W.L. (1995) Sequences promoting the transcription of the human XA gene overlapping P450c21A correctly predict the presence of a novel, adrenal-specific, truncated form of Tenascin-X. Genomics 28, 171-178.

11. Bristow, J., Gitelman, S.E., Tee, M.K., Staels, B. and Miller, W.L. (1993) Abundant adrenal-specific transcription of the human P450c21A ‘pseudogene’. J. Biol. Chem. 268, 12919-12924.

12. Tee, M.K., Babalola, G.O., Aza-Blanc, P., Speek, M., Gitelman, S.E. and Miller, W.L. (1995) A promoter within intron 35 of the human C4A gene initiates adrenal-specific transcription of a 1 kb RNA: location of a cryptic CYP21 promoter element? Hum. Mol. Genet. 4, 2109-2116.

13. Sargent, C.A., Anderson, M.J., Hsieh, S.L., Kendall, E., Gomez-Escobor, N. and Campbell, R.D. (1994) Characterization of the novel gene G11 lying adjacent to the complement C4A gene in the human major histocompatibility complex. Hum. Mol. Genet. 3, 481-488.

14. Shen, L., Wu, L., Sanlioglu, S., Chen, R., Mendoza, A.R., Dangel, A.W., Carroll, M.C., Zipf, W.B. and Yu, C.Y. (1994) Structure and genetics of the partially duplicated gene RP located immediately upstream of the complement C4A and the C4B genes in the HLA class III region. J. Biol. Chem. 269, 8466-8476.

15. Bristow, J., Tee, M.K., Gitelman, S.E., Mellon, S.H. and Miller, W.L. (1993) Tenascin-X. A novel extracellular matrix protein encoded by the human XB gene overlapping P450c21B. J. Cell. Biol. 122, 265-278.

16. Jones, F.S., Burgoon, M.P., Hoffman, K.L., Crossin, B.A., Cunningham, B.A. and Edelman, G.M. (1988) A cDNA clone for cytotactin contains sequences similar to epidermal growth factor-like repeats and segments of fibronectin and fibrinogen. Proc. Natl Acad. Sci. USA 85, 2186-2190.

17. Spring, J., Beck, K. and Chiquet-Ehrismann, R. (1989) Two contrary functions of tenascin: dissection of the active sites by recombinant tenascin fragments. Cell 59, 325-334.

18. Gulcher, J.R., Nies, D.E., Marton, L.S. and Stefansson, K. (1989) An alternatively spliced region of the human hexabrachion contains a repeat of potential N-glycosylation sites. Proc. Natl Acad. Sci. USA 86, 1588-1592.

19. Nies, D.E., Hemesath, T.J., Kim, J.H., Gulcher, J.R. and Stefansson, K. (1991) The complete cDNA sequence of human hexabrachion (Tenascin). A multidomain protein containing unique epidermal growth factor repeats. J. Biol. Chem. 266, 2818-2823.

20. Nörenberg, U., Wille, H., Wolff, J.M., Frank, R. and Rathjen, F.G. (1992) The chicken neural extracellular matrix molecule restrictin: Similarity with EGF-, fibronectin type III-, and fibrinogen-like motifs. Neuron 8, 849-863.

21. Fuss, B., Wintergerst, E.S., Bartsch, U. and Schachner, M. (1993) Molecular characterization and in situ mRNA localization of the neural recognition molecule J1-160/180: a modular structure similar to tenascin. J. Cell. Biol. 120, 1237-1249.

22. Chiquet-Ehrismann, R., Kalla, P., Pearson, C.A., Beck, K. and Chiquet, M. (1988) Tenascin interferes with fibronectin action. Cell 53, 383-390.

23. Sage, E.H. and Bornstein, P. (1991) Extracellular proteins that modulate cell-matrix interactions. J. Biol. Chem. 266, 14831-14834.

24. Matsumoto, K., Saga, Y., Ikemura, T., Sakakura, T. and Chiquet-Ehrismann, R. (1994) The distribution of tenascin-X is distinct and often reciprocal to that of tenascin-C. J. Cell. Biol. 125, 483-493.

25. Chiquet-Ehrismann, R., Mackie, E.J., Pearson, C.A. and Sakakura, T. (1986) Tenascin: An extracellular matrix protein involved in tissue integrations during fetal development and oncogenesis. Cell 47, 131-139.

26. Tan, S.S., Prieto, A.L., Newgreen, D.F., Crossin, K.L. and Edelman, G.M. (1991) Cytotactin expression in somites after dorsal neural tube and neural crest ablation in chicken embryos. Proc. Natl Acad. Sci. USA 88, 6398-6402.

27. Weller, A., Beck, S. and Ekblom, P. (1991) Amino acid sequence of mouse tenascin and differential expression of two tenascin isoforms during embryogenesis. J. Cell. Biol. 112, 355-362.

28. Saga, Y., Yagi, T., Ikawa, Y., Sakakura, T. and Aizawa, S. (1992) Mice develop normally without tenascin. Genes Dev. 6, 1821-1831.

29. Steindler, D.A., Settles, D., Erickson, H.P., Laywell, E.D., Yoshiki, A., Faissner, A. and Kusakabe, M. (1995) Tenascin knockout mice: barrels, boundary molecules and glial scars. J. Neurosci. 15, 1971-1983.

30. Burch, G.H., Bedolli, M.A., McDonough, S., Rosenthal, S.M. and Bristow, J. (1995) Embryonic expression of tenascin-X suggests a role in limb, muscle, and heart development. Dev. Dynamics 203, 491-504.

31. Prieto, A.L., Anderson-Fisone, C. and Crossin, K.L. (1992) Characterization of multiple adhesive and counteradhesive domains in the extracellular matrix protein cytotactin. J. Cell. Biol. 119, 663-678.

32. Erickson, H.P. (1993) Tenascin-C, tenascin-R and tenascin-X: a family of talented proteins in search of functions. Current Opinion Cell Biol. 5, 869-876.

33. Chiquet-Ehrismann, R., Hagios, C. and Schenk, S. (1995) The complexity in regulating the expression of tenascins. BioEssays 17, 873-878.

34. Gazdar, A.F., Oie, H.K., Shackleton, C.H., Chen, T.R., Triche, T.J., Myers, C.E., Chrousos, G.P., Brennan, M.F., Stein, C.A. and LaRocca, R.V. (1990) Establishment and characterization of a human adrenocortical carcinoma cell line that expresses multiple pathways of steroid biosynthesis. Cancer Res. 50, 5488-5496.

35. Staels, B., Hum, D.W. and Miller, W.L. (1993) Regulation of steroidogenesis in NCI-H295 cells: a cellular model of the human fetal adrenal. Mol. Endocrinol. 7, 423-433.

36. Min, J., Shukla, H., Kozono, H., Bronson, S.K., Weissman, S.M. and Chaplin, D.D. (1995) A novel Creb family gene telomeric of HLA-DRA in the HLA complex. Genomics 30, 149-156.

37. Speek, M. and Miller, W.L. (1995) Interaction between the complementary mRNAs for steroid 21-hydroxylase (P450c21) and Tenascin-X is prevented by sequence-specific binding of nuclear proteins. Mol. Endocrinol. 9, 1655-1665.

38. Legrain, P. and Rosbash, M. (1989) Some cis- and trans-acting mutants for splicing target pre-mRNA to the cytoplasm. Cell 57, 573-583.

39. Chang, D.D. and Sharp, P.A. (1989) Regulation by HIV Rev depends upon recognition of splice sites. Cell 71, 527-542.

40. Gherzi, R., Carnemolla, B., Siri, A., Ponassi, M., Balza, E. and Zardi, L. (1995) Human tenascin gene. Structure of the 5’-region, identification, and characterization of the transcription regulatory sequences. J. Biol. Chem. 270, 3429-3434.

41. Copertino, D.W., Jenkinson, S., Jones, F.S. and Edelman, G.M. (1995) Structural and functional similarities between the promoter for mouse tenascin and chicken cytotactin. Proc. Natl Acad. Sci. USA 92, 2131-2135.

42. Jones, F.S., Chalepakis, G., Gruss, P. and Edelman, G.M. (1992) Activation of the cytotactin promoter by the homeobox-containing gene Evx-1. Proc. Natl Acad. Sci. USA 89, 2091-2095.

43. Lin, D., Shi, Y. and Miller, W.L. (1990) Cloning and sequence of the human adrenodoxin reductase gene. Proc. Natl Acad. Sci. USA 87, 8516-8520.

44. Birnboim, H.C. and Doly, J. (1979) A rapid alkaline extraction procedure for screening recombinant plasmid DNA. Nucleic Acids Res. 7, 1513-1523.

45. Altschul, S.F., Gish, W., Miller, W., Myers, E.W. and Lipman, D.J. (1990) Basic local alignment search tool. J. Mol. Biol. 215, 403-410.

46. Pearson, W.R. (1990) Rapid and sensitive sequence comparison with FASTP and FASTA. In Doolittle, R.F. (ed.), Methods in Enzymology. Academic Press, Inc., Vol. 183, pp. 63-98.

47. Sturrock, S.S. and Collins, J.F. (1993) MPsrch version 1.3. Biocomputing Research Unit, University of Edinburgh, UK.

48. Henikoff, S. and Henikoff, J.G. (1994) Protein family classification based on searching a database of blocks. Genomics 19, 97-107.

49. Akiyama, Y. (1995) TFSEARCH ver. 1.3. Kyoto University, Japan.

50. Smith, T.F. and Waterman, M.S. (1981) Identification of common molecular subsequences. J. Mol. Biol. 147, 195-197.

51. Wingender, E. (1994) Recognition of regulatory regions in genomic sequences. J. Biotechnol. 35, 273-280.

52. Chomczynski, P. and Sacchi, N. (1987) Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal. Biochem. 162, 156-159.

53. Rodriguez, H., Hum, D.W., Staels, B. and Miller, W.L. (1996) Transcription of the human genes for P450scc and P450c17 is regulated differently in human adrenal NCI-H295 cells than in mouse adrenal Y1 cells. (submitted).

54. Moore, C.C.D., Hum, D.W. and Miller, W.L. (1992) Identification of positive and negative placental-specific basal elements, a transcriptional repressor, and a cAMP response element in the human gene for P450scc. Mol. Endocrinol. 6, 2045-2058.

55. Frohman, M.A., Dush, M.K. and Martin, G.R. (1988) Rapid production of full-length cDNAs from rare transcripts: Amplification using a single gene-specific oligonucleotide primer. Proc. Natl Acad. Sci. USA 85, 8998-9002.
