ThPOK is a transcription factor that acts as a master regulator of CD4+ T cell lineage commitment. We report the first human disease caused by a genetic alteration in ThPOK, specifically, a damaging heterozygous de novo variant in ThPOK (NM_001256455.2:c.1080A>C, p.K360N). This patient exhibited the unusual constellation of persistent CD4+ T cell deficiency, allergy, interstitial lung disease, corneal vascularization and scarring, developmental delay, and growth failure. The ThPOKK360N variant displayed abnormal multimorphic activity, interfering with ThPOKWT (antimorph), failing to bind wild-type ThPOK consensus sequences (amorph), and showing novel DNA-binding specificity (neomorph). Single-cell RNA sequencing revealed defects in CD4+ and CD8+ T cell maturation and activation (hypomorph). Transcriptomic analysis of lentivirally transduced healthy control T cells and fibroblasts recapitulated these defects: ThPOKK360N-transduced T cells had impaired TCR activation, and ThPOKK360N-transduced fibroblasts had increased profibrotic gene expression. This novel human disease not only confirms ThPOK’s role in CD4+ T cell development but also uncovers novel roles in TCR activation and regulation of fibrotic pathways in fibroblasts.
Introduction
T helper–inducing POZ/Kruppel-like factor (ThPOK), encoded by ZBTB7B, is a transcription factor that belongs to the zinc finger and BTB/POZ domain-containing (ZBTB) family of proteins (Kappes, 2010). ThPOK acts both as an activator and a suppressor of transcription through direct binding to promoter regions of target genes and by interaction with various coregulators (Cheng et al., 2021). In recent years, ThPOK has garnered significant attention in immunological research, primarily through insights gained from murine models. The most well-studied function of ThPOK is its role in directing the differentiation of T cell lineages, particularly in promoting the development of CD4+ T cells and inhibiting the development of CD8+ T cells in the thymus (He et al., 2010). This discovery was initially made based on a spontaneous mutant mouse strain with biallelic missense variants in the second zinc finger domain of ThPOK. These so-called “helper-deficient” (HD) mice show a selective absence of mature CD4+ T cells and an increased frequency of CD8+ T cells. Although the immune defects in HD mice are well documented, other aspects of their phenotype, aside from the previously reported high incidence of highly vascularized corneal defects, remain less defined (Kappes et al., 2006). Beyond its role as a regulator of CD4+ T cells, first described in HD mice, ThPOK also participates in other biological processes, although these are less well understood. For example, ThPOK is highly expressed in the skin and downregulates type I collagen gene expression in murine skin fibroblasts, binding specifically to the regulatory regions of type I collagen genes (Col1a1 and Col1a2) (Galéra et al., 1996).
Despite advances in our understanding of the role of ThPOK in immunity, the direct implications of ZBTB7B variants in human health have remained largely unexplored. To date, disabling genetic changes in ThPOK have not been implicated as a cause of human monogenic disease. Here, we describe a patient with a de novo heterozygous pathogenic variant in ZBTB7B causing a complex syndromic phenotype including CD4+ T lymphopenia, CD8+ T lymphocytosis, early-onset allergic disease, severe fibroinflammatory interstitial lung disease, corneal defects, and failure to thrive. Notably, the unusual combination of CD4+ T lymphopenia and corneal defects in the patient was reminiscent of the phenotype of ThPOK-mutant HD mice. Our detailed mechanistic investigations reveal that this variant is multimorphic, exhibiting antimorphic dominant-negative (DN), amorphic and hypomorphic loss-of-function (LOF), and neomorphic gain-of-function (GOF) effects. Our findings, in the context of the whole human organism, not only reinforce the well-established role of ThPOK in T cell development, as demonstrated by the patient’s abnormalities in T cell development and function, but also emphasize the unexpected role of ThPOK in pulmonary fibrosis, as demonstrated by the increase in profibrotic gene signature in fibroblasts expressing the variant. These findings both refine and challenge the current understanding of ThPOK’s role in the immune system and open new avenues for exploring its broader implications in other tissues and body systems in humans.
Results
A novel de novo heterozygous ZBTB7B variant in a child with immunodeficiency, allergic disease, corneal defects, and interstitial lung disease
The patient is a 5-year-old male born to healthy non-consanguineous parents at 33 wk gestational age due to preterm rupture of membranes and maternal preeclampsia. There was a family history of asthma in his mother. The patient’s complex clinical course was notable for global developmental delay, failure to thrive with feeding dependent on a gastrostomy tube, bilateral sensorineural hearing loss, non-healing corneal perforation with significant vascularization and dense scarring causing visual impairment, hepatic steatosis, increased echogenicity of multiple tissues (i.e., pancreas, liver, and kidneys), growth hormone deficiency, allergic disease, combined immunodeficiency (CID), recurrent respiratory infections, and severe interstitial lung disease requiring supplemental oxygen (Fig. 1 A). Allergic manifestations included atopic dermatitis, cow’s milk protein colitis requiring elemental formula, multiple food allergies, anaphylaxis following general anesthesia, elevated IgE, and intermittent peripheral blood eosinophilia.
While no respiratory support was required in the immediate postnatal period, he required hospitalization at 4 mo of age for respiratory distress secondary to rhinovirus/enterovirus bronchiolitis. The immune evaluation showed CD4+ lymphopenia and CD8+ lymphocytosis, with reduced T cell proliferation to mitogens (phytohemagglutinin). Despite the low CD4+ T cell count, the patient’s T cell receptor (TCR) excision circle level had been normal at birth. T cell memory panel analysis demonstrated reduced naïve CD4+ T cell percentage, opposite to the finding of a high naïve percentage in CD8 T cells. He had normal to elevated B cell counts and hypergammaglobulinemia, affecting both IgG and IgA isotypes, with protective titers to tetanus and indeterminate vaccine titers to diphtheria (Table S1). Chest imaging demonstrated diffuse bilateral ground glass opacities, concerning for Pneumocystis jirovecii pneumonia, though this was not confirmed microbiologically. He received a treatment course of sulfamethoxazole–trimethoprim, followed by long-term sulfamethoxazole–trimethoprim and itraconazole prophylaxis on the basis of his CD4+ T cell deficiency.
The patient’s chest imaging abnormalities persisted despite the resolution of his acute respiratory illness (Fig. 1, B and C), and he was started on continuous oxygen supplementation at 16 mo of age following an abnormal overnight oximetry study. Lung biopsy at 3.75 years demonstrated chronic inflammation and fibrosis of the lung, characterized by lipoproteinosis and notable subpleural cystic remodeling and honeycomb changes (Fig. 1, D–I). Rheumatologic evaluation showed normal C-reactive protein and erythrocyte sedimentation rate, and negative antinuclear and rheumatoid factor antibodies. The patient’s lung disease was treated with pulse intravenous methylprednisolone followed by a prolonged course of oral prednisone, and he was started on dupilumab for his atopic conditions. Following initiation of immunosuppression, he developed hypogammaglobulinemia, for which he was started on immunoglobulin replacement at the age of 4 years.
Due to this unusual constellation of clinical features, trio whole-genome sequencing was performed which identified a novel heterozygous de novo variant in ZBTB7B (NM_001256455.2:c.1080A>C, p.K360N), predicted to be damaging by in silico tools (Table S2). The de novo nature of the variant was validated by Sanger sequencing of all family members (Fig. 1, J–L). The variant results in a substitution from a positively charged lysine residue to a polar uncharged asparagine residue at a highly conserved region of the ThPOK protein altering the first zinc finger in the DNA-binding domain of the transcription factor (Fig. 1 J and Fig. S1).
ThPOKK360N displays altered neomorphic DNA binding specificity
We assessed the effect of the NM_001256455.2:c.1080A>C (p.K360N) variant on protein expression using HEK293 cells. We selected HEK293 cells as our model system as these cells lack endogenous expression of ThPOK (https://www.proteinatlas.org/). Expression of ThPOKK360N was assessed both in isolation and in combination with ThPOKWT, modeling the patient’s heterozygous state. The expression of ThPOKK360N was indistinguishable from ThPOKWT both in isolation and in the heterozygous state, suggesting that the presence of the missense variant does not impact protein expression (Fig. 2 A and Fig. S2). Furthermore, coimmunoprecipitation (Co-IP) assays demonstrated that ThPOKK360N retains its ability to dimerize with ThPOKWT, as well as with itself, forming WT:WT, WT:K360N, and K360N:K360N dimers with comparable efficiency (Fig. 2 B). These findings suggest that the K360N variant does not impair ThPOK protein expression or dimerization, indicating that its functional effects on this transcription factor are likely mediated through other mechanisms, such as altered DNA binding or transcriptional activity.
Next, we evaluated the effect of the variant on the function of ThPOK as a transcription factor and its regulatory influence on SOCS1 expression. Building on Luckey et al.’s identification of ThPOK-binding sites within the mouse Socs1 promoter region (Luckey et al., 2014), we developed a luciferase reporter assay utilizing a segment of the human SOCS1 promoter containing the previously identified DNA binding motif of ThPOK (Jolma et al., 2013). We established that ThPOKK360N is impaired in its ability to upregulate SOCS1 transcriptional activity compared with ThPOKWT (Fig. 2 C). When cotransfecting cells to mimic the heterozygous state with both wild-type (WT) and p.K360N ZBTB7B plasmids, the transcriptional activity of SOCS1 exhibited an intermediate increase in comparison with the WT (Fig. 2 C). To further explore the potential impact of the p.K360N variant on ThPOK function, we systematically varied the ratio of the variant plasmid relative to the WT plasmid. We observed a significant inverse correlation between the proportion of variant plasmid (increased at the expense of WT plasmid) and SOCS1 transcriptional activity (Pearson correlation: r² = 0.9262, P value = 0.0005), suggesting that increasing the relative abundance of the variant may modulate transcriptional output in a dose-dependent manner (Fig. 2 D).
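As an illustration of how a correlation statistic of this kind is obtained, the following minimal Python sketch computes the Pearson coefficient r and its square for a plasmid titration. The data values, the variable names, and the function `pearson_r` are hypothetical and are not the measurements underlying Fig. 2 D; the associated P value, which would normally come from a statistics package, is not computed here.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical titration: fraction of variant plasmid in the transfection mix
# and reporter activity normalized to the WT-only condition (invented values).
variant_fraction = [0.0, 0.25, 0.5, 0.75, 1.0]
relative_luciferase = [1.00, 0.80, 0.62, 0.41, 0.20]

r = pearson_r(variant_fraction, relative_luciferase)
r_squared = r * r
print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
```

A negative r with a large r² corresponds to the inverse, approximately linear relationship described above.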
To further assess the DNA-binding ability of ThPOKK360N, we employed an electrophoretic mobility shift assay (EMSA) using a tagged 35 nucleotide probe from the SOCS1 promoter. We observed that ThPOKWT could effectively bind to this probe, while ThPOKK360N could not (Fig. S3). To ensure the observed binding was specific, a supershift assay was performed and confirmed that the ThPOK variant exhibited no detectable binding activity, indicating a loss of DNA-binding capability against a known target likely attributable to the missense change in the DNA-binding zinc finger (Fig. 2 E). Additionally, a competition assay with a 100-fold excess of unlabeled competitor probe substantially diminished the supershifted ThPOKWT–DNA complex band, further verifying the specificity of the binding (Fig. 2 F).
We then used structural modeling to evaluate the predicted impact of the amino acid change at position 360 on the ability of ThPOK to interact with DNA using the paralog ZBTB7A (PDB: 8E3D), which shares sequence homology in the DNA-binding zinc finger domain with ThPOK (Ren et al., 2023). The structural analysis of ZBTB7A bound to a DNA target revealed that the lysine residue, equivalent to the changed 360 site in ThPOK, forms direct hydrogen bonds with the bases of AG dinucleotide. Consequently, the variant in ThPOK likely disrupts these interactions, altering its DNA binding specificity (Fig. S4). To experimentally test this prediction and to directly compare the DNA binding profiles of ThPOKWT and ThPOKK360N, we utilized high-throughput systematic evolution of ligands by exponential enrichment (HT-SELEX). HT-SELEX is a powerful approach allowing unbiased determination of preferred DNA target motifs of transcription factors. The HT-SELEX confirmed the altered DNA binding specificity of ThPOKK360N compared with ThPOKWT (Fig. 2 F). Position weight matrices (PWM) revealed a shift in nucleotide preference in positions 3 and 4, specifically changing “GA” nucleotides, preferred by the WT, to “CT” or “AT” nucleotides, preferred by ThPOKK360N. Notably, both “C” and “A” were similarly preferred at position 3 for ThPOKK360N (Fig. 2 G). Additionally, the relative abundance of 8-mer sequences from HT-SELEX experiments revealed that sequences with high affinity for either ThPOKWT or ThPOKK360N generally had a low incidence in the counterpart’s selection. This distinct separation further underscores the divergent DNA binding specificities between ThPOKWT and ThPOKK360N (Fig. 2 H).
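The two summaries used above, a position weight matrix and 8-mer abundance, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the HT-SELEX analysis pipeline used in this study: the input sites are assumed to be pre-aligned and of equal length, and the toy sequences, function names, and variable names are all hypothetical.

```python
from collections import Counter

BASES = "ACGT"


def position_frequency_matrix(sites):
    """Per-position base frequencies for pre-aligned, equal-length DNA sites."""
    length = len(sites[0])
    for site in sites:
        if len(site) != length or set(site) - set(BASES):
            raise ValueError(f"invalid or misaligned site: {site!r}")
    matrix = []
    for i in range(length):
        counts = Counter(site[i] for site in sites)
        matrix.append({base: counts[base] / len(sites) for base in BASES})
    return matrix


def consensus(matrix):
    """Most frequent base at each position (ties resolved in A, C, G, T order)."""
    return "".join(max(BASES, key=lambda b: column[b]) for column in matrix)


def kmer_counts(reads, k=8):
    """Count every overlapping k-mer across a collection of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


# Toy aligned sites (invented): "GA" at positions 3-4 for the WT-like set,
# "CT" or "AT" at the same positions for the variant-like set.
wt_sites = ["ACGACCCC", "TCGACCCC", "ACGACCAC"]
variant_sites = ["ACCTCCCC", "TCATCCCC", "ACCTCCAC"]

wt_consensus = consensus(position_frequency_matrix(wt_sites))
variant_consensus = consensus(position_frequency_matrix(variant_sites))
print(wt_consensus, variant_consensus)
```

Comparing `kmer_counts` output between two selections, k-mer by k-mer, gives the kind of relative-abundance comparison summarized in the 8-mer plot.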
Given that the original “GA” dinucleotide at positions 3 and 4 in the WT binding sequence is highly conserved and exhibits low flexibility, changes to this sequence are likely to impact the binding specificity. To test this using EMSA, we created two variant oligonucleotide probes based on the highest “hit” of the 8-mer plot (Variant Probe 1, Fig. 2 H) and the consensus sequence for ThPOKK360N, as indicated by PWM obtained from HT-SELEX (Variant Probe 2, Fig. 2 G) that incorporated the dinucleotide modifications “AT” (3′…CATCCCCC…5′) or “CT” (3′…CCTCCACC…5′), respectively (see Table S3 for full-length probe sequences). Confirming the HT-SELEX findings, ThPOKK360N bound to variant probes 1 and 2 more effectively than ThPOKWT (Fig. 2 I).
To more broadly assess the DNA binding landscape of the transcription factor, we integrated the HT-SELEX motifs with corresponding chromatin immunoprecipitation followed by sequencing (ChIP-seq) data to identify high-confidence gene targets for both ThPOKWT and ThPOKK360N. A gene was designated as a high-confidence target based on the co-occurrence of ChIP-seq peaks in all replicates within the promoter regions (spanning from −1,000 to +500 bp relative to the transcription start site) that also harbored corresponding HT-SELEX motif hits (Fig. 2 J). Our analysis revealed distinct gene targets likely to be regulated by ThPOKWT and ThPOKK360N, as illustrated by the Venn diagram (Fig. 2 J), with limited overlap between the gene sets and associated pathways (Fig. 2 J and Data S1) targeted by each protein, suggesting that ThPOKWT and ThPOKK360N engage in differential regulatory networks. Specifically, pathway analysis revealed that ThPOKWT regulates genes associated with endothelin, fibroblast growth factor (FGF), and epidermal growth factor (EGF) receptor signaling pathways, which aligns with its known roles in cellular differentiation and lineage commitment. In contrast, the ThPOKK360N variant uniquely regulates genes enriched in the gonadotropin-releasing hormone receptor pathway, suggesting an altered regulatory function (Fig. 2 J). This pattern indicates potentially unique transcriptional roles for each protein, which may underlie differences in cellular responses and phenotypic outcomes.
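The high-confidence target criterion described above can be expressed as a simple interval test. The Python sketch below is illustrative only and rests on several assumptions not stated in the text: coordinates lie on a single chromosome, intervals are closed (start, end) pairs, the promoter window is oriented by gene strand, and a motif hit anywhere within the promoter window is sufficient. All function and variable names are hypothetical.

```python
def promoter_window(tss, strand, upstream=1000, downstream=500):
    """Promoter interval spanning -1,000 to +500 bp around the TSS, by strand."""
    if strand == "+":
        return (tss - upstream, tss + downstream)
    if strand == "-":
        return (tss - downstream, tss + upstream)
    raise ValueError("strand must be '+' or '-'")


def overlaps(a, b):
    """True if closed intervals a and b share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def is_high_confidence(tss, strand, replicate_peaks, motif_hits):
    """Require a promoter peak in every ChIP-seq replicate plus a motif hit."""
    window = promoter_window(tss, strand)
    peak_in_every_replicate = bool(replicate_peaks) and all(
        any(overlaps(window, peak) for peak in peaks)
        for peaks in replicate_peaks
    )
    motif_in_window = any(overlaps(window, hit) for hit in motif_hits)
    return peak_in_every_replicate and motif_in_window


# Invented example: one gene on the + strand with a TSS at position 10,000.
replicates = [[(9500, 9700)], [(10400, 10800)]]
motifs = [(9600, 9607)]
print(is_high_confidence(10000, "+", replicates, motifs))
```

Requiring a peak in every replicate rather than in any one replicate trades sensitivity for specificity, which is in keeping with the conservative definition given above.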
Patient T cells have impaired development, memory phenotype skewing, and enhanced Th2 effector function
ThPOK plays a critical role in the development of lymphoid cells, particularly in the lineage commitment of CD4+ T cells (Carpenter et al., 2012; Cheng et al., 2021; He et al., 2010; Kappes, 2010; Luckey et al., 2014) (Fig. 3 A). To define how ThPOKK360N impacts T cell development, we first performed immunophenotyping of the patient’s lymphoid compartment. Clinical flow cytometry data, collected over multiple time points, revealed consistent T cell abnormalities, and notably persistent CD4+ lymphopenia, CD8+ lymphocytosis, and a lower CD4+/CD8+ ratio, while largely maintaining a normal absolute CD3+ count in comparison with age-matched controls (Fig. 3 B). Clinical lymphocyte studies also indicated significant abnormalities in T cell naïve and memory subpopulations. The patient has a persistently low proportion of CD4+ naive T cells and a high proportion of CD4+ central memory and effector memory T cells. Additionally, the CD8+ T cell compartment shows an unusually high proportion of naive T cells, while the percentages of CD8+ central memory, effector memory, and terminally differentiated effector memory (TEMRA) T cells are consistently low. This pattern underscores a persistent imbalance in the patient’s T cell maturation and memory formation processes. Memory B cells also showed a skewed distribution, with low levels of switched, unswitched memory, and transitional B cells, and high levels of naive B cells (Table S1).
Along with the clinical observation of severe atopic manifestations in the patient, we also observed transient peripheral blood eosinophilia in the first 2 years of life and persistently elevated serum IgE levels (Fig. 3 C). This prompted an examination of the T helper 2 (Th2) subset in the patient’s primary cells. We conducted an immunophenotyping study using flow cytometry to analyze the T cell compartment using blood drawn from the patient at three different points over 2 years. For comparative analysis, we included three age-matched pediatric healthy controls (HC) and four adult HCs. We first confirmed a reduction in the frequency of total CD4+ T cells and an increase in CD8+ proportion in the CD3 compartment (Fig. 3 D). Within the CD4+ T cell population, there was an increase in the frequency of memory T cells alongside a substantial decrease in naive CD4+ T cells in the patient, in line with the clinically collected data (Fig. 3 E). Upon stimulation with phorbol 12-myristate 13-acetate (PMA) and ionomycin, a marked increase in Th2 cell activity was observed in the patient. This was characterized by an elevated production of Th2 effector cytokines, including IL-4, IL-5, and IL-13 (Fig. 3, F and G). These findings suggest that the patient exhibits distinct immune dysregulation, particularly in the T cell compartment, with a pronounced shift toward memory phenotypes and Th2 effector function.
Patient T cells exhibit impaired differentiation and activation
To understand the molecular underpinnings of the observed developmental T cell defect, we conducted single-cell RNA sequencing combined with antibody sequencing (scRNA-seq/Ab-seq) on the peripheral blood mononuclear cells (PBMCs) of the patient and two HCs, with and without TCR stimulation. Dimensionality reduction using uniform manifold approximation and projection (UMAP) on the unstimulated global transcriptomic signature did not place the patient’s T cells, natural killer (NK) cells, B cells, or monocytes in a separate cluster from those of the HCs (Fig. 4 A). Quantitative assessment of T cell subset proportions, as measured using surface marker sequencing, did confirm clinical flow cytometry data: the patient exhibited low CD4+ T cells and high CD8+ T cells (Fig. 4 B). However, T cell–specific clustering, visualized with UMAP, revealed that the patient’s T cells cluster separately from HC T cells. This difference in clustering was independent of CD4+, CD8+, or naïve and memory (using CD45RO) subset classification as determined by cell surface staining (Fig. 4 C). Patient T cells were largely naïve CD8+ but did not cluster with naïve T cells of HCs (Fig. 4 C).
Mechanistic insight was next derived from our transcriptomic analysis. CD4 lineage commitment has been shown to involve ThPOK-mediated SOCS1 expression and subsequent RUNX3 repression (Luckey et al., 2014) (Fig. 4 C). We previously demonstrated that ThPOKK360N had an impaired ability to activate the SOCS1 promoter (Fig. 2 B). We validated this finding in primary patient cells via single-cell transcriptomics showing that, when compared with HC cells, patient CD4 T cells have both lower baseline SOCS1 mRNA levels and the inability to upregulate SOCS1 mRNA levels following TCR stimulation (Fig. 4 E). Establishing that ThPOKK360N caused this impairment, HC T cells stably transduced with WT ThPOK upregulated SOCS1 expression (as measured via bulk transcriptomics) significantly more than those transduced with ThPOKK360N (Fig. 4 F). Mouse models have established that ThPOK represses Runx3 expression and promotes the CD4+ lineage fate (Luckey et al., 2014). Our data recapitulate this finding in primary human cells since RUNX3 transcript expression is higher in patient CD4 T cells when compared with HCs, further emphasizing the pathogenicity of ThPOKK360N (Fig. 4 G).
Next, we investigated gene signatures in naïve CD4+ and CD8+ T cells. While we did not see differences between the patient and HCs at baseline, we did see differences after TCR stimulation. Here, strong skewing was observed in both the naïve CD4+ and CD8+ compartments of patient T cells, suggestive of impaired TCR activation compared with HCs (Fig. 4, H–K). A lack of T cell activation was particularly evident in patient cells, which failed to upregulate proliferative signaling pathways in CD4+ T cells (e.g., the MYC pathway) and effector CD8+ T cell pathways (e.g., interferon pathways) (Fig. 4, I and K). These results suggest a significant role for ThPOK in the activation of matured single-positive T cells. To establish this link experimentally, we stably transduced T cells with EV, WT, or p.K360N ThPOK-containing lentiviral plasmids and cultured them for 3 days. We observed that many pathways (e.g., MYC proliferation and interferon effector) were upregulated in WT-transduced T cells, but these pathways failed to upregulate in p.K360N-transduced cells (Fig. 4 L). These results showed that the defects in T cell activation are consistent with a dominant-interfering effect of ThPOKK360N.
ThPOKK360N disrupts profibrotic gene suppression in pulmonary fibroblasts
In addition to the T cell defects, the patient carrying ThPOKK360N had evidence of multiorgan fibrosis, particularly pulmonary fibrosis (Fig. 1, B–I). This pathological fibrosis is notable given that murine ThPOK binds to the regulatory regions of collagen genes in the skin to inhibit their expression (Galéra et al., 1996). Using HEK293 cells, which express COL2A1 but lack endogenous expression of ThPOK, we compared the effects of transduced ThPOKK360N to ThPOKWT. We found that while ThPOKWT suppresses COL2A1 expression as expected, ThPOKK360N does not, leading to significantly higher COL2A1 levels, similar to the EV control (Fig. 5 A). Additionally, cells expressing both ThPOKWT and ThPOKK360N, mimicking a heterozygous state, also showed significantly increased COL2A1 expression compared with ThPOKWT alone, consistent with a dominant-interfering effect of ThPOKK360N on ThPOKWT in repressing collagen gene expression (Fig. 5 A). This led us to further validate our findings through stable lentiviral transduction, highlighting ThPOK’s critical role in regulating collagen gene expression. Since ThPOK is highly expressed in fibroblasts (https://www.proteinatlas.org/) and acts as a repressor of murine Col1a1 and Col1a2 (also highly expressed in fibroblasts) (Galéra et al., 1996), we utilized primary human pulmonary fibroblasts as a model system. HC pulmonary fibroblasts were stably transduced with either EV, WT, or p.K360N ThPOK-containing lentiviral plasmids (Fig. 5 B). Principal component analysis (PCA) of bulk RNA sequencing data showed clear and distinct clustering for each group and highlighted large differences in gene expression profiles influenced by the ThPOK genotype (Fig. 5 C). Differential analysis between EV-transduced and WT ThPOK-transduced fibroblasts showed significant gene expression changes, many of which are not observed to the same level in p.K360N variant-transduced cells (Fig. 5 D). 
Interestingly, overexpression of ThPOKK360N led to altered gene expression profiles compared with ThPOKWT, with a number of genes that were upregulated in the WT being downregulated in the variant and vice versa (Fig. 5, D and E). This shift indicates that the missense change in the first zinc finger of ThPOK significantly changes its regulatory impact, potentially altering its DNA-binding affinity, interaction with other proteins, or responsiveness to cellular signals.
Pathway analysis of the transcriptomic data revealed downregulation of critical pathways including interferon α and γ signaling, as well as the inflammatory response pathway in the p.K360N-transduced cells compared with WT-transduced controls. Conversely, ThPOKK360N drove the upregulation of several pathways, such as estrogen response late, Myc targets, and epithelial–mesenchymal transition when compared with WT-transduced controls (Fig. 5 F). The upregulation of the epithelial–mesenchymal transition, a pathway integral to fibrosis and wound healing, was notable as it suggested a potential role for p.K360N in exacerbating fibrotic conditions, highlighting a key area for further investigation. We next quantified the differential expression of key profibrotic genes that have been implicated in pulmonary fibrosis by multiple studies; specifically: ACTA2 (Alsafadi et al., 2017; Chioccioli et al., 2022; Hu et al., 2012; Liu et al., 2021; Nosrati et al., 2023; Peyser et al., 2019; Rock et al., 2011; Sun et al., 2022; Wynn, 2008), TGFB1 (Bonner, 2010; Enomoto et al., 2023; Wynn, 2008), LTBP2 (Enomoto et al., 2018; Peyser et al., 2019; Zou et al., 2021), BDNF (Cherubini et al., 2017), COL1A1 (Chioccioli et al., 2022; Jia et al., 2023), and MDK (Zhang et al., 2023) (Fig. 5 G). Notably, these profibrotic genes all had increased transcript abundance in fibroblasts stably expressing ThPOKK360N compared with WT controls. These findings suggest that ThPOKK360N represents an LOF in its ability to suppress profibrotic gene expression in fibroblasts, thereby highlighting the crucial role of ThPOK in maintaining human cellular homeostasis by modulating fibrotic processes.
Reversal of fibrotic gene signature in patient-derived pulmonary fibroblasts is achieved by overexpression of WT ThPOK or pirfenidone treatment
We next utilized patient-derived primary pulmonary fibroblasts to determine if stable overexpression of ThPOKWT could reverse the observed fibrotic gene signature (Fig. 5 H). Our results revealed that the previously identified profibrotic genes in the transduced HC pulmonary fibroblasts exhibited similar differential regulation patterns in transduced patient-derived cells. Notably, there was a decrease in transcript abundance of the profibrotic genes in the patient-derived fibroblasts stably expressing the WT gene, thereby indicating a successful rescue of the patient-derived cells from the fibrotic phenotype (Fig. 5 I).
Building on our previous findings, we tested the efficacy of two Food and Drug Administration (FDA)–approved pulmonary fibrosis drugs, pirfenidone (Jin et al., 2019; King et al., 2014; Lehtonen et al., 2016) and nintedanib (Aimo et al., 2022; Jin et al., 2019; Lehtonen et al., 2016), on patient-derived pulmonary fibroblasts in their steady state and also under fibrosis-induced conditions. This approach was designed to better mimic the fibrotic microenvironment of the patient’s lungs, acknowledging that in vivo, cells are influenced by surrounding signals and are not isolated. For the induction of a fibrotic cell model, we employed transforming growth factor β (TGF-β) stimulation, a method well-established in the literature for simulating a fibrosis response in human fibroblasts (Chioccioli et al., 2022; Jin et al., 2019). We used ACTA2 expression via quantitative PCR (qPCR) as our primary readout due to its recognition in the literature as a critical marker of myofibroblasts and activated fibroblasts, key players in fibrotic disease progression across various organs, including the lungs (Alsafadi et al., 2017; Chioccioli et al., 2022; Hu et al., 2012; Liu et al., 2021; Nosrati et al., 2023; Peyser et al., 2019; Rock et al., 2011; Sun et al., 2022; Wynn, 2008). Optimal in vitro treatment doses were determined by conducting dose-response experiments and assessing cytotoxicity in our model system using the lactate dehydrogenase (LDH) release cytotoxicity assay (Fig. S5). We demonstrated that treatment with pirfenidone (1 mM) over a 48-h period reduced ACTA2 expression in TGF-β-treated fibroblasts in comparison with untreated controls (Fig. 5 J). In contrast, a similar 48-h treatment regimen with nintedanib (0.1 μM) did not produce a comparable effect (Fig. 5 J).
This finding highlights the efficacy of pirfenidone in modulating the fibrotic gene signature and may underscore the drugs’ distinct mechanisms of action and their varied impact on fibrotic pathways in patient-derived pulmonary fibroblasts.
Discussion
We describe the first reported human with a monogenic disorder caused by the disruption of ZBTB7B, which encodes the transcription factor ThPOK. This patient, who was heterozygous for a variant in ZBTB7B, presented with a complex syndromic phenotype including CID, severe atopy, severe fibroinflammatory interstitial lung disease, corneal vascularization and scarring, sensorineural hearing loss, global developmental delay, and growth failure. Our investigations align with the established guidelines for genetic studies in single patients, as proposed by Casanova and colleagues (Casanova et al., 2014): the ThPOKK360N variant is not found in the unaffected individuals; functional investigations indicate that ThPOKK360N exhibits damaging multimorphic effects; and the causal relationship between ThPOKK360N and the clinical phenotype was confirmed through gene transfer experiments in both T cells and pulmonary fibroblasts.
The discovery of a monogenic defect in ThPOK in the patient offers an exciting opportunity to further define the role of ThPOK in human immunology, providing a direct in vivo context that was previously inaccessible and primarily inferred from HD mice and in vitro human studies. Immune system abnormalities caused by an absence of ThPOK function have been extensively investigated in HD mice, where homozygous mutants display a selective block in CD4+ T cell development, leading to severe CD4+ lymphopenia and increased numbers of CD8+ T cells (Dave et al., 1998). As such, ThPOK has become best known for its role as a “master regulator” of CD4+ lineage commitment in the thymus, where it has been shown to be crucial for CD4+ versus CD8+ lineage decisions, particularly in MHC-II-restricted CD69+ CD4+ CD8low intermediate thymocytes (Carpenter et al., 2012; Cheng et al., 2021; He et al., 2005, 2008, 2010; Kappes, 2010; Sun et al., 2005; Wang et al., 2008a, 2008b). Interestingly, the patient carrying a heterozygous variant in ThPOK exhibits an immune phenotype that mirrors the null state seen in homozygous HD mice. Our observations of persistent CD4+ lymphopenia and CD8+ lymphocytosis in the patient’s T cell compartment recapitulate the phenotypic alterations seen in HD mice (Luckey et al., 2014; Wang et al., 2008a, 2008b). Another unusual clinical feature the patient shares with the HD mice is the presence of vascularized corneal defects. This is consistent with our findings that the monoallelic ThPOKK360N variant exerts DN effects, disrupting the function of the WT protein, likely through interference with DNA-binding or altered protein–protein interactions. HD mice are also noted to have markedly increased frequency of CD4+ CD8low intermediate thymocytes (Dave et al., 1998). 
While this might suggest that class II–restricted thymocytes are arrested at this stage, ThPOK-deficient mice, including HD and knockout models, are reported to have MHC II–restricted CD8+ cells in both the thymus and spleen, suggesting that in the absence of functional ThPOK, MHC II–restricted thymocytes are reprogrammed to become CD8+ cells. This is supported by in vitro studies which show that ThPOK-deficient CD4+ CD8low thymocytes can differentiate into CD8+ cells (He et al., 2008; Kappes, 2010). However, scRNA-seq did not show significantly elevated levels of class II–restricted transcriptional regulators (SATB1, GATA3, SOX4, ID2, LEF1, MAX, EGR1, and HMGB2) (Karimi et al., 2021) in the patient’s CD8+ T cells compared to HC cells, suggesting differences between the patient and HD models. This discrepancy will need to be resolved through studies of future patients.
In the context of CD4+ T cell biology, our findings complement the established literature by demonstrating that ThPOK not only orchestrates the development of CD4+ T cells in the thymus but also regulates the differentiation of several CD4+ T helper cell subsets and the activation and memory potential of both CD4+ and CD8+ T cells in the periphery (Twu and Teh, 2014; Vacchio et al., 2014; Wang et al., 2008a). Notably, ThPOK plays a pivotal role in promoting Th2 cell differentiation while concurrently preventing the aberrant trans-differentiation of Th1/Th2 cells into cytotoxic T cell phenotypes (Vacchio et al., 2014; Wang et al., 2008a). Our data present an unexpected finding regarding Th2 cells. Contrary to existing reports that show the role of ThPOK in promoting Th2 cells (Vacchio et al., 2014; Wang et al., 2008a), the patient exhibited clinical features of allergic disease together with an increased Th2 response, with a significant enhancement in effector cytokine production (IL-4, IL-5, and IL-13) upon PMA/ionomycin stimulation, suggesting an unanticipated role of ThPOK in modulating Th2 cell effector functions in humans. While our data indicate a distinct Th2 cell hyperresponsiveness mediated by the ThPOKK360N variant, it remains unclear whether the enhanced Th2 responses are T cell intrinsic or influenced by other factors. If the allergic phenotype is T cell intrinsic, the increased Th2 responses could be hypothesized to stem from impaired SOCS1 expression in CD4 T cells, recapitulating the atopic phenotype of SOCS1 haploinsufficient patients (Gruber et al., 2024), or from the impaired TCR activation, as has been suggested to be a potential mechanism for primary atopic disorders (Lyons and Milner, 2018). However, it should be noted online databases report that ThPOK is expressed in all peripheral blood cells, including other cell types relevant to allergic disease including eosinophils, basophils, regulatory T cells, and B cells (Uhlen et al., 2019). 
Therefore, a range of cell types beyond CD4+ T cells may contribute to the allergic phenotype present in this patient.
ThPOK is also recognized to be important for CD4+ T cell activation, in particular, for preserving transcriptomic integrity and memory potential (Ciucci et al., 2019). Consistent with this model, we observed a marked defect in the upregulation of genes essential for TCR-stimulated activation in naïve CD4+ T cells from the patient compared with HCs, highlighting a significant deficit in proliferative signaling pathways, such as the MYC pathway. The introduction of ThPOKWT into T cells activated by TCR led to a notable upregulation of pathways critical for proliferation and effector functions, which were noticeably absent in cells transduced with the ThPOKK360N variant. Similarly, our data reinforces the known impact of ThPOK deficiency on clonal proliferation and effector molecule production in CD8+ T cells. ThPOK deficiency has been shown to markedly impair both clonal proliferation and the production of CD8+ effector molecules, such as IL-2 and granzyme B, within long-lived CD8+ T memory (Tm) cells upon antigenic rechallenge (Setoguchi et al., 2009). We identified a pronounced impairment in the activation of naïve CD8+ T cells in the patient, evidenced by a significant reduction in the upregulation of genes following TCR stimulation. This was particularly notable in the downregulation of effector pathways, such as those mediated by interferons, which are crucial for the functionality of CD8+ T cells. These findings suggest a direct link between ThPOK function and the activation of mature single-positive T cells. Our results thus provide an understanding of ThPOK’s role not only in the differentiation and development of T cell subsets but also in their activation and functional response, thereby underscoring the complex regulatory mechanisms orchestrated by ThPOK in T cell immunobiology.
Beyond challenging the existing paradigm of ThPOK’s role in immunity, our study significantly advances the understanding of ThPOK’s function in human biology both mechanistically and clinically. Through transcriptomic analysis of transduced primary pulmonary fibroblasts, we uncovered substantial differences in pathway regulation between HC pulmonary fibroblasts expressing ThPOKWT and those expressing the ThPOKK360N variant. Notably, we observed an upregulation in the epithelial–mesenchymal transition pathway, which is particularly relevant due to its association with fibrotic conditions. In addition, overexpression of ThPOKK360N led to an exaggerated fibrotic gene signature, indicative of ThPOK’s involvement in regulating fibrotic processes beyond its involvement in the immune system. This was further evidenced by experiments with patient-derived pulmonary fibroblasts, where overexpression of ThPOKWT reversed the fibrotic gene signature, pointing to ThPOK’s therapeutic potential in mitigating fibrotic diseases. Thus, our study suggests that genetic changes within ThPOK could have substantial consequences on the gene expression landscape in pulmonary fibroblasts, offering new insights into lung cell biology.
Moreover, our exploration of FDA-approved pulmonary fibrosis drugs on patient-derived fibroblasts revealed that pirfenidone significantly reduced the expression of ACTA2, a marker of myofibroblasts and fibrosis, under both steady-state and TGF-β-induced fibrotic conditions, while nintedanib did not have the same effects. Based on existing literature (Man et al., 2024), several factors may explain the differences observed in the effects of pirfenidone and nintedanib on ACTA2 expression. Nintedanib primarily inhibits receptor tyrosine kinases involved in fibrosis, such as PDGFR, FGFR, and VEGFR, effectively reducing fibroblast activation and extracellular matrix production (Wollin et al., 2015). Pirfenidone, by contrast, exerts its antifibrotic effects through broader mechanisms, including the inhibition of TGF-β signaling, oxidative stress, and inflammation, although its precise molecular targets remain less defined (Aimo et al., 2022). These distinct mechanisms suggest that nintedanib more effectively targets specific fibrotic signaling pathways, while pirfenidone may have a wider range of anti-inflammatory and antifibrotic actions. In our study, continuous exposure to TGF-β1 during drug treatment may have accentuated pirfenidone’s modulation of ACTA2 expression through its effects on TGF-β1 mediated pathways. Additionally, the presence of interstitial fibrosis and inflammation, as observed in the patient’s lung biopsy with lymphoid aggregates, likely underscores the importance of inflammation in driving the fibrotic process in this case. This may explain why pirfenidone, with its strong anti-inflammatory properties, proved more effective, whereas nintedanib’s focus on inhibiting growth factor pathways may have been less suited to an inflammation-driven fibrotic environment. 
Nonetheless, the demonstration of pirfenidone’s ability to modulate the fibrotic gene signature in patient-derived fibroblasts is particularly promising as it suggests a possible therapeutic approach in managing the fibrotic aspects of diseases associated with ThPOK dysfunction. This finding not only emphasizes the value of targeted therapies in complex genetic diseases but also offers new insights into idiopathic pulmonary fibrosis and other fibrotic lung conditions.
In conclusion, we describe a novel human disease associated with a multimorphic damaging variant in ThPOK, establishing its causative link to the patient’s clinical profile through complementary experimental approaches. This study not only broadens the scope of ThPOK’s known functions in immune regulation but also reveals its tissue-dependent dual regulatory nature, impacting both immune cells and fibroblasts. While our findings offer significant insights, they are derived from a single patient, the only identified case to date, which is a limitation and highlights the need to identify similar cases to expand the disease phenotype. Additionally, although we used a variety of approaches to establish changes in DNA-binding preferences of the variant ThPOK, a limitation of our study is the use of a transduced cell line for ChIP-seq, as DNA-binding site availability depends on cell type and differentiation state. Future studies in primary cells will help provide a more physiologically relevant understanding of ThPOK’s function. Nonetheless, our work opens avenues for further research, including assessing the impact of ThPOK variants across different tissues and their potential role in non-immune pathologies. Moreover, further investigation into ThPOK’s interactions within the immune system, particularly its role in the skewed Th2 immune response, is essential. These future directions promise to deepen our understanding of ThPOK’s multifaceted role in human health and disease, paving the way for improved clinical management and therapeutic strategies for patients with ThPOK variant-associated conditions.
Materials and methods
Study design
The patient with a previously uncharacterized disease was identified after presenting to the immunology clinic. Trio whole-genome sequencing identified a de novo heterozygous variant in ZBTB7B that results in a missense change in the DNA-binding domain of ThPOK. We uncovered the altered DNA binding specificity and transcriptional activity of the variant by multiple approaches, including EMSA, HT-SELEX, luciferase assay, and ChIP-seq. We performed extensive phenotyping of the patient’s peripheral blood cells by scRNA-seq and conventional flow cytometry to reveal the abnormalities in T cell development, function, and activation. To assess the mechanisms underlying the fibroinflammatory changes seen in the patient’s lung, we employed an in vitro system utilizing lentiviral transduction to introduce both the WT and variant genes into primary pulmonary fibroblasts derived from HCs and the patient. We also performed treatment testing using two FDA-approved antifibrotic agents, one of which successfully reversed the fibrotic gene signature in patient primary pulmonary fibroblasts in an in vitro model of fibrosis.
Study approval
The study participant, his parents/guardians, and his sibling provided written informed consent. Research study protocols were approved by the University of British Columbia Clinical Research Ethics Board (H15-00641).
Genomic analysis
Genomic DNA from the patient, mother, and father was sequenced with paired-end reads on the Illumina platform by GeneDx. Mean sequencing coverage was reported to be at least 40× across the genome, with a minimum threshold of 30× for each sample. Bidirectional sequencing reads were assembled and aligned to the reference sequence based on National Center for Biotechnology Information RefSeq transcripts and human genome build GRCh37/UCSC hg19. Using a custom-developed analysis tool (XomeAnalyzer), data were filtered and analyzed to identify sequence variants, repeat expansions, and most deletions and duplications >1 kb. Reported clinically significant variants were confirmed by an appropriate orthogonal method in the proband and the parents. A missense variant in ZBTB7B, observed only in the patient, was reported as the only candidate with a potential relationship to the disease phenotype.
Histopathology
A wedge biopsy was obtained from the edge of the right upper lobe of the lung in a surgical open biopsy procedure. The immunohistochemistry stainings of the lung tissue were performed in BC Children’s Hospital Pathology Laboratory with clinically validated antibodies and protocols.
Expression plasmid cloning
WT Myc-DDK tagged ZBTB7B plasmid (cat #RC234388), corresponding to the human-tagged open reading frame clone NM_001256455.2, was purchased from OriGene. The gene insert containing the point mutation corresponding to the patient’s variant (NM_001256455.2: c.1080A>C) was ordered from GenScript Biotech and inserted into the pCMV6-Entry plasmid (cat #PS100001; OriGene) using AsiSI and SacII restriction sites.
To generate C-terminal 6xHis-tagged WT and variant plasmids for Co-IP assay, a PCR-based approach was used to replace the C-terminal region of the original Myc-tagged protein with a 6xHis tag sequence (5′-CACCATCACCACCATCAC-3′). The insert was generated using Q5 Hot Start High-Fidelity 2X Master Mix (New England Biolabs), with primers designed to amplify the gene of interest while incorporating the 6xHis tag sequence at the reverse primer. The primers also included BamHI and SacII restriction sites to facilitate downstream cloning into the expression plasmid. The PCR products were purified using the Monarch DNA Gel Extraction Kit (New England Biolabs), digested with BamHI-HF and SacII (New England Biolabs), and ligated into the BamHI/SacII-digested Myc-tagged plasmid using the Quick Ligation Kit (New England Biolabs), according to the manufacturer’s protocol. The sequence integrity of all full-length plasmids used in this project was confirmed using long-read sequencing (Oxford Nanopore Technologies).
Immunoblotting
ThPOK expression was assessed by immunoblotting. Briefly, 1 × 10^6 HEK293 cells were seeded in 6-well culture plates in 2 ml of Dulbecco’s modified Eagle’s medium (DMEM) with 10% FBS (Gibco, Life Technologies), 2 mM L-glutamine (HyClone, Thermo Fisher Scientific), and 1 mM sodium pyruvate (Gibco, Life Technologies) and incubated overnight at 37°C prior to transfection. Each well was transfected with 3 μg of relevant plasmids using Lipofectamine 3000 Transfection Reagent (Thermo Fisher Scientific) according to the manufacturer’s recommendations. Whole-cell lysates were prepared 24 h after transfection by lysing cells in RIPA Lysis and Extraction Buffer (Thermo Fisher Scientific) supplemented with the Halt protease and phosphatase inhibitor cocktail (Thermo Fisher Scientific). The protein concentrations were measured using Pierce Coomassie Plus Assay Reagent (Thermo Fisher Scientific). Laemmli sample buffer (Bio-Rad Laboratories) supplemented with β-mercaptoethanol was added to the cell lysates, which were then heated at 37°C for 5 min. Lysates were separated by 10% SDS-PAGE and transferred onto polyvinylidene difluoride membranes (Bio-Rad Laboratories). Membranes were blocked using 5% BSA in Tris-buffered saline supplemented with Tween-20, incubated with anti-ThPOK (D9V5T) Rabbit mAb (1:1,000; Cell Signaling Technology), anti-Myc-Tag (9B11) Mouse mAb (1:1,000; Cell Signaling Technology), and anti-β-actin (8H10D10) Mouse mAb (1:20,000; Cell Signaling Technology) primary antibodies in blocking buffer overnight at 4°C, and lastly, incubated with anti-rabbit (IgG DyLight 800; Rockland Immunochemicals) and anti-mouse (IgG IRDye 680RD; LI-COR) secondary antibodies at a dilution of 1:20,000 for 1 h at room temperature in blocking buffer. The membranes were imaged using the Odyssey DLx Near-Infrared Fluorescence Imaging System (LI-COR Biosciences).
Co-IP assay
Co-IP was performed using the Pierce c-Myc Tag IP/Co-IP Kit (Thermo Fisher Scientific) according to the manufacturer’s instructions. Cells were plated as described in the Immunoblotting section of the Supplementary Methods. Cell lysates were prepared using IP Lysis Buffer (Thermo Fisher Scientific) supplemented with Halt protease and phosphatase 100× inhibitor cocktail (Thermo Fisher Scientific). Protein concentrations were measured and normalized using the Bradford assay with Pierce Coomassie Plus Assay Reagent (Thermo Fisher Scientific). A portion of each lysate was set aside as the “input sample,” while the remaining normalized lysates were incubated with Anti-c-Myc Agarose overnight, following the protocol provided in the kit. After immunoprecipitation, proteins were eluted and analyzed by SDS-PAGE under the same conditions described in the Immunoblotting section. Immunoblotting was performed using the following primary antibodies: anti-Myc-Tag (9B11) Mouse mAb (Cell Signaling Technology, 1:1,000) and His-Tag (D3I1O) XP Rabbit mAb #12698 (1:750; Cell Signaling Technology).
Luciferase reporter assay
A 1,406 bp region of the promoter sequence of human SOCS1 (chr16:11,255,899–11,257,304 on GRCh38/hg38) was cloned into pGL4.20 [luc2/Puro] (Promega) firefly luciferase reporter plasmid using KpnI and HindIII restriction sites. The sequence integrity of the full-length plasmid was confirmed by long-read sequencing (Oxford Nanopore Technologies). Briefly, 1.5 × 10^5 HEK293 cells were seeded in 24-well culture plates in 0.5 ml of DMEM with 10% FBS, 2 mM L-glutamine, and 1 mM sodium pyruvate and incubated at 37°C for 24 h prior to transfection. We analyzed the effects of the EV, WT, and p.K360N variant plasmids, both individually and in combination, on the activity of the SOCS1 promoter. To test the effect of each plasmid in isolation, cells were transfected with 250 ng of the EV, WT, or the p.K360N variant plasmid. To model the heterozygous state, cotransfections were conducted where cells received a combination of WT and p.K360N plasmids in increasing ratios of WT to p.K360N (0.5:1, 1:1, 2:1, 4:1, and 8:1), totaling 250 ng. Additionally, all cells were transfected with 250 ng of SOCS1-luciferase reporter plasmid and 10 ng of pGL4.74 Renilla luciferase control plasmid (Promega). The transfection was carried out using Lipofectamine 3000 (Thermo Fisher Scientific) according to the manufacturer’s protocol. After 24 h, cell lysates were prepared using 1× Glo Lysis Buffer (Promega) and transferred to white flat-bottom 96-well plates in technical triplicates. Dual-Glo Luciferase Assay Kit (Promega) was used according to the manufacturer’s recommendations, and luciferase activity was measured using the Infinite M200 plate reader (Tecan) by integrating luminescence over 10 s per well. In the analysis step, the firefly luciferase activity was normalized against Renilla luciferase to control for variation in transfection efficiency. The normalized firefly luciferase activity was further divided by the normalized value from the EV condition, providing a relative measurement against the EV baseline.
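The plasmid split for the cotransfections and the two-step normalization described above can be illustrated with a minimal Python sketch. The function names and all luminescence readings are hypothetical, and averaging the EV replicates to form the baseline is an assumption of this sketch rather than a detail stated in the protocol.

```python
from statistics import mean


def split_plasmid_mass(wt_to_variant, total_ng=250.0):
    """Split a fixed total plasmid mass between WT and variant at a WT:variant ratio."""
    wt_ng = total_ng * wt_to_variant / (wt_to_variant + 1.0)
    return wt_ng, total_ng - wt_ng


def relative_luciferase(firefly, renilla, ev_firefly, ev_renilla):
    """Return firefly/Renilla ratios expressed relative to the EV baseline.

    Each argument is a list of technical-replicate readings in matching order.
    """
    # Step 1: normalize firefly to Renilla within each well.
    sample_ratios = [f / r for f, r in zip(firefly, renilla)]
    ev_ratios = [f / r for f, r in zip(ev_firefly, ev_renilla)]
    # Step 2: divide by the mean normalized EV value (assumption: mean of replicates).
    ev_baseline = mean(ev_ratios)
    return [ratio / ev_baseline for ratio in sample_ratios]


# Hypothetical readings, for illustration only.
ev_firefly, ev_renilla = [1000.0, 1100.0, 900.0], [100.0, 110.0, 90.0]
wt_firefly, wt_renilla = [500.0, 450.0, 550.0], [100.0, 90.0, 110.0]

wt_relative = relative_luciferase(wt_firefly, wt_renilla, ev_firefly, ev_renilla)
wt_ng, variant_ng = split_plasmid_mass(4.0)  # 4:1 WT to p.K360N
```

With these made-up numbers, each WT well has half the normalized activity of the EV wells, and a 4:1 ratio corresponds to 200 ng of WT and 50 ng of variant plasmid.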
Structural predictions
Due to the absence of a crystallized structure for ThPOK (encoded by ZBTB7B), we utilized the known crystal structure of its closely related homolog, ZBTB7A. This homolog shares significant similarities with ThPOK (ZBTB7B), especially in the four-finger DNA-binding domain and the C2H2 zinc finger domain, making it a suitable proxy for our analysis. Using ChimeraX-1.6, we visualized a schematic of the interaction of the lysine residue in ZBTB7A (analogous to the lysine at position 360 in the WT ThPOK) or the variant residue (p.K360N) with DNA.
EMSA
Whole-cell lysates were prepared as described in the Immunoblotting section above. A section of the SOCS1 promoter containing the WT ThPOK consensus DNA-binding sequence, along with two variant sequences identified by HT-SELEX as preferentially bound by variant ThPOK, was used to design the double-stranded oligonucleotide probes used in this assay (see Table S3 for sequences). Supershift assays were performed with 5 μg of whole-cell protein lysate incubated on ice for 30 min with either anti-ThPOK (D9V5T) Rabbit mAb (1:1,000; Cell Signaling Technology) or Rabbit IgG Isotype Control (Invitrogen; Thermo Fisher Scientific) and then incubated at room temperature for 20 min with the labeled DNA probes and reagents provided in the Odyssey EMSA Kit (LI-COR Biosciences), according to the manufacturer’s recommendations. Protein–oligonucleotide–antibody mixtures were then subjected to electrophoresis in 5% acrylamide/Bis-acrylamide 29:1 gel in 1× Tris-borate-EDTA (TBE) migration buffer (Thermo Fisher Scientific) for 90 min at 70 V at room temperature. Imaging was done using the Odyssey DLx Near-Infrared Fluorescence Imaging System (LI-COR Biosciences).
HT-SELEX
WT and variant ThPOK (ZBTB7B_WT and ZBTB7B_K360N) were cloned into an eGFP expression vector pF3A–ResEnz–egfp. The TF samples were expressed using a TNT SP6 High-Yield Wheat Germ Protein Expression System Kit (Promega). HT-SELEX was modified from our previous approach (Jolma et al., 2013) to use Abcam Anti-GFP antibody ab290 immobilized to Protein G Mag Sepharose Xtra (Cytiva 28-9670-70) in the step where the protein–DNA complexes are separated from unbound DNA. The assay similarly used IVT-produced proteins, and the selection reactions were carried out in a buffer of 140 mM KCl, 5 mM NaCl, 1 mM K2HPO4, 2 mM MgSO4, 100 μM EGTA, 1 mM ZnSO4, and 20 mM HEPES-HCl (pH 7). After each of the three selection cycles, the ligands were amplified by PCR, and the output from all cycles was subjected to Illumina sequencing. Mung Bean Nuclease treatment was also used between each of the selection cycles to reduce the ssDNA background during the ligand selection. Data analysis was performed as previously described (Nitta et al., 2015), where automatic detection of a sequence pattern defining local maxima was followed by semi-manual generation of seeds that were then used to construct multinomial-1 or multinomial-2 position frequency matrices for the transcription factor target specificity.
ChIP-seq
To establish GFP-tagged ThPOK-expressing HEK293 Flp-In-TRex cell lines, parental HEK293 Flp-In-TRex cells were transfected with separate expression vectors each carrying the WT or variant ZBTB7B open reading frame (FuGENE HD Transfection Reagent, Promega) and after 48 h transferred to hygromycin selection media (0.2 µg/µl). Colonies of each line were pooled and used for further experiments. Doxycycline (100 ng/ml) was added to cells 24 h prior to crosslinking for chromatin immunoprecipitation, and GFP expression was confirmed by fluorescence microscopy. Chromatin immunoprecipitation was performed as previously described (Schmidt et al., 2009). In brief, cells from 15-cm plates at 100% confluency were crosslinked for 10 min in 1% formaldehyde followed by 10 min of quenching with 2 M glycine. After washing with cold PBS, cells were collected and pelleted. Using a three-step lysis process, chromatin was released and then sonicated to produce a DNA fragment length range of 200–300 bp using a Bioruptor sonicator (Diagenode). GFP-tagged proteins were immunoprecipitated with a polyclonal anti-GFP antibody (ab290; Abcam) and Dynabeads Protein G (Invitrogen). Crosslinks were reversed at 65°C overnight, and bound DNA fragments were purified (QIAquick PCR Purification Kit; Qiagen). ChIP libraries were prepared using the NEBNext Ultra II DNA kit. Sequencing was performed with 150-nt paired-end reads at 2 × 10^7 reads per sample. For each variant of ZBTB7B, two biological replicates and two input control samples were sequenced using an Illumina NovaSeq 6000 sequencer.
In the data processing step, adapter sequences were trimmed from ChIP-seq reads with Cutadapt (v2.1) (Martin, 2011), and reads were mapped to the human genome (hg38) with Bowtie2 (v2.4.1) (Langmead and Salzberg, 2012) using the very-sensitive option. Reads with a mapping quality of <30 were discarded along with PCR duplicates and reads for which one or both of the paired ends could not be mapped. A single set of control reads was generated from the four input control replicates by merging them with SAMTools (v1.9) (Danecek et al., 2021) and subsampling to one-quarter of the total reads. MACS2 (v2.2.9.1) (Zhang et al., 2008) was used to call peaks for each of the WT and variant ThPOK ChIP-seq replicates using the merged input read set as the control. For each set of peaks, the top 2,000 peaks with the highest enrichment scores were used for downstream analyses.
For each WT ThPOK ChIP replicate, we scanned the peak sequences to identify peaks that contained a match to the HT-SELEX WT ThPOK motif using FIMO (v5.5.0) (Grant et al., 2011) with default settings. Using BEDTools intersect (v2.30.0) (Quinlan and Hall, 2010), we overlapped the motif-containing ChIP peak summits for each replicate with a BED file containing the locations of human gene promoter regions (1,000 bp upstream of the transcription start site to 500 bp downstream). To establish a set of high-confidence candidates for direct regulation by WT ThPOK, we took the intersection of genes identified from each replicate. We then performed the same analysis using the variant ThPOK ChIP replicates and the variant ThPOK HT-SELEX motif. We searched for over-represented pathways in the set of high-confidence ThPOK-regulated genes and variant ThPOK-regulated genes using the PANTHER statistical over-representation test with PANTHER pathways (Thomas et al., 2022). We checked each set in its entirety, each set excluding genes that overlapped with the other set, and the intersection of the two sets. We report all over-represented PANTHER pathways with a false discovery rate P < 0.05.
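The promoter overlap and replicate intersection described above can be sketched in plain Python. This is an illustration only and does not reproduce the BEDTools workflow used in the study: the gene names, coordinates, and summit positions are hypothetical, and the inclusive coordinate convention is an assumption.

```python
def promoter_window(tss, strand, upstream=1000, downstream=500):
    """Return (start, end) of a strand-aware promoter window around a TSS.

    Upstream is toward lower coordinates on '+' and higher coordinates on '-'.
    """
    if strand == "+":
        return tss - upstream, tss + downstream
    return tss - downstream, tss + upstream


def genes_with_summit_in_promoter(summits, genes):
    """Collect genes whose promoter window contains at least one peak summit."""
    hits = set()
    for chrom, pos in summits:
        for name, (gene_chrom, tss, strand) in genes.items():
            start, end = promoter_window(tss, strand)
            if chrom == gene_chrom and start <= pos <= end:
                hits.add(name)
    return hits


# Hypothetical annotation: gene -> (chromosome, TSS position, strand).
genes = {
    "GENE_A": ("chr1", 10000, "+"),
    "GENE_B": ("chr1", 50000, "-"),
    "GENE_C": ("chr2", 20000, "+"),
}
# Hypothetical motif-containing peak summits for two replicates.
rep1_summits = [("chr1", 9500), ("chr1", 50800), ("chr2", 25000)]
rep2_summits = [("chr1", 10400), ("chr1", 49000), ("chr2", 20100)]

rep1_genes = genes_with_summit_in_promoter(rep1_summits, genes)
rep2_genes = genes_with_summit_in_promoter(rep2_summits, genes)
# High-confidence candidates are those supported by both replicates.
high_confidence = rep1_genes & rep2_genes
```

Taking the intersection rather than the union trades sensitivity for specificity: a gene is retained only when every replicate places a motif-containing summit in its promoter.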
Isolation and culture of primary pulmonary fibroblasts
Fresh lung tissue was collected from the patient undergoing open lung biopsy for histopathological evaluation of interstitial lung disease. The tissue was immediately transported to the laboratory in ice-cold Dulbecco’s Modified Eagle Medium (Thermo Fisher Scientific) supplemented with 1% penicillin–streptomycin (Thermo Fisher Scientific). We followed an optimized published method for generating a fibroblast-enriched single-cell suspension combining mechanical and enzymatic dissociation. Concentrations for all the reagents used are reported in the original study (Waise et al., 2019). The processed cells were grown using Fibroblast Growth Medium 2 (PromoCell).
Lentiviral transduction of primary pulmonary fibroblasts
To generate lentivirus vectors, WT and p.K360N cDNA from the previously described expression plasmids were cloned into a GFP-tagged Lenti vector (cat #PS100071; OriGene) using EcoRI-HF and NotI-HF (New England BioLabs). The sequence-verified lentiviral plasmids were packaged using third-generation packaging plasmids and transfected into HEK293T cells using Lipofectamine 3000 Transfection Reagent (Thermo Fisher Scientific). The viral supernatant was harvested and filtered through a 0.45-μm PES filter (Thermo Scientific Nalgene) and concentrated using an Amicon Ultra-15 100 kDa centrifugal filter (Millipore). The concentrated lentivirus solution was then aliquoted and stored at −80°C until use. To establish patient-derived and HC-derived pulmonary fibroblasts that stably express WT and p.K360N, patient and HC cells were infected with EV, WT, or p.K360N lentiviral particles and 5 µg/ml polybrene (Sigma-Aldrich) through spinfection at 1,000 × g for 2 h at 32°C, cultured, and expanded in Fibroblast Growth Medium 2 (PromoCell) for 3 days before undergoing sorting based on GFP expression using BD FACS Aria (BD Biosciences) cell sorter.
Lentiviral transduction of primary T cells
For the generation of stably transduced HC T cells, T cells were isolated from the PBMCs of a healthy adult donor using the EasySep Human T-Cell Isolation Kit (STEMCELL Technologies) and activated for 12 days. Isolated T cells were cultured in ImmunoCult-XF T cell expansion medium supplemented with IL-2 (10 U/ml) and ImmunoCult Human CD3/CD28 T cell Activator (STEMCELL Technologies) at a concentration of 1 × 10^6 cells/ml. The cell concentration was maintained by replenishing the medium with fresh IL-2-containing media every 2 days. On day 12, the IL-2 concentration was increased to 1,000 U/ml to support further T cell expansion. The expanded T cells were then infected with EV, WT, or p.K360N lentiviral particles (as explained above in detail) and further expanded in complete RPMI-1640 medium (Gibco, Life Technologies) supplemented with 10% FBS for 3 days prior to RNA extraction. Based on flow cytometry analysis, all three groups of EV, WT, and variant-transduced HC T cells exhibited more than 85% GFP positivity (transduction efficiency); therefore, no additional sorting was performed. All cultures were maintained at 37°C with 5% CO2 throughout the process.
RNA sequencing
To investigate the global transcriptome of the transduced patient and HC primary pulmonary fibroblasts stably expressing EV, WT, or p.K360N variant, 150,000 GFP+ cells were sorted based on GFP expression using BD FACS Aria (BD Biosciences) cell sorter directly into 500 μl of cold Buffer RLT Plus (Qiagen) supplemented with 2-mercaptoethanol according to the Qiagen RNeasy Plus Mini Kit instructions. Following sorting, the volume of Buffer RLT Plus was adjusted so that there was exactly 350 μl of Buffer RLT Plus to 100 μl of sorted sample volume. RNA isolation was done according to the RNeasy Plus Mini Kit instructions (Qiagen). Sequencing was done on rRNA-depleted RNA libraries using PE150 Illumina NovaSeq Sequencing, targeting 100 M individual reads (50 M read pairs/clusters) per sample. Three biological replicates were included for each condition, which were independently transduced, sorted, and sequenced.
Sequenced reads were aligned to a reference sequence using the Spliced Transcripts Alignment to a Reference (STAR) aligner (E1). Assembly and expression were estimated using Cufflinks (E2) through bioinformatics apps on Illumina BaseSpace. Expression data were normalized between samples using the edgeR package in R (R Foundation). Normalized counts were filtered to remove low counts using the filterByExpr function in edgeR. PCA was done on log2(normalized counts + 0.25) in R using the PCA function. Differential expression analysis was performed using limma. Differentially expressed genes were defined as those with an adjusted P value <0.05.
Pathway analysis was done by first performing gene set enrichment analysis with 1,000 permutations using the Molecular Signatures Database Hallmark module. The signal-to-noise ratio was used for gene ranking, and the obtained P values were further adjusted using the Benjamini–Hochberg method. Pathways with an adjusted P value <0.05 were considered significant. Leading edge genes from significant pathways between WT and p.K360N-transduced cells were identified. Expression levels of these genes were then determined in each group.
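For readers unfamiliar with the Benjamini–Hochberg adjustment mentioned above, the following is a minimal Python sketch of the standard step-up procedure. The P values are hypothetical and the function is a generic illustration, not the implementation used in the analysis.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    n = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical nominal P values for four pathways.
nominal = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(nominal)
significant = [q < 0.05 for q in adjusted]
```

In this toy example only the first pathway remains below 0.05 after adjustment, although three of the four nominal P values were below that threshold.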
Sample level enrichment analyses (SLEA) scores were computed for each significant pathway. Briefly, z-scores were computed for gene sets of interest for each sample. The mean expression levels of significant genes were compared to the expression of 1,000 random gene sets of the same size. The difference between observed and expected mean expression was then calculated and represented on heatmaps generated using Morpheus (https://software.broadinstitute.org/morpheus).
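The SLEA computation described above can be sketched as follows. This is a simplified Python illustration under stated assumptions: genes are z-scored across samples using the population standard deviation, random gene sets are drawn without replacement from all measured genes, and the score is the observed mean minus the mean of the random-set means. The toy expression values, gene names, and function names are hypothetical.

```python
import random
from statistics import mean, pstdev


def zscore_rows(expr):
    """Z-score each gene across samples; genes with no variance get zeros."""
    z = {}
    for gene, values in expr.items():
        mu, sd = mean(values), pstdev(values)
        z[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    return z


def slea_scores(expr, gene_set, n_random=1000, seed=0):
    """Per-sample score: observed mean z minus mean z of random same-size sets."""
    rng = random.Random(seed)
    z = zscore_rows(expr)
    genes = list(z)
    n_samples = len(next(iter(z.values())))
    random_sets = [rng.sample(genes, len(gene_set)) for _ in range(n_random)]
    scores = []
    for s in range(n_samples):
        observed = mean(z[g][s] for g in gene_set)
        expected = mean(mean(z[g][s] for g in rs) for rs in random_sets)
        scores.append(observed - expected)
    return scores


# Toy expression matrix: gene -> values across four samples (hypothetical).
expr = {
    "G1": [1.0, 1.0, 5.0, 5.0],
    "G2": [1.0, 1.0, 5.0, 5.0],
    "G3": [3.0, 2.0, 3.0, 2.0],
    "G4": [2.0, 3.0, 2.0, 3.0],
    "G5": [4.0, 4.0, 4.0, 4.0],
    "G6": [5.0, 1.0, 1.0, 5.0],
}
scores = slea_scores(expr, gene_set=["G1", "G2"])
```

Here the gene set is higher in the last two samples, so those samples receive positive scores and the first two receive negative scores.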
scRNA-seq
Using scRNA-seq, whole transcriptome analysis (WTA) was conducted on PBMCs from the patient and three age-matched HC. The BD Rhapsody Single Cell platform was used, which included the Rhapsody Enhanced Cartridge Reagent Kit (BD), the BD Rhapsody Cartridge Kit (BD), the Rhapsody cDNA Kit (BD), the Rhapsody WTA Amplification Kit (BD), the Human Single-Cell Multiplexing Kit (BD), used according to manufacturer’s recommendations and protocols. Briefly, thawed PBMCs were rested overnight and then stimulated with ImmunoCult Human CD3/CD28 T cell Activator (STEMCELL Technologies) or left untreated for 16 h. Cells from each donor were then labeled with sample tags and a panel of BD AbSeq Ab-oligos (BD), washed with stain buffer, and pooled together in cold sample buffer to obtain ∼60,000 cells in 620 µl for each of the pooled unstimulated and stimulated samples. Two nanowell cartridges were primed and subsequently loaded with the pooled samples. Libraries for whole transcriptome and AbSeq analysis were prepared by following the BD Rhapsody System (TCR/BCR Full Length, mRNA WTA, BD AbSeq, and Sample Tag Library Preparation) Protocol. Quality control was performed using the Agilent DNA High Sensitivity Kit (Agilent Technologies) and the Agilent 2100 Bioanalyzer (Agilent Technologies) on the intermediate and final sequencing libraries, respectively, including estimating the concentration of each sample, measuring the average fragment size of the libraries, and following sequencing recommendations. Libraries were diluted to 350–650 picomolar range per sample and sequenced using Illumina NovaSeq 150 bp PE sequencing targeting 8,000 M individual reads (4,000 M read-pairs) with PhiX spike-in of 20%.
FASTQ files were processed using the BD Rhapsody Targeted Analysis Pipeline and Seven Bridges (https://www.sevenbridges.com) according to the manufacturer’s recommendations. The R package Seurat was used for all downstream analyses. Scaling and clustering were performed on each pool of samples independently. Dimensionality reduction using PCA was done on the most variable genes, and UMAP was based on the first 20 PCs. Cell identities were annotated manually or via cell surface AbSeq to first identify major cell types (T cells, B cells, NK cells, and monocytes) and then define subtypes (CD4+ T cells, CD8+ T cells, double-positive T cells, and their subsequent naïve and memory subgroups). For differential gene expression analyses, we used the Seurat implementation of a negative binomial test, which assumes an underlying negative binomial distribution in RNA-seq data and leverages the UMI counts to remove technical noise.
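The idea of restricting PCA to the most variable genes can be illustrated with a simplified ranking. The sketch below is an assumption-laden stand-in written in Python, not Seurat's procedure: it ranks genes by their variance-to-mean ratio (dispersion), and the function name `top_variable_genes` and the input format are hypothetical.

```python
import statistics


def top_variable_genes(expression, n_top=2000):
    """Rank genes by dispersion (variance / mean) and keep the top n_top.

    expression: dict mapping gene name -> list of per-cell expression values
                (hypothetical input format for this illustration).
    """
    dispersion = {}
    for gene, values in expression.items():
        mean = statistics.fmean(values)
        if mean > 0:  # genes that are never detected carry no information
            dispersion[gene] = statistics.pvariance(values) / mean
    # Highest dispersion first; ties keep their insertion order.
    ranked = sorted(dispersion, key=dispersion.get, reverse=True)
    return ranked[:n_top]
```

Genes selected this way would then be scaled and passed to PCA, with the leading components (20 in this study) used as input for UMAP.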
Flow cytometry
To carry out immunophenotyping and intracellular cytokine detection, PBMCs from the patient and age-matched HCs were stimulated with eBioscience Cell Stimulation Cocktail (Invitrogen, Thermo Fisher Scientific) for 5 h at 37°C. One hour after the start of stimulation, eBioscience Protein Transport Inhibitor Cocktail (Invitrogen, Thermo Fisher Scientific) was added to the stimulated cells. Cells were then stained with a cocktail of antibodies against surface markers for 20 min at room temperature and then fixed with Foxp3 fixation/permeabilization working solution from the eBioscience Foxp3 transcription factor staining buffer set (Invitrogen, Thermo Fisher Scientific) for 20 min. The fixed cells were subsequently stained for 20 min with antibodies targeting intracellular cytokines in 1× permeabilization buffer (Invitrogen, Thermo Fisher Scientific). The samples were then washed and analyzed using the BD FACSymphony flow cytometer (BD Biosciences). Data were analyzed with FlowJo software (BD Biosciences). The antibody panels used for staining are listed in Table S4.
Treatment testing and qPCR of primary fibroblasts
Primary patient pulmonary fibroblasts were used to evaluate the effects of various treatments. Initially, 5 × 10^5 cells were plated in each well of a 6-well plate (Corning) containing fibroblast growth medium 2 (PromoCell). The cells were incubated overnight under standard conditions. Subsequently, they were either left untreated or treated with recombinant human TGF-β1 (R&D Systems) at a concentration of 5 ng/μl for 24 h to induce a fibrotic state, a well-established in vitro model of fibrosis. Following the TGF-β1 treatment period, cells were subjected to different drug treatments for an additional 48 h. The drugs, purchased from MedChemExpress (https://www.medchemexpress.com), were nintedanib (0.01, 0.1, 1, and 10 μM) and pirfenidone (0.25, 0.5, 1, and 2 mM). For the groups designated to continue with TGF-β1 exposure, fresh medium supplemented with TGF-β1 was added concurrently with the drug treatments. After treatment, total RNA was extracted from treated and untreated cells using an RNeasy Plus Mini Kit (Qiagen) and converted to cDNA using an iScript cDNA synthesis kit (Bio-Rad Laboratories). Transcript abundance was measured using a Universal SYBR Green Super Mix (Bio-Rad) and a 7300 Real-Time PCR System (Applied Biosystems). Transcript abundance was normalized to actin-β (ACTB) and quantified using the 2^(−ΔΔCT) method.
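For readers unfamiliar with the ΔΔCT calculation, a minimal sketch follows. The function name and the CT values in the usage note are hypothetical and serve only to illustrate the arithmetic; the reference gene corresponds to ACTB in this study and the control condition to untreated cells.

```python
def relative_abundance(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Fold change of a target transcript by the delta-delta-CT method.

    ct_target / ct_reference:           CT values in the treated sample.
    ct_target_ctrl / ct_reference_ctrl: CT values in the control sample.
    """
    # Normalize the target gene to the reference gene within each sample.
    delta_ct_sample = ct_target - ct_reference
    delta_ct_control = ct_target_ctrl - ct_reference_ctrl
    # Compare the treated sample with the control sample.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    # One PCR cycle corresponds to a twofold difference in template.
    return 2 ** (-delta_delta_ct)
```

With illustrative values of CT 22 (target) and 18 (reference) in treated cells versus 24 and 18 in untreated cells, ΔΔCT is −2 and the function returns a fourfold increase.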
Lactate dehydrogenase release assay for assessing cytotoxicity
Cytotoxicity was assessed using the LDH-Glo Cytotoxicity Assay (Promega Corporation) according to the manufacturer’s recommendations. This assay quantitatively measures lactate dehydrogenase (LDH) release from damaged cells into the culture medium. Supernatants were obtained from cells that underwent treatment testing. At the end of the treatment period, 5 μl of supernatant from each well was diluted with 495 μl of LDH storage buffer, achieving a 100-fold dilution. Subsequently, 50 μl of this diluted sample was transferred to a new 96-well plate. To each well, 50 μl of reconstituted LDH detection reagent was added. The plates were incubated at room temperature for 30 min, protected from light. Luminescence was measured using the Infinite M200 plate reader (Tecan) by integrating luminescence over 1,000 ms per well. Percent cytotoxicity was determined by comparing the LDH release from experimental wells with a maximum LDH release control, which consisted of cells lysed with 40 μl of 10% Triton X-100 in a total medium volume of 2,000 μl. The LDH release from each experimental well was divided by this maximum LDH release value to calculate the percent cytotoxicity. Additionally, a standard dilution curve of LDH, ranging from 32 to 0 mU/ml, was used to confirm that the measured values fell within the assay’s reliable detection range.
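The dilution and percent cytotoxicity arithmetic described above can be sketched as follows. The function names and the luminescence values in the usage note are hypothetical; no background subtraction is applied because none is described in the text.

```python
def dilution_factor(sample_ul, buffer_ul):
    """Fold dilution when sample_ul of sample is mixed with buffer_ul of buffer."""
    return (sample_ul + buffer_ul) / sample_ul


def percent_cytotoxicity(experimental_signal, max_release_signal):
    """Experimental LDH signal as a percentage of the maximum-release control.

    Both signals are luminescence readings in the same arbitrary units.
    """
    if max_release_signal <= 0:
        raise ValueError("maximum-release signal must be positive")
    return 100.0 * experimental_signal / max_release_signal
```

For example, 5 μl of supernatant in 495 μl of storage buffer gives `dilution_factor(5, 495) == 100.0`, and an illustrative reading of 1,500 units against a maximum-release reading of 6,000 units corresponds to 25% cytotoxicity.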
Statistical analysis
All data are presented as mean ± SEM. Statistical significance was evaluated using appropriate methods for each dataset, which are detailed in the corresponding figure legends or the methods section, with the following annotations used to represent significance: P value < 0.05 (*), P value < 0.01 (**), and P value < 0.001 (***).
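As an illustration of the summary statistics and significance annotations, a minimal sketch follows. The function names are hypothetical, and returning an empty string for non-significant P values is an assumption, since no label for that case is defined in the text.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of replicates."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed to compute SEM")
    # SEM is the sample standard deviation divided by sqrt(n).
    return statistics.fmean(values), statistics.stdev(values) / math.sqrt(n)


def significance_label(p_value):
    """Map a P value to the asterisk annotation used in the figures."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""  # assumption: non-significant results are left unannotated
```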
Online supplemental material
Fig. S1 presents the conservation of the lysine residue across species, demonstrating that lysine (K) in the first C2H2 zinc finger of ThPOK is consistently present in all 12 species examined, highlighting its evolutionary conservation. Fig. S2 shows full-length immunoblots and quantification of ThPOK expression in HEK293 cells transfected with plasmids expressing WT ThPOK (WT), p.K360N ThPOK (p.K360N), or an EV control. Fig. S3 presents EMSA evaluating the binding affinity of WT and variant ThPOK proteins to a known ThPOK consensus sequence within the SOCS1 promoter. Fig. S4 provides a structural modeling analysis of ThPOK’s interaction with DNA, using the homologous ZBTB7A protein structure (PDB: 8E3D) to predict the impact of the K360N variant on DNA-binding specificity, highlighting alterations in the PWM derived from HT-SELEX data. Fig. S5 assesses the efficacy and toxicity of pirfenidone and nintedanib in TGF-β-treated patient-derived pulmonary fibroblasts. Among the supplemental tables, Table S1 summarizes clinical flow cytometry data from the patient over a 5-year span, including complete blood count differentials, lymphocyte studies, and naïve, memory, and effector T cell profiles. Table S2 provides a detailed variant description, including genomic position, zygosity, and pathogenicity predictions. Table S3 lists the IRDye-labeled WT and variant probes used in EMSA. Table S4 includes a comprehensive list of antibodies used in the T cell phenotyping research flow cytometry panel, detailing fluorophores, dilutions, catalog numbers, and suppliers. Data S1 provides a comprehensive list of high-confidence ThPOK target genes regulated by WT, variant, or both WT and variant ThPOK, along with corresponding pathway enrichment results.
Data availability
HT-SELEX and ChIP-seq data are publicly available under accession GSE292581, with ChIP-seq in SRA project PRJEB87201 and GHT-SELEX in PRJEB75580. RNA sequencing datasets are publicly available in the Gene Expression Omnibus (GEO), including GSE291517, which contains bulk RNA sequencing data of transduced patient and HC fibroblasts; GSE291518, which includes bulk RNA sequencing of transduced HC T cells; and GSE291519, which features scRNA-seq of patient and control PBMCs.
Acknowledgments
We thank the patient and their family members for their trust and participation in our study. We thank Dr. Remy Bosselut for his invaluable guidance, expert insights as a world-renowned authority on ThPOK, and thoughtful review of the manuscript. We also acknowledge the BC Children’s Hospital BioBank for providing age-matched HC PBMC samples. Additionally, we thank Dr. Jonathan Bramson’s lab at McMaster University, Ontario, Canada, for providing the optimized lentivirus transduction protocol.
This study received funding support from the Canadian Institutes of Health Research (PJQ-178054 to S.E. Turvey and C.M. Biggs), Genome British Columbia (SIP007 to S.E. Turvey), Tier 1 Canada Research Chair in Pediatric Precision Health (S.E. Turvey), Aubrey J. Tingle Professor of Pediatric Immunology (S.E. Turvey), the BC Children’s Hospital Foundation Precision Health Initiative (S.E. Turvey), the Health Professional-Investigator of the Michael Smith Foundation for Health Research (C.M. Biggs), a Providence Healthcare Research Institute Early Career Investigator award (C.M. Biggs), a Vanier Canada Graduate Scholarship (M. Vaseghi-Shanjani), and a four-year doctoral fellowship (M. Vaseghi-Shanjani and S. Samra).
Author contributions: M. Vaseghi-Shanjani: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Supervision, Validation, Visualization, Writing - original draft, Writing - review & editing; M. Sharma: Conceptualization, Data curation, Formal analysis, Investigation, Supervision, Validation, Visualization, Writing - original draft, Writing - review & editing; P. Yousefi: Formal analysis, Investigation, Validation, Writing - review & editing; S. Samra: Investigation, Visualization, Writing - review & editing; K.U. Laverty: Formal analysis, Visualization, Writing - review & editing; A. Jolma: Data curation, Formal analysis, Investigation, Methodology, Writing - original draft; R. Razavi: Investigation, Resources, Writing - review & editing; A.H.W. Yang: Investigation; M. Albu: Software, Visualization; L. Golding: Data curation, Writing - review & editing; A.F. Lee: Investigation, Visualization, Writing - review & editing; R. Tan: Methodology; P.A. Richmond: Data curation, Investigation, Software; M. Bosticardo: Investigation, Writing - review & editing; J.H. Rayment: Writing - review & editing; C.L. Yang: Data curation, Writing - original draft, Writing - review & editing; K.J. Hildebrand: Writing - review & editing; R. Brager: Writing - review & editing; M.K. Demos: Writing - review & editing; Y.-L. Lau: Writing - review & editing; L.D. Notarangelo: Investigation, Writing - review & editing; T.R. Hughes: Funding acquisition, Project administration, Resources, Supervision; C.M. Biggs: Conceptualization, Data curation, Funding acquisition, Investigation, Methodology, Project administration, Supervision, Validation, Writing - original draft, Writing - review & editing; S.E. Turvey: Conceptualization, Formal analysis, Funding acquisition, Methodology, Project administration, Resources, Supervision, Writing - review & editing.
Author notes
M. Vaseghi-Shanjani and M. Sharma are shared co-first authors.
C.M. Biggs and S.E. Turvey are shared co-senior authors.
Disclosures: J.H. Rayment reported personal fees from Vertex Pharmaceuticals, personal fees from Boehringer Ingelheim, and grants from Vertex Pharmaceuticals outside the submitted work. No other disclosures were reported.