STAG3 homozygous missense variant causes primary ovarian insufficiency and male non-obstructive azoospermia.

Infertility, a global problem affecting up to 15% of couples, can have varied causes ranging from natural aging to the pathological development or function of the reproductive organs. One form of female infertility is premature ovarian insufficiency (POI), affecting up to 1 in 100 women and characterised by amenorrhea and elevated follicle stimulating hormone before the age of 40. POI can have a genetic basis, with over 50 causative genes identified. Non-obstructive azoospermia (NOA), a form of male infertility characterised by the absence of sperm in semen, has an incidence of 1% and is similarly heterogeneous. The genetic basis of male and female infertility is poorly understood with the majority of cases having no known cause. Here, we study a case of familial infertility including a proband with POI and her brother with NOA. We performed whole-exome sequencing (WES) and identified a homozygous STAG3 missense variant that segregated with infertility. STAG3 encodes a component of the meiosis cohesin complex required for sister chromatid separation. We report the first pathogenic homozygous missense variant in STAG3 and the first STAG3 variant associated with both male and female infertility. We also demonstrate limitations of WES for the analysis of homologous DNA sequences, with this variant being ambiguous or missed by independent WES protocols and its homozygosity only being established via long-range nested PCR.


Introduction
Premature ovarian insufficiency (POI) represents one of the main causes of female infertility with a prevalence of 1-3%, depending on population characteristics such as ethnicity. This condition is characterised by the occurrence of menstrual disturbance (primary or secondary amenorrhea or oligomenorrhea) for at least four months before the age of 40 with high follicle stimulating hormone (FSH) levels (> 25 UI/l on two occasions > 4 weeks apart) and low estradiol levels (European Society of Human Reproduction and Embryology https://www.eshre.eu/). The mechanism leading to POI can be an impaired formation of primordial follicles leading to a reduced follicle pool, an impaired recruitment and/or an altered maturation of the follicles, and/or an increased follicular atresia (Laissue, 2015). POI can be caused by infections, medical treatments, or metabolic or autoimmune diseases, but often has a genetic basis with over 50 causative genes reported, affecting various processes such as gonadal development, meiosis, DNA repair, folliculogenesis, hormonal signaling, steroidogenesis, metabolism, mitochondrial function and immune regulation (Tucker et al., 2016). Such a genetic cause can be identified in 10-15% of POI patients (Caburet et al., 2014). Depending on the gene involved, POI can occur as part of a syndromic disorder (e.g. AIRE, ATM, FOXL2) or as an isolated disease (non-syndromic POI: e.g. NOBOX, GDF9). In order to decipher the genetic heterogeneity of POI, recent studies using massively parallel sequencing (MPS) approaches (targeted panels and whole-exome sequencing (WES)), mainly in individuals with non-syndromic POI, have reported single nucleotide variants in novel dominant or recessive candidate genes such as MCM9 (Fauchereau et al., 2016), POLR2C (Moriwaki et al., 2017), NUP107 (Ren et al., 2018) or TP63 (Tucker et al., 2019).
STAG3 was first described as a POI gene in 2014 (Caburet et al., 2014), and recessive high impact variants have since been described as a rare but recurrent cause of non-syndromic POI (Caburet et al., 2014; Le Quesne Stabej et al., 2016; Colombo et al., 2017; He et al., 2018). STAG3 is a key gene essential for meiosis, and is required for gametogenesis and fertility. Its implication in male infertility has also been strongly suggested (Le Quesne Stabej et al., 2016), and two male patients with non-obstructive azoospermia (NOA) and biallelic high impact variants in STAG3 have recently been reported (Riera-Escamilla et al., 2019; van der Bijl et al., 2019). We report here the first description of a homozygous missense STAG3 variant associated with both female and male infertility, in a woman with POI (primary amenorrhea) and her infertile brother with NOA. This study confirms the recent data expanding the phenotypic spectrum of variants within STAG3 to include male infertility, and demonstrates the involvement of missense variants in contrast to loss-of-function variants previously reported.
http://molehr.oxfordjournals.org/ Accepted Manuscript

Patients
The proband (Figure 1, II-3), born in 1984, was first seen at 15 years old for delayed puberty.
At this time, her height was 164.1 cm, weight was 61.2 kg and secondary sex characteristics were underdeveloped (Tanner I-II for breast, I for pubic and axillary hair). Genitals were normal and ultrasonographic examination showed a small uterus; the right ovary was present with a cyst suggesting residual function and the left ovary was not visualised. Hormonal assessment evaluated growth factors (normal IGF-1 and IGFBP-3, subnormal GH stimulation test by ornithine), thyroid hormones (normal T4 and TSH, and TRH stimulation test), prolactin (normal) and reproductive hormones. High levels were detected for follicle stimulating hormone (FSH: 46.5 UI/l) and luteinising hormone (LH: 24.5 UI/l) with peaks at 78.5 UI/l and 87.7 UI/l respectively after gonadotrophin releasing hormone (GnRH) stimulation. Decreased levels were noted for estradiol (33 pg/ml) and sex hormone-binding globulin (SHBG: 23 nmol/l, N: 28-80 nmol/l). Bone age was 12 years and three months.
Autoimmunity was discounted (negative thyroid antibodies, ovarian antibodies, ACA/21-OH antibodies). Karyotype showed 46,XX constitution. Estrogen therapy was initiated, and the patient was evaluated six months later. Development of the secondary sex characteristics started (Tanner III for breast, II for pubic hair and I for axillary hair), bone age was 13 years, hormonal assessment showed FSH: 43.1 UI/l, LH: 4.1 UI/l, estradiol: 26 pg/ml, and normal testosterone. Estrogen therapy was progressively increased and hormone replacement therapy was initiated in 2002. Other medical history included appendectomy and surgical treatment for inguinal hernia, and hysteroscopy which revealed endometrial hypotrophy but no malformation. Repeat ultrasonographic assessments in 2004 and 2008 confirmed uterine hypoplasia, small-sized right ovary with some small follicles, and atrophic left ovary. She was enrolled in an oocyte donation program in 2013 and had a child in 2017.
At this time, height was 1.76 m and weight was 80 kg. When last seen in 2019, she was pregnant following another oocyte donation from the same woman.
In her familial history (Figure 1), her parents (I-1 and I-2) are healthy and consanguineous (her mother's great grandfather and her father's grandfather are brothers). Maternal menopause occurred at 50 years of age. She has two brothers, and a third (II-2) who died at four months (sudden infant death). One of the brothers (II-1) also has reproductive problems with NOA, and had two children after assisted reproductive techniques using donor sperm. Assessment included hormonal testing with normal FSH (11.13 UI/l), testosterone (6.84 µg/l), prolactin (8.68 ng/ml) and thyroid-stimulating hormone (TSH) (1.21 UI/l), and karyotype (46,XY). The other brother (II-4) is fertile and had two children following spontaneous pregnancies. No other history of infertility is noted in the family.
Written informed consent was obtained from all participants. All procedures were in accordance with the ethical standards of the Ethics Committee of Rennes University Hospital and the French law.

Detection of the variants by sequencing
Whole-exome sequencing (WES)
WES assays
DNA from the proband underwent WES at the Australian Genome Research Facility (AGRF).
Library preparation was performed with Agilent SureSelect Human All Exon V6 (Agilent Technologies, Santa Clara, CA, USA) and sequencing was with the NovaSeq™ 6000 Sequencing System (Illumina Inc., San Diego, CA, USA). All WES data were processed using the Cpipe pipeline (Sadedin et al., 2015) designed according to the GATK guidelines and deposited into SeqR for analysis (https://seqr.broadinstitute.org/). The proband and her family were also previously tested by exome sequencing, which was performed as described elsewhere (Murphy et al., 2015). Briefly, exon enrichment was performed using Agilent SureSelect Human All Exon V4 (Agilent Technologies, Santa Clara, CA, USA). Paired-end sequencing was performed on the Illumina HiSeq2000 platform (Illumina Inc., San Diego, CA, USA) with an average sequencing coverage of x50. Read files were generated from the sequencing platform via the manufacturer's proprietary software.
Reads were mapped using the Burrows-Wheeler Aligner and local realignment of the mapped reads around potential insertion/deletion (indel) sites was carried out with the GATK version 1.6. SNP and indel variants were called using the GATK Unified Genotyper for each sample. SNP novelty was determined against dbSNP138. Datasets were filtered for novel or rare (MAF<0.01) variants.
WES analysis
We performed two phases of analysis for WES performed with Agilent SureSelect Human All Exon V6, based on our previously described method (Tucker et al., 2019), with the first focused on gene priority and the second focused on variant priority. For gene-centric analysis, we considered the potential pathogenicity of all "moderate to high impact" coding variants within diagnostic or candidate genes (gene list as per Tucker et al., 2019). For variant-centric analysis, we considered the potential pathogenicity of "high impact" variants in any gene, and "moderate to high impact" recessive-type variants in any gene. High priority variants are rare (<0.01 MAF) variants affecting essential splice sites, introducing frameshifts or premature stop codons, whereas moderate priority variants are rare (<0.01 MAF) missense variants and in-frame codon deletions/insertions. MAF and tolerance of genes to missense and/or loss-of-function variation were assessed in the public databases ExAC (http://exac.broadinstitute.org/) and gnomAD (https://gnomad.broadinstitute.org/).
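The two-phase prioritisation described above can be illustrated with a minimal sketch. This is not the Cpipe or SeqR implementation: the consequence labels, the `Variant` fields, the example genes `GENE_A` to `GENE_D` and the reading of "recessive-type" as "homozygous, or at least two rare heterozygous variants in the same gene" are all assumptions made for illustration. Only the MAF threshold of 0.01 and the split between high and moderate impact classes are taken from the text.

```python
from collections import Counter
from dataclasses import dataclass

# Illustrative consequence labels (assumed, not the pipeline's vocabulary).
HIGH_IMPACT = {"splice_site", "frameshift", "stop_gained"}
MODERATE_IMPACT = {"missense", "inframe_indel"}
MAF_THRESHOLD = 0.01  # rare variants only, as stated in the text


@dataclass(frozen=True)
class Variant:
    gene: str
    consequence: str
    maf: float      # population minor allele frequency
    genotype: str   # "het" or "hom"


def impact(variant):
    """Return 'high', 'moderate' or None for common or low-impact variants."""
    if variant.maf >= MAF_THRESHOLD:
        return None
    if variant.consequence in HIGH_IMPACT:
        return "high"
    if variant.consequence in MODERATE_IMPACT:
        return "moderate"
    return None


def gene_centric(variants, gene_list):
    """Moderate to high impact variants within a diagnostic or candidate gene list."""
    return [v for v in variants if impact(v) and v.gene in gene_list]


def variant_centric(variants):
    """High impact variants in any gene, plus recessive-type moderate to high ones."""
    rare = [v for v in variants if impact(v)]
    het_per_gene = Counter(v.gene for v in rare if v.genotype == "het")
    kept = []
    for v in rare:
        # Assumed definition: homozygous, or possible compound heterozygous.
        recessive_type = v.genotype == "hom" or het_per_gene[v.gene] >= 2
        if impact(v) == "high" or recessive_type:
            kept.append(v)
    return kept


# Hypothetical calls; only the STAG3 homozygous missense reflects the report.
example = [
    Variant("STAG3", "missense", 0.0, "hom"),
    Variant("GENE_A", "missense", 0.0005, "het"),
    Variant("GENE_B", "frameshift", 0.0, "het"),
    Variant("GENE_C", "missense", 0.2, "hom"),
    Variant("GENE_D", "synonymous", 0.0, "hom"),
]

print([v.gene for v in gene_centric(example, {"STAG3"})])
print([v.gene for v in variant_centric(example)])
```

In this toy input, the single rare heterozygous missense variant in `GENE_A` is dropped by the variant-centric pass, whereas the homozygous missense variant is retained as recessive-type.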

Functional studies
In-silico analyses
The effects of the missense variant identified were assessed using the HOPE database (Venselaar et al., 2010).

Results
The filtering pipeline identified a total of 558 "moderate to high impact" variants in any gene. For gene-centric analysis, eight "moderate to high impact" variants were within diagnostic genes, and seven were within candidate genes. For variant-centric analysis, we identified 46 "high impact" variants in any gene, and 184 "moderate to high impact" recessive-type variants in any gene.
The most relevant variants within diagnostic genes included (Table 1)
Interestingly, on IGV, many sequencing reads across this variant did not align uniquely to the genome, indicating low mapping quality consistent with a highly repetitive and homologous genomic region (Supp Figure 1B). In fact, the family was studied independently by two groups, with WES performed in two different centres and data analysed using two different bioinformatics pipelines. One group failed to detect the variant because the capture had no baits targeting this exon of STAG3 (Supp Figure 1A, C). Initial attempts at Sanger validation cast further doubt on whether the variant was real: initial Sanger sequencing indicated a heterozygous variant at this site (Supp. Figure 2). However, this erroneous result was due to poor specificity of the sequencing primers. Persisting, we attempted a long-range nested PCR. The first PCR was performed with primers that captured a very large but specific region of genomic DNA, followed by nested PCR closely flanking the variant site.
There were no common/reported SNPs affecting the binding site of any of the primers, making allele-specific drop-out unlikely. Allele-specific drop-out was also discounted by sequencing parental DNA and demonstrating heterozygosity, indicating successful amplification of both alleles. This long-range nested PCR validated that the patient was truly homozygous for the STAG3 missense variant (Supp. Figure 2). The difficulty in validating the variant and the repeated transient belief that the variant was a false positive indicate the need to carefully investigate and persist when reads do not align uniquely to the genome.
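The ambiguity caused by non-uniquely aligning reads can be made concrete with a small sketch that summarises mapping quality (MAPQ) values for the reads overlapping a site. The MAPQ values and the threshold of 20 are hypothetical and chosen only for illustration; they are not taken from the patient data or from either pipeline.

```python
def summarise_mapq(mapqs, threshold=20):
    """Count reads whose MAPQ falls below a threshold (assumed cut-off)."""
    total = len(mapqs)
    ambiguous = sum(1 for q in mapqs if q < threshold)
    fraction = ambiguous / total if total else 0.0
    return {"total": total, "ambiguous": ambiguous, "fraction_ambiguous": fraction}


# Hypothetical MAPQ values for reads covering a variant in a homologous region.
reads_at_variant = [0, 0, 3, 60, 60, 0, 12, 60]
summary = summarise_mapq(reads_at_variant)
print(summary)
```

A high fraction of low-MAPQ reads at a candidate site is a signal that genotype calls from short reads may be unreliable there and that orthogonal validation should be considered.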
Familial studies targeting STAG3 showed that the variant was present in a heterozygous state in the parents and the healthy brother whereas it was present in a homozygous state in the infertile brother (Figure 1). This familial segregation with infertility adds weight to the argument that this variant is causative.
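The segregation argument can be expressed as a simple consistency check under a fully penetrant autosomal recessive model. The genotypes and affection status below follow the family as described in the text; the dictionary layout, the genotype labels and the function name are assumptions made for this sketch.

```python
# Genotypes at the STAG3 variant and infertility status, as reported in the text.
family = {
    "I-1": {"genotype": "het", "affected": False},      # parent
    "I-2": {"genotype": "het", "affected": False},      # parent
    "II-1": {"genotype": "hom_alt", "affected": True},  # brother with NOA
    "II-3": {"genotype": "hom_alt", "affected": True},  # proband with POI
    "II-4": {"genotype": "het", "affected": False},     # fertile brother
}


def segregates_recessively(members):
    """True if exactly the homozygous carriers are affected (full penetrance assumed)."""
    return all(
        person["affected"] == (person["genotype"] == "hom_alt")
        for person in members.values()
    )


print(segregates_recessively(family))
```

Such a check only shows that the genotypes are compatible with recessive inheritance; in a family of this size it does not by itself establish causality.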
For the infertile brother bearing the homozygous STAG3 variant, testis biopsy showed the conservation of the testicular tissue architecture. On the right and on the left, the presence of germ cells was noted in the seminiferous tubules, but no spermatozoa were identified, indicating incomplete spermatogenesis, not beyond the spermatocyte stage and corresponding to a spermatogenic arrest. IHC staining performed on testis biopsies (Figure 3) showed that in the control testis, STAG3 protein was present in the seminiferous tubules in the nucleus of spermatogonia and spermatocytes. In the testicular biopsies from the NOA brother, STAG3 expression was undetectable in the seminiferous tubules.

STAG3 variants in male infertility
Furthermore, male Stag3−/− mice were also infertile (Caburet et al., 2014). Male mice lacking NOA also bearing the STAG3 variant in a homozygous state, with testis biopsy staining confirming reduced STAG3 stability and an early meiotic arrest. This is the third male patient described, and the first with a homozygous missense variant, confirming the involvement of STAG3 in male infertility manifesting as NOA. According to the classification established for meiotic arrest in human male meiosis (Jan et al., 2018), STAG3 biallelic variants are associated with type I meiotic arrest where severe asynapsis of the homologous chromosomes is a major feature, as observed in our patient. We report for the first time a STAG3 variant leading to POI and NOA in a single pedigree, strengthening the existence of shared genetic factors causing both these infertility conditions.

Impact of disturbance of the stromalin conservative domain (SCD)
STAG genes (STAG1, STAG2 and STAG3) encode members of the highly conserved family of stromalin nuclear proteins harbouring particular structural elements: 1) the stromalin conservative domain or SCD, an 86 amino acid region highly conserved from yeast to humans, 2) the STAG domain present in Schizosaccharomyces pombe mitotic cohesin Psc3, and the meiosis specific cohesin Rec11 (Ellermeier and Smith, 2005), 3) the amino-terminal and carboxyl-terminal domains, and 4) the armadillo-type fold corresponding to a superhelical structure adapted for binding large substrates such as proteins and nucleic acids. It is noteworthy that inconsistencies were observed for the reported STAG3 variants when collecting molecular data from the literature, with some misleading information (e.g.  identity with regions of STAG3L1, STAG3L2 and STAG3L3. This led to failure of variant detection by one WES protocol and ambiguous results with another WES protocol as well as subsequent Sanger sequencing. We solved the ambiguity by developing a long-range nested PCR. For variants in multi-mapping regions such as these, the most conclusive data come from long-range nested PCR, and not WES. Allele-specific dropout due to SNVs in primer-binding sites remains a theoretical possibility; however, in this case it is unlikely given the lack of common/reported SNPs at primer-binding sites, the fact that both alleles successfully amplified in each heterozygous parent, the strong gene:phenotype match and the lack of STAG3 in patient biopsy, providing functional evidence to support the involvement of this gene.

Conclusion
This study reports for the first time a STAG3 missense variant leading to POI and NOA, strengthening the existence of shared genetic factors causing both these infertility conditions. Furthermore, this study highlights the role of meiosis-specific genes of the cohesin complex in POI pathogenesis but also in male infertility, particularly NOA.

Acknowledgments
We would like to thank the patients for taking part in our research.

Funding statement
This work was supported by CHU Rennes and Rennes 1 University, Faculty of Medicine in