M. Fuentes, Preliminary observations on the spawning conditions of the European amphioxus (Branchiostoma lanceolatum) in captivity, J. Exp. Zool. B Mol. Dev. Evol, vol.302, pp.384-391, 2004.
URL : https://hal.archives-ouvertes.fr/hal-00121764

M. Fuentes, Insights into spawning behavior and development of the European amphioxus (Branchiostoma lanceolatum), J. Exp. Zool. B Mol. Dev. Evol, vol.308, pp.484-493, 2007.
URL : https://hal.archives-ouvertes.fr/hal-00150468

R. Hirakow and N. Kajita, Electron microscopic study of the development of amphioxus, Branchiostoma belcheri tsingtauense: the gastrula, J. Morphol, vol.207, pp.37-52, 1991.

R. Hirakow and N. Kajita, Electron microscopic study of the development of amphioxus, Branchiostoma belcheri tsingtauense: the neurula and larva, Kaibogaku Zasshi, vol.69, pp.1-13, 1994.

R. Luo, SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler, Gigascience, vol.1, p.18, 2012.

S. Huang, HaploMerger: reconstructing allelic relationships for polymorphic diploid genome assemblies, Genome Res, vol.22, pp.1581-1588, 2012.

M. G. Grabherr, Full-length transcriptome assembly from RNA-seq data without a reference genome, Nat. Biotechnol, vol.29, pp.644-652, 2011.

B. J. Haas, Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies, Nucleic Acids Res, vol.31, pp.5654-5666, 2003.

O. Keller, M. Kollmar, M. Stanke, and S. Waack, A novel hybrid gene prediction method employing protein multiple sequence alignments, Bioinformatics, vol.27, pp.757-763, 2011.

B. J. Haas, Automated eukaryotic gene structure annotation using EVidenceModeler and the program to assemble spliced alignments, Genome Biol, vol.9, p.7, 2008.

D. Kim, TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions, Genome Biol, vol.14, p.36, 2013.

C. Trapnell, Transcript assembly and quantification by RNA-seq reveals unannotated transcripts and isoform switching during cell differentiation, Nat. Biotechnol, vol.28, pp.511-515, 2010.

B. J. Haas, De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis, Nat. Protocols, vol.8, pp.1494-1512, 2013.

L. Wang, CPAT: Coding-Potential Assessment Tool using an alignment-free logistic regression model, Nucleic Acids Res, vol.41, p.74, 2013.

A. C. Roth, G. H. Gonnet, and C. Dessimoz, Algorithm of OMA for large-scale orthology inference, BMC Bioinformatics, vol.9, p.518, 2008.

A. M. Altenhoff, M. Gil, G. H. Gonnet, and C. Dessimoz, Inferring hierarchical orthologous groups from orthologous gene pairs, PLoS ONE, vol.8, p.53786, 2013.

A. Siepel, Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes, Genome Res, vol.15, pp.1034-1050, 2005.

N. L. Bray, H. Pimentel, P. Melsted, and L. Pachter, Near-optimal probabilistic RNA-seq quantification, Nat. Biotechnol, vol.34, pp.525-527, 2016.

R. M. Labbé, A comparative transcriptomic analysis reveals conserved features of stem cell pluripotency in planarians and mammals, Stem Cells, vol.30, pp.1734-1745, 2012.

L. Kumar and M. E. Futschik, Mfuzz: a software package for soft clustering of microarray data, Bioinformation, vol.2, pp.5-7, 2007.

P. Langfelder and S. Horvath, WGCNA: an R package for weighted correlation network analysis, BMC Bioinformatics, vol.9, p.559, 2008.

J. D. Buenrostro, P. G. Giresi, L. C. Zaba, H. Y. Chang, and W. J. Greenleaf, Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position, Nat. Methods, vol.10, pp.1213-1218, 2013.

A. Fernández-Miñán, J. Bessa, J. J. Tena, and J. L. Gómez-Skarmeta, Assay for transposase-accessible chromatin and circularized chromosome conformation capture, two methods to explore the regulatory landscapes of genes in zebrafish, Methods Cell Biol, vol.135, pp.413-430, 2016.

Y. Zhang, Model-based analysis of ChIP-Seq (MACS), Genome Biol, vol.9, p.137, 2008.

A. N. Schep, Structured nucleosome fingerprints enable high-resolution mapping of chromatin architecture within regulatory regions, Genome Res, vol.25, pp.1757-1770, 2015.

O. Bogdanović, A. Fernández-Miñán, J. J. Tena, E. de la Calle-Mustienes, and J. L. Gómez-Skarmeta, The developmental epigenomics toolbox: ChIP-seq and MethylCap-seq profiling of early zebrafish embryos, Methods, vol.62, pp.207-215, 2013.

G. Geeven, H. Teunissen, W. de Laat, and E. de Wit, peakC: a flexible, nonparametric peak calling package for 4C and Capture-C data, Nucleic Acids Res, vol.46, p.91, 2018.

O. Bogdanović and G. J. Veenstra, Affinity-based enrichment strategies to assay methyl-CpG binding activity and DNA methylation in early Xenopus embryos, BMC Res. Notes, vol.4, p.300, 2011.

R. Lister, Hotspots of aberrant epigenomic reprogramming in human induced pluripotent stem cells, Nature, vol.471, pp.68-73, 2011.

M. Murata, Detecting expressed genes using CAGE, Methods Mol. Biol, vol.1164, pp.67-85, 2014.

The FANTOM Consortium and the RIKEN PMI and CLST (DGT), A promoter-level mammalian expression atlas, Nature, vol.507, pp.462-470, 2014.

V. Haberle, A. R. Forrest, Y. Hayashizaki, P. Carninci, and B. Lenhard, CAGEr: precise TSS data retrieval and high-resolution promoterome mining for integrative analyses, Nucleic Acids Res, vol.43, p.51, 2015.

R. Wehrens and L. M. Buydens, Self-and super-organising maps in R: the kohonen package, J. Stat. Softw, vol.21, pp.1-19, 2007.

A. Gohr and M. Irimia, Matt: Unix tools for alternative splicing analysis, Bioinformatics, 2018.

M. T. Weirauch, Determination and inference of eukaryotic transcription factor sequence specificity, Cell, vol.158, pp.1431-1443, 2014.

S. J. van Heeringen and G. J. Veenstra, GimmeMotifs: a de novo motif prediction pipeline for ChIP-sequencing experiments, Bioinformatics, vol.27, pp.270-271, 2011.

J. Bessa, Zebrafish enhancer detection (ZED) vector: a new tool to facilitate transgenesis and the functional analysis of cis-regulatory regions in zebrafish, Dev. Dyn, vol.238, pp.2409-2417, 2009.

A. R. Gehrke, Deep conservation of wrist and digit enhancers in fish, Proc. Natl Acad. Sci. USA, vol.112, pp.803-808, 2015.

Fastxend

Soap-de-novo, 2012

HaploMerger pipeline, 20111230

LASTZ

CEGMA (v2.4)

EVidence Modeller (EVM)

RepeatMasker

axtChain, chainMergeSort, chainPreNet, chainNet, multiz-tba, 2009

Mfuzz

Weighted Gene Correlation Network Analysis (WGCNA)

DiffBind

MethylDackel

GNU/Linux command-line tools: zcat (1.5)

Next generation sequencing data have been deposited in the Gene Expression Omnibus (GEO) under the accession numbers GSE106428 (ATAC-seq), GSE106429 (CAGE-seq), GSE106430 (RNA-seq), GSE102144 (MethylC-seq and RRBS), and GSE115945 (4C-seq). Raw genome sequencing data and genome assembly have been submitted to the European Nucleotide Archive (ENA) under the accession number PRJEB13665.

For each cross-species transcriptomic and epigenomic comparison, as many orthologous genes as possible were used. Sample sizes for each analysis in the figures are indicated in the legends, Supplementary Dataset 8 and/or Supplementary Information. For each next generation sequencing experiment and each biological replicate, we used as many embryos, or as much adult tissue, as necessary to obtain enough RNA/DNA for library preparation and sequencing.

Data exclusions: For the CAGE-seq analysis, the muscle sample was excluded, as it did not pass the standard quality checks. Exclusion criteria for CAGE data were not predetermined; however, it is long established that CAGE data have a characteristic variation in widths, and this, along with the very low number of reads recovered, vol.38, pp.626-661, 2006.

Replication: Nearly all the findings reported in this study correspond to computational analyses of next generation sequencing data. We provide the code and guidelines to reproduce all the analyses. We also performed two main types of experiments, largely for validation purposes: (i) generation of transgenic assays and (ii) in situ hybridization of specialized families. For (i), we provide the number of independent founders identified for each tested element and a description of the patterns obtained for each founder in Supplementary Table 8. For (ii), we performed the in situ hybridization only once.

Randomization: We did not have experimental groups to which randomization applies. In our study we compared either (i) different tissues and developmental stages within a species, or (ii) matched samples from different species.

Antibodies used:
- Rabbit polyclonal to Histone H3 (tri methyl K4) - ChIP Grade (#ab8580, Abcam), 1:200
- Rabbit polyclonal to Histone H3 (acetyl K27) - ChIP Grade (#ab4729, Abcam), 1:200
- Mouse monoclonal to Histone H3 (tri methyl K27) - ChIP Grade (#ab6002, Abcam), 1:200

Validation: The three primary antibodies used are all high-quality commercial antibodies against Histone H3 modifications, validated as ChIP grade by the manufacturer (Abcam).

H3K4me3_36hpf_a 36 hpf 49 SE 33,948,863
H3K4me3_36hpf_b 36 hpf 49 SE 32,457,194

Antibodies:
- Rabbit polyclonal to Histone H3 (tri methyl K4) - ChIP Grade (#ab8580, Abcam)
- Rabbit polyclonal to Histone H3 (acetyl K27) - ChIP Grade (#ab4729, Abcam)
- Mouse monoclonal to Histone H3 (tri methyl K27) - ChIP Grade (#ab6002, Abcam)

Peak calling parameters: Reads were mapped against the amphioxus reference genome using Bowtie, and peaks were called using the MACS2 software with default parameters.

Data quality: ChIP-seq peaks were only used to overlap with the ATAC-seq peaks in multiple cross-validation analyses.

Software: Reads were mapped against the amphioxus reference genome using Bowtie, and peaks were called using the MACS2 software with default parameters. The overlap between ATAC-seq and ChIP-seq peaks was calculated using Bedtools.
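The peak-overlap step can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the source does not say which Bedtools subcommand or options were used, so the sketch assumes the simplest criterion (an ATAC-seq peak counts as overlapping if it shares at least one base with any ChIP-seq peak on the same chromosome, roughly what `bedtools intersect -u` reports). The function name `overlapping_peaks` and the example coordinates are hypothetical.

```python
from collections import defaultdict


def overlapping_peaks(query, reference):
    """Return the query peaks that overlap at least one reference peak.

    Peaks are (chrom, start, end) tuples in BED convention
    (0-based start, end exclusive). Assumption: any overlap of
    one base or more counts.
    """
    # Group reference intervals by chromosome for per-chromosome lookup.
    by_chrom = defaultdict(list)
    for chrom, start, end in reference:
        by_chrom[chrom].append((start, end))

    hits = []
    for chrom, start, end in query:
        # Two half-open intervals overlap when each one starts
        # before the other one ends.
        if any(start < r_end and r_start < end
               for r_start, r_end in by_chrom.get(chrom, ())):
            hits.append((chrom, start, end))
    return hits


# Hypothetical peaks, for illustration only.
atac = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
chip = [("chr1", 150, 300), ("chr2", 150, 250)]

hits = overlapping_peaks(atac, chip)
fraction = len(hits) / len(atac)
print(hits, fraction)
```

In this example only the first ATAC-seq peak is reported: the peak on `chr2` merely abuts a ChIP-seq peak (it ends at 150, where the other begins), and adjacent half-open intervals share no base. The linear scan per query peak is chosen for clarity; genome-scale peak sets would normally be sorted or indexed first.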