Data and information related
to the article published in PLoS Genetics (DOI:)
Extensive divergence of transcription factor binding in
Drosophila embryos with highly conserved gene expression
Mathilde Paris1, Tommy Kaplan1,2,
Xiao Yong Li1,3, Jacqueline E. Villalta2, Susan E. Lott1,4,
Michael B. Eisen1,2,3
1 Department of Molecular and Cell Biology, University
of California Berkeley, Berkeley, California, United States of America, 2 School
of Computer Science and Engineering, The Hebrew University, Jerusalem, Israel, 3
Howard Hughes Medical Institute, University of California Berkeley, Berkeley,
California, United States of America, 4 Department of Evolution and
Ecology, University of California, Davis, California, United States of America.
Corresponding authors: MP (thildeparis@gmail.com) and MBE (mbeisen@gmail.com)
ABSTRACT
To better characterize how variation in regulatory
sequences drives divergence in gene expression, we undertook a systematic study
of transcription factor binding and gene expression in blastoderm
embryos of four species, which sample much of the diversity in the
40-million-year-old genus Drosophila: D. melanogaster, D. yakuba, D. pseudoobscura and D. virilis. We
compared gene expression, measured by mRNA-seq, to
the genome-wide binding, measured by ChIP-seq, of
four transcription factors involved in early anterior-posterior patterning. We
found that mRNA levels are much better conserved than individual transcription
factor binding events, and that changes in a gene's expression were poorly
explained by changes in adjacent transcription factor binding. However, highly
bound sites, sites in regions bound by multiple factors and sites near genes
are conserved more frequently than other binding, suggesting that a
considerable amount of transcription factor binding is weakly or non-functional
and not subject to purifying selection.
ADDITIONAL DATA
- All sequencing data were deposited in the NCBI GEO (accession
number GSE50773)
- Alignment of the genomes of D. melanogaster, D. yakuba, D. pseudoobscura and D. virilis.
The synteny map was created using MERCATOR and the alignments were
created using PECAN.
- Modified annotations for D. yakuba, D. pseudoobscura and D. virilis. These annotations were created using the
reference annotations as well as RNA-seq data of
pooled embryos (GEO accession number GSE50773). Gene IDs correspond to the IDs
in the orthology table described below.
- Expression levels for each set of orthologous genes in each species
(D. melanogaster, D. yakuba, D. pseudoobscura and D. virilis). Inferred
ancestral values and the parameters of the Brownian motion model are also given.
Orthology assignment between genes was established based on
the whole-genome alignment: genes were considered orthologous
if the coordinates of their exons intersect more than
40% of their total length and if their orientation is the same (or unknown).
Because this method is genome-alignment based, it takes into account both
sequence similarity and synteny, thus favoring ortholog over paralog
association. We removed from the analysis genes that showed orthology
inconsistencies (e.g. genes with different orthologs
in different species).
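
The orthology rule described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the exons of both genes have already been projected into a shared whole-genome alignment coordinate system, that intervals are half-open, and that the 40% threshold is applied to the total exon length of each gene (the text does not say which length is meant). All function and variable names are hypothetical.

```python
from typing import List, Optional, Tuple

# Half-open [start, end) interval in shared alignment coordinates (assumption).
Interval = Tuple[int, int]


def merge(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or adjacent intervals so each base is counted once."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def total_length(intervals: List[Interval]) -> int:
    """Total number of bases covered by a list of merged intervals."""
    return sum(end - start for start, end in intervals)


def overlap_length(a: List[Interval], b: List[Interval]) -> int:
    """Number of bases shared by two interval sets (two-pointer sweep)."""
    a, b = merge(a), merge(b)
    i = j = shared = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            shared += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared


def are_orthologous(
    exons_a: List[Interval],
    strand_a: Optional[str],
    exons_b: List[Interval],
    strand_b: Optional[str],
    min_fraction: float = 0.40,
) -> bool:
    """Apply the exon-overlap and orientation criteria to one gene pair."""
    # Orientation must match; None stands for an unknown orientation.
    if strand_a is not None and strand_b is not None and strand_a != strand_b:
        return False
    len_a = total_length(merge(exons_a))
    len_b = total_length(merge(exons_b))
    if len_a == 0 or len_b == 0:
        return False
    shared = overlap_length(exons_a, exons_b)
    # Assumption: the overlap must exceed the threshold for both genes.
    return shared > min_fraction * len_a and shared > min_fraction * len_b
```

For example, `are_orthologous([(100, 200), (300, 400)], "+", [(150, 200), (300, 420)], "+")` compares two genes whose exons share 150 aligned bases. Removing genes with inconsistent orthologs across species would be a separate filtering step applied to the resulting pairs.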