Abstract:
CGGBP1, a 20 kDa protein, has several functions associated with its DNA binding through a C2H2 zinc finger. A range of studies have shown that GC richness, inter-strand G/C skew and low cytosine methylation are associated with CGGBP1 occupancy. The absence of any preferred sequence motif for CGGBP1 binding suggests a widespread association of CGGBP1 with DNA, including at potent transcription factor-binding sites (TFBSs) in promoter regions. The evolutionary advantage of such a design remains unclear. Regulatory interference by human CGGBP1 at TFBSs is supported by purifying selection in the DNA-binding domain of CGGBP1 and by its requirement for gene repression as well as for restriction of cytosine methylation at GC-rich TFBSs. Here, we describe an evolutionary trajectory of this property of CGGBP1 by combining global gene expression and cytosine methylation analyses of human cells expressing CGGBPs from four different vertebrates (representatives of coelacanths, reptiles, aves and mammals). We discover a potent cytosine methylation restriction by human CGGBP1 at some GC-rich TFBSs in repressed promoters. Further, we combine these findings with a high-throughput analysis of the GC compositional bias of these CGGBP-regulated TFBSs, using available orthologous sequences from a pool of over 100 species. We show that cytosine methylation restriction by CGGBP1 is tightly linked to GC retention in a set of TFBSs. Our experiments using four representative and three consensus forms of CGGBPs, together with orthology analyses of target gene promoters, indicate that this property of CGGBPs has most likely evolved in higher amniotes (aves and mammals), with lineage-specific heterogeneities in lower amniotes (reptiles). ChIP-seq and C-T transition analysis of MeDIP-seq data suggest that occupancy of CGGBP1 at these target TFBSs plays a crucial role in their low methylation, GC-biased evolution and associated functions in gene repression.