Evolution of distinct EGF domains with specific functions
Abstract
EGF domains are extracellular protein modules cross-linked by three intradomain disulfides. Past studies suggest the existence of two types of EGF domain with three-disulfides, human EGF-like (hEGF) domains and complement Clr-like (cEGF) domains, but to date no functional information has been related to the two different types, and they are not differentiated in sequence or structure databases. We have developed new sequence patterns based on the different C-termini to search specifically for the two types of EGF domains in sequence databases. The exhibited sensitivity and specificity of the new pattern-based method represents a significant advancement over the currently available sequence detection techniques. We re-annotated EGF sequences in the latest release of Swiss-Prot looking for functional relationships that might correlate with EGF type. We show that important post-translational modifications of three-disulfide EGFs, including unusual forms of glycosylation and post-translational proteolytic processing, are dependent on EGF subtype. For example, EGF domains that are shed from the cell surface and mediate intercellular signaling are all hEGFs, as are all human EGF receptor family ligands. Additional experimental data suggest that functional specialization has accompanied subtype divergence. Based on our structural analysis of EGF domains with three-disulfide bonds and comparison to laminin and integrin-like EGF domains with an additional interdomain disulfide, we propose that these hEGF and cEGF domains may have arisen from a four-disulfide ancestor by selective loss of different cysteine residues. Copyright © 2005 The Protein Society.