Biological networks are the representation of multiple interactions within a cell, a global view intended to help understand how relationships between molecules dictate cellular behavior. Recent advances in molecular and computational biology have made possible the study of intricate transcriptional regulatory networks that describe gene expression as a function of regulatory inputs specified by interactions between proteins and DNA.
In this issue, Leandrou et al. Kinase activity was notably different in mutant LRRK2 hetero- and homo-dimers. The cover image shows that de-phosphorylated and mutant LRRK2 cytoplasmic filaments are comprised of dimeric species. The image was supplied by Hardy J.
Specificity and Stability in Topology of Protein Networks
A gene (or genetic) regulatory network (GRN) is a collection of molecular regulators that interact with each other and with other substances in the cell to govern the expression levels of mRNA and proteins, which in turn determine the function of the cell.
GRNs also play a central role in morphogenesis, the creation of body structures, which in turn is central to evolutionary developmental biology (evo-devo). The interaction between regulators can be direct, or indirect through transcribed RNA or translated protein. In general, each mRNA molecule goes on to make a specific protein or set of proteins. In some cases this protein will be structural, and will accumulate at the cell membrane or within the cell to give it particular structural properties.
In other cases the protein will be an enzyme. Some proteins, though, serve only to activate other genes, and these are the transcription factors that are the main players in regulatory networks or cascades. By binding to the promoter region at the start of other genes they turn them on, initiating the production of another protein, and so on.
Some transcription factors are inhibitory. In single-celled organisms, regulatory networks respond to the external environment, optimising the cell at a given time for survival in this environment.
Thus a yeast cell, finding itself in a sugar solution, will turn on genes to make enzymes that process the sugar to alcohol. In multicellular animals the same principle has been put in the service of gene cascades that control body-shape. Sometimes a 'self-sustaining feedback loop' ensures that a cell maintains its identity and passes it on. Less understood is the mechanism of epigenetics by which chromatin modification may provide cellular memory by blocking or allowing transcription.
A major feature of multicellular animals is the use of morphogen gradients, which in effect provide a positioning system that tells a cell where in the body it is, and hence what sort of cell to become. A gene that is turned on in one cell may make a product that leaves the cell and diffuses through adjacent cells, entering them and turning on genes only when it is present above a certain threshold level.
These cells are thus induced into a new fate, and may even generate other morphogens that signal back to the original cell. Over longer distances morphogens may use the active process of signal transduction. Such signalling controls embryogenesis, the building of a body plan from scratch through a series of sequential steps. They also control and maintain adult bodies through feedback processes, and the loss of such feedback because of a mutation can be responsible for the cell proliferation that is seen in cancer.
In parallel with this process of building structure, the gene cascade turns on genes that make structural proteins that give each cell the physical properties it needs. At one level, biological cells can be thought of as "partially mixed bags" of biological chemicals; in the discussion of gene regulatory networks, these chemicals are mostly the messenger RNAs (mRNAs) and proteins that arise from gene expression.
These mRNAs and proteins interact with each other with various degrees of specificity. Some diffuse around the cell. Others are bound to cell membranes, interacting with molecules in the environment. Still others pass through cell membranes and mediate long-range signals to other cells in a multicellular organism. These molecules and their interactions comprise a gene regulatory network. In a typical network diagram, edges between nodes represent interactions between the nodes. These can correspond to individual molecular reactions between DNA, mRNA, miRNA, and proteins, or to molecular processes through which the products of one gene affect those of another, though the lack of experimentally obtained information often implies that some reactions are not modeled at such a fine level of detail.
The nodes can regulate themselves directly or indirectly, creating feedback loops, which form cyclic chains of dependencies in the topological network. The network structure is an abstraction of the system's molecular or chemical dynamics, describing the manifold ways in which one substance affects all the others to which it is connected.
In practice, such GRNs are inferred from the biological literature on a given system and represent a distillation of the collective knowledge about a set of related biochemical reactions. To speed up the manual curation of GRNs, some recent efforts try to use text mining, curated databases, network inference from massive data, model checking and other information extraction technologies for this purpose.
Genes can be viewed as nodes in the network, with inputs being proteins such as transcription factors, and outputs being the level of gene expression.
The value of the node depends on a function which depends on the value of its regulators in previous time steps (in the Boolean network described below these are Boolean functions, typically AND, OR, and NOT). These functions have been interpreted as performing a kind of information processing within the cell, which determines cellular behavior. The basic drivers within cells are concentrations of some proteins, which determine both spatial (location within the cell or tissue) and temporal (cell cycle or developmental stage) coordinates of the cell, as a kind of "cellular memory".
The gene networks are only beginning to be understood, and it is a next step for biology to attempt to deduce the functions for each gene "node", to help understand the behavior of the system in increasing levels of complexity, from gene to signaling pathway, cell or tissue level. Mathematical models of GRNs have been developed to capture the behavior of the system being modeled, and in some cases generate predictions corresponding with experimental observations.
In some other cases, models have proven to make accurate novel predictions, which can be tested experimentally, thus suggesting new approaches to explore in an experiment that sometimes wouldn't be considered in the design of the protocol of an experimental laboratory. Recently it has been shown that ChIP-seq signals of histone modifications are more correlated with transcription factor motifs at promoters in comparison to RNA levels.
Gene regulatory networks are generally thought to be made up of a few highly connected nodes (hubs) and many poorly connected nodes nested within a hierarchical regulatory regime.
Thus gene regulatory networks approximate a hierarchical scale-free network topology. There are primarily two ways that networks can evolve, both of which can occur simultaneously. The first is that network topology can be changed by the addition or subtraction of nodes (genes), or parts of the network (modules) may be expressed in different contexts.
The Drosophila Hippo signaling pathway provides a good example. The Hippo signaling pathway controls both mitotic growth and post-mitotic cellular differentiation. This suggests that the Hippo signaling pathway operates as a conserved regulatory module that can be used for multiple functions depending on context. The second way networks can evolve is by changing the strength of interactions between nodes, such as how strongly a transcription factor may bind to a cis-regulatory element.
Such variation in the strength of network edges has been shown to underlie between-species variation in vulva cell fate patterning of Caenorhabditis worms. Another widely cited characteristic of gene regulatory networks is their abundance of certain repetitive sub-networks known as network motifs.
Network motifs can be regarded as repetitive topological patterns when dividing a big network into small blocks. Previous analysis found several types of motifs that appeared more often in gene regulatory networks than in randomly generated networks. One such motif, the feed-forward loop, consists of three nodes and is the most abundant among all possible three-node motifs, as is shown in the gene regulatory networks of fly, nematode, and human.
The enriched motifs have been proposed to follow convergent evolution, suggesting they are "optimal designs" for certain regulatory purposes.
A recent study found that yeast grown in an environment of constant glucose developed mutations in glucose signaling pathways and growth regulation pathways, suggesting that regulatory components responding to environmental changes are dispensable under a constant environment. On the other hand, some researchers hypothesize that the enrichment of network motifs is non-adaptive.
Support for this hypothesis often comes from computational simulations. For example, fluctuations in the abundance of feed-forward loops in a model that simulates the evolution of gene regulatory networks by randomly rewiring nodes may suggest that the enrichment of feed-forward loops is a side-effect of evolution. De novo evolution of coherent type 1 feed-forward loops has been demonstrated computationally in response to selection for their hypothesized function of filtering out a short spurious signal, supporting adaptive evolution, but for non-idealized noise, a dynamics-based system of feed-forward regulation with different topology was instead favored.
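Motif enrichment analyses start from a count of how often a pattern occurs in the observed network. The following is a minimal sketch of counting feed-forward loops in a directed graph, assuming the network is given as a list of (regulator, target) pairs; the function name and the toy edge list are illustrative and not taken from any cited study.

```python
from itertools import permutations


def count_feed_forward_loops(edges):
    """Count ordered triples (x, y, z) with edges x->y, y->z and x->z."""
    # Ignore self-loops; they are not part of the three-node pattern.
    edge_set = {(a, b) for a, b in edges if a != b}
    nodes = {n for edge in edge_set for n in edge}
    count = 0
    for x, y, z in permutations(nodes, 3):
        if (x, y) in edge_set and (y, z) in edge_set and (x, z) in edge_set:
            count += 1
    return count


# Hypothetical toy network: A regulates B and C, B regulates C, C regulates D.
example_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
ffl_count = count_feed_forward_loops(example_edges)
```

To judge enrichment, the same count would be repeated on many randomized networks with matching degree distributions and compared against the observed value. The brute-force loop over all triples is only practical for small networks.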
Regulatory networks allow bacteria to adapt to almost every environmental niche on earth. In bacteria, the principal function of regulatory networks is to control the response to environmental changes, for example nutritional status and environmental stress. It is common to model such a network with a set of coupled ordinary differential equations (ODEs) or stochastic differential equations (SDEs), describing the reaction kinetics of the constituent parts.
The temporal evolution of the system can then be described approximately by a set of rate equations, one per substance, whose functional forms derive from chemical kinetics, for example Michaelis-Menten enzymatic kinetics. Such models are then studied using the mathematics of nonlinear dynamics. System-specific information, such as reaction rate constants and sensitivities, is encoded as constant parameters. Solving for the fixed points of the system, where every concentration stops changing, yields candidate steady states. Steady states of the kinetic equations thus correspond to potential cell types, and oscillatory solutions to naturally cyclic cell types.
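One common way to write such a kinetic model is sketched below. The symbols are chosen here for illustration: S_j(t) is the concentration of substance j in a network of N nodes, and f_j collects the production and degradation terms acting on it. The Michaelis-Menten rate law is included as one standard building block for f_j.

```latex
% Rate equations for an N-node network (symbol names are illustrative)
\frac{dS_j}{dt} = f_j\left(S_1, S_2, \ldots, S_N\right), \qquad j = 1, \ldots, N

% Fixed-point (steady-state) condition
\frac{dS_j}{dt} = 0 \quad \text{for all } j

% Michaelis-Menten rate law for an enzymatic step with substrate concentration [S]
v = \frac{V_{\max}\,[S]}{K_M + [S]}
```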
Mathematical stability of these attractors can usually be characterized by the sign of higher derivatives at critical points, and then correspond to biochemical stability of the concentration profile. Critical points and bifurcations in the equations correspond to critical cell states in which small state or parameter perturbations could switch the system between one of several stable differentiation fates.
Trajectories correspond to the unfolding of biological pathways and transients of the equations to short-term biological events. For a more mathematical discussion, see the articles on nonlinearity, dynamical systems, bifurcation theory, and chaos theory.
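The correspondence between steady states and alternative cell fates can be illustrated with a two-gene mutual-repression ("toggle switch") model. This is a minimal sketch under stated assumptions: the Hill-function form, the parameter values, and the forward-Euler integrator are all choices made for this example, not taken from the text above.

```python
def hill_repression(x, k=1.0, n=2):
    """Fraction of maximal production left when a repressor is at level x."""
    return 1.0 / (1.0 + (x / k) ** n)


def simulate_toggle(a0, b0, alpha=4.0, dt=0.01, steps=5000):
    """Forward-Euler integration of two genes that repress each other.

    da/dt = alpha * hill_repression(b) - a
    db/dt = alpha * hill_repression(a) - b
    """
    a, b = a0, b0
    for _ in range(steps):
        da = alpha * hill_repression(b) - a
        db = alpha * hill_repression(a) - b
        a, b = a + dt * da, b + dt * db
    return a, b


# Two initial conditions on either side of the a == b diagonal.
high_a = simulate_toggle(2.0, 0.1)
high_b = simulate_toggle(0.1, 2.0)
```

With these illustrative parameters the two runs settle into different stable steady states, one with gene a high and one with gene b high, which is the sense in which small differences in state can select between differentiation fates.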
The following example illustrates how a Boolean network can model a GRN together with its gene products (the outputs) and the substances from the environment that affect it (the inputs). Stuart Kauffman was amongst the first biologists to use the metaphor of Boolean networks to model genetic regulatory networks. The validity of the model can be tested by comparing simulation results with time series observations. A partial validation of a Boolean network model can also come from testing the predicted existence of a yet unknown regulatory connection between two particular transcription factors that each are nodes of the model.
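A minimal sketch of a synchronous Boolean network is given here, assuming a hypothetical three-gene circuit whose wiring (A repressed by C, B activated by A, C requiring both A and B) is invented for illustration. Each gene is on or off, all genes update together, and the simulation runs until a state repeats, which identifies the attractor.

```python
def step(state, rules):
    """Apply every gene's Boolean rule to the current state at once."""
    return tuple(rule(state) for rule in rules)


def find_attractor(state, rules):
    """Iterate until a state repeats; return the repeating cycle of states."""
    seen = {}
    trajectory = []
    while state not in seen:
        seen[state] = len(trajectory)
        trajectory.append(state)
        state = step(state, rules)
    return trajectory[seen[state]:]


# Hypothetical wiring; state is (A, B, C).
rules = (
    lambda s: not s[2],       # A is on unless C represses it (NOT)
    lambda s: s[0],           # B copies its activator A
    lambda s: s[0] and s[1],  # C needs both A and B (AND)
)

attractor = find_attractor((True, False, False), rules)
```

For this wiring no state maps to itself, so the attractor is a cycle rather than a fixed point. A simulated sequence of states like this one is what would be compared against time series observations when validating a model.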
Continuous network models of GRNs are an extension of the Boolean networks described above. Nodes still represent genes and connections between them regulatory influences on gene expression. Genes in biological systems display a continuous range of activity levels and it has been argued that using a continuous representation captures several properties of gene regulatory networks not present in the Boolean model.
This model is formally closer to a higher-order recurrent neural network. The same model has also been used to mimic the evolution of cellular differentiation and even multicellular morphogenesis. Recent experimental results have demonstrated that gene expression is a stochastic process. Thus, many authors are now using the stochastic formalism, after the work by Arkin et al.
The first versions of stochastic models of gene expression involved only instantaneous reactions and were driven by the Gillespie algorithm. Since some processes, such as gene transcription, involve many reactions and could not be correctly modeled as an instantaneous reaction in a single step, it was proposed to model these reactions as single step multiple delayed reactions in order to account for the time it takes for the entire process to be complete.
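The instantaneous-reaction case can be sketched with the direct-method Gillespie algorithm applied to the simplest gene expression model: production of a molecule at a constant rate and first-order degradation. The function name, rate names, and parameter values below are assumptions made for this example.

```python
import random


def gillespie_birth_death(k_prod, k_deg, t_end, seed=0):
    """Simulate 0 -> X (rate k_prod) and X -> 0 (rate k_deg * n) exactly."""
    rng = random.Random(seed)
    t, n = 0.0, 0
    times, counts = [t], [n]
    while True:
        a_prod = k_prod          # propensity of production
        a_deg = k_deg * n        # propensity of degradation
        a_total = a_prod + a_deg
        # Waiting time to the next reaction is exponentially distributed.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_prod:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


times, counts = gillespie_birth_death(k_prod=2.0, k_deg=0.1, t_end=100.0)
```

Every reaction here takes effect the moment it fires, which is the limitation that motivates the delayed formulation for multi-step processes such as transcription.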
From here, a set of reactions was proposed that allows generating GRNs. These are then simulated using a modified version of the Gillespie algorithm that can simulate multiple time-delayed reactions (chemical reactions in which each product is assigned a time delay that determines when it will be released into the system as a "finished product").
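One way such a delayed simulation can be organized is sketched below. It is a simplified illustration, not the published algorithm: a single product, a constant delay, and a waiting list of release times are all assumptions of this example. When a release is due before the next sampled reaction, time advances to the release and the next reaction time is redrawn.

```python
import heapq
import random


def delayed_ssa(k_init, k_deg, delay, t_end, seed=0):
    """Initiation at rate k_init; product appears `delay` later; decay k_deg * n."""
    rng = random.Random(seed)
    t, n = 0.0, 0
    pending = []  # min-heap of release times for products still being made
    history = [(t, n)]
    while True:
        a_init = k_init
        a_deg = k_deg * n
        a_total = a_init + a_deg
        t_next = t + rng.expovariate(a_total)
        if pending and pending[0] <= t_next:
            # A delayed product finishes first: release it and redraw.
            t = heapq.heappop(pending)
            if t > t_end:
                break
            n += 1
        else:
            t = t_next
            if t > t_end:
                break
            if rng.random() * a_total < a_init:
                # Initiation: schedule the finished product for later.
                heapq.heappush(pending, t + delay)
            else:
                n -= 1
        history.append((t, n))
    return history


history = delayed_ssa(k_init=1.0, k_deg=0.1, delay=5.0, t_end=50.0)
```

Redrawing the reaction time after a release is valid because exponential waiting times are memoryless. Drawing each delay from a distribution instead of using a constant would only change the value pushed onto the waiting list.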
Furthermore, there seems to be a trade-off between the noise in gene expression, the speed with which genes can switch, and the metabolic cost associated with their functioning.
More specifically, for any given level of metabolic cost, there is an optimal trade-off between noise and processing speed, and increasing the metabolic cost leads to better speed-noise trade-offs. A recent work proposed a simulator (SGNSim, Stochastic Gene Networks Simulator) that can model GRNs in which transcription and translation are modeled as multiple time-delayed events and whose dynamics are driven by a stochastic simulation algorithm (SSA) able to deal with multiple time-delayed events.
The time delays can be drawn from several distributions and the reaction rates from complex functions or from physical parameters. It can also be used to model specific GRNs and systems of chemical reactions. Genetic perturbations such as gene deletions, gene over-expression, insertions, and frame-shift mutations can be modeled as well.
The GRN is created from a graph with the desired topology, imposing in-degree and out-degree distributions. Gene promoter activities are affected by other genes' expression products that act as inputs, in the form of monomers or combined into multimers, and set as direct or indirect. Next, each direct input is assigned to an operator site, and different transcription factors can be allowed, or not, to compete for the same operator site, while indirect inputs are given a target. Finally, a function is assigned to each gene, defining the gene's response to a combination of transcription factors (promoter state).
The transfer functions (that is, how genes respond to a combination of inputs) can be assigned to each combination of promoter states as desired.
Hsp70 participates in a broad spectrum of protein folding processes extending from nascent chain folding to protein disaggregation. This versatility in function is achieved through a diverse family of J-protein cochaperones that select substrates for Hsp70. Substrate selection is further tuned by transient complexation between different classes of J-proteins, which expands the range of protein aggregates targeted by metazoan Hsp70 for disaggregation. We assessed the prevalence and evolutionary conservation of J-protein complexation and cooperation in disaggregation. We find the emergence of a eukaryote-specific signature for interclass complexation of canonical J-proteins. Consistently, complexes exist in yeast and human cells, but not in bacteria, and correlate with cooperative action in disaggregation in vitro.
Defining specific protein interactions and spatially or temporally restricted local proteomes improves our understanding of all cellular processes, but obtaining such data is challenging, especially for rare proteins, cell types, or events. Recent technological improvements, namely two highly active biotin ligase variants (TurboID and miniTurbo), allowed us to address two challenging questions in plants: (1) what are the in vivo partners of a low-abundance key developmental transcription factor, and (2) what is the nuclear proteome of a rare cell type? Proteins identified with FAMA-TurboID include known interactors of this stomatal transcription factor and novel proteins that could facilitate its activator and repressor functions. Directing TurboID to stomatal nuclei enabled purification of cell type- and subcellular compartment-specific proteins. Broad tests of TurboID and miniTurbo in Arabidopsis and Nicotiana benthamiana and versatile vectors enable customization by plant researchers. Cells contain thousands of different proteins that work together to control processes essential for life. To fully understand how these processes work it is important to know which proteins interact with each other, and which proteins are present at specific times or in certain cellular locations.
Gene regulatory network
Protein structure is the three-dimensional arrangement of atoms in an amino acid-chain molecule. A single amino acid monomer may also be called a residue, indicating a repeating unit of a polymer. Proteins form by amino acids undergoing condensation reactions, in which the amino acids lose one water molecule per reaction in order to attach to one another with a peptide bond. By convention, a chain under 30 amino acids is often identified as a peptide, rather than a protein.
Progress in uncovering the protein interaction networks of several species has led to questions of what underlying principles might govern their organization. Few studies have tried to determine the impact of protein interaction network evolution on the observed physiological differences between species. For Homo sapiens this corresponds to 10^3 interactions changed per million years.
Studies of the yeast protein interaction network have revealed distinct correlations between the connectivity of individual proteins within the network and the average connectivity of their neighbours. Although a number of biological mechanisms have been proposed to account for these findings, the significance and influence of the specific datasets included in these studies have not been appreciated adequately. We show how the use of different interaction data sets, such as those resulting from high-throughput or small-scale studies, and different modelling methodologies for the derivation of pair-wise protein interactions, can dramatically change the topology of these networks.