Representing Cooperative Interactions in PSI-MI
PSI-MI XML Schema
Although cooperativity between molecular interactions is a common and important phenomenon in cell regulation, it is currently not incorporated in widely used biological data resources. The main reason is the lack of a computer-readable standard representation format that can adequately capture cooperative interactions. Standard data representation formats that allow detailed annotation of molecular interactions do exist, however, and the most widely used of these is the PSI-MI XML data interchange format. This format provides the means to unambiguously and consistently capture, represent and exchange molecular interaction data, and to additionally annotate a wide range of metadata. The root element of the PSI-MI XML schema is the 'entrySet' element (Figure 2 - left). Each entry within an entrySet describes one or more interactions, with the different aspects of an interaction being captured in detail in the child elements of the 'entry' element (Figure 2 - right) (Kerrien et al., 2007).
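As a rough illustration of this nesting, the following Python sketch builds an empty 'entrySet' skeleton with one 'entry'. The 'level', 'version' and 'minorVersion' attribute values are assumptions about how a 2.5.4 file is usually labelled, the function name is made up for this example, and the result is not schema-valid: real files also declare the schema namespace and fill in each child element.

```python
import xml.etree.ElementTree as ET


def build_entry_set():
    """Build a bare entrySet/entry skeleton (illustrative, not schema-valid)."""
    # Version attributes are an assumption about how PSI-MI XML 2.5.4 is labelled.
    entry_set = ET.Element(
        "entrySet", {"level": "2", "version": "5", "minorVersion": "4"}
    )
    entry = ET.SubElement(entry_set, "entry")
    # Main child elements of an entry; each is left empty in this sketch.
    for child in ("source", "experimentList", "interactorList", "interactionList"):
        ET.SubElement(entry, child)
    return entry_set


root = build_entry_set()
print(ET.tostring(root, encoding="unicode"))
```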
The PSI-MI XML data interchange format was developed by the HUPO Proteomics Standards Initiative to facilitate the exchange, comparison and verification of molecular interaction data. The latest release of the PSI-MI XML schema is version 2.5.4 (Kerrien et al., 2007). The use of this XML schema to represent molecular interactions ensures that all representations are consistent in form. To ensure consistency in content, an extensive list of controlled vocabulary terms has been defined (PSI-MI controlled vocabulary). In addition, a community guideline called the Minimum Information about a Molecular Interaction Experiment (MIMIx) (Orchard et al., 2007) has been defined to advise users on how to fully describe a molecular interaction experiment, and to specify the minimum information that is required in order to unambiguously describe such experiments.
Although the PSI-MI format does not explicitly capture interdependencies between different binding events, the current version can still be used to annotate cooperative interactions. Together with the HUPO Proteomics Standards Initiative consortium, we have developed a strategy that describes cooperative interactions without any changes to the PSI-MI XML schema, thus keeping it stable and compatible with existing tools. This strategy, outlined below, mainly uses the interaction 'attributeList' element to capture, in a semi-structured manner, any cooperative effect an interaction has on a subsequent interaction. There are more straightforward ways to incorporate cooperativity in the PSI-MI format, but they would require changes to the XML schema. An important aspect of data representation standards is their stability over time, since a wide variety of tools have been developed to be compatible with the current version of the schema. Changing the schema implies redeveloping these tools, which costs time and effort. Therefore, the strategy explained here, which captures cooperative interaction data using the current version of the PSI-MI XML schema, provides an adequate temporary solution until a new version of the schema is released.
Strategy to annotate cooperative interactions in PSI-MI
Complex assembly
Ordered assembly of molecular complexes can be described using PSI-MI by referring to a previously described interaction as a participant of a subsequent interaction (Figure 3). For instance, if a molecule C binds to molecules A and B only when A and B are bound to each other in a complex, a first interaction will describe binding of A to B, i.e. the pre-formation of the complex A-B. A second interaction will then describe the binding of C to the complex A-B. This second interaction has two participants, which are annotated in the 'participantList' element of this interaction. The first participant is the molecule C, which can either be fully described in a 'participant' element, or can reference an interactor that was already described in the 'interactorList' element of this entry, using the ID of the corresponding interactor. The second participant is the complex A-B. Using the 'interactionRef' element (Figure 3), a child element of the 'participant' element, this participant will reference the first interaction, which describes the formation of the complex A-B, by using the ID of this interaction (See also MI:0317 in the PSI-MI controlled vocabulary).
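The A, B and C example can be sketched in Python as follows. The numeric IDs and the helper name add_interaction are made up for illustration, and the interactors A, B and C are assumed to be described elsewhere in the entry's 'interactorList' with IDs 1, 2 and 3. Only the referencing pattern is shown; the other elements an interaction needs are omitted.

```python
import xml.etree.ElementTree as ET


def add_interaction(interaction_list, interaction_id, participants):
    """Append an interaction whose participants each carry one reference.

    participants is a list of (participant_id, ref_tag, ref_id) tuples, where
    ref_tag is 'interactorRef' (a molecule) or 'interactionRef' (a complex
    formed in an earlier interaction).
    """
    interaction = ET.SubElement(
        interaction_list, "interaction", {"id": str(interaction_id)}
    )
    participant_list = ET.SubElement(interaction, "participantList")
    for participant_id, ref_tag, ref_id in participants:
        participant = ET.SubElement(
            participant_list, "participant", {"id": str(participant_id)}
        )
        ET.SubElement(participant, ref_tag).text = str(ref_id)
    return interaction


interaction_list = ET.Element("interactionList")
# Interaction 10: A (interactor 1) binds B (interactor 2), forming complex A-B.
add_interaction(interaction_list, 10, [(11, "interactorRef", 1), (12, "interactorRef", 2)])
# Interaction 20: C (interactor 3) binds the complex formed in interaction 10.
add_interaction(interaction_list, 20, [(21, "interactorRef", 3), (22, "interactionRef", 10)])
print(ET.tostring(interaction_list, encoding="unicode"))
```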
If one of the subunits in the pre-formed complex has a feature that is important for the interaction of the complex with another molecule, it should be possible to discriminate between the different subunits of the complex and specify on which subunit the feature is located. To this end, a new controlled vocabulary term called 'participantRef' was added to the PSI-MI controlled vocabulary. The 'participantRef' term is a 'feature attribute name' (MI:0668) that can be used to specify on which subunit of the pre-formed complex a feature is located. Using the example of C binding to the pre-formed complex A-B, assume that subunit A is required to be phosphorylated to interact with C. In this case, a phosphorylated residue will be annotated as a feature of the complex A-B participant in the second interaction (in the 'featureList' element of this participant). To specify that it is subunit A having the phosphorylated residue, the 'participantRef' attribute will be annotated as a child element of this feature. The value of this attribute will be the participant ID of molecule A in the first interaction, i.e. the interaction that describes the binding of molecule A to molecule B.
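A minimal sketch of this annotation, assuming the feature carries its own 'attributeList' and that the 'participantRef' attribute value is simply the participant ID as text. The IDs, the short label and the helper name are hypothetical, and other feature elements (feature type, ranges) are left out.

```python
import xml.etree.ElementTree as ET


def add_feature_on_subunit(participant, feature_id, short_label, subunit_participant_id):
    """Attach a feature to a complex participant and point it at one subunit."""
    feature_list = participant.find("featureList")
    if feature_list is None:
        feature_list = ET.SubElement(participant, "featureList")
    feature = ET.SubElement(feature_list, "feature", {"id": str(feature_id)})
    names = ET.SubElement(feature, "names")
    ET.SubElement(names, "shortLabel").text = short_label
    # 'participantRef' names the subunit (a participant of the earlier
    # interaction) on which the feature is located.
    attribute_list = ET.SubElement(feature, "attributeList")
    attribute = ET.SubElement(attribute_list, "attribute", {"name": "participantRef"})
    attribute.text = str(subunit_participant_id)
    return feature


# Participant 22 is the complex A-B, referencing interaction 10 (hypothetical IDs).
complex_participant = ET.Element("participant", {"id": "22"})
ET.SubElement(complex_participant, "interactionRef").text = "10"
# The phosphorylated residue sits on subunit A, participant 11 of interaction 10.
add_feature_on_subunit(complex_participant, 221, "phosphorylated residue", 11)
print(ET.tostring(complex_participant, encoding="unicode"))
```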
Interaction interdependencies - Cooperativity
One of the child elements of the 'interaction' element in the PSI-MI XML schema is the optional 'attributeList' element, which itself can contain one to many 'attribute' elements (Figure 4). The 'attributeList' of the 'interaction' element allows additional description of the interaction data in a semi-structured manner. Each 'attribute' in the 'attributeList' contains a value of the type string and is specified by two attributes: the name of the 'attribute' (type: string), which is required, and the name accession (type: string), which is optional (Figure 4). The latter enables control of the 'attribute' type by referring to an external controlled vocabulary. The root term in the PSI-MI controlled vocabulary for terms that can be used to describe the free text that is stored as an 'attribute' value is the 'attribute name' term (MI:0590). In order to annotate cooperative interaction data using the current PSI-MI XML schema, new terms were added to the PSI-MI controlled vocabulary. Those new terms that are children of the MI:0664 ('interaction attribute name') term can be used to describe a particular aspect of a cooperative interaction. These terms, their definitions and relationships are described on the New CV terms page.
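The shape of such an 'attribute' can be sketched as below. The helper name add_attribute is made up, and the attribute name "affected interaction" and the use of an interaction ID as its value are illustrative assumptions; the actual term names and accessions are those listed on the New CV terms page.

```python
import xml.etree.ElementTree as ET


def add_attribute(interaction, name, value, name_ac=None):
    """Add one attribute to an interaction's attributeList, creating the list if needed."""
    attribute_list = interaction.find("attributeList")
    if attribute_list is None:
        attribute_list = ET.SubElement(interaction, "attributeList")
    attrib = {"name": name}  # required
    if name_ac is not None:
        attrib["nameAc"] = name_ac  # optional accession in a controlled vocabulary
    attribute = ET.SubElement(attribute_list, "attribute", attrib)
    attribute.text = value  # free-text string value
    return attribute


interaction = ET.Element("interaction", {"id": "10"})
# Hypothetical usage: interaction 10 has a cooperative effect on interaction 20.
add_attribute(interaction, "affected interaction", "20")
print(ET.tostring(interaction, encoding="unicode"))
```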
Cooperative interaction data to be annotated
Different controlled vocabulary terms were introduced to represent different aspects of cooperative interactions. The list below describes the different types of data that can be described for cooperative interactions using these new terms. A more detailed description of these CV terms can be found on the New CV terms page.
-There are two basic mechanisms underlying cooperativity in molecular interactions: allostery and pre-assembly.
-Either of these two mechanisms, or a combination of both, can mediate a cooperative effect of a binding event (including covalent post-translational modifications) on a subsequent interaction, referred to as the affected interaction.
-The outcome of this effect can be either positive (the affected interaction is augmented) or negative (the affected interaction is diminished).
-The effect can be quantified by the fold change of the affinity of a molecule (or a catalytic parameter of an enzyme) for a ligand in the absence, versus presence, of a second ligand or a covalent post-translational modification.
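The quantification in the last item can be illustrated with a small calculation. The sketch assumes affinity is expressed as a dissociation constant (Kd) and takes the ratio of Kd in the absence to Kd in the presence of the second ligand or modification, so that values above 1 mean increased affinity. This direction is a convention chosen for the example, not one prescribed by the text, and the function names are hypothetical.

```python
def affinity_fold_change(kd_absent, kd_present):
    """Fold change in affinity, using dissociation constants in the same units.

    Assumed convention: Kd(absent) / Kd(present), so a value > 1 means the
    affinity is higher in the presence of the second ligand or modification.
    """
    if kd_absent <= 0 or kd_present <= 0:
        raise ValueError("dissociation constants must be positive")
    return kd_absent / kd_present


def cooperative_outcome(fold_change):
    """Classify the effect on the affected interaction under the convention above."""
    if fold_change > 1:
        return "positive"  # affected interaction is augmented
    if fold_change < 1:
        return "negative"  # affected interaction is diminished
    return "none"


# Hypothetical values: Kd drops from 10 uM to 2 uM when the second ligand is bound.
change = affinity_fold_change(10.0, 2.0)
print(change, cooperative_outcome(change))
```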
When allostery is the underlying mechanism, the following data can be annotated:
-the allosteric molecule, which specifies the molecule that is allosterically regulated.
-the allosteric effector, which specifies the ligand that elicits an allosteric response in the allosteric molecule upon binding to that molecule.
-the allosteric post-translational modification, which specifies the post-translational modification that elicits an allosteric response in the allosteric molecule upon addition to that molecule.
-the allosteric response, which specifies whether the allosteric perturbation elicits a change in affinity for a second ligand or a change in a catalytic parameter of an enzyme-catalysed reaction.
-the allosteric mechanism, which specifies whether the allosteric response results from a change in molecular structure or from a change in molecular dynamics in the allosteric molecule.
-the type of allostery, which specifies whether or not the allosteric ligand is chemically identical to the primary, orthosteric ligand.
When pre-assembly is the underlying mechanism, the following data can be annotated:
-the pre-assembly response, which specifies the result of pre-assembly: generation of a functional continuous composite binding site spanning different components of the pre-formed complex; alteration of the physicochemical properties of a functional binding site by covalent modification, thereby reducing or increasing its compatibility with its binding partner; hiding of a functional binding site; or pre-organisation of multiple discrete binding sites.
Experiment description
The experiments that determined the interactions annotated in a PSI-MI entry are described in the main 'experimentList' element (Figure 5), a child element of the 'entry' element (Figure 2). Usually, a single experiment associated with a single publication is annotated in a single 'experimentDescription' element within the 'experimentList' element. At the level of the interaction, i.e. within the 'interaction' element, the experiment can be described by referring to an experiment previously annotated in the entry, using its ID as the value of the 'experimentRef' element (Figure 6). Because cooperative interactions are in many cases not fully determined by a single experiment associated with a single publication, experiments that provide evidence for interdependencies between molecular interactions have to be annotated differently. A cooperative effect of one interaction on a subsequent interaction is annotated in the 'interaction' element describing the first interaction, using the new CV terms. The experiments used to determine a single interaction are annotated by cross-reference to an external description of that interaction, for instance an interaction instance curated in the IntAct database, which fully describes the experimental methods related to the interaction. This cross-reference is made in the 'xref' element of the interaction, using the identifier of the interaction in the external resource. The experiments used to determine the cooperative effect of an interaction on a subsequent interaction are captured in the 'experimentDescription' element, using the 'bibref' element to refer to the relevant publication. Within the 'experimentDescription' element, the value of the 'interactionDetectionMethod' element is the CV term 'inferred by author' (MI:0363), indicating that the cooperative nature of the interaction was inferred from one or more publications, based on multiple experiments.
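A minimal Python sketch of such an 'experimentDescription' is given below. The experiment ID and the PubMed identifier are placeholders, the function name is made up, and citing the publication through a 'primaryRef' with db="pubmed" is an assumption about common practice rather than something stated above. Other elements of an experiment description are omitted.

```python
import xml.etree.ElementTree as ET


def build_inferred_experiment(experiment_id, pubmed_id):
    """Build an experimentDescription for a cooperative effect inferred from the literature."""
    experiment = ET.Element("experimentDescription", {"id": str(experiment_id)})

    # bibref points to the publication supporting the cooperative effect.
    bibref = ET.SubElement(experiment, "bibref")
    bib_xref = ET.SubElement(bibref, "xref")
    ET.SubElement(bib_xref, "primaryRef", {"db": "pubmed", "id": str(pubmed_id)})

    # The detection method is the CV term 'inferred by author' (MI:0363).
    method = ET.SubElement(experiment, "interactionDetectionMethod")
    names = ET.SubElement(method, "names")
    ET.SubElement(names, "shortLabel").text = "inferred by author"
    method_xref = ET.SubElement(method, "xref")
    ET.SubElement(method_xref, "primaryRef", {"db": "psi-mi", "id": "MI:0363"})
    return experiment


# "0000000" is a placeholder, not a real PubMed identifier.
experiment = build_inferred_experiment(1, "0000000")
print(ET.tostring(experiment, encoding="unicode"))
```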
Issues
The main advantage of describing the different aspects of cooperativity between molecular interactions in the attribute list of an interaction is that the current PSI-MI XML schema can be used to represent this type of interaction without any structural changes to the schema. The available tools that read PSI-MI XML files can therefore still be used for cooperative interactions represented according to this strategy. However, describing cooperative effects in this manner only allows semi-structured annotation. Although controlled vocabulary terms have been defined that can be used as values for the interaction attributes describing the cooperative effect of one interaction on a subsequent interaction, their usage cannot be enforced, meaning that any value can be entered for the newly added attributes. As a result, consistency of the content of such PSI-MI files cannot be guaranteed. A second problem with this strategy is repetition and redundancy. Since only one cooperative effect can be described in the attribute list of an interaction, an interaction that has multiple cooperative effects has to be described once for each of those effects.
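Because the schema cannot enforce the vocabulary, consistency would have to be checked after the fact. The sketch below shows one way this could look: it reports interaction attributes whose names fall outside a curator-supplied set. The names in EXPECTED_NAMES are stand-ins taken from the descriptions earlier on this page, not the official CV term names, and a real check would also need to allow the other attribute names in use.

```python
import xml.etree.ElementTree as ET

# Stand-in names based on the descriptions on this page; the official term
# names and accessions are those on the New CV terms page.
EXPECTED_NAMES = {
    "affected interaction",
    "allosteric molecule",
    "allosteric effector",
    "allosteric response",
    "allosteric mechanism",
    "pre-assembly response",
}


def unexpected_attribute_names(interaction, expected=EXPECTED_NAMES):
    """Return the names of interaction attributes that are not in the expected set."""
    return [
        attribute.get("name")
        for attribute in interaction.findall("./attributeList/attribute")
        if attribute.get("name") not in expected
    ]


example = ET.fromstring(
    "<interaction id='10'><attributeList>"
    "<attribute name='allosteric effector'>12</attribute>"
    "<attribute name='alosteric molecule'>11</attribute>"
    "</attributeList></interaction>"
)
# The misspelled second name is reported.
print(unexpected_attribute_names(example))
```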
Currently, a minimal number of structural changes are being made to the PSI-MI schema to allow more efficient and structured annotation of cooperative interaction data. These changes will be included in the next version of the PSI-MI schema, which will not be released in the near future. Until then, the new interaction attribute names introduced here provide a temporary solution for describing cooperative interactions using the current PSI-MI schema, version 2.5.4.