KEGG network provider allows you to extract metabolic networks from KEGG [15] that are specific to a set of organisms. In addition, you can exclude certain compounds or reactions from these networks.
A range of tools works with KGML files. Click on ``Manual -> Related tools'' to see a selection of them. KEGG network provider differs from these tools in that it can also extract RPAIR networks and supports filtering of compounds, reactions and RPAIR classes.
KEGG network provider itself offers no network analysis or visualization functions. For these purposes, you can use a NeAT tool (a selection is offered once network construction has finished) or any other graph analysis tool that reads the gml, VisML or dot format.
For visualization of KEGG networks, you can use iPATH [22], KGML-ED [17] or metaSHARK [12]. Yanasquare [27] and Pathway Hunter Tool [26] offer organism-specific KEGG network construction in combination with analysis functions. With [36], you can construct KEGG metabolic networks in R.
It should be noted that KEGG annotators omitted side compounds in the KGML files. Thus, certain molecules (such as CO2, ATP or ADP) might be absent from the metabolic networks extracted from these files.
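A quick way to see whether side compounds are missing from an extracted network is to search the exported file for their KEGG identifiers. The following is a minimal sketch, not part of KEGG network provider: the file name `network.txt` is a placeholder, the helper `report_compounds` is hypothetical, and it assumes that compound identifiers appear as whole words in the tab-delimited export (and that your `grep` supports `-w`, as GNU and BSD grep do).

```shell
#!/bin/sh
# Report whether selected KEGG compound identifiers occur in a network file.
# Assumption: identifiers appear as whole words in the tab-delimited export.
report_compounds() {
    network="$1"
    shift
    for id in "$@"; do
        if grep -q -w -- "$id" "$network"; then
            echo "$id present"
        else
            echo "$id absent"
        fi
    done
}

# C00011 = CO2, C00002 = ATP, C00008 = ADP (KEGG COMPOUND identifiers).
# network.txt is a placeholder for a file exported by KEGG network provider.
if [ -f network.txt ]; then
    report_compounds network.txt C00011 C00002 C00008
fi
```

If all three identifiers are reported as absent, path finding results in that network cannot pass through these side compounds.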
It is also worth noting that constructing metabolic networks from KGML files produces networks of much lower quality
than those obtained by manual metabolic reconstruction. Manual reconstruction draws on several resources,
such as the biochemical literature, databases and genome annotations (e.g. [8]). Because this curation is laborious,
the metabolism of only a few organisms has been manually reconstructed so far.
In automatically reconstructed networks, reactions might not be balanced and compounds might occur more
than once with different identifiers (see e.g. [25] for annotation problems in KEGG).
For the purpose of path finding, automatically reconstructed metabolic networks may nevertheless be of interest.
Our case study consists of constructing two metabolic networks: one for five yeast species and the other for Escherichia coli K-12 MG1655. We will compare the path finding results obtained for these two networks for a metabolic reference pathway (lysine biosynthesis).
In the right panel, you should now see a form entitled ``KEGG network provider''.
The KEGG network provider form has now loaded the organism identifiers of five yeast species. As explained in the form, the species concerned are: Saccharomyces bayanus, Saccharomyces mikatae, Saccharomyces paradoxus, Schizosaccharomyces pombe and Saccharomyces cerevisiae.
The network extraction should take only a few seconds. A link to the extracted network is then displayed. In addition, for the tab-delimited and gml formats, the Next step panel should appear.
Repeat the previous steps, but instead of selecting DEMO in the KEGG network provider form, enter eco in the organisms text input field. Make sure to select directed network in the KEGG network provider form, then follow steps 4 to 10 as described above.
The command-line version of this tutorial is restricted to the E. coli and S. cerevisiae metabolic networks. It is assumed that you have installed the required command-line tools.
java graphtools.util.MetabolicGraphProvider -i eco -d -o eco_metabolic_network_directed.txt
java graphtools.algorithms.Pathfinder -g eco_metabolic_network_directed.txt -f tab -s C00049 -t C00047 -r 1 -d -y con -b -T pathsUnion -O gml -o lysinebiosyn_eco.gml
java graphtools.util.MetabolicGraphProvider -i sce -d -o sce_metabolic_network_directed.txt
java graphtools.algorithms.Pathfinder -g sce_metabolic_network_directed.txt -f tab -s C00049 -t C00047 -d -r 1 -y con -b -T pathsUnion -O gml -o lysinebiosyn_sce.gml
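Since the E. coli and S. cerevisiae runs differ only in the organism code, the four commands above can be generated from a single template. The sketch below only prints the commands and executes nothing; the function name `tutorial_commands` is hypothetical, and the options are copied from the commands above in the order used for E. coli.

```shell
#!/bin/sh
# Print the two tutorial commands for one KEGG organism code (e.g. eco, sce).
# All options are taken unchanged from the tutorial; nothing is executed here.
tutorial_commands() {
    org="$1"
    network="${org}_metabolic_network_directed.txt"
    # Step 1: extract the directed organism-specific network.
    echo "java graphtools.util.MetabolicGraphProvider -i ${org} -d -o ${network}"
    # Step 2: find paths from L-aspartate (C00049) to L-lysine (C00047).
    echo "java graphtools.algorithms.Pathfinder -g ${network} -f tab -s C00049 -t C00047 -r 1 -d -y con -b -T pathsUnion -O gml -o lysinebiosyn_${org}.gml"
}

for org in eco sce; do
    tutorial_commands "$org"
done
```

Saved under a name of your choice, for example `lysine_commands.sh`, the output can be inspected first and then piped to `sh` to run the commands, provided the graphtools classes are on your Java class path.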
After completing the steps of this tutorial, you should have two pathway images:
one for the yeast network and one for the E. coli network. The two pathways differ substantially.
If we compare each of these pathways with the respective organism-specific pathway map in KEGG, we notice that
the pathway inferred for the E. coli network reproduces the reference pathway correctly.
The yeast pathway deviates from the S. cerevisiae KEGG pathway map in the segment from L-aspartate to but-1-ene-1,2,4-tricarboxylate,
but otherwise recovers the reference pathway correctly (ignoring the intermediate steps 5-adenyl-2-aminoadipate and
alpha-aminoadipoyl-S-acyl enzyme associated with EC number 1.2.1.31).
For comparison purposes, we have chosen the same start and end compound for both metabolic networks, but it should
be noted that the reference lysine biosynthesis pathway in S. cerevisiae starts from 2-oxoglutarate.
The lysine biosynthesis KEGG map for yeast is available at:
http://www.genome.ad.jp/dbget-bin/get_pathway?org_name=sce&mapno=00300
The one for E. coli is available at:
http://www.genome.ad.jp/dbget-bin/get_pathway?org_name=eco&mapno=00300
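The visual comparison of the two inferred pathways can be complemented by listing the compounds they share. The sketch below is a rough aid under stated assumptions: it assumes that the gml files written by Pathfinder contain KEGG compound identifiers (the letter C followed by five digits) somewhere in their node descriptions, and that `grep -o` is available (GNU and BSD grep). The helper names `compound_ids` and `compare_pathways` are hypothetical.

```shell
#!/bin/sh
# List the distinct KEGG compound identifiers (C + five digits) found in a file.
compound_ids() {
    grep -o 'C[0-9]\{5\}' "$1" | sort -u
}

# Print the compounds shared by two pathway files and those unique to each.
compare_pathways() {
    a=$(mktemp)
    b=$(mktemp)
    compound_ids "$1" > "$a"
    compound_ids "$2" > "$b"
    echo "shared:"
    comm -12 "$a" "$b"
    echo "only in $1:"
    comm -23 "$a" "$b"
    echo "only in $2:"
    comm -13 "$a" "$b"
    rm -f "$a" "$b"
}

# Run the comparison only if both tutorial output files exist.
if [ -f lysinebiosyn_eco.gml ] && [ -f lysinebiosyn_sce.gml ]; then
    compare_pathways lysinebiosyn_eco.gml lysinebiosyn_sce.gml
fi
```

Since both searches use the same start and end compounds, C00049 and C00047 should appear in the shared list; the lists of unique compounds indicate where the two pathways diverge.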
The case study demonstrated that different organisms may employ different metabolic pathways for the synthesis or degradation of a given compound. For this reason, it is useful to be able to construct metabolic networks that are specific to a selected set of organisms.