Laboratory of Computer and Information Science / Neural Networks Research Centre CIS Lab Helsinki University of Technology

Fetching data from publications

In this exercise we familiarize ourselves with the process of obtaining data that could be used for modeling biological networks. We will not focus on the databases; instead we study what kinds of public data sets have been presented and analyzed in recent high-profile publications. We will not analyze the data sets in this exercise, but the data available in these publications can be used later in the project work.

Below is a list of articles that all describe or use publicly accessible data. Pick 2-4 of them, quickly read the relevant sections to see what kind of data they analyze, and then find the data on the web. Some of the data files are small, and you should download those in order to take a look; others are too big to download right now.

Write down a few observations from each of the papers you chose. An example could be something like: "PPI data from baker's yeast, obtained using mass spectrometry. Contains X interactions (binary values) between Y proteins. The data is stored in a tab-delimited file, which looks easy to parse." You can also write down what they did with the data in the paper.
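To fill in the "X interactions between Y proteins" part of such an observation, a short script is often enough. The following is a minimal sketch, assuming a two-column tab-delimited file with one interaction per line and optional comment lines starting with "#". The function name summarize_ppi and the sample protein names are made up for illustration; real supplementary files may have headers or extra columns that need separate handling.

```python
import csv
import io


def summarize_ppi(handle):
    """Count unique undirected interactions and distinct proteins.

    Assumes each data row holds two protein identifiers separated by a tab.
    """
    interactions = set()
    proteins = set()
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines, comment lines and rows with fewer than two columns.
        if len(row) < 2 or row[0].startswith("#"):
            continue
        a, b = row[0].strip(), row[1].strip()
        if not a or not b:
            continue
        proteins.update((a, b))
        # frozenset makes (a, b) and (b, a) count as the same interaction.
        interactions.add(frozenset((a, b)))
    return len(interactions), len(proteins)


# Made-up sample content; with a real file use open("some_file.txt", newline="").
sample = "# proteinA\tproteinB\nprotA\tprotB\nprotB\tprotA\nprotA\tprotC\n"
n_interactions, n_proteins = summarize_ppi(io.StringIO(sample))
print(f"{n_interactions} interactions between {n_proteins} proteins")
```

Treating each pair as unordered is a design choice that suits binary PPI data; for directed data (for example bait-prey pairs) a tuple would be more appropriate than a frozenset.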


  1. Nevan J Krogan et al. 2006 Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006 Mar 30; 440: 637-643
  2. Anne-Claude Gavin et al. 2006 Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006 Mar 30; 440: 631-636
  3. Schwikowski,B., Uetz,P. & Fields,S. (2000) A network of protein-protein interactions in yeast. Nature Biotechnology 18 (12): 1257-1261
  4. Uetz P, et al. A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature. 2000 Feb 10;403(6770):623-7.
  5. Yuen Ho et al. Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. 2002. Nature 415, 180-183 (10 January 2002)
  6. Titz, B. et al. 2004. What do we learn from high-throughput protein interaction data and networks? Expert Reviews in Proteomics 1 (1): 89-99
  7. Nizar N Batada et al. Stratus Not Altocumulus: A New View of the Yeast Protein Interaction Network. PLoS Biol. 2006 Sep 19;4 (10).
  8. Sean R Collins et al. 2007 Towards a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Mol Cell Proteomics. 2007 Jan 2.
  9. Rob M Ewing et al. 2007 Large-scale mapping of human protein-protein interactions by mass spectrometry. Mol Syst Biol. 2007 ; 3: 89. Epub 2007 Mar 13.
  10. Gandhi et al. 2006. Analysis of the human protein interactome and comparison with yeast, worm and fly interaction datasets. Nature Genetics 38: 285-293.
  11. Ptacek et al. 2005. Global analysis of protein phosphorylation in yeast. Nature 438: 679-684.
  12. Tong et al. 2004. Global mapping of the yeast genetic interaction network. Science 303: 808-813.
  13. Ghaemmaghami et al. Global analysis of protein expression in yeast. Nature 425: 737-741
  14. Ihmels et al. Principles of transcriptional control in the metabolic network of Saccharomyces cerevisiae. 2004 Nature Biotechnology 22: 86-92

Some of the data is likely to be in the general databases. Here are links to a few of those. Even if the articles you chose did not link to them, you should take a brief look at what each has to offer.

"Solutions" to the exercise

  1. Yeast PPI data in the form of link probabilities, found in the supplementary material
  2. Nobody was interested...
  3. Yeast PPI data in binary form, collected from several sources
  4. Yeast PPI data; no downloadable version was found, only a web interface for individual interactions
  5. Nobody was interested...
  6. Fruit fly PPI data, binary, easily obtained from the DIP database
  7. Yeast PPI data, binary, in supplementary information
  8. Yeast PPI data, likelihoods of interactions, uses previous data
  9. Nobody was interested...
  10. Data from protein chips, distributed as Excel files
  11. Yeast data in a PDF file, needs heavy parsing
  12. Yeast proteome data, easy to obtain in plain text and from database
  13. Expression and network data for yeast, compiled from several sources and the combination is not available
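The solutions above mix two representations: binary interactions and interaction probabilities or likelihoods. If the two are to be used together later in the project work, one simple option is to threshold the scores. The following is only a sketch under that assumption; the function name binarize, the cutoff of 0.5 and the sample scores are all invented for illustration and do not come from any of the papers.

```python
def binarize(scored_pairs, threshold):
    """Return the set of unordered pairs whose score is at least `threshold`."""
    return {
        frozenset((a, b))
        for a, b, score in scored_pairs
        if score >= threshold
    }


# Invented example scores, not taken from any data set.
scored = [
    ("protA", "protB", 0.92),
    ("protA", "protC", 0.15),
    ("protB", "protC", 0.60),
]
network = binarize(scored, threshold=0.5)
print(len(network), "interactions kept")
```

Any such cutoff discards information, so it is worth checking whether the paper itself recommends a threshold before choosing one.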

Wireless internet access

The accounts you received can be used for authentication in the Aalto wireless network, which should work in most places in Otaniemi. See the Aalto web page for instructions. Essentially you only need to set the net password using "passwd" in the Unix command shell. You can also use the same command to change the master password, but make sure you don't forget the new one.


Page maintained by t615110 (at), last updated Wednesday, 15-Aug-2007 14:49:05 EEST