The exercise problems will be added here after each session. Return your solution to the assistant within two weeks of the date the exercise was given. You may (and are encouraged to) use suitable tools for solving the exercises, so please use Matlab or a similar tool instead of inverting 6x6 matrices by hand. If you solve a problem with a computer, return the code and the key results. Hand-written solutions are also completely acceptable.

2. What are the advantages and the limitations of using metabolic labelling in quantitative proteomics?

3. What is the difference between on/off peptides/proteins and proteins not identified in all biological replicates?

4. What are the main steps in the data processing of quantitative MS-based proteomics?

2. Explain briefly two reasons why a tree might not be an accurate model of evolution.

3. Find the internal nodes of the parsimony example with animal DNA sequences so that the number of changes is as low as possible (use Fitch's algorithm). Give the total parsimony score.

4. Find the phylogenetic tree for the same example using the UPGMA method (use the number of differing characters as the distance measure).
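As an illustration of exercise 3, the bottom-up pass of Fitch's algorithm can be sketched as follows. This is a minimal sketch, not the course's reference solution: the tree `toy_tree`, the sequences in `toy_sequences`, and the function names are hypothetical stand-ins, since the animal DNA example from the session is not reproduced on this page. The sketch only computes the candidate state sets and the parsimony score; choosing one concrete state per internal node requires an additional top-down pass.

```python
def fitch(tree, leaf_states):
    """Bottom-up Fitch pass for a single site.

    tree: a leaf name (str) or a pair (left, right) of subtrees.
    leaf_states: dict mapping leaf name -> character at this site.
    Returns (candidate state set at this node, number of changes below it).
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, leaf_states)
    right_set, right_cost = fitch(right, leaf_states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, sequences):
    """Sum the per-site Fitch scores over all aligned positions."""
    length = len(next(iter(sequences.values())))
    total = 0
    for i in range(length):
        site = {name: seq[i] for name, seq in sequences.items()}
        total += fitch(tree, site)[1]
    return total


# Hypothetical toy data (NOT the sequences from the session).
toy_tree = (("a", "b"), ("c", "d"))
toy_sequences = {"a": "AC", "b": "AG", "c": "GG", "d": "GG"}
score = parsimony_score(toy_tree, toy_sequences)
```

For the distance measure in exercise 4, the number of differing characters between two aligned sequences can be counted site by site in the same spirit.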

(1) On which two assumptions about pathways is the model based?

(2) Map each of the previous assumptions into a property of the model.

(3) Why must the parameter in the Markov random field be greater than one?

(4) What happens when (a) alpha = 1 or when (b) alpha approaches infinity?

Question 1. Why is it not possible to measure absolute protein expression?

Question 2. What are the drawbacks of using Robust Ridge Regression? How are these handled in the CCA approach? Explain briefly.

Question 3. What changes are made in the CCA approach to make it robust and less sensitive to outliers?

Question 4. Data files: exerData1.txt, exerData2.txt; each is a matrix of size 1000 x 5.

Given 5 pairs of canonical directions (beta1_i, beta2_i), i in 1:5, for the first and second data sets respectively, find out which pair maximizes the correlation between the data sets. What is the maximum correlation? (PDF)
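One way to approach Question 4 is to project each data set onto its direction and compute the Pearson correlation of the two projections, once per pair. The sketch below assumes exactly that reading of the task. The function names and the toy matrices are hypothetical; the small lists stand in for the contents of exerData1.txt and exerData2.txt, whose format is not shown here. The signed correlation is compared; if the sign of a direction is considered arbitrary, the absolute value could be compared instead.

```python
import math


def project(data, direction):
    """Project each row of a data matrix onto a direction vector."""
    return [sum(x * w for x, w in zip(row, direction)) for row in data]


def pearson(u, v):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def best_direction_pair(data1, data2, pairs):
    """Return (index, correlation) of the pair with the highest correlation."""
    corrs = [
        pearson(project(data1, beta1), project(data2, beta2))
        for beta1, beta2 in pairs
    ]
    best = max(range(len(corrs)), key=lambda i: corrs[i])
    return best, corrs[best]


# Hypothetical toy data: 4 samples x 2 variables instead of 1000 x 5.
toy_data1 = [[1, 0], [2, 1], [3, 0], [4, 1]]
toy_data2 = [[2, 5], [4, 3], [6, 5], [8, 3]]
toy_pairs = [([1, 0], [1, 0]), ([0, 1], [1, 0])]
best_index, best_corr = best_direction_pair(toy_data1, toy_data2, toy_pairs)
```

With the real files, the same functions would be called with the two 1000 x 5 matrices and the five given direction pairs.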

(1) What is the difference between pairwise models and just plain kernel models?

(2) What is the curse of dimensionality and how does it relate to SVM kernels?

(3) Describe briefly two other kernel-based approaches that have proven useful for solving biological problems.

(4) The motif and spectrum kernels are based on a trie data structure. How does this data structure work and how is it applied to the motif kernel? What is the main difference compared to a tree data structure?
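To make the trie in question (4) concrete, here is a minimal sketch of a trie that stores all k-mers of a sequence, together with a spectrum-style kernel that counts shared k-mers. The class and function names are hypothetical, and the sketch covers only exact k-mer matching; the motif kernel discussed in the session may use the trie differently.

```python
class TrieNode:
    """One node of the trie; each edge to a child is labelled with a symbol."""

    def __init__(self):
        self.children = {}  # symbol -> TrieNode
        self.count = 0      # number of inserted k-mers ending at this node


def build_kmer_trie(sequence, k):
    """Insert every length-k substring of the sequence into a trie."""
    root = TrieNode()
    for start in range(len(sequence) - k + 1):
        node = root
        for symbol in sequence[start:start + k]:
            node = node.children.setdefault(symbol, TrieNode())
        node.count += 1
    return root


def kmer_count(root, kmer):
    """Follow the k-mer symbol by symbol; return how often it was inserted."""
    node = root
    for symbol in kmer:
        node = node.children.get(symbol)
        if node is None:
            return 0
    return node.count


def spectrum_kernel(seq_a, seq_b, k):
    """Sum over all k-mers of (count in seq_a) * (count in seq_b)."""
    trie_b = build_kmer_trie(seq_b, k)
    return sum(
        kmer_count(trie_b, seq_a[i:i + k])
        for i in range(len(seq_a) - k + 1)
    )


# Hypothetical toy sequences.
shared = spectrum_kernel("ACGT", "ACGA", 2)
```

K-mers with a common prefix share the same path from the root, so a lookup costs k steps regardless of how many k-mers are stored.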

1. What are the benefits of OPLS and O2PLS compared to PLS? Are there any downsides in using these analysis methods?

2. Name at least one reason why MS would be a better tool for metabonomics than NMR.

3. What kinds of (biological) difficulties are there in combining data from different omics platforms?

1. What is the role of fuzzy methods in this article?

2. What is the significance of the gene ontology in this approach?

3. Go through all examples in the article and see if you agree. Try to summarize one of them.

4. Please name some alternative approaches for measuring the similarity of genes.

T-61.6070 Special course in bioinformatics I: Modeling of biological networks

Page maintained by t616070 (at) cis.hut.fi, last updated Monday, 12-May-2008 12:56:08 EEST