The T-122.102 seminar on February 4th has been cancelled. Instead, students are encouraged to attend a presentation on
Finding Frequent Substructures in 3D-Protein Databases
by Alexander Hinneburg from the Martin-Luther-University Halle/Wittenberg.
Owing to the various genome projects and improved analysis techniques, the number of known protein sequences is growing very fast. Despite some progress in protein X-ray analysis and other protein analysis techniques, the number of known protein 3D-structures is orders of magnitude lower than the number of known protein sequences. As X-ray protein analysis is very expensive, there is a strong desire to derive the 3D-structure directly from the protein sequence. The standard technique, protein homology modeling, is limited to proteins for which there exist proteins with known 3D-structures and similar sequences. For most proteins with known sequences and unknown 3D-structures, homology modeling is not possible. An alternative is to derive some knowledge about substructures. As the number of available high-resolution X-ray protein structures has grown significantly over the last few years, we want to perform a more comprehensive analysis of the conformational behavior of substructures than is done in known rotamer libraries, which describe small substructures of fixed lengths. In this paper we describe work in progress that aims to find frequent substructures of different lengths and with gaps between the amino acids. Additionally, we derive association rules from the frequent substructures, which express the discovered knowledge in a form similar to logic rules.
About the speaker
Alexander Hinneburg has worked on clustering and visualization of large high-dimensional data sets and in bioinformatics, especially protein structure prediction.
Monday, 03-Feb-2003 11:39:14 EET