Poster K01
A Novel Approach to Performance Evaluation of Homology Search Methods
Avihay Apatoff (1,2), Eddo Kim (1,2), Yossef Kliger (2)
(1) The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Israel; (2) Compugen Ltd, Tel Aviv 69512, Israel |
Abstract:
We developed a method to estimate the fraction of homologs in a set of aligned proteins. The method gives an underestimation of the real number of homologs according to the prevalence of a conserved protein feature (e.g. the N-terminal signal peptide) in the protein population. The method is beneficial for the development and assessment of proteome-wide homology search methods.
Contact: avihay_ap@yahoo.com
Keywords: Protein, Homology, Signal Peptide |
Poster K02
A Computer Program to Predict Gene Position and Reveal Inconsistent Designation of Similar Genes in Different Organisms
Marjanca Starcic Erjavec (1), Marko Erjavec (2), Darja Zgur-Bertok (1)
(1) Department of Biology, Biotechnical Faculty, University of Ljubljana; (2) UnistarLC |
Abstract:
The structure and function of an organism depend to a large degree on its DNA sequence. The computer program BLAST is used to find the most similar sequences. The presented program links GenBank data with BLAST results and draws a figure showing the sequence name, the percentage of similarity, and gene positions and designations. The figure can be used to unravel the positions of genes on the examined DNA sequence and the different designations for genes with the same function in different organisms. Examples of the program's use are given.
Contact: Marjanca.Starcic.Erjavec@bf.uni-lj.si
Keywords: Gene Position and Designation, BLAST, GenBank |
Poster K03
Detection and Reduction of Evolutionary Noise in Correlated Mutation Analysis
Orly Noivirt, Miriam Eisenstein, Amnon Horovitz
Weizmann Institute of Science |
Abstract:
Direct or indirect inter-residue interactions in proteins are often reflected by mutations at one site that compensate for mutations at another site. Correlated mutation analysis for non-interacting proteins showed that the signal due to real interactions is of similar magnitude to the noise that arises from other evolutionary processes. A new method for detecting correlated mutation is presented that reduces the evolutionary noise by considering the evolutionary distances within the protein family.
Contact: orly.noivirt@weizmann.ac.il
Keywords: Coordinated Mutations, Co-evolving Residues |
Poster K04
Sequence Motif Characterization of Structural Sub-Units of Porin Proteins in Gram Negative Bacteria
Benny Shomer, Merav Bashary, Yeshayahu Nitzan
The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University |
Abstract:
We characterized conserved structural elements in bacterial porin proteins. We assume evolutionary conservation in regions of biological importance for bacterial virulence and porin functionality. External loops were found to contain repetitive sequence motifs that are conserved across different taxonomic groups. We analyzed the frequencies of amino acid triads. Compared with whole proteomes, frequencies deviated by 3 to 14 times the computed standard deviation in both external loops and transmembrane beta strands. Our results highlight the importance of studying porin structural sub-units.
Contact: bshomer@mail.biu.ac.il
Keywords: Porin, Motif, Transmembrane, Structural |
Poster K05
Complementary Intron Sequence Motifs Associated with Human Exon Repetition
Richard Dixon
Leicester University |
Abstract:
Exon repetition describes the presence of tandemly repeated exons in mRNA in the absence of duplications in the genome. The regulation of this process is not fully understood. We searched the entire flanking intronic sequences of exons involved in exon repetition for common sequence elements. We identified two common sequence motifs that are complementary to each other. They support a model in which exon repetition results from trans-splicing between separate pre-mRNA transcripts of the same gene that are brought together by complementary intronic sequences.
Contact: rd67@le.ac.uk
Keywords: Non-linear, Splicing, Introns, Motifs |
Poster K06
Mammalian microRNA Prediction using a Support Vector Machine Model of Sequence and Structure
Ying Sheng, Par G. Engstrom, Boris Lenhard
Computational Biology Unit, Bergen Computational Center of Science, University of Bergen, Norway |
Abstract:
MicroRNAs are endogenous small noncoding RNAs with important regulatory roles in animals and plants. We present an efficient microRNA prediction method that uses sequence conservation profiles and secondary structure characteristics. The method predicts an extensive set of potential human and mouse microRNAs. We expect our predictions to contribute a significant number of new candidate miRNAs for experimental verification.
Contact: ying.sheng@bccs.uib.no
Keywords: microRNA, Support Vector Machine, Prediction |
Poster K07
Analysis and Prediction of Disulfide Bond Connectivity
Rotem Rubinstein, Guiping Xu, Andras Fiser
Albert Einstein College of Medicine |
Abstract:
We investigated whether sequence positions that participate in a disulfide bridge at least once are conserved in a correlated manner. We structurally aligned proteins with various numbers of disulfide bridges spanning various evolutionary distances. We observed a simple correlation in the substitution pattern of disulfide bridge positions: when a bridge is lost, both cysteines that participated in the bridge are also lost. Based on our findings, we developed a correlated mutation algorithm to predict disulfide bond connectivity from sequence information alone.
Contact: rrubinst@aecom.yu.edu
Keywords: Correlated Mutation, Disulfide Bridges |
Poster K08
Classifying High-Level Protein Functionality: Toxins and Toxin-Like Proteins
Noam Kaplan (1,2), Michal Linial (2)
(1) Weizmann Institute; (2) The Hebrew University |
Abstract:
Toxins are short proteins that appear in animal venom and are extremely varied in sequence, structure and function. We have developed a toxin classifier based on extracting sequence-derived features that are related to the notion of structural stability. Application of the classifier to the honey bee genome and to a set of recently-discovered mouse sequences reveals 3 novel toxin-like proteins. Remarkably, these proteins seem to be expressed in the brain rather than in the venom. We suggest that the novel endogenous toxin-like proteins may function as neuromodulators via toxin-like mechanisms.
Contact: noam.kaplan@weizmann.ac.il
Keywords: Toxins, Machine Learning, Function Prediction |
Poster K09
Performance Assessment of Motif and Module Discovery Methods for Target Gene Prediction
Stein Aerts
Human Genetics, University of Leuven, Belgium; Currently on leave at CNRS-IBDML, Marseille, France |
Abstract:
Expression profiling is often helpful to predict candidate target genes of transcription factors, but is usually complemented with motif detection to distinguish direct from indirect target genes and to crystallize the actual network linkages. Chromatin immunoprecipitation, on the other hand, should directly find transcriptional target genes at the enhancer location itself, but has experimental limitations and high false positive rates. Both techniques are moreover less feasible when studying very localize
Contact: stein.aerts@med.kuleuven.be
Keywords: Regulatory Sequence Analysis, Enhancer |
Poster K10
Prediction of Transcription Factors and DNA Binding Sites in Transcription Factors
Suveyda Yeniterzi (1), Zeynep H. Coban (1), Serdar Cakici (1), A. Rasit Ozturk (2), Hilal Kosucu (1), Reyyan Yeniterzi (1), Ugur Sezerman (1)
(1) Sabanci University; (2) Bilkent University |
Abstract:
To classify a protein as a transcription factor or not, we used BSVM and CART, and the results met our expectations to a great extent. However, we could not derive the rules that define the binding regions, since we do not know exactly how these methods generate their results. Therefore, in the second part of our work, we tried to identify binding-region-specific rules by data mining the binding sites obtained from known TFs.
Contact: suveyday@su.sabanciuniv.edu
Keywords: Data Mining, Transcription Factors, SVM |
Poster K11
Elucidating the Complex Nature of Gene Fusion Events for Moving Towards Automatic Quantification of Protein Function
Ashwin Sivakumar, Chris Wilton, Swapan Mallick, Liisa Holm
Bioinformatics group, Division of Genetics, Institute of Biotechnology, University of Helsinki |
Abstract:
A complete functional unit of a protein can be multi-domain, and it is the co-occurrence and interaction of these multiple domains that determines the function and functional diversity of their gene products. We automatically mine these functional units (modules) using only sequence information, in order to introduce an automatic functional classification system of contemporary proteins. The main biological theme behind the concept of modules lies in the systems-scale identification of what we loosely term "gene fusion events".
Contact: ashwin.sivakumar@helsinki.fi
Keywords: Protein Function, Gene Fusion, Cancer |
Poster K12
Built-in Switches Allow Versatility in Domain-Domain Interactions
Eyal Akiva, Hanah Margalit
Department of Molecular Genetics and Biotechnology, Faculty of Medicine, The Hebrew University |
Abstract:
Certain domains have been identified as mediating homo-dimerization. As these domains are found also in proteins that function as monomers, a question arises as to what determines the oligomeric state of proteins having such domains. By comparing multiple sequence alignments of known dimers and monomers that contain such domains, we identify the domain segments that are responsible for this versatility. Analysis of our results in view of relevant solved structures provides insight into the molecular basis of this phenomenon and has implications for predicting the oligomeric states of proteins.
Contact: akiva2@md.huji.ac.il
Keywords: Protein, Interaction, Domain, Dimer, Monomer |
Poster K13
Augmenting Protein Sequence Alignment of Remote Homologues using Secondary Structure Confidence Scores
Yaniv Loewenstein, Elon Portugaly, Michal Linial
Hebrew University of Jerusalem, Israel |
Abstract:
Incorporating sequence together with protein secondary structure has a synergistic effect on alignment quality and remote homology detection, without affecting the alignment speed. Yet, synergism vanishes when noisy structure prediction is used. Using Bayesian confidence levels in the alignment scoring function, we overcome this inherent problem, and reproduce substantially better performance than current methods, even when no experimental structure is at hand. Our method is applicable to BLAST searches, better sequence profiles, and large-scale unsupervised classification of protein domains.
Contact: lonshy@cs.huji.ac.il
Keywords: Alignment, Confidence, Homology Detection |
Poster K14
Using Co-occurrence of Transcription Factor Binding Sites for the Assessment of Regulatory Potential
Holger Klein, Martin Vingron
Max Planck Institute for Molecular Genetics |
Abstract:
We present an approach for the detection of co-occurrence of transcription factor binding sites within known regulatory sequences. We annotate a set of upstream regions of human genes with predicted TFBSs based on a set of representative position weight matrices. We count co-occurring pairs of TFBSs using a sliding window and calculate a log-odds score of observed vs. expected number of pairs to identify significant combinations. To assess the co-occurrence scores we use known interactions of TFs. A way to use the introduced co-occurrence scores for promoter prediction is outlined.
Contact: holger.klein@molgen.mpg.de
Keywords: Gene Regulation, Co-occurrence of TFBS |
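The sliding-window pair counting and log-odds scoring described in the K14 abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `cooccurrence_log_odds`, the per-region counting, the independence-based expectation, and the pseudocount are all assumptions made for illustration; the abstract does not specify how the expected number of pairs is computed.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_log_odds(regions, window, pseudocount=0.5):
    """Score TFBS pairs by log2(observed / expected) co-occurrence.

    regions: one list per upstream region, holding (position, factor) hits.
    window:  maximum distance in bp between two hits to count as co-occurring.
    Assumption: the expectation treats the two factors as independent
    across regions and ignores the window constraint.
    """
    n = len(regions)
    single = Counter()  # regions containing each factor
    pairs = Counter()   # regions where a pair co-occurs within the window
    for hits in regions:
        single.update({factor for _, factor in hits})
        close = set()
        for (p1, f1), (p2, f2) in combinations(hits, 2):
            if f1 != f2 and abs(p1 - p2) <= window:
                close.add(tuple(sorted((f1, f2))))
        pairs.update(close)
    scores = {}
    for (a, b), observed in pairs.items():
        expected = single[a] * single[b] / n
        scores[(a, b)] = math.log2(
            (observed + pseudocount) / (expected + pseudocount)
        )
    return scores
```

A positive score means the pair is seen together more often than the independence assumption predicts; the pseudocount only keeps the ratio finite for rare pairs.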
Poster K15
Olfactory Receptors in Peptide Space
Assaf Gottlieb (1), Tsviya Olender (2), Doron Lancet (2), David Horn (1)
(1) School of Physics and Astronomy, Tel Aviv University, Israel; (2) Dept of Molecular Genetics, Weizmann Institute of Science, Israel |
Abstract:
We use the Motif EXtraction algorithm (MEX) to extract specific peptides from Olfactory Receptor (OR) proteins of seven vertebrate species. The peptides trace evolutionary trends well. 567 peptides are found to be common to all tetrapod species, probably related to shared structure and function. While mammals share most of their peptides (3016), many peptides in non-mammalian species are species-specific. Within mammals, 1874 peptides are OR-family specific. Peptides that lie on the extracellular part of the ORs may be related to specific OR functions.
Contact: assafgot@post.tau.ac.il
Keywords: Olfactory Receptors, Motif Extraction |
Poster K16
Robust Clustering of Gene-Expression Profiles by a Mutual Information Distance Measure
Ido Priness, Oded Maimon, Irad Ben-Gal
Faculty of Engineering, Tel Aviv University, Israel |
Abstract:
The definition of a distance measure plays a key role in the successful clustering of gene expression profiles [1][2][3][4]. We compare the robustness of the Mutual Information (MI) measure [5] to that of the Euclidean distance and the Pearson correlation coefficient. We show that the MI measure is potentially more robust than these other measures in differentiating between erroneous clustering solutions. This research continues a series of related works [5][6][7] that address the appropriateness of the distance measures used in bioinformatics studies.
Contact: bengal@eng.tau.ac.il
Keywords: Gene Expression, Distance Measure, Mutual Information |
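To make the idea of a mutual-information distance between expression profiles concrete, here is a small Python sketch. It is an illustration under stated assumptions, not the measure defined in the K16 poster: equal-width binning with `bins=3` and the normalisation `1 - I(X;Y) / H(X,Y)` are choices made here, and the helper names are hypothetical.

```python
import math
from collections import Counter


def discretize(profile, bins=3):
    """Map expression values to equal-width bin indices (an assumption)."""
    lo, hi = min(profile), max(profile)
    if hi == lo:
        return [0] * len(profile)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in profile]


def entropy(labels):
    """Shannon entropy in bits of a list of hashable labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two label lists of equal length."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def mi_distance(profile_a, profile_b, bins=3):
    """Distance in [0, 1]: 0 for fully dependent profiles, 1 for independent."""
    x, y = discretize(profile_a, bins), discretize(profile_b, bins)
    joint = entropy(list(zip(x, y)))
    if joint == 0:
        return 0.0  # both profiles are constant after binning
    return 1.0 - mutual_information(x, y) / joint
```

Unlike the Euclidean distance or the Pearson correlation, this measure treats a profile and its mirror image as fully dependent, because it responds to any consistent relationship between the binned values rather than only to a linear one.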
Poster K17
SIGI-HMM: Score-Based Prediction of Genomic Islands in Prokaryotic Genomes using HMMs
Oliver Keller (1), Katharina Surovcik (1), Thomas Brodag (1), Rainer Merkl (2), Roman Asper (1), Carsten Damm (3), Stephan Waack (1)
(1) Institut für Informatik, Universität Göttingen; (2) Institut für Biophysik und Physikalische Biochemie, Universität Regensburg; (3) Institut für Numerische und Angewandte Mathematik, Universität Göttingen |
Abstract:
Horizontal gene transfer, i.e. the process of acquiring genes from foreign species, is a frequent phenomenon among microbial species. It is considered a strong evolutionary force allowing rapid adaptation to changing environmental demands. The transferred pieces of DNA often comprise a large number of genes that are found in contiguous regions called Genomic Islands (GIs). We have implemented the program SIGI-HMM, which predicts GIs and the putative donor of each individual alien gene using an HMM that takes codon usage into account.
Contact: keller@cs.uni-goettingen.de
Keywords: Genomic Islands, HGT, HMM, Codon Usage |
Poster K18
A New Approach for Estimation of Statistical Significance in Sequence Profile-Profile Comparisons
Mindaugas Margelevicius, Ceslovas Venclovas
Institute of Biotechnology, Lithuania |
Abstract:
Sequence profile-profile comparison is one of the most sensitive techniques for distant homology detection. We propose a new approach to estimate the statistical significance of profile-profile alignments. Exhaustive testing showed that this approach, combined with a newly developed comparison procedure, compares favorably to a number of existing methods.
Contact: minmar@ibt.lt
Keywords: Profile Comparisons, Statistical Significance |
Poster K19
Anchored Multiple Sequence Alignment
Swapan Mallick, Liisa Holm
Institute of Biotechnology, Helsinki University |
Abstract:
We introduce a novel multiple sequence alignment strategy, GTG-MSA. The algorithm takes a two-step progressive alignment approach, using a transitive PSI-BLAST alignment graph to highlight evolutionary signals which are used as anchors. Intermediate regions are aligned using edge weights from the same graph. Since the algorithm is based on transitive alignments, it is particularly good at correctly aligning sparse motifs when compared with standard methods.
Contact: swapan.mallick@helsinki.fi
Keywords: Multiple Sequence Alignment, Motif |
Poster K20
An Algorithm for the Identification of Active Site Residues Using Pfam Alignments
Jaina Mistry, Alex Bateman, Rob Finn
Wellcome Trust Sanger Institute |
Abstract:
Over 10% of the 8300 Pfam families are enzymatic, but within these families only a small fraction of the sequences have been biochemically characterized. We have developed an algorithm that takes experimentally determined active site residues contained within a family, and identifies the presence of these active site residue patterns within other family members. We compare our results to UniProt and the Catalytic Site Atlas. Using this algorithm we have predicted 606110 active site residues and have increased the active site annotations in Pfam more than 200-fold.
Contact: jm14@sanger.ac.uk
Keywords: Catalytic, Active Site, Annotation, Function |
Poster K21
Genome Comparisons on the Above-Gene Level: Compositional Spectra in Two- and Four-Letter Alphabets
V. Kirzhner, A. Paz, S. Hosid, E. Nevo, A. Korol
Institute of Evolution, University of Haifa |
Abstract:
Earlier, we developed a linguistic-like approach for genome comparisons, referred to as compositional spectra (CS), and based on distribution of frequencies of imperfect matching of oligonucleotide words from an arbitrary set. Now we attempt large-scale sequence comparisons using a two-letter (R-Y) analogue of our previous CS analysis.
Contact: valery@esti.haifa.ac.il |
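The two-letter (R-Y) compositional spectrum mentioned in the K21 abstract can be sketched in Python as follows. This is a simplified illustration, not the authors' CS procedure: the function names, the mismatch threshold `max_mismatches`, and the normalisation to relative frequencies are assumptions, and the word set is supplied by the caller.

```python
# Purines (A, G) map to R and pyrimidines (C, T) map to Y.
PURINE_PYRIMIDINE = str.maketrans("AGCTagct", "RRYYRRYY")


def to_ry(seq):
    """Reduce a DNA sequence to the two-letter purine/pyrimidine alphabet."""
    return seq.translate(PURINE_PYRIMIDINE)


def compositional_spectrum(seq, words, max_mismatches=1):
    """Relative frequency of imperfect matches for each R-Y word.

    A window matches a word when it differs in at most max_mismatches
    positions. Other symbols (such as N) are left unchanged and therefore
    count as mismatches against any R-Y word.
    """
    ry = to_ry(seq)
    counts = []
    for word in words:
        k = len(word)
        hits = 0
        for i in range(len(ry) - k + 1):
            mismatches = sum(1 for a, b in zip(ry[i:i + k], word) if a != b)
            if mismatches <= max_mismatches:
                hits += 1
        counts.append(hits)
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]
```

Two genomes could then be compared through any distance between their spectra over the same word set; the abstract does not say which distance is used, so none is assumed here.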
Poster K22
Subsequence Feature Map for Protein Classification and Remote Homology Detection
O. Sinan Sarac (1), Volkan Atalay (1), Rengul Cetin-Atalay (2)
(1) Middle East Technical University; (2) Bilkent University |
Abstract:
In order to perform protein classification or remote homology detection, we propose a feature map that takes into account the information coming from the subsequences of a protein. We decompose given sequences into fixed-length subsequences and cluster similar subsequences using string similarity measures. A mapping can then be defined as the distribution of the subsequences of a new sample over these clusters. We make use of a set of Hidden Markov Models to represent the clusters of subsequences. We have multiple HMMs each of which is sensitive to different short subsequences.
Contact: volkan@ceng.metu.edu.tr
Keywords: Protein Classification, Clustering, HMM |
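The feature map described in the K22 abstract (fixed-length subsequences, assigned to clusters, summarised as a distribution) can be sketched in Python. This sketch deliberately swaps the poster's HMM cluster models for a much simpler stand-in: each cluster is represented by one string and membership is decided by Hamming distance. The overlapping sliding window, the function names, and the representative strings are assumptions for illustration only.

```python
def subsequences(seq, length):
    """All overlapping windows of the given length (overlap is an assumption)."""
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


def hamming(a, b):
    """Number of differing positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def feature_map(seq, representatives):
    """Distribution of a sequence's subsequences over the clusters.

    representatives: one equal-length string per cluster, standing in
    for the per-cluster HMMs mentioned in the abstract.
    """
    length = len(representatives[0])
    windows = subsequences(seq, length)
    counts = [0] * len(representatives)
    for window in windows:
        # Ties go to the cluster with the lowest index.
        nearest = min(
            range(len(representatives)),
            key=lambda j: hamming(window, representatives[j]),
        )
        counts[nearest] += 1
    total = len(windows)
    return [c / total for c in counts] if total else [0.0] * len(representatives)
```

The resulting fixed-length vector can be passed to any standard classifier, which is the practical point of mapping variable-length proteins into a common feature space.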
Poster K23
Information Content and Sequence Periodicity
Alexander Bolshoy
Genome Diversity Center, University of Haifa |
Abstract:
Here we report a novel information content based method for sequence analysis. The method can conveniently indicate all kinds of periodicity and repeat-related features in a set of genomic DNA sequences. We illustrate the power of the method by studying the nucleosomal database of Ioshikhes et al. and intergenic regions of E. coli.
Contact: bolshoy@research.haifa.ac.il |
Poster K24
DeltaProt: Toolbox for Molecular Comparison of Proteins based on Sequence Alignments
Steinar Thorvaldsen (1), Tor Fla (1), Nils P. Willassen (2)
(1) University of Tromsø, Faculty of Science, Tromsø, Norway; (2) Norwegian Structural Biology Centre, Tromsø, Norway |
Abstract:
We present statistical methods, trend-tests and visualisations that are useful when the protein sequences in alignments can be divided into two or more populations based on known phenotypic traits such as preference of temperature, pH, salt concentration or pressure. The algorithms have been successfully applied in the research on extremophile organisms.
Contact: steinart@math.uit.no
Keywords: Physicochemical Properties, Extremophiles |
Poster K25
Assessment of Protein Domain Classifications: Automatic Sequence-Based Method and Methods based on 3D Structures
Elon Portugaly (1), Nathan Linial (1), Michal Linial (2)
(1) School of Computer Science and Engineering, The Hebrew University of Jerusalem; (2) Dept. of Biochemistry, Inst. of Life Sciences, The Hebrew University of Jerusalem |
Abstract:
EVEREST is an automatic system that identifies and classifies domains within a database of protein sequences. We show that the set of EVEREST families is as similar to the set of CATH families and to the set of SCOP families as the latter two sets are similar to each other.
Contact: elonp-eccb06@cs.huji.ac.il
Keywords: Automatic Annotation, Protein Domains |
Poster K26
A Regulatory Resolution Score for Identifying Candidates for Laboratory Studies
Shannan J. Ho Sui (1,2,3), David J. Arenillas (1,2,3), Wyeth W. Wasserman (1,2,3)
(1) University of British Columbia; (2) Centre for Molecular Medicine and Therapeutics; (3) Child and Family Research Institute |
Abstract:
Phylogenetic footprinting is widely used to identify regulatory regions within alignments of orthologous sequences. However, different genes evolve at different rates. Those genes with well-defined conserved non-coding regions that are easily distinguished from the rest of the promoter are favourable candidates for laboratory gene regulation studies, compared to those that have evolved very slowly or genes with highly diverged promoters. We propose the use of a "regulatory resolution score" to discern candidates for laboratory study based on the properties of a gene's conservation profile.
Contact: shosui@cmmt.ubc.ca
Keywords: Phylogenetic Footprinting, Regulation |
Poster K27
Enhancing Transmembrane Beta-Barrel Topology Prediction by Information Encoded in Multiple Sequence Alignments
Costas Tsirigos (1), Pantelis Bagos (1), Vasilis Promponas (1,2), Stavros Hamodrakas (1)
(1) Department of Cell Biology & Biophysics, Faculty of Biology, University of Athens, Greece; (2) Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus |
Abstract:
In this work we demonstrate that some common pitfalls of predictive algorithms (for example broken strands, false positives or dipping loops) may be resolved by using the positional conservation and gap distribution of multiple sequence alignments. Our results indicate that it is possible to use information encoded in multiple sequence alignments to post-process and enhance topology prediction for transmembrane beta-barrels. We also have preliminary data suggesting that positions in the alignment mainly occupied by gaps may accurately indicate shifted predictions.
Contact: vprobon@ucy.ac.cy
Keywords: TM Beta-Barrel, Multiple Alignment, Prediction |
Poster K28
On the Kinetics of Prokaryotic Transcription Initiation
Johanna Weindl (1), Pavol Hanus (1), Zaher Dawy (2), Juergen Zech (3), Joachim Hagenauer (1), Jakob C. Mueller (3,4)
(1) Institute for Communications Engineering (LNT), Technical University of Munich (TUM); (2) Department of Electrical and Computer Engineering (ECE), American University of Beirut; (3) Institute for Medical Statistics and Epidemiology, Technical University of Munich (TUM); (4) Hertie-Institute for Clinical Brain Research, University Clinic Tuebingen |
Abstract:
Transcription initiation in prokaryotes has been extensively studied in the past decades. However, little is known about the kinetics involved in promoter detection by the RNA polymerase (RNAP) and its sigma subunit. We present an approach to relate the binding energy between the sigma factor and the DNA to the kinetics of promoter detection during the first step of transcription. Results suggest that the sequence surrounding the promoters contains important information to guide the RNAP and its sigma subunit in a way to increase the probability of transcription initiation.
Contact: jweindl@tum.de
Keywords: Kinetics, Dynamics, Transcription Initiation |