A Platform for Computational Comparative Genomics on the Web

Speaker  Sun Kim

Affiliation  Indiana University

Host  Dan Fay

Duration  00:22:39

Date recorded  7 October 2005

We have been developing PLATCOM, a Web-based system for comparing multiple genomes, in which users can choose genomes and analyze the selected genomes with a suite of computational tools. PLATCOM is built on internal databases such as GenBank, COG, KEGG, and the Pairwise Comparison Database (PCDB), which contains all pairwise comparisons (97,034 entries) of protein sequence files (.faa) and whole genome sequence files (.fna) of 312 replicons. Because the PCDB is pre-computed, genome analysis completes very quickly even on the web, so users can choose any combination of genomes and analyze them with data mining tools. Genome comparison requires combining many sequence analysis tools. However, combining multiple tools takes a significant amount of programming work and knowledge of each tool, so it is very challenging to provide a web service for comparing genomes by using standard sequence analysis tools. To make genome comparison practical on the web, well-defined data mining concepts and tools are therefore very important, since they make genome comparison much easier. It is also important that the data mining tools for genome comparison be scalable. We have been developing such scalable tools: a sequence clustering algorithm, BAG; a metabolic pathway analysis tool, MetaPath; a gene fusion event detection tool, FuzFinder; a gene neighborhood navigation tool, OperonViz; an algorithm for mining correlated gene sets, MCGS; a genome sequence alignment tool, GAME; a multiple genome sequence alignment algorithm that clusters local matches, mgAlign; and a pairwise genome visualization tool, COMPAM. The analysis results are summarized with visualization tools. We are currently working on integrating the data mining modules so that users can combine them in a very flexible way. In addition to sequence data, PLATCOM will include more data types, such as gene expression data.
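
The role of a pre-computed pairwise store can be illustrated with a minimal sketch. It is not PLATCOM's implementation: the function names, the toy comparison, and the genome labels below are all hypothetical, and the real PCDB holds comparison results rather than a single number per pair. The sketch only shows why any user-chosen combination of genomes can be served by lookups once every pair has been compared ahead of time.

```python
from itertools import combinations


def build_pairwise_store(genomes, compare):
    """Run the (expensive) comparison once for every unordered genome pair.

    Hypothetical stand-in for a pre-computed pairwise comparison database.
    """
    return {
        frozenset(pair): compare(*pair)
        for pair in combinations(sorted(genomes), 2)
    }


def lookup_selection(store, selected):
    """Fetch stored results for every pair in a user's selection.

    No comparison is recomputed here; each pair is a dictionary lookup.
    """
    return {
        pair: store[frozenset(pair)]
        for pair in combinations(sorted(selected), 2)
    }


def toy_compare(a, b):
    """Toy comparison (assumption): count characters shared by two labels."""
    return len(set(a) & set(b))


# Hypothetical genome labels, used only for illustration.
genomes = ["ecoli", "bsub", "hpylori", "mtb"]
store = build_pairwise_store(genomes, toy_compare)

# A user picks any subset; results come straight from the store.
result = lookup_selection(store, ["mtb", "ecoli", "bsub"])
```

With n genomes the store holds n(n-1)/2 unordered pairs, so the up-front cost grows quadratically while each later query touches only the pairs within the selection.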

©2005 Microsoft Corporation. All rights reserved.