Charlotte Visualization Center
GVis: the scalable visualization framework

GVis (A Scalable Visualization Framework for Genomic Data) is a framework for browsing the phylogenetic hierarchy of organisms from the highest level down to an individual organism of interest, and for analyzing each gene of interest by launching the gene-finding and gene-matching tools. The framework lets users navigate and explore large amounts of genomic data (thousands of genomes or more) using a 2.5D spatial layout.

All genomic data used in the GVis framework follow the NCBI GenBank flat-file format. The publicly available GenBank files are a set of ASCII text files; most contain gene sequence data, and some contain supplemental information such as lists of author names, journal citations, gene names, keywords, and accession numbers of the records. By extracting several important features from the GenBank files, we create our own binary GVis data files.
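The extraction step can be illustrated with a minimal Python sketch. It reads the accession number and the sequence from one flat-file record and packs them into a small binary record. The sample record, the function names, and the binary layout (two little-endian length fields followed by the raw bytes) are assumptions made for illustration; the actual GVis file format is not described here.

```python
import struct

# Hand-written sample in GenBank flat-file style (hypothetical record, not real data).
SAMPLE_RECORD = """\
LOCUS       DEMO0001                  24 bp    DNA     linear   SYN 01-JAN-2005
DEFINITION  Hypothetical demo record.
ACCESSION   DEMO0001
KEYWORDS    .
ORIGIN
        1 atggccattg taatgggccg ctga
//
"""

HEADER_FORMAT = "<HI"  # assumed layout: accession length (uint16), sequence length (uint32)


def parse_genbank_record(text):
    """Return (accession, sequence) extracted from one flat-file record."""
    accession = ""
    sequence_parts = []
    in_origin = False
    for line in text.splitlines():
        if line.startswith("//"):  # record terminator
            break
        if in_origin:
            # Sequence lines look like "        1 atggccattg ..."; drop the offset.
            sequence_parts.extend(line.split()[1:])
        elif line.startswith("ACCESSION"):
            fields = line.split()
            accession = fields[1] if len(fields) > 1 else ""
        elif line.startswith("ORIGIN"):
            in_origin = True
    return accession, "".join(sequence_parts).upper()


def pack_record(accession, sequence):
    """Serialize one record as a length-prefixed binary blob."""
    acc = accession.encode("ascii")
    seq = sequence.encode("ascii")
    return struct.pack(HEADER_FORMAT, len(acc), len(seq)) + acc + seq


def unpack_record(blob):
    """Inverse of pack_record."""
    acc_len, seq_len = struct.unpack_from(HEADER_FORMAT, blob, 0)
    start = struct.calcsize(HEADER_FORMAT)
    acc = blob[start:start + acc_len].decode("ascii")
    seq = blob[start + acc_len:start + acc_len + seq_len].decode("ascii")
    return acc, seq
```

Length-prefixed fields let a reader skip over a record without scanning its sequence, which is one plausible reason to prefer a binary file over re-parsing the text files on every load.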

A genomic tree structure is built by referencing the NCBI taxonomy database. Taxonomic information is retrieved by querying the NCBI Taxonomy Browser over HTTP with specific organism names.
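Once lineage information has been retrieved, building the tree is a matter of merging lineages that share a prefix. The sketch below assumes each organism's lineage is already available as a list of taxon names ordered from the root down. The HTTP retrieval itself is omitted, and the abbreviated lineages and function names are hand-written for illustration rather than fetched from NCBI.

```python
def build_taxonomy_tree(lineages):
    """Merge root-to-leaf lineages into a nested dict keyed by taxon name."""
    root = {}
    for lineage in lineages:
        node = root
        for taxon in lineage:
            # Reuse the existing child if another lineage already created it.
            node = node.setdefault(taxon, {})
    return root


def count_leaves(tree):
    """Count organisms (leaf nodes) below a non-empty tree."""
    if not tree:
        return 1
    return sum(count_leaves(child) for child in tree.values())


# Abbreviated, illustrative lineages (intermediate ranks left out).
SAMPLE_LINEAGES = [
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Escherichia coli"],
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Salmonella enterica"],
    ["Bacteria", "Firmicutes", "Bacilli", "Bacillus subtilis"],
]

tree = build_taxonomy_tree(SAMPLE_LINEAGES)
```

Here the two Gammaproteobacteria share every node down to their class, so the tree branches only where the lineages actually diverge.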

ORFs (Open Reading Frames) of genomic sequences are collected using the NCBI ORF Finder. In GVis, an ORF represents the minimum selectable size of a gene sequence; it includes a start codon and one or more stop codons. With a collection of ORFs, users can easily select the minimum size of selectable sequences and compare the results with other ORF sequences. An ORF tree structure is implemented to display the collected ORFs.
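To make the notion of an ORF concrete, here is a simplified Python sketch of ORF detection. It is not the NCBI ORF Finder algorithm: it scans only the forward strand, assumes the standard start codon ATG and the stop codons TAA, TAG, and TGA, and ends each ORF at the first in-frame stop. The function name and the `min_length` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code assumed


def find_orfs(sequence, min_length=6):
    """Return sorted (start, end, orf) tuples for the three forward reading frames.

    Positions are 0-based, `end` is exclusive, and each ORF includes its stop codon.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if end - start >= min_length:
                    orfs.append((start, end, seq[start:end]))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


orfs = find_orfs("ATGGCCATTGTAATGGGCCGCTGA")
```

A fuller version would also scan the reverse complement, giving six reading frames in total.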

For displaying genomic data, a Venn-diagram approach is used instead of directly referencing the NCBI tree structure.

We also integrate netBLAST, a network-based gene sequence matching tool from NCBI. netBLAST is a publicly available gene sequence matching program that emphasizes regions of local alignment in order to detect relationships among sequences that share isolated regions of similarity. Using an inner window, a user can select a gene sequence of arbitrary length and submit it to netBLAST. Depending on the length and type of the query sequence, netBLAST may return more than ten matching sequences of any length. Since the user cannot know a priori the number or lengths of the resulting sequences, we implement navigation features within the inner window, such as zooming, panning, and scrolling, to allow efficient visualization of the results. In addition, results can be moved to the working window for closer comparison.
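The idea of local alignment can be shown with a small Python sketch. Note that this is the classic Smith-Waterman dynamic-programming algorithm, used here only to illustrate what a local alignment is; BLAST itself relies on a faster heuristic search, and this code does not call netBLAST. The scoring values and the function name are assumptions chosen for the example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; a cell never drops below zero, which is what
    # lets an alignment start and end anywhere in the two sequences.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the score reaches zero.
    i, j = best_pos
    aligned_a, aligned_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            aligned_a.append(a[i - 1])
            aligned_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(a[i - 1])
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")
            aligned_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(aligned_a)), "".join(reversed(aligned_b))


result = smith_waterman("GGGACGTACCC", "TTACGTATT")
```

The two inputs differ everywhere except for a shared ACGTA core, which is the kind of isolated region of similarity a local alignment is meant to surface.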

Copyright © 2005 Data Visualization Group. All rights reserved.