Gene Semantic Similarity Analysis and Measurement Tools


G-SESAME is a set of online tools for measuring the semantic similarity of Gene Ontology (GO) terms and the functional similarity of gene products, and for discovering biomedical knowledge through the GO database. The tools were originally based on the G-SESAME paper published in 2007. They were developed using MySQL and PHP and are hosted on an Apache web server running on Linux (CentOS 5). New methods that take into account the statistical distribution of the GO database were added in 2013. Other state-of-the-art methods are also implemented so that researchers can choose the method best suited to their needs.
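As a rough illustration of the kind of graph-based measure described in the 2007 paper cited below, the following Python sketch computes a term-to-term similarity on a tiny made-up graph. It is not the G-SESAME implementation: the graph `TOY_DAG`, its term names, and the function names are hypothetical, and the edge weights (0.8 for is_a, 0.6 for part_of) are the values commonly quoted for that method, treated here as assumptions.

```python
from typing import Dict, List, Tuple

# Assumed edge weights per relation type (commonly quoted for the 2007 method).
WEIGHTS: Dict[str, float] = {"is_a": 0.8, "part_of": 0.6}

# Hypothetical toy graph: each term maps to its (parent, relation) pairs.
TOY_DAG: Dict[str, List[Tuple[str, str]]] = {
    "root": [],
    "A": [("root", "is_a")],
    "B": [("root", "is_a")],
    "C": [("A", "is_a"), ("B", "part_of")],
    "D": [("A", "is_a")],
}


def s_values(term: str, dag: Dict[str, List[Tuple[str, str]]]) -> Dict[str, float]:
    """Contribution of each ancestor (and the term itself) to the term's meaning.

    The term contributes 1.0 to itself; an ancestor's contribution is the
    largest product of edge weights along any path from the term up to it.
    """
    s = {term: 1.0}
    stack = [term]
    while stack:
        child = stack.pop()
        for parent, relation in dag[child]:
            candidate = WEIGHTS[relation] * s[child]
            # Keep the best path seen so far and revisit the parent if it improved.
            if candidate > s.get(parent, 0.0):
                s[parent] = candidate
                stack.append(parent)
    return s


def term_similarity(a: str, b: str, dag: Dict[str, List[Tuple[str, str]]]) -> float:
    """Shared contributions of common ancestors, normalised by both totals."""
    sa, sb = s_values(a, dag), s_values(b, dag)
    common = sa.keys() & sb.keys()
    shared = sum(sa[t] + sb[t] for t in common)
    return shared / (sum(sa.values()) + sum(sb.values()))


if __name__ == "__main__":
    print(round(term_similarity("C", "D", TOY_DAG), 4))
```

In this toy graph, terms C and D share the ancestors A and root, so their similarity falls between 0 and 1, while any term compared with itself scores exactly 1.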

The tools include visualizations that let users inspect where GO terms sit within the GO graph and judge their semantic similarity visually. A batch command interface lets users measure the semantic similarity of a group of GO terms or the functional similarity of a group of genes. Web-based APIs are also provided for advanced users.

According to our web log records, the G-SESAME tools were used more than 70.2 million times by researchers from 252 organizations between October 2006 and October 2015. G-SESAME currently uses the Gene Ontology database published by the Gene Ontology Consortium in September 2015.

How to cite the G-SESAME tools?

Please cite the following papers:

  1. James Z. Wang, Zhidian Du, Rapeeporn Payattakool, Philip S. Yu and Chin-Fu Chen, A New Method to Measure the Semantic Similarity of GO Terms, Bioinformatics, 2007, 23: 1274-1281; doi: 10.1093/bioinformatics/btm087. Supplementary information is available online.

  2. Zhidian Du, Lin Li, Chin-Fu Chen, Philip S. Yu and James Z. Wang, G-SESAME: Web Tools for GO-Term-Based Gene Similarity Analysis and Knowledge Discovery, Nucleic Acids Research, 2009, 37: W345–W349.

  3. Xuebo Song, Lin Li, Pradip K. Srimani, Philip S. Yu and James Z. Wang, Measure the Semantic Similarity of GO Terms Using Aggregate Information Content, IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2014, 11(3).

Sponsor Information

This project is supported by NSF grants DBI-0960586 and DBI-0960443.