== BACKGROUND ==

Over the past decade, gene silencing through RNA interference (RNAi) technology has emerged as a powerful tool for deciphering the mechanistic details of biological processes in higher eukaryotes. RNAi was first exploited for systematic functional studies in Caenorhabditis elegans and Drosophila melanogaster (1,2), and is now also widely used for selective suppression of gene expression in mammalian cells (3,4). More recently, viral-based pooled shRNA screening methods have been developed and applied in functional genetic screens to identify genes that are essential for cancer cell proliferation, with the goal of identifying therapeutic targets (5–7).

We have developed a standard operating procedure for carrying out large-scale pooled shRNA screens and are systematically looking for essential genes using cancer cell lines from various tumor types, including breast, ovarian and pancreatic (Marcotte et al., submitted for publication). We are using a pooled subset of the human TRC collection that includes 78 432 shRNAs targeting approximately 16 000 human genes (5,8) to develop essential gene profiles across a large number of cancer cell lines (R. Marcotte, submitted for publication). To streamline the process from data generation to public user access of our cancer cell line screening results, we developed a web-accessible database system for processing, analyzing and retrieving data from the pooled screens. The COLT-Cancer database system comprises: (i) a laboratory information and management system (LIMS) for automation of basic microarray functions such as chip signal extraction, background correction, normalization and quality metric generation; (ii) an automated routine for generating hairpin-level and gene-level essentiality scores; and (iii) a web interface at http://colt.ccbr.utoronto.ca/cancer that enables researchers to query, visualize and compare essential genes across multiple cancer cell lines.
Many RNAi screens in mammalian cells have been conducted in academic and industry labs and have yielded novel insight into genes that are essential for cancer cell proliferation. Some of the resulting data are available to the research community through a number of collation efforts, including RNAiDB (9), GenomeRNAi (10) and the FLIGHT database (11). These databases support integrative visualization and analysis of RNAi data with other data such as gene annotations, shRNA sequence annotations and corresponding knockout efficiency and genomic information. In addition, several RNAi-based tools and databases focus on providing searchable shRNA and siRNA constructs, such as RNAi Codex (12), E-RNAi (13), the RNAi Consortium (TRC) library database (http://www.broadinstitute.org/rnai/public/) and the Cancer Genome Anatomy Project (CGAP) shRNA clone library (http://cgap.nci.nih.gov/RNAi/RNAi2). COLT-Cancer was designed with a distinct focus: to facilitate functional comparison of essential gene profiles across a compendium of cancer cell lines, and to integrate this information with structural genomic data from large cancer genome sequencing efforts in order to uncover vulnerabilities that can be used to develop better prognostics and therapeutics (Marcotte et al., submitted for publication).

== DATABASE CONSTRUCTION AND CONTENT ==

== System architecture ==

COLT-Cancer is deployed on a back-end DB2 relational database management system. The DB2 database serves as central storage for data and images generated continuously by our automated computational pipeline for processing and analyzing RNAi pooled screens. It was therefore designed for efficient querying and storage of large quantities of microarray images, signal intensity measurements, annotations of genes and shRNA reagents, genomic information and other metadata. It contains more than 200 relational tables (the database schema is available in the COLT-Cancer online documentation).
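To make the relational layout concrete, the following is a minimal sketch of how gene-level scores might be stored and compared across cell lines. It is purely illustrative and is not the COLT-Cancer schema: it uses SQLite in place of DB2, and every table name, column name, cell line, gene symbol and score value is a hypothetical placeholder. It also assumes, for illustration only, that a lower score indicates a more essential gene.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical three-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE cell_line (id INTEGER PRIMARY KEY, name TEXT, tumor_type TEXT);
        CREATE TABLE gene (id INTEGER PRIMARY KEY, symbol TEXT UNIQUE);
        CREATE TABLE gene_score (
            cell_line_id INTEGER REFERENCES cell_line(id),
            gene_id INTEGER REFERENCES gene(id),
            score REAL,  -- assumption: lower means more essential
            PRIMARY KEY (cell_line_id, gene_id)
        );
        """
    )
    # Placeholder rows; none of these values come from real screens.
    conn.executemany(
        "INSERT INTO cell_line VALUES (?, ?, ?)",
        [(1, "LINE_A", "breast"), (2, "LINE_B", "ovarian")],
    )
    conn.executemany("INSERT INTO gene VALUES (?, ?)", [(1, "GENE1"), (2, "GENE2")])
    conn.executemany(
        "INSERT INTO gene_score VALUES (?, ?, ?)",
        [(1, 1, -2.0), (2, 1, -1.5), (1, 2, 0.3), (2, 2, -1.2)],
    )
    return conn


def essential_in_all(conn, threshold):
    """Return symbols of genes scoring at or below `threshold` in every cell line."""
    rows = conn.execute(
        """
        SELECT g.symbol
        FROM gene AS g
        JOIN gene_score AS s ON s.gene_id = g.id
        GROUP BY g.symbol
        HAVING MAX(s.score) <= ?
           AND COUNT(*) = (SELECT COUNT(*) FROM cell_line)
        ORDER BY g.symbol
        """,
        (threshold,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    demo = build_demo_db()
    print(essential_in_all(demo, -1.0))
```

Keeping cell lines, genes and scores in separate keyed tables lets a cross-line comparison be expressed as a single grouped join, which is the kind of query a comparison-oriented web interface would issue.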
The COLT system is hosted on two IBM servers: one functions as a database server and the other as a web server that supports querying, data downloading and data visualization through the COLT-Cancer websites. The web interfaces of COLT-Cancer were developed using a combination of HTML, CGI Perl, the DB2/Perl application programming interface, cascading style sheets and JavaScript for easy navigation. Graphical plots are generated on-the-fly using R plotting functions.

== Microarray LIMS system ==

At the back-end of COLT is a