Query and Browse Data
You can currently query and browse LINCS data through multiple lenses, including our cell lines, perturbations, and the results of individual assays. A number of tools are being developed (in alpha release within the LINCS group) and will be publicly released in the future. These include tools for integrated browsing and querying of LINCS data, and for running analyses that bring together LINCS data and other large-scale public domain databases. Below is a summary of these tools.
Available now:
Query and Browse: LINCS pharmacoresponse datasets
Tool: HMS LINCS Database
LINCS Center: Harvard Medical School
Tasks performed/data handled
The HMS LINCS Database contains documentation for HMS LINCS reagents (small molecule and protein perturbagens, cells, and other protein reagents). Multiple HMS LINCS datasets are also available in this database for browsing and download; for several datasets, primary images are available to view and download. The HMS LINCS Database evolves continuously, and new features and data will be available in both January and March. It will not be “complete” per se, since we intend to continue to modify it throughout the project.
Visualize: LINCS data relevant to signaling pathways
Tool: Browse kinase inhibitor by canonical pathway
LINCS Center: Harvard Medical School
Tasks performed/data handled
This node-edge graph of a canonical immediate-early signaling pathway provides a simple interface to a subset of the results in our LINCS database. Mousing over selected nodes shows small molecule kinase inhibitors whose primary targets include that node, along with IC50 data for six canonical cell lines. Click-through yields additional information on the perturbagens and the data. We are currently expanding the number of cell lines from six to 40 and are working through the first set of 60 compounds on these lines. A second set of 60 will follow by March, but testing of all compounds on all cell lines will not be complete until the end of year three (many in the joint project). However, there will be enough data to get a sense of their value before then.
New methods: access source code
Tool: Multi-parameter analysis of perturbagen response
LINCS Center: Harvard Medical School
Tasks performed/data handled
This interface is currently little more than a sketch, but we are actively analyzing publicly available perturbagen dose-response data, and by the end of January we will make our scripts available. By March we will have a fairly complete meta-analysis available, with various browsing functions.
Browse: Identify new relationships in LINCS data by leveraging ontologies
Tool: LIFEwrx
LINCS Center: University of Miami
Tasks performed/data handled
A novel, semantically enhanced, web-based software application that enables access to, and navigation and exploration of, a knowledgebase built by integrating and indexing all the LINCS data types. LIFE will also link to analyses and some raw data for LINCS.
- LIFE provides access to LINCS assays, assay-relevant concepts (biomolecule participants or associated information), and assay data via different types of search (including free text search, concept search, chemical structure search)
- LIFE allows access, navigation and exploration of LINCS assays, biomolecules, related concepts and LINCS screening results via a variety of perspectives / views such as proteins, genes, cell lines, small molecules, etc.
- LIFE provides flexible navigation of the LINCS assay and data landscape via list functionality covering important assay biomolecules and concepts; this enables a variety of use cases
- LIFE provides a few simple options to interrogate and visualize LINCS data, for example by assay participants (compounds, target proteins, cell lines) across different assays
- LINCS screening results (summary, not raw data) can be downloaded via LIFE
- LIFE will provide summary signature views and link to LINCS data sources to direct users to all details about the assays and data
- LIFE will provide access to detailed assay and result data and views as part of the HMS Data Portal
- LIFE also provides the backend of the pLINDAW system
- The long-term goal of the LIFE system is to provide access to asserted and inferred knowledge as it relates to participating biomolecules and model systems and associated LINCS screening results; LIFE will thus become a knowledge portal to LINCS data, assays and results.
Documentation available at main site LIFE, and within the tool. Video Tutorials also available.
Query and Integrate: LINCS and other publicly available datasets
Tool: iLINCS (integrated LINCS) genomics data portal
LINCS Center: Cincinnati
Tasks performed/data handled
- The iLINCS portal handles LINCS L1000 and KinomeScan data
- It facilitates integration of LINCS data-derived signatures with other genome-scale signatures: ENCODE transcription factor binding signatures, pre-defined disease-related signatures derived from GEO datasets, custom-built transcriptional signatures using data in Genomics Portals, and uploaded custom signatures.
Documentation: Tutorials and help available at main portal.
- Tasks that the portal will perform at the release time (January 2013):
- Text search and downloading of LINCS L1000 and KinomeScan signatures
- Using gene lists to find LINCS L1000 signatures associated with the list through enrichment analysis (including construction of gene list using functional knowledge base in Genomics Portals).
- Browsing pre-established relationships between LINCS perturbagen signatures, and between LINCS perturbagen signatures and other genome-scale signatures listed above.
- Concordance analysis between LINCS perturbagen signatures and user-supplied signatures, where the user-supplied signatures can be created using the Genomics Portals database or uploaded directly.
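The gene-list enrichment task above can be illustrated with a minimal sketch. It assumes an upper-tail hypergeometric test, which is a common choice for gene-list enrichment but is not confirmed here as the statistic iLINCS uses. The function names, gene identifiers, and signature names are hypothetical and are not part of the iLINCS portal.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, signature_size, list_size, overlap):
    """Upper-tail hypergeometric p-value: the probability of seeing at least
    `overlap` signature genes when `list_size` genes are drawn at random from
    a universe containing `signature_size` signature genes."""
    max_overlap = min(signature_size, list_size)
    total = comb(universe_size, list_size)
    tail = sum(
        comb(signature_size, k) * comb(universe_size - signature_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def rank_signatures(gene_list, signatures, universe):
    """Return (signature_name, overlap, p_value) tuples, most enriched first.

    `signatures` maps a (hypothetical) signature name to its gene set.
    Genes outside `universe` are ignored on both sides.
    """
    universe = set(universe)
    genes = set(gene_list) & universe
    results = []
    for name, sig_genes in signatures.items():
        sig = set(sig_genes) & universe
        overlap = len(genes & sig)
        p = hypergeom_enrichment_p(len(universe), len(sig), len(genes), overlap)
        results.append((name, overlap, p))
    return sorted(results, key=lambda r: r[2])


if __name__ == "__main__":
    universe = [f"g{i}" for i in range(10)]          # toy gene universe
    signatures = {"sigA": ["g0", "g1", "g2"], "sigB": ["g7", "g8", "g9"]}
    for name, overlap, p in rank_signatures(["g0", "g1", "g2"], signatures, universe):
        print(name, overlap, p)
```

In practice the p-values would also need correction for multiple testing across all signatures; that step is omitted from this sketch.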
Available January 2013:
New Methods: Algorithms for image analysis
Tool: itNETZ: Cell-IA Module
LINCS Center: Methodist
Tasks performed/data handled
The software can be used to process cell-based screen image data, with functions such as nuclei segmentation, cell segmentation, cell tracking, and 3D medical image reconstruction. This module uses LINCS image data as well as images and movies from other sources.
Tutorial and User Manual available.
Visualize and Mine: LINCS data
Tool: Visualizing cell-ligand response data
LINCS Center: Harvard Medical School
Tasks performed/data handled
This interface is currently little more than a sketch, but it will advance substantially once we submit a paper in January (this paper contains the figures that will be the first component of the tool). By the end of January it will present various regression and PCA-based models of our results with mouse-over and click-through features. By the end of March we will have made the interface truly dynamic, with the ability to select axes and principal components.
Available March 2013:
Browse: LINCS perturbations
Tool: Data Browser: View perturbagens that have been profiled or have been scheduled for profiling
LINCS Center: Broad
Tasks performed/data handled
The user interface has 3 parts:
- A table-based view that allows a user to browse all perturbagens in the LINCS dataset
- A search box for text based searching
- A SMILES box for searching by chemical structure
The result of a search can be viewed in the browser and also downloaded as a text file.
Browse: Summary expression signatures
Tool: LINCS Digest: View connections for a particular gene or compound
LINCS Center: Broad
Tasks performed/data handled
The user interface is a search box that allows users to type in the symbol of a gene or the common name of a compound (that has already been profiled by LINCS).
The digest page (which will be pre-computed for every build of the dataset) provides information including:
- List of top genes that are up and down-regulated by the perturbagen
- Pathways (curated from MSigDB) that are enriched in the signature
- Other perturbagens whose signatures match this signature (i.e., an internal query of this LINCS perturbagen against all other LINCS perturbagens)
- List of the specific shRNAs used to derive the signature and the extent of correlation between the shRNAs
- Whether this gene was a landmark or inferred
- The chemical structure (SMILES string)
- Links (when available) to PubChem and ChEMBL
- List of potential gene targets of this compound identified by querying the LINCS compound signature against the database of LINCS gene signatures
New methods: Build drug response signatures
Tool: itNETZ: csNMF Module
LINCS Center: Methodist
Tasks performed/data handled
One task of this module is to process the raw L1000 data: calculating expression data from the raw (bead-based assay) data and performing normalization and quality control. Another task is to discover drug-response signatures from L1000 data using our proposed method, “constraint sparse non-negative matrix factorization” (csNMF). This module uses the LINCS L1000 transcriptional expression data.
Tutorial and User Manual available.
New methods: Predict new therapeutic or side effect profiles for kinase inhibitors
Tool: itNETZ: Kinase Inhibitor Effect Prediction Module
LINCS Center: Methodist
Tasks performed/data handled
The KIEP Module is used to study the network signature induced by a kinase inhibitor and to predict the inhibitor's therapeutic and side effects. This module is based on LINCS proliferation/apoptosis response data from cancer cells and cue-signal-response data from liver cells.
Tutorial and User Manual available.
Query: Gene expression signature similarities
Tool: Query App: Integrate LINCS Data with externally derived gene sets
LINCS Center: Broad
Tasks performed/data handled
- Which perturbagens are connected to the provided (externally derived) signature?
- Are perturbagens of a given type (e.g. KD, CP, ORF) enriched relative to other types?
- Do perturbagen connections vary with cell line?
- Are perturbagens of a given set (e.g. ATC code, pathway, etc.) enriched relative to others?
Visualize and Integrate: LINCS and other publicly available genomic data
Tool: UCSC Cancer Genome Browser – LINCS data
LINCS Center: Harvard Medical School
Tasks performed/data handled
The Cancer Genomics Browser is a suite of web-based tools to visualize, integrate and analyze cancer genomics and LINCS data. LINCS data that will be accessible via the Browser include: L1000 data, KinomeScan assay results for HMS LINCS kinase inhibitors, LINCS annotations for cells and small molecules, and RPPA assay data.
Visualize and Mine: thousands of LINCS expression datasets
Tool: LINCS Canvas Browser link 1 link 2
LINCS Center: Harvard Medical School
Tasks performed/data handled
- Compact visualization of thousands of L1000 experiments
- Clustering of perturbations based on signature similarity
- Interactive gene list enrichment analysis using 32 gene set libraries
- iPhone version of the application
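As an illustration of clustering perturbations by signature similarity, here is a minimal sketch. It assumes cosine similarity and single-linkage grouping at a fixed threshold; neither is stated to be what the LINCS Canvas Browser uses, and the function names, perturbation names, and threshold are hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length expression signatures."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_by_similarity(signatures, threshold=0.8):
    """Single-linkage grouping: two perturbations end up in the same cluster
    when a chain of pairwise similarities >= threshold connects them."""
    names = list(signatures)
    parent = {n: n for n in names}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine_similarity(signatures[a], signatures[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    toy = {
        "drugA": [1.0, 2.0, 3.0],
        "drugB": [2.0, 4.0, 6.1],     # nearly parallel to drugA
        "drugC": [-1.0, -2.0, -3.0],  # opposite direction
    }
    print(cluster_by_similarity(toy))
```

The all-pairs loop is quadratic in the number of signatures, so for thousands of L1000 experiments a vectorized or approximate method would likely be needed.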
New methods: Integrated query and mining of LINCS data
Tool: itNETZ: pLINDAW & jLINDAW
LINCS Center: Methodist Research Institute and University of Miami
Pre-release: jLINDAW module in January 2013
Tasks performed/data handled
This tool seamlessly incorporates all LINCS data types as well as important external databases using various similarity metrics and fuzzy querying, solidifies four data processing and data mining pipelines into the schema of the data warehouse, and provides a user-friendly interface for on-the-fly data mining, visualization, and prediction of LINCS data and approaches. This tool uses all LINCS data types.
Tutorial and Manual released with jLINDAW; Online Tutorial and User Reference available.
New algorithms: From gene expression and regulatory models to drug MoA
Tool: DeMAND: Drug Mechanism of Action using Network Dysregulation
LINCS Center: Columbia
Note: Release at this time only to LINCS consortium
Tasks performed/data handled
- DeMAND identifies drug mechanism of action by comparing gene expression profiles from drug-perturbed and control samples and computing the drug's effect on the interactions in an interactome.
- The method comprises two steps. In the first step, a pre-defined context-specific interactome is used. For each edge in the interactome we determine the two-dimensional probability distribution of the gene expression levels, both in the control state and following drug treatment. The change in the probability distribution is estimated using the Kullback-Leibler (KL) divergence, from which we determine the statistical significance of the dysregulation of each edge. In the second step of DeMAND, we test each gene to see whether its interactions are enriched with dysregulated ones, suggesting that it is a candidate mechanism of action.
- DeMAND is implemented in R and requires 2 inputs: a) a set of gene expression profiles, both control and drug treated, with at least 6 replicates at each condition; and b) a contextual interactome.
- As an output it generates a list of genes, ranked from genes with highest probability of being the drug mechanism of action to least.
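The edge-level KL divergence in the first step can be sketched as follows. This is not the DeMAND implementation (which is in R): it assumes a simple smoothed two-dimensional histogram as the density estimate and measures the divergence of the treated distribution from the control distribution, both of which are illustrative choices. The significance test and the second (gene-level enrichment) step are omitted. All function names, gene names, and cut points are hypothetical.

```python
from bisect import bisect_right
from math import log


def joint_distribution(xs, ys, cuts_x, cuts_y, pseudocount=0.5):
    """Smoothed 2D histogram of paired expression values.

    cuts_x / cuts_y are sorted bin boundaries; the pseudocount keeps every
    cell strictly positive so the KL divergence below is always finite.
    """
    n_x, n_y = len(cuts_x) + 1, len(cuts_y) + 1
    counts = [[pseudocount] * n_y for _ in range(n_x)]
    for x, y in zip(xs, ys):
        counts[bisect_right(cuts_x, x)][bisect_right(cuts_y, y)] += 1
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]


def kl_divergence(p, q):
    """KL(p || q) for two probability tables of the same shape."""
    return sum(
        p_ij * log(p_ij / q_ij)
        for p_row, q_row in zip(p, q)
        for p_ij, q_ij in zip(p_row, q_row)
    )


def edge_dysregulation(control, treated, gene_a, gene_b, cuts):
    """KL divergence of the treated joint distribution from the control one
    for a single interactome edge (gene_a, gene_b)."""
    p = joint_distribution(treated[gene_a], treated[gene_b], cuts, cuts)
    q = joint_distribution(control[gene_a], control[gene_b], cuts, cuts)
    return kl_divergence(p, q)


if __name__ == "__main__":
    # Six replicates per condition; the A-B relationship flips under treatment.
    control = {"A": [1, 1, 1, 5, 5, 5], "B": [1, 1, 1, 5, 5, 5]}
    treated = {"A": [1, 1, 1, 5, 5, 5], "B": [5, 5, 5, 1, 1, 1]}
    print(edge_dysregulation(control, treated, "A", "B", cuts=[3]))
```

With only a handful of replicates per condition, a coarse histogram like this is a crude estimate; a smoother density estimate would be a natural substitute.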
New algorithms: From gene expression and regulatory models to protein activities
Tool: Virtual Proteomics
LINCS Center: Columbia
Note: Release at this time only to LINCS consortium
Tasks performed/data handled
- Introduces a “virtual proteomics” approach that leverages the increasingly accurate and context specific knowledge of regulatory networks, to infer the differential activity of proteins on an individual sample basis, in proteome-wide fashion
- The tool was developed based on the idea that the activity of a protein’s transcriptional targets, either direct ones for a transcription factor or indirect ones for a signaling protein, provides the most accurate assessment of its activity. Transcription factors are ideally suited to this type of analysis because they are directly responsible for determining mRNA expression patterns.
- The tool will be a computational framework to infer the functional mode of action (fMoA) of small compounds by estimating their effect on transcriptional regulators. We perform this task by computing the enrichment of compound-perturbation gene expression signatures on each regulator's target gene set (regulon) to obtain compound-characteristic regulator activity signatures (compound fMoA).
- Virtual proteomics is implemented in R and it will be available as an R-System package, which will include all the functions required for performing the virtual proteomics analysis with the corresponding documentation, and metadata including the breast carcinoma / MCF7; prostate carcinoma / PC3; and AML / HL60 context-specific transcriptional regulatory networks (interactomes).
- Virtual proteomics requires 2 inputs: (1) a set of gene expression profiles, and (2) a context-specific interactome.
- As output it generates a regulators’ activity matrix, with samples as columns and regulators as rows.
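The input/output shape described above (expression signatures plus regulons in, a regulators-by-samples activity matrix out) can be sketched as below. The enrichment statistic is not specified here, so this sketch substitutes a plain mean of each regulon's target scores as a stand-in; it is not the Virtual Proteomics method, which is implemented in R. The function name and all sample, gene, and regulator names are hypothetical.

```python
from statistics import mean


def regulator_activity_matrix(signatures, regulons):
    """Build a regulators-by-samples activity matrix.

    signatures: {sample: {gene: differential expression score}}
    regulons:   {regulator: set of target genes}

    Activity is approximated as the mean score of the regulator's targets in
    each sample (a stand-in for a proper enrichment statistic). Returns
    (regulators, samples, matrix) with regulators as rows, samples as columns.
    """
    samples = sorted(signatures)
    regulators = sorted(regulons)
    matrix = []
    for reg in regulators:
        row = []
        for sample in samples:
            scores = [
                signatures[sample][gene]
                for gene in regulons[reg]
                if gene in signatures[sample]
            ]
            # Regulators with no measured targets get a neutral activity of 0.
            row.append(mean(scores) if scores else 0.0)
        matrix.append(row)
    return regulators, samples, matrix


if __name__ == "__main__":
    signatures = {
        "s1": {"g1": 2.0, "g2": 4.0, "g3": -1.0},
        "s2": {"g1": -2.0, "g2": 0.0, "g3": 3.0},
    }
    regulons = {"TF1": {"g1", "g2"}, "TF2": {"g3"}}
    regulators, samples, matrix = regulator_activity_matrix(signatures, regulons)
    for reg, row in zip(regulators, matrix):
        print(reg, dict(zip(samples, row)))
```

A real analysis would also account for whether each target is activated or repressed by its regulator, which this unsigned mean ignores.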
Visualize & Mine: Cellular Image Assays
Tool: Image Browser
LINCS Center: Harvard Medical School
Tasks performed/data handled
This functionality is not yet in our “Explore” snapshots but is actively being developed. It will describe how to view, scale and download LINCS images.
Visualize & Mine: new methods for building signaling networks using LINCS data
Tool: Liver cells Cue Signal Response datasets (network graphs)
LINCS Center: Harvard Medical School
Tasks performed/data handled
This will turn into a pathway-oriented interface driven by fuzzy logic and dynamic Bayesian modeling. We expect to make the algorithms available by March, but only as a download for offline analysis. We will, however, show example node-edge graphs on the website and explain what they mean. A paper is going out in January/February.
Available June 2013:
New Features: Tools to navigate and query LINCS data in geWorkbench
Tool: geWorkbench with LINCS-specific components
LINCS Center: Columbia
Tasks performed/data handled
The Columbia U01 centers are using computational and experimental methods to find drug pairs showing synergistic effect on cancer cells. Computational methods are used to find candidate pairs with most dissimilar “functional Mechanism of Action” (fMoA), where fMoA is defined as the set of regulatory genes whose targets evince the greatest change in expression when exposed to a drug (as determined using MINDy/GSEA).
The tools will provide:
- Query and display of similarity (computational) and synergy (experimental) results for drug-pair interactions generated in the Columbia Centers.
- Display of results in tabular and heat-map formats, with network-oriented display if warranted for multiply-connected drug pairs.
- Display of calculated drug functional Mechanism of Action on the regulatory interaction network.
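The idea of ranking candidate drug pairs by how dissimilar their fMoA sets are can be illustrated with a minimal sketch. It assumes Jaccard distance between regulator sets as the dissimilarity measure; the actual measure used by the Columbia centers is not specified here. The function names, drug names, and regulator sets are hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| for two sets of regulator genes."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def rank_dissimilar_pairs(fmoa):
    """Return (drug_1, drug_2, distance) for every drug pair, ordered from
    the most dissimilar fMoA to the most similar.

    fmoa: {drug: set of regulator genes making up its fMoA}
    """
    pairs = [
        (d1, d2, jaccard_distance(fmoa[d1], fmoa[d2]))
        for d1, d2 in combinations(sorted(fmoa), 2)
    ]
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


if __name__ == "__main__":
    fmoa = {
        "drugA": {"MYC", "TP53"},
        "drugB": {"MYC", "TP53"},
        "drugC": {"STAT3"},
    }
    for d1, d2, dist in rank_dissimilar_pairs(fmoa):
        print(d1, d2, dist)
```

Pairs at the top of this ranking would be the computational candidates; whether they are actually synergistic is what the experimental results are meant to establish.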