1. General query of MatrisomeDB 2.0:
- Locate the Search box.
- A general query of MatrisomeDB 2.0 is performed by entering one or more gene symbols, a protein description, or a tissue type in the Search box. General queries are case-insensitive and accept partial-word matches.
• Search by typing a single gene name, tissue, or description in the Search box. For example, 'COL1A1' or 'Breast'.
Note: Partial gene names are also accepted. For example, entering 'COL1' will return data on all protein entries encoded by genes whose symbols contain COL1, such as COL1A1 but also COL10A1, etc.
• Search by typing multiple genes in the Search box. This can be done by entering comma-separated gene symbols, for example, 'COL1A1,COL6A3,COL6A1', or by using the syntax 'genes=(COL1A1,COL6A3,COL6A1)'.
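The query formats described above can be sketched in a few lines of Python. This is only an illustration of the documented syntax and of case-insensitive partial matching, not MatrisomeDB's actual implementation; the function names `build_genes_query` and `partial_matches` and the short symbol list are hypothetical.

```python
def build_genes_query(symbols):
    """Format gene symbols using the 'genes=(A,B,C)' syntax described above."""
    cleaned = [s.strip() for s in symbols if s.strip()]
    return "genes=(" + ",".join(cleaned) + ")"


def partial_matches(query, symbols):
    """Return the symbols that contain `query`, ignoring case."""
    needle = query.lower()
    return [s for s in symbols if needle in s.lower()]


# Hypothetical list of gene symbols, for illustration only.
symbols = ["COL1A1", "COL1A2", "COL10A1", "Col1a1", "COL4A1", "FN1"]

print(build_genes_query(["COL1A1", "COL6A3", "COL6A1"]))
# genes=(COL1A1,COL6A3,COL6A1)
print(partial_matches("col1", symbols))
# ['COL1A1', 'COL1A2', 'COL10A1', 'Col1a1']
```

The second call shows why a short query such as 'COL1' can return more entries than expected: it matches any symbol containing that string, in either species.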
2. Searching MatrisomeDB 2.0 using Option boxes:
- The three Option boxes below the Search box can be used to further restrict a query to specific Matrisome Categories, Species, and/or Tissues.
Users can cancel a selection by holding Ctrl and left-clicking the options to be deselected.
Example 1: Entering a partial gene name:
- To see the distribution of proteins encoded by genes whose symbols contain 'COL4':
a) Type COL4 in the Search box and click the button.
b) The first panel displayed at the top of the Result page is the 'Organ system legend'. Its color code will match that of the tissue types listed at the right of the heatmaps.
c) The second panel on the Result page presents a hierarchically clustered tissue distribution heatmap of all proteins encoded by genes whose symbols contain "COL4". Since the input is case-insensitive, both human (COL4A1, COL4A2, COL4A3, COL4A4, COL4A5, COL4A6) and mouse (Col4a1, Col4a2, Col4a3, Col4a4) proteins will be displayed. The color code reflects the confidence score from low (yellow) to high (dark blue).
Clicking on the heatmap itself or on "Click here for more heatmap details" will take the user to a separate page that provides a larger view of the confidence-score-based heatmap and a second heatmap based on total ion count. The data underlying both heatmaps can be downloaded as .csv files by clicking on the provided links.
d) The third panel on the Result page presents the enrichment text cloud of the query results (for details on the enrichment text cloud, please see the Methods section of our MatrisomeDB 2.0 manuscript).
From the COL4 enrichment text cloud, one can readily spot genes (e.g., ltbp2) or tissues/cell lines (e.g., mc38) enriched in the COL4 query results.
e) The fourth panel on the Result page presents the details of the query results. All of the data, or only the filtered data, can be downloaded by clicking the "Export all result to .tsv file" or the "Export filtered results" button, respectively, located above the Result table.
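Once exported, the result table can be inspected outside the browser. The sketch below assumes the export is a tab-separated file with a header row; the column names `Gene` and `Tissue` and the sample rows are placeholders, not the actual export schema or real MatrisomeDB records, so adjust them to match the header of the downloaded file.

```python
import csv
import io
from collections import Counter


def read_tsv(handle):
    """Parse a tab-separated file (with a header row) into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def count_by(rows, column):
    """Count how many rows carry each value of `column`."""
    return Counter(row[column] for row in rows)


# Placeholder content standing in for an exported file; the rows are made up.
sample = io.StringIO(
    "Gene\tTissue\n"
    "COL4A1\tLung\n"
    "COL4A2\tLung\n"
    "Col4a1\tKidney\n"
)
rows = read_tsv(sample)
print(count_by(rows, "Tissue"))
```

For a downloaded file, an open file handle can be passed instead, for example `with open("results.tsv", newline="") as fh: rows = read_tsv(fh)` (the file name here is hypothetical).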
The entries in the Result table are clickable:
- Clicking on a Gene entry will open the GeneCards page of the selected gene.
- Clicking on UniProt will open the protein's UniProt page.
- Clicking on the Description will display sample-specific and overall domain/sequence coverage maps computed using the data integrated in MatrisomeDB 2.0.
- Clicking on Open in SCV or Open PTMs occurrence table, from either the sample-specific or the global view, will take users to the 3D sequence coverage or to a detailed PTM table, respectively.
- Clicking on SMART domains will take users to the SMART domain page of the query protein; clicking on CPTAC assay or PeptideAtlas will take users to highly predicted peptides from these external resources.
- Clicking on Sample type or Reference will take users to the public repository from which the raw mass spectrometry data can be retrieved or to the original publication reporting the analysis of each specific sample, respectively.
Example 2: Searching for multiple proteins
- To specifically examine the distribution of the murine proteins Chondroadherin (Chad), Podocan (Podn), and Hyaluronan And Proteoglycan Link Protein 1 (Hapln1), users can type their gene symbols in the Search box as 'genes=(Chad,Podn,Hapln1)' and click the button.
- The heatmap returned from the search presents the distribution of all three proteins across different tissues.
- The text cloud reveals the terms most highly enriched in relation to the three query genes.
- The Result table displayed below provides sample-specific information on these three proteins.
Example 3: Interrogating MatrisomeDB 2.0 using selections from the Option boxes
- To examine the distribution of all collagen VI proteins in human colon and liver, including normal and diseased tissues, users should:
a) Input COL6A in the Search box.
b) Select Human in the Species option box, and Blood Vessel, Breast, Colon, Kidney, Liver in the Organ or Tissue of Origin option box.
c) Then click the button.
- The results shown correspond exactly to the search parameters specified.
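The way the Search box and Option boxes combine can be pictured as a conjunction of filters: a case-insensitive partial match on the gene symbol, together with the selected species and tissues. The sketch below illustrates that logic on made-up records; the field names, the `filter_rows` helper, and the rows themselves are assumptions for illustration and do not reflect the database's internal implementation.

```python
def filter_rows(rows, gene_query, species=None, tissues=None):
    """Keep rows matching the gene query AND the selected species/tissues.

    A selection left as None is treated as "no restriction".
    """
    needle = gene_query.lower()
    kept = []
    for row in rows:
        if needle not in row["gene"].lower():
            continue
        if species is not None and row["species"] != species:
            continue
        if tissues is not None and row["tissue"] not in tissues:
            continue
        kept.append(row)
    return kept


# Made-up records for illustration only.
rows = [
    {"gene": "COL6A1", "species": "Human", "tissue": "Colon"},
    {"gene": "COL6A3", "species": "Human", "tissue": "Liver"},
    {"gene": "COL6A2", "species": "Human", "tissue": "Lung"},
    {"gene": "Col6a1", "species": "Mouse", "tissue": "Liver"},
    {"gene": "COL1A1", "species": "Human", "tissue": "Colon"},
]

selected = filter_rows(rows, "COL6A", species="Human", tissues={"Colon", "Liver"})
print([r["gene"] for r in selected])
# ['COL6A1', 'COL6A3']
```

Without the species selection, the mouse entry Col6a1 would also be kept, since the gene query itself is case-insensitive.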
For more information, please refer to our Publications which also list additional examples.
If you have any questions or identify a technical issue, please contact Gao lab @ UIC or the Matrisome team. Thank you for using MatrisomeDB/MatrisomeDB 2.0.