Single-cell meta-analysis of inflammatory bowel disease with scIBD

Understanding the heterogeneous intestinal microenvironment is critical to uncovering the pathogenesis of inflammatory bowel disease (IBD). Recent advances in single-cell RNA sequencing (scRNA-seq) have identified certain cell types and genes that could contribute to IBD. However, a comprehensively integrated analysis of these scRNA-seq datasets is not yet available. Here we introduce scIBD, a platform for single-cell meta-analysis of IBD with interactive visualization features. scIBD combines highly curated single-cell datasets in a uniform workflow, enabling the identification of rare or less-characterized cell types in IBD and the dissection of commonalities and differences between ulcerative colitis and Crohn's disease. scIBD also incorporates multi-functional information, including regulon activity, GWAS-implicated risk genes, and genes targeted by therapeutics, to infer clinically relevant cell-type specificity. Collectively, scIBD is a user-friendly web-based platform for the community to analyse the transcriptome features and gene regulatory networks associated with the pathogenesis and treatment of IBD at single-cell resolution.

Overview of scIBD datasets and annotations

Mailing address
Institute of Cancer Research,
Shenzhen Bay Laboratory,
Guangming District, Shenzhen, Guangdong,
P.R. China

Developed by Lei Zhang Lab | © Copyright 2022

Dot plot of gene expression

Annotation of cell subsets

Barplot of cell numbers

Violin plot of gene expression

Dot plot of gene expression

Heatmap plot of marker genes

Marker genes of each cell subtype

Regulon activity

Annotation of cell subsets

SCENIC embedding

Violin plot of regulon activity

Heatmap plot of regulon activity

Network of regulons

Network of regulons

Regulon specificity scores (RSS) in each cell type

Embedding plot of gene expression

Annotation of all cells

Annotation of selected cells

Violin plot of gene expression

Cells in each group

Violin plot of gene expression

Cells in each group

Violin plot of regulon activity

Cells in each group

Compare regulons between healthy and UC

Compare regulons between healthy and CD

Cell compositions of major subsets

Cell compositions of sub-cell types

Input gene set

Output of enrichment analysis

Heatmap

FDA approved drugs for IBD


Route: iv (intravenous) denotes administration within or into a vein; sc (subcutaneous) denotes administration beneath the skin; po (oral) denotes administration to or by way of the mouth.

CD represents Crohn’s disease; UC represents ulcerative colitis; x indicates that the drug can be used to treat UC or CD.

Therapy targets and drugs for IBD


Drugs and targets for IBD are retrieved from the Open Targets database.

Major GWAS study on IBD

Risk genes of IBD patients

Risk genes of pediatric patients

Manual of scIBD

Authors: Peilu Lin & Hu Nie

Date: 2023-03-29


Outline


  1. Explore gene expression for cell subtypes

  2. Explore regulon activity for cell subtypes

  3. Compare gene expression between healthy individuals and patients with IBD

  4. Compare regulon activity between healthy individuals and patients with IBD

  5. Apply gene set enrichment analysis

  6. Explore clinical trials, therapy drugs/targets, and GWAS-risk genes of IBD


1. Explore gene expression for cell subtypes


After accessing scIBD, go to the Exploration page and click the Gene Expression Profile tab; the following interface will be shown. By default, expression levels of CPA3 and TPSAB1 are queried in the myeloid compartment.

Example: Explore differentially expressed genes in the myeloid compartment


After Myeloid is selected as the major cluster, annotations of cell subsets (left) and the number of cells (right) in the myeloid compartment are shown. Cell numbers are calculated from the whole dataset, not the downsampled dataset.


Users can browse the top signature genes of each cell subtype in the Heatmap plot of marker genes box and the Marker genes of each cell subtype box.


After Mast cell is chosen and the list is ranked by avg_log2FC, differentially expressed genes are listed. Users can explore and compare their expression profiles in the Gene Expression Profile tab and the Gene Expression Comparison tab.

cluster: cell subtypes in the myeloid compartment

p_val: p value calculated by the Wilcoxon test

p_val_adj: adjusted p value, based on Bonferroni correction using all genes in the dataset

avg_log2FC: average log2 fold change
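To make the last two columns concrete, here is a minimal sketch of a Bonferroni adjustment and a log2 fold change. The function names, the pseudocount of 1, and the toy numbers are assumptions for illustration only; the exact computation behind the scIBD table may differ.

```python
import math

def bonferroni_adjust(p_val, n_genes):
    """Multiply the raw p value by the number of tested genes, capped at 1."""
    return min(1.0, p_val * n_genes)

def avg_log2fc(mean_in_cluster, mean_in_rest, pseudocount=1.0):
    """log2 ratio of mean expression inside vs. outside a cluster.

    The pseudocount (an assumption here) avoids division by zero.
    """
    return math.log2((mean_in_cluster + pseudocount) / (mean_in_rest + pseudocount))

# Toy values: a raw p value of 1e-6 tested against 20,000 genes,
# and mean expression of 7.0 in the cluster vs. 1.0 elsewhere.
adjusted = bonferroni_adjust(1e-6, 20000)
fold_change = avg_log2fc(7.0, 1.0)
print(adjusted, fold_change)
```

Ranking by avg_log2FC, as in the examples above, puts the genes most strongly enriched in the chosen subtype at the top of the list.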


Select one gene of interest


Similarly, after LAMP3+ DC is chosen and the list is ranked by avg_log2FC, differentially expressed genes are listed. You can explore and compare their expression profiles in the Gene Expression Profile tab and the Gene Expression Comparison tab.


Select multiple genes of interest



2. Explore regulon activity for cell subtypes


This section illustrates how to explore regulon activity in each cell subtype. After Myeloid is selected as the major cluster, the UMAP embedding (left) and the SCENIC embedding (right), annotated with the cell subsets in the myeloid compartment, are shown.

Example: Explore regulons in the myeloid compartment



Select one regulon of interest


Activities of regulons in myeloid compartment


Select multiple regulons of interest


The Network of regulons box is interactive



3. Compare gene expression between healthy individuals and patients with IBD


Compared with the control panels of the first two parts, the Gene Expression Comparison part provides several additional choices: Tissue, Developmental stage, Study, Minor cluster, Location, Disease state, and Sample.


Example: Explore the gene expression of HLA-II molecules


Explore the gene expression of HLA-II molecules in all major clusters


Explore the gene expression of HLA-II molecules in epithelial cells


Compare the gene expression of HLA-II molecules between health and UC in DUOX2+ epithelial cells


Compare the gene expression of HLA-II molecules between health and CD in enterocytes



4. Compare regulon activity between healthy individuals and patients with IBD


Example: Explore differentially activated regulons between health and UC or CD in epithelial cells


Select Epithelial as the major cluster to explore differentially activated regulons (in this example, AR) between health and UC or CD in epithelial cells.


In the Compare regulons between healthy and CD (or UC) box, the values in the second and third columns represent the average regulon activity in inflamed tissue of patients with CD (or UC) and in healthy individuals.


Example: Compare regulon activity of PITX1 between healthy individuals and patients with UC or CD in epithelial cells



Example: Compare regulon activity of PITX1 between healthy individuals and patients with UC in DUOX2+ epithelial cells in the colorectum



Example: Compare regulon activity of PITX1 between colon and rectum in DUOX2+ epithelial cells in patients with UC




5. Apply gene set enrichment analysis


In the 'Gene Enrichment Analysis' panel, three ways to input a gene set are provided.

You can choose one or more pre-defined risk gene sets from different studies. Here, we choose all pre-defined risk genes of UC (1). After selection, the gene list is generated automatically in the box.

Similarly, you can define your own gene set of interest (2) in the same format (one gene per line) or upload a txt file (3) containing your gene set.


After gene set enrichment analysis is applied to the risk genes of UC, their enrichment in each cell type is shown.

Odds ratio indicates how likely an outcome is to occur in one context relative to another.
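As a worked illustration of an odds ratio, the sketch below uses a 2x2 table that crosses gene-set membership with cell-type marker membership. This table layout, the function name, and the counts are hypothetical; the page does not state how scIBD builds its contingency table.

```python
import math

def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table [[a, b], [c, d]]; infinite if b * c is zero."""
    if b * c == 0:
        return math.inf
    return (a * d) / (b * c)

# Hypothetical counts:
#   rows    = gene is in the input set / not in the input set
#   columns = gene is a marker of the cell type / not a marker
in_set_marker, in_set_other = 10, 90
out_set_marker, out_set_other = 100, 9800

ratio = odds_ratio(in_set_marker, in_set_other, out_set_marker, out_set_other)
print(round(ratio, 2))
```

Under this layout, a ratio above 1 would mean that genes in the input set are markers of the cell type more often than genes outside it.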


Heatmap showing the expression of each GWAS-risk gene in each cell subtype

Rows: genes. Columns: cell subtypes. Gene expression is scaled by row.
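One common way to scale a heatmap by row is a per-gene z-score, sketched below. The page only says that expression is scaled by row, so the z-score, the function name, and the toy values are assumptions.

```python
import statistics

def scale_row(values):
    """Z-score one gene's expression across cell subtypes (population SD)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        # A gene with constant expression carries no contrast; return zeros.
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]

# Toy row: mean expression of one gene in three cell subtypes.
scaled = scale_row([1.0, 2.0, 3.0])
print(scaled)
```

With this kind of scaling, colors are comparable across subtypes within a gene, but not between genes.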



6. Explore clinical trials, therapy drugs/targets, and GWAS-risk genes of IBD


The purpose of the section 'Current Therapy Strategy' in 'Resources' is to provide a summary of therapy targets, drugs, and relevant clinical trials for IBD. Here, users can search for clinical trials by disease type (e.g. Crohn's disease, ulcerative colitis), therapy type (e.g. biologics, small molecules), or therapeutic target genes. The results of the search will include a list of clinical trials that match the specified criteria, along with the therapy type and target gene for each trial.

Two parts are included in this section:

  1. FDA approved drugs for IBD
  2. Therapy targets and drugs for IBD


Explore clinical trials of IBD

You can explore FDA approved drugs in the 'Current Therapy Strategy' panel.

Clinical trial information can be reached through the 'View' button.


The clinical trial information for each drug includes the clinical stage, the year of publication, and links to the study record of the clinical trial, the PubMed page for the reference, and more detailed clinical information.


You can also explore drugs or targets under clinical trials in this tab. Click the 'View' button for clinical trial information.


You can also explore GWAS-risk genes of IBD in the 'GWAS-implicated Risk Genes' tab.


The risk genes of adult IBD were retrieved from the studies listed in the table 'Major GWAS study on IBD'.


The risk genes of pediatric IBD were retrieved from this paper:

B. Huang et al., Mucosal profiling of pediatric-onset colitis and IBD reveals common pathogenics and therapeutic pathways. Cell 179, 1160-1176.e24 (2019).


Examples to demonstrate the generality of scIBD

Authors: Hu Nie & Peilu Lin

Date: 2023-03-29


Outline


  1. Explore the transcriptome features of DUOX2+ epithelial cells and compare the gene expression between health and disease

  2. Explore the gene regulatory networks of DUOX2+ epithelial cells and compare regulon activity between health and disease


1. Explore the transcriptome features of DUOX2+ epithelial cells and compare the gene expression between health and disease

First step: Select epithelial cells in the Gene Expression Profile tab


Go to the Exploration page and click the Gene Expression Profile tab.


Second step: Explore differentially expressed genes (DEGs) in the ‘Marker genes of each cell subtype’ table


Differentially expressed genes of DUOX2+ epithelial cells in the epithelial compartment, sorted by log2 fold change of average gene expression levels.


Third step: Explore DEGs of DUOX2+ epithelial cells or any gene of interest in the epithelial compartment




Fourth step: Compare gene expression of HLA-II molecules across cell subtypes in epithelial cells


Go to the Exploration page and click the Gene Expression Comparison tab.


Fifth step: Compare gene expression of HLA-II molecules in DUOX2+ epithelial cells between health and disease


Go to the Exploration page and click the Gene Expression Comparison tab.


Sixth step: Compare gene expression of HLA-II molecules in enterocytes between health and disease


Go to the Exploration page and click the Gene Expression Comparison tab.


More: Explore the developmental-stage preference of cell subtypes


Go to the Exploration page and click the Gene Expression Comparison tab.



2. Explore the gene regulatory networks of DUOX2+ epithelial cells and compare regulon activity between health and disease

First step: Explore regulons in the epithelial compartment


Go to the Exploration page and click the Regulon Activity Profile tab.


Second step: Explore cell type-specific regulons in DUOX2+ epithelial cells


Go to the Exploration page and click the Regulon Activity Profile tab.


Third step: Explore DUOX2+ epithelial-specific regulons or any regulon of interest in the epithelial compartment




Fourth step: Explore differentially activated regulons between health and disease in epithelial cells


Go to the Exploration page and click the Regulon Activity Comparison tab.


Fifth step: Compare regulon activities between health and disease in epithelial cells


Go to the Exploration page and click the Regulon Activity Comparison tab.


Sixth step: Compare regulon activities between health and disease in DUOX2+ epithelial cells


Go to the Exploration page and click the Regulon Activity Comparison tab.


Browse references

Browse the recently published datasets studying IBD

FAQ

Q1: What is scIBD?

scIBD is a platform for single-cell meta-analysis of inflammatory bowel disease (IBD) that contains ~1.14 million single cells from 12 datasets across multiple developmental stages (including fetal, pediatric, and adult), tissues from multiple anatomical regions (including blood, small intestine, and large intestine), and different disease states (healthy, inflamed UC, inflamed CD, etc.). scIBD comprises 9 major clusters (Myeloid, CD4 T cells, CD8 T cells, ILCs, B/Plasma cells, Epithelial cells, Mesenchymal cells, Endothelial cells, and Neural cells) and 101 cell subtypes. scIBD provides a multi-functional, user-friendly interface with interactive visualization for biologists to analyse the transcriptome features, gene regulatory networks, and enrichment of a given gene set in each cell subset.

Q2: What are the main features of scIBD?

We have integrated 12 datasets from multiple studies that investigate the pathology of IBD and present a comprehensive single-cell transcriptomic atlas for further study of IBD. With scIBD, users can conveniently explore signature genes of each cell subtype and compare the expression of given genes (such as therapy targets, cytokines, IBD GWAS-related genes, or others) between health and disease across major clusters or cell subtypes. Users can also explore the underlying gene regulatory networks (GRNs) of each cell subtype and compare the activities of given regulons between health and disease. IBD is caused by a complex interaction between genetic and environmental factors (such as gut microbes). Current treatments for IBD include 5-ASA, antibiotics, steroids, immunosuppressants, and biologic therapies (including anti–tumor necrosis factor [TNF] antibodies, anti–α4β7 integrin antibodies, and anti–IL12/23 antibodies). For convenience, scIBD also collects clinical trials, therapy targets, and GWAS-implicated risk genes to give a quick overview of advances in the treatment of IBD. With scIBD, users can examine both the enriched cell subsets and the gene expression profiles of the risk genes for UC and CD, or of any given gene set.

Q3: How did you perform the cell type annotations?

Preprocessing and integration of scRNA-seq datasets were performed with Scanpy. Raw count matrices of gene expression from all samples in all datasets were merged. We used a double-MAD (median absolute deviation) method to detect outliers in the number of expressed genes per cell, with a lower threshold of 1.5 and an upper threshold of 10. As a result, cells with fewer than 399 genes or more than 6,849 genes, or with > 25% mitochondrial UMI counts, were filtered out. Samples with fewer than 100 cells were dropped. We then used a two-round clustering strategy to integrate, cluster, and annotate major clusters and minor clusters.
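The double-MAD filter can be sketched as follows. This is one common formulation, in which separate MADs are computed below and above the median and each is multiplied by its own threshold; how scIBD applies the 1.5 and 10 thresholds in detail is an assumption here, and the function name and toy data are hypothetical.

```python
import statistics

def double_mad_bounds(values, lower_k=1.5, upper_k=10.0):
    """Return (low, high) cutoffs from separate left-side and right-side MADs."""
    med = statistics.median(values)
    # Absolute deviations of the values at or below / at or above the median.
    left_dev = [med - v for v in values if v <= med]
    right_dev = [v - med for v in values if v >= med]
    left_mad = statistics.median(left_dev)
    right_mad = statistics.median(right_dev)
    return med - lower_k * left_mad, med + upper_k * right_mad

# Toy data: number of expressed genes in seven cells.
genes_per_cell = [200, 800, 1000, 1200, 1500, 2000, 9000]
low, high = double_mad_bounds(genes_per_cell)
kept = [n for n in genes_per_cell if low <= n <= high]
print(low, high, kept)
```

Using two MADs lets the cutoffs be asymmetric, which suits the right-skewed distribution of genes per cell: a tight lower bound removes low-quality cells while a loose upper bound retains large, transcriptionally rich cells.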

In the first round of clustering, immunoglobulin genes, T cell receptor (TCR) genes, ribosomal protein-coding genes (gene symbols matching the string pattern “^RP[0-9]+-|[LS]”), cell-cycle genes (e.g. TOP2A, MKI67), and mitochondrial genes were removed from the combined gene expression matrix. For each cell, the UMI counts of genes were divided by the total UMI count of the cell, scaled by 1e4, and then log-transformed. The top 2,000 highly variable genes were identified across cells and samples (using the highly_variable_genes function in Scanpy v1.8.2, with n_top_genes=2000 and batch_key="sampleName"). These 2,000 highly variable genes were used for downstream analysis. The expression values of each gene were scaled to unit variance, and PCA was performed. Batch correction of datasets from multiple studies, which covered three developmental stages and multiple disease conditions, was performed with bbknn (v1.5.1, batch_key="sampleName", neighbors_within_batch=3, metric="euclidean", n_pcs=30). Dimensionality reduction (t-SNE and UMAP) and Leiden clustering (resolution=0.1) were performed, and major clusters were annotated based on canonical marker genes and differentially expressed genes (DEGs).
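The per-cell normalization described above can be illustrated in plain Python. The sketch assumes the log transform is log(1 + x), which is the convention of Scanpy's normalize_total and log1p functions; the function name and toy counts are hypothetical.

```python
import math

def normalize_cell(counts, target_sum=1e4):
    """Divide by the cell's total UMI count, scale to target_sum, then log1p."""
    total = sum(counts)
    if total == 0:
        # An empty cell has nothing to normalize.
        return [0.0 for _ in counts]
    return [math.log1p(c / total * target_sum) for c in counts]

# Toy cell with four genes and 10 UMIs in total.
cell = [1, 3, 0, 6]
normalized = normalize_cell(cell)
print([round(v, 3) for v in normalized])
```

Scaling every cell to the same total removes differences in sequencing depth, so that values are comparable between cells before highly variable genes are selected.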

In the second round, integration and clustering of each major cluster were similar to the first round. We checked the expression of canonical marker genes; subclusters expressing marker genes from other major clusters were considered doublets and dropped. Then, each major cluster was integrated and clustered again, and minor clusters were annotated based on marker genes identified from DEGs and published studies.

Contact Details

Scientific Problems

We welcome any suggestions on how to improve our database. Please feel free to contact us with feedback.

Please contact Hu Nie: niehu2021@163.com

Technical Problems

If you have any questions about the usage of scIBD or the interpretation of results, or encounter problems when using scIBD, please contact Hu Nie: niehu2021@163.com

Address

Gaoke Innovation Center, Guangqiao Road, Guangming District, Shenzhen

Institute of Cancer Research

Shenzhen Bay Laboratory

Guangdong, China

Tel: 86-10-26849285