Advanced integration tutorial

In this tutorial, we explore the integration of two multi-sample datasets using the harmonypy and scanorama libraries. The first dataset originates from a colorectal cancer study, comprising 12 tissue sections distributed across two biological replicates for each of the six samples. The details of this dataset are documented in Valdeolivas et al. 2023. The second dataset featured in this tutorial is derived from a mouse renal ischemia study, encompassing five samples collected at various time points, as outlined by Eryn E. et al. 2022.

[1]:
import scanorama
import numpy as np
import scanpy as sc
from glob import glob
import matplotlib.pyplot as plt
import chrysalis as ch

I. Colorectal Cancer

For this tutorial we are working with h5ad files that can be downloaded using this link.

[2]:
data_dir = '/mnt/c/Users/demeter_turos/PycharmProjects/chrysalis/data/crc_samples'

1.1 Preprocess samples

We start by reading the samples, performing QC and normalization, detecting spatially variable genes, and then saving the processed samples.

[3]:
samples = glob(data_dir + '/*.h5ad')

for sample in samples:

    ad = sc.read_h5ad(sample)

    # replace .uns dictionary key with the `sample_id` column to make it easier reach the H&E images
    spatial_key = list(ad.uns['spatial'].keys())[0]
    ad.uns['spatial'][ad.obs['sample_id'][0]] = ad.uns['spatial'][spatial_key]
    del ad.uns['spatial'][spatial_key]

    # we can optionally use ENSEMBL gene IDs instead of gene symbols to deal with non-unique gene identifiers
    # this circumvents issues during integration
    ad.var['gene_symbols'] = ad.var_names
    ad.var_names = ad.var['gene_ids']

    assert len(np.unique(ad.var_names)) == len(ad.var)

    # ad.var_names_make_unique()

    # filter low quality spots
    sc.pp.calculate_qc_metrics(ad, inplace=True)
    sc.pp.filter_genes(ad, min_cells=10)
    sc.pp.filter_cells(ad, min_counts=2000)
    sc.pp.filter_cells(ad, min_genes=10)

    # normalize counts
    sc.pp.normalize_total(ad, inplace=True, target_sum=1e4, exclude_highly_expressed=True)
    sc.pp.log1p(ad)

    # detect spatially variable genes
    ch.detect_svgs(ad, min_morans=0.05, min_spots=0.05)

    # save samples
    ad.write_h5ad(data_dir + f'/ch_{ad.obs["sample_id"][0]}.h5ad')
Calculating SVGs: 100%|██████████| 12285/12285 [01:25<00:00, 143.85it/s]
Calculating SVGs: 100%|██████████| 11955/11955 [01:30<00:00, 131.57it/s]
Calculating SVGs: 100%|██████████| 12191/12191 [01:27<00:00, 139.06it/s]
Calculating SVGs: 100%|██████████| 12495/12495 [01:08<00:00, 183.03it/s]
Calculating SVGs: 100%|██████████| 12966/12966 [00:47<00:00, 272.88it/s]
Calculating SVGs: 100%|██████████| 11224/11224 [00:58<00:00, 193.35it/s]
Calculating SVGs: 100%|██████████| 12022/12022 [01:03<00:00, 190.16it/s]
Calculating SVGs: 100%|██████████| 12346/12346 [00:19<00:00, 628.97it/s]
Calculating SVGs: 100%|██████████| 11769/11769 [00:56<00:00, 206.85it/s]
Calculating SVGs: 100%|██████████| 13264/13264 [01:08<00:00, 194.32it/s]
Calculating SVGs: 100%|██████████| 12280/12280 [00:18<00:00, 675.07it/s]
Calculating SVGs: 100%|██████████| 11580/11580 [00:36<00:00, 315.34it/s]

1.2 Create an integrated AnnData object

Next we read back the saved samples and take a look at the H&E.

[4]:
ch_samples = glob(data_dir + '/ch_*.h5ad')

# read AnnData objects into a list
adatas = [sc.read_h5ad(ad) for ad in ch_samples]
# get the sample names stored from the 'sample_id' column
sample_names = [ad.obs["sample_id"][0] for ad in adatas]

# look at the H&Es
plt.rcParams['figure.dpi'] = 60
fig, ax = plt.subplots(3, 4, figsize=(4 * 4, 3 * 4))
ax = ax.flatten()
for a in ax:
    a.axis('off')
for idx, ad in enumerate(adatas):
    sc.pl.spatial(ad, color=None, size=1.5, alpha=0,
                  ax=ax[idx], show=False, cmap='viridis')
    ax[idx].set_title(f'{ad.obs["sample_id"][0]}')
plt.tight_layout()
plt.show()
../_images/tutorials_advanced_integration_tutorial_7_0.png

Next, we can use ch.plot_svg_matrix to visualize the SVG overlap matrix. By examining the pairwise overlaps between spatially variable genes across samples, we can gain insights into the expected variability in tissue compartments. This analysis also enables the identification of significant overlaps, particularly between biological replicates.

[5]:
# plot spatially variable gene overlaps
plt.rcParams['figure.dpi'] = 80
ch.plot_svg_matrix(adatas, figsize=(8, 7), obs_name='sample_id', cluster=True)
plt.show()
../_images/tutorials_advanced_integration_tutorial_9_0.png

To initiate the integration process, simply invoke ch.integrate_adatas on the list of AnnData objects. This function concatenates the samples into a single AnnData instance, creating an outer-joined SVG list.

[6]:
# concatenate AnnData objects
adata = ch.integrate_adatas(adatas, sample_names, sample_col='ch_sample_id')

print(adata)
AnnData object with n_obs × n_vars = 18146 × 12438
    obs: 'in_tissue', 'array_row', 'array_col', 'ground_truth', 'sample_id', 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts', 'pct_counts_in_top_50_genes', 'pct_counts_in_top_100_genes', 'pct_counts_in_top_200_genes', 'pct_counts_in_top_500_genes', 'n_counts', 'n_genes', 'ch_sample_id'
    var: 'gene_ids', 'feature_types', 'genome', 'gene_symbols', 'n_cells_by_counts', 'mean_counts', 'log1p_mean_counts', 'pct_dropout_by_counts', 'total_counts', 'log1p_total_counts', 'n_cells', "Moran's I", 'spatially_variable'
    uns: 'spatial'
    obsm: 'spatial'
    varm: 'spatially_variable', "Moran's I"

We perform PCA here without integration and save the results to 'chr_X_pca_uncorrected' in .obsm in order to compare the results later.

[7]:
# replace ENSEMBL IDs with the gene symbols and make them unique
adata.var_names = list(adata.var['gene_symbols'])
adata.var_names_make_unique()

# run PCA
ch.pca(adata, n_pcs=50)
# save uncorrected PCs to compare the results later
adata.obsm['chr_X_pca_uncorrected'] = adata.obsm['chr_X_pca'].copy()

1.3 Integration with Harmony

We will use ch.harmony_integration to run harmonypy on the PCA embeddings. This will overwrite adata.obsm['chr_X_pca'] unless specified otherwise.

[8]:
# perform harmony integration
ch.harmony_integration(adata, 'ch_sample_id', random_state=42, block_size=0.05)

# set 'sample_id' as categorical variable for plotting
# adata.obs['ch_sample_id'] = adata.obs['ch_sample_id'].astype('category')

plt.rcParams['figure.dpi'] = 100
ch.plot_explained_variance(adata)
plt.show()
2024-01-08 09:33:03,056 - harmonypy - INFO - Computing initial centroids with sklearn.KMeans...
2024-01-08 09:33:10,982 - harmonypy - INFO - sklearn.KMeans initialization complete.
2024-01-08 09:33:11,079 - harmonypy - INFO - Iteration 1 of 10
2024-01-08 09:33:15,313 - harmonypy - INFO - Iteration 2 of 10
2024-01-08 09:33:23,389 - harmonypy - INFO - Iteration 3 of 10
2024-01-08 09:33:30,830 - harmonypy - INFO - Iteration 4 of 10
2024-01-08 09:33:34,859 - harmonypy - INFO - Converged after 4 iterations
../_images/tutorials_advanced_integration_tutorial_15_1.png

1.4 Integration with Scanorama

To run scanorama, we can utilize the correct_scanpy function. This function operates on a list of AnnData objects and produces a list containing the corrected expression matrices, so we perform this step prior to running ch.integrate_adatas. Lastly, we copy the PCA embedding to the main AnnData object (adata).

[9]:
# dropping the 'ch_sample_id' as it was already added by running 'ch.integrate_adatas' in the previous cell.
for ad in adatas:
    ad.obs.drop(columns=['ch_sample_id'], inplace=True)

scr_adata = scanorama.correct_scanpy(adatas, return_dimred=True)
scr_adata = ch.integrate_adatas(scr_adata, sample_col='ch_sample_id')

# replace ENSEMBL IDs with the gene symbols and make them unique
scr_adata.var_names = list(scr_adata.var['gene_symbols'])
scr_adata.var_names_make_unique()

# run PCA
ch.pca(scr_adata, n_pcs=50)
# save uncorrected PCs
adata.obsm['chr_X_pca_scanorama'] = scr_adata.obsm['chr_X_pca'].copy()
Found 12438 genes among all datasets
[[0.00000000e+00 8.73689010e-01 0.00000000e+00 0.00000000e+00
  8.44594595e-03 4.88102502e-03 9.25925926e-03 0.00000000e+00
  7.58533502e-03 1.45918833e-02 0.00000000e+00 0.00000000e+00]
 [0.00000000e+00 0.00000000e+00 9.16590284e-04 0.00000000e+00
  2.21283784e-01 1.95241001e-02 3.76639865e-02 0.00000000e+00
  3.79266751e-03 3.62017804e-02 0.00000000e+00 0.00000000e+00]
 [0.00000000e+00 0.00000000e+00 0.00000000e+00 9.09504838e-01
  0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
  0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00]
 [0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
  2.10586227e-02 1.22025625e-03 1.70745589e-03 0.00000000e+00
  6.32111252e-04 5.93471810e-04 7.31707317e-02 0.00000000e+00]
 [0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
  0.00000000e+00 1.00061013e-01 1.05574324e-01 8.71391076e-01
  2.78128951e-02 3.04054054e-02 5.18292683e-02 2.95275591e-03]
 [0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
  0.00000000e+00 0.00000000e+00 7.87065284e-02 1.09823063e-02
  9.05815424e-01 5.34124629e-03 0.00000000e+00 0.00000000e+00]
 [0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
  0.00000000e+00 0.00000000e+00 0.00000000e+00 5.24934383e-02
  1.95954488e-02 7.98219585e-01 6.09756098e-03 4.13943355e-02]
 [0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
  0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
  5.24934383e-03 5.24934383e-02 4.57317073e-02 3.14960630e-02]
 [0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
  0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
  0.00000000e+00 1.10619469e-01 0.00000000e+00 2.95275591e-03]
 [0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
  0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
  0.00000000e+00 0.00000000e+00 3.26219512e-01 5.01968504e-02]
 [0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
  0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
  0.00000000e+00 0.00000000e+00 0.00000000e+00 9.48170732e-01]
 [0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
  0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
  0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00]]
Processing datasets (10, 11)
Processing datasets (2, 3)
Processing datasets (5, 8)
Processing datasets (0, 1)
Processing datasets (4, 7)
Processing datasets (6, 9)
Processing datasets (9, 10)
Processing datasets (1, 4)
Processing datasets (8, 9)
Processing datasets (4, 6)
Processing datasets (4, 5)

1.6 Visualization

Now we can run ch.aa and ch.plot_samples to visualize the tissue compartments and inspect the results of the integration.

[10]:
# uncorrected
ch.aa(adata, n_pcs=20, n_archetypes=10, pca_key='chr_X_pca_uncorrected')

plt.rcParams['figure.dpi'] = 60
ch.plot_samples(adata, rows=3, cols=4, dim=10, suptitle='Uncorrected', sample_col='ch_sample_id', show_title=True)
../_images/tutorials_advanced_integration_tutorial_19_0.png
[12]:
# harmony integration
ch.aa(adata, n_pcs=20, n_archetypes=10)

plt.rcParams['figure.dpi'] = 60
ch.plot_samples(adata, rows=3, cols=4, dim=10, suptitle='Harmony', sample_col='ch_sample_id', show_title=True)
../_images/tutorials_advanced_integration_tutorial_20_0.png
[13]:
# scanorama integration
ch.aa(adata, n_pcs=20, n_archetypes=10, pca_key='chr_X_pca_scanorama')

plt.rcParams['figure.dpi'] = 60
ch.plot_samples(adata, rows=3, cols=4, dim=10, suptitle='Scanorama', sample_col='ch_sample_id', show_title=True)
../_images/tutorials_advanced_integration_tutorial_21_0.png

1.7 PCA plot

We can also take a look at the PCA embeddings to inspect the alignment.

[14]:
plt.rcParams['figure.dpi'] = 80

fig, axs = plt.subplots(1, 3, figsize=(12, 4))

axs[0].scatter(x=adata.obsm['chr_X_pca_uncorrected'][:, 0], y=adata.obsm['chr_X_pca_uncorrected'][:, 1],
            rasterized=True, s=4, c=adata.obs['ch_sample_id'].cat.codes, cmap='tab20')
axs[0].set_title('Uncorrected')
axs[0].set_xlabel('PC 0')
axs[0].set_ylabel('PC 1')

axs[1].scatter(x=adata.obsm['chr_X_pca'][:, 0], y=adata.obsm['chr_X_pca'][:, 1],
            rasterized=True, s=4, c=adata.obs['ch_sample_id'].cat.codes, cmap='tab20')
axs[1].set_title('Harmony')
axs[1].set_xlabel('PC 0')
axs[1].set_ylabel('PC 1')

axs[2].scatter(x=adata.obsm['chr_X_pca_scanorama'][:, 0], y=adata.obsm['chr_X_pca_scanorama'][:, 1],
            rasterized=True, s=4, c=adata.obs['ch_sample_id'].cat.codes, cmap='tab20')
axs[2].set_title('Scanorama')
axs[2].set_xlabel('PC 0')
axs[2].set_ylabel('PC 1')

plt.tight_layout()
plt.show()

../_images/tutorials_advanced_integration_tutorial_23_0.png

II. Renal Ischemia

In this section we integrate the mouse renal ischemia dataset. Input files can be downloaded from here.

[15]:
data_dir = '/mnt/c/Users/demeter_turos/PycharmProjects/chrysalis/data/renal_ischemia'
[16]:
samples = glob(data_dir + '/*.h5ad')

for sample in samples:

    ad = sc.read_h5ad(sample)

    # replace .uns dictionary key with the `sample_id` column to make it easier reach the H&E images
    spatial_key = list(ad.uns['spatial'].keys())[0]
    ad.uns['spatial'][ad.obs['sample_id'][0]] = ad.uns['spatial'][spatial_key]
    del ad.uns['spatial'][spatial_key]

    # we can optionally use ENSEMBL gene IDs instead of gene symbols to deal with non-unique gene identifiers
    # this circumvents issues during integration
    ad.var['gene_symbols'] = ad.var_names
    ad.var_names = ad.var['gene_ids']

    assert len(np.unique(ad.var_names)) == len(ad.var)

    # ad.var_names_make_unique()

    # filter low quality spots
    sc.pp.calculate_qc_metrics(ad, inplace=True)
    sc.pp.filter_genes(ad, min_cells=10)
    sc.pp.filter_cells(ad, min_counts=2000)
    sc.pp.filter_cells(ad, min_genes=500)

    # normalize counts
    sc.pp.normalize_total(ad, inplace=True, target_sum=1e4, exclude_highly_expressed=True)
    sc.pp.log1p(ad)

    # detect spatially variable genes
    ch.detect_svgs(ad, min_morans=0.05, min_spots=0.05)

    # save samples
    ad.write_h5ad(data_dir + f'/ch_{ad.obs["sample_id"][0]}.h5ad')
Calculating SVGs: 100%|██████████| 7325/7325 [00:42<00:00, 173.06it/s]
Calculating SVGs: 100%|██████████| 9705/9705 [00:57<00:00, 168.30it/s]
Calculating SVGs: 100%|██████████| 11019/11019 [01:20<00:00, 136.71it/s]
Calculating SVGs: 100%|██████████| 11283/11283 [00:57<00:00, 196.33it/s]
Calculating SVGs: 100%|██████████| 9958/9958 [00:50<00:00, 197.38it/s]
[17]:
ch_samples = glob(data_dir + '/ch_*.h5ad')

# read AnnData objects into a list
adatas = [sc.read_h5ad(ad) for ad in ch_samples]
# get the sample names stored from the 'sample_id' column
sample_names = [ad.obs["sample_id"][0] for ad in adatas]

# look at the H&Es
plt.rcParams['figure.dpi'] = 60
fig, ax = plt.subplots(2, 3, figsize=(3 * 4, 2 * 4))
ax = ax.flatten()
for a in ax:
    a.axis('off')
for idx, ad in enumerate(adatas):
    sc.pl.spatial(ad, color=None, size=1.5, alpha=0,
                  ax=ax[idx], show=False, cmap='viridis')
    ax[idx].set_title(f'{ad.obs["sample_id"][0]}')
plt.tight_layout()
plt.show()
../_images/tutorials_advanced_integration_tutorial_27_0.png
[18]:
# plot spatially variable gene overlaps
plt.rcParams['figure.dpi'] = 80
ch.plot_svg_matrix(adatas, figsize=(8, 7), obs_name='sample_id', cluster=True)
plt.show()
../_images/tutorials_advanced_integration_tutorial_28_0.png
[19]:
# concatenate AnnData objects
adata = ch.integrate_adatas(adatas, sample_names, sample_col='ch_sample_id')

print(adata)
AnnData object with n_obs × n_vars = 8771 × 12217
    obs: 'in_tissue', 'array_row', 'array_col', 'sample_id', 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts', 'pct_counts_in_top_50_genes', 'pct_counts_in_top_100_genes', 'pct_counts_in_top_200_genes', 'pct_counts_in_top_500_genes', 'n_counts', 'n_genes', 'ch_sample_id'
    var: 'gene_ids', 'feature_types', 'genome', 'gene_symbols', 'n_cells_by_counts', 'mean_counts', 'log1p_mean_counts', 'pct_dropout_by_counts', 'total_counts', 'log1p_total_counts', 'n_cells', "Moran's I", 'spatially_variable'
    uns: 'spatial'
    obsm: 'spatial'
    varm: 'spatially_variable', "Moran's I"
[20]:
# replace ENSEMBL IDs with the gene symbols and make them unique
adata.var_names = list(adata.var['gene_symbols'])
adata.var_names_make_unique()

# run PCA
ch.pca(adata, n_pcs=50)
# save uncorrected PCs to compare the results later
adata.obsm['chr_X_pca_uncorrected'] = adata.obsm['chr_X_pca'].copy()
[21]:
# harmony integration
ch.harmony_integration(adata, 'ch_sample_id', random_state=42, block_size=0.05)

# set 'sample_id' as categorical variable for plotting
# adata.obs['ch_sample_id'] = adata.obs['ch_sample_id'].astype('category')

plt.rcParams['figure.dpi'] = 100
ch.plot_explained_variance(adata)
plt.show()
2024-01-08 09:50:39,647 - harmonypy - INFO - Computing initial centroids with sklearn.KMeans...
2024-01-08 09:50:43,533 - harmonypy - INFO - sklearn.KMeans initialization complete.
2024-01-08 09:50:43,580 - harmonypy - INFO - Iteration 1 of 10
2024-01-08 09:50:45,973 - harmonypy - INFO - Iteration 2 of 10
2024-01-08 09:50:47,773 - harmonypy - INFO - Iteration 3 of 10
2024-01-08 09:50:51,101 - harmonypy - INFO - Iteration 4 of 10
2024-01-08 09:50:53,032 - harmonypy - INFO - Converged after 4 iterations
../_images/tutorials_advanced_integration_tutorial_31_1.png
[22]:

# dropping the 'ch_sample_id' as it was already added by running 'ch.integrate_adatas' in the previous cell. for ad in adatas: ad.obs.drop(columns=['ch_sample_id'], inplace=True) # scanorama integration scr_adata = scanorama.correct_scanpy(adatas, return_dimred=True) scr_adata = ch.integrate_adatas(scr_adata, sample_col='ch_sample_id') # replace ENSEMBL IDs with the gene symbols and make them unique scr_adata.var_names = list(scr_adata.var['gene_symbols']) scr_adata.var_names_make_unique() # run PCA ch.pca(scr_adata, n_pcs=50) # save uncorrected PCs adata.obsm['chr_X_pca_scanorama'] = scr_adata.obsm['chr_X_pca'].copy()
Found 12217 genes among all datasets
[[0.         0.38012959 0.07921999 0.06276661 0.06827048]
 [0.         0.         0.46652268 0.05714286 0.01170351]
 [0.         0.         0.         0.49142857 0.11768531]
 [0.         0.         0.         0.         0.71846554]
 [0.         0.         0.         0.         0.        ]]
Processing datasets (3, 4)
Processing datasets (2, 3)
Processing datasets (1, 2)
Processing datasets (0, 1)
Processing datasets (2, 4)
[23]:
# uncorrected
ch.aa(adata, n_pcs=20, n_archetypes=10, pca_key='chr_X_pca_uncorrected')

plt.rcParams['figure.dpi'] = 60
ch.plot_samples(adata, rows=2, cols=3, dim=10, suptitle='Uncorrected', sample_col='ch_sample_id', show_title=True, rotation=30)
../_images/tutorials_advanced_integration_tutorial_33_0.png
[24]:
# harmony integration
ch.aa(adata, n_pcs=20, n_archetypes=10)

plt.rcParams['figure.dpi'] = 60
ch.plot_samples(adata, rows=2, cols=3, dim=10, suptitle='Harmony', sample_col='ch_sample_id', show_title=True, rotation=30)
../_images/tutorials_advanced_integration_tutorial_34_0.png
[25]:
# scanorama integration
ch.aa(adata, n_pcs=20, n_archetypes=10, pca_key='chr_X_pca_scanorama')

plt.rcParams['figure.dpi'] = 60
ch.plot_samples(adata, rows=2, cols=3, dim=10, suptitle='Scanorama', sample_col='ch_sample_id', show_title=True, rotation=30)
../_images/tutorials_advanced_integration_tutorial_35_0.png
[26]:
plt.rcParams['figure.dpi'] = 80

fig, axs = plt.subplots(1, 3, figsize=(12, 4))

axs[0].scatter(x=adata.obsm['chr_X_pca_uncorrected'][:, 0], y=adata.obsm['chr_X_pca_uncorrected'][:, 1],
            rasterized=True, s=4, c=adata.obs['ch_sample_id'].cat.codes, cmap='tab10')
axs[0].set_title('Uncorrected')
axs[0].set_xlabel('PC 0')
axs[0].set_ylabel('PC 1')

axs[1].scatter(x=adata.obsm['chr_X_pca'][:, 0], y=adata.obsm['chr_X_pca'][:, 1],
            rasterized=True, s=4, c=adata.obs['ch_sample_id'].cat.codes, cmap='tab10')
axs[1].set_title('Harmony')
axs[1].set_xlabel('PC 0')
axs[1].set_ylabel('PC 1')

axs[2].scatter(x=adata.obsm['chr_X_pca_scanorama'][:, 0], y=adata.obsm['chr_X_pca_scanorama'][:, 1],
            rasterized=True, s=4, c=adata.obs['ch_sample_id'].cat.codes, cmap='tab10')
axs[2].set_title('Scanorama')
axs[2].set_xlabel('PC 0')
axs[2].set_ylabel('PC 1')

plt.tight_layout()
plt.show()
../_images/tutorials_advanced_integration_tutorial_36_0.png