Pipeline for multiplexed image analysis
As I’ve mentioned before, one of our team’s goals in the context of the Delta Tissue program was to develop a pipeline for analysing IMC images of TNBC. While we are still working on refining this pipeline and can’t share it just yet, I’d like to walk you through the general steps involved in analysing multiplexed images, like those generated by IMC.
Please note that while the information here is grounded in published literature, the opinions about different methods reflect my personal experience working with multiplexed images (IMC, CODEX and IHC). As I’ve emphasized throughout this section, there is no universally "correct" method or software—it all depends on the specific question being addressed and the data available. Keep this in mind as you explore this section.
I’ll update this section as soon as our pipeline is ready—stay tuned!
Preprocessing
Multiplexed image analysis relies on measuring the intensity of each marker at the pixel level. However, raw images often contain various sources of technical noise that can affect the signal counts in each channel. This noise obscures the biological signal and hinders downstream analysis. Therefore, it is crucial to "clean" the images to ensure accurate interpretation.
Noise can arise at various stages of the image acquisition process, from tissue resection to the staining procedure, and it can manifest in many forms. Additionally, each imaging modality introduces specific noise sources, such as autofluorescence in fluorescence-based methods. Below are some common sources of noise I have encountered while working with different multiplexed imaging techniques (including IMC, CODEX, and IHC):
- Hot Pixels: Defective detector elements that consistently produce high-intensity values, independent of the signal.
- Background Signal: Non-specific binding of antibodies or reagents, resulting in elevated baseline intensities.
- Autofluorescence: Intrinsic fluorescence from tissue components, particularly relevant in fluorescence-based modalities (IMC largely avoids this issue).
- Staining Heterogeneity: Variability in antibody penetration or uneven reagent distribution across the sample.
- Overstaining or Understaining: Excessive or insufficient antibody binding, leading to intensity distortions.
- Batch Effects: Variations in staining or imaging conditions between experimental or patient batches.
- Tissue Folding or Tears: Physical artifacts introduced during tissue preparation.
- Edge Effects: Signal distortion near the edges of tissue sections, often caused by uneven staining or dehydration.
- Salt-and-Pepper Noise: Random bright and dark spots due to signal corruption or improper detector calibration.
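To make the first item concrete, hot pixels are often handled by comparing each pixel to its local neighbourhood. Below is a minimal sketch of one common heuristic: replace any pixel that exceeds its local median by more than a fixed threshold. The function name, window size, and threshold are illustrative choices, not part of our pipeline.

```python
import numpy as np
from scipy.ndimage import median_filter

def remove_hot_pixels(channel, threshold=50.0):
    """Replace pixels that exceed their 3x3 local median by more than
    `threshold` with that median (a common hot-pixel heuristic)."""
    local_median = median_filter(channel, size=3)
    hot = (channel - local_median) > threshold
    cleaned = channel.copy()
    cleaned[hot] = local_median[hot]
    return cleaned

# Toy example: a flat background with one defective detector element.
img = np.full((5, 5), 10.0)
img[2, 2] = 500.0  # hot pixel
cleaned = remove_hot_pixels(img, threshold=50.0)
```

Because the correction only touches pixels flagged as outliers, the rest of the channel (and its biological signal) is left untouched, which is exactly the noise-versus-signal balance discussed below.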
These are just a few examples.
Cleaning images is an essential step in image analysis. However, care must be taken to strike a balance: removing noise without unintentionally discarding valuable biological information is critical for ensuring the accuracy and reliability of downstream analysis. That’s why our pipeline includes a dedicated, carefully designed step to tackle technical noise while preserving the integrity of the biological signal.
Segmentation
Once the images have been denoised, the next step is the segmentation of the cells. Cell segmentation is the process of identifying and delineating individual cells in an image. The goal is to assign each pixel in an image to a specific cell or to the background, enabling downstream analyses such as cell counting, morphology analysis, and spatial measurements.
Segmentation can target whole cells, nuclei, or even specific subcellular structures, such as organelles. Software tools for segmentation often combine computer vision, image processing, and machine learning techniques to detect and delineate cells. As my work primarily focuses on identifying different cell types based on cytoplasmic or membrane markers, I rely on whole-cell segmentation masks. Whole-cell segmentation is inherently more challenging than nuclear segmentation because cytoplasmic boundaries are often less distinct.
Various approaches can be employed depending on the data, imaging modality, and available markers. These include nuclear expansion, gradient-based edge detection on membrane markers, or using pre-trained or custom deep learning models to segment cells directly from raw image data. Each method has its own strengths and limitations. For instance, nuclear expansion works well when cell boundaries are not visible or when cytoplasmic markers are unavailable. However, it assumes uniform cell size and shape, which is often unrealistic in heterogeneous samples, and it may introduce errors in crowded regions or tissues with overlapping cells.
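The nuclear-expansion approach mentioned above can be sketched with a Euclidean distance transform: every background pixel is assigned to its nearest nucleus, up to a fixed radius. This mirrors what `skimage.segmentation.expand_labels` does under the hood; the toy nuclear mask and expansion radius here are purely illustrative.

```python
import numpy as np
from scipy import ndimage

def expand_nuclei(labels, distance):
    """Grow each labelled nucleus outwards by up to `distance` pixels,
    assigning every background pixel to its nearest nucleus."""
    dist, (ri, ci) = ndimage.distance_transform_edt(
        labels == 0, return_indices=True
    )
    expanded = labels[ri, ci]      # nearest-nucleus label for every pixel
    expanded[dist > distance] = 0  # stay background beyond the radius
    return expanded

# Toy labelled nuclear mask: 0 = background, 1..N = nuclei.
nuclei = np.zeros((10, 10), dtype=int)
nuclei[2:4, 2:4] = 1
nuclei[6:8, 6:8] = 2

cells = expand_nuclei(nuclei, distance=2)
```

Note how the fixed radius encodes the method's main assumption, uniform cell size: every cell mask extends the same distance from its nucleus, which is exactly why this approach struggles in heterogeneous or crowded tissue.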
It’s important to note that there is no universally "correct" method or software for segmentation—it depends on the question being addressed and the type of information available. Additionally, to my knowledge, there is no standardized metric to quantitatively assess segmentation quality apart from visual inspection, making it difficult to definitively determine which method performs best. Regardless of the approach, it is essential to perform thorough visual inspection and quality control of the segmentation to minimize issues in downstream analysis.
Cell Type Annotation
Once we have located the cells, the next step is to identify their type—a process referred to as cell phenotyping, cell-type annotation, or cell-type identification. This involves grouping cells based on their marker expression and labelling these groups according to their biological roles.
There are several approaches to achieve cell phenotyping, each with its advantages and limitations. Some of them are:
- Marker-Based Phenotyping (Gating): This approach is effective when the classification is straightforward, requiring only a binary decision based on a specific marker. For example, to identify T cells in tissue, you could use the expression of CD4 and CD8, which are exclusive to helper and cytotoxic T cells, respectively. However, overlapping or ambiguous marker expression can complicate classification and limit granularity.
- Morphological Phenotyping: This method works best when combined with molecular information. While morphology can provide clues, relying solely on it may not reliably identify cell types in 2D images. This is because tissue sections only capture slices of cells, and the appearance of those slices can vary widely. As a reminder: most of the time the 2D section of an elephant won’t look like an elephant! Always supplement morphology with other data when possible.
- Unsupervised Clustering: This is my preferred method so far. Clustering algorithms group cells based on similarities in their marker expression profiles. After clustering, the mean expression of markers within each cluster can be used to assign biological labels. This approach is highly scalable and far less time-consuming than manual gating. However, it requires some biological knowledge to interpret marker profiles and is less useful when analysing datasets with only a few markers. In such cases, marker-based phenotyping or gating may be more appropriate.
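The cluster-then-annotate workflow can be sketched in a few lines: cluster cells on their mean marker intensities, then label each cluster by inspecting its mean expression profile. The marker names, the simulated populations, and the choice of k-means are all illustrative assumptions; in practice other algorithms (e.g. graph-based clustering) are often used.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Toy cell-by-marker matrix of per-cell mean intensities.
# Hypothetical populations: CD8-high (cytotoxic T cells) vs
# CD68-high (macrophages).
t_cells = rng.normal(loc=[8.0, 1.0], scale=0.5, size=(50, 2))
macros = rng.normal(loc=[1.0, 8.0], scale=0.5, size=(50, 2))
X = np.vstack([t_cells, macros])  # columns: [CD8, CD68]

# Step 1: cluster cells by their expression profiles.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Step 2: label each cluster from its mean marker expression --
# this is where biological knowledge enters the workflow.
cluster_means = np.array([X[km.labels_ == k].mean(axis=0) for k in range(2)])
annotation = {
    k: ("T cell" if cluster_means[k, 0] > cluster_means[k, 1] else "Macrophage")
    for k in range(2)
}
cell_types = [annotation[k] for k in km.labels_]
```

The annotation step is deliberately explicit here: the algorithm only produces groups, and it is the per-cluster marker profile that a human (or a reference atlas) turns into a biological label.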
As with segmentation, there is no universally "correct" method for cell phenotyping. The best approach depends on the specific research question and the available data. First, identify your needs and the characteristics of your dataset, and then select the method that aligns best with your goals.
In any case, it is essential to perform both a visual inspection and quality control after annotating the cells.