Auto CyCIF Project

This project is not completed and is a work in progress

This code is part of an ongoing project to create a microscope to automatically execute the CyCIF multiplex immunostaining method on several slides at once. This project fuses a custom precision engineered microscope from ASI with a few open source FDM 3D printed parts built into it with a raspberry pi to control them. Once this project is completed, all plans will be made available here and plans for the microscope will be on file and available for order from ASI. The goal of this project is not to make a super cheap, open source auto CyCIF microscope. Many of the projects that I have seen in the past require too much time commmitment and skill to execute. This microscope isn’t super cheap, but it also isn’t expensive. Its a price point that is reasonable for many individual labs and cores to purchase ($50k). The 3D printed components are simple for anyone to make and were printed on a Prusa i3Mk3S 3D printer with PLA filament. ## What is CyCIF? CyCIF is an affordable open source method of multiplex imaging that was developed by the Sorger lab at Harvard Medical School. Its basis of operation is: 1. Stain sample with DAPI or Hoechst (light stain) + 3 other antibodies directly conjugated to green, red and far red emitting dyes. These dyes must be cyanide based for the method to work properly. Currently, the best dyes that are commonly used are Alexa 488, Alexa 555 and Alexa 647. Alexa 647 is virtually a perfect dye for this method, Alexa 488 works pretty well and Alexa 555 works well for the method, but isn’t commonly available directly conjugated and leaves a bit to be desired from a fluorophore performance point of view. Never use Alexa 568. Its a great dye to image with, but is awful for this method. It just doesn’t bleach. DAPI’s only real use in this method to be an aligment marker because inbetween cycles, you will never get the sample back to its exact previous place perfectly, so DAPI is in every imaging cycle and provides a constant to align each cycle against. 2. Image sample with microscope (tiled/montage typically for whole tissue imaging, but in theory works with any microscope technology and imaging method such as z-stack although software doesn’t exist to handle z-stacks quite yet). 3. Bleach sample with solutoin that mainly contains hydrogen peroxide. This solution destroys the fluorescence of all dyes listed in step 1, but leaves DAPI relatively unbleached. 4. Restain with same dyes, but conjugated to different antibodies. 5. Repeat steps 2-4, for as many dyes as desired. 6. Use Ashlar software designed by the Sorger Lab to align all images from different cycles, stitch them together and output a new stitched image with every stained antibody contained in it.

What does the Microscope do?

The CyCIF method works great, but unfortunatley, it is a labor intensive and time consuming process. Most of the difficulty in the process is getting your sample back to the same spot (Ashlar can compensate, but you must at least be close to the same spot). With microsocpes that most people have access to, it requires etching a fiducial mark and then knowing the coordinates from that mark to the center of your tile image. The bleaching cycles are not hard, but do consume time. They consist of placing the slide into a bleaching solution with UV light on it to accelerate it, then wash and restain it. That process is around 1.5-2 hrs long to get it back on the microscope. Now say you want 24 antibodies stained? Its easy to see how it would be tedious and take several days to do a single slide. Slides may be done in parallel, but then this becomes even more a time consuming task. With this in mind, here is an overview of the oder of operations that the microscope will do. 1. User does first stain cycle manually, mounts up to 4 slides onto stage and manually defines the bounded tiled regions of each tissue section on each slide to the microscope. 2. The microscope goes to each region, autofocuses on a subset of tiles to get uneven parts of the tissue in focus and images each region. 3. Flood slide imaging chamber with bleaching solution and turn on UV leds. 4. Wash slides in chamber. 5. Drive autopipettor over each tissue section and deposit the next cycle’s stains. 6. Wait 1 hour and then wash slides again. 7. Run IR reflection autofocus to compensate for intercycle drift. 8. Image each region again. 9. Repeat steps 3-8 until all cycles are completed.

How does it Accomplish this?

  1. Since each tissue needs restained, cover glass cannot be used. This makes tissue proned to drying out during the imaging process. Additionally, we want to not have to remove slides for bleaching and staining cycles, so we would like to keep the slide in its exact same position which would elimate the need for tedious cycle alignment and fiducial marks. To solve these issues and increase the resolution via high NA objectives, we are employing an upright, water dipping objective style microscope.

  2. Micromanager 2.0 gamma is used to control the microscope.

  3. All 3D printed add on devices are controlled with a raspberry 3 B+ with python code.

  4. Micromanager is coded in java and thus I cannot use python to control both it and my raspberry pi controlled devices. To bridge this gap, pycromanager was employed. This python package created by Henry Pinkard translates python code into java code that micromanager understands.

  5. The raspberry pi is a separate device from the computer that runs the microscope. They must comunicate with each other, preferably via wifi. MQTT protocol was decided on to bridge this communication gap. This python package effectively allows the microscope computer to send instructions to the raspberry pi, which then a python program on the raspberry pi is programmed to execute certain task when it gets certain messages from the computer.

  6. The pump system is referred to as the octopump system. Its a versatile, simple python controlled peristaltic pump unit (can be used for much more than just this application and in fact I, myself use it for much more). This pump system is based upon Adafruit’s Motor Shield which is capable of driving 4 dc motors. A 3D printed housing for the motors and raspberry pi has been created too.

  7. The sample holder is a plate of glass with a 3D printed rim affixed to it that allows for liquids to be pumped into it and drained from it. It also allows for the liquids to be able to form a thick enough layer for the objectives to be able to dip into it.

  8. The same raspberry pi that runs the octopump system, runs the robotic pipettor. This pipettor has no arm, but instead uses the microscope stage to navigate itself over the tissue to be restained. Pycromanager allows us to find the center of the user defined tissue regions and adding in the known offset from the objective to the pipettor, we can tell the computer exactly where to drive the stage to. The pipettor is built around a revolver style with BD 1mL syringes loaded into it. Both the revolver location and syringe priming are determined with hall effect sensors.

  9. Ashlar is used post-acquistion to align and stitch all images. In our case, this is performed in high speed cluster computers, but in principle, any computer can be used for it.

Code Breakdown

Below here, we will dive into each function to understand what its point is and what it is doing.

lets import everything that we will need

[1]:
import math
import os

import matplotlib.pyplot as plt
import numpy as np
from tifffile import imread, imwrite

from pycromanager import Acquisition, Bridge, Dataset, multi_d_acquisition_events

Since I am using the microscope’s to move itself underneath my robotic pipettor, I need to find the center of the tissue. This method looks at all points in a user generated micromagellan surface, find the largest and lowest values of x and y stage coordinates and then find the center of those. This will be very close to the true center of the tissue.

[ ]:
def tissue_center(mag_surface):
    surface_points = {}
    interp_points = (
        mag_surface.get_points()
    )  # pull all points that are contained in micromagellan surface
    x_temp = []
    y_temp = []
    z_temp = []

    for i in range(interp_points.size()):
        point = interp_points.get(i)
        x_temp.append(point.x)
        y_temp.append(point.y)
        z_temp.append(point.z)
    surface_points["x"] = x_temp  ## put all points in dictionary to ease use
    surface_points["y"] = y_temp
    surface_points["z"] = z_temp
    x_middle = (max(surface_points["x"]) - min(surface_points["x"])) / 2 + min(
        surface_points["x"]
    )  # finds center of tissue in X and below in Y
    y_middle = (max(surface_points["y"]) - min(surface_points["y"])) / 2 + min(
        surface_points["y"]
    )
    xy_pos = list((x_middle, y_middle))
    return xy_pos

The next part is very easy. How do I now go to the center? In the case with the pipettor, it will be center+offset

[ ]:
core.set_xy_position(
    tissue_center(surface)[0], tissue_center(surface)[1]
)  # goto center of tissue

In order to auto make acquistion events, we need to know exactly how many surfaces the users created. This method can tell us, but it needs the default naming scheme, i.e. New Surface 1, New Surface 2, …

[ ]:
def num_surfaces_count():
    x = 1
    while magellan.get_surface("New Surface " + str(x)) != None:
        x += 1
    return x - 1

This function sets all the acquistion settings for each tissue section. It starts off by seeing if there are less, equal or more acquistion events than surfaces. Since I want them in a 1:1 ratio, it deletes any number of events greater than the surfaces, adds if they are less and does nothing if they equal each other. It then names each one tissue_x, where is an integer and sets everything with respect to the surface, i.e. tissue_1 is based all off of surface_1 and so on. Settings can be easily added on or taken away as desired.

[ ]:
def acq_settings_set():
    surface_count = num_surfaces_count()
    y = 1
    while magellan.get_acquisition_settings(y) != None:
        y += 1
    if y < surface_count:
        for z in range(y, surface_count):
            magellan.create_acquisition_settings()
    elif y > surface_count:
        for z in range(surface_count, y):
            magellan.remove_acquisition_settings(z)
    else:
        for x in range(0, surface_count):
            acq_settings = magellan.get_acquisition_settings(x)
            acq_settings.set_acquisition_name("tissue_" + str(x + 1))
            acq_settings.set_acquisition_space_type("2d_surface")
            acq_settings.set_xy_position_source("New Surface " + str(x + 1))
            acq_settings.set_surface("New Surface " + str(x + 1))
            acq_settings.set_bottom_surface("New Surface " + str(x + 1))
            acq_settings.set_top_surface("New Surface " + str(x + 1))

We would like no user interaction with the samples, so we must auto expose the samples. This simple algorithm goes to the center of each tissue and uses the magellan surface z at that point. It then takes a snap, sees how high the max intensity is, sees the exposure time and adjusts it to get the max intensity to 30% of the max value. The way it is defined, we must include our own channel names within the method (DAPI, FITC, etc). This isnt as exact as multi exposure algorithms, but its quick as satisfactory for this purpose. In the future, LED intensity may be changed as well in order to take image more rapidily. Another change could be to disregaurd the top 0.3% of pixels (its an arbitary number). The 4096 is valid for 12 bit cameras like what I am using.

[ ]:
def auto_expose(magellan_surface):

    xy, z = tissue_center(magellan_surface)
    x = xy[0]
    y = xy[1]
    z = z[0]
    channels = ["DAPI", "FITC", "Texas_Red", "Alexa_647"]

    core.set_xy_position(x, y)  # goto center of tissue
    core.set_stage_position = z
    exposure = np.array([])
    for x in range(0, len(channels)):
        core.set_config("Channel", channels[x])
        core.snap_image()
        tagged_image = core.get_tagged_image()
        pixels = np.reshape(
            tagged_image.pix,
            newshape=[tagged_image.tags["Height"], tagged_image.tags["Width"]],
        )
        exposure = np.append(
            exposure, 0.3 / (int(np.max(pixels)) / 4096) * core.get_exposure()
        )
    return exposure

At the end of all acquistions, we have a series of folders with images that micromagellan outputted. The way micromagellan organizes this is to for each surface, it creates a folder with that contain multiple resolutions and within that is a single multipage tiff file. This file has one page per tile, channel, etc. In this case, its just tile and channel. While this is convenient, it does not mesh with the required input organization for Ashlar. The scheme that Ashlar requires is each tile is its own separate tiff with the naming scheme being consistent. I chose the scheme to be tile number, channel number, cycle number and placed it within its own folder that denoted the tissue number. For example one file within the folder ashlar_input_tissue_1 was called tile1_channel1_cycle2.

This function looks into the directory that micromagellan created, sees how many cycles exist for each tissue (it knows the naming scheme is tissue_1, tissue_2, etc. so it uses the Filter method to find all file names that contain the sub string tissue_1 or tissue_2, etc. and then the number of files = number of cycles). Then it creates an overall directory for that tissue number (i.e. ashlar_input_tissue_1) and then proceeds to load in the multipage full resolution tiff file and break it apart. The pattern that micromagellan uses is if you have x number of channels, the first x pages are tile 1 for channel 1, tile 1 for channel 2, … tile 1 for channel x. Then the next x pages are tile 2 for channel 1, tile 2 for channel 2, … tile 2 for channel x and so on for every tile that you have. Knowing this pattern, this function breaks each page apart and makes a new tiff file with the appropriate name such as tile1_channel1_cycle2.

[ ]:
def Filter(string, substr):
    return [str for str in string if any(sub in str for sub in substr)]


def ashlar_input_file_organizer(saving_directory):
    all_files = os.listdir(saving_directory)
    num_surfaces = 1
    for x in range(0, num_surfaces):
        single_tissue_files = Filter(all_files, "tissue_" + str(x + 1))
        tissue_ashlar_directory = (
            saving_directory + "\\" + "ashlar_input_" + str(single_tissue_files[0])
        )
        os.mkdir(tissue_ashlar_directory)
        for y in range(0, len(single_tissue_files)):
            image = imread(
                saving_directory
                + "\\"
                + str(single_tissue_files[y])
                + "\\"
                + "full resolution"
                + "\\"
                + str(single_tissue_files[0] + "_MagellanStack.tif")
            )
            max_page = image.shape[0]
            cycle_num = y + 1
            channel_num = 1
            tile_num = 1
            tiff_page_counter = 0
            for page in range(0, max_page):
                temp_image = image[tiff_page_counter]
                imwrite(
                    tissue_ashlar_directory
                    + r"\tile"
                    + str(tile_num)
                    + "_channel"
                    + str(channel_num)
                    + "_cycle"
                    + str(cycle_num)
                    + ".tif",
                    temp_image,
                )
                channel_num += 1
                tiff_page_counter += 1
                if channel_num == 6:
                    channel_num = 1
                    tile_num += 1
                if tile_num == 10:
                    tile_num = 1

When imaging tissue slices at higher NAs, it becomes apparent that the tissue can be wavy. It never is very dramatic, but enough that if only one focal plane is chosen, then some will be crisp and some will be a touch blurry. Autofocusing on each tile can solve this, but is massively time consuming and in many cases takes significantly more time than the acquistion of the entire tiled image itself. To combat this, we are going to leverage micromagellan’s surface interpolation feature. With this feeature, if we feed it a few points with accurate in focus z positions, it will generate a surface that takes them into account and interpolate the rest. As long as the transistions are gentle, we can give it much less points than the number of tiles and still get everything in focus. To accomplish this, we need a method that is capable of looking at the magellan surface, find how many tiles are needed in each dimension, determine what tiles in the grid are suitable for subsampling with autofocus and then convert those tile locations into actual XY coordinates for the stage to use. The below method allows for a sub sampling % to be defined in order to tune speed and focus quality in indiviual sample. It performs a uniformly spaced, isotropic subsample of the entire region. In other words, it spreads out sampling points evenly. Due to build in roundings within the method, it cannot deliver the exact sub sample % desired, but it does output the actual sub sample % used. Tweaking the number can get you closer to the desired sub sampling %.

[ ]:
def bound_region2tile(tile_overlap):
    max_x = 7000
    min_x = 200
    max_y = 4000
    min_y = 500
    width = 1344
    height = 1024
    x_length = max_x - min_x
    y_length = max_y - min_y
    tiles_x = math.ceil(x_length / width * (1 - tile_overlap / 100))
    tiles_y = math.ceil(y_length / height * (1 - tile_overlap / 100))
    center_coords_x = x_length / 2 + min_x
    center_coords_y = y_length / 2 + min_y

    if tiles_x % 2 == 0:  # odd or even check. % 2 is remainder after a division by 2
        starting_x_point = (width * (1 - tile_overlap / 100)) / 2 + center_coords_x
        x_points = np.array(starting_x_point)
        for y in range(0, int(tiles_x / 2 - 1)):
            x_points = np.insert(
                x_points,
                0,
                (y + 1) * width * (1 - tile_overlap / 100) + starting_x_point,
            )
        for y in range(0, int(tiles_x / 2)):
            x_points = np.append(
                x_points, -(y + 1) * width * (1 - tile_overlap / 100) + starting_x_point
            )
    elif tiles_x % 2 != 0:
        starting_x_point = center_coords_x
        x_points = np.array(starting_x_point)
        for y in range(0, int(math.floor(tiles_x / 2))):
            x_points = np.insert(
                x_points,
                0,
                (y + 1) * width * (1 - tile_overlap / 100) + starting_x_point,
            )
        for y in range(0, int(math.floor(tiles_x / 2))):
            x_points = np.append(
                x_points, -(y + 1) * width * (1 - tile_overlap / 100) + starting_x_point
            )

    if tiles_y % 2 == 0:  # odd or even check. % 2 is remainder after a division by 2
        starting_y_point = (height * (1 - tile_overlap / 100)) / 2 + center_coords_y
        y_points = np.array(starting_y_point)
        for y in range(0, int(tiles_x / 2 - 1)):
            y_points = np.insert(
                y_points,
                0,
                (y + 1) * height * (1 - tile_overlap / 100) + starting_y_point,
            )
        for y in range(0, int(tiles_x / 2)):
            y_points = np.append(
                y_points,
                -(y + 1) * height * (1 - tile_overlap / 100) + starting_y_point,
            )
    elif tiles_y % 2 != 0:
        starting_y_point = center_coords_y
        y_points = np.array(starting_y_point)
        for y in range(0, int(math.floor(tiles_x / 2))):
            y_points = np.insert(
                y_points,
                0,
                (y + 1) * height * (1 - tile_overlap / 100) + starting_y_point,
            )
        for y in range(0, int(math.floor(tiles_x / 2))):
            y_points = np.append(
                y_points,
                -(y + 1) * height * (1 - tile_overlap / 100) + starting_y_point,
            )

    return tiles_x, tiles_y, np.int32(x_points), np.int32(y_points)


print(bound_region2tile(10))
[tiles_x, tiles_y, x_points, y_points] = bound_region2tile(10)


def focus_tile_pos_generator(area, tiles_in_x, tiles_in_y):
    iterations_x = math.ceil(tiles_in_x * math.sqrt(area))
    step_size_x = math.floor(tiles_in_x / iterations_x)
    remainder_x = tiles_in_x - (step_size_x * (iterations_x - 1))

    iterations_y = math.ceil(tiles_in_y * math.sqrt(area))
    step_size_y = math.floor(tiles_in_y / iterations_y)
    remainder_y = tiles_in_y - (step_size_y * (iterations_y - 1))

    if (remainder_x % 2) == 0:
        offset_x = remainder_x / 2
    else:
        if remainder_x == 1:
            offset_x = 1
        else:
            offset_x = (remainder_x - 1) / 2 + 1
    x = []
    for iteration in range(0, iterations_x):
        x.append(offset_x + iteration * step_size_x)

    if (remainder_y % 2) == 0:
        offset_y = remainder_y / 2
    else:
        if remainder_y == 1:
            offset_y = 1
        else:
            offset_y = (remainder_y - 1) / 2 + 1
    y = []
    for iteration in range(0, iterations_y):
        y.append(offset_y + iteration * step_size_y)

    focus_tile_locations = {"X": x, "Y": y}
    true_coverage = (
        str(round(len(x) * len(y) / (tiles_in_x * tiles_in_y), 2) * 100) + "%"
    )

    return focus_tile_locations, true_coverage


[grid1, true_coverage_grid1] = focus_tile_pos_generator(0.10, tiles_x, tiles_y)


def grid2points(sampling_grid, x_points, y_points):
    number_sampling_points = int(len(sampling_grid["X"]) * len(sampling_grid["Y"]))
    sampling_points = np.array([])
    for x in range(0, len(sampling_grid["X"])):
        for y in range(0, len(sampling_grid["Y"])):
            sampling_points = np.insert(
                sampling_points,
                0,
                [
                    x_points[int(sampling_grid["X"][x]) - 1],
                    y_points[int(sampling_grid["Y"][y]) - 1],
                ],
            )
    sampling_points.shape = (number_sampling_points, 2)
    return sampling_points


def surface2sampling_points(tile_overlap, area_fraction_sampled):
    [tiles_x, tiles_y, x_points, y_points] = bound_region2tile(10)
    [sampling_grid, true_coverage_grid1] = focus_tile_pos_generator(
        0.10, tiles_x, tiles_y
    )
    sampling_points = grid2points(sampling_grid, x_points, y_points)

    return sampling_points, true_coverage_grid1