Reading acquired data

The NDTiff format is the default saving format of the pycromanager Acquisition class.

Images can be loaded individually, or the entire dataset can be opened at once as a memory-mapped Dask array. This is a “virtual” array: the whole dataset isn’t loaded into RAM up front, but is instead “lazily” read into RAM as each sub-part of it is used. This allows large datasets to be processed and viewed in napari.

Creating a Dataset object

There are two ways to do this, depending on whether the data is part of an in-progress acquisition or not. In the former case:

from pycromanager import Acquisition

with Acquisition('/path/to/saving/dir', 'saving_name') as acq:
    # send some instructions so that something is acquired
    dataset = acq.get_dataset()

Alternatively, to open a finished dataset from disk:

from pycromanager import Dataset

#This path is to the top level of the dataset
data_path = '/path/to/data'

dataset = Dataset(data_path)

Reading data

Once opened, individual tiles can be accessed using read_image. This method accepts positions along the different dimensions as arguments. For example, to get the first image in a z-stack, pass in z=0 as an argument.

img = dataset.read_image(z=0)
img_metadata = dataset.read_metadata(z=0)

# img is a numpy array, img_metadata is a dict

To determine which axes are available, access the Dataset.axes attribute, which contains a dict with axis names as keys and a list of available indices as values.
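The pattern of enumerating every combination of indices before calling read_image can be sketched as follows. This is a minimal illustration: the axes dict here is hypothetical, standing in for the Dataset.axes of a 2-channel, 3-slice acquisition.

```python
import itertools

# hypothetical axes dict, mimicking Dataset.axes for a 2-channel, 3-slice acquisition
axes = {'channel': [0, 1], 'z': [0, 1, 2]}

# build every combination of coordinates, suitable for dataset.read_image(**coords)
names = list(axes.keys())
all_coords = [dict(zip(names, values))
              for values in itertools.product(*axes.values())]

print(len(all_coords))   # 6
print(all_coords[0])     # {'channel': 0, 'z': 0}
```

Each dict in all_coords could then be passed as keyword arguments, e.g. dataset.read_image(**all_coords[0]).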

If the dataset was created by tiling multiple XY positions, tiles along the axis corresponding to XY positions can be indexed by their row and column positions:

img = dataset.read_image(row=0, col=1)

Opening data as Dask array

Rather than reading each image individually, all data can be opened at once as a single dask array. Dask arrays are memory-mapped: the data are not loaded into RAM until they are used, which provides a convenient way to work with datasets larger than the computer’s RAM. Dask arrays also allow code to be prototyped on a small computer and scaled up to a cluster without being rewritten.

import numpy as np
import napari

dask_array = dataset.as_array()

# the dask array can be used just like a numpy array:
# take a max intensity projection along axis 0
max_intensity = np.max(dask_array[0, 0], axis=0)

# visualize the data using napari
v = napari.Viewer()
v.add_image(dask_array)
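The lazy behavior can be illustrated with a plain dask array standing in for dataset.as_array() (a minimal sketch; the shape and chunking here are made up):

```python
import numpy as np
import dask.array as da

# a small dask array standing in for dataset.as_array(); chunks are read lazily
arr = da.from_array(np.arange(24).reshape(2, 3, 4), chunks=(1, 3, 4))

# operations build a task graph; nothing is materialized until .compute() is called
projection = arr.max(axis=0)
result = projection.compute()
print(result.shape)  # (3, 4)
```

Calls like np.max on a dask array behave the same way: the reduction is deferred until the result is actually needed (e.g. when displayed in napari).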

If the data were acquired by an XYTiledAcquisition or a MagellanAcquisition, the grid of XY images can be automatically stitched into one contiguous image:

dask_array = dataset.as_array(stitched=True)

You can also slice along particular axes when creating the dask array:

dask_array = dataset.as_array(z=0, time=2)