Single-shot image-based autofocus with machine learning

This tutorial shows how to reproduce the data-collection aspects of the paper:

Henry Pinkard, Zachary Phillips, Arman Babakhani, Daniel A. Fletcher, and Laura Waller, “Deep learning for single-shot autofocus microscopy,” Optica 6, 794-797 (2019)

In this paper, we show how the existing transmitted light path and camera on a microscope can be used to collect a single image, which is then fed into a neural network that predicts how far out of focus the image is, so that the Z-stage can be moved back to the correct focal plane.

In particular, this tutorial will show: 1) how to create the necessary hardware for the system; 2) how to use Micro-Magellan to easily collect training data; and 3) how to use Pycro-Manager to apply a trained network that makes focus corrections during subsequent experiments. A different notebook shows the steps in between (2) and (3): processing the data and training the network.

This tutorial assumes the use of a Micro-Manager configuration with motorized XY and Z stages.


Many biological experiments involve imaging samples in a microscope over long time periods or large spatial scales, making it difficult to keep the sample in focus. When observing a sample over time periods of hours or days, for example, thermal fluctuations can induce focus drift. Or, when scanning and stitching together many fields-of-view (FoV) to form a high-content, high-resolution image, a sample that is not sufficiently flat necessitates refocusing at each position. Since it is often experimentally impractical or cumbersome to manually maintain focus, an automatic focusing mechanism is essential.

Many hardware- and software-based solutions have been proposed for this problem. Hardware solutions are generally fast but expensive, while software methods are cheap but slow. The technique we propose here aims to capture the benefits of both: a fast autofocus method that requires only minimal, inexpensive hardware. The most basic software autofocus method is to take a focal stack of images, calculate some metric of image sharpness at each plane, and find the peak.
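The stack-and-peak approach in the last sentence can be sketched in a few lines of NumPy. This is an illustrative example rather than code from the paper: it uses the mean squared gradient magnitude as the sharpness metric (one common choice among many), and assumes `stack` is a (z, y, x) array.

```python
import numpy as np

def sharpness(image):
    """Mean squared gradient magnitude: larger for sharper (in-focus) images."""
    gy, gx = np.gradient(image.astype(float))
    return np.mean(gy ** 2 + gx ** 2)

def best_focus_index(stack):
    """Return the index of the sharpest plane in a (z, y, x) focal stack."""
    scores = [sharpness(plane) for plane in stack]
    return int(np.argmax(scores))
```

With a Z-step size `dz`, the in-focus position is then `z_start + best_focus_index(stack) * dz`.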

In our technique, we bootstrap training data based on this basic method for generating ground truth. That is, we collect paired Z-stacks of a sample: one with incoherent illumination (i.e., regular transmitted light with the aperture not fully closed down), and the other with coherent illumination, as provided by an off-axis LED illuminating the sample. A single plane of the coherent stack is used as the input to a neural network that predicts the correct focal position.
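One way to turn such paired stacks into labeled training data, sketched here under assumptions not spelled out in the text: find the ground-truth focal plane from the incoherent stack with a standard sharpness metric, then label each plane of the coherent stack with its signed defocus distance from that plane. The gradient-based metric and the (z, y, x) array layout are illustrative choices.

```python
import numpy as np

def sharpness(image):
    """Mean squared gradient magnitude (illustrative focus metric)."""
    gy, gx = np.gradient(image.astype(float))
    return np.mean(gy ** 2 + gx ** 2)

def make_training_pairs(incoherent_stack, coherent_stack, dz):
    """Pair each coherent plane with its signed defocus distance.

    Ground truth comes from the sharpest plane of the incoherent stack.
    Both stacks are assumed to be (z, y, x) arrays acquired at the same
    Z positions, spaced dz apart (same units as the returned labels).
    """
    focus_idx = int(np.argmax([sharpness(p) for p in incoherent_stack]))
    defocus = (np.arange(len(coherent_stack)) - focus_idx) * dz
    return list(zip(coherent_stack, defocus))
```

Each `(image, defocus)` pair then serves as one training example for a regression network.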


This tutorial will go through the full process in the figure below and is broken down into the following sections:

  1. Hardware setup

  • How to choose off-axis illumination

  • How to control through Micro-Manager

  2. Collect training data (part a in the figure below)

  3. Use a trained network to apply focus corrections during experiments (part b in the figure below)
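Applying corrections during an experiment can be done with a Pycro-Manager acquisition hook that shifts each event's Z position by the network's predicted defocus. The sketch below is an assumed wiring, not the paper's exact code: `predict_defocus` stands in for the trained network and `get_latest_image` for whatever routine snaps the coherent-illumination image. The event-mutating logic is kept in a plain function so it can be handed to Pycro-Manager's `pre_hardware_hook_fn`.

```python
def make_focus_hook(predict_defocus, get_latest_image):
    """Build a hook that corrects each acquisition event's Z position.

    predict_defocus: callable mapping a coherent-illumination image to a
        signed defocus estimate, in the same units as the event's 'z'
        (a placeholder for the trained network).
    get_latest_image: callable returning the most recent coherent image
        (a placeholder for the user's snap routine).
    """
    def hook_fn(event):
        if 'z' in event:
            # Subtract the predicted defocus to move back toward focus.
            event['z'] = event['z'] - predict_defocus(get_latest_image())
        return event
    return hook_fn

# Hypothetical wiring into a Pycro-Manager acquisition (requires a running
# Micro-Manager instance, so shown here as comments only):
# from pycromanager import Acquisition, multi_d_acquisition_events
# hook = make_focus_hook(network_predict, snap_coherent_image)
# with Acquisition(directory='/data', name='autofocus_exp',
#                  pre_hardware_hook_fn=hook) as acq:
#     acq.acquire(multi_d_acquisition_events(num_time_points=100))
```

Keeping the hook a closure over the model means the same acquisition code works with any predictor, including the basic stack-based method for sanity checks.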