Automated Cell Nuclei Detection for Large-Volume Electron Microscopy of Neural Tissue

Abstract

Volumetric electron microscopy techniques, such as serial block-face electron microscopy (SBEM), generate massive amounts of image data that are used for reconstructing neural circuits. Typically, this requires time-intensive manual annotation of cells and their connections. To facilitate this analysis, we study the problem of automated detection of cell nuclei in a new SBEM dataset that contains cerebral cortex, white matter, and striatum from an adult mouse brain. The dataset was manually annotated to identify the locations of all 3309 cell nuclei in the volume. We make both dataset and annotations available here.

This page provides supplementary material for the ISBI 2014 paper "Automated Cell Nucleus Detection for Large-Volume Electron Microscopy of Neural Tissue".

Data and ground-truth annotations

Raw data

The raw data (HDF5 file, 4 GB, md5sum 8b1f88fd0cd57874dd8f0f0b74be61ba) can be requested using the form at the bottom of this page. For convenience, the HDF5 file is split into three tar files.

The file uses the HDF5 data format for chunked and compressed storage. Regions of interest can be read from it with Matlab's h5read command, with Python and the h5py library, or from many other languages (see the Wikipedia article for a list).

The data is stored in the group G1/20130722_132814 as a dataset of size 1024 x 768 x 7552. It was written in chunks of 64 x 64 x 64 using the deflate-1 OPT compression filter. Note that the data is in x,y,z axis order, which requires a transpose when read from C or Python (see example below), and also from Matlab (because it uses y,x,z order).

An example Python script that reads a slice from the raw data and saves it as a PNG image can be found here.
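
As a rough illustration of the chunked access pattern described above, the following Python sketch reads a region of interest with h5py and counts how many 64 x 64 x 64 chunks a request touches. It is a sketch under assumptions, not the linked example script: the function names read_roi and chunks_touched are hypothetical, the dataset path inside the file must be supplied by the caller (it can be looked up with HDFView), and the code assumes that h5py reports the dataset shape as (1024, 768, 7552), so dset.shape should be checked before relying on the axis order.

```python
def chunks_touched(start, stop, chunk=(64, 64, 64)):
    """Count the chunks intersected by the half-open ROI [start, stop).

    start, stop and chunk are (x, y, z) tuples in voxels.
    """
    count = 1
    for lo, hi, size in zip(start, stop, chunk):
        # index of the last touched chunk minus index of the first, plus one
        count *= (hi - 1) // size - lo // size + 1
    return count


def read_roi(path, dataset_path, start, stop):
    """Read the half-open ROI [start, stop) from the HDF5 file.

    Assumption: the dataset is indexed as (x, y, z). The returned array
    keeps that order; transpose a 2D plane to get a (y, x) image.
    """
    import h5py  # third-party dependency, imported only when reading

    (x0, y0, z0), (x1, y1, z1) = start, stop
    with h5py.File(path, "r") as f:
        dset = f[dataset_path]  # caller-supplied path, not guessed here
        return dset[x0:x1, y0:y1, z0:z1]


# Chunk cost of one full z-slice versus one chunk-aligned 64-slice slab.
single_slice = chunks_touched((0, 0, 100), (1024, 768, 101))
aligned_slab = chunks_touched((0, 0, 64), (1024, 768, 128))
```

A single z-slice touches 16 x 12 x 1 = 192 chunks, the same number as a full 64-slice slab aligned to the chunk grid, so reading aligned slabs is likely to amortize decompression better than reading slice by slice. Under the stated axis-order assumption, read_roi(path, dataset_path, (0, 0, 100), (1024, 768, 101))[:, :, 0].T would yield a 768 x 1024 (rows x columns) image.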

Ground-truth

Ground-truth annotations are provided as CSV files (separator: comma) with x,y,z and r columns.
x,y,z specify the manually marked approximate center position of each neuronal or glial nucleus;
r is the manually estimated radius. Because nuclei are three-dimensional and not always spherical, these are only rough estimates.
Two files are provided. The first contains the complete set of neuronal and glial nuclei annotations. In the second, we have removed those annotations for which the sphere touches the volume border, in order to simplify the evaluation of automated detection algorithms (see paper).

  • Original ground-truth (all neuronal and glial nuclei)
  • Edge clean ground-truth (annotations touching border removed)
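
The border criterion described above can be sketched in Python as follows. This is an illustration under assumptions, not the procedure used to produce the edge-clean file: it assumes the CSV has a header row naming the x, y, z and r columns, that coordinates and radius are given in voxels of the 1024 x 768 x 7552 volume, and that a sphere "touches" the border when it extends beyond the first or last voxel along any axis. The function names load_annotations and touches_border are hypothetical.

```python
import csv
import io

VOLUME_SHAPE = (1024, 768, 7552)  # (x, y, z) extent in voxels


def load_annotations(handle):
    """Parse rows with x, y, z, r columns into tuples of floats.

    Assumption: the first row is a header containing the column names.
    """
    reader = csv.DictReader(handle, skipinitialspace=True)
    return [tuple(float(row[key]) for key in ("x", "y", "z", "r"))
            for row in reader]


def touches_border(annotation, shape=VOLUME_SHAPE):
    """Return True if the sphere leaves the volume along any axis.

    Assumption: the radius is applied isotropically in voxel units,
    ignoring the anisotropic voxel size.
    """
    x, y, z, r = annotation
    return any(c - r < 0 or c + r > size - 1
               for c, size in zip((x, y, z), shape))


# Small in-memory example standing in for a ground-truth file.
sample = io.StringIO(
    "x,y,z,r\n"
    "100,100,100,20\n"
    "5,100,100,20\n"
    "1000,700,7540,30\n"
)
annotations = load_annotations(sample)
edge_clean = [a for a in annotations if not touches_border(a)]
```

With a real file, the io.StringIO object would be replaced by open(filename, newline="").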

Source Code

  • Code for block-wise thresholding and connected-component labeling can be found in the blockedarray GitHub repository.
    Example usage from C++ can be found in the ccpipeline.cpp file.
  • Code for the block-wise component accumulator can be found in the file blockWiseComponentAccumulator.m
    This Matlab function works on a labelled volume dataset stored in an HDF5 file (created by blockedarray). It calculates the bounding box, centroid, and coordinate list for all components except label zero (assumed to be background). The output is written to a .mat file.
  • Code for component-wise morphological filtering can be found in the file componentWiseFiltering.m
    This Matlab function works on a component list (created by blockWiseComponentAccumulator.m) stored in a .mat file. It calculates the new components and writes them to another .mat file.
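
To make the accumulation step more concrete, here is a minimal Python sketch of the idea behind a block-wise component accumulator: the labelled volume is visited one block at a time, and the bounding box, centroid, and coordinate list of each non-zero label are collected across blocks. It is not a port of blockWiseComponentAccumulator.m; the function name accumulate_components, the nested-list block representation, and the output layout are assumptions made for illustration.

```python
from collections import defaultdict


def accumulate_components(blocks):
    """Collect per-label statistics from a labelled volume given in blocks.

    blocks: iterable of ((x0, y0, z0), block), where block[i][j][k] is the
    integer label of voxel (x0 + i, y0 + j, z0 + k).
    """
    coords = defaultdict(list)
    for (x0, y0, z0), block in blocks:
        for i, plane in enumerate(block):
            for j, row in enumerate(plane):
                for k, label in enumerate(row):
                    if label != 0:  # label zero is treated as background
                        coords[label].append((x0 + i, y0 + j, z0 + k))

    components = {}
    for label, points in coords.items():
        mins = tuple(min(p[axis] for p in points) for axis in range(3))
        maxs = tuple(max(p[axis] for p in points) for axis in range(3))
        centroid = tuple(sum(p[axis] for p in points) / len(points)
                         for axis in range(3))
        components[label] = {
            "bbox": (mins, maxs),
            "centroid": centroid,
            "coords": points,
        }
    return components


# Two adjacent 2 x 2 x 2 blocks; label 1 spans the block boundary.
block_a = [[[0, 0], [0, 0]], [[1, 0], [0, 0]]]
block_b = [[[1, 0], [0, 0]], [[0, 0], [0, 2]]]
result = accumulate_components([((0, 0, 0), block_a), ((2, 0, 0), block_b)])
```

Because coordinates are stored in global volume indices, a component that crosses a block boundary (label 1 in the example) is merged simply by sharing its label, which presupposes that the labelling step has already assigned consistent labels across blocks.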

Acknowledgements

Shawn Mikula, Sarah Mikula, Ivo Sonntag, Winfried Denk

Dataset

DATA ACQUISITION: A 20-week-old male mouse brain was prepared in its entirety for electron microscopy. A sub-volume containing cerebral cortex, white matter, and striatum was extracted from the epoxy-embedded whole brain with a trimmer (Leica) and a scalpel blade, and mounted on an aluminum stub. Backscattered electrons were imaged at 40 nm pixel size in high vacuum with SBEM on a QuantaFEG 200 (FEI), using a heuristic-based algorithm for automated aberration correction. The final stack size for the cortico-striatal dataset was 4382 × 3435 × 30464 voxels, which was subsequently downscaled by a factor of 4 for cell nucleus detection.

DATA FORMAT: The downscaled and cropped SBEM volume is grayscale (8-bit) and 1024 × 768 × 7552 voxels (x, y, z) in size, where each voxel is 160 × 160 × 200 nm. It is available for download as an HDF5 file, which can be viewed with HDFView or read from Python, Matlab, or C.
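
As a quick sanity check on the figures in the dataset description, the following Python sketch derives the physical extent and the uncompressed size of the downscaled volume from the stated dimensions (1024 × 768 × 7552 voxels, 160 × 160 × 200 nm per voxel, 8 bits per voxel). The variable names are illustrative only, and the results are simple arithmetic on those numbers rather than values reported by the authors.

```python
SHAPE_VOXELS = (1024, 768, 7552)  # (x, y, z) size of the downscaled volume
VOXEL_SIZE_NM = (160, 160, 200)   # (x, y, z) voxel size in nanometres
BYTES_PER_VOXEL = 1               # 8-bit grayscale

# Physical extent along each axis, in nanometres.
extent_nm = tuple(n * s for n, s in zip(SHAPE_VOXELS, VOXEL_SIZE_NM))

# Size of the volume if stored without compression.
uncompressed_bytes = (SHAPE_VOXELS[0] * SHAPE_VOXELS[1] * SHAPE_VOXELS[2]
                      * BYTES_PER_VOXEL)

print("extent (micrometres):", tuple(e / 1000 for e in extent_nm))
print("uncompressed size (GiB):", round(uncompressed_bytes / 2**30, 2))
```

The volume therefore covers roughly 164 × 123 × 1510 micrometres and would occupy about 5.5 GiB uncompressed, compared with the 4 GB compressed HDF5 file.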