IOD-Video Dataset Overview

1.IOD-Video Dataset Visualization

We construct the IOD-Video dataset, comprising 600 videos (141,017 frames) that cover a wide range of scenarios (including pipeline, factory, flange, valve, experiments, cylinder, wild and others). The videos are split into clear and vague sets according to whether an annotator can subjectively judge the boundary of the object within a single frame.

2.Data Collection and Annotation

Most gases do not appear in the visible spectrum, so they cannot be seen by human eyes or traditional RGB cameras. The characteristic absorption peaks of many gases are concentrated in the mid-infrared spectrum, which is regarded as the fingerprint region. The IOD-Video dataset is therefore captured in restricted portions of the infrared (IR) spectrum, namely the 3∼5 µm and 8∼12 µm ranges.
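
As a small illustration of the two capture ranges stated above, the following sketch maps a wavelength to the band it falls in. The names `CAPTURE_BANDS_UM` and `capture_band` are hypothetical helpers introduced here for illustration and are not part of the dataset tooling; the sample wavelengths are arbitrary.

```python
from typing import Optional

# Capture ranges as stated in the text, in micrometres.
# The dictionary and function names are hypothetical, for illustration only.
CAPTURE_BANDS_UM = {
    "3-5 um": (3.0, 5.0),
    "8-12 um": (8.0, 12.0),
}


def capture_band(wavelength_um: float) -> Optional[str]:
    """Return the name of the capture band containing the wavelength, or None."""
    for name, (low, high) in CAPTURE_BANDS_UM.items():
        if low <= wavelength_um <= high:
            return name
    return None


# Arbitrary sample wavelengths: one per band, plus one in the visible spectrum.
for wavelength in (3.3, 10.6, 0.55):
    print(wavelength, capture_band(wavelength))
```

A wavelength in the visible spectrum (such as 0.55 µm) falls outside both ranges and yields `None`.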

IOD-Video is carefully labeled according to the following rules:

  1. Annotations are temporally continuous without sudden change.
  2. Bounding boxes tightly enclose the object boundary, as judged by human subjective perception.
  3. Bounding boxes react immediately when the diffusion direction changes.
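
The first rule (temporal continuity) can be checked mechanically. The sketch below is one possible sanity check, not the procedure used to build the dataset: it assumes one box per frame in `(x1, y1, x2, y2)` format and flags frames whose box overlaps the previous frame's box by less than a threshold. The box format, the function names, and the `min_iou=0.5` threshold are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) in pixels, one box per frame.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def find_sudden_changes(boxes: List[Box], min_iou: float = 0.5) -> List[int]:
    """Return frame indices t where boxes[t] overlaps boxes[t - 1] by less than min_iou.

    The threshold is a hypothetical value; a suitable one depends on frame rate
    and how quickly the object moves or diffuses.
    """
    return [
        t for t in range(1, len(boxes))
        if iou(boxes[t - 1], boxes[t]) < min_iou
    ]


# Toy track: the third box jumps far away from the second one.
track = [(10, 10, 50, 50), (12, 11, 52, 51), (100, 100, 140, 140)]
print(find_sudden_changes(track))
```

Frames flagged this way would be candidates for manual review rather than automatic correction, since a genuine change in diffusion direction (rule 3) can also reduce the overlap between consecutive boxes.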


The IOD task faces multiple challenges caused by the objects' intrinsic characteristics (Color Deficiency, Indistinct Boundary), photography restrictions (Camera Movement, Imaging Noise), and environmental interference.

4.IOD-Video Dataset Statistics

The following figure shows the multi-dependencies among IOD-Video attributes. The dataset achieves good diversity by covering various distances (0∼100 m), sizes, visibility levels, and scenes captured in different spectral ranges.


We thank ZHIPUTECH for providing multispectral camera support and for assisting with data collection.