a comprehensive and diverse associated dataset [11]. This
poses an additional challenge when considering commercial
vision products, as the limited availability of labeled data
hinders the design and training of a generalized machine
learning model. Typically, the solution to this issue involves
utilizing simulated data [12]. However, simulators often
lack representation of visual features commonly found
in mines, and the transition from simulation to reality is
known to be problematic. The pipeline proposed in this
paper can be used as an alternative by quickly and cheaply
generating a labeled data set.
It is difficult to accurately label event-based images
with semantic information directly. Therefore, a traditional
color camera is used in conjunction with two event cameras,
so that a semantic segmentation network can be trained in
optical color space. The trained network is then used to
predict a semantically segmented image, called a mask, from
an event-based image. This mask is shown to be accurate
enough to close the loop on a self-supervised approach [13]
to automatically label event-based images.
All data pictured was collected at the Edgar Experimental
Mine in Idaho Springs, Colorado. In order to direct the
drill autonomously to perform roof bolting, areas of rock
need to be segmented from areas of support strap [14].
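The labeling loop described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `color_model_predict` is a hypothetical stand-in for the trained color-space segmentation network (here a simple brightness threshold), and the pairing of color and event frames is assumed to be given.

```python
import numpy as np

def color_model_predict(color_img):
    """Hypothetical stand-in for the trained color-space segmentation
    network. Here, pixels brighter than a threshold are labeled
    'strap' (1) and the rest 'rock' (0)."""
    gray = color_img.mean(axis=-1)
    return (gray > 0.5).astype(np.uint8)

def pseudo_label_pairs(pairs):
    """For each paired (color frame, event frame), transfer the
    color-space model's predicted mask to the event frame as its
    label, yielding an automatically labeled event-based dataset."""
    labeled = []
    for color_img, event_img in pairs:
        mask = color_model_predict(color_img)
        labeled.append((event_img, mask))
    return labeled

# Toy usage: one bright (strap-like) color frame paired with an event frame.
pairs = [(np.full((4, 4, 3), 0.9), np.zeros((4, 4)))]
labeled = pseudo_label_pairs(pairs)
```

The key point is that labels are only ever produced in color space, where segmentation is tractable, and are then attached to the time-synchronized event frames.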
The primary contribution of this work is a training
architecture that enables a self-supervised approach to the
semantic segmentation task, while simultaneously using
domain transfer techniques to reduce the cost and improve
the fidelity of data labeling.
Figure 1. A typical color image of the roof with a bolted strap,
taken at the Edgar Mine in Idaho Springs, CO. It is imperative
for any computer vision system to differentiate between the
strap and the underlying rock.
Figure 2. The training and testing pipelines intended for the development of event-based semantic segmentation.