an issue with the event cameras as they are passive sensors.
However, it is an issue when trying to establish a baseline for the proposed architecture.
In addition to the advantages gained from learned feature matching, a noteworthy benefit is that the cameras do not require stereo calibration to obtain the essential matrix [22]. The essential matrix encodes the relative pose between the two cameras and aids in solving the pixel correspondence problem. Traditionally, stereo calibration is a crucial step for determining the intrinsic and extrinsic parameters of the two cameras. In the learned stereo-vision approach, however, this calibration step becomes redundant: the method leverages the pixel correspondence learned during training, allowing the cameras to perform feature matching without explicit stereo calibration. This not only simplifies the overall pipeline but also reduces the complexity associated with the setup and maintenance of stereo-calibrated cameras. By eliminating the stereo-calibration requirement, the learned stereo-vision approach streamlines the implementation and deployment of multi-camera systems. This reduction in complexity contributes to the practical feasibility and efficiency of deploying vision systems in various applications, particularly in harsh environments such as underground mines.
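For context, the classical role of the essential matrix can be summarized by the standard epipolar constraint; the notation below is the usual textbook form and is not specific to this pipeline:

\[
\mathbf{x}'^{\top} E \, \mathbf{x} = 0, \qquad E = [\mathbf{t}]_{\times} R,
\]

where x and x' are corresponding normalized image coordinates in the two cameras, and R and t are the relative rotation and translation between them, i.e., the quantities that stereo calibration would otherwise have to recover. The learned approach replaces this explicit geometric constraint with correspondences predicted directly by the network.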
Another network with the same architecture was instantiated separately and trained on a tightly scoped scene with a horizontal strap and constant lighting conditions. This was done to separate the overhead costs of training from the experimental parts of the pipeline. The output is shown in Figure 9. This network performs better than the LiDAR because, with a higher number of training epochs, the stochastic noise introduced by the environment is filtered out. This is promising moving forward, as the surface reconstruction problem can be solved using just two event cameras, freeing up resources for machine learning model training [23]. The sensor rig is expected to be mounted onto the drill rig, so this network can be specialized to a narrow field of view. Since the dataset is representative of the final use case, i.e., mounted on the drill, the performance of this network degrades severely at depths greater than four meters. However, since labeling is cheap, the network can be generalized given the appropriate resources.
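As an illustration only, a minimal fine-tuning loop for such a scene-specialized network could look like the sketch below; the model interface, the dataset fields, the optimizer settings, and the masked L1 loss are assumptions made here for clarity, not the implementation used in this work.

import torch
from torch.utils.data import DataLoader

def finetune(model, dataset, epochs=50, lr=1e-4, device="cuda"):
    # Fine-tune a stereo depth network on event-frame pairs that carry
    # LiDAR-derived depth labels (hypothetical dataset layout).
    loader = DataLoader(dataset, batch_size=8, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.to(device).train()
    for _ in range(epochs):
        for left, right, lidar_depth, valid_mask in loader:
            left, right = left.to(device), right.to(device)
            lidar_depth = lidar_depth.to(device)
            valid_mask = valid_mask.to(device)
            pred_depth = model(left, right)  # predicted dense depth map
            # Supervise only the pixels where the LiDAR label is valid.
            loss = torch.abs(pred_depth - lidar_depth)[valid_mask].mean()
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model

Because the labels come directly from the LiDAR projection, each pass of the drill effectively yields a fresh labeled batch at negligible cost, which is what makes this kind of scene-specific fine-tuning cheap to repeat.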
Looking ahead, this work can be extended to a simul-
taneous localization and mapping (SLAM) [24] module,
offering real-time localization inputs to the roof-bolting
machine. This integration presents a practical solution to
enhance the autonomy and efficiency of the roof-bolting
process by providing accurate and up-to-date information
about the machine’s position within the mine environment.
Furthermore, the applicability of this approach extends beyond roof bolting, holding significant value for
localization and mapping in the context of mobile robots
operating in challenging environmental and optical condi-
tions. Autonomous vehicles, for instance, encounter dimin-
ished LiDAR performance in adverse weather conditions
such as fog and rain [25]. By leveraging the robustness of
event-based images and the learned stereo-vision approach,
the SLAM module can potentially overcome these chal-
lenges and maintain reliable localization and mapping
capabilities even in severe conditions.
This broader application underscores the versatility of
the developed methodology and its potential impact on
advancing autonomous systems in various domains. The
adaptability to challenging environmental factors sets the event camera data pipeline apart as a promising solution for
improving the reliability and performance of mobile robots
and autonomous vehicles operating in real-world scenarios
marked by adverse optical conditions.
In essence, this pipeline offers a scalable and efficient means of continuously updating and improving the performance of learned stereo-vision models through the integration of fresh, real-world data. The combination of rapid data collection, minimal pre-processing, and short training cycles enhances the adaptability and efficacy of the model in addressing the challenges posed by unique and evolving mining conditions.
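Purely as a schematic of that update cycle (not the deployed tooling), the stages could be organized as follows; every helper here is a hypothetical stub standing in for the data-collection, labeling, and training steps described above.

from typing import List, Tuple

def collect_event_pairs(duration_s: int) -> List[Tuple[str, str]]:
    # Stub: record synchronized left/right event frames for duration_s seconds.
    return []

def label_with_lidar(pairs: List[Tuple[str, str]]) -> list:
    # Stub: attach LiDAR-derived depth labels to each recorded pair.
    return []

def retrain(model, dataset, epochs: int = 10):
    # Stub: run a short supervised fine-tuning pass (see the earlier sketch).
    return model

def update_cycle(model, n_rounds: int = 5):
    # Collect fresh data, label it cheaply, and fold it back into the model.
    for _ in range(n_rounds):
        pairs = collect_event_pairs(duration_s=60)  # quick data collection
        dataset = label_with_lidar(pairs)           # minimal pre-processing
        model = retrain(model, dataset, epochs=10)  # short training cycle
    return model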
CONCLUSION
This work has shown the capability to use event cameras
and leverage their superior optical properties to create a
3-dimensional representation of the roof during an active
mining operation. The contribution is the capability to
produce virtually free, labeled data. This can be used in
conjunction with any supervised approach. Moreover, a
small sensor rig comprising just the two event cameras can
be integrated onto a roof bolter. This work has shown that a
real-time surface representation can be computed and used
as input to planning algorithms. This is made easier due to
the low power requirements and the superior optical prop-
erties of the event cameras themselves.
REFERENCES
[1] Guillermo Gallego, Tobi Delbrück, Garrick Orchard,
Chiara Bartolozzi, Brian Taba, Andrea Censi, Stefan
Leutenegger, Andrew J. Davison, Jörg Conradt,
Kostas Daniilidis, and Davide Scaramuzza. Event-
based vision: A survey. IEEE Transactions on Pattern
Analysis and Machine Intelligence, 44(1):154–180,
2022.