The fourth tab, Ground-fall Classification, illustrates
the prediction of ground-fall incidents in U.S. coal mines
using the machine learning algorithm. The ground-fall
classes shown on the right-hand side of Figure 9 were
predicted with the best-performing model, logistic regression,
which achieved an average F1-score of 96% on the testing
dataset. The predicted ground falls can be filtered by state
or by time period, as shown on the left-hand side of
Figure 9.
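The state and time-period filters described above can be sketched as follows. This is a minimal illustration only, not the Dashboard's actual implementation: the `PredictedIncident` record, its field names, and the `filter_incidents` helper are all hypothetical, and the sample records are invented placeholders rather than MSHA data.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class PredictedIncident:
    """Hypothetical record: one incident with its predicted ground-fall class."""
    state: str
    incident_date: date
    predicted_class: str


def filter_incidents(
    incidents: List[PredictedIncident],
    state: Optional[str] = None,
    start: Optional[date] = None,
    end: Optional[date] = None,
) -> List[PredictedIncident]:
    """Keep incidents matching the state and falling within [start, end].

    Any filter left as None is not applied.
    """
    selected = []
    for incident in incidents:
        if state is not None and incident.state != state:
            continue
        if start is not None and incident.incident_date < start:
            continue
        if end is not None and incident.incident_date > end:
            continue
        selected.append(incident)
    return selected


# Invented placeholder records (not real MSHA data).
sample = [
    PredictedIncident("WV", date(2015, 3, 1), "class_1"),
    PredictedIncident("KY", date(2016, 7, 9), "class_2"),
    PredictedIncident("WV", date(2019, 1, 20), "class_1"),
]

wv_recent = filter_incidents(sample, state="WV", start=date(2018, 1, 1))
```

Treating each filter as optional mirrors a dashboard control that may be left unset, so the same function serves state-only, period-only, and combined selections.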
CONCLUSIONS
This study provides a new method to analyze and classify
massive text datasets in mining applications using machine
learning models. These machine learning models accurately
classified ground-fall-related incidents into five categories
with an average F1-score of 96%. The outcome of this effort
may enhance research efforts by providing additional tools
that researchers can leverage, with the ultimate aim of
improving safety in U.S. mines and protecting equipment from
ground-fall incidents. This study demonstrates that tools
can be developed that improve the knowledge gained from
evaluating MSHA narratives using machine learning, with
the ultimate goal of reducing the amount of time it takes
to sort through the narratives.
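For readers unfamiliar with the metric, the sketch below shows one way an average F1-score over several classes can be computed. The text does not state how the per-class scores were averaged; this example assumes an unweighted (macro) average, and the function names and the three placeholder labels are illustrative only.

```python
from collections import Counter
from typing import Dict, Sequence


def per_class_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Return the F1-score of each class, F1 = 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of the per-class F1-scores (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Placeholder labels, not the study's five categories.
y_true = ["a", "a", "b", "b", "c", "c"]
y_pred = ["a", "a", "b", "c", "c", "c"]
scores = per_class_f1(y_true, y_pred)
average = macro_f1(y_true, y_pred)
```

A macro average weights every class equally, so a rare class that is classified poorly lowers the score as much as a common one would; a support-weighted average would behave differently on imbalanced data.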
LIMITATIONS OF THE STUDY
The work completed in this study was exploratory research
to evaluate the usefulness of applying machine learning to
MSHA data. The work is not intended to suggest improvements
or changes to the data collection methods utilized by MSHA,
nor does the study suggest that immediate ground control
improvements and safety considerations can be taken away
from the examples presented. The study solely demonstrated
that machine learning techniques can be applied to MSHA
narratives, specific to ground falls, and that they show
promise for classifying incidents and efficiently visualizing
trends. More work needs to be done to explore the capability
of machine learning for the other categories in Figure 1
that were not analyzed. At this time, the Dashboard
demonstrated in this study is intended only for internal use
for visualization purposes. The MSHA dataset has some
limitations. For example, not all incidents and injuries
are reported to MSHA, and this could bias the MSHA data
toward specific categories of ground falls. Also, there is
no requirement to complete the narrative section of MSHA
Form 7000-1. Hence, some narratives could be blank (the
authors found four blank narratives out of 8,017 in the
training dataset) or too unclear to be useful in the
analysis. Examples of unclear narratives are given in
Table 2.
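One way blank or very short narratives could be set aside before classification is sketched below. This is an illustrative assumption rather than the authors' procedure: the `split_usable_narratives` helper and the `min_words` threshold are hypothetical, a word count is only a crude proxy for an unclear narrative, and the sample strings are invented.

```python
from typing import List, Optional, Tuple


def split_usable_narratives(
    narratives: List[Optional[str]], min_words: int = 3
) -> Tuple[List[str], List[Optional[str]], List[str]]:
    """Separate narratives into usable, blank, and too-short groups.

    min_words is a hypothetical threshold; it does not detect narratives
    that are long but still uninformative.
    """
    usable, blank, too_short = [], [], []
    for text in narratives:
        cleaned = (text or "").strip()
        if not cleaned:
            blank.append(text)          # missing or whitespace-only
        elif len(cleaned.split()) < min_words:
            too_short.append(cleaned)   # too few words to classify reliably
        else:
            usable.append(cleaned)
    return usable, blank, too_short


# Invented example narratives (not taken from MSHA records).
raw = [
    "Employee was struck by a piece of rock that fell from the roof.",
    "",
    None,
    "   ",
    "See report.",
]

usable, blank, too_short = split_usable_narratives(raw)
```

Keeping the rejected narratives in separate lists, rather than discarding them, makes it possible to report how many records were excluded and why.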
ACKNOWLEDGEMENTS
The authors acknowledge the Data Science Upscaling
(DSU) program for all the support and the unlimited
resources that the authors received during the DSU. We
Figure 9. The ground-fall classification tab of the developed Dashboard