from the underlying data. These rules implement the classi-
cal reasoning discussed earlier, producing clean, integrated
data ready for machine learning.
These views resemble data virtualization, which is usually
implemented as a virtual database that presents a unified
view of multiple source databases. Building the virtual
database on top of a known data lake avoids common
virtualization problems, such as difficulty managing
permissions.
Defining the schema of these views (that is, deciding what
data belongs where) and writing the rules may seem difficult,
but our earlier work suggests an approach. Generative AI can
help define the schema of this virtual database, using
in-context learning of the business-side data and tool usage
to make suggestions. Similarly, it can help define the logical
rules for how data from the data lake is represented in the
virtual database.
This virtual database, defined by deductive rules, ends
up being somewhat different from a traditional database.
Rather than focusing on the physical schema of data storage,
the virtual database’s schema is an integrated definition of
the data lake’s contents. The rules, the virtual database
schema, and the context provided by the business side together
create an ontology, not just a database. An ontology is a
formal representation of a domain and its relationships, and
it emerges naturally from this definition.
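
As a minimal sketch of how a deductive rule might define one view in the virtual database, the following Python derives an integrated per-shift production record from two raw sources. The source tables, field names, values, and the `ShiftProduction` type are all hypothetical stand-ins for whatever the data lake actually holds, and a real system would more likely express the rule in a logic or query language than in Python.

```python
from dataclasses import dataclass

# Hypothetical raw records as they might land in the data lake from two
# different source systems (names and values are illustrative only).
dispatch_loads = [
    {"truck": "T01", "shift": "2024-01-01-D", "tonnes": 220.0},
    {"truck": "T02", "shift": "2024-01-01-D", "tonnes": 215.5},
    {"truck": "T01", "shift": "2024-01-01-N", "tonnes": 218.0},
]
shift_roster = [
    {"shift": "2024-01-01-D", "crew": "A"},
    {"shift": "2024-01-01-N", "crew": "B"},
]


@dataclass(frozen=True)
class ShiftProduction:
    """One row of the (hypothetical) integrated view."""
    shift: str
    crew: str
    tonnes: float


def shift_production_view(loads, roster):
    """Rule: ShiftProduction(shift, crew, total) holds when the roster
    assigns `crew` to `shift` and `total` is the sum of that shift's loads."""
    totals = {}
    for load in loads:
        totals[load["shift"]] = totals.get(load["shift"], 0.0) + load["tonnes"]
    return [
        ShiftProduction(r["shift"], r["crew"], totals.get(r["shift"], 0.0))
        for r in roster
    ]


# The view is computed from the raw data on demand; nothing is copied or stored.
view = shift_production_view(dispatch_loads, shift_roster)
```

Because the view is only a function of the raw records, changing the rule changes the integrated data everywhere it is used, without touching what is stored in the data lake.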
Modeling
A mathematical model uses calculations and logic as a
substitute for a real-world object or concept. For example,
a model of a road network can be used to calculate and
understand haul truck cycle times without having to
physically drive the roads to take measurements.
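
To make the road network example concrete, the sketch below represents a small haul road network as a weighted graph and computes the shortest travel time between two locations with Dijkstra's algorithm. The node names, segment times, and the `travel_time` function are assumptions made for illustration; the paper does not prescribe a particular algorithm or data layout.

```python
import heapq

# Hypothetical haul road network: node -> list of (neighbor, travel minutes).
roads = {
    "pit": [("ramp", 4.0), ("bypass", 7.0)],
    "ramp": [("crusher", 6.0)],
    "bypass": [("crusher", 2.5)],
    "crusher": [],
}


def travel_time(graph, start, goal):
    """Return the shortest travel time in minutes from start to goal using
    Dijkstra's algorithm, or None if the goal cannot be reached."""
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        for neighbor, minutes in graph[node]:
            new_cost = cost + minutes
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return None


pit_to_crusher = travel_time(roads, "pit", "crusher")
```

With these illustrative segment times, the route through the bypass (9.5 minutes) is faster than the route through the ramp (10 minutes), and the model finds that without any truck driving either route.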
In practical terms, for the knowledge graph, a model
acts like a function that takes a set of inputs and calculates
a set of outputs. It can be a trained machine learning model,
such as one that calculates mining process recovery for
different material types and processes. It can also be another
type of model, such as a reinforcement learning model that
gives dispatch recommendations, or a simple statistical
calculation on the input parameters, such as the average and
variance of production.
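
The simplest case mentioned above, a statistical calculation treated as a model, might look like the following sketch. The function name and the sample values are hypothetical; the point is only that the model is a plain function from inputs to outputs, which is the same shape a trained model would have.

```python
import statistics


def production_summary(shift_tonnes):
    """A 'model' in the sense used here: inputs in, outputs out.
    Returns the mean and the sample variance of per-shift production."""
    return {
        "mean": statistics.mean(shift_tonnes),
        # statistics.variance is the sample variance (divides by n - 1);
        # statistics.pvariance would give the population variance instead.
        "variance": statistics.variance(shift_tonnes),
    }


# Illustrative per-shift tonnages, not real data.
summary = production_summary([400.0, 420.0, 440.0])
```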
Machine learning models can be trained using data from
the virtual database, since it is already known to be
integrated, cleaned, and enriched. When these models are
stored in the same system, rules defined in the virtual
database can incorporate them, for example by adding predicted
production values to each shift or by extracting features from
images. The result is a further enriched virtual database.
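
A rule that incorporates a model could be sketched as below, where each shift record in the view gains a predicted production value. The `predict_tonnes` function is a stand-in for a trained model with a made-up coefficient, and the record fields are hypothetical; only the pattern of calling a model from inside a rule is taken from the text.

```python
def predict_tonnes(shift):
    """Stand-in for a trained model. The rate of 16 tonnes per truck-hour
    is invented for illustration and carries no real meaning."""
    return 16.0 * shift["trucks"] * shift["hours"]


def enrich_shifts(shifts, model):
    """Rule: every shift record also carries the model's prediction.
    Returns new records and leaves the input records unchanged."""
    return [{**shift, "predicted_tonnes": model(shift)} for shift in shifts]


# Hypothetical shift records from the virtual database.
shifts = [
    {"shift": "2024-01-01-D", "trucks": 5, "hours": 12.0},
    {"shift": "2024-01-01-N", "trucks": 4, "hours": 12.0},
]
enriched = enrich_shifts(shifts, predict_tonnes)
```

Passing the model in as an argument keeps the rule independent of how the prediction is produced, so a retrained or entirely different model can be swapped in without rewriting the rule.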
Models can also be built on top of other models, which
makes it possible to tackle complex problems from the bottom
up. For example, a haul cycle time model could be built by
combining models of load time, spot time, and travel time,
each of which could itself be a sophisticated model. Once
defined, these models are easy to reuse, including across
department boundaries, because the difficult calculations are
encapsulated.
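
The haul cycle time example can be sketched as a composition of smaller models. The three component functions here are deliberately trivial placeholders, and their names, input keys, and formulas are assumptions for illustration; in practice each could be replaced by a far more sophisticated model without changing the composed one.

```python
from typing import Callable, Dict

# A model maps a dictionary of named inputs to a time in minutes.
Model = Callable[[Dict[str, float]], float]


def load_time_model(inputs: Dict[str, float]) -> float:
    # Placeholder: payload divided by loading rate (tonnes per minute).
    return inputs["payload_tonnes"] / inputs["loader_rate_tpm"]


def spot_time_model(inputs: Dict[str, float]) -> float:
    # Placeholder: a fixed spotting time supplied as an input.
    return inputs["spot_minutes"]


def travel_time_model(inputs: Dict[str, float]) -> float:
    # Placeholder: distance over average speed, converted to minutes.
    return inputs["distance_km"] / inputs["speed_kmh"] * 60.0


def compose_sum(*models: Model) -> Model:
    """Build a new model whose output is the sum of its component models."""
    def combined(inputs: Dict[str, float]) -> float:
        return sum(model(inputs) for model in models)
    return combined


# Composed as described in the text: load time + spot time + travel time.
haul_cycle_time = compose_sum(load_time_model, spot_time_model, travel_time_model)

cycle_minutes = haul_cycle_time({
    "payload_tonnes": 220.0,
    "loader_rate_tpm": 55.0,
    "spot_minutes": 1.0,
    "distance_km": 5.0,
    "speed_kmh": 40.0,
})
```

A caller in another department only needs `haul_cycle_time` and its inputs; the details of each component stay encapsulated behind the shared `Model` shape.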
Model creation can be assisted by generative AI in the
same way that it can help generate the virtual database
ontology. Domain experts can guide the creation of models,
using AI to fill in the gaps. Once a model is created, the
domain expertise it contains becomes usable by the whole
company.
Putting it All Together
Together, these components define a powerful knowledge
graph that can leverage all the various types of AI discussed
previously.
Symbolic AI is used to define the virtual database
ontology, creating enriched, integrated data from the raw
data in the data lake. The relationships defined here by the
rules help provide data for everything else, standardizing
the complex set of mining data.
Machine learning works from this enriched virtual
database, where data is known to be cleaned and integrated.
This gives a strong base for training machine learning
models and doing statistical analysis.
Deterministic optimization can be done on the integrated
data, including testing multiple scenarios. Models provide a
clean way to store and reference these what-if scenarios,
something a database alone would struggle with.
Deep learning is also enhanced by the data lake’s abil-
ity to hold any type of file. This makes it easy to efficiently
store and reference data like pictures, video, audio, or what-
ever other rich data types are present at the mine.
Reinforcement learning can be done in a variety of
ways, from running simulations on old data to learning in
real time as new data is ingested into the data lake from
data pipelines. Reinforcement learning can recommend
optimized actions, and the results of those actions then
bring new data back into the system to be learned from.
Generative AI helps tie everything together. Guided by
information from the business context, it can help enrich
the knowledge graph using the knowledge graph’s own
metadata. Generative AI makes it possible to interact with
the system in natural language, and it turns written
descriptions and metadata into executable instructions.
Descriptions of rules, models, and more can aid in creating
a truly intelligent system.