1444 XXXI International Mineral Processing Congress 2024 Proceedings/Washington, DC/Sep 29–Oct 3
satisfactory coverage of all individual ore types, than it is to
select samples from all points of the compass. Once mul-
tivariate geological representativity is satisfied, aiming for
broad spatial coverage provides a degree of insurance in case
there are variations within ore types across the orebody that
have not previously been recognized, such as competency
increasing with depth or microscopic changes in grain size
or texture.
Finally, grouping material into domains with similar
characteristics can be of practical convenience for schedul-
ing mine production and blending and may assist with met-
allurgical sample selection and geological interpretation.
Assessing Representativity of Multivariate
Characteristics
A modern drill hole database is likely to include diverse
data types. Assay data commonly includes 30 or more
elements, hyperspectral data may include quantitative or
semi-quantitative analyses of minerals, and there may be
bulk density, rebound hardness measurements and visually
logged characteristics such as rock type or texture. With
such a large number of features available, assessment of
metallurgical sample representativity using traditional one-,
two- or even three-dimensional statistical or graphical tools
is neither efficient nor reproducible.
Machine learning provides many tools for assessing
sample representativity in multivariate terms. Unsupervised
machine learning algorithms, where the algorithm is not
directed by a priori assumptions about the outcomes, are
useful for grouping multivariate geological data with simi-
lar characteristics.
We illustrate the principles of dimension reduction and
group analysis using data from a porphyry copper project.
The geological database consists of 28,734 samples of 2 m
length, with multielement ICP-OES* assays, silicate altera-
tion type groups derived from processing of Corescan
hyperspectral data, sulphide mineral species estimated from
elemental assays and partial-digest copper assays, and hard-
ness measured with an Equotip rebound hardness tester.
The objective of the analysis was to assess whether samples
selected for flotation and comminution testing provided
good multivariate coverage of the complete set of drill hole
samples.
*Inductively coupled plasma-optical emission spectrophotometry
We applied the Uniform Manifold Approximation and
Projection (UMAP) dimension reduction algorithm to reduce
the high-dimensional data to a small number of dimensions
for viewing and grouping analysis. UMAP is a dimension
reduction technique (McInnes et al., 2020) widely used in
data science. It was chosen for the study because it
preserves local and global structure better than many other
dimension reduction algorithms and because the trained
transformation can be applied to multiple data sets, such as
drill hole data and block model values. As the study was
primarily concerned with flotation performance, we trained
UMAP with the chalcopyrite, pyrite, bornite, and
chalcocite† grades estimated from the assay data.
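The idea of training the transformation once and then applying it to more than one data set could be sketched as below. This is a minimal illustration and not the authors' code: it assumes the open-source umap-learn package (imported as `umap`) and NumPy are installed, and the column names in `TRAINING_FEATURES`, the helper names and the default parameter values are all hypothetical.

```python
# Hypothetical column names; the paper does not give its database schema.
TRAINING_FEATURES = ["chalcopyrite", "pyrite", "bornite", "chalcocite"]


def feature_matrix(records, features=TRAINING_FEATURES):
    """Extract the training features from a list of per-sample dicts."""
    return [[float(rec[name]) for name in features] for rec in records]


def fit_umap(train_rows, n_components=2, random_state=0):
    """Fit a UMAP reducer on the training rows and return it.

    The imports are local so that the helpers above stay usable
    even where umap-learn is not installed.
    """
    import numpy as np
    import umap  # provided by the umap-learn package

    reducer = umap.UMAP(n_components=n_components, random_state=random_state)
    reducer.fit(np.asarray(train_rows, dtype=float))
    return reducer


def embed(reducer, rows):
    """Project rows (drill hole or block model) with an already fitted reducer."""
    import numpy as np

    return reducer.transform(np.asarray(rows, dtype=float))
```

In use, one would call `fit_umap(feature_matrix(drill_hole_records))` once and then pass both the drill hole rows and the block model rows to `embed`, so that both data sets share the same UMAP space.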
Useful groups may be identified visually by plotting the
data according to the first two or three UMAP vectors.
However, to make the process repeatable and independent of
cognitive biases, a clustering algorithm is usually applied
to the transformed data to automate the identification and
labelling of groups. Clustering algorithms evaluate the
similarity of each data point to its neighbours. The number
of groups is selected by the user based on the usefulness of
the resulting groups: how well they are discriminated, how
interpretable they are, and whether they show meaningful
relationships. We applied the K-means clustering algorithm
to the first two UMAP vectors and, based on
experimentation, selected K = 7 as the most useful number of
groups.
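For readers unfamiliar with K-means, the sketch below shows the basic iteration (Lloyd's algorithm) on two-dimensional points such as the first two UMAP vectors. It is illustrative only and is not the implementation used in the study. The deterministic farthest-point initialisation is an assumption made here for reproducibility; library implementations typically use randomised schemes such as k-means++.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Pick k starting centroids: the first point, then repeatedly the
    point farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points,
                  key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(nxt)
    return centroids


def kmeans(points, k, max_iter=100):
    """Return (labels, centroids) after running Lloyd's algorithm."""
    centroids = [tuple(c) for c in farthest_point_init(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the result is stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return labels, centroids
```

Running `kmeans(points, k)` for several values of `k` and inspecting the resulting groups mirrors the experimentation described above, although the criteria for the most useful `k` remain a matter of judgement.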
Figure 1 shows the data points projected in UMAP
space, coloured according to the local density of the point
cloud. The first two vectors generated by the UMAP
dimension reduction were used as the x and y coordinates
for visualizing the data points. Data points that are similar
in a multivariate sense appear close together in the UMAP
space while those points that are dissimilar appear further
apart. Even without looking at the raw data, the point cloud
suggests three strongly discriminated groups. The K-means
group boundaries were approximated using Dirichlet tes-
sellation and the group centroids are shown as a black +
symbol and labelled with the group number.
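A Dirichlet (Voronoi) tessellation around the group centroids is equivalent to assigning each point to its nearest centroid. The sketch below uses that rule to compare how the full data set and a set of selected samples are distributed across the groups. It is one simple way of expressing coverage and is an assumption of this example, not a metric taken from the study. Function names are hypothetical, and groups are indexed from 0 rather than by the labels used in Figure 1.

```python
from collections import Counter


def nearest_group(point, centroids):
    """Index of the closest centroid, i.e. the Dirichlet cell holding the point."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(point, centroids[j])),
    )


def group_shares(points, centroids):
    """Fraction of the given points that falls in each group."""
    if not points:
        raise ValueError("at least one point is required")
    counts = Counter(nearest_group(p, centroids) for p in points)
    return [counts.get(j, 0) / len(points) for j in range(len(centroids))]


def coverage_table(all_points, selected_points, centroids):
    """Rows of (group index, share of all samples, share of selected samples)."""
    all_shares = group_shares(all_points, centroids)
    selected_shares = group_shares(selected_points, centroids)
    return list(zip(range(len(centroids)), all_shares, selected_shares))
```

A group with a substantial share of all samples but a zero share of selected samples would flag a gap in the metallurgical sample set.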
An important step in validating the selection of the
variables used to train the UMAP transformation, and the
value selected for K, is to visually assess the distribution
of all the variables across the K-means groups. The original
multivariate data is retained in the UMAP output file, so
the points in UMAP space can be interrogated in terms of any
of the variables in the input data.
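Alongside the visual assessment, a numerical summary of each variable by group can be a useful cross-check. The sketch below computes per-group means from records that keep the original variables next to the group label. The variable names in the usage note are hypothetical, and the choice of the mean as the summary statistic is an assumption of this example.

```python
from collections import defaultdict
from statistics import mean


def group_means(records, labels, variables):
    """Mean of each requested variable within each group.

    records   : list of dicts holding the original input variables
    labels    : group label for each record, in the same order
    variables : names of the variables to summarise
    """
    by_group = defaultdict(list)
    for rec, lab in zip(records, labels):
        by_group[lab].append(rec)
    return {
        lab: {var: mean(rec[var] for rec in recs) for var in variables}
        for lab, recs in sorted(by_group.items())
    }
```

For example, `group_means(records, labels, ["bornite", "hardness"])` would return one dictionary of means per group, covering variables that were not used to train the transformation as well as those that were.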
Figure 2 shows the data points coloured according to
the four training features. Bornite-dominant and chalcoc-
ite-dominant samples are strongly discriminated by the
UMAP transformation and K-means analysis. Pyrite and
chalcopyrite content show a more complex, largely nega-
tive correlation with gradational changes. Further exami-
nation showed that the multivariate patterns observed in
†The estimates of chalcocite content include covellite.