XXXI International Mineral Processing Congress 2024 Proceedings/Washington, DC/Sep 29–Oct 3 1159
Feature extraction
Each individual grain identified by SAM is extracted
from the original picture, as shown in Figure 4.
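A minimal sketch of this extraction step, assuming each grain is available as a boolean mask with the same height and width as the image. The function name `extract_grain` and the white background fill are illustrative choices and are not taken from the paper:

```python
import numpy as np

def extract_grain(image, mask, background=255):
    """Crop one grain from an RGB image using its boolean mask.

    image: (H, W, 3) uint8 array; mask: (H, W) boolean array.
    Pixels of the crop that lie outside the mask are set to `background`.
    """
    # Bounding box of the mask: first and last row/column containing the grain.
    rows = np.where(np.any(mask, axis=1))[0]
    cols = np.where(np.any(mask, axis=0))[0]
    r0, r1 = rows[0], rows[-1]
    c0, c1 = cols[0], cols[-1]
    crop = image[r0:r1 + 1, c0:c1 + 1].copy()
    # Hide neighbouring grains and background inside the bounding box.
    crop[~mask[r0:r1 + 1, c0:c1 + 1]] = background
    return crop
```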
Classification
Because of the nature of the material studied, the criteria used to
classify the grains can be based on color as a first approach.
For each grain, six color criteria are calculated:
• Median pixel value for each color channel (i.e., Red,
Green, Blue)
• Interquartile range for each channel.
This set of criteria is intended to capture the average color and
the spread of color (the color gradient) within each grain.
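The six descriptors can be sketched as follows, assuming the grain pixels are selected from an RGB image with a boolean mask. The function name `color_features` and the ordering of the output vector are assumptions made for illustration:

```python
import numpy as np

def color_features(image, mask):
    """Return [median R, median G, median B, IQR R, IQR G, IQR B] for one grain.

    image: (H, W, 3) array; mask: (H, W) boolean array selecting the grain.
    """
    pixels = image[mask].astype(float)  # shape (n_pixels, 3)
    # 25th, 50th and 75th percentiles, computed separately per channel.
    q1, median, q3 = np.percentile(pixels, [25, 50, 75], axis=0)
    return np.concatenate([median, q3 - q1])
```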
The k-means algorithm is a widely used partitional
clustering method that partitions a given dataset (n
observations) into k disjoint subsets, or clusters, by optimizing
a specific clustering criterion. The primary objective
is to minimize the clustering error, a commonly
employed criterion defined as the sum of the squared distances
of each data point from its corresponding cluster center.
The algorithm achieves this optimization by iteratively
updating cluster assignments and adjusting cluster centroids
until convergence (Likas et al., 2003). Its simplicity
of implementation and computational efficiency make the
k-means algorithm a popular choice for clustering
tasks, especially when the clustering error criterion aligns
with the characteristics of the data.
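Written out, the clustering error takes the following form. The symbols are introduced here for illustration: x_i is the descriptor vector of grain i, C_j is cluster j, and m_j is its center (the mean of the points assigned to it):

```latex
E(m_1, \dots, m_k) = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - m_j \rVert^2,
\qquad
m_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i
```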
The algorithm steps are as follows:
• Initialization: Specify the number k of clusters to
assign and randomly initialize k centroids.
• Repeat:
– Expectation: Assign each point to its closest
centroid (the distance is the Euclidean distance
between the vectors of descriptors defined above,
i.e., the median and the interquartile range for each
color channel).
– Maximization: Compute the new centroid (mean)
of each cluster.
• Until convergence, i.e., when the centroids no longer
change significantly, or until a specified number
of iterations is reached.
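The steps above can be sketched in a few lines of NumPy. This is an illustrative implementation under stated assumptions, not the authors' code: the function name `kmeans`, the tolerance `tol`, the iteration cap `max_iter`, and the choice of initializing centroids from randomly selected observations are all assumptions.

```python
import numpy as np

def kmeans(X, k, max_iter=100, tol=1e-6, seed=0):
    """Plain k-means on the rows of X (one row of descriptors per grain)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialization: use k distinct observations as starting centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iter):
        # Expectation: Euclidean distance from every point to every centroid,
        # then assign each point to the closest one.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Maximization: mean of each cluster (an empty cluster keeps its centroid).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        if shift < tol:  # centroids no longer change significantly
            break
    return labels, centroids
```

With k = 2 and the six color descriptors as input rows, the two returned labels would correspond to two color groups such as dark and light grains.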
Figure 5 shows an example of automatic clustering with the
k-means algorithm (with 2 clusters). The classification
separates the grains by color, in this case into two
categories: dark grains and light grains.
RESULTS AND DISCUSSION
SAM Algorithm Performance
One of the parameters that affects the running time of the
segmentation algorithm is points_per_side, as
described above. To obtain a better segmentation, the value
of points_per_side needs to be high enough; however, this
directly increases the running time of the segmentation algorithm.
Figure 6 presents the running time (in seconds) as a
function of the points_per_side parameter and shows a
nonlinear relation. In the context of process control performance
in day-to-day mining operations, a running time of
246 s (4.1 min) is already considered too long for the analysis
of one image. A balance must therefore be found between
processing time and the points_per_side value, or
the computing power must be increased to reduce the analysis time. The data
analysis was performed on a computing system equipped
with 16 cores, each running at 2.30 GHz.
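One factor that may contribute to the nonlinear relation: in SAM's automatic mask generator, points_per_side is the number of prompt points sampled along one side of the image, so (without additional image crops) the total number of prompts grows with its square. A small sketch for the three values tested here; the function name is illustrative, and no running times are assumed beyond those reported in Figure 6:

```python
def total_prompt_points(points_per_side):
    """Number of prompt points in a points_per_side x points_per_side grid."""
    return points_per_side ** 2

for n in (15, 33, 50):
    print(n, total_prompt_points(n))
# 15 -> 225, 33 -> 1089, 50 -> 2500
```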
The accuracy of the algorithm is assessed by the
number of objects detected in each iteration of the test, and
the results are presented in Table 1. The results show that
increasing the points_per_side parameter (15, 33 and 50)
Figure 4. Individual grains extracted from the original image. Grains are resized for uniform display;
no information on grain size is available.