24-092
Utilizing Big Data Statistical Techniques in Python to
Optimize Geometallurgy Workflow for Metallurgical
Test Work Sample Selection
Muhammad Usman Siddiqui
Ausenco, Vancouver, BC
Kevin Erwin
Ausenco, Vancouver, BC
Rajiv Chandramohan
Ausenco, Vancouver, BC
Connor Meinke
Ausenco, Vancouver, BC
Shaihroz Khan
Ausenco, Toronto, ON
ABSTRACT
High-quality sample selection for metallurgical test work
is essential to a geometallurgy study, but large, multidimensional
datasets make sample selection a daunting task because it is
difficult to classify the data while respecting its heterogeneity.
This paper presents a streamlined approach to sample selection
that uses custom-built Python tools to standardize the
methodology, saving time and cost. The approach combines the
cumulative sum method, principal component analysis, and
k-means clustering to cluster the data and select representative
samples. A case study demonstrates the effectiveness of the
methodology by selecting 40 samples for flotation test work.
NOMENCLATURE
PCA—Principal component analysis
CuSum—Cumulative sum
PC—Principal component
WCSS—Within-cluster sum of squares
INTRODUCTION
Geometallurgy is a methodology used in the mining industry
to bridge the gap between metallurgy and geology. It aims to
reduce technical risk and enhance the economic performance
of a mineral processing plant by accounting for the variability
in a deposit. Doing so strengthens investor confidence,
facilitates robust revenue models, and allows scenarios with
variable throughput rates to be developed so that cash flows
for future mining operations can be forecast more accurately.
A geometallurgy study looks at the relationship between a
deposit’s geological characteristics, its variability, and its
response to metallurgical processes. Selecting samples that
capture the heterogeneity of the deposit for metallurgical test
work is an essential component of a geometallurgy study.
High-quality sampling is vital across the entire mine
value chain, as sampling errors are additive and generate
monetary and intangible losses (Dominy et al., 2018). The
goal is to select samples that accurately describe the deposit
(Dominy et al., 2018). A geometallurgy database usually
consists of sample ID, mineral grade, lithology, alteration,
and test work data if available. It is the basis for choosing
representative samples for metallurgical characterization
test work (competency, hardness, recoveries), the results of
which are then used in robust flowsheet development and
equipment selection for optimum life-of-mine performance.
However, these datasets can be large, with multiple columns,
making sample selection a daunting task.
Michaux et al. (2020) present a framework for developing
a geometallurgical program in which they discuss a methodology
for grouping a geometallurgy database into clusters of similar
samples, and they present the cumulative sum (CuSum)