1158 XXXI International Mineral Processing Congress 2024 Proceedings/Washington, DC/Sep 29–Oct 3
objects are, and it can generate masks for any object in any
image or video, including objects and image types that it
had not encountered during training. SAM is general
enough to cover a broad set of use cases and can be
used out of the box on new image “domains”—whether
underwater photos or cell microscopy—without requiring
additional training (a capability often referred to as zero-
shot transfer) (Kirillov et al., 2023).
SAM’s training set (SA-1B) consists of 11 million
diverse, high-resolution, licensed, and privacy-protecting
images and 1.1 billion high-resolution masks collected with
Meta’s data engine (Kirillov et al., 2023). It may therefore
not be directly transferable to niche tasks involving data
such as mineralogical images or specific data formats.
Recent publications have nonetheless applied SAM to a
variety of challenging scenarios, such as medical images
(Ma et al., 2024) and specific data formats.
Under the hood, a heavyweight image encoder pro-
duces a one-time embedding for the image that can be effi-
ciently queried, while a lightweight encoder converts any
prompt into an embedding vector in real time; the structure
is shown in Figure 2 (Kirillov et al., 2023). These two infor-
mation sources are then combined in a lightweight decoder
that predicts segmentation masks. For ambiguous prompts
corresponding to more than one object, SAM can output
multiple valid masks with associated confidence scores.
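This split is what makes interactive prompting fast: the expensive encoder runs once per image, and every subsequent prompt is answered against the cached embedding. The toy numpy sketch below illustrates only that data flow; the dimensions, random projections, and function names are made up here and are not SAM’s actual networks.

```python
import numpy as np

rng = np.random.default_rng(0)

EMB_DIM = 256  # made-up embedding width for illustration

def heavy_image_encoder(image):
    """Stand-in for the heavyweight image encoder: expensive, run once per image."""
    # Flatten the image and project it to a fixed-size embedding.
    w = rng.standard_normal((image.size, EMB_DIM))
    return image.reshape(-1) @ w  # shape: (EMB_DIM,)

def light_prompt_encoder(point_xy):
    """Stand-in for the lightweight prompt encoder: cheap, run per prompt."""
    w = rng.standard_normal((2, EMB_DIM))
    return np.asarray(point_xy, dtype=float) @ w  # shape: (EMB_DIM,)

def light_decoder(image_emb, prompt_emb, out_shape=(8, 8)):
    """Stand-in for the mask decoder: combines both embeddings into a mask."""
    combined = image_emb + prompt_emb
    w = rng.standard_normal((EMB_DIM, out_shape[0] * out_shape[1]))
    logits = (combined @ w).reshape(out_shape)
    return logits > 0  # boolean mask

image = rng.random((8, 8))
image_emb = heavy_image_encoder(image)  # computed once per image

# Many prompts can now be answered cheaply against the cached embedding.
for point in [(2, 3), (5, 1), (7, 7)]:
    mask = light_decoder(image_emb, light_prompt_encoder(point))
```

The point of the pattern is amortization: in SAM the encoder dominates the cost, so caching its output is what allows real-time mask prediction for each new prompt.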
The parameters that play a major role in the quality of
the results are embedded in SAM’s algorithm. Figure 3
shows the segmented image; for easier visualization, each
segmented object is shown in a different color. To achieve
the result presented in Figure 3, the following parameters
were modified:
- box_nms_thresh and crop_nms_thresh were adjusted to
detect overlapping or superposed grains while avoiding
duplicated masks.
- pred_iou_thresh filters out low-quality masks.
- points_per_side is the number of points sampled along
one side of the image; the total number of points,
points_per_side**2, plays a major role in analysis
duration.
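The two NMS thresholds govern non-maximum suppression: when two predicted boxes overlap by more than the threshold IoU, only the higher-scoring one is kept. The following minimal sketch of that suppression logic is written from scratch for illustration and is not SAM’s implementation; the boxes and scores are invented example data.

```python
import numpy as np

def box_iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_thresh):
    """Keep the highest-scoring box of each cluster of overlapping boxes."""
    order = np.argsort(scores)[::-1]  # best score first
    keep = []
    for i in order:
        if all(box_iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep

# Two near-duplicate detections of one grain, plus a separate grain.
boxes = [(0, 0, 10, 10), (0.5, 0.5, 10.5, 10.5), (20, 20, 30, 30)]
scores = np.array([0.9, 0.8, 0.95])

kept = nms(boxes, scores, iou_thresh=0.7)
```

Lowering the threshold merges more detections into one (risking lost grains that genuinely touch); raising it keeps more overlapping boxes (risking duplicated masks), which is the trade-off tuned above for superposed grains.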
Figure 2. Segment Anything Model (SAM) structure overview. Source: Kirillov et al., 2023
Figure 3. Original image (a) and segmented result using Segment Anything Model (b), colors indicate individual segmented
objects