// SAT405/poster.typ
#import "postercise.typ": *
#import themes.boxes: *
#import "@preview/fletcher:0.5.8" as fletcher: diagram, edge, node
#import fletcher.shapes: brace, diamond, hexagon, parallelogram, pill
#set page(width: 16in, height: 22in)
#set text(size: 16pt)
#show: theme.with(
primary-color: rgb(28, 55, 103), // Dark blue
background-color: white,
accent-color: rgb(243, 163, 30), // Yellow
titletext-color: white,
titletext-size: 2em,
)
#poster-header(
title: [Exploring SAM2 for Pavement Crack Segmentation #linebreak() ],
subtitle: [Zero-Shot Performance and Prompt Strategy Analysis],
authors: [Hanwen Yu, 2467345],
affiliation: [School of Advanced Technology, Supervisor: Siyue Yu],
logo-1: image("./img/xjtlu-o.png", width: 22em),
)
#poster-content(col: 3)[
#normal-box(color: none)[
== Introduction
The Segment Anything Model (SAM) has demonstrated remarkable
zero-shot segmentation capabilities on natural images. However, its zero-shot performance on domain-specific tasks remains underexplored.
We investigate SAM2's effectiveness for *pavement crack segmentation*, a task characterized by thin, *low-contrast* structures with *complex topologies*.
*Can SAM2 achieve competitive crack segmentation
performance without domain-specific training?*
]
#normal-box(color: none)[
== Methodology
We use the *Crack500 dataset* @PDFFeaturePyramid, which consists of 500 images with pixel-wise annotations of pavement cracks; we evaluate on its 100-image test set.
SAM2's promptable workflow differs from traditional end-to-end segmentation, as shown in @workflow. It also supports *different prompt strategies*; we evaluate the four approaches listed in @types-of-prompts:
#show table.cell: set text(size: 14pt)
#let frame(stroke) = (x, y) => (
left: if x > 0 { 0.2pt } else { stroke },
right: stroke,
top: if y < 2 { stroke } else { 0.2pt },
bottom: stroke,
)
#set table(
fill: (rgb("EAF2F5"), none),
stroke: frame(1pt + rgb("21222C")),
)
#show figure.where(
kind: table,
): set figure.caption(position: bottom)
#figure(
table(
columns: 2,
table.header([Prompt Type], [Description]),
[Bounding Box], [Tight box around ground truth mask],
[1-Point Prompt], [Single point sampled from GT skeleton (morphological center)],
[3-Point Prompt], [Three uniformly distributed points along GT skeleton],
[5-Point Prompt], [Five uniformly distributed points along GT skeleton],
),
caption: [Types of Prompts],
) <types-of-prompts>
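The point prompts in @types-of-prompts are sampled from the GT skeleton. A minimal sketch of how this sampling could be implemented (the function name and the left-to-right ordering heuristic are our assumptions, not the exact implementation):
```python
# Hypothetical sketch of the skeleton-based prompt sampling described above;
# the ordering heuristic is an assumption, not the exact code used.
import numpy as np
from skimage.morphology import skeletonize

def sample_skeleton_points(gt_mask: np.ndarray, n_points: int) -> np.ndarray:
    """Return n_points (x, y) prompts spread along the GT crack skeleton."""
    skel = skeletonize(gt_mask > 0)   # thin the GT mask to a 1-px skeleton
    ys, xs = np.nonzero(skel)         # skeleton pixel coordinates
    order = np.argsort(xs)            # crude left-to-right ordering heuristic
    # Interior linspace: n_points = 1 yields the skeleton's central pixel;
    # larger n_points yields points spread uniformly along the skeleton.
    idx = np.linspace(0, len(xs) - 1, n_points + 2)[1:-1].astype(int)
    return np.stack([xs[order][idx], ys[order][idx]], axis=1)
```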
#set text(size: 16pt)
#figure(
diagram(
node-fill: gradient.radial(white, blue, radius: 200%),
node-stroke: blue,
spacing: 25pt,
(
node((0, 0), [Crack Image], shape: rect),
node((0, 1), [SAM Image Encoder], shape: rect),
node((0, 2), [Prompt Generation #linebreak() BBox, 1/3/5 points], shape: rect),
node((1, 2), [SAM Mask Decoder], shape: rect),
node((1, 1), [Predicted Mask], shape: rect),
node((1, 0), [Metrics (IoU, F1)], shape: rect),
)
.intersperse(edge("-|>"))
.join(),
),
caption: [SAM2 segmentation workflow],
) <workflow>
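As a rough illustration of this loop, one prompted inference pass might look like the sketch below. The `sam2` predictor API and checkpoint name are assumptions based on the public SAM2 repository, and `crack_image`, `gt_box`, and `gt_mask` are hypothetical inputs:
```python
# Hypothetical sketch of one prompted SAM2 inference pass; verify the
# predictor API against the sam2 repository before use.
import numpy as np
from sam2.sam2_image_predictor import SAM2ImagePredictor

predictor = SAM2ImagePredictor.from_pretrained("facebook/sam2-hiera-large")
predictor.set_image(crack_image)  # crack_image: H x W x 3 uint8 RGB array

# Bounding-box prompt: tight (x0, y0, x1, y1) box around the GT mask.
masks, scores, _ = predictor.predict(box=gt_box)

# Point prompts: (x, y) coordinates, label 1 = foreground (on the crack).
points = sample_skeleton_points(gt_mask, n_points=5)
masks, scores, _ = predictor.predict(
    point_coords=points,
    point_labels=np.ones(len(points)),
)
```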
For comparison, we include supervised baselines: UNet, DeepCrack, TransUNet, CT-CrackSeg, VM-UNet, and CrackSegMamba.
]
#normal-box(color: none)[
== Experiments and Results
#figure(
image("img/examples.png"),
caption: [Examples of SAM2 results],
)
*Evaluation*
#show math.equation: set text(size: 14pt)
#set math.equation(numbering: "(1)")
$ bold("IoU") = "TP" / ("TP" + "FP" + "FN") $ <iou_e>
$ bold("F1") = 2 * ("Precision" * "Recall") / ("Precision" + "Recall") $ <f1>
#figure(
image("img/metrics.png"),
caption: [Model Metrics Comparison ],
)
SAM2 with bbox prompts (39.6% IoU) lags behind the supervised models, including even the original 2015 UNet.
#figure(
image("img/sam_iou.png", width: 14em),
caption: [IoU of SAM2 with 4 prompt strategies],
)
Bounding-box prompts yield the best zero-shot performance; there is a 4.7× gap between bbox (39.6% IoU) and 1-point prompts (8.4% IoU).
]
#normal-box(color: none)[
== Qualitative Analysis
#figure(
image("img/fail1.png"),
caption: [Failure Cases of SAM2 (bbox)],
)
#figure(
image("img/fail2.png"),
caption: [Failure Cases of SAM2 (5-point) ],
)
]
#normal-box(color: none)[
== Key Findings and Discussion
*Prompt Effectiveness.* Even the best zero-shot result falls well short of supervised baselines, highlighting the limitations of the zero-shot approach without fine-tuning.
*Single-Point Prompt Limitations.* 1-point prompts perform poorly (8.4% IoU), indicating that a single point gives insufficient guidance for thin, complex crack structures. 5-point prompts approach bbox performance on highly irregular cracks, suggesting that multiple points help capture crack shape.
]
#normal-box(color: none)[
== Conclusion and Future Work
SAM2 shows limited zero-shot capability for pavement crack segmentation. Bounding-box prompts significantly outperform point-based prompts, but performance still lags behind supervised methods, indicating the need for domain adaptation.
]
#poster-footer[
// Content
#normal-box(color: none)[
== References
#columns()[
#bibliography("./cit.bib", title: none)
]
]
Hanwen Yu | Email: Hanwen.Yu24\@student.xjtlu.edu.cn
]
]