init: first version

Dustella 2025-12-13 21:49:17 +08:00
commit 78033ee073
No known key found for this signature in database
GPG Key ID: C6227AE4A45E0187
9 changed files with 156 additions and 0 deletions

1
.gitignore vendored Normal file

@@ -0,0 +1 @@
*.pdf

BIN
img/examples.png Normal file

Binary file not shown.

Size: 731 KiB

BIN
img/fail1.png Normal file

Binary file not shown.

Size: 2.0 MiB

BIN
img/fail2.png Normal file

Binary file not shown.

Size: 2.3 MiB

BIN
img/metrics.png Normal file

Binary file not shown.

Size: 53 KiB

BIN
img/sam_f1.png Normal file

Binary file not shown.

Size: 21 KiB

BIN
img/sam_iou.png Normal file

Binary file not shown.

Size: 21 KiB

BIN
img/xjtlu-o.png Normal file

Binary file not shown.

Size: 23 KiB

155
poster.typ Normal file

@@ -0,0 +1,155 @@
#import "@preview/postercise:0.2.0": *
#import themes.boxes: *
#import "@preview/fletcher:0.5.8" as fletcher: diagram, edge, node
#set page(width: 16in, height: 22in)
#set text(size: 16pt)
#show: theme.with(
primary-color: rgb(28, 55, 103), // Dark blue
background-color: white,
accent-color: rgb(243, 163, 30), // Yellow
titletext-color: white,
titletext-size: 1.8em,
)
#poster-header(
title: [Can SAM "Segment Anything"? #linebreak() ],
subtitle: [Evaluating Zero-Shot Performance on Crack Detection],
authors: [Hanwen Yu],
affiliation: [School of Advanced Technology, Supervisor: SiYue Yu
],
logo-2: image("./img/xjtlu-o.png", width: 15em),
)
// #image("examples.png", width: 100%)
#poster-content(col: 3)[
// Content goes here
#normal-box(color: none)[
== Introduction
The Segment Anything Model (SAM) has demonstrated remarkable
zero-shot segmentation capabilities on natural images. However, its zero-shot performance on domain-specific tasks remains underexplored.
// WHY CRACK SEGMENTATION?
// • Critical for infrastructure safety monitoring
// • Challenging characteristics:
// - Thin, elongated structures (often 1-5 pixels wide)
// - Low contrast against background
// - Complex branching topology
// RESEARCH QUESTION
*Can SAM2 achieve competitive crack segmentation
performance without domain-specific training?*
// CONTRIBUTIONS
// • First systematic evaluation of SAM2 zero-shot capability
// on crack segmentation
// • Comprehensive comparison of prompt strategies
// (bounding box vs. point-based prompts)
// • Analysis of failure modes and practical limitations
]
#normal-box(color: none)[
== Methodology
*Dataset*
- Crack500: 500 images with pixel-wise annotations
- Test set: 100 images for evaluation
*Prompt Strategies*
We evaluate four prompt generation approaches:
#table(
columns: 2,
[Prompt Type], [Description],
[Bounding Box], [Tight box around ground truth mask],
[1-Point Prompt], [Single point sampled from GT skeleton (morphological center)],
[3-Point Prompt], [Three uniformly distributed points along GT skeleton],
[5-Point Prompt], [Five uniformly distributed points along GT skeleton],
)
*Evaluation*
$
"IoU" = "TP" / ("TP" + "FP" + "FN")
$
$
"F1" = 2 dot ("Precision" dot "Recall") / ("Precision" + "Recall")
$
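// Sketch of the definitions the two formulas above rely on. Assumption (not
// stated in the original poster): TP, FP and FN are pixel-wise counts over the
// binary crack mask, and Precision/Recall follow their standard definitions.
$
"Precision" = "TP" / ("TP" + "FP"), quad "Recall" = "TP" / ("TP" + "FN")
$
// Substituting these into F1 gives an equivalent count-based form, which makes
// the relation to IoU explicit (same counts, TP weighted twice).
$
"F1" = (2 "TP") / (2 "TP" + "FP" + "FN")
$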
*Baselines*
Supervised models: UNet, DeepCrack, TransUNet,
CT-CrackSeg, VM-UNet, CrackSegMamba
#import fletcher.shapes: brace, diamond, hexagon, parallelogram, pill
#set text(size: 16pt)
#diagram(
node-fill: gradient.radial(white, blue, radius: 200%),
node-stroke: blue,
spacing: 25pt,
(
node((0, 0), [Crack Image], shape: rect),
node((0, 1), [SAM Image Encoder], shape: rect),
node((0, 2), [Prompt Generation #linebreak() BBox, 1/3/5 points], shape: rect),
node((1, 2), [SAM Mask Decoder], shape: rect),
node((1, 1), [Predicted Mask], shape: rect),
node((1, 0), [Metrics (IoU, F1)], shape: rect),
)
.intersperse(edge("-|>"))
.join(),
)
]
#normal-box(color: none)[
== Experiments and Results
#image("img/examples.png")
#image("img/metrics.png")
#image("img/sam_iou.png")
#image("img/sam_f1.png")
]
#normal-box(color: none)[
== Qualitative Analysis
#image("img/fail1.png")
#image("img/fail2.png")
]
#normal-box(color: none)[
== Key Findings and Discussion
// *Prompt Effectiveness*
Bounding box prompts yield the best performance among the zero-shot methods, with a 4.7x gap between bbox prompts (39.6% IoU) and 1-point prompts (8.4% IoU).
SAM2 with bbox prompts (39.6% IoU) still lags behind the supervised models, including UNet from 2015, which highlights the limitations of the zero-shot approach without fine-tuning.
// *Single Point Prompt Limitations*
1-point prompts perform poorly, indicating that a single point gives insufficient guidance for complex crack structures. 5-point prompts approach bbox performance for highly irregular cracks, suggesting that multiple points help capture crack shape.
]
#normal-box(color: none)[
== Conclusion and Future Work
SAM2 shows limited zero-shot capability for crack segmentation. Bounding box prompts significantly outperform point-based prompts. Performance still lags behind supervised methods, indicating a need for domain adaptation.
]
#poster-footer[
// Content
Hanwen Yu | Email: Hanwen.Yu24\@student.xjtlu.edu.cn
]
]