#import "@preview/touying:0.6.1": *
|
|
#import themes.university: *
|
|
|
|
|
|
#show: university-theme.with(
|
|
aspect-ratio: "16-9",
|
|
config-info(
|
|
title: [Detection of Foreign Objects on Railway Tracks],
|
|
subtitle: [A Pilot Study with RandLA-Net],
|
|
author: [Hanwen Yu],
|
|
date: datetime.today(),
|
|
institution: [SAT, XJTLU],
|
|
logo: emoji.train,
|
|
),
|
|
)

// Title slide
#title-slide()

// Outline slide
#slide[
= Outline

#set text(size: 1.1em)
- Background & Problem Statement
- Pilot Study Design
- Data & Metrics
- Results & Discussion
]

// Motivation & Goals
= Background and Problem Statement

#slide[
#text(size: 1.2em, weight: "bold")[Why This Matters]

- Undetected foreign objects on railway tracks can cause derailments and catastrophic accidents
- Manual inspection is time-consuming and error-prone
- The financial impact of railway accidents is significant
][
#image("../assets/reallife-railway.png", width: 100%)
#text(size: 0.8em)[Fig: Real-life railway track scene]
]

#slide[
#text(size: 1.2em, weight: "bold")[Project Objectives]

- Develop an automated detection system using LiDAR and 3D point cloud segmentation
- Accurately identify foreign objects amidst complex railway geometry
- Maintain computational efficiency for practical deployment
][
#image("../assets/whurailway.png", width: 100%)
#text(size: 0.8em)[Fig: Example railway track point cloud, from the WHU-Railway3D dataset]
]

---

#text(weight: "bold")[Problem Statement]

Given a point cloud *$P = {p_1, p_2, dots, p_n}$*,

where each point *$p_i in RR^3$* represents a 3D coordinate in the railway environment,

our task is to assign each point a semantic label *$l_i in {0, 1, dots, C-1}$*,

where *$C = 13$* is the number of predefined classes.

The function $f: P -> L$ maps the input point cloud to the set of labels $L = {l_1, l_2, dots, l_n}$.
---
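
To make the mapping $f: P -> L$ concrete, here is a minimal PyTorch-style sketch (the `torch.nn.Linear` layer is only a placeholder for a trained segmentation network such as RandLA-Net; all names are illustrative):

```python
import torch

C = 13                           # number of semantic classes
points = torch.rand(100_000, 3)  # P: one scan, (n, 3) xyz coordinates

# Placeholder for a trained network: any f that maps (n, 3) points
# to (n, C) per-point class scores plays the same role.
model = torch.nn.Linear(3, C)

logits = model(points)           # (n, C) scores per point
labels = logits.argmax(dim=-1)   # L: (n,) labels in {0, ..., C - 1}
```
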
= Pilot Study Design

---

#slide()[

#v(1em)
#text(size: 1.2em, weight: "bold")[Pilot Study Aims]

- Establish *baseline* performance using RandLA-Net
- Evaluate the feasibility of detecting extremely rare objects (0.001% of the data)

][
#image("../assignment_3/fig/example.jpg", width: 100%)
]

---

#slide[

#text(size: 1.1em, weight: "bold")[Current Approaches]

- PointNet and PointNet++: improved local feature extraction, but computationally expensive for large point clouds

- *RandLA-Net*: balances efficiency and accuracy through random sampling with local feature aggregation

- Attention-based methods (e.g., Point Transformer): focus on global context but may overlook local details

We choose *RandLA-Net* as our baseline.
]

= Data and Metrics

#slide[

#text(size: 1.1em, weight: "bold")[Training Setup]

- *858* training files, *172* test files
- Only *18* training files and *1* test file contain foreign objects
- *1/4* downsampling ratio (see the sketch after this slide)
- NVIDIA RTX 4090 GPU

][
#text(size: 1.1em, weight: "bold")[Data Collection]

- *1,031* PLY files with *248M+* points
- *13* semantic classes covering railway infrastructure elements
- The "Box" class (label *11*) represents foreign objects
- Extreme class imbalance: boxes account for only *0.001%* of points

]
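
---

A minimal sketch of the *1/4* downsampling step, assuming plain random subsampling of scans already loaded as NumPy arrays (array names are illustrative; this is not our exact preprocessing code):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

points = rng.random((1_000_000, 3))             # one scan: (n, 3) xyz
labels = rng.integers(0, 13, size=len(points))  # per-point class labels

# Keep a random 1/4 of the points; subsample labels with the same
# indices so points and labels stay aligned.
keep = rng.choice(len(points), size=len(points) // 4, replace=False)
points_ds, labels_ds = points[keep], labels[keep]
```

Note that uniform random sampling keeps only about a quarter of the *3,080* box points in expectation, which makes the already rare "Box" class even harder to learn.
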
#slide(composer: (2fr, 3fr))[
#text(size: 1.2em, weight: "bold")[Inspecting the Data]

#v(1em)
#text(size: 0.8em)[Table: Distribution of semantic classes in the railway LiDAR dataset]
][
#set text(size: 0.7em)
#table(
  columns: (auto, auto, auto, auto),
  inset: 8pt,
  align: (center, left, right, right),
  stroke: 0.4pt,
  [*Label*], [*Class Name*], [*Point Count*], [*Percentage*],
  [0], [Track], [16,653,029], [6.71%],
  [1], [Track Surface], [39,975,480], [16.11%],
  [2], [Ditch], [7,937,154], [3.20%],
  [3], [Masts], [4,596,199], [1.85%],
  [4], [Cable], [2,562,683], [1.03%],
  [5], [Tunnel], [31,412,582], [12.66%],
  [6], [Ground], [73,861,934], [29.76%],
  [7], [Fence], [7,834,499], [3.16%],
  [8], [Mountain], [51,685,366], [20.82%],
  [9], [Train], [9,047,963], [3.65%],
  [10], [Human], [275,077], [0.11%],
  [11], [*Box (object)*], [*3,080*], [*0.001%*],
  [12], [Others], [2,360,810], [0.95%],
)
]
#slide[
#text(size: 1.2em, weight: "bold")[Evaluation Metrics]

For each class $c$, the IoU is calculated as:
$
"IoU"_c = frac("TP"_c, "TP"_c + "FP"_c + "FN"_c)
$

where $"TP"_c$, $"FP"_c$, and $"FN"_c$ represent true positives, false positives, and false negatives for class $c$, respectively. The mIoU is then calculated by averaging the IoU values across all classes:
$
"mIoU" = 1 / C sum_(c=1)^(C) "IoU"_c
$
]

#slide[
#text(weight: "bold")[Precision]

$
"Precision"_"box" = frac("TP"_"box", "TP"_"box" + "FP"_"box")
$

where $"TP"_"box"$ and $"FP"_"box"$ represent true positives and false positives for the "Box" class, respectively.
]
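
---

Both metrics can be computed from a confusion matrix. A minimal NumPy sketch (assuming `gt` and `pred` are flat per-point label arrays; the random arrays below only stand in for real data):

```python
import numpy as np

C = 13  # number of semantic classes

def confusion(gt, pred, C):
    # cm[i, j] = number of points with ground truth i predicted as j
    return np.bincount(C * gt + pred, minlength=C * C).reshape(C, C)

def per_class_iou(cm):
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp   # predicted c, ground truth != c
    fn = cm.sum(axis=1) - tp   # ground truth c, predicted != c
    return tp / np.maximum(tp + fp + fn, 1)

def precision(cm, c):
    return cm[c, c] / max(cm[:, c].sum(), 1)   # TP / (TP + FP)

rng = np.random.default_rng(0)
gt = rng.integers(0, C, size=10_000)
pred = rng.integers(0, C, size=10_000)

cm = confusion(gt, pred, C)
print("mIoU:", per_class_iou(cm).mean(), "Precision_box:", precision(cm, 11))
```
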

= Results and Discussion

#slide()[
#text(weight: "bold")[Results]

- The overall mean IoU across all classes was *70.29%*
- The IoU for our target class, *"Box"* (foreign object), was *0.00%* (discussed in the following slides)
- IoU for the other classes was relatively high, with *"Train"* reaching *95.22%* and *"Ground"* *89.68%*
][

#set text(size: 0.8em)
#table(
  columns: (auto, auto, auto),
  inset: 8pt,
  align: (center, left, right),
  stroke: 0.4pt,
  [*Label*], [*Class Name*], [*IoU (%)*],
  [0], [Track], [60.12],
  [1], [Track Surface], [74.53],
  [2], [Ditch], [74.21],
  [3], [Masts], [82.48],
  [4], [Cable], [73.62],
  [5], [Tunnel], [83.03],
  [6], [Ground], [89.68],
  [7], [Fence], [79.81],
  [8], [Mountain], [91.93],
  [9], [Train], [95.22],
  [10], [Human], [61.86],
  [11], [Box (foreign object)], [0.00],
)
]

#slide[

#text(weight: "bold")[Visualization]

- The model behaves as if roughly half of the points in every scene belong to the mountain class
- Even in scenes where no mountain is present, the model still predicts trains and other objects as mountain

][

#image("../assets/pred.png", width: 90%)
#image("../assets/truth.png", width: 90%)
#text(size: 0.8em)[Fig: Prediction (top) vs. ground truth (bottom)]

]

#slide()[

#text(weight: "bold")[Why the Model Performs Poorly on Boxes]

Consider the cross-entropy loss:

$
ell(x, y) = 1 / N sum_(n=1)^N - w_(y_n) log frac(exp(x_(n, y_n)), sum_(c=1)^C exp(x_(n, c)))
$

where $w_(y_n)$ is the weight for class $y_n$ and $N$ is the number of points in the batch.

In this pilot we *blindly set the weight for every class to 1*, which is not suitable: we should put more weight on rare classes such as "Box" and "Human" (see the sketch after this slide).
]
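
---

One way to implement this reweighting is inverse-frequency class weights, shown in a hedged PyTorch sketch below (the frequencies are the fractions from our class-distribution table; the normalization choice is illustrative):

```python
import torch
import torch.nn as nn

# Per-class point fractions from the dataset table
# (Track, Track Surface, ..., Human, Box, Others).
freq = torch.tensor([0.0671, 0.1611, 0.0320, 0.0185, 0.0103, 0.1266,
                     0.2976, 0.0316, 0.2082, 0.0365, 0.0011, 0.00001,
                     0.0095])

# Inverse-frequency weights: rare classes such as "Box" and "Human"
# get far larger weights than "Ground" or "Mountain".
weights = 1.0 / freq
weights = weights / weights.sum() * len(freq)   # keep mean weight at 1

criterion = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(4096, 13)          # per-point class scores
target = torch.randint(0, 13, (4096,))  # per-point labels
loss = criterion(logits, target)
```

Raw inverse frequency can over-amplify a class this rare; smoother variants such as square-root or log-inverse frequency are common alternatives.
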
#slide()[

#text(weight: "bold")[Why the Model Performs Poorly on Boxes]

- In our dataset, *"Ground"* and *"Mountain"* are the majority classes, with *29.76%* and *20.82%* of the points respectively, together accounting for *50.58%* of the dataset.
- The "Box" class is extremely rare: only 0.001% of points are labeled "Box".
- A model that gets only these majority classes right is already correct on about half of all points, so the unweighted loss stays low even if boxes are never predicted (see the sketch after this slide).

]
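
---

A small numeric check of this argument (the sampled labels only mimic our class proportions; a real evaluation would use the test scans):

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample 1M labels with roughly the dataset's class proportions.
p = np.array([0.0671, 0.1611, 0.0320, 0.0185, 0.0103, 0.1266,
              0.2976, 0.0316, 0.2082, 0.0365, 0.0011, 0.00001,
              0.0095])
labels = rng.choice(13, size=1_000_000, p=p / p.sum())

# A degenerate model: correct on Ground (6) and Mountain (8),
# and it never predicts "Box" (11).
pred = np.where(np.isin(labels, [6, 8]), labels, 6)

print("accuracy:", (pred == labels).mean())  # ~0.51, yet Box IoU = 0
```
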

#slide(align: auto)[

#text(weight: "bold")[Why the Model Performs Poorly on Boxes]

#v(5em)

The model is *biased towards the majority classes*, leading to poor performance on the minority class (foreign objects).

]

#slide[

#text(weight: "bold")[Future Work]

- Add class weights to the loss function to *address class imbalance*
- Explore *data augmentation* techniques to increase the representation of the "Box" class (a sketch follows this slide)
- Consider *ensemble methods* or multi-task learning to improve detection performance
]
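
---

A minimal sketch of one such augmentation: replicating the *18* box-containing training scans with a random z-axis rotation and small jitter (a hypothetical illustration assuming coordinates in meters, not a tested pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(points):
    """Randomly rotate a scan around the z-axis and add small jitter."""
    theta = rng.uniform(0, 2 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    jitter = rng.normal(scale=0.01, size=points.shape)  # ~1 cm noise
    return points @ rot.T + jitter

# Hypothetical oversampling: re-augment each rare scan every epoch
# so the "Box" class is seen more often during training.
scan = rng.random((100_000, 3))          # stands in for a box scan
augmented = [augment(scan) for _ in range(10)]
```
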

= The end

#text(size: 0.6em)[
Work by Hanwen Yu,
supervised by Dr. Siyue Yu
]