#import "@preview/touying:0.6.1": *
#import themes.university: *

#show: university-theme.with(
  aspect-ratio: "16-9",
  config-info(
    title: [Detection of Foreign Objects on Railway Tracks],
    subtitle: [A Pilot Study with RandLA-Net],
    author: [Hanwen Yu],
    date: datetime.today(),
    institution: [SAT, XJTLU],
    logo: emoji.train,
  ),
)

// Title slide
#title-slide()

// Outline slide
#slide[
  = Outline
  #set text(size: 1.1em)
  - Background & Problem Statement
  - Pilot Study Design
  - Data & Metrics
  - Results & Discussion
]

// Motivation & Goals
= Background and Problem Statement

#slide[
  #text(size: 1.2em, weight: "bold")[Why This Matters]
  - Undetected objects on railway tracks can cause derailments and catastrophic accidents
  - Manual inspection is time-consuming and error-prone
  - Railway accidents carry a significant financial cost
][
  #image("../assets/reallife-railway.png", width: 100%)
  #text(size: 0.8em)[Fig: Real-life railway track scene]
]

#slide[
  #text(size: 1.2em, weight: "bold")[Project Objectives]
  - Develop an automated detection system using LiDAR and 3D point cloud segmentation
  - Accurately identify foreign objects amidst complex railway geometry
  - Maintain computational efficiency for practical deployment
][
  #image("../assets/whurailway.png", width: 100%)
  #text(size: 0.8em)[Fig: Railway track point cloud example, from the WHURailway3D dataset]
]

---

#text(weight: "bold")[Problem Statement]

Given a point cloud *$P = \{p_1, p_2, dots, p_n\}$*, where each point *$p_i in RR^3$* is a 3D coordinate in the railway environment, our task is to assign each point a semantic label *$l_i in \{0, 1, dots, C-1\}$*, where *$C = 13$* is the number of predefined classes.

The function $f: P -> L$ maps the input point cloud to the set of labels $L = \{l_1, l_2, dots, l_n\}$.
---

= Pilot Study Design

---

#slide()[
  #v(1em)
  #text(size: 1.2em, weight: "bold")[Pilot Study Aims]
  - Establish *baseline* performance using RandLA-Net
  - Evaluate the feasibility of detecting extremely rare objects (0.001% of data)
][
  #image("../assignment_3/fig/example.jpg", width: 100%)
]

---

#slide[
  #text(size: 1.1em, weight: "bold")[Current Approaches]
  - PointNet and PointNet++: improved local feature extraction, but computationally expensive for large point clouds
  - *RandLA-Net*: balances efficiency and accuracy through random sampling with local feature aggregation
  - Attention-based methods (e.g., Point Transformer): focus on global context but may overlook local details

  We choose *RandLA-Net* as the baseline.
]

= Data and Metrics

#slide[
  #text(size: 1.1em, weight: "bold")[Training Setup]
  - *858* training files, *172* test files
  - Only *18* training files and *1* test file contain foreign objects
  - *1/4* downsampling ratio
  - NVIDIA RTX 4090 GPU
][
  #text(size: 1.1em, weight: "bold")[Data Collection]
  - *1,031* PLY files with *248M+* points
  - *13* semantic classes, including railway infrastructure elements
  - The "Box" class (label *11*) represents foreign objects
  - Extreme class imbalance: boxes are only *0.001%* of points
]

#slide(composer: (2fr, 3fr))[
  #text(size: 1.2em, weight: "bold")[Data Inspection]
  #v(1em)
  #text(size: 0.8em)[Table: Distribution of semantic classes in the railway LiDAR dataset]
][
  #set text(size: 0.7em)
  #table(
    columns: (auto, auto, auto, auto),
    inset: 8pt,
    align: (center, left, right, right),
    stroke: 0.4pt,
    [*Label*], [*Class Name*], [*Point Count*], [*Percentage*],
    [0], [Track], [16,653,029], [6.71%],
    [1], [Track Surface], [39,975,480], [16.11%],
    [2], [Ditch], [7,937,154], [3.20%],
    [3], [Masts], [4,596,199], [1.85%],
    [4], [Cable], [2,562,683], [1.03%],
    [5], [Tunnel], [31,412,582], [12.66%],
    [6], [Ground], [73,861,934], [29.76%],
    [7], [Fence], [7,834,499], [3.16%],
    [8], [Mountain], [51,685,366], [20.82%],
    [9], [Train], [9,047,963], [3.65%],
    [10], [Human],
    [275,077], [0.11%],
    [11], [*Box (object)*], [*3,080*], [*0.001%*],
    [12], [Others], [2,360,810], [0.95%],
  )
]

#slide[
  #text(size: 1.2em, weight: "bold")[Evaluation Metrics]

  For each class $c$, the IoU is calculated as:
  $ "IoU"_c = frac("TP"_c, "TP"_c + "FP"_c + "FN"_c) $
  where $"TP"_c$, $"FP"_c$, and $"FN"_c$ are the true positives, false positives, and false negatives for class $c$, respectively.

  The mIoU is the average of the IoU values across all $C$ classes:
  $ "mIoU" = 1 / C sum_(c=0)^(C-1) "IoU"_c $
]

#slide[
  #text(weight: "bold")[Precision]
  $ "Precision"_"box" = frac("TP"_"box", "TP"_"box" + "FP"_"box") $
  where $"TP"_"box"$ and $"FP"_"box"$ are the true positives and false positives for the "Box" class, respectively.
]

= Results and Discussion

#slide()[
  #text(weight: "bold")[Results]
  - The overall mean IoU across all classes was *70.29%*.
  - The IoU for our target class, *"Box"* (foreign object), was *0.00%* (discussed later).
  - IoU for the other classes was relatively high, with *"Train"* achieving *95.22%* and *"Ground"* achieving *89.68%*.
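
The IoU, mIoU and precision definitions above can be sketched in a few lines of plain Python. This is a minimal illustration on toy labels, not the evaluation code used in the study; the function names are hypothetical, and returning 0.0 when a denominator is zero is an assumption.

```python
def confusion_counts(y_true, y_pred, num_classes):
    """Count true positives, false positives and false negatives per class."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted p, but the point is not p
            fn[t] += 1  # the point is t, but was not predicted as t
    return tp, fp, fn


def per_class_iou(y_true, y_pred, num_classes):
    """IoU_c = TP_c / (TP_c + FP_c + FN_c); assumed 0.0 when the denominator is 0."""
    tp, fp, fn = confusion_counts(y_true, y_pred, num_classes)
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        ious.append(tp[c] / denom if denom else 0.0)
    return ious


def mean_iou(ious):
    """mIoU = (1 / C) * sum of the per-class IoU values."""
    return sum(ious) / len(ious)


def precision(y_true, y_pred, cls):
    """Precision_cls = TP / (TP + FP); assumed 0.0 if cls is never predicted."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == cls and t == cls)
    predicted = sum(1 for p in y_pred if p == cls)
    return tp / predicted if predicted else 0.0


# Toy example with 3 classes (not the railway data).
y_true = [0, 0, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2]
ious = per_class_iou(y_true, y_pred, num_classes=3)
print(ious, mean_iou(ious), precision(y_true, y_pred, cls=1))
```

A class that is present in the ground truth but never predicted has TP = 0 and therefore an IoU of 0, which is one way a 0.00% score can arise.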
][
  #set text(size: 0.8em)
  #table(
    columns: (auto, auto, auto),
    inset: 8pt,
    align: (center, left, right),
    stroke: 0.4pt,
    [*Label*], [*Class Name*], [*IoU (%)*],
    [0], [Track], [60.12],
    [1], [Track Surface], [74.53],
    [2], [Ditch], [74.21],
    [3], [Masts], [82.48],
    [4], [Cable], [73.62],
    [5], [Tunnel], [83.03],
    [6], [Ground], [89.68],
    [7], [Fence], [79.81],
    [8], [Mountain], [91.93],
    [9], [Train], [95.22],
    [10], [Human], [61.86],
    [11], [Box (foreign object)], [0.00],
  )
]

#slide[
  #text(weight: "bold")[Visualization]
  - The model tends to label about half of the points as "Mountain", regardless of the scene
  - Even where no mountain is present, it still predicts trains or other objects as "Mountain"
][
  #image("../assets/pred.png", width: 90%)
  #image("../assets/truth.png", width: 90%)
]

#slide()[
  #text(weight: "bold")[Why the Model Performs Poorly on Boxes]

  Consider the cross-entropy loss:
  $ ell(x, y) = 1 / N sum_(n=1)^N - w_(y_n) log frac(exp(x_(n, y_n)), sum_(c=0)^(C-1) exp(x_(n, c))) $
  where $x_(n, c)$ is the logit of point $n$ for class $c$, $w_(y_n)$ is the weight for class $y_n$, and $N$ is the number of points in the batch.

  In this pilot we *left the weight of every class at 1*, which is unsuitable for such imbalanced data. Rare classes such as "Box" and "Human" should receive larger weights.
]

#slide()[
  #text(weight: "bold")[Why the Model Performs Poorly on Boxes]
  - In our dataset, *"Ground"* and *"Mountain"* are the majority classes, with *29.76%* and *20.82%* of points respectively, together *50.58%* of the dataset.
  - The "Box" class is extremely rare: only 0.001% of points are labeled "Box".
  - A model that predicts only these two classes can still be right on up to about *50%* of points, so its loss stays low even if it never predicts "Box".
]

#slide(align: auto)[
  #text(weight: "bold")[Why the Model Performs Poorly on Boxes]
  #v(5em)
  The model is *biased towards the majority classes*, leading to poor performance on the minority class (foreign objects).
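
The class-weighting idea from the loss discussion can be sketched as follows. This is a minimal illustration assuming inverse-frequency weights, which is one common choice; the slides do not fix a weighting scheme. The helper names are hypothetical, and the loss uses the same 1/N average as the cross-entropy formula in this deck.

```python
import math


def inverse_frequency_weights(counts):
    """Assumed scheme: w_c = total / (C * count_c), so rarer classes weigh more."""
    total = sum(counts)
    num_classes = len(counts)
    return [total / (num_classes * n) for n in counts]


def weighted_cross_entropy(logits, labels, weights):
    """Mean over points of -w[y] * log(softmax(x)[y])."""
    total = 0.0
    for x, y in zip(logits, labels):
        m = max(x)  # subtract the max logit for numerical stability
        log_sum = m + math.log(sum(math.exp(v - m) for v in x))
        total += -weights[y] * (x[y] - log_sum)
    return total / len(labels)


# Toy check: two points, two classes, uniform logits (not the railway data).
logits = [[0.0, 0.0], [0.0, 0.0]]
labels = [0, 1]
uniform = weighted_cross_entropy(logits, labels, [1.0, 1.0])

# Point counts for "Ground" and "Box" taken from the class distribution table.
weights = inverse_frequency_weights([73_861_934, 3_080])
weighted = weighted_cross_entropy(logits, labels, weights)
print(uniform, weights, weighted)
```

With all weights equal to 1, a misclassified "Box" point contributes as little as any other point; with inverse-frequency weights, the same mistake is scaled up by the ratio of the class counts.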
]

#slide[
  #text(weight: "bold")[Future Work]
  - Add weights in loss functions to *address class imbalance*
  - Explore *data augmentation* techniques to increase the representation of the "Box" class
  - Consider *ensemble methods* or multi-task learning to improve detection performance
]

= The end

#text(size: 0.6em)[
  Work by Hanwen Yu, supervised by Dr. Siyue Yu
]