Dustella 2025-04-21 08:47:04 +08:00
parent 03f3694c8b
commit 911df69b2a
Signed by: Dustella
GPG Key ID: 35AA0AA3DC402D5C
8 changed files with 758 additions and 0 deletions


@@ -0,0 +1,120 @@
# Progress Report: Railway Foreign-Object Intrusion Detection Based on LiDAR and 3D Semantic Segmentation
## 1. Introduction
Railway safety is a top priority in transportation: any accident can cause severe casualties and large economic losses. Foreign-object intrusion is one of the major threats to railway safety; falling rocks, pedestrians, vehicles, or animals entering the track can trigger derailments, collisions, and other serious accidents [1]. As railway networks grow more complex and train speeds keep rising, real-time, accurate intrusion detection along the line is becoming increasingly important [3].
Light Detection and Ranging (LiDAR) is an active remote-sensing technology that acquires precise 3D spatial information about the surroundings by emitting laser pulses and measuring their reflected returns. Compared with conventional vision-based detection, LiDAR remains reliable in darkness, bad weather, and other poor lighting conditions [3]. 3D semantic segmentation can then classify each point of the LiDAR point cloud, separating the different objects in the scene (tracks, vegetation, buildings, foreign objects, and so on) [6]. Combining LiDAR with 3D semantic segmentation therefore promises effective detection of railway foreign-object intrusion.
PointNet and RandLA-Net are two deep-learning models that have attracted much attention in 3D point-cloud processing in recent years. PointNet, a pioneering architecture, operates directly on unordered point clouds and learns point features through multilayer perceptrons (MLPs) [4]. RandLA-Net is a more efficient, faster network designed for semantic segmentation of large-scale point clouds: it speeds up processing through random sampling and preserves key details with a local feature aggregation module [8]. Given the potential of both models for 3D data, surveying their application to railway foreign-object intrusion detection has clear practical value. This report surveys and analyzes the state of research, at home and abroad, on railway foreign-object intrusion detection with LiDAR and 3D semantic segmentation, as a reference for related projects.
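To make the PointNet idea concrete, the following sketch (NumPy only; the layer sizes and names are illustrative, not the published architecture) applies the same MLP to every point and max-pools over points, which makes the global feature invariant to point ordering:

```python
import numpy as np

rng = np.random.default_rng(0)

def shared_mlp(points, weights, biases):
    # The same weights are applied to every point independently.
    h = points
    for W, b in zip(weights, biases):
        h = np.maximum(h @ W + b, 0.0)  # Linear + ReLU
    return h

def pointnet_global_feature(points, weights, biases):
    # Max-pooling over points is a symmetric function, so the result
    # does not depend on the order of the input points.
    return shared_mlp(points, weights, biases).max(axis=0)

# Toy setup: 128 points in R^3, two layers 3 -> 16 -> 32.
pts = rng.normal(size=(128, 3))
Ws = [rng.normal(size=(3, 16)), rng.normal(size=(16, 32))]
bs = [np.zeros(16), np.zeros(32)]

feat = pointnet_global_feature(pts, Ws, bs)
feat_shuffled = pointnet_global_feature(rng.permutation(pts, axis=0), Ws, bs)
```

Permuting the input points leaves the pooled feature unchanged, which is the property that lets PointNet consume unordered point clouds directly.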
## 2. Research Overview at Home and Abroad
Internationally, academia has actively explored LiDAR-based 3D semantic segmentation in railway scenes. Some work develops new deep-learning methods specifically for real-world LiDAR railway point clouds [6]. These methods typically rely on spatially local point-cloud transformations for convolutional learning, improving robustness to varying point densities while preserving important metric information [6]. 3D data, especially LiDAR point clouds, is gradually replacing images as the dominant data form for environment perception in autonomous driving, urban mapping, and other applications [6]. To advance the field, researchers have also released challenging, diverse, high-resolution annotated railway-scene datasets for benchmarking point-cloud segmentation models [10]. These efforts show that international research is actively pushing advanced 3D perception into railway safety.
In China, research on railway foreign-object intrusion detection with LiDAR and deep learning is also gaining attention. Several institutions and companies have proposed solutions tailored to railway scenes. For example, Rail-PillarNet builds on the PointPillars architecture and adds attention mechanisms to improve the detection of small, distant foreign objects in railway scenes [3]. Domestic work likewise recognizes the limitations of 2D images for long-range and small-target detection, and emphasizes LiDAR's advantages in all-weather operation and precise range measurement [3]. To improve performance and generalization, researchers have also begun to explore transfer learning, e.g. pre-training on large datasets such as KITTI and then fine-tuning on the railway-specific OSDaR23 dataset [3]. These studies show notable domestic progress in applying LiDAR and deep learning to railway safety problems.
## 3. Applications and Performance of PointNet and RandLA-Net
As an early deep-learning model that successfully processed raw point clouds, PointNet has received some attention in railway foreign-object intrusion detection. Several studies use PointNet as a baseline against which newly proposed architectures are evaluated [11], reflecting its reference value as a foundational model in the field. Although the surveyed material contains no dedicated study of PointNet modifications tailored to railway scenes, its use in related tasks such as rail-track recognition [4] indicates its potential for railway scene understanding. By operating directly on point clouds, PointNet avoids the information loss that converting 3D data to other representations (such as voxels or multi-view images) can cause, which matters for preserving the precision of the raw data.
RandLA-Net, thanks to its efficiency and ability to handle large-scale point clouds, shows great potential for railway foreign-object intrusion detection. Some studies apply RandLA-Net to semantic segmentation of railway scenes as a first step toward identifying foreign objects, treating them as an anomalous or separate class [13]. Although dedicated comparisons of RandLA-Net against other models on railway intrusion detection are still lacking, its strong results on other large-scale point-cloud segmentation tasks [8] suggest promising prospects in railway scenes. Its design lets it quickly process the massive point clouds produced by LiDAR scanning, which is essential for railway safety systems that must respond in real time. Some work also explores improvements and integrations of RandLA-Net, e.g. combining it with NDT registration for better point-cloud alignment, which is valuable for precision in railway applications [13].
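RandLA-Net's two key ideas (cheap random downsampling instead of farthest-point sampling, and aggregating features over each point's nearest neighbours) can be caricatured in a few lines of NumPy. This is a toy sketch with illustrative names, not the real network:

```python
import numpy as np

rng = np.random.default_rng(1)

def random_sample(points, features, ratio):
    # Random sampling costs O(1) per point, unlike farthest-point sampling.
    n_keep = int(len(points) * ratio)
    idx = rng.choice(len(points), size=n_keep, replace=False)
    return points[idx], features[idx]

def local_aggregate(points, features, k=8):
    # Pool features over each point's k nearest neighbours; this stands
    # in for RandLA-Net's local feature aggregation module.
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    knn = np.argsort(d, axis=1)[:, :k]  # self is included (distance 0)
    return features[knn].max(axis=1)

pts = rng.normal(size=(1000, 3))
feats = rng.normal(size=(1000, 16))
pts_ds, feats_ds = random_sample(pts, feats, ratio=0.25)
agg = local_aggregate(pts_ds, feats_ds)
```

The real network interleaves several such sampling and aggregation stages with learned weights; the sketch only shows why the per-stage cost stays low on large clouds.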
Overall, PointNet, as foundational work in point-cloud processing, provides an important reference for later research, while RandLA-Net's efficiency on large-scale data makes it a highly promising model for railway foreign-object intrusion detection. Future work will likely focus on further optimizing and adapting both models to the characteristics of railway scenes, for higher detection accuracy and faster processing.
## 4. Railway Intrusion Detection Datasets
High-quality annotated datasets are essential for advancing LiDAR-based railway foreign-object intrusion detection. Several railway-related point-cloud datasets have been released at home and abroad.
- **Rail3D**: a multi-context point-cloud dataset with railway data from Hungary, France, and Belgium [14]. It covers nearly 5.8 km of railway with about 288 million points, annotated into 9 key classes: ground, vegetation, rail, poles, wires, signals, fences, installations, and buildings. Rail3D's diversity helps in developing more robust and general models.
- **Catenary Arch Dataset**: covers 800 m of track near Delft, the Netherlands, capturing 15 catenary arches in detail [10]. The data were acquired with a terrestrial laser scanner and provide 3D coordinates without color information. Each arch contains between 1.6 and 11 million points, each hand-labeled into one of 14 classes. Although the dataset focuses on catenary-arch segmentation, its high resolution and detailed labels remain valuable for fine-grained segmentation of railway infrastructure.
- **OSDaR23**: the Open Sensor Data for Rail 2023 dataset, recorded in Hamburg, Germany [11]. It was collected with a rail vehicle equipped with LiDAR, high- and low-resolution RGB cameras, infrared cameras, and radar. The LiDAR data come from a Velodyne HDL-64E sensor at 10 Hz. The dataset annotates 20 object classes, including trains, tracks, catenary poles, signs, vegetation, buildings, pedestrians, and vehicles, and may contain foreign objects.
- **Rail-DB**: a dataset for rail detection in video frames, with 7,432 image/annotation pairs [17]. The images span different lighting conditions, road structures, and viewpoints; rails are annotated as polylines and the images are grouped into nine scenes. It mainly serves evaluation of 2D rail-detection algorithms.
- **RailSem19**: a semantic scene-understanding dataset from a train driver's perspective, with camera images of 8,500 scenes and pixel-level labels for 22 classes [16]. It contains no LiDAR data, but its high-quality semantic labels can support transfer learning or serve as complementary data.
- **WHU-Railway3D**: a railway point-cloud dataset released by Wuhan University in 2023 [15]. It covers 30 km with 4 billion points annotated into 11 classes; its main strength is covering three distinct environments: urban, rural, and plateau.
- **Synthetic datasets**: given the shortage of publicly available annotated real-world data, some researchers create synthetic railway point clouds for their studies [20]. This approach allows flexible modeling of the shape and position of target landmarks such as traffic lights and rails.
These datasets differ in sensor type, environmental conditions, and annotation. Rail3D, the Catenary Arch Dataset, OSDaR23, and WHU-Railway3D contain LiDAR data, while Rail-DB and RailSem19 mainly provide images. OSDaR23 and WHU-Railway3D annotate a broader range of object classes, possibly including foreign objects, whereas Rail3D and the Catenary Arch Dataset focus on infrastructure. Synthetic datasets can generate data for specific tasks but may suffer a domain gap relative to real data.
Choosing a suitable dataset is crucial for training and evaluating PointNet- and RandLA-Net-based detection models. Researchers must weigh task-specific needs: whether point-cloud data is required, whether the annotated classes include foreign objects, and how broadly environmental conditions are covered. In some cases, combining several datasets or applying data augmentation may be necessary.
## 5. Existing Detection Systems and Prototypes
Several research institutions and companies have already built LiDAR-based railway foreign-object intrusion detection systems or prototypes.
- **Molinari Sensors** offers LiDAR-based track safety systems, including platform fall detection and intrusion detection [21]. The systems use 3D digital LiDAR, building on experience from installing 2D LiDAR at more than 400 railway platforms and tunnels in North America. They aim to reduce false alarms and automatically alert the train safety control system, drivers, and control centers when a person or dangerous object enters the track area.
- **Piper Networks**' StandClear™ track intrusion detection system combines short-range time-of-flight (ToF) cameras with long-range TrackSight™ LiDAR [22]. It detects dangerous behavior at platform edges, objects or people on the track, and intrusions inside tunnels. The LiDAR ranges up to 100 m, distinguishes animate from inanimate objects, and works even in low light. The system can issue several kinds of alerts to train operators and complies with strict railway safety standards.
- **L.B. Foster** supplied LiDAR obstacle-detection systems for 80 level crossings to the UK's Network Rail [23]. The systems detect objects as small as 115 mm and integrate red-light violation monitoring, license-plate recognition, video analytics, and data logging. Their installation has prevented many potential collisions.
- In China, research institutions have developed systems that combine LiDAR with conventional cameras to detect foreign objects on railway tracks [24]. Such systems aim to improve detection accuracy and localize the intruding object, with significant social and economic value for railway safety.
These systems and prototypes typically use LiDAR as the primary perception sensor, sometimes combined with other sensors such as cameras or radar to improve reliability and coverage. Their goal is real-time detection of, and alerting on, foreign objects along the line, safeguarding train operation. In practice, however, they still face challenges such as robustness in complex environments, the impact of severe weather, and further reducing false-alarm rates.
## 6. Performance Comparison of 3D Semantic Segmentation Models
Different 3D semantic segmentation models vary in detection accuracy, speed, and robustness in railway scenes. Although comprehensive head-to-head comparisons of PointNet and RandLA-Net on railway foreign-object intrusion detection are still lacking, evaluations of other models on railway-related tasks offer useful guidance for model selection.
For example, one study evaluated three different neural networks, PointNet++, PointCNN, and DGCNN, on point-cloud semantic segmentation of the Amersfoort railway-station dataset [25]. All networks achieved over 90% overall accuracy but differed in mean class accuracy and mean intersection over union (IoU), showing that different models recognize different objects with varying skill in complex indoor station scenes.
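The IoU and mIoU figures reported in such comparisons can be reproduced from a confusion matrix; a small self-contained sketch (the helper name is ours, not from any of the cited papers):

```python
import numpy as np

def per_class_iou(y_true, y_pred, num_classes):
    # IoU_c = TP_c / (TP_c + FP_c + FN_c), derived from a confusion matrix.
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(cm, (y_true, y_pred), 1)
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp  # predicted as c, but the truth differs
    fn = cm.sum(axis=1) - tp  # truth is c, but predicted differently
    denom = tp + fp + fn
    return np.where(denom > 0, tp / np.maximum(denom, 1), 0.0)

y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
iou = per_class_iou(y_true, y_pred, 3)  # [1/3, 2/3, 1/2]
miou = iou.mean()                       # 0.5
```

Because mIoU averages over classes regardless of their frequency, it punishes a model that ignores rare classes far more than overall accuracy does, which is why it is the preferred metric for imbalanced railway scenes.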
Another study compared models such as ConvPoint and ShellNet on semantic segmentation of railway LiDAR point clouds [6]. The proposed new method outperformed ConvPoint and ShellNet on large and vertical objects (buildings, concrete structures, catenary poles, and environmental objects), but performed slightly worse on non-vertical objects such as ground, track, and vegetation. The study also evaluated processing speed and robustness to changes in point-cloud distribution.
In addition, the Rail-PillarNet work shows that introducing parallel attention pillar encoding (PAPE) and an improved backbone on top of PointPillars significantly raises the average precision of railway foreign-object detection [3], demonstrating that targeted improvements to existing models can boost performance on specific railway tasks.
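Pillar-based encoders such as PointPillars rasterize the cloud into vertical columns on the ground plane and summarize each column. The real encoders learn per-pillar features; the toy sketch below (illustrative only) keeps just the maximum height per pillar to show the rasterization step:

```python
import numpy as np

def pillarize(points, cell=0.5):
    # Assign each point to a 2D ground-plane cell ("pillar") and keep the
    # max height per pillar; a minimal stand-in for pillar-style encoding.
    ij = np.floor(points[:, :2] / cell).astype(np.int64)
    ij -= ij.min(axis=0)             # shift indices so they start at 0
    h, w = ij.max(axis=0) + 1
    grid = np.full((h, w), -np.inf)  # empty pillars stay at -inf
    for (i, j), z in zip(ij, points[:, 2]):
        grid[i, j] = max(grid[i, j], z)
    return grid

pts = np.array([[0.1, 0.1, 1.0],   # pillar (0, 0)
                [0.2, 0.3, 2.0],   # pillar (0, 0), higher point
                [0.9, 0.1, 0.5]])  # pillar (1, 0)
bev = pillarize(pts)               # 2 x 1 bird's-eye-view grid
```

Once the cloud is reduced to a 2D grid like this, fast 2D convolutions can replace expensive 3D operations, which is where the efficiency gain of pillar methods comes from.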
Overall, choosing a 3D semantic segmentation model requires weighing task requirements, data characteristics, and the model's accuracy, speed, and robustness. For railway foreign-object intrusion detection, a model must not only identify foreign objects accurately but also run fast enough for real-time use and remain stable across complex environments.
## 7. Strategies for Real-Time Performance and Complex Environments
To improve real-time performance in railway foreign-object intrusion detection, the literature mentions several key strategies. A common approach is to shrink the input: techniques such as voxel downsampling and pyramid partitioning reduce the number of points while retaining key information, speeding up downstream processing [4]. Choosing an efficient network architecture also matters: RandLA-Net itself was designed for fast processing of large-scale point clouds [8], and optimizing the convolutions of existing networks (such as U-Net) with depthwise-separable convolutions and atrous spatial pyramid structures can also raise detection speed [28]. Pillar-based encodings, as used in PointPillars and Rail-PillarNet, improve computational efficiency by projecting the 3D point cloud onto a 2D plane for processing [11].
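Voxel downsampling, the first input-reduction technique above, can be sketched in NumPy as follows (a minimal centroid-per-voxel version; the function name and voxel size are illustrative):

```python
import numpy as np

def voxel_downsample(points, voxel=0.5):
    # Replace all points falling in the same voxel by their centroid,
    # cutting the cloud size before it reaches the segmentation network.
    keys = np.floor(points / voxel).astype(np.int64)
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    n = inverse.max() + 1
    sums = np.zeros((n, points.shape[1]))
    np.add.at(sums, inverse, points)  # sum the points in each voxel
    counts = np.bincount(inverse, minlength=n).astype(float)
    return sums / counts[:, None]

# The first two points share a voxel and collapse to their centroid;
# the third point survives alone.
pts = np.array([[0.1, 0.1, 0.1], [0.3, 0.3, 0.3], [2.0, 2.0, 2.0]])
ds = voxel_downsample(pts, voxel=0.5)
```

On LiDAR-scale clouds, a suitable voxel size can cut the point count by an order of magnitude while preserving scene geometry at the resolution the detector actually needs.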
Handling complex environments (rain, snow, changing illumination) is the other major challenge in railway intrusion detection. Research shows that multi-modal sensor fusion is an effective strategy: combining LiDAR with other sensors such as cameras or radar provides richer environmental information and improves robustness in bad weather [24]. For example, the DHT-CL strategy fuses camera and LiDAR data during training to improve point-cloud semantic segmentation in adverse weather, while relying on LiDAR alone at inference [29]. Neighborhood-attention mechanisms can selectively retain or filter information based on the similarity of nearby points across modalities, making the model more adaptable to weather changes [29]. Filtering noise and outliers from LiDAR point clouds, especially under severe weather, is another important robustness step [5]. Data augmentation that simulates varied environmental conditions during training helps the model maintain good performance across scenes, and domain adaptation can transfer knowledge learned from other environments or sensors to a specific railway deployment, improving generalization.
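The outlier-filtering step mentioned above is often realized as statistical outlier removal: a point is dropped when its mean distance to its nearest neighbours is far above the cloud-wide average. A minimal NumPy sketch (thresholds are illustrative, not tuned for any particular sensor or weather condition):

```python
import numpy as np

def remove_outliers(points, k=8, std_ratio=2.0):
    # Statistical outlier removal: discard points whose mean distance to
    # their k nearest neighbours is far above the cloud-wide average.
    # Rain and snow returns typically show up as such isolated points.
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    knn_d = np.sort(d, axis=1)[:, 1:k + 1].mean(axis=1)  # skip self (dist 0)
    keep = knn_d < knn_d.mean() + std_ratio * knn_d.std()
    return points[keep]

rng = np.random.default_rng(3)
cloud = rng.normal(scale=0.1, size=(50, 3))       # dense cluster
noisy = np.vstack([cloud, [[50.0, 50.0, 50.0]]])  # one isolated "snowflake"
clean = remove_outliers(noisy)
```

The pairwise-distance matrix makes this O(n²); production filters use a k-d tree for the neighbour search, but the acceptance rule is the same.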
## 8. Progress, Challenges, and Future Directions
Significant progress has been made at home and abroad in railway foreign-object intrusion detection based on LiDAR and 3D semantic segmentation. Researchers have developed dedicated deep-learning architectures and improved existing models to suit railway scenes. A growing number of annotated datasets supports model training and evaluation, and the emergence of commercial and prototype systems demonstrates the technology's feasibility in real railway safety applications.
Nevertheless, many challenges remain. Achieving both high accuracy and real-time performance on resource-constrained onboard systems is still difficult. Ensuring model robustness across complex and severe weather conditions needs further study. High-quality, large-scale annotations for specific kinds of intruding objects remain relatively scarce, and the data often suffer from class imbalance. Moreover, deploying these AI-based systems reliably in safety-critical railway operations requires deeper reliability and safety validation.
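The class-imbalance challenge above is commonly addressed with class-weighted loss functions. The sketch below (NumPy; the class frequencies and helper name are made up for illustration) shows cross-entropy with inverse-frequency weights, which amplifies the contribution of rare classes such as foreign objects:

```python
import numpy as np

def weighted_cross_entropy(logits, labels, class_weights):
    # Cross-entropy with per-class weights: up-weighting rare classes
    # keeps them from being drowned out by dominant background points.
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    w = class_weights[labels]
    return -(w * log_p[np.arange(len(labels)), labels]).sum() / w.sum()

# Inverse-frequency weights for a toy 3-class problem where class 2 is rare.
freq = np.array([0.7, 0.299, 0.001])
class_weights = 1.0 / freq

logits = np.zeros((4, 3))  # uninformative predictions
labels = np.array([0, 1, 2, 2])
loss = weighted_cross_entropy(logits, labels, class_weights)
```

With uniform logits the loss equals log(3) whatever the weights; the weights only start to matter once the model's errors are distributed unevenly across classes, which is exactly the imbalanced-detection case.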
Future research directions may include: developing more efficient, lightweight 3D semantic segmentation models to meet real-time onboard processing needs; exploring more advanced sensor-fusion techniques and finer-grained handling of adverse weather; building larger, more diverse annotated datasets, supplemented by synthetic data generation and improved annotation tools; studying active-learning strategies to improve model performance with limited labels [16]; integrating temporal information from consecutive LiDAR scans to improve detection accuracy and robustness for moving objects; and exploring explainable AI (XAI) techniques to understand model decisions, which is essential for safety-critical applications.
## 9. Practical Importance and Research Value
Accidents and incidents caused by railway foreign-object intrusion impose serious safety risks and economic losses on society. Statistics show that track intrusion is one of the leading causes of railway casualties, exceeding even train-vehicle collisions [2]. For example, in 2018 a train in the United States collided with a garbage truck, causing casualties [1]; the same year in Italy, a landslide derailed a train, killing and injuring several people [1]. These cases underline the need for effective detection systems.
Effective intrusion detection can markedly reduce accident risk, casualties, and property damage, improving overall railway safety. It can also minimize service interruptions and delays, raise operational efficiency, and cut losses for operators and passengers. Early detection of track wear and structural defects can additionally extend track life and enable preventive maintenance [24].
In summary, railway foreign-object intrusion detection with LiDAR and 3D semantic segmentation has important practical significance and research value. As technical challenges are overcome, future research promises smarter, more reliable solutions for railway safety.
## 10. Conclusion
This report has surveyed and analyzed research progress, at home and abroad, on railway foreign-object intrusion detection based on LiDAR and 3D semantic segmentation. The field has advanced markedly: a number of deep-learning models designed for railway scenes have emerged, and annotated datasets have accumulated. PointNet and RandLA-Net, as representative 3D semantic segmentation models, show their respective strengths and potential in railway scenes. However, achieving high accuracy, real-time performance, and robustness in complex environments remains challenging. Future research will continue toward more efficient and reliable solutions and, through technical innovation and data accumulation, should contribute substantially to railway safety.
## References
1. FAST DETECTION STUDY OF FOREIGN OBJECT INTRUSION ON RAILWAY TRACK - Semantic Scholar, [Link](https://pdfs.semanticscholar.org/5525/d73efdf2c277631625ccaa4acd9a72c75c6a.pdf)
2. www.utrgv.edu, [Link](https://www.utrgv.edu/railwaysafety/_files/documents/reports/track-intrusion-detection-and-track-integrity-evaluation_093024.pdf)
3. CMC | Free Full-Text | Rail-PillarNet: A 3D Detection Network for ..., [Link](https://www.techscience.com/cmc/v80n3/57896/html)
4. Real-time Rail Recognition Based on 3D Point Clouds - arXiv, [Link](https://arxiv.org/pdf/2201.02726)
5. CMC | Free Full-Text | Advancing Railway Infrastructure Monitoring: A Case Study on Railway Pole Detection - Tech Science Press, [Link](https://www.techscience.com/cmc/v83n2/60516/html)
6. isprs-annals.copernicus.org, [Link](https://isprs-annals.copernicus.org/articles/V-2-2022/135/2022/isprs-annals-V-2-2022-135-2022.pdf)
7. Few-Shot Segmentation of 3D Point Clouds Under Real-World Distributional Shifts in Railroad Infrastructure - PMC, [Link](https://pmc.ncbi.nlm.nih.gov/articles/PMC11860123/)
8. randlanet - MathWorks, [Link](https://www.mathworks.com/help/lidar/ref/randlanet.html)
9. Point cloud classification using RandLA-Net | ArcGIS API for Python - Esri Developer, [Link](https://developers.arcgis.com/python/latest/guide/point-cloud-classification-using-randlanet/)
10. Semantic Segmentation of Terrestrial Laser Scans of Railway Catenary Arches: A Use Case Perspective - MDPI, [Link](https://www.mdpi.com/1424-8220/23/1/222)
11. Rail-PillarNet: A 3D Detection Network for Railway Foreign Object Based on LiDAR, [Link](https://www.sciopen.com/article/10.32604/cmc.2024.054525)
12. CMC | Rail-PillarNet: A 3D Detection Network for Railway Foreign Object Based on LiDAR, [Link](https://www.techscience.com/cmc/v80n3/57896)
13. An Improved RandLa-Net Algorithm Incorporated with NDT for Automatic Classification and Extraction of Raw Point Cloud Data - MDPI, [Link](https://www.mdpi.com/2079-9292/11/17/2795)
14. akharroubi/Rail3D: Rail3D: Multi-Context Point Cloud ... - GitHub, [Link](https://github.com/akharroubi/Rail3D)
15. Multi-Context Point Cloud Dataset and Machine Learning for Railway Semantic Segmentation - MDPI, [Link](https://www.mdpi.com/2412-3811/9/4/71)
16. [PDF] FRSign: A Large-Scale Traffic Light Dataset for Autonomous Trains | Semantic Scholar, [Link](https://www.semanticscholar.org/paper/FRSign%3A-A-Large-Scale-Traffic-Light-Dataset-for-Harb-R'eb'ena/4eb25625ac45244342fd39ef34a98ff0370e54d3)
17. Sampson-Lee/Rail-Detection: [ACM MM 2022] Official Rail-DB and Rail-Net - GitHub, [Link](https://github.com/Sampson-Lee/Rail-Detection)
18. Finding and Managing Anomalies: Case Study on RailSem19 Dataset - LatticeFlow AI, [Link](https://latticeflow.ai/news/finding-anomalies-in-railsem19)
19. Railway LiDAR semantic segmentation based on intelligent semi-automated data annotation - arXiv, [Link](https://arxiv.org/html/2410.13383v1)
20. 3D Object Detection on Synthetic Point Clouds for Railway Applications - ResearchGate, [Link](https://www.researchgate.net/publication/364601408_3D_Object_Detection_on_Synthetic_Point_Clouds_for_Railway_Applications)
21. Molinari Sensors: Railway track security systems, rail infrastructure ..., [Link](https://www.molinarisensors.com/)
22. Piper's StandClear™ Track Intrusion Detection System - Piper, [Link](https://www.pipernetworks.com/pipers-standclear-track-intrusion-detection-system/)
23. Network Rail - Level crossing obstacle detection systems | L.B. Foster, [Link](https://lbfoster.com/case-studies/network-rail-level-crossing-obstacle-detection-systems)
24. Build a Foreign Object Detection System for Railway Tracks - Livox, [Link](https://www.livoxtech.com/application/foreign_object_detection)
25. A COMPARATIVE STUDY OF POINT CLOUDS SEMANTIC SEGMENTATION USING THREE DIFFERENT NEURAL NETWORKS ON THE RAILWAY STATION DATASET - gdmc.nl, [Link](https://www.gdmc.nl/publications/2021/isprs-archives-XLIII-B3-2021-223-2021.pdf)
26. (PDF) A COMPARATIVE STUDY OF POINT CLOUDS SEMANTIC SEGMENTATION USING THREE DIFFERENT NEURAL NETWORKS ON THE RAILWAY STATION DATASET - ResearchGate, [Link](https://www.researchgate.net/publication/352821036_A_COMPARATIVE_STUDY_OF_POINT_CLOUDS_SEMANTIC_SEGMENTATION_USING_THREE_DIFFERENT_NEURAL_NETWORKS_ON_THE_RAILWAY_STATION_DATASET)
27. PointNet based Object Detection and Classification in Autonomous Driving - YouTube, [Link](https://www.youtube.com/watch?v=vq0LWzTb-Vo)
28. A new railway foreign object intrusion detection method - SPIE Digital Library, [Link](https://www.spiedigitallibrary.org/conference-proceedings-of-spie/13184/1318476/A-new-railway-foreign-object-intrusion-detection-method/10.1117/12.3033131.short)
29. Multi-Modal Contrastive Learning for LiDAR Point Cloud Rail-Obstacle Detection in Complex Weather - MDPI, [Link](https://www.mdpi.com/2079-9292/13/1/220)


@@ -0,0 +1,256 @@
---
sidebar_position: 2
---
# Getting Started
Before you begin, make sure you have the Typst environment installed. If not, you can use the [Web App](https://typst.app/) or install the [Tinymist LSP](https://marketplace.visualstudio.com/items?itemName=myriad-dreamin.tinymist) plugin for VS Code.
To use Touying, you just need to include the following in your document:
```typst
#import "@preview/touying:0.6.1": *
#import themes.simple: *
#show: simple-theme.with(aspect-ratio: "16-9")
= Title
== First Slide
Hello, Touying!
#pause
Hello, Typst!
```
![image](https://github.com/touying-typ/touying/assets/34951714/f5bdbf8f-7bf9-45fd-9923-0fa5d66450b2)
It's that simple! You've created your first Touying slides. Congratulations! 🎉
**Tip:** You can use Typst syntax like `#import "config.typ": *` or `#include "content.typ"` to implement Touying's multi-file architecture.
## More Complex Examples
In fact, Touying provides various styles for slide writing. You can also use the `#slide[..]` syntax to access more powerful features provided by Touying.
Touying offers many built-in themes to easily create beautiful slides. For example, in this case:
```typst
#import "@preview/touying:0.6.1": *
#import themes.university: *
#import "@preview/cetz:0.3.2"
#import "@preview/fletcher:0.5.5" as fletcher: node, edge
#import "@preview/numbly:0.1.0": numbly
#import "@preview/theorion:0.3.2": *
#import cosmos.clouds: *
#show: show-theorion
// cetz and fletcher bindings for touying
#let cetz-canvas = touying-reducer.with(reduce: cetz.canvas, cover: cetz.draw.hide.with(bounds: true))
#let fletcher-diagram = touying-reducer.with(reduce: fletcher.diagram, cover: fletcher.hide)
#show: university-theme.with(
aspect-ratio: "16-9",
// align: horizon,
// config-common(handout: true),
config-common(frozen-counters: (theorem-counter,)), // freeze theorem counter for animation
config-info(
title: [Title],
subtitle: [Subtitle],
author: [Authors],
date: datetime.today(),
institution: [Institution],
logo: emoji.school,
),
)
#set heading(numbering: numbly("{1}.", default: "1.1"))
#title-slide()
== Outline <touying:hidden>
#components.adaptive-columns(outline(title: none, indent: 1em))
= Animation
== Simple Animation
We can use `#pause` to #pause display something later.
#pause
Just like this.
#meanwhile
Meanwhile, #pause we can also use `#meanwhile` to #pause display other content synchronously.
#speaker-note[
+ This is a speaker note.
+ You won't see it unless you use `config-common(show-notes-on-second-screen: right)`
]
== Complex Animation
At subslide #touying-fn-wrapper((self: none) => str(self.subslide)), we can
use #uncover("2-")[`#uncover` function] for reserving space,
use #only("2-")[`#only` function] for not reserving space,
#alternatives[call `#only` multiple times \u{2717}][use `#alternatives` function #sym.checkmark] for choosing one of the alternatives.
== Callback Style Animation
#slide(
repeat: 3,
self => [
#let (uncover, only, alternatives) = utils.methods(self)
At subslide #self.subslide, we can
use #uncover("2-")[`#uncover` function] for reserving space,
use #only("2-")[`#only` function] for not reserving space,
#alternatives[call `#only` multiple times \u{2717}][use `#alternatives` function #sym.checkmark] for choosing one of the alternatives.
],
)
== Math Equation Animation
Equation with `pause`:
$
f(x) &= pause x^2 + 2x + 1 \
&= pause (x + 1)^2 \
$
#meanwhile
Here, #pause we have the expression of $f(x)$.
#pause
By factorizing, we can obtain this result.
== CeTZ Animation
CeTZ Animation in Touying:
#cetz-canvas({
import cetz.draw: *
rect((0, 0), (5, 5))
(pause,)
rect((0, 0), (1, 1))
rect((1, 1), (2, 2))
rect((2, 2), (3, 3))
(pause,)
line((0, 0), (2.5, 2.5), name: "line")
})
== Fletcher Animation
Fletcher Animation in Touying:
#fletcher-diagram(
node-stroke: .1em,
node-fill: gradient.radial(blue.lighten(80%), blue, center: (30%, 20%), radius: 80%),
spacing: 4em,
edge((-1, 0), "r", "-|>", `open(path)`, label-pos: 0, label-side: center),
node((0, 0), `reading`, radius: 2em),
edge((0, 0), (0, 0), `read()`, "--|>", bend: 130deg),
pause,
edge(`read()`, "-|>"),
node((1, 0), `eof`, radius: 2em),
pause,
edge(`close()`, "-|>"),
node((2, 0), `closed`, radius: 2em, extrude: (-2.5, 0)),
edge((0, 0), (2, 0), `close()`, "-|>", bend: -40deg),
)
= Theorems
== Prime numbers
#definition[
A natural number is called a #highlight[_prime number_] if it is greater
than 1 and cannot be written as the product of two smaller natural numbers.
]
#example[
The numbers $2$, $3$, and $17$ are prime.
@cor_largest_prime shows that this list is not exhaustive!
]
#theorem(title: "Euclid")[
There are infinitely many primes.
]
#pagebreak(weak: true)
#proof[
Suppose to the contrary that $p_1, p_2, dots, p_n$ is a finite enumeration
of all primes. Set $P = p_1 p_2 dots p_n$. Since $P + 1$ is not in our list,
it cannot be prime. Thus, some prime factor $p_j$ divides $P + 1$. Since
$p_j$ also divides $P$, it must divide the difference $(P + 1) - P = 1$, a
contradiction.
]
#corollary[
There is no largest prime number.
] <cor_largest_prime>
#corollary[
There are infinitely many composite numbers.
]
#theorem[
There are arbitrarily long stretches of composite numbers.
]
#proof[
For any $n > 2$, consider $
n! + 2, quad n! + 3, quad ..., quad n! + n
$
]
= Others
== Side-by-side
#slide(composer: (1fr, 1fr))[
First column.
][
Second column.
]
== Multiple Pages
#lorem(200)
#show: appendix
= Appendix
== Appendix
Please pay attention to the current slide number.
```
![image](https://github.com/user-attachments/assets/b1dfc4d9-e263-46ff-8588-a0635870e370)
For more detailed tutorials on themes, you can refer to the following sections.

.github/prompts/3.slides.prompt.md (new file, 94 lines)

@@ -0,0 +1,94 @@
# Presentation Slides
## Summary
You are going to create a presentation for your pilot study. The slides should be clear and concise, with a focus on the key findings and implications of our pilot study.
Project information can be found in `.github/prompts/1.main_task.prompt.md`.
Some background research can be found in `.github/knowledge/3.background_and_recent_work.knowledge.md`
The references can be found at `assignment_3/refs.bib`
## Slides Outline
A suggested version of the slides is as follows. You can modify the order or add/remove slides as you see fit. The goal is to create a coherent and engaging presentation that effectively communicates your findings.
**Title Slide**
- Presentation title, your names, affiliation, date
- One-line tagline of the pilot study
**Motivation & Goals**
- Why this problem matters (real-world pain points)
- High-level project objective
- Specific aims of the pilot study
**Background / Related Work**
- 3-4 bullets summarizing the key Chinese-language findings (translated into English)
- State of the art & the gap you're filling
**System Overview**
- Block diagram or workflow of your approach
- Key components/modules
**Pilot Study Design**
- Participants, setting, duration
- Data collected and instruments
- Success criteria
**Implementation Details**
- Any novel algorithms or hardware setup (presented as “how we built it” slides)
- Important parameters or configurations
**Data & Metrics**
- What raw data looks like (screenshots/graphs)
- Metrics definition (accuracy, latency, user satisfaction…)
**Results: Quantitative**
- Key performance numbers (tables/charts from main.tex)
- Compare against baseline
**Results: Qualitative**
- User feedback highlights
- Observed behaviors or case studies
**Discussion**
- Interpretation of results
- Lessons learned in pilot
**Next Steps & Roadmap**
- How you'll scale beyond the pilot
- Remaining challenges & planned experiments
**Conclusion & Q&A**
- 2-3 take-home messages
- Invite questions
## Formatting
You are going to write the slides in Typst. The usage and examples can be found at `.github/knowledge/4.typst_formatting.knowledge.md`.
If some results are images or data tables, or require visualization, insert a placeholder; do not make up statistics.
You are a professional academic researcher, so your slides should be refined and well-structured. Use the following guidelines:
- Avoid long sentences and paragraphs. Use bullet points and short phrases.
- Use clear and concise language. Avoid jargon and technical terms unless necessary.
- Use visuals (images, graphs, charts) to support your points (placeholders are fine). Avoid cluttering slides with too much text.
- Use consistent formatting (fonts, colors, sizes) throughout the presentation. (Typst ensures this.)
Finally, if you need to demonstrate more information to be delivered in spoken form, use the comments in the slides. The comments will not be shown in the presentation, but they will be available for you to read while presenting.
- Use the comments to elaborate on key points, provide additional context, or explain complex concepts.

BIN assets/pred.png (new file, 58 KiB)
BIN assets/reallife-railway.png (new file, 2.7 MiB)
BIN assets/truth.png (new file, 56 KiB)
BIN assets/whurailway.png (new file, 348 KiB)

assignment_2/slide.typ (new file, 288 lines)

@@ -0,0 +1,288 @@
#import "@preview/touying:0.6.1": *
#import themes.university: *
#show: university-theme.with(
aspect-ratio: "16-9",
config-info(
title: [Detection of Foreign Objects on Railway Tracks],
subtitle: [A Pilot Study with RandLA-Net],
author: [Hanwen Yu],
date: datetime.today(),
institution: [SAT, XJTLU],
logo: emoji.train,
),
)
// Title slide
#title-slide()
// Outline slide
#slide[
= Outline
#set text(size: 1.1em)
- Background & Problem Statement
- Pilot Study Design
- Data & Metrics
- Results & Discussion
]
// Motivation & Goals
= Background and Problem Statement
#slide[
#text(size: 1.2em, weight: "bold")[Why This Matters]
- Undetected objects on railway tracks cause derailments and catastrophic accidents
- Manual inspection is time-consuming and error-prone
- Financial impact of railway accidents is significant
][
#image("../assets/reallife-railway.png", width: 100%)
#text(size: 0.8em)[Fig: Real-life railway track scene]
]
#slide[
#text(size: 1.2em, weight: "bold")[Project Objectives]
- Develop automated detection system using LiDAR and 3D point cloud segmentation
- Accurately identify foreign objects amidst complex railway geometry
- Maintain computational efficiency for practical deployment
][
#image("../assets/whurailway.png", width: 100%)
#text(size: 0.8em)[Fig: Railway track point cloud example, from the WHU-Railway3D dataset]
]
---
#text(weight: "bold")[ Problem Statement]
Given a point cloud *$P = \{p_1, p_2, dots, p_n\}$*
where each point *$p_i in RR^3$* represents a 3D coordinate in the railway environment,
our task is to assign each point a semantic label *$l_i in \{0, 1, dots, C-1\}$*
where *$C = 13$* represents our predefined classes.
The function $f: P -> L$ maps the input point cloud to a set of labels $L = \{l_1, l_2, dots, l_n\}$.
---
= Pilot Study Design
---
#slide()[
#v(1em)
#text(size: 1.2em, weight: "bold")[Pilot Study Aims]
- Establish baseline performance using RandLA-Net
- Evaluate feasibility of detecting extremely rare objects (0.001% of data)
][
#image("../assignment_3/fig/example.jpg", width: 100%)
]
---
#slide[
#text(size: 1.1em, weight: "bold")[Current Approaches]
- PointNet and PointNet++: Improved local feature extraction but computationally expensive for large point clouds
- *RandLA-Net*: Balances efficiency and accuracy through random sampling with local feature aggregation
- Attention-based methods (e.g. Point Transformer): focus on global context but may overlook local details
We choose *RandLA-Net* as our baseline.
]
= Data and Metrics
#slide[
#text(size: 1.1em, weight: "bold")[Training Setup]
- 858 training files, 172 test files
- Only 18 training files and 1 test file contain foreign objects
- 1/4 downsampling ratio
- NVIDIA RTX 4090 GPU
][
#text(size: 1.1em, weight: "bold")[Data Collection]
- 1,031 PLY files with 248M+ points
- 13 semantic classes including railway infrastructure elements
- "Box" class (label 11) represents foreign objects
- Extreme class imbalance: boxes only 0.001% of points
]
#slide(composer: (1fr, 3fr))[
#text(size: 1em, weight: "bold")[Inspection on data]
#v(1em)
#text(size: 0.8em)[Table: Distribution of semantic classes in the railway LiDAR dataset]
][
// #text(size: 1.1em, weight: "bold")[Inspection on Data]
#set text(size: 0.7em)
#table(
columns: (auto, auto, auto, auto),
inset: 8pt,
align: (center, left, right, right),
stroke: 0.4pt,
[*Label*], [*Class Name*], [*Point Count*], [*Percentage*],
[0], [Track], [16,653,029], [6.71%],
[1], [Track Surface], [39,975,480], [16.11%],
[2], [Ditch], [7,937,154], [3.20%],
[3], [Masts], [4,596,199], [1.85%],
[4], [Cable], [2,562,683], [1.03%],
[5], [Tunnel], [31,412,582], [12.66%],
[6], [Ground], [73,861,934], [29.76%],
[7], [Fence], [7,834,499], [3.16%],
[8], [Mountain], [51,685,366], [20.82%],
[9], [Train], [9,047,963], [3.65%],
[10], [Human], [275,077], [0.11%],
[11], [Box (foreign object)], [3,080], [0.001%],
[12], [Others], [2,360,810], [0.95%],
)
]
#slide[
#text(size: 1.2em, weight: "bold")[Evaluation Metrics]
For each class $c$, the IoU is calculated as:
$
"IoU"_c = frac("TP"_c, "TP"_c + "FP"_c + "FN"_c)
$
where $"TP"_c$, $"FP"_c$, and $"FN"_c$ represent true positives, false positives, and false negatives for class $c$, respectively. The mIoU is then calculated by averaging the IoU values across all classes:
$
"mIoU" = 1 / C sum_(c=1)^(C) "IoU"_c
$
]
#slide[
#text(weight: "bold")[Precision]
$
"Precision"_"box" = frac("TP"_"box", "TP"_"box" + "FP"_"box")
$
where $"TP"_"box"$ and $"FP"_"box"$ represent true positives and false positives for the "Box" class, respectively.
]
= Results and Discussion
#slide()[
#text(weight: "bold")[Results]
- The overall mean IoU across all classes was 70.29\%
- The IoU for our target class *"Box"* (foreign object) was *0.00\%* (discussed later)
- IoU for the other classes was relatively high, with *"Train"* reaching *95.22\%* and *"Ground"* *89.68\%*
][
#set text(size: 0.8em)
#table(
columns: (auto, auto, auto),
inset: 8pt,
align: (center, left, right),
stroke: 0.4pt,
[*Label*], [*Class Name*], [*IoU (\%)*],
[0], [Track], [60.12],
[1], [Track Surface], [74.53],
[2], [Ditch], [74.21],
[3], [Masts], [82.48],
[4], [Cable], [73.62],
[5], [Tunnel], [83.03],
[6], [Ground], [89.68],
[7], [Fence], [79.81],
[8], [Mountain], [91.93],
[9], [Train], [95.22],
[10], [Human], [61.86],
[11], [Box (foreign object)], [0.00],
)
]
#slide[
#text(weight: "bold")[Visualization]
- The model behaves as if roughly half of the points are always mountain
- Even in scenes with no mountain present, the model still labels the train or other objects as mountain
][
#image("../assets/pred.png", width: 90%)
#image("../assets/truth.png", width: 90%)
]
#slide()[
#text(weight: "bold")[Why the Model Performs Poorly on Boxes]
Consider the cross-entropy loss:
// \ell(x, y) = L = \{l_1,\dots,l_N\}^\top, \quad
// l_n = - w_{y_n} \log \frac{\exp(x_{n,y_n})}{\sum_{c=1}^C \exp(x_{n,c})}
// \cdot \mathbb{1}\{y_n \not= \text{ignore\_index}\}
$
ell(x, y) = 1 / N sum_(n=1)^N - w_(y_n) log frac(exp(x_(n, y_n)), sum_(c=1)^C exp(x_(n, c)))
$
where $w_(y_n)$ is the weight for class $y_n$ and $N$ is the number of points in the batch.
]
#slide()[
#text(weight: "bold")[Why the Model Performs Poorly on Boxes]
- In our dataset, *"Ground"* and *"Mountain"* are the majority classes, with *29.76% and 20.82%* of the points respectively, together *50.58%* of the dataset.
- The "Box" class is extremely rare: only 0.001% of points are labeled "Box".
- If the model blindly favors the majority classes for every point, it is still right on over half of the points, so the loss stays low.
]
#slide(align: auto)[
#v(6em)
The model is *biased towards the majority classes*, leading to poor performance on the minority class (foreign objects).
]
#slide[
#text(weight: "bold")[Future Work]
- Add weights in loss functions to address class imbalance
- Explore data augmentation techniques to increase the representation of the "Box" class
- Consider ensemble methods or multi-task learning to improve detection performance
]
= The end
#text(size: 0.6em)[
Work by Hanwen Yu,
supervised by Dr. Siyue Yu
]