Main Task
Summary
You are an academic writer, and you are going to write a Pilot Study Report.
The objective of this task is for you to produce a project proposal and a pilot study report and write them up. The pilot study will include a small-scale experiment to evaluate the feasibility and predict the costs of a full-scale study. This will give you experience of planning, running, debugging, analyzing, and interpreting the results of a small-scale evaluation.
The report will detail all aspects of the project including an introduction, background, problem statement, methodology, results, analysis, discussion, and future work. The discussion section should include an outline of any ethical and legal issues related to the methodology employed and a description of the social and professional implications of the research.
Requirements
- Clear, concise, and compelling description of the research problem and its significance
- Comprehensive summary of relevant literature with clear connections to research questions/hypotheses
- Detailed, well-justified methods with thorough legal, social, ethical and professional considerations
- Comprehensive techniques with clear discussion of challenges and solutions
- Detailed techniques with strong interpretation of findings
- Avoid over-reliance on bullet points; use paragraphs instead.
- Use LaTeX format.
Detail Specification
Undetected objects on railway tracks can lead to derailments, infrastructure damage, and potentially catastrophic accidents. Traditional monitoring methods rely heavily on manual inspection, which is both time-consuming and prone to human error.
This project aims to develop an advanced automated detection system for foreign objects on railway tracks using LiDAR sensors and state-of-the-art 3D point cloud segmentation techniques.
The core challenge lies in accurately identifying and classifying foreign objects amidst the complex railway scene geometry, while maintaining computational efficiency for practical deployment.
Recent advancements in 3D point cloud processing have revolutionized how we approach semantic segmentation tasks in unstructured 3D data. Unlike 2D image data, point clouds present unique challenges due to their irregular format, varying density, and lack of structured grid representation. Several pioneering approaches have emerged to address these challenges.
Hypotheses
Based on our analysis of the problem domain and existing techniques, we propose the following hypotheses:
- An architecture combining the attention mechanisms of Point Transformer with efficient sampling strategies will achieve higher detection accuracy for railway foreign objects than the baseline method RandLA-Net.
- Certain data augmentation techniques simulating various railway environments and object placements will enhance model generalization and reduce false positives.
Data Acquisition and Preprocessing
We will deploy multiple LiDAR sensors along selected railway segments to capture comprehensive 3D point cloud data. The data collection strategy will ensure coverage of diverse environmental conditions (weather, lighting, seasons) and various types of potential foreign objects.
The data we generated is stored in PLY format. The following is some summary information about these files:
Total PLY files: 1031
Total points across all files: 248,205,859
Average points per file: 240,742.8
Min points in a file: 50,048
Max points in a file: 952,476
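Statistics like the ones above can be reproduced by reading the vertex count from each PLY header. The sketch below parses the `element vertex N` declaration directly; the `data/*.ply` directory is a hypothetical location for the collected scans.

```python
import glob

def ply_point_count(path):
    """Return the vertex count declared in a PLY file's header."""
    with open(path, "rb") as f:
        for raw in f:
            line = raw.decode("ascii", errors="ignore").strip()
            if line.startswith("element vertex"):
                return int(line.split()[-1])
            if line == "end_header":
                break
    raise ValueError(f"no vertex element found in {path}")

# 'data/*.ply' is a hypothetical location for the collected scans
counts = [ply_point_count(p) for p in glob.glob("data/*.ply")]
if counts:
    print(f"Total PLY files: {len(counts)}")
    print(f"Total points across all files: {sum(counts):,}")
    print(f"Average points per file: {sum(counts) / len(counts):.1f}")
    print(f"Min points in a file: {min(counts):,}")
    print(f"Max points in a file: {max(counts):,}")
```

Reading only the header avoids loading hundreds of millions of points just to count them.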
We labeled all files into 13 classes. The names and distribution of the classes are listed below:
1 track
2 ditch
3 masts
4 cable
5 tunnel
6 ground
7 fence
8 mountain
9 train
10 human
11 box
12 others
Among these labels, label 11 (box) is the foreign object we want to detect. Label 12 (others) is the background label and is not a foreign object. Label 1 (track) is the railway track, which is also not a foreign object, and the remaining labels are not foreign objects either.
Label distribution:
Label 0: 16,653,029 points (6.71%)
Label 1: 39,975,480 points (16.11%)
Label 2: 7,937,154 points (3.20%)
Label 3: 4,596,199 points (1.85%)
Label 4: 2,562,683 points (1.03%)
Label 5: 31,412,582 points (12.66%)
Label 6: 73,861,934 points (29.76%)
Label 7: 7,834,499 points (3.16%)
Label 8: 51,685,366 points (20.82%)
Label 9: 9,047,963 points (3.65%)
Label 10: 275,077 points (0.11%)
Label 11: 3,080 points (0.00%)
Label 12: 2,360,810 points (0.95%)
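A label distribution of this form can be computed with `np.bincount` over the per-point label array. The toy labels below are illustrative; real labels would be loaded from the annotated PLY files.

```python
import numpy as np

def label_distribution(labels, num_classes=13):
    """Per-label point counts and percentages for a 1-D integer label array."""
    counts = np.bincount(labels, minlength=num_classes)
    pct = 100.0 * counts / counts.sum()
    return counts, pct

# toy example with three classes
labels = np.array([0, 0, 1, 2, 2, 2])
counts, pct = label_distribution(labels, num_classes=3)
for i in range(len(counts)):
    print(f"Label {i}: {counts[i]:,} points ({pct[i]:.2f}%)")
```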
We split the data randomly into a training set (858 files) and a test set (172 files).
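The random split can be sketched as below; the fixed seed and the file names are assumptions made for reproducibility of the illustration.

```python
import random

def split_files(files, n_test, seed=42):
    """Randomly split file paths into train/test lists.
    The seed is a hypothetical choice for reproducibility."""
    files = list(files)
    random.Random(seed).shuffle(files)
    return files[n_test:], files[:n_test]

# e.g., 1030 labeled files -> 858 train / 172 test
train, test = split_files([f"scan_{i:04d}.ply" for i in range(1030)], n_test=172)
print(len(train), len(test))  # 858 172
```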
We will use the random sampling strategy from RandLA-Net to downsample the point cloud data. The downsampling rate is 1/4, meaning we keep 1/4 of the points in the original point cloud. The downsampled point clouds will be used for training and testing.
Although random sampling may lose some information, RandLA-Net compensates for this potential loss by using a Local Feature Aggregation module to retain important features.
In the specific implementation, the default sampling method is: randomly select a point, then take the 30,000 nearest neighboring points around it as the input to the network.
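These two steps (uniform random downsampling, then a k-nearest-neighbor crop around a random center) can be sketched in NumPy. The function names and the brute-force distance computation are illustrative, not the actual RandLA-Net implementation, which uses more efficient neighbor search.

```python
import numpy as np

def random_downsample(points, ratio=0.25, seed=0):
    """Keep a random `ratio` fraction of the points (1/4 by default)."""
    rng = np.random.default_rng(seed)
    n_keep = int(len(points) * ratio)
    idx = rng.choice(len(points), size=n_keep, replace=False)
    return points[idx]

def crop_neighborhood(points, k=30000, seed=0):
    """Pick a random center point and return its k nearest neighbors,
    mirroring the default input-cropping strategy described above."""
    rng = np.random.default_rng(seed)
    center = points[rng.integers(len(points))]
    d = np.linalg.norm(points - center, axis=1)
    k = min(k, len(points))
    return points[np.argsort(d)[:k]]

pts = np.random.default_rng(1).random((1000, 3))  # toy point cloud
print(random_downsample(pts).shape)          # (250, 3)
print(crop_neighborhood(pts, k=100).shape)   # (100, 3)
```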
Evaluation Metrics
Performance evaluation will focus on both accuracy and efficiency:
- Detection accuracy: precision and mIoU for foreign object detection
- Computational efficiency: inference time and memory consumption at different input sizes
- System reliability: false positive and false negative rates under varying environmental conditions
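For the accuracy metrics, a minimal sketch of per-class IoU computed from a confusion matrix follows; mIoU is simply the mean over classes. This is a generic formulation, not tied to any particular framework.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Accumulate a num_classes x num_classes confusion matrix."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

def per_class_iou(cm):
    """IoU per class: TP / (TP + FP + FN)."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    denom = tp + fp + fn
    return np.where(denom > 0, tp / np.maximum(denom, 1), 0.0)

# tiny two-class example
cm = confusion_matrix(np.array([0, 0, 1, 1]), np.array([0, 1, 1, 1]), 2)
iou = per_class_iou(cm)
print(iou, iou.mean())
```

The same confusion matrix also yields the false positive and false negative rates used for the reliability metric.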
Your Task in this Pilot Study
Our task in this pilot study is to establish a baseline performance. We will use RandLA-Net as our baseline model. The goal is to evaluate the accuracy and runtime of RandLA-Net on our dataset.
In this pilot study, we will not modify the network architecture, nor perform complex data augmentation. We will use the RandLA-Net model as-is and evaluate its performance on our dataset. We will also not do any hyperparameter tuning or other advanced techniques. The goal is only to get our training and evaluation pipeline working and to establish a baseline performance for our dataset.
Baseline Method: RandLA-Net
The details of RandLA-Net can be found in RandLA.reference.md.
The following are the default hyperparameters of RandLA-Net:
k_n = 16 # KNN
num_layers = 5 # Number of layers
num_points = 40960 # Number of input points
num_classes = 11 # Number of valid classes
sub_grid_size = 0.06 # preprocess_parameter
batch_size = 6 # batch_size during training
val_batch_size = 20 # batch_size during validation and test
train_steps = 500 # Number of steps per epoch
val_steps = 100 # Number of validation steps per epoch
sub_sampling_ratio = [4, 4, 4, 4, 2] # sampling ratio of random sampling at each layer
d_out = [16, 64, 128, 256, 512] # feature dimension
noise_init = 3.5 # noise initial parameter
max_epoch = 100 # maximum epoch during training
learning_rate = 1e-2 # initial learning rate
lr_decays = {i: 0.95 for i in range(0, 500)} # decay rate of learning rate
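The `lr_decays` entry above multiplies the learning rate by 0.95 after every epoch, so the effective rate decays exponentially from the initial 1e-2. A minimal sketch of the resulting schedule:

```python
def lr_at_epoch(epoch, base_lr=1e-2, decay=0.95):
    """Effective learning rate after `epoch` per-epoch decay steps."""
    return base_lr * decay ** epoch

print(lr_at_epoch(0))    # 0.01
print(lr_at_epoch(100))  # roughly 5.9e-05 at the final epoch
```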
Results
The results of the pilot study are shown below:
mean IoU: 70.29
IoU for each label on the test set:
Label 0: 60.12
Label 1: 74.53
Label 2: 74.21
Label 3: 82.48
Label 4: 73.62
Label 5: 83.03
Label 6: 89.68
Label 7: 79.81
Label 8: 91.93
Label 9: 95.22
Label 10: 61.86
Label 11: 0.00
Label 12: 47.31
Average accuracy: 88.86%
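As a consistency check, averaging the 13 per-label IoU values reported above reproduces the stated mean IoU (label 11, the rare box class, contributes 0.00 and pulls the mean down):

```python
ious = [60.12, 74.53, 74.21, 82.48, 73.62, 83.03, 89.68,
        79.81, 91.93, 95.22, 61.86, 0.00, 47.31]
mean_iou = sum(ious) / len(ious)
print(round(mean_iou, 2))  # 70.29
```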