Dustella 2025-04-12 15:22:19 +08:00
commit 1d572981f8
Signed by: Dustella
GPG Key ID: 35AA0AA3DC402D5C
5 changed files with 571 additions and 0 deletions

135
.github/prompts/1.main_task.prompt.md vendored Normal file

@ -0,0 +1,135 @@
# Main Task
## Summary
You are an academic writer, and you are going to write a Pilot Study Report.
The objective of this task is for you to prepare a project proposal and a pilot study report. The pilot study will include a small-scale experiment to evaluate the feasibility and predict the costs of a full-scale study. This will give you experience of planning, running, debugging, analyzing, and interpreting the results of a small-scale evaluation.
The report will detail all aspects of the project including an introduction, background, problem statement, methodology, results, analysis, discussion, and future work. The discussion section should include an outline of any ethical and legal issues related to the methodology employed and a description of the social and professional implications of the research.
## Requirements
- Clear, concise, and compelling description of the research problem and its significance
- Comprehensive summary of relevant literature with clear connections to research questions/hypotheses
- Detailed, well-justified methods with thorough legal, social, ethical and professional considerations
- Comprehensive techniques with clear discussion of challenges and solutions
- Detailed techniques with strong interpretation of findings
## Detail Specification
Undetected objects on railway tracks can lead to derailments, infrastructure damage, and potentially catastrophic accidents. Traditional monitoring methods rely heavily on manual inspection, which is both time-consuming and prone to human error.
This project aims to develop an advanced automated detection system for foreign objects on railway tracks using LiDAR sensors and state-of-the-art 3D point cloud segmentation techniques.
The core challenge lies in accurately identifying and classifying foreign objects amidst the complex railway scene geometry, while maintaining computational efficiency for practical deployment.
Recent advancements in 3D point cloud processing have revolutionized how we approach semantic segmentation tasks in unstructured 3D data. Unlike 2D image data, point clouds present unique challenges due to their irregular format, varying density, and lack of structured grid representation. Several pioneering approaches have emerged to address these challenges.
## Hypotheses
Based on our analysis of the problem domain and existing techniques, we propose the following hypotheses:
- An architecture combining the attention mechanisms of Point Transformer with efficient sampling strategies will achieve higher detection accuracy for railway foreign objects than the baseline method, RandLA-Net.
- Certain data augmentation techniques simulating various railway environments and object placements will enhance model generalization and reduce false positives.
## Data Acquisition and Preprocessing
We will deploy multiple LiDAR sensors along selected railway segments to capture comprehensive 3D point cloud data. The data collection strategy will ensure coverage of diverse environmental conditions (weather, lighting, seasons) and various types of potential foreign objects.
The data we generated is stored in `PLY` format. Some statistics about these files follow:
```text
Total PLY files: 1031
Total points across all files: 248,205,859
Average points per file: 240,742.8
Min points in a file: 50,048
Max points in a file: 952,476
```
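Statistics like these can be gathered without loading the point data, because a PLY header declares the vertex count on an `element vertex N` line. The following is a minimal sketch under that assumption; the helper names `ply_vertex_count` and `summarize` are illustrative and not part of our pipeline.

```python
import io
import statistics


def ply_vertex_count(stream):
    """Return the vertex count declared in a PLY header read from a binary stream."""
    for raw in stream:
        line = raw.decode("ascii", errors="replace").strip()
        if line.startswith("element vertex"):
            return int(line.split()[2])
        if line == "end_header":
            break
    raise ValueError("no 'element vertex' line found in PLY header")


def summarize(counts):
    """Summarize per-file point counts (file count, total, mean, min, max)."""
    return {
        "files": len(counts),
        "total": sum(counts),
        "mean": statistics.mean(counts),
        "min": min(counts),
        "max": max(counts),
    }


# Tiny in-memory PLY header, used only for demonstration.
demo = io.BytesIO(
    b"ply\nformat binary_little_endian 1.0\n"
    b"element vertex 50048\nproperty float x\nend_header\n"
)
print(ply_vertex_count(demo))
print(summarize([50048, 952476]))
```

For real files, the same function can be applied to each file opened in binary mode (`open(path, "rb")`), and the resulting counts passed to `summarize`.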
We labeled all files with 13 classes. The names and distribution of the classes are listed below:
```text
1 track
2 ditch
3 masts
4 cable
5 tunnel
6 ground
7 fence
8 mountain
9 train
10 human
11 box
12 others
```
Among these labels, **11 box** is the foreign object we want to detect. The label **12 others** is the background label, which is not a foreign object.
The label **0 track** is the railway track, which is also not a foreign object. The other labels are not foreign objects either.
```text
Label distribution:
Label 0: 16,653,029 points (6.71%)
Label 1: 39,975,480 points (16.11%)
Label 2: 7,937,154 points (3.20%)
Label 3: 4,596,199 points (1.85%)
Label 4: 2,562,683 points (1.03%)
Label 5: 31,412,582 points (12.66%)
Label 6: 73,861,934 points (29.76%)
Label 7: 7,834,499 points (3.16%)
Label 8: 51,685,366 points (20.82%)
Label 9: 9,047,963 points (3.65%)
Label 10: 275,077 points (0.11%)
Label 11: 3,080 points (0.00%)
Label 12: 2,360,810 points (0.95%)
```
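Label 11 rounds to 0.00% at two decimal places, so recomputing the shares with more digits makes the class imbalance easier to see. The sketch below uses only the counts listed above; the variable names are illustrative.

```python
# Point counts per label, copied from the distribution listed above.
counts = {
    0: 16_653_029,
    1: 39_975_480,
    2: 7_937_154,
    3: 4_596_199,
    4: 2_562_683,
    5: 31_412_582,
    6: 73_861_934,
    7: 7_834_499,
    8: 51_685_366,
    9: 9_047_963,
    10: 275_077,
    11: 3_080,
    12: 2_360_810,
}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}

rarest = min(counts, key=counts.get)
largest = max(counts, key=counts.get)

# Print each share with four decimals so very small classes stay visible.
for label in sorted(counts):
    print(f"Label {label:2d}: {100 * shares[label]:8.4f}%")

# Ratio between the most and least frequent labels.
print(f"imbalance {largest}:{rarest} = {counts[largest] / counts[rarest]:.0f}x")
```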
We randomly split the data into a training set (858 files) and a test set (172 files).
We will use the random sampling employed by RandLA-Net to downsample the point cloud data. The downsampling rate is 1/4, which means we keep 1/4 of the points in the original point cloud. The downsampled point cloud data will be used for training and testing.
Although random sampling may lose some information, RandLA-Net compensates for this potential loss with a Local Feature Aggregation module that retains important features.
In the implementation, the default sampling method randomly picks a point and then takes the 30,000 points nearest to it as the input of the network.
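The two sampling steps described above can be sketched in plain Python. This is an illustration only: it uses a brute-force nearest-neighbor search for readability, which is an assumption and not a description of the RandLA-Net implementation, and the function names are hypothetical.

```python
import math
import random


def random_downsample(points, ratio=4, rng=random):
    """Keep a uniformly random 1/ratio subset of the points."""
    keep = len(points) // ratio
    return rng.sample(points, keep)


def crop_around_random_center(points, k, rng=random):
    """Pick a random center point and return its k nearest points (brute force)."""
    center = rng.choice(points)
    return sorted(points, key=lambda p: math.dist(p, center))[:k]


# Small synthetic cloud; real clouds here hold 50,048 to 952,476 points.
rng = random.Random(0)
cloud = [(rng.random(), rng.random(), rng.random()) for _ in range(1000)]

downsampled = random_downsample(cloud, ratio=4, rng=rng)
crop = crop_around_random_center(downsampled, k=30, rng=rng)
print(len(cloud), len(downsampled), len(crop))
```

On full-size clouds a spatial index such as a KD-tree would be the usual choice for the neighbor query, since sorting all points for every crop becomes expensive.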
## Evaluation Metrics
Performance evaluation will focus on both accuracy and efficiency:
- Detection accuracy: precision and mIoU for foreign object detection
- Computational efficiency: inference time and memory consumption at different input sizes
- System reliability: false positive and false negative rates under varying environmental conditions
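The accuracy and reliability metrics above can all be derived from a per-point confusion matrix. The following is a minimal sketch using the standard definitions (precision = TP / (TP + FP), IoU = TP / (TP + FP + FN), FPR = FP / (FP + TN), FNR = FN / (FN + TP)); the function names and the toy labels are illustrative.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Build a matrix where cm[t][p] counts points of true class t predicted as p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def _ratio(numerator, denominator):
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def per_class_metrics(cm, c):
    """Return precision, IoU, false positive rate and false negative rate for class c."""
    tp = cm[c][c]
    fp = sum(row[c] for row in cm) - tp
    fn = sum(cm[c]) - tp
    tn = sum(map(sum, cm)) - tp - fp - fn
    return {
        "precision": _ratio(tp, tp + fp),
        "iou": _ratio(tp, tp + fp + fn),
        "fpr": _ratio(fp, fp + tn),
        "fnr": _ratio(fn, fn + tp),
    }


def mean_iou(cm):
    """Average the per-class IoU over all classes."""
    return sum(per_class_metrics(cm, c)["iou"] for c in range(len(cm))) / len(cm)


# Toy example with three classes.
y_true = [0, 0, 1, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2, 2]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(per_class_metrics(cm, 1))
print(mean_iou(cm))
```

For foreign object detection, `per_class_metrics` would be evaluated for the box class, while `mean_iou` summarizes segmentation quality across all classes.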
## Your Task in this Pilot Study
Our task in this pilot study is to establish a baseline performance. We will use RandLA-Net as our baseline model. The goal is to evaluate the performance and running time of RandLA-Net on our dataset.
In this pilot study, we will not modify the network architecture or apply complex data augmentation. We will use the RandLA-Net model as is and evaluate its performance on our dataset. We will also not do any hyperparameter tuning or apply any other complex techniques. The goal is only to get our training and evaluation pipeline working and to establish a baseline performance for our dataset.
## Baseline Method RandLA
The details of RandLA can be found at `RandLA.reference.md`.
The following are the default hyperparameters of RandLA-Net:
```text
k_n = 16 # KNN
num_layers = 5 # Number of layers
num_points = 40960 # Number of input points
num_classes = 11 # Number of valid classes
sub_grid_size = 0.06 # preprocess_parameter
batch_size = 6 # batch_size during training
val_batch_size = 20 # batch_size during validation and test
train_steps = 500 # Number of steps per epochs
val_steps = 100 # Number of validation steps per epoch
sub_sampling_ratio = [4, 4, 4, 4, 2] # sampling ratio of random sampling at each layer
d_out = [16, 64, 128, 256, 512] # feature dimension
noise_init = 3.5 # noise initial parameter
max_epoch = 100 # maximum epoch during training
learning_rate = 1e-2 # initial learning rate
lr_decays = {i: 0.95 for i in range(0, 500)} # decay rate of learning rate
```
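Two quantities follow directly from these defaults: the number of points left after each encoder layer, and the learning rate after a given number of epochs. The sketch below assumes that each layer divides the point count by its entry in `sub_sampling_ratio` and that the learning rate is multiplied by 0.95 once per epoch; both are assumptions about how the values are applied, and the helper names are illustrative.

```python
num_points = 40960
sub_sampling_ratio = [4, 4, 4, 4, 2]
learning_rate = 1e-2
decay = 0.95  # every entry of lr_decays has this value


def points_per_layer(n, ratios):
    """Return the point count at the input and after each subsampling layer."""
    sizes = [n]
    for r in ratios:
        sizes.append(sizes[-1] // r)
    return sizes


def lr_at_epoch(lr0, decay_rate, epoch):
    """Learning rate after `epoch` multiplicative decays (assumed once per epoch)."""
    return lr0 * decay_rate ** epoch


print(points_per_layer(num_points, sub_sampling_ratio))
for epoch in (0, 10, 50, 100):
    print(epoch, lr_at_epoch(learning_rate, decay, epoch))
```

Under these assumptions the 40,960 input points shrink to 80 at the deepest layer, and the learning rate falls by more than two orders of magnitude over the 100 training epochs.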

11
.gitignore vendored Normal file

@ -0,0 +1,11 @@
*.aux
*.fls
*.log
*.pdf
*.out
*.gz
*.fdb_latexmk
*.bbl
*.blg

3
.vscode/settings.json vendored Normal file

@ -0,0 +1,3 @@
{
"cSpell.words": ["downsample", "downsampled"]
}

65
assignment_3/main.tex Normal file

@ -0,0 +1,65 @@
\documentclass[11pt,twocolumn]{article}
\usepackage[utf8]{inputenc}
\usepackage{layout}
\usepackage[left=2cm,right=2cm,top=2.5cm,bottom=2.5cm]{geometry}
\setlength{\columnsep}{0.5in}
\title{CPT401 Assessment 2 Template}
\author{Hanwen Yu}
\date{April 2025}
\begin{document}
\maketitle
\begin{abstract}This is the placeholder for the abstract.\end{abstract}
\section{Introduction}
The report will be a 4-8 page paper prepared in LaTeX. You will submit electronic and printed copies of the paper with additional electronic appendices including all the data generated by the experiments, as well as copies of the presentation material and any other documents generated throughout the project. You should use BibTeX for citations \cite{SHNEIDERMAN1996The} (see the example refs.bib file).
\section{Background}
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum
\section{Problem Statement}
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum
\section{Methodology}
\section{Results}
\section{Analysis}
\section{Discussion}
The discussion section should include an outline of any ethical and legal issues related to the methodology employed and an outline of the social and professional implications of the research.
\section{Conclusion}
\section{Future Work}
\bibliographystyle{plain} % We choose the "plain" reference style
\bibliography{refs} % Entries are in the refs.bib file
\end{document}

357
assignment_3/refs.bib Normal file

@ -0,0 +1,357 @@
@inproceedings{SHNEIDERMAN1996The,
title={The Eyes Have It : A Task by Data Type Taxonomy for Information Visualization},
author={Shneiderman, B.},
booktitle={Proc IEEE Symposium on Visual Languages},
year={1996},
}
@article{Ronald1936The,
title={The Use of Multiple Measurements in Taxonomic Problems},
author={Ronald A. Fisher},
journal={Annals of Human Genetics},
volume={7},
number={2},
pages={179-188},
year={1936},
}
@article{Mendhe2014Supervised,
title={Supervised Machine (SVM) Learning for Credit Card Fraud Detection},
author={Mendhe, Neha K. and Thakare, M. N. and Korde, G. D.},
journal={International Journal of Engineering Trends \& Technology},
volume={8},
number={3},
pages={137-139},
year={2014},
}
@article{Kultur2017Hybrid,
title={Hybrid approaches for detecting credit card fraud},
author={Kultur, Yigit and Caglayan, Mehmet Ufuk},
journal={Expert Systems},
volume={34},
number={2},
pages={1-13},
year={2017},
}
@inproceedings{Masood2015Self,
title={Self-supervised learning model for skin cancer diagnosis},
author={Masood, Ammara and Al-Jumaily, Adel and Anam, Khairul},
booktitle={International IEEE/EMBS Conference on Neural Engineering},
year={2015},
}
@article{Ozolek2014Accurate,
title={Accurate diagnosis of thyroid follicular lesions from nuclear morphology using supervised learning},
author={Ozolek, John A. and Tosun, Akif Burak and Wang, Wei and Chen, Cheng and Kolouri, Soheil and Basu, Saurav and Huang, Hu and Rohde, Gustavo K.},
journal={Medical Image Analysis},
volume={18},
number={5},
pages={772-780},
year={2014},
}
@inproceedings{Srihari2009Semi,
title={Semi-supervised Learning for Handwriting Recognition},
author={Srihari, Sargur N.},
booktitle={International Conference on Document Analysis \& Recognition},
year={2009},
}
@inproceedings{Gross2008Semi,
title={Semi-supervised learning of multi-factor models for face de-identification},
author={Gross, Ralph and Sweeney, Latanya and Torre, Fernando De La and Baker, Simon},
booktitle={IEEE Conference on Computer Vision \& Pattern Recognition},
year={2008},
}
@misc{alsbury2012displaying,
title={Displaying stacked bar charts in a limited display area},
author={Alsbury, Quinton and Becerra, David},
year={2012},
month=aug # "~7",
publisher={Google Patents},
note={US Patent 8,239,765}
}
@article{harrower2003colorbrewer,
title={ColorBrewer.org: an online tool for selecting colour schemes for maps},
author={Harrower, Mark and Brewer, Cynthia A},
journal={The Cartographic Journal},
volume={40},
number={1},
pages={27--37},
year={2003},
publisher={Taylor \& Francis}
}
@article{fisher1936use,
title={The use of multiple measurements in taxonomic problems},
author={Fisher, Ronald A},
journal={Annals of eugenics},
volume={7},
number={2},
pages={179--188},
year={1936},
publisher={Wiley Online Library}
}
@inproceedings{he2016deep,
title={Deep residual learning for image recognition},
author={He, Kaiming and Zhang, Xiangyu and Ren, Shaoqing and Sun, Jian},
booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
pages={770--778},
year={2016}
}
@article{brynjolfsson2017can,
title={What can machine learning do? Workforce implications},
author={Brynjolfsson, Erik and Mitchell, Tom},
journal={Science},
volume={358},
number={6370},
pages={1530--1534},
year={2017},
publisher={American Association for the Advancement of Science}
}
@inproceedings{ribeiro2016should,
title={"Why should {I} trust you?" Explaining the predictions of any classifier},
author={Ribeiro, Marco Tulio and Singh, Sameer and Guestrin, Carlos},
booktitle={Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining},
pages={1135--1144},
year={2016}
}
@article{gunning2017explainable,
title={Explainable artificial intelligence (xai)},
author={Gunning, David},
journal={Defense Advanced Research Projects Agency (DARPA), nd Web},
volume={2},
year={2017}
}
@book{tzeng2005opening,
title={Opening the black box-data driven visualization of neural networks},
author={Tzeng, F-Y and Ma, K-L},
year={2005},
publisher={IEEE}
}
@article{allix2016empirical,
title={Empirical assessment of machine learning-based malware detectors for Android},
author={Allix, Kevin and Bissyand{\'e}, Tegawend{\'e} F and J{\'e}rome, Quentin and Klein, Jacques and Le Traon, Yves and others},
journal={Empirical Software Engineering},
volume={21},
number={1},
pages={183--211},
year={2016},
publisher={Springer}
}
@book{hassoun1995fundamentals,
title={Fundamentals of artificial neural networks},
author={Hassoun, Mohamad H and others},
year={1995},
publisher={MIT press}
}
@inproceedings{oshiro2012many,
title={How many trees in a random forest?},
author={Oshiro, Thais Mayumi and Perez, Pedro Santoro and Baranauskas, Jos{\'e} Augusto},
booktitle={International workshop on machine learning and data mining in pattern recognition},
pages={154--168},
year={2012},
organization={Springer}
}
@article{montavon2018methods,
title={Methods for interpreting and understanding deep neural networks},
author={Montavon, Gr{\'e}goire and Samek, Wojciech and M{\"u}ller, Klaus-Robert},
journal={Digital Signal Processing},
volume={73},
pages={1--15},
year={2018},
publisher={Elsevier}
}
@article{ming2018rulematrix,
title={Rulematrix: Visualizing and understanding classifiers with rules},
author={Ming, Yao and Qu, Huamin and Bertini, Enrico},
journal={IEEE transactions on visualization and computer graphics},
volume={25},
number={1},
pages={342--352},
year={2018},
publisher={IEEE}
}
@article{zeng2016towards,
title={Towards better understanding of deep learning with visualization},
author={Zeng, Haipeng},
journal={The Hong Kong University of Science and Technology},
year={2016}
}
@article{grun2016taxonomy,
title={A taxonomy and library for visualizing learned features in convolutional neural networks},
author={Gr{\"u}n, Felix and Rupprecht, Christian and Navab, Nassir and Tombari, Federico},
journal={arXiv preprint arXiv:1606.07757},
year={2016}
}
@article{wongsuphasawat2017visualizing,
title={Visualizing dataflow graphs of deep learning models in tensorflow},
author={Wongsuphasawat, Kanit and Smilkov, Daniel and Wexler, James and Wilson, Jimbo and Mane, Dandelion and Fritz, Doug and Krishnan, Dilip and Vi{\'e}gas, Fernanda B and Wattenberg, Martin},
journal={IEEE transactions on visualization and computer graphics},
volume={24},
number={1},
pages={1--12},
year={2017},
publisher={IEEE}
}
@inproceedings{nguyen2000visualization,
title={A visualization tool for interactive learning of large decision trees},
author={Nguyen, Trong Dung and Ho, Tu Bao and Shimodaira, Hiroshi},
booktitle={Proceedings 12th IEEE Internationals Conference on Tools with Artificial Intelligence. ICTAI 2000},
pages={28--35},
year={2000},
organization={IEEE}
}
@inproceedings{caruana2006empirical,
title={An empirical comparison of supervised learning algorithms},
author={Caruana, Rich and Niculescu-Mizil, Alexandru},
booktitle={Proceedings of the 23rd international conference on Machine learning},
pages={161--168},
year={2006}
}
@article{parker2001rank,
title={Rank and response combination from confusion matrix data},
author={Parker, JR},
journal={Information fusion},
volume={2},
number={2},
pages={113--120},
year={2001},
publisher={Elsevier}
}
@book{shneiderman2010designing,
title={Designing the user interface: strategies for effective human-computer interaction},
author={Shneiderman, Ben and Plaisant, Catherine},
year={2010},
publisher={Pearson Education India}
}
@inproceedings{graham2003using,
title={Using curves to enhance parallel coordinate visualisations},
author={Graham, Martin and Kennedy, Jessie},
booktitle={Proceedings on Seventh International Conference on Information Visualization, 2003. IV 2003.},
pages={10--16},
year={2003},
organization={IEEE}
}
@article{yuan2009scattering,
title={Scattering points in parallel coordinates},
author={Yuan, Xiaoru and Guo, Peihong and Xiao, He and Zhou, Hong and Qu, Huamin},
journal={IEEE Transactions on Visualization and Computer Graphics},
volume={15},
number={6},
pages={1001--1008},
year={2009},
publisher={IEEE}
}
@inproceedings{andrews2006evaluating,
title={Evaluating information visualisations},
author={Andrews, Keith},
booktitle={Proceedings of the 2006 AVI workshop on BEyond time and errors: novel evaluation methods for information visualization},
pages={1--5},
year={2006}
}
@incollection{carpendale2008evaluating,
title={Evaluating information visualizations},
author={Carpendale, Sheelagh},
booktitle={Information visualization},
pages={19--45},
year={2008},
publisher={Springer}
}
@incollection{craig2015interactive,
title={Interactive animated mobile information visualisation},
author={Craig, Paul},
booktitle={SIGGRAPH Asia 2015 Mobile Graphics and Interactive Applications},
pages={1--6},
year={2015}
}
@inproceedings{craig2015pervasive,
title={Pervasive information visualization: toward an information visualization design methodology for multi-device co-located synchronous collaboration},
author={Craig, Paul and Huang, Xin and Chen, Huayue and Wang, Xi and Zhang, Shiyao},
booktitle={2015 IEEE International Conference on Computer and Information Technology; Ubiquitous Computing and Communications; Dependable, Autonomic and Secure Computing; Pervasive Intelligence and Computing},
pages={2232--2239},
year={2015},
organization={IEEE}
}
@inproceedings{craig2014animated,
title={Animated geo-temporal clusters for exploratory search in event data document collections},
author={Craig, Paul and Se{\"\i}ler, N{\'e}na Roa and Cervantes, Ana Delia Olvera},
booktitle={2014 18th International Conference on Information Visualisation},
pages={157--163},
year={2014},
organization={IEEE}
}
@inproceedings{craig2012vertical,
title={A vertical timeline visualization for the exploratory analysis of dialogue data},
author={Craig, Paul and Roa-Se{\"\i}ler, N{\'e}na},
booktitle={2012 16th International Conference on Information Visualisation},
pages={68--73},
year={2012},
organization={IEEE}
}
@article{gang2011advances,
title={Advances of Detection Technology for Antibiotics from Livestock and Poultry Breeding Wastewater},
author={Gang, Li and Zhiyong, Yan and Xiuyi, Tan and Junfeng, Chen},
journal={Journal of Green Science and Technology},
number={11},
pages={54},
year={2011}
}
@article{andersson2003persistence,
title={Persistence of antibiotic resistant bacteria},
author={Andersson, Dan I},
journal={Current opinion in microbiology},
volume={6},
number={5},
pages={452--456},
year={2003},
publisher={Elsevier}
}
@article{bjorkman1998virulence,
title={Virulence of antibiotic-resistant Salmonella typhimurium},
author={Bj{\"o}rkman, Johanna and Hughes, Diarmaid and Andersson, Dan I},
journal={Proceedings of the National Academy of Sciences},
volume={95},
number={7},
pages={3949--3953},
year={1998},
publisher={National Acad Sciences}
}