Methods for Predicting the Likelihood of a Collision

Project Title: Detecting the Likelihood of Collisions

Problem statement:
Urban traffic collisions, especially in metropolitan areas such as Los Angeles, present multifaceted problems affecting people across a wide range of demographics and locations. We will examine the patterns, behaviors, and high-risk zones associated with collision incidents. By analyzing factors such as the time, location, victim demographics, and modus operandi (MO) codes of collisions, we aim to uncover underlying trends and potential preventive measures. The ultimate goal is to enhance urban safety and optimize traffic management in the city.
Related work:
Loukaitou-Sideris, A., Liggett, R., & Sung, H. G. (2007). Death on the crosswalk: A study of pedestrian-automobile
collisions in Los Angeles. Journal of Planning Education and Research, 26(3), 338-351.
In related work, traffic flow has been encoded as images through the Gramian Angular Field (GAF), Markov Transition Field (MTF), and Recurrence Plot (RP); these images are then merged into a three-channel image that serves as input to the downstream model components. For traffic data, data-driven models have been gaining popularity in recent years due to their superior performance compared to traditional methods (Cai et al., 2020a, b; Zheng et al., 2021; Huakang et al., 2020; Fang et al., 2022).
Initial hypothesis:
How do temporal patterns, geographical zones, and victim demographics influence the frequency and
severity of traffic collisions in Los Angeles? Based on preliminary data insights, it’s anticipated that
certain time slots, specific urban areas, and particular demographic groups might be disproportionately
affected, highlighting areas for targeted traffic safety interventions.
Dataset(s):
Dataset source (link and reference): Traffic Collision Data from LA City's public safety portal.
https://data.lacity.org/Public-Safety/Traffic-Collision-Data-from-2010-to-Present/d5tf-ez2w
Number of instances: 597,788
Number of features: 18
Class distribution (# instances in each class, if applicable): Not applicable
Dataset splits: Suggestion: 70% for training, 15% for validation, and 15% for testing.
Preprocessing steps:
1. Handle missing values (e.g., imputation or removal).
2. Convert time from 24-hour format to standard format.
3. Geographical clustering for areas with high collision rates.
4. Encode categorical variables such as Area Name, Victim Sex, and Victim Descent.
5. Extract features from the date (e.g., day of the week, month, year).
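The preprocessing steps above could be sketched in Pandas roughly as follows. The toy rows and column names here are illustrative assumptions rather than the dataset's exact schema (step 3, spatial clustering, is omitted):

```python
import pandas as pd

# Toy rows mimicking the LA collision data; column names are assumptions.
df = pd.DataFrame({
    "Date Occurred": ["01/15/2019", "07/04/2020", "11/30/2021"],
    "Time Occurred": [1330, 45, 2215],  # 24-hour HHMM integers
    "Area Name": ["Central", "Hollywood", "Central"],
    "Victim Sex": ["M", "F", None],
    "Victim Age": [34, None, 52],
})

# 1. Handle missing values: fill unknown sex, impute the median age.
df["Victim Sex"] = df["Victim Sex"].fillna("X")
df["Victim Age"] = df["Victim Age"].fillna(df["Victim Age"].median())

# 2. Convert HHMM integers into an hour-of-day feature.
df["Hour"] = df["Time Occurred"] // 100

# 4. One-hot encode categorical variables.
df = pd.get_dummies(df, columns=["Area Name", "Victim Sex"])

# 5. Extract calendar features from the date.
dt = pd.to_datetime(df["Date Occurred"], format="%m/%d/%Y")
df["DayOfWeek"] = dt.dt.dayofweek  # 0 = Monday
df["Month"] = dt.dt.month
df["Year"] = dt.dt.year
```

The same transforms would be fit on the training split only and then applied to the validation and test splits to avoid leakage.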
Method(s):
To analyze urban traffic collisions in Los Angeles, we will first perform exploratory data analysis (EDA), plotting geospatial data on a map to identify high-risk areas, using Pandas for data manipulation and Matplotlib or Seaborn for visualization. We will then use a Spatio-Temporal Graph Convolutional Network (ST-GCN) to capture both spatial dependencies between different urban areas and temporal patterns over time. The novelty lies in the integration of spatial and temporal data, which we will implement using PyTorch Geometric. EDA helps us understand and visualize the current state of the data, while the ST-GCN's spatio-temporal analysis leverages this understanding to make future predictions. This combination provides both a retrospective and a prospective view of traffic collisions in Los Angeles.
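A minimal sketch of the EDA hotspot step, assuming latitude/longitude columns: bin coordinates into grid cells and count collisions per cell to surface candidate high-risk zones (the column names, coordinates, and 0.01-degree cell size are illustrative assumptions):

```python
import pandas as pd

# Toy collision coordinates; not taken from the actual dataset.
df = pd.DataFrame({
    "lat": [34.05, 34.051, 34.052, 34.20, 34.21],
    "lon": [-118.25, -118.251, -118.252, -118.40, -118.41],
})

# Bin coordinates into ~0.01-degree grid cells (roughly 1 km) and
# count collisions per cell; dense cells are candidate hotspots.
df["cell"] = list(zip((df["lat"] / 0.01).round().astype(int),
                      (df["lon"] / 0.01).round().astype(int)))
hotspots = df["cell"].value_counts()

print(hotspots.head(1))  # the densest cell
```

In practice the top cells would be rendered on a basemap with Matplotlib or Seaborn, and the cell counts over time become the node signals fed to the ST-GCN.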
Evaluation:
To quantitatively measure the performance of the solution for traffic collision prediction, we’ll employ
Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE). These metrics will provide insights
into the average magnitude of errors between predicted and observed collision counts. Additionally,
considering the spatial nature of the problem, we’ll introduce distance-based metrics to evaluate how
closely our predicted high-risk areas align with actual collision locations. Qualitatively, feedback from
traffic management authorities will be sought to gauge the practical utility and interpretability of our
predictions. When comparing our methods to prior work, we’ll benchmark against traditional statistical
models or simpler machine learning models, focusing on both prediction accuracy and the depth of
insights provided.
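For concreteness, the two core metrics can be computed directly (the collision counts below are made-up illustrative numbers):

```python
import math

def mae(y_true, y_pred):
    """Mean Absolute Error: average magnitude of prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Squared Error: penalizes large errors more than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Illustrative weekly collision counts for one area.
observed = [12, 7, 15, 9]
predicted = [10, 8, 18, 9]

print(mae(observed, predicted))   # (2 + 1 + 3 + 0) / 4 = 1.5
print(rmse(observed, predicted))  # sqrt((4 + 1 + 9 + 0) / 4) ≈ 1.87
```

RMSE exceeding MAE, as here, indicates a few larger errors dominating; comparing the two gives a quick read on error distribution across areas.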
Management plan:
To effectively manage our project implementation, we have assigned roles based on the strengths, interests, and backgrounds of each team member. Our Project Manager (Redha) will coordinate team activities, ensuring timely contributions and overseeing the synthesis of the report and presentation content. The Software Architect (Jiayu Yuan) will lead the organization and structure of our code, managing GitHub activities and ensuring collaborative code quality. Our Experiment Architect (Chittesh Pandita) will design and implement the experiment protocols, focusing on method evaluation and hypothesis testing. The Data Architect (Avish Khosla) is responsible for all data-related tasks, from collection to processing. For project components with specific domain requirements, our Domain Expert (Padmasini) will provide insights and guidance. Communication will be maintained through regular virtual meetings, and accountability will be ensured through periodic progress checks and collaborative platforms such as GitHub.
Project Title:
Step 1: Summary of relevant work
To identify relevant work, you should search key words related to your chosen topic in search
engines such as Google Scholar. Then you will select the most relevant papers to your
proposed project. The number of relevant papers will vary for each project, but most projects will
probably find 8-10 key relevant papers.
Once you’ve selected your list of papers, at least one member of your group should read each
paper and make notes about the key points. To help you organize your notes about each paper,
fill out the following template for each of the key papers (thus you will have around 8-10 of these
blocks below, though the exact number of papers will depend on what is relevant for your
project).
Citation in ACM citation style.
Brief summary:
● 1-3 bullets that concisely summarize the key innovation and results in the paper
Strengths:
● 1-3 bullets that concisely summarize the key strengths of the paper
Limitations:
● 1-3 bullets that concisely summarize the key limitations of the paper
Here is an example:
Rußwurm, M., Courty, N., Emonet, R., Lefèvre, S., Tuia, D., & Tavenard, R. (2023). End-to-end
learned early classification of time series for in-season crop type mapping. ISPRS Journal of
Photogrammetry and Remote Sensing, 196, 445-456.
Brief summary:
● Proposed loss function that optimizes dual objective of classification accuracy and
earliness of classification
● Model outputs crop type class prediction in addition to probability that the prediction
should be used at that timestep or wait for more data from later timesteps
● Demonstrated using LSTM with Sentinel-2 time series, but can be implemented for any
deep learning model
Strengths:
● Simple approach that would be easy to implement for any neural network architecture
● Provides information that can be used to judge reliability of predictions at a given time in
the growing season (which can be used to inform end-user decision-making)
Limitations:
● Poor performance for minority classes (subject to class imbalance issues)
● Poor performance for small datasets (as with many deep learning models)
● Outperformed by random forest baseline
Step 2: Organization of relevant work
In this section, you will organize the papers from above into groups of papers that have similar
techniques, strengths, and/or limitations. For example, you might group papers by the type of
methods used (e.g., deep learning vs. other techniques for classification) or by their limitations
(e.g., studies that showed poor vs. strong performance on imbalance datasets).
There is not a specific format for this section, as long as you clearly show how you have
organized your papers from Step 1. This is meant to help you prepare to write your Related
Work section in your written report. You can refer to each paper by its in-text citation (e.g.,
Rußwurm et al., 2023 in the earlier example).
Here are some suggested resources to review to help you prepare to write a good Related Work
section based on your literature review:
● Carnegie Mellon University PDF and video on preparing a literature review:
https://www.cmu.edu/student-success/other-resources/resource-descriptions/relatedwork.html
● Related Work slides from Penn State University:
https://www.cse.psu.edu/~pdm12/cse544/slides/cse544-relwork.pdf
You can delete this page in your submission:

Final project literature review rubric

Part 1: Summary of relevant work (6 pts)
● 6 to >3.0 pts: Concise summary of contributions and strengths/limitations of relevant prior work
● 3 to >0.0 pts: Summary of prior relevant work written but does not show understanding of key points in papers
● 0 pts: Missing or insufficient summary of related work

Part 2: Organization of relevant work (5 pts)
● 5 to >3.0 pts: Organized related work around sensible themes that form a compelling narrative about prior work
● 3 to >0.0 pts: Organized related work into common themes but themes are superficial or do not form compelling narrative
● 0 pts: Missing or incomplete organization of relevant work

Total Points: 11 (we'll scale to match 5% of final score)
