DMP: Vienna Traffic Peak Volume Prediction

Published April 14, 2026 | Version v2

Report Open

This deposit examines traffic congestion patterns in Vienna and tests whether station-level vehicle count data can be used to classify traffic intensity into three categories: Low, Medium, and High. The raw data was downloaded from data.gv.at, Austria's open government data portal, rather than collected manually. Everything else in the deposit was produced by a Python script that cleans the data, trains two classifiers, and saves the results.

The deposit contains nine files. D1 is the original CSV from data.gv.at, which is semicolon-delimited, latin1-encoded, and has 45,672 rows. D2 is the processed version of that file, saved as a standard CSV. D3 contains the predictions made on the test set. D4, D5, and D6 are PNG images showing the traffic volume histogram, the confusion matrix, and the model comparison chart, respectively. D7 is the Python script, D8 is the README, and D9 is the data management plan (DMP).
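
The D1 layout described above (semicolon delimiters, latin1 encoding) needs explicit options when it is read. The following is a minimal sketch using pandas, not the deposited script (D7) itself. The column names come from the description; the three sample rows and their values are invented for illustration, and reading "standard CSV" as comma-delimited UTF-8 is an assumption.

```python
import io

import pandas as pd

# Hypothetical three-row sample standing in for D1 (the real file has 45,672 rows).
# Column names are taken from the description; all values are invented.
sample = (
    "ZNR;RINAME;FZTYP;TVMAX\n"
    "1075;Gürtel;Kfz;412\n"
    "1075;Gürtel;Lkw;38\n"
    "1102;Praterstraße;Kfz;655\n"
)
raw_bytes = sample.encode("latin1")

# Read a semicolon-delimited, latin1-encoded file (the D1 layout).
df = pd.read_csv(io.BytesIO(raw_bytes), sep=";", encoding="latin1")

# Re-save as comma-delimited text, assumed here to be what "standard CSV" means for D2.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
processed = buffer.getvalue()
```

With the real file, the `io.BytesIO(...)` argument would be replaced by the path to D1 and the buffer by an output path for D2.
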

The script trains two classification models with scikit-learn: a Decision Tree with a maximum depth of 8 and a Random Forest with 100 estimators. The TVMAX column was binned into three congestion classes using its 33rd and 66th percentiles as cut-off points. The training features were ZNR, RINAME, and FZTYP, all of which were label-encoded before being passed to the models. The dataset was split into training (70%), validation (15%), and test (15%) sets. The model with the higher validation accuracy was selected and then evaluated once on the test set.
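
The workflow in the paragraph above can be sketched roughly as follows. This is an illustration under stated assumptions, not the deposited script (D7). The data frame is synthetic, with random values that carry no real signal. The percentile cut-offs are computed on the full column, the split is stratified, and the `random_state` values are arbitrary. None of those three choices is confirmed by the description.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the processed data (D2); all values are invented.
rng = np.random.default_rng(0)
n = 600
df = pd.DataFrame({
    "ZNR": rng.integers(1000, 1020, size=n),
    "RINAME": rng.choice(["Richtung 1", "Richtung 2"], size=n),
    "FZTYP": rng.choice(["Kfz", "Lkw"], size=n),
    "TVMAX": rng.integers(0, 2000, size=n),
})

# Bin TVMAX into three classes at its 33rd and 66th percentiles.
low_cut, high_cut = np.percentile(df["TVMAX"], [33, 66])
df["congestion"] = pd.cut(
    df["TVMAX"],
    bins=[-np.inf, low_cut, high_cut, np.inf],
    labels=["Low", "Medium", "High"],
)

# Label-encode each feature column independently.
features = ["ZNR", "RINAME", "FZTYP"]
X = pd.DataFrame({col: LabelEncoder().fit_transform(df[col]) for col in features})
y = df["congestion"].astype(str)

# 70/15/15 split: hold out 30%, then halve the hold-out (stratification is an assumption).
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.30, random_state=42, stratify=y
)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.50, random_state=42, stratify=y_tmp
)

# Hyperparameters from the description; random_state is an arbitrary choice.
models = {
    "Decision Tree": DecisionTreeClassifier(max_depth=8, random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Compare the models on the validation set only.
val_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    val_accuracy[name] = accuracy_score(y_val, model.predict(X_val))

# Select the model with the higher validation accuracy, then score it once on the test set.
best_name = max(val_accuracy, key=val_accuracy.get)
test_accuracy = accuracy_score(y_test, models[best_name].predict(X_test))
print(best_name, round(test_accuracy, 3))
```

Because the synthetic TVMAX values are independent of the features, the printed accuracy says nothing about the real data. The point of the sketch is the order of operations: the test set is touched only once, after model selection on the validation set.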

Files

Name	Size
DMP：Vienna Traffic Peak Volume Prediction.pdf md5:cdb5fef2c480470a624c39c6ac2c72a1	276.9 KiB