BEST Office TOF Sensor Activity
Description
Abstract
This experiment trains a machine learning model to predict door activity levels from time-based features, with targets derived from sensor readings. A TOF (time-of-flight) sensor mounted near a doorway collected distance measurements, which were filtered, aggregated, and labelled as activity counts in 10-minute bins. These counts were then used as training targets for a gradient boosting model. The model's predictions were evaluated against a held-out test set and visualised in a time-series plot. The goal of the project was to demonstrate basic principles of reproducible modelling and data publication, including FAIR metadata.
Context and methodology
This upload is part of a coursework project in the Data Stewardship course, TU Wien, 2025 summer semester. The model and results were generated for the assignment.
The dataset used consists of distance measurements captured by a TOF sensor installed near a door in the BEST Office (Room Code ACEG38). The data was cleaned, filtered and aggregated into 10-minute activity intervals. The machine learning pipeline was implemented in Python and includes preprocessing, model training, and evaluation stages. A gradient-boosted regression model was trained to predict activity counts based on time-based features (hour, minute, day of week, and weekend).
Technical details
Language and environment: Python 3.12, using scikit-learn, pandas, matplotlib
Model type: HistGradientBoostingRegressor from scikit-learn, trained using Poisson loss
Output files:
output_model.pkl: the trained machine learning model (serialised using joblib)
test_predictions_plot.png: a visual comparison of predicted vs. actual activity counts for the test set
The plot illustrates model performance using the test partition of the dataset, showing how well the predicted activity matches actual observations over time.
Provenance and metadata
Creator: Raphael-Hafis Kretschmer, TU Wien
Year: 2025
Language: Python
Dependencies: scikit-learn, matplotlib, pandas
Format: .pkl, .png
Model metadata is described using elements of the FAIR4ML schema (e.g. software environment, training parameters, target variable).
All code used to train and evaluate the model is version-controlled and publicly available on GitHub.