Projects

Here is a selection of projects and collaborations I’ve contributed to across research, student leadership, and open-source work.

RouteLLM Reproduction

BERT · XLM-RoBERTa · HuggingFace Trainer · Dataset Rebalancing · LLM Routing · Benchmarking (MMLU, GSM8k, MT-bench)

This project reproduces and improves the BERT-based router from the RouteLLM paper. The original framework aims to cut LLM costs without compromising quality, but my reproduction revealed significant methodological flaws in the original BERT router implementation.

Key Findings & Methodological Improvements:

  • Addressing Overfitting: I identified that the original BERT routers were overfitting to the majority class, achieving poor macro F1 scores (0.23–0.35). The main cause was a skewed training dataset (51% strong model wins).
  • Dataset Rebalancing: To fix this, I implemented oversampling to balance the classes, which enabled meaningful training convergence and significantly improved routing performance.
  • Improved Performance: My rebalanced BERT router outperforms the original checkpoints on MMLU, GSM8k, and MT-bench benchmarks, even when using significantly less data (19k vs 130k+ samples).
  • Reproducible Pipeline: I developed a complete training and evaluation pipeline using HuggingFace Trainer and XLM-RoBERTa-base for 3-class classification (strong_win, tie, weak_win).
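
As a rough illustration of the rebalancing step, here is a minimal sketch of random oversampling over the three labels. It is not the project's actual code: the `oversample` helper, the `label` key, and the toy class counts are assumptions made for this example.

```python
import random
from collections import Counter


def oversample(examples, label_key="label", seed=0):
    """Duplicate minority-class examples at random until all classes are equally large."""
    rng = random.Random(seed)

    # Group the examples by their class label.
    by_label = {}
    for ex in examples:
        by_label.setdefault(ex[label_key], []).append(ex)

    # Every class is grown to the size of the largest one.
    target = max(len(group) for group in by_label.values())

    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap to the target size.
        balanced.extend(rng.choices(group, k=target - len(group)))

    rng.shuffle(balanced)
    return balanced


# Toy data with a made-up class distribution (not the real dataset).
toy = (
    [{"text": f"s{i}", "label": "strong_win"} for i in range(6)]
    + [{"text": f"t{i}", "label": "tie"} for i in range(3)]
    + [{"text": f"w{i}", "label": "weak_win"} for i in range(2)]
)
balanced = oversample(toy)
counts = Counter(ex["label"] for ex in balanced)
print(counts)
```

Oversampling of this kind is usually applied to the training split only, so that evaluation still reflects the original class distribution.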

Source code

Playlist Manager

React · Vite · Vercel · LLM Integration · Semantic Search · Prompt Engineering · Structured Data (JSON)

The Playlist Manager is a web application that streamlines sorting tracks from large source playlists into multiple target playlists. I built it rapidly with the help of an AI agent on a React and Vite stack, with a focus on fast sorting through keyboard shortcuts and an integrated Spotify player.

Before finalizing the manual sorting application, I explored several automated categorization strategies:

  • Semantic Search: I implemented a similarity-based sorter using SentenceTransformer to calculate the cosine similarity between destination playlist names and artist genres (leveraging the fact that Spotify provides genre data at the artist level).
  • LLM-Driven Classification: I developed a specialized sorter utilizing LLMs via API. This involved complex prompt engineering and batched inference to assign multiple songs to playlists in a single pass, using structured JSON outputs (via JSON schema) to ensure reliable integration.
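
The similarity-based sorter can be sketched roughly as follows. To keep the example self-contained, the `embed` function is a bag-of-words stand-in, not a real sentence-embedding model; in the approach described above, the vectors would come from a SentenceTransformer model instead. The function names and the sample playlists are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Stand-in embedding (assumption): word counts instead of a sentence-embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_playlist(artist_genres, playlist_names):
    """Return (playlist name, score) for the name most similar to the artist's genres."""
    genre_vec = embed(" ".join(artist_genres))
    scored = [(name, cosine(genre_vec, embed(name))) for name in playlist_names]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical target playlists and artist genres.
playlists = ["Indie Rock", "Deep House", "Jazz Classics"]
name, score = best_playlist(["modern rock", "indie rock"], playlists)
print(name, round(score, 2))
```

Because genres are attached to artists and not to individual tracks, every track by the same artist receives the same score in a scheme like this, which is one way the metadata limits the approach.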

While these automated approaches provided valuable insights into agentic workflows, the scarcity of music metadata limited how well they could categorize tracks. The final live application therefore focuses on empowering the user with an optimized manual sorting interface.

Bachelor’s thesis: Automated Classification and Statistical Analysis of Scheduling-Diagrams

Python · NumPy · OpenCV · R · Shiny · UML · Git · Project Management · Data Analysis · Scientific Writing · LaTeX

For my bachelor’s thesis, I designed and implemented a computer-vision pipeline to automatically evaluate exam exercises. The system extracts student answers from scanned images, detects which exercise permutation each student received, and scores each submission against the matching solution. The generated data powers an interactive analysis dashboard that I built in R and Shiny to explore difficulty differences across exercise variants.
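
To illustrate only the final scoring step, here is a simplified sketch. It assumes that an answer has already been extracted as one label per time slot and that the variant has already been detected. The data layout, the variant names, and the slot-by-slot grading rule are assumptions made for this example, not the scheme used in the thesis.

```python
def score_submission(answer, solutions, variant):
    """Return the fraction of time slots that match the solution for the given variant."""
    solution = solutions[variant]
    if len(answer) != len(solution):
        raise ValueError("answer and solution must cover the same number of time slots")
    # Count the slots where the student's label equals the reference label.
    matches = sum(a == s for a, s in zip(answer, solution))
    return matches / len(solution)


# Hypothetical reference solutions, one per exercise variant.
solutions = {
    "variant_1": ["A", "A", "B", "B", "C"],
    "variant_2": ["B", "A", "A", "C", "C"],
}

# Hypothetical extracted answer for a sheet detected as variant_1.
extracted = ["A", "A", "B", "C", "C"]
score = score_submission(extracted, solutions, "variant_1")
print(score)
```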

Olydorf App

Flutter · Scrum · Project Management · Software Architecture · Mobile Development · Git

During my time as CTO of Olynet, I initiated and led the development of the OlyApp, a Flutter-based mobile app that supports students living in the Munich “Olympisches Dorf” dormitory. I coordinated a team of three developers in Scrum-like iterations, set the long-term technical direction, and acted as the interface between stakeholders and the engineering team. Source code.

TUM Campus App

Kotlin · Java · Android · REST APIs · SQLite · Migration · Asynchronous Programming · Mobile Development · Git

As part of the TUM Open Source Lab I joined the TUMdev development team to enhance the Android TUM Campus App. I merged multiple bug fixes and quality improvements, gained experience with Android testing, and worked on migrating one of the app’s oldest components, the cafeteria module, to a new API. Source code.

Personal Website

Jekyll · Web Hosting · Responsive Web Development · DevOps · JavaScript · CSS · SEO · Git

This Jekyll-based personal site has been my playground for experimenting with developer tooling, writing workflows, and design iterations. The current version builds on the AcademicPages theme, retaining blog content and custom visualizations while modernizing the structure for long-term maintainability.