Get to know
About Me

Passion
AI | Deep Learning | HR-Tech | Sailing | Food & Traveling
Experience
10+ projects: from AI-driven chatbots to NLP-centered web applications
The Skills I have
My Experience
Technology
Management
My Recent Work
Portfolio

Promptic
Prompt Engineering
LLMs
Generative AI
Prompt Optimization
Metric-driven optimization
Problem:
Optimizing prompts for Large Language Models is a critical yet challenging task. Engineers often face a complex, time-consuming, manual process of trial and error. It is difficult to systematically evaluate and improve prompts against specific business metrics and their own data, which often leads to suboptimal model performance.
Solution:
Promptic offers a simple, powerful, and analytics-driven platform that puts prompt engineering on the road to success. It provides an automated, one-click optimization process based on your own business data and metrics. Without any coding required, you can upload your data, provide an initial prompt, and let Promptic iteratively refine it using sophisticated deep-learning models. The result is a highly accurate and tailored prompt that elevates your foundation models and unlocks their full potential for your specific use case.
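To make this concrete, here is a minimal sketch of what such a metric-driven optimization loop could look like. The model name, the accuracy-style metric, and the self-refinement prompt are illustrative assumptions, not Promptic internals:

```python
# Hypothetical sketch of a metric-driven prompt optimization loop
# (illustrative only, not Promptic's actual implementation).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def score_prompt(prompt: str, examples: list[tuple[str, str]]) -> float:
    """Accuracy of a prompt over (input, expected_output) pairs."""
    hits = 0
    for text, expected in examples:
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # assumed model
            messages=[{"role": "system", "content": prompt},
                      {"role": "user", "content": text}],
        )
        if expected.lower() in response.choices[0].message.content.lower():
            hits += 1
    return hits / len(examples)

def refine(prompt: str, score: float) -> str:
    """Ask the model itself to propose an improved prompt."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content":
                   f"This prompt scored {score:.2f} on our eval set:\n\n"
                   f"{prompt}\n\nRewrite it to be clearer and more specific."}],
    )
    return response.choices[0].message.content

def optimize(prompt: str, examples, iterations: int = 5) -> str:
    best_prompt, best_score = prompt, score_prompt(prompt, examples)
    for _ in range(iterations):
        candidate = refine(best_prompt, best_score)
        candidate_score = score_prompt(candidate, examples)
        if candidate_score > best_score:  # keep only improvements
            best_prompt, best_score = candidate, candidate_score
    return best_prompt
```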

PulsePicker
Agents
Generative AI
arXiv
OpenAI
Azure
Python
FastAPI
Next.js
TailwindCSS
PostgreSQL
Vercel
Problem:
As a data science consultant, I often need to stay up to date with the latest research in my field. This is a challenge in the fast-paced world of AI: no newsletter exists that provides me with the research relevant to my specific project needs as a consultant.
Solution:
A highly customizable research newsletter, configurable to very specific knowledge needs: subscribe to any topic of your choice, and an AI agent screens the latest research papers on arXiv and sends you a weekly summary.
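A minimal sketch of the screening step, assuming the public arXiv Atom API and an OpenAI model for summarization (the query parameters and model name are assumptions, not the production agent):

```python
# Illustrative sketch: fetch recent arXiv papers for a topic and
# summarize them with an LLM for a weekly digest.
import feedparser
from openai import OpenAI

client = OpenAI()

def fetch_recent_papers(topic: str, max_results: int = 5) -> list[dict]:
    # arXiv exposes an Atom feed that feedparser can consume directly
    url = ("http://export.arxiv.org/api/query?"
           f"search_query=all:{topic}&sortBy=submittedDate"
           f"&sortOrder=descending&max_results={max_results}")
    feed = feedparser.parse(url)
    return [{"title": e.title, "summary": e.summary} for e in feed.entries]

def weekly_digest(topic: str) -> str:
    papers = fetch_recent_papers(topic)
    listing = "\n\n".join(f"{p['title']}\n{p['summary']}" for p in papers)
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model
        messages=[{"role": "user", "content":
                   f"Summarize these new arXiv papers on '{topic}' "
                   f"for a weekly newsletter:\n\n{listing}"}],
    )
    return response.choices[0].message.content
```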

AI One Pager
Competence Extraction
Skills Database
Natural Language Processing
Large Language Models
OpenAI GPT
Serverless
Vercel
Google Cloud Platform
Microservices
AWS S3
Problem:
CVs, one-pagers, and intranet profiles are hard to maintain and often outdated.
Solution:
Upload your documents. We extract your skills and automatically generate a one-pager selling you as the expert that you are!
Secure
Automated
Easy
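As a rough illustration of the extraction step, one could ask GPT for a JSON list of skills. The model name and the JSON schema below are assumptions for the sketch, not the deployed service:

```python
# Hypothetical sketch of skill extraction from an uploaded document.
import json
from openai import OpenAI

client = OpenAI()

def extract_skills(document_text: str) -> list[str]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model
        response_format={"type": "json_object"},  # force valid JSON output
        messages=[{"role": "user", "content":
                   "Extract the professional skills from this document. "
                   'Answer as JSON: {"skills": ["skill", ...]}\n\n'
                   + document_text}],
    )
    return json.loads(response.choices[0].message.content)["skills"]
```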


Human Value Detector
Transformer
RoBERTa | DeBERTa
T5
Explainable AI
LIME | LOCO
Ensemble Models
Plausibility
Faithfulness
Python
PyTorch Lightning
Neptune.ai
Goal:
"We should ban whaling, whaling is wiping out species for little in return." say some, "We shouldn’t, it is part of a great number of cultures." say others. Both arguments support their claim, but why are some arguments more convincing to us than others? This might relate to the underlying values they address. The task of identifying human values in arguments plays a crucial role in media, debates and policy-making.
Performance:
Competition-winning system in the SemEval-2023 workshop (alias Adam Smith).
F1 score of 0.56, beating the baseline by 33%
LIME captures the inner workings of the model and delivers human-understandable explanations
LIME shows higher plausibility and faithfulness than LOCO
Example:
The Argument: "We should ban whaling, whaling is wiping out species for little in return."
Human Value: Universalism: nature (confidence: 0.92)
Explanations: The top 3 most important terms account for 0.78 of the model's confidence
Explanation: The words in orange contribute to the confidence score. We can see that the model considers the words "ban", "whaling", and "wiping out" important for the prediction "protecting the environment" (universalism: nature). We can capture the importance of the explanations with the concept of faithfulness, which indicates how well the explanation captures the inner workings of the model.
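For the technically inclined, here is a sketch of how such a LIME explanation can be produced for a transformer classifier. The model path is a placeholder and the label list is a small alphabetical excerpt of the value taxonomy; this is not the competition system itself:

```python
# Sketch: LIME explanation for a transformer-based value classifier.
import numpy as np
from lime.lime_text import LimeTextExplainer
from transformers import pipeline

classifier = pipeline("text-classification",
                      model="path/to/finetuned-deberta",  # placeholder path
                      top_k=None)  # return scores for all labels

# Alphabetical excerpt of the value labels, matching the sort below
labels = ["security: societal", "tradition", "universalism: nature"]

def predict_proba(texts: list[str]) -> np.ndarray:
    """LIME expects an (n_samples, n_labels) probability matrix.
    Sort by label name so columns are consistent across samples."""
    outputs = classifier(texts)
    return np.array([[d["score"] for d in sorted(out, key=lambda d: d["label"])]
                     for out in outputs])

explainer = LimeTextExplainer(class_names=labels)
argument = ("We should ban whaling, whaling is wiping out species "
            "for little in return.")
explanation = explainer.explain_instance(argument, predict_proba,
                                         num_features=3, labels=[2])
print(explanation.as_list(label=2))  # top weighted tokens for the value
```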
Abstract:
Whether an argument is in agreement or disagreement with our personal values influences its ability to persuade. This relation plays a crucial role in media, debates and policy-making. As part of this thesis we participated in the SemEval-2023 Task 4: "Identification of Human Values behind Arguments". The goal of the task was to create systems that automatically identify the values behind arguments. Our system showed the best performance among the submissions of 39 research teams. However, in many application scenarios models not only need to deliver good performance but also need to be understandable by humans. For the identification of human values in media, Scharfbillig et al. (2021) explicitly avoid modern machine learning techniques because they lack interpretability. In this study we close this gap by not only presenting the competition-winning system but also employing techniques from the field of explainable AI. We show that the suggested explanations capture the inner working mechanisms of the system and conduct a survey to demonstrate that they are in fact understandable by humans.

ProjectHR21
Natural Language Processing
Web-Application
MERN-Stack
Python
HR-Tech
Competence Extraction
Skills Monitoring
New Work
Problem:
I am working on a Twitter sentiment model with Python...
Who in my organization can help me?
In times of remote work, COVID-19, and flexible working models, we lose track of the competencies that our colleagues have!
Goal:
Creating direct horizontal collaboration links and meeting the demands of our "new work" world. People are the most crucial asset for adapting to fast-changing environments and remaining competitive... Let's help our workforce exploit their full potential.
Secure
Automated
Easy
Example:

Analysis of my thesis
1. Machine Learning within a production line
2. Reinforcement Learning in a serial supply chain

Solution:
Using NLP, we extract competencies from daily textual work. Each employee receives a private profile to monitor their skills. They can then decide whether they want to be found within the organization as the expert that they are!
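As a simplified illustration of the extraction idea, a skills database can be matched against daily work text, for example with spaCy's PhraseMatcher. The skill list below is illustrative, not the production pipeline:

```python
# Sketch: match a skills database against work text with spaCy.
import spacy
from spacy.matcher import PhraseMatcher

nlp = spacy.load("en_core_web_sm")
skills_db = ["python", "sentiment analysis", "twitter api",
             "reinforcement learning", "machine learning"]  # illustrative

matcher = PhraseMatcher(nlp.vocab, attr="LOWER")  # case-insensitive matching
matcher.add("SKILL", [nlp.make_doc(skill) for skill in skills_db])

def extract_competencies(text: str) -> set[str]:
    doc = nlp(text)
    return {doc[start:end].text.lower() for _, start, end in matcher(doc)}

print(extract_competencies(
    "I am working on a Twitter sentiment model with Python."))
# {'python'} -- only exact database phrases match in this toy version
```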

Explainable AI for Siamese Neural Networks
Natural Language Processing
Deep Learning
Explainability
Neural Networks
Interpretability
Siamese Neural Networks
Python
LIME
SHAP
Key-Point Extraction
Goal:
An increasing fraction of public debates takes place online. The ability to automatically capture key ideas and opinions in online communication can be a game changer for politics and businesses. Argumentative texts can often be reduced to some core ideas, called key points. In this work we build models that assign arguments to their corresponding key points and thereby ask our models: "Why is a certain argument assigned to a specific key point?"
Performance:
Siamese Neural Network outperforming current solutions by up to 5%
Novel Approach of Applying LIME to Siamese Neural Networks
Example:
We modified LIME to work with Siamese Neural Networks, which gives us an understanding of why the model makes certain decisions.
Let's have a look at an example:
The Argument: The United States is undoubtedly the richest country that exists, its income is really high and higher than that of any other country, apart from being one of the main productive countries on the planet.
Key-Point: The US has a good economy/high standard of living.
Prediction probabilities: Similar 0.78, Dissimilar 0.22.
Explanation: The words in orange contribute to a high similarity score, whereas the blue words work against a match. Even though the words "income" and "productive country" are not part of the key point, the model has learned that they represent concepts of a "good economy" and a "high standard of living".
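A simplified illustration of the core idea: freeze the key point, let LIME perturb only the argument, and expose the similarity as a two-class probability. Here a generic sentence-transformers model stands in for our trained SNN:

```python
# Sketch: applying LIME to a siamese similarity model by fixing one input.
import numpy as np
from lime.lime_text import LimeTextExplainer
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in for the SNN
key_point = "The US has a good economy/high standard of living."
kp_embedding = model.encode(key_point, convert_to_tensor=True)

def predict_proba(arguments: list[str]) -> np.ndarray:
    """Map cosine similarity to [dissimilar, similar] pseudo-probabilities."""
    emb = model.encode(arguments, convert_to_tensor=True)
    sims = util.cos_sim(emb, kp_embedding).squeeze(-1).cpu().numpy()
    p_similar = (sims + 1) / 2  # rescale [-1, 1] -> [0, 1]
    return np.column_stack([1 - p_similar, p_similar])

explainer = LimeTextExplainer(class_names=["dissimilar", "similar"])
argument = ("The United States is undoubtedly the richest country that "
            "exists, its income is really high and higher than that of "
            "any other country.")
explanation = explainer.explain_instance(argument, predict_proba,
                                         num_features=5, labels=[1])
print(explanation.as_list(label=1))  # tokens pushing toward "similar"
```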
Abstract:
An increasing fraction of public debates takes place online. The ability to automatically capture key ideas and opinions in online communication can be a game changer for politics and businesses. Argumentative texts can often be reduced to some core ideas, called key points. Our work addresses the research question presented in the KPA shared task 2021 of matching arguments to key points. We propose two different models. The first combines a transformer-based denoising autoencoder (TSDAE) with a siamese neural network. The second enriches a transformer-based siamese neural network with additional features like part-of-speech tags. Our evaluation shows a performance improvement of 2% to 5% compared to the best-performing model in prior research. On the other hand, those models are characterised by a highly complex architecture and low interpretability. We implement different methods to answer why certain argument-key point pairs receive a high similarity score. Furthermore, we propose a simple approach for applying LIME to siamese neural networks (SNNs), which is novel in the literature. Hence, our twofold contribution to research comprises the presentation of new models for solving the KPA shared task and a new approach for applying LIME to SNNs.

Reinforcement Learning for Logistics
Reinforcement Learning
DDPG
Policy Gradient
Multi-Agent Supply Chain
Cost Optimization
OpenAI Gym
Goal:
Within this thesis I applied RL algorithms to train an agent to optimize cost within a serial supply chain. Where the bullwhip effect usually leads to high inventory holding costs, I could show that an artificial agent is capable of reducing cost and hence increasing profitability. The behaviour of the other actors in the supply chain has been modelled as random, human-like, or optimal. Especially when the other actors are modelled as human-like, we could reduce costs by 69%. If you are interested in the details, feel free to download the paper...
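To give a flavour of the setup, here is a toy single-echelon beer-game environment in the classic OpenAI Gym API. Lead times are omitted and the cost parameters are assumptions for illustration, not the thesis configuration:

```python
# Toy sketch of a single-echelon beer game environment (simplified).
import gym
import numpy as np
from gym import spaces

class BeerGameEnv(gym.Env):
    HOLDING_COST, BACKLOG_COST = 0.5, 1.0  # cost per unit and period (assumed)

    def __init__(self, episode_length=35):
        # continuous order quantity, which is what makes DDPG applicable
        self.action_space = spaces.Box(low=0.0, high=20.0, shape=(1,),
                                       dtype=np.float32)
        self.observation_space = spaces.Box(low=-np.inf, high=np.inf,
                                            shape=(2,), dtype=np.float32)
        self.episode_length = episode_length

    def reset(self):
        self.inventory, self.t = 12.0, 0
        return np.array([self.inventory, 0.0], dtype=np.float32)

    def step(self, action):
        demand = float(np.random.randint(0, 9))      # stochastic demand
        self.inventory += float(action[0]) - demand  # no lead time here
        cost = (self.HOLDING_COST * max(self.inventory, 0)
                + self.BACKLOG_COST * max(-self.inventory, 0))
        self.t += 1
        obs = np.array([self.inventory, demand], dtype=np.float32)
        return obs, -cost, self.t >= self.episode_length, {}
```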
Abstract:
In serial supply chains, humans tend to show irrational behaviour leading to high cost. The beer distribution game is often taught in management classes to demonstrate the bullwhip effect - a phenomenon describing that orders from the supplier tend to have a higher variance than sales to the buyer. This distortion leads to fluctuations in inventory levels, causing unnecessarily high costs. The optimal ordering policy is already known, but unfortunately, humans usually tend to show different behaviour. Hence, we investigate whether reinforcement learning can derive better ordering policies. Thereby the co-players of our intelligent agent act either randomly, optimally, or human-like (Sterman, 1989). So far, mainly action-value reinforcement learning algorithms have been implemented to solve the beer distribution game. The game is originally defined as a discrete setting (discrete order quantities), which might be the reason that policy-gradient algorithms have not been considered yet. However, in many economic situations, a continuous version of the game is applicable. The ordered quantities, for instance, are so high that simply rounding them has no major economic impact. Policy-gradient methods have features that can deliver a valuable contribution to research. They can be easily extended from one-dimensional to multi-dimensional decision making. Hence, they can be used to improve supply chains trading multiple items. We create a discrete and a continuous game environment that allow simple experimentation with reinforcement learning algorithms. We further implement the policy-gradient methods REINFORCE and Deep Deterministic Policy Gradient (DDPG). Within this study, we show that they perform on the same level as the current state-of-the-art action-value approaches. Thereby we create a starting point for future research with policy-gradient algorithms in serial supply chains.

Building a Chatbot with RASA
Internship: Building an AI-based Chatbot and Question-Answering System for Front-Business Support.
RASA
Question-Answering-Systems
AI-based Chatbot
FastAPI
Python
Elasticsearch
Haystack
Project Setting:
This project was done during an internship, hence I cannot share any code.
Goal:
Within this project we built an AI-based chatbot with RASA to support the front business. The task included the development of the chatbot itself, as well as a question-answering system based on deepset.ai's Haystack. For this we built a pipeline for extracting the right answers from a set of documents managed with Elasticsearch. If you want to know more, feel free to contact me :)
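While the project code stays private, a minimal public-API sketch of a Haystack 1.x extractive QA pipeline over Elasticsearch looks like this (host, index, and reader model are generic defaults, not project settings):

```python
# Generic sketch of a Haystack 1.x extractive QA pipeline, not project code.
from haystack.document_stores import ElasticsearchDocumentStore
from haystack.nodes import BM25Retriever, FARMReader
from haystack.pipelines import ExtractiveQAPipeline

# assumes a local Elasticsearch instance with indexed documents
document_store = ElasticsearchDocumentStore(host="localhost", index="documents")
retriever = BM25Retriever(document_store=document_store)
reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2")
pipeline = ExtractiveQAPipeline(reader=reader, retriever=retriever)

result = pipeline.run(
    query="How do I reset my password?",
    params={"Retriever": {"top_k": 10}, "Reader": {"top_k": 3}},
)
for answer in result["answers"]:
    print(answer.answer, answer.score)  # extracted span and its confidence
```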

NLP: StackOverflow tagging
Natural Language Processing
Deep Learning
Python
Neural Networks
Bag of Words
LSTM
Word Embeddings
Grid Search
The "Presentation" Pipeline
We have created a Jupyter notebook containing a data-processing/training/evaluation pipeline that can be used for presentational purposes and also acts as an entry point for understanding the structure and processes of the remaining notebooks. Please have a look at presentation_pipeline.ipynb to gain a better understanding of the steps if necessary.
The Architecture
We present the different model architectures in this project. Each of the three model architectures has an associated notebook that leads through the data processing and model training process. You can find those notebooks in the GitHub repo. The most sophisticated solution is briefly presented here.
3. Linear LSTM model for Title AND Body
This model is a variation of the Linear LSTM model for Title OR Body which takes in both question title and question body as inputs. These two inputs will be masked and passed into two separate LSTM layers (we use separate layers to allow them to individually discern between stylistic / syntactical differences in title and body). Their outputs are concatenated and processed as in the linear LSTM.
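A sketch of this two-branch architecture in Keras; vocabulary size, sequence lengths, and layer widths are assumptions for illustration:

```python
# Sketch: separate LSTM branches for title and body, merged for
# multi-label tag prediction.
from tensorflow.keras import layers, Model

VOCAB, TAGS, TITLE_LEN, BODY_LEN = 20000, 100, 30, 300  # assumed sizes

title_in = layers.Input(shape=(TITLE_LEN,), name="title")
body_in = layers.Input(shape=(BODY_LEN,), name="body")

embed = layers.Embedding(VOCAB, 128, mask_zero=True)  # masking via mask_zero
title_lstm = layers.LSTM(64)(embed(title_in))  # separate LSTMs so each branch
body_lstm = layers.LSTM(64)(embed(body_in))    # adapts to its own style

merged = layers.Concatenate()([title_lstm, body_lstm])
hidden = layers.Dense(128, activation="relu")(merged)
output = layers.Dense(TAGS, activation="sigmoid")(hidden)  # multi-label head

model = Model(inputs=[title_in, body_in], outputs=output)
model.compile(optimizer="adam", loss="binary_crossentropy")
```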
We provide a visualization of the model architecture below.

The Results
Our metric for model evaluation is the micro F1 score, calculated over all samples in the validation set. In contrast to single-label multi-class classification problems, we cannot just pick the target label with the highest score, since we want to predict multiple labels.
This means that we have to set a threshold that dictates how high a tag's score must be for us to consider it set. Since we don't know the best threshold in advance, we iterate over the 101 thresholds 0.00, 0.01, 0.02, ..., 0.99, 1.00 and select the one resulting in the best F1 score.
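In code, the sweep is only a few lines; this sketch assumes y_true and y_scores are (n_samples, n_tags) arrays of ground-truth labels and sigmoid outputs:

```python
# Sketch: pick the threshold that maximizes the micro F1 score.
import numpy as np
from sklearn.metrics import f1_score

def best_threshold(y_true, y_scores):
    thresholds = np.arange(0.0, 1.01, 0.01)  # 0.00, 0.01, ..., 1.00
    scores = [f1_score(y_true, (y_scores >= t).astype(int),
                       average="micro", zero_division=0)
              for t in thresholds]
    best = int(np.argmax(scores))
    return thresholds[best], scores[best]
```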
| model | f1 score | precision | recall | threshold |
|-------|----------|-----------|--------|-----------|
| bow   | 0.622    | 0.667     | 0.582  | 0.30      |
| qb    | 0.605    | 0.675     | 0.548  | 0.35      |
| qbt   | 0.647    | 0.740     | 0.575  | 0.38      |
- bow: bag of words model, not optimized
- qb: question body model with 1 intermediary dense layer, not optimized
- qbt: question body and title model, optimized via grid search

Predictive Maintenance @ Bosch production line
R
Machine Learning
CRISP-DM
Business Process
t-SNE
Random Forest
XGBoost
Cost-Sensitive Learning
Goal:
Within this thesis I applied various machine learning techniques to solve a predictive maintenance task within a production line. We used data from a Bosch competition on Kaggle. The goal was to run through a standard data science process model (CRISP-DM) and develop a case study of successfully deriving a solution with a positive impact on business performance. Feel free to download the PDF and find out more...
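The thesis itself was done in R, but the cost-sensitive idea translates directly; here is a rough Python sketch with XGBoost that up-weights the rare "part fails" class (data and hyperparameters are purely illustrative):

```python
# Sketch: cost-sensitive classification of rare quality-control failures.
import numpy as np
from xgboost import XGBClassifier

# X, y stand in for features and the 0/1 outcome (1 = part fails, rare)
X = np.random.rand(1000, 20)
y = (np.random.rand(1000) < 0.05).astype(int)  # synthetic placeholder data

# scale_pos_weight ~ (negatives / positives) penalizes missing the
# costly failure class more heavily than raising false alarms
weight = (y == 0).sum() / max((y == 1).sum(), 1)
model = XGBClassifier(n_estimators=200, max_depth=4,
                      scale_pos_weight=weight, eval_metric="aucpr")
model.fit(X, y)
```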
Abstract:
In times of Industry 4.0, artificial intelligence, and big data, the workforce needs to be well educated in applying those technologies. Within this environment, a leading German manufacturing company faced the challenge of predicting whether a produced part fails the internal quality control. Solving it would enable delivering high-quality products to the end user at lower costs. The target group of the developed case is students with a business background.
Therefore, I give weight to certain focal points. The first is the process of solving such challenges utilising a standard process model for data mining (CRISP-DM). The second is the application of different techniques from the field of machine learning. Finally, I attach weight to demonstrating how business and statistical knowledge together are needed to improve the performance of a model. This study aims neither at developing an ideal model for a given dataset nor at providing the mathematical foundations of the applied techniques. Within the resulting case, I present a holistic overview of developing a predictive model for a real-world problem within given constraints. This includes the application of creativity, inventiveness and the need for compromises during the data mining process. With this purpose I recommend teaching the developed case. It is relevant both for academic researchers teaching big data cases and for decision makers dealing with the topic of predictive analytics.

