Feature engineering to improve the prediction quality of ML models
Project duration: 9 months
Brief description
To improve existing ML models, with quality measured by AUC (area under the ROC curve), an existing database of document and telephony data was analysed for opportunities to derive new features.
Supplement
The work was carried out in a JupyterHub environment. The required data sources (an inventory system and a data lake) were analysed in Jupyter Notebooks. Relevant existing features were identified and processed, and new features were derived on the basis of extensive analyses.
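As an illustration of the kind of feature generation described here, the following minimal sketch aggregates raw call records into one feature row per customer. The record layout, the feature names and the function `build_call_features` are hypothetical and chosen only for this example; they are not taken from the project. Plain Python is used so the sketch runs without extra packages; in a notebook the same aggregation would typically be written with a DataFrame library.

```python
from collections import defaultdict

# Hypothetical call records: (customer_id, duration_seconds, is_complaint).
calls = [
    (1, 120, False),
    (1, 300, True),
    (2, 60, False),
    (2, 45, False),
    (2, 600, True),
    (3, 30, False),
]


def build_call_features(calls):
    """Aggregate raw call rows into one feature dict per customer."""
    # Collect all calls belonging to the same customer.
    grouped = defaultdict(list)
    for customer_id, duration, is_complaint in calls:
        grouped[customer_id].append((duration, is_complaint))

    # Derive simple count, mean and share features per customer.
    features = {}
    for customer_id, rows in grouped.items():
        durations = [d for d, _ in rows]
        features[customer_id] = {
            "n_calls": len(rows),
            "mean_duration_s": sum(durations) / len(durations),
            "complaint_share": sum(c for _, c in rows) / len(rows),
        }
    return features


features = build_call_features(calls)
```

Each resulting row can then be joined to a model's existing feature set by customer ID.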
Subject description
Once the data had been processed and prepared, it was aggregated and classified. The resulting dataset was fed into existing company models to test whether forecast quality, measured by AUC, could be improved. The models tested cover contract cancellation as well as Next Best Offer (NBO) and Next Best Product (NBP). Because the models depend on the line of business, they had to be evaluated separately in each context. Overall, a significant increase in model performance was observed.
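A per-context comparison of this kind could be sketched as follows. The example computes AUC as the share of positive/negative pairs in which the positive case receives the higher score (ties count as half), once for a baseline score and once for a score from a model with additional features, separately for each line of business. The segment names "A" and "B", the toy scores and the function names are assumptions made for illustration, not project data or results.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC: probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count pairwise wins; a tie contributes half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_segment(rows):
    """rows: iterable of (segment, label, baseline_score, extended_score)."""
    grouped = defaultdict(lambda: ([], [], []))
    for segment, label, base, ext in rows:
        labels, base_scores, ext_scores = grouped[segment]
        labels.append(label)
        base_scores.append(base)
        ext_scores.append(ext)
    # One (baseline AUC, extended AUC) pair per segment.
    return {
        seg: (auc(labels, base), auc(labels, ext))
        for seg, (labels, base, ext) in grouped.items()
    }


# Toy data for two hypothetical lines of business.
rows = [
    ("A", 1, 0.6, 0.9),
    ("A", 0, 0.7, 0.2),
    ("A", 1, 0.8, 0.7),
    ("A", 0, 0.3, 0.4),
    ("B", 1, 0.4, 0.8),
    ("B", 0, 0.5, 0.3),
    ("B", 0, 0.2, 0.1),
    ("B", 1, 0.9, 0.2),
]

result = auc_by_segment(rows)
```

Reporting the two values side by side for each segment makes it visible when new features help in one line of business but not in another, which a single pooled AUC would hide.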