Analysis of document journals using data science
Project duration: 5 months
Brief description
The customer digitizes all incoming and outgoing correspondence as part of its inventory management system. The metadata of specific incoming documents is analyzed for correlations, using both statistical methods and unsupervised machine learning methods for pattern recognition. In particular, the analysis identifies metadata with promising potential for further use in a cancellation model.
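As an illustration of the statistical side of such an analysis, the following minimal sketch measures the association between two categorical metadata fields with Cramér's V. It is not the method used in the project; the column names (doc_type, channel, weekday) and the sample rows are hypothetical and serve only to show the idea.

```python
import numpy as np
import pandas as pd


def cramers_v(x: pd.Series, y: pd.Series) -> float:
    """Cramér's V between two categorical columns (0 = no association, 1 = perfect)."""
    # Contingency table of observed counts
    table = pd.crosstab(x, y).to_numpy(dtype=float)
    n = table.sum()
    # Expected counts under independence: outer product of the marginals, divided by n
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    chi2 = ((table - expected) ** 2 / expected).sum()
    k = min(table.shape) - 1
    # With only one category in a column there is nothing to associate
    return float(np.sqrt(chi2 / (n * k))) if k > 0 else 0.0


# Hypothetical journal metadata (assumption: not taken from the project)
journal = pd.DataFrame({
    "doc_type": ["complaint", "complaint", "inquiry", "inquiry", "complaint", "inquiry"],
    "channel": ["letter", "letter", "email", "email", "letter", "email"],
    "weekday": ["Mon", "Tue", "Mon", "Tue", "Mon", "Tue"],
})

v_type_channel = cramers_v(journal["doc_type"], journal["channel"])
v_type_weekday = cramers_v(journal["doc_type"], journal["weekday"])
```

In this toy data, doc_type and channel always occur together, so their association is much stronger than that between doc_type and weekday. Field pairs with a high value would be candidates for closer inspection.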
Supplement
The data from the inventory management system is abstracted in a higher-level BI base layer. The analyses are performed in the customer's data science workbench (JupyterHub). The underlying Oracle databases are accessed with SQL via SQL Developer. Python packages (NumPy, pandas, etc.) are used for further processing, and visualizations are created mainly with seaborn.
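The tool chain described above could be used roughly as in the following sketch, which computes correlations between numeric metadata columns with pandas and draws them as a seaborn heatmap. The data frame is a randomly generated stand-in for a query result from the BI base layer; the column names and the helper functions correlation_matrix and plot_heatmap are hypothetical.

```python
import numpy as np
import pandas as pd


def correlation_matrix(frame: pd.DataFrame) -> pd.DataFrame:
    """Pearson correlations between the numeric columns of the frame."""
    return frame.select_dtypes(include="number").corr()


def plot_heatmap(corr: pd.DataFrame):
    """Draw an annotated heatmap of a correlation matrix and return the axes."""
    # Imported here so the analysis part also runs where seaborn is unavailable
    import seaborn as sns

    return sns.heatmap(corr, annot=True, vmin=-1.0, vmax=1.0, cmap="vlag")


# Hypothetical stand-in for data loaded from the database (assumption)
rng = np.random.default_rng(seed=0)
pages = rng.integers(1, 20, size=200)
journal = pd.DataFrame({
    "page_count": pages,
    "processing_days": pages * 0.5 + rng.normal(0, 1, size=200),
    "attachments": rng.integers(0, 5, size=200),
    "sender": "hypothetical",  # non-numeric column, ignored by the correlation
})

corr = correlation_matrix(journal)
```

In a notebook, calling plot_heatmap(corr) would render the matrix. In a real setup the frame would come from an SQL query against the database rather than from generated numbers.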
Subject description
The analyses show particular potential for supplementing an existing cancellation model, but could also add value in other use cases. As a next step, the data could be processed further so that the identified benefit can be assessed within existing models.