University students help create Machine Learning models for RB

Students from The Hague University have helped RB strengthen its machine learning models. They focused on the question of how patterns in the large dataset about duties, crew and communication could be explored anonymously, without needing access to sensitive information.

RB is fortunate to have the most engaged crew community on the planet. Over 60,000 crew use the RosterBuster app, and many of them are also active on social channels such as Instagram, Facebook and LinkedIn. To give them the best possible experience, RB’s services can benefit from machine learning models that process the incoming data and provide insights that further improve the user experience.

An international team of The Hague University students was asked to develop machine learning models to further improve RB’s understanding of user engagement. This was part of the European Project Semester (EPS) course.

The team started by investigating the areas in which they could make the most impact. They identified roster codes and messaging as their key focus areas.

Roster codes

One of the main features of the RosterBuster app is showing users their roster on the go. To enable this, a staggering 500 different airline roster formats are supported. All of these rosters carry different codes, which need to be categorized, for example as flight, standby, day off, layover or ground transport. Although RB has a database of over 70k codes, it often happens that a code is not known, resulting in a less than optimal experience.

The RB team had applied machine learning to this problem before, which made it possible to predict the category of roster codes that are not yet known. The students set out to improve this model and were quite successful: they significantly improved its prediction capability.
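To give an idea of what predicting the category of an unknown roster code can look like, here is a minimal sketch. It is not RB's or the students' actual model, which the article does not describe. It assumes a small naive Bayes classifier over character bigrams, and the example codes and the name RosterCodeClassifier are made up for illustration.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(code, n=2):
    """Split a roster code into overlapping character n-grams, with padding."""
    padded = f"^{code.upper()}$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class RosterCodeClassifier:
    """Tiny multinomial naive Bayes over character bigrams (illustrative only)."""

    def __init__(self):
        self.ngram_counts = defaultdict(Counter)  # category -> n-gram counts
        self.category_counts = Counter()          # category -> number of codes
        self.vocabulary = set()                   # all n-grams seen in training

    def fit(self, labelled_codes):
        for code, category in labelled_codes:
            self.category_counts[category] += 1
            for gram in char_ngrams(code):
                self.ngram_counts[category][gram] += 1
                self.vocabulary.add(gram)
        return self

    def predict(self, code):
        total = sum(self.category_counts.values())
        best_category, best_score = None, -math.inf
        for category, count in self.category_counts.items():
            # Log prior plus log likelihood with add-one smoothing.
            score = math.log(count / total)
            denom = sum(self.ngram_counts[category].values()) + len(self.vocabulary)
            for gram in char_ngrams(code):
                score += math.log((self.ngram_counts[category][gram] + 1) / denom)
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Made-up example codes; real airline roster codes will differ.
TRAINING = [
    ("FLT123", "flight"), ("FLT456", "flight"), ("FLT789", "flight"),
    ("SBY1", "standby"), ("SBY2", "standby"), ("SBYH", "standby"),
    ("OFF", "day off"), ("DOFF", "day off"), ("OFFD", "day off"),
]

model = RosterCodeClassifier().fit(TRAINING)
prediction = model.predict("SBY7")  # a code that is not in the training data
```

Character n-grams are a natural choice for a sketch like this because roster codes are short strings in which a shared prefix or suffix often hints at the category, so an unseen code can still resemble known ones.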

Messaging

As users will know, RosterBuster is not just a roster app but also a communication platform. Instant messages are sent and received all the time, and they might contain valuable hints for improving the app. But of course, these messages are private and encrypted, not only end-to-end but also ‘at rest’.

To make it possible to analyze the messages, the team created a hook where a machine learning model can be applied to each message just before it is stored. This way the data can be handled completely anonymously, while still making it possible to detect sentiment and the frequency of certain topics.
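The idea of such a hook can be sketched as follows. This is an illustration under stated assumptions, not the team's implementation: the article does not say where the hook runs or which model it uses. The word lists, the AnonymousMessageStats class and the store_message function are hypothetical, and a simple keyword lookup stands in for the machine learning model.

```python
import re
from collections import Counter

# Hypothetical word lists; a real deployment would use a trained model.
POSITIVE = {"great", "love", "thanks", "good"}
NEGATIVE = {"bad", "broken", "slow", "hate"}
TOPICS = {"roster": {"roster", "schedule"}, "swap": {"swap", "trade"}}


class AnonymousMessageStats:
    """Keeps aggregate counters only; no message text or user id is retained."""

    def __init__(self):
        self.sentiment = Counter()
        self.topics = Counter()

    def observe(self, text):
        words = set(re.findall(r"[a-z']+", text.lower()))
        score = len(words & POSITIVE) - len(words & NEGATIVE)
        label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
        self.sentiment[label] += 1
        for topic, keywords in TOPICS.items():
            if words & keywords:
                self.topics[topic] += 1


def store_message(text, stats, encrypt_and_store):
    """Run the analysis hook, then hand the message to the storage layer."""
    stats.observe(text)      # hook: runs just before the message is stored
    encrypt_and_store(text)  # placeholder for the real encrypted storage


stats = AnonymousMessageStats()
stored = []  # stand-in for the encrypted store
for message in ["Love the new roster view, thanks!",
                "The swap screen is slow and broken",
                "See you at the gate"]:
    store_message(message, stats, stored.append)
```

Only the counters in stats survive the hook, so the analysis output says how often a sentiment or topic occurred but not who wrote what.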

Conclusions

As we value privacy and security above all else, the team had to create the machine learning models in isolation, on mocked data. Their true value will only show once we run the models on live data. Based on what we have seen, we expect useful improvements and insights.

Working with the team has been a privilege, and we wish Milo Kastablank, Hannah Shin, Long Dang, Rochan Soenessardien and Dawid Gliwka all the best in their future endeavors.