Geolytix Model Innovation Day
Geolytix recently held a Model Innovation Day, an internal event to challenge our Data Scientists. Who was the model modeller at the end of the day?
21st October 2022
In any industry, it’s important to keep abreast of the latest advancements in the field. This is of course also true in the world of location planning. At Geolytix we recently held our Model Innovation Day, an internal event which aims to give our Data Scientists a chance to challenge themselves whilst learning new modelling methodologies. It’s also a platform for us to share these ideas with each other, and a good excuse for ordering in pizza!
This session’s objective was to create a sales forecasting model for a Malaysian retailer, using a mix of demographic, mobile, affluence, competitor, POI and road data. The accuracy of each model was assessed against a hold-out sample of stores, which was revealed at the end of the day. The primary goal though was for each of us to challenge ourselves to try something new, whether it be a new approach, language or software.
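For readers who haven't come across hold-out testing before, here is a minimal sketch of the idea in Python. It is not the code used on the day: the store figures are made up, the `split_holdout` and `mape` helpers are hypothetical names, and mean absolute percentage error is simply an assumed choice of accuracy measure. The "model" is a deliberately naive baseline that predicts average sales for every store.

```python
import random


def split_holdout(stores, holdout_fraction=0.2, seed=42):
    """Shuffle the store records and set a fraction aside as the hold-out sample."""
    rng = random.Random(seed)
    shuffled = stores[:]
    rng.shuffle(shuffled)
    n_holdout = max(1, round(len(shuffled) * holdout_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]


def mape(actual, predicted):
    """Mean absolute percentage error, expressed as a percentage."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100 * sum(errors) / len(errors)


# Made-up store sales, purely for illustration.
stores = [{"store_id": i, "sales": 100.0 + 10 * i} for i in range(10)]
train, holdout = split_holdout(stores)

# Naive baseline "model": predict the mean of the training stores' sales.
mean_sales = sum(s["sales"] for s in train) / len(train)

# Score the baseline only on stores it never saw during "training".
actual = [s["sales"] for s in holdout]
predicted = [mean_sales] * len(holdout)
print(f"Hold-out MAPE: {mape(actual, predicted):.1f}%")
```

Any real model would replace the baseline, but the scoring step stays the same: the hold-out stores are kept hidden until the end, so every approach is judged on data it has not been tuned against.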
By the end of the day we had a real mix of different approaches, including machine learning, scorecards and catchment models. Many of these different techniques worked well, and it was interesting to see how a wide range of solutions could produce similar results. A mix of platforms and software was used by the team, including Postgres, Python (in IDEs and notebooks), R and Excel. It was great to see so many of the team trying new things and helping each other as the day progressed!
Unsurprisingly, though, there were also challenges and limitations in building a model in a day. For example, one topic of debate was one of the most highly correlated variables (a binary store operation factor), which was arguably a little questionable. Variables like these would likely have been removed or ‘turned down’ before any of these models were to hit production in a live scenario. Another important consideration is how best to describe each model, outlining which variables contribute to store sales and exactly how they do so. This understanding can be key for building user confidence in a model forecast. For example, my own attempt used an Auto-ML package (a hybrid of many ML models), which generated very accurate results but was extremely difficult to understand: probably not a great combination for use in the real world! (Other machine learning approaches on the day had much better explainability.)
Our winner for the day was Dan Dungate with his gradient boosted decision tree model. This was the most accurate, and nicely explainable, thanks to clear variable contribution charts he produced alongside the results (congrats Dan!). Most importantly though, the team had a great day and are going to reconvene to share their learnings from the session soon, and to discuss topics for our next innovation day.
Danny Hart, Head of Data Science at GEOLYTIX
Main Image: Photo Credit Lisa Taylor