
Boosting Model Accuracy: Techniques I Learned During My Machine Learning Thesis at Spotify (+Code Snippets) | by Khouloud El Alami | Aug, 2023

A tech data scientist’s stack to improve stubborn ML models

This article is one of a two-part piece documenting my learnings from my Machine Learning Thesis at Spotify. Be sure to also check out the second article on how I implemented Feature Importance in this research.

In 2021, I spent 8 months building a predictive model to measure user satisfaction as part of my Thesis at Spotify.

My goal was to understand what made users satisfied with their music experience. To do so, I built a LightGBM classifier whose output was a binary response:

y = 1 → the user is seemingly satisfied
y = 0 → not so much

Predicting human satisfaction is a challenge because humans are by definition unsatisfied. Even a machine isn’t well suited to deciphering the mysteries of the human psyche. So naturally my model was as confused as one can be.

From Human Predictor to Fortune Teller

My accuracy score was around 0.5, which is about the worst outcome you can get from a binary classifier. It means the algorithm has only a 50% chance of predicting yes or no correctly, and that’s as random as a human guess.

So I spent 2 months trying and combining different techniques to improve my model’s predictions. In the end, I was finally able to raise my ROC AUC score from 0.5 to 0.73, which was a big success!
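To see why 0.5 is the chance-level baseline for both metrics, here is a minimal sketch in plain Python. The helper names `accuracy` and `roc_auc` and the synthetic labels and scores are assumptions made for illustration; they are not the thesis code or data. In practice you would more likely call `accuracy_score` and `roc_auc_score` from scikit-learn.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """ROC AUC computed with the rank-sum (Mann-Whitney U) formulation.

    It equals the probability that a randomly chosen positive example
    gets a higher score than a randomly chosen negative one, with ties
    counted as one half.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    rank_sum_pos = sum(r for r, t in zip(ranks, y_true) if t == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


# Hypothetical data: random labels and scores unrelated to those labels,
# i.e. a "model" that has learned nothing.
rng = random.Random(0)
n = 10_000
y_true = [rng.randint(0, 1) for _ in range(n)]
random_scores = [rng.random() for _ in range(n)]
random_preds = [1 if s >= 0.5 else 0 for s in random_scores]

print(f"accuracy: {accuracy(y_true, random_preds):.3f}")
print(f"ROC AUC:  {roc_auc(y_true, random_scores):.3f}")
```

Both printed values should land close to 0.5. Accuracy depends on the decision threshold (0.5 here), while ROC AUC only looks at how the scores rank positives against negatives, which is why the two numbers can diverge once a model starts learning something.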

In this post, I’ll share with you the techniques I used to significantly enhance the accuracy of my model. This article might come in handy whenever you’re dealing with models that just won’t cooperate.

Due to the confidentiality of this research, I can’t share sensitive information, but I’ll do my best to keep it from sounding confusing.

Before diving into the methods I used, I just want to make sure you get the basics right first. Some of these methods rely on encoding your variables and preparing your data accordingly in order for them to work. Some of the code snippets I’ve included also reference…