In this digital era, data-driven decisions matter more than ever for your business. When you base your decisions on data, you can be more confident in them, and you stay ahead of competitors who do not. Data science offers many interesting phenomena and topics, but in this blog post we describe the problem of learning from imbalanced data and how we at Quant Coding deal with it.
The imbalanced data problem arises when you train a classifier on a dataset in which the classes are not equally represented. For example, in a binary classification setting, 99% of the examples may belong to one class and only 1% to the other. Usually, the minority class is the one we are interested in.
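Before training anything, it helps to check how skewed the labels actually are. Here is a minimal sketch in plain Python that mirrors the 99% / 1% example above. The `labels` list and the `class_distribution` helper are made up for illustration, not part of any particular library.

```python
from collections import Counter


def class_distribution(labels):
    """Return (counts, shares): examples per class and fraction of the dataset per class."""
    counts = Counter(labels)
    total = len(labels)
    shares = {cls: n / total for cls, n in counts.items()}
    return counts, shares


# Hypothetical labels: 990 majority-class examples (0) and 10 minority-class examples (1).
labels = [0] * 990 + [1] * 10

counts, shares = class_distribution(labels)

# Imbalance ratio: size of the largest class divided by size of the smallest class.
imbalance_ratio = max(counts.values()) / min(counts.values())

print(shares)           # share of each class in the dataset
print(imbalance_ratio)  # how many majority examples there are per minority example
```

With real data, you would pass the target column of your dataset instead of the hand-built list.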
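Evaluating models trained on imbalanced data is different from evaluating other models, because overall accuracy can look high even when the model misses most of the minority-class examples.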
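To make this concrete, here is a small sketch that scores a "model" which always predicts the majority class. The data and the helper functions `accuracy` and `recall` are hypothetical and written by hand for illustration. In practice you would likely use the equivalent metrics from your machine learning library.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples where the prediction matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positive examples that were predicted as positive."""
    actual_positives = sum(1 for t in y_true if t == positive)
    if actual_positives == 0:
        return 0.0  # no positives to find; treated as 0.0 in this sketch
    true_positives = sum(
        1 for t, p in zip(y_true, y_pred) if t == positive and p == positive
    )
    return true_positives / actual_positives


# Hypothetical dataset: 990 majority-class examples (0), 10 minority-class examples (1).
y_true = [0] * 990 + [1] * 10

# A trivial baseline that ignores its input and always predicts the majority class.
y_pred = [0] * len(y_true)

baseline_accuracy = accuracy(y_true, y_pred)
baseline_recall = recall(y_true, y_pred, positive=1)

print(baseline_accuracy)  # 0.99
print(baseline_recall)    # 0.0
```

The baseline reaches 99% accuracy without finding a single minority-class example, which is why a metric focused on the minority class, such as recall, tells a very different story.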