Customer Service
Case StudyCustomer Requests at Online Bank
· To automatically triage future customer requests, intent classification is a standard Machine Learning task in customer service applications that requires well-labeled data. When run on a dataset of customer requests (text) annotated with 10 different intents at an online bank, Cleanlab Studio automatically revealed that over 5% of the dataset labels were incorrect. Cleanlab also automatically identified several out-of-scope requests, queries that cannot really be categorized as any of the intents in the dataset, including the following requests: "how much is 1 share of aapl", "is android better than iphone".
· For training intent classification models, a 16% improvement in prediction error was achieved simply by auto-fixing these detected label issues with Cleanlab Studio (in a fully automated manner). This improvement was realized without any change to the existing modeling code (the same LLM Transformer was trained on the Studio-cleaned version of this dataset vs. on the original dataset). Quickly handling additional data issues (outliers and ambiguous examples) in the Cleanlab Studio interface led to even greater accuracy improvements, again without any change to the existing Transformer model or its training pipeline.
· This dataset was also used for customer analytics, to determine the relative frequencies of different types of customer requests and which types of requests are most common. However some conclusions drawn from the original dataset are inaccurate due to the mislabeling and out-of-scope issues. Significantly more accurate conclusions were obtained by running the same analytics on the cleaned version of the dataset obtained from Cleanlab Studio.
· Businesses striving to make better decisions for customers must rely on accurate data-driven conclusions. These in turn rely on accurate data, which for this customer service application was easy to ensure with Cleanlab Studio.
Case StudyGavagai
At Gavagai, we rely on labeled data to train our models, both publicly available datasets and data we have annotated ourselves. We know that the quality of the data is paramount when it comes to creating machine learning models that can produce business value for our customers.
Cleanlab Studio is a very effective solution to calm my nerves when it comes to label noise!
The tool allows me to upload a dataset and obtain a ranked list of all the potential label issues in the data in just a few clicks. The label issues can then be assessed and fixed right away in the GUI.
Cleanlab should be a go-to tool in every ML practitioners toolbox!
HOW CLEANLAB CAN HELP IMPROVE YOUR CUSTOMER SERVICE AND CUSTOMER UNDERSTANDING
Quickly produce an improved version of your customer dataset to produce more reliable versions of your existing models and data analyses — all without changing any of your existing code!
Supported ML tasks where Cleanlab is particularly effective include: intent recognition, conversational AI, multi-label classification, entity/product recognition, predicting sentiment or emotions. Read more
Practice data-centric AI to produce accurate ML models for messy real-world tabular or text data. Cleanlab offers powerful automated Machine Learning capabilities so you can focus on what matters — the data.
Read about ensuring high quality evaluation data for LLM prompt selection.
Read about handling noisy tabular data to improve XGBoost model.
Read about how to improve LLMs by systematically improving the training data.
Read about improving data stored in Databricks with Cleanlab Studio.