November 26, 2020
·
3 MIN READ

Bottom-up NLU with HumanFirst

Let your data drive.

Learn how to apply the tried and tested divide-and-conquer approach to labeling large datasets using HumanFirst.

Heads Up: If you’re new to bottom-up labeling, please read “A bottom-up approach to NLU”.

What is bottom-up labeling?

Bottom-up labeling applies the divide-and-conquer approach to the problem of labeling large datasets. Instead of expecting a human or an unsupervised algorithm to correctly “predict” which intents and abstractions exist in the data, it provides a simple framework for discovering this information iteratively. [1]

Below is a simple example of what bottom-up labeling looks like. It starts on the left with unlabeled utterances; moving to the right, the intents become increasingly specific. This specificity is achieved by labeling from the bottom up. We’ll show you how to put this approach into practice using HumanFirst!

Part 1: Setting things up!

This article is part 1 in a series that will show you how to apply a bottom-up approach to labeling and intent discovery with HumanFirst. In this article, we’ll focus on getting started with HumanFirst and how to set up the bottom-up labeling process.

Getting started is simple.

Step 1: Upload your raw conversational data to HumanFirst. You can upload utterances in TXT format or 2-way conversations in CSV format.
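Before uploading, it can help to clean the raw utterances. The sketch below is one way to do that in Python. It assumes a plain-text layout of one utterance per line, which is an assumption made for illustration and not a format confirmed by this article; the function name write_utterances is hypothetical as well.

```python
from pathlib import Path


def write_utterances(utterances, path):
    """Write cleaned, de-duplicated utterances to a text file.

    Assumption: one utterance per line. Check the upload documentation
    for the layout HumanFirst actually expects.
    """
    seen = set()
    cleaned = []
    for raw in utterances:
        # Collapse runs of whitespace (including newlines) into single spaces.
        text = " ".join(raw.split())
        key = text.lower()
        # Skip empty strings and case-insensitive duplicates.
        if text and key not in seen:
            seen.add(key)
            cleaned.append(text)
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return cleaned
```

Calling write_utterances(raw_utterances, "utterances.txt") would write the cleaned list to disk and return it, so you can check how many utterances survived the clean-up.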

Step 2: Head to the Unlabeled Data section of HumanFirst and begin selecting utterances that are related at a high level of abstraction (e.g. questions, problems, requests).

In the example above, we chose questions as the initial level of abstraction and selected utterances that relate to it.

Step 3: Once enough utterances have been selected, label them as an intent. We’ll call this one “has a question”.

The outcome of these steps is valuable, as it provides high-quality and domain-specific training data to classify users who “have a question”.

Step 4: Open the “has a question” intent and begin selecting some of its training data.

As you can see, selecting an utterance within an intent triggers a semantic re-rank of the intent’s training data. This speeds up the selection and refactoring process.
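To make the idea of re-ranking concrete, here is a minimal sketch in Python. It is not HumanFirst’s implementation, which this article does not describe. As a stand-in for semantic similarity, it scores each utterance by the cosine similarity of its word counts against the selected utterance; the function names and sample utterances are illustrative.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counter objects)."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank(selected, candidates):
    """Sort candidates from most to least similar to the selected utterance."""
    target = Counter(tokenize(selected))
    return sorted(
        candidates,
        key=lambda utterance: cosine(target, Counter(tokenize(utterance))),
        reverse=True,
    )


# Illustrative utterances, not taken from a real dataset.
utterances = [
    "where can I change my notification settings",
    "how do I close my account",
    "what are your opening hours",
    "can I reset my account password",
]
ranked = rerank("how do I delete my account", utterances)
for utterance in ranked:
    print(utterance)
```

With these sample utterances, the account-related questions rise to the top and the unrelated question about opening hours drops to the bottom. A real system would more likely compare sentence embeddings, which also capture similarity between utterances that share no words.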

Step 5: Assign the selected utterances to a more specific sub-intent of your choosing.

We end up with two sub-intents: “has a question > about account” and “has a question > about settings”.

Step 6: Repeat steps 4 & 5 within your new sub-intents (to classify labeled utterances into further sub-intents) until the desired level of granularity is achieved.

Every step produces training data for classifiers that can recognize increasingly specific intents: this is one of the major advantages of this approach.

Repeating this approach yields an intent hierarchy that reflects your domain. After a few minutes of this process, we’ve generated an intent structure that contains trained classifiers at every level of abstraction. This makes the corpus easier to understand and long-tail intents easier to identify.
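The resulting structure can be pictured as a tree in which every node holds its own training data. The sketch below models steps 4 to 6 in Python under that assumption. The Intent class and its methods are hypothetical and only illustrate the idea; they are not part of HumanFirst.

```python
from dataclasses import dataclass, field


@dataclass
class Intent:
    """A node in the intent hierarchy (hypothetical, for illustration)."""
    name: str
    utterances: list = field(default_factory=list)
    children: dict = field(default_factory=dict)

    def split(self, child_name, selected):
        """Move the selected utterances into a more specific sub-intent."""
        child = self.children.setdefault(child_name, Intent(child_name))
        for utterance in selected:
            self.utterances.remove(utterance)
            child.utterances.append(utterance)
        return child

    def training_data(self):
        """All utterances under this intent, including those in sub-intents."""
        collected = list(self.utterances)
        for child in self.children.values():
            collected.extend(child.training_data())
        return collected

    def paths(self, prefix=""):
        """List this intent and everything below it as 'parent > child'."""
        path = f"{prefix} > {self.name}" if prefix else self.name
        result = [path]
        for child in self.children.values():
            result.extend(child.paths(path))
        return result


# Illustrative utterances, not taken from a real dataset.
question = Intent("has a question", [
    "how do I close my account",
    "can I reset my account password",
    "where can I change my notification settings",
    "what are your opening hours",
])
question.split("about account", [
    "how do I close my account",
    "can I reset my account password",
])
question.split("about settings", ["where can I change my notification settings"])

for path in question.paths():
    print(path)
```

Because training_data() collects utterances from all sub-intents, the parent intent keeps its full set of examples while each child holds a narrower subset. That mirrors the point above: every level of the hierarchy has data a classifier can be trained on.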

References

[1] A bottom-up approach to intent discovery and training
