November 11, 2020
·
7 MIN READ

How to bootstrap and continuously improve Botpress projects with HumanFirst Studio and real data

MATHIEU RENE

HumanFirst Studio was built to manage and continuously improve the training data of large conversational assistants. It identifies valuable training data in existing sources that are often available but hard to tap into without proper tooling.

In this article we’ll see how to use available datasets, or your own, to create a Botpress bot from scratch without having to come up with every single training phrase. We’ll also see how to use our command-line tool, hf, to integrate Botpress with Studio in a git-oriented workflow.

Note: What you’ll learn in this article can also be applied to the continuous improvement of deployed Botpress projects.

Installation

You will need a HumanFirst Studio account to go through this tutorial; you can create a free account here to get started.

Install the HumanFirst CLI tool

Download one of our precompiled binaries at: https://github.com/zia-ai/humanfirst/releases/tag/cli-0.0.4

Choose the binary for your operating system.
For Linux, do:

You can then log in to your Studio account from the command line:

You should then see something like this, indicating you have logged in properly.

Install Botpress

Download the latest archive from the Botpress downloads area.

Start the server using the provided binary.

You can now navigate to http://localhost:3000/ and create the admin user to begin using Botpress.

Starting a new Botpress project

Once you have logged in, click the Create Bot button on the top right of the screen, then select New Bot. Let's pick an example that's already within their templates. Call it smalltalk and select Small Talk in the bot templates dropdown, a bit further down. Click Create Bot to confirm.

This bot is fairly simple, and most of the logic lies within the Q&A section.

Importing your new Botpress project into Studio

Now that you have a bot, create a workspace into which you’ll import your data. (A workspace is essentially a labeled container that holds your intents and lets you manage and improve them.)

Note: Since the commands are run from the Botpress root folder, you have to specify the bot id that you chose in the Create Bot dialog. If you didn't name your bot smalltalk, you'll have to edit the command accordingly.
Note: We use --clear to erase the workspace's contents so that it reflects exactly what you have in your repository. It's not necessary the first time, but it's a good way to bring in changes that someone else committed to the repository.

http://studio.humanfirst.ai/ will now show your newly created workspace along with the intents imported from the Botpress project.

Adding more data

We’ll add some phrases to the existing intents. We can search publicly available datasets for training phrases that fit. Since the intents added when the Botpress bot is initialized are fairly generic, there is a good chance we'll find relevant matches.

In Studio, click on the Data sources menu item on the left, then click the Use one of our data sets button to add existing conversations to your project. There are many choices available, but for this tutorial pick the STAR dataset, which contains goal-oriented conversations for different tasks. If you have existing data, either human-to-human conversations or a list of unclassified utterances, this is where you would import it into your workspace.

Augmenting existing intents

Now that we have some unlabeled data to work with, we can expand the currently defined intents.

In the Labeled data section, you'll find the list of imported intents. Activating one will bring up the list of its associated training examples. Click the Get Suggestions button and some suggestions will be provided from the dataset you added in the previous step. You can then accept training examples that make sense. The None of these look good button rejects the remaining elements.

Note: Recommendations work by looking at all of the workspace’s training data and returning examples from your data sources. When you reject suggestions, we maintain a list of phrases that are internally tagged as “not part of that intent”. This list is used to improve later suggestions. You can see it as an ephemeral binary classifier that helps narrow down your search until you have enough relevant examples.
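To make the "ephemeral binary classifier" idea more concrete, here is a minimal sketch of how accepted and rejected phrases could both influence the ranking of suggestions. This is not Studio's implementation: the `rank_suggestions` function, the word-count vectors, and the "closest accepted minus closest rejected" score are all assumptions chosen to keep the example self-contained. A real system would most likely use sentence embeddings rather than word counts.

```python
import math
from collections import Counter


def vectorize(text):
    """Toy stand-in for a sentence embedding: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_suggestions(candidates, accepted, rejected):
    """Hypothetical ranking: reward closeness to accepted phrases,
    penalize closeness to phrases rejected for this intent."""
    accepted_vecs = [vectorize(p) for p in accepted]
    rejected_vecs = [vectorize(p) for p in rejected]

    def score(phrase):
        vec = vectorize(phrase)
        positive = max((cosine(vec, v) for v in accepted_vecs), default=0.0)
        negative = max((cosine(vec, v) for v in rejected_vecs), default=0.0)
        return positive - negative

    return sorted(candidates, key=score, reverse=True)


# Made-up sample phrases for a greeting intent.
ranked = rank_suggestions(
    candidates=["goodbye friend", "book a hotel", "hello friend"],
    accepted=["hello there", "hi there"],
    rejected=["goodbye for now"],
)
print(ranked)
```

In this toy example, the candidate that resembles a rejected phrase drops to the bottom of the list, which is the narrowing effect described in the note above.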

Discovering new intents

Next, let’s take a look at the Unlabeled data section. This is where all utterances that haven't been assigned to an intent are located.

You’ll see a list of unlabeled utterances sourced from your data sources. Since you’ve already added some demo data, there should be a lot of them. The search bar on top is a full-text search feature that lets you find things the old-fashioned way. Try it first by searching for hotel: a few intents can be created from the results.

One of the initial matches is "Hi, I am looking for the rating of a hotel." Go ahead and select it. You'll notice that a new option is available right under the selection: Show similar suggestions. This button uses semantic search to look for similar phrases in the corpus. It's a good idea to mix these two techniques, because full-text search gives you keyword-based results, while semantic search expands on the meaning of the utterance and returns more relevant matches.

Select a few examples where the user clearly asks for a hotel with a specific rating. Notice that the button is clickable again; clicking it will look for results similar to all selected items.

Tip: You can shift+click to select a range without clicking on each of them separately.
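The interplay between keyword search and "similar to my selection" search can be sketched in a few lines of Python. Everything here is illustrative and not how Studio works internally: `keyword_search` and `similar_to_selection` are hypothetical helpers, the sample utterances (other than the first) are made up, and word-count vectors stand in for real sentence embeddings, so they only capture word overlap rather than meaning.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def keyword_search(corpus, term):
    """Full-text search: keep utterances that contain the term."""
    term = term.lower()
    return [u for u in corpus if term in u.lower()]


def similar_to_selection(corpus, selected, top_k=3):
    """Rank unselected utterances by similarity to all selected items combined."""
    combined = Counter()
    for utterance in selected:
        combined.update(embed(utterance))
    remaining = [u for u in corpus if u not in selected]
    remaining.sort(key=lambda u: cosine(embed(u), combined), reverse=True)
    return remaining[:top_k]


corpus = [
    "Hi, I am looking for the rating of a hotel.",
    "What is the rating of the Hilton?",
    "I need a hotel with at least four stars",
    "Can I book a doctor's appointment?",
    "Is it going to rain tomorrow?",
]

matches = keyword_search(corpus, "hotel")      # step 1: keyword search
selected = [matches[0]]                        # step 2: select one match
suggestions = similar_to_selection(corpus, selected, top_k=2)
print(matches)
print(suggestions)
```

Note that the similarity step surfaces an utterance about ratings that does not contain the word "hotel" at all, which is why mixing the two techniques tends to find more than either one alone.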

Once you have enough elements, click the Label selected data button on the left, and click + Create here to create a new intent. Let's name it hotel_request_rating and click the Create and edit button.

Here are a few intents you may want to create:

  • Book an appointment
  • Reserve a hotel
  • Reserve a hotel with a specific rating (see if you can make this one a child intent of the reserve a hotel one)

Refactoring projects

While working on your project, you may decide that some intents should be merged together or broken down into more specific intents. In the Labeled data section, where you can view the list of training phrases for an intent, you'll notice a checkbox next to each phrase. Clicking it will automatically sort the rest of the list by similarity to the selected phrases. You can then select the similar phrases and move them using the left column, as we did with unlabeled utterances in the previous step.

Back to Botpress

We can export our changes using the command line.

You can now go back to Botpress and see your updated data.
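If you prefer to check the result from a script rather than the Botpress UI, a small helper can count the training phrases per intent on disk. This is a sketch under assumptions: it supposes that intents are stored as one JSON file per intent under data/bots/<bot id>/intents, each with an `utterances` object keyed by language code. That layout and the `count_utterances` helper are not taken from this article, so adjust them to match your Botpress version.

```python
import json
import tempfile
from pathlib import Path


def count_utterances(intents_dir, language="en"):
    """Return {intent name: number of training phrases} for one language.

    Assumes one JSON file per intent with an "utterances" object keyed
    by language code (an assumption about the Botpress file layout).
    """
    counts = {}
    for path in sorted(Path(intents_dir).glob("*.json")):
        intent = json.loads(path.read_text(encoding="utf-8"))
        name = intent.get("name", path.stem)
        utterances = intent.get("utterances", {}).get(language, [])
        counts[name] = len(utterances)
    return counts


# Demo on a throwaway folder that mimics the assumed layout.
with tempfile.TemporaryDirectory() as tmp:
    intents_dir = Path(tmp) / "data" / "bots" / "smalltalk" / "intents"
    intents_dir.mkdir(parents=True)
    sample = {
        "name": "hotel_request_rating",
        "utterances": {"en": ["Hi, I am looking for the rating of a hotel."]},
    }
    (intents_dir / "hotel_request_rating.json").write_text(
        json.dumps(sample), encoding="utf-8"
    )
    demo_counts = count_utterances(intents_dir)

print(demo_counts)
```

Against a real project, you could run `count_utterances` on the bot's intents folder before and after the export and compare the two results; in a git-oriented workflow, the same change would also show up in the repository diff.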
