Emerging Large Language Model (LLM) Application Architecture

August 14, 2023 | 4 min read

COBUS GREYLING

Because Large Language Models (LLMs) are highly unstructured, both thinking and the market are shifting on how best to implement them.

Why do I say LLMs are unstructured? LLMs are to a large extent an extension of Conversational AI.

Because human language is unstructured, the input to LLMs is conversational and unstructured, and it is shaped through prompt engineering.

The output of LLMs is also conversational and unstructured: a highly succinct form of natural language generation (NLG).

LLMs introduced functionality to fine-tune models, and an initial approach to customising LLMs was to create custom models via fine-tuning.

This approach has fallen into disfavour for three reasons:

  1. LLMs have both a generative and a predictive side, and the generative power is easier to leverage than the predictive power. If the generative side of an LLM is presented with contextual, concise and relevant data at inference time, hallucination is largely mitigated.
  2. Fine-tuning LLMs involves training data curation, transformation and cost. Fine-tuned models are frozen with a definite time-stamp and will still demand innovation around prompt creation and data presentation to the LLM.
  3. When classifying text based on pre-defined classes or intents, NLU still has an advantage with built-in efficiencies.

The aim of fine-tuning LLMs is to engender more accurate and succinct reasoning and answers. It is also meant to address one of the big problems with LLMs: hallucination, where the LLM returns highly plausible but incorrect answers.

The proven solution to hallucination is to use highly relevant and contextual prompts at inference time, and to ask the LLM to follow chain-of-thought reasoning.
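
To make this concrete, here is a minimal sketch of a prompt that combines contextual data with a chain-of-thought instruction. The function name `build_grounded_prompt`, the prompt wording and the sample snippet are my own assumptions for illustration; they do not come from any particular library.

```python
def build_grounded_prompt(question: str, context_snippets: list[str]) -> str:
    """Combine supplied context with a chain-of-thought instruction.

    Hypothetical helper: the wording of the instructions is an assumption.
    """
    context = "\n".join(f"- {snippet}" for snippet in context_snippets)
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Let's think step by step."
    )


prompt = build_grounded_prompt(
    "How long do refunds take?",
    ["Refunds are processed within five business days."],
)
print(prompt)
```

The string that comes back is what would be sent to the LLM at inference time; the model call itself is deliberately left out.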

There has been an emergence of vector stores / databases with semantic search, which provide the LLM with a contextual and relevant data snippet to reference.
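
The retrieval step can be sketched in a few lines. This is a toy, not a real vector database: the `embed` function below uses plain word counts as a stand-in for an embedding model, so it matches on shared words and not on true semantic similarity. The class name `ToyVectorStore` and the sample documents are assumptions for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store: keeps (text, vector) pairs and ranks by similarity."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self._items.append((text, embed(text)))

    def search(self, query: str, k: int = 1) -> list[str]:
        query_vector = embed(query)
        ranked = sorted(
            self._items,
            key=lambda item: cosine(query_vector, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore()
store.add("Refunds are processed within five business days.")
store.add("Our support line is open on weekdays from 9 to 5.")
store.add("Passwords can be reset from the account settings page.")

top = store.search("How long do refunds take?", k=1)
print(top)
```

In a production setup the word counts would be replaced by dense embeddings and the linear scan by an index, but the flow stays the same: embed the query, rank stored snippets, and hand the best match to the LLM.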

Vector Stores, Prompt Pipelines and/or Embeddings are used to constitute a few-shot prompt. The prompt is few-shot because context and examples are included in the prompt.
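
As a rough sketch of what constituting a few-shot prompt can look like, the snippet below places a retrieved context snippet and a couple of worked examples ahead of the user's question. The helper name `build_few_shot_prompt` and the `Q:`/`A:` layout are assumptions; any consistent layout would do.

```python
def build_few_shot_prompt(
    context: str,
    examples: list[tuple[str, str]],
    question: str,
) -> str:
    """Assemble context, worked examples and the new question into one prompt."""
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"Context:\n{context}\n\n{shots}\n\nQ: {question}\nA:"


few_shot_prompt = build_few_shot_prompt(
    context="Refunds are processed within five business days.",
    examples=[
        ("Can I get a refund?", "Yes, refunds are available."),
        ("Who handles refunds?", "The billing team handles refunds."),
    ],
    question="How long do refunds take?",
)
print(few_shot_prompt)
```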

In the case of Autonomous Agents, other tools can also be included like Python Math Libraries, Search and more. The generated response is presented to the user, and also used as context for follow-up or next-step queries or dialog turns.
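
A heavily simplified sketch of that loop is shown below. Everything here is hypothetical: `choose_tool` is a hard-coded rule standing in for the decision an LLM would make in a real agent, and the two tools are tiny stand-ins for a math library and a search API.

```python
import math
from typing import Callable


def math_tool(expression: str) -> str:
    """Evaluate 'sqrt <number>'; stand-in for a fuller math library tool."""
    _, value = expression.split()
    return str(math.sqrt(float(value)))


def search_tool(query: str) -> str:
    """Look up a canned answer; stand-in for a real search API."""
    corpus = {"capital of france": "Paris"}
    return corpus.get(query.lower(), "no result")


TOOLS: dict[str, Callable[[str], str]] = {"math": math_tool, "search": search_tool}


def choose_tool(query: str) -> str:
    """Hypothetical router; in a real agent the LLM makes this choice."""
    return "math" if query.lower().startswith("sqrt") else "search"


def run_turn(query: str, history: list[tuple[str, str]]) -> str:
    """Pick a tool, run it, and record the exchange as context for later turns."""
    answer = TOOLS[choose_tool(query)](query)
    history.append((query, answer))
    return answer


history: list[tuple[str, str]] = []
first = run_turn("sqrt 16", history)
second = run_turn("capital of France", history)
print(first, second, history)
```

The `history` list is the piece that matters for follow-up queries: each response is stored so it can be fed back as context on the next dialog turn.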

The process of creating contextually relevant prompts is further aided by Autonomous Agents and prompt pipelines, where a prompt is engineered in real time based on relevant available data, conversation context and more.

Prompt chaining is a more manual process: a flow is created within a visual designer UI, and that flow is fixed and sequential and lacks the autonomy of Agents. There are advantages and disadvantages to both approaches, and both can be used in concert.
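
Stripped of the visual designer, a prompt chain is just a fixed list of steps in which each step's output becomes the next step's input. The sketch below assumes a placeholder `fake_llm` that upper-cases its prompt so the example runs without a model; `run_chain` and the two templates are likewise illustrative names, not part of any tool.

```python
from typing import Callable


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call: returns the prompt in upper case."""
    return prompt.upper()


def run_chain(
    templates: list[str],
    user_input: str,
    llm: Callable[[str], str],
) -> list[str]:
    """Run templates in a fixed order, feeding each output into the next step."""
    outputs: list[str] = []
    current = user_input
    for template in templates:
        current = llm(template.format(input=current))
        outputs.append(current)
    return outputs


chain = ["Summarise: {input}", "Translate to French: {input}"]
results = run_chain(chain, "LLMs are unstructured.", fake_llm)
print(results)
```

The order of the steps is decided up front by whoever designs the chain, which is exactly where it differs from an agent that chooses its next step at run time.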

Lastly, an emerging field is testing different LLMs against a single prompt, as opposed to the past, where the focus was only on testing various prompts against one LLM. Tools for this include LangSmith, ChainForge and others.
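
The idea can be illustrated without any of those tools. The sketch below does not use the LangSmith or ChainForge APIs; `model_a` and `model_b` are hypothetical stubs returning canned answers, and the pass/fail check is a simple substring match chosen for illustration.

```python
from typing import Callable


def model_a(prompt: str) -> str:
    """Hypothetical stub for one LLM."""
    return "Paris"


def model_b(prompt: str) -> str:
    """Hypothetical stub for a second LLM."""
    return "I think it is Lyon."


def evaluate(
    prompt: str,
    models: dict[str, Callable[[str], str]],
    expected: str,
) -> dict[str, bool]:
    """Send the same prompt to every model and check each answer for the expected text."""
    return {
        name: expected.lower() in model(prompt).lower()
        for name, model in models.items()
    }


scores = evaluate(
    "What is the capital of France?",
    {"model_a": model_a, "model_b": model_b},
    expected="Paris",
)
print(scores)
```

With real models behind the stubs, the same loop gives a side-by-side view of which LLM handles a given prompt best.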

Determining the best-suited model for a specific prompt matters because enterprise implementations are likely to use multiple LLMs.

I’m currently the Chief Evangelist @ HumanFirst. I explore & write about all things at the intersection of AI & language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces & more.
