Building the AI Mind Map: Understanding LLMs and How They Work

September 30, 2024
Artificial Intelligence

Generative AI has opened new worlds of possibilities for businesses. However, the technology, and even the vocabulary of GenAI, is evolving quickly. After reading a few cheat sheets of AI terms, I realized that this approach isn't enough for first-time learners to grasp the broader AI landscape: these cheat sheets often fail to show the relationships between terms and how they work together. Therefore, I've decided to create a mind map that visually represents these terms and their interconnections.

Artificial Intelligence: AI is a broad field that encompasses the development of computer systems that can perform tasks typically requiring human intelligence. These tasks include reasoning, learning, problem-solving, perception, language understanding, and more. 

Generative Artificial Intelligence: GenAI is a subset of AI that focuses on generating new, original content, such as text, images, music and more, based on patterns and data it has been trained on. 

You may have heard terms like “machine learning” and “deep learning”; they are all subsets of AI, each focusing on particular methodologies and technologies that enable these intelligent behaviors in machines. The chart below shows the relationships among these terms.

Large Language Models

I believe everyone has their own "GPT Moments." For me, it was the series of moments when I learned front-end coding with ChatGPT and built an app. Today, from powering conversational agents to assisting in complex decision-making processes, LLMs are revolutionizing industries and redefining the possibilities of AI.

At their core, LLMs are a subset of GenAI designed to understand, generate, and manipulate human language, focusing specifically on text. Thanks to their advanced contextual understanding, LLMs also play a vital role in a wide range of GenAI technologies.

Popular Models

According to McKinsey’s report, organizational adoption of generative AI (GenAI) surged from 33% in 2023 to 65% in 2024. There is no doubt that LLMs have revolutionized natural language processing (NLP), significantly expanding business opportunities. This has ignited intense competition among major tech companies, each vying for dominance in this rapidly evolving field.

How do LLMs work

LLMs operate by predicting the next word in a sequence of text, given the context of the preceding words. This predictive capability is honed through extensive training on diverse text corpora, ranging from books and articles to websites and social media posts. The training process involves adjusting the model's parameters to minimize the difference between its predictions and the actual text.
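
To make this concrete, here is a minimal sketch of next-token prediction using the small, open-source GPT-2 model via Hugging Face's transformers library. GPT-2 is chosen only because it is freely available; production LLMs do the same thing at vastly larger scale:

```python
# Minimal next-token prediction sketch (pip install transformers torch).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, seq_len, vocab_size)

# The logits at the last position score every vocabulary token
# as a candidate for the next word.
next_token_logits = logits[0, -1]
top5 = torch.topk(next_token_logits, 5)
for score, token_id in zip(top5.values, top5.indices):
    print(repr(tokenizer.decode(token_id)), float(score))
```

Training adjusts the model's parameters so that, across billions of such examples, the highest-scoring candidates match the actual next words in the corpus.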

Before the advent of GPT (Generative Pre-trained Transformer) and transformer architectures, several other architectures were commonly used in natural language processing (NLP) and machine learning tasks, such as RNNs, LSTMs, and CNNs. However, their limitations in handling long-range dependencies and parallelization led to the development of transformer models.

The Transformer model architecture, introduced in the paper “Attention is All You Need” by Vaswani et al. in 2017, is centered around a powerful mechanism known as self-attention. This allows the model to evaluate the importance of each word in a sentence, regardless of its physical proximity to the other words in the text. By doing so, it builds a comprehensive understanding of the context and relationships among the words. This aspect of the Transformer is fundamental, revealing much about how modern NLP works. In future articles, I will delve deeper into the architectural nuances that make this model so revolutionary.
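
As a toy illustration of just the core idea (not the full Transformer), here is scaled dot-product self-attention in plain NumPy; the dimensions and random weights are invented for demonstration:

```python
# Scaled dot-product self-attention, the mechanism from
# "Attention is All You Need" (Vaswani et al., 2017).
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """X: (seq_len, d_model) token embeddings."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = K.shape[-1]
    # Every word attends to every other word, regardless of distance.
    scores = Q @ K.T / np.sqrt(d_k)                  # (seq_len, seq_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # context-aware embeddings

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 4, 8, 8
X = rng.normal(size=(seq_len, d_model))
W_q, W_k, W_v = (rng.normal(size=(d_model, d_k)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)  # (4, 8)
```

The attention weight matrix is what lets the model connect, say, a pronoun to a noun many words earlier, something RNNs struggled with.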

Use Cases

When we think about Natural Language Processing (NLP), tasks like summarization, translation, and sentiment analysis typically come to mind. However, the applications of large language models (LLMs) extend far beyond these traditional use cases. Businesses are utilizing these models for more complex tasks, such as generating detailed reports, automating customer interactions, and providing personalized recommendations. The surge in LLM-based applications since 2022 is remarkable. Take Qquest, a tool I developed, as an example: it leverages the capabilities of LLMs in language understanding, reasoning, and code generation, enabling non-technical professionals to query data without writing code and quickly find answers to business questions.
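
To give a feel for this pattern, here is a hypothetical sketch of the text-to-SQL idea behind tools like Qquest. The schema, helper function, and prompt wording are invented for illustration, not Qquest's actual implementation:

```python
# Hypothetical text-to-SQL prompt construction: the LLM receives the
# table schema plus a business question and is asked to produce a query
# that the application can validate and execute.
SCHEMA = """
Table orders(order_id INT, customer TEXT, amount REAL, created_at DATE)
"""

def build_prompt(question: str) -> str:
    return (
        "You are a data analyst. Given this schema:\n"
        f"{SCHEMA}\n"
        "Write a single SQLite query that answers the question. "
        "Return only SQL.\n"
        f"Question: {question}"
    )

print(build_prompt("What was total order revenue in August 2024?"))
# The generated SQL would then be checked and run against the database,
# so the business user never writes code themselves.
```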

How to leverage LLMs

There were many discussions suggesting that "prompt engineering" was all you needed to know and that software engineering was obsolete. However, during the development of Qquest, I realized that while AI can handle tasks like coding and script writing, creating a functional, well-designed application requires much more than simple prompt engineering. It demands deep domain knowledge, UX innovation, and a solid understanding of how LLMs work.

Prompt engineering: Crafting the inputs provided to an LLM to generate desired outputs. Before getting into details, I'd like to introduce the concept of zero-shot/few-shot learning, although they are distinct techniques.

Zero-shot/few-shot learning refers to a model recognizing and handling tasks, or making predictions for categories, that it has never explicitly seen during training. Zero-shot performance relies heavily on how well the model can interpret and act upon the instructions given to it, which is where prompt engineering comes into play.
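
To illustrate the difference, here are two invented prompt strings for the same sentiment-classification task, one zero-shot and one few-shot:

```python
# Zero-shot: the model gets only an instruction, no examples.
zero_shot = (
    "Classify the sentiment of this review as positive or negative.\n"
    "Review: The battery dies within an hour.\n"
    "Sentiment:"
)

# Few-shot: a handful of worked examples in the prompt steer the model
# toward the expected format and decision boundary.
few_shot = (
    "Classify the sentiment of each review as positive or negative.\n"
    "Review: Works exactly as advertised. -> positive\n"
    "Review: Stopped charging after two days. -> negative\n"
    "Review: The battery dies within an hour. ->"
)
```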

Without a dedicated team of data scientists or a clear AI strategy, prompt engineering offers a low-cost way for organizations and individuals to experiment with LLMs. In this case, domain knowledge is important: individuals or organizations must rethink how to organize and manage this unstructured information, distilling their specialized workflows into the clear, unambiguous language the model needs to generate a relevant response.

Using LLM APIs

Prompt engineering does provide a convenient way for most people to leverage LLMs, but it has its own limitations:

  • Model Limitations
    • Generalization Challenges: No matter how well-crafted a prompt is, the model won't produce great output for entirely new types of tasks or data it hasn't been exposed to during training.
    • Knowledge Boundaries: A model's knowledge is typically frozen at the point of its last training update. This means it can be outdated or incomplete, and no amount of prompt engineering can overcome the limits of what the model has learned.
  • Scalability Issues
    • Token ceilings: Prompts are bounded by a maximum token limit. For example, Claude 3.5 allows a context window of 200K tokens, which is sufficient for most tasks, but the limit can still be reached when extensive content, such as a detailed cookbook, needs to be processed (see the token-counting sketch after this list).
    • Efficiency: Crafting effective prompts with all relevant context every time is also time-consuming. 
  • Complexity of Tasks
    • Simplification: Prompt engineering often requires simplifying complex tasks into components that the model can handle, which can lead to loss of nuance or depth in the model's responses.
    • Multi-step Reasoning: Some tasks require complex reasoning or multiple steps that are difficult to encapsulate effectively in a single prompt. This can limit the complexity of tasks that can be handled through prompt engineering alone.
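
Here is the token-counting sketch mentioned above: a quick way to check prompt size before calling an API, using OpenAI's tiktoken library. The file name is hypothetical, and since tiktoken implements OpenAI's tokenizers, the count is only an approximation for other providers' models such as Claude:

```python
# Checking prompt size before an API call (pip install tiktoken).
import tiktoken

MAX_CONTEXT = 200_000  # e.g. Claude 3.5's advertised context window

enc = tiktoken.get_encoding("cl100k_base")
prompt = open("detailed_cookbook.txt").read()  # hypothetical large input

n_tokens = len(enc.encode(prompt))
if n_tokens > MAX_CONTEXT:
    print(f"{n_tokens} tokens: chunk, summarize, or retrieve selectively.")
else:
    print(f"{n_tokens} tokens: fits within the context window.")
```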

To better address these challenges, you can use the public APIs that most GenAI providers offer to integrate LLMs into your own workflows more seamlessly. In this case, there are three key techniques you must know to enhance your AI application's capabilities:

Retrieval Augmented Generation (RAG): Appending relevant information from an external source to improve the quality of LLM output.

RAG doesn't change how the LLM itself works, but it can markedly improve the quality and accuracy of its responses. For example, by plugging in a summary sheet of the 2024 Olympic medal count, the model can answer questions about the latest results from Paris even though they fall outside its training data. Designing a well-performing RAG system requires high-quality data, seamless pipeline design, retrieval model selection, scalable infrastructure, mechanisms for evaluation and feedback, and more. I plan to write another article on RAG best practices.
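
To show the idea end to end, here is a minimal RAG sketch, assuming the sentence-transformers library for embeddings; the two documents stand in for your external knowledge source:

```python
# Minimal RAG sketch (pip install sentence-transformers): embed the
# documents, retrieve the most similar one, and prepend it to the prompt.
from sentence_transformers import SentenceTransformer, util

docs = [
    "Paris 2024 medal table: USA led with 126 total medals, China 91.",
    "The 2022 Winter Olympics were held in Beijing.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
doc_embeddings = model.encode(docs, convert_to_tensor=True)

question = "Who topped the medal count at the Paris Olympics?"
q_embedding = model.encode(question, convert_to_tensor=True)

# Retrieve the passage most similar to the question.
best = util.cos_sim(q_embedding, doc_embeddings).argmax().item()

# Augment the prompt with retrieved context; the LLM itself is unchanged.
prompt = f"Context: {docs[best]}\n\nQuestion: {question}\nAnswer:"
print(prompt)  # this augmented prompt is what gets sent to the LLM
```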

Fine-tuning: Fine-tuning a base model on domain-specific data is a powerful technique to improve the performance and accuracy of AI models for specific tasks or industries.

Compared with RAG, fine-tuning directly modifies the model's weights based on specific examples from the target domain, potentially leading to a more nuanced understanding and longer-term retention of domain-specific knowledge. For example, suppose you want to use an LLM for sentiment analysis of customer reviews in the automotive industry. A car review stating "This ride is tight" might be misinterpreted by a general model as negative sentiment, whereas it is actually positive within the automotive context. Fine-tuning a general model on a dataset that includes automotive reviews collected from real customers could help.
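
As a sketch of what that data preparation might look like, the snippet below writes training examples in the JSONL chat format that OpenAI's fine-tuning API expects (other providers use similar formats); the reviews and labels are invented:

```python
# Preparing domain-specific fine-tuning data for the automotive
# sentiment scenario above. Each JSONL line is one training conversation.
import json

examples = [
    {"review": "This ride is tight.", "label": "positive"},
    {"review": "The engine hesitates when merging onto the highway.",
     "label": "negative"},
]

with open("car_reviews.jsonl", "w") as f:
    for ex in examples:
        record = {
            "messages": [
                {"role": "system",
                 "content": "Classify automotive review sentiment."},
                {"role": "user", "content": ex["review"]},
                {"role": "assistant", "content": ex["label"]},
            ]
        }
        f.write(json.dumps(record) + "\n")
# The file is then uploaded and a fine-tuning job started through the
# provider's API; the resulting model has the domain baked into its weights.
```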

Function calling: The ability to call external functions, APIs, or perform specific actions based on the input or context provided to the AI.

Function calling, in many cases, is also called "tool use." In essence, it extends LLM capabilities from learning and understanding to active problem solving. For example, beyond drafting an email, an LLM equipped with email-sending functionality can actively manage the task of sending it in the real world. This enhancement effectively transforms these models from passive assistants into proactive agents capable of executing tasks directly.
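
Below is a simplified, provider-agnostic sketch of the tool-use loop. The send_email function and the model's JSON output are hypothetical; real APIs such as OpenAI's and Anthropic's define their own tool-call schemas:

```python
# Simplified function-calling loop: the model emits a structured tool
# call, and the application parses and executes it.
import json

def send_email(to: str, subject: str, body: str) -> str:
    # In a real application this would call an email service.
    return f"Email sent to {to}: {subject}"

TOOLS = {"send_email": send_email}

# Suppose the model, given the user's request and the tool schema,
# responds with a structured call like this:
model_response = json.dumps({
    "tool": "send_email",
    "arguments": {
        "to": "client@example.com",
        "subject": "Q3 report",
        "body": "Please find the Q3 summary attached.",
    },
})

# Parse the call, dispatch to the matching function, and the result can
# be fed back to the model so it can compose its final answer.
call = json.loads(model_response)
result = TOOLS[call["tool"]](**call["arguments"])
print(result)
```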

Summary

In this article, I covered the core terms used in text-generation AI; the GenAI term map is still a work in progress. There is no doubt that AI will change the world. The journey of building Qquest, from a simple idea to a fully launched product, taught me invaluable lessons, and I want to share these learnings to inspire and empower others on their AI journey. By learning from my experiences, I hope to make the path to exploring AI technology clearer and more accessible, so that together we can unlock the limitless possibilities of this revolutionary field.
