LLM Fundamentals with LangChain

février 9, 2026

The Preface introduced the power of LLM prompting and showed how different techniques—especially when combined—can dramatically change model outputs. The core challenge in building effective LLM applications lies in crafting the right prompt and reliably processing the model’s response to produce accurate results (see Figure 1).

Figure 1. The challenge in making LLMs a useful part of your application — **Figure 1.** The challenge in making LLMs a useful part of your application

Solving this problem puts you on a strong path toward building both simple and complex LLM-powered systems. In this chapter, we explore how LangChain’s building blocks map to core LLM concepts and how combining them enables robust applications. First, we briefly explain why LangChain is useful.

Why LangChain?

LLM applications can be built directly using a provider’s SDK (such as OpenAI’s). However, learning LangChain pays off both short- and long-term for two main reasons:

Prebuilt common patterns

LangChain includes reference implementations for common LLM patterns such as chain-of-thought and tool calling. These provide a fast starting point and are often sufficient out of the box. When they aren’t, LangChain also offers flexible lower-level components.

Interchangeable building blocks

LangChain components—LLMs, chat models, output parsers, embeddings, and vector stores—follow shared interfaces, making them easy to swap. This future-proofs applications as providers and requirements evolve.

Throughout this tutorial, examples use:

LLM / Chat model: OpenAI
Embeddings: OpenAI
Vector store: PGVector

Each can be replaced with alternatives:

Chat models: Anthropic (commercial) or Ollama (open source)
Embeddings: Cohere (commercial) or Ollama (open source)
Vector stores: Weaviate or OpenSearch

LangChain’s abstraction goes beyond shared method names. For example, while OpenAI and Anthropic both support chat-based APIs, their message formats differ subtly. Mixing them directly causes friction. LangChain smooths over these differences, enabling provider-agnostic applications—such as chatbots that use multiple models in the same conversation.

Finally, LangChain provides orchestration features that become essential as applications grow:

Callback-based observability
Consistent interfaces across components
Support for interrupting, resuming, and retrying long-running workflows

These capabilities make LangChain a practical foundation for production-grade LLM applications.

Getting Set Up with LangChain

To follow along with this chapter and the ones ahead, first set up LangChain locally. If you haven’t already, create an OpenAI account as described in the Preface. Then generate an API key from OpenAI’s API Keys page and save it.

Python Setup

Ensure Python is installed (see OS-specific instructions).
(Optional) Install Jupyter to run examples in notebooks:pip install notebook
Install LangChain and required packages:pip install langchain langchain-openai langchain-community pip install langchain-text-splitters langchain-postgres
Set your OpenAI API key in the environment:export OPENAI_API_KEY=your-key
Launch Jupyter:jupyter notebook

You’re now ready to run the Python examples.

Using LLMs in LangChain

To recap, LLMs are the core engine behind most generative AI applications. LangChain provides two simple interfaces for interacting with LLM providers:

LLMs
Chat models

LLM Interface

The LLM interface takes a single string prompt, sends it to the provider, and returns the model’s response.

from langchain_openai.llms import OpenAI

model = OpenAI(model="gpt-3.5-turbo")
model.invoke("The sky is")

from langchain_openai.llms import OpenAI

model = OpenAI(model="gpt-3.5-turbo")
model.invoke("The sky is")

Output

Blue!

Blue!

The model parameter selects the underlying LLM. Most providers offer multiple models with trade-offs in capability, cost, and latency.

Common configuration options include:

temperature: Controls randomness. Lower values (e.g., 0.1) produce predictable output; higher values (e.g., 0.9) increase creativity.
max_tokens: Limits output length and cost. Setting this too low may truncate responses.

Additional parameters vary by provider, so consult the relevant documentation.

Chat Model Interface

Chat models support multi-turn conversations using structured message roles:

System: Instructions for how the model should behave
User: User input
Assistant: Model-generated responses

This interface is especially useful for chatbot-style applications.

from langchain_openai.chat_models import ChatOpenAI
from langchain_core.messages import HumanMessage

model = ChatOpenAI()
prompt = [HumanMessage("What is the capital of France?")]
model.invoke(prompt)

from langchain_openai.chat_models import ChatOpenAI
from langchain_core.messages import HumanMessage

model = ChatOpenAI()
prompt = [HumanMessage("What is the capital of France?")]
model.invoke(prompt)

Output

AIMessage(content='The capital of France is Paris.')

AIMessage(content='The capital of France is Paris.')

Chat models use different message types:

HumanMessage (user role)
AIMessage (assistant role)
SystemMessage (system role)
ChatMessage (custom role)

Using System Instructions

System messages allow you to control model behavior independently of the user’s question.

from langchain_openai.chat_models import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage

model = ChatOpenAI()

system_msg = SystemMessage(
    "You are a helpful assistant that responds to questions with three exclamation marks."
)
human_msg = HumanMessage("What is the capital of France?")

model.invoke([system_msg, human_msg])

from langchain_openai.chat_models import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage

model = ChatOpenAI()

system_msg = SystemMessage(
    "You are a helpful assistant that responds to questions with three exclamation marks."
)
human_msg = HumanMessage("What is the capital of France?")

model.invoke([system_msg, human_msg])

Output

AIMessage('Paris!!!')

AIMessage('Paris!!!')

This shows how system instructions can enforce consistent, predictable behavior across your AI application—even when users don’t explicitly request it.

Making LLM Prompts Reusable

Prompt instructions strongly influence model output. A good prompt provides context and guides the model toward relevant answers. While prompts often start as static strings, real applications require dynamic inputs such as user questions or retrieved context.

From Static to Dynamic Prompts

A detailed prompt may look like this:

Answer the question based on the context below. If the question cannot be answered using the information provided, answer with « I don’t know ».
Context: …
Question: …
Answer:

Hardcoding values limits reuse. LangChain solves this with prompt templates, which define a reusable structure and inject values at runtime.

PromptTemplate

from langchain_core.prompts import PromptTemplate

template = PromptTemplate.from_template(
    """Answer the question based on the context below. If the question cannot be
    answered using the information provided, answer with "I don't know".

    Context: {context}
    Question: {question}
    Answer:"""
)

template.invoke({
    "context": """The most recent advancements in NLP are being driven by Large
    Language Models (LLMs). These models outperform their smaller counterparts
    and have become invaluable for developers.""",
    "question": "Which model providers offer LLMs?"
})

from langchain_core.prompts import PromptTemplate

template = PromptTemplate.from_template(
    """Answer the question based on the context below. If the question cannot be
    answered using the information provided, answer with "I don't know".

    Context: {context}
    Question: {question}
    Answer:"""
)

template.invoke({
    "context": """The most recent advancements in NLP are being driven by Large
    Language Models (LLMs). These models outperform their smaller counterparts
    and have become invaluable for developers.""",
    "question": "Which model providers offer LLMs?"
})

The template defines placeholders ({context}, {question}) that are filled dynamically using Python’s f-string-style syntax.

Using a PromptTemplate with an LLM

Once defined, both the prompt template and model can be reused multiple times.

from langchain_openai.llms import OpenAI
from langchain_core.prompts import PromptTemplate

template = PromptTemplate.from_template(
    """Answer the question based on the context below. If the question cannot be
    answered using the information provided, answer with "I don't know".

    Context: {context}
    Question: {question}
    Answer:"""
)

model = OpenAI()

prompt = template.invoke({
    "context": """The most recent advancements in NLP are being driven by Large
    Language Models (LLMs). Developers can access them through Hugging Face,
    OpenAI, and Cohere.""",
    "question": "Which model providers offer LLMs?"
})

model.invoke(prompt)

from langchain_openai.llms import OpenAI
from langchain_core.prompts import PromptTemplate

template = PromptTemplate.from_template(
    """Answer the question based on the context below. If the question cannot be
    answered using the information provided, answer with "I don't know".

    Context: {context}
    Question: {question}
    Answer:"""
)

model = OpenAI()

prompt = template.invoke({
    "context": """The most recent advancements in NLP are being driven by Large
    Language Models (LLMs). Developers can access them through Hugging Face,
    OpenAI, and Cohere.""",
    "question": "Which model providers offer LLMs?"
})

model.invoke(prompt)

Output

Hugging Face's `transformers` library, OpenAI, and Cohere offer LLMs.

Hugging Face's `transformers` library, OpenAI, and Cohere offer LLMs.

ChatPromptTemplate for Conversational Apps

For chat-based applications, LangChain provides ChatPromptTemplate, which supports message roles such as system and human.

from langchain_core.prompts import ChatPromptTemplate

template = ChatPromptTemplate.from_messages([
    ("system", "Answer the question based on the context below. If the question cannot be answered using the information provided, answer with \"I don't know\"."),
    ("human", "Context: {context}"),
    ("human", "Question: {question}"),
])

template.invoke({
    "context": """The most recent advancements in NLP are being driven by Large
    Language Models (LLMs). Developers can access them through Hugging Face,
    OpenAI, and Cohere.""",
    "question": "Which model providers offer LLMs?"
})

from langchain_core.prompts import ChatPromptTemplate

template = ChatPromptTemplate.from_messages([
    ("system", "Answer the question based on the context below. If the question cannot be answered using the information provided, answer with \"I don't know\"."),
    ("human", "Context: {context}"),
    ("human", "Question: {question}"),
])

template.invoke({
    "context": """The most recent advancements in NLP are being driven by Large
    Language Models (LLMs). Developers can access them through Hugging Face,
    OpenAI, and Cohere.""",
    "question": "Which model providers offer LLMs?"
})

This produces a structured prompt composed of a SystemMessage and multiple HumanMessage objects, while still allowing dynamic inputs.

Using ChatPromptTemplate with a Chat Model

from langchain_openai.chat_models import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

template = ChatPromptTemplate.from_messages([
    ("system", "Answer the question based on the context below. If the question cannot be answered using the information provided, answer with \"I don't know\"."),
    ("human", "Context: {context}"),
    ("human", "Question: {question}"),
])

model = ChatOpenAI()

prompt = template.invoke({
    "context": """The most recent advancements in NLP are being driven by Large
    Language Models (LLMs). Developers can access them through Hugging Face,
    OpenAI, and Cohere.""",
    "question": "Which model providers offer LLMs?"
})

model.invoke(prompt)

from langchain_openai.chat_models import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

template = ChatPromptTemplate.from_messages([
    ("system", "Answer the question based on the context below. If the question cannot be answered using the information provided, answer with \"I don't know\"."),
    ("human", "Context: {context}"),
    ("human", "Question: {question}"),
])

model = ChatOpenAI()

prompt = template.invoke({
    "context": """The most recent advancements in NLP are being driven by Large
    Language Models (LLMs). Developers can access them through Hugging Face,
    OpenAI, and Cohere.""",
    "question": "Which model providers offer LLMs?"
})

model.invoke(prompt)

Output

AIMessage(content="Hugging Face, OpenAI, and Cohere offer LLMs.")

AIMessage(content="Hugging Face, OpenAI, and Cohere offer LLMs.")

By separating prompt structure from runtime values, LangChain templates make prompts reusable, composable, and easier to maintain—an essential pattern for production LLM applications.

Getting Specific Formats out of LLMs

While plain text output is often sufficient, many applications require structured, machine-readable output—such as JSON—to pass LLM results directly into other parts of a system (APIs, databases, or downstream code).

JSON Output

JSON is the most common structured format used with LLMs. To generate reliable JSON, you first define a schema describing the expected output, then instruct the model to follow it.

In LangChain, this is done by pairing a schema with the model using with_structured_output.

from langchain_openai import ChatOpenAI
from langchain_core.pydantic_v1 import BaseModel

class AnswerWithJustification(BaseModel):
    """An answer to the user's question along with justification for the answer."""
    answer: str
    justification: str

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
structured_llm = llm.with_structured_output(AnswerWithJustification)

structured_llm.invoke(
    "What weighs more, a pound of bricks or a pound of feathers?"
)

from langchain_openai import ChatOpenAI
from langchain_core.pydantic_v1 import BaseModel

class AnswerWithJustification(BaseModel):
    """An answer to the user's question along with justification for the answer."""
    answer: str
    justification: str

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
structured_llm = llm.with_structured_output(AnswerWithJustification)

structured_llm.invoke(
    "What weighs more, a pound of bricks or a pound of feathers?"
)

Output

{
  "answer": "They weigh the same",
  "justification": "Both a pound of bricks and a pound of feathers weigh one pound."
}

{
  "answer": "They weigh the same",
  "justification": "Both a pound of bricks and a pound of feathers weigh one pound."
}

How Structured Output Works

The schema serves two purposes:

Guidance: LangChain converts the schema into a JSON Schema and sends it to the LLM (using function calling or prompting, depending on the model).
Validation: The model’s response is validated against the schema before being returned, ensuring the output matches the expected structure exactly.

By enforcing schemas, LangChain makes LLM outputs predictable, safe to parse, and easy to integrate into larger applications—an essential step for production-ready systems.

Other Machine-Readable Formats with Output Parsers

Beyond JSON, LLMs can generate other structured formats such as CSV, XML, or lists. In LangChain, this is handled using output parsers—components that help turn raw model text into structured data.

Output parsers serve two main purposes:

Providing Format Instructions

Parsers can supply formatting guidance that is injected into the prompt, helping the LLM produce output in a form the parser understands.

Validating and Parsing Output

Parsers transform the model’s text output into a structured representation (for example, a list or table). This may include cleaning extra text, handling partial output, and validating values.

Example: Comma-Separated List

from langchain_core.output_parsers import CommaSeparatedListOutputParser

parser = CommaSeparatedListOutputParser()
items = parser.invoke("apple, banana, cherry")

from langchain_core.output_parsers import CommaSeparatedListOutputParser

parser = CommaSeparatedListOutputParser()
items = parser.invoke("apple, banana, cherry")

Output

['apple', 'banana', 'cherry']

['apple', 'banana', 'cherry']

LangChain includes many built-in output parsers for common formats, including CSV, XML, and others. In the next section, these parsers are combined with prompts and models to produce fully structured, end-to-end LLM workflows.

Assembling the Many Pieces of an LLM Application

So far, you’ve seen the core building blocks of LangChain. The next question is how to combine them effectively into a full LLM application.

The Runnable Interface

All LangChain components—models, prompt templates, and output parsers—share a common Runnable interface. This is why you’ve seen the same invoke() pattern throughout the examples.

Every runnable component supports:

invoke: transform a single input into an output
batch: efficiently process multiple inputs
stream: stream output incrementally as it’s generated

Additional features include built-in retries, fallbacks, schema support, runtime configuration, and asyncio equivalents in Python.

Because the interface is consistent, once you learn it for one component, it applies to all others.

from langchain_openai.llms import ChatOpenAI

model = ChatOpenAI()

# Single input → single output
completion = model.invoke("Hi there!")
# "Hi!"

# Multiple inputs → multiple outputs
completions = model.batch(["Hi there!", "Bye!"])
# ["Hi!", "See you!"]

# Streamed output
for token in model.stream("Bye!"):
    print(token)
# Good
# bye
# !

from langchain_openai.llms import ChatOpenAI

model = ChatOpenAI()

# Single input → single output
completion = model.invoke("Hi there!")
# "Hi!"

# Multiple inputs → multiple outputs
completions = model.batch(["Hi there!", "Bye!"])
# ["Hi!", "See you!"]

# Streamed output
for token in model.stream("Bye!"):
    print(token)
# Good
# bye
# !

In practice:

invoke() returns one complete result
batch() returns a list of results
stream() yields partial outputs as they become available
(or a single chunk if streaming isn’t supported)

Ways to Combine Components

LangChain supports two composition styles:

Imperative

Call components directly, such as model.invoke(...). This gives full control and uses standard Python.

Declarative

Use LangChain Expression Language (LCEL) to declaratively compose components. LCEL handles parallelism, streaming, and async execution automatically (covered in a later section).

Table 1 summarizes the differences between imperative and declarative composition:

Feature	Imperative	Declarative (LCEL)
Syntax	Full Python	LCEL
Parallel execution	Threads / coroutines	Automatic
Streaming	Manual (`yield`)	Automatic
Async execution	`async` functions	Automatic

With the Runnable interface as a foundation, LangChain lets you assemble prompts, models, and parsers into flexible, production-ready LLM pipelines—either step by step or declaratively.

Imperative Composition

Imperative composition is simply writing regular Python code—functions and classes—that combine prompts, models, and parsers. It offers full control and flexibility, using patterns you already know.

Basic Example

Below is a minimal chatbot built by combining a prompt template and a chat model. The @chain decorator wraps the function in LangChain’s Runnable interface, enabling invoke, batch, and stream.

from langchain_openai.chat_models import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import chain

# Building blocks
template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{question}"),
])

model = ChatOpenAI()

# Combine components in a function
@chain
def chatbot(values):
    prompt = template.invoke(values)
    return model.invoke(prompt)

# Use the chatbot
chatbot.invoke({"question": "Which model providers offer LLMs?"})

from langchain_openai.chat_models import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import chain

# Building blocks
template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{question}"),
])

model = ChatOpenAI()

# Combine components in a function
@chain
def chatbot(values):
    prompt = template.invoke(values)
    return model.invoke(prompt)

# Use the chatbot
chatbot.invoke({"question": "Which model providers offer LLMs?"})

Output

AIMessage(content="Hugging Face, OpenAI, and Cohere offer LLMs.")

AIMessage(content="Hugging Face, OpenAI, and Cohere offer LLMs.")

This is a complete chatbot implemented with familiar Python syntax, making it easy to insert custom logic.

Adding Streaming Support

To stream output, simply yield tokens from the model. LangChain automatically exposes a stream() method.

@chain
def chatbot(values):
    prompt = template.invoke(values)
    for token in model.stream(prompt):
        yield token

for part in chatbot.stream({
    "question": "Which model providers offer LLMs?"
}):
    print(part)

@chain
def chatbot(values):
    prompt = template.invoke(values)
    for token in model.stream(prompt):
        yield token

for part in chatbot.stream({
    "question": "Which model providers offer LLMs?"
}):
    print(part)

Output (streamed)

AIMessageChunk(content="Hugging")
AIMessageChunk(content=" Face's")
AIMessageChunk(content=" transformers")
...

AIMessageChunk(content="Hugging")
AIMessageChunk(content=" Face's")
AIMessageChunk(content=" transformers")
...

Asynchronous Execution (Python)

For async workflows, define the function with async and use the async variants of invoke.

@chain
async def chatbot(values):
    prompt = await template.ainvoke(values)
    return await model.ainvoke(prompt)

await chatbot.ainvoke({
    "question": "Which model providers offer LLMs?"
})

@chain
async def chatbot(values):
    prompt = await template.ainvoke(values)
    return await model.ainvoke(prompt)

await chatbot.ainvoke({
    "question": "Which model providers offer LLMs?"
})

Imperative composition gives you maximum flexibility, but streaming and async behavior must be implemented manually. In the next section, declarative composition shows how LangChain can handle these concerns automatically.

Declarative Composition

LCEL (LangChain Expression Language) is a declarative way to compose LangChain components. Instead of writing glue code yourself, you describe how components connect, and LangChain compiles this into an optimized execution plan with automatic streaming, async support, tracing, and parallelization.

Basic LCEL Example

This example is functionally identical to the imperative chatbot, but uses the | operator to compose components.

from langchain_openai.chat_models import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

# Building blocks
template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{question}"),
])

model = ChatOpenAI()

# Declarative composition
chatbot = template | model

# Use it
chatbot.invoke({"question": "Which model providers offer LLMs?"})

from langchain_openai.chat_models import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

# Building blocks
template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{question}"),
])

model = ChatOpenAI()

# Declarative composition
chatbot = template | model

# Use it
chatbot.invoke({"question": "Which model providers offer LLMs?"})

Output

AIMessage(content="Hugging Face, OpenAI, and Cohere offer LLMs.")

AIMessage(content="Hugging Face, OpenAI, and Cohere offer LLMs.")

Crucially, the usage is identical to imperative composition: you still call
invoke, batch, or stream.

Streaming with LCEL (Automatic)

Unlike imperative composition, no extra code is needed to enable streaming.

for part in chatbot.stream({
    "question": "Which model providers offer LLMs?"
}):
    print(part)

for part in chatbot.stream({
    "question": "Which model providers offer LLMs?"
}):
    print(part)

Output (streamed)

AIMessageChunk(content="Hugging")
AIMessageChunk(content=" Face's")
AIMessageChunk(content=" transformers")
...

AIMessageChunk(content="Hugging")
AIMessageChunk(content=" Face's")
AIMessageChunk(content=" transformers")
...

Async Execution with LCEL (Python)

Async support also works out of the box.

await chatbot.ainvoke({
    "question": "Which model providers offer LLMs?"
})

await chatbot.ainvoke({
    "question": "Which model providers offer LLMs?"
})

No changes to the composition are required—LangChain handles it automatically.

Summary

In this chapter, you learned the core building blocks for LLM applications in LangChain:

Models to generate predictions
Prompts to guide model behavior
Output parsers to structure results

All LangChain components share a common interface (invoke, stream, batch), making them easy to combine.

You can assemble applications in two ways:

Imperatively, using standard Python for maximum control and custom logic
Declaratively, using LCEL for concise pipelines with automatic streaming and async support

Imperative composition is best when you need fine-grained control. Declarative composition shines when you want clean, composable pipelines with minimal boilerplate.

In Chapter 2, you’ll learn how to inject external data as context—enabling LLM applications that can truly chat with your data.

LLM Fundamentals with LangChain

Table of Contents

Why LangChain?

Getting Set Up with LangChain

Using LLMs in LangChain

Making LLM Prompts Reusable

From Static to Dynamic Prompts

PromptTemplate

Using a PromptTemplate with an LLM

ChatPromptTemplate for Conversational Apps

Using ChatPromptTemplate with a Chat Model

Getting Specific Formats out of LLMs

JSON Output

How Structured Output Works

Other Machine-Readable Formats with Output Parsers

Assembling the Many Pieces of an LLM Application

The Runnable Interface

Imperative Composition

Declarative Composition

Summary