Understanding Messages in LangChain

Messages are how you talk to AI models in LangChain. They're objects that carry text, images, audio, and all the metadata needed for a conversation.

What's in a Message?

Every message has three parts:

Role - Who's talking (system, user, AI, or tool)
Content - The actual text, images, or files
Metadata - Extra info like token counts and message IDs
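To make the three parts concrete, here is a minimal sketch of a message as a plain Python dataclass. `SketchMessage`, its field names, and the ID value are hypothetical stand-ins picked for illustration. LangChain's real message classes carry more than this.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Union


@dataclass
class SketchMessage:
    """Hypothetical stand-in for illustration, not a LangChain class."""

    role: str                                    # who is talking
    content: Union[str, List[Dict[str, Any]]]    # text, or a list of content blocks
    metadata: Dict[str, Any] = field(default_factory=dict)  # extra info


msg = SketchMessage(
    role="user",
    content="What is machine learning?",
    metadata={"id": "msg_001"},  # made-up ID for illustration
)
print(msg.role, "->", msg.content)
```

The point is only the shape: one role, one payload, and a bag of extra information that travels with them.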

The Four Message Types

System Messages

System messages set the rules. They tell the AI how to behave before the conversation starts.

from langchain.messages import SystemMessage

system_msg = SystemMessage("You are a helpful coding assistant.")

Think of it like giving someone instructions before they start a task.

Human Messages

These are your inputs - what you ask the model.

from langchain.messages import HumanMessage

# Simple text
human_msg = HumanMessage("What is machine learning?")

# Or pass multimodal content
human_msg = HumanMessage(content=[
    {"type": "text", "text": "What's in this image?"},
    {"type": "image", "url": "https://example.com/photo.jpg"}
])

AI Messages

The model's responses come back as AI messages. They include the text, tool calls, and usage data.

# Assumes `model` is a chat model you have already created
response = model.invoke("Explain AI")
print(type(response))  # AIMessage

# Access the response
print(response.content)
print(response.usage_metadata)  # Token counts

Tool Messages

When the AI calls a tool (like a web search or an API), a tool message carries the result back to it. The tool_call_id ties each result to the call that asked for it.

from langchain.messages import ToolMessage

# After AI makes a tool call
tool_message = ToolMessage(
    content="Sunny, 72°F",
    tool_call_id="call_123"
)
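Here is a rough sketch of the round trip behind that `tool_call_id`, using plain dictionaries so it runs without a model. The `get_weather` function, the `TOOLS` registry, and `run_tool_calls` are hypothetical helpers, and the tool call values are made up. The name/args/id shape is assumed to mirror what an AI message reports in its tool calls.

```python
def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would call a weather API."""
    return f"Sunny and 72F in {city}"


# Hypothetical registry mapping tool names to Python functions.
TOOLS = {"get_weather": get_weather}

# A tool call as the model might request it (values are made up).
tool_calls = [
    {"name": "get_weather", "args": {"city": "Paris"}, "id": "call_123"},
]


def run_tool_calls(calls):
    """Run each requested tool and build one tool-role message per call."""
    results = []
    for call in calls:
        output = TOOLS[call["name"]](**call["args"])
        results.append({
            "role": "tool",
            "content": output,
            "tool_call_id": call["id"],  # echo the ID of the call being answered
        })
    return results


tool_messages = run_tool_calls(tool_calls)
print(tool_messages[0])
```

Echoing the ID is what lets the model pair each result with its request when several tools are called in one turn.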

Three Ways to Send Messages

You can invoke models with messages in three formats:

1. Plain strings (simplest)

response = model.invoke("Write a haiku")

2. Message objects (for conversations)

from langchain.messages import SystemMessage, HumanMessage

messages = [
    SystemMessage("You are a poetry expert"),
    HumanMessage("Write a haiku about spring")
]
response = model.invoke(messages)

3. Dictionary format (OpenAI-style)

messages = [
    {"role": "system", "content": "You are helpful"},
    {"role": "user", "content": "Hello"}
]
response = model.invoke(messages)
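One way to see that the three formats describe the same thing is to reduce them all to role/content dictionaries. The sketch below does that with a hypothetical `normalize_input` helper. It is not LangChain's conversion code, and the `SimpleNamespace` object is only a stand-in for a message object, assumed to expose `.type` and `.content`.

```python
from types import SimpleNamespace

# Assumed mapping from message-object types to dictionary-style roles.
ROLE_BY_TYPE = {"system": "system", "human": "user", "ai": "assistant", "tool": "tool"}


def normalize_input(model_input):
    """Hypothetical helper: coerce any of the three formats to role/content dicts."""
    if isinstance(model_input, str):
        # A plain string is treated as a single user message.
        return [{"role": "user", "content": model_input}]
    normalized = []
    for item in model_input:
        if isinstance(item, dict):
            normalized.append({"role": item["role"], "content": item["content"]})
        else:
            # Message-like object with .type and .content attributes.
            normalized.append({"role": ROLE_BY_TYPE[item.type], "content": item.content})
    return normalized


as_string = normalize_input("Hello")
as_dicts = normalize_input([{"role": "user", "content": "Hello"}])
as_objects = normalize_input([SimpleNamespace(type="human", content="Hello")])
print(as_string == as_dicts == as_objects)
```

Strings are the shortcut for one-off prompts. The other two forms matter once you need roles or history.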

Multimodal Content

Messages can carry more than text. You can send images, PDFs, audio, and video, as long as the model you're calling accepts that kind of input.

# Image from URL
message = HumanMessage(content=[
    {"type": "text", "text": "Describe this image"},
    {"type": "image", "url": "https://example.com/image.jpg"}
])

# PDF document
message = HumanMessage(content=[
    {"type": "text", "text": "Summarize this document"},
    {"type": "file", "url": "https://example.com/doc.pdf"}
])

# Audio file
message = HumanMessage(content=[
    {"type": "text", "text": "Transcribe this audio"},
    {"type": "audio", "base64": "...encoded_audio...", "mime_type": "audio/wav"}
])
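The audio example above leaves the base64 string elided. As a sketch of where such a string comes from, the hypothetical `make_base64_block` helper below encodes raw bytes into a block with the same keys. The placeholder bytes are not a valid WAV file. In practice you would read them from disk.

```python
import base64


def make_base64_block(block_type: str, data: bytes, mime_type: str) -> dict:
    """Hypothetical helper: wrap raw bytes as a base64 content block."""
    return {
        "type": block_type,
        "base64": base64.b64encode(data).decode("ascii"),
        "mime_type": mime_type,
    }


# Placeholder bytes standing in for the contents of a real audio file.
fake_audio_bytes = b"not-a-real-wav-file"

audio_block = make_base64_block("audio", fake_audio_bytes, "audio/wav")

# The block slots into a content list next to the text block.
content = [
    {"type": "text", "text": "Transcribe this audio"},
    audio_block,
]
print(audio_block["mime_type"], len(audio_block["base64"]))
```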

Building Conversations

Conversations are just lists of messages that grow over time.

from langchain.messages import SystemMessage, HumanMessage, AIMessage

# Start a conversation
messages = [
    SystemMessage("You are a helpful assistant"),
    HumanMessage("What's 2+2?")
]

response = model.invoke(messages)

# Add AI response to history
messages.append(response)

# Continue the conversation
messages.append(HumanMessage("What's that times 3?"))
response = model.invoke(messages)
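The append-invoke-append pattern above is easy to wrap in a small function. The sketch below uses plain dictionaries and a hypothetical `EchoModel` stand-in so it runs offline. `chat_turn` is an illustrative helper, not part of LangChain.

```python
class EchoModel:
    """Hypothetical stand-in for a chat model so the loop runs offline."""

    def invoke(self, messages):
        last_user_text = messages[-1]["content"]
        return {"role": "assistant", "content": f"You said: {last_user_text}"}


def chat_turn(model, history, user_text):
    """Append the user's message, call the model, then append and return its reply."""
    history.append({"role": "user", "content": user_text})
    reply = model.invoke(history)
    history.append(reply)
    return reply


history = [{"role": "system", "content": "You are a helpful assistant"}]
model = EchoModel()

chat_turn(model, history, "What's 2+2?")
chat_turn(model, history, "What's that times 3?")

# The history now holds every turn, in order.
print([m["role"] for m in history])
```

Because the whole list is sent on every call, the model sees the earlier turns. That is what lets a follow-up like "that" make sense.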

Token Usage and Metadata

AI messages include metadata such as token counts, when the provider reports them:

response = model.invoke("Hello!")

# Check token usage
print(response.usage_metadata)
# {'input_tokens': 8, 'output_tokens': 304, 'total_tokens': 312}
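If you want a running total across several calls, you can sum these dictionaries yourself. The sketch below assumes usage data shaped like the example above. `total_usage` is a hypothetical helper, and the second usage entry is made up. It skips `None` on the assumption that some responses may come back without usage data.

```python
def total_usage(usages):
    """Hypothetical helper: sum token counts across usage dictionaries."""
    totals = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
    for usage in usages:
        if usage is None:  # response carried no usage data
            continue
        for key in totals:
            totals[key] += usage.get(key, 0)
    return totals


usages = [
    {"input_tokens": 8, "output_tokens": 304, "total_tokens": 312},
    {"input_tokens": 20, "output_tokens": 50, "total_tokens": 70},  # made-up values
    None,
]
print(total_usage(usages))
# {'input_tokens': 28, 'output_tokens': 354, 'total_tokens': 382}
```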

Streaming Responses

You can stream responses as they're generated:

for chunk in model.stream("Tell me a story"):
    print(chunk.content, end="", flush=True)

Each chunk holds one piece of the response. To get the full message, add the chunks together with + as they arrive.
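To picture how pieces become a full message, here is a small offline sketch. `SketchChunk` and `fake_stream` are hypothetical stand-ins for a model's stream. The only assumption carried over is that chunks can be joined with `+`.

```python
from dataclasses import dataclass


@dataclass
class SketchChunk:
    """Hypothetical stand-in for a streamed chunk."""

    content: str

    def __add__(self, other):
        # Joining two chunks concatenates their text.
        return SketchChunk(self.content + other.content)


def fake_stream():
    """Yield a response in pieces, the way a streaming model would."""
    for piece in ["Once ", "upon ", "a time."]:
        yield SketchChunk(piece)


full = None
for chunk in fake_stream():
    print(chunk.content, end="", flush=True)          # show text as it arrives
    full = chunk if full is None else full + chunk    # accumulate the whole message
print()
```

After the loop, `full` holds the complete text, so you can append it to your history like any other response.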

Quick Recap

Messages are objects that carry conversation data to and from AI models
Four types: System (instructions), Human (input), AI (output), Tool (function results)
Send messages as strings, objects, or dictionaries
Multimodal support lets you send images, audio, PDFs, and video
Build conversations by appending messages to a list
Track usage with metadata in AI responses

Messages are the foundation of everything you do with LangChain. Master them and you'll build better AI apps.