Build a coding agent with Modal Sandboxes and LangGraph

This example demonstrates how to build an LLM coding “agent” that can generate and execute Python code, using documentation from the web to inform its approach.

Naturally, we use the agent to generate code that runs language models.

The agent is built with LangGraph, a library for building directed graphs of computation that is popular with AI agent developers, and uses models from the OpenAI API.

Setup

import modal

from .src import edges, nodes, retrieval
from .src.common import COLOR, PYTHON_VERSION, image

You will need two Modal Secrets to run this example: one to access the OpenAI API and another to access the LangSmith API for logging the agent’s behavior.

To create them, head to the Secrets dashboard, select “Create new secret”, and use the provided templates for OpenAI and LangSmith.
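If you prefer to define Secrets in code, for instance during local development, you can also build them from a dictionary with modal.Secret.from_dict. Here’s a minimal sketch that pulls values from your local environment; the exact variable names your OpenAI and LangSmith clients read are assumptions here.

import os

# sketch: inline Secrets built from local env vars (variable names are assumptions)
inline_secrets = [
    modal.Secret.from_dict({"OPENAI_API_KEY": os.environ["OPENAI_API_KEY"]}),
    modal.Secret.from_dict({"LANGCHAIN_API_KEY": os.environ["LANGCHAIN_API_KEY"]}),
]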

app = modal.App(
    "example-code-langchain",
    image=image,
    secrets=[
        modal.Secret.from_name("openai-secret"),
        modal.Secret.from_name("my-langsmith-secret"),
    ],
)

Creating a Sandbox

We execute the agent’s code in a Modal Sandbox, which lets us run arbitrary, LLM-generated code in a safe, isolated environment. In this example, the agent writes code that uses the transformers library to generate text with pre-trained models, so let’s create a Sandbox with the necessary dependencies installed.

def create_sandbox(app) -> modal.Sandbox:
    # Change this image (and the retrieval logic in the retrieval module)
    # if you want the agent to give coding advice on other libraries!
    agent_image = modal.Image.debian_slim(
        python_version=PYTHON_VERSION
    ).pip_install(
        "torch==2.5.0",
        "transformers==4.46.0",
    )

    return modal.Sandbox.create(
        image=agent_image,
        timeout=60 * 10,  # 10 minutes
        app=app,
        # Modal sandboxes support GPUs!
        gpu="T4",
        # you can also pass secrets here -- note that the main app's secrets are not shared
    )
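Note the comment above: a Sandbox does not inherit the main App’s Secrets. If the generated code needed credentials of its own (say, a Hugging Face token for downloading gated models), you would attach them explicitly. A sketch, with a hypothetical secret name:

# hypothetical: give the sandboxed code its own credentials
sb = modal.Sandbox.create(
    image=agent_image,
    app=app,
    secrets=[modal.Secret.from_name("huggingface-secret")],  # not a real Secret in this example
)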

We also need a way to run our code in the Sandbox. For this, we’ll write a simple wrapper around the Modal Sandbox exec method. We use exec because it runs code in the existing container rather than spinning up a new one, so we can reuse the same container across multiple runs and preserve state between them.

def run(code: str, sb: modal.Sandbox) -> tuple[str, str]:
    print(
        f"{COLOR['HEADER']}📦: Running in sandbox{COLOR['ENDC']}",
        f"{COLOR['GREEN']}{code}{COLOR['ENDC']}",
        sep="\n",
    )

    exc = sb.exec("python", "-c", code)
    exc.wait()

    stdout = exc.stdout.read()
    stderr = exc.stderr.read()

    if exc.returncode != 0:
        print(
            f"{COLOR['HEADER']}📦: Failed with exitcode {sb.returncode}{COLOR['ENDC']}"
        )

    return stdout, stderr
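To see that container reuse in action, here’s a minimal sketch using the helpers defined above: a file written by one exec call is visible to the next, because both run in the same container.

sb = create_sandbox(app)

# state written by one exec call persists in the container's filesystem...
run("open('/tmp/state.txt', 'w').write('hello')", sb)

# ...and is visible to the next call
stdout, _ = run("print(open('/tmp/state.txt').read())", sb)
assert stdout.strip() == "hello"

sb.terminate()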

Constructing the agent’s graph

Now that we have the sandbox to execute code in, we can construct our agent’s graph. Our graph is defined in the edges and nodes modules associated with this example. Nodes are actions that change the state. Edges are transitions between nodes.

The idea is simple: we start at the node generate, which invokes the LLM to generate code based on the documentation. The generated code is executed (in the Sandbox) as part of an edge called check_code_execution and then the outputs are passed to the LLM for evaluation (the evaluate_execution node). If the LLM determines that the code has executed correctly — which might mean that the code raised an exception! — we pass along the decide_to_finish edge and finish. If not, we head back to generate to produce a new attempt.
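The real wiring lives in the edges module, but for orientation, here’s a sketch of what such a conditional edge looks like in LangGraph. The node names match the description above; the state fields and the retry budget are illustrative.

def decide_to_finish(state: dict) -> str:
    # illustrative condition: finish once evaluation passes or we run out of retries
    keys = state["keys"]
    if keys.get("error") == "None" or keys.get("iterations", 0) >= 3:
        return "finish"
    return "generate"

# hypothetical wiring, in the style of what edges.enrich does:
# graph.add_conditional_edges(
#     "evaluate_execution",
#     decide_to_finish,
#     {"finish": "finish", "generate": "generate"},
# )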

def construct_graph(sandbox: modal.Sandbox, debug: bool = False):
    from langgraph.graph import StateGraph

    from .src.common import GraphState

    # Crawl the transformers documentation to inform our code generation
    context = retrieval.retrieve_docs(debug=debug)

    graph = StateGraph(GraphState)

    # Attach our nodes to the graph
    graph_nodes = nodes.Nodes(context, sandbox, run, debug=debug)
    for key, value in graph_nodes.node_map.items():
        graph.add_node(key, value)

    # Construct the graph by adding edges
    graph = edges.enrich(graph)

    # Set the starting and ending nodes of the graph
    graph.set_entry_point(key="generate")
    graph.set_finish_point(key="finish")

    return graph

We now set up the graph and compile it. See the src module for details on the content of the graph and the nodes we’ve defined.

DEFAULT_QUESTION = "How do I generate Python code using a pre-trained model from the transformers library?"


@app.function()
def go(
    question: str = DEFAULT_QUESTION,
    debug: bool = False,
):
    """Compiles the Python code generation agent graph and runs it, returning the result."""
    sb = create_sandbox(app)

    graph = construct_graph(sb, debug=debug)
    runnable = graph.compile()
    result = runnable.invoke(
        {"keys": {"question": question, "iterations": 0}},
        config={"recursion_limit": 50},
    )

    sb.terminate()

    return result["keys"]["response"]
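If you’d rather watch the agent work node by node than wait for the final answer, compiled LangGraph graphs also support streaming intermediate state. A sketch, using the same inputs as go above:

# sketch: stream node-by-node state updates instead of calling invoke
for step in runnable.stream(
    {"keys": {"question": question, "iterations": 0}},
    config={"recursion_limit": 50},
):
    print(step)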

Running the Graph

Now let’s call the agent from the command line!

We define a local_entrypoint that runs locally and triggers execution on Modal.

You can invoke it by executing the following command from a folder that contains the codelangchain directory from our examples repo:

modal run codelangchain.agent --question "How do I run a pre-trained model from the transformers library?"

@app.local_entrypoint()
def main(
    question: str = DEFAULT_QUESTION,
    debug: bool = False,
):
    """Sends a question to the Python code generation agent.

    Switch to debug mode for shorter context and smaller model."""
    if debug:
        if question == DEFAULT_QUESTION:
            question = "hi there, how are you?"

    print(go.remote(question, debug=debug))

If things are working properly, you should see output like the following:

$ modal run agent.py --question "generate some cool output with transformers"
---DECISION: FINISH---
---FINISHING---
To generate some cool output using transformers, we can use a pre-trained language model from the Hugging Face Transformers library. In this example, we'll use the GPT-2 model to generate text based on a given prompt. The GPT-2 model is a popular choice for text generation tasks due to its ability to produce coherent and contextually relevant text. We'll use the pipeline API from the Transformers library, which simplifies the process of using pre-trained models for various tasks, including text generation.

from transformers import pipeline
# Initialize the text generation pipeline with the GPT-2 model
generator = pipeline('text-generation', model='gpt2')

# Define a prompt for the model to generate text from
prompt = "Once upon a time in a land far, far away"

# Generate text using the model
output = generator(prompt, max_length=50, num_return_sequences=1)

# Print the generated text
print(output[0]['generated_text'])

Result of code execution:
Once upon a time in a land far, far away, and still inhabited even after all the human race, there would be one God: a perfect universal God who has always been and will ever be worshipped. All His acts and deeds are immutable,