Modal’s cloud GPU infrastructure makes running your own LLMs easy!

  1. Install the dependencies your LLM inference code needs
  2. Attach a GPU to your Function with gpu="h100"
  3. Run your LLM inference code inside the Function (a minimal sketch follows below)
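
Here is what those three steps can look like in a single file. This is a minimal sketch assuming a small Hugging Face model served with the transformers library; the app name, model choice, and function names are illustrative, not the demo's actual source.

```python
# inference.py -- an illustrative sketch, not the demo's actual source
import modal

app = modal.App("llm-inference-demo")

# Step 1: install the dependencies your inference code needs into the container image.
# (transformers + torch + accelerate is one assumed minimal stack.)
image = modal.Image.debian_slim().pip_install("transformers", "torch", "accelerate")


# Step 2: attach a GPU to the Function with gpu="h100".
@app.function(image=image, gpu="h100")
def generate(prompt: str) -> str:
    # Step 3: run LLM inference inside the Function.
    from transformers import pipeline

    pipe = pipeline(
        "text-generation",
        model="Qwen/Qwen2.5-0.5B-Instruct",  # illustrative model choice
        device="cuda",
    )
    return pipe(prompt, max_new_tokens=200)[0]["generated_text"]


@app.local_entrypoint()
def main():
    # `modal run inference.py` calls this locally; generate() runs on an H100 in Modal's cloud.
    print(generate.remote("Explain what this Modal app does."))
```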

This demo shows an LLM explaining the demo's own code.

Click Run to see it in action!

If you want to see what a more sophisticated LLM inference server on Modal looks like, check out this example.

You can also explore our gallery of other examples for inference, training, batch jobs, sandboxed code execution, and more!


To run the demo from your own terminal:

  modal run inference.py
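
The modal run command runs the app ephemerally: it builds the container image if needed, starts a GPU-backed container for the Function in Modal's cloud, and streams the output back to your terminal.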
