flama

vortico/flama
★ 286 stars · Python · AI/LLM · Updated today
The production framework for Predictive and Generative AI. Serve any model as an API in one line, with OpenAI/Anthropic/Ollama-compatible endpoints, a built-in chat UI, and native MCP.

Quick Install

Copy the config for your editor. Some servers may need additional setup — check the README.

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "flama": {
      "command": "uvx",
      "args": [
        "flama"
      ]
    }
  }
}

Or install with pip: pip install flama
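
If claude_desktop_config.json already lists other servers, pasting the snippet above over the file would drop them. A minimal sketch of merging the flama entry instead is shown here. The helper name add_mcp_server and its parameters are hypothetical and are not part of flama or Claude Desktop; the sketch only assumes the file follows the mcpServers layout shown above.

```python
import json
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, command: str, args: list[str]) -> dict:
    """Merge one MCP server entry into a config file (hypothetical helper)."""
    # Start from the existing config when the file exists and is non-empty.
    text = config_path.read_text(encoding="utf-8") if config_path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    # setdefault keeps any servers that are already registered.
    servers = config.setdefault("mcpServers", {})
    # Add (or overwrite) only the entry with the given name.
    servers[name] = {"command": command, "args": args}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling add_mcp_server(Path("claude_desktop_config.json"), "flama", "uvx", ["flama"]) would write the same entry as the snippet above. The path is a placeholder; the real location of the file depends on your operating system.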

README Excerpt

Flama (https://flama.dev)
Light up your models 🔥

Topics

anthropic, asgi, chatbot, domain-driven-design, generative-ai, inference, llm, llm-serving, machine-learning, mcp, mlops, mlx, model-context-protocol, model-serving, ollama