Let your users bring their own AI. No API keys to manage, no usage costs to absorb. Works in web apps, desktop apps, CLI tools — anything.
plug-my-ai is a local daemon that runs on your user's machine. It exposes an
OpenAI-compatible API on localhost:21110 that routes requests to
whatever AI provider the user has configured — their Claude Code subscription,
API keys, local models, etc.
Your app never touches the user's credentials or subscription. The daemon handles auth, routing, and provider abstraction. You just make standard chat completion requests.
Your users need plug-my-ai installed. Point them to the one-liner:
curl -fsSL https://get.plugmy.ai/install.sh | sh
Once installed, the daemon runs in the background with a menu bar icon (macOS) or system tray (Linux).
Your app communicates with it over HTTP on localhost:21110.
Before starting the pairing flow, check if the daemon is running:
GET http://localhost:21110/v1/status
// Response:
{
  "status": "ok",
  "version": "0.1.0",
  "providers": ["claude-code"]
}
If the request fails (connection refused), the user hasn't installed plug-my-ai or it's not running. Show them the install prompt.
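That detection step can be sketched in Python using only the standard library (the endpoint and port are from the docs above; the timeout value is an arbitrary choice):

```python
import json
import urllib.error
import urllib.request

def daemon_status(base_url="http://localhost:21110", timeout=2.0):
    """Return the parsed /v1/status payload, or None if the daemon
    is unreachable (not installed, or not running)."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/status", timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError):
        return None

status = daemon_status()
if status is None:
    print("plug-my-ai not detected - show install prompt")
else:
    print(f"daemon {status['version']}, providers: {status['providers']}")
```

A connection-refused error resolves almost instantly, so this check is cheap enough to run at app startup.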
Pairing is how your app gets permission to use the user's AI. It's a one-time process that generates an app-specific token.
POST http://localhost:21110/v1/connect
Content-Type: application/json
{
  "app_name": "My App",
  "app_url": "https://myapp.com",
  "app_icon": "https://myapp.com/icon.png"  // optional
}
// Response:
{
  "request_id": "a1b2c3d4",
  "status": "pending",
  "expires_at": "2025-01-15T10:05:00Z"
}
This opens a browser window on the user's machine showing your app's name and URL. The user clicks Allow or Deny.
GET http://localhost:21110/v1/connect/{request_id}
// While pending:
{ "status": "pending" }
// When approved:
{
  "status": "approved",
  "token": "pma_abc123def456..."
}
// When denied:
{ "status": "denied" }
Poll every 1–2 seconds. Requests expire after 5 minutes. Once approved, store the token securely — you'll use it for all subsequent API calls.
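The poll loop can be wrapped like this (a Python sketch; `fetch_status` stands in for the GET request above so the timing logic stays visible and testable):

```python
import time

def wait_for_approval(fetch_status, interval=1.5, timeout=300, sleep=time.sleep):
    """Poll a pairing request until it resolves.

    fetch_status: callable returning the /v1/connect/{request_id} JSON.
    Returns the token on approval; raises on denial or after the
    5-minute expiry window.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        data = fetch_status()
        if data["status"] == "approved":
            return data["token"]
        if data["status"] == "denied":
            raise RuntimeError("user denied the pairing request")
        sleep(interval)  # 1-2 seconds, per the polling guidance
    raise TimeoutError("pairing request expired")
```

Injecting `sleep` as a parameter is just for testability; in production the default `time.sleep` is fine.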
Handle 401 Unauthorized responses gracefully: the token may have been revoked.
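One shape for that handling, sketched in Python (the `clear_token` callback and `TokenRevoked` exception are illustrative names, not part of the plug-my-ai API):

```python
class TokenRevoked(Exception):
    """Raised when the daemon rejects the stored token (HTTP 401)."""

def check_auth(status_code, clear_token):
    """On 401, forget the revoked pma_ token and signal a re-pair.

    status_code: HTTP status of a plug-my-ai response.
    clear_token: callback that wipes the token from app storage.
    """
    if status_code == 401:
        clear_token()
        raise TokenRevoked("token revoked - run the pairing flow again")
    return status_code
```

Catching `TokenRevoked` at the top of your request path is a convenient place to show the pairing UI again.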
Once paired, use the standard OpenAI chat completions format. If you're already using
the OpenAI SDK, you can just point it at localhost:21110.
POST http://localhost:21110/v1/chat/completions
Authorization: Bearer pma_abc123...
Content-Type: application/json
{
  "model": "claude",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Explain quantum computing briefly." }
  ],
  "stream": true
}
// Response: Server-Sent Events (SSE)
data: {"choices":[{"delta":{"content":"Quantum"},"index":0}]}
data: {"choices":[{"delta":{"content":" computing"},"index":0}]}
...
data: [DONE]
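If you're not using an SDK, the SSE stream can be decoded by hand; a Python sketch that assumes the chunk shape shown above:

```python
import json

def sse_deltas(lines):
    """Yield content fragments from an OpenAI-style SSE stream.

    lines: iterable of decoded text lines, e.g. from a streaming
    HTTP response. Skips blank keep-alive lines and stops at the
    [DONE] sentinel.
    """
    for line in lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload.strip() == "[DONE]":
            return
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            yield delta["content"]
```

Concatenating the yielded fragments reconstructs the full response text.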
GET http://localhost:21110/v1/models
Authorization: Bearer pma_abc123...
// Response:
{
  "data": [
    { "id": "claude", "object": "model", "owned_by": "claude-code" }
  ]
}
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'http://localhost:21110/v1',
  apiKey: storedPmaToken, // the pma_xxx token from pairing
});

const stream = await client.chat.completions.create({
  model: 'claude',
  messages: [{ role: 'user', content: 'Hello!' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
plug-my-ai isn't just for web apps. Any application that can make HTTP requests can integrate — desktop apps, CLI tools, IDE extensions, mobile apps on the local network.
// Pair with plug-my-ai
var request = URLRequest(url: URL(string: "http://localhost:21110/v1/connect")!)
request.httpMethod = "POST"
request.setValue("application/json", forHTTPHeaderField: "Content-Type")
request.httpBody = try JSONEncoder().encode([
    "app_name": "My Mac App",
    "app_url": "https://mymacapp.com"
])
let (data, _) = try await URLSession.shared.data(for: request)
let result = try JSONDecoder().decode(ConnectResponse.self, from: data)
// Poll result.request_id, then store the token
import time

import requests
from openai import OpenAI

# Pair
resp = requests.post("http://localhost:21110/v1/connect", json={
    "app_name": "My Python Tool",
    "app_url": "https://mytool.dev",
})
request_id = resp.json()["request_id"]

# Poll until approved
while True:
    poll = requests.get(f"http://localhost:21110/v1/connect/{request_id}")
    data = poll.json()
    if data["status"] == "approved":
        token = data["token"]
        break
    time.sleep(1)

# Use it with the OpenAI SDK
client = OpenAI(base_url="http://localhost:21110/v1", api_key=token)
response = client.chat.completions.create(
    model="claude",
    messages=[{"role": "user", "content": "Hello!"}]
)
// Using reqwest
let client = reqwest::Client::new();
let res = client.post("http://localhost:21110/v1/connect")
    .json(&serde_json::json!({
        "app_name": "My Rust App",
        "app_url": "https://myrust.app"
    }))
    .send()
    .await?;
let connect: ConnectResponse = res.json().await?;
// Poll, then use the token for chat completions
Check /v1/status first. Only show the "Plug My AI" button if the daemon is running.
If a request returns 401, the token was revoked. Clear it and prompt the user to re-pair.
Query /v1/models to see what's available. Let users pick if multiple models are configured.
plug-my-ai runs entirely locally. No data leaves the user's machine except to the AI provider they've configured. Tokens are app-specific and revocable.
The daemon only listens on localhost — it's not accessible from the network.
plug-my-ai routes requests through the user's own subscriptions and API keys. It doesn't share accounts, bypass rate limits, or resell access. Users are responsible for complying with their provider's terms of service.
The user's provider rate limits apply. plug-my-ai doesn't add or bypass any limits. If a user's Claude Code subscription has a usage cap, that cap applies to requests through plug-my-ai.
Requests to localhost:21110 will fail with a connection error. Detect this and show an install prompt or fall back to your own AI backend.
Yes. plug-my-ai is open source (MIT license). You can integrate it into commercial products without restriction.
Currently: Claude Code (subscription routing), with direct API key support (Anthropic, OpenAI), Ollama, and more coming soon. The provider system is extensible — contributions welcome.