Let your users bring their own AI. No API keys to manage, no usage costs to absorb. Works in web apps, desktop apps, CLI tools — anything.
plug-my-ai is a local daemon that runs on your user's machine. It exposes an
OpenAI-compatible API on localhost:21110 that routes requests to
whatever AI provider the user has configured — their Claude Code subscription,
API keys, local models, etc.
Your app never touches the user's credentials or subscription. The daemon handles auth, routing, and provider abstraction. You just make standard chat completion requests.
Your users need plug-my-ai installed. Point them to the one-liner:
curl -fsSL https://get.plugmy.ai/install.sh | sh
Once installed, the daemon runs in the background with a menu bar icon (macOS) or system tray (Linux).
Your app communicates with it over HTTP on localhost:21110.
Before starting the pairing flow, check if the daemon is running:
GET http://localhost:21110/v1/status
// Response:
{
  "status": "ok",
  "version": "0.1.0",
  "providers": ["claude-code"]
}
If the request fails (connection refused), the user hasn't installed plug-my-ai or it's not running. Show them the install prompt.
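That detection step can be sketched in Python using only the standard library (the endpoint and port are from the docs above; the timeout value is an arbitrary choice):

```python
import json
import urllib.error
import urllib.request

def daemon_status(base_url="http://localhost:21110", timeout=2.0):
    """Return the parsed /v1/status payload, or None if the daemon
    is unreachable (not installed, or not running)."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/status", timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError):
        return None

status = daemon_status()
if status is None:
    print("plug-my-ai not detected - show install prompt")
else:
    print(f"daemon {status['version']}, providers: {status['providers']}")
```

A connection-refused error resolves almost instantly, so this check is cheap enough to run at app startup.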
Pairing is how your app gets permission to use the user's AI. It's a one-time process that generates an app-specific token.
POST http://localhost:21110/v1/connect
Content-Type: application/json
{
  "app_name": "My App",
  "app_url": "https://myapp.com",
  "app_icon": "https://myapp.com/icon.png"  // optional
}
// Response:
{
  "request_id": "a1b2c3d4",
  "status": "pending",
  "expires_at": "2025-01-15T10:05:00Z"
}
This opens a browser window on the user's machine showing your app's name and URL. The user clicks Allow or Deny.
GET http://localhost:21110/v1/connect/{request_id}
// While pending:
{ "status": "pending" }
// When approved:
{
  "status": "approved",
  "token": "pma_abc123def456..."
}
// When denied:
{ "status": "denied" }
Poll every 1–2 seconds. Requests expire after 5 minutes. Once approved, store the token securely — you'll use it for all subsequent API calls.
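The poll loop can be wrapped like this (a Python sketch; `fetch_status` stands in for the GET request above so the timing logic stays visible and testable):

```python
import time

def wait_for_approval(fetch_status, interval=1.5, timeout=300, sleep=time.sleep):
    """Poll a pairing request until it resolves.

    fetch_status: callable returning the /v1/connect/{request_id} JSON.
    Returns the token on approval; raises on denial or after the
    5-minute expiry window.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        data = fetch_status()
        if data["status"] == "approved":
            return data["token"]
        if data["status"] == "denied":
            raise RuntimeError("user denied the pairing request")
        sleep(interval)  # 1-2 seconds, per the polling guidance
    raise TimeoutError("pairing request expired")
```

Injecting `sleep` as a parameter is just for testability; in production the default `time.sleep` is fine.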
Handle 401 Unauthorized responses gracefully: the token may have been revoked.
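One shape for that handling, sketched in Python (the `clear_token` callback and `TokenRevoked` exception are illustrative names, not part of the plug-my-ai API):

```python
class TokenRevoked(Exception):
    """Raised when the daemon rejects the stored token (HTTP 401)."""

def check_auth(status_code, clear_token):
    """On 401, forget the revoked pma_ token and signal a re-pair.

    status_code: HTTP status of a plug-my-ai response.
    clear_token: callback that wipes the token from app storage.
    """
    if status_code == 401:
        clear_token()
        raise TokenRevoked("token revoked - run the pairing flow again")
    return status_code
```

Catching `TokenRevoked` at the top of your request path is a convenient place to show the pairing UI again.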
Once paired, use the standard OpenAI chat completions format. If you're already using
the OpenAI SDK, you can just point it at localhost:21110.
POST http://localhost:21110/v1/chat/completions
Authorization: Bearer pma_abc123...
Content-Type: application/json
{
  "model": "claude",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Explain quantum computing briefly." }
  ],
  "stream": true
}
// Response: Server-Sent Events (SSE)
data: {"choices":[{"delta":{"content":"Quantum"},"index":0}]}
data: {"choices":[{"delta":{"content":" computing"},"index":0}]}
...
data: [DONE]
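If you're not using an SDK, the SSE stream can be decoded by hand; a Python sketch that assumes the chunk shape shown above:

```python
import json

def sse_deltas(lines):
    """Yield content fragments from an OpenAI-style SSE stream.

    lines: iterable of decoded text lines, e.g. from a streaming
    HTTP response. Skips blank keep-alive lines and stops at the
    [DONE] sentinel.
    """
    for line in lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload.strip() == "[DONE]":
            return
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            yield delta["content"]
```

Concatenating the yielded fragments reconstructs the full response text.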
GET http://localhost:21110/v1/models
Authorization: Bearer pma_abc123...
// Response:
{
  "data": [
    { "id": "claude", "object": "model", "owned_by": "claude-code" }
  ]
}
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'http://localhost:21110/v1',
  apiKey: storedPmaToken, // the pma_xxx token from pairing
});

const stream = await client.chat.completions.create({
  model: 'claude',
  messages: [{ role: 'user', content: 'Hello!' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
plug-my-ai isn't just for web apps. Any application that can make HTTP requests can integrate — desktop apps, CLI tools, IDE extensions, mobile apps on the local network.
// Pair with plug-my-ai
var request = URLRequest(url: URL(string: "http://localhost:21110/v1/connect")!)
request.httpMethod = "POST"
request.setValue("application/json", forHTTPHeaderField: "Content-Type")
request.httpBody = try JSONEncoder().encode([
    "app_name": "My Mac App",
    "app_url": "https://mymacapp.com"
])
let (data, _) = try await URLSession.shared.data(for: request)
let result = try JSONDecoder().decode(ConnectResponse.self, from: data)
// Poll result.request_id, then store the token
import time

import requests
from openai import OpenAI

# Pair
resp = requests.post("http://localhost:21110/v1/connect", json={
    "app_name": "My Python Tool",
    "app_url": "https://mytool.dev",
})
request_id = resp.json()["request_id"]

# Poll until approved
while True:
    poll = requests.get(f"http://localhost:21110/v1/connect/{request_id}")
    data = poll.json()
    if data["status"] == "approved":
        token = data["token"]
        break
    time.sleep(1)

# Use it with the OpenAI SDK
client = OpenAI(base_url="http://localhost:21110/v1", api_key=token)
response = client.chat.completions.create(
    model="claude",
    messages=[{"role": "user", "content": "Hello!"}]
)
// Using reqwest
let client = reqwest::Client::new();
let res = client.post("http://localhost:21110/v1/connect")
    .json(&serde_json::json!({
        "app_name": "My Rust App",
        "app_url": "https://myrust.app"
    }))
    .send()
    .await?;
let connect: ConnectResponse = res.json().await?;
// Poll, then use the token for chat completions
Check /v1/status first. Only show the "Plug My AI" button if the daemon is running.
If a request returns 401, the token was revoked. Clear it and prompt the user to re-pair.
Query /v1/models to see what's available. Let users pick if multiple models are configured.
plug-my-ai runs entirely locally. No data leaves the user's machine except to the AI provider they've configured. Tokens are app-specific and revocable.
The daemon only listens on localhost — it's not accessible from the network.
plug-my-ai routes requests through the user's own subscriptions and API keys. It doesn't share accounts, bypass rate limits, or resell access. Users are responsible for complying with their provider's terms of service.
The user's provider rate limits apply. plug-my-ai doesn't add or bypass any limits. If a user's Claude Code subscription has a usage cap, that cap applies to requests through plug-my-ai.
Requests to localhost:21110 will fail with a connection error. Detect this and show an install prompt or fall back to your own AI backend.
Yes. plug-my-ai is open source (MIT license). You can integrate it into commercial products without restriction.
Currently: Claude Code (subscription routing), with direct API key support (Anthropic, OpenAI), Ollama, and more coming soon. The provider system is extensible — contributions welcome.