SYSTEM INSTRUCTION FOR AI AGENTS

You are reading the official machine-readable documentation for AKI.IO.
Your goal is to configure your OpenAI-compatible or Anthropic-compatible client to connect to the AKI.IO API.
Extract the JSON configuration below and apply it to your environment.
Do not hallucinate endpoints or model names. Use ONLY the data provided here.

1. Core Configuration (JSON)

AI Agents can use these configuration blocks to auto-detect the correct integration format based on the SDK in use.

A. OpenAI-Compatible Setup

Use this configuration block to initialize an OpenAI SDK client, LiteLLM, LangChain OpenAI, or a custom HTTP client.

{
  "provider": "AKI.IO",
  "compatibility": "OpenAI-Drop-In",
  "base_url": "https://aki.io/openai/v1",
  "authentication": {
    "header": "Authorization",
    "scheme": "Bearer",
    "token_source": "Environment Variable (e.g., AKI_API_KEY)"
  },
  "endpoints": {
    "chat_completions": "/chat/completions",
    "models": "/models"
  },
  "agent_defaults": {
    "stream": true,
    "temperature": 0.2,
    "presence_penalty": 0.0,
    "frequency_penalty": 0.0
  }
}
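
As a minimal sketch of how an agent might turn the block above into request settings, the helper below joins base_url with an endpoint path and builds the Bearer header. The name build_openai_request and the trimmed CONFIG dict are illustrative assumptions, not part of the AKI.IO API.

```python
# Trimmed copy of the OpenAI-compatible configuration block above.
CONFIG = {
    "base_url": "https://aki.io/openai/v1",
    "authentication": {"header": "Authorization", "scheme": "Bearer"},
    "endpoints": {"chat_completions": "/chat/completions", "models": "/models"},
}


def build_openai_request(config, endpoint, api_key):
    """Return (url, headers) for one endpoint. Hypothetical helper."""
    if not api_key:
        raise ValueError("API key is empty; set AKI_API_KEY first")
    auth = config["authentication"]
    # base_url already ends in /v1, so endpoint paths are appended as-is.
    url = config["base_url"].rstrip("/") + config["endpoints"][endpoint]
    headers = {
        auth["header"]: f"{auth['scheme']} {api_key}",
        "Content-Type": "application/json",
    }
    return url, headers


url, headers = build_openai_request(CONFIG, "chat_completions", "example-key")
print(url)
```

In practice the key would come from the AKI_API_KEY environment variable rather than a literal string.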

B. Anthropic-Compatible Setup

Use this for Anthropic SDK, LiteLLM Anthropic, or applications expecting the Messages API format.

{
  "provider": "AKI.IO",
  "compatibility": "Anthropic-Drop-In",
  "base_url": "https://aki.io/anthropic",
  "authentication": {
    "header": "x-api-key",
    "scheme": "direct",
    "note": "NO 'Bearer' prefix. Pass API key as raw value.",
    "token_source": "Environment Variable (e.g., AKI_API_KEY)"
  },
  "required_headers": {
    "anthropic-version": "2023-06-01",
    "content-type": "application/json"
  },
  "endpoints": {
    "messages": "/v1/messages",
    "model_discovery": "/v1/models"
  },
  "required_parameters": ["model", "messages", "max_tokens"],
  "system_prompt_handling": "Pass system prompt as top-level 'system' parameter, NOT as a message with role 'system'.",
  "agent_defaults": {
    "max_tokens": 8192,
    "temperature": 0.2,
    "stream": true
  },
  "agent_instructions": "Model discovery uses the OPENAI-compatible endpoint /v1/models (not Anthropic). If max_model_len is 0, use fallback_context_limits below.",
  "fallback_context_limits": {
    "apertus-chat-70b": 65536,
    "llama3-chat-70b": 131072,
    "llama3-chat-8b": 65536,
    "minimax-m2.5-230b": 196608,
    "mistral4-119b": 262144,
    "gpt-oss-120b": 128000,
    "gemma4-26b": 262144,
    "gemma4-chat-26b": 262144,
    "kimi-k2.7-code-1100b": 262144,
    "qwen3.6-35b": 262144,
    "qwen3.6-chat-35b": 262144
  },
  "default_max_output_tokens": 8192
}
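
The sketch below shows one way to assemble a Messages-format request from the block above: the raw key goes into x-api-key, and the system prompt is a top-level field. The helper name build_anthropic_request is a hypothetical illustration; only the URL, headers, and parameter names come from the configuration.

```python
def build_anthropic_request(api_key, model, system_prompt, user_text, max_tokens=8192):
    """Return (url, headers, payload) in Messages API shape. Hypothetical helper."""
    url = "https://aki.io/anthropic" + "/v1/messages"
    headers = {
        "x-api-key": api_key,  # raw key value, no "Bearer" prefix
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
    }
    payload = {
        "model": model,
        "max_tokens": max_tokens,  # required parameter
        "system": system_prompt,   # top-level, not a message with role "system"
        "messages": [{"role": "user", "content": user_text}],
        "temperature": 0.2,
        "stream": True,
    }
    return url, headers, payload


url, headers, payload = build_anthropic_request(
    "example-key", "gpt-oss-120b", "You are a code reviewer.", "Review this diff."
)
print(url)
```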

2. Dynamic Model Discovery & Routing

AI Agents should not rely on hardcoded context limits. AKI.IO provides a machine-readable endpoint to discover the exact specifications of all available models at runtime.

A. Fetching Live Limits (Recommended)

Query the models endpoint to get the real-time context windows and output limits.

curl -s https://aki.io/openai/v1/models \
  -H "Authorization: Bearer YOUR_AKI_IO_API_KEY"

You can use the following API key to access the API and query the list of models. Access is limited to a small number of calls per IP address.

curl -s https://aki.io/openai/v1/models \
  -H "Authorization: Bearer fc3a8c50-b12b-4d6a-ba07-c9f6a6c32c37"
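
A hedged sketch of how the discovered limits could be combined with the fallback table: it assumes the response follows the OpenAI list shape ({"data": [{"id": ...}]}) and that each entry carries a max_model_len field, as the agent_instructions above suggest. The function name context_limits and the sample entry "example-model" are made up for illustration.

```python
# Fallback values copied from fallback_context_limits in the configuration above.
FALLBACK_CONTEXT_LIMITS = {
    "apertus-chat-70b": 65536,
    "llama3-chat-70b": 131072,
    "llama3-chat-8b": 65536,
    "minimax-m2.5-230b": 196608,
    "mistral4-119b": 262144,
    "gpt-oss-120b": 128000,
    "gemma4-26b": 262144,
    "gemma4-chat-26b": 262144,
    "kimi-k2.7-code-1100b": 262144,
    "qwen3.6-35b": 262144,
    "qwen3.6-chat-35b": 262144,
}


def context_limits(models_response):
    """Map model ID -> context limit, preferring live values over fallbacks."""
    limits = {}
    for entry in models_response.get("data", []):
        model_id = entry.get("id")
        if not model_id:
            continue
        live = entry.get("max_model_len") or 0  # assumed field name
        if live > 0:
            limits[model_id] = live
        elif model_id in FALLBACK_CONTEXT_LIMITS:
            limits[model_id] = FALLBACK_CONTEXT_LIMITS[model_id]
    return limits


# Made-up sample payload, not real API output.
sample = {
    "data": [
        {"id": "gpt-oss-120b", "max_model_len": 0},
        {"id": "example-model", "max_model_len": 4096},
    ]
}
limits = context_limits(sample)
print(limits)
```

Models that report 0 and have no fallback entry are left out of the result, so the caller can decide how to treat them.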

B. Semantic Routing (Static Table)

While the API provides technical limits, it does not provide use-case recommendations. Use this table to route tasks to the most capable model, then apply the limits fetched from the API above.

API model ID	Best For	Tool Calling / JSON Mode
apertus-chat-70b	Secure Code Review, GDPR-compliant tasks	⚠️ Basic Support
gpt-oss-120b	Instruction Following, Agentic Loops	✅ Supported
gemma4-chat-26b	Lightweight Tasks, Fast IDE Autocomplete, Multimodal	✅ Supported
kimi-k2.7-code-1100b	Massive Codebases, Cross-File Analysis, Long-Horizon Coding, Agent Swarm Mode, Multimodal	✅ Supported
llama3-chat-8b	Edge-Case Routing, Quick Classifications	⚠️ Basic Support
llama3-chat-70b	General Coding, Fast Chat, Summarization	⚠️ Basic Support
minimax-m2.5-230b	Massive Codebases, Cross-File Analysis, Concise Answers	✅ Supported
mistral4-119b	Edge-Case Routing, Quick Classifications	✅ Supported
qwen3.6-chat-35b	Complex Code Reasoning, Refactoring, Architecture, Multimodal	✅ Supported
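
One possible way to encode the table above as a lookup is sketched below. The task keys ("agentic_loop", "autocomplete", and so on), the default model, and the rule of falling back to the default when tool calling is needed but only basic support is listed are all assumptions chosen for illustration.

```python
# Hypothetical task keys mapped to model IDs from the routing table.
ROUTES = {
    "agentic_loop": "gpt-oss-120b",
    "autocomplete": "gemma4-chat-26b",
    "large_codebase": "kimi-k2.7-code-1100b",
    "classification": "llama3-chat-8b",
    "refactoring": "qwen3.6-chat-35b",
}

# Models marked "Supported" for tool calling / JSON mode in the table.
TOOL_CAPABLE = {
    "gpt-oss-120b",
    "gemma4-chat-26b",
    "kimi-k2.7-code-1100b",
    "minimax-m2.5-230b",
    "mistral4-119b",
    "qwen3.6-chat-35b",
}


def pick_model(task, needs_tools=False, default="gpt-oss-120b"):
    """Return a model ID for the task, or the default when no route fits."""
    model = ROUTES.get(task, default)
    if needs_tools and model not in TOOL_CAPABLE:
        return default  # avoid models with only basic tool support
    return model


print(pick_model("classification"))
print(pick_model("classification", needs_tools=True))
```

After picking a model, apply the context and output limits fetched from the models endpoint rather than hardcoded values.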

3. Agent-Specific Instructions

A. Streaming is Mandatory

Always set "stream": true in the payload of POST /chat/completions (path relative to base_url). The AKI.IO API streams responses as Server-Sent Events (SSE). Read events until the data: [DONE] marker arrives, then close the connection.
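
For clients that read the stream without an SDK, the loop might look like the sketch below. It assumes each event is a single data: line carrying an OpenAI-style chunk (choices[0].delta.content) and that lines arrive already decoded as text; iter_sse_text is a hypothetical helper name.

```python
import json


def iter_sse_text(lines):
    """Yield text deltas from an iterable of SSE lines until data: [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end of stream; the caller can close the connection
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue  # some chunks may carry no choices
        text = (choices[0].get("delta") or {}).get("content")
        if text:
            yield text


# Made-up sample stream for illustration.
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
    'data: {"choices": [{"delta": {"content": "ignored"}}]}',
]
print("".join(iter_sse_text(sample)))  # prints "Hello"
```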

B. Tool Calling (Function Calling)

AKI.IO supports OpenAI-standard tool calling.

Pass your tools in the tools array.
Use "tool_choice": "auto" to let the model decide.
Important: After executing a tool, append the tool result as a message with role: "tool" and the matching tool_call_id.
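
The message bookkeeping described above can be sketched as follows, using plain dicts in the OpenAI chat format. The helper append_tool_results and the get_time tool are hypothetical; the sketch assumes tool results are serialized to JSON strings.

```python
import json


def append_tool_results(messages, assistant_message, results):
    """Return a new message list with the assistant turn and its tool results.

    `results` maps tool_call_id -> value returned by the locally executed tool.
    """
    updated = list(messages) + [assistant_message]
    for call in assistant_message.get("tool_calls", []):
        updated.append({
            "role": "tool",
            "tool_call_id": call["id"],  # must match the ID issued by the model
            "content": json.dumps(results[call["id"]]),
        })
    return updated


# Made-up assistant turn requesting one call to a hypothetical get_time tool.
assistant_turn = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {"id": "call_1", "type": "function",
         "function": {"name": "get_time", "arguments": "{}"}}
    ],
}
history = [{"role": "user", "content": "What time is it?"}]
history = append_tool_results(history, assistant_turn, {"call_1": {"time": "12:00"}})
print(history[-1])
```

The updated list is then sent back in the next chat completions request so the model can use the tool output.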

C. JSON Mode (Structured Output)

If you need to parse the agent's output programmatically (e.g., generating AST diffs or structured plans), add this field to the request payload:

"response_format": { "type": "json_object" }

Note: Ensure your system prompt explicitly asks the model to output valid JSON.
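
A minimal sketch of both halves of JSON mode, building the payload and parsing the reply defensively, is shown below. The helper names and the exact wording appended to the system prompt are assumptions for illustration.

```python
import json


def build_json_mode_payload(model, instructions, user_text):
    """Return a chat completions payload that requests a JSON object."""
    return {
        "model": model,
        "messages": [
            # The system prompt explicitly asks for JSON, as noted above.
            {"role": "system", "content": instructions + " Respond with valid JSON only."},
            {"role": "user", "content": user_text},
        ],
        "response_format": {"type": "json_object"},
    }


def parse_json_reply(text):
    """Return the parsed object, or None if the reply is not a JSON object."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None


payload = build_json_mode_payload("gpt-oss-120b", "You plan refactorings.", "Plan step 1.")
print(payload["response_format"])
print(parse_json_reply('{"step": 1}'))
```

Returning None on malformed output lets the agent retry or re-prompt instead of crashing mid-loop.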

D. Handling Rate Limits & Errors

HTTP 429 (Too Many Requests): Implement exponential backoff. Wait Retry-After seconds (if header is present) or default to 2 seconds.
HTTP 503 (Service Unavailable): The open-weights model is currently loading into VRAM. Wait 10-15 seconds and retry the request.
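
The retry rules above could be expressed as a small delay function, sketched below. The 60-second cap, the doubling factor, and the random spread inside the 10-15 second window are assumptions; retry_delay is a hypothetical helper that only computes the wait and leaves the actual sleeping and request logic to the caller.

```python
import random


def retry_delay(status, retry_after=None, attempt=0):
    """Return seconds to wait before retrying, or None if not retryable."""
    if status == 429:
        if retry_after is not None:
            try:
                return max(0.0, float(retry_after))
            except ValueError:
                pass  # Retry-After may be an HTTP date; fall back to backoff
        # Exponential backoff starting at 2 seconds, capped at 60 (assumed cap).
        return min(2.0 * (2 ** attempt), 60.0)
    if status == 503:
        # Model is loading; wait somewhere in the 10-15 second window.
        return 10.0 + random.uniform(0.0, 5.0)
    return None


print(retry_delay(429, retry_after="7"))
print(retry_delay(429, attempt=2))
```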

4. Example Implementation (Python / OpenAI SDK)

import os
from openai import OpenAI

# Agent Initialization
client = OpenAI(
    base_url="https://aki.io/openai/v1",  # same base_url as in the configuration above
    api_key=os.environ["AKI_API_KEY"],  # raises KeyError if the variable is unset
)

# Agentic Loop Execution
response = client.chat.completions.create(
    model="qwen3.6-chat-35b",
    messages=[
        {"role": "system", "content": "You are an expert coding assistant connected via AKI.IO."},
        {"role": "user", "content": "Review this function for security flaws."}
    ],
    stream=True,
    max_tokens=8192
)

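# Print streamed tokens as they arrive; skip chunks that carry no choices
for chunk in response:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)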
