Make Your First Request
The platform features a model-centric API. Every available AI model is exposed as its own endpoint, and services covering similar use cases (Large Language Models, Image Generation) share a similar, comparable structure.
However, since different AI models often have their own capabilities and functions, they may require specialized parameters and inputs. This is where a model-specific API has a major advantage: such features can be exposed directly as additional parameters, without the overall API specification ever having to be adapted.
AKI.IO also offers an OpenAI-compatible, service-centered interface. This allows the service to be used as a drop-in replacement for existing OpenAI integrations. More information can be found here: AKI.IO OpenAI API Interface.
The API is designed as a bidirectional, real-time streaming interface, but it can also be used in a traditional blocking request–response mode.
All communication is performed exclusively via JSON messages. This includes binary data, which must be embedded directly within the JSON payload (for example, as Base64-encoded content). No additional transport channels or side channels are required.
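As a small sketch of embedding binary data in the JSON payload: the snippet below Base64-encodes raw bytes into a JSON body and decodes them back. The field name "image_input" is an assumption chosen for illustration; consult the specific endpoint's documentation for the actual parameter name.

```python
import base64
import json

# Example binary content (e.g. the start of a PNG file).
raw_bytes = b"\x89PNG\r\n\x1a\n"

# Binary data is embedded directly in the JSON payload as Base64 text.
# "image_input" is a hypothetical field name used only for this sketch.
payload = {
    "key": "YOUR-API-KEY",
    "image_input": base64.b64encode(raw_bytes).decode("ascii"),
}

# The payload is now pure JSON and travels over a single channel.
body = json.dumps(payload)

# The receiving side recovers the original bytes exactly.
recovered = base64.b64decode(json.loads(body)["image_input"])
```

Because the binary content rides inside the JSON message itself, no additional transport or side channel is needed.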
Authentication is handled within the JSON payload itself. No HTTP headers are required for authentication. This design makes API calls straightforward, reliable, and proxy-safe.
The following example demonstrates a basic request using the curl command-line tool:
curl -X POST -H 'Content-Type: application/json' -d \
'{ "key":"fc3a8c50-b12b-4d6a-ba07-c9f6a6c32c37", "prompt_input":"Tell a joke", "wait_for_result": true}' \
https://aki.io/api/call/llama3_8b_chat

This example performs a simple blocking call to the Llama 3 8B LLM endpoint (llama3_8b_chat) with the input prompt "Tell a joke". Because the option "wait_for_result" is set, the complete response is returned as JSON data:
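The same request can be made from Python. The sketch below uses only the standard library (not the official AKI.IO client) and mirrors the curl call above: the API key travels inside the JSON body, so no authentication headers are needed.

```python
import json
import urllib.request

API_URL = "https://aki.io/api/call/llama3_8b_chat"

def build_payload(key: str, prompt: str, wait: bool = True) -> dict:
    # Authentication is part of the JSON payload itself; no auth headers required.
    return {"key": key, "prompt_input": prompt, "wait_for_result": wait}

def call_llm(key: str, prompt: str) -> dict:
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(key, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# result = call_llm("YOUR-API-KEY", "Tell a joke")
# print(result["text"])
```

The official Python client adds streaming support on top of this; the plain blocking call shown here is enough for a first request.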
{
"text":"A man walked into a library and asked the librarian, \"Do you have any books on Pavlov's dogs and Schrödinger's cat?\" \n\nThe librarian replied, \"It rings a bell, but I'm not sure if it's here or not.\"",
"model_name":"Llama-3.1-8BInstruct",
"max_seq_len":65536,
"prompt_length":38,
"num_generated_tokens":55,
"current_context_length":93,
"success":true,
"total_duration":0.721,
"compute_duration":0.7
}

In addition to the response result, the API returns relevant metadata such as the number of input tokens, the number of generated tokens, and the total compute duration.
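These metadata fields are useful for monitoring. As a small sketch using the values from the sample response above, generation throughput and request overhead can be derived directly:

```python
# Metadata fields taken from the sample response above.
response = {
    "num_generated_tokens": 55,
    "total_duration": 0.721,
    "compute_duration": 0.7,
}

# Throughput: generated tokens per second of pure compute time.
tokens_per_second = response["num_generated_tokens"] / response["compute_duration"]

# Overhead (queueing, transfer): gap between total and compute duration.
overhead = response["total_duration"] - response["compute_duration"]

print(f"{tokens_per_second:.1f} tokens/s, {overhead * 1000:.0f} ms overhead")
# prints "78.6 tokens/s, 21 ms overhead"
```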
That’s it — you’ve made your first request.
To use the full capabilities of the AKI.IO API, including real-time streaming, official client libraries are available for Python (via PyPI) and JavaScript. Additional SDKs and interfaces are currently under development and will be released soon.
If you need assistance integrating the AKI.IO API into your preferred programming language or application framework, please contact us at support@aki.io.