📚 API Documentation

Complete guide to integrating the Jobsolve AI Microservice into your application

🚀 Quick Start

The Normalized LLM Respond API is our recommended endpoint for all LLM interactions. It provides a unified, vendor-agnostic interface that works with multiple AI providers.

✨ Why Use the Normalized API?

  • Provider-Agnostic: Switch between providers without changing your code
  • Tools/Function Calling: Built-in support for JSON Schema-based tools
  • Structured Streaming: Real-time SSE streaming with token, tool_call, and done events
  • Normalized Responses: Consistent format regardless of provider
  • Error Handling: Standardized error types and messages
  • Usage Tracking: Automatic usage recording for billing

Endpoint URL

https://streamchat-staging.jobsolve.ai/api/v1/llm/respond

Authentication

All requests require an API key in the x-api-key header:

x-api-key: your-company-id

⚠️ Domain Registration Required

Your domain must be registered in the Jobsolve portal and whitelisted for your company ID. Contact support to register your domain.
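A minimal authenticated request looks like this (a sketch assuming Node 18+ with built-in fetch, or a modern browser; substitute your own key):

// Minimal "hello world" request against the normalized endpoint.
// Top-level await works in ES modules and browser consoles.
const response = await fetch('https://streamchat-staging.jobsolve.ai/api/v1/llm/respond', {
    method: 'POST',
    headers: {
        'Content-Type': 'application/json',
        'x-api-key': 'your-company-id'
    },
    body: JSON.stringify({
        model: 'meta/llama-3-70b-instruct',
        messages: [
            { role: 'user', content: [{ type: 'text', text: 'Hello!' }] }
        ],
        stream: false
    })
});

const data = await response.json();
console.log(data.message.content[0].text);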

📖 Core Concepts

How It Works

The microservice acts as a stateless intermediary between your application and AI providers:

  1. Your application sends a normalized request to our API
  2. We transform it to the provider-specific format (currently OpenRouter)
  3. We forward the request to the AI provider
  4. We normalize the response and return it to you
  5. Usage is automatically recorded for billing

Request Format

All requests follow this normalized structure:

{
  "model": "meta/llama-3-70b-instruct",
  "messages": [
    {
      "role": "system",
      "content": [
        { "type": "text", "text": "You are a helpful assistant." }
      ]
    },
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "Hello!" }
      ]
    }
  ],
  "temperature": 0.7,
  "max_tokens": 1000,
  "stream": false,
  "plugins": []  // Optional: for PDF parsing (GPT-5)
}

Response Format

All responses follow this normalized structure:

{
  "provider": "openrouter",
  "model": "meta/llama-3-70b-instruct",
  "finish_reason": "stop",
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 45,
    "total_tokens": 168
  },
  "message": {
    "role": "assistant",
    "content": [
      { "type": "text", "text": "Hello! How can I help you?" }
    ]
  },
  "tool_calls": []  // Present if tools were called
}

📄 PDF & File Uploads

The API supports PDF uploads and processing through different methods depending on the model:

GPT-5 Models (Recommended for PDFs)

GPT-5 models support PDFs via file content parts:

{
  "model": "openai/gpt-5",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "Summarize this PDF in 5 bullets." },
        {
          "type": "file",
          "file": {
            "filename": "document.pdf",
            "fileData": "https://your-domain.com/uploads/document.pdf"
          }
        }
      ]
    }
  ],
  "plugins": [
    {
      "id": "file-parser",
      "pdf": { "engine": "pdf-text" }
    }
  ]
}

✨ Key Features

  • Automatic URL Conversion: Private URLs (localhost, ngrok) are automatically converted to base64
  • Parser Selection: Choose between pdf-text (free) or mistral-ocr (paid, for scanned documents)
  • Base64 Support: Can also use file_data with base64 data URLs

Using Base64 (for private files)

{
  "model": "openai/gpt-5",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "Analyze this PDF" },
        {
          "type": "file",
          "file": {
            "filename": "document.pdf",
            "file_data": "data:application/pdf;base64,JVBERi0xLjQK..."
          }
        }
      ]
    }
  ]
}

Other Models (via image_url)

Some models support PDFs via the image_url content type:

{
  "model": "openai/gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "Summarize this document" },
        {
          "type": "image_url",
          "image_url": {
            "url": "https://your-domain.com/uploads/document.pdf"
          }
        }
      ]
    }
  ]
}

⚠️ Model Compatibility

Not all models support PDFs. The system automatically filters out PDFs for unsupported models and includes a warning message. For reliable PDF processing, use openai/gpt-5 with file parts.

Parser Engines

When using GPT-5 with PDFs, you can specify a parsing engine:

  • pdf-text: Free; extracts the embedded text layer from digital PDFs
  • mistral-ocr: Paid; runs OCR and is recommended for scanned documents
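For example, to request the OCR engine for a scanned document, set it in the plugins array:

{
  "plugins": [
    {
      "id": "file-parser",
      "pdf": { "engine": "mistral-ocr" }
    }
  ]
}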

🔧 Tools & Function Calling

Tools allow the AI to call functions in your application. This enables powerful integrations like database queries, API calls, or custom business logic.

How Tools Work

  1. Define Tools: Provide JSON Schema definitions for each function
  2. AI Decides: The model decides when to call a tool based on the conversation
  3. Execute: Your application executes the function with the provided arguments
  4. Continue: Send the tool result back to continue the conversation

Tool Definition

{
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state, e.g. San Francisco, CA"
            },
            "unit": {
              "type": "string",
              "enum": ["celsius", "fahrenheit"],
              "description": "Temperature unit"
            }
          },
          "required": ["location"]
        }
      }
    }
  ],
  "tool_choice": "auto"  // or "none" or {"type": "function", "function": {"name": "get_weather"}}
}

Tool Call Response

When the AI wants to call a tool, the response includes tool_calls:

{
  "finish_reason": "tool_calls",
  "tool_calls": [
    {
      "id": "call_123",
      "name": "get_weather",
      "arguments": {
        "location": "San Francisco, CA",
        "unit": "fahrenheit"
      }
    }
  ]
}

Continuing the Conversation

After executing the tool, add the result to the conversation:

{
  "messages": [
    // ... previous messages ...
    {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "id": "call_123",
          "name": "get_weather",
          "arguments": {...}
        }
      ]
    },
    {
      "role": "tool",
      "content": "72°F and sunny",
      "tool_call_id": "call_123"
    },
    {
      "role": "user",
      "content": [{ "type": "text", "text": "What should I wear?" }]
    }
  ]
}

💻 Integration Examples

Node.js Integration

Installation

npm install axios

Basic Non-Streaming Request

const axios = require('axios');

async function callLLM(messages, model = 'meta/llama-3-70b-instruct') {
    try {
        const response = await axios.post(
            'https://streamchat-staging.jobsolve.ai/api/v1/llm/respond',
            {
                model: model,
                messages: messages,
                temperature: 0.7,
                max_tokens: 1000,
                stream: false
            },
            {
                headers: {
                    'Content-Type': 'application/json',
                    'x-api-key': 'your-company-id'
                }
            }
        );
        
        return response.data;
    } catch (error) {
        console.error('Error calling LLM:', error.response?.data || error.message);
        throw error;
    }
}

// Usage
const messages = [
    {
        role: 'system',
        content: [{ type: 'text', text: 'You are a helpful assistant.' }]
    },
    {
        role: 'user',
        content: [{ type: 'text', text: 'Hello!' }]
    }
];

callLLM(messages).then(response => {
    console.log('Response:', response.message.content[0].text);
    console.log('Usage:', response.usage);
});

Streaming Request

const axios = require('axios');

async function streamLLM(messages, model = 'meta/llama-3-70b-instruct') {
    try {
        const response = await axios.post(
            'https://streamchat-staging.jobsolve.ai/api/v1/llm/respond',
            {
                model: model,
                messages: messages,
                temperature: 0.7,
                max_tokens: 1000,
                stream: true
            },
            {
                headers: {
                    'Content-Type': 'application/json',
                    'x-api-key': 'your-company-id'
                },
                responseType: 'stream'
            }
        );
        
        let buffer = '';
        let currentEventType = '';
        
        response.data.on('data', (chunk) => {
            buffer += chunk.toString();
            const lines = buffer.split('\n');
            buffer = lines.pop() || '';
            
            for (const line of lines) {
                if (line.startsWith('event: ')) {
                    currentEventType = line.slice(7).trim();
                } else if (line.startsWith('data: ')) {
                    const dataStr = line.slice(6).trim();
                    if (dataStr === '[DONE]') continue;
                    
                    try {
                        const data = JSON.parse(dataStr);
                        
                        if (currentEventType === 'token') {
                            process.stdout.write(data.text || '');
                        } else if (currentEventType === 'tool_call') {
                            console.log('\n[Tool Call]', data.name, data.arguments);
                        } else if (currentEventType === 'done') {
                            console.log('\n\n[Done]', data);
                        }
                    } catch (e) {
                        // Ignore parse errors
                    }
                }
            }
        });
        
        response.data.on('end', () => {
            console.log('\nStream complete');
        });
        
    } catch (error) {
        console.error('Error streaming LLM:', error.message);
        throw error;
    }
}
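For reference, the frames this parser consumes look roughly like the following (the done payload here is illustrative; its exact fields follow the normalized response format):

event: token
data: {"text":"Hello"}

event: token
data: {"text":" there!"}

event: done
data: {"finish_reason":"stop","usage":{"prompt_tokens":123,"completion_tokens":45,"total_tokens":168}}

data: [DONE]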

With Tools

const axios = require('axios');

// Define available tools
const tools = [
    {
        type: 'function',
        function: {
            name: 'get_weather',
            description: 'Get the current weather for a location',
            parameters: {
                type: 'object',
                properties: {
                    location: {
                        type: 'string',
                        description: 'The city and state'
                    }
                },
                required: ['location']
            }
        }
    }
];

// Tool execution functions
const toolFunctions = {
    get_weather: async (args) => {
        // Your implementation here
        return `The weather in ${args.location} is 72°F and sunny.`;
    }
};

async function chatWithTools(messages) {
    while (true) {
        const response = await axios.post(
            'https://streamchat-staging.jobsolve.ai/api/v1/llm/respond',
            {
                model: 'meta/llama-3-70b-instruct',
                messages: messages,
                tools: tools,
                tool_choice: 'auto',
                stream: false
            },
            {
                headers: {
                    'Content-Type': 'application/json',
                    'x-api-key': 'your-company-id'
                }
            }
        );
        
        const data = response.data;
        
        // If the assistant returned text, add it to the conversation
        if (data.message && data.message.content) {
            messages.push(data.message);
            console.log('Assistant:', data.message.content[0].text);
        }
        
        // If there are tool calls, execute them
        if (data.tool_calls && data.tool_calls.length > 0) {
            // Add a single assistant message carrying all tool calls
            messages.push({
                role: 'assistant',
                content: null,
                tool_calls: data.tool_calls
            });
            
            for (const toolCall of data.tool_calls) {
                // Execute the tool
                const toolResult = await toolFunctions[toolCall.name](toolCall.arguments);
                
                // Add the tool result
                messages.push({
                    role: 'tool',
                    content: toolResult,
                    tool_call_id: toolCall.id
                });
            }
        } else {
            // No more tool calls, conversation complete
            break;
        }
    }
    
    return messages;
}

// Usage
const messages = [
    { role: 'user', content: [{ type: 'text', text: 'What is the weather in San Francisco?' }] }
];

chatWithTools(messages).then(() => {
    console.log('Conversation complete');
});

Vanilla JavaScript Integration

Basic Non-Streaming Request

async function callLLM(messages, model = 'meta/llama-3-70b-instruct') {
    try {
        const response = await fetch('https://streamchat-staging.jobsolve.ai/api/v1/llm/respond', {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                'x-api-key': 'your-company-id'
            },
            body: JSON.stringify({
                model: model,
                messages: messages,
                temperature: 0.7,
                max_tokens: 1000,
                stream: false
            })
        });
        
        if (!response.ok) {
            const error = await response.json();
            throw new Error(error.error?.message || 'Request failed');
        }
        
        const data = await response.json();
        return data;
    } catch (error) {
        console.error('Error calling LLM:', error);
        throw error;
    }
}

// Usage
const messages = [
    {
        role: 'system',
        content: [{ type: 'text', text: 'You are a helpful assistant.' }]
    },
    {
        role: 'user',
        content: [{ type: 'text', text: 'Hello!' }]
    }
];

callLLM(messages).then(response => {
    console.log('Response:', response.message.content[0].text);
    console.log('Usage:', response.usage);
});

Streaming Request

async function streamLLM(messages, onToken, onToolCall, onDone) {
    try {
        const response = await fetch('https://streamchat-staging.jobsolve.ai/api/v1/llm/respond', {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                'x-api-key': 'your-company-id'
            },
            body: JSON.stringify({
                model: 'meta/llama-3-70b-instruct',
                messages: messages,
                temperature: 0.7,
                max_tokens: 1000,
                stream: true
            })
        });
        
        if (!response.ok) {
            const error = await response.json();
            throw new Error(error.error?.message || 'Request failed');
        }
        
        const reader = response.body.getReader();
        const decoder = new TextDecoder();
        let buffer = '';
        let currentEventType = '';
        
        while (true) {
            const { done, value } = await reader.read();
            if (done) break;
            
            buffer += decoder.decode(value, { stream: true });
            const lines = buffer.split('\n');
            buffer = lines.pop() || '';
            
            for (const line of lines) {
                if (line.startsWith('event: ')) {
                    currentEventType = line.slice(7).trim();
                } else if (line.startsWith('data: ')) {
                    const dataStr = line.slice(6).trim();
                    if (dataStr === '[DONE]') continue;
                    
                    try {
                        const data = JSON.parse(dataStr);
                        
                        if (currentEventType === 'token' && onToken) {
                            onToken(data.text || '');
                        } else if (currentEventType === 'tool_call' && onToolCall) {
                            onToolCall(data);
                        } else if (currentEventType === 'done' && onDone) {
                            onDone(data);
                        }
                    } catch (e) {
                        // Ignore parse errors
                    }
                }
            }
        }
    } catch (error) {
        console.error('Error streaming LLM:', error);
        throw error;
    }
}

// Usage
const messages = [
    { role: 'user', content: [{ type: 'text', text: 'Tell me a story' }] }
];

streamLLM(
    messages,
    (token) => document.getElementById('output').textContent += token,
    (toolCall) => console.log('Tool call:', toolCall),
    (done) => console.log('Complete:', done)
);

With Tools

// Define tools
const tools = [
    {
        type: 'function',
        function: {
            name: 'get_weather',
            description: 'Get the current weather',
            parameters: {
                type: 'object',
                properties: {
                    location: { type: 'string' }
                },
                required: ['location']
            }
        }
    }
];

// Tool execution
const toolFunctions = {
    get_weather: (args) => {
        return `The weather in ${args.location} is 72°F and sunny.`;
    }
};

async function chatWithTools(messages) {
    while (true) {
        const response = await fetch('https://streamchat-staging.jobsolve.ai/api/v1/llm/respond', {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                'x-api-key': 'your-company-id'
            },
            body: JSON.stringify({
                model: 'meta/llama-3-70b-instruct',
                messages: messages,
                tools: tools,
                tool_choice: 'auto',
                stream: false
            })
        });
        
        const data = await response.json();
        
        // Add the assistant's text message if present
        if (data.message && data.message.content) {
            messages.push(data.message);
            console.log('Assistant:', data.message.content[0].text);
        }
        
        // Handle tool calls
        if (data.tool_calls && data.tool_calls.length > 0) {
            // Add a single assistant message carrying all tool calls
            messages.push({
                role: 'assistant',
                content: null,
                tool_calls: data.tool_calls
            });
            
            for (const toolCall of data.tool_calls) {
                // Execute the tool
                const result = toolFunctions[toolCall.name](toolCall.arguments);
                
                // Add the tool result
                messages.push({
                    role: 'tool',
                    content: result,
                    tool_call_id: toolCall.id
                });
            }
        } else {
            break; // Conversation complete
        }
    }
    
    return messages;
}

Laravel Integration

Service Class

<?php

namespace App\Services;

use Illuminate\Support\Facades\Http;
use Illuminate\Support\Facades\Log;

class LLMService
{
    protected string $apiUrl;
    protected string $apiKey;

    public function __construct()
    {
        $this->apiUrl = config('services.llm.api_url');
        $this->apiKey = config('services.llm.api_key');
    }
    
    /**
     * Send a non-streaming request to the LLM API
     */
    public function respond(array $messages, string $model = 'meta/llama-3-70b-instruct', array $options = []): array
    {
        $response = Http::withHeaders([
            'x-api-key' => $this->apiKey,
            'Content-Type' => 'application/json',
        ])->post("{$this->apiUrl}/api/v1/llm/respond", [
            'model' => $model,
            'messages' => $messages,
            'temperature' => $options['temperature'] ?? 0.7,
            'max_tokens' => $options['max_tokens'] ?? 1000,
            'stream' => false,
            'tools' => $options['tools'] ?? null,
            'tool_choice' => $options['tool_choice'] ?? 'auto',
        ]);
        
        if (!$response->successful()) {
            Log::error('LLM API error', [
                'status' => $response->status(),
                'body' => $response->body()
            ]);
            throw new \Exception('LLM API request failed');
        }
        
        return $response->json();
    }
    
    /**
     * Stream a request to the LLM API
     */
    public function stream(array $messages, string $model = 'meta/llama-3-70b-instruct', ?callable $onToken = null, ?callable $onToolCall = null, ?callable $onDone = null): void
    {
        $response = Http::withHeaders([
            'x-api-key' => $this->apiKey,
            'Content-Type' => 'application/json',
        ])->withBody(
            json_encode([
                'model' => $model,
                'messages' => $messages,
                'temperature' => 0.7,
                'max_tokens' => 1000,
                'stream' => true,
            ]),
            'application/json'
        )->send('POST', "{$this->apiUrl}/api/v1/llm/respond");
        
        // Note: Http::send() buffers the full response body, so this parses
        // the completed SSE payload rather than streaming it incrementally.
        $currentEventType = '';
        
        foreach (explode("\n", $response->body()) as $line) {
            if (str_starts_with($line, 'event: ')) {
                $currentEventType = trim(substr($line, 7));
            } elseif (str_starts_with($line, 'data: ')) {
                $dataStr = trim(substr($line, 6));
                if ($dataStr === '[DONE]') continue;
                
                $data = json_decode($dataStr, true);
                if (!is_array($data)) {
                    continue; // json_decode() returns null on malformed JSON; skip it
                }
                
                if ($currentEventType === 'token' && $onToken) {
                    $onToken($data['text'] ?? '');
                } elseif ($currentEventType === 'tool_call' && $onToolCall) {
                    $onToolCall($data);
                } elseif ($currentEventType === 'done' && $onDone) {
                    $onDone($data);
                }
            }
        }
    }
}

Configuration

// config/services.php

return [
    // ... other services ...
    
    'llm' => [
        'api_url' => env('LLM_API_URL', 'https://streamchat-staging.jobsolve.ai'),
        'api_key' => env('LLM_API_KEY'),
    ],
];

Usage in Controller

<?php

namespace App\Http\Controllers;

use App\Services\LLMService;
use Illuminate\Http\Request;

class ChatController extends Controller
{
    protected LLMService $llmService;

    public function __construct(LLMService $llmService)
    {
        $this->llmService = $llmService;
    }
    
    public function chat(Request $request)
    {
        $messages = [
            [
                'role' => 'system',
                'content' => [
                    ['type' => 'text', 'text' => 'You are a helpful assistant.']
                ]
            ],
            [
                'role' => 'user',
                'content' => [
                    ['type' => 'text', 'text' => $request->input('message')]
                ]
            ]
        ];
        
        $response = $this->llmService->respond($messages);
        
        return response()->json([
            'message' => $response['message']['content'][0]['text'],
            'usage' => $response['usage']
        ]);
    }
    
    public function streamChat(Request $request)
    {
        return response()->stream(function () use ($request) {
            $messages = [
                [
                    'role' => 'user',
                    'content' => [
                        ['type' => 'text', 'text' => $request->input('message')]
                    ]
                ]
            ];
            
            $this->llmService->stream(
                $messages,
                function ($token) {
                    echo "data: " . json_encode(['text' => $token]) . "\n\n";
                    ob_flush();
                    flush();
                },
                function ($toolCall) {
                    echo "data: " . json_encode(['tool_call' => $toolCall]) . "\n\n";
                    ob_flush();
                    flush();
                },
                function ($done) {
                    echo "data: " . json_encode(['done' => $done]) . "\n\n";
                    ob_flush();
                    flush();
                }
            );
        }, 200, [
            'Content-Type' => 'text/event-stream',
            'Cache-Control' => 'no-cache',
            'Connection' => 'keep-alive',
        ]);
    }
}

With Tools

$tools = [
    [
        'type' => 'function',
        'function' => [
            'name' => 'get_weather',
            'description' => 'Get the current weather',
            'parameters' => [
                'type' => 'object',
                'properties' => [
                    'location' => [
                        'type' => 'string',
                        'description' => 'The city and state'
                    ]
                ],
                'required' => ['location']
            ]
        ]
    ]
];

$messages = [
    ['role' => 'user', 'content' => [['type' => 'text', 'text' => 'What is the weather in San Francisco?']]]
];

while (true) {
    $response = $this->llmService->respond($messages, 'meta/llama-3-70b-instruct', [
        'tools' => $tools,
        'tool_choice' => 'auto'
    ]);
    
    // Add the assistant's text message if present
    if (isset($response['message']['content'])) {
        $messages[] = $response['message'];
    }
    
    // Handle tool calls
    if (!empty($response['tool_calls'])) {
        // Add a single assistant message carrying all tool calls
        $messages[] = [
            'role' => 'assistant',
            'content' => null,
            'tool_calls' => $response['tool_calls']
        ];
        
        foreach ($response['tool_calls'] as $toolCall) {
            // Execute tool (your implementation)
            $result = $this->executeTool($toolCall['name'], $toolCall['arguments']);
            
            // Add the tool result
            $messages[] = [
                'role' => 'tool',
                'content' => $result,
                'tool_call_id' => $toolCall['id']
            ];
        }
    } else {
        break; // Conversation complete
    }
}

⚡ Error Handling

All errors follow a normalized format:

{
  "error": {
    "type": "rate_limit",
    "message": "Too many requests",
    "provider_status": 429,
    "retry_after_ms": 1500
  }
}

Error Types

Common error types include rate_limit (back off and retry after retry_after_ms), auth (invalid API key or unregistered domain), and invalid_request (malformed request body). The upstream provider's HTTP status is surfaced in provider_status.

Example Error Handling

try {
    const response = await callLLM(messages);
    // Handle success
} catch (error) {
    if (error.response?.data?.error) {
        const apiError = error.response.data.error;
        
        switch (apiError.type) {
            case 'rate_limit':
                // Wait and retry
                await new Promise(resolve => 
                    setTimeout(resolve, apiError.retry_after_ms || 1000)
                );
                // Retry request
                break;
            case 'auth':
                // Check API key and domain registration
                console.error('Authentication failed:', apiError.message);
                break;
            case 'invalid_request':
                // Fix request format
                console.error('Invalid request:', apiError.message);
                break;
            default:
                console.error('Error:', apiError.message);
        }
    }
}
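If you retry on rate limits in several places, a small wrapper keeps the logic in one spot. A sketch building on the callLLM helper from the Node.js examples:

// Retry on rate_limit errors with capped attempts, honoring retry_after_ms.
async function callLLMWithRetry(messages, maxRetries = 3) {
    for (let attempt = 0; attempt <= maxRetries; attempt++) {
        try {
            return await callLLM(messages);
        } catch (error) {
            const apiError = error.response?.data?.error;
            if (apiError?.type === 'rate_limit' && attempt < maxRetries) {
                await new Promise(resolve =>
                    setTimeout(resolve, apiError.retry_after_ms || 1000)
                );
                continue; // retry the request
            }
            throw error; // non-retryable, or out of attempts
        }
    }
}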

📊 Best Practices

1. Stateless Design

The API is stateless. Always send the full conversation history with each request:

// ✅ Good: Send full history
const messages = [
    { role: 'system', content: [...] },
    { role: 'user', content: [...] },
    { role: 'assistant', content: [...] },
    { role: 'user', content: [...] }
];

// ❌ Bad: Only sending latest message
const messages = [
    { role: 'user', content: [...] }
];
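In practice that means keeping the history alive between turns and appending both sides of each exchange. A minimal sketch (using the callLLM helper from earlier; the storage strategy is up to you):

// Keep the full history and append each turn before calling the API.
const history = [
    { role: 'system', content: [{ type: 'text', text: 'You are a helpful assistant.' }] }
];

async function sendTurn(userText) {
    history.push({ role: 'user', content: [{ type: 'text', text: userText }] });
    const response = await callLLM(history);
    history.push(response.message); // keep the assistant's reply for the next turn
    return response.message.content[0].text;
}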

2. Tool Execution Loop

When using tools, continue the conversation until finish_reason is not "tool_calls":

while (true) {
    const response = await callLLM(messages, { tools });
    
    if (response.finish_reason === 'tool_calls') {
        // Echo the assistant's tool-call message back first
        messages.push({
            role: 'assistant',
            content: null,
            tool_calls: response.tool_calls
        });
        
        // Execute each tool and append its result
        for (const toolCall of response.tool_calls) {
            const result = executeTool(toolCall);
            messages.push({
                role: 'tool',
                content: result,
                tool_call_id: toolCall.id
            });
        }
    } else {
        // Conversation complete
        break;
    }
}

3. Streaming for Better UX

Use streaming for better user experience, especially for longer responses:

// Show tokens as they arrive
streamLLM(messages, (token) => {
    document.getElementById('output').textContent += token;
});

4. Error Handling

Always handle errors gracefully and provide fallbacks:

try {
    const response = await callLLM(messages);
} catch (error) {
    // Log error
    console.error('LLM error:', error);
    
    // Show user-friendly message
    showError('Sorry, something went wrong. Please try again.');
}

🔗 Additional Resources