Real-Time AI with LLM Streaming and Tool Calling
November 28, 2025 · Eden Team
Tags: release
Eden's LLM endpoints now support streaming responses with mid-stream tool calling. That means AI models can query your databases in real time while generating responses.
The Best of Both Worlds
Many LLM integrations force a choice between streaming (for responsiveness) and tool calling (for data access). Eden gives you both:

    const stream = await eden.llm.chat({
      model: "claude-3-opus",
      messages: [{ role: "user", content: "What's the status of order #12345?" }],
      tools: [eden.tools.database_query],
      stream: true
    });

    for await (const chunk of stream) {
      if (chunk.type === 'tool_call') {
        console.log(`Querying: ${chunk.tool_call.query}`);
      } else {
        process.stdout.write(chunk.content);
      }
    }

The model can invoke database queries at any point during generation, with results injected back into the context seamlessly.
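
To make the chunk handling concrete, here is a minimal sketch that folds a sequence of chunks into the final text plus the list of queries the model issued. It assumes chunks come in the two shapes used in the example: a `tool_call` chunk carrying a `query`, and otherwise a chunk carrying `content`. The `StreamChunk` type, the `"content"` type tag, and the `summarizeChunks` helper are hypothetical names for illustration, not part of Eden's SDK.

```typescript
// Hypothetical chunk shapes inferred from the example above; not Eden's published types.
type StreamChunk =
  | { type: "tool_call"; tool_call: { query: string } }
  | { type: "content"; content: string };

interface StreamSummary {
  text: string;      // concatenated model output
  queries: string[]; // queries the model issued mid-stream, in order
}

// Fold a sequence of chunks into the final text plus the list of tool queries.
function summarizeChunks(chunks: ReadonlyArray<StreamChunk>): StreamSummary {
  const summary: StreamSummary = { text: "", queries: [] };
  for (const chunk of chunks) {
    if (chunk.type === "tool_call") {
      summary.queries.push(chunk.tool_call.query);
    } else {
      summary.text += chunk.content;
    }
  }
  return summary;
}

// Hand-written chunks standing in for a real stream.
const demoChunks: StreamChunk[] = [
  { type: "content", content: "Let me check... " },
  { type: "tool_call", tool_call: { query: "SELECT status FROM orders WHERE id = 12345" } },
  { type: "content", content: "Your order has shipped." },
];
const demoSummary = summarizeChunks(demoChunks);
console.log(demoSummary.queries.length, demoSummary.text);
```

Keeping tool-call chunks separate from text chunks lets a UI show a "querying..." indicator without mixing query strings into the rendered answer.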
How It Works
Eden's LLM proxy intercepts the streaming response and:
- Detects tool calls in the stream
- Pauses streaming while executing the tool
- Runs the query through Eden's proxy layer
- Injects results back into the context
- Resumes streaming to the client
From the client's perspective, the stream appears continuous. Tool execution happens transparently.
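
The steps above can be sketched as an async generator. This is one plausible shape for such a loop, not Eden's actual implementation: the `ModelEvent`, `ModelFn`, and `ToolFn` types, the `proxyStream` function, the `"tool"` message role, and the `maxToolCalls` guard are all assumptions made for illustration. The model and the tool are replaced by small fakes so the control flow can run on its own.

```typescript
// Hypothetical event and message shapes; Eden's internal types are not documented here.
type ModelEvent =
  | { type: "content"; content: string }
  | { type: "tool_call"; tool_call: { query: string } };
type Message = { role: string; content: string };

// Assumed signatures: a model that streams events for a context, and a tool executor.
type ModelFn = (context: Message[]) => AsyncIterable<ModelEvent>;
type ToolFn = (query: string) => Promise<string>;

async function* proxyStream(
  model: ModelFn,
  runTool: ToolFn,
  context: Message[],
  maxToolCalls = 5, // guard against a model that never stops calling tools
): AsyncGenerator<ModelEvent> {
  let toolCalls = 0;
  while (true) {
    let pending: string | null = null;
    for await (const event of model(context)) {
      if (event.type === "tool_call") {
        pending = event.tool_call.query; // 1. detect the tool call
        yield event;                     //    surface it so the client can log it
        break;                           // 2. pause: stop reading the model stream
      }
      yield event;                       // plain text passes straight through
    }
    if (pending === null) return;        // model finished without requesting a tool
    if (++toolCalls > maxToolCalls) throw new Error("too many tool calls");
    const result = await runTool(pending);                      // 3. run the query
    context = [...context, { role: "tool", content: result }];  // 4. inject the result
    // 5. the next loop iteration resumes generation with the enlarged context
  }
}

// Scripted fake model: requests a tool once, then answers from the injected result.
async function* fakeModel(context: Message[]): AsyncGenerator<ModelEvent> {
  const toolMsg = context.find((m) => m.role === "tool");
  if (!toolMsg) {
    yield { type: "content", content: "Let me check... " };
    yield { type: "tool_call", tool_call: { query: "order 12345" } };
  } else {
    yield { type: "content", content: `Status: ${toolMsg.content}` };
  }
}

// Client view: a single stream, with the tool call appearing inline.
async function demo(): Promise<string> {
  let out = "";
  const lookup: ToolFn = async (query) => (query === "order 12345" ? "shipped" : "unknown");
  const start: Message[] = [{ role: "user", content: "Where is order 12345?" }];
  for await (const event of proxyStream(fakeModel, lookup, start)) {
    out += event.type === "tool_call" ? `[query: ${event.tool_call.query}] ` : event.content;
  }
  return out;
}

demo().then((out) => console.log(out));
```

In this sketch "pausing" means abandoning the current model stream and starting a new one with the tool result appended. The client only ever iterates one generator, which is why the stream looks continuous from its side.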
Real-World Examples
Customer support bot:
User: "What's my order status?"
AI: "Let me check... [queries order database]
Your order #12345 shipped yesterday via FedEx.
Tracking number: 789xyz. Expected delivery: Thursday."

Data analysis assistant:
User: "How did sales compare this month vs last month?"
AI: "I'll pull those numbers... [queries analytics database]
This month: $2.3M (+15% vs last month)
Top performers: Widget Pro ($890K), Gadget Plus ($450K)"

DevOps chatbot:
User: "Is the Redis cluster healthy?"
AI: "Checking cluster status... [queries Redis INFO]
All 6 nodes operational. Memory usage: 67%.
Replication lag: <1ms. No issues detected."

Faster Connection Pooling
We also migrated from r2d2_redis to deadpool for async connection pooling:
| Metric | Before | After | Improvement |
|--------|--------|-------|-------------|
| Connection acquisition | 450µs | 120µs | 3.75x faster |
| Memory per connection | 8KB | 4KB | 50% less |
| Max throughput | 45K req/s | 63K req/s | 40% higher |
Deadpool's native async/await support fits cleanly into Eden's Tokio-based architecture: pool acquisition no longer blocks a thread, and the associated thread synchronization overhead is gone.
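
deadpool is a Rust crate, so the snippet below is not its API. It is a conceptual sketch, written in TypeScript to match the client example earlier in this post, of what non-blocking acquisition means: a caller that finds the pool empty is parked as a pending promise instead of blocking a thread. `AsyncPool` and its methods are hypothetical names.

```typescript
// Conceptual async pool; illustrates the idea only, not deadpool's implementation.
class AsyncPool<T> {
  private idle: T[];
  private waiters: Array<(resource: T) => void> = [];

  constructor(resources: T[]) {
    this.idle = [...resources];
  }

  // Resolves immediately if a resource is idle; otherwise parks the caller
  // as a pending promise until release() hands one over.
  acquire(): Promise<T> {
    const resource = this.idle.pop();
    if (resource !== undefined) return Promise.resolve(resource);
    return new Promise<T>((resolve) => this.waiters.push(resolve));
  }

  // Hands the resource to the longest-waiting caller, or returns it to the idle list.
  release(resource: T): void {
    const next = this.waiters.shift();
    if (next) next(resource);
    else this.idle.push(resource);
  }

  // Number of callers currently waiting for a resource.
  get waiting(): number {
    return this.waiters.length;
  }
}

async function poolDemo(): Promise<string[]> {
  const pool = new AsyncPool<string>(["conn-1"]);
  const log: string[] = [];
  const first = await pool.acquire();          // idle connection, resolves right away
  const second = pool.acquire().then((c) => {  // pool is empty, so this caller waits
    log.push(`second got ${c}`);
    pool.release(c);
  });
  log.push(`first got ${first}, waiting=${pool.waiting}`);
  pool.release(first);                         // wakes the waiting caller
  await second;
  return log;
}

poolDemo().then((log) => console.log(log.join("\n")));
```

While the second caller waits, nothing is blocked: the event loop (or, in Eden's case, the Tokio runtime) stays free to run other tasks until `release` hands the connection over.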