
Real-Time AI with LLM Streaming and Tool Calling

November 28, 2025 · Eden Team

Tags: release


Eden's LLM endpoints now support streaming responses with mid-stream tool calling. That means AI models can query your databases in real time while generating responses.

The Best of Both Worlds

Traditional LLM APIs make you choose: streaming (for responsiveness) or tool calling (for data access). Eden gives you both:

```javascript
const stream = await eden.llm.chat({
  model: "claude-3-opus",
  messages: [{ role: "user", content: "What's the status of order #12345?" }],
  tools: [eden.tools.database_query],
  stream: true
});

for await (const chunk of stream) {
  if (chunk.type === 'tool_call') {
    console.log(`Querying: ${chunk.tool_call.query}`);
  } else {
    process.stdout.write(chunk.content);
  }
}
```

The model can invoke database queries at any point during generation, with results injected back into the context seamlessly.

How It Works

Eden's LLM proxy intercepts the streaming response and:

  1. Detects tool calls in the stream
  2. Pauses streaming while executing the tool
  3. Runs the query through Eden's proxy layer
  4. Injects results back into the context
  5. Resumes streaming to the client

From the client's perspective, the stream appears continuous. Tool execution happens transparently.
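
Conceptually, the proxy is a loop over upstream chunks that forwards content and intercepts tool calls. Here is a minimal, runnable sketch of that loop in Rust (Eden's architecture is Tokio-based); the `Chunk` type, `upstream_chunks`, and `execute_tool` are illustrative stand-ins, not Eden's actual internals:

```rust
// Hypothetical types -- stand-ins for Eden's (non-public) proxy internals.
#[derive(Debug)]
enum Chunk {
    Content(String),
    ToolCall { query: String },
}

// Stand-in for the upstream model stream: content, then one tool call.
fn upstream_chunks() -> Vec<Chunk> {
    vec![
        Chunk::Content("Let me check... ".into()),
        Chunk::ToolCall {
            query: "SELECT status FROM orders WHERE id = 12345".into(),
        },
        Chunk::Content("Your order #12345 shipped yesterday.".into()),
    ]
}

// Stand-in for step 3: run the tool call through the proxy layer.
async fn execute_tool(query: &str) -> String {
    format!("(result for: {query})")
}

#[tokio::main]
async fn main() {
    let mut context: Vec<String> = Vec::new();
    for chunk in upstream_chunks() {
        match chunk {
            // Step 1: a tool call is detected in the stream.
            Chunk::ToolCall { query } => {
                // Step 2: forwarding to the client pauses here.
                let result = execute_tool(&query).await; // step 3
                context.push(result); // step 4: inject into the model context
                // Step 5: generation resumes and chunks keep flowing below.
            }
            // Plain content is forwarded to the client unchanged.
            Chunk::Content(text) => print!("{text}"),
        }
    }
    println!("\n[context now holds: {context:?}]");
}
```

In the real proxy, the upstream source is the model's token stream, and step 5 re-enters generation with the updated context rather than simply continuing the loop.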

Real-World Examples

Customer support bot:

User: "What's my order status?"
AI: "Let me check... [queries order database]
     Your order #12345 shipped yesterday via FedEx.
     Tracking number: 789xyz. Expected delivery: Thursday."

Data analysis assistant:

User: "How did sales compare this month vs last month?"
AI: "I'll pull those numbers... [queries analytics database]
     This month: $2.3M (+15% vs last month)
     Top performers: Widget Pro ($890K), Gadget Plus ($450K)"

DevOps chatbot:

User: "Is the Redis cluster healthy?"
AI: "Checking cluster status... [queries Redis INFO]
     All 6 nodes operational. Memory usage: 67%.
     Replication lag: <1ms. No issues detected."

Faster Connection Pooling

We also migrated from r2d2_redis to deadpool for async connection pooling:

| Metric | Before | After | Improvement |
|--------|--------|-------|-------------|
| Connection acquisition | 450µs | 120µs | 3.75x faster |
| Memory per connection | 8KB | 4KB | 50% less |
| Max throughput | 45K req/s | 63K req/s | 40% higher |

Deadpool's native async/await support integrates perfectly with Eden's Tokio-based architecture. No more blocking pool acquisition or thread synchronization overhead.
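
For a flavor of what that looks like in practice, here is a generic deadpool-redis sketch; this is standard usage of the crate, not Eden's actual pool configuration, and the URL and keys are placeholders:

```rust
use deadpool_redis::{Config, Runtime};
use redis::AsyncCommands;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Build a pool from a Redis URL (placeholder address).
    let cfg = Config::from_url("redis://127.0.0.1:6379");
    let pool = cfg.create_pool(Some(Runtime::Tokio1))?;

    // Acquiring a connection is a plain .await -- no blocked threads.
    let mut conn = pool.get().await?;

    // Pooled connections implement redis::AsyncCommands directly.
    let _: () = conn.set("greeting", "hello").await?;
    let value: String = conn.get("greeting").await?;
    println!("{value}");

    // Dropping `conn` returns it to the pool automatically.
    Ok(())
}
```

Because acquisition is just an await point, a task waiting on the pool parks instead of blocking an OS thread, which is where the latency and throughput gains above come from.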

LLM streaming documentation →