How Many LLM Calls Does Your Chatbot/Agent Make per User Query?

Written by Franz Malten Buemann

I’m running a survey on LLM call patterns in chatbot/agent architectures and would love your input:

How many LLM calls (e.g. OpenAI chat/completion requests) does your bot make for a single user query? A ballpark is fine, for example 1, 2, or 3+. No need for exact stats or traffic data.
If your count is 1: what trick or toolkit (chains, function calling, embeddings + structured prompts, etc.) lets you handle intent detection and the response in one go? Is that reliably achievable, and if so, how?
Have you found any other architectures that reliably handle multi-step or branching logic with fewer calls? What do you do to reduce the number of calls, other than caching?
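
To make the "count is 1" case concrete, here is a minimal sketch of one way it could look: a single prompt asks the model to return both the intent label and the reply as JSON, and the caller parses that one response. Everything here is an assumption for illustration: `answer_in_one_call`, the prompt wording, and the `llm` parameter (any function that takes a prompt string and returns the model's text) are hypothetical, not a specific vendor API.

```python
import json
from typing import Callable, Dict, List

# Hypothetical prompt: asks for intent classification and the answer together.
PROMPT_TEMPLATE = (
    "Classify the user's intent as one of: {intents}.\n"
    "Then answer the query.\n"
    'Reply with JSON only, using the keys "intent" and "reply".\n'
    "User query: {query}"
)


def answer_in_one_call(
    query: str,
    llm: Callable[[str], str],
    intents: List[str],
) -> Dict[str, object]:
    """Get intent and reply from a single LLM call (illustrative sketch)."""
    prompt = PROMPT_TEMPLATE.format(intents=", ".join(intents), query=query)
    raw = llm(prompt)  # the only model call made for this query

    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = None

    # If the model ignored the JSON instruction, fall back to the raw text
    # instead of spending a second call on a retry.
    if not isinstance(data, dict) or "reply" not in data:
        return {"intent": "unknown", "reply": raw, "llm_calls": 1}

    intent = data.get("intent")
    if intent not in intents:
        intent = "unknown"  # reject labels outside the allowed set
    return {"intent": intent, "reply": str(data["reply"]), "llm_calls": 1}
```

In this sketch the branching logic (for example routing on `intent`) happens in ordinary code after the call, so the call count stays at 1. Whether a real model follows the JSON instruction reliably enough is exactly the kind of thing I'd like to hear about.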

P.S.: No proprietary info needed. This is purely about design patterns. I’ll compile all responses into a short, anonymized summary and share it back here in a few days.

submitted by /u/shrikant4learning