I’m doing a survey on LLM call patterns in chatbot/agent architectures and would love your input:
- How many LLM calls (e.g. OpenAI chat/completion requests) does your bot make for a single user query? Just a ballpark (e.g. 1, 2, 3+). No need for exact stats or traffic data.
- If your count is 1: what trick or toolkit (chains, function‑calling, embeddings + structured prompts, etc.) lets you handle intent detection and response generation in one go? Is that actually achievable, and if so, how?
- Have you found any other architectures that reliably handle multi‑step or branching logic with fewer calls? What do you do to reduce the number of calls (other than caching)?
P.S.: No proprietary info needed. This is purely about design patterns. I’ll compile all responses into a short, anonymized summary and share it back here in a few days.
submitted by /u/shrikant4learning