
Deep-research assistant

Fan out up to 20 Tavily searches, summarize the results with GPT-4o, and return a cited brief, with a hard $2/run ceiling per user.

Cost: ~$0.40 per brief
Operations: web_search.query, text_generation.generate

Prerequisites

  • Set monthlyBudgetCents=20000 ($200) on the agent, which covers 100 briefs/month even if every run hits the $2 ceiling.
  • Track per-user run cost in your own DB by reading `cost_cents` from each response.
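The per-run ceiling can be enforced client-side with a small guard. This is a minimal sketch, assuming `cost_cents` is an integer number of cents; the function names `can_spend` and `add_cost` are hypothetical helpers, not part of the API.

```shell
# Sketch only: amounts are integer cents, and the function names are hypothetical.
RUN_CEILING_CENTS=200   # the $2/run ceiling

# can_spend SPENT NEXT_ESTIMATE
# Succeeds only if the next call would keep the run at or under the ceiling.
can_spend() {
  [ $(( $1 + $2 )) -le "$RUN_CEILING_CENTS" ]
}

# add_cost SPENT COST_CENTS
# Prints the new running total after a response reports its cost_cents.
add_cost() {
  echo $(( $1 + $2 ))
}

# Example: a run that has spent 40 cents and expects the next call to cost about 5.
spent=40
if can_spend "$spent" 5; then
  spent=$(add_cost "$spent" 5)   # in practice, add the cost_cents the response returned
fi
echo "spent so far: ${spent} cents"
```

Checking before each call, rather than after, means a run stops short of the ceiling instead of overshooting it by one request. The running total would still need to be persisted per user in your own DB.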

Walkthrough

1. Plan queries

Have a cheap model decompose the user's question into 5–20 sub-queries.

# Plan: ask a cheap model to break the question into sub-queries.
curl -X POST https://www.upivia.com/v1/service-requests \
  -H "Authorization: Bearer $AGENT_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "service": "text_generation",
    "operation": "generate",
    "payload": {
      "model":"openai/gpt-4o-mini",
      "messages":[{"role":"user","content":"Decompose: <question>"}]
    }
  }'
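Because each sub-query becomes a paid search, it may be worth capping the list on your side rather than trusting the model to stay within 20. The sketch below assumes the planner's reply has already been reduced to plain text with one sub-query per line; that format, and the name `cap_queries`, are assumptions for illustration.

```shell
# cap_queries [MAX]
# Reads sub-queries from stdin, one per line (an assumed format),
# drops blank lines, and keeps at most MAX of them (default 20).
cap_queries() {
  awk -v max="${1:-20}" 'NF && n < max { print; n++ }'
}

# Example: blank lines are skipped and only the first two queries survive.
printf 'history of topic\n\nkey vendors\nrecent criticism\n' | cap_queries 2
```

Asking the model for one sub-query per line in the prompt makes this kind of post-processing straightforward.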

2. Search in parallel

Fire web_search.query calls concurrently. Each call returns a list of {title, url, snippet}.

# Fire one of these per sub-query, concurrently from your code.
curl -X POST https://www.upivia.com/v1/service-requests \
  -H "Authorization: Bearer $AGENT_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "service": "web_search",
    "operation": "query",
    "payload": { "q": "<one sub-query>" }
  }'
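One way to run the request above concurrently from a shell script is to start each call as a background job and then `wait` for all of them. This is a sketch under assumptions: `json_escape`, `search_one`, and `fan_out` are hypothetical helper names, the escaping handles only backslashes and double quotes, and failed requests are not retried.

```shell
# json_escape TEXT: escape backslashes and double quotes for a JSON string.
# Minimal on purpose; control characters are not handled.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# search_one QUERY: the web_search.query request from this step, for one sub-query.
search_one() {
  curl -sS -X POST https://www.upivia.com/v1/service-requests \
    -H "Authorization: Bearer $AGENT_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"service\":\"web_search\",\"operation\":\"query\",\"payload\":{\"q\":\"$(json_escape "$1")\"}}"
}

# fan_out CMD OUTDIR
# Reads one query per line from stdin, runs `CMD query` for each as a
# background job, writes each result to OUTDIR/N.out, then waits for all jobs.
fan_out() (
  cmd=$1
  outdir=$2
  n=0
  while IFS= read -r query; do
    [ -n "$query" ] || continue
    n=$(( n + 1 ))
    "$cmd" "$query" > "$outdir/$n.out" < /dev/null &
  done
  wait
)

# Usage (queries.txt is a hypothetical file with one sub-query per line):
#   mkdir -p results && fan_out search_one results < queries.txt
```

With at most 20 sub-queries, launching every job at once is usually acceptable; a larger fan-out would want a concurrency limit. Writing each response to its own numbered file keeps results tied to the order of the sub-queries.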

3. Synthesize with GPT-4o

Concatenate all the snippets into a single GPT-4o call and ask for a cited brief. A stronger model is worth it here: the token cost is dwarfed by the search cost.
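For the citations to be checkable, it helps to hand the model a numbered source list it can cite as [n]. The sketch below assumes the search results have already been flattened to tab-separated lines of title, URL, and snippet; that intermediate format and the name `format_sources` are assumptions, and how you extract the fields from each response is up to your code.

```shell
# format_sources
# Reads TSV lines (title<TAB>url<TAB>snippet, an assumed intermediate format)
# from stdin, skips URLs already seen, and prints a numbered source list.
format_sources() {
  awk -F '\t' 'NF >= 3 && !seen[$2]++ {
    n++
    printf "[%d] %s (%s)\n%s\n\n", n, $1, $2, $3
  }'
}

# Example: the second row for the same URL is dropped as a duplicate.
printf 'Page A\thttps://example.com/a\tFirst snippet\nPage A again\thttps://example.com/a\tRepeat\n' \
  | format_sources
```

The resulting text can go into the user message of the synthesis call, followed by an instruction to cite each claim with its bracketed number. Deduplicating by URL matters because overlapping sub-queries often return the same page.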

Next steps

Audit every call at /audit-logs, watch spend at /usage, and tune budgets per service on the agent's page.
