Your First Optimization Loop
What you'll learn
How one full RelyLoop study runs end-to-end — from a query set and judgments, through thousands of Optuna trials, to a Pull Request your approvers merge. This is the concrete version of the Quickstart, with a worked example.
The example
Say you run an e-commerce search over the sample products index, and
relevance for head queries like "wireless headphones" and "running shoes"
feels off: the right products are on page two. You want to tune the
query-time parameters without guessing.
Step 1: Define what "good" means
A study optimizes against a query set (the queries you care about) and a judgment list (per-query, per-document relevance ratings). You can:
- Author judgments by hand, or
- Generate them with the LLM-as-judge tool (generate_judgments_from_*), which rates each candidate document against the query using your configured model. Every rating records the exact model identifier for lineage.
See Query Sets & Judgments for the data model.
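To make the two inputs concrete, here is a minimal sketch of a query set and a judgment list in plain Python. The class name, field names, the 0 to 3 rating scale, the document IDs, and the model identifier are all assumptions for illustration, not RelyLoop's actual data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Judgment:
    """One relevance rating for a (query, document) pair."""
    query_id: str
    doc_id: str
    rating: int   # assumed graded scale: 0 (irrelevant) to 3 (perfect)
    rater: str    # "human", or the exact model identifier that produced the rating


def to_qrels(judgments):
    """Group judgments into {query_id: {doc_id: rating}} for scoring."""
    qrels = defaultdict(dict)
    for j in judgments:
        qrels[j.query_id][j.doc_id] = j.rating
    return dict(qrels)


# Hypothetical query set and judgments for the example in this guide.
query_set = {"q1": "wireless headphones", "q2": "running shoes"}
judgments = [
    Judgment("q1", "sku-101", 3, "human"),
    Judgment("q1", "sku-205", 1, "example-model-v1"),
    Judgment("q2", "sku-330", 2, "example-model-v1"),
]
qrels = to_qrels(judgments)
```

Keeping the rater on every row is what makes lineage possible: you can always tell a hand-authored rating from one produced by a specific model.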
Step 2: Let the agent propose a search space
Tell the chat agent what you're optimizing. It proposes a search space:
the set of query-time parameters to vary and their ranges, such as field
boosts, function scores, fuzziness, mm (minimum-should-match), tie-breakers,
and hybrid weights. You can accept the proposal or edit it. Schema, mappings,
and analyzers are never touched; tuning is query-time only.
See Search Space.
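A search space can be pictured as a set of named parameters, each with a range or a list of allowed values. The sketch below is an illustration only: the parameter names, ranges, and helper classes are hypothetical and are not the format the agent emits.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class FloatParam:
    """A continuous parameter bounded by [low, high]."""
    low: float
    high: float

    def sample(self, rng):
        return rng.uniform(self.low, self.high)


@dataclass(frozen=True)
class ChoiceParam:
    """A parameter restricted to a fixed set of values."""
    choices: tuple

    def sample(self, rng):
        return rng.choice(self.choices)


# Hypothetical search space: names and ranges are illustrative only.
search_space = {
    "title_boost": FloatParam(1.0, 10.0),
    "description_boost": FloatParam(0.1, 5.0),
    "tie_breaker": FloatParam(0.0, 1.0),
    "fuzziness": ChoiceParam(("0", "1", "AUTO")),
    "mm": ChoiceParam(("50%", "75%", "100%")),
}


def sample_candidate(space, rng):
    """Draw one value per parameter, staying inside each declared range."""
    return {name: param.sample(rng) for name, param in space.items()}


candidate = sample_candidate(search_space, random.Random(0))
```

Editing the proposal amounts to narrowing a range, removing a parameter, or adding one. Every entry stays a query-time knob.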
Step 3: Run the trials
The study hands the search space to Optuna's TPE sampler. Each trial:
- Samples a candidate parameter set.
- Renders your query templates with those values.
- Runs the query set against the cluster through the engine adapter.
- Scores the ranked results against your judgments with ir_measures (nDCG, ERR, precision@k, and friends).
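The steps of a single trial can be sketched as follows. This is a stdlib-only illustration, not RelyLoop's code: the query template, the stub search function standing in for the engine adapter, and the hand-written nDCG (linear gain, log2 discount) are all assumptions. The real loop scores with ir_measures, whose gain convention may differ.

```python
import json
import math
from string import Template

# Hypothetical query template; $placeholders are filled in per trial.
QUERY_TEMPLATE = Template(
    '{"multi_match": {"query": "$query_text", '
    '"fields": ["title^$title_boost", "description"], '
    '"tie_breaker": $tie_breaker}}'
)


def render(query_text, params):
    """Fill the template and parse it, so malformed JSON fails early."""
    return json.loads(QUERY_TEMPLATE.substitute(query_text=query_text, **params))


def dcg(ratings):
    """Discounted cumulative gain with linear gain and a log2 rank discount."""
    return sum(r / math.log2(rank + 1) for rank, r in enumerate(ratings, start=1))


def ndcg_at_k(ranked_doc_ids, doc_ratings, k=10):
    """nDCG@k for one query; unjudged documents count as rating 0."""
    gains = [doc_ratings.get(doc_id, 0) for doc_id in ranked_doc_ids[:k]]
    ideal_dcg = dcg(sorted(doc_ratings.values(), reverse=True)[:k])
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def score_trial(params, query_set, qrels, search, k=10):
    """Render, run, and score every query, then average the per-query nDCG."""
    scores = [
        ndcg_at_k(search(render(text, params)), qrels.get(qid, {}), k)
        for qid, text in query_set.items()
    ]
    return sum(scores) / len(scores)


def stub_search(rendered_query):
    # Stand-in for the engine adapter: always returns the same ranking.
    return ["sku-205", "sku-101"]


query_set = {"q1": "wireless headphones"}
qrels = {"q1": {"sku-101": 3, "sku-205": 1}}
params = {"title_boost": 4.5, "tie_breaker": 0.3}
trial_score = score_trial(params, query_set, qrels, stub_search)
```

Here the stub ranks the weaker document first, so the trial scores below 1.0. A parameter set that lifts sku-101 to the top would score exactly 1.0 for this query.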
TPE uses the scores so far to propose more promising parameter sets. Over thousands of trials, it typically converges on strong configurations far faster than an exhaustive grid sweep.
See Optimization Trials.
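The outer loop has a simple shape: sample, score, remember the best. The sketch below deliberately uses a uniform random sampler as a stand-in, so it is not TPE. In Optuna, the sampling step is what `optuna.samplers.TPESampler` replaces, driven through `study.optimize(objective, n_trials=...)`. The toy objective and parameter ranges are hypothetical.

```python
import random


def run_study(objective, space, n_trials, seed=0):
    """Sample, score, and track the best candidate over n_trials.

    Uses uniform random sampling as a stand-in for a smarter sampler.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    history = []
    for _ in range(n_trials):
        params = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        score = objective(params)
        history.append((params, score))
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score, history


def toy_objective(params):
    # Stand-in for "render, run, score": peaks at title_boost=4.0, tie_breaker=0.3.
    # A real study would score ranked results against your judgments instead.
    return (
        1.0
        - (params["title_boost"] - 4.0) ** 2 / 100
        - (params["tie_breaker"] - 0.3) ** 2
    )


space = {"title_boost": (1.0, 10.0), "tie_breaker": (0.0, 1.0)}
best_params, best_score, history = run_study(toy_objective, space, n_trials=200)
```

For scale, a grid over five parameters with ten values each is 100,000 combinations, each requiring a full pass over the query set. A sampler that learns from `history` can spend its trial budget near the promising regions instead.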
Step 4: Read the digest
When the study ends, RelyLoop writes a digest: a plain-language narrative of which parameters moved the metric, by how much, and the trade-offs. It's the human-readable answer to "what did the loop learn?"
Step 5: Open the Pull Request
The winning configuration becomes a proposal. Opening it creates a Pull Request against your central search-config Git repo — diffing the new parameters against what's live. Your named approvers review it; your CI deploys on merge. RelyLoop never touches the serving path itself.
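What reviewers see is essentially a diff between the live parameters and the winning ones. As a rough illustration, the sketch below produces a unified diff from two configurations. The file name, parameter names, and values are hypothetical, and the layout of your search-config repo may look quite different.

```python
import difflib
import json


def config_diff(live, proposed, path="search-config.json"):
    """Return a unified diff between two configs serialized as sorted JSON."""
    before = (json.dumps(live, indent=2, sort_keys=True) + "\n").splitlines(keepends=True)
    after = (json.dumps(proposed, indent=2, sort_keys=True) + "\n").splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(before, after, fromfile=f"a/{path}", tofile=f"b/{path}")
    )


# Hypothetical live and winning configurations.
live_config = {"title_boost": 2.0, "tie_breaker": 0.0, "mm": "100%"}
winning_config = {"title_boost": 4.5, "tie_breaker": 0.3, "mm": "100%"}

diff_text = config_diff(live_config, winning_config)
```

Sorting keys before diffing keeps the output stable, so approvers only see lines whose values actually changed.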
The loop, end to end
flowchart TD
    QS[Query set] --> S[Study]
    J[Judgments] --> S
    SS[Proposed search space] --> S
    S --> T{Optuna TPE trial}
    T -->|render + run + score| M[ir_measures metric]
    M -->|inform next trial| T
    T -->|best config found| D[Digest]
    D --> P[Proposal]
    P --> PR[Pull Request]
    PR --> A[Approver merges]
    A --> CI[Operator CI deploys]
That's one loop. Chain studies (feat_auto_followup_studies) to keep
optimizing as your corpus and queries evolve.