Your First Optimization Loop¶

What you'll learn

How one full RelyLoop study runs end-to-end — from a query set and judgments, through thousands of Optuna trials, to a Pull Request your approvers merge. This is the concrete version of the Quickstart, with a worked example.

The example¶

Say you run an e-commerce search over the sample products index, and relevance for head queries like wireless headphones and running shoes feels off — the right products are on page two. You want to tune the query-time parameters without guessing.

Step 1 — Define what "good" means¶

A study optimizes against a query set (the queries you care about) and a judgment list (per-query, per-document relevance ratings). You can:

Author judgments by hand, or
Generate them with the LLM-as-judge tool (generate_judgments_from_*), which rates each candidate document against the query using your configured model — every rating captures the exact model identifier for lineage.

See Query Sets & Judgments for the data model.

Step 2 — Let the agent propose a search space¶

Tell the chat agent what you're optimizing. It proposes a search space — the set of query-time parameters to vary and their ranges: field boosts, function scores, fuzziness, mm, tie-breakers, hybrid weights. You can accept or edit it. Nothing about schema, mappings, or analyzers is touched — tuning is query-time only.

See Search Space.

Step 3 — Run the trials¶

The study hands the search space to Optuna's TPE sampler. Each trial:

Samples a candidate parameter set.
Renders your query templates with those values.
Runs the query set against the cluster through the engine adapter.
Scores the ranked results against your judgments with ir_measures (nDCG, ERR, precision@k, and friends).

TPE uses the scores so far to propose more promising parameter sets — thousands of trials, converging far faster than a grid sweep.

See Optimization Trials.

Step 4 — Read the digest¶

When the study ends, RelyLoop writes a digest: a plain-language narrative of which parameters moved the metric, by how much, and the trade-offs. It's the human-readable answer to "what did the loop learn?"

Step 5 — Open the Pull Request¶

The winning configuration becomes a proposal. Opening it creates a Pull Request against your central search-config Git repo — diffing the new parameters against what's live. Your named approvers review it; your CI deploys on merge. RelyLoop never touches the serving path itself.

See Git-as-Source-of-Truth.

The loop, end to end¶

flowchart TD
    QS[Query set] --> S[Study]
    J[Judgments] --> S
    SS[Proposed search space] --> S
    S --> T{Optuna TPE trial}
    T -->|render + run + score| M[ir_measures metric]
    M -->|inform next trial| T
    T -->|best config found| D[Digest]
    D --> P[Proposal]
    P --> PR[Pull Request]
    PR --> A[Approver merges]
    A --> CI[Operator CI deploys]

That's one loop. Chain studies (feat_auto_followup_studies) to keep optimizing as your corpus and queries evolve.