FAQ — Gradient Disco

What exactly is Gradient Disco?

Gradient Disco is an end-to-end service that helps organisations build, optimise, and operate their own proprietary AI instead of renting access to generic models indefinitely. We combine a structured optimisation process (prompt engineering, fine-tuning, reinforcement learning) with a runtime layer that quantifies uncertainty and gates actions accordingly. The result is AI you own: smaller, cheaper, and tuned to your processes.

Who is Gradient Disco for?

We work with mid-to-large organisations that are already experimenting with AI but have hit a ceiling: costs are climbing, outputs are inconsistent, or the models can't be adapted to proprietary processes. You don't need to have a machine learning team in-house — we guide the process end-to-end and transfer the knowledge and artifacts to your team.

What does "sovereign AI" mean in practice?

Sovereign AI means the models, prompts, fine-tuning checkpoints, and evaluation pipelines live in your infrastructure and belong to you, not to a vendor. You are not locked into a specific cloud provider or API. If a vendor raises prices, changes their model, or shuts down, your AI capability survives. We treat AI as capital expenditure (CAPEX) that compounds in value over time, not as operating expenditure (OPEX) that keeps growing.

How is this different from just using GPT-4 or Claude with a good system prompt?

A well-crafted system prompt is a starting point, not a destination. General-purpose frontier models carry enormous inference costs, are not tuned to your domain, and improve on the vendor's schedule — not yours. Gradient Disco systematically builds smaller, specialised models that run faster and cheaper on your own infrastructure, using your own data and eval sets. The optimisation compounds: each generation of your model is better than the last, and you keep the IP.

What infrastructure do we need?

We meet you where you are. For organisations running on-premise or in a private cloud, we can deploy everything inside your environment. For teams on public cloud (AWS, GCP, Azure), we use managed GPU instances during training and self-hosted endpoints for inference. We also support a hybrid model: fine-tuning in the cloud with inference on your own hardware. The only hard requirement is that you can provide access to your process data for the initial training run.

How do we get started?

The fastest path is a short scoping call — usually 45 minutes. We look at one concrete process you want to automate or improve, estimate the data you already have, and sketch a first optimisation cycle. From there we can move to a paid pilot (Agent Quickstart) in a matter of weeks. Use the form on the main page or email us directly to book a call.

Where does the name Gradient Disco come from?

Gradient descent is the fundamental optimisation algorithm behind nearly all neural network training: the process by which a model learns by repeatedly stepping in the steepest downhill direction on an error surface. In a world where AI naming is dominated by cold, technical vocabulary, we wanted to offer a counterpoint and give gradient descent the disco ball it deserves.
We also believe that your organisation deserves a bit more disco: the ability to move fluidly to the tunes of different AI models and technologies while creating and honing your very own style of dancing.

What does a commercial engagement look like?

We start with a short, time-boxed discovery phase that produces an implementation plan and a detailed assessment of the expected value of building to own. We then move to a pilot that solves your problem and can be tested for robust operation. The pilot includes setting up the Gradient Disco infrastructure so that you own what we build. We also support live deployment and ongoing maintenance.

What is uncertainty quantification and why does it matter for AI automation?

Large language models cannot reliably assess their own confidence — they produce fluent, coherent text regardless of whether their output is correct. This means that in production, you cannot ask the model whether to trust its output. You have to measure uncertainty from the outside.

Gradient Disco uses two methods depending on deployment context. The first runs multiple model variants simultaneously and measures how much they disagree: tight agreement signals low uncertainty, wide disagreement signals high uncertainty. The second uses a lightweight Interaction Model that learns the expected behaviour of your specific process and measures "surprise" when the actual output deviates from what was predicted.
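
To make the first method concrete, here is a minimal sketch of one way disagreement between model variants could be scored. It is an illustration, not our production implementation: the function name disagreement is hypothetical, answers are compared by exact string match (a real deployment would more likely compare meaning), and normalised entropy is just one reasonable choice of disagreement measure.

```python
from collections import Counter
from math import log


def disagreement(answers: list[str]) -> float:
    """Normalised entropy of the answers returned by several model variants.

    Returns 0.0 when every variant gives the same answer and 1.0 when
    every variant gives a different answer. Hypothetical helper for
    illustration; exact string matching is a simplifying assumption.
    """
    n = len(answers)
    if n < 2:
        raise ValueError("need answers from at least two model variants")
    counts = Counter(answers)
    # Shannon entropy of the empirical answer distribution.
    entropy = sum((c / n) * log(n / c) for c in counts.values())
    # log(n) is the largest entropy n answers can have.
    return entropy / log(n)
```

For example, disagreement(["approve", "approve", "approve"]) is 0.0, while three variants that each return a different answer score close to 1.0. The raw score would still need to be calibrated against logged outcomes before it is used in a decision rule.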

Both methods produce a single calibrated number that feeds the decision rule: act autonomously when the expected cost of being wrong is lower than the cost of caution, and escalate to a human when it isn't. Without uncertainty quantification, the decision rule has no input — and autonomous AI systems have no principled basis for knowing when to stop.
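
The decision rule above can be sketched in a few lines. This sketch assumes the calibrated number can be read as the probability that the output is wrong, and that both costs are expressed in the same unit (for example, euros per case). The names decide, cost_of_error, and cost_of_review are hypothetical and chosen for illustration only.

```python
def decide(p_error: float, cost_of_error: float, cost_of_review: float) -> str:
    """Return "act" or "escalate" for a single case.

    p_error        -- calibrated probability that the output is wrong (assumption)
    cost_of_error  -- loss incurred if a wrong output is committed
    cost_of_review -- cost of routing the case to a human instead
    """
    if not 0.0 <= p_error <= 1.0:
        raise ValueError("p_error must be a probability between 0 and 1")
    expected_loss = p_error * cost_of_error
    # Act autonomously only when being wrong is, on average, cheaper than caution.
    return "act" if expected_loss < cost_of_review else "escalate"
```

With a 2% error probability, an error cost of 500, and a review cost of 15, the expected loss is 10, so decide(0.02, 500.0, 15.0) returns "act". At a 10% error probability the expected loss is 50 and the same case is escalated.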

How do you decide when an AI system is ready to act without human oversight?

Graduation to autonomy is economic, not intuitive. We use a three-stage framework called gated rollouts.

In the Shadow stage, the AI runs silently alongside human workflows, logging its outputs without acting on them. This produces calibration data at production volume before any automation risk is taken on.

In the Assisted stage, the system applies the decision rule: low-uncertainty outputs are committed automatically, high-uncertainty ones are routed to humans. Human attention is concentrated on cases where it actually matters.

In the Autonomous stage, the system handles the large majority of cases independently, with humans reserved for genuine exceptions.

Graduation from one stage to the next happens when the measured escalation rate multiplied by average loss per escalation falls below the savings from autonomy — a purely economic test. Critically, the process is reversible: if performance diverges in production, the system demotes automatically before damage compounds.
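
As a rough illustration of the graduation test and automatic demotion, the sketch below moves a system one stage up when the test passes and one stage down when it fails. It assumes the savings are measured per case, in the same unit as the loss, and it re-evaluates on a single snapshot of measurements; a real rollout would more plausibly require the test to hold over a sustained window. Stage and next_stage are hypothetical names.

```python
from enum import IntEnum


class Stage(IntEnum):
    SHADOW = 0
    ASSISTED = 1
    AUTONOMOUS = 2


def next_stage(
    stage: Stage,
    escalation_rate: float,
    avg_loss_per_escalation: float,
    savings_per_case: float,
) -> Stage:
    """Apply the economic test once and return the stage to run in next."""
    expected_loss = escalation_rate * avg_loss_per_escalation
    if expected_loss < savings_per_case:
        # Test passed: promote one stage, capped at AUTONOMOUS.
        return Stage(min(stage + 1, Stage.AUTONOMOUS))
    # Test failed: demote one stage, never below SHADOW.
    return Stage(max(stage - 1, Stage.SHADOW))
```

For instance, a Shadow-stage system with a 5% escalation rate, an average loss of 40 per escalation, and savings of 3 per case has an expected loss of 2, so it graduates to Assisted. If the escalation rate of an Autonomous system later rises to 20%, the expected loss of 8 exceeds the savings and the system is demoted to Assisted.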

What is an Interaction Model?

An Interaction Model is a lightweight, task-specific model that runs alongside an AI agent and learns the expected behaviour of a particular process. It builds a compact representation of what normal looks like — the typical sequence of inputs, outputs, and transitions for your specific workflow — and measures "surprise" when the actual output deviates from prediction. That surprise score is the uncertainty signal that feeds our decision rule.

The name is deliberate. We avoid the term "world model" because the Interaction Model is far smaller and more focused than the concept implies: it doesn't model the world, only the narrow operational context of one specific process. That scope is a feature — it makes it fast, cheap to maintain, and calibratable from realistic production data volumes rather than requiring massive pre-training.
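
To give a feel for what a surprise score can look like, here is a deliberately tiny stand-in for an Interaction Model: a first-order transition model over workflow steps that scores a sequence by its average negative log-probability. This is a toy under stated assumptions, not how our Interaction Models are built. The class name TransitionSurprise, the add-one smoothing, and the step names in the usage note are all hypothetical.

```python
from collections import defaultdict
from math import log


class TransitionSurprise:
    """Toy surprise model: learns step-to-step transition counts."""

    def __init__(self, smoothing: float = 1.0):
        self.smoothing = smoothing  # additive smoothing so unseen steps get p > 0
        self.counts = defaultdict(lambda: defaultdict(int))
        self.states = set()

    def fit(self, sequences):
        """Count transitions in observed (normal) workflow sequences."""
        for seq in sequences:
            self.states.update(seq)
            for prev, nxt in zip(seq, seq[1:]):
                self.counts[prev][nxt] += 1
        return self

    def surprise(self, seq) -> float:
        """Mean negative log-probability per transition; higher is more unusual."""
        if len(seq) < 2:
            raise ValueError("need at least two steps to score a transition")
        vocab = len(self.states) + 1  # one extra slot for never-seen steps
        total = 0.0
        for prev, nxt in zip(seq, seq[1:]):
            row = self.counts.get(prev, {})
            seen = sum(row.values())
            p = (row.get(nxt, 0) + self.smoothing) / (seen + self.smoothing * vocab)
            total -= log(p)
        return total / (len(seq) - 1)
```

After fitting on many runs of ["receive", "validate", "approve"], the same sequence scores low, while an out-of-order run such as ["receive", "approve", "validate"] scores noticeably higher. The raw score is not calibrated; turning it into a number a decision rule can use would require comparing it against logged outcomes.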

Can the system work with the AI tools and cloud providers we already use?

Yes. The Gradient Disco methodology is model-agnostic, optimiser-agnostic, and runtime-agnostic by design. The three optimisation tracks (prompt tuning, fine-tuning, reinforcement learning) work across model families — frontier APIs, open-source models, or your own checkpoints. The uncertainty and decision layers sit on top of whatever inference setup you have.

For infrastructure, we work with on-premise deployments, private cloud, public cloud (AWS, GCP, Azure), and hybrid setups. The key architectural choice is that all outputs — tuned prompts, fine-tuned checkpoints, RL policies, Interaction Models, evaluation pipelines — live in your artifact registry, not ours. When a vendor changes their model, raises prices, or shuts down, your AI capital is unaffected.
