Inference
The process of running a trained AI model to generate predictions or outputs from new inputs. In contrast to training (which adjusts model weights), inference uses a fixed model to process live requests. Most AI governance concerns arise at inference time: what data is sent to the model, what outputs are generated, who sees those outputs, and how those outputs are acted on. Inference costs (compute, API calls) and inference latency also determine which models are practical to deploy.
Why this matters for your team
Governance happens at inference time, not just at build time. Every inference call is a live data event — log what data is sent, what responses are returned, and flag anomalies. Inference logs are your audit trail.
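The logging described above can be sketched as a thin wrapper around each inference call. This is a minimal illustration, not a production design: `fake_model` is a stand-in for whatever model or API your team actually calls, and the anomaly rule is a deliberately simple placeholder.

```python
import json
import logging
import time
import uuid

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("inference-audit")

def fake_model(prompt: str) -> str:
    # Stand-in for a real model or API call (hypothetical, for this sketch).
    return prompt.upper()

def logged_inference(prompt: str, user_id: str) -> str:
    """Run one inference call and emit a structured audit record."""
    request_id = str(uuid.uuid4())
    start = time.monotonic()
    output = fake_model(prompt)
    latency_ms = (time.monotonic() - start) * 1000
    record = {
        "request_id": request_id,
        "user_id": user_id,          # who triggered the call
        "input_chars": len(prompt),  # log sizes, not raw text, if inputs are sensitive
        "output_chars": len(output),
        "latency_ms": round(latency_ms, 2),
    }
    logger.info(json.dumps(record))
    # Toy anomaly flag: surface unusually long outputs for human review.
    if len(output) > 10 * max(len(prompt), 1):
        logger.warning("anomaly: output much longer than input (request_id=%s)",
                       request_id)
    return output

print(logged_inference("hello world", user_id="u-123"))
```

Structured (JSON) records rather than free-form log lines make the audit trail queryable later: you can answer "what did user X send last Tuesday?" without re-parsing prose.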