Infrastructure
Latency
How long it takes for an AI to respond after you send it a request: the delay between asking and receiving an answer.
Definition
The time elapsed between sending a request to an AI system and receiving a response. Typically measured in milliseconds. Affected by model size, hardware, network distance, and queue depth.
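As a rough illustration of the definition, latency can be measured by timing a single request from send to receive. The sketch below is a minimal example, not a production benchmark: `measure_latency_ms` and `fake_ai_call` are hypothetical names, and the 50 ms sleep stands in for a real call to an AI system.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def measure_latency_ms(send_request: Callable[[], T]) -> Tuple[T, float]:
    """Run send_request once and return (response, elapsed milliseconds)."""
    start = time.perf_counter()   # monotonic, high-resolution clock
    response = send_request()     # blocks until the response arrives
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return response, elapsed_ms


def fake_ai_call() -> str:
    # Hypothetical stand-in for a real request to an AI system.
    time.sleep(0.05)              # simulate roughly 50 ms of delay
    return "ok"


if __name__ == "__main__":
    response, latency_ms = measure_latency_ms(fake_ai_call)
    print(f"response={response!r} latency={latency_ms:.1f} ms")
```

A single measurement like this includes everything between asking and answering (network, queueing, and model time), so repeated runs will vary.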
Why it matters
Many users abandon an interaction after roughly 3 seconds of waiting. For agents that act in real time, latency determines whether the system is usable at all.
Related terms in Infrastructure
API
A digital plug or messenger that lets two different software programs talk to each other.
API Gateway
The security guard at the front door of your software that checks IDs and directs traffic.
Cloud Computing
Renting powerful computers over the internet instead of buying and keeping them in your own office.
Edge AI
Running AI directly on the device (phone, camera, car) instead of sending data to the cloud, which makes it faster and more private.