How does prompt caching work?
last updated: Jan 17, 2026
https://ngrok.com/blog/prompt-caching/
Not satisfied with the answers in the vendor documentation, which do a good job of explaining how to use prompt caching but sidestep the question of what is actually being cached, I decided to go deeper. I went down the rabbit hole of how LLMs work until I understood the precise data providers cache, what it's used for, and how it makes everything faster and cheaper for everyone.
The article goes deep into how LLMs work in general before arriving at how prompt caching works.
via Simon Willison