Agent Demos Are Too Generous

Most agent demos get a very generous audience.

People forgive bad latency, weird tool use, fragile state, and unclear failure modes because the demo has a good story.

That is fine for a demo. It is a bad way to judge a system.

The standard I care about is simpler:

If the answer is no, the demo is just a pretty temporary state.

This is also why I keep coming back to local setups and harnesses. They are less impressive on first glance and more useful after the third run.

That is the right tradeoff for me.