The agentic era is quietly rewriting how we track developer time
When an AI agent writes the first draft of your code, the old idea of an hour stops making sense. Here is how billable time is changing, and what to measure instead.
For about thirty years the unit of developer work was an hour of typing. You sat down, you wrote code, and the time on the clock mapped fairly cleanly to the value you produced. Tracking it was simple because the work was visible. You could watch the file grow.
That mapping is coming apart.
When a coding agent writes the first draft of a feature in ninety seconds, the ninety seconds are not the work. The work happened before, when you decided what to build and how it should behave, and it happens after, when you read what the agent produced and decide whether it is right. The typing, the part we used to bill, is now the cheapest thing in the whole process.
So what are you actually selling when half the keystrokes are not yours? That is the question every freelancer, consultant, and studio billing by the hour is going to have to answer this year, whether they want to or not.
The hour did not disappear, it moved
A useful way to think about this: the hour did not vanish. It moved out of the editor and into judgment.
Picture a typical afternoon now. You spend twenty minutes describing a feature to an agent in enough detail that it builds the right thing. It returns a working draft in two minutes. You then spend forty minutes reading it closely, catching the two places where it confidently did the wrong thing, restructuring a module it got backwards, and writing the test it forgot. About an hour passed. Real value was created. But only two minutes of it looked like coding in the old sense.
If your time tracker only counts the two minutes, it is measuring the wrong thing. The twenty minutes of specification and the forty minutes of review are where your expertise actually lives. That is the part a client cannot get from the model directly, and it is the part worth paying for.
Three kinds of time worth naming
Once you stop equating value with typing, it helps to separate the work into a few buckets. We have found three that hold up.
Direction. Deciding what to build, breaking it down, writing the prompt or the spec that gets a good result on the first try. This is getting more valuable, not less, because the quality of your direction now sets a ceiling on the quality of everything downstream.
Review. Reading generated code with the eye of someone who will own it at 2am when it breaks. Catching the subtle wrong assumption. Knowing when the elegant solution is a trap. Models do not carry the weight of consequences. You do, and that is review.
Repair and integration. Wiring the generated piece into a real system, handling the cases the model never saw, making it match the rest of the codebase. The unglamorous work that turns a plausible draft into something that ships.
None of these three is typing speed. All three are billable. If you can see them in your own log, you can defend them in an invoice.
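To make the buckets concrete, here is a minimal sketch of a log that tags each entry by kind of time and totals the minutes per bucket. The names (Bucket, Entry, minutes_by_bucket) are hypothetical and exist only for illustration. The sample entries reuse the afternoon described earlier, with the agent's two-minute run logged as its own category so it stays visible next to the judgment work.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Bucket(Enum):
    DIRECTION = "direction"
    GENERATION = "generation"  # the agent's run time, not one of the three buckets
    REVIEW = "review"
    REPAIR = "repair and integration"


@dataclass
class Entry:
    bucket: Bucket
    minutes: int


def minutes_by_bucket(entries):
    """Sum logged minutes for each bucket that appears in the log."""
    totals = defaultdict(int)
    for entry in entries:
        totals[entry.bucket] += entry.minutes
    return dict(totals)


# The afternoon from earlier in the article: spec, generation, review.
afternoon = [
    Entry(Bucket.DIRECTION, 20),
    Entry(Bucket.GENERATION, 2),
    Entry(Bucket.REVIEW, 40),
]

totals = minutes_by_bucket(afternoon)
total_minutes = sum(totals.values())  # 62
judgment_share = (totals[Bucket.DIRECTION] + totals[Bucket.REVIEW]) / total_minutes

print(f"{total_minutes} minutes logged, {judgment_share:.0%} of it direction and review")
```

Even in this toy log, nearly all of the time sits in direction and review, which is the part an invoice has to be able to explain.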
Why per-minute tracking matters more now, not less
There is a tempting conclusion here, which is that since AI makes coding faster, careful time tracking matters less. We think the opposite is true.
When work was slow and linear, an estimate was usually close enough. A feature took three days, you billed three days, everyone moved on. Now the same feature might take three hours or two days depending entirely on how cleanly the agent handled it, and you often cannot tell which until you are inside it. The variance went up. When variance goes up, guessing gets expensive, and an honest record of where the time actually went becomes the only thing standing between you and a quote that loses you money.
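One way to see the variance argument in your own records is to compare how widely actual hours scatter around their average. The sketch below uses the coefficient of variation (standard deviation divided by mean) as the measure of spread. The hour figures are invented purely for illustration and are not measurements; they assume an eight-hour working day, so "three days" is about 24 hours and "two days" is 16.

```python
from statistics import mean, stdev


def relative_spread(hours):
    """Coefficient of variation: standard deviation as a fraction of the mean."""
    return stdev(hours) / mean(hours)


# Invented example data, in hours per comparable feature.
linear_era = [24, 22, 26]    # roughly "three days", give or take
agent_era = [3, 16, 6, 11]   # anywhere from three hours to two days

print(f"linear era spread: {relative_spread(linear_era):.2f}")
print(f"agent era spread:  {relative_spread(agent_era):.2f}")
```

The larger the spread, the less a single remembered estimate tells you, and the more a quote depends on an actual record of where past time went.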
The teams adapting well are not tracking less. They are tracking with more honesty. They log the direction time and the review time as first-class work instead of pretending the job was just the visible burst of generation in the middle.
A practical way to start this week
You do not need a new philosophy of labor to adjust. You need a slightly different habit.
Start one timer when you begin shaping a task, before you touch the agent. Keep it running through the specification, the generation, and the review, because all of it is the same piece of work. When you look back at the week, you will see that the genuinely hard tasks carried long review tails and the easy ones did not, and that pattern is gold for your next quote.
Then, and this is the part most people skip, write a one-line note on each entry about what the time bought. Not “coded the auth flow” but “spec plus review of generated auth flow, caught a token refresh bug.” A note like that turns a number into a story your client will believe, and it reminds future you what these tasks really cost.
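The habit above can be sketched as a single entry that keeps accumulating time across stops and restarts and carries a one-line note. This is a minimal illustration under our own assumptions, not a description of how any particular product works: TaskEntry and its methods are hypothetical names, and times are passed in as minutes since the start of the day so the example stays deterministic.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class TaskEntry:
    task: str
    note: str = ""
    # Each segment is (start, stop), in minutes since the start of the day.
    segments: List[Tuple[float, float]] = field(default_factory=list)
    _started_at: Optional[float] = None

    def start(self, now: float) -> None:
        """Start or resume the timer; ignored if it is already running."""
        if self._started_at is None:
            self._started_at = now

    def stop(self, now: float) -> None:
        """Stop the timer and record the finished segment."""
        if self._started_at is not None:
            self.segments.append((self._started_at, now))
            self._started_at = None

    def total_minutes(self) -> float:
        """Total time across every segment of this one task."""
        return sum(stop - start for start, stop in self.segments)


entry = TaskEntry("auth flow")
entry.start(0)     # begin shaping the task, before touching the agent
entry.stop(62)     # spec, generation, and review in one run
entry.start(120)   # resumed later: same entry, not a new task
entry.stop(135)
entry.note = "spec plus review of generated auth flow, caught a token refresh bug"

print(entry.total_minutes(), entry.note)
```

Keeping resumed work on the same entry is what lets the long review tails show up when you look back at the week.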
The agentic era did not make developer time worthless. It made the valuable part harder to see. Tracking is how you keep it in view.
If you want a timer that treats a resumed task as one continuous piece of work, and lets you annotate where the hour went, that is exactly what we built TimerStep to do. You can start for free and see if it fits the way you work now.