on the AI front - devs take 19% longer w/AI

by HumanRobot, Cybertron, (273 days ago)

https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/


Core Result

When developers are allowed to use AI tools, they take 19% longer to complete issues—a significant slowdown that goes against developer beliefs and expert forecasts. This gap between perception and reality is striking: developers expected AI to speed them up by 24%, and even after experiencing the slowdown, they still believed AI had sped them up by 20%.

Note that this test was run on tickets or tasks that developers anticipated taking about 2 hours apiece. So it wouldn't cover AI helping a front-end developer trying to do something novel in the back end.
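
To put the percentages on a concrete footing, here is a rough back-of-the-envelope sketch for a single task of that size. It assumes a 120-minute no-AI baseline (the "about 2 hours" above) and assumes "sped up by X%" means X% less time; the function names are just for illustration and none of this is code from the study.

```python
# Rough arithmetic only. The 120-minute baseline is the ~2 hour task size
# mentioned above, not a figure taken from the study's data.
BASELINE_MIN = 120.0


def minutes_if_slower(baseline: float, pct_longer: float) -> float:
    """Task time when it takes pct_longer percent more time than baseline."""
    return round(baseline * (1 + pct_longer / 100), 1)


def minutes_if_faster(baseline: float, pct_less_time: float) -> float:
    """Task time, ASSUMING 'sped up by X%' means X% less time than baseline."""
    return round(baseline * (1 - pct_less_time / 100), 1)


measured = minutes_if_slower(BASELINE_MIN, 19)  # measured result: 19% longer
forecast = minutes_if_faster(BASELINE_MIN, 24)  # what devs expected beforehand
believed = minutes_if_faster(BASELINE_MIN, 20)  # what devs believed afterwards

print(f"measured with AI: {measured} min")
print(f"forecast with AI: {forecast} min")
print(f"believed with AI: {believed} min")
print(f"gap, measured vs believed: {round(measured - believed, 1)} min")
```

Under those assumptions, a 2-hour task lands at roughly 143 minutes with AI, against a forecast of about 91 and an after-the-fact belief of 96.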

--
I'm a bad take machine!

What do you make of the claimed success of things like

by Jason93, Raining debris all over Europe, (273 days ago) @ HumanRobot

Gauntlet AI

It's a 12-week intensive AI coding bootcamp. If you are invited and complete the course, you are guaranteed a $200K job with one of the sponsoring companies.

It was created by the guy behind BloomTech, which sought to disrupt the traditional college track for developers/coders/etc.

They publish videos nightly of that day's builds, and the results are pretty impressive, on the surface at least. It's more difficult to say if the apps are really enterprise-ready.

I wonder if the 19% degradation reflects lack of training on how best to use the AI tools, at least in part.

--
I think the children like it when I "get down" verbally.

I love how they use a testimonial from a guy

by Bryan (IrishCavan), Howth Castle and Environs, (273 days ago) @ Jason93

working over 90 hours a week because he loves it. Dude, you're working the equivalent of 2 full-time jobs. No thanks.

No kidding - two $100K jobs

by Jason93, Raining debris all over Europe, (273 days ago) @ Bryan (IrishCavan)

Not quite the same value prop.

--
I think the children like it when I "get down" verbally.

I wonder where they found Roger

by HumanRobot, Cybertron, (273 days ago) @ Jason93

After using AI for a while as a developer, my feeling is

  • It's really good in the initial scoping, design, and R&D phases. Initial implementations are pretty good as well -- how to lay out a set of functions or classes that form the scaffolding for your project. This is stuff that architects and developers could sink a month into. A single, good developer could use AI here and come out with an overall design that's probably as good or better in a couple hours.
  • Very good at basic/intermediate coding. Saves a good amount of time here.
  • It seems good at the QA/unit testing phase, but it lets a surprising number of things through.
  • It's good at basic debugging -- "this error message/exception at this line means you need to fix your code this way".
  • On more complex debugging it gets into cycles and thrashes. I've had a number of instances where I've spent 2-8 hours and it only ends because of human intervention/observation. That's a "me problem" to some degree, but there's a very real skill to knowing when you have to step on it pretty hard to get it pointed in the right direction.

--
I'm a bad take machine!

I agree w this

by Mark, O Town, (273 days ago) @ HumanRobot

Your observations are in line with what I've seen.

I expect improvement in this space. It will take time, but these solutions will evolve and slowly get better at assessing some of the more complex issues.

I just hope that in the meantime product people put out better guidelines describing the risks and best practices around these limitations.

--
"2020 ... Let's win it all ..."

Or, devs take 19% longer than they estimated.

by Tim, Chicago, IL, (273 days ago) @ HumanRobot

News at 11.

Half joking. I will be interested in reading this.

not quite

by HumanRobot, Cybertron, (273 days ago) @ Tim

The 19% is based on a with-AI vs. without-AI comparison, not on the developers' estimates.

It's a really interesting read.

--
I'm a bad take machine!

I'm pretty surprised.

by domer.mq, (273 days ago) @ HumanRobot

Though it somewhat aligns with my suspicion that AI makes building prototypes of new products very fast, while it might only marginally help with fixing bugs or building features in existing products.

I have been vocal on here about my disdain for AI, particularly in domains where accuracy matters. But I have also managed to gin up prototypes for features to show my engineers, "This is how it should work" incredibly quickly. Still, the prototypes adhere to practically no rules of engineering we have for code that gets into our product, so the team can't just drop it in. Usually they just start over from scratch entirely. And the AI just simply refuses to follow the rules and guidelines we have for product engineering, so we don't use it on actual production code.

Still, I'd have guessed there would be at least a trace improvement in velocity.

Still don't want it answering any questions that can't immediately be validated with linting and compiling though.

--
Sometimes I rhyme slow sometimes I rhyme quick.

That is how we use it.

by Tim, Chicago, IL, (273 days ago) @ domer.mq

Incredibly helpful in the sales process as well as validating features and the underlying technology that will enable them. Production code it is not.

Which LLM do you prefer for coding, and why?

by nedhead, (272 days ago) @ HumanRobot

- No text -

4o

by HumanRobot, Cybertron, (272 days ago) @ nedhead

I think it has the best agents built around it.

--
I'm a bad take machine!

The Claude Code and Gemini CLI agents have really upped...

by domer.mq, (272 days ago) @ nedhead

...the game.

It's pretty nuts what you can prototype with a plan, code, test pattern. I usually burn thru Claude's limits and then hit Gemini (which currently has no limits, but doesn't work quite so well).

--
Sometimes I rhyme slow sometimes I rhyme quick.

Thanks, appreciate it!

by nedhead, (271 days ago) @ nedhead

- No text -