28 Comments
Adam:

I guess this is a polarizing take, but I’m just shocked at all these people saying that it makes coding fun. If anything, I have far less fun using Claude Code. My job changes from thinking deeply about the craft to reviewing and correcting AI slop that’s like 80% right.

I definitely agree that the Claude web UI is super useful as an enhanced Google search (e.g., “give me an overview of the Linux cgroups API”), or for generating one-off data analysis scripts (“build a histogram of request durations from this CSV”). But for real work in large existing repos with Claude Code, I’m not so sure. I tried using it for a medium-sized project at work recently, and I’m fairly convinced that I spent more time cajoling it than I would have spent writing it de novo myself.
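(For illustration, a one-off script of the kind described above might look like the following: a minimal sketch, assuming a hypothetical CSV with a `duration_ms` column; the names and bucket width are made up, not from the comment.)

```python
# One-off analysis: bucket request durations from a CSV and print
# a text histogram. Assumes a "duration_ms" column (hypothetical).
import csv
from collections import Counter

def duration_histogram(path, bucket_ms=100):
    """Count requests per duration bucket (bucket width in ms)."""
    buckets = Counter()
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            bucket = int(float(row["duration_ms"]) // bucket_ms) * bucket_ms
            buckets[bucket] += 1
    return buckets

def print_histogram(buckets, width=40):
    """Render each bucket as a row of '#' scaled to the largest count."""
    peak = max(buckets.values())
    for bucket in sorted(buckets):
        bar = "#" * max(1, buckets[bucket] * width // peak)
        print(f"{bucket:>6}ms | {bar} ({buckets[bucket]})")
```

This is exactly the throwaway, context-free kind of task where a chat UI shines: the whole spec fits in one sentence.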

Adam:

It’s also just hard to take seriously the rhetoric that this truly provides 10x productivity boosts. Like, does anyone truly believe this? I get that the CEOs have to say it to shill their product. But at best the bottleneck shifts from writing code to reviewing code, which is arguably a lot harder to do for code you didn’t write yourself (and when it’s not your colleague who wrote it, but an LLM, whose ability to explain itself is questionable).

Like if the LLM was a full brain that accumulated context for years then sure, we’ve achieved the dream of capital replacing labor, and the AI is effectively just another coworker. Great. But until we reach that point, some person needs to ultimately be responsible for every line of code, regardless of how it’s generated.

In other words, if you claim a 10x productivity boost, then one of two things must necessarily be true:

1. You can robustly review (and internalize understanding/responsibility for) Claude-generated code 10x faster than you could write code of similar quality

OR

2. You’re shipping (effectively) unreviewed AI generated code into production

Benn Stancil:

Yeah, I agree with you that, if you're reviewing everything, it doesn't speed you up all that much. So my sense is that it's overwhelmingly #2 - people just kinda scan things, at best.

I do have much more mixed feelings on whether that's actually bad, though. On one hand, for sure; as you say, it still needs to work and you're responsible for it if it breaks. So just yolo-ing it doesn't seem right. On the other hand, does the review have to be a *code* review? That seems less obvious to me. The point of the review is to know that it works, and if you can test that it works in other ways, is that necessarily bad? Of course, it "working" needs to mean more than some cursory happy-path button clicks, but theoretically, I don't think you necessarily need to care what the code itself says. And I'd guess that's where the world goes - if we have machineses that can write a whole bunch of code for us, we'll figure out ways to make sure that code is reliable without requiring us to read it.
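(A minimal sketch of what a non-code review could look like: black-box property checks run against a function you never read. `slugify` here is a stand-in for hypothetical AI-generated code; all names are illustrative, not from the thread.)

```python
# "Review" by behavior, not by reading: assert properties that must
# hold for any valid output, treating the implementation as opaque.
import re

def slugify(text):
    # Pretend this body was AI-generated and never read by a human.
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

def review_by_properties(fn):
    """Black-box checks: output format, idempotence, basic cases."""
    samples = ["Hello, World!", "  spaced out  ", "already-a-slug"]
    for s in samples:
        out = fn(s)
        # Output is always lowercase hyphen-separated tokens.
        assert re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", out), out
        # Idempotent: slugifying a slug is a no-op.
        assert fn(out) == out
    return True
```

The point isn't that this replaces careful review everywhere; it's that checks like these scale with generated code in a way that line-by-line reading doesn't.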

Marco Roy:

Maybe we need a new way to do code reviews. Instead of doing it the old fashioned way of reading through the code, gathering context, and trying to understand it, we need to leverage AI in the review process just as much as in the code generation process (or get as close as possible).

This might mean "pairing" with an AI as we go through the code bit by bit (ideally voice-to-voice, but dictation is a tolerable alternative), instead of reading a sloppy PR novel written by AI (and trying to relate the novel to the code). Like basically replicate the process of having the writer of the code explain their PR "line by line", but doing that with AI instead of a person.

Otherwise, code reviews are becoming way too painful, and the process doesn't scale when code can be produced 10x faster than it can be reviewed. I imagine news editors have been feeling this pain for a while too: more news stories are being generated than ever, so how can they possibly keep up? They need their own AI to assist with fact checking, quote checking, etc.

Actually, anyone who gets this right will probably strike it rich (until the big players catch up).

The trickiest part would just be to sync the AI to what I'm currently looking at on my screen, so that they can understand the context of my questions (otherwise, posting PR questions/comments all over the place would be a very painful experience). And then if it can address PR comments right then and there, code reviews just became even faster than before (because humans rarely address PR comments instantly).

@Benn: And there's another experience that AI does not understand (the pain humans are going through during that process). For it, reviewing a giant PR and posting 250 comments is the easiest thing in the world. Innovation comes from creatively solving human problems, but how can AI solve these problems (without human assistance) if it does not understand them?

Benn Stancil:

Yeah, I definitely agree that there almost have to be new ways to do code reviews, because the realistic alternative is just "nobody does code reviews anymore." Though on the point about AI not understanding the pain of reviewing some giant thing, I think that's a big part of why how we write code could change so much. People hate that, and they need the code to be expressed in all sorts of ways that make it easy for us to understand. But AI "understands" things in different ways (i.e., it can read much faster than we can, but probably "reasons" about abstractions in way worse ways), so the sorts of codebases that it can quickly read are probably pretty different from the ones we can read.

Trena Christensen:

I don’t find it fun at all. At first, yes it made everything easy. I could spend 15 mins on something that used to take an hour or two.

But now I’m back to burning that same hour or two, intensely reviewing and fixing the outputs. AI-generated content looks great on the outside, but the inside is empty.

Having to fix workslop all day is more mentally draining than just producing the product myself.

Benn Stancil:

Some other folks brought this up as well, and yeah, cleaning up a mess is definitely less fun than taking a bit longer to build the thing without the mess in the first place. But, that said, I'd be very surprised if people didn't figure out ways to make the cleaning up part a lot easier and faster (or less necessary). And even so, it still seems worth paying attention to why the first part - which is theoretically very productive - feels so compelling to people.

Adam:

Jeremy Howard has a great take on this from a recent interview, which is effectively: despite all the impressive post-training innovation, Claude is still fundamentally just modeling a distribution, and when you take it outside of that distribution (by getting hyper-specific, for example), the illusion of understanding and intelligence disappears very quickly.

I can attest to this firsthand, as my day job requires a lot of Linux and OS knowledge on which Claude frequently fails. At the risk of coming off like a programming snob: yes, if you are generating a generic React front end or FastAPI back end that’s been done 1,000 times before, then Claude will be extremely good and save you a ton of time. But in my view, this is glorified boilerplate to begin with and not real software engineering.

I think it’s good that as an industry we’re returning to a focus on real CS principles - compilers and language design, systems and low-latency programming, etc. - and it’s great that people don’t have to get deep into the weeds of some bullshit Python APIs anymore. But the lack of nuance in the mainstream discourse just bothers me.

Benn Stancil:

I'm sure there's some version of the Gell-Mann amnesia effect with all of this, where people don't trust it within their domain, but think it's pretty useful and intelligent outside of it. (Though I could also see that going the other way, where, within our areas of expertise, we get all bent out of shape about the details and think they matter way more than they do.)

8Lee:

As a long-time software engineer, I'm having the best fucking time of my life rn. Don't really care about qq's from anyone else.

Benn Stancil:

That's what seems weird to me, that so many people are saying that, and nobody's really quite stepped back and said, "how can we do this - as in, create that *feeling* - everywhere else?"

Irina Malkova:

Ok is *everyone* obsessed like that, or just the 10x ones?

Because I feel like it’s a loop: you get obsessed, you 10x, nine colleagues go, and now you have 10x more work.

Benn Stancil:

Yeah, that's kind of the point that Steve Yegge makes here, and I think it's a reasonable point. https://steve-yegge.medium.com/the-ai-vampire-eda6e4f07163 But if the amount of effort the one person is putting in nets out the same, but now they really like it, that still seems...good? And even if employers end up abusing the productivity gains like this, it still seems like we should be trying to replicate what it is that made the work so much more fun before the fun got sucked out of it?

Irina Malkova:

I have two theories:

1. Maslow was right that self-actualization is the highest human need. Expressing yourself simply feels this good - we just never had a chance to feel it, with all the boring typing.

2. These 10x folks are neurodivergent, and we're observing a classic ND "hyperfixation" pattern. Alex Karp is with me on this one.

If I'm right and it's one of these - or both - then the feeling may not go away at all.

Benn Stancil:

I'd probably lean towards the first thing? The interesting question to me, which came up in another comment, is basically this:

Is Claude Code compelling because you *get* a thing at the end of using it, or is it because you get to *make* a thing? I'd guess it's much more of the latter - that the appeal is the creating and the expressing, not that you then have a thing.

Irina Malkova:

I’m with you 100%. We actually ship fewer things now, but they’re f-ing excellent. I’ve always wanted to be a Hermès of what I do, but execution used to get in the way. Not any more!

David Jayatillake:

I don’t feel like I have to be using it all the time though. I have intense bursts where I am building with it, and everything that is possible to build I can build at that time, but then I have increasingly long quiet spells where I don’t have something new to build until I do…

Benn Stancil:

Yeah, I doubt the addiction part is universal. But even for people who use it intermittently, I haven’t heard of many people who don’t think it’s usually fun. Which is kind of the question for me: How do we make more things like that?

David Jayatillake:

It’s pretty much always fun except when it gets things wrong, which has become rare since November 30th.

For me the joy is the power it gives you. I can build almost anything I can conceive of within the laws of computer science.

I can’t think of a future product beyond robotic help (which will just be AI extended into the real world), that will give us this kind of power.

Benn Stancil:

Or, you can at least build sketches of it. And that feels like a big part of what makes it so compelling - it's less that you *get* a thing and more that you get to *review* a thing. Having anything you want is nice, but it's not all that fun. Getting to see a new thing and change it and play around with it - that part seems like the fun part.

Drew Harry:

Gambling still holds for me as the cleanest way to understand this feeling. The combination of variable reward, endlessly tune-able configuration, input sensitivity, self-improvement loops, and long processing times is highly effective at creating this feeling of "work is fun." There's always a new way to tinker with it and increase your odds of winning. But it's never a sure thing.

These dark patterns are well understood. I don't think anyone's using them maliciously (here), but it's an implication of non-deterministic systems that really hooks people.

Benn Stancil:

Yeah, a few people have pointed this out, often in a way that ties it back to the same addiction feeling. But I guess my question there is, is that ok? Can it be useful? Like, if you create those patterns in something where the end result is a desirable thing, as opposed to yoloing your money away at a craps table, is that...good?

Tristan Handy:

We are seeing this ourselves internally. In fact, after a recent hackathon, someone literally observed, "my job is fun again!" It was neat to see you post this in the immediate wake of that experience.

Of course that's not determinative; there are lots of people having lots of experiences and who in the actual fuck knows where all of this is headed, but it also made me experience the same shred of "huh, that's noteworthy" that you seem to be expressing in this post.

Of course so many people are so shell-shocked and cynical today that it is challenging for all of us collectively to will something truly positive into existence. If all of this *does* end up being good (from a humanism perspective) it'll likely have to be in spite of us, not because of us :)

Benn Stancil:

I can't exactly pretend to be an optimist about most things, but yeah, that last point really does seem striking to me. There's so much anxiety about things, and talking about how things will fail gets you more points on the internet (who would ever do that?), so it seems like we've forgotten how to notice when there's something potentially good happening. So far, this seems like one of those things, as long as we can keep it.

Vincent Balalian:

It's the passive aspect that hooked me. Being able to get dopamine hits while being passively productive... it's like doom scrolling for good.

Benn Stancil:

I've thought about this part a lot. Like, the current fad is to run a million agents and have an IMAX of monitors for conducting all of them. But I think what I want is just the slow background agent that churns along and moves things forward while I'm not looking. I don't want to maximize productivity; I want to be productive without having to work at it very hard.

DJ:

During the early dot-com era, Wired ran a piece along the lines of “those exciting New Economy jobs are still just jobs.” They profiled a customer service worker at Amazon who had to use a weird, terminal-based CRM app, and who was slowly realizing that the vaunted tech skills he had learned would only ever be useful at Amazon.

Benn Stancil:

That's true, though I think that's part of what makes this kind of thing hard to get our heads around. Those jobs are still jobs, and these jobs that engineers like are still jobs, for sure. But if you like the job...that's ok? Like, all the AI utopia stuff talks about a world without jobs, but why not a world where the jobs are *good* jobs?