• 0 Posts
  • 7 Comments
Joined 3 years ago
Cake day: August 15th, 2023




  • The training is sophisticated, but inference is, unfortunately, really just a text prediction machine. Technically it predicts tokens, but you get the idea.

    It runs once for every single token/word. You input your system prompt, context, and user input, then the output starts.

    The

    Feed the entire context back in and add the reply “The” at the end.

    The capital

    Feed everything in again with “The capital”

    The capital of

    Feed everything in again…

    The capital of Austria

    It literally works like that, which sounds crazy :)
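
    The loop above can be sketched in a few lines. This is only a toy under stated assumptions: `predict_next_token` is a hypothetical stand-in that replays a canned reply, where a real model would compute a probability distribution over its vocabulary, and the `<eos>` stop marker is also made up for the illustration.

```python
from typing import List

# Hypothetical prompt and canned reply, used only for this illustration.
PROMPT = ["What", "is", "the", "capital", "of", "Austria?"]
CANNED_REPLY = ["The", "capital", "of", "Austria", "is", "Vienna.", "<eos>"]


def predict_next_token(context: List[str]) -> str:
    """Toy stand-in for a model: sees the whole context, returns one token."""
    already_generated = len(context) - len(PROMPT)
    return CANNED_REPLY[already_generated]


def generate(prompt: List[str], max_tokens: int = 20) -> List[str]:
    """Produce a reply one token at a time, feeding each token back in."""
    context = list(prompt)
    reply: List[str] = []
    for _ in range(max_tokens):
        token = predict_next_token(context)  # the full context goes in every step
        if token == "<eos>":                 # assumed end-of-sequence marker
            break
        reply.append(token)
        context.append(token)                # the new token becomes part of the input
    return reply


print(" ".join(generate(PROMPT)))  # prints: The capital of Austria is Vienna.
```

    Real inference engines cache intermediate results so they don't recompute the whole context each step, but logically it is the same loop.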

    The only control you have as a user is the sampling settings, like temperature, top-k and so on. But those only adjust how random or deterministic the token choice is.
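
    To make temperature and top-k concrete, here is a minimal sketch of one sampling step. The scores in `LOGITS` are invented for the example, and `sample_token` is a hypothetical helper, not any particular library's API.

```python
import math
import random
from typing import Dict, Optional

# Made-up scores for the token that follows "The capital of".
LOGITS = {"Austria": 3.0, "Australia": 2.0, "the": 0.5, "banana": -1.0}


def sample_token(
    logits: Dict[str, float],
    temperature: float = 1.0,
    top_k: int = 0,
    rng: Optional[random.Random] = None,
) -> str:
    """Pick one token from the scores using temperature and top-k."""
    rng = rng or random.Random()
    ranked = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)
    if top_k > 0:
        ranked = ranked[:top_k]          # top-k: drop everything but the k best
    if temperature <= 0:
        return ranked[0][0]              # treat temperature 0 as greedy: always the best token
    # Temperature divides the scores: low values sharpen, high values flatten.
    scaled = [score / temperature for _, score in ranked]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]  # unnormalized softmax
    tokens = [tok for tok, _ in ranked]
    return rng.choices(tokens, weights=weights, k=1)[0]


print(sample_token(LOGITS, temperature=0))             # always "Austria"
print(sample_token(LOGITS, temperature=1.5, top_k=2))  # "Austria" or "Australia"
```

    Neither setting changes what the model knows. They only change how the next token is picked from the scores it already produced.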

    Edit: I should add that tool and subagent use makes this approach a bit more powerful nowadays. But it all boils down to text prediction again. Even the tools are described to the model in plain text, including what they are for.
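
    As a rough illustration of "tools are just text": the sketch below renders a tool definition into the prompt. The tool `get_weather`, the layout of the prompt, and the `build_prompt` helper are all hypothetical. Real systems each use their own format, but the principle is the same.

```python
import json
from typing import Dict, List

# Hypothetical tool definition, invented for this example.
TOOLS: List[Dict] = [
    {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {"city": "string"},
    }
]


def build_prompt(system: str, tools: List[Dict], user: str) -> str:
    """Render the tool list as plain text and place it in the context."""
    tool_lines = "\n".join(
        f"- {t['name']}: {t['description']} Arguments: {json.dumps(t['parameters'])}"
        for t in tools
    )
    return (
        f"{system}\n\n"
        f"You can call these tools:\n{tool_lines}\n\n"
        f"User: {user}\nAssistant:"
    )


print(build_prompt("You are a helpful assistant.", TOOLS, "Is it raining in Vienna?"))
```

    The model then predicts text that happens to look like a tool call, the surrounding software runs the tool, and the result is pasted back into the context as more text.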



  • You might genuinely be using it wrong.

    At work we have a big push to use Claude, but as a tool and not a developer replacement. And it’s working pretty damn well when properly set up.

    Mostly using Claude Sonnet 4.6 with Claude Code. It’s important to run /init and check the output. That produces a CLAUDE.md file that describes your project and always gets added to your context.

    Important: Review everything the AI writes; this is not a hands-off process. For bigger changes, use the planning mode and split tasks up. The smaller the task, the better the output.

    Claude Code automatically uses subagents to fetch information, e.g. API documentation. Nowadays it’s extremely rare that it hallucinates something that doesn’t exist. It might use outdated info and need a nudge, like after the recent upgrade to .NET 10, but just adding that info to the project context file is enough.