Writing Blog Posts with Claude Code (Round 2)

I wanted to use Claude Code to help write blog posts for this site. The V-carving post was the first real test — a technical post about building V-carving support in MapleCAM, covering custom libraries, algorithm design, and two months of development. I expected it to take an hour or two. It took all day — about 12 hours wall-clock, 9 hours active — across five sessions.

The post came out well, but the process of getting there was rough. This is what happened and what I changed for next time.

The first attempt

The first session started with Claude spending about 50 minutes on research — launching sub-agents to explore the MapleCAM codebase, read git history, and study existing blog posts. Then it wrote a complete draft in about 2 minutes. I rejected it immediately.

The draft had every AI writing pattern imaginable. "Think of it as the skeleton of a shape." "It's one of those algorithms that's beautiful on a whiteboard and brutal in implementation." "Razor-sharp points at the serifs." Words like "elegant", "beautiful", "deceptively", "catastrophe". It read like a generic tech blog post, not like anything I would write. It was also half the requested length.

When I told Claude to start over with a plan, it entered plan mode and started analyzing the existing blog posts for style. The problem: those posts were themselves AI-written. Claude was trying to match AI voice to AI voice. I had to redirect it to a design document I'd written for a completely different project — network infrastructure, nothing to do with CNC — because it was the only substantial writing sample that was actually mine.

Building a style guide

Using that design document as a reference, Claude produced a style guide that captured my actual voice reasonably well: declarative, factual, first-person, no drama. I then tested it by writing a rough description of the V-carving work in my own words and comparing it to the guide. The guide was too strict — it banned words like "alas" and "nightmare" that I use naturally in casual writing. I also use common idioms like "easy as pie" that the guide had flagged as metaphors.

The fix was recognizing two registers: formal technical documentation (which the design doc represented) and blog posts (which are more casual). The style guide was updated to allow the blog register, and I added a checklist of specific AI writing patterns to watch for — dramatic reveals, rhetorical questions, invented metaphors, teaching voice.

This whole session — style guide creation, testing, revision, rewriting three existing blog posts to match, and documenting a 14-step writing process — took about two hours and produced zero V-carving content. But it was necessary foundation work that won't need to be repeated.

Following the process

The next session followed the 14-step writing process: outline, section ordering, fact gathering, length allocation, drafting. The structure worked — the outline was accepted without major changes, and the first round of fact gathering was approved.

The execution was a different story. I asked for 800-1000 lines. The first set of drafts came back at 273 lines, about a third of the minimum. A complete redraft was needed. Even after that, the result was under 500 lines, and Claude misread my feedback that the drafts were too short as approval to move forward. I had to interrupt three times in two minutes to clarify that we were not done.

When I asked for expansion, Claude tried to pad the existing prose instead of adding real content. When I pushed for more fact gathering to find real material, the sub-agents came back with irrelevant information — they gathered clearing patterns (general pocketing strategies) and presented them as V-carving approaches. An entire section about an earlier version of the software was researched, drafted, and then deleted when I pointed out that V-carving had never actually been implemented in that version.

The session ran out of context twice and produced a 523-line draft that was factually shaky and read like a list of facts rather than a blog post. But the structure was sound, and the facts that survived verification were correct.

Rewriting to blog prose

The next session was the longest — about 5.5 hours wall-clock, 3.5 hours active. The 523-line fact-list draft needed to be completely rewritten section by section into actual blog prose.

This is where the most time was lost. Across 12 sections there were 27 revision rounds. The recurring problems:

Fabricated facts. For the JTS section, Claude invented specific debugging attempts — "reconnecting segments by proximity, merging nearby endpoints" — that had nothing to do with what actually happened. The real fixes (grid snapping, QuadEdge structure, Z-coordinate abuse for boundary tracking) were in the git history, and Claude found them once I pointed out the fabrication. But the default behavior was to fill narrative gaps with plausible-sounding details instead of saying "I don't have this information."

Wrong register. Claude's first attempt at "What is V-carving?" still had formulas and reference-manual style. I had to explain it in plain language — "V-carving is cutting a path with a V-shaped bit, where the deeper it cuts the wider the cut is" — before the rewrite landed in the right register.
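
The formula version wasn't wrong, it was just the wrong register. For the record, the geometry behind that sentence is a one-liner. This is an illustration, not MapleCAM code: a V-bit with included angle θ cutting at depth d leaves a groove 2·d·tan(θ/2) wide.

    import math

    def cut_width(depth, included_angle_deg=90.0):
        """Width of the groove a V-bit leaves at a given depth (illustration only)."""
        return 2.0 * depth * math.tan(math.radians(included_angle_deg) / 2.0)

    cut_width(1.0)        # 2.0: a 90-degree bit cuts twice as wide as it is deep
    cut_width(1.0, 60.0)  # ~1.15: a narrower bit leaves a narrower groove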

Algorithm logic backwards. The Z optimization section described the overcut elimination logic inverted: Claude wrote that the optimizer keeps interpolated Z below actual Z, when it's the opposite. After two rounds of corrections I pasted the complete algorithm description from my own notes and had Claude read the actual source code; the result was still wrong, and I made the fixes myself. This single section cost about 32 minutes.
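
For concreteness, the constraint Claude kept inverting is easy to state. This sketch is an illustration of the idea rather than the MapleCAM optimizer, and it assumes the usual convention that lower Z means deeper: when intermediate toolpath points are dropped, the Z interpolated between the kept endpoints must never fall below the actual Z at any skipped point, or the bit overcuts.

    def can_drop_intermediates(points):
        """points: [(distance_along_path, actual_z), ...]; only the endpoints are kept."""
        (d0, z0), (d1, z1) = points[0], points[-1]
        for d, z_actual in points[1:-1]:
            t = (d - d0) / (d1 - d0)
            z_interp = z0 + t * (z1 - z0)
            if z_interp < z_actual:  # interpolated Z dips below actual Z: overcut
                return False
        return True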

Over-compression. I gave Claude detailed facts about lib-polygon — the Martinez-Rueda failure, the Vatti rewrite, the zigzag clipping issues. The first draft compressed all of it into a few short sentences. When Claude tried to research more details via sub-agents, each git command needed individual approval, which was its own friction.

For two sections — lib-medial-axis and parts of the Z optimization — I ended up writing the content myself because it was faster than continuing to iterate. Those sections needed the fewest corrections afterward.

What Claude is good at

Claude was genuinely good at structural and editorial work. The flow review at the end of the process identified six real issues — repetition between sections, typos, a garbled sentence, a bad section title — and suggested clean fixes for all of them. The end-to-end review caught capitalization inconsistencies and a possessive error. The style guide verification found four places where the guide itself needed updating.

Claude was also good at polishing rough drafts. The lib-medial-axis section that I drafted needed only one round of additions (filtering and radius interpolation paragraphs) and the style matched well. The introduction rewrite — trimming a long intro down to two focused paragraphs — was done in one shot.

Selecting and placing SVG images, reorganizing sections, and handling the mechanical parts of the writing process (committing, pushing, file management) all went smoothly.

What Claude is bad at

Claude cannot reliably generate accurate technical narrative from incomplete information. When it doesn't have enough facts to write a section, it fabricates — and the result looks correct on first read, requires domain expertise to catch, and takes much longer to fix than writing from scratch.

Claude also defaults to the wrong register for blog content. Even with a style guide explicitly describing two registers, the first drafts consistently came out too formal, too technical, or occasionally too dramatic. Multiple sections needed me to provide a plain-language explanation before Claude could write in the right voice.

Sub-agent quality was inconsistent. Agents wrote too short, gathered irrelevant facts, and fabricated numbers. The orchestrating Claude instance also had comprehension failures — misreading user feedback, confusing minimum and maximum line targets, and trying to advance to the next step when the current step wasn't done.

What we changed

The biggest process change: I write rough drafts for each section, and Claude polishes them to match the style guide. This is the opposite of the original step 5, where Claude drafted from gathered facts. The sections I wrote myself during the V-carving post went fastest and needed the fewest corrections; that pattern should be the default.

For sections describing algorithms or technical implementations, Claude now reads the relevant source code before writing. The Z optimization disaster came from writing about an algorithm based on facts alone, without understanding the actual logic.

The process documentation now explicitly says: never fabricate. If facts are missing, flag the gap. Say "I don't have information about what was tried here" rather than inventing debugging attempts. This was the single most damaging pattern across all sessions.

Other changes: smaller sessions (3-4 sections max to avoid context exhaustion), line targets that are explicitly minimums, not ceilings, and sub-agent output that gets verified against actual source files before being incorporated into drafts.

The foundation work — style guide, writing process, rewriting existing posts — took about 3 hours and won't need to be repeated. For the next post, I'd expect the active time to be closer to 2-3 hours rather than 9.

Writing this post

This post was that next post — written the same day, immediately after the retrospective. It went differently in almost every way.

The subject matter helped. The V-carving post required deep technical knowledge about algorithms, library internals, and two months of development history. This post is about a process I just went through, with all the facts fresh and all the session logs available. There was no domain expertise gap for Claude to fill with fabrications.

I wrote rough drafts for each section, covering what happened and what mattered. Claude polished them to match the style guide and incorporated specific details from the session logs — timing, examples, the particular AI writing patterns from the rejected first draft. The polishing step worked well. Most sections needed only minor adjustments, and the ones that needed more were cases where I'd been too brief in the rough draft rather than cases where Claude got things wrong.

The updated process caught problems that would have been expensive in the first round. Creating the blog post file with section headings before drafting meant both of us had a place to work. The style guide was already tuned from the V-carving post's verification step, so register issues were rare. The "never fabricate" rule meant Claude flagged gaps instead of filling them; a gap came up once, and the fix was a single sentence rather than a full rewrite cycle.

The whole post took less than an hour of active work. That's better than what I predicted at the end of the V-carving retrospective, and a fraction of the time the first post took. Some of that improvement is the foundation work paying off, some is the process changes, and some is just that writing about a process is easier than writing about algorithms.