diff --git a/_people/tiefenauer.md b/_people/tiefenauer.md new file mode 100644 index 0000000..de63d7c --- /dev/null +++ b/_people/tiefenauer.md @@ -0,0 +1,13 @@ +--- +name: 'tiefenauer' +firstName: 'Daniel' +description: 'Daniel Tiefenauer is a full stack software engineer at Karakun AG, living in Switzerland. He is passionate about quality, sharing knowledge and building great products.' +lastName: 'Tiefenauer' +github: 'tiefenauer' +mail: 'daniel.tiefenauer@karakun.com' +avatar: 'tiefenauer' +header: + image: tiefenauer + text: Daniel Tiefenauer + x: left +--- diff --git a/_posts/2026-07-01-sdd.md b/_posts/2026-07-01-sdd.md new file mode 100644 index 0000000..3216d98 --- /dev/null +++ b/_posts/2026-07-01-sdd.md @@ -0,0 +1,154 @@ +--- +layout: post +title: "Spec-Driven Development: Why Unguided AI Coding Is a Trap" +seo_title: "Spec-Driven Development for Guided AI Coding" +description: "Learn how Spec-Driven Development provides guardrails for AI-assisted coding, reducing architecture drift, improving traceability, and maintaining software quality." +authors: [ 'tiefenauer' ] +featuredImage: 'sdd' +excerpt: "AI agents can generate code faster than ever, but speed alone does not produce maintainable software. Without explicit specifications, AI-driven development risks architecture drift, hidden intent, and growing technical debt. Spec-Driven Development offers a different approach: treating requirements as versioned guardrails that keep generated code aligned with long-term goals." +permalink: '/2026/07/01/using-a-ferrari-to-deliver-pizza.html' +categories: [ Development, AI, SDD, Spec-Driven Development ] +header: + text: "AI Agents Need Guardrails: A Case for Spec-Driven Development" + image: 'post' +--- + +Using AI agents without specifications is like using a Ferrari to deliver pizza. +The problem isn't the vehicle—it's the lack of direction. 
+While AI can generate code at unprecedented speed, larger systems quickly suffer from architecture drift, hidden assumptions, and technical debt when implementation becomes detached from intent. +Spec-Driven Development (SDD) addresses this by making specifications the primary source of truth and treating code as an executable representation of requirements. + +--- + +## Table Of Contents + +* [The New Era of AI-Assisted Development](#The-New-Era-of-AI-Assisted-Development) +* + +--- + +When was the last time you used Google to research a non-trivial problem? +And when you did: when was the last time you actually clicked a link in the results list? + +## The New Era of AI-Assisted Development + +We’re living in exciting times. AI tools are omnipresent and have already had an impact on many aspects of our lives. Tools like Claude, Gemini and ChatGPT have become household names and are used in one way or another by almost everybody. I’m usually careful with terms like “disruptive” or “revolutionary” because they’re used so liberally these days. However, given the advances in AI, I think those labels are justified - not only because of the speed at which these tools are being adopted by the mainstream (which is impressive), but also because of the breadth of problems they can address. Unlike other technological breakthroughs in the past such as GPS, AI didn’t trickle down over years from military or government use to everyday users. It almost feels as though AI tools became part of our daily routine overnight. It’s remarkable how quickly AI has changed what people take for granted - things that would have seemed impossible just a few years ago. + +People working in tech - especially in software engineering - seem to be most affected by these changes. With AI agents, almost anyone can produce code now. The product manager has an idea for a new feature? With the help of a favorite AI tool, they could implement it, add it to the codebase, and deploy it to production.
The UX designer spots a bug? They can fix it themselves with a few prompts. No tickets needed. No meetings. No waiting for approval. The time between inception and working code has become incredibly short. + +Everyone’s a builder now. Everyone feels empowered. Everyone is active. Everyone is moving - at least, that seems to be the prevailing narrative. But where does that leave Software Engineering? + +Exciting times, indeed! + +## Software Engineering is dead - long live Software Engineering! + +For decades, turning ideas into running software has been a multi-step, multi-disciplinary process. + +- A Requirements Engineer would try to understand the problem and derive requirements. +- A Developer would pick them up, write (and hopefully test) the code, and produce a working application. +- Later in the process, someone would verify that what was produced satisfies the original requirements. + +Sometimes those roles are carried out by the same person. Sometimes the process is iterative, with small increments. But because it spans many disciplines, it tends to create friction and handoff overhead. There’s a constant trade-off between business needs, building the right thing, and building it right. The different actors also often have different incentives: + +- a program manager might just want to “get it out the door quickly” +- a Software Engineer wants to produce software that is maintainable +- a QA Engineer is mainly worried about the software doing the right thing +- operations is mainly focused on keeping the software running stably + +At the heart of this process was the code. Code took center stage. Writing it used to be a craft reserved for a group of people calling themselves Software Engineers, Developers, Programmers or similar. It is a hard-earned skill, built through training and years of experience. Non-functional requirements make it even harder: code needs to be readable, maintainable, secure, and tested.
As a result, writing code manually - especially high-quality code - takes a lot of time and often becomes the bottleneck between inception and running software. + +But that’s starting to shift. Today, AI agents are everywhere. They’re great: you tell them what to do, and they do it. Using an AI agent doesn’t require specialized technical skills. And as long as you’re satisfied with getting *some* result, it doesn’t even require you to understand the output. AI agents can produce code at a rate no human could match. + +## With great power comes great responsibility + +Keeping an AI agent in the loop is a common pattern these days, even for experienced software engineers. It’s temptingly simple: you describe what you want and let the AI handle the implementation. If the result isn’t satisfactory, you tweak the prompt until it is. + +![signpost.png](/assets/posts/2026-07-01-sdd/signpost.png) + +But that’s a very naive way to use AI. It can work well for creating small, self-contained pieces of code - like a function or a class - where a human still orchestrates the process. But it doesn’t scale to generating larger parts of an application. It forces you to add more and more context in the hope of getting usable output. Fingers crossed your prompts don’t contradict each other and that the resulting application is architecturally sound. + +It’s a bit like pulling the lever on a slot machine: sometimes you get what you want, sometimes you don’t. If you’re lucky - and willing to put in the effort to review what was generated - you might even understand the code that comes out. + +Since LLMs are inherently non-deterministic, there’s a high risk that your prompt may change the system in ways you didn’t anticipate. Going in all directions at once can feel actionable and energizing, but in reality it goes nowhere. By “vibing it,” it’s easy to go down the rabbit hole until you no longer know where you started. 
You’re bound to end up with a system whose inner workings you don’t fully understand anymore; you’re also in no position to assess its technical debt (which is always non-zero, even with manual coding). The only way to make changes then is to prompt the AI again, hoping it doesn’t break anything. + +Using an AI agent without guardrails for larger coding tasks is throwing tokens at problems. You stitch together generated pieces of code (each of which may make sense on its own) and hope the result still behaves as expected. AI can be a powerful tool, but used carelessly it can backfire. + +It’s like using a military-grade GPS satellite system to navigate a city by saying “take me somewhere nice,” then complaining when you end up at a bus depot. + +Or like the Ferrari mentioned at the beginning, used to deliver pizza: the machine is capable of 0–100 in 3 seconds, but you're stuck crawling through residential streets, making U-turns, and ringing the wrong doorbells. + +![ferrari.png](/assets/posts/2026-07-01-sdd/ferrari.png) + +Code follows intent. Whether typed by a human or generated by an AI, every line is ultimately an expression of that underlying goal. While code answers the question of *how* something is implemented, intent answers *why* it was implemented in the first place. Unfortunately, that intent is often lost over time, making it impossible to infer the original requirement just by looking at the code. Decisions made, shortcuts taken, and alternatives considered often remain implicit. Understanding them at a later point usually requires tribal knowledge. And no matter how hard you try, as the code is updated over time, it inevitably pulls further away from the original intent - a phenomenon known as “drift”. + +Many paradigms have attempted to address this dilemma: + +- Architecture decision records (ADRs) capture the reasoning behind architectural decisions. +- Documentation attempts to explain the non-obvious logic.
+- Agile shortens feedback loops to course-correct development. +- DevOps automates infrastructure to ensure stability and resilience. +- TDD builds quality and requirements in from the start. +- Version control guarantees that every change remains revertible. + +All of these concepts address the symptoms, but none of them solved the underlying problem: code evolves independently of its specification. AI will not help with this. Workflows built for manual coding are simply no longer compatible with a world of agents spewing out more and more code faster than any human could. Moving from AI-*augmented* coding to AI-*driven* coding - where productivity gains are measurable - requires structure, long-term vision, and architectural guidance. Without all of that, the whole endeavor is bound to fail. It’s not only inefficient; it’s a recipe for disaster. + +## Spec-Driven Development to the rescue + +This is where Spec-Driven Development (SDD) comes in. The term has gained momentum over the past year, driven by the explosive adoption of AI. As the name suggests, SDD inverts the power structure by treating requirements as the single source of truth for code - not the other way around. In fact, SDD treats requirements like code: they are made explicit through well-structured text, enriched with references, and placed under version control. + +AI can use those specifications (specs) as context to keep code generation consistent, even over longer periods of time. Each line of generated code can be traced back to a concrete requirement. Thus, code can be seen as an executable specification. In fact, having explicit, traceable specs as guardrails is what makes codebases with a high degree of generated code feasible in the first place. Because specification and implementation are simply different expressions of the same requirement, the intent can’t be lost, the rationale behind the code can’t vanish into oblivion, and documentation can’t drift out of sync.
Just as you wouldn’t throw away your code once it’s been compiled into an executable artifact, you don’t throw away the specs; you version them alongside the code they specify. + +Solving a problem - whether with code or not - requires first understanding the problem at hand. Or as Charles Kettering, the former head of research at General Motors, famously put it: + +> A problem well stated is a problem half solved. +> + +What’s new is that instead of iterating on pieces of code until they do what they’re supposed to do, SDD iterates on the specification until it is clear, sharp, and unambiguous enough to serve as a foundation against which the application can be verified. That way, SDD provides structure and guardrails to the process of generating code. Instead of mindlessly generating code until it (hopefully) does what it’s supposed to do, SDD ensures that no line of code contradicts the original intent. Code is no longer treated as an artifact of its own, but rather as an executable specification. Change is viewed as the driver of development rather than a disruption. Because the code is directly derived from the specification, there’s no way code can diverge from it, and no way the code documentation can get outdated. Because each line of code can be traced directly back to a requirement, there’s no way the reasons for certain decisions can disappear into oblivion. + +The concept of SDD itself is nothing new: there have always been specifications that code was checked against. What’s new is that with the advent of AI agents doing the bulk of the work of writing code, checking against the original requirement is no longer optional; it becomes crucial. Guardrails are essential to keep this new AI superpower on track. + +## The SDD workflow + +Tools like GitHub's Spec Kit or Amazon's Kiro facilitate building software with SDD, offering useful features such as Git integration, automatic feature numbering, semantic branch naming, and more.
A typical developer workflow with SDD looks roughly like this: + +- **Constitution**: Before implementing or generating anything, you lay the foundation for the code. The constitution contains a set of core principles - non-negotiables like having tests for each feature, writing readable and maintainable code, maintaining a consistent UX at all times, not guessing, making assumptions explicit, and so on. +- **Specification**: SDD is specification-first. Before any code is written, each new feature starts with a specification. The developer describes the desired outcome in natural language. The AI uses that prompt - together with the existing application as context - to generate a specification, also in natural language. It produces a set of Markdown files that spell out functional and non-functional requirements, success criteria, assumptions, and so on. The focus at this stage is the “why”, not the “how”, so technical details should be left out. +- **Clarification**: Optionally, the developer can use the AI to challenge the specification by asking clarifying questions. AI is used as a sparring partner here, with the goal of resolving as much ambiguity as possible. There may be back-and-forth between specification and clarification until the spec becomes more stable. Again, technical details should remain out of scope at this stage to enable high-quality specs. +- **Plan**: Once a stable specification exists, the AI can be used to generate a detailed implementation plan. This is where business requirements are converted into architectural and implementation details; technical aspects - like which framework to use - can be decided here. Success criteria can be stipulated using a checklist, if needed. +- **Tasks**: After a plan exists, it can be broken down into manageable chunks that can be used by the AI to generate code.
Tasks can also be used to define which steps can be carried out in parallel by an AI. +- **Implement**: As a last step, the requirement can be implemented in code. The preliminary phases should have provided enough structure and guidance for the LLM to produce good-quality code that does what is intended without breaking the existing implementation or contradicting the existing requirements. A test-driven approach with a focus on integration testing (which could be made part of the constitution) may help ensure the code can also be refactored without breaking things. + +It’s crucial to review the LLM’s output after each step, whether it’s Markdown files or code. Don't treat this as a fixed sequence. The process is iterative, meaning you can go back to earlier steps at any time. For example, if you notice the specification is still missing an important aspect, you can return to the Specification phase and amend it. + +All output produced this way becomes part of the context for further implementation. Having both the LLM’s rationale and its produced code reviewed and put under version control ensures no detail is forgotten, the reasoning behind decisions is explicit and reusable for each iteration, and the documentation stays up to date. + +## SDD vs waterfall + +Having specs as the single source of truth may lead to the assumption that SDD is going back to the waterfall process. However, there are some fundamental differences. + +![waterfall.jpg](/assets/posts/2026-07-01-sdd/waterfall.jpg) + +In Waterfall, the specification phase was done by *humans* (analysts, architects) and took weeks or months. Because it was so expensive to produce, the spec became a quasi-legal contract - changing it mid-project was painful, politically charged, and costly. So teams tried to get it *perfect* upfront, which is essentially impossible for complex software. The rigidity that was meant to ensure quality became the source of failure. + +Cheap specs change the economics.
Waterfall's rigidity was a *rational response* to expensive spec production. If writing a spec costs 3 months of analyst time, you'd better not change it. But if an AI can regenerate or revise a spec in seconds, the whole calculus flips - you *want* to iterate on it cheaply rather than get it perfect upfront. + +SDD is actually closer in spirit to **Agile** than to Waterfall: you still work incrementally and respond to feedback. The difference from plain Agile is that you're generating *more* structure and documentation than typical Agile teams bother with, but without the cost that made that documentation prohibitive. However, unlike in Agile, iteration now happens on a single developer’s machine within minutes to hours, whereas until now iterations took days or weeks and involved the whole team. This comes at a cost: the “learnings” shared across iterations were always one of the great side effects of the approach. Now those learnings never get shared across the team. This is cognitive debt in action. At some point, the bill will come due. + +One honest similarity, where SDD *does* share Waterfall's risk: if you rubber-stamp the AI-generated spec without critically reading it, you can end up with the same problem - implementing the wrong thing very efficiently. The discipline of actually reviewing and challenging the spec is still on you. The tool makes good process cheap; it doesn't enforce it. + +## What to take from it + +In my experience, most software engineers enjoy their craft. Writing code is a creative process, closely tied to job satisfaction. Many have entered the field with a passion for it. But with the rise of AI, that passion will suffer, as we will probably be writing a lot less code manually in the months and years to come. Looking at job ads, it’s not uncommon to see expectations of “70%+ of code being generated”. Job responsibilities will shift, the focus on what matters will change, and priorities will be reshuffled.
Processes will have to adapt in lockstep: more code also means more code reviews, more testing, more frequent deployments, more monitoring, more security issues, more rollbacks, more bugs to detect, and more liability. + +“Traditional” software engineering virtues like TDD, clean code, and iterative procedures are valuable and will remain so in the future. Some may even become more important. Writing good, maintainable code still requires a lot of skill, regardless of whether it’s generated by AI or written by a human. However, the role these virtues play - and how they are practiced in software engineering - will change fast and fundamentally. + +Training an LLM usually involves extensive data wrangling to obtain high-quality datasets for training and evaluation. Starting with low-quality data limits a model’s capabilities from the outset - the phrase “garbage in, garbage out” is well known. The same applies to using specs as the single source of truth in AI-assisted coding, since AI agents are fundamentally LLMs augmented with tools. + +By making coding almost instantaneous, AI seems to have removed the bottleneck. However, as any seasoned engineer will tell you: there’s no such thing as a free lunch. Reading many posts on networks like LinkedIn, you’ll find different predictions about where the future of software engineering is heading. There seem to be two extremes: + +- The zoomers who say it won’t be long until plain English is recognized as a programming language, making software engineering obsolete +- The doomers who fear that AI will do nothing but create a huge mess in the years to come, eventually increasing the need for experienced software engineers + +There’s a lot of uncertainty about the future of the industry. Since no one can predict the future, it’s safe to assume the reality will land somewhere in between.
A lot of AI-in-production stories are still less than six months old, so there isn’t much hands-on experience with this new world yet. The same is true for SDD. Since it’s a relatively new concept, we lack long-term experience, and it’s hard to say whether SDD is the right way to use AI or just the next buzzword. What is clear, however, is that even if the daily routine hasn’t changed much yet from what it was five years ago, it won’t stay that way forever. Change has always been part of the DNA of software engineering, but given recent advances in AI, it’s fair to ask what that means for us as software engineers. [Skill erosion is real](https://www.reddit.com/r/cscareerquestions/comments/1tkccsi/my_senior_engineers_have_stopped_thinking_for/). AI agents are here to stay, and preventing people from using them is not the right solution. It’s not about being for or against AI or SDD. But if we use this powerful tool, we should at least use it properly. + +On the upside, writing code was never truly the job of a software engineer; understanding the problem and solving it is. Sometimes the right solution is code. Knowing where you are and which direction you want to go has always been an essential skill. AI is a powerful tool that can amplify your abilities - so long as you still feel responsible for its output. AI multiplies your skills, but that goes both ways: if you treat quality, maintainability, testing, and architecture as an afterthought, AI will multiply that.
\ No newline at end of file diff --git a/assets/avatars/tiefenauer.png b/assets/avatars/tiefenauer.png new file mode 100644 index 0000000..c2f77cf Binary files /dev/null and b/assets/avatars/tiefenauer.png differ diff --git a/assets/featuredImages/sdd.png b/assets/featuredImages/sdd.png new file mode 100644 index 0000000..1af6e9e Binary files /dev/null and b/assets/featuredImages/sdd.png differ diff --git a/assets/posts/2026-07-01-sdd/ferrari.png b/assets/posts/2026-07-01-sdd/ferrari.png new file mode 100644 index 0000000..2037e0d Binary files /dev/null and b/assets/posts/2026-07-01-sdd/ferrari.png differ diff --git a/assets/posts/2026-07-01-sdd/signpost.png b/assets/posts/2026-07-01-sdd/signpost.png new file mode 100644 index 0000000..b847d38 Binary files /dev/null and b/assets/posts/2026-07-01-sdd/signpost.png differ diff --git a/assets/posts/2026-07-01-sdd/waterfall.jpg b/assets/posts/2026-07-01-sdd/waterfall.jpg new file mode 100644 index 0000000..8269ef5 Binary files /dev/null and b/assets/posts/2026-07-01-sdd/waterfall.jpg differ diff --git a/docker-compose.yml b/docker-compose.yml index 35ad0d5..ed882fb 100644 --- a/docker-compose.yml +++ b/docker-compose.yml @@ -2,7 +2,7 @@ version: '2.4' services: jekyll: - image: bretfisher/jekyll-serve + image: bretfisher/jekyll-serve:stable-20250915-2119a31 command: [ "bundle", "exec", "jekyll", "serve", "--config", "_config.yml,_config_dev.yml", "--force_polling", "-H", "0.0.0.0", "-P", "4000", "--future", "--drafts" ] volumes: - .:/site