Prompt Engineering AI Development SPARC Production Systems Roo Code

I Wrote 3,000 Lines of Prompts So AI Could Ship Production Code

~1 min read | by Ethan Cornwill

I didn’t build an AI coding framework.

I built something arguably harder: a comprehensive prompt library that turns an existing AI tool (Roo Code) into a production-grade development system with security controls, automated testing, and deterministic output.

3,000+ lines of carefully architected prompts. 12 specialized agent modes. Mandatory reasoning protocols. Security-first design.

Here’s what I learned about prompt engineering at scale, and why most developers are doing it wrong.

The Problem With AI Coding Tools

Every AI coding assistant has the same fundamental issues:

Inconsistent output:

  • Same prompt, different results
  • No systematic approach
  • Quality varies wildly
  • Hallucinations are common

No security controls:

  • Hardcoded API keys in generated code
  • No PII scanning
  • Missing input validation
  • Zero security review

Missing testing:

  • Code looks good, breaks in production
  • No systematic test coverage
  • Edge cases ignored
  • Error handling is TODO

Poor architecture:

  • No performance considerations
  • No Big O analysis
  • Technical debt from day one
  • Impossible to maintain

Most developers accept this as “the cost of using AI.” I decided to fix it with better prompts.

The Solution: Systematic Prompt Architecture

I use Roo Code, an agentic AI coding tool with a “custom modes” system. Think of it as programmable AI agents, each with specialized roles.

Instead of writing ad-hoc prompts for each task, I built SPARC: a comprehensive prompt library implementing a formal software development methodology.

S - Specification & Pseudocode
P - Planning & Architecture
A - Auto-Coding & Testing
R - Refinement & Review
C - Completion & Versioning

Not just a workflow. A complete prompt architecture with 12 specialized agents, each with explicit role definitions, reasoning protocols, and security mandates.

How Prompt Architecture Actually Works

Most people write prompts like this:

"Build me a login system with JWT auth"

Then they wonder why the output is garbage.

Here’s what a production-grade prompt actually looks like:

Example: The ‘code’ Agent

Role Definition:

You are a Senior Production Engineer specializing in Test-Driven
Development (TDD). Your core mandate is to implement the minimum
necessary code to transform failing tests into passing tests. You
MUST strictly adhere to the functional requirements (pseudocode,
Big O analysis) from the 'architect' agent and the design tokens
from the 'ui-ux-interpreter' agent, prioritizing secure and
modular implementation.

Custom Instructions (Abbreviated):

[ROLE & CONTEXT]
Act as a highly disciplined senior software engineer specialized
in the {PROJECT_STACK}. You are implementing the
{MAIN_FEATURE_BRANCH_NAME} feature. Your input specifications,
pseudocode, architectural constraints, design guidelines, and the
complete set of failing tests are provided within
<SPECIFICATION_INPUT> tags.

[REASONING & GUARDRAILS - Chain-of-Thought using <THINKING> tags]
Before generating ANY code, you MUST encapsulate your preparatory
work within <THINKING></THINKING> tags, strictly following these steps:

1. DECOMPOSE & TEST ANALYSIS: Analyze the specific failed assertions.
   Decompose the required implementation into minimal, logical,
   modular functions.

2. ADHERENCE CHECK: Verify strict adherence to UI/UX design tokens
   and architectural constraints (data model, Big O complexity).

3. SECURITY & SECRETS REVIEW: Explicitly check for handling secrets.
   NEVER hardcode secrets (a security mandate).

4. EXECUTION PLAN: Outline the sequence in which code blocks will
   be generated to pass the tests.

[STYLE & CONSTRAINTS]
Use low creativity decoding parameters (Temperature ≤ 0.3, Top-P ≤ 0.5)
to ensure logical consistency and accurate syntax.

⚠️ SECURITY MANDATE: NEVER STAGE, COMMIT, or CREATE files named
.env or any files containing hardcoded secrets.

Why this works:

  1. Explicit role → AI understands its exact responsibility
  2. Mandatory reasoning → AI must show its work via <THINKING> tags
  3. Security mandates → Built-in guardrails prevent vulnerabilities
  4. Temperature controls → More repeatable output, not creative guessing
  5. Context injection → All necessary info provided in structured tags

This isn’t a prompt. It’s a software specification for AI behavior.

The 12 Specialized Agents

Each agent has a narrow, well-defined purpose:

1. 🎨 UI/UX Interpreter

  • Generates design tokens (colors, typography, spacing)
  • Creates component guidelines
  • Enforces accessibility standards
  • Output: design_tokens.json + design_guidelines.md

2. 📋 Specification Writer

  • Translates requirements into formal pseudocode
  • Defines data structures and algorithms
  • Maps logical flows and error handling
  • Output: algorithm_specification.md

3. 🏗️ Architect

  • Designs system architecture from pseudocode
  • Calculates Big O complexity for critical algorithms
  • Defines data models and integration points
  • Reviews for security vulnerabilities
  • Output: architecture_blueprint.md + system_diagram.mmd

4. 🧪 Tester (TDD)

  • Writes failing unit and integration tests
  • Defines test cases (happy path, edge cases, errors)
  • Establishes requirements for code agent
  • Output: Test files that initially fail (Red phase)

5. 🧠 Auto-Coder

  • Implements code to pass failing tests
  • Adheres to architectural constraints
  • Follows design tokens and Big O limits
  • Includes comprehensive type hints and docstrings
  • Output: Application code (Green phase)

6. 🪲 Debugger

  • Analyzes failing test outputs and error logs
  • Performs root cause analysis (RCA)
  • Implements minimal fixes to resolve issues
  • Output: Bug fixes + root_cause_analysis.md

7. 🛡️ Security Reviewer

  • Audits code for vulnerabilities
  • Scans for hardcoded secrets and PII exposure
  • Reviews for SQL injection, XSS, CSRF risks
  • Checks for algorithmic bias
  • Output: security_audit_report.md

8. 🔗 System Integrator

  • Verifies component communication
  • Runs end-to-end integration tests
  • Validates data contracts between services
  • Output: Integration fixes + integration_report.md

9. 🧹 Optimizer

  • Refactors code for performance
  • Improves Big O complexity where possible
  • Enhances code clarity and maintainability
  • Output: Optimized code + optimization_report.md

10. 📚 Documentation Writer

  • Generates comprehensive project documentation
  • Creates README, API guides, architecture docs
  • Documents all decisions and trade-offs
  • Output: Markdown documentation files

11. 📦 Version Manager

  • Calculates semantic version bumps (PATCH/MINOR/MAJOR)
  • Updates package.json deterministically
  • Enforces versioning rules
  • Output: Updated package.json

12. 🌳 Git Expert

  • Handles ALL version control operations
  • Executes commits, pushes, PR creation
  • Enforces security mandate (never commits .env files)
  • Output: Git operations + confirmation

Plus the orchestrator:

⚡️ SPARC Orchestrator

  • Manages the entire workflow
  • Delegates to specialist agents in sequence
  • Validates outputs between phases
  • Enforces TDD loop (Red → Green → Refactor)
  • Uses Chain-of-Thought planning via sequential_thinking
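The delegation order can be sketched as a plain pipeline. This is only an illustration under assumptions: `Agent`, `runPipeline`, and the `validate` callback are hypothetical names, not Roo Code's API, and the stub agents just annotate their input instead of calling a model.

```typescript
// Hypothetical sketch: run specialist agents in order and stop at the
// first output that fails validation. Not Roo Code's actual API.
type AgentResult = { agent: string; output: string };

interface Agent {
  name: string;
  run(input: string): string; // a real agent would call a model here
}

function runPipeline(
  agents: Agent[],
  objective: string,
  validate: (result: AgentResult) => boolean,
): AgentResult[] {
  const results: AgentResult[] = [];
  let input = objective;
  for (const agent of agents) {
    const result: AgentResult = { agent: agent.name, output: agent.run(input) };
    if (!validate(result)) {
      throw new Error(`Validation failed after '${agent.name}'`);
    }
    results.push(result);
    input = result.output; // each phase builds on the previous one
  }
  return results;
}

// Stub agents that only annotate their input, to show the hand-off order.
const stub = (name: string): Agent => ({
  name,
  run: (input) => `${input} -> ${name}`,
});

const phases = ["spec-pseudocode", "architect", "tdd", "code"].map(stub);
const trace = runPipeline(phases, "objective", (r) => r.output.length > 0);
console.log(trace[trace.length - 1]?.output);
// prints "objective -> spec-pseudocode -> architect -> tdd -> code"
```

Validating between phases means a bad specification stops the run before any code is generated from it.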

The Systematic Approach: How It Actually Works

When you give SPARC an objective like “Build a user authentication system,” here’s what happens:

Phase 1: Specification (S)

spec-pseudocode agent activates:

<THINKING>
1. DECOMPOSE & DEFINE:
   - User registration flow
   - Login/logout flow
   - Password reset flow
   - Session management

2. DATA STRUCTURES:
   - User model (id, email, password_hash, created_at)
   - Session model (token, user_id, expires_at)

3. ETHICAL & LOGIC REVIEW:
   - No bias in user filtering
   - PII handling follows GDPR

4. FLOW CONSTRUCTION:
   - Registration: validate → hash → store → send verification
   - Login: validate → verify password → create session
   - Error handling for each step
</THINKING>

Output: algorithm_specification.md

The agent writes formal pseudocode defining every logical step, data structure, and error condition.

Phase 2: Planning & Architecture (P)

architect agent activates:

<THINKING>
1. STEP-BACK ABSTRACTION:
   - Scalability: Support 10k concurrent users
   - Security: Bcrypt hashing, JWT tokens, rate limiting
   - Fault tolerance: Database connection pooling

2. LOGICAL DECOMPOSITION:
   - Auth Service (handles login/registration)
   - Session Service (manages tokens)
   - PostgreSQL database

3. PERFORMANCE ANALYSIS:
   - Password hash check: O(1) with bcrypt
   - User lookup by email: O(log n) with a B-tree index (near-constant in practice)
   - Session validation: O(1) with token cache

4. SECURITY REVIEW:
   - Secrets in environment variables
   - Rate limiting: 5 attempts per IP per minute
   - SQL injection prevention: Parameterized queries
   - No PII in logs

5. DATA MODEL:
   Users table: id, email, password_hash, verified, created_at
   Sessions table: token, user_id, expires_at, created_at
</THINKING>

Output: architecture_blueprint.md + system_diagram.mmd

The architect designs the complete system with performance estimates and security controls.
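To make one of those security controls concrete, here is a minimal in-memory sketch of the "5 attempts per IP per minute" rule. The `RateLimiter` class is hypothetical and keeps state in process memory; a production system would more likely use a shared store.

```typescript
// Sliding-window limiter: at most `limit` recorded attempts per key
// within the last `windowMs` milliseconds. Illustrative only.
class RateLimiter {
  private readonly attempts = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit = 5, windowMs = 60000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the attempt if it is allowed.
  // Blocked attempts are not recorded.
  allow(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    const recent = (this.attempts.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.attempts.set(key, recent);
      return false;
    }
    recent.push(now);
    this.attempts.set(key, recent);
    return true;
  }
}
```

Passing `now` explicitly keeps the limiter testable without waiting for real time to pass.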

Phase 3: Auto-Coding & Testing (A) - The TDD Loop

tdd agent writes failing tests:

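// src/tests/auth.test.ts
// describe/it/expect are test-runner globals (Jest-style).
import { registerUser } from "../auth/register";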

describe("User Registration", () => {
  it("should create user with valid email and password", async () => {
    const result = await registerUser("test@example.com", "SecurePass123!");
    expect(result.success).toBe(true);
    expect(result.user.email).toBe("test@example.com");
  });

  it("should reject weak passwords", async () => {
    const result = await registerUser("test@example.com", "weak");
    expect(result.success).toBe(false);
    expect(result.error).toContain("Password too weak");
  });

  it("should prevent duplicate email registration", async () => {
    await registerUser("test@example.com", "SecurePass123!");
    const result = await registerUser("test@example.com", "SecurePass456!");
    expect(result.success).toBe(false);
    expect(result.error).toContain("Email already exists");
  });
});

Tests run. They fail (Red phase). This is expected.

code agent implements to pass tests:

<THINKING>
1. DECOMPOSE & TEST ANALYSIS:
   Failed assertions:
   - registerUser function doesn't exist
   - Password validation not implemented
   - Duplicate email check missing

2. ADHERENCE CHECK:
   - Use bcrypt (from architecture spec)
   - Password must be 12+ chars, mixed case, symbols
   - PostgreSQL with parameterized queries

3. SECURITY REVIEW:
   - No hardcoded secrets
   - Use env variables for DB connection
   - Bcrypt cost factor 12 (from spec)

4. EXECUTION PLAN:
   - Create User model
   - Implement password validation
   - Implement registerUser function
   - Add duplicate email check
</THINKING>

// Implementation code here...

Tests run again. If they pass (Green phase), move forward. If they fail, debug agent activates.
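As a sketch of one piece the code agent would need, the password rule from the adherence check (12+ characters, mixed case, symbols) might look like this. `validatePassword` and its result shape are hypothetical; only the rule and the "Password too weak" message come from the examples above.

```typescript
// Hypothetical helper for the "reject weak passwords" test.
type ValidationResult =
  | { success: true }
  | { success: false; error: string };

function validatePassword(password: string): ValidationResult {
  const longEnough = password.length >= 12;
  const mixedCase = /[a-z]/.test(password) && /[A-Z]/.test(password);
  const hasSymbol = /[^A-Za-z0-9]/.test(password);
  if (longEnough && mixedCase && hasSymbol) {
    return { success: true };
  }
  return { success: false, error: "Password too weak" };
}
```

A `registerUser` implementation could call this first and return early, before any hashing or database work.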

debug agent fixes failures:

<THINKING>
1. ANALYSIS & RCA:
   Error log shows: "bcrypt is not defined"
   Root cause: Missing import statement
   Location: src/auth/register.ts, line 15

2. CONSTRAINT CHECK:
   Fix doesn't violate architectural constraints
   Still adheres to Big O limits

3. SECURITY REVIEW:
   Fix doesn't introduce new vulnerabilities

4. EXECUTION PLAN:
   Add: import bcrypt from 'bcrypt';
   Re-run tests to confirm
</THINKING>

Output: Bug fix + root_cause_analysis.md

This loop continues until all tests pass.

Phase 4: Refinement & Review (R)

security-review agent audits everything:

<THINKING>
1. SCOPE & THREAT MODELING:
   User input points: email, password
   External calls: Database queries
   Data persistence: PostgreSQL

2. SECRET & PII AUDIT:
   ✓ No hardcoded secrets found
   ✓ No .env file in codebase
   ✓ PII (email) properly handled

3. VULNERABILITY SCAN:
   ✓ SQL injection: Using parameterized queries
   ✓ Password security: Bcrypt with cost 12
   ✓ Rate limiting: Implemented
   ⚠ Missing CSRF protection for session endpoints

4. REMEDIATION PLAN:
   Add CSRF token validation to session creation
</THINKING>

Output: security_audit_report.md with findings
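The secret audit step can be approximated with a line-by-line pattern scan. This is a rough sketch, not the security-review agent's actual logic: the rule names and patterns are assumptions, and a pattern scan will miss secrets that dedicated scanners catch.

```typescript
// Illustrative hardcoded-secret scan. Rules and names are assumptions.
type Finding = { line: number; rule: string };

const SECRET_RULES: { rule: string; pattern: RegExp }[] = [
  { rule: "aws-access-key-id", pattern: /\bAKIA[0-9A-Z]{16}\b/ },
  { rule: "private-key-block", pattern: /-----BEGIN [A-Z ]*PRIVATE KEY-----/ },
  {
    // A secret-like name assigned a quoted literal of 8+ characters.
    rule: "assigned-secret-literal",
    pattern: /\b(api[_-]?key|secret|password|token)\b\s*[:=]\s*["'][^"']{8,}["']/i,
  },
];

// Returns one finding per (line, rule) match, with 1-based line numbers.
function scanForSecrets(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, index) => {
    for (const { rule, pattern } of SECRET_RULES) {
      if (pattern.test(text)) findings.push({ line: index + 1, rule });
    }
  });
  return findings;
}
```

Values read from `process.env` do not match the literal rule, which is the behavior the security mandate asks for.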

integration agent tests cross-component flows:

<THINKING>
1. ANALYSIS & THREAT IDENTIFICATION:
   Integration points:
   - Frontend → Auth API
   - Auth Service → Database
   - Auth Service → Session Service

2. TEST PLAN GENERATION:
   E2E test: Complete registration + login flow

3. ACTION & OBSERVATION:
   Running integration tests...
   [Test output shows session creation failing]

4. REMEDIATION:
   Session Service not receiving user_id correctly
   Fix: Update data contract in Auth Service
</THINKING>

Output: Integration fixes + integration_report.md

refinement-optimization-mode improves performance:

<THINKING>
1. BOTTLENECK IDENTIFICATION:
   Password verification is slow and runs on every request
   Current: Bcrypt verify on every request
   Big O: O(1) but high constant factor

2. REFACTOR PLAN:
   Add session token caching (Redis)
   Only verify password on login, not every request
   Reduces load by 95%

3. SECURITY REVIEW:
   Session cache doesn't expose passwords
   Cache expiry matches session expiry
</THINKING>

Output: Optimized code + optimization_report.md
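The caching idea in that refactor plan can be illustrated with an in-memory stand-in for Redis. `SessionCache` is a hypothetical class; the point is that request-time validation becomes a map lookup plus an expiry check, with no password hashing involved.

```typescript
// In-memory stand-in for a Redis session cache. Illustrative only.
type Session = { userId: string; expiresAt: number };

class SessionCache {
  private readonly sessions = new Map<string, Session>();

  // Called once at login, after the password has been verified.
  set(token: string, session: Session): void {
    this.sessions.set(token, session);
  }

  // Returns the user id for a live session, or null if missing or expired.
  validate(token: string, now: number = Date.now()): string | null {
    const session = this.sessions.get(token);
    if (!session) return null;
    if (session.expiresAt <= now) {
      this.sessions.delete(token); // cache expiry matches session expiry
      return null;
    }
    return session.userId;
  }
}
```

Only the token and user id are cached, so the cache never holds a password or its hash.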

Phase 5: Completion & Versioning (C)

docs-writer generates documentation:

# User Authentication System

## Overview

JWT-based authentication with bcrypt password hashing, PostgreSQL storage,
and Redis session caching.

## Architecture

[Details from architecture_blueprint.md]

## API Endpoints

POST /auth/register - Create new user account
POST /auth/login - Authenticate and create session
POST /auth/logout - Invalidate session
POST /auth/reset-password - Initiate password reset

## Security Controls

- Bcrypt password hashing (cost factor 12)
- Rate limiting (5 attempts/IP/minute)
- CSRF protection on session endpoints
- SQL injection prevention via parameterized queries

## Performance

- Login: O(log n) with indexed database lookup
- Session validation: O(1) with Redis cache
- Handles 10k concurrent users

version-manager calculates version bump:

<THINKING>
1. ANALYZE IMPACT:
   New feature: User authentication system
   Backwards-compatible: Yes (new endpoints, no breaking changes)

2. ASSESS NEED:
   This is a new feature, version bump required

3. DETERMINE LEVEL:
   MINOR bump (new backwards-compatible functionality)

4. CALCULATE NEW VERSION:
   Current: 1.2.3
   New: 1.3.0 (MINOR bump resets PATCH to 0)
</THINKING>

Output: package.json updated to v1.3.0

git-expert commits everything:

<THINKING>
1. DECOMPOSE & MAP:
   - Check current branch
   - Stage modified files
   - Commit with descriptive message
   - Push to remote

2. SECURITY AUDIT:
   File list review:
   ✓ src/auth/register.ts
   ✓ src/auth/login.ts
   ✓ src/tests/auth.test.ts
   ✓ docs/authentication.md
   ✓ package.json
   ✓ No .env files detected

3. EXECUTION PLAN:
   git add [files]
   git commit -m "feat: Add user authentication system"
   git push origin feature/auth-system
</THINKING>

Output: Committed and pushed
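The file-list audit in that plan is easy to back with a deterministic check in addition to the prompt. A minimal sketch, assuming a hypothetical `findForbiddenFiles` helper that receives the staged paths; treating `.env.*` variants as forbidden is a conservative assumption here, not something the mandate spells out.

```typescript
// Returns the staged paths that the security mandate forbids committing.
// Matches files named ".env" and, as an assumption, ".env.*" variants
// (which also blocks files such as ".env.example").
function findForbiddenFiles(stagedPaths: string[]): string[] {
  return stagedPaths.filter((path) => {
    const fileName = path.split("/").pop() ?? "";
    return fileName === ".env" || fileName.startsWith(".env.");
  });
}

const staged = ["src/auth/register.ts", ".env", "config/.env.local", "docs/env.md"];
console.log(findForbiddenFiles(staged));
// prints [ '.env', 'config/.env.local' ]
```

A non-empty result would mean aborting the commit and reporting the violation.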

What This Actually Achieves

This isn’t about coding faster. It’s about systematic quality control.

Before SPARC:

  • ❌ Inconsistent code quality
  • ❌ Missing tests
  • ❌ Security vulnerabilities
  • ❌ No performance tracking
  • ❌ Poor documentation
  • ❌ Manual version management

After SPARC:

  • ✅ Consistent architecture patterns
  • ✅ Mandatory test coverage (TDD enforced)
  • ✅ Security review on every feature
  • ✅ Big O analysis documented
  • ✅ Complete documentation generated
  • ✅ Automated semantic versioning

Real Results:

Fusion Party Infrastructure:

  • Built complete Astro + Sanity CMS platform
  • Compliance systems and process automation
  • All following SPARC methodology
  • Production-ready, no post-launch bugs

MagnetLab Client Projects:

  • RAG pipelines with Pinecone + LangChain
  • AI agent systems for lead qualification
  • White-label CRM solutions
  • All delivered on time, all still running

This Portfolio:

  • 52 iterations tracked in Git
  • Every component tested
  • Security reviewed
  • Performance optimized
  • You’re looking at it right now

Lessons From 3,000 Lines of Prompts

1. Temperature Settings Matter More Than You Think

Most developers never adjust temperature settings. This is a mistake.

For creative tasks (naming, UX copy, marketing):

  • Temperature: 0.7-1.0
  • Top-P: 0.9-1.0
  • Want variety and creativity

For code generation (implementation, debugging):

  • Temperature: 0.1-0.3
  • Top-P: 0.3-0.5
  • Want deterministic, reliable output

For structured data (JSON, version numbers):

  • Temperature: ≤0.1
  • Top-P: ≤0.3
  • Need perfect accuracy

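In SPARC, every agent’s prompt states its temperature requirements: the code agent uses ≤0.3, the version manager ≤0.1. Prompt text cannot change sampling by itself, so the same values need to be set in the model or API configuration to take effect.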

Result: Consistent, predictable output instead of random variations.
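One way to keep these settings in one place is a preset table applied to every request. A minimal sketch: the `TaskKind` names, the request shape, and the exact values (picked from within the ranges above) are assumptions, not part of any specific API.

```typescript
// Decoding presets per task type. Values are picked from the ranges
// listed above; the request shape is hypothetical.
type DecodingParams = { temperature: number; topP: number };
type TaskKind = "creative" | "code" | "structured";

const DECODING_PRESETS: Record<TaskKind, DecodingParams> = {
  creative: { temperature: 0.8, topP: 0.95 },
  code: { temperature: 0.2, topP: 0.4 },
  structured: { temperature: 0.1, topP: 0.3 },
};

// Attaches the preset for the given task kind to a prompt.
function buildRequest(
  kind: TaskKind,
  prompt: string,
): { prompt: string } & DecodingParams {
  return { prompt, ...DECODING_PRESETS[kind] };
}
```

Because `TaskKind` is a closed union, adding a new agent type without a preset fails at compile time.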

2. Reasoning Transparency Reduces Hallucinations

The single most effective technique I discovered: mandatory reasoning tags.

Every SPARC agent must use <THINKING> tags before generating output:

<THINKING>
1. Analyze the requirements
2. Check security constraints
3. Plan the solution
4. Validate against specs
</THINKING>

[Then generate actual output]

Why this works:

  • AI shows its work → You can verify logic
  • Catches errors early → Bad reasoning visible before bad code
  • Auditable trail → Know why decisions were made
  • Reduces hallucinations → AI is less likely to skip steps or make things up

Without reasoning tags, AI might generate code that “looks right” but violates requirements. With reasoning tags, you see the flawed logic before it becomes flawed code.
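"Mandatory" only holds if something checks for the tags. A small sketch of such a check, assuming a hypothetical `parseAgentResponse` helper that rejects any response that does not open with a non-empty <THINKING> block:

```typescript
// Splits an agent response into its reasoning and its output, and
// rejects responses that skip the reasoning step. Illustrative only.
type ParsedResponse = { thinking: string; output: string };

function parseAgentResponse(raw: string): ParsedResponse {
  const match = raw.match(/^\s*<THINKING>([\s\S]*?)<\/THINKING>([\s\S]*)$/);
  if (!match) {
    throw new Error("Response rejected: missing leading <THINKING> block");
  }
  const thinking = (match[1] ?? "").trim();
  const output = (match[2] ?? "").trim();
  if (thinking.length === 0) {
    throw new Error("Response rejected: empty <THINKING> block");
  }
  return { thinking, output };
}
```

Keeping the reasoning as a separate field also makes it easy to log for the audit trail.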

3. Security Can’t Be Bolted On

Security must be in every agent’s prompt, not just a final review step.

spec-pseudocode agent:

  • Checks requirements for PII handling
  • Identifies sensitive data flows
  • Plans secure data storage

architect agent:

  • Reviews architecture for vulnerabilities
  • Defines authentication/authorization boundaries
  • Plans secret management strategy

code agent:

  • SECURITY MANDATE: NEVER hardcode secrets
  • Must use environment variables
  • Cannot create .env files

security-review agent:

  • Final audit for vulnerabilities
  • Scans for hardcoded secrets
  • Checks for SQL injection, XSS, CSRF

git-expert agent:

  • Security audit before every commit
  • Refuses to commit .env files
  • Reports violations immediately

Result: Security violations caught at every phase, not just at the end.

4. Specificity Beats Generality

Bad prompts are vague. Good prompts are hyper-specific.

Bad prompt:

"Write good code"

Better prompt:

"Implement the authentication system using the pseudocode specification"

SPARC prompt:

"Generate code that passes the failing tests in <SPECIFICATION_INPUT>,
adheres to the Big O constraints (O(log n) for user lookup) from the
architect agent, uses design tokens from ui-ux-interpreter, includes
comprehensive type hints and docstrings, uses bcrypt with cost factor 12,
implements rate limiting at 5 attempts per IP per minute, and stores
secrets in environment variables. Use Temperature ≤ 0.3 for deterministic
output."

The SPARC prompt provides:

  • Exact requirements (pass failing tests)
  • Performance constraints (Big O limits)
  • Design requirements (design tokens)
  • Code quality standards (type hints, docstrings)
  • Security requirements (bcrypt cost, rate limits, secret storage)
  • Decoding parameters (Temperature ≤ 0.3)

No ambiguity. No guessing. Just requirements.

5. Agent Specialization Works

One generalist AI agent trying to do everything = mediocre at everything.

Twelve specialized agents, each expert in one thing = excellent at their specific task.

Why specialization works:

Narrow scope = Better prompts:

  • Each agent has one clear job
  • Prompts can be highly specific
  • No conflicting priorities

Clear ownership:

  • tdd agent writes tests (and only tests)
  • code agent implements (and only implements)
  • debug agent fixes (and only fixes)
  • No confusion about who does what

Validation at boundaries:

  • Each agent’s output is validated
  • Next agent checks previous agent’s work
  • Errors caught between phases

Progressive refinement:

  • Raw spec → Architecture → Tests → Code → Review → Optimization
  • Each phase builds on the previous
  • Quality improves at each step

6. Context Injection Is Critical

AI needs context. But not just any context. Structured, tagged context.

SPARC uses <SPECIFICATION_INPUT> tags to inject context:

<SPECIFICATION_INPUT>
<PSEUDOCODE>
[Algorithm specification from spec agent]
</PSEUDOCODE>

<ARCHITECTURE>
[System design from architect agent]
</ARCHITECTURE>

<DESIGN_TOKENS>
[UI/UX specifications from ui-ux-interpreter]
</DESIGN_TOKENS>

<FAILING_TESTS>
[Test output from tdd agent]
</FAILING_TESTS>
</SPECIFICATION_INPUT>

Why nested tags matter:

  • Makes tag spoofing harder (a partial defense against prompt injection, not a guarantee)
  • Clear boundaries (AI knows where each section starts/ends)
  • Selective reading (agent only reads relevant sections)
  • Auditability (can verify what context was provided)

The code agent receives pseudocode, architecture, design tokens, and failing tests. It has everything needed, nothing extra.
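Assembling that block by hand invites mistakes, so it helps to generate it. A minimal sketch, assuming hypothetical `wrap` and `buildSpecificationInput` helpers; the closing-tag check is a basic guard against spoofed section boundaries, not a full prompt-injection defense.

```typescript
// Builds the nested <SPECIFICATION_INPUT> block from its four parts.
type SpecificationInput = {
  pseudocode: string;
  architecture: string;
  designTokens: string;
  failingTests: string;
};

// Wraps a body in a tag, refusing content that tries to close it early.
function wrap(tag: string, body: string): string {
  if (body.includes(`</${tag}>`)) {
    throw new Error(`Content must not contain a closing </${tag}> tag`);
  }
  return `<${tag}>\n${body.trim()}\n</${tag}>`;
}

function buildSpecificationInput(input: SpecificationInput): string {
  const sections = [
    wrap("PSEUDOCODE", input.pseudocode),
    wrap("ARCHITECTURE", input.architecture),
    wrap("DESIGN_TOKENS", input.designTokens),
    wrap("FAILING_TESTS", input.failingTests),
  ];
  return wrap("SPECIFICATION_INPUT", sections.join("\n\n"));
}
```

The outer `wrap` call also catches a section that embeds `</SPECIFICATION_INPUT>`, since it checks the joined body.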

7. Validation Loops Catch Errors Early

The ReAct pattern (Reason → Act → Observe) is built into SPARC.

After every agent completes its task:

  1. Log the output (Memory Bank records what was done)
  2. Run validation (tests, linting, build checks)
  3. Observe results:
    • Pass → Commit and move to next phase
    • Fail → Debug agent analyzes error and fixes
  4. Repeat until green

This creates a feedback loop:

  • Errors caught immediately
  • Root cause identified systematically
  • Fixes validated before proceeding

No “write code, deploy, hope it works.” Instead: “write code, test, fix, validate, then commit.”
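The loop itself is small enough to sketch. `runUntilGreen` is a hypothetical helper, and the `maxAttempts` cap is an added assumption so that a bug the debug step cannot fix ends in a reported failure rather than an endless loop.

```typescript
// Observe -> fix -> re-check loop with a bounded number of attempts.
type CheckResult = { passed: boolean; log: string };

function runUntilGreen(
  runChecks: () => CheckResult, // tests, linting, build checks
  applyFix: (log: string) => void, // e.g. hand the log to a debug step
  maxAttempts = 3,
): { passed: boolean; attempts: number } {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = runChecks(); // observe
    if (result.passed) return { passed: true, attempts: attempt };
    if (attempt < maxAttempts) applyFix(result.log); // act on the failure
  }
  return { passed: false, attempts: maxAttempts };
}

// Simulated run: the first check fails, one fix makes it pass.
let importAdded = false;
const outcome = runUntilGreen(
  () => ({ passed: importAdded, log: "bcrypt is not defined" }),
  () => {
    importAdded = true;
  },
);
console.log(outcome); // prints { passed: true, attempts: 2 }
```

When the loop returns `passed: false`, that is the point to hand the problem to a human.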

8. Documentation Can’t Be An Afterthought

In SPARC, documentation is generated automatically as the final phase.

The docs-writer agent has access to:

  • Original specifications
  • Architecture decisions
  • Big O complexity analysis
  • Security audit findings
  • All implementation code

It synthesizes everything into:

  • README with setup instructions
  • API documentation
  • Architecture overview
  • Security controls
  • Performance characteristics

Why this works:

  • Documentation matches actual implementation (not aspirational)
  • Technical depth appropriate for audience
  • Complete coverage (nothing missing)
  • Generated fresh every time (never stale)

9. Versioning Requires Logic, Not Guessing

Most developers bump versions arbitrarily. SPARC calculates them deterministically.

The version-manager agent analyzes the feature:

<THINKING>
1. What changed?
   - New authentication endpoints (new feature)

2. Is it backwards-compatible?
   - Yes (existing endpoints unchanged)

3. What's the bump level?
   - MINOR (new backwards-compatible functionality)

4. Calculate new version:
   - Current: 1.2.3
   - Bump MINOR: 1.3.x
   - Reset PATCH: 1.3.0
</THINKING>

Result: Semantic versioning that actually means something.

  • PATCH: Bug fixes, no API changes
  • MINOR: New features, backwards-compatible
  • MAJOR: Breaking changes

No human judgment required. Just systematic analysis.
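The calculation step is mechanical enough to write down directly. A sketch under assumptions: `bumpLevelFor` and `bumpVersion` are hypothetical names, and the input is a plain MAJOR.MINOR.PATCH string without pre-release or build metadata.

```typescript
// Semantic version bump, following the PATCH/MINOR/MAJOR rules above.
type BumpLevel = "MAJOR" | "MINOR" | "PATCH";

// Maps the impact analysis to a bump level.
function bumpLevelFor(change: { breaking: boolean; addsFeature: boolean }): BumpLevel {
  if (change.breaking) return "MAJOR";
  return change.addsFeature ? "MINOR" : "PATCH";
}

// Applies the bump, resetting the lower-order parts to 0.
function bumpVersion(current: string, level: BumpLevel): string {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(current);
  if (!match) throw new Error(`Not a MAJOR.MINOR.PATCH version: ${current}`);
  const major = Number(match[1]);
  const minor = Number(match[2]);
  const patch = Number(match[3]);
  if (level === "MAJOR") return `${major + 1}.0.0`;
  if (level === "MINOR") return `${major}.${minor + 1}.0`;
  return `${major}.${minor}.${patch + 1}`;
}

const level = bumpLevelFor({ breaking: false, addsFeature: true });
console.log(bumpVersion("1.2.3", level)); // prints "1.3.0"
```

The judgment sits in the two booleans; once those are decided, the new version number follows from arithmetic.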

10. The Human Still Matters

SPARC doesn’t replace human judgment. It augments it.

Where AI excels:

  • Implementing well-defined specifications
  • Following explicit architectural constraints
  • Writing boilerplate code
  • Running systematic tests
  • Generating documentation

Where humans are essential:

  • Defining business requirements
  • Making architectural trade-offs
  • Understanding user needs
  • Evaluating security trade-offs
  • Deciding what to build

SPARC automates the mechanical parts so humans can focus on the strategic parts.

The Honest Limitations

This isn’t a magic solution. It has real constraints.

What SPARC Doesn’t Solve:

1. AI still hallucinates occasionally

  • Reasoning tags reduce it significantly
  • But it still happens
  • Human review catches it

2. Complex debugging needs human judgment

  • Debug agent handles ~80% of errors
  • Really weird bugs need human analysis
  • Especially system integration issues

3. Architectural decisions need business context

  • AI doesn’t know your budget
  • AI doesn’t know your team’s skills
  • AI doesn’t know your deadlines
  • Humans make the final call

4. Some domains need human expertise

  • Complex algorithms (beyond standard patterns)
  • Unusual business rules
  • Domain-specific edge cases
  • Novel solutions to unique problems

What SPARC Dramatically Reduces:

1. Inconsistent output quality

  • Before: Every task different quality
  • After: Systematic approach every time

2. Security vulnerabilities

  • Before: Security review optional
  • After: Mandatory at multiple phases

3. Missing tests

  • Before: “I’ll write tests later” (never happens)
  • After: TDD enforced, tests written first

4. Poor documentation

  • Before: READMEs lag behind code
  • After: Generated from actual implementation

5. Version management chaos

  • Before: “Uh, let’s call it 2.0?”
  • After: Deterministic SemVer calculation

How You Can Use This

The SPARC prompt library is open source.

GitHub: [github.com/finneh4249/sparc-prompts] (update with actual link)

To use it with Roo Code:

  1. Install Roo Code
  2. Load the custom modes configuration
  3. Give SPARC an objective
  4. Watch it orchestrate through the workflow

To adapt it for other tools:

Individual agent prompts work in:

  • Claude (via Projects with custom instructions)
  • ChatGPT (via custom GPTs)
  • Cursor (via .cursorrules)
  • Any AI tool that supports system prompts

Modify for your needs:

  • Change the tech stack references
  • Add your company’s security requirements
  • Adjust the workflow phases
  • Add custom validation steps

The methodology is transferable:

  • S-P-A-R-C works regardless of tooling
  • Specialized agents > generalist AI
  • Reasoning transparency reduces hallucinations
  • Security-first design is universal

Example: Using the ‘code’ agent prompt in Claude

  1. Create a new Project in Claude
  2. Add this custom instruction:
You are a Senior Production Engineer specializing in Test-Driven
Development. Before generating any code, you MUST use <THINKING>
tags to:
1. Analyze the failing test assertions
2. Verify adherence to architectural constraints
3. Check for security issues (never hardcode secrets)
4. Plan your implementation

Use Temperature ≤ 0.3 for deterministic code generation.
Output only the necessary code to pass the failing tests.
  3. Provide your failing tests and specifications
  4. Get deterministic, security-conscious implementation

The Bottom Line

I didn’t build a framework. I built a systematic approach to prompt engineering that encodes production engineering practices into AI behavior.

3,000+ lines of prompts that implement:

  • Explicit role definitions
  • Mandatory reasoning protocols
  • Security-first design
  • Performance tracking
  • Automated testing
  • Complete documentation

This is prompt engineering at scale.

Not one-off ChatGPT requests. Not random experiments. A formal specification for how AI should collaborate on production systems.

The code isn’t magic. The architecture isn’t revolutionary. The testing isn’t novel.

What’s novel is encoding all of it into prompts.

Most developers use AI as a faster autocomplete. I use AI as a systematic development partner with explicit workflows, security mandates, and quality controls.

That’s the difference between AI-assisted coding and AI-orchestrated development.

And it works. Fusion Party runs on it. MagnetLab clients paid for it. This portfolio proves it.

The prompts are open source. The methodology is free. The results speak for themselves.


About the Author

I’m Ethan Cornwill. I’ve spent 13+ years writing software and 10 years managing QSR operations where downtime costs money and failures are public. I train LLMs at DataAnnotation, founded MagnetLab (AI consultancy), and serve as National Secretary for Fusion Party where I built their technical infrastructure.

I developed SPARC because I got tired of AI-generated code that looked good in demos and died in production.

Want to see it in action? Check out this portfolio. It was built using SPARC.

Contact: mail@finneh.xyz

GitHub: github.com/finneh4249
LinkedIn: linkedin.com/in/ethancornwill
Portfolio: finneh.xyz


This post was written using AI-human collaboration. I outlined the structure and key points, Claude helped with phrasing and examples, and I validated every technical claim against my actual experience. The reasoning was mine. The words were collaborative. The honesty is non-negotiable.

That’s SPARC in action.