I Tested 3 AI Coding Assistants on the Same Project — Here's What Actually Happened
I'm not a developer. Before 2026, the last time I wrote code was a Python script in 2021 that barely worked. But in March, I needed to add a newsletter subscription system to One Person Company — database backend, API endpoint, and frontend form. Instead of hiring a freelancer ($500+), I decided to test whether AI coding assistants could get me there.
I ran the same project through three tools: Cursor, Claude Code, and GitHub Copilot. Here's what happened.
The Results: Head-to-Head
| Metric | Cursor Pro | Claude Code | GitHub Copilot |
|---|---|---|---|
| Time to working feature | 2.5 hours | 4 hours | 7 hours |
| Errors encountered | 4 | 2 | 6 |
| Times I had to Google for help | 1 | 3 | 8 |
| Code quality (my assessment) | Good — worked, minor style issues | Excellent — clean, well-structured | Messy — functional but inconsistent |
| Monthly cost | $20 | $20 (API usage) | $10 |
| Learning curve | ~2 hours to feel comfortable | ~4 hours | ~1 hour |
Test 1: Cursor Pro — 2.5 Hours, 4 Errors
Cursor felt like the most natural fit for this kind of project. It's an editor (a fork of VS Code) with AI deeply integrated. I described what I wanted in the chat panel, and it generated code across multiple files.
What went well:
- The multi-file awareness was impressive. When I asked it to create the database schema, API endpoint, and frontend form, it understood the connections between files and generated consistent code.
- The inline editing was fast. I could select a block of code and say "add validation for email format" and it would rewrite just that section.
- It handled the D1 database setup correctly on the first try — the migration file, the binding in wrangler.toml, and the prepared statements all worked.
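For readers who haven't seen D1, here is roughly what those three pieces look like. This is a minimal sketch, not the code Cursor generated: the `subscribers` table, the `DB` binding name, and the `buildSubscriberInsert` and `saveSubscriber` helpers are all hypothetical. The `D1Like` interface is a local stand-in that only mirrors the `prepare(...).bind(...).run()` shape of D1 prepared statements, so the sketch runs without Cloudflare's types.

```typescript
// Hypothetical migration file (SQL), for reference:
//   CREATE TABLE IF NOT EXISTS subscribers (
//     id INTEGER PRIMARY KEY AUTOINCREMENT,
//     email TEXT NOT NULL UNIQUE,
//     created_at TEXT NOT NULL
//   );
//
// Hypothetical binding in wrangler.toml:
//   [[d1_databases]]
//   binding = "DB"
//   database_name = "newsletter"
//   database_id = "<your-database-id>"

// Local stand-in for the parts of the D1 binding used below (an assumption,
// not Cloudflare's published type definitions).
interface StatementLike {
  bind(...values: unknown[]): StatementLike;
  run(): Promise<unknown>;
}
interface D1Like {
  prepare(sql: string): StatementLike;
}

// Build the SQL and its parameters separately so user input never ends up
// concatenated into the SQL string.
function buildSubscriberInsert(
  email: string,
  now: Date
): { sql: string; params: [string, string] } {
  return {
    sql: "INSERT INTO subscribers (email, created_at) VALUES (?, ?)",
    params: [email.trim().toLowerCase(), now.toISOString()],
  };
}

// Run the insert as a prepared statement against the database binding.
async function saveSubscriber(
  db: D1Like,
  email: string,
  now: Date = new Date()
): Promise<void> {
  const { sql, params } = buildSubscriberInsert(email, now);
  await db.prepare(sql).bind(...params).run();
}
```

In a Worker, `saveSubscriber(env.DB, email)` would be called from the request handler, assuming the binding is named `DB` as in the comment above.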
What went wrong:
- It generated a CORS configuration that was too permissive. Claude flagged this when I had it review the diff, before I pushed to production.
- It used a deprecated API pattern for one of the Worker routes. This caused a deploy failure that took 20 minutes to debug.
- The generated error messages were generic ("Something went wrong"). I had to manually add specific error handling.
- It added a dependency I didn't need (an npm package for form validation when the native HTML5 validation was sufficient).
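Two of these fixes are easy to picture in code. The sketch below assumes an origin allowlist is the tighter CORS setup in question and shows validation that returns a specific message instead of "Something went wrong". The allowed origin `https://example.com`, the function names, the exact messages, and the choice of 400 and 422 status codes are illustrative assumptions, not the code from this project.

```typescript
// Hypothetical allowlist: only these origins may call the endpoint.
const ALLOWED_ORIGINS: string[] = ["https://example.com"];

// Return CORS headers for an allowed origin, or none at all. Echoing back a
// single known origin is tighter than "Access-Control-Allow-Origin: *".
function corsHeaders(origin: string | null): Record<string, string> {
  if (origin === null || ALLOWED_ORIGINS.indexOf(origin) === -1) {
    return {};
  }
  return {
    "Access-Control-Allow-Origin": origin,
    "Access-Control-Allow-Methods": "POST, OPTIONS",
    "Access-Control-Allow-Headers": "Content-Type",
    "Vary": "Origin",
  };
}

type ValidationResult =
  | { ok: true; email: string }
  | { ok: false; status: number; error: string };

// Validate the parsed JSON body and say exactly what was wrong with it.
function validateSubscribeBody(body: unknown): ValidationResult {
  if (typeof body !== "object" || body === null) {
    return { ok: false, status: 400, error: "Request body must be a JSON object." };
  }
  const email = (body as { email?: unknown }).email;
  if (typeof email !== "string" || email.trim() === "") {
    return { ok: false, status: 400, error: "The email field is required." };
  }
  const normalized = email.trim().toLowerCase();
  // Deliberately loose format check: something@something.something
  if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(normalized)) {
    return { ok: false, status: 422, error: "That doesn't look like a valid email address." };
  }
  return { ok: true, email: normalized };
}
```

A check like this needs no extra npm package, which is the same reason the form-validation dependency was unnecessary on the frontend.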
Test 2: Claude Code — 4 Hours, 2 Errors
Claude Code is terminal-based. You describe what you want, and it runs commands, creates files, and iterates. It felt more like pair programming with a very patient senior developer.
What went well:
- The code quality was noticeably better than Cursor's output. Cleaner structure, better error handling, more thorough validation.
- It asked clarifying questions before generating code. When I said "add a subscribe form," it asked: "What fields? What validation rules? What should happen on success?" This slowed things down but produced better results.
- It caught its own mistakes. Twice, it generated code, then immediately said "wait, that won't work because..." and fixed itself.
What went wrong:
- It was slower. The back-and-forth conversation model meant I spent a lot of time waiting for it to think, propose, then revise.
- It struggled with the deployment step. In my setup, Claude Code couldn't run `wrangler deploy` directly, so I had to handle that part manually outside the tool.
- The terminal interface is less visual than an editor. I couldn't see the full file structure at a glance, which made it harder to understand what it was doing.
Test 3: GitHub Copilot — 7 Hours, 6 Errors
Copilot is primarily an autocomplete tool — it suggests code as you type. For a non-developer, this was the hardest to use effectively.
What went well:
- The inline completions were sometimes useful for small, repetitive tasks like writing HTML form fields.
- It integrated well with VS Code, which I was already familiar with.
What went wrong:
- Copilot doesn't have project-level awareness the way Cursor and Claude Code do. It couldn't coordinate changes across files — I had to manually ensure the frontend form matched the API endpoint, which led to 3 of the 6 errors.
- It frequently suggested code that looked reasonable but didn't actually run. I spent about 2 hours of the 7 just debugging suggestions that didn't work.
- For complex tasks like setting up the D1 database, Copilot was essentially useless — it could suggest individual lines but couldn't handle the multi-step process.
- The autocomplete model meant I had to know roughly what I wanted before it could help. As a non-developer, I often didn't know what to type next.
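The form-versus-endpoint mismatch described above is the kind of error a single shared definition can prevent. Here is a minimal sketch of that idea, assuming both sides can import the same module. The `SubscribeRequest` type, the `/api/subscribe` path, and both helper functions are hypothetical names chosen for illustration.

```typescript
// One shared definition of the request shape and path, used by both sides.
interface SubscribeRequest {
  email: string;
}
const SUBSCRIBE_PATH = "/api/subscribe"; // hypothetical route

// Frontend side: turn form values into the request the endpoint expects.
function buildSubscribeRequest(
  formValues: Record<string, string>
): { path: string; body: string } {
  const payload: SubscribeRequest = { email: formValues["email"] ?? "" };
  return { path: SUBSCRIBE_PATH, body: JSON.stringify(payload) };
}

// API side: parse the raw body back into the same shape, or return null
// when it is not valid JSON or the field name does not match.
function parseSubscribeRequest(raw: string): SubscribeRequest | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) {
    return null;
  }
  const email = (data as Record<string, unknown>)["email"];
  return typeof email === "string" ? { email } : null;
}
```

With this arrangement, renaming the field in `SubscribeRequest` makes the type checker complain on both sides at once, rather than leaving the mismatch to surface at runtime.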
What I Use Now
After the test, here's my setup:
- Cursor Pro ($20/mo) is my daily driver for site changes. It's fast enough for most work, and the editor interface is comfortable.
- Claude API ($20/mo) is my code reviewer. Before any Cursor-generated code goes to production, I paste the diff into Claude and ask: "Find anything dangerous, deprecated, or unnecessarily complex."
- Copilot I dropped. For a non-developer, the project-level awareness of Cursor and Claude Code is essential. Inline completions alone aren't enough.
My Safety Rules (Learned the Hard Way)
I broke the production site twice in my first two weeks with AI coding tools. Here are the rules that have prevented every incident since:
- AI-generated code never goes directly to production. Always to a staging branch first.
- Every change gets reviewed. I read the diff myself (I can spot obviously wrong things even as a non-developer), then run it through Claude for a second review.
- Changes touching more than 3 files get manual testing. I open the staging site and actually use the feature before merging.
- Keep a rollback plan. Before deploying, I note which commit to revert to if things break. I've used this twice.
- Start small. My first project with Cursor was fixing a single broken link. My second was changing a button color. Only after 10+ successful small changes did I attempt the newsletter system.
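The "more than 3 files" rule is simple enough to automate. As a rough sketch, the helper below takes the text output of `git diff --name-only` and reports whether a change crosses that threshold. The function names and the idea of wiring this into a pre-merge check are assumptions, not part of the workflow described above.

```typescript
// Threshold from the rule above: more than 3 changed files means manual testing.
const MANUAL_TEST_THRESHOLD = 3;

// Split the output of `git diff --name-only` into a list of file paths,
// ignoring blank lines and stray whitespace.
function changedFiles(diffNameOnly: string): string[] {
  return diffNameOnly
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// True when the change touches more files than the threshold allows.
function needsManualTesting(diffNameOnly: string): boolean {
  return changedFiles(diffNameOnly).length > MANUAL_TEST_THRESHOLD;
}
```

Exactly three changed files stays under the bar, and a fourth tips it over, matching the wording of the rule.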
FAQ
Can a non-developer really ship production code with AI assistants?
Yes, but you need guardrails. I shipped a working newsletter system with Cursor as a non-developer. But I also broke the site twice before I established my safety rules. The tools accelerate execution — they don't remove the need for testing and review.
Which AI coding assistant should a non-developer start with?
Start with Cursor. It has the lowest friction for someone who isn't comfortable in a terminal. Claude Code produces better code but requires more technical comfort. Skip Copilot unless you already know what you're doing.
How long does it take to become productive?
Expect 2-3 hours to feel comfortable with Cursor, and about 2 weeks of part-time use before you're shipping real features confidently. Your first project should be something trivial — fix a typo, change a color, add a link. Build up to complex work.