💻 Coding Prompt
How Agency Software Developers Can Write Unit Tests While Building a Web Scraper: Gemini for Intermediates
From slow CI/CD pipeline to results — Intermediate techniques for Agency teams building and testing web scrapers
The Prompt
You are a senior software engineer with 11 years of experience in digital agency environments, web scraping systems, and automated testing pipelines. Help me write unit tests so I can improve application performance.
My situation:
Web scraper type and target: [e.g., e-commerce price monitor / news aggregator / job listing collector]
Current CI/CD pipeline problem: [e.g., full scraper run takes 25 minutes in CI / no test stage exists and the pipeline deploys directly / flaky tests cause pipeline restarts 3 times per day]
Agency project constraint: [e.g., client expects weekly data delivery / scraper is shared across 3 client projects / must not break when the target site does a minor layout update]
Testing gap: [e.g., no unit tests exist / tests only cover happy path / mock data does not reflect real site structure variation]
Tech stack and CI tool: [e.g., Python with Pytest on GitHub Actions / JavaScript with Jest on GitLab CI / TypeScript with Vitest on CircleCI]
Performance metric being tracked: [e.g., scrape completion time / data accuracy rate / CI pipeline duration]
Team size and testing experience: [e.g., 3 developers, one has written tests before / junior team learning testing for the first time]
Deliver:
1. A unit test strategy for web scrapers: define what to unit test in a scraper (parsing logic, data transformation, URL generation, and error handling) and what not to test at the unit level, with a rationale for each boundary
2. A mock data design guide: show how to create realistic mock HTML fixtures that cover the 4 most common site variation patterns (missing fields, changed CSS selectors, pagination edge cases, and empty result sets), with example fixture structure
3. A test suite skeleton: write the complete test file structure for the scraper module, including test naming conventions, setup and teardown patterns, and the grouping logic that makes the suite navigable for an intermediate developer
4. A CI pipeline optimization plan: identify the 3 specific changes that reduce CI pipeline duration for a scraper project (test parallelization, selective test execution on changed modules, and caching of mock fixtures), with implementation steps for the chosen CI tool
5. A flaky test prevention checklist: 6 rules for writing scraper unit tests that do not fail randomly, covering time dependencies, network call isolation, selector brittleness, and test data determinism
6. A performance regression test: write a test that measures the scraper's data processing speed against a baseline and fails the CI pipeline if processing time increases by more than 20%, with the implementation pattern for the chosen stack
7. A test coverage reporting setup: configure coverage reporting for the scraper module, define the minimum threshold for pipeline passage, and identify the 5 functions that must reach 100% branch coverage before the project ships
8. A test maintenance guide: define when to update tests, when to delete them, and how to handle test failures caused by target site changes versus failures caused by code regressions, as a one-page reference for the agency team
Write tests for the parsing logic before writing tests for anything else. In a web scraper, the parser is where a failure is most likely to be silent, producing incorrect data instead of an obvious error.
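To illustrate the parser-first advice, here is a minimal sketch of the kind of test file this prompt is meant to produce; it is an example, not part of the prompt text. Everything in it is an assumption made for illustration: the `ProductParser` class, the `parse_products` function, the `FIXTURES` names, and the HTML markup are hypothetical, and the standard-library `html.parser` stands in for whatever scraping library the project actually uses. The tests are plain `test_*` functions with bare `assert` statements, so Pytest should collect them without extra setup. Three of the four variation patterns are covered; pagination edge cases are left out to keep the sketch short.

```python
from html.parser import HTMLParser


class ProductParser(HTMLParser):
    """Hypothetical parser: collects name/price from <li class="product"> items."""

    def __init__(self):
        super().__init__()
        self.products = []
        self._field = None  # field whose text is currently being read

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "li" and "product" in classes:
            self.products.append({"name": None, "price": None})
        elif tag == "span" and self.products:
            if "name" in classes:
                self._field = "name"
            # Accept the old and the (assumed) renamed price class.
            elif "price" in classes or "product-price" in classes:
                self._field = "price"

    def handle_data(self, data):
        if self._field and data.strip():
            self.products[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def parse_products(html):
    """Return a list of {"name", "price"} dicts; price is a float or None."""
    parser = ProductParser()
    parser.feed(html)
    parser.close()
    for product in parser.products:
        raw = product["price"]
        product["price"] = float(raw.lstrip("$")) if raw else None
    return parser.products


# Hypothetical fixtures, one per site variation pattern.
FIXTURES = {
    "complete": '<ul><li class="product"><span class="name">Lamp</span>'
                '<span class="price">$19.99</span></li></ul>',
    "missing_price": '<ul><li class="product">'
                     '<span class="name">Lamp</span></li></ul>',
    "renamed_price_class": '<ul><li class="product"><span class="name">Lamp</span>'
                           '<span class="product-price">$19.99</span></li></ul>',
    "empty_results": "<ul></ul>",
}


def test_complete_listing_yields_name_and_price():
    assert parse_products(FIXTURES["complete"]) == [{"name": "Lamp", "price": 19.99}]


def test_missing_price_yields_none_instead_of_crashing():
    assert parse_products(FIXTURES["missing_price"]) == [{"name": "Lamp", "price": None}]


def test_renamed_price_class_is_still_parsed():
    assert parse_products(FIXTURES["renamed_price_class"]) == [
        {"name": "Lamp", "price": 19.99}
    ]


def test_empty_result_set_yields_empty_list():
    assert parse_products(FIXTURES["empty_results"]) == []
```

The missing-price test is the one that guards against silent failure: it pins down that an absent field becomes an explicit `None` rather than a wrong value or an exception.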
💡 How to use this prompt
- Start with output #2 — the mock data design guide. Most intermediate developers write unit tests against a single happy-path HTML fixture and discover the tests are useless the first time the target site changes a CSS class. Build realistic mock fixtures first, and every other test becomes more durable.
- The most common mistake is writing unit tests that make real HTTP requests. These tests are slow, flaky, and will fail in CI the moment the target site is unreachable or rate-limits the pipeline. Every network call in a scraper unit test must be replaced with a mock or fixture.
- Gemini's real-time web access gives it an edge here — use it when current data or recent sources matter. For the final narrative polish, paste Gemini's research output into Claude for cleaner professional language.
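The tip above about real HTTP requests can be sketched in Python as follows. This is one possible approach under stated assumptions, not the only way to isolate network calls: `fetch_html`, `extract_title`, `scrape_title`, and the `example.test` URL are hypothetical names chosen for illustration, and the scraper is assumed to download pages with the standard-library `urllib.request.urlopen`. The test replaces `urlopen` with a mock from `unittest.mock`, so no request leaves the machine.

```python
import re
import urllib.request
from unittest import mock


def fetch_html(url):
    """Download a page. Assumed to be the only function that touches the network."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def extract_title(html):
    """Pure parsing step: return the <title> text, or None if it is absent."""
    match = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


def scrape_title(url):
    return extract_title(fetch_html(url))


def test_scrape_title_uses_canned_response():
    fake_body = b"<html><head><title> Job Board </title></head></html>"
    with mock.patch("urllib.request.urlopen") as fake_urlopen:
        # urlopen() is used as a context manager, so configure __enter__.
        fake_urlopen.return_value.__enter__.return_value.read.return_value = fake_body
        assert scrape_title("https://example.test/jobs") == "Job Board"
        fake_urlopen.assert_called_once_with("https://example.test/jobs", timeout=10)


def test_scrape_title_propagates_network_errors():
    # Simulate an unreachable site without waiting for a real timeout.
    with mock.patch("urllib.request.urlopen", side_effect=TimeoutError("simulated")):
        try:
            scrape_title("https://example.test/jobs")
        except TimeoutError:
            pass
        else:
            raise AssertionError("expected TimeoutError to propagate")
```

Keeping the download in one small function and the parsing in another means most tests can call `extract_title` directly on fixture strings, and only a handful need the mock at all.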
About This Coding AI Prompt
This free Coding prompt is designed for Gemini and also works with other modern AI assistants, including ChatGPT and Claude. Copy the prompt above, paste it into your preferred AI tool, and customize the bracketed sections to fit your specific needs.
Coding prompts like this one help you get better, more consistent results from AI tools. Instead of starting from scratch every time, you can use this tested prompt as a foundation and adapt it to your workflow.