Error (2026-06-24)
No councillors found (0 results)
Investigation
The scraper targets https://www.southtyneside.gov.uk/article/13598/councillors-a-to-z (Custom HTML) using:
- Container: #COUNCILLORSLISTBYNAME_HTML
- Item: tbody td a
The URL now returns HTTP 403 Forbidden from external IPs, which suggests Cloudflare bot protection is active. The scraper still received a response body (a failed fetch would have surfaced as a connection or status error rather than 0 results), but the container #COUNCILLORSLISTBYNAME_HTML was not present in the returned HTML, which was likely the Cloudflare challenge page.
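The distinction drawn above (blocked response versus a page that loaded without the container) can be checked mechanically against a saved response. The following is a minimal sketch using only the Python standard library. The function name classify_response and the challenge markers are assumptions for illustration; Cloudflare challenge pages vary, so the markers are heuristics, not a definitive test.

```python
from html.parser import HTMLParser

# Assumed heuristics: strings that commonly appear on Cloudflare challenge pages.
CHALLENGE_MARKERS = ("just a moment", "cf-chl", "challenge-platform")


class _IdFinder(HTMLParser):
    """Records whether any element carries the target id attribute."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.found = False

    def handle_starttag(self, tag, attrs):
        if dict(attrs).get("id") == self.target_id:
            self.found = True


def classify_response(status_code, html, container_id="COUNCILLORSLISTBYNAME_HTML"):
    """Return a short label describing why a fetch might yield 0 councillors."""
    finder = _IdFinder(container_id)
    finder.feed(html)
    if finder.found:
        # The container exists, so a 0-result run points at the item selector.
        return "container-present"
    lowered = html.lower()
    if status_code == 403 or any(marker in lowered for marker in CHALLENGE_MARKERS):
        return "blocked-or-challenge"
    # Page loaded normally but the container id has changed or been removed.
    return "container-missing"
```

A "blocked-or-challenge" result would be consistent with the Cloudflare explanation, while "container-missing" on a 200 response would point at a changed page structure instead.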
Fix patterns ruled out
- HTTPS migration: already HTTPS
- verify_requests = False: not a cert issue; the response is a 403
- Cloudflare JS challenge without playwright: scraper has no http_lib = "playwright" set
What needs to happen
- Try http_lib = "playwright": this executes Cloudflare JS challenges in a real browser and may resolve the 403. Add it to the Scraper class in councillors.py.
- Check article URL: verify that https://www.southtyneside.gov.uk/article/13598/councillors-a-to-z still redirects to a councillors page and hasn't been retired. If the article number changed, update base_url in metadata.json.
- Inspect selectors: if the page now loads but #COUNCILLORSLISTBYNAME_HTML is gone, the container selector needs updating by visual inspection.
This requires either a successful playwright load or visual inspection of the current page structure.
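Once a copy of the page HTML is available (from a playwright load or saved from a browser), the selector step can be checked offline. The sketch below is an assumption-laden stand-in, not the scraper's own code: it uses only the standard library to approximate the descendant selector "#COUNCILLORSLISTBYNAME_HTML tbody td a", and the names CouncillorLinkParser and find_councillor_links are hypothetical. It does not insert an implied tbody the way a browser would.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not tracked as open.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class CouncillorLinkParser(HTMLParser):
    """Collects <a> elements nested as: container id > tbody > td > a."""

    def __init__(self, container_id):
        super().__init__()
        self.container_id = container_id
        self.stack = []       # open elements as (tag, is_container)
        self.links = []       # collected (href, text) pairs
        self._current = None  # link currently being read, if any

    def _inside_selector(self):
        # Ancestors must contain the container, then tbody, then td, in order.
        wanted = [None, "tbody", "td"]
        step = 0
        for tag, is_container in self.stack[:-1]:
            matched = is_container if step == 0 else tag == wanted[step]
            if matched:
                step += 1
                if step == len(wanted):
                    return True
        return False

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        self.stack.append((tag, attrs.get("id") == self.container_id))
        if tag == "a" and self._current is None and self._inside_selector():
            self._current = {"href": attrs.get("href"), "text": ""}

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag, tolerating unclosed tags.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break
        if tag == "a" and self._current is not None:
            self.links.append((self._current["href"], self._current["text"].strip()))
            self._current = None


def find_councillor_links(html, container_id="COUNCILLORSLISTBYNAME_HTML"):
    """Return (href, text) pairs for links matching the scraper's selectors."""
    parser = CouncillorLinkParser(container_id)
    parser.feed(html)
    return parser.links
```

An empty list from a page that otherwise looks correct in a browser would suggest the container or item selector has changed and needs updating.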