STY (South Tyneside) scraper failing — 0 councillors found, possible URL or selector change #385

Description

@symroe

Error (2026-06-24)

No councillors found (0 results)

Investigation

The scraper targets https://www.southtyneside.gov.uk/article/13598/councillors-a-to-z (Custom HTML) using:

  • Container: #COUNCILLORSLISTBYNAME_HTML
  • Item: tbody td a

The URL now returns HTTP 403 Forbidden from external IPs, which suggests Cloudflare bot protection is active. The scraper did not fail with a connection or status error, so it received a response body but found 0 councillors: the container #COUNCILLORSLISTBYNAME_HTML was not present in the returned HTML (likely the Cloudflare challenge page).

Fix patterns ruled out

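  1. HTTPS migration: not applicable, the URL is already HTTPS
  2. verify_requests = False: not applicable, the failure is a 403 response, not a certificate error
  3. Playwright already in use: no, the scraper has no http_lib = "playwright" set, so the Cloudflare JS challenge path is still untried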

What needs to happen

  1. Try http_lib = "playwright" — this executes Cloudflare JS challenges in a real browser and may resolve the 403. Add to the Scraper class in councillors.py.
  2. Check article URL — verify that https://www.southtyneside.gov.uk/article/13598/councillors-a-to-z still resolves to a councillors page and hasn't been retired. If the article number has changed, update base_url in metadata.json.
  3. Inspect selectors — if the page now loads but #COUNCILLORSLISTBYNAME_HTML is gone, the container selector needs updating by visual inspection.

This requires either a successful playwright load or visual inspection of the current page structure.
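
To tell the three cases above apart once a response is in hand, a small standalone check along these lines may help. It is only a sketch using the Python standard library, not part of the scraper framework: the names `ContainerLinkCounter` and `diagnose` are hypothetical, the 403/503 check is an assumed heuristic for a blocked request, and the link count only approximates the `tbody td a` selector (it counts `<a>` tags inside `<td>` cells within the container and ignores `tbody`).

```python
from html.parser import HTMLParser

# Container id taken from the issue; everything else here is illustrative.
CONTAINER_ID = "COUNCILLORSLISTBYNAME_HTML"


class ContainerLinkCounter(HTMLParser):
    """Counts <a> tags inside <td> cells within the element with a given id."""

    def __init__(self, container_id):
        super().__init__()
        self.container_id = container_id
        self.container_tag = None   # tag name of the container once found
        self.depth = 0              # nesting depth of same-named tags inside it
        self.in_td = False
        self.found_container = False
        self.links = 0

    def handle_starttag(self, tag, attrs):
        if self.depth == 0:
            # Outside the container: only look for its opening tag.
            if dict(attrs).get("id") == self.container_id:
                self.container_tag = tag
                self.depth = 1
                self.found_container = True
            return
        if tag == self.container_tag:
            self.depth += 1
        elif tag == "td":
            self.in_td = True
        elif tag == "a" and self.in_td:
            self.links += 1

    def handle_endtag(self, tag):
        if self.depth == 0:
            return
        if tag == self.container_tag:
            self.depth -= 1
            if self.depth == 0:
                self.in_td = False
        elif tag == "td":
            self.in_td = False


def diagnose(status_code, html):
    """Return a short label for which follow-up step is most likely relevant."""
    if status_code in (403, 503):
        # Assumed heuristic: treat these statuses as a bot-protection block.
        return "blocked"
    counter = ContainerLinkCounter(CONTAINER_ID)
    counter.feed(html)
    if not counter.found_container:
        return "container missing"
    if counter.links == 0:
        return "container present, no links"
    return f"ok: {counter.links} councillor links"
```

Feeding it the status code and body from a plain fetch and from a playwright fetch should show whether the block is lifted (step 1) and whether the container selector still matches (step 3).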
