Try to detect common bot protection mechanisms #2187

Draft
thomas-zahner wants to merge 2 commits into lycheeverse:master from thomas-zahner:hint-bot-detection

thomas-zahner (Member) commented May 7, 2026

Closes #2117

TODO

  • Create dedicated documentation page
  • Add test

Example run

cargo run 'https://appdb.winehq.org/'

145/145 ━━━━━━━━━━━━━━━━━━━━ Finished extracting links
Issues found in 1 input. Find details below.

[https://appdb.winehq.org/]:
   [404] https://appdb.winehq.org/cdn-cgi/l/email-protection#3a4a48534c5b59437a4d53545f524b1455485d (at 316:81) | Rejected status code: 404 Not Found
   [403] https://forum.winehq.org/viewforum.php?f=11 (at 67:17) | Rejected status code: 403 Forbidden
   [403] https://forums.winehq.org/ (at 39:26) | Rejected status code: 403 Forbidden | Followed 1 redirect. Redirects: https://forums.winehq.org/ --[302]--> https://forum.winehq.org/
   [403] https://www.winehq.org/ (at 35:26) | Rejected status code: 403 Forbidden
   [403] https://www.winehq.org/ (at 45:35) | Rejected status code: 403 Forbidden
   [403] https://www.winehq.org/ (at 46:34) | Rejected status code: 403 Forbidden
   [403] https://www.winehq.org/search (at 49:19) | Rejected status code: 403 Forbidden

🔍 145 Total (in 32s 787ms) 🔗 138 Unique ✅ 138 OK 🚫 7 Errors 🔀 7 Redirects

Hint: Detected cloudflare bot protection on website appdb.winehq.org
Hint: Detected cloudflare bot protection on website forum.winehq.org
Hint: Detected cloudflare bot protection on website forums.winehq.org
Hint: Detected cloudflare bot protection on website www.winehq.org
Hint: Followed 7 redirects. You might want to consider replacing redirecting URLs with the resolved URLs. Use verbose mode (`-v`/`-vv`) to see redirection details.
Hint: You can configure accepted/rejected response codes with `-a` or `--accept`
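
A minimal sketch of how hints like the ones above could be derived from rejected responses. This is not the PR's implementation: the `BotProtection` enum, the `detect_bot_protection` and `hints` functions, and the header signals (`cf-mitigated: challenge`, `server: cloudflare`, `x-datadome`) are all assumptions made for illustration. Only the wording of the hint line is taken from the example run.

```rust
use std::collections::{BTreeSet, HashMap};

/// Protection vendors this sketch knows about (illustrative, not the PR's type).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum BotProtection {
    Cloudflare,
    DataDome,
}

impl BotProtection {
    fn name(self) -> &'static str {
        match self {
            BotProtection::Cloudflare => "cloudflare",
            BotProtection::DataDome => "datadome",
        }
    }
}

/// Hypothetical heuristic: guess the vendor from a non-successful response.
/// `headers` must use lowercase header names.
fn detect_bot_protection(status: u16, headers: &HashMap<String, String>) -> Option<BotProtection> {
    // Successful responses are never flagged.
    if (200..300).contains(&status) {
        return None;
    }
    let value = |name: &str| headers.get(name).map(|v| v.to_ascii_lowercase());

    // Assumed signals: a Cloudflare challenge marker or `server: cloudflare`.
    if value("cf-mitigated").as_deref() == Some("challenge")
        || value("server").as_deref() == Some("cloudflare")
    {
        return Some(BotProtection::Cloudflare);
    }
    // Assumed signal: DataDome adds an `x-datadome` response header.
    if headers.contains_key("x-datadome") {
        return Some(BotProtection::DataDome);
    }
    None
}

/// Render one hint per (host, vendor) pair, sorted and deduplicated.
fn hints(detections: &[(&str, BotProtection)]) -> Vec<String> {
    let unique: BTreeSet<(&str, BotProtection)> = detections.iter().copied().collect();
    unique
        .into_iter()
        .map(|(host, p)| format!("Hint: Detected {} bot protection on website {}", p.name(), host))
        .collect()
}
```

Deduplicating by host keeps the output to one hint per website even when several links on the same host are rejected, as with the three `https://www.winehq.org/` entries in the run above. Header-based signals are weak on their own, so a real implementation would likely need stricter checks to avoid false positives.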

thomas-zahner force-pushed the hint-bot-detection branch 3 times, most recently from 67b5e72 to 7c06f0f on May 8, 2026 12:46

Development

Successfully merging this pull request may close these issues.

Handle DataDome protection
