fix: check deny list in webscrappers by 0xi4o · Pull Request #6559 · FlowiseAI/Flowise

0xi4o · 2026-06-24T13:46:42Z

No description provided.

gemini-code-assist

Code Review

This pull request integrates checkDenyList into the Cheerio, Playwright, and Puppeteer document loaders to prevent unauthorized URL access. However, the review highlights critical security concerns regarding Server-Side Request Forgery (SSRF) and DNS rebinding, as checking only the initial URL does not prevent bypasses via HTTP redirects. For Cheerio, the reviewer suggests overriding CheerioWebBaseLoader.prototype.scrape to use secureFetch. For Playwright and Puppeteer, which execute in separate browser processes, the reviewer recommends implementing network-level isolation or configuring a secure forward proxy to block intranet access.


gemini-code-assist · 2026-06-24T13:48:36Z

 import { SelectorType } from 'cheerio'
 import { ICommonObject, INodeOutputsValue, IDocument, INode, INodeData, INodeParams } from '../../../src/Interface'
 import { handleEscapeCharacters, webCrawl, xmlScrape } from '../../../src/utils'
+import { checkDenyList } from '../../../src/httpSecurity'


To prevent Server-Side Request Forgery (SSRF) via HTTP redirects or DNS rebinding, we should override CheerioWebBaseLoader.prototype.scrape to use secureFetch instead of the default fetch. secureFetch already resolves and validates all URLs in the redirect chain against the deny list and pins the resolved IP address.

import { checkDenyList, secureFetch } from '../../../src/httpSecurity'
import { CheerioWebBaseLoader } from '@langchain/community/document_loaders/web/cheerio'
import { load } from 'cheerio'

// @ts-ignore
CheerioWebBaseLoader.prototype.scrape = async function (this: any) {
    const response = await secureFetch(this.webPath)
    const html = await response.text()
    return load(html)
}
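To illustrate what validating every URL in a redirect chain means, here is a minimal sketch. It is not the actual secureFetch implementation: `fetchWithValidatedRedirects`, `UrlCheck`, `FetchLike`, and the hop limit are hypothetical names chosen for this example, and the sketch does not cover IP pinning. The fetch function is injected so the logic can be exercised without network access.

```typescript
// Hypothetical validator: expected to throw when a URL is on the deny list.
type UrlCheck = (url: string) => Promise<void>

// Hypothetical minimal fetch shape; redirects must NOT be followed automatically.
type FetchLike = (
    url: string,
    init: { redirect: 'manual' }
) => Promise<{
    status: number
    headers: { get(name: string): string | null }
    text(): Promise<string>
}>

async function fetchWithValidatedRedirects(
    startUrl: string,
    check: UrlCheck,
    fetchImpl: FetchLike,
    maxHops = 5
) {
    let current = startUrl
    for (let hop = 0; hop <= maxHops; hop++) {
        // Validate every URL in the chain, not only the first one.
        await check(current)
        const response = await fetchImpl(current, { redirect: 'manual' })
        const location = response.headers.get('location')
        // Anything that is not a 3xx with a Location header is the final response.
        if (response.status < 300 || response.status >= 400 || location === null) {
            return response
        }
        // Resolve relative Location headers against the current URL.
        current = new URL(location, current).toString()
    }
    throw new Error(`Too many redirects (limit ${maxHops})`)
}
```

With this shape, a redirect from a public host to http://169.254.169.254 is rejected at the second hop, before any request is sent to the internal address.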

gemini-code-assist · 2026-06-24T13:48:36Z


        async function playwrightLoader(url: string): Promise<Document[] | undefined> {
            try {
+                await checkDenyList(url)


The checkDenyList(url) call here validates only the initial URL. Because Playwright launches a full browser that follows HTTP redirects automatically, a malicious URL could redirect to an internal address (e.g., http://127.0.0.1 or http://169.254.169.254), bypassing this check entirely and leading to Server-Side Request Forgery (SSRF). The check is also vulnerable to DNS rebinding: the hostname can resolve to a different, internal address when the browser connects than it did when the check ran.

Since Playwright runs in a separate browser process, request interception or network-level isolation is required to fully secure it. If possible, run the browser scraper in an isolated network environment (e.g., a container/sandbox with firewall rules blocking private IP ranges) or configure a secure forward proxy (like Squid) that blocks intranet access.
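As a rough illustration of the request-interception option, the sketch below shows a route handler that aborts requests addressed to loopback or private IPv4 literals. `page.route`, `route.abort()`, and `route.continue()` are Playwright's interception API; `RouteLike`, `isPrivateIPv4`, and `guardRoute` are hypothetical names for this example. The handler only inspects the URL it is given: it does not see DNS results, ignores IPv6, and may not be invoked for every redirect hop, so it complements rather than replaces network isolation or a proxy.

```typescript
// Hypothetical minimal shape of the Playwright Route object used by this sketch.
interface RouteLike {
    request(): { url(): string }
    abort(): Promise<void>
    continue(): Promise<void>
}

// True for "this network", loopback, RFC 1918 private, and link-local IPv4 literals.
function isPrivateIPv4(host: string): boolean {
    const parts = host.split('.')
    if (parts.length !== 4 || !parts.every((p) => /^\d{1,3}$/.test(p))) return false
    const [a, b] = parts.map(Number)
    return (
        a === 0 ||
        a === 10 ||
        a === 127 ||
        (a === 169 && b === 254) ||
        (a === 172 && b >= 16 && b <= 31) ||
        (a === 192 && b === 168)
    )
}

// Route handler: abort an intercepted request whose host is a blocked literal.
function guardRoute(route: RouteLike): Promise<void> {
    const { hostname } = new URL(route.request().url())
    const blocked = hostname === 'localhost' || isPrivateIPv4(hostname)
    return blocked ? route.abort() : route.continue()
}

// Usage with a real Playwright page (not executed here):
// await page.route('**/*', guardRoute)
```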

gemini-code-assist · 2026-06-24T13:48:36Z


        async function puppeteerLoader(url: string): Promise<Document[] | undefined> {
            try {
+                await checkDenyList(url)


The checkDenyList(url) call here validates only the initial URL. Because Puppeteer launches a full browser that follows HTTP redirects automatically, a malicious URL could redirect to an internal address (e.g., http://127.0.0.1 or http://169.254.169.254), bypassing this check entirely and leading to Server-Side Request Forgery (SSRF). The check is also vulnerable to DNS rebinding: the hostname can resolve to a different, internal address when the browser connects than it did when the check ran.

Since Puppeteer runs in a separate browser process, request interception or network-level isolation is required to fully secure it. If possible, run the browser scraper in an isolated network environment (e.g., a container/sandbox with firewall rules blocking private IP ranges) or configure a secure forward proxy (like Squid) that blocks intranet access.
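The forward-proxy option could be wired in through Chromium's `--proxy-server` flag, which Puppeteer forwards via `launch({ args })`. The sketch below is an assumption-laden example, not part of this PR: `buildProxiedLaunchOptions` and the proxy address are hypothetical, and the proxy itself (for example Squid) would still need rules that deny private ranges.

```typescript
// Hypothetical helper: builds launch options that send browser traffic through a forward proxy.
interface ProxiedLaunchOptions {
    args: string[]
}

function buildProxiedLaunchOptions(proxyUrl: string): ProxiedLaunchOptions {
    // new URL() throws on malformed input, so a bad proxy address fails early.
    const parsed = new URL(proxyUrl)
    return {
        args: [
            `--proxy-server=${parsed.protocol}//${parsed.host}`,
            // Chromium skips the proxy for loopback hosts by default;
            // this entry removes that implicit bypass.
            '--proxy-bypass-list=<-loopback>'
        ]
    }
}

// Usage with Puppeteer (not executed here; the proxy host is a placeholder):
// const browser = await puppeteer.launch(buildProxiedLaunchOptions('http://egress-proxy.internal:3128'))
```

Routing through a proxy moves the allow/deny decision to connection time, which is why it also helps against DNS rebinding: the proxy resolves the hostname itself and can refuse internal addresses.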

fix: check deny list in webscrappers

86c0c26

0xi4o wants to merge 1 commit into main from fix/flowise-394
