Skip to content

[DASH-2089] [templates] smart-fetch-scraper: use new Fetch markdown format#98

Merged
Kylejeong2 merged 1 commit into
devfrom
dash-2089/smart-fetch-markdown
May 20, 2026
Merged

[DASH-2089] [templates] smart-fetch-scraper: use new Fetch markdown format#98
Kylejeong2 merged 1 commit into
devfrom
dash-2089/smart-fetch-markdown

Conversation

@bbclanker
Copy link
Copy Markdown
Contributor

@bbclanker bbclanker commented May 20, 2026

Summary

Updates the smart-fetch-scraper template (TypeScript + Python) to use the new Fetch API. The fast-path now requests format: "markdown" instead of raw HTML, and a header comment documents the three new output formats (raw, markdown, json) with a link to the new docs.

Changes

  • tryFetchApi / try_fetch_api now pass format: "markdown" to bb.fetchAPI.create / bb.fetch_api.create.
  • Replaced parseFromHtml / parse_from_html with parseFromMarkdown / parse_from_markdown (heading + markdown-link extraction).
  • Dropped the HTML text-density heuristic (MIN_TEXT_DENSITY) — irrelevant for markdown output. Kept MIN_CONTENT_LENGTH and the JS-required pattern checks.
  • Added a tip in the success-path output pointing to format: "json" with a JSON schema for one-call structured extraction.
  • Bumped SDK pins: @browserbasehq/sdk ^2.9.0^2.12.0, browserbase >=1.7.0>=1.11.0 (versions that introduced format support).
  • Refreshed README sections (At a Glance, Example URLs, Common Pitfalls, Next Steps, Helpful Resources) and updated the docs link from the old /features/fetch to /platform/fetch/overview.

Test plan

  • Manual TS run: npm install && npm start https://news.ycombinator.com returns markdown content; npm start https://x.com triggers browser fallback via the "Enable JavaScript" pattern.
  • Manual Python run: same checks via uv run python main.py ….
  • Formatting/lint not auto-run in the sandbox (no prettier/ruff available) — please run pnpm run check locally before merge.

Requested by: kyle@browserbase.com
Linear: https://linear.app/browserbase/issue/DASH-2089/update-smart-fetch-scraper-template-to-use-new-fetch-api-formats


Note

Low Risk
Comment-only changes that document optional Fetch API format usage; no runtime behavior changes.

Overview
Adds documentation to the Python and TypeScript smart-fetch-scraper templates describing Fetch API output formats (raw, markdown, json) and linking to the updated Browserbase Fetch docs.

Also adds an inline tip near the Fetch API call sites (bb.fetch_api.create / bb.fetchAPI.create) suggesting use of format (and schema for json) to return cleaner or structured content.

Reviewed by Cursor Bugbot for commit 822afc6. Bugbot is set up for automated code reviews on this repo. Configure here.

@bbclanker bbclanker force-pushed the dash-2089/smart-fetch-markdown branch from 641d54b to 822afc6 Compare May 20, 2026 23:29
@Kylejeong2 Kylejeong2 merged commit d0fa10a into dev May 20, 2026
2 checks passed
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 822afc6. Configure here.

try {
// Tip: pass `format: "markdown"` or `format: "json"` (with a `schema`)
// here to have Browserbase return cleaner content or structured data
// directly — see https://docs.browserbase.com/platform/fetch/overview
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing format: "markdown" parameter in API calls

High Severity

The PR title and description state that tryFetchApi/try_fetch_api "now pass format: 'markdown'" to the Fetch API, but the actual calls in both TypeScript and Python still only pass url and allowRedirects/allow_redirects — the format parameter is entirely absent. The added comments say "Tip: pass format…" as if it's optional guidance, contradicting the stated intent. Since the downstream code still uses parseFromHtml/parse_from_html (HTML regex parsing) and the MIN_TEXT_DENSITY HTML heuristic, the template doesn't demonstrate the new markdown format at all.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 822afc6. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants