---
title: Transition Guide from v1 to v2
description: Move from v1 to v2 quickly and safely
---

# Transition from v1 to v2

Once you log in to v2, your v1 code will be deprecated after 7 days. The legacy dashboard at [dashboard.scrapegraphai.com/login](https://dashboard.scrapegraphai.com/login/) will remain available temporarily during the transition.

If you are coming from the legacy v1 docs, use this page as your migration checkpoint.

Before anything else, log in to the dashboard at scrapegraphai.com/login.

## Method-by-method migration

Use this table to map old entry points to new ones. Details and examples follow below.

| v1 | v2 | Notes |
| --- | --- | --- |
| `markdownify` | `scrape` with a markdown format (`formats=[MarkdownFormatConfig()]` in Python, `formats: [{ type: "markdown" }]` in JS) | HTML → markdown and related "raw page" outputs live under `scrape`. |
| `smartscraper` / `smartScraper` | `extract` | Same job: structured extraction from a URL. Rename params and pass extra fetch/LLM options via config objects. |
| `searchscraper` / `searchScraper` | `search` | Web search + extraction; use `query` (or a positional string in JS). |
| `smartcrawler` (single start call) | `crawl.start`, then `crawl.get`, `crawl.stop`, `crawl.resume`, `crawl.delete` | Crawl is explicitly async: you poll or track the job id. |
| Monitors (if you used them) | `monitor.create`, `monitor.list`, `monitor.get`, pause/resume/delete | Same product, namespaced API. |
| `sitemap` | Removed from v2 SDKs | Discover URLs with `crawl.start` and URL patterns, or call the REST sitemap endpoint if your integration still requires it; see Sitemap and the SDK release notes. |
| `healthz` / `checkHealth`, `feedback`, built-in mock helpers | Removed or changed | Use `credits`, `history`, and dashboard features; check the SDK migration guides for replacements. |
| `agenticscraper` | Removed | Use `extract` with `FetchConfig` (e.g. `mode="js"`, `stealth=True`, `wait=2000`) for hard pages, or `crawl.start` for multi-page flows. |

## Code-level transition

### 1. Markdownify → scrape

Before: `markdownify(url)`.

After: `scrape` with a markdown format entry: `formats=[MarkdownFormatConfig()]` (Python) or `formats: [{ type: "markdown" }]` (JS).

```python
from scrapegraph_py import ScrapeGraphAI, ScrapeRequest, MarkdownFormatConfig

# reads SGAI_API_KEY from env, or pass explicitly: ScrapeGraphAI(api_key="...")
sgai = ScrapeGraphAI()

res = sgai.scrape(ScrapeRequest(
    url="https://example.com",
    formats=[MarkdownFormatConfig()],
))

if res.status == "success":
    print(res.data.results["markdown"]["data"][0])
```

```javascript
import { ScrapeGraphAI } from "scrapegraph-js";

// reads SGAI_API_KEY from env, or pass explicitly: ScrapeGraphAI({ apiKey: "..." })
const sgai = ScrapeGraphAI();

const res = await sgai.scrape({
  url: "https://example.com",
  formats: [{ type: "markdown" }],
});

if (res.status === "success") {
  console.log(res.data?.results.markdown?.data?.[0]);
}
```

### 2. SmartScraper → extract

Before (v1): `website_url` + `user_prompt`, optional flags on the same object.

After (v2): `url` + `prompt`; move fetch-related flags into `FetchConfig` / `fetchConfig`.

Python, before (v1):

```python
from scrapegraph_py import Client

client = Client(api_key="your-api-key")
response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the title and price",
    stealth=True,
)
```

Python, after (v2):

```python
from scrapegraph_py import ScrapeGraphAI, ExtractRequest, FetchConfig

sgai = ScrapeGraphAI()
res = sgai.extract(ExtractRequest(
    url="https://example.com",
    prompt="Extract the title and price",
    fetch_config=FetchConfig(stealth=True),
))

if res.status == "success":
    print(res.data.json_data)
```

JavaScript, before (v1):

```javascript
import { smartScraper } from "scrapegraph-js";

const response = await smartScraper(apiKey, {
  website_url: "https://example.com",
  user_prompt: "Extract the title and price",
  stealth: true,
});
```

JavaScript, after (v2):

```javascript
import { ScrapeGraphAI } from "scrapegraph-js";

const sgai = ScrapeGraphAI();

const res = await sgai.extract({
  url: "https://example.com",
  prompt: "Extract the title and price",
  fetchConfig: { stealth: true },
});

if (res.status === "success") {
  console.log(res.data?.json);
}
```

### 3. SearchScraper → search

Before: `searchscraper` / `searchScraper` with a prompt-style query.

After: `search` with `query` (a Python keyword argument; in JS, pass a `query` field in the options object, or the query as the first positional string).

```python
from scrapegraph_py import ScrapeGraphAI, SearchRequest

sgai = ScrapeGraphAI()
res = sgai.search(SearchRequest(
    query="Latest pricing for product X",
    num_results=5,
))

if res.status == "success":
    for r in res.data.results:
        print(r.title, "-", r.url)
```

```javascript
import { ScrapeGraphAI } from "scrapegraph-js";

const sgai = ScrapeGraphAI();

const res = await sgai.search({
  query: "Latest pricing for product X",
  numResults: 5,
});

if (res.status === "success") {
  for (const r of res.data?.results ?? []) console.log(r.title, "-", r.url);
}
```

### 4. Crawl jobs

Before: one-shot `crawl(...)`-style usage, depending on SDK version.

After: start a job, then poll or use webhooks as documented:

```python
from scrapegraph_py import ScrapeGraphAI, CrawlRequest

sgai = ScrapeGraphAI()

start = sgai.crawl.start(CrawlRequest(
    url="https://example.com",
    max_depth=2,
    include_patterns=["/blog/*"],
    exclude_patterns=["/admin/*"],
))

status = sgai.crawl.get(start.data.id)
print(status.data.status, status.data.finished, "/", status.data.total)
```

```javascript
import { ScrapeGraphAI } from "scrapegraph-js";

const sgai = ScrapeGraphAI();

const start = await sgai.crawl.start({
  url: "https://example.com",
  maxDepth: 2,
  includePatterns: ["/blog/*"],
  excludePatterns: ["/admin/*"],
});
const status = await sgai.crawl.get(start.data.id);
```
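Because crawl is explicitly async, most callers end up wrapping `crawl.get` in a polling loop. A minimal sketch follows; the terminal status names (`"success"`, `"failed"`) are assumptions here, so verify them against the crawl documentation. The status function is injected so the loop works with any SDK client:

```python
import time

def poll_crawl(get_status, job_id, interval=2.0, timeout=120.0):
    """Poll get_status(job_id) until the crawl job reaches a terminal state.

    get_status is any callable returning an object with a .status attribute,
    e.g. lambda jid: sgai.crawl.get(jid).data. The terminal state names used
    here are assumptions; check the crawl docs for the real values.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status(job_id)
        if status.status in ("success", "failed"):
            return status
        time.sleep(interval)  # back off between polls
    raise TimeoutError(f"crawl job {job_id} did not finish within {timeout}s")
```

For example, `poll_crawl(lambda jid: sgai.crawl.get(jid).data, start.data.id)` would block until the job above completes or the timeout elapses.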

### 5. REST calls

If you call the API with curl or a generic HTTP client:

- Use the v2 host and path pattern: `https://v2-api.scrapegraphai.com/api/<endpoint>` (e.g. `/api/scrape`, `/api/extract`, `/api/search`, `/api/crawl`, `/api/monitor`).
- Replace JSON fields to match v2 bodies (e.g. `url` and `prompt` instead of `website_url` and `user_prompt` on extract; `formats: [{ type: "markdown" }]` instead of `format: "markdown"`).
- Authenticate with the `SGAI-APIKEY` header.
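Under that mapping, a raw call to the extract endpoint might look like the following standard-library Python sketch. The request is built but deliberately not sent; confirm the exact payload shape in the API reference before relying on it:

```python
import json
import urllib.request

# Hedged sketch: endpoint path, field names, and header follow the bullet
# points above; the payload values are placeholders.
payload = {"url": "https://example.com", "prompt": "Extract the title and price"}

req = urllib.request.Request(
    "https://v2-api.scrapegraphai.com/api/extract",
    data=json.dumps(payload).encode("utf-8"),
    headers={"SGAI-APIKEY": "your-api-key", "Content-Type": "application/json"},
    method="POST",
)

# To actually send it: urllib.request.urlopen(req) -- omitted here so the
# sketch stays side-effect free.
print(req.full_url, req.get_method())
```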

Exact paths and payloads are listed under each service (for example Scrape) and in the API reference.

## What else changed in v2 (docs & product)

- Unified and clearer API documentation
- Updated service pages and endpoint organization
- New guides for MCP server and SDK usage

## Recommended path

1. Log in at scrapegraphai.com/login
2. Start from Introduction
3. Follow Installation
4. Upgrade packages: `pip install -U scrapegraph-py` / `npm i scrapegraph-js@latest` (requires scrapegraph-js ≥ 2.0.1 and Node ≥ 22)

## SDK migration guides (detailed changelogs)

Full method documentation:

## Legacy v1 docs

You can still access v1 documentation here: