Problem
IMDB has enabled AWS WAF JavaScript challenges on all www.imdb.com HTML endpoints. Non-browser HTTP clients (including MediaElch) receive HTTP 202 with an empty response body. The response header x-amzn-waf-action: challenge confirms the block.
This affects all HTML-based IMDB functionality:
- Search (/find?q=...): no results
- Title page (/title/ttXXXX/): no details
- Reference page (/title/ttXXXX/reference/): no additional data
The issue has been reported before as intermittent (#1952), but as of March 20, 2026 it appears to be permanent. The current IMDB scraper is completely non-functional.
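To make the failure mode explicit rather than silently returning empty results, the scraper could recognise the challenge response before attempting to parse anything. The following is a minimal sketch in plain C++ (no Qt), not existing MediaElch code: the function name isWafChallenge and the header map type are hypothetical, and the check only encodes the three symptoms listed above (HTTP 202, empty body, x-amzn-waf-action: challenge).

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Lower-case an ASCII string; HTTP header names are case-insensitive.
static std::string toLowerAscii(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
        [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical helper (not part of MediaElch): returns true when a response
// shows all three symptoms described in this issue: status 202, an empty
// body, and an x-amzn-waf-action header with the value "challenge".
bool isWafChallenge(int statusCode,
    const std::map<std::string, std::string>& headers,
    const std::string& body)
{
    if (statusCode != 202 || !body.empty()) {
        return false;
    }
    for (const auto& header : headers) {
        if (toLowerAscii(header.first) == "x-amzn-waf-action"
            && toLowerAscii(header.second) == "challenge") {
            return true;
        }
    }
    return false;
}
```

A caller could use this to surface a clear "blocked by IMDB" error to the user instead of an empty result list.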
Working alternatives
Two IMDB API endpoints remain accessible and return JSON directly (no HTML parsing needed):
1. Suggest API (for search)
- URL: https://v3.sg.media-imdb.com/suggestion/x/{query}.json
- Method: GET, no authentication
- Returns: IMDB ID, title, year, type (movie/tv/short), poster URL, top cast
- Example: Searching "Inception" returns tt1375666, year 2010, type "movie", poster, cast
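Building the Suggest URL could look roughly like the sketch below. It is plain C++ for illustration only: the helper names percentEncode and buildSuggestUrl are hypothetical, and percent-encoding the query is an assumption, since the URL template above only shows a {query} placeholder and does not say how special characters must be encoded.

```cpp
#include <string>

// Percent-encode every byte except the RFC 3986 "unreserved" characters.
// Assumption: the Suggest endpoint accepts a percent-encoded query segment.
std::string percentEncode(const std::string& input)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : input) {
        const bool unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
            || (c >= '0' && c <= '9') || c == '-' || c == '_' || c == '.' || c == '~';
        if (unreserved) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Hypothetical helper: fills the {query} placeholder of the Suggest URL.
std::string buildSuggestUrl(const std::string& query)
{
    return "https://v3.sg.media-imdb.com/suggestion/x/" + percentEncode(query) + ".json";
}
```

In MediaElch itself the equivalent would presumably be done with Qt's URL classes; the standard-library version here just keeps the example self-contained.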
2. GraphQL API (for details)
- URL: https://graphql.imdb.com/
- Method: POST with JSON body, no authentication
- Returns: Virtually all title metadata — ratings, plot, genres, runtime, cast, crew, Metacritic score, etc.
- Example query:
    {
      title(id: "tt1375666") {
        titleText { text }
        releaseYear { year }
        ratingsSummary { aggregateRating voteCount }
        plot { plotText { plainText } }
        genres { genres { text } }
        metacritic { metascore { score } }
        runtime { seconds }
      }
    }
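For the POST request, the query has to be wrapped in a JSON body. The sketch below assumes the endpoint follows the usual GraphQL-over-HTTP convention of a JSON object with a "query" string member; the issue itself only states "POST with JSON body", so treat the envelope as an assumption. All function names are hypothetical, and the query is shortened to two fields from the example above.

```cpp
#include <algorithm>
#include <cctype>
#include <cstdio>
#include <string>

// Escape a string so it can be embedded in a JSON string literal.
std::string jsonEscape(const std::string& input)
{
    std::string out;
    for (unsigned char c : input) {
        switch (c) {
        case '"': out += "\\\""; break;
        case '\\': out += "\\\\"; break;
        case '\n': out += "\\n"; break;
        case '\r': out += "\\r"; break;
        case '\t': out += "\\t"; break;
        default:
            if (c < 0x20) {
                // Remaining control characters use the \u00XX form.
                char buf[7];
                std::snprintf(buf, sizeof(buf), "\\u%04x", static_cast<unsigned>(c));
                out += buf;
            } else {
                out += static_cast<char>(c);
            }
        }
    }
    return out;
}

// Accept only "tt" followed by digits, so that user input cannot inject
// arbitrary text into the query.
bool looksLikeImdbId(const std::string& id)
{
    return id.size() > 2 && id.compare(0, 2, "tt") == 0
        && std::all_of(id.begin() + 2, id.end(),
            [](unsigned char c) { return std::isdigit(c) != 0; });
}

// Hypothetical helper: returns a shortened title query, or an empty string
// if the ID does not look like an IMDB title ID.
std::string buildTitleQuery(const std::string& imdbId)
{
    if (!looksLikeImdbId(imdbId)) {
        return std::string();
    }
    return "{ title(id: \"" + imdbId + "\") { titleText { text } releaseYear { year } } }";
}

// Assumed envelope: {"query":"<escaped GraphQL document>"}.
std::string buildGraphQlBody(const std::string& query)
{
    return "{\"query\":\"" + jsonEscape(query) + "\"}";
}
```

The resulting string would be sent with a Content-Type of application/json. A real implementation would more likely build the body with a JSON library than by string concatenation.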
Note on terms of use
The GraphQL API response includes a disclaimer: "Public, commercial, and/or non-private use of the IMDb data provided by this API is not allowed." MediaElch is LGPL-licensed and non-commercial, but the disclaimer should still be weighed before the scraper relies on this API.
Affected code
- src/scrapers/imdb/ImdbApi.cpp: URL construction, HTTP requests
- src/scrapers/imdb/ImdbSearchPage.cpp: search result parsing (HTML-based)
- src/scrapers/imdb/ImdbJsonParser.cpp: title detail parsing from __NEXT_DATA__
- src/scrapers/imdb/ImdbReferencePage.cpp: reference page parsing
- All movie and TV scraper jobs that depend on these classes
Proposed approach
Replace the HTML-based scraper with API-based requests:
- Search: Replace ImdbSearchPage with a Suggest API parser
- Details: Replace ImdbJsonParser + ImdbReferencePage with GraphQL API queries
- Preserve the existing interface: ImdbApi remains the entry point; only the internal implementation changes
This would also resolve or improve several existing issues:
Closing PRs #1955 and #1956 as they are based on the now-blocked HTML approach.
Analyzed with AI assistance (Claude Code / Opus 4.6).