A comprehensive testing infrastructure has been implemented for ScrapeGraphAI with support for unit tests, integration tests, performance benchmarking, and automated CI/CD pipelines.
- Complete pytest configuration with coverage tracking
- Custom markers for test categorization (integration, slow, benchmark, etc.)
- Code coverage settings with HTML/XML reports
- Test discovery patterns and exclusions
- Shared fixtures for all LLM providers (OpenAI, Ollama, Anthropic, Groq, Azure, Gemini)
- Mock LLM and embedder fixtures for unit testing
- Test data fixtures (HTML, JSON, XML, CSV)
- Temporary file fixtures
- Performance tracking fixtures
- Custom pytest hooks and CLI options
- Automatic test filtering based on markers
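The `mock_llm_model` fixture mentioned above (and used in the example tests later in this document) might look like the following minimal sketch. This is an illustrative assumption, not the actual `tests/conftest.py` implementation; the canned JSON response and the `MagicMock`-based shape are placeholders:

```python
import pytest
from unittest.mock import MagicMock


def make_mock_llm():
    """Build a stand-in LLM object that returns a canned JSON string.

    No network calls are made, so tests using it stay fast and deterministic.
    """
    llm = MagicMock()
    llm.invoke.return_value = '{"title": "Example", "items": []}'
    return llm


@pytest.fixture
def mock_llm_model():
    """Fixture form, matching the fixture name used in the unit tests."""
    return make_mock_llm()
```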
A fully functional HTTP server for consistent testing without external dependencies:
Features:
- Static HTML pages (home, products, projects)
- JSON/XML/CSV API endpoints
- Slow response simulation
- Error condition testing (404, 500)
- Rate limiting simulation
- Dynamic content generation
- Pagination support
- Thread-safe operation
Endpoints:
- `/` - Home page
- `/products` - Product listings with prices and stock status
- `/projects` - Project listings with descriptions
- `/api/data.json` - JSON data endpoint
- `/api/data.xml` - XML data endpoint
- `/api/data.csv` - CSV data endpoint
- `/slow` - 2-second delay simulation
- `/error/404` - 404 error page
- `/error/500` - 500 error page
- `/rate-limited` - Rate limit testing (5 requests max)
- `/dynamic` - Dynamically generated content
- `/pagination?page=N` - Paginated content
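As an illustration of how such a server can be built on the standard library alone, here is a minimal sketch. The real implementation lives in `tests/fixtures/mock_server/server.py`; the handler class, `start_server()` helper, and response payloads below are simplified assumptions:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class _Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/api/data.json":
            body = json.dumps({"products": [{"name": "Widget", "price": 9.99}]}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
        elif self.path.startswith("/error/"):
            # e.g. /error/404 or /error/500 returns that status code
            code = int(self.path.rsplit("/", 1)[-1])
            body = f"Error {code}".encode()
            self.send_response(code)
            self.send_header("Content-Type", "text/plain")
        else:
            body = b"<html><body>Home</body></html>"
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging in test output


def start_server():
    """Start the server on a free port in a daemon thread; return (server, base_url)."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), _Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, f"http://127.0.0.1:{server.server_address[1]}"
```

`ThreadingHTTPServer` handles each request on its own thread, which is what makes concurrent-scraping tests safe against a single shared instance.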
Components:
- `BenchmarkResult` - Individual test result tracking
- `BenchmarkSummary` - Statistical analysis across multiple runs
- `BenchmarkTracker` - Result collection and reporting
- `benchmark()` - Decorator/function for benchmarking
- Baseline comparison utilities
- Performance regression detection
Metrics Tracked:
- Execution time (mean, median, std dev, min, max)
- Memory usage
- Token usage
- API call counts
- Success rates
Features:
- JSON export of results
- Human-readable reports
- Warmup runs support
- Multiple test runs with statistics
- Baseline comparison for regression detection
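A minimal sketch of how the tracked timing statistics could be aggregated across runs. The `BenchmarkResult` shape and the `summarize()` helper here are simplified assumptions; the actual framework lives in `tests/fixtures/benchmarking.py`:

```python
import statistics
from dataclasses import dataclass


@dataclass
class BenchmarkResult:
    """One measured run of a benchmarked operation."""
    test_name: str
    execution_time: float
    success: bool = True


def summarize(results):
    """Aggregate execution times of successful runs into summary statistics."""
    times = [r.execution_time for r in results if r.success]
    return {
        "runs": len(times),
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        "stdev": statistics.stdev(times) if len(times) > 1 else 0.0,
        "min": min(times),
        "max": max(times),
    }
```

Baseline comparison then reduces to comparing a fresh summary's `mean` (or `median`) against a stored JSON baseline and flagging regressions beyond a tolerance.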
Assertion Helpers:
- `assert_valid_scrape_result()` - Validate scraping results
- `assert_execution_info_valid()` - Validate execution metadata
- `assert_response_time_acceptable()` - Performance assertions
- `assert_no_errors_in_result()` - Error detection
Mock Response Builders:
- `create_mock_llm_response()` - Generate mock LLM responses
- `create_mock_graph_result()` - Mock graph execution results
Data Generators:
- `generate_test_html()` - Customizable HTML generation
- `generate_test_json()` - Test JSON data
- `generate_test_csv()` - Test CSV data
Validation Utilities:
- `validate_schema_match()` - Pydantic schema validation
- `validate_extracted_fields()` - Field extraction validation
Additional Utilities:
- `RateLimitHelper` - Rate-limit testing
- `retry_with_backoff()` - Retry logic with exponential backoff
- `compare_results()` - Result comparison
- `fuzzy_match_strings()` - Fuzzy string matching
- File loading and saving utilities
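For illustration, `retry_with_backoff()` could be implemented along these lines. The signature and defaults shown are assumptions; see `tests/fixtures/helpers.py` for the actual version:

```python
import time


def retry_with_backoff(func, max_retries=3, base_delay=0.1, exceptions=(Exception,)):
    """Call func(), retrying on failure with exponentially growing delays.

    Delays follow base_delay * 2**attempt; the last failure is re-raised.
    """
    for attempt in range(max_retries):
        try:
            return func()
        except exceptions:
            if attempt == max_retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
```

This pattern is useful in integration tests that hit real LLM providers, where transient rate-limit or network errors are expected.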
- SmartScraperGraph with multiple LLM providers
- Schema-based scraping tests
- Timeout handling tests
- Error condition tests (404, 500)
- Performance benchmarks
- Real website testing support
- SmartScraperMultiGraph tests
- Concurrent scraping tests
- Performance benchmarks for multi-page scraping
- SearchGraph integration tests
- JSONScraperGraph tests (files and URLs)
- XMLScraperGraph tests (files and URLs)
- CSVScraperGraph tests (files and URLs)
- Performance benchmarks for file format scrapers
Jobs:
- Unit Tests
  - Matrix: Ubuntu, macOS, Windows
  - Python versions: 3.10, 3.11, 3.12
  - Coverage reporting to Codecov
  - Fast execution without external dependencies
- Integration Tests
  - Test groups: smart-scraper, multi-graph, file-formats
  - Real LLM provider testing (with API keys)
  - Artifact uploads for test results
- Performance Benchmarks
  - Track execution time and resource usage
  - Save results as artifacts
  - Compare against baseline (on PRs)
- Code Quality
  - Ruff linting
  - Black formatting check
  - isort import sorting check
  - mypy type checking
- Test Coverage Report
  - Aggregate coverage from all jobs
  - PR comments with coverage changes
- Test Summary
  - Overall test status reporting
Triggers:
- Push to main, pre/beta, dev branches
- Pull requests to main, pre/beta
- Manual workflow dispatch
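In GitHub Actions workflow syntax, the triggers above correspond to an `on:` block along these lines (branch names taken from this document; the authoritative definition is in `.github/workflows/test-suite.yml`):

```yaml
on:
  push:
    branches: [main, pre/beta, dev]
  pull_request:
    branches: [main, pre/beta]
  workflow_dispatch:
```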
Comprehensive guide covering:
- Test organization structure
- Running different test types
- Using fixtures and markers
- Performance benchmarking
- Mock server usage
- Environment variables
- Writing new tests (with templates)
- Best practices
- Troubleshooting
Test compatibility across all supported LLM providers:
- OpenAI (GPT-3.5, GPT-4)
- Ollama (local models)
- Anthropic Claude
- Groq
- Azure OpenAI
- Google Gemini
Organized test categorization:
- `@pytest.mark.unit` - Fast unit tests
- `@pytest.mark.integration` - Integration tests
- `@pytest.mark.slow` - Long-running tests
- `@pytest.mark.benchmark` - Performance tests
- `@pytest.mark.requires_api_key` - Needs API credentials
```bash
# Unit tests only
pytest -m "unit or not integration"

# Integration tests
pytest --integration

# Performance benchmarks
pytest --benchmark -m benchmark

# Slow tests
pytest --slow

# With coverage
pytest --cov=scrapegraphai --cov-report=html
```

- No external dependencies for basic tests
- Consistent, reproducible test conditions
- Simulate error conditions and edge cases
- Test rate limiting and timeouts
- Fast test execution
- Automatic tracking of execution time
- Token usage monitoring
- API call counting
- Regression detection
- Baseline comparison
```python
def test_with_mock(mock_llm_model):
    """Fast test with mocked LLM."""
    result = some_function(mock_llm_model)
    assert result is not None
```

```python
@pytest.mark.integration
@pytest.mark.requires_api_key
def test_real_scraping(openai_config, mock_server):
    """Test with real LLM and mock server."""
    url = mock_server.get_url("/products")
    scraper = SmartScraperGraph(
        prompt="Extract products",
        source=url,
        config=openai_config,
    )
    result = scraper.run()
    assert_valid_scrape_result(result)
```

```python
@pytest.mark.benchmark
def test_performance(benchmark_tracker, openai_config):
    """Benchmark scraping performance."""
    import time

    start = time.perf_counter()
    # Run the operation being measured
    end = time.perf_counter()
    benchmark_tracker.record(BenchmarkResult(
        test_name="my_test",
        execution_time=end - start,
        success=True,
    ))
```

- Comprehensive Coverage: Unit, integration, and performance tests
- Fast Feedback: Quick unit tests with extensive mocking
- Real-World Testing: Integration tests with actual LLM providers
- Performance Monitoring: Track and prevent performance regressions
- CI/CD Ready: Automated testing in GitHub Actions
- Developer Friendly: Clear documentation and templates
- Flexible Execution: Run specific test subsets easily
- Cross-Platform: Tested on Linux, macOS, Windows
- Multi-Python: Support for Python 3.10, 3.11, 3.12
- Add more integration tests for additional graph types
- Expand mock server with more realistic scenarios
- Add visual regression testing for screenshot comparisons
- Implement mutation testing for test quality
- Add property-based testing with Hypothesis
- Create performance dashboards for trend visualization
- Add load testing for concurrent scraping scenarios
New Files:
- `pytest.ini` - Pytest configuration
- `tests/conftest.py` - Shared fixtures
- `tests/fixtures/mock_server/server.py` - Mock HTTP server
- `tests/fixtures/benchmarking.py` - Performance framework
- `tests/fixtures/helpers.py` - Test utilities
- `tests/integration/test_smart_scraper_integration.py`
- `tests/integration/test_multi_graph_integration.py`
- `tests/integration/test_file_formats_integration.py`
- `.github/workflows/test-suite.yml` - CI/CD workflow
- `tests/README_TESTING.md` - Testing documentation
- `TESTING_INFRASTRUCTURE.md` - This file
Directories Created:
- `tests/fixtures/`
- `tests/fixtures/mock_server/`
- `tests/integration/`
- `benchmark_results/` (auto-created when running benchmarks)
When adding new tests:
- Use appropriate fixtures from conftest.py
- Add proper markers (`@pytest.mark.*`)
- Follow existing test structure
- Update documentation as needed
- Ensure tests pass in CI
For questions or issues with the testing infrastructure, please open an issue on GitHub.