
Enhanced Testing Infrastructure - Implementation Summary

Overview

A comprehensive testing infrastructure has been implemented for ScrapeGraphAI with support for unit tests, integration tests, performance benchmarking, and automated CI/CD pipelines.

What Was Added

1. Core Testing Configuration

pytest.ini

  • Complete pytest configuration with coverage tracking
  • Custom markers for test categorization (integration, slow, benchmark, etc.)
  • Code coverage settings with HTML/XML reports
  • Test discovery patterns and exclusions
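
For orientation, the core of such a configuration might look like the following (a minimal, illustrative sketch, not the repository's exact file):

[pytest]
testpaths = tests
addopts = --cov=scrapegraphai --cov-report=html --cov-report=xml
markers =
    unit: fast unit tests with no external dependencies
    integration: tests that exercise real components
    slow: long-running tests
    benchmark: performance benchmark tests
    requires_api_key: tests that need real API credentials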

tests/conftest.py

  • Shared fixtures for all LLM providers (OpenAI, Ollama, Anthropic, Groq, Azure, Gemini)
  • Mock LLM and embedder fixtures for unit testing
  • Test data fixtures (HTML, JSON, XML, CSV)
  • Temporary file fixtures
  • Performance tracking fixtures
  • Custom pytest hooks and CLI options
  • Automatic test filtering based on markers
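
As an illustration, a mocked-LLM fixture in this style might look like the sketch below (the fixture name matches the usage examples later in this document; the mock's invoke() interface is an assumption):

# Sketch of a conftest.py fixture; the mocked interface is illustrative.
from unittest.mock import MagicMock

import pytest


@pytest.fixture
def mock_llm_model():
    """Return a mocked LLM with a canned response, so unit tests
    run without network access or API keys."""
    llm = MagicMock()
    llm.invoke.return_value = '{"title": "Example", "items": []}'
    return llm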

2. Mock HTTP Server (tests/fixtures/mock_server/)

A fully functional HTTP server for consistent testing without external dependencies:

Features:

  • Static HTML pages (home, products, projects)
  • JSON/XML/CSV API endpoints
  • Slow response simulation
  • Error condition testing (404, 500)
  • Rate limiting simulation
  • Dynamic content generation
  • Pagination support
  • Thread-safe operation

Endpoints:

  • / - Home page
  • /products - Product listings with prices and stock status
  • /projects - Project listings with descriptions
  • /api/data.json - JSON data endpoint
  • /api/data.xml - XML data endpoint
  • /api/data.csv - CSV data endpoint
  • /slow - 2-second delay simulation
  • /error/404 - 404 error page
  • /error/500 - 500 error page
  • /rate-limited - Rate limit testing (5 requests max)
  • /dynamic - Dynamically generated content
  • /pagination?page=N - Paginated content
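
A test can exercise these endpoints directly, as in this sketch (assuming the mock_server fixture exposes the get_url() helper used in the usage examples later in this document):

import json
import urllib.request


def test_json_endpoint(mock_server):
    # Fetch the JSON endpoint and verify it returns parseable, non-empty data.
    url = mock_server.get_url("/api/data.json")
    with urllib.request.urlopen(url) as response:
        data = json.loads(response.read())
    assert data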

3. Performance Benchmarking (tests/fixtures/benchmarking.py)

Components:

  • BenchmarkResult - Individual test result tracking
  • BenchmarkSummary - Statistical analysis across multiple runs
  • BenchmarkTracker - Result collection and reporting
  • benchmark() - Decorator/function for benchmarking
  • Baseline comparison utilities
  • Performance regression detection

Metrics Tracked:

  • Execution time (mean, median, std dev, min, max)
  • Memory usage
  • Token usage
  • API call counts
  • Success rates

Features:

  • JSON export of results
  • Human-readable reports
  • Warmup runs support
  • Multiple test runs with statistics
  • Baseline comparison for regression detection
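
Used as a decorator, benchmark() might look roughly like this (a sketch; the runs and warmup keyword names are assumptions about its signature):

from tests.fixtures.benchmarking import benchmark


@benchmark(runs=5, warmup=1)
def test_scrape_speed():
    # The operation to measure goes here; statistics are aggregated
    # across the five measured runs after one warmup run.
    ...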

4. Test Utilities (tests/fixtures/helpers.py)

Assertion Helpers:

  • assert_valid_scrape_result() - Validate scraping results
  • assert_execution_info_valid() - Validate execution metadata
  • assert_response_time_acceptable() - Performance assertions
  • assert_no_errors_in_result() - Error detection

Mock Response Builders:

  • create_mock_llm_response() - Generate mock LLM responses
  • create_mock_graph_result() - Mock graph execution results

Data Generators:

  • generate_test_html() - Customizable HTML generation
  • generate_test_json() - Test JSON data
  • generate_test_csv() - Test CSV data

Validation Utilities:

  • validate_schema_match() - Pydantic schema validation
  • validate_extracted_fields() - Field extraction validation

Additional Utilities:

  • RateLimitHelper - Rate limiting testing
  • retry_with_backoff() - Retry logic with exponential backoff
  • compare_results() - Result comparison
  • fuzzy_match_strings() - Fuzzy string matching
  • File loading and saving utilities
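
Combined, these helpers keep individual tests short, as in this sketch (helper signatures are assumptions based on the names above; run_scrape is a hypothetical stand-in for the code under test):

from tests.fixtures.helpers import (
    assert_valid_scrape_result,
    generate_test_html,
    retry_with_backoff,
)


def test_extraction_with_helpers(mock_llm_model):
    html = generate_test_html(title="Products", num_items=3)
    # retry_with_backoff re-invokes the callable on transient failures.
    result = retry_with_backoff(lambda: run_scrape(html, mock_llm_model))
    assert_valid_scrape_result(result)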

5. Integration Test Suite

tests/integration/test_smart_scraper_integration.py

  • SmartScraperGraph with multiple LLM providers
  • Schema-based scraping tests
  • Timeout handling tests
  • Error condition tests (404, 500)
  • Performance benchmarks
  • Real website testing support

tests/integration/test_multi_graph_integration.py

  • SmartScraperMultiGraph tests
  • Concurrent scraping tests
  • Performance benchmarks for multi-page scraping
  • SearchGraph integration tests

tests/integration/test_file_formats_integration.py

  • JSONScraperGraph tests (files and URLs)
  • XMLScraperGraph tests (files and URLs)
  • CSVScraperGraph tests (files and URLs)
  • Performance benchmarks for file format scrapers
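
A representative test from this suite might resemble the following sketch (the prompt, assertion, and fixture usage are placeholders, not the suite's exact code):

import pytest
from scrapegraphai.graphs import JSONScraperGraph


@pytest.mark.integration
@pytest.mark.requires_api_key
def test_json_scraper_from_file(openai_config, tmp_path):
    # Write a small JSON file to scrape; tmp_path is pytest's built-in fixture.
    source = tmp_path / "data.json"
    source.write_text('[{"name": "Widget", "price": 9.99}]')
    scraper = JSONScraperGraph(
        prompt="List all product names",
        source=str(source),
        config=openai_config,
    )
    result = scraper.run()
    assert result is not None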

6. GitHub Actions Workflow (.github/workflows/test-suite.yml)

Jobs:

  1. Unit Tests

    • Matrix: Ubuntu, macOS, Windows
    • Python versions: 3.10, 3.11, 3.12
    • Coverage reporting to Codecov
    • Fast execution without external dependencies
  2. Integration Tests

    • Test groups: smart-scraper, multi-graph, file-formats
    • Real LLM provider testing (with API keys)
    • Artifact uploads for test results
  3. Performance Benchmarks

    • Track execution time and resource usage
    • Save results as artifacts
    • Compare against baseline (on PRs)
  4. Code Quality

    • Ruff linting
    • Black formatting check
    • isort import sorting check
    • mypy type checking
  5. Test Coverage Report

    • Aggregate coverage from all jobs
    • PR comments with coverage changes
  6. Test Summary

    • Overall test status reporting

Triggers:

  • Push to main, pre/beta, dev branches
  • Pull requests to main, pre/beta
  • Manual workflow dispatch

7. Documentation

tests/README_TESTING.md

Comprehensive guide covering:

  • Test organization structure
  • Running different test types
  • Using fixtures and markers
  • Performance benchmarking
  • Mock server usage
  • Environment variables
  • Writing new tests (with templates)
  • Best practices
  • Troubleshooting

Key Features

Multi-Provider Support

Test compatibility across all supported LLM providers:

  • OpenAI (GPT-3.5, GPT-4)
  • Ollama (local models)
  • Anthropic Claude
  • Groq
  • Azure OpenAI
  • Google Gemini

Test Markers

Organized test categorization:

  • @pytest.mark.unit - Fast unit tests
  • @pytest.mark.integration - Integration tests
  • @pytest.mark.slow - Long-running tests
  • @pytest.mark.benchmark - Performance tests
  • @pytest.mark.requires_api_key - Needs API credentials
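
The automatic filtering mentioned under conftest.py above can be implemented with a collection hook along these lines (a sketch; the actual hook and the environment variable it checks may differ):

import os

import pytest


def pytest_collection_modifyitems(config, items):
    # Auto-skip credential-dependent tests when no API key is configured.
    if os.getenv("OPENAI_API_KEY"):
        return
    skip_marker = pytest.mark.skip(reason="requires API credentials")
    for item in items:
        if "requires_api_key" in item.keywords:
            item.add_marker(skip_marker)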

Flexible Test Execution

# Unit tests (skips anything marked integration)
pytest -m "unit or not integration"

# Integration tests
pytest --integration

# Performance benchmarks
pytest --benchmark -m benchmark

# Slow tests
pytest --slow

# With coverage
pytest --cov=scrapegraphai --cov-report=html

Mock Server Benefits

  • No external dependencies for basic tests
  • Consistent, reproducible test conditions
  • Simulate error conditions and edge cases
  • Test rate limiting and timeouts
  • Fast test execution

Performance Tracking

  • Automatic tracking of execution time
  • Token usage monitoring
  • API call counting
  • Regression detection
  • Baseline comparison
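
Regression detection against a saved baseline can be as simple as the sketch below (the JSON layout and the 20% threshold are illustrative assumptions):

import json
from pathlib import Path


def check_regression(test_name: str, current_mean: float,
                     baseline_path: Path, tolerance: float = 0.20) -> bool:
    """Return True if current_mean is within tolerance of the stored baseline."""
    baseline = json.loads(baseline_path.read_text())
    previous_mean = baseline[test_name]["mean_time"]
    return current_mean <= previous_mean * (1 + tolerance)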

Usage Examples

Basic Unit Test

def test_with_mock(mock_llm_model):
    """Fast test with mocked LLM."""
    # some_function stands in for the code under test
    result = some_function(mock_llm_model)
    assert result is not None

Integration Test

import pytest
from scrapegraphai.graphs import SmartScraperGraph

from tests.fixtures.helpers import assert_valid_scrape_result


@pytest.mark.integration
@pytest.mark.requires_api_key
def test_real_scraping(openai_config, mock_server):
    """Test with real LLM and mock server."""
    url = mock_server.get_url("/products")
    scraper = SmartScraperGraph(
        prompt="Extract products",
        source=url,
        config=openai_config
    )
    result = scraper.run()
    assert_valid_scrape_result(result)

Performance Benchmark

import time

import pytest

from tests.fixtures.benchmarking import BenchmarkResult


@pytest.mark.benchmark
def test_performance(benchmark_tracker, openai_config):
    """Benchmark scraping performance."""
    start = time.perf_counter()
    # Run the operation being measured here
    end = time.perf_counter()

    benchmark_tracker.record(BenchmarkResult(
        test_name="my_test",
        execution_time=end - start,
        success=True
    ))

Benefits

  1. Comprehensive Coverage: Unit, integration, and performance tests
  2. Fast Feedback: Quick unit tests with extensive mocking
  3. Real-World Testing: Integration tests with actual LLM providers
  4. Performance Monitoring: Track and prevent performance regressions
  5. CI/CD Ready: Automated testing in GitHub Actions
  6. Developer Friendly: Clear documentation and templates
  7. Flexible Execution: Run specific test subsets easily
  8. Cross-Platform: Tested on Linux, macOS, Windows
  9. Multi-Python: Support for Python 3.10, 3.11, 3.12

Next Steps

  1. Add more integration tests for additional graph types
  2. Expand mock server with more realistic scenarios
  3. Add visual regression testing for screenshot comparisons
  4. Implement mutation testing for test quality
  5. Add property-based testing with Hypothesis
  6. Create performance dashboards for trend visualization
  7. Add load testing for concurrent scraping scenarios

Files Created/Modified

New Files:

  • pytest.ini - Pytest configuration
  • tests/conftest.py - Shared fixtures
  • tests/fixtures/mock_server/server.py - Mock HTTP server
  • tests/fixtures/benchmarking.py - Performance framework
  • tests/fixtures/helpers.py - Test utilities
  • tests/integration/test_smart_scraper_integration.py
  • tests/integration/test_multi_graph_integration.py
  • tests/integration/test_file_formats_integration.py
  • .github/workflows/test-suite.yml - CI/CD workflow
  • tests/README_TESTING.md - Testing documentation
  • TESTING_INFRASTRUCTURE.md - This file

Directories Created:

  • tests/fixtures/
  • tests/fixtures/mock_server/
  • tests/integration/
  • benchmark_results/ (auto-created when running benchmarks)

Contributing

When adding new tests:

  1. Use appropriate fixtures from conftest.py
  2. Add proper markers (@pytest.mark.*)
  3. Follow existing test structure
  4. Update documentation as needed
  5. Ensure tests pass in CI

For questions or issues with the testing infrastructure, please open an issue on GitHub.