contactandyc/README.md

GitHub Profile for Andy Curtis

Selected works, algorithmic research, and high-performance C infrastructure by Andy Curtis.


C C++ Python Algorithms


✦ Archival Works & Algorithms

A selection of algorithms and systems I've developed over my career, largely in the public domain at this point.

  • Quicksort Improvement (2019) — Designed an efficient modification to quicksort/introsort. By reusing the pivot sample to detect when input is already sorted, reverse-sorted, or all-equal, the algorithm finishes in O(n) with a single verification pass. Yields 5–30× speedups for sorted inputs with no regressions on adversarial cases.
  • Ad Inventory Overlapping Set Problem (2011–2012) — Designed a real-time ad inventory model to handle millions of buys with overlapping targeting criteria using inverted indices and bitmap intersections. Enabled accurate forecasting and scalable allocation at web scale.
  • Expected Frequency & Long Correlation (2009+) — An algorithm for search and recommendations leveraging entire user histories. By adjusting local co-occurrence with an expected vs. actual frequency correction, it systematically removes global popularity bias.
  • Click-Based Search & Recommendations (2003–2005) — A framework leveraging full user sessions (queries + clicks) to learn correlations, improving ranking, personalization, and localization.
  • Efficient Near-Shingling (2001) — An approach to near-duplicate detection that approximates shingling accuracy while reducing storage and compute costs. Documents are reduced to title + first + 10 longest sentence hashes, indexed in dual forward/inverted indices.
  • EzResult Search Engine (1998–1999) — A distributed search engine written entirely in C/C++/assembly, independently utilizing inverted indices, tries, and cosine similarity. Supported instant index updates and was acquired in 1999.
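
To make the quicksort idea above concrete, here is a minimal sketch in C. It is an illustration under stated assumptions, not the actual implementation: the five evenly spaced probes stand in for whatever pivot sample the real sort uses, `qsort` stands in for the introsort fallback, and the names `presorted_fast_path` and `sort_ints` are hypothetical. If the sample looks ordered, one verification pass confirms it; all-equal input is covered by the ascending check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Comparison callback for the qsort fallback. */
static int cmp_int(const void *x, const void *y) {
    int a = *(const int *)x, b = *(const int *)y;
    return (a > b) - (a < b);
}

/* Hypothetical helper: returns 1 if a[0..n) was verified as already sorted
   (left untouched) or reverse-sorted (reversed in place), else 0. */
static int presorted_fast_path(int *a, size_t n) {
    if (n < 2) return 1;
    /* Five evenly spaced probes, standing in for the pivot sample. */
    size_t last = n - 1;
    size_t idx[5] = {0, last / 4, last / 2, last - last / 4, last};
    int asc = 1, desc = 1;
    for (int i = 1; i < 5; i++) {
        if (a[idx[i - 1]] > a[idx[i]]) asc = 0;
        if (a[idx[i - 1]] < a[idx[i]]) desc = 0;
    }
    if (asc) {
        /* Single O(n) verification pass; bail out at the first inversion. */
        for (size_t i = 1; i < n; i++)
            if (a[i - 1] > a[i]) return 0;
        return 1;
    }
    if (desc) {
        /* Verify non-increasing order, then reverse in place. */
        for (size_t i = 1; i < n; i++)
            if (a[i - 1] < a[i]) return 0;
        for (size_t lo = 0, hi = last; lo < hi; lo++, hi--) {
            int t = a[lo]; a[lo] = a[hi]; a[hi] = t;
        }
        return 1;
    }
    return 0;
}

/* Try the fast path first; fall back to a general sort otherwise. */
void sort_ints(int *a, size_t n) {
    assert(a != NULL || n == 0);
    if (!presorted_fast_path(a, n))
        qsort(a, n, sizeof *a, cmp_int);
}
```

When the sample is not ordered, the check costs only a handful of comparisons before the fallback runs, which is consistent with the no-regression claim above.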


✦ The C Infrastructure Ecosystem

A modular, multi-tier dependency graph of C libraries built for out-of-core data processing, search, and system reliability. Designed strictly for performance, simplicity, and composability.

🧱 Foundation & Memory

  • a-memory-library Zero-overhead memory pools, auto-growing buffers, and debug-wrapped allocators.
  • the-macro-library Type-safe C macros for core algorithms (introsort, bsearch, red-black trees, heaps).
  • a-bitset-library Expandable bitset structures for setting, querying, and bitwise operations.
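
As a rough illustration of what an expandable bitset involves, here is a small self-contained sketch in C. It does not reflect the a-bitset-library API: the type `bitset_t` and every function name below are hypothetical, and the growth policy (grow to exactly the needed word count) is chosen only for brevity.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable bitset; zero-initialize with `bitset_t b = {0};`. */
typedef struct {
    uint64_t *words;
    size_t nwords;
} bitset_t;

/* Ensure the bitset can hold `bit`; newly added words are zeroed.
   Returns 0 on success, -1 on allocation failure. */
static int bitset_reserve(bitset_t *b, size_t bit) {
    assert(b != NULL);
    size_t need = bit / 64 + 1;
    if (need <= b->nwords) return 0;
    uint64_t *w = realloc(b->words, need * sizeof *w);
    if (!w) return -1;
    memset(w + b->nwords, 0, (need - b->nwords) * sizeof *w);
    b->words = w;
    b->nwords = need;
    return 0;
}

/* Set a bit, growing the storage if needed. */
static int bitset_set(bitset_t *b, size_t bit) {
    if (bitset_reserve(b, bit) != 0) return -1;
    b->words[bit / 64] |= (uint64_t)1 << (bit % 64);
    return 0;
}

/* Query a bit; bits beyond the allocated range read as 0. */
static int bitset_test(const bitset_t *b, size_t bit) {
    size_t w = bit / 64;
    return w < b->nwords && ((b->words[w] >> (bit % 64)) & 1);
}

/* dst &= src; words of dst past the end of src become zero. */
static void bitset_and(bitset_t *dst, const bitset_t *src) {
    for (size_t i = 0; i < dst->nwords; i++)
        dst->words[i] &= i < src->nwords ? src->words[i] : 0;
}

/* Release storage and reset to the empty state. */
static void bitset_free(bitset_t *b) {
    free(b->words);
    b->words = NULL;
    b->nwords = 0;
}
```

A production version would likely grow geometrically to amortize reallocation; the word-at-a-time AND is the same kind of operation a bitmap intersection relies on.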

⚙️ Distributed Processing & I/O

  • a-map-reduce-library Single-node, partitioned DAG execution engine for out-of-core data processing and pipelining.
  • the-io-library Record-oriented file processing with transparent compression, partitioning, and sort-merging.
  • the-lz4-library Fast LZ4 compression and decompression primitives.

🕸️ Networking & Security

  • a-curl-library Async event-loop wrapper over libcurl with rate-limiting, backoffs, and dependency scheduling.
  • a-curl-openai-plugin Builder API on the curl event loop for handling OpenAI streams and structured outputs.
  • an-encryption-library Secure key generation and in-place AES-GCM encryption/decryption.

🗂️ Parsing & Serialization

🔎 NLP, Search & ML



✦ Fleet Management & Orchestration

Tools designed to tame the complexity of multi-repo ecosystems through declarative configuration and GitOps automation.

  • scaffold-repo — A declarative fleet manager and build orchestrator. Resolves dynamic dependency graphs, enforces OSS license compliance, and automates Git branching/releases across dozens of interconnected micro-repos via a single Python CLI.
  • scaffold-templates — The centralized Template Registry powering scaffold-repo. Defines language stacks (C/CMake, Python), organizational profiles, and dynamic Jinja2 file routing to prevent vendor lock-in. Clone this to define your own fleet!


✦ Activity & Statistics

contactandyc's GitHub chart

GitHub Stats

Pinned

  1. a-map-reduce-library (Public)

    A library for orchestrating a map reduce workload on a machine

    C

  2. scaffold-repo (Public)

    A repo to scaffold other repos (handles licenses, builds, clones, makefiles, boilerplate stuff)

    Python

  3. a-memory-library (Public)

    A library for handling allocation

    C 3

  4. a-json-library (Public)

    A very fast json library

    C 3

  5. search-index-library (Public)

    Forked from knode-ai-open-source/search-index-library

    A library for indexing and finding data like a search engine

    C

  6. sql-parser-library (Public)

    Forked from knode-ai-open-source/sql-parser-library

    A library for matching data structures to SQL

    C