# a100

Here are 31 public repositories matching this topic...

ClusterOps is an enterprise-grade Python library developed and maintained by the Swarms Team to help you manage and execute agents on specific CPUs and GPUs across clusters. This tool enables advanced CPU and GPU selection, dynamic task allocation, and resource monitoring, making it ideal for high-performance distributed computing environments.

  • Updated Oct 13, 2025
  • Python
Prolepsis

Prolepsis is a speculative decoding implementation that accelerates LLM inference by 1.30x on an A100. By pairing a small draft model (Qwen 1.7B) with a larger target (Qwen 8B), it shifts generation workloads into a parallel verification pass. A rigorous rejection sampling pipeline guarantees the output distribution is preserved.

  • Updated Mar 26, 2026
  • Python
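
The rejection sampling step mentioned in the Prolepsis description can be illustrated with a minimal sketch of the standard speculative sampling acceptance rule. This is not taken from the Prolepsis repository: the function names `residual_distribution` and `speculative_step` are hypothetical, and the distributions are toy lists over a three-token vocabulary rather than real model outputs. A drafted token x is accepted with probability min(1, p(x)/q(x)); on rejection, a replacement is drawn from the normalized positive part of p - q, which is what keeps the output distributed according to the target p.

```python
import random


def residual_distribution(p, q):
    """Return normalized max(0, p - q), the distribution used after a rejection."""
    diff = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
    total = sum(diff)
    if total == 0.0:
        # p and q are identical, so a rejection cannot occur; return p as a safe default.
        return list(p)
    return [d / total for d in diff]


def speculative_step(p, q, rng):
    """Draft one token from q, then accept or resample so the result follows p.

    p: target-model probabilities over the vocabulary (toy values here).
    q: draft-model probabilities over the same vocabulary.
    Returns (token_index, accepted).
    """
    vocab = range(len(q))
    x = rng.choices(vocab, weights=q, k=1)[0]
    # Accept the drafted token with probability min(1, p(x) / q(x)).
    if rng.random() < min(1.0, p[x] / q[x]):
        return x, True
    # Otherwise resample from the residual distribution.
    r = residual_distribution(p, q)
    return rng.choices(vocab, weights=r, k=1)[0], False


if __name__ == "__main__":
    rng = random.Random(0)
    target = [0.5, 0.3, 0.2]  # hypothetical target distribution
    draft = [0.2, 0.5, 0.3]   # hypothetical draft distribution
    print(speculative_step(target, draft, rng))
```

In a real system the draft model proposes several tokens at once and the target model scores them in a single parallel pass; the rule above is then applied position by position until the first rejection.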

An implementation of Speculative RAG exploring latency-quality trade-offs in multi-draft retrieval. Features batched parallel drafting via vLLM and log-probability verifier selection for fast, high-quality QA on a single A100 GPU.

  • Updated Mar 20, 2026
  • Python
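
As a rough illustration of log-probability verifier selection, the sketch below picks one answer draft out of several by the verifier's average token log-probability. The repository's actual scoring rule is not given in its description, so the length-normalized mean used here is an assumption, and `select_draft` and the sample drafts are hypothetical.

```python
def select_draft(drafts):
    """Pick the draft with the highest mean token log-probability.

    drafts: list of (text, token_logprobs) pairs, where token_logprobs holds the
    verifier model's log-probability for each token of the draft.
    Length normalization (the mean) is an assumption; it avoids favoring short drafts.
    """
    def mean_logprob(item):
        _, logprobs = item
        return sum(logprobs) / len(logprobs)

    best_text, _ = max(drafts, key=mean_logprob)
    return best_text


if __name__ == "__main__":
    # Hypothetical drafts with made-up verifier log-probabilities.
    candidates = [
        ("draft A", [-0.1, -0.2]),
        ("draft B", [-1.0]),
        ("draft C", [-0.05, -0.05, -0.6]),
    ]
    print(select_draft(candidates))
```

In a batched setting the per-token log-probabilities would come from one scoring pass of the verifier over all drafts; only the selection step is shown here.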
