Published on: September 10, 2025
9 min read
Optimize git clone operations in any environment — up to 93% reduction in clone times and 98% reduction in disk space usage — with the Git Much Faster script.

Picture this: You're working on the Chromium project and you need to clone the repository. You run git clone, grab a coffee, check your email, maybe take a lunch break, and 95 minutes later, you finally have your working directory. This is the reality for developers working with large repositories containing 50GB+ of data.
The productivity impact is staggering. CI/CD pipelines grind to a halt waiting for repository clones. Infrastructure costs skyrocket as compute resources sit idle. Developer frustration mounts as context-switching becomes the norm.
But what if that 95-minute wait could be reduced to just 6 minutes? What if you could achieve a 93% reduction in clone times using proven techniques?
Enter Git Much Faster — a comprehensive benchmarking and optimization script that transforms how you work with large Git repositories. Built from real-world experience optimizing embedded development workflows, this script provides practical strategies delivering measurable performance improvements across standard git clones, optimized configurations, and Git's built-in Scalar tool.
You'll discover how to dramatically reduce git clone times using optimization strategies, explore real-world performance benchmarks from major repositories like the Linux kernel and Chromium, and understand how to implement these optimizations safely in both development and CI/CD environments.
Git Much Faster is a script I wrote as an enablement tool to allow you to benchmark multiple clone optimization approaches on the same client — whether that is a traditional developer workstation, CI, cloud-hosted development environments or specialized clones for GitOps. It also contains the curated configuration settings for the fastest clone optimization. You can use these settings as a starting point and adapt or remove configurations that create too lean of a clone for your client's intended use of the repository clone.
Git Much Faster addresses a fundamental challenge: Git's default clone behavior prioritizes safety over speed. While this works for small repositories, it becomes a significant bottleneck with large codebases, extensive binary assets, or complex monorepo structures.
The problem manifests across increasingly common scenarios. Embedded development teams inherit repositories filled with legacy firmware binaries, bootloaders, and vendor SDKs stored directly in version control. Web applications accumulate years of marketing assets and design files. Game development projects contain massive 3D models and audio files growing repository sizes into tens of gigabytes.
Enterprise CI/CD pipelines suffer particularly acute pain. Each job requires a fresh repository clone, and when operations take 20 to 90 minutes, entire development workflows grind to a halt. Infrastructure costs multiply as compute resources remain idle during lengthy clone operations.
Git Much Faster solves this through comprehensive benchmarking comparing four distinct strategies: standard git clone (baseline with full history), optimized git clone (custom configurations with compression disabled and sparse checkout), Git's Scalar clone (integrated partial cloning), and current directory assessment (analyzing existing repositories without re-cloning).
The tool provides measurable, repeatable benchmarking in controlled AWS environments, eliminating variables that make performance testing unreliable. The real power of Git Much Faster is to run all the benchmarks in whatever your target environment looks like — so if slow network connections are a reality for some developers, you can decipher the best clone optimization for their situation.
Understanding Git Much Faster's effectiveness requires examining specific configurations that address Git's performance bottlenecks through a layered approach tackling network transfer efficiency, CPU utilization, and storage patterns.
The most significant gains come from two key optimizations. The first, core.compression=0, eliminates CPU-intensive compression during network operations. CPU cycles spent compressing often exceed bandwidth savings on modern high-speed networks. This optimization alone reduces clone times by 40% to 60%.
The second major optimization, http.postBuffer=1024M, addresses Git's conservative HTTP buffer sizing. Large repositories benefit tremendously from increased buffer sizes, allowing Git to handle larger operations without breaking them into multiple requests, reducing protocol overhead.
Git Much Faster leverages shallow clones using --depth=1 (fetching only the latest commit) and partial clones with --filter=blob:none (deferring file content downloads until checkout). Shallow clones reduce data by 70%-90% for mature repositories, while partial clones prove particularly effective for repositories with large binary assets.
Sparse checkout provides surgical precision in controlling checked-out files. Git Much Faster implements comprehensive exclusion covering 30+ binary file types — images, documents, archives, media files, and executables — reducing working directory size by up to 78% while maintaining full source code access.
Git's Scalar tool, integrated into Git since Version 2.38, combines partial clone, sparse checkout, and background maintenance. However, benchmarking reveals Scalar doesn't implement the aggressive compression and buffer optimizations providing the most significant performance gains. Testing shows the custom optimized approach typically outperforms Scalar by 48%-67% while achieving similar disk space savings.
An interesting thing about optimizing the clone operation is that it also reduces complete system loading because you are reducing the size of your request. GitLab has a specialized, horizontal scaling layer known as Gitaly Cluster. When full history clones and large monorepos are the norm, the sizing of Gitaly Cluster is driven higher. This is because all git clone requests are serviced by a server-side binary to create “pack files” to be sent over the wire. Since these server-side git operations involve running compression utilities, it drives all three of memory, CPU, and I/O requirements at once.
When git clone operations are optimized to reduce the size of the total content ask, it reduces load on the end-to-end stack: Client, Network, Gitaly Service and Storage. All layers speed up and become cheaper at the same time.
Git Much Faster's effectiveness is demonstrated through rigorous benchmarking across diverse, real-world repositories using consistent AWS infrastructure with Arm instances and controlled network conditions.
Linux kernel repository (7.5GB total): Standard clone took 6 minutes 29 seconds. Optimized clone achieved 46.28 seconds — an 88.1% improvement, reducing the .git directory from 5.9GB to 284MB. Scalar took 2 minutes 21 seconds (63.7% improvement), completing 67.3% slower than the optimized approach.
Chromium repository (60.9GB total): Standard clone required 95 minutes 12 seconds. Optimized clone achieved 6 minutes 41 seconds — a dramatic 93% improvement, compressing the .git directory from 55.7GB to 850MB. Scalar took 13 minutes 3 seconds (86.3% improvement) but remained 48.8% slower than the optimized approach.
GitLab website repository (8.9GB total): Standard clone took 6 minutes 23 seconds. Optimized clone achieved 6.49 seconds — a remarkable 98.3% improvement, reducing the .git directory to 37MB. Scalar took 33.60 seconds (91.2% improvement) while remaining 80.7% slower.
The benchmarking reveals clear patterns: Larger repositories show more dramatic improvements, binary-heavy repositories benefit most from sparse checkout filtering, and the custom optimization approach consistently outperforms both standard Git and Scalar across all repository types.
Implementation requires understanding when to apply each technique based on use case and risk tolerance. For development requiring full repository access, use standard Git cloning. For read-heavy workflows needing rapid access to current code, deploy optimized cloning. For CI/CD pipelines where speed is paramount, optimized cloning provides maximum benefit.
Getting started requires only simple download and execution:
curl -L https://gitlab.com/gitlab-accelerates-embedded/misc/git-much-faster/-/raw/master/git-much-faster.sh -o ./git-much-faster.sh
# For benchmarking
bash ./git-much-faster.sh --methods=optimized,standard --repo=https://github.com/your-org/your-repo.git
For production-grade testing, Git Much Faster project includes complete Terraform infrastructure for AWS deployment, eliminating variables that skew local testing results.
Optimized clones require careful consideration of limitations. Shallow clones prevent access to historical commits, limiting operations like git log across file history. For teams adopting optimizations it is best to create specific optimizations for high volume usage. For instance, developers can perform an optimized clone, and if and when needed, convert to full clones when needed via git fetch --unshallow. If a given CI job accesses commit history (e.g. using GitVersion), then you may need the full history, but not a checkout.
Embedded development presents unique challenges where projects historically stored compiled firmware and hardware design files directly in version control. These repositories often contain FPGA bitstreams, PCB layouts, and vendor SDK distributions ballooning sizes into tens of gigabytes. Build processes frequently require cloning dozens of external repositories, multiplying performance impact.
Enterprise monorepos encounter Git performance challenges as repositories grow encompassing multiple projects and accumulated historical data. Media and asset-heavy projects compound challenges, as mentioned above — web applications accumulate marketing assets over years, while game development faces severe challenges with 3D models and audio files pushing repositories beyond 100GB. More use cases can be found in the project.
CI/CD pipelines represent the most impactful application. Each container-based CI job requires a fresh repository clone, and when operations consume 20 to 90 minutes, entire development workflows become unviable.
Geographically spread out development teams may have team members whose network performance to their primary development workstation is extremely limited or varies dramatically. Optimizing the Git clone can help by reducing over the wire sizes dramatically.
Git clone optimization represents a transformative opportunity delivering measurable improvements — up to 93% reduction in clone times and 98% reduction in disk space usage — that fundamentally change how teams interact with codebases.
The key insight is that Git's default conservative approach leaves substantial performance opportunities untapped. By understanding specific bottlenecks — network transfer inefficiency, CPU-intensive compression, unnecessary data downloads — teams can implement targeted optimizations delivering transformative results.
Ready to revolutionize your Git workflows?
Read the docs in the Git Much Faster repository and get started running benchmarks against your largest repositories. Begin with read-only optimization in CI/CD pipelines where benefits are immediate and risks minimal. As your team gains confidence, gradually expand optimization to development workflows based on measured results.
The future of Git performance optimization continues evolving, but fundamental principles — eliminating unnecessary work, optimizing for actual bottlenecks, measuring results rigorously — remain valuable regardless of future tooling evolution. Teams mastering these concepts today position themselves to leverage whatever improvements tomorrow's Git ecosystem provides.