<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Find on Clément Joly – Open-Source, Rust &amp; SQLite</title><link>https://joly.pw/tags/find/</link><description>Recent content in Find on Clément Joly – Open-Source, Rust &amp; SQLite</description><image><title>Clément Joly – Open-Source, Rust &amp; SQLite</title><url>https://joly.pw/images/open-graph-home-original.png</url><link>https://joly.pw/images/open-graph-home-original.png</link></image><generator>Hugo -- 0.154.3</generator><language>en</language><copyright>Clément Joly</copyright><lastBuildDate>Wed, 11 Mar 2026 03:32:38 +0000</lastBuildDate><atom:link href="https://joly.pw/tags/find/index.xml" rel="self" type="application/rss+xml"/><item><title>Git ls-files is Faster Than Fd and Find</title><link>https://joly.pw/blog/git-ls-files-is-faster-than-fd-and-find/</link><pubDate>Thu, 04 Nov 2021 06:06:21 +0000</pubDate><guid>https://joly.pw/blog/git-ls-files-is-faster-than-fd-and-find/</guid><description>Git ls-files is up to 5 times faster than fd or find in this benchmark, but why?</description><content:encoded><![CDATA[



  
  
  
  

  <div class="alert alert-tldr">
    <p class="alert-heading">
      ⚡
      
        TL;DR
      
    </p>
    <p>In the <a href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/">Linux Git repository</a>:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-fish" data-lang="fish"><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">hyperfine</span> <span style="color:#e06c75">--export-markdown</span> /tmp/tldr.md <span style="color:#e06c75">--warmup</span> <span style="color:#d19a66">10</span> <span style="color:#98c379">&#39;git ls-files&#39;</span> <span style="color:#98c379">&#39;find&#39;</span> <span style="color:#98c379">&#39;fd --no-ignore&#39;</span>
</span></span></code></pre></div><table>
  <thead>
      <tr>
          <th style="text-align: left">Command</th>
          <th style="text-align: right">Mean [ms]</th>
          <th style="text-align: right">Min [ms]</th>
          <th style="text-align: right">Max [ms]</th>
          <th style="text-align: right">Relative</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><code>git ls-files</code></td>
          <td style="text-align: right">16.9 ± 0.5</td>
          <td style="text-align: right">16.3</td>
          <td style="text-align: right">18.2</td>
          <td style="text-align: right">1.00</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>find</code></td>
          <td style="text-align: right">93.1 ± 0.7</td>
          <td style="text-align: right">92.4</td>
          <td style="text-align: right">95.7</td>
          <td style="text-align: right">5.52 ± 0.16</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore</code></td>
          <td style="text-align: right">85.8 ± 7.5</td>
          <td style="text-align: right">81.1</td>
          <td style="text-align: right">111.3</td>
          <td style="text-align: right">5.08 ± 0.47</td>
      </tr>
  </tbody>
</table>
<p><code>git ls-files</code> is more than <em>5 times faster</em> than both <code>fd --no-ignore</code> and <code>find</code>!</p>
  </div>



<h2 id="introduction">Introduction</h2>
<p>In my <a href="https://joly.pw/blog/my-setup/nvim-0-5/">editor</a>, I changed my mapping to open files from <code>fd</code><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> to <code>git ls-files</code><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>, and it felt faster after the change. That’s intriguing, given <code>fd</code>’s goal to be <a href="https://github.com/sharkdp/fd#benchmark">very fast</a>. Git, on the other hand, is primarily a source code management (SCM) system: its main business<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup> is not to help you list your files! Let’s run some benchmarks to make sure.</p>
<h2 id="benchmarks">Benchmarks</h2>
<p>Is <code>git ls-files</code> actually faster than <code>fd</code> or is that just an illusion? In our benchmark, we will use:</p>
<ul>
<li><code>fd</code> 8.2.1</li>
<li><code>git</code> 2.33.0</li>
<li><code>find</code> 4.8.0</li>
<li><code>hyperfine</code> 1.11.0</li>
</ul>
<p>We run the benchmarks with the disk cache filled; we are not measuring the cold-cache case. That’s because in your editor, you may run these commands multiple times and would benefit from the cache. The results are similar for an in-memory repo, which confirms that the cache is filled.</p>
<p>Also, you work on those files, so they should be in cache to a degree. We also make sure to be on a quiet PC, with CPU power-saving deactivated. Furthermore, the CPU has 8 cores with hyper-threading, so <code>fd</code> uses 8 threads. Last but not least, unless otherwise noted, the repo contains only committed files: for instance, no build artifacts are present.</p>
<h3 id="a-test-git-repository">A Test Git Repository</h3>
<p>We first need a Git repository. I’ve chosen to clone<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup> the <a href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/">Linux kernel repo</a> because it is a fairly big one and a <a href="https://github.blog/2020-12-22-git-clone-a-data-driven-study-on-cloning-behaviors/">reference</a> for Git performance measurements. This is important to ensure searches take a non-trivial amount of time: as hyperfine rightfully points out, short run times (less than 5 ms) are more difficult to accurately compare.</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-fish" data-lang="fish"><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">git</span> clone <span style="color:#e06c75">--depth</span> <span style="color:#d19a66">1</span> <span style="color:#e06c75">--recursive</span> ssh://git@github.com/torvalds/linux.git ~/ghq/github.com/torvalds/linux
</span></span><span style="display:flex;"><span><span style="color:#c678dd">cd</span> ~/ghq/github.com/torvalds/linux
</span></span></code></pre></div><h4 id="choosing-the-commands">Choosing the commands</h4>
<p>We want to evaluate <code>git ls-files</code> versus <code>fd</code> and <code>find</code>. However, getting exactly the same list of files is not a trivial task:</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Command</th>
          <th style="text-align: right">Output lines</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><code>git ls-files</code></td>
          <td style="text-align: right">72219</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>find</code></td>
          <td style="text-align: right">77039</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore</code></td>
          <td style="text-align: right">76705</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore --hidden</code></td>
          <td style="text-align: right">77038</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd</code></td>
          <td style="text-align: right">72363</td>
      </tr>
  </tbody>
</table>
<p>After some more tries, it turns out that this command gives exactly<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup> the same output as <code>git ls-files</code>:</p>
<pre tabindex="0"><code>fd --no-ignore --hidden --exclude .git --type file --type symlink
</code></pre><p>It is a fairly complicated command, with various criteria on the files to print, which could translate into an unfair advantage for <code>git ls-files</code>. Consequently, we will also use the simpler examples in the table above.</p>
<h3 id="hyperfine">Hyperfine</h3>
<p><a href="https://github.com/sharkdp/hyperfine">Hyperfine</a> is a great tool to compare various commands: it has colored and Markdown output, attempts to detect outliers, tunes the number of runs… Here is an <a href="https://asciinema.org/">asciinema</a><sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup> recording showing its output<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup>:</p>
<div id="demo3"></div>
<script>
AsciinemaPlayer.create("/blog/git-ls-files-is-faster-than-fd-and-find/hyperfine.json", document.getElementById('demo3'), {
"cols": "103","loop": "true","preload":  1 ,"rows": "32","speed": "4",
});
</script>
<noscript><blockquote><p>To run this asciicast without javascript, use <code>asciinema play https://joly.pw/blog/git-ls-files-is-faster-than-fd-and-find/hyperfine.json</code> with <a href="https://asciinema.org/">Asciinema</a></p></blockquote></noscript>

<h3 id="first-results">First Results</h3>
<p>For our first benchmark, on an SSD with <a href="https://en.wikipedia.org/wiki/Btrfs">btrfs</a>, with commit <a href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ad347abe4a9876b1f65f408ab467137e88f77eb4"><code>ad347abe4a…</code></a> checked out, we run:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-fish" data-lang="fish"><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">hyperfine</span> <span style="color:#e06c75">--export-markdown</span> /tmp/<span style="color:#d19a66">1</span>.md <span style="color:#e06c75">--warmup</span> <span style="color:#d19a66">10</span> <span style="color:#98c379">&#39;git ls-files&#39;</span> <span style="color:#98c379">\
</span></span></span><span style="display:flex;"><span>    <span style="color:#98c379">&#39;find&#39;</span> <span style="color:#98c379">&#39;fd --no-ignore&#39;</span> <span style="color:#98c379">&#39;fd --no-ignore --hidden&#39;</span> <span style="color:#98c379">&#39;fd&#39;</span> <span style="color:#98c379">\
</span></span></span><span style="display:flex;"><span>    <span style="color:#98c379">&#39;fd --no-ignore --hidden --exclude .git --type file --type symlink&#39;</span>
</span></span></code></pre></div><p>This yields the following results:</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Command</th>
          <th style="text-align: right">Mean [ms]</th>
          <th style="text-align: right">Min [ms]</th>
          <th style="text-align: right">Max [ms]</th>
          <th style="text-align: right">Relative</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><code>git ls-files</code></td>
          <td style="text-align: right">16.9 ± 0.6</td>
          <td style="text-align: right">16.3</td>
          <td style="text-align: right">19.2</td>
          <td style="text-align: right">1.00</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>find</code></td>
          <td style="text-align: right">93.2 ± 0.5</td>
          <td style="text-align: right">92.5</td>
          <td style="text-align: right">94.8</td>
          <td style="text-align: right">5.50 ± 0.19</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore</code></td>
          <td style="text-align: right">86.6 ± 7.8</td>
          <td style="text-align: right">80.5</td>
          <td style="text-align: right">115.7</td>
          <td style="text-align: right">5.11 ± 0.49</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore --hidden</code></td>
          <td style="text-align: right">121.0 ± 6.2</td>
          <td style="text-align: right">112.3</td>
          <td style="text-align: right">132.3</td>
          <td style="text-align: right">7.14 ± 0.44</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd</code></td>
          <td style="text-align: right">231.6 ± 22.3</td>
          <td style="text-align: right">200.8</td>
          <td style="text-align: right">272.5</td>
          <td style="text-align: right">13.68 ± 1.40</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore --hidden --exclude .git --type file --type symlink</code></td>
          <td style="text-align: right">80.9 ± 5.0</td>
          <td style="text-align: right">77.5</td>
          <td style="text-align: right">95.3</td>
          <td style="text-align: right">4.78 ± 0.34</td>
      </tr>
  </tbody>
</table>
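<p>As mentioned in the TL;DR, <code>git ls-files</code> is more than 5 times faster than <code>find</code> and <code>fd --no-ignore</code>. Even its closest competitor, the <code>fd</code> command tuned to print the same file list, is almost 5 times slower (4.78 ± 0.34). Let’s find out why that is.</p>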
<h2 id="how-does-git-store-files-in-a-repository">How Does Git Store Files in a Repository</h2>
<p>To try to understand where this performance advantage of <code>git ls-files</code> comes from, let’s look into how files are stored in a repository. This is a quick overview; you can find more details about Git’s storage internals in <a href="https://git-scm.com/book/en/v2/Git-Internals-Git-Objects">this section of the Pro Git book</a>.</p>
<h3 id="git-objects">Git Objects</h3>
<p>Git builds its own internal representation of the file system tree in the repository:</p>
<figure>
    <img loading="lazy" src="./data-model-2.png"
         alt="Internal Git representation of the file system tree" width="800" height="593"/> <figcaption>
            Internal Git representation of the file system tree<p>From the Pro Git book, written by Scott Chacon and Ben Straub and published by Apress, licensed under the <a href="https://creativecommons.org/licenses/by-nc-sa/3.0/">Creative Commons Attribution Non Commercial Share Alike 3.0</a> license, copyright 2021.</p>
        </figcaption>
</figure>

<p>In the figure above, each tree object contains a list of file or folder names and references to the corresponding objects (among other things). Each object is then stored under its hash in the <code>.git</code> folder, like so:</p>
<pre tabindex="0"><code>.git/objects
├── 65
│  └── 107a3367b67e7a50788f575f73f70a1e61c1df
├── e6
│  └── 9de29bb2d1d6434b8b29ae775ad8c2e48c5391
├── f0
│  └── f1a67ce36d6d87e09ea711c62e88b135b60411
├── info
└── pack
</code></pre><p>As a result, to list the content of a folder, it seems Git has to access the corresponding tree object, stored in a file whose parent folder is named after the first two characters of the hash. But doing that for the currently checked-out files all the time would be slow, especially for frequently used commands like <code>git status</code>. Fortunately, Git also maintains an <em>index</em> for files in the current working directory.</p>
<h3 id="git-index">Git Index</h3>
<p>This <a href="https://git-scm.com/docs/index-format">index</a> lists (among other things) each tracked file in the repository, along with file-system metadata such as its last modification time. More details and examples are provided <a href="https://medium.com/hackernoon/understanding-git-index-4821a0765cf">here</a>.</p>
<p>So, it seems that the index has everything <code>ls-files</code> requires. Let’s check that <code>ls-files</code> actually uses it.</p>
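<p>To get a feel for what the index holds, <code>git ls-files</code> itself can print the index entries. This is only a sketch, assuming it is run at the root of the Linux working tree (where a <code>README</code> file is tracked); <code>--stage</code> and <code>--debug</code> are documented <code>git ls-files</code> options, and the commands are plain enough to run in fish as well as in a POSIX shell:</p>
<pre tabindex="0"><code class="language-fish" data-lang="fish"># Mode, object hash, stage number and path of the first three index entries.
git ls-files --stage | head -n 3
# File-system metadata cached in the index (timestamps, inode, size…) for one tracked file.
git ls-files --debug -- README
</code></pre>
<p>Neither command needs to look at the file content: everything printed comes from the index entries.</p>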
<h2 id="strace">Strace</h2>
<p>Let’s verify that <code>ls-files</code> uses only the index, without scanning many files in the repo or the <code>.git</code> folder. That would explain its performance advantage, as reading a single file is cheaper than traversing many folders. To this end, we’ll use <code>strace</code><sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup> like so:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-fish" data-lang="fish"><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">strace</span> <span style="color:#e06c75">-e</span> !write git ls-files<span style="color:#56b6c2">&gt;</span>/dev/null <span style="color:#d19a66">2</span><span style="color:#56b6c2">&gt;</span>/tmp/a
</span></span></code></pre></div><p>It turns out the <a href="https://git-scm.com/docs/index-format"><code>.git/index</code></a> is read:</p>
<pre tabindex="0"><code>openat(AT_FDCWD, &#34;.git/index&#34;, O_RDONLY) = 3
</code></pre><p>And we are not reading objects in the <code>.git</code> folder or files in the repository. A quick check of Git’s source code <a href="https://github.com/git/git/blob/33be431c0c7284c1adf0fe49f7838dbc8aee6ea9/builtin/ls-files.c#L761">confirms</a> this. We now have an explanation for the speed <code>git ls-files</code> displays in our benchmarks!</p>
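<p>Another way to cross-check this is to compare the entry count stored in the index header with the number of lines printed by <code>git ls-files</code>. According to the index format documentation, the header is a 4-byte <code>DIRC</code> signature, a 4-byte version number and a 32-bit entry count, all in network byte order. The sketch below assumes GNU <code>od</code> (for <code>--endian</code>), a shell sitting at the root of the repository, and an index without unmerged entries, split index or sparse index; otherwise the two numbers can legitimately differ:</p>
<pre tabindex="0"><code class="language-fish" data-lang="fish"># Entry count from the index header: a 32-bit big-endian integer at byte offset 8.
od -An -tu4 --endian=big -j8 -N4 .git/index
# Number of paths printed by git ls-files.
git ls-files | wc -l
</code></pre>
<p>Under those assumptions, both commands should print the same number.</p>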
<h2 id="other-scenarios">Other Scenarios</h2>
<p>However, listing files in a fully committed repository is not the most common case when you work on your code: as you make changes, a larger portion of the files are changed or added. How does <code>git ls-files</code> compare in these other scenarios?</p>
<h3 id="with-changes">With Changes</h3>
<p>When there are changes to some files, we shouldn’t see any significant performance difference: the index is still usable directly to get the names of the files in the repository, and we don’t really care whether their content changed.</p>
<p>To check this, let’s change all the C files in the kernel sources (using some <a href="https://fishshell.com/">fish</a> shell scripting):</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-fish" data-lang="fish"><span style="display:flex;"><span><span style="color:#c678dd">for</span> <span style="color:#e06c75">f</span> <span style="color:#c678dd">in</span> <span style="color:#56b6c2">(</span><span style="color:#61afef;font-weight:bold">fd</span> <span style="color:#e06c75">-e</span> c<span style="color:#56b6c2">)</span>
</span></span><span style="display:flex;"><span>  <span style="color:#c678dd">echo</span> <span style="color:#d19a66">1</span> <span style="color:#56b6c2">&gt;&gt;</span> <span style="color:#e06c75">$f</span>
</span></span><span style="display:flex;"><span><span style="color:#c678dd">end</span>
</span></span></code></pre></div><div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-fish" data-lang="fish"><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">git</span> <span style="color:#e5c07b">status</span> <span style="color:#56b6c2">|</span> <span style="color:#61afef;font-weight:bold">wc</span> <span style="color:#e06c75">-l</span>
</span></span><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">28350</span>
</span></span></code></pre></div><div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-fish" data-lang="fish"><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">hyperfine</span> <span style="color:#e06c75">--export-markdown</span> /tmp/<span style="color:#d19a66">2</span>.md <span style="color:#e06c75">--warmup</span> <span style="color:#d19a66">10</span> <span style="color:#98c379">&#39;git ls-files&#39;</span> <span style="color:#98c379">&#39;find&#39;</span> <span style="color:#98c379">&#39;fd --no-ignore&#39;</span> <span style="color:#98c379">\
</span></span></span><span style="display:flex;"><span>  <span style="color:#98c379">&#39;fd --no-ignore --hidden --exclude .git --type file --type symlink&#39;</span>
</span></span></code></pre></div><table>
  <thead>
      <tr>
          <th style="text-align: left">Command</th>
          <th style="text-align: right">Mean [ms]</th>
          <th style="text-align: right">Min [ms]</th>
          <th style="text-align: right">Max [ms]</th>
          <th style="text-align: right">Relative</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><code>git ls-files</code></td>
          <td style="text-align: right">16.8 ± 0.5</td>
          <td style="text-align: right">16.3</td>
          <td style="text-align: right">18.9</td>
          <td style="text-align: right">1.00</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>find</code></td>
          <td style="text-align: right">93.5 ± 0.7</td>
          <td style="text-align: right">92.7</td>
          <td style="text-align: right">95.5</td>
          <td style="text-align: right">5.55 ± 0.17</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore</code></td>
          <td style="text-align: right">86.1 ± 7.3</td>
          <td style="text-align: right">80.9</td>
          <td style="text-align: right">112.6</td>
          <td style="text-align: right">5.12 ± 0.46</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore --hidden --exclude .git --type file --type symlink</code></td>
          <td style="text-align: right">80.8 ± 6.6</td>
          <td style="text-align: right">77.8</td>
          <td style="text-align: right">115.0</td>
          <td style="text-align: right">4.80 ± 0.42</td>
      </tr>
  </tbody>
</table>
<p>We see the same numbers as before and it is again consistent with <a href="https://github.com/git/git/blob/33be431c0c7284c1adf0fe49f7838dbc8aee6ea9/builtin/ls-files.c#L761">ls-files source code</a>.</p>
<p>Run <code>git checkout -f @</code> after this to remove the changes made to the files.</p>
<h3 id="with-new-files-and--o">With New Files and <code>-o</code></h3>
<p>With files that are not committed yet, there are two subcases:</p>
<ul>
<li>files were created and added (with <code>git add</code>): then the files are in the index and reading the index is enough for <code>ls-files</code>, like above,</li>
<li>files were created but not added: these files are not present in the index, but without the <code>-o</code> flag, <code>ls-files</code> won’t output them either, so it can still use the index, as before.</li>
</ul>
<p>So the only case that needs further investigation is the use of <code>-o</code>. Since we don’t have baseline results yet for <code>-o</code>, let’s first see how it compares without any unadded new files.</p>
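<p>The flags involved can be compared side by side. This is a sketch, assuming it is run at the root of a Git working tree; <code>-c</code> (<code>--cached</code>), <code>-o</code> (<code>--others</code>) and <code>--exclude-standard</code> are documented <code>git ls-files</code> options, but the third command is not one of the commands benchmarked in this post:</p>
<pre tabindex="0"><code class="language-fish" data-lang="fish"># Tracked files only: the default, which is what the index lists.
git ls-files | wc -l
# Untracked files only (ignored files included, since no exclude rules are given).
git ls-files -o | wc -l
# Tracked and untracked files together, skipping anything matched by .gitignore.
git ls-files -c -o --exclude-standard | wc -l
</code></pre>
<p>Only the first command can be answered from the index alone; the other two have to find files the index does not know about.</p>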
<h4 id="without-any-unadded-new-files-baseline">Without any Unadded New Files (Baseline)</h4>
<p>When we haven’t added any new files in the repository:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-fish" data-lang="fish"><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">hyperfine</span> <span style="color:#e06c75">--export-markdown</span> /tmp/<span style="color:#d19a66">3</span>.md <span style="color:#e06c75">--warmup</span> <span style="color:#d19a66">10</span> <span style="color:#98c379">&#39;git ls-files&#39;</span> <span style="color:#98c379">&#39;git ls-files -o&#39;</span> <span style="color:#98c379">&#39;find&#39;</span> <span style="color:#98c379">\
</span></span></span><span style="display:flex;"><span>  <span style="color:#98c379">&#39;fd --no-ignore&#39;</span> <span style="color:#98c379">&#39;fd --no-ignore --hidden --exclude .git --type file --type symlink&#39;</span>
</span></span></code></pre></div><table>
  <thead>
      <tr>
          <th style="text-align: left">Command</th>
          <th style="text-align: right">Mean [ms]</th>
          <th style="text-align: right">Min [ms]</th>
          <th style="text-align: right">Max [ms]</th>
          <th style="text-align: right">Relative</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><code>git ls-files</code></td>
          <td style="text-align: right">16.7 ± 0.5</td>
          <td style="text-align: right">16.1</td>
          <td style="text-align: right">17.9</td>
          <td style="text-align: right">1.00</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>git ls-files -o</code></td>
          <td style="text-align: right">69.1 ± 0.7</td>
          <td style="text-align: right">67.8</td>
          <td style="text-align: right">70.8</td>
          <td style="text-align: right">4.12 ± 0.12</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>find</code></td>
          <td style="text-align: right">94.3 ± 0.5</td>
          <td style="text-align: right">93.4</td>
          <td style="text-align: right">95.3</td>
          <td style="text-align: right">5.63 ± 0.16</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore</code></td>
          <td style="text-align: right">86.6 ± 7.0</td>
          <td style="text-align: right">80.8</td>
          <td style="text-align: right">106.0</td>
          <td style="text-align: right">5.17 ± 0.44</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore --hidden --exclude .git --type file --type symlink</code></td>
          <td style="text-align: right">80.8 ± 7.4</td>
          <td style="text-align: right">77.9</td>
          <td style="text-align: right">118.0</td>
          <td style="text-align: right">4.82 ± 0.46</td>
      </tr>
  </tbody>
</table>
<p><code>git ls-files -o</code> takes about 4 times longer than plain <code>git ls-files</code>, which suggests that it performs more work than “just” reading the index. With <code>strace</code>, we see lines like:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-fish" data-lang="fish"><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">strace</span> <span style="color:#e06c75">-e</span> !write git ls-files <span style="color:#e06c75">-o</span><span style="color:#56b6c2">&gt;</span>/dev/null <span style="color:#d19a66">2</span><span style="color:#56b6c2">&gt;</span>/tmp/a
</span></span><span style="display:flex;"><span>…
</span></span><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">openat</span><span style="color:#56b6c2">(</span><span style="color:#61afef;font-weight:bold">AT_FDCWD</span>, <span style="color:#98c379">&#34;Documentation/&#34;</span>, O_RDONLY<span style="color:#56b6c2">|</span><span style="color:#61afef;font-weight:bold">O_NONBLOCK</span><span style="color:#56b6c2">|</span><span style="color:#61afef;font-weight:bold">O_CLOEXEC</span><span style="color:#56b6c2">|</span><span style="color:#61afef;font-weight:bold">O_DIRECTORY</span><span style="color:#56b6c2">)</span> <span style="color:#56b6c2">=</span> <span style="color:#d19a66">4</span>
</span></span><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">newfstatat</span><span style="color:#56b6c2">(</span><span style="color:#61afef;font-weight:bold">4</span>, <span style="color:#98c379">&#34;&#34;</span>, <span style="color:#56b6c2">{</span><span style="color:#e06c75">st_mode</span><span style="color:#56b6c2">=</span>S_IFDIR<span style="color:#56b6c2">|</span><span style="color:#61afef;font-weight:bold">0755</span>, <span style="color:#e06c75">st_size</span><span style="color:#56b6c2">=</span><span style="color:#d19a66">1446</span>, ...<span style="color:#56b6c2">}</span>, AT_EMPTY_PATH<span style="color:#56b6c2">)</span> <span style="color:#56b6c2">=</span> <span style="color:#d19a66">0</span>
</span></span><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">getdents64</span><span style="color:#56b6c2">(</span><span style="color:#61afef;font-weight:bold">4</span>, 0x55df0a6e6890 /* <span style="color:#d19a66">99</span> entries */, <span style="color:#d19a66">32768</span><span style="color:#56b6c2">)</span> <span style="color:#56b6c2">=</span> <span style="color:#d19a66">3032</span>
</span></span><span style="display:flex;"><span>…
</span></span></code></pre></div><h4 id="with-unadded-new-files">With Unadded New Files</h4>
<p>Let’s add some files now:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-fish" data-lang="fish"><span style="display:flex;"><span><span style="color:#c678dd">for</span> <span style="color:#e06c75">f</span> <span style="color:#c678dd">in</span> <span style="color:#56b6c2">(</span><span style="color:#61afef;font-weight:bold">seq</span> <span style="color:#d19a66">1</span> <span style="color:#d19a66">1000</span><span style="color:#56b6c2">)</span>
</span></span><span style="display:flex;"><span>  <span style="color:#61afef;font-weight:bold">touch</span> <span style="color:#e06c75">$f</span>
</span></span><span style="display:flex;"><span><span style="color:#c678dd">end</span>
</span></span></code></pre></div><p>And compare with our baseline:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-fish" data-lang="fish"><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">hyperfine</span> <span style="color:#e06c75">--export-markdown</span> /tmp/<span style="color:#d19a66">4</span>.md <span style="color:#e06c75">--warmup</span> <span style="color:#d19a66">10</span> <span style="color:#98c379">&#39;git ls-files&#39;</span> <span style="color:#98c379">&#39;git ls-files -o&#39;</span> <span style="color:#98c379">&#39;find&#39;</span> <span style="color:#98c379">\
</span></span></span><span style="display:flex;"><span>  <span style="color:#98c379">&#39;fd --no-ignore&#39;</span> <span style="color:#98c379">&#39;fd --no-ignore --hidden --exclude .git --type file --type symlink&#39;</span>
</span></span></code></pre></div><table>
  <thead>
      <tr>
          <th style="text-align: left">Command</th>
          <th style="text-align: right">Mean [ms]</th>
          <th style="text-align: right">Min [ms]</th>
          <th style="text-align: right">Max [ms]</th>
          <th style="text-align: right">Relative</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><code>git ls-files</code></td>
          <td style="text-align: right">16.8 ± 0.5</td>
          <td style="text-align: right">16.1</td>
          <td style="text-align: right">18.0</td>
          <td style="text-align: right">1.00</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>git ls-files -o</code></td>
          <td style="text-align: right">69.9 ± 1.2</td>
          <td style="text-align: right">68.1</td>
          <td style="text-align: right">72.6</td>
          <td style="text-align: right">4.17 ± 0.14</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>find</code></td>
          <td style="text-align: right">94.5 ± 0.6</td>
          <td style="text-align: right">93.4</td>
          <td style="text-align: right">96.3</td>
          <td style="text-align: right">5.64 ± 0.17</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore</code></td>
          <td style="text-align: right">86.8 ± 7.5</td>
          <td style="text-align: right">81.5</td>
          <td style="text-align: right">114.4</td>
          <td style="text-align: right">5.18 ± 0.48</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore --hidden --exclude .git --type file --type symlink</code></td>
          <td style="text-align: right">81.0 ± 4.5</td>
          <td style="text-align: right">78.6</td>
          <td style="text-align: right">96.3</td>
          <td style="text-align: right">4.83 ± 0.31</td>
      </tr>
  </tbody>
</table>
<p>There is little to no statistically significant difference from our baseline, which highlights that much of the time is spent on things relatively independent of the number of files processed. It’s also worth noting that there is relatively little speed difference between <code>git ls-files -o</code> and <code>fd --no-ignore --hidden --exclude .git --type file --type symlink</code>.</p>
<p>Using <code>strace</code>, we can establish that all commands but <code>git ls-files</code> were reading all files in the repository. By comparing the <code>strace</code> outputs of <code>git ls-files -o</code> and <code>fd --no-ignore --hidden --exclude .git --type file --type symlink</code> (the two commands that print the same file list), we can see that they make similar system calls for each file in the repository. What explains the (small) time difference between the two? I haven’t found a convincing reason in Git’s source code for this case. It might be that the use of the <code>index</code> gives <code>ls-files</code> a head start.</p>
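<p>As a rough sketch of such a comparison, the system call counts of each command can be collected with <code>strace -c</code> and then diffed. This is not necessarily the invocation used for this article: the output paths under <code>/tmp</code> are placeholders, and <code>-f</code> is assumed to be needed so that calls made by threads or child processes are counted too.</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sh" data-lang="sh"># -c prints a per-syscall summary, -f follows threads and child processes,
# -o writes strace's report to a file (placeholder paths) instead of stderr.
strace -f -c -o /tmp/git-ls-files.strace git ls-files -o &gt; /dev/null
strace -f -c -o /tmp/fd.strace fd --no-ignore --hidden --exclude .git --type file --type symlink &gt; /dev/null

# Compare the two summaries side by side.
diff --side-by-side /tmp/git-ls-files.strace /tmp/fd.strace
</code></pre></div>
<p>Dropping <code>-c</code> gives the full trace instead of a summary, which is more verbose but shows the calls made for each individual file.</p>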
<h2 id="conclusions">Conclusions</h2>
<p>I’m now using <code>git ls-files</code> in my <a href="https://joly.pw/blog/my-setup/nvim-0-5/">keyboard-driven text editor</a> instead of <code>fd</code> or <code>find</code>. It is faster, although the perceived difference described in the Introduction is probably due to spikes in latency on a cold cache. With <code>ls-files</code>, the selection of files is also narrowed down to the ones I care about. That said, I’ve kept the <code>fd</code>-based file listing as a fallback, as sometimes I’m not in a Git repository.</p>
<p>After all, Git is already building an index, so why not use it to speed up jumping from file to file?</p>
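<p>The fallback described above could look roughly like the following POSIX shell function. It is only a sketch, not my actual editor configuration: the name <code>list_files</code> is hypothetical, the <code>fd</code> flags are borrowed from the benchmark, and the final <code>find</code> branch is an extra assumption for machines where <code>fd</code> is not installed.</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sh" data-lang="sh"># Hypothetical helper: list files for a fuzzy finder.
list_files() {
    # Inside a Git repository, read the tracked files straight from the index.
    # stderr is silenced so the "not a git repository" error stays hidden.
    git ls-files 2&gt;/dev/null &amp;&amp; return

    if command -v fd &gt;/dev/null 2&gt;&amp;1; then
        # Outside a repository, walk the directory tree with fd.
        fd --hidden --exclude .git --type file --type symlink
    else
        # Assumed last resort when fd is missing: regular files and symlinks.
        find . \( -type f -o -type l \) ! -path './.git/*'
    fi
}
</code></pre></div>
<p>Trying <code>git ls-files</code> first keeps the common case on the fast path, and the directory walk only happens when Git reports an error.</p>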
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>With <a href="https://github.com/nvim-telescope/telescope.nvim">Telescope.nvim</a> <code>:Telescope find_files</code>&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>With <a href="https://github.com/nvim-telescope/telescope.nvim">Telescope.nvim</a> <code>:Telescope git_files show_untracked=false</code>&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>That’s not to say Git is slow. On the contrary, the release notes make it obvious that a lot of performance optimization work goes into it.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>Using a shallow clone makes it faster for you to reproduce results locally. However, running the benchmarks again on a full clone does not significantly change the results.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>Using the <code>diff</code> command on the outputs of <code>git ls-files</code> and <code>fd --no-ignore --hidden --exclude .git --type file --type symlink</code>&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>This is inserted in this page using my <a href="https://joly.pw/gohugo-asciinema/?ref=git-faster-fd-find">asciinema Hugo module</a>.&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>This output has been edited to remove the warning about outliers. These warnings appeared only with <code>asciinema</code>, probably because it disturbs the benchmark. This also explains why the values in this “asciicast” differ from the tables in the rest of the article: I’ve used values from runs outside asciinema for these tables.&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p>See also <a href="https://jvns.ca/blog/2014/04/20/debug-your-programs-like-theyre-closed-source/">https://jvns.ca/blog/2014/04/20/debug-your-programs-like-theyre-closed-source/</a>&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded></item></channel></rss>