<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Hall of Fame on Clément Joly – Open-Source, Rust &amp; SQLite</title><link>https://joly.pw/tags/hackernewshomepage/</link><description>Recent content in Hall of Fame on Clément Joly – Open-Source, Rust &amp; SQLite</description><image><title>Clément Joly – Open-Source, Rust &amp; SQLite</title><url>https://joly.pw/images/open-graph-home-original.png</url><link>https://joly.pw/images/open-graph-home-original.png</link></image><generator>Hugo -- 0.154.3</generator><language>en</language><copyright>Clément Joly</copyright><atom:link href="https://joly.pw/tags/hackernewshomepage/index.xml" rel="self" type="application/rss+xml"/><item><title>Rust Default Values for Maintainability</title><link>https://joly.pw/blog/rust-default-values-for-maintainability/</link><pubDate>Sat, 25 Jun 2022 14:41:55 +0000</pubDate><guid>https://joly.pw/blog/rust-default-values-for-maintainability/</guid><description>Hidden benefits of the humble Default trait</description><content:encoded><![CDATA[



  
  
  
  

  <div class="alert alert-tldr">
    <p class="alert-heading">
      ⚡
      
        TL;DR
      
    </p>
    <p>The <a href="https://doc.rust-lang.org/std/default/trait.Default.html"><code>Default</code></a> trait can enhance the maintainability of your code. Default values for common types are <a href="#quick-reference">listed</a> at the end.</p>
  </div>



<h2 id="a-pr-review">A PR Review</h2>
<p>Recently, while reviewing a <a href="https://github.com/cljoly/rusqlite_migration/pull/20/files">PR</a><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, I noticed that part of the patch was introducing a new field to a struct:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-diff" data-lang="diff"><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 1</span><span>diff --git a/src/lib.rs b/src/lib.rs
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 2</span><span>index eba9a3a..8619e06 100644
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 3</span><span><span style="color:#e06c75">--- a/src/lib.rs
</span></span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 4</span><span><span style="color:#98c379;font-weight:bold">+++ b/src/lib.rs
</span></span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 5</span><span>@@ -106,8 +108,9 @@ use std::{
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 6</span><span> #[derive(Debug, PartialEq, Clone)]
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 7</span><span> pub struct M&lt;&#39;u&gt; {
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 8</span><span>     up: &amp;&#39;u str,
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 9</span><span>     down: Option&lt;&amp;&#39;u str&gt;,
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">10</span><span><span style="color:#98c379;font-weight:bold">+    foreign_key_check: bool,
</span></span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">11</span><span> }
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">12</span><span> 
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">13</span><span> impl&lt;&#39;u&gt; M&lt;&#39;u&gt; {
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">14</span><span>@@ -137,8 +140,9 @@ impl&lt;&#39;u&gt; M&lt;&#39;u&gt; {
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">15</span><span>     pub const fn up(sql: &amp;&#39;u str) -&gt; Self {
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">16</span><span>         Self {
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">17</span><span>             up: sql,
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">18</span><span>             down: None,
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">19</span><span><span style="color:#98c379;font-weight:bold">+            foreign_key_check: false,
</span></span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">20</span><span>         }
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">21</span><span>     }
</span></span></code></pre></div><p>That prompted me to reflect on the code I had initially written.
Prior to the patch, it looked roughly<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> like this:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 1</span><span><span style="color:#7f848e">#[derive(Debug, PartialEq, Clone)]</span>
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 2</span><span><span style="color:#c678dd">pub</span> <span style="color:#c678dd">struct</span> <span style="color:#e5c07b">M</span><span style="color:#56b6c2">&lt;</span><span style="color:#e06c75">&#39;u</span><span style="color:#56b6c2">&gt;</span> {
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 3</span><span>    <span style="color:#e06c75">up</span>: <span style="color:#c678dd">&amp;</span><span style="color:#e06c75">&#39;u</span> <span style="color:#e5c07b">str</span>,
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 4</span><span>    <span style="color:#e06c75">down</span>: <span style="color:#e5c07b">Option</span><span style="color:#56b6c2">&lt;&amp;</span><span style="color:#e06c75">&#39;u</span> <span style="color:#e5c07b">str</span><span style="color:#56b6c2">&gt;</span>,
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 5</span><span>}
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 6</span><span>
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 7</span><span><span style="color:#c678dd">impl</span><span style="color:#56b6c2">&lt;</span><span style="color:#e06c75">&#39;u</span><span style="color:#56b6c2">&gt;</span> <span style="color:#e06c75">M</span><span style="color:#56b6c2">&lt;</span><span style="color:#e06c75">&#39;u</span><span style="color:#56b6c2">&gt;</span> {
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 8</span><span>    <span style="color:#c678dd">pub</span> <span style="color:#c678dd">const</span> <span style="color:#c678dd">fn</span> <span style="color:#61afef;font-weight:bold">up</span>(<span style="color:#e06c75">sql</span>: <span style="color:#c678dd">&amp;</span><span style="color:#e06c75">&#39;u</span> <span style="color:#e5c07b">str</span>) -&gt; <span style="color:#e5c07b">Self</span> {
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 9</span><span>        <span style="color:#e5c07b">Self</span> {
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">10</span><span>            <span style="color:#e06c75">up</span>: <span style="color:#e5c07b">sql</span>,
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">11</span><span>            <span style="color:#e06c75">down</span>: <span style="color:#e5c07b">None</span>,
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">12</span><span>        }
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">13</span><span>    }
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">14</span><span>}
</span></span></code></pre></div><h2 id="the-default-trait">The <code>Default</code> Trait</h2>
<p>What if I had used the <a href="https://doc.rust-lang.org/std/default/trait.Default.html"><code>Default</code></a> trait here?
The code could have looked like this:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;display:grid;"><code class="language-rust" data-lang="rust"><span style="display:flex; background-color:#3d4148"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 1</span><span><span style="color:#7f848e">#[derive(Debug, Default, PartialEq, Clone)]</span>
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 2</span><span><span style="color:#c678dd">pub</span> <span style="color:#c678dd">struct</span> <span style="color:#e5c07b">M</span><span style="color:#56b6c2">&lt;</span><span style="color:#e06c75">&#39;u</span><span style="color:#56b6c2">&gt;</span> {
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 3</span><span>    <span style="color:#e06c75">up</span>: <span style="color:#c678dd">&amp;</span><span style="color:#e06c75">&#39;u</span> <span style="color:#e5c07b">str</span>,
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 4</span><span>    <span style="color:#e06c75">down</span>: <span style="color:#e5c07b">Option</span><span style="color:#56b6c2">&lt;&amp;</span><span style="color:#e06c75">&#39;u</span> <span style="color:#e5c07b">str</span><span style="color:#56b6c2">&gt;</span>,
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 5</span><span>}
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 6</span><span>
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 7</span><span><span style="color:#c678dd">impl</span><span style="color:#56b6c2">&lt;</span><span style="color:#e06c75">&#39;u</span><span style="color:#56b6c2">&gt;</span> <span style="color:#e06c75">M</span><span style="color:#56b6c2">&lt;</span><span style="color:#e06c75">&#39;u</span><span style="color:#56b6c2">&gt;</span> {
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 8</span><span>    <span style="color:#c678dd">pub</span> <span style="color:#c678dd">fn</span> <span style="color:#61afef;font-weight:bold">up</span>(<span style="color:#e06c75">sql</span>: <span style="color:#c678dd">&amp;</span><span style="color:#e06c75">&#39;u</span> <span style="color:#e5c07b">str</span>) -&gt; <span style="color:#e5c07b">Self</span> {
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 9</span><span>        <span style="color:#e5c07b">Self</span> {
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">10</span><span>            <span style="color:#e06c75">up</span>: <span style="color:#e5c07b">sql</span>,
</span></span><span style="display:flex; background-color:#3d4148"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">11</span><span>            <span style="color:#56b6c2">..</span><span style="color:#e5c07b">Default</span>::<span style="color:#e06c75">default</span>()
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">12</span><span>        }
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">13</span><span>    }
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">14</span><span>}
</span></span></code></pre></div><p>On the first line, the <a href="https://doc.rust-lang.org/reference/attributes/derive.html"><code>#[derive(Default)]</code></a> attribute makes the structure <code>M</code> implement the <code>Default</code> trait.
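Thanks to this trait, a call to <code>M::default()</code> creates a struct with the default value of each field: <code>M { up: &quot;&quot;, down: None }</code>. Note that where an <code>M</code> is expected, <code>Default::default()</code> is equivalent to <code>M::default()</code>. One caveat: <code>Default::default()</code> is not a <code>const fn</code> on stable Rust, so <code>up</code> has to be a regular <code>fn</code> for this to compile.</p>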
<p>The constructor then initializes both fields of the structure, overriding one of the defaults:</p>
<ul>
<li><code>up</code> is set explicitly, as before. That’s the default we want to override.</li>
<li><code>down</code> is set by the <code>Default</code> trait, through <code>..Default::default()</code> on line 11. <code>Default::default()</code> provides the values, and the <a href="https://doc.rust-lang.org/reference/expressions/struct-expr.html#functional-update-syntax"><code>..</code></a> syntax fills in the fields that were not set explicitly. <code>down</code> thus gets the same value as before, <code>None</code>.</li>
</ul>
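<p>To see both points in isolation, here is a minimal, self-contained sketch. It reuses the <code>M</code> struct from above, with <code>up</code> as a regular <code>fn</code> (an assumption made here because <code>Default::default()</code> cannot be called in a const context on stable Rust). The SQL string is made up for illustration.</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust">#[derive(Debug, Default, PartialEq, Clone)]
pub struct M&lt;'u&gt; {
    up: &amp;'u str,
    down: Option&lt;&amp;'u str&gt;,
}

impl&lt;'u&gt; M&lt;'u&gt; {
    // Regular `fn`: `Default::default()` is not callable in a `const fn` on stable Rust.
    pub fn up(sql: &amp;'u str) -&gt; Self {
        Self {
            up: sql,             // set explicitly, overrides the default ""
            ..Default::default() // fills every field not named above
        }
    }
}

fn main() {
    // The derived implementation uses the default of each field.
    assert_eq!(M::default(), M { up: "", down: None });

    // Illustrative SQL, not taken from the library.
    let m = M::up("CREATE TABLE t(a INTEGER);");
    assert_eq!(m.up, "CREATE TABLE t(a INTEGER);");
    assert_eq!(m.down, None);
}
</code></pre></div>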
<p>The code is just as long as before, when we were not using the <code>Default</code> trait.
However, <a href="#a-pr-review">line 19 of the patch above</a> would then have been unnecessary: <code>false</code> is the default for a <code>bool</code>, so the new <code>foreign_key_check</code> field would have been covered by <code>..Default::default()</code>.</p>
<p>This results in a shorter patch:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-diff" data-lang="diff"><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 1</span><span>diff --git a/src/lib.rs b/src/lib.rs
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 2</span><span>index eba9a3a..8619e06 100644
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 3</span><span><span style="color:#e06c75">--- a/src/lib.rs
</span></span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 4</span><span><span style="color:#98c379;font-weight:bold">+++ b/src/lib.rs
</span></span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 5</span><span>@@ -106,8 +108,9 @@ use std::{
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 6</span><span> #[derive(Debug, Default, PartialEq, Clone)]
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 7</span><span> pub struct M&lt;&#39;u&gt; {
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 8</span><span>     up: &amp;&#39;u str,
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f"> 9</span><span>     down: Option&lt;&amp;&#39;u str&gt;,
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">10</span><span><span style="color:#98c379;font-weight:bold">+    foreign_key_check: bool,
</span></span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">11</span><span> }
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">12</span><span> 
</span></span><span style="display:flex;"><span style="white-space:pre;-webkit-user-select:none;user-select:none;margin-right:0.4em;padding:0 0.4em 0 0.4em;color:#55595f">13</span><span> impl&lt;&#39;u&gt; M&lt;&#39;u&gt; {
</span></span></code></pre></div><h2 id="conclusion">Conclusion</h2>
<p>In the example of this post, the struct is instantiated in only one place, and it has very few fields anyway.
But if it were instantiated in many places, we would have had to change every one of them to add the new field.
In that situation, using <code>Default</code> would have been quite beneficial.</p>
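<p>As a rough sketch of that situation, suppose <code>M</code> had a second constructor. The <code>up_and_down</code> function below is hypothetical and only serves the illustration, and <code>up</code> is again a regular <code>fn</code>. Adding <code>foreign_key_check</code> to the struct leaves both constructors untouched:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust">#[derive(Debug, Default, PartialEq, Clone)]
pub struct M&lt;'u&gt; {
    up: &amp;'u str,
    down: Option&lt;&amp;'u str&gt;,
    // Added later: neither constructor below had to change.
    foreign_key_check: bool,
}

impl&lt;'u&gt; M&lt;'u&gt; {
    pub fn up(sql: &amp;'u str) -&gt; Self {
        Self { up: sql, ..Default::default() }
    }

    // Hypothetical second constructor, for illustration only.
    pub fn up_and_down(up: &amp;'u str, down: &amp;'u str) -&gt; Self {
        Self { up, down: Some(down), ..Default::default() }
    }
}

fn main() {
    let a = M::up("CREATE TABLE t(a);");
    let b = M::up_and_down("CREATE TABLE t(a);", "DROP TABLE t;");

    // Both instantiation sites pick up the default of the new field.
    assert!(!a.foreign_key_check);
    assert!(!b.foreign_key_check);
    assert_eq!(b.down, Some("DROP TABLE t;"));
}
</code></pre></div>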
<p><a href="https://cs.github.com/rust-lang/rust/blob/10f4ce324baf7cfb7ce2b2096662b82b79204944/compiler/rustc_target/src/spec/hermit_base.rs#L21">This</a>
<a href="https://cs.github.com/rust-lang/rust/blob/10f4ce324baf7cfb7ce2b2096662b82b79204944/compiler/rustc_target/src/spec/solaris_base.rs#L15">pattern</a>
<a href="https://cs.github.com/rust-lang/rust-analyzer/blob/6fc5c3cd2117a29981ba9b7cef8a51c1d6804089/crates/ide-completion/src/render.rs#L68">where</a>
<a href="https://cs.github.com/rust-lang/rustc-perf/blob/434ba59ca9fbd793ba2b6d02e65704c108479069/collector/benchmarks/clap-3.1.6/src/build/possible_value.rs#L56">the</a>
<a href="https://cs.github.com/rust-lang/rustup/blob/20ed5d9803ca237c39fbbcba8971c4558be4acca/src/diskio/immediate.rs#L34">return</a>
<a href="https://cs.github.com/rust-lang/mdBook/blob/0547868d4d25e1c840a871f9e17b2b4c2078596b/src/book/book.rs#L89">type</a>
is
<a href="https://cs.github.com/rust-lang/rust/blob/10f4ce324baf7cfb7ce2b2096662b82b79204944/compiler/rustc_target/src/spec/redox_base.rs#L16">built</a>
<a href="https://cs.github.com/rust-lang/docs.rs/blob/1ce3fd876a0f5fd5b52c9962cf7ae9df137e6366/src/web/releases.rs#L653">with</a>
a
<a href="https://cs.github.com/rust-lang/rust/blob/10f4ce324baf7cfb7ce2b2096662b82b79204944/compiler/rustc_target/src/spec/l4re_base.rs#L13">call</a>
to
<code>Default::default()</code>
seems
relatively
common
in
the
<a href="https://github.com/rust-lang">Rust organization</a>.
It can enhance maintainability, much more than I initially thought.</p>
<p>Of course, this is a balancing act.
For instance, this pattern could be abused by <a href="https://play.rust-lang.org/?version=stable&amp;mode=debug&amp;edition=2021&amp;gist=ae7553d16b481951bd8e3418b82fcd27">defining custom default values</a> for fields of primitive types in a particular structure.
<code>Default::default()</code> would then fill in surprising values, making the code less predictable.</p>
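<p>Here is a small sketch of what such an abuse could look like. The <code>Options</code> struct and its values are hypothetical and not necessarily what the linked playground contains:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust">#[derive(Debug, PartialEq)]
struct Options {
    verbose: bool,
    retries: u32,
}

// Hand-written implementation whose values differ from the
// defaults of `bool` and `u32` (hypothetical, for illustration).
impl Default for Options {
    fn default() -&gt; Self {
        Self { verbose: true, retries: 3 }
    }
}

fn main() {
    let o = Options { retries: 5, ..Default::default() };

    // A reader assuming that a `bool` field defaults to `false` is surprised here.
    assert!(o.verbose);
    assert_eq!(o.retries, 5);
    assert!(!bool::default());
}
</code></pre></div>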




  
  
  
  

  <div class="alert alert-edit">
    <p class="alert-heading">
      ✏
      
        Edit
      
    </p>
    <p>2022-07-29: As SpudnikV <a href="https://www.reddit.com/r/rust/comments/vkozed/comment/idtbgit/?utm_source=share&amp;utm_medium=web2x&amp;context=3">pointed out</a>, using defaults as explained in this post can hide the implications of a change made to a structure. Code in various places may rely on invariants that the change breaks. Without <code>Default::default()</code>, the change might require visible edits in these places, drawing the attention of reviewers to these invariants.</p>
<p>That’s another case where it might not be wise to use the <code>Default</code> trait. Again, it’s a balancing act!</p>
  </div>
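<p>To make that concern concrete, here is a hypothetical sketch. The <code>Page</code> struct, its <code>size &gt; 0</code> invariant and both functions are invented for the example. A field is added later, the existing constructor keeps compiling unchanged, and the derived default quietly violates what other code expects:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust">#[derive(Debug, Default)]
struct Page {
    offset: usize,
    // Added later. Code elsewhere assumes the invariant `size &gt; 0`.
    size: usize,
}

fn first_page() -&gt; Page {
    // Written before `size` existed; it still compiles unchanged.
    Page { offset: 0, ..Default::default() }
}

fn page_count(total: usize, page: &amp;Page) -&gt; Option&lt;usize&gt; {
    // `checked_div` returns `None` instead of panicking when `size` is 0.
    total.checked_div(page.size)
}

fn main() {
    let page = first_page();
    assert_eq!(page.offset, 0);

    // The derived default silently produced `size: 0`, breaking the invariant.
    assert_eq!(page.size, 0);
    assert_eq!(page_count(100, &amp;page), None);
}
</code></pre></div>
<p>With an explicit struct literal in <code>first_page</code>, adding <code>size</code> would have been a compile error there, and a reviewer would have seen the value chosen for it.</p>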



<hr>
<h2 id="appendix-defaults-for-some-common-types-with-derive">Appendix: Defaults for Some Common Types With <code>derive</code></h2>
<p>Why did I not use the <code>Default</code> trait initially?
Part of it might be the fear of introducing incorrect code.
It was slightly unclear to me what <code>derive</code> uses as a default value for common primitive types.
And you don’t want a field set explicitly to <code>false</code> becoming a <code>true</code> once you use <code>Default</code>, right?</p>
<p>Let’s take a closer look by running the following program:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#7f848e">#[derive(Debug, Default)]</span>
</span></span><span style="display:flex;"><span><span style="color:#c678dd">struct</span> <span style="color:#e5c07b">D</span><span style="color:#56b6c2">&lt;</span><span style="color:#e06c75">&#39;a</span><span style="color:#56b6c2">&gt;</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#e06c75">b</span>: <span style="color:#e5c07b">bool</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#e06c75">c</span>: <span style="color:#e5c07b">char</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#e06c75">o</span>: <span style="color:#e5c07b">Option</span><span style="color:#56b6c2">&lt;</span><span style="color:#e5c07b">usize</span><span style="color:#56b6c2">&gt;</span>,
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#e06c75">string</span>: <span style="color:#e5c07b">String</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#e5c07b">str</span>: <span style="color:#c678dd">&amp;</span><span style="color:#e06c75">&#39;a</span> <span style="color:#e5c07b">str</span>,
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#e06c75">v</span>: <span style="color:#e5c07b">Vec</span><span style="color:#56b6c2">&lt;</span><span style="color:#e5c07b">usize</span><span style="color:#56b6c2">&gt;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#e06c75">a</span>: [<span style="color:#e5c07b">usize</span>; <span style="color:#d19a66">10</span>],
</span></span><span style="display:flex;"><span>    <span style="color:#e06c75">s</span>: <span style="color:#c678dd">&amp;</span><span style="color:#e06c75">&#39;a</span> [<span style="color:#e5c07b">u32</span>],
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#e06c75">f</span>: <span style="color:#e5c07b">f64</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#e06c75">u</span>: (<span style="color:#e5c07b">usize</span>, <span style="color:#e5c07b">u32</span>, <span style="color:#e5c07b">u64</span>)
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#c678dd">fn</span> <span style="color:#61afef;font-weight:bold">main</span>() {
</span></span><span style="display:flex;"><span>    <span style="color:#56b6c2;font-weight:bold">println!</span>(<span style="color:#98c379">&#34;</span><span style="color:#98c379">{:#?}</span><span style="color:#98c379">&#34;</span>, <span style="color:#e06c75">D</span>::<span style="color:#e06c75">default</span>());
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Output:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#e06c75">D</span> {
</span></span><span style="display:flex;"><span>    <span style="color:#e06c75">b</span>: <span style="color:#e5c07b">false</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#e06c75">c</span>: <span style="color:#e06c75">&#39;</span>\<span style="color:#d19a66">0</span><span style="color:#e06c75">&#39;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#e06c75">o</span>: <span style="color:#e5c07b">None</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#e06c75">string</span>: <span style="color:#98c379">&#34;&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#e5c07b">str</span>: <span style="color:#98c379">&#34;&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#e06c75">v</span>: [],
</span></span><span style="display:flex;"><span>    <span style="color:#e06c75">a</span>: [
</span></span><span style="display:flex;"><span>        <span style="color:#d19a66">0</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#d19a66">0</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#d19a66">0</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#d19a66">0</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#d19a66">0</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#d19a66">0</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#d19a66">0</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#d19a66">0</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#d19a66">0</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#d19a66">0</span>,
</span></span><span style="display:flex;"><span>    ],
</span></span><span style="display:flex;"><span>    <span style="color:#e06c75">s</span>: [],
</span></span><span style="display:flex;"><span>    <span style="color:#e06c75">f</span>: <span style="color:#d19a66">0.0</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#e06c75">u</span>: (
</span></span><span style="display:flex;"><span>        <span style="color:#d19a66">0</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#d19a66">0</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#d19a66">0</span>,
</span></span><span style="display:flex;"><span>    ),
</span></span><span style="display:flex;"><span>}
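</span></span></code></pre></div><p>It turns out that, in general, the default for common standard library types is the “zero” or empty value: <code>false</code>, <code>0</code>, <code>'\0'</code>, <code>None</code>, or an empty string or collection.
Note that arrays have a fixed size in Rust, so the default is an array of the right size, filled with the default of the inner type (the standard library only implements <code>Default</code> for arrays of up to 32 elements).
That’s quite similar to <a href="https://go.dev/ref/spec#The_zero_value">zero values in Go</a>.</p>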
<h3 id="quick-reference">Quick Reference</h3>
<p>Here is a table for future reference:</p>
<table>
  <thead>
      <tr>
          <th>Type</th>
          <th>Default value</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><code>bool</code></td>
          <td><code>false</code></td>
      </tr>
      <tr>
          <td><code>char</code></td>
          <td><code>'\0'</code></td>
      </tr>
      <tr>
          <td><code>Option</code></td>
          <td><code>None</code></td>
      </tr>
      <tr>
          <td><code>String</code>, <code>&amp;str</code></td>
          <td><code>&quot;&quot;</code></td>
      </tr>
      <tr>
          <td><code>Vec&lt;usize&gt;</code></td>
          <td><code>[]</code></td>
      </tr>
      <tr>
          <td><code>[usize; N]</code></td>
          <td><code>[0, 0, …, 0]</code></td>
      </tr>
      <tr>
          <td><code>&amp;[u32]</code></td>
          <td><code>[]</code></td>
      </tr>
      <tr>
          <td><code>f64</code>, <code>f32</code>…</td>
          <td><code>0.0</code></td>
      </tr>
      <tr>
          <td><code>usize</code>, <code>u32</code>…</td>
          <td><code>0</code></td>
      </tr>
      <tr>
          <td><code>(usize, u32, u64)</code></td>
          <td><code>(0, 0, 0)</code></td>
      </tr>
  </tbody>
</table>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Please don’t take anything in this post as critical of the PR author’s work. I’m very grateful that they took some time to contribute to <a href="https://cj.rs/rusqlite_migration/">the project</a>.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>I’ve slightly edited the patch and the code samples from <a href="https://cj.rs/rusqlite_migration/">rusqlite_migration</a> to make those shorter and easier to grasp.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded></item><item><title>Git ls-files is Faster Than Fd and Find</title><link>https://joly.pw/blog/git-ls-files-is-faster-than-fd-and-find/</link><pubDate>Thu, 04 Nov 2021 06:06:21 +0000</pubDate><guid>https://joly.pw/blog/git-ls-files-is-faster-than-fd-and-find/</guid><description>Git ls-files is up to 5 times faster than fd or find in this benchmark, but why?</description><content:encoded><![CDATA[



  
  
  
  

  <div class="alert alert-tldr">
    <p class="alert-heading">
      ⚡
      
        TL;DR
      
    </p>
    <p>In the <a href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/">Linux Git repository</a>:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-fish" data-lang="fish"><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">hyperfine</span> <span style="color:#e06c75">--export-markdown</span> /tmp/tldr.md <span style="color:#e06c75">--warmup</span> <span style="color:#d19a66">10</span> <span style="color:#98c379">&#39;git ls-files&#39;</span> <span style="color:#98c379">&#39;find&#39;</span> <span style="color:#98c379">&#39;fd --no-ignore&#39;</span>
</span></span></code></pre></div><table>
  <thead>
      <tr>
          <th style="text-align: left">Command</th>
          <th style="text-align: right">Mean [ms]</th>
          <th style="text-align: right">Min [ms]</th>
          <th style="text-align: right">Max [ms]</th>
          <th style="text-align: right">Relative</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><code>git ls-files</code></td>
          <td style="text-align: right">16.9 ± 0.5</td>
          <td style="text-align: right">16.3</td>
          <td style="text-align: right">18.2</td>
          <td style="text-align: right">1.00</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>find</code></td>
          <td style="text-align: right">93.1 ± 0.7</td>
          <td style="text-align: right">92.4</td>
          <td style="text-align: right">95.7</td>
          <td style="text-align: right">5.52 ± 0.16</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore</code></td>
          <td style="text-align: right">85.8 ± 7.5</td>
          <td style="text-align: right">81.1</td>
          <td style="text-align: right">111.3</td>
          <td style="text-align: right">5.08 ± 0.47</td>
      </tr>
  </tbody>
</table>
<p><code>git ls-files</code> is more than <em>5 times faster</em> than both <code>fd --no-ignore</code> and <code>find</code>!</p>
  </div>



<h2 id="introduction">Introduction</h2>
<p>In my <a href="https://joly.pw/blog/my-setup/nvim-0-5/">editor</a> I changed my mapping to open files from <code>fd</code><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> to <code>git ls-files</code><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> and I noticed it felt faster after the change. That’s intriguing, given <code>fd</code>’s goal to be <a href="https://github.com/sharkdp/fd#benchmark">very fast</a>. Git, on the other hand, is primarily a source code management system (SCM); its main business<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup> is not to help you list your files! Let’s run some benchmarks to make sure.</p>
<h2 id="benchmarks">Benchmarks</h2>
<p>Is <code>git ls-files</code> actually faster than <code>fd</code> or is that just an illusion? In our benchmark, we will use:</p>
<ul>
<li><code>fd</code> 8.2.1</li>
<li><code>git</code> 2.33.0</li>
<li><code>find</code> 4.8.0</li>
<li><code>hyperfine</code> 1.11.0</li>
</ul>
<p>We run the benchmarks with the disk cache filled; we are not measuring the cold-cache case. That’s because in your editor, you may run these commands multiple times and would benefit from the cache. The results are similar for an in-memory repo, which confirms that the cache is filled.</p>
<p>Also, you work on those files, so they should be in cache to a degree. We also make sure to be on a quiet PC, with CPU power-saving deactivated. Furthermore, the CPU has 8 cores with hyper-threading, so <code>fd</code> uses 8 threads. Last but not least, unless otherwise noted, the files in the repo are only the committed ones: for instance, no build artifacts are present.</p>
<h3 id="a-test-git-repository">A Test Git Repository</h3>
<p>We first need a Git repository. I’ve chosen to clone<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup> the <a href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/">Linux kernel repo</a> because it is a fairly big one and a <a href="https://github.blog/2020-12-22-git-clone-a-data-driven-study-on-cloning-behaviors/">reference</a> for Git performance measurements. This is important to ensure searches take a non-trivial amount of time: as hyperfine rightfully points out, short run times (less than 5 ms) are more difficult to accurately compare.</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-fish" data-lang="fish"><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">git</span> clone <span style="color:#e06c75">--depth</span> <span style="color:#d19a66">1</span> <span style="color:#e06c75">--recursive</span> ssh://git@github.com/torvalds/linux.git ~/ghq/github.com/torvalds/linux
</span></span><span style="display:flex;"><span><span style="color:#c678dd">cd</span> ~/ghq/github.com/torvalds/linux
</span></span></code></pre></div><h4 id="choosing-the-commands">Choosing the commands</h4>
<p>We want to evaluate <code>git ls-files</code> versus <code>fd</code> and <code>find</code>. However, getting exactly the same list of files is not a trivial task:</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Command</th>
          <th style="text-align: right">Output lines</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><code>git ls-files</code></td>
          <td style="text-align: right">72219</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>find</code></td>
          <td style="text-align: right">77039</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore</code></td>
          <td style="text-align: right">76705</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore --hidden</code></td>
          <td style="text-align: right">77038</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd</code></td>
          <td style="text-align: right">72363</td>
      </tr>
  </tbody>
</table>
<p>After some more tries, it turns out that this command gives exactly<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup> the same output as <code>git ls-files</code>:</p>
<pre tabindex="0"><code>fd --no-ignore --hidden --exclude .git --type file --type symlink
</code></pre><p>It is a fairly complicated command, with various criteria on the files to print and that could translate to an unfair advantage to <code>git ls-files</code>. Consequently, we will also use the simpler examples in the table above.</p>
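<p>One way to check that two commands really print the same file list is to sort both outputs and compare them byte for byte. The snippet below is only a sketch of that check: it uses a throwaway repository and <code>find</code> as a stand-in for <code>fd</code>, it is written for a POSIX shell rather than fish, and the file names are made up for the illustration.</p>
<pre tabindex="0"><code class="language-sh" data-lang="sh">set -eu
repo="$(mktemp -d)"    # scratch repository (hypothetical, not the kernel tree)
lists="$(mktemp -d)"   # the sorted lists live outside the repository
cd "$repo"
git init -q .
mkdir src
touch README src/main.c
git add README src/main.c
git ls-files | sort -o "$lists/git.list"
# Skip .git, keep regular files, strip the leading "./" that find prints
find . -path ./.git -prune -o -type f -print | sed 's|^\./||' | sort -o "$lists/find.list"
# cmp stays silent and exits with 0 only when both lists are identical
cmp "$lists/git.list" "$lists/find.list"
echo "identical lists"
</code></pre>
<p>On the kernel tree, the <code>find</code> line would be replaced by the <code>fd</code> command under test; any difference then makes <code>cmp</code> fail.</p>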
<h3 id="hyperfine">Hyperfine</h3>
<p><a href="https://github.com/sharkdp/hyperfine">Hyperfine</a> is a great tool to compare various commands: it has a colored and markdown output, attempts to detect outliers, tunes the number of runs… Here is an <a href="https://asciinema.org/">asciinema</a><sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup> showing its output<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup>:</p>
<div id="demo3"></div>
<script>
AsciinemaPlayer.create("/blog/git-ls-files-is-faster-than-fd-and-find/hyperfine.json", document.getElementById('demo3'), {
"cols": "103","loop": "true","preload":  1 ,"rows": "32","speed": "4",
});
</script>
<noscript><blockquote><p>To run this asciicast without javascript, use <code>asciinema play https://joly.pw/blog/git-ls-files-is-faster-than-fd-and-find/hyperfine.json</code> with <a href="https://asciinema.org/">Asciinema</a></p></blockquote></noscript>

<h3 id="first-results">First Results</h3>
<p>For our first benchmark, on an SSD with <a href="https://en.wikipedia.org/wiki/Btrfs">btrfs</a>, with commit <a href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ad347abe4a9876b1f65f408ab467137e88f77eb4"><code>ad347abe4a…</code></a> checked out, we run:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-fish" data-lang="fish"><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">hyperfine</span> <span style="color:#e06c75">--export-markdown</span> /tmp/<span style="color:#d19a66">1</span>.md <span style="color:#e06c75">--warmup</span> <span style="color:#d19a66">10</span> <span style="color:#98c379">&#39;git ls-files&#39;</span> <span style="color:#98c379">\
</span></span></span><span style="display:flex;"><span>    <span style="color:#98c379">&#39;find&#39;</span> <span style="color:#98c379">&#39;fd --no-ignore&#39;</span> <span style="color:#98c379">&#39;fd --no-ignore --hidden&#39;</span> <span style="color:#98c379">&#39;fd&#39;</span> <span style="color:#98c379">\
</span></span></span><span style="display:flex;"><span>    <span style="color:#98c379">&#39;fd --no-ignore --hidden --exclude .git --type file --type symlink&#39;</span>
</span></span></code></pre></div><p>This yields the following results:</p>
<table>
  <thead>
      <tr>
          <th style="text-align: left">Command</th>
          <th style="text-align: right">Mean [ms]</th>
          <th style="text-align: right">Min [ms]</th>
          <th style="text-align: right">Max [ms]</th>
          <th style="text-align: right">Relative</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><code>git ls-files</code></td>
          <td style="text-align: right">16.9 ± 0.6</td>
          <td style="text-align: right">16.3</td>
          <td style="text-align: right">19.2</td>
          <td style="text-align: right">1.00</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>find</code></td>
          <td style="text-align: right">93.2 ± 0.5</td>
          <td style="text-align: right">92.5</td>
          <td style="text-align: right">94.8</td>
          <td style="text-align: right">5.50 ± 0.19</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore</code></td>
          <td style="text-align: right">86.6 ± 7.8</td>
          <td style="text-align: right">80.5</td>
          <td style="text-align: right">115.7</td>
          <td style="text-align: right">5.11 ± 0.49</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore --hidden</code></td>
          <td style="text-align: right">121.0 ± 6.2</td>
          <td style="text-align: right">112.3</td>
          <td style="text-align: right">132.3</td>
          <td style="text-align: right">7.14 ± 0.44</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd</code></td>
          <td style="text-align: right">231.6 ± 22.3</td>
          <td style="text-align: right">200.8</td>
          <td style="text-align: right">272.5</td>
          <td style="text-align: right">13.68 ± 1.40</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore --hidden --exclude .git --type file --type symlink</code></td>
          <td style="text-align: right">80.9 ± 5.0</td>
          <td style="text-align: right">77.5</td>
          <td style="text-align: right">95.3</td>
          <td style="text-align: right">4.78 ± 0.34</td>
      </tr>
  </tbody>
</table>
<p>As mentioned in the TL;DR, <code>git ls-files</code> is about 5 times faster than <code>find</code> and the various <code>fd</code> invocations: even its closest competitor takes 4.78 times longer! Let’s find out why that is.</p>
<h2 id="how-does-git-store-files-in-a-repository">How Does Git Store Files in a Repository</h2>
<p>To understand where this performance advantage of <code>git ls-files</code> comes from, let’s look into how files are stored in a repository. This is only a quick overview; you can find more details about Git’s storage internals in <a href="https://git-scm.com/book/en/v2/Git-Internals-Git-Objects">this section of the Pro Git book</a>.</p>
<h3 id="git-objects">Git Objects</h3>
<p>Git builds its own internal representation of the file system tree in the repository:</p>
<figure>
    <img loading="lazy" src="./data-model-2.png"
         alt="Internal Git representation of the file system tree" width="800" height="593"/> <figcaption>
            Internal Git representation of the file system tree<p>From the Pro Git book, written by Scott Chacon and Ben Straub and published by Apress, licensed under the <a href="https://creativecommons.org/licenses/by-nc-sa/3.0/">Creative Commons Attribution Non Commercial Share Alike 3.0</a> license, copyright 2021.</p>
        </figcaption>
</figure>

<p>In the figure above, each tree object contains a list of folder and file names, together with references to the corresponding objects (among other things). This representation is then stored by its hash in the <code>.git</code> folder, like so:</p>
<pre tabindex="0"><code>.git/objects
├── 65
│  └── 107a3367b67e7a50788f575f73f70a1e61c1df
├── e6
│  └── 9de29bb2d1d6434b8b29ae775ad8c2e48c5391
├── f0
│  └── f1a67ce36d6d87e09ea711c62e88b135b60411
├── info
└── pack
</code></pre><p>As a result, to list the content of a folder, it seems Git has to access the corresponding tree object, stored in a file whose parent folder is named after the beginning of the hash. But doing that for the currently checked out files all the time would be slow, especially for frequently used commands like <code>git status</code>. Fortunately, Git also maintains an <em>index</em> for files in the current working directory.</p>
<h3 id="git-index">Git Index</h3>
<p>This <a href="https://git-scm.com/docs/index-format">index</a> lists (among other things) each file in the repository with file-system metadata like last modification time. More details and examples are provided <a href="https://medium.com/hackernoon/understanding-git-index-4821a0765cf">here</a>.</p>
<p>So, it seems that the index has everything <code>ls-files</code> requires. Let’s check that <code>ls-files</code> actually uses it.</p>
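<p>To get a feel for what the index holds, here is a small sketch in a throwaway repository (POSIX shell rather than fish; the file names are made up). The index format documentation states that the file starts with the 4-byte signature <code>DIRC</code>, and <code>git ls-files --stage</code> prints the entries recorded in it.</p>
<pre tabindex="0"><code class="language-sh" data-lang="sh">set -eu
cd "$(mktemp -d)"        # throwaway directory, not your real repository
git init -q .
touch a.txt b.txt
git add a.txt b.txt      # staging is what writes the entries to .git/index
head -c 4 .git/index     # the 4-byte signature at the start of the index
echo
git ls-files --stage     # mode, object name, stage number and path of each entry
</code></pre>
<p>Nothing has been committed here: staging the two files is enough for them to appear in the index.</p>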
<h2 id="strace">Strace</h2>
<p>Let’s ensure that <code>ls-files</code> uses only the index, without scanning many files in the repo or the <code>.git</code> folder. That would explain its performance advantage, as reading a file is cheaper than traversing many folders. To this end, we’ll use <code>strace</code><sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup> like so:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-fish" data-lang="fish"><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">strace</span> <span style="color:#e06c75">-e</span> !write git ls-files<span style="color:#56b6c2">&gt;</span>/dev/null <span style="color:#d19a66">2</span><span style="color:#56b6c2">&gt;</span>/tmp/a
</span></span></code></pre></div><p>It turns out the <a href="https://git-scm.com/docs/index-format"><code>.git/index</code></a> is read:</p>
<pre tabindex="0"><code>openat(AT_FDCWD, &#34;.git/index&#34;, O_RDONLY) = 3
</code></pre><p>And we are not reading objects in the <code>.git</code> folder or files in the repository. A quick check of Git’s source code <a href="https://github.com/git/git/blob/33be431c0c7284c1adf0fe49f7838dbc8aee6ea9/builtin/ls-files.c#L761">confirms</a> this. We now have an explanation for the speed <code>git ls-files</code> displays in our benchmarks!</p>
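<p>When <code>strace</code> is not at hand, a rough way to see the same thing is to delete a tracked file from disk and list the files again. This is a sketch in a scratch repository (POSIX shell, hypothetical file names), not a replacement for the trace above.</p>
<pre tabindex="0"><code class="language-sh" data-lang="sh">set -eu
cd "$(mktemp -d)"            # scratch repository
git init -q .
touch kept.txt deleted.txt
git add kept.txt deleted.txt
rm deleted.txt               # gone from the work tree, still recorded in the index
git ls-files                 # lists deleted.txt and kept.txt
git ls-files --deleted       # lists deleted.txt: this mode compares index and work tree
</code></pre>
<p>The plain listing still shows the removed file, which is what we expect if the names come from the index and not from the directory content.</p>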
<h2 id="other-scenarios">Other Scenarios</h2>
<p>However, listing files in a fully committed repository is not the most common case when you work on your code: as you make changes, a larger portion of the files are changed or added. How does <code>git ls-files</code> compare in these other scenarios?</p>
<h3 id="with-changes">With Changes</h3>
<p>When there are changes to some files, we shouldn’t see any significant performance difference: the index is still usable directly to get the names of the files in the repository, and we don’t really care whether their content changed.</p>
<p>To check this, let’s change all the C files in the kernel sources (using some <a href="https://fishshell.com/">fish</a> shell scripting):</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-fish" data-lang="fish"><span style="display:flex;"><span><span style="color:#c678dd">for</span> <span style="color:#e06c75">f</span> <span style="color:#c678dd">in</span> <span style="color:#56b6c2">(</span><span style="color:#61afef;font-weight:bold">fd</span> <span style="color:#e06c75">-e</span> c<span style="color:#56b6c2">)</span>
</span></span><span style="display:flex;"><span>  <span style="color:#c678dd">echo</span> <span style="color:#d19a66">1</span> <span style="color:#56b6c2">&gt;&gt;</span> <span style="color:#e06c75">$f</span>
</span></span><span style="display:flex;"><span><span style="color:#c678dd">end</span>
</span></span></code></pre></div><div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-fish" data-lang="fish"><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">git</span> <span style="color:#e5c07b">status</span> <span style="color:#56b6c2">|</span> <span style="color:#61afef;font-weight:bold">wc</span> <span style="color:#e06c75">-l</span>
</span></span><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">28350</span>
</span></span></code></pre></div><div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-fish" data-lang="fish"><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">hyperfine</span> <span style="color:#e06c75">--export-markdown</span> /tmp/<span style="color:#d19a66">2</span>.md <span style="color:#e06c75">--warmup</span> <span style="color:#d19a66">10</span> <span style="color:#98c379">&#39;git ls-files&#39;</span> <span style="color:#98c379">&#39;find&#39;</span> <span style="color:#98c379">&#39;fd --no-ignore&#39;</span> <span style="color:#98c379">\
</span></span></span><span style="display:flex;"><span>  <span style="color:#98c379">&#39;fd --no-ignore --hidden --exclude .git --type file --type symlink&#39;</span>
</span></span></code></pre></div><table>
  <thead>
      <tr>
          <th style="text-align: left">Command</th>
          <th style="text-align: right">Mean [ms]</th>
          <th style="text-align: right">Min [ms]</th>
          <th style="text-align: right">Max [ms]</th>
          <th style="text-align: right">Relative</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><code>git ls-files</code></td>
          <td style="text-align: right">16.8 ± 0.5</td>
          <td style="text-align: right">16.3</td>
          <td style="text-align: right">18.9</td>
          <td style="text-align: right">1.00</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>find</code></td>
          <td style="text-align: right">93.5 ± 0.7</td>
          <td style="text-align: right">92.7</td>
          <td style="text-align: right">95.5</td>
          <td style="text-align: right">5.55 ± 0.17</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore</code></td>
          <td style="text-align: right">86.1 ± 7.3</td>
          <td style="text-align: right">80.9</td>
          <td style="text-align: right">112.6</td>
          <td style="text-align: right">5.12 ± 0.46</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore --hidden --exclude .git --type file --type symlink</code></td>
          <td style="text-align: right">80.8 ± 6.6</td>
          <td style="text-align: right">77.8</td>
          <td style="text-align: right">115.0</td>
          <td style="text-align: right">4.80 ± 0.42</td>
      </tr>
  </tbody>
</table>
<p>We see the same numbers as before and it is again consistent with <a href="https://github.com/git/git/blob/33be431c0c7284c1adf0fe49f7838dbc8aee6ea9/builtin/ls-files.c#L761">ls-files source code</a>.</p>
<p>Run <code>git checkout -f @</code> after this to remove the changes made to the files.</p>
<h3 id="with-new-files-and--o">With New Files and <code>-o</code></h3>
<p>With files that are not committed yet, there are two subcases:</p>
<ul>
<li>files were created and added (with <code>git add</code>): then the files are in the index and reading the index is enough for <code>ls-files</code>, like above,</li>
<li>files were created but not added: these files are not present in the index, but without the <code>-o</code> flag, <code>ls-files</code> won’t output them either, so it can still use the index, as before.</li>
</ul>
<p>So the only case that needs further investigation is the use of <code>-o</code>. Since we don’t have baseline results yet for <code>-o</code>, let’s first see how it compares without any unadded new files.</p>
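<p>The two subcases can be reproduced in a scratch repository. This is a minimal sketch (POSIX shell rather than fish, made-up file names) showing which file each invocation prints:</p>
<pre tabindex="0"><code class="language-sh" data-lang="sh">set -eu
cd "$(mktemp -d)"     # scratch repository
git init -q .
touch added.txt unadded.txt
git add added.txt     # only this file enters the index
git ls-files          # prints added.txt
git ls-files -o       # prints unadded.txt, found by scanning the work tree
</code></pre>
<p>Only the <code>-o</code> invocation needs to know about files that are absent from the index.</p>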
<h4 id="without-any-unadded-new-files-baseline">Without any Unadded New Files (Baseline)</h4>
<p>When we haven’t added any new files in the repository:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-fish" data-lang="fish"><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">hyperfine</span> <span style="color:#e06c75">--export-markdown</span> /tmp/<span style="color:#d19a66">3</span>.md <span style="color:#e06c75">--warmup</span> <span style="color:#d19a66">10</span> <span style="color:#98c379">&#39;git ls-files&#39;</span> <span style="color:#98c379">&#39;git ls-files -o&#39;</span> <span style="color:#98c379">&#39;find&#39;</span> <span style="color:#98c379">\
</span></span></span><span style="display:flex;"><span>  <span style="color:#98c379">&#39;fd --no-ignore&#39;</span> <span style="color:#98c379">&#39;fd --no-ignore --hidden --exclude .git --type file --type symlink&#39;</span>
</span></span></code></pre></div><table>
  <thead>
      <tr>
          <th style="text-align: left">Command</th>
          <th style="text-align: right">Mean [ms]</th>
          <th style="text-align: right">Min [ms]</th>
          <th style="text-align: right">Max [ms]</th>
          <th style="text-align: right">Relative</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><code>git ls-files</code></td>
          <td style="text-align: right">16.7 ± 0.5</td>
          <td style="text-align: right">16.1</td>
          <td style="text-align: right">17.9</td>
          <td style="text-align: right">1.00</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>git ls-files -o</code></td>
          <td style="text-align: right">69.1 ± 0.7</td>
          <td style="text-align: right">67.8</td>
          <td style="text-align: right">70.8</td>
          <td style="text-align: right">4.12 ± 0.12</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>find</code></td>
          <td style="text-align: right">94.3 ± 0.5</td>
          <td style="text-align: right">93.4</td>
          <td style="text-align: right">95.3</td>
          <td style="text-align: right">5.63 ± 0.16</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore</code></td>
          <td style="text-align: right">86.6 ± 7.0</td>
          <td style="text-align: right">80.8</td>
          <td style="text-align: right">106.0</td>
          <td style="text-align: right">5.17 ± 0.44</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore --hidden --exclude .git --type file --type symlink</code></td>
          <td style="text-align: right">80.8 ± 7.4</td>
          <td style="text-align: right">77.9</td>
          <td style="text-align: right">118.0</td>
          <td style="text-align: right">4.82 ± 0.46</td>
      </tr>
  </tbody>
</table>
<p>That suggests that <code>git ls-files -o</code> is performing some more work besides “just” reading the index. With <code>strace</code>, we see lines like:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-fish" data-lang="fish"><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">strace</span> <span style="color:#e06c75">-e</span> !write git ls-files <span style="color:#e06c75">-o</span><span style="color:#56b6c2">&gt;</span>/dev/null <span style="color:#d19a66">2</span><span style="color:#56b6c2">&gt;</span>/tmp/a
</span></span><span style="display:flex;"><span>…
</span></span><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">openat</span><span style="color:#56b6c2">(</span><span style="color:#61afef;font-weight:bold">AT_FDCWD</span>, <span style="color:#98c379">&#34;Documentation/&#34;</span>, O_RDONLY<span style="color:#56b6c2">|</span><span style="color:#61afef;font-weight:bold">O_NONBLOCK</span><span style="color:#56b6c2">|</span><span style="color:#61afef;font-weight:bold">O_CLOEXEC</span><span style="color:#56b6c2">|</span><span style="color:#61afef;font-weight:bold">O_DIRECTORY</span><span style="color:#56b6c2">)</span> <span style="color:#56b6c2">=</span> <span style="color:#d19a66">4</span>
</span></span><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">newfstatat</span><span style="color:#56b6c2">(</span><span style="color:#61afef;font-weight:bold">4</span>, <span style="color:#98c379">&#34;&#34;</span>, <span style="color:#56b6c2">{</span><span style="color:#e06c75">st_mode</span><span style="color:#56b6c2">=</span>S_IFDIR<span style="color:#56b6c2">|</span><span style="color:#61afef;font-weight:bold">0755</span>, <span style="color:#e06c75">st_size</span><span style="color:#56b6c2">=</span><span style="color:#d19a66">1446</span>, ...<span style="color:#56b6c2">}</span>, AT_EMPTY_PATH<span style="color:#56b6c2">)</span> <span style="color:#56b6c2">=</span> <span style="color:#d19a66">0</span>
</span></span><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">getdents64</span><span style="color:#56b6c2">(</span><span style="color:#61afef;font-weight:bold">4</span>, 0x55df0a6e6890 /* <span style="color:#d19a66">99</span> entries */, <span style="color:#d19a66">32768</span><span style="color:#56b6c2">)</span> <span style="color:#56b6c2">=</span> <span style="color:#d19a66">3032</span>
</span></span><span style="display:flex;"><span>…
</span></span></code></pre></div><h4 id="with-unadded-new-files">With Unadded New Files</h4>
<p>Let’s add some files now:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-fish" data-lang="fish"><span style="display:flex;"><span><span style="color:#c678dd">for</span> <span style="color:#e06c75">f</span> <span style="color:#c678dd">in</span> <span style="color:#56b6c2">(</span><span style="color:#61afef;font-weight:bold">seq</span> <span style="color:#d19a66">1</span> <span style="color:#d19a66">1000</span><span style="color:#56b6c2">)</span>
</span></span><span style="display:flex;"><span>  <span style="color:#61afef;font-weight:bold">touch</span> <span style="color:#e06c75">$f</span>
</span></span><span style="display:flex;"><span><span style="color:#c678dd">end</span>
</span></span></code></pre></div><p>And compare with our baseline:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-fish" data-lang="fish"><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">hyperfine</span> <span style="color:#e06c75">--export-markdown</span> /tmp/<span style="color:#d19a66">4</span>.md <span style="color:#e06c75">--warmup</span> <span style="color:#d19a66">10</span> <span style="color:#98c379">&#39;git ls-files&#39;</span> <span style="color:#98c379">&#39;git ls-files -o&#39;</span> <span style="color:#98c379">&#39;find&#39;</span> <span style="color:#98c379">\
</span></span></span><span style="display:flex;"><span>  <span style="color:#98c379">&#39;fd --no-ignore&#39;</span> <span style="color:#98c379">&#39;fd --no-ignore --hidden --exclude .git --type file --type symlink&#39;</span>
</span></span></code></pre></div><table>
  <thead>
      <tr>
          <th style="text-align: left">Command</th>
          <th style="text-align: right">Mean [ms]</th>
          <th style="text-align: right">Min [ms]</th>
          <th style="text-align: right">Max [ms]</th>
          <th style="text-align: right">Relative</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><code>git ls-files</code></td>
          <td style="text-align: right">16.8 ± 0.5</td>
          <td style="text-align: right">16.1</td>
          <td style="text-align: right">18.0</td>
          <td style="text-align: right">1.00</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>git ls-files -o</code></td>
          <td style="text-align: right">69.9 ± 1.2</td>
          <td style="text-align: right">68.1</td>
          <td style="text-align: right">72.6</td>
          <td style="text-align: right">4.17 ± 0.14</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>find</code></td>
          <td style="text-align: right">94.5 ± 0.6</td>
          <td style="text-align: right">93.4</td>
          <td style="text-align: right">96.3</td>
          <td style="text-align: right">5.64 ± 0.17</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore</code></td>
          <td style="text-align: right">86.8 ± 7.5</td>
          <td style="text-align: right">81.5</td>
          <td style="text-align: right">114.4</td>
          <td style="text-align: right">5.18 ± 0.48</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>fd --no-ignore --hidden --exclude .git --type file --type symlink</code></td>
          <td style="text-align: right">81.0 ± 4.5</td>
          <td style="text-align: right">78.6</td>
          <td style="text-align: right">96.3</td>
          <td style="text-align: right">4.83 ± 0.31</td>
      </tr>
  </tbody>
</table>
<p>There is little to no statically significant difference to our baseline, which highlights that much of the time is spent on things relatively independent of the number of files processed. It’s also worth noting that there is relatively little speed difference between <code>git ls-files -o</code> and <code>fd --no-ignore --hidden --exclude .git --type file --type symlink</code>.</p>
<p>Using <code>strace</code>, we can establish that all commands but <code>git ls-files</code> were reading all files in the repository. By comparing the <code>strace</code> outputs of <code>git ls-files -o</code> and <code>fd --no-ignore --hidden --exclude .git --type file --type symlink</code> (the two commands that print the same file list), we can see that they make similar system calls for each file in the repository. How can we explain the (small) time difference between the two? I haven’t found convincing reasons in the Git source code for this case. It might be that the use of the <code>index</code> gives <code>ls-files</code> a head start.</p>
<h2 id="conclusions">Conclusions</h2>
<p>I’m now using <code>git ls-files</code> in my <a href="https://joly.pw/blog/my-setup/nvim-0-5/">keyboard-driven text editor</a> instead of <code>fd</code> or <code>find</code>. It is faster, although the perceived difference described in the Introduction is probably due to spikes in latency on a cold cache. With <code>ls-files</code>, the selection of files is also narrowed down to the ones I care about. That said, I’ve kept the <code>fd</code>-based file listing as a fallback, since I’m sometimes not in a Git repository.</p>
<p>After all, Git is already building an index, so why not use it to speed up jumping from file to file?</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>With <a href="https://github.com/nvim-telescope/telescope.nvim">Telescope.nvim</a> <code>:Telescope find_files</code>&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>With <a href="https://github.com/nvim-telescope/telescope.nvim">Telescope.nvim</a> <code>:Telescope git_files show_untracked=false</code>&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>That’s not to say Git is slow. On the contrary, the release notes make it obvious that a lot of performance optimization work goes into it.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>Using a shallow clone makes it faster for you to reproduce results locally. However, running the benchmarks again on a full clone does not significantly change the results.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>Using the <code>diff</code> command on the outputs of <code>git ls-files</code> and <code>fd --no-ignore --hidden --exclude .git --type file --type symlink</code>&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>This is inserted in this page using my <a href="https://joly.pw/gohugo-asciinema/?ref=git-faster-fd-find">asciinema hugo module</a>&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>This output has been edited to remove the warning about outliers. These warnings appeared only with <code>asciinema</code>, probably because it disturbs the benchmark. This also explains why the values in this “asciicast” differ from the tables in the rest of the article: I’ve used values from runs outside asciinema for these tables.&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p>See also <a href="https://jvns.ca/blog/2014/04/20/debug-your-programs-like-theyre-closed-source/">https://jvns.ca/blog/2014/04/20/debug-your-programs-like-theyre-closed-source/</a>&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded></item><item><title>README In Static Site (RISS)</title><link>https://joly.pw/readme-in-static-site/</link><pubDate>Sat, 21 Aug 2021 08:15:54 +0100</pubDate><guid>https://joly.pw/readme-in-static-site/</guid><description>💎 Transform and insert your GitHub readme in your static site</description><content:encoded><![CDATA[<p>
<p style="display: flex; justify-content: space-between">
  <a href="https://codeberg.org/cljoly/readme-in-static-site" data-goatcounter-click="ext-codeberg-readme-in-static-site" data-goatcounter-title="cljoly/readme-in-static-site">
    <span class="svgicon"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="currentColor" stroke="none">
    <path
        d='M11.955.49A12 12 0 0 0 0 12.49a12 12 0 0 0 1.832 6.373L11.838 5.928a.187.14 0 0 1 .324 0l10.006 12.935A12 12 0 0 0 24 12.49a12 12 0 0 0-12-12 12 12 0 0 0-.045 0zm.375 6.467l4.416 16.553a12 12 0 0 0 5.137-4.213z' />
</svg></span>&nbsp;cljoly/readme-in-static-site
  </a>
  <a class="badges" href="https://github.com/cljoly/readme-in-static-site" data-goatcounter-click="ext-stargithub-readme-in-static-site" data-goatcounter-title="stars cljoly/readme-in-static-site">
    <img src="https://img.shields.io/github/stars/cljoly/readme-in-static-site?style=social" alt="Github stars for readme-in-static-site">
  </a>
</p>


<div class="badges">

</p>
<p><a href="https://github.com/cljoly/readme-in-static-site/blob/main/riss.awk">
<img alt="GitHub code size in bytes" loading="lazy" src="https://img.shields.io/github/size/cljoly/readme-in-static-site/riss.awk?color=purple"></a> 
<img alt="GitHub tag (latest SemVer)" loading="lazy" src="https://img.shields.io/github/v/tag/cljoly/readme-in-static-site?color=darkgreen&sort=semver"> <a href="https://github.com/cljoly/readme-in-static-site/actions/workflows/checks.yml">
<img alt="CI" loading="lazy" src="https://codeberg.org/cljoly/readme-in-static-site/actions/workflows/checks.yml/badge.svg"></a> <a href="https://cj.rs/riss">

  <img loading="lazy" src="data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHdpZHRoPSIxMDQiIGhlaWdodD0iMjAiIHJvbGU9ImltZyIgYXJpYS1sYWJlbD0icG93ZXJlZCBieTogcmlzcyI+PHRpdGxlPnBvd2VyZWQgYnk6IHJpc3M8L3RpdGxlPjxsaW5lYXJHcmFkaWVudCBpZD0icyIgeDI9IjAiIHkyPSIxMDAlIj48c3RvcCBvZmZzZXQ9IjAiIHN0b3AtY29sb3I9IiNiYmIiIHN0b3Atb3BhY2l0eT0iLjEiLz48c3RvcCBvZmZzZXQ9IjEiIHN0b3Atb3BhY2l0eT0iLjEiLz48L2xpbmVhckdyYWRpZW50PjxjbGlwUGF0aCBpZD0iciI+PHJlY3Qgd2lkdGg9IjEwNCIgaGVpZ2h0PSIyMCIgcng9IjMiIGZpbGw9IiNmZmYiLz48L2NsaXBQYXRoPjxnIGNsaXAtcGF0aD0idXJsKCNyKSI+PHJlY3Qgd2lkdGg9Ijc1IiBoZWlnaHQ9IjIwIiBmaWxsPSIjNTU1Ii8+PHJlY3QgeD0iNzUiIHdpZHRoPSIyOSIgaGVpZ2h0PSIyMCIgZmlsbD0iIzlmOWY5ZiIvPjxyZWN0IHdpZHRoPSIxMDQiIGhlaWdodD0iMjAiIGZpbGw9InVybCgjcykiLz48L2c+PGcgZmlsbD0iI2ZmZiIgdGV4dC1hbmNob3I9Im1pZGRsZSIgZm9udC1mYW1pbHk9IlZlcmRhbmEsR2VuZXZhLERlamFWdSBTYW5zLHNhbnMtc2VyaWYiIHRleHQtcmVuZGVyaW5nPSJnZW9tZXRyaWNQcmVjaXNpb24iIGZvbnQtc2l6ZT0iMTEwIj48dGV4dCBhcmlhLWhpZGRlbj0idHJ1ZSIgeD0iMzg1IiB5PSIxNTAiIGZpbGw9IiMwMTAxMDEiIGZpbGwtb3BhY2l0eT0iLjMiIHRyYW5zZm9ybT0ic2NhbGUoLjEpIiB0ZXh0TGVuZ3RoPSI2NTAiPnBvd2VyZWQgYnk8L3RleHQ+PHRleHQgeD0iMzg1IiB5PSIxNDAiIHRyYW5zZm9ybT0ic2NhbGUoLjEpIiBmaWxsPSIjZmZmIiB0ZXh0TGVuZ3RoPSI2NTAiPnBvd2VyZWQgYnk8L3RleHQ+PHRleHQgYXJpYS1oaWRkZW49InRydWUiIHg9Ijg4NSIgeT0iMTUwIiBmaWxsPSIjMDEwMTAxIiBmaWxsLW9wYWNpdHk9Ii4zIiB0cmFuc2Zvcm09InNjYWxlKC4xKSIgdGV4dExlbmd0aD0iMTkwIj5yaXNzPC90ZXh0Pjx0ZXh0IHg9Ijg4NSIgeT0iMTQwIiB0cmFuc2Zvcm09InNjYWxlKC4xKSIgZmlsbD0iI2ZmZiIgdGV4dExlbmd0aD0iMTkwIj5yaXNzPC90ZXh0PjwvZz48L3N2Zz4="></a></p>
<p>
<img alt="RISS in action" loading="lazy" src="/blog/putting-readmes-on-your-static-site/riss-in-action.png"></p>

</div>


<p>This <a href="#benchmark">fast</a> <a href="https://cj.rs/riss.awk">script</a> allows you to insert your GitHub README in your static site and apply transformations. For instance, you can read this <a href="https://github.com/cljoly/readme-in-static-site/blob/main/README.md">README on GitHub</a> and <a href="https://cj.rs/readme-in-static-site">on my website</a>.</p>
<h3 id="why">Why?</h3>
<p>The GitHub README of your repo needs to efficiently describe your project to GitHub visitors. But featuring your project on your website allows you to (among other things):</p>
<ul>
<li>have more control over the theme and layout,</li>
<li>insert scripts that GitHub would prohibit (like <a href="#replace-asciinema-image">asciinema</a>),</li>
<li>have your project’s homepage independent of your hosting platform, if you wish to change at some point.</li>
</ul>
<p>Chances are that for small projects, the page about your project is very similar to the GitHub README. Don’t duplicate efforts, describe the differences! This <a href="https://github.com/vhodges/stitcherd">has</a> <a href="https://dev.to/lornasw93/github-readme-on-portfolio-project-page-51i8">been</a> a <a href="https://richjenks.com/github-pages-from-readme/">long-awaited</a> <a href="https://medium.com/@bolajiayodeji/how-to-convert-github-markdown-files-to-a-simple-website-b08602d05e1c">feature</a>, in <a href="https://news.ycombinator.com/item?id=29305990">one</a> <a href="https://stackoverflow.com/q/48919200/4253785">form</a> or <a href="https://stackoverflow.com/a/69296054/4253785">another</a>.</p>
<p>See this <a href="https://cj.rs/blog/putting-readmes-on-your-static-site/">blog post</a> for more details.</p>
<h2 id="testimonials">Testimonials</h2>
<p><a href="https://news.ycombinator.com/item?id=29304376">
<img loading="lazy" src="https://img.shields.io/badge/dynamic/json?color=Orange&label=HackerNews&query=%24.score&url=https%3A%2F%2Fhacker-news.firebaseio.com%2Fv0%2Fitem%2F29304376.json&logo=ycombinator&color=orange"></a> <a href="https://lobste.rs/s/a4jzvv/readme_static_site_riss#c_hiil4z.json">
<img loading="lazy" src="https://img.shields.io/badge/dynamic/json?color=green&label=Lobsters&query=%24.score&url=https%3A%2F%2Flobste.rs%2Fs%2Fa4jzvv%2Freadme_static_site_riss.json&logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAAAXNSR0IArs4c6QAAAAZiS0dEAL0ALQAtZF7+HAAAAAlwSFlzAAALEwAACxMBAJqcGAAAAAd0SU1FB9wCCBAuLt2rqugAAACMSURBVDjL1ZMxCsJQEERfRPEk9im9mGfxBpaCnaXmBraeQEidKsKz2eIb/peEIOLAwu4Ws7PMbsUAKlOwYCaWmd4aWCV1B4yXpR59R61SitwKDdBHfgPaSTsF8zOmHz5NLykAeCRqvuvCfxGcgP0YF3ZqHy7c1Yt6jfo8dCF3idvkQjcRRVQ/f6bZBC+RBoeZnlCyqwAAAABJRU5ErkJggg==&labelColor=500000"></a></p>
<p>RISS made it to the first page of <a href="https://news.ycombinator.com/item?id=29304376">HackerNews</a> and <a href="https://lobste.rs/s/a4jzvv/readme_static_site_riss">Lobsters</a> and got comments like:</p>




  <figure>
    <blockquote >
      <p>I never really understood the idea to have a separate README and index page. Glad to see i&rsquo;m not the only one :)</p>

    </blockquote>
    
  </figure>



<p><a href="https://news.ycombinator.com/item?id=29305519">southerntofu</a></p>




  <figure>
    <blockquote >
      <p>Kudos for making it reusable and not specific to a single static site generator. […]</p>

    </blockquote>
    
  </figure>



<p><a href="https://lobste.rs/s/a4jzvv/readme_static_site_riss#c_hiil4z">hannu</a></p>




  <figure>
    <blockquote >
      <p>[…] A small but good idea, I like how simple riss.awk is.</p>

    </blockquote>
    
  </figure>



<p><a href="https://news.ycombinator.com/item?id=29305070">lifthrasiir</a></p>
<h3 id="run-it-nothing-to-install">Run It (Nothing to Install)</h3>
<p>To try it with <a href="https://gohugo.io/">Hugo</a> or <a href="https://www.getzola.org/">Zola</a>, run the following in your static-site sources:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-sh" data-lang="sh"><span style="display:flex;"><span>wget https://cj.rs/riss.awk
</span></span><span style="display:flex;"><span>awk -f riss.awk /path/to/my-project/README.md &gt; content/my-project.md
</span></span></code></pre></div><p>If you don’t use Hugo or Zola, no problem! It should also work with any markdown-based static-site generator. Just put the markdown file where it makes sense for that generator.</p>
<p>To automatically update these files in your static-site sources, see <a href="#automate-with-github-actions">Automate with GitHub Actions</a> below. Since RISS is based on Awk, there is nothing to install, besides copying the script itself!</p>
<h2 id="example">Example</h2>
<h3 id="add-a-front-matter">Add a Front Matter</h3>
<p>Most static site generators require a “<a href="https://gohugo.io/getting-started/configuration/#configure-front-matter">front matter</a>” at the beginning of a markdown file to attach some metadata. But you don’t want to add this to your GitHub README! Let’s hide this on GitHub and have it in the script’s output.</p>
<p>In your .md file on GitHub, put:</p>
<pre><code>&lt;!-- insert
---
title: &quot;README In Static Site (RISS)&quot;
date: 2021-08-21T10:15:54
---
end_insert --&gt;
&lt;!-- Powered by https://cj.rs/riss --&gt;
&lt;!-- remove --&gt;

# README In Static Site (RISS)
&lt;!-- end_remove --&gt;
</code></pre>
<p>The output of the script will be:</p>
<pre><code>---
title: &quot;README In Static Site (RISS)&quot;
date: 2021-08-21T10:15:54
---
&lt;!-- Powered by https://cj.rs/riss --&gt;
</code></pre>
<p>and this piece of yaml will be hidden on GitHub!</p>
<h3 id="replace-asciinema-image">Replace Asciinema Image</h3>
<p>You can’t embed the asciinema player on GitHub for security reasons. So the <a href="https://asciinema.org/docs/embedding">asciinema documentation</a> suggests using an image there and linking it to a webpage with the player. But on your own website, you can embed this player.</p>
<p>In your .md file, put:</p>
<pre><code>&lt;!-- remove --&gt;
[![Finding the repositories with “telescope” in their name, with the README in the panel on the right](https://asciinema.org/a/427156.svg)](https://asciinema.org/a/427156)
&lt;!-- end_remove --&gt;
&lt;!-- insert
&lt;asciinema-player src=&quot;./telescope.cast&quot; poster=&quot;npt:0:04&quot;&gt;&lt;/asciinema-player&gt;
end_insert --&gt;
</code></pre>
<p>The output will contain only the asciinema player:</p>
<pre><code>&lt;asciinema-player src=&quot;./telescope.cast&quot; poster=&quot;npt:0:04&quot;&gt;&lt;/asciinema-player&gt;
</code></pre>
<p><em>Note</em>: you also need to add the JS/CSS files of the asciinema player somewhere in your theme. This <a href="https://cj.rs/gohugo-asciinema/">Hugo module</a> makes it easy.</p>
<h3 id="more">More</h3>
<p>See the <a href="https://github.com/cljoly/readme-in-static-site/blob/main/test.md">input file (typically on GitHub)</a> and the <a href="https://github.com/cljoly/readme-in-static-site/blob/main/test_output.md">output of the script</a>. You can find another real-world <a href="https://github.com/cljoly/telescope-repo.nvim/blob/master/README.md">README</a> converted to a <a href="https://cj.rs/telescope-repo-nvim/">webpage</a> (this gives another example of asciinema conversion using a Hugo shortcode).</p>
<p>With some shell scripting, you could also transform all the markdown files in your repo and put them in a subdirectory of your site, so that your project’s documentation, policy, etc. live on your site or even on a site of their own.</p>
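<p>As a rough illustration of that idea, the sketch below converts every markdown file of a repository into one directory of the site sources. It only reuses the <code>awk -f riss.awk</code> invocation shown earlier; the function name <code>riss_convert_all</code>, the <code>RISS_SCRIPT</code> variable and the flattened file naming are assumptions made for this example, not something RISS provides.</p>

```sh
# Sketch only: RISS_SCRIPT, riss_convert_all and the output naming scheme are
# hypothetical choices for this example, not part of RISS itself.
RISS_SCRIPT="${RISS_SCRIPT:-riss.awk}"

riss_convert_all() {
  repo_dir="${1%/}"   # project checkout, trailing slash removed
  out_dir="$2"        # e.g. content/my-project in the site sources
  mkdir -p "$out_dir" || return 1
  # List markdown files, skipping anything stored under .git
  find "$repo_dir" -type f -name '*.md' ! -path '*/.git/*' |
    while IFS= read -r src; do
      # Flatten the relative path: docs/POLICY.md becomes docs-POLICY.md
      rel="${src#"$repo_dir"/}"
      dest="$out_dir/$(printf '%s' "$rel" | tr '/' '-')"
      awk -f "$RISS_SCRIPT" "$src" > "$dest"
    done
}
```

<p>It could then be called as <code>riss_convert_all /path/to/my-project content/my-project</code>. The output directory is assumed to live outside the repository being scanned, and file names containing newlines are not handled.</p>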
<h3 id="your-example">Your Example!</h3>
<p>Have you used this script to transform some markdown (or other) and insert it on your website? <a href="https://github.com/cljoly/readme-in-static-site/issues/new">Open an issue</a> if you would like a link to your use case from this README!</p>
<ul>
<li><strong>telescope-repo.nvim</strong>: <a href="https://github.com/cljoly/telescope-repo.nvim/blob/master/README.md">readme</a>, <a href="https://cj.rs/telescope-repo-nvim/">website</a>; features an Asciinema clip.</li>
<li><strong>neovide</strong>: <a href="https://github.com/neovide/neovide/blob/main/README.md">readme</a>, <a href="https://neovide.dev/">first iteration of their website</a>, <a href="https://github.com/neovide/neovide/pull/1114">PR setting this up</a>. They have now moved to mdbook and that’s great! RISS makes the first iteration of your website easy and you are free to move to more complete solutions when your project grows.</li>
<li><strong>Hugo APT Repository</strong>: <a href="https://github.com/8hobbies/hugo-apt/blob/master/README.md">readme</a>, <a href="https://hugo-apt.8hob.io/">website</a>, <a href="https://github.com/8hobbies/hugo-apt/pull/12">PR setting it up</a>.</li>
</ul>
<h2 id="transformations-reference">Transformations Reference</h2>
<p>The transformations are driven by HTML comments, so that you can have different results when comments are ignored (e.g. in your GitHub README) and after executing the script on your markdown file.</p>
<h3 id="escaping">Escaping</h3>
<p><strong>It is important that your comment starts at the beginning of the line:</strong> spaces are used for escaping, meaning that if the comment has spaces at the beginning of the line, it is ignored.</p>
<p>So this is escaped</p>
<pre tabindex="0"><code>    &lt;!-- insert
</code></pre><p>but this is not</p>
<pre><code>&lt;!-- insert
</code></pre>
<h3 id="insertion">Insertion</h3>
<p>Use these two lines for text you want to have in your output, but not in the raw .md file.</p>
<pre><code>&lt;!-- insert
end_insert --&gt;
</code></pre>
<h3 id="remove">Remove</h3>
<p>Use these two comments for text you want to have in your raw .md file, but not in the output.</p>
<pre><code>&lt;!-- remove --&gt;
&lt;!-- end_remove --&gt;
</code></pre>
<h2 id="spread-the-word">Spread the Word</h2>
<p>If you find this script useful, please consider inserting the following in your readme:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-html" data-lang="html"><span style="display:flex;"><span><span style="color:#7f848e">&lt;!-- Powered by https://cj.rs/riss --&gt;</span>
</span></span></code></pre></div><p>This will help other people find the script. <em>Thanks for spreading the word!</em></p>
<p>If you feel especially charitable, you could put this badge somewhere:</p>
<p><a href="https://cj.rs/riss">

  <img loading="lazy" src="data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHdpZHRoPSIxMDQiIGhlaWdodD0iMjAiIHJvbGU9ImltZyIgYXJpYS1sYWJlbD0icG93ZXJlZCBieTogcmlzcyI+PHRpdGxlPnBvd2VyZWQgYnk6IHJpc3M8L3RpdGxlPjxsaW5lYXJHcmFkaWVudCBpZD0icyIgeDI9IjAiIHkyPSIxMDAlIj48c3RvcCBvZmZzZXQ9IjAiIHN0b3AtY29sb3I9IiNiYmIiIHN0b3Atb3BhY2l0eT0iLjEiLz48c3RvcCBvZmZzZXQ9IjEiIHN0b3Atb3BhY2l0eT0iLjEiLz48L2xpbmVhckdyYWRpZW50PjxjbGlwUGF0aCBpZD0iciI+PHJlY3Qgd2lkdGg9IjEwNCIgaGVpZ2h0PSIyMCIgcng9IjMiIGZpbGw9IiNmZmYiLz48L2NsaXBQYXRoPjxnIGNsaXAtcGF0aD0idXJsKCNyKSI+PHJlY3Qgd2lkdGg9Ijc1IiBoZWlnaHQ9IjIwIiBmaWxsPSIjNTU1Ii8+PHJlY3QgeD0iNzUiIHdpZHRoPSIyOSIgaGVpZ2h0PSIyMCIgZmlsbD0iIzlmOWY5ZiIvPjxyZWN0IHdpZHRoPSIxMDQiIGhlaWdodD0iMjAiIGZpbGw9InVybCgjcykiLz48L2c+PGcgZmlsbD0iI2ZmZiIgdGV4dC1hbmNob3I9Im1pZGRsZSIgZm9udC1mYW1pbHk9IlZlcmRhbmEsR2VuZXZhLERlamFWdSBTYW5zLHNhbnMtc2VyaWYiIHRleHQtcmVuZGVyaW5nPSJnZW9tZXRyaWNQcmVjaXNpb24iIGZvbnQtc2l6ZT0iMTEwIj48dGV4dCBhcmlhLWhpZGRlbj0idHJ1ZSIgeD0iMzg1IiB5PSIxNTAiIGZpbGw9IiMwMTAxMDEiIGZpbGwtb3BhY2l0eT0iLjMiIHRyYW5zZm9ybT0ic2NhbGUoLjEpIiB0ZXh0TGVuZ3RoPSI2NTAiPnBvd2VyZWQgYnk8L3RleHQ+PHRleHQgeD0iMzg1IiB5PSIxNDAiIHRyYW5zZm9ybT0ic2NhbGUoLjEpIiBmaWxsPSIjZmZmIiB0ZXh0TGVuZ3RoPSI2NTAiPnBvd2VyZWQgYnk8L3RleHQ+PHRleHQgYXJpYS1oaWRkZW49InRydWUiIHg9Ijg4NSIgeT0iMTUwIiBmaWxsPSIjMDEwMTAxIiBmaWxsLW9wYWNpdHk9Ii4zIiB0cmFuc2Zvcm09InNjYWxlKC4xKSIgdGV4dExlbmd0aD0iMTkwIj5yaXNzPC90ZXh0Pjx0ZXh0IHg9Ijg4NSIgeT0iMTQwIiB0cmFuc2Zvcm09InNjYWxlKC4xKSIgZmlsbD0iI2ZmZiIgdGV4dExlbmd0aD0iMTkwIj5yaXNzPC90ZXh0PjwvZz48L3N2Zz4="></a></p>
<p>with, for instance, this code:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-markdown" data-lang="markdown"><span style="display:flex;"><span>[<span style="color:#e06c75">![</span>](<span style="color:#e06c75">https://img.shields.io/badge/powered%20by-riss-lightgrey</span>)](https://cj.rs/riss)
</span></span></code></pre></div><h2 id="breaking-api-changes">Breaking API Changes</h2>
<p>We follow <a href="https://semver.org/">semver</a>, and any change that would cause real-world READMEs to be converted differently requires a new major version. In particular, the following are breaking changes:</p>
<ul>
<li>adding new keywords (like <code>remove</code> or <code>insert</code>), as they may be used in the README prior to their introduction in RISS,</li>
<li>changing a keyword’s syntax.</li>
</ul>
<h2 id="benchmark">Benchmark</h2>
<p><strong>Processes 17600 lines in 10 ms</strong></p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>$ <span style="color:#c678dd">for</span> i in <span style="color:#56b6c2">{</span>1..100<span style="color:#56b6c2">}</span>; <span style="color:#c678dd">do</span> shuf README.md &gt;&gt;bench.md; <span style="color:#c678dd">done</span> <span style="color:#7f848e"># Create a big md file</span>
</span></span><span style="display:flex;"><span>$ wc -l README.md
</span></span><span style="display:flex;"><span><span style="color:#d19a66">176</span> README.md
</span></span><span style="display:flex;"><span>$ wc -l bench.md
</span></span><span style="display:flex;"><span><span style="color:#d19a66">17600</span> bench.md
</span></span><span style="display:flex;"><span>$ hyperfine <span style="color:#98c379">&#39;awk -f riss.awk README.md&#39;</span> <span style="color:#98c379">&#39;awk -f riss.awk bench.md&#39;</span>
</span></span></code></pre></div><table>
  <thead>
      <tr>
          <th style="text-align: left">Command</th>
          <th style="text-align: right">Mean [ms]</th>
          <th style="text-align: right">Min [ms]</th>
          <th style="text-align: right">Max [ms]</th>
          <th style="text-align: right">Relative</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td style="text-align: left"><code>awk -f riss.awk README.md</code></td>
          <td style="text-align: right">2.8 ± 0.2</td>
          <td style="text-align: right">2.4</td>
          <td style="text-align: right">3.7</td>
          <td style="text-align: right">1.00</td>
      </tr>
      <tr>
          <td style="text-align: left"><code>awk -f riss.awk bench.md</code></td>
          <td style="text-align: right">9.7 ± 0.4</td>
          <td style="text-align: right">8.9</td>
          <td style="text-align: right">10.7</td>
          <td style="text-align: right">3.46 ± 0.30</td>
      </tr>
  </tbody>
</table>
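<p>One way to read the table above: a file 100 times longer takes only about 3.5 times as long, which suggests that most of the 2.8 ms is fixed startup cost. The sketch below estimates both parts from the two mean timings, assuming a simple linear model (fixed cost plus a constant cost per line); <code>per_line_cost</code> is a hypothetical helper name, and the result is only a back-of-the-envelope figure.</p>

```sh
# Back-of-the-envelope estimate from the mean timings in the table above.
# Assumes run time = fixed cost + (cost per line) x (number of lines).
per_line_cost() {
  awk 'BEGIN {
    small_ms = 2.8; small_lines = 176      # awk -f riss.awk README.md
    big_ms   = 9.7; big_lines   = 17600    # awk -f riss.awk bench.md
    # Slope of the line through the two measurements, in microseconds
    us_per_line = (big_ms - small_ms) * 1000 / (big_lines - small_lines)
    # What is left of the small run once its lines are accounted for
    fixed_ms = small_ms - small_lines * us_per_line / 1000
    printf "%.2f us per extra line, %.1f ms fixed cost\n", us_per_line, fixed_ms
  }'
}

per_line_cost   # prints: 0.40 us per extra line, 2.7 ms fixed cost
```

<p>Under that assumption, the per-line work is small next to the cost of starting <code>awk</code> and loading the script, so README length should rarely matter in practice.</p>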
<h2 id="automate-with-github-actions">Automate with GitHub Actions</h2>
<p>You can automatically update the markdown file in the sources of your site with GitHub Actions. Add this workflow to, for instance, <code>.github/workflows/readme.yml</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-yaml" data-lang="yaml"><span style="display:flex;"><span><span style="color:#e06c75">name</span>: Update README files
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#e06c75">on</span>:
</span></span><span style="display:flex;"><span>  <span style="color:#e06c75">schedule</span>:
</span></span><span style="display:flex;"><span>    - <span style="color:#e06c75">cron</span>: <span style="color:#98c379">&#39;30 */2 * * *&#39;</span>
</span></span><span style="display:flex;"><span>  <span style="color:#e06c75">push</span>:
</span></span><span style="display:flex;"><span>    <span style="color:#e06c75">branches</span>:
</span></span><span style="display:flex;"><span>    - master
</span></span><span style="display:flex;"><span>  <span style="color:#7f848e"># To run this workflow manually from GitHub GUI</span>
</span></span><span style="display:flex;"><span>  <span style="color:#e06c75">workflow_dispatch</span>:
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#e06c75">jobs</span>:
</span></span><span style="display:flex;"><span>  <span style="color:#e06c75">build</span>:
</span></span><span style="display:flex;"><span>    <span style="color:#e06c75">runs-on</span>: ubuntu-latest
</span></span><span style="display:flex;"><span>    <span style="color:#e06c75">steps</span>:
</span></span><span style="display:flex;"><span>    - <span style="color:#e06c75">name</span>: Check out the repo
</span></span><span style="display:flex;"><span>      <span style="color:#e06c75">uses</span>: actions/checkout@v2
</span></span><span style="display:flex;"><span>    - <span style="color:#e06c75">name</span>: Get the latest READMEs
</span></span><span style="display:flex;"><span>      <span style="color:#e06c75">run</span>: make readme-update
</span></span><span style="display:flex;"><span>    - <span style="color:#e06c75">name</span>: Commit and push if there are changes
</span></span><span style="display:flex;"><span>      <span style="color:#e06c75">run</span>: |-<span style="color:#98c379">
</span></span></span><span style="display:flex;"><span><span style="color:#98c379">        git diff
</span></span></span><span style="display:flex;"><span><span style="color:#98c379">        git config --global user.email &#34;bot@example.com&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#98c379">        git config --global user.name &#34;bot&#34;
</span></span></span><span style="display:flex;"><span><span style="color:#98c379">        git diff --quiet || (git add -u &amp;&amp; git commit -m &#34;Update READMEs&#34;)
</span></span></span><span style="display:flex;"><span><span style="color:#98c379">        git push</span>
</span></span></code></pre></div><p>and then your <code>Makefile</code> may contain something like:</p>
<div class="highlight"><pre tabindex="0" style="color:#abb2bf;background-color:#282c34;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-make" data-lang="make"><span style="display:flex;"><span><span style="color:#61afef;font-weight:bold">readme-update</span><span style="color:#56b6c2">:</span>
</span></span><span style="display:flex;"><span>	curl https://raw.githubusercontent.com/cljoly/readme-in-static-site/main/README.md | awk -f riss.awk &gt;content/readme-in-static-site.md
</span></span></code></pre></div><p>Alternatively, you might configure your repositories to trigger a website rebuild when committing on your README, for instance using <a href="https://mainawycliffe.dev/blog/github-actions-trigger-via-webhooks/">GitHub actions webhooks</a>.</p>
<h2 id="contributions-are-welcome">Contributions are Welcome!</h2>
<p>Feel free to <a href="https://github.com/cljoly/readme-in-static-site/issues/new">open an issue</a> to discuss something or to send a PR.</p>
<p>See also the <a href="#spread-the-word">Spread the Word</a> section if you would like to make more folks aware of this script.</p>
<p>
<img alt="GitHub" loading="lazy" src="https://img.shields.io/github/license/cljoly/readme-in-static-site"></p>
]]></content:encoded></item></channel></rss>