<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.2.2">Jekyll</generator><link href="https://eoinkelly.info/feed.xml" rel="self" type="application/atom+xml" /><link href="https://eoinkelly.info/" rel="alternate" type="text/html" /><updated>2023-01-28T03:16:31+00:00</updated><id>https://eoinkelly.info/feed.xml</id><title type="html">Eoin Kelly</title><subtitle>Web development and architecture. Hopefully useful.</subtitle><entry><title type="html">Rails and pgBouncer Notes</title><link href="https://eoinkelly.info/2023/01/06/rails-and-pgbouncer-notes.html" rel="alternate" type="text/html" title="Rails and pgBouncer Notes" /><published>2023-01-06T00:00:00+00:00</published><updated>2023-01-06T00:00:00+00:00</updated><id>https://eoinkelly.info/2023/01/06/rails-and-pgbouncer-notes</id><content type="html" xml:base="https://eoinkelly.info/2023/01/06/rails-and-pgbouncer-notes.html"><![CDATA[<p><em>This post is <a href="https://github.com/eoinkelly/notes/blob/main/rails/pgbouncer-for-performance.blog.md">a snapshot of a page from learning notes</a>. I’m sharing it here in the hope that, even unpolished as it is, it is useful to you. It may be updated occasionally with improvements and corrections as I get time and learn more.</em></p>

<h1 id="rails-and-pgbouncer">Rails and pgBouncer</h1>

<p>An overview of using pgBouncer to improve the scalability of a Rails app.</p>

<ul>
  <li><a href="#rails-and-pgbouncer">Rails and pgBouncer</a>
    <ul>
      <li><a href="#sources">Sources</a></li>
      <li><a href="#background">Background</a>
        <ul>
          <li><a href="#how-many-connections-can-my-postgres-db-handle">How many connections can my Postgres DB handle?</a></li>
          <li><a href="#why-is-the-practical-maximum-no-of-connection-not-the-same-as-max_connections">Why is the practical maximum no. of connections not the same as max_connections?</a></li>
          <li><a href="#why-cant-we-just-set-the-size-of-the-activerecord-pools-to-match-our-available-db-connections">Why can’t we just set the size of the ActiveRecord pools to match our available DB connections?</a></li>
          <li><a href="#doesnt-activerecord-drop-connections-when-they-are-not-needed">Doesn’t ActiveRecord drop connections when they are not needed?</a></li>
          <li><a href="#deploys-can-temporarily-spike-the-number-of-db-connections">Deploys can temporarily spike the number of DB connections</a></li>
          <li><a href="#what-is-max_connections-set-to-in-rds">What is max_connections set to in RDS?</a></li>
          <li><a href="#what-problems-does-pgbouncer-fix">What problems does pgBouncer fix?</a></li>
          <li><a href="#what-are-the-downsides-of-pgbouncer">What are the downsides of pgBouncer?</a></li>
          <li><a href="#pgbouncer-alternative-rds-proxy">pgBouncer alternative: RDS Proxy</a></li>
          <li><a href="#pgbouncer-alternative-odyssey">pgBouncer alternative: odyssey</a></li>
          <li><a href="#pgbouncer-alternative-pgpool-ii">pgBouncer alternative: pgpool-II</a></li>
          <li><a href="#pgbouncer-alternative-pgcat">pgBouncer alternative: pgcat</a></li>
        </ul>
      </li>
      <li><a href="#pgbouncer">pgBouncer</a>
        <ul>
          <li><a href="#how-to-set-up-pgbouncer">How to set up pgBouncer</a>
            <ul>
              <li><a href="#using-pgbouncer-to-do-rudimentary-read-write-vs-read-routing">Using pgBouncer to do rudimentary read-write vs read routing</a></li>
              <li><a href="#using-pgbouncer-as-a-rudimentary-weighted-load-balancer">Using pgBouncer as a rudimentary weighted load balancer</a></li>
              <li><a href="#pgbouncer-failover">pgBouncer failover</a></li>
              <li><a href="#pgbconsole">pgbconsole</a></li>
            </ul>
          </li>
          <li><a href="#where-should-i-run-pgbouncer">Where should I run pgBouncer?</a></li>
          <li><a href="#how-do-i-choose-the-ratio-of-rails-connections-to-real-connections-nm">How do I choose the ratio of Rails connections to real connections (N:M)?</a></li>
          <li><a href="#what-is-the-optimal-number-of-db-connections-for-a-given-rds-instance-size">What is the optimal number of DB connections for a given RDS instance size?</a></li>
          <li><a href="#how-does-rds-do-routing-of-queries-to-multi-az-dbs">How does RDS do routing of queries to multi-az DBs?</a></li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<h2 id="sources">Sources</h2>

<ul>
  <li>Conversations with folks on NZ and AU Ruby Slacks</li>
  <li><a href="https://gist.github.com/Gargron/aa9341a49dc91d5a721019d9e0c9fd11">https://gist.github.com/Gargron/aa9341a49dc91d5a721019d9e0c9fd11</a>
    <ul>
      <li>Creator of Mastodon</li>
      <li>advocates for using pgBouncer</li>
    </ul>
  </li>
  <li><a href="https://github.com/brettwooldridge/HikariCP/wiki/About-Pool-Sizing">https://github.com/brettwooldridge/HikariCP/wiki/About-Pool-Sizing</a></li>
  <li><a href="https://edu.postgrespro.com/postgresql_internals-14_parts1-4_en.pdf">https://edu.postgrespro.com/postgresql_internals-14_parts1-4_en.pdf</a> - it has a whole section on locks, and the whole thing is gold.
    <ul>
      <li>page 94 of this book talks about why it has to look at each connection</li>
      <li>
        <blockquote>
          <p>lightweight locks (the non user facing kind) are currently our biggest
contributor to db load, page 275 of the book mentions it briefly it as
“unpleasant effects”. without multiplexing connections through pgbouncer we
couldn’t run at all</p>
        </blockquote>
      </li>
    </ul>
  </li>
  <li><a href="https://techcommunity.microsoft.com/t5/azure-database-for-postgresql/analyzing-the-limits-of-connection-scalability-in-postgres/ba-p/1757266">https://techcommunity.microsoft.com/t5/azure-database-for-postgresql/analyzing-the-limits-of-connection-scalability-in-postgres/ba-p/1757266</a></li>
  <li><a href="https://brandur.org/postgres-connections">https://brandur.org/postgres-connections</a></li>
  <li><a href="https://www.enterprisedb.com/postgres-tutorials/why-you-should-use-connection-pooling-when-setting-maxconnections-postgres">https://www.enterprisedb.com/postgres-tutorials/why-you-should-use-connection-pooling-when-setting-maxconnections-postgres</a></li>
  <li><a href="https://aws.amazon.com/blogs/database/performance-impact-of-idle-postgresql-connections/">https://aws.amazon.com/blogs/database/performance-impact-of-idle-postgresql-connections/</a></li>
  <li><a href="https://www.youtube.com/watch?v=9_pbEVeMEB4">https://www.youtube.com/watch?v=9_pbEVeMEB4</a> (How to tame a Mastodon talk)</li>
</ul>

<h2 id="background">Background</h2>

<h3 id="how-many-connections-can-my-postgres-db-handle">How many connections can my Postgres DB handle?</h3>

<ul>
  <li>The hard limit on number of connections is the <code class="language-plaintext highlighter-rouge">max_connections</code> setting.</li>
  <li>In theory, you can set this number based on the available RAM.</li>
  <li>In practice, Postgres performance degrades with very high numbers of connections so your practical limit may be lower than the <code class="language-plaintext highlighter-rouge">max_connections</code> limit.</li>
</ul>
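
<p>You can check both the configured limit and the current usage from <code class="language-plaintext highlighter-rouge">psql</code> (a quick sketch, nothing pgBouncer-specific):</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- the hard limit
SHOW max_connections;

-- how many connections exist right now (including idle ones)
SELECT count(*) FROM pg_stat_activity;
</code></pre></div></div>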

<blockquote>
  <p>New rule of thumb: If you have to set postgres max_connections to above 512, don’t.</p>

  <p><a href="https://hazelweakly.me/blog/scaling-mastodon/">https://hazelweakly.me/blog/scaling-mastodon/</a></p>
</blockquote>

<p>No evidence is cited for the number above.</p>

<blockquote>
  <p>While it is possible to have a few thousand established connections without
running into problems, there are some real and hard-to-avoid problems</p>

  <p><a href="https://techcommunity.microsoft.com/t5/azure-database-for-postgresql/analyzing-the-limits-of-connection-scalability-in-postgres/ba-p/1757266">https://techcommunity.microsoft.com/t5/azure-database-for-postgresql/analyzing-the-limits-of-connection-scalability-in-postgres/ba-p/1757266</a></p>
</blockquote>

<p>The article above is primarily an argument for PG improving snapshot scalability
to better handle large numbers of connections.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>TODO: read this article properly, some interesting details in there
</code></pre></div></div>

<blockquote>
  <p>most users find PostgreSQL’s default of max_connections = 100 to be too low</p>

  <p>…</p>

  <p>Talk to any PostgreSQL expert out there, and they’ll give you a range, “a few
hundred,” or some will flat-out say, “not more than 500,” and “definitely no
more than 1000.”</p>

  <p>But where do these numbers come from? How do they know that, and how do we
calculate that? Ask these questions, and you’ll only find yourself more
frustrated, because there isn’t a formulaic way to determine that number.</p>

  <p>…</p>

  <p>So it seems that for this server, the sweet spot was really somewhere between
300-400 connections, and max_connections should not be set much higher than
that, lest we risk forfeiting performance.</p>

  <p>So for this server that I’ve set up to be similar to some enterprise-grade
machines, the optimal performance was when there were 300-500 concurrent
connections. After 700, performance dropped precipitously (both in terms of
transactions-per-second and latency). Anything above 1000 connections performed
poorly, along with an ever-increasing latency. Towards the end, the latency
starts to be non-linear</p>

  <p><a href="https://www.enterprisedb.com/postgres-tutorials/why-you-should-use-connection-pooling-when-setting-maxconnections-postgres">https://www.enterprisedb.com/postgres-tutorials/why-you-should-use-connection-pooling-when-setting-maxconnections-postgres</a></p>
</blockquote>

<p>The above article ran PG on a really beefy machine and still the optimal was in the 300-500 connection range.</p>

<blockquote>
  <p>our largest RDS postgres instance typically sits at around 1200-1300 open
connections at peak and it works ok day to day. It’s definitely possible to have
500+ connections with a beefy server</p>

  <p>james.healy on Ruby AU slack</p>
</blockquote>

<p>So there is some evidence that over 500 connections can work fine.</p>

<h3 id="why-is-the-practical-maximum-no-of-connection-not-the-same-as-max_connections">Why is the practical maximum no. of connections not the same as max_connections?</h3>

<ul>
  <li>Sources:
    <ul>
      <li><a href="https://brandur.org/postgres-connections">https://brandur.org/postgres-connections</a></li>
    </ul>
  </li>
  <li>You can create an instance with enough RAM that the memory ceiling isn’t the limiting factor for Postgres</li>
  <li>Each connection to Postgres spawns a process
    <ul>
      <li>Process starts at a few MB (I have read 1.5 - 5MB as a starting point) but may grow a lot larger</li>
      <li>I have observed each PG connection process taking 15-20MB on one project</li>
    </ul>
  </li>
  <li>the Postmaster and its backend processes use shared memory for communication, and parts of that shared space are global bottlenecks.</li>
  <li><a href="https://brandur.org/postgres-connections">https://brandur.org/postgres-connections</a> says that connection numbers above 500 are problematic.</li>
  <li>The performance of even simple postgres tasks (simple INSERT, SELECT, DELETE) degrades the more backend processes Postgres has open</li>
  <li>So even if it doesn’t yet create “a problem”, it does slow down your system even when everything is working well</li>
  <li><a href="https://aws.amazon.com/premiumsupport/knowledge-center/rds-mysql-max-connections/">https://aws.amazon.com/premiumsupport/knowledge-center/rds-mysql-max-connections/</a> - has some best practices for Postgres as well as MySQL
    <ul>
      <li>=&gt; they recommend increasing instance size first</li>
    </ul>
  </li>
  <li>Aside: <a href="https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/AuroraPostgreSQL.Managing.html">https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/AuroraPostgreSQL.Managing.html</a> is docs on how Aurora does this</li>
  <li><a href="https://aws.amazon.com/blogs/database/performance-impact-of-idle-postgresql-connections/">https://aws.amazon.com/blogs/database/performance-impact-of-idle-postgresql-connections/</a></li>
  <li><a href="https://aws.amazon.com/blogs/database/resources-consumed-by-idle-postgresql-connections/">https://aws.amazon.com/blogs/database/resources-consumed-by-idle-postgresql-connections/</a>
    <ul>
      <li>notice that once a process takes memory it stays allocated to that process until the process is closed by killing the connection</li>
      <li>CPU utilisation does go up as the connection count goes up
        <blockquote>
          <p>The utilization increased to 2% with 100 idle connections, increased to 3%
with 500 idle connections, increased to 5% with 1,000 idle connections,
increased to 6% with 1,500 idle connections and increased to 8% with 2,000
idle. Note that this utilization is for an instance with 2 vCPUs</p>
        </blockquote>
      </li>
    </ul>
  </li>
</ul>
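
<p>A quick way to see how many of your connections are actually doing work vs. sitting idle (a sketch using the standard <code class="language-plaintext highlighter-rouge">pg_stat_activity</code> view):</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code>-- breakdown of connections by state (active, idle, idle in transaction etc.)
SELECT state, count(*)
FROM pg_stat_activity
WHERE pid &lt;&gt; pg_backend_pid()
GROUP BY state
ORDER BY count(*) DESC;
</code></pre></div></div>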

<h3 id="why-cant-we-just-set-the-size-of-the-activerecord-pools-to-match-our-available-db-connections">Why can’t we just set the size of the ActiveRecord pools to match our available DB connections?</h3>

<p>Short answer: load balancing is never completely fair.</p>

<p>Long answer:</p>

<ul>
  <li>Rails connection pools are per individual process
    <ul>
      <li>which means you have a large number of small pools which grow with the number of processes you run.</li>
    </ul>
  </li>
  <li>From <a href="https://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/ConnectionPool.html">https://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/ConnectionPool.html</a>
    <ul>
      <li>The idle timeout defaults to 5m (300 sec)</li>
      <li>ActiveRecord does not release connections fast enough to be of any use in real production scenarios</li>
    </ul>
  </li>
</ul>

<p>Running out of connections is bad. Load balancers can rarely distribute work in a way that is totally fair from the DB-usage point of view, because work durations vary - i.e. different requests need to hold a connection from the pool for different lengths of time.</p>

<p>This means that load balancing can never be fully fair. An individual instance might get an unfair allocation of requests, which could cause it to run out of connections even while other Rails processes sit idle. We solve this by allocating more connections than we actually have, assuming that all our instances will not use them at the same time. But if the system is fully loaded, they will.</p>

<p>In other words, we created a “system at full load” problem by solving the “individual nodes run out of connections at sub-full load” problem.</p>
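
<p>A rough worked example of how the per-process pools multiply (all numbers invented for illustration):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>10 app instances
 x 4 Puma workers per instance
 x 5 threads per worker (pool size 5)
 = 200 potential DB connections

...yet one unluckily-loaded worker can exhaust its own pool of 5
while most of the other 195 connections sit idle.
</code></pre></div></div>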

<h3 id="doesnt-activerecord-drop-connections-when-they-are-not-needed">Doesn’t ActiveRecord drop connections when they are not needed?</h3>

<ul>
  <li>Yes, but it doesn’t do it quickly enough to be of any use in production environments.</li>
  <li>From <a href="https://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/ConnectionPool.html">https://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/ConnectionPool.html</a>
    <ul>
      <li>The idle timeout defaults to 5m (300 sec)</li>
      <li>ActiveRecord does not release connections fast enough to be of any use in real production scenarios</li>
    </ul>
  </li>
</ul>
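
<p>These knobs live in <code class="language-plaintext highlighter-rouge">config/database.yml</code> if you want to experiment, though even aggressive values don’t really change the picture. A sketch (the defaults shown come from the ConnectionPool docs):</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>production:
  adapter: postgresql
  pool: 5
  idle_timeout: 300      # seconds a connection may sit idle before removal (default 300)
  reaping_frequency: 60  # how often the reaper looks for idle/dead connections (default 60)
</code></pre></div></div>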

<h3 id="deploys-can-temporarily-spike-the-number-of-db-connections">Deploys can temporarily spike the number of DB connections</h3>

<p>Even if your system is normally stable, deploys can cause spikes.
If your deploy creates new Rails processes/instances before killing old ones then you will get a temporary spike in DB connections.</p>

<h3 id="what-is-max_connections-set-to-in-rds">What is max_connections set to in RDS?</h3>

<ul>
  <li><code class="language-plaintext highlighter-rouge">max_connections</code> may not be our real limit but it is still important</li>
  <li>All DB providers set <code class="language-plaintext highlighter-rouge">max_connections</code> carefully. The value will depend on the instance size.</li>
  <li>From the docs
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># From: &lt;https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_Limits.html&gt;</span>
AWS RDS Postgres max_connections:
  Allowed values: 6 - 8388607
  Default value: LEAST<span class="o">({</span>DBInstanceClassMemory/9531392<span class="o">}</span>, 5000<span class="o">)</span>
</code></pre></div>    </div>
  </li>
  <li>Note the formula above implies that the maximum <code class="language-plaintext highlighter-rouge">max_connections</code> for Postgres on <strong>any RDS instance</strong> is 5000</li>
  <li>About <code class="language-plaintext highlighter-rouge">DBInstanceClassMemory</code>
    <ul>
      <li>measured in bytes</li>
      <li>the memory available to the DB instance minus stuff required for OS etc.
        <ul>
          <li>=&gt; it is <strong>not the same as the memory available for the given instances class</strong></li>
        </ul>
      </li>
      <li>I have not found a way to read <code class="language-plaintext highlighter-rouge">DBInstanceClassMemory</code> :-(</li>
    </ul>
  </li>
  <li>The only way I know of to find out <code class="language-plaintext highlighter-rouge">max_connections</code> is to spin up a DB of the given size and <code class="language-plaintext highlighter-rouge">show max_connections</code> in psql.</li>
  <li>You can override <code class="language-plaintext highlighter-rouge">max_connections</code> in the parameter group but probably shouldn’t unless you are super confident you know wtf you are doing.</li>
  <li>Some stuff I found on the web where people had done this:
    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  Eoin: I haven't verified this:
  &lt;https://serverfault.com/a/982173&gt;
  Actual info for Postgresql t3-instances (default.postgres10 parameter group):
  db.t3.micro - 112 max_connections
  db.t3.small - 225 max_connections
  db.t3.medium - 450 max_connections
  db.t3.large - 901 max_connections
  db.t3.xlarge - 1802 max_connections
  db.t3.2xlarge - 3604 max_connections
  Its similar for default.postgres9 and default.postgres11

  &lt;https://gist.github.com/guizmaii/1cacffef793c2ba9645083c3e18b3d8c&gt;
  So, here are the values I got when I ran the SQL commmand: `show max_connections;` in some RDS instances:
  | Instance type | RAM (GB) | max_connections |
  | ------------- | -------- | --------------- |
  | db.t2.small   | 2        | 198             |
  | db.t2.medium  | 4        | 413             |
  | db.t2.large   | 8        | 856             |
  | db.m4.large   | 8        | 856             |
  | db.r4.large   | 15.25    | 1660            |
</code></pre></div>    </div>
  </li>
</ul>

<h3 id="what-problems-does-pgbouncer-fix">What problems does pgBouncer fix?</h3>

<p>Introducing pgBouncer does have a latency cost but it solves the following problems:</p>

<ol>
  <li>The DB just gets slower as it has more connections</li>
  <li>ActiveRecord per-process pools interact with slightly unfair load balancing to mean that some processes run out of DB connections.</li>
  <li>Deploys can cause a big spike in the number of Rails processes which in turn causes a spike in the number of DB connections.</li>
</ol>

<h3 id="what-are-the-downsides-of-pgbouncer">What are the downsides of pgBouncer?</h3>

<ul>
  <li>You can’t run SQL commands which would change the global state of the connection
    <ul>
      <li>In particular, you can’t use prepared statements (when running in transaction mode which is almost certainly how you’ll want to configure it)</li>
    </ul>
  </li>
  <li>Adds latency</li>
  <li>Additional complexity</li>
  <li>More stuff to maintain</li>
  <li>Security implications - the pooler needs to be secured too, creds need to be managed, https etc.</li>
  <li>You may need server(s) to run the pooler(s) - servers cost money and need patching etc.</li>
</ul>
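
<p>For a Rails app behind pgBouncer in transaction mode, the connection config typically needs to disable per-connection state. A sketch (the host name is hypothetical; port 6432 is pgBouncer’s conventional default):</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>production:
  adapter: postgresql
  host: pgbouncer.internal   # hypothetical pgBouncer host
  port: 6432
  prepared_statements: false # prepared statements are per-connection state
  advisory_locks: false      # advisory locks are too, so they must go as well
</code></pre></div></div>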

<h3 id="pgbouncer-alternative-rds-proxy">pgBouncer alternative: RDS Proxy</h3>

<blockquote>
  <p>RDS Proxy is a fully managed, highly available database proxy that uses
connection pooling to share database connections securely and efficiently
<a href="https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/rds-proxy.html">https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/rds-proxy.html</a></p>
</blockquote>

<p>Price is $0.015 per vCPU-hour. Pricing seems to increase based on the number of vCPUs in your DB: larger DB =&gt; larger RDS proxy bill</p>

<p>I have no data points on RDS Proxy except that its PG version can lag RDS’s PG version, which in turn lags the official PG release.
As of end 2022, PG 15 is the latest release; RDS supports PG 14, as does RDS Proxy, so this may not be an issue anymore.</p>

<h3 id="pgbouncer-alternative-odyssey">pgBouncer alternative: odyssey</h3>

<ul>
  <li><a href="https://github.com/yandex/odyssey">https://github.com/yandex/odyssey</a>
    <ul>
      <li>multi-threaded unlike pgBouncer</li>
    </ul>
  </li>
</ul>

<h3 id="pgbouncer-alternative-pgpool-ii">pgBouncer alternative: pgpool-II</h3>

<ul>
  <li><a href="https://www.pgpool.net/mediawiki/index.php/Main_Page">https://www.pgpool.net/mediawiki/index.php/Main_Page</a></li>
  <li>also handles replication and load balancing but has a reputation for being a bit more heavyweight than pgBouncer</li>
  <li><a href="https://www.enterprisedb.com/blog/pgpool-vs-pgbouncer">https://www.enterprisedb.com/blog/pgpool-vs-pgbouncer</a>
    <blockquote>
      <p>In typical scenarios, PgBouncer executes pooling correctly “out of the box,”
whereas Pgpool-II requires fine-tuning of certain parameters for ideal
performance and functionality</p>

      <p>…</p>

      <p>Pgpool-II is often implemented by organizations because of its added
capabilities, but that doesn’t necessarily make Pgpool-II the ideal choice for
all use cases. Many perceive Pgpool-II as an end-all solution, but in reality,
PgBouncer is often a better solution for scenarios where bringing down database
connections is key</p>
    </blockquote>
  </li>
</ul>

<h3 id="pgbouncer-alternative-pgcat">pgBouncer alternative: pgcat</h3>

<ul>
  <li><a href="https://github.com/levkk/pgcat">https://github.com/levkk/pgcat</a></li>
  <li>new, not battle tested yet</li>
  <li>written in rust</li>
  <li>does sharding too so you can grow into sharding if need be</li>
</ul>

<h2 id="pgbouncer">pgBouncer</h2>

<h3 id="how-to-set-up-pgbouncer">How to set up pgBouncer</h3>

<ul>
  <li>Remember that pgBouncer is single threaded</li>
  <li>You have 3 possible options:
    <ol>
      <li>Session pooling
        <ul>
          <li>Real DB connections are assigned when the client opens a session and closed when the session is closed</li>
          <li>This is not very effective with a Rails app</li>
        </ul>
      </li>
      <li>Transaction pooling
        <ul>
          <li>A real DB connection is assigned for the duration of a transaction</li>
          <li>Best choice for Rails</li>
          <li>You cannot make “global” changes to the connection e.g. prepared statements, pub/sub
            <ul>
              <li>Does this mean that PG based background job managers would be a problem?</li>
            </ul>
          </li>
          <li><strong>You cannot use prepared statements with this</strong> - watch out if you are writing your own raw SQL</li>
        </ul>
      </li>
      <li>Statement pooling (not viable)
        <ul>
          <li>you would have to avoid using transactions entirely, which means it’s not really viable</li>
        </ul>
      </li>
    </ol>
  </li>
  <li>Heroku has a buildpack which implements a node level pgBouncer by default
    <ul>
      <li>pgBouncer on each node is good but not as good as a single shared pgBouncer.</li>
    </ul>
  </li>
</ul>
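
<p>A minimal <code class="language-plaintext highlighter-rouge">pgbouncer.ini</code> sketch for transaction pooling (hostnames and numbers are illustrative, not recommendations):</p>

<div class="language-ini highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[databases]
myapp = host=db.internal port=5432 dbname=myapp_production

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 1000   ; app-side connections pgBouncer will accept
default_pool_size = 20   ; real Postgres connections per database/user pair
</code></pre></div></div>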

<blockquote>
  <p>In general, a single PgBouncer can process up to 10,000 connections. 1,000 or
so can be active at one time. The exact numbers will depend on your
configuration and the amount of data it is copying between the database and
the application.</p>

  <p><a href="https://www.crunchydata.com/blog/postgres-at-scale-running-multiple-pgbouncers">https://www.crunchydata.com/blog/postgres-at-scale-running-multiple-pgbouncers</a></p>
</blockquote>

<p>You can use systemd to run multiple instances of pgBouncer (one per vCPU on your box) - see
<a href="https://www.crunchydata.com/blog/postgres-at-scale-running-multiple-pgbouncers">https://www.crunchydata.com/blog/postgres-at-scale-running-multiple-pgbouncers</a>.
This uses <code class="language-plaintext highlighter-rouge">SO_REUSEPORT</code> in the Linux kernel plus systemd to run multiple pgBouncer processes.</p>
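
<p>The key pieces are pgBouncer’s <code class="language-plaintext highlighter-rouge">so_reuseport</code> setting (available since pgBouncer 1.12) plus a systemd template unit, roughly:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># in pgbouncer.ini - lets multiple processes bind the same port
so_reuseport = 1

# then, given a pgbouncer@.service template unit:
systemctl start pgbouncer@1 pgbouncer@2
</code></pre></div></div>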

<p>pgBouncer creates a virtual <code class="language-plaintext highlighter-rouge">pgbouncer</code> database which you access via <code class="language-plaintext highlighter-rouge">psql</code> just like any other DB.</p>

<p>pgBouncer has the notion of users which can have different limits applied.</p>

<ul>
  <li>users in the admin users list can do everything</li>
  <li>users in the stats users list can view stats</li>
</ul>

<p>You can use this to lock down some apps more tightly than others if they are at risk of overwhelming the DB.</p>
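
<p>Connecting to the admin console looks like connecting to any other database (this assumes pgBouncer is listening on its default port 6432 and your user is listed in <code class="language-plaintext highlighter-rouge">admin_users</code> or <code class="language-plaintext highlighter-rouge">stats_users</code>):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ psql -h 127.0.0.1 -p 6432 -U admin_user pgbouncer

pgbouncer=# SHOW POOLS;    -- client/server connection counts per pool
pgbouncer=# SHOW STATS;    -- request and traffic statistics
pgbouncer=# SHOW CLIENTS;  -- currently connected clients
</code></pre></div></div>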

<h4 id="using-pgbouncer-to-do-rudimentary-read-write-vs-read-routing">Using pgBouncer to do rudimentary read-write vs read routing</h4>

<p>You can combine pgBouncer’s aliasing of databases with the SO_REUSEPORT trick of running multiple pgBouncer processes to achieve some more advanced outcomes. Note that if you need these outcomes, one of the alternatives to pgBouncer might be a better fit - these are somewhat clever hacks.</p>

<ul>
  <li>pgBouncer creates alias DB names which are mapped to real DBs on real PG servers</li>
  <li>You can use this aliasing to have a “readwrite” (or similarly named) DB which points only at your read+write primary and a “readonly” DB which points only at a follower DB
    <ul>
      <li>From the app’s POV there are two different databases.</li>
    </ul>
  </li>
  <li>Incoming connections are spread across the pgBouncer processes on a roughly round-robin basis. You can use this to do load balancing if you configure each pgBouncer process to point at different databases e.g.
    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pgBouncer@1
  readwrite: primary
  readonly: standby1
pgBouncer@2
  readwrite: primary
  readonly: standby2
</code></pre></div>    </div>
  </li>
  <li>Connections to pgBouncer will get either @1 or @2 on a round-robin basis</li>
  <li>This means that the app connecting to <code class="language-plaintext highlighter-rouge">readonly</code> will get either standby1 or standby2 on a round-robin basis</li>
  <li>Note that the Linux kernel’s round-robin distribution of connections across SO_REUSEPORT processes is not perfect and can be a bit skewed</li>
</ul>
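
<p>The corresponding <code class="language-plaintext highlighter-rouge">[databases]</code> sections might look like this (hostnames are illustrative):</p>

<div class="language-ini highlighter-rouge"><div class="highlight"><pre class="highlight"><code>; pgbouncer@1's config
[databases]
readwrite = host=primary.internal dbname=myapp
readonly  = host=standby1.internal dbname=myapp

; pgbouncer@2's config
[databases]
readwrite = host=primary.internal dbname=myapp
readonly  = host=standby2.internal dbname=myapp
</code></pre></div></div>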

<h4 id="using-pgbouncer-as-a-rudimentary-weighted-load-balancer">Using pgBouncer as a rudimentary weighted load balancer</h4>

<p>You can tune the round-robin distribution of load by adding more pgBouncer processes.</p>

<p>Imagine that standby2 is much beefier than standby1 - we can control how much load it gets by having more pgBouncer processes target it e.g.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pgBouncer@1
  readwrite: primary
  readonly: standby1
pgBouncer@2
  readwrite: primary
  readonly: standby2
pgBouncer@3
  readwrite: primary
  readonly: standby2
</code></pre></div></div>

<h4 id="pgbouncer-failover">pgBouncer failover</h4>

<p>The app connects to pgBouncer. pgBouncer connects to the DB server(s).
If a connection to the DB server goes down, pgBouncer will not terminate the app’s connection but will try to find another available DB connection.
This gives you failover if one of your DB servers goes down.</p>

<h4 id="pgbconsole">pgbconsole</h4>

<ul>
  <li>separate to pgBouncer</li>
  <li>a <code class="language-plaintext highlighter-rouge">top</code>-like resource monitor for pgBouncer</li>
  <li>can manage multiple pgBouncer instances</li>
</ul>

<h3 id="where-should-i-run-pgbouncer">Where should I run pgBouncer?</h3>

<p>Options are:</p>

<ol>
  <li>Run pgBouncer on the same instance as the Rails app
    <ul>
      <li>it seems common to start here</li>
      <li>++ you get to multiplex N Rails connections to M real DB connections</li>
      <li>++ this works especially well if the DB shares an instance with the Rails app, because in that case you really do care about the memory the connections consume (on a well-configured dedicated DB host you will normally hit other PG performance issues before you run out of memory)</li>
      <li>– the no. of real DB connections scales with the number of instances</li>
    </ul>
  </li>
  <li>Run pgBouncer on dedicated instances
    <ul>
      <li>can be a single pgBouncer cluster or multiple clusters for different workloads with different N:M ratios</li>
      <li>teams who start with pgBouncer on the app instances often move to dedicated pgBouncer instance(s) eventually</li>
    </ul>
  </li>
  <li>run multiple pgBouncers for different parts of your load (e.g. background jobs, puma etc.)
    <ul>
      <li>lets you tailor how you allocate real DB connections</li>
    </ul>
  </li>
</ol>

<h3 id="how-do-i-choose-the-ratio-of-rails-connections-to-real-connections-nm">How do I choose the ratio of Rails connections to real connections (N:M)?</h3>

<p>Teams use pgBouncer when they need to increase the number of Rails processes/threads and they cannot add more real DB connections without significant slowdowns.</p>

<ul>
  <li>Teams seem to grow into choosing the ratio.</li>
  <li>They hit the limits of scaling without pgBouncer and then start using it.</li>
  <li>They seem to push the ratio as far as possible for their workload</li>
  <li>Some examples
    <ul>
      <li>pgBouncer on instance with Rails app. Turns &gt; 50 Puma threads into &lt; 10 real DB connections (approx. 5:1 ratio)</li>
      <li>pgBouncer on a dedicated instance: Turning 30k Rails connections into 1500 real DB connections (approx. 20:1 ratio)</li>
    </ul>
  </li>
</ul>
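
<p>As a sketch of how a chosen ratio maps onto pgBouncer settings (all numbers invented): 30 app servers each running 2 Puma workers × 5 threads is 300 Rails-side connections; at a 5:1 ratio pgBouncer would hold roughly 60 real connections:</p>

<div class="language-ini highlighter-rouge"><div class="highlight"><pre class="highlight"><code>max_client_conn = 300    ; N: Rails-side connections accepted
default_pool_size = 60   ; M: real connections per database/user pair
; effective ratio N:M = 5:1 (assuming a single database/user pair)
</code></pre></div></div>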

<h3 id="what-is-the-optimal-number-of-db-connections-for-a-given-rds-instance-size">What is the optimal number of DB connections for a given RDS instance size?</h3>

<ul>
  <li>Alternative wording: At what number of DB connections does a connection pooler like pgBouncer become useful?</li>
  <li>Alternative wording: Does pgBouncer make any sense for apps with &lt;100 DB connections?
    <ul>
      <li>(my instinct here is probably not but I can’t find any evidence either way)</li>
    </ul>
  </li>
  <li>Obviously this is workload dependent but are there any useful heuristics?</li>
  <li>In theory, your instance can only run as many parallel processes as it has CPU cores.</li>
  <li>If an app is making 50-100 DB connections, is there benefit to adding pgBouncer?</li>
  <li><a href="https://docs.joinmastodon.org/admin/scaling/#pgbouncer-why">https://docs.joinmastodon.org/admin/scaling/#pgbouncer-why</a>
    <ul>
      <li>configures its PG to top out at 100 connections and suggests you use pgBouncer beyond that?</li>
    </ul>
  </li>
  <li><a href="https://aws.amazon.com/blogs/database/resources-consumed-by-idle-postgresql-connections/">https://aws.amazon.com/blogs/database/resources-consumed-by-idle-postgresql-connections/</a>
    <ul>
      <li>
        <p>hints that for long-running queries, dropping from 100 concurrent connections to 20 can make a 2 vCPU DB faster</p>

        <p>TODO this will require some experimentation</p>
      </li>
    </ul>
  </li>
</ul>

<h3 id="how-does-rds-do-routing-of-queries-to-multi-az-dbs">How does RDS do routing of queries to multi-az DBs?</h3>

<p>It depends on how many standby instances you have.</p>

<ul>
  <li>1 standby instance: all queries go to the primary.
    <ul>
      <li>The standby is a “hidden” replica</li>
    </ul>
  </li>
  <li>2 standby instances: separate read-only and read-write endpoints
    <ul>
      <li>read-write hits your primary</li>
      <li>read-only hits the standby instances</li>
    </ul>
  </li>
</ul>
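
<p>In a Rails app, the two-endpoint setup maps naturally onto Rails 6+ multiple-database roles. A sketch of the relevant <code class="language-plaintext highlighter-rouge">config/database.yml</code> — the endpoint hostnames are made-up placeholders:</p>

```yaml
# Sketch only: hostnames are hypothetical placeholders for your RDS
# cluster's read-write and read-only endpoints.
production:
  primary:
    adapter: postgresql
    database: myapp_production
    host: myapp.cluster-abc123.ap-southeast-2.rds.amazonaws.com # read-write endpoint
  primary_replica:
    adapter: postgresql
    database: myapp_production
    host: myapp.cluster-ro-abc123.ap-southeast-2.rds.amazonaws.com # read-only endpoint
    replica: true
```

<p>With that in place, declaring <code class="language-plaintext highlighter-rouge">connects_to database: { writing: :primary, reading: :primary_replica }</code> in <code class="language-plaintext highlighter-rouge">ApplicationRecord</code> lets Rails route reads and writes to the matching endpoint.</p>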

<p>Source: <a href="https://aws.amazon.com/rds/features/multi-az/">https://aws.amazon.com/rds/features/multi-az/</a></p>]]></content><author><name></name></author><summary type="html"><![CDATA[This post is a snapshot of a page from learning notes. I’m sharing it here in the hope that, even unpolished as it is, it is useful to you. It may be updated occasionally with improvements and corrections as I get time and learn more.]]></summary></entry><entry><title type="html">Microservices are not your starting point</title><link href="https://eoinkelly.info/microservices/architecture/2022/12/19/microservices-are-not-your-starting-point.html" rel="alternate" type="text/html" title="Microservices are not your starting point" /><published>2022-12-19T20:13:28+00:00</published><updated>2022-12-19T20:13:28+00:00</updated><id>https://eoinkelly.info/microservices/architecture/2022/12/19/microservices-are-not-your-starting-point</id><content type="html" xml:base="https://eoinkelly.info/microservices/architecture/2022/12/19/microservices-are-not-your-starting-point.html"><![CDATA[<p>I have been doing some holiday reading on:</p>

<ol>
  <li>Edge computing (CloudFlare workers, Fastly Compute@Edge, misc. WebAssembly thingies)</li>
  <li>Serverless as it exists in 2022</li>
  <li>Jamstack</li>
</ol>

<p>Collectively, these technologies feel like “the future” in the minds of many devs because so much of the content we consume tells us so (circa 2022).</p>

<p>The marketing of these technologies has an unstated assumption that your team will use a microservices architecture because if you don’t then, well, they don’t have anything to sell you. So implicitly <em>“microservices are the future”</em> too.</p>

<p>I’m not mad at the marketing/dev-rel/early-adopter folks out there just doing their jobs, but the rounded-off version of this message is <em>“We should start with microservices because it’s the future”</em>. Anecdotally from interactions with clients this year and the content I consume in the dev space, this idea seems to have a lot of traction right now. I am unreasonably annoyed by this because this idea is dead wrong.</p>

<p>Of course microservices architectures have their place, but know that they are mostly a technical way to solve social problems.
<strong>The complexity cost of a microservices approach is really high</strong> (nobody disagrees on this) but in some situations that cost can be worth it, e.g.</p>

<ul>
  <li><em>“we have so many engineers working on the same monolith that merging anything is a nightmare”</em></li>
  <li><em>“Our server load is really expensive <strong>and</strong> really spiky”</em></li>
</ul>

<p>Or let me put it another way: if your small team of devs (&lt;20) chooses microservices, you will almost certainly waste a huge amount of engineering time (aka money) solving problems you would not have had with a more monolithic approach.</p>

<p>This isn’t a unique insight by any means. I just want to highlight the link between these <em>“things that are supposedly the future”</em> and microservices.</p>

<p>Sam Newman’s <em>Building Microservices 2nd Edition</em> (Basically <strong>the</strong> book on microservices) says:</p>

<blockquote>
  <p>Microservices aren’t without significant downsides, though. As a distributed
system, they bring a host of complexity, much of which may be new even to
experienced developers.</p>
</blockquote>

<blockquote>
  <p>Microservices have become, for many, the default architectural choice. This is
something that I think is hard to justify, and I wanted a chance to share why.</p>
</blockquote>

<p>And these are some tweets on the topic which I found insightful:</p>

<blockquote>
  <p>I’m happy to see a lot more open skepticism about microservices in the last
year. They’re a bad match for most systems, but programmers (and especially
programmers’ career aspirations) love complexity.
<a href="https://twitter.com/garybernhardt/status/1603494283935637505">https://twitter.com/garybernhardt/status/1603494283935637505</a></p>
</blockquote>

<blockquote>
  <p>Just as with microkernels, the justifications for microservices often stem
from the inability to conceptualize compile-time modularity as distinct from
runtime modularity.
<a href="https://twitter.com/Ngnghm/status/1603605738890444800?s=20&amp;t=fqc4tHV6ShPGN71qadTOGA">https://twitter.com/Ngnghm/status/1603605738890444800?s=20&amp;t=fqc4tHV6ShPGN71qadTOGA</a></p>
</blockquote>

<blockquote>
  <p>Microservices are a technical solution to an organizational problem of having
too large an engineering team with too many interdependencies.
It optimizes to reduce number of meetings by replacing them with service
interfaces.
Once you internalize this, you will know peace.
<a href="https://twitter.com/Carnage4Life/status/1592958912940380160">https://twitter.com/Carnage4Life/status/1592958912940380160</a></p>
</blockquote>

<p>So what to do? Just be aware that vendors are selling what they have available in 2022, not what is necessarily best for the long term health of your app.</p>

<p>This space is evolving quickly and, over time, I expect vendors will be able to offer greater flexibility.</p>

<p>While “serverless” isn’t exactly a well-defined term, I don’t think it inherently has to be limited to microservices-style architectures. Arguably, something like ECS Fargate gets you there without dictating any architectural patterns to you.</p>]]></content><author><name></name></author><category term="microservices" /><category term="architecture" /><summary type="html"><![CDATA[I have been doing some holiday reading on:]]></summary></entry><entry><title type="html">Which ruby versions does Rails support?</title><link href="https://eoinkelly.info/ruby/rails/2022/04/23/which-ruby-versions-does-rails-support.html" rel="alternate" type="text/html" title="Which ruby versions does Rails support?" /><published>2022-04-23T18:13:28+00:00</published><updated>2022-04-23T18:13:28+00:00</updated><id>https://eoinkelly.info/ruby/rails/2022/04/23/which-ruby-versions-does-rails-support</id><content type="html" xml:base="https://eoinkelly.info/ruby/rails/2022/04/23/which-ruby-versions-does-rails-support.html"><![CDATA[<p>The <a href="https://guides.rubyonrails.org/upgrading_ruby_on_rails.html#ruby-versions">official Rails upgrading guides</a> will usually describe their Ruby version support as <em>X.Y or newer</em>. In some situations, I want to verify which Ruby versions the Rails version is actually <strong>tested</strong> against e.g.</p>

<ul>
  <li>I am upgrading an old Rails app and want to find out where the Ruby version support stops</li>
  <li>I am considering upgrading to a <em>very</em> new version of Ruby and want to verify that Rails is actually testing against it.</li>
</ul>

<p>This is what I currently do:</p>

<ol>
  <li>Go to <a href="https://github.com/rails/rails">https://github.com/rails/rails</a> and find the name of the branch which corresponds to my rails version - it’ll be something like <code class="language-plaintext highlighter-rouge">7-0-stable</code> or <code class="language-plaintext highlighter-rouge">6-1-stable</code></li>
  <li>Find the CI page for that branch by filling the branch name into <code class="language-plaintext highlighter-rouge">https://buildkite.com/rails/rails/builds?branch=&lt;YOUR_BRANCH_NAME&gt;</code> e.g.
    <ul>
      <li><a href="https://buildkite.com/rails/rails/builds?branch=7-0-stable">https://buildkite.com/rails/rails/builds?branch=7-0-stable</a></li>
      <li><a href="https://buildkite.com/rails/rails/builds?branch=6-0-stable">https://buildkite.com/rails/rails/builds?branch=6-0-stable</a></li>
      <li><a href="https://buildkite.com/rails/rails/builds?branch=5-0-stable">https://buildkite.com/rails/rails/builds?branch=5-0-stable</a></li>
    </ul>
  </li>
  <li>You can see which Ruby versions are tested in the summary of each build
 <img src="/assets/posts/rails_ci.png" alt="Rails CI snapshot" /></li>
</ol>]]></content><author><name></name></author><category term="ruby" /><category term="rails" /><summary type="html"><![CDATA[The official Rails upgrading guides will usually describe their Ruby version support as X.Y or newer. In some situations, I want to verify which Ruby versions the Rails version is actually tested against e.g.]]></summary></entry><entry><title type="html">Why I don’t use the ‘except’ and ‘only’ params to Rails before_action</title><link href="https://eoinkelly.info/ruby/rails/2021/04/17/rails-before-action-only-except.html" rel="alternate" type="text/html" title="Why I don’t use the ‘except’ and ‘only’ params to Rails before_action" /><published>2021-04-17T20:13:28+00:00</published><updated>2021-04-17T20:13:28+00:00</updated><id>https://eoinkelly.info/ruby/rails/2021/04/17/rails-before-action-only-except</id><content type="html" xml:base="https://eoinkelly.info/ruby/rails/2021/04/17/rails-before-action-only-except.html"><![CDATA[<p>A <code class="language-plaintext highlighter-rouge">before_action</code> that runs before <strong>every</strong> action is a useful pattern which helps us avoid forgetting to call some important piece of code before every action in the controller e.g. authentication or authorization.</p>

<p>However, I think that, on balance, <code class="language-plaintext highlighter-rouge">before_action</code> with <code class="language-plaintext highlighter-rouge">only:</code> or <code class="language-plaintext highlighter-rouge">except:</code> params does more harm than good to a codebase.</p>

<p>Reasonable people disagree on this, e.g. the Rails scaffold generator uses <code class="language-plaintext highlighter-rouge">before_action</code> with <code class="language-plaintext highlighter-rouge">only</code>/<code class="language-plaintext highlighter-rouge">except</code> for setting instance vars.</p>

<p>I’ll make my case below so you can make up your own mind about it.</p>

<h3 id="pro-less-typing-shorter-methods">Pro: Less typing, shorter methods</h3>

<p>You save a few characters of typing and your methods are a few lines shorter which might make rubocop happier.</p>

<h3 id="con-it-scales-badly">Con: It scales badly</h3>

<p>It doesn’t set a good example. Codebases tend to get more of what they have. The next dev who comes along will tend to do more of what is already there.</p>

<p><code class="language-plaintext highlighter-rouge">before_action</code> with <code class="language-plaintext highlighter-rouge">only</code>/<code class="language-plaintext highlighter-rouge">except</code> gets grim when you get a bunch of them together e.g.</p>

<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="c1"># Day 1, workin' on that organisations feature oh yeah!</span>

<span class="c1"># No problem here, pretty easy to read and understand.</span>
<span class="k">class</span> <span class="nc">OrganisationsController</span> <span class="o">&lt;</span> <span class="no">AdminController</span>
  <span class="n">before_action</span> <span class="ss">:set_organisation</span><span class="p">,</span> <span class="ss">only: </span><span class="sx">%i[show edit update destroy]</span>

  <span class="k">def</span> <span class="nf">index</span>
<span class="c1"># ...</span></code></pre></figure>

<figure class="highlight"><pre><code class="language-ruby" data-lang="ruby"><span class="c1"># Day 100, adding even more stuff to organisations...</span>

<span class="c1"># What about now?</span>
<span class="c1"># How hard did you have to run your mental ruby interpreter to figure out which</span>
<span class="c1"># of these actions run before the controller action you care about?</span>
<span class="k">class</span> <span class="nc">OrganisationsController</span> <span class="o">&lt;</span> <span class="no">AdminController</span>
  <span class="n">before_action</span> <span class="ss">:set_organisation</span><span class="p">,</span> <span class="ss">only: </span><span class="sx">%i[show edit update destroy]</span>
  <span class="n">before_action</span> <span class="ss">:set_other</span><span class="p">,</span> <span class="ss">only: </span><span class="sx">%i[edit index destroy]</span>
  <span class="n">before_action</span> <span class="ss">:set_other_1</span><span class="p">,</span> <span class="ss">only: </span><span class="sx">%i[edit index show]</span>
  <span class="n">before_action</span> <span class="ss">:set_other_2</span><span class="p">,</span> <span class="ss">except: </span><span class="sx">%i[show destroy]</span>
  <span class="n">before_action</span> <span class="ss">:do_some_other_thing</span><span class="p">,</span> <span class="ss">only: </span><span class="sx">%i[show]</span>

  <span class="k">def</span> <span class="nf">index</span>
<span class="c1"># ...</span></code></pre></figure>
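
<p>For contrast, here is a sketch of the explicit alternative: each action calls the setup helper itself, so there is no filter list to mentally evaluate. This is a simplified plain-Ruby stand-in for a Rails controller (so the pattern can run outside Rails); <code class="language-plaintext highlighter-rouge">set_organisation</code> here just assigns a dummy value.</p>

```ruby
# Plain-Ruby stand-in for a controller: setup is called explicitly
# in each action rather than via before_action with only:/except:.
class OrganisationsController
  def show
    set_organisation # explicit, visible at the point of use
    "showing #{@organisation}"
  end

  def edit
    set_organisation
    "editing #{@organisation}"
  end

  def index
    # no set_organisation needed, and that is obvious at a glance
    "listing organisations"
  end

  private

  def set_organisation
    @organisation = "org-42" # in Rails: Organisation.find(params[:id])
  end
end
```

<p>The trade-off is a little repetition, which I think is a fair price for being able to read any action top-to-bottom.</p>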

<p>I think in <em>almost all</em> cases it is better to just type code directly into the controller action. If there is enough code to warrant pulling into a private method then that’s fine - just call the private method in the controller action.</p>]]></content><author><name></name></author><category term="ruby" /><category term="rails" /><summary type="html"><![CDATA[A before_action that runs before every action is a useful pattern which helps us avoid forgetting to call some important piece of code before every action in the controller e.g. authentication or authorization.]]></summary></entry></feed>