PostgreSQL Autovacuum Tuning for High-Write Workloads

Operations

May 25, 2026

14 min read

PostgreSQL Autovacuum Tuning for High-Write Workloads

A deep practical guide to tuning PostgreSQL autovacuum so busy tables stay healthy, bloat stays under control, and emergency VACUUM FULL stops being the default reaction.

ArmorDB EngineeringArmorDB engineering

PostgreSQLAutovacuumVACUUM

Autovacuum usually becomes visible only when something already feels wrong. A table gets slower week by week, index size grows faster than expected, dead tuples pile up, and someone ends up proposing VACUUM FULL in the middle of the day. The real problem is rarely that PostgreSQL forgot to clean up after itself. It is that the cleanup cadence no longer matches the write pattern of the busiest tables.

For managed PostgreSQL teams, this matters because autovacuum is not an obscure maintenance feature. It is part of the database’s normal write path. When it keeps up, updates and deletes remain routine. When it falls behind, storage grows noisier, cache efficiency drops, and query latency gets harder to explain. The fix is usually not “turn vacuum up everywhere.” It is understanding which tables change fastest, why the default thresholds miss them, and which settings should move per table instead of globally.

What autovacuum is really responsible for

PostgreSQL’s routine vacuuming documentation is explicit that VACUUM is needed for several different jobs: recovering or reusing space from updated and deleted rows, updating planner statistics through ANALYZE, maintaining the visibility map that helps index-only scans, and preventing transaction ID wraparound failures. That is why autovacuum problems show up in more than one way. A team may notice bloat first, but stale statistics or aggressive anti-wraparound work can be part of the same story.

The easiest mistake is to think of autovacuum as a background janitor that only matters after deletes. PostgreSQL’s MVCC model means updates create dead row versions too. On a table that is updated constantly, old row versions accumulate even if the row count stays flat. Busy application tables can therefore become vacuum-sensitive long before anyone thinks of them as “large.”

Responsibility	What it protects	What failure looks like
Reclaim dead tuples for reuse	Table and index growth stay sane	Rising bloat, extra I/O, slower hot paths
Refresh planner statistics through `ANALYZE`	Query plans stay close to reality	Sudden bad plans after data distribution changes
Maintain visibility map information	Index-only scans and efficient vacuum work	More heap visits and less predictable maintenance
Freeze old tuples before wraparound risk	Cluster safety and uptime	Emergency anti-wraparound vacuum pressure

That full list is why autovacuum tuning should be treated as production engineering, not cosmetic performance work.

Why default settings often miss the busiest tables

Autovacuum does not wake up because a table feels important. PostgreSQL checks thresholds. The docs describe the standard vacuum trigger as a base threshold plus a scale factor multiplied by the estimated row count, and a separate insert-triggered formula for insert-heavy tables. With default settings, a table usually reaches vacuum after 50 updated or deleted tuples plus 20% of reltuples, while ANALYZE usually triggers after 50 changed tuples plus 10% of reltuples.

Those defaults are sensible for many mixed workloads, but they break down on large hot tables. A table with 10 million rows and the default vacuum scale factor of 0.2 does not become a normal autovacuum candidate until roughly 2,000,050 updated or deleted rows have accumulated. If that table backs sessions, jobs, events, carts, or other write-heavy application state, waiting for two million dead tuples is often much too late.

The same logic affects inserts. PostgreSQL also documents autovacuum_vacuum_insert_threshold and autovacuum_vacuum_insert_scale_factor so insert-only or mostly-insert tables can still be vacuumed and frozen appropriately. Teams sometimes ignore these because they think inserts create no dead tuples. That misses another point from the docs: insert-triggered vacuums also help pages become all-visible or all-frozen, which reduces future work.

Table pattern	Why defaults struggle	Better first move
Large table with frequent updates	Scale factor waits for too many dead tuples	Lower per-table `autovacuum_vacuum_scale_factor`
Insert-heavy event table	Vacuum may not run early enough for freezing and visibility map progress	Tune insert threshold or insert scale factor
Table with rapidly changing distributions	Planner stats age out before vacuum pain is obvious	Lower `autovacuum_analyze_scale_factor`
Small but extremely hot queue table	It changes faster than its size suggests	Lower thresholds directly, not only scale factors

This is why blanket global tuning can disappoint. The tables that hurt most are usually a minority, and they need a different cadence from the rest of the cluster.

Start with evidence, not folklore

Before changing anything, look at table-level statistics. PostgreSQL’s statistics system exposes counts such as n_dead_tup, last_autovacuum, last_autoanalyze, and the HOT-update counters in pg_stat_all_tables. Those numbers are not perfect truth, but they are enough to show whether the busiest tables are being revisited often enough.

A practical first query is boring and extremely useful:

select
  schemaname,
  relname,
  n_live_tup,
  n_dead_tup,
  last_autovacuum,
  last_autoanalyze,
  vacuum_count,
  autovacuum_count,
  analyze_count,
  autoanalyze_count
from pg_stat_all_tables
where schemaname not in ('pg_catalog', 'information_schema')
order by n_dead_tup desc
limit 20;

If the same important tables keep surfacing with high dead-tuple counts and old last_autovacuum timestamps, the database is telling you where to focus. If last_autoanalyze is stale on a table whose row distribution changes all day, you may have a planning problem alongside the cleanup problem.

For a live vacuum, PostgreSQL’s progress-reporting docs expose pg_stat_progress_vacuum, including scanned heap blocks, vacuumed heap blocks, and phase information. That matters during an incident because it lets you distinguish “vacuum is not running” from “vacuum is running, but this table is expensive.” Those are different problems and should not be tuned the same way.

The safest tuning order

Most autovacuum incidents are better solved per table than cluster-wide. The reason is simple: global settings affect everything, including tables that are already healthy. A safer sequence is to identify the few hot relations that are consistently late, then reduce their thresholds until maintenance becomes boring again.

Tuning step	Why it helps	Typical risk
Lower `autovacuum_vacuum_scale_factor` on one hot table	Triggers cleanup earlier on large update-heavy relations	More frequent background I/O on that table
Lower `autovacuum_analyze_scale_factor` on the same table	Keeps planner stats fresher when data shape moves quickly	Slightly more analyze work
Raise cost limit or lower delay selectively	Lets autovacuum finish useful work sooner	Too much I/O if applied too broadly
Add table-specific fillfactor for update-heavy rows	Increases room for HOT updates and reduces page churn	Slightly larger table footprint up front

A common first change for a large update-heavy table is to move the vacuum scale factor from 0.2 to something like 0.02 or 0.01. That does not mean those numbers are universally correct. It means that for genuinely hot tables, ten times earlier can be healthier than waiting for millions of dead tuples. The useful goal is not “run autovacuum constantly.” It is “run it before bloat and stale statistics become user-visible.”

A table-specific change looks like this:

alter table public.orders set (
  autovacuum_vacuum_scale_factor = 0.02,
  autovacuum_analyze_scale_factor = 0.02,
  autovacuum_vacuum_threshold = 1000,
  autovacuum_analyze_threshold = 500
);

That example is intentionally table-local. It is much easier to reason about one hot table than a whole-cluster policy change that surprises unrelated workloads.

When cost settings are the real bottleneck

Sometimes autovacuum is not late because the thresholds are wrong. It is late because workers are too polite. PostgreSQL documents cost-based delay and the autovacuum cost settings so maintenance work can throttle itself. That is useful on busy systems, but it also means a worker can start on time and still take too long to finish. The runtime config docs also note that the effective cost limit is distributed across running autovacuum workers unless table-specific overrides are set.

This matters on clusters with several hot tables competing for attention. If vacuum starts frequently but progress is slow, the next useful question is whether workers are underpowered or too few, not whether thresholds should fall even further. Lowering thresholds without enough throughput can simply create a longer queue.

The pattern to watch is straightforward. If tables are vacuumed often and still show large dead-tuple accumulation, throughput is probably the issue. If tables are vacuumed rarely because they are never crossing the trigger soon enough, thresholds are the issue. Good tuning distinguishes those two cases.

Fillfactor and HOT updates are part of the story

The PostgreSQL storage documentation notes that lowering a table’s fillfactor can increase the likelihood of HOT updates by leaving enough space on a page for new row versions. That matters because HOT updates avoid some index churn when updated columns are not part of the indexed key path. In practice, a table that updates the same rows constantly can benefit from both earlier vacuuming and a better chance of keeping row-version churn local.

This is one of the clearest examples of why autovacuum tuning is not just one setting. If a table stores mutable status fields, timestamps, or retry counters, lowering fillfactor can make each update cheaper, while more appropriate autovacuum thresholds keep the accumulated dead tuples from lingering too long. Either change alone can help. Together they often make the table calmer.

That is also where PostgreSQL 17’s vacuum improvements matter for managed fleets. If you are planning a version upgrade, our guide to Postgres 17 performance improvements is useful background because vacuum efficiency improvements change how much pressure the maintenance path puts on write-heavy systems.

What not to do first

The most expensive mistake is reaching for VACUUM FULL as the default response. PostgreSQL’s VACUUM documentation says VACUUM FULL rewrites the table into a new disk file, requires extra disk space, and takes an ACCESS EXCLUSIVE lock. That makes it a recovery tool for specific space-reclamation situations, not a normal answer to everyday autovacuum lag.

Another common mistake is disabling autovacuum on a “special” table. PostgreSQL is very clear that anti-wraparound vacuuming will still be forced when necessary, even if regular autovacuum is disabled. In other words, you do not escape vacuuming. You just give up the controlled version and wait for the emergency version.

The third mistake is tuning only globally because that feels simpler. Global changes are reasonable when nearly every large table has the same problem. Most application databases are not like that. They usually have a few extremely hot tables, a larger set of moderate tables, and many cold tables that do fine with defaults.

A practical runbook for managed PostgreSQL teams

Start by identifying the top write-heavy tables in production and compare their n_dead_tup, last autovacuum times, and query importance. Pick one painful table, not ten. Lower its vacuum and analyze scale factors, wait long enough to observe at least several maintenance cycles, and compare whether dead tuples plateau sooner and query behavior stabilizes.

If the table still accumulates dead tuples too quickly, check whether autovacuum progress suggests workers are taking too long once they start. That is the moment to revisit cost settings, worker count, or table shape rather than blindly lowering thresholds again. If the table is update-heavy and row versions frequently stay on the same pages, consider a fillfactor change for future rewrites or maintenance windows.

Managed PostgreSQL also changes the operational boundary slightly. You may not control every system-level knob or every maintenance workflow directly, so it helps to keep the diagnosis small and evidence-based. On ArmorDB, that usually means checking workload shape, storage growth, and plan limits alongside the table statistics in /docs/limits, instead of assuming every slowdown is a parameter problem.

Common signs your tuning worked

The goal is not to make vacuum disappear from dashboards. The goal is to make it uneventful. Good results usually look like dead tuples peaking lower than before, more regular last_autovacuum timestamps on the hot tables, fewer sudden plan regressions from stale stats, and less pressure to run manual maintenance during business hours.

A healthy outcome can still include regular autovacuum activity. In fact, on a hot table, seeing it show up routinely is often better news than seeing it stay silent for too long.

Takeaway

Autovacuum tuning is mostly about matching maintenance frequency to write reality. PostgreSQL’s defaults are intentionally general, but large update-heavy tables often need earlier vacuum and analyze triggers than the defaults provide. Start with table-level evidence, tune the hottest relations first, separate threshold problems from throughput problems, and treat VACUUM FULL as an exception rather than the plan.

If your production database keeps rediscovering the same bloat problem, the answer is usually not more heroics. It is a calmer autovacuum cadence on the tables that actually earn one.

Sources / further reading

PostgreSQL routine vacuuming: https://www.postgresql.org/docs/current/routine-vacuuming.html
PostgreSQL vacuuming configuration: https://www.postgresql.org/docs/current/runtime-config-vacuum.html
PostgreSQL statistics and pg_stat_all_tables: https://www.postgresql.org/docs/current/monitoring-stats.html
PostgreSQL vacuum progress reporting: https://www.postgresql.org/docs/current/progress-reporting.html
PostgreSQL VACUUM command reference: https://www.postgresql.org/docs/current/sql-vacuum.html
PostgreSQL visibility map: https://www.postgresql.org/docs/current/storage-vm.html
PostgreSQL HOT updates and fillfactor: https://www.postgresql.org/docs/current/storage-hot.html

Topic

Operations

Updated

May 25, 2026

Read time

14 min read

About the author

ArmorDB Engineering writes about PostgreSQL operations, security, and infrastructure decisions for teams building production apps on ArmorDB.

Scale your databaseTry ArmorDB Free

Compare managed plans