PostgreSQL Autovacuum Tuning for High-Write Workloads
A deep practical guide to tuning PostgreSQL autovacuum so busy tables stay healthy, bloat stays under control, and emergency VACUUM FULL stops being the default reaction.
Autovacuum usually becomes visible only when something already feels wrong. A table gets slower week by week, index size grows faster than expected, dead tuples pile up, and someone ends up proposing VACUUM FULL in the middle of the day. The real problem is rarely that PostgreSQL forgot to clean up after itself. It is that the cleanup cadence no longer matches the write pattern of the busiest tables.
For managed PostgreSQL teams, this matters because autovacuum is not an obscure maintenance feature. It is part of the database’s normal write path. When it keeps up, updates and deletes remain routine. When it falls behind, storage grows noisier, cache efficiency drops, and query latency gets harder to explain. The fix is usually not “turn vacuum up everywhere.” It is understanding which tables change fastest, why the default thresholds miss them, and which settings should move per table instead of globally.
What autovacuum is really responsible for
PostgreSQL’s routine vacuuming documentation is explicit that VACUUM is needed for several different jobs: recovering or reusing space from updated and deleted rows, updating planner statistics through ANALYZE, maintaining the visibility map that helps index-only scans, and preventing transaction ID wraparound failures. That is why autovacuum problems show up in more than one way. A team may notice bloat first, but stale statistics or aggressive anti-wraparound work can be part of the same story.
The easiest mistake is to think of autovacuum as a background janitor that only matters after deletes. PostgreSQL’s MVCC model means updates create dead row versions too. On a table that is updated constantly, old row versions accumulate even if the row count stays flat. Busy application tables can therefore become vacuum-sensitive long before anyone thinks of them as “large.”
| Responsibility | What it protects | What failure looks like |
|---|---|---|
| Reclaim dead tuples for reuse | Table and index growth stay sane | Rising bloat, extra I/O, slower hot paths |
Refresh planner statistics through ANALYZE | Query plans stay close to reality | Sudden bad plans after data distribution changes |
| Maintain visibility map information | Index-only scans and efficient vacuum work | More heap visits and less predictable maintenance |
| Freeze old tuples before wraparound risk | Cluster safety and uptime | Emergency anti-wraparound vacuum pressure |
That full list is why autovacuum tuning should be treated as production engineering, not cosmetic performance work.
Why default settings often miss the busiest tables
Autovacuum does not wake up because a table feels important. PostgreSQL checks thresholds. The docs describe the standard vacuum trigger as a base threshold plus a scale factor multiplied by the estimated row count, and a separate insert-triggered formula for insert-heavy tables. With default settings, a table usually reaches vacuum after 50 updated or deleted tuples plus 20% of reltuples, while ANALYZE usually triggers after 50 changed tuples plus 10% of reltuples.
Those defaults are sensible for many mixed workloads, but they break down on large hot tables. A table with 10 million rows and the default vacuum scale factor of 0.2 does not become a normal autovacuum candidate until roughly 2,000,050 updated or deleted rows have accumulated. If that table backs sessions, jobs, events, carts, or other write-heavy application state, waiting for two million dead tuples is often much too late.
The same logic affects inserts. PostgreSQL also documents autovacuum_vacuum_insert_threshold and autovacuum_vacuum_insert_scale_factor so insert-only or mostly-insert tables can still be vacuumed and frozen appropriately. Teams sometimes ignore these because they think inserts create no dead tuples. That misses another point from the docs: insert-triggered vacuums also help pages become all-visible or all-frozen, which reduces future work.
| Table pattern | Why defaults struggle | Better first move |
|---|---|---|
| Large table with frequent updates | Scale factor waits for too many dead tuples | Lower per-table autovacuum_vacuum_scale_factor |
| Insert-heavy event table | Vacuum may not run early enough for freezing and visibility map progress | Tune insert threshold or insert scale factor |
| Table with rapidly changing distributions | Planner stats age out before vacuum pain is obvious | Lower autovacuum_analyze_scale_factor |
| Small but extremely hot queue table | It changes faster than its size suggests | Lower thresholds directly, not only scale factors |
This is why blanket global tuning can disappoint. The tables that hurt most are usually a minority, and they need a different cadence from the rest of the cluster.
Start with evidence, not folklore
Before changing anything, look at table-level statistics. PostgreSQL’s statistics system exposes counts such as n_dead_tup, last_autovacuum, last_autoanalyze, and the HOT-update counters in pg_stat_all_tables. Those numbers are not perfect truth, but they are enough to show whether the busiest tables are being revisited often enough.
A practical first query is boring and extremely useful:
select
schemaname,
relname,
n_live_tup,
n_dead_tup,
last_autovacuum,
last_autoanalyze,
vacuum_count,
autovacuum_count,
analyze_count,
autoanalyze_count
from pg_stat_all_tables
where schemaname not in ('pg_catalog', 'information_schema')
order by n_dead_tup desc
limit 20;
If the same important tables keep surfacing with high dead-tuple counts and old last_autovacuum timestamps, the database is telling you where to focus. If last_autoanalyze is stale on a table whose row distribution changes all day, you may have a planning problem alongside the cleanup problem.
For a live vacuum, PostgreSQL’s progress-reporting docs expose pg_stat_progress_vacuum, including scanned heap blocks, vacuumed heap blocks, and phase information. That matters during an incident because it lets you distinguish “vacuum is not running” from “vacuum is running, but this table is expensive.” Those are different problems and should not be tuned the same way.
The safest tuning order
Most autovacuum incidents are better solved per table than cluster-wide. The reason is simple: global settings affect everything, including tables that are already healthy. A safer sequence is to identify the few hot relations that are consistently late, then reduce their thresholds until maintenance becomes boring again.
| Tuning step | Why it helps | Typical risk |
|---|---|---|
Lower autovacuum_vacuum_scale_factor on one hot table | Triggers cleanup earlier on large update-heavy relations | More frequent background I/O on that table |
Lower autovacuum_analyze_scale_factor on the same table | Keeps planner stats fresher when data shape moves quickly | Slightly more analyze work |
| Raise cost limit or lower delay selectively | Lets autovacuum finish useful work sooner | Too much I/O if applied too broadly |
| Add table-specific fillfactor for update-heavy rows | Increases room for HOT updates and reduces page churn | Slightly larger table footprint up front |
A common first change for a large update-heavy table is to move the vacuum scale factor from 0.2 to something like 0.02 or 0.01. That does not mean those numbers are universally correct. It means that for genuinely hot tables, ten times earlier can be healthier than waiting for millions of dead tuples. The useful goal is not “run autovacuum constantly.” It is “run it before bloat and stale statistics become user-visible.”
A table-specific change looks like this:
alter table public.orders set (
autovacuum_vacuum_scale_factor = 0.02,
autovacuum_analyze_scale_factor = 0.02,
autovacuum_vacuum_threshold = 1000,
autovacuum_analyze_threshold = 500
);
That example is intentionally table-local. It is much easier to reason about one hot table than a whole-cluster policy change that surprises unrelated workloads.
When cost settings are the real bottleneck
Sometimes autovacuum is not late because the thresholds are wrong. It is late because workers are too polite. PostgreSQL documents cost-based delay and the autovacuum cost settings so maintenance work can throttle itself. That is useful on busy systems, but it also means a worker can start on time and still take too long to finish. The runtime config docs also note that the effective cost limit is distributed across running autovacuum workers unless table-specific overrides are set.
This matters on clusters with several hot tables competing for attention. If vacuum starts frequently but progress is slow, the next useful question is whether workers are underpowered or too few, not whether thresholds should fall even further. Lowering thresholds without enough throughput can simply create a longer queue.
The pattern to watch is straightforward. If tables are vacuumed often and still show large dead-tuple accumulation, throughput is probably the issue. If tables are vacuumed rarely because they are never crossing the trigger soon enough, thresholds are the issue. Good tuning distinguishes those two cases.
Fillfactor and HOT updates are part of the story
The PostgreSQL storage documentation notes that lowering a table’s fillfactor can increase the likelihood of HOT updates by leaving enough space on a page for new row versions. That matters because HOT updates avoid some index churn when updated columns are not part of the indexed key path. In practice, a table that updates the same rows constantly can benefit from both earlier vacuuming and a better chance of keeping row-version churn local.
This is one of the clearest examples of why autovacuum tuning is not just one setting. If a table stores mutable status fields, timestamps, or retry counters, lowering fillfactor can make each update cheaper, while more appropriate autovacuum thresholds keep the accumulated dead tuples from lingering too long. Either change alone can help. Together they often make the table calmer.
That is also where PostgreSQL 17’s vacuum improvements matter for managed fleets. If you are planning a version upgrade, our guide to Postgres 17 performance improvements is useful background because vacuum efficiency improvements change how much pressure the maintenance path puts on write-heavy systems.
What not to do first
The most expensive mistake is reaching for VACUUM FULL as the default response. PostgreSQL’s VACUUM documentation says VACUUM FULL rewrites the table into a new disk file, requires extra disk space, and takes an ACCESS EXCLUSIVE lock. That makes it a recovery tool for specific space-reclamation situations, not a normal answer to everyday autovacuum lag.
Another common mistake is disabling autovacuum on a “special” table. PostgreSQL is very clear that anti-wraparound vacuuming will still be forced when necessary, even if regular autovacuum is disabled. In other words, you do not escape vacuuming. You just give up the controlled version and wait for the emergency version.
The third mistake is tuning only globally because that feels simpler. Global changes are reasonable when nearly every large table has the same problem. Most application databases are not like that. They usually have a few extremely hot tables, a larger set of moderate tables, and many cold tables that do fine with defaults.
A practical runbook for managed PostgreSQL teams
Start by identifying the top write-heavy tables in production and compare their n_dead_tup, last autovacuum times, and query importance. Pick one painful table, not ten. Lower its vacuum and analyze scale factors, wait long enough to observe at least several maintenance cycles, and compare whether dead tuples plateau sooner and query behavior stabilizes.
If the table still accumulates dead tuples too quickly, check whether autovacuum progress suggests workers are taking too long once they start. That is the moment to revisit cost settings, worker count, or table shape rather than blindly lowering thresholds again. If the table is update-heavy and row versions frequently stay on the same pages, consider a fillfactor change for future rewrites or maintenance windows.
Managed PostgreSQL also changes the operational boundary slightly. You may not control every system-level knob or every maintenance workflow directly, so it helps to keep the diagnosis small and evidence-based. On ArmorDB, that usually means checking workload shape, storage growth, and plan limits alongside the table statistics in /docs/limits, instead of assuming every slowdown is a parameter problem.
Common signs your tuning worked
The goal is not to make vacuum disappear from dashboards. The goal is to make it uneventful. Good results usually look like dead tuples peaking lower than before, more regular last_autovacuum timestamps on the hot tables, fewer sudden plan regressions from stale stats, and less pressure to run manual maintenance during business hours.
A healthy outcome can still include regular autovacuum activity. In fact, on a hot table, seeing it show up routinely is often better news than seeing it stay silent for too long.
Takeaway
Autovacuum tuning is mostly about matching maintenance frequency to write reality. PostgreSQL’s defaults are intentionally general, but large update-heavy tables often need earlier vacuum and analyze triggers than the defaults provide. Start with table-level evidence, tune the hottest relations first, separate threshold problems from throughput problems, and treat VACUUM FULL as an exception rather than the plan.
If your production database keeps rediscovering the same bloat problem, the answer is usually not more heroics. It is a calmer autovacuum cadence on the tables that actually earn one.
Sources / further reading
- PostgreSQL routine vacuuming: https://www.postgresql.org/docs/current/routine-vacuuming.html
- PostgreSQL vacuuming configuration: https://www.postgresql.org/docs/current/runtime-config-vacuum.html
- PostgreSQL statistics and
pg_stat_all_tables: https://www.postgresql.org/docs/current/monitoring-stats.html - PostgreSQL vacuum progress reporting: https://www.postgresql.org/docs/current/progress-reporting.html
- PostgreSQL
VACUUMcommand reference: https://www.postgresql.org/docs/current/sql-vacuum.html - PostgreSQL visibility map: https://www.postgresql.org/docs/current/storage-vm.html
- PostgreSQL HOT updates and
fillfactor: https://www.postgresql.org/docs/current/storage-hot.html
Topic
Operations
Updated
May 25, 2026
Read time
14 min read
ArmorDB Engineering writes about PostgreSQL operations, security, and infrastructure decisions for teams building production apps on ArmorDB.
Read next
Tech-News · 7 min read
PostgreSQL 19 Beta 1: What Managed PostgreSQL Teams Should Test Now
PostgreSQL 19 Beta 1 previews changes in maintenance, replication, security, and observability that managed PostgreSQL teams should evaluate before the final release.
Read articleData-Specs · 8 min read
PostgreSQL JSONB vs Relational Tables: How to Choose the Right Schema
A practical comparison of PostgreSQL JSONB, relational columns, and hybrid schemas for SaaS teams deciding what to model, index, and constrain.
Read article