Operations

May 17, 2026

15 min read

PostgreSQL Backups and Restore Strategy

A practical guide to PostgreSQL backups, restore testing, retention, and why a backup only matters when recovery is predictable.

Security Team, ArmorDB engineering

PostgreSQL Backups, Restore, Disaster Recovery

A backup is not a recovery plan by itself. It is only useful if the team knows what it contains, how long it takes to restore, which point in time it can recover to, and what the application should do while recovery is happening. Many teams discover this distinction too late: the backup exists, but the restore path has never been tested under realistic pressure.

PostgreSQL gives teams strong building blocks for backup and recovery. Managed PostgreSQL should turn those building blocks into a predictable workflow. The goal is not to collect backup files. The goal is to make recovery boring enough that an incident does not become an improvised database project.

What a good backup strategy needs

A good backup strategy starts with two business questions. How much data can the product afford to lose, and how long can the product afford to be unavailable? These are usually called RPO and RTO. RPO is the recovery point objective: the maximum acceptable data loss. RTO is the recovery time objective: the maximum acceptable time to restore service.

Those numbers do not need to be perfect on day one, but they need to be honest. A small internal tool can often tolerate a longer restore window. A customer-facing SaaS product with paid users needs a tighter plan, clearer ownership, and regular restore testing.

Term	Meaning	Example question
RPO	How much data loss is acceptable	Can we lose the last hour of writes?
RTO	How long recovery may take	Can the app be down for thirty minutes?
Retention	How long backups are kept	Do we need seven days or thirty days?
Restore test	Proof the backup works	When did we last restore successfully?
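
The comparison behind these terms can be written down as a small check. The sketch below is illustrative only: the function name and the sample numbers are assumptions, and it assumes scheduled backups with no WAL archiving, so the worst-case data loss is one full backup interval.

```python
from datetime import timedelta


def meets_objectives(backup_interval, measured_restore, rpo, rto):
    """Return (rpo_ok, rto_ok) for a schedule of periodic backups.

    Assumption: no WAL archiving, so the worst-case data loss equals
    the time between two backups.
    """
    rpo_ok = backup_interval <= rpo
    rto_ok = measured_restore <= rto
    return rpo_ok, rto_ok


# Hypothetical numbers: nightly backups, a restore that took 20 minutes
# in the last test, and targets of 1 hour (RPO) and 30 minutes (RTO).
rpo_ok, rto_ok = meets_objectives(
    backup_interval=timedelta(hours=24),
    measured_restore=timedelta(minutes=20),
    rpo=timedelta(hours=1),
    rto=timedelta(minutes=30),
)
print(f"RPO met: {rpo_ok}, RTO met: {rto_ok}")
```

In this example the restore time is acceptable but the nightly schedule cannot satisfy a one-hour RPO, which is the kind of gap that usually pushes a team toward point-in-time recovery.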

Logical and physical backups

PostgreSQL teams usually think about two broad backup styles. Logical backups capture database contents in a portable format, often through tools like pg_dump. They are useful for migrations, exports, smaller databases, and cases where portability matters. Physical backups capture the database files and write-ahead log needed to restore the database state more directly, which is more appropriate for larger production systems and point-in-time recovery.

The best choice depends on database size, recovery expectations, and operational maturity. A small product may start with logical backups and later move toward physical backup workflows as the dataset grows. What matters is that the restore process is understood before the database becomes critical.

Backup type	Best fit	Tradeoff
Logical backup	Portability, migrations, smaller databases	Slower for large databases
Physical backup	Production recovery, larger datasets	More operational complexity
Point-in-time recovery	Recovering before a bad migration or delete	Requires WAL handling and tested process
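
As a rough illustration of the logical path, the sketch below builds `pg_dump` and `pg_restore` command lines without running them. The database names and file name are hypothetical, and connection details are assumed to come from the usual environment variables or a service file.

```python
import shlex


def pg_dump_command(dbname, outfile):
    """Build a pg_dump invocation that writes a custom-format archive."""
    return [
        "pg_dump",
        "--format=custom",   # compressed archive readable by pg_restore
        f"--file={outfile}",
        f"--dbname={dbname}",
    ]


def pg_restore_command(dbname, dumpfile, jobs=4):
    """Build a pg_restore invocation that loads the archive in parallel."""
    return [
        "pg_restore",
        f"--dbname={dbname}",
        f"--jobs={jobs}",    # parallel restore works with custom-format archives
        "--no-owner",        # do not try to restore original object ownership
        dumpfile,
    ]


# Hypothetical names; print the commands instead of executing them.
print(shlex.join(pg_dump_command("app_db", "app_db.dump")))
print(shlex.join(pg_restore_command("app_db_restore_test", "app_db.dump")))
```

Building the argument list separately from execution keeps the commands reviewable and lets the same list be passed to `subprocess.run` once the team is happy with it.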

Point-in-time recovery is the safety net behind the backup

Point-in-time recovery, or PITR, is what keeps a backup useful when an incident is subtle. If someone deletes rows at 10:12, a dump from last night may be too old, but PITR can restore the database to 10:11 and keep the writes made since that dump. That works because PostgreSQL can start from a physical base backup and replay WAL, the write-ahead log, up to a chosen recovery target.

This matters for two common failure modes: quiet data corruption and bad migrations. In both cases, the database may be technically up, but the data is wrong. A restore path that supports PITR gives the team a smaller blast radius and a much cleaner recovery story.
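
To make the 10:11 example concrete, here is a minimal sketch that renders the recovery settings PostgreSQL 12 and later read from `postgresql.conf` during a targeted recovery. The archive path, the date, and the use of `cp` as the restore command are assumptions for illustration; a managed platform or a backup tool would normally supply its own values.

```python
from datetime import datetime, timezone


def pitr_settings(target_time, archive_dir):
    """Render recovery settings for a point-in-time restore.

    Assumptions: WAL segments were archived as plain files in archive_dir,
    and the server is PostgreSQL 12+, where an empty recovery.signal file
    in the data directory starts targeted recovery.
    """
    if target_time.tzinfo is None:
        raise ValueError("target_time must be timezone-aware")
    stamp = target_time.isoformat(sep=" ", timespec="seconds")
    return "\n".join([
        f"restore_command = 'cp {archive_dir}/%f %p'",
        f"recovery_target_time = '{stamp}'",
        "recovery_target_action = 'promote'",
    ])


# Hypothetical incident: rows were deleted at 10:12 UTC, so recover to 10:11.
target = datetime(2026, 5, 17, 10, 11, tzinfo=timezone.utc)
print(pitr_settings(target, "/mnt/wal_archive"))
```

Requiring a timezone-aware target avoids the easy mistake of recovering to 10:11 in the wrong zone, which can land the restore an hour or more away from the intended moment.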

Restore testing is the real signal

The strongest backup system is the one that has been restored recently. A dashboard that says backups are running is useful, but it is not the same as proof that recovery works. Restore testing catches missing permissions, wrong assumptions, slow transfer paths, unexpected database size growth, and application configuration problems.

Teams do not need to run a full incident simulation every week. They do need a regular habit: restore into a non-production environment, connect an application or inspection tool, verify key tables, and document the actual time required. That single workflow turns backups from hope into operational knowledge.

A practical restore runbook

Pick a recent backup and restore it into a clean environment.
Confirm the database starts and accepts connections.
Verify the most important tables, roles, and extensions.
Run the application smoke tests against the restored database.
Record the actual restore time and compare it to the RTO.
Write down any manual steps so the next restore is easier.
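
The runbook above can be sketched as a small drill runner that executes each step in order, stops at the first failure, and compares the elapsed time to the RTO. The step functions here are placeholders, not real checks; in practice each would call the team's own restore and verification tooling.

```python
import time
from datetime import timedelta


def run_restore_drill(steps, rto, clock=time.monotonic):
    """Run (name, callable) steps in order and time the whole drill.

    Each callable returns True on success. The drill stops at the
    first failing step so later checks do not hide the real problem.
    """
    started = clock()
    results = []
    for name, step in steps:
        ok = bool(step())
        results.append((name, ok))
        if not ok:
            break
    elapsed = timedelta(seconds=clock() - started)
    passed = len(results) == len(steps) and all(ok for _, ok in results)
    return {
        "results": results,
        "elapsed": elapsed,
        "passed": passed,
        "within_rto": passed and elapsed <= rto,
    }


# Placeholder steps standing in for the real runbook actions.
steps = [
    ("restore backup into clean environment", lambda: True),
    ("database accepts connections", lambda: True),
    ("key tables, roles, extensions present", lambda: True),
    ("application smoke tests", lambda: True),
]
report = run_restore_drill(steps, rto=timedelta(minutes=30))
print(report["passed"], report["within_rto"], report["elapsed"])
```

Recording the elapsed time on every drill gives the team a history of real restore durations to compare against the RTO as the database grows.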

Common recovery scenarios

Scenario	What usually breaks	What to use
Bad deploy or broken migration	Schema or data changes land incorrectly	PITR or rollback from a recent restore point
Accidental deletion	Rows disappear without a full outage	PITR to just before the mistake
Corruption or operator error	Data looks wrong, but the app is still running	Restore from backup plus validation
Region or environment outage	Entire deployment becomes unavailable	Full restore into a new environment

Retention should match risk

Longer retention is not automatically better. Retention should match the kinds of mistakes the team needs to recover from. A bad deploy may be noticed in minutes. A quiet data corruption bug may take days to discover. A compliance or audit requirement may need a longer window. The right retention policy balances storage cost, product risk, and the practical value of older backups.

For early-stage products, a simple retention policy is usually better than a complicated one that nobody understands. As the product grows, retention should become more deliberate, especially around customer data, billing data, and audit-sensitive workflows.
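
A simple retention policy can be expressed in a few lines. The sketch below is one possible shape, assuming backups are identified by their completion timestamps and that a single fixed window applies; the function name and the guard that always keeps the newest backup are illustrative choices, not a description of any particular platform.

```python
from datetime import datetime, timedelta, timezone


def split_by_retention(backup_times, now, retention):
    """Split backup timestamps into (keep, expired) for a fixed window."""
    cutoff = now - retention
    keep = sorted(t for t in backup_times if t >= cutoff)
    expired = sorted(t for t in backup_times if t < cutoff)
    # Safety guard: never expire the only backup that still exists.
    if not keep and expired:
        keep.append(expired.pop())
    return keep, expired


# Hypothetical schedule: one backup per day for the last ten days.
now = datetime(2026, 5, 17, tzinfo=timezone.utc)
backups = [now - timedelta(days=i) for i in range(10)]
keep, expired = split_by_retention(backups, now, timedelta(days=7))
print(len(keep), "kept,", len(expired), "expired")
```

A policy this small is easy to explain during an incident, which is the main argument for starting simple.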

Who owns recovery

Backups only work when someone is responsible for using them. That does not mean one person has to perform every restore, but the team should know who decides when to recover, who verifies the restored data, and who confirms that the application is safe to re-open to users.

Role	Responsibility	Why it matters
Incident lead	Decides whether recovery should start	Prevents hesitation during an outage
Database owner	Runs or supervises the restore	Knows the database-specific details
Application owner	Verifies app behavior after restore	Confirms the service is actually usable
Operations/support	Communicates status and timing	Keeps the rest of the team informed

A documented owner set makes recovery faster because nobody has to guess who should press the button.

Backups and migrations

Database migrations are one of the most common reasons teams need recovery. A migration can be syntactically valid and still damage data if it changes the wrong rows, drops the wrong constraint, or rewrites a large table under load. Before risky migrations, teams should know which backup exists, how recent it is, and how recovery would work if the migration has to be reversed.

The practical rule is simple: the more destructive the migration, the more confidence the team needs in restore. For high-risk changes, recovery planning should happen before the migration is executed, not after errors appear in production.
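
That rule can be turned into a preflight check that runs before a migration. The thresholds below are hypothetical examples rather than recommendations; the point of the sketch is only that stricter limits apply as the risk level rises.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical limits: how old the latest backup may be per risk level.
MAX_BACKUP_AGE = {
    "low": timedelta(hours=24),
    "high": timedelta(hours=1),
}
# Hypothetical limit: high-risk changes need a restore test this recent.
MAX_RESTORE_TEST_AGE = timedelta(days=30)


def migration_preflight(risk, last_backup_at, last_restore_test_at, now):
    """Return a list of problems; an empty list means the migration may proceed."""
    problems = []
    if now - last_backup_at > MAX_BACKUP_AGE[risk]:
        problems.append("latest backup is too old for this risk level")
    if risk == "high" and now - last_restore_test_at > MAX_RESTORE_TEST_AGE:
        problems.append("no recent restore test")
    return problems


now = datetime(2026, 5, 17, 12, 0, tzinfo=timezone.utc)
problems = migration_preflight(
    risk="high",
    last_backup_at=now - timedelta(hours=3),
    last_restore_test_at=now - timedelta(days=45),
    now=now,
)
print(problems or "ok to migrate")
```

Returning a list of problems rather than a single boolean lets the check explain what needs fixing before the migration is attempted again.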

Common mistakes

Keeping backups but never testing a restore.
Assuming logical backups are enough for every workload.
Setting retention without a clear reason.
Forgetting to document who owns recovery during an incident.
Treating a backup as a substitute for a rollback plan.

How ArmorDB approaches backups

ArmorDB treats backups as part of the managed database workflow, not as an optional advanced feature hidden behind infrastructure menus. The dashboard should help teams understand whether backups are available for their plan, what capacity they are operating within, and when it is time to move to a plan with stronger production-oriented limits.

The product goal is straightforward: developers should not need to become backup operators before they can ship responsibly. They should still understand recovery concepts, but the routine mechanics should be part of the platform. Building recovery into the platform also means a restore runbook can survive staff turnover and still make sense six months later, instead of depending on tribal memory.

Sources / further reading

PostgreSQL documentation on backup and restore
PostgreSQL documentation on pg_dump, pg_restore, and WAL-based recovery
PostgreSQL documentation on point-in-time recovery
Your managed provider’s backup and restore documentation

Practical takeaway

A PostgreSQL backup strategy is only as good as the restore path. The important questions are not whether a backup file exists, but whether the team can recover the right data, in the expected time, with a process that has been tested. Start with honest RPO and RTO expectations, keep retention understandable, test restores regularly, and treat backups as a production workflow rather than a checkbox.

About the author

Security Team writes about PostgreSQL operations, security, and infrastructure decisions for teams building production apps on ArmorDB.
