Engineering
February 16, 2016

How to Run Postgres VACUUM FULL on AWS RDS with Minimal Downtime

By Marcus Gartner

Maintaining database performance is important, especially for applications that rely on speed and accuracy. Over time, PostgreSQL databases collect dead rows and unused space, which can slow things down. The VACUUM FULL command helps reclaim this space and improve efficiency.

Running VACUUM FULL on AWS RDS Postgres requires careful planning to avoid long downtimes or disruptions. This article explains step-by-step how to perform a full vacuum in a managed cloud environment, focusing on strategies to limit impact on your application.

This guide is written for anyone who wants to understand how to use Postgres VACUUM FULL, especially on AWS RDS, with a focus on minimizing downtime and keeping data accessible.

What VACUUM FULL does in PostgreSQL

VACUUM FULL is a PostgreSQL maintenance command that completely rewrites your database tables to reclaim unused disk space. When you run VACUUM FULL, PostgreSQL creates a new copy of the table containing only live data, then replaces the original table with this compacted version.

This process differs from regular VACUUM operations. A standard VACUUM command marks deleted rows as reusable space but doesn't actually shrink the table file. VACUUM FULL physically reduces the table size by removing all that dead space.

Key differences between VACUUM and VACUUM FULL:

  • Regular VACUUM: Runs while your application stays online, marks dead space as reusable
  • VACUUM FULL: Locks the table completely, physically shrinks the file size
  • Space recovery: Regular VACUUM reuses space, VACUUM FULL returns space to the operating system
  • Performance impact: Regular VACUUM has minimal impact, VACUUM FULL blocks all table access

The trade-off is significant. While VACUUM FULL recovers more disk space, it requires an exclusive table lock that prevents any reads or writes during the process.
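For reference, here is what each command looks like in practice (your_table_name is a placeholder):

-- Runs online; marks dead space as reusable but doesn't shrink the file
VACUUM your_table_name;

-- Takes an exclusive lock; rewrites the table and returns space to the OS
VACUUM FULL your_table_name;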

Why VACUUM FULL is useful

Over time, PostgreSQL tables accumulate “bloat” from deleted or updated rows. This bloat makes your database files larger than necessary and can slow down queries that have to scan through more data than needed.

VACUUM FULL becomes useful when regular maintenance isn't enough. Here are the main scenarios:

Common situations that benefit from VACUUM FULL:

  • Heavy delete operations: After removing large amounts of data (like purging old records)
  • Bulk updates: When you've modified most rows in a table, creating significant dead space
  • Storage cost concerns: On cloud platforms where you pay for disk space usage
  • Performance degradation: When table scans become noticeably slower due to bloat

You can check if your tables have significant bloat by querying the pg_stat_user_tables view. Look for tables where n_dead_tup (dead tuples) makes up a large share of the total tuple count (n_live_tup + n_dead_tup).
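For example, a quick way to surface the most bloated tables is a query like this (a sketch; the dead-tuple percentage worth acting on depends on your workload):

SELECT relname,
       n_live_tup,
       n_dead_tup,
       round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 1) AS dead_pct
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;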

Drawbacks to consider

VACUUM FULL operations come with significant limitations that make them unsuitable for routine maintenance. Understanding these drawbacks helps you plan better and consider alternatives.

Primary limitations of VACUUM FULL:

  • Complete table locking: No reads or writes can happen during the operation
  • Extended downtime: Large tables can take hours to process
  • Double disk space: The operation temporarily needs space for both old and new table versions
  • Wasted work if interrupted: Cancelling rolls back safely, but all progress is discarded and the time spent holding the lock is lost

The disk space requirement is particularly important on AWS RDS. If you're running VACUUM FULL (or the equivalent vacuumdb --full from the command line) on a large table, you temporarily need enough free space to hold two copies of that table.
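To estimate the headroom you need, check the table's total size (including indexes and TOAST data) before you start; your_table_name is a placeholder:

SELECT pg_size_pretty(pg_total_relation_size('your_table_name')) AS total_size;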

For most routine maintenance, PostgreSQL's autovacuum feature handles cleanup automatically without these drawbacks. VACUUM FULL works best as an occasional deep-cleaning operation rather than regular maintenance.

Prepare your AWS RDS instance

Before running VACUUM FULL on AWS RDS, you'll want to prepare your environment to minimize downtime and avoid common issues.

Essential preparation steps:

  • Monitor current bloat: Use queries against pg_stat_user_tables to confirm which tables actually need VACUUM FULL
  • Check available storage: Ensure you have at least 2x the target table size in free disk space
  • Schedule maintenance windows: Plan the operation during your lowest traffic periods
  • Create recent backups: Take a snapshot through the RDS console before starting

You can monitor table sizes and bloat with this query:

SELECT schemaname, relname,
       pg_size_pretty(pg_total_relation_size(schemaname||'.'||relname)) AS size,
       n_dead_tup, n_live_tup
FROM pg_stat_user_tables
WHERE n_dead_tup > 0
ORDER BY pg_total_relation_size(schemaname||'.'||relname) DESC;

Consider temporarily increasing your RDS instance size or provisioned IOPS before the operation. Larger instances with more CPU and memory can complete VACUUM FULL operations significantly faster.

Step-by-step VACUUM FULL process

Running VACUUM FULL on AWS RDS follows the same basic process as standard PostgreSQL, with a few cloud-specific considerations.

1. Connect to your RDS instance

Connect using your preferred PostgreSQL client with administrative privileges:

psql -h your-rds-endpoint.amazonaws.com -U your-admin-user -d your-database

2. Identify target tables

Run the bloat monitoring query from the previous section to confirm which tables need attention. Focus on tables with high n_dead_tup values relative to their total size.

3. Run VACUUM FULL on specific tables

Target individual tables rather than running VACUUM FULL on the entire database:

VACUUM FULL your_table_name;

For verbose output that shows progress:

VACUUM (FULL, VERBOSE) your_table_name;
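One optional safeguard: because VACUUM FULL queues for an ACCESS EXCLUSIVE lock, it can block other sessions even while it waits to acquire the lock. Setting lock_timeout in your session makes the command give up quickly instead of stalling traffic behind it (a sketch; pick a timeout that suits your workload):

SET lock_timeout = '5s';   -- abort if the lock can't be acquired promptly
VACUUM FULL your_table_name;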

4. Monitor the operation

AWS RDS provides CloudWatch metrics to track the operation's progress. Watch for:

  • CPU utilization (will be high during the operation)
  • Free storage space (will decrease temporarily)
  • Database connections (other queries will be blocked)

The operation completes when psql returns the VACUUM command tag and CPU usage drops back to normal levels.
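On PostgreSQL 12 and later, you can also watch progress from a second connection via the pg_stat_progress_cluster view, which reports on VACUUM FULL as well as CLUSTER:

SELECT pid, command, phase,
       heap_blks_scanned, heap_blks_total
FROM pg_stat_progress_cluster;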

Run VACUUM FULL with minimal downtime

While VACUUM FULL inherently requires table locks, you can minimize downtime through careful planning and alternative approaches.

Strategies to reduce downtime:

  • Process tables individually: Instead of running VACUUM FULL on multiple tables, process them one at a time during separate maintenance windows
  • Use larger RDS instances temporarily: Scale up your instance before the operation to reduce processing time
  • Consider pg_repack: This extension rebuilds tables with minimal locking, though it requires installation and more complex setup

The pg_repack extension offers an alternative approach that rebuilds tables while allowing most read and write operations to continue. However, pg_repack isn't available by default on AWS RDS and requires careful configuration.

For most AWS RDS users, the most practical approach combines proper timing with temporary resource scaling. Running VACUUM FULL during your lowest traffic period while using a larger instance size can significantly reduce the impact.
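On RDS for PostgreSQL versions that list pg_repack among their supported extensions, enabling it is a single statement; the rebuild itself is then driven by the external pg_repack client utility, shown here as a comment since it runs outside the database (endpoint, user, and table names are placeholders):

-- Enable the extension on the target database
CREATE EXTENSION pg_repack;

-- The actual rebuild runs from a client machine, e.g.:
-- pg_repack -h your-rds-endpoint.amazonaws.com -U your-admin-user -d your-database -t your_table_name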

Verify your results

After VACUUM FULL completes, verify that the operation achieved its goals and didn't cause any issues.

Immediate verification steps:

  • Check table sizes: Compare before and after sizes using \dt+ in psql or the size query from earlier
  • Verify data integrity: Run a few test queries to confirm your data is intact and accessible
  • Monitor performance: Check if query performance improved on the processed tables

Use this query to see the space reclaimed:

SELECT schemaname, relname,
       pg_size_pretty(pg_total_relation_size(schemaname||'.'||relname)) AS current_size
FROM pg_stat_user_tables
WHERE relname = 'your_processed_table';

Ongoing monitoring:

  • Watch CloudWatch metrics for any performance changes
  • Monitor your application's response times
  • Check that autovacuum settings are properly configured to prevent future bloat

If you don't see the expected space reduction, the table might not have had significant bloat, or other factors might be affecting disk usage.

Best practices for ongoing maintenance

Regular maintenance prevents the buildup of table bloat that makes VACUUM FULL operations necessary in the first place.

Effective maintenance strategies:

  • Tune autovacuum settings: Adjust autovacuum_vacuum_scale_factor and related parameters for tables with high update/delete activity
  • Monitor table bloat regularly: Set up automated monitoring of dead tuple ratios
  • Schedule regular VACUUM operations: Use standard VACUUM commands during off-peak hours for high-activity tables
  • Consider partitioning: Large tables with predictable data patterns benefit from table partitioning

For AWS RDS, you can modify autovacuum parameters through parameter groups. Tables with frequent updates might benefit from more aggressive autovacuum settings:

ALTER TABLE high_activity_table SET (autovacuum_vacuum_scale_factor = 0.1);
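To confirm the override took effect, you can inspect per-table storage options in pg_class:

SELECT relname, reloptions
FROM pg_class
WHERE reloptions IS NOT NULL;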

This approach reduces the frequency and severity of bloat, making VACUUM FULL operations less necessary over time.

When to avoid VACUUM FULL

VACUUM FULL isn't always the right solution for performance or storage issues. Understanding when to avoid it can save time and prevent unnecessary downtime.

Situations where alternatives work better:

  • Regular maintenance needs: Use autovacuum and scheduled VACUUM operations instead
  • High-availability requirements: Consider pg_repack or other online reorganization tools
  • Limited storage space: If you don't have 2x the table size available, VACUUM FULL will fail
  • Frequent update patterns: Tables with constant activity might benefit more from tuned autovacuum settings

For production systems that can't tolerate extended downtime, evaluate whether the space savings justify the operational impact. Sometimes accepting higher disk usage is more practical than scheduling regular VACUUM FULL operations.

Optimize your data infrastructure with modern tools

Managing database maintenance becomes easier with platforms that handle infrastructure complexity automatically. Modern data operations focus on automation and reliability rather than manual intervention.

At Lob, we've found that automated data management reduces the operational overhead of maintaining high-performance databases. Our platform handles the underlying infrastructure so teams can focus on building features rather than managing maintenance windows.

Whether you're running VACUUM FULL operations or optimizing other database tasks, the right tooling makes these processes more predictable and less disruptive to your applications.

Frequently asked questions about PostgreSQL VACUUM FULL

What is the difference between VACUUM and VACUUM FULL in PostgreSQL?

VACUUM marks dead rows as reusable space and can run alongside normal database operations, while VACUUM FULL physically rewrites the entire table to reclaim disk space but requires an exclusive lock that blocks all access to the table.

Does VACUUM FULL in PostgreSQL block other database queries?

Yes, VACUUM FULL takes an exclusive lock on the table being processed, which prevents all read and write operations on that table until the operation completes.

How long does VACUUM FULL take to run on large PostgreSQL tables?

The duration depends on table size, available system resources, and disk speed, but large tables can take several hours to process, which is why careful planning and scheduling during low-traffic periods is important.

Can you safely cancel a VACUUM FULL operation once it starts?

Technically yes: VACUUM FULL runs inside a transaction, so cancelling it rolls back the rewrite and leaves the original table intact. However, all progress is discarded, so you spend lock time without reclaiming any space; plan for the operation to run to completion.
