Strategies for table partitioning, index optimization, read replicas, connection pooling, materialized views, and data archival to handle 10M+ records efficiently.
With 254 PostgreSQL tables and projected growth to 10M+ records in high-volume tables (audit_logs, po_items, grn_items), database performance optimization is critical for maintaining sub-500ms API response times.
Partition large tables by date to improve query performance and enable efficient data archival.
| Table | Partition Key | Strategy | Retention |
|---|---|---|---|
| audit_logs | created_at | Monthly partitions | 2 years active, then archive |
| po_items | created_at | Quarterly partitions | 5 years |
| grn_items | grn_date | Quarterly partitions | 5 years |
| invoice_items | invoice_date | Quarterly partitions | 7 years (statutory) |
| auction_bids | bid_time | Monthly partitions | 1 year active |
| notification_logs | sent_at | Monthly partitions | 6 months |
```sql
-- Create partitioned audit_logs table.
-- Note: in PostgreSQL, the primary key of a partitioned table must
-- include the partition key, hence PRIMARY KEY (id, created_at).
CREATE TABLE audit_logs (
    id          UUID NOT NULL DEFAULT gen_random_uuid(),
    entity_type VARCHAR(100) NOT NULL,
    entity_id   UUID NOT NULL,
    action      VARCHAR(50) NOT NULL,
    old_values  JSONB,
    new_values  JSONB,
    user_id     UUID NOT NULL,
    tenant_id   UUID NOT NULL,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);

-- Create monthly partitions
CREATE TABLE audit_logs_2026_01 PARTITION OF audit_logs
    FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE audit_logs_2026_02 PARTITION OF audit_logs
    FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');

-- Auto-create partitions (pg_partman extension)
SELECT partman.create_parent(
    p_parent_table := 'public.audit_logs',
    p_control      := 'created_at',
    p_type         := 'native',
    p_interval     := '1 month',
    p_premake      := 3
);
```
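pg_partman only pre-creates partitions when its maintenance routine runs, so that routine must be scheduled. One option is the pg_cron extension (an assumption here — any external scheduler works equally well):

```sql
-- Run pg_partman maintenance hourly so new partitions exist well
-- before they are needed (pg_cron extension assumed; job name is
-- illustrative).
SELECT cron.schedule(
    'partman-maintenance',
    '0 * * * *',
    $$CALL partman.run_maintenance_proc()$$
);
```

Running maintenance hourly is deliberately more frequent than the monthly partition interval; the cost is negligible and it keeps the `p_premake := 3` buffer of future partitions topped up.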
Strategic indexing for frequently queried columns while avoiding over-indexing that slows writes.
| Table | Index | Type | Purpose |
|---|---|---|---|
| purchase_orders | idx_po_tenant_status | B-tree (composite) | List POs by tenant + status |
| purchase_orders | idx_po_vendor | B-tree | Vendor-wise PO lookup |
| vendors | idx_vendor_search | GIN (tsvector) | Full-text search on name + GSTIN |
| products | idx_product_category | B-tree | Category filtering |
| audit_logs | idx_audit_entity | B-tree (composite) | Entity-wise audit trail |
| rfq_responses | idx_rfq_vendor_price | B-tree | Price comparison queries |
```sql
-- Composite index for tenant + status queries (most common)
CREATE INDEX idx_po_tenant_status ON purchase_orders (tenant_id, status)
    WHERE is_deleted = FALSE;

-- Partial index for active vendors only
CREATE INDEX idx_vendor_active ON vendors (tenant_id, vendor_code)
    WHERE status = 'ACTIVE';

-- GIN index for full-text search
CREATE INDEX idx_vendor_search ON vendors USING GIN (
    to_tsvector('english', name || ' ' || COALESCE(gstin, ''))
);

-- BRIN index for time-series data (audit logs)
CREATE INDEX idx_audit_created_brin ON audit_logs USING BRIN (created_at);

-- Identify unused indexes (run weekly).
-- pg_stat_user_indexes exposes relname/indexrelname, not
-- tablename/indexname.
SELECT schemaname, relname, indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexrelname NOT LIKE '%\_pkey'
ORDER BY pg_relation_size(indexrelid) DESC;
```
Run ANALYZE after bulk imports (> 1000 rows) so the query planner works with up-to-date statistics.
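As a sketch, a bulk-import routine would target just the affected table rather than the whole database (`po_items` is used here only as an example):

```sql
-- Refresh planner statistics after a bulk load into po_items.
-- A single-table ANALYZE is far cheaper than a database-wide one.
ANALYZE po_items;

-- VERBOSE reports which tables were processed and how many rows
-- were sampled, useful when verifying the import job.
ANALYZE VERBOSE po_items;
```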
Manage database connections efficiently to prevent exhaustion under high load.
| Setting | Development | Staging | Production |
|---|---|---|---|
| Max Connections | 50 | 100 | 500 |
| Default Pool Size | 10 | 25 | 100 |
| Min Pool Size | 5 | 10 | 25 |
| Pool Mode | Session | Transaction | Transaction |
| Connection Timeout | 30s | 30s | 15s |
| Idle Timeout | 300s | 120s | 60s |
```ini
# pgbouncer.ini (Production)
[databases]
reqvise = host=127.0.0.1 port=5432 dbname=reqvise

[pgbouncer]
listen_port = 6432
listen_addr = 0.0.0.0
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt

# Pool settings
pool_mode = transaction
max_client_conn = 500
default_pool_size = 100
min_pool_size = 25
reserve_pool_size = 10
reserve_pool_timeout = 5

# Timeouts
server_connect_timeout = 15
server_idle_timeout = 60
client_idle_timeout = 300
query_timeout = 30
```
Route heavy read operations to replicas to reduce load on the primary database.
| Query Type | Target | Acceptable Lag |
|---|---|---|
| Dashboard KPIs | Read Replica | 5 minutes |
| Analytics Reports | Read Replica | 15 minutes |
| Audit Log Queries | Read Replica | 1 minute |
| Vendor Rating Calc | Read Replica | 5 minutes |
| CRUD Operations | Primary | N/A |
| Approval Workflows | Primary | N/A |
```csharp
// EF Core read-replica configuration. IConfiguration is injected so
// the context can resolve the correct connection string itself.
public class ReqviseDbContext : DbContext
{
    private readonly IConfiguration _configuration;
    private readonly bool _useReadReplica;

    public ReqviseDbContext(
        DbContextOptions<ReqviseDbContext> options,
        IConfiguration configuration,
        bool useReadReplica = false) : base(options)
    {
        _configuration = configuration;
        _useReadReplica = useReadReplica;
    }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        var connectionString = _useReadReplica
            ? _configuration["ConnectionStrings:ReadReplica"]
            : _configuration["ConnectionStrings:Primary"];
        optionsBuilder.UseNpgsql(connectionString);
    }
}

// Simple projection DTO for the grouped query below.
public record PoStatusCount(string Status, int Count);

// Service using the read replica for reports.
// Note: the built-in IDbContextFactory<T>.CreateDbContext() takes no
// arguments, so a custom factory interface is assumed here.
public class AnalyticsService
{
    private readonly ReqviseDbContext _readContext;

    public AnalyticsService(IReqviseDbContextFactory factory)
    {
        _readContext = factory.CreateDbContext(useReadReplica: true);
    }

    public async Task<List<PoStatusCount>> GetKpisAsync()
    {
        return await _readContext.PurchaseOrders
            .AsNoTracking()
            .GroupBy(p => p.Status)
            .Select(g => new PoStatusCount(g.Key, g.Count()))
            .ToListAsync();
    }
}
```
Pre-compute expensive aggregations for dashboard queries.
| Materialized View | Source Tables | Refresh Interval | Purpose |
|---|---|---|---|
| mv_spend_by_category | po_items, categories | 15 minutes | Spend analytics |
| mv_vendor_performance | vendors, grns, ratings | 1 hour | Vendor ratings |
| mv_procurement_kpis | prs, rfqs, pos | 15 minutes | Dashboard KPIs |
| mv_msme_compliance | vendors, payments | Daily | MSME reports |
```sql
-- Spend by Category materialized view
CREATE MATERIALIZED VIEW mv_spend_by_category AS
SELECT
    c.id AS category_id,
    c.name AS category_name,
    po.tenant_id,
    DATE_TRUNC('month', po.po_date) AS month,
    SUM(poi.total_amount) AS total_spend,
    COUNT(DISTINCT po.id) AS po_count
FROM po_items poi
JOIN purchase_orders po ON poi.po_id = po.id
JOIN products p ON poi.product_id = p.id
JOIN categories c ON p.category_id = c.id
WHERE po.status IN ('APPROVED', 'COMPLETED')
GROUP BY c.id, c.name, po.tenant_id, DATE_TRUNC('month', po.po_date);

-- Unique index required for REFRESH ... CONCURRENTLY
CREATE UNIQUE INDEX idx_mv_spend_unique
    ON mv_spend_by_category (category_id, tenant_id, month);

-- Refresh concurrently (readers are not blocked)
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_spend_by_category;
```
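The refresh must be triggered on the intervals listed in the table above. One option, consistent with the Hangfire jobs used elsewhere in this document, is a recurring job per view (the `MaterializedViewService` type and job name are illustrative sketches, not an existing implementation):

```csharp
// Register a Hangfire recurring job that refreshes the spend view
// every 15 minutes, matching the interval in the table above.
RecurringJob.AddOrUpdate<MaterializedViewService>(
    "refresh-mv-spend-by-category",
    s => s.RefreshAsync("mv_spend_by_category"),
    "*/15 * * * *");

public class MaterializedViewService
{
    private readonly ReqviseDbContext _dbContext;

    public MaterializedViewService(ReqviseDbContext dbContext)
        => _dbContext = dbContext;

    // viewName must come from a fixed allow-list, never user input,
    // since it is interpolated into raw SQL.
    public Task RefreshAsync(string viewName) =>
        _dbContext.Database.ExecuteSqlRawAsync(
            $"REFRESH MATERIALIZED VIEW CONCURRENTLY {viewName}");
}
```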
Move old data to archive tables to keep active tables lean and fast.
| Table | Active Period | Archive Trigger | Archive Retention |
|---|---|---|---|
| audit_logs | 2 years | Quarterly job | 7 years (compliance) |
| notification_logs | 6 months | Monthly job | 2 years |
| session_logs | 90 days | Monthly job | 1 year |
| auction_bids | 1 year | Quarterly job | 5 years |
```sql
-- Archive audit logs older than 2 years
INSERT INTO audit_logs_archive
SELECT * FROM audit_logs
WHERE created_at < NOW() - INTERVAL '2 years';

DELETE FROM audit_logs
WHERE created_at < NOW() - INTERVAL '2 years';
```

```csharp
// Register the quarterly archival job (2 AM on the 1st of every
// third month). RecurringJob.AddOrUpdate is core Hangfire; the
// attribute-based registration form requires an extension package.
RecurringJob.AddOrUpdate<ArchivalService>(
    "archive-audit-logs",
    s => s.ArchiveAuditLogsAsync(),
    "0 2 1 */3 *");

public async Task ArchiveAuditLogsAsync()
{
    await _dbContext.Database.ExecuteSqlRawAsync(ArchiveSql);
    _logger.Information("Archived audit logs older than 2 years");
}
```
| Rule ID | Category | Description |
|---|---|---|
| BR-DB-001 | Statistics | Run ANALYZE after bulk imports > 1000 rows |
| BR-DB-002 | Partitioning | Tables > 10M rows MUST be partitioned |
| BR-DB-003 | Queries | All queries MUST have execution time < 30 seconds |
| BR-DB-004 | Indexes | Unused indexes MUST be reviewed and dropped quarterly |
| BR-DB-005 | Read Replica | Analytics queries MUST use read replica |
| BR-DB-006 | Archival | Audit logs MUST be retained for 7 years (compliance) |
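BR-DB-003 can be enforced at the database level rather than left to application discipline, complementing the `query_timeout` already set in PgBouncer. A sketch (the role name `reqvise_app` is an assumption):

```sql
-- Enforce the 30-second query ceiling (BR-DB-003) for the
-- application role; queries exceeding it are cancelled server-side.
ALTER ROLE reqvise_app SET statement_timeout = '30s';

-- Individual sessions may lower the limit further for latency-
-- sensitive paths, but should not raise it.
SET statement_timeout = '10s';
```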