Performance Monitoring & Alerting System
Comprehensive performance monitoring, automated reporting, and real-time alerting for Tartware PMS.
🎯 Overview
This system provides:
- pg_qualstats Integration - WHERE clause analysis for missing indexes
- HypoPG Integration - Zero-cost hypothetical index testing
- Automated Reporting - Daily/hourly performance reports
- Anomaly Detection - Baseline-based performance degradation detection
- Real-time Alerting - Immediate alerts for critical issues
📁 File Structure
scripts/
├── triggers/
│ ├── 11_install_performance_extensions.sql # Extension installation
│ └── 12_install_all_performance_monitoring.sql # Master installer
│
├── tables/
│ ├── 23_performance_reporting_tables.sql # Reporting tables
│ └── 24_performance_alerting_tables.sql # Alerting tables
│
└── procedures/
├── 15_performance_reporting_procedures.sql # Reporting functions
└── 16_performance_alerting_procedures.sql # Alerting functions
🚀 Installation
Prerequisites
Install the required PostgreSQL extension packages at the OS level:
Ubuntu/Debian:
sudo apt-get install postgresql-16-pg-qualstats
sudo apt-get install postgresql-16-hypopg
sudo apt-get install postgresql-16-pg-cron # Optional
RHEL/CentOS:
sudo yum install pg_qualstats_16
sudo yum install hypopg_16
sudo yum install pg_cron_16 # Optional
Add to postgresql.conf:
shared_preload_libraries = 'pg_stat_statements,pg_qualstats'
Then restart PostgreSQL:
sudo systemctl restart postgresql
Install All Components
cd /home/navin/tartware/scripts/triggers
psql -U postgres -d tartware -f 12_install_all_performance_monitoring.sql
This installs:
- ✅ 4 extensions (pg_stat_statements, pg_qualstats, hypopg, pg_cron)
- ✅ 6 tables (reports, schedules, thresholds, baselines, alerts, rules)
- ✅ 17 functions (reporting, alerting, recommendations)
- ✅ 7 views (dashboards, alerts, trends)
📊 Components
1. Extension Support
Check Extension Status
SELECT * FROM v_extension_status;
Output:
extension_name | version | status | description
---------------------+---------+-----------------+---------------------------
pg_stat_statements | 1.10 | ✅ Installed | Query performance statistics
pg_qualstats | 2.1 | ✅ Installed | WHERE clause analysis
hypopg | 1.4 | ✅ Installed | Hypothetical index testing
pg_cron | 1.6 | ✅ Installed | Scheduled job automation
2. Index Recommendations
Get Automated Recommendations
SELECT * FROM v_index_recommendations;
Example Output:
priority | table_name | index_definition | reason | estimated_benefit
---------+-------------------+-----------------------------------------------+----------------------------------+------------------
1 | public.guests | CREATE INDEX idx_guests_email ON guests(email)| Column email filtered 1523 times | ⚡ VERY HIGH (10x+)
2 | public.properties | CREATE INDEX idx_properties_type ON ... | Column property_type filtered... | 🔥 HIGH (5x+)
Analyze Missing Indexes (pg_qualstats)
SELECT * FROM analyze_missing_indexes_qualstats();
Shows:
- Frequently filtered columns without indexes
- Execution time impact
- Recommended index creation commands
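If you want to inspect the raw qualifier statistics yourself, the sketch below queries the pg_qualstats view directly. It assumes the extension's default column names (lrelid, lattnum, occurences, execution_count); adjust if your installed version differs.
-- Most frequently filtered table/column pairs, straight from pg_qualstats
SELECT
    lrelid::regclass AS table_name,
    (SELECT attname FROM pg_attribute
     WHERE attrelid = lrelid AND attnum = lattnum) AS column_name,
    occurences,
    execution_count
FROM pg_qualstats
ORDER BY occurences DESC
LIMIT 10;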
Test Hypothetical Index (HypoPG)
SELECT * FROM test_hypothetical_index(
'CREATE INDEX ON reservations(guest_id, check_in_date)',
'SELECT * FROM reservations WHERE guest_id = ''00000000-0000-0000-0000-000000000000'' AND check_in_date > CURRENT_DATE'
);
Output:
index_name | cost_before | cost_after | cost_reduction_percent | recommendation
-------------------------+-------------+------------+------------------------+--------------------------------
<hypothetical index> | 1250.50 | 125.25 | 90.0 | ⚡ HIGHLY RECOMMENDED - Over 50% improvement
Key Benefits:
- ✅ Zero Cost - No disk space or CPU used
- ✅ Instant Testing - Test multiple index strategies quickly
- ✅ Query-Specific - See exact benefit for your workload
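test_hypothetical_index presumably wraps HypoPG's own functions; you can also drive them directly. A minimal sketch using the standard HypoPG API (hypopg_create_index, hypopg_reset); hypothetical indexes live only in the current session:
-- Create a hypothetical index (no disk writes, session-local)
SELECT * FROM hypopg_create_index('CREATE INDEX ON reservations(guest_id, check_in_date)');
-- EXPLAIN (without ANALYZE) will consider the hypothetical index
EXPLAIN SELECT * FROM reservations
WHERE guest_id = '00000000-0000-0000-0000-000000000000'  -- placeholder UUID
  AND check_in_date > CURRENT_DATE;
-- Discard all hypothetical indexes when done
SELECT hypopg_reset();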
3. Performance Reporting
Generate Daily Performance Report
SELECT generate_daily_performance_report();
-- Returns: report_id UUID
-- View the report
SELECT * FROM get_latest_report('DAILY_PERFORMANCE');
Report Contains:
- Database info (size, version, uptime)
- Table statistics (rows, bloat, vacuum status)
- Slow queries (top 10)
- Unused indexes
- Connection statistics
- Cache hit rates
- Index recommendations
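The report is stored as JSONB, so its sections can be explored directly. A small sketch, assuming the performance_reports columns used elsewhere in this README (report_type, generated_at, report_data):
-- List the top-level sections of the most recent daily report
SELECT jsonb_object_keys(report_data) AS section
FROM (
    SELECT report_data
    FROM performance_reports
    WHERE report_type = 'DAILY_PERFORMANCE'
    ORDER BY generated_at DESC
    LIMIT 1
) latest;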
Generate Health Check
SELECT generate_health_check_report();
-- Returns: report_id UUID
-- View the report
SELECT * FROM get_latest_report('HEALTH_CHECK');
Health Check Monitors:
- Connection pool saturation
- Long-running queries (>5 minutes)
- Table bloat levels
- Cache hit rate degradation
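The same checks can be run ad hoc with standard catalog views; for example, a quick look at queries that have been running for more than five minutes:
-- Ad-hoc check for long-running queries (>5 minutes)
SELECT
    pid,
    usename,
    state,
    NOW() - query_start AS runtime,
    LEFT(query, 80) AS query_preview
FROM pg_stat_activity
WHERE state = 'active'
  AND query_start < NOW() - INTERVAL '5 minutes'
ORDER BY runtime DESC;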
View Recent Reports
SELECT * FROM v_recent_reports;
Check Current Alerts
SELECT * FROM v_current_alerts;
Example Output:
metric_name | current_value | severity | threshold_value | alert_message
-----------------------+---------------+----------+-----------------+--------------------------------------
connection_saturation | 85.5 | WARNING | 75.0 | Connection usage at 85.5% (171 of 200)
cache_hit_rate | 87.3 | WARNING | 90.0 | Cache hit rate is 87.3% (should be > 90%)
4. Performance Alerting
Update Performance Baselines
SELECT update_performance_baselines();
Establishes baselines for:
- Query execution time (hourly average)
- Connection count (rolling average)
- Cache hit rate (daily)
- Sequential scan rate (daily)
Monitor for Degradation
SELECT * FROM monitor_performance_degradation();
Detects:
- Query performance degradation (>50% slower than baseline)
- Connection spikes (>50% above baseline)
- Cache hit rate drops (>5% below baseline)
Example Output:
alert_type | severity | message | current_value | baseline_value | action_required
--------------------+----------+----------------------------------------------+---------------+----------------+----------------------------------
QUERY_DEGRADATION | WARNING | Query slower by 87%: SELECT * FROM reserva...| 450.25 | 240.50 | Analyze query execution plan
CONNECTION_SPIKE | WARNING | Connection spike detected: 180 connections...| 180 | 100 | Check application connection pooling
View Active Alerts
SELECT * FROM v_active_performance_alerts;
Acknowledge Alerts
-- Acknowledge specific alert
SELECT acknowledge_alert('alert-uuid', 'username');
-- Acknowledge all alerts of a type
SELECT acknowledge_alerts_by_type('QUERY_DEGRADATION', 'dba');
View Performance Trends
SELECT * FROM v_performance_trends;
Shows:
- Current value vs baseline
- Status (Normal/Degraded/Spike)
- Last updated timestamp
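To see only metrics that have drifted from baseline, you can filter the view. The status column name below is an assumption based on the list above; adjust it to match the actual view definition.
-- Show only metrics that are not in a Normal state (column name assumed)
SELECT * FROM v_performance_trends WHERE status <> 'Normal';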
5. Alert Summary
SELECT * FROM v_alert_summary;
Output:
alert_type | severity | alert_count | first_occurrence | last_occurrence
--------------------+----------+-------------+-------------------------+-------------------------
QUERY_DEGRADATION | WARNING | 15 | 2025-10-15 08:00:00 | 2025-10-15 14:30:00
CONNECTION_SPIKE | CRITICAL | 3 | 2025-10-15 12:00:00 | 2025-10-15 13:15:00
📅 Automated Monitoring Setup
Option 1: System Cron
Add to crontab (crontab -e):
# Update baselines every hour
0 * * * * psql -U postgres -d tartware -c "SELECT update_performance_baselines();"
# Check for degradation every 5 minutes
*/5 * * * * psql -U postgres -d tartware -c "SELECT * FROM monitor_performance_degradation();"
# Generate hourly health check
0 * * * * psql -U postgres -d tartware -c "SELECT generate_health_check_report();"
# Generate daily performance report at 2 AM
0 2 * * * psql -U postgres -d tartware -c "SELECT generate_daily_performance_report();"
Option 2: pg_cron (Recommended)
-- Update baselines every hour
SELECT cron.schedule(
'update-baselines',
'0 * * * *',
$$SELECT update_performance_baselines();$$
);
-- Monitor performance every 5 minutes
SELECT cron.schedule(
'monitor-performance',
'*/5 * * * *',
$$SELECT monitor_performance_degradation();$$
);
-- Health check every hour
SELECT cron.schedule(
'health-check',
'0 * * * *',
$$SELECT generate_health_check_report();$$
);
-- Daily report at 2 AM
SELECT cron.schedule(
'daily-report',
'0 2 * * *',
$$SELECT generate_daily_performance_report();$$
);
-- View scheduled jobs
SELECT * FROM cron.job;
-- Unschedule a job (if needed)
SELECT cron.unschedule('monitor-performance');
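Recent pg_cron versions (1.4+) also keep a run history, which is handy for confirming that the schedules actually fire:
-- Review recent job executions
SELECT jobid, status, return_message, start_time, end_time
FROM cron.job_run_details
ORDER BY start_time DESC
LIMIT 20;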
💡 Usage Examples
Example 1: Find and Fix Missing Indexes
-- Step 1: Get recommendations
SELECT * FROM v_index_recommendations LIMIT 5;
-- Step 2: Test a hypothetical index
SELECT * FROM test_hypothetical_index(
'CREATE INDEX idx_guests_email ON guests(email)',
'SELECT * FROM guests WHERE email = ''test@example.com'''
);
-- Step 3: If recommended, create the index
CREATE INDEX CONCURRENTLY idx_guests_email ON guests(email);
-- Step 4: Verify improvement
SELECT * FROM pg_stat_user_indexes WHERE indexrelname = 'idx_guests_email';
Example 2: Investigate Slow Queries
-- Get latest performance report
SELECT * FROM get_latest_report('DAILY_PERFORMANCE');
-- Extract slow queries (from JSONB)
SELECT
jsonb_array_elements(report_data->'slow_queries') as slow_query
FROM performance_reports
WHERE report_type = 'DAILY_PERFORMANCE'
ORDER BY generated_at DESC
LIMIT 1;
-- Check for query degradation
SELECT * FROM detect_query_degradation();
Example 3: Monitor Connection Health
-- Check connection spike
SELECT * FROM detect_connection_spike();
-- View connection breakdown
SELECT
state,
COUNT(*) as connection_count,
ROUND(100.0 * COUNT(*) / SUM(COUNT(*)) OVER(), 2) as percentage
FROM pg_stat_activity
GROUP BY state
ORDER BY connection_count DESC;
-- Find idle connections
SELECT
pid,
usename,
application_name,
state,
AGE(NOW(), state_change) as idle_duration
FROM pg_stat_activity
WHERE state = 'idle'
AND state_change < NOW() - INTERVAL '10 minutes'
ORDER BY idle_duration DESC;
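If a leaked application connection needs to be cleared, the standard pg_terminate_backend() call can end it; double-check the pids before running this.
-- Terminate backends that have been idle for over an hour (use with care)
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle'
  AND state_change < NOW() - INTERVAL '1 hour';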
Example 4: Cache Performance Analysis
-- Check cache degradation
SELECT * FROM detect_cache_degradation();
-- Detailed cache statistics by table
SELECT
    schemaname,
    relname AS table_name,
    ROUND(100.0 * heap_blks_hit / NULLIF(heap_blks_hit + heap_blks_read, 0), 2) AS cache_hit_rate,
    pg_size_pretty(pg_relation_size(relid)) AS table_size
FROM pg_statio_user_tables
WHERE heap_blks_hit + heap_blks_read > 0
ORDER BY cache_hit_rate ASC
LIMIT 10;
-- Check shared_buffers usage
SELECT
name,
setting,
unit,
short_desc
FROM pg_settings
WHERE name IN ('shared_buffers', 'effective_cache_size', 'work_mem');
🔔 Alert Notifications
Email Integration (Example with pg_cron + notify)
-- Create notification function
CREATE OR REPLACE FUNCTION send_alert_notifications()
RETURNS void
LANGUAGE plpgsql
AS $$
DECLARE
v_alert RECORD;
v_message TEXT;
BEGIN
FOR v_alert IN
SELECT * FROM v_active_performance_alerts
WHERE severity IN ('CRITICAL', 'WARNING')
LOOP
v_message := FORMAT(
'ALERT: %s - %s (Current: %s, Baseline: %s)',
v_alert.alert_type,
v_alert.alert_message,
v_alert.current_value,
v_alert.baseline_value
);
-- Use NOTIFY so the application can pick it up
PERFORM pg_notify('performance_alert', v_message);
-- Or write to the PostgreSQL server log
RAISE WARNING '%', v_message;
END LOOP;
END $$;
-- Schedule notifications
SELECT cron.schedule(
'alert-notifications',
'*/5 * * * *',
$$SELECT send_alert_notifications();$$
);
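To smoke-test the channel before wiring up a consumer, you can listen and fire the function from a single psql session; psql prints any received notifications once the statement completes.
-- In a psql session:
LISTEN performance_alert;
SELECT send_alert_notifications();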
Slack/Discord Webhooks (Application Layer)
Your application can listen for PostgreSQL NOTIFY messages and forward them to webhooks:
import select

import psycopg2
import requests

conn = psycopg2.connect("dbname=tartware user=postgres")
conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)

cur = conn.cursor()
cur.execute("LISTEN performance_alert;")

while True:
    # Wait for the socket to become readable instead of busy-polling
    if select.select([conn], [], [], 60) == ([], [], []):
        continue  # timeout, check again
    conn.poll()
    while conn.notifies:
        notify = conn.notifies.pop(0)
        # Forward the alert payload to Slack
        requests.post(
            'https://hooks.slack.com/services/YOUR/WEBHOOK/URL',
            json={'text': notify.payload},
        )
📈 Performance Impact
The monitoring system itself has minimal overhead:
- Baselines Update: ~100-500ms (hourly)
- Degradation Check: ~50-200ms (every 5 minutes)
- Report Generation: ~1-3 seconds (daily/hourly)
- pg_qualstats: <1% CPU overhead
- HypoPG: Zero overhead (only during testing)
🆘 Troubleshooting
Extension Not Found
-- Check if extension files exist
SELECT * FROM pg_available_extensions WHERE name IN ('pg_qualstats', 'hypopg');
-- If not found, install at OS level first (see Installation section)
pg_qualstats Not Tracking
-- Check if it's in shared_preload_libraries
SHOW shared_preload_libraries;
-- Check if it's enabled
SHOW pg_qualstats.enabled;
-- If not, add to postgresql.conf and restart:
-- shared_preload_libraries = 'pg_stat_statements,pg_qualstats'
-- pg_qualstats.enabled = on
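Changes to shared_preload_libraries always require a restart, but pg_qualstats.enabled can normally be toggled at runtime; a hedged sketch using ALTER SYSTEM:
-- Persist the setting and reload (assumes pg_qualstats.enabled does not require a restart)
ALTER SYSTEM SET pg_qualstats.enabled = on;
SELECT pg_reload_conf();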
No Baseline Data
-- Manually establish baselines
SELECT update_performance_baselines();
-- Wait 1 hour for query statistics to accumulate
-- Then check:
SELECT * FROM performance_baselines;
High Alert Volume
-- Adjust thresholds
UPDATE performance_thresholds
SET warning_threshold = 85, -- was 75
critical_threshold = 95 -- was 90
WHERE metric_name = 'connection_saturation';
-- Or acknowledge noisy alerts
SELECT acknowledge_alerts_by_type('QUERY_DEGRADATION', 'system');
📚 Related Documentation
- Main Triggers README - Query efficiency monitoring
- PostgreSQL Documentation - pg_stat_statements
- pg_qualstats GitHub
- HypoPG GitHub
- pg_cron GitHub
Created: 2025-10-15 Version: 1.0 Status: Production Ready ✅