Back to ER Diagrams
Analytics & AI Logic

Analytics & AI Intelligence Logic

Comprehensive analytics engine with KPI dashboards, predictive insights, spend analysis, vendor intelligence, and AI-powered recommendations for procurement optimization.

20 PostgreSQL Tables/Views
8 MongoDB Collections
Agentic AI + 8 Models
Real-Time Dashboards

1. Overview

The Analytics & AI Intelligence module is the decision-support backbone of ProKure. It aggregates procurement data across all modules, computes real-time KPIs, detects anomalies, predicts prices and delivery risks, and provides AI-driven vendor recommendations. The module powers executive dashboards, scheduled reports, and a Phase 2 natural language query interface.

Key Features

  • Executive KPI Dashboard with real-time refresh
  • Spend Analysis Engine with category/vendor/period drill-down
  • Vendor Intelligence & Recommendation Engine
  • Price Prediction using historical bid data
  • Procurement Cycle Analytics (PR to PO cycle time)
  • Anomaly Detection (unusual prices, quantities, patterns)
  • Natural Language Query Interface (Phase 2)
  • Mobile Push Analytics & Alerts (Phase 2)
  • Predictive Delivery Risk Scoring
  • Budget Forecasting & Variance Analysis

2. Database Architecture

The analytics module uses a dedicated PostgreSQL analytics schema for structured metrics and materialized views, plus MongoDB collections for AI model management, training datasets, and flexible document storage.

PostgreSQL Analytics Schema

analytics.kpi_snapshots

Stores computed KPI values at regular intervals for dashboard display and trend analysis.

  • id UUID PK DEFAULT gen_random_uuid()
  • tenant_id UUID FK NOT NULL
  • snapshot_date DATE NOT NULL
  • kpi_code VARCHAR(50) NOT NULL — e.g., PROCUREMENT_CYCLE_TIME, COST_SAVINGS_PCT, ON_TIME_DELIVERY_PCT, FIRST_PASS_QC_PCT, INVOICE_PROCESSING_DAYS, AVG_VENDOR_RATING
  • kpi_value DECIMAL(18,4) NOT NULL
  • kpi_target DECIMAL(18,4)
  • kpi_previous_value DECIMAL(18,4) — previous period value for trend arrow
  • trend VARCHAR(10) — UP, DOWN, STABLE (pre-computed)
  • unit VARCHAR(20) — "days", "%", "score"
  • variance_pct DECIMAL(8,4)
  • dimension_key VARCHAR(100) — e.g., project_id, vendor_id, category_id for drill-down
  • dimension_value VARCHAR(255)
  • period_type VARCHAR(20) — DAILY, WEEKLY, MONTHLY, QUARTERLY, YEARLY
  • calculated_at TIMESTAMPTZ DEFAULT NOW()
  • metadata JSONB

Indexes: (tenant_id, kpi_code, snapshot_date), (tenant_id, period_type, snapshot_date), (dimension_key, dimension_value)

analytics.spend_analysis

Aggregated spend data by vendor, category, project, department, and period for drill-down analysis.

  • id UUID PK
  • tenant_id UUID FK NOT NULL
  • analysis_period_start DATE NOT NULL
  • analysis_period_end DATE NOT NULL
  • dimension_type VARCHAR(20) NOT NULL — VENDOR, CATEGORY, PROJECT, DEPARTMENT, OVERALL
  • vendor_id UUID FK
  • vendor_name VARCHAR(200) — DENORM: avoids JOIN to vendor.vendors
  • vendor_code VARCHAR(50) — DENORM
  • category_id UUID FK
  • category_name VARCHAR(200) — DENORM
  • project_id UUID FK
  • project_name VARCHAR(200) — DENORM
  • project_code VARCHAR(50) — DENORM
  • department_id UUID FK
  • department_name VARCHAR(200) — DENORM
  • total_spend DECIMAL(18,4) NOT NULL
  • po_count INTEGER
  • invoice_count INTEGER — total invoices in period
  • avg_po_value DECIMAL(18,4)
  • max_po_value DECIMAL(18,4) — largest PO
  • min_po_value DECIMAL(18,4) — smallest PO
  • savings_amount DECIMAL(18,4) — L1 bid - awarded price
  • savings_pct DECIMAL(8,4)
  • maverick_spend DECIMAL(18,4) — off-contract spend
  • contract_spend DECIMAL(18,4)
  • spot_spend DECIMAL(18,4)
  • direct_spend DECIMAL(18,4)
  • currency_code VARCHAR(3) DEFAULT 'INR'
  • calculated_at TIMESTAMPTZ

Indexes: (tenant_id, analysis_period_start), (vendor_id), (category_id), (project_id)

analytics.price_history

Historical price records from bids, POs, contracts, and market data for trend analysis and prediction training.

  • id UUID PK
  • tenant_id UUID FK NOT NULL
  • material_id UUID FK NOT NULL
  • material_name VARCHAR(200) NOT NULL — DENORM
  • material_code VARCHAR(50) — DENORM
  • vendor_id UUID FK
  • vendor_name VARCHAR(200) — DENORM
  • unit_price DECIMAL(18,4) NOT NULL
  • currency_code VARCHAR(3) DEFAULT 'INR'
  • quantity DECIMAL(18,4)
  • uom_id UUID FK
  • uom_code VARCHAR(20) — DENORM
  • source_type VARCHAR(20) NOT NULL — BID, PO, CONTRACT, MARKET
  • source_id UUID — rfq_bid_id or po_item_id or contract_id
  • source_display_number VARCHAR(50) — DENORM: e.g. "RFQ-2026-00145"
  • effective_date DATE NOT NULL
  • created_at TIMESTAMPTZ DEFAULT NOW()

Indexes: (tenant_id, material_id, effective_date), (vendor_id, material_id), (source_type)

analytics.price_predictions

ML-generated price forecasts with confidence scores and post-facto accuracy tracking.

  • id UUID PK
  • tenant_id UUID FK NOT NULL
  • material_id UUID FK NOT NULL
  • material_name VARCHAR(200) NOT NULL — DENORM
  • material_code VARCHAR(50) — DENORM
  • category_id UUID — DENORM: material category for filtering
  • category_name VARCHAR(200) — DENORM
  • predicted_price DECIMAL(18,4) NOT NULL
  • last_known_price DECIMAL(18,4) — DENORM: most recent actual price
  • price_change_pct DECIMAL(8,4) — DENORM: predicted vs last known
  • confidence_score DECIMAL(5,4) — 0.0000 to 1.0000
  • prediction_date DATE NOT NULL — the date being predicted
  • model_version VARCHAR(20) NOT NULL
  • model_type VARCHAR(50) — LINEAR_REGRESSION, ARIMA, PROPHET, ENSEMBLE
  • input_data_points INTEGER
  • training_period_start DATE
  • training_period_end DATE
  • actual_price DECIMAL(18,4) — filled post-facto for accuracy tracking
  • accuracy_pct DECIMAL(8,4) — filled post-facto
  • created_at TIMESTAMPTZ DEFAULT NOW()

Indexes: (tenant_id, material_id, prediction_date), (model_version)

analytics.vendor_recommendations

AI-generated vendor suggestions ranked by composite score for purchase requisitions.

  • id UUID PK
  • tenant_id UUID FK NOT NULL
  • pr_id UUID FK — for which PR is this recommendation
  • pr_number VARCHAR(50) — DENORM: e.g. "PR-2026-00145"
  • pr_item_id UUID FK
  • material_id UUID FK
  • material_name VARCHAR(200) — DENORM
  • material_code VARCHAR(50) — DENORM
  • recommended_vendor_id UUID FK NOT NULL
  • vendor_name VARCHAR(200) NOT NULL — DENORM
  • vendor_code VARCHAR(50) — DENORM
  • vendor_is_msme BOOLEAN — DENORM: MSME flag
  • recommendation_rank INTEGER NOT NULL — 1, 2, 3...
  • score DECIMAL(8,4) NOT NULL
  • price_score DECIMAL(8,4) — DENORM: individual factor
  • quality_score DECIMAL(8,4) — DENORM
  • delivery_score DECIMAL(8,4) — DENORM
  • compliance_score DECIMAL(8,4) — DENORM
  • score_breakdown JSONB NOT NULL — e.g., {"rating": 0.35, "price": 0.30, "delivery": 0.20, "quality": 0.15}
  • reason TEXT
  • historical_pos INTEGER — how many POs with this vendor for this material
  • avg_historical_price DECIMAL(18,4)
  • avg_delivery_days INTEGER
  • last_quality_rating DECIMAL(5,2) — DENORM: latest QC pass rate
  • is_accepted BOOLEAN
  • accepted_at TIMESTAMPTZ
  • created_at TIMESTAMPTZ DEFAULT NOW()
  • expires_at TIMESTAMPTZ

Indexes: (tenant_id, pr_id), (recommended_vendor_id), (material_id)

analytics.anomaly_detections

Detected anomalies across procurement data with severity classification and acknowledgement tracking.

  • id UUID PK
  • tenant_id UUID FK NOT NULL
  • anomaly_type VARCHAR(50) NOT NULL — PRICE_SPIKE, QUANTITY_UNUSUAL, VENDOR_PATTERN, APPROVAL_DELAY, DELIVERY_RISK, SPEND_SURGE
  • severity VARCHAR(10) NOT NULL — LOW, MEDIUM, HIGH, CRITICAL
  • entity_type VARCHAR(30) NOT NULL — PR, RFQ, BID, PO, INVOICE, VENDOR
  • entity_id UUID NOT NULL
  • entity_display_number VARCHAR(50) — DENORM: "PR-2026-00145", "PO-2026-00312"
  • entity_display_name VARCHAR(200) — DENORM
  • vendor_id UUID — DENORM: related vendor (if applicable)
  • vendor_name VARCHAR(200) — DENORM
  • project_id UUID — DENORM: related project
  • project_name VARCHAR(200) — DENORM: for dashboard filtering
  • description TEXT NOT NULL
  • expected_value DECIMAL(18,4)
  • actual_value DECIMAL(18,4)
  • deviation_pct DECIMAL(8,4)
  • detection_model VARCHAR(50)
  • is_acknowledged BOOLEAN DEFAULT FALSE
  • acknowledged_by UUID FK
  • acknowledged_by_name VARCHAR(150) — DENORM
  • acknowledged_at TIMESTAMPTZ
  • is_false_positive BOOLEAN
  • resolution_notes TEXT
  • detected_at TIMESTAMPTZ DEFAULT NOW()

Indexes: (tenant_id, anomaly_type, detected_at), (entity_type, entity_id), (severity, is_acknowledged)

analytics.procurement_cycle_metrics

End-to-end procurement cycle timestamps and duration calculations per PR/PO.

  • id UUID PK
  • tenant_id UUID FK NOT NULL
  • pr_id UUID FK NOT NULL
  • pr_number VARCHAR(50) — DENORM
  • rfq_id UUID FK
  • rfq_number VARCHAR(50) — DENORM
  • po_id UUID FK
  • po_number VARCHAR(50) — DENORM
  • project_id UUID FK
  • project_name VARCHAR(200) — DENORM
  • vendor_id UUID — DENORM: awarded vendor
  • vendor_name VARCHAR(200) — DENORM
  • department_id UUID — DENORM
  • department_name VARCHAR(200) — DENORM
  • pr_created_at TIMESTAMPTZ
  • pr_approved_at TIMESTAMPTZ
  • rfq_created_at TIMESTAMPTZ
  • rfq_published_at TIMESTAMPTZ
  • bids_received_at TIMESTAMPTZ
  • bid_count INTEGER — DENORM: number of bids received
  • evaluation_completed_at TIMESTAMPTZ
  • po_created_at TIMESTAMPTZ
  • po_approved_at TIMESTAMPTZ
  • po_released_at TIMESTAMPTZ
  • vendor_acknowledged_at TIMESTAMPTZ
  • pr_to_approval_hours DECIMAL(10,2)
  • approval_to_rfq_hours DECIMAL(10,2)
  • rfq_to_bids_hours DECIMAL(10,2)
  • bids_to_evaluation_hours DECIMAL(10,2)
  • evaluation_to_po_hours DECIMAL(10,2)
  • po_to_release_hours DECIMAL(10,2)
  • total_cycle_hours DECIMAL(10,2)
  • total_cycle_days DECIMAL(8,2)
  • po_type VARCHAR(20) — STANDARD, DIRECT, SPOT, BLANKET, CONTRACT_RELEASE, EMERGENCY
  • bottleneck_stage VARCHAR(50)
  • total_pr_amount DECIMAL(18,4) — DENORM: PR value for analytics
  • total_po_amount DECIMAL(18,4) — DENORM: PO value

Indexes: (tenant_id, pr_id), (po_type), (project_id), (total_cycle_days)

analytics.budget_forecasts

ML-powered project budget forecasts with variance analysis and risk classification.

  • id UUID PK
  • tenant_id UUID FK NOT NULL
  • project_id UUID FK NOT NULL
  • project_name VARCHAR(200) NOT NULL — DENORM
  • project_code VARCHAR(50) — DENORM
  • department_id UUID — DENORM
  • department_name VARCHAR(200) — DENORM
  • forecast_date DATE NOT NULL
  • original_budget DECIMAL(18,4)
  • committed_amount DECIMAL(18,4) — approved POs
  • spent_amount DECIMAL(18,4) — paid invoices
  • pending_amount DECIMAL(18,4) — pending PRs + RFQs
  • forecasted_total DECIMAL(18,4) — ML predicted final cost
  • variance_amount DECIMAL(18,4)
  • variance_pct DECIMAL(8,4)
  • utilization_pct DECIMAL(8,4) — DENORM: (committed+spent)/original×100
  • remaining_budget DECIMAL(18,4) — DENORM: original - committed - spent
  • confidence_score DECIMAL(5,4)
  • risk_level VARCHAR(10) — LOW, MEDIUM, HIGH
  • forecast_model VARCHAR(50)
  • currency_code VARCHAR(3) DEFAULT 'INR'
  • created_at TIMESTAMPTZ DEFAULT NOW()

Indexes: (tenant_id, project_id, forecast_date)

analytics.delivery_risk_scores

Predictive delivery risk scores for open POs based on multi-factor ML model.

  • id UUID PK
  • tenant_id UUID FK NOT NULL
  • po_id UUID FK NOT NULL
  • po_number VARCHAR(50) NOT NULL — DENORM
  • po_total_amount DECIMAL(18,4) — DENORM: PO value
  • vendor_id UUID FK NOT NULL
  • vendor_name VARCHAR(200) NOT NULL — DENORM
  • vendor_code VARCHAR(50) — DENORM
  • project_id UUID — DENORM: related project
  • project_name VARCHAR(200) — DENORM
  • material_summary VARCHAR(500) — DENORM: top materials on this PO
  • risk_score DECIMAL(5,4) — 0.0 to 1.0, higher = more risk
  • risk_level VARCHAR(10) — LOW, MEDIUM, HIGH, CRITICAL
  • vendor_history_score DECIMAL(5,4) — 30% weight
  • lead_time_score DECIMAL(5,4) — 20%
  • complexity_score DECIMAL(5,4) — 15%
  • capacity_score DECIMAL(5,4) — 15%
  • season_score DECIMAL(5,4) — 10%
  • distance_score DECIMAL(5,4) — 10%
  • risk_factors JSONB — e.g., {"vendor_history": 0.3, "distance": 0.1, "material_complexity": 0.2, "season": 0.1}
  • predicted_delivery_date DATE
  • committed_delivery_date DATE
  • delay_probability_pct DECIMAL(5,2)
  • predicted_delay_days INTEGER
  • model_version VARCHAR(20)
  • scored_at TIMESTAMPTZ DEFAULT NOW()
  • actual_delivery_date DATE — filled post-facto
  • was_accurate BOOLEAN — filled post-facto

Indexes: (tenant_id, po_id), (vendor_id), (risk_level)

Materialized Views

analytics.mv_vendor_spend_summary

Refreshes: Every 6 hours via Hangfire

  • tenant_id, vendor_id, vendor_name
  • total_spend_ytd, total_spend_mtd
  • po_count_ytd, avg_po_value
  • on_time_delivery_pct, quality_score, overall_rating
  • last_po_date, top_category, vendor_code, msme_flag

analytics.mv_category_spend_summary

Refreshes: Every 6 hours

  • tenant_id, category_id, category_name
  • total_spend_ytd, total_spend_prev_year, vendor_count, avg_unit_price
  • price_trend (UP/DOWN/STABLE), yoy_change_pct

analytics.mv_monthly_procurement_summary

Refreshes: Daily

  • tenant_id, year_month
  • total_prs, total_rfqs, total_pos, total_spend
  • avg_cycle_days, savings_pct, on_time_pct, avg_bids_per_rfq, rejection_rate

analytics.mv_project_budget_status

Refreshes: Every 2 hours

  • tenant_id, project_id, project_name, project_code
  • department_name, original_budget, committed, spent, pending_amount, remaining
  • utilization_pct, forecast_total, variance_pct, confidence_score, risk_level, currency_code

MongoDB AI Collections (prokure_app DB)

ai_training_datasets

Training data prepared for ML model training. TTL: None (persistent)

  • dataset_id — unique identifier
  • model_type — PRICE_PREDICTION, VENDOR_RECOMMENDATION, ANOMALY_DETECTION, DELIVERY_RISK, BUDGET_FORECAST
  • tenant_id, version
  • status — PREPARING, READY, TRAINING, COMPLETED, FAILED
  • training_data — array of data points
  • feature_columns, target_column
  • record_count, data_range (start_date, end_date)
  • created_at, updated_at

ai_model_registry

Registry of all trained AI/ML models with versioning and deployment status. TTL: None

  • model_id, model_type, model_name, version
  • algorithm — LINEAR_REGRESSION, RANDOM_FOREST, XGBOOST, ARIMA, PROPHET, LSTM, ENSEMBLE
  • hyperparameters (JSONB), training_metrics (accuracy, rmse, mae, r2_score)
  • validation_metrics, training_duration_seconds
  • dataset_id (reference), artifact_path (model file path)
  • status — TRAINING, VALIDATED, DEPLOYED, DEPRECATED
  • deployed_at, deprecated_at
  • created_by, created_at

ai_prediction_logs

Audit trail of all predictions made by AI models. TTL: 180 days

  • prediction_id, model_id (reference), model_type
  • tenant_id, input_features, predicted_value, confidence_score
  • actual_value (filled post-facto), error_amount
  • latency_ms, created_at
  • user_id — user who triggered the prediction
  • entity_type, entity_id, entity_display_number — DENORM

nlp_query_history

Natural language queries and their parsed SQL translations. TTL: 365 days

  • query_id, tenant_id, user_id
  • natural_query — the user's question in plain English, e.g., "Show me top 5 vendors by spend this quarter"
  • parsed_intent — SPEND_ANALYSIS, VENDOR_COMPARISON, PRICE_TREND, DELIVERY_STATUS, BUDGET_STATUS
  • extracted_entities — [{type: "vendor"|"category"|"period"|"metric", value: "..."}]
  • generated_sql — the SQL/aggregation pipeline generated
  • result_summary, result_count
  • feedback — HELPFUL, NOT_HELPFUL, WRONG_RESULT
  • response_time_ms, created_at

mobile_push_analytics

Mobile push notification tracking for analytics alerts. TTL: 90 days

  • push_id, tenant_id, user_id, device_id
  • notification_type — KPI_ALERT, ANOMALY_ALERT, APPROVAL_PENDING, DELIVERY_UPDATE, BUDGET_WARNING
  • title, body, data_payload
  • platform — IOS, ANDROID
  • status — QUEUED, SENT, DELIVERED, OPENED, FAILED
  • sent_at, delivered_at, opened_at
  • fcm_token, apns_token
  • deep_link — app route for notification tap
  • action_taken — APPROVED, REJECTED, VIEWED, DISMISSED, NONE
  • created_at

user_analytics_preferences

Per-user dashboard preferences, pinned KPIs, and alert settings. TTL: None

  • user_id, tenant_id
  • default_dashboard_id
  • pinned_kpis — array of kpi_codes
  • favorite_reports — array of report_ids
  • alert_thresholds — [{kpi_code, operator (GT/LT/EQ), threshold_value, channels (EMAIL/PUSH/IN_APP)}]
  • digest_frequency — DAILY, WEEKLY, MONTHLY, NONE
  • timezone, locale
  • mobile_settings — {push_enabled, quiet_hours_start, quiet_hours_end}
  • updated_at, created_at

3. KPI Dashboard Engine

The KPI engine calculates six core metrics every 5 minutes via Hangfire, stores snapshots for trend analysis, caches results in Redis, and pushes updates to connected dashboards via SignalR.

1

Procurement Cycle Time

Average days from PR creation to PO release. Target: < 14 days. Alert: > 21 days.

-- Average days from PR creation to PO release
SELECT AVG(total_cycle_days) AS avg_cycle_days
FROM analytics.procurement_cycle_metrics
WHERE tenant_id = :tenant_id
  AND pr_created_at >= :period_start
  AND pr_created_at < :period_end;
-- Target: < 14 days
-- Alert: > 21 days
2

Cost Savings Percentage

Savings = (L1 Bid Amount - Awarded Amount) / L1 Bid Amount * 100. Target: > 5%.

-- Savings = (L1 Bid Amount - Awarded Amount) / L1 Bid Amount * 100
SELECT
  SUM(l1_bid_total - awarded_total) / NULLIF(SUM(l1_bid_total), 0) * 100 AS savings_pct
FROM analytics.spend_analysis
WHERE tenant_id = :tenant_id
  AND analysis_period_start >= :start;
-- Target: > 5%
3

On-Time Delivery %

Deliveries on or before committed date / Total deliveries * 100. Target: > 95%.

-- Deliveries on or before committed date / Total deliveries * 100
SELECT
  COUNT(*) FILTER (WHERE actual_delivery_date <= committed_delivery_date) * 100.0 /
  NULLIF(COUNT(*), 0) AS on_time_pct
FROM analytics.delivery_risk_scores
WHERE tenant_id = :tenant_id
  AND actual_delivery_date IS NOT NULL;
-- Target: > 95%
4

First-Pass QC Rate

GRN items passing quality check on first inspection. Target: > 90%.

SELECT COUNT(*) FILTER (WHERE qc_result = 'PASSED') * 100.0 /
  NULLIF(COUNT(*), 0) AS first_pass_pct
FROM procurement.grn_items
WHERE tenant_id = :tenant_id;
-- Target: > 90%
5

Invoice Processing Time

Average days from invoice submission to approval. Target: < 7 days.

SELECT AVG(EXTRACT(EPOCH FROM (approved_at - submitted_at)) / 86400) AS avg_days
FROM finance.invoices
WHERE tenant_id = :tenant_id
  AND status = 'APPROVED';
-- Target: < 7 days
6

Average Vendor Rating

Mean vendor performance score across all active vendors. Target: > 4.0 / 5.0.

SELECT AVG(overall_score) AS avg_rating
FROM vendor.vendor_scores
WHERE tenant_id = :tenant_id
  AND period_type = 'MONTHLY';
-- Target: > 4.0 / 5.0

KPI Refresh Flow

Hangfire Scheduler (every 5 min)
    |
    v
Calculate KPIs (SQL queries)
    |
    v
Store in kpi_snapshots
    |
    v
Cache in Redis (5-min TTL)
    |
    v
Dashboard API reads from Redis
    |
    v
SignalR pushes updates to UI

4. Spend Analysis Engine

The Spend Analysis Engine aggregates procurement spend across vendors, categories, projects, and departments. It classifies spend types and refreshes materialized views on a schedule.

PO Released --> Trigger spend_analysis update
    |
    v
Aggregate by: Vendor, Category, Project, Department, Period
    |
    v
Calculate: Total Spend, Savings, Maverick %, Contract %
    |
    v
Store in spend_analysis table
    |
    v
Refresh mv_vendor_spend_summary (6h)
    |
    v
Refresh mv_category_spend_summary (6h)

Spend Classification Rules

Spend Type Definition Column
Contract Spend PO linked to an active contract contract_spend
Direct Spend PO via DIRECT path (no RFQ) direct_spend
Spot Spend PO via SPOT path spot_spend
Maverick Spend PO outside approved contracts/catalogs maverick_spend

5. Price Prediction Model

ML-powered price forecasting uses historical bid, PO, and contract data to predict future material prices with confidence scoring.

1. Data Collection
   +-- analytics.price_history (historical bids, POs, contracts)

2. Feature Engineering
   +-- Material category
   +-- Vendor region
   +-- Order quantity
   +-- Season/month
   +-- Historical price trend (3mo, 6mo, 12mo)
   +-- Market index (if available)
   +-- Inflation rate

3. Model Training (Monthly via Hangfire)
   +-- Algorithm: Ensemble (Linear Regression + ARIMA)
   +-- Train/Test: 80/20 split
   +-- Validation: Cross-validation (5-fold)
   +-- Store model in ai_model_registry

4. Prediction
   +-- Input: material_id, quantity, target_date
   +-- Output: predicted_price, confidence_score
   +-- Store in price_predictions table

5. Accuracy Tracking
   +-- Compare predicted vs actual (post-PO)
   +-- Update accuracy_pct
   +-- Retrain if accuracy < 80%

6. Vendor Recommendation Engine

AI-driven vendor suggestions based on a weighted composite scoring algorithm combining performance, price, delivery, and quality factors.

Scoring Algorithm

Vendor Score = Sum(Weight x Normalized Score)
Factor Weight Calculation
Performance Rating 35% vendor_scores.overall_score
Price Competitiveness 30% 1 - (avg_price / max_price)
Delivery Reliability 20% on_time_deliveries / total
Quality Score 15% first_pass_qc_rate

Filters Applied

  • Vendor status = ACTIVE
  • Vendor is_compliant = TRUE
  • Vendor mapped to material category
  • Vendor not blacklisted
  • If MSME_COMPLIANCE enabled: MSME vendors get +5% bonus

Recommendation Flow

PR Approved --> Extract material_ids
    |
    v
Query vendor_recommendations cache
    |
    v
If stale (>24h) --> Recalculate:
    +-- Get eligible vendors per material
    +-- Calculate score per vendor
    +-- Rank top 5
    +-- Store in vendor_recommendations
    |
    v
Return ranked list to UI
    |
    v
User accepts/rejects --> Track in is_accepted

7. Anomaly Detection System

Automated detection of unusual patterns in procurement data using both rule-based and ML-based approaches.

Detection Rules

Anomaly Type Detection Method Trigger Condition Severity
PRICE_SPIKE Z-score > 2.5 on bid price vs material avg Bid price > mean + 2.5 sigma HIGH
QUANTITY_UNUSUAL IQR method on PR quantities Qty outside Q1-1.5xIQR to Q3+1.5xIQR MEDIUM
VENDOR_PATTERN Frequency analysis Same vendor awarded >60% in category MEDIUM
APPROVAL_DELAY SLA breach detection Approval pending > configured SLA hours HIGH
DELIVERY_RISK ML risk score risk_score > 0.7 HIGH
SPEND_SURGE Month-over-month comparison Spend > 150% of 3-month rolling avg CRITICAL

Detection Flow

Real-Time Events (PO, Bid, Invoice)
    |
    v
Anomaly Detection Service
    +-- Rule-Based Engine (immediate)
    |   +-- Z-score, IQR, threshold checks
    +-- ML-Based Engine (batch, daily)
        +-- Isolation Forest, clustering
    |
    v
Store in anomaly_detections
    |
    v
If severity HIGH/CRITICAL --> Push notification
    |
    v
Dashboard: Anomaly inbox for review

8. Delivery Risk Scoring

Predictive delivery risk model scores open POs daily to proactively identify potential delays.

Risk Factor Model

Risk Score (0.0 - 1.0) = Sum(Factor Weight x Factor Score)
Factor Weight Data Source
Vendor History 30% Past on-time % for this vendor
Material Lead Time 20% Historical lead time data
Order Complexity 15% Line items, custom specs
Vendor Capacity 15% Concurrent POs count
Season/Weather 10% Month, region factors
Distance 10% Vendor location to delivery

Risk Classification

Score Range Risk Level Color
0.0 - 0.3LOWGreen
0.3 - 0.5MEDIUMYellow
0.5 - 0.7HIGHOrange
0.7 - 1.0CRITICALRed

9. Budget Forecasting

ML-powered budget forecasting predicts final project costs by combining committed, spent, and predicted remaining amounts.

Forecasted Total = Committed + Spent + Predicted Remaining

Where:
- Committed = SUM(approved PO amounts not yet invoiced)
- Spent     = SUM(paid invoice amounts)
- Predicted Remaining = ML prediction based on:
    +-- Historical burn rate
    +-- Pending PR amounts
    +-- Open RFQ expected values
    +-- Category price trends

Budget Alert Thresholds

  • Warning at 80% utilization (configurable)
  • Block new PRs at 110% utilization (configurable)
  • Push notification to project manager on threshold breach

10. Natural Language Query (Phase 2)

Users can ask procurement questions in plain English. The NLP engine parses intent, extracts entities, generates SQL, and returns results.

Query Examples

// Example 1
User Query: "Show top 5 vendors by spend this quarter"
Intent:     SPEND_ANALYSIS
Entities:   [top_n: 5, dimension: vendor, period: this_quarter]
SQL:        SELECT vendor_name, total_spend
            FROM analytics.mv_vendor_spend_summary
            WHERE tenant_id = :t
            ORDER BY total_spend_ytd DESC LIMIT 5

// Example 2
User Query: "Which materials had price increase >10% last month?"
Intent:     PRICE_TREND
Entities:   [metric: price_change, threshold: 10%, period: last_month]
SQL:        SELECT material_name, price_change_pct
            FROM analytics.price_history ...

// Example 3
User Query: "What's the delivery risk for PO-2026-00145?"
Intent:     DELIVERY_STATUS
Entities:   [entity: PO, id: PO-2026-00145]
SQL:        SELECT * FROM analytics.delivery_risk_scores
            WHERE po_id = ...

11. Mobile Analytics (Phase 2)

Push notification rules and mobile-specific analytics features for on-the-go procurement management.

Push Notification Rules

Notification Type Trigger Recipient
KPI_ALERTKPI breaches thresholdDashboard owner
ANOMALY_ALERTCritical anomaly detectedProcurement admin
APPROVAL_PENDINGApproval waiting >4hApprover
DELIVERY_UPDATEASN dispatchedReceiving team
BUDGET_WARNINGBudget >80%Project manager

Mobile-Specific Features

Capabilities

  • Offline KPI snapshot caching
  • Swipe to approve/reject from notification
  • Voice query for NLP interface
  • Biometric auth for mobile access

12. Report Definitions

Standard report catalog with data sources, refresh schedules, and export formats.

Report ID Report Name Data Source Refresh Export
RPT-001PR Status Summaryprocurement.purchase_requestsReal-timeCSV, PDF
RPT-002RFQ Analysisprocurement.rfqs + bidsReal-timeCSV, PDF, XLSX
RPT-003PO Registerprocurement.purchase_ordersReal-timeCSV, PDF, XLSX
RPT-004Vendor Performancevendor.vendor_scoresMonthlyPDF
RPT-005Spend Analysisanalytics.spend_analysis6hCSV, XLSX
RPT-006Price Trendanalytics.price_historyDailyCSV, PDF
RPT-007Procurement Cycle Timeanalytics.procurement_cycle_metricsDailyPDF
RPT-008Budget vs Actualanalytics.budget_forecasts2hPDF
RPT-009Anomaly Reportanalytics.anomaly_detectionsReal-timePDF
RPT-010Delivery Riskanalytics.delivery_risk_scoresDailyPDF
RPT-011MSME Compliancevendor + finance tablesMonthlyPDF
RPT-012Invoice Agingfinance.invoicesDailyCSV, PDF
RPT-013Savings Dashboardanalytics.spend_analysis6hPDF
RPT-014Vendor Recommendationanalytics.vendor_recommendations24hPDF
RPT-015Project Budget Statusanalytics.mv_project_budget_status2hPDF

13. Caching Strategy

Redis-based caching layer for analytics endpoints with configurable TTLs per data type.

Cache Key Pattern: prokure:{tenant_id}:analytics:{type}:{params_hash}

KPI Dashboard:        TTL = 5 minutes
Spend Summary:        TTL = 30 minutes
Vendor Rankings:      TTL = 1 hour
Price Predictions:    TTL = 6 hours
Report Cache:         TTL = custom (stored in report_cache MongoDB)
Materialized Views:   Refreshed by Hangfire schedule

14. API Endpoints

RESTful API endpoints for the analytics module with caching details.

Method Endpoint Description Cache
GET/api/analytics/dashboardExecutive KPI dashboard5min Redis
GET/api/analytics/kpis/{code}Single KPI with history5min
GET/api/analytics/spendSpend analysis with filters30min
GET/api/analytics/spend/drill-downDrill by vendor/category/period30min
GET/api/analytics/price-trend/{materialId}Price history + prediction6h
GET/api/analytics/vendor-recommendations/{prId}AI vendor suggestions24h
GET/api/analytics/anomaliesAnomaly inboxReal-time
POST/api/analytics/anomalies/{id}/acknowledgeMark anomaly reviewed-
GET/api/analytics/delivery-risk/{poId}PO delivery risk scoreDaily
GET/api/analytics/budget-forecast/{projectId}Budget forecast2h
GET/api/analytics/cycle-metricsProcurement cycle analysisDaily
POST/api/analytics/reports/generateGenerate report-
GET/api/analytics/reports/{id}/downloadDownload report-
POST/api/analytics/nlp-queryNatural language query (Phase 2)-
POST/api/analytics/crm-syncCRM sync-back-
GET/api/analytics/vendor-rating/{vendorId}Vendor rating details1h

15. Hangfire Job Schedule

Background job definitions for analytics data processing, model training, and report generation.

Job Schedule Action
RefreshKpiSnapshotsEvery 5 minutesCalculate all 6 KPIs, store snapshots
RefreshSpendAnalysisEvery 6 hoursAggregate spend by all dimensions
RefreshMaterializedViewsmv_vendor: 6h, mv_category: 6h, mv_monthly: daily, mv_budget: 2hRefresh PostgreSQL materialized views
CalculateVendorRatingsMonthly (1st)Recalculate all vendor scores
TrainPriceModelsMonthly (15th)Retrain price prediction models
DetectAnomaliesDaily (2 AM)Run ML anomaly detection batch
ScoreDeliveryRisksDaily (6 AM)Score all open POs
ForecastBudgetsDaily (4 AM)Update project budget forecasts
GenerateScheduledReportsPer report configGenerate and email scheduled reports
CleanupPredictionLogsWeekly (Sunday)TTL cleanup for MongoDB collections
RefreshVendorRecommendationsDaily (3 AM)Pre-compute recommendations for open PRs
SyncCrmDataOn event + Daily (midnight)Sync project/PO data to CRM

16. Business Rules

Core business rules governing the analytics and AI intelligence module.

BR-ANL-001

  • Dashboard refresh every 5 minutes (configurable per tenant)

BR-ANL-002

  • Reports exportable to CSV, XLSX, PDF formats

BR-ANL-003

  • CRM sync on project closure is mandatory

BR-ANL-004

  • Vendor rating required before project closure

BR-ANL-005

  • Historical data retained for 7 years

BR-ANL-006

  • Price prediction requires minimum 10 historical data points

BR-ANL-007

  • Vendor recommendation requires minimum 3 eligible vendors

BR-ANL-008

  • Anomaly detection runs daily; rule-based checks run real-time

BR-ANL-009

  • Budget forecast alert at 80%, block at 110% (configurable)

BR-ANL-010

  • NLP query limited to read-only operations

BR-ANL-011

  • Mobile push respects user quiet hours

BR-ANL-012

  • AI model retraining triggered if accuracy drops below 80%

17. Agentic AI / Autonomous Procurement (Phase 2)

AI agents that can autonomously execute procurement workflows with human-in-the-loop oversight. Inspired by SAP Joule, Coupa Navi, and Ivalua IVA capabilities for next-generation intelligent procurement.

Agentic AI Capabilities

  • Autonomous RFQ Creation: AI agent creates RFQs from approved PRs, selects vendors, sets timelines
  • Intelligent Bid Evaluation: Auto-score bids, identify anomalies, generate award recommendations
  • Auto-PO Generation: Create POs from winning bids with configurable approval thresholds
  • Proactive Supplier Communication: Send reminders, request updates, negotiate terms
  • Exception Handling: Detect and resolve common issues autonomously
  • Natural Language Interface: Conversational procurement via chat/voice
  • Learning from Feedback: Improve decisions based on user corrections

Agentic AI Database Schema

analytics.ai_agents

Registered AI agents with their capabilities, permissions, and configuration.

  • id UUID PK DEFAULT gen_random_uuid()
  • tenant_id UUID FK NOT NULL
  • agent_code VARCHAR(50) NOT NULL UNIQUE — e.g., RFQ_CREATOR, BID_EVALUATOR, PO_GENERATOR
  • agent_name VARCHAR(100) NOT NULL
  • agent_description TEXT
  • agent_type VARCHAR(30) NOT NULL — WORKFLOW, RECOMMENDATION, COMMUNICATION, ANALYSIS
  • capabilities JSONB NOT NULL — ["CREATE_RFQ", "SELECT_VENDORS", "EVALUATE_BIDS", "CREATE_PO"]
  • allowed_actions JSONB NOT NULL — specific actions agent can take
  • approval_thresholds JSONB — {"auto_approve_below": 50000, "require_human_above": 100000}
  • max_autonomous_value DECIMAL(18,4) — max transaction value for auto-execution
  • model_version VARCHAR(20) — LLM model version
  • system_prompt TEXT — agent's system instructions
  • tool_definitions JSONB — available tools/functions
  • is_active BOOLEAN DEFAULT TRUE
  • requires_human_approval BOOLEAN DEFAULT TRUE
  • confidence_threshold DECIMAL(5,4) DEFAULT 0.8500 — min confidence for auto-action
  • created_at TIMESTAMPTZ DEFAULT NOW()
  • updated_at TIMESTAMPTZ

Indexes: (tenant_id, agent_code), (agent_type, is_active)

analytics.ai_agent_executions

Execution history of AI agent actions with full audit trail.

  • id UUID PK DEFAULT gen_random_uuid()
  • tenant_id UUID FK NOT NULL
  • agent_id UUID FK NOT NULL
  • agent_code VARCHAR(50) — DENORM
  • execution_type VARCHAR(30) NOT NULL — SCHEDULED, TRIGGERED, MANUAL, CONVERSATIONAL
  • trigger_event VARCHAR(100) — what triggered this execution
  • trigger_entity_type VARCHAR(30) — PR, RFQ, BID, PO, etc.
  • trigger_entity_id UUID
  • trigger_entity_number VARCHAR(50) — DENORM
  • input_context JSONB NOT NULL — data provided to agent
  • reasoning_chain JSONB — chain-of-thought reasoning
  • proposed_actions JSONB NOT NULL — what agent wants to do
  • executed_actions JSONB — what was actually executed
  • output_entities JSONB — created/modified entities [{type, id}]
  • confidence_score DECIMAL(5,4)
  • execution_status VARCHAR(20) NOT NULL — PENDING, APPROVED, REJECTED, EXECUTED, FAILED, PARTIAL
  • requires_approval BOOLEAN DEFAULT TRUE
  • approved_by UUID FK
  • approved_at TIMESTAMPTZ
  • rejection_reason TEXT
  • error_message TEXT
  • execution_time_ms INTEGER
  • token_usage JSONB — {"input": 1500, "output": 500}
  • cost_usd DECIMAL(10,6) — LLM cost
  • started_at TIMESTAMPTZ DEFAULT NOW()
  • completed_at TIMESTAMPTZ

Indexes: (tenant_id, agent_id, started_at), (execution_status), (trigger_entity_type, trigger_entity_id)

analytics.ai_recommendations

AI-generated recommendations for human review (actions not auto-executed).

  • id UUID PK DEFAULT gen_random_uuid()
  • tenant_id UUID FK NOT NULL
  • agent_execution_id UUID FK
  • agent_id UUID FK NOT NULL
  • recommendation_type VARCHAR(50) NOT NULL — VENDOR_SELECTION, BID_AWARD, PRICE_NEGOTIATION, RISK_MITIGATION, PROCESS_IMPROVEMENT
  • entity_type VARCHAR(30) NOT NULL
  • entity_id UUID NOT NULL
  • entity_display_number VARCHAR(50) — DENORM
  • title VARCHAR(200) NOT NULL
  • description TEXT NOT NULL
  • reasoning TEXT — why this recommendation
  • supporting_data JSONB — evidence for recommendation
  • recommended_action JSONB NOT NULL — proposed action
  • alternative_actions JSONB — other options
  • confidence_score DECIMAL(5,4) NOT NULL
  • priority VARCHAR(10) — LOW, MEDIUM, HIGH, CRITICAL
  • potential_savings DECIMAL(18,4)
  • potential_risk DECIMAL(18,4)
  • status VARCHAR(20) NOT NULL — PENDING, ACCEPTED, REJECTED, EXPIRED
  • reviewed_by UUID FK
  • reviewed_at TIMESTAMPTZ
  • user_feedback TEXT
  • expires_at TIMESTAMPTZ
  • created_at TIMESTAMPTZ DEFAULT NOW()

Indexes: (tenant_id, status, recommendation_type), (entity_type, entity_id), (expires_at)

analytics.ai_autonomous_actions

Actions executed autonomously by AI agents without human approval.

  • id UUID PK DEFAULT gen_random_uuid()
  • tenant_id UUID FK NOT NULL
  • agent_execution_id UUID FK NOT NULL
  • agent_id UUID FK NOT NULL
  • action_type VARCHAR(50) NOT NULL — CREATE_RFQ, ADD_VENDOR, SEND_REMINDER, UPDATE_STATUS, CREATE_PO, etc.
  • target_entity_type VARCHAR(30) NOT NULL
  • target_entity_id UUID
  • created_entity_type VARCHAR(30)
  • created_entity_id UUID
  • created_entity_number VARCHAR(50) — DENORM
  • action_details JSONB NOT NULL
  • before_state JSONB
  • after_state JSONB
  • transaction_value DECIMAL(18,4)
  • confidence_score DECIMAL(5,4) NOT NULL
  • auto_approved_reason TEXT — why auto-approval was allowed
  • reversible BOOLEAN DEFAULT TRUE
  • reversed BOOLEAN DEFAULT FALSE
  • reversed_by UUID FK
  • reversed_at TIMESTAMPTZ
  • reversal_reason TEXT
  • executed_at TIMESTAMPTZ DEFAULT NOW()

Indexes: (tenant_id, action_type, executed_at), (agent_execution_id), (target_entity_type, target_entity_id)

analytics.ai_learning_feedback

User feedback on AI actions for continuous learning and improvement.

  • id UUID PK DEFAULT gen_random_uuid()
  • tenant_id UUID FK NOT NULL
  • agent_id UUID FK NOT NULL
  • related_execution_id UUID FK
  • related_recommendation_id UUID FK
  • related_action_id UUID FK
  • feedback_type VARCHAR(30) NOT NULL — RATING, CORRECTION, PREFERENCE, REJECTION_REASON
  • rating INTEGER — 1-5 scale
  • was_helpful BOOLEAN
  • original_output JSONB — what AI produced
  • corrected_output JSONB — what user wanted
  • correction_reason TEXT
  • learning_signal JSONB — extracted preferences
  • incorporated BOOLEAN DEFAULT FALSE — used for fine-tuning
  • incorporated_at TIMESTAMPTZ
  • user_id UUID FK NOT NULL
  • user_name VARCHAR(150) — DENORM
  • created_at TIMESTAMPTZ DEFAULT NOW()

Indexes: (tenant_id, agent_id, feedback_type), (incorporated), (created_at)

Agentic AI Workflow

Agentic AI Procurement Workflow
================================

1. TRIGGER EVENT
   PR Approved / RFQ Deadline / Bid Received / etc.
         |
         v
2. AGENT ACTIVATION
   +-- Select appropriate agent (RFQ_CREATOR, BID_EVALUATOR, etc.)
   +-- Load agent configuration and thresholds
   +-- Gather context data
         |
         v
3. AI REASONING (Chain-of-Thought)
   +-- Analyze context
   +-- Consider business rules
   +-- Evaluate options
   +-- Generate proposed actions
         |
         v
4. CONFIDENCE CHECK
   +-- If confidence >= threshold AND value <= auto_limit
   |       |
   |       +-- AUTO-EXECUTE
   |       +-- Log to ai_autonomous_actions
   |       +-- Notify user (optional)
   |
   +-- Else
           |
           +-- CREATE RECOMMENDATION
           +-- Log to ai_recommendations
           +-- Await human approval
         |
         v
5. FEEDBACK LOOP
   +-- Collect user feedback
   +-- Store in ai_learning_feedback
   +-- Periodic model fine-tuning

Available AI Agents

Agent Trigger Actions Auto-Execute
RFQ_CREATOR PR Approved Create RFQ, Select vendors, Set timeline If value < ₹50K
VENDOR_SELECTOR RFQ Draft Recommend vendors, Check compliance Recommendations only
BID_EVALUATOR Bid Received Score bids, Detect anomalies, Rank Recommendations only
PO_GENERATOR Bid Awarded Create PO, Set terms, Route approval If value < ₹25K
COMMUNICATION_AGENT Various Send reminders, Request updates Yes (communications)
EXCEPTION_HANDLER Error/Delay Diagnose issue, Suggest resolution Recommendations only

Agentic AI Safety Controls

  • All autonomous actions logged with full audit trail
  • Value-based thresholds for auto-execution
  • Confidence score requirements before auto-action
  • Human-in-the-loop for high-value/high-risk decisions
  • All actions reversible within configurable window
  • Rate limiting to prevent runaway executions
  • Kill switch to disable all autonomous actions

Agentic AI API Endpoints

MethodEndpointDescription
GET/api/ai/agentsList configured agents
PUT/api/ai/agents/{id}/configUpdate agent configuration
GET/api/ai/executionsList agent executions
GET/api/ai/recommendationsList pending recommendations
POST/api/ai/recommendations/{id}/acceptAccept recommendation
POST/api/ai/recommendations/{id}/rejectReject recommendation
GET/api/ai/actionsList autonomous actions
POST/api/ai/actions/{id}/reverseReverse an action
POST/api/ai/feedbackSubmit feedback
POST/api/ai/chatConversational AI interface
POST/api/ai/agents/{code}/triggerManually trigger agent

MongoDB Collections for Agentic AI

ai_agent_logs

Detailed logs of AI agent operations for debugging and auditing.

  • _id ObjectId PK
  • tenant_id UUID indexed
  • agent_id UUID indexed
  • execution_id UUID indexed
  • log_level String — DEBUG, INFO, WARN, ERROR
  • step String — INIT, CONTEXT_LOAD, REASONING, ACTION_PROPOSE, EXECUTE
  • message String
  • details Object
  • llm_request Object — full LLM request
  • llm_response Object — full LLM response
  • timestamp Date indexed
  • ttl_expiry Date — 90-day retention

ai_conversation_history

Conversation history for chat-based AI interface.

  • _id ObjectId PK
  • tenant_id UUID indexed
  • user_id UUID indexed
  • session_id String indexed
  • messages Array — [{role, content, timestamp}]
  • context Object — current context
  • actions_taken Array — actions during conversation
  • started_at Date
  • last_message_at Date indexed
  • ttl_expiry Date — 30-day retention