Government of India

Technology and Features

Built entirely on Google Cloud Platform with unsupervised machine learning, privacy-preserving technology, and explainable AI for transparent governance.

Key Capabilities

Powerful features for intelligent welfare monitoring

1. Anomaly Detection Without Labeled Fraud Data

JanAvlokan uses unsupervised learning to identify deviations from normal behavior. This removes the dependency on pre-labeled fraud datasets, which are rare and delayed in public finance systems.

Autoencoder-based anomaly detection in BigQuery ML
Mean Squared Error (MSE) as anomaly score
Rule-based fallback for explainability
No dependency on labeled training data

2. Privacy-Safe Collusion Detection

The platform detects patterns such as shared bank accounts or devices using irreversibly hashed identifiers, enabling detection of coordinated misuse while fully preserving beneficiary privacy.

All identifiers are irreversibly hashed
No PII enters the cloud
Location data generalized to regions
Compliant with data protection principles
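
The idea that hashed identifiers can still reveal shared accounts can be sketched as follows. This is a minimal illustration, not the platform's implementation: the source does not name the hash function, so a keyed HMAC-SHA256 is assumed here, and `SECRET_KEY`, `hash_identifier`, and `shared_accounts` are hypothetical names.

```python
import hashlib
import hmac
from collections import defaultdict

# Assumption: a secret key held outside the cloud; the hashing scheme is not given in the source.
SECRET_KEY = b"replace-with-a-secret-kept-outside-the-cloud"

def hash_identifier(raw_id: str) -> str:
    """Return a keyed one-way digest (HMAC-SHA256) of an identifier."""
    return hmac.new(SECRET_KEY, raw_id.encode("utf-8"), hashlib.sha256).hexdigest()

def shared_accounts(records, min_beneficiaries=2):
    """Group hashed beneficiaries by hashed account; keep accounts used by several of them."""
    groups = defaultdict(set)
    for beneficiary_hash, account_hash in records:
        groups[account_hash].add(beneficiary_hash)
    return {acct: members for acct, members in groups.items()
            if len(members) >= min_beneficiaries}

# Hashing happens before upload; only digests are compared afterwards.
records = [
    (hash_identifier("beneficiary-A"), hash_identifier("account-001")),
    (hash_identifier("beneficiary-B"), hash_identifier("account-001")),
    (hash_identifier("beneficiary-C"), hash_identifier("account-002")),
]
clusters = shared_accounts(records)
```

Because equal inputs produce equal digests, the grouping works without ever seeing the raw account number.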

3. Policy-Aware Risk Calibration

Risk thresholds dynamically adapt based on scheme type, region, and time period. Seasonal surges and policy-driven variations are accounted for to reduce false positives.

Scheme-specific thresholds
Regional baseline adjustments
Seasonal variation handling
Continuous model recalibration
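
One way to picture context-dependent thresholds is a lookup keyed by scheme, region, and month, calibrated from historical scores. The sketch below is an assumption-laden illustration: the nearest-rank quantile, the 0.75 quantile level, the default threshold, and the names `SCHEME_X` and `District-A` are all hypothetical and not taken from the platform.

```python
import math
from collections import defaultdict

def nearest_rank_quantile(values, q):
    """Return the q-th quantile of values using the nearest-rank method."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]

def calibrate_thresholds(history, q=0.75):
    """Build one threshold per (scheme, region, month) from historical risk scores."""
    grouped = defaultdict(list)
    for scheme, region, month, score in history:
        grouped[(scheme, region, month)].append(score)
    return {key: nearest_rank_quantile(scores, q) for key, scores in grouped.items()}

def is_flagged(score, key, thresholds, default=0.9):
    """Flag a score only if it exceeds the threshold calibrated for its own context."""
    return score > thresholds.get(key, default)

# Illustrative history: month 10 stands in for a seasonal surge in a hypothetical scheme.
history = (
    [("SCHEME_X", "District-A", 6, s) for s in (0.10, 0.20, 0.15, 0.30)]
    + [("SCHEME_X", "District-A", 10, s) for s in (0.50, 0.60, 0.55, 0.70)]
)
thresholds = calibrate_thresholds(history)
```

With these sample values, a score of 0.58 is flagged in month 6 but not in the surge month, which is the false-positive reduction the feature describes.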

4. Explainable Audit Narratives

Each flagged case is accompanied by a human-readable explanation that outlines the contributing behavioral signals. Explanations are written for administrative review, audits, and legal defensibility.

Feature importance breakdowns
Behavioral signal explanations
Audit-ready documentation
Legal defensibility focus
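
A feature importance breakdown can be approximated by asking which features the model reconstructed worst. The sketch below assumes per-feature squared reconstruction error as the importance measure; the function name, feature names, and wording of the narrative are illustrative and not the platform's actual output.

```python
def explain(original, reconstructed, top_k=2):
    """Rank features by their share of squared reconstruction error and describe the top ones."""
    errors = {name: (original[name] - reconstructed[name]) ** 2 for name in original}
    total = sum(errors.values()) or 1.0  # avoid division by zero for perfect reconstructions
    ranked = sorted(errors.items(), key=lambda item: item[1], reverse=True)[:top_k]
    parts = [f"{name} ({err / total:.0%} of reconstruction error)" for name, err in ranked]
    return "Flagged mainly due to: " + "; ".join(parts)

# Hypothetical feature values for one beneficiary and the model's reconstruction of them.
observed = {"claims_per_week": 9.0, "distinct_dealers": 4.0, "districts_visited": 1.0}
expected = {"claims_per_week": 3.0, "distinct_dealers": 2.0, "districts_visited": 1.0}
narrative = explain(observed, expected)
```

In practice the features would need to be on comparable scales before their errors are compared this way.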

5. Geographic Risk Heatmaps

Aggregated risk scores are visualized at district or block levels, allowing administrators to identify regional concentrations of anomalous behavior and allocate audit resources efficiently.

District-level visualization
Block-level drill-down
Resource allocation insights
Regional trend analysis
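
The aggregation behind such a heatmap can be sketched as a group-by over a region column. The row layout and the field names `district`, `block`, and `risk_score` are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

def aggregate_by_region(rows, level="district"):
    """Average risk scores per region; level can be 'district' or 'block'."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[level]].append(row["risk_score"])
    return {region: {"mean_risk": mean(scores), "cases": len(scores)}
            for region, scores in buckets.items()}

# Hypothetical scored rows with generalized locations only.
rows = [
    {"district": "District-A", "block": "Block-1", "risk_score": 0.9},
    {"district": "District-A", "block": "Block-2", "risk_score": 0.7},
    {"district": "District-B", "block": "Block-3", "risk_score": 0.2},
]
district_view = aggregate_by_region(rows)
block_view = aggregate_by_region(rows, level="block")
```

The same function serves the district view and the block-level drill-down by switching the `level` argument.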

6. Real-Time Processing

Built on Google Cloud Platform for scalability and reliability, JanAvlokan can process 100M+ transactions using distributed computing and optimized data pipelines.

100M+ transaction capacity
Distributed ETL pipelines
Real-time risk scoring
Batch prediction support

7. CSV Quick Scan

Upload beneficiary transaction data in CSV format for instant anomaly scoring. The system validates columns, runs ML inference via the Vertex AI endpoint, and returns per-row risk levels and flags within seconds.

Drag-and-drop CSV upload
Automatic column validation
Per-beneficiary risk breakdown
Instant summary with High/Medium/Low counts
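
The validate, score, and summarize flow can be sketched as below. The required column names, the High/Medium/Low cut-offs, and the `demo_score` function are assumptions; in the platform the score would come from the Vertex AI endpoint rather than a local stand-in.

```python
import csv
import io
from collections import Counter

REQUIRED_COLUMNS = {"beneficiary_hash", "scheme", "amount"}  # assumed schema

def quick_scan(csv_text, score_fn, high=0.7, medium=0.4):
    """Validate columns, score every row, and return per-row results plus level counts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"Missing columns: {sorted(missing)}")
    results = []
    for row in reader:
        score = score_fn(row)
        level = "High" if score >= high else "Medium" if score >= medium else "Low"
        results.append({"beneficiary_hash": row["beneficiary_hash"],
                        "score": score, "level": level})
    return results, Counter(r["level"] for r in results)

def demo_score(row):
    """Stand-in for model inference: scales the amount into the 0-1 range."""
    return min(float(row["amount"]) / 10000, 1.0)

sample = "beneficiary_hash,scheme,amount\nh1,S1,9000\nh2,S1,5000\nh3,S1,1000\n"
results, summary = quick_scan(sample, demo_score)
```

Passing the scorer in as a function keeps the validation and summary logic testable without a live endpoint.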

8. Audit Panel with ML Feedback Loop

Officers can review each flagged beneficiary in an integrated Audit Panel, where they can verify fraud, dismiss false positives, or escalate cases. The feedback is stored and used to retrain the models, forming an active-learning loop.

Verify / Dismiss / Escalate actions
Feedback accuracy tracking
Comment-based audit trail
Model retraining via feedback loop
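
How officer decisions could feed both accuracy tracking and retraining is sketched below. The action strings, the precision metric, and the rule that escalated cases stay unlabeled are assumptions for illustration.

```python
from collections import Counter

# Assumption: verified cases become positive labels and dismissed cases negative ones;
# escalated cases stay unlabeled until they are resolved.
LABELS = {"verify": 1, "dismiss": 0}

def feedback_precision(feedback):
    """Share of decided cases that officers confirmed, or None if nothing is decided yet."""
    counts = Counter(action for _, action in feedback)
    decided = counts["verify"] + counts["dismiss"]
    return counts["verify"] / decided if decided else None

def training_labels(feedback):
    """Turn officer decisions into labels that a later retraining job could consume."""
    return {case_id: LABELS[action] for case_id, action in feedback if action in LABELS}

feedback = [("case-1", "verify"), ("case-2", "dismiss"),
            ("case-3", "verify"), ("case-4", "escalate")]
precision = feedback_precision(feedback)
labels = training_labels(feedback)
```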

9. Report Builder & Transaction Linker

Generate audit-ready reports with a collaborative rich-text editor. Link specific flagged transactions as evidence, add findings with severity ratings, and export reports to PDF or DOCX for official submission.

Rich-text collaborative editor
Transaction evidence linking
Findings panel with severity ratings
Export to PDF and DOCX formats

10. Automated Email Alerts

Automatically notify district-level officials via email when high-risk anomalies are detected in their jurisdiction. Includes risk summary, flagged beneficiary details, and direct links to the dashboard for immediate action.

District-wise recipient management
Email preview before sending
Batch send to all officials at once
Integrated with Gmail API
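
Composing one alert email per district can be sketched with the standard library alone. The recipient address, dashboard URL, and field names are placeholders. The last helper base64url-encodes the message into a `raw` field, which is the form the Gmail API send method accepts; the API call itself is left out of the sketch.

```python
import base64
from email.message import EmailMessage

def build_alert(recipient, district, alerts, dashboard_url):
    """Compose a plain-text summary email for one district's high-risk alerts."""
    msg = EmailMessage()
    msg["To"] = recipient
    msg["Subject"] = f"[JanAvlokan] {len(alerts)} high-risk alert(s) in {district}"
    lines = [f"- {a['beneficiary_hash']}: score {a['risk_score']:.2f}, {a['reason']}"
             for a in alerts]
    msg.set_content("\n".join(lines + ["", f"Dashboard: {dashboard_url}"]))
    return msg

def to_gmail_raw(msg):
    """Wrap the message as a base64url-encoded 'raw' payload."""
    return {"raw": base64.urlsafe_b64encode(msg.as_bytes()).decode("ascii")}

# Placeholder values for illustration only.
alerts = [{"beneficiary_hash": "hash-0001", "risk_score": 0.91,
           "reason": "Multiple dealers detected"}]
message = build_alert("officer@example.invalid", "District-A", alerts,
                      "https://example.invalid/dashboard")
payload = to_gmail_raw(message)
```

Building the message separately from sending it also supports the preview step before a batch send.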

Cloud-Native Architecture

Scalable, secure, and designed for enterprise-grade deployments

S.No. | Service | Description
1 | Cloud Storage | Secure storage of anonymized raw datasets with encryption at rest
2 | Dataflow | Distributed ETL and feature engineering pipelines for scalable processing
3 | BigQuery | Analytics warehouse handling 100M+ transactions with partitioning and clustering
4 | Vertex AI | Model training, versioning, batch prediction, and inference endpoints
5 | Cloud Run | Serverless backend APIs for dashboard and real-time data access
6 | Gmail API | Automated email alerts to district officials when high-risk transactions are detected
7 | Web Dashboard | Interactive interface for anomalies, explanations, and regional insights

Machine Learning Approach

Autoencoder-based unsupervised anomaly detection with BigQuery ML

Primary Anomaly Detector

Autoencoders

Neural networks trained in BigQuery ML that learn to compress and reconstruct normal beneficiary behavior patterns. High reconstruction error (MSE) indicates anomalous behavior.
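
As a minimal sketch of the scoring idea, the snippet below computes the mean squared error between a feature vector and its reconstruction. The function name and sample values are illustrative assumptions; in the platform the reconstruction would come from the trained autoencoder rather than a hard-coded list.

```python
def reconstruction_mse(original, reconstructed):
    """Mean squared error between a feature vector and its reconstruction."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("vectors must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(original, reconstructed)) / len(original)

# Hypothetical normalized features for two beneficiaries.
typical = reconstruction_mse([0.2, 0.1, 0.3], [0.2, 0.1, 0.3])  # reconstructed exactly
unusual = reconstruction_mse([0.9, 0.8, 0.1], [0.3, 0.2, 0.1])  # reconstructed poorly
```

A model that has only learned normal behavior reconstructs the unusual vector badly, so its error is higher.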

Fallback & Explainability

Rule-Based Detection

Deterministic rule engine that generates human-readable flags (high recent activity, multiple dealers, cross-district usage) for audit explanations.
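
A deterministic rule engine of this kind can be sketched as a list of labeled predicates. The three labels mirror the flags named above; the feature names and numeric thresholds are assumptions chosen only for illustration.

```python
# Each rule pairs a human-readable flag with a predicate over a feature dictionary.
# The thresholds are illustrative assumptions, not the platform's configured values.
RULES = [
    ("High recent activity", lambda f: f["claims_last_7d"] > 5),
    ("Multiple dealers detected", lambda f: f["distinct_dealers"] > 2),
    ("Cross-district usage", lambda f: f["distinct_districts"] > 1),
]

def rule_flags(features):
    """Return the labels of all rules that fire, in a fixed and reproducible order."""
    return [label for label, predicate in RULES if predicate(features)]

flags = rule_flags({"claims_last_7d": 8, "distinct_dealers": 1, "distinct_districts": 3})
```

Because every flag maps to one explicit condition, an auditor can trace exactly why it was raised.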

Ensemble Output

The hybrid ensemble combines the outputs of the autoencoder and the rule-based engine to produce:

Risk Score

(0-1 normalized)

Risk Category

(Low/Medium/High)

Feature Signals

(Explainable)
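
One possible way to blend the two signals into a normalized score and a category is sketched below. The source does not describe the combination rule, so the weighted average, the error cap, the 0.7 model weight, and the category cut-offs are all assumptions.

```python
def ensemble_score(mse, mse_cap, flags_raised, total_rules, model_weight=0.7):
    """Blend a capped, normalized reconstruction error with the share of rules that fired."""
    model_part = min(mse / mse_cap, 1.0)    # scale the error into the 0-1 range
    rule_part = flags_raised / total_rules  # fraction of rules triggered
    return model_weight * model_part + (1.0 - model_weight) * rule_part

def risk_category(score, medium=0.4, high=0.7):
    """Map a 0-1 score to Low, Medium, or High using assumed cut-offs."""
    if score >= high:
        return "High"
    if score >= medium:
        return "Medium"
    return "Low"

# Hypothetical inputs: large error with two of three rules fired, then a quiet case.
strong = ensemble_score(mse=2.0, mse_cap=1.0, flags_raised=2, total_rules=3)
quiet = ensemble_score(mse=0.1, mse_cap=1.0, flags_raised=0, total_rules=3)
```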

Feature Signals Analyzed

  • 1. Rolling claim frequency patterns
  • 2. Deviation from personal baselines
  • 3. Deviation from scheme-level baselines
  • 4. Cross-scheme overlap detection
  • 5. Hashed shared identifier analysis
  • 6. Temporal spike indicators
  • 7. Geographic clustering signals
  • 8. Behavioral sequence modeling

Alert System & Risk Categorization

Real-time fraud pattern detection and intelligent alert prioritization

Five Key Fraud Risk Categories

JanAvlokan identifies and categorizes anomalous behavior into five distinct risk patterns, each representing a different mode of potential fraud or system misuse:

Unusual Activity (32%): Abnormal transaction patterns, spikes, or irregular claim timing

Suspicious Locations (24%): Geographic inconsistencies or claims from unexpected regions

Scheme Overlaps (18%): Multiple scheme enrollments with conflicting eligibility criteria

Beneficiary Clusters (15%): Groups sharing bank accounts, devices, or other identifiers

Repeat Withdrawals (11%): Excessive claim frequency beyond normal beneficiary behavior

High-Priority Alert System

The platform continuously monitors transactions and generates real-time alerts when high-risk patterns are detected. Each alert includes:

Beneficiary Identifier

Anonymized hash for tracking while preserving privacy

Risk Score & Category

Numerical score (0-1) and HIGH/MEDIUM/LOW classification

Alert Type Description

Human-readable explanation (e.g., "Multiple dealers detected")

Timestamp

Exact detection time for audit trail purposes
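
The four alert fields map naturally onto a small record type. The sketch below is one hypothetical shape for it; the class name, field names, and validation rules are assumptions, not the platform's schema.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class Alert:
    beneficiary_hash: str  # anonymized identifier, never raw PII
    risk_score: float      # normalized to the 0-1 range
    risk_category: str     # "HIGH", "MEDIUM", or "LOW"
    description: str       # human-readable alert type
    detected_at: datetime  # timezone-aware detection time for the audit trail

    def __post_init__(self):
        if not 0.0 <= self.risk_score <= 1.0:
            raise ValueError("risk_score must be between 0 and 1")
        if self.risk_category not in {"HIGH", "MEDIUM", "LOW"}:
            raise ValueError("risk_category must be HIGH, MEDIUM, or LOW")

alert = Alert("hash-0001", 0.91, "HIGH", "Multiple dealers detected",
              datetime.now(timezone.utc))
record = asdict(alert)  # plain dictionary, ready for storage or serialization
```

Making the record immutable means an alert cannot be altered after it enters the audit trail.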

Trend Analysis & Weekly Monitoring

7%: Average Weekly Increase in Fraud Detection
<1 min: Alert Generation Time from Transaction
24/7: Continuous Real-Time Monitoring

Temporal trend analysis tracks weekly changes in fraud patterns, enabling proactive policy adjustments and resource allocation.
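
Week-over-week change can be computed as in the sketch below. The weekly counts are made-up sample values and are unrelated to the figures reported above.

```python
def weekly_changes(counts):
    """Percent change between consecutive weekly counts; None where the prior week is zero."""
    changes = []
    for previous, current in zip(counts, counts[1:]):
        changes.append((current - previous) / previous * 100 if previous else None)
    return changes

def average_change(changes):
    """Mean of the defined weekly changes, or None if there are none."""
    defined = [c for c in changes if c is not None]
    return sum(defined) / len(defined) if defined else None

# Made-up weekly counts of flagged cases.
changes = weekly_changes([200, 210, 231])
average = average_change(changes)
```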

Privacy-First Design

Privacy is central to JanAvlokan's architecture. The system ensures compliance with data protection principles while maintaining analytical effectiveness.

This approach enables powerful anomaly detection while fully preserving beneficiary privacy and maintaining the trust essential for government systems.

Privacy Measures

1. No PII in Cloud

No personally identifiable information enters the cloud infrastructure

2. Irreversible Hashing

All sensitive identifiers are irreversibly hashed before processing

3. Location Generalization

Location data is generalized into regional clusters for privacy

4. Human-in-the-Loop

Outputs are strictly advisory with human decision-making

Scalability and Performance

Designed for national-scale deployment

100M+: Transactions Processed
<1s: Risk Score Generation
99.9%: Uptime SLA
Auto-scaling: Enabled
