LLM API Testing Guide
Quick Start
Option 1: Postman
- Import `LLM_API_Tests.postman_collection.json` into Postman
- Replace placeholder IDs with your actual IDs:
  - `YOUR_EVALUATION_ID_HERE`
  - `YOUR_GENERATOR_ID_HERE`
  - `YOUR_DATASET_ID_HERE`
- Run requests!
Option 2: Swagger UI (Recommended for Quick Testing)
- Open http://localhost:8000/docs
- Find the endpoint you want to test
- Click "Try it out"
- Fill in parameters
- Click "Execute"
LLM Endpoints
1. Interactive Chat
Endpoint: POST /llm/chat
Example Request:
{
"message": "What's the overall quality of this synthetic data?",
"evaluation_id": "abc-123",
"history": []
}
Use Cases:
- Ask questions about evaluation results
- Get recommendations
- Understand quality metrics
- Explore trade-offs
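To exercise the chat endpoint from the command line, a curl call like the one below works against the examples in this guide (server on localhost:8000). The shape of the history entries shown here (role/content pairs) is an assumption; confirm the exact schema in /docs before relying on it.
# Ask a follow-up question, carrying prior turns in "history"
curl -X POST http://localhost:8000/llm/chat \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Which metric should I focus on improving first?",
    "evaluation_id": "abc-123",
    "history": [
      {"role": "user", "content": "What is the overall quality of this synthetic data?"},
      {"role": "assistant", "content": "Overall quality looks strong; most statistical tests passed."}
    ]
  }'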
2. Improvement Suggestions
Endpoint: POST /llm/suggest-improvements/{evaluation_id}
Example:
POST /llm/suggest-improvements/abc-123
Returns:
{
"suggestions": [
"Increase training epochs from 50 to 100 for better convergence",
"Reduce batch size to 250 to improve gradient stability",
"Consider using DP-CTGAN with epsilon=8 for stronger privacy"
]
}
3. Metric Explanations
Endpoint: GET /llm/explain-metric
Example:
GET /llm/explain-metric?metric_name=ks_statistic&metric_value=0.087
Common Metrics to Explain:
- `ks_statistic` - Kolmogorov-Smirnov test
- `utility_ratio` - ML utility score
- `epsilon` - Privacy budget
- `wasserstein_distance` - Distribution distance
- `dcr_score` - Distance to closest record
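Since this is a GET endpoint, the metric name and value go in the query string. Quote the URL so the shell does not split it at the `&`:
curl "http://localhost:8000/llm/explain-metric?metric_name=ks_statistic&metric_value=0.087"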
4. Natural Language Insights
Endpoint: POST /evaluations/{evaluation_id}/explain
Example:
POST /evaluations/abc-123/explain
Returns:
{
"executive_summary": "The synthetic data shows excellent quality...",
"key_findings": [
" Statistical similarity: 93% of tests passed",
" ML utility: Models trained on synthetic data achieve 89% of original performance"
],
"recommendations": [
"Safe for production use",
"Consider differential privacy for additional protection"
],
"business_impact": "High-quality synthetic data ready for analytics and ML training"
}
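If you only want the executive summary out of this response, a small jq filter keeps the output readable (field name taken from the sample response above):
curl -s -X POST http://localhost:8000/evaluations/abc-123/explain | jq -r '.executive_summary'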
5. Compare Evaluations
Endpoint: POST /evaluations/compare
Example Request:
["eval-1", "eval-2", "eval-3"]
Returns:
{
"summary": "Evaluation 1 (CTGAN) offers best quality, Evaluation 2 (DP-CTGAN) offers best privacy",
"winner": 1,
"trade_offs": [
"CTGAN: Higher quality (0.93) but no formal privacy",
"DP-CTGAN: Strong privacy (ε=10) with slight quality trade-off (0.89)"
],
"recommendation": "Use DP-CTGAN for production, CTGAN for development/testing"
}
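Note that the request body is a bare JSON array, not an object, so the curl call looks slightly different from the other POST examples:
curl -X POST http://localhost:8000/evaluations/compare \
  -H "Content-Type: application/json" \
  -d '["eval-1", "eval-2", "eval-3"]'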
6. Model Card Generation
Endpoint: POST /generators/{generator_id}/model-card
Example:
POST /generators/gen-456/model-card
Returns: Markdown-formatted model card with:
- Model details and purpose
- Intended use cases
- Performance metrics
- Privacy considerations
- Compliance mappings
- Usage guidelines
Use Cases:
- Documentation for stakeholders
- Compliance audits
- Model registry
- Team onboarding
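Because the response is Markdown, it is convenient to write it straight to a file for sharing. This sketch assumes the endpoint returns the card as plain Markdown rather than wrapped in JSON; check /docs if the saved file looks like JSON:
curl -s -X POST http://localhost:8000/generators/gen-456/model-card -o model_card.md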
7. Audit Narratives
Endpoint: GET /generators/{generator_id}/audit-narrative
Example:
GET /generators/gen-456/audit-narrative
Returns:
# Audit Trail: Healthcare Data Generator
**10:00 AM** - Generator Created
A DP-CTGAN generator was initialized for healthcare data synthesis...
**10:15 AM** - Training Started
Model training began with 300 epochs and batch size 500...
**11:30 AM** - Training Completed
Training completed successfully. Privacy budget spent: ε=9.8...
8. GDPR Compliance Report
Endpoint: POST /generators/{generator_id}/compliance-report?framework=GDPR
Example:
POST /generators/gen-456/compliance-report?framework=GDPR
Returns:
{
"framework": "GDPR",
"compliance_level": "Partial",
"controls_addressed": [
{
"control_id": "Article 25",
"control_name": "Data Protection by Design",
"how_addressed": "DP-CTGAN implements differential privacy",
"evidence": "epsilon=10.0, delta=1e-5"
}
],
"gaps": ["Explicit consent mechanism not implemented"],
"recommendations": [
"Add data retention policies",
"Implement right to erasure"
]
}
Supported Frameworks:
- GDPR
- HIPAA
- CCPA
- SOC2
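Only the framework query parameter changes between frameworks. For example, to request a HIPAA report instead of GDPR:
curl -X POST "http://localhost:8000/generators/gen-456/compliance-report?framework=HIPAA"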
9. Enhanced PII Detection
Endpoint: POST /datasets/{dataset_id}/pii-detection-enhanced
Example:
POST /datasets/data-789/pii-detection-enhanced
Returns:
{
"analysis": {
"total_columns": 10,
"columns_with_pii": 3,
"overall_risk_level": "Medium",
"high_risk_columns": ["user_id"],
"medium_risk_columns": ["customer_code", "account_number"],
"column_analyses": {
"user_id": {
"contains_pii": true,
"pii_type": "indirect_identifier",
"confidence": 0.95,
"risk_level": "High",
"explanation": "Unique identifier that could enable re-identification",
"recommendations": ["Remove before synthesis", "Use DP with ε≤8"]
}
},
"recommendations": [
"High-risk PII detected in 1 column: user_id. Consider removing.",
"Use DP-CTGAN with epsilon ≤ 10 for privacy protection."
]
}
}
Detects:
- Direct identifiers (names, emails, SSNs)
- Indirect identifiers (user codes, IDs)
- Quasi-identifiers (combinations)
- Encoded/obfuscated PII
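To get a quick risk summary without reading the full JSON, you can filter the response with jq (field names follow the sample response above):
curl -s -X POST http://localhost:8000/datasets/data-789/pii-detection-enhanced \
  | jq '{risk: .analysis.overall_risk_level, high_risk: .analysis.high_risk_columns, medium_risk: .analysis.medium_risk_columns}'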
Testing Workflow
Step 1: Get Your IDs
First, get actual IDs from your system:
# Get datasets
GET /datasets/
# Get generators
GET /generators/
# Get evaluations
GET /evaluations/generator/{generator_id}
Step 2: Test in Order
1. Start with Enhanced PII Detection
   `POST /datasets/{id}/pii-detection-enhanced` → Understand your data privacy risks
2. Generate Synthetic Data
   `POST /generators/dataset/{id}/generate` → Create synthetic data
3. Run Evaluation
   `POST /evaluations/run` → Assess quality
4. Get Natural Language Insights
   `POST /evaluations/{id}/explain` → Understand results
5. Ask Questions via Chat
   `POST /llm/chat` → Interactive exploration
6. Generate Documentation
   `POST /generators/{id}/model-card`
   `POST /generators/{id}/compliance-report?framework=GDPR` → Compliance docs
Tips for Testing
1. Use Real Data
- Test with actual evaluation results for best insights
- LLM responses are context-aware
2. Try Different Questions
Chat examples:
- "Is this data safe for production?"
- "How does the privacy level compare to industry standards?"
- "What's the trade-off between quality and privacy?"
- "Should I use CTGAN or DP-CTGAN?"
3. Test Fallbacks
- Disconnect internet to test fallback mechanisms
- All endpoints have rule-based fallbacks
4. Check Response Times
- Groq is ultra-fast (usually less than 1 second)
- Chat: ~500ms
- Model cards: ~2-3 seconds
- Compliance reports: ~1-2 seconds
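One simple way to measure these times yourself is curl's built-in timing output; the request below discards the response body and prints only the total time in seconds:
curl -o /dev/null -s -w "%{time_total}\n" \
  -X POST http://localhost:8000/evaluations/YOUR_EVAL_ID/explain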
Common Issues
Issue: "Evaluation not found"
Solution: Use actual evaluation ID from your database
Issue: "No LLM providers available"​
Solution: Check that GROQ_API key is set in .env
Issue: "Endpoints not showing in /docs"​
Solution: Restart the server:
python -m uvicorn app.main:app --reload
Issue: "Rate limit exceeded"​
Solution: Groq free tier: 30 req/min. Wait or upgrade.
Quick Test Script
# 1. Get a dataset ID
curl http://localhost:8000/datasets/ | jq '.[0].id'
# 2. Run enhanced PII detection
curl -X POST http://localhost:8000/datasets/YOUR_ID/pii-detection-enhanced
# 3. Get a generator ID
curl http://localhost:8000/generators/ | jq '.[0].id'
# 4. Generate model card
curl -X POST http://localhost:8000/generators/YOUR_ID/model-card
# 5. Get an evaluation ID
curl http://localhost:8000/evaluations/generator/YOUR_GEN_ID | jq '.[0].id'
# 6. Get natural language insights
curl -X POST http://localhost:8000/evaluations/YOUR_EVAL_ID/explain
# 7. Chat about the evaluation
curl -X POST http://localhost:8000/llm/chat \
-H "Content-Type: application/json" \
-d '{"message": "What is the overall quality?", "evaluation_id": "YOUR_EVAL_ID"}'
Next Steps
- Import Postman collection
- Replace placeholder IDs
- Test each endpoint
- Try different questions in chat
- Generate compliance docs
- Integrate into your workflow
Happy Testing!