Synthetic Data Studio Documentation

Welcome to the comprehensive documentation for Synthetic Data Studio, a production-ready platform for generating high-quality synthetic data with differential privacy guarantees.

Quick Navigation

Just Getting Started?

I'm a User

I'm a Developer

I Want to Learn

Documentation Structure

docs/
├── INDEX.md # This navigation hub
├── getting-started/ # First-time setup and basics
├── user-guide/ # Feature guides and workflows
├── tutorials/ # Step-by-step tutorials
├── developer-guide/ # Development and deployment
├── examples/ # Code examples and API usage
└── reference/ # Configuration and troubleshooting

Key Features Overview

Differential Privacy

  • Mathematical Guarantees: (ε, δ)-differential privacy with RDP accounting
  • Safety Validation: 3-layer validation prevents privacy failures
  • Compliance Ready: Reporting for HIPAA, GDPR, CCPA, and SOC 2
  • Multiple Algorithms: DP-CTGAN, DP-TVAE with automatic parameter tuning
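
As a rough illustration of what "RDP accounting" refers to, the sketch below composes repeated Gaussian-mechanism releases under Rényi differential privacy and converts the total to an (ε, δ) guarantee using the standard conversion ε = ε_RDP(α) + log(1/δ)/(α − 1). It is a minimal sketch, not the platform's accountant: it assumes L2 sensitivity 1, ignores subsampling amplification (which real DP-SGD accountants exploit), and the function names `rdp_gaussian` and `rdp_to_dp` are hypothetical.

```python
import math


def rdp_gaussian(alpha: float, sigma: float) -> float:
    """RDP of order alpha for one Gaussian release with L2 sensitivity 1."""
    return alpha / (2.0 * sigma ** 2)


def rdp_to_dp(sigma: float, steps: int, delta: float, alphas=None):
    """Compose `steps` Gaussian releases and convert to (epsilon, delta)-DP.

    Returns the smallest epsilon found over a grid of Renyi orders,
    together with the order that achieved it.
    """
    if alphas is None:
        # Grid of orders 1.1, 1.2, ..., 100.9 (an arbitrary choice for this sketch).
        alphas = [1 + x / 10.0 for x in range(1, 1000)]
    best_eps, best_alpha = math.inf, None
    for a in alphas:
        total_rdp = steps * rdp_gaussian(a, sigma)  # RDP composes additively
        eps = total_rdp + math.log(1.0 / delta) / (a - 1.0)
        if eps < best_eps:
            best_eps, best_alpha = eps, a
    return best_eps, best_alpha


if __name__ == "__main__":
    eps, alpha = rdp_to_dp(sigma=1.0, steps=1, delta=1e-5)
    print(f"epsilon ~= {eps:.3f} at order alpha = {alpha:.1f}")
```

Increasing `sigma` lowers the reported ε, while more `steps` raise it, which is the trade-off that automatic parameter tuning has to balance.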

AI-Powered Capabilities

  • Interactive Chat: Ask questions about your synthetic data quality
  • Smart Suggestions: AI-powered recommendations for improvement
  • Auto-Documentation: Generate model cards and audit narratives
  • Enhanced Detection: Context-aware PII identification

Quality Assurance

  • Statistical Similarity: KS tests, Chi-square, Wasserstein distance
  • ML Utility: Classification/regression performance evaluation
  • Privacy Leakage: Membership and attribute inference detection
  • Comprehensive Reports: Actionable quality assessments
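
To make the statistical-similarity metrics concrete, here is a minimal standard-library sketch of two of them for a single numeric column: the two-sample Kolmogorov-Smirnov statistic and the 1-D Wasserstein distance. It is illustrative only and not the platform's evaluation code; the function names are hypothetical, and the Wasserstein helper assumes both samples have the same size.

```python
from bisect import bisect_right


def ks_statistic(real, synth):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    r, s = sorted(real), sorted(synth)
    gap = 0.0
    # The supremum of |F_real - F_synth| is attained at one of the data points.
    for x in r + s:
        f_r = bisect_right(r, x) / len(r)
        f_s = bisect_right(s, x) / len(s)
        gap = max(gap, abs(f_r - f_s))
    return gap


def wasserstein_1d(real, synth):
    """1-D Wasserstein-1 distance for equal-size samples.

    For equal-size samples this is the mean absolute gap between
    the sorted values of the two columns.
    """
    if len(real) != len(synth):
        raise ValueError("this sketch assumes equal sample sizes")
    pairs = zip(sorted(real), sorted(synth))
    return sum(abs(a - b) for a, b in pairs) / len(real)


if __name__ == "__main__":
    real_col = [1.0, 2.0, 3.0, 4.0]
    synth_col = [2.0, 3.0, 4.0, 5.0]
    print(ks_statistic(real_col, synth_col))
    print(wasserstein_1d(real_col, synth_col))
```

Both values are 0 for identical columns and grow as the synthetic distribution drifts from the real one; the KS statistic is bounded by 1, while the Wasserstein distance is expressed in the column's own units.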

Enterprise-Ready

  • Multiple Synthesis Methods: CTGAN, TVAE, GaussianCopula
  • Background Processing: Asynchronous job handling
  • Scalable Architecture: FastAPI with SQLAlchemy
  • Production Deployment: Docker, cloud-native ready

Common Workflows

1. Basic Data Synthesis

  1. Upload Dataset
  2. Generate Profile
  3. Create Synthetic Data
  4. Evaluate Quality

2. Privacy-Preserving Synthesis

  1. Validate DP Configuration
  2. Generate with Privacy Guarantees
  3. Review Privacy Report
  4. Generate Compliance Documentation
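
The first step of this workflow, validating a DP configuration before training, can be sketched as a handful of sanity checks. The example below is an assumption-laden illustration, not the platform's validation logic: the function name `validate_dp_config`, its parameters, and the ε > 10 warning threshold are all hypothetical. The δ < 1/n check reflects a widely used rule of thumb that δ should be smaller than one over the number of records.

```python
import math


def validate_dp_config(epsilon: float, delta: float, n_rows: int):
    """Return (errors, warnings) for a proposed (epsilon, delta) budget.

    Hypothetical checks for illustration; thresholds are assumptions.
    """
    errors, warnings = [], []

    if not (math.isfinite(epsilon) and epsilon > 0):
        errors.append("epsilon must be a positive, finite number")
    if not (0 < delta < 1):
        errors.append("delta must lie strictly between 0 and 1")

    if n_rows <= 0:
        errors.append("n_rows must be positive")
    elif 0 < delta < 1 and delta >= 1.0 / n_rows:
        # Rule of thumb: delta should be smaller than 1 / (number of records).
        warnings.append("delta >= 1/n_rows gives a weak guarantee")

    if math.isfinite(epsilon) and epsilon > 10:
        # Assumed threshold: a very large epsilon offers little protection.
        warnings.append("epsilon > 10 offers little practical privacy")

    return errors, warnings


if __name__ == "__main__":
    print(validate_dp_config(epsilon=1.0, delta=1e-5, n_rows=1000))
    print(validate_dp_config(epsilon=-1.0, delta=0.5, n_rows=10))
```

Separating hard errors from warnings lets a caller block clearly invalid budgets while still allowing weak but legal ones to proceed with a notice.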

3. Quality Assessment

  1. Run Comprehensive Evaluation
  2. Review Statistical Metrics
  3. Check ML Utility
  4. Review AI-Powered Insights

Search & Discovery

By Use Case

By Technical Focus

Reading Paths

Beginner Path

  1. Installation
  2. Quick Start
  3. Basic Synthesis Tutorial
  4. Platform Overview

Privacy Engineer Path

  1. Privacy Features Overview
  2. DP Configuration Guide
  3. Privacy Synthesis Tutorial
  4. Compliance Reporting

Developer Path

  1. Development Setup
  2. Architecture Overview
  3. API Examples
  4. Testing Guide

External Resources

Support

Contributing

Help improve our documentation! See our Contributing Guide for guidelines on:

  • Writing documentation
  • Reporting issues
  • Suggesting improvements
  • Code contributions

Ready to explore? Start with our Quick Start Tutorial to generate your first synthetic dataset!