SDV-based tabular data anonymization tool with a 4-step workflow:
- Upload data (CSV / XLS / XLSX)

- Detect PII

- Configure rules (exempt_columns / force_pii_columns)

- Generate, preview, compare, and download synthetic data in CSV format
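
The "Detect PII" and "Configure rules" steps can be illustrated with a small standard-library sketch. It is not the project's implementation: the regex patterns, the `detect_pii_columns` and `apply_rules` helpers, the 0.8 threshold, and the precedence (exemptions removed first, forced columns added last) are all assumptions made for illustration. Only the `exempt_columns` and `force_pii_columns` names come from the workflow above.

```python
import re

# Hypothetical patterns for illustration only; the project lists a spaCy model
# among its requirements and may detect PII quite differently.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
# Note: this loose pattern also matches long numeric IDs (7+ digits).
PHONE_RE = re.compile(r"^\+?[\d\s().-]{7,20}$")


def detect_pii_columns(rows, threshold=0.8):
    """Flag a column when at least `threshold` of its non-empty values
    look like an email address or a phone number."""
    flagged = set()
    columns = rows[0].keys() if rows else []
    for col in columns:
        values = [str(r[col]).strip() for r in rows if str(r[col]).strip()]
        if not values:
            continue
        hits = sum(1 for v in values if EMAIL_RE.match(v) or PHONE_RE.match(v))
        if hits / len(values) >= threshold:
            flagged.add(col)
    return flagged


def apply_rules(detected, exempt_columns=(), force_pii_columns=()):
    """Assumed precedence: drop exempt columns first, then add forced ones,
    so a column listed in both ends up treated as PII."""
    return (set(detected) - set(exempt_columns)) | set(force_pii_columns)


rows = [
    {"name": "Ada", "email": "ada@example.com", "phone": "+1 555-010-9999", "age": "36"},
    {"name": "Lin", "email": "lin@example.org", "phone": "555 010 1234", "age": "41"},
]

detected = detect_pii_columns(rows)
final_pii = apply_rules(detected, exempt_columns=["phone"], force_pii_columns=["name"])
print(sorted(final_pii))
```

Here the user exempts `phone` (detected automatically) and forces `name` (missed by the patterns), which is the kind of manual correction the rule configuration step is meant to allow.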

- src/
  - api.py - FastAPI endpoints
  - generate_metada.py - metadata generation and config save
  - Gaussian_Coupla_syntheticdata_generator.py - SDV synthetic data pipeline
- ui/ - frontend page and step scripts
- results/ - runtime outputs
  - uploaded_data/ - uploaded files converted to CSV
  - meta_data/ - generated metadata and user config
  - GaussianCopula_results/ - synthetic data, quality reports, scores
- raw_data/ - optional mounted input folder (Docker use)

- Python 3.9+
- Dependencies in requirements-api.txt
- spaCy model: python -m spacy download en_core_web_sm

- Install Docker Desktop
- Make sure Docker is running
git clone https://github.com/gyt197/Data-Anonymization-Service.git
cd Data-Anonymization-Service
docker compose up --build

App: http://localhost:8000

To stop the containers:
docker compose down

To run locally without Docker (Windows virtual environment shown):
git clone https://github.com/gyt197/Data-Anonymization-Service.git
cd Data-Anonymization-Service
python -m venv venv
venv\Scripts\activate
pip install -r requirements-api.txt
pip install sdv python-multipart
python -m spacy download en_core_web_sm
uvicorn src.api:app --host 0.0.0.0 --port 8000 --reload

This project is licensed under the MIT License - see the LICENSE file for details.
