Synthetic data generation for tabular data
Updated Aug 4, 2025 - Python
Conditional GAN for generating synthetic tabular data.
Synthetic Data SDK ✨
A library to model multivariate data using copulas.
[IMC 2020 (Best Paper Finalist)] Using GANs for Sharing Networked Time Series Data: Challenges, Initial Promise, and Open Questions
Synthetic Data Generation for mixed-type, multivariate time series.
(SIGCOMM '22) Practical GAN-based Synthetic IP Header Trace Generation using NetShare
Synthetic Data Engine 💎
Structural Entropy Guided Agent for Detecting and Repairing Knowledge Deficiencies in LLMs
[TMLR] GraphMaker: Can Diffusion Models Generate Large Attributed Graphs?
Flow Matching implemented in PyTorch
A toolset for testing data classification engines; it generates mock data in various file formats, sizes, and data profiles.
[ACL 2024 Findings] This is the code for our paper "Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models".
[ECCV'24 Workshops Oral] DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling
This UI serves as a synthetic ASR dataset generator powered by and for OpenAI Whisper, enabling users to capture audio, transcribe it on the fly, and manage the generated dataset 🤗. Fine-tune Whisper on enhanced and custom datasets.
Synthetic data generation to fuel AI models
Scripts for data generation using Blender and 3D datasets like Matterport3D.
Building synthetic data for preference tuning
A testbed for agents and environments that can automatically improve models through data generation.
[ICLR 2025 Spotlight] LayerDAG: A Layerwise Autoregressive Diffusion Model of Directed Acyclic Graphs