Prompt2Data

Generate high-quality synthetic datasets from simple text prompts.

Prompt2Data is a web application that generates synthetic datasets for machine learning tasks using a range of large language models (LLMs). Provide a topic, choose a dataset type and a model provider, and the app produces a dataset you can preview and export.

Features

Multiple Dataset Types: Generate data for Q&A, Summarization, Classification, and Text Generation tasks.
Flexible Model Integration:
- OpenRouter: Access a wide variety of models from different providers.
- Ollama: Connect to your local LLMs running via Ollama.
- OpenAI: Use models directly from OpenAI (e.g., GPT-4o, GPT-3.5).
- Google Gemini: Leverage Google's powerful Gemini models.
- Custom: Configure any other OpenAI-compatible API endpoint.
Advanced Generation:
- Basic & Advanced Modes: Control the complexity and nuance of the generated data.
- Focus Control: Guide the model with specific constraints or edge cases to focus on.
Customizable Output:
- Set the desired dataset size.
- Export your generated data in both JSON and CSV formats.
Responsive UI: A clean, modern, and responsive interface that works on any device.

Getting Started

Follow these steps to get the project running on your local machine.

Prerequisites

Node.js (v18 or higher recommended)
npm or a compatible package manager

Installation

Clone the repository:

```
git clone https://github.com/your-username/prompt-2-data.git
cd prompt-2-data
```

Install dependencies:
```
npm install
```
Set up environment variables:

Create a .env file in the root of the project by copying the example file:
```
cp .env.example .env
```
Open the .env file and add your API keys for the services you want to use:
```
VITE_OPENROUTER_API_KEY="your_openrouter_api_key"
VITE_OPENAI_API_KEY="your_openai_api_key"
VITE_GEMINI_API_KEY="your_gemini_api_key"
```
The application will automatically pick up these keys when you select the corresponding provider.
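
As a rough illustration of how a key could be looked up per provider, here is a minimal sketch. The variable names come from the `.env` example above; the provider ids (`openrouter`, `openai`, `gemini`, `ollama`) and the `resolveApiKey` helper are hypothetical and may not match the project's actual source. In a Vite app, variables prefixed with `VITE_` are exposed on `import.meta.env`; a plain object stands in for it here so the snippet runs on its own.

```javascript
// Hypothetical mapping from provider id to the .env variable holding its key.
// Variable names are taken from the .env example; provider ids are assumed.
const API_KEY_VARS = new Map([
  ["openrouter", "VITE_OPENROUTER_API_KEY"],
  ["openai", "VITE_OPENAI_API_KEY"],
  ["gemini", "VITE_GEMINI_API_KEY"],
]);

// Returns the configured key for a provider, or "" when none applies
// (for example Ollama, which needs no API key).
function resolveApiKey(provider, env) {
  const varName = API_KEY_VARS.get(provider);
  if (!varName) return "";
  return env[varName] ?? "";
}

// In the app, `env` would be `import.meta.env`; this object is a stand-in.
const env = { VITE_OPENAI_API_KEY: "your_openai_api_key" };
console.log(resolveApiKey("openai", env));
console.log(resolveApiKey("ollama", env) === "");
```

Keeping the lookup in one table means adding a provider only requires one new entry.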

Usage

To start the development server, run the following command:

```
npm run dev
```

This will start the application, and you can access it in your browser at http://localhost:5173 (or another port if 5173 is in use).

How It Works

Configure:
- Select a Provider (e.g., OpenRouter, Ollama).
- The Model Name, Base URL, and API Key fields will be pre-filled with defaults. Adjust them if needed. For Ollama, no API key is required.
Define:
- Enter a Core Topic for your dataset (e.g., "Customer reviews for a coffee shop").
- Choose a Dataset Type (e.g., Classification).
- Optionally, add an Advanced Focus to guide the generation (e.g., "focus on ambiguous or sarcastic reviews").
Generate:
- Adjust the Dataset Size and Complexity Mode.
- Click the "GENERATE DATASET" button.
Preview & Export:
- The generated data will appear in the Output Preview table.
- Choose your desired format (JSON or CSV) and click the "Export" button to download the file.
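
The steps above can be sketched as a request to an OpenAI-compatible chat completions endpoint. This is an illustration under stated assumptions, not the app's actual implementation: `buildPrompt` and `buildChatRequest` are hypothetical helpers, the prompt wording is invented, and the model name `llama3` is a placeholder. The base URL shown is Ollama's OpenAI-compatible endpoint on its default port.

```javascript
// Hypothetical helper: turns the form fields into an instruction for the model.
function buildPrompt({ topic, datasetType, size, focus }) {
  const lines = [
    `Generate ${size} ${datasetType} examples about: ${topic}.`,
    "Return only a JSON array of objects.",
  ];
  if (focus) lines.push(`Additional guidance: ${focus}`);
  return lines.join("\n");
}

// Hypothetical helper: builds a fetch-ready request for an
// OpenAI-compatible POST {baseUrl}/chat/completions call.
function buildChatRequest({ baseUrl, model, apiKey }, prompt) {
  const headers = { "Content-Type": "application/json" };
  // Local providers such as Ollama need no key, so the header is optional.
  if (apiKey) headers.Authorization = `Bearer ${apiKey}`;
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    options: {
      method: "POST",
      headers,
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

const prompt = buildPrompt({
  topic: "Customer reviews for a coffee shop",
  datasetType: "Classification",
  size: 10,
  focus: "focus on ambiguous or sarcastic reviews",
});
const request = buildChatRequest(
  { baseUrl: "http://localhost:11434/v1", model: "llama3", apiKey: "" },
  prompt
);
console.log(request.url);
// Sending it would look like: await fetch(request.url, request.options)
```

Separating request construction from the network call keeps the provider-specific parts (base URL, model, key) in one place and makes the logic easy to check without a running model.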

Example Output

Here are small examples of what the generated files look like.

JSON Example (`qa_example.json`)

```
[
  {
    "question": "What is the capital of France?",
    "answer": "The capital of France is Paris."
  },
  {
    "question": "What is the main component of Earth's atmosphere?",
    "answer": "The main component of Earth's atmosphere is nitrogen, which makes up about 78% of the air."
  }
]
```
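
Records in this shape can be turned into the CSV form shown in the next example. The following is a minimal sketch of one way to do that, assuming flat records that all share the keys of the first record; `csvField` and `toCsv` are hypothetical names, and the app's own exporter may work differently. Every field is wrapped in double quotes, and embedded quotes are doubled, which is the usual CSV escaping convention.

```javascript
// Quotes one value for CSV: wrap in double quotes, double any embedded quotes.
function csvField(value) {
  return `"${String(value ?? "").replace(/"/g, '""')}"`;
}

// Serializes an array of flat objects to CSV text.
// Columns are taken from the keys of the first record (an assumption).
function toCsv(records) {
  if (records.length === 0) return "";
  const columns = Object.keys(records[0]);
  const header = columns.map(csvField).join(",");
  const rows = records.map((record) =>
    columns.map((column) => csvField(record[column])).join(",")
  );
  return [header, ...rows].join("\n");
}

const records = [
  {
    question: "What is the capital of France?",
    answer: "The capital of France is Paris.",
  },
];
console.log(toCsv(records));
```

Quoting every field avoids special-casing values that contain commas or line breaks.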

CSV Example (`qa_example.csv`)

```
"question","answer"
"What is the capital of France?","The capital of France is Paris."
"What is the main component of Earth's atmosphere?","The main component of Earth's atmosphere is nitrogen, which makes up about 78% of the air."
```

This project was built with Vite, React, and Tailwind CSS.
