File size: 4,523 Bytes
98dfc81
 
ab48ce6
202b559
 
98dfc81
202b559
98dfc81
176c4c1
98dfc81
 
311a519
 
98dfc81
 
202b559
 
e61fe86
 
202b559
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
---
title: Convert To Json
emoji: πŸ”¬πŸ“…πŸ“Š
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.33.1
app_file: app.py
pinned: true
license: mit
short_description: Convert Free Text Into Json Using AI
tags : 
  - mcp-server-track
---

youtube : https://youtu.be/PtWkJHNmo9k

for a faster & functional experience using ZeroGPU please visit : https://huggingface.co/spaces/Tonic/Convert-to-Json

# 🌊 Osmosis Structure - Text to JSON Converter

A powerful web application that converts unstructured text into well-formatted JSON using the Osmosis Structure 0.6B model. This tool is specifically designed for structured data extraction and format conversion tasks.

## 🌟 Features

- **Intelligent Text Processing**: Automatically identifies and extracts key information from unstructured text
- **Schema Support**: Optionally provide a JSON schema to structure the output according to your needs
- **Customizable Generation**: Fine-tune the output with adjustable parameters:
  - Temperature
  - Max tokens
  - Top-p sampling
  - Top-k sampling
- **User-Friendly Interface**: Clean and intuitive Gradio interface
- **Example Templates**: Pre-configured examples to help you get started
- **GPU Acceleration**: Optimized for GPU when available

## πŸš€ Quick Start

1. Clone the repository:
```bash
git clone https://github.com/yourusername/Convert-to-Json.git
cd Convert-to-Json
```

2. Install dependencies:
```bash
pip install -r requirements.txt
```

3. Run the application:
```bash
python app.py
```

## πŸ’» Usage

### Basic Usage

1. Enter your unstructured text in the input field
2. (Optional) Provide a JSON schema to structure the output
3. Adjust generation parameters if needed
4. Click "Convert" or press Enter
5. View the structured JSON output

### Example Input

```text
The conference will be held on June 10-12, 2024 at the Grand Hotel in San Francisco. 
Registration fee is $500 for early bird (before May 1) and $650 for regular registration. 
Contact info@conference.com for questions.
```

### Example Schema

```json
{
  "type": "object",
  "properties": {
    "event_start_date": {
      "type": "string",
      "format": "date"
    },
    "event_end_date": {
      "type": "string",
      "format": "date"
    },
    "location": {
      "type": "string"
    },
    "registration_fees": {
      "type": "object",
      "properties": {
        "early_bird_price": {
          "type": "number"
        },
        "regular_price": {
          "type": "number"
        },
        "early_bird_deadline": {
          "type": "string",
          "format": "date"
        }
      }
    },
    "contact_email": {
      "type": "string"
    }
  }
}
```

### Example Output

```json
{
  "event_start_date": "2024-06-10",
  "event_end_date": "2024-06-12",
  "location": "Grand Hotel, San Francisco",
  "registration_fees": {
    "early_bird_price": 500.0,
    "regular_price": 650.0,
    "early_bird_deadline": "2024-05-01"
  },
  "contact_email": "info@conference.com"
}
```

## βš™οΈ Generation Parameters

- **Max Tokens**: Controls the maximum length of the generated output (default: 512)
- **Temperature**: Controls randomness in generation (default: 0.6)
  - Lower values (e.g., 0.3) make output more focused and deterministic
  - Higher values (e.g., 0.9) make output more diverse and creative
- **Top-p**: Nucleus sampling parameter (default: 0.95)
- **Top-k**: Number of highest probability tokens to consider (default: 20)

## πŸ› οΈ Technical Details

- **Model**: Osmosis Structure 0.6B parameters
- **Architecture**: Qwen3 (specialized for structured data)
- **Purpose**: Converting unstructured text to structured JSON format
- **Optimizations**: Fine-tuned for data extraction and format conversion tasks

## 🀝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

## πŸ“ License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## πŸ™ Acknowledgments

- Thanks to the Hugging Face team for their excellent tools and resources
- Special thanks to Yuvi Sharma and all the folks at Hugging Face for the community grant

## 🌟 Join Our Community

- Join our active builder's community on [Discord](https://discord.gg/qdfnvSPcqP)
- Follow us on [Hugging Face](https://huggingface.co/MultiTransformer)
- Check out our [GitHub](https://github.com/tonic-ai)
- Contribute to [MultiTonic](https://github.com/MultiTonic)