---
license: apache-2.0
language:
- en
base_model:
- Open-Bee/Bee-8B-SFT
pipeline_tag: image-text-to-text
library_name: transformers
tags:
- text-generation-inference
---

![1](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/DR0oZYGzJ2hv0g_x_Xw8S.png)

# **Bee-8B-SFT-abliterated**

> **Bee-8B-SFT-abliterated** is an abliterated (v1.0) variant of **Open-Bee's Bee-8B-SFT** model, a high-quality supervised fine-tuning (SFT) checkpoint trained on approximately **15 million curated samples**.
> The dataset was built with **HoneyPipe**, Open-Bee's transparent, adaptable, and open-source data curation pipeline, which systematically cleans noisy data and enriches it through a **dual-level Chain-of-Thought (CoT)** strategy covering both short and long reasoning contexts.

## Key Highlights

* **Abliterated / Uncensored Captioning and Reasoning**: Fine-tuned to bypass standard content filters while preserving factual accuracy, descriptive depth, and logical reasoning.
* **High-Fidelity Reasoning and Visual Understanding**: Generates detailed captions and structured reasoning for diverse visual categories, including artistic, technical, abstract, and low-context imagery.
* **Enhanced Supervised Fine-Tuning (SFT) Alignment**: Trained on a meticulously curated dataset via HoneyPipe with short and long Chain-of-Thought (CoT) annotations, ensuring deep reasoning coherence.
* **Aspect-Ratio Robustness**: Performs consistently across wide, tall, square, panoramic, and irregular visual formats.
* **Variational Detail Control**: Supports both concise summaries and highly detailed reasoning narratives, depending on prompt configuration.
* **Multilingual Output Capability**: Defaults to English but adaptable for multilingual use through prompt engineering.
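The detail-control and multilingual capabilities above are steered purely through prompt phrasing. A minimal sketch of how one might parameterize the message payload for different detail levels and output languages; the helper name and instruction wording here are illustrative, not part of the model's API:

```python
def build_caption_request(image_url: str, detail: str = "concise",
                          language: str = "English") -> list:
    """Build a chat-style message list for the model.

    detail: "concise" for a short summary, "detailed" for a long
            reasoning narrative.
    language: target output language, requested via the prompt only.
    """
    if detail == "concise":
        instruction = f"Describe this image in one or two sentences, in {language}."
    else:
        instruction = (
            f"Describe this image in exhaustive detail, reasoning step by step "
            f"about its content, style, and composition. Respond in {language}."
        )
    # Same message structure as the Quick Start example below.
    return [{
        "role": "user",
        "content": [
            {"type": "image", "image": image_url},
            {"type": "text", "text": instruction},
        ],
    }]
```

The resulting list can be passed directly to `processor.apply_chat_template(...)` in place of the hand-written `messages` in the Quick Start example.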
## Quick Start with Transformers

```python
import requests
import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

model_path = "prithivMLmods/Bee-8B-SFT-abliterated"

# Load model
model = AutoModel.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).to("cuda")

# Load processor
processor = AutoProcessor.from_pretrained(model_path, trust_remote_code=True)

# Define conversation messages
messages = [{
    "role": "user",
    "content": [
        {
            "type": "image",
            "image": "https://huggingface.co/Open-Bee/Bee-8B-SFT/resolve/main/assets/logo.png",
        },
        {
            "type": "text",
            "text": "Based on this picture, write an advertising slogan about Bee-8B (a Fully Open Multimodal Large Language Model).",
        },
    ],
}]

# Apply chat template
text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=True
)

# Load image
image_url = "https://huggingface.co/Open-Bee/Bee-8B-SFT/resolve/main/assets/logo.png"
image = Image.open(requests.get(image_url, stream=True).raw)

# Process inputs
inputs = processor(images=image, text=text, return_tensors="pt").to("cuda")

# Generate output
generated_ids = model.generate(**inputs, max_new_tokens=16384, temperature=0.6)
output_ids = generated_ids[0][len(inputs.input_ids[0]):]

# Decode output
output_text = processor.decode(output_ids, skip_special_tokens=True)

# Print result
print(output_text)
```

## Intended Use

This model is suited for:

* Generating detailed, uncensored captions and reasoning for complex or creative visual datasets.
* Research in multimodal reasoning, safety evaluation, and content moderation studies.
* Enabling descriptive captioning and analytical reasoning for datasets excluded from mainstream models.
* Creative applications such as narrative generation, artistic interpretation, and visual storytelling.
* Advanced reasoning over diverse visual structures and aspect ratios.
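With `enable_thinking=True`, the decoded text may contain a reasoning trace ahead of the final answer. Assuming the model wraps that trace in `<think>...</think>` tags (a common convention for thinking-mode models; verify against your actual outputs, since `skip_special_tokens=True` can also remove such markers if they are registered as special tokens), a minimal sketch for separating the trace from the answer:

```python
import re

def split_thinking(output_text: str) -> tuple:
    """Split a decoded generation into (reasoning_trace, final_answer).

    Assumption: the reasoning is wrapped in <think>...</think> tags.
    If no such block is present, the whole text is returned as the answer.
    """
    match = re.search(r"<think>(.*?)</think>", output_text, flags=re.DOTALL)
    if match is None:
        return "", output_text.strip()
    # Everything after the closing tag is treated as the final answer.
    return match.group(1).strip(), output_text[match.end():].strip()
```

This keeps long CoT traces out of downstream captioning pipelines while preserving them for inspection.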
## Limitations

* May produce explicit, sensitive, or offensive content depending on input and prompt.
* Not recommended for deployment in production systems requiring strict moderation or filtering.
* Style, tone, and reasoning detail may vary based on prompt phrasing.
* May show variable performance on synthetic, abstract, or highly stylized visual inputs.