A few months ago, I built a quick proof of concept on Hugging Face using a fine-tuned variant of OpenAI's OSS-20B model, trained to convert text from pre-reform Russian-language documents into modern Russian orthography.
⚡️ This morning, I launched novoyaz.io.
This is a production app (I built the frontend with Lovable in about two hours) that uses the same fine-tuned model for transliteration, but now with a bunch of extra features that make it even easier to use, like taking and uploading pictures with your on-device camera 📷.
📣 If you're a researcher, or know one, whose day-to-day workflow this app could improve, please get in touch with me.
Introducing Photo-Mate-v2, based on FLUX.1-Kontext-dev, for advanced image-manipulation tasks. It supports transforming scenes into top-down/bottom-up perspectives, CAM-right/left views and their reverses, as well as general Kontext-specified object removal. Below is the list of demos and adapters. 🔥🤗
Introducing JFLEG-JA, a new Japanese language error correction benchmark with 1,335 sentences, each paired with 4 high-quality human corrections 🚀
Inspired by the English JFLEG dataset, this dataset covers diverse error types, including particle mistakes, kanji mix-ups, and incorrect contextual usage of verbs, adjectives, and literary techniques.
You can use this for evaluating LLMs, few-shot learning, error analysis, or fine-tuning correction systems.
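As a rough illustration of multi-reference evaluation (JFLEG-style benchmarks are scored with GLEU; the `difflib` similarity below is only a stand-in for that metric, and the example sentences are invented, not from the dataset):

```python
from difflib import SequenceMatcher

def best_reference_score(hypothesis: str, references: list[str]) -> float:
    """Score a model's correction against several human references,
    keeping the best match. A real evaluation would use GLEU over all
    4 references; SequenceMatcher.ratio() is just an illustration."""
    return max(SequenceMatcher(None, hypothesis, ref).ratio() for ref in references)

# Hypothetical example: two acceptable corrections of the same sentence.
refs = ["私は学校へ行きます。", "私は学校に行きます。"]
score = best_reference_score("私は学校に行きます。", refs)  # exact match with one reference
```

Because each sentence has four references, scoring against the best match avoids penalizing a model for choosing a different but equally valid correction.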
A week ago, I shared a post about a test implementation of DeepSeek-OCR compatibility with the latest transformers (https://tinyurl.com/ykc4mm66). Now, I'm dropping the most compatible version of it, supporting the model with the latest transformers. 🤗🔥
✅ Supports the latest transformers v4.57.1
✅ torch 2.6.0+cu124 (or) the latest version (i.e., torch 2.9.0)
✅ CUDA 12.4
✅ Users can also opt out of specific attention implementations if desired
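A minimal sketch of gating on those version floors before loading (the helper name is mine, and the commented model ID and loading options are assumptions based on this post, not a verified recipe):

```python
def _parse(ver: str) -> tuple:
    """Parse a version string, stripping local build tags like '+cu124'."""
    return tuple(int(p) for p in ver.split("+")[0].split("."))

def meets_minimums(transformers_ver: str, torch_ver: str) -> bool:
    """Version floor stated in the post: transformers >= 4.57.1, torch >= 2.6.0."""
    return (_parse(transformers_ver) >= _parse("4.57.1")
            and _parse(torch_ver) >= _parse("2.6.0"))

# Loading sketch (requires network access; 'eager' opts out of fused
# attention kernels, per the opt-out mentioned above; model ID assumed):
# from transformers import AutoModel
# model = AutoModel.from_pretrained(
#     "deepseek-ai/DeepSeek-OCR",
#     trust_remote_code=True,
#     attn_implementation="eager",
# )
```

Checking versions up front gives a clearer error than a failed remote-code load deep inside transformers.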
The post discusses the latest trends in OCR models, the multilingual support offered by modern OCR systems, their unique capabilities, OCR benchmark model comparisons, transformer-based implementations, and strategies for streamlining transformers compatibility.
Implemented DeepSeek-OCR to support the latest transformers on the strangervisionhf page. The page includes the model weights and corrected configuration, which fix the issues and allow transformers inference to run smoothly. 🤗🔥
✅ Supports the latest transformers
✅ You can also opt out of the attention implementation if needed
✅ Supports torch 2.6.0 or higher
✅ CUDA 12.4
If you are interested in experimenting with new things and streamlining compatibility, the strangervisionhf organization is open to you: come join the community.
Introducing Gliese-OCR-7B-Post2.0-final, a document content-structure retrieval VLM designed for content extraction (OCR), summarization, and document visual question answering. This is the fourth and final model in the Camel Doc OCR VLM series, following Gliese-OCR-7B-Post1.0. The model delivers superior accuracy across a wide range of document types, including scanned PDFs, handwritten pages, structured forms, and analytical reports. 📄🤗
Introducing the Finance-Instruct-500k-Japanese dataset 🚀
This is a Japanese-translated version of the @Josephgflowers Finance-Instruct-500k dataset, which includes complex questions and answers related to finance and economics.
How Financial News Can Be Used to Train Good Financial Models 💰 Numbers tell you what happened, but news tells you why. I've written an article explaining how news can be used to train AI models for sentiment analysis and better forecasting. Hope you find it interesting!
Given a news title, it calculates a sentiment score: if the score crosses a certain threshold, the strategy decides to buy or sell. Each trade lasts one day, and the strategy then computes the daily return. For Tesla, the best model seems to be the regression 📈 Just a quick note: the model uses the closing price as the buy price, meaning it already reflects the impact of the news.
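The threshold rule above can be sketched in a few lines. This is a minimal illustration, not the article's implementation: the threshold values are placeholders, entry is at day t's close (as the note says), and the one-day holding period makes each return a signed close-to-close move.

```python
def daily_returns(scores: list[float], closes: list[float],
                  buy_th: float = 0.5, sell_th: float = -0.5) -> list[float]:
    """For each day t: go long if the sentiment score for that day's news
    crosses buy_th, go short if it falls below sell_th, otherwise stay flat.
    The trade is entered at day t's close and held one day, so the realized
    return is the close-to-close move, signed by the position.
    Thresholds here are illustrative, not the article's fitted values."""
    returns = []
    for t in range(len(closes) - 1):
        move = closes[t + 1] / closes[t] - 1.0
        if scores[t] >= buy_th:
            returns.append(move)      # long: profit if price rises
        elif scores[t] <= sell_th:
            returns.append(-move)     # short: profit if price falls
        else:
            returns.append(0.0)       # no position, no return
    return returns

# Toy example: positive news before a rally, negative news before a drop.
r = daily_returns([0.9, -0.9, 0.0], [100.0, 110.0, 99.0, 99.0])
```

Summing or compounding the resulting series is then enough to compare models (e.g. regression vs. classification) on the same price history.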
Excited to announce 4 AWQ quantized models from #AllenAI! 🎉
Molmo-7B-D AWQ (14GB → 5GB): Efficient VLM performing between GPT-4V and GPT-4o on academic benchmarks, with just 6.1% perplexity degradation.
MolmoAct-7B-D AWQ (14GB → 6GB): Specialized robotic manipulation model reduced by ~57%.
Molmo-72B AWQ (145GB → 38GB): VLM with Qwen2-72B decoder that performs competitively with GPT-4, achieving only 10.5% perplexity degradation while saving 107GB of memory.
OLMo-2-32B-Instruct AWQ (64GB → 17GB): LLM post-trained on Tülu 3 with 3% perplexity degradation while saving ~50GB.
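The size figures above can be sanity-checked with simple arithmetic; the sketch below just recomputes each model's percentage reduction from the before/after sizes listed in this post:

```python
def reduction(before_gb: float, after_gb: float) -> float:
    """Percent size reduction achieved by quantization."""
    return 100.0 * (1.0 - after_gb / before_gb)

# (before, after) sizes in GB, as listed above
models = {
    "Molmo-7B-D": (14, 5),
    "MolmoAct-7B-D": (14, 6),        # ~57%, matching the post
    "Molmo-72B": (145, 38),
    "OLMo-2-32B-Instruct": (64, 17),
}
for name, (before, after) in models.items():
    print(f"{name}: {reduction(before, after):.0f}% smaller")
```

The roughly 3-4x shrink is what you'd expect from 4-bit AWQ weights replacing 16-bit ones, minus unquantized components such as embeddings and, for the VLMs, the vision tower.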
Introducing AWQ and GPTQ quantized versions of SmolVLM from Hugging Face!
Only the text model was quantized in each, giving a ~50% size reduction (4GB → 2GB) while keeping model degradation under 1% on the DocVQA benchmark.
Now you can try all the latest state-of-the-art multimodal vision-language models from the Qwen3-VL series demo on Hugging Face Spaces, including 4B, 8B, and 30B (Instruct, 4B-Thinking) variants. I've also uploaded the weights for the Abliterated variants of these models, up to 30B parameters. Check out the Spaces and model links below! 🤗🔥
Note: This is version 1.0 of the Abliteration of the Qwen3-VL series of models. It may perform suboptimally in some cases. If you encounter any issues, please open a discussion.