Instructions for using OpenMOSE/HRWKV7-Reka-Flash3.1-Preview with libraries and local apps.

Libraries

How to use OpenMOSE/HRWKV7-Reka-Flash3.1-Preview with Transformers:

```
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="OpenMOSE/HRWKV7-Reka-Flash3.1-Preview")

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("OpenMOSE/HRWKV7-Reka-Flash3.1-Preview", dtype="auto")
```
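The snippets above only load the model. A minimal sketch of a full load-and-generate round trip might look like the following. It assumes the checkpoint can be loaded through `AutoModelForCausalLM` and that `trust_remote_code=True` is acceptable in your environment; neither is confirmed by this page. The helper name `generate_text` and its defaults are hypothetical.

```python
MODEL_ID = "OpenMOSE/HRWKV7-Reka-Flash3.1-Preview"


def generate_text(prompt, model_id=MODEL_ID, max_new_tokens=64):
    """Load the tokenizer and model, then return only the newly generated text."""
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the repository may ship custom modeling code, hence
    # trust_remote_code=True. Drop it if the model loads without it.
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        dtype="auto",  # older transformers releases name this argument torch_dtype
        trust_remote_code=True,
    )

    # Tokenize the prompt and move the tensors to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # generate() returns prompt + continuation; slice off the prompt tokens.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
```

Calling `generate_text("Once upon a time,")` downloads the weights on first use, so expect a long first run and substantial memory use.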

Local Apps

vLLM

How to use OpenMOSE/HRWKV7-Reka-Flash3.1-Preview with vLLM:

Install from pip and serve model

```
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "OpenMOSE/HRWKV7-Reka-Flash3.1-Preview"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "OpenMOSE/HRWKV7-Reka-Flash3.1-Preview",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```
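The curl call above can also be issued from Python. The sketch below uses only the standard library and mirrors the same URL, headers, and JSON body. The function names `build_completion_request` and `send` are hypothetical, and the default base URL assumes the server is running locally on port 8000 as in the command above.

```python
import json
import urllib.request

MODEL_ID = "OpenMOSE/HRWKV7-Reka-Flash3.1-Preview"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send a prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, `send(build_completion_request("Once upon a time,"))` returns the parsed response. Passing `base_url="http://localhost:30000"` targets a server listening on port 30000 instead.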

SGLang

How to use OpenMOSE/HRWKV7-Reka-Flash3.1-Preview with SGLang:

Install from pip and serve model

```
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "OpenMOSE/HRWKV7-Reka-Flash3.1-Preview" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "OpenMOSE/HRWKV7-Reka-Flash3.1-Preview",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```
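Once the server replies, the generated text has to be pulled out of the JSON body. The sketch below assumes the response follows the OpenAI completions format that the endpoint advertises compatibility with, that is, a `choices` list whose entries carry `index` and `text` fields. The helper name `extract_completions` and the sample response are hypothetical and not captured from a real server.

```python
def extract_completions(response):
    """Return the generated text of each choice, ordered by its index."""
    choices = response.get("choices", [])
    ordered = sorted(choices, key=lambda choice: choice.get("index", 0))
    return [choice.get("text", "") for choice in ordered]


# Hypothetical response body, shaped like an OpenAI-style completions reply.
sample_response = {
    "object": "text_completion",
    "model": "OpenMOSE/HRWKV7-Reka-Flash3.1-Preview",
    "choices": [
        {"index": 0, "text": " there was a model.", "finish_reason": "stop"},
    ],
}

texts = extract_completions(sample_response)
```

A response without a `choices` key yields an empty list rather than raising, which keeps error payloads from crashing the caller.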

Use Docker images

```
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "OpenMOSE/HRWKV7-Reka-Flash3.1-Preview" \
        --host 0.0.0.0 \
        --port 30000
```

Docker Model Runner
How to use OpenMOSE/HRWKV7-Reka-Flash3.1-Preview with Docker Model Runner:
```
docker model run hf.co/OpenMOSE/HRWKV7-Reka-Flash3.1-Preview
```

Improve model card: Add metadata, paper abstract, links & transformers usage

by nielsr HF Staff - opened Jul 28, 2025

base: refs/heads/main ← from: refs/pr/1

Files changed: +96 -46

Improve model card: Add metadata, paper abstract, links & transformers usage (a10fae96)

This PR significantly improves the model card for HRWKV7-Reka-Flash3.1-Preview by:

- Adding essential metadata: pipeline_tag: text-generation, library_name: transformers, and comprehensive tags (rwkv, linear-attention, reka, distillation, knowledge-distillation, hybrid-architecture, language-model). This enhances discoverability and enables the "how to use" widget on the Hub.
- Adding the paper abstract for better context on the model's development via the RADLADS protocol.
- Updating the paper link to the official Hugging Face Papers page: RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale.
- Adding direct links to the main RADLADS project GitHub repository (https://github.com/recursal/RADLADS) and clarifying the link to this model's specific training code (https://github.com/OpenMOSE/RWKVInside).
- Replacing the non-standard curl usage snippet with a clear Python code example using the Hugging Face transformers library for easy model loading and generation.
- Adding the paper's BibTeX citation for proper attribution.

Please review and merge this PR if everything looks good.
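For readers unfamiliar with model card metadata: the fields named in the PR description live in a YAML front-matter block at the top of the repository's README.md. The sketch below renders such a block from the values listed above using only the standard library. The helper `render_front_matter` is hypothetical, handles only plain strings and flat lists, and is not the code used in the PR; the actual README may contain further fields not shown here.

```python
def render_front_matter(metadata):
    """Render a flat dict (strings and lists of strings) as YAML front matter."""
    lines = ["---"]
    for key, value in metadata.items():
        if isinstance(value, list):
            # Lists become a YAML block sequence, one "- item" per line.
            lines.append(f"{key}:")
            lines.extend(f"- {item}" for item in value)
        else:
            lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines) + "\n"


# Values taken from the PR description above.
metadata = {
    "pipeline_tag": "text-generation",
    "library_name": "transformers",
    "tags": [
        "rwkv",
        "linear-attention",
        "reka",
        "distillation",
        "knowledge-distillation",
        "hybrid-architecture",
        "language-model",
    ],
}

front_matter = render_front_matter(metadata)
print(front_matter)
```

Values that need YAML quoting (for example strings containing a colon) are out of scope for this sketch; a YAML library would be the safer choice for anything beyond simple tags.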

OpenMOSE changed pull request status to merged Jul 29, 2025
