To run this model, check out OpenArc.

Muse-12B-int4_asym-ov

This model was converted to OpenVINO IR using weight only compression to int4_asym.
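For readers unfamiliar with the term, here is a rough sketch of what asymmetric 4-bit weight quantization does to one group of weights. It is a toy illustration under stated assumptions, not the code path OpenVINO or NNCF actually runs: the function names and the sample group are made up for this example, and real weight compression works per group over whole tensors.

```python
def quantize_int4_asym(weights):
    """Map a group of floats to unsigned 4-bit codes (0..15) plus scale and zero point."""
    # Assumption for this sketch: extend the range to include 0.0 so the
    # zero point always lands inside 0..15.
    w_min = min(min(weights), 0.0)
    w_max = max(max(weights), 0.0)
    # 4 bits give 16 levels, i.e. 15 steps between min and max.
    scale = (w_max - w_min) / 15
    if scale == 0.0:
        scale = 1.0  # constant all-zero group; any scale works
    # "Asymmetric" means the code that represents 0.0 is not fixed at the midpoint.
    zero_point = max(0, min(15, round(-w_min / scale)))
    codes = [max(0, min(15, round(w / scale) + zero_point)) for w in weights]
    return codes, scale, zero_point


def dequantize_int4_asym(codes, scale, zero_point):
    """Recover approximate float weights from the 4-bit codes."""
    return [(c - zero_point) * scale for c in codes]


# Hypothetical weight group, chosen only to show a skewed range.
group = [-0.30, -0.05, 0.00, 0.12, 0.45]
codes, scale, zero_point = quantize_int4_asym(group)
restored = dequantize_int4_asym(codes, scale, zero_point)
print(codes, scale, zero_point)
print(restored)
```

Because only the weights are stored this way, activations stay in higher precision at inference time, which is what "weight only" refers to.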

Muse-12B has been exceptional so far.

It doesn't shy away from, um, intense topics, and refusals aren't a problem. At this time we don't have access to all the samplers recommended by LatitudeGames, but I haven't seen massive degradation. Long context performance remains strong, and with some scaffolding the model could be a reliable workhorse, though it is sometimes a bit verbose.

Another interesting use case has been tinkering inside talk_to_llm.py. This demo hooks up Muse-12B with whisper and kokoro using the OpenArc server.

A very interesting way to experience a text adventure.

Performance on A770

Results were captured using openarc bench.

Very nice.

openarc bench selects input tokens by sampling the entire vocabulary using a similar approach to llama-bench.
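As a rough idea of what that means, the sketch below draws a synthetic prompt by picking token IDs at random from the whole vocabulary. It is an assumption-laden illustration, not openarc bench's actual code: the function name, the fixed seed, and the vocabulary size are placeholders, and whether the real tool samples uniformly or skips special tokens is not stated here.

```python
import random


def sample_prompt_tokens(vocab_size, n_tokens, seed=0):
    """Return n_tokens random token IDs drawn uniformly from [0, vocab_size)."""
    rng = random.Random(seed)  # seeded so repeated runs see the same prompt
    return [rng.randrange(vocab_size) for _ in range(n_tokens)]


# Placeholder value; read the real size from the model's tokenizer.
HYPOTHETICAL_VOCAB_SIZE = 32_000

prompt_ids = sample_prompt_tokens(HYPOTHETICAL_VOCAB_SIZE, n_tokens=512)
print(len(prompt_ids), prompt_ids[:8])
```

Random IDs give a prompt of exact length without needing any text, so the numbers measure raw throughput, not output quality.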

input tokens: [512]
max tokens:   [128]
runs: 5

  benching... (5/5) 

Muse-12B-int4_asym-ov

┏━━━━━┳━━━━━┳━━━━━┳━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ run ┃   p ┃   n ┃ ttft(s) ┃ tpot(ms) ┃ prefill(t/s) ┃ decode(t/s) ┃ duration(s) ┃
┡━━━━━╇━━━━━╇━━━━━╇━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│   1 │ 512 │ 128 │    0.20 │    28.92 │       2623.5 │        34.6 │        3.87 │
│   2 │ 512 │ 128 │    0.16 │    28.92 │       3134.4 │        34.6 │        3.84 │
│   3 │ 512 │ 128 │    0.17 │    28.89 │       3007.7 │        34.6 │        3.84 │
│   4 │ 512 │ 128 │    0.17 │    28.88 │       3045.6 │        34.6 │        3.84 │
│   5 │ 512 │ 128 │    0.17 │    28.91 │       2998.3 │        34.6 │        3.84 │
└─────┴─────┴─────┴─────────┴──────────┴──────────────┴─────────────┴─────────────┘
Total: 5 runs
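
The columns relate to each other in a simple way, which the sketch below checks against the numbers above. The formulas are my reading of the table, not something taken from the openarc bench source: decode(t/s) as 1000 / tpot(ms), prefill(t/s) as p / ttft(s), and duration as ttft plus (n - 1) decode steps. Prefill will not match exactly because ttft is printed with only two decimals.

```python
from statistics import mean

# Values copied from the table above (p = prompt tokens, n = generated tokens).
runs = [
    {"p": 512, "n": 128, "ttft_s": 0.20, "tpot_ms": 28.92},
    {"p": 512, "n": 128, "ttft_s": 0.16, "tpot_ms": 28.92},
    {"p": 512, "n": 128, "ttft_s": 0.17, "tpot_ms": 28.89},
    {"p": 512, "n": 128, "ttft_s": 0.17, "tpot_ms": 28.88},
    {"p": 512, "n": 128, "ttft_s": 0.17, "tpot_ms": 28.91},
]


def decode_tps(tpot_ms):
    """Tokens per second implied by the time per output token."""
    return 1000.0 / tpot_ms


def prefill_tps(p, ttft_s):
    """Prompt tokens processed per second, estimated from time to first token."""
    return p / ttft_s


def estimated_duration_s(ttft_s, n, tpot_ms):
    """Assumed model: first token after ttft, then n - 1 further decode steps."""
    return ttft_s + (n - 1) * tpot_ms / 1000.0


mean_tpot_ms = mean(r["tpot_ms"] for r in runs)
print(f"mean tpot: {mean_tpot_ms:.2f} ms, decode: {decode_tps(mean_tpot_ms):.1f} t/s")
for r in runs:
    print(
        f"prefill ~{prefill_tps(r['p'], r['ttft_s']):.0f} t/s, "
        f"duration ~{estimated_duration_s(r['ttft_s'], r['n'], r['tpot_ms']):.2f} s"
    )
```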

System:
Xeon W2255
128GB DDR4-ECC
Asrock A770
Ubuntu 24.04: 6.14.4-061404-generic 
openvino   2025.3.0
openvino-genai  2025.3.0.0