deepseek-ai/DeepSeek-R1

#167 opened 4 months ago by

henrycwf

90+ tokens per second for MI300x8 using batch_size = 1

#166 opened 4 months ago by

ghostplant

RytryR1

#165 opened 4 months ago by

Rocka01

"aha moment" comment deleted by Perplexity (recovered)

👍 1

#164 opened 4 months ago by

FalconNet

Garbled output

#163 opened 4 months ago by

cell22

'num_hidden_layers': 61, but layer 62 has weights.

#162 opened 4 months ago by

xinhe

Upload GTG Breaking every Limit

#161 opened 4 months ago by

GTGenesis

support prefix complete

❤️ 👍 3

#158 opened 4 months ago by

HuggineAllen

Create app.py

#157 opened 4 months ago by

SpaceAgeRobotics

Create 1

#156 opened 4 months ago by

madevii

Brokersponsor

#155 opened 4 months ago by

Brokersponsor

Update README.md

#154 opened 4 months ago by

egegvner

Upload IMG_4530.png

#152 opened 4 months ago by

Noemie202586

Upload IMG_1745.JPG

#151 opened 4 months ago by

Ladib

Create Clara

#150 opened 4 months ago by

Clblinks

If I understand correctly, evaluating MATH-500 requires 64*500 model calls?

#149 opened 4 months ago by

Rorschaaaach

Request: DOI

🚀 1

#148 opened 4 months ago by

Tarush-Appreciate

Update README.md

#147 opened 4 months ago by

tekno-power

Update README.md

#146 opened 4 months ago by

Ekimnedops6969

Update README.md

❤️ 1

#143 opened 4 months ago by

MuhammadEhsan

Request for Information on Purchasing Reasoning API Key

#142 opened 4 months ago by

brahamaandai

ssss

🔥 1

#140 opened 5 months ago by

DZGT

Update model_max_length in tokenizer_config.json

👍 2

#139 opened 5 months ago by

kkokkie2360

Host of the model

#138 opened 5 months ago by

henrycwf

Lite version for DeepSeek-R1?

👍 👀 6

#137 opened 5 months ago by

haili-tian

[Bug] assert not self.training

4

#136 opened 5 months ago by

Gaie

Upload IMG_0253.HEIC

#134 opened 5 months ago by

rynty

Upload comment-sample.xlsx

#133 opened 5 months ago by

faham123

non-reasoning data

#132 opened 5 months ago by

cmgzy

Could you release some 4-bit weights? None of the GPUs I currently have support FP8

🔥 2

#131 opened 5 months ago by

zhnagchenchne

For the universe! DeepPhaser.py DeepCoralX.py and DeepSynapse.py

❤️ 👀 2

#129 opened 5 months ago by

karmikovic

Request: Create distill of Mistral Small 24B

#128 opened 5 months ago by

Kenshiro-28

Which vision model is R1 using for text extraction from images or PDFs?

#127 opened 5 months ago by

ashutoshroy02

PRII

#126 opened 5 months ago by

bajramani

Request: DOI

#125 opened 5 months ago by

Yungchizzy

Little brother(s) of big DeepSeek-R1 ?

#124 opened 5 months ago by

MrDevolver

Upload gugagagaggagagagga.pdf

#123 opened 5 months ago by

HahahhahH

Change quant_method to bitsandbytes_4bit

#121 opened 5 months ago by

ngoc24794

Unknown quantization type

5

#120 opened 5 months ago by

Reewaz321

Update config.json

#119 opened 5 months ago by

keerthanaOfficial2001

So how much VRAM is needed to deploy a 671B model, and is there a baseline hardware configuration?

27

#118 opened 5 months ago by

cena163

nodejs

#117 opened 5 months ago by

k1de

Distill Compatibility for PC w/ Ryzen 7 Pro 8840HS w/ 780M Graphics 2x32GB RAM 1TB DDR5 SSD

#115 opened 5 months ago by

arzx

Upload gitattributes.txt

#114 opened 5 months ago by

SafeerChalil

Introducing Deepseek's TinyZero

❤️ 1

#113 opened 5 months ago by

DeepSeekModerator

Create Kuch v

#112 opened 5 months ago by

gamerdowntown

Request: DOI

#111 opened 5 months ago by

Hassanabbas2975

quantization fp8 error occurring while using pipeline approach or transformer-based approach