https://www.reddit.com/r/LocalLLaMA/s/DrZzCcFDi9
Epyc 9374F, 12x32GB RAM, that's 384GB of RAM so DeepSeek V3 just barely fits quantized to Q4_K_M.
TG (token generation) ~ 7-9 tokens/s
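A quick sanity check on both numbers (a rough sketch: the 671B total / 37B active parameter counts are DeepSeek-V3's published figures; the ~4.5 effective bits/weight for a Q4-class quant and the DDR5-4800 bandwidth are my assumptions, not from the post):

```python
# Back-of-the-envelope check of the quoted Epyc setup.
total_params = 671e9    # DeepSeek-V3 total parameters (published figure)
active_params = 37e9    # parameters activated per token (MoE, published figure)
bpw = 4.5               # assumed effective bits/weight for a Q4-class GGUF

weights_gb = total_params * bpw / 8 / 1e9
print(f"weights: ~{weights_gb:.0f} GB vs 384 GB installed")  # ~377 GB: barely fits

# Token generation is memory-bandwidth bound: each token streams the active
# weights once. 12 channels of DDR5-4800 give ~460 GB/s theoretical peak.
peak_bw = 12 * 4800e6 * 8 / 1e9                  # GB/s
gb_per_token = active_params * bpw / 8 / 1e9     # ~21 GB read per token
print(f"upper bound: ~{peak_bw / gb_per_token:.0f} tok/s")  # ~22; 7-9 measured is plausible
```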
You want to accuse DeepSeek V3 of being a finetune of some other model? Then wouldn't this pile of open-source weights already be equal to, or even better than, ChatGPT's model? As for "most likely the backend is secretly connecting to some external API": the entire set of weights is right there, you can download it and try it yourself, so what external API would it even be calling?
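To the commenter's point that the weights can be run with no network at all, here is a minimal sketch with Hugging Face transformers, assuming the repo above has already been downloaded to a local folder (the path and generation settings are illustrative, and actually running it needs hardware like the Epyc box above):

```python
import os
os.environ["HF_HUB_OFFLINE"] = "1"  # hard-fail on any attempt to reach the Hub

from transformers import AutoModelForCausalLM, AutoTokenizer

path = "./DeepSeek-V3"  # local copy of the weights; no network from here on
tok = AutoTokenizer.from_pretrained(path, local_files_only=True)
model = AutoModelForCausalLM.from_pretrained(
    path,
    local_files_only=True,   # never touch the network
    trust_remote_code=True,  # the repo ships its own model code
    device_map="auto",
)

inputs = tok("Hello", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tok.decode(out[0]))  # answers come from local weights, not an external API
```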
In the end, is this a software version of Hanxin (the faked "homegrown" chip from the 2003 Chinese scandal)?
https://huggingface.co/deepseek-ai/DeepSeek-V3/tree/main
The OP claims it only took one tenth of the compute to train. If that were true, wouldn't NVIDIA's stock crash? You'd be able to train it with ordinary GPUs.
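For reference, the number behind the "one tenth" framing: the DeepSeek-V3 technical report puts the full training run at about 2.788M H800 GPU-hours, costed at the paper's own assumed rental rate:

```python
gpu_hours = 2.788e6  # H800 GPU-hours, as reported in the DeepSeek-V3 paper
rate_usd = 2.0       # USD per GPU-hour, the paper's assumed rental price
print(f"~${gpu_hours * rate_usd / 1e6:.3f}M")  # ~$5.576M headline training cost
```

Whether that is a fair comparison against frontier labs' unpublished budgets is exactly what this thread is arguing about.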
Did they quietly use a pretrained model? I've worked on similar projects; starting from a pretrained model lets you skip a huge amount of wasted effort in the middle.
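What "skipping the wasted effort" looks like in practice: initializing from an existing checkpoint instead of random weights. A generic sketch (the model name is hypothetical, and this is not a claim about what DeepSeek actually did):

```python
from transformers import AutoConfig, AutoModelForCausalLM

name = "some-org/some-base-model"  # hypothetical donor checkpoint

# From scratch: random init, you pay for every pre-training FLOP yourself.
config = AutoConfig.from_pretrained(name)
scratch = AutoModelForCausalLM.from_config(config)

# Warm start: inherit the weights, so the expensive early training is
# already done before you see a single batch.
warm = AutoModelForCausalLM.from_pretrained(name)
```

The catch, as the next comment notes, is that the warm-started model inherits the donor's data, quirks, and license.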
But the problem is you then become dependent on that pre-trained model. I'm not saying the mainland must be faking it, but I'd rather not jump on the bandwagon too early; if it's real, the paper is an even bigger deal.