
Dribblersportz
Founded Date: July 1, 2001
Sectors: Telecommunications
Company Description
DeepSeek’s First-generation Reasoning Models
DeepSeek’s first-generation reasoning models, achieving performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
Models
DeepSeek-R1
Distilled models
The DeepSeek team has shown that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance than the reasoning patterns discovered through RL on small models.
Below are the models created by fine-tuning several dense models widely used in the research community on reasoning data generated by DeepSeek-R1. The evaluation results demonstrate that the distilled smaller models perform exceptionally well on benchmarks.
DeepSeek-R1-Distill-Qwen-1.5B
DeepSeek-R1-Distill-Qwen-7B
DeepSeek-R1-Distill-Llama-8B
DeepSeek-R1-Distill-Qwen-14B
DeepSeek-R1-Distill-Qwen-32B
DeepSeek-R1-Distill-Llama-70B
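The distillation idea described above (fine-tuning a smaller dense model on reasoning data generated by a larger one) can be illustrated with a minimal sketch of the data-preparation step. Everything here is an assumption made for illustration: `toy_teacher` is a hypothetical stand-in for a large reasoning model, and the `prompt`/`completion` record layout is just one common supervised fine-tuning format, not necessarily the one DeepSeek used.

```python
import json
from dataclasses import dataclass


@dataclass
class TeacherOutput:
    reasoning: str  # step-by-step trace produced by the larger model
    answer: str     # final answer produced by the larger model


def toy_teacher(prompt: str) -> TeacherOutput:
    """Hypothetical stand-in for querying a large reasoning model."""
    canned = {
        "What is 2 + 3?": TeacherOutput("2 plus 3 equals 5.", "5"),
        "What is 4 * 6?": TeacherOutput("4 times 6 equals 24.", "24"),
    }
    return canned[prompt]


def build_sft_record(prompt: str, out: TeacherOutput) -> dict:
    """Pair a prompt with the teacher's full trace as the training target."""
    # Assumed record layout: the student learns to reproduce both the
    # reasoning and the final answer, not just the answer.
    return {
        "prompt": prompt,
        "completion": f"{out.reasoning}\nAnswer: {out.answer}",
    }


def build_distillation_set(prompts: list[str]) -> list[dict]:
    """Collect one teacher-generated training record per prompt."""
    return [build_sft_record(p, toy_teacher(p)) for p in prompts]


records = build_distillation_set(["What is 2 + 3?", "What is 4 * 6?"])

# Serialize as JSON Lines, one training example per line.
jsonl = "\n".join(json.dumps(r) for r in records)
print(jsonl)
```

The resulting JSONL could then be fed to an ordinary supervised fine-tuning run on the smaller model. The point of the sketch is only that the student is trained on the teacher's full reasoning trace rather than on final answers alone.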
License
The model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and allows any modifications and derivative works, including, but not limited to, distillation for training other LLMs.