The DeepSeek-V3 repository on GitHub (deepseek-ai/DeepSeek-V3) represents a significant milestone for open-weights large language models. Developed by DeepSeek AI, it builds on the foundation of DeepSeek-V2 but incorporates substantial improvements in architecture, training data, and training efficiency, resulting in a markedly more capable model. At its core, DeepSeek-V3 is a Mixture-of-Experts large language model tuned for dialogue, coding, math, and multi-step reasoning, aiming to rival the performance of proprietary models like GPT-4o while keeping its weights openly available.
Much of the innovation lies in its training methodology. The model was pre-trained on 14.8 trillion high-quality tokens using an FP8 mixed-precision training framework, which kept the full training run remarkably economical. It also adopts a multi-token prediction (MTP) objective: at each position the model is trained to predict several future tokens rather than only the immediate next one, densifying the training signal and enabling speculative decoding at inference time. Post-training combines supervised fine-tuning with reinforcement learning, and notably distills chain-of-thought reasoning capability from the DeepSeek-R1 series of models, strengthening the model's ability to plan, articulate intermediate steps, and correct its own errors on complex problems.
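The multi-token prediction idea above can be sketched in a simplified form: for each position in a sequence, the training targets are the next `depth` future tokens rather than a single next token. The helper below is a toy illustration of target construction only, not DeepSeek-V3's actual training code:

```python
# Toy sketch of multi-token prediction (MTP) target construction.
# Illustration only; DeepSeek-V3's real MTP modules predict future
# tokens through sequential prediction heads, which is omitted here.

def mtp_targets(tokens, depth):
    """For each position i, return the next `depth` tokens as targets.

    Positions with fewer than `depth` tokens remaining are dropped,
    since not all future targets exist there.
    """
    targets = []
    for i in range(len(tokens) - depth):
        targets.append(tokens[i + 1 : i + 1 + depth])
    return targets

seq = [5, 9, 2, 7, 3]
print(mtp_targets(seq, 2))  # each position predicts the next 2 tokens
```

With `depth=1` this reduces to the standard next-token objective, which is why MTP can be seen as a strict densification of the usual language-modeling signal.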
DeepSeek-V3's architecture builds on DeepSeek-V2's core design: Multi-head Latent Attention (MLA) for efficient inference and the DeepSeekMoE Mixture-of-Experts (MoE) architecture for economical training. The MoE layers activate only a small subset of expert networks for each token, so while the model has 671B total parameters, only 37B are activated per token, keeping computational cost and inference latency far below what a dense model of the same capacity would require. V3 additionally pioneers an auxiliary-loss-free strategy for expert load balancing. The repository provides detailed documentation, links to the model weights (base and chat variants), and instructions for running inference with frameworks such as SGLang, LMDeploy, TensorRT-LLM, and vLLM.
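The computational saving of MoE comes from routing each token to only the top-k scoring experts. The pure-Python sketch below illustrates that routing step in miniature; names and shapes are illustrative, and DeepSeek-V3's real router additionally applies a load-balancing bias term that is omitted here:

```python
# Toy sketch of top-k Mixture-of-Experts routing in pure Python.
# Illustration only, not DeepSeek-V3's implementation.
import math

def top_k_route(scores, k):
    """Pick the k highest-scoring experts and softmax-normalize
    their scores into mixing weights."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen = ranked[:k]
    exps = [math.exp(scores[i]) for i in chosen]
    total = sum(exps)
    return {i: e / total for i, e in zip(chosen, exps)}

def moe_layer(x, experts, gate_scores, k=2):
    """Combine only the k selected experts' outputs, weighted by the gate.
    The other experts are never evaluated -- that is the saving."""
    weights = top_k_route(gate_scores, k)
    return sum(w * experts[i](x) for i, w in weights.items())

# Eight tiny stand-in "experts"; only two actually run for this input.
experts = [lambda x, m=m: m * x for m in range(1, 9)]
out = moe_layer(2.0, experts,
                gate_scores=[0.1, 2.0, 0.3, 1.5, 0.0, 0.2, 0.4, 0.1])
```

Scaled up, the same principle lets total parameter count (capacity) grow almost independently of per-token compute, which is what makes the 671B-total / 37B-active split workable.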
Evaluation is a central focus of the DeepSeek-V3 project. The team employs a rigorous benchmarking suite spanning knowledge, mathematics, coding, and reasoning, with tasks such as MMLU, GSM8K, MATH, and HumanEval, alongside open-ended generation evaluations to gauge coherence, accuracy, and overall helpfulness. The reported results show DeepSeek-V3 outperforming other open-source models across most of these benchmarks and achieving performance comparable to leading closed-source models on several of them. The repository and accompanying technical report include detailed result tables, giving a clear picture of the model's strengths and weaknesses.
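Exact-match scoring on a benchmark like GSM8K typically extracts the final numeric answer from a generated chain-of-thought solution and compares it against the gold answer. The simplified scorer below illustrates that convention; the extraction heuristic is an assumption for illustration, not the project's actual evaluation harness:

```python
# Simplified GSM8K-style exact-match scorer.
# Illustration only; real harnesses may use stricter extraction rules.
import re

def extract_final_number(text):
    """Return the last number in the text -- a common heuristic for
    chain-of-thought outputs that end with the final answer."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return float(matches[-1]) if matches else None

def exact_match_accuracy(predictions, golds):
    """Fraction of predictions whose extracted answer equals the gold."""
    hits = sum(
        1 for p, g in zip(predictions, golds)
        if extract_final_number(p) == g
    )
    return hits / len(golds)

preds = ["3 + 4 = 7, so the answer is 7", "She has 12 apples left."]
acc = exact_match_accuracy(preds, [7.0, 11.0])  # second answer is wrong
```

Exact-match metrics like this are deliberately unforgiving, which is part of why math benchmarks discriminate well between models with and without strong multi-step reasoning.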
Finally, the DeepSeek-V3 project is committed to open collaboration. The repository is actively maintained, with frequent updates, bug fixes, and community contributions. The team encourages users to experiment with the model, contribute to the project, and help advance the state of the art in open-weights language models. The project's success hinges on community involvement, and the GitHub repository serves as the central hub for that effort.