Apr 24
DeepSeek V4 - almost on the frontier, a fraction of the price
significance 4/5
DeepSeek has released the first two models of its highly anticipated V4 series: DeepSeek-V4-Pro and DeepSeek-V4-Flash. Both models have a 1 million token context window and use a Mixture-of-Experts architecture, and the Pro version is one of the largest open-weights models available.
Why it matters
High-parameter frontier performance is decoupling from extreme compute costs, challenging the dominance of Western-centric, high-cost proprietary models.
Entities mentioned
DeepSeek
Tags
#deepseek #open weights #llm #mixture of experts #v4
Related coverage
- arXiv cs.CL: Au-M-ol: A Unified Model for Medical Audio and Language Understanding
- Simon Willison: Introducing talkie: a 13B vintage language model from 1930
- Hugging Face: Adaptive Ultrasound Imaging with Physics-Informed NV-Raw2Insights-US AI
- Simon Willison: microsoft/VibeVoice
- WIRED AI: The Man Behind AlphaGo Thinks AI Is Taking the Wrong Path