JetBrains Releases Mellum2, a 12B Sparse Model for Sub-Second Inference
JetBrains' new Mixture-of-Experts model achieves a 2x speedup over dense peers while activating just 2.5B parameters per token.
Anthropic shipped Claude Opus 4.8 on May 28, introducing a new agentic framework and improved uncertainty handling amid intensifying LLM competition.
A GitHub repository curates datasets for LLM fine-tuning, instruction tuning, and benchmarking across medical, NLP, multimodal, and code domains.