MiniMax-M1 is the world's first open-weight, large-scale hybrid-attention reasoning model.
It combines a Mixture-of-Experts (MoE) architecture with a lightning attention mechanism. The model supports a context length of 1 million tokens and processes long inputs efficiently.
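Lightning attention is MiniMax's hardware-efficient variant of linear attention. As a rough illustration of why this family of mechanisms scales to million-token contexts, here is a minimal NumPy sketch of generic causal linear attention; the feature map and normalization are common illustrative choices, not MiniMax's actual kernel.

```python
import numpy as np

def linear_attention(q, k, v):
    """O(n * d^2) causal linear attention.

    Softmax attention materializes an (n x n) score matrix, so cost
    grows quadratically with sequence length n. Linear attention
    keeps a running (d x d) key-value state instead, so cost grows
    linearly in n -- the property that makes very long contexts
    tractable.
    """
    n, d = q.shape
    # Positive feature map (elu(x) + 1 is a common choice).
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))
    q, k = phi(q), phi(k)

    state = np.zeros((d, d))   # running sum of outer(k_t, v_t)
    norm = np.zeros(d)         # running sum of k_t, for normalization
    out = np.zeros_like(v)
    for t in range(n):
        state += np.outer(k[t], v[t])
        norm += k[t]
        out[t] = (q[t] @ state) / (q[t] @ norm + 1e-6)
    return out

# Toy usage: per-token cost stays constant no matter how many tokens precede it.
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((8, 4)) for _ in range(3))
print(linear_attention(q, k, v).shape)  # (8, 4)
```

The key point is that the per-step state has fixed size (d x d), so memory and compute per token do not grow with context length, unlike the quadratic cost of full softmax attention.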
It is trained using reinforcement learning on diverse problems, including software engineering and mathematical reasoning.
MiniMax-M1 outperforms other models on complex tasks such as software engineering and long-context reasoning.