DeepSeek Unveils Prover V2, 671 Billion Parameter AI Model
DeepSeek, a prominent Chinese artificial intelligence development company, has unveiled a new open-weight large language model (LLM) named Prover V2. The model was uploaded to the hosting service Hugging Face on April 30 and is released under the permissive open-source MIT license. Prover V2 is designed to tackle math proof verification, a task that involves translating mathematical problems into formal logic using the Lean 4 programming language, a tool widely used for proving theorems.
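For readers unfamiliar with formal proofs, the following toy Lean 4 theorem gives a sense of what a machine-checkable statement looks like. It is an illustration only and is not taken from DeepSeek's materials; the theorem name is made up for this example.

```lean
-- A toy statement: addition of natural numbers is commutative.
-- `add_comm_example` is a hypothetical name chosen for this illustration.
theorem add_comm_example (a b : Nat) : a + b = b + a := by
  -- Close the goal with the library lemma that states exactly this fact.
  exact Nat.add_comm a b
```

Lean accepts the theorem only if every step checks, which is what makes the language suitable for verifying proofs rather than merely writing them down.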
Prover V2 has 671 billion parameters, making it significantly larger than its predecessors, Prover V1 and Prover V1.5, which were released in August 2024. The developers claim that Prover V2 compresses mathematical knowledge into a format that allows it to generate and verify proofs, potentially aiding research and education. The weights have been quantized to 8-bit floating point precision, roughly half the size they would occupy at 16 bits. Even so, the model’s large parameter count results in a file size of approximately 650 gigabytes, which requires substantial RAM or VRAM and processing power to run.
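The relationship between parameter count, numeric precision, and storage can be checked with back-of-the-envelope arithmetic. The sketch below is an estimate under simple assumptions: it counts raw weight bytes only, ignores file metadata and any tensors kept at a different precision, and uses a hypothetical helper name, so it will not match a published file size exactly.

```python
def weights_size_gb(num_params: int, bits_per_param: int) -> float:
    """Approximate raw weight size in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Parameter count reported in the article.
PARAMS = 671_000_000_000

size_16bit = weights_size_gb(PARAMS, 16)  # two bytes per parameter
size_8bit = weights_size_gb(PARAMS, 8)    # one byte per parameter

print(f"16-bit estimate: {size_16bit:.0f} GB")
print(f" 8-bit estimate: {size_8bit:.0f} GB")
print(f"ratio: {size_8bit / size_16bit:.2f}")
```

At one byte per parameter the estimate lands in the same range as the roughly 650 gigabytes reported, while 16-bit storage would need about twice that.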
Prover V1 was based on the seven-billion-parameter DeepSeekMath model and was fine-tuned on synthetic data. Synthetic data is training data that was itself generated by AI models, an approach that has gained ground as human-generated data becomes increasingly scarce. Prover V1.5 reportedly improved on the previous version by optimizing both training and execution, achieving higher accuracy in benchmarks. The improvements introduced by Prover V2 are unclear, as no research paper or other information had been published at the time of writing.
The number of parameters in the Prover V2 weights suggests that it is likely based on the company’s previous R1 model. When it was first released, R1 made waves in the AI space with performance comparable to OpenAI’s o1, then the state-of-the-art model. R1’s release also raised security concerns, and some described it as China’s “Sputnik moment.”
Publicly releasing the weights of LLMs is a controversial topic. On one side, it is a democratizing force that allows the public to access AI on their own terms without relying on private company infrastructure. On the other side, it means that the company cannot step in and prevent abuse of the model by enforcing limitations on dangerous user queries. Open-source proponents rejoiced that DeepSeek continued where Meta left off with the release of its LLaMA series of open-source AI models, proving that open AI is a serious contender against OpenAI’s closed AI. The accessibility of those models also continues to improve.
Now, even users without access to a supercomputer can run LLMs locally. This is primarily thanks to two AI development techniques: model distillation and quantization. Distillation refers to training a compact “student” network to replicate the behavior of a larger “teacher” model, keeping most of the performance while cutting the parameter count so the model can run on less powerful hardware. Quantization consists of reducing the numeric precision of a model’s weights and activations to shrink its size and boost inference speed with only minor accuracy loss. An example is Prover V2’s reduction from 16-bit to 8-bit floating point numbers, and further reductions are possible by halving the bit width again. Both techniques cost some model performance, but usually leave the model largely functional.
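The idea behind quantization can be shown on a handful of numbers. The sketch below is a simplified stand-in, not the scheme used for Prover V2: it maps values to 8-bit integers with a single shared scale, whereas the article describes 8-bit floating point. The function names and sample weights are hypothetical.

```python
def quantize_int8(weights):
    """Map floats to signed 8-bit integers using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # The largest magnitude maps to 127; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]


# Made-up sample weights for illustration.
weights = [0.42, -1.27, 0.003, 0.9, -0.55]

quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("stored integers:", quantized)
print("largest rounding error:", max_error)
```

Each value now fits in one byte, and the rounding error per value is at most half the scale. That small, bounded error is the "minor accuracy loss" traded for the smaller footprint.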
DeepSeek’s R1 was distilled into smaller versions built on retrained LLaMA and Qwen models, ranging from 70 billion parameters down to 1.5 billion parameters. The smallest of those models can even run reliably on some mobile devices. This development underscores the growing trend of making advanced AI models accessible to a broader range of users, potentially democratizing access to cutting-edge technology.
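One common textbook way to make a student imitate a teacher is to penalize the gap between their output distributions. The sketch below shows that generic formulation with made-up logits and hypothetical function names; the source does not say how the R1 distillations were trained, so this should not be read as DeepSeek's method.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's distribution to the student's."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Made-up scores over three candidate tokens.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]  # nearly agrees with the teacher
far_student = [0.5, 4.0, 1.0]    # prefers a different token

loss_close = distillation_loss(teacher, close_student)
loss_far = distillation_loss(teacher, far_student)

print("loss when the student agrees:", loss_close)
print("loss when the student disagrees:", loss_far)
```

Training would adjust the student's parameters to push this loss toward zero, which is how a much smaller model can inherit behavior from a larger one.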
