DeepSeek, a Chinese artificial-intelligence start-up, released a new model and paper titled “Thinking with Visual Primitives” in April 2026 [1].
The release aims to change how artificial intelligence perceives and reasons about images. By focusing on efficiency, the model seeks to make image understanding more capable while reducing the cost of processing complex visual data [3, 4].
The new release follows a rapid development cycle for the company. DeepSeek announced its first-generation reasoning models on Jan. 20, 2025 [2]. The rollout of DeepSeek V4 in April 2026 marks a shift toward optimizing how AI handles long-context visual reasoning [1].
The company said the V4 architecture enables million-token reasoning at a lower cost [3]. This efficiency allows the AI to process larger amounts of visual information without the expense typically associated with long-context reasoning tasks. The approach uses "visual primitives" to streamline the way the system interprets image data [1].
Industry analysts said the move positions the company as a competitor in the open-source AI race [1]. By lowering the barrier to entry for complex visual reasoning, the start-up is challenging assumptions about how much computing power advanced AI perception requires [3].
DeepSeek said the goal of the project is to deliver a system that is both cheaper and more capable of handling long visual contexts [3, 4]. This capability allows the model to maintain a more comprehensive "memory" of an image, or a series of images, during the reasoning process.
The introduction of visual primitives suggests a move away from brute-force computation toward more efficient data representation. If DeepSeek succeeds in lowering the cost of long-context visual reasoning, it could open high-end image analysis to developers who cannot afford the compute costs of larger, proprietary models.





