U.S.-based Perceptron Inc. Releases Mk1 Video Analysis Model, Priced 80% to 90% Lower Than Competitors
2026-05-13 14:19

en.Wedoany.com Reported - U.S. AI startup Perceptron Inc. officially launched its flagship video analysis reasoning model, Mk1, on May 12, 2026. It is priced at $0.15 per million input tokens and $1.50 per million output tokens, 80% to 90% lower than comparable frontier models such as Anthropic Claude Sonnet 4.5, OpenAI GPT-5, and Google Gemini 3.1 Pro. The model is purpose-built for video understanding and embodied reasoning, targeting physical-industry scenarios such as manufacturing, robotics, and security, with the aim of removing the constraint that high API costs place on large-scale deployment.
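At these per-token rates, deployment costs can be estimated directly. The sketch below is a minimal cost calculator using the Mk1 prices from the announcement; the workload figures (tokens per request, monthly request volume) are illustrative assumptions, not numbers from the launch.

```python
# Estimate monthly API cost from the listed Mk1 prices.
# Prices are from the announcement; the workload numbers are hypothetical.

INPUT_PRICE_PER_M = 0.15   # USD per million input tokens
OUTPUT_PRICE_PER_M = 1.50  # USD per million output tokens

def monthly_cost(input_tokens_per_req, output_tokens_per_req, requests):
    """Blended cost in USD for a given monthly workload."""
    total_in = input_tokens_per_req * requests
    total_out = output_tokens_per_req * requests
    return (total_in / 1e6) * INPUT_PRICE_PER_M + (total_out / 1e6) * OUTPUT_PRICE_PER_M

# Hypothetical example: 1M requests/month, 8K input + 500 output tokens each.
cost = monthly_cost(8_000, 500, 1_000_000)
print(f"${cost:,.2f}")  # 1200 (input) + 750 (output) = $1,950.00
```

At the quoted 80% to 90% discount, the same workload on a comparable frontier model would land roughly in the $10,000 to $20,000 range, which is the gap the company argues has kept such deployments shelved.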

In multiple industry benchmarks, Mk1 matches or even surpasses frontier models at the cost level of lightweight models. On the spatial reasoning benchmark EmbSpatialBench, Mk1 scored 85.1, ahead of Google Robotics-ER 1.5's 78.4 and Alibaba Q3.5-27B's roughly 84.5; on the RefSpatialBench referring-expression comprehension test, Mk1 scored 72.4, far exceeding GPT-5m's 9.0 and Claude Sonnet 4.5's 2.2. On video benchmarks, Mk1 scored 41.4 on the difficult subset of EgoSchema, well ahead of Gemini 3.1 Flash-Lite's 25.0, and achieved 88.5 on VSI-Bench, the highest score among all evaluated models.

The model's design centers on temporal reasoning in the physical world. Mk1 analyzes video at dynamic frame rates of up to 2 FPS within a 32K-token context window, tracking event chains frame by frame and returning structured timestamps. Given a long video stream and a query, the model can locate event nodes and generate time codes. In robotics, this capability can directly convert teleoperation footage into training data for path planning and grasp detection, collapsing the previously separate steps of visual understanding, action labeling, and data-loop closure into a single model call. Beyond video, Mk1 also offers frontier-level capabilities in image reasoning, complex OCR reading, and structured document extraction, supporting precise recognition of pointer positions and numerical readings on dashboards and industrial control panels.
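The 2 FPS sampling rate and 32K-token context together bound how much footage fits in a single call. The back-of-envelope sketch below makes that bound concrete; the per-frame token cost and the prompt reserve are illustrative assumptions (the announcement does not state either), so the result is an order-of-magnitude estimate, not a spec.

```python
# How much video fits in one Mk1 call? Back-of-envelope only:
# the 2 FPS rate and 32K context are from the announcement;
# TOKENS_PER_FRAME and PROMPT_BUDGET are assumptions.

CONTEXT_TOKENS = 32_000
FPS = 2
TOKENS_PER_FRAME = 256   # assumed vision-token cost per sampled frame
PROMPT_BUDGET = 2_000    # assumed reserve for instructions and the answer

def max_clip_seconds(context=CONTEXT_TOKENS, fps=FPS,
                     per_frame=TOKENS_PER_FRAME, reserve=PROMPT_BUDGET):
    """Longest clip (seconds) that fits in one context window."""
    frames = (context - reserve) // per_frame
    return frames / fps

print(max_clip_seconds())  # 30000 // 256 = 117 frames -> 58.5 s under these assumptions
```

Under these assumed numbers a single call covers roughly a minute of footage, which suggests longer teleoperation sessions would be processed as a sequence of windowed calls.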

The company was co-founded by Armen Aghajanyan and Akshat Shrivastava just two years ago. CEO Aghajanyan previously conducted AI research at Meta FAIR and Microsoft, while CTO Shrivastava focuses on physical AI and robotics. Aghajanyan stated at the launch event, "We created Perceptron to enable AI systems to read the physical world. Previously, the cost of frontier visual understanding far exceeded the reach of most industrial and consumer applications; we have changed that." Shrivastava further noted that robotics represents the most demanding test scenario for real-world physical AI, requiring perception, reasoning, and execution to form a closed loop under real conditions, and Mk1 was developed precisely with this as its benchmark.

From its inception, Perceptron has targeted the efficiency frontier: on a plot of composite video and embodied-reasoning benchmark scores against blended token cost, Mk1 sits in the same performance bracket as GPT-5 and Gemini 3.1 Pro while its cost approaches that of lightweight variants. Enterprise deployers can therefore scale up visual understanding systems without compromising accuracy, bringing projects previously shelved on cost grounds, such as production-line quality inspection, warehouse inventory counting, and drone inspection, back within budget.

Mk1 ships with several specialized capabilities designed for industrial scenarios. Built-in in-context learning and multimodal prompting let users find matches across new images from just a single reference image or video clip, eliminating the need for fine-tuning data and training pipelines. Pointing and counting functions handle dense scenes precisely, such as vehicles in a parking lot, shelf inventory, or parts on a pallet, where previous models often exhibited counting errors or positional drift.

Mk1 is available immediately to developers and enterprises through the Perceptron AI API platform and OpenRouter. The company is based in Bellevue and Carnation, Washington. Mk1 is the first member of Perceptron's closed-source model family, while its previous generation open-source Isaac series will continue to be maintained.
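OpenRouter exposes models through an OpenAI-compatible chat-completions interface, so a request to Mk1 would likely look like the sketch below. The model slug "perceptron/mk1" and the video content part are assumptions for illustration only; the actual identifier and media format must be taken from the provider's documentation. The sketch builds the request payload without sending it.

```python
import json

# Sketch of an OpenAI-compatible chat request of the kind OpenRouter accepts.
# The model slug and the video content-part shape are hypothetical.

def build_request(video_url: str, question: str) -> dict:
    return {
        "model": "perceptron/mk1",  # hypothetical slug; check provider docs
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                # Media part formats vary by provider; placeholder shape here.
                {"type": "video_url", "video_url": {"url": video_url}},
            ],
        }],
    }

payload = build_request("https://example.com/clip.mp4",
                        "List each event with a start timestamp.")
print(json.dumps(payload)[:60])
```

Sending the payload would be an ordinary authenticated POST to the chat-completions endpoint; the timestamped event list described above would come back in the assistant message.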

This article is compiled by Wedoany. All AI citations must indicate the source as "Wedoany". If there is any infringement or other issues, please notify us promptly, and we will modify or delete it accordingly. Email: news@wedoany.com