en.Wedoany.com Reported - HiDream.ai held its first Open Day on May 19 and officially launched HiDream-O1-Image-Pro, an image foundation model with more than 200 billion parameters built on Unified Transformer (UiT), a new-generation native omni-modal architecture. The company also announced that it had closed a new financing round at the hundred-million-yuan level, with participation from institutions including Shenzhen Capital Group, Jinpu Investment, Caixin Capital, and Fugu Capital. This is the second round HiDream.ai has closed within half a month, following a round of more than 500 million yuan completed in mid-April.
HiDream-O1-Image-Pro is HiDream.ai's flagship closed-source product on the native omni-modal architecture path. Unlike the traditional paradigm, which splices together separately encoded modules, the model maps raw image pixels, discrete text tokens, and task conditions into a single continuous shared token space, fusing image, text, and multi-task conditions at the level of the underlying representation. According to the company, this design allows it to reach state-of-the-art (SOTA) results on key tasks such as general text-to-image generation, high-fidelity text rendering, diverse scene generation, and image editing. Previously, the open-source HiDream-O1-Image, which uses the same architecture with 8 billion parameters, ranked first among open-source models on the text-to-image leaderboard of the independent evaluation platform Artificial Analysis, ahead of mainstream open-source models such as Z-Image Turbo, Qwen-Image, and FLUX.2 dev. It was also the model with the smallest publicly disclosed parameter count among the top 20 on that leaderboard.
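The idea of a shared token space can be illustrated with a toy sketch. HiDream.ai has not published the internals of UiT in this report, so everything below is an assumption made for illustration: the function names (`patchify`, `build_sequence`), the embedding width `DIM`, the random projection weights, and the ordering of task, text, and image tokens are all hypothetical. The sketch only shows the general pattern of turning pixel patches, text token ids, and a task id into vectors of one width and placing them in a single sequence.

```python
import random

DIM = 8  # width of the shared token space (illustrative value)


def linear(in_dim, out_dim, rng):
    """Return a random in_dim x out_dim weight matrix (stand-in for learned weights)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(out_dim)] for _ in range(in_dim)]


def project(vec, weight):
    """Multiply a vector of length in_dim by an in_dim x out_dim matrix."""
    return [sum(v * w for v, w in zip(vec, col)) for col in zip(*weight)]


def patchify(image, patch):
    """Split an H x W grayscale image into flattened patch x patch blocks."""
    h, w = len(image), len(image[0])
    return [
        [image[r + dr][c + dc] for dr in range(patch) for dc in range(patch)]
        for r in range(0, h, patch)
        for c in range(0, w, patch)
    ]


def build_sequence(image, text_ids, task_id, patch=2, vocab=16, tasks=4, seed=0):
    """Embed task, text, and image inputs into one sequence of DIM-wide tokens."""
    rng = random.Random(seed)
    pixel_proj = linear(patch * patch, DIM, rng)  # raw pixels -> shared space
    text_table = linear(vocab, DIM, rng)          # one row per text token id
    task_table = linear(tasks, DIM, rng)          # one row per task condition

    task_tokens = [task_table[task_id]]
    text_tokens = [text_table[i] for i in text_ids]
    image_tokens = [project(p, pixel_proj) for p in patchify(image, patch)]

    # A single transformer would then attend over this one sequence,
    # rather than over the outputs of separate per-modality encoders.
    return task_tokens + text_tokens + image_tokens


image = [[0.5] * 4 for _ in range(4)]  # 4 x 4 image -> four 2 x 2 patches
sequence = build_sequence(image, text_ids=[1, 2, 3], task_id=0)
print(len(sequence), len(sequence[0]))
```

In this toy setup the sequence holds one task token, three text tokens, and four image tokens, all of the same width, which is the property that lets one model process them jointly.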
Mei Tao, Founder and CEO of HiDream.ai, said at the Open Day that the company chose the native omni-modal path because of a long-term judgment about how visual generation and the physical world will converge: "Currently, many 'multi-modal large models' are essentially still 'single-modal splicing.' Native multi-modality, however, engraves the 'rules of the world' into the model from the very beginning. It knows physical laws, spatial relationships, and causal logic, so it can truly understand the world, reason about the world, and reconstruct the world, rather than just 'generating content.'" Mei Tao believes that native omni-modality is the necessary path to achieving AGI.
HiDream.ai was founded in March 2023 by Dr. Mei Tao, a foreign fellow of the Canadian Academy of Engineering and former Vice President of JD.com. More than 90% of its core technical team hold doctoral or master's degrees. The company has built a "1+1+3" business architecture: one series of foundation models (HiDream), one enterprise service platform (HiHarness), and three agent applications. The applications cover commercial marketing (HiBurst, a TikTok Official Top 5 service provider), film and television creation ("Frame Praise," which has produced more than 5,000 cumulative minutes of short comics and dramas), and social media creation (vivago, with more than 40 million users).
At the Open Day event, HiDream.ai signed strategic cooperation agreements with Shanghai Film New Vision Fund, BlueFocus, Jetsen Century, and Beier Health to deploy native omni-modal foundation models in fields such as film and television, marketing, and healthcare. The company presents HiDream-O1-Image-Pro's scale of more than 200 billion parameters as validation that the native omni-modal architecture scales well. It is now accelerating work on unified modeling across modalities including image, video, text, and audio.
This article is compiled by Wedoany. All AI citations must indicate the source as "Wedoany". If there is any infringement or other issues, please notify us promptly, and we will modify or delete it accordingly. Email: news@wedoany.com