US LF AI & Data Foundation Establishes Working Group to Develop AI-Native Document Standards

2026-06-10 13:40

en.Wedoany.com Reported - The LF AI & Data Foundation, part of the Linux Foundation, has established a working group to develop the DocLang specification, which aims to provide an interoperable document processing standard for workflows that span multiple AI systems and agents.

Founded by premier members IBM, Nvidia, and Red Hat, the working group's mission is to create an open, universal, AI-native document format designed to improve how enterprises prepare, exchange, and manage document data for AI systems. Contributors ABBYY and HumanSignal will also participate in its development.

According to the announcement, enterprises currently operate with various fragmented document formats, including PDF, JPEG, and other file types primarily built for human reading rather than AI interpretation. As organizations increasingly rely on generative AI and agent systems, this disconnect can introduce complexity, increase costs, and reduce reliability when extracting meaning from business documents.

Mark Collier, Executive Director of LF AI & Data, stated that the goal of the DocLang specification working group is to develop a vendor-neutral, interoperable standard that helps organizations prepare document data for AI more reliably, transparently, and at scale. An informational document released by the working group states that PDF was built for printing, DOCX for editors, and DocLang for the next era: a machine-readable document standard that models can truly trust. According to the document, DocLang defines a structured, machine-readable format for any type of document, much as JSON does for data and HTML does for the web, and it can be implemented by any tool and used by any pipeline.

Independent technology analyst Carmi Levy noted that existing document standards have enabled global stakeholders to collaborate confidently for decades, but as AI reshapes how work gets done, these standards urgently need updating. He believes DocLang represents the earliest and greatest hope for establishing a foundational baseline in document standards, one that promises to make workflows smarter, more efficient, and less risky than they are today. In his view, an open-source, vendor-neutral approach ensures that collective interests take precedence over the needs of any single vendor, and earlier standardization efforts around networks, documents, web pages, and clouds drove the free-flowing digital landscape that defines modern life.

Jason Andersen, Principal Analyst at Moor Insights & Strategy, believes that when standards like DocLang are applied to content ingestion, users uploading documents to agents could run a skill that preprocesses each document into the DocLang format, thereby saving tokens. He said such standards need to preserve what humans can already do with documents and remain usable without coding knowledge. Once preprocessing attaches metadata or code to documents, governance may become easier to achieve, provided that the metadata or code is properly maintained. This aspect is not yet reflected in the specification, and he encourages the team to consider it.

Yaz Palanichamy, Senior Research Analyst at Info-Tech Research Group, stated that from a user productivity perspective, the concept of AI-native documents helps organizations prepare document data for AI-embedded systems. However, he emphasized that organizational compliance controls and overall governance models are absolutely necessary, and that it is also essential to understand whether a company is technically ready to standardize its internal document management practices. Without internal feasibility studies or advance preparation, change management cannot be properly executed, which could hinder an organization's ability to mature or scale its AI-embedded document processing capabilities. From a governance perspective, several organizational control measures still need appropriate review to ensure that this new collaborative standard and toolkit is expanded responsibly and securely.

This article is compiled by Wedoany. All AI citations must indicate the source as "Wedoany". If there is any infringement or other issues, please notify us promptly, and we will modify or delete it accordingly. Email: news@wedoany.com
