The Open Source Initiative (OSI), a non-profit organisation that defines open source software standards, has released the Open Source Artificial Intelligence (AI) Definition 1.0 (OSAID) to clarify what qualifies as open source AI. Before this definition, developers relied on unclear and diverging criteria to claim that their AI systems were open source.
While the traditional definition of open source has worked well for software and source code, it is not directly applicable to an AI model. The classic Open Source Definition states that “the source code must be the preferred form in which a programmer would modify the program”. However, this may not apply to AI models, because there is no consensus on the preferred form for modifying an AI system. In addition, AI and machine learning systems are more than just software programs: they also include data, configuration options, documentation and artifacts such as weights and biases.
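OSAID defines an AI model as comprising the model architecture, inference code and model parameters (such as weights, the learned parameters that produce outputs from given inputs). For an AI system to qualify as open source under OSAID, all of its components, including the model architecture, inference code, model parameters and artifacts, must be open source. Such an AI system must allow users to use the system for any purpose without having to ask for permission, study how the system works and inspect its components, modify the system for any purpose (including to change its output), and share the system for others to use, with or without modifications, for any purpose.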
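Moreover, users must have access to the “preferred form to make modifications” to the system. That is, the AI system must include each of the following components, provided under OSI-approved terms: data information (sufficiently detailed information about the data used to train the system, so that a skilled person can build a substantially equivalent system), code (the complete source code used to train and run the system) and parameters (the model parameters, such as weights or other configuration settings).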
OSAID is expected to give AI developers, deployers and end users greater autonomy and transparency, together with frictionless reuse and collaborative improvement of AI systems. It also brings the documented benefits of open source more generally, such as improved safety and security, accelerated innovation, more flexible customisation and lower costs.
On the other hand, OSAID can pose a significant challenge for AI companies, as it requires them to disclose detailed information about their training data. Given that the use of copyrighted data in AI model training and in AI system outputs remains a hotly contested topic in many jurisdictions, most AI companies currently keep their training data under wraps and disclose only the “weights” or “parameters”.
The release of OSAID marks a crucial step in defining what truly constitutes open source AI. By setting clear standards for transparency, accessibility and reuse, OSAID aims to prevent mislabelling and ensure that AI systems align with open source principles. While its requirements pose challenges, especially regarding data disclosure, they also pave the way for greater collaboration, innovation and trust in AI development. As the debate over open source AI continues, AI companies should monitor developments in this area closely.
OrionW regularly advises clients on technology and artificial intelligence (AI) related matters. For more information about Singapore or regional AI regulations, or if you have questions about this article, please contact us at info@orionw.com.
Disclaimer: This article is for general information only and does not constitute legal advice.