AI model training relies heavily on large datasets to teach machine learning algorithms to identify patterns, generate predictions, or create new content. This process often involves data scraping – the automated collection of information from websites and digital platforms – to assemble the comprehensive datasets necessary for effective AI model development.
As AI model training becomes more prevalent, questions surrounding the lawful use of third-party content, particularly copyrighted materials, have gained critical importance. In Thailand, AI model developers must navigate a legal framework primarily governed by the Copyright Act, which creates distinct challenges due to the lack of fair-use provisions.
This article explores the copyright-related obstacles and legal uncertainties that AI model developers face under Thailand’s existing copyright legislation, and provides guidance for managing these complex legal requirements.
Copyright risks in data scraping and AI model training
Thailand’s Copyright Act lacks a broad, open-ended exception comparable to the fair-use doctrine in the US or the fair-dealing provisions of some other jurisdictions, which has significant implications for AI model developers. These include the following:
Absence of protection for AI model training – using copyrighted material in AI model training is presumed infringing unless covered by specific statutory exceptions or authorised by rights holders. No general legal framework exists for incorporating copyrighted works into AI model training without explicit permission.
Heightened licensing requirements – developers must locate and obtain licences for each copyrighted work in their training datasets. Given the volume and variety of data needed for robust AI models, this requirement can be both challenging and financially burdensome.
Legal uncertainty and exposure to litigation – the absence of clear statutory direction or established case law creates a legal grey zone for developers. No precedent exists to clarify whether certain uses of copyrighted content for AI model training might be acceptable or considered too insignificant to constitute infringement. This ambiguity leaves developers vulnerable to copyright infringement allegations, potentially resulting in injunctions, monetary damages, or criminal liability.
Innovation deterrent – the liability risk and compliance complexity may discourage domestic and international entities from pursuing AI model development or deployment in Thailand. This could inhibit innovation and constrain growth in Thailand’s AI sector.
International collaboration challenges – AI model development frequently involves cross-border data sharing. Models trained using datasets compliant with foreign copyright laws may still violate Thai law if deployed or commercialised within Thailand, creating obstacles for international partnerships and technology transfers.
Data scraping and copyright infringement
Data scraping may constitute copyright infringement when it involves reproducing or extracting protected content without authorisation. In Thailand, the absence of fair-use exceptions amplifies this risk. Even copying publicly available website content to create training datasets may be actionable under Thai copyright law, regardless of whether the use is commercial or non-commercial.
Public accessibility of content does not automatically render its use lawful under Thailand’s copyright framework. The outputs generated by AI models may also raise copyright concerns, particularly if they reproduce or closely resemble protected content from the training data.
Jurisdictional issues and AI model deployment
Thai copyright law may extend to activities conducted outside Thailand when the resulting AI models are subsequently deployed, commercialised, or made accessible within the country. Additionally, AI models trained on datasets that comply with foreign copyright laws may still infringe Thai law if the training data contained copyrighted works not authorised for use in Thailand. This creates additional due diligence obligations for international collaborations and technology transfers.
Strategic recommendations for AI model developers
AI model developers operating in or targeting Thailand should consider implementing the following measures to reduce copyright risks:
Adopt comprehensive licensing procedures – identify and secure appropriate licences for relevant copyrighted materials included in training datasets;
Evaluate data scraping practices – ensure that data scraping does not involve unauthorised reproduction or extraction of protected content and that all activities comply with website terms of service and copyright notices;
Introduce detailed documentation systems – maintain comprehensive records of licensing activities and compliance measures;
Track legal developments – stay current with changes in Thai copyright law and emerging guidance related to AI model development and data usage; and
Conduct regular risk evaluation – continuously assess legal risks associated with new datasets, AI model deployments, and international partnerships.
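As an illustration of the documentation and risk-evaluation measures above, the following is a minimal sketch of a provenance log for training-data sources. Every name in it (SourceRecord, licence_ref, tos_reviewed, needs_review, split_manifest) and the 365-day review window are hypothetical choices made for this example, not requirements drawn from Thai law. The code only flags records for human legal review; it does not determine whether any use is lawful.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class SourceRecord:
    """One entry in a hypothetical training-data provenance log."""
    url: str
    licence_ref: Optional[str]    # internal licence/contract ID; None if nothing is on file
    tos_reviewed: bool            # whether the site's terms of service were reviewed
    reviewed_on: Optional[date]   # date of the most recent review, if any


def needs_review(record: SourceRecord, today: date, max_age_days: int = 365) -> list[str]:
    """Return the reasons a record should be escalated for legal review.

    An empty list means no flag was raised by these (assumed) checks.
    """
    reasons = []
    if record.licence_ref is None:
        reasons.append("no licence on file")
    if not record.tos_reviewed:
        reasons.append("terms of service not reviewed")
    if record.reviewed_on is None:
        reasons.append("never reviewed")
    elif (today - record.reviewed_on).days > max_age_days:
        reasons.append("review is stale")
    return reasons


def split_manifest(records: list[SourceRecord], today: date):
    """Split records into URLs with no flags and a map of URL -> reasons for review."""
    cleared: list[str] = []
    flagged: dict[str, list[str]] = {}
    for record in records:
        reasons = needs_review(record, today)
        if reasons:
            flagged[record.url] = reasons
        else:
            cleared.append(record.url)
    return cleared, flagged
```

Calling split_manifest(records, date.today()) before each training run, and archiving the result, would give a dated record of which sources were held back and why. The thresholds and fields would need to be set with legal counsel.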
Future developments
Thailand is considering regulatory reforms and policy measures to address challenges posed by AI and emerging technologies. These discussions cover possible AI-specific regulations, copyright exceptions for technological applications, and modifications to data protection legislation.
The scope and timeline of these developments remain uncertain, making it important for stakeholders to stay informed and engage in public consultation processes as they become available.
Key takeaways on AI model training in Thailand
AI model developers in Thailand confront substantial legal obstacles due to the absence of fair-use exceptions in the Copyright Act. This situation increases licensing burdens, elevates litigation risks, and generates uncertainty that may impede innovation and international collaboration. Proactive copyright compliance and monitoring of legal developments are essential for managing these challenges and supporting responsible AI model development.
While jurisdictions such as Singapore and Japan have implemented text and data mining exceptions for AI model training, Thailand has not yet adopted comparable provisions. The industry continues to anticipate legislative reforms or judicial decisions that may offer more definitive guidance for AI model development activities.