AI and Copyright in Japan — What International Businesses Need to Know [Intro to ML #11]

2026-04-25
AI, MBA & Business Logic
AI copyright, Article 30-4, generative AI, intellectual property, Japan, MBA

In my previous post (Intro to ML #10), I wrote about Demo Day — the final session of my MBA data science course — and reflected on what I’d learned across the six sessions.

This is the wrap-up of the series, covering the other big topic from that final session: AI and intellectual property in Japan.

If you do business in Japan or with Japanese partners and you’re using generative AI in your operations, the legal landscape here matters. Japan’s copyright law treats machine learning differently from many Western jurisdictions — significantly more permissively in some respects, with its own quirks in others. This post walks through what international businesses should know.

1 The Four Components of an AI System
2 ① Training Data — Why Japan Is Called a “Machine Learning Paradise”
3 ② Programs — Mostly Open Source, with Standard OSS Hygiene
4 ③ Trained Models — A Legal Gray Zone
5 ④ AI-Generated Outputs — The “Creative Contribution” Test
6 What About Patents?
7 A Practical Checklist for Operating in Japan
8 End of the Series
9 Going Deeper
- 9.1 ① For thinking about how AI systems can fail and what we mean by “alignment”
- 9.2 ② For a sober look at where AI hype outruns reality

The Four Components of an AI System

The most useful frame I picked up from the course was to stop thinking of “AI” as one thing. Instead, decompose it into four components, each with its own legal treatment:

Training data (the material used to train a model)
Programs (the software that performs training and inference)
Trained models (the resulting weights and parameters)
AI-generated outputs (text, images, audio, etc. produced by the model)

Each is treated differently under Japanese law. Let’s go through them.

① Training Data — Why Japan Is Called a “Machine Learning Paradise”

This is the most distinctive feature of Japanese copyright law for AI, and it surprised me when I first encountered it.

Japan’s Copyright Act Article 30-4, which came into effect in 2019, permits the use of copyrighted works without the rightsholder’s permission when the use is “not for the purpose of enjoying the thoughts or sentiments expressed in the work.” In plain language: using copyrighted images, text, audio, or video as training data for machine learning is generally permitted, regardless of whether the rightsholder consented.

This is significantly more permissive than the EU’s text-and-data-mining exceptions (which allow opt-out by rightsholders) and avoids the unsettled fair-use disputes that have dominated US AI litigation. As a result, Japan has become one of the friendliest jurisdictions in the world for training models on copyrighted material.

There is one important caveat. Article 30-4 includes a proviso: the exception does not apply when the use would “unreasonably harm the interests of the copyright holder.” Mass-scraping a paid database for training data, for example, would likely trigger that proviso and so fall outside the exception. The principle is: “freely accessible” is not the same as “free to use commercially without limit.”

For international businesses, this is a real strategic factor. Companies that need to train models on broad data corpora — particularly copyrighted material like images, news, or books — often find that Japan offers a more workable legal environment for the training step than their home jurisdiction.
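
To make the proviso concrete, here is a minimal sketch of an intake screen you might run on each candidate data source before collection. The `DataSource` fields and the `review_flags` function are names I made up for illustration, and the three checks are my own simplification of the points above. Treat it as a triage aid that decides what goes to counsel, not as a legal determination.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    # All fields are hypothetical; adapt them to your own intake process.
    name: str
    paid_access: bool             # sold or paywalled, e.g. a commercial database
    terms_reviewed: bool          # someone has actually read the terms of use
    used_for_training_only: bool  # not shown to people to read, view, or listen to


def review_flags(source):
    """Return the reasons this source should go to legal review (empty if none)."""
    flags = []
    if not source.terms_reviewed:
        flags.append("terms of use not yet reviewed")
    if source.paid_access:
        flags.append("paid database: possible 'unreasonably harm' proviso issue")
    if not source.used_for_training_only:
        flags.append("use may involve enjoying the work: check Article 30-4's purpose test")
    return flags


# Example: a paywalled archive whose terms nobody has read yet raises two flags.
archive = DataSource("news archive", paid_access=True,
                     terms_reviewed=False, used_for_training_only=True)
for reason in review_flags(archive):
    print(f"{archive.name}: {reason}")
```

An empty list only means none of these three simple checks fired. It does not mean the use is cleared.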

② Programs — Mostly Open Source, with Standard OSS Hygiene

The core ML software stack, including TensorFlow (Google), PyTorch (Meta), and Hugging Face Transformers, is overwhelmingly open source. Commercial use is generally permitted at no charge, subject to each license’s terms.

The legal risks here are not Japan-specific. They’re the standard OSS compliance risks: respecting license terms (Apache 2.0, BSD, MIT, etc.), maintaining the required attribution, and not accidentally pulling in copyleft (GPL-family) code that would impose unwanted obligations on your own product. If you operate in Japan, you’ll find that local courts and regulators apply OSS licenses essentially as written.
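
As a first pass at the copyleft check, here is a minimal sketch that reads the license field of every package installed in a Python environment and flags anything that looks GPL-family or has no declared license. It uses the standard library’s `importlib.metadata`. The function names and the marker list are my own assumptions, and the string matching is deliberately crude: some packages declare their license only in classifiers or a newer metadata field, so a clean result is a starting point for review rather than a clearance.

```python
from importlib import metadata

# Illustrative and incomplete: "GPL" also matches "LGPL" and "AGPL".
COPYLEFT_MARKERS = ("GPL",)


def classify_license(license_text):
    """Return 'copyleft', 'unknown', or 'other' for a declared license string."""
    if not license_text or license_text.strip().upper() == "UNKNOWN":
        return "unknown"
    upper = license_text.upper()
    if any(marker in upper for marker in COPYLEFT_MARKERS):
        return "copyleft"
    return "other"


def audit_installed_packages():
    """Map each installed distribution name to its license classification."""
    report = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name", "unknown")
        report[name] = classify_license(dist.metadata.get("License", ""))
    return report


def needs_review(report):
    """Return the sorted package names that a human should look at."""
    return sorted(name for name, verdict in report.items() if verdict != "other")
```

Calling `needs_review(audit_installed_packages())` gives the list to hand to whoever owns license compliance.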

③ Trained Models — A Legal Gray Zone

This is the murkiest area. A “trained model” is the combination of program code and the weights produced by training. Under Japanese law:

Copyright protection? Possibly, as a “computer program” under copyright law — but a serious view holds that weight matrices lack the creativity required for copyright protection. Case law is thin.
Patent protection? Algorithms and mathematical functions on their own are not patentable inventions in Japan. Patents are available only when “software-based information processing is concretely realized using hardware resources” (the patent office’s standard formulation).
Derivative and distilled models: If someone fine-tunes your model, or queries it to generate input/output pairs and trains a new model on them (distillation), proving that the new model “came from yours” is technically very difficult.

The most reliable practical protection in Japan is to treat trained models as trade secrets under the Unfair Competition Prevention Act. This requires three things: the information must be (1) kept confidential, (2) of business value, and (3) not generally known. In practice this means access controls, NDAs with employees and contractors, and clear internal classification.

For international companies licensing models into Japan or developing them locally, the trade-secret route is usually the right framing for legal protection — not copyright or patents.

④ AI-Generated Outputs — The “Creative Contribution” Test

“Who owns the text ChatGPT wrote?” “Is the Midjourney image copyrightable?” These are the questions that come up most often in business contexts. The Japanese position, as it currently stands, can be summarized in three buckets:

Human creation: The human creator owns the copyright. Standard.
AI used as a tool by a human creator: If the human had creative intent and made a creative contribution, the resulting work is copyrightable, and the copyright belongs to the human.
Output from minimal prompting: An image generated from a one-line prompt like “draw a cat” is generally treated as a non-copyrightable AI output.

The hard question is what counts as “creative contribution.” Iterating through prompts? Generating dozens of candidates and selecting one? Painting over an output with a stylus? There’s no firm answer yet: case law in Japan is still developing, and the Agency for Cultural Affairs continues to publish guidance.

The pragmatic guidance I’d give: preserve a record of your creative process. Save your prompt history, document your selection decisions, and keep traces of any human modification. If a dispute or audit arises later, that record is your strongest argument for “this is my creative work.” Without it, you may have a much harder time asserting copyright on AI-assisted output.
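
One way to keep such a record is an append-only log in which each entry carries a hash of the previous one, so later edits or reordering become detectable. The sketch below is my own illustration, not something from the course: the function names and the step labels ("prompt", "selection", "human_edit") are hypothetical, and a hash chain only shows internal consistency. It does not by itself prove when the entries were written.

```python
import hashlib
import json
from datetime import datetime, timezone


def _digest(body):
    """Hash an entry's fields in a stable key order."""
    payload = json.dumps(body, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_record(log, step, detail):
    """Append one creative-process entry, chained to the previous entry's hash."""
    body = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "step": step,      # e.g. "prompt", "selection", "human_edit"
        "detail": detail,  # the prompt text, the reason for a choice, etc.
        "prev_hash": log[-1]["hash"] if log else "",
    }
    log.append({**body, "hash": _digest(body)})
    return log


def verify_chain(log):
    """Return True if no entry has been altered, removed, or reordered."""
    prev_hash = ""
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev_hash"] != prev_hash or _digest(body) != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True


# Example: log a prompt, a selection decision, and a manual edit.
history = []
append_record(history, "prompt", "a cat on a tiled roof, woodblock-print style")
append_record(history, "selection", "kept candidate 3 of 12 for its composition")
append_record(history, "human_edit", "repainted the background by hand")
print(verify_chain(history))
```

Writing each entry to a file with `json.dumps` as it is created, one per line, keeps the log easy to store alongside the finished work.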

What About Patents?

Patenting AI inventions in Japan follows the standard Japanese patent framework, with a few AI-specific considerations:

The invention must be “a creation of technical ideas utilizing the laws of nature” — Japan’s standard patentability requirement.
Algorithms alone aren’t patentable. The application must show that the software is concretely realized using hardware resources.
Standard requirements apply: novelty and inventive step, assessed under Japan’s first-to-file system.
Simply applying a known neural network architecture to a new domain is generally treated as lacking inventive step.

The practical implication: pure AI techniques are hard to patent. System-level claims tying the AI into a specific business process or hardware configuration are usually the workable angle.

A Practical Checklist for Operating in Japan

Before collecting training data: Confirm the data source’s terms and verify that your use fits within Article 30-4’s exception (and doesn’t trigger the “unreasonably harm” proviso).
Before adopting OSS: Read the license, comply with attribution requirements, and watch for copyleft contamination.
When you develop a proprietary model: Manage it as a trade secret. Don’t rely on copyright or patents alone.
When delivering AI-generated work product: Spell out in the contract whether AI was used, in what scope, and who owns the result.
When clients ask whether AI was used: Be ready to walk through your prompt history and the human creative steps that followed.

This area is moving fast. Japan’s Agency for Cultural Affairs, Ministry of Economy, Trade and Industry, and Fair Trade Commission have all been issuing guidance on generative AI, and case law is still being built. Treat this post as the state of play at the time of writing, and check current sources before making important decisions.

End of the Series

This is the eleventh and final post in the Intro to ML series. From the Titanic dataset on day one through regression, classification, evaluation, prediction bias, generative AI mechanics, prompting, Demo Day, and now copyright, it’s been a longer arc than I expected when I started.

If there’s one thing I’d highlight from finishing the course: AI is no longer a specialist skill. It’s becoming a baseline literacy for anyone in business — engineer or not. Understanding the mechanism in broad strokes, and being able to use it in your actual work, will quietly compound into a real difference over the next few years.

If this series helped someone get started, that’s enough.

Going Deeper

① For thinking about how AI systems can fail and what we mean by “alignment”

Brian Christian’s The Alignment Problem is the most readable account I’ve found of why getting AI systems to do what we actually want is harder than it looks. It’s not a Japan-specific book, but the core ideas — how training data shapes outputs, how optimization can drift from intent — are essential context for anyone making decisions about AI in any jurisdiction.

The Alignment Problem: Machine Learning and Human Values

② For a sober look at where AI hype outruns reality

Arvind Narayanan and Sayash Kapoor’s AI Snake Oil distinguishes the parts of AI that genuinely work from the parts that are marketing. For business decision-makers evaluating AI vendors and projects — particularly the predictive AI claims that come up often in hiring, risk scoring, and fraud detection — this book is a calibration tool.