Today the U.S. Copyright Office released a notice of inquiry concerning AI and copyright. The press release states:
Today, the U.S. Copyright Office issued a notice
of inquiry (NOI) in the Federal Register on copyright
and artificial intelligence (AI). The Office is undertaking a study of the
copyright law and policy issues raised by generative AI and is assessing
whether legislative or regulatory steps are warranted. The Office will use the
record it assembles to advise Congress; inform its regulatory work; and offer
information and resources to the public, courts, and other government entities
considering these issues.
The NOI seeks factual information and views on a number of
copyright issues raised by recent advances in generative AI. These issues
include the use of copyrighted works to train AI models, the appropriate levels
of transparency and disclosure with respect to the use of copyrighted works,
the legal status of AI-generated outputs, and the appropriate treatment of
AI-generated outputs that mimic personal attributes of human artists.
The NOI is an integral next step for the Office’s AI initiative, which was launched in
early 2023. So far this year, the Office has held four public listening
sessions and two webinars. This NOI builds on the feedback and questions the
Office has received so far and seeks public input from the broadest audience to
date in the initiative.
“We launched this initiative at the beginning of the year to
focus on the increasingly complex issues raised by generative AI. This NOI and
the public comments we will receive represent a critical next step,” said Shira
Perlmutter, Register of Copyrights and Director of the U.S. Copyright Office.
“We look forward to continuing to examine these issues of vital importance to
the evolution of technology and the future of human creativity.”
Initial written comments are due by 11:59 p.m. eastern time
on Wednesday, October 18, 2023. Reply comments are due by 11:59 p.m. eastern
time on Wednesday, November 15, 2023. Instructions for submitting comments are
available on the Office’s website.
Commenters may choose which and how many questions to respond to in the NOI.
The NOI includes the following
questions:
General Questions
The Office has several general questions about generative
AI in addition to the specific topics listed below. Commenters are encouraged
to raise any positions or views that are not elicited by the more detailed
questions further below.
1. As described above, generative AI systems have the ability
to produce material that would be copyrightable if it were created by a human
author. What are your views on the potential benefits and risks of this
technology? How is the use of this technology currently affecting or likely to
affect creators, copyright owners, technology developers, researchers, and the
public?
2. Does the increasing use or distribution of AI-generated
material raise any unique issues for your sector or industry as compared to
other copyright stakeholders?
3. Please identify any papers or studies that you believe are
relevant to this Notice. These may address, for example, the economic effects
of generative AI on the creative industries or how different licensing regimes
do or could operate to remunerate copyright owners and/or creators for the use
of their works in training AI models. The Office requests that commenters
provide a hyperlink to the identified papers.
4. Are there any statutory or regulatory approaches that have
been adopted or are under consideration in other countries that relate to
copyright and AI that should be considered or avoided in the United States? How
important a factor is international consistency in this area across borders?
5. Is new legislation warranted to address copyright or
related issues with generative AI? If so, what should it entail? Specific
proposals and legislative text are not necessary, but the Office welcomes any
proposals or text for review.
Training
If your comment applies only to a specific subset of AI
technologies, please make that clear.
6. What kinds of copyright-protected training materials are
used to train AI models, and how are those materials collected and curated?
6.1. How or where do developers of AI models acquire the
materials or datasets that their models are trained on? To what extent is
training material first collected by third-party entities (such as academic
researchers or private companies)?
6.2. To what extent are copyrighted works licensed from
copyright owners for use as training materials? To your knowledge, what
licensing models are currently being offered and used?
6.3. To what extent is non-copyrighted material (such as
public domain works) used for AI training? Alternatively, to what extent is
training material created or commissioned by developers of AI models?
6.4. Are some or all training materials retained by
developers of AI models after training is complete, and for what purpose(s)?
Please describe any relevant storage and retention practices.
7. To the extent that it informs your views, please briefly
describe your personal knowledge of the process by which AI models are trained.
The Office is particularly interested in:
7.1. How are training materials used and/or reproduced when
training an AI model? Please include your understanding of the nature and
duration of any reproduction of works that occur during the training process,
as well as your views on the extent to which these activities implicate the exclusive
rights of copyright owners.
7.2. How are inferences gained from the training process
stored or represented within an AI model?
7.3. Is it possible for an AI model to “unlearn” inferences
it gained from training on a particular piece of training material? If so, is
it economically feasible? In addition to retraining a model, are there other
ways to “unlearn” inferences from training?
7.4. Absent access to the underlying dataset, is it possible
to identify whether an AI model was trained on a particular piece of training
material?
8. Under what circumstances would the unauthorized use of
copyrighted works to train AI models constitute fair use? Please discuss any
case law you believe relevant to this question.
8.1. In light of the Supreme Court's recent decisions in Google
v. Oracle America and Andy Warhol Foundation v. Goldsmith,
how should the “purpose and character” of the use of copyrighted works to
train an AI model be evaluated? What is the relevant use to be analyzed? Do
different stages of training, such as pre-training and fine-tuning, raise
different considerations under the first fair use factor?
8.2. How should the analysis apply to entities that collect
and distribute copyrighted material for training but may not themselves engage
in the training?
8.3. The use of copyrighted materials in a training dataset
or to train generative AI models may be done for noncommercial or research
purposes. How should the fair use analysis apply if AI models or datasets are
later adapted for use of a commercial nature? Does it make a difference if
funding for these noncommercial or research uses is provided by for-profit
developers of AI systems?
8.4. What quantity of training materials do developers of
generative AI models use for training? Does the volume of material used to
train an AI model affect the fair use analysis? If so, how?
8.5. Under the fourth factor of the fair use analysis, how
should the effect on the potential market for or value of a copyrighted work
used to train an AI model be measured? Should the inquiry be
whether the outputs of the AI system incorporating the model compete with a
particular copyrighted work, the body of works of the same author, or the
market for that general class of works?
9. Should copyright owners have to affirmatively consent (opt
in) to the use of their works for training materials, or should they be
provided with the means to object (opt out)?
9.1. Should consent of the copyright owner be required for
all uses of copyrighted works to train AI models or only commercial uses?
9.2. If an “opt out” approach were adopted, how would that
process work for a copyright owner who objected to the use of their works for
training? Are there technical tools that might facilitate this process, such as
a technical flag or metadata indicating that an automated service should not
collect and store a work for AI training uses?
9.3. What legal, technical, or practical obstacles are there
to establishing or using such a process? Given the volume of works used in
training, is it feasible to get consent in advance from copyright owners?
9.4. If an objection is not honored, what remedies should be
available? Are existing remedies for infringement appropriate or should there
be a separate cause of action?
9.5. In cases where the human creator does not own the
copyright—for example, because they have assigned it or because the work was
made for hire—should they have a right to object to an AI model being trained
on their work? If so, how would such a system work?
10. If copyright owners' consent is required to train
generative AI models, how can or should licenses be obtained?
10.1. Is direct voluntary licensing feasible in some or all
creative sectors?
10.2. Is a voluntary collective licensing scheme a feasible
or desirable approach? Are there existing collective management
organizations that are well-suited to provide those licenses, and are there
legal or other impediments that would prevent those organizations from
performing this role? Should Congress consider statutory or other changes, such
as an antitrust exception, to facilitate negotiation of collective licenses?
10.3. Should Congress consider establishing a compulsory
licensing regime? If so, what should such a regime look like? What
activities should the license cover, what works would be subject to the
license, and would copyright owners have the ability to opt out? How should
royalty rates and terms be set, allocated, reported and distributed?
10.4. Is an extended collective licensing scheme a
feasible or desirable approach?
10.5. Should licensing regimes vary based on the type of work
at issue?
11. What legal, technical or practical issues might there be
with respect to obtaining appropriate licenses for training? Who, if anyone,
should be responsible for securing them (for example when the curator of a
training dataset, the developer who trains an AI model, and the company employing
that model in an AI system are different entities and may have different
commercial or noncommercial roles)?
12. Is it possible or feasible to identify the degree to
which a particular work contributes to a particular output from a generative AI
system? Please explain.
13. What would be the economic impacts of a licensing
requirement on the development and adoption of generative AI systems?
14. Please describe any other factors you believe are
relevant with respect to potential copyright liability for training AI models.
Transparency & Recordkeeping
15. In order to allow copyright owners to determine whether
their works have been used, should developers of AI models be required to
collect, retain, and disclose records regarding the materials used to train
their models? Should creators of training datasets have a similar obligation?
15.1. What level of specificity should be required?
15.2. To whom should disclosures be made?
15.3. What obligations, if any, should be placed on
developers of AI systems that incorporate models from third parties?
15.4. What would be the cost or other impact of such a
recordkeeping system for developers of AI models or systems, creators,
consumers, or other relevant parties?
16. What obligations, if any, should there be to notify
copyright owners that their works have been used to train an AI model?
17. Outside of copyright law, are there existing U.S. laws
that could require developers of AI models or systems to retain or disclose
records about the materials they used for training?
Generative AI Outputs
If your comment applies only to a particular subset of
generative AI technologies, please make that clear.
Copyrightability
18. Under copyright law, are there circumstances when a human
using a generative AI system should be considered the “author” of material
produced by the system? If so, what factors are relevant to that determination?
For example, is selecting what material an AI model is trained on and/or
providing an iterative series of text commands or prompts sufficient to claim
authorship of the resulting output?
19. Are any revisions to the Copyright Act necessary to
clarify the human authorship requirement or to provide additional standards to
determine when content including AI-generated material is subject to copyright
protection?
20. Is legal protection for AI-generated material desirable
as a policy matter? Is legal protection for AI-generated material necessary to
encourage development of generative AI technologies and systems? Does existing
copyright protection for computer code that operates a generative AI system
provide sufficient incentives?
20.1. If you believe protection is desirable, should it be a
form of copyright or a separate sui generis right? If the latter, in
what respects should protection for AI-generated material differ from
copyright?
21. Does the Copyright Clause in the U.S. Constitution permit
copyright protection for AI-generated material? Would such protection “promote
the progress of science and useful arts”? If so, how?
Infringement
22. Can AI-generated outputs implicate the exclusive rights
of preexisting copyrighted works, such as the right of reproduction or the
derivative work right? If so, in what circumstances?
23. Is the substantial similarity test adequate to address
claims of infringement based on outputs from a generative AI system, or is some
other standard appropriate or necessary?
24. How can copyright owners prove the element of copying
(such as by demonstrating access to a copyrighted work) if the developer of the
AI model does not maintain or make available records of what training material
it used? Are existing civil discovery rules sufficient to address this
situation?
25. If AI-generated material is found to infringe a
copyrighted work, who should be directly or secondarily liable—the developer of
a generative AI model, the developer of the system incorporating that model,
end users of the system, or other parties?
25.1. Do “open-source” AI models raise unique considerations
with respect to infringement based on their outputs?
26. If a generative AI system is trained on copyrighted works
containing copyright management information, how does 17 U.S.C.
1202(b) apply to the treatment of that information in outputs of the
system?
27. Please describe any other issues that you believe
policymakers should consider with respect to potential copyright liability
based on AI-generated output.
Labeling or Identification
28. Should the law require AI-generated material to be
labeled or otherwise publicly identified as being generated by AI? If so, in
what context should the requirement apply and how should it work?
28.1. Who should be responsible for identifying a work as
AI-generated?
28.2. Are there technical or practical barriers to labeling
or identification requirements?
28.3. If a notification or labeling requirement is adopted,
what should be the consequences of the failure to label a particular work or
the removal of a label?
29. What tools exist or are in development to identify
AI-generated material, including by standard-setting bodies? How accurate are
these tools? What are their limitations?
Additional Questions About Issues Related to Copyright
30. What legal rights, if any, currently apply to
AI-generated material that features the name or likeness, including vocal
likeness, of a particular person?
31. Should Congress establish a new federal right, similar to
state law rights of publicity, that would apply to AI-generated material? If
so, should it preempt state laws or set a ceiling or floor for state law
protections? What should be the contours of such a right?
32. Are there or should there be protections against an AI
system generating outputs that imitate the artistic style of a human creator
(such as an AI system producing visual works “in the style of” a specific
artist)? Who should be eligible for such protection? What form should it take?
33. With respect to sound recordings, how does section 114(b)
of the Copyright Act relate to state law, such as state right of publicity
laws? Does this issue require legislative attention in the context
of generative AI?
34. Please identify any issues not mentioned above that the
Copyright Office should consider in conducting this study.
It will be very interesting to see the responses. I wonder whether AI was used to help generate the questions. I am sure someone will submit AI-generated responses to the questions. I also wonder about moral rights (see fn. 38 in the document).