I know many authors were eager to learn whether D2D conducting an AI training survey meant that such a service was imminent. At this time, Draft2Digital will not offer AI rights licensing opportunities.

Draft2Digital has always been dedicated to providing opportunities for authors. We also know that each of these opportunities needs to be vetted properly. From the beginning, we’ve structured our Terms of Service to only grant the rights we need to distribute your works and nothing more. We cannot, and will not, make choices about your works for you.

The stakes are high enough around AI licensing that we felt it was imperative to include the community as much as possible in our decision to offer these options or not.

I sincerely thank you for your honest responses. I personally read thousands of them. Please read the post below for a wealth of information on what authors are thinking.

Kris Austin
CEO of Draft2Digital

Over the last year, AI developers and data brokers have begun approaching large publishers – and even distributors such as D2D – seeking to acquire limited AI training rights on books. As we investigated the situation, we sought counsel from The Authors Guild, which has been ahead of the curve in its advocacy for authors’ training rights.

The long-form narratives of books are highly prized for teaching AI systems to mimic natural human language, especially in expressing human emotions, idioms, and narrative structure.

Traditionally published authors are beginning to have their works licensed without their input. Indies should expect the same.

On August 27, we surveyed our global community representing over 1 million books from over 300,000 indie authors and small press publishers. We sought to understand if our community wanted access to paid AI rights licensing opportunities.

Our goal was to present the survey in a neutral tone and without stating D2D’s views on these topics. We wanted to avoid influencing answers in either direction.

Consider these survey results our community’s message to all AI developers that would seek to license books for AI training purposes. This community of indie authors and publishers represents the future of publishing. There is much wisdom to be learned here.

The results were fascinating.

A Chasm of Mistrust

The survey found deep mistrust of AI in general and Big Tech AI developers in particular.

AI developers earned this mistrust.

The mistrust has multiple underlying causes, though the reasons most cited by survey respondents centered on AI developers’ reckless and willful disregard for:

  • author copyrights, as witnessed by AI developers continuing to train on unlicensed copyrighted content
  • the potential harmful impacts of AI on the livelihoods of authors, writers, artists, and other creatives
  • the negative environmental impacts of AI

At D2D, we share these same concerns.

This chasm of mistrust between indies and AI developers must be narrowed before the two sides can collaborate on paid licensing.

Until Draft2Digital sees significant reforms, especially around greater contractual protections and transparency governing use, intellectual property protections, and rights restrictions, we will not offer rights licensing opportunities.

Draft2Digital’s Stance

At Draft2Digital, we believe:

  • It’s a positive development that AI developers are seeking to pay for licenses
  • Better protections are needed before D2D or its publishers can entertain such licenses
  • AI training rights are an exclusive, valuable subsidiary right under the sole control of the author or publisher
  • The rights-holder deserves full control over decisions related to if, when, and how their books are used or licensed for AI training purposes
  • Authors and publishers should refuse AI rights licensing contracts that are opaque, or that provide inadequate protections for author concerns
  • AI developers must stop training upon books obtained without the rights-holder’s permission; otherwise, they will face continued reputational harm in the eyes of their customers and the creative community
  • LLMs previously trained upon unlicensed content, and the applications built upon them, should either negotiate retroactive licensing settlements with rights holders, or scrap their LLMs and rebuild them from scratch by training upon licensed content only

Survey Findings

Survey questions were presented as multiple choice, with a couple of questions providing additional free-form space for respondents to elaborate on the thinking behind their answers.

    Indie awareness of the new paid licensing trend for AI training rights

    The first question asked, “Prior to receiving this survey, were you aware that AI developers were seeking to pay authors and publishers for AI training rights on books?”

    57% answered they were unaware, 38% were aware, and 5% weren’t sure.

    Do authors know that AI companies might be willing to pay for training data?

    • Unaware: 57%
    • Aware: 38%
    • Unsure: 5%

    SOURCE: DRAFT2DIGITAL.COM

    Does unauthorized scraping and training constitute fair use?

    The next question stated, “Several lawsuits have been launched against AI developers from various sectors of the creator community, including one by the Authors Guild, alleging that AI developers that train upon unlicensed book content obtained through scraping are violating author and publisher copyrights. The AI companies argue their actions are fair use under copyright law.”

    When composing this question for the survey, we knew AI developers were claiming fair use while most authors and publishers considered such use outright theft. But we also knew that even copyright experts are divided on the question. It could be years before regulators or the courts decide this question. In the meantime, the unauthorized scraping and training continues.

    49% of respondents believe it’s not fair use to train AI models upon unlicensed content.

    42% answered that even if a court ruled that such a practice was fair use, they would still consider such use to be ethically questionable.

    Only 5% of respondents believed that training upon scraped content was fair use and therefore not a copyright violation.

    3% had no opinion on the fair use question.

    Do authors consider current scraping methods fair use?

    • It’s not fair use: 49%
    • Ethically questionable: 42%
    • Fair use: 5%
    • No opinion: 3%

    SOURCE: DRAFT2DIGITAL.COM

    Indie interest in AI rights licensing

    The indie publishing community currently doesn’t have the ability to consider licensing opportunities for their AI training rights.

    The next question sought to understand if indies would want the ability to consider AI rights licensing opportunities, asking simply, “Would you appreciate the ability to evaluate and join AI rights licensing opportunities?”

    56% of respondents indicated potential willingness to consider AI rights licensing opportunities, with 31% answering Yes and 25% answering Maybe. A solid 45% of authors answered No; they would not be interested in evaluating AI rights licensing opportunities.

    The answers indicate there’s a solid block of authors and publishers firmly opposed to the concept of AI rights licensing, yet there’s a slightly larger group that is curious to consider it.

    Are authors interested in the opportunity to sell their AI training rights?

    • Yes: 31%
    • Maybe: 25%
    • No: 45%

    SOURCE: DRAFT2DIGITAL.COM

    The next question sought to learn if authors and publishers wanted Draft2Digital to develop a rights licensing solution, asking, “Would you want the ability to evaluate and join AI rights licensing opportunities directly within your Draft2Digital or Smashwords dashboard?”

    58% indicated a willingness to consider it, as reflected by 36% indicating Yes and 22% answering Maybe.  Nevertheless, a full 42% of authors and publishers expressed zero desire for D2D to build such a solution.

    The result indicates that authors and publishers are slightly more amenable to the opportunity if it’s offered by Draft2Digital as opposed to an unnamed provider.

    Do authors want to evaluate AI licensing rights through D2D or Smashwords?

    • Yes: 36%
    • Maybe: 22%
    • No: 42%

    SOURCE: DRAFT2DIGITAL.COM

    Competitive vs. non-competitive

    The next set of questions sought to understand if and how indies differentiate between LLM applications that are potentially competitive to human authors versus those that are non-competitive.

    The first of three questions in this vein asked, “Does it matter to you how the LLM is used?  For example, if the licensing deal was to train an LLM allowed to support a competitive application, such as generating stories or books on-demand for readers, would it matter to you?”

    A whopping 76% of survey respondents stated they’d withhold AI training rights if the LLM was intended to support a competitive application (or any other application that the rights holder found objectionable).

    22% answered that they’d consider licensing their book for a competitive application as long as they were fairly compensated.

    Does it matter to authors how the end product LLM will be used?

    • Yes, it matters: 76%
    • Not as long as I am compensated: 22%
    • No opinion

    SOURCE: DRAFT2DIGITAL.COM

    Compensation for LLM applications

    The next couple of questions explored the competitive/non-competitive question from a compensation perspective.

    What level of compensation would the indie community expect for their AI training rights?

    This is a difficult question to ask and answer because every AI rights licensing deal will have different terms. Large LLMs would typically be expected to pay more, whereas smaller LLMs would be expected to pay less. Other factors, such as the length of the book, rights allowances and restrictions, whether the application is for internal or external use, and whether it involves story generation, could also affect notions of fair compensation.

    A near-universal concern expressed by respondents is that authors are reluctant to train the machines they fear might one day replace them.

    To further understand how authors view compensation expectations in light of the competitive/non-competitive question, we presented two scenarios, the first non-competitive and the second competitive.

    Compensation for non-competitive licenses

    The non-competitive question asked, “Imagine this fictitious example for a non-competitive application:  You’ve authored a 75,000-word novel, and an automotive company wants to license it along with thousands of other books to train their LLM’s natural language processing capabilities.  They plan to use their LLM to support a suite of internal corporate productivity applications used by 200,000 employees.  At what minimum price would you consider licensing the AI training rights to your book for this type of use?”

    49.5% of authors answered that no amount of money would ever be enough.  The takeaway here is that nearly half of respondents are saying they currently have no interest in licensing their AI training rights, even under non-competitive circumstances.

    A narrow majority of 50.5% of rights-holder respondents are open to non-competitive licensing opportunities if the price is right.  39.3% of authors would demand more than $100 for such a non-competitive application, and 14.1% would require at least $5,000 per book.  Only 11.1% said they’d consider licensing for this application if the fee was less than $100 per book.

    If the use case is non-competitive, will authors consider selling their AI training rights?

    • No amount of money will ever be enough: 49.5%
    • Open to non-competitive opportunities: 50.5%
    • Would accept less than $100 per book: 11.1%
    • Only if $100 or more per book: 39.3%
    • Only if more than $5,000 per book: 14.1%

    SOURCE: DRAFT2DIGITAL.COM

    This non-competitive question was followed by an optional free-form question in which respondents were invited to elaborate on the reasoning behind their pricing choice.

    We manually analyzed a statistically significant sample of several hundred free-form responses from the never-train authors to identify some of the top reasons for their opposition.

    We found:

    • 25% of the never-train authors said that no price is acceptable because AI companies have shown themselves to be unethical and untrustworthy.
    • 25% of authors felt that AI systems will harm creatives and harm people, even in non-competitive applications. The notion of “harm” pervaded other answers as well, likely making “harm to authors, writers, and humanity” a top-of-mind concern shared by more AI opponents than this 25% would indicate.
    • 19% indicated their opposition to AI training stems from ethical objections to AI in general, with many of these objections also relating to harm.

    Why do authors oppose AI training?

    • AI companies are unethical/untrustworthy: 25%
    • Harms creatives & people: 25%
    • Ethical objections to AI: 19%
    • Other reasons: 14%
    • I worked hard for my work and it’s mine: 10%
    • AI has no place in creative work: 8%

    SOURCE: DRAFT2DIGITAL.COM

    Compensation for competitive licenses

    The next question sought to understand compensation expectations for an LLM that would power a competitive application, in this case an imagined application that would allow customers to generate customized short stories in the style of a particular author.

    We posed the question, “Again, imagine the same automotive company example above and you have a 75,000-word novel, but this time the license is for a potentially competitive application in which the company plans to train an LLM that will power a new external-facing application for millions of its customers. It will brand the application AutoFiction. AutoFiction invites customers to generate custom short stories in the style of their favorite author featuring the car owner and their car. At what minimum price would you consider licensing the AI training rights for this potentially competitive application?”

    62.8% of authors answered that no amount of money would ever be enough, 13.3 percentage points higher than in the non-competitive question, a 26.8% relative increase in opposition when the application is seen as competitive.

    This underscores a key finding of the survey: To rights-holders, use matters.

    Of the 37.2% of authors who’d consider a competitive licensing deal, 42% (15.8% of all respondents) said they’d demand per-book licensing fees of at least $5,000.

    The percentage of authors who’d consider this fictitious competitive deal for under $100 per book dropped to 6.3%, a 43% relative drop from the 11.1% in the non-competitive example.
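    The comparisons above mix percentage-point differences with relative changes, which are easy to conflate. As a quick illustration (a minimal sketch using only the survey percentages quoted in this post, not part of the survey itself), the arithmetic works out like this:

```python
# Relative change between two survey percentages, expressed in percent.
def relative_change(old: float, new: float) -> float:
    return (new - old) / old * 100

# "No amount of money" opposition: 49.5% (non-competitive) -> 62.8% (competitive).
# A 13.3 percentage-point rise, or roughly a 26.9% relative increase.
print(round(relative_change(49.5, 62.8), 1))

# "Under $100 per book": 11.1% -> 6.3%, roughly a 43% relative drop.
print(round(relative_change(11.1, 6.3), 1))
```

    The distinction matters when reading survey shifts: a 13.3-point rise on a 49.5% base is a much larger proportional swing than the same 13.3 points would be on a larger base.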

    The numbers further indicate that survey respondents are reluctant to license their books if the applications are seen as potentially competitive.  For deals considered competitive, they expect significantly higher compensation.

    If the use case is competitive, will authors consider selling their AI training rights?

    • No amount of money will ever be enough: 62.8%
    • Open to competitive opportunities: 37.2%
    • Would accept less than $100 per book: 6.3%
    • Only if $100 or more per book: 30.9%
    • Only if more than $5,000 per book: 15.8%

    SOURCE: DRAFT2DIGITAL.COM

    Conclusions

    Big Tech and the AI Gold Rush

    In a gold-rush mentality, Big Tech AI developers and mainstream corporate end-users have committed tens of billions of dollars over the next few years to build, train, and operate LLMs. We don’t know how much of this will be invested in acquiring AI training rights for books and other texts, but if early LLM licensing deals are any indication, the opportunity could represent a new and worthwhile income stream for many authors.

    AI Training Rights a Valuable Subsidiary Right

    In recent months, there’s been growing recognition within traditional publishing that AI training rights are an important subsidiary right of the author.  Licensing deals are beginning to happen to the benefit of traditional authors.  

    Indies Polarized

    Our survey sought to understand if indies want to be involved in these discussions.  The survey findings indicate polarized interest, with the never-train sentiment dominant when the LLM application is considered potentially competitive to human authors.  

    Authors and Publishers Harbor Deep Distrust of AI Developers

    AI developers must do more to earn the publishing community’s trust. The survey results represent a scathing rebuke and condemnation of past and current AI development practices. What AI companies claim as fair use – training upon unlicensed content scraped off the Internet – is taken as deeply offensive and threatening by the creative community.  The creative community considers it theft at worst, and ethically dubious at best.  When these companies claim fair use, hide behind armies of lawyers, and persist in these practices unabashed and unabated, it only adds further insult to injury.  Authors and publishers feel violated by these persistent practices. It’s no surprise that the creative community holds Big Tech AI developers in such disdain.  

    Reform Needed

    Serious reforms are needed.  Authors and publishers don’t want to do business with companies whose ethical compasses are so clearly pointed in the wrong direction. Without serious reforms, AI developers can expect to garner ever greater anger and mistrust within the publishing community, and further reputational damage to their businesses in the eyes of their prospective customers and partners. Immediate reforms that would go a long way toward rebuilding trust would include:
    • Cease training LLMs on unlicensed content
    • Settle the lawsuits in favor of rights holders
    • Cease commercialization of LLMs that have been trained on unlicensed content unless retroactive licensing settlements are reached with affected rights holders

    A Path Forward: Radical Transparency

    Not all AI developers are engaged in lawsuits, yet they face the same headwinds of mistrust created by Big Tech’s transgressions. Our survey results hint at potential paths AI developers can follow to earn the trust of authors, publishers, and D2D. Beyond avoiding the sins of their predecessors, they can offer contracts that adhere to what D2D calls “radical transparency.” A radically transparent licensing contract should clearly articulate all allowed and prohibited uses of the book as training material. It should provide for controlling the end use of the LLM trained on the book. It should also articulate in detail how the author’s intellectual property (their books, styles, characters, worlds) will be used and protected.

    What’s Next?

    For D2D authors and publishers who were excited by the survey and the prospect of establishing new income streams, we understand you’re probably disappointed that D2D will not introduce such a service in the near term. For survey respondents opposed to all things AI, please know that we heard your concerns, and we thank you for articulating them so fully. We will only enter this market if and when we see an opportunity to protect author and publisher interests. For D2D’s potential supply chain partners, we will continue to engage with you to design systems, practices, and policies that work for the mutual benefit of rights holders and rights licensees.