Microsoft (MSFT.US) is teaming up with HarperCollins, a unit of News Corp. (NWS.US), to train AI models on vast amounts of book data.

Generated by AI Agent Market Intel

Tuesday, Nov 19, 2024 10:10 pm ET

Microsoft Corp. (MSFT.US) has reached an agreement with HarperCollins, a division of News Corp. (NWS.US), to use the publisher's non-fiction books to train Microsoft's artificial intelligence models and improve their quality and performance, according to people familiar with the matter. The collaboration is limited to using selected backlist titles for model training and does not involve creating new books. Authors can choose whether to participate.

Specifically, Microsoft hopes to incorporate HarperCollins books into its yet-to-be-announced AI models to expand its pool of high-quality text and make the models more accurate and better at providing expertise. Microsoft declined to comment. HarperCollins confirmed the agreement and said it would allow limited use of selected non-fiction backlist titles to train AI models.

At the same time, HarperCollins emphasized that the scope of the agreement is limited, that it sets clear restrictions on model outputs in order to respect author rights, and that authors can choose whether to participate.

HarperCollins said part of its role is to present authors with opportunities for their consideration while protecting the underlying value of their works and the revenue and royalties it shares with them. The publisher said the agreement, with its limited scope and clear limits on model output that respect author rights, accomplishes that.

Tech companies have been seeking more high-quality text sources to train their AI models, and Microsoft is no exception. They license data ranging from social media posts to news articles to make their programs more accurate and better at answering questions or providing expertise on specific topics.

It's worth noting that News Corp. has previously signed an agreement with OpenAI to allow it to use content from several of its publications. Microsoft has also collaborated with multiple publishers on AI projects.

Moreover, earlier this year, Google reached a $60 million deal with Reddit, allowing the search giant to use content from the platform to train its AI models.

However, some publishers have expressed dissatisfaction with AI companies' unauthorized use of content and have filed lawsuits. For example, The New York Times sued OpenAI and Microsoft, alleging copyright infringement.

Microsoft's agreement with HarperCollins marks another step in tech companies' efforts to secure high-quality text sources for training AI models. How to respect author rights while using these resources remains a challenge that publishers and tech companies will need to face together.

Editorial Disclosure & AI Transparency: Ainvest News utilizes advanced Large Language Model (LLM) technology to synthesize and analyze real-time market data. To ensure the highest standards of integrity, every article undergoes a rigorous "Human-in-the-loop" verification process. While AI assists in data processing and initial drafting, a professional Ainvest editorial member independently reviews, fact-checks, and approves all content for accuracy and compliance with Ainvest Fintech Inc.'s editorial standards. This human oversight is designed to mitigate AI hallucinations and ensure financial context.

Investment Warning: This content is provided for informational purposes only and does not constitute professional investment, legal, or financial advice. Markets involve inherent risks. Users are urged to perform independent research or consult a certified financial advisor before making any decisions. Ainvest Fintech Inc. disclaims all liability for actions taken based on this information.