Authors using a new tool to search a list of 183,000 books used to train AI are furious to find their works on the list.

  • @[email protected]
    9 months ago

    Did you write a comment on Reddit before 2015? If so, your copyrighted content was used without your permission to train today’s LLMs, so you absolutely get to feel one way or another about it.

    Any individual contribution was like spitting in the ocean, and the model weights would have treated 100 pages of Twilight fan fiction as equivalent to 100 pages from Twilight. The idea that these authors were somehow the backbone of the models is honestly one of the negative impacts of the extensive coverage these suits are getting.

    Pretty much everyone who has ever written anything indexed online is a tiny part of today’s LLMs.

    • El Barto
      9 months ago

      Thank you for your reply.

      On a completely separate note, it’s funny to think that there exists Twilight fan fiction when Twilight itself started as fan fiction work.

      Edit: I dun goofed.

      • @[email protected]
        9 months ago

        Pretty sure it’s the other way around.

        Fifty Shades of Grey started out as Twilight fanfiction before becoming its own thing.

        AFAIK Twilight was always just its own pulp fiction.

        • El Barto
          9 months ago

          Oh true! My memory was fuzzy on the details. Thanks for the correction.