• Peanut@sopuli.xyz · 9 months ago

    A couple of decent thoughts. That the real issue is more economic than technological is the point worth focusing on.

    Others just really display how little they know about both the issue and the technology.

    “That AI is conceived and enabled by brilliant, ambitious, but immature men” was a bit of a funny line, because I wonder how you could defend that statement among minds like Melanie Mitchell’s. Many of my favorites in the field are anything but “immature” in any way.

    Some complain about the Canadian standard of disregarding copyright for educational purposes. I’ve always thought that was something that shows great humanity in the face of a system fueled by greed.

    Remember when copyright only lasted a couple of decades, and virtually everything else existed in the public domain? We used to have these weird ideas, like thinking that the betterment of the general public and of educational systems was important for some reason.

    All of the complaints are extremely unspecific. Do they care about open source vs corporate? Do they even understand the basic concept of how these things work?

    Does our economic system need to be fixed? Yes. Are we going to get there by crying about the terror of the “soullessness” of machines and education? I doubt it.

  • Rocket@lemmy.ca · 9 months ago

    No more than any of us are ripping off writers when we read a book.

    If you don’t want your book to be read, maybe don’t publish it?

    • xmunk@sh.itjust.works · 9 months ago

      You know that FBI warning on DVDs about how the movie is for personal viewing and not commercial screenings? Well, these books are being used in a way grossly out of line with how common sense would dictate they be used. The books might end up being partially regurgitated by generative AI. Tweets are one thing; there’s no real concept of ownership with them. But books are a significantly different matter.

      • baconisaveg@lemmy.ca · 9 months ago

        Trained models carry different licenses based on the data they were trained on. Several released models, for example, acknowledge that they were trained on non-public data and that outputs from the AI therefore cannot be used for commercial purposes.

        Educate yourself.

      • Rocket@lemmy.ca · 9 months ago

        DVDs, at least those with said FBI warning, are sold under an explicit license agreement which only permits personal viewing. That is not a matter of common sense. That is a matter of contract law.

        If these books were sold under a specific contract that forbids their being read, then the case is open and shut. But since the article seems afraid to share the license agreement in question, we can surmise that the license does not define who may or may not read the work, and now authors are crying because they didn’t think things through when they decided they didn’t want their work read.

        Such is life. We all screw up sooner or later.

        • nyan@lemmy.cafe · 9 months ago

          The thing is, AI is not a “who”, it’s a “what”. A license that permits human beings to read a thing may not be sufficient to authorize AIs to read a thing.

          That’s what the legal system needs to thrash out: is feeding information to an AI a separate right that needs to be assigned specifically by contract or license, or is it a subset of the human right to read a published document that the human has legitimate access to? If it is separate, then any work outside the public domain that doesn’t specifically have an authorization for AIs attached is going to be off-limits as training data, because rights not specifically assigned are reserved unto the copyright holder.

          Given the speed with which the law typically moves, it’s going to be years before we have an answer.

          • Rocket@lemmy.ca · 9 months ago

            The thing is, AI is not a “who”, it’s a “what”.

            If that is the case then this is already firmly settled. It is clear that you do not need special authorization to read a book through glasses. You reading a book through any other device would be no different.

            • nyan@lemmy.cafe · 9 months ago

              “Reading” is the act of viewing the unaltered text. It’s possible for an LLM to have text in its training set that no human has ever viewed in its original form. So no, an LLM is not equivalent to a pair of glasses in this case.

              In effect, an LLM is a fancy database that spits back modified subsets of its stored information in response to queries. Does this mean that the original data was “stored in a retrieval system” without permission (specifically forbidden by British copyright law, I believe, and possibly others)? Is the LLM creating “unauthorized derivative works”, which is illegal in many countries? Just where is the line between a derivative work and a work “inspired by” another but still legal, anyway? Is it the same in every jurisdiction?

              I’m not a lawyer, and I have no idea how many worms are going to creep out of this can, but one thing I’m absolutely certain of is that it isn’t going to be simple to sort out, no matter how many people would like it to be or think that “common sense” is on their side.

              • Rocket@lemmy.ca · 9 months ago

                “Reading” is the act of viewing the unaltered text.

                Something a “what” cannot do. LLMs have no concept of text, only the physical manifestation of text, applying transformations to that physical manifestation, just like glasses.

                Does this mean that the original data was “stored in a retrieval system” without permission

                Information is stored within glasses, at least for a short period of time. So, yes, you have created a storage and retrieval system when you use glasses. They would be useless otherwise. If that’s illegal, I hope you have good vision. Except your eye does the same thing, so…

                it isn’t going to be simple to sort out

                It only needs to be as difficult as we want it to be. And there is no reason to make it difficult.

                • nyan@lemmy.cafe · 9 months ago

                  . . . Wow. I’m going to be polite and assume that you never took physics in high school, instead of failing the unit on optics. Might want to bone up on that before you make an argument that deals with the physics of lenses, just sayin’.

                  You don’t appear to have much understanding of how the law operates either. It’s always complicated and difficult, and judges take a dim view of people who try to twist words around to mean something other than the contract defines them to mean.

        • xmunk@sh.itjust.works · 9 months ago

          Please explain to the lawyers in 1980 who wrote the contract for a published book what precisely generative AI, Twitter, and the internet are, so they can be sure to account for their fair use in the contract. Until five years ago, none of us knew what this stuff would do. And I’d mention that Google Books has been pummeled by lawsuits for pretty much the same reason and ended up needing to pull almost all books from its free reading section.

          • Rocket@lemmy.ca · 9 months ago

            The training of an attention network is not meaningfully different from training a perceptron, which was invented in 1957. Interconnected networks of computers date back to 1969. finger captured the spirit of Twitter in 1971. The businesses may have changed, but the concepts you speak of were very well understood in 1980.
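
            To make the 1957 claim concrete, here is a minimal perceptron learning rule in Python, a sketch with invented toy data (learning logical AND), not code from anyone in this thread:

```python
def train_perceptron(samples, epochs=50, lr=0.1):
    """Classic perceptron rule: update weights only on misclassified samples.

    samples: list of (features, label) pairs with label in {-1, +1}.
    """
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in samples:
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
            if pred != y:  # mistake-driven update, unchanged since 1957
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

# Toy example: learn logical AND (linearly separable, so convergence is guaranteed)
data = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
w, b = train_perceptron(data)
```

            Modern networks swap in more parameters and a gradient-based update, but the shape of the loop — forward pass, compare to target, nudge weights — is the same.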

            A contract precisely specifying “generative AI”, “Twitter”, and “the Internet” would be silly even today. That would be like a DVD specifying “You may only watch this while sitting on a Chesterfield flower-print couch in your living room on an RCA CRT television”. My understanding from you is that the intent is simply to prohibit non-humans from reading the work. That is something that could easily have been written in 1980.

            Hell, robots doing human-like things were front and centre in popular culture in the 1980s. It would have been impossible not to realize that robots reading books was an almost certain future. To act surprised now that it is happening is an act that isn’t selling.

            • xmunk@sh.itjust.works · 9 months ago

              There is a massive difference between AI tech in the 70s and today. The scale we’re able to achieve is orders of magnitude beyond what was dreamed of. These modern issues were expected to arrive much later, giving the legal system more time to catch up. Our legal system can force a common baseline of behavior on our new technology, and that will be necessary for a healthy balance of power.

              • Rocket@lemmy.ca · 9 months ago

                There is a massive difference between AI tech in the 70s and today.

                Not really. We’ve learned a few tricks along the way, but the fundamentals of neural networks have not changed at all. The most significant progress AI has made is in seeing compute become orders of magnitude faster. Which we knew, with reasonable confidence, was going to happen. Moore’s Law and all that.

                The scale we’re able to achieve is orders of magnitude beyond what was dreamed of.

                While I disagree, the scale is irrelevant. The slow systems in the 1970s were maybe only ingesting one book rather than millions of books, but legally there is no difference between one book and millions of books. If we are to believe that there is no legal right for a machine to read a book, then reading just one book is in violation of that.

                our new technology

                What new technology? The Attention is All You Need paper, which gave rise to LLMs, showed a way to train neural networks faster, but it is not the speed at which machines can read books that is in question. Nobody has suggested that the legal contention is in traffic law, with computers breaking speed limits.
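
                For the curious, the central operation that paper introduced is scaled dot-product attention. A toy NumPy sketch, with shapes and data invented purely for illustration:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)  # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V                              # weighted mix of value rows

rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))  # 3 queries of dimension 4
K = rng.standard_normal((5, 4))  # 5 keys
V = rng.standard_normal((5, 4))  # 5 values
out = attention(Q, K, V)         # shape (3, 4)
```

                Note that every step here is ordinary matrix arithmetic; the paper’s contribution was an architecture that parallelizes well, not a new kind of computation.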

                We’ve been doing this for decades upon decades. Incremental improvements in speed do not change the legal situation. To pretend that the world was suddenly flipped upside down is ridiculously disingenuous.