• Fluffles@pawb.social
    1 year ago

    I believe this phenomenon is called “artificial hallucination”. It’s when a language model goes beyond what its training data supports and makes up information out of thin air. All language models have this flaw, not just ChatGPT.

    • ☆ Yσɠƚԋσʂ ☆@lemmy.mlOP
      1 year ago

      The fundamental problem is that, at the end of the day, it’s just a glorified Markov chain. An LLM doesn’t have any actual understanding of what it produces in a human sense; it just knows that particular sets of tokens tend to go together in the data it’s been trained on. The GPT mechanism could very well be a useful building block for making learning systems, but a lot more work will need to be done before they can actually be said to understand anything in a meaningful way.
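
      To make the Markov chain comparison concrete, here is a toy sketch of a bigram chain that only knows which tokens followed which in its training text. The corpus and the function names `build_bigram_table` and `generate` are made up for illustration. A real LLM conditions on a long context through a neural network rather than a lookup table, so this shows the analogy, not how GPT works.

```python
import random
from collections import defaultdict


def build_bigram_table(tokens):
    """Map each token to the list of tokens observed right after it."""
    table = defaultdict(list)
    for current, following in zip(tokens, tokens[1:]):
        table[current].append(following)
    return table


def generate(table, start, length, seed=0):
    """Walk the chain: repeatedly sample a successor of the last token."""
    rng = random.Random(seed)
    output = [start]
    for _ in range(length - 1):
        successors = table.get(output[-1])
        if not successors:
            # Dead end: this token never had a successor in the corpus.
            break
        output.append(rng.choice(successors))
    return output


# Tiny illustrative corpus (an assumption, not real training data).
corpus = "the cat sat on the mat and the dog sat on the rug".split()
table = build_bigram_table(corpus)
print(" ".join(generate(table, "the", 8)))
```

      Every adjacent pair in the output was seen in the corpus, yet nothing in the table represents what a cat or a mat is. That is the sense in which tokens merely "tend to go together".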

      I suspect that to make a real AI we have to embody it in either a robot or a virtual avatar where it would learn to interact with its environment the way a child does. The AI has to build an internal representation of the physical world and its rules. Then we can teach it language using this common context where it would associate words with its understanding of the world. This kind of a shared context is essential for having AI understand things the way we do.

      • FunkyStuff [he/him]@hexbear.net
        1 year ago

        You have a pretty interesting idea that I hadn’t heard elsewhere. Do you know if there’s been any research to make an AI model learn that way?

        While messing around with some ML stuff in my own time, I’ve heard of approaches where you try to get the model to accomplish progressively more complex tasks within the same domain. For example, if you wanted to train a model to control an agent in a physics simulation to walk like a humanoid, you’d have it learn to crawl first, like a real human would. I guess for an AGI it makes sense that you would have it try to learn a model of the world across different domains like vision or sound. Heck, since you can plug any kind of input into it, you could have it process radio, infrared, whatever else. That way it could have a very complete model of the world.
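
        The "crawl before you walk" idea is usually called curriculum learning. Below is a minimal sketch of the control loop only: stages are ordered by difficulty and the learner moves on once it masters the current one. `Stage`, `ToyLearner`, and the integer skill numbers are hypothetical stand-ins; a real setup would run a physics simulation and update a policy inside `train_episode`.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    difficulty: int  # hypothetical skill level needed to pass this stage


class ToyLearner:
    """Stand-in for a real RL agent: skill grows by one per episode."""

    def __init__(self):
        self.skill = 0

    def train_episode(self, stage):
        # A real agent would run the simulation and update its policy here.
        self.skill += 1
        return self.skill >= stage.difficulty  # True once the stage is mastered


def run_curriculum(learner, stages, max_episodes=1000):
    """Train on each stage in order, moving on only after it is mastered."""
    history = []
    for stage in stages:
        episodes = 0
        mastered = False
        while not mastered and episodes < max_episodes:
            mastered = learner.train_episode(stage)
            episodes += 1
        history.append((stage.name, episodes, mastered))
        if not mastered:
            break  # Do not attempt harder stages before this one is learned.
    return history


stages = [Stage("crawl", 3), Stage("stand", 6), Stage("walk", 10)]
history = run_curriculum(ToyLearner(), stages)
for name, episodes, mastered in history:
    print(name, episodes, mastered)
```

        The point of the design is that skill carries over between stages, so the later, harder stages start from what the earlier ones already taught.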

      • v_krishna@lemmy.ml
        1 year ago

        A lot of semantic NLP tried this and it kind of worked, but meanwhile statistical correlation won out. It turns out that, while humans consider semantic understanding to be really important, it actually isn’t required for an overwhelming majority of industry use cases. As a Kantian at heart (and an ML engineer by trade) it sucks to recognize this, but it seems like semantic conceptualization as an epiphenomenon emerging from statistical concurrence really might be the way that (at least artificial) intelligence works.

        • ☆ Yσɠƚԋσʂ ☆@lemmy.mlOP
          1 year ago

          I don’t see the approaches as mutually exclusive. Statistical correlation can get you pretty far, but we’re already seeing a lot of limitations with this approach when it comes to verifying correctness or having the algorithm explain how it came to a particular conclusion. In my view, this makes a purely statistical approach inadequate for any situation where a specific result is desired. For example, an autonomous vehicle has to drive on a road and correctly decide whether there are obstacles around it or not. Failing to do that correctly leads to disastrous results, which makes purely statistical approaches inherently unsafe.

          I think things like GPT could be building blocks for systems that are trained to have semantic understanding. I think what it comes down to is simply training a statistical model against a physical environment until it adjusts its internal topology to create an internal model of the environment through experience. I don’t expect that semantic conceptualization will simply appear out of feeding a bunch of random data into a GPT style system though.

          • v_krishna@lemmy.ml
            1 year ago

            I fully agree with this, and would have written something similar, but I was eating lunch when I made my earlier comment. I also think there’s a big part of pragmatics that comes from embodiment that will become more and more important (and I wish Merleau-Ponty were still around so we could hear what he thinks about this).

            • ☆ Yσɠƚԋσʂ ☆@lemmy.mlOP
              1 year ago

              Indeed, I definitely expect interesting things to start developing on that front, and we may see old ideas getting dusted off because now there’s enough computing power to put them to use. For example, I thought The Society of Mind from Minsky lays out a plausible architecture for a mind. Imagine each agent in that scenario being a GPT system, and the bigger mind being built out of a society of such agents each being concerned with a particular domain it learns about.

              • v_krishna@lemmy.ml
                1 year ago

                Many (14?) years back I attended a conference (now I can’t remember what it was for, I think a complex systems department at some DC area university) and saw a lady give a talk about using agent based modeling to do computational sociology planning around federal (mostly navy/army) development in Hawaii. Essentially a sim city type of thing but purpose built to help aid in public planning decisions. Now imagine that but the agents aren’t just sets of weighted heuristics but instead weighted heuristic/prompt driven LLMs with higher level executive prompts to bring them together.

                • ☆ Yσɠƚԋσʂ ☆@lemmy.mlOP
                  1 year ago

                  I’m really excited to see this kind of stuff experimented with. I find it really useful to think of machine learning agent training in terms of creating a topology, through balancing of the weights and connections, that ends up being a model of a particular domain described by the data that it’s being fed. The agent learns patterns in the data it observes and creates an internal predictive model based on that. Currently, most machine learning systems seem to focus on either individual agents or small groups, such as adding a supervisor. It would be interesting to see large graphs of such agents that interact in complex ways, where high level agents only interact with other agents and don’t even need to see any of the external inputs directly. One example would be to have a system trained on working with visual input and another with audio, and then have a high level system that’s responsible for integrating these inputs and doing the actual decision making.
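
                  As a rough sketch of the wiring I have in mind (not a trained system): two low level agents read the raw inputs, and a high level agent only ever sees their outputs. The class `AgentGraph`, the agent functions, and the thresholds are all invented for illustration. In practice each function would be a trained model.

```python
class AgentGraph:
    """Runs agents in insertion order; each agent reads only its named inputs."""

    def __init__(self):
        self.agents = {}  # name -> (function, list of input names)

    def add(self, name, fn, inputs):
        self.agents[name] = (fn, list(inputs))

    def run(self, external):
        values = dict(external)
        # Assumes agents were added in dependency order (inputs before users).
        for name, (fn, inputs) in self.agents.items():
            values[name] = fn({key: values[key] for key in inputs})
        return values


def vision_agent(inputs):
    # Placeholder for a vision model: bright frames count as an obstacle.
    pixels = inputs["pixels"]
    return "obstacle" if sum(pixels) / len(pixels) > 0.5 else "clear"


def audio_agent(inputs):
    # Placeholder for an audio model: a loud sample counts as a horn.
    return "horn" if max(inputs["samples"]) > 0.8 else "quiet"


def integrator(inputs):
    # High level agent: sees only the other agents' outputs, never raw data.
    if inputs["vision"] == "obstacle" or inputs["audio"] == "horn":
        return "brake"
    return "continue"


graph = AgentGraph()
graph.add("vision", vision_agent, ["pixels"])
graph.add("audio", audio_agent, ["samples"])
graph.add("decision", integrator, ["vision", "audio"])

result = graph.run({"pixels": [0.9, 0.8], "samples": [0.1, 0.2]})
print(result["decision"])
```

                  Because `run` passes each agent only the values it declared, the integrator cannot reach `pixels` or `samples` even by accident, which is the separation between levels described above.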

                  And I just ran across this: https://arxiv.org/abs/2308.00352