• oyo@lemm.ee
    English
    arrow-up
    157
    arrow-down
    1
    ·
    1 month ago

    LLMs: using statistics to generate reasonable-sounding wrong answers from bad data.

      • Blackmist@feddit.uk
        English
        arrow-up
        51
        ·
        1 month ago

        And the system doesn’t know either.

        For me this is the major issue. A human is capable of saying “I don’t know”. LLMs don’t seem able to.

        • xantoxis@lemmy.world
          English
          arrow-up
          33
          ·
          1 month ago

          Accurate.

          No matter what question you ask them, they have an answer. Even when you point out their answer was wrong, they just have a different answer. There’s no concept of not knowing the answer, because they don’t know anything in the first place.

          • Blackmist@feddit.uk
            English
            arrow-up
            16
            ·
            1 month ago

            The worst for me was a fairly simple programming question. The class it used didn’t exist.

            “You are correct, that class was removed in OLD version. Try this updated code instead.”

            Gave another made-up class name.

            Repeated with a newer version number.

            It knows what answers smell like, and the same with excuses. Unfortunately there’s no way of knowing whether it’s actually bullshit until you take a whiff of it yourself.

            • nilloc@discuss.tchncs.de
              English
              arrow-up
              4
              ·
              1 month ago

              So instead of Prompt Engineer, the more accurate term should be AI Taste Tester?

              From what I’ve seen you’ll need an iron stomach.

      • treadful@lemmy.zip
        English
        arrow-up
        12
        ·
        1 month ago

        They really aren’t. Go ask about something in your area of expertise. At first glance, everything will look correct and in order, but the more you read the more it turns out to be complete bullshit. It’s good at getting broad strokes but the details are very often wrong.

        Now imagine someone that doesn’t have your expertise reading that answer. They won’t recognize those details are wrong until it’s too late.

        • Quereller@lemmy.one
          English
          arrow-up
          5
          ·
          1 month ago

          That is about the experience I have. I asked it for factual information in the field I work in. It didn’t give correct answers. Or it gave working protocols that were strange and would not be successful.

      • GBU_28@lemm.ee
        English
        arrow-up
        6
        arrow-down
        2
        ·
        edit-2
        1 month ago

        With a proper framework, decent assertions are possible.

        1. It must cite the source and provide the quote, not just a summary.
        2. An adversarial review must be conducted.

        If that is done, the work left for the human is very low.

        That said, it’s STILL imperfect, but this is leagues better than one-shot question and answer.
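
        To make point 1 concrete, here is a minimal sketch of a quote check, assuming the cited sources are available as plain text. The names `normalize`, `quote_is_grounded` and the `sources` dict are made up for illustration and don't come from any particular framework. It only confirms that the quote really appears in the cited source; it says nothing about whether the quote supports the claim, which is what the adversarial review in point 2 is for.

```python
import re

def normalize(text):
    """Lowercase and collapse whitespace so trivial formatting differences don't matter."""
    return re.sub(r"\s+", " ", text).strip().lower()

def quote_is_grounded(quote, source_id, sources):
    """Return True only if the cited source exists and contains the quote verbatim."""
    source_text = sources.get(source_id)
    if source_text is None:
        return False  # the answer cited a source we don't have
    q = normalize(quote)
    return bool(q) and q in normalize(source_text)

# Hypothetical source store: id -> full text of the document.
sources = {
    "manual-p12": "Reset the device by holding the power button\nfor ten seconds.",
}

print(quote_is_grounded("holding the power button for ten seconds", "manual-p12", sources))
print(quote_is_grounded("holding the power button for five seconds", "manual-p12", sources))
```

        The first call passes and the second fails, so a made-up quote or a made-up source id gets rejected before a human ever has to read it.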

        • Aceticon@lemmy.world
          English
          arrow-up
          3
          ·
          edit-2
          1 month ago

          Except LLMs don’t store sources.

          They don’t even store sentences.

          It’s all a stack of massive N-dimensional probability spaces roughly encoding the probabilities of certain tokens (which are mostly but not always words) appearing after groups of tokens in a certain order.

          And all of that to just figure out “what’s the most likely next token”, an output which is then added to the input and fed into it again to get the next word and so on, producing sentences one word at a time.
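
          A toy version of that loop, as a rough sketch only: a real LLM uses a neural network conditioned on the whole context and usually samples from the distribution, while this stand-in is just a table of which word followed which (a bigram count) and always picks the most frequent one. The function names and the tiny corpus are invented for the example.

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, prompt, max_new_tokens):
    """Greedy loop: pick the most likely next token, append it to the input, repeat.

    `prompt` must be a non-empty list of tokens.
    """
    output = list(prompt)
    for _ in range(max_new_tokens):
        followers = counts.get(output[-1])
        if not followers:
            break  # this token was never followed by anything in the corpus
        next_token = followers.most_common(1)[0][0]
        output.append(next_token)  # the output becomes part of the next input
    return output

corpus = "the cat sat on the mat and the cat sat down".split()
model = train_bigram(corpus)
print(" ".join(generate(model, ["the"], 2)))  # prints "the cat sat"
```

          Note that nothing in `model` records where "the cat sat" came from; only the counts survive.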

          Now, if you feed it as input a long, very precise sentence taken from a unique piece, maybe you’re lucky and it will output the correct next word, but if you already have all that you don’t really need an LLM to give you the rest.

          Maybe the “framework” you seek - which is quite akin to an indexer with a natural language interface - can be made with AI, but it’s not something you can do with LLMs because their structure is entirely unsuited for it.

          • GBU_28@lemm.ee
            English
            arrow-up
            1
            ·
            edit-2
            1 month ago

            The proper framework does, with a data store, indexing and access functions.

            The cutting edge work is absolutely using LLMs in post-RAG pipelines.

            Consumer grade chat interfaces def do not do this.

            Edit: if you worry about topics like context window, sentence splitting or source extraction, you aren’t using a best-in-class framework anymore.
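
            For anyone wondering what "data store, indexing and access functions" means in practice, here is a deliberately tiny sketch. It is not how any specific framework does it: real pipelines typically use embeddings and a vector index, while this one uses plain keyword overlap, and `PassageStore`, `search` and `build_prompt` are hypothetical names. The point is only that the sources live outside the model and get handed to it with ids it can cite.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

class PassageStore:
    def __init__(self):
        self.passages = {}             # data store: passage id -> text
        self.index = defaultdict(set)  # inverted index: word -> passage ids

    def add(self, passage_id, text):
        self.passages[passage_id] = text
        for word in tokenize(text):
            self.index[word].add(passage_id)

    def search(self, query, top_k=3):
        """Access function: rank passages by how many distinct query words they contain."""
        scores = defaultdict(int)
        for word in set(tokenize(query)):
            for pid in self.index.get(word, ()):
                scores[pid] += 1
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [(pid, self.passages[pid]) for pid, _ in ranked[:top_k]]

def build_prompt(question, hits):
    """Put the retrieved passages, with their ids, in front of the question."""
    context = "\n".join(f"[{pid}] {text}" for pid, text in hits)
    return f"Answer using only these sources and cite their ids:\n{context}\n\nQuestion: {question}"

store = PassageStore()
store.add("doc1", "The Eiffel Tower is in Paris.")
store.add("doc2", "Mount Everest is in Nepal.")

hits = store.search("Where is the Eiffel Tower?", top_k=1)
print(build_prompt("Where is the Eiffel Tower?", hits))
```

            The LLM only sees the prompt that `build_prompt` produces, so any citation it gives can be checked against the store instead of taken on faith.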