A shocking story was promoted on the “front page” or main feed of Elon Musk’s X on Thursday:

“Iran Strikes Tel Aviv with Heavy Missiles,” read the headline.

This would certainly be a worrying world news development. Earlier that week, Israel had conducted an airstrike on Iran’s embassy in Syria, killing two generals as well as other officers. Retaliation from Iran seemed like a plausible occurrence.

But there was one major problem: Iran did not attack Israel. The headline was fake.

Even more concerning, the fake headline was apparently generated by X’s own official AI chatbot, Grok, and then promoted by X’s trending news product, Explore, on the very first day of an updated version of the feature.

  • rottingleaf
    8 months ago

    Unless you build some other system that tries to make sense of what the LLM says, but that approaches the difficulty of just building an intelligent agent in the first place.

    I actually think an attempt at such an agent would have to include the junk generator. Some logical structure with weights and feedback that it forms on top of that junk would be something I could more easily call “AI”.

    • atrielienz@lemmy.world
      8 months ago

      I've actually been thinking about this some. All those “jobs” that people are losing to AI will probably end up being jobs that add a human component back into AI for the firms that have doubled down on it. Human oversight is going to be necessary, and these companies don’t want to admit that, even for things that LLMs are actually reasonably good at. So either companies will not adopt AI and will keep their human workers, or they’ll dump them for LLMs, quickly realize they need specialists to comb through AI responses, and either hire them back for that or hire them back for the jobs they wanted the LLMs to take over.

      That's because reliability and cost are the only things that are going to make one LLM preferable to another now that the Internet has basically been scraped for useful training data.

      This is algorithms all over again, but on a much larger scale. We can’t even keep up with mistakes made by algorithms (see copyright strikes and appeals on YouTube or similar). Humans are supposed to review them, but the platforms don’t have enough humans to do that job.