[long] Some tests of how much AI "understands" what it says (spoiler: very little)

diz@awful.systems · edit-2 1 month ago

Full time AI grift jobs would of course be forever closed to any AI whistleblower. There are still plenty of other jobs.

I did participate in the hiring process, and I can tell you that at your typical huge corporation the recruiter / HR are too inept to notice that you are a whistleblower, and don’t give a shit anyway. And of the rank and file who will actually google you, plenty dislike AI.

At the rank and file level, the only folks who actually give a shit who you are are people who will have to work with you. Not the background check provider, not the recruiter.

diz@awful.systems · 4 months ago

Well the OP talks about a fridge.

I think if anything it’s even worse for tiny things with tiny screws.

What kind of floating hologram is there gonna be that’s of any use, for something that has no schematic and the closest you have to a repair manual is some guy filming themselves taking apart some related product once?

It looks cool in a movie because it’s a 20 second clip in which one connector gets plugged, and tens of person hours were spent on it by very talented people who know how to set up a scene that looks good and not just visually noisy.

diz@awful.systems · 4 months ago

but often the video isn’t clear or fine quality enough

Wouldn’t it be great if 100x the effort that didn’t go into making the video clear or fine quality enough, instead didn’t go into making relevant flying, see-through overlay decals?

Ultimately the reason it looks cool is that you’re comparing a situation of little effort being put into repair related documentation, to some movie scenario where 20 person-hours were spent making a 20-second repair fragment whereby 1 step of a repair is done.

diz@awful.systems · 4 months ago

I’m not sure it’s actually being used, beyond C suite wanting something cool to happen and pretending it did happen.

diz@awful.systems · 4 months ago

Exactly. It goes something like "remember when you were fixing a washing machine and you didn’t know what some part was and there was no good guide for fixing it, no schematic, no nothing? Wouldn’t it be awesome if 100x of the work that wasn’t put into making documentation was not put into making VR overlays?"

diz@awful.systems · edit-2 4 months ago

Using tools from physics to create something that is popular but unrelated to physics is enough for the nobel prize in physics?

If only, it’s not even that! Neither Boltzmann machines nor Hopfield networks led to anything used in the modern spam and deepfake generating AI, nor in image recognition AI, or the like. This is the kind of stuff that struggles to get above 60% accuracy on MNIST (handwritten digits).

Hinton went on to do some different stuff based on backpropagation and gradient descent, on newer computers than were available to those who came up with it long before him, and he got the Turing Award for that. It’s a wee bit controversial because of the whole “people doing it before, but on worse computers, and so they didn’t get any award” thing, but at least it is for work that is on the path leading to modern AI, and not for work that is part of the vast list of things that just didn’t work, where it’s extremely hard to explain why you would even think they would work in the first place.

diz@awful.systems · 4 months ago

Then next year Hopfield and Hinton go back to Sweden, don’t tell king of Sweden anything, king of Sweden still gives them the Nobel Prize! King of Sweden now has conditioned reflex!

diz@awful.systems · 4 months ago

I seriously wonder, do any of the folks with the “AR glasses to assist repair” thing ever actually repair anything, or do they get their ideas of how you repair stuff from computer games?

diz@awful.systems · edit-2 4 months ago

Nobel Prize in Physics for attempting to use physics in AI, which didn’t really work very well. Then one of the guys went on to work on a better, more purely mathematical approach that actually worked, and got the Turing Award for that, but that’s not what the prize is for, while the other guy did some other work, but that is not what the prize is for either. AI will solve all physics!!!111

diz@awful.systems · 4 months ago

Maybe if the potato casserole is exploded in the microwave by another physicist, on his way to start a resonance cascade…

(i’ll see myself out).

diz@awful.systems · 7 months ago

The counting failure in general is even clearer and lacks the excuse of unfavorable tokenization. The AI hype would have you believe just an incremental improvement in multi-modality or scaffolding will overcome this, but I think they need to make more fundamental improvements to the entire architecture they are using.

Yeah.

I think the failure could be extremely fundamental - maybe local optimization of a highly parametrized model is fundamentally unable to properly learn counting (other than via memorization).

After all, there’s a very large number of ways in which a highly parametrized model can do a good job of predicting the next token without doing any actual counting. What makes counting special vs memorization is that it is a relatively compact representation, but there’s no reason for a neural network to favor compact representations.

The “correct” counting may just be a very tiny local minimum, with tall hills all around it and no valley leading there. If that’s the case then local optimization will never find it.
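To make the "tiny minimum behind a hill" picture concrete, here is a toy sketch. Everything in it is made up for illustration: a 1-D loss with a broad shallow basin around x = -2 and a much deeper but very narrow well at x = 3 sitting inside a bump. It says nothing about what real LLM loss landscapes look like; it only shows that plain gradient descent finds the narrow well from almost no starting points.

```python
import math

# Hypothetical 1-D loss, invented for illustration only:
# a broad basin near x = -2, a bump ("hill") around x = 3,
# and a deep, very narrow well at x = 3 inside that bump.
def loss(x):
    broad = 0.05 * (x + 2.0) ** 2
    hill = 1.0 * math.exp(-((x - 3.0) / 0.5) ** 2)
    well = -5.0 * math.exp(-((x - 3.0) / 0.05) ** 2)
    return broad + hill + well

# Analytical derivative of loss(x).
def grad(x):
    u = x - 3.0
    d_broad = 0.1 * (x + 2.0)
    d_hill = -2.0 * u / 0.5 ** 2 * math.exp(-(u / 0.5) ** 2)
    d_well = 10.0 * u / 0.05 ** 2 * math.exp(-(u / 0.05) ** 2)
    return d_broad + d_hill + d_well

# Plain gradient descent; the small step keeps it stable inside the narrow well.
def descend(x, lr=2e-4, steps=10_000):
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# 100 evenly spaced starting points on [-6, 6].
starts = [-6.0 + 12.0 * i / 99 for i in range(100)]
hits = sum(1 for x0 in starts if abs(descend(x0) - 3.0) < 0.05)

print(f"loss in narrow well: {loss(3.0):.2f}, loss in broad basin: {loss(-2.0):.2f}")
print(f"{hits} of {len(starts)} starts ended in the narrow well")
```

The narrow well has the lower loss by a wide margin, but only the few starts that already sit inside it end up there. The bump pushes everything else away.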

diz@awful.systems · 7 months ago

I feel like letter counting and other letter manipulation problems kind of under-sell the underlying failure to count - LLMs work on tokens, not letters, so they are expected to have difficulty with letters.

The inability to count is of course wholly general - in a river crossing puzzle an LLM cannot keep track of what’s on either side of the river, for example, and sometimes misreports how many steps it output.
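For a sense of how little bookkeeping this is, here is a rough sketch of tracking both banks and the step count for the classic wolf / goat / cabbage version of the puzzle. The item names, the `replay` helper and the move format are my own assumptions for illustration, not anything from an actual test.

```python
ITEMS = {"wolf", "goat", "cabbage"}
# Pairs that cannot be left alone on a bank without the farmer.
FORBIDDEN = [{"wolf", "goat"}, {"goat", "cabbage"}]

def replay(moves):
    """Replay a list of crossings; each move is the cargo name or None.

    Returns (left_bank, right_bank, number_of_steps).
    Raises ValueError on an impossible or unsafe move.
    """
    left, right = set(ITEMS), set()
    farmer_on_left = True
    for step, cargo in enumerate(moves, start=1):
        src, dst = (left, right) if farmer_on_left else (right, left)
        if cargo is not None:
            if cargo not in src:
                raise ValueError(f"step {step}: {cargo} is not on the farmer's bank")
            src.remove(cargo)
            dst.add(cargo)
        farmer_on_left = not farmer_on_left
        # The farmer has just left src, so src is now unattended.
        for pair in FORBIDDEN:
            if pair <= src:
                raise ValueError(f"step {step}: {sorted(pair)} left alone")
    return left, right, len(moves)

# The standard 7-crossing solution.
SOLUTION = ["goat", None, "wolf", "goat", "cabbage", None, "goat"]
left, right, steps = replay(SOLUTION)
print(sorted(left), sorted(right), steps)
```

Two sets, one boolean and a loop counter are the whole state. That is what the model would have to carry implicitly across its output.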

diz@awful.systems · 7 months ago

Another thing to add to this is that there are just one or two people on the train providing service for hundreds of other people or millions of dollars worth of goods. Automating those people away is simply not economical, not even in terms of the headcount replaced vs headcount that has to be hired to maintain the automation software and hardware.

Unless you’re a techbro, who deeply resents labor, someone who would rather hire 10 software engineers than 1 train driver.

diz@awful.systems · 7 months ago

[long] Some tests of how much AI "understands" what it says (spoiler: very little)