'Talking To Windows' Copilot AI Makes a Computer Feel Incompetent' (theverge.com)

Microsoft's Copilot AI assistant in Windows 11 fails to replicate the capabilities shown in the company's TV advertisements. The Verge tested Copilot Vision over a week using the same prompts featured in ads airing during NFL games. When asked to identify a HyperX QuadCast 2S microphone visible in a YouTube video -- a task successfully completed in Microsoft's ad -- Copilot gave multiple incorrect answers. The assistant identified the microphone as a first-generation HyperX QuadCast, then as a Shure SM7b on two other occasions. Copilot couldn't identify the Saturn V rocket from a PowerPoint presentation despite the words "Saturn V" appearing on screen. When asked about a cave image from Microsoft's ad, Copilot gave inconsistent responses.

About a third of the time it provided directions to find the photo in File Explorer. On two occasions it explained how to launch Google Chrome. Four times it offered advice about booking flights to Belize. The cave is Rio Secreto in Playa del Carmen, Mexico. Microsoft spokesperson Blake Manfre said "Copilot Actions on Windows, which can take actions on local files, is not yet available." He described it as "an opt-in experimental feature that will be coming soon to Windows Insiders in Copilot Labs, starting with a narrow set of use cases while we optimize model performance and learn." Copilot cannot toggle basic Windows settings like dark mode. When asked to analyze a benchmark table in Google Sheets, it "constantly misread clear-as-day scores both in the spreadsheet and in the on-page review."
  • In spite of the advances in AI

    • Ha. I can't access the article to be sure, but I think they mean making the user feel like the computer is incompetent, not making the computer feel bad about its performance like Marvin the Paranoid Android.
    • by hey! ( 33014 ) on Tuesday November 18, 2025 @05:08PM (#65803545) Homepage Journal

      Correct. This is why I don't like the term "hallucinate". AIs don't experience hallucinations, because they don't experience anything. The problem they have would more correctly be called, in psychological terms, "confabulation" -- they patch up holes in their knowledge by making up plausible-sounding facts.

      I have experimented with AI assistance for certain tasks, and find that generative AI absolutely passes the Turing test for short sessions -- if anything it's too good; too fast; too well-informed. But the longer the session goes, the more the illusion of intelligence evaporates.

      This is because under the hood, what AI is doing is a bunch of linear algebra. The "model" is a set of matrices, and the "context" is a set of vectors representing your session up to the current point, augmented during each prompt response by results from Internet searches. The problem is, the "context" takes up lots of expensive high-performance video RAM, and every user only gets so much of that. When you run out of space for your context, the older stuff drops out of the context. This is why credibility drops the longer a session runs. You start with a nice empty context, and you bring in some internet search results and run them through the model and it all makes sense. When you start throwing out parts of the context, the context turns into inconsistent mush.
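      Schematically, the eviction looks something like this (a toy sketch with a made-up token list and a tiny limit, nothing like a real inference engine's internals):

```python
# Toy illustration of context-window eviction (not any real inference
# engine): the "context" is a bounded token list, and once it is full
# the oldest tokens are dropped, so early session details silently vanish.

MAX_CONTEXT = 4  # real models hold thousands of tokens

def add_to_context(context, new_tokens, limit=MAX_CONTEXT):
    """Append tokens, then evict the oldest ones past the limit."""
    context = context + new_tokens
    if len(context) > limit:
        context = context[-limit:]  # oldest tokens fall out
    return context

ctx = []
ctx = add_to_context(ctx, ["user:", "my", "name", "is", "Ada"])
ctx = add_to_context(ctx, ["user:", "summarize", "this", "spreadsheet"])
# "Ada" has been evicted: the model can no longer see it.
print(ctx)
print("Ada" in ctx)  # False
```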

      • Re: (Score:3, Interesting)

        by quenda ( 644621 )

        if anything it's too good; too fast; too well-informed. But the longer the session goes, the more the illusion of intelligence evaporates.

        We associate knowledge with intelligence. Ask a person about their favourite topic, and they will sound smarter.
        An LLM knows far more than any human, so we tend to over-estimate its "IQ". The intelligence is still very real; the problem is just that we initially over-estimated it.

        This is why credibility drops the longer a session runs. You start with a nice empty context, and you bring in some internet search results and run them through the model and it all makes sense. When you start throwing out parts of the context, the context turns into inconsistent mush.

        And how is that any different for humans? I can read them a ten-digit number and their context overflows. Dumb as hammers.
        Why do you think a bunch of wet machinery with membranes and chemical messengers is intrinsically supe

        • by hey! ( 33014 )

          It's different from humans in that human opinions, expertise and intelligence are rooted in their experience. Good or bad, and inconsistent as it is, it is far, far more stable than AI. If you've ever tried to work at a long running task with generative AI, the crash in performance as the context rots is very, very noticeable, and it's intrinsic to the technology. Work with a human long enough, and you will see the faults in his reasoning, sure, but it's just as good or bad as it was at the beginning.

      • It's called "pathetic fallacy"-- ascribing feelings (pathos, in Greek) to inanimate objects.

        I'm afraid that we do this all the time. I don't even think twice before saying something like "the toaster doesn't like you to run the blender while it's toasting" or "this program wants two special characters in the password, not just one."

        • It's called "pathetic fallacy"-- ascribing feelings (pathos, in Greek) to inanimate objects.

          The machine spirit must be appeased.

    • by quenda ( 644621 )

      Well, yes. LLMs do not "feel anything" by design

      What is a "feeling"? Is it like a "mood"? In humans, they are short-term states that affect behaviour.
      No doubt an AI could be trained using rewards when it modified output in response to praise or insults. It could be trained to get impatient with poorly worded or dumb questions. You could also get such behaviour by modifying the system prompt, but that would be more like humans faking an emotion. What is the difference between a real and fake emotion? Se

  • by know-nothing cunt ( 6546228 ) on Tuesday November 18, 2025 @02:46PM (#65803277)

    to Drunken Passenger.

  • Seriously, it's totally incompetent.

  • by fropenn ( 1116699 ) on Tuesday November 18, 2025 @02:51PM (#65803283)
    ...to hold back on shipping a piece of trash. Yes, Apple got carried away with their advertising ("Ready for Apple Intelligence!"), but rightly held back on shipping garbage.
    • by AmiMoJo ( 196126 )

      Siri has been one of the weakest assistants for many years now, and given that they usually just ship half finished software (Apple Maps comes to mind), I'm surprised they were able to resist. Maybe there is another reason, like it kills the battery.

    • You're implying the end result will be something of quality from Apple. Given the state of Siri, I feel like Apple "shipping garbage" would have been a significant improvement over their status quo.

  • by nightflameauto ( 6607976 ) on Tuesday November 18, 2025 @02:53PM (#65803287)

    Microsoft desperately wants to sell us a vision of the PC being an "agentic" device. You speak, it responds. Except, they're creating the equivalent of a blind and deaf person being peddled as an expert in all things. It can't read the files on the computer? It can't respond with answers clearly spelled out in the content currently pulled up on the screen? And apparently it can't understand simple questions well enough to even fully grok the scope or domain of the query itself.

    Maybe one of the AI pushing tech companies could try to work through the shit-show of pre-alpha state software in their own labs before attempting to foist it off on developers or "insiders" or, more often, the end users? Maybe, just maybe, we'd have a better perspective on AI if we didn't have so much of it shoved in our faces while it's half baked and nowhere near ready to fulfill even the most basic tasks it's being sold as the perfect solution for? But it seems more and more likely that we'll just let the entirety of humanity drown in the refuse pile that half baked AI is creating. Nobody seems at all interested in saying, "How about we get it functional before we shove it out the door?"

    • it can't understand simple questions well enough to even fully grok the scope or domain of the query itself.

      Not what you mean, but it would be funny if one AI just fed its inputs into another and copied and pasted the results so it didn't have to work. Then we'd be approaching human intelligence.

  • by RobinH ( 124750 ) on Tuesday November 18, 2025 @02:54PM (#65803289) Homepage
    Most of the people on Slashdot have been screaming that the emperor has no clothes for a while now. Building a machine that spits out semi-plausible dialogue is very different from making an intelligent machine. I just asked Google's AI to summarize information about myself (my own name from my own town) and it rather hilariously indicated that my wife was actually my (IRL) sister. It had apparently retrieved the names from an obituary for our grandparent but didn't actually understand the relationships. We're not seeing the meaningful improvements that you would hope to see given the ludicrous capital that's been invested into LLMs. This isn't like Moore's Law in the '90s, where there was constant improvement along a single axis (transistors per square mm). There have been a couple of really big breakthroughs (first deep learning, then transformers). Throwing more and more compute power at it isn't going to create the payoff that all the investors think. It's not going to get significantly better without more big breakthroughs, and those might come tomorrow, or not until long after we're all dead and gone. LLMs are a very risky bet right now.
    • by fluffernutter ( 1411889 ) on Tuesday November 18, 2025 @03:12PM (#65803341)
      That's one possibility. The other possibility is that we have reached AGI and you need to have a long talk with your mother.
    • by Gilmoure ( 18428 )

      I tried to use a chat tool to write a bash script that would prompt for username and file system, for a quota change.

      Was total garbage.

      I'm not a coder, just a stupid admin, but how are folks using these tools for programming?
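      For what it's worth, the helper being described is only a few lines. A rough sketch, assuming a Linux host with the standard `setquota` utility and root privileges (the limits and prompts here are made up, not whatever the chat tool produced):

```python
# Rough sketch of a quota-change helper: prompt for a username and a
# filesystem, then invoke the standard Linux `setquota` utility.
# Must be run as root; limits used here are illustrative only.
import subprocess

def build_setquota_cmd(user, filesystem, soft_mb, hard_mb):
    """Build a setquota argument list (block quotas in 1K blocks,
    inode limits left unlimited)."""
    return ["setquota", "-u", user,
            str(soft_mb * 1024), str(hard_mb * 1024),  # block soft/hard
            "0", "0",                                  # inode soft/hard
            filesystem]

def main():
    user = input("Username: ").strip()
    fs = input("Filesystem (e.g. /home): ").strip()
    soft = int(input("Soft limit (MB): "))
    hard = int(input("Hard limit (MB): "))
    subprocess.run(build_setquota_cmd(user, fs, soft, hard), check=True)

# Run main() interactively; it is not invoked here so the sketch stays inert.
```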

    • Most of the people on Slashdot have been screaming that the emperor has no clothes for a while now

      rsilvergun has been screaming even louder about how AI as we have it now is already the end of the world, and that society isn't "ready" for it until he says it is.

      Funny coincidence: two hours ago I just finished two cavern dives in the very cenote complex in Playa del Carmen TFS alluded to. Some of the best diving I've done yet (comes really close to diving with tiger and bull sharks). Currently on my second margarita while having a rest. Doing another two cavern dives tomorrow.

      • rsilvergun has been screaming even louder about how AI as we have it now it's already the end of the world, and that society isn't "ready" for it until he says it is.

        Since he's living rent-free in your head, can we assume you're the one responsible for the rsilvergun-impersonating LLM spam?

        • Since he's living rent-free in your head

          This statement was cute, even funny, the first few times that it was used. That was because it was such an absurd way of making that point.

          But, after this statement has been repeated so many times, it's just fucking stupid now. You should consider abandoning it before people start thinking that you are stupid.

    • by gweihir ( 88907 )

      Most of the people on Slashdot have been screaming that the emperor has no clothes for a while now.

      Yes. Well, make that "many". But incredible as that sounds given some comments, many people here are wayyyy above average in tech understanding and insight. Obviously, we have the occasional keyword-trigger-only-no-insight MAGA and some tech fanatics, but generally we are insulting each other on a comparatively (if not absolute) pretty high insight level here.

  • Call me shocked!

  • Color me shocked. Shocked!

    When I see "agentic" I always read it as "agnatic", which somehow makes it less stupid.

  • ... that overpromises and underdelivers... what a surprise.
  • I just asked it earlier to generate a simple Excel file, something I would think a Microsoft AI should... excel at, and the Excel file was broken, unusable. The one thing I would think it would be good at, and it failed.
  • There is one thing that CoPilot is good at, and that is searching your files. Windows Explorer search has always blown goats, but Outlook search used to be good and sucks now, and OneNote search is merely passable. I suppose it isn't surprising a tool made via data scraping is good at scraping data but if you can't find something due to lack of functional search on your work PC give it a go and it does a pretty decent job.
  • by Jeslijar ( 1412729 ) on Tuesday November 18, 2025 @04:26PM (#65803465) Homepage

    It's just so full of shit. It's a wonder it even runs anymore.

    At least Linux market share is slowly but steadily increasing, so I approve of the enshittification of Windows. No better marketing than what they do themselves.

  • Random number generators were not intended to be used to make AI decisions. But here we are. A real AI would always give the same answer even if that answer is wrong. And that answer would not change unless the base data changes. You can't do that though with current large-set-databases queried with a random number generator.
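    To be fair, the randomness is a decoding choice, not an accident: models emit a probability distribution over possible next tokens, and greedy (argmax) decoding over that distribution is perfectly repeatable, while temperature-style sampling is not. A toy sketch, with an entirely made-up distribution:

```python
# Toy sketch of why LLM answers vary run to run: decoding draws from a
# probability distribution over next tokens. Greedy (argmax) decoding is
# deterministic; weighted sampling is not. The distribution is made up.
import random

probs = {"QuadCast": 0.5, "SM7b": 0.3, "Yeti": 0.2}

def greedy(probs):
    # Always return the single highest-probability token.
    return max(probs, key=probs.get)

def sample(probs, rng):
    # Draw one token at random, weighted by its probability.
    tokens, weights = zip(*probs.items())
    return rng.choices(tokens, weights=weights, k=1)[0]

print(greedy(probs))  # "QuadCast", every single time
rng = random.Random()
print([sample(probs, rng) for _ in range(5)])  # varies between runs
```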
  • Steve Jobs would not release a product until it actually did what they claimed it would do. I don't understand why this is some strangely difficult lesson for CEOs to understand. I suppose with the success of Musk and his ilk that idea seems quaint.

    • Steve Jobs would not release a product until it actually did what they claimed it would do.

      You mean like when he claimed the iPhone would be all webapps?

      Let's face it, Jobs' only superpower was being a super dick to employees. This can only take you so far.

      • by MobyDisk ( 75490 )

        Actually, Apple did deliver that capability, but developers pushed back and didn't want it.

        Jobs was indeed a dick, but he did not make advertisements claiming features that do not exist.

        • Actually, Apple did deliver that capability but developers pushed-back and didn't want it.

          Right. It was a fuckup. And moreover, it was anti-developer and anti-consumer. Yet we're supposed to worship His Holy Turtleneck and address our ills with juice fasts in His name.

          • by MobyDisk ( 75490 )

            Who in this thread is worshipping Steve Jobs? Who said he has super powers? Do you just have an automated filter that finds any post that mentions Steve Jobs and then starts posting flamebait? Does it have a list of everything he ever did wrong so that you can randomly post a response? There is as much to learn from people who you hate as the people you admire.

            This is why nobody can discuss anything rationally on the internet. When someone posts something that Joe Biden did right, a troll will inevitabl

  • While I acknowledge that as a Windows user I should feel bad, it's surprising to see their AI feels the same

  • The AI scam can't go on much longer. LLMs have legitimate uses and possibilities, but nothing to justify the hype.

    So... what if Someone is pushing all the AI hysteria for other reasons. They plan to:
      1. Completely tank the economy and blame the tech sector;
      2. Get lots of nuclear power plants running again;
      3. (Hopefully not) use the growing horde of destitute tech workers to kick off a communist revolution.

    • by gweihir ( 88907 )

      While I agree that LLMs are somewhat useful in a much, much narrower scope than hyped, I am not sure the scam/hype instigators have any agenda besides get-rich-quick. Never attribute to a hidden agenda that which can be nicely attributed to greed. Or something.

  • This is really the standard expectation and MS consistently delivers. That is when they do not deliver worse quality.

  • Does anyone actually want to have a conversation with their laptop anyway? I thought those ads were selling a nonsense concept even if the technology could do it. AI being a conversational and helpful “Friend” will be the land of mental illness, outside of maybe children with missing parents or the very elderly who have no family or friends. I do not believe many people find that stuff desirable.
    • It does seem like it'd be more useful in a smart speaker than a laptop. I can't imagine an office environment being tolerable with everyone talking to their PC, just the phones and Teams calls are bad enough. One issue they'll have is that even once you learn a smart speaker's commands they have lots of trouble hearing clearly over noise, particularly with higher pitched voices. If your speaker mishears your timer request and runs a silly google search instead of starting your pasta timer immediately, it
      • Microsoft really only has those commercials to sell PCs to consumers. But I sure as hell am not going to turn Copilot loose with my credit card to book a trip for me, or to even suggest where I should go. I do not want my PC trying to trick me into thinking it is alive, and that is what they are advertising.
  • The failure was probably caused by improvements made to the engine.
