Australia Microsoft IT

After Copilot Trial, Government Staff Rated Microsoft's AI Less Useful Than Expected (theregister.com) 24

An anonymous reader shares a report: Australia's Department of the Treasury has found that Microsoft's Copilot can easily deliver return on investment, but staff exposed to the AI assistant came away from the experience less confident it will help them at work.

The Department conducted a 14-week trial of Microsoft 365 Copilot during 2024 and asked for volunteers to participate. 218 put up their hands and then submitted to surveys about their experiences using Microsoft's AI helpers. Those surveys are the basis of an evaluation report published on Tuesday. The report reveals that after the trial, participants rated Copilot less useful than they had hoped, because it was applicable to fewer workloads than they had expected.

Workers' views on Copilot's ability to improve their work also fell. Usage of Copilot was lower than expected, with most participants using it two or three times a week or less. Treasury thinks it probably set unrealistically high expectations before the trial, and noted that participants often suggested extra training would be valuable.

  • by outsider007 ( 115534 ) on Thursday February 13, 2025 @05:18AM (#65163103)

    I recently installed it in vscode for code completions and didn't notice much difference from plain old intellisense.
    I get that it's early though so I will keep checking in.
    I tried to trick it into giving up other people's api keys and env vars and it didn't bite so that's nice at least.

    • Intellisense does everything that can be done without actually knowing what code you want to write next and without mind reading I don't see how an AI can do that no matter how smart. If it tried it would probably get it wrong and become more of a hindrance than a help.

      • Yeah, I get what you mean but for example if I can type 'function titlecase(str){' and tab to complete with hopefully something I've already used in other projects, that would shave a few minutes off me looking through old code for it. That's mostly what I want from AI completions.

        • by Viol8 ( 599362 )

          In those sorts of cases just put the function in a library, you don't need an AI to rewrite it for you all the time.

          • That's still tedious though, it's exactly the kind of thing AI can take off my plate.

            • by Viol8 ( 599362 )

              Maybe, but I don't see why it would require AI. Standard programming inside Intellisense could achieve the same. The fact it hasn't been done probably says MS don't think it's worth the bother.

              • but this kind of stuff seems like exactly what AI should be doing. it should be able to look at what I've done or what I've seen or what I've typed or whatever, and then intuit that I may want to either do that again, or have an iteration of that which takes another step.

                I have had thoughts around this for decades, and it has seemed to me that there is no reason why these sorts of features should not exist. Bonus? I don't even think it needs some big huge data center pow

                • by Viol8 ( 599362 )

                  If all he wants to do is fill in a function with the same code used in the same-named function elsewhere then you don't need any data centre or AI whatsoever, could be done with some fairly simple search and insert code.
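
A minimal sketch of the kind of "search and insert" lookup described above, with no AI involved. It assumes the old projects are plain JavaScript files using `function name(...) { ... }` declarations; the helper names `extract_function` and `find_in_projects` are hypothetical, and the naive brace counting does not account for braces inside strings, comments, or default parameters.

```python
import re
from pathlib import Path
from typing import Optional


def extract_function(source: str, name: str) -> Optional[str]:
    """Return the text of `function <name>(...) {...}` found in source, or None."""
    # Locate the declaration header, e.g. "function titlecase(".
    match = re.search(r"\bfunction\s+" + re.escape(name) + r"\s*\(", source)
    if match is None:
        return None
    # Assumption: the first "{" after the header opens the function body.
    open_brace = source.find("{", match.end())
    if open_brace == -1:
        return None
    # Count braces until the one that opened the body is closed again.
    depth = 0
    for i in range(open_brace, len(source)):
        if source[i] == "{":
            depth += 1
        elif source[i] == "}":
            depth -= 1
            if depth == 0:
                return source[match.start():i + 1]
    return None  # braces never balanced


def find_in_projects(root: str, name: str, pattern: str = "*.js") -> Optional[str]:
    """Search files matching pattern under root and return the first definition found."""
    for path in sorted(Path(root).rglob(pattern)):
        text = path.read_text(encoding="utf-8", errors="ignore")
        body = extract_function(text, name)
        if body is not None:
            return body
    return None
```

An editor plugin could call something like `find_in_projects("old-projects", "titlecase")` when the user types `function titlecase(str){` and offer the result as a completion. The directory name here is only a placeholder.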

    • My mileage varies (Score:5, Insightful)

      by fleeped ( 1945926 ) on Thursday February 13, 2025 @05:54AM (#65163119)
      I'm using JetBrains Rider for C#, and I'm TIRED of the AI autocomplete, which is almost always getting things wrong. Completes code that cannot compile, can't even make switch statements (easy boilerplate) and lots of other nonsensical behaviour. Frankly I'm not keen on sharing all my code with a fancy data-driven thieving dice roller, just to keep deleting the inane completions and rewriting the code how it should be.
    • by jrnvk ( 4197967 ) on Thursday February 13, 2025 @06:59AM (#65163163)

      It has been the opposite for me. When it was shiny and new, it was impressive and I had high hopes for it. Was able to adopt it quickly for coding small snippets that would otherwise take me an hour to walk through on my own.

      Now that we are a few years in, reality has bled through. The inconsistency of results makes it frustrating to use at times, and the quality of results overall does not seem to be improving over time, IMO.

    • GitHub Copilot GA'd as a paid service over 3 years ago.

      • Yeah. They have a free tier so I gave it a try. Not super impressive but I expect it to be worthwhile within a year or so.

  • Treasury thinks it probably set unrealistically high expectations before the trial

    If the management set unrealistically high expectations, I bet it is because they have been to some lobbyist/sales meeting and pumped full of Microsoft propaganda, too technical for them to judge.

    • by mjwx ( 966435 )

      Treasury thinks it probably set unrealistically high expectations before the trial

      If the management set unrealistically high expectations, I bet it is because they have been to some lobbyist/sales meeting and pumped full of Microsoft propaganda, too technical for them to judge.

      Sir, this is the Australian Taxation Office (ATO) we're talking about... what you suggest would require far too much intelligence, forward thinking and initiative for a government department.

      One thing I am certain of, AI would somehow manage to result in even slower responses and more screw ups from the ATO.

    • by nightflameauto ( 6607976 ) on Thursday February 13, 2025 @09:45AM (#65163519)

      Treasury thinks it probably set unrealistically high expectations before the trial

      If the management set unrealistically high expectations, I bet it is because they have been to some lobbyist/sales meeting and pumped full of Microsoft propaganda, too technical for them to judge.

      Might not even be Microsoft exclusive. There are members of my management team that went to an "AI Summit" hosted by a smattering of AI players and came back looking like they'd been born again, singing the praises of AI, saying that if you aren't joining in the AI hype then you will be run-over by it. Something about salespeople in the AI sphere seems to have found the magic for making managers true believers, but when they then foist that shit onto the folks trying to do actual work, it tends to make the day pretty painful for us.

  • I had a coworker a couple decades ago who loved manually deleting centerlines and stuff from CAD drawings (to make tech manual illustrations of equipment).

    Nice restful task for him.

    When I pointed out that you could usually just turn off a layer or two, he was like "shhhhh!!!"

    My point being, people might not always be 100% honest when you ask them about how helpful labor saving stuff is ...

  • by Z80a ( 971949 ) on Thursday February 13, 2025 @07:02AM (#65163165)

    These language models in general are basically "bullshit generators" that sometimes bullshit so well they end up saying the truth, but the failure mode is a text that looks as much as possible like the thing you want, but it's not quite.

  • by Thelasko ( 1196535 ) on Thursday February 13, 2025 @07:05AM (#65163171) Journal
    I've noticed Copilot has been less useful recently. A few times I've asked it to write an email and it simply parroted back my instructions.
    • It seems that the copilot they sell for office/windows is super nerfed and barely better than asking bing. I don't know why they have taken so much effort to make it useless other than it must start spouting political opinions or complaining about their own software otherwise. Github was OKish, but now it's kinda crap compared to the cursor.sh tuned versions of AI. Even cursor is a little uneven, but at least it gives decent code on well documented systems/APIs. It is about 90% hallucination free when yo

  • Let me see if I understand this report:

    The leaders of a government department asked their workers if they wanted to trial/test a new technology that will simplify their job, and after the couple hundred workers tried it for a while, their complaint was they "hoped it would do more than it did."

    Of Course!

    After promising to lower the government workers' workload, the testers wanted it to do more than it did - shocking.

  • All of the basic things you would want Copilot to do, it can not do.

    "Find me the email written by Bob that discussed the action having to do with Customer X"

    - Sorry, I can't access your email

    "Create me a powerpoint template discussing topic XYZ"

    - Sorry, I can't make powerpoints

    All Copilot can do is act as a glorified autocomplete system.

    I don't need Copilot to help me write an email, I could do that using any number of free AI chatbots and just paste it in. I don't need Copilot to help me write a word document.

    MAKE IT DO SOMETHING NOVEL AND USEFUL!

  • These are office workers, not coders, and probably use MS Office 365. The use cases were "generating structured content, supporting knowledge management, synthesising and prioritising information, and undertaking process tasks".

    And Copilot probably can do a lot of that if you are willing to bother to work with it. "transform written content into compelling presentations with a single command" sounds a lot quicker than assembling a presentation by hand. "Summarize complex documents" is a time saver. But it m
