After Copilot Trial, Government Staff Rated Microsoft's AI Less Useful Than Expected (theregister.com) 31

Posted by msmash on Thursday February 13, 2025 @06:00AM from the closer-look dept.

An anonymous reader shares a report: Australia's Department of the Treasury has found that Microsoft's Copilot can easily deliver return on investment, but staff exposed to the AI assistant came away from the experience less confident it will help them at work.

The Department conducted a 14-week trial of Microsoft 365 Copilot during 2024 and asked for volunteers to participate. 218 put up their hands and then submitted to surveys about their experiences using Microsoft's AI helpers. Those surveys are the basis of an evaluation report published on Tuesday. The report reveals that after the trial participants rated Copilot less useful than they hoped it would be, as it was applicable to fewer workloads than they hoped would be the case.

Workers' views on Copilot's ability to improve their work also fell. Usage of Copilot was lower than expected, with most participants using it two or three times a week, or less. Treasury thinks it probably set unrealistically high expectations before the trial, and noted that participants often suggested extra training would be valuable.

This discussion has been archived. No new comments can be posted.

It's not great but it will get better (Score:3, Interesting)

by outsider007 ( 115534 ) writes: on Thursday February 13, 2025 @06:18AM (#65163103)

I recently installed it in vscode for code completions and didn't notice much difference from plain old intellisense.
I get that it's early though so I will keep checking in.
I tried to trick it into giving up other people's api keys and env vars and it didn't bite so that's nice at least.

- Not sure how it could improve (Score:3)
  
  by Viol8 ( 599362 ) writes:
  
  Intellisense does everything that can be done without actually knowing what code you want to write next, and without mind reading I don't see how an AI can do that, no matter how smart. If it tried it would probably get it wrong and become more of a hindrance than a help.
  - Re: (Score:2)
    
    by outsider007 ( 115534 ) writes:
    
    Yeah, I get what you mean but for example if I can type 'function titlecase(str){' and tab to complete with hopefully something I've already used in other projects, that would shave a few minutes off me looking through old code for it. That's mostly what I want from AI completions.
    - Re: (Score:3)
      
      by Viol8 ( 599362 ) writes:
      
      In those sorts of cases just put the function in a library, you don't need an AI to rewrite it for you all the time.
      - Re: (Score:1)
        
        by outsider007 ( 115534 ) writes:
        
        That's still tedious though, it's exactly the kind of thing AI can take off my plate.
        
        Re: (Score:2)
        
        by Viol8 ( 599362 ) writes:
        
        Maybe, but I don't see why it would require AI. Standard programming inside Intellisense could achieve the same. The fact it hasn't been done probably says MS don't think it's worth the bother.
        
        Re: Not sure how it could improve (Score:2)
        
        by Dripdry ( 1062282 ) writes:
        
        But this kind of stuff seems like exactly what AI should be doing. It should be able to look at what I've done or what I've seen or what I've typed or whatever, and then intuit that I may want to either do that again, or have an iteration of that which takes another step.
        I have had thoughts around this for decades, and it has seemed to me that there is no reason why these sorts of features should not exist. Bonus? I don't even think it needs some big huge data center pow
        
        Re: (Score:2)
        
        by Viol8 ( 599362 ) writes:
        
        If all he wants to do is fill in a function with the same code used in the same name function elsewhere then you don't need any data centre or AI whatsoever, could be done with some fairly simple search and insert code.
        
        Re: (Score:2)
        
        by bobm ( 53783 ) writes:
        
        Until you find a bug in one of the copies of the function and have to find all the other projects and fix it there. Been there, learned to just have a portable library that I keep common stuff in.
    - Re: (Score:2)
      
      by sarren1901 ( 5415506 ) writes:
      
      Without being much of a programmer myself, don't some IDEs do some of that already? Or is that new IDE with AI?
      The next comment is of course correct in that you could always make your own library for the function call. You are also correct in that if AI can handle it, why not?
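For illustration, the "keep it in a library" approach suggested in this thread might look like the sketch below. Only the name `titlecase` comes from the discussion; the body is an assumed implementation, not code posted by any commenter.

```javascript
// Hypothetical shared helper: the name is taken from the thread above,
// the implementation is an assumption for illustration only.
// In a real shared library this function would be exported from one module
// and imported by each project, so a bug fix lands in one place.
function titlecase(str) {
  // Lowercase everything, then uppercase the first character of each
  // whitespace-separated word.
  return str
    .toLowerCase()
    .replace(/(^|\s)\S/g, (match) => match.toUpperCase());
}

console.log(titlecase("hello wORLD")); // prints "Hello World"
```

Keeping a single copy like this is the point made about bug fixes: there is one definition to correct rather than one per project.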
- My mileage varies (Score:5, Insightful)
  
  by fleeped ( 1945926 ) writes: on Thursday February 13, 2025 @06:54AM (#65163119)
  
  I'm using JetBrains Rider for C#, and I'm TIRED of the AI autocomplete, which is almost always getting things wrong. Completes code that cannot compile, can't even make switch statements (easy boilerplate) and lots of other nonsensical behaviour. Frankly I'm not keen on sharing all my code with a fancy data-driven thieving dice roller, just to keep deleting the inane completions and rewriting the code how it should be
  
  - Re: (Score:2)
    
    by ihavesaxwithcollies ( 10441708 ) writes:
    
    and I'm TIRED of the AI autocomplete
    It's about worthless.
- Re: It's not great but it will get better (Score:5, Interesting)
  
  by jrnvk ( 4197967 ) writes: on Thursday February 13, 2025 @07:59AM (#65163163)
  
  It has been the opposite for me. When it was shiny and new, it was impressive and I had high hopes for it. Was able to adopt it quickly for coding small snippets that would otherwise take me an hour to walk through on my own.
  Now that we are a few years in, reality has bled through. The inconsistency of results makes it frustrating to use at times, and the quality of results overall does not seem to be improving over time, IMO.
  
- Re: (Score:2)
  
  by Kurrelgyre ( 548338 ) writes:
  
  GitHub Copilot GA'd as a paid service over 3 years ago.
  - Re: (Score:1)
    
    by outsider007 ( 115534 ) writes:
    
    Yeah. They have a free tier so I gave it a try. Not super impressive but I expect it to be worthwhile within a year or so.
Read: Management fooled by AI salesperson (Score:3)

by pipatron ( 966506 ) writes: <pipatron@gmail.com> on Thursday February 13, 2025 @06:18AM (#65163105) Homepage

Treasury thinks it probably set unrealistically high expectations before the trial
If the management set unrealistically high expectations, I bet it is because they have been to some lobbyist/sales meeting and pumped full of Microsoft propaganda, too technical for them to judge.

- Re: (Score:2)
  
  by mjwx ( 966435 ) writes:
  
  Treasury thinks it probably set unrealistically high expectations before the trial
  If the management set unrealistically high expectations, I bet it is because they have been to some lobbyist/sales meeting and pumped full of Microsoft propaganda, too technical for them to judge.
  Sir, this is the Australian Taxation Office (ATO) we're talking about... what you suggest would require far too much intelligence, forward thinking and initiative for a government department.
  
  One thing I am certain of, AI would somehow manage to result in even slower responses and more screw ups from the ATO.
- Re:Read: Management fooled by AI salesperson (Score:4, Insightful)
  
  by nightflameauto ( 6607976 ) writes: on Thursday February 13, 2025 @10:45AM (#65163519)
  
  Treasury thinks it probably set unrealistically high expectations before the trial
  If the management set unrealistically high expectations, I bet it is because they have been to some lobbyist/sales meeting and pumped full of Microsoft propaganda, too technical for them to judge.
  Might not even be Microsoft exclusive. There are members of my management team that went to an "AI Summit" hosted by a smattering of AI players and came back looking like they'd been born again, singing the praises of AI, saying that if you aren't joining in the AI hype then you will be run-over by it. Something about salespeople in the AI sphere seems to have found the magic for making managers true believers, but when they then foist that shit onto the folks trying to do actual work, it tends to make the day pretty painful for us.
  
Er, incentives anyone? (Score:3, Interesting)

by cascadingstylesheet ( 140919 ) writes: on Thursday February 13, 2025 @07:54AM (#65163161) Journal

I had a coworker a couple decades ago who loved manually deleting centerlines and stuff from CAD drawings (to make tech manual illustrations of equipment).
Nice restful task for him.
When I pointed out that you could usually just turn off a layer or two, he was like "shhhhh!!!"
My point being, people might not always be 100% honest when you ask them about how helpful labor saving stuff is ...

Bullshit generators (Score:2, Insightful)

by Z80a ( 971949 ) writes:

These language models in general are basically "bullshit generators" that sometimes bullshit so well they end up saying the truth, but the failure mode is a text that looks as much as possible like the thing you want, but it's not quite.
- Re: (Score:2)
  
  by sarren1901 ( 5415506 ) writes:
  
  That's basically what the marketing department is composed of. They are practical con-artists and because they always seem to gush confidence, surely they must be correct! Management also spent money to attend the seminar (propaganda) and therefore it must be true because we spent money! Not to mention, I'm sure the marketers are telling Management exactly what they want to hear. It's no wonder we end up with this crap.
  Sadly, our society likes bullshit. Those best at it are well rewarded. It's baffling to m
Copilot Has Regressed (Score:4, Interesting)

by Thelasko ( 1196535 ) writes: on Thursday February 13, 2025 @08:05AM (#65163171) Journal

I've noticed Copilot has been less useful recently. A few times I've asked it to write an email and it simply parroted back my instructions.

- Copilot has been nerfed for awhile (Score:2)
  
  by ccham ( 162985 ) writes:
  
  It seems that the copilot they sell for office/windows is super nerfed and barely better than asking bing. I don't know why they have taken so much effort to make it useless other than it must start spouting political opinions or complaining about their own software otherwise. Github was OKish, but now it's kinda crap compared to the cursor.sh tuned versions of AI. Even cursor is a little uneven, but at least it gives decent code on well documented systems/APIs. It is about 90% hallucination free when yo
Of Course! (Score:1)

by kenh ( 9056 ) writes:

Let me see if I understand this report:
The leaders of a government department asked their workers if they wanted to trial/test a new technology that will simplify their job, and after the couple hundred workers tried it for a while, their complaint was they "hoped it would do more than it did."
Of Course!
After promising to lower the government workers' workload the testers wanted it to do more than it did - shocking.
- Re: (Score:2)
  
  by sarren1901 ( 5415506 ) writes:
  
  IDK, that almost sounds like wishing one's self out of a job. It's nice to get functional tools that help increase productivity but let's be honest, you almost never see any of that productivity regurgitated back with better wages, benefits or anything else.
  Still, I've little doubt that the Aussie government is as bloated and inefficient as the next large government. Seems to be part of what government is.
Copilot is next to useless (Score:2, Interesting)

by brunes69 ( 86786 ) writes:

All of the basic things you would want Copilot to do, it can not do.
"Find me the email written by Bob that discussed the action having to do with Customer X"
- Sorry, I can't access your email
"Create me a powerpoint template discussing topic XYZ"
- Sorry, I can't make powerpoints
All Copilot can do is act as a glorified autocomplete system.
I don't need Copilot to help me write an email, I could do that using any number of free AI chatbots and just paste it in. I don't need Copilot to help me write a word docum
four use cases (Score:2)

by ZipNada ( 10152669 ) writes:

These are office workers, not coders, and probably use MS Office 365. The use cases were "generating structured content, supporting knowledge management, synthesising and prioritising information, and undertaking process tasks".
And Copilot probably can do a lot of that if you are willing to bother to work with it. "transform written content into compelling presentations with a single command" sounds a lot quicker than assembling a presentation by hand. "Summarize complex documents" is a time saver. But it m
Maybe I should try it (Score:1)

by gewalker ( 57809 ) writes:

Since I expect it to be worthless, it's entirely possible I won't be disappointed by the result.
Still, why would I want MS to have access to some content that I create that actually might be useful.
I think I'll pass.
Microsoft wants you to use it! (Score:1)

by RealMelancon ( 4422677 ) writes:

Even if you don't want it, they will push it down your throat.
What were they trying to do? (Score:2)

by TJHook3r ( 4699685 ) writes:

I wonder what expectations the government staff actually had and whether the study was to look at cost-saving measures ultimately...? It's funny how people find negatives in a tool that might replace them! Then again, government staff are likely looking at business processes and tasks that LLMs aren't typically trained on, unlike software devs for example, who find that AI has extensive exposure to their language of choice. I also suspect (knowing how small departments work) that there are several old-timer
