When Copilot Gets It Wrong: Handling Errors

Video Tutorial

How-to guide for handling situations when Copilot provides incorrect or unsatisfactory responses. Covers why mistakes happen, verification strategies, correcting and redirecting, and reporting issues—with special emphasis on accuracy requirements in government work.

Duration: 6:00 | February 08, 2026 | Audience: End-user

Overview

Copilot is a powerful tool, but it is not infallible. It will occasionally produce responses that contain inaccuracies, miss important details, or misinterpret your request. In government work, where accuracy directly affects policy decisions, compliance obligations, and public trust, knowing how to handle these situations is essential.

This video covers why Copilot makes mistakes, how to verify its output, techniques for correcting and redirecting, and how to report issues when they arise.

What You’ll Learn

  • Why Mistakes Happen: The types of errors Copilot produces and why
  • Verification: Strategies for checking Copilot’s output before using it
  • Correction: How to fix errors and redirect Copilot to better responses
  • Reporting: How to provide feedback and escalate patterns of issues

Script

Hook: Copilot will get things wrong

Let’s start with an important truth: Copilot will get things wrong. Not every time, not most of the time, but often enough that you need to be prepared for it.

This isn’t a reason to avoid using Copilot. It’s a reason to use it wisely. A calculator can give wrong answers if you enter the wrong numbers—that doesn’t make calculators useless. It means you check your work.

In government work, accuracy is non-negotiable. A wrong regulation number in a compliance report, an incorrect budget figure in a Congressional response, or a fabricated citation in a policy memo can have real consequences. Knowing how to handle errors is just as important as knowing how to write good prompts.

Why Copilot makes mistakes

Understanding why Copilot makes mistakes helps you anticipate and catch them. There are five primary reasons.

First, AI hallucinations. This is the most well-known issue. Copilot generates responses by predicting what text should come next based on patterns in its training. Sometimes it produces information that sounds completely plausible but is entirely fabricated—a regulation that doesn’t exist, a statistic that was never published, or a citation to a document that was never written. Hallucinations are more likely when Copilot is asked about highly specific topics where it has limited training data.

Second, outdated information. Copilot’s training data has a cutoff date, and organizational data may not be fully indexed. If you ask about recent policy changes or newly published guidance, Copilot may reference older versions or miss updates entirely.

Third, misinterpreting ambiguous prompts. If your prompt can be read in more than one way, Copilot picks an interpretation—and it may not be the one you intended. “Review the policy” could mean summarize it, critique it, or check it for errors. Ambiguity in your prompt leads to misalignment in the response.

Fourth, limitations in specialized government domains. Copilot has broad knowledge but may lack depth in niche federal processes, specific regulatory frameworks, or agency-specific procedures. It might give a generically correct answer that misses the nuances of your particular government context.

Fifth, restricted information. Copilot cannot access classified information, and it may not have access to all controlled unclassified information in your environment. If the answer requires information that Copilot can’t reach, it may fill the gap with general knowledge that doesn’t apply to your specific situation.

The important point is this: mistakes don’t mean the tool is broken. They mean the tool requires informed human oversight—which is true of every tool in your professional toolkit.

Verifying Copilot’s output

The single most important habit you can develop with Copilot is reviewing every response before you use it. Never copy and paste Copilot’s output directly into an official document, email, or presentation without reading it critically first.

Start with facts and figures. If Copilot cites a specific number—a budget amount, a percentage, a deadline—verify it against the source. Copilot may present a number confidently that is approximate, outdated, or simply wrong. Check it independently.

Check citations and references. If Copilot references a specific regulation, policy, executive order, or Microsoft Learn article, verify that it exists and says what Copilot claims it says. Hallucinated citations are one of the most common errors and one of the most damaging in government work.

Cross-reference with authoritative sources. For policy information, check the actual policy document. For regulatory references, check the Federal Register or the relevant agency guidance. For technical information about Microsoft 365, verify against Microsoft Learn documentation.
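
If your team scripts any part of this cross-checking, a quick existence check against the Federal Register's public API can catch a fabricated citation early. The sketch below is illustrative only: it assumes the search endpoint at https://www.federalregister.gov/api/v1/documents.json and its conditions[term] parameter, and the function name and example query are placeholders, so confirm the details against the official API documentation before relying on it.

    # Illustrative sketch: check whether a cited Federal Register document exists.
    # Assumes the public search endpoint and its conditions[term] parameter;
    # verify both against the official API documentation before use.
    import requests

    FR_SEARCH_URL = "https://www.federalregister.gov/api/v1/documents.json"

    def federal_register_hits(citation_text, max_results=5):
        """Search the Federal Register for a cited title and return basic match details."""
        response = requests.get(
            FR_SEARCH_URL,
            params={"conditions[term]": citation_text, "per_page": max_results},
            timeout=30,
        )
        response.raise_for_status()
        # Keep just enough detail for a human reviewer to confirm the match.
        return [
            {
                "title": doc.get("title"),
                "document_number": doc.get("document_number"),
                "publication_date": doc.get("publication_date"),
                "url": doc.get("html_url"),
            }
            for doc in response.json().get("results", [])
        ]

    # Example: no hits is a strong signal to pull the cited document yourself.
    for hit in federal_register_hits("source selection procedures"):
        print(hit["publication_date"], hit["document_number"], hit["title"])

An empty result list doesn't prove a citation is fabricated, but it is a strong prompt to locate and read the source document before you cite it.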

Pay extra attention to legal references, statistics, and proper names. These are areas where Copilot is most likely to produce plausible-sounding errors. A regulation number that’s off by one digit, a statistic attributed to the wrong report, or a person’s title that’s slightly wrong—these errors look credible at first glance and can slip through casual review.
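
One way to make that extra attention systematic is to flag every citation-like or numeric detail in a draft before it goes out. The sketch below is a minimal example, not a validator: the patterns (CFR and U.S.C. citations, dollar amounts, percentages, fiscal years) are simplified and will miss plenty, so treat the output as a checklist for human review.

    # Illustrative sketch: flag details in a Copilot draft that deserve manual verification.
    # The patterns are deliberately simple; they exist to prompt a human check,
    # not to validate anything automatically.
    import re

    VERIFY_PATTERNS = {
        "CFR citation": re.compile(r"\b\d+\s+C\.?F\.?R\.?\s+(?:Part\s+)?\d+(?:\.\d+)?", re.IGNORECASE),
        "U.S.C. citation": re.compile(r"\b\d+\s+U\.?S\.?C\.?\s+§?\s*\d+", re.IGNORECASE),
        "dollar amount": re.compile(r"\$\s?\d[\d,]*(?:\.\d+)?(?:\s*(?:thousand|million|billion))?", re.IGNORECASE),
        "percentage": re.compile(r"\b\d+(?:\.\d+)?\s?%"),
        "fiscal year": re.compile(r"\bFY\s?\d{2,4}\b", re.IGNORECASE),
    }

    def flag_for_verification(draft):
        """Return (category, matched text) pairs a reviewer should confirm against sources."""
        return [
            (label, match.group(0).strip())
            for label, pattern in VERIFY_PATTERNS.items()
            for match in pattern.finditer(draft)
        ]

    draft = "Per 48 CFR Part 15, the FY2026 request is $4.2 million, a 7% increase."
    for label, text in flag_for_verification(draft):
        print(f"VERIFY [{label}]: {text}")

Nothing this simple replaces reading the draft; it just makes it harder for a plausible-looking number to slip through a casual review.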

Adopt a “trust but verify” approach. Trust that Copilot gives you a useful starting point. Verify the details before the output leaves your desk. This approach lets you move quickly while maintaining the accuracy standards that government work demands.

For government professionals specifically, verify any output that will be used in official correspondence, submitted to oversight bodies, shared with external agencies, or included in public-facing materials. The higher the stakes, the more rigorous your review should be.

Correcting and redirecting

When you find an error in Copilot’s response, you don’t need to start over. Tell Copilot what’s wrong and give it the correct information.

Be direct and specific. “That regulation number is incorrect. The correct citation is 48 CFR Part 15, not Part 12.” Copilot will acknowledge the correction and update its response accordingly. This is more efficient than starting from scratch because Copilot retains the rest of the context and only fixes the identified error.

Provide the correct information when you have it. “The budget figure you cited is from FY2024. The current FY2026 request is $4.2 million.” This not only fixes the immediate error but helps Copilot produce more accurate content for the rest of the conversation.

Ask Copilot to regenerate with corrections. “Rewrite that paragraph using the correct regulation number and the updated budget figure I just provided.” This produces a clean version that incorporates your corrections seamlessly.

Redirect the scope when Copilot focuses on the wrong area. “Focus only on FY2025 data—don’t include any historical figures” or “I need this specifically for DoD environments, not general federal guidance.”

When correction doesn’t work after two or three attempts, rephrase the entire prompt. Sometimes the original prompt led Copilot down a path that’s difficult to redirect. A fresh, more precise prompt often produces better results than continued corrections.

Reporting issues and providing feedback

Your feedback helps improve Copilot for everyone—including other government users.

The simplest way to provide feedback is the thumbs-down button that appears next to Copilot responses. When you encounter an inaccurate or unhelpful response, click thumbs-down and provide a brief description of what went wrong. This feedback goes directly to Microsoft and is used to improve the service.

Beyond individual feedback, report issues to your organization’s IT team or Copilot administrator. They can track patterns of errors that may indicate configuration issues, missing data sources, or policy settings that need adjustment.

Document patterns of errors for your team. If you notice that Copilot consistently struggles with a particular type of request or topic, share that with colleagues so they know to verify those areas more carefully. A shared understanding of Copilot’s strengths and limitations makes the entire team more effective.
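
A shared log doesn't need to be elaborate to be useful. The sketch below is one possible approach, not a prescribed format: it appends each issue to a CSV file the team can scan for recurring topics, and the file name, columns, and example values are placeholders to adapt to your own process.

    # Minimal sketch of a shared issue log: one CSV row per problem, so the team can
    # spot recurring topics. Path, columns, and example values are placeholders.
    import csv
    from datetime import date
    from pathlib import Path

    LOG_PATH = Path("copilot_issue_log.csv")   # placeholder: use your team's shared location
    FIELDS = ["date", "topic", "error_type", "what_happened", "correction"]

    def log_issue(topic, error_type, what_happened, correction):
        """Append one issue to the shared log, writing a header row if the file is new."""
        is_new = not LOG_PATH.exists()
        with LOG_PATH.open("a", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=FIELDS)
            if is_new:
                writer.writeheader()
            writer.writerow({
                "date": date.today().isoformat(),
                "topic": topic,
                "error_type": error_type,
                "what_happened": what_happened,
                "correction": correction,
            })

    log_issue(
        topic="Acquisition policy",
        error_type="Incorrect citation",
        what_happened="Cited 48 CFR Part 12 instead of Part 15.",
        correction="Corrected in the conversation and verified against the FAR.",
    )

Over time, the topics that keep appearing in the log tell the team exactly where to verify most carefully.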

Your organization may have specific channels for reporting AI-related concerns. Check with your IT department or AI governance team for guidance on escalating issues that go beyond individual response quality—such as responses that surface information inappropriately or consistently produce inaccurate information about your agency’s policies.

Close: errors are expected, so preparation isn't optional

Every AI tool produces errors. This is a fundamental characteristic of the technology, not a flaw unique to Copilot. The question isn’t whether Copilot will make mistakes—it will. The question is whether you’re prepared to catch and correct them.

Build verification into your workflow. Make it a habit, not an afterthought. Review before you send. Check citations before you include them. Verify figures before you present them.

The value of Copilot still far outweighs the occasional mistake. It saves hours of drafting, research, and synthesis work. It surfaces information you wouldn’t have found manually. It helps you produce higher-quality output faster.

Your professional judgment is the final quality check. Copilot is the assistant. You are the expert. Use it accordingly.

Tags

GCC | GCC-High | DoD | Copilot-accuracy | Error-handling | Responsible-AI
