Managers need realistic expectations with AI

I just had a conversation with ChatGPT about all of the frustrations I've had with Copilot and its related add-ins. I pushed back on its answers several times until it produced what follows: first a mock conversation based on my own experiences, and then, when I asked for it, an article geared toward managers.


A Candid Office Exchange

Manager: “We’ve been running Copilot Analyst for a week—what’s your verdict?”
Analyst: “Frankly, it’s underwhelming. It forgets our earlier questions, fails to load Word and Excel files cleanly, and often fabricates ‘facts.’”

Manager: “Isn’t it supposed to natively support Microsoft formats?”
Analyst: “In theory, yes. But the import pipeline mangles tables, strips out images or embedded charts, and sometimes just errors out. You end up re-exporting to plain text or CSV—and losing time.”

Manager: “What about memory—shouldn’t it recall our previous conversation?”
Analyst: “Language models can only attend to a limited window of text at once. A 20-page report overwhelms the context window, so everything beyond the first few pages vanishes—unless you manually chunk and feed it again.”

Manager: “And why the wild inaccuracies?”
Analyst: “Those are called hallucinations. The AI is optimized for plausible language, not verifiable facts. Without a grounding mechanism—checking against a trusted source—it confidently reports errors.”


Why AI Stutters Today

  1. Limited Context Windows
    AI “forgets” anything beyond its token limit. Large documents must be split into smaller chunks or indexed externally, or the assistant simply drops older material (see the chunking sketch after this list).
  2. Unreliable File Ingestion
    Converting DOCX, XLSX, or PPTX into plain text often garbles formatting, tables, and embedded objects. Every failed import forces a manual workaround.
  3. Fundamental Hallucinations
    Generative models predict the next word, not the truth. Without explicit validation, they’ll invent plausible-sounding details—dates, numbers, names—that don’t exist.
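
To make the chunking idea concrete, here is a minimal sketch in Python. The 4-characters-per-token ratio is only a rough rule of thumb and the file name is made up; a real pipeline would use a proper tokenizer and split on paragraph or section boundaries.

```python
# Minimal illustration of working around a limited context window:
# split a long document into overlapping chunks that each fit a token budget.

def chunk_text(text, max_tokens=2000, overlap_tokens=200):
    chars_per_token = 4                      # rough heuristic, not a real tokenizer
    size = max_tokens * chars_per_token
    step = size - overlap_tokens * chars_per_token
    return [text[i:i + size] for i in range(0, len(text), step)]

# Hypothetical report that would overflow the assistant's context window.
with open("quarterly_report.txt", encoding="utf-8") as f:
    chunks = chunk_text(f.read())

print(f"{len(chunks)} chunks to feed to the assistant one at a time")
```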

Human-in-the-Loop: Non-Negotiable

  • Draft Generation, Not Final Copy
    Treat AI output as a first draft. It accelerates brainstorming, outline creation, and rough summaries—but never as a finished deliverable.
  • Grounding & Citations
    Always require the AI to cite its sources: “Which file, which page?” If it can’t, assume it’s unreliable and verify manually.
  • Automated Sanity Checks
    For numeric or date extractions, run quick scripts or spreadsheet formulas to compare totals, check date ranges, and flag anomalies (a short example follows this list).
  • Persistent Memory Layers
    Integrate an external memory or vector-search index so the AI retrieves past queries and project context—rather than starting from scratch each session.
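
Here is what a lightweight sanity check might look like in Python with pandas; the file name, column names, and the figures "reported by the AI" are all hypothetical.

```python
# Compare totals the assistant reported against the source spreadsheet
# and flag anything that does not line up.

import pandas as pd

source = pd.read_csv("q3_sales.csv", parse_dates=["invoice_date"])

ai_reported_total = 1_284_500.00   # figure copied from the AI summary
ai_reported_rows = 3412            # row count the AI claimed to have read

checks = {
    "revenue total matches": abs(source["revenue"].sum() - ai_reported_total) < 0.01,
    "row count matches": len(source) == ai_reported_rows,
    "all dates fall in Q3": source["invoice_date"].between("2024-07-01", "2024-09-30").all(),
}

for name, passed in checks.items():
    print(("OK  " if passed else "FLAG") + " " + name)
```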

Building a Trustworthy AI Workflow

  1. Index Your Documents
    Store every report, spreadsheet, and slide deck in a semantic vector store. When you ask a question, the AI fetches only the relevant chunks—no blind re-uploads (a retrieval sketch follows this list).
  2. Standardize to Simple Formats
    Convert critical files to PDF/A, CSV, or structured JSON before ingestion. This avoids conversion errors and preserves data integrity (a conversion sketch also follows this list).
  3. Enforce a Rigorous Prompt
    Use a system prompt such as: “You are an exacting analyst. For every fact, cite the file name and location. If you’re uncertain, answer ‘I don’t know.’”
    Prompt it to “think step by step,” so you can inspect the reasoning path.
  4. Layer in Validation
    After the AI runs, automatically cross-check its output against the source data. Any discrepancy triggers a manual review.
  5. Choose Specialized Tools
    If your work is spreadsheet-heavy, consider a dedicated data-analysis assistant (e.g. a code interpreter or a BI platform) that natively handles tabular imports.
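
The sketch below combines steps 1 and 3: retrieve the most relevant chunks for a question, then wrap them in the strict system prompt. It uses scikit-learn's TF-IDF similarity as a stand-in for a real embedding model, and the indexed chunks, file names, and question are invented for illustration.

```python
# Toy retrieval step: index document chunks, pull the best matches for a
# question, and ground the prompt in those chunks with source citations.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

chunks = [
    ("budget_2024.xlsx / Sheet1", "Q3 marketing spend was 412,000 USD, up 8% on Q2."),
    ("board_deck.pptx / slide 7", "Headcount grew from 180 to 212 during 2024."),
    ("ops_report.docx / page 3", "Warehouse downtime averaged 2.1 hours per week."),
]

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform([text for _, text in chunks])

def retrieve(question, k=2):
    """Return the k chunks most similar to the question."""
    scores = cosine_similarity(vectorizer.transform([question]), matrix)[0]
    best = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)[:k]
    return [chunks[i] for i in best]

question = "How much did we spend on marketing in Q3?"
context = "\n".join(f"[{src}] {text}" for src, text in retrieve(question))

system_prompt = (
    "You are an exacting analyst. For every fact, cite the file name and "
    "location from the context below. If you are uncertain, answer 'I don't know'.\n\n"
    f"Context:\n{context}"
)
print(system_prompt)   # this is what gets sent ahead of the user's question
```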
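
And a minimal take on step 2: flattening an Excel workbook to CSV and a Word report to plain text before the assistant ever sees them. It assumes the pandas, openpyxl, and python-docx packages are installed; the file names are hypothetical.

```python
# Convert "hard" Office formats into simple ones the assistant ingests cleanly.

import pandas as pd
from docx import Document

# Spreadsheet -> one CSV per sheet, so tables survive without formula quirks.
for sheet, frame in pd.read_excel("q3_forecast.xlsx", sheet_name=None).items():
    frame.to_csv(f"q3_forecast_{sheet}.csv", index=False)

# Word report -> plain UTF-8 text, with no embedded objects to trip over.
doc = Document("q3_commentary.docx")
with open("q3_commentary.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(p.text for p in doc.paragraphs))
```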

The Bottom Line for Managers

AI assistants today are powerful accelerators, but they’re far from plug-and-play. Expect to treat them like a highly enthusiastic—but still junior—analyst:

  • Strengths: Rapid summarization, creative suggestions, batch processing of simple tasks.
  • Weaknesses: Memory limits, file-conversion errors, and hallucinations that demand verification.

By architecting a system—combining document indexing, robust prompting, external memory, and automated checks—you move from “toy demo” to “trusted tool.” Human oversight isn’t a bug; it’s the feature that ensures accuracy, compliance, and ultimately, the credibility of your AI-driven insights.

Human here. AI isn’t going to replace people. It is going to change how we work.