The honest assessment
Researcher is the first Copilot feature since meeting recap where the honest review and the marketing slide say the same thing. Give it a real research question — “build me a brief on this company’s last six months and our history with them” — and it does what a competent junior analyst does: plans the work, runs multiple rounds of digging across the web and your tenant, cross-checks, and hands back a structured report with citations on the claims. That combination — web plus your own mail, files, and chats, in one cited document — exists nowhere else in mainstream software.
The score is a 9, not a 10, because Researcher has one structural enemy: time. It is deliberately slower than chat, and the entire product bets that users will trade minutes for quality. Many won’t. Watch a rollout and you’ll see the same two failures every time. First, the clarifying questions, Researcher’s best feature on paper, get answered with “whatever you think” by users trained on instant chat. Then people fire the question, watch the spinner for ninety seconds, conclude it’s broken, and go back to chat. A lazy answer to a scoping question doesn’t degrade the output a little; it wrecks it, because every subsequent minute of research is aimed at the wrong target.
The reports themselves are good but house-styled: a confident, consultant-flavored register with more words than your question needed. You can fight it with formatting instructions, but the default verbosity is real, and busy readers notice.
The workarounds that change the score
- Answer the clarifying questions like a brief, not a chat. When Researcher asks about scope, give it audience, length, sections, and what to exclude. Sixty seconds of real answers is the highest-ROI minute in the whole product.
- Fire and walk away. Researcher is asynchronous work, not conversation. Submit, go do something else, come back to a finished report. Teams that frame it this way (“it’s a research request, not a chat”) keep using it; teams that sit and watch the spinner stop using it.
- Pre-scope in the first prompt. Specify sections, word limit, and source mix up front, and the clarifying round gets shorter and the house style gets tamer. This is mandatory in workflows: when Researcher runs through the M365 Copilot node, nobody is there to answer questions, so the prompt has to carry the whole scope.
- Route by question type. One-fact questions go to Copilot Chat; anything you’d assign to a person for an afternoon goes to Researcher. Teach that routing rule explicitly or people burn time using a research engine as a chatbot.
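One way to make pre-scoping a habit is to assemble the prompt from a fixed set of fields, so nothing gets left for a clarifying round that may never happen. The sketch below is illustrative only: ResearchBrief and build_prompt are hypothetical names, the field labels are assumptions rather than anything Researcher requires, and the code only builds text. It calls no Copilot API.

```python
from dataclasses import dataclass, field


@dataclass
class ResearchBrief:
    """Hypothetical container for the scope a research prompt should carry."""
    question: str
    audience: str
    max_words: int
    sections: list[str]
    # Assumed default source mix: web plus tenant mail, files, and chats.
    sources: list[str] = field(
        default_factory=lambda: ["web", "mail", "files", "chats"]
    )
    exclude: list[str] = field(default_factory=list)


def build_prompt(brief: ResearchBrief) -> str:
    """Render the brief as one plain-text prompt with every scoping detail stated."""
    if not brief.sections:
        raise ValueError("list at least one section before submitting")
    lines = [
        brief.question.strip(),
        f"Audience: {brief.audience}",
        f"Length: no more than {brief.max_words} words",
        "Sections, in this order: " + "; ".join(brief.sections),
        "Sources to use: " + ", ".join(brief.sources),
    ]
    # Exclusions are optional; only mention them when there are some.
    if brief.exclude:
        lines.append("Exclude: " + "; ".join(brief.exclude))
    return "\n".join(lines)


if __name__ == "__main__":
    brief = ResearchBrief(
        question="Build me a brief on this company's last six months "
                 "and our history with them.",
        audience="account team",
        max_words=800,
        sections=["Recent news", "Our history with them", "Open risks"],
        exclude=["stock price commentary"],
    )
    print(build_prompt(brief))
```

The resulting text can be pasted into Researcher directly or used as the prompt in a workflow step. The point of the fixed fields is that a missing section list fails loudly before the request is sent, not minutes later in a mis-aimed report.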
What Microsoft won’t tell you
- The slowness is the product. Researcher takes minutes on purpose, because that’s what multi-step verification costs. But nothing in the UI sells the wait, so users read the time spent on quality as a latency failure. Your rollout messaging has to do the job the UI doesn’t.
- The clarifying-question step is where most bad reports are born, and it will never show up in any diagnostic. The tool did exactly what the lazy answer asked.
- Workflow mode quietly changes the contract: no clarifying questions, prompt-as-full-spec. Teams that learned Researcher interactively get worse results from the workflow node until someone tells them why.
Bottom line
This is the rare Copilot feature that’s better than its demo. The catch isn’t capability; it’s discipline. Users who scope properly and let it run get analyst-grade, cited reports as part of a license they already pay for. Users who treat it like fast chat get slow chat, and tell everyone it’s overrated. Train the discipline; the tool already works.