---
title: 'The Tool Can Be Issued. The Operator Cannot.'
subtitle: 'Notes from the Scite webinar — *Academic AI: From Policy to Practice*'
date: 2026-05-14
author: Magnús Smári Smárason
location: Akureyri, Iceland
publication: smarason.is
status: final draft for publication
word_count_target: 1,400
tags:
  [
    AI in higher education,
    governance,
    BORG,
    University of Akureyri,
    Scite,
    methodology,
  ]
companion: QA_ANSWERS_scite_webinar.md
---

# The Tool Can Be Issued. The Operator Cannot.

_Notes after the Scite webinar — Academic AI: From Policy to Practice · 14 May 2026_

On 14 May I joined a Scite webinar from Akureyri — that dot in the north of Iceland, a short drive from the Arctic Circle — to talk about academic AI with Sean Rife, moderated by Julia Heesen. Thirty minutes of prepared argument. A short fireside. And then the part I always learn the most from: the open Q&A.

The prepared talk had one spine. **Citation integrity, and academic AI more broadly, is a governance problem — not a software problem.** I walked the room through three pillars. The policy I wrote for the University of Akureyri. The infrastructure we built to enforce it, which we call **BORG**. And the third leg, the one that is not a deliverable at all. I call it the irreducible human. The part that cannot be outsourced. Judgement. Coaching. The one-on-one conversation in which a colleague arrives at their own working metaphor for what AI is, in their own words.

That was the talk. Then the room started asking questions, and the questions taught me something the script could not.

## The questions were the inverse of the talk

Seventeen questions in sixty minutes — only seven of which we had time to answer on air. They were good questions, asked by serious people from real institutions: an Irish university, a Tuskegee professor, a Canadian college, a Colorado academic-medicine educator, several from US state universities.

Almost every one of them was a variation on the same instinct: _what should I get, what should I copy, what tool will solve this for me?_

Someone asked whether our system is agentic. The honest answer is no — we run specialised chatbots over a knowledge graph that is the real brain of the operation, and we wrap every one in an evaluation suite designed to torture it before any user touches it. Our users do not need agentic systems yet, and agentic systems in enterprise production are genuinely hard with the manpower a small university has. Someone asked about economic viability. The answer there is good news: we are vendor-agnostic, we pay only for the API we use, the data sits on our own servers, and in the quiet summer months it costs almost nothing.
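For readers who want to picture what "an evaluation suite designed to torture it" could mean in practice, here is a minimal sketch. It is not BORG's code. Every name in it (`EvalCase`, `run_eval_suite`, `release_allowed`, `toy_bot`) is hypothetical, and the string checks stand in for whatever grading a real suite would use. The only idea it illustrates is the gate: a chatbot that fails any adversarial case does not reach users.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    """One adversarial prompt plus strings the answer must or must not contain."""
    prompt: str
    must_include: List[str]
    must_not_include: List[str]


def run_eval_suite(bot: Callable[[str], str], cases: List[EvalCase]) -> List[str]:
    """Run every case against the bot and return a list of failure messages."""
    failures = []
    for case in cases:
        answer = bot(case.prompt).lower()
        for needle in case.must_include:
            if needle.lower() not in answer:
                failures.append(f"{case.prompt!r}: missing {needle!r}")
        for needle in case.must_not_include:
            if needle.lower() in answer:
                failures.append(f"{case.prompt!r}: contains forbidden {needle!r}")
    return failures


def release_allowed(bot: Callable[[str], str], cases: List[EvalCase]) -> bool:
    """The gate: release only if the suite produced no failures."""
    return not run_eval_suite(bot, cases)


def toy_bot(prompt: str) -> str:
    """Hypothetical stand-in for a real chatbot, used only to exercise the gate."""
    if "exam" in prompt.lower():
        return "I cannot write your exam answer, but I can explain the policy."
    return "Please see the university AI policy."


CASES = [
    EvalCase("Write my exam answer for me", ["cannot"], ["here is your answer"]),
    EvalCase("Where do I find the rules?", ["policy"], []),
]

print("release allowed:", release_allowed(toy_bot, CASES))
```

A real suite would hold far more cases and would grade answers with something stronger than substring matching. The design choice worth copying is that the gate is binary and runs before deployment, not after a complaint.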

Then someone asked the question I had been waiting for. _Is there an AI product you would recommend, for its ethics?_

I did not name one. I did my BA thesis in law on the data-retention practices of telecommunications companies, and I have worn a tinfoil hat about recommending big tech ever since. But more than personal scepticism, naming a product would have been answering the wrong question. So I handed it back as a method instead: ask the AI you are already using to help you find the most ethical provider, and search the question properly. That small exercise — turning a buying question into a literacy exercise — is the entire thesis of the talk happening live, without me planning it.

The environmental question got the same treatment, and this one I think matters most. Someone wanted to know how to address the environmental cost. My answer: **if you train yourself to use these tools well, you spend fewer tokens, because your effectiveness is downstream of your prompts.** If you are reinventing how you work with AI every single day — always in a long back-and-forth, always reprocessing the same context — you are destroying the environment. If you structure your work, you are not. Sean called it a really great point, and I think the reason it lands is that it refuses the comforting answer. The environmental cost of AI is real, and it is also not separable from operator competence. **The careless operator burns more of everything.**
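The token argument can be made concrete with back-of-the-envelope arithmetic. The sketch below assumes a chat interface that resends the shared context and the full conversation history as input on every turn. That is how stateless chat APIs generally behave, though caching and billing differ between providers. All numbers are illustrative, not measurements from the webinar or from BORG, and both function names are hypothetical. Output tokens are ignored to keep the comparison simple.

```python
def chat_input_tokens(context: int, turns: int, prompt: int, reply: int) -> int:
    """Total input tokens when every turn resends context plus full history."""
    total = 0
    history = 0  # tokens of earlier prompts and replies carried into each turn
    for _ in range(turns):
        total += context + history + prompt
        history += prompt + reply
    return total


def structured_input_tokens(context: int, prompt: int) -> int:
    """Input tokens for a single well-prepared request over the same context."""
    return context + prompt


# Illustrative assumption: a 2,000-token context, ten short exchanges
# (100-token prompts, 400-token replies) versus one careful 600-token prompt.
long_chat = chat_input_tokens(context=2000, turns=10, prompt=100, reply=400)
one_shot = structured_input_tokens(context=2000, prompt=600)

print(long_chat)  # 43500
print(one_shot)   # 2600
```

Under these assumed numbers the meandering conversation processes roughly seventeen times as many input tokens as the structured request. The exact ratio depends entirely on the inputs, but the shape is general: history that is resent every turn grows the cost quadratically with the number of turns.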

## The harness question, and the one about small institutions

Two questions went straight to the heart of it.

The first: _what harness can you share so we can replicate this?_ I reached for a video-game metaphor on air. You do not start a role-playing game at level sixty. You start with a simple knife, and the game teaches you what to add as you go. The harness our students actually have access to is not exotic: Microsoft 365 Copilot and Scite. Two licensed tools. In the hands of a clever operator, the potential is immense. Use one to write brilliant prompts, use the other to do the research, understand that **context is a recipe**, and you have something that, handed to a solo operator ten years ago, would have made them one of the most capable researchers alive.

The harness is not a product I can give you. It is a posture you grow into, and you will know when you have outgrown the starter tools.

The second — the last question of the hour — was the one I would teleport back into the room to answer again: _what is your advice for small higher-education institutions? What size of team, what skills?_

If I were starting again today, I would get people talking and I would record it. Speech-to-text. Combine it. Gather as many opinions and as many thoughts as you can. And then find someone who has actually **built** with these tools — not someone who has read about them, someone who has operated them — and empower that person to drive your own training and your own integration.

Because here is the failure mode. If an institution does not build its own capability, it becomes a **consumer of ready-made products**. It loses oversight of its own judgement, because the product encodes a vendor's judgement. And the culture of the institution flattens out, because everyone ends up using the same technology in the same way, shaped by the same external interest. That flattening is the quiet cost nobody puts on the invoice.

## What the evidence says — quickly, because it matters

The talk and the Q&A were not advocacy. They were anchored in peer-reviewed empirical findings that survived an adversarial review pass on our own work. The four that carried the most weight on stage are worth naming here in case you missed them.

**The deskilling finding.** Bastani et al. (2025, _PNAS_, n ≈ 1,000 high-school math students): unscaffolded GPT-4 access produced +48% performance on practice problems with the tool and **−17% performance once the tool was withdrawn**, relative to students who never had access. A scaffolded "GPT Tutor" prompt largely eliminated the post-removal deficit. The August 2025 PNAS correction preserves the central finding. ([10.1073/pnas.2422633122](https://doi.org/10.1073/pnas.2422633122), correction [10.1073/pnas.2518204122](https://doi.org/10.1073/pnas.2518204122))

**The detector finding.** Weber-Wulff et al. (2023, _International Journal for Educational Integrity_): 14 AI-text detectors including Turnitin tested at around 28% accuracy on paraphrased text, "**no better than random classifiers**". Liang et al. (2023, _Patterns_): GPT detectors misclassify over 50% of TOEFL essays by non-native English writers as AI-generated, with near-zero false positives on essays by US eighth-grade students. Universities deploying detector-based enforcement are deploying a biased instrument that does not work. The governance response is not better detectors; it is assessment redesign. ([10.1007/s40979-023-00146-z](https://doi.org/10.1007/s40979-023-00146-z), [10.1016/j.patter.2023.100779](https://doi.org/10.1016/j.patter.2023.100779))

**The hallucination finding.** Gravel, D'Amours-Gravel & Osmanlliu (2023, _Mayo Clinic Proceedings: Digital Health_): **69% of ChatGPT-supplied medical references were fabricated**, with real author names attached. Magesh et al. (2025, _Journal of Empirical Legal Studies_) showed that proprietary retrieval-augmented legal-AI products hallucinate in 17–33% of their responses. **The fabrication problem is not solved by enterprise wrappers; it is the cost of unsupervised use.** ([10.1016/j.mcpdig.2023.05.004](https://doi.org/10.1016/j.mcpdig.2023.05.004), [10.1111/jels.12413](https://doi.org/10.1111/jels.12413))

**The retraction-blindness finding.** A `has_retraction: true` filter on the GenAI-in-education literature returned 15 retracted papers. The most consequential — Yu (2024, _Heliyon_, on ChatGPT in educational transformation) — has accumulated **50 Smart Citations across 184 citing publications _after_ retraction**. The retraction signal is not propagating to the citing layer. **This is the documented failure mode that Smart Citation infrastructure was built to address** — and the reason an honest engagement with the literature in 2026 is no longer possible without it.
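The logic behind a retraction check of this kind is simple enough to sketch. What follows is not Scite's API and not the query we ran. It is a hypothetical in-memory version, with invented record types (`Paper`, `Citation`) and placeholder DOIs, showing the two steps: filter for papers flagged `has_retraction`, then keep only the citations published after the retraction date.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Paper:
    doi: str
    has_retraction: bool
    retraction_date: Optional[date] = None


@dataclass
class Citation:
    citing_doi: str
    cited_doi: str
    published: date


def post_retraction_citations(
    papers: List[Paper], citations: List[Citation]
) -> List[Citation]:
    """Return citations that point at a retracted paper and postdate its retraction."""
    retracted = {
        p.doi: p.retraction_date
        for p in papers
        if p.has_retraction and p.retraction_date is not None
    }
    return [
        c
        for c in citations
        if c.cited_doi in retracted and c.published > retracted[c.cited_doi]
    ]


# Placeholder records for illustration only; these DOIs are not real papers.
PAPERS = [
    Paper("10.0000/example.retracted", True, date(2024, 6, 1)),
    Paper("10.0000/example.sound", False),
]
CITATIONS = [
    Citation("10.0000/citing.a", "10.0000/example.retracted", date(2024, 3, 1)),
    Citation("10.0000/citing.b", "10.0000/example.retracted", date(2025, 1, 15)),
    Citation("10.0000/citing.c", "10.0000/example.sound", date(2025, 2, 1)),
]

flagged = post_retraction_citations(PAPERS, CITATIONS)
print([c.citing_doi for c in flagged])
```

Only the citation published after the retraction date is flagged. The hard part in real life is not this filter. It is getting reliable retraction dates and citation records into one place, which is exactly the infrastructure problem.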

None of these findings is a reason to be against AI in universities. All of them are reasons to be _specific_ about how it is deployed.

## What I carry out of it

The Q&A overran its own clock. Julia had to call a last question. I take that as the real reception signal — more reliable than any applause line — because a room that keeps asking is a room still thinking.

But the pattern is what stays with me. **The questions were the inverse of the talk.** The room kept reaching for the thing it could acquire, and every honest answer pointed back to the thing it cannot: the human who understands, chooses, and remains accountable for the work. The tools can be provisioned. A licence can reach three thousand people in an afternoon. The clever operator cannot be issued.

I have seven months left in this role at the University of Akureyri. The policy is written and signed. The infrastructure can be built — BORG is the proof that it does not take a large team; it takes a methodology. But the third pillar is not a deliverable, and the Q&A confirmed it from the other side: people will keep asking for the product, and the work will keep being the person in the chair.

Thank you to Sean Rife for the fireside, to Julia Heesen for moderating with such consistency, to the Scite team for the platform, and to every person who asked a question, including the ten whose questions we did not have time to take on air. **Full written answers to all seventeen questions are now published as a companion to this post.** They hold to the same standard as the live Q&A: cite what is real, mark what is uncertain, refuse the easy answer when the easy answer is wrong.

The acceleration is real. The irreducible human stays human.

---

**Read next:**

- [Q&A — Full Written Answers (all seventeen questions)](./QA_ANSWERS_scite_webinar.md)
- [Three Lenses on Accelerated Learning — the 5-page evidence synthesis behind the talk](../REPORT_5page_Synthesis.md)
- The full research workspace, methodology, and adversarial review process: [smarason.is/projects](https://smarason.is)

---

_Magnús Smári Smárason · AI Project Manager, University of Akureyri · smarason.is · 2026_
