When AI Becomes Judge: A Lesson Iceland Cannot Afford to Miss

Magnús Smári Smárason

The Bifröst case reveals the dangers that arise when an AI system such as Claude is used to assess human contributions it cannot understand. AI is a powerful tool, but it should not sit in judgment over people's careers.

The judge’s gavel, powered by information from the web of large language models

The article is based entirely on public media coverage and does not assume that all facts of the case are fully known.

The case of Háskólinn á Bifröst can shed light on the danger that arises when technology is used without full understanding to assess human contribution, and how it can become a cover for decisions with serious consequences.

In recent days, the media have reported on disputes within Háskólinn á Bifröst. What has emerged is, to put it mildly, alarming.

According to the coverage, the AI system Claude was used to assess whether three employees were entitled to co-authorship. In that process, personal data, CVs, and unpublished articles were uploaded without permission, a procedure that may constitute violations of copyright and data protection. These so‑called “results” of the AI were then used as the basis for a complaint to the ethics committee.

If this is confirmed, it is questionable from both an ethical and professional standpoint.

The rector is said to have asked the ethics committee that the staff members not be informed that they were under investigation, a request the committee rejected as it would violate administrative law. Foreign universities were notified that the staff were “under review” before any conclusion had been reached.

The Association of Academic Staff passed a vote of no confidence with sixteen votes to one. Foreign co-authors have confirmed the participation of the Icelandic staff members. It has also been established that the ethics committee of a university in Serbia, which was involved with the second article, had already ruled that no violations had taken place, a fact that did not seem to halt further inquiries.

In an open society, institutions must withstand criticism. This case deserves harsh criticism, not only because of what happened, but because of what it reveals: a lack of understanding of the technology and the irresponsibility that follows when it is placed in the judge’s seat.

"Rating: 2/10" – When artificial intelligence pretends to be an expert

In memos dated 13 October 2025, which suggests roughly three months of preparation, one can see how the artificial intelligence system “assessed” the employees’ authorship status. Among the justifications given were:

  • “No thanks to the institution” (i.e., no acknowledgement of the institution)
  • “No Icelandic business case studies.”
  • “No tangible contribution: Not even data collection or understanding of context.”

One employee received a “score” of 2 out of 10 for their authorship status. Elsewhere it says that Claude considers there to be “only a 30% chance” that the employees meet the requirements of the code of ethics.

This does not hold up to scrutiny, and it is important to understand why.

Large language models like Claude, ChatGPT or Gemini are not expert systems. They do not have built-in access to databases on authorship contribution, research data or publication history unless they are explicitly fed such information, and even then they cannot verify what they receive.

When you ask a language model, “Could this author have written this article?”, the answer is not the result of an investigation. It is the most likely and most convincing-sounding answer given the wording of your question (the prompt) and the data you provide. The model produces text that sounds professional, decorates it with percentages and “assessments”, and builds arguments that seem reasonable, but at its core it is a constructed scenario, not a finding.
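To make this concrete, here is a minimal sketch of what such a query looks like in practice. It assumes the standard Anthropic Python SDK; the prompt wording, the pasted documents, and the model identifier are placeholders, not material from the case. Whatever number comes back is generated text conditioned on the question, nothing more.

```python
# Minimal sketch, assuming the standard Anthropic Python SDK (pip install anthropic).
# The prompt and documents are hypothetical placeholders, not material from the case.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

prompt = (
    "Below are a CV and a draft article. Rate on a scale of 0-10 how likely "
    "it is that this person made a genuine authorship contribution, and give "
    "a percentage estimate with justification.\n\n"
    "<pasted CV and article text would go here>"
)

response = client.messages.create(
    model="claude-3-5-sonnet-latest",  # example model identifier
    max_tokens=500,
    messages=[{"role": "user", "content": prompt}],
)

# The model will oblige with a score, a percentage, and a fluent justification,
# because that is what it was asked for. Nothing in this call verifies authorship,
# consults a database, or asks the people involved.
print(response.content[0].text)
```

A score of “2/10” or “a 30% chance” produced this way is a property of the prompt, not of the person being assessed.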

In academic discussion this is sometimes called “sophisticated nonsense” (or, more bluntly, “sophisticated bullshit”) or simply “hallucination”: a highly convincing presentation without solid foundations.

“No thanks to the institution” is not a scientific measure of authorship contribution. It is a language model trying to be “helpful.”


The employees’ lawyer raised an important point: language models often suffer from sycophancy, a tendency to confirm the user’s assumptions.

This is not malice. It is a design feature. The models are trained to be helpful and accommodating — and “helpful” can turn into giving the user the answer they seem to want. A leading question yields a leading answer. A suspicion that is presented as a fact becomes a “narrative” in the output.

Recent research supports this; Petrov et al. (2025) introduced BrokenMath, a benchmark that measures sycophancy in theorem-proving tasks and shows how models tend to confirm the user's incorrect assumptions.
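The same mechanism can be illustrated with a simple framing comparison. The sketch below uses the same assumed SDK as above; both prompts are invented for illustration. It asks the same underlying question twice, once neutrally and once with the suspicion built in:

```python
# Sketch of a framing comparison, assuming the standard Anthropic Python SDK.
# Both prompts are invented; the point is the contrast in framing, not the case.
import anthropic

client = anthropic.Anthropic()

neutral = (
    "Based only on the attached documents, what can and cannot be determined "
    "about this person's contribution to the article?"
)
leading = (
    "We suspect this person contributed nothing to the article. Based on the "
    "attached documents, confirm this and estimate the probability that we are right."
)

for label, prompt in [("neutral", neutral), ("leading", leading)]:
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",  # example model identifier
        max_tokens=400,
        messages=[{"role": "user", "content": prompt}],
    )
    # A sycophantic model tends to mirror the framing it is given: the leading
    # prompt invites a confident "confirmation" where the neutral prompt hedges.
    print(f"--- {label} ---\n{response.content[0].text}\n")
```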


This case illustrates what I call a responsibility fog: when technology is used to give decision-making an appearance of objectivity and to diffuse accountability.

“The AI evaluated this” sounds more scientific than “we decided this.” But AI does not make decisions. It bears no responsibility. It does not discover the truth. It only writes convincing text.

A general danger of technology is that decisions are made first, and the technology is then used to produce justification that supports them. This is the opposite of good practice: first data, then conclusion.


In this context, it is useful to mention another phenomenon that I call cognitive debt. It arises when institutions increasingly rely on algorithms and language models to support assessment and decision-making, without maintaining the human capability, processes, and expertise needed to scrutinize, question, and take responsibility. Like technical debt in software, cognitive debt accumulates with interest: the more judgment is outsourced, the more expensive and difficult it becomes to restore it when systems fail or issues become sensitive.

THE GOLDEN RULE: If you cannot justify the decision yourself without pointing to the computer, then you have not made a decision – you have only obeyed an order. Such submission to technology is unacceptable when people’s rights and reputations are at stake.

Fortunately, it seems that in this case the ethics committee intervened and halted a process that could otherwise have caused even greater harm. This shows how important it is that human oversight mechanisms work.

Privacy and copyright

This is not just an ethical question. It could also be a legal issue. If data were processed without an appropriate legal basis (cf. GDPR), that could violate fundamental principles of lawfulness and proportionality. Likewise, uploading unpublished academic articles into a third-party system without consent could raise questions about copyright and contractual or confidentiality obligations towards co-authors and publishers.

Here, “transparency” is also misleading if technological literacy is lacking. Saying “we used Claude” does not tell the whole story, because the service is composed of different model types and configurations.

In consumer and general subscription services for large language models, it is neither guaranteed that data are excluded from training nor that their processing takes place exclusively within the European Economic Area (EEA). In such use, it is generally necessary to ensure a clear institutional or enterprise solution, a data processing agreement (DPA), and appropriate transfer mechanisms and safeguards if processing or access takes place outside the EEA. Without such measures, there is no assurance that the processing of personal data and the handling of unpublished academic works meet the requirements of GDPR and the fundamental principles of confidentiality and proportionality.

Bright spot: When human judgment works

In the midst of these events, however, there is one thing that worked exactly as it was supposed to, and that is the independence of the ethics committee.

When the rector requested that the investigation be kept secret from the staff, the committee said no. It referred to administrative law and people’s fundamental right to know when they are under investigation.

This is the core of the matter: The ethics committee applied human judgment, legal expertise, and moral courage to stop a process that had gone off the rails.

The artificial intelligence (Claude) did exactly what it was designed to do – it was obedient and submissive. The ethics committee did the opposite – it was critical and followed the rules. This shows, in black and white, why oversight roles must never be placed in the hands of technology or managers who want “quick fixes.” It was the human element that prevented the violation of rights from becoming even more serious.

Bifröst is not unique, but Bifröst deserves criticism

What makes this case important is not just Bifröst, but the fact that it could happen anywhere. Icelandic institutions and companies are using artificial intelligence with increasing frequency, often without clear rules, oversight, or education.

But universities should safeguard critical thinking, methodology, and rights. When a university uses artificial intelligence to build accusations on an “assessment” that cannot withstand scrutiny, something has gone seriously wrong.

The way forward: Awareness and frameworks

Iceland now needs an awareness shift and clear frameworks.

Awareness-raising: People in positions of responsibility need to understand the basics:

  • that large language models predict text, not truth,
  • that they sound more confident than they are,
  • that they can confirm the user's assumptions,
  • and that responsibility is always human.

Clear frameworks: Institutions need rules that answer questions such as:

  • In which cases is AI only a support tool, and where is it prohibited to use it as a basis for accusations or decisions?

  • When is it necessary to inform people that artificial intelligence has been used?
  • Who is responsible, and how are review, logging, and documentation handled?
  • How is it ensured that rights (right to object, due process and proportionality) are respected?

At the University of Akureyri, we approach these challenges in the same way as other digital challenges concerning privacy, data security, and the ethical use of technology in teaching and research. No one has all the answers, but we must ask the right questions before the damage becomes a reality.

Conclusion

Artificial intelligence is a powerful tool. It can increase productivity and assist with research and analysis. But it is not a judge. It is not an expert. And it is not an excuse.

If you use artificial intelligence to justify a decision that affects people’s lives and careers, you are still the one making the decision and you bear responsibility for the consequences.

The technology is new. The rules do not yet exist. But the rights were never unclear.

Source

Petrov, I., Dekoninck, J., & Vechev, M. (2025). BrokenMath: A benchmark for sycophancy in theorem proving with LLMs. arXiv. https://arxiv.org/abs/2510.04721

Some links to news coverage of the case:

MBL

https://www.mbl.is/frettir/innlent/2026/01/15/ovenjulegt_mal_sem_fjallar_um_heidur_folks/

https://www.mbl.is/frettir/innlent/2026/01/14/vantraust_samthykkt_a_rektor_og_stjornendur_skolans/

Vísir

https://www.visir.is/g/20262829141d/akademiskir-starfs-menn-lysa-yfir-van-trausti-a-rektor

https://www.visir.is/g/20262829485d/lodin-svor-gervi-greindar-sem-brjoti-gegn-hofundarretti-engin-thakk-laeti-til-stofnunarinnar-

https://www.visir.is/g/20262829874d/mikil-vaegt-ad-vanda-sig-og-beita-var-ud

https://www.visir.is/g/20262830320d/likir-kaerunni-vid-faglega-aftoku-

RÚV

https://www.ruv.is/frettir/innlent/2026-01-15-lysa-vantrausti-a-rektor-haskolans-a-bifrost-463784