Large Language Models and Integrity
This article is an informal guideline; for a more formal description, please refer to our Plagiarism Policy.
Large Language Models (LLMs) such as ChatGPT, Bard and Copilot offer some unique ways to work and study, but they also come with some unique pitfalls.
When using an LLM during your time at Dev Academy Aotearoa, the most basic way to think about it is as a friend who knows a lot about computers, programming and technology, who is available to you 24 hours a day, and who never tires of conversation or has other commitments.
If you were curious about a programming topic, and that friend explained something to you, that would be totally fine and is encouraged.
If that friend wrote code for you and you submitted it as part of an assignment as though it were your own work, that would be inappropriate and would violate our integrity and plagiarism policies.
If you told your friend about a coding problem and they gave you a solution that you didn’t understand, you might solve the problem without learning how to solve it. Submitting solutions that you didn’t write or don’t fully understand can ruin an opportunity for learning, and it also obscures your progress, making it harder for your teachers to know whether you need help in some areas or could be stretched in others.
An important difference between a human friend and our hypothetical LLM friend is that your human friend will likely understand what is and isn’t appropriate and helpful for your study. They will usually guide you to good resources and give you the kind of assistance you need instead of handing you a solution ready to submit.
Popular LLMs don’t have the context or capacity to do that by default: if you ask them to solve a problem, their most likely output is a solution to that problem.
Through careful use, you can convince an LLM to behave more like a teacher and less like a solution factory. You can use prompts like:
- explain this step by step
- Add comments to each line of code explaining their purpose
- I don’t understand what’s happening in the if condition, what does that expression mean?
… and one that I really like to use is:
okay, I think I understand. Can you ask me a series of questions one at a time to check if I really understand the code
Hallucination
A pitfall of LLMs is hallucination.
Hallucination refers to the generation of incorrect or nonsensical information that the model presents as fact. This phenomenon occurs because LLMs generate responses based on patterns learned from the vast datasets they were trained on, rather than accessing real-time or factual databases.
There are different types of hallucinations in the context of LLMs:
- Factual Hallucinations: When an LLM confidently generates a factually incorrect statement or answer.
- Semantic Hallucinations: When the generated text is grammatically correct but semantically nonsensical or irrelevant to the context.
- Referential Hallucinations: When an LLM invents details, references, or entities that do not exist or are not relevant to the given context.
In practice, this means they will often produce answers that look and sound just as good as any other answer, but that reference facts that are not true or documents that don’t exist.
Lawyers have already gotten into trouble for using LLMs to produce documents that cite precedent from cases that don’t exist. Scholars have cited research papers that are equally imaginary.
In code, LLMs are perfectly capable of serving you nonsense, and unlike a human who doesn’t know the subject matter, an LLM will deliver that nonsense just as confidently as a good answer, and it will look just as convincing.
Consider this prompt:
write a function to calculate the julian number of a given javascript date
ChatGPT gave me this function:
function calculateJulianNumber(date) {
  const julianEpoch = 2440587.5; // Julian epoch constant
  const millisecondsPerDay = 86400000; // Number of milliseconds in a day
  const dateInMilliseconds = date.getTime();
  const julianNumber = (dateInMilliseconds / millisecondsPerDay) + julianEpoch;
  return julianNumber;
}
Amazing! A nice clean solution, with comments.
Is this correct? I can’t tell just by reading it. If I compare it to a calculator online, it doesn’t give me the same answers and I’m not sure why… should I paste this code into my program?
I think this points to a good principle:
Don’t ask an LLM a question if you won’t be able to judge the answer.
I don’t really understand how a Julian day should be calculated, so there’s no way for me to correct ChatGPT here. Instead of asking just for a solution, I could have asked it:
- What is a Julian day
- How is a Julian day calculated
It’s likely that line of questioning would eventually arrive at some code that works and, more importantly, that I can read and understand.
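As a rough sketch of what judging an answer could look like once I understood the topic: the J2000 epoch (noon UTC on 1 January 2000) is widely documented as Julian Date 2451545.0, so I could test any candidate function against that reference value before trusting it. (This reference date is my own example, not something the LLM suggested.)

// A quick sanity check against a widely documented reference value:
// noon UTC on 1 January 2000 (the J2000 epoch) is Julian Date 2451545.0
const j2000 = new Date(Date.UTC(2000, 0, 1, 12, 0, 0));
const result = calculateJulianNumber(j2000);
console.log(Math.abs(result - 2451545.0) < 1e-6 ? "matches the reference" : "doesn't match, time to dig deeper");

A single check like this doesn’t prove the function is correct, but it is exactly the kind of judgement the principle above is asking for.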