The teaching profession has lately been hard in the US, and LLMs are going to make it even harder. I reject the article's comparison with calculators: calculators are exact, and you need to know what to ask them before you can get a useful answer. LLMs, on the contrary, satisfy neither of these properties: they accept arbitrary prompts and output merely plausible answers, which may or may not be useful.

I believe the introduction of accessible LLMs will widen the divide between privileged students, who will reap the benefits of homework, and the others, who will use free LLM tools to skip it: a cheap short-term win that will end up costing them in the long run.
#education #LLM


Preparing for the Homework Apocalypse: giving assignments designed to actually be done using ChatGPT? https://www.oneusefulthing.org/p/the-homework-apocalypse

My school district has done away with homework entirely. Problem solved?
@Brian Ó I'm not sure; homework has benefits, but AI-completed homework is both useless and generates overhead for teachers, so it does seem like the least worst policy.
@Aaron Kuhn How do you "deal with it" when your school/district policy includes mandatory homework and you have no say in it? At this point it's just cruelty.
sooo…idk how to break this to you, but calculators aren’t exact. Try calculating a geometric mean using the naive approach (multiply and nth root)
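a rough Python sketch of that failure, if you're curious (the inputs are arbitrary, just big enough to overflow 64-bit floats; the log-space form is the standard fix):

```python
import math

xs = [1e300, 1e300]  # two values whose product overflows a 64-bit float

# Naive approach: multiply everything, then take the nth root.
naive = math.prod(xs) ** (1 / len(xs))  # the product is inf, so the root is inf

# Equivalent log-space form: exp of the mean of the logs stays finite.
stable = math.exp(sum(math.log(x) for x in xs) / len(xs))

print(naive)   # inf: garbage, consistently
print(stable)  # ~1e+300: the actual geometric mean
```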
@Tim Kellogg Would "accurate" be a better term for you?
you’ve now entered the gray area where it’s no longer straightforward to clearly articulate what’s wrong with LLMs. Fwiw you can dig up plenty of areas where LLMs give very accurate results or can be made to give accurate results, and plenty of areas where calculators give total garbage
@Tim Kellogg It isn't a uniformly gray area for me, though: I've stated that if you know what to ask the calculator, you will get consistent and accurate answers. For LLMs, on the other hand, there is no way to ensure the accuracy of the output in advance.

You were able to mention a specific case where calculators fall short, but I dare you to mention specific cases where LLMs always give accurate results. This is the main difference for me: calculators can be trusted to be consistent even in their shortfalls; LLMs can't be trusted to be consistent even when they actually output accurate information.
ChatGPT gives me the same result every time for this prompt, 7 times in a row. I could improve it further by lowering the temperature via the API or by adding examples:

Complete the following sentence with no explanation. Stop when you've completed it.

The first sentence of the pledge of allegiance is:
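For the curious, here's roughly what that looks like with the openai Python client; the model name is just an example, and temperature=0 only makes the output more repeatable, not guaranteed identical:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

prompt = ("Complete the following sentence with no explanation. "
          "Stop when you've completed it.\n\n"
          "The first sentence of the pledge of allegiance is:")

for _ in range(7):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": prompt}],
        temperature=0,        # reduce sampling randomness
    )
    print(resp.choices[0].message.content)
```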
to follow up on this, remember that LLMs are just neural nets, a bunch of multiply and add operations, so they really are just calculators. The only reason you get unstable runs is the "random seed", and many models (not OpenAI's, though) let you set the random seed so you get stable results, just like how a naive geometric mean can give you stable garbage. The random seed isn't core to how the LLM works; it just makes it seem more creative
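as a sketch: with an open-weights model via Hugging Face transformers (gpt2 here is just a small stand-in model), pinning the seed makes sampled runs repeatable:

```python
from transformers import pipeline, set_seed

generator = pipeline("text-generation", model="gpt2")  # small stand-in model

set_seed(42)  # same seed + same library version + same hardware -> same tokens
out = generator("The first sentence of the pledge of allegiance is:",
                max_new_tokens=20, do_sample=True)
print(out[0]["generated_text"])
```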
@Tim Kellogg But this is only true if you control the model, which isn't the case with ChatGPT. You don't get an email each time they update the model behind the free version of their tool, so you can't guarantee consistency over time.
that's an issue with all software as a service, like Excel if you use O365. And it's only true of some LLMs, specifically the proprietary ones. And even those let you control when you take changes (they offer frozen versions that are deprecated every six months or so), which is more than you can say for O365
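e.g. with OpenAI's API you can pin a dated snapshot instead of the floating alias (the snapshot name below is illustrative; they rotate over time):

```python
from openai import OpenAI

client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-4-0613",  # dated snapshot: you decide when to move off it
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)
```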
@Tim Kellogg What are you talking about? The topic is ChatGPT used by students to skip homework. I don't know what you're trying to achieve here but you're wearing down my patience.

Let me be clear: I won't be convinced that LLMs provide a positive net value overall, because of the way they can be, and have been, used to produce misinformation at scale. What I'm interested in is how bad it is going to be in specific contexts, like for screenwriters, translators, and now students.

If you think LLMs are a fine piece of technology, we will not see eye to eye no matter what inane comparison you draw with other technologies. LLMs are uniquely positioned to drag down the value of written knowledge to below zero at a global scale, which no other technology has been even remotely able to do before.

If this isn't a concern for you, it's fine, but please miss me with your defense of LLMs.
ah, sorry, I misjudged the situation

@Matthew Graybosch Here's the quote from the article that I'm basing this on:

One study of eleven years of college courses found that when students did their homework in 2008, it improved test grades for 86% of them


It's only one study, and it covers only college courses, but it confirms my own bias. I was fortunate to grow up in an environment where my parents were available to push for and help with homework, and it probably helped my grades, considering my lack of attention in class, even when I thought I'd already gotten it.

The problem is that the curriculum is too dense to be covered adequately in class. I believe homework can help with understanding or cementing knowledge that was dished out quickly in class. Homework essays sit at a particularly uncomfortable intersection: they require a lot of time, it isn't obvious or specific what knowledge or skill they're meant to train, and LLMs can complete them plausibly with ease. So I believe these will be the first to go, but then what will remain of in-person essay tests?

@Aaron Kuhn You're welcome, and I appreciate the self-awareness.
@Doug Arley Thank you for this write-up; I'm also not happy with the linked article, but it was a good basis for conversation.

I'm also not sure why anyone would be afraid of LLMs achieving perfect accuracy. I'm afraid of the opposite: that they will never reach 100% accuracy, because models are trained towards plausibility first, not accuracy. I don't even believe they can ever reach 100% accuracy, which would be superhuman anyway. But their increasing use as an authoritative source, and their window dressing as humans (through the use of first-person pronouns, for example), makes them a prime vector for leveraging and laundering popular biases.