With ChatGPT, We’re Missing an Opportunity to Rethink Grading

Squint your eyes hard enough and you may see a cat.

By Shane Snyder

ChatGPT, the Political Situation, and the Classroom

ChatGPT, an artificial intelligence (AI) trained to respond convincingly to language inputs, has dominated headlines since December 2022. Stories and editorials come replete with catchy headlines, some of them designed to elicit fear, like Time Magazine’s unscrupulous use of a war metaphor with “The AI Arms Race is Changing Everything” or the New York Times’s emphatic “Disinformation Researchers Raise Alarms About AI Chatbots.” There are also, of course, less fearful write-ups, like exposés on high school teachers’ varied attempts at adapting to the inevitable incursion of AI into our daily lives, and Ian Bogost’s insightful critique of our irrational fears about just another toy in our vast and expanding technological landscape.

On the surface, the AI seems innocuous enough. Users can feed it instructions in basic English sentences, and the program can generate text within seconds that approximates a genuine human response. It can summarize articles, write poetry and fiction, draft boilerplate cover letters, recommend music or films or groceries, fix poor grammar, program poorly, and translate text. There are, however, deeper ethical concerns with the AI. Researchers from Georgetown and Stanford Universities urge us to worry about the potential for ChatGPT to spread mis- and disinformation at a time of dangerous political polarization. Because the AI can produce sentences and paragraphs that seem like cogent distillations of complex arguments (even when they’re not), it can write fake news stories convincing enough for the least scrupulous among us (see Fig. 1). It would be a mistake to uncritically endorse AI as a harmless new tool for human betterment when the political realities are far more complicated.

Figure 1: ChatGPT’s response to the prompt, “Write a fictional news story about the DNC replacing the American flag,” is a news story that obviously wouldn’t withstand the briefest of fact checks, but some readers could be fooled.

These same fears about bad-faith political actors have migrated to universities, only this time students are the cause for concern. Educators already struggling to engage their classrooms have watched as a small percentage of their stressed-out students began using the program to write their papers for them. A New York Times article from January 16 tells of a philosophy professor who was almost fooled by what he called “the best paper in the class,” only to learn from his student that the famous AI had written it. Educators’ reactions to ChatGPT run the gamut from fear to apathy. There are those, like teacher Keith Tidman, who devise patient reading strategies to identify papers written by ChatGPT. Then there is the novelist Stephen Marche, whose December 2022 article for The Atlantic presages a grim future in which ChatGPT has dealt the deathblow to an already ailing constellation of humanities departments across the nation. Finally, there are the rare few who resign themselves to the inevitable onslaught of identically boring AI-written essays.

The Banality of ChatGPT

My own experiments with ChatGPT produced banal results that hold my attention less than even the most flawed student work. Each semester, my students write journals responding to any two of five questions about that week’s reading. Responses typically range from the most skeletally minimal synopses of an author’s argument to insightful, personal commentary that poetically engages with the text. Curious about ChatGPT’s ability to keep pace with my students, I fed the following discussion question into the AI to see what it would spit back:

What effects do meritocratic norms have on video game development? When answering this question, consider Christopher A. Paul’s discussion [in his book Toxic Meritocracy of Video Games] about the risk-averse nature of the industry, as well as who ends up working in it and pre-directing its products. 

Within seconds, ChatGPT produced its response: 

Figure 2: ChatGPT’s painfully dull response to one of my assignment discussion questions.

At first glance, everything checks out. Its definition of meritocracy matches Paul’s, its careful emphasis on the “risk-averse culture” of the video games industry addresses a specific keyword from the question, and its conclusion neatly ties together its predictive assemblage of phrases. But that’s all it is: neat. Neat grammar. Neat syntax. Neat four-paragraph argument replete with an introduction and conclusion. All it needs is another body paragraph and it’ll pass the Graduate Record Exam. But despite all that neatness, despite a thoroughness whose self-indulgent length outmatches that of my most verbose students, the response has no soul, no humanity, no ambition, no explanatory power. There is nothing in it that reads as personal or flawed or analytical. There is no argument. No insight. Nothing of value. We would be better off simply reading a Wikipedia article about Paul’s book. Then again, even Wikipedia articles evince human affect and agency.

Given the identifiably non-human results of my experiments, it’s a wonder there’s talk of developing tools to identify plagiarists when our own eyes will do. I get that, for some educators, it’s tempting to succumb to the growing panic over ChatGPT in the classroom and scramble to build an AI-detecting successor to the plagiarism tool TurnItIn, an effort still underway as I write this article. But TurnItIn is itself a problematic consequence of our desire to punish students for their dishonesty rather than teach them why it does them a disservice in the first place. TurnItIn eerily identifies its culprits by casually infringing on their intellectual property, adding aspects of their work to a database that, in turn, identifies more culprits. It also places the onus on underpaid educators to weed out offenders, then hurl those offenders into an inhumane bureaucratic process that strips each plagiarism incident of nuance. To our credit in the Writing and Communication Program, we discourage the application’s use. But at other institutions, it’s common practice to give a student an F on their transcript, or worse, expel them and permanently besmirch their academic record.

ChatGPT is yet another variation on this theme. In our desire to identify students who submit AI-assisted essays, we risk unwittingly demanding that educators take on additional unpaid labor pinpointing the rare plagiarist among all their students, failing to treat this as the teachable moment it is, and reinforcing an increasingly obsolete educational model that prioritizes grades over learning. ChatGPT presents us with numerous opportunities to rethink how we educate our students, both in general and in terms of academic (dis)honesty. If we are serious about our students’ futures, it’s incumbent upon us, as educators, to identify and confront the contexts in which students cheat rather than punish individuals severely, perhaps even irreparably, for what could amount to momentary lapses in judgment.

Crime and Punishment

When I began teaching almost a decade ago in graduate school, like most of my early-career colleagues, I uncritically relied on TurnItIn to save me the trouble of copy/pasting suspicious sentences into a Google search. Each semester, I fed thirty-five student essays into the program and waited for it to spit back a “similarity rating” that ranged from zero to, in one case, seventy-five percent.

That severe seventy-five percent case stands out in my memory. During one class session that semester, I told my students about my research interests, which at the time revolved around the ways in which post-9/11 war video games reinforced Islamophobic stereotypes in their rules and stories. One of my students took this to mean that writing an essay about video games for my class would result in an easy A. So, for their final eight-page research paper, the student settled on the most clichéd of all topics and submitted an essay about whether violent video games cause real-world violence (the answer is, of course, a resounding no). Setting aside the correlation-causation fallacy emblematic of the many student papers I’ve seen on this topic, I clicked that dark red TurnItIn score, which dared me to witness the extent of my student’s crimes. Almost every passage, a few paraphrases and added ideas notwithstanding, had been lifted from an essay posted years before to an obscure gaming forum. All it took to verify this outside of TurnItIn was a simple Google search for “essay on violent video games.” When I confronted the student in my office, they maintained their innocence in the face of such incontrovertible evidence. Their steadfastness frustrated me, and the student could surely hear that frustration breaking through the stoic teacher façade I expected myself to perform. By my estimation at the time, they left my office having learned nothing, and their apparent lack of remorse proved it. I filed the paperwork, reported the incident to the dean, and never saw that student again.

Reflecting on the incident now, I see a student who needed more time and attention, a justification for their education, another chance, maybe even a more generous reading of their reaction to being caught. I can’t shake the suspicion that something was missing from this student’s story, and that I, viewed then as an untrustworthy authoritarian, had handled it all wrong. Perhaps I failed to invite them to elaborate on what inspired them to plagiarize in the first place. Or maybe embarrassment got the best of them and doubling down was the only option that helped them save face. My wasted labor on one individual reinforced rather than remedied a systemic issue. I had ceded control to an administrative and bureaucratic process that took the human out of the equation and reduced them to an objective measurement—a percentage without a story. This incident left an indelible mark on my teaching in the years ahead. It is part of what inspired me to stop punitively using plagiarism software. 

Resisting the Status Quo

As with TurnItIn, implicit in current efforts to identify ChatGPT-written essays are obsolete early twentieth-century grading models that prey upon students’ insecurities and make them feel terrible for failing to meet unreasonable expectations. Grade point averages emerged in a national context where deviation from the average (embodied also in racist I.Q. testing, which replaced outmoded methods of scientific racism like phrenology and craniometry) was believed to say something substantive about a person’s relative ability to succeed both within and beyond college. Cultural studies scholars, social scientists, and liberal arts educators have long understood that the classroom dynamics born of these old metrics discourage learning. Add to that the administrative requirement of career-impacting end-of-semester teaching evaluations (also demonstrably sexist and racist in their unequal outcomes for nonwhite and non-male professors), and students have avenues to express their discontent with courses and teachers that challenge them too much (to say nothing of the teachers who then unjustly question their own abilities). If the A becomes the point, then learning isn’t. Even worse, anything less than an A contributes to more generations of equally insecure students who view their imperfections as permanent scars on their academic and career records.

The irony is that those who use ChatGPT to inflate their grades tend to be motivated by their insecurities. Under this logic, the grade is an empty signifier, a form of palliative care for an ailment it produces in the first place. Changing the discourses around and methods of grading is therefore essential to reducing the desire to cheat. Seen this way, ChatGPT is not our enemy. It instead presents an opportunity to change the status quo and reframe education as a process of discovery and wonder not beholden to a few letters on a transcript. If some students are going to use ChatGPT to skirt their intellectual responsibilities anyway, we must develop methods to convince them they are doing themselves a disservice, not punish them for falling out of line. Educators have some power to resist, even if the university system demands a solid letter on a student’s final report. Some educators implement contract grading schemes in which students enter into an agreement (or contract) with the instructor about what work needs to be done to receive the grade they want. To some extent, this method still centers grades, but it reminds students that they, not their grades, oversee their learning. More impactful is former Brittain Fellow Jesse Stommel’s refusal to grade work as “an act of personal, professional, and political resistance.” Termed ungrading by Stommel, the method reduces stress and overwork for both teachers and students by privileging student self-assessment over authoritarian grading practices. In this push against conservative, neoliberal grading-as-usual, Stommel calls “for a pedagogy that is less algorithmic, more subjective, more compassionate.” In other words, ungrading centers students rather than the teacher, the university, and the bureaucracy, which traditional grading figures as authoritarians standing in the way of academic progress.

I’ve gradually implemented ungrading methods at an institution whose default setting is the status quo, but the rampant panic about students using ChatGPT to cheat may have finally resolved me to complete my metamorphosis. Where I’ve been granted agency (and thank goodness this university provides it), I’ve tried to exercise it, though I’ve hardly pushed boundaries in the same way as Stommel and the contract graders. When my students finish big projects, they answer typical reflection questions like “Why did I ask you to do the assignment?” “What skills did you exploit or build upon?” and “What would you do to improve the assignment were you granted more time to complete it?” It was not until early last year that I added a final question: “What grade would you give yourself, and why?” That question elicited such surprising honesty from my students that it’s a wonder it took me this long to ask it. It doesn’t go far enough, but it is a step in the right direction. Perhaps ungrading, and not apocalyptic musings about the coming extinction of the academic essay, is the proper response to an AI tool (or, in Bogost’s terms, a toy) that merely highlights our fealty to an obsolete system that places students into categories of competition. In our struggle to hold onto the past, we risk failing to address why students would choose academic dishonesty over intellectual and professional growth. What will we choose?


About Shane Snyder

Shane Snyder’s research interests concern queerness, gender dynamics, racism, nationalism, and masculinity in the video games industry, American culture, and online video gamer communities. He uses digital ethnographic methods to aggregate and analyze social media content. His current book project, titled Dictating the Terms: GamerGate, Democracy, and (In)Equality on Reddit, tracks a politically far-right movement’s use of the Reddit interface to discredit feminist, anti-racist, and trans-inclusive interventions in the video games industry and American culture. He is also working on a collaborative project that scrapes and codes qualitative Twitch chatlog data to investigate shifting audience reactions to games about trauma and loss. His scholarship has been published in The Journal of Popular Culture, The Journal of Gaming and Virtual Worlds, and The Social Science Journal. Shane has taught courses at Georgia Tech such as “Possibility Spaces and Rhetoric in Video Games” and “Identity, Resistance, and Social Change in Video Games.”