Detecting AI Text On Your Laptop? It’s Possible

One of the things that AI doesn’t have that humans have in abundance is fingerprints.

Researchers at Northeastern University used the unique fingerprints of human writing — word choice variety, complex sentences and inconsistent punctuation — to develop a tool to sniff out AI-generated text.

“Just like how everyone has a distinct way of speaking, we all have patterns in how we write,” says Sohni Rais, a graduate student in information systems at Northeastern and a researcher on the project. In order to distinguish between human writing and AI text, she says, “we just need to spot the telltale patterns in writing style.”

AI text detection typically requires substantial computing power in the form of neural network transformers, says Rais, because these approaches analyze every letter, word and phrase in extreme detail. But that level of analysis isn’t necessary to distinguish human writing from AI-generated text, Northeastern researchers say. In fact, the technically “lightweight” tool Rais helped develop can run on a regular laptop and is 97 percent accurate.

“We are not the first in the world who develop detectors,” says Sergey Aityan, teaching professor in Northeastern’s Multidisciplinary Graduate Engineering Program on the Oakland campus. “But our solution requires between 20 and 100 times less computer power to do the same job.”

Existing AI-text detection services, including ZeroGPT, Originality and AI Detector, train large language models to analyze each word. Text entered into these tools is processed by proprietary algorithms trained on large datasets and powered by transformers.

The lightweight tool can be trained by the user and run entirely on their laptop, offering security and customization advantages.

“Either you don’t want your secret information to go somewhere beyond your laptop,” says Aityan, “or you are a professor and you want to catch your students cheating, so you train your own dataset based on specific texts.” 

Instead of using transformers, the lightweight approach relies on 68 stylometric features — or “writing fingerprints,” as Rais calls them — that make each person’s writing distinctive. These features include sentence complexity.

While AI agents tend to write at a very consistent reading level, humans naturally vary, she says. 

“We might write simply when texting a friend but more formally in an email to our boss,” Rais says.
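That variation can be measured. The sketch below is illustrative only, not the Northeastern tool: it scores each sentence with the standard Flesch reading-ease formula (using a rough vowel-group syllable heuristic) and reports how much the score swings from sentence to sentence. Very uniform scores would be one weak hint of machine-generated text.

```python
import re
import statistics

def sentences(text):
    """Naively split text into sentences on ., ! and ?."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]

def words(sentence):
    return re.findall(r"[A-Za-z']+", sentence)

def syllable_count(word):
    """Rough syllable estimate: count vowel groups (illustrative heuristic)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(sentence):
    """Flesch reading ease applied to a single sentence."""
    w = words(sentence)
    if not w:
        return 0.0
    syllables = sum(syllable_count(word) for word in w)
    return 206.835 - 1.015 * len(w) - 84.6 * (syllables / len(w))

def reading_level_spread(text):
    """Standard deviation of per-sentence reading ease.

    Human writing tends to swing between simple and complex sentences;
    a near-zero spread means every sentence reads at the same level.
    """
    scores = [flesch_reading_ease(s) for s in sentences(text)]
    return statistics.pstdev(scores) if len(scores) > 1 else 0.0
```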

The tool also looks at word variety, which humans naturally mix up. 

“We might say ‘happy,’ then ‘glad,’ then ‘pleased,’” Rais says. “AI often gets stuck using the same words repeatedly despite knowing many synonyms.”
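Word variety can be captured with equally simple counts. The snippet below is a sketch under assumed definitions, not the published feature set: it computes a type-token ratio (distinct words divided by total words) and the share of the text taken up by the single most repeated word, both of which drop when a writer keeps reusing the same terms.

```python
import re
from collections import Counter

def lexical_variety(text):
    """Two simple word-variety signals (illustrative stand-ins).

    - type_token_ratio: distinct words / total words; repetitive text scores lower.
    - top_word_share: fraction of all words accounted for by the most repeated word.
    """
    tokens = [t.lower() for t in re.findall(r"[A-Za-z']+", text)]
    if not tokens:
        return {"type_token_ratio": 0.0, "top_word_share": 0.0}
    counts = Counter(tokens)
    return {
        "type_token_ratio": len(counts) / len(tokens),
        "top_word_share": counts.most_common(1)[0][1] / len(tokens),
    }

print(lexical_variety("I was happy, then glad, then pleased with the result."))
```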

It also looks at how far apart related words are in a sentence, she says. For instance, in “the cat that I saw yesterday was orange,” the subject (cat) and the verb (was) are separated by five words. Sentences generated by AI, Rais says, maintain consistent distances of two or three words between subjects and verbs.
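One common way to measure that separation, shown in the sketch below, is with a dependency parse; the article does not say which parser the researchers used, so the choice of spaCy here is an assumption for illustration.

```python
import statistics
import spacy

# Requires the small English model: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

def subject_verb_distances(text):
    """Token distance between each subject and the verb it attaches to.

    In "the cat that I saw yesterday was orange", the subject "cat"
    sits five tokens before its verb "was". The spread of these
    distances is one stylometric signal; this is an illustration,
    not the tool described in the article.
    """
    doc = nlp(text)
    return [
        abs(token.head.i - token.i)
        for token in doc
        if token.dep_ in ("nsubj", "nsubjpass")
    ]

dists = subject_verb_distances("The cat that I saw yesterday was orange. It purred.")
print(dists, statistics.mean(dists))
```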

Instead of looking at every single word, the lightweight approach looks for the most relevant clues.

“It’s like taking a person’s vital signs at the doctor,” she says. “Instead of running every possible test, we measure key indicators like temperature, blood pressure and heart rate that tell us what we need to know.”
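To make the “vital signs” idea concrete, a handful of stylometric numbers can feed a small classical classifier instead of a transformer, which is why such a detector fits on a laptop. The following is a toy sketch only: the three features, the made-up training texts and the choice of logistic regression are assumptions for illustration, not the published method.

```python
import re
import statistics
from sklearn.linear_model import LogisticRegression

def extract_features(text):
    """A toy three-number 'vital signs' vector: average sentence length,
    sentence-length spread and type-token ratio. The actual tool uses
    68 stylometric features; these are illustrative stand-ins."""
    sents = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(re.findall(r"[A-Za-z']+", s)) for s in sents]
    tokens = [t.lower() for t in re.findall(r"[A-Za-z']+", text)]
    return [
        statistics.mean(lengths) if lengths else 0.0,
        statistics.pstdev(lengths) if len(lengths) > 1 else 0.0,
        len(set(tokens)) / len(tokens) if tokens else 0.0,
    ]

# Hypothetical labeled samples: 1 = human-written, 0 = AI-generated.
human_texts = [
    "Honestly? I loved it. The ending dragged a bit, but wow, what a ride.",
    "We met at noon. Traffic was awful so I walked, which turned out fine.",
]
ai_texts = [
    "The project was completed successfully. The results were positive. The team was satisfied.",
    "The meeting was productive. The agenda was followed. The outcomes were documented.",
]

X = [extract_features(t) for t in human_texts + ai_texts]
y = [1] * len(human_texts) + [0] * len(ai_texts)

# A small linear model over a few features trains in milliseconds on a laptop.
clf = LogisticRegression().fit(X, y)
print(clf.predict([extract_features("The report was finished. The data was reviewed.")]))
```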

The work to develop ways of detecting AI-generated text isn’t over, says Aityan. It is the nature of AI-based systems, however, to learn and improve, he says. As soon as people developed the technology to generate AI text, he says, the technology to detect it followed. And shortly after that, he says, came so-called humanization algorithms to make AI-generated text sound more natural.

“It’s an ongoing battle,” he says.
