The Number Students Panic About vs. the Number That Actually Gets Them in Trouble
When a Turnitin report comes back, most students fixate on the similarity percentage - the big colored number sitting at the top of the screen. Many of them misread what that number means. The AI score, meanwhile, is the one flying under the radar because students often cannot see it at all.
These two scores measure completely different things. They are generated by separate systems. They can move in opposite directions simultaneously. And the consequences of each one are handled differently by institutions worldwide. Getting them confused is one of the most common - and most costly - mistakes students make before submission.
This guide breaks down exactly what each score measures, what the numbers actually mean, how they interact (and how they do not), and what you should do before you submit.
What the Similarity Score Actually Measures
The similarity score is a text-matching number. It compares your submitted document against a database that includes billions of current and archived web pages, thousands of academic journals and publications, and a repository of papers previously submitted through Turnitin. When it finds text that matches something in that database, it highlights it.
That is all it does. It does not detect plagiarism. It detects overlap.
This distinction matters enormously. A properly cited direct quote will appear as a match. Your bibliography will appear as a match. Common disciplinary phrases - "statistically significant difference," "the results indicate," "as previously discussed" - will all appear as matches because thousands of students have written those exact phrases before. Even your own name in the header can generate a match if you have submitted work through Turnitin before.
According to Turnitin's own official guidance, the similarity score simply highlights matched text so educators can review it in context - it is not a plagiarism verdict on its own. The score ranges from 0% to 100%, and the color coding follows a standard pattern in most Feedback Studio integrations: blue (0%), green (1-24%), yellow (25-49%), orange (50-74%), and red (75-100%).
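The color bands above amount to a simple lookup. A minimal sketch, assuming the ranges listed in this article and a hypothetical function name (this is not Turnitin code):

```python
def similarity_band(percent: float) -> str:
    """Return the color band for a similarity percentage between 0 and 100.

    The ranges are the ones listed in this article; the function name is
    made up for illustration.
    """
    if not 0 <= percent <= 100:
        raise ValueError("similarity must be between 0 and 100")
    if percent == 0:
        return "blue"      # no matching text
    if percent < 25:
        return "green"     # 1-24%
    if percent < 50:
        return "yellow"    # 25-49%
    if percent < 75:
        return "orange"    # 50-74%
    return "red"           # 75-100%


print(similarity_band(15))  # green
print(similarity_band(28))  # yellow
```

The band only tells you how much text matched. It says nothing about whether those matches are cited quotes or uncited passages.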
A 0% score is not automatically good news. If an assignment required you to use sources and your similarity score comes back at 0%, that might raise more eyebrows than a 15% score would - because it suggests you either did not engage with any outside research or, more concerning, that the writing cannot be traced to anything in the database at all.
The question an instructor asks when reviewing the similarity report is not "is this number above X%?" It is "what is actually being matched, and why?" A 28% similarity score made up entirely of properly cited and quoted material is far safer than a 12% score where the matches cluster in the middle of your argument paragraphs with no citations nearby.
What the AI Score Actually Measures
The AI score operates on a completely different principle. It does not check your text against any database of sources. It does not care whether anything you wrote has appeared online before. Instead, it analyzes the statistical patterns of your prose to estimate how much of it was likely generated by a large language model.
The percentage generated by Turnitin's AI writing detection model is explicitly described as different from and independent of the similarity score. The two scores live in the same interface but they are produced by entirely separate systems and they measure entirely different things.
Here is how the AI detection model works mechanically: your submission is broken into overlapping segments of roughly five to ten sentences each, so that every sentence is evaluated in context. The model then gives each sentence a score between 0 and 1 for how closely it matches the statistical patterns of AI-generated text. Those scores are aggregated into an overall document-level percentage. A sentence that scores close to 1 is one the model believes was almost certainly written by an AI. A sentence that scores near 0 is one the model believes was written by a human.
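To make the aggregation step concrete, here is a toy sketch. Turnitin does not publish its exact aggregation rule, so the 0.5 cutoff and the "count the flagged sentences" approach are assumptions for illustration only, as are the function name and the sample scores.

```python
def document_ai_percentage(sentence_scores, cutoff=0.5):
    """Percentage of sentences whose score is at or above `cutoff`.

    Assumption: a sentence counts as "AI-like" when its score reaches the
    cutoff. The real model's aggregation rule is not public.
    """
    if not sentence_scores:
        return 0.0
    flagged = sum(1 for score in sentence_scores if score >= cutoff)
    return 100.0 * flagged / len(sentence_scores)


# Made-up per-sentence scores for an eight-sentence passage.
scores = [0.05, 0.10, 0.92, 0.88, 0.97, 0.15, 0.08, 0.91]
print(document_ai_percentage(scores))  # 50.0 (4 of 8 sentences flagged)
```

The point of the sketch is that the document-level number is a summary of many small judgments, so a handful of borderline sentences can shift it noticeably in a short paper.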
There is a critical threshold built into this system: Turnitin does not display a numerical AI score for documents where the detected AI content falls below 20% of the total text. Instead, it shows an asterisk. This design choice reflects something Turnitin has openly acknowledged - there is a higher incidence of false positives when the percentage is between 0 and 19%. Below that threshold, the score is treated as too unreliable to report numerically.
For a paper to generate any AI Writing Report at all, the submission must be at least 300 words of qualifying prose - meaning actual paragraph-level writing, not tables, code, or lists.
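The two display rules described above (the 300-word minimum and the asterisk below 20%) can be sketched as a small function. The function name and the returned strings are hypothetical; only the two thresholds come from the text.

```python
MIN_QUALIFYING_WORDS = 300   # minimum prose length stated in this article
DISPLAY_THRESHOLD = 20       # below this, no number is shown


def ai_indicator(qualifying_words: int, ai_percent: float) -> str:
    """Return what an instructor would see, per the rules described above."""
    if qualifying_words < MIN_QUALIFYING_WORDS:
        return "no report"   # too little prose to generate an AI Writing Report
    if ai_percent < DISPLAY_THRESHOLD:
        return "*%"          # treated as too unreliable to report as a number
    return f"{round(ai_percent)}%"


print(ai_indicator(250, 90))   # no report
print(ai_indicator(1200, 12))  # *%
print(ai_indicator(1200, 84))  # 84%
```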
The Visibility Gap Students Do Not Know About
This is the part that trips people up most: the similarity score is visible to both students and instructors. The AI writing score, by default, is visible only to instructors and administrators.
Students submitting an assignment cannot see their own AI score. They get the similarity percentage. Their instructor, meanwhile, opens a separate AI Writing Report that includes the overall AI percentage and highlights the specific passages the model believes were generated by AI - with cyan highlighting for likely AI-generated text and purple for text that appears to have been AI-generated and then run through a paraphrasing tool.
This asymmetry has real consequences. A student who spends time worrying about their 18% similarity score may be unaware that their AI score is showing 84% to the instructor sitting across from them. Conversely, a student who sanitized their paper with light paraphrasing might feel safe, not knowing the detector is flagging exactly those paraphrased sections in purple.
Some institutions and instructors choose to share AI reports with students, but this is not the default behavior. The safest approach is to treat the AI score as something that is always being evaluated, even when you cannot see it yourself.
Why the Two Scores Are Completely Independent
This independence is the most important concept in this entire article. The similarity score and AI score do not influence each other. A clean similarity score says nothing about your AI score, and a high AI score says nothing about your similarity score.
Consider the four possible combinations:
Low similarity, low AI: The cleanest outcome. Original writing with minimal source overlap. Most human-written essays with proper paraphrasing land here.
High similarity, low AI: A typical citation-heavy paper. Lots of direct quotes, a lengthy bibliography, or heavily source-reliant content like a literature review. The overlap is trackable to sources, not to AI patterns. This is usually not an AI concern at all - it is a question of whether citations and quotes are handled correctly.
Low similarity, high AI: The most common scenario for AI-generated papers. The text does not match anything in Turnitin's database because AI output is newly generated phrasing that has never been published anywhere. But the writing patterns - the probability distributions of word choices, sentence structures, and transitions - look unmistakably like LLM output. This is the scenario that surprises students who assume a low similarity score means they are safe.
High similarity, high AI: The most problematic case. Text that both matches sources and exhibits AI writing patterns. This might happen when AI is used to generate content that heavily incorporates material from the training data, or when a student pastes AI output that happens to echo published text closely.
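The four combinations can be written as a two-by-two lookup. In this sketch the cutoffs that separate "low" from "high" (25% for similarity, 20% for AI) are placeholder assumptions, not official thresholds, and the function name is hypothetical. The review notes paraphrase the descriptions above.

```python
# What each combination mainly calls for, paraphrased from the list above.
REVIEW_FOCUS = {
    ("low", "low"): "cleanest outcome",
    ("high", "low"): "check that quotes and citations are handled correctly",
    ("low", "high"): "writing patterns look AI-like despite no source matches",
    ("high", "high"): "both source overlap and AI patterns need review",
}


def score_combination(similarity, ai, sim_cutoff=25.0, ai_cutoff=20.0):
    """Classify a (similarity, AI) pair; each score is judged on its own."""
    sim_level = "high" if similarity >= sim_cutoff else "low"
    ai_level = "high" if ai >= ai_cutoff else "low"
    label = f"{sim_level} similarity, {ai_level} AI"
    return label, REVIEW_FOCUS[(sim_level, ai_level)]


print(score_combination(4, 84)[0])   # low similarity, high AI
print(score_combination(35, 0)[0])   # high similarity, low AI
```

Note that neither branch reads the other score: the independence is visible in the code.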
The takeaway is direct: checking only your similarity score before submission is like checking only your fuel gauge when driving - a real number about your trip, but not the one that will get you pulled over.
The False Positive Problem - and Who It Hits Hardest
Turnitin states a document-level false positive rate of approximately 1% for fully human-written papers, counting a paper as flagged when more than 20% of its text is marked as AI-like. In other words, roughly 1 in 100 papers written entirely by humans may be flagged as AI-generated under that threshold.
That sounds low until you consider the scale. Turnitin processes hundreds of millions of submissions. Even a 1% false positive rate translates into a large absolute number of incorrectly flagged students. And independent analysis suggests the false positive rate rises for specific groups - most notably students writing in English as a second language, where formal academic register and disciplinary phrasing can pattern-match more closely to AI output.
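The scale argument is plain arithmetic. In the sketch below, the submission counts are hypothetical round numbers chosen only to show the multiplication. The 1% rate is the figure quoted above, and it applies to human-written documents, not to every submission.

```python
def expected_false_flags(human_written_docs: int, false_positive_rate: float = 0.01) -> int:
    """Expected number of human-written documents flagged by mistake."""
    return round(human_written_docs * false_positive_rate)


# Hypothetical volumes, for illustration only.
for docs in (10_000, 1_000_000, 100_000_000):
    print(f"{docs:>11,} human-written papers -> {expected_false_flags(docs):,} false flags")
```

At 100 million human-written papers, a 1% rate works out to about one million incorrect flags, which is why a small percentage still matters.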
Certain writing types are disproportionately vulnerable to false positives. Technical reports, lab write-ups, legal analysis, standardized methodology sections, and any writing with highly formulaic structure all look statistically more similar to AI output than a casual personal essay would. A student writing a pharmacology report using the mandated IMRAD format is not doing anything wrong, but parts of that submission may score higher on AI probability simply because the format constrains the language so heavily.
Turnitin's own guidance acknowledges this directly - the AI writing score may not always be accurate, and it should not be used as the sole basis for adverse actions against a student. It takes further scrutiny and human judgment in conjunction with institutional academic policies to determine whether academic misconduct has actually occurred.
If you receive a false positive accusation, your best tools are process artifacts: Google Docs version history, timestamped drafts, handwritten notes, and research annotations. Turnitin explicitly does not make determinations of misconduct - only instructors do, and instructors are required to apply professional judgment, not just report the number.
How Turnitin Detects AI Bypassing Tools
An important development that most students are not aware of: Turnitin has updated its AI writing detection to include detection of AI bypasser tools - also known as humanizers. These are tools that take AI-generated text and rewrite it to appear more human-like, with the explicit goal of evading AI detection.
Turnitin's current AI detection is powered by a combination of three models working together. The first detects likely AI-written text. The second detects AI-written text that was then processed through an AI paraphrasing tool. The third - added more recently - detects AI-generated content that was subsequently modified by a bypasser or humanizer tool.
In the AI Writing Report, text flagged by the bypasser model falls under the same "AI-generated" category as directly generated text. Instructors see the overall percentage, but the system is now trained to recognize not just raw AI output but also text that has been algorithmically manipulated to avoid detection.
The bypasser detection currently supports English-language submissions only. Turnitin has been transparent that its model is updated regularly and that what evades detection today may not evade it in the future - the model is retrained to keep pace with new LLMs and new bypassing techniques.
What a "Safe" Score Actually Looks Like for Each Metric
For similarity scores, there is no universal safe number. Turnitin itself states this explicitly - there is no fixed percentage to aim for. What is acceptable depends on assignment type, discipline, instructor policy, and what the matched text actually is. Many institutions treat scores above roughly 20% as a reason to review more closely, but a 35% similarity score on a literature review with properly cited block quotes is not the same problem as a 12% score where every flagged passage sits in your own argument with no attribution.
For AI scores, the threshold question is similarly institution-dependent. Some universities set their trigger for formal investigation at 15%, others at 40%. Turnitin only displays a numerical score when the detected AI content reaches 20% of the document. Below that, you get the asterisk - a signal that some AI patterns were detected but the confidence level was too low to report a number.
The most useful frame is not "what score is safe" but "what does each score require me to do." For similarity, it requires you to check whether the matched text is properly cited and contextualized. For AI, it requires you to consider whether your writing process is defensible and whether your submitted text genuinely reflects your own thinking.
How to Self-Check Before Submission
The institutional Turnitin tool is not available for student self-checking in most setups. You submit, the report generates, and whatever it says is what your instructor sees. That is a significant information disadvantage.
There are practical ways to address this. For AI risk specifically, running your draft through a dedicated AI detection checker before submission gives you a read on the signals the text is emitting. Tools like EssayCloak's AI Detection Checker score text for AI signals before it goes anywhere near your institution's submission portal - giving you a chance to identify and address problem sections before they become a conversation with your instructor.
For similarity, the most effective pre-submission steps are: exclude your bibliography before looking at the raw percentage (many institutions do this automatically), check whether the matched passages are properly cited, and look at where matches cluster. Isolated small matches across dozens of sources are typically fine. A block of text matching 8% of your paper from a single student submission is a different conversation entirely.
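The "look at where matches cluster" step can be sketched as grouping matched words by source and finding the largest single contributor. The input format here (a list of source name and matched word count pairs) is a hypothetical stand-in for what you would read off a similarity report by hand, not a Turnitin export format.

```python
from collections import defaultdict


def largest_source_share(matches, total_words):
    """Return (source, percent of paper) for the biggest single source.

    `matches` is a list of (source_name, matched_word_count) pairs, a
    made-up structure used only for this illustration.
    """
    per_source = defaultdict(int)
    for source, words in matches:
        per_source[source] += words
    if not per_source:
        return None, 0.0
    source, words = max(per_source.items(), key=lambda item: item[1])
    return source, 100.0 * words / total_words


# Hypothetical 2,000-word paper with matches from three sources.
matches = [
    ("journal A", 30),
    ("website B", 25),
    ("student paper C", 160),
    ("journal A", 20),
]
print(largest_source_share(matches, 2000))  # ('student paper C', 8.0)
```

Many small matches spread across sources produce a low largest share. One source accounting for 8% of the paper is the pattern that deserves a closer look.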
The students who consistently avoid problems are not the ones who chase a specific score - they are the ones who understand what each score actually measures and check both independently.
The AI Score Trap That Catches People Off Guard
There is a specific scenario that catches students who think they have been careful. It goes like this: a student uses AI to draft sections of their paper, then manually rewrites those sections to reduce AI patterns. The similarity score comes back clean - no source matches, because AI output is novel text. The student feels confident. They submit.
What they may not have checked is whether the rewriting was actually sufficient to change the statistical fingerprint of the text. Basic paraphrasing - synonym replacement and sentence restructuring - still leaves a detectable pattern in many cases. The underlying rhythm of the sentences, the probability distributions of connecting phrases, the absence of the small imperfections that characterize genuinely human writing - these are what the model is reading, and light editing often does not move them enough.
The more thorough approach is not cosmetic editing but substantive rewriting - adding your own analysis, restructuring the argument, incorporating your actual knowledge of the subject matter, and letting the AI output serve as a rough reference rather than a draft to clean up. When the underlying ideas are your own and the prose has been genuinely reconstructed rather than just surface-edited, both the AI score and the writing quality tend to improve together.
If you do use AI as part of your writing process and your institution permits it, the safest position is transparency. Cite the tool, describe how you used it, and ensure the submitted text reflects your own intellectual contribution. The academic integrity conversation about AI is still evolving, and policies vary enormously - but what creates consistent problems is not AI assistance itself but concealment of it.
What Instructors Are Actually Looking At
When an instructor opens a Turnitin report, they do not see one number. They see two separate reports, two sets of highlights, and a broader context they build from everything they know about the student and the assignment.
For similarity, instructors are trained to look past the percentage and examine the match breakdown - what sources were matched, whether citations are present, whether matched passages cluster in argument sections versus reference lists, and whether the writing around the matched text demonstrates genuine understanding of the material.
For AI, instructors look at the highlighted sections, the overall percentage, and how those signals line up with what they know about the student's usual writing. A student who has submitted uneven, informal writing all semester and suddenly submits a paper that patterns like GPT output will prompt a very different reaction than a student whose prior work already reads at a similar level.
Turnitin explicitly frames both scores as starting points for a conversation, not verdicts. The institutional process in most cases requires the instructor to meet with the student before any formal misconduct proceeding - which gives students who have been flagged a chance to demonstrate their understanding of the material. If you can discuss your paper's content fluently, explain your argument choices, and produce evidence of your writing process, a high AI score becomes much harder to act on.
Checking Both Scores Before They Check You
The practical upshot of everything above is that students are flying partially blind. You can see your similarity score. Your instructor sees everything.
The gap in information is something you can close before submission. Running your draft through an AI detection tool gives you a read on the signal your text is emitting. If the signal is high, you have time to address it. If the signal is low, you submit with confidence rather than anxiety.
For students using AI as any part of their writing process - for research, outlining, drafting, or editing - EssayCloak offers an Academic humanization mode specifically designed for academic writing. It preserves formal register, citation structure, and discipline-specific language while rewriting the patterns that detectors use to classify text as AI-generated. The result is prose that reads like the work of a careful human writer, because the writing patterns have been fundamentally restructured rather than just surface-edited.
Key Differences at a Glance
| Feature | Similarity Score | AI Score |
|---|---|---|
| What it measures | Text overlap with sources in database | Probability that prose was AI-generated |
| Detection method | Database matching | Statistical language pattern analysis |
| Who can see it | Students and instructors | Instructors only (by default) |
| Minimum word count | 30 words | 300 words |
| Low threshold display | Shows all scores | Shows asterisk (*%) below 20% |
| Bypasser detection | Not applicable | Now included in AI writing model |
| Affects the other score? | No | No |
FAQs
Can I have a low similarity score and a high AI score at the same time?
Yes - this is actually the most common pattern for AI-generated papers. The similarity score measures overlap with sources in Turnitin's database. AI output is novel text that does not appear in that database, so it rarely produces high similarity. The AI score measures writing patterns, not source overlap. These two systems operate independently and a clean similarity score provides zero protection against a high AI score.
Why does Turnitin show an asterisk instead of a number for AI scores below 20%?
Turnitin's own testing found a higher incidence of false positives when AI content is detected at between 0% and 19% of a document. Rather than display a number that could unfairly flag students, the system replaces scores in this range with an asterisk to signal that the detection confidence is too low for a percentage to be meaningful. No numerical AI score is displayed until the detected AI content reaches 20% of the submission.
Can students see their own AI score in Turnitin?
Not by default. The AI writing indicator and full AI Writing Report are visible only to instructors and administrators. Students can see their similarity score, but the AI detection data is on the instructor side only. Some institutions choose to share this information with students, but that is not standard behavior. This is why running an AI detection check on your own draft before submitting - using a tool outside of Turnitin's institutional portal - is the only reliable way to know what signal your text is emitting.
Does a high similarity score mean I used AI?
No. The similarity score and AI score measure completely different things and do not influence each other. A high similarity score typically reflects citation-heavy writing, direct quotations, shared academic phrasing, or bibliography matches. It has nothing to do with whether your prose was generated by an AI. An instructor reviewing a high similarity score with proper citations and attribution will reach a very different conclusion than one reviewing the same percentage where the matched text sits uncited in the middle of an argument.
Can Turnitin detect text that has been humanized or paraphrased by AI?
Turnitin's AI writing detection is now powered by three separate models. One detects raw AI output. A second detects AI-written text that was subsequently processed through an AI paraphrasing tool. A third - added more recently - detects AI-generated content that was modified by a humanizer or bypasser tool. The AI Writing Report categorizes detected text under "AI-generated" regardless of which model flagged it. Basic synonym-swapping paraphrasing still gets detected at a high rate; more thorough structural rewriting is harder to detect but not immune, especially as the model is regularly updated.
What should I do if I receive a false positive AI detection result?
Turnitin explicitly acknowledges that false positives occur and that the AI score should not be used as the sole basis for adverse action. If you believe your paper was incorrectly flagged, gather process evidence - Google Docs version history, timestamped drafts, handwritten notes, research annotations. Request a meeting with your instructor before any formal process proceeds. Most institutions require that conversation anyway. During that meeting, your ability to discuss the paper's content, explain your argument choices, and demonstrate genuine understanding of the material is often the most effective defense. You can also cite Turnitin's own documentation on false positive rates and limitations.
Is a 0% similarity score the safest outcome?
Not necessarily - and for research-heavy assignments, it can actually be a red flag. A 0% similarity score means Turnitin found no overlap with its database, which sounds good but can indicate a lack of source engagement for assignments that require citations and evidence. The AI score, meanwhile, can still be very high even when similarity is at 0%, because AI writes novel prose that does not appear in existing databases. The cleanest outcome is not 0% similarity but a similarity score where all matched text is properly cited and attributed, accompanied by a low or asterisked AI score.