General Updates: Thank you sincerely for signing up for the GPTZero beta. I'm completely awestruck by the support this app has generated. In the past day, more than 4,000 people have signed up for the beta (via this Substack) and more than 10,000 have tried and tested it out on the Streamlit
Can I have an API please :-)
I tested it out, and it does work for texts generated entirely by GPT models or with some human intervention. That said, it does not work well with essays written by good writers: it falsely flagged many essays as AI-written. This is a VERY useful tool for professors, but also a very dangerous one, because trusting it too much would compound the harm done by false flags.
To Edward: Please make sure the model has a false-flag rate below 1-2% on all types of content: articles, very poor essays, good essays, stories, etc. For example, my college essays were falsely flagged multiple times, even though I didn't use ChatGPT or any other language model. I used a thesaurus and Grammarly, and that's about it. I urge you to train it on a dataset that accounts for every type of content available. As a high school student, I especially want to emphasize training on very good essays: yes, a lot of students will use GPT to elevate their writing, but some are honest in their essays, and the model seems not to take that into account.
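A false-flag rate like the one requested above could be measured per content type on a labeled evaluation set. The following is only a minimal sketch: the function name `false_flag_rates`, the tuple layout, and the sample records are all hypothetical, and nothing here reflects how GPTZero is actually evaluated.

```python
from collections import defaultdict

def false_flag_rates(samples):
    """Return the false-flag rate per content type.

    samples: iterable of (content_type, is_human_written, flagged_as_ai).
    Only human-written samples count, since a false flag is a human text
    that the detector marked as AI-generated.
    """
    human_total = defaultdict(int)
    human_flagged = defaultdict(int)
    for content_type, is_human_written, flagged_as_ai in samples:
        if not is_human_written:
            continue  # AI-written texts cannot be false flags
        human_total[content_type] += 1
        if flagged_as_ai:
            human_flagged[content_type] += 1
    return {t: human_flagged[t] / human_total[t] for t in human_total}

# Hypothetical labeled results, made up purely for illustration.
samples = [
    ("good essay", True, True),    # human text, wrongly flagged
    ("good essay", True, False),   # human text, correctly passed
    ("story", True, False),        # human text, correctly passed
    ("story", False, True),        # AI text, ignored for this metric
]
rates = false_flag_rates(samples)
too_high = {t: r for t, r in rates.items() if r > 0.02}  # above the 2% target
print(rates, too_high)
```

Reporting the rate separately for each content type, rather than one pooled number, is what would reveal a detector that does fine on average but fails on well-written essays.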
Nice attempt, but ChatGPT is so good at what it does...
I have pasted in roughly 350 words of French (the text of an in-preparation novel) mostly generated by ChatGPT. The text is slightly manually edited for a better style, and generated with a strong, enforced context leading to the presence of proper nouns. When given some context, ChatGPT tends to integrate it in a simple manner, and thereby reproduces the vocabulary and style of the author. By that, I mean that if told how a location looks, and asked to write a paragraph describing a character traversing that location, the descriptions will mostly be reused verbatim.
And that text passes the GPTZero test as human. I can share the printout of the final screen for further analysis. I am not totally convinced that proper human-AI cooperation can be flagged.
gptzero.me just brings us here to this Substack. The Streamlit app doesn't load ("Oh no. Error running app. If this keeps happening, please contact support."). How do we use GPTZero?
Awesome work, GPTZero is appreciated. High-quality AI content has seemingly sprung up overnight, so this tool will hopefully be valuable for identifying shortcuts where shortcuts shouldn't be taken!
I tested it out, and yeah, it did work. Unfortunately, it does not work well with well-written essays. It highlighted various texts on the assumption that they were AI-generated. I must confess, I was pretty disappointed, especially after all the testimonials I had heard about how good it was. But frankly, this technology should not be discarded, only improved.
I would love to have access to an API or possibly on-premise processing.
This is a very useless app, I must confess. I copied part of my master's project from last year, which has not even been published, and ran it on this platform, but it was flagged as AI-generated text. Can you imagine that? Please, you should not confuse educators and cause professors to punish students unnecessarily.
Any thoughts on how to efficiently check/train on a dataset of articles such as those described being authored here? :
I grabbed a few posts from Twitter and LinkedIn that were very clearly AI-generated, even without needing GPTZero. All of them came back with scores indicating a high probability of AI generation.
Are there any plans to potentially allow others to build with this tool?
I'd like to try building something that lets me paste in someone's Twitter or LinkedIn profile to find out how much of their content over a specific period of time was AI-generated.
I'm sure these critiques are useful. However, if we think of GPT as being a little like the calculator, then using it with the addition of human intervention in the process may be something for academics to get used to and (dare I say it) accept. I'd think a tool that can pick out pure AI would be a great addition to the anti-plagiarism toolkit. Given that this is in very early development, with a solo developer so far, I'll be watching this with great interest.
"texts with lower perplexities are more likely to be generated by language models" - could you at least give us a range? What does "lower" mean?
Hi, I entered this prompt into ChatGPT: "Examine and discuss the importance of ancestry in Beowulf," and received the following output, which GPTZero judged likely to be human-written. This is the feedback from GPTZero: https://etedward-gptzero-main-zqgfwb.streamlit.app/
I watched ChatGPT generate this text:
In the epic poem Beowulf, ancestry plays a significant role in the story, as well as in the culture of the characters. The poem is set in Scandinavia in the 6th century, a time when a person's genealogy was of great importance to their identity and their place in society.
One of the main themes of the poem is the concept of the "mead-hall," which serves as both a political and social center for the community. The mead-hall is described as a place where the lord of the hall, or the king, distributes gifts and treasures to his followers, in order to demonstrate his generosity and maintain the loyalty of his retainers. In this way, the mead-hall is a symbol of the lord's power and prestige, as well as a symbol of his responsibility to his people.
In the poem, the hero Beowulf is presented as a warrior of great strength and courage, who is also a member of a distinguished line of warriors. He is described as the "kin of Hygelac," which indicates that he is a member of the same clan as the king of the Geats, Hygelac. Beowulf's ancestry is also used to establish his credibility and prestige as a warrior, and to emphasize his connection to the royal line.
In addition, the poem also focuses on the importance of lineage for the Danes and Geats in the poem . The Danish king, Hrothgar, is described as being from the line of Shield Sheafson, which is said to be the royal line of Denmark. Similarly, the Geat king, Hygelac, is said to be from the line of Waegmundings.
In Beowulf, Ancestry is also linked with the heroic deeds and glory that the characters seek to attain during their lifetime. As is the case with Beowulf, who wants to gain fame and glory to bring honor to his people, and to ensure that his name will be remembered in the songs and stories of future generations.
The importance of ancestry in Beowulf is also linked to the idea of the "comitatus," which is a Germanic concept of loyalty and bravery among a lord and his followers. The lord and his followers are bound together by a bond of loyalty, which is based on their shared ancestry and their shared sense of honor and duty.
In summary, Ancestry plays a central role in the epic poem Beowulf, it's used to establish the credibility and prestige of the characters, as well as to establish the political and social hierarchy of the society depicted in the poem. It also serves as a way of preserving the memory and deeds of the characters, providing an insight of how genealogy was an important factor in shaping the identity of the characters and the society in which they lived.
Thank you. I am still unable to submit essays to this site to check work possibly completed by ChatGPT (the chatbot). How can I use your new GPTZero to detect who wrote an essay?
How can I use it? The website says it’s not working.
Great work Edward! I'm looking forward to seeing your app unfold and how others may react to it in the future.