OpenAI has built a text watermarking method to detect chatgpt written content

@[email protected] · 2 years ago

@[email protected] · edited 2 years ago

They could also inject random zero-width non-joiners to help detection. That's easy to defeat, but a layperson would have to go to extra effort to filter them out. Kinda like how some plagiarism cases have been won by pointing out identical misspelled words.
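For anyone curious what that would look like, here is a minimal Python sketch. The function names (`inject_zwnj`, `strip_zwnj`) and the `rate` parameter are made up for illustration; nothing here reflects what OpenAI actually does.

```python
import random

ZWNJ = "\u200c"  # Unicode ZERO WIDTH NON-JOINER, invisible when rendered


def inject_zwnj(text: str, rate: float = 0.1, seed: int = 0) -> str:
    """Hypothetical marker: insert a ZWNJ between some adjacent letters."""
    rng = random.Random(seed)
    out = []
    for i, ch in enumerate(text):
        out.append(ch)
        nxt = text[i + 1] if i + 1 < len(text) else ""
        # Only mark positions inside words (letter followed by letter).
        if ch.isalpha() and nxt.isalpha() and rng.random() < rate:
            out.append(ZWNJ)
    return "".join(out)


def has_zwnj(text: str) -> bool:
    """Detection is a plain substring check."""
    return ZWNJ in text


def strip_zwnj(text: str) -> str:
    """Defeating the marker is a one-line replace."""
    return text.replace(ZWNJ, "")


sample = "Watermarks hidden inside words"
marked = inject_zwnj(sample, rate=1.0)
cleaned = strip_zwnj(marked)
```

The marked string looks identical on screen but no longer compares equal to the original, and a single `replace` call removes every marker.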

just another dev · 2 years ago

Yeah, no chance they’d rely on something that would be so easy to defeat. Watermarking by using word patterns is far more likely.

Still easy to defeat by just using another LLM to rephrase it though.

@[email protected] · 2 years ago

It’s one of many things they could do, the same way security works in layers.

just another dev · 2 years ago

They could, but adding random zero-width characters inside words would also break every spell checker, giving it away immediately and making sure that even unaware people would filter it out. Putting them only outside the words would leave too few spots for proper watermarking.

I think it’s far more likely they’ll use some kind of pattern in the tokens. That way the watermark remains even when you don’t copy-paste the text.
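To make "a pattern in the tokens" concrete, here is a toy Python sketch of one publicly discussed idea: a secret key splits the vocabulary into "green" and "red" tokens depending on the previous token, the generator leans toward green ones, and the detector counts how often they show up. This is an assumption for illustration only, not OpenAI's undisclosed method; `SECRET_KEY`, `is_green`, `generate`, and the fake vocabulary are all hypothetical.

```python
import hashlib
import random

SECRET_KEY = "hypothetical-key"  # assumption: known only to the provider


def is_green(prev_token: str, token: str) -> bool:
    """Keyed hash of (previous token, token) puts about half the vocab in 'green'."""
    data = f"{SECRET_KEY}|{prev_token}|{token}".encode()
    return hashlib.sha256(data).digest()[0] % 2 == 0


def green_fraction(tokens: list[str]) -> float:
    """Detector: share of tokens that are green given their predecessor."""
    pairs = list(zip(tokens, tokens[1:]))
    if not pairs:
        return 0.0
    return sum(is_green(p, t) for p, t in pairs) / len(pairs)


def generate(vocab: list[str], length: int, watermark: bool, seed: int = 0) -> list[str]:
    """Stand-in for a language model: picks among a few random candidates."""
    rng = random.Random(seed)
    tokens = [rng.choice(vocab)]
    while len(tokens) < length:
        candidates = rng.sample(vocab, k=min(4, len(vocab)))
        pick = candidates[0]
        if watermark:
            # Prefer the first green candidate; fall back if none is green.
            pick = next((c for c in candidates if is_green(tokens[-1], c)), pick)
        tokens.append(pick)
    return tokens


vocab = [f"w{i}" for i in range(200)]
marked = generate(vocab, 300, watermark=True)
plain = generate(vocab, 300, watermark=False)
print(green_fraction(marked), green_fraction(plain))
```

Unmarked text should land near a green fraction of one half, while marked text sits well above it. Because the signal lives in the word choices, retyping the text keeps it, but having another model rephrase the text replaces those choices and washes it out.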

But yeah, as I said, they’ll never reveal how it’s implemented, and it can still be easily subverted.

OpenAI has built a text watermarking method to detect ChatGPT-written content — company has mulled its release over the past year