Avoid Duplicate Content: ChatGPT vs. Gemini and Why It Matters for SEO

    Last Updated on May 20, 2024

    Translated from the original German article by Bettina Heuser

    Gemini and ChatGPT are among the most commonly used AI tools for generating text. Register with both so that you have access to the full scope of each app. Both create content that is more than 80% unique.

    ChatGPT & Gemini: unique content?

    This question is explored in an article by Ubersuggest, in which Gemini and ChatGPT were tested. Both apps were asked to write 1000 articles on different topics, and the articles were then checked with Copyscape.

    What causes duplicate content and why should you avoid it?

    Duplicate content is identical content that can be found on several websites or in several versions of one website. Is duplicate content harmful? Not in itself. How you use it can be harmful.

    Duplicate content is often created by:

    • + Copying content from the web
    • + Using the same texts for different subpages
    • + Incorrect technical settings, for instance when relaunching or moving a website. Duplicate content can occur here when identical or very similar content is available under several URLs (i.e. page addresses).
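
    To make the last point concrete, here is a minimal Python sketch that groups URLs which probably lead to the same page. The normalization rules (ignoring the protocol, a leading "www.", a trailing slash and the query string) and the function names normalize_url and group_url_variants are my own assumptions for illustration. On sites where the query string selects different content, these rules would be too aggressive.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize_url(url: str) -> str:
    """Reduce a URL to a simplified key so that common variants compare equal.

    Assumption: protocol, "www." prefix, trailing slash, query string and
    fragment do not change the content of the page.
    """
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    return host + path


def group_url_variants(urls):
    """Return only those normalized keys that are reachable under several URLs."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize_url(url)].append(url)
    return {key: variants for key, variants in groups.items() if len(variants) > 1}
```

    Feeding it a list of the URLs from a crawl or a sitemap gives you groups of addresses that are worth checking in your technical settings.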

    In the images below, you can see the result of my check at Siteliner – with one click you can see a list of files that are considered duplicate content. It’s a very useful tool!

    Why should you avoid duplicate content?

    In general, duplicate content isn’t best practice, although I have written a separate post about why it might not always be a bad thing.

    Search engines such as Google or Bing try to find the best content for a specific search query. If several websites carry the same content, it is difficult for the search engine to assess which version is the best. Websites with too much duplicate content can lose ranking as a result.

    As a rule of thumb, the proportion of duplicate content on a page should not exceed 25%. If the proportion of duplicate content is higher, this can lead to problems with SEO and the user experience.
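
    The 25% rule of thumb can be checked roughly by hand. The Python sketch below counts how many words of a page sit in sentences that also appear in a second text. This is a simplified assumption of mine, not the method that Siteliner or Copyscape actually use, and the names split_sentences, duplicate_share and RULE_OF_THUMB are hypothetical.

```python
import re

# Threshold taken from the rule of thumb in this article.
RULE_OF_THUMB = 0.25


def split_sentences(text: str) -> list[str]:
    """Split text into sentences, lowercased and with collapsed whitespace."""
    raw = re.split(r"(?<=[.!?])\s+", text.strip())
    return [" ".join(s.lower().split()) for s in raw if s.strip()]


def duplicate_share(page_text: str, other_text: str) -> float:
    """Fraction of the page's words that sit in sentences also found in other_text."""
    other = set(split_sentences(other_text))
    sentences = split_sentences(page_text)
    total_words = sum(len(s.split()) for s in sentences)
    if total_words == 0:
        return 0.0
    duplicate_words = sum(len(s.split()) for s in sentences if s in other)
    return duplicate_words / total_words


def exceeds_rule_of_thumb(page_text: str, other_text: str) -> bool:
    """True if the duplicate share is above the 25% rule of thumb."""
    return duplicate_share(page_text, other_text) > RULE_OF_THUMB
```

    Only exact sentence matches are counted here, so lightly reworded copies slip through. For a real audit, a dedicated tool remains the better choice.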

    Tips for avoiding duplicate content:

    • + Do not copy content from other websites without obtaining the author’s consent.
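    • + Use different content for different websites and subpages.
    • + Check the technical settings, especially after relaunching or moving a website, so that the same content is not available under several URLs.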

    With Siteliner you can detect duplicate content on your website. Up to 200 pages of your website can be analyzed free of charge.

    Plagiarism programs: Copyright and checking whether the text was written by an AI

    Duplicate content: ChatGPT vs. Gemini

    Of the 1000 articles tested by Neil Patel’s team, ChatGPT had 18.6% duplicate content and Gemini had 14.9%. Sometimes it was a whole paragraph, often just one or two sentences.

    A note on German text: these programs originate from America, and so far they have always classified a German text as 100% written by humans when assessing whether it was written by an AI.

    Copyscape: 30 dollars / month
    My tip here would be: write your content in English (or have it created), check it, and then translate it into German. Then copyright should no longer be an issue.

    How does Gemini create duplicate content?

    This is what Gemini says:

    Duplicate content in Gemini occurs for two main reasons:

    • + Similar inputs lead to similar outputs. If a user makes a query that is already known to the model, it is likely to generate similar text to previous queries.
    • + The model learns from the data on which it has been trained. If a text appears twice in the training data, it is likely that the model will also generate it twice.

    How is duplicate content created with ChatGPT?

    This is what ChatGPT says:

    Duplicate content, also known as redundancy, can occur in ChatGPT in a number of ways. Here are some possible reasons:

    Questions with different wording:

    • + If users ask similar questions with different wording, ChatGPT’s response may still be similar. The program may repeat the same information despite the different wording.

    Repetition of information:

    • + ChatGPT has a limited memory capacity and reads text in chunks. It may tend to repeat information to ensure context is maintained, especially if the information is presented in different sections of the conversation.

    Suggested answers:

    • + When a user asks a question and gives several possible phrases, ChatGPT can generate similar responses based on the suggested phrases.

    Training data:

    • + The program has been trained with a variety of texts from the Internet. If there is duplicate or similar content in the training corpus, this can lead to redundant output.

    It is important to note that ChatGPT does not intentionally duplicate content and that the model responses depend on the input and context.

    Avoid duplicate content with Gemini or ChatGPT

    This is what Gemini says:

    • + Diversify the user’s input. This can be achieved by asking the user more open-ended or complex questions.
    • + Train the model with different data sets. This can help the model learn to generate different text formats.
    • + Check the model for duplicate content. This can help the model learn to avoid duplicate content.

    Neil Patel’s study compared how much duplicate content ChatGPT and Gemini generate. On average, ChatGPT generated more duplicate content than Gemini. According to Gemini, this is probably because ChatGPT was trained on a smaller dataset than Gemini.

    This is what ChatGPT says:

    • + Clear wording: To get precise answers, formulate your questions simply and clearly. Clear wording increases the chance that you will receive specific and non-repetitive information.
    • + Consider contextual information: Add additional contextual information to your questions. This helps ChatGPT to better understand the context and allows for more accurate and targeted answers.
    • + Customization of questions: If you get a general answer and need more details, customize your question or add more information. This helps you to get the desired information.
    • + Manual review and customization: Read the generated answers carefully. If information is repetitive or unclear, adjust the question and ask it again to steer the model in the desired direction.
    • + Experiment: Try out different wording and questions to see if you get different but relevant answers.

    It is important to note that ChatGPT is based on previous input and does not necessarily have an awareness of previous responses. Therefore, it is up to you to control the interaction and make sure you get the information you want.
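
    If you want to follow the "Experiment" tip more systematically, a small Python sketch like the following can show how close two generated answers are. It uses difflib.SequenceMatcher from the Python standard library at word level. The function name similarity is hypothetical, and the score is only a rough indicator, not a plagiarism check like Copyscape.

```python
from difflib import SequenceMatcher


def similarity(text_a: str, text_b: str) -> float:
    """Return a ratio between 0 and 1 based on the word sequences of both texts.

    1.0 means the word sequences are identical (ignoring case), 0.0 means
    the texts share no words at all.
    """
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    # autojunk=False keeps frequent words from being ignored in long texts.
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)
    return matcher.ratio()
```

    For example, similarity("Duplicate content hurts SEO", "duplicate content hurts rankings") returns 0.75, because three of the four words in each text match in the same order. If two answers to differently worded prompts score very high, it is worth rephrasing the question again.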

    Conclusion: Duplicate content with these two AIs

    I have learned that Gemini has been trained on a larger data set and that I should be more precise and creative in my communication with the AI.

    The next time I create content, I will try both and compare the results – at the moment I use these programs for free because I mainly write for myself and my students 🙂

    So try to create unique content and don’t forget E-E-A-T and Helpful Content.

    Image Content credentials: Post Image Generated with Bing Image Creator AI ∙ December 20, 2023 at 12:56 PM