
Famous Bloggers


How to Prevent Content Scraping before It Starts

May 15, 2012 - Last Modified: March 29, 2014 by Amanda DiSilvestro


Having someone copy your outfit or your new hairstyle is one thing, but no one likes to see someone copy work that took hours to complete. This is infuriating, and unfortunately it happens quite often with online content. Not only can someone copy your articles, but the copy could actually rank higher on a SERP than your original. Therefore, it is important that a blogger take the necessary precautions to keep other websites from copying content.

The first thing a blogger should do is check whether his or her content is being copied at all. Although Google does have rules in place about scraped, or copied, content, it is in the best interest of the blogger to find this content and get it taken care of as quickly as possible. You can do this through a few different methods:

  • Copyscape – This is a service that lets you type in the URL of your content and then find the websites where it has been copied. It has free and premium versions.
  • Google Alerts – This works best if you don’t post new content very often. You can type in the title of your post in quotation marks, and then you will get an alert email if that title ever shows up on Google again.
  • Webmaster Tools – This gives you a list of websites that link to pages on your site. If you see that one website links back to your site unusually often (more than three or four times), you could have a content scraper on your hands.
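
The Webmaster Tools check above can be roughed out in a few lines of Python. This is only a sketch: it assumes you have already exported the linking URLs into a plain list, the function name suspicious_domains is made up for illustration, and the threshold of four simply mirrors the "three or four times" rule of thumb.

```python
from collections import Counter
from urllib.parse import urlparse


def suspicious_domains(linking_urls, threshold=4):
    """Return domains that link to your site more than `threshold` times.

    `linking_urls` is assumed to be a list of URLs exported from a
    backlink report; the export step itself is not shown here.
    """
    # Count how many linking pages belong to each domain.
    counts = Counter(urlparse(url).netloc.lower() for url in linking_urls)
    # Keep only domains above the threshold; skip URLs with no domain part.
    return {domain: n for domain, n in counts.items() if domain and n > threshold}


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    sample = [f"http://scraper.example/copy-{i}" for i in range(5)]
    sample += ["http://friend.example/roundup", "http://friend.example/links"]
    print(suspicious_domains(sample))
```

A domain that shows up here is not proof of scraping. It is just a short list of sites worth visiting by hand.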

Once you have discovered that someone is scraping your content, you can take measures to get that website to remove it. In many cases, simply finding the contact information of the website owner and asking them to take it down is enough. In other cases, you may have to go through Google and file a complaint. This is done by going to the Google DMCA page, answering a few questions to describe your situation, and then giving the URLs of the original content and the scraped content.

How to Make Sure Your Content Doesn’t Get Scraped

Discovering if your content has been copied and then reporting the copied content is important, but most bloggers can actually stop the majority of content copying in the first place. If you take extra measures, you can make it much harder for a website to steal your hours of hard work. Consider some of the tips below:

1. CAPTCHA – Oftentimes content is copied not by a person, but by a computer. A CAPTCHA makes it harder for automated bots to scrape your content, and it helps reduce the amount of spam on your site. A CAPTCHA usually asks a visitor to type in a few jumbled letters and numbers, which helps confirm that a person is on your site and not a bot. Although this won’t stop people from copying your content by hand, you would be surprised at how many fewer issues you have.
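
To make the idea concrete, here is a minimal Python sketch of the "type in a few jumbled letters and numbers" step. It is an illustration only: the names new_challenge and is_human are hypothetical, and a real CAPTCHA would also render the text as a distorted image and would normally come from an established service or plugin rather than hand-written code.

```python
import hmac
import secrets
import string

# Characters used for the jumbled challenge text (an arbitrary choice here).
ALPHABET = string.ascii_uppercase + string.digits


def new_challenge(length=6):
    """Create a random string of letters and digits for the visitor to type."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def is_human(expected, typed):
    """Check the visitor's answer, ignoring case and surrounding spaces."""
    answer = typed.strip().upper().encode()
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected.upper().encode(), answer)


if __name__ == "__main__":
    challenge = new_challenge()
    print("Please type:", challenge)
    print("Accepted:", is_human(challenge, challenge.lower()))
```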


2. Pinging – You can let search engines know that your content has been published before you include it in an RSS feed. You can do this with a ping service, which is a service that notifies a server that new content has been uploaded.
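
For readers curious about what a ping looks like under the hood, the sketch below uses Python's standard xmlrpc.client module and the weblogUpdates.ping method that blog ping services conventionally accept. Treat it as an assumption-laden example: the endpoint URL is a placeholder rather than a real service, the function names are invented here, and most blogging platforms already send pings for you.

```python
import xmlrpc.client


def build_ping_payload(blog_name, blog_url):
    """Build the XML-RPC request body for a weblogUpdates.ping call."""
    return xmlrpc.client.dumps((blog_name, blog_url),
                               methodname="weblogUpdates.ping")


def send_ping(endpoint, blog_name, blog_url):
    """Send the ping to a ping service and return its response.

    `endpoint` is whatever XML-RPC URL your chosen ping service documents.
    """
    server = xmlrpc.client.ServerProxy(endpoint)
    return server.weblogUpdates.ping(blog_name, blog_url)


if __name__ == "__main__":
    # Print the request body only; no network call is made here.
    print(build_ping_payload("My Blog", "https://blog.example/"))
    # send_ping("https://ping.example/RPC2", "My Blog", "https://blog.example/")
```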

3. Canonical Links – You can add the rel="canonical" tag to your own pages to help make sure that your website gets credit for any content that is scraped. The tag travels with the page when a scraper copies your full HTML, so the copy still names your URL as the original. Although it doesn’t stop the scrapers, it will at least help give you the credit you deserve. Google also reads this tag as a hint about which page is the original, which can help your page, rather than the copy, get the credit.
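
As a rough illustration of the canonical tip, the Python sketch below builds the tag for a page and checks whether a copied page still carries it. The helper names canonical_tag and find_canonical are hypothetical, the URL is a placeholder, and in practice an SEO plugin or your theme would normally insert the tag for you.

```python
from html import escape
from html.parser import HTMLParser


def canonical_tag(url):
    """Return the <link> tag that belongs in the page's <head>."""
    return f'<link rel="canonical" href="{escape(url, quote=True)}">'


class _CanonicalFinder(HTMLParser):
    """Remember the href of the first rel="canonical" link that is seen."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower()
        if tag == "link" and rel == "canonical" and self.href is None:
            self.href = attributes.get("href")


def find_canonical(html_text):
    """Return the canonical URL declared in `html_text`, or None."""
    finder = _CanonicalFinder()
    finder.feed(html_text)
    return finder.href


if __name__ == "__main__":
    tag = canonical_tag("https://blog.example/my-post")
    copied_page = f"<html><head>{tag}</head><body>copied text</body></html>"
    print(tag)
    print(find_canonical(copied_page))
```

If find_canonical returns your own URL for a scraped copy, the tag was carried along with the content. If it returns None, the scraper stripped the tag or copied only the text.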

It is important to remember that content scraping isn’t always bad. If a website is giving you credit, this could actually help drive traffic to your website and help you gain visibility. Your content is being put in front of a new audience, and that audience may want to visit your site to read similar articles. Many websites even strike up a business proposal that is centered around republishing content. However, content scraping without proper attribution is something no blog owner wants to see. The first steps should be trying to avoid it in the first place by installing a CAPTCHA, pinging, or using canonical links.

What do you do to prevent content scraping? Have you found that this is a common problem for most blogs?

Image © ekaterinabondar – Fotolia.com


Filed Under: Blogging

About Amanda DiSilvestro


Amanda DiSilvestro is the Editor-in-chief for Plan, Write, GO. She has been writing about all things digital marketing as a ghostwriter, guest writer, and blog manager for over 10 years. Check out her blogging services to learn more!


{ 23 Responses }

  1. Nasrul Hanis says:
    Although this is an optional matter for most webmasters and bloggers, I guess this should be an essential awareness so we can avoid more duplicated and similar content being created - especially content duplicated from our own. However, the more important thing to be concerned about is this: never duplicate content, as it can do much damage to your credibility and your website.
    • Amanda DiSilvestro says:
      Absolutely. As Haroun states below, some people really don't mind having their content scraped. However, no blogger should be the scraper! Thank you for reading!
  2. Haroun says:
    I like the stance of people like Danny Brown and Steve Pavlina, who Uncopyright and Creative Commons their content. I realised that I don't mind having my own content scraped anymore.
  3. Eapen says:
    Thanks Amanda for the wonderful tips to protect our blog from scrapers. But I am a bit confused about the rel=canonical tag, pardon me if i am wrong. As per my understanding goes, only the non canonical versions will have this tag in their headers, so this tag must be present in all the versions that are scraped, is that what you meant? How do you do that technically btw?
  4. DeniseGabbard says:
    Hi Amanda-- Some great tips in here. I write so much, and do a lot of guest posting that it is hard to keep track of all my work. I did find a few posts months ago that had been scraped and poorly spun...WITH my name as author intact...talk about aggravating. Set up monitoring tools and have not seen anything lately...but honestly, who knows? Definitely will be implementing the canonical idea.
    • Amanda DiSilvestro says:
      Ahhh that would absolutely infuriate me! I don't think I've ever dealt with someone changing the content and putting my name on it, but yes...that needs to stop. Good luck and let us know how it goes.
  5. Steve Hippel says:
    Really enjoyed that post. I have been using a service called Tynt and it surprised me just how much my content was copied. Tynt adds your link to the copied content but it gets deleted a lot. I'm going to take your advice and check out adding the canonical tag.
  6. Denise Fay says:
    Hi Amanda, I saw your article on bizsugar.com. What really useful tips you have. I've thought about copying or scraping but that's as much as I've done. Thought about it. I'm going to now relook at doing something about it. Thanks for these tips. Something that all of us bloggers should be aware of. It's the kickstart that I need. Denise
    • Amanda DiSilvestro says:
      Excellent! I think that this really hits home for writers, but others who work with websites often let this pass them by. Glad to see you're going to take a second look :)
  7. Grady Pruitt says:
    I have to admit that I haven't thought much about scraping. I guess I figure if they're going to do it, they're going to do it. I guess I should be more careful about it. I have to wonder, though. I know there are some out there who won't comment on a page if it has a captcha on it. I think I do have one, but I don't know if it would protect me from scraping. Great post and something to think about. Thanks for sharing!
    • Amanda DiSilvestro says:
      It won't necessarily protect you from ALL content scraping, but it will help! Having a CAPTCHA on your blog isn't difficult, so I always say why not! It's easy and it can deter many automated spammers. Thanks for reading!
  8. Betty Rhodes says:
    I am not a fan of content scraping nor even any form of content copying (if not well cited). As I have always been strict about, do not copy if you will not give credit to the person whom you copied from. It's a win-win situation and should be well practised even here in the virtual world. Currently, I am well accustomed to CAPTCHAs and Pinging. That would be a good start for blogs.
  9. Ricky Shah says:
    Even plugins like RSS Footer or Yoast WordPress SEO help in thwarting RSS scrapers. With the recent Panda algorithm update and refresh, I see many spammers ranking higher than the original producer. This is the reason why we have to take matters into our own hands.
    • Amanda DiSilvestro says:
      Absolutely. You must take matters into your own hands and not just hope it will go away. Although it may not be the end of the world a time or two, it could potentially get out of control.
  10. Aasma says:
    It's not only blogs; sometimes scrapers don't spare websites either. In my opinion the first step is to contact the scraper and ask him to remove your content. If you don't see any response, then go directly to Google and register your complaint.
    • Amanda says:
      Agreed! Although many copy content on purpose, many don't even realize that they have some sort of bug that is scraping others' content. It's best to always talk to the website that has copied your content first and then go to Google. Thanks for reading!
  11. Anton Koekemoer says:
    Yes - a lot of the time your content is copied not by a person, but by a computer. And having a CAPTCHA installed on your site is the best way of preventing the content from being Scraped. But one thing I've learned, If someone decides to Scrape content from a website , there is very little one can do to stop them. Not even a privacy policy is worth deterring a spammer.
    • Amanda says:
      Yes, I am a big believer in CAPTCHAs (I'm not allowed to add photos on this site...but I did have a great photo of a CAPTCHA, I promise!). I actually wrote an article about what to do after your content has been scraped, which you can read here: http://www.webmarketingtherapy.com/blog/what_to_do_if_a_website_is_stealing_your_content/ Although there are a few things you can do, it's absolutely best to stop it before it starts and not take any chances. Thanks for reading!
  12. Scott says:
    Hi Amanda, thank you for sharing this. Out of all the things that I worry about, I think I just cannot afford to take the time out to worry about someone stealing my content. I think the rel="author" tag is enough to let Google know who the content belongs to.
    • Amanda says:
      I can respect that. I think I'm just a paranoid freak! My full-time job is creating content (more-so than helping run our website), so the concern is probably heightened for me. I agree with you, though, the rel="author" tag is awesome and works in the majority of situations. Thanks for reading!
  13. Jason Nelson says:
    Good work on this post Amanda. I like the tip about rel="canonical" to make sure you get credit for the content. I hate it when content gets scraped and you don't get any link credit for it.
  14. Amanda says:
    I think it's an excellent idea to try and get some backlinks out of content scraping. I don't think this is necessarily something to completely stress over, but it's important to be aware that it can happen and there are things you can do about it. Thank you for reading!!
  15. Jeremy says:
    I try not to worry about this stuff too much. I just try to make a point of including some links to other pages on my blog. That way if they do scrape my content at least I'll get some backlinks out of it. As my blog builds up more momentum I'll probably start caring more and file some complaints. Search engines are getting better at recognizing who the original source of content is.

Copyright ©2020 · FamousBloggers - All Rights Are Reserved · Powered by Genesis Framework
