
What is Spam Score?

Spam Score shows the percentage of sites with similar characteristics that have been penalized or banned by Google. The Spam Score is based on a machine learning model that identifies 27 common characteristics among millions of banned or penalized sites.
A high Spam Score for your site, or for a site you are reviewing, does not necessarily mean that the site is spam. It is a sign that you should research the quality and relevance of that site more closely.
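
As a rough illustration of reading the score as a percentage, the sketch below computes the share of penalized sites within a group of sites that share similar characteristics. The function name and the counts are hypothetical; the actual model and its training data are not described here.

```python
def spam_score_percent(penalized: int, total: int) -> float:
    """Percentage of sites in a group that were penalized or banned."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= penalized <= total:
        raise ValueError("penalized must be between 0 and total")
    return 100.0 * penalized / total


# Hypothetical group: 1,000 sites share a set of characteristics,
# and 310 of them were penalized or banned.
print(spam_score_percent(310, 1000))  # 31.0
```

Read this way, a score of 31% says that about a third of similar sites were penalized, not that this particular site is spam.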

How to Use Spam Score?

Your own site's Spam Score (which doesn’t necessarily mean that your site is spam) reflects a variety of potential signals. Because the score is based on correlation with penalization rather than causation, the solution is not necessarily to change these factors on your site. If you have not received any penalties, you do not need to worry about a Low or Medium score. It is best to use this percentage figure to evaluate the quality of the links pointing to your site, as it provides a clue to which links may need further investigation or possibly removal.
Likewise, a high Spam Score on another site does not mean that the site is spam. The score represents a wide range of potential signals, from content concerns to low authority metrics. Since it is based solely on correlation with penalization, the solution is not necessarily to reject links from sites with high Spam Scores.
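
One way to apply this advice is to sort backlinks by the Spam Score of the linking site and flag only the highest ones for manual review. The sketch below is a minimal example under stated assumptions: the `Backlink` type, the sample domains, and the cutoff of 60 are all hypothetical, and a flagged link is a candidate for investigation, not for automatic removal.

```python
from typing import NamedTuple


class Backlink(NamedTuple):
    source_domain: str
    spam_score: int  # percentage, 0-100


def links_to_review(backlinks, threshold=60):
    """Return links whose score exceeds the threshold, highest first.

    The default threshold is an assumed cutoff for a "high" score.
    """
    flagged = [b for b in backlinks if b.spam_score > threshold]
    return sorted(flagged, key=lambda b: b.spam_score, reverse=True)


# Hypothetical backlink data for illustration only.
backlinks = [
    Backlink("example-blog.test", 12),
    Backlink("cheap-pills-4u.test", 87),
    Backlink("news-site.test", 45),
    Backlink("x9-links.test", 64),
]

for link in links_to_review(backlinks):
    print(link.source_domain, link.spam_score)
```

Links below the cutoff are left alone, which matches the guidance that Low or Medium scores are not a cause for concern on their own.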

Spam Score Signals

To identify key signals associated with penalized or banned domains, a system has been developed using extensive training data from known penalized and banned domains.
Here is a list of the 27 signals used for the score:
  • Low page count
  • TLD associated with spam domains
  • Domain name length (the combined length of the subdomain and root domain is similar to that of spam sites)
  • Does the domain name contain numbers? (Many spam sites contain numeric characters in the domain name)
  • Is Google Font API available? (Spam domains rarely use custom fonts such as the Google Font API; lack of this feature is common on spam sites)
  • Google Tag Manager (Google Tag Manager is almost never found on spam sites)
  • Doubleclick Ads (almost never found on spam sites)
  • Is Phone Number Available? (Spam sites rarely have real phone numbers on their pages)
  • Links to LinkedIn (Almost no spam sites have an associated LinkedIn page, so sites lacking this feature are more likely to be associated with spam)
  • Is Email Address Available? (Email addresses are almost never found on spam sites)
  • HTTPS (Few spam sites invest in SSL certificates; HTTPS is usually a good trust signal)
  • Use of Meta Keywords (pages that use the meta keywords tag are more likely to be spam)
  • Jumpshot Visit Ranking
  • Rel Canonical (Using a non-local rel=canonical tag, one that points to a different domain, is often associated with spam)
  • Length of Title Element (Pages with titles that are too long or too short are associated with spam sites)
  • Meta Description Length (Pages with very long or very short meta description tags are associated with spam sites)
  • Length of Meta Keywords (Pages with very long meta keywords are often found on spam sites)
  • Browser Icon (Spam sites rarely use a favicon)
  • Facebook Pixel (Facebook tracking pixel is almost never found on spam sites)
  • Number of External Outlinks (Spam sites are more likely to have an abnormally high or low number of external outlinks)
  • Number of Domains Linked To (Spam sites are more likely to link to an abnormally high or low number of unique domains)
  • External Links to Content Ratio (Spam sites are more likely to have abnormal link-to-content ratios)
  • Vowels / Consonants in Domain Name (Spam sites often have many sequential vowels or consonants in domain names)
  • Hyphens in Domain Name (Spam sites are more likely to use multiple hyphens in their domain names)
  • URL Length (Spam pages often have abnormally short or long URL lengths)
  • Presence of Toxic Words (Spam sites often use specific words associated with webspam topics, such as drugs, adult content, gaming, and others)
  • Use of High CPC Anchor Text (Spam sites often use specific words in anchor text that are associated with webspam topics such as drugs, adult content, gaming and others)
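
Several of the signals above depend only on the URL itself, so they can be sketched without fetching any page. The example below is a minimal illustration, not the actual scoring model: the function name `domain_signals` and the dictionary keys are hypothetical, the whole hostname (including the TLD) is measured for simplicity, and "y" is treated as a consonant.

```python
import re
from urllib.parse import urlsplit


def domain_signals(url: str) -> dict:
    """Extract a few URL-based signals from the list above (illustrative only)."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()

    # Runs of consecutive vowels and consonants in the hostname.
    vowel_runs = re.findall(r"[aeiou]+", host)
    consonant_runs = re.findall(r"[b-df-hj-np-tv-z]+", host)

    return {
        "uses_https": parts.scheme == "https",
        "domain_length": len(host),  # assumption: full hostname, TLD included
        "has_digits": any(c.isdigit() for c in host),
        "hyphen_count": host.count("-"),
        "longest_vowel_run": max(map(len, vowel_runs), default=0),
        "longest_consonant_run": max(map(len, consonant_runs), default=0),
        "url_length": len(url),
    }


# Hypothetical URL for illustration.
signals = domain_signals("https://cheap-pills-4u.example/buy")
for name, value in signals.items():
    print(f"{name}: {value}")
```

These raw values say nothing on their own. In the approach described above, a model trained on known penalized and banned domains decides which ranges correlate with spam.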