Crawling is the process by which a search engine bot tries to visit every page of your website. The bot finds a link to your site, follows it, and starts discovering all of your public pages from there. As it crawls, it indexes the content it finds for use in Google's results and adds every link on those pages to the queue of pages it still needs to crawl. As a website owner, your main goal is to ensure that the bot can reach every page on the site. When it cannot, you get what are called crawl errors. In a nutshell, crawl errors occur when a search engine fails to reach a page on your website.
Your goal is to ensure that every link on your website leads to an actual page. A link may pass through a 301 redirect, but the page at the end of the chain should always return a 200 OK server response.
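As a rough illustration, the sketch below follows any redirects on a single URL and reports the status code of the final response; the URL used is purely hypothetical.

```python
# A minimal sketch: follow any redirects on a URL and confirm the final
# response is 200 OK. The example URL is hypothetical.
from urllib import request, error

def final_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status of the final response after redirects."""
    try:
        # urlopen follows 3xx redirects automatically.
        with request.urlopen(url, timeout=timeout) as resp:
            return resp.status          # e.g. 200 if the chain ends on a live page
    except error.HTTPError as exc:
        return exc.code                 # e.g. 404 or 500 at the end of the chain
    except error.URLError:
        return -1                       # DNS failure, refused connection, timeout

if __name__ == "__main__":
    url = "https://www.example.com/old-page"   # hypothetical URL
    status = final_status(url)
    print(f"{url} -> final status {status}")
    if status != 200:
        print("This link would surface as a crawl error.")
```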
Crawl errors are divided into two groups:
- Site errors: You don't want these, because they mean your entire site cannot be crawled.
- URL errors: These are easier to find and fix, because each error is tied to a single, specific URL.
Let's look at each group in detail.
1. Site Errors
Site errors refer to all crawl errors that prevent a search engine bot from accessing your website. There can be many reasons for this; the most common are:
- DNS errors: The search engine cannot resolve your domain name, and therefore cannot communicate with your server. This is often a temporary problem, and Google will simply come back and crawl your website later. If DNS errors keep appearing in your Search Console, it usually means Google has tried to reach your website several times and still can't.
- Server errors: If your Search Console shows server errors, the bot could not access your website: the request may have timed out, a flaw in your code may prevent a page from loading, or your site may have so many visitors that the server can't handle all the requests. Most of these errors are returned with a 5xx status code, such as 500 or 503.
- Robots.txt failure: Before crawling your site, Googlebot tries to fetch your robots.txt file. If it cannot access robots.txt, Google postpones the crawl until the file is reachable again. A quick way to check all three of these conditions yourself is sketched after this list.
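The script below is a hedged sketch of how you might run those three checks from your own machine: whether the domain resolves, whether the homepage answers without a 5xx, and whether robots.txt can be fetched. The domain name is an assumption; replace it with your own.

```python
# A rough diagnostic sketch for the three site-level failures above:
# DNS resolution, server response, and robots.txt availability.
# The domain is hypothetical; swap in your own.
import socket
from urllib import request, error

DOMAIN = "www.example.com"              # hypothetical domain

def check_dns(host: str) -> bool:
    """DNS error check: can the hostname be resolved at all?"""
    try:
        socket.getaddrinfo(host, 443)
        return True
    except socket.gaierror:
        return False

def check_server(host: str) -> int:
    """Server error check: does the homepage answer without a 5xx?"""
    try:
        with request.urlopen(f"https://{host}/", timeout=10) as resp:
            return resp.status
    except error.HTTPError as exc:
        return exc.code                 # 500/503 here mirrors what the bot sees
    except error.URLError:
        return -1                       # timed out or connection refused

def check_robots(host: str) -> int:
    """robots.txt check: what status does /robots.txt return?"""
    try:
        with request.urlopen(f"https://{host}/robots.txt", timeout=10) as resp:
            return resp.status          # 200 means the crawler can read the file
    except error.HTTPError as exc:
        return exc.code                 # a 5xx here makes Google delay the crawl
    except error.URLError:
        return -1

if __name__ == "__main__":
    print("DNS resolves:     ", check_dns(DOMAIN))
    print("Homepage status:  ", check_server(DOMAIN))
    print("robots.txt status:", check_robots(DOMAIN))
```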
2. URL Errors
As mentioned, URL errors are crawl errors that occur when a search engine bot tries to crawl a specific page of your website. Check for these errors regularly and fix them, using Google Search Console or Bing Webmaster Tools. Don't forget to keep your sitemap and internal links clean and up to date as well!
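As a rough sketch of such a check, the snippet below reads a standard sitemap.xml and flags any listed URL whose response is not 200 OK. The sitemap location, and the assumption that it is a plain sitemap rather than a sitemap index, are hypothetical.

```python
# A hedged sketch: read a sitemap.xml and flag any listed URL whose final
# response is not 200. The sitemap location is an assumption.
import xml.etree.ElementTree as ET
from urllib import request, error

SITEMAP_URL = "https://www.example.com/sitemap.xml"   # hypothetical location
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(sitemap_url: str) -> list[str]:
    """Return all <loc> entries from a plain (non-index) sitemap."""
    with request.urlopen(sitemap_url, timeout=10) as resp:
        tree = ET.fromstring(resp.read())
    return [loc.text.strip() for loc in tree.findall(".//sm:loc", NS) if loc.text]

def status_of(url: str) -> int:
    """Final HTTP status after redirects, or -1 on connection/DNS failure."""
    try:
        with request.urlopen(url, timeout=10) as resp:
            return resp.status
    except error.HTTPError as exc:
        return exc.code
    except error.URLError:
        return -1

if __name__ == "__main__":
    for url in sitemap_urls(SITEMAP_URL):
        code = status_of(url)
        if code != 200:
            print(f"possible URL error: {url} -> {code}")
```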
Most URL errors are caused by internal links, which means most of them reflect the site owner's own mistakes. If you remove a page from your site, you must also fix or remove the links that point to it. The occasional DNS or server error can also appear for a specific URL; in that case, check the URL again later to see whether the error has disappeared.
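A minimal sketch of an internal-link check might look like the following: fetch one page, collect its `<a href>` links, and report any internal link that does not end in a 200 response. The page URL is hypothetical, and a real audit would of course crawl more than one page.

```python
# A minimal internal-link check for a single page: extract every <a href>
# and report links that do not end in a 200 response. The page URL is
# hypothetical.
from html.parser import HTMLParser
from urllib import request, error
from urllib.parse import urljoin, urlparse

PAGE = "https://www.example.com/blog/"      # hypothetical page to audit

class LinkCollector(HTMLParser):
    """Collect absolute URLs from every <a href> on the page."""
    def __init__(self):
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(PAGE, href))

def status_of(url: str) -> int:
    try:
        with request.urlopen(url, timeout=10) as resp:
            return resp.status
    except error.HTTPError as exc:
        return exc.code
    except error.URLError:
        return -1

if __name__ == "__main__":
    with request.urlopen(PAGE, timeout=10) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    collector = LinkCollector()
    collector.feed(html)
    site = urlparse(PAGE).netloc
    for link in collector.links:
        # Only flag links pointing at our own site (internal links).
        if urlparse(link).netloc == site and status_of(link) != 200:
            print("broken internal link:", link)
```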
Beyond these common errors, some URL errors only apply to certain sites, so they are worth listing separately:
- Mobile-specific URL errors: Page-specific crawl errors that occur on smartphones. If you use a separate mobile subdomain, such as m.example.com, you may see more of these (see the sketch after this list).
- Malware errors: If you see malware errors in your webmaster tools, Bing or Google has found malware at that URL. For example, it could mean they found software used "to collect protected information or generally disrupt their operations". You should investigate the page and remove the malware immediately.
- Google News errors: These relate specifically to Google News. Google's documentation lists the possible errors, and if your website appears in Google News, you may encounter these crawl errors.
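If you do run a separate m. subdomain, a rough check like the sketch below can confirm that each desktop URL has a working mobile counterpart. The sample URLs, and the assumption that mobile paths mirror desktop paths one-to-one, are purely illustrative.

```python
# A hedged sketch for a separate m. subdomain: for each desktop URL, derive
# the matching m.example.com URL (assuming paths mirror each other) and
# confirm it responds with 200. The URLs and the mapping rule are assumptions.
from urllib import request, error
from urllib.parse import urlparse, urlunparse

DESKTOP_URLS = [                                 # hypothetical sample pages
    "https://www.example.com/",
    "https://www.example.com/products/widget",
]

def mobile_equivalent(url: str) -> str:
    """Map www.example.com paths onto m.example.com (assumed 1:1 structure)."""
    parts = urlparse(url)
    return urlunparse(parts._replace(netloc=parts.netloc.replace("www.", "m.", 1)))

def status_of(url: str) -> int:
    try:
        with request.urlopen(url, timeout=10) as resp:
            return resp.status
    except error.HTTPError as exc:
        return exc.code
    except error.URLError:
        return -1

if __name__ == "__main__":
    for desktop in DESKTOP_URLS:
        mobile = mobile_equivalent(desktop)
        code = status_of(mobile)
        if code != 200:
            print(f"mobile-specific crawl error candidate: {mobile} -> {code}")
```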
Conclusion:
Google Search Console lets you spot technical errors on your site and fix them with relatively little effort. You can check your website for all of the crawl errors mentioned above and correct any that you find. Fixing these technical errors is an important step toward making your website SEO friendly.