What are broken links with guest blogging

Broken Link Building with the Screaming Frog - A Guide in 5 Steps

  • Type of article: Concrete step by step instructions with examples
  • SEO level: Advanced
  • Aim: Rebuild unreachable backlinks
  • Tools needed: Backlink tool(s), Screaming Frog, Excel

What are broken links? What potential can be expected?

Broken links are generally any type of link that leads nowhere. You should distinguish between the following link types:

  1. Lost links - backlinks that once pointed to your own domain but whose link source has been lost (removed, source now returns 404, or similar)
  2. Broken links (internal) - internal links that lead nowhere (404 status code or similar)
  3. Broken links (external, called broken backlinks here) - backlinks from other domains whose link target is not available on your domain (or that point to a non-optimal link target)

In this article we deal specifically with type 3: backlinks that still exist on other domains but point to a URL on your domain that returns a 4xx / 5xx status code. To make the best use of link juice, you can also include link targets that are no longer optimal, for example because a better landing page now exists.

The idea is to regain backlinks that existed before but have been lost for various reasons. Since these links already existed legitimately, this technique cannot be viewed as spammy link building from Google's point of view. On the contrary: we help "repair" broken link structures.

Update July 2020: In a Q&A with SEO Südwest, John Müller (Webmaster Trends Analyst at Google) specifically said that broken link building is a permissible and sustainable link building technique from Google's point of view. This makes the recommendation of broken link building official.

  • Not every domain has potential in the area of broken backlinks.
  • If you are not sure if you could benefit, just follow the steps in this article.

Possible reasons for broken backlinks

Not every domain has broken backlinks. If the domain in question is old and has an extensive link profile, however, the chances of finding hidden potential in this area are very good. Typical reasons why links that lead nowhere accumulate over time include:

  • The site had a relaunch, and URL structures were changed without being redirected correctly
  • The target page was deleted by you or the URL was changed manually (status code 404)
  • The target page cannot be reached on the server side (status code 5xx, e.g. 500)
  • The target page has been linked incorrectly from the source: typo in the URL (status code 404)
  • There is now a more appropriate target the link could point to

Here's what you need for broken link building

You do not need many tools, but unfortunately they are not all free:

  • Backlink reports from tools such as the Google Search Console, Link Research Tools, Ahrefs, Semrush or Majestic (the more data from different sources, the better)
  • Microsoft Excel (or similar, in the most up-to-date version possible; unfortunately Google Sheets currently lacks an essential function)
  • The Screaming Frog SEO Spider

In the following, we will guide you step by step from analyzing the potential to regaining link strength.

It makes sense to keep in mind exactly which goals we want to achieve in this process:

  • The aim of steps 1-4 is to create a comprehensive backlink report with data from various tools, in which we receive data on the metrics source, target, status code and link strength.
  • We don't want to go through an endless list, but rather get an overview of the most important links. In other words: for each unique link target, we want the report to contain only the strongest link source.

Step 1: Research backlink sources and destinations

In the first step, we collect as much data as possible about the backlink profile of your domain. It is recommended to use as many sources as possible. The reason: different tools, such as LRT or Moz, rely on different crawl databases. We will merge the collected data in the second step. The aim is a report with specific URLs for both link sources and link targets, so that we can see exactly where each link comes from and where it points.

Data source #1 - The Google Search Console

Let's start with the Google Search Console. Basically, it is a very reliable data source, as it only shows us backlinks that Google can crawl, that is, links that are also relevant. Unfortunately, Search Console's backlink reports only list sources, not link targets. It is definitely worth combining them with tools such as LRT or Semrush, as these tools can draw on a larger data basis. You can find the backlinks of your domain in the Search Console in the left navigation under Links. At the top right you can download a .csv file.

In the Search Console you can also go through your 404 errors, if any, under Coverage. Alternatively, you can use the URL Inspection tool. Here, too, you will sometimes find broken backlinks: URLs that Google discovered through links on other domains but that cannot be reached on your domain. The problem: Google doesn't show a source here. So you may find broken backlink targets that should be redirected, but unfortunately you cannot analyze the source they originate from.

Other data sources - backlink tools

In order to get really detailed reports on link sources and destinations, we are happy to recommend a wide range of tools:

  • Ahrefs, Semrush, Majestic, Moz, LRT, Xovi & Sistrix

Some of them can already recognize broken backlinks or export backlink reports with status codes. This can be quite helpful when collecting data, although it should be noted that the technique presented here:

  1. Delivers more detailed results because the data is aggregated from several tools
  2. Can also be used to analyze broken link targets that sit at the end of a redirect chain

Step 2: data aggregation and preparation in Excel

To illustrate the process, we will analyze a specific domain in this article: the Galeria Kaufhof online shop. For this domain we have pulled backlinks from various tools and exported them as Excel sheets. The next step is to aggregate the data. Keep in mind that the exports from the various tools naturally overlap. These duplicates need to be filtered out, and there are a few things to pay attention to. First of all, we should consider the problems that arise.

Problems with aggregated data:
  • If we combine the reports in one sheet, we generate a lot of duplicate URLs (both link sources and targets)
  • Screaming Frog crawls duplicate URLs only once, so they must be removed in preparation for step 3
  • There are some cells with empty link targets - these must also be removed
  • The tools' link strength metrics do not always match exactly

First, we will bring the various Excel sheets together in one sheet. We should make sure that the columns are identical. Required columns are: Link source URL, Link target URL and any Link strength indicator. Everything else is optional, but depending on the intended use, columns such as Anchor text or Toxic Score can be useful.
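
For readers who prefer a script to manual copying, the merge can be sketched in Python. This is a minimal sketch, not part of the Excel workflow described here: the tool names and column headers in COLUMN_MAPS are hypothetical and have to be adapted to the headers of your real exports.

```python
# Hypothetical per-tool column names; adjust them to your real export headers.
COLUMN_MAPS = {
    "tool_a": {"Source URL": "source", "Target URL": "target", "Domain Rating": "strength"},
    "tool_b": {"Referring page": "source", "Linked page": "target", "Authority": "strength"},
}


def normalize(rows, column_map):
    """Rename one export's columns to the shared source/target/strength schema."""
    normalized = []
    for row in rows:
        item = {new: row.get(old, "") for old, new in column_map.items()}
        # Strength values arrive as text; the tools' scales are NOT harmonized here.
        item["strength"] = float(item["strength"] or 0)
        normalized.append(item)
    return normalized


def merge_exports(exports):
    """Combine several exports ({tool_name: rows}) into one list of rows."""
    merged = []
    for tool, rows in exports.items():
        merged.extend(normalize(rows, COLUMN_MAPS[tool]))
    return merged


# Example rows as csv.DictReader would yield them (made-up data).
exports = {
    "tool_a": [{"Source URL": "https://a.example/post",
                "Target URL": "https://shop.example/x",
                "Domain Rating": "70"}],
    "tool_b": [{"Referring page": "https://b.example/page",
                "Linked page": "https://shop.example/y",
                "Authority": "40"}],
}
report = merge_exports(exports)
```

Optional columns such as Anchor text or Toxic Score could be added to the maps in the same way.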

We already have the necessary information. But half a million links? We still have to filter a little before we can work with it.

Before we can work effectively with the aggregated data, we have to filter it. In our example we have a table with half a million links. As you can see in the picture above, many of them are duplicates, because we have aggregated the data from different tools. These duplicates now have to be removed, along with the empty cells. This is a little cumbersome but quickly done.

Remove empty cells

  1. We filter the Link target column for all empty cells
  2. We select the rows with empty cells in the Link target column
  3. We delete the rows (do not just clear their contents, but delete the rows completely. Shortcut: Ctrl + -)
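
The same clean-up can be sketched in Python, assuming the rows are dictionaries with a "target" key (an assumed column name, not one taken from any tool's export):

```python
def drop_empty_targets(rows):
    """Keep only rows whose link target is a non-blank string."""
    return [row for row in rows if (row.get("target") or "").strip()]


# Made-up example rows: the second and third have no usable target.
rows = [
    {"source": "https://a.example/post", "target": "https://shop.example/x"},
    {"source": "https://b.example/page", "target": ""},
    {"source": "https://c.example/list", "target": "   "},
]
cleaned = drop_empty_targets(rows)
```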

Remove duplicates

  1. We sort the table by the Link strength indicator (Domain Authority or similar) so that the highest value is at the top
  2. With Ctrl + A we select all data
  3. We use the Excel function Remove Duplicates under the menu item Data and select only the Link target column when removing the duplicates
  4. We repeat step 3 until we get the message that there are no more duplicates
  5. Then we repeat steps 2 to 4 for the Link source column

The removal of duplicates must be carried out for both columns - link targets and sources - one after the other

  • Make sure that all duplicates have been removed from the Link source and Link target columns. Only if each URL in the Link target column is unique can we continue to work effectively with the report.
  • It is also important to remove duplicates for the link targets first and only then for the sources. This prevents different link targets from the same source domain from being deleted accidentally.

Since we sorted by link strength before removing the duplicates, only the strongest source remains in the report for each unique backlink target. We now have a clear list in which every link target appears exactly once, together with its strongest source - a list we can continue to work with productively. In this example, this method reduced the number of links from half a million to around 3,000.
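
The sort-then-deduplicate logic can be sketched in Python as follows. It is a minimal sketch that mirrors the Excel steps above under the assumption that each row is a dictionary with the keys "source", "target" and "strength" (assumed names); like Remove Duplicates, it keeps the first row it sees for each value.

```python
def strongest_unique(rows):
    """Keep one row per link target, then one row per link source,
    always preferring the row with the highest strength value."""
    ordered = sorted(rows, key=lambda r: r["strength"], reverse=True)
    for column in ("target", "source"):  # targets first, then sources
        seen = set()
        kept = []
        for row in ordered:
            if row[column] not in seen:
                seen.add(row[column])
                kept.append(row)
        ordered = kept
    return ordered


# Made-up example data.
rows = [
    {"source": "s1", "target": "t1", "strength": 80},
    {"source": "s2", "target": "t1", "strength": 50},
    {"source": "s1", "target": "t2", "strength": 80},
    {"source": "s3", "target": "t3", "strength": 30},
]
result = strongest_unique(rows)
```

In this sketch, a target is dropped in the second pass if its strongest source was already kept for another target (t2 in the example data), so it is worth checking the row count after each pass.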

Finally, we sort the table alphabetically by the Link target column in preparation for the next step.

Step 3: crawl the status codes of the link targets

Next we want to know which of our link targets are reachable and which are not. This is where the Screaming Frog finally comes into play: we use it to crawl the status codes of our link targets.

To do this, we first put the frog into List mode. In the menu under Configuration > Spider, on the Advanced tab, we also activate the function Always Follow Redirects. This function lets us crawl the status codes of the final destinations of redirect chains as well.

We also recommend limiting the crawl speed under Configuration > Speed to a maximum of 5 URLs per second in order not to overload the server. The Screaming Frog is now ready and we can start crawling. To do so, we copy all URLs from the link target column of our Excel sheet with Ctrl + C. We can then load them into the Screaming Frog via Upload > Paste. The URLs are read and the big crawl begins.
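
To make clear what a list crawl with redirect following produces, here is a minimal Python sketch using only the standard library. It is not how the Screaming Frog works internally; the function names are hypothetical, it sends HEAD requests (which some servers answer differently from GET), and it reports a missing response as status code 0.

```python
import http.client
import time
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def head_once(url, timeout=10):
    """Send one HEAD request without following redirects.
    Returns (status_code, location_header); status 0 means no response."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    except (OSError, http.client.HTTPException):
        return 0, None
    finally:
        conn.close()


def follow_chain(url, fetch=head_once, max_hops=10):
    """Return [(url, status), ...] for every hop of a redirect chain."""
    chain, seen = [], set()
    while url is not None and url not in seen and len(chain) < max_hops:
        seen.add(url)
        status, location = fetch(url)
        chain.append((url, status))
        if status in REDIRECT_CODES and location:
            url = urljoin(url, location)  # resolves relative Location headers
        else:
            url = None
    return chain


def crawl(urls, fetch=head_once, max_per_second=5):
    """Crawl a list of URLs, pausing between list entries.
    Hops inside a single chain are not throttled in this sketch."""
    delay = 1.0 / max_per_second
    results = {}
    for url in urls:
        results[url] = follow_chain(url, fetch)
        time.sleep(delay)
    return results
```

The fetch parameter exists so that the chain logic can be tried out with a stand-in function before any real request is sent.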

3,413 unique backlinks - depending on the number of URLs and the number of redirect chains, crawling the status codes can take a while

At this point you can see whether the analysis has hidden potential! We look in the tool under Response Codes for 404 or 500 error messages. Are any displayed? Then some backlink target URLs are inaccessible.

Lots of 404s - in this case 26% of the unique backlinks. That makes the SEO happy.
In the next step, we look at the sources from which the dead link targets come.

Step 4: Assign status codes of the link targets to the sources

Since we also want to include redirect chains in the analysis, assigning the status codes to the link targets is not entirely straightforward, because the number of URLs simply differs.

  • In this example we have 3,413 URLs whose status codes we want to crawl
  • Since we also want to take redirect chains into account, the frog follows all redirects (see step 3)
  • The consequence: including redirects, we have status codes for a total of 12,313 URLs

But there is a perfect solution for this: in the Screaming Frog, we export the report under the menu item Reports > Redirect Chains. What we get is a detailed report of all 3,413 URLs, including the status codes at every point in the redirect chains. So that we can combine the data with our backlink report, it is worth removing some superfluous columns first. We only keep the following relevant columns:

  • Address (corresponds to the link target URL in our output report)
  • Number of redirects
  • Status code 1
  • Redirect URI 1
  • Status code 2
  • Redirect URI 2
  • etc. (depending on how long the chains are)

Now we sort the Redirect Chains report alphabetically by the Address column. Since our original backlink report in Excel is also sorted alphabetically by link target URL, we can simply copy the columns from the Redirect Chains report into the original backlink report. If everything went smoothly in step 2, the alphabetically sorted columns Link target and Address should now match.

If that's the case, we can delete one of the duplicate columns of link targets. Et voilà, we have a complete backlink report with all status codes of the link targets, as well as any redirect chains and their status codes in each link of the chain.
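
Copying sorted columns side by side only works if both lists line up exactly. As an alternative, the two reports can be joined on the URL itself. The following Python sketch assumes that the backlink rows use a "target" key and that the chain rows use the column names listed above ("Address", "Status Code 1" and so on); the real export headers may differ.

```python
def attach_chains(backlinks, chain_rows):
    """Add the redirect-chain columns to each backlink row, matched on the URL."""
    by_address = {row["Address"]: row for row in chain_rows}
    combined = []
    for link in backlinks:
        chain = by_address.get(link["target"], {})
        extra = {key: value for key, value in chain.items() if key != "Address"}
        combined.append({**link, **extra})
    return combined


# Made-up example data.
backlinks = [
    {"source": "https://a.example/post", "target": "https://shop.example/old", "strength": 70},
    {"source": "https://b.example/page", "target": "https://shop.example/ok", "strength": 40},
]
chain_rows = [
    {"Address": "https://shop.example/old", "Number of Redirects": 1,
     "Status Code 1": 301, "Redirect URI 1": "https://shop.example/new",
     "Status Code 2": 404},
]
full_report = attach_chains(backlinks, chain_rows)
```

A backlink whose target does not appear in the chain report simply keeps its original columns, which makes mismatches easy to spot.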

Step 5: Analyze potential and regain link strength

Now for the analysis. Filter the table in the column Status Code 1 for 4xx and 5xx errors, and possibly also for URLs with no response (Status Code = 0) or JS redirects. It can also help to highlight the status code cells in color, because in the next steps we also have to filter the columns Status Code 2 (and possibly the following ones) for 4xx / 5xx.
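
The filter across all status code columns can be sketched in Python like this. It is a minimal sketch that assumes the column names "Status Code 1" to "Status Code 3" and does not detect JS redirects, which do not show up as a status code.

```python
def is_error(status):
    """True for no response (0) and for 4xx / 5xx status codes."""
    return status == 0 or 400 <= status <= 599


def broken_rows(report, status_columns=("Status Code 1", "Status Code 2", "Status Code 3")):
    """Return the rows in which any hop of the chain returned an error status."""
    broken = []
    for row in report:
        codes = [row[col] for col in status_columns if row.get(col) not in (None, "")]
        if any(is_error(int(code)) for code in codes):
            broken.append(row)
    return broken


# Made-up example data; status codes may arrive as numbers or as text.
report = [
    {"target": "t1", "Status Code 1": 200},
    {"target": "t2", "Status Code 1": 301, "Status Code 2": 404},
    {"target": "t3", "Status Code 1": 0},
    {"target": "t4", "Status Code 1": "301", "Status Code 2": "200"},
]
broken = broken_rows(report)
```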

Let's be honest: currently no tool on the market can create such a holistic backlink report

What we now have is a concrete overview of all backlinks from strong sources whose targets are not accessible on our site. Because it combines several data sources, this report is more comprehensive than a report from a single tool and can at the same time reveal problems with redirect chains. We can also audit each broken link by going through our questionnaire:

Questionnaire for broken link building

  • Why is the URL not reachable?
  • Should it remain unavailable, or do we want to use the link juice?
  • What is the source of the link?
  • Is the source strong & relevant?
  • Is the source spammy?
  • Is there a redirect chain? If so, how does it come about?

We should ask ourselves these questions when auditing every broken backlink. For anyone who is interested in the topic or faces some difficult decisions, I recommend my article "What are good backlinks?", which discusses in more detail how to assess the quality of a link in a qualitative analysis.

If we agree that we would like to breathe life into a dead link again, the following concrete options are available, each with advantages and disadvantages:


Option 1: Write to the domain operators of the link sources and ask them to correct the links

  • Advantage: Optimal and loss-free use of the link juice
  • Disadvantage: Very time-consuming, and not all domain operators answer or act as requested

Option 2: Forward the broken link URLs via 301

  • Advantage: Can be implemented independently, immediately and on a large scale
  • Disadvantage: Loss of up to 15% link equity via the 301 redirect
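
For the 301 option, the redirect rules can be generated from a mapping of broken URLs to new targets. The following Python sketch assumes an Apache server with mod_alias, which is an assumption and not something this guide prescribes; the mapping itself is made-up example data and has to come from your audit.

```python
from urllib.parse import urlsplit


def redirect_rules(mapping):
    """Turn {broken_url: new_url} into Apache mod_alias 'Redirect 301' lines."""
    rules = []
    for broken_url, new_url in sorted(mapping.items()):
        path = urlsplit(broken_url).path or "/"
        rules.append(f"Redirect 301 {path} {new_url}")
    return rules


# Made-up example mapping of broken link targets to their new destinations.
mapping = {
    "https://shop.example/old-category/": "https://shop.example/new-category/",
    "https://shop.example/deleted-product": "https://shop.example/products/",
}
rules = redirect_rules(mapping)
```

Note that Redirect matches path prefixes and ignores query strings, so rules for exact matches or for URLs with parameters need a different directive and should be tested before going live.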

Choose the best solution for you according to your goals and capacities. Watch your backlink profile, your domain score, your Toxic Score and your rankings very closely over the next few weeks and months, and stay alert to any major change outside of regular fluctuations.

Since we only kept the strongest link sources in the report, it can easily happen that, when setting up a redirect, you let uninvited guests into the house at the same time: referring domains with a high spam / toxic score.
It is therefore advisable to carry out a new Link Detox in the weeks following a larger broken link building campaign!
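
Because the report only keeps the strongest source per target, the weaker sources have to be looked up in the full aggregated sheet from before the duplicates were removed. A minimal Python sketch of such a check follows; the "toxic_score" key and the threshold of 60 are hypothetical and depend on the tool you use.

```python
def risky_sources(all_links, redirected_targets, toxic_threshold=60):
    """List every source that points at a redirected target and has a toxic
    score at or above the threshold (the threshold is a hypothetical example)."""
    targets = set(redirected_targets)
    flagged = []
    for link in all_links:
        if link["target"] in targets and link.get("toxic_score", 0) >= toxic_threshold:
            flagged.append((link["target"], link["source"], link["toxic_score"]))
    return flagged


# Made-up example: the full aggregated list, before duplicate removal.
all_links = [
    {"source": "https://good.example/a", "target": "https://shop.example/old", "toxic_score": 5},
    {"source": "https://spam.example/b", "target": "https://shop.example/old", "toxic_score": 85},
    {"source": "https://spam.example/c", "target": "https://shop.example/other", "toxic_score": 90},
]
flagged = risky_sources(all_links, ["https://shop.example/old"])
```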

Did this guide help you to use untapped potential? Was it possible for you to improve your rankings? Or have you encountered technical obstacles during execution? Then let us know and leave a comment.