Here’s How to Supercharge Your Competitive Research Using URL Profiler and Fusion Tables

Posted by Craig_Bradshaw

[Estimated read time: 19 minutes]

As digital marketers, the amount of data that we have to collect, process, and analyze is overwhelming. This is never more true than when we’re looking into what competitors are doing from a link building perspective.

Thankfully, there are a few things we can do to make this job a little bit easier. In this post, I want to share with you the processes I use to supercharge my analysis of competitor backlinks. You’ll learn:

  • How to use URL Profiler for bulk data collection
  • How to use fusion tables to create powerful data visualizations
  • How to build an SEO profile of the competition using URL Profiler and fusion tables

Use URL Profiler for bulk data collection

Working agency-side, one of the first things I do for every new client is build a profile of their main competitors, including those who have a shared trading profile, as well as those in their top target categories.

The reason we do this is that it provides a top-level overview of the industry and how competitive it actually is. This allows us to pick our battles and prioritize the strategies that will help move the right needles. Most importantly, it’s a scalable, repeatable process for building links.

This isn’t just useful for agencies. If you work in-house, you more than likely want to watch your competitors like a hawk in order to see what they’re doing over the course of months and years.

In order to do this, you’re inevitably going to need to pull together a lot of data. You’ll probably have to use a range of different tools and data points.

As it turns out, this sort of activity is where URL Profiler becomes very handy.

For those of you who are unfamiliar with URL Profiler, it’s a bulk data tool that allows you to collect link and domain data from thousands of URLs all at once. As you can probably imagine, this makes it an extremely powerful tool for link prospecting and research.

URL Profiler is a brilliant tool built for SEOs, by SEOs. Since every SEO I know seems to love working with Excel, the output you get from URL Profiler is, inevitably, most handy in spreadsheet format.

Once you have all this amazing bulk data, you still need to be able to interpret it and drive actionable insights for yourself and your clients.

To paraphrase the great philosopher Ben Parker: with great data power comes great tedium. I’ll be the first to admit that data can be extremely boring at times. Don’t get me wrong: I love a good spreadsheet as much as I love good coffee (more on that later); but wherever possible, I’d much rather just have something give me the actionable insights I need.

This is where the power of data visualization comes into play.

Use fusion tables for powerful data visualization

Have you ever manually analyzed one million articles to see what impact content format and length have on shares and links? Have you ever manually checked the backlink profile of a domain that has over half a million links? Have you ever manually investigated the breakdown of clicks and impressions your site gets across devices? Didn’t think so.

Thanks to Buzzsumo & Moz, Majestic, Ahrefs, and the Google Search Console, we don’t have to; we just use the information they give us to drive our strategy and decision-making.

The reason these tools are so popular is they allow you to input your data and discern actionable insights. Unfortunately, as already mentioned, we can’t easily get any actionable insights from URL Profiler. This is where fusion tables become invaluable.

If you aren’t already familiar with fusion tables, then the time has come for you to get acquainted with them.

Back in 2012, Google rolled out an “experimental” version of their fusion tables web application. They did this to help you get more from your data and tell the story of what’s going on in your niche with less effort. It’s best to think of fusion tables as Google’s answer to big data.

There are plenty of examples of how people are using fusion tables to tell their stories with data. However, for the purpose of brevity, I only want to focus on one incredibly awesome feature of fusion tables — the network graph.


If fusion tables are Google’s answer to big data, then the network graph feature is definitely Google’s answer to Cerebro from X-Men.

I won’t go into too many details about what network graphs are (you can read more about them here), as I would much rather talk about their practical applications for competitive analysis.

Note: There is a fascinating post on The Moz Blog by Kelsey Libert about effective influencer marketing that uses network graphs to illustrate relationships. You should definitely check that post out.

I’d been using URL Profiler and fusion tables in isolation from each other for quite a while, and they each worked very well, before I figured out how to combine their strengths. The result is a process that combines the pure data collection power of URL Profiler with the actionable insights that network graphs provide.

I’ve outlined my process below. Hopefully, it will allow you to do something similar yourself.

Build a competitive SEO profile with URL Profiler and fusion tables

To make this process easier to follow, we’ll pretend we’re entering the caffeinated, yet delicious space of online coffee subscriptions. (I’ve chosen to use this particular niche in our example for no reason other than the fact that I love coffee.) Let’s call our hypothetical online coffee subscription company “Grindhaus.”

Step 1: Assess your competition

We’ll start by looking at the single keyword “buy coffee online.” A Google search (UK) gives us the top 10 that we’ll need to crack if we want to see any kind of organic progress. The first few results look like this:

Step 2: Gather your data

However, we’ve already said that we want to scale up our analysis, and we want to see a large cross-section of the key competitors in our industry. Thankfully, there’s another free tool that comes in handy for this. The folks over at URL Profiler offer a number of free tools for Internet marketers, one of which is called the SERP Scraper. No prizes for guessing what it does: add in all the main categories and keywords you want to target and hit scrape.


As you can see from the image above, you can do this for a specific keyword or set of keywords. You can also select which country-specific results you want to pull, as well as the total number of results you want for each query.

It should only take a minute or so to get the results of the scrape in a spreadsheet that looks something like this:


In theory, these are the competitors we’ll need to benchmark against in order for Grindhaus to see any sort of organic progress.

From here, we’ll need to gather the backlink profiles for the companies listed in the spreadsheet one at a time. I prefer to use Majestic, but you can use any backlink crawling tool you like. You’ll also need to do the same for your own domain, which will make it easier to see the domains you already have links from when it’s time to perform your analysis.

After this is done, you will have a file for your own domain, as well as a file for each one of the competitors you want to investigate. I recommend investigating a minimum of five competitors in order to obtain a data set large enough to draw useful insights from.

Next, what we need to do is clean up the data so that we have all the competitor link data in one big CSV file. I organize my data using a simple two-column format, as follows:

  • The first column contains the competitor being linked to. I’ve given this column the imaginative heading “Competitor.”
  • The second column contains the domains that are linking to your competitors. I’ve labeled this column “URL” because this is the column header the URL Profiler tool recognizes as the column to pull metrics from.
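If you’d rather not build that combined file by hand, the merge is easy to script. Here’s a minimal Python sketch using the standard library; the file contents and the “Referring Domain” header are made up for illustration, so swap in the actual column name your backlink tool uses in its exports.

```python
import csv
import io

# Stand-ins for the CSV files exported from your backlink tool, one per
# competitor. The "Referring Domain" header is hypothetical; use whatever
# column your tool (Majestic, Ahrefs, etc.) puts linking domains in.
exports = {
    "competitor-a.com": io.StringIO(
        "Referring Domain\ncoffeegeek.com\nbeanhunter.com\n"),
    "competitor-b.com": io.StringIO(
        "Referring Domain\ncoffeegeek.com\ntoomuchcoffee.com\n"),
}

combined = []
for competitor, handle in exports.items():
    for record in csv.DictReader(handle):
        combined.append({"Competitor": competitor,
                         "URL": record["Referring Domain"]})

# Write the merged two-column CSV: "Competitor" for grouping, and "URL",
# the header URL Profiler recognizes as the column to pull metrics for.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["Competitor", "URL"])
writer.writeheader()
writer.writerows(combined)
print(out.getvalue())
```

With real exports, you’d point this at the downloaded files and write the result to disk instead of an in-memory buffer.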

Once you have done this, you should have a huge list of the referring domains for your competitors that looks something like this:


This is where the fun begins.

Step 3: Gather even more data

Next, let’s take every domain that is linking to one, some, or all of your competitors and run the whole list through URL Profiler in one go. Doing this will pull back all the metrics we want to see.

It’s worth noting that you don’t need any additional paid tools or APIs to use URL Profiler, but you will have to set up a couple of API keys. I won’t go into detail here on how to do this, as there are already plenty of readily available resources explaining it.

One of the added benefits of doing this through URL Profiler is that you can use its “Import and Merge” feature to append metrics to an existing CSV. Otherwise, you would have to do this by using some real Excel wizardry or by tediously copying and pasting extreme amounts of data to and from your clipboard.

As I’ve already mentioned, URL Profiler allows me to extract both page-level and domain-level data. However, in this case, the domain metrics are what I’m really interested in, so we’ll only examine these in detail here.

Majestic, Moz, and Ahrefs metrics

Typically, SEOs will pledge allegiance to one of these three big tools of the trade: Majestic, Moz, or Ahrefs. Thankfully, with URL Profiler, you can collect data from any or all of these tools. All you need to do is tick the corresponding boxes in the Domain Level Data selection area, as shown below.

In most cases, the basic metrics for each of the tools will suffice. However, we also want to be able to assess the relevance of a potential link, so we’ll also need Topical Trust Flow data from Majestic. To turn this on, go to Settings > Link Metrics using the top navigation and tick the “Include Topical Trust Flow metrics” box under the Majestic SEO option.

Doing this will allow us to see the three main topics of the links back to a particular domain. The first topic and its corresponding score will give us the clearest indication of what type of links are pointing back to the domain we’re looking at.

In the case of our Grindhaus example, we’ll most likely be looking for sites that score highly in the “Recreation/Food” category. We want to do this because relevance is a key factor in link quality. If we’re selling coffee, then links from health and fitness sites would be useful, relevant, and more likely to be natural. Links from engineering sites, on the other hand, would be pretty irrelevant, and would probably look unnatural if assessed by a Google quality rater.
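Once the Topical Trust Flow columns are in your export, this relevance check is easy to automate. A small sketch, with made-up rows and illustrative column names (check the exact headers in your own URL Profiler CSV):

```python
# Hypothetical rows from the URL Profiler export. The Topical Trust Flow
# column names here are illustrative, not URL Profiler's exact headers.
domains = [
    {"URL": "coffeegeek.com",   "TTF Topic 0": "Recreation/Food", "TTF Value 0": 38},
    {"URL": "gears.example",    "TTF Topic 0": "Engineering",     "TTF Value 0": 41},
    {"URL": "slowbrew.example", "TTF Topic 0": "Recreation/Food", "TTF Value 0": 12},
]

# Keep only domains whose strongest topic suggests food/drink relevance,
# ordered strongest-signal first.
relevant = sorted(
    (d for d in domains if d["TTF Topic 0"] == "Recreation/Food"),
    key=lambda d: d["TTF Value 0"],
    reverse=True,
)
print([d["URL"] for d in relevant])
```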

Social data

Although the importance of social signals in SEO is heavily disputed, it’s commonly agreed that social signals can give you a good idea of how popular a site is. Collecting this sort of information will help us to identify sites with a large social presence, which in theory will help to increase the reach of our brand and our content. In contrast, we can also use this information to filter out sites with a lack of social presence, as they’re likely to be of low quality.

Social Shares

Ticking “Social Shares” will bring back social share counts for the site’s homepage. Specifically, it will give you the number of Facebook likes, Facebook shares, Facebook comments, Google plus-ones, LinkedIn shares, and Pinterest pins.

Social Accounts

Selecting “Social Accounts” will return the social profile URLs of any accounts that are linked via the domain. This will return data across the following social networks: Twitter, Google Plus, Facebook, LinkedIn, Pinterest, YouTube, and Instagram.

Traffic

In the same way that sites with strong social signals give us an indication of their relative popularity, the same can also be said for sites that have strong levels of organic traffic. Unfortunately, without having direct access to a domain’s actual traffic figures, the best we can do is use estimated traffic.

This is where the “SEMrush Rank” option comes into play, as this will give us SEMrush’s estimate of organic traffic to any given domain, as well as the number of keywords it ranks for organically. It also gives us AdWords data, but that isn’t particularly useful for this exercise.

It’s worth mentioning one more time that this is an estimate of organic traffic, not an actual figure. But it can give you a rough sense of relative traffic between the sites included in your research. Rand conducted an empirical study on traffic prediction accuracy back in June; it’s well worth a read, in my opinion.

Indexation

One final thing we may want to look at is whether or not a domain is indexed by Google. If a domain doesn’t appear in the index at all, it may well have been deindexed, which suggests that Google doesn’t trust it. Using proxies for this feature is recommended, as it automatically queries Google in bulk, and Google is not particularly thrilled when you do this!

After you’ve selected all the metrics you want to collect for your list of URLs, hit “Run Profiler” and go make yourself a coffee while it runs. (I’d personally go with a nice flat white or a cortado.)

For particularly large lists of URLs, it can sometimes take a while, so it’s best to collect the data a day or two before you plan to do the analysis. For the example in this post, it took around three hours to pull back data for over 10,000 URLs, but I could have it running in the background while working on other things.

Step 4: Clean up your data

One of the downsides of collecting all of this delicious data is that there are invariably going to be columns we won’t need. Therefore, once you have your data, it’s best to clean it up, as there’s a limit on the number of columns you can have in a fusion table.

You’ll only need the combined results tab from your URL Profiler output, so you can delete the other tabs and re-save your file in CSV format.
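If you’d rather script the column pruning than delete columns by hand in Excel, here’s a minimal sketch; the sample data and column headers are illustrative, so match them to whatever your actual export contains.

```python
import csv
import io

# A cut-down stand-in for URL Profiler's combined results sheet saved as
# CSV. The headers are illustrative, not the tool's exact output.
raw = io.StringIO(
    "URL,Domain Authority,Server,Robots,CitationFlow\n"
    "coffeegeek.com,55,nginx,index,38\n"
    "beanhunter.com,44,apache,index,30\n"
)

# The columns we actually need downstream; everything else is dropped so
# the file stays within Fusion Tables' column limit.
keep = ["URL", "Domain Authority", "CitationFlow"]

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=keep, extrasaction="ignore")
writer.writeheader()
for row in csv.DictReader(raw):
    writer.writerow(row)  # extra columns are silently discarded
print(out.getvalue())
```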

Step 5: Create your new fusion table

Head on over to Google Drive, and then click New > More > Google Fusion Tables.

If you can’t see the “Google Fusion Tables” option, you’ll have to select the “Connect More Apps” option and install Fusion Tables from there:

From here, it’s pretty straightforward. Simply upload your CSV file and you’ll then be given a preview of what your table will look like.

Click “Next” and all your data should be imported into a new table faster than you can say “caffeine.”

Step 6: Create a network graph

Once you have your massive table of data, you can create your network graph by clicking on the small red “+” sign next to the “Cards” tab at the top of your table. Choose “Add Chart” and you’ll be presented with a range of chart options. The one we’re interested in is the network graph option:

Once you’ve selected this option, you’ll then be asked to configure your network graph. We’re primarily interested in the link between our competition and their referring domains.

However, the relationship only goes in one direction: I, the referring website, give you, the retailer, a link. Therefore, we should tick the “Link is directional” option, as well as “Color by columns” to make it easier to distinguish between competitors and referring domains.

By default, the network graph is weighted by whatever is in the third column — in this case, it’s Majestic CitationFlow, so our blue nodes are sized by how high the CitationFlow is for a referring domain. Almost instantly, you can spot the sites that are the most influential based on how many sites link to them.

This is where the real fun begins.

One interesting thing to do with this visualization that will save you a lot of time is to reduce the number of visible nodes. However, there’s no science to this, so be careful you’re not missing something.

As you increase the number of nodes shown, more and more blue links begin to appear. At around 2,000 nodes, it’ll start to become unresponsive. This is where the filter feature comes in handy, as you can filter out the sites that don’t meet your chosen quality thresholds, such as low Page Authority or a large number of outbound links.
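The same quality filtering can be done before the data ever reaches the fusion table, which keeps the graph responsive. A sketch with invented thresholds and field names; every SEO will have their own numbers here.

```python
# Illustrative quality thresholds; tune these for your own niche.
MIN_PAGE_AUTHORITY = 20
MAX_EXTERNAL_LINKS = 5000

# Sample referring-domain rows; the field names are made up for the sketch.
referring = [
    {"URL": "coffeegeek.com",   "PA": 55, "External Links": 1200},
    {"URL": "spamfarm.example", "PA": 8,  "External Links": 90000},
    {"URL": "beanhunter.com",   "PA": 34, "External Links": 800},
]

# Keep only domains that clear both thresholds before graphing them.
qualified = [
    d for d in referring
    if d["PA"] >= MIN_PAGE_AUTHORITY
    and d["External Links"] <= MAX_EXTERNAL_LINKS
]
print([d["URL"] for d in qualified])
```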

So what does this tell us — other than there appears to be a relatively level playing field, which means there is a low barrier to entry for Grindhaus?

This visualization gives me a very clear picture of where my competition is getting their links from.

In the example above, I’ve used a filter to only show referring domains that have more than 100,000 social shares. This leaves me with 137 domains that I know have a strong social following that would definitely help me increase the reach of my content.

You can check out the complete fusion table and network graph here.

Step 7: Find your mutant characteristics

Remember how I compared network graphs to Google’s answer to Cerebro from X-Men? Well, this is where I actually explain what I meant.

For those of you that are unfamiliar with the X-Men universe, Cerebro is a device that amplifies the brainwaves of humans. Most notably, it allows telepaths to distinguish between humans and mutants by finding the presence of the X-gene in a mutant’s body.

Using network graphs, we can specify our own X-gene and use it to quickly find high-quality and relevant link opportunities. For example, we could include sites that have a Domain Authority greater than or equal to 50:


For Grindhaus, this filter finds 242 relevant nodes (out of 10,740 in total). In theory, these are domains Google would potentially see as more trustworthy and authoritative. Therefore, they should definitely be considered as potential link-building opportunities.

You should be able to see that there are some false positives in here, including Blogspot, Feedburner, and Google. However, these are outweighed by an abundance of extremely authoritative and relevant domains, including Men’s Health, GQ Magazine, and Vogue.co.uk.

Sites that have “Recreation/Food” as their primary Topical Trust Flow Topic:


This filter finds 361 relevant nodes out of the 10,740 total, all with “Recreation/Food” as their primary Topical Trust Flow Topic.

Looking at this example in more detail, we see another cool feature of network graphs: the nodes with the most connections always sit in the center of the graph. This means you can quickly identify the domains that link to more than one of your competitors, as indicated by the multiple yellow lines. This works in a similar way to Majestic’s “Clique Hunter” feature and Moz’s “Link Intersect” tool.

However, you can do this on a much bigger scale, with a wider range of metrics at your fingertips.


In this case, toomuchcoffee.com, coffeegeek.com, and beanhunter.com would be three domains I would definitely investigate further in order to see how I could get a link from them for my own company.
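If you want the same intersect logic outside of the graph view, it takes only a few lines over the two-column data built in Step 2. A sketch, with invented competitor and domain names:

```python
from collections import Counter

# (Competitor, referring domain) pairs in the two-column shape from
# Step 2; all names are invented for the sketch.
links = [
    ("competitor-a.com", "coffeegeek.com"),
    ("competitor-b.com", "coffeegeek.com"),
    ("competitor-c.com", "coffeegeek.com"),
    ("competitor-a.com", "beanhunter.com"),
    ("competitor-b.com", "beanhunter.com"),
    ("competitor-a.com", "onetime.example"),
]

# Count distinct competitors per referring domain: a DIY equivalent of
# Majestic's Clique Hunter or Moz's Link Intersect.
per_domain = Counter(domain for _, domain in set(links))

# Domains linking to two or more competitors are the warmest prospects.
intersect = [d for d, n in per_domain.most_common() if n >= 2]
print(intersect)
```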

Sites that are estimated to get over 100,000 organic visits, weighted by social shares:


For our Grindhaus example, this filter finds 174 relevant nodes out of 10,740, all of which are estimated to receive more than 100,000 organic visits per month. However, I have also weighted these nodes by “Homepage Total Shares.” This allows me to see the sites that have strong social followings and are also estimated to receive considerable amounts of organic traffic (i.e., “estimorganic” traffic).

By quickly looking at this network graph, we can immediately see some authoritative news sites such as The Guardian, the BBC, and the Wall Street Journal near the center, as well as quite a few university sites (as denoted by the .ac.uk TLD).

Using this data, I would potentially look into reaching out to relevant editors and journalists to see if they’re planning on covering National Coffee Week and whether they’d be interested in a quote from Grindhaus on, say, coffee consumption trends.

For the university sites, I’d look at reaching out with a discount code to undergraduate students, or perhaps take it a bit more niche by offering samples to coffee societies on campus like this one.

This is barely scratching the surface of what you can do with competitor SEO data in a fusion table. SEOs and link builders will all have their own quality and relevance thresholds, and will also place a particular emphasis on certain variables, such as Domain Authority or total referring domains. This process lets you collect, process, and analyze your data however you see fit, allowing you to quickly find your most relevant sites to target for links.

Step 8: Publish and share your amazing visualization

Now that you have an amazing network graph, you can embed it in a webpage or blog post. You can also send a link by email or IM, which is perfect for sharing with other people in your team, or even for sharing with your clients so you can communicate the story of the work you’re undertaking more easily.

Note: Typically, I recommend repeating this process every three months.

Summary and caveats

Who said that competitive backlink research can’t be fun? Aside from being able to collect huge amounts of data using URL Profiler, with network graphs you can also visualize the connections between your data in a simple, interactive map.

Hopefully, I’ve inspired you to go out and replicate this process for your own company or clients. Nothing would fill me with more joy than hearing tales of how this process has added an extra level of depth and scale to your competitive analysis, as well as given you favorable results.

However, I wouldn’t be worth my salt as a strategist if I didn’t end this post with a few caveats:

Caveat 1: Fusion tables are still classed as “experimental,” so things won’t always run smoothly. The feature could also disappear altogether overnight, although my fingers (and toes) are crossed that it doesn’t.

Caveat 2: Hundreds of factors go into Google’s ranking algorithm, and this type of link analysis alone does not tell the full story. However, links are still seen as an incredibly important signal, which means that this type of analysis can give you a great foundation to build on.

Caveat 3: To shoehorn one last X-Men analogy in… using Cerebro can be extremely dangerous, and telepaths without well-trained, disciplined minds put themselves at great risk when attempting to use it. The same is true for competitive researchers. However, poor-quality link building won’t result in insanity, coma, permanent brain damage, or even death. The side effects are actually much worse!

In this age of penguins and penalties, links are all too often still treated as a commodity. I’m not saying you should go out and try to get every single link your competitors have. My emphasis is on quality over quantity. This is why I like to thoroughly qualify every single site I may want to try and get a link from. The job of doing competitive backlink research using this method is to assess every possible option and filter out the websites you don’t want links from. Everything that’s left is considered a potential target.

I’m genuinely very interested to hear your ideas on how else network graphs could be used in SEO circles. Please share them in the comments below.


from Raymond Castleberry Blog http://raymondcastleberry.blogspot.com/2016/03/heres-how-to-supercharge-your_31.html
via http://raymondcastleberry.blogspot.com

Here’s How to Supercharge Your Competitive Research Using a URL Profiler and Fusion Tables

Posted by Craig_Bradshaw

[Estimated read time: 19 minutes]

As digital marketers, the amount of data that we have to collect, process, and analyze is overwhelming. This is never more true than when we’re looking into what competitors are doing from a link building perspective.

Thankfully, there are a few things we can do to make this job a little bit easier. In this post, I want to share with you the processes I use to supercharge my analysis of competitor backlinks. In this post, you’ll learn:

  • How to use URL Profiler for bulk data collection
  • How to use fusion graphs to create powerful data visualizations
  • How to build an SEO profile of the competition using URL Profiler and fusion tables

Use URL Profiler for bulk data collection

Working agency-side, one of the first things I do for every new client is build a profile of their main competitors, including those who have a shared trading profile, as well as those in their top target categories.

The reason we do this is that it provides a top-level overview of the industry and how competitive it actually is. This allows us to pick our battles and prioritize the strategies that will help move the right needles. Most importantly, it’s a scalable, repeatable process for building links.

This isn’t just useful for agencies. If you work in-house, you more than likely want to watch your competitors like a hawk in order to see what they’re doing over the course of months and years.

In order to do this, you’re inevitably going to need to pull together a lot of data. You’ll probably have to use a range of many different tools and data points.

As it turns out, this sort of activity is where URL Profiler becomes very handy.

For those of you who are unfamiliar with URL Profiler is, it’s a bulk data tool that allows you to collect link and domain data from thousands of URLs all at once. As you can probably imagine, this makes it an extremely powerful tool for link prospecting and research.

URL Profiler is a brilliant tool built for SEOs, by SEOs. Since every SEO I know seems to love working with Excel, the output you get from URL Profiler is, inevitably, most handy in spreadsheet format.

Once you have all this amazing bulk data, you still need to be able to interpret it and drive actionable insights for yourself and your clients.

To paraphrase the great philosopher Ben Parker: with great data power comes great tedium. I’ll be the first to admit that data can be extremely boring at times. Don’t get me wrong: I love a good spreadsheet as much as I love good coffee (more on that later); but wherever possible, I’d much rather just have something give me the actionable insights I need.

This is where the power of data visualization comes into play.

Use fusion tables for powerful data visualization

Have you ever manually analyzed one million articles to see what the impact of content format and length has on shares on links? Have you ever manually checked the backlink profile of a domain that has over half a million links? Have you ever manually investigated the breakdown of clicks and impressions your site gets across devices? Didn’t think so.

Thanks to Buzzsumo & Moz, Majestic, Ahrefs, and the Google Search Console, we don’t have to; we just use the information they give us to drive our strategy and decision-making.

The reason these tools are so popular is they allow you to input your data and discern actionable insights. Unfortunately, as already mentioned, we can’t easily get any actionable insights from URL Profiler. This is where fusion tables become invaluable.

If you aren’t already familiar with fusion tables, then the time has come for you to get acquainted with them.

Back in 2012, Google rolled out an “experimental” version of their fusion tables web application. They did this to help you get more from your data and tell the story of what’s going on in your niche with less effort. It’s best to think of fusion tables as Google’s answer to big data.

There are plenty of examples of how people are using fusion tables to tell their stories with data. However, for the purpose of brevity, I only want to focus on one incredibly awesome feature of fusion tables — the network graph.

h8SDcTN.png

If fusion tables are Google’s answer to big data, then the network graph feature is definitely Google’s answer to Cerebro from X-Men.

I won’t go into too many details about what network graphs are (you can read more about them here), as I would much rather talk about their practical applications for competitive analysis.

Note: There is a fascinating post on The Moz Blog by Kelsey Libert about effective influencer marketing that uses network graphs to illustrate relationships. You should definitely check that post out.

I’d been using URL Profiler and fusion tables tools in isolation of each other for quite a while — and they each worked very well — before I figured out how to combine their strengths. The result is a process that combines the pure data collection power of URL Profiler with the actionable insights that fusion graphs provide.

I’ve outlined my process below. Hopefully, it will allow you to do something similar yourself.

Build a competitive SEO profile with URL Profiler and fusion tables

To make this process easier to follow, we’ll pretend we’re entering the caffeinated, yet delicious space of online coffee subscriptions. (I’ve chosen to use this particular niche in our example for no reason other than the fact that I love coffee.) Let’s call our hypothetical online coffee subscription company “Grindhaus.”

Step 1: Assess your competition

We’ll start by looking at the single keyword “buy coffee online.” A Google search (UK) gives us the top 10 that we’ll need to crack if we want to see any kind of organic progress. The first few results look like this: zjDG2Tc.png?1

Step 2: Gather your data

However, we’ve already said that we want to scale up our analysis, and we want to see a large cross-section of the key competitors in our industry. Thankfully, there’s another free tool that comes in handy for this. The folks over at URL Profiler offer a number of free tools for Internet marketers, one of which is called the SERP Scraper. No prizes for guessing what it does: add in all the main categories and keywords you want to target and hit scrape.

e3jAb81.png?1

As you can see from the image above, you can do this for a specific keyword or set of keywords. You can also select which country-specific results you want to pull, as well as the total number of results you want for each query.

It should only take a minute or so to get the results of the scrape in a spreadsheet that looks something like this:

sNko03Z.png

In theory, these are the competitors we’ll need to benchmark against in order for Grindhaus to see any sort of organic progress.

From here, we’ll need to gather the backlink profiles for the companies listed in the spreadsheet one at a time. I prefer to use Majestic, but you can use any backlink crawling tool you like. You’ll also need to do the same for your own domain, which will make it easier to see the domains you already have links from when it’s time to perform your analysis.

After this is done, you will have a file for your own domain, as well as a file for each one of the competitors you want to investigate. I recommend investigating a minimum of five competitors in order to obtain a data set large enough to obtain useful insights from.

Next, what we need to do is clean up the data so that we have all the competitor link data in one big CSV file. I organize my data using a simple two-column format, as follows:

  • The first column contains the competitor being linked to. I’ve given this column the imaginative heading “Competitor.”
  • The second column contains the domains that are linking to your competitors. I’ve labeled this column “URL” because this is the column header the URL Profiler tool recognizes as the column to pull metrics from.

Once you have done this, you should have a huge list of the referring domains for your competitors that looks something like this:

[Image: combined Competitor/URL spreadsheet]

This is where the fun begins.

Step 3: Gather even more data

Next, let’s take the full list of domains linking to one, some, or all of your competitors and run it through URL Profiler. Doing this will pull back all the metrics we want to see.

It’s worth noting that you don’t need any additional paid tools or APIs to use URL Profiler, but you will have to set up a couple of API keys. I won’t go into detail on how to do this here, as there are already plenty of resources explaining it, including here and here.

One of the added benefits of doing this through URL Profiler is that you can use its “Import and Merge” feature to append metrics to an existing CSV. Otherwise, you would have to do this by using some real Excel wizardry or by tediously copying and pasting extreme amounts of data to and from your clipboard.

As I’ve already mentioned, URL Profiler allows me to extract both page-level and domain-level data. However, in this case, the domain metrics are what I’m really interested in, so we’ll only examine these in detail here.

Majestic, Moz, and Ahrefs metrics

Typically, SEOs will pledge allegiance to one of these three big tools of the trade: Majestic, Moz, or Ahrefs. Thankfully, with URL Profiler, you can collect data from any or all of these tools. All you need to do is tick the corresponding boxes in the Domain Level Data selection area, as shown below.

[Image: Domain Level Data options]

In most cases, the basic metrics for each of the tools will suffice. However, we also want to be able to assess the relevance of a potential link, so we’ll also need Topical Trust Flow data from Majestic. To turn this on, go to Settings > Link Metrics using the top navigation and tick the “Include Topical Trust Flow metrics” box under the Majestic SEO option.

Doing this will allow us to see the three main topics of the links back to a particular domain. The first topic and its corresponding score will give us the clearest indication of what type of links are pointing back to the domain we’re looking at.

In the case of our Grindhaus example, we’ll most likely be looking for sites that score highly in the “Recreation/Food” category. We want this because relevance is a key factor in link quality. If we’re selling coffee, then links from health and fitness sites would be useful, relevant, and (more likely to be) natural. Links from engineering sites, on the other hand, would be pretty irrelevant, and would probably look unnatural if assessed by a Google quality rater.

Social data

Although the importance of social signals in SEO is heavily disputed, it’s commonly agreed that social signals can give you a good idea of how popular a site is. Collecting this sort of information will help us to identify sites with a large social presence, which in theory will help to increase the reach of our brand and our content. In contrast, we can also use this information to filter out sites with a lack of social presence, as they’re likely to be of low quality.

Social Shares

Ticking “Social Shares” will bring back social share counts for the site’s homepage. Specifically, it will give you the number of Facebook likes, Facebook shares, Facebook comments, Google plus-ones, LinkedIn shares, and Pinterest pins.

Social Accounts

Selecting “Social Accounts” will return the social profile URLs of any accounts that are linked via the domain. This will return data across the following social networks: Twitter, Google Plus, Facebook, LinkedIn, Pinterest, YouTube, and Instagram.

Traffic

In the same way that sites with strong social signals give us an indication of their relative popularity, the same can also be said for sites that have strong levels of organic traffic. Unfortunately, without having direct access to a domain’s actual traffic figures, the best we can do is use estimated traffic.

This is where the “SEMrush Rank” option comes into play, as it gives us SEMrush’s estimate of organic traffic to any given domain, as well as the number of keywords that domain ranks for organically. It also returns AdWords data, but that isn’t particularly useful for this exercise.

It’s worth mentioning one more time that this is an estimate of organic traffic, not an actual figure, but it can give you a rough sense of relative traffic between the sites included in your research. Rand conducted an empirical study on traffic prediction accuracy back in June; it’s well worth a read, in my opinion.

Indexation

One final thing we may want to look at is whether or not a domain is indexed by Google. If a site doesn’t appear in the index, it may well have been deindexed, which suggests that Google doesn’t trust that particular domain. The use of proxies for this feature is recommended, as it automatically queries Google in bulk, and Google is not particularly thrilled when you do this!

After you’ve selected all the metrics you want to collect for your list of URLs, hit “Run Profiler” and go make yourself a coffee while it runs. (I’d personally go with a nice flat white or a cortado.)

For particularly large lists of URLs, it can sometimes take a while, so it’s best to collect the data a day or two in advance of when you plan to do the analysis. For the example in this post, it took around three hours to pull back data for over 10,000 URLs, but I was able to leave it running in the background while working on other things.

Step 4: Clean up your data

One of the downsides of collecting all of this delicious data is that there are invariably going to be columns we won’t need. Therefore, once you have your data, it’s best to clean it up, as there’s a limit on the number of columns you can have in a fusion table.

You’ll only need the combined results tab from your URL Profiler output, so you can delete the other results tab and re-save your file in CSV format.
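If you’d rather script the cleanup, here’s a minimal sketch. The column names and the tiny inline data set are assumptions standing in for a real “Combined Results” export; swap in your own file and the columns you actually want to keep:

```python
import csv

# Columns we want to carry into the fusion table (assumed names;
# match them to your own URL Profiler export).
keep = ["Competitor", "URL", "Moz Domain Authority", "Majestic CitationFlow"]

# A tiny stand-in for a URL Profiler "Combined Results" export;
# real exports have dozens of columns, most of which we won't need.
raw = [
    {"Competitor": "rival-a", "URL": "coffeegeek.com",
     "Moz Domain Authority": "62", "Majestic CitationFlow": "55",
     "HTTP Status Code": "200"},
    {"Competitor": "rival-b", "URL": "beanhunter.com",
     "Moz Domain Authority": "48", "Majestic CitationFlow": "40",
     "HTTP Status Code": "200"},
]

# extrasaction="ignore" silently drops any column not in `keep`.
with open("combined_results_clean.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=keep, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(raw)
```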

Step 5: Create your new fusion table

Head on over to Google Drive, and then click New > More > Google Fusion Tables.

If you can’t see the “Google Fusion Tables” option, you’ll have to select the “Connect More Apps” option and install Fusion Tables from there.

From here, it’s pretty straightforward. Simply upload your CSV file and you’ll then be given a preview of what your table will look like.

Click “Next” and all your data should be imported into a new table faster than you can say “caffeine.”

Step 6: Create a network graph

Once you have your massive table of data, you can create your network graph by clicking on the small red “+” sign next to the “Cards” tab at the top of your table. Choose “Add Chart” and you’ll be presented with a range of chart options. The one we’re interested in is the network graph option:

[Image: chart type options]

Once you’ve selected this option, you’ll then be asked to configure your network graph. We’re primarily interested in the link between our competition and their referring domains.

However, the relationship only goes in one direction: I, the referring website, give you, the retailer, a link. Thus the connection. Therefore, we should tick the “Link is directional” and “Color by columns” options to make it easier to distinguish between the two.

By default, the network graph is weighted by whatever is in the third column — in this case, it’s Majestic CitationFlow, so our blue nodes are sized by how high the CitationFlow is for a referring domain. Almost instantly, you can spot the sites that are the most influential based on how many sites link to them.

This is where the real fun begins.

One interesting thing to do with this visualization that will save you a lot of time is to reduce the number of visible nodes. However, there’s no science to this, so be careful you’re not missing something.

As you increase the number of nodes shown, more and more blue links begin to appear. At around 2,000 nodes, it’ll start to become unresponsive. This is where the filter feature comes in handy, as you can filter out the sites that don’t meet your chosen quality thresholds, such as low Page Authority or a large number of outbound links.

So what does this tell us — other than there appears to be a relatively level playing field, which means there is a low barrier to entry for Grindhaus?

This visualization gives me a very clear picture of where my competition is getting their links from.

[Image: network graph filtered by social shares]

In the example above, I’ve used a filter to only show referring domains that have more than 100,000 social shares. This leaves me with 137 domains that I know have a strong social following that would definitely help me increase the reach of my content.

You can check out the complete fusion table and network graph here.

Step 7: Find your mutant characteristics

Remember how I compared network graphs to Google’s answer to Cerebro from X-Men? Well, this is where I actually explain what I meant.

For those of you that are unfamiliar with the X-Men universe, Cerebro is a device that amplifies the brainwaves of humans. Most notably, it allows telepaths to distinguish between humans and mutants by finding the presence of the X-gene in a mutant’s body.

Using network graphs, we can specify our own X-gene and use it to quickly find high-quality and relevant link opportunities. For example, we could include sites that have a Domain Authority greater than or equal to 50:

[Image: network graph filtered by Domain Authority ≥ 50]

For Grindhaus, this filter finds 242 relevant nodes (out of a total of 10,740). In theory, these are domains Google would potentially see as more trustworthy and authoritative; therefore, they should definitely be considered as potential link-building opportunities.
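The same filter logic can be sketched outside the fusion table UI. The rows and field name below are hypothetical stand-ins for your cleaned export:

```python
# Hypothetical rows from the cleaned URL Profiler export; in practice
# you'd load these from your CSV. This mirrors the fusion table filter.
domains = [
    {"URL": "menshealth.com", "Moz Domain Authority": 85},
    {"URL": "tinycoffeeblog.example", "Moz Domain Authority": 22},
    {"URL": "gq-magazine.co.uk", "Moz Domain Authority": 78},
]

# Keep only referring domains at or above the DA 50 threshold.
authoritative = [d for d in domains if d["Moz Domain Authority"] >= 50]
```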

You should be able to see that there are some false positives in here, including Blogspot, Feedburner, and Google. However, these are outweighed by an abundance of extremely authoritative and relevant domains, including Men’s Health, GQ Magazine, and Vogue.co.uk.

Sites that have “Recreation/Food” as their primary Topical Trust Flow Topic:

[Image: network graph filtered by Topical Trust Flow topic]

This filter finds 361 relevant nodes out of a total of 10,740, all of which have “Recreation/Food” as their primary Topical Trust Flow Topic.
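As a sketch, the equivalent topical filter looks like this (the field name and rows are assumptions; match them to your own export):

```python
# Sample rows with Majestic Topical Trust Flow data; the field name is
# an assumption standing in for the column in your URL Profiler export.
domains = [
    {"URL": "coffeegeek.com", "Topical Trust Flow Topic 0": "Recreation/Food"},
    {"URL": "menshealth.com", "Topical Trust Flow Topic 0": "Health/Fitness"},
    {"URL": "beanhunter.com", "Topical Trust Flow Topic 0": "Recreation/Food"},
]

# Keep only domains whose primary topic matches our target category.
relevant = [d for d in domains
            if d["Topical Trust Flow Topic 0"] == "Recreation/Food"]
```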

Looking at this example in more detail, we see that another cool feature of network graphs is that the nodes that have the most connections are always in the center of the graph. This means you can quickly identify the domains that link to more than one of your competitors, as indicated by the multiple yellow lines. This works in a similar way to Majestic’s “Click Hunter” feature and Moz’s “Link Intersect” tool.

However, you can do this on a much bigger scale, having a wider range of metrics at your fingertips.


In this case, toomuchcoffee.com, coffeegeek.com, and beanhunter.com would be three domains I would definitely investigate further in order to see how I could get a link from them for my own company.
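If you want to replicate this “link intersect” idea directly on your raw data, a sketch might look like this (the competitor/domain pairs are made up for illustration):

```python
from collections import defaultdict

# Competitor -> referring domain pairs, as in the two-column sheet
# built earlier (sample data; load your own CSV in practice).
links = [
    ("rival-a", "coffeegeek.com"),
    ("rival-b", "coffeegeek.com"),
    ("rival-c", "coffeegeek.com"),
    ("rival-a", "beanhunter.com"),
    ("rival-b", "beanhunter.com"),
    ("rival-a", "onelinkwonder.example"),
]

# Map each referring domain to the set of competitors it links to.
linked_by = defaultdict(set)
for competitor, domain in links:
    linked_by[domain].add(competitor)

# Domains linking to two or more competitors, best prospects first.
intersect = sorted(linked_by, key=lambda d: len(linked_by[d]), reverse=True)
intersect = [d for d in intersect if len(linked_by[d]) >= 2]
```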

Sites that are estimated to get over 100,000 organic visits, weighted by social shares:

[Image: network graph of high-traffic domains, weighted by social shares]

For Grindhaus, this filter finds 174 relevant nodes out of 10,740, all of which are estimated to receive more than 100,000 organic visits per month. However, I’ve also weighted these nodes by “Homepage Total Shares.” This allows me to see the sites that have strong social followings and have also been estimated to receive considerable amounts of organic traffic (i.e., “estimorganic” traffic).
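The combined traffic-and-shares view can be sketched the same way; the metric names and sample figures below are assumptions, not real SEMrush data:

```python
# Sample rows with the two assumed metrics: estimated organic visits
# and homepage social shares (made-up domains and figures).
domains = [
    {"URL": "guardian.example", "Organic Traffic": 5_000_000,
     "Homepage Total Shares": 900_000},
    {"URL": "nichecafe.example", "Organic Traffic": 40_000,
     "Homepage Total Shares": 120_000},
    {"URL": "bigcoffeenews.example", "Organic Traffic": 2_000_000,
     "Homepage Total Shares": 300_000},
]

# Filter on the traffic threshold, then rank by social shares,
# mirroring the weighted network graph.
popular = [d for d in domains if d["Organic Traffic"] > 100_000]
popular.sort(key=lambda d: d["Homepage Total Shares"], reverse=True)
```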

By quickly looking at this network graph, we can immediately see some authoritative news sites such as The Guardian, the BBC, and the Wall Street Journal near the center, as well as quite a few university sites (as denoted by the .ac.uk TLD).

Using this data, I would potentially look into reaching out to relevant editors and journalists to see if they’re planning on covering National Coffee Week and whether they’d be interested in a quote from Grindhaus on, say, coffee consumption trends.

For the university sites, I’d look at reaching out with a discount code to undergraduate students, or perhaps take it a bit more niche by offering samples to coffee societies on campus like this one.

This is barely scratching the surface of what you can do with competitor SEO data in a fusion table. SEOs and link builders will all have their own quality and relevance thresholds, and will also place a particular emphasis on certain variables, such as Domain Authority or total referring domains. This process lets you collect, process, and analyze your data however you see fit, allowing you to quickly find your most relevant sites to target for links.

Step 8: Publish and share your amazing visualization

Now that you have an amazing network graph, you can embed it in a webpage or blog post. You can also send a link by email or IM, which is perfect for sharing with other people in your team, or even for sharing with your clients so you can communicate the story of the work you’re undertaking more easily.

Note: Typically, I recommend repeating this process every three months.

Summary and caveats

Who said that competitive backlink research can’t be fun? Aside from being able to collect huge amounts of data using URL Profiler, with network graphs you can also visualize the connections between your data in a simple, interactive map.

Hopefully, I’ve inspired you to go out and replicate this process for your own company or clients. Nothing would fill me with more joy than hearing tales of how this process has added an extra level of depth and scale to your competitive analysis, as well as given you favorable results.

However, I wouldn’t be worth my salt as a strategist if I didn’t end this post with a few caveats:

Caveat 1: Fusion tables are still classed as “experimental,” so things won’t always run smoothly. The feature could also disappear altogether overnight, although my fingers (and toes) are crossed that it doesn’t.

Caveat 2: Hundreds of factors go into Google’s ranking algorithm, and this type of link analysis alone does not tell the full story. However, links are still seen as an incredibly important signal, which means that this type of analysis can give you a great foundation to build on.

Caveat 3: To shoehorn one last X-Men analogy in… using Cerebro can be extremely dangerous, and telepaths without well-trained, disciplined minds put themselves at great risk when attempting to use it. The same is true for competitive researchers. However, poor-quality link building won’t result in insanity, coma, permanent brain damage, or even death. The side effects are actually much worse!

In this age of penguins and penalties, links are all too often still treated as a commodity. I’m not saying you should go out and try to get every single link your competitors have. My emphasis is on quality over quantity. This is why I like to thoroughly qualify every single site I may want to try and get a link from. The job of doing competitive backlink research using this method is to assess every possible option and filter out the websites you don’t want links from. Everything that’s left is considered a potential target.

I’m genuinely very interested to hear your ideas on how else network graphs could be used in SEO circles. Please share them in the comments below.

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

from Raymond Castleberry Blog http://raymondcastleberry.blogspot.com/2016/03/heres-how-to-supercharge-your.html
via http://raymondcastleberry.blogspot.com

What You Should Know About Accessibility + SEO, Part I: An Intro

Posted by Laura.Lippay

[Estimated read time: 4 minutes]

Do you know anyone who is visually impaired? Maybe they have low vision or color blindness, or are fully blind. Think about how they use the Internet. Close your eyes, or at least squint really hard, and try to find today’s news or interact with your friends on Facebook. It’s a challenge many of us don’t think about every day, but some of what we do in SEO can affect the experience that people with visual impairments have when visiting a page.

Accessibility and the Internet


Visually impaired Internet users are able to navigate and use the web using screen readers like VoiceOver or Jaws. Screen readers, much like search engine crawlers, rely on signals in the code to determine the structure and the context of what they’re crawling. The overlap in what search crawlers look for and interpret versus what screen readers look for and interpret is small, but the idea is the same: Where are the elements of this page and how do I understand them?

The SEO overlap

While it’s important to understand where SEO and accessibility (a11y) overlap in order to optimize correctly for both, it’s also important to note that optimizing for one is not necessarily akin to optimizing for the other. In other words, if you’ve optimized a page for search engines, it doesn’t mean you’ve necessarily made it accessible — and vice versa.

Recently, web accessibility expert Karl Groves wrote a post called The Accessibility & SEO Myth. Mr. Groves knows the world of accessibility inside and out, and knows that optimizing for accessibility, which goes far beyond optimizing for the visually impaired, is very different overall, and much more complex (strictly from a technical standpoint) than optimizing for search engines. He’s right: despite the ways SEO and a11y overlap, a11y is a whole different ballgame. But if you understand the overlap, you can successfully optimize for both.

Here are just some examples of where SEO and accessibility can overlap:

  • Video transcription
  • Image captioning
  • Image alt attributes
  • Title tags
  • Header tags (H1, H2, etc.)
  • Link anchor text
  • On-site sitemaps, table of contents, and/or breadcrumbs
  • Content ordering
  • Size and color contrast of text
  • Semantic HTML

If you’re developing the page yourself, I would challenge you to learn more about the many things you can do for accessibility beyond where it overlaps with SEO, like getting to know ARIA attributes. Take a look at the W3C Web Content Accessibility Guidelines and you’ll see there are far more complex considerations for accessibility than what we typically consider for technical SEO. If you think technical SEO is fun, just wait until you get a load of this.

Optimizing for accessibility or SEO?

Chances are, if you’re optimizing for accessibility, you’re probably covering your bases for those technical optimizations where accessibility and SEO overlap. BUT, this doesn’t always work the other way around, depending on the SEO tactics you take.

Thankfully, the Converse site has a pretty descriptive alt attribute in place!

Consider a screen reader reaching an image of a pair of women’s black Chuck Taylor All-Star shoes and reading its alt attribute as “Women’s black Chuck Taylor All-Stars buy Chucks online women’s chuck taylors all-stars for sale.” Annoying, isn’t it? Or compare these page titles with SEO and accessibility in mind: “Calculate Your Tax Return” versus “Online Tax Calculator | Tax Return Estimator | Tax Refund/Rebate.” Imagine you just encountered this page without being able to see the content. Which one more crisply and clearly describes what you can expect of this page?

While it’s nice to know that proper technical search engine optimization will affect how someone using a screen reader can contextualize your site, it’s also important to understand (1) that these two optimization industries are, on a bigger level, quite different, and (2) that what you do for SEO where SEO and a11y overlap will affect how some visitors can (or can’t) understand your site.


For Global Accessibility Awareness Day on May 19, I’ll be collaborating with some experts in a11y on a post that will go into more details on what aspects of SEO + a11y to be keenly aware of and how to optimize for both. I’ll be sure to find as many examples as I can — if you’ve got any good ones, please feel free to share in the comments (and thanks in advance).

Educational resources & tools

In the meantime, to learn more about accessibility, check out a couple of great resources:


from Raymond Castleberry Blog http://raymondcastleberry.blogspot.com/2016/03/what-you-should-know-about_30.html
via http://raymondcastleberry.blogspot.com


The Guide to International Website Expansion: Hreflang, ccTLDs, & More!

Posted by katemorris

Growth. Revenue, visits, conversions. We all want to see growth. For many, focusing on a new set of potential customers in another market (international, for instance) is a source of growth. It can sometimes seem like an easy expansion. If your current target market is in the US, UK, or Australia, the other two look promising. Same language, same content — all you need is to set up a site for them and target it at them, right?

International expansion is more complicated than that. The ease of expansion depends highly on your business, your resources, and your customers. How you approach expansion and scale it over time takes consideration and planning. Once you’ve gone down a path of URL structure and a process for marketing and content, it’s difficult to change.

This guide is here to help you go down the international expansion path on the web, focused on ensuring your users see the right content for their query in the search engines. This guide isn’t about recommendations for translation tools or how to target a specific country. It is all about international expansion from a technical standpoint that will grow with your business over time.

At the end is a bonus! A flow chart to help you troubleshoot international listings showing up in the wrong place in the SERPs. Have you ever wondered why your Canadian page showed for a user in the US? This will help you figure that out!

Before we begin: Terminology

ccTLD – A country-specific top-level domain, such as .uk, .de, or .ca. These are assigned by ICANN and are geo-targeted automatically in Google Search Console.

gTLD – A generic top-level domain. These are not country-specific, and if used for country-specific content, they must be geo-targeted inside Google Search Console or Bing Webmaster Tools. Examples include .com, .net, and .tv. Examples from Google found here.

Subdomain – A major section of a domain, distinguished by a change to the characters before the root domain. The most-used standard subdomain is www. Many sites start with www.domain.com as their main subdomain. Subdomains can be used for many reasons: marketing, region targeting, branded micro sites, and more.

Subfolder – A section of a subdomain/domain. Subfolders are sections marked by a trailing slash. Examples include www.domain.com/subfolder, or in terms of this guide, www.domain.com/en or www.domain.ca/fr.

Parameter – A modifier of a URL that either tracks a path of a user to the content or changes the content on the page based on the parameters in the URL. These are often used to indicate the language of a page. An example is www.domain.com/page1?lang=fr, with lang being the parameter.

Country – A recognized country that has an ICANN-assigned ccTLD or an ISO code. Google uses ISO 3166-1 Alpha-2 codes for hreflang.

Region – Collections of countries that the general public groups together based on geography. Examples include the EU or the Middle East. These are not countries and cannot be geo-targeted at this time.

Hreflang – A tag used by Google to allow website owners to indicate that a specific page has a copy in another language. The tags indicate all other translated versions of that page along with the language. The language tags can have regional dialects to distinguish between language differences like British English and American English. These tags can reside on-page or in XML sitemaps.

Meta language – The language-distinguishing tag used by Bing. This tag merely informs Bing of the language of the current page.

Geo-targeting – Both Bing Webmaster Tools and Google Search Console allow website owners to claim a specific domain, subfolder, or subdomain, and inform the search engine that the content in that domain or section is developed for and targeted at the residents of a specific country.

Translation – Changing content from one language or regional dialect to another language or regional dialect. This should never be done with a machine, but rather always performed by someone fluent in that language or regional dialect.
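To make the hreflang definition above concrete, here’s a small sketch that generates the reciprocal tags for a set of translated pages (the URLs are hypothetical). Remember that every version of a page must list all versions, including itself:

```python
# Language (or language-region) codes mapped to the URL of each
# translated version of the same page (hypothetical URLs).
versions = {
    "en-gb": "https://www.example.com/en-gb/",
    "en-us": "https://www.example.com/en-us/",
    "fr": "https://www.example.com/fr/",
}

# Every version's <head> gets this same full set of annotations.
tags = [
    f'<link rel="alternate" hreflang="{lang}" href="{url}" />'
    for lang, url in versions.items()
]
```

The resulting tags go in the head of each version; equivalently, the same annotations can live in your XML sitemaps instead.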

Understanding country and language targeting

The first step in international expansion planning is to determine your target. There is some misunderstanding between country targeting and language targeting. Most businesses start international expansion wanting to do one of two things:

  1. Target users that speak another language.
    Example – A business in Germany: “We should translate our content to French.”
  2. Target users that live in another part of the world.
    Example – A business in Australia: “We should expand into the UK.”

False associations: Country and language

The first issue people run into is associating a country and a language. Many of the world’s top languages have root countries that share the same name; specifically, France/French, Germany/German, Portugal/Portuguese, Spain/Spanish, China/Chinese, Japan/Japanese, and Russia/Russian. Many of these languages are used in a number of other countries, however. Below is a list of the top languages used by Internet users.

[Image: the top languages used by Internet users]

Please note this is not the list of top languages in the world; that is a vastly different list. This list is based on Internet usage. And there are some languages that only have one country set as the official language, but users exist in other countries that browse the Internet with that language as their preferred language. An example might be a Japanese national working in the US setting up a new office.

Note also that the “main” country chosen above is either the country where the language originated (as with English) or the country that shares a name with, or is closest to, the language name. This is how many people associate languages and countries in most instances, but those assumptions are not correct.

Flags and languages

We must disassociate languages from countries. Too often, a country flag is used to denote a language change on a site. Flags should only be used when a country is being targeted, not a language.

[Image: country flags and the languages they are often wrongly used to represent]

Web technology and use impacts targeting

The second issue arises in the execution. The business in Germany from the first few examples might hire a translator from France and translate their content to French. From there, the targeting can get confused based on where that content is placed and how it is tagged.

Below are some implementations of the translated content we might see from the business. This table looks at a variety of combinations of ccTLDs, gTLDs, subfolders, subdomains, hreflang tagging, and geo-targeting. Each combination of URL setup and tagging results in different targeting by the search engines, which in turn affects the base number of Internet users reached in that group.

[Image: table of URL setups, tagging, and the resulting targeting]

Given the above, you can see that the implementation is not as straightforward as it might seem. There’s no single right answer in the above possible implementations. However, many of them change the focus of the original target market (speakers of the French language) and that has an impact on the base target market.

International search strategy tool

This is what many of us face when trying to do international expansion: there is conflicting data on what should be done. That's why I developed a tool to help businesses determine which route they should take in international expansion. It helps them determine what their real focus should be (language, country, or both) and narrows down the list of choices above while taking into account their business needs, resources, and user needs. Over the years it has evolved from a flow chart, to a poorly designed tool, to a better-structured tool found by clicking the link in the image below.

Start with those questions and then come back here when you have other questions. That’s what the rest of this guide is about. It’s broken down into three types of targeting:

  1. Language
  2. Country
  3. Hybrid (multiple countries with multiple languages)

No one type is easier than another. You really need to choose the path early on and use what you know of your business, user needs, and resources.

Language targeting

Language-only targeting can seem like the easiest route to take, as it doesn’t require a major change or multiple instances of marketing plans. Country-focused targeting requires new targeted content for each targeted country. There are far fewer languages in the world than countries. In addition, if you target the major world languages, you could potentially start with a base of millions of users who speak those languages.

However, language targeting involves two very tricky components: translation and language tagging. If either of these components is not done right, it can cause major issues with user experience and indexation.

Translation

The first rule of working with languages and translation is NEVER machine translate. Machine translation is highly inaccurate. I was just at an all-inclusive resort in Mexico, and you could tell the translations were done by a machine, not a person. Using machine translations produces a very poor user experience and poor SEO targeting as well.

Translations of content should always be done by a human who is fluent both in that language and the original language of the content. If you are dealing with regional variations, it is recommended to get someone that is native to and/or living in that area to translate, as well as being fluent.

Spending the right resources on translation will ensure the best user experience and the most organic traffic.

Language tagging: Hreflang and meta language

When people hear about translation and international expansion, the first thing they think of is the hreflang tag. Relative to the Internet, the hreflang tag is new: it launched in late 2010, and as of this writing it is used only by Google. If the bulk of your traffic comes from Google and you are translating only, it is of use to you. However, do know that Bing uses a different format, called the meta language tag.
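For reference, Bing's meta language tag is a simple on-page declaration of the current page's language. The language value here is just an example:

```html
<!-- Bing reads the language of the current page from a meta tag, -->
<!-- rather than from a map of all translated versions like hreflang. -->
<meta http-equiv="content-language" content="fr" />
```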

Tips: Ensure that there’s an hreflang tag on every translated page pointing to every other translated instance of that page. I prefer to put the tags in XML sitemaps (instructions here) to keep the tagging off the page, since any extra code adds to page load time, no matter how small. Do what works for your team.
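As a sketch of the sitemap approach (the URLs are made up for illustration), each URL entry lists every translated version of that page, including itself:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>http://domain.com/page.html</loc>
    <xhtml:link rel="alternate" hreflang="en" href="http://domain.com/page.html"/>
    <xhtml:link rel="alternate" hreflang="fr" href="http://domain.com/fr/page.html"/>
  </url>
  <url>
    <loc>http://domain.com/fr/page.html</loc>
    <xhtml:link rel="alternate" hreflang="en" href="http://domain.com/page.html"/>
    <xhtml:link rel="alternate" hreflang="fr" href="http://domain.com/fr/page.html"/>
  </url>
</urlset>
```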

What about x-default?

One of the most frequent tagging mistakes involves x-default; many people misunderstand its use. X-default was added to the hreflang markup family to help Google serve un-targeted pages, like those from IKEA and FedEx, to users for whom the site has no language-targeted content, or whom Google doesn’t know where to place. This tag is not meant to mark the “original” page.
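A minimal sketch of correct x-default usage, with made-up URLs: the x-default entry points at the un-targeted page (such as a language selector or default homepage), not at the "original" translation:

```html
<link rel="alternate" hreflang="en" href="http://domain.com/en/" />
<link rel="alternate" hreflang="fr" href="http://domain.com/fr/" />
<!-- x-default: the page for users who match none of the above -->
<link rel="alternate" hreflang="x-default" href="http://domain.com/" />
```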

Checking for tagging issues

Once you have your tagging live (or on a testing server that is crawlable by Google but not indexable), you can check for issues inside of Google Search Console. This will let you know what tag issues you are having and where they’re located.

URL selections

Choosing the URL structure for your language extensions is totally up to you. If you are focusing on language targeting only, don’t use a ccTLD; those are meant for targeting a specific country, not a language. ccTLDs automatically geo-target, and that selection cannot be changed. Your other choices are subfolder, subdomain, and parameter, listed below in order of my professional preference, with the reasons why.

  1. Subfolders provide a structure that’s easier to build upon and develop as your site and business grows and changes. You might not want to target specific countries now or have the resources, but you may someday. Setting up a subfolder structure allows you to use the same structure for any future ccTLDs or subdomains for country sections in the future. Your developers will appreciate this choice because it’s scalable for hreflang tags, as well.
  2. Parameters provide a backup in case your tagging is lost in a future site update. A parameter can be declared in Google Search Console as one that modifies the language of the page, so if your other tags are lost, that parameter setting still tells Google the content is translated.
    Using a parameter for language is also scalable for future plans and easy to tag, like subfolders. The downsides are that parameters are ugly and might accidentally be negated by a misplaced rel canonical tag in the future.
  3. Subdomains for language targeting are my least favorite option. Only use this if it’s the only option you have, by decree of your technical team. Using subdomains for languages means that if you change plans to target countries in the future, you’ll lose many URL options there. To follow the same structure for each country, you would need to use ccTLDs; while those are the strongest signal for geo-targeting, they are also the option that requires the most investment.

Notice that ccTLDs are not on this list. Those are only for geo-targeting. Unless you’re changing your content to focus on a specific country, do not use ccTLDs. I say this multiple times for a reason: too many websites make this mistake.

Detecting languages

Many companies want to try to make the website experience as easy as possible for the user. They attempt to detect the user’s preferences without needing input from the user. This can cause problems with languages.

There are a few ways to try to determine a user’s language preference. The most used are browser settings and IP address. It is never recommended to use the IP address for language detection: an IP address can show an approximate user location, but not a preferred language. IP addresses are also highly inaccurate (just the other day I was “in” North Carolina, though I live in Austin), and Google still crawls only from a US IP address. Any automatic redirects based on IP should be avoided.

If you choose to try to guess at the user’s language preference when they enter your site, you can use the browser’s language setting or the IP address and ask the user to confirm the choice. Using JavaScript to do this will ensure that Googlebot does not get confused. Pair this with a good XML sitemap and the user can have a great interaction. Plus, the search engines will be able to crawl and index all of your translated content.
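As a rough sketch of that flow (the function name and language lists below are my own, not from the post), the matching logic can live in a small helper that compares the browser's reported languages against the languages the site is translated into, falling back to a default the user can confirm or change:

```javascript
// Pick the best supported language from the browser's preference list.
// `browserLangs` is what the browser reports, e.g. navigator.languages
// (["fr-FR", "fr", "en-US"]); `supported` is the set of languages the
// site is actually translated into.
function pickPreferredLanguage(browserLangs, supported, fallback) {
  for (const lang of browserLangs) {
    const base = lang.toLowerCase().split("-")[0]; // "fr-FR" -> "fr"
    if (supported.includes(base)) {
      return base;
    }
  }
  // No match: show the default language and ask the user to choose.
  return fallback;
}

// In the browser, you would call this with navigator.languages, ask the
// user to confirm the suggestion, and store the answer in a cookie.
// Because this runs in JavaScript and never force-redirects, Googlebot
// can still crawl and index every translated version.
```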

Country targeting, AKA geo-targeting

If your business or content changes depending on the location of the user, country targeting is for you. This is the most common answer for businesses in retail. If you offer a different set of products, or have different shipping, pricing, or grouping structures, or even different images and descriptions, this is the way to go.

Example: If a greeting card business in the US wanted to expand to Australia, not only are the prices and products different (some different holidays), the Christmas cards are VASTLY different. Think of Christmas in summer, as it is in Australia, and only being able to pick from cards with winter scenes!

Don’t go down the geo-targeting route if your content or offerings don’t change or you don’t have the resources to change the content. If you launch country-targeted content in any URL structure (ccTLD, subdomain, or subfolder) and the content is identical, you run the risk of users coming across another country’s section.

Check out the flow chart at the end to help figure out why one version of your site might be ranking over another.

Example: As a web development service in Canada, you want to expand into the US. Your domain at the moment is www.webdevexpress.ca (totally made up!). You buy www.webdevexpress.us (that’s the ccTLD for the US, by the way). Nothing really needs to change, so you just use the same content and go live. A few months down the road, US clients are still seeing www.webdevexpress.ca when they do a brand name search. The US domain is weaker (fewer links, mentions, etc.) and has the same content! Google is going to show the more relevant, stronger page when everything is the same.

Regions versus countries

Which country or countries you want to focus on in expansion is usually decided before you determine how to get there; that decision is what spawns the conversation.

There’s one misconception that can throw off the whole process of expansion, and that is that you can target a region with geo-targeting. As of right now, you can purchase a regional top-level domain like .eu, but those are treated as general top-level domains like .com or .net.

The search engines only operate geo-targeting in terms of countries right now. The Middle East and the European Union are collections of countries. If you set up a site dedicated to a region, there are no geo-targeting options for you.

One workaround is to select a primary country in that region, perhaps one in which you have offices, and geo-target that country. It’s possible to rank for terms in that primary language in surrounding countries; we see this all the time with Canada and the US. If the content is relevant to the searcher, it’s possible to rank no matter where the searcher is.

Example: If you’re anywhere other than the UK, Google “fancy dress” — you see UK sites, right? At least in the US, “fancy dress” is not a term we use, so the most relevant content is shown. I can’t think of a good Canadian/US term, but I guarantee there are some out there!

URL selections

The first thing to determine in geo-targeting beyond the target countries is URL structure. This is immensely important because once you choose a structure, every country expansion should follow that. Changing URL structure in the future is difficult and costly when it comes to short-term organic traffic.

In order of my professional preference, your choices are:

  1. Subfolders. As with the language/translation option, this is my preferred setup, as it utilizes the same domain and subdomain across the board. This translates to utilizing some of the power you already built with other country-focused areas (or the initial site). This setup works well for adding different translations within one country (hybrid approach) down the line.
    Note: If you go with subfolders on both, always lead with the country, then language down the line.
    Example:
    www.domain.com/us/es (US-focused, in Spanish language) or www.domain.com/ca/fr (Canada-focused, in Canadian French).
  2. ccTLDs. This is the strongest signal that you’re focusing your content on a specific country. They geo-target automatically (one less step!), but that has a downside as well. If you started with a ccTLD and expanded later, you can’t geo-target a subfolder within a ccTLD at this point in time.
    Example: www.domain.ca/us will not work to target the US. The target will remain Canada. It might rank in the US, depending on the term competition and relevance, but you can’t technically geo-target the /us subfolder within the Canadian ccTLD.
  3. Subdomains. My last choice, because while you’re still on the same root domain, there’s that old SEO part of me that thinks a subdomain loses some equity from the main domain. BUT, if your tech team prefers this, there’s nothing wrong with using a subdomain to geo-target. You’ll need to claim each subdomain in Search Console and Bing Webmaster Tools and set the geo-target for each, just as you would with subfolders.
    Example: gb.domain.com

Content changes

The biggest question asked when someone embarks on country-targeting expansion is: “How much does my content need to change to not be duplicated?” In short — there is no magic number. No metric. There isn’t a number of sentences or a percentage. How much your content needs to change per country site or subsite is entirely up to your target market and your business.

You’ll need to do research into your new target market to determine how your content should change to meet their needs. There are a number of ways you might change your content to target a new country. The most common are:

Product differentiation

If you offer a different set of products or services to different countries by removing those that are not in demand, outlawed, or otherwise not wanted, or by adding new products for that country specifically, that is changing your site content.

Example #1: Amazon sells the movie “Elf” in the US and the UK, but they are different products. DVDs in Europe are coded for Europe and might not play on US players.

Example #2: Imagine you’re a drugstore in the UK and want to expand to the US. One of your products, 2.5% Selenium Sulphide, is not approved for use in the US. This is one among hundreds or thousands of products that are different.

Naming schema

The meaning of product names can change in different countries. How a specific region terms a product or service can change as well, making it necessary to change your product or service naming schema.

Keyword usage

Like the above, the words you use to describe your products or services might change in a new country. This can look like translation, but if only a few terms change, it’s not considered full translation. There’s a fine line between these two things. If you realize that the only thing you’re changing is the wording between US and UK English, for example, you might not need to geo-target at all; instead, mark the different pages as translations.

Keyword use change example: “Mum” versus “Mom” or “Mother” when it comes to Happy Mother’s Day cards. You need to offer different cards in this and other categories because of the country change. This is more than a word change, so it’s a case of geo-targeting — not just translation.

Translation change example: Etsy.com. Down at the bottom of the page, you can change your language setting. I set mine to UK English, and words like “favourite” started to show up. If this sounds like what you would need to do and your content would not change otherwise (Etsy shows all content to all users regardless of their location), consider translation only.

Pricing structure

One of the most common things to change in country-specific content is pricing. There’s the issue of different currency, but more than that, different countries have different supply-and-demand markets that should and will change your pricing structure.

Imagery changes

When dealing with different cultures, sometimes you find the need to change your site imagery. If you’ve never explored psychology, I highly recommend checking out The Web Psychologist – Nathalie Nahai and some of her talks. Understanding your new target market’s culture is imperative to marketing effectively.

Example: Samsung changes the images on their UK versus China sites to change the focus from an individualistic to a collectivistic culture. See my presentation at SearchLove San Diego for more examples.

Laws, rules, and regulations

One of the most important ways to change your content is to satisfy local laws and regulations. This will depend on the business: you might deal with tons of regulations, while others deal with none. Check out local competitors — the biggest you can identify — to see what you might need to do.

Example: If you move into the UK and set cookies on your visitor’s machine, you have to alert them to the use of cookies. This is not a law in the US and is easily missed.

User experience and IP redirects

When people start moving into other countries, one of the things they want to ensure is that users get to the right content. This is especially important when products change and the purchase of an incorrect product would cause issues for the user, or the product isn’t available to them. Your customer service, user experience, or legal team is going to ask that you redirect users to the correct country. Everyone gets to the right place and the headaches lessen.

There isn’t anything wrong with asking a user to select the country they reside in and set a cookie, but many people don’t want to bother their users. Therefore, they detect the user’s IP address and then force a redirect from there. There are two problems with this setup.

  1. IP addresses are inaccurate – I was in Seattle, WA once and my IP had me in Washington, DC. No kidding. Look at that distance on a map. Think about that distance in terms of Europe and how much might change there.
  2. Google crawls from California – For the time being, using an IP-based forced redirect will ensure your international content is not indexed. Google will only ever see the US content if you do a forced redirect.

You can deal with this by detecting the country using the IP address (or, for organic traffic, which version of Google the user came from) and using a JavaScript popup to ask what their preferred country is, then setting a cookie with that preference. Even if the user clicks on another country’s content in the future, they will be redirected to their own.
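A sketch of that decision logic, with names of my own invention: a confirmed cookie preference always wins, and an IP-based guess is only ever a suggestion that the popup asks the user to confirm, so no one (including Googlebot) is force-redirected on a first visit:

```javascript
// Decide which country section to offer a visitor.
// `cookieCountry` is a previously confirmed preference (or null),
// `ipGuess` is the country detected from the IP address (or null),
// `supported` lists the country sections the site actually has.
function resolveCountrySection(cookieCountry, ipGuess, supported, fallback) {
  if (cookieCountry && supported.includes(cookieCountry)) {
    // Returning visitor: honor the preference they already confirmed.
    return { country: cookieCountry, askUser: false };
  }
  if (ipGuess && supported.includes(ipGuess)) {
    // First visit: suggest the IP-based guess, but ask via a popup
    // instead of redirecting, since IP location is often wrong and
    // Googlebot crawls from the US.
    return { country: ipGuess, askUser: true };
  }
  // No usable signal: show the default section and ask.
  return { country: fallback, askUser: true };
}
```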

No hreflang??

If you went through that tool, you noticed that my geo-targeting plan does not include hreflang. Many other people disagree with me on this point, saying that the more signals you can send, the better.

Before I get into why I don’t recommend setting up hreflang between country targeted sub-sites, let me make one thing clear. Setting up hreflang will not hurt your site if you are really focusing on country targeting and it’s not that intricate of a setup yet (more on that later). Let’s say you’re in Canada and want to open a US-targeted site. Your content changes because your products change, your prices change, your shipping info changes. You create domain.com/us and geo-target it to the US. You can add hreflang between each page that is the same between the two sub-sites — two products that exist in both locations, for example. The hreflang will not hurt.

Example: If you don’t have the resources to change your content at the moment to fully target the UK, only translate your content a bit between your US (domain.com) and UK (domain.co.uk), and have plans to change your content down the road, an hreflang tag between those two ccTLDs can help Google understand the content change and who you’re targeting.

Why I don’t recommend hreflang for geo-targeting only

Hreflang was meant to help Google understand when two pages are exactly the same, but translated. It works much like a canonical tag (which is why using another canonical can be detrimental to the hreflang working) in which you have multiple versions of one page with slight changes.

Many people get confused because of the ability to use country codes in hreflang tags. That option is for when you need to tell Google about a dialect change; an example would be two sub-sites that are identical except that the American English has been changed to British English. It’s not meant to inform Google that content targeted at a different country is targeted at that country.

When I recommend geo-targeting only, I make it very clear to clients that going down this route means you really need to change the content. International business is so much more than just translation. Translating content only might hurt your conversion rates if you miss some aspect of the new target market.

Hiring content writers in that country that understand the nuances is very important. I worked for a British company for 4 years, so I get some of the differences, but things continually surprise me still. I would never feel comfortable as an American writing content for a British audience.

I also don’t recommend hreflang in most geo-targeting cases, because the use of geo-targeting and hreflang can get really confusing. This has led to incorrect hreflang tags in the past that have wreaked havoc on Google’s understanding of the site structure.

Example: A business starts off with a Canadian domain (domain.ca) and a France domain (domain.fr). They use hreflang between the English for Canada and French for France using the code below. They then add a US site and the code is modified to add a line for the US content.



<link rel="alternate" hreflang="en" href="http://domain.ca/" />
<link rel="alternate" hreflang="fr" href="http://domain.fr/" />
<link rel="alternate" hreflang="en-us" href="http://domain.com/" />

This looks odd because there is one English-language page with no regional modifications that is on a Canadian-targeted domain. There is a US regional English dialect version on a general top-level domain (as .com is general and is not US-specific, but people use it that way).

Remember, this is a bot that’s trying to logic out a structure. For a user that prefers UK English, there is no logical choice. The general English is a Canadian site and the general TLD is in US English. This is where we get some of the inconsistencies with international targeting.

You might be saying things like “That would never happen!” and “They should have changed the first English to Canadian English (en-ca)!”, but if you’ve ever dealt with hurried developers (they really do have at least 50 requests at once sometimes) you’ll know that they, like search bots, prefer consistency.
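For illustration, here is how that tag set could have been kept consistent, with the first English entry marked as Canadian English (en-ca). The domains are the made-up ones from the example above:

```html
<link rel="alternate" hreflang="en-ca" href="http://domain.ca/" />
<link rel="alternate" hreflang="fr" href="http://domain.fr/" />
<link rel="alternate" hreflang="en-us" href="http://domain.com/" />
```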

Hreflang should not be needed in geo-targeting cases because, if you’re really going to target a new country-specific market, you should treat it as a whole new market and create content just for it. If you can’t, or don’t think that’s needed, then providing language translations is probably all you need to do at the moment. And in geo-targeting cases, hreflang adds extra code that can confuse the search engines. The less we confuse them, the better the results!

Hybrid targeting

Finally, there is the route I call “hybrid,” or utilizing both geo-targeting and translation. This is what most major retail corporations should be doing if they’re international. Due to laws, currency, market changes, and cultural changes, there is a big need for geo-targeted content. But in addition to that, there are countries that require multiple language versions. There might be anywhere from one to a few hundred used languages in a single country! Here are the top countries that use the web and how many recognized languages are used in each.

[Image: top countries using the web and the number of recognized languages in each]

Do you need to translate into all 31 languages used in the US? Probably not. But if 50% of your target market in Canada prefers Canadian French as their primary language, the translation investment might be a good one.

When a geo-targeted site (using a ccTLD) or sub-site (a subdomain or subfolder) needs more than one language, geo-target the site or sub-site and then use hreflang within that country-specific section.

This statement can be confusing, so let me show you what I mean:

[Image: hybrid structure combining geo-targeting and hreflang]

This requires a good amount of planning and resources, so if you need to embark on this path in the future, start setting up the structure now. If you need to go the hybrid route, I recommend the following URL structures for language and country targeting. As with before, these are in order of my professional preference and are all focused on content targeted to Canada in Canadian French.

(Country structure/Language structure)

  1. Subfolder/Subfolder
    Example: domain.com/ca/fr
  2. Subfolder/Parameter
    Example: domain.com/ca/page.html?lang=fr
  3. ccTLD/Subfolder
    Example: domain.ca/fr
  4. ccTLD/Parameter
    Example: domain.ca/page.html?lang=fr
  5. Subdomain/Subfolder
    Example: ca.domain.com/fr
  6. Subdomain/Parameter
    Example: ca.domain.com/page.html?lang=fr
  7. ccTLD/Subdomain (not recommended, nor are the other combinations I intentionally left out)
    Example: fr.domain.ca
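Putting the country code first and the language second, the hreflang tags within a hybrid setup carry both codes. A sketch using the subfolder/subfolder option above, with made-up URLs:

```html
<!-- One Canadian page, offered in English and Canadian French -->
<link rel="alternate" hreflang="en-ca" href="http://domain.com/ca/en/page.html" />
<link rel="alternate" hreflang="fr-ca" href="http://domain.com/ca/fr/page.html" />
```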

The hybrid option is where the hreflang setup can get the most messed up. Make sure you have mapped everything out before implementing, and ensure you’re considering future business plans as well.

I hope this helps clear up some of the confusion around international expansion. It really is specific to each individual business, so take the time to plan and happy expansion!

Troubleshooting International SEO: A flowchart

[Image: international SEO troubleshooting flowchart]



The Guide to International Website Expansion: Hreflang, ccTLDs, & More!

Posted by katemorris

Growth. Revenue, visits, conversions. We all want to see growth. For many, focusing on a new set of potential customers in another market (international, for instance) is a source of growth. It can sometimes seem like an easy expansion. If your current target market is in the US, UK, or Australia, the other two look promising. Same language, same content — all you need is to set up a site for them and target it at them, right?

International expansion is more complicated than that. The ease of expansion depends highly on your business, your resources, and your customers. How you approach expansion and scale it over time takes consideration and planning. Once you’ve gone down a path of URL structure and a process for marketing and content, it’s difficult to change.

This guide is here to help you go down the international expansion path on the web, focused on ensuring your users see the right content for their query in the search engines. This guide isn’t about recommendations for translation tools or how to target a specific country. It is all about international expansion from a technical standpoint that will grow with your business over time.

At the end is a bonus! A flow chart to help you troubleshoot international listings showing up in the wrong place in the SERPs. Have you ever wondered why your Canadian page showed for a user in the US? This will help you figure that out!

Before we begin: Terminology

ccTLD – A country-specific top-level domain. These are assigned by ICANN and are geo-targeted automatically in Google Search Console.

gTLD – A generic top-level domain. These are not country-specific; if used for country-specific content, they must be geo-targeted inside Google Search Console or Bing Webmaster Tools. Examples include .com, .net, and .tv. Examples from Google found here.

Subdomain – A major section of a domain, distinguished by a change to the characters before the root domain. The most-used standard subdomain is www. Many sites start with www.domain.com as their main subdomain. Subdomains can be used for many reasons: marketing, region targeting, branded micro sites, and more.

Subfolder – A section of a subdomain/domain. Subfolders are sections marked by a trailing slash. Examples include www.domain.com/subfolder, or in terms of this guide, www.domain.com/en or www.domain.ca/fr.

Parameter – A modifier of a URL that either tracks a path of a user to the content or changes the content on the page based on the parameters in the URL. These are often used to indicate the language of a page. An example is www.domain.com/page1?lang=fr, with lang being the parameter.

Country – A recognized country that has an ICANN-assigned ccTLD or an ISO code. Google uses ISO 3166-1 Alpha-2 codes for hreflang.

Region – Collections of countries that the general public groups together based on geography. Examples include the EU or the Middle East. These are not countries and cannot be geo-targeted at this time.

Hreflang – A tag used by Google to allow website owners to indicate that a specific page has a copy in another language. The tags indicate all other translated versions of that page along with the language. The language tags can have regional dialects to distinguish between language differences like British English and American English. These tags can reside on-page or in XML sitemaps.

Meta language – The language-distinguishing tag used by Bing. This tag merely informs Bing of the language of the current page.

Geo-targeting – Both Bing Webmaster Tools and Google Search Console allow website owners to claim a specific domain, subfolder, or subdomain, and inform the search engine that the content in that domain or section is developed for and targeted at the residents of a specific country.

Translation – Changing content from one language or regional dialect to another language or regional dialect. This should never be done with a machine, but rather always performed by someone fluent in that language or regional dialect.

Understanding country and language targeting

The first step in international expansion planning is to determine your target. There is some misunderstanding between country targeting and language targeting. Most businesses start international expansion wanting to do one of two things:

  1. Target users that speak another language.
    Example – A business in Germany: “We should translate our content to French.”
  2. Target users that live in another part of the world.
    Example – A business in Australia: “We should expand into the UK.”

False associations: Country and language

The first issue people run into is associating a country and a language. Many of the world’s top languages have root countries that share the same name; specifically, France/French, Germany/German, Portugal/Portuguese, Spain/Spanish, China/Chinese, Japan/Japanese, and Russia/Russian. Many of these languages are used in a number of other countries, however. Below is a list of the top languages used by Internet users.

Click to open a bigger version in a new tab!

Please note this is not the list of top languages in the world; that is a vastly different list. This list is based on Internet usage. And there are some languages that only have one country set as the official language, but users exist in other countries that browse the Internet with that language as their preferred language. An example might be a Japanese national working in the US setting up a new office.

Also note that the “main” country chosen above is either the country where the language originated (as with English) or the country that shares a name with, or is closest to, the language’s name. This is how most people associate languages with countries, but those assumptions are often incorrect.

Flags and languages

We must disassociate languages and countries. There are too many times when a country flag is used to note a language change on a site. Flags should only be used when the country is being targeted, not the language.

(Image: country flags wrongly used to represent languages)

Web technology and use impacts targeting

The second issue arises in the execution. The business in Germany from the first few examples might hire a translator from France and translate their content to French. From there, the targeting can get confused based on where that content is placed and how it is tagged.

Below are some implementations of posting the translated content we might see by the business. This table looks at a variety of combinations of ccTLDs, gTLDs, subfolders, subdomains, hreflang tagging, and geo-targeting. Each combination of URL setup and tagging results in different targeting according to search engines and how that can impact the base number of Internet users in that group.

(Image: table of URL setups, tagging combinations, and the resulting targeting)

Given the above, you can see that the implementation is not as straightforward as it might seem. There’s no single right answer in the above possible implementations. However, many of them change the focus of the original target market (speakers of the French language) and that has an impact on the base target market.

International search strategy tool

This is what many of us face when trying to do international expansion. There is conflicting data on what should be done. This is why I developed a tool to help businesses determine which route they should take in international expansion. It helps them determine what their real focus should be (language, country, or if they need to use both) and narrows down the list of choices above while understanding their business needs, resources, and user needs. It’s developed over the years from a flow chart, to a poorly designed tool, to a better-structured tool found by clicking the link in the image below.

Start with those questions and then come back here when you have other questions. That’s what the rest of this guide is about. It’s broken down into three types of targeting:

  1. Language
  2. Country
  3. Hybrid (multiple countries with multiple languages)

No one type is easier than another. You really need to choose the path early on and use what you know of your business, user needs, and resources.

Language targeting

Language-only targeting can seem like the easiest route to take, as it doesn’t require major changes or multiple marketing plans, whereas country-focused targeting requires new, targeted content for each country. There are far fewer languages in the world than countries, and if you target the major world languages, you could potentially start with a base of millions of users who speak them.

However, language targeting involves two very tricky components: translation and language tagging. If either of these is not done right, it can cause major issues with user experience and indexation.

Translation

The first rule of working with languages and translation is NEVER machine translate. Machine translation is highly inaccurate. I was just at an all-inclusive resort in Mexico, and you could tell the translations were done by a machine, not a person. Using machine translations produces a very poor user experience and poor SEO targeting as well.

Translations of content should always be done by a human who is fluent both in that language and the original language of the content. If you are dealing with regional variations, it is recommended to get someone that is native to and/or living in that area to translate, as well as being fluent.

Spending the right resources on translation will ensure the best user experience and the most organic traffic.

Language tagging: Hreflang and meta language

When people hear about translation and international expansion, the first thing they think of is the hreflang tag. Relative to the Internet, hreflang is new: it launched in late 2010 and, as of this writing, is only used by Google. If the bulk of your traffic comes from Google and you are translating only, it is of use to you. Do know, however, that Bing uses a different format, called the meta language tag.

Tips: Ensure that every translated page carries an hreflang tag pointing to every other translated instance of that page. I prefer to put the tags in XML sitemaps (instructions here) to keep the tagging off the page, since any extra code adds to page load time, no matter how small. Do what works for your team.
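Since I recommend putting hreflang into XML sitemaps, here’s a minimal Python sketch of what generating those annotations could look like. The URLs and the sitemap_entry helper are hypothetical, and a real sitemap also needs the xhtml namespace declared on the urlset element:

```python
# Sketch: generate the hreflang annotations for one sitemap <url> entry.
# Every translated URL lists itself plus all alternates, which keeps the
# tagging reciprocal. The URLs below are made up for illustration.
translations = {
    "en": "http://example.com/en/page.html",
    "fr": "http://example.com/fr/page.html",
    "de": "http://example.com/de/page.html",
}

def sitemap_entry(own_lang, translations):
    lines = ["<url>", f"  <loc>{translations[own_lang]}</loc>"]
    for lang, url in sorted(translations.items()):
        lines.append(
            f'  <xhtml:link rel="alternate" hreflang="{lang}" href="{url}"/>'
        )
    lines.append("</url>")
    return "\n".join(lines)

print(sitemap_entry("fr", translations))
```

Because each entry is generated from the same dictionary, adding a fourth language later is a one-line change rather than a site-wide retagging job.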

What about x-default?

One of the most common tagging mistakes involves x-default; many people misunderstand its use. X-default was added to the hreflang markup family to help Google serve un-targeted pages, like those from IKEA and FedEx, to users for whom a site has no language-targeted content or whom Google doesn’t know where to place. This tag is not meant to mark the “original” page.

Checking for tagging issues

Once you have your tagging live (or on a testing server that is crawlable by Google but not indexable), you can check for issues inside of Google Search Console. This will let you know what tag issues you are having and where they’re located.

URL selections

Choosing the URL structure of your language extensions is totally up to you. If you are focusing on language targeting only, don’t use a ccTLD. Those are meant for targeting a specific country, not a language. ccTLDs automatically geo-target and that selection cannot be changed. Your other choices are subfolder, subdomain, and parameter. They’re listed below in order of my professional preference and why.

  1. Subfolders provide a structure that’s easier to build upon and develop as your site and business grows and changes. You might not want to target specific countries now or have the resources, but you may someday. Setting up a subfolder structure allows you to use the same structure for any future ccTLDs or subdomains for country sections in the future. Your developers will appreciate this choice because it’s scalable for hreflang tags, as well.
  2. Parameters allow a backup system in case your tagging fails in a site update in the future. Parameters can be defined in Google as being used to modify the language on the page. If your other tags are lost, that parameter setting is still telling Google that the content is being translated.
    Using a parameter for language is also scalable for future plans and easy for tagging, like subfolders. The downsides are that they’re ugly and might accidentally be negated by a misplaced rel canonical tag in the future.
  3. Subdomains for language targeting are my least favorite option. Only use them if they’re the only option you have, by decree of your technical team. Using subdomains for languages means that if you change plans to target countries in the future, you’ll lose many options for URLs there. To follow the same structure for each country, you would need to use ccTLDs; while those are the strongest signal for geo-targeting, they’re also the option that requires the most investment.

Notice that ccTLDs are not on this list. Those are only for geo-targeting. Unless you’re changing your content to focus on a specific country, do not use ccTLDs. I say this multiple times for a reason: too many websites make this mistake.

Detecting languages

Many companies want to try to make the website experience as easy as possible for the user. They attempt to detect the user’s preferences without needing input from the user. This can cause problems with languages.

There are a few ways to try to determine a user’s language preferences. The most commonly used are browser settings and the IP address. Never use the IP address for language detection: an IP address can show a user’s approximate location, but not their preferred language. IP location is also highly inaccurate (just the other day my IP placed me “in” North Carolina, though I live in Austin), and Google still only crawls from a US IP address. Any automatic redirects based on IP should be avoided.

If you choose to try to guess at the user’s language preference when they enter your site, you can use the browser’s language setting or the IP address and ask the user to confirm the choice. Using JavaScript to do this will ensure that Googlebot does not get confused. Pair this with a good XML sitemap and the user can have a great interaction. Plus, the search engines will be able to crawl and index all of your translated content.
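If you make that guess server-side, the browser exposes the same setting via the Accept-Language request header. A minimal Python sketch of reading it (the header value here is an example; real headers vary), to pre-select a language for the user to confirm rather than force a redirect:

```python
# Sketch: pick the user's most-preferred language from the browser's
# Accept-Language header. Use the result as a suggestion to confirm,
# never as grounds for an automatic redirect.
def preferred_language(accept_language):
    choices = []
    for part in accept_language.split(","):
        piece = part.strip().split(";")
        lang = piece[0].strip().lower()
        q = 1.0  # per the header spec, a missing q-value means 1.0
        for param in piece[1:]:
            if param.strip().startswith("q="):
                q = float(param.strip()[2:])
        if lang:
            choices.append((q, lang))
    return max(choices)[1] if choices else None

print(preferred_language("fr-CA,fr;q=0.9,en;q=0.8"))  # fr-ca
```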

Country targeting, AKA geo-targeting

If your business or content changes depending on the location of the user, country targeting is for you. This is the most common answer for businesses in retail. If you offer a different set of products, or have different shipping, pricing, or category structures, or even different images and descriptions, this is the way to go.

Example: If a greeting card business in the US wanted to expand to Australia, not only are the prices and products different (some different holidays), the Christmas cards are VASTLY different. Think of Christmas in summer, as it is in Australia, and only being able to pick from cards with winter scenes!

Don’t go down the geo-targeting route if your content or offerings don’t change or you don’t have the resources to change the content. If you launch country-targeted content in any URL structure (ccTLD, subdomain, or subfolder) and the content is identical, you run the risk of users coming across another country’s section.

Check out the flow chart at the end to help figure out why one version of your site might be ranking over another.

Example: As a web development service in Canada, you want to expand into the US. Your domain at the moment is www.webdevexpress.ca (totally made up!). You buy www.webdevexpress.us (that’s the ccTLD for the US, by the way). Nothing really needs to change, so you just use the same content and go live. A few months down the road, US clients are still seeing www.webdevexpress.ca when they do a brand name search. The US domain is weaker (fewer links, mentions, etc.) and has the same content! Google is going to show the more relevant, stronger page when everything is the same.

Regions versus countries

Which country or countries to focus on is usually decided before you determine how to get there; that decision is what spawns the conversation.

There’s one misconception that can throw off the whole process of expansion, and that is that you can target a region with geo-targeting. As of right now, you can purchase a regional top-level domain like .eu, but those are treated as general top-level domains like .com or .net.

The search engines only operate geo-targeting in terms of countries right now. The Middle East and the European Union are collections of countries. If you set up a site dedicated to a region, there are no geo-targeting options for you.

One workaround is to select a primary country in that region, perhaps one in which you have offices, and geo-target that country. It’s possible to rank for terms in that primary language in surrounding countries; we see this all the time with Canada and the US. If the content is relevant to the searcher, it’s possible to rank no matter where the searcher is located.

Example: If you’re anywhere other than the UK, Google “fancy dress” — you see UK sites, right? At least in the US, “fancy dress” is not a term we use, so the most relevant content is shown. I can’t think of a good Canadian/US term, but I guarantee there are some out there!

URL selections

The first thing to determine in geo-targeting beyond the target countries is URL structure. This is immensely important because once you choose a structure, every country expansion should follow that. Changing URL structure in the future is difficult and costly when it comes to short-term organic traffic.

In order of my professional preference, your choices are:

  1. Subfolders. As with the language/translation option, this is my preferred setup, as it utilizes the same domain and subdomain across the board. This translates to utilizing some of the power you already built with other country-focused areas (or the initial site). This setup works well for adding different translations within one country (hybrid approach) down the line.
    Note: If you go with subfolders on both, always lead with the country, then language down the line.
    Example:
    www.domain.com/us/es (US-focused, in Spanish language) or www.domain.com/ca/fr (Canada-focused, in Canadian French).
  2. ccTLDs. This is the strongest signal that you’re focusing your content on a specific country. They geo-target automatically (one less step!), but that has a downside as well. If you started with a ccTLD and expanded later, you can’t geo-target a subfolder within a ccTLD at this point in time.
    Example: www.domain.ca/us will not work to target the US. The target will remain Canada. It might rank in the US, depending on the term competition and relevance, but you can’t technically geo-target the /us subfolder within the Canadian ccTLD.
  3. Subdomains. My last choice, because while you’re still on the same root domain, there’s that old SEO part of me that thinks a subdomain loses some equity from the main domain. BUT, if your tech team prefers this, there’s nothing wrong with using a subdomain to geo-target. You’ll need to claim each subdomain in Search Console and Bing Webmaster Tools and set the geo-target for each, just as you would with subfolders.
    Example: gb.domain.com

Content changes

The biggest question asked when someone embarks on country-targeting expansion is: “How much does my content need to change to not be duplicated?” In short — there is no magic number. No metric. There isn’t a number of sentences or a percentage. How much your content needs to change per country site or subsite is entirely up to your target market and your business.

You’ll need to do research into your new target market to determine how your content should change to meet their needs. There are a number of ways you might change your content to target a new country. The most common are:

Product differentiation

Offering a different set of products or services per country (removing those that are not in demand, are outlawed, or are otherwise unwanted, or adding products specific to that country) is itself a change to your site content.

Example #1: Amazon sells the movie “Elf” in the US and the UK, but they are different products. DVDs in Europe are coded for Europe and might not play on US players.

Example #2: Imagine you’re a drugstore in the UK and want to expand to the US. One of your products, 2.5% Selenium Sulphide, is not approved for use in the US. This is one among hundreds or thousands of products that are different.

Naming schema

The meaning of product names can change in different countries. How a specific region terms a product or service can change as well, making it necessary to change your product or service naming schema.

Keyword usage

Like the above, the words you use to describe your products or services might change in a new country. This can look like translation, but if only a few terms change, it isn’t considered full translation. There’s a fine line between the two. If you realize that the only thing you’re changing is the wording between US and UK English, for example, you might not need to geo-target at all; instead, mark the different pages as translations.

Keyword use change example: “Mum” versus “Mom” or “Mother” when it comes to Happy Mother’s Day cards. You need to offer different cards in this and other categories because of the country change. This is more than a word change, so it’s a case of geo-targeting — not just translation.

Translation change example: Etsy.com. Down at the bottom of the page, you can change your language setting. I set mine to UK English, and words like “favourite” started to show up. If this sounds like what you would need to do and your content would not change otherwise (Etsy shows all content to all users regardless of their location), consider translation only.

Pricing structure

One of the most common things to change in country-specific content is pricing. There’s the obvious issue of currency, but beyond that, different countries have different supply-and-demand markets that should, and will, change your pricing structure.

Imagery changes

When dealing with different cultures, sometimes you find the need to change your site imagery. If you’ve never explored psychology, I highly recommend checking out The Web Psychologist – Nathalie Nahai and some of her talks. Understanding your new target market’s culture is imperative to marketing effectively.

Example: Samsung changes the images on their UK versus China sites to change the focus from an individualistic to a collectivistic culture. See my presentation at SearchLove San Diego for more examples.

Laws, rules, and regulations

One of the most important reasons to change your content is to satisfy local laws and regulations. This depends on the business: you might deal with tons of regulations, while others deal with none. Check out local competitors, the biggest you can identify, to see what you might need to do.

Example: If you move into the UK and set cookies on your visitors’ machines, you have to alert them to the use of cookies. This is not required by law in the US and is easily missed.

User experience and IP redirects

When people start moving into other countries, one of the things they want to ensure is that users get to the right content. This is especially important when products change and the purchase of an incorrect product would cause issues for the user, or the product isn’t available to them. Your customer service, user experience, or legal team is going to ask that you redirect users to the correct country. Everyone gets to the right place and the headaches lessen.

There isn’t anything wrong with asking a user to select the country they reside in and set a cookie, but many people don’t want to bother their users. Therefore, they detect the user’s IP address and then force a redirect from there. There are two problems with this setup.

  1. IP addresses are inaccurate – I was in Seattle, WA once and my IP had me in Washington, DC. No kidding. Look at that distance on a map. Think about that distance in terms of Europe and how much might change there.
  2. Google crawls from California – For the time being, using an IP-based forced redirect will ensure your international content is not indexed. Google will only ever see the US content if you do a forced redirect.

You can deal with this by detecting the country using the IP address (or, for organic traffic, which version of Google the user came from) and using a JavaScript popup to ask for their preferred country, then setting a cookie with that preference. Then, even if the user clicks on another country’s content in the future, they will be redirected to their own.
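That decision order can be sketched roughly in Python; lookup_country stands in for whatever GeoIP service your stack provides, and the cookie handling is simplified:

```python
# Sketch: suggest a country section without force-redirecting.
# cookies is the request's cookie dict; lookup_country is a stand-in
# for a GeoIP service. Nothing here redirects the crawler.
def country_suggestion(cookies, ip, lookup_country, default="us"):
    # 1. A stored preference always wins: the user chose it explicitly.
    if "country_pref" in cookies:
        return cookies["country_pref"], False  # (country, show_popup?)
    # 2. Otherwise treat the IP only as a guess, and ask via a JS popup.
    guess = lookup_country(ip) or default
    return guess, True

country, ask = country_suggestion({}, "203.0.113.7", lambda ip: "gb")
print(country, ask)  # gb True
```

The second element of the returned pair drives the popup: it is only shown when no preference cookie exists, so returning visitors go straight to their chosen country.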

No hreflang??

If you went through that tool, you noticed that my geo-targeting plan does not include hreflang. Many other people disagree with me on this point, saying that the more signals you can send, the better.

Before I get into why I don’t recommend setting up hreflang between country targeted sub-sites, let me make one thing clear. Setting up hreflang will not hurt your site if you are really focusing on country targeting and it’s not that intricate of a setup yet (more on that later). Let’s say you’re in Canada and want to open a US-targeted site. Your content changes because your products change, your prices change, your shipping info changes. You create domain.com/us and geo-target it to the US. You can add hreflang between each page that is the same between the two sub-sites — two products that exist in both locations, for example. The hreflang will not hurt.

Example: If you don’t have the resources to change your content at the moment to fully target the UK, only translate your content a bit between your US (domain.com) and UK (domain.co.uk), and have plans to change your content down the road, an hreflang tag between those two ccTLDs can help Google understand the content change and who you’re targeting.

Why I don’t recommend hreflang for geo-targeting only

Hreflang was meant to help Google understand when two pages are exactly the same, but translated. It works much like a canonical tag (which is why a conflicting canonical can keep hreflang from working), in that you have multiple versions of one page with slight changes.

Many people get confused because of the ability to use country codes in hreflang tags. That option exists so you can tell Google about a dialect change: for example, two sub-sites that are identical except that the American English has been changed to British English. It’s not meant to tell Google that content is targeted at a particular country; that’s what geo-targeting is for.

When I recommend geo-targeting only, I make it very clear to clients that going down this route means you really need to change the content. International business is so much more than just translation. Translating content only might hurt your conversion rates if you miss some aspect of the new target market.

Hiring content writers in that country that understand the nuances is very important. I worked for a British company for 4 years, so I get some of the differences, but things continually surprise me still. I would never feel comfortable as an American writing content for a British audience.

I also don’t recommend hreflang in most geo-targeting cases, because the use of geo-targeting and hreflang can get really confusing. This has led to incorrect hreflang tags in the past that have wreaked havoc on Google’s understanding of the site structure.

Example: A business starts off with a Canadian domain (domain.ca) and a France domain (domain.fr). They use hreflang between the English for Canada and French for France using the code below. They then add a US site and the code is modified to add a line for the US content.


<link rel="alternate" hreflang="en" href="http://domain.ca/" />
<link rel="alternate" hreflang="fr" href="http://domain.fr/" />
<link rel="alternate" hreflang="en-us" href="http://domain.com/" />

This looks odd: the English-language page with no regional modifier sits on a Canadian-targeted domain, while the US English dialect version sits on a general top-level domain (.com is general, not US-specific, though people use it that way).

Remember, this is a bot that’s trying to logic out a structure. For a user that prefers UK English, there is no logical choice. The general English is a Canadian site and the general TLD is in US English. This is where we get some of the inconsistencies with international targeting.
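To make the bot’s dilemma concrete, here’s a toy Python resolver that mimics a common fallback (exact language-region match first, then bare language), run against a tag set like the one above:

```python
# Toy resolver: exact "lang-region" match first, then bare language.
# The tag set mirrors the example above; the domains are made up.
tags = {
    "en": "http://domain.ca/",
    "fr": "http://domain.fr/",
    "en-us": "http://domain.com/",
}

def resolve(user_locale, tags):
    locale = user_locale.lower()
    if locale in tags:          # exact dialect match
        return tags[locale]
    return tags.get(locale.split("-")[0])  # fall back to bare language

print(resolve("en-gb", tags))
# http://domain.ca/ -- a UK English user falls back to the Canadian site
```

Under this fallback, the UK user ends up on the Canadian domain purely because it carries the bare “en” tag, which is exactly the kind of inconsistency the example describes.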

You might be saying things like “That would never happen!” and “They should have changed the first English to Canadian English (en-ca)!”, but if you’ve ever dealt with hurried developers (they really do have at least 50 requests at once sometimes) you’ll know that they, like search bots, prefer consistency.

Hreflang should not be needed in geo-targeting cases because, if you’re really going to target a new country-specific market, you should treat it as a whole new market and create content just for it. If you can’t, or don’t think that’s needed, then providing language translations is probably all you need to do at the moment. And hreflang in geo-targeting cases can introduce code that confuses the search engines; the less we confuse them, the better the results!

Hybrid targeting

Finally, there is the route I call “hybrid,” or utilizing both geo-targeting and translation. This is what most major retail corporations should be doing if they’re international. Due to laws, currency, market changes, and cultural changes, there is a big need for geo-targeted content. But in addition to that, there are countries that require multiple language versions. There might be anywhere from one to a few hundred used languages in a single country! Here are the top countries that use the web and how many recognized languages are used in each.

(Image: top Internet-using countries and the number of recognized languages used in each)

Do you need to translate into all 31 languages used in the US? Probably not. But if 50% of your target market in Canada prefers Canadian French as their primary language, the translation investment might be a good one.

When a geo-targeted site (a ccTLD) or sub-site (a subdomain or subfolder) needs more than one language, geo-target the site or sub-site and then use hreflang within that country-specific section.

This statement can be confusing, so let me show you what I mean:

(Image: diagram of geo-targeted sub-sites with hreflang between the languages inside each country section)

This requires a good amount of planning and resources, so if you need to embark on this path in the future, start setting up the structure now. If you need to go the hybrid route, I recommend the following URL structures for language and country targeting. As with before, these are in order of my professional preference and are all focused on content targeted to Canada in Canadian French.

(Country structure/Language structure)

  1. Subfolder/Subfolder
    Example: domain.com/ca/fr
  2. Subfolder/Parameter
    Example: domain.com/ca/page.html?lang=fr
  3. ccTLD/Subfolder
    Example: domain.ca/fr
  4. ccTLD/Parameter
    Example: domain.ca/page.html?lang=fr
  5. Subdomain/Subfolder
    Example: ca.domain.com/fr
  6. Subdomain/Parameter
    Example: ca.domain.com/page.html?lang=fr
  7. ccTLD/Subdomain (not recommended, nor are the other combinations I intentionally left out)
    Example: fr.domain.ca
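If you standardize on the first option, generating the URLs from a single helper keeps every country/language pair consistent as new markets are added. A quick sketch (the domain and path are made up):

```python
# Sketch: build subfolder/subfolder URLs -- country first, then language,
# matching preference #1 above. Domain and path are hypothetical.
def hybrid_url(domain, country, language, path=""):
    return f"https://{domain}/{country.lower()}/{language.lower()}/{path}"

print(hybrid_url("www.domain.com", "ca", "fr", "cards.html"))
# https://www.domain.com/ca/fr/cards.html
```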

The hybrid option is where the hreflang setup can get the most messed up. Make sure you have mapped everything out before implementing, and ensure you’re considering future business plans as well.

I hope this helps clear up some of the confusion around international expansion. It really is specific to each individual business, so take the time to plan and happy expansion!

Troubleshooting International SEO: A flowchart

(Image: troubleshooting international SEO flowchart)



Are Keywords Really Dead? An Experiment

Posted by sam.nemzer

[Estimated read time: 6 minutes]

A quantitative analysis of the claim that topics are more important than keywords.

What’s more important: topics or keywords? This has been a major discussion point in SEO recently, nowhere more so than here on the Moz blog. Rand has given two Whiteboard Fridays in the last two months, and Moz’s new Related Topics feature in Moz Pro aims to help you to optimize your site for topics as well as keywords.

The idea under discussion is that, since the Hummingbird algorithm update in 2013, Google is getting really good at understanding natural language. So much so, in fact, that it’s now able to identify similar terms, making it less important to worry about minor changes in the wording of your content in order to target specific keyword phrases. People are arguing that it’s more important to think about the concepts that Google will interpret, regardless of word choice.

While I agree that this is the direction that we’re heading, I wanted to see how true this is now, in the present. So I designed an experiment.

The experiment

The question I wanted to answer was: “Do searches within the same topic (but with different keyword phrases) give the same results?” To this end, I put together 10 groups of 10 keywords each, with each group’s keywords signifying (as closely as possible) the same concept. The keywords were selected to represent a range of search volumes, across the spectrum from informational to transactional. For example, one group of keywords are all synonymous with the phrase “cheapest flight times” (not-so-subtly lifted from Rand’s Whiteboard Friday):

  • cheapest flight times
  • cheapest time for flights
  • cheapest times to fly
  • cheap times for flights
  • cheap times to fly
  • fly at cheap times
  • time of cheapest flights
  • what time of day are flights cheapest
  • what time of day to fly cheaply
  • when are flights cheapest

I put the sample of 100 keywords through a rank-tracking tool, and extracted the top ten organic results for each keyword.

Then, for each keyword group, I measured two things.

  1. The similarity of each topic’s SERPs, by position.
    • For example, if every keyword within a group has the same page ranking no. 2, that result will score 10. If 9 results are the same and one is different, nine results will get a score of 9, and the other will score 1.
    • This score is then averaged across all 100 (10 results * 10 keywords) results within each topic. The highest possible score (every SERP identical) is 10, the lowest possible (every result different) is 1.
  2. The similarity of each topic’s SERPs, by all pages that rank (irrespective of position).
    • As above, but scoring each keyword’s results by the number of other keywords that contain that result anywhere in the top 10 results. If a result appears in the top 10 for all keywords in a topic group, it scores a 10, even if the results in the other keywords’ SERPs are in different positions.
    • Again, the score is averaged across all results in each topic, with 10 being the highest possible and 1 the lowest.
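Since the scoring is easy to misread, the two metrics above can be sketched in a few lines of Python, shown here on toy data (the real experiment used 10 keywords and 10 results per group):

```python
# Sketch: the two SERP-similarity metrics on toy data.
# serps maps each keyword to its ranked list of result URLs.
def similarity_scores(serps):
    keywords = list(serps)
    by_position, all_pages = [], []
    for kw in keywords:
        for pos, url in enumerate(serps[kw]):
            # 1. How many keywords show this URL at this exact position?
            by_position.append(sum(serps[k][pos] == url for k in keywords))
            # 2. How many keywords show this URL anywhere in their results?
            all_pages.append(sum(url in serps[k] for k in keywords))
    n = len(by_position)
    return sum(by_position) / n, sum(all_pages) / n

toy = {
    "kw a": ["u1", "u2"],
    "kw b": ["u1", "u2"],
    "kw c": ["u2", "u1"],
}
print(similarity_scores(toy))  # positions differ, but the pages fully overlap
```

With three keywords the maximum score is 3; the toy group scores 3.0 on the all-pages metric (like group D in the results) while scoring lower by position, because one keyword ranks the same pages in a different order.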

Results

The full analysis and results can be seen in this Google Sheet.

This chart shows the results of the experiment for the 10 topic groups. The blue bars represent the by position score, averaged across each topic group, and the red bars show the average all pages score.

The most striking thing about this is the wide range of results that can be seen. Topic group D’s keywords are 100% identical if you don’t take ordering into account, whereas group J only has 38% crossover of results between keywords.

We can see from this that targeting individual keywords is definitely not a thing of the past. For most of the topic groups, the pages that rank in the top 10 have little consistency across different wordings of the same concept. From this we can assume that the primary thing making one page rank where another does not is exact keyword matching.

Why is there such variation?

If we look into what factors might be affecting the varying similarities between the different topic groups, we could consider the following factors:

  • Searcher intent: Informational (Know) vs Transactional (Do) topics.
  • Topics with high competition levels.

Searcher intent

Although Google’s categorisation of searches into do, know and go can be seen as a false trichotomy, it can still be useful as a simplistic model to classify searcher intent. All of the keyword groups I used can be classed as either informational or transactional.

If we break up our topic groups in this way, we can see the following:

As you can see, there’s no clear difference between the two types. In fact, the highest- and lowest-scoring groups (D and J) are both transactional.

This means that we can’t say — based on this data, at least — that there’s any link between the search intent of a topic and whether you should focus on topics over keywords.

Keyword difficulty

Another factor that could be correlated with similarity of SERPs is keyword difficulty. As measured by Moz’s keyword difficulty tool, this is a proxy for how strong the sites that rank in a SERP are, based on their Page Authority and Domain Authority.

My hypothesis here is that, for searches where there are a lot of well-established, high-DA sites ranking, there will be less variation between similar keywords. If this is the case, we would expect to see a positive correlation in the data.

This is not borne out by the data: if anything, the higher the keyword difficulty across the keywords in a topic group, the less similarity there is between SERPs within that group. However, this correlation is fairly weak (R² = 0.28), so we can’t draw any firm conclusions from it.
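The correlation itself is straightforward to check. As a hedged sketch (the input lists here are placeholders, not the experiment's actual numbers), Pearson's r can be computed with no external libraries, and squaring it gives the R² figure quoted above:

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length
    sequences, e.g. average keyword difficulty per topic group vs
    that group's average SERP-similarity score. Square the result
    to get R², the coefficient of determination."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5
```

A negative r with r² ≈ 0.28, as reported here, would indicate a weak inverse relationship: higher difficulty loosely tracking lower SERP similarity.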

One other factor that could explain the lack of pattern in this result is that 100 keywords in 10 groups is a fairly small sample size, and is subject to variation in the selection of keywords to go into each group. It is impossible to perfectly control how “close” in definition the keywords in each group are.

Also, it may just be the case that Google simply understands some concepts better than others. This would mean it sees some synonyms as very closely related, whereas for others it’s still perplexed by the variations, so it falls back on matching specific words within the content of each page.

Conclusion

So does this mean that we should or shouldn’t ignore Rand when he tells us to forget about keywords and focus on topics? Somewhat unsatisfyingly, the answer is a strong “maybe.”

While for some search topics there’s a lot of variation based on the exact wording of the keywords, for others we can see that Google understands what users mean when they search and sees variations as equivalent. The key takeaway from this? Both keywords and topics are important.

You should still do keyword research; it’s always going to be essential. But you should also consider the bigger picture and, as more tools that apply natural language processing become available, take advantage of them to understand the overall topics you should write about, too.

It may be a useful exercise to carry out this type of analysis within your own vertical, and see how well Google can tell apart the similar keywords you want to target. You can then use this to inform how exact your targeting should be.

Let me know what you think, and if you have any questions, in the comments.

