What is indexable content?
Indexability is a web page's ability to be indexed by search engines. Only indexable pages can show up in search results. To index a web page, a search engine such as Google must first discover the page's URL, crawl it, and then add it to its index.
The report can be found in Ryte's Website Success module and consists of three components: an overview bar chart, a document type filter, and the list of URLs itself.
Figure 1: The indexability report on Ryte
You want your pages to rank highly. For this to happen, the search engine crawler first has to crawl the pages and add them to its index. Making sure your website is indexable is therefore an important prerequisite for ranking. Ryte's indexability report shows you which URLs of your website are indexable by search engines, and which are not indexable for various reasons.
Possible reasons why website content is not indexable:
- paginated pages (rel="prev"/rel="next")
- pages with a "noindex" attribute in the robots meta tag
- pages that contain a redirect
- pages with a canonical tag pointing to another URL
- error pages
- pages blocked via the Disallow directive in robots.txt
The report therefore gives you a clear overview of the indexable and non-indexable pages. You should take a closer look at the non-indexable pages to find possible problems.
Bear in mind that search engines have limited resources (crawl budget) and therefore may not cover all areas of your website in each crawl session, particularly if your website has more than a few thousand URLs. If there are certain pages you don't want indexed, for example because they have no value for the user, you should instruct the search engine bot not to crawl these pages, for example by using the Disallow directive in robots.txt. This way, you essentially "present" the search engine with the most important URLs, without wasting the crawler's resources.
Example: The SEA department regularly has stand-alone pages for its campaigns. These pages are saved in a separate directory (/sea). Over time, hundreds of landing pages accumulate. Since the campaign pages are specifically created for SEA, they are not well-suited for ranking in organic search. Such pages should therefore be excluded from crawling, as they do not need to be indexed.
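Excluding such a directory from crawling takes a single rule in robots.txt. A minimal sketch for the /sea example above:

    User-agent: *
    Disallow: /sea/

Any bot that respects robots.txt will then skip every URL in the /sea directory and spend its crawl budget elsewhere.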
The first part of Ryte's indexability report is an overview graph in the form of a bar chart, with sections marked in three different colors.
Green: Everything is okay! No action is needed for these URLs. These pages are usually indexed without any problems.
Yellow: These URLs are not indexable, but usually for standard technical reasons (see below); it is worth double-checking whether they really should stay out of the index.
Red: This indicates that there could be an error with these URLs, for example they could have been inadvertently excluded from the robots.txt. You should definitely check these.
Figure 2: The bar chart above the indexability report
It is advisable to start with the red bars, as these show potential errors with your website. Clicking on one of the red bars, i.e. "broken" or "disallowed via robots.txt", activates the respective filter and you will see a list of the URLs that are not indexable.
Figure 3: Only the non-indexable pages are listed after clicking on the bar "broken"
The opportunity: Non-indexable pages with a high OPR
The OPR (OnPage Rank) indicates how strong a URL is within a domain. It is calculated in a similar way to Google's PageRank: pages that are well linked internally have a higher OPR. Non-indexable URLs with a high OPR are therefore a waste of link power. Here, you might want to consider whether it would make sense to make the URL indexable, or to link from it to other pages that could profit from this link power. The indexability report lets you sort the entire left column by OPR and therefore easily identify pages with a high OPR.
The yellow bars in the indexability report usually show URLs containing:
Paginated pages
Paginated pages carry rel="prev" and rel="next" link elements that tell the search engine that a previous and a subsequent page exist. As soon as the bot comes across these attributes, the search engine recognizes that this is a series of similar pages and does not index them. You can find out more about pagination here.
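In the page source, this pagination markup sits in the head of the document and looks roughly like this (example.com and the page numbers are placeholders):

    <link rel="prev" href="https://www.example.com/category?page=1">
    <link rel="next" href="https://www.example.com/category?page=3">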
Redirects
Here, the pages return a 301 or 302 redirect status code, meaning that they refer to another page. The search engine bot therefore simply moves on and does not index the redirected page.
Canonicals on other pages
If a page contains a canonical tag referring to an alternative URL, the search engine does not index this page.
Side note: Indexable URLs without canonicals
The canonical tag is the best way of avoiding duplicate content. For example, if you are tracking a campaign and have added campaign parameters to your URL, the same content becomes accessible under several different URLs. Without a canonical tag, search engines would treat this as duplicate content, which can hurt the pages' ability to rank.
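A sketch of such a canonical tag, placed in the head of every parameterized variant and pointing to the clean URL (example.com stands in for your domain):

    <link rel="canonical" href="https://www.example.com/landing-page/">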
noindex via robots tag
This attribute speaks for itself. In the robots meta tag, the noindex attribute instructs the search engine not to index the page.
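In the page source, this is a single line in the head of the document:

    <meta name="robots" content="noindex">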
Below the bar chart is an overview of the document types found in the report. Clicking on a specific document type (e.g., HTML, PDF, image, etc.) activates the filter.
Figure 4: Easily filter out document types
Assume the report is showing a large number of PDFs that are indexable. You should first ask yourself whether the PDFs should be indexable at all. Although PDFs are effective, stable, and often rank well in search engines, they are also a dead end for the user: once users land on a PDF, they can only get back to your website by editing the URL.
You can identify the indexable PDFs by combining the filters described above: click on the document type "PDF" and then on the green "indexable" section of the bar chart.
Ideally, the green bar should be the tallest, with far more indexable than non-indexable content. Even then, the report can still give you a lot of information. The following are examples of how you can review indexable pages in more detail; add extra columns to the indexability report in order to analyze these situations.
1. Indexable URLs that are not in the sitemap. If pages are meant to be indexed, they should also be in the XML sitemap. You can find out which URLs are in the XML sitemap by clicking on the menu item "Sitemaps" > "Included in sitemaps".
Figure 5: Are the indexable URLs in the sitemap?
2. Indexable URLs that only have a few inbound links. A good internal link structure is also very important for ranking. In order to ensure that users find the pages, indexable URLs should have a sufficient number of internal links.
3. Indexable URLs that have a long click path. The click path measures the shortest path from the homepage to the respective URL. If navigating to a URL requires too many clicks, search engines perceive this page as not being very important.
Tip: You should have a closer look at URLs that are too far from the homepage and consider whether it would be logical to add links from one page to another in order to improve the internal navigation and, thus, raise your rankings.
4. Indexable pages that do not have a title and description. Indexable URLs without a title and meta description can still be listed in the search engine results, but the pages may rank poorly, and the search engine will generate the snippet description itself. You should craft the snippet yourself to encourage users to click.
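Both tags belong in the head of the page; the wording in this sketch is purely illustrative:

    <title>Indexability: How to Check Which Pages Can Be Indexed</title>
    <meta name="description" content="Find out which pages of your website search engines can index, and why others are excluded.">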
5. Indexable pages with thin content (less than 300 words). Pages that contain little content often rank lower than those with relevant content. Identify all indexable pages on your website that have fewer than 300 words; this helps you prevent search engines from classifying your domain as having so-called thin content.
The indexability report not only gives you a very good overview of the non-indexable content, but also shows you shortcomings of the pages that are already in the search engine's index. Keep regular track of the indexing of your pages, as this is the alpha and omega of successful search engine rankings.
Happy Analyzing!
Because, let's be honest, how many users search beyond the first page? Hardly any.
For your site to have visibility on Google, Bing, and other search engines, and a chance of overtaking your competitors, it first needs to be indexed.
But how do you get it out of the shadows and into the light? This process is called indexation.
As mentioned earlier, having indexable content is a necessary step for your site to be ranked among the top results. This is where web crawlers come into play.
Web crawlers provide search engines with useful information from the billions of sites on the internet. Their goal is to detect web pages and "take note" of the subjects covered within them, so that your page can appear in the results when a user searches for your keyword or topic.
Despite their goodwill, robots are not infallible. They sometimes miss out on blogs or sites that are otherwise very interesting.
It is therefore essential to organize and develop your website so that it can be located and easily crawled by robots. In other words, to optimize it.
We often write about our hobby horse, organic search optimization, or SEO. And, of course, indexing is an integral part of this considerable undertaking.
You no doubt understand that indexable content is much more valuable to you.
To check that your site is in the Google index, type site:www.yoursite.com into the search bar. The number of results corresponds to the number of indexed pages on your website.
Now, how can you achieve a website that is properly detected and crawled by the search engines?
It is possible to ask Google to proceed with the indexation of your site, either by submitting a form or placing a link to your site on another site (backlinking). Of course, make sure that the site hosting your link is itself indexed!
Furthermore, the content must be authentic and unique. Much like at school, duplicating content is very negatively perceived by Google.
Another way to be noticed by crawlers is to add a custom TITLE tag. This is a particularly important step during the indexing process. The same goes for the META DESCRIPTION. Do not leave it to the machine: an auto-generated snippet risks grabbing the attention of neither the crawlers nor your potential customers. Optimize them!
If you use images or videos, you also need to optimize them by filling in the title and description. Search engines are not yet able to reliably recognize visual content, so without this text they cannot take it into account.
Think about it: if Google can’t detect the image subject (is it Barack Obama or a cat?), how then can it find the image when someone searches for the first black president of the United States?
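In practice this means descriptive file names and, above all, the alt attribute; a sketch with placeholder values:

    <img src="/images/barack-obama.jpg" alt="Barack Obama, the first black president of the United States">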
Some kinds of content are simply harder for crawlers to read, and content that cannot be indexed ends up on what is called the deep web. Make sure your valuable data does not get stuck in this vast meandering space!
In other words, indexing is the process of detecting and analyzing your site that is performed by search engine robots.
If your content is not indexed, your page will not show up in searches, even if your information is the most relevant.
There are several ways to get your pages out of the hidden abyss of the internet, and they usually come down to optimizing your content. It's no surprise that content writing is in such high demand!
What is crawlability and indexability? Watch this video with Will from the Internet Marketing team to find out!
Transcript: Google’s search engine results pages (SERPs) may seem like magic, but when you look more closely, you see that sites show up in the search results because of crawling and indexing. This means for your website to show up in the search results, it needs to be crawlable and indexable.
Search engines have these bots we like to call crawlers. They basically find websites on the internet, crawl their content, follow any links on the site, and then create an index of the sites they’ve crawled.
The index is this huge database of URLs that a search engine like Google puts through its algorithm to rank.
You see the results of the crawling and indexing when you search for something and the results pages load. It’s all the sites a search engine has crawled and has deemed relevant to your search based on a bunch of different factors. I won’t touch on the algorithm that Google and other search engines use to figure out what content is relevant to a search, but you can check out our website to learn more.
Crawlability means that search engine crawlers can read and follow links in your site’s content. You can think of these crawlers like spiders following tons of links across the web. Indexability means that you allow search engines to show your site’s pages in the search results.
If your site is crawlable and indexable, that’s excellent!
If it's not, you could be losing out on a lot of potential traffic from Google's search results. And this lost traffic translates to lost leads and lost revenue for your business.
It’s easy. Go to Google or another search engine and type in site:…
and then your site’s address. You should see the results for how many pages on your site have been indexed. If you don’t see anything, don’t worry, I’ll tell you how to fix it and get your site submitted to Google!
You want crawlers to get to every page on your site, right?
Then make sure every page on your site has a link pointing to it.
So, looking at Target as an example, you can easily follow the links in their navigation to get from page to page. If you click on women’s clothing, you can see even more links to different types of clothing, and then links to even more specific types of clothing within that menu. There are links leading to every page, which a crawler will follow.
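Under the hood, each of those menu entries is just an ordinary HTML link a crawler can follow; the URL here is a made-up placeholder:

    <a href="/c/womens-clothing">Women's Clothing</a>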
If you don’t have a lot of internal links, HTML sitemaps can give crawlers links to follow on your site.
HTML sitemaps are for people and search engines, and they list links to every page on your site.
You can usually find them in the footer of a site. But best practice is to include links to every page throughout relevant content and in navigational tabs on your site.
Again, links matter for your site. But backlinks are much harder to get than internal links because they come from someone outside of your business.
Your site gets a backlink when another site includes a link to one of your pages. So when crawlers are going through that external site, they’ll reach your site through that link as long as they’re allowed to follow it.
The same happens for other websites if you link to them in your content.
Backlinks are tricky to get, but check out our link building video to learn how you can earn them for your business.
It’s good practice to submit an XML sitemap of your site to Google Search Console. Check out our video on XML sitemaps to learn all about them.
But not right now. It’s my time to shine. Here’s a short summary.
XML sitemaps should contain all of your page URLs so crawlers know what you want them to crawl. They’re different from HTML sitemaps because they’re just for crawlers. You can create an XML sitemap on your own, use an XML sitemap tool, or even use a plugin if it’s compatible with your site’s CMS. But don’t include links you don’t want crawled and indexed in your sitemap. This can be something like a landing page for a really targeted email campaign.
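Stripped down to the essentials, an XML sitemap is just a list of URL entries; a minimal sketch with example.com standing in for your domain:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://www.example.com/</loc>
      </url>
      <url>
        <loc>https://www.example.com/products/</loc>
      </url>
    </urlset>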
This one’s a little more technical.
A robots.txt file is a file on the backend of your site that tells crawlers what they can’t crawl and index on your site. If you’re familiar with robots.txt, make sure you’re not accidentally blocking a crawler from doing its job.
If you’re blocking a crawler, it will look something like this. The term user-agent refers to the bot crawling your site. So, for example, Google’s crawler is called Googlebot and Bing’s is Bingbot.
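For example, a robots.txt rule that blocks Google's crawler from an entire site would read:

    User-agent: Googlebot
    Disallow: /

Swap in a different user-agent, or narrow the Disallow path, and you control exactly which bots can reach which parts of your site.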