Changed content type is causing Google to crawl incorrectly

Date: 2021-12-21 15:24:32

On our WordPress site, we changed the name of one of our Custom Post Types from 'A' to 'B' and also changed the hierarchy of a few categories.

Now, the problem is that Google is still indexing/crawling the old 'A' CPT name and the old category structure, which leads either to random pages (because WordPress guesses and shows a page matching those keywords in the URL) or to 404 errors.

What can we do (via Webmaster Tools) to make Google re-index our whole site and start honoring our new structure? Thanks.

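A common companion step, outside Webmaster Tools, is to 301-redirect the old URLs to their new counterparts so that crawlers and visitors hitting the old 'A' URLs land on the new 'B' structure instead of a 404. Below is a minimal sketch in WordPress PHP for a theme's functions.php; the slugs 'a' and 'b' are placeholders, not the site's real post type slugs:

```php
// Minimal sketch (placeholder slugs): permanently redirect requests that
// still use the old CPT slug '/a/' to the same path under the new '/b/'.
add_action( 'template_redirect', function () {
    // Only step in for requests WordPress would otherwise 404.
    if ( ! is_404() ) {
        return;
    }

    $path = $_SERVER['REQUEST_URI'];

    if ( strpos( $path, '/a/' ) === 0 ) {
        $new_url = home_url( '/b/' . substr( $path, strlen( '/a/' ) ) );
        wp_redirect( $new_url, 301 ); // 301 = permanent; tells Google to update its index
        exit;
    }
} );
```

A 301 (rather than a 302) signals that the move is permanent, so Google replaces the old URLs in its index instead of retrying them.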

1 Solution

#1


Here is a brief explanation of Google's indexing policy:

The process

The crawl process begins with a list of web addresses from past crawls and sitemaps provided by website owners. As Google's crawlers visit these websites, they look for links to other pages to visit. The software pays special attention to new sites, changes to existing sites, and dead links.

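A sitemap is simply an XML file listing the URLs the owner wants crawled, so resubmitting an updated one after a restructuring is the most direct way to hand Google the new URL list. A minimal sketch, with example.com and the paths as placeholders:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal sitemap: list only URLs under the NEW structure. -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/b/some-post/</loc>
    <lastmod>2021-12-21</lastmod>
  </url>
  <url>
    <loc>https://example.com/new-category/another-post/</loc>
    <lastmod>2021-12-21</lastmod>
  </url>
</urlset>
```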

Computer programs determine which sites to crawl, how often and how many pages to fetch from each site. Google doesn't accept payment to crawl a site more frequently for your web search results. They care more about having the best possible results because in the long run that's what's best for users and, therefore, their business.

Choices for website owners

Most websites don't need to set up restrictions for crawling, indexing or serving, so their pages are eligible to appear in search results without having to do any extra work.

That said, site owners have many choices about how Google crawls and indexes their sites, through Webmaster Tools and a file called "robots.txt". With the robots.txt file, site owners can choose not to be crawled by Googlebot, or they can provide more specific instructions about how to process pages on their sites.

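robots.txt is a plain-text file served from the site root. A minimal sketch (the /private/ path is a placeholder):

```
# Served at https://example.com/robots.txt
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow:

# Point crawlers at the current sitemap
Sitemap: https://example.com/sitemap.xml
```

The empty `Disallow:` under `User-agent: *` means all other crawlers may fetch everything.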

Site owners have granular choices and can choose how content is indexed on a page-by-page basis. For example, they can opt to have their pages appear without a snippet (the summary of the page shown below the title in search results) or a cached version (an alternate version stored on Google's servers in case the live page is unavailable). Webmasters can also choose to integrate search into their own pages with Custom Search.

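One common way to express these page-level choices is the robots meta tag in a page's <head> (the same directives can also be sent as an X-Robots-Tag HTTP header). For example:

```html
<!-- Keep the page indexed, but ask not to show a snippet or cached copy. -->
<meta name="robots" content="nosnippet, noarchive">
```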

Read more here and here.
