Update
Ok - I now know where the multiple page loads are coming from! (However, the mystery is not yet solved.)
It seems that immediately after a request is made to a page containing AdSense ads, Google makes a request for exactly the same URL (one or more times).
E.g. this is what the logs look like (note the requests from Mediapartners-Google):
2011-07-20 09:50:20 xxx.xxx.xxx.xxx GET /requestedURL/ 80 - xxx.xxx.xxx.xxx Mozilla/5.0+(Browserstring removed) 200 0 0 1140
2011-07-20 09:50:20 xxx.xxx.xxx.xxx GET /requestedURL/ 80 - 66.249.72.52 Mediapartners-Google 200 0 64 218
2011-07-20 09:50:22 xxx.xxx.xxx.xxx GET /requestedURL/ 80 - 66.249.72.52 Mediapartners-Google 200 0 0 171
(I should have paid more attention to the IIS logs rather than my own application logs - it just didn't occur to me that these multiple, identical, simultaneous requests could have been coming from different sources.) This also explains why I couldn't find anything strange when analysing the request with Wireshark, and why Fiddler didn't show anything strange.
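To check how often this happens across a whole log file, the entries can be grouped by URL and split by source. The following is only a minimal sketch in Python: it assumes every line has the same field order as the three sample lines above (date, time, server IP, method, URI, port, username, client IP, user agent, then four numeric fields), and the function names are made up for illustration.

```python
from collections import defaultdict

CRAWLER_TOKEN = "Mediapartners-Google"  # user agent seen in the log sample


def parse_line(line):
    """Split one log line laid out like the sample lines.

    Assumed field order: date time s-ip method uri port username c-ip
    user-agent status substatus win32-status time-taken.
    """
    parts = line.split()
    if len(parts) < 13:
        return None
    return {
        "timestamp": parts[0] + " " + parts[1],
        "method": parts[3],
        "uri": parts[4],
        "client_ip": parts[7],
        # The redacted browser string contains a space, so everything
        # between the client IP and the last four fields is the user agent.
        "user_agent": " ".join(parts[8:-4]),
        "status": int(parts[-4]),
    }


def count_hits_by_source(lines):
    """Return {uri: {"crawler": n, "other": m}} for the given log lines."""
    counts = defaultdict(lambda: {"crawler": 0, "other": 0})
    for line in lines:
        if line.startswith("#"):  # skip W3C header lines such as "#Fields:"
            continue
        entry = parse_line(line)
        if entry is None:
            continue
        kind = "crawler" if CRAWLER_TOKEN in entry["user_agent"] else "other"
        counts[entry["uri"]][kind] += 1
    return dict(counts)


SAMPLE = [
    "2011-07-20 09:50:20 xxx.xxx.xxx.xxx GET /requestedURL/ 80 - xxx.xxx.xxx.xxx Mozilla/5.0+(Browserstring removed) 200 0 0 1140",
    "2011-07-20 09:50:20 xxx.xxx.xxx.xxx GET /requestedURL/ 80 - 66.249.72.52 Mediapartners-Google 200 0 64 218",
    "2011-07-20 09:50:22 xxx.xxx.xxx.xxx GET /requestedURL/ 80 - 66.249.72.52 Mediapartners-Google 200 0 0 171",
]

print(count_hits_by_source(SAMPLE))
# {'/requestedURL/': {'crawler': 2, 'other': 1}}
```

If the real log has a different `#Fields:` layout, the index positions in `parse_line` would need adjusting.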
So the question for the bounty now becomes:
- Why is Google making these requests so quickly after the page is requested? (I know they need to assess the page for content, but immediately after, and multiple times, seems like abuse to me.)
- What can I do to stop this?
And out of interest:
- Has anyone else seen something similar in their logs? (Or is this something weird with my AdSense account?)
Ok, I'll apologise in advance for the length!...
This question is related to this one, regarding Google AdSense JavaScript code causing errors (of the form: Unable to post message to googleads.g.doubleclick.net. Recipient has origin something.com).
I won't duplicate all of the information there, but the conclusion seems to be that the AdSense JS is buggy (please read the question for background if you have time). I knew about this problem for some time, but decided to live with the JS errors rather than pulling AdSense from the site.
However, recently I noticed that in my ASP.NET MVC2 application, Controller Actions seemed to be called twice per page request (sometimes even 3 times). Oddly, it was only happening on the production server. After some thought I realised that one difference between the Dev and Production environments was that the AdSense JavaScript was only active in production.
To test this I removed all AdSense code from one of the production pages and, lo and behold, the multiple-page-load problem went away!
I thought that perhaps the general JS errors on the page were causing the problem, so to test this I introduced some simple errors into my own JS code; however, this did not cause the multiple-page-load problem to reappear.
One known situation where pages can be called multiple times per request is when there are image tags with empty src attributes, or external resource references with empty src attributes. Crucially, the most upvoted answer to the AdSense JS Bug question notes that:
"The targetOrigin argument in this call, this.la is set to http://googleads.g.doubleclick.net. However, the new iframe was written with its src set to about:blank."
This seems eerily similar to the empty src issue... This seems too much of a coincidence, and currently I'm of the opinion that this is the problem. [EDIT: This was a red herring]
However, I've no idea where to go from here. These multiple action calls are causing real problems (I'm having to use code blocking, serialised transactions, and all sorts of nasty hacks to limit adverse effects). Of course, I could be barking up the wrong tree entirely - I'm puzzled that I can't find any other references to this, given the ubiquity of AdSense and the nature of the problem (but then again the conclusions of the AdSense JS Bug question are also surprising). I would love this to turn out to be a stupid mistake on my part, so I need a sanity check.
I'd like to ask the community:
- Has anyone else experienced this problem, or can anyone who is using AdSense replicate and confirm it? [See note below]
- Assuming the problem is what it seems, what can I do? (Other than pulling AdSense, of course.)
- If not, then what might be causing this?
To summarise:
- My actions are being executed 2 (sometimes 3) times per page request.
- THIS ONLY HAPPENS WHEN GOOGLE ADSENSE ADS ARE PRESENT
- I removed all AdSense JS and introduced an error into my own JS: actions are called only once...
- A similar problem can happen when empty src properties are present on the page
- An answer to a previous question summarises that the AdSense JS sets a src="about:blank" on an iframe
- I have come to the conclusion that the src="about:blank" from the AdSense code is the most likely source of the problem.
- If I disable JavaScript on the browser, the problem goes away
Just to document the things I have ruled out:
- This is happening across browsers: Chrome (12), Firefox (5) and IE (8).
- I have disabled all plugins on browsers (YSlow, Firebug etc...)
- There are no empty src attributes (src="" / src="#") for images or other external resources in the HTML in my code
- There are no empty url references in the CSS (url(''))
- It's unlikely to be a server-side code/config problem, as it doesn't happen in Dev (and one of the few differences between Dev and Production is the absence of AdSense JS in Dev)
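For anyone who wants to repeat the empty-src check on their own rendered HTML rather than by eye, here is a rough sketch using Python's standard `html.parser`. The class and function names are made up for illustration, and treating `#` as "empty" simply mirrors the two values listed above.

```python
from html.parser import HTMLParser


class EmptySrcFinder(HTMLParser):
    """Collect tags whose src attribute is missing, empty, or just '#'."""

    SUSPECT_VALUES = {"", "#"}

    def __init__(self):
        super().__init__()
        self.findings = []  # list of (tag name, line number) tuples

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        for name, value in attrs:
            if name == "src" and (value is None or value.strip() in self.SUSPECT_VALUES):
                self.findings.append((tag, self.getpos()[0]))


def find_empty_src(html):
    """Return the suspicious tags found in an HTML string."""
    finder = EmptySrcFinder()
    finder.feed(html)
    finder.close()
    return finder.findings


page = '<img src="">\n<script src="#"></script>\n<img src="/ok.gif" />'
for tag, line in find_empty_src(page):
    print("suspicious <%s> on line %d" % (tag, line))
```

This only inspects static markup; an iframe written by script at runtime (as the AdSense code does) would not show up here.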
Note: For anyone looking to replicate this, it should be noted that, strangely, when the multiple action calls happen, Fiddler shows only one request being sent to the server. I have no idea why this should be the case, but the server logging doesn't lie :) Perhaps someone who has seen this problem caused by empty src attributes in img tags can say whether they saw the same behaviour with Fiddler.
Requested extra information
HTML (@Ivan)
Here's how I'm implementing AdSense (ids removed):
<%@ Control Language="C#" Inherits="System.Web.Mvc.ViewUserControl" %>
<div class="ad">
<%if (!HttpContext.Current.IsDebuggingEnabled) { %>
<script type="text/javascript"><!--
google_ad_client = "ca-pub-xxxxxxxxxxxxxxx";
/* xxxxxxxxxxxxxxx */
google_ad_slot = "xxxxxxxxx";
google_ad_width = 728;
google_ad_height = 15;
//-->
</script>
<script type="text/javascript" src="http://pagead2.googlesyndication.com/pagead/show_ads.js">
</script>
<%} else { %>
<img src="/Content/images/googleAdMock728x15_4_e.gif" width="728" height="15" />
<%} %>
</div>
This is being inserted by a RenderPartial in the View:
<% Html.RenderPartial("AdSense_XXXXXX"); %>
TCP Logging (@Tomas)
So far I have done a Wireshark capture:
- on the client when requesting a page on production with the problem
- on the client when requesting a page on production without the problem (i.e. AdSense removed)
I can't really see a significant difference between the two (although my network skills are not great). One thing to note is that they both seem to have a TCP retransmission of the HTTP request immediately after the initial request - I don't know the significance of that. I can confirm though that in case 1 the server logs reported 2 executions, and in case 2 only one execution.
Next I will try TCP logging on the server side in both cases, and post results here.
4 Answers
#1
1
Given that the behaviour you are observing appears to be hard to avoid, can we rather focus on workarounds?
Can you differentiate requests based on the User-Agent, and thus filter out the crawler's requests? Could that be a viable approach for you? If so, then you could probably base it on this approach: http://blog.flipbit.co.uk/2009/07/writing-iphone-sites-with-aspnet-mvc.html Here they detect iPhones, but the concept is the same for the Mediapartners-Google bot.
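The idea can be sketched independently of the framework. The Python below is an illustration only, not the linked article's code: `is_adsense_crawler` and `handle_request` are hypothetical names, and the two callables stand in for whatever the real action does. In an ASP.NET MVC app the same test would be made against the request's User-Agent header, for example in an action filter.

```python
ADSENSE_CRAWLER_TOKEN = "mediapartners-google"  # matched case-insensitively


def is_adsense_crawler(user_agent):
    """True when the User-Agent header identifies the AdSense crawler."""
    return bool(user_agent) and ADSENSE_CRAWLER_TOKEN in user_agent.lower()


def handle_request(user_agent, render_page, record_side_effects):
    """Render the page for everyone, but skip side effects for the crawler.

    render_page and record_side_effects are hypothetical stand-ins for the
    application's own code (e.g. rendering a view, writing to the database).
    """
    if not is_adsense_crawler(user_agent):
        record_side_effects()
    return render_page()


# Example: only the browser request triggers the side effect.
calls = []
for agent in ["Mozilla/5.0 (Windows NT 6.1)", "Mediapartners-Google"]:
    handle_request(agent, lambda: "<html>...</html>", lambda: calls.append(agent))
print(calls)
# ['Mozilla/5.0 (Windows NT 6.1)']
```

The page is still rendered for the crawler in this sketch, on the assumption that it needs the content to pick relevant ads; only the work that must happen once per real visit is skipped.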
#2
4
Mediabot is the name given to the web crawler that Google uses to crawl webpages for the purpose of analysing their content, so that Google AdSense can serve contextually relevant advertising on the page.
In my experience, it is unpredictable and, yes, it can be pretty heavy and annoying.
If you don't want the Mediapartners bot to access a specific page, you can disallow it in your robots.txt with:
#
# disallow adsense bot
#
User-agent: Mediapartners-Google
Disallow: path to your specific page
This will have the drawback of serving untargeted ads on that specific page.
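Before deploying such a rule it may be worth checking that it blocks only what is intended. The sketch below feeds a robots.txt like the one above to Python's standard `urllib.robotparser`. The path `/requestedURL/` is just the placeholder from the question's log sample, and this shows how Python's parser reads the file, which is not a guarantee of how Google's crawler behaves.

```python
from urllib.robotparser import RobotFileParser

# Same rule as above, with an assumed example path filled in.
ROBOTS_TXT = """\
#
# disallow adsense bot
#
User-agent: Mediapartners-Google
Disallow: /requestedURL/
"""


def build_parser(robots_txt):
    """Parse robots.txt content held in a string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
print(parser.can_fetch("Mediapartners-Google", "/requestedURL/"))  # False
print(parser.can_fetch("Mediapartners-Google", "/other/"))         # True
print(parser.can_fetch("Mozilla/5.0", "/requestedURL/"))           # True
```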
If you always see this pattern on the same page with different query strings, adding a rel="canonical" link could ease the pain.
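As a rough illustration of that suggestion, the Python sketch below builds a canonical link tag by dropping the query string and fragment from the requested URL. It assumes the query string does not change the page's content, `canonical_link_tag` is a hypothetical helper name, and example.com is a placeholder host. Whether the AdSense crawler then requests fewer variants is not something verified here.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(requested_url):
    """Return a <link rel="canonical"> tag pointing at the URL without query or fragment."""
    parts = urlsplit(requested_url)
    canonical = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return '<link rel="canonical" href="%s" />' % escape(canonical, quote=True)


print(canonical_link_tag("http://example.com/requestedURL/?page=2&sort=asc#top"))
# <link rel="canonical" href="http://example.com/requestedURL/" />
```

The tag would go in the page's head section.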
If you can't resolve this issue, and you see it as abuse, don't hesitate to ask for help on the Crawling, Indexing and Ranking Google support forum.
#3
0
Aside from the embedding of the AdSense code itself, there are two things related to AdSense that differ in your two test cases:
- What else happens when !HttpContext.Current.IsDebuggingEnabled? This appears to be the de facto production flag; maybe something else, somewhere, depends on this same flag.
- Is it possible that Html.RenderPartial("AdSense_XXXXXX") is somehow causing your Controller to jump back to the beginning of its execution?
From your description, it seems like the execution is happening twice on the server but only one request is being sent from the client. This implies a server error, and these two lines are the crux of your AdSense triggering. To further narrow it down, try embedding the AdSense partial directly instead of calling Html.RenderPartial(). If that doesn't change the result, it might be worth a sanity check on what else switches on HttpContext.Current.IsDebuggingEnabled.
Failing that, it might be helpful to know whether your server-side logging takes place as the request is received, before the response is sent, or after the response is sent.
#4
0
Yes, I just detected this during a TeamViewer session with my partner. On my box, the main page of my site loads only once per request.
Then, by coincidence, while using Fiddler we saw my partner getting 4 requests to the same page. It is a 1.5 MB page with big scripts and lots of other dependencies, so this was truly a WTF moment, as I have never seen anything like this in 15 years of web development.
If Google is doing this, I must say they should realise that today's sites might have very big pages and very big audiences. That could mean they are jacking up bandwidth by a factor of 4 per request. Like I said, WTF?????
I wish this Q&A had a more definitive resolution. I do use the Google Translate widget, but this is only occurring on his box and only for the main page. The other pages also use the translate widget, and I do request jQuery via the Google CDN. Could anything from Google be doing this?