This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Don't let search bots look at buglist.cgi


On 05/16/2011 01:09 PM, Michael Matz wrote:
> Hi,
> 
> On Mon, 16 May 2011, Andrew Haley wrote:
> 
>> On 16/05/11 10:45, Richard Guenther wrote:
>>> On Fri, May 13, 2011 at 7:14 PM, Ian Lance Taylor <iant@google.com> wrote:
>>>> I noticed that buglist.cgi was taking quite a bit of CPU time.  I looked
>>>> at some of the long running instances, and they were coming from
>>>> searchbots.  I can't think of a good reason for this, so I have
>>>> committed this patch to the gcc.gnu.org robots.txt file to not let
>>>> searchbots search through lists of bugs.  I plan to make a similar
>>>> change on the sourceware.org and cygwin.com sides.  Please let me know
>>>> if this seems like a mistake.
>>>>
>>>> Does anybody have any experience with
>>>> http://code.google.com/p/bugzilla-sitemap/ ?  That might be a slightly
>>>> better approach.
>>>
>>> Shouldn't we keep searchbots away from bugzilla completely?  Searchbots
>>> can crawl the gcc-bugs mailing list archives.
>>
>> I don't understand this.  Surely it is super-useful for Google etc. to
>> be able to search gcc's Bugzilla.
> 
> gcc-bugs provides exactly the same information, and doesn't have to 
> regenerate the full web page for each access to a bug report.

It's not quite the same information, surely.  Wouldn't searchers be directed
to an email rather than the bug itself?

Andrew.
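
A minimal sketch of the distinction discussed in this thread: blocking
only the bug list queries while leaving individual bug pages crawlable.
It uses Python's standard urllib.robotparser to evaluate a robots.txt
rule.  The rule text, the /bugzilla/ path prefix, the bot name and the
bug id are all assumptions made for illustration; the committed patch
itself is not shown in this message.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption, not the actual
# gcc.gnu.org file): block the list queries for every crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow: /bugzilla/buglist.cgi
"""


def build_parser(text):
    """Return a RobotFileParser loaded from robots.txt text."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" and the bug id below are made-up placeholders.
list_allowed = parser.can_fetch(
    "ExampleBot", "http://gcc.gnu.org/bugzilla/buglist.cgi?product=gcc")
bug_allowed = parser.can_fetch(
    "ExampleBot", "http://gcc.gnu.org/bugzilla/show_bug.cgi?id=12345")

print("buglist.cgi crawlable: ", list_allowed)
print("show_bug.cgi crawlable:", bug_allowed)
```

Under a rule of this shape, a well-behaved crawler would skip the
expensive list queries but could still index each bug report directly,
so searchers would land on the bug rather than on an email about it.
Disallowing /bugzilla/ as a whole would instead give the behaviour
Richard suggests.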

