This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: Don't let search bots look at buglist.cgi
On 05/16/2011 01:09 PM, Michael Matz wrote:
> Hi,
>
> On Mon, 16 May 2011, Andrew Haley wrote:
>
>> On 16/05/11 10:45, Richard Guenther wrote:
>>> On Fri, May 13, 2011 at 7:14 PM, Ian Lance Taylor <iant@google.com> wrote:
>>>> I noticed that buglist.cgi was taking quite a bit of CPU time. I looked
>>>> at some of the long running instances, and they were coming from
>>>> searchbots. I can't think of a good reason for this, so I have
>>>> committed this patch to the gcc.gnu.org robots.txt file to not let
>>>> searchbots search through lists of bugs. I plan to make a similar
>>>> change on the sourceware.org and cygwin.com sides. Please let me know
>>>> if this seems like a mistake.
>>>>
>>>> Does anybody have any experience with
>>>> http://code.google.com/p/bugzilla-sitemap/ ? That might be a slightly
>>>> better approach.
>>>
>>> Shouldn't we keep searchbots away from bugzilla completely? Searchbots
>>> can crawl the gcc-bugs mailing list archives.
>>
>> I don't understand this. Surely it is super-useful for Google etc. to
>> be able to search gcc's Bugzilla.
>
> gcc-bugs provides exactly the same information, and doesn't have to
> regenerate the full web page for each access to a bug report.

It's not quite the same information, surely. Wouldn't searchers be directed
to an email rather than the bug itself?

Andrew.
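
For anyone following along: the robots.txt change itself is not quoted in
this thread, so here is only a rough sketch of what such a rule might look
like and how to check its effect. The Disallow line, the /bugzilla/ path
prefix, the "ExampleBot" agent name and the is_allowed() helper are all
assumptions made for illustration, not the committed patch. It uses
Python's standard urllib.robotparser to show that a rule of this shape
blocks buglist.cgi while leaving individual bug pages crawlable.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule for illustration only; the actual committed patch is not
# shown in this thread, and the /bugzilla/ prefix is a guess.
ROBOTS_TXT = """\
User-agent: *
Disallow: /bugzilla/buglist.cgi
"""


def is_allowed(path, robots_txt=ROBOTS_TXT, agent="ExampleBot"):
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Hypothetical paths: one bug-list query and one single-bug page.
    for path in ("/bugzilla/buglist.cgi?product=gcc",
                 "/bugzilla/show_bug.cgi?id=1"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

A rule of this shape would keep crawlers off the expensive list queries
without hiding single bug reports from search engines, which seems to be
the distinction under discussion above.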