150 – bugzilla robots.txt blocking web crawlers such as archive.org

Summary: bugzilla robots.txt blocking web crawlers such as archive.org

Status:	CONFIRMED

Alias:	None

Product:	Libre-SOC Website
Classification:	Unclassified
Component:	website
Version:	unspecified
Hardware:	All All

Importance:	--- normal
Assignee:	Luke Kenneth Casson Leighton

URL:

Depends on:
Blocks:

Reported:	2019-12-22 01:24 GMT by Jacob Lifshay
Modified:	2019-12-22 02:05 GMT
CC List:	1 user

See Also:
NLnet milestone:	---
total budget (EUR) for completion of task and all subtasks:	0
budget (EUR) for this task, excluding subtasks' budget:	0
parent task for budget allocation:
child tasks for budget allocation:
The table of payments (in EUR) for this task; TOML format:

Description Jacob Lifshay 2019-12-22 01:24:13 GMT

We should switch robots.txt to allow more things. A good template to use would be Mozilla's Bugzilla robots.txt:
https://bugzilla.mozilla.org/robots.txt

Comment 1 Luke Kenneth Casson Leighton 2019-12-22 02:00:24 GMT

yep, this apparently is quite common: web-crawling of bugzilla can be pretty heavy, so mozilla set up a default that banned pretty much everything. i'm not so bothered, so i have set it to "Allow /"
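
For reference, a minimal sketch of the difference between a block-everything robots.txt and the "allow everything" setting described above, checked with Python's standard urllib.robotparser. The two rule sets, the crawler name archive.org_bot and the bugs.example.org URL are illustrative assumptions, not the site's actual configuration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rule sets: one bans every crawler, one allows everything.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
ALLOW_ALL = "User-agent: *\nAllow: /\n"

# Hypothetical crawler name and bug URL, used only for illustration.
CRAWLER = "archive.org_bot"
BUG_URL = "https://bugs.example.org/show_bug.cgi?id=150"

if __name__ == "__main__":
    for name, rules in (("block all", BLOCK_ALL), ("allow all", ALLOW_ALL)):
        verdict = "allowed" if is_allowed(rules, CRAWLER, BUG_URL) else "blocked"
        print(f"{name}: {CRAWLER} is {verdict}")
```

Note that the directive needs the colon ("Allow: /") to be recognised by parsers; a well-behaved crawler only honours these rules voluntarily, so this checks policy, not enforcement.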

Comment 2 Luke Kenneth Casson Leighton 2019-12-22 02:01:42 GMT

(In reply to Jacob Lifshay from comment #0)
> a good template to use
> would be Mozilla's bugzilla robots.txt:
> https://bugzilla.mozilla.org/robots.txt

just copied it entirely, just... because :)

Comment 3 Jacob Lifshay 2019-12-22 02:05:52 GMT

(In reply to Luke Kenneth Casson Leighton from comment #2)
> (In reply to Jacob Lifshay from comment #0)
> > a good template to use
> > would be Mozilla's bugzilla robots.txt:
> > https://bugzilla.mozilla.org/robots.txt
> 
> just copied it entirely, just... because :)

Thanks, sounds good to me!

If you have more time, it would be nice to also fix bug #149.