robots.txt: disallow access to snapshots
My dmesg is filled with the oom killer bringing down processes while the Bingbot downloads every snapshot for every commit of the Linux kernel in tar.xz format. Sure, I should be running with memory limits, and now I'm using cgroups, but a more general solution is to prevent crawlers from wasting resources like that in the first place.

Suggested-by: Natanael Copa <[email protected]>
Suggested-by: Julius Plenz <[email protected]>
Signed-off-by: Jason A. Donenfeld <[email protected]>
Jason A. Donenfeld 2013-08-13
parent 830eb6f · commit 23debef
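For context on the cgroup workaround mentioned in the message above, a minimal sketch of a cgroup v1 memory cap might look like the following; the group name "cgit", the 256M limit, and the CGIT_PID variable are illustrative assumptions, not values taken from this commit:

	# create a memory cgroup and cap it (cgroup v1 interface; values are examples)
	mkdir /sys/fs/cgroup/memory/cgit
	echo 256M > /sys/fs/cgroup/memory/cgit/memory.limit_in_bytes
	# move the worker process serving cgit into the group (PID is hypothetical)
	echo "$CGIT_PID" > /sys/fs/cgroup/memory/cgit/cgroup.procs

A cap like this confines any out-of-memory killing to the cgit group rather than the whole machine, but crawlers can still burn CPU and bandwidth generating snapshots, which is why the robots.txt rule below is the more general fix.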
-rw-r--r--  Makefile    | 1 +
-rw-r--r--  robots.txt  | 3 +++
2 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/Makefile b/Makefile
index 00b32690..f11b60f7 100644
--- a/Makefile
+++ b/Makefile
@@ -78,6 +78,7 @@ install: all
 	$(INSTALL) -m 0644 cgit.css $(DESTDIR)$(CGIT_DATA_PATH)/cgit.css
 	$(INSTALL) -m 0644 cgit.png $(DESTDIR)$(CGIT_DATA_PATH)/cgit.png
 	$(INSTALL) -m 0644 favicon.ico $(DESTDIR)$(CGIT_DATA_PATH)/favicon.ico
+	$(INSTALL) -m 0644 robots.txt $(DESTDIR)$(CGIT_DATA_PATH)/robots.txt
 	$(INSTALL) -m 0755 -d $(DESTDIR)$(filterdir)
 	$(COPYTREE) filters/* $(DESTDIR)$(filterdir)
diff --git a/robots.txt b/robots.txt
new file mode 100644
index 00000000..4ce948fe
--- /dev/null
+++ b/robots.txt
@@ -0,0 +1,3 @@
+User-agent: *
+Disallow: /*/snapshot/*
+Allow: /
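Note that the Disallow pattern above relies on wildcard support, a de facto extension honored by the major crawlers (including Bingbot and Googlebot) rather than part of the original robots.txt specification. As an illustrative example, a snapshot URL such as /cgit/snapshot/cgit-0.9.2.tar.xz matches /*/snapshot/* and is skipped, while a commit page such as /cgit/commit/?id=23debef still falls under the final Allow rule.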