Commit Graph

4 Commits

Author SHA1 Message Date
Mickaël Serneels
0155c6f2ad ppf: check content-type (once) before trying to download/extract proxies
avoid trying to extract stuff from pdf and such (only accept text/*)

REQUIRES:
sqlite3 websites.sqlite "alter table uris add content_type text"

Don't test known uris:
sqlite3 websites.sqlite "update uris set content_type='text/manual' WHERE error=0"
2019-05-01 17:43:28 +02:00
rofl0r
bf7ec03fbf fetch.py: factor out twice used var 2019-05-01 17:43:28 +02:00
rofl0r
b99f83a991 fetch.py: improve readability of extract_urls 2019-01-18 19:32:37 +00:00
rofl0r
4a41796b19 factor out http related code from ppf.py 2019-01-18 19:30:42 +00:00