Commit Graph

224 Commits

Author SHA1 Message Date
Your Name
89e73a818a add more flood stuff 2021-04-26 03:29:43 +02:00
Your Name
5984f23616 add usernames 2021-04-26 02:07:22 +02:00
Your Name
2962613e80 update server list (remove unresolving server) 2021-04-26 01:54:34 +02:00
Your Name
3f7ec14931 add options to floodbot 2021-04-26 01:53:38 +02:00
Your Name
f9200a99f4 update config.ini.sample 2021-04-26 00:10:36 +02:00
Your Name
b35b8cd5f6 add proxyflood.py 2021-04-26 00:09:12 +02:00
Your Name
1a4d51f08c ppf: play nice with cpu 2021-02-10 22:26:27 +01:00
Your Name
298987612c proxywatchd: do not test that many oldies 2021-02-10 22:25:41 +01:00
Your Name
60c78be3fb import new url as bulk list, misc cleansing 2021-02-06 23:25:12 +01:00
Your Name
f321e5a934 fetch: more descriptive debug message 2021-02-06 23:23:47 +01:00
Your Name
d658309ee4 dbs: ensure items are unique 2021-02-06 23:22:39 +01:00
Your Name
f6749393ae more imports 2021-02-06 23:22:12 +01:00
Your Name
7e91ae5237 changes 2021-02-06 21:50:08 +01:00
Your Name
68394da9ab misc changes and fixes and 2021-02-06 15:36:14 +01:00
Your Name
b29c734002 fix: url → self.url, make thread option configurable 2021-02-06 14:33:44 +01:00
Your Name
abd9b5bb9f tabs to spaces 2021-02-06 14:30:18 +01:00
Your Name
5965312a9a make leeching multithreaded, misc changes 2021-02-06 14:30:07 +01:00
Your Name
615af656a2 fix: differentiate log message on added url/proxy 2021-02-06 12:21:16 +01:00
Your Name
dd3d3c3518 fix: always check if is_bad_url 2021-02-06 12:20:34 +01:00
Your Name
01bded472f tabs to space 2021-02-06 12:14:22 +01:00
Your Name
9aa2c91f41 more random changes 2021-02-06 11:00:25 +01:00
Your Name
e15b9d2994 more changes 2021-02-04 23:06:37 +01:00
Your Name
78b29a1187 some changes 2021-01-24 03:52:56 +01:00
Mickaël Serneels
fe2353acb2 update urignore 2019-05-30 21:17:46 +02:00
Mickaël Serneels
d6b1880ade urignore: modify entry 2019-05-17 23:00:18 +02:00
Mickaël Serneels
f179080cca use geoloc
now saves proxy's country in db
2019-05-17 22:59:32 +02:00
Mickaël Serneels
eeedf9d0a1 extract urls only from the same domain? (default: False)
setting this option makes ppf skip external links when extracting uris
2019-05-14 21:24:29 +02:00
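
The same-domain behaviour described in eeedf9d0a1 could look roughly like the sketch below. It is not taken from ppf: the helper name `filter_same_domain` and its parameters are hypothetical, and it assumes "same domain" means an exact hostname match.

```python
from urllib.parse import urljoin, urlparse


def filter_same_domain(page_url, links, same_domain_only=False):
    """Resolve links against page_url and optionally drop external hosts.

    Hypothetical helper for illustration; not the ppf implementation.
    """
    base_host = urlparse(page_url).hostname
    kept = []
    for link in links:
        # Relative links are made absolute before any comparison.
        absolute = urljoin(page_url, link.strip())
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar
        if same_domain_only and parsed.hostname != base_host:
            continue  # external link, ignored when the option is set
        kept.append(absolute)
    return kept
```

With `same_domain_only=False` (matching the default named in the commit) external links are kept; with `True` only links on the page's own host survive.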
Mickaël Serneels
b226bc0b03 check if bad url *after* building the url 2019-05-14 19:31:19 +02:00
Mickaël Serneels
eeae849e12 space2tab 2019-05-14 19:29:30 +02:00
Mickaël Serneels
bcaf7af0e7 extract_urls(): only when stale_count = 0 2019-05-13 23:49:35 +02:00
Mickaël Serneels
e2122a27d9 ppf: strip extracted uris 2019-05-13 23:48:55 +02:00
Mickaël Serneels
225b76462c import_from_file: don't add empty url 2019-05-13 23:48:55 +02:00
Mickaël Serneels
99330204bc add new ignores 2019-05-13 23:48:55 +02:00
Mickaël Serneels
c241f1a766 make use of dbs.insert_urls() 2019-05-01 23:19:50 +02:00
Mickaël Serneels
c8d594fb73 add url extraction
urls get extracted from a webpage when the page contains proxies

this allows ppf to "learn" as many links as possible from a working website
2019-05-01 22:58:23 +02:00
rofl0r
866f308322 proxywatchd: remove bogus blanket exception handler
this would catch *any* exception, including typos
2019-05-01 20:05:57 +01:00
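
The point made in 866f308322, that a blanket handler also catches typos, can be shown with a small sketch. All function names here are hypothetical and the misspelled variable is deliberate. None of this is proxywatchd code.

```python
def parse_port(value):
    """Convert a port string to int; the misspelling below is deliberate."""
    return int(vlaue)  # raises NameError: 'vlaue' is not defined


def with_blanket_handler(value):
    try:
        return parse_port(value)
    except Exception:
        # Also swallows the NameError, so the typo looks like bad input.
        return None


def with_narrow_handler(value):
    try:
        return parse_port(value)
    except ValueError:
        # Only malformed input is handled; the NameError propagates.
        return None
```

Calling `with_blanket_handler("8080")` quietly returns `None`, while `with_narrow_handler("8080")` raises `NameError` and exposes the bug.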
rofl0r
01435671c1 add latest rocksock 2019-05-01 20:04:30 +01:00
Mickaël Serneels
0fb706eeae clean code 2019-05-01 17:43:29 +02:00
Mickaël Serneels
9a624819d3 check content type 2019-05-01 17:43:29 +02:00
Mickaël Serneels
0962019386 add own searx instance 2019-05-01 17:43:29 +02:00
Mickaël Serneels
70b6285394 scraper: more changes 2019-05-01 17:43:29 +02:00
Mickaël Serneels
482cf79676 scraper: make query configurable (Proxies, Websites, Search)
--scraper.query = 'pws'
2019-05-01 17:43:28 +02:00
Mickaël Serneels
15fc29abc4 externalize searx instances into new file "searx.instances" 2019-05-01 17:43:28 +02:00
Mickaël Serneels
c194d5cfc7 scraper: add debug option 2019-05-01 17:43:28 +02:00
Mickaël Serneels
0155c6f2ad ppf: check content-type (once) before trying to download/extract proxies
avoid trying to extract stuff from pdf and such (only accept text/*)

REQUIRES:
sqlite3 websites.sqlite "alter table uris add content_type text"

Don't test known uris:
sqlite3 websites.sqlite "update uris set content_type='text/manual' WHERE error=0"
2019-05-01 17:43:28 +02:00
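
The two manual sqlite3 steps listed in 0155c6f2ad could also be scripted so that they are safe to run more than once. The sketch below assumes only what the commit message states (a `uris` table with `error` and `content_type` columns). The function names are hypothetical, and the extra `content_type IS NULL` condition is an addition that is not in the original command.

```python
import sqlite3


def ensure_content_type_column(conn):
    """Add uris.content_type if missing, then mark known-good rows."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    columns = [row[1] for row in conn.execute("PRAGMA table_info(uris)")]
    if "content_type" not in columns:
        conn.execute("ALTER TABLE uris ADD COLUMN content_type TEXT")
    # Assumption: only fill rows that have no content type recorded yet.
    conn.execute(
        "UPDATE uris SET content_type = 'text/manual' "
        "WHERE error = 0 AND content_type IS NULL"
    )
    conn.commit()


def is_text(content_type):
    """Accept only text/* media types, ignoring parameters like charset."""
    if not content_type:
        return False
    return content_type.split(";", 1)[0].strip().lower().startswith("text/")
```

A caller would open the database with `sqlite3.connect("websites.sqlite")` and pass the connection in. `is_text` mirrors the "only accept text/*" rule from the commit message.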
Mickaël Serneels
e19c473514 update imports.txt 2019-05-01 17:43:28 +02:00
Mickaël Serneels
75318209ab oldies_multi: change default value from 100 to 10 2019-05-01 17:43:28 +02:00
Mickaël Serneels
d09244d04d proxywatchd: fix Exception error
Exception in thread Thread-9:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "proxywatchd.py", line 200, in workloop
    job.run()
  File "proxywatchd.py", line 123, in run
    sock, proto, duration, tor, srv, failinc = self.connect_socket()
ValueError: need more than 5 values to unpack
2019-05-01 17:43:28 +02:00
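
The traceback in d09244d04d suggests that one return path of connect_socket() produced fewer than the six values the caller unpacks. The actual fix is not shown in this log. One possible way to avoid this class of error is to return a fixed-shape record on every path, sketched below with a hypothetical stub. The field names come from the traceback, and all values are placeholders.

```python
from collections import namedtuple

# Field names are taken from the unpacking line in the traceback.
ConnectResult = namedtuple(
    "ConnectResult", ["sock", "proto", "duration", "tor", "srv", "failinc"]
)


def connect_socket_stub(succeed):
    """Hypothetical stand-in for connect_socket(); performs no network I/O."""
    if succeed:
        # Placeholder values only; a real call would return a live socket.
        return ConnectResult("sock", "socks5", 0.25, "tor-host", "srv", 0)
    # The failure path returns the same six fields, not a shorter tuple.
    return ConnectResult(None, None, None, None, None, 1)
```

Because every path returns six fields, `sock, proto, duration, tor, srv, failinc = connect_socket_stub(False)` unpacks without raising.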
Mickaël Serneels
7aea9a3e53 irc: minimize possible response code 2019-05-01 17:43:28 +02:00
Mickaël Serneels
7b9f8b2e00 create socks4_resolve()
moves socks4 resolution out of socket_connect block
2019-05-01 17:43:28 +02:00