Commit Graph

  • 3f7ec14931 add options to floodbot Your Name 2021-04-26 01:53:38 +02:00
  • f9200a99f4 update config.ini.sample Your Name 2021-04-26 00:10:36 +02:00
  • b35b8cd5f6 add proxyflood.py Your Name 2021-04-26 00:09:12 +02:00
  • 1a4d51f08c ppf: play nice with cpu Your Name 2021-02-10 22:26:27 +01:00
  • 298987612c proxywatchd: do not test that much oldies Your Name 2021-02-10 22:25:41 +01:00
  • 60c78be3fb import new url as bulk list, misc cleansing Your Name 2021-02-06 23:25:12 +01:00
  • f321e5a934 fetch: more describing debug message Your Name 2021-02-06 23:23:47 +01:00
  • d658309ee4 dbs: ensure items are unique Your Name 2021-02-06 23:22:39 +01:00
  • f6749393ae more imports Your Name 2021-02-06 23:22:12 +01:00
  • 7e91ae5237 changes Your Name 2021-02-06 21:50:08 +01:00
  • 68394da9ab misc changes and fixes and Your Name 2021-02-06 15:36:14 +01:00
  • b29c734002 fix: url → self.url, make thread option configurable Your Name 2021-02-06 14:33:44 +01:00
  • abd9b5bb9f tabs to spaces Your Name 2021-02-06 14:30:18 +01:00
  • 5965312a9a make leeching multithreaded, misc changes Your Name 2021-02-06 14:30:07 +01:00
  • 615af656a2 fix: differentiate log message on added url/proxy Your Name 2021-02-06 12:21:16 +01:00
  • dd3d3c3518 fix: always check if is_bad_url Your Name 2021-02-06 12:20:34 +01:00
  • 01bded472f tabs to space Your Name 2021-02-06 12:14:22 +01:00
  • 9aa2c91f41 more random changes Your Name 2021-02-06 11:00:25 +01:00
  • e15b9d2994 more changes Your Name 2021-02-04 23:06:37 +01:00
  • 78b29a1187 some changes Your Name 2021-01-24 03:52:56 +01:00
  • fe2353acb2 update urignore Mickaël Serneels 2019-05-30 21:17:46 +02:00
  • d6b1880ade urignore: modify entry Mickaël Serneels 2019-05-17 23:00:18 +02:00
  • f179080cca use geoloc Mickaël Serneels 2019-05-17 22:59:32 +02:00
  • eeedf9d0a1 extract url only from same domains ? (default: False) Mickaël Serneels 2019-05-14 21:24:29 +02:00
  • b226bc0b03 check if bad url *after* building the url Mickaël Serneels 2019-05-14 19:31:19 +02:00
  • eeae849e12 space2tab Mickaël Serneels 2019-05-14 19:27:09 +02:00
  • bcaf7af0e7 extract_urls(): only when stale_count = 0 Mickaël Serneels 2019-05-13 23:49:35 +02:00
  • e2122a27d9 ppf: strip extraced uris Mickaël Serneels 2019-05-13 23:31:35 +02:00
  • 225b76462c import_from_file: don't add empty url Mickaël Serneels 2019-05-13 23:23:01 +02:00
  • 99330204bc add new ignores Mickaël Serneels 2019-05-13 23:16:27 +02:00
  • c241f1a766 make use of dbs.insert_urls() Mickaël Serneels 2019-05-01 23:19:50 +02:00
  • c8d594fb73 add url extraction Mickaël Serneels 2019-05-01 22:37:38 +02:00
  • 866f308322 proxywatchd: remove bogus blanket exception handler rofl0r 2019-05-01 20:05:54 +01:00
  • 01435671c1 add latest rocksock rofl0r 2019-05-01 20:04:30 +01:00
  • 0fb706eeae clean code Mickaël Serneels 2019-04-28 02:00:07 +02:00
  • 9a624819d3 check content type Mickaël Serneels 2019-04-28 01:04:32 +02:00
  • 0962019386 add own searx instance Mickaël Serneels 2019-04-23 00:24:10 +02:00
  • 70b6285394 scraper: more changes Mickaël Serneels 2019-04-23 00:22:32 +02:00
  • 482cf79676 scraper: make query configurable (Proxies, Websites, Search) Mickaël Serneels 2019-04-22 23:26:37 +02:00
  • 15fc29abc4 externalize searx instances into new file "searx.instances" Mickaël Serneels 2019-04-22 22:24:41 +02:00
  • c194d5cfc7 scraper: add debug option Mickaël Serneels 2019-04-22 22:08:58 +02:00
  • 0155c6f2ad ppf: check content-type (once) before trying to download/extract proxies Mickaël Serneels 2019-04-22 21:45:13 +02:00
  • e19c473514 update imports.txt Mickaël Serneels 2019-04-22 21:14:40 +02:00
  • 75318209ab oldies_multi: change default value from 100 to 10 Mickaël Serneels 2019-04-14 18:22:35 +02:00
  • d09244d04d proxywatchd: fix Exception error Mickaël Serneels 2019-04-14 18:12:41 +02:00
  • 7aea9a3e53 irc: minimize possible response code Mickaël Serneels 2019-04-13 15:31:50 +02:00
  • 7b9f8b2e00 create socks4_resolve() Mickaël Serneels 2019-04-13 15:56:49 +02:00
  • bad4d25bcf make watchd.tor_safeguard a configurable option (default: True) Mickaël Serneels 2019-04-09 21:22:52 +02:00
  • 59eea18bca update urignore Mickaël Serneels 2019-04-08 00:52:23 +02:00
  • 6427d4a645 remove that specific blogspot url Mickaël Serneels 2019-04-08 00:51:59 +02:00
  • 475f10560e search: more changes Mickaël Serneels 2019-04-07 22:51:34 +02:00
  • 8900153871 set default error value to 1 for new urls Mickaël Serneels 2019-04-07 22:46:50 +02:00
  • fdd486f73c remove '-intitle:pdf' from default search Mickaël Serneels 2019-04-07 18:39:58 +02:00
  • a2783bdfcf don't loop over every searx instances Mickaël Serneels 2019-04-07 18:39:01 +02:00
  • 67aec84320 fix Exception error Mickaël Serneels 2019-04-07 18:36:40 +02:00
  • 003a9074d2 make server file configurable Mickaël Serneels 2019-04-04 00:15:31 +02:00
  • c729bf666e searx: use sample instances Mickaël Serneels 2019-04-03 00:47:54 +02:00
  • 207574c815 import.txt: add chinese site rofl0r 2019-03-29 23:08:17 +00:00
  • bf7ec03fbf fetch.py: factor out twice used var rofl0r 2019-03-29 23:07:44 +00:00
  • 096ee21286 urignore: add some rules suppressing SEO spam rofl0r 2019-03-29 23:06:57 +00:00
  • 310b01140a irc: implement use_ssl = 2 (0: disabled, 1: enabled, 2: maybe; default is 0) mickael 2019-03-03 23:57:33 +00:00
  • 0eebe4daff populate import.txt mickael 2019-03-03 22:51:13 +00:00
  • 61c3ae6130 fix: define retrievals on import mickael 2019-03-03 22:50:27 +00:00
  • 0d1316052c add servers.txt.sample mickael 2019-03-03 17:27:04 +00:00
  • ceb840b00f remove noexistent server mickael 2019-03-03 09:44:55 +00:00
  • 1ad5ca53e5 take care of old proxies mickael 2019-03-03 09:42:49 +00:00
  • 2bacf77c8c split ppf into two programs, ppf/scraper rofl0r 2019-01-18 22:53:35 +00:00
  • 8400eab7ee insert_proxies: remove 500-at-a-time logic rofl0r 2019-01-18 21:50:44 +00:00
  • 8be5ab1567 ppf: move insert function into dbs.py rofl0r 2019-01-18 21:43:17 +00:00
  • aba74c8eab mysqlite.py: improve rofl0r 2019-01-18 20:42:15 +00:00
  • 5fd693a4a2 ppf: remove more unneeded stuff rofl0r 2019-01-18 19:55:54 +00:00
  • d926e66092 ppf: remove unneeded stuff rofl0r 2019-01-18 19:53:55 +00:00
  • b0f92fcdcd ppf.py: improve urignore code readability rofl0r 2019-01-18 19:52:15 +00:00
  • b99f83a991 fetch.py: improve readability of extract_urls rofl0r 2019-01-18 19:32:37 +00:00
  • 4a41796b19 factor out http related code from ppf.py rofl0r 2019-01-18 19:30:42 +00:00
  • 0dad0176f3 ppf: add new field proxies_added to be able to rate sites rofl0r 2019-01-18 15:44:03 +00:00
  • 0734635e30 watchd main thread: be less nervous rofl0r 2019-01-18 15:35:19 +00:00
  • ddee92d20f watchd: introduce configurable 'outage_threshold' rofl0r 2019-01-18 15:34:49 +00:00
  • aaac14d34e worker: add threading lock mickael 2019-01-13 16:41:48 +00:00
  • f489f0c4dd set retrievals to 0 for new uris mickael 2019-01-13 04:35:11 +00:00
  • 69d366f7eb ppf: add retrievals field so we know whether an url is new rofl0r 2019-01-12 16:07:56 +00:00
  • bc41bad9de dbs.py: remove unused column hash rofl0r 2019-01-12 01:42:35 +00:00
  • 54e2c2a702 ppf: simplify statement rofl0r 2019-01-11 23:11:21 +00:00
  • 2f7a730311 ppf: use slice for the 500 rows limitation rofl0r 2019-01-11 23:07:09 +00:00
  • d209356a85 comboparse: fix bug with bool cmd args always True rofl0r 2019-01-11 22:49:36 +00:00
  • 7c7fa8836a patch: 1y4C mickael 2019-01-11 20:23:37 +00:00
  • 24d2c08c9f ppf: make it possible to import a file containing proxies directly rofl0r 2019-01-11 05:45:13 +00:00
  • ecf587c8f7 ppf: set newly added sites to 0,0 (err/stale) rofl0r 2019-01-11 05:23:01 +00:00
  • 8b10df9c1b ppf.py: start using stale_count rofl0r 2019-01-11 05:08:32 +00:00
  • d2cb7441a8 ppf: add optional debug output rofl0r 2019-01-11 05:03:40 +00:00
  • b6dba08cf0 ppf: only extract ips with port >= 10 rofl0r 2019-01-11 03:29:13 +00:00
  • 122847d888 ppf: fix bug referencing removed db field rofl0r 2019-01-11 02:53:16 +00:00
  • 7d59404d31 watchd: add totals statistics rofl0r 2019-01-11 00:43:16 +00:00
  • 4c6a83373f split databases mickael 2019-01-10 22:20:09 +00:00
  • b85cb863ba remove more dead servers mickael 2019-01-10 20:43:36 +00:00
  • 5e774b4e2a config.py: put section name in var rofl0r 2019-01-10 19:53:53 +00:00
  • ef9158015f proxywatchd: make checktime constants configurable rofl0r 2019-01-10 19:47:11 +00:00
  • 087559637e ppf: improve cleanhtml() and cache compiled re's rofl0r 2019-01-09 22:48:03 +00:00
  • befb346941 proxywatchd: preliminary support for ip caching rofl0r 2019-01-09 22:34:10 +00:00
  • 7067a8199f rocksock: bump to latest rofl0r 2019-01-09 22:32:32 +00:00