SquidBlocker - A URL filtering DB for hospitals

Key Features

  • Built-in ICAP service
  • Fast online updates of about 16k records per second
  • Simple HTTP API with query-result caching
  • HTTP/2 support (TLS only)
  • Basic web UI for management and remote updates
  • Full-URL matching algorithm
  • Can use SquidGuard domain and URL blacklists
  • Supports raw IPv6 addresses in URLs and CONNECT requests
  • An HTTP hub reverse proxy that helps update multiple DB servers (HTTP v1)
  • Squid-Cache external_acl helper (HTTP v1)
  • Static binaries for: Linux, Windows, NetBSD, OpenBSD, FreeBSD, Solaris, OS X (Darwin)
  • Free

Who was it built for?

SquidBlocker was built for critical systems that require very low downtime, such as hospitals and other healthcare facilities: places where human lives are at stake and shutting down the service might lead to unpleasant situations. SquidBlocker is there for the sysadmins who need the service to stay up for very long periods of time without having to restart or reload the DB server.

SquidBlocker DB library

SquidBlocker uses LevelDB as its backend DB library and is therefore very fast. You can also work on the LevelDB files directly using external tools, as sketched below.
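For example, here is a minimal sketch that reads the DB files directly with the goleveldb library. The library choice, the DB path, and the "dom:" key prefix are assumptions taken from the examples later in this document, not a documented interface; note also that LevelDB allows only one process at a time, so stop the server before opening its files:

    // A minimal sketch, assuming a goleveldb-compatible DB layout.
    package main

    import (
        "fmt"
        "log"

        "github.com/syndtr/goleveldb/leveldb"
        "github.com/syndtr/goleveldb/leveldb/util"
    )

    func main() {
        // Assumed DB path, taken from the startup example below.
        db, err := leveldb.OpenFile("/var/filter.db", nil)
        if err != nil {
            log.Fatal(err)
        }
        defer db.Close()

        // Dump every domain entry ("dom:" prefixed keys).
        iter := db.NewIterator(util.BytesPrefix([]byte("dom:")), nil)
        for iter.Next() {
            fmt.Printf("%s = %s\n", iter.Key(), iter.Value())
        }
        iter.Release()
    }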

Why?

While SquidGuard has done a great job for years, it is missing a couple of features:

  • SquidGuard doesn't support online database updates.
  • SquidGuard runs under the url_rewrite interface instead of an external_acl helper or ICAP.
  • SquidGuard doesn't implement any level of concurrency.
Because of this, SquidGuard requires a restart/reload of Squid for any DB/blacklist update, which adds complexity or downtime risk to every operation.
The lack of concurrency also means that, on a busy system, a huge number of workers/processes must run in parallel to keep requests from slowing down, at the cost of memory and CPU consumption.
A simple Go external_acl helper can handle about 2k concurrent requests, and in a production SMB office you can use 5 helpers to handle traffic that, without concurrency, would require more than 40 SquidGuard helpers. A minimal sketch of such a helper follows.
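As an illustration of that concurrency model (not SquidBlocker's actual helper code), here is a minimal Go sketch: with concurrency=N, Squid prefixes each request line with a channel ID and accepts out-of-order replies, so every lookup can run in its own goroutine. The allowed() function is a placeholder for a real DB query.

    package main

    import (
        "bufio"
        "fmt"
        "os"
        "strings"
        "sync"
    )

    // allowed is a placeholder for a real DB lookup.
    func allowed(url string) bool { return true }

    func main() {
        in := bufio.NewScanner(os.Stdin)
        var mu sync.Mutex // serializes replies back to Squid
        for in.Scan() {
            // With concurrency=N the line format is: <channel-ID> <URI> <METHOD> <user>
            fields := strings.Fields(in.Text())
            if len(fields) < 2 {
                continue
            }
            id, url := fields[0], fields[1]
            go func(id, url string) {
                verdict := "ERR" // ERR = ACL does not match
                if allowed(url) {
                    verdict = "OK" // OK = ACL matches
                }
                mu.Lock()
                fmt.Printf("%s %s\n", id, verdict)
                mu.Unlock()
            }(id, url)
        }
    }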

The lookup algorithm

Regular URLs:

To allow more flexibility in stricter environments, the lookup first tests the full URL for a match, then recursively strips and tests the URL; on reaching the domain, the domain is tested against the domains blacklist.
So the lookup path for "http://www.example.com:8080/test1/1.jpg?arg=1" would look like this:

  • http://www.example.com:8080/test1/1.jpg?arg=1
  • http://www.example.com:8080/test1/1.jpg
  • http://www.example.com:8080/test1
  • http://www.example.com:8080
  • www.example.com
  • example.com
  • com
  • * (default action for all domains and urls)

This is considered a "slow" lookup, but it is the most resilient algorithm.
For IP-based hosts there is only one lookup, in the domains blacklist.
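For illustration, here is a short Go sketch that generates this lookup sequence from a URL; it mirrors the list above and is not the server's actual code:

    package main

    import (
        "fmt"
        "net/url"
        "strings"
    )

    // lookupKeys returns the keys in the order they are tested.
    func lookupKeys(raw string) []string {
        u, err := url.Parse(raw)
        if err != nil {
            return nil
        }
        keys := []string{raw} // full URL, query string included
        if u.RawQuery != "" { // full URL without the query string
            keys = append(keys, u.Scheme+"://"+u.Host+u.Path)
        }
        // Strip path segments one at a time.
        path := u.Path
        for {
            i := strings.LastIndex(path, "/")
            if i <= 0 {
                break
            }
            path = path[:i]
            keys = append(keys, u.Scheme+"://"+u.Host+path)
        }
        keys = append(keys, u.Scheme+"://"+u.Host)
        // Walk up the domain labels (www.example.com -> example.com -> com).
        host := u.Hostname()
        for {
            keys = append(keys, host)
            i := strings.Index(host, ".")
            if i < 0 {
                break
            }
            host = host[i+1:]
        }
        return append(keys, "*") // global default action
    }

    func main() {
        for _, k := range lookupKeys("http://www.example.com:8080/test1/1.jpg?arg=1") {
            fmt.Println(k)
        }
    }

Running it prints exactly the list above, ending with "*".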

CONNECT requests (tcp/CONNECT IP/DOMAIN + port)

The CONNECT method for tunneling connections uses a destination made of an IP or domain plus a port, which is handled by another side of the algorithm. SquidGuard takes only the domain/IP level of the request into account and therefore doesn't fit many environments where the proxy needs to be more accurate: such environments need to either block everything and allow from a list, or allow everything and block very specific TCP services. The lookup path for a CONNECT request is (a sketch follows the list):

  • tcp://ip:port or tcp://domain.example.com:port
  • domain.example.com (skipped for an IP)
  • example.com (skipped for an IP)
  • com (skipped for an IP)
  • tcp://*:port (default action for all CONNECT requests with a specific port)
  • tcp://* (default action for all CONNECT requests)
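The same idea in a short Go sketch (an illustration only):

    package main

    import (
        "fmt"
        "net"
        "strings"
    )

    // connectKeys returns the CONNECT lookup keys in the order they are tested.
    func connectKeys(hostport string) []string {
        host, port, err := net.SplitHostPort(hostport)
        if err != nil {
            return nil
        }
        keys := []string{"tcp://" + hostport}
        // The domain-label walk is skipped for raw IP destinations.
        if net.ParseIP(host) == nil {
            for {
                keys = append(keys, host)
                i := strings.Index(host, ".")
                if i < 0 {
                    break
                }
                host = host[i+1:]
            }
        }
        // Default actions: per-port first, then global.
        return append(keys, "tcp://*:"+port, "tcp://*")
    }

    func main() {
        fmt.Println(connectKeys("domain.example.com:443"))
    }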

IPv6 addresses

Any raw IPv6 address is represented in the DB with square brackets, as in the examples below (a small sketch after the list shows how to produce this form):

  • http://[2a02:ed0:3000:c664::110]/v6/badge.png
  • [2a02:ed0:3000:c664::110]:443
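In Go, net.JoinHostPort produces this bracketed form automatically for raw IPv6 literals, which is a convenient way to build keys that match the DB representation:

    package main

    import (
        "fmt"
        "net"
    )

    func main() {
        // Prints "[2a02:ed0:3000:c664::110]:443"
        fmt.Println(net.JoinHostPort("2a02:ed0:3000:c664::110", "443"))
    }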

Download and installation

Downloading the tar.xz and installing manually

Instructions will be updated later

Installing on CentOS using yum or the RPM

Will be updated later

Basic Operations

Starting up the service

Most SquidBlocker settings are controlled through command-line arguments. You can see the full list of options with the "-h" argument, but for now I will give an example of starting the service:

/installation_path/squidblocker-server -db_path=/var/filter.db \
-ui_path=/var/squidblocker/www \
-http_port=:8080 \
-icap_port=:1344 \
-lists_jail=/var/squidblocker/blacklists \
-htpasswd_file=/etc/sbserver/htpasswd
  • The service runs as a foreground daemon and can be managed with a systemd service file, for example:
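    A minimal unit file sketch, reusing the arguments from the startup example above (adjust the paths to your installation):

    [Unit]
    Description=SquidBlocker filtering DB server
    After=network.target

    [Service]
    ExecStart=/installation_path/squidblocker-server -db_path=/var/filter.db \
      -ui_path=/var/squidblocker/www -http_port=:8080 -icap_port=:1344 \
      -lists_jail=/var/squidblocker/blacklists -htpasswd_file=/etc/sbserver/htpasswd
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target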
  • Configuring Squid to use the ICAP service

    Squid can use the DB's ICAP service to block or allow URLs/pages with the following addition to squid.conf:

    icap_enable on
    adaptation_send_client_ip on
    adaptation_send_username on
    icap_service service_req reqmod_precache icap://127.0.0.1:1344/filter bypass=0 on-overload=wait
    adaptation_access service_req deny manager
    adaptation_access service_req deny CONNECT
    adaptation_access service_req allow all

  • Change the IP:port of the ICAP service to reflect your environment.
  • The "/filter" ICAP path is the only path that performs URL filtering.
  • The ICAP service answers all incoming requests; if you need to restrict access to it, use firewall rules.
  • Configuring Squid to use the external_acl client

    Squid can use the external_acl client to block or allow URLs/pages with the following addition to squid.conf:

    external_acl_type filter_url ipv4 concurrency=50 ttl=3 %URI %METHOD %un /usr/bin/sblocker_client -db_pass=hello -db_url=http://squidblocker-server:8080/opendns/01/
    acl filter_url_acl external filter_url
    deny_info http://ngtech.co.il/block_page/?url=%u&domain=%H filter_url_acl
    http_access deny !filter_url_acl
    http_access allow localnet filter_url_acl

  • Change the squidblocker-server domain of the HTTP service to reflect your environment.
  • The "/opendns/01/" service is a public domains blacklist; you can use multiple helper instances against the different options of the service.
  • Other blacklist service options are "/sbv2/01/" and "/symantech/01/". Use the sbv2 URL to work against the local black/white list.
  • The HTTP API service is protected by a simple username and password.
  • The DB web UI

  • To access the web UI, use the URL "http://(ip or hostname):port/ui/". You can either upload and update from a local file, or trigger an update from a file on the DB server (restricted to the lists jail path).
  • The static UI pages don't require a username and password, but all DB API access does.
  • Lists jail

    The DB now allows triggering an update remotely from the UI, based on lists that reside on the server's disk. The update is restricted to a specific path/jail that is defined at DB startup.

    DB SquidGuard blacklists update script

    An update script exists in the GitHub repository.
    The script was designed to update hourly if necessary. It can be modified to use a custom remote or local tar.xz/bz2/gz.
    A GUI can use this script to force an online re-update.
    Just add the list names to either the black or white list array and run it.

    Updating the DB using SquidGuard blacklists

    Updating the DB is done using the HTTP interface and a POST request, and can also be done via a web browser using the UI. I will list a couple of CLI examples.
    For a domains blacklist update, use the following command:

    curl -i -X POST -H "Content-Type: multipart/form-data" \
    -F "prefix=dom:" \
    -F "val=1" \
    -F "listfile=@squidguard_domains_file_path" \
    "http://admin_useranme:admin_password@ip\domain:port/db/set_batch/"

    For a URLs blacklist update, use the following command:

    curl -i -X POST -H "Content-Type: multipart/form-data" \
    -F "prefix=url:http://" \
    -F "val=1" \
    -F "listfile=@squidguard_urls_file_path" \
    "http://admin_username:admin_password@ip\domain:port/db/set_batch/"

  • The "val" argument can be set for "1" to blacklist and block or "0" to whitelist and permit the file content patterns\domains.
  • Since the SquidGuard format ignores the "www." prefix of URLs and SquidBlocker uses a full-match algorithm, two or more update runs are required to match SquidGuard's functionality:
    one with the prefix value "url:http://" and a second with the prefix value "url:http://www.", which will match only "http" and not "https". To block "https" full-match URLs, add/update using the "url:https://" or "url:https://www." prefixes.
  • To set a specific key, such as the "*" domain that holds the server's default action to allow or deny URLs, use the following command:

    curl -i -X POST -H "Content-Type: multipart/form-data" \
    -F "key=dom:*" \
    -F "val=1" \
    "http://admin_username:admin_password@ip\domain:port/db/set/"

    A similar command sets the default policy for CONNECT requests:

    curl -i -X POST -H "Content-Type: multipart/form-data" \
    -F "key=url:tcp://*" \
    -F "val=1" \
    "http://admin_username:admin_password@ip\domain:port/db/set/"

  • The "val" argument can be set for "1" to blacklist and block or "0" to whitelist and permit the key, any other value will set it to be ignored.