0. Index - Version 0.2.6
1. Introduction
2. Installing
3. Configuration

-- Globals
-- Toggle Options
-- Tweak Options
-- Acls
-- Subnets

4. Soulcatcher Arguments

1. Introduction

Soulcatcher is a squid redirector written with primary and secondary schools in mind. It filters unwanted sites from the eyes of pupils and logs their attempts. When a user asks squid for a website, squid tells Soulcatcher the URL wanted, the user requesting it, the machine the request came from, and the method used, e.g. POST (sending data) or GET (receiving).
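
The hand-off from squid can be sketched in a few lines of Python. This is only an illustration, assuming the classic squid redirector input of one request per line in the form "URL ip/fqdn ident method"; the names Request and parse_request are made up for this example and are not taken from the Soulcatcher source.

```python
from typing import NamedTuple


class Request(NamedTuple):
    """The fields squid hands to a redirector (hypothetical container)."""
    url: str
    ip: str
    fqdn: str
    ident: str
    method: str


def parse_request(line: str) -> Request:
    """Split one redirector input line into its fields.

    Assumes the line looks like: URL ip/fqdn ident method
    """
    fields = line.split()
    if len(fields) < 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    url, source, ident, method = fields[:4]
    # The source field is "ip/fqdn"; squid sends "-" when a value is unknown.
    ip, _, fqdn = source.partition("/")
    return Request(url, ip, fqdn or "-", ident, method)
```

A line that does not split into four fields raises ValueError here, which roughly corresponds to the "URL cannot be understood" case handled by error_redirect.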

From this information Soulcatcher gives you several ways to block sites:

Keywords in the URL or just the DOMAIN
Extension match blocking (*.jpg, *.exe, *.mp3)
Domain match blocking
URL match blocking
Username match blocking
Method match blocking
Probability blocking

All of these can be set per IP range. Soulcatcher can also switch off whole classrooms (IP ranges).

Soulcatcher is designed to be as easy to use as possible, but it is still in the beta stages, so it's not there yet :)

If you have any comments or bug reports about the software, please e-mail: hubnerd@btinternet.com

2. Installing

First of all, make sure you have Sleepycat's Berkeley DB installed. You can find the source at www.sleepycat.org or get the RPM from your distribution's download site.

Then you should be able to simply run ./configure, make, make install

3. Configuration

-- GLOBAL OPTIONS:

The following options MUST be present in the config file.

a) redirect_page =

When a page is not allowed, Soulcatcher displays the content of this web page instead, e.g. http://www.somesite.com/foo.html

From version 0.2.6 you can add substitutions to the URL. The substitutions are:

%i = Source IP Address.
%m = Method. eg GET, POST, CONNECT.
%f = Source fully qualified domain name.
%d = Ident/Username.

Example: http://www.somesite.com/foo.html?%i&%d
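
As a rough illustration of how the substitutions behave, the Python sketch below expands the four codes in a redirect_page value. The function name expand_redirect is hypothetical, and the sketch assumes values are inserted as-is with no URL-encoding; what Soulcatcher itself does with special characters is not documented here.

```python
def expand_redirect(template: str, ip: str, method: str, fqdn: str, ident: str) -> str:
    """Replace %i, %m, %f and %d in a redirect_page template.

    Assumption: values are inserted verbatim and any other "%x"
    sequence is left untouched.
    """
    values = {"i": ip, "m": method, "f": fqdn, "d": ident}
    out = []
    pos = 0
    while pos < len(template):
        ch = template[pos]
        if ch == "%" and pos + 1 < len(template) and template[pos + 1] in values:
            out.append(values[template[pos + 1]])
            pos += 2  # skip the "%" and the code letter
        else:
            out.append(ch)
            pos += 1
    return "".join(out)
```

With the example template above, a GET from user fred on 192.168.0.5 would expand to http://www.somesite.com/foo.html?192.168.0.5&fred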

b) error_redirect =

When a URL cannot be understood by Soulcatcher, or there is some other error, it displays this page and logs the error, e.g. http://www.somesite.com/error.html

c) log_file =

The location of the log file, e.g. /var/log/soul.log

-- TOGGLE OPTIONS:

The following options do not have to be present in the config file. The allowed values are YES, NO, TRUE and FALSE (unless stated), and they are not case sensitive.

a) log_urls_switch =

Set this if you wish to log all URLs allowed through the filter.

b) log_denied_switch =

Set this if you wish to log all denied URLs.
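
A minimal sketch of reading such a toggle value in Python is shown below. The function name parse_toggle is made up for the example; only the four accepted words and the case-insensitivity come from the text above.

```python
def parse_toggle(value: str) -> bool:
    """Turn YES/NO/TRUE/FALSE (in any letter case) into a boolean."""
    text = value.strip().upper()
    if text in ("YES", "TRUE"):
        return True
    if text in ("NO", "FALSE"):
        return False
    raise ValueError(f"not a valid toggle value: {value!r}")
```

For example, parse_toggle("yes") and parse_toggle("True") both give True, while any other word is rejected.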

-- TWEAK OPTIONS (Optional):

The following options do not have to be present in the config file. The allowed values are integers (unless stated) and are not case sensitive.

a) policy = ( known to not work yet )

Allow/Deny Policy - Takes allow or deny as argument, default is allow.

b) url_size =

Sets the length in characters allowed for the url, default is 4000.

c) fqdn_size =

Sets the length in characters allowed for the Fully Qualified Domain Name, default is 100.

d) ident_size =

Sets the length in characters allowed for the ident(username), default is 100.

e) method_size =

Sets the length allowed in characters for the method(GET,POST,CONNECT), default is 20.

f) domain_size =

Sets the length allowed in characters for the domain, default is the same as url_size.
WARNING: It is dangerous to set this lower than url_size - change only if you know what you are doing.
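
The defaults above can be summarised in a short Python sketch. The names effective_limits and oversized_fields are hypothetical, and the sketch assumes a field is acceptable when its length does not exceed the limit; how Soulcatcher reacts to an oversized field is not described here.

```python
# Documented defaults; domain_size is filled in from url_size below.
DEFAULT_LIMITS = {
    "url_size": 4000,
    "fqdn_size": 100,
    "ident_size": 100,
    "method_size": 20,
}


def effective_limits(overrides: dict) -> dict:
    """Merge config overrides with the defaults."""
    limits = dict(DEFAULT_LIMITS)
    limits.update(overrides)
    # domain_size defaults to whatever url_size ends up being.
    limits.setdefault("domain_size", limits["url_size"])
    return limits


def oversized_fields(fields: dict, limits: dict) -> list:
    """Return the names of fields longer than their limit.

    fields maps "url", "fqdn", "ident", "method" or "domain" to a string.
    Assumption: a length equal to the limit is still allowed.
    """
    return [name for name, value in fields.items()
            if len(value) > limits[name + "_size"]]
```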

-- ACLS:

Acls start with the acl keyword and the acl name (e.g. acl nastystuff), followed by the options wanted for that acl. Each acl must end with a closing '}' but has no opening one, e.g.

acl funstuff

# funstuff
user_list = boo,fish,goo
extensions_file = /usr/soul/db/funstuff/extensions
probability_file = /usr/soul/db/funstuff/prob
keywords_file = /usr/soul/db/funstuff/keywords
text_url_db = /usr/soul/db/funstuff/urls
text_domain_db = /usr/soul/db/funstuff/domains
soul_db = /usr/soul/db/funstuff/soul.db
}
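
To make the block layout concrete, here is a small Python sketch that reads acl blocks of this shape (and subnet blocks, which use the same layout). It is not Soulcatcher's parser: the function name parse_blocks is made up, global options outside a block are not handled, and treating lines starting with '#' as comments is an assumption based on the sample above.

```python
def parse_blocks(text: str) -> list:
    """Parse 'acl NAME' / 'subnet RANGE' blocks that end with '}'."""
    blocks = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment (assumed syntax)
        if line == "}":
            if current is None:
                raise ValueError("'}' without an open block")
            blocks.append(current)
            current = None
        elif current is None:
            # A block starts with its keyword and name; there is no '{'.
            kind, _, name = line.partition(" ")
            if kind not in ("acl", "subnet") or not name.strip():
                raise ValueError(f"expected 'acl NAME' or 'subnet RANGE': {line!r}")
            current = {"kind": kind, "name": name.strip(), "options": {}}
        else:
            key, sep, value = line.partition("=")
            if not sep:
                raise ValueError(f"expected 'option = value': {line!r}")
            current["options"][key.strip()] = value.strip()
    if current is not None:
        raise ValueError(f"block {current['name']!r} is missing its closing '}}'")
    return blocks
```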

The following options do not have to be present in the acl.

a) user_list =

A list of users to match, separated by commas ','

b) probability_file =

The probability function allows you to evaluate a URL by setting a total score and giving certain words, functions and letters a score.

For example:

SCORE=10
LETTER_k=0.2
LETTER_w=0.4
WORD_download=10
WORD_porn=20.2
LINEOVER_200=10

The SCORE keyword sets the total score at which the URL is denied/allowed. If it is not present in the file, the score is set to the default of 12.0. The LETTER_ keyword sets a score for a certain letter. The WORD_ keyword sets a score for a certain word. This function is experimental and is subject to change.
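
The idea can be sketched in Python as follows. This is a guess at the mechanics, not the real algorithm: it assumes every occurrence of a letter or word adds its score, that matching ignores case, and that lines starting with '#' are comments. Keys other than SCORE, LETTER_ and WORD_ (such as LINEOVER_200 in the example) are only collected, not interpreted, because their meaning is not described here. The function names are made up.

```python
def parse_probability(text: str):
    """Read a probability file into (total, letters, words, other)."""
    total = 12.0  # documented default when SCORE is absent
    letters, words, other = {}, {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        key = key.strip()
        score = float(value)
        if key == "SCORE":
            total = score
        elif key.startswith("LETTER_"):
            letters[key[len("LETTER_"):]] = score
        elif key.startswith("WORD_"):
            words[key[len("WORD_"):]] = score
        else:
            other[key] = score  # kept but not interpreted
    return total, letters, words, other


def url_score(url: str, letters: dict, words: dict) -> float:
    """Sum the scores of all letter and word occurrences (assumed rule)."""
    lowered = url.lower()
    score = 0.0
    for letter, value in letters.items():
        score += lowered.count(letter.lower()) * value
    for word, value in words.items():
        score += lowered.count(word.lower()) * value
    return score
```

Under these assumptions a URL would be a match once url_score reaches the total returned by parse_probability.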

c) extensions_file =

Extensions to match. The file should contain just the extensions to block, e.g.

# EXT FILE

exe
jpg
jpeg
gif

This will match all exe, jpg, jpeg and gif files.
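
A minimal Python sketch of this kind of check is given below. The function names are hypothetical, and two details are assumptions: the extension is compared against the path part of the URL only (ignoring any query string), and the comparison ignores case.

```python
from urllib.parse import urlsplit


def load_list(text: str) -> list:
    """Return the non-blank, non-comment lines of a list file."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def extension_blocked(url: str, extensions: list) -> bool:
    """True if the URL path ends in one of the listed extensions."""
    path = urlsplit(url).path.lower()
    return any(path.endswith("." + ext.lower()) for ext in extensions)
```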

d) keywords_file =

Keywords to match. The file should contain the keywords to match. If you wish a keyword to be matched against just the domain, add a + to the beginning of the word, e.g.

# Keywords File

nasty
word
+chat
lines
+phone
tones
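
The difference between plain and + keywords can be sketched in Python like this. The function name keyword_blocked is made up, and the sketch assumes a simple case-insensitive substring match; whether Soulcatcher matches whole words or substrings is not stated here.

```python
from urllib.parse import urlsplit


def keyword_blocked(url: str, keywords: list) -> bool:
    """Check a URL against keywords; '+word' looks at the domain only."""
    lowered = url.lower()
    domain = (urlsplit(url).hostname or "").lower()
    for entry in keywords:
        if entry.startswith("+"):
            word = entry[1:].lower()
            if word and word in domain:
                return True
        elif entry and entry.lower() in lowered:
            return True
    return False
```

With the file above, http://chat.example.com/ would match +chat, but http://www.example.com/chat/room would not, because there the word only appears in the path.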

e) text_domain_db =

This file should contain all domain names you wish to match. This option should always be used together with the soul_db option, e.g.

# BAD DOMAINS
195.90.214.1
badsite.com
somemorebadsite.com
badbadsite.co.uk
nopethatbad.org

You will need to run soulcatcher with the -a argument to remake the databases.

f) text_url_db =

This file should contain all URLs you wish to match. This option should always be used together with the soul_db option, e.g.

#BAD URLS
http://www.badbadsite.org/morebadness/index.html
http://www.whatabadsite.com/uck/index.php

g) soul_db =

The location of the database of URLs and domains, which is made when you create a database. This option is only needed if you use the text_domain_db or text_url_db options.

-- SUBNETS:

Subnets are IP ranges, which allow you to select different acls and deny methods for each range. There MUST be a default subnet. A subnet is defined by the subnet keyword followed by the IP range(s), or by the keyword default for the default subnet. As with acls, a subnet has a closing '}' but no opening one. If an acl name is preceded by a '!', anything in that acl is allowed instead of blocked, e.g.

subnet default

ban_list = nastystuff,!okstuffthatgetsmistaken,foo

deny_methods=POST,CONNECT
redirect_page = http://www.somesite.uk/nono.html
}

subnet 192.168.0.1-192.168.0.100

ban_list = chat

redirect_page = http://www.somesite.uk/nono.html
deny_methods=POST,CONNECT
lock=yes
allow_all=yes
}

subnet 192.168.0.101

ban_list = chat,moo

redirect_page = http://www.somesite.uk/nono.html
deny_methods=POST,CONNECT
lock=yes
}
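
Choosing the subnet for a client address could look roughly like the Python sketch below. The function names are hypothetical. It handles a single address or one "start-end" range per subnet, assumes both ends of a range are included, and falls back to default when nothing else matches; how several ranges on one subnet line are written is not covered.

```python
import ipaddress


def in_subnet(ip: str, spec: str) -> bool:
    """True if ip falls inside spec ('default', 'a.b.c.d' or 'start-end')."""
    if spec == "default":
        return True
    start_text, sep, end_text = spec.partition("-")
    start = ipaddress.ip_address(start_text.strip())
    # A spec without '-' is a single address.
    end = ipaddress.ip_address(end_text.strip()) if sep else start
    return start <= ipaddress.ip_address(ip) <= end


def pick_subnet(ip: str, specs: list) -> str:
    """Return the first non-default spec containing ip, else 'default'."""
    for spec in specs:
        if spec != "default" and in_subnet(ip, spec):
            return spec
    return "default"
```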

The following options, APART from ban_list, do not have to be present in the subnet.

ban_list =

A list of the acls you want to use for the subnet. If you want everything in an acl to be allowed, put a ! in front of the acl name.
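
As an illustration, a ban_list value can be split into acl names and their action with a few lines of Python. The function name parse_ban_list and the "allow"/"block" labels are made up for this sketch.

```python
def parse_ban_list(value: str) -> list:
    """Split a ban_list value into (acl_name, action) pairs.

    A leading '!' marks an acl whose matches are allowed, not blocked.
    """
    entries = []
    for item in value.split(","):
        name = item.strip()
        if not name:
            continue  # ignore empty items such as a trailing comma
        if name.startswith("!"):
            entries.append((name[1:], "allow"))
        else:
            entries.append((name, "block"))
    return entries
```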

redirect_page =

The redirect page for this SUBNET, same usage as the global one.

deny_methods =

Allows you to deny certain methods. They are separated by a ','

lock=

Locks off the whole subnet, no matter what happens in the checks.

allow_all=

Allows everything for that subnet.

4. Soulcatcher arguments

-c , Specifies the path to the config file; otherwise the compile-time default is used.

-t , Create all url and domain databases for a certain acl.

-r , Create a certain domain database.

-l , Create a certain url database.

-v, Version Details

-a, Create all url and domain databases.

-o, Create all domain databases.

-u, Create all url databases.

-h, Help Screen.

-d, Debug

-s, Stats.