Broadband Filtering - Squid Access Controls v1

  • 8/9/2019 Broadband Filtering - Squid Access Controls v1

    Training Manual

    Broadband Filtering

    Contents

    Overview of Web Filtering on Worcestershire Broadband
    SmartFilter configuration at County Hall
      Changing Smartfilter Categorisation
      Smartfilter Configuration
      Policy groups
      Global restrictions
      Site exemptions
      Search engine keyword checking
      Exercises
    Squid Access Controls
      How Access Controls work
        Basic Source and Destination ACLs
        Regular Expression ACLs
        Other ACL Types
      An Example
      Squid Proxy Exercises
    How the use of Squid and SmartFilter is enforced
    Using Log Files
      Calamaris
      Exercise

    Overview of Web Filtering on Worcestershire Broadband

    Decisions over filtering take place in three locations:

    1. SecureComputing, the company that supplies SmartFilter, the County's main web filtering product.

    2. County Hall, where SmartFilter is configured and adjusted according to the requirements of Worcestershire schools.

    3. Your school, where sensible configuration of the Squid proxy server can considerably enhance the effectiveness of filtering, and allow you to manage users' access to websites in a variety of ways.

    The operation of filtering and management may be visualised with the help of the following diagram:

    [Diagram: the three locations and what each one controls]

    Secure Computing: Control List
    County Hall (broadband team): config.txt, search.txt, site.txt
    School: Squid ACLs

    Labels on the diagram:
    Control list downloaded twice-weekly
    Request re-categorisation of a site on the SecureComputing website
    Request exemptions or additions to site.txt
    Configure Squid to restrict access

    Smartfilter Configuration

    The installation of SmartFilter at County Hall provides three main configuration files that control the operation of the filter: config.txt, site.txt and search.txt.

    Policy groups

    The config.txt file at County Hall allows the creation of policy groups. Each of these groups can have separate filtering regimes, according to the 30 categories.

    Existing groups are:

    Schools

    Libraries

    It is possible that these could be extended in future, for example, to take account of the needs of different types of school. The allow/deny scheme for schools has been listed above. In the case of libraries (at the time of writing), only the Sex and Gambling categories are denied.

    The policy groups are also responsible for defining any denied file types. Currently, school users accessing the Web via the proxy are not permitted to download .exe or .zip files.

    Global restrictions

    The config.txt file also decides on a number of global restrictions, the most important of which is probably access to sites via IP address. This is currently permitted, as library users must have access to Hotmail accounts, which make extensive use of IP address links.

    Site exemptions

    The site.txt file on the County Hall cache servers allows us to add sites that have not been categorised by SmartFilter, or to override the control list. For example, the site:

    http://www.hardcorevideos.org

    is included in the list to allow it to be categorised as Sex.

    On the other hand,

    http://www.thinkquest.org

    is exempted in the same file, as it would otherwise be banned in schools under the Chat category.

    Search engine keyword checking

    The search.txt file allows checking to take place on users' entries into certain search engines. The list of search engines currently in force is as follows:

    *google.com
    *google.de
    *google.fr
    *google.co.uk
    *google.it
    *google.ca
    *google.co.jp
    *google.co.kr
    *go.com
    *infoseek.com
    *altavista.digital.com
    *altavista.com
    *altavista.senet.com.au
    *altavistacanada.com
    *altavista.magallanes.net
    *altavista.skali.com.my
    *altavista.yellowpages.com.au
    *austronaut.ims.at
    *lycos.com
    *yahoo.com
    *yahoo.dk
    *yahoo.fr
    *yahoo.de
    *yahoo.it
    *yahoo.no
    *yahoo.es
    *yahoo.se
    *yahoo.com.au
    *yahoo.co.uk
    *yahoo.co.jp
    *yahoo.co.kr
    *yahoo.com.sg
    *excite.com
    *excite.de
    *excite.co.jp
    *excite.co.uk
    *mckinley.com
    *webcrawler.com
    *hotbot.com
    *dejanews.com
    *nlightn.com
    *snap.com
    *whoizzy.com

    Entries into any of these search engines are matched against a list of proscribed keywords. Positive matches will result in an "Access Denied by SmartFilter" message, with an indication as to the category.

    Exercises

    1. Use SmartFilterWhere to check the categorisation of two or three websites you know. If you disagree with the categorisation, request a suitable change.

    2. Change the proxy settings on the browser to allow .exe files to be downloaded. Confirm that your changes are working. Change the setting back afterwards. See the section "How the use of Squid and SmartFilter is enforced" for further information.

    3. Check the operation of the keyword checking on various search engines using the word "hardcore" or "cocaine".

    Squid Access Controls

    Squid provides a system of access controls to enable you to decide who gets access to what, and when. You can manage Squid access control lists (ACLs) using Webmin. To log into Webmin, go to https://10.N.1.1:10000, where N is your school's IP identifier number. At the Finstall Centre, for example, we would type in:

    https://10.11.1.1:10000
    or
    https://finstall.networcs.net:10000

    Choose the Servers Tab:

    Then Squid Proxy Server:

    Then Access Control:

    The Access Control screen looks like this:

    This is the list of access control lists. The most important are the blacklist and whitelist, but you can also add your own for special purposes.

    The proxy restrictions list is an ordered list that defines how ACLs are applied. Access checking starts from the top.

    Always be sure to use the Apply Changes link to ensure that your ACLs are correctly enforced.

    How Access Controls work

    There are two main parts to the Access Control page. On the left are Access Control lists (ACLs). These are named definitions. For example, Blacklist is the name that has been given to a list of Web Server Hostnames, and this list can be edited to suit your purposes. As such, the Blacklist ACL is just a definition and does not determine any particular action on the part of the proxy server.

    On the right is a list of Proxy restrictions. This is essentially an ordered list of ACLs, with each one allowing or denying traffic according to the content of the ACL. It is important to recognise the following facts:

    ACLs do not have any effect on the operation of the proxy server until they are incorporated into the proxy restrictions list.

    The position in the list is critical, as processing starts at the top of the list and proceeds down the list until a matching rule is found. This means that if you place your rule under the Allow all rule, it will have no effect.
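
    The two facts above can be sketched in squid.conf terms, where ACL definitions are acl lines and proxy restrictions are http_access lines. This is a minimal illustration, not the configuration on your server: the domain .example.com is hypothetical, and the ACL named all is assumed to be defined already, as it is in a default Squid setup.

    ```
    # Definition only: this line names a list of sites but blocks nothing by itself.
    # (.example.com is a hypothetical entry, not taken from the real blacklist.)
    acl blacklist dstdomain .example.com

    # Proxy restrictions: checked from the top, and the first matching rule wins.
    http_access deny blacklist
    http_access allow all

    # A deny rule added here, below "allow all", would never be reached.
    ```

    Swapping the two http_access lines would let every request through, because the allow rule would match before the blacklist was ever consulted.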

    In addition to editing the blacklist, there is a range of other types of ACL that can be created. Click on the drop-down list by the Create new ACL button to see a list of the types available to you.

    These different types of ACL are available; the most commonly used is Web Server Hostname, as in the blacklist.

    Before we go any further, some explanation of these ACL types might be helpful. The following definitions were taken from the Squid website (which will explain the occasionally quirky use of English), to which we have added our own notes, with examples from the NGfL server. Some of the more exotic ACLs have been omitted, as you are unlikely to need to use them (if necessary, please refer to the documentation at www.squid-cache.org). Each one corresponds with one of the types listed in the drop-down box shown above (shown in red below). The syntax relates to the configuration file squid.conf (the configuration file used by Squid to record all preferences).

    Srcdomain
    Client Hostname

    This can be used to control requests from other domains, perhaps to prevent other schools from using your Squid proxy! Since Squid needs to perform a reverse DNS lookup (from client IP address to client domain name) before this ACL is interpreted, it can cause processing delays.

    Usage     acl aclname srcdomain domain-name

    Example   acl aclname srcdomain .kovaiteam.com

    Note      Here the leading "." is important (see the note on Webserver Hostname below).

    Dstdomain
    Webserver Hostname

    This is the ACL type used by the blacklist to control access to sites that have not been filtered by SmartFilter.

    Usage     acl aclname dstdomain domain-name

    Example   acl aclname dstdomain .kovaiteam.com

              This looks for *.kovaiteam.com in the URL.

    Note      Here the leading "." is important.

    Note that you can force the proxy to match multiple servers on the same domain by prefixing it with a dot and omitting the hostname. For example,

    .bbc.co.uk

    would match both of these:

    www.bbc.co.uk
    news.bbc.co.uk

    Also, do not enter the protocol part of the URL into this list. In other words, this is correct:

    www.ibsed.networcs.net

    but not:

    http://www.ibsed.networcs.net
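
    As a minimal sketch of the leading-dot rule in squid.conf, assuming a hypothetical ACL name of bbcsites and assuming the rule is placed above the general allow rule in the proxy restrictions list:

    ```
    # Hypothetical ACL name. The entry has a leading dot and no protocol prefix.
    acl bbcsites dstdomain .bbc.co.uk

    # Applies to www.bbc.co.uk, news.bbc.co.uk and any other host in the domain.
    http_access deny bbcsites
    ```

    Without the leading dot, an entry such as www.bbc.co.uk would apply to that one hostname only.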

    Regular Expression ACLs

    Regex-type ACLs use pattern matching to control access. Typically, you will only need to enter a single word to match against a domain name to achieve a result.

    srcdom_regex
    Client Regexp

    This could be used to control accesses from stations with names containing a particular character string, although you are probably more likely to want to achieve this using IP addresses. Since Squid needs to perform a reverse DNS lookup (from client IP address to client domain name) before this ACL is interpreted, it can cause processing delays.

    Usage     acl aclname srcdom_regex pattern

    Example   acl aclname srcdom_regex kovai

              This looks for the word kovai in the client domain name.

    Note      This type of ACL may introduce delays into the display of pages.

    dstdom_regex
    Webserver Regexp

    This can be used to find website requests containing a specific character string.

    Usage     acl aclname dstdom_regex pattern

    Example   acl aclname dstdom_regex kovai

              This looks for the word kovai in the destination domain name.

    In the same way, a pattern of cannabis will find www.cannabis.com.

    url_regex
    URL Regexp

    The url_regex type searches the entire URL for the regular expression you specify. Note that these regular expressions are case-sensitive.

    Usage     acl aclname url_regex pattern

    Example   acl ACLREG url_regex cooking

              ACLREG matches URLs containing "cooking", but not "Cooking".

    In the same way, a pattern of radio will find the word radio anywhere in the URL; it will find:

    www.radio.com
    www.bbc.co.uk/radio3
    etc.

    urlpath_regex
    URL Path Regexp

    The urlpath_regex type performs regular expression pattern matching on the URL, but without the protocol and hostname. Note that these regular expressions are case-sensitive.

    Usage     acl aclname urlpath_regex pattern

    Example   acl ACLPATHREG urlpath_regex cooking

              ACLPATHREG matches only "cooking", not "Cooking", and ignores the protocol and hostname. If the URL is http://www.visolve.com/folder/subdir/cooking/first.html then this ACL type only looks at the part after http://www.visolve.com/ .

    In the same way, a pattern of news will find

    www.bbc.co.uk/news

    but not:

    www.news.com
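
    To show how the regular expression types differ, here is a short squid.conf sketch that applies the same pattern in three ways. The ACL names and the pattern news are hypothetical. The -i option, which makes a pattern case-insensitive, comes from the Squid documentation rather than from the notes above.

    ```
    # Hypothetical ACL names and pattern, for illustration only.

    # Host name only: matches www.news.com, but not www.bbc.co.uk/news
    acl newshosts dstdom_regex news

    # Path only: matches www.bbc.co.uk/news, but not www.news.com
    acl newspaths urlpath_regex news

    # Whole URL: matches both of the above
    acl newsanywhere url_regex news

    # -i ignores case, so News and NEWS are matched as well
    acl newsanycase url_regex -i news
    ```

    As with any ACL, these definitions do nothing until one of them is added to the proxy restrictions list.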

    Browser
    Browser Regexp

    Regular expression pattern matching on the request's user-agent header.

    Usage     acl aclname browser pattern

    Example   acl aclname browser MOZILLA

              This matches requests from browsers that have the MOZILLA keyword in the user-agent header.

    This example could be used to prevent users from using the Mozilla browser.

    Other ACL Types

    Time
    Date & Time

    Time of day, and day of week.

    Usage     acl aclname time [day-abbrevs] [h1:m1-h2:m2]

              day-abbrevs:
              S - Sunday
              M - Monday
              T - Tuesday
              W - Wednesday
              H - Thursday
              F - Friday
              A - Saturday

              h1:m1 must be less than h2:m2

    Example   acl ACLTIME time M 9:00-17:00

              ACLTIME refers to Mondays from 9:00 to 17:00.

    An ACL of this type can also be used to find accesses occurring during weekday lunchtimes.
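
    As a sketch of how the day abbreviations combine, the following squid.conf lines define a weekday lunchtime ACL alongside the Monday example. The name lunchtimes and the hours 12:00 to 13:00 are assumptions chosen for illustration.

    ```
    # The example from the text: Mondays, 9:00 to 17:00.
    acl ACLTIME time M 9:00-17:00

    # Monday to Friday (MTWHF); the lunchtime hours are assumed.
    acl lunchtimes time MTWHF 12:00-13:00
    ```

    Note that Thursday is H and Saturday is A, because T and S are already taken by Tuesday and Sunday.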

    Method
    Request method

    This specifies the method type of the request.

    Usage     acl aclname method method-type

    Example   acl aclname method GET POST

              This refers to the GET and POST methods only.

    This is the only example of this ACL in the default Squid proxy setup. You are unlikely to need to use this ACL.

    An Example

    Let's put together an ACL to prevent access to the Internet from a particular machine (or group of machines) at a particular time of day. In this case, the time of day will be lunchtime every weekday.

    To start with, we create a Date and Time ACL. Choose Date and Time from the drop-down list, then click on Create new ACL.

    Now define the days and times of day you want your ACL to refer to. Click on the Selected radio button, then select the days you want the list to apply to. Click on the radio button to the left of the first time box, then enter the start and end times.

    After you have clicked on the Save button, your list will look something like this:

    The new ACL in place.

    Next, we will create an ACL to define the station (or group of stations) to which we want the ACL to apply. Note that this step would be unnecessary if we wanted the rule to apply to all stations on the network.

    Create a new Client Address ACL. To do this, you must give the first address in the range, plus a subnet mask. If there is only to be one machine in the range (as in this example), you can leave the To IP box empty. If you want to create a list of several ad-hoc addresses, this can be done by saving single addresses one by one.

    Finally, you need to join these two ACLs together in a single proxy restriction, and move it into the appropriate place in the list to create the desired effect. Click on the Add proxy restriction link at the bottom of the Proxy restrictions list.

    Select both the lunchtimes list and the station56 list, and ensure that the Deny radio button is selected, then click on Save.

    Both of the new ACLs have been selected, so the rule will be looking for requests from station56 happening during lunchtimes.

    Now this list must be moved into position, using the arrows on the right, as shown below:

    The new list is positioned above Allow all to ensure that it has the desired effect.
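
    For reference, the rule built in this example might look roughly like this in squid.conf. This is a sketch under stated assumptions: the lunchtime hours and the address 10.11.1.56 for station56 are illustrative guesses, and the ACL named all is assumed to be defined already, as in a default Squid setup.

    ```
    # Weekday lunchtimes; the exact hours are an assumption.
    acl lunchtimes time MTWHF 12:00-13:00

    # A single workstation; the address is an assumption.
    acl station56 src 10.11.1.56/32

    # Two ACLs on one line must both match: station56 AND lunchtimes.
    http_access deny station56 lunchtimes

    # The general allow rule stays below the new restriction.
    http_access allow all
    ```

    Because both ACLs appear on the same http_access line, station56 keeps its normal access outside lunchtimes, and other stations are unaffected at any time.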

    Squid Proxy Exercises

    1 Set up a new blacklist, and test it to ensure that it works correctly

    2 Set up a whitelist and test it to ensure that it works correctly

    3 Set up a list to filter out sites with a specific word in the domain name

    4 Set up a list to filter out sites with a specific word in the path

    5 Set up a list to stop a group of computers in the room from accessing the Internet

    6 Configure Squid to prevent any web access between set hours

    7 Create a whitelist that only applies to a group of workstations, and make three websites available to users of those stations. Can you make this list apply on Monday mornings only?

    8 Implement failure URL.

    Using Log Files

    Your NGfL server keeps a continuous log of Website accesses. A typical portion of the log might look something like this:

    1021296336.929 3857 10.11.1.56 TCP_MISS/200 3807 GET http://services.postcodeanywhere.co.uk/form.asp? - ROUNDROBIN_PARENT/cache1.networcs.net text/html
    1021296337.032 103 10.11.1.56 TCP_MISS/304 230 GET http://www.worcestershire.gov.uk/home/jo_house_top.jpg - ROUNDROBIN_PARENT/cache2.networcs.net text/html
    1021296337.108 102 10.11.1.56 TCP_MISS/304 230 GET http://www.worcestershire.gov.uk/home/jo_boat.jpg - ROUNDROBIN_PARENT/cache2.networcs.net text/html
    1021296337.166 134 10.11.1.56 TCP_CLIENT_REFRESH_MISS/304 230 GET http://www.worcestershire.gov.uk/home/content_tag.gif - ROUNDROBIN_PARENT/cache2.networcs.net text/html
    1021296337.285 83 10.11.1.56 TCP_MISS/304 230 GET http://www.worcestershire.gov.uk/home/jo_house_bottom.jpg - ROUNDROBIN_PARENT/cache2.networcs.net text/html
    1021296338.580 1605 10.11.1.56 TCP_MISS/304 230 GET http://www.worcestershire.gov.uk/home/jo_middle_text.gif - ROUNDROBIN_PARENT/cache1.networcs.net text/html
    1021296338.641 1632 10.11.1.56 TCP_MISS/304 230 GET http://www.worcestershire.gov.uk/home/jo_house_middle.jpg - ROUNDROBIN_PARENT/cache1.networcs.net text/html
    1021296338.643 1476 10.11.1.56 TCP_CLIENT_REFRESH_MISS/304 230 GET http://www.worcestershire.gov.uk/home/spacer_line.gif -

    Calamaris

    In this raw form, the log is of little use. However, you are provided with a log analysis tool known as Calamaris (website at http://calamaris.cord.de/ ). This provides the following reports:

    Summary
    Incoming requests by method
    Incoming UDP-requests by status
    Incoming TCP-requests by status
    Outgoing requests by status
    Outgoing requests by destination
    Request-destinations by 2ndlevel-domain
    Request-destinations by toplevel-domain
    TCP-Request-protocol
    Requested content-type
    Requested extensions
    Incoming UDP-requests by host
    Incoming TCP-requests by host
    Performance in 60 minute steps

    Here is an example of the 2nd-level domain report:

    To gain access to Calamaris, choose the Servers tab in Webmin, then choose Squid Proxy, and finally Calamaris Log Analysis:

    This should give you a good idea of how your proxy is being used, and the type of sites that are being most frequently accessed. However, should you wish to have direct access to the log files themselves, this can be arranged on a weekly basis (beware: a compressed archive from a week's log file can easily be 10MB). Please contact the Broadband Support Team if you wish to pursue this.

    Exercise

    Browse the Calamaris Log Analysis page for your school and determine the most popular sites.