Live Monitoring of Web Traffic in Proxy and Content Filter

This guide demonstrates how an admin can use the log files for the proxy server and content filter to monitor web traffic. Using the log files, you can dynamically watch user connections and filter your view of the traffic by IP, username, or even website.

Log Files

From the command line, ClearOS lets you watch log files as they are written. Log in to your ClearOS server and we will take a look at one of the two files associated with web traffic:

  • /var/log/squid/access.log
  • /var/log/dansguardian-av/access.log

If you are only using the proxy, you will use the squid log file. If you are using the content filter you can use either. For the most part, they are the same. A key difference is that the squid file uses Unix epoch time for the entries and the dansguardian-av log uses a more human-friendly format.
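If the epoch timestamps in the squid log are hard to read, they can be rewritten on the fly. The following is a minimal sketch, not part of ClearOS: the function name squid_humanize is hypothetical, and it assumes GNU date (for the -d "@N" form) and that each line starts with a timestamp such as 1380779973.853.

```shell
# Hypothetical helper: rewrite the leading epoch timestamp of each
# log line read from standard input as a readable local date.
# Assumes GNU date, which accepts -d "@SECONDS".
squid_humanize() {
    while IFS= read -r line; do
        stamp=${line%% *}        # first field, e.g. 1380779973.853
        rest=${line#"$stamp"}    # remainder of the line, spacing kept
        secs=${stamp%%.*}        # drop the fractional seconds
        printf '%s%s\n' "$(date -d "@$secs" '+%Y-%m-%d %H:%M:%S')" "$rest"
    done
}

# Usage (reads the whole file once):
#   squid_humanize < /var/log/squid/access.log
```

Because the function reads standard input line by line, it can also sit at the end of any pipeline that produces squid log lines.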

There are a number of ways to view these files, including opening them in an editor such as nano or vi, or using 'cat' to print the whole file to the screen. For this demonstration we will be using 'tail'.

Tailing a file in Linux

Tailing a file in Linux means viewing only the end of the file. By default this is the last ten lines, but you can specify more. For our use, we will follow the file instead of printing the most recent ten lines. To follow the file, issue the following command (using the appropriate log file):

tail -f /var/log/dansguardian-av/access.log

This command follows the log file: as the file grows, each new line is printed to the screen until you cancel with Ctrl+C. While the content filter or proxy is running, it shows every request in real time as your users browse the internet.
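When you only want a fixed number of recent lines rather than a live view, the -n option sets how many lines tail prints. The sketch below uses a temporary stand-in file (an assumption, so the commands can be tried anywhere); on a ClearOS server you would substitute one of the access.log paths.

```shell
# Stand-in log file with 30 numbered lines.
log=$(mktemp)
seq 1 30 > "$log"

tail "$log"          # default: the last 10 lines (21 through 30)
tail -n 3 "$log"     # only the last 3 lines (28, 29, 30)

# -n also combines with -f. Left commented out here because it
# keeps running until cancelled with Ctrl+C:
#   tail -n 50 -f "$log"

rm -f "$log"
```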

Searching the results using grep

The 'grep' utility prints only the lines that match a pattern and ignores all others. The pattern can be a regular expression ('regex') or a simple word. To use tail and grep together, we pipe the standard output of tail (what it would otherwise print to the screen) into the standard input of grep, along with our search term, so that only the matching lines are displayed.
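The difference between a simple word and a regex match can be tried on a few made-up lines before pointing grep at a live log. The three sample lines below are invented for illustration and are not real log entries; the options shown (-w, -i, -E) are standard grep options.

```shell
# Three made-up lines standing in for log entries.
sample='2013.10.3 7:59:37 user1 GET
2013.10.3 7:59:38 user12 GET
2013.10.3 7:59:39 User2 GET'

# Simple word: matches anywhere in the line, so user12 is shown too.
printf '%s\n' "$sample" | grep user1

# -w: match whole words only, so user12 is no longer shown.
printf '%s\n' "$sample" | grep -w user1

# -i: ignore case, so "user2" matches "User2".
printf '%s\n' "$sample" | grep -i user2

# -E: extended regex; the | means "either of these names".
printf '%s\n' "$sample" | grep -wE 'user1|User2'
```

The whole-word form is worth considering when usernames share a prefix, since a plain search for one name would otherwise also show the other.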

For example, if you wanted to monitor a user named 'user1', you would issue the following:

tail -f /var/log/dansguardian-av/access.log | grep user1

You may get results like this:

2013.10.3 7:59:33 user1;src=4228629;met=1;v=1;pid=103167436;aid=276091750;ko=0;cid=55817230;rid=55706519;rv=2;&timestamp=1380779973853;eid1=2;ecn1=0;etm1=30;  GET 42 0  2 200 image/gif  sales -
2013.10.3 7:59:34 user1;src=4228629;met=1;v=1;pid=103167436;aid=276090300;ko=0;cid=55817226;rid=55706515;rv=2;&timestamp=1380779974910;eid1=2;ecn1=0;etm1=30;  GET 42 0  2 200 image/gif  sales -
2013.10.3 7:59:37 user1  GET 982 0  2 200 image/x-icon  sales -
2013.10.3 7:59:38 user1  GET 894 0  2 200 image/x-icon  sales -
2013.10.3 7:59:39 user1  GET 1150 0  2 200 image/x-icon  sales -

There are several key data points here.

  • The date
  • The time
  • The username (if you are using user-based authentication)
  • The URL
  • The type of request
  • The size of the data
  • The type of the data
  • The group that matched in the content filter
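The same approach extends to filtering by IP address or website. The helpers below are a sketch only: the function names, the address 192.168.1.50, and the domain example.com are placeholders, and --line-buffered is assumed to be available (it is a GNU grep option). It makes grep write each match immediately, which matters when one grep feeds another in a live pipeline.

```shell
# Hypothetical helpers; each one filters log lines read from standard input.
# -F treats the pattern as literal text, so the dots in an address or
# host name are not regex wildcards. -w keeps 192.168.1.5 from also
# matching 192.168.1.50.
by_ip()     { grep --line-buffered -F -w -- "$1"; }
by_site()   { grep --line-buffered -F -i -- "$1"; }
# -v inverts the match: drop lines whose content type is an image.
no_images() { grep --line-buffered -v 'image/'; }

# Usage on a live log (address and domain are placeholders):
#   tail -f /var/log/squid/access.log | by_ip 192.168.1.50
#   tail -f /var/log/dansguardian-av/access.log | by_site example.com | no_images
```

Chaining the filters narrows the view step by step, for example to one site's page requests without the image traffic that usually dominates the log.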


content/en_us/kb_bestpractices_live_monitoring_of_web_traffic_in_proxy_and_content_filter.txt · Last modified: 2014/12/23 23:04 by dloper