Catching flies with honey tokens

Dominic Storey, technical director, Sourcefire UK

Network Security, November 2009

So there you are in your boss’s office. He has called you in because your company-confidential database data has turned up in the hands of your competitor. He’s not happy and he wants a scapegoat. How can you ensure it’s not you?

Perhaps you can determine how it got out and who allowed this to happen. But that is often more difficult than it sounds. Understanding the normal ebb and flow of data within an organisation is hard enough; determining how it leaks out of your business through unofficial means is harder still.

In truth, it is unlikely that you will be able to figure out how the data got into the competition’s possession, but by taking some simple steps you can ensure that, should it happen again, you are in a much better position to deal with the consequences without spending huge amounts of money. What you have to do is build into your infrastructure the means to trace data leaks. So, where do you start?

Firstly, you need to understand how the data will be transported around your network when accessed in a legitimate way. If the data you want to protect is held in a central database, then this is a good place to start.

It is important to find out how many legitimate ways your database can be accessed. Is it through a web interface, via a client-server application or via synchronisation with another database? Does it always end up on screens, or can records be accessed and copied down to local storage such as USB sticks and transported that way? And the most important question is: how can you detect when it is being used illegitimately?

The answer is: honey data, Snort and honey rules.

What is honey data?

Honey data is artificially crafted data you add into your business application database so that its movement around your organisation can be monitored. You can use honey data, combined with the means to monitor it, to verify who is accessing what data from where.

A honey data token is the name given to the smallest unit of honey data. It may be a whole record or a word in a field; a customer data record or an administration record. Basically, it is any form of structured or unstructured data that is unique and can be positively detected by your monitoring resources. Good honey data has two key attributes: it can be identified by your monitoring system so that the false positive rate for this data record is very low, and it can be positively identified and avoided by legitimate business processes, so that legitimate business use of the data is not adversely affected.

One example of honey data is the use of special fake addresses seeded by mailing database companies. These databases are the intellectual property of the mailing company – you are typically licensed to use a mailing database just once. A mailing company would lose revenue if you re-used the list, so it includes honey records which, when used, cause your mail item to be sent back to the mailing company in question, allowing it to identify misuse of its intellectual property.

Honey tokens and the rules that detect them should be given special status in your company. Knowledge of the exact form and location of the honey data should be restricted and the tokens themselves stored in a secure database, along with information about where they might appear in a legitimate process. A good rule of thumb is to give the honey data the same status as cryptographic key material within your organisation.

Why ‘honey’? The term has been borrowed from honey nets, systems put onto public-facing networks by companies to entice hackers. Honey nets are systems designed to look ‘interesting’ to hackers and to contain them in the honey net should they gain access. This is often achieved by making the systems ‘sticky’; that is, hard to disengage from. For example, this could mean that a hacker’s system times out on access rather than supporting a quick disconnection initiated by a mouse click or a character such as Control-C.

“You can use honey data, combined with the means to monitor it, to verify who is accessing what data from where”

Another scenario in which honey data can come in useful is for mobile phone operators. Consider that mobile telephone operator ‘A’ has entered into a roaming agreement with two other providers, ‘B’ and ‘C’. As part of the roaming reconciliation service, each exchanges with the others lists of numbers routed through their networks. Each provider would like to recruit big-spending customers, so they are tempted to rank the callers who are routed from their network, identify which of the biggest spenders are not their customers and send this list to their telesales team. The prospects would then be cold-called on a regular basis by salespeople with special offers to entice them to move operators. This kind of behaviour is forbidden by contract between operators, but ‘A’ may still suspect ‘B’ and ‘C’ of surreptitiously doing this. But what can they do to prevent it?

One way may be for ‘A’ to create some special accounts that are not billed to true subscribers, but link directly back to A’s computer systems. Then they arrange for these accounts to be very active over a period of a few months and monitor whether ‘B’ and ‘C’ try to call them to sell services. There may be real phones associated with these accounts, which are used by ‘A’s employees in ‘B’ and ‘C’ territories so that valid roaming charges are incurred. These account telephone numbers are the honey tokens.

Detecting honey tokens

An ideal tool for the detection of honey tokens is Snort, a network-based intrusion prevention system from Sourcefire Inc, released under the GNU General Public License v2. Snort is also available commercially and forms part of a comprehensive cyber-security product, Sourcefire 3D.

Snort started life as a network packet capture tool with similarities to tcpdump and Ethereal (now Wireshark), but quickly evolved into an intrusion detection system as Martin Roesch, its originator, added a sophisticated rules language and pre-processor architecture. In-line and other blocking functions were also added.

The rules language has remained open and transparent – anyone can view and modify the thousands of published Snort rules or add their own from scratch. Consequently, detecting data in transit becomes a simple matter of having a rule that will catch the honey token as it is transferred across your network. However, to make a good job of this, there are a number of pre-requisites. Firstly, you must have a clear understanding of how the honey token will look on the wire. For example, will data in a database be transported in binary format or text format? Will data be transferred in Unicode (16 bit) or ASCII (8 bit)? Secondly, you need a means of identifying (and possibly suppressing) events from the legitimate business process. This way, logs will not be filled with legitimate transactions that will appear as ‘noise’ from the perspective of looking for an anomalous data transfer. Thirdly, there must be suitable monitoring points in the network that allow you to unambiguously identify the transport of the honey data.
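To make the ‘on the wire’ question concrete, the short Python sketch below (mine, not the article’s) shows how the account number used as the example token later in this article would appear in ASCII versus 16-bit Unicode (UTF-16LE), and how either form could be written as a hex content match. Everything except the token value is illustrative.

# Sketch: how one honey token looks on the wire in two common encodings,
# and how to express each as a Snort-style hex content match.
token = "14421233"  # the example honey token used later in this article

def snort_hex(data: bytes) -> str:
    # Render raw bytes in the |xx xx ...| form accepted by Snort content options.
    return "|" + " ".join(f"{b:02x}" for b in data) + "|"

ascii_bytes = token.encode("ascii")      # 8-bit text transport
utf16_bytes = token.encode("utf-16-le")  # 16-bit ("Unicode") transport

print("ASCII   :", snort_hex(ascii_bytes))
print("UTF-16LE:", snort_hex(utf16_bytes))
# A rule that matches the ASCII form will not fire on the UTF-16 form,
# which is why each transport stage may need its own content match.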

A sensible approach is to deploy sensors to monitor multiple points. Start by monitoring the network connection to the database with a rule that will detect the token being transferred from the database. Then, incorporate rules at other strategic egress points around the network. Your rules may well be different according to how data is translated at each stage. By incorporating multiple measuring points in this way, you can detect how the honey data can leave the network.

Building a honey net

Ideally, you should have access to a test version of your business application. Many companies implement test or ‘sandbox’ databases. Access to such a sandbox will enable you to test and debug your honey system. The test version should be loaded with authentic sample data.

First, create a series of well-qualified honey tokens and deploy them to the sample database. Then, write your Snort rules and deploy them to a sensor that monitors your sample database. You may wish to have more than one rule – one for every stage in which the data is transposed is recommended. For instance, if you have a three-tier application (database->web app server->web server), you might have three rules: a database transport rule, a web transport rule and a user interface rule. Then run a legitimate test process and make a note of the rules that fire. If possible, use rule-suppression based on source and destination to suppress triggers on these legal accesses. Next, use some hacking techniques to get to the data some other way (for example, if the database is based on MySQL, use the MySQL client to access the data from an alternate machine). Validate that the rule has fired and contains meaningful data (source and destination addresses and a meaningful message). Once you are satisfied that this works, remove the honey data from the database and run the test again. If you receive any events, then you have a false positive problem and will need to refine the honey tokens.
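The ‘alternate machine’ test can be driven from a small script. The Python sketch below assumes the mysql-connector-python package is available and uses hypothetical host details, credentials and table/column names; it illustrates the kind of test described above rather than reproducing the article’s own harness. The account number is the article’s later example value.

# Sketch: query the honey record directly, bypassing the normal application path.
# Host, credentials and the table/column names are hypothetical placeholders.
import mysql.connector

conn = mysql.connector.connect(
    host="10.0.0.5",       # hypothetical sandbox database host
    user="testuser",
    password="testpass",
    database="sandbox",
)
cur = conn.cursor()
# Pull the honey record the way a data thief with database credentials might.
cur.execute(
    "SELECT Name, Address, Postcode, AccountNumber "
    "FROM customers WHERE AccountNumber = %s",
    (14421233,),
)
print(cur.fetchall())
conn.close()
# If the database-layer honey rule is working, this query should raise an event
# on the sensor monitoring the database connection.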

When you have zero false positive and false negative rates, deploy your rules to your production network, but do not deploy the tokens. Again, wait until the business process has run through a cycle and test for false positives in your system. If all is clear, deploy the honey data and start production monitoring.

Honey token rotation

It is good security practice to consider the possibility that in time your honey tokens may become known to a data thief. If they know in advance which records are honey records, they can structure their database query to exclude the records in question. Or if they have used encrypted access, they can strip the records from the data set to counter your data-mining probes. The simple remedy is to change the honey tokens.

Our recommendation is to create and test new tokens and rules on a periodic basis. One of the advantages of Snort is that rules are text-based, so you can link the creation of a token and the detection rule together in a script. With suitable scripting, the entire process of creating and testing honey tokens and rules can be automated. This type of automation also identifies problems that may occur when application software updates change the low-level format of network data.

Figure 1: An example honeytoken rules scenario.
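As a rough sketch of the kind of token-plus-rule generation script just described: the SQL statement, rule template, SID and output file names below are my own placeholders, and a real script would also load the SQL into the application database and the rule into the sensor.

# Sketch: generate a fresh honey token and its matching Snort rule in one step,
# so the token and the rule that detects it can never drift apart.
# Table/column names, the SID and the output file names are placeholders.
import random

def new_honey_token() -> int:
    # Eight-digit account number, matching the format of the article's example.
    return random.randint(10_000_000, 99_999_999)

RULE_TEMPLATE = (
    'alert tcp $TRUSTED_HOSTS 3306 -> $UNTRUSTED_HOSTS any '
    '(msg:"DATABASE Honey Token detected"; '
    'flow:established,from_server; flowbits:isset,lamp_db; '
    'content:"{token}"; classtype:honey-token-access; '
    'sid:{sid}; rev:1;)'
)

token = new_honey_token()
sql = (
    "INSERT INTO customers (Name, Address, Postcode, AccountNumber) "
    f"VALUES ('Jane Doe', '1 Example Street, Anytown', 'AB1 2CD', {token});"
)
rule = RULE_TEMPLATE.format(token=token, sid=1000101)

with open("honey_token.sql", "w") as out:
    out.write(sql + "\n")
with open("honey_token.rules", "w") as out:
    out.write(rule + "\n")
print("New honey token:", token)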

Honey token/rule example using a LAMP stack application

Using Linux as the OS, Apache as the web server, MySQL as the database and Perl as the application server, such a three-tier application may be completely self-contained or spread over multiple machines. Consider the following scenario:

• All traffic to the web front-end processor (WFEP) arrives over port 80 (upstream components not shown).
• The WFEP handles initial authentication and end user page formatting.
• The application server is a set of Perl applications invoked by the WFEP over HTTP on port 12080.
• The application server accesses the back-end database via MySQL client calls over port 3306.
• The database holds customer records with the following fields: Name, Address, Postcode, AccountNumber [primary key, unique].

Since AccountNumber is guaranteed to be unique, simply create a new record in the database and initialise the fields to believable but fake values. Then store the account number in the honey token database and embed this in the honey rule:

Name = “Dominic Storey”
Address = “West Forest Gate, Wellington Road, Wokingham”
Postcode = “RG40 2AQ”
AccountNumber = 14421233
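The article also says the account number should be stored in a separate, tightly controlled honey token database. A minimal sketch of such a registry is shown below, using SQLite purely for illustration; the schema, file name and recorded location are my own placeholders.

# Sketch: record a honey token in a separate, access-restricted registry,
# together with where it legitimately lives and which rules watch for it.
import sqlite3

reg = sqlite3.connect("honey_registry.db")  # placeholder path; protect like key material
reg.execute(
    "CREATE TABLE IF NOT EXISTS honey_tokens ("
    "  token TEXT PRIMARY KEY,"
    "  location TEXT,"       # where the token appears in legitimate processes
    "  detect_sids TEXT,"    # Snort rule SIDs that detect it
    "  created TEXT DEFAULT CURRENT_TIMESTAMP)"
)
reg.execute(
    "INSERT OR REPLACE INTO honey_tokens (token, location, detect_sids) "
    "VALUES (?, ?, ?)",
    ("14421233", "customers.AccountNumber", "1000001,1000002,1000003"),
)
reg.commit()
reg.close()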

There are three honey rules to catch data leakage, one operating at each layer. The first rule monitors the database layer and looks for data being extracted directly from the database. The second rule monitors potential accesses made to the web application server. The third rule monitors data leaving through the web front end itself.

“It is good security practice to consider the possibility that in time your honey tokens may become known to a data thief”

These rules might have the following form:

alert tcp $TRUSTED_HOSTS 3306 -> $UNTRUSTED_HOSTS any \
(msg:"DATABASE Honey Token detected"; \
flow:established,from_server; flowbits:isset,lamp_db; \
<jump instructions to locate account number…> \
content:"14421233"; within:8; \
classtype:honey-token-access; sid:1000001; rev:1;)

alert tcp $TRUSTED_HOSTS 12080 -> $UNTRUSTED_HOSTS any \
(msg:"WEBAPP Honey Token detected"; \
flow:established,from_server; flowbits:isset,lamp_was; \
<jump instructions to locate account number…> \
content:"14421233"; within:8; \
classtype:honey-token-access; sid:1000002; rev:1;)

alert tcp $TRUSTED_HOSTS 80 -> $EXTERNAL_NET any \
(msg:"WEB Honey Token detected"; \
flow:established,from_server; flowbits:isset,lamp_wfep; \
<jump instructions to locate account number…> \
content:"14421233"; within:8; \
classtype:honey-token-access; sid:1000003; rev:1;)

“What is attractive about honey tokens, and the Snort rules that detect them, is that they are easy to put together”

Where $TRUSTED_HOSTS is a variable containing the IP addresses of all machines that comprise the LAMP application and $UNTRUSTED_HOSTS is set as !$TRUSTED_HOSTS. The logic of locating the account numbers is not shown, as this will be specific to the data model. However, note the use of the flowbits keyword: this allows rule chaining, which is used here for application (layer 7) state tracking – in this example, two more rules (not shown) would be written to set this flowbit when a session opens a connection to the monitored database and to reset it when the connection is closed. This ensures that we do not falsely trigger just because we see the number 14421233 in any database transaction.

Making it better

Honey tokens provide the best results when they can somehow be tied to a single usage of the data, as in the example of the use of a mailing list by a client. It’s a little harder when the data may be accessed by many people on a fairly continuous basis. This can be addressed by the use of a time-stamped token. In this case, the token record contains data that is updated on a regular basis (anywhere from once an hour to once a minute, which can easily be achieved by a short script). The honey rules are designed to ignore this part of the record, but when the rule triggers, the data will be stored in the Snort packet record. This can be retrieved later for validation purposes.

Another good approach is to embed obfuscated system time (accurate to one minute) in the data. Then, when the honey data is detected in the field, you can recover the time stamp from the data and search the Snort logs for the client machine that accessed this token at this time.
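One naive way to implement the obfuscated-timestamp idea is sketched below; the XOR mask, field format and minute resolution are my own choices for illustration, not a prescription from the article.

# Sketch: embed an obfuscated minute-resolution timestamp in a honey field and
# recover it later from captured data. The XOR mask is a placeholder 'secret';
# a real deployment would use something stronger than a fixed mask.
import time

MASK = 0x5A5A5A5A  # shared secret known only to the honey token owner

def encode_stamp(now=None):
    minutes = int((now if now is not None else time.time()) // 60)
    return format(minutes ^ MASK, "08x")  # looks like an opaque reference code

def decode_stamp(field):
    minutes = int(field, 16) ^ MASK
    return time.strftime("%Y-%m-%d %H:%M", time.gmtime(minutes * 60))

field = encode_stamp()
print("Stored in honey record:", field)
print("Recovered access time :", decode_stamp(field))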


Conclusions

The leaking of data is not an easy problem to solve. There are many products on the market that claim to provide data leakage protection (DLP), but none of these can offer 100% protection. This is not necessarily a reflection on the software vendors, but more on the difficulty of the problem. Companies can easily spend hundreds of thousands of dollars on DLP and still have incomplete protection.

What is attractive about honey tokens, and the rules that detect them, is that they are easy to put together, and as long as some development and testing is carried out, they provide a reliable and highly flexible way of detecting data leakage. The cost can be minimal – all you need is the sensor hardware, plus your time and ingenuity to make it work.