Proceedings of the 5th WSEAS International Conference on Applied Computer Science, Hangzhou, China, April 16-18, 2006 (pp467-474)
A practical and light-weight data capture tool for Xen virtual machine

NGUYEN ANH QUYNH, YOSHIYASU TAKEFUJI
Graduate School of Media and Governance, Keio University
5322 Endoh, Fujisawa, JAPAN
{quynh,takefuji}@sfc.keio.ac.jp

Abstract: A honeypot is a common solution for investigating an attacker’s activities, but the data capture tool, one of the key components of the high-interaction honeypot architecture, faces a major difficulty: it is very hard to hide its presence. For example Sebek, the de facto data capture tool, suffers from this problem: the intruder can easily uncover it even without privileged access rights. This paper presents the design and implementation of a light-weight “camera” software for the Xen virtual machine environment: the camera can be put into the virtual machine honeypot to gather the necessary data about the intruder’s actions. The camera tool is named XenKamera. It collects TTY data from the consoles of the observed honeypot, then replays the collected data in on-line or off-line manner as the administrator wishes. Simply put, XenKamera allows us to watch the intruder as if we were looking over his shoulder while he is typing. In order to prevent the intruder from discovering XenKamera, a special architecture is proposed, so that the data recording process becomes stealthy and hard to detect and circumvent. To protect the gathered data, the TTY log is secretly transferred to a separate virtual machine and safely kept there. Experiments demonstrate that XenKamera is effective and reliable. Besides serving honeypot purposes, XenKamera is designed to be so light-weight that it is also practical for production systems, where it can record working sessions so that the administrator can rely on the logged data to investigate and trouble-shoot administration problems.

Key–Words: Xen virtual machine, Linux, honeypot, data capture tool, stealth communication, keylogger, TTY logging, computer administration.

1 Introduction

A honeypot ([1], [2], [3], [4]) is a computer system whose purpose is to lure attackers in order to gather information about threats. The collected information is used to better understand threats and how they are evolving and changing, in order to counter them in the best way possible. If honeypot technology is applied properly, we can discover novel attack patterns and unknown security holes. Honeypots also help to study the attacker’s motives.

The high-interaction honeypot consists of 3 key components:

• Data control: this component is used to contain the intruder’s activities and ensure that he does not cause any harm to other production systems outside the honeypot.

• Data capture: a honeypot must capture all the activities within the honeypot, including the information entering and leaving the system.

• Data collection: the information gathered by the capture component must be securely and secretly forwarded to a central data server. This allows data captured from various honeypot sensors to be centrally collected for analysis and archiving.

Sebek ([5]) is the de facto, widely used tool in current honeypot technology. The Sebek architecture consists of 2 key components: a kernel module running on the honeypot system, and a central server that collects data. The first component, the Sebek kernel module, serves as the data capture tool and captures the intruder’s activities in the honeypot. It also serves as a part of the data collection component: the collected data is transferred by this module to a Sebek server (sebekd) running on a central machine, where it is analyzed with utilities provided in the Sebek package.

One of the vital requirements of the data capture component is that it must be as covert as possible, so that the intruder never knows that he is being watched. To satisfy that demand, the Sebek kernel module applies many tricks borrowed from the black-hat community. Unfortunately these tricks are still not good enough to hide Sebek: researchers have pointed out many methods to detect Sebek’s presence, and some of them do not even require privileged access.

Another potential problem of Sebek is that the data collection component running on the central server must be exposed to the network to capture data forwarded from the honeypots, so this server must be protected at all costs. Otherwise the attacker will bring it down, and then he can do anything he likes to the honeypot he broke into without worrying that he is being observed.

This paper proposes a novel approach to eliminate these problems of Sebek. The design and implementation of a light-weight data capture tool named XenKamera, which works in the Xen virtual machine ([6], [7]) environment, will be presented. XenKamera focuses on capturing only the TTY data of the honeypot, and is able to function much more stealthily and quietly than Sebek. The recorded data is not saved on the local honeypot, but secretly transferred to a central server and safely kept there. This leaves the intruder no chance to find or delete the earlier logging data. Meanwhile, the collecting daemon does not need to be exposed on the network, which improves the security of the whole honeypot architecture.

To make the solution more practical, XenKamera is designed so that its overhead is low and it has almost no impact on the system it runs on. Consequently, XenKamera can also be used on production systems to transparently record working sessions, so that the administrator can rely on it to investigate and carry out forensics to recover a system broken by mistaken operations.

The rest of this paper consists of 5 sections: Section 2 outlines some problems with current capture tools (specifically Sebek). Section 3 proposes the XenKamera architecture. Section 4 discusses some issues in the XenKamera design and an experience we had while experimenting with it. Section 5 summarizes related work, and Section 6 concludes the paper.

2 Sebek’s Problems

As mentioned above, Sebek is the de facto, widely used data capture tool in honeypot technology ([8]). The reason is obvious: being inside the kernel, Sebek has the opportunity to capture all the necessary data, including keystrokes and input and output data from the file system or network sockets, while it can control the whole system as it wants. Sebek even patches the kernel to hide its presence from system users, so that the intruder has a hard time discovering its existence.

Unfortunately, while much effort has been put into making Sebek more stealthy, Sebek is still rather easy to uncover and defeat. Here are 8 outstanding problems of Sebek:

(1) An unprivileged local user can run a simple command that generates a large amount of input or output data, such as the dd-attack method proposed in [9], and then check with the ping command whether network performance has decreased. The reason is that the data captured from all input and output is secretly forwarded to the central sebekd server via the network stack, and this activity may congest the network if the amount of data is enormous.

(2) Though Sebek tries to patch the network stack to hide the actual network statistics when using the network layer to send data, we can still compare the internal statistics obtained from inside the kernel with what the system shows to user-space ([9]). The difference will disclose the presence of Sebek.

(3) Sebek inserts a kernel module into the system, and this module could be listed in the kernel module list (with the lsmod command on Linux). Though we can try to hide it with another kernel module (like the clean method proposed by adore-ng [10]), the Sebek module can still be found with a brute-force scanning technique [11].

(4) Sebek replaces some system calls with its own functions. Paper [12] exploits this fact to propose another way to discover Sebek: we just need to check whether the addresses of these system calls are at abnormal places in memory. If that is the case, chances are Sebek is present in the kernel.

(5) Sebek replaces some system calls with code that packs the data into UDP packets and sends them out to the sebekd machine. This change significantly increases the time needed to complete the system calls, and the difference (more than 3 times in the case of the read() system call [12]) can be easily recognized from user-space by an unprivileged user.

(6) After detecting Sebek, the intruder can remove it by restoring the original system calls (for example with the unsebek.c tool in [13]). The fact that Sebek is a kernel module makes this easier.

(7) Sebek sends the captured data to the central server via the network. If the intruder has a sniffer (such as tcpdump [14]) installed at the right place in the network, he will see this data and easily figure out that the system he has penetrated is a honeypot.

(8) The central server must be exposed to the network to receive data sent from the honeypot. That will tempt the intruder to attack this server in order to bring down this fundamental component of our honeypot. This is not just a theory, but an actual threat: paper [15] proposes such a method, in which sebekd is taken over if it uses a libpcap library with a buffer overflow bug.

As we see, there are many problems with the current Sebek, and together they make the honeypot a less attractive solution for security practice.
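
Problem (5) can be illustrated with a minimal sketch of a user-space timing probe, written here in Python. It is only an illustration under assumptions: the function names `mean_read_time` and `looks_instrumented`, the 1-byte read size and the iteration count are hypothetical and not taken from [12], and the baseline would have to be measured on a known-clean system with comparable hardware. The factor of 3 reflects the slowdown reported for read() above.

```python
import os
import tempfile
import time


def mean_read_time(iterations=10000):
    """Return the mean wall-clock time (seconds) of a 1-byte read() call.

    Hypothetical probe: an unprivileged user compares this value with a
    baseline measured on a known-clean machine.
    """
    with tempfile.TemporaryFile() as f:
        f.write(b"x" * iterations)
        f.flush()
        f.seek(0)
        fd = f.fileno()
        start = time.perf_counter()
        for _ in range(iterations):
            os.read(fd, 1)  # one read() system call per iteration
        elapsed = time.perf_counter() - start
    return elapsed / iterations


def looks_instrumented(measured, baseline, factor=3.0):
    """Flag the system if read() is at least `factor` times slower than baseline."""
    return measured >= factor * baseline
```

No privilege is needed to run such a probe, which is why patching the system-call path is so easy to notice.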

3 XenKamera Solution

Traditionally, honeypots have been physical systems on a dedicated network, with multiple physical machines to monitor and collect logging data from the honeypots. The resources required by a honeypot have prevented it from becoming a common network security solution.

The advent of virtual machines such as Xen has made setting up honeypots far easier. Instead of a set of physical machines, the honeypot is now a Xen virtual machine, with the host filtering and monitoring network traffic and collecting logs. Even better, one Xen host can have multiple honeypots running on it, and those honeypots can be configured in a realistic virtual network, each playing a specific role: data control, data capture or data collection.

Our solution, XenKamera, is based on Xen and takes advantage of features provided by Xen to address the outstanding problems that Sebek currently experiences. Because XenKamera is made to work in the Xen environment, we will first take a brief look at Xen technology, and then discuss XenKamera’s design and implementation.

3.1 Xen Virtual Machine

Xen is a virtual machine monitor initially developed by the University of Cambridge Computer Laboratory and now promoted by various industry giants like Intel, AMD, IBM, HP, RedHat and Novell, and by the whole open source community. Released under the open source GNU GPL license, Xen can be used to partition a machine to support the concurrent execution of multiple operating systems (OS). Commodity OSes (Linux, FreeBSD and NetBSD are now officially supported) can run on Xen with small changes to the kernel. Xen is outstanding because the performance overhead introduced by virtualization is negligible: the slowdown is only around 3% ([16]). Various practices take advantage of Xen, such as server consolidation, co-located hosting facilities, distributed services and application mobility.

The Xen community is working hard to gradually push Xen into the Linux kernel, so that it will be available to every Linux user. The process is expected to start from kernel 2.6.15 with Xen version 3.0.

Xen Architecture

Basically, Xen is a thin layer of software above the bare hardware, and Xen exposes a virtual machine abstraction that is slightly different from the underlying hardware. In Linux, Xen introduces a new architecture called xen, which is very similar to the x86 architecture. The virtual machines (VM) executing on Xen are modified (at kernel level) to work with the xen architecture. All accesses of the VMs to the hardware and peripherals must go through Xen, so Xen can keep a close eye on those VMs and control all their activities.

A VM running on top of Xen is called a Xen domain, or domain for short. A special privileged domain named Domain0 (or Dom0 for short) always runs. Dom0 manages the other domains (called User Domains, or DomU for short), including jobs like starting, shutting down, rebooting, saving, restoring and migrating them between physical machines.

Because all the domains run on the same machine, they can share physical resources such as memory, hardware interrupts and peripherals. Note that all such sharing must be approved by Xen.

3.2 XenKamera Design

3.2.1 Goals and Approaches

XenKamera is designed with the aim of overcoming the 8 problems of Sebek discussed above.

1. The first goal of XenKamera is to be the data capture tool for the honeypot architecture. Sebek tries to capture all the I/O data in the system, including the I/O data from consoles, the file system and network sockets. In the experiments we have carried out, we found that in many cases we were most interested in the session data generated by the console, which shows what the intruder typed at his console after he broke into the honeypot. This information discloses quite a lot about the goals, motives and attitude of the intruder. This observation leads us to a decision: XenKamera should focus on capturing only the data from console sessions, not the data from the file system or network sockets as Sebek does. The console I/O data shows us not only what the intruder types, but also the output returned by the system, exactly as he receives it on his screen. In Unix-derived systems, the communication layer that allows users to access local and remote physical/virtual devices is the TTY subsystem. By hooking into the TTY subsystem and capturing TTY data, we can gather the user’s I/O data even if the session is encrypted (for example when the intruder logs in via the SSH service). The other benefit is that dealing only with TTY data (instead of capturing everything as Sebek does) makes XenKamera very light-weight, since the amount of data produced by the console layer is usually quite small. Moreover, as we put the capture code at the TTY layer, not in the critical path as Sebek does (Sebek gets the data by patching the system calls, which leads to very high overhead), the performance impact of XenKamera is small. Consequently problem (5) of Sebek is largely mitigated. Besides that, since we no longer patch the system calls, our solution avoids problem (4) of Sebek.

In order to capture the console data, we must modify the DomU’s kernel in the TTY layer. The TTY layer is the subsystem that manages all the input and output data of console sessions. Patching at the right place makes XenKamera support most types of TTY: virtual console, BSD console, unix98-style PTYs (xterm/ssh), serial, ISDN, etc.

2. Another target of XenKamera is to secretly send the logging data to the collection machine. In contrast with Sebek, which uses the network stack to transfer data out, we take advantage of Xen to send the data out via shared memory. Since all the domains run on the same physical machine, they can share memory with each other. Thanks to Xen’s intercommunication mechanisms, we can establish a shared memory region between DomU, the virtual machine on which we run XenKamera, and Dom0. DomU puts all the data gathered from the TTY layer in the shared memory, then notifies Dom0 to pick it up. With this scheme, data is no longer sent out through the network stack, so the process becomes quieter and stealthier, and XenKamera can defeat problems (1), (2), (5) and (7) that Sebek currently suffers from.

In addition, this solution has one more merit: because data is sent via shared memory instead of the network stack, the overall reliability and efficiency are significantly increased.

3. With the strategy of exchanging data between DomU and Dom0 via shared memory, we run a daemon process in Dom0 to pick up the logging data forwarded by DomU. Because all the communication is done via shared memory and other Xen communication mechanisms, none of it is carried out on the network. Consequently the daemon process does not need to be exposed on the network as sebekd is, and our approach does not face problem (8) of Sebek.

Because Xen provides strict isolation between DomU and Dom0, even if the intruder knows that he is under observation, he cannot access or modify the logging data kept in Dom0. This advantage still stands even if he somehow gains the ultimate privileges of the root user in DomU.

4. Besides a small patch at the right place in the TTY layer of DomU, we propose that XenKamera be applied as a whole as a patch to the DomU’s kernel, not as a kernel module as in Sebek’s approach. With this trick, we no longer need to worry about hiding a kernel module as Sebek does, and it is also more difficult for the intruder to remove XenKamera from the kernel. Therefore, problem (3) of Sebek is addressed by our approach, while problem (6) is greatly relieved (we discuss this further later).

5. The final goal is that XenKamera must be flexible, so that the administrator can disable or enable it at run-time as he desires.
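
The shared-memory transfer of goal 2 can be pictured with a small single-producer, single-consumer ring, sketched here in Python. This is a simplified model under assumptions, not XenKamera’s actual code: the class name `SharedRing`, the ring size and the `notify` callback (standing in for a Xen notification to Dom0) are hypothetical, and a real implementation would live in the DomU kernel and rely on Xen’s inter-domain mechanisms.

```python
class SharedRing:
    """Toy model of a byte ring shared by a producer (DomU) and a consumer (Dom0)."""

    def __init__(self, size, notify=None):
        self.buf = bytearray(size)   # stands in for the shared memory region
        self.size = size
        self.head = 0                # total bytes written by the producer
        self.tail = 0                # total bytes read by the consumer
        self.notify = notify         # hypothetical stand-in for a Xen notification

    def free_space(self):
        return self.size - (self.head - self.tail)

    def put(self, data):
        """Producer side: copy data into the ring; drop it if there is no room."""
        if len(data) > self.free_space():
            return False             # never block the TTY path
        for b in data:
            self.buf[self.head % self.size] = b
            self.head += 1
        if self.notify is not None:
            self.notify()            # tell the consumer that new data is ready
        return True

    def get(self):
        """Consumer side: drain everything currently in the ring."""
        out = bytearray()
        while self.tail < self.head:
            out.append(self.buf[self.tail % self.size])
            self.tail += 1
        return bytes(out)
```

Dropping data when the ring is full is only one possible design choice; it is used here because it keeps the producer from ever blocking, which matches the asynchronous behaviour the design aims for.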

All of these goals and approaches lead us to the following architecture for XenKamera.

3.2.2 XenKamera Architecture

XenKamera consists of 3 main components: the camera device in DomU, which acts as the data capture tool (kameraU); the camera recorder in Dom0, which acts as the data collection daemon (xenkamerad); and the camera utilities in Dom0 (including the camera player, the keystroke extractor and others). The overall architecture of XenKamera is outlined in Figure 1.

Camera device in DomU: kameraU is kernel code that XenKamera puts into the kernel space of DomU. This code patches and hooks into the TTY core layer of DomU to capture TTY data. The captured TTY events are open (a new console is opened), deinitialized (a console is cleaned up before closing), read (there is input data), and write (there is output data). The collected data, together with the event information, is delivered to Dom0 via a shared memory region between DomU and Dom0.

To be flexible, kameraU can be disabled and enabled by an instruction sent from Dom0’s user-space. When inactive, it costs no overhead in the DomU.
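
To make the four event types concrete, the following Python sketch shows one way such events could be serialized before being placed in the shared memory. The record layout is an assumption made purely for illustration: the header fields (event type, TTY device number, user-id, payload length), their sizes and the names `TtyEvent`, `pack_record` and `unpack_records` are hypothetical and do not describe kameraU’s real format.

```python
import struct
from enum import IntEnum


class TtyEvent(IntEnum):
    OPEN = 1     # a new console is opened
    DEINIT = 2   # a console is cleaned up before closing
    READ = 3     # input data (what the user typed)
    WRITE = 4    # output data (what the user sees)


# Hypothetical header: event type, TTY device number, user-id, payload length.
HEADER = struct.Struct("<BIIH")


def pack_record(event, dev, uid, payload=b""):
    """Serialize one event as a fixed header followed by its payload."""
    return HEADER.pack(int(event), dev, uid, len(payload)) + payload


def unpack_records(blob):
    """Yield (event, dev, uid, payload) tuples from a concatenation of records."""
    offset = 0
    while offset < len(blob):
        event, dev, uid, length = HEADER.unpack_from(blob, offset)
        offset += HEADER.size
        yield TtyEvent(event), dev, uid, blob[offset:offset + length]
        offset += length
```

A length-prefixed layout like this lets the reader in Dom0 split a stream of bytes back into events without any delimiter inside the TTY data itself.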

Figure 1: XenKamera architecture

Camera recorder: xenkamerad runs in the user-space of Dom0 to record the TTY data sent from kameraU. This daemon process waits for notifications of new data from kameraU. When it detects that new data has arrived, it reads the data from the shared memory between Dom0 and DomU, then saves it into a separate logging file for each domain.

Camera utilities: Once we have the TTY logging data from the camera recorder, we need to replay or analyze it. XenKamera provides 2 key tools: a player (ttyreplay) and a keystroke extractor (xkeys). With the player, the administrator can replay the data on-line, while the intruder is typing at his console, and so follow the intruder live. The administrator can also replay TTY logging data off-line by picking up any logging file and replaying it later, when the intruder has already gone. In case the administrator is only interested in what the intruder typed (not the output screen, which shows the results of the commands the intruder runs), he can use the keystroke extractor to extract the keystrokes from the logging data. With this tool, the administrator can quickly take a look and figure out what the intruder is doing or has already done to the honeypot.
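
The difference between replaying a session and extracting keystrokes can be sketched in a few lines of Python. This is an illustration only, assuming a log already decoded into (event, payload) pairs with the event names "open", "deinit", "read" and "write"; that record shape and the function names are hypothetical and are not the actual interfaces of ttyreplay or xkeys.

```python
def extract_keystrokes(records):
    """Return only the typed input from an iterable of (event, payload) pairs.

    `event` is one of the strings "open", "deinit", "read", "write"; the
    record shape is an assumption made for this sketch.
    """
    return b"".join(payload for event, payload in records if event == "read")


def replay_output(records):
    """Return what the user saw on screen: the concatenated "write" payloads."""
    return b"".join(payload for event, payload in records if event == "write")
```

For a session in which the user types `id`, the first function would return the typed command and the second the echoed command together with its output.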

4 Discussion

We have carried out some experiments with XenKamera and found that its impact is small: on average the overhead is only around 19% when data is input or output on the console. The reason is that XenKamera forwards data to Dom0 via shared memory, and it works asynchronously. Consequently the user/intruder never sees any difference while working on a XenKamera-powered system. In contrast, our measurements of Sebek on the same set of tests show that Sebek costs up to nearly 950%: the reason is that Sebek chooses to patch the system calls (not the TTY layer as XenKamera does) to capture data, and the system calls are especially sensitive to any change. This is also a major problem of Sebek: the intruder can easily figure out that Sebek is running with some simple benchmarks. Our evaluation demonstrates that XenKamera is a much more effective solution than Sebek as a keystroke/TTY logger.

While XenKamera is able to observe DomUs, we do not intend to watch the control domain (Dom0), because Dom0 is the trusted domain. The administrator must protect Dom0 at all costs: if the intruder takes over Dom0, the game is over, and he can do anything he likes to the other DomUs. Normally it is a good idea to run Dom0 without a network address, so that outsiders have less chance to attack it.

XenKamera provides patches for DomU in 2 places: the TTY hooks and kameraU. This code is built into the DomU’s kernel, so it is not shown in the kernel module list (with the lsmod command), and consequently we remove one chance for the intruder to detect XenKamera’s presence. This approach also makes it harder for the intruder to remove XenKamera from memory.

There might be one more place the intruder can investigate to discover XenKamera’s presence: the kernel binary and kernel symbol files. Fortunately, in the Xen architecture DomU is started by loading the kernel from Dom0, so we do not need to keep the kernel binary file or the kernel symbol files in DomU’s file system.

Last but not least, all paths to the kernel memory should be prohibited, as the intruder might somehow get root access in DomU and use that privilege to access the kernel internals and modify them to disable XenKamera. In order to prevent this problem, DomU’s kernel should be compiled with /dev/{kmem,mem,port} removed ([17]), and the ability to load kernel modules at run-time should be eliminated, too. This can lead to some objections: the honeypot becomes too restrictive, and the attacker might become suspicious. But we argue that this kind of hardened environment is increasingly popular, and an attacker should expect it on any production system.
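
As a rough aid, the hardening described above could be sanity-checked from inside a DomU with a sketch like the following Python snippet. It is an assumption-laden illustration: the function names are hypothetical, the `root` parameter exists only so that the check can be exercised against a test directory, and the absence of a device node does not prove that the driver is compiled out, since a root user could recreate the node. On Linux, /proc/modules is normally present only when the kernel supports loadable modules.

```python
import os


def hardening_report(root="/"):
    """Report which kernel-memory paths are still reachable under `root`.

    `root` is a parameter only so the check can be pointed at a test
    directory; the function name is hypothetical.
    """
    report = {}
    for name in ("kmem", "mem", "port"):
        report["/dev/" + name] = os.path.exists(os.path.join(root, "dev", name))
    # On Linux, /proc/modules normally exists only with loadable-module support.
    report["/proc/modules"] = os.path.exists(os.path.join(root, "proc", "modules"))
    return report


def is_hardened(report):
    """True when none of the checked paths exist."""
    return not any(report.values())
```

A report with any entry set to True would suggest that a path to kernel memory or to run-time module loading is still open.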

In the current solution, Dom0 has difficulty understanding the domain-level semantic data in the TTY logging files it records. For example, the TTY logging data saves meta-information about the user-id of the user who opens the console and generates the console data. But the user-id is only meaningful in the domain that produces the logging, not in Dom0. The reason is that the TTY logging data is collected from inside DomU’s kernel, but the kernel is only aware of the user-id (which is a number), while the user-name is only available at user-space level. Consequently, the administrator who runs xenkamerad in Dom0 to gather TTY logging can only identify the logged user by his user-id, not by his user-name. This problem can be solved in 2 ways: either Dom0 keeps the user database of every DomU (the /etc/passwd file of DomU is suitable for this purpose), or DomU informs Dom0 of the user-name instead of the user-id. But neither solution is perfect: with the first method, Dom0 must always keep its copy of the database up to date, since it can change dynamically. The second method requires DomU’s kernel to read data from user-space, and there is no good and clean way to do that. So at the moment we are content with the current solution, and look to improve it in the future.
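
The first of the two options can be sketched in Python as follows. The sketch assumes that Dom0 holds a copy of the DomU’s /etc/passwd contents as text in the standard colon-separated passwd format; the function names `parse_passwd` and `user_name` are hypothetical, and the copy may of course be stale, which is exactly the drawback noted above.

```python
def parse_passwd(text):
    """Map numeric user-ids to user-names from passwd-formatted text."""
    users = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 3 or not fields[2].isdigit():
            continue                      # skip malformed entries
        users.setdefault(int(fields[2]), fields[0])  # first entry wins
    return users


def user_name(users, uid):
    """Return the user-name for uid, or the bare number if it is unknown."""
    return users.get(uid, str(uid))
```

Falling back to the bare number keeps the log readable even when the user was created after the copy of the database was taken.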

As XenKamera is very light-weight, we also propose to use it to record TTY working sessions on production systems. We made an experiment: in our test-bed Xen system, we ran one Xen domain with XenKamera. On this domain, we had an apache web server hosting documentation for the local network. In 3 months, we collected 218 logging files totalling 37906KB in size. This data consists of the logs of all the console sessions generated when the administrators logged in via SSH to download/upload documentation and to reconfigure the apache server. During this time, our web server once had a problem: users could no longer find the documentation on the server. We replayed all the logging files of the previous day, and figured out the problem: in one SSH session, an administrator logged in, used the vim editor to open the apache configuration file, and changed one apache option by mistake. The error made apache look into the wrong virtual directory for the documentation it hosts. The trouble was quickly diagnosed and fixed. This problem would have been hard to explain without the XenKamera logging files. This experience shows us that XenKamera can be a good tool for trouble-shooting administration problems.

5 Related Works

Honeypots are one of the hottest topics in security research. Many papers focus on applying honeypots to improve defense systems or to trap malware. Honeypots can be broken down into 2 kinds: low-interaction and high-interaction.

Low-interaction honeypots have limited interaction: they normally work by emulating services and operating systems. Attacker activity is limited to the level of emulation provided by the honeypot. Honeyd ([18]), which can emulate the TCP/IP stack and simulate network behavior, is one of the most popular honeypots of this type.

The Honeynet ([2]) is a high-interaction honeypot, which is the main research topic of this paper. A honeynet may contain one or more honeypots, and Sebek plays a key role in a honeypot: its job is to capture the intruder’s activities. Though Sebek is a popular tool in the honeypot community, there are very few papers related to the topic of this paper, that is, papers that discuss the weak points of honeypots or propose methods to improve Sebek.

In [13] and [15], J. Corey points out some problems with honeypots, especially with Sebek, and proposes several methods to defeat it. M. Dornseif and T. Holz also present a few other methods to detect and exploit Sebek in [9] and [12]. Our paper tries to investigate all the current outstanding problems of Sebek, and proposes XenKamera as a solution to address or mitigate them.

There have been many attempts to capture keystrokes and TTY logging for either administration or security purposes. Basically we can divide them into 2 kinds: user-space-based and kernel-space-based solutions.

The user-space solutions capture keystrokes either by poking at I/O ports ([19], [20], [21]) or by intercepting TTY file descriptors ([22]). They all have the same drawback: they are easy to uncover and disable.

Kernel-based solutions are preferable, because they are much harder to detect. They can even stay invisible, and usually can only be detected by privileged users with special techniques. Two of the most famous keystroke and TTY logger kits are vlogger ([23]) and ttyrpld ([24]).

vlogger is a Linux kernel-based key logger. It is a favorite tool of the black-hat community, and is usually installed on a penetrated Linux system to steal information typed at the console (user names and passwords are the most wanted). vlogger intercepts TTY internal functions to record keystrokes, and either saves them to the local file system or sends them out to another machine. It also tries to evade network-level probes by patching the network stack. Nevertheless, this kit does not give us the output screen-shots as a TTY-based solution does, and can be easily detected by a sniffer (like tcpdump) placed on a separate, independent machine in the same broadcast domain. Simply put, it shares many problems with Sebek.

ttyrpld is a TTY capture solution for multiple operating systems (currently Linux, FreeBSD and OpenBSD are supported). Also kernel-based, ttyrpld is difficult to circumvent, and is able to log any TTY type (including virtual console, bsd/unix98 pty, serial, isdn). Even better, its overhead on the system is quite low. In order to get TTY data, ttyrpld patches the OS kernel and puts some hooks in the TTY core layer. On Linux, the hooks are used by a kernel module, which attaches its own functions to these hooks to gather TTY data. The information is then transferred to user-space via a software device (put at /dev/rpl). ttyrpld also provides a player named ttyreplay to replay the saved logging data, and the administrator can watch those data in real-time or off-line. Overall, ttyrpld is a very nice tool.
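The hook arrangement described above can be pictured with a small user-space model. This is purely illustrative Python, not ttyrpld code: the `TtyCore` and `LoggerModule` classes, the `register_hook` method and the queue standing in for the software device are all hypothetical names chosen for this sketch.

```python
from collections import deque


class TtyCore:
    """Toy stand-in for a TTY core layer that exposes hook points."""

    def __init__(self):
        self._hooks = []  # callbacks attached by a logging module

    def register_hook(self, callback):
        self._hooks.append(callback)

    def write(self, tty_name, data):
        # The patched path: every attached hook sees the data first.
        for hook in self._hooks:
            hook(tty_name, data)
        return len(data)  # normal TTY handling would continue here


class LoggerModule:
    """Toy stand-in for the module that attaches its functions to the hooks."""

    def __init__(self, core):
        self.device = deque()  # stands in for the device read from user-space
        core.register_hook(self.on_tty_data)

    def on_tty_data(self, tty_name, data):
        self.device.append((tty_name, data))


core = TtyCore()
logger = LoggerModule(core)
core.write("pts/0", b"ls -l\n")
core.write("pts/1", b"whoami\n")

# A user-space reader drains the device in arrival order.
while logger.device:
    tty, data = logger.device.popleft()
    print(tty, data)
```

The point of the model is the separation of roles: the core only offers hook points, the module decides what to record, and the reader in user-space only sees what the module queues for it.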

Unfortunately, ttyrpld is not suitable for secretly watching the intruder, because there are too many clues about its presence: ttyrpld requires a daemon process (rpld) to run in user-space, and it also installs a device at /dev/rpl in the system. In addition, the kernel module of ttyrpld, named rpldev, can be listed with the lsmod command. All of this evidence can be spotted even by an unprivileged user. Consequently, the intruder will quickly realize that he is being observed.
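To illustrate how little effort such a check takes, the sketch below looks for the three clues named above from an unprivileged account. It is our own illustration, not a tool from the cited works: the function name and its parameters are hypothetical, it reads /proc/modules (the file that lsmod itself formats), and it assumes a Linux system that provides /proc/<pid>/comm.

```python
import os


def ttyrpld_clues(dev_path="/dev/rpl", modules_path="/proc/modules",
                  proc_root="/proc", module_name="rpldev", daemon_name="rpld"):
    """Return a list of ttyrpld traces visible to an unprivileged user."""
    clues = []

    # Clue 1: the software device installed by ttyrpld.
    if os.path.exists(dev_path):
        clues.append(f"device {dev_path} exists")

    # Clue 2: the kernel module, as lsmod would list it.
    try:
        with open(modules_path) as f:
            if any(line.split()[:1] == [module_name] for line in f):
                clues.append(f"kernel module {module_name} is loaded")
    except OSError:
        pass  # module list not readable on this system

    # Clue 3: the user-space daemon, found by scanning process names.
    try:
        pids = [d for d in os.listdir(proc_root) if d.isdigit()]
    except OSError:
        pids = []
    for pid in pids:
        try:
            with open(os.path.join(proc_root, pid, "comm")) as f:
                if f.read().strip() == daemon_name:
                    clues.append(f"daemon {daemon_name} is running (pid {pid})")
        except OSError:
            continue  # process exited or entry not readable

    return clues


if __name__ == "__main__":
    found = ttyrpld_clues()
    print("\n".join(found) if found else "no ttyrpld traces found")
```

None of the three checks needs privileged access, which is why a single positive result is enough to warn the intruder.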

Last but not least, ttyrpld saves all the logging data in the local file system (by default in the /var/log/rpl/ directory), so an intruder with privileged access can delete all the valuable logging data to cover his footprints.

Our solution, XenKamera, was initially inspired by ttyrpld, but addresses all the mentioned drawbacks of ttyrpld: XenKamera can record all the TTY data from the intruder’s console like ttyrpld, but it works quietly in the kernel space of the Xen domain and leaves no trace that would make the intruder suspicious about its existence. XenKamera has no daemon process running in user-space and needs no device in /dev, and it never sends out any information via the network stack as vlogger does. Finally, it never keeps any logging data in the local file system.

All of these characteristics make XenKamera pretty hard to discover, and suitable for honeypot purposes.

Our paper shares some ideas with our previous work, the Xebek project [25], but differs in scope: while Xebek tries to capture all the I/O-related data (similar to what Sebek does, but Xebek is able to fix almost all the problems of Sebek thanks to its special architecture), XenKamera focuses only on gathering TTY data. This approach makes XenKamera more light-weight, and practical for production systems.

6 Conclusions

This paper proposes the design and implementation of XenKamera, a solution for Xen-based systems that aims to eliminate some outstanding problems of Sebek, a data capture tool widely used in honeypot technology. We demonstrated that XenKamera can be employed instead of Sebek for honeypot purposes in a Xen environment and that, if installed in a strict manner, XenKamera is stealthier and harder to detect, even by a privileged user. Our solution is also more flexible, effective and reliable than Sebek.

Moreover, XenKamera is practical for production systems, because it causes very little overhead. We propose to use XenKamera to record working operations, and the data collected might help to trouble-shoot daily administration.

At the moment XenKamera only works for Linux-based DomU. We plan to provide support for other OSes such as FreeBSD and NetBSD once these ports are working stably on Xen.

References:

[1] Lance Spitzner. Honeypots: Tracking hackers. Addison-Wesley Professional, September 2002.

[2] The Honeynet Project. Know your enemy: Honeynets. http://www.honeynet.org/papers/honeynet/, May 2005.

[3] The Honeynet Project. Know Your Enemy: GenII Honeynets. http://www.honeynet.org/papers/gen2/, May 2005.

[4] Edward Balas and Camilo Viecco. Towards a Third Generation Data Capture Architecture for Honeynets. In The 6th IEEE Information Assurance Workshop, June 2005.

[5] The Honeynet Project. Know your enemy: Sebek. http://www.honeynet.org/papers/sebek.pdf, November 2003.

[6] Boris Dragovic, Keir Fraser, Steve Hand, Tim Harris, Alex Ho, Ian Pratt, Andrew Warfield, Paul Barham, and Rolf Neugebauer.

[7] Ian Pratt, Keir Fraser, Steven Hand, Christian Limpach, Andrew Warfield, Dan Magenheimer, Jun Nakajima, and Asit Mallick. Xen 3.0 and the art of virtualization. In Proceedings of the 2005 Ottawa Linux Symposium, Ottawa, Canada, July 2005.

[8] The Honeynet Project. Honeywall CDROM. http://www.honeynet.org/tools/cdrom/index.html/, May 2005.


[9] Maximillian Dornseif, Thorsten Holz, and Christian Klein. NoSEBrEaK - Attacking honeynets. In The 5th Annual IEEE Information Assurance Workshop, June 2004.

[10] stealth. adore-ng rootkit. http://stealth.7530.org/rootkits/, March 2004.

[11] madsys. Advanced incident response tool. http://sourceforge.net/projects/airt-linux/, August 2005.

[12] Thorsten Holz. Detecting honeypots and other suspicious environments. In Proceedings of the 6th IEEE Information Assurance Workshop, June 2005.

[13] Joseph Corey. Local honeypot identification. http://www.phrack.org/unofficial/p62/p62-0x07.txt, September 2003.

[14] tcpdump project. http://www.tcpdump.org, October 2005.

[15] Joseph Corey. Advanced honeypot identification and exploitation. http://www.phrack.org/unofficial/p63/p63-0x09.txt, January 2004.

[16] B. Clark, T. Deshane, E. Dow, S. Evanchik, M. Finlayson, J. Herne, and J. N. Matthews. Xen and the art of repeated research. In Proceedings of the Usenix annual technical conference, Freenix track, pages 135–144, July 2004.

[17] sd. Linux on-the-fly kernel patching. http://www.phrack.org/show.php?p=58&a=7, July 2002.

[18] Niels Provos. A virtual honeypot framework. In The 13th USENIX Security Symposium, August.

[19] Linux Key Logger. http://www.spine-group.org/tools/lkl-0.1.0.tar.gz, August 2003.

[20] Uberkey. http://www.linuks.mine.nu/uberkey/uberkey-1.2.tar.gz, November 2003.

[21] unixKeyLogger. http://packetstorm.linuxsecurity.com/exploits/rootkits/unixKeyLogger.c, August 1999.

[22] snoop. http://snoop.sf.net, July 2005.

[23] rd. Writing Linux kernel keylogger. http://www.phrack.org/show.php?p=59&a=14, July 2002.

[24] Jan Engelhardt. TTY logging daemon project. http://ttyrpld.sf.net, July 2005.

[25] Nguyen Anh Quynh and Yoshiyasu Takefuji. A novel approach to secured and central logging data. In 4th WSEAS International Conference on Information Security, Communications and Computers (ISCOCO 2005), December 2005.

Proceedings of the 5th WSEAS International Conference on Applied Computer Science, Hangzhou, China, April 16-18, 2006 (pp467-474)