Journal of Digital Forensics, Security and Law

Volume 10 Number 1 Article 3

2015

Data Loss Prevention Management and Control: Inside Activity Incident Monitoring, Identification, and Tracking in Healthcare Enterprise Environments

Manghui Tu Purdue University Calumet

Kimberly Spoa-Harty Purdue University Calumet

Liangliang Xiao Frostburg State University


Recommended Citation: Tu, Manghui; Spoa-Harty, Kimberly; and Xiao, Liangliang (2015) "Data Loss Prevention Management and Control: Inside Activity Incident Monitoring, Identification, and Tracking in Healthcare Enterprise Environments," Journal of Digital Forensics, Security and Law: Vol. 10: No. 1, Article 3. DOI: https://doi.org/10.15394/jdfsl.2015.1196 Available at: https://commons.erau.edu/jdfsl/vol10/iss1/3


© 2015 ADFSL

DATA LOSS PREVENTION AND CONTROL: INSIDE ACTIVITY INCIDENT MONITORING, IDENTIFICATION, AND TRACKING IN HEALTHCARE ENTERPRISE ENVIRONMENTS

Manghui Tu
Department of Computer Information Technology & Graphics
Purdue University Calumet

Kimberly Spoa-Harty
Department of Computer Information Technology & Graphics
Purdue University Calumet

Liangliang Xiao
Department of Computer Science and Information Technologies
Frostburg State University

ABSTRACT

As healthcare data are pushed online, consumers have raised serious concerns about breaches of their personal information. Laws and regulations have placed businesses and organizations under obligation to take action to prevent data breaches. Among the various threats, insider threats have been identified as a major cause of data loss. Thus, effective mechanisms to control insider threats to data are urgently needed. The objective of this research is to address data loss prevention challenges in the healthcare enterprise environment. First, a novel approach is provided to model internal threats, specifically inside activities. With inside activity modeling, data loss paths and threat vectors are formally described and identified. Then, threat vectors and potential data loss paths are investigated in a healthcare enterprise environment. Threat vectors are enumerated, and data loss statistics for some threat vectors are collected. After that, issues in data loss prevention and in inside activity incident identification, tracking, and reconstruction are discussed. Finally, evidence of inside activities is modeled as evidence trees to provide guidance for inside activity identification, tracking, and reconstruction.

1. INTRODUCTION

As healthcare data are pushed online, consumers have raised serious concerns about breaches of their personal information. Laws and regulations have placed businesses and organizations under obligation to take action to prevent data breaches. Among the various threats, insider threats have been identified as a major cause of data loss. Thus, effective mechanisms to control insider threats to data are urgently needed. The objective of this research is to address data loss prevention challenges in the healthcare enterprise environment. First, a novel approach is provided to model internal threats, specifically inside activities. With inside activity modeling, data loss paths and threat vectors are formally described and identified. Then, threat vectors and potential data loss paths are investigated in a healthcare enterprise environment. Threat vectors are enumerated, and data loss statistics for some threat vectors are collected. After that, issues in data loss prevention and in inside activity incident identification, tracking, and reconstruction are discussed. Finally, evidence of inside activities is modeled as evidence trees to provide guidance for inside activity identification, tracking, and reconstruction.

This work is licensed under a Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).

2. SYSTEM MODEL

An abstraction of the classical healthcare enterprise environment is modeled as a multi-tier system that consists of multiple data access or management parties, including a data module, service providers, business users, a data management team, and a client party. An overview of the system is shown in Figure 1. The data module is the central and critical part of this architecture and is composed of a data storage module, a data process module, and a data access module. The data storage module is essentially the databases and files that contain the information to be protected. The data process module is composed of a healthcare information system, which processes the information stored in the data storage module for business clients, patients, government agencies, and healthcare service providers. The data access module is essentially the web-based interfaces for users. The service provider party is composed of the different services provided by the healthcare business, including hospital services, lab test services, preventive health care services, disease diagnosis and treatment services, nursing, etc. These services involve many human users such as doctors, technicians, and nurses. The business party is essentially the interactions between the healthcare business entity and other entities such as other healthcare service providers, government agencies, insurance companies, and healthcare equipment/pharmacy providers. The data management party is essentially the IT teams, including database/web/network/computer system administrators and the security/compliance team. The client party is essentially the patients and their legal guardians.

In this research, the data items to be protected by data loss prevention mechanisms include personal health information (PHI) such as social security numbers (SSNs), dates of birth, payment data, insurance policy information in digital format, and personal electronic health records (EHRs), as well as business data such as business client information. In this architecture, the data module interacts with the other modules; for example, databases and file systems are managed by the database administrators and protected by the system and network administrators as well as the security/compliance team. The service module and the business module not only read the information stored in the data module but also create and update records in it. In most cases, the client module only reads information stored in the data module, such as a patient’s health record and payment information.






Figure 1. An architecture overview of the healthcare enterprise system.

3. RESEARCH METHODOLOGY

3.1 Inside Activity Identification Methods

To develop an effective method for distinguishing inside activities from regular business activities, this paper introduces two models that formally describe the relationship between the two: (1) the Work Role-Data Asset-User Operation-Access Preference (WDOA) model and (2) the User-Operation-Data Asset-Access Path (UODP) model.

3.1.1 The WDOA (Work Role-Data Asset-User Operation-Access Preference) Model

In a well-managed healthcare enterprise environment, an appropriate security policy and acceptable use policy should be in place and enforced by policy-based access control (Chen, Laih, Pouget, & Dacier, 2005; Ellard & Megquier, 2004; Johnson & Willey, 2011; Murphey, 2007). For such a healthcare information system, denoted as Ω, the inside activity model WDOA is defined as follows.

Definition 1.1 (user): A user ui ∈ U = {u1, u2, …, uL}, where L > 0 and 0 < i ≤ L, is a specific subject that accesses and consumes resources in Ω to perform tasks defined by its work role, where L denotes the number of users in Ω.

A user in Ω is a specific subject, which can be a doctor, a nurse, a business user, an information technology staff member, or a non-empty set of software processes in Ω.

Definition 1.2 (work role): A work role wi ∈ W = {w1, w2, …, wM}, where M > 0 and 0 < i ≤ M, is a group of users who have the same type of tasks and the same set of privileges to access and consume resources in Ω to perform those tasks, where M denotes the number of work roles in Ω.

A work role in Ω can be doctor, nurse, business user, information technology staff, or a non-empty set of software processes in Ω. Each user ui should have a well-defined work role wj and be assigned data access privileges based on the need-to-know principle.

Definition 1.3 (sensitive asset): An asset di ∈ D = {d1, d2, …, dN}, where N > 0 and 0 < i ≤ N, is a category of resource that is owned by the owner of Ω and can be consumed by users, where N denotes the number of resource categories in Ω.

The healthcare business should identify its own set of assets D, for example, user accounts, computer and network resources, and protected healthcare information. The healthcare data should also be classified into different sensitivity levels; the set of sensitivity levels is denoted as S = {s1, s2, …, sm}. Each data object di in D is assigned a sensitivity label sj. A data item labeled si has a higher sensitivity level than a data item labeled sj if i > j. To fulfill the tasks defined by the work role, a user accesses sensitive assets with a certain preference. Let A = {a1, a2, …, an} denote the set of preference levels; then the access preference for sensitive assets can be defined by a set of 2-tuples (di, aj). A data item with access preference ai is accessed with higher frequency than a data item with access preference aj if i > j, and a1 is defined as the lowest access preference, e.g., zero access preference.

Definition 1.4 (operation): An operation oi ∈ O = {o1, o2, …, oZ}, where Z > 0 and 0 < i ≤ Z, is a user activity that accesses or consumes resources defined in Ω, where Z denotes the total number of operations that can be performed by the work roles defined in Ω.

The specific operations in O include read, write, execute, delete, shutdown, print, copy, and any other operation defined in a specific business sector. Based on the preference levels, some operations will be performed regularly, while others should rarely or never be performed. For example, an IT system administrator may have to copy and move sensitive data objects but should not delete or modify them, and should not copy the data to personal devices; an application developer will need to query sensitive data objects frequently but should not modify them. Therefore, users’ accesses to sensitive data objects in the healthcare enterprise environment can be modeled as patterns based on their work roles.

Definition 1 (WDOA Model): The WDOA Model is a 4-tuple {W, D, O, A} data access preference model, where the first field represents a work role in W, the second field represents an asset in D, the third field represents an operation in O, and the last field represents an access preference in A.

The WDOA Model can give a hint as to whether a data access activity is normal. However, it cannot precisely determine whether an access activity is an inside activity. For example, an application developer should have access preference a1 to user account information, so any access to such data would be suspicious. An IT administrator has a low access preference ai (i > 1) to personal healthcare records for data management purposes, so a copy access to those data items does not by itself indicate whether the access is suspicious.
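The WDOA lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the profile entries, the integer encoding of preference levels (1 standing for a1), and the function name `access_hint` are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical WDOA profile: (work role, asset, operation) -> preference index.
# Index 1 stands for a1 (zero access preference); larger means more frequent.
PROFILE: Dict[Tuple[str, str, str], int] = {
    ("application developer", "user account information", "read"): 1,
    ("application developer", "healthcare records", "read"): 3,
    ("IT administrator", "healthcare records", "copy"): 2,
}


def access_hint(role: str, asset: str, operation: str) -> str:
    """Return a coarse hint for one observed access, based only on WDOA."""
    preference = PROFILE.get((role, asset, operation))
    if preference is None:
        return "unknown"        # no profile entry for this combination
    if preference == 1:
        return "suspicious"     # a1: the role should never need this access
    return "inconclusive"       # allowed by the role; WDOA alone cannot decide


print(access_hint("application developer", "user account information", "read"))
print(access_hint("IT administrator", "healthcare records", "copy"))
```

The "inconclusive" result mirrors the limitation stated in the text: an access that fits the work role may still be an inside activity, which is why a second model is needed.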

3.1.2 The UODP (User-Operation-Data Asset-Access Path) Model

An inside activity model UODP is defined for an arbitrary healthcare information system Ω. In this model, to reach a sensitive asset di, an insider needs to have known or unknown access paths to di (Ellard & Megquier, 2004; Kowalski et al., 2008; Moore, Cappelli, & Trzeciak, 2008).






Definition 2.1 (access path): An access path pi ∈ P = {p1, p2, …, pK}, where K > 0 and 0 < i ≤ K, denotes an access channel or access medium through which users access or consume assets in the healthcare enterprise environment, where K is the number of access paths enabled in Ω.

A specific access path can be USB access, CD access, VM instance access, email access, or any other access mechanism that allows users to access a sensitive asset, legitimately or illegitimately. For example, to steal an asset di for personal use, an insider may copy di from the data storage site to a personal USB device that has been attached to a system within the healthcare enterprise environment. Alternatively, an insider may first create a secret user account and set up a virtual machine (VM) instance, and then copy di to the VM instance to be accessed later. Let Path({ui}, k) denote the set of access paths to asset dk by a subset of users {ui}; then Path({ui}, k) can be defined by a set of 4-tuples (ui, oj, dk, pm).

Definition 2 (UODP Model): The UODP Model is a 4-tuple {U, O, D, P} data access path model, where the first field represents a user in U, the second field represents an operation in O, the third field represents an asset in D, and the last field represents an access path in P.

With the 4-tuple (U, O, D, P) model, it is possible to determine whether an access activity is an inside activity. If an application developer ui copies healthcare records dk to an unmonitored VM instance, the 4-tuple (application developer, copy, dk, unmonitored virtual machine) can definitely be considered a suspicious inside activity, since an unmonitored VM instance is beyond the control of the healthcare enterprise and can lead to loss of dk. In contrast, if an IT system administrator ui copies healthcare records dk to a monitored USB device, the 4-tuple (IT system administrator, copy, dk, monitored USB) is not an inside activity, since the monitored USB device is still under the control of the healthcare enterprise and will not lead to loss of dk at the current stage. With sufficient resources, the elements in U, O, and D can be well classified and identified using current technologies. However, due to the complexity of the healthcare information system, data storage techniques, user access controls, and usage obligations, the identification and classification of access paths is still challenging for healthcare enterprises.
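The two UODP examples above can be expressed as a small check on 4-tuples. The sketch below assumes that the enterprise maintains a set of controlled access paths and a set of data-moving operations; both sets, the path labels, and the function name `is_suspicious` are hypothetical choices for illustration.

```python
from typing import NamedTuple


class UODP(NamedTuple):
    user: str
    operation: str
    asset: str
    path: str


# Hypothetical classification of access paths and operations.
CONTROLLED_PATHS = {"monitored USB"}
DATA_MOVING_OPS = {"copy", "write", "print"}


def is_suspicious(access: UODP) -> bool:
    """Flag accesses that move data over a path the enterprise does not control."""
    return (access.operation in DATA_MOVING_OPS
            and access.path not in CONTROLLED_PATHS)


developer = UODP("application developer", "copy", "dk", "unmonitored VM")
admin = UODP("IT system administrator", "copy", "dk", "monitored USB")
print(is_suspicious(developer), is_suspicious(admin))
```

The hard part named in the text, enumerating and classifying access paths, is hidden here inside the `CONTROLLED_PATHS` set; the check is only as good as that enumeration.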

3.2 Inside Activity Modeling for Incident Tracking and Reconstruction

The attack tree approach, first proposed by Schneier (1999), is used to systematically analyze security threats. Attacks are modeled and represented by a tree structure in which the root node represents the final goal, the other interior nodes represent subgoals, and the leaf nodes are attack approaches for achieving the final goal (Poolsapassit & Ray, 2007). The children of a node in the tree can be one of two logical types: AND and OR. To reach a goal, all of its AND children, or at least one of its OR children, must be accomplished. Attack trees grow incrementally over time, and they capture knowledge in a reusable form. First, possible attack goals must be identified. Each attack goal becomes the root of its own attack tree. Construction continues by considering all possible attacks against the given goal. These attacks form the AND and OR children of the goal. Next, each of these attacks becomes a goal and its children are generated. Figure 2 shows an example of an attack tree for the inside threat “achieving the root privilege”. In such an attack, the attacker is a regular user with a lower access privilege to the target (which needs root privilege) and conducts a series of attack operations to achieve the root privilege as the system user. Note that links connected with a line represent the “AND” relationship among the states or sub-goals, which work together to achieve the same parent goal.

Figure 2. An attack tree of an internal threat “achieving the root privilege”.
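The AND/OR semantics of an attack tree can be illustrated with a short recursive evaluation. The tree below is a hypothetical example and does not reproduce the contents of Figure 2; the node names and the `achieved` function are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    name: str
    kind: str = "LEAF"                      # "AND", "OR", or "LEAF"
    children: List["Node"] = field(default_factory=list)


def achieved(node: Node, done: Set[str]) -> bool:
    """A leaf is achieved if its attack step was carried out; an AND node
    needs all children, an OR node needs at least one."""
    if node.kind == "LEAF":
        return node.name in done
    results = [achieved(child, done) for child in node.children]
    return all(results) if node.kind == "AND" else any(results)


# Hypothetical tree: the goal is reached by either branch of the OR root.
tree = Node("achieve root privilege", "OR", [
    Node("obtain administrator password"),
    Node("escalate locally", "AND", [
        Node("log in as regular user"),
        Node("exploit local vulnerability"),
    ]),
])

print(achieved(tree, {"log in as regular user"}))
print(achieved(tree, {"log in as regular user", "exploit local vulnerability"}))
```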

4. DATA LOSS RESULTS AND ANALYSIS

Based on the UODP access model, if a legitimate user accesses and operates on sensitive data through an uncontrolled access path, potential data loss can result. The combinations of such uncontrolled access paths and access operations are threats to data and are defined as the data loss threat vectors. Therefore, to prevent and control data loss in the healthcare enterprise environment, the first critical task is to identify the set of data loss threat vectors, more specifically, the set of uncontrolled access paths. In this research, we explore the potential threat vectors in the healthcare enterprise environment. Safend Data Protection Suite, an endpoint security product from Wave, was used in this research to regulate data loss prevention in an enterprise healthcare environment. Data loss results were collected before and after the placement and enforcement of endpoint security protection. To identify data loss threat vectors, examine potential data loss threats, and analyze potential data loss controls, the following studies are conducted. First, potential threat vectors are enumerated and feasible operation controls are listed for each threat vector. The status of the enforcement of such controls is also indicated. Second, statistical data loss prevention results are provided.

4.1 Data Loss Threat Vectors Identification

Data loss threat vectors can be categorized as external storage media and transmission media. External hard disks, USB flash drives, PDAs, CD/DVD, floppy disks, and tapes are traditional storage media, while cell phones, SD card readers, iPads, FTP, web sites, and printing can be categorized as transmission media. The only exception is cloud storage, a new technology that combines transmission and storage. To control data operations, port controls such as block, allow, force encryption, and set to read only are enforced. Data filtering technologies based on expressions are deployed to filter sensitive data such as credit card numbers, social security numbers, and healthcare records. The results are shown in Table 1.

Table 1
The enumeration of data loss threat vectors in an enterprise healthcare environment

| Threat Vectors | Port Control Options | Enforcement Status |
|---|---|---|
| External Hard Drives | Block/Allow/Force Encryption/Set To Read Only | Enforced |
| USB Flash Drives | Block/Allow/Force Encryption/Set To Read Only | Enforced |
| Cell Phones | Block/Allow | Not Implemented |
| PDAs | Block/Allow | Enforced as external storage media |
| SD Card Readers | Block/Allow/Set To Read Only | Not Implemented |
| iPad | Block/Allow | Not Implemented |
| CD/DVD | Block/Allow/Force Encryption/Set To Read Only | Enforced |
| Floppy Drives | Block/Allow | No data to report - technology in environment does not allow for floppy drives |
| Tape Drives | Block/Allow | No data to report - technology in environment does not allow for tape drives |
| Websites | none | Due to product, high administration efforts to identify and analyze risks. |
| FTP | none | Blocked by perimeter within the domain, by static IP, site IP allowed to use, and user security - 3 factor authentications. |
| Cloud Storage | none | Not Implemented |
| Email | none | Email filtering, algorithms to look for sensitive data, will force encryption |
| Printing | can block physical printers from connecting, but not network printers | Not Implemented |

As indicated in Table 1, traditional storage media are usually well controlled by enforcing port controls, since they are well documented and their monitoring and control technologies are mature. Cloud storage is a new technology that is not well documented (Biggs & Vidalis, 2010; Bruening & Treacy, 2009; Brunette & Mogull, 2009); thus, mature control technologies are not yet available. Some transmission media such as FTP can easily be controlled, since FTP can easily be replaced with an alternative secure technology. That is, these technologies are not required to accomplish healthcare activities and thus can be blocked. Other transmission media such as printing are not easily controlled, since printing is required for routine business activities. Also, due to the nature of printing (graphical presentation of information), sophisticated identification and examination technologies are needed to filter sensitive data. Currently, such technologies are not ready for efficient deployment.
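Expression-based data filtering of the kind mentioned in Section 4.1 can be approximated with regular expressions. The sketch below is a generic illustration and makes no claim about how Safend implements its filters: the two patterns, the use of the Luhn checksum to reduce false card matches, and the function names are assumptions.

```python
import re

# Assumed patterns: US SSN written as NNN-NN-NNNN, and 13 to 19 digit card
# numbers whose digits may be separated by single spaces or dashes.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_ok(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_sensitive(text: str):
    """Return (kind, value) pairs for SSN-like and card-like strings."""
    hits = [("ssn", match.group()) for match in SSN_RE.finditer(text)]
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_ok(digits):
            hits.append(("card", digits))
    return hits


print(find_sensitive("SSN 123-45-6789, card 4111 1111 1111 1111"))
```

Patterns like these work on text streams such as files and email bodies; they do not address the printing case discussed above, where the content is rendered graphically.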

4.2 Data Loss Analysis

A 90-day data collection was conducted prior to the deployment of any endpoint security protection technology (denoted as /P). After this 90-day period, Safend Data Protection Suite was deployed in the enterprise healthcare environment to control data loss. Then, a second 90-day data collection was conducted with the endpoint security protection technology deployed (denoted as /A). Due to the limitations of the technology and the feasibility of policy enforcement in the enterprise environment, only some of the threat vectors (USB, CD/DVD, external hard disk, and phone) are controlled.

Table 2
The potential data loss path accesses and operations before and after the deployment of Safend.

| Threat Vector | # Users/P | # Users/A | # Files/P | # Files/A | Data Size/P | Data Size/A |
|---|---|---|---|---|---|---|
| USB | 2765 | 413 | 4449429 | 374015 | 1123 G | 432.4 G |
| CD/DVD | 157 | 44 | 212067 | 8530 | 291.7 G | 76.5 G |
| External Hard Disk | 161 | 21 | 443805 | 9356 | 804.39 G | 2.4 G |
| Phone | 426 | 5805 | 0 | 0 | 0 | 0 |

As indicated in Table 2, the number of users accessing potential data loss threat vectors such as USB, CD/DVD, and external hard disks was significantly reduced (USB users from 2765 to 413, CD/DVD users from 157 to 44, and external hard disk users from 161 to 21). The only exception is phone use, which increased significantly (from 426 to 5805). One reason could be that email and other controlled communication paths were blocked or that users were reluctant to use them. However, users may plug in phones to charge them without proper removable media protection, and phones can be used to take pictures that are then transmitted out without control. Therefore, such an abrupt increase needs to be carefully analyzed, preferably through a thorough investigation. One countermeasure to this data loss threat is to enforce a policy prohibiting personal phones in sensitive working environments. As indicated by the number and size of files moved in Table 2, employees tend to overuse such threat vector accesses and operations when there is no data loss control, since the significant reductions (for example, USB-accessed files from 4449429 to 374015, CD/DVD-accessed files from 212067 to 8530, and external-disk-accessed files from 443805 to 9536) did not affect business activities in the enterprise healthcare environment. Note that no data is transferred to phones because most phones, when connected, are seen as removable storage or external hard drives and thus adhere to the policies already enforced for that media type. Table 3 indicates that a large part of the accesses are encrypted, which can significantly reduce the potential for unintentional data loss due to theft, mis-sent data, and misconfiguration. Some accesses and operations are unencrypted, but all of them are approved. As stated in the security usage policy and recorded in the usage logs, such approved usages are required to follow predesigned processes so that all files accessed and all operations on files are logged in audit reports.

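Table 3
The encrypted and unencrypted data loss path accesses and operations after the deployment of Safend

| Threat Vector | Encrypted Use: # Users/A | Encrypted Use: # Files/A | Encrypted Use: Data Size/A | Unencrypted Use: # Users/A | Unencrypted Use: # Files/A | Unencrypted Use: Data Size/A |
|---|---|---|---|---|---|---|
| USB | 187 | 247680 | 334 | 226 | 126335 | 98.4 |
| CD/DVD | 32 | N/A | 64.6 | 12 | N/A | 11.9 |
| Ext. Hard Disk | N/A | N/A | N/A | N/A | N/A | N/A |
| Phone | N/A | N/A | N/A | N/A | N/A | N/A |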
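The figures in Tables 2 and 3 can be cross-checked with a few lines of arithmetic. The sketch below copies the USB and CD/DVD values from the two tables, computes the relative reduction in users and the encrypted share of data volume after deployment, and confirms that the encrypted and unencrypted columns of Table 3 add up to the "/A" columns of Table 2. The dictionary layout and function names are illustrative assumptions.

```python
import math

# "/P" = before deployment, "/A" = after deployment (values from Table 2).
TABLE2 = {
    "USB":    {"users_p": 2765, "users_a": 413, "files_a": 374015, "size_a": 432.4},
    "CD/DVD": {"users_p": 157,  "users_a": 44,  "files_a": 8530,   "size_a": 76.5},
}
# (users, files, data size) after deployment, split by encryption (Table 3).
# None marks an "N/A" cell.
TABLE3 = {
    "USB":    {"encrypted": (187, 247680, 334.0), "unencrypted": (226, 126335, 98.4)},
    "CD/DVD": {"encrypted": (32, None, 64.6),     "unencrypted": (12, None, 11.9)},
}


def user_reduction_pct(vector: str) -> float:
    row = TABLE2[vector]
    return 100.0 * (row["users_p"] - row["users_a"]) / row["users_p"]


def encrypted_share_pct(vector: str) -> float:
    return 100.0 * TABLE3[vector]["encrypted"][2] / TABLE2[vector]["size_a"]


def tables_agree(vector: str) -> bool:
    enc, unenc = TABLE3[vector]["encrypted"], TABLE3[vector]["unencrypted"]
    row = TABLE2[vector]
    users_ok = enc[0] + unenc[0] == row["users_a"]
    files_ok = enc[1] is None or enc[1] + unenc[1] == row["files_a"]
    size_ok = math.isclose(enc[2] + unenc[2], row["size_a"], abs_tol=1e-6)
    return users_ok and files_ok and size_ok


for vector in TABLE2:
    print(vector, round(user_reduction_pct(vector), 1),
          round(encrypted_share_pct(vector), 1), tables_agree(vector))
```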

5. INSIDER ACTIVITY IDENTIFICATION & TRACKING

Even with data loss threat vector identification, control, and monitoring, inside activities cannot be detected or identified with current access control techniques, since the access operation and access path are both legitimate user privileges. Therefore, forensic investigation of inside activities in the healthcare enterprise environment, including incident detection and reconstruction, is critically needed (Tu et al., 2012). Current research on inside threat detection and identification (Eberle & Holder, 2009; Moore, Cappelli, & Trzeciak, 2008; Phua, Lee, Smith, & Gayler, 2007) and event reconstruction mechanisms (Case et al., 2008; Tang & Daniels, 2005; Tu et al., 2012) is of limited use in the real world, since it requires a comprehensive set of information, including social information and explicit dependence knowledge, that is not available in an enterprise environment. Hence, novel mechanisms are needed to identify potential inside activities and reconstruct them for tracking.

5.1 Data Loss Identification

With the deployment of an endpoint security protection product such as Safend Data Protection Suite, it is possible to control data loss through traditional external storage media. For example, with appropriate access control, any data access can be logged, and data can be blocked from being moved to USB storage media or other external storage media. However, potential uncontrolled data loss access paths could still exist.

(1) A combination of multiple access technologies in the extended healthcare enterprise environment. For a business trip, an employee ui with work role wj (e.g., sales representative) may need to move data (e.g., dn) outside the enterprise network by applying an access operation (i.e., copy), and such access has a high access preference for wj. Then, based on the WDOA model, the 4-tuple (sales representative, dn, copy, high access preference) will be treated as a legitimate access without an alert. Applying UODP can help to detect potential data loss due to access violations. An endpoint security product can monitor regular access to the data, and any violation (e.g., copying dn to an unauthorized personal USB device pm) may result in an alert, since the union of copy and pm (copy ∪ pm) has been pre-defined as a data loss threat vector. In this way, the combination of UODP and an endpoint security protection product can create an extended enterprise environment outside the physical enterprise network boundary. However, other techniques are available to bypass the control. For example, an employee can photograph the data if read access is permitted, or storage media can be imaged bit by bit without leaving any evidence on the media if it is connected through a write-block device.

(2) Forgotten paths due to unsuccessful change management. For example, a misconfiguration could go unnoticed during a system update, a security product update, or a human resources change. With such a misconfiguration, media blocking and media access logging may not be enabled.

(3) A combination of uncontrolled accesses within the physical network of the healthcare enterprise environment. In Table 1, some access technologies, such as web sites, phones, and cloud technology, are not controlled. As analyzed in the section above, phone technology has the potential to result in data loss. With appropriate surveillance technology deployed, such a data loss access path can be monitored and detected. Cloud technology, web sites, and local virtualization technology together can provide a perfect uncontrolled data loss access path. Local data can be moved between physical storage within the network and a local VM instance, which can then connect to a remote private cloud storage website. Through this secret path, data encrypted in the VM instance can be moved outside the physical network boundary of the healthcare business. After the local VM instance is deleted, little evidence will be left within the boundary of the healthcare enterprise network. The only feasible control is to block any encrypted traffic (Wippich, 2007).
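The "forgotten path" problem in item (2) suggests a simple audit: compare the port control settings currently in force against an approved baseline after every change. The sketch below assumes a hypothetical configuration format, a dictionary of per-path boolean settings; it does not reflect the configuration model of any particular endpoint security product.

```python
from typing import Dict, List, Tuple

Settings = Dict[str, Dict[str, bool]]

# Hypothetical approved baseline: every listed control must stay enabled.
BASELINE: Settings = {
    "USB":    {"block_unapproved": True, "access_log": True},
    "CD/DVD": {"block_unapproved": True, "access_log": True},
}


def forgotten_paths(baseline: Settings, current: Settings) -> List[Tuple[str, str]]:
    """Return (path, setting) pairs that the baseline requires but that are
    missing or disabled in the current configuration."""
    findings = []
    for path, expected in baseline.items():
        actual = current.get(path, {})
        for setting, required in expected.items():
            if required and not actual.get(setting, False):
                findings.append((path, setting))
    return sorted(findings)


# Example: logging for CD/DVD was dropped during an update.
after_update: Settings = {
    "USB":    {"block_unapproved": True, "access_log": True},
    "CD/DVD": {"block_unapproved": True},
}
print(forgotten_paths(BASELINE, after_update))
```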

5.2 Inside Activity Operation and Evidence Modeling

Inside activity identification and tracking require that inside activities be thoroughly studied. In such a study, each inside activity is modeled as an attack tree and then conducted in a simulated environment, and a forensic investigation follows each successfully committed attack. Fingerprints are located and identified for each operation of the inside activity. The metadata of the fingerprints of each attack operation, such as log name, format, location, timestamps, and security features, are composed into nodes, which then become child nodes of the leaf nodes in the augmented attack tree. This entire process finally results in an evidence tree for each inside activity studied. Fingerprints of sensitive operations in the evidence trees are identified as incident identifiers, and the evidence tree can provide the contextual information needed to reconstruct security incidents automatically.

In this research, two inside activities have been studied, both of which utilize removable media (USB drive and CD-ROM) as the access paths leading to data loss in the healthcare enterprise environment.
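One way to picture the evidence tree is as an attack tree whose leaf operations carry fingerprint metadata as child records. The sketch below is an assumed data structure, not the authors' tooling: the class names, field names, and the example steps and locations are placeholders chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Evidence:
    """Fingerprint metadata attached to one attack operation."""
    log_name: str
    location: str
    timestamp: Optional[str] = None


@dataclass
class Step:
    """A goal, subgoal, or leaf operation in the augmented attack tree."""
    name: str
    children: List["Step"] = field(default_factory=list)
    evidence: List[Evidence] = field(default_factory=list)


def collect_evidence(step: Step) -> List[Tuple[str, Evidence]]:
    """Walk the tree depth-first and list every fingerprint with its step."""
    found = [(step.name, item) for item in step.evidence]
    for child in step.children:
        found.extend(collect_evidence(child))
    return found


# Placeholder tree for a removable-media copy; all values are illustrative.
tree = Step("copy sensitive data to removable media", children=[
    Step("attach device", evidence=[Evidence("device install log", "system folder")]),
    Step("log into system", evidence=[Evidence("security event log", "event log store")]),
    Step("copy file", evidence=[Evidence("recent documents key", "user registry hive")]),
])

for step_name, item in collect_evidence(tree):
    print(step_name, "->", item.log_name)
```

Attaching evidence to the operations, rather than keeping it in a flat list, preserves the context needed to decide which fingerprints belong to the same incident.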

5.2.1 Inside Activity Operation Modeling

Inside Activity A, as shown in Figure 2, is a typical industrial espionage inside activity. In such an incident, the insider has all the privileges needed to access the data and the USB ports, which are required to perform the user’s duty. However, the sensitive data should not be copied to personal USB devices, since this may result in information leakage. To perform such an attack, the user logs into the system with all needed privileges, navigates to the sensitive data, and copies and pastes the sensitive data onto the USB device. The USB device is then removed, and the user later logs out of the system.

Inside Activity B, as shown in Figure 3, is also a typical industrial espionage inside attack. In such an attack, the attacker has all the privileges needed to access the sensitive data and the CD-ROM drive, which are required to perform the user’s duty. However, the sensitive data should not be copied to a CD-ROM, since this may result in information leakage. To perform such an attack, the user logs into the system with all needed privileges, navigates to the sensitive data, and then burns the sensitive data onto a CD-ROM. The CD-ROM is then removed, and the user logs out of the system.

Figure 2. Case Two Internal Attack A (USB copy)

Figure 3. Case Two Internal Attack B (CD-ROM copy attack)






5.2.2 Inside Activity Evidence Modeling

The results of Attack A and Attack B are shown in Figures 4 and 5. Each figure contains an augmented threat tree that represents the vulnerability exploited, the steps needed to exploit it, the attacker's operations, and the fingerprint generated by those operations. The final goal of both attacks is to steal sensitive information from a business information system with the desired system permissions. Operations conducted on a Windows machine may leave forensic traces in the registry; some persist for a long time and some are volatile. If a registry fingerprint is coupled with information from the event logs and file systems, the insider attack may be tracked and reconstructed. Based on our observation, relevant fingerprints can be located in the machine’s System hive, Software hive, the user’s NTuser.dat hive, the setupapi.log that keeps a history of all devices installed via plug and play, and the Security event log.
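The augmented threat tree described above can be sketched as a small data structure. This is only an illustrative sketch: the class name `EvidenceNode`, the helper `collect_sources`, and the particular pairing of operations with fingerprint sources are assumptions made for the example, not the structure used to produce Figures 4 and 5.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EvidenceNode:
    """One step of a hypothetical augmented threat tree."""
    operation: str                                          # attacker operation at this step
    fingerprints: List[str] = field(default_factory=list)   # places where traces may be found
    children: List["EvidenceNode"] = field(default_factory=list)


def collect_sources(root: EvidenceNode) -> List[str]:
    """Walk the tree depth-first and list each fingerprint source once."""
    seen: List[str] = []
    stack = [root]
    while stack:
        node = stack.pop()
        for source in node.fingerprints:
            if source not in seen:
                seen.append(source)
        # Reverse so that children are visited in their listed order.
        stack.extend(reversed(node.children))
    return seen


# Assumed pairing of operations and sources, for illustration only.
tree = EvidenceNode(
    "steal sensitive information",
    children=[
        EvidenceNode("log into system", ["Security event log"]),
        EvidenceNode("attach removable media", ["System hive", "setupapi.log"]),
        EvidenceNode("access sensitive file", ["NTuser.dat hive"]),
    ],
)
sources = collect_sources(tree)
print(sources)
```

An investigator could use such a list as a checklist of locations to examine for a given attack goal.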

The inside attack A was conducted on 7/29/2011. Based on information in the registry, at 1:03:39 AM, a Centon USB device with serial number 6AFA4AAD80 was attached to the machine. At 1:04:34 AM, the attacker logged into the system and left fingerprints in the Security event log. Based on additional fingerprints in the registry, the USB device with serial number 6AFA4AAD80 can be linked with the disk with drive letter E. Examining the RecentDocs registry key with the tool RegExtract shows that _USBSTOR.sql, Removable Disk (E:), _USB.sql, and a file named “highly sensitive things”, which is flagged in the honeypot as a sensitive file, were recently accessed. At 1:14:44 AM, the user synchronized the document titled “highly sensitive things” with the Removable Disk (E:). The evidence tree of attack A is shown in Figure 4.

Figure 4. A part of the evidence tree for Inside Activity A.
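The reconstruction of Attack A amounts to merging timestamped fingerprints from several sources into one ordered timeline. The sketch below uses only the three timestamps reported above; the tuple layout, the source labels, and the helper `parse` are assumptions introduced for illustration.

```python
from datetime import datetime

# Fingerprints transcribed from the Attack A narrative, deliberately unordered.
# The source of the 1:14:44 AM event is not stated in the text.
events = [
    ("1:14:44 AM", "unspecified", "document 'highly sensitive things' synchronized with Removable Disk (E:)"),
    ("1:03:39 AM", "registry", "Centon USB device 6AFA4AAD80 attached"),
    ("1:04:34 AM", "security event log", "logon recorded"),
]


def parse(timestamp: str) -> datetime:
    """Attach the incident date and parse a 12-hour clock time."""
    return datetime.strptime("7/29/2011 " + timestamp, "%m/%d/%Y %I:%M:%S %p")


# Sort by parsed time so that evidence from different sources interleaves correctly.
timeline = sorted(events, key=lambda event: parse(event[0]))
for timestamp, source, description in timeline:
    print(f"{timestamp:>11}  [{source}]  {description}")

# Time between the first and last fingerprint of the incident.
elapsed = parse(timeline[-1][0]) - parse(timeline[0][0])
print("elapsed:", elapsed)
```

Sorting on parsed times rather than on the raw strings avoids misordering when timestamps come from logs with different formats.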

The inside attack B was conducted on 7/29/2011. Based on fingerprints in the Security event log, user Worker 2 logged into the system at 5:46:26 and attempted to create a hard link with “highly sensitive very sensitive” at 5:53:12. Analysis of the IDE Device Class registry shows that a CD-ROM was documented at 5:47:34, about a minute after Worker 2 logged on to the system. Finally, the user Worker 2 is found to have burned the file “highly sensitive things” to the CD-ROM at 5:53:12. The evidence tree of attack B is shown in Figure 5.
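One way to link the device fingerprint to the user session in Attack B is a simple time-window check. The sketch below is hypothetical: the helper `within` and the 5-minute and 30-minute windows are assumed thresholds chosen for the example, and the times are parsed as given, without an AM/PM marker.

```python
from datetime import datetime, timedelta

FMT = "%H:%M:%S"
logon = datetime.strptime("5:46:26", FMT)    # Worker 2 logs on (Security event log)
cd_seen = datetime.strptime("5:47:34", FMT)  # CD-ROM documented in the registry
burn = datetime.strptime("5:53:12", FMT)     # sensitive file burned to the CD-ROM


def within(start: datetime, event: datetime, window: timedelta) -> bool:
    """True if the event occurs at or after start and no later than start + window."""
    return timedelta(0) <= event - start <= window


gap = cd_seen - logon
# Assumed windows: device appears within 5 minutes of logon, burn within 30 minutes.
linked = within(logon, cd_seen, timedelta(minutes=5)) and within(logon, burn, timedelta(minutes=30))
print(gap.total_seconds(), linked)
```

The computed gap of 68 seconds corresponds to the "about a minute" between the logon and the appearance of the CD-ROM.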

Figure 5. A part of the evidence tree for Inside Activity B.

Now we discuss how to determine the incident identifiers for the two inside activities. Because the operation and the access path of an inside activity are both permitted by the user’s work role, they cannot be prevented, and no individual operation on data or on an access path can by itself identify an inside activity. One approach is to apply the UODP model and utilize contextual information about the incident, such as the joint occurrence of a data access operation and a path access operation (e.g., copy d_i and access USB), to identify a potential inside activity. Operations on sensitive assets can be labeled as safe or highly risky for each work role w_i, and can be defined by a tuple {{W, O, D, P}, R}, where R defines the risk levels. Hence, a risk table containing entries of {{W, O, D, P}, R} can be developed for each service. Once a risky operation (e.g., copy d_i) has been performed by user u_i (with work role w_j), u_i’s operations are tracked to look for an operation p_i on an access path associated with d_i (e.g., USB access, CD access, or email access) performed by u_i. If both operations (a data access O and a path access P) are discovered, then a potential inside activity is identified. Since such combinations cannot be totally prohibited and knowledge of such combinations cannot be obtained from access control, the detection has to rely on the information logged in the system.
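The risk table and the joint-operation check can be sketched as follows. This is a minimal illustration under stated assumptions: the role, data asset, and path names, the risk entries, and the functions and types (`Op`, `RISK`, `PATHS`, `flag_incidents`) are all hypothetical and are not taken from the studied environment.

```python
from typing import Dict, List, NamedTuple, Tuple


class Op(NamedTuple):
    user: str     # u_i
    role: str     # w_j
    action: str   # operation O, e.g. "copy" or "access"
    target: str   # a data asset d_i or an access path p_i


# Risk table keyed by (W, O, D, P) with the risk level R as the value (hypothetical entries).
RISK: Dict[Tuple[str, str, str, str], str] = {
    ("nurse", "copy", "patient_records", "USB"): "high",
    ("nurse", "copy", "patient_records", "CD"): "high",
    ("nurse", "copy", "patient_records", "network_share"): "safe",
}
PATHS = {"USB", "CD", "email", "network_share"}


def flag_incidents(log: List[Op]) -> List[Tuple[str, str, str]]:
    """Return (user, data, path) whenever a data operation is later joined
    by a path access from the same user and the pair is rated high risk."""
    pending: Dict[str, List[Op]] = {}  # user -> data operations seen so far
    hits: List[Tuple[str, str, str]] = []
    for op in log:
        if op.target in PATHS:
            for data_op in pending.get(op.user, []):
                key = (data_op.role, data_op.action, data_op.target, op.target)
                if RISK.get(key) == "high":
                    hits.append((op.user, data_op.target, op.target))
        else:
            pending.setdefault(op.user, []).append(op)
    return hits


log = [
    Op("u1", "nurse", "copy", "patient_records"),
    Op("u2", "nurse", "copy", "patient_records"),
    Op("u1", "nurse", "access", "USB"),
    Op("u2", "nurse", "access", "network_share"),
]
incidents = flag_incidents(log)
print(incidents)
```

Neither operation of u1 is suspicious alone; only the logged combination of the copy and the USB access is flagged, while the same copy by u2 followed by a network share access is not.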

Another challenge in digital forensics investigation is the lack of efficient investigation mechanisms. A huge number of artifacts of events and operations are logged in the system, which may make internal incident tracking and reconstruction inefficient. Many security breaches are not investigated because of the unaffordable effort required to perform a forensics investigation (Sheyner et al., 2002; Todtmann, Riebach, & Rathgeb, 2007; Tu et al., 2012). Therefore, to improve responsiveness and to relieve businesses and public organizations of the burden of the incident reporting and investigation process, an incident reconstruction mechanism should be in place to track inside activity incidents automatically. To automate the reconstruction of an inside activity incident, external contextual information is needed to correlate the individual operations of the incident, and this information can only be learned from the information logged by the networks and information systems within the healthcare enterprise environment. Therefore, mechanisms such as automatic tracking and reconstruction of a crime scene should be designed (Tu et al., 2012).

6. RELATED WORK

Forensics readiness has recently been a major research concern in digital forensic investigation and information assurance (Carrier & Spafford, 2003; Carrier & Spafford, 2004; Popovsky, Frincke, & Taylor, 2007; Rowlinson, 2004; Tan, 2001; Tang & Daniels, 2005; Wilson & Wolfe, 2003; Yasinsac & Manzano, 2001). These existing research efforts focus on organization-level framework design, such as policy or management. None of them has addressed the details of the technology part of forensics readiness, e.g., mechanisms for application and system event logging, fingerprint storage and archiving, and evidence-handling procedures, which are essential to enable forensics readiness for computer information systems. The research presented in this paper attempts to provide a practical mechanism to automatically identify, track, and reconstruct attacks or inside activities through the identification and tracking of the evidence of those attacks or inside activities.

An insider usually has the desired privilege and does not need to conduct any malicious activity (or attack) to obtain the privilege to access sensitive assets. Current malicious activity monitoring and detection techniques are limited in their ability to detect inside activities effectively (Moore, Cappelli, & Trzeciak, 2008; Tu et al., 2012). Some research works have attempted to address inside threat modeling and detection issues (Bradford, Brown, & Perdue, 2004; Burford, Lewis, & Jakobson, 2008; Chivers et al., 2009; Eberle & Holder, 2009). Bradford, Brown, & Perdue (2004) proposed principles for proactive computer-system forensics investigation of security incidents, including internal threats, but no technical implementation of the proposed principles has been given and their focus is not threat detection. Burford, Lewis, & Jakobson (2008) proposed a comprehensive framework defining a large set of internal threat ‘observables’, and a graph theory based method to model individuals’ behavior (Chivers et al., 2009). Eberle & Holder (2009) proposed an inside activity detection method in which behavioral events are modeled as graphs and abnormal behaviors such as inside activities can be identified by searching for abnormal subgraphs. The above approaches offer the advantage of modeling the potential attacker and providing interesting insights into observable behavior (Chivers et al., 2009). However, their applications are limited by the availability of social knowledge about the insiders. Our research, by contrast, only requires locating and identifying the fingerprints left in the systems by the operations of attacks or inside activities, with the guidance of evidence models, attack identifiers, and access preference models.

The WDOA (Work Role-Data Asset-User Operation-Access Preference) Model and the UODP (User-Operation-Data Asset-Access Path) model rely on the classification of work roles and data assets. Similarly, access control models such as role based access models and lattice based access models (Harris, 2012) all rely on the classification or labeling of subjects and objects. The role based access control models classify users into a set of work roles, and each role is assigned a set of access privileges. A subject (or a user) can exercise a permission only if the permission is authorized for the subject's (or user’s) active role. The lattice based access models are mandatory access control models. The Bell-LaPadula Model focuses on the protection of the confidentiality of information, such that an object (data) can only be read by a subject (or a user) with higher (or equal) security clearance and can only be written by a subject (or a user) with lower (or equal) security clearance. The Biba Model focuses on the protection of system integrity, such that an object (data) can only be written by a subject (or a user) with a higher (or equal) integrity level and can only be read by a subject (or a user) with a lower (or equal) integrity level. These access control mechanisms can protect confidentiality and integrity when access requests from users are authorized; however, they have no control over data after access is granted. Also, insiders usually have the desired privileges to access data objects; thus, access control mechanisms will have limited effectiveness in inside activity identification and tracking.
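The read and write rules of the two lattice based models can be stated compactly in code. The sketch below is illustrative only: the three level names and the function names are assumptions, and the same ordering is reused for security clearances (Bell-LaPadula) and integrity levels (Biba) purely for brevity.

```python
# Hypothetical ordered labels; higher number means higher level.
LEVELS = {"public": 0, "confidential": 1, "secret": 2}


def blp_can_read(subject: str, obj: str) -> bool:
    """Bell-LaPadula: a subject may read only at or below its clearance (no read up)."""
    return LEVELS[subject] >= LEVELS[obj]


def blp_can_write(subject: str, obj: str) -> bool:
    """Bell-LaPadula: a subject may write only at or above its clearance (no write down)."""
    return LEVELS[subject] <= LEVELS[obj]


def biba_can_read(subject: str, obj: str) -> bool:
    """Biba: a subject may read only at or above its integrity level (no read down)."""
    return LEVELS[subject] <= LEVELS[obj]


def biba_can_write(subject: str, obj: str) -> bool:
    """Biba: a subject may write only at or below its integrity level (no write up)."""
    return LEVELS[subject] >= LEVELS[obj]


# An insider cleared for "secret" is allowed to read a "secret" object.
print(blp_can_read("secret", "secret"))
```

The last line illustrates the limitation noted above: once the read is authorized, neither model says anything about what the insider then does with the data, such as copying it to removable media.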

7. CONCLUSION

This paper addressed the data loss prevention management problem in the healthcare enterprise environment. First, a novel approach is provided to model inside activities, and a UODP inside activity modeling mechanism is proposed. With inside activity modeling, data loss paths and threat vectors are formally described and identified. Second, threat vectors and potential data loss paths have been investigated in a healthcare enterprise environment. Threat vectors have been enumerated, and data loss statistics for some threat vectors have been collected and analyzed. After that, issues in data loss prevention and inside activity incident identification, tracking, and reconstruction are discussed. Finally, inside activities are conducted in a simulated healthcare environment, and evidence of the inside activities is collected, analyzed, and then modeled. Evidence trees have been developed for the inside activities, which are expected to provide guidance for internal activity incident identification and reconstruction.






REFERENCES

Biggs, S. and Vidalis, S. (2010). Cloud Computing Storms. IJICR 1(1), pp. 61-68.

Bradford, P., Brown, M., Perdue, J. (2004). Towards proactive computer-system forensics. IEEE International Conference on Information Technology: Coding and Computing (ITCC 2004).

Bruening, P. J. and Treacy, B. C. (2009). Cloud computing: privacy, security challenges. Privacy & Security Law Report by The Bureau of National Affairs, Inc. [online]. Available: http://www.bna.com.

Brunette, G. and Mogull, R. (2009). Security Guidance for critical areas of focus in Cloud Computing V2.1. CSA (Cloud Security Alliance), USA. [online]. Available: http://www.cloudsecurityalliance.org/guidance/csaguide.

Burford, J., Lewis, L., and Jakobson, G. (2008). Insider threat detection using situation-aware MAS. In IEEE 11th International Conference on Information Fusion, 1-8, Germany.

Carrier, B. & Spafford, E. (2003). Getting physical with the digital investigation process. International Journal of Digital Evidence, 2(2).

Carrier, B. & Spafford, E. (2004, July). An event-based digital forensic investigation framework. In Proceedings of Digital Forensic Research Workshop.

Case, A., Cristina, A., Marziale, L., Richard, G., & Roussev, V. (2008). FACE: automated digital evidence discovery and correlation. Digital Investigation, 5, s65-s75.

CENZIC. (2008). Q1 Cenzic application security trends report. [online]. Available: http://www.cenzic.com/downloads/Cenzic_AppSecTrends_Q3_Q4-2008.pdf.

Chen, P., Laih, C., Pouget, E. and Dacier, M. (2005). Comparative survey of local honeypot sensor to assist network forensics. Proceedings of the 1st International Workshop on Systematic Approach to Digital Forensics Engineering, 120-132.

Chivers, H., Nobles, P., Shaikh, S., Clark, J., Chen, H. (2009). Accumulating Evidence of Insider Attacks. 1st International Workshop on Managing Inside Security Threats (MIST09).

Eberle, W. and Holder, L. (2009). Insider threat detection using graph-based approaches. Proceedings of IEEE Cybersecurity Applications & Technology Conference for Homeland Security (CATCH), 237-241.

Ellard, D. and Megquier, J. (2004). DISP: practical, efficient, secure and fault-tolerant distributed data storage. ACM Transactions on Storage, 1(1), 71-94.

El Emam, K., Neri, E., Jonker, E., Sokolova, M., Peyton, L., Neisa, A., Scassa, T. (2010). The inadvertent disclosure of personal health information through peer-to-peer file sharing programs. J. American Medical Informatics Assoc., 17(2), 148-158.

Ernst & Young. (2011). Data loss prevention: keeping your sensitive data out of the public domain. White Paper. [online]. Available: https://www.watchguard.com/tips-resources/grc/wp-data-loss-prevention.asp.

Fratto, M. (2008). Security survey: we’re spending more, but data’s no safer than last year. [online]. Available: http://www.informationweek.com/news/security/management/showArticle.jhtml?articleID=208800942.

Halbesleben, J.R.B., Wakefield, D.S. and Wakefield, B.J. (2008). Work-arounds in healthcare settings: literature review and research agenda. Health Care Management Rev., 33(1), pp. 2-12.

Harris, S. (2012). CISSP All-In-One Exam Guide. 6th edition, ISBN: 978-0071781749.

Hoffman, P. (2007). RSA security reports low level of trust in online banking security. eWeek News. [online]. Available: http://www.eweek.com/c/a/Security/RSA-Survey-Reports-Low-Level-of-Trust-in-Online-Banking-Security/.

Johnson, M. E., and Willey, N. (2011). Usability failures and healthcare data hemorrhages. IEEE Security and Privacy. Issue March/April 2011, pp. 18-25.

Kowalski, E., Conway, T., Keverline, S., Williams, M., Cappelli, D. and Moore, A. (2008). Insider threat study: illicit cyber activity in the government sector. [online]. Available: http://www.cert.org/insider_threat/.

Mauw, S. & Oostdijk, M. (2005). Foundations of attack trees. In Won, D., Kim, S., eds.: International Conference on Information Security and Cryptology – ICISC 2005. Volume 3935 of LNCS, Springer, 186-198.

Moore, A., Cappelli, D., & Trzeciak, R. (2008). The “big picture” of insider IT sabotage across U.S. critical infrastructures. Advances in Information Security, 39, 17-52.

Murphey, R. (2007). Automated windows event logs forensics. Journal of Digital Investigations, 4S, S92-S100.

Phua, C., Lee, V., Smith, K. and Gayler, R. (2007). A comprehensive survey of data mining-based fraud detection research. [online]. Available: http://www.bsys.monash.edu.au/people/cphua/.

Poolsapassit, N. & Ray, I. (2007). Investigating computer attacks using attack trees. IFIP International Federation for Information Processing, Vol. 242. Advanced Digital Forensics III.

Popovsky, B. E. & Frincke, D. (2004). Adding the fourth “R”. In Proceeding of the 2004 IEEE Workshop on Information Assurance.

Popovsky, B. E., Frincke, D., and Taylor, C. (2007). A theoretical framework for organizational network forensic readiness. Journal of Computers. Vol. 2, No. 3.

Ramzan, Z. (2008). Security trends of 2008 and predictions for 2009. Net Security News, [online]. Available: http://www.net-security.org/article.php?id=1194. Dec. 24.

Randazzo, M., Keeney, M., Kowalski, E., Cappelli, D. and Moore, A. (2004). Insider threat study: illicit cyber activity in the banking and finance sector. [online]. Available: http://www.cert.org/insider_threat/.

Rowlinson, R. (2004). Ten steps to forensic readiness. International Journal of Digital Evidence, 2(3).

Rozinat, A., van der Aalst, W., Dustdar, S., Fiadeiro, J. and Sheth, A. (2006). Decision mining in ProM. In: Lecture Notes in Computer Science. 4102. Springer, Berli

Rozinat, A., Mans, R., Song, M. and van der Aalst, W. (2008). Discovering colored petri nets from event logs. International Journal on Software Tools for Technology Transfer, 10(1).

RSA Security. (2008). CSI computer crime & security survey. [online]. Available: http://i.zdnet.com/blogs/csisurvey2008.pdf.

Saini, V., Duan, Q., Paruchuri, V. (2008). Threat modeling using attack trees. J. Comput. Small Coll. 23(4).

Schneier, B. (1999). Attack trees: modeling security threats. Dr. Dobb’s Journal.

Seltzer, L. (2006). Is online banking too dangerous? eWeek News. [online]. Available: http://www.eweek.com/c/a/Security/Is-Online-Banking-Too-Dangerous/.

Shah, A. (2009). More employees neglecting data security, survey says. [online]. Available: http://www.networkworld.com/news/2009/061009-more-employees-neglecting-data-security.html. IDG News Service.

Sheyner, O., Haines, J., Jha, S., Lippmann, R. and Wing, J. (2002). Automated generation and analysis of attack graphs. Proceedings of the IEEE Symposium on Security and Privacy, 273-284.

Singleton, T., Singleton, A., Bologna, G., and Lindquist, R. (2006). Fraud Auditing and Forensic Accounting, 3rd edition. ISBN: 9780471785910. Wiley.

Siponen, M. and Oinas-Kukkonen, H. (2007). A review of information security issues and respective research contributions. Database for Advances in Information Systems. 38(1), 60-80.

Tan, J. (2001). Forensics readiness. Electronic version available at http://www.arcert.gov.ar/webs/textos/forensic_readiness.pdf.

Tang, Y. and Daniels, T. (2005). A simple framework for distributed forensics. In Proceedings of the 25th IEEE International Conference on Distributed Computing Systems Workshops, 163-169.

Todtmann, B., Riebach, S. and Rathgeb, E. (2007). The honeynet quarantine: reducing collateral damage caused by early intrusion response. In Proceedings of the 6th International Conference on Networking, 464-465.

Tu, M., Xu, D., Butler, E., and Schwartz, A. (2012). Locating and identifying forensic evidence for attacks against online business information systems by using honeynet. Journal of Digital Forensics, Security, and Law. 7(4), 73-97.

Wilson, W. & Wolfe, H. (2003). Management strategies for implementing forensic security measures. Information Security Technical Report, 8(2).

Wippich, B. (2007). Detecting and preventing unauthorized outbound traffic. White Paper, SANS Institute Reading Room. [online]. Available: https://www.sans.org/reading-room/whitepapers/detection/detecting-preventing-unauthorized-outbound-traffic-1951.

Yasinsac, A. and Manzano, Y. (2001). Policies to enhance computer and network forensics. Proceedings of the 2001 IEEE Workshop on Information Assurance and Security.



