File System Analysis 1. What is a File System? Computers need a way to store data and then to retrieve the said data in simple but quick ways. File systems

File System Analysis

1

What is a File System?

Computers need a way to store data and then to retrieve the said data in simple but quick ways. File systems provide a way for computers to store data in a hierarchy of files and directories.

2

What is a File System?

A file system defines how and where files are stored and how they are named.

Files systems are independent of any type of computer. The file systems we will look at are FAT (File Allocation Table), NTFS, Ext2 and Ext3.

However, what they have in common is that they all know how to store data and to retrieve the data in a quick manner.

3

Essential and Non-Essential Data

When looking at data, it is important to differentiate between the essential and non-essential data. We have to trust the essential data, but not necessarily trust the non-essential data.

It is also important to know which OS wrote the file system because different OS’s might have different requirements as to what is essential and what is not.

4

Essential and Non-Essential Data

Essential data is that data that is needed to save and retrieve files. Some essential data: Content storage location. Name of the file. Pointer from name to metadata.

Non-essential data is there for convenience but not needed for the basic functionality of storing and retrieving files. Access Times Security Permissions

5

Data Categories

In order to understand different analysis techniques, we divide the data on a computer into several categories.

1. File System

2. Content

3. Metadata

4. File Name

5. Application

The tools in The Sleuth Kit (TSK) are based on these same categories.

6

Data Categories - File System

Each instance of a file system is unique because it has a unique size.

It contains the general file system information such as where to find certain data (which cluster or block does this reside in) and how big the data unit is. It works like a map.

7

Data Categories - Content

This category is made up of the actual content of a file. Most of the information in a computer is made up of this data category. It is typically organized into data units. Some file systems call these clusters, and some call them blocks.

8

Data Categories - Metadata

This category consists of the data that describes the files such as:

Where the file content is stored Size of the file Creation time Modification time Access control information

This category does not contain the content of the file and the name of the file. e.g.: inodes and Master File Table (MFT) entries

9

Data Categories - File Name

The file name category is also known as the Human Interface category. In this category, a name is assigned to each file, to make data access more convenient and easier for humans.

In most systems, these are contents in a directory with corresponding metadata addresses.

10

Data Categories - Application

This category provides special features. However, these data are not necessary for normal reading and writing of data. Instead they provide features such as user quota information and file system journals. These data can be more easily forged than other data.

11

Interaction Between Categories

12

Layout and Size Information

Quota Data

Content Data #1

Content Data #2

Content Data #1

Time and Addresses

Times and Addresses

file1.txt

file2.txt

File Name Category Metadata Category

File System Category Application Category

Content Category

Carrier, Figure 8.1

Analysis

Different categories of data require different methods for analysis. File system data typically consists of single and

independent values, so analysis is conducted by looking at the values and interpreting them. For instance, if we are searching for files with ‘jpg’ extensions, we would have to focus on file name category analysis techniques.

13

Analysis – File System Category

The file system category of data includes the data that describes the layout and general information about a file system. The File system category might be analyzed to find out on which computer a file system was created, or

TSK has a tool called fsstat which will read the boot sector or superblock and other data structures that are specific to the different types of file systems. The type of data in the output of fsstat is different for each file system because different types of data are available.

14

Analysis – Content Category

Typically, the content category includes equal sized data units that are allocated for files and directories. There is also a data structure that keeps track of the allocation unit of the data units that store this data. This category contains a lot of data, a 40 Mb volume contains > 80000, 512 byte sectors.

The content category is analyzed to recover deleted data and conduct low-level searches. All TSK tools in this category start with the letter d.

15

Addressing Data Units A sector can have multiple addresses

Physical: Address relative to the start of the drive Logical: Address relative to the start of the volume

File systems use the Logical volume address, but also group sectors into clusters or blocks and assign them addresses.

16

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16VolumeAddress

File SystemAddress

Carrier, Figure 8.2

Allocation Strategies

File systems may use several strategies to allocate data units. Data units are generally allocated contiguously. However, this is not always possible, in which case, the file is considered to be fragmented.

The used allocation strategy affects the probability of recovering deleted (unallocated) content.

17

Allocation Strategies (cont.)

First available strategy: Clusters or blocks are assigned from the beginning of the file

system as needed. Most likely to become fragmented. File systems using this strategy are more likely to overwrite free

space containing deleted files.

Next Available Strategy: Similar to the first available, but looks for subsequent clusters

for a file starting from next cluster to end of file system.

Best fit strategy: OS looks for contiguous clusters sufficient to store entire file. Less likely to be fragmented, but if file size changes, then

fragmentation can still happen.

18

Allocation Strategies (cont.)

Last allocated

19

Unallocated

Allocated

First available

Next available

Best fit

Damaged Data Units

Some file systems have ability to mark clusters as damaged, so that they are not used again to store data. This was useful for older disks that did not have the capability to handle errors. Modern disks have ability to handle errors, so this

capability is not needed. If this capability exists, it can be used to hide

data, because consistency checking tools do not verify a data unit that is marked as damaged.

20

Content Analysis

The Data Unit Viewing technique can be used when you know the address where evidence might be located. For example:

In FAT 32 volumes, logical sector 3 is not allocated, so it contains all zeroes. Looking at this sector can easily tell the examiner if data is hidden there. (dcat in TSK)

The Logical File System-Level Searching technique you look for contents, without knowing where it might be.

Search for the word ‘password’ in each data unit

21

Content Analysis

22

0 1 2 3 4 5 6 7 8 9 10 11

0 1 2 3 4 5 6 7 8 9 10 11

Data Unit Viewing: The investigator knows the address of evidence, so uses the logical file system address ... a tool calculates the byte or sector address

20,480 bytes

2,048 bytes in one data unit

Content of data unit 10

in ASCII

Logical File System Search: Looks in each data unit for a known value.

Is the string ‘password’ in this data unit?

Content AnalysisUsing Bitmaps

23

0 1 2 3 4 5 6 7 8 9 10 11

1 00 1 0 00 …

If we want to look in unallocated space, we can use tools to extract all unallocated data units or we can restrict analysis to only the unallocated space. We can see how a bitmap data structure can tell us whether a data unit is allocated or not.

Data units

Bitmap: 0 unallocated1 allocated

Content Analysis (cont.)

The Data Unit Allocation Status technique shows all the unallocated data units to find hidden or ‘deleted’ data.

The dls tool in TSK shows the unallocated data units.

The Consistency Check technique allows you to determine if the file system is in a suspicious state.

Orphan data units (allocated data units that do not have a corresponding metadata structure) and be found in this manner. See Figure 8.7

24

Content Analysis (cont.)

Wiping Techniques Zeroes or random data are written into the data units

allocated by a file or to all unused data units Secure Deletion is becoming more common for many

operating systems An investigator may check to see if a wiping tool is

available on a system, then determine when it was last accessed

There may also be temporary copies of files that were wiped

25

Analysis – Metadata Category

The metadata category includes the data that describe a file. Here you will find the data unit addresses that a file has allocated, the size of the file, and temporal information. The types of data in this category vary depending on the file system type.

There are four TSK tools in this category, and the names all start with i such as the istat command.

26

Metadata Analysis

Analyzing metadata can Provide corroborating information about the document data

itself. Reveal information that someone tried to hide, delete, or

obscure. Automatically correlate documents from different sources.

Since metadata is fundamentally data, it suffers all of the data quality issues as any other form of data. Nevertheless, because metadata isn't generally visible unless you use a special tool, more skill is required to alter or otherwise manipulate it.

27

Metadata Analysis

Slack space occurs when the file size is not a multiple of the data unit size It appears between the end of the file and the

end of the sector in which the file ends Or it can be in sectors that contain no file

content A new file may not completely overwrite all data that

was previously in a memory location

28

Analysis – File Name Category

The file name category of data includes the data that associates a name with a metadata entry. Most file systems separate the name and metadata, and the name is located inside of the data units allocated to a directory.

There are two TSK tools that operate at the file name layer, and their names start with f. fls will list the file names in a given directory. If we want to know which file name corresponds to a given metadata address, the ffind tool can be used.

29

File Name Analysis

One of the most common investigation techniques is to list the names of the files and directories. Many file systems do not clear the file name of a deleted file, so deleted file names may be shown in a listing.

Another technique is to search for file names. For example, we might know a filename’s extension, but not know its name or full path.

As long as the pointer to metadata is intact, we can potentially recover data even if both the filename and metadata are unallocated.

30

Documents

File System Analysis 1. What is a File System? Computers need a way to store data and then to retrieve the said data in simple but quick ways. File systems