CSC 213 – Large Scale Programming


What is “the BTree?”

- Common multi-way tree implementation
- Every BTree has an order (“BTree of order m”)
  - m/2 to m children per internal node
  - Root node has m or fewer elements
- Many variants exist, each improving on some shortcoming
  - Each variant is specialized for some niche use
  - Only minor differences between variants
  - This lecture will stick with vanilla BTrees

BTree Order

- Order selected to minimize paging
  - Elements & references to kids in a full node fill one page
  - Nodes have at least m/2 elements, even at their smallest
  - Guarantees each page in memory is at least 50% full
- How many pages are touched during an operation?
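To make the link between order and page size concrete, here is a minimal sketch. It assumes a 4096-byte page, 8-byte elements, and 8-byte child references; all three numbers are illustrative and not from the lecture, and the class and method names are hypothetical. A full node holds m child references and m - 1 elements, so the sketch picks the largest m for which both fit in one page. It also assumes that m/2 is rounded up when m is odd.

```java
final class BTreeOrderSketch {
    /** Largest order m such that m child references plus m - 1 elements fit in one page. */
    static int maxOrder(int pageBytes, int elementBytes, int referenceBytes) {
        // m * ref + (m - 1) * elem <= page   =>   m <= (page + elem) / (ref + elem)
        return (pageBytes + elementBytes) / (referenceBytes + elementBytes);
    }

    /** Fewest children a non-root internal node may have: m/2, rounded up (an assumption). */
    static int minChildren(int order) {
        return (order + 1) / 2;
    }

    public static void main(String[] args) {
        int m = maxOrder(4096, 8, 8);
        // prints "order = 256, minimum children = 128"
        System.out.println("order = " + m + ", minimum children = " + minChildren(m));
    }
}
```

With these numbers a full node uses 256 * 8 + 255 * 8 = 4088 of the 4096 bytes, and an operation touches roughly one page per level of the tree.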

Removal from BTree

- Swap element with its successor in a leaf node
  - Similar to (2,4) node removal
- If the removal leaves the node with under m/2 elements
  - See if we can move an element from a sibling to the parent & steal an element from the parent
  - Else, merge with the sibling & steal an element from the parent
  - But this might propagate underflow to the parent node!
- Remind anyone else of another structure?
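The two repair cases above can be sketched on a deliberately tiny model: one parent whose children are all leaves, with `int` elements held in lists. This is an illustration under those assumptions, not the lecture's implementation; `UnderflowSketch`, `fixUnderflow`, and the `min` parameter are hypothetical names, and the sketch does not follow the underflow up the tree.

```java
import java.util.ArrayList;
import java.util.List;

final class UnderflowSketch {
    /** Repairs leaf i after a removal left it with fewer than min elements. */
    static void fixUnderflow(List<Integer> parent, List<List<Integer>> leaves, int i, int min) {
        List<Integer> node = leaves.get(i);
        if (node.size() >= min) {
            return; // no underflow, nothing to do
        }
        if (i > 0 && leaves.get(i - 1).size() > min) {
            // Left sibling can spare one: its largest element moves up to the
            // parent, and the old separator moves down into the underfull node.
            List<Integer> left = leaves.get(i - 1);
            node.add(0, parent.get(i - 1));
            parent.set(i - 1, left.remove(left.size() - 1));
        } else if (i + 1 < leaves.size() && leaves.get(i + 1).size() > min) {
            // Mirror image: borrow through the parent from the right sibling.
            List<Integer> right = leaves.get(i + 1);
            node.add(parent.get(i));
            parent.set(i, right.remove(0));
        } else {
            // No sibling can spare an element: merge two siblings and pull the
            // separator down. The parent shrinks and may now underflow itself.
            int l = (i > 0) ? i - 1 : i; // left node of the pair being merged
            List<Integer> left = leaves.get(l);
            left.add(parent.remove(l));
            left.addAll(leaves.remove(l + 1));
        }
    }

    public static void main(String[] args) {
        List<Integer> parent = new ArrayList<>(List.of(10, 20));
        List<List<Integer>> leaves = new ArrayList<>();
        leaves.add(new ArrayList<>(List.of(1, 2, 3)));
        leaves.add(new ArrayList<>(List.of(12))); // underfull when min is 2
        leaves.add(new ArrayList<>(List.of(25, 26)));
        fixUnderflow(parent, leaves, 1, 2);
        // prints "[3, 20] [[1, 2], [10, 12], [25, 26]]"
        System.out.println(parent + " " + leaves);
    }
}
```

In the merge branch the parent loses an element, which is exactly how underflow can propagate upward.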

Where to Find BTrees

- Often used to implement databases
  - Contain lots of data -- more than machine’s RAM
  - Perform lots of data accesses, insertions
  - Need simple, efficient organization
- Databases must store data permanently
  - Losing information may cause significant problems
  - RAM contents lost when powered off
  - But storing files on hard drive is s - l - o - w

Database Implementation

- Maintain BTree in memory…
  - … but maintain copies of records on disk
  - Nodes have unique ID & location in file
- Immediately write changes to disk
  - Always keep file as up-to-date copy
  - Just re-read file in case of program crash
- Ignore virtual memory & instead use file
  - Records stored in random order within file
  - Execution may change element order

Better Ways To Access Data

- BTrees cannot read & write file sequentially
  - Must jump around in file instead
  - Need way to specify each record within file
- Java’s solution: RandomAccessFile

RandomAccessFile

- Can create new files or use an existing one

    raf = new RandomAccessFile("f.txt", "rw");

  - Creates the file named f.txt, or opens it for rewriting if it already exists (existing contents stay until overwritten)
  - When problem arises, throws IOException
- Allows reading & writing to the file from within program
  - File can be used and modified using raf
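A minimal sketch of opening a file in "rw" mode, assuming a scratch file in the temp directory. The class and method names are hypothetical, and wrapping IOException in UncheckedIOException is just one way to handle the checked exception. Calling the method twice on the same file shows that reopening does not erase what is already there.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

final class OpenSketch {
    /** Opens the file (creating it if needed), appends one int, returns the new length in bytes. */
    static long appendInt(File file, int value) {
        // try-with-resources closes the file even if a write fails
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.seek(raf.length()); // jump to the end so earlier data is kept
            raf.writeInt(value);    // an int occupies 4 bytes
            return raf.length();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("open-sketch", ".dat");
        file.deleteOnExit();
        System.out.println(appendInt(file, 7)); // prints 4
        System.out.println(appendInt(file, 8)); // prints 8: the first int survived reopening
    }
}
```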

Reading RandomAccessFile

- Read RandomAccessFile instance using:
  - boolean readBoolean(), int readInt(), double readDouble()…
    - Reads and returns the appropriate value
  - int read(byte[] b)
    - Reads up to b.length bytes & stores them in b
    - Returns number of bytes read (-1 at end of file)

Writing RandomAccessFile

- Write RandomAccessFile using:
  - void writeInt(int i), void writeDouble(double d)…
    - Writes value at next location in the file
    - When at the end, will extend the file
    - Otherwise overwrites file, erasing data that had been there
  - void write(byte[] b)
    - Writes contents of b to the file
    - As it is needed, will overwrite/extend file
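As a quick check of how the typed read and write methods pair up, here is a small round-trip sketch. The values and the class name are made up for illustration. It calls seek(0) to return to the start of the file before reading, since each write leaves the file position just past the bytes it wrote.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

final class TypedIoSketch {
    /** Writes three typed values, reads them back in the same order, and reports them. */
    static String roundTrip(File file) {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0);        // start from an empty file
            raf.writeInt(42);        // 4 bytes
            raf.writeDouble(2.5);    // 8 bytes
            raf.writeBoolean(true);  // 1 byte
            long length = raf.length();

            raf.seek(0);             // reads must start where the writes started
            int i = raf.readInt();
            double d = raf.readDouble();
            boolean b = raf.readBoolean();
            return i + " " + d + " " + b + " " + length;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("typed-io", ".dat");
        file.deleteOnExit();
        System.out.println(roundTrip(file)); // prints "42 2.5 true 13"
    }
}
```

Values must be read back with the same types and in the same order they were written; the file itself stores only raw bytes.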

Typical File I/O

Ordinarily we read and write files sequentially

    RandomAccessFile raf = new …;
    char c = '\0';              // placeholder value until the first read
    while (c != 's') {
        c = raf.readChar();     // each read advances the file position
    }

raf: This is an example file we access

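Typical File I/O

Ordinarily we read and write files sequentially

    RandomAccessFile raf = new …;
    char c = '\0';              // placeholder value until the first read
    while (c != 's') {
        c = raf.readChar();     // read at the current position
        raf.writeChar(c);       // write lands right after what was just read
    }

This is an example file we access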

Typical File I/O

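Ordinarily we read and write files sequentially

    RandomAccessFile raf = new …;
    char c = '\0';              // placeholder value until the first read
    while (c != 's') {
        c = raf.readChar();     // read at the current position
        raf.writeChar(c);       // write lands right after what was just read
    }

TTis is an example file we access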

Typical File I/O

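Ordinarily we read and write files sequentially

    RandomAccessFile raf = new …;
    char c = '\0';              // placeholder value until the first read
    while (c != 's') {
        c = raf.readChar();     // read at the current position
        raf.writeChar(c);       // write lands right after what was just read
    }

TTii is an example file we access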

Typical File I/O

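Ordinarily we read and write files sequentially

    RandomAccessFile raf = new …;
    char c = '\0';              // placeholder value until the first read
    while (c != 's') {
        c = raf.readChar();     // read at the current position
        raf.writeChar(c);       // write lands right after what was just read
    }

TTii  s an example file we access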

Typical File I/O

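Ordinarily we read and write files sequentially

    RandomAccessFile raf = new …;
    char c = '\0';              // placeholder value until the first read
    while (c != 's') {
        c = raf.readChar();     // read at the current position
        raf.writeChar(c);       // write lands right after what was just read
    }

TTii  ssan example file we access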
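The slide sequence treats each letter as one file position. In Java, readChar() and writeChar() each move two bytes, so the sketch below reproduces the slides' picture with single-byte read() and write() on an ASCII file instead. The class name, the temp file, and the end-of-file guard are assumptions added for the example.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class SequentialSketch {
    /** Fills the file with text, runs the read-then-write loop, returns the final contents. */
    static String run(File file, String text) {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0);
            raf.write(text.getBytes(StandardCharsets.US_ASCII));
            raf.seek(0); // start the loop at the first byte

            int c = 0;
            while (c != 's') {
                c = raf.read();   // reads one byte, advances one position
                if (c < 0) {
                    break;        // end of file before any 's' was found
                }
                raf.write(c);     // overwrites the byte after the one just read
            }

            raf.seek(0);
            byte[] all = new byte[(int) raf.length()];
            raf.readFully(all);
            return new String(all, StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("sequential", ".txt");
        file.deleteOnExit();
        // prints "TTii  ssan example file we access"
        System.out.println(run(file, "This is an example file we access"));
    }
}
```

Reads and writes share a single file position, which is why every write clobbers the character that follows the one just read.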

Skipping Around The File

- Read & write anywhere in RandomAccessFile
  - void seek(long pos) moves to position in file
  - Positions specified as bytes from beginning of file

RandomAccessFile I/O

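Ordinarily we read and write files sequentially

    RandomAccessFile raf = new …;
    char c;
    raf.seek(raf.length() - 2);   // readChar() consumes 2 bytes, so back up 2
    c = raf.readChar();           // last character in the file
    raf.seek(0);                  // jump back to the start
    raf.writeChar(c);             // overwrite the first character

This is an example file we access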

RandomAccessFile I/O

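Ordinarily we read and write files sequentially

    RandomAccessFile raf = new …;
    char c;
    raf.seek(raf.length() - 2);   // readChar() consumes 2 bytes, so back up 2
    c = raf.readChar();           // last character in the file
    raf.seek(0);                  // jump back to the start
    raf.writeChar(c);             // overwrite the first character

shis is an example file we access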
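A runnable version of this seek example, again as a sketch. Because readChar() and writeChar() move two bytes at a time, it uses single-byte read() and write() on an ASCII file so that one letter is one position, matching the slide. The class name and temp file are assumptions.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class SeekSketch {
    /** Copies the last byte of the file over the first byte and returns the contents. */
    static String copyLastToFirst(File file, String text) {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0);
            raf.write(text.getBytes(StandardCharsets.US_ASCII));

            raf.seek(raf.length() - 1); // position of the final byte
            int last = raf.read();
            raf.seek(0);                // back to the first byte
            raf.write(last);

            raf.seek(0);
            byte[] all = new byte[(int) raf.length()];
            raf.readFully(all);
            return new String(all, StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("seek", ".txt");
        file.deleteOnExit();
        // prints "shis is an example file we access"
        System.out.println(copyLastToFirst(file, "This is an example file we access"));
    }
}
```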

How Does This Work?

- Use positions to simplify everything
  - Element contains position of record within file
- Simplify building nodes from start of program
  - Record new nodes at end of file
  - Stores node table of contents at file start
  - Node records position of each of its children
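One way these ideas could fit together, as a sketch only: each node is a fixed-size record appended at the end of the file, and a parent stores the byte positions of its children. The record layout (an `int` count, `ORDER - 1` `int` elements, `ORDER` `long` child positions), the order of 4, and all names here are assumptions for illustration. The table of contents at the start of the file is left out to keep the example short.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.util.Arrays;

final class NodeFileSketch {
    static final int ORDER = 4;                                  // assumed order
    static final int CHILDREN_OFFSET = 4 + 4 * (ORDER - 1);      // count + element slots

    /** Appends a node record at the end of the file and returns its position. */
    static long appendNode(RandomAccessFile raf, int[] elements, long[] children) throws IOException {
        long position = raf.length();
        raf.seek(position);
        raf.writeInt(elements.length);
        for (int i = 0; i < ORDER - 1; i++) {
            raf.writeInt(i < elements.length ? elements[i] : 0);     // pad unused slots
        }
        for (int i = 0; i < ORDER; i++) {
            raf.writeLong(i < children.length ? children[i] : -1L);  // -1 means "no child"
        }
        return position;
    }

    /** Reads the elements of the node stored at the given position. */
    static int[] readElements(RandomAccessFile raf, long position) throws IOException {
        raf.seek(position);
        int[] elements = new int[raf.readInt()];
        for (int i = 0; i < elements.length; i++) {
            elements[i] = raf.readInt();
        }
        return elements;
    }

    /** Reads the file position of one child of the node stored at the given position. */
    static long readChildPosition(RandomAccessFile raf, long position, int index) throws IOException {
        raf.seek(position + CHILDREN_OFFSET + 8L * index);
        return raf.readLong();
    }

    /** Builds two leaves and a root, then follows the root's second child reference. */
    static String demo(File file) {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0);
            long left = appendNode(raf, new int[] {10, 20}, new long[0]);
            long right = appendNode(raf, new int[] {30, 40}, new long[0]);
            long root = appendNode(raf, new int[] {25}, new long[] {left, right});
            long child = readChildPosition(raf, root, 1);
            return child + " " + Arrays.toString(readElements(raf, child));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("nodes", ".dat");
        file.deleteOnExit();
        System.out.println(demo(file)); // prints "48 [30, 40]"
    }
}
```

With this layout every record is 48 bytes, so the second leaf lands at position 48. Fixed-size records keep position arithmetic simple, at the cost of padding nodes that are not full.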

For Next Lecture

- Start week #14 assignment (due on Tuesday)
  - Contains 3 problems to reinforce lecture topics
  - Provides practice for labs & final
  - Often helps build up to project
- Programming project #4 now available
- Read sections 30.1 - 30.10, 30.25 of book
  - Will complete semester by looking at graphs
  - Graphs are a very important data structure