Reiser4 By Hans Reiser Owner/Architect Namesys Corporation

Reiser4

By Hans Reiser, Owner/Architect

Namesys Corporation

Reiser4 New Features Overview

• Performance
– Dancing trees (squeeze on flush)
– BLOBs replaced with extents on the twig level
– Allocate on flush à la XFS
– Encrypt on flush (unique to us)
– Wandering logs
– Bottom-up locking
– Repacker (debugging and tweaking now)

• Plugins make it easy to extend

• Transactions (no isolation this year, only atomicity)

• Online growing, shrinking, repacking
– Uses transactions and flush plugins

• Inheritance (work in progress)

• API effective for small files (debugging now)

• Attributes implemented as file plugins

• Constraints (not completed)

• Hidden files

Items and Keys

• Objects are chopped up into items so that they fit into nodes

• Every item has a key

• Items are fully ordered in the tree by their key

• Changing key assignment algorithm can completely change what is aggregated with what without changing much code
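
The effect of the key assignment algorithm on aggregation can be sketched as follows. This is only an illustration: the `Item` fields and both key layouts are hypothetical and are not Reiser4's actual key format. The point is that the same items, sorted by a different key function, end up next to different neighbors.

```python
# Hypothetical illustration: the tree stores items in key order, so the
# key function alone decides which items end up adjacent.
from typing import NamedTuple


class Item(NamedTuple):
    directory: str  # directory the object lives in
    name: str       # object name
    kind: str       # "stat" or "body" (made-up item kinds)
    offset: int     # byte offset within the object


def key_by_object(item: Item) -> tuple:
    # Policy A: keep all items of one object together.
    return (item.directory, item.name, item.kind, item.offset)


def key_by_kind(item: Item) -> tuple:
    # Policy B: keep all items of one kind together.
    return (item.kind, item.directory, item.name, item.offset)


def tree_order(items, key_fn):
    """Return items in the order a key-sorted tree would store them."""
    return sorted(items, key=key_fn)


items = [
    Item("/etc", "passwd", "body", 0),
    Item("/etc", "hosts", "stat", 0),
    Item("/etc", "passwd", "stat", 0),
    Item("/etc", "hosts", "body", 0),
]

by_object = [(i.name, i.kind) for i in tree_order(items, key_by_object)]
by_kind = [(i.name, i.kind) for i in tree_order(items, key_by_kind)]
```

Only the key function differs between the two orderings; the sorting code is untouched, which is the sense in which little code has to change.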

B-trees

B+Trees

Fanout Matters A Lot

• In V3 seeks tended to get added when balancing changed who the parent was
– Increasing fanout and allocating on flush improved that in V4

• Fanout affects cacheability of internal nodes

• Reiser’s Caching Principle: Increasing temperature variance improves caching

• Segregating pointers from data increases temperature variance
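
A rough way to see why fanout matters for caching internal nodes is to count them. The sketch below assumes a tree with completely full nodes and made-up sizes (one million leaves, fanouts of 10 and 100); it is not a measurement of Reiser4.

```python
def tree_height(num_leaves: int, fanout: int) -> int:
    """Levels of internal nodes needed so a single root covers num_leaves."""
    height = 0
    covered = 1
    while covered < num_leaves:
        covered *= fanout
        height += 1
    return height


def internal_nodes(num_leaves: int, fanout: int) -> int:
    """Total internal nodes, assuming every node is completely full."""
    total = 0
    level = num_leaves
    while level > 1:
        level = -(-level // fanout)  # ceiling division: nodes on the level above
        total += level
    return total


# Illustrative numbers only.
low_fanout = (tree_height(1_000_000, 10), internal_nodes(1_000_000, 10))
high_fanout = (tree_height(1_000_000, 100), internal_nodes(1_000_000, 100))
```

Under these assumptions, raising the fanout from 10 to 100 halves the height (6 levels to 3) and cuts the number of internal nodes that must stay cached by roughly a factor of eleven (111,111 to 10,101).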

Segregation Can Increase Temperature Variance

• If two sets of objects have different average temperatures

• And they are stored in larger containers that are the units of caching (e.g. nodes in a tree)

• Segregating them can increase the temperature variance
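
A small worked example of the claim, with invented numbers: four hot objects (temperature 9) and four cold objects (temperature 1), stored two per node, and a cache with room for two of the four nodes. The temperatures, node size, and cache size are all assumptions chosen to make the arithmetic easy.

```python
from statistics import mean, pvariance

# Hypothetical access temperatures (e.g. accesses per second).
hot = [9.0] * 4
cold = [1.0] * 4

# Mixed layout: every node holds one hot and one cold object.
mixed = [[h, c] for h, c in zip(hot, cold)]
# Segregated layout: hot objects share nodes, cold objects share nodes.
segregated = [hot[0:2], hot[2:4], cold[0:2], cold[2:4]]


def node_temperatures(nodes):
    """Average temperature of each node (the unit of caching)."""
    return [mean(node) for node in nodes]


def cached_fraction(nodes, cache_slots):
    """Fraction of all accesses served if the hottest nodes are cached."""
    totals = sorted((sum(node) for node in nodes), reverse=True)
    return sum(totals[:cache_slots]) / sum(totals)


mixed_variance = pvariance(node_temperatures(mixed))
segregated_variance = pvariance(node_temperatures(segregated))
mixed_hits = cached_fraction(mixed, cache_slots=2)
segregated_hits = cached_fraction(segregated, cache_slots=2)
```

In this toy setup the mixed layout has zero variance between nodes and the cache serves half of the accesses, while the segregated layout has a variance of 16 and the same cache serves 90% of them.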

Reiser3 (and most databases, since they use BLOBs also) Unbalanced the Tree

Reiser4 Balanced Tree Diagram

Dancing Trees

• Traditional trees employ fixed criteria for balancing

• Squeeze adjacent dirty nodes all the way to the left when balancing

• Let repacker deal with singletons.
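
The "squeeze to the left" step can be sketched as below. This is a simplified model, not the Reiser4 flush code: nodes are plain lists, every item has the same size, and `capacity` is a hypothetical items-per-node limit.

```python
def squeeze_left(dirty_run, capacity):
    """Pack the items of a run of adjacent dirty nodes into as few nodes
    as possible, filling from the left and preserving item order."""
    packed = []
    for node in dirty_run:
        for item in node:
            # Start a new node only when the current one is full.
            if not packed or len(packed[-1]) == capacity:
                packed.append([])
            packed[-1].append(item)
    return packed


# Four loosely filled dirty nodes, squeezed at flush time.
dirty_run = [[1, 2], [3], [4, 5], [6]]
squeezed = squeeze_left(dirty_run, capacity=4)
```

Here four partly filled nodes become two, and only the last node of the run may be left partly empty. Nodes that are not part of a dirty run are left alone in this model, which corresponds to leaving singletons to the repacker.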

Locking

• High priority / low priority
– Left and down are low priority
– Right and up are high priority

• If a low-priority thread fails to get a lock held by a high-priority thread, it checks whether any high-priority thread is waiting for one of its locks, and if so it yields to them

• Sometimes when a low-priority-direction lock is desired but not needed, it may be better to just live without it in the relatively rare cases where it is already held.
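
The two rules above can be sketched in a few lines. This is an illustration only: `should_yield` and `lock_if_free` are hypothetical helper names, locks are identified by plain strings in the first function, and Python's `threading.Lock` stands in for whatever lock type the filesystem uses.

```python
import threading


def should_yield(my_locks, high_priority_waits):
    """A low-priority thread that failed to take a lock yields if any
    high-priority thread is waiting on a lock it currently holds."""
    return any(lock in my_locks for lock in high_priority_waits)


def lock_if_free(lock):
    """Take a desired-but-optional lock only if nobody holds it.
    Returns True when the lock was taken, False when we go without it."""
    return lock.acquire(blocking=False)


# Usage: try for the optional (low-priority direction) lock without blocking.
left_neighbor = threading.Lock()
if lock_if_free(left_neighbor):
    try:
        pass  # work that benefits from holding the neighbor's lock
    finally:
        left_neighbor.release()
```

The non-blocking attempt never waits in the low-priority direction, so it cannot contribute to a deadlock; the cost is occasionally doing without the lock.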

Wandering Logs

• Joys of WAFL

• Neither relocation à la WAFL nor write-twice journaling is always optimal

• Write twice is optimal when making small changes and then reading before a repacker can run

• Fixed size and location journal vs. wandering

• Tool kit approach to design rather than a belief that screwdrivers are better than hammers.
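
The tool kit idea, choosing per block between relocating and writing twice, might look like the sketch below. The policy, the `DirtyBlock` fields, and the `min_run` threshold are all hypothetical and chosen for illustration; the slides do not state the actual decision rule.

```python
from dataclasses import dataclass


@dataclass
class DirtyBlock:
    block_nr: int
    newly_allocated: bool   # no previous on-disk location to preserve
    dirty_neighbors: int    # adjacent dirty blocks in the same flush


# Device writes per block under each strategy.
WRITES = {"relocate": 1, "overwrite": 2}


def choose_strategy(block: DirtyBlock, min_run: int = 8) -> str:
    """Hypothetical policy: relocate new blocks and large dirty runs,
    write small in-place changes twice (journal copy, then home location)."""
    if block.newly_allocated:
        return "relocate"
    if block.dirty_neighbors + 1 >= min_run:
        return "relocate"
    return "overwrite"


blocks = [
    DirtyBlock(10, newly_allocated=True, dirty_neighbors=0),
    DirtyBlock(11, newly_allocated=False, dirty_neighbors=0),
    DirtyBlock(12, newly_allocated=False, dirty_neighbors=7),
]
plan = [choose_strategy(b) for b in blocks]
total_writes = sum(WRITES[s] for s in plan)
```

Writing twice costs an extra write but keeps a small change at its original location, so a read that arrives before the repacker has run still finds the file laid out as before.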

Atomic Filesystem

• Full transactional infrastructure in place

• Currently all fs operations are atomic.

• Sys_reiser4 allows performing any 64 assignments as atoms
– The limit is arbitrary, and trusted processes can be granted no limit

• Mail servers, version control, and many other applications could improve their performance as a result
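
What "any 64 assignments as an atom" means for an application can be modeled in user space as follows. This is not the `Sys_reiser4` interface, whose syntax the slides do not show; `apply_atom`, `AtomTooLarge`, and the dictionary standing in for the filesystem are all hypothetical.

```python
MAX_ASSIGNMENTS = 64  # the limit quoted on the slide


class AtomTooLarge(Exception):
    """Raised when an untrusted caller exceeds the assignment limit."""


def apply_atom(store: dict, assignments, trusted: bool = False) -> None:
    """Apply all (name, value) assignments or none of them."""
    if not trusted and len(assignments) > MAX_ASSIGNMENTS:
        raise AtomTooLarge(f"{len(assignments)} > {MAX_ASSIGNMENTS}")
    staged = dict(store)  # work on a private copy
    for name, value in assignments:
        if not isinstance(value, bytes):
            raise TypeError(f"value for {name!r} must be bytes")
        staged[name] = value
    # Commit point: reached only if every assignment succeeded.
    store.clear()
    store.update(staged)


# Usage: a mail server delivering a message and updating its index together.
mailbox = {"index": b"1 message"}
apply_atom(mailbox, [("msg-2", b"hello"), ("index", b"2 messages")])
```

A failure part-way through leaves `mailbox` exactly as it was, which is the property that lets applications drop their own recovery logic.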

Online Repacker/Resizer/Defragger

• 80% of files don’t move for long periods of time

• Repack once a week, and 80% of files are perfectly laid out for reads

• Joys of LFS

Reducing the Effort of Hacking

• Plugins
– File plugins
– Directory plugins
– Hash plugins
– Security plugins
– Key assignment plugins
– Node and item search plugins
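
The general shape of a plugin table, shown here for hash plugins, can be sketched as a registry of interchangeable functions. The registry, the decorator, and both example hashes are hypothetical; they illustrate the pattern rather than Reiser4's plugin interface.

```python
import zlib
from typing import Callable, Dict

HashPlugin = Callable[[bytes], int]

# Hypothetical registry: plugin name -> implementation.
HASH_PLUGINS: Dict[str, HashPlugin] = {}


def register_hash(name: str):
    """Decorator that adds a function to the hash plugin table."""
    def decorator(fn: HashPlugin) -> HashPlugin:
        HASH_PLUGINS[name] = fn
        return fn
    return decorator


@register_hash("crc32")
def crc32_hash(name: bytes) -> int:
    return zlib.crc32(name)


@register_hash("length")
def length_hash(name: bytes) -> int:
    # Deliberately trivial, to show how little a new plugin needs.
    return len(name)


def hash_name(plugin: str, name: bytes) -> int:
    """Directory code calls through the table, never a specific hash."""
    return HASH_PLUGINS[plugin](name)
```

Adding a new hash is one decorated function; the calling code in `hash_name` does not change.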

Key Assignment

• Keys for file bodies have the same order as keys for the first directory entry they are created with

• 2x performance increase for operations in readdir order (e.g. cp -r)

• Keys are larger

• This change took Nikita 3 days.
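
Why matching key order helps operations in readdir order can be illustrated with a toy key layout. Everything here is an assumption: the tuple layout, the use of `zlib.crc32` as a stand-in for the directory hash, and the file names.

```python
import zlib


def name_order(name: str) -> int:
    # Stand-in for the directory hash plugin; any stable hash would do here.
    return zlib.crc32(name.encode())


def dirent_key(dir_id: int, name: str) -> tuple:
    # Hypothetical key of a directory entry.
    return (dir_id, "dirent", name_order(name))


def body_key(dir_id: int, name: str, offset: int) -> tuple:
    # Hypothetical key of a file body item: it carries the same ordering
    # component as the directory entry, plus an offset (so it is larger).
    return (dir_id, "body", name_order(name), offset)


names = ["Makefile", "README", "main.c", "util.c"]
readdir_order = sorted(names, key=lambda n: dirent_key(1, n))
body_order = sorted(names, key=lambda n: body_key(1, n, 0))
```

Because both keys share the ordering component, `readdir_order` and `body_order` are the same sequence, so a tool that walks the directory in readdir order also walks the file bodies in tree order.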

Miscellaneous

• Inserting and appending to items in increasing order requires no shifting inside node

• Timestamps and mkfsids make fsck more effective

• Putting directory entries near file bodies is better

• Better directory readahead code
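
The first point above, that inserting in increasing order needs no shifting inside a node, can be checked with a small counting model. A node is modeled as a sorted Python list, and `insert_counting_shifts` is a hypothetical helper that reports how many existing entries had to move.

```python
import bisect


def insert_counting_shifts(node: list, key: int) -> int:
    """Insert key into a sorted node; return how many entries were shifted."""
    pos = bisect.bisect_left(node, key)
    shifted = len(node) - pos  # everything at or after pos moves right
    node.insert(pos, key)
    return shifted


def total_shifts(keys) -> int:
    """Total entries shifted when inserting keys into an empty node."""
    node = []
    return sum(insert_counting_shifts(node, k) for k in keys)


increasing = total_shifts([1, 2, 3, 4, 5])
decreasing = total_shifts([5, 4, 3, 2, 1])
```

Increasing keys always land at the end of the node, so nothing moves; the same keys in decreasing order shift every existing entry on each insert (0 + 1 + 2 + 3 + 4 = 10 moves here).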

Benchmarks

..\Desktop\v4marks.html

Reiser4 Talk Review

• Performance
– Dancing trees (squeeze on flush)
– BLOBs replaced with extents on the twig level
– Allocate on flush à la XFS
– Encrypt on flush (unique to us)
– Wandering logs
– Bottom-up locking
– Repacker (debugging and tweaking now)

• Plugins make it easy to extend

• Transactions (no isolation this year, only atomicity)

• Online growing, shrinking, repacking
– Uses transactions and flush plugins

• Inheritance (work in progress)

• API effective for small files (debugging now)

• Attributes implemented as file plugins

• Constraints (not completed)

• Hidden files

Reiser5

• Writes followed by reads at another location must be limited by network transmission latency

• Caching can help with most of the rest

Features

• Globally Scalable

• Location Transparent

• Consistent

• Multi-level Caching

• Auto-Migrating

• Per File Specifiably Available

• Encryption Based Security

Hashing vs. trees

• Why trees, not hashing, are used in Reiser4

• Same performance principle applies to distributed filesystems
