Implementing Dual-Boot Clusters in a Distributed Environment

Surajit Bose, Technology Services Manager

Dustin King, Systems Imaging Architect




Our Environment

• Not central IT

• Over 100 computer clusters, mostly unstaffed

    • Dorms, Row Houses, Graduate Residences

    • Central and Branch Libraries

    • Student Centers

• Most open 24/7

• Approximately 500 cluster machines

• Historically, even mix of Dells and Apples


Our Prior Infrastructure

• Campus-wide Kerberos authentication

• PXE/Ghost for Windows imaging

• Windows machines joined to AD

• Domain scripts for Windows maintenance

• NetRestore for Mac imaging

• Macs bound to LDAP

• Radmind for Mac maintenance

• Linux server environment


Why Dual-Boot?

• Bypass question of optimal platform mix

• Improve availability of single-platform software

• Provide choice for students

• Homogenize inventory

• Seemed like a cool thing to try


Desiderata

• Network-based full-disk imaging

• Platform parity

• Manage each platform independently

• Ease of switching OS

• Non-ridiculous login times

• Server-side control

• Consistent imaging process across hardware

• Shared local storage across OSes


What We Discovered

• Managing the reboot cycle is difficult

• Existing solutions unsatisfactory for us

    • BootPicker, NetRestore/WinClone Mac-centric

    • rEFIt makes management difficult

• No network boot environment works for both Dell and Apple machines

• Partition order matters


What We Decided

• Control boot process with EFI shell environment (SCUBA)

• Inter-OS communication via locally stored state file

• NetBoot install environment (Genie)

• Use convoluted partition scheme

• Use Paragon NTFS and MacDrive

• Use customized login screens

• Nightly maintenance reboots

• Server-side tracking of machine state


EFI Shell Environment

• Boot to EFI shell

• Fits on a flash drive for full-disk imaging

• Shell modified to ignore keyboard interrupts

• EFI toolkit has network stack, http client, Python

• Startup script

    • validates nvram boot options

    • checks with server

    • reads and updates local state file

    • sets nextboot value in nvram
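
The four startup steps can be sketched as one function. This is a minimal illustration in Python (which the slide notes is available in the EFI toolkit), not the SCUBA code itself. Every name here is hypothetical: the injected callables stand in for the real NVRAM, HTTP, and file operations, the "server." key prefix is invented, and both the fail-on-missing-option behavior and the fallback to the cached server response are assumptions.

```python
from typing import Callable, Dict, List, Optional


def run_startup(
    boot_options: List[str],
    expected_options: List[str],
    ask_server: Callable[[], Optional[Dict[str, str]]],
    state: Dict[str, str],
    choose_target: Callable[[Dict[str, str], Dict[str, str]], str],
    set_nextboot: Callable[[str], None],
) -> str:
    """Hypothetical outline of the four startup steps listed on the slide."""
    # 1. Validate NVRAM boot options: every expected entry must be present.
    #    (Assumption: a missing entry is treated as a hard error.)
    missing = [name for name in expected_options if name not in boot_options]
    if missing:
        raise RuntimeError("missing boot options: " + ", ".join(missing))

    # 2. Check with the server. A return value of None models an
    #    unreachable server.
    response = ask_server()

    # 3. Read and update the local state: cache a fresh response, or
    #    (assumption) fall back to the most recently cached one.
    prefix = "server."
    if response is not None:
        for key, value in response.items():
            state[prefix + key] = value
    else:
        response = {
            key[len(prefix):]: value
            for key, value in state.items()
            if key.startswith(prefix)
        }

    # 4. Set the next-boot value from the combined server and local flags.
    target = choose_target(response, state)
    set_nextboot(target)
    return target
```

Passing the I/O operations in as callables keeps the control flow testable without real firmware; a real script would bind them to whatever the EFI environment provides.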


Priority of Boot Flags

• Required (from server)

• Mac Maintenance (from local state file, set by script)

• Windows Maintenance (from local state file, set by script)

• Requested (from local state file, set by user)

• Suggested (from server)
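
The ordering above amounts to a first-match lookup. The sketch below assumes the flags from the server and the local state file have already been merged into one dictionary that maps a flag name to the OS it asks for. The key names, the "mac"/"windows" values, and the default are hypothetical.

```python
from typing import Dict, Optional

# Highest priority first, in the order given on the slide.
FLAG_PRIORITY = [
    "required",             # from server
    "mac_maintenance",      # from local state file, set by script
    "windows_maintenance",  # from local state file, set by script
    "requested",            # from local state file, set by user
    "suggested",            # from server
]


def resolve_boot_target(flags: Dict[str, Optional[str]], default: str = "mac") -> str:
    """Return the OS named by the highest-priority flag that is set."""
    for name in FLAG_PRIORITY:
        target = flags.get(name)
        if target:  # skips flags that are absent, None, or empty
            return target
    return default  # hypothetical fallback when no flag is set
```

For example, with `{"mac_maintenance": "mac", "requested": "windows"}` the function returns `"mac"`, because a pending maintenance flag outranks the user's request.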


Local State File

• Houses maintenance and requested boot flags

• Caches most recent response from the server

• Has to be writable from both OSes as well as EFI shell environment
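
The slides do not describe the file's format. One way to satisfy the "writable from everywhere" requirement is a plain ASCII file of key=value lines, which any scripting environment can read and rewrite. The sketch below assumes that format; the function names and keys are hypothetical.

```python
import os
from typing import Dict


def parse_state(text: str) -> Dict[str, str]:
    """Parse 'key=value' lines, skipping blanks, '#' comments, and junk."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        state[key.strip()] = value.strip()
    return state


def format_state(state: Dict[str, str]) -> str:
    """Serialize with sorted keys so every writer produces the same layout."""
    return "".join(f"{key}={state[key]}\n" for key in sorted(state))


def update_state_file(path: str, changes: Dict[str, str]) -> Dict[str, str]:
    """Merge changes into the file at path and return the resulting state."""
    state = {}
    if os.path.exists(path):
        with open(path, "r", encoding="ascii") as handle:
            state = parse_state(handle.read())
    state.update(changes)
    # Force "\n" line endings so the file looks the same from every OS.
    with open(path, "w", encoding="ascii", newline="\n") as handle:
        handle.write(format_state(state))
    return state
```

A login screen could then record a user's choice with something like `update_state_file(path, {"requested": "windows"})`, leaving the other flags untouched.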


Genie

• Based on NetInstall set built with Mac OS X Server Admin Tools

• Bash scripts check server for configuration and manage imaging process

• Report progress through iHook


Windows Login Screen (pGina)


Mac Login Screen (SCUBA)


Partition Scheme

• EFI System Partition: leave alone per Apple recommendation

• FAT: store Windows images and local state file

• NTFS: local storage space for users

• NTFS: Windows system partition

• HFS+: EFI shell environment

• HFS+: Mac system partition
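
Because partition order matters, an imaging script might check a disk against the intended layout before proceeding. The sketch below encodes the six partitions as data and reports any mismatches. It assumes the slide lists them in on-disk order, which the slides do not state. The filesystem labels and function name are hypothetical.

```python
from typing import List, NamedTuple


class Partition(NamedTuple):
    filesystem: str
    purpose: str


# Assumption: this is the on-disk order as well as the slide order.
EXPECTED_LAYOUT = [
    Partition("EFI", "EFI System Partition (left alone)"),
    Partition("FAT", "Windows images and local state file"),
    Partition("NTFS", "local storage space for users"),
    Partition("NTFS", "Windows system partition"),
    Partition("HFS+", "EFI shell environment"),
    Partition("HFS+", "Mac system partition"),
]


def layout_mismatches(observed: List[str]) -> List[str]:
    """Describe each way the observed filesystem types differ from the plan."""
    problems = []
    if len(observed) != len(EXPECTED_LAYOUT):
        problems.append(
            f"expected {len(EXPECTED_LAYOUT)} partitions, found {len(observed)}"
        )
    # Compare position by position; zip stops at the shorter list.
    for index, (found, expected) in enumerate(zip(observed, EXPECTED_LAYOUT), start=1):
        if found != expected.filesystem:
            problems.append(
                f"partition {index}: expected {expected.filesystem} "
                f"({expected.purpose}), found {found}"
            )
    return problems
```

An empty result means the observed filesystem types match the plan position by position.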


Handling Partitions

• Mac OS X

    • Paragon NTFS

    • Remount volumes under /Library/Mounts

• Windows XP

    • MacDrive

    • Some partitions already invisible

    • Remount volumes under c:\stucomp\mnt


Nightly Maintenance

• Scripts on each OS write maintenance flags into state file

• Windows

    • Python reboot service

    • Domain startup scripts

• Mac

    • Radmind

    • iHook


Server-Side Setup

• Genie

• Background downloads

• SCUBA flags

• Printer configuration

• Imaging request page

• Status “database”


Gotchas

• Per-seat licensing costs

• Mouse and keyboard confusion

• NetBoot memory management horror

• Windows reboot behavior

• Time and Kerberos logins

• Permissions on shared volumes

• SSH keys


Planned Enhancements

• Improve build processes for EFI, NetBoot environments

• Increase structural similarity of configuration and management between platforms

• Implement PKI for client-server communications

• Explore emerging solutions (e.g. XHooks)

• Implement cross-platform monitoring system

• Reduce power usage on clients

• Create documentation

• Release as open-source


Acknowledgments

• Karl Kuehn, Software Image Developer

• Alex Schorsch, Student Developer

• Fangling Zhang, Student Developer

• Paul Nuyujukian, Student Developer

• Ian Comfort, Systems Administrator


Questions?


[email protected]

[email protected]

Evaluate! http://www.resnetsymposium.org/rspm/evaluation/