60

HKG15-206: Solving the year 2038 problem in Linux

  • Upload
    linaro

  • View
    442

  • Download
    1

Embed Size (px)

Citation preview

Page 1: HKG15-206: Solving the year 2038 problem in Linux
Page 2: HKG15-206: Solving the year 2038 problem in Linux

Presented by

Date

Event

The end of time(32-bit edition)Arnd Bergmann

Tina Ruchandani

Linaro Connect HKG 2015

Page 3: HKG15-206: Solving the year 2038 problem in Linux

Understanding the problem

Page 4: HKG15-206: Solving the year 2038 problem in Linux

UNDERSTANDING THE PROBLEM

Date and Time settings on an Android phone (Nexus 4) today.

No year beyond 2037?

Page 5: HKG15-206: Solving the year 2038 problem in Linux

2038 issue

● Unix/POSIX’s Y2K● time_t representing number of

seconds since Jan 1 1970.● 32bit systems can represent dates from:

○ Dec 13 1901○ Jan 19th 2038

● Less than 23 years to go

Page 6: HKG15-206: Solving the year 2038 problem in Linux

What does this mean for Linux?

● On Jan 19th 2038, time_t values overflow and go negative.

● Negative times for timers are considered invalid, so timers set beyond Jan 19 2038 fail

● Internal kernel timers set beyond Jan 19 2038 will never fire

● Until recently the kernel would hang as soon as time_t rolls negative

Page 7: HKG15-206: Solving the year 2038 problem in Linux

But we have 23 years!That’s tons of time!

Folks will just upgrade to 64bits by then!

Page 8: HKG15-206: Solving the year 2038 problem in Linux

Problem with that...

Lots and lots of 32bit ARM devices being deployed today that may have 23+ year life spans

Page 9: HKG15-206: Solving the year 2038 problem in Linux

1992 Honda Civic

Page 10: HKG15-206: Solving the year 2038 problem in Linux

1992 wasn’t so long ago

● So maybe not a “classic” car, but these are still on the road.

● People expect their radio to still work● Especially if they paid for the fancy

in-dash infotainment system.

Page 11: HKG15-206: Solving the year 2038 problem in Linux

1992 Lamborghini Diablo

Page 12: HKG15-206: Solving the year 2038 problem in Linux

* Assuming you can still get battery replacements then….

Page 13: HKG15-206: Solving the year 2038 problem in Linux

Other long deployment life systems

● Security systems● Utility monitoring sensors● Satellites● Medical devices● Industrial fabrication machines

As embedded processors gain power, these are more likely to be running general-purpose kernels like Linux

Page 14: HKG15-206: Solving the year 2038 problem in Linux

Finding a solution

Page 15: HKG15-206: Solving the year 2038 problem in Linux

Crux of the issue

● Moving to 64 bit hardware isn’t a realistic answer for everyone○ Even with 64 bit hardware, some users rely on 32 bit

user space● However, today’s 32 bit applications are

terminally broken

Page 16: HKG15-206: Solving the year 2038 problem in Linux

OpenBSD precedent

● Converted time_t to long long in 2013:http://www.openbsd.org/papers/eurobsdcon_2013_time_t/

● Broke ABI, but “distro” is self-contained so limited compatibility damage○ Lots of interesting thoughts there on the risks of

compatibility support delaying conversion

Page 17: HKG15-206: Solving the year 2038 problem in Linux

NetBSD precedent

● Added a new system call API in 2008● Kept ABI compatibility for existing binaries● No option to build against old ABI● Some user space (e.g. postgresql) broke

after recompiling

Page 18: HKG15-206: Solving the year 2038 problem in Linux

Our strategy for Linux

● Build time:○ support both 32-bit and 64-bit time_t○ Leave decision up to libc

● Run time:○ Like NetBSD, add a new 64-bit time_t ABI○ Support both system call ABIs by default○ Allow 32-bit time_t interface to be disabled

Page 19: HKG15-206: Solving the year 2038 problem in Linux

Kernel implications

● Have to change one bug at a time● Hundreds of drivers● 30-40 system calls● Dozens of ioctl commands● Incompatible changes for on-wire and on-

disk data

Page 20: HKG15-206: Solving the year 2038 problem in Linux

Just to be clear

● This won’t solve *all* 2038 issues● Just want to focus on solving the kernel issues and

give a path for applications to migrate to.● Applications likely do dumb things (which seemed

reasonable at the time) w/ time_t● If you’re ~40 years old, fixing this won’t hurt your

supplemental retirement planning○ There will still be lucrative contracts to validate and fix

applications.

Page 21: HKG15-206: Solving the year 2038 problem in Linux

Current status

● Core timekeeping code already fixed● OPW internship ongoing, lots of simple driver

fixes, but more broken code gets added● Linaro and others spending developer time● Known broken files down from 783 to 725

since v3.15, lots more work to dogit grep -wl '\(time_t\|struct timespec\|struct timeval\)'

Page 22: HKG15-206: Solving the year 2038 problem in Linux

Fixing drivers internals

Page 23: HKG15-206: Solving the year 2038 problem in Linux

Fixing drivers, typical codestruct timeval start_time, stop_time;do_gettimeofday(&start_time);...

do_gettimeofday(&stop_time);t = stop_time.tv_sec - start_time.tv_sec;t *= 1000000;if (stop_time.tv_usec < start_time.tv_usec)

t -= start_time.tv_usec - stop_time.tv_usec;else

t += stop_time.tv_usec - start_time.tv_usec;

Example from sound/pci/es1968.c

Page 24: HKG15-206: Solving the year 2038 problem in Linux

Fixing drivers, typical codestruct timeval start_time, stop_time;do_gettimeofday(&start_time);...

do_gettimeofday(&stop_time);t = stop_time.tv_sec - start_time.tv_sec;t *= 1000000;if (stop_time.tv_usec < start_time.tv_usec)

t -= start_time.tv_usec - stop_time.tv_usec;else

t += stop_time.tv_usec - start_time.tv_usec;

Trying to remove these

Page 25: HKG15-206: Solving the year 2038 problem in Linux

Fixing drivers, trivial fixstruct timespec64 start_time, stop_time;do_getnstimeofday64(&start_time);...

do_getnstimeofday64(&stop_time);t = stop_time.tv_sec - start_time.tv_sec;t *= 1000000000;if (stop_time.tv_nsec < start_time.tv_nsec)

t -= start_time.tv_nsec - stop_time.tv_nsec;else

t += stop_time.tv_nsec - start_time.tv_nsec;t /= 1000;

direct replacement type, using nanosecond resolution.

Code was actually safe already but not obviously so.

Page 26: HKG15-206: Solving the year 2038 problem in Linux

Fixing drivers, trivial fixstruct timespec64 start_time, stop_time;do_getnstimeofday64(&start_time);...

do_getnstimeofday64(&stop_time);t = stop_time.tv_sec - start_time.tv_sec;t *= 1000000000;if (stop_time.tv_nsec < start_time.tv_nsec)

t -= start_time.tv_nsec - stop_time.tv_nsec;else

t += stop_time.tv_nsec - start_time.tv_nsec;t /= 1000;

direct replacement type, using nanosecond resolution.

Code was actually safe already but not obviously so.

Possible overflow?

Page 27: HKG15-206: Solving the year 2038 problem in Linux

Fixing drivers, better fixktime_t start_time;start_time = ktime_get();...

t = ktime_us_delta(ktime_get(), start_time);

Using monotonic time also fixes concurrentsettimeofday() calls

Efficient, safe and easy to use helper functions improve drivers further

Page 28: HKG15-206: Solving the year 2038 problem in Linux

Fixing system calls

Page 29: HKG15-206: Solving the year 2038 problem in Linux

System calls, example time()SYSCALL_DEFINE1(time, time_t __user *, tloc)

{

time_t i = get_seconds();

if (put_user(i,tloc))

return -EFAULT;

return i;

}

#define __NR_time 13

Page 30: HKG15-206: Solving the year 2038 problem in Linux

System calls, example time()SYSCALL_DEFINE1(time, time_t __user *, tloc)

{

time_t i = get_seconds();

if (put_user(i, tloc))

return -EFAULT;

return i;

}

#define __NR_time 13

Need to fix for 32-bit

Page 31: HKG15-206: Solving the year 2038 problem in Linux

System calls, example time()SYSCALL_DEFINE1(time, __kernel_time64_t __user *, tloc)

{

__kernel_time64_t i = get_seconds64();

if (put_user(i,tloc))

return -EFAULT;

return i;

}

#define __NR_time 13

#define __NR_time64 367

Better, but now breaks compatibility

Page 32: HKG15-206: Solving the year 2038 problem in Linux

System calls, example time()#ifdef CONFIG_COMPAT_TIME

COMPAT_SYSCALL_DEFINE1(time, compat_time_t __user *, tloc)

{

compat_time_t i = (compat_time_t)get_seconds64();

if (put_user(i, tloc))

return -EFAULT;

return i;

}

#endif

Page 33: HKG15-206: Solving the year 2038 problem in Linux

System calls, traditional typestypedef long __kernel_time_t; /* user visible */

typedef __kernel_time_t time_t; /* kernel internal */

Page 34: HKG15-206: Solving the year 2038 problem in Linux

System calls, intermediate typestypedef long __kernel_time_t; /* user visible */

typedef __kernel_time_t time_t; /* kernel internal */

#ifdef CONFIG_COMPAT_TIME

typedef s64 __kernel_time64_t; /* user visible */

typedef s32 compat_time_t; /* kernel internal */

#else

typedef long __kernel_time64_t; /* internal HACK! */

#endif

Page 35: HKG15-206: Solving the year 2038 problem in Linux

System calls, final typestypedef long __kernel_time_t; /* user visible */

typedef __kernel_time_t time_t; /* kernel internal */

typedef s64 __kernel_time64_t; /* user visible */

#ifdef CONFIG_COMPAT_TIME

typedef s32 compat_time_t; /* kernel internal */

#endif

Page 36: HKG15-206: Solving the year 2038 problem in Linux

Fixing user space

Page 37: HKG15-206: Solving the year 2038 problem in Linux

Embedded distros

● Change libc to use 64-bit time_t● Recompile everything● Profit

Page 38: HKG15-206: Solving the year 2038 problem in Linux

Embedded distros

● Change libc to use 64-bit time_t● Recompile everything● Profit

● Caveat: ioctl

Page 39: HKG15-206: Solving the year 2038 problem in Linux

Embedded distros

● Change libc to use 64-bit time_t● Recompile everything● Profit

● Caveat: ioctl● Caveat 2: programs hardcoding 32-bit types

Page 40: HKG15-206: Solving the year 2038 problem in Linux

Standard distros

● Need to provide backwards compatibility● glibc to use symbol versioning● multi-year effort

Page 41: HKG15-206: Solving the year 2038 problem in Linux

Standard distros

● Need to provide backwards compatibility● glibc to use symbol versioning● multi-year effort

● Any 32-bit standard distros remaining in 2038? Maybe Debian

Page 42: HKG15-206: Solving the year 2038 problem in Linux

Fixing ioctl

Page 43: HKG15-206: Solving the year 2038 problem in Linux

Fixing ioctl commands in drivers

● Full audit of data structures needed● Some user space needs source changes● Recompiled user space tools may break on

old kernels● Some headers need #ifdef

to know user time_t size

Page 44: HKG15-206: Solving the year 2038 problem in Linux

Fixing file systems

Page 45: HKG15-206: Solving the year 2038 problem in Linux

y2038 and filesystems

Unmount followed bymount. Think of it as a reboot.

Oops!

Choosing xfs as an example.

Page 46: HKG15-206: Solving the year 2038 problem in Linux

y2038 and filesystems

Oops!

This is a 64-bit system

Page 47: HKG15-206: Solving the year 2038 problem in Linux

On-disk representation● Up to the filesystem● Example: Adding epochs in xfs.

Reinterpret seconds field● 8-bit padding → 255*136 years

Fixing filesystem timestamps

Page 48: HKG15-206: Solving the year 2038 problem in Linux

inode_operations● change in-inode fields● inode_time or timespec64? ● Rewriting getattr / setattr callbacks

Fixing filesystem timestamps

Page 49: HKG15-206: Solving the year 2038 problem in Linux

ioctl interface● Also controlled by filesystem● Rewrite ioctls to reinterpret time with epochs.● update xfsprogs to understand new format

Fixing filesystem timestamps

Page 50: HKG15-206: Solving the year 2038 problem in Linux

Questions?

Page 51: HKG15-206: Solving the year 2038 problem in Linux

Backup slides

Page 52: HKG15-206: Solving the year 2038 problem in Linux

Discussed solutions

● Unsigned time_t● New 64bit time_t ABI ● New kernel syscalls that provide 64bit time

values

Page 53: HKG15-206: Solving the year 2038 problem in Linux

Unsigned time_t

● Have the kernel interpret time_t’s as unsigned values● Would allow timers to function past 2038

○ Might work for applications that only deal with relative timers.● Still problematic

○ Could modify glibc to properly convert time_t to “modern” string○ Applications lose ability to describe pre-1970 events

● Could subtly change userspace headers?○ Probably not a good idea

Page 54: HKG15-206: Solving the year 2038 problem in Linux

● Define time_t as a long long● Provide personality compat support for

existing 32bit ABI● Applications recompiled w/ new ABI would

work past 2038, old ABI applications would work until 2038.

New Kernel ABI

Page 55: HKG15-206: Solving the year 2038 problem in Linux

New Kernel ABI Drawbacks

● Still issues with application bad behavior○ Internally casting time_t to longs○ Storing time_t in 32bit file/protocol formats

● Have to implement new compat functions○ Not a trivial amount of work

● New ABI is a big break, might want to “fix” more than just time_t○ Might get lots of “riders” on the change

Page 56: HKG15-206: Solving the year 2038 problem in Linux

Add new gettime64() interfaces

● New time64_t type● Also need interfaces for setting the time,

setting timers, querying timers, sleep interfaces, etc.

● 30-40 syscalls take time_t○ ioctls are even worse

● Lots of structures embed time_t

Page 57: HKG15-206: Solving the year 2038 problem in Linux

How usage of epoch works - watered-down example● 2 digit number: range is 0-99● Add 1 digit as “epoch” (or “extender digit”). ● Epoch is initially 0. ● When the 2 digit number turns 99, increment the

“epoch” by 1. True value = 100*epoch + 2-digit value

Digit 1 Digit 2“Epoch”

5 81 Congratulations!You have now extended the range from 0-99 to 0-999( and reinvented the decimal number system :-) )

Page 58: HKG15-206: Solving the year 2038 problem in Linux

Applying the epoch in code - I ...

Use some of the padding bytes to store bytes. We had 6 bytesof padding initially,we store 4 for 4 different 1-byte epochs and are left with 2 bytes of padding.

Page 59: HKG15-206: Solving the year 2038 problem in Linux

Applying the epoch in code - II ...

Replacing original timespecwith timespec + epoch “Is an epoch being used?” The bool answers that.

If an epoch is being used, the xfs_timestamp_sec functions reinterprets the seconds value, as (epoch value) * (range of original seconds) + (current value of seconds)→ (epoch value) = 0-255

Page 60: HKG15-206: Solving the year 2038 problem in Linux

inode time changes - watered down example

- setattr/getattr callbacks are common to all filesystems. Cannot change without breaking them.

- Example:- Assume: void foo_gettime(time1& aTime)- Assume: there are 2 users of foo_gettime. - Solution:- #define time2 time1- Change both users to use foo_getttime(time2& aTime)- Then define time2 to be whatever you want