Click here to load reader
Upload
anne-nicolas
View
362
Download
0
Embed Size (px)
DESCRIPTION
When designing a new interface between kernel and userspace, say a new ioctl, a new character device or a new syscall, I’m forgetting some basic issues and/or corner cases. This talk is about such pitfalls and how to address them. Yann Droneaud, OPTEYA
Citation preview
What I should not forgot when designing a new userspace <-> kernel
interface
Yann Droneaud
2014-09-24
What I should not forgot when designing a new userspace <-> kernel
interface
Where do you want to go today ?
I have this shiny new subsystem/driver/…
I need a mean to exchange data between userspace and kernel, so that my kernel code can do some work.
What are you going to do ?
I'm going to design a new
character device,
ioctl,
syscall.
Are you sure ?
I'm "Not Sure" !
You're probably not
Indeed.
There's always alternative
Am I really in desperate need for a new
character device,
ioctl,
syscall ?
You could try
It would be so much fun !
Show me your data structure
struct foo{ char a; unsigned long b;#ifdef CONFIG_EXTENDED void *c;#endif};
Don't use conditionals in public datastructure
struct foo{ char a; unsigned long b;#ifdef CONFIG_EXTENDED void *c;#endif};
I should have noted that's never gonna work: userspace compiled without define vs kernel compiled with the define:
ABI mismatch !
You're using the wrong data types (1/4)
struct foo{ char a; unsigned long b; void *c;};
Dependending on the ABI, char signedness would change, so I should use fixed signedness data type.
You're using the wrong data types (2/4)
struct foo{ u8 a; unsigned long b; void *c;};
Dependending on the ABI, sizeof(long) would change, so I should use fixed length data types.
You're using the wrong data types (3/4)
struct foo{ u8 a; u64 b; void *c;};
Depending on the ABI, sizeof(void *) would change, so I should use a fixed data type for pointer too.
I should also note that such pointer doesn't have __user annotation, which might hide some access violation warnings.
(Hint: install sparse, and use make C=2)
You're using the wrong data types (4/4)
struct foo{ u8 a; u64 b; u64 c;};
What might happen if I don't rewrite my data structure this way ?
You don't want to maintain a "compat" layer, do you ?
What's "compat" ?
It's not the layer you're looking for
I should probably avoid introducing new interfaces which require a compat layer.
You're still using the wrong data types
struct foo{ u8 a; u64 b; u64 c;};
I should not forgot that kernel data types are not exposed asis in userspace.
Kernel data types are prefixed with __.
Your data structure is almost correct
struct foo{ __u8 a; __u64 b; __u64 c;};
Let's talk about ABI
I should note the following about ABIs for some common architectures.
sizeof(u64) alignof(u64)
i386 8 4
x86_64 8 8
arm 8 8
In particular, i386 is special in this regard.
Your're digging hole
There's implicit padding in my data structure.
struct foo{ __u8 a; /* u8 padding[n]; with n in 3 to 7 */ __u64 b; __u64 c;};
Still down the rabbit hole
Is this better ?
struct foo{ __u64 b; __u64 c; __u8 a; /* u8 padding[n]; with n in 3 to 7 */};
Padding can be introduced at the end of the structure too.
You should know about padding
I've discovered many ways to create holes in data structure.
Such holes are possible discrepancy between ABIs: a compat layer might be needed for 32bits emulation on 64bits kernel.
It's also wasted space: my data structure is using more bytes than strictly required.
How to initialize what doesn't have a name
Since this padding is implicit, I don't initialize it.
As I'm allocating the data structure on stack, I'm going to leak information to userspace.
I will probably receive a semi-automatic email from Dan Carpenter who maintains and run smatch on kernel code.
You have to fill the hole
struct foo{ __u8 a; __u8 padding[7]; __u64 b; __u64 c;};
When filling hole, I have to use the largest alignment requirement, so I should not use i386 constraints.
You have to fill some headers for userspace
My data structure must be described in a header in include/uapi/linux.
It should not be in include/linux nor in drivers/... or whatever.
I could read The UAPI header file split, by Michael Kerrisk
http://lwn.net/Articles/507794/
Take care of ABI stability
Once designed, once merged, once exposed to userspace, my ABI cannot be changed without breaking existing userspace program.
So I should start thinking right now about extension.
What you can do to prevent such issues (1/2)
I have to ensure that data structure has no implicit padding, and alignment/size are the same for multi-ABI architectures (i386/amd64, sparc32/64,
ppc32/64, etc.).
What you can do to prevent such issues (2/2)
When reading data from userspace:
Check every unused bits for being 0
When writing data to userspace:
Set every unused bits to 0
Tools that can help you (gcc)
GCC
GCC has a nice warning option -Wpadded
$ gcc -Wpadded -c foo.c
foo.c:6:15: attention : padding struct to align ‘b’ [-Wpadded] __u64 b; ^
fuu.c:9:1: warning: padding struct size to alignment boundary [-Wpadded] }; ^
But I don't want to compile the whole kernel with this option.
Tools that can help you (pahole 1/5)
Pahole
I could use pahole to check the data structure layout.
Pahole is a tool written by Arnaldo Carvalho de Melo (acme).
http://git.kernel.org/cgit/devel/pahole/pahole.git
I can use it to look after padding, and I could use it to compare the layout of the structure for i386 and x86_64.
pahole and other DWARF2 utilities
http://lwn.net/Articles/206805/
Part of dwarves packages on some distribution.
You can check earlier structure with pahole (2/5)
$ gcc -c -m32 foo.c$ pahole foo.o
struct foo {__u8 a; /* 0 1 */
/* XXX 3 bytes hole, try to pack */
__u64 b; /* 4 8 */__u64 c; /* 12 8 */
/* size: 20, cachelines: 1, members: 3 *//* sum members: 17, holes: 1, sum holes: 3 *//* last cacheline: 20 bytes */};
You can check earlier reordered structure with pahole (3/5)
$ gcc -c -m32 foo.c$ pahole foo.o
struct foo {__u64 b; /* 0 8 */__u64 c; /* 8 8 */__u8 a; /* 16 1 */
/* size: 20, cachelines: 1, members: 3 *//* padding: 3 *//* last cacheline: 20 bytes */};
You can check ABI differences with pahole (4/5)
$ gcc -c -m32 foo.c -o foo-32.o$ pahole foo-32.o > foo-32.o.c
$ gcc -c -m64 foo.c -o foo-64.o$ pahole foo-64.o > foo-64.o.c
and diff (5/5)
$ diff -u foo-32.o.c foo-64.o.c
--- foo-32.o.c 2014-09-25 14:28:03.988105812 +0200+++ foo-64.o.c 2014-09-25 14:28:11.591073455 +0200@@ -3,7 +3,7 @@ __u64 c; /* 8 8 */ __u8 a; /* 16 1 */
-/* size: 20, cachelines: 1, members: 3 */-/* padding: 3 */-/* last cacheline: 20 bytes */+/* size: 24, cachelines: 1, members: 3 */+/* padding: 7 */+/* last cacheline: 24 bytes */ };
Tools that can help you (trinity)
Trinity
I could add my new kernel interface to Trinity.
Trinity is a Linux system call fuzzer written by Dave Jones.
http://codemonkey.org.uk/projects/trinity/
https://github.com/kernelslacker/trinity
LCA: The Trinity fuzz tester, by Michael Kerrisk
http://lwn.net/Articles/536173/
You could find some help
As usual, I should submit patches earlier in my development cycle to [email protected] in order to get some reviews.
But another mailing list may be of interest: The linux-api mailing list
https://www.kernel.org/doc/man-pages/linux-api-ml.html
Links
How to not invent kernel interfaces, by Arnd Bergmann
http://www.ukuug.org/events/linux2007/2007/papers/Bergmann.pdf
Botching up ioctls, by Daniel Vetter
http://blog.ffwll.ch/2013/11/botching-up-ioctls.html
Proper handling of unknown flags in system calls, by Michael Kerrisk
http://lwn.net/Articles/588444/
Flags as a system call API design pattern, by Michael Kerrisk
http://lwn.net/Articles/585415/
More Links
Fixing the unfixable autofs ABI, by Jonathan Corbet
https://lwn.net/Articles/494993/
System calls and 64-bit architectures, by Jake Edge
http://lwn.net/Articles/311630/
A perf ABI fix, by Jonathan Corbet
http://lwn.net/Articles/567894/
Questions ?
One last thing
Please use O_CLOEXEC by default !
Final words
Thank You !