Kernel Module
Taku Shimosawa
Feb. 21, 2015
For the new Linux kernel book (Pour le livre nouveau du Linux noyau)

Linux Kernel Module - For NLKB


Notes
• Linux kernel version: 3.19
• Quoted source code comes from kernel/module.c unless otherwise noted.

Kernel Module
• A feature for dynamically adding and removing kernel features while the kernel is running
• Benefits
  • Kernel features can be updated while the kernel is running
  • Memory consumption (and CPU overhead) is reduced by loading only the necessary kernel modules
  • Avoiding the GPL (a module is not required to comply with the GPL; e.g. proprietary drivers)
• Many kernel features can be compiled either statically linked into the kernel or as independent modules
  • File systems, device drivers, etc.
  • “TRISTATE” in Kconfig (y, m, or n)

Where is the kernel module?
• Linux kernel modules are ELF binaries with the extension “.ko”
• Many distributions place the kernel modules under /lib/modules
  • e.g. /lib/modules/3.13.0-44-generic/kernel (Ubuntu 14.10)
  • “depmod” scans the kernel modules under that directory to create the module dependency map (modules.dep)
• The “modprobe” utility loads a kernel module together with the modules it depends on by looking them up in the modules.dep file
• However, a module located anywhere can be loaded into the kernel if its path is specified explicitly.

What is the “dependency”?
• A kernel module can export “symbols” that may be used by another kernel module
  • A symbol: a name for a location in memory; a global variable or a function in C
• If a module (B) uses a symbol exported by another module (A), then module B depends on module A
• Thus, module A must be loaded before module B is loaded
• (There seems to be no way to load modules that have circular dependencies, e.g. A depends on B and B also depends on A)

Kernel module A:
    function f() {}
    EXPORT_SYMBOL(f);
Kernel module B (depends on A):
    function g() { f(); }
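The load-order rule can be sketched in user space as a depth-first walk over a dependency table. This is only an illustration: `struct mod`, `load_with_deps`, and the in-memory table are hypothetical names, and this is not how modprobe or the kernel actually store dependencies.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPS 4

/* Hypothetical in-memory description of one module and what it depends on. */
struct mod {
    const char *name;
    int deps[MAX_DEPS];   /* indices into the module table */
    int ndeps;
    int state;            /* 0 = not loaded, 1 = being visited, 2 = loaded */
};

/*
 * Load module `idx` after everything it depends on (depth-first).
 * Appends indices to `order` in load order and returns 0, or returns -1
 * when a circular dependency is found.
 */
int load_with_deps(struct mod *tbl, int idx, int *order, int *norder)
{
    assert(tbl != NULL && order != NULL && norder != NULL);

    if (tbl[idx].state == 2)
        return 0;               /* already loaded */
    if (tbl[idx].state == 1)
        return -1;              /* came back to a module in progress: cycle */

    tbl[idx].state = 1;
    for (int i = 0; i < tbl[idx].ndeps; i++)
        if (load_with_deps(tbl, tbl[idx].deps[i], order, norder) != 0)
            return -1;

    tbl[idx].state = 2;
    order[(*norder)++] = idx;
    return 0;
}
```

With A at index 0 and B (depending on A) at index 1, asking for B yields the order A, B; a table in which A and B depend on each other is rejected.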

Exported Symbols
• Symbols explicitly marked as “export” can be accessed by other kernel modules
  • The Linux kernel itself has “export”-ed symbols.
  • Kernel modules are allowed to use only the exported symbols in the kernel
  • Not all global functions are available to modules!
• The symbols to be exported are declared with the EXPORT_SYMBOL and EXPORT_SYMBOL_GPL macros.
  • The latter makes the symbol available only to GPL modules.

struct task_struct *pid_task(struct pid *pid, enum pid_type type)
{
    ...
}
EXPORT_SYMBOL(pid_task);
...
struct task_struct *get_pid_task(struct pid *pid, enum pid_type type)
{
    ...
}
EXPORT_SYMBOL_GPL(get_pid_task);

(kernel/pid.c)

(BTW)
• What makes the difference?

struct task_struct *pid_task(struct pid *pid, enum pid_type type)
{
    ...
}
EXPORT_SYMBOL(pid_task);

struct task_struct *get_pid_task(struct pid *pid, enum pid_type type)
{
    struct task_struct *result;
    rcu_read_lock();
    result = pid_task(pid, type);
    if (result)
        get_task_struct(result);
    rcu_read_unlock();
    return result;
}
EXPORT_SYMBOL_GPL(get_pid_task);

(kernel/pid.c)

Make a kernel module!
• Out-of-tree module
• The only necessary files are
  • Makefile
  • C source file(s)
• Example Makefile

obj-m += hello.o

KERN_BUILD=/lib/modules/$(shell uname -r)/build

all:
	make -C $(KERN_BUILD) M=$(PWD) modules

clean:
	make -C $(KERN_BUILD) M=$(PWD) clean

cf. obj-$(CONFIG_SHIMOS) = shimos.o

Inside the kernel module
• What sections are inside a kernel module?

$ readelf -a hello.ko
Section Headers:
  [Nr] Name               Type      Address           Offset    Size              EntSize           Flags  Link  Info  Align
  ...
  [ 2] .text              PROGBITS  0000000000000000  00000064  0000000000000000  0000000000000000  AX     0     0     1
  [ 3] .init.text         PROGBITS  0000000000000000  00000064  0000000000000016  0000000000000000  AX     0     0     1
  [ 4] .rela.init.text    RELA      0000000000000000  000009c0  0000000000000030  0000000000000018         16    3     8
  [ 5] .exit.text         PROGBITS  0000000000000000  0000007a  0000000000000006  0000000000000000  AX     0     0     1
  ...
  [ 7] .modinfo           PROGBITS  0000000000000000  00000091  00000000000000c1  0000000000000000  A      0     0     1
  [ 8] __versions         PROGBITS  0000000000000000  00000160  0000000000000080  0000000000000000  A      0     0     32
  ...
  [18] .gnu.linkonce.thi  PROGBITS  0000000000000000  00000280  0000000000000260  0000000000000000  WA     0     0     32

Sections

Section Name                 Description
.gnu.linkonce.this_module    Module structure
.modinfo                     String-style module information (licenses, etc.)
__versions                   Expected (compile-time) versions (CRCs) of the symbols that this module depends on
__ksymtab*                   Table of symbols that this module exports
__kcrctab*                   Table of versions of the symbols that this module exports
.init.*                      Sections used during initialization (__init)
.text, .data, etc.           The code and data

* (suffix of __ksymtab / __kcrctab): (none), _gpl, _gpl_future, _unused, _unused_gpl (license restriction / attribute of the symbols)

Module load and unload
• The simplest way: the “insmod” and “rmmod” commands

# insmod (file name) [parameters...]
(e.g.) # insmod helloworld.ko msg=hoge

# rmmod (module name)
(e.g.) # rmmod helloworld

• A more sophisticated way is “modprobe” and “modprobe -r”
  • The former also tries to load the modules that the specified module depends on
  • The latter also tries to unload the modules that the specified module depends on

How does insmod call the kernel?
• Source: kmod-19

KMOD_EXPORT int kmod_module_insert_module(struct kmod_module *mod,
                                          unsigned int flags,
                                          const char *options)
{
    ...
    if (kmod_file_get_direct(mod->file)) {
        unsigned int kernel_flags = 0;

        if (flags & KMOD_INSERT_FORCE_VERMAGIC)
            kernel_flags |= MODULE_INIT_IGNORE_VERMAGIC;
        if (flags & KMOD_INSERT_FORCE_MODVERSION)
            kernel_flags |= MODULE_INIT_IGNORE_MODVERSIONS;

        err = finit_module(kmod_file_get_fd(mod->file), args, kernel_flags);
        if (err == 0 || errno != ENOSYS)
            goto init_finished;
    }
    ...

(libkmod/libkmod-module.c)

System calls
• 3 module-related system calls
  • init_module
  • finit_module
    • To load a module
  • delete_module
    • To unload a module

int init_module(void *module_image, unsigned long len, const char *param_values);
int finit_module(int fd, const char *param_values, int flags);
int delete_module(const char *name, int flags);

(from man pages)

init_module / finit_module
• Load a kernel module
• How to specify the module?
  • init_module: by a user memory buffer that contains the kernel module image
  • finit_module: by a file descriptor for the kernel module file
• With finit_module, some flags can be specified

flags                             Description
MODULE_INIT_IGNORE_MODVERSIONS    Ignore symbol version hashes
MODULE_INIT_IGNORE_VERMAGIC       Ignore kernel version magic

delete_module
• Unload a kernel module
• The module to be unloaded is specified by its “name”
  • Why a different policy from finit_module...?
• Some flags can be specified

flags                   Description
O_NONBLOCK | O_TRUNC    Forcefully unloads the module (even when the ref count is not zero; taints the kernel)
O_NONBLOCK              Returns immediately with an error (EWOULDBLOCK)
O_NONBLOCK not set      Stops the module and waits until the ref count reaches zero (UNINTERRUPTIBLE)

Data structures for modules
• struct load_info
  • Used while initializing a module
  • Most members are ELF-related.

struct load_info {
    Elf_Ehdr *hdr;
    unsigned long len;
    Elf_Shdr *sechdrs;
    char *secstrings, *strtab;
    unsigned long symoffs, stroffs;
    struct _ddebug *debug;
    unsigned int num_debug;
    bool sig_ok;
    struct {
        unsigned int sym, str, mod, vers, info, pcpu;
    } index;
};

(include/linux/module.h)

Data structures for modules
• struct module (too large..)

struct module {
    enum module_state state;

    /* Member of list of modules */
    struct list_head list;

    /* Unique handle for this module */
    char name[MODULE_NAME_LEN];

    /* Sysfs stuff. */
    struct module_kobject mkobj;

    ...
    /* Exported symbols */
    const struct kernel_symbol *syms;
    const unsigned long *crcs;
    unsigned int num_syms;

    /* Kernel parameters. */
    struct kernel_param *kp;
    unsigned int num_kp;

(Callouts: list = the “modules” list; syms = exported symbols; crcs = symbol CRCs)

Data structures for modules (continued)

    /* GPL-only exported symbols. */
    unsigned int num_gpl_syms;
    const struct kernel_symbol *gpl_syms;
    const unsigned long *gpl_crcs;

    ...
#ifdef CONFIG_MODULE_SIG
    /* Signature was verified. */
    bool sig_ok;
#endif
    ...

    /* Exception table */
    unsigned int num_exentries;
    struct exception_table_entry *extable;

    /* Startup function. */
    int (*init)(void);

    /* If this is non-NULL, vfree after init() returns */
    void *module_init;

    ...
    /* Here is the actual code + data, vfree'd on unload. */
    void *module_core;

(Callouts: gpl_syms = GPL symbols; init = the “init” function; module_init = the “init” sections; module_core = the other (core) sections)

Data structures for modules (continued)

    /* Here are the sizes of the init and core sections */
    unsigned int init_size, core_size;

    /* The size of the executable code in each section. */
    unsigned int init_text_size, core_text_size;

    /* Size of RO sections of the module (text+rodata) */
    unsigned int init_ro_size, core_ro_size;

    /* Arch-specific module values */
    struct mod_arch_specific arch;

    ...
    /* The command line arguments (may be mangled). People
       like keeping pointers to this stuff */
    char *args;

    ...
#ifdef CONFIG_SMP
    /* Per-cpu data. */
    void __percpu *percpu;
    unsigned int percpu_size;
#endif

(Callouts: *_size = sizes of sections; args = command line parameters; percpu = per-CPU data)

Data structures for modules (continued)

    ...
#ifdef CONFIG_MODULE_UNLOAD
    /* What modules depend on me? */
    struct list_head source_list;
    /* What modules do I depend on? */
    struct list_head target_list;

    /* Destruction function. */
    void (*exit)(void);

    struct module_ref __percpu *refptr;
#endif

#ifdef CONFIG_CONSTRUCTORS
    /* Constructor functions. */
    ctor_fn_t *ctors;
    unsigned int num_ctors;
#endif
};

(include/linux/module.h)

(Callout: source_list / target_list = lists to manage dependencies, present only when unloading is enabled)

Module state
• state in struct module

state                    Description
MODULE_STATE_UNFORMED    Appeared in the modules list, but still being set up
MODULE_STATE_COMING      Fully formed. Running module_init.
MODULE_STATE_LIVE        Normal state.
MODULE_STATE_GOING       Being unloaded.

• During load, the state goes (created) -> UNFORMED -> COMING -> LIVE.
• During unload, the state goes LIVE -> GOING -> (removed)

Global module information

Variable                      Description
LIST_HEAD(modules)            List of modules that are in the kernel.
DEFINE_MUTEX(module_mutex)    Protects “modules”, etc.

• Add: RCU list operations
• Remove: stop_machine (up to 3.18)

/*
 * Mutex protects:
 * 1) List of modules (also safely readable with preempt_disable),
 * 2) module_use links,
 * 3) module_addr_min/module_addr_max.
 * (delete uses stop_machine/add uses RCU list operations).
 */
DEFINE_MUTEX(module_mutex);
EXPORT_SYMBOL_GPL(module_mutex);

Loading a Module
• Load the whole module file into memory
• Parse the ELF and module information
• Check the module information to determine whether the module is loadable or not
• Lay out the sections and copy them to the final location
• Add the module to the kernel
• Resolve the symbols and apply relocations
• Copy module parameters
• Call the init function

Call flow:
System calls
    load_module
        layout_and_allocate
            setup_load_info
            check_modinfo
            layout_sections
            layout_symtab
            move_module
        add_unformed_module
        simplify_symbols
        apply_relocations
        do_init_module

State during the flow: UNFORMED -> COMING -> LIVE

Unloading a Module
• Check whether the reference count of the module is zero
  • If it is zero, or if this is a forced unload, set the state to GOING
  • If it is not zero, the unload fails
• Call the “exit” function
• Free and clean up everything

Call flow:
sys_delete_module
    try_stop_module
        __try_stop_module
    free_module

stop_machine (-3.18)
• Up to Linux 3.18, the reference count check and the module removal during unloading are implemented with stop_machine.

static int try_stop_module(struct module *mod, int flags, int *forced)
{
    struct stopref sref = { mod, flags, forced };

    return stop_machine(__try_stop_module, &sref, NULL);
}

static void free_module(struct module *mod)
{
    ...
    mutex_lock(&module_mutex);
    stop_machine(__unlink_module, mod, NULL);
    mutex_unlock(&module_mutex);
    ...
}

Now (3.19)
• The reference count is now an atomic_t (it was a per-cpu int before) and is checked without stop_machine
  • (thanks to a mysterious guy)

static int try_stop_module(struct module *mod, int flags, int *forced)
{
    /* If it's not unused, quit unless we're forcing. */
    if (try_release_module_ref(mod) != 0) {
        *forced = try_force_unload(flags);
        if (!(*forced))
            return -EWOULDBLOCK;
    }

    /* Mark it as dying. */
    mod->state = MODULE_STATE_GOING;

    return 0;
}

Now (3.19)
• stop_machine is also gone from the removal path

static void free_module(struct module *mod)
{
    ...
    /* Now we can delete it from the lists */
    mutex_lock(&module_mutex);
    /* Unlink carefully: kallsyms could be walking list. */
    list_del_rcu(&mod->list);
    /* Remove this module from bug list, this uses list_del_rcu */
    module_bug_cleanup(mod);
    /* Wait for RCU synchronizing before releasing mod->list and buglist. */
    synchronize_rcu();
    mutex_unlock(&module_mutex);
    ...
}

Details (1): Loading

sys_init_module / sys_finit_module
• Initialize a load_info structure
• Check whether module loading is permitted (may_init_module function)
• [finit only] Check the flags
• [init only] Copy the module data from user memory to kernel memory (copy_module_from_user function)
• [finit only] Read from the fd into kernel memory (copy_module_from_fd function)
• Call the load_module function

may_init_module
• Capability: CAP_SYS_MODULE
• “modules_disabled” parameter
  • Blocks loading and unloading of modules

/* Block module loading/unloading? */
int modules_disabled = 0;
core_param(nomodule, modules_disabled, bint, 0);
...
static int may_init_module(void)
{
    if (!capable(CAP_SYS_MODULE) || modules_disabled)
        return -EPERM;

    return 0;
}

(kernel/module.c)

# sysctl kernel.modules_disabled
kernel.modules_disabled = 0

copy_module_from_fd
• Pass the file struct to the security module
• vmalloc an area for the module data
• Load the whole module file into the area
• Set the pointer to info->hdr

static int copy_module_from_fd(int fd, struct load_info *info)
{
    ...
    err = security_kernel_module_from_file(f.file);
    if (err)
        goto out;
    ...
    info->hdr = vmalloc(stat.size);
    if (!info->hdr) {
        err = -ENOMEM;
        goto out;
    }
    ...
    while (pos < stat.size) {
        bytes = kernel_read(f.file, pos, (char *)(info->hdr) + pos,
                            stat.size - pos);
        ...
    }
    info->len = pos;

copy_module_from_user
• Differences:
  • Passes a “NULL” pointer to the security module
  • Just copy_from_user instead of kernel_read

static int copy_module_from_user(const void __user *umod, unsigned long len,
                                 struct load_info *info)
{
    ...
    info->len = len;
    ...
    err = security_kernel_module_from_file(NULL);
    if (err)
        return err;
    ...
    /* Suck in entire file: we'll want most of it. */
    info->hdr = vmalloc(info->len);
    if (!info->hdr)
        return -ENOMEM;
    ...
    if (copy_from_user(info->hdr, umod, info->len) != 0) {
        vfree(info->hdr);
        return -EFAULT;
    }
    return 0;

load_module function (1)
• Signature check (module_sig_check)
• ELF header check (elf_header_check)
• Lay out and allocate the final location for the module (layout_and_allocate)
• Add the module to the “modules” list (add_unformed_module)
• Allocate per-cpu areas used in the module (percpu_modalloc)
• Initialize linked lists used for dependency management and unloading features (module_unload_init)
• Find optional sections (find_module_sections)
• License and version dirty hack (check_module_license_and_versions)
• Set up MODINFO_ATTR fields (setup_modinfo)

load_module function (2)
• Resolve the symbols (simplify_symbols)
• Fix up the addresses in the module (apply_relocations)
• Extable and per-cpu initialization (post_relocation)
• Flush the I-cache for the module area (flush_module_icache)
• Copy the module parameters to mod->args.
• Check for duplicated symbols, and set up NX attributes (complete_formation)
• Parse the module parameters (parse_args)
• sysfs setup (mod_sysfs_setup)
• Free the copy in the load_info structure (free_copy)
• Call the init function of the module (do_init_module)

module_sig_check
• Check the signature in the module (if CONFIG_MODULE_SIG=y)
  • If a module is signed, the “signature” and a “marker” reside at the tail of the module file.
  • If the signature is OK, module->sig_ok is set to true.
• If no signature is found (-ENOKEY) and signatures are not enforced, it returns success (0).
  • Signatures are enforced either
    • when CONFIG_MODULE_SIG_FORCE is Y, or
    • when the “sig_enforce” parameter is set

File layout: [ Module (ELF) ][ Signature ][ Marker ]
Marker: “~Module signature appended~\n”

$ hd /lib/modules/3.13.0-45-generic/kernel/fs/btrfs/btrfs.ko
0014b470  f8 a6 b7 74 01 06 01 1e  14 00 00 00 00 00 02 02  |...t............|
0014b480  7e 4d 6f 64 75 6c 65 20  73 69 67 6e 61 74 75 72  |~Module signatur|
0014b490  65 20 61 70 70 65 6e 64  65 64 7e 0a              |e appended~.|
0014b49c
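Detecting the marker at the tail of a module image is a plain byte comparison, which a user-space sketch can show. The function name `has_signature_marker` is hypothetical, and the macro name is assumed for illustration; only the marker string itself comes from the slide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Marker string shown above; the macro name is an assumption. */
#define MODULE_SIG_STRING "~Module signature appended~\n"

/* Return true when the last bytes of the module image are the marker. */
bool has_signature_marker(const void *image, size_t len)
{
    const size_t markerlen = sizeof(MODULE_SIG_STRING) - 1;

    assert(image != NULL);

    if (len < markerlen)
        return false;
    return memcmp((const char *)image + len - markerlen,
                  MODULE_SIG_STRING, markerlen) == 0;
}
```

This only finds the marker; it does not locate or verify the signature that precedes it.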

elf_header_check
• Sanity check for the ELF header
  • The magic number is correct
  • The architecture is correct
  • The length is large enough to contain all the section headers, etc.

static int elf_header_check(struct load_info *info)
{
    if (info->len < sizeof(*(info->hdr)))
        return -ENOEXEC;

    if (memcmp(info->hdr->e_ident, ELFMAG, SELFMAG) != 0
        || info->hdr->e_type != ET_REL
        || !elf_check_arch(info->hdr)
        || info->hdr->e_shentsize != sizeof(Elf_Shdr))
        return -ENOEXEC;

    if (info->hdr->e_shoff >= info->len
        || (info->hdr->e_shnum * sizeof(Elf_Shdr) >
            info->len - info->hdr->e_shoff))
        return -ENOEXEC;

    return 0;
}
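The same checks can be tried on a .ko image from user space. The sketch below assumes a 64-bit image and a Linux system providing `<elf.h>`; the function name `check_ko_header` is hypothetical, the image is assumed to be suitably aligned for `Elf64_Ehdr`, and the architecture check is left out because `elf_check_arch` is kernel-internal.

```c
#include <assert.h>
#include <elf.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * User-space rendition of the header sanity checks for a 64-bit image.
 * Returns 0 when the header looks like a relocatable ELF whose section
 * header table fits inside the image, -ENOEXEC otherwise.
 */
int check_ko_header(const void *image, size_t len)
{
    const Elf64_Ehdr *hdr = image;

    assert(image != NULL);

    /* The image must at least hold a complete ELF header. */
    if (len < sizeof(*hdr))
        return -ENOEXEC;

    /* Magic number, relocatable type, and expected section header size. */
    if (memcmp(hdr->e_ident, ELFMAG, SELFMAG) != 0 ||
        hdr->e_type != ET_REL ||
        hdr->e_shentsize != sizeof(Elf64_Shdr))
        return -ENOEXEC;

    /* The whole section header table must lie inside the image. */
    if (hdr->e_shoff >= len ||
        (size_t)hdr->e_shnum * sizeof(Elf64_Shdr) > len - hdr->e_shoff)
        return -ENOEXEC;

    return 0;
}
```

Passing the contents of a file read or mapped from disk, together with its size, gives a quick way to see which of the three conditions a malformed image trips.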

ELF Header
[Figure: layout of an ELF (.ko) file. The Elf_Ehdr at the head of the file (fields e_ident, e_type, e_shoff, e_shentsize, e_shnum, e_shstrndx) points, via e_shoff, to an array of e_shnum Elf_Shdr entries of e_shentsize bytes each.]
• load_info.hdr (Elf_Ehdr *) = the head of the kernel module file = the head of the ELF = pointer to the Elf_Ehdr
• e_ident: magic (‘\x7fELF’), 32/64-bit, etc. (16 bytes in total incl. padding)
• e_type: ET_REL / ET_EXEC / ET_DYN

layout_and_allocate
• Fill in the section information of the load_info, and create a module structure pointing to the temporary location (setup_load_info)
• Check the module information and report if the module taints the kernel (check_modinfo)
• Calculate the size required for the final location of the module (layout_sections / layout_symtab)
• Allocate memory of the calculated size, copy the contents of the module, and move the pointer of the module structure there (move_module).

setup_load_info
• Set the following members according to the ELF header and section headers.
  • sechdrs (pointer to the section headers)
  • secstrings (pointer to the string section that contains section names)
  • index.info, index.vers (section indices of modinfo, versions)
  • index.sym, index.str (section indices of symbols, strings)
  • strtab (pointer to the string section)
  • index.mod (section index of the module section)
    • “.gnu.linkonce.this_module” section
    • Set the module pointer to this section (temporarily)
  • index.pcpu (section index of the per-cpu section)
    • “.data..percpu” section (if it exists)
• Return a pointer to a (temporary) module structure

setup_load_info (example)
• info->sechdrs, info->secstrings, info->strtab
  • Each section’s offset is stored in Elf_Shdr.sh_offset
• info->index.info = 12
• info->index.vers = 16
• info->index.sym = 24
• info->index.str = 25
• info->index.mod = 18
  • struct module *mod
• info->index.pcpu = 0
  • No per-cpu data in this example.

[Figure: the Elf_Ehdr followed by section headers Elf_Shdr (0), (12) .modinfo, (16) __versions, (18) .gnu.linkonce.this_module, (23) .shstrtab, (24) .symtab, (25) .strtab; each header points to its section contents (.shstrtab section, .strtab section, .gnu.linkonce.this_module section).]

check_modinfo (1)
• Check the “modinfo” in the module: check that the version magic is identical to that of the current kernel, and mark the kernel “tainted” if the module taints it.
• “Modinfo” resides in the “.modinfo” section, and is composed of zero-terminated strings of key-value pairs connected by “=”.

description=Hello world kernel module\0
author=Taku Shimosawa <[email protected]>\0
license=GPL v2\0
srcversion=8D5BACDC1EA9421ABFF79DD\0
depends=\0
vermagic=3.13.0-44-generic SMP mod_unload modversions
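Looking up one key in such a buffer amounts to walking the zero-terminated strings one by one. The following is a user-space sketch of that walk, not the kernel's `get_modinfo`; the name `modinfo_lookup` and its signature are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Look up `tag` in a .modinfo-style buffer: a run of "key=value" strings,
 * each terminated by '\0'. Returns a pointer to the value, or NULL when
 * the key is absent.
 */
const char *modinfo_lookup(const char *buf, size_t size, const char *tag)
{
    const char *p, *end;
    size_t taglen;

    assert(buf != NULL && tag != NULL);

    taglen = strlen(tag);
    end = buf + size;

    for (p = buf; p < end; ) {
        const char *nul = memchr(p, '\0', (size_t)(end - p));
        size_t n;

        if (nul == NULL)
            break;              /* unterminated trailing bytes: ignore */
        n = (size_t)(nul - p);

        if (n > taglen && strncmp(p, tag, taglen) == 0 && p[taglen] == '=')
            return p + taglen + 1;
        p += n + 1;             /* skip this string and its terminator */
    }
    return NULL;
}
```

For the example above, looking up "license" would return "GPL v2", "depends" an empty string, and "intree" NULL.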

check_modinfo (2)
• First, check the version magic in the module

static int check_modinfo(struct module *mod, struct load_info *info, int flags)
{
    const char *modmagic = get_modinfo(info, "vermagic");
    ...
    if (flags & MODULE_INIT_IGNORE_VERMAGIC)
        modmagic = NULL;
    ...
    if (!modmagic) {
        err = try_to_force_load(mod, "bad vermagic");
        if (err)
            return err;
    } else if (!same_magic(modmagic, vermagic, info->index.vers)) {
        pr_err("%s: version magic '%s' should be '%s'\n",
               mod->name, modmagic, vermagic);
        return -ENOEXEC;
    }

check_modinfo (3)
• Version magic

#define VERMAGIC_STRING \
    UTS_RELEASE " " \
    MODULE_VERMAGIC_SMP MODULE_VERMAGIC_PREEMPT \
    MODULE_VERMAGIC_MODULE_UNLOAD MODULE_VERMAGIC_MODVERSIONS \
    MODULE_ARCH_VERMAGIC

(include/linux/vermagic.h)

• Example:

3.13.0-44-generic SMP mod_unload modversions

• same_magic function
  • Compares the vermagic strings; if the module has symbol CRCs, the leading kernel release part is skipped and only the rest is compared.
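The comparison can be sketched in user space as follows. This is written from the author's description plus recollection of the kernel's `same_magic` under CONFIG_MODVERSIONS, so treat it as an approximation and consult kernel/module.c for the authoritative code; the name `same_magic_sketch` is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Compare two version magic strings. When the module carries symbol CRCs,
 * the leading kernel release (everything before the first space) is
 * skipped, so only the remaining feature flags have to match.
 */
bool same_magic_sketch(const char *amagic, const char *bmagic, bool has_crcs)
{
    assert(amagic != NULL && bmagic != NULL);

    if (has_crcs) {
        amagic += strcspn(amagic, " ");
        bmagic += strcspn(bmagic, " ");
    }
    return strcmp(amagic, bmagic) == 0;
}
```

Under this sketch, a module built for 3.13.0-44-generic compares equal to a 3.13.0-45-generic kernel only when CRCs are present; a differing flag such as a missing "preempt" fails in both modes.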

check_modinfo (4)
• ...and mark the kernel tainted if necessary

    if (!get_modinfo(info, "intree"))
        add_taint_module(mod, TAINT_OOT_MODULE, LOCKDEP_STILL_OK);

    if (get_modinfo(info, "staging")) {
        add_taint_module(mod, TAINT_CRAP, LOCKDEP_STILL_OK);
        pr_warn("%s: module is from the staging directory, the quality "
                "is unknown, you have been warned.\n", mod->name);
    }

    /* Set up license info based on the info section */
    set_license(mod, get_modinfo(info, "license"));

check_modinfo (5)
• License information is also important

static void set_license(struct module *mod, const char *license)
{
    if (!license)
        license = "unspecified";

    if (!license_is_gpl_compatible(license)) {
        if (!test_taint(TAINT_PROPRIETARY_MODULE))
            pr_warn("%s: module license '%s' taints kernel.\n",
                    mod->name, license);
        add_taint_module(mod, TAINT_PROPRIETARY_MODULE,
                         LOCKDEP_NOW_UNRELIABLE);
    }
}

check_modinfo (6)
• GPL compatible?
  • See the “GPL\0....” case

static inline int license_is_gpl_compatible(const char *license)
{
    return (strcmp(license, "GPL") == 0
            || strcmp(license, "GPL v2") == 0
            || strcmp(license, "GPL and additional rights") == 0
            || strcmp(license, "Dual BSD/GPL") == 0
            || strcmp(license, "Dual MIT/GPL") == 0
            || strcmp(license, "Dual MPL/GPL") == 0);
}

(include/linux/license.h)

check_modinfo (7)
• Also, the kernel is marked tainted when the module is loaded forcefully

static int try_to_force_load(struct module *mod, const char *reason)
{
#ifdef CONFIG_MODULE_FORCE_LOAD
    if (!test_taint(TAINT_FORCED_MODULE))
        pr_warn("%s: %s: kernel tainted.\n", mod->name, reason);
    add_taint_module(mod, TAINT_FORCED_MODULE, LOCKDEP_NOW_UNRELIABLE);
    return 0;
#else
    return -ENOEXEC;
#endif
}

Taints!
• The tainted mask is composed of several flags that identify the reasons for tainting
• Lockdep is disabled if it will no longer work well
  • e.g. ignoring the version magic, proprietary drivers, forced unload

void add_taint(unsigned flag, enum lockdep_ok lockdep_ok)
{
    if (lockdep_ok == LOCKDEP_NOW_UNRELIABLE && __debug_locks_off())
        pr_warn("Disabling lock debugging due to kernel taint\n");

    set_bit(flag, &tainted_mask);
}

(kernel/panic.c)

static inline void add_taint_module(struct module *mod, unsigned flag,
                                    enum lockdep_ok lockdep_ok)
{
    add_taint(flag, lockdep_ok);
    mod->taints |= (1U << flag);
}

(kernel/module.c)

(Callouts: tainted_mask = kernel global flags; mod->taints = per-module flags)

Taints!
• 15 reasons are defined

#define TAINT_PROPRIETARY_MODULE        0
#define TAINT_FORCED_MODULE             1
#define TAINT_CPU_OUT_OF_SPEC           2
#define TAINT_FORCED_RMMOD              3
#define TAINT_MACHINE_CHECK             4
#define TAINT_BAD_PAGE                  5
#define TAINT_USER                      6
#define TAINT_DIE                       7
#define TAINT_OVERRIDDEN_ACPI_TABLE     8
#define TAINT_WARN                      9
#define TAINT_CRAP                      10
#define TAINT_FIRMWARE_WORKAROUND       11
#define TAINT_OOT_MODULE                12
#define TAINT_UNSIGNED_MODULE           13
#define TAINT_SOFTLOCKUP                14

(include/linux/kernel.h)

$ sysctl kernel.tainted
kernel.tainted = 12288

12288 = 0x3000
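Reading a value such as 12288 means testing each bit against the flag numbers listed above. A small user-space sketch, in which `taint_names` and `decode_taint` are hypothetical names and the table simply mirrors the #define list:

```c
#include <assert.h>
#include <stddef.h>

/* Flag names indexed by bit number, copied from the list above. */
static const char *const taint_names[] = {
    "PROPRIETARY_MODULE", "FORCED_MODULE", "CPU_OUT_OF_SPEC", "FORCED_RMMOD",
    "MACHINE_CHECK", "BAD_PAGE", "USER", "DIE", "OVERRIDDEN_ACPI_TABLE",
    "WARN", "CRAP", "FIRMWARE_WORKAROUND", "OOT_MODULE", "UNSIGNED_MODULE",
    "SOFTLOCKUP",
};

/*
 * Store the names of the flags set in `mask` into `out` (at most `max`
 * entries) and return how many were stored.
 */
size_t decode_taint(unsigned long mask, const char **out, size_t max)
{
    const size_t nflags = sizeof(taint_names) / sizeof(taint_names[0]);
    size_t n = 0;

    assert(out != NULL);

    for (size_t bit = 0; bit < nflags; bit++)
        if ((mask & (1UL << bit)) && n < max)
            out[n++] = taint_names[bit];
    return n;
}
```

Since 0x3000 has bits 12 and 13 set, the example value decodes to OOT_MODULE and UNSIGNED_MODULE: an out-of-tree, unsigned module has been loaded.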

layout_sections
• Calculate the size of the final memory needed to load the module
  • Only sections with the “SHF_ALLOC” flag set are loaded
  • Sizes are calculated for “core” and “init”
    • A section is an “init” section when its name starts with “.init”
• Sets the following members of the module
  • core_size: sum of the sizes of the “core” sections to be loaded
  • core_text_size, core_ro_size: sum of the sizes of the text and R/O “core” sections
  • init_size: sum of the sizes of the “init” sections to be loaded
  • init_text_size, init_ro_size: ... of the “init” sections
• sh_entsize in Elf_Shdr is used to hold the offset in memory where the section will be loaded.
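The core/init split can be illustrated with a simplified user-space calculation. The sketch assumes a Linux `<elf.h>` for `SHF_ALLOC`; `struct sec` and `sum_layout` are hypothetical, and alignment as well as the text / read-only / writable ordering that the real function handles are deliberately ignored.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of one section: just what the rule needs. */
struct sec {
    const char *name;
    unsigned long flags;   /* ELF sh_flags, e.g. SHF_ALLOC */
    unsigned long size;    /* sh_size */
};

/*
 * Add up the sizes of the sections that would be loaded, split into
 * "init" (name starts with ".init") and "core" (everything else).
 */
void sum_layout(const struct sec *secs, size_t n,
                unsigned long *core_size, unsigned long *init_size)
{
    assert(secs != NULL && core_size != NULL && init_size != NULL);

    *core_size = 0;
    *init_size = 0;

    for (size_t i = 0; i < n; i++) {
        if (!(secs[i].flags & SHF_ALLOC))
            continue;                       /* not loaded at all */
        if (strncmp(secs[i].name, ".init", 5) == 0)
            *init_size += secs[i].size;
        else
            *core_size += secs[i].size;
    }
}
```

Fed with a few sections from the hello.ko listing, .init.text (0x16 bytes) counts toward init, .exit.text and .gnu.linkonce.this_module toward core, and .rela.init.text is skipped because it lacks SHF_ALLOC.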

layout_sections
• The sections in the example “hello.ko” are categorized as follows:

Part      Kind        Sections
Core      Text        .text, .exit.text
Core      R/O         __ksymtab, __kcrctab, .rodata.str1.1, __ksymtab_strings, __mcount_loc
Core      R/W         .data, .gnu.linkonce.this_module, .bss
Init      Text        .init.text
Init      R/O         (none)
Init      R/W         (none)
(Others)  Not loaded  .rela.text, .rela.init.text, .rela__ksymtab, .rela__kcrctab, .rela__mcount_loc, .rela.gnu.linkonce.this_module, .comment, .note.GNU-stack, .shstrtab, .symtab, .strtab, .modinfo (*), __versions (*)

(*) These two sections originally have SHF_ALLOC, but the flag is dropped by rewrite_section_headers

layout_symtab
• Put the symtab and strtab at the end of the init part
  • (Actually this function does not place them; it only increases init_size by the size of the symtab)
• Put the symtab and strtab for the core symbols at the end of the core part.

move_module
• Allocate the final memory for the module, and update the boundary addresses for the modules (module_alloc_update_bounds)
• Copy the section contents and update the sh_addr values

static void *module_alloc_update_bounds(unsigned long size)
{
    void *ret = module_alloc(size);

    if (ret) {
        mutex_lock(&module_mutex);
        if ((unsigned long)ret < module_addr_min)
            module_addr_min = (unsigned long)ret;
        if ((unsigned long)ret + size > module_addr_max)
            module_addr_max = (unsigned long)ret + size;
        mutex_unlock(&module_mutex);
    }
    return ret;
}

module_alloc : x86
• x86
  • get_module_load_offset() determines the load offset as a random value on its first call if KASLR is enabled

#define MODULES_VADDR VMALLOC_START
#define MODULES_END VMALLOC_END
(arch/x86/include/asm/pgtable_32_types.h)

#define MODULES_VADDR (__START_KERNEL_map + KERNEL_IMAGE_SIZE)
#define MODULES_END _AC(0xffffffffff000000, UL)
(arch/x86/include/asm/pgtable_64_types.h)

void *module_alloc(unsigned long size)
{
    if (PAGE_ALIGN(size) > MODULES_LEN)
        return NULL;
    return __vmalloc_node_range(size, 1,
                                MODULES_VADDR + get_module_load_offset(),
                                MODULES_END, GFP_KERNEL | __GFP_HIGHMEM,
                                PAGE_KERNEL_EXEC, NUMA_NO_NODE,
                                __builtin_return_address(0));
}
(arch/x86/kernel/module.c)

module_alloc : ARM
• ARM

#ifndef CONFIG_THUMB2_KERNEL
#define MODULES_VADDR (PAGE_OFFSET - SZ_16M)
#else
/* smaller range for Thumb-2 symbols relocation (2^24)*/
#define MODULES_VADDR (PAGE_OFFSET - SZ_8M)
#endif
(arch/arm/include/asm/memory.h)

#define MODULES_END (PAGE_OFFSET)
#define MODULES_VADDR (MODULES_END - SZ_64M)
(arch/arm64/include/asm/memory.h)

#ifdef CONFIG_MMU
void *module_alloc(unsigned long size)
{
    return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
                                GFP_KERNEL, PAGE_KERNEL_EXEC, NUMA_NO_NODE,
                                __builtin_return_address(0));
}
#endif
(arch/arm/kernel/module.c)

module to final place
• Until now, the struct module for the module being loaded pointed into the temporary module image memory
• Now that the module has been copied to its final location, the pointer is also changed to point to the final location

    /* Module has been copied to its final place now: return it. */
    mod = (void *)info->sechdrs[info->index.mod].sh_addr;

load_module function (1) [RE]
• Signature check (module_sig_check)
• ELF header check (elf_header_check)
• Lay out and allocate the final location for the module (layout_and_allocate)
• Add the module to the “modules” list (add_unformed_module)
• Allocate per-cpu areas used in the module (percpu_modalloc)
• Initialize linked lists used for dependency management and unloading features (module_unload_init)
• Find optional sections (find_module_sections)
• License and version dirty hack (check_module_license_and_versions)
• Set up MODINFO_ATTR fields (setup_modinfo)

add_unformed_module
• Add the module to the “modules” list
• Check for duplicated loading of the same module
  • If the same module is still being loaded, wait for that load to complete, and then try again
  • This covers the case in which the other load of the module fails

add_unformed_module

static int add_unformed_module(struct module *mod)
{
    mod->state = MODULE_STATE_UNFORMED;
    ...
again:
    mutex_lock(&module_mutex);
    old = find_module_all(mod->name, strlen(mod->name), true);
    if (old != NULL) {
        if (old->state == MODULE_STATE_COMING
            || old->state == MODULE_STATE_UNFORMED) {
            mutex_unlock(&module_mutex);
            err = wait_finished_loading(mod);
            if (err)
                goto out_unlocked;
            goto again;
        }
        err = -EEXIST;
        goto out;
    }
    list_add_rcu(&mod->list, &modules);
    err = 0;
    ...

When loading occurs concurrently
[Figure: timeline with three rows. Module A: UNFORMED, COMING, LIVE. Module A (second, concurrent load): UNFORMED, then (fail). Module B (depends on A): UNFORMED, Resolve, Resolve, LIVE. Waiters are woken by wakeup_all (@do_init_module).]

percpu_modalloc
• Allocate a per-cpu area of the size of the per-cpu section

static int percpu_modalloc(struct module *mod, struct load_info *info)
{
    Elf_Shdr *pcpusec = &info->sechdrs[info->index.pcpu];
    unsigned long align = pcpusec->sh_addralign;

    if (!pcpusec->sh_size)
        return 0;
    ...
    mod->percpu = __alloc_reserved_percpu(pcpusec->sh_size, align);
    if (!mod->percpu) {
        pr_warn("%s: Could not allocate %lu bytes percpu data\n",
                mod->name, (unsigned long)pcpusec->sh_size);
        return -ENOMEM;
    }
    mod->percpu_size = pcpusec->sh_size;
    return 0;
}

module_unload_init
• Initialize the reference counter for the module
  • After this function, it is 2.
• Initialize the lists that manage dependencies
  • source_list: list of “usages” in which the module is contained as their “source” (= the list of modules that use the symbols of this module)
  • target_list: list of “usages” in which the module is contained as their “target” (= the list of modules whose symbols this module uses)

static int module_unload_init(struct module *mod)
{
    atomic_set(&mod->refcnt, MODULE_REF_BASE);

    INIT_LIST_HEAD(&mod->source_list);
    INIT_LIST_HEAD(&mod->target_list);

    atomic_inc(&mod->refcnt);

    return 0;
}

Page 63: Linux Kernel Module - For NLKB

find_module_sections
• Find additional sections in the module
  • Mostly related to symbol tables and tracers

Sections

__param

__ksymtab

__kcrctab

__ksymtab_gpl

__kcrctab_gpl

__ksymtab_gpl_future

__kcrctab_gpl_future

__ksymtab_unused

__kcrctab_unused

__ksymtab_unused_gpl

__kcrctab_unused_gpl

Sections

.ctors / .init_array

__tracepoints_ptrs

__jump_table

_ftrace_events

__trace_printk_fmt

__mcount_loc

__ex_table

__verbose

Page 64: Linux Kernel Module - For NLKB

check_module_license_and_versions
• Special-case handling for specific modules
  • e.g. the ndiswrapper driver itself may be GPL (it needs symbols exported only to GPL modules), but the drivers it loads are not GPL, so the kernel is marked tainted

static int check_module_license_and_versions(struct module *mod)
{
    if (strcmp(mod->name, "ndiswrapper") == 0)
        add_taint(TAINT_PROPRIETARY_MODULE, LOCKDEP_NOW_UNRELIABLE);

    /* driverloader was caught wrongly pretending to be under GPL */
    if (strcmp(mod->name, "driverloader") == 0)
        add_taint_module(mod, TAINT_PROPRIETARY_MODULE,
                         LOCKDEP_NOW_UNRELIABLE);

    /* lve claims to be GPL but upstream won't provide source */
    if (strcmp(mod->name, "lve") == 0)
        add_taint_module(mod, TAINT_PROPRIETARY_MODULE,
                         LOCKDEP_NOW_UNRELIABLE);

Page 65: Linux Kernel Module - For NLKB

check_module_license_and_versions
• Check whether the exported symbols have CRCs (versions)

#ifdef CONFIG_MODVERSIONS
    if ((mod->num_syms && !mod->crcs)
        || (mod->num_gpl_syms && !mod->gpl_crcs)
        || (mod->num_gpl_future_syms && !mod->gpl_future_crcs)
#ifdef CONFIG_UNUSED_SYMBOLS
        || (mod->num_unused_syms && !mod->unused_crcs)
        || (mod->num_unused_gpl_syms && !mod->unused_gpl_crcs)
#endif
        ) {
        return try_to_force_load(mod,
                                 "no versions for exported symbols");
    }
#endif
    return 0;

Page 66: Linux Kernel Module - For NLKB

setup_modinfo
• Call the “setup” callback of each module attribute
  • Only “version” and “srcversion” have a “setup” callback.
• Module attributes
  • version, srcversion
  • uevent
  • initstate
  • coresize, initsize
  • taint
  • refcnt

#define MODINFO_ATTR(field) \
static void setup_modinfo_##field(struct module *mod, const char *s) \
{ \
    mod->field = kstrdup(s, GFP_KERNEL); \
} \

Page 67: Linux Kernel Module - For NLKB

load_module function (2) [Re]
• Resolve the symbols (simplify_symbols)
• Fix up the addresses in the module (apply_relocations)
• Extable and per-cpu initialization (post_relocation)
• Flush I-cache for the module area (flush_module_icache)
• Copy the module parameters to mod->args.
• Check duplication of symbols, and set up NX attributes. (complete_formation)
• Parse the module parameters (parse_args)
• sysfs setup (mod_sysfs_setup)
• Free the copy in the load_info structure (free_copy)
• Call the init function of the module (do_init_module)

Page 68: Linux Kernel Module - For NLKB

simplify_symbols
• Replace the values of the unresolved (undefined) symbols in the “symtab” section with their actual addresses

static int simplify_symbols(struct module *mod, const struct load_info *info)
{
    Elf_Shdr *symsec = &info->sechdrs[info->index.sym];
    Elf_Sym *sym = (void *)symsec->sh_addr;
    ...
    for (i = 1; i < symsec->sh_size / sizeof(Elf_Sym); i++) {
        const char *name = info->strtab + sym[i].st_name;
        ...
        case SHN_UNDEF:
            ksym = resolve_symbol_wait(mod, info, name);
            /* Ok if resolved. */
            if (ksym && !IS_ERR(ksym)) {
                sym[i].st_value = ksym->value;
                break;
            }
            /* Ok if weak. */
            if (!ksym && ELF_ST_BIND(sym[i].st_info) == STB_WEAK)
                break;

Page 69: Linux Kernel Module - For NLKB

resolve_symbol_wait
• Waits if the symbol belongs to a module that is still being initialized.

static const struct kernel_symbol *
resolve_symbol_wait(struct module *mod,
                    const struct load_info *info,
                    const char *name)
{
    const struct kernel_symbol *ksym;
    char owner[MODULE_NAME_LEN];

    if (wait_event_interruptible_timeout(module_wq,
            !IS_ERR(ksym = resolve_symbol(mod, info, name, owner))
            || PTR_ERR(ksym) != -EBUSY,
            30 * HZ) <= 0) {
        pr_warn("%s: gave up waiting for init of module %s.\n",
                mod->name, owner);
    }
    return ksym;
}

Page 70: Linux Kernel Module - For NLKB

resolve_symbol
• Find the symbol in the kernel's symbol tables and in other modules' symbol tables (find_symbol)
• If found, check that the version (CRC) of the symbol matches the one the module expects (check_versions)
  • Then record the dependency of the module being loaded on the module that owns the symbol (ref_module)

Page 71: Linux Kernel Module - For NLKB

find_symbol (1)
• First, search the kernel's own symbol tables

bool each_symbol_section(bool (*fn)(const struct symsearch *arr,
                                    struct module *owner,
                                    void *data),
                         void *data)
{
    struct module *mod;
    static const struct symsearch arr[] = {
        { __start___ksymtab, __stop___ksymtab, __start___kcrctab,
          NOT_GPL_ONLY, false },
        { __start___ksymtab_gpl, __stop___ksymtab_gpl,
          __start___kcrctab_gpl,
          GPL_ONLY, false },
        { __start___ksymtab_gpl_future, __stop___ksymtab_gpl_future,
          __start___kcrctab_gpl_future,
          WILL_BE_GPL_ONLY, false },
        ...
    };

    if (each_symbol_in_section(arr, ARRAY_SIZE(arr), NULL, fn, data))
        return true;

Page 72: Linux Kernel Module - For NLKB

find_symbol (2)
• Then search the loaded modules (skipping any module still in the UNFORMED state)

    list_for_each_entry_rcu(mod, &modules, list) {
        struct symsearch arr[] = {
            { mod->syms, mod->syms + mod->num_syms, mod->crcs,
              NOT_GPL_ONLY, false },
            { mod->gpl_syms, mod->gpl_syms + mod->num_gpl_syms,
              mod->gpl_crcs,
              GPL_ONLY, false },
            { mod->gpl_future_syms,
              mod->gpl_future_syms + mod->num_gpl_future_syms,
              mod->gpl_future_crcs,
              WILL_BE_GPL_ONLY, false },
            ...
        };

        if (mod->state == MODULE_STATE_UNFORMED)
            continue;

        if (each_symbol_in_section(arr, ARRAY_SIZE(arr), mod, fn, data))
            return true;
    }
    return false;
}

Page 73: Linux Kernel Module - For NLKB

find_symbol (3)
• Binary search in the section!

static int cmp_name(const void *va, const void *vb)
{
    const char *a;
    const struct kernel_symbol *b;
    a = va; b = vb;
    return strcmp(a, b->name);
}

static bool find_symbol_in_section(const struct symsearch *syms,
                                   struct module *owner,
                                   void *data)
{
    struct find_symbol_arg *fsa = data;

    sym = bsearch(fsa->name, syms->start, syms->stop - syms->start,
                  sizeof(struct kernel_symbol), cmp_name);

    if (sym != NULL && check_symbol(syms, owner, sym - syms->start, data))
        return true;
    return false;
}

Checks the found symbol’s target license

Page 74: Linux Kernel Module - For NLKB

ref_module
• If the target module is NULL (= the symbol is in the kernel) or the module already uses the target module, it returns immediately.
• Increment the reference counter of the target module (if the target module is in the middle of initialization, it returns -EBUSY)
• Add a usage
  • Source : the module
  • Target : the target module

static int add_module_usage(struct module *a, struct module *b)
{
    struct module_use *use;
    use = kmalloc(sizeof(*use), GFP_ATOMIC);

    use->source = a;
    use->target = b;
    list_add(&use->source_list, &b->source_list);
    list_add(&use->target_list, &a->target_list);
}

Page 75: Linux Kernel Module - For NLKB

Usage example

[Figure: kernel module A defines function f(); kernel module B defines function g(), which calls f(), so B depends on A (DEP). struct module A has refcnt 2 and struct module B has refcnt 1. A single struct module_use (source: &B, target: &A) connects the source_list and target_list heads of the two struct module objects.]

Page 76: Linux Kernel Module - For NLKB

apply_relocations
• Apply relocations for each “rel” section
• “rel” sections
  • Section Type : SHT_REL or SHT_RELA

[Nr] Name             Type     Address          Offset   Size             EntSize          Flags Link Info Align
[ 2] .text            PROGBITS 0000000000000000 00000070 0000000000000019 0000000000000000 AX       0    0    16
[ 3] .rela.text       RELA     0000000000000000 00000ca8 0000000000000048 0000000000000018         24    2     8
[ 4] .init.text       PROGBITS 0000000000000000 00000089 0000000000000016 0000000000000000 AX       0    0     1
[ 5] .rela.init.text  RELA     0000000000000000 00000cf0 0000000000000030 0000000000000018         24    4     8
[24] .symtab          SYMTAB   0000000000000000 00000db0 00000000000003c0 0000000000000018         25   32     8
[25] .strtab          STRTAB   0000000000000000 00001170 000000000000014a 0000000000000000          0    0     1

Page 77: Linux Kernel Module - For NLKB

Relocation
• Example
  • This function uses the “printk” symbol, which is outside the module. (And also __fentry__)

void say_hello(void)
{
    printk(KERN_INFO "Hello, World.\n");
}

0000000000000000 <say_hello>:
   0:   e8 00 00 00 00          callq  5 <say_hello+0x5>
                        1: R_X86_64_PC32        __fentry__-0x4
   5:   55                      push   %rbp
   6:   48 c7 c7 00 00 00 00    mov    $0x0,%rdi
                        9: R_X86_64_32S .rodata.str1.1
   d:   31 c0                   xor    %eax,%eax
   f:   48 89 e5                mov    %rsp,%rbp
  12:   e8 00 00 00 00          callq  17 <say_hello+0x17>
                        13: R_X86_64_PC32       printk-0x4
  17:   5d                      pop    %rbp
  18:   c3                      retq

RIP-relative addressing is based on the address of the next instruction, which is 4 bytes past the relocated field (hence the -0x4 addend).

Page 78: Linux Kernel Module - For NLKB

apply_relocate[_add]
• Addressing is architecture-dependent, so relocation is also architecture-dependent
• x86_64 (RELA)
  • A RELA section is an array of Elf64_Rela
• In the “printk” example
  • r_offset = 0x13
  • r_info = the symbol index of printk combined with the type R_X86_64_PC32 (RIP-relative in x86_64)
  • r_addend = -0x04

typedef struct elf64_rela {
    Elf64_Addr r_offset;    /* Location at which to apply the action */
    Elf64_Xword r_info;     /* index and type of relocation */
    Elf64_Sxword r_addend;  /* Constant addend used to compute value */
} Elf64_Rela;

Page 79: Linux Kernel Module - For NLKB

apply_relocate_add in x86_64

int apply_relocate_add(Elf64_Shdr *sechdrs,
                       const char *strtab,
                       unsigned int symindex,
                       unsigned int relsec,
                       struct module *me)
{
    ...
    for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rel); i++) {
        /* This is where to make the change */
        loc = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr
            + rel[i].r_offset;

        /* This is the symbol it is referring to.  Note that
           all undefined symbols have been resolved. */
        sym = (Elf64_Sym *)sechdrs[symindex].sh_addr
            + ELF64_R_SYM(rel[i].r_info);
        ...
        val = sym->st_value + rel[i].r_addend;

Page 80: Linux Kernel Module - For NLKB

apply_relocate_add in x86_64

        switch (ELF64_R_TYPE(rel[i].r_info)) {
        ...
        case R_X86_64_64:
            *(u64 *)loc = val;
            break;
        ...
        case R_X86_64_32S:
            *(s32 *)loc = val;
            if ((s64)val != *(s32 *)loc)
                goto overflow;
            break;
        case R_X86_64_PC32:
            val -= (u64)loc;
            *(u32 *)loc = val;
#if 0
            if ((s64)val != *(s32 *)loc)
                goto overflow;
#endif
            break;

R_X86_64_PC32: calculate the delta between the current address and the target address

Page 81: Linux Kernel Module - For NLKB

post_relocation
• Sort the exception table (sort_extable)
  • Exception table: the addresses of the instructions whose page faults the page fault handler treats specially.
  • get_user etc.
• Copy the per-cpu section contents for all the possible cpus. (percpu_modcopy)
• Set kallsyms-related members to the final location, and copy the core symtab from the whole symtab. (add_kallsyms)
• Call the architecture-dependent function that finalizes loading (module_finalize)

    for_each_possible_cpu(cpu)
        memcpy(per_cpu_ptr(mod->percpu, cpu), from, size);

Page 82: Linux Kernel Module - For NLKB

module_finalize in x86_64
• Alternatives, paravirt and so on.

int module_finalize(const Elf_Ehdr *hdr,
                    const Elf_Shdr *sechdrs,
                    struct module *me)
{
    const Elf_Shdr *s, *text = NULL, *alt = NULL, *locks = NULL,
        *para = NULL;
    char *secstrings = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset;

    for (s = sechdrs; s < sechdrs + hdr->e_shnum; s++) {
        if (!strcmp(".text", secstrings + s->sh_name))
            text = s;
        if (!strcmp(".altinstructions", secstrings + s->sh_name))
            alt = s;
        if (!strcmp(".smp_locks", secstrings + s->sh_name))
            locks = s;
        if (!strcmp(".parainstructions", secstrings + s->sh_name))
            para = s;
    }

    if (alt) {
        /* patch .altinstructions */
        void *aseg = (void *)alt->sh_addr;
        apply_alternatives(aseg, aseg + alt->sh_size);
    }
    ...

Page 83: Linux Kernel Module - For NLKB

flush_module_icache
• Flush the instruction cache for the text area so that the code is executed correctly

static void flush_module_icache(const struct module *mod)
{
    mm_segment_t old_fs;

    /* flush the icache in correct context */
    old_fs = get_fs();
    set_fs(KERNEL_DS);

    if (mod->module_init)
        flush_icache_range((unsigned long)mod->module_init,
                           (unsigned long)mod->module_init
                           + mod->init_size);
    flush_icache_range((unsigned long)mod->module_core,
                       (unsigned long)mod->module_core + mod->core_size);

    set_fs(old_fs);
}

Page 84: Linux Kernel Module - For NLKB

complete_formation
• Check whether the exported symbols are already exported by another module (verify_export_symbols)
• Add section information of symbols for BUG report (module_bug_finalize)
• Set NX and RO for core and init area.
• Set the module state to MODULE_STATE_COMING

    mod->state = MODULE_STATE_COMING;

Page 85: Linux Kernel Module - For NLKB

load_module function (2) [Re]
• Resolve the symbols (simplify_symbols)
• Fix up the addresses in the module (apply_relocations)
• Extable and per-cpu initialization (post_relocation)
• Flush I-cache for the module area (flush_module_icache)
• Copy the module parameters to mod->args.
• Check duplication of symbols, and set up NX attributes. (complete_formation)
• Parse the module parameters (parse_args)
• sysfs setup (mod_sysfs_setup)
• Free the copy in the load_info structure (free_copy)
• Call the init function of the module (do_init_module)

Page 86: Linux Kernel Module - For NLKB

do_init_module (1)
• Make a structure for call_rcu to free the init area
• And call the init function in the module
• Set the module state to MODULE_STATE_LIVE

    struct mod_initfree *freeinit;

    freeinit = kmalloc(sizeof(*freeinit), GFP_KERNEL);
    ...
    freeinit->module_init = mod->module_init;

    do_mod_ctors(mod);
    /* Start the module */
    if (mod->init != NULL)
        ret = do_one_initcall(mod->init);

    mod->state = MODULE_STATE_LIVE;

Page 87: Linux Kernel Module - For NLKB

do_init_module (2)
• To avoid deadlock, synchronize with outstanding asynchronous work (async_synchronize_full)
• Drop the initial reference
• And clear the init-related state

    if (current->flags & PF_USED_ASYNC)
        async_synchronize_full();

    mutex_lock(&module_mutex);
    /* Drop initial reference. */
    module_put(mod);
    trim_init_extable(mod);
#ifdef CONFIG_KALLSYMS
    mod->num_symtab = mod->core_num_syms;
    mod->symtab = mod->core_symtab;
    mod->strtab = mod->core_strtab;
#endif
    unset_module_init_ro_nx(mod);
    module_arch_freeing_init(mod);

Page 88: Linux Kernel Module - For NLKB

do_init_module (3)
• Finally, free the init area (deferred through call_rcu)
• Wake up anyone waiting for the initialization to complete.

    call_rcu(&freeinit->rcu, do_free_init);
    mutex_unlock(&module_mutex);
    wake_up_all(&module_wq);

Page 89: Linux Kernel Module - For NLKB

Details (2) Unloading

Page 90: Linux Kernel Module - For NLKB

sys_delete_module
• Check capability and the module blocking parameter
• Find the specified module by name
• If the module has an init function but no exit function, and the unload is not forced, it fails with -EBUSY
• Try to stop the module (try_stop_module)
• Call the exit function
• Free the module

Page 91: Linux Kernel Module - For NLKB

Now (3.19) [RE]
• Reference count is now atomic_t (was per-cpu int before) and checked without stop_machine
  • (thanks to a mysterious guy)

static int try_stop_module(struct module *mod, int flags, int *forced)
{
    /* If it's not unused, quit unless we're forcing. */
    if (try_release_module_ref(mod) != 0) {
        *forced = try_force_unload(flags);
        if (!(*forced))
            return -EWOULDBLOCK;
    }

    /* Mark it as dying. */
    mod->state = MODULE_STATE_GOING;

    return 0;
}

Page 92: Linux Kernel Module - For NLKB

try_release_module_ref
• Decrement the reference counter and check whether it reaches zero (= the module can be unloaded).

static int try_release_module_ref(struct module *mod)
{
    int ret;

    /* Try to decrement refcnt which we set at loading */
    ret = atomic_sub_return(MODULE_REF_BASE, &mod->refcnt);
    BUG_ON(ret < 0);
    if (ret)
        /* Someone can put this right now, recover with checking */
        ret = atomic_add_unless(&mod->refcnt, MODULE_REF_BASE, 0);

    return ret;
}

Page 93: Linux Kernel Module - For NLKB

Details (3) Building an out-of-tree kernel module

Page 94: Linux Kernel Module - For NLKB

Build steps (1) : .c -> .o
• Make .tmp_versions, create .tmp_versions/<module>.mod
  • The file contains the names of the final .ko file and source .o files
• Compile .tmp_[name].o from [name].c
• Calculate the CRCs (versions) for the exported symbols
  • Find a __ksymtab section in .tmp_[name].o
    • objdump -h (obj) | grep -q __ksymtab
  • Calculate CRCs for the exported symbols in the source file with genksyms (the output is in LD script format)
  • Link the CRC values into the object file (ld -r with the generated script).

cmd_modversions =                                                       \
    if $(OBJDUMP) -h $(@D)/.tmp_$(@F) | grep -q __ksymtab; then         \
        $(call cmd_gensymtypes,$(KBUILD_SYMTYPES),$(@:.o=.symtypes))    \
            > $(@D)/.tmp_$(@F:.o=.ver);                                 \
                                                                        \
        $(LD) $(LDFLAGS) -r -o $@ $(@D)/.tmp_$(@F)                      \
            -T $(@D)/.tmp_$(@F:.o=.ver);                                \
        rm -f $(@D)/.tmp_$(@F) $(@D)/.tmp_$(@F:.o=.ver);                \
    else                                                                \
        mv -f $(@D)/.tmp_$(@F) $@;                                      \
    fi;

Example genksyms output:

__crc_say_hello = 0xb37b83db ;

Page 95: Linux Kernel Module - For NLKB

Exported Symbols
• Each exported symbol has a struct in a __ksymtab* section.

#define __EXPORT_SYMBOL(sym, sec)                               \
    extern typeof(sym) sym;                                     \
    __CRC_SYMBOL(sym, sec)                                      \
    static const char __kstrtab_##sym[]                         \
    __attribute__((section("__ksymtab_strings"), aligned(1)))   \
    = VMLINUX_SYMBOL_STR(sym);                                  \
    extern const struct kernel_symbol __ksymtab_##sym;          \
    __visible const struct kernel_symbol __ksymtab_##sym        \
    __used                                                      \
    __attribute__((section("___ksymtab" sec "+" #sym), unused)) \
    = { (unsigned long)&sym, __kstrtab_##sym }

#define EXPORT_SYMBOL(sym) \
    __EXPORT_SYMBOL(sym, "")

#define EXPORT_SYMBOL_GPL(sym) \
    __EXPORT_SYMBOL(sym, "_gpl")

#define EXPORT_SYMBOL_GPL_FUTURE(sym) \
    __EXPORT_SYMBOL(sym, "_gpl_future")

(include/linux/export.h)

Page 96: Linux Kernel Module - For NLKB

CRC sections
• Declare CRC symbols in CRC sections with the weak attribute.

#ifndef __GENKSYMS__
#ifdef CONFIG_MODVERSIONS
/* Mark the CRC weak since genksyms apparently decides not to
 * generate a checksums for some symbols */
#define __CRC_SYMBOL(sym, sec)                                  \
    extern __visible void *__crc_##sym __attribute__((weak));   \
    static const unsigned long __kcrctab_##sym                  \
    __used                                                      \
    __attribute__((section("___kcrctab" sec "+" #sym), unused)) \
    = (unsigned long) &__crc_##sym;
#else
#define __CRC_SYMBOL(sym, sec)
#endif

(include/linux/export.h)

Page 97: Linux Kernel Module - For NLKB

Build Steps (2) : .c -> .o
• Create the __mcount_loc list (if -pg is enabled)
  • The list of pointers to the locations where “mcount” is called
• Fix up the dep file
• Link into a single object file (<module>.o) if the module is composed of multiple object files

Page 98: Linux Kernel Module - For NLKB

Build Steps (3) - Stage 2
• Create <module>.mod.c and <module>.symvers with the modpost command
• Compile the <module>.mod.c
• Link the <module>.mod.o and <module>.o into a module <module>.ko

modpost = scripts/mod/modpost                    \
 $(if $(CONFIG_MODVERSIONS),-m)                  \
 $(if $(CONFIG_MODULE_SRCVERSION_ALL),-a,)       \
 $(if $(KBUILD_EXTMOD),-i,-o) $(kernelsymfile)   \
 $(if $(KBUILD_EXTMOD),-I $(modulesymfile))      \
 $(if $(KBUILD_EXTRA_SYMBOLS), $(patsubst %, -e %,$(KBUILD_EXTRA_SYMBOLS))) \
 $(if $(KBUILD_EXTMOD),-o $(modulesymfile))      \
 $(if $(CONFIG_DEBUG_SECTION_MISMATCH),,-S)      \
 $(if $(KBUILD_EXTMOD)$(KBUILD_MODPOST_WARN),-w)

MODPOST_OPT=$(subst -i,-n,$(filter -i,$(MAKEFLAGS)))

# We can go over command line length here, so be careful.
quiet_cmd_modpost = MODPOST $(words $(filter-out vmlinux FORCE, $^)) modules
      cmd_modpost = $(MODLISTCMD) | sed 's/\.ko$$/.o/' | $(modpost) $(MODPOST_OPT) -s -T -

Page 99: Linux Kernel Module - For NLKB

modpost (1)
• Collects module information, symbol information and versions from kernel symbols and object files, and generates the module source file and the symvers file.
• Arguments

$ modpost [Options...] [(Module object files...)]

• Options

Option              Description
-m                  CONFIG_MODVERSIONS (Symbol version)
-a                  CONFIG_MODULE_SRCVERSION_ALL (“srcversion” in modinfo); MD4 for the source files that made the module
-I (symvers file)   Input symbol versions (kernel symbols)
-e (symvers file)   Input extra symbol versions
-o (symvers file)   Output symbol versions (for exported symbols of the module)
-T (files)          Source (object) file list

Page 100: Linux Kernel Module - For NLKB

modpost (2)
• Generate the source file

    for (mod = modules; mod; mod = mod->next) {
        char fname[PATH_MAX];
        ...
        buf.pos = 0;

        add_header(&buf, mod);
        add_intree_flag(&buf, !external_module);
        add_staging_flag(&buf, mod->name);
        err |= add_versions(&buf, mod);
        add_depends(&buf, mod, modules);
        add_moddevtable(&buf, mod);
        add_srcversion(&buf, mod);

        sprintf(fname, "%s.mod.c", mod->name);
        write_if_changed(&buf, fname);
    }

(scripts/mod/modpost.c)

Page 101: Linux Kernel Module - For NLKB

modpost (3)
• Dump the symbol versions

static void write_dump(const char *fname)
{
    struct buffer buf = { };
    struct symbol *symbol;
    int n;

    for (n = 0; n < SYMBOL_HASH_SIZE ; n++) {
        symbol = symbolhash[n];
        while (symbol) {
            if (dump_sym(symbol))
                buf_printf(&buf, "0x%08x\t%s\t%s\t%s\n",
                           symbol->crc, symbol->name,
                           symbol->module->name,
                           export_str(symbol->export));
            symbol = symbol->next;
        }
    }
    write_if_changed(&buf, fname);
}

(scripts/mod/modpost.c)

Example output line:

0xb37b83db say_hello /home/shimos/test_module/hello EXPORT_SYMBOL

Page 102: Linux Kernel Module - For NLKB

Generated <module>.mod.c (1)
• Example

#include <linux/module.h>
#include <linux/vermagic.h>
#include <linux/compiler.h>

MODULE_INFO(vermagic, VERMAGIC_STRING);

__visible struct module __this_module
__attribute__((section(".gnu.linkonce.this_module"))) = {
    .name = KBUILD_MODNAME,
    .init = init_module,
#ifdef CONFIG_MODULE_UNLOAD
    .exit = cleanup_module,
#endif
    .arch = MODULE_ARCH_INIT,
};

static const struct modversion_info ____versions[]
__used
__attribute__((section("__versions"))) = {
    { 0x9412fa01, __VMLINUX_SYMBOL_STR(module_layout) },
    { 0x27e1a049, __VMLINUX_SYMBOL_STR(printk) },
    { 0xbdfb6dbb, __VMLINUX_SYMBOL_STR(__fentry__) },
};
...

• MODULE_INFO(vermagic, ...): additional modinfo is included
• __this_module: base of struct module
• ____versions: symbols and (expected) versions which this module depends on.

Page 103: Linux Kernel Module - For NLKB

Generated <module>.mod.c (2)
• Example

static const char __module_depends[]
__used
__attribute__((section(".modinfo"))) =
"depends=";

MODULE_INFO(srcversion, "8D5BACDC1EA9421ABFF79DD");

• __module_depends: modinfo about dependency (but the kernel does not use this)
• MODULE_INFO(srcversion, ...): modinfo “srcversion”

Page 104: Linux Kernel Module - For NLKB

modinfo
• Each modinfo string is created by a macro; the strings are concatenated by collecting them into a single section (.modinfo)

#define __MODULE_INFO(tag, name, info)                                  \
static const char __UNIQUE_ID(name)[]                                   \
  __used __attribute__((section(".modinfo"), unused, aligned(1)))       \
  = __stringify(tag) "=" info

(include/linux/moduleparam.h)

#define MODULE_INFO(tag, info) __MODULE_INFO(tag, tag, info)
...
#define MODULE_LICENSE(_license) MODULE_INFO(license, _license)
...
#define MODULE_AUTHOR(_author) MODULE_INFO(author, _author)
...

(include/linux/module.h)

Page 105: Linux Kernel Module - For NLKB

UNIQUE_ID

#define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__)

(include/linux/compiler-gcc4.h)