Linux Kernel Module - For NLKB


Kernel Module

Taku Shimosawa

Feb. 21, 2015. For the new Linux kernel book

Notes
• Linux kernel version: 3.19
• Quoted source code comes from kernel/module.c unless otherwise noted.

Kernel Module
• A feature for dynamically adding and removing kernel features while the kernel is running
• Benefits
  • Kernel features can be updated while the system is running
  • Memory consumption (and CPU overhead) is reduced by loading only the necessary kernel modules
  • Avoiding the GPL (modules are not required to comply with the GPL; proprietary drivers)
• Many kernel features can be compiled either statically linked into the kernel or as independent modules
  • File systems, device drivers, etc.
  • “TRISTATE” in Kconfig (y, m, or n)

Where is the kernel module?
• Linux kernel modules are ELF binaries with the extension “.ko”
• Many distributions place the kernel modules under /lib/modules
  • e.g. /lib/modules/3.13.0-44-generic/kernel (Ubuntu 14.10)
  • “depmod” scans the kernel modules under that directory to create the module dependency map (modules.dep)
• The “modprobe” utility loads a kernel module together with the modules it depends on by looking up the modules.dep file
• However, a module located anywhere can be loaded into the kernel if its path is specified explicitly.

What is the “dependency”?
• A kernel module can export “symbols” that may be used by another kernel module
  • A symbol: a name for a location in memory; a global variable or a function in C
• If a module (B) uses a symbol exported by another module (A), then module B depends on module A
• Thus, module A must be loaded before module B is loaded
• (There seems to be no way to load modules that have circular dependencies, e.g. A depends on B and B also depends on A)

Kernel module A:
    function f() {}
    EXPORT_SYMBOL(f);

Kernel module B:
    function g() { f(); }

Module B depends on module A (DEP).
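The load-order rule above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `toy_module`, `toy_load` and the fixed-size export table are hypothetical names invented for illustration, and nothing here is kernel code. A module loads only if every symbol it imports has already been exported by a previously loaded module.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_EXPORTS 32

/* Global table of exported symbol names (stands in for the kernel's tables). */
static const char *toy_exported[TOY_MAX_EXPORTS];
static size_t toy_num_exported;

/* A toy module: what it exports and what it needs (NULL-terminated lists). */
struct toy_module {
    const char *name;
    const char *const *exports;
    const char *const *imports;
};

static int toy_symbol_exported(const char *sym)
{
    size_t i;

    for (i = 0; i < toy_num_exported; i++)
        if (strcmp(toy_exported[i], sym) == 0)
            return 1;
    return 0;
}

/* Returns 0 on success, -1 if an import is unresolved or the table is full. */
int toy_load(const struct toy_module *m)
{
    size_t i, n = 0;

    /* Every imported symbol must already be exported by someone. */
    for (i = 0; m->imports[i]; i++)
        if (!toy_symbol_exported(m->imports[i]))
            return -1;

    while (m->exports[n])
        n++;
    if (toy_num_exported + n > TOY_MAX_EXPORTS)
        return -1;

    /* Publish this module's own exports for later modules. */
    for (i = 0; i < n; i++)
        toy_exported[toy_num_exported++] = m->exports[i];
    return 0;
}
```

With module A exporting `f` and module B importing `f`, calling `toy_load` on B first fails, while loading A and then B succeeds.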

Exported Symbols
• The symbols explicitly marked as “export” can be accessed by other kernel modules
  • The Linux kernel itself has “export”-ed symbols.
  • Kernel modules are allowed to use only the exported symbols in the kernel
  • Not all the global functions are available to modules!
• The symbols to be exported are declared with the EXPORT_SYMBOL and EXPORT_SYMBOL_GPL macros.
  • The latter makes the symbol available only to GPL modules.

struct task_struct *pid_task(struct pid *pid, enum pid_type type)
{ ... }
EXPORT_SYMBOL(pid_task);
...
struct task_struct *get_pid_task(struct pid *pid, enum pid_type type)
{ ... }
EXPORT_SYMBOL_GPL(get_pid_task);

(kernel/pid.c)

(BTW)
• What makes the difference?

struct task_struct *pid_task(struct pid *pid, enum pid_type type)
{
    ...
}
EXPORT_SYMBOL(pid_task);

struct task_struct *get_pid_task(struct pid *pid, enum pid_type type)
{
    struct task_struct *result;
    rcu_read_lock();
    result = pid_task(pid, type);
    if (result)
        get_task_struct(result);
    rcu_read_unlock();
    return result;
}
EXPORT_SYMBOL_GPL(get_pid_task);

(kernel/pid.c)

Make a kernel module!
• Out-of-tree module
• The only necessary files are
  • Makefile
  • C source file(s)
• Example Makefile

obj-m += hello.o

KERN_BUILD=/lib/modules/$(shell uname -r)/build

all:
	make -C $(KERN_BUILD) M=$(PWD) modules

clean:
	make -C $(KERN_BUILD) M=$(PWD) clean

cf. obj-$(CONFIG_SHIMOS) = shimos.o

Inside the kernel module
• What sections are inside a kernel module?

$ readelf -a hello.ko
Section Headers:
  [Nr] Name               Type      Address           Offset    Size              EntSize           Flags  Link  Info  Align
  …
  [ 2] .text              PROGBITS  0000000000000000  00000064  0000000000000000  0000000000000000  AX     0     0     1
  [ 3] .init.text         PROGBITS  0000000000000000  00000064  0000000000000016  0000000000000000  AX     0     0     1
  [ 4] .rela.init.text    RELA      0000000000000000  000009c0  0000000000000030  0000000000000018         16    3     8
  [ 5] .exit.text         PROGBITS  0000000000000000  0000007a  0000000000000006  0000000000000000  AX     0     0     1
  …
  [ 7] .modinfo           PROGBITS  0000000000000000  00000091  00000000000000c1  0000000000000000  A      0     0     1
  [ 8] __versions         PROGBITS  0000000000000000  00000160  0000000000000080  0000000000000000  A      0     0     32
  …
  [18] .gnu.linkonce.thi  PROGBITS  0000000000000000  00000280  0000000000000260  0000000000000000  WA     0     0     32

Sections

Section Name                 Description
.gnu.linkonce.this_module    Module structure
.modinfo                     String-style module information (licenses, etc.)
__versions                   Expected (compile-time) versions (CRCs) of the symbols that this module depends on.
__ksymtab*                   Table of symbols which this module exports.
__kcrctab*                   Table of versions of the symbols which this module exports.
.init*                       Sections used during initialization (__init)
.text, .data, etc.           The code and data

For __ksymtab* and __kcrctab*, the suffix * is one of: (none), _gpl, _gpl_future, _unused, _unused_gpl (license restriction / attribute of the symbols)

Module load and unload
• The simplest way: the “insmod” and “rmmod” commands

# insmod (file name) [parameters…]
(e.g.) # insmod helloworld.ko msg=hoge

# rmmod (module name)
(e.g.) # rmmod helloworld

• A more sophisticated way is “modprobe” and “modprobe -r”
  • The former also loads the modules which the specified module depends on
  • The latter also tries to unload the modules which the specified module depends on

How does insmod call the kernel?
• Source: kmod-19

KMOD_EXPORT int kmod_module_insert_module(struct kmod_module *mod,
                                          unsigned int flags,
                                          const char *options)
{
...
    if (kmod_file_get_direct(mod->file)) {
        unsigned int kernel_flags = 0;

        if (flags & KMOD_INSERT_FORCE_VERMAGIC)
            kernel_flags |= MODULE_INIT_IGNORE_VERMAGIC;
        if (flags & KMOD_INSERT_FORCE_MODVERSION)
            kernel_flags |= MODULE_INIT_IGNORE_MODVERSIONS;

        err = finit_module(kmod_file_get_fd(mod->file), args, kernel_flags);
        if (err == 0 || errno != ENOSYS)
            goto init_finished;
    }
...

(libkmod/libkmod-module.c)

System calls
• 3 module-related system calls
  • init_module
  • finit_module
    • To load a module
  • delete_module
    • To unload a module

int init_module(void *module_image, unsigned long len, const char *param_values);

int finit_module(int fd, const char *param_values, int flags);

int delete_module(const char *name, int flags);

(from man pages)

init_module / finit_module
• Load a kernel module
• How is the module specified?
  • init_module: by a user memory buffer that contains the kernel module image
  • finit_module: by a file descriptor for the kernel module file
• With finit_module, some flags can be specified

flags                             Description
MODULE_INIT_IGNORE_MODVERSIONS    Ignore symbol version hashes
MODULE_INIT_IGNORE_VERMAGIC       Ignore kernel version magic

delete_module
• Unload a kernel module
• Specifies the module to be unloaded by its “name”
• Some flags can be specified
  • Why a different policy from finit_module…?

flags                   Description
O_NONBLOCK | O_TRUNC    Forcefully unload the module (even when the ref count is not zero; taints the kernel)
O_NONBLOCK              Returns immediately with an error (EWOULDBLOCK)
O_NONBLOCK not set      Stops the module, and waits until the ref count reaches zero. (UNINTERRUPTIBLE)
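The flag table above can be restated as a small decision function. This is a sketch of the table only, not of the kernel's implementation: the enum and `toy_classify_unload` are hypothetical names, and the combination the table does not list (O_TRUNC without O_NONBLOCK) is reported separately rather than guessed.

```c
#include <fcntl.h>

enum toy_unload_action {
    TOY_UNLOAD_FORCE,        /* O_NONBLOCK | O_TRUNC: forced unload */
    TOY_UNLOAD_FAIL_FAST,    /* O_NONBLOCK only: error if still in use */
    TOY_UNLOAD_WAIT,         /* no flags: wait for the ref count to drop */
    TOY_UNLOAD_NOT_IN_TABLE  /* O_TRUNC alone: not covered by the table */
};

/* Maps delete_module() flags to the behaviour described in the table. */
enum toy_unload_action toy_classify_unload(int flags)
{
    int nonblock = (flags & O_NONBLOCK) != 0;
    int trunc = (flags & O_TRUNC) != 0;

    if (nonblock && trunc)
        return TOY_UNLOAD_FORCE;
    if (nonblock)
        return TOY_UNLOAD_FAIL_FAST;
    if (trunc)
        return TOY_UNLOAD_NOT_IN_TABLE;
    return TOY_UNLOAD_WAIT;
}
```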

Data structures for modules
• struct load_info
  • Used while initializing a module
  • Most members are ELF-related.

struct load_info {
    Elf_Ehdr *hdr;
    unsigned long len;
    Elf_Shdr *sechdrs;
    char *secstrings, *strtab;
    unsigned long symoffs, stroffs;
    struct _ddebug *debug;
    unsigned int num_debug;
    bool sig_ok;
    struct {
        unsigned int sym, str, mod, vers, info, pcpu;
    } index;
};

(include/linux/module.h)

Data structures for modules
• struct module (too large..)

struct module {
    enum module_state state;

    /* Member of list of modules */
    struct list_head list;

    /* Unique handle for this module */
    char name[MODULE_NAME_LEN];

    /* Sysfs stuff. */
    struct module_kobject mkobj;

    ...
    /* Exported symbols */
    const struct kernel_symbol *syms;
    const unsigned long *crcs;
    unsigned int num_syms;

    /* Kernel parameters. */
    struct kernel_param *kp;
    unsigned int num_kp;

Callouts: list = the “modules” list; syms = exported symbols; crcs = symbol CRCs

Data structures for modules (continued)

    /* GPL-only exported symbols. */
    unsigned int num_gpl_syms;
    const struct kernel_symbol *gpl_syms;
    const unsigned long *gpl_crcs;

    ...
#ifdef CONFIG_MODULE_SIG
    /* Signature was verified. */
    bool sig_ok;
#endif
    ...

    /* Exception table */
    unsigned int num_exentries;
    struct exception_table_entry *extable;

    /* Startup function. */
    int (*init)(void);

    /* If this is non-NULL, vfree after init() returns */
    void *module_init;

    ...
    /* Here is the actual code + data, vfree'd on unload. */
    void *module_core;

Callouts: gpl_syms = GPL symbols; init = the “init” function; module_init = “init” sections; module_core = other (core) sections

Data structures for modules (continued)

    /* Here are the sizes of the init and core sections */
    unsigned int init_size, core_size;

    /* The size of the executable code in each section. */
    unsigned int init_text_size, core_text_size;

    /* Size of RO sections of the module (text+rodata) */
    unsigned int init_ro_size, core_ro_size;

    /* Arch-specific module values */
    struct mod_arch_specific arch;

    ...
    /* The command line arguments (may be mangled). People
       like keeping pointers to this stuff */
    char *args;

    ...
#ifdef CONFIG_SMP
    /* Per-cpu data. */
    void __percpu *percpu;
    unsigned int percpu_size;
#endif

Callouts: *_size = sizes of sections; args = command line parameters; percpu = per-CPU data

Data structures for modules (continued)

    ...
#ifdef CONFIG_MODULE_UNLOAD
    /* What modules depend on me? */
    struct list_head source_list;
    /* What modules do I depend on? */
    struct list_head target_list;

    /* Destruction function. */
    void (*exit)(void);

    struct module_ref __percpu *refptr;
#endif

#ifdef CONFIG_CONSTRUCTORS
    /* Constructor functions. */
    ctor_fn_t *ctors;
    unsigned int num_ctors;
#endif
};

(include/linux/module.h)

Callouts: source_list / target_list = lists to manage dependencies (only when unloading is enabled)

Module state
• state in struct module
• During load, the state changes: (created) -> UNFORMED -> COMING -> LIVE.
• During unload, the state changes: LIVE -> GOING -> (removed)

State                    Description
MODULE_STATE_UNFORMED    Appears in the modules list, but still being set up.
MODULE_STATE_COMING      Fully formed. Running module_init.
MODULE_STATE_LIVE        Normal state.
MODULE_STATE_GOING       Being unloaded.
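The two state sequences above form a small state machine. The following is a minimal sketch that encodes only the transitions named on this slide; the `toy_` names are hypothetical, and error paths (for example a failing init function) are deliberately not modelled.

```c
/* Hypothetical mirror of the four states listed in the table. */
enum toy_module_state {
    TOY_STATE_UNFORMED,
    TOY_STATE_COMING,
    TOY_STATE_LIVE,
    TOY_STATE_GOING
};

/* Returns 1 if the slide lists "from -> to" as a normal transition, else 0. */
int toy_transition_ok(enum toy_module_state from, enum toy_module_state to)
{
    switch (from) {
    case TOY_STATE_UNFORMED:
        return to == TOY_STATE_COMING;   /* load: set-up finished */
    case TOY_STATE_COMING:
        return to == TOY_STATE_LIVE;     /* load: init function done */
    case TOY_STATE_LIVE:
        return to == TOY_STATE_GOING;    /* unload started */
    case TOY_STATE_GOING:
        return 0;                        /* next step is removal */
    }
    return 0;
}
```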

Global module information

Variable                      Description
LIST_HEAD(modules)            List of modules that are in the kernel.
DEFINE_MUTEX(module_mutex)    Protects “modules”, etc.

• Add: RCU list operations
• Remove: stop_machine (up to 3.18)

/*
 * Mutex protects:
 * 1) List of modules (also safely readable with preempt_disable),
 * 2) module_use links,
 * 3) module_addr_min/module_addr_max.
 * (delete uses stop_machine/add uses RCU list operations).
 */
DEFINE_MUTEX(module_mutex);
EXPORT_SYMBOL_GPL(module_mutex);

Loading a Module
• Load the whole module file into memory
• Parse the ELF and module information
• Check the module information to determine whether the module is loadable or not
• Lay out the sections and copy them to the final location
• Add the module to the kernel
• Resolve the symbols and apply relocations
• Copy module parameters
• Call the init function

Call flow:
System calls
  load_module
    layout_and_allocate
      setup_load_info
      check_modinfo
      layout_sections
      layout_symtab
      move_module
    add_unformed_module
    simplify_symbols
    apply_relocations
    do_init_module

State during the flow: UNFORMED -> COMING -> LIVE

Unloading a Module
• Check if the reference count of the module is zero
  • If it is zero, or the unload is forced, set the state to GOING
  • If it is not zero, the unload fails
• Call the “exit” function
• Free and clean up everything

Call flow:
sys_delete_module
  try_stop_module
    __try_stop_module
  free_module

stop_machine (-3.18)
• Until Linux 3.18, the reference count check and the module removal during unloading were implemented with stop_machine.

static int try_stop_module(struct module *mod, int flags, int *forced)
{
    struct stopref sref = { mod, flags, forced };

    return stop_machine(__try_stop_module, &sref, NULL);
}

static void free_module(struct module *mod)
{
    ...
    mutex_lock(&module_mutex);
    stop_machine(__unlink_module, mod, NULL);
    mutex_unlock(&module_mutex);
    ...
}

Now (3.19)
• The reference count is now atomic_t (it was a per-cpu int before) and is checked without stop_machine
  • (thanks to a mysterious guy)

static int try_stop_module(struct module *mod, int flags, int *forced)
{
    /* If it's not unused, quit unless we're forcing. */
    if (try_release_module_ref(mod) != 0) {
        *forced = try_force_unload(flags);
        if (!(*forced))
            return -EWOULDBLOCK;
    }

    /* Mark it as dying. */
    mod->state = MODULE_STATE_GOING;

    return 0;
}

Now (3.19)
• stop_machine is also gone from module removal

static void free_module(struct module *mod)
{
    ...
    /* Now we can delete it from the lists */
    mutex_lock(&module_mutex);
    /* Unlink carefully: kallsyms could be walking list. */
    list_del_rcu(&mod->list);
    /* Remove this module from bug list, this uses list_del_rcu */
    module_bug_cleanup(mod);
    /* Wait for RCU synchronizing before releasing mod->list and buglist. */
    synchronize_rcu();
    mutex_unlock(&module_mutex);
    ...
}

Details (1) Loading

sys_init_module/sys_finit_module
• Initialize a load_info structure
• Check whether module loading is permitted or not (may_init_module function)
• [finit only] Check the flags
• [init only] Copy the module data from user memory to kernel memory (copy_module_from_user function)
• [finit only] Read from the fd into kernel memory (copy_module_from_fd function)
• Call the load_module function

may_init_module
• Capability: CAP_SYS_MODULE
• “modules_disabled” parameter
  • Blocks loading and unloading of modules

/* Block module loading/unloading? */
int modules_disabled = 0;
core_param(nomodule, modules_disabled, bint, 0);
...
static int may_init_module(void)
{
    if (!capable(CAP_SYS_MODULE) || modules_disabled)
        return -EPERM;

    return 0;
}

(kernel/module.c)

# sysctl kernel.modules_disabled
kernel.modules_disabled = 0

copy_module_from_fd
• Pass the file struct to the security module
• vmalloc an area for the module data
• Load the whole module file into the area
• Set the pointer to info->hdr

static int copy_module_from_fd(int fd, struct load_info *info)
{
    ...
    err = security_kernel_module_from_file(f.file);
    if (err)
        goto out;
    ...
    info->hdr = vmalloc(stat.size);
    if (!info->hdr) {
        err = -ENOMEM;
        goto out;
    }
    ...
    while (pos < stat.size) {
        bytes = kernel_read(f.file, pos, (char *)(info->hdr) + pos,
                            stat.size - pos);
        ...
    }
    info->len = pos;

copy_module_from_user
• Differences:
  • Passes a “NULL” pointer to the security module
  • Just copy_from_user instead of kernel_read

static int copy_module_from_user(const void __user *umod, unsigned long len,
                                 struct load_info *info)
{
    ...
    info->len = len;
    ...
    err = security_kernel_module_from_file(NULL);
    if (err)
        return err;
    ...
    /* Suck in entire file: we'll want most of it. */
    info->hdr = vmalloc(info->len);
    if (!info->hdr)
        return -ENOMEM;
    ...
    if (copy_from_user(info->hdr, umod, info->len) != 0) {
        vfree(info->hdr);
        return -EFAULT;
    }
    return 0;

load_module function (1)
• Signature check (module_sig_check)
• ELF header check (elf_header_check)
• Lay out and allocate the final location for the module (layout_and_allocate)
• Add the module to the “modules” list (add_unformed_module)
• Allocate per-cpu areas used in the module (percpu_modalloc)
• Initialize linked lists used for dependency management and unloading features (module_unload_init)
• Find optional sections (find_module_sections)
• License and version dirty hack (check_module_license_and_versions)
• Set up MODINFO_ATTR fields (setup_modinfo)

load_module function (2)
• Resolve the symbols (simplify_symbols)
• Fix up the addresses in the module (apply_relocations)
• Extable and per-cpu initialization (post_relocation)
• Flush the I-cache for the module area (flush_module_icache)
• Copy the module parameters to mod->args.
• Check for duplicated symbols, and set up NX attributes (complete_formation)
• Parse the module parameters (parse_args)
• sysfs setup (mod_sysfs_setup)
• Free the copy in the load_info structure (free_copy)
• Call the init function of the module (do_init_module)

module_sig_check
• Check the signature in the module (if CONFIG_MODULE_SIG=y)
  • If a module is signed, the “signature” and a “marker” reside at the tail of the module file.
• If the signature is OK, module->sig_ok is set to true.
• If no signature is found (-ENOKEY) and signatures are not enforced, it returns success (0).
  • Signatures are enforced either
    • when CONFIG_MODULE_SIG_FORCE is Y, or
    • when the “sig_enforce” parameter is set

File layout: Module (ELF) | Signature | Marker

Marker: “~Module signature appended~\n”

$ hd /lib/modules/3.13.0-45-generic/kernel/fs/btrfs/btrfs.ko
0014b470  f8 a6 b7 74 01 06 01 1e  14 00 00 00 00 00 02 02  |...t............|
0014b480  7e 4d 6f 64 75 6c 65 20  73 69 67 6e 61 74 75 72  |~Module signatur|
0014b490  65 20 61 70 70 65 6e 64  65 64 7e 0a              |e appended~.|
0014b49c
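The marker test implied by the hex dump can be sketched in user space: a file counts as "possibly signed" when its last bytes equal the marker string. This is an illustrative sketch only; `toy_has_sig_marker` is a hypothetical helper, and it does not parse or verify the signature itself.

```c
#include <stddef.h>
#include <string.h>

/* The marker quoted on the slide; it is 28 bytes long without the NUL. */
#define TOY_SIG_MARKER "~Module signature appended~\n"

/* Returns 1 if the image ends with the signature marker, 0 otherwise. */
int toy_has_sig_marker(const void *image, size_t len)
{
    const size_t markerlen = sizeof(TOY_SIG_MARKER) - 1;

    if (len < markerlen)
        return 0;
    /* Compare only the tail of the image against the marker. */
    return memcmp((const char *)image + len - markerlen,
                  TOY_SIG_MARKER, markerlen) == 0;
}
```

To try it on a real module, read the whole `.ko` file into a buffer and pass the buffer and its length.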

elf_header_check
• Sanity check for the ELF header
  • The magic number is correct
  • The architecture is correct
  • The length is large enough to contain all the section headers, etc.

static int elf_header_check(struct load_info *info)
{
    if (info->len < sizeof(*(info->hdr)))
        return -ENOEXEC;

    if (memcmp(info->hdr->e_ident, ELFMAG, SELFMAG) != 0
        || info->hdr->e_type != ET_REL
        || !elf_check_arch(info->hdr)
        || info->hdr->e_shentsize != sizeof(Elf_Shdr))
        return -ENOEXEC;

    if (info->hdr->e_shoff >= info->len
        || (info->hdr->e_shnum * sizeof(Elf_Shdr) >
            info->len - info->hdr->e_shoff))
        return -ENOEXEC;

    return 0;
}
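The same checks can be tried on a file in user space. The sketch below mirrors the kernel function's structure for 64-bit ELF only and makes several assumptions: the header struct is redeclared locally with the Elf64_Ehdr field layout so the example does not need <elf.h>, the architecture check is omitted, and all `toy_`/`TOY_` names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_ET_REL     1    /* e_type of a relocatable object (.ko, .o) */
#define TOY_SHDR_SIZE  64   /* size of one 64-bit section header */

/* Local copy of the Elf64_Ehdr field layout (64 bytes, no padding). */
struct toy_ehdr {
    unsigned char e_ident[16];
    uint16_t e_type;
    uint16_t e_machine;
    uint32_t e_version;
    uint64_t e_entry;
    uint64_t e_phoff;
    uint64_t e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize;
    uint16_t e_phentsize;
    uint16_t e_phnum;
    uint16_t e_shentsize;
    uint16_t e_shnum;
    uint16_t e_shstrndx;
};

/* Returns 0 if the image passes the sanity checks, -1 otherwise. */
int toy_elf_header_check(const void *image, size_t len)
{
    struct toy_ehdr hdr;

    if (len < sizeof(hdr))
        return -1;
    memcpy(&hdr, image, sizeof(hdr));   /* avoids alignment assumptions */

    /* Magic, object type and section header entry size. */
    if (memcmp(hdr.e_ident, "\177ELF", 4) != 0 ||
        hdr.e_type != TOY_ET_REL ||
        hdr.e_shentsize != TOY_SHDR_SIZE)
        return -1;

    /* The whole section header table must lie inside the image. */
    if (hdr.e_shoff >= len ||
        (uint64_t)hdr.e_shnum * TOY_SHDR_SIZE > len - hdr.e_shoff)
        return -1;

    return 0;
}
```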

ELF Header

[Figure: layout of the ELF (.ko) file. The Elf_Ehdr sits at the head of the file, with the fields e_ident, e_type, e_shoff, e_shentsize, e_shnum and e_shstrndx. The section header table starts at offset e_shoff and holds e_shnum Elf_Shdr entries of e_shentsize bytes each.]

• load_info.hdr (Elf_Ehdr *) = the head of the kernel module file = the head of the ELF = pointer to the Elf_Ehdr
• e_ident: magic (‘\x7fELF’), 32/64-bit, etc. (16 bytes in total including padding)
• e_type: ET_REL / ET_EXEC / ET_DYN

layout_and_allocate
• Fill in the section information of the load_info, and create a module structure pointing to the temporary location (setup_load_info)
• Check the module information and report if the module taints the kernel (check_modinfo)
• Calculate the size required for the final location of the module (layout_sections / layout_symtab)
• Allocate memory of the calculated size, copy the contents of the module, and move the pointer of the module structure there (move_module).

setup_load_info
• Set the following members according to the ELF header and section headers.
  • sechdrs (pointer to the section headers)
  • secstrings (pointer to the string section that contains the section names)
  • index.info, index.vers (section indices of modinfo, versions)
  • index.sym, index.str (section indices of symbols, strings)
  • strtab (pointer to the string section)
  • index.mod (section index of the module section)
    • “.gnu.linkonce.this_module” section
    • Set the module pointer to this section (temporarily)
  • index.pcpu (section index of the per-cpu section)
    • “.data..percpu” section (if it exists)
• Return a pointer to a (temporary) module structure

setup_load_info (example)
• info->sechdrs
• info->secstrings
• info->strtab
  • Each section’s offset is stored in Elf_Shdr.sh_offset
• info->index.info = 12
• info->index.vers = 16
• info->index.sym = 24
• info->index.str = 25
• info->index.mod = 18
  • struct module *mod
• info->index.pcpu = 0
  • No per-cpu data in this example.

[Figure: file layout for the example. Header part: Elf_Ehdr, then Elf_Shdr (0), Elf_Shdr (12): .modinfo, Elf_Shdr (16): __versions, Elf_Shdr (18): .gnu.linkonce.this_module, Elf_Shdr (23): .shstrtab, Elf_Shdr (24): .symtab, Elf_Shdr (25): .strtab. Section (contents) part: .shstrtab section, .strtab section, .gnu.linkonce.this_module section.]

check_modinfo (1)
• Check the “modinfo” in the module, check whether the version magic is identical to that of the current kernel, and mark the kernel “tainted” if the module taints it.
  • “Modinfo” resides in the “.modinfo” section, and is composed of zero-terminated strings of key-value pairs connected by “=”.

description=Hello world kernel module\0
author=Taku Shimosawa <shimos@shimos.net>\0
license=GPL v2\0
srcversion=8D5BACDC1EA9421ABFF79DD\0
depends=\0
vermagic=3.13.0-44-generic SMP mod_unload modversions
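Looking up a key in such a section amounts to walking NUL-terminated "key=value" entries. The following user-space sketch shows the idea under stated assumptions: `toy_get_modinfo` is a hypothetical helper that takes the raw section bytes and their size, and it is not the kernel's get_modinfo.

```c
#include <stddef.h>
#include <string.h>

/*
 * Returns a pointer to the value stored for "key" inside a buffer of
 * NUL-terminated "key=value" strings, or NULL if the key is absent.
 */
const char *toy_get_modinfo(const char *sec, size_t size, const char *key)
{
    size_t keylen = strlen(key);
    size_t pos = 0;

    while (pos < size) {
        const char *entry = sec + pos;
        const char *end = memchr(entry, '\0', size - pos);
        size_t entlen;

        if (!end)
            break;                      /* unterminated tail: stop */
        entlen = (size_t)(end - entry);

        /* Match "key=" exactly, so "dep" does not match "depends=". */
        if (entlen > keylen &&
            memcmp(entry, key, keylen) == 0 &&
            entry[keylen] == '=')
            return entry + keylen + 1;

        pos += entlen + 1;              /* skip the entry and its NUL */
    }
    return NULL;
}
```

For the example section above, looking up "license" yields "GPL v2", "depends" yields an empty string, and a key that is not present, such as "intree", yields NULL.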

check_modinfo (2)
• First, check the version magic in the module

static int check_modinfo(struct module *mod, struct load_info *info, int flags)
{
    const char *modmagic = get_modinfo(info, "vermagic");
    ...
    if (flags & MODULE_INIT_IGNORE_VERMAGIC)
        modmagic = NULL;
    ...
    if (!modmagic) {
        err = try_to_force_load(mod, "bad vermagic");
        if (err)
            return err;
    } else if (!same_magic(modmagic, vermagic, info->index.vers)) {
        pr_err("%s: version magic '%s' should be '%s'\n",
               mod->name, modmagic, vermagic);
        return -ENOEXEC;
    }

check_modinfo (3)
• Version magic

#define VERMAGIC_STRING \
    UTS_RELEASE " " \
    MODULE_VERMAGIC_SMP MODULE_VERMAGIC_PREEMPT \
    MODULE_VERMAGIC_MODULE_UNLOAD MODULE_VERMAGIC_MODVERSIONS \
    MODULE_ARCH_VERMAGIC

(include/linux/vermagic.h)

• Example: 3.13.0-44-generic SMP mod_unload modversions
• same_magic function
  • Compares the two vermagic strings; when the module carries symbol CRCs (a __versions section), the leading kernel release part is skipped before comparing.
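A comparison of this kind can be sketched in user space as follows. This is modelled on my reading of same_magic with CONFIG_MODVERSIONS enabled and should be treated as an assumption rather than a quotation: when the module has symbol CRCs, everything up to the first space (the kernel release) is skipped in both strings before they are compared. `toy_same_magic` is a hypothetical name.

```c
#include <string.h>

/*
 * Returns 1 if the two vermagic strings are considered equal.
 * With has_crcs set, the leading release (up to the first space)
 * is ignored, because per-symbol CRCs are checked separately.
 */
int toy_same_magic(const char *amagic, const char *bmagic, int has_crcs)
{
    if (has_crcs) {
        amagic += strcspn(amagic, " ");
        bmagic += strcspn(bmagic, " ");
    }
    return strcmp(amagic, bmagic) == 0;
}
```

Under this sketch, a module built for 3.13.0-44-generic and a kernel reporting 3.13.0-45-generic compare equal only when `has_crcs` is set and the remaining flags (SMP, mod_unload, modversions) agree.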

check_modinfo (4)
• …And mark the kernel tainted if necessary

    if (!get_modinfo(info, "intree"))
        add_taint_module(mod, TAINT_OOT_MODULE, LOCKDEP_STILL_OK);

    if (get_modinfo(info, "staging")) {
        add_taint_module(mod, TAINT_CRAP, LOCKDEP_STILL_OK);
        pr_warn("%s: module is from the staging directory, the quality "
                "is unknown, you have been warned.\n", mod->name);
    }

    /* Set up license info based on the info section */
    set_license(mod, get_modinfo(info, "license"));

check_modinfo (5)
• License information is also important

static void set_license(struct module *mod, const char *license)
{
    if (!license)
        license = "unspecified";

    if (!license_is_gpl_compatible(license)) {
        if (!test_taint(TAINT_PROPRIETARY_MODULE))
            pr_warn("%s: module license '%s' taints kernel.\n",
                    mod->name, license);
        add_taint_module(mod, TAINT_PROPRIETARY_MODULE,
                         LOCKDEP_NOW_UNRELIABLE);
    }
}

check_modinfo (6)
• GPL compatible?
  • See the “GPL\0….” case

static inline int license_is_gpl_compatible(const char *license)
{
    return (strcmp(license, "GPL") == 0
            || strcmp(license, "GPL v2") == 0
            || strcmp(license, "GPL and additional rights") == 0
            || strcmp(license, "Dual BSD/GPL") == 0
            || strcmp(license, "Dual MIT/GPL") == 0
            || strcmp(license, "Dual MPL/GPL") == 0);
}

(include/linux/license.h)

check_modinfo (7)
• Also, the kernel is marked tainted when the module is loaded forcefully

static int try_to_force_load(struct module *mod, const char *reason)
{
#ifdef CONFIG_MODULE_FORCE_LOAD
    if (!test_taint(TAINT_FORCED_MODULE))
        pr_warn("%s: %s: kernel tainted.\n", mod->name, reason);
    add_taint_module(mod, TAINT_FORCED_MODULE, LOCKDEP_NOW_UNRELIABLE);
    return 0;
#else
    return -ENOEXEC;
#endif
}

Taints!
• The tainted mask is composed of several flags that identify the reason for tainting
• Lockdep is disabled if it will no longer work reliably
  • Ignoring the version magic, proprietary drivers, forceful unload

Kernel global flags:

void add_taint(unsigned flag, enum lockdep_ok lockdep_ok)
{
    if (lockdep_ok == LOCKDEP_NOW_UNRELIABLE && __debug_locks_off())
        pr_warn("Disabling lock debugging due to kernel taint\n");

    set_bit(flag, &tainted_mask);
}

(kernel/panic.c)

Per-module flags:

static inline void add_taint_module(struct module *mod, unsigned flag,
                                    enum lockdep_ok lockdep_ok)
{
    add_taint(flag, lockdep_ok);
    mod->taints |= (1U << flag);
}

(kernel/module.c)

Taints!
• 15 reasons are defined

#define TAINT_PROPRIETARY_MODULE        0
#define TAINT_FORCED_MODULE             1
#define TAINT_CPU_OUT_OF_SPEC           2
#define TAINT_FORCED_RMMOD              3
#define TAINT_MACHINE_CHECK             4
#define TAINT_BAD_PAGE                  5
#define TAINT_USER                      6
#define TAINT_DIE                       7
#define TAINT_OVERRIDDEN_ACPI_TABLE     8
#define TAINT_WARN                      9
#define TAINT_CRAP                      10
#define TAINT_FIRMWARE_WORKAROUND       11
#define TAINT_OOT_MODULE                12
#define TAINT_UNSIGNED_MODULE           13
#define TAINT_SOFTLOCKUP                14

(include/linux/kernel.h)

$ sysctl kernel.tainted
kernel.tainted = 12288

12288 = 0x3000
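Decoding the example value is a matter of testing bits: 0x3000 has bits 12 and 13 set, which correspond to TAINT_OOT_MODULE and TAINT_UNSIGNED_MODULE in the list above. A minimal user-space sketch, with hypothetical helper names and the flag names copied from the list:

```c
#include <stddef.h>

/* Flag names indexed by bit number, copied from the list above. */
static const char *const toy_taint_names[] = {
    "PROPRIETARY_MODULE", "FORCED_MODULE", "CPU_OUT_OF_SPEC",
    "FORCED_RMMOD", "MACHINE_CHECK", "BAD_PAGE", "USER", "DIE",
    "OVERRIDDEN_ACPI_TABLE", "WARN", "CRAP", "FIRMWARE_WORKAROUND",
    "OOT_MODULE", "UNSIGNED_MODULE", "SOFTLOCKUP"
};

#define TOY_NUM_TAINTS (sizeof(toy_taint_names) / sizeof(toy_taint_names[0]))

/* Returns 1 if bit "flag" is set in the tainted mask, else 0. */
int toy_taint_is_set(unsigned long mask, unsigned int flag)
{
    return (int)((mask >> flag) & 1UL);
}

/* Returns the name for a flag number, or NULL if it is out of range. */
const char *toy_taint_name(unsigned int flag)
{
    return flag < TOY_NUM_TAINTS ? toy_taint_names[flag] : NULL;
}
```

Looping `flag` from 0 to 14 and printing `toy_taint_name(flag)` whenever `toy_taint_is_set(mask, flag)` is true lists the reasons encoded in a value read from kernel.tainted.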

layout_sections
• Calculate the size of the final memory into which the module is loaded
  • Only sections with the “SHF_ALLOC” flag set are loaded
  • Sizes are calculated separately for “core” and “init”
    • A section is an “init” section when its name starts with “.init”
• Sets the following members of the module
  • core_size: sum of the sizes of the “core” sections to be loaded
  • core_text_size, core_ro_size: sum of the sizes of the text and R/O “core” sections
  • init_size: sum of the sizes of the “init” sections to be loaded
  • init_text_size, init_ro_size: … of the “init” sections
• sh_entsize in Elf_Shdr is used to hold the offset in memory at which the section will be loaded.
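The size calculation can be pictured as a running total that is rounded up to each section's alignment before the section is placed. The sketch below illustrates that idea in user space; the function names are hypothetical, alignments are assumed to be powers of two, and an alignment of 0 is treated as 1.

```c
#include <stddef.h>

/* Rounds size up to a multiple of align (a power of two; 0 means 1). */
static size_t toy_align_up(size_t size, size_t align)
{
    if (align == 0)
        align = 1;
    return (size + align - 1) & ~(align - 1);
}

/*
 * Places one section: returns the offset assigned to it and advances
 * *total (the running core_size or init_size) past the section.
 */
size_t toy_get_offset(size_t *total, size_t sh_size, size_t sh_addralign)
{
    size_t offset = toy_align_up(*total, sh_addralign);

    *total = offset + sh_size;
    return offset;
}
```

Placing a 0x16-byte section with alignment 1 and then a 6-byte section with alignment 16 assigns offsets 0 and 0x20, leaving a running total of 0x26.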

layout_sections
• The sections in the example “hello.ko” are categorized as follows:

Part        Kind         Sections
Core        Text         .text, .exit.text
Core        R/O          __ksymtab, __kcrctab, .rodata.str1.1, __ksymtab_strings, __mcount_loc
Core        R/W          .data, .gnu.linkonce.this_module, .bss
Init        Text         .init.text
Init        R/O          (none)
Init        R/W          (none)
(Others)    Not loaded   .rela.text, .rela.init.text, .rela__ksymtab, .rela__kcrctab, .rela__mcount_loc, .rela.gnu.linkonce.this_module, .comment, .note.GNU-stack, .shstrtab, .symtab, .strtab, .modinfo, __versions (*)

(*) These two sections (.modinfo and __versions) originally have SHF_ALLOC, but the flag is dropped by rewrite_section_headers

layout_symtab
• Put the symtab and strtab at the end of the init part
  • (Actually this function does not copy them, but only increases init_size by the size of the symtab)
• Put the symtab and strtab for the core symbols at the end of the core part.

move_module
• Allocate the final memory of the module, and update the boundary addresses for the modules (module_alloc_update_bounds)
• Copy the section contents and update the sh_addr values

static void *module_alloc_update_bounds(unsigned long size)
{
    void *ret = module_alloc(size);

    if (ret) {
        mutex_lock(&module_mutex);
        if ((unsigned long)ret < module_addr_min)
            module_addr_min = (unsigned long)ret;
        if ((unsigned long)ret + size > module_addr_max)
            module_addr_max = (unsigned long)ret + size;
        mutex_unlock(&module_mutex);
    }
    return ret;
}

module_alloc : x86
• x86
  • get_module_load_offset() picks the load offset as a random value on its first call if KASLR is enabled

#define MODULES_VADDR VMALLOC_START
#define MODULES_END VMALLOC_END

(arch/x86/include/asm/pgtable_32_types.h)

#define MODULES_VADDR (__START_KERNEL_map + KERNEL_IMAGE_SIZE)
#define MODULES_END _AC(0xffffffffff000000, UL)

(arch/x86/include/asm/pgtable_64_types.h)

void *module_alloc(unsigned long size)
{
    if (PAGE_ALIGN(size) > MODULES_LEN)
        return NULL;
    return __vmalloc_node_range(size, 1,
                                MODULES_VADDR + get_module_load_offset(),
                                MODULES_END, GFP_KERNEL | __GFP_HIGHMEM,
                                PAGE_KERNEL_EXEC, NUMA_NO_NODE,
                                __builtin_return_address(0));
}

(arch/x86/kernel/module.c)

module_alloc : ARM
• ARM

#ifndef CONFIG_THUMB2_KERNEL
#define MODULES_VADDR (PAGE_OFFSET - SZ_16M)
#else
/* smaller range for Thumb-2 symbols relocation (2^24)*/
#define MODULES_VADDR (PAGE_OFFSET - SZ_8M)
#endif

(arch/arm/include/asm/memory.h)

#define MODULES_END (PAGE_OFFSET)
#define MODULES_VADDR (MODULES_END - SZ_64M)

(arch/arm64/include/asm/memory.h)

#ifdef CONFIG_MMU
void *module_alloc(unsigned long size)
{
    return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
                                GFP_KERNEL, PAGE_KERNEL_EXEC, NUMA_NO_NODE,
                                __builtin_return_address(0));
}
#endif

(arch/arm/kernel/module.c)

module to final place
• Until now, the struct module of the module being loaded pointed into the temporary module image memory
• Now that the module has been copied to its final location, the pointer is also changed to point to the final location

    /* Module has been copied to its final place now: return it. */
    mod = (void *)info->sechdrs[info->index.mod].sh_addr;

load_module function (1) [Re]
• Signature check (module_sig_check)
• ELF header check (elf_header_check)
• Lay out and allocate the final location for the module (layout_and_allocate)
• Add the module to the “modules” list (add_unformed_module)
• Allocate per-cpu areas used in the module (percpu_modalloc)
• Initialize linked lists used for dependency management and unloading features (module_unload_init)
• Find optional sections (find_module_sections)
• License and version dirty hack (check_module_license_and_versions)
• Set up MODINFO_ATTR fields (setup_modinfo)

add_unformed_module
• Add the module to the “modules” list
• Check for duplicated loading of the same module
  • If the same module is still being loaded, this waits for the completion of that load, and then tries again
    • Just in case the other load of the module fails

static int add_unformed_module(struct module *mod)
{
    mod->state = MODULE_STATE_UNFORMED;
    ...
again:
    mutex_lock(&module_mutex);
    old = find_module_all(mod->name, strlen(mod->name), true);
    if (old != NULL) {
        if (old->state == MODULE_STATE_COMING
            || old->state == MODULE_STATE_UNFORMED) {
            mutex_unlock(&module_mutex);
            err = wait_finished_loading(mod);
            if (err)
                goto out_unlocked;
            goto again;
        }
        err = -EEXIST;
        goto out;
    }
    list_add_rcu(&mod->list, &modules);
    err = 0;
    ...

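When loading occurs concurrently

[Figure: timeline of concurrent loads. One load of module A goes UNFORMED -> COMING -> LIVE. A second, concurrent load of module A stays UNFORMED and then fails. Module B (depends on A) goes UNFORMED -> Resolve -> Resolve -> LIVE. Waiters are woken by wakeup_all (in do_init_module).]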

percpu_modalloc
• Allocate a per-cpu area of the size of the per-cpu section

static int percpu_modalloc(struct module *mod, struct load_info *info)
{
    Elf_Shdr *pcpusec = &info->sechdrs[info->index.pcpu];
    unsigned long align = pcpusec->sh_addralign;

    if (!pcpusec->sh_size)
        return 0;
    ...
    mod->percpu = __alloc_reserved_percpu(pcpusec->sh_size, align);
    if (!mod->percpu) {
        pr_warn("%s: Could not allocate %lu bytes percpu data\n",
                mod->name, (unsigned long)pcpusec->sh_size);
        return -ENOMEM;
    }
    mod->percpu_size = pcpusec->sh_size;
    return 0;
}

module_unload_init
• Initialize the reference counter for the module
  • After this function, it is 2.
• Initialize the lists that manage dependencies
  • source_list: list of “usages” in which the module is contained as their “source” (= the list of modules which use the symbols of this module)
  • target_list: list of “usages” in which the module is contained as their “target” (= the list of modules whose symbols this module uses)

static int module_unload_init(struct module *mod)
{
    atomic_set(&mod->refcnt, MODULE_REF_BASE);

    INIT_LIST_HEAD(&mod->source_list);
    INIT_LIST_HEAD(&mod->target_list);

    atomic_inc(&mod->refcnt);

    return 0;
}

find_module_sections
• Find additional sections in the module
• Mostly related to symbol tables and tracers

Sections (parameters and symbol tables):
__param
__ksymtab
__kcrctab
__ksymtab_gpl
__kcrctab_gpl
__ksymtab_gpl_future
__kcrctab_gpl_future
__ksymtab_unused
__kcrctab_unused
__ksymtab_unused_gpl
__kcrctab_unused_gpl

Sections (constructors, tracing and others):
.ctors / .init_array
__tracepoints_ptrs
__jump_table
_ftrace_events
__trace_printk_fmt
__mcount_loc
__ex_table
__verbose

check_module_license_and_versions
• Some hacks for specific modules
  • e.g. the ndiswrapper driver may be GPL (it needs symbols exported only to GPL modules), but the driver it loads will not be GPL, so the kernel is marked tainted

static int check_module_license_and_versions(struct module *mod)
{
    if (strcmp(mod->name, "ndiswrapper") == 0)
        add_taint(TAINT_PROPRIETARY_MODULE, LOCKDEP_NOW_UNRELIABLE);

    /* driverloader was caught wrongly pretending to be under GPL */
    if (strcmp(mod->name, "driverloader") == 0)
        add_taint_module(mod, TAINT_PROPRIETARY_MODULE,
                         LOCKDEP_NOW_UNRELIABLE);

    /* lve claims to be GPL but upstream won't provide source */
    if (strcmp(mod->name, "lve") == 0)
        add_taint_module(mod, TAINT_PROPRIETARY_MODULE,
                         LOCKDEP_NOW_UNRELIABLE);

check_module_license_and_versions (continued)
• Checks whether the exported symbols have CRCs (versions)

#ifdef CONFIG_MODVERSIONS
    if ((mod->num_syms && !mod->crcs)
        || (mod->num_gpl_syms && !mod->gpl_crcs)
        || (mod->num_gpl_future_syms && !mod->gpl_future_crcs)
#ifdef CONFIG_UNUSED_SYMBOLS
        || (mod->num_unused_syms && !mod->unused_crcs)
        || (mod->num_unused_gpl_syms && !mod->unused_gpl_crcs)
#endif
        ) {
        return try_to_force_load(mod,
                                 "no versions for exported symbols");
    }
#endif
    return 0;

setup_modinfo
• Call “setup” for the module attributes
  • Only “version” and “srcversion” have a “setup” callback.
• Module attributes
  • version, srcversion
  • uevent
  • initstate
  • coresize, initsize
  • taint
  • refcnt

#define MODINFO_ATTR(field) \
static void setup_modinfo_##field(struct module *mod, const char *s) \
{ \
    mod->field = kstrdup(s, GFP_KERNEL); \
} \

load_module function (2) [Re]
• Resolve the symbols (simplify_symbols)
• Fix up the addresses in the module (apply_relocations)
• Extable and per-cpu initialization (post_relocation)
• Flush the I-cache for the module area (flush_module_icache)
• Copy the module parameters to mod->args.
• Check for duplicated symbols, and set up NX attributes (complete_formation)
• Parse the module parameters (parse_args)
• sysfs setup (mod_sysfs_setup)
• Free the copy in the load_info structure (free_copy)
• Call the init function of the module (do_init_module)

simplify_symbols
• Replaces the addresses of the unresolved symbols in the “symtab” section with their actual addresses

static int simplify_symbols(struct module *mod, const struct load_info *info)
{
    Elf_Shdr *symsec = &info->sechdrs[info->index.sym];
    Elf_Sym *sym = (void *)symsec->sh_addr;
    ...
    for (i = 1; i < symsec->sh_size / sizeof(Elf_Sym); i++) {
        const char *name = info->strtab + sym[i].st_name;
        ...
        case SHN_UNDEF:
            ksym = resolve_symbol_wait(mod, info, name);
            /* Ok if resolved. */
            if (ksym && !IS_ERR(ksym)) {
                sym[i].st_value = ksym->value;
                break;
            }
            /* Ok if weak. */
            if (!ksym && ELF_ST_BIND(sym[i].st_info) == STB_WEAK)
                break;
            ...
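The SHN_UNDEF case can be tried out in user space with a small sketch. This is not kernel code: `struct toy_sym`, `toy_exports`, `toy_lookup` and `toy_simplify` are hypothetical names, the export table is a fixed array with made-up addresses, and the waiting and version checks done by resolve_symbol_wait are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the Elf_Sym fields used by simplify_symbols. */
enum toy_bind { TOY_GLOBAL, TOY_WEAK };

struct toy_sym {
    const char *name;      /* symbol name (strtab + st_name in the kernel) */
    int undefined;         /* nonzero when st_shndx == SHN_UNDEF */
    enum toy_bind bind;    /* ELF_ST_BIND(st_info) */
    unsigned long value;   /* st_value, patched when resolved */
};

struct toy_export {
    const char *name;
    unsigned long value;
};

/* Assumed table of exported symbols; the addresses are made up. */
static const struct toy_export toy_exports[] = {
    { "printk", 0x1000 },
    { "kmalloc", 0x2000 },
};

static const struct toy_export *toy_lookup(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(toy_exports) / sizeof(toy_exports[0]); i++)
        if (strcmp(toy_exports[i].name, name) == 0)
            return &toy_exports[i];
    return NULL;
}

/* Returns 0 on success, -1 if a non-weak undefined symbol stays unresolved. */
static int toy_simplify(struct toy_sym *syms, size_t n)
{
    size_t i;
    int ret = 0;

    assert(syms != NULL || n == 0);
    for (i = 1; i < n; i++) {            /* entry 0 is the reserved null symbol */
        const struct toy_export *e;

        if (!syms[i].undefined)
            continue;
        e = toy_lookup(syms[i].name);
        if (e) {                         /* Ok if resolved. */
            syms[i].value = e->value;
            continue;
        }
        if (syms[i].bind == TOY_WEAK)    /* Ok if weak. */
            continue;
        ret = -1;                        /* unknown symbol: remember the error */
    }
    return ret;
}
```

An unresolved weak symbol keeps its value of 0, so code that references it has to check for NULL before using it.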

resolve_symbol_wait
• Waits if the symbol belongs to a module that is still being initialized.

static const struct kernel_symbol *
resolve_symbol_wait(struct module *mod,
                    const struct load_info *info,
                    const char *name)
{
    const struct kernel_symbol *ksym;
    char owner[MODULE_NAME_LEN];

    if (wait_event_interruptible_timeout(module_wq,
            !IS_ERR(ksym = resolve_symbol(mod, info, name, owner))
            || PTR_ERR(ksym) != -EBUSY,
            30 * HZ) <= 0) {
        pr_warn("%s: gave up waiting for init of module %s.\n",
                mod->name, owner);
    }
    return ksym;
}

resolve_symbol
• Finds the symbol in the kernel’s symbol tables and in other modules’ symbol tables (find_symbol)
• If found, checks that the version (CRC) of the symbol matches the one the module expects (check_versions)
• Adds a dependency between the module being loaded and the module that owns the symbol (ref_module)

find_symbol (1)
• First, try to find the symbol in the kernel itself

bool each_symbol_section(bool (*fn)(const struct symsearch *arr,
                                    struct module *owner,
                                    void *data),
                         void *data)
{
    struct module *mod;
    static const struct symsearch arr[] = {
        { __start___ksymtab, __stop___ksymtab, __start___kcrctab,
          NOT_GPL_ONLY, false },
        { __start___ksymtab_gpl, __stop___ksymtab_gpl,
          __start___kcrctab_gpl,
          GPL_ONLY, false },
        { __start___ksymtab_gpl_future, __stop___ksymtab_gpl_future,
          __start___kcrctab_gpl_future,
          WILL_BE_GPL_ONLY, false },
        ...
    };

    if (each_symbol_in_section(arr, ARRAY_SIZE(arr), NULL, fn, data))
        return true;

find_symbol (2)
• Then try to find it in the modules (skipping modules that are still in the UNFORMED state)

    list_for_each_entry_rcu(mod, &modules, list) {
        struct symsearch arr[] = {
            { mod->syms, mod->syms + mod->num_syms, mod->crcs,
              NOT_GPL_ONLY, false },
            { mod->gpl_syms, mod->gpl_syms + mod->num_gpl_syms,
              mod->gpl_crcs,
              GPL_ONLY, false },
            { mod->gpl_future_syms,
              mod->gpl_future_syms + mod->num_gpl_future_syms,
              mod->gpl_future_crcs,
              WILL_BE_GPL_ONLY, false },
            ...
        };

        if (mod->state == MODULE_STATE_UNFORMED)
            continue;

        if (each_symbol_in_section(arr, ARRAY_SIZE(arr), mod, fn, data))
            return true;
    }
    return false;
}

find_symbol (3)
• Binary search in the section!
• check_symbol then checks the found symbol’s target license

static int cmp_name(const void *va, const void *vb)
{
    const char *a;
    const struct kernel_symbol *b;
    a = va; b = vb;
    return strcmp(a, b->name);
}

static bool find_symbol_in_section(const struct symsearch *syms,
                                   struct module *owner,
                                   void *data)
{
    struct find_symbol_arg *fsa = data;
    ...
    sym = bsearch(fsa->name, syms->start, syms->stop - syms->start,
                  sizeof(struct kernel_symbol), cmp_name);

    if (sym != NULL && check_symbol(syms, owner, sym - syms->start, data))
        return true;
    return false;
}
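The same lookup pattern can be reproduced in user space with the C library's bsearch. The following is a minimal sketch under stated assumptions: `struct toy_ksym`, `toy_ksymtab`, `toy_cmp_name` and `toy_find_symbol` are hypothetical names, the addresses are made up, and the table must already be sorted by name (the kernel symbol sections are expected to be sorted for the same reason).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same two fields as the kernel_symbol initializer: address and name. */
struct toy_ksym {
    unsigned long value;
    const char *name;
};

/* Must be sorted by name for bsearch; the addresses are made up. */
static const struct toy_ksym toy_ksymtab[] = {
    { 0x3000, "kfree" },
    { 0x2000, "kmalloc" },
    { 0x1000, "printk" },
};

/* Same shape as cmp_name: the key is a string, the element a table entry. */
static int toy_cmp_name(const void *va, const void *vb)
{
    const char *a = va;
    const struct toy_ksym *b = vb;

    return strcmp(a, b->name);
}

/* Returns the matching entry, or NULL if the name is not exported. */
static const struct toy_ksym *toy_find_symbol(const char *name)
{
    assert(name != NULL);
    return bsearch(name, toy_ksymtab,
                   sizeof(toy_ksymtab) / sizeof(toy_ksymtab[0]),
                   sizeof(toy_ksymtab[0]), toy_cmp_name);
}
```

Note that the comparison function receives the search key first and a table element second, which is why the two arguments can have different types.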

ref_module
• If the target module is NULL (= the symbol is in the kernel) or the module already uses the target module, it returns immediately.
• Increments the reference counter of the target module (if the target module is in the middle of initialization, it returns -EBUSY)
• Adds a usage record (add_module_usage)
  • Source: the module
  • Target: the target module

static int add_module_usage(struct module *a, struct module *b)
{
    struct module_use *use;

    use = kmalloc(sizeof(*use), GFP_ATOMIC);
    ...
    use->source = a;
    use->target = b;
    list_add(&use->source_list, &b->source_list);
    list_add(&use->target_list, &a->target_list);
    ...
}

Usage example
[Figure: kernel module B calls f(), which is defined in kernel module A, so B depends on A. struct module A has refcnt 2 and struct module B has refcnt 1. One struct module_use (source: &B, target: &A) is linked through the source_list and target_list members of the module structures.]
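The bookkeeping in this example can be sketched in user space. This is only an illustration: `struct toy_module`, `struct toy_use` and `toy_ref_module` are hypothetical names, plain singly linked lists replace the kernel's list_head, and the "module still initializing" check (-EBUSY) is omitted.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_module;

struct toy_use {
    struct toy_module *source;     /* the module that uses ... */
    struct toy_module *target;     /* ... the module that is used */
    struct toy_use *next_source;   /* link in target->source_list */
    struct toy_use *next_target;   /* link in source->target_list */
};

struct toy_module {
    const char *name;
    int refcnt;                    /* starts at 1 while the module is loaded */
    struct toy_use *source_list;   /* records of modules that use me */
    struct toy_use *target_list;   /* records of modules that I use */
};

static int toy_already_uses(const struct toy_module *a,
                            const struct toy_module *b)
{
    const struct toy_use *u;

    for (u = b->source_list; u; u = u->next_source)
        if (u->source == a)
            return 1;
    return 0;
}

/* Record that module a uses module b. Returns 0 on success, -1 on failure. */
static int toy_ref_module(struct toy_module *a, struct toy_module *b)
{
    struct toy_use *use;

    assert(a != NULL);
    if (b == NULL || toy_already_uses(a, b))
        return 0;                  /* kernel symbol, or already recorded */

    use = malloc(sizeof(*use));
    if (!use)
        return -1;
    b->refcnt++;                   /* the target gains one reference */
    use->source = a;
    use->target = b;
    use->next_source = b->source_list;
    b->source_list = use;
    use->next_target = a->target_list;
    a->target_list = use;
    return 0;
}
```

Calling `toy_ref_module(&B, &A)` on two freshly loaded modules reproduces the numbers in the figure: A ends with refcnt 2, B stays at 1, and a second call changes nothing.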

apply_relocations
• Apply relocations for each “rel” section
• “rel” sections
  • Section type: SHT_REL or SHT_RELA

[Nr] Name             Type      Address           Offset    Size              EntSize           Flags  Link  Info  Align
[ 2] .text            PROGBITS  0000000000000000  00000070  0000000000000019  0000000000000000  AX     0     0     16
[ 3] .rela.text       RELA      0000000000000000  00000ca8  0000000000000048  0000000000000018         24    2     8
[ 4] .init.text       PROGBITS  0000000000000000  00000089  0000000000000016  0000000000000000  AX     0     0     1
[ 5] .rela.init.text  RELA      0000000000000000  00000cf0  0000000000000030  0000000000000018         24    4     8
[24] .symtab          SYMTAB    0000000000000000  00000db0  00000000000003c0  0000000000000018         25    32    8
[25] .strtab          STRTAB    0000000000000000  00001170  000000000000014a  0000000000000000         0     0     1

Relocation
• Example
  • This function uses the “printk” symbol, which is outside the module (and also __fentry__).

void say_hello(void)
{
    printk(KERN_INFO "Hello, World.\n");
}

0000000000000000 <say_hello>:
   0: e8 00 00 00 00          callq  5 <say_hello+0x5>
            1: R_X86_64_PC32  __fentry__-0x4
   5: 55                      push   %rbp
   6: 48 c7 c7 00 00 00 00    mov    $0x0,%rdi
            9: R_X86_64_32S   .rodata.str1.1
   d: 31 c0                   xor    %eax,%eax
   f: 48 89 e5                mov    %rsp,%rbp
  12: e8 00 00 00 00          callq  17 <say_hello+0x17>
            13: R_X86_64_PC32 printk-0x4
  17: 5d                      pop    %rbp
  18: c3                      retq

• RIP-relative addressing is based on the address of the next instruction (hence the -0x4 addend).

apply_relocate[_add]
• Addressing is architecture-dependent, so relocation is also architecture-dependent
• x86_64 (RELA)
  • A RELA section is an array of Elf64_Rela
  • In the “printk” example
    • r_offset = 0x13
    • r_info: type R_X86_64_PC32 (RIP-relative in x86_64), together with the symbol index
    • r_addend = -0x04

typedef struct elf64_rela {
    Elf64_Addr r_offset;    /* Location at which to apply the action */
    Elf64_Xword r_info;     /* index and type of relocation */
    Elf64_Sxword r_addend;  /* Constant addend used to compute value */
} Elf64_Rela;

apply_relocate_add in x86_64

int apply_relocate_add(Elf64_Shdr *sechdrs, const char *strtab,
                       unsigned int symindex, unsigned int relsec,
                       struct module *me)
{
    ...
    for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rel); i++) {
        /* This is where to make the change */
        loc = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr
            + rel[i].r_offset;

        /* This is the symbol it is referring to. Note that
           all undefined symbols have been resolved. */
        sym = (Elf64_Sym *)sechdrs[symindex].sh_addr
            + ELF64_R_SYM(rel[i].r_info);
        ...
        val = sym->st_value + rel[i].r_addend;

        switch (ELF64_R_TYPE(rel[i].r_info)) {
        ...
        case R_X86_64_64:
            *(u64 *)loc = val;
            break;
        ...
        case R_X86_64_32S:
            *(s32 *)loc = val;
            if ((s64)val != *(s32 *)loc)
                goto overflow;
            break;
        case R_X86_64_PC32:
            val -= (u64)loc;
            *(u32 *)loc = val;
#if 0
            if ((s64)val != *(s32 *)loc)
                goto overflow;
#endif
            break;
        ...

• R_X86_64_PC32: calculates the delta between the current address and the target address
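The R_X86_64_PC32 arithmetic (symbol value + addend - location) can be checked with a user-space sketch. `toy_apply_pc32` is a hypothetical helper, the addresses in the usage note are made up, and, unlike the quoted kernel code where the overflow check is compiled out with #if 0, this sketch rejects displacements that do not fit in 32 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Apply an R_X86_64_PC32-style relocation to a byte buffer.
 *   image:      buffer holding the code
 *   image_addr: assumed run-time address of image[0]
 *   r_offset:   offset of the 32-bit field inside the buffer
 *   sym_value:  resolved address of the target symbol (S)
 *   r_addend:   addend from the relocation entry (A)
 * Returns 0 on success, -1 if the displacement does not fit in 32 bits.
 */
static int toy_apply_pc32(uint8_t *image, uint64_t image_addr,
                          uint64_t r_offset, uint64_t sym_value,
                          int64_t r_addend)
{
    uint64_t loc = image_addr + r_offset;              /* P */
    int64_t delta;
    int32_t disp;

    assert(image != NULL);
    /* S + A - P, computed modulo 2^64 and then read as a signed value */
    delta = (int64_t)(sym_value + (uint64_t)r_addend - loc);
    if (delta < INT32_MIN || delta > INT32_MAX)
        return -1;
    disp = (int32_t)delta;
    /* Stored in host byte order; x86-64 is little-endian. */
    memcpy(image + r_offset, &disp, sizeof(disp));
    return 0;
}
```

With the printk example (r_offset = 0x13, r_addend = -4), a module assumed to be loaded at 0xffffffffa0000000 and printk assumed to be at 0xffffffff81000000, the stored displacement is -0x1f000017. Adding it to the address of the next instruction (module base + 0x17) gives the address of printk again, which is exactly what the CPU does when it executes the call.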

post_relocation
• Sort the exception table (sort_extable)
  • Exception table: the addresses of the instructions whose page faults the page fault handler treats specially
  • e.g., get_user
• Copy the per-cpu section contents for all possible CPUs (percpu_modcopy)
• Set the kallsyms-related members to their final locations, and copy the core symtab from the whole symtab (add_kallsyms)
• Call the architecture-dependent function that finalizes loading (module_finalize)

    for_each_possible_cpu(cpu)
        memcpy(per_cpu_ptr(mod->percpu, cpu), from, size);

module_finalize in x86_64
• Alternatives, paravirt and so on.

int module_finalize(const Elf_Ehdr *hdr,
                    const Elf_Shdr *sechdrs,
                    struct module *me)
{
    const Elf_Shdr *s, *text = NULL, *alt = NULL, *locks = NULL,
        *para = NULL;
    char *secstrings = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset;

    for (s = sechdrs; s < sechdrs + hdr->e_shnum; s++) {
        if (!strcmp(".text", secstrings + s->sh_name))
            text = s;
        if (!strcmp(".altinstructions", secstrings + s->sh_name))
            alt = s;
        if (!strcmp(".smp_locks", secstrings + s->sh_name))
            locks = s;
        if (!strcmp(".parainstructions", secstrings + s->sh_name))
            para = s;
    }

    if (alt) {
        /* patch .altinstructions */
        void *aseg = (void *)alt->sh_addr;
        apply_alternatives(aseg, aseg + alt->sh_size);
    }
    ...

flush_module_icache
• Flush the instruction cache for the text area so that the code can be executed correctly

static void flush_module_icache(const struct module *mod)
{
    mm_segment_t old_fs;

    /* flush the icache in correct context */
    old_fs = get_fs();
    set_fs(KERNEL_DS);

    if (mod->module_init)
        flush_icache_range((unsigned long)mod->module_init,
                           (unsigned long)mod->module_init
                           + mod->init_size);
    flush_icache_range((unsigned long)mod->module_core,
                       (unsigned long)mod->module_core + mod->core_size);

    set_fs(old_fs);
}

complete_formation
• Check whether the exported symbols are already exported by another module (verify_export_symbols)
• Add section information of symbols for BUG reports (module_bug_finalize)
• Set NX and RO for the core and init areas
• Set the module state to MODULE_STATE_COMING

    mod->state = MODULE_STATE_COMING;


do_init_module (1)
• Allocate a structure that call_rcu will use later to free the init area

    struct mod_initfree *freeinit;

    freeinit = kmalloc(sizeof(*freeinit), GFP_KERNEL);
    ...
    freeinit->module_init = mod->module_init;

• Call the init function of the module

    do_mod_ctors(mod);
    /* Start the module */
    if (mod->init != NULL)
        ret = do_one_initcall(mod->init);

• Set the module state to MODULE_STATE_LIVE

    mod->state = MODULE_STATE_LIVE;

do_init_module (2)
• To avoid deadlock, wait for outstanding asynchronous work (async_synchronize_full)

    if (current->flags & PF_USED_ASYNC)
        async_synchronize_full();

• Drop the initial reference

    mutex_lock(&module_mutex);
    /* Drop initial reference. */
    module_put(mod);

• Clear the init-related data

    trim_init_extable(mod);
#ifdef CONFIG_KALLSYMS
    mod->num_symtab = mod->core_num_syms;
    mod->symtab = mod->core_symtab;
    mod->strtab = mod->core_strtab;
#endif
    unset_module_init_ro_nx(mod);
    module_arch_freeing_init(mod);

do_init_module (3)
• Finally, free the init area (deferred through call_rcu)

    call_rcu(&freeinit->rcu, do_free_init);
    mutex_unlock(&module_mutex);

• Wake up anyone waiting for the initialization to complete

    wake_up_all(&module_wq);

Details (2) Unloading

sys_delete_module
• Check the capability and the module-blocking parameter
• Find the specified module by name
• If the module has an init function but no exit function, and the unload is not forced, it fails with -EBUSY
• Try to stop the module (try_stop_module)
• Call the exit function
• Free the module

Now (3.19) [Re]
• The reference count is now an atomic_t (it was a per-cpu int before) and is checked without stop_machine
  • (thanks to a mysterious guy)

static int try_stop_module(struct module *mod, int flags, int *forced)
{
    /* If it's not unused, quit unless we're forcing. */
    if (try_release_module_ref(mod) != 0) {
        *forced = try_force_unload(flags);
        if (!(*forced))
            return -EWOULDBLOCK;
    }

    /* Mark it as dying. */
    mod->state = MODULE_STATE_GOING;

    return 0;
}

try_release_module_ref
• Decrements the reference counter and checks whether it reaches zero (= the module can be unloaded).

static int try_release_module_ref(struct module *mod)
{
    int ret;

    /* Try to decrement refcnt which we set at loading */
    ret = atomic_sub_return(MODULE_REF_BASE, &mod->refcnt);
    BUG_ON(ret < 0);
    if (ret)
        /* Someone can put this right now, recover with checking */
        ret = atomic_add_unless(&mod->refcnt, MODULE_REF_BASE, 0);

    return ret;
}
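The release-and-recover pattern can be tried in user space with C11 atomics. This is a sketch, not the kernel implementation: `TOY_REF_BASE`, `toy_add_unless` and `toy_try_release_ref` are hypothetical names, and the kernel's atomic_add_unless is emulated here with a compare-and-swap loop.

```c
#include <assert.h>
#include <stdatomic.h>

#define TOY_REF_BASE 1   /* mirrors MODULE_REF_BASE: the reference held while loaded */

/*
 * Add a to *v unless *v == u. Returns nonzero if the add happened.
 * Emulates the kernel's atomic_add_unless() with a CAS loop.
 */
static int toy_add_unless(atomic_int *v, int a, int u)
{
    int old = atomic_load(v);

    while (old != u) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(v, &old, old + a))
            return 1;
    }
    return 0;
}

/* Returns 0 if the count dropped to zero (the module can be unloaded). */
static int toy_try_release_ref(atomic_int *refcnt)
{
    /* Drop the base reference; fetch_sub returns the old value. */
    int ret = atomic_fetch_sub(refcnt, TOY_REF_BASE) - TOY_REF_BASE;

    assert(ret >= 0);    /* stands in for BUG_ON(ret < 0) */
    if (ret)
        /* Still in use: restore the base reference, unless a concurrent
         * put has brought the count to zero in the meantime. */
        ret = toy_add_unless(refcnt, TOY_REF_BASE, 0);
    return ret;
}
```

An unused module (count 1) ends at 0 and the function returns 0. A module with extra references gets its base reference back, the count is unchanged, and the function returns nonzero.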

Details (3) Building an out-of-tree kernel module

Build steps (1): .c -> .o
• Make .tmp_versions and create .tmp_versions/<module>.mod
  • The file contains the names of the final .ko file and the source .o files
• Compile .tmp_[name].o from [name].c
• Calculate the CRCs (versions) of the exported symbols
  • Look for a __ksymtab section in .tmp_[name].o
    • objdump -h (obj) | grep -q __ksymtab
  • Calculate the CRCs of the exported symbols in the source file with genksyms (the output is in LD script format)
  • Link the CRC values into the object file

cmd_modversions = \
    if $(OBJDUMP) -h $(@D)/.tmp_$(@F) | grep -q __ksymtab; then \
        $(call cmd_gensymtypes,$(KBUILD_SYMTYPES),$(@:.o=.symtypes)) \
            > $(@D)/.tmp_$(@F:.o=.ver); \
        \
        $(LD) $(LDFLAGS) -r -o $@ $(@D)/.tmp_$(@F) \
            -T $(@D)/.tmp_$(@F:.o=.ver); \
        rm -f $(@D)/.tmp_$(@F) $(@D)/.tmp_$(@F:.o=.ver); \
    else \
        mv -f $(@D)/.tmp_$(@F) $@; \
    fi;

Example of the generated LD script:
__crc_say_hello = 0xb37b83db ;

Exported Symbols
• Each exported symbol has a struct in a __ksymtab* section.

#define __EXPORT_SYMBOL(sym, sec) \
    extern typeof(sym) sym; \
    __CRC_SYMBOL(sym, sec) \
    static const char __kstrtab_##sym[] \
    __attribute__((section("__ksymtab_strings"), aligned(1))) \
    = VMLINUX_SYMBOL_STR(sym); \
    extern const struct kernel_symbol __ksymtab_##sym; \
    __visible const struct kernel_symbol __ksymtab_##sym \
    __used \
    __attribute__((section("___ksymtab" sec "+" #sym), unused)) \
    = { (unsigned long)&sym, __kstrtab_##sym }

#define EXPORT_SYMBOL(sym) \
    __EXPORT_SYMBOL(sym, "")

#define EXPORT_SYMBOL_GPL(sym) \
    __EXPORT_SYMBOL(sym, "_gpl")

#define EXPORT_SYMBOL_GPL_FUTURE(sym) \
    __EXPORT_SYMBOL(sym, "_gpl_future")

(include/linux/export.h)

CRC sections
• Declare the CRC symbols in CRC sections with the weak attribute.

#ifndef __GENKSYMS__
#ifdef CONFIG_MODVERSIONS
/* Mark the CRC weak since genksyms apparently decides not to
 * generate a checksums for some symbols */
#define __CRC_SYMBOL(sym, sec) \
    extern __visible void *__crc_##sym __attribute__((weak)); \
    static const unsigned long __kcrctab_##sym \
    __used \
    __attribute__((section("___kcrctab" sec "+" #sym), unused)) \
    = (unsigned long) &__crc_##sym;
#else
#define __CRC_SYMBOL(sym, sec)
#endif
...

(include/linux/export.h)

Build steps (2): .c -> .o
• Create the __mcount_loc list (if -pg is enabled)
  • The list of pointers to the locations where “mcount” is called
• Fix up the dep file
• Link into a single object file (<module>.o) if the module is composed of multiple object files

Build steps (3): Stage 2
• Create <module>.mod.c and <module>.symvers with the modpost command
• Compile <module>.mod.c
• Link <module>.mod.o and <module>.o into the module <module>.ko

modpost = scripts/mod/modpost \
 $(if $(CONFIG_MODVERSIONS),-m) \
 $(if $(CONFIG_MODULE_SRCVERSION_ALL),-a,) \
 $(if $(KBUILD_EXTMOD),-i,-o) $(kernelsymfile) \
 $(if $(KBUILD_EXTMOD),-I $(modulesymfile)) \
 $(if $(KBUILD_EXTRA_SYMBOLS), $(patsubst %, -e %,$(KBUILD_EXTRA_SYMBOLS))) \
 $(if $(KBUILD_EXTMOD),-o $(modulesymfile)) \
 $(if $(CONFIG_DEBUG_SECTION_MISMATCH),,-S) \
 $(if $(KBUILD_EXTMOD)$(KBUILD_MODPOST_WARN),-w)

MODPOST_OPT=$(subst -i,-n,$(filter -i,$(MAKEFLAGS)))

# We can go over command line length here, so be careful.
quiet_cmd_modpost = MODPOST $(words $(filter-out vmlinux FORCE, $^)) modules
      cmd_modpost = $(MODLISTCMD) | sed 's/\.ko$$/.o/' | $(modpost) $(MODPOST_OPT) -s -T -

modpost (1)
• Collects module information, symbol information and versions from the kernel symbols and the object files, and generates the module source file and the symvers file.
• Arguments

$ modpost [Options...] [(Module object files...)]

• Options

Option             Description
-m                 CONFIG_MODVERSIONS (symbol versions)
-a                 CONFIG_MODULE_SRCVERSION_ALL (“srcversion” in modinfo): MD4 of the source files that made the module
-I (symvers file)  Input symbol versions (kernel symbols)
-e (symvers file)  Input extra symbol versions
-o (symvers file)  Output symbol versions (for the exported symbols of the module)
-T (files)         Source (object) file list

modpost (2)
• Generate the source file

    for (mod = modules; mod; mod = mod->next) {
        char fname[PATH_MAX];
        ...
        buf.pos = 0;

        add_header(&buf, mod);
        add_intree_flag(&buf, !external_module);
        add_staging_flag(&buf, mod->name);
        err |= add_versions(&buf, mod);
        add_depends(&buf, mod, modules);
        add_moddevtable(&buf, mod);
        add_srcversion(&buf, mod);

        sprintf(fname, "%s.mod.c", mod->name);
        write_if_changed(&buf, fname);
    }

(scripts/mod/modpost.c)

modpost (3)
• Dump the symbol versions

static void write_dump(const char *fname)
{
    struct buffer buf = { };
    struct symbol *symbol;
    int n;

    for (n = 0; n < SYMBOL_HASH_SIZE ; n++) {
        symbol = symbolhash[n];
        while (symbol) {
            if (dump_sym(symbol))
                buf_printf(&buf, "0x%08x\t%s\t%s\t%s\n",
                           symbol->crc, symbol->name,
                           symbol->module->name,
                           export_str(symbol->export));
            symbol = symbol->next;
        }
    }
    write_if_changed(&buf, fname);
}

(scripts/mod/modpost.c)

Example line of the generated symvers file:
0xb37b83db	say_hello	/home/shimos/test_module/hello	EXPORT_SYMBOL
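Reading such a line back is straightforward, because the four fields are separated by tabs and contain no spaces. The following user-space sketch assumes exactly that format. `struct toy_symvers`, `toy_parse_symvers` and `toy_is_gpl_only` are hypothetical names and the buffer sizes are arbitrary choices, not values taken from modpost.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct toy_symvers {
    unsigned int crc;
    char name[64];          /* arbitrary sizes for this sketch */
    char module[256];
    char export_type[32];
};

/*
 * Parse one "0x<crc>\t<symbol>\t<module>\t<export type>" line.
 * Returns 0 on success, -1 if the line does not have four fields.
 */
static int toy_parse_symvers(const char *line, struct toy_symvers *out)
{
    int n;

    assert(line != NULL && out != NULL);
    /* %x accepts the leading "0x"; a space in the format matches the tabs. */
    n = sscanf(line, "%x %63s %255s %31s",
               &out->crc, out->name, out->module, out->export_type);
    return n == 4 ? 0 : -1;
}

/* True if the entry was exported with EXPORT_SYMBOL_GPL. */
static int toy_is_gpl_only(const struct toy_symvers *sv)
{
    return strcmp(sv->export_type, "EXPORT_SYMBOL_GPL") == 0;
}
```

For the example line above, the parser yields crc 0xb37b83db, name say_hello, the module path, and the export type EXPORT_SYMBOL, so `toy_is_gpl_only` returns 0.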

Generated <module>.mod.c (1)
• Example

#include <linux/module.h>
#include <linux/vermagic.h>
#include <linux/compiler.h>

MODULE_INFO(vermagic, VERMAGIC_STRING);

__visible struct module __this_module
__attribute__((section(".gnu.linkonce.this_module"))) = {
    .name = KBUILD_MODNAME,
    .init = init_module,
#ifdef CONFIG_MODULE_UNLOAD
    .exit = cleanup_module,
#endif
    .arch = MODULE_ARCH_INIT,
};

static const struct modversion_info ____versions[]
__used
__attribute__((section("__versions"))) = {
    { 0x9412fa01, __VMLINUX_SYMBOL_STR(module_layout) },
    { 0x27e1a049, __VMLINUX_SYMBOL_STR(printk) },
    { 0xbdfb6dbb, __VMLINUX_SYMBOL_STR(__fentry__) },
};
...

• MODULE_INFO(vermagic, ...): additional modinfo is included
• __this_module: the base of struct module
• ____versions: the symbols, and their (expected) versions, which this module depends on

Generated <module>.mod.c (2)
• Example

static const char __module_depends[]
__used
__attribute__((section(".modinfo"))) =
"depends=";

MODULE_INFO(srcversion, "8D5BACDC1EA9421ABFF79DD");

• __module_depends: modinfo about the dependencies (but the kernel does not use this)
• MODULE_INFO(srcversion, ...): the modinfo “srcversion”

modinfo
• The modinfo strings are created by macros, and are concatenated by collecting them into a single section

#define __MODULE_INFO(tag, name, info) \
static const char __UNIQUE_ID(name)[] \
    __used __attribute__((section(".modinfo"), unused, aligned(1))) \
    = __stringify(tag) "=" info

(include/linux/moduleparam.h)

#define MODULE_INFO(tag, info) __MODULE_INFO(tag, tag, info)
...
#define MODULE_LICENSE(_license) MODULE_INFO(license, _license)
...
#define MODULE_AUTHOR(_author) MODULE_INFO(author, _author)
...

(include/linux/module.h)
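Because every MODULE_INFO string is a NUL-terminated "tag=value" array placed in the same section, the section contents can be scanned string by string. The sketch below shows one way to do that in user space. `toy_get_modinfo` is a hypothetical helper written for this note, not the kernel's own lookup function, and the buffer in the usage note is hand-built sample data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Look up tag in a buffer made of consecutive NUL-terminated
 * "tag=value" strings. Returns a pointer to the value, or NULL.
 */
static const char *toy_get_modinfo(const char *sec, size_t size,
                                   const char *tag)
{
    size_t taglen;
    const char *p = sec;
    const char *end = sec + size;

    assert(sec != NULL && tag != NULL);
    taglen = strlen(tag);
    while (p < end) {
        const char *nul = memchr(p, '\0', (size_t)(end - p));
        size_t len;

        if (!nul)
            break;                  /* unterminated tail: stop scanning */
        len = (size_t)(nul - p);
        if (len > taglen && strncmp(p, tag, taglen) == 0 && p[taglen] == '=')
            return p + taglen + 1;
        p = nul + 1;                /* skip the string and its NUL */
    }
    return NULL;
}
```

With a buffer such as "license=GPL\0depends=\0srcversion=8D5BACDC1EA9421ABFF79DD", looking up "license" returns "GPL", "depends" returns an empty string, and a tag that is absent returns NULL.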

UNIQUE_ID

#define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__)

(include/linux/compiler-gcc4.h)
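The effect of pasting __COUNTER__ into the name can be seen in a stand-alone file. This sketch assumes GCC or Clang (both provide __COUNTER__ and __attribute__((used))). The `TOY_*` macros are hypothetical re-creations for illustration, and the section attribute of the real __MODULE_INFO is left out.

```c
#include <assert.h>

/* Two-level paste so that __COUNTER__ is expanded before ## is applied. */
#define TOY_PASTE_(a, b) a##b
#define TOY_PASTE(a, b) TOY_PASTE_(a, b)
#define TOY_UNIQUE_ID(prefix) \
    TOY_PASTE(TOY_PASTE(toy_unique_id_, prefix), __COUNTER__)

#define TOY_STRINGIFY_(x) #x
#define TOY_STRINGIFY(x) TOY_STRINGIFY_(x)

/* Hypothetical analogue of __MODULE_INFO without the section attribute. */
#define TOY_INFO(tag, info) \
    static const char TOY_UNIQUE_ID(tag)[] __attribute__((used)) \
        = TOY_STRINGIFY(tag) "=" info

/* The same tag twice: each expansion gets a different variable name. */
TOY_INFO(author, "first author");
TOY_INFO(author, "second author");

/* __COUNTER__ yields a new value on every expansion. */
enum { TOY_FIRST = __COUNTER__, TOY_SECOND = __COUNTER__ };
static_assert(TOY_SECOND == TOY_FIRST + 1,
              "__COUNTER__ increments on each use");
```

Without the counter, the second `TOY_INFO(author, ...)` would redefine the same identifier and fail to compile, which is why MODULE_AUTHOR can appear several times in one source file.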
