The Hello World Kernel Module
Cave programmers carving Hello World on cave computers
This is a brief introduction to a (very long) essay. What lies below are my notes from completing the 1st in a set of 20 tasks from The Eudyptula Challenge. The tasks were emailed one at a time, starting with building a "hello world" kernel module (this essay) and progressing in difficulty until we ultimately submit patches into the main tree of the Linux kernel. The ultimate goal of The Eudyptula Challenge is to get new developers comfortable with the somewhat unique world of kernel development by separating the "on-boarding" process into focused, manageable tasks.
Sadly, The Eudyptula Challenge is no longer accepting new applicants. However, if you wish to work on the tasks yourself, I've published the 20 tasks I've managed to find along with the code I used to "complete" them in a git repo here.
Task No.1
Write a Linux kernel module, and stand-alone Makefile, that when loaded prints to the kernel debug log level, "Hello World!" Be sure to make the module be able to be unloaded as well.
The Makefile should build the kernel module against the source for the currently running kernel, or, use an environment variable to specify what kernel tree to build it against.
Please show proof of this module being built, and running, in your kernel. What this proof is is up to you, I'm sure you can come up with something. Also be sure to send the kernel module you wrote, along with the Makefile you created to build the module.
—Little Penguin
What Is A Module
A kernel module is a piece of code designed to be loaded and unloaded on demand by our kernels. For example, the device drivers for your keyboard or a network card are a type of module. By separating the kernel into individual software components, we can keep the overall size of the kernel small, letting Linux fit into the smallest of embedded systems. Some kernel modules, like the one we'll be building, can even be installed without the need to recompile and reboot our kernel, making upgrades easy and saving us a lot of time.
If you have access to a Linux machine, you can find the modules that
are currently loaded into the kernel by using the lsmod command, which
gets its information from /proc/modules.
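To get a feel for what lsmod is reading, here is a minimal userspace sketch that parses a single line of /proc/modules. It assumes the usual layout of name, size, and reference count at the start of each line. The struct, the function name, and the sample line in the comment are my own inventions for illustration.

```c
#include <stdio.h>

/* One parsed entry from /proc/modules (field names are this sketch's own). */
struct module_entry {
	char name[64];
	unsigned long size; /* memory used by the module, in bytes */
	int refcount;       /* number of current users of the module */
};

/*
 * Parse the first three fields of one /proc/modules line, e.g. the
 * made-up line "hello_world 16384 0 - Live 0x0000000000000000".
 * Returns 0 on success, -1 if the line does not match.
 */
int parse_modules_line(const char *line, struct module_entry *out)
{
	if (sscanf(line, "%63s %lu %d",
		   out->name, &out->size, &out->refcount) != 3)
		return -1;
	return 0;
}
```

Feeding each line of /proc/modules through parse_modules_line() should give roughly the first three columns that lsmod prints.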
Chiseling A Cave Module
Every kernel module must have at least two functions: one that will be
called when we install the module and another that will be called when
we remove it from the kernel. Back in the pre v2.3 era (the late 1990s)
this could only be done with a "start" function called init_module()
and an "end" function called cleanup_module(). There are more modern
(and preferred) methods available to us today; however, some developers
still use these, so it's a great starting point.
#include <linux/kernel.h> /* for KERN_DEBUG */
#include <linux/module.h> /* for all kernel modules */

int init_module(void)
{
	printk(KERN_DEBUG "Hello World.\n");
	return 0; /* init_module loaded successfully */
}

void cleanup_module(void)
{
	printk(KERN_DEBUG "oh, the rest is silence.\n");
}
Typically init_module() is used to register handlers or to alter some
other part of the kernel on behalf of a device. cleanup_module() will
then undo those changes, allowing the module to be removed safely from
the kernel. Both of these functions (as of version 5.7) can be found
on line 75, as well as everything else we need, in linux/module.h of
the source code.
printk() != printf()
To print Hello World on "the kernel debug log level", we'll need to
use another, very old, function called printk(). Unlike the printf()
commonly used in userspace applications, printk() is not designed to
communicate to the user (or say hello to worlds). It's a logging
mechanism used to give warnings and to log messages. This is why each
printk() statement also comes with a priority. There are currently 8
defined priorities we can use, ranging from KERN_DEBUG to KERN_EMERG.
You can see them all, and their definitions, currently (version 5.7)
in linux/kern_levels.h in the source code.
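As a quick reference, the eight priorities can be laid out as a small lookup table. This is a userspace sketch, not kernel code: the table and the level_name() helper are hypothetical names of mine, while the macro names and their digits follow linux/kern_levels.h.

```c
#include <stddef.h>

/* The eight levels from linux/kern_levels.h, indexed by their digit. */
static const char *const level_names[] = {
	"KERN_EMERG",   /* 0: system is unusable */
	"KERN_ALERT",   /* 1: action must be taken immediately */
	"KERN_CRIT",    /* 2: critical conditions */
	"KERN_ERR",     /* 3: error conditions */
	"KERN_WARNING", /* 4: warning conditions */
	"KERN_NOTICE",  /* 5: normal but significant condition */
	"KERN_INFO",    /* 6: informational */
	"KERN_DEBUG",   /* 7: debug-level messages */
};

/* Return the macro name for a level digit, or NULL if out of range. */
const char *level_name(int level)
{
	int count = (int)(sizeof(level_names) / sizeof(level_names[0]));

	if (level < 0 || level >= count)
		return NULL;
	return level_names[level];
}
```

Note that a lower digit means a more urgent message, so KERN_DEBUG, the level our task asks for, sits at the very bottom of the pile.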
Pay attention to the single argument passed to printk(). Looking into
the source code shows that printk(const char *fmt, ...) takes a single
format string, followed by optional extra arguments to fill in its
format specifiers. Our "Hello World" statement from above doesn't need
formatting and therefore passes no extra arguments:
printk(KERN_DEBUG "Hello World.\n");
The KERN_DEBUG macro will expand to "\001" "7", turning our statement
into:
printk("\001" "7" "Hello World.\n");
Our C compiler will then combine the adjacent string literals to produce the single string the kernel is asked to log:
printk("\0017Hello World.\n");
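The concatenation is easy to check from userspace. The sketch below copies the two macro definitions so it can be compiled outside the kernel, and adds a hypothetical message_level() helper (my own name, not a kernel function) that reads the level digit back out of the prefix.

```c
#include <stddef.h>

/* Same definitions as linux/kern_levels.h, copied for a userspace demo. */
#define KERN_SOH   "\001"       /* ASCII Start Of Header */
#define KERN_DEBUG KERN_SOH "7" /* debug-level messages */

/* The compiler joins the adjacent literals into one string. */
const char hello_msg[] = KERN_DEBUG "Hello World.\n";

/*
 * Hypothetical helper: return the level digit carried in the prefix,
 * or -1 when the message does not start with a KERN_SOH header.
 */
int message_level(const char *msg)
{
	if (msg[0] != '\001' || msg[1] < '0' || msg[1] > '7')
		return -1;
	return msg[1] - '0';
}
```

Here hello_msg starts with the byte 0x01, then the character '7', then the text itself, which is exactly the string shown above.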
Even though printk() is falling out of style with modern Linux
maintainers, as we will see in later sections, there is a lot more to
read about how to work with printk() and format specifiers in the
kernel in the documentation here if you're into that kind of stuff.
Making A Kernel Module
Much like how kernel modules are a little different than userspace application modules, the Makefiles that compile the kernel are also a bit different than Makefiles in userspace.
Originally, as the Linux code-base grew, so did its Makefiles. As they continued to grow in complexity, they eventually became a burden to maintain. Fortunately a solution, called the "kbuild system", was created and accepted into the kernel to help organize and simplify the kernel's building process. If you are interested, there is an entire section about the kbuild system in the documentation.
Kbuild Makefile
Just like Makefiles in userspace, we can start a Kbuild Makefile by
creating a new file called …wait for it… Makefile in the same folder
as the hello-world.c module we made in the sections above.
$ ls -l
total 8
-rw-rw-r-- 1 me us 903 Jul 5 00:00 hello-world.c
-rw-rw-r-- 1 me us 167 Jul 5 00:00 Makefile
We can alternatively use the name Kbuild (not preferred) to indicate
to other developers that the Makefile is intended to run using the
kbuild system. However, while the Kbuild name is not preferred,
interestingly, if both Makefile and Kbuild files exist in the same
directory, the Kbuild file will be used. (source)
Goal Definitions
The "heart" of the kbuild system uses lines called "goal definitions" to define all the various target files, special compilation options, and any sub-directories to enter. When we compile the kernel (with its thousands of Makefiles) the goal definitions are collected and used to build all the various documentation files, modules, and other files we need for our particular kernel.
The simplest Kbuild Makefile we can write for our module contains a single line:
obj-m += hello-world.o
obj-m tells kbuild that our hello-world.o object file is a loadable
kernel module (LKM) that can be loaded and unloaded at any time
without needing to reboot the kernel. This line will also tell the
kbuild system to look for files in our directory named hello-world.c
or hello-world.S to compile into the hello-world.o object file, before
building the kernel object file hello-world.ko we'll use to load into
our kernel.
Convenience Targets
For the pure convenience of it, we can add extra phony targets to our
Kbuild Makefile to easily compile our module for the kernel currently
running on our computer, simplifying the task of compiling our module
down to just typing make into our terminals:
all:
	${MAKE} -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
And make clean to clean up everything afterwards:
clean:
	${MAKE} -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
Both of these phony targets use the -C option to move out of our
current directory and into our kernel's source directory. There make
can find and use the topmost kbuild Makefile, which takes the M option
to locate the folder we are currently working in and builds the files
defined by the obj-m goal definition we set up above.
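If the $(shell uname -r) part looks like magic, the sketch below spells out the same path lookup in C. It is only an illustration of what the Makefile expands to. Both function names are hypothetical, and uname() from sys/utsname.h stands in for the uname -r command.

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Write the build directory for a kernel release into buf, the same
 * path that /lib/modules/$(shell uname -r)/build expands to.
 * Returns 0 on success, -1 if buf is too small.
 */
int kernel_build_dir(const char *release, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/lib/modules/%s/build", release);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Same as above, but for the kernel that is running right now. */
int running_kernel_build_dir(char *buf, size_t len)
{
	struct utsname u;

	if (uname(&u) != 0)
		return -1;
	return kernel_build_dir(u.release, buf, len);
}
```

On the machine used for these notes, running_kernel_build_dir() would be expected to produce the /lib/modules/4.15.0-108-generic/build path that shows up in the make output.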
Installing A Kernel Module
Just as with our userspace applications, kernel modules need to be
compiled. Using the kbuild system, along with our convenience targets
above, we can compile our kernel module by issuing the make command,
and if all goes well, you should see an output similar to this:
$ make
make -C /lib/modules/4.15.0-108-generic/build M=/home/me/src/eudyptula ...
/tasks/01 modules
make[1]: Entering directory '/usr/src/linux-headers-4.15.0-108-generic'
CC [M] /home/me/src/eudyptula/tasks/01/hello-world.o
Building modules, stage 2.
MODPOST 1 modules
WARNING: modpost: missing MODULE_LICENSE() in /home/me/src/eudyptula ...
see include/linux/module.h for more information
CC /home/me/src/eudyptula/tasks/01/hello-world.mod.o
LD [M] /home/me/src/eudyptula/tasks/01/hello-world.ko
make[1]: Leaving directory '/usr/src/linux-headers-4.15.0-108-generic'
The .ko extension was introduced around kernel version 2.6 to help
differentiate between userspace object files and kernel object files,
which contain a .modinfo section to hold extra metadata about the
module. We can use the modinfo command to see and interpret the
contents of the section:
$ modinfo hello-world.ko
filename: /home/me/src/eudyptula/tasks/01/hello-world.ko
srcversion: 18005133D4ECFCDD12928D8
depends:
retpoline: Y
name: hello_world
vermagic: 4.15.0-108-generic SMP mod_unload
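Under the hood the .modinfo section is just a run of NUL-terminated "key=value" strings, which modinfo pretty-prints for us. Here is a minimal sketch of looking up one key in such a blob. The modinfo_get() name is hypothetical, and the code assumes the section bytes have already been read into memory.

```c
#include <stddef.h>
#include <string.h>

/*
 * Look up `key` in a .modinfo-style blob: a run of NUL-terminated
 * "key=value" strings, `len` bytes long in total.
 * Returns a pointer to the value, or NULL if the key is absent.
 */
const char *modinfo_get(const char *blob, size_t len, const char *key)
{
	size_t klen = strlen(key);
	size_t off = 0;

	while (off < len) {
		const char *entry = blob + off;
		const char *end = memchr(entry, '\0', len - off);
		size_t elen;

		if (!end)
			break; /* unterminated tail: stop looking */
		elen = (size_t)(end - entry);
		if (elen > klen && entry[klen] == '=' &&
		    strncmp(entry, key, klen) == 0)
			return entry + klen + 1;
		off += elen + 1; /* skip the entry and its NUL */
	}
	return NULL;
}
```

Asking for "vermagic" on our module's section should hand back the same string modinfo printed above.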
Installing the Module
With our hello-world.c module freshly compiled, we can insert it into
our kernel using the insmod command as root or another user with sudo
privileges:
$ sudo insmod hello-world.ko
Congratulations, you have created your first kernel module! A quick
inspection of the kernel's diagnostic messages, using dmesg, should
show our Hello World. message:
$ dmesg | tail -1
[241745.247591] Hello World.
Removing the Module
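After the well-deserved pat on the back, and when you are ready to
continue, we can uninstall our module with the rmmod command as root
or someone with sudo privileges. Note that the kernel knows our module
as hello_world, with an underscore (the same name modinfo reported
above), because kbuild replaces the dashes in the file name when it
derives the module name: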
$ sudo rmmod hello_world
The only indication we've uninstalled our module will be in dmesg,
from our printk() statement in the cleanup_module() function.
$ dmesg | tail -1
[241751.401232] oh, the rest is silence.
Kernel Taint
There are plenty of ways we can taint our kernel. Don't worry too much about this though, most of the time it is completely fine to run a tainted kernel. When something happens that could be important to an investigation later on, a kernel will mark itself as "tainted". Usually the event that caused the kernel to become tainted is the problem being investigated.
We can find our kernel's tainted state by reading our
/proc/sys/kernel/tainted file. Every way we can taint our kernels is
assigned one bit in a bit-field, meaning any value other than 0
indicates our kernel is tainted. To decode the bit-field, we can use
the tools/debugging/kernel-chktaint script found in the source code:
$ tools/debugging/kernel-chktaint
Kernel is "tainted" for the following reasons:
* proprietary module was loaded (#0)
* kernel issued warning (#9)
* externally-built ('out-of-tree') module was loaded (#12)
* unsigned module was loaded (#13)
For a more detailed explanation of the various taint flags see
Documentation/admin-guide/tainted-kernels.rst in the the Linux kernel sources
or https://kernel.org/doc/html/latest/admin-guide/tainted-kernels.html
Raw taint value as int/string: 12801/'P W OE '
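The decoding itself is plain bit-twiddling. As a rough sketch, the hypothetical taint_bits() helper below (my own name, nothing from the kernel tree) lists which bit positions are set in a value read from /proc/sys/kernel/tainted.

```c
#include <stddef.h>

/*
 * Collect the indices of the bits set in a taint value, lowest first.
 * Writes at most `max` indices into `bits` and returns how many bits
 * are set in total.
 */
size_t taint_bits(unsigned long value, int *bits, size_t max)
{
	size_t count = 0;
	int bit;

	for (bit = 0; value != 0; bit++, value >>= 1) {
		if (value & 1UL) {
			if (count < max)
				bits[count] = bit;
			count++;
		}
	}
	return count;
}
```

For the raw value 12801 from the output above, this yields bits 0, 9, 12 and 13 (1 + 512 + 4096 + 8192), matching the four reasons the script printed.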
Licensing & Documentation
One of the ways we can taint our kernels is by loading proprietary
modules or modules that use licenses not compatible with the General
Public License (GPL) (bit 0 in the tainting list). Modules that don't
use the MODULE_LICENSE() macro will also be considered proprietary and
taint our kernel if loaded (this is why we saw the warning above).
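Which license strings count as "compatible" is decided by a simple string comparison. The userspace sketch below mirrors that check as I understand it from include/linux/license.h. Treat the list as my reading of the source rather than an authoritative copy.

```c
#include <string.h>

/*
 * Userspace sketch of the kernel's GPL-compatibility check: only these
 * exact MODULE_LICENSE() strings avoid the "proprietary module" taint.
 * Returns 1 if the license string counts as GPL-compatible, else 0.
 */
int license_is_gpl_compatible(const char *license)
{
	static const char *const compatible[] = {
		"GPL",
		"GPL v2",
		"GPL and additional rights",
		"Dual BSD/GPL",
		"Dual MIT/GPL",
		"Dual MPL/GPL",
	};
	size_t i;

	for (i = 0; i < sizeof(compatible) / sizeof(compatible[0]); i++) {
		if (strcmp(license, compatible[i]) == 0)
			return 1;
	}
	return 0;
}
```

One thing this makes obvious: a permissive license string such as "MIT" on its own is not in the list, so only the dual-licensed spellings avoid the taint.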
There are many documentation macros defined in linux/module.h; some of
the basics I added are:
MODULE_LICENSE("MIT");
MODULE_AUTHOR("Bryan Brattlof <email@example.com>");
MODULE_DESCRIPTION("A Hello World Driver");
MODULE_SUPPORTED_DEVICE("testdevice"); /* dropped from newer kernels (around v5.12); omit it there */
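Once we add our module's license, author and other information to the
end of our hello-world.c module, the build no longer complains. Keep
in mind that silencing the build warning is not the same as avoiding
the taint: the kernel only treats a handful of license strings (such
as "GPL" or "Dual MIT/GPL") as GPL-compatible, so a module declaring
plain "MIT" will still set taint bit 0 when loaded. When we compile
our module again using make, the WARNING should be gone: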
$ make
make -C /lib/modules/4.15.0-108-generic/build M=/home/me/src/eudyptula ...
make[1]: Entering directory '/usr/src/linux-headers-4.15.0-108-generic'
CC [M] /home/me/src/eudyptula/tasks/01/hello-world.o
Building modules, stage 2.
MODPOST 1 modules
CC /home/me/src/eudyptula/tasks/01/hello-world.mod.o
LD [M] /home/me/src/eudyptula/tasks/01/hello-world.ko
make[1]: Leaving directory '/usr/src/linux-headers-4.15.0-108-generic'
There were many reasons to add this system to the kernel. For example, it gives developers a way to easily find who maintains a module, describe what the module does, and state what license the code is released under. It also provides an easy method to inform users when they are using non-open-source software.
Updating the Module
Everything in my notes, to this point, was needed to complete the 1st task assigned to us by the Little Penguin. However, just like with every software project, the Linux kernel is constantly adding new features and adopting new coding styles, ensuring that my notes will become obsolete as soon as I've written them.
With that said, the sections below, while not technically needed to
complete the task, are my notes on the macros and functions I saw in
the drivers directory of the Linux source code that I found
particularly interesting. These functions are mostly stylistic changes
or they introduce functionality that improves efficiency and
modularity of the Linux kernel in some way.
module_init() & module_exit()
Introduced in version 2.4 of the kernel, and defined in linux/init.h
of the source code, the module_init() and module_exit() macros let us
rename our "start" and "end" functions to whatever we wish. In this
example, I've chosen to rename the "start" function to
hello_world_init()
-int init_module(void)
+static int __init hello_world_init(void)
 {
 	printk(KERN_DEBUG "Hello World.\n");
 	return 0; /* init_module loaded successfully */
 }
And renamed the "exit" function hello_world_exit()
-void cleanup_module(void)
+static void __exit hello_world_exit(void)
 {
 	printk(KERN_DEBUG "oh, the rest is silence.\n");
 }
The kernel will then use the module_init() macro to find the function
to execute when the module is installed and module_exit() to find the
function to clean up before being removed.
module_init(hello_world_init);
module_exit(hello_world_exit);
To avoid compile errors, both the module_init() and module_exit()
macros must appear below the definitions of our newly named "start"
and "end" functions.
__init & __exit
I also introduced two macros to our "start" and "end" functions above
called __init and __exit. These macros, defined in linux/init.h of the
source code, help reduce memory used by the kernel depending on how
the module is installed.
For built-in modules, where our module cannot be removed from the
kernel without recompiling and restarting, the __init macro tells the
compiler to place our module's "start" function into a special section
inside the compiled kernel. Once the kernel has booted and our "start"
function has finished, the kernel will never have to run the code
again until reboot. So this special section can be freed, saving
memory.
The same is true for the __exit macro. For built-in modules, the
module cannot be removed from the kernel without recompiling and
restarting. So the kernel will never need to run our module's "exit"
function to safely remove it from the kernel. This means the build can
safely omit our "exit" function from the compiled kernel.
pr_debug()
In the beginning there was printk(), and the kernel's diagnostic
messages structure was formless. The lack of any format for printk()
messages is one of a number of reasons why developers are replacing
printk() statements with their newer equivalents. Depending on what
section of the kernel we are in, there are newer functions that have
some benefits for us.
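For example, the pr_debug() function is less syntactically verbose
than printk(KERN_DEBUG ...) and also lets us take advantage of the
dynamic debugging interface, which gives developers a uniform control
interface for debugging kernel messages while avoiding cluttering the
kernel log. One caveat: pr_debug() messages are off by default. They
only show up in dmesg if DEBUG is defined when the file is compiled
or, on kernels built with dynamic debug, once the call site has been
switched on.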
 static int __init hello_world_init(void)
 {
-	printk(KERN_DEBUG "Hello World.\n");
+	pr_debug("Hello World.\n");
 	return 0; /* means init_module loaded successfully */
 }

 static void __exit hello_world_exit(void)
 {
-	printk(KERN_DEBUG "oh, the rest is silence.\n");
+	pr_debug("oh, the rest is silence.\n");
 }
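To make the relationship between pr_debug() and printk(KERN_DEBUG ...) a little more concrete, here is a simplified userspace sketch. It is not the kernel's real definition (which also involves pr_fmt() and the dynamic debug machinery). The demo_printk() function and the last_message buffer are stand-ins I made up so the macro can be exercised outside the kernel.

```c
#include <stdarg.h>
#include <stdio.h>

#define KERN_SOH   "\001"
#define KERN_DEBUG KERN_SOH "7"

#define DEBUG /* remove this line and pr_debug() compiles to nothing */

/* Stand-in for the kernel log: keeps only the most recent message. */
char last_message[128];

/* Hypothetical printk() replacement that formats into last_message. */
int demo_printk(const char *fmt, ...)
{
	va_list args;
	int n;

	va_start(args, fmt);
	n = vsnprintf(last_message, sizeof(last_message), fmt, args);
	va_end(args);
	return n;
}

#ifdef DEBUG
/* Prepend the debug level, exactly like printk(KERN_DEBUG ...). */
#define pr_debug(...) demo_printk(KERN_DEBUG __VA_ARGS__)
#else
/* Without DEBUG the call disappears from the compiled code. */
#define pr_debug(...) ((void)0)
#endif
```

With DEBUG defined, pr_debug("Hello World.\n") stores the same "\0017Hello World.\n" string we built by hand earlier. Without it, the call vanishes, which is the behaviour to keep in mind if your messages go missing from dmesg.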
Wrapping Up
If you made it here, all I can say is you are a very brave person, and I'm glad my notes were able to help you in some way. If you see any issues or have a question, please feel free to contact me, or better yet subscribe to the kernel newbies mailing list.
For the next challenge, we will be building the Linux Kernel from scratch, as well as installing and booting from it. If you want to work on this challenge before you read my notes (recommended), I've published a copy of the challenges in a git repo here.
Next: My notes on How to build the Linux Kernel from scratch.