2018-12-17 15:24:39

by Thorsten Leemhuis

Subject: [PATCH 0/1] RFC: Revamp admin-guide/tainted-kernels.rst to make it more comprehensible

Hi! Find my first contribution to the kernel documentation in the reply to this
mail. Hopefully a lot more will follow. I wrote this patch while working on an
update for reporting-bugs.rst: improving the documentation around reporting
bugs and regressions was one of the main things a lot of people wanted to see
when regression tracking was discussed at the Kernel and Maintainer Summit
2017 in Prague.

I'm used to writing, but not in English. Hope the result is not too bad. Needs
someone to do a spell check, too, as I'm bad at finding spelling or grammatical
errors in general; it's even worse when I try to proofread my own texts. :-/

Sorry for using the simple table format for the table. I only noticed the list
table format is preferred after creating the table. Shall I convert it for the
next submission? Sounds like a downside to me, as for a table this small the
simple table format seems way easier to parse when reading the plain text file.

Any feedback much appreciated.

Ciao, Thorsten

Thorsten Leemhuis (1):
docs: Revamp tainted-kernels.rst to make it more comprehensible

Documentation/admin-guide/tainted-kernels.rst | 105 ++++++++++++++++--
1 file changed, 96 insertions(+), 9 deletions(-)

--
2.18.1



2018-12-17 15:22:25

by Thorsten Leemhuis

Subject: [PATCH 1/1] docs: Revamp tainted-kernels.rst to make it more comprehensible

Add a section about /proc/sys/kernel/tainted and a command that decodes it to
Documentation/admin-guide/tainted-kernels.rst. While at it introduce a table
that shows the various bits as well as the letters used in oops and panic
messages. Make the document more user focused and easier to understand, too.

Backstory: While working on an update for reporting-bugs.rst I noticed there is
no easily comprehensible document showing how to check if or why the running
kernel might be tainted. That's why I wrote a section with a small Python
command that decodes /proc/sys/kernel/tainted. I suspect there is a shorter,
more elegant command that achieves the same and still works on common machines
out of the box; please let me know if you know of one.

While putting that section in place I ended up writing a more easily
understandable intro and a hopefully better explanation of the taint flags in
bug, oops or panic messages. The only thing missing then was a table that
quickly describes the various bits and the taint flags before going into more
detail, so I added that as well.

Signed-off-by: Thorsten Leemhuis <[email protected]>
---
Documentation/admin-guide/tainted-kernels.rst | 105 ++++++++++++++++--
1 file changed, 96 insertions(+), 9 deletions(-)

diff --git a/Documentation/admin-guide/tainted-kernels.rst b/Documentation/admin-guide/tainted-kernels.rst
index 28a869c509a0..aabd307a178a 100644
--- a/Documentation/admin-guide/tainted-kernels.rst
+++ b/Documentation/admin-guide/tainted-kernels.rst
@@ -1,10 +1,102 @@
Tainted kernels
---------------

-Some oops reports contain the string **'Tainted: '** after the program
-counter. This indicates that the kernel has been tainted by some
-mechanism. The string is followed by a series of position-sensitive
-characters, each representing a particular tainted value.
+The kernel will mark itself as 'tainted' when something occurs that
+might be relevant later when investigating problems. Don't worry
+yourself too much about this, most of the time it's not a problem to run
+a tainted kernel; the information is mainly of interest once someone
+wants to investigate some problem, as its real cause might be the event
+that got the kernel tainted. That's why the kernel will remain tainted
+even after you undo what caused the taint (i.e. unload a proprietary
+kernel module), to indicate the kernel remains not trustworthy. That's
+also why the kernel will print the tainted state when it noticed
+ainternal problem (a 'kernel bug'), a recoverable error ('kernel oops')
+or a nonrecoverable error ('kernel panic') and writes debug information
+about this to the logs ``dmesg`` outputs. It's also possible to check
+the tainted state at runtime through a file in ``/proc/``.
+
+
+Tainted flag in bugs, oops or panics messages
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+You find the tainted state near the top after the list of loaded
+modules. The state is part of the line that begins with mentioning CPU
+('CPU:'), Process ID ('PID:'), and a shorted name of the executed
+command ('Comm:') that triggered the event. When followed by **'Not
+tainted: '** the kernel was not tainted at the time of the event; if it
+was, then it will print **'Tainted: '** and characters either letters or
+blanks. The meaning of those characters is explained in below table. The
+output for example might state '``Tainted: P WO``' when the kernel got
+tainted earlier because a proprietary Module (``P``) was loaded, a
+warning occurred (``W``), and an externally-built module was loaded
+(``O``). To decode other letters use below table.
+
+
+Decoding tainted state at runtime
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+At runtime, you can query the tainted state by reading
+``/proc/sys/kernel/tainted``. If that returns ``0``, the kernel is not
+tainted; any other number indicates the reasons why it is. You might
+find that number in below table if there was only one reason that got
+the kernel tainted. If there were multiple reasons you need to decode
+the number, as it is a bitfield, where each bit indicates the absence or
+presence of a particular type of taint. You can use the following python
+command to decode::
+
+ $ python3 -c 'from pprint import pprint; from itertools import zip_longest; pprint(list(zip_longest(range(1,17), reversed(bin(int(open("/proc/sys/kernel/tainted").read()))[2:]),fillvalue="0")))'
+ [(1, '1'),
+ (2, '0'),
+ (3, '0'),
+ (4, '0'),
+ (5, '0'),
+ (6, '0'),
+ (7, '0'),
+ (8, '0'),
+ (9, '0'),
+ (10, '1'),
+ (11, '0'),
+ (12, '0'),
+ (13, '1'),
+ (14, '0'),
+ (15, '0'),
+ (16, '0')]
+
+In this case ``/proc/sys/kernel/tainted`` contained ``4609``, as the
+kernel got tainted because a proprietary Module (Bit 1) got loaded, a
+warning occurred (Bit 10), and an externally-built module got loaded
+(Bit 13). To decode other bits use below table.
+
+
+Table for decoding tainted state
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+===  ===  ======  ========================================================
+Bit  Log     Int  Reason that got the kernel tainted
+===  ===  ======  ========================================================
+ 1)  G/P       0  proprietary module got loaded
+ 2)  _/F       2  module was force loaded
+ 3)  _/S       4  SMP kernel oops on a officially SMP incapable processor
+ 4)  _/R       8  module was force unloaded
+ 5)  _/M      16  processor reported a Machine Check Exception (MCE)
+ 6)  _/B      32  bad page referenced or some unexpected page flags
+ 7)  _/U      64  taint requested by userspace application
+ 8)  _/D     128  kernel died recently, i.e. there was an OOPS or BUG
+ 9)  _/A     256  ACPI table overridden by user
+10)  _/W     512  kernel issued warning
+11)  _/C    1024  staging driver got loaded
+12)  _/I    2048  workaround for bug in platform firmware in use
+13)  _/O    4096  externally-built ("out-of-tree") module got loaded
+14)  _/E    8192  unsigned module was loaded
+15)  _/L   16384  soft lockup occurred
+16)  _/K   32768  Kernel live patched
+===  ===  ======  ========================================================
+
+Note: To make reading easier ``_`` is representing a blank in this
+table.
+
+More detailed explanation for tainting
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1) ``G`` if all modules loaded have a GPL or compatible license, ``P`` if
any proprietary module has been loaded. Modules without a
@@ -52,8 +144,3 @@ characters, each representing a particular tainted value.

16) ``K`` if the kernel has been live patched.

-The primary reason for the **'Tainted: '** string is to tell kernel
-debuggers if this is a clean kernel or if anything unusual has
-occurred. Tainting is permanent: even if an offending module is
-unloaded, the tainted value remains to indicate that the kernel is not
-trustworthy.
--
2.18.1


2018-12-17 20:05:49

by Jonathan Corbet

Subject: Re: [PATCH 0/1] RFC: Revamp admin-guide/tainted-kernels.rst to make it more comprehensible

On Mon, 17 Dec 2018 16:20:42 +0100
Thorsten Leemhuis <[email protected]> wrote:

> Hi! Find my first contribution to the kernel documentation in the reply to this
> mail. Hopefully a lot more will follow.

Hopefully! Looking forward to it.

> Sorry for using the simple table format for the table. I only noticed the
> list table format is preferred after creating the table. Shall I convert
> it for the next submission? Sounds like a downside to me, as for a table
> this small the simple table format seems way easier to parse when reading
> the plain text file.

The thing that matters is readability in the plain-text format. Your
table here is fine, no reason to redo it.

With regard to the patch itself:

> diff --git a/Documentation/admin-guide/tainted-kernels.rst b/Documentation/admin-guide/tainted-kernels.rst
> index 28a869c509a0..aabd307a178a 100644
> --- a/Documentation/admin-guide/tainted-kernels.rst
> +++ b/Documentation/admin-guide/tainted-kernels.rst
> @@ -1,10 +1,102 @@
> Tainted kernels
> ---------------
>
> -Some oops reports contain the string **'Tainted: '** after the program
> -counter. This indicates that the kernel has been tainted by some
> -mechanism. The string is followed by a series of position-sensitive
> -characters, each representing a particular tainted value.
> +The kernel will mark itself as 'tainted' when something occurs that
> +might be relevant later when investigating problems. Don't worry
> +yourself too much about this, most of the time it's not a problem to run

s/yourself//

> +a tainted kernel; the information is mainly of interest once someone
> +wants to investigate some problem, as its real cause might be the event
> +that got the kernel tainted.

While this is true, an oops with a taint flag will often be ignored by
developers. It's worth saying that, if at all possible, a problem needs
to be reproduced on an untainted kernel.

> That's why the kernel will remain tainted
> +even after you undo what caused the taint (i.e. unload a proprietary
> +kernel module), to indicate the kernel remains not trustworthy. That's
> +also why the kernel will print the tainted state when it noticed
> +ainternal problem (a 'kernel bug'), a recoverable error ('kernel oops')
> +or a nonrecoverable error ('kernel panic') and writes debug information
> +about this to the logs ``dmesg`` outputs. It's also possible to check
> +the tainted state at runtime through a file in ``/proc/``.
> +
> +
> +Tainted flag in bugs, oops or panics messages
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +You find the tainted state near the top after the list of loaded
> +modules. The state is part of the line that begins with mentioning CPU
> +('CPU:'), Process ID ('PID:'), and a shorted name of the executed
> +command ('Comm:') that triggered the event.

This seems like a good place for an example.

> When followed by **'Not
> +tainted: '** the kernel was not tainted at the time of the event; if it
> +was, then it will print **'Tainted: '** and characters either letters or
> +blanks. The meaning of those characters is explained in below table. The
> +output for example might state '``Tainted: P WO``' when the kernel got
> +tainted earlier because a proprietary Module (``P``) was loaded, a
> +warning occurred (``W``), and an externally-built module was loaded
> +(``O``). To decode other letters use below table.
> +
> +
> +Decoding tainted state at runtime
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +At runtime, you can query the tainted state by reading
> +``/proc/sys/kernel/tainted``. If that returns ``0``, the kernel is not
> +tainted; any other number indicates the reasons why it is. You might
> +find that number in below table if there was only one reason that got
> +the kernel tainted. If there were multiple reasons you need to decode
> +the number, as it is a bitfield, where each bit indicates the absence or
> +presence of a particular type of taint. You can use the following python
> +command to decode::

Here's an idea if you feel like improving this: rather than putting an
inscrutable program inline, add a taint_status script to scripts/ that
prints out the status in fully human-readable form, with the explanation
for every set bit.
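
A minimal sketch of what such a script might do, written in Python purely for
illustration. The names TAINT_REASONS and explain_taint are hypothetical, the
reasons are copied from the table in the patch (with "got" changed to "was"),
and bits are counted from zero, so bit n has the integer value 1 << n. Bits
beyond the 16 listed in the patch are ignored here.

```python
# Reasons taken from the table in the patch under review; index n of this
# list describes zero-based bit n, whose integer value is 1 << n.
TAINT_REASONS = [
    "proprietary module was loaded",
    "module was force loaded",
    "SMP kernel oops on an officially SMP incapable processor",
    "module was force unloaded",
    "processor reported a Machine Check Exception (MCE)",
    "bad page referenced or some unexpected page flags",
    "taint requested by userspace application",
    "kernel died recently, i.e. there was an OOPS or BUG",
    "ACPI table overridden by user",
    "kernel issued warning",
    "staging driver was loaded",
    "workaround for bug in platform firmware in use",
    'externally-built ("out-of-tree") module was loaded',
    "unsigned module was loaded",
    "soft lockup occurred",
    "kernel live patched",
]


def explain_taint(value):
    """Return one human-readable line per set bit in a taint value."""
    if value == 0:
        return ["kernel is not tainted"]
    lines = []
    for bit, reason in enumerate(TAINT_REASONS):
        # Shift the wanted bit into the lowest position and mask it out.
        if value >> bit & 1:
            lines.append(f"bit {bit} (value {1 << bit}): {reason}")
    return lines


if __name__ == "__main__":
    # 4609 is the example value used in the patch (1 + 512 + 4096).
    for line in explain_taint(4609):
        print(line)
```

Taking the value as a plain integer rather than reading the file inside the
function keeps the decoding step usable for values copied from a bug report.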

> +
> + $ python3 -c 'from pprint import pprint; from itertools import zip_longest; pprint(list(zip_longest(range(1,17), reversed(bin(int(open("/proc/sys/kernel/tainted").read()))[2:]),fillvalue="0")))'
> + [(1, '1'),
> + (2, '0'),
> + (3, '0'),
> + (4, '0'),
> + (5, '0'),
> + (6, '0'),
> + (7, '0'),
> + (8, '0'),
> + (9, '0'),
> + (10, '1'),
> + (11, '0'),
> + (12, '0'),
> + (13, '1'),
> + (14, '0'),
> + (15, '0'),
> + (16, '0')]
> +
> +In this case ``/proc/sys/kernel/tainted`` contained ``4609``, as the
> +kernel got tainted because a proprietary Module (Bit 1) got loaded, a
> +warning occurred (Bit 10), and an externally-built module got loaded
> +(Bit 13). To decode other bits use below table.
> +
> +
> +Table for decoding tainted state
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As noted before, this table is entirely readable and need not be messed
with.

> +=== === ====== ========================================================
> +Bit Log Int Reason that got the kernel tainted
> +=== === ====== ========================================================
> + 1) G/P 0 proprietary module got loaded

I'd s/got/was/ throughout. Also, this is the kernel, we start counting at
zero! :)

> + 2) _/F 2 module was force loaded
> + 3) _/S 4 SMP kernel oops on a officially SMP incapable processor
> + 4) _/R 8 module was force unloaded
> + 5) _/M 16 processor reported a Machine Check Exception (MCE)
> + 6) _/B 32 bad page referenced or some unexpected page flags
> + 7) _/U 64 taint requested by userspace application
> + 8) _/D 128 kernel died recently, i.e. there was an OOPS or BUG
> + 9) _/A 256 ACPI table overridden by user
> +10) _/W 512 kernel issued warning
> +11) _/C 1024 staging driver got loaded
> +12) _/I 2048 workaround for bug in platform firmware in use
> +13) _/O 4096 externally-built ("out-of-tree") module got loaded
> +14) _/E 8192 unsigned module was loaded
> +15) _/L 16384 soft lockup occurred
> +16) _/K 32768 Kernel live patched

A look at kernel.h shows two more flags. TAINT_AUX doesn't seem to be
used, but TAINT_RANDSTRUCT is.

> +=== === ====== ========================================================
> +
> +Note: To make reading easier ``_`` is representing a blank in this
> +table.
> +
> +More detailed explanation for tainting
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> 1) ``G`` if all modules loaded have a GPL or compatible license, ``P`` if
> any proprietary module has been loaded. Modules without a
> @@ -52,8 +144,3 @@ characters, each representing a particular tainted value.
>
> 16) ``K`` if the kernel has been live patched.
>
> -The primary reason for the **'Tainted: '** string is to tell kernel
> -debuggers if this is a clean kernel or if anything unusual has
> -occurred. Tainting is permanent: even if an offending module is
> -unloaded, the tainted value remains to indicate that the kernel is not
> -trustworthy.
> --
> 2.18.1
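
To illustrate how the position-sensitive string after 'Tainted: ' relates to
the bitfield once the two extra flags mentioned above are included, here is a
rough Python sketch. The name taint_string is hypothetical. The first 16
letter pairs come from the table in the patch; the letters 'X' (TAINT_AUX)
and 'T' (TAINT_RANDSTRUCT) and their positions 16 and 17 are assumptions
based on the kernel source of that period and should be verified against the
tree you are documenting.

```python
# (set, unset) characters per zero-based bit. Pairs 0-15 follow the table in
# the patch; 'X' and 'T' for bits 16 and 17 are assumptions, see lead-in.
TAINT_LETTERS = [
    ("P", "G"), ("F", " "), ("S", " "), ("R", " "),
    ("M", " "), ("B", " "), ("U", " "), ("D", " "),
    ("A", " "), ("W", " "), ("C", " "), ("I", " "),
    ("O", " "), ("E", " "), ("L", " "), ("K", " "),
    ("X", " "), ("T", " "),
]


def taint_string(value):
    """Build the position-sensitive flag string for a taint value."""
    return "".join(
        on if value >> bit & 1 else off
        for bit, (on, off) in enumerate(TAINT_LETTERS)
    )


if __name__ == "__main__":
    # Show the string between brackets so the blanks stay visible.
    print("Tainted: [" + taint_string(4609) + "]")
```

Every flag occupies a fixed column, which is why the blanks between the
letters carry information and must not be collapsed when quoting a report.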

Thanks,

jon

2018-12-17 21:25:35

by Randy Dunlap

Subject: Re: [PATCH 0/1] RFC: Revamp admin-guide/tainted-kernels.rst to make it more comprehensible

On 12/17/18 10:24 AM, Jonathan Corbet wrote:
> Here's an idea if you feel like improving this: rather than putting an
> inscrutable program inline, add a taint_status script to scripts/ that
> prints out the status in fully human-readable form, with the explanation
> for every set bit.


And some people prefer not adding tools that use python, perl, etc.

E.g., I use this shell script (named 'chktaint', which could probably
be done better):

(see attachment)

--
~Randy


Attachments:
chktaint (2.30 kB)

2018-12-17 21:26:23

by Randy Dunlap

Subject: Re: [PATCH 1/1] docs: Revamp tainted-kernels.rst to make it more comprehensible

On 12/17/18 7:20 AM, Thorsten Leemhuis wrote:
>
> Signed-off-by: Thorsten Leemhuis <[email protected]>
> ---
> Documentation/admin-guide/tainted-kernels.rst | 105 ++++++++++++++++--
> 1 file changed, 96 insertions(+), 9 deletions(-)
>
> diff --git a/Documentation/admin-guide/tainted-kernels.rst b/Documentation/admin-guide/tainted-kernels.rst
> index 28a869c509a0..aabd307a178a 100644
> --- a/Documentation/admin-guide/tainted-kernels.rst
> +++ b/Documentation/admin-guide/tainted-kernels.rst
> @@ -1,10 +1,102 @@
> Tainted kernels
> ---------------
>
> -Some oops reports contain the string **'Tainted: '** after the program
> -counter. This indicates that the kernel has been tainted by some
> -mechanism. The string is followed by a series of position-sensitive
> -characters, each representing a particular tainted value.
> +The kernel will mark itself as 'tainted' when something occurs that
> +might be relevant later when investigating problems. Don't worry
> +yourself too much about this, most of the time it's not a problem to run
> +a tainted kernel; the information is mainly of interest once someone
> +wants to investigate some problem, as its real cause might be the event
> +that got the kernel tainted. That's why the kernel will remain tainted
> +even after you undo what caused the taint (i.e. unload a proprietary
> +kernel module), to indicate the kernel remains not trustworthy. That's
> +also why the kernel will print the tainted state when it noticed

notices

> +ainternal problem (a 'kernel bug'), a recoverable error ('kernel oops')

an internal

> +or a nonrecoverable error ('kernel panic') and writes debug information
> +about this to the logs ``dmesg`` outputs. It's also possible to check
> +the tainted state at runtime through a file in ``/proc/``.
> +
> +
> +Tainted flag in bugs, oops or panics messages
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +You find the tainted state near the top after the list of loaded
> +modules. The state is part of the line that begins with mentioning CPU
> +('CPU:'), Process ID ('PID:'), and a shorted name of the executed

shortened

> +command ('Comm:') that triggered the event. When followed by **'Not
> +tainted: '** the kernel was not tainted at the time of the event; if it
> +was, then it will print **'Tainted: '** and characters either letters or
> +blanks. The meaning of those characters is explained in below table. The

in the table below. The

> +output for example might state '``Tainted: P WO``' when the kernel got
> +tainted earlier because a proprietary Module (``P``) was loaded, a
> +warning occurred (``W``), and an externally-built module was loaded
> +(``O``). To decode other letters use below table.

use the table below.

> +
> +
> +Decoding tainted state at runtime
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +At runtime, you can query the tainted state by reading
> +``/proc/sys/kernel/tainted``. If that returns ``0``, the kernel is not
> +tainted; any other number indicates the reasons why it is. You might
> +find that number in below table if there was only one reason that got

in the table below for the

> +the kernel tainted. If there were multiple reasons you need to decode

kernel to be tainted. reasons,

> +the number, as it is a bitfield, where each bit indicates the absence or
> +presence of a particular type of taint. You can use the following python
> +command to decode::
> +
> + $ python3 -c 'from pprint import pprint; from itertools import zip_longest; pprint(list(zip_longest(range(1,17), reversed(bin(int(open("/proc/sys/kernel/tainted").read()))[2:]),fillvalue="0")))'
> + [(1, '1'),
> + (2, '0'),
> + (3, '0'),
> + (4, '0'),
> + (5, '0'),
> + (6, '0'),
> + (7, '0'),
> + (8, '0'),
> + (9, '0'),
> + (10, '1'),
> + (11, '0'),
> + (12, '0'),
> + (13, '1'),
> + (14, '0'),
> + (15, '0'),
> + (16, '0')]
> +
> +In this case ``/proc/sys/kernel/tainted`` contained ``4609``, as the
> +kernel got tainted because a proprietary Module (Bit 1) got loaded, a
> +warning occurred (Bit 10), and an externally-built module got loaded
> +(Bit 13). To decode other bits use below table.

use the table below.

> +
> +
> +Table for decoding tainted state
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +=== === ====== ========================================================
> +Bit Log Int Reason that got the kernel tainted
> +=== === ====== ========================================================
> + 1) G/P 0 proprietary module got loaded
> + 2) _/F 2 module was force loaded
> + 3) _/S 4 SMP kernel oops on a officially SMP incapable processor
> + 4) _/R 8 module was force unloaded
> + 5) _/M 16 processor reported a Machine Check Exception (MCE)
> + 6) _/B 32 bad page referenced or some unexpected page flags
> + 7) _/U 64 taint requested by userspace application
> + 8) _/D 128 kernel died recently, i.e. there was an OOPS or BUG
> + 9) _/A 256 ACPI table overridden by user
> +10) _/W 512 kernel issued warning
> +11) _/C 1024 staging driver got loaded
> +12) _/I 2048 workaround for bug in platform firmware in use
> +13) _/O 4096 externally-built ("out-of-tree") module got loaded
> +14) _/E 8192 unsigned module was loaded
> +15) _/L 16384 soft lockup occurred
> +16) _/K 32768 Kernel live patched
> +=== === ====== ========================================================
> +
> +Note: To make reading easier ``_`` is representing a blank in this
> +table.
> +
> +More detailed explanation for tainting
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> 1) ``G`` if all modules loaded have a GPL or compatible license, ``P`` if
> any proprietary module has been loaded. Modules without a
> @@ -52,8 +144,3 @@ characters, each representing a particular tainted value.
>
> 16) ``K`` if the kernel has been live patched.
>
> -The primary reason for the **'Tainted: '** string is to tell kernel
> -debuggers if this is a clean kernel or if anything unusual has
> -occurred. Tainting is permanent: even if an offending module is
> -unloaded, the tainted value remains to indicate that the kernel is not
> -trustworthy.
>

thanks for the update.

--
~Randy

2018-12-20 16:38:08

by Thorsten Leemhuis

Subject: Re: [PATCH 0/1] RFC: Revamp admin-guide/tainted-kernels.rst to make it more comprehensible

Hi! Am 17.12.18 um 22:06 schrieb Randy Dunlap:
> On 12/17/18 10:24 AM, Jonathan Corbet wrote:
>> Here's an idea if you feel like improving this: rather than putting an
>> inscrutable program inline, add a taint_status script to scripts/ that
>> prints out the status in fully human-readable form, with the explanation
>> for every set bit.
> And some people prefer not adding tools that use python, perl, etc.

Yeah, I know :-/ On twitter @apexo (thx!) suggested these two:

dc -e"[00000000000000]n2o$(cat /proc/sys/kernel/tainted)p"|fold -w1|tac|nl| grep -m 18 '.'

(echo -n 000000000000000;(echo obase=2;cat /proc/sys/kernel/tainted)|bc)|fold -w1|tac|nl| grep -m 18 '.'

But that needs bc, which often is not installed by default :-/ And as you
mentioned already, using Perl also has its downsides:

perl -e 'printf("%016b\n",<STDIN>)' < /proc/sys/kernel/tainted |fold -w1|tac|nl

Having something that works in plain bash/sh would be great...

Nevertheless: I'm still inclined to put a one-liner decode command into
tainted-kernels.rst so people can decode the file easily even if they do
not have the attached script at hand.

> E.g., I use this shell script (named 'chktaint', which could probably
> be done better):

Many thx. Find a slightly improved version attached that directly prints
the reason. I assume that's more like what Jonathan had in mind. The
script now is also capable of decoding a value retrieved from
/proc/sys/kernel/tainted on another system.
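
The attachment itself is not reproduced in this thread. As a hedged
illustration of the "local file or value from another system" behaviour
described above, the input handling could look roughly like this in Python.
The function name read_taint and its parameters are hypothetical and are not
taken from the attached script.

```python
TAINT_FILE = "/proc/sys/kernel/tainted"


def read_taint(argv, path=TAINT_FILE):
    """Return the taint value to decode.

    If a value was passed on the command line (argv[1]), use it, which
    allows decoding a number copied from another machine. Otherwise read
    the running kernel's value from the file at `path`.
    """
    if len(argv) > 1:
        return int(argv[1])
    with open(path) as taint_file:
        # int() tolerates the trailing newline the kernel appends.
        return int(taint_file.read())
```

A caller would pass sys.argv, for example read_taint(["prog", "4609"]) to
decode a value from a bug report, or read_taint(["prog"]) on the affected
machine itself.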

Randy, do you spot any problems or bashisms in the code? BTW, can I have
your "Signed-off-by" for the submission?

While at it: Jonathan, you mentioned putting the script in scripts/, but
according to the Makefile in that directory it is "for various helper
programs used throughout the kernel for the build process". That's one
reason why it feels wrong to put it there. Another one: that script
targets users and thus we should try to make sure they can access it
easily. That's why I'm currently inclined to put it in tools/ somewhere.
But I'm still unsure where. tools/scripts/ is used for something else
already, so maybe tools/helper-scripts/ or something? Putting it there
and installing it by default when building tools/ afaics increases the
chances a lot that distros will actually ship it in their packages that
contain tools from that directory.

Ciao, Thorsten


Attachments:
kernel-taintstatus (3.27 kB)

2018-12-20 16:38:48

by Jonathan Corbet

Subject: Re: [PATCH 0/1] RFC: Revamp admin-guide/tainted-kernels.rst to make it more comprehensible

On Thu, 20 Dec 2018 16:23:38 +0100
Thorsten Leemhuis <[email protected]> wrote:

> While at it: Jonathan, you mentioned putting the script in scripts/, but
> according to the Makefile in that directory it is "for various helper
> programs used throughout the kernel for the build process". That's one
> reason why it feels wrong to put it there. Another one: that script
> targets users and thus we should try to make sure they can access it
> easily. That's why I'm currently inclined to put it in tools/ somewhere.

Yeah, tools/ is a better place. Maybe a tools/debugging directory or some
such?

jon

2018-12-20 17:02:55

by Randy Dunlap

Subject: Re: [PATCH 0/1] RFC: Revamp admin-guide/tainted-kernels.rst to make it more comprehensible

On 12/20/18 7:28 AM, Jonathan Corbet wrote:
> On Thu, 20 Dec 2018 16:23:38 +0100
> Thorsten Leemhuis <[email protected]> wrote:
>
>> While at it: Jonathan, you mentioned putting the script in scripts/, but
>> according to the Makefile in that directory it is "for various helper
>> programs used throughout the kernel for the build process". That's one
>> reason why it feels wrong to put it there. Another one: that script
>> targets users and thus we should try to make sure they can access it
>> easily. That's why I'm currently inclined to put it in tools/ somewhere.
>
> Yeah, tools/ is a better place. Maybe a tools/debugging directory or some
> such?

chktaint is similar (IMO) to scripts/decodecode though.

@Thorsten:
Signed-off-by: Randy Dunlap <[email protected]>


thanks,
--
~Randy

2018-12-20 17:06:50

by Randy Dunlap

Subject: Re: [PATCH 0/1] RFC: Revamp admin-guide/tainted-kernels.rst to make it more comprehensible

On 12/20/18 7:23 AM, Thorsten Leemhuis wrote:
> Hi! Am 17.12.18 um 22:06 schrieb Randy Dunlap:
>> On 12/17/18 10:24 AM, Jonathan Corbet wrote:
>>> Here's an idea if you feel like improving this: rather than putting an
>>> inscrutable program inline, add a taint_status script to scripts/ that
>>> prints out the status in fully human-readable form, with the explanation
>>> for every set bit.
>> And some people prefer not adding tools that use python, perl, etc.
>
> Yeah, I know :-/ On twitter @apexo (thx!) suggested these two:
>
> dc -e"[00000000000000]n2o$(cat /proc/sys/kernel/tainted)p"|fold
> -w1|tac|nl| grep -m 18 '.'
>
> (echo -n 000000000000000;(echo obase=2;cat
> /proc/sys/kernel/tainted)|bc)|fold -w1|tac|nl| grep -m 18 '.'
>
> But it needs bc, which often is not installed by default :-/ Any as you
> mentioned already: using Perl (
>
> perl -e 'printf("%016b\n",<STDIN>)' < /proc/sys/kernel/tainted |fold
> -w1|tac|nl
>
> ) also has it downsides. Having something that works in plain bash/sh
> would be great...
>
> Nevertheless: I'm still inclined to put a one liner decode command into
> tainted-kernels.rst so people can decode the file easily even if they do
> not have attached script at hand.
>
>> E.g., I use this shell script (named 'chktaint', which could probably
>> be done better):
>
> Many thx. Find a slightly improved version attached that directly prints
> the reason. I assume that's more like what Jonathan had in mind. The
> script now is also capable of decoding a value retrieved from
> /proc/sys/kernel/tainted on another system.

Thorsten:

- drop the trailing spaces on multiple lines
- s/follwing/following/


thanks.
--
~Randy

2018-12-21 03:20:55

by Thorsten Leemhuis

Subject: Re: [PATCH 0/1] RFC: Revamp admin-guide/tainted-kernels.rst to make it more comprehensible

Am 20.12.18 um 17:38 schrieb Randy Dunlap:
> On 12/20/18 7:28 AM, Jonathan Corbet wrote:
>> On Thu, 20 Dec 2018 16:23:38 +0100
>> Thorsten Leemhuis <[email protected]> wrote:
>>
>>> While at it: Jonathan, you mentioned putting the script in scripts/, but
>>> according to the Makefile in that directory it is "for various helper
>>> programs used throughout the kernel for the build process". That's one
>>> reason why it feels wrong to put it there. Another one: that script
>>> targets users and thus we should try to make sure they can access it
>>> easily. That's why I'm currently inclined to put it in tools/ somewhere.
>> Yeah, tools/ is a better place. Maybe a tools/debugging directory or some
>> such?
> chktaint

BTW, I renamed it to kernel-taintstatus, sounded more appropriate to me.
Does anyone mind?

> is similar (IMO) to scripts/decodecode though.

Hmmm. Maybe it would be better to move this to tools/? Will take a quick
look; I guess sooner or later my current endeavours will lead me to the
documentation that refers to this script anyway.

> @Thorsten:
> Signed-off-by: Randy Dunlap <[email protected]>

Thx. And thx for the feedback in the other reply.

BTW, for those following this thread and my earlier quest for a simple
cmd to decode /proc/sys/kernel/tainted: looks like @apexo on twitter
(thx again!) found a trick to do what I want which should work on most
systems out-of-the-box:

$ for i in $(seq 18); do echo $i $(($(cat /proc/sys/kernel/tainted)>>($i-1)&1));done

Ciao, Thorsten

2018-12-21 10:19:20

by Randy Dunlap

Subject: Re: [PATCH 0/1] RFC: Revamp admin-guide/tainted-kernels.rst to make it more comprehensible

On 12/20/18 10:21 AM, Thorsten Leemhuis wrote:
> Am 20.12.18 um 17:38 schrieb Randy Dunlap:
>> On 12/20/18 7:28 AM, Jonathan Corbet wrote:
>>> On Thu, 20 Dec 2018 16:23:38 +0100
>>> Thorsten Leemhuis <[email protected]> wrote:
>>>
>>>> While at it: Jonathan, you mentioned putting the script in scripts/, but
>>>> according to the Makefile in that directory it is "for various helper
>>>> programs used throughout the kernel for the build process". That's one
>>>> reason why it feels wrong to put it there. Another one: that script
>>>> targets users and thus we should try to make sure they can access it
>>>> easily. That's why I'm currently inclined to put it in tools/ somewhere.
>>> Yeah, tools/ is a better place. Maybe a tools/debugging directory or some
>>> such?
>> chktaint
>
> BTW, I renamed it to kernel-taintstatus, sounded more appropriate to me.
> Does anyone mind?

Not terribly, although that seems too long to me. ;)
maybe 'taintstatus'?

>> is similar (IMO) to scripts/decodecode though.
>
> Hmmm. Maybe it would be better to move this to tools/? Will take a quick
> look, guess sooner or later by current endeavours will lead me to the
> documentation that refers to this script anyway.
>
>> @Thorsten:
>> Signed-off-by: Randy Dunlap <[email protected]>
>
> Thx. And thx for the feedback in the other reply.
>
> BTW, for those following this thread and my earlier quest for a simple
> cmd to decode /proc/sys/kernel/tainted: looks like @apexo on twitter
> (thx again!) found a trick to do what I want which should work on most
> systems out-of-the-box:
>
> $ for i in $(seq 18); do echo $i $(($(cat
> /proc/sys/kernel/tainted)>>($i-1)&1));done

I think Jon mentioned this: The output should begin with bit #0,
not bit #1, so it should show bits 0 - 17 (or whatever the max is),
not 1 - 18.


--
~Randy

2018-12-21 16:41:41

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: [PATCH 0/1] RFC: Revamp admin-guide/tainted-kernels.rst to make it more comprehensible

On 20.12.18 at 21:10, Randy Dunlap wrote:
> On 12/20/18 10:21 AM, Thorsten Leemhuis wrote:
>> On 20.12.18 at 17:38, Randy Dunlap wrote:
>>> On 12/20/18 7:28 AM, Jonathan Corbet wrote:
>>>> On Thu, 20 Dec 2018 16:23:38 +0100
>>>> Thorsten Leemhuis <[email protected]> wrote:
>>>>> While at it: Jonathan, you mentioned putting the script in scripts/, but
>>>>> according to the Makefile in that directory it is "for various helper
>>>>> programs used throughout the kernel for the build process". That's one
>>>>> reason why it feels wrong to put it there. Another one: that script
>>>>> targets users and thus we should try to make sure they can access it
>>>>> easily. That's why I'm currently inclined to put it in tools/ somewhere.
>>>> Yeah, tools/ is a better place. Maybe a tools/debugging directory or some
>>>> such?
>>> chktaint
>> BTW, I renamed it to kernel-taintstatus, sounded more appropriate to me.
>> Does anyone mind?
> Not terribly, although that seems too long to me. ;)
> maybe 'taintstatus'?

I settled on "kernel-chktaint" for now. I'm not attached to the name, but IMHO
making it obvious what this tool checks is worth the "kernel-" prefix.

>> BTW, for those following this thread and my earlier quest for a simple
>> cmd to decode /proc/sys/kernel/tainted: looks like @apexo on twitter
>> (thx again!) found a trick to do what I want which should work on most
>> systems out-of-the-box:
>> $ for i in $(seq 18); do echo $i $(($(cat /proc/sys/kernel/tainted)>>($i-1)&1));done
> I think Jon mentioned this: The output should begin with bit #0,
> not bit #1, so it should show bits 0 - 17 (or whatever the max is),
> not 1 - 18.

No worries, replying to that is nearly next on my todo list.

BTW & FYI, find below the patch I have prepared now.

Ciao, Thorsten

commit 2aa04b7a65a5ecceac27a0d9c0d64a4b04ae943a
Author: Thorsten Leemhuis <[email protected]>
Date: Fri Dec 21 12:24:19 2018 +0100

tools: create tools/debugging/ and add a script decoding /proc/sys/kernel/tainted

Add a script to the tools/ directory that shows if or why the running kernel was
tainted. The script was mostly written by Randy Dunlap (thx!), who published it
while discussing changes that try to make admin-guide/tainted-kernels.rst more
comprehensible (https://lore.kernel.org/lkml/[email protected]/);
I enhanced the script a bit and created this patch.

As the script targets users I did not want to add it to scripts/, which according
to its Makefile "contains sources for various helper programs used throughout
the kernel for the build process". The directory tools/scripts/ also did not
look like a good fit, as the stuff that's there already is used for other
purposes. That's why I created a new directory for tools like this; maybe we
should move scripts/decodecode there as well, but that's something for another
day.

Signed-off-by: Randy Dunlap <[email protected]>
Signed-off-by: Thorsten Leemhuis <[email protected]>

diff --git a/tools/Makefile b/tools/Makefile
index abb358a70ad0..c0d1e59f5abb 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -12,6 +12,7 @@ help:
@echo ' acpi - ACPI tools'
@echo ' cgroup - cgroup tools'
@echo ' cpupower - a tool for all things x86 CPU power'
+ @echo ' debugging - tools for debugging'
@echo ' firewire - the userspace part of nosy, an IEEE-1394 traffic sniffer'
@echo ' freefall - laptop accelerometer program for disk protection'
@echo ' gpio - GPIO tools'
@@ -60,7 +61,7 @@ acpi: FORCE
cpupower: FORCE
$(call descend,power/$@)

-cgroup firewire hv guest spi usb virtio vm bpf iio gpio objtool leds wmi pci: FORCE
+cgroup firewire hv guest spi usb virtio vm bpf iio gpio objtool leds wmi pci debugging: FORCE
$(call descend,$@)

liblockdep: FORCE
@@ -95,7 +96,8 @@ kvm_stat: FORCE
all: acpi cgroup cpupower gpio hv firewire liblockdep \
perf selftests spi turbostat usb \
virtio vm bpf x86_energy_perf_policy \
- tmon freefall iio objtool kvm_stat wmi pci
+ tmon freefall iio objtool kvm_stat wmi \
+ pci debugging

acpi_install:
$(call descend,power/$(@:_install=),install)
@@ -103,7 +105,7 @@ acpi_install:
cpupower_install:
$(call descend,power/$(@:_install=),install)

-cgroup_install firewire_install gpio_install hv_install iio_install perf_install spi_install usb_install virtio_install vm_install bpf_install objtool_install wmi_install pci_install:
+cgroup_install firewire_install gpio_install hv_install iio_install perf_install spi_install usb_install virtio_install vm_install bpf_install objtool_install wmi_install pci_install debugging_install:
$(call descend,$(@:_install=),install)

liblockdep_install:
@@ -129,7 +131,7 @@ install: acpi_install cgroup_install cpupower_install gpio_install \
perf_install selftests_install turbostat_install usb_install \
virtio_install vm_install bpf_install x86_energy_perf_policy_install \
tmon_install freefall_install objtool_install kvm_stat_install \
- wmi_install pci_install
+ wmi_install pci_install debugging_install

acpi_clean:
$(call descend,power/acpi,clean)
@@ -137,7 +139,7 @@ acpi_clean:
cpupower_clean:
$(call descend,power/cpupower,clean)

-cgroup_clean hv_clean firewire_clean spi_clean usb_clean virtio_clean vm_clean wmi_clean bpf_clean iio_clean gpio_clean objtool_clean leds_clean pci_clean:
+cgroup_clean hv_clean firewire_clean spi_clean usb_clean virtio_clean vm_clean wmi_clean bpf_clean iio_clean gpio_clean objtool_clean leds_clean pci_clean debugging_clean:
$(call descend,$(@:_clean=),clean)

liblockdep_clean:
@@ -175,6 +177,6 @@ clean: acpi_clean cgroup_clean cpupower_clean hv_clean firewire_clean \
perf_clean selftests_clean turbostat_clean spi_clean usb_clean virtio_clean \
vm_clean bpf_clean iio_clean x86_energy_perf_policy_clean tmon_clean \
freefall_clean build_clean libbpf_clean libsubcmd_clean liblockdep_clean \
- gpio_clean objtool_clean leds_clean wmi_clean pci_clean
+ gpio_clean objtool_clean leds_clean wmi_clean pci_clean debugging_clean

.PHONY: FORCE
diff --git a/tools/debugging/Makefile b/tools/debugging/Makefile
new file mode 100644
index 000000000000..e2b7c1a6fb8f
--- /dev/null
+++ b/tools/debugging/Makefile
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: GPL-2.0
+# Makefile for debugging tools
+
+PREFIX ?= /usr
+BINDIR ?= bin
+INSTALL ?= install
+
+TARGET = kernel-chktaint
+
+all: $(TARGET)
+
+clean:
+
+install: kernel-chktaint
+ $(INSTALL) -D -m 755 $(TARGET) $(DESTDIR)$(PREFIX)/$(BINDIR)/$(TARGET)
+
diff --git a/tools/debugging/kernel-chktaint b/tools/debugging/kernel-chktaint
new file mode 100644
index 000000000000..98861858b192
--- /dev/null
+++ b/tools/debugging/kernel-chktaint
@@ -0,0 +1,199 @@
+#! /bin/sh
+# SPDX-License-Identifier: GPL-2.0
+#
+# Randy Dunlap <[email protected]>, 2018
+# Thorsten Leemhuis <[email protected]>, 2018
+
+usage()
+{
+ cat <<EOF
+usage: ${0##*/}
+ ${0##*/} <int>
+
+Call without parameters to decode /proc/sys/kernel/tainted.
+
+Call with a positive integer as parameter to decode a value you
+retrieved from /proc/sys/kernel/tainted on another system.
+
+EOF
+}
+
+if [ "$1"x != "x" ]; then
+ if [ "$1"x = "--helpx" ] || [ "$1"x = "-hx" ] ; then
+ usage
+ exit 1
+ elif [ $1 -ge 0 ] 2>/dev/null ; then
+ taint=$1
+ else
+ echo "Error: Parameter '$1' not a positive integer. Aborting." >&2
+ exit 1
+ fi
+else
+ TAINTFILE="/proc/sys/kernel/tainted"
+ if [ ! -r $TAINTFILE ]; then
+ echo "No file: $TAINTFILE"
+ exit
+ fi
+
+ taint=`cat $TAINTFILE`
+fi
+
+if [ $taint -eq 0 ]; then
+ echo "Kernel not Tainted"
+ exit
+else
+ echo "Kernel is Tainted for following reasons:"
+fi
+
+T=$taint
+out=
+
+addout() {
+ out=$out$1
+}
+
+if [ `expr $T % 2` -eq 0 ]; then
+ addout "G"
+else
+ addout "P"
+ echo " * Proprietary module was loaded."
+fi
+
+T=`expr $T / 2`
+if [ `expr $T % 2` -eq 0 ]; then
+ addout " "
+else
+ addout "F"
+ echo " * Module was force loaded."
+fi
+
+T=`expr $T / 2`
+if [ `expr $T % 2` -eq 0 ]; then
+ addout " "
+else
+ addout "S"
+ echo " * SMP kernel oops on an officially SMP incapable processor."
+fi
+
+T=`expr $T / 2`
+if [ `expr $T % 2` -eq 0 ]; then
+ addout " "
+else
+ addout "R"
+ echo " * Module was force unloaded."
+fi
+
+T=`expr $T / 2`
+if [ `expr $T % 2` -eq 0 ]; then
+ addout " "
+else
+ addout "M"
+ echo " * Processor reported a Machine Check Exception (MCE)."
+fi
+
+T=`expr $T / 2`
+if [ `expr $T % 2` -eq 0 ]; then
+ addout " "
+else
+ addout "B"
+ echo " * Bad page referenced or some unexpected page flags."
+fi
+
+T=`expr $T / 2`
+if [ `expr $T % 2` -eq 0 ]; then
+ addout " "
+else
+ addout "U"
+ echo " * Taint requested by userspace application."
+fi
+
+T=`expr $T / 2`
+if [ `expr $T % 2` -eq 0 ]; then
+ addout " "
+else
+ addout "D"
+ echo " * Kernel died recently, i.e. there was an OOPS or BUG"
+fi
+
+T=`expr $T / 2`
+if [ `expr $T % 2` -eq 0 ]; then
+ addout " "
+else
+ addout "A"
+ echo " * ACPI table overridden by user."
+fi
+
+T=`expr $T / 2`
+if [ `expr $T % 2` -eq 0 ]; then
+ addout " "
+else
+ addout "W"
+ echo " * Kernel issued warning."
+fi
+
+T=`expr $T / 2`
+if [ `expr $T % 2` -eq 0 ]; then
+ addout " "
+else
+ addout "C"
+ echo " * Staging driver was loaded."
+fi
+
+T=`expr $T / 2`
+if [ `expr $T % 2` -eq 0 ]; then
+ addout " "
+else
+ addout "I"
+ echo " * Workaround for bug in platform firmware applied."
+fi
+
+T=`expr $T / 2`
+if [ `expr $T % 2` -eq 0 ]; then
+ addout " "
+else
+ addout "O"
+ echo " * Externally-built ('out-of-tree') module was loaded"
+fi
+
+T=`expr $T / 2`
+if [ `expr $T % 2` -eq 0 ]; then
+ addout " "
+else
+ addout "E"
+ echo " * Unsigned module was loaded."
+fi
+
+T=`expr $T / 2`
+if [ `expr $T % 2` -eq 0 ]; then
+ addout " "
+else
+ addout "L"
+ echo " * Soft lockup occurred."
+fi
+
+T=`expr $T / 2`
+if [ `expr $T % 2` -eq 0 ]; then
+ addout " "
+else
+ addout "K"
+ echo " * Kernel live patched."
+fi
+
+T=`expr $T / 2`
+if [ `expr $T % 2` -eq 0 ]; then
+ addout " "
+else
+ addout "X"
+ echo " * Auxiliary taint, defined for and used by distros."
+fi
+
+T=`expr $T / 2`
+if [ `expr $T % 2` -eq 0 ]; then
+ addout " "
+else
+ addout "T"
+ echo " * Kernel was built with the struct randomization plugin."
+fi
+
+echo "Raw taint value as int/string: $taint/'$out'"
+#EOF#

2018-12-21 23:44:02

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: [PATCH 0/1] RFC: Revamp admin-guide/tainted-kernels.rst to make it more comprehensible

Hi! On 17.12.18 at 19:24, Jonathan Corbet wrote:
> On Mon, 17 Dec 2018 16:20:42 +0100
> Thorsten Leemhuis <[email protected]> wrote:
>
>> +might be relevant later when investigating problems. Don't worry
>> +yourself too much about this, most of the time it's not a problem to run
> s/yourself//

Thx for this and other suggestions or fixes, consider them implemented when
not mentioned in this mail. Find the current state of the text at the end of
this mail for reference.

> [...]
>> +At runtime, you can query the tainted state by reading
>> +``/proc/sys/kernel/tainted``. If that returns ``0``, the kernel is not
>> +tainted; any other number indicates the reasons why it is. You might
>> +find that number in below table if there was only one reason that got
>> +the kernel tainted. If there were multiple reasons you need to decode
>> +the number, as it is a bitfield, where each bit indicates the absence or
>> +presence of a particular type of taint. You can use the following python
>> +command to decode::
> Here's an idea if you feel like improving this: rather than putting an
> inscrutable program inline, add a taint_status script to scripts/ that
> prints out the status in fully human-readable form, with the explanation
> for every set bit.

I posted the script earlier today and noticed now that it prints only
the fully human-readable form, not whether a bit is set or unset. Would you
prefer if it did that as well?

>> +=== === ====== ========================================================
>> +Bit Log Int Reason that got the kernel tainted
>> +=== === ====== ========================================================
>> + 1) G/P 0 proprietary module got loaded
> I'd s/got/was/ throughout. Also, this is the kernel, we start counting at
> zero! :)

Hehe, yeah :-D At first I actually started at zero, but that looked
odd as the old explanations (those already in the file) start to count at one.
Having an off-by-one within one document is just confusing, that's why I
decided against starting at zero here.

Another reason that came to my mind when reading your comment: Yes, this
is the kernel, but the document should be easy to understand even for
inexperienced users (e.g. people that know how to open and use command
line tools, but never learned programming). That's why I'm leaning towards
starting with one everywhere. But yes, that can be confusing, which is
why I added a note, albeit I'm not really happy with it yet:

"""
Note: This document is aimed at users and thus starts counting at one here and
in other places. To start counting at zero, as is usual for developers, use
``seq 0 17`` and shift by ``$i`` instead of ``$i-1``.
"""

See below for full context. Anyway: I can change the text to start at zero if
you prefer it.

Ciao, Thorsten

---

Tainted kernels
---------------

The kernel will mark itself as 'tainted' when something occurs that might be
relevant later when investigating problems. Don't worry too much about this:
most of the time it's not a problem to run a tainted kernel. The information is
mainly of interest once someone wants to investigate a problem, as its real
cause might be the event that got the kernel tainted. That's why developers
often ignore bug reports from tainted kernels, so try to reproduce problems
with an untainted kernel.

Note the kernel will remain tainted even after you undo what caused the taint
(e.g. unload a proprietary kernel module), to indicate that the kernel still
cannot be trusted. That's also why the kernel will print the tainted state when
it notices an internal problem (a 'kernel bug'), a recoverable error
('kernel oops') or a non-recoverable error ('kernel panic') and writes debug
information about this to the logs that ``dmesg`` outputs. It's also possible to
check the tainted state at runtime through a file in ``/proc/``.


Tainted flag in bug, oops or panic messages
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You find the tainted state near the top in a line starting with 'CPU:'; whether
and why the kernel was tainted is shown after the Process ID ('PID:') and a
shortened name of the command ('Comm:') that triggered the event:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
Oops: 0002 [#1] SMP PTI
CPU: 0 PID: 4424 Comm: insmod Tainted: P W O 4.20.0-0.rc6.fc30 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:my_oops_init+0x13/0x1000 [kpanic]
[...]

You'll find **'Not tainted: '** there if the kernel was not tainted at the
time of the event; if it was, it will print **'Tainted: '** followed by a
series of characters, each either a letter or a blank. In the example above
it's '``Tainted: P W O ``', as the kernel got tainted earlier because a
proprietary module (``P``) was loaded, a warning occurred (``W``), and an
externally-built module was loaded (``O``). To decode other letters use the
table below.
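
A minimal sketch of how one might pick the relevant lines out of a longer log:
it only filters for the two marker strings described above. The helper name
``taint_lines`` is made up for this example and is not part of any kernel tool.

```shell
# Hypothetical helper: keep only log lines that carry the taint state.
taint_lines() {
    grep -E 'Not tainted|Tainted: '
}

# Example with two lines from the oops above; on a live system one would
# feed it the kernel log instead, e.g. "dmesg | taint_lines".
printf '%s\n' \
    'Oops: 0002 [#1] SMP PTI' \
    'CPU: 0 PID: 4424 Comm: insmod Tainted: P W O 4.20.0-0.rc6.fc30 #1' |
    taint_lines
```

Reading the kernel log with ``dmesg`` may require root privileges on some
systems.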


Decoding tainted state at runtime
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

At runtime, you can query the tainted state by running
``cat /proc/sys/kernel/tainted``. If that returns ``0``, the kernel is not
tainted; any other number indicates the reasons why it is. The easiest way to
decode that number is the script ``tools/debugging/kernel-chktaint``, which your
distribution might ship as part of a package called ``linux-tools`` or
``kernel-tools``; if it doesn't, you can download the script from
`git.kernel.org <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/tools/debugging/kernel-chktaint>`_
and execute it with ``sh kernel-chktaint``.

If you do not want to run that script you can try to decode the number yourself.
That's easy if there was only one reason that got your kernel tainted, as in
this case you can look up the number in the table below. If there were multiple
reasons you need to decode the number, as it is a bitfield where each bit
indicates the absence or presence of a particular type of taint. It's best to
leave that to the aforementioned script, but if you need something quick you can
use this shell command to check which bits are set:

$ for i in $(seq 18); do echo $i $(($(cat /proc/sys/kernel/tainted)>>($i-1)&1));done

Note: This document is aimed at users and thus starts counting at one here and
in other places. To start counting at zero, as is usual for developers, use
``seq 0 17`` and shift by ``$i`` instead of ``$i-1``.
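
For those who prefer counting from zero, the same check could be written as a
small function. This is only a sketch: the name ``taint_bits`` is made up for
this example, and it assumes a POSIX shell whose arithmetic expansion supports
``>>``. It takes the value as a parameter and falls back to reading
``/proc/sys/kernel/tainted`` when called without one.

```shell
# Print "bit value" pairs for bits 0..17 of a taint value.
# $1: taint value (optional; defaults to /proc/sys/kernel/tainted)
taint_bits() {
    taint=${1:-$(cat /proc/sys/kernel/tainted)}
    bit=0
    while [ "$bit" -le 17 ]; do
        # Shift the wanted bit to the lowest position and mask it out.
        echo "$bit $(( (taint >> bit) & 1 ))"
        bit=$((bit + 1))
    done
}

# Decode a value retrieved from another system.
taint_bits 4609
```

Passing the value as a parameter makes it possible to decode a number taken
from a bug report without touching the running system.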

Table for decoding tainted state
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

==== === ====== ========================================================
Pos. Log Number Reason that got the kernel tainted
==== === ====== ========================================================
  1) G/P      1 proprietary module was loaded
  2) _/F      2 module was force loaded
  3) _/S      4 SMP kernel oops on an officially SMP incapable processor
  4) _/R      8 module was force unloaded
  5) _/M     16 processor reported a Machine Check Exception (MCE)
  6) _/B     32 bad page referenced or some unexpected page flags
  7) _/U     64 taint requested by userspace application
  8) _/D    128 kernel died recently, i.e. there was an OOPS or BUG
  9) _/A    256 ACPI table overridden by user
 10) _/W    512 kernel issued warning
 11) _/C   1024 staging driver was loaded
 12) _/I   2048 workaround for bug in platform firmware applied
 13) _/O   4096 externally-built ("out-of-tree") module was loaded
 14) _/E   8192 unsigned module was loaded
 15) _/L  16384 soft lockup occurred
 16) _/K  32768 kernel live patched
 17) _/X  65536 auxiliary taint, defined for and used by distros
 18) _/T 131072 kernel was built with the struct randomization plugin
==== === ====== ========================================================

Note: To make reading easier ``_`` is representing a blank in this
table.
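
As a worked example, a value of 4609 is the sum 4096 + 512 + 1, which
corresponds to positions 13 (``O``), 10 (``W``) and 1 (``P``), given that the
first position carries the value 1. The sketch below splits a value into such
powers of two; the function name ``taint_numbers`` is made up for this example.

```shell
# Print every power of two (1 .. 131072) that is contained in a taint value.
# $1: taint value, e.g. as read from /proc/sys/kernel/tainted
taint_numbers() {
    value=$1
    number=1
    while [ "$number" -le 131072 ]; do
        # A non-zero bitwise AND means this taint reason is present.
        if [ $(( value & number )) -ne 0 ]; then
            echo "$number"
        fi
        number=$((number * 2))
    done
}

taint_numbers 4609
```

Each number printed can then be looked up in the Number column of the table.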

More detailed explanation for tainting
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1) ``G`` if all modules loaded have a GPL or compatible license, ``P`` if
any proprietary module has been loaded. Modules without a
MODULE_LICENSE or with a MODULE_LICENSE that is not recognised by
insmod as GPL compatible are assumed to be proprietary.

2) ``F`` if any module was force loaded by ``insmod -f``, ``' '`` if all
modules were loaded normally.

3) ``S`` if the oops occurred on an SMP kernel running on hardware that
hasn't been certified as safe to run multiprocessor.
Currently this occurs only on various Athlons that are not
SMP capable.

4) ``R`` if a module was force unloaded by ``rmmod -f``, ``' '`` if all
modules were unloaded normally.

5) ``M`` if any processor has reported a Machine Check Exception,
``' '`` if no Machine Check Exceptions have occurred.

6) ``B`` if a page-release function has found a bad page reference or
some unexpected page flags.

7) ``U`` if a user or user application specifically requested that the
Tainted flag be set, ``' '`` otherwise.

8) ``D`` if the kernel has died recently, i.e. there was an OOPS or BUG.

9) ``A`` if the ACPI table has been overridden.

10) ``W`` if a warning has previously been issued by the kernel.
(Though some warnings may set more specific taint flags.)

11) ``C`` if a staging driver has been loaded.

12) ``I`` if the kernel is working around a severe bug in the platform
firmware (BIOS or similar).

13) ``O`` if an externally-built ("out-of-tree") module has been loaded.

14) ``E`` if an unsigned module has been loaded in a kernel supporting
module signature.

15) ``L`` if a soft lockup has previously occurred on the system.

16) ``K`` if the kernel has been live patched.

17) ``X`` Auxiliary taint, defined for and used by Linux distributors.

18) ``T`` Kernel was built with the randstruct plugin, which can intentionally
produce extremely unusual kernel structure layouts (even performance
pathological ones), which is important to know when debugging. Set at
build time.

2019-01-03 13:56:47

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: [PATCH 0/1] RFC: Revamp admin-guide/tainted-kernels.rst to make it more comprehensible

Hi Jonathan! If you have a minute, could you provide feedback on the mail
below? I sent it right before Christmas to get it off my todo list, but
due to the timing it afaics fell through the cracks a bit, as I had
feared already (no worries). Ciao, Thorsten

On 21.12.18 at 16:26, Thorsten Leemhuis wrote:

2019-01-03 23:44:43

by Jonathan Corbet

[permalink] [raw]
Subject: Re: [PATCH 0/1] RFC: Revamp admin-guide/tainted-kernels.rst to make it more comprehensible

Sorry for the delay in responding to this ... $EXCUSES ...

On Fri, 21 Dec 2018 16:26:31 +0100
Thorsten Leemhuis <[email protected]> wrote:

> > Here's an idea if you feel like improving this: rather than putting an
> > inscrutable program inline, add a taint_status script to scripts/ that
> > prints out the status in fully human-readable form, with the explanation
> > for every set bit.
>
> I posted the script earlier today and noticed now that it prints only
> the fully human-readable form, not whether a bit is set or unset. Would you
> prefer if it did that as well?

Not sure I have an opinion; perhaps if it can be done in a readable way
putting more information is better than less.

> >> +=== === ====== ========================================================
> >> +Bit Log Int Reason that got the kernel tainted
> >> +=== === ====== ========================================================
> >> + 1) G/P 0 proprietary module got loaded
> > I'd s/got/was/ throughout. Also, this is the kernel, we start counting at
> > zero! :)
>
> Hehe, yeah :-D At first I actually started at zero, but that looked
> odd as the old explanations (those already in the file) start to count at one.
> Having an off-by-one within one document is just confusing, that's why I
> decided against starting at zero here.
>
> Another reason that came to my mind when reading your comment: Yes, this
> is the kernel, but the document should be easy to understand even for
> inexperienced users (e.g. people that know how to open and use command
> line tools, but never learned programming). That's why I'm leaning towards
> starting with one everywhere. But yes, that can be confusing, that's
> why I added a note, albeit I'm not really happy with it yet:
>
> """
> Note: This document is aimed at users and thus starts to count at one here and
> in other places. Use ``seq 0 17`` instead to start counting at zero, as it's
> normal for developers.
> """
>
> See below for full context. Anyway: I can change the text to start at zero if
> you prefer it.

This is a kernel document in the end, so I do really think that we should
be consistent with kernel conventions.

> Tainted kernels
> ---------------
>
> The kernel will mark itself as 'tainted' when something occurs that might be
> relevant later when investigating problems. Don't worry too much about this,
> most of the time it's not a problem to run a tainted kernel; the information is
> mainly of interest once someone wants to investigate some problem, as its real
> cause might be the event that got the kernel tainted. That's why bug reports
> from tainted kernels will often be ignored by developers, hence try to reproduce
> problems with an untainted kernel.
>
> Note the kernel will remain tainted even after you undo what caused the taint
> (e.g. unload a proprietary kernel module), to indicate the kernel remains not
> trustworthy. That's also why the kernel will print the tainted state when it
> notices an internal problem (a 'kernel bug'), a recoverable error
> ('kernel oops') or a non-recoverable error ('kernel panic') and writes debug
> information about this to the logs ``dmesg`` outputs. It's also possible to
> check the tainted state at runtime through a file in ``/proc/``.
>
>
> Tainted flag in bug, oops or panic messages
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> You find the tainted state near the top in a line starting with 'CPU:'; whether
> and why the kernel was tainted is shown after the Process ID ('PID:') and a
> shortened name of the command ('Comm:') that triggered the event:
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> Oops: 0002 [#1] SMP PTI
> CPU: 0 PID: 4424 Comm: insmod Tainted: P W O 4.20.0-0.rc6.fc30 #1
> Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> RIP: 0010:my_oops_init+0x13/0x1000 [kpanic]
> [...]
>
> You'll find a **'Not tainted: '** there if the kernel was not tainted at the
> time of the event; if it was, it will print **'Tainted: '** followed by
> characters that are either letters or blanks. The meaning of those characters
> is explained in the table below. In above example it's
> '``Tainted: P W O ``' as as the

A seriously minor nit, but I would format this as:

    In the above example it's::

        Tainted: P W O

    as the kernel got tainted...

That will keep the text from being broken in unfortunate places in the
formatted docs.

(One "as" is also sufficient :)

> kernel got tainted earlier because a proprietary module (``P``) was loaded, a
> warning occurred (``W``), and an externally-built module was loaded (``O``).
> To decode other letters use the table below.
>
>
> Decoding tainted state at runtime
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> At runtime, you can query the tainted state by running
> ``cat /proc/sys/kernel/tainted``. If that returns ``0``, the kernel is not
> tainted; any other number indicates the reasons why it is. The easiest way to
> decode that number is the script ``tools/debugging/kernel-chktaint``, which your
> distribution might ship as part of a package called ``linux-tools`` or
> ``kernel-tools``; if it doesn't you can download the script from
> `git.kernel.org <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/tools/debugging/kernel-chktaint>`_
> and execute it with ``sh kernel-chktaint``.
>
> If you do not want to run that script you can try to decode the number yourself.
> That's easy if there was only one reason that got your kernel tainted, as in
> this case you can find the number with the table below. If there were multiple
> reasons you need to decode the number, as it is a bitfield, where each bit
> indicates the absence or presence of a particular type of taint. It's best to
> leave that to the aforementioned script, but if you need something quick you can
> use this shell command to check which bits are set:
>
> $ for i in $(seq 18); do echo $i $(($(cat /proc/sys/kernel/tainted)>>($i-1)&1));done
>
> Note: This document is aimed at users and thus starts to count at one here and
> in other places. Use ``seq 0 17`` instead to start counting at zero, as it's
> normal for developers.

Again, just zero-base it and keep things simple and consistent.
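
For illustration only, a zero-based variant of that check might look like the
sketch below. The helper name taint_bits is made up here, and the value is
passed in as an argument so the loop can be tried without reading /proc; the
limit of 18 simply mirrors the number of flags in the table.

```shell
#!/bin/sh
# Print the zero-based positions of all set taint bits, one per line.
# taint_bits is a hypothetical helper name, not something from the patch.
taint_bits() {
    value=$1
    bit=0
    # 18 flags are listed in the table under discussion (bits 0..17).
    while [ "$bit" -lt 18 ]; do
        if [ $(( (value >> bit) & 1 )) -eq 1 ]; then
            echo "$bit"
        fi
        bit=$((bit + 1))
    done
}

# On a live system one would pass the real value instead:
#   taint_bits "$(cat /proc/sys/kernel/tainted)"
taint_bits 4609
```

With 4609 (the value matching the P, W and O example) this prints 0, 9 and 12.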

> Table for decoding tainted state
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> ====  ===  ======  ========================================================
> Pos.  Log  Number  Reason that got the kernel tainted
> ====  ===  ======  ========================================================
>   1)  G/P       1  proprietary module was loaded
>   2)  _/F       2  module was force loaded
>   3)  _/S       4  SMP kernel oops on an officially SMP incapable processor
>   4)  _/R       8  module was force unloaded
>   5)  _/M      16  processor reported a Machine Check Exception (MCE)
>   6)  _/B      32  bad page referenced or some unexpected page flags
>   7)  _/U      64  taint requested by userspace application
>   8)  _/D     128  kernel died recently, i.e. there was an OOPS or BUG
>   9)  _/A     256  ACPI table overridden by user
>  10)  _/W     512  kernel issued warning
>  11)  _/C    1024  staging driver was loaded
>  12)  _/I    2048  workaround for bug in platform firmware applied
>  13)  _/O    4096  externally-built ("out-of-tree") module was loaded
>  14)  _/E    8192  unsigned module was loaded
>  15)  _/L   16384  soft lockup occurred
>  16)  _/K   32768  kernel was live patched
>  17)  _/X   65536  auxiliary taint, defined for and used by distros
>  18)  _/T  131072  kernel was built with the struct randomization plugin
> ====  ===  ======  ========================================================
>
> Note: To make reading easier ``_`` is representing a blank in this
> table.
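
As a rough illustration of how the "Log" column relates to the number, the
sketch below turns a taint value into a string of letters. It is an assumption
about the mapping rather than the kernel's own code: taint_letters is a
hypothetical name, blanks are shown as ``_`` as in the note above, and the last
two flags use X and T as in the detailed explanations.

```shell
#!/bin/sh
# Translate a taint value into the letters from the "Log" column.
# taint_letters is a hypothetical helper; letter order is bit 0 first.
taint_letters() {
    value=$1
    letters="PFSRMBUDAWCIOELKXT"
    out=""
    bit=0
    while [ "$bit" -lt 18 ]; do
        if [ $(( (value >> bit) & 1 )) -eq 1 ]; then
            # cut counts characters from 1, the bits are counted from 0.
            out="$out$(printf '%s' "$letters" | cut -c $((bit + 1)))"
        elif [ "$bit" -eq 0 ]; then
            # An unset first flag is shown as G instead of a blank.
            out="${out}G"
        else
            out="${out}_"
        fi
        bit=$((bit + 1))
    done
    echo "$out"
}

taint_letters 4609
```

For 4609 the result has P, W and O at positions 0, 9 and 12 and ``_``
everywhere else; for 0 it starts with G.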
>
> More detailed explanation for tainting
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> 1) ``G`` if all modules loaded have a GPL or compatible license, ``P`` if
> any proprietary module has been loaded. Modules without a
> MODULE_LICENSE or with a MODULE_LICENSE that is not recognised by
> insmod as GPL compatible are assumed to be proprietary.
>
> 2) ``F`` if any module was force loaded by ``insmod -f``, ``' '`` if all
> modules were loaded normally.
>
> 3) ``S`` if the oops occurred on an SMP kernel running on hardware that
> hasn't been certified as safe to run multiprocessor.
> Currently this occurs only on various Athlons that are not
> SMP capable.

I wonder if any such hardware has ever run anything remotely resembling a
current kernel. In any case, a quick grep suggests that this taint can be
set in a number of other places as well.

> 4) ``R`` if a module was force unloaded by ``rmmod -f``, ``' '`` if all
> modules were unloaded normally.
>
> 5) ``M`` if any processor has reported a Machine Check Exception,
> ``' '`` if no Machine Check Exceptions have occurred.
>
> 6) ``B`` if a page-release function has found a bad page reference or
> some unexpected page flags.

I'd be tempted to add something like: "This taint indicates a hardware
problem or a kernel bug; there should be other information in the log
indicating why this bit was set."

> 7) ``U`` if a user or user application specifically requested that the
> Tainted flag be set, ``' '`` otherwise.
>
> 8) ``D`` if the kernel has died recently, i.e. there was an OOPS or BUG.
>
> 9) ``A`` if the ACPI table has been overridden.
>
> 10) ``W`` if a warning has previously been issued by the kernel.
> (Though some warnings may set more specific taint flags.)
>
> 11) ``C`` if a staging driver has been loaded.

There's a couple of other situations where this one is set as well; not
sure if it's worth the trouble to try to describe them.

> 12) ``I`` if the kernel is working around a severe bug in the platform
> firmware (BIOS or similar).
>
> 13) ``O`` if an externally-built ("out-of-tree") module has been loaded.
>
> 14) ``E`` if an unsigned module has been loaded in a kernel supporting
> module signature.
>
> 15) ``L`` if a soft lockup has previously occurred on the system.
>
> 16) ``K`` if the kernel has been live patched.
>
> 17) ``X`` Auxiliary taint, defined for and used by Linux distributors.

Do we know anything about whether anybody uses this?

> 18) ``T`` Kernel was built with randstruct plugin, which can intentionally
> produce extremely unusual kernel structure layouts (even performance
> pathological ones), which is important to know when debugging. Set at
> build time.

with *the* randstruct plugin

Overall, just nits except for the start-with-zero thing.

Thanks,

jon

2019-01-07 19:03:07

by Thorsten Leemhuis

Subject: Re: [PATCH 0/1] RFC: Revamp admin-guide/tainted-kernels.rst to make it more comprehensible

On 03.01.19 at 19:12, Jonathan Corbet wrote:
> On Fri, 21 Dec 2018 16:26:31 +0100
> Thorsten Leemhuis <[email protected]> wrote:
>>> Here's an idea if you feel like improving this: rather than putting an
>>> inscrutable program inline, add a taint_status script to scripts/ that
>>> prints out the status in fully human-readable form, with the explanation
>>> for every set bit.
>> I posted the script earlier today and noticed now that it prints only
>> the fully human-readable form, not whether a bit is set or unset. Would you
>> prefer if it did that as well?
> Not sure I have an opinion; perhaps if it can be done in a readable way
> putting more information is better than less.

I think I found a way, the script output looks like this now:

Kernel is Tainted for following reasons:
* Proprietary module was loaded (#0)
* Kernel issued warning (#9)
* Externally-built ('out-of-tree') module was loaded (#12)
For a more detailed explanation of the various taint flags see
Documentation/admin-guide/tainted-kernels.rst in the Linux kernel sources
or https://kernel.org/doc/html/latest/admin-guide/tainted-kernels.html
Raw taint value as int/string: 4609/'P W O
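
To show how the raw value relates to the bit numbers in parentheses, here is a
small sketch (not part of the posted script; taint_value is a name invented
for this example) that ORs the zero-based bits back together:

```shell
#!/bin/sh
# Combine zero-based bit numbers into the raw taint value.
# taint_value is a hypothetical helper, not part of the posted script.
taint_value() {
    value=0
    for bit in "$@"; do
        value=$(( value | (1 << bit) ))
    done
    echo "$value"
}

# Bits 0, 9 and 12 correspond to 1 + 512 + 4096.
taint_value 0 9 12
```

This prints 4609, matching the raw value in the output above.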

>>>> +=== === ====== ========================================================
>>>> +Bit Log Int Reason that got the kernel tainted
>>>> +=== === ====== ========================================================
>>>> + 1) G/P 0 proprietary module got loaded
>>> I'd s/got/was/ throughout. Also, this is the kernel, we start counting at
>>> zero! :)
>> Hehe, yeah :-D At first I actually started at zero, but that looked
>> odd as the old explanations (those already in the file) start to count at one.
>> Having an off-by-one within one document is just confusing, that's why I
>> decided against starting at zero here.
>> Another reason that came to my mind when reading your comment: Yes, this
>> is the kernel, but the document should be easy to understand even for
>> inexperienced users (e.g. people that know how to open and use command
>> line tools, but never learned programming). That's why I'm leaning towards
>> starting with one everywhere. But yes, that can be confusing, that's
>> why I added a note, albeit I'm not really happy with it yet:
>> """
>> Note: This document is aimed at users and thus starts to count at one here and
>> in other places. Use ``seq 0 17`` instead to start counting at zero, as it's
>> normal for developers.
>> """
>> See below for full context. Anyway: I can change the text to start at zero if
>> you prefer it.
> This is a kernel document in the end, so I do really think that we should
> be consistent with kernel conventions.

Okay. I still don't like it, but well, maybe you are right. And in the
end we can change it easily later if we want to.

> [...]
>> 3) ``S`` if the oops occurred on an SMP kernel running on hardware that
>> hasn't been certified as safe to run multiprocessor.
>> Currently this occurs only on various Athlons that are not
>> SMP capable.
> I wonder if any such hardware has ever run anything remotely resembling a
> current kernel. In any case, a quick grep suggests that this taint can be
> set in a number of other places as well.

I looked into this and...

> [...]
>> 11) ``C`` if a staging driver has been loaded.
> There's a couple of other situations where this one is set as well; not
> sure if it's worth the trouble to try to describe them.

...this, but decided that takes things too far for now. Thus I'll leave
those as they are for now, but will take a closer look and start a discussion
dedicated to this with the relevant parties that use those flags.

>> 17) ``X`` Auxiliary taint, defined for and used by Linux distributors.
> Do we know anything about whether anybody uses this?

Seems SUSE does: https://www.suse.com/de-de/support/kb/doc/?id=3582750
Or at least did in the not too distant past, which I'd say is good enough
for now.

> [...]
> Overall, just nits except for the start-with-zero thing.

All the other nits stripped from the reply are fixed; I will send out an
updated patch series tomorrow.

Side note FYI: While at it I decided to update the tainted section in
Documentation/sysctl/kernel.txt and reuse the short description
used in the table of the revamped tainted-kernels.rst, which results
in the patch at the end (sigh, this patch slowly gets too big):

Ciao, Thorsten

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 1b8775298cf7..8e1c21e1fdf6 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -93,7 +93,7 @@ show up in /proc/sys/kernel:
- stop-a [ SPARC only ]
- sysrq ==> Documentation/admin-guide/sysrq.rst
- sysctl_writes_strict
-- tainted
+- tainted ==> Documentation/admin-guide/tainted-kernels.rst
- threads-max
- unknown_nmi_panic
- watchdog
@@ -1005,36 +1005,31 @@ compilation sees a 1% slowdown, other systems and workloads may vary.

==============================================================

-tainted:
+tainted

Non-zero if the kernel has been tainted. Numeric values, which can be
ORed together. The letters are seen in "Tainted" line of Oops reports.

- 1 (P): A module with a non-GPL license has been loaded, this
- includes modules with no license.
- Set by modutils >= 2.4.9 and module-init-tools.
- 2 (F): A module was force loaded by insmod -f.
- Set by modutils >= 2.4.9 and module-init-tools.
- 4 (S): Unsafe SMP processors: SMP with CPUs not designed for SMP.
- 8 (R): A module was forcibly unloaded from the system by rmmod -f.
- 16 (M): A hardware machine check error occurred on the system.
- 32 (B): A bad page was discovered on the system.
- 64 (U): The user has asked that the system be marked "tainted". This
- could be because they are running software that directly modifies
- the hardware, or for other reasons.
- 128 (D): The system has died.
- 256 (A): The ACPI DSDT has been overridden with one supplied by the user
- instead of using the one provided by the hardware.
- 512 (W): A kernel warning has occurred.
- 1024 (C): A module from drivers/staging was loaded.
- 2048 (I): The system is working around a severe firmware bug.
- 4096 (O): An out-of-tree module has been loaded.
- 8192 (E): An unsigned module has been loaded in a kernel supporting module
- signature.
- 16384 (L): A soft lockup has previously occurred on the system.
- 32768 (K): The kernel has been live patched.
- 65536 (X): Auxiliary taint, defined and used by for distros.
-131072 (T): The kernel was built with the struct randomization plugin.
+ 1 (P): proprietary module was loaded
+ 2 (F): module was force loaded
+ 4 (S): SMP kernel oops on an officially SMP incapable processor
+ 8 (R): module was force unloaded
+ 16 (M): processor reported a Machine Check Exception (MCE)
+ 32 (B): bad page referenced or some unexpected page flags
+ 64 (U): taint requested by userspace application
+ 128 (D): kernel died recently, i.e. there was an OOPS or BUG
+ 256 (A): an ACPI table was overridden by user
+ 512 (W): kernel issued warning
+ 1024 (C): staging driver was loaded
+ 2048 (I): workaround for bug in platform firmware applied
+ 4096 (O): externally-built ("out-of-tree") module was loaded
+ 8192 (E): unsigned module was loaded
+ 16384 (L): soft lockup occurred
+ 32768 (K): kernel has been live patched
+ 65536 (X): Auxiliary taint, defined for and used by distros
+131072 (T): The kernel was built with the struct randomization plugin
+
+See Documentation/admin-guide/tainted-kernels.rst for more information.

==============================================================