LinuxLists.cc - [PATCH] twist: allow converting pr_devel()/pr_debug() into printk(KERN

[permalink] [raw]

Subject: Re: [PATCH] twist: allow converting pr_devel()/pr_debug() into printk(KERN_DEBUG)

On 2020/05/25 4:18, Ondrej Mosnacek wrote:
> I'm also not sure if this is really worth it... It would help localize
> the bug in this specific case, but there is nothing systematic about
> it. Are there that many debug print statements that dereference
> pointers that are later passed to functions, but not dereferenced
> otherwise? Maybe yes, but it seems to be quite an optimistic
> assumption... I don't consider it such a big problem that a bug in
> function X only manifests itself deeper in the callchain. There will
> always be such bugs, no matter how many moles you whack.

There are about 1400 pr_debug() callers. About 1000 pr_debug() callers seem
to pass plain '%p' (which is now likely useless for debugging purpose due to
default ptr_to_id() conversion inside pointer()), and about 400 pr_debug()
callers seem to pass '%p[a-zA-Z]' (which does some kind of dereference inside
pointer()). Thus, we might find some bugs by evaluating '%p[a-zA-Z]'.

On Sun, May 24, 2020 at 7:38 PM Joe Perches <[email protected]> wrote:
> While I think this is rather unnecessary,
> what about dev_dbg/netdev_dbg/netif_dbg et al ?

Maybe a good idea, for there are about 24000 *dev_dbg() callers, and
479 callers pass '%p[a-zA-Z]'. But we can defer to another patch, in
case this patch finds crashes before fuzz testing process starts.

2020-05-25 06:14:20

by Joe Perches

[permalink] [raw]

Subject: Re: [PATCH] twist: allow converting pr_devel()/pr_debug() into printk(KERN_DEBUG)

On Mon, 2020-05-25 at 14:03 +0900, Tetsuo Handa wrote:
> On 2020/05/25 4:18, Ondrej Mosnacek wrote:
> > I'm also not sure if this is really worth it... It would help localize
> > the bug in this specific case, but there is nothing systematic about
> > it. Are there that many debug print statements that dereference
> > pointers that are later passed to functions, but not dereferenced
> > otherwise? Maybe yes, but it seems to be quite an optimistic
> > assumption... I don't consider it such a big problem that a bug in
> > function X only manifests itself deeper in the callchain. There will
> > always be such bugs, no matter how many moles you whack.
>
> There are about 1400 pr_debug() callers. About 1000 pr_debug() callers seem
> to pass plain '%p' (which is now likely useless for debugging purpose due to
> default ptr_to_id() conversion inside pointer()), and about 400 pr_debug()
> callers seem to pass '%p[a-zA-Z]' (which does some kind of dereference inside
> pointer()). Thus, we might find some bugs by evaluating '%p[a-zA-Z]'.
>
>
>
> On Sun, May 24, 2020 at 7:38 PM Joe Perches <[email protected]> wrote:
> > While I think this is rather unnecessary,
> > what about dev_dbg/netdev_dbg/netif_dbg et al ?
>
> Maybe a good idea, for there are about 24000 *dev_dbg() callers, and
> 479 callers pass '%p[a-zA-Z]'. But we can defer to another patch, in
> case this patch finds crashes before fuzz testing process starts.

There are a bunch more than that.
Some use other macros, some are functions.

$ grep-2.5.4 --include=*.[ch] -n -rP '\w+_dbg\s*\((?:[^,"]+,){0,3}\s*"[^"]+%p\w+\b[^"]*"' * | \
perl -e 'local $/; while (<>) { s/\n\s+/ /g; print; }' | \
grep -o -P '\w+_dbg' | \
sort | uniq -c | sort -rn
415 dev_dbg
116 netdev_dbg
100 batadv_dbg
80 ath10k_dbg
53 mwifiex_dbg
49 ath11k_dbg
29 brcmf_dbg
28 ath_dbg
26 ht_dbg
20 ath6kl_dbg
17 wcn36xx_dbg
15 netif_dbg
15 cifs_dbg
14 tdls_dbg
13 ibss_dbg
11 mpl_dbg
10 memblock_dbg
10 bt_dev_dbg
9 ps_dbg
8 wiphy_dbg
8 mps_dbg
8 mlme_dbg
8 mhwmp_dbg
8 ipoib_dbg
7 sta_dbg
7 slave_dbg
7 pci_dbg
7 ibdev_dbg
6 mpath_dbg
6 en_dbg
6 drm_dbg
5 usnic_dbg
5 mlx5_core_dbg
4 vin_dbg
4 msync_dbg
3 rsi_dbg
3 cal_dbg
2 v4l2_dbg
2 siw_dbg
2 sdata_dbg
2 ocb_dbg
2 musb_dbg
2 hw_dbg
2 eeh_edev_dbg
2 cifs_server_dbg
2 at76_dbg
1 rt2x00_eeprom_dbg
1 pnp_dbg
1 mthca_dbg
1 mlx5_ib_dbg
1 mlx4_dbg
1 isp_dbg
1 gfs2_print_dbg
1 erofs_dbg
1 dynamic_drbd_dbg
1 ctx_dbg
1 cs89_dbg

2020-05-25 08:45:58

[permalink] [raw]

Subject: Re: [PATCH] twist: allow converting pr_devel()/pr_debug() into printk(KERN_DEBUG)

On Sun 2020-05-24 23:50:34, Tetsuo Handa wrote:
> syzbot found a NULL pointer dereference bug inside mptcp_recvmsg() due to
> ssock == NULL, but this bug manifested inside selinux_socket_recvmsg()
> because pr_debug() was no-op [1].
>
> pr_debug("fallback-read subflow=%p",
> mptcp_subflow_ctx(ssock->sk));
> copied = sock_recvmsg(ssock, msg, flags);

The NULL pointer deference was found even without this patch.
This patch would just cause that it will manifest itself on another
place. What is the benefit, please?

> Since console loglevel used by syzkaller will not print KERN_DEBUG
> messages to consoles, always evaluating pr_devel()/pr_debug() messages
> will not cause too much console output. Thus, let's allow fuzzers to
> always evaluate pr_devel()/pr_debug() messages.

I see few drawbacks with this patch:

1. It will cause adding much more messages into the logbuffer even
though they are not flushed to the console. It might cause that
more important messages will get overridden before they reach
console. They might also make hard to read the full log.

2. Crash inside printk() causes recursive messages. They are currently
printed into the printk_safe() buffers and there is a bigger risk
that they will not reach the console.

3. pr_debug() messages are not printed by default. It is possible that
nobody used them for ages. You might get many errors in less
maintained code instead in the really used one. I mean that you
will get more noise with less gain.

Have you tested this patch by the syzcaller with many runs, please?
Did it helped to actually discover more bugs?
Did it really made things easier?

I am not able to judge usefulness without more data. My intuition
tells me that we should keep the number of syzcaller-related twists
as small as possible. Otherwise, syscaller will diverge more and
more from reality.

Best Regards,
Petr

2020-05-25 09:14:17

by Sergey Senozhatsky

[permalink] [raw]

Subject: Re: [PATCH] twist: allow converting pr_devel()/pr_debug() into printk(KERN_DEBUG)

On (20/05/25 10:42), Petr Mladek wrote:
> On Sun 2020-05-24 23:50:34, Tetsuo Handa wrote:
> > syzbot found a NULL pointer dereference bug inside mptcp_recvmsg() due to
> > ssock == NULL, but this bug manifested inside selinux_socket_recvmsg()
> > because pr_debug() was no-op [1].
> >
> > pr_debug("fallback-read subflow=%p",
> > mptcp_subflow_ctx(ssock->sk));
> > copied = sock_recvmsg(ssock, msg, flags);
>
> The NULL pointer deference was found even without this patch.
> This patch would just cause that it will manifest itself on another
> place. What is the benefit, please?

Right, I don't get this patch. A NULL-deref is still a NULL pointer deref.
pr_debug() will fault reading one byte from the address and print something
like "fallback-read subflow=(efault)" to printk-safe buffer, but then
sock_recvmsg() is still going to do its thing.

-ss

2020-05-25 10:46:59

[permalink] [raw]

Subject: Re: [PATCH] twist: allow converting pr_devel()/pr_debug() into printk(KERN_DEBUG)

On 2020/05/25 17:42, Petr Mladek wrote:
> I see few drawbacks with this patch:
>
> 1. It will cause adding much more messages into the logbuffer even
> though they are not flushed to the console. It might cause that
> more important messages will get overridden before they reach
> console. They might also make hard to read the full log.

Since the user of this twist option will select console loglevel in a way
KERN_DEBUG messages are not printed to consoles, KERN_DEBUG messages will
be immediately processed (and space for future messages will be reclaimed).
Therefore, I don't think that more important messages will get overridden.

This twist option might increase possibility of mixing KERN_DEBUG messages
and non-KERN_DEBUG messages due to KERN_CONT case.

But if these concerns turn out to be a real problem, we can redirect
pr_devel()/pr_debug() to simple snprintf() which evaluates arguments
but discards the result without storing into the logbuffer.

>
> 2. Crash inside printk() causes recursive messages. They are currently
> printed into the printk_safe() buffers and there is a bigger risk
> that they will not reach the console.

Currently "static char textbuf[LOG_LINE_MAX];" is "static" because it is used
under logbuf_lock. If we remove "static", we can use "char textbuf[LOG_LINE_MAX];"
without logbuf_lock. Then, we can bring potentially dangerous-and-slow vscnprintf()
in vprintk_store() to earlier stage (and vprintk_store() will need to do simple
copy) so that oops in printk() will happen before entering printk-safe context.
I think that this change follows a direction which lockless logbuf will want.

>
> 3. pr_debug() messages are not printed by default. It is possible that
> nobody used them for ages. You might get many errors in less
> maintained code instead in the really used one. I mean that you
> will get more noise with less gain.

Given that potentially dangerous-and-slow vscnprintf() is done outside of
printk-safe context, we can get more test coverage without difficult things.

>
>
> Have you tested this patch by the syzcaller with many runs, please?
> Did it helped to actually discover more bugs?
> Did it really made things easier?

syzbot can't test with custom patches. The only way to test this patch is
to send to e.g. linux-next.git which syzbot is testing.

>
> I am not able to judge usefulness without more data. My intuition
> tells me that we should keep the number of syzcaller-related twists
> as small as possible. Otherwise, syscaller will diverge more and
> more from reality.

The twist options are not specific to syzkaller. Anyone can selectively
enable the twist options.

On 2020/05/25 18:11, Sergey Senozhatsky wrote:
> On (20/05/25 10:42), Petr Mladek wrote:
>> On Sun 2020-05-24 23:50:34, Tetsuo Handa wrote:
>>> syzbot found a NULL pointer dereference bug inside mptcp_recvmsg() due to
>>> ssock == NULL, but this bug manifested inside selinux_socket_recvmsg()
>>> because pr_debug() was no-op [1].
>>>
>>> pr_debug("fallback-read subflow=%p",
>>> mptcp_subflow_ctx(ssock->sk));
>>> copied = sock_recvmsg(ssock, msg, flags);
>>
>> The NULL pointer deference was found even without this patch.
>> This patch would just cause that it will manifest itself on another
>> place. What is the benefit, please?

It would help localizing the bug in this specific case.

It's not only about %p, even %d can crash kernel or leak sensitive
info (if it happens after-free/out-of-bounds/uninit). Overall it
increases code coverage and allows to catch more bugs earlier.

>
> Right, I don't get this patch. A NULL-deref is still a NULL pointer deref.
> pr_debug() will fault reading one byte from the address and print something
> like "fallback-read subflow=(efault)" to printk-safe buffer, but then
> sock_recvmsg() is still going to do its thing.

Since this NULL pointer dereference already happens before calling pr_debug(),
we won't store "fallback-read subflow=(efault)" to printk-safe buffer.

Just evaluating pr_devel()/pr_debug() arguments would help finding some bugs.
Again, we can change this twist option to redirect pr_devel()/pr_debug() to
simple snprintf() which evaluates arguments but discards the result without
storing into the logbuffer.

2020-05-25 16:41:33

by Dmitry Vyukov

[permalink] [raw]

Subject: Re: [PATCH] twist: allow converting pr_devel()/pr_debug() into printk(KERN_DEBUG)

On Mon, May 25, 2020 at 8:07 AM Joe Perches <[email protected]> wrote:
>
> On Mon, 2020-05-25 at 14:03 +0900, Tetsuo Handa wrote:
> > On 2020/05/25 4:18, Ondrej Mosnacek wrote:
> > > I'm also not sure if this is really worth it... It would help localize
> > > the bug in this specific case, but there is nothing systematic about
> > > it. Are there that many debug print statements that dereference
> > > pointers that are later passed to functions, but not dereferenced
> > > otherwise? Maybe yes, but it seems to be quite an optimistic
> > > assumption... I don't consider it such a big problem that a bug in
> > > function X only manifests itself deeper in the callchain. There will
> > > always be such bugs, no matter how many moles you whack.
> >
> > There are about 1400 pr_debug() callers. About 1000 pr_debug() callers seem
> > to pass plain '%p' (which is now likely useless for debugging purpose due to
> > default ptr_to_id() conversion inside pointer()), and about 400 pr_debug()
> > callers seem to pass '%p[a-zA-Z]' (which does some kind of dereference inside
> > pointer()). Thus, we might find some bugs by evaluating '%p[a-zA-Z]'.
> >
> >
> >
> > On Sun, May 24, 2020 at 7:38 PM Joe Perches <[email protected]> wrote:
> > > While I think this is rather unnecessary,
> > > what about dev_dbg/netdev_dbg/netif_dbg et al ?
> >
> > Maybe a good idea, for there are about 24000 *dev_dbg() callers, and
> > 479 callers pass '%p[a-zA-Z]'. But we can defer to another patch, in
> > case this patch finds crashes before fuzz testing process starts.
>
> There are a bunch more than that.
> Some use other macros, some are functions.

I think this is a good idea overall and I don't mind enabling it on syzbot.
It's not only about %p, even %d can crash kernel or leak sensitive
info (if it happens after-free/out-of-bounds/uninit). Overall it
increases code coverage and allows to catch more bugs earlier. That
was the reason for enabling dynamic debug, but I wasn't aware that
debug level is not included.

2020-05-27 11:20:34

[permalink] [raw]

Subject: Re: [PATCH] twist: allow converting pr_devel()/pr_debug() into printk(KERN_DEBUG)

On Mon 2020-05-25 19:43:04, Tetsuo Handa wrote:
> On 2020/05/25 17:42, Petr Mladek wrote:
> > I see few drawbacks with this patch:
> >
> > 1. It will cause adding much more messages into the logbuffer even
> > though they are not flushed to the console. It might cause that
> > more important messages will get overridden before they reach
> > console. They might also make hard to read the full log.
>
> Since the user of this twist option will select console loglevel in a way
> KERN_DEBUG messages are not printed to consoles, KERN_DEBUG messages will
> be immediately processed (and space for future messages will be reclaimed).
> Therefore, I don't think that more important messages will get overridden.

This is not fully true. More important messages will still be printed
to the console. The debug messages will not be skipped before the
older messages are proceed.

I mean that many debug messages might cause losing more important ones
before the old important messages are proceed.

> This twist option might increase possibility of mixing KERN_DEBUG messages
> and non-KERN_DEBUG messages due to KERN_CONT case.
>
> But if these concerns turn out to be a real problem, we can redirect
> pr_devel()/pr_debug() to simple snprintf() which evaluates arguments
> but discards the result without storing into the logbuffer.
>
> >
> > 2. Crash inside printk() causes recursive messages. They are currently
> > printed into the printk_safe() buffers and there is a bigger risk
> > that they will not reach the console.
>
> Currently "static char textbuf[LOG_LINE_MAX];" is "static" because it is used
> under logbuf_lock. If we remove "static", we can use "char textbuf[LOG_LINE_MAX];"
> without logbuf_lock. Then, we can bring potentially dangerous-and-slow vscnprintf()
> in vprintk_store() to earlier stage (and vprintk_store() will need to do simple
> copy) so that oops in printk() will happen before entering printk-safe context.
> I think that this change follows a direction which lockless logbuf will want.

No, LOG_LINE_MAX is too big to be allocated on stack.

Well, it would be possible to call vsprintf() with NULL buffer. It is
normally used to calculate the length of the message before it is
printed. But it also does all the accesses without printing anything.

> > Have you tested this patch by the syzcaller with many runs, please?
> > Did it helped to actually discover more bugs?
> > Did it really made things easier?
>
> syzbot can't test with custom patches. The only way to test this patch is
> to send to e.g. linux-next.git which syzbot is testing.

OK, we could try this via some test branch that will go into
linux-next but it would not be scheduled for the next merge window.

For the testing, this patch might be good enough.

For eventual upstreaming, I would prefer to handle this in
lib/dynamic_debug.c by enabling all entries by default. This
would solve all DYNAMIC_DEBUG_BRANCH() users at one place.

Anyway, I would like to see a proof that it really helped to find
some bugs an easier way before upstreaming.

Best Regards,
Petr

2020-05-27 15:32:42

by kernel test robot

[permalink] [raw]

Subject: Re: [PATCH] twist: allow converting pr_devel()/pr_debug() into printk(KERN_DEBUG)

Hi Tetsuo,

I love your patch! Perhaps something to improve:

[auto build test WARNING on next-20200519]
[cannot apply to linus/master linux/master pmladek/for-next v5.7-rc6 v5.7-rc5 v5.7-rc4 v5.7-rc7]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url: https://github.com/0day-ci/linux/commits/Tetsuo-Handa/twist-allow-converting-pr_devel-pr_debug-into-printk-KERN_DEBUG/20200524-225318
base: fb57b1fabcb28f358901b2df90abd2b48abc1ca8
config: powerpc-allmodconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <[email protected]>

All warnings (new ones prefixed by >>, old ones prefixed by <<):

In file included from include/linux/device.h:15,
from include/linux/acpi.h:15,
from include/linux/i2c.h:13,
from drivers/iio/adc/ina2xx-adc.c:24:
drivers/iio/adc/ina2xx-adc.c: In function 'ina2xx_buffer_enable':
include/linux/dev_printk.h:115:2: error: implicit declaration of function 'dynamic_dev_dbg' [-Werror=implicit-function-declaration]
115 | dynamic_dev_dbg(dev, dev_fmt(fmt), ##__VA_ARGS__)
| ^~~~~~~~~~~~~~~
>> drivers/iio/adc/ina2xx-adc.c:834:2: note: in expansion of macro 'dev_dbg'
834 | dev_dbg(&indio_dev->dev, "Enabling buffer w/ scan_mask %02x, freq = %d, avg =%un",
| ^~~~~~~
cc1: some warnings being treated as errors

vim +/dev_dbg +834 drivers/iio/adc/ina2xx-adc.c

c43a102e67db99c Marc Titinger 2015-12-07 827
c43a102e67db99c Marc Titinger 2015-12-07 828 static int ina2xx_buffer_enable(struct iio_dev *indio_dev)
c43a102e67db99c Marc Titinger 2015-12-07 829 {
c43a102e67db99c Marc Titinger 2015-12-07 830 struct ina2xx_chip_info *chip = iio_priv(indio_dev);
c43a102e67db99c Marc Titinger 2015-12-07 831 unsigned int sampling_us = SAMPLING_PERIOD(chip);
7d6cd21d82bacab Akinobu Mita 2018-06-25 832 struct task_struct *task;
c43a102e67db99c Marc Titinger 2015-12-07 833
1961bce76452938 Andrew F. Davis 2016-02-24 @834 dev_dbg(&indio_dev->dev, "Enabling buffer w/ scan_mask %02x, freq = %d, avg =%u\n",
c43a102e67db99c Marc Titinger 2015-12-07 835 (unsigned int)(*indio_dev->active_scan_mask),
c43a102e67db99c Marc Titinger 2015-12-07 836 1000000 / sampling_us, chip->avg);
c43a102e67db99c Marc Titinger 2015-12-07 837
1961bce76452938 Andrew F. Davis 2016-02-24 838 dev_dbg(&indio_dev->dev, "Expected work period: %u us\n", sampling_us);
1961bce76452938 Andrew F. Davis 2016-02-24 839 dev_dbg(&indio_dev->dev, "Async readout mode: %d\n",
1961bce76452938 Andrew F. Davis 2016-02-24 840 chip->allow_async_readout);
c43a102e67db99c Marc Titinger 2015-12-07 841
7d6cd21d82bacab Akinobu Mita 2018-06-25 842 task = kthread_create(ina2xx_capture_thread, (void *)indio_dev,
46294cd948d530d Marc Titinger 2015-12-11 843 "%s:%d-%uus", indio_dev->name, indio_dev->id,
46294cd948d530d Marc Titinger 2015-12-11 844 sampling_us);
7d6cd21d82bacab Akinobu Mita 2018-06-25 845 if (IS_ERR(task))
7d6cd21d82bacab Akinobu Mita 2018-06-25 846 return PTR_ERR(task);
7d6cd21d82bacab Akinobu Mita 2018-06-25 847
7d6cd21d82bacab Akinobu Mita 2018-06-25 848 get_task_struct(task);
7d6cd21d82bacab Akinobu Mita 2018-06-25 849 wake_up_process(task);
7d6cd21d82bacab Akinobu Mita 2018-06-25 850 chip->task = task;
c43a102e67db99c Marc Titinger 2015-12-07 851
7d6cd21d82bacab Akinobu Mita 2018-06-25 852 return 0;
c43a102e67db99c Marc Titinger 2015-12-07 853 }
c43a102e67db99c Marc Titinger 2015-12-07 854

:::::: The code at line 834 was first introduced by commit
:::::: 1961bce76452938eed8f797b7d96ab5f3c611525 iio: ina2xx: Remove trace_printk debug statments

:::::: TO: Andrew F. Davis <[email protected]>
:::::: CC: Jonathan Cameron <[email protected]>

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]

Attachments:

(No filename) (4.41 kB)
.config.gz (67.18 kB)
Download all attachments

2020-05-27 15:34:42

[permalink] [raw]

Subject: Re: [PATCH] twist: allow converting pr_devel()/pr_debug() into printk(KERN_DEBUG)

On 2020/05/27 17:37, Petr Mladek wrote:
> On Mon 2020-05-25 19:43:04, Tetsuo Handa wrote:
>> On 2020/05/25 17:42, Petr Mladek wrote:
>>> I see few drawbacks with this patch:
>>>
>>> 1. It will cause adding much more messages into the logbuffer even
>>> though they are not flushed to the console. It might cause that
>>> more important messages will get overridden before they reach
>>> console. They might also make hard to read the full log.
>>
>> Since the user of this twist option will select console loglevel in a way
>> KERN_DEBUG messages are not printed to consoles, KERN_DEBUG messages will
>> be immediately processed (and space for future messages will be reclaimed).
>> Therefore, I don't think that more important messages will get overridden.
>
> This is not fully true. More important messages will still be printed
> to the console. The debug messages will not be skipped before the
> older messages are proceed.
>
> I mean that many debug messages might cause losing more important ones
> before the old important messages are proceed.

Then, this reasoning will be also applicable to

[PATCH] printk: Add loglevel for "do not print to consoles".

in a sense that "don't try to quickly queue a lot of messages" rule. This concern
cannot be solved even if asynchronous printk() and per console loglevel are
supported someday, and oom_dump_tasks() is not allowed to count on these for
solving the stall problem caused by reporting all OOM victim candidates at once.

>
>
>> This twist option might increase possibility of mixing KERN_DEBUG messages
>> and non-KERN_DEBUG messages due to KERN_CONT case.
>>
>> But if these concerns turn out to be a real problem, we can redirect
>> pr_devel()/pr_debug() to simple snprintf() which evaluates arguments
>> but discards the result without storing into the logbuffer.
>>
>>>
>>> 2. Crash inside printk() causes recursive messages. They are currently
>>> printed into the printk_safe() buffers and there is a bigger risk
>>> that they will not reach the console.
>>
>> Currently "static char textbuf[LOG_LINE_MAX];" is "static" because it is used
>> under logbuf_lock. If we remove "static", we can use "char textbuf[LOG_LINE_MAX];"
>> without logbuf_lock. Then, we can bring potentially dangerous-and-slow vscnprintf()
>> in vprintk_store() to earlier stage (and vprintk_store() will need to do simple
>> copy) so that oops in printk() will happen before entering printk-safe context.
>> I think that this change follows a direction which lockless logbuf will want.
>
> No, LOG_LINE_MAX is too big to be allocated on stack.

We could assign per task_struct buffers and per CPU interrupt context buffers
(like we discussed about how to handle KERN_CONT problem). But managing these
off stack buffers is out of scope for this patch.

>
> Well, it would be possible to call vsprintf() with NULL buffer. It is
> normally used to calculate the length of the message before it is
> printed. But it also does all the accesses without printing anything.

OK. I think that redirecting pr_debug() to vsnprintf(NULL, 0) will be
better than modifying dynamic_debug path, for

>
>
>>> Have you tested this patch by the syzcaller with many runs, please?
>>> Did it helped to actually discover more bugs?
>>> Did it really made things easier?
>>
>> syzbot can't test with custom patches. The only way to test this patch is
>> to send to e.g. linux-next.git which syzbot is testing.
>
> OK, we could try this via some test branch that will go into
> linux-next but it would not be scheduled for the next merge window.
>
> For the testing, this patch might be good enough.
>
> For eventual upstreaming, I would prefer to handle this in
> lib/dynamic_debug.c by enabling all entries by default. This
> would solve all DYNAMIC_DEBUG_BRANCH() users at one place.

since "enabling all entries by default" will redirect pr_debug() calls to
printk(KERN_DEBUG), the "don't try to quickly queue too much messages" above
remains (i.e. it is essentially same with what this patch is doing).

>
> Anyway, I would like to see a proof that it really helped to find
> some bugs an easier way before upstreaming.
>
> Best Regards,
> Petr
>

2020-05-27 15:52:14

by kernel test robot

[permalink] [raw]

Subject: Re: [PATCH] twist: allow converting pr_devel()/pr_debug() into printk(KERN_DEBUG)

Hi Tetsuo,

I love your patch! Perhaps something to improve:

[auto build test WARNING on next-20200519]
[cannot apply to linus/master linux/master pmladek/for-next v5.7-rc6 v5.7-rc5 v5.7-rc4 v5.7-rc7]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url: https://github.com/0day-ci/linux/commits/Tetsuo-Handa/twist-allow-converting-pr_devel-pr_debug-into-printk-KERN_DEBUG/20200524-225318
base: fb57b1fabcb28f358901b2df90abd2b48abc1ca8
config: riscv-allyesconfig (attached as .config)
compiler: riscv64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=riscv

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <[email protected]>

All warnings (new ones prefixed by >>, old ones prefixed by <<):

In file included from include/linux/device.h:15,
from include/linux/dmaengine.h:8,
from drivers/i2c/busses/i2c-tegra.c:12:
drivers/i2c/busses/i2c-tegra.c: In function 'tegra_i2c_dma_submit':
include/linux/dev_printk.h:115:2: error: implicit declaration of function 'dynamic_dev_dbg' [-Werror=implicit-function-declaration]
115 | dynamic_dev_dbg(dev, dev_fmt(fmt), ##__VA_ARGS__)
| ^~~~~~~~~~~~~~~
>> drivers/i2c/busses/i2c-tegra.c:377:2: note: in expansion of macro 'dev_dbg'
377 | dev_dbg(i2c_dev->dev, "starting DMA for length: %zun", len);
| ^~~~~~~
cc1: some warnings being treated as errors

vim +/dev_dbg +377 drivers/i2c/busses/i2c-tegra.c

86c92b9965ff175 Sowjanya Komatineni 2019-02-12 370
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 371 static int tegra_i2c_dma_submit(struct tegra_i2c_dev *i2c_dev, size_t len)
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 372 {
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 373 struct dma_async_tx_descriptor *dma_desc;
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 374 enum dma_transfer_direction dir;
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 375 struct dma_chan *chan;
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 376
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 @377 dev_dbg(i2c_dev->dev, "starting DMA for length: %zu\n", len);
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 378 reinit_completion(&i2c_dev->dma_complete);
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 379 dir = i2c_dev->msg_read ? DMA_DEV_TO_MEM : DMA_MEM_TO_DEV;
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 380 chan = i2c_dev->msg_read ? i2c_dev->rx_dma_chan : i2c_dev->tx_dma_chan;
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 381 dma_desc = dmaengine_prep_slave_single(chan, i2c_dev->dma_phys,
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 382 len, dir, DMA_PREP_INTERRUPT |
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 383 DMA_CTRL_ACK);
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 384 if (!dma_desc) {
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 385 dev_err(i2c_dev->dev, "failed to get DMA descriptor\n");
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 386 return -EINVAL;
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 387 }
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 388
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 389 dma_desc->callback = tegra_i2c_dma_complete;
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 390 dma_desc->callback_param = i2c_dev;
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 391 dmaengine_submit(dma_desc);
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 392 dma_async_issue_pending(chan);
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 393 return 0;
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 394 }
86c92b9965ff175 Sowjanya Komatineni 2019-02-12 395

:::::: The code at line 377 was first introduced by commit
:::::: 86c92b9965ff1758952cd0d6c5f19eeeef291eea i2c: tegra: Add DMA support

:::::: TO: Sowjanya Komatineni <[email protected]>
:::::: CC: Wolfram Sang <[email protected]>

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]

Attachments:

(No filename) (4.37 kB)
.config.gz (62.78 kB)
Download all attachments

2020-05-27 17:55:38

[permalink] [raw]

Subject: Re: [PATCH] twist: allow converting pr_devel()/pr_debug() into printk(KERN_DEBUG)

From 3406a1853564618f167a5d0a815f174d2fb295f5 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <[email protected]>
Date: Wed, 27 May 2020 22:34:50 +0900
Subject: [PATCH] dev_printk: Explicitly include dynamic_debug.h

While printk.h included dynamic_debug.h before pr_debug(), dev_printk.h
did not include it before dynamic_dev_dbg(). Include it in dev_printk.h
in case future patch touches CONFIG_DYNAMIC_DEBUG case in printk.h .

Reported-by: kbuild test robot <[email protected]>
Signed-off-by: Tetsuo Handa <[email protected]>
---
include/linux/dev_printk.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/include/linux/dev_printk.h b/include/linux/dev_printk.h
index 5aad06b4ca7b..69f0b3708eaf 100644
--- a/include/linux/dev_printk.h
+++ b/include/linux/dev_printk.h
@@ -110,6 +110,7 @@ void _dev_info(const struct device *dev, const char *fmt, ...)
_dev_info(dev, dev_fmt(fmt), ##__VA_ARGS__)

#if defined(CONFIG_DYNAMIC_DEBUG)
+#include <linux/dynamic_debug.h>
#define dev_dbg(dev, fmt, ...) \
dynamic_dev_dbg(dev, dev_fmt(fmt), ##__VA_ARGS__)
#elif defined(DEBUG)
--
2.18.2

2020-05-27 19:03:09

[permalink] [raw]

Subject: Re: [PATCH] twist: allow converting pr_devel()/pr_debug() into printk(KERN_DEBUG)

On Wed 2020-05-27 19:13:38, Tetsuo Handa wrote:
> On 2020/05/27 17:37, Petr Mladek wrote:
> > On Mon 2020-05-25 19:43:04, Tetsuo Handa wrote:
> >> On 2020/05/25 17:42, Petr Mladek wrote:
> >>> I see few drawbacks with this patch:
> >>>
> >>> 1. It will cause adding much more messages into the logbuffer even
> >>> though they are not flushed to the console. It might cause that
> >>> more important messages will get overridden before they reach
> >>> console. They might also make hard to read the full log.
> >>
> >> Since the user of this twist option will select console loglevel in a way
> >> KERN_DEBUG messages are not printed to consoles, KERN_DEBUG messages will
> >> be immediately processed (and space for future messages will be reclaimed).
> >> Therefore, I don't think that more important messages will get overridden.
> >
> > This is not fully true. More important messages will still be printed
> > to the console. The debug messages will not be skipped before the
> > older messages are proceed.
> >
> > I mean that many debug messages might cause losing more important ones
> > before the old important messages are proceed.
>
> Then, this reasoning will be also applicable to
>
> [PATCH] printk: Add loglevel for "do not print to consoles".
>
> in a sense that "don't try to quickly queue a lot of messages" rule. This concern
> cannot be solved even if asynchronous printk() and per console loglevel are
> supported someday, and oom_dump_tasks() is not allowed to count on these for
> solving the stall problem caused by reporting all OOM victim candidates at once.

Good point. Even the "do_not_print_to_consoles" flag might cause
loosing other important messages on consoles. It is because all
consoles are handled in a single loop.

The asynchronous consoles would use one kthread per console. It would
allow to see all messages at least by fast consoles and userspace.

The OOM problem will also be solvable with a bigger log buffer.
It would all to keep the entire dump until it can be seen on
consoles or stored by userspace. The asynchronous consoles will
prevent blocking OOM killer which is the primary problem.

> >> This twist option might increase possibility of mixing KERN_DEBUG messages
> >> and non-KERN_DEBUG messages due to KERN_CONT case.
> >>
> >> But if these concerns turn out to be a real problem, we can redirect
> >> pr_devel()/pr_debug() to simple snprintf() which evaluates arguments
> >> but discards the result without storing into the logbuffer.
> >>
> >>>
> >>> 2. Crash inside printk() causes recursive messages. They are currently
> >>> printed into the printk_safe() buffers and there is a bigger risk
> >>> that they will not reach the console.
> >>
> >> Currently "static char textbuf[LOG_LINE_MAX];" is "static" because it is used
> >> under logbuf_lock. If we remove "static", we can use "char textbuf[LOG_LINE_MAX];"
> >> without logbuf_lock. Then, we can bring potentially dangerous-and-slow vscnprintf()
> >> in vprintk_store() to earlier stage (and vprintk_store() will need to do simple
> >> copy) so that oops in printk() will happen before entering printk-safe context.
> >> I think that this change follows a direction which lockless logbuf will want.
> >
> > No, LOG_LINE_MAX is too big to be allocated on stack.
>
> We could assign per task_struct buffers and per CPU interrupt context buffers
> (like we discussed about how to handle KERN_CONT problem). But managing these
> off stack buffers is out of scope for this patch.

This is much more complicated than the vsprintf(NULL) approach.

> > Well, it would be possible to call vsprintf() with NULL buffer. It is
> > normally used to calculate the length of the message before it is
> > printed. But it also does all the accesses without printing anything.
>
> OK. I think that redirecting pr_debug() to vsnprintf(NULL, 0) will be
> better than modifying dynamic_debug path, for

It might get more complicated if you would actually want to see
pr_debug() messages for a selected module in the fuzzer.

> >>> Have you tested this patch by the syzcaller with many runs, please?
> >>> Did it helped to actually discover more bugs?
> >>> Did it really made things easier?
> >>
> >> syzbot can't test with custom patches. The only way to test this patch is
> >> to send to e.g. linux-next.git which syzbot is testing.
> >
> > OK, we could try this via some test branch that will go into
> > linux-next but it would not be scheduled for the next merge window.
> >
> > For the testing, this patch might be good enough.
> >
> > For eventual upstreaming, I would prefer to handle this in
> > lib/dynamic_debug.c by enabling all entries by default. This
> > would solve all DYNAMIC_DEBUG_BRANCH() users at one place.
>
> since "enabling all entries by default" will redirect pr_debug() calls to
> printk(KERN_DEBUG), the "don't try to quickly queue too much messages" above
> remains (i.e. it is essentially same with what this patch is doing).

vsprintf(NULL, ) can be called for pr_debug() messages in
vprintk_store(). It will be again only a single place for
all printk() wrappers.

Best Regards,
Petr

2020-05-27 23:39:08

[permalink] [raw]

Subject: Re: [PATCH] twist: allow converting pr_devel()/pr_debug() into printk(KERN_DEBUG)

On 2020/05/28 0:55, Petr Mladek wrote:
>>> Well, it would be possible to call vsprintf() with NULL buffer. It is
>>> normally used to calculate the length of the message before it is
>>> printed. But it also does all the accesses without printing anything.
>>
>> OK. I think that redirecting pr_debug() to vsnprintf(NULL, 0) will be
>> better than modifying dynamic_debug path, for
>
> It might get more complicated if you would actually want to see
> pr_debug() messages for a selected module in the fuzzer.

I don't expect that automated testing can afford selectively enabling
DYNAMIC_DEBUG_BRANCH(id) conditions. But we could evaluate all arguments
by calling snprintf(NULL, 0) if the condition to call printk() is false.

> vsprintf(NULL, ) can be called for pr_debug() messages in
> vprintk_store(). It will be again only a single place for
> all printk() wrappers.

I couldn't catch what you mean. The problem of pr_debug() is that
vprintk_store() might not be called because of

#define no_printk(fmt, ...) \
({ \
if (0) \
printk(fmt, ##__VA_ARGS__); \
0; \
})

#define pr_debug(fmt, ...) \
no_printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__)

or

#define __dynamic_func_call(id, fmt, func, ...) do { \
DEFINE_DYNAMIC_DEBUG_METADATA(id, fmt); \
if (DYNAMIC_DEBUG_BRANCH(id)) \
func(&id, ##__VA_ARGS__); \
} while (0)

#define _dynamic_func_call(fmt, func, ...) \
__dynamic_func_call(__UNIQUE_ID(ddebug), fmt, func, ##__VA_ARGS__)

#define dynamic_pr_debug(fmt, ...) \
_dynamic_func_call(fmt, __dynamic_pr_debug, \
pr_fmt(fmt), ##__VA_ARGS__)

#define pr_debug(fmt, ...) \
dynamic_pr_debug(fmt, ##__VA_ARGS__)

. Maybe we can append

else if (IS_BUILTIN(CONFIG_TWIST_ALWAYS_EVALUATE_DEBUG_PRINTK)) \
snprintf(NULL, 0, fmt, ##__VA_ARGS__); \

to no_printk() and __dynamic_func_call().

2020-05-28 06:59:01

[permalink] [raw]

Subject: [PATCH v2] twist: allow converting pr_devel()/pr_debug() into snprintf()

syzbot found a NULL pointer dereference bug inside mptcp_recvmsg() due to
ssock == NULL, but this bug manifested inside selinux_socket_recvmsg()
because pr_debug() was no-op [1].

pr_debug("fallback-read subflow=%p",
mptcp_subflow_ctx(ssock->sk));
copied = sock_recvmsg(ssock, msg, flags);

Thus, let's allow fuzzers to always evaluate pr_devel()/pr_debug()
messages, by redirecting no-op pr_devel()/pr_debug() calls to snprintf().

[1] https://syzkaller.appspot.com/bug?id=12be9aa373be9d8727cdd172f190de39528a413a

Signed-off-by: Tetsuo Handa <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ondrej Mosnacek <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
---
include/linux/dev_printk.h | 16 ++++++++++++++++
include/linux/dynamic_debug.h | 14 ++++++++++++--
include/linux/printk.h | 10 ++++++++++
lib/Kconfig.twist | 12 ++++++++++++
4 files changed, 50 insertions(+), 2 deletions(-)

diff --git a/include/linux/dev_printk.h b/include/linux/dev_printk.h
index 3028b644b4fb..ed5d5bb3b5b6 100644
--- a/include/linux/dev_printk.h
+++ b/include/linux/dev_printk.h
@@ -121,6 +121,8 @@ void _dev_info(const struct device *dev, const char *fmt, ...)
({ \
if (0) \
dev_printk(KERN_DEBUG, dev, dev_fmt(fmt), ##__VA_ARGS__); \
+ else if (IS_BUILTIN(CONFIG_TWIST_ALWAYS_EVALUATE_PRINTK_ARGUMENTS)) \
+ snprintf(NULL, 0, dev_fmt(fmt), ##__VA_ARGS__); \
})
#endif

@@ -133,12 +135,16 @@ do { \
__print_once = true; \
dev_level(dev, fmt, ##__VA_ARGS__); \
} \
+ else if (IS_BUILTIN(CONFIG_TWIST_ALWAYS_EVALUATE_PRINTK_ARGUMENTS)) \
+ snprintf(NULL, 0, fmt, ##__VA_ARGS__); \
} while (0)
#else
#define dev_level_once(dev_level, dev, fmt, ...) \
do { \
if (0) \
dev_level(dev, fmt, ##__VA_ARGS__); \
+ else if (IS_BUILTIN(CONFIG_TWIST_ALWAYS_EVALUATE_PRINTK_ARGUMENTS)) \
+ snprintf(NULL, 0, fmt, ##__VA_ARGS__); \
} while (0)
#endif

@@ -166,6 +172,8 @@ do { \
DEFAULT_RATELIMIT_BURST); \
if (__ratelimit(&_rs)) \
dev_level(dev, fmt, ##__VA_ARGS__); \
+ else if (IS_BUILTIN(CONFIG_TWIST_ALWAYS_EVALUATE_PRINTK_ARGUMENTS)) \
+ snprintf(NULL, 0, fmt, ##__VA_ARGS__); \
} while (0)

#define dev_emerg_ratelimited(dev, fmt, ...) \
@@ -195,6 +203,8 @@ do { \
__ratelimit(&_rs)) \
__dynamic_dev_dbg(&descriptor, dev, dev_fmt(fmt), \
##__VA_ARGS__); \
+ else if (IS_BUILTIN(CONFIG_TWIST_ALWAYS_EVALUATE_PRINTK_ARGUMENTS)) \
+ snprintf(NULL, 0, dev_fmt(fmt), ##__VA_ARGS__); \
} while (0)
#elif defined(DEBUG)
#define dev_dbg_ratelimited(dev, fmt, ...) \
@@ -204,12 +214,16 @@ do { \
DEFAULT_RATELIMIT_BURST); \
if (__ratelimit(&_rs)) \
dev_printk(KERN_DEBUG, dev, dev_fmt(fmt), ##__VA_ARGS__); \
+ else if (IS_BUILTIN(CONFIG_TWIST_ALWAYS_EVALUATE_PRINTK_ARGUMENTS)) \
+ snprintf(NULL, 0, dev_fmt(fmt), ##__VA_ARGS__); \
} while (0)
#else
#define dev_dbg_ratelimited(dev, fmt, ...) \
do { \
if (0) \
dev_printk(KERN_DEBUG, dev, dev_fmt(fmt), ##__VA_ARGS__); \
+ else if (IS_BUILTIN(CONFIG_TWIST_ALWAYS_EVALUATE_PRINTK_ARGUMENTS)) \
+ snprintf(NULL, 0, dev_fmt(fmt), ##__VA_ARGS__); \
} while (0)
#endif

@@ -220,6 +234,8 @@ do { \
({ \
if (0) \
dev_printk(KERN_DEBUG, dev, dev_fmt(fmt), ##__VA_ARGS__); \
+ else if (IS_BUILTIN(CONFIG_TWIST_ALWAYS_EVALUATE_PRINTK_ARGUMENTS)) \
+ snprintf(NULL, 0, dev_fmt(fmt), ##__VA_ARGS__); \
})
#endif

diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index abcd5fde30eb..ede42ebb38e5 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -201,9 +201,19 @@ static inline int ddebug_dyndbg_module_param_cb(char *param, char *val,
}

#define dynamic_pr_debug(fmt, ...) \
- do { if (0) printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__); } while (0)
+ do { \
+ if (0) \
+ printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__); \
+ else if (IS_BUILTIN(CONFIG_TWIST_ALWAYS_EVALUATE_PRINTK_ARGUMENTS)) \
+ snprintf(NULL, 0, pr_fmt(fmt), ##__VA_ARGS__); \
+ } while (0)
#define dynamic_dev_dbg(dev, fmt, ...) \
- do { if (0) dev_printk(KERN_DEBUG, dev, fmt, ##__VA_ARGS__); } while (0)
+ do { \
+ if (0) \
+ dev_printk(KERN_DEBUG, dev, fmt, ##__VA_ARGS__); \
+ else if (IS_BUILTIN(CONFIG_TWIST_ALWAYS_EVALUATE_PRINTK_ARGUMENTS)) \
+ snprintf(NULL, 0, fmt, ##__VA_ARGS__); \
+ } while (0)
#define dynamic_hex_dump(prefix_str, prefix_type, rowsize, \
groupsize, buf, len, ascii) \
do { if (0) \
diff --git a/include/linux/printk.h b/include/linux/printk.h
index fc8f03c54543..517c7cdec265 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -135,6 +135,8 @@ struct va_format {
({ \
if (0) \
printk(fmt, ##__VA_ARGS__); \
+ else if (IS_BUILTIN(CONFIG_TWIST_ALWAYS_EVALUATE_PRINTK_ARGUMENTS)) \
+ snprintf(NULL, 0, fmt, ##__VA_ARGS__); \
0; \
})

@@ -439,6 +441,8 @@ extern int kptr_restrict;
__print_once = true; \
printk(fmt, ##__VA_ARGS__); \
} \
+ else if (IS_BUILTIN(CONFIG_TWIST_ALWAYS_EVALUATE_PRINTK_ARGUMENTS)) \
+ snprintf(NULL, 0, fmt, ##__VA_ARGS__); \
unlikely(__ret_print_once); \
})
#define printk_deferred_once(fmt, ...) \
@@ -450,6 +454,8 @@ extern int kptr_restrict;
__print_once = true; \
printk_deferred(fmt, ##__VA_ARGS__); \
} \
+ else if (IS_BUILTIN(CONFIG_TWIST_ALWAYS_EVALUATE_PRINTK_ARGUMENTS)) \
+ snprintf(NULL, 0, fmt, ##__VA_ARGS__); \
unlikely(__ret_print_once); \
})
#else
@@ -505,6 +511,8 @@ extern int kptr_restrict;
\
if (__ratelimit(&_rs)) \
printk(fmt, ##__VA_ARGS__); \
+ else if (IS_BUILTIN(CONFIG_TWIST_ALWAYS_EVALUATE_PRINTK_ARGUMENTS)) \
+ snprintf(NULL, 0, fmt, ##__VA_ARGS__); \
})
#else
#define printk_ratelimited(fmt, ...) \
@@ -548,6 +556,8 @@ do { \
if (DYNAMIC_DEBUG_BRANCH(descriptor) && \
__ratelimit(&_rs)) \
__dynamic_pr_debug(&descriptor, pr_fmt(fmt), ##__VA_ARGS__); \
+ else if (IS_BUILTIN(CONFIG_TWIST_ALWAYS_EVALUATE_PRINTK_ARGUMENTS)) \
+ snprintf(NULL, 0, pr_fmt(fmt), ##__VA_ARGS__); \
} while (0)
#elif defined(DEBUG)
#define pr_debug_ratelimited(fmt, ...) \
diff --git a/lib/Kconfig.twist b/lib/Kconfig.twist
index f20e0d003581..a45a631819c3 100644
--- a/lib/Kconfig.twist
+++ b/lib/Kconfig.twist
@@ -12,10 +12,22 @@ if TWIST_KERNEL_BEHAVIOR

config TWIST_FOR_SYZKALLER_TESTING
bool "Select all twist options suitable for syzkaller testing"
+ select TWIST_ALWAYS_EVALUATE_PRINTK_ARGUMENTS
select TWIST_DISABLE_KBD_K_SPEC_HANDLER
help
Say N unless you are building kernels for syzkaller's testing.

+config TWIST_ALWAYS_EVALUATE_PRINTK_ARGUMENTS
+ bool "Always evaluate printk() arguments"
+ help
+ Currently, only format string of printk() arguments is checked
+ by compiler if pr_devel()/pr_debug() are disabled. Therefore,
+ fuzz testing cannot catch runtime bugs (e.g. NULL pointer
+ dereference, use-after-free/out-of-bounds/uninitialized read)
+ in disabled printk() calls. This option redirects disabled
+ printk(...) to snprintf(NULL, 0, ...) in order to evaluate
+ arguments without printing.
+
config TWIST_DISABLE_KBD_K_SPEC_HANDLER
bool "Disable k_spec() function in drivers/tty/vt/keyboard.c"
help
--
2.18.2

2020-05-28 11:02:41

[permalink] [raw]

Subject: Re: [PATCH] twist: allow converting pr_devel()/pr_debug() into printk(KERN_DEBUG)

On Thu 2020-05-28 08:33:37, Tetsuo Handa wrote:
> On 2020/05/28 0:55, Petr Mladek wrote:
> >>> Well, it would be possible to call vsprintf() with NULL buffer. It is
> >>> normally used to calculate the length of the message before it is
> >>> printed. But it also does all the accesses without printing anything.
> >>
> >> OK. I think that redirecting pr_debug() to vsnprintf(NULL, 0) will be
> >> better than modifying dynamic_debug path, for
> >
> > It might get more complicated if you would actually want to see
> > pr_debug() messages for a selected module in the fuzzer.
>
> I don't expect that automated testing can afford selectively enabling
> DYNAMIC_DEBUG_BRANCH(id) conditions. But we could evaluate all arguments
> by calling snprintf(NULL, 0) if the condition to call printk() is false.
>
> > vsprintf(NULL, ) can be called for pr_debug() messages in
> > vprintk_store(). It will be again only a single place for
> > all printk() wrappers.
>
> I couldn't catch what you mean. The problem of pr_debug() is that
> vprintk_store() might not be called because of
>
> #define no_printk(fmt, ...) \
> ({ \
> if (0) \
> printk(fmt, ##__VA_ARGS__); \
> 0; \
> })
>
> #define pr_debug(fmt, ...) \
> no_printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__)
>
> or
>
> #define __dynamic_func_call(id, fmt, func, ...) do { \
> DEFINE_DYNAMIC_DEBUG_METADATA(id, fmt); \
> if (DYNAMIC_DEBUG_BRANCH(id)) \
> func(&id, ##__VA_ARGS__); \
> } while (0)
>
> #define _dynamic_func_call(fmt, func, ...) \
> __dynamic_func_call(__UNIQUE_ID(ddebug), fmt, func, ##__VA_ARGS__)
>
> #define dynamic_pr_debug(fmt, ...) \
> _dynamic_func_call(fmt, __dynamic_pr_debug, \
> pr_fmt(fmt), ##__VA_ARGS__)
>
> #define pr_debug(fmt, ...) \
> dynamic_pr_debug(fmt, ##__VA_ARGS__)

That is exactly the problem. Your current patch [1] adds checks
for the CONFIG_TWIST into 15 different locations.

This is perfectly fine for testing in linux-next whether this twist
is worth the effort. But I do not like this as a long term solution.

If the testing shows that it was really helpful and you would want
to get this into Linus' tree. Then I would like to do the twist at different
level:

1. Add twist into ddebug_add_module() and enable all newly added
entries by default. For example, by calling
ddebug_exec_query("*:+p", const char *modname) or what is the syntax.

This will cause that any pr_devel() variant will always get called.

2. Add twist into vprintk_store(). In the current, implementation
it would do:

#if TWIST
return text_len;
#endif

return log_output(facility, level, lflags,
dict, dictlen, text, text_len);

Something similar would need to be done also in printk_safe().
Hot you could ignore this because it would be used only in
very few scenarios.

In the lock_less variant, we would need to format the message
into small buffer on stack to detect the log level from the first
few bytes.

The approach will cause that pr_devel() message will never get really
printed when this TWIST is enabled. But you mention that automatic
testing would not do so anyway.

[1] https://lore.kernel.org/r/[email protected]

Best Regards,
Petr

2020-05-28 11:10:59

[permalink] [raw]

Subject: Re: [PATCH v2] twist: allow converting pr_devel()/pr_debug() into snprintf()

On Thu 2020-05-28 15:56:03, Tetsuo Handa wrote:
> syzbot found a NULL pointer dereference bug inside mptcp_recvmsg() due to
> ssock == NULL, but this bug manifested inside selinux_socket_recvmsg()
> because pr_debug() was no-op [1].
>
> pr_debug("fallback-read subflow=%p",
> mptcp_subflow_ctx(ssock->sk));
> copied = sock_recvmsg(ssock, msg, flags);
>
> Thus, let's allow fuzzers to always evaluate pr_devel()/pr_debug()
> messages, by redirecting no-op pr_devel()/pr_debug() calls to snprintf().
>
> [1] https://syzkaller.appspot.com/bug?id=12be9aa373be9d8727cdd172f190de39528a413a
>
> Signed-off-by: Tetsuo Handa <[email protected]>
> Cc: Petr Mladek <[email protected]>
> Cc: Sergey Senozhatsky <[email protected]>
> Cc: Steven Rostedt <[email protected]>
> Cc: Ondrej Mosnacek <[email protected]>
> Cc: Dmitry Vyukov <[email protected]>
> ---
> include/linux/dev_printk.h | 16 ++++++++++++++++
> include/linux/dynamic_debug.h | 14 ++++++++++++--
> include/linux/printk.h | 10 ++++++++++
> lib/Kconfig.twist | 12 ++++++++++++
> 4 files changed, 50 insertions(+), 2 deletions(-)

I am fine with pushing this into linux-next for testing purposes.
But I am against pushing this to Linus' tree in this form.

Now, it requires lib/Kconfig.twist that is added by a patch in
Andrew's tree. One approach is to push this into linux-next
via Andrew's -mm tree.

Another possibility would be to remove lib/Kconfig.twist
changes from this patch and replace
CONFIG_TWIST_ALWAYS_EVALUATE_PRINTK_ARGUMENTS with
CONFIG_TWIST_FOR_SYZKALLER_TESTING.
Then I could push it into linux-next via printk/linux.git tree.

Best Regards,
Petr

2020-05-28 11:36:20

[permalink] [raw]

Subject: Re: [PATCH] twist: allow converting pr_devel()/pr_debug() into printk(KERN_DEBUG)

On 2020/05/28 19:59, Petr Mladek wrote:
> 2. Add twist into vprintk_store(). In the current, implementation
> it would do:
>
> #if TWIST
> return text_len;
> #endif
>
> return log_output(facility, level, lflags,
> dict, dictlen, text, text_len);

This part could be possible. But

> 1. Add twist into ddebug_add_module() and enable all newly added
> entries by default. For example, by calling
> ddebug_exec_query("*:+p", const char *modname) or what is the syntax.
>
> This will cause that any pr_devel() variant will always get called.

how to handle

>> #define no_printk(fmt, ...) \
>> ({ \
>> if (0) \
>> printk(fmt, ##__VA_ARGS__); \
>> 0; \
>> })

part used by e.g. pr_devel() ? Since this macro is not using dynamic debug
interface, vprintk_store() will not be called from the beginning. Are you
suggesting that we should convert no_printk() to use dynamic debug interface ?

I don't know whether enabling only in linux-next makes sense. Since not all tests
are equally done on each git tree, available only in linux-next will not be able
to cover all callers. Just using CONFIG_TWIST_ALWAYS_EVALUATE_PRINTK_ARGUMENTS=y
and CONFIG_DYNAMIC_DEBUG=n is the simplest.

2020-05-28 12:17:20

[permalink] [raw]

Subject: Re: [PATCH] twist: allow converting pr_devel()/pr_debug() into printk(KERN_DEBUG)

On Thu 2020-05-28 20:33:10, Tetsuo Handa wrote:
> On 2020/05/28 19:59, Petr Mladek wrote:
> > 2. Add twist into vprintk_store(). In the current, implementation
> > it would do:
> >
> > #if TWIST
> > return text_len;
> > #endif
> >
> > return log_output(facility, level, lflags,
> > dict, dictlen, text, text_len);
>
> This part could be possible. But
>
> > 1. Add twist into ddebug_add_module() and enable all newly added
> > entries by default. For example, by calling
> > ddebug_exec_query("*:+p", const char *modname) or what is the syntax.
> >
> > This will cause that any pr_devel() variant will always get called.
>
> how to handle
>
> >> #define no_printk(fmt, ...) \
> >> ({ \
> >> if (0) \
> >> printk(fmt, ##__VA_ARGS__); \
> >> 0; \
> >> })
>
> part used by e.g. pr_devel() ? Since this macro is not using dynamic debug
> interface, vprintk_store() will not be called from the beginning. Are you
> suggesting that we should convert no_printk() to use dynamic debug interface ?

OK, this is one more path that would need special handling. Two paths
are much better than 15.

> I don't know whether enabling only in linux-next makes sense. Since not all tests
> are equally done on each git tree, available only in linux-next will not be able
> to cover all callers. Just using CONFIG_TWIST_ALWAYS_EVALUATE_PRINTK_ARGUMENTS=y
> and CONFIG_DYNAMIC_DEBUG=n is the simplest.

I hope that tests done on linux-next would be enough to trigger some
bugs. If you do not see problems in linux-next then this twist
probably is not worth the effort and code complications.

Best Regards,
Petr

2020-05-28 14:18:52

[permalink] [raw]

Subject: Re: [PATCH] twist: allow converting pr_devel()/pr_debug() into printk(KERN_DEBUG)

On 2020/05/28 21:14, Petr Mladek wrote:
>> how to handle
>>
>>>> #define no_printk(fmt, ...) \
>>>> ({ \
>>>> if (0) \
>>>> printk(fmt, ##__VA_ARGS__); \
>>>> 0; \
>>>> })
>>
>> part used by e.g. pr_devel() ? Since this macro is not using dynamic debug
>> interface, vprintk_store() will not be called from the beginning. Are you
>> suggesting that we should convert no_printk() to use dynamic debug interface ?
>
> OK, this is one more path that would need special handling. Two paths
> are much better than 15.

OK. That can avoid needlessly increasing dynamic debug locations.
But I believe that your suggestion is much worse than 15. ;-(

Let's go back to "Add twist into vprintk_store().". The condition is not as simple as

#if TWIST
return text_len;
#endif

because we need to check whether the caller wants to print this message or not.
Since we need to print all messages that the caller asked to print, the condition
needs to be

#if TWIST
if (!this_message_should_be_stored_into_logbuf(arg))
return text_len;
#endif

and where does the "arg" come? It is not as simple as loglevel. Like you said

It might get more complicated if you would actually want to see
pr_debug() messages for a selected module in the fuzzer.

, we want to conditionally store KERN_DEBUG messages into logbuf.

We sometimes don't want to store into logbuf due to ratelimit, due to
dynamic debug. But we DO want to evaluate arguments passed to printk().

Oh, if we try to add twist into vprintk_store(), we will have to modify
all printk() callers in order to pass "arg" only for telling whether we
want to store that messages into logbuf. What a nightmare!

2020-05-28 15:21:22

[permalink] [raw]

Subject: Re: [PATCH v2] twist: allow converting pr_devel()/pr_debug() into snprintf()

On 2020/05/28 20:06, Petr Mladek wrote:
> Now, it requires lib/Kconfig.twist that is added by a patch in
> Andrew's tree. One approach is to push this into linux-next
> via Andrew's -mm tree.
>
> Another possibility would be to remove lib/Kconfig.twist
> changes from this patch and replace
> CONFIG_TWIST_ALWAYS_EVALUATE_PRINTK_ARGUMENTS with
> CONFIG_TWIST_FOR_SYZKALLER_TESTING.
> Then I could push it into linux-next via printk/linux.git tree.

CONFIG_TWIST_FOR_SYZKALLER_TESTING is meant for linux-next only.
But CONFIG_TWIST_KERNEL_BEHAVIOR is meant for Linus's tree.

That is, lib/Kconfig.twist will be there in printk/linux.git tree
after 5.8-rc1. But maybe twist related patches should be gathered
into one tree for easier management.

2020-05-28 17:10:53

by Joe Perches

[permalink] [raw]

Subject: Re: [PATCH] twist: allow converting pr_devel()/pr_debug() into printk(KERN_DEBUG)

On Thu, 2020-05-28 at 14:14 +0200, Petr Mladek wrote:
> On Thu 2020-05-28 20:33:10, Tetsuo Handa wrote:
> > On 2020/05/28 19:59, Petr Mladek wrote:
> > > 2. Add twist into vprintk_store(). In the current, implementation
> > > it would do:
> > >
> > > #if TWIST
> > > return text_len;
> > > #endif
> > >
> > > return log_output(facility, level, lflags,
> > > dict, dictlen, text, text_len);
> >
> > This part could be possible. But
> >
> > > 1. Add twist into ddebug_add_module() and enable all newly added
> > > entries by default. For example, by calling
> > > ddebug_exec_query("*:+p", const char *modname) or what is the syntax.
> > >
> > > This will cause that any pr_devel() variant will always get called.
> >
> > how to handle
> >
> > > > #define no_printk(fmt, ...) \
> > > > ({ \
> > > > if (0) \
> > > > printk(fmt, ##__VA_ARGS__); \
> > > > 0; \
> > > > })
> >
> > part used by e.g. pr_devel() ? Since this macro is not using dynamic debug
> > interface, vprintk_store() will not be called from the beginning. Are you
> > suggesting that we should convert no_printk() to use dynamic debug interface ?
>
> OK, this is one more path that would need special handling. Two paths
> are much better than 15.

A few more:

$ grep-2.5.4 --include=*.[ch] -rP '\if\s*$\s*0\s*$\s*\{?\s*\\?\s*no_printk' *
drivers/platform/x86/thinkpad_acpi.c: do { if (0) no_printk(format, ##arg); } while (0)
fs/ntfs/debug.h: if (0) \
no_printk(fmt, ##__VA_ARGS__); \
include/linux/net.h: if (0) \
no_printk(KERN_DEBUG pr_fmt(fmt), ##__VA_ARGS__); \
kernel/cred.c: if (0) \
no_printk("[%-5.5s%5u] " FMT "\n", \

2020-05-28 19:15:22

by Andrew Morton

[permalink] [raw]

Subject: Re: [PATCH v2] twist: allow converting pr_devel()/pr_debug() into snprintf()

On Fri, 29 May 2020 00:16:22 +0900 Tetsuo Handa <[email protected]> wrote:

> On 2020/05/28 20:06, Petr Mladek wrote:
> > Now, it requires lib/Kconfig.twist that is added by a patch in
> > Andrew's tree. One approach is to push this into linux-next
> > via Andrew's -mm tree.
> >
> > Another possibility would be to remove lib/Kconfig.twist
> > changes from this patch and replace
> > CONFIG_TWIST_ALWAYS_EVALUATE_PRINTK_ARGUMENTS with
> > CONFIG_TWIST_FOR_SYZKALLER_TESTING.
> > Then I could push it into linux-next via printk/linux.git tree.
>
> CONFIG_TWIST_FOR_SYZKALLER_TESTING is meant for linux-next only.
> But CONFIG_TWIST_KERNEL_BEHAVIOR is meant for Linus's tree.
>
> That is, lib/Kconfig.twist will be there in printk/linux.git tree
> after 5.8-rc1. But maybe twist related patches should be gathered
> into one tree for easier management.

Yup. I'll drop the twist patches from -mm.

2020-05-28 19:53:06

[permalink] [raw]

Subject: Re: [PATCH v2] twist: allow converting pr_devel()/pr_debug() into snprintf()

On Thu, May 28, 2020 at 8:17 AM Tetsuo Handa
<[email protected]> wrote:
>
> CONFIG_TWIST_FOR_SYZKALLER_TESTING is meant for linux-next only.
> But CONFIG_TWIST_KERNEL_BEHAVIOR is meant for Linus's tree.

I really absolutely still detest this all. I don't see the point. The
naming is completely random (both "twist" and then options like
"TWIST_FOR_SYZKALLER_TESTING" that have no conceptual meaning.

I still don't understand why this small set of random options couldn't
just be kernel options that get set on the command line, and that have
independent and sane and explainable behavior? Why this odd mentality
of "syzkaller is special"?

I've complained about this whole thing before. I'm getting really fed
up with this whole concept of "magic crazy config options".

The kernel configuration phase is just about the _worst_ part of the
kernel, we shouldn't make these pointless things make it even worse.

Linus

2020-05-28 20:04:30

[permalink] [raw]

Subject: Re: [PATCH v2] twist: allow converting pr_devel()/pr_debug() into snprintf()

On Thu, May 28, 2020 at 12:50 PM Linus Torvalds
<[email protected]> wrote:
>
> I still don't understand why this small set of random options couldn't
> just be kernel options that get set on the command line, and that have
> independent and sane and explainable behavior? Why this odd mentality
> of "syzkaller is special"?

And just to clarify: the kernel option wouldn't be "syzcaller_twist"
or something insane like that.

It would be something like "kbd-disable-hotkeys" or whatever: actual
real text that says what it does.

Linus

2020-05-29 00:12:22

[permalink] [raw]

Subject: Re: [PATCH v2] twist: allow converting pr_devel()/pr_debug() into snprintf()

On 2020/05/29 4:50, Linus Torvalds wrote:
> On Thu, May 28, 2020 at 8:17 AM Tetsuo Handa
> <[email protected]> wrote:
>>
>> CONFIG_TWIST_FOR_SYZKALLER_TESTING is meant for linux-next only.
>> But CONFIG_TWIST_KERNEL_BEHAVIOR is meant for Linus's tree.
>
> I really absolutely still detest this all. I don't see the point. The
> naming is completely random (both "twist" and then options like
> "TWIST_FOR_SYZKALLER_TESTING" that have no conceptual meaning.

Oh, I made copy&paste error. I wanted to say

CONFIG_TWIST_KERNEL_BEHAVIOR and CONFIG_TWIST_FOR_SYZKALLER_TESTING are
meant for Linus's tree.

CONFIG_DEBUG_AID_FOR_SYZBOT is meant for linux-next only, and will be
removed after CONFIG_TWIST_FOR_SYZKALLER_TESTING went to Linus's tree.

If you don't like the name "CONFIG_TWIST_FOR_SYZKALLER_TESTING", I'm happy
to rename it.

>
> I still don't understand why this small set of random options couldn't
> just be kernel options that get set on the command line, and that have
> independent and sane and explainable behavior?

You mean "export these behavior as kernel command line options"? That will
involve run-time costs (while build-time branching based on #ifdef can
completely eliminate run-time costs). Also, as number of options which
can be controlled at boot-time grows, the kernel command line will become
too long to specify all of these behavior. Also, making these options
controllable at boot-time involves making these options as user-visible ABI
(which is bad, for the twists which we want might change in the future).

> Why this odd mentality of "syzkaller is special"?

Why do you think "syzkaller is special" ? There is no syzkaller-specific
choice. CONFIG_TWIST_FOR_SYZKALLER_TESTING is intended for eliminating the
need of managing CONFIG_TWIST_* options on each kernel tree/commit. When
a different fuzzer (or some kernel testing project) appears, they can define
their own CONFIG_TWIST_FOR_$projectname_TESTING as well.

>
> I've complained about this whole thing before. I'm getting really fed
> up with this whole concept of "magic crazy config options".
>
> The kernel configuration phase is just about the _worst_ part of the
> kernel, we shouldn't make these pointless things make it even worse.

Current kernel is not well segmented enough to allow switching based on
per process flags. We can't distinguish whether some kernel message was
caused by a process with such flags.

All we could afford is to switch based on kernel boot command line. But
that will entail a lot of code/data (and runtime-cost) which is not used
if the administrator does not turn on the switches.

After all, switching at the kernel configuration phase is the most simple
approach.

On 2020/05/29 5:01, Linus Torvalds wrote:
> On Thu, May 28, 2020 at 12:50 PM Linus Torvalds
> <[email protected]> wrote:
>>
>> I still don't understand why this small set of random options couldn't
>> just be kernel options that get set on the command line, and that have
>> independent and sane and explainable behavior? Why this odd mentality
>> of "syzkaller is special"?
>
> And just to clarify: the kernel option wouldn't be "syzcaller_twist"
> or something insane like that.
>
> It would be something like "kbd-disable-hotkeys" or whatever: actual
> real text that says what it does.

If you don't like the name "CONFIG_TWIST_FOR_SYZKALLER_TESTING", please
suggest an example name you would accept. But if you don't like switching
based on the kernel configuration options, I can't find a better solution.

2020-05-29 00:33:18

[permalink] [raw]

Subject: Re: [PATCH v2] twist: allow converting pr_devel()/pr_debug() into snprintf()

On Thu, May 28, 2020 at 5:08 PM Tetsuo Handa
<[email protected]> wrote:
>
> You mean "export these behavior as kernel command line options"? That will
> involve run-time costs (while build-time branching based on #ifdef can
> completely eliminate run-time costs).

Are _any_ of these things meaningful?

> Also, as number of options which
> can be controlled at boot-time grows, the kernel command line will become
> too long to specify all of these behavior.

So? We have explicitly added boot-config files for exactly this,
because people who do boot-time tracing wanted this.

> Why do you think "syzkaller is special" ? There is no syzkaller-specific
> choice.

ALL of these are designed to be totally about syzkaller. Nobody else
has ever asked for the TWIST options. And if they have, they'd still
make more sense as generic real actual options than as some odd
"twist" thing.

> Current kernel is not well segmented enough to allow switching based on
> per process flags. We can't distinguish whether some kernel message was
> caused by a process with such flags.

Who said anything at all about per process?

> All we could afford is to switch based on kernel boot command line. But
> that will entail a lot of code/data (and runtime-cost) which is not used
> if the administrator does not turn on the switches.

Absolutely nobody cares.

In fact, I'd prefer it just so that the options would be individually
explained, and not hidden away in some odd kernel config file, and
would be visible and force people to have sane nam,es.

> After all, switching at the kernel configuration phase is the most simple
> approach.

No it isn't. It's the UGLIEST possible approach, forcing a nasty
horrible config process to be even worse, forcing a compile-time
decision for something that isn't at all obvious should be
compile-time, and forcing crazy ugly config option names because they
are all just odd.

I have complained about this for MONTHS. You never never actually
explained why you want these badly named config options.

If it's something _so_ core that it affects performance, then no,
syzkaller shouldn't be messing with it in the first place, because
then you'd be testing something that is entirely irrelevant to anybody
else.

And if it's about things like "disable sysrq-k", and it might be
useful to somebody else than syzkaller, then it would be *much* better
off as a boot option so that you don't have to recompile the kernel to
pick it.

Some things might even be worth having a runtime option for.

But this "put random options, give them random names, and force them
all as compile-time options in a nasty kernel config process" just
smells.

And if they are _so_ specialized that only syzkaller could possibly
care, I still maintain that they shouldn't go upstream at all.

Linus

2020-05-29 02:07:05

by Sergey Senozhatsky

[permalink] [raw]

Subject: Re: [PATCH] twist: allow converting pr_devel()/pr_debug() into printk(KERN_DEBUG)

On (20/05/25 19:43), Tetsuo Handa wrote:
> >> On Sun 2020-05-24 23:50:34, Tetsuo Handa wrote:
> >>> syzbot found a NULL pointer dereference bug inside mptcp_recvmsg() due to
> >>> ssock == NULL, but this bug manifested inside selinux_socket_recvmsg()
> >>> because pr_debug() was no-op [1].
> >>>
> >>> pr_debug("fallback-read subflow=%p",
> >>> mptcp_subflow_ctx(ssock->sk));
> >>> copied = sock_recvmsg(ssock, msg, flags);
> >>
> >> The NULL pointer deference was found even without this patch.
> >> This patch would just cause that it will manifest itself on another
> >> place. What is the benefit, please?
>
> It would help localizing the bug in this specific case.
>
> It's not only about %p, even %d can crash kernel or leak sensitive
> info (if it happens after-free/out-of-bounds/uninit). Overall it
> increases code coverage and allows to catch more bugs earlier.

I don't know. Relying on random pr_debug()-s that can be added or
removed any time. oops backtrace should help with that. You are
not going to add pr_debug() all over the kernel, are you?

-ss

2020-05-29 02:15:51

[permalink] [raw]

Subject: Re: [PATCH v2] twist: allow converting pr_devel()/pr_debug() into snprintf()

On 2020/05/29 9:28, Linus Torvalds wrote:
>> Current kernel is not well segmented enough to allow switching based on
>> per process flags. We can't distinguish whether some kernel message was
>> caused by a process with such flags.
>
> Who said anything at all about per process?
>

You said

Some kind of "not even root" flag, which might be per-process and not
possible to clear once set (so that your _normal_ system binaries
could still do the root-only stuff, but then you could start a fuzzing
process with that flag set, knowing that the fuzzing process - and
it's children - are not able to do things).

Honestly, in a perfect world, it has nothing at all to do with
fuzzing, and people could even have some local rules like "anybody who
came in through ssh from the network will also have this flag set".

at https://lkml.kernel.org/r/CAHk-=wgbMi2+VBN0SCEw9GeoiWgui034AOBwbt_dW9tdCa3Nig@mail.gmail.com
and I said

I don't think per-process flags will work. Not every action is taken inside
process context who issued a syscall. Some action might be taken in interrupt
context or in kthread context. Since we have no means to propagate per-process
flags, we will need to give up more than we have to give up.

at https://lkml.kernel.org/r/[email protected] .

>> Why do you think "syzkaller is special" ? There is no syzkaller-specific
>> choice.
>
> ALL of these are designed to be totally about syzkaller. Nobody else
> has ever asked for the TWIST options. And if they have, they'd still
> make more sense as generic real actual options than as some odd
> "twist" thing.

Because syzkaller is the first user who needs the TWIST options. syzkaller is
creative enough to trigger unexpected syscalls with unexpected arguments.
Some of unexpected arguments can be filtered out by sanitization steps, but not all
arguments can be filtered out. Some examples which syzkaller expects the TWIST options
to filter out are described at https://github.com/google/syzkaller/issues/1622 .

Who can implement a fuzzer which never triggers unexpected syscalls with unexpected
arguments? If somebody implemented one, that fuzzer will not be testing as deep as
syzkaller does. It is natural that nobody else has ever asked for the TWIST options.

> And if it's about things like "disable sysrq-k", and it might be
> useful to somebody else than syzkaller, then it would be *much* better
> off as a boot option so that you don't have to recompile the kernel to
> pick it.

Recompiling the kernel to pick something is not a difficult thing. Avoiding bugs
caused by allowing runtime enable/disable is far more difficult thing, for fuzz
testing might unexpectedly overwrite flag variables due to unexpected arguments
or memory corruption.

>> After all, switching at the kernel configuration phase is the most simple
>> approach.
>
> No it isn't. It's the UGLIEST possible approach, forcing a nasty
> horrible config process to be even worse, forcing a compile-time
> decision for something that isn't at all obvious should be
> compile-time, and forcing crazy ugly config option names because they
> are all just odd.

If the code is fixed at build-time and the data does not contain flag variables,
it improves the reliability/robustness of fuzz testing.

Given that one of TWIST switches allows administrator to disable reboot() syscall,
who would use that switch? Only kernel fuzzing projects would use that switch. How
allowing administrators to control such switch at kernel command line can be useful?

> Some things might even be worth having a runtime option for.

If somebody starts asking for boot time switching of the TWIST options, we can consider
it then. Until then, I don't think supporting boot time switching is worth the effort.

> But this "put random options, give them random names, and force them
> all as compile-time options in a nasty kernel config process" just
> smells.

Switching at build-time has a reason for switching at build-time; it is to
improve the reliability/robustness of fuzz testing.

> And if they are _so_ specialized that only syzkaller could possibly
> care, I still maintain that they shouldn't go upstream at all.

Without the support from the kernel side, we will fail to find deep kernel bugs
because we will fail upon unexpected syscalls with unexpected arguments.

The TWIST options are not only for syzkaller. Any fuzzers which try to find deep
kernel bugs will need the TWIST options.

You are free to add runtime switches for things like "disable sysrq-k".
Please don't consider the TWIST options as "for userspace's convenience".
The TWIST options are intended for "improving the quality of kernel testing".

2020-05-29 02:28:16

[permalink] [raw]

Subject: Re: [PATCH v2] twist: allow converting pr_devel()/pr_debug() into snprintf()

On Thu, May 28, 2020 at 7:14 PM Tetsuo Handa
<[email protected]> wrote:
>
> You said
>
> Some kind of "not even root" flag, which might be per-process and not
> possible to clear once set (so that your _normal_ system binaries
> could still do the root-only stuff, but then you could start a fuzzing
> process with that flag set, knowing that the fuzzing process - and
> it's children - are not able to do things).

Yes, that was in a long ago discussion.

And I still think it's actually possibly the right idea for several of
these flags.

Some flags do end up having to be practically system-wide, because
they end up being used in contexts other than the test environment (ie
anything that ends up doing workqueues or networking or VM or whatever
- it's a "global context").

At the same time, some of the flags might well be "in the test
environment" if they act on behavior that is all synchronous. Then
other testers can use them on production machines for testing, even
when the kernel might be used for other things at the same time.

But the point is, NONE OF THIS IS CONFIG OPTIONS.

And absolutely none of this should be named "twist". If you can't make
a config have a sane name that talks about what it does, then it
shouldn't exist at all.

An option like "disable-sysrq-k" is a fine option. It might make
sense. In fact, it would fit well with boot options that we already
have, like "sysrq_always_enabled".

An option called "twist-disablke-sysrq-k" is just odd and nonsensical.

And making that a build-time option is the worst of all worlds.

Linus

2020-05-29 04:50:06

[permalink] [raw]

Subject: Re: [PATCH v2] twist: allow converting pr_devel()/pr_debug() into snprintf()

On 2020/05/29 11:24, Linus Torvalds wrote:
> Some flags do end up having to be practically system-wide, because
> they end up being used in contexts other than the test environment (ie
> anything that ends up doing workqueues or networking or VM or whatever
> - it's a "global context").

Right. And since fuzz testing can't determine whether it will involve
"global context", the options for fuzz testing (currently referenced as
"twist") can't be "per process". It might be possible to make the twist
options "per boot", but we are conflicting on whether making the twist
options "per boot" worth the effort.

> At the same time, some of the flags might well be "in the test
> environment" if they act on behavior that is all synchronous. Then
> other testers can use them on production machines for testing, even
> when the kernel might be used for other things at the same time.

I don't think that people use their machines for testing and non-testing
at the same time. When testing kernels, people enable a lot of debugging
kernel config options (e.g. KASAN, lockdep) because finding bugs has
higher priority than faster kernels. Some of debugging functionality
could be enabled/disabled using kernel command line, but it is hardly
possible to convert such debugging functionality into "per process".
It is much safer to run a qemu-kvm instance on their machines if they
want to test kernels without affecting the rest of their machines.

> And absolutely none of this should be named "twist". If you can't make
> a config have a sane name that talks about what it does, then it
> shouldn't exist at all.

CONFIG_TWIST_* options are analogy of CONFIG_DEBUG_* options.
The TWIST part is a prefix for grouping. If you don't like the grouping,
it is just a matter of renaming CONFIG_TWIST_* options.

> And making that a build-time option is the worst of all worlds.

Can we avoid discussing "the config option naming" (which is build-time
things) and "the kernel command line option" (which is boot-time things)
at the same time? What I am saying is

THE TWIST OPTIONS ARE INTENDED FOR "IMPROVING THE QUALITY OF KERNEL TESTING"

which will in turn help improving the quality of kernel itself, and your
insistence on making them boot-time things (which might increase flexibility
for users) can decrease the reliability/robustness of testing.

2020-05-29 05:10:02

[permalink] [raw]

Subject: Re: [PATCH] twist: allow converting pr_devel()/pr_debug() into printk(KERN_DEBUG)

On 2020/05/29 11:04, Sergey Senozhatsky wrote:
> You are not going to add pr_debug() all over the kernel, are you?

Of course, I'm not planning to add pr_debug() all over the kernel.

When I need to print something to consoles, I will use one-time patch
(like https://syzkaller.appspot.com/text?tag=Patch&x=135f254a100000 ) for
adding custom printk() and turning on some switches for making existing
printk() calls work.

2020-05-29 08:20:11

[permalink] [raw]

Subject: Re: [PATCH v2] twist: allow converting pr_devel()/pr_debug() into snprintf()

On Thu 2020-05-28 12:50:35, Linus Torvalds wrote:
> On Thu, May 28, 2020 at 8:17 AM Tetsuo Handa
> <[email protected]> wrote:
> >
> > CONFIG_TWIST_FOR_SYZKALLER_TESTING is meant for linux-next only.
> > But CONFIG_TWIST_KERNEL_BEHAVIOR is meant for Linus's tree.
>
> I really absolutely still detest this all. I don't see the point. The
> naming is completely random (both "twist" and then options like
> "TWIST_FOR_SYZKALLER_TESTING" that have no conceptual meaning.
>
> I still don't understand why this small set of random options couldn't
> just be kernel options that get set on the command line, and that have
> independent and sane and explainable behavior? Why this odd mentality
> of "syzkaller is special"?

I am afraid that many of them could not be normal options. They change or
break some behavior that is necessary by seriously used system.

> I've complained about this whole thing before. I'm getting really fed
> up with this whole concept of "magic crazy config options".

Just to make my role clear in this saga.

I am focused on the change of pr_debug() behavior. I do _not_ believe
that it is worth it. But I wanted to give fuzzer guys a chance to get
some data.

This is why I offered to push hacky patch into linux-next via printk
tree to get fuzzers fed. Such a patch would change the behavior only
for the fuzzer (with the crazy config enabled) and it would be there
only for a limited time.

I personally do _not_ have a good feeling about having such hacks in
upstream kernel. But I do not feel in position to decide about it.
I wanted to solve this question later if there would have been
anything to upstream.

I am _not_ going to push any twists, in the current form,
upstream via printk tree.

Best Regards,
Petr

2020-05-29 13:29:48

[permalink] [raw]

Subject: Re: [PATCH v2] twist: allow converting pr_devel()/pr_debug() into snprintf()

Hello, Dmitry.

Linus is asking me to avoid build-time switching based on kernel config options,
and is suggesting me to use boot-time switching based on boot-config file feature
(which is available since 5.6). I have several concerns about use of boot-config file
feature in syzkaller.

(1) To use boot-config file, syzkaller will need to add "bootconfig" option
to the kernel command line. This will be doable by patching
https://github.com/google/syzkaller/tree/master/dashboard/config/ *.cmdline
files.

(2) The boot-config file is embedded into initramfs file. Since syzkaller builds
kernels with almost-allyesconfig, booting syzkaller kernels do not require
initramfs for loading kernel modules needed for mounting the root partition.
In fact, according to "unexpected kernel reboot" report which contains boot messages,
I can't find "Unpacking initramfs..." message. It seems that syzkaller kernels do not
use initramfs file.

Is it possible for syzkaller to enforce performing steps for creating an initramfs,
embedding the boot-config file
( https://www.kernel.org/doc/html/latest/admin-guide/bootconfig.html#boot-kernel-with-a-boot-config),
and loading that initramfs whenever booting the syzkaller kernels?
By the way, I do worry that people forget to perform these steps when they do
their tests without asking syzbot...

(3) Since syzkaller keeps track of "kernel tree", "commit of the kernel tree", and
"commit of the syzkaller tree" in order to guarantee reproducibility, it would be
possible to identify the "your-config" file used for tools/bootconfig/bootconfig
command. But since "#syz test" command currently accepts only "kernel tree" and
"commit of the kernel tree" arguments, we might fail to use intended "your-config"
file when doing reproducibility test. Can syzbot solve this concern?

(4) Of course, "your-config" file would not change so frequently, but "#syz test" command
relies on external file in "syzkaller tree" makes it impossible to try different
configuration when I have to ask syzbot to test. (Since I don't have hardware which
syzbot is reporting problems, I have to ask syzbot when I can't reproduce the problem
in my environment.)

https://syzkaller.appspot.com/text?tag=Patch&x=135f254a100000 is an example of
need to enforce CONFIG_DEBUG_INFO_BTF=n in order to workaround build failure during
"#syz test" command. If we bring "which twist options should be enabled" to an external
boot-config file, I can't ask syzbot to try different twist options (except directly
patching the kernel source which handles "which twist options should be enabled").
Can syzbot solve this concern?

(5) Anything else?

2020-06-03 11:06:07

[permalink] [raw]

Subject: twist: allow disabling reboot request

On 2020/05/29 22:26, Tetsuo Handa wrote:
> By the way, I do worry that people forget to perform these steps when they do
> their tests without asking syzbot...

Here is a draft of boot-time switching. Since kconfig can handle string variable up to
2048 characters, we could hold the content of the "your-config" file inside .config file
in order to avoid relying on external file in "syzkaller tree". But since only one kconfig
option is used, basically the way to temporarily include/exclude specific options (under
automated testing by syzbot) seems to remain directly patching apply_twist_flags(), for
https://github.com/google/syzkaller/blob/master/dashboard/config/util.sh will automatically
overwrite CONFIG_DEFAULT_TWIST_FLAGS settings. If each twist flag were using independent
kconfig option, the way to temporarily include/exclude specific options will become directly
patching Kconfig file.

drivers/tty/vt/keyboard.c | 2 ++
include/linux/kernel.h | 8 ++++++++
init/main.c | 30 ++++++++++++++++++++++++++++++
kernel/reboot.c | 36 ++++++++++++++++++++++++++++++++++++
lib/Kconfig.debug | 5 +++++
5 files changed, 81 insertions(+)

diff --git a/drivers/tty/vt/keyboard.c b/drivers/tty/vt/keyboard.c
index 568b2171f335..ae0b7cd69249 100644
--- a/drivers/tty/vt/keyboard.c
+++ b/drivers/tty/vt/keyboard.c
@@ -637,6 +637,8 @@ static void k_spec(struct vc_data *vc, unsigned char value, char up_flag)
kbd->kbdmode == VC_OFF) &&
value != KVAL(K_SAK))
return; /* SAK is allowed even in raw mode */
+ if (twist_flags.disable_kbd_k_spec_handler)
+ return;
fn_handler[value](vc);
}

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 82d91547d122..78fdbb4f17b1 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -1038,4 +1038,12 @@ static inline void ftrace_dump(enum ftrace_dump_mode oops_dump_mode) { }
/* OTHER_WRITABLE? Generally considered a bad idea. */ \
BUILD_BUG_ON_ZERO((perms) & 2) + \
(perms))
+
+/* Flags for twisting kernel behavior. */
+struct twist_flags {
+ bool disable_kbd_k_spec_handler;
+ bool disable_reboot_request;
+};
+extern struct twist_flags twist_flags;
+
#endif
diff --git a/init/main.c b/init/main.c
index 0ead83e86b5a..15eecd253b61 100644
--- a/init/main.c
+++ b/init/main.c
@@ -1531,3 +1531,33 @@ static noinline void __init kernel_init_freeable(void)

integrity_load_keys();
}
+
+/* Flags for twisting kernel behavior. */
+struct twist_flags twist_flags __ro_after_init;
+EXPORT_SYMBOL(twist_flags);
+static __initdata char default_twist_flags[] __initdata = CONFIG_DEFAULT_TWIST_FLAGS;
+static __initdata char *chosen_twist_flags = default_twist_flags;
+
+static int __init overwrite_twist_flags(char *str)
+{
+ chosen_twist_flags = str;
+ return 1;
+}
+__setup("twist_flags=", overwrite_twist_flags);
+
+static int __init apply_twist_flags(void)
+{
+ char *flags = chosen_twist_flags;
+ char *name;
+
+ while ((name = strsep(&flags, ",")) != NULL) {
+ if (!strcmp(name, "kbd-disable-hotkeys"))
+ twist_flags.disable_kbd_k_spec_handler = true;
+ else if (!strcmp(name, "disable-reboot-request"))
+ twist_flags.disable_reboot_request = true;
+ else
+ printk(KERN_INFO "Ignoring unknown twist option '%s'.\n", name);
+ }
+ return 0;
+}
+late_initcall(apply_twist_flags);
diff --git a/kernel/reboot.c b/kernel/reboot.c
index 491f1347bf43..63cec97a9e59 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -63,6 +63,8 @@ EXPORT_SYMBOL_GPL(pm_power_off_prepare);
*/
void emergency_restart(void)
{
+ if (twist_flags.disable_reboot_request)
+ panic("reboot request is disabled");
kmsg_dump(KMSG_DUMP_EMERG);
machine_emergency_restart();
}
@@ -243,6 +245,8 @@ void migrate_to_reboot_cpu(void)
*/
void kernel_restart(char *cmd)
{
+ if (twist_flags.disable_reboot_request)
+ panic("reboot request is disabled");
kernel_restart_prepare(cmd);
migrate_to_reboot_cpu();
syscore_shutdown();
@@ -270,6 +274,8 @@ static void kernel_shutdown_prepare(enum system_states state)
*/
void kernel_halt(void)
{
+ if (twist_flags.disable_reboot_request)
+ panic("reboot request is disabled");
kernel_shutdown_prepare(SYSTEM_HALT);
migrate_to_reboot_cpu();
syscore_shutdown();
@@ -286,6 +292,8 @@ EXPORT_SYMBOL_GPL(kernel_halt);
*/
void kernel_power_off(void)
{
+ if (twist_flags.disable_reboot_request)
+ panic("reboot request is disabled");
kernel_shutdown_prepare(SYSTEM_POWER_OFF);
if (pm_power_off_prepare)
pm_power_off_prepare();
@@ -344,6 +352,10 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd,
mutex_lock(&system_transition_mutex);
switch (cmd) {
case LINUX_REBOOT_CMD_RESTART:
+ if (twist_flags.disable_reboot_request) {
+ ret = -EPERM;
+ break;
+ }
kernel_restart(NULL);
break;

@@ -356,11 +368,19 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd,
break;

case LINUX_REBOOT_CMD_HALT:
+ if (twist_flags.disable_reboot_request) {
+ ret = -EPERM;
+ break;
+ }
kernel_halt();
do_exit(0);
panic("cannot halt");

case LINUX_REBOOT_CMD_POWER_OFF:
+ if (twist_flags.disable_reboot_request) {
+ ret = -EPERM;
+ break;
+ }
kernel_power_off();
do_exit(0);
break;
@@ -373,17 +393,29 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsigned int, cmd,
}
buffer[sizeof(buffer) - 1] = '\0';

+ if (twist_flags.disable_reboot_request) {
+ ret = -EPERM;
+ break;
+ }
kernel_restart(buffer);
break;

#ifdef CONFIG_KEXEC_CORE
case LINUX_REBOOT_CMD_KEXEC:
+ if (twist_flags.disable_reboot_request) {
+ ret = -EPERM;
+ break;
+ }
ret = kernel_kexec();
break;
#endif

#ifdef CONFIG_HIBERNATION
case LINUX_REBOOT_CMD_SW_SUSPEND:
+ if (twist_flags.disable_reboot_request) {
+ ret = -EPERM;
+ break;
+ }
ret = hibernate();
break;
#endif
@@ -493,6 +525,8 @@ static DECLARE_WORK(poweroff_work, poweroff_work_func);
*/
void orderly_poweroff(bool force)
{
+ if (twist_flags.disable_reboot_request)
+ panic("reboot request is disabled");
if (force) /* do not override the pending "true" */
poweroff_force = true;
schedule_work(&poweroff_work);
@@ -514,6 +548,8 @@ static DECLARE_WORK(reboot_work, reboot_work_func);
*/
void orderly_reboot(void)
{
+ if (twist_flags.disable_reboot_request)
+ panic("reboot request is disabled");
schedule_work(&reboot_work);
}
EXPORT_SYMBOL_GPL(orderly_reboot);
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 498d344ea53a..41cfabc74ad7 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -2338,4 +2338,9 @@ config HYPERV_TESTING

endmenu # "Kernel Testing and Coverage"

+menuconfig DEFAULT_TWIST_FLAGS
+ string "Default twist options (DANGEROUS)"
+ help
+ Don't specify anything unless you know what you are doing.
+
endmenu # Kernel hacking

2020-06-03 12:46:25

[permalink] [raw]

Subject: Re: twist: allow disabling reboot request

On Wed 2020-06-03 20:03:28, Tetsuo Handa wrote:
> On 2020/05/29 22:26, Tetsuo Handa wrote:
> > By the way, I do worry that people forget to perform these steps when they do
> > their tests without asking syzbot...
>
> Here is a draft of boot-time switching. Since kconfig can handle string variable up to
> 2048 characters, we could hold the content of the "your-config" file inside .config file
> in order to avoid relying on external file in "syzkaller tree". But since only one kconfig
> option is used, basically the way to temporarily include/exclude specific options (under
> automated testing by syzbot) seems to remain directly patching apply_twist_flags(), for
> https://github.com/google/syzkaller/blob/master/dashboard/config/util.sh will automatically
> overwrite CONFIG_DEFAULT_TWIST_FLAGS settings. If each twist flag were using independent
> kconfig option, the way to temporarily include/exclude specific options will become directly
> patching Kconfig file.
>
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index 82d91547d122..78fdbb4f17b1 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -1038,4 +1038,12 @@ static inline void ftrace_dump(enum ftrace_dump_mode oops_dump_mode) { }
> /* OTHER_WRITABLE? Generally considered a bad idea. */ \
> BUILD_BUG_ON_ZERO((perms) & 2) + \
> (perms))
> +
> +/* Flags for twisting kernel behavior. */
> +struct twist_flags {
> + bool disable_kbd_k_spec_handler;
> + bool disable_reboot_request;
> +};
> +extern struct twist_flags twist_flags;

Why all these options have to be in a single structure?

> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 498d344ea53a..41cfabc74ad7 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -2338,4 +2338,9 @@ config HYPERV_TESTING
>
> endmenu # "Kernel Testing and Coverage"
>
> +menuconfig DEFAULT_TWIST_FLAGS
> + string "Default twist options (DANGEROUS)"
> + help
> + Don't specify anything unless you know what you are doing.
> +
> endmenu # Kernel hacking

Why such a crazy build configure option?

I think that the only way to get this upstream is to:

+ Add separate boot options that might theoretically be used also
by other people.

+ Use boot parameters and not build configuration.

+ Avoid the meaningless word "twist" !!!

Best Regards,
Petr

2020-06-03 13:37:27

[permalink] [raw]

Subject: Re: twist: allow disabling reboot request

On 2020/06/03 21:44, Petr Mladek wrote:
> On Wed 2020-06-03 20:03:28, Tetsuo Handa wrote:
>> On 2020/05/29 22:26, Tetsuo Handa wrote:
>>> By the way, I do worry that people forget to perform these steps when they do
>>> their tests without asking syzbot...
>>
>> Here is a draft of boot-time switching. Since kconfig can handle string variable up to
>> 2048 characters, we could hold the content of the "your-config" file inside .config file
>> in order to avoid relying on external file in "syzkaller tree". But since only one kconfig
>> option is used, basically the way to temporarily include/exclude specific options (under
>> automated testing by syzbot) seems to remain directly patching apply_twist_flags(), for
>> https://github.com/google/syzkaller/blob/master/dashboard/config/util.sh will automatically
>> overwrite CONFIG_DEFAULT_TWIST_FLAGS settings. If each twist flag were using independent
>> kconfig option, the way to temporarily include/exclude specific options will become directly
>> patching Kconfig file.
>>
>> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
>> index 82d91547d122..78fdbb4f17b1 100644
>> --- a/include/linux/kernel.h
>> +++ b/include/linux/kernel.h
>> @@ -1038,4 +1038,12 @@ static inline void ftrace_dump(enum ftrace_dump_mode oops_dump_mode) { }
>> /* OTHER_WRITABLE? Generally considered a bad idea. */ \
>> BUILD_BUG_ON_ZERO((perms) & 2) + \
>> (perms))
>> +
>> +/* Flags for twisting kernel behavior. */
>> +struct twist_flags {
>> + bool disable_kbd_k_spec_handler;
>> + bool disable_reboot_request;
>> +};
>> +extern struct twist_flags twist_flags;
>
>
> Why all these options have to be in a single structure?

There will be many options (maybe some dozens).
Do we really want to expose so many options individually?

(If these options were build-time configuration, we won't need
this structure at all.)

>
>
>> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
>> index 498d344ea53a..41cfabc74ad7 100644
>> --- a/lib/Kconfig.debug
>> +++ b/lib/Kconfig.debug
>> @@ -2338,4 +2338,9 @@ config HYPERV_TESTING
>>
>> endmenu # "Kernel Testing and Coverage"
>>
>> +menuconfig DEFAULT_TWIST_FLAGS
>> + string "Default twist options (DANGEROUS)"
>> + help
>> + Don't specify anything unless you know what you are doing.
>> +
>> endmenu # Kernel hacking
>
> Why such a crazy build configure option?
>
> I think that the only way to get this upstream is to:
>
> + Add separate boot options that might theoretically be used also
> by other people.

Like you said

I am afraid that many of them could not be normal options. They change or
break some behavior that is necessary by seriously used system.

these options are meant for helping fuzzers to find bugs while protecting
the kernel from legitimate-but-stupid requests from fuzzers. Other people
can include projects other than syzbot, but basically only useful for
debugging projects. (And making these options boot-time configuration
increases garbage/overhead for non-debugging usage.)

>
> + Use boot parameters and not build configuration.

That sounds like a very tight restriction for syzbot. Relying on external
files breaks reproducibility; people can fail to specify intended options.
Saving intended options into the .config file is the most robust/reliable
approach.

>
> + Avoid the meaningless word "twist" !!!

Then, what do you suggest? I think that we need some keyword for grouping.
https://lkml.kernel.org/r/[email protected]

>
>
> Best Regards,
> Petr
>

2020-06-04 10:26:26

[permalink] [raw]

Subject: Re: twist: allow disabling reboot request

On Wed 2020-06-03 22:35:02, Tetsuo Handa wrote:
> On 2020/06/03 21:44, Petr Mladek wrote:
> > On Wed 2020-06-03 20:03:28, Tetsuo Handa wrote:
> >> On 2020/05/29 22:26, Tetsuo Handa wrote:
> >>> By the way, I do worry that people forget to perform these steps when they do
> >>> their tests without asking syzbot...
> >>
> >> Here is a draft of boot-time switching. Since kconfig can handle string variable up to
> >> 2048 characters, we could hold the content of the "your-config" file inside .config file
> >> in order to avoid relying on external file in "syzkaller tree". But since only one kconfig
> >> option is used, basically the way to temporarily include/exclude specific options (under
> >> automated testing by syzbot) seems to remain directly patching apply_twist_flags(), for
> >> https://github.com/google/syzkaller/blob/master/dashboard/config/util.sh will automatically
> >> overwrite CONFIG_DEFAULT_TWIST_FLAGS settings. If each twist flag were using independent
> >> kconfig option, the way to temporarily include/exclude specific options will become directly
> >> patching Kconfig file.
> >>
> >> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> >> index 82d91547d122..78fdbb4f17b1 100644
> >> --- a/include/linux/kernel.h
> >> +++ b/include/linux/kernel.h
> >> @@ -1038,4 +1038,12 @@ static inline void ftrace_dump(enum ftrace_dump_mode oops_dump_mode) { }
> >> /* OTHER_WRITABLE? Generally considered a bad idea. */ \
> >> BUILD_BUG_ON_ZERO((perms) & 2) + \
> >> (perms))
> >> +
> >> +/* Flags for twisting kernel behavior. */
> >> +struct twist_flags {
> >> + bool disable_kbd_k_spec_handler;
> >> + bool disable_reboot_request;
> >> +};
> >> +extern struct twist_flags twist_flags;
> >
> >
> > Why all these options have to be in a single structure?
>
> There will be many options (maybe some dozens).
> Do we really want to expose so many options individually?
>
> (If these options were build-time configuration, we won't need
> this structure at all.)
>
> >
> >
> >> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> >> index 498d344ea53a..41cfabc74ad7 100644
> >> --- a/lib/Kconfig.debug
> >> +++ b/lib/Kconfig.debug
> >> @@ -2338,4 +2338,9 @@ config HYPERV_TESTING
> >>
> >> endmenu # "Kernel Testing and Coverage"
> >>
> >> +menuconfig DEFAULT_TWIST_FLAGS
> >> + string "Default twist options (DANGEROUS)"
> >> + help
> >> + Don't specify anything unless you know what you are doing.
> >> +
> >> endmenu # Kernel hacking
> >
> > Why such a crazy build configure option?
> >
> > I think that the only way to get this upstream is to:
> >
> > + Add separate boot options that might theoretically be used also
> > by other people.
>
> Like you said
>
> I am afraid that many of them could not be normal options. They change or
> break some behavior that is necessary by seriously used system.
>
> these options are meant for helping fuzzers to find bugs while protecting
> the kernel from legitimate-but-stupid requests from fuzzers. Other people
> can include projects other than syzbot, but basically only useful for
> debugging projects. (And making these options boot-time configuration
> increases garbage/overhead for non-debugging usage.)

There were suggestions that some switches might be useful in general.
You should start with them.

Anyway, it still sounds to me like another lockdown mode. I think that
the relation with lockdown has already been discussed. But I do not
remember if it was refused.

> > + Use boot parameters and not build configuration.
>
> That sounds like a very tight restriction for syzbot. Relying on external
> files breaks reproducibility; people can fail to specify intended options.
> Saving intended options into the .config file is the most robust/reliable
> approach.

IMHO, it is not strict restriction. The motivation behind the boot
options is:

+ Create widely useful behavior switches when possible instead of
crazy hacks all over the kernel code.

+ The changes will modify the existing behavior. Build
configuration works only when the kernel will be used
for well defined purpose.

Boot option is better for (distribution) kernels that are used by
many different users. The person that builds the kernel does not
know what behavior would different users prefer.

You might argue that syzbot is a well defined user. But this goes
back to the first motivation to created general purpose options
when possible.

> > + Avoid the meaningless word "twist" !!!
>
> Then, what do you suggest? I think that we need some keyword for grouping.
> https://lkml.kernel.org/r/[email protected]

Please, start will single option and you will see how they are
acceptable for the affected subsystem. You could always group them
later.

Anyway, the word "twist" is meaning less. It inspires people to create
hacks that have undefined effects.

IMHO, lockdown would be better but it is already used. Anyway, I
suggest start with independent options for the begining.

Best Regards,
Petr

PS: This is most likely my last reply in this thread. I feel that I am
just repeating all the already mentioned ideas just by other words.

2020-06-08 07:52:26

by Dmitry Vyukov

[permalink] [raw]

Subject: Re: [PATCH v2] twist: allow converting pr_devel()/pr_debug() into snprintf()

On Fri, May 29, 2020 at 3:27 PM Tetsuo Handa
<[email protected]> wrote:
>
> Hello, Dmitry.
>
> Linus is asking me to avoid build-time switching based on kernel config options,
> and is suggesting me to use boot-time switching based on boot-config file feature
> (which is available since 5.6). I have several concerns about use of boot-config file
> feature in syzkaller.
>
> (1) To use boot-config file, syzkaller will need to add "bootconfig" option
> to the kernel command line. This will be doable by patching
> https://github.com/google/syzkaller/tree/master/dashboard/config/ *.cmdline
> files.

Hello Tetsuo,

Yes, command line arguments are easily changeable. Please send pull
requests to syzkaller, if you want to change something.

> (2) The boot-config file is embedded into initramfs file. Since syzkaller builds
> kernels with almost-allyesconfig, booting syzkaller kernels do not require
> initramfs for loading kernel modules needed for mounting the root partition.
> In fact, according to "unexpected kernel reboot" report which contains boot messages,
> I can't find "Unpacking initramfs..." message. It seems that syzkaller kernels do not
> use initramfs file.
>
> Is it possible for syzkaller to enforce performing steps for creating an initramfs,
> embedding the boot-config file
> ( https://www.kernel.org/doc/html/latest/admin-guide/bootconfig.html#boot-kernel-with-a-boot-config),
> and loading that initramfs whenever booting the syzkaller kernels?
> By the way, I do worry that people forget to perform these steps when they do
> their tests without asking syzbot...

I think we have some confusion between syzkaller and syzbot here.
syzkaller itself does not enforce/require any kernel configuration,
hardware nor use or not use of initramfs. In fact, qemu VM type
supports initramfs today. Or syzkaller can work with bare machines
where all setup is up to the user.
syzbot is just one deployment of syzkaller with a particular
configuration/hardware.

If this feature is useful for any linux kernel fuzzing, then we need
to have a plan for all users and setups.

And, yes, an additional context is kernel developers reproducing bugs.
Not all of them may be happy about any additional steps, nor will
follow them.

Answering your question, syzkaller can do some sanity checking of the
provided machine/kernel and reject working with it. If you tell me
what exactly needs to be checked, I can think where this code should
go.
However, again, I am not sure if one is using, say, Android phones and
they don't envision use of initramfs, then what?

For syzbot, the build happens here:
https://github.com/google/syzkaller/blob/7751efd04aebb07bc82b5c0e8eeaca07be1ae112/pkg/build/linux.go#L72
I don't know if initramfs is supported with GCE machines and what
exactly is required.

> (3) Since syzkaller keeps track of "kernel tree", "commit of the kernel tree", and
> "commit of the syzkaller tree" in order to guarantee reproducibility, it would be
> possible to identify the "your-config" file used for tools/bootconfig/bootconfig
> command. But since "#syz test" command currently accepts only "kernel tree" and
> "commit of the kernel tree" arguments, we might fail to use intended "your-config"
> file when doing reproducibility test. Can syzbot solve this concern?

Most likely it's possible.

> (4) Of course, "your-config" file would not change so frequently, but "#syz test" command
> relies on external file in "syzkaller tree" makes it impossible to try different
> configuration when I have to ask syzbot to test. (Since I don't have hardware which
> syzbot is reporting problems, I have to ask syzbot when I can't reproduce the problem
> in my environment.)
>
> https://syzkaller.appspot.com/text?tag=Patch&x=135f254a100000 is an example of
> need to enforce CONFIG_DEBUG_INFO_BTF=n in order to workaround build failure during
> "#syz test" command. If we bring "which twist options should be enabled" to an external
> boot-config file, I can't ask syzbot to try different twist options (except directly
> patching the kernel source which handles "which twist options should be enabled").
> Can syzbot solve this concern?

The CONFIG_DEBUG_INFO_BTF relates to build config. This can't be
solved during boot, right? So what is the relation?

> (5) Anything else?

Reading:
https://www.kernel.org/doc/html/latest/admin-guide/bootconfig.html#boot-kernel-with-a-boot-config
It seems that boot config is just a more complex way to provide
command line arguments. syzbot already supports command line
arguments, and it looks much simpler and no additional work required.
Why do we want to use boot config?

Next quarter we will be additionally busy with interns, so I can't
promise any time availability for syzbot improvements. But pull
requests are welcome.

2020-06-08 10:32:51