2023-01-05 06:03:43

by Yujie Liu

[permalink] [raw]
Subject: [niks:has_ioport_v3] [tty] aa0652d7f1: BUG:kernel_NULL_pointer_dereference,address

Greeting,

FYI, we noticed BUG:kernel_NULL_pointer_dereference,address due to commit (built with clang-14):

commit: aa0652d7f1b311e55232a8153522fdaaba0f197a ("tty: serial: handle HAS_IOPORT dependencies")
https://git.kernel.org/cgit/linux/kernel/git/niks/linux.git has_ioport_v3

in testcase: boot

on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


[ 2.166733][ T0] calling univ8250_console_init+0x0/0x30 @ 0
[ 2.167555][ T0] BUG: kernel NULL pointer dereference, address: 00000000
[ 2.168429][ T0] #PF: supervisor read access in kernel mode
[ 2.169188][ T0] #PF: error_code(0x0000) - not-present page
[ 2.169909][ T0] *pde = 00000000
[ 2.170361][ T0] Oops: 0000 [#1]
[ 2.170799][ T0] CPU: 0 PID: 0 Comm: swapper Not tainted 6.2.0-rc2-00034-gaa0652d7f1b3 #1 06b22a6623100bbcf5eb4bbd76b0938141d59f7b
[ 2.172282][ T0] EIP: 0x0
[ 2.172633][ T0] Code: Unable to access opcode bytes at 0xffffffd6.
[ 2.173392][ T0] EAX: c3b6e098 EBX: 00000000 ECX: 00000000 EDX: 00000001
[ 2.174255][ T0] ESI: c3b6e098 EDI: 00000000 EBP: c2c3de7c ESP: c2c3de50
[ 2.175138][ T0] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00210046
[ 2.176087][ T0] CR0: 80050033 CR2: ffffffd6 CR3: 0341f000 CR4: 00040690
[ 2.176967][ T0] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 2.177850][ T0] DR6: fffe0ff0 DR7: 00000400
[ 2.178422][ T0] Call Trace:
[ 2.178805][ T0] serial8250_do_set_termios+0x311/0x540
[ 2.179477][ T0] serial8250_set_termios+0x17/0x20
[ 2.180120][ T0] ? serial8250_shutdown+0x20/0x20
[ 2.180729][ T0] uart_set_options+0x153/0x180
[ 2.181310][ T0] serial8250_console_setup+0x128/0x160
[ 2.181985][ T0] univ8250_console_setup+0x52/0x70
[ 2.182625][ T0] ? univ8250_console_exit+0x40/0x40
[ 2.183253][ T0] try_enable_preferred_console+0xab/0xe0
[ 2.183931][ T0] register_console+0xe2/0x470
[ 2.184501][ T0] ? serial8250_isa_init_ports+0xa1/0x1a0
[ 2.185199][ T0] ? earlycon_print_info+0x90/0x90
[ 2.185836][ T0] univ8250_console_init+0x22/0x30
[ 2.186462][ T0] console_init+0xb7/0x1b0
[ 2.187002][ T0] start_kernel+0x1f6/0x380
[ 2.187562][ T0] i386_start_kernel+0x218/0x220
[ 2.188155][ T0] startup_32_smp+0x151/0x160
[ 2.188714][ T0] Modules linked in:
[ 2.189185][ T0] CR2: 0000000000000000
[ 2.189667][ T0] ---[ end trace 0000000000000000 ]---
[ 2.190352][ T0] EIP: 0x0
[ 2.190737][ T0] Code: Unable to access opcode bytes at 0xffffffd6.
[ 2.191577][ T0] EAX: c3b6e098 EBX: 00000000 ECX: 00000000 EDX: 00000001
[ 2.192461][ T0] ESI: c3b6e098 EDI: 00000000 EBP: c2c3de7c ESP: c2c3de50
[ 2.193341][ T0] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00210046
[ 2.194285][ T0] CR0: 80050033 CR2: ffffffd6 CR3: 0341f000 CR4: 00040690
[ 2.195170][ T0] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 2.196100][ T0] DR6: fffe0ff0 DR7: 00000400
[ 2.196652][ T0] Kernel panic - not syncing: Fatal exception


If you fix the issue, kindly add following tag
| Reported-by: kernel test robot <[email protected]>
| Link: https://lore.kernel.org/oe-lkp/[email protected]


To reproduce:

# build kernel
cd linux
cp config-6.2.0-rc2-00034-gaa0652d7f1b3 .config
make HOSTCC=clang-14 CC=clang-14 ARCH=i386 olddefconfig prepare modules_prepare bzImage modules
make HOSTCC=clang-14 CC=clang-14 ARCH=i386 INSTALL_MOD_PATH=<mod-install-dir> modules_install
cd <mod-install-dir>
find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz


git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email

# if come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests


Attachments:
(No filename) (4.14 kB)
config-6.2.0-rc2-00034-gaa0652d7f1b3 (167.70 kB)
job-script (4.72 kB)
dmesg.xz (4.22 kB)
Download all attachments

2023-01-05 08:39:39

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [niks:has_ioport_v3] [tty] aa0652d7f1: BUG:kernel_NULL_pointer_dereference,address

On Thu, Jan 5, 2023, at 06:54, kernel test robot wrote:
> Greeting,
>
> FYI, we noticed BUG:kernel_NULL_pointer_dereference,address due to
> commit (built with clang-14):
>
> commit: aa0652d7f1b311e55232a8153522fdaaba0f197a ("tty: serial: handle
> HAS_IOPORT dependencies")
> https://git.kernel.org/cgit/linux/kernel/git/niks/linux.git
> has_ioport_v3
>
> in testcase: boot
>
> on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G
>
> caused below changes (please refer to attached dmesg/kmsg for entire
> log/backtrace):
>
>
> [ 2.166733][ T0] calling univ8250_console_init+0x0/0x30 @ 0
> [ 2.167555][ T0] BUG: kernel NULL pointer dereference, address:
> 00000000

I think it's this bit:

@@ -508,12 +523,13 @@ static void set_io_from_upio(struct uart_port *p)
up->dl_read = au_serial_dl_read;
up->dl_write = au_serial_dl_write;
break;
-#endif
-
+#ifdef CONFIG_HAS_IOPORT
default:
p->serial_in = io_serial_in;
p->serial_out = io_serial_out;
break;
+#endif
+#endif
}
/* Remember loaded iotype */
up->cur_iotype = p->iotype;


which puts the 'default' case inside of '#ifdef
CONFIG_SERIAL_8250_RT288X'. x86 does not use the
RT288x variant but relies on the default, so any
call to io_serial_{in,out} will cause a NULL
pointer dereference.

Arnd

2023-01-09 09:55:52

by Niklas Schnelle

[permalink] [raw]
Subject: Re: [niks:has_ioport_v3] [tty] aa0652d7f1: BUG:kernel_NULL_pointer_dereference,address

On Thu, 2023-01-05 at 09:03 +0100, Arnd Bergmann wrote:
> On Thu, Jan 5, 2023, at 06:54, kernel test robot wrote:
> > Greeting,
> >
> > FYI, we noticed BUG:kernel_NULL_pointer_dereference,address due to
> > commit (built with clang-14):
> >
> > commit: aa0652d7f1b311e55232a8153522fdaaba0f197a ("tty: serial: handle
> > HAS_IOPORT dependencies")
> > https://git.kernel.org/cgit/linux/kernel/git/niks/linux.git
> > has_ioport_v3
> >
> > in testcase: boot
> >
> > on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G
> >
> > caused below changes (please refer to attached dmesg/kmsg for entire
> > log/backtrace):
> >
> >
> > [ 2.166733][ T0] calling univ8250_console_init+0x0/0x30 @ 0
> > [ 2.167555][ T0] BUG: kernel NULL pointer dereference, address:
> > 00000000
>
> I think it's this bit:
>
> @@ -508,12 +523,13 @@ static void set_io_from_upio(struct uart_port *p)
> up->dl_read = au_serial_dl_read;
> up->dl_write = au_serial_dl_write;
> break;
> -#endif
> -
> +#ifdef CONFIG_HAS_IOPORT
> default:
> p->serial_in = io_serial_in;
> p->serial_out = io_serial_out;
> break;
> +#endif
> +#endif
> }
> /* Remember loaded iotype */
> up->cur_iotype = p->iotype;
>
>
> which puts the 'default' case inside of '#ifdef
> CONFIG_SERIAL_8250_RT288X'. x86 does not use the
> RT288x variant but relies on the default, so any
> call to io_serial_{in,out} will cause a NULL
> pointer dereference.
>
> Arnd

Thanks for looking into it Arnd, your reasoning makes sense to me, I'll
try to look into this and test this case before I sent out the v3 of
the HAS_IOPORT patches. I still also have a few nitpicks from Bjorn,
mostly about the commit messages, to work in. Hope to do that soon but
got a few things going on at the moment.

2023-03-08 11:26:05

by Niklas Schnelle

[permalink] [raw]
Subject: Re: [niks:has_ioport_v3] [tty] aa0652d7f1: BUG:kernel_NULL_pointer_dereference,address

On Thu, 2023-01-05 at 09:03 +0100, Arnd Bergmann wrote:
> On Thu, Jan 5, 2023, at 06:54, kernel test robot wrote:
> > Greeting,
> >
> > FYI, we noticed BUG:kernel_NULL_pointer_dereference,address due to
> > commit (built with clang-14):
> >
> > commit: aa0652d7f1b311e55232a8153522fdaaba0f197a ("tty: serial: handle
> > HAS_IOPORT dependencies")
> > https://git.kernel.org/cgit/linux/kernel/git/niks/linux.git
> > has_ioport_v3
> >
> > in testcase: boot
> >
> > on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G
> >
> > caused below changes (please refer to attached dmesg/kmsg for entire
> > log/backtrace):
> >
> >
> > [ 2.166733][ T0] calling univ8250_console_init+0x0/0x30 @ 0
> > [ 2.167555][ T0] BUG: kernel NULL pointer dereference, address:
> > 00000000
>
> I think it's this bit:
>
> @@ -508,12 +523,13 @@ static void set_io_from_upio(struct uart_port *p)
> up->dl_read = au_serial_dl_read;
> up->dl_write = au_serial_dl_write;
> break;
> -#endif
> -
> +#ifdef CONFIG_HAS_IOPORT
> default:
> p->serial_in = io_serial_in;
> p->serial_out = io_serial_out;
> break;
> +#endif
> +#endif
> }
> /* Remember loaded iotype */
> up->cur_iotype = p->iotype;
>
>
> which puts the 'default' case inside of '#ifdef
> CONFIG_SERIAL_8250_RT288X'. x86 does not use the
> RT288x variant but relies on the default, so any
> call to io_serial_{in,out} will cause a NULL
> pointer dereference.
>
> Arnd

Yes that makes sense, it's clearly not correct to put the default case
inside CONFIG_SERIAL_8250_RT288X. What do you think about going with
something like:

@@ -519,9 +534,14 @@ static void set_io_from_upio(struct uart_port *p)
#endif

default:
+#ifdef CONFIG_HAS_IOPORT
p->serial_in = io_serial_in;
p->serial_out = io_serial_out;
break;
+#else
+ WARN(1, "Unsupported UART type \"io\"\n");
+ return;
+#endif
}

I've pushed a version with the above change rebased on v6.3-rc1 to my
git.kernel.org repository and will do some more testing before I can
hopefully send this out for review and make some progress on this.
Meanwhile the original problem is now the only thing preventing clean
Werror builds on clang for s390 as far as I understand.

Thanks,
Niklas

2023-03-08 12:23:45

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [niks:has_ioport_v3] [tty] aa0652d7f1: BUG:kernel_NULL_pointer_dereference,address

On Wed, Mar 8, 2023, at 12:24, Niklas Schnelle wrote:
> On Thu, 2023-01-05 at 09:03 +0100, Arnd Bergmann wrote:
>
> Yes that makes sense, it's clearly not correct to put the default case
> inside CONFIG_SERIAL_8250_RT288X. What do you think about going with
> something like:
>
> @@ -519,9 +534,14 @@ static void set_io_from_upio(struct uart_port *p)
> #endif
>
> default:
> +#ifdef CONFIG_HAS_IOPORT
> p->serial_in = io_serial_in;
> p->serial_out = io_serial_out;
> break;
> +#else
> + WARN(1, "Unsupported UART type \"io\"\n");
> + return;
> +#endif
> }

I think we have to ensure that ->serial_in() always points
to some function that doesn't immediately panic, though that
could be an empty dummy like

default:
p->serial_in = IS_ENABLED(CONFIG_HAS_IOPORT) ?
io_serial_in : no_serial_in;
p->serial_out = IS_ENABLED(CONFIG_HAS_IOPORT) ?
io_serial_out : no_serial_out;

Ideally we'd make mem_serial_in() the default function
and only use io_serial_in() when UPIO_PORT is selected,
but that still causes a NULL pointer dereference when
a platform initializes a 8250 like

static struct plat_serial8250_port serial_platform_data[] = {
{
.iobase = 0x3f8, /* NULL pointer */
.irq = IRQ_ISA_UART,
.uartclk = 1843200,
/* default .iotype = UPIO_PORT, */
},

so I think an empty function plus a warning is best here.

> I've pushed a version with the above change rebased on v6.3-rc1 to my
> git.kernel.org repository and will do some more testing before I can
> hopefully send this out for review and make some progress on this.
> Meanwhile the original problem is now the only thing preventing clean
> Werror builds on clang for s390 as far as I understand.

Thanks a lot! Let's hope we can manage to get this merged at last.

Arnd

2023-03-08 14:07:30

by Niklas Schnelle

[permalink] [raw]
Subject: Re: [niks:has_ioport_v3] [tty] aa0652d7f1: BUG:kernel_NULL_pointer_dereference,address

On Wed, 2023-03-08 at 13:21 +0100, Arnd Bergmann wrote:
> On Wed, Mar 8, 2023, at 12:24, Niklas Schnelle wrote:
> > On Thu, 2023-01-05 at 09:03 +0100, Arnd Bergmann wrote:
> >
> > Yes that makes sense, it's clearly not correct to put the default case
> > inside CONFIG_SERIAL_8250_RT288X. What do you think about going with
> > something like:
> >
> > @@ -519,9 +534,14 @@ static void set_io_from_upio(struct uart_port *p)
> > #endif
> >
> > default:
> > +#ifdef CONFIG_HAS_IOPORT
> > p->serial_in = io_serial_in;
> > p->serial_out = io_serial_out;
> > break;
> > +#else
> > + WARN(1, "Unsupported UART type \"io\"\n");
> > + return;
> > +#endif
> > }
>
> I think we have to ensure that ->serial_in() always points
> to some function that doesn't immediately panic, though that
> could be an empty dummy like
>
> default:
> p->serial_in = IS_ENABLED(CONFIG_HAS_IOPORT) ?
> io_serial_in : no_serial_in;
> p->serial_out = IS_ENABLED(CONFIG_HAS_IOPORT) ?
> io_serial_out : no_serial_out;

Sadly the IS_ENABLED() plus ternary still gives me an undeclared
identifier error for io_serial_in(). So I think we need the more ugly
#ifdef. With that I hope it would then not crash even if one might be
left without any console at all.

>
> Ideally we'd make mem_serial_in() the default function
> and only use io_serial_in() when UPIO_PORT is selected,
> but that still causes a NULL pointer dereference when
> a platform initializes a 8250 like
>
> static struct plat_serial8250_port serial_platform_data[] = {
> {
> .iobase = 0x3f8, /* NULL pointer */
> .irq = IRQ_ISA_UART,
> .uartclk = 1843200,
> /* default .iotype = UPIO_PORT, */
> },
>
> so I think an empty function plus a warning is best here.

So in the above case .iotype is implicitly set to 0 which is UPIO_PORT
so I think one could argue it is selected, no? Not sure how picking
UPIO_MEM as default would look like then. One thing we could do though
is make the switch/case more regular like so:

...
#ifdef CONFIG_HAS_IOPORT
case UPIO_PORT:
p->serial_in = io_serial_in;
p->serial_out = io_serial_out;
break;
#endif

default:
WARN(1, "Unsupported UART type %x\n", p->iotype);
p->serial_in = no_serial_in;
p->serial_out = no_serial_out;
}
...

That way we would have to always define no_serial_in() /
no_serial_out() but would also gracefully handle when p->iotype is set
to some other value and it looks relatively clean.