2017-12-14 00:54:23

by Laura Abbott

[permalink] [raw]
Subject: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

Hi,

Fedora got a bug report https://bugzilla.redhat.com/show_bug.cgi?id=1520982
of a boot failure/bug on Linus' master (full bootlog at the bugzilla)

WARNING: CPU: 3 PID: 3486 at block/genhd.c:680 device_add_disk+0x3d9/0x460
Modules linked in: intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp qcaux snd_usb_audio snd_usbmidi_lib coretemp floppy(+) snd_rawmidi snd_seq_device cdc_acm kvm_intel kvm irqbypass iTCO_wdt iTCO_vendor_support mei_wdt intel_wmi_thunderbolt intel_cstate intel_uncore intel_rapl_perf dcdbas snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic dell_smm_hwmon i2c_i801 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep lpc_ich mei_me mei wmi shpchp target_core_mod snd_pcm_oss snd_mixer_oss binfmt_misc dm_crypt raid1 radeon i2c_algo_bit drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel ttm ghash_clmulni_intel drm e1000e ptp pps_core snd_pcm snd_timer snd soundcore analog gameport joydev
CPU: 3 PID: 3486 Comm: mdadm Not tainted 4.15.0-0.rc2.git0.1.fc28.x86_64 #1
Hardware name: Dell Inc. Precision Tower 5810/0WR1RF, BIOS A07 04/14/2015
task: 00000000e8461579 task.stack: 00000000bfe85ee4
RIP: 0010:device_add_disk+0x3d9/0x460
RSP: 0018:ffffb42783b37b30 EFLAGS: 00010282
RAX: 00000000fffffff4 RBX: ffff952df829b000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 000000000001f040 RDI: 00000000000001ff
RBP: ffff952df829b070 R08: ffff952df6bb2d60 R09: 00000001820001ff
R10: 0000000000000001 R11: 0000000000001401 R12: 0000000000000000
R13: ffff952df829b00c R14: 0000000000000009 R15: ffff952df829b000
FS: 00007fd492882740(0000) GS:ffff952e1fd80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fd4921a95b0 CR3: 0000000837ecf001 CR4: 00000000001606e0
Call Trace:
? pm_runtime_init+0xa0/0xc0
md_alloc+0x1a8/0x360
md_probe+0x15/0x20
kobj_lookup+0x100/0x150
? md_alloc+0x360/0x360
get_gendisk+0x29/0x110
blkdev_get+0x61/0x2f0
? bd_acquire+0xb0/0xb0
? bd_acquire+0xb0/0xb0
do_dentry_open+0x1b1/0x2d0
? security_inode_permission+0x3c/0x50
path_openat+0x602/0x14e0
do_filp_open+0x9b/0x110
? __check_object_size+0xaf/0x1b0
? do_sys_open+0x1bd/0x250
do_sys_open+0x1bd/0x250
do_syscall_64+0x61/0x170
entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x7fd492234a5e
RSP: 002b:00007fff5d59e9f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 0000000000004082 RCX: 00007fd492234a5e
RDX: 0000000000004082 RSI: 00007fff5d59ea80 RDI: 00000000ffffff9c
RBP: 00007fff5d59ea80 R08: 00007fff5d59ea80 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 000000000000007c R14: 00007fff5d59eae0 R15: 00007fff5d59eb68
Code: 48 83 c6 10 e8 19 08 f0 ff 85 c0 0f 84 d6 fd ff ff 0f ff e9 cf fd ff ff 80 a3 bc 00 00 00 ef e9 c3 fd ff ff 0f ff e9 d8 fd ff ff <0f> ff e9 ba fe ff ff 31 f6 48 89 df e8 36 ec ff ff 48 85 c0 48
---[ end trace 9590c1ef4c38eb03 ]---
BUG: unable to handle kernel NULL pointer dereference at 0000000054605537
IP: sysfs_do_create_link_sd.isra.2+0x2f/0xb0
PGD 0 P4D 0
Oops: 0000 [#1] SMP
Modules linked in: intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp qcaux snd_usb_audio snd_usbmidi_lib coretemp floppy(+) snd_rawmidi snd_seq_device cdc_acm kvm_intel kvm irqbypass iTCO_wdt iTCO_vendor_support mei_wdt intel_wmi_thunderbolt intel_cstate intel_uncore intel_rapl_perf dcdbas snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic dell_smm_hwmon i2c_i801 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep lpc_ich mei_me mei wmi shpchp target_core_mod snd_pcm_oss snd_mixer_oss binfmt_misc dm_crypt raid1 radeon i2c_algo_bit drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel ttm ghash_clmulni_intel drm e1000e ptp pps_core snd_pcm snd_timer snd soundcore analog gameport joydev
CPU: 3 PID: 3486 Comm: mdadm Tainted: G W 4.15.0-0.rc2.git0.1.fc28.x86_64 #1
Hardware name: Dell Inc. Precision Tower 5810/0WR1RF, BIOS A07 04/14/2015
task: 00000000e8461579 task.stack: 00000000bfe85ee4
RIP: 0010:sysfs_do_create_link_sd.isra.2+0x2f/0xb0
RSP: 0018:ffffb42783b37b00 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000040 RCX: 0000000000000001
RDX: 0000000000000001 RSI: 0000000000000040 RDI: ffffffffbb613b0c
RBP: ffffffffbaca3577 R08: 0000000800000000 R09: 00000008ffffffff
R10: fffff9efe0e8ca00 R11: fffff9efe0d77001 R12: 0000000000000001
R13: ffff952df6f45110 R14: 0000000000000009 R15: ffff952df829b000
FS: 00007fd492882740(0000) GS:ffff952e1fd80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000040 CR3: 0000000837ecf001 CR4: 00000000001606e0
Call Trace:
device_add_disk+0x3b7/0x460
md_alloc+0x1a8/0x360
md_probe+0x15/0x20
kobj_lookup+0x100/0x150
? md_alloc+0x360/0x360
get_gendisk+0x29/0x110
blkdev_get+0x61/0x2f0
? bd_acquire+0xb0/0xb0
? bd_acquire+0xb0/0xb0
do_dentry_open+0x1b1/0x2d0
? security_inode_permission+0x3c/0x50
path_openat+0x602/0x14e0
do_filp_open+0x9b/0x110
? __check_object_size+0xaf/0x1b0
? do_sys_open+0x1bd/0x250
do_sys_open+0x1bd/0x250
do_syscall_64+0x61/0x170
entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x7fd492234a5e
RSP: 002b:00007fff5d59e9f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 0000000000004082 RCX: 00007fd492234a5e
RDX: 0000000000004082 RSI: 00007fff5d59ea80 RDI: 00000000ffffff9c
RBP: 00007fff5d59ea80 R08: 00007fff5d59ea80 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 000000000000007c R14: 00007fff5d59eae0 R15: 00007fff5d59eb68
Code: 00 48 85 d2 41 56 41 55 41 54 55 53 74 76 48 85 ff 74 71 48 89 f3 49 89 fd 48 c7 c7 0c 3b 61 bb 41 89 cc 48 89 d5 e8 31 88 58 00 <48> 8b 1b 48 85 db 74 3c 48 89 df e8 71 c7 ff ff c6 05 36 bc 30
RIP: sysfs_do_create_link_sd.isra.2+0x2f/0xb0 RSP: ffffb42783b37b00
CR2: 0000000000000040
Dec 05 08:24:29 cerberus.csd.uwm.edu kernel: ---[ end trace 9590c1ef4c38eb04 ]---

The reporter did a bisect and found the bad commit to be

commit a0747a859ef6d3cc5b6cd50eb694499b78dd0025
Author: weiping zhang <[email protected]>
Date: Fri Nov 17 23:06:04 2017 +0800

bdi: add error handle for bdi_debug_register

In order to make error handle more cleaner we call bdi_debug_register
before set state to WB_registered, that we can avoid call bdi_unregister
in release_bdi().

Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: weiping zhang <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

Any ideas?

Thanks,
Laura


2017-12-14 08:34:37

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Wed, Dec 13, 2017 at 16:54:17 -0800,
Laura Abbott <[email protected]> wrote:
>Hi,
>
>Fedora got a bug report https://bugzilla.redhat.com/show_bug.cgi?id=1520982
>of a boot failure/bug on Linus' master (full bootlog at the bugzilla)

I'm available for testing. The problem happens on my x86_64 Dell Workstation,
but not an old i386 server or an x86_64 mac hardware based laptop.

2017-12-14 10:24:51

by Weiping Zhang

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Thu, Dec 14, 2017 at 02:24:52AM -0600, Bruno Wolff III wrote:
> On Wed, Dec 13, 2017 at 16:54:17 -0800,
> Laura Abbott <[email protected]> wrote:
> >Hi,
> >
> >Fedora got a bug report https://bugzilla.redhat.com/show_bug.cgi?id=1520982
> >of a boot failure/bug on Linus' master (full bootlog at the bugzilla)
>
> I'm available for testing. The problem happens on my x86_64 Dell
> Workstation, but not an old i386 server or an x86_64 mac hardware
> based laptop.

Hi,

It seems something wrong with bdi debugfs register, could you help
test the forllowing debug patch, I add some debug log, no function
change, thanks.


>From d2728c07589e8b83115a51e0c629451bff7308db Mon Sep 17 00:00:00 2001
From: weiping zhang <[email protected]>
Date: Thu, 14 Dec 2017 17:56:22 +0800
Subject: [PATCH] bdi debugfs

Signed-off-by: weiping zhang <[email protected]>
---
mm/backing-dev.c | 28 +++++++++++++++++++++++++---
1 file changed, 25 insertions(+), 3 deletions(-)

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 84b2dc7..fbbb9a6 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -39,6 +39,10 @@ static struct dentry *bdi_debug_root;
static void bdi_debug_init(void)
{
bdi_debug_root = debugfs_create_dir("bdi", NULL);
+ if (!bdi_debug_root)
+ pr_err("DEBUG:bdi_debug_root fail\n");
+ else
+ pr_err("DEBUG:bdi_debug_root success\n");
}

static int bdi_debug_stats_show(struct seq_file *m, void *v)
@@ -115,18 +119,29 @@ static const struct file_operations bdi_debug_stats_fops = {

static int bdi_debug_register(struct backing_dev_info *bdi, const char *name)
{
- if (!bdi_debug_root)
+ if (!bdi_debug_root) {
+ pr_err("DEBUG:dev:%s, bdi_debug_root fail\n", name);
return -ENOMEM;
+ } else {
+ pr_err("DEBUG:dev:%s, bdi_debug_root success\n", name);
+ }

bdi->debug_dir = debugfs_create_dir(name, bdi_debug_root);
- if (!bdi->debug_dir)
+ if (!bdi->debug_dir) {
+ pr_err("DEBUG:dev:%s, debug_dir fail\n", name);
return -ENOMEM;
+ } else {
+ pr_err("DEBUG:dev:%s, debug_dir success\n", name);
+ }

bdi->debug_stats = debugfs_create_file("stats", 0444, bdi->debug_dir,
bdi, &bdi_debug_stats_fops);
if (!bdi->debug_stats) {
debugfs_remove(bdi->debug_dir);
+ pr_err("DEBUG:dev:%s, debug_stats fail\n", name);
return -ENOMEM;
+ } else {
+ pr_err("DEBUG:dev:%s, debug_stats success\n", name);
}

return 0;
@@ -879,13 +894,20 @@ int bdi_register_va(struct backing_dev_info *bdi, const char *fmt, va_list args)
return 0;

dev = device_create_vargs(bdi_class, NULL, MKDEV(0, 0), bdi, fmt, args);
- if (IS_ERR(dev))
+ if (IS_ERR(dev)) {
+ pr_err("DEBUG: bdi device_create_vargs fail\n");
return PTR_ERR(dev);
+ }
+ pr_err("DEBUG: bdi(0x%p) device_create_vargs sucess\n", bdi);

if (bdi_debug_register(bdi, dev_name(dev))) {
+ pr_err("DEBUG: dev:%s, bdi(0x%p) bdi_debug_register fail\n",
+ dev_name(dev), bdi);
device_destroy(bdi_class, dev->devt);
return -ENOMEM;
}
+ pr_err("DEBUG: dev:%s, bdi(0x%p) bdi_debug_register success\n",
+ dev_name(dev), bdi);
cgwb_bdi_register(bdi);
bdi->dev = dev;

--
2.9.4

2017-12-14 10:32:31

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Thu, Dec 14, 2017 at 18:09:27 +0800,
weiping zhang <[email protected]> wrote:
>On Thu, Dec 14, 2017 at 02:24:52AM -0600, Bruno Wolff III wrote:
>> On Wed, Dec 13, 2017 at 16:54:17 -0800,
>> Laura Abbott <[email protected]> wrote:
>> >Hi,
>> >
>> >Fedora got a bug report https://bugzilla.redhat.com/show_bug.cgi?id=1520982
>> >of a boot failure/bug on Linus' master (full bootlog at the bugzilla)
>>
>> I'm available for testing. The problem happens on my x86_64 Dell
>> Workstation, but not an old i386 server or an x86_64 mac hardware
>> based laptop.
>
>Hi,
>
>It seems something wrong with bdi debugfs register, could you help
>test the forllowing debug patch, I add some debug log, no function
>change, thanks.

I'll test it this morning. I'll probably have results in about 7 hrs from now.

2017-12-14 15:44:49

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Thu, Dec 14, 2017 at 18:09:27 +0800,
weiping zhang <[email protected]> wrote:
>
>It seems something wrong with bdi debugfs register, could you help
>test the forllowing debug patch, I add some debug log, no function
>change, thanks.

I applied your patch to d39a01eff9af1045f6e30ff9db40310517c4b45f and there
were some new debug messages in the dmesg output. Hopefully this helps. I
also added the patch and output to the Fedora bug for people following there.


Attachments:
(No filename) (479.00 B)
boot1.log (96.20 kB)
Download all attachments

2017-12-15 01:22:26

by Weiping Zhang

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

2017-12-14 23:41 GMT+08:00 Bruno Wolff III <[email protected]>:
> On Thu, Dec 14, 2017 at 18:09:27 +0800,
> weiping zhang <[email protected]> wrote:
>>
>>
>> It seems something wrong with bdi debugfs register, could you help
>> test the forllowing debug patch, I add some debug log, no function
>> change, thanks.
>
>
> I applied your patch to d39a01eff9af1045f6e30ff9db40310517c4b45f and there
> were some new debug messages in the dmesg output. Hopefully this helps. I
> also added the patch and output to the Fedora bug for people following
> there.

Hi Bruno,

Thanks your testing, but I cann't find WARN_ON in device_add_disk from
this boot1.log, could you help reproduce that issue? And does this issue can be
triggered at every bootup ?

--
Thanks
weiping

2017-12-15 01:47:26

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Fri, Dec 15, 2017 at 09:22:21 +0800,
weiping zhang <[email protected]> wrote:
>
>Thanks your testing, but I cann't find WARN_ON in device_add_disk from
>this boot1.log, could you help reproduce that issue? And does this issue can be
>triggered at every bootup ?

I don't know what you need for the first question. When I am physically at
the machine I can do test reboots. If you have something specific you want
me to try I should be able to.

Every time I boot with the problem commit, the boot never completes. However
it does seem to get pretty far. I get multiple register dumps every time.
After a while (a few minutes) I reboot to a wrking kernel.

The output I included is from: journalctl -k -b -1
If you think it would be better to see more than dmesg output let me know.

2017-12-15 02:04:36

by Weiping Zhang

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

2017-12-15 9:44 GMT+08:00 Bruno Wolff III <[email protected]>:
> On Fri, Dec 15, 2017 at 09:22:21 +0800,
> weiping zhang <[email protected]> wrote:
>>
>>
>> Thanks your testing, but I cann't find WARN_ON in device_add_disk from
>> this boot1.log, could you help reproduce that issue? And does this issue
>> can be
>> triggered at every bootup ?
>
>
> I don't know what you need for the first question. When I am physically at
> the machine I can do test reboots. If you have something specific you want
> me to try I should be able to.
>
> Every time I boot with the problem commit, the boot never completes. However
> it does seem to get pretty far. I get multiple register dumps every time.
> After a while (a few minutes) I reboot to a wrking kernel.
>
> The output I included is from: journalctl -k -b -1
> If you think it would be better to see more than dmesg output let me know.
I just want to know WARN_ON WHAT in device_add_disk,
if bdi_register_owner return error code, it may fail at any step of following:

bdi_debug_root is NULL
bdi->debug_dir is NULL
bdi->debug_stats is NULL

so I want see the WARN_ON as you paste before, also my DEBUG log will help
to find which step fail.

2017-12-15 02:14:22

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Fri, Dec 15, 2017 at 10:04:32 +0800,
weiping zhang <[email protected]> wrote:
>
>so I want see the WARN_ON as you paste before, also my DEBUG log will help
>to find which step fail.

The previous time also journalctl for output, but maybe I used slightly
different options. I'll look and see if it is in the journal for the last
bad boot. I can do that from home.

2017-12-15 11:14:04

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Fri, Dec 15, 2017 at 10:04:32 +0800,
weiping zhang <[email protected]> wrote:
>I just want to know WARN_ON WHAT in device_add_disk,
>if bdi_register_owner return error code, it may fail at any step of following:

Was that output in the original boot log? I didn't see anything there
that had the string WARN_ON. The first log was from a Fedora kernel. The
second from a kernel I built. I used a Fedora config though. The config
was probably from one of their nodebug kernels, I could build another
one using a config from a debug kernel. Would that likely provide what
you are looking for?

2017-12-15 14:02:24

by Weiping Zhang

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

2017-12-15 19:10 GMT+08:00 Bruno Wolff III <[email protected]>:
> On Fri, Dec 15, 2017 at 10:04:32 +0800,
> weiping zhang <[email protected]> wrote:
>>
>> I just want to know WARN_ON WHAT in device_add_disk,
>> if bdi_register_owner return error code, it may fail at any step of
>> following:
>
>
> Was that output in the original boot log? I didn't see anything there that
> had the string WARN_ON. The first log was from a Fedora kernel. The second
Sorry to let you confuse, WARN_ON means we catch log as following:
WARNING: CPU: 3 PID: 3486 at block/genhd.c:680 device_add_disk+0x3d9/0x460

> from a kernel I built. I used a Fedora config though. The config was
> probably from one of their nodebug kernels, I could build another one using
> a config from a debug kernel. Would that likely provide what you are looking
> for?

Yes, please help reproduce this issue include my debug patch. Reproduce means
we can see WARN_ON in device_add_disk caused by failure of bdi_register_owner.

2017-12-15 16:34:10

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Fri, Dec 15, 2017 at 22:02:20 +0800,
weiping zhang <[email protected]> wrote:
>
>Yes, please help reproduce this issue include my debug patch. Reproduce means
>we can see WARN_ON in device_add_disk caused by failure of bdi_register_owner.

I'm not sure why yet, but I'm only getting the warning message you want
with Fedora kernels, not the ones I am building (with or without your test
patch). I'll attach a debug config file if you want to look there. But in
theory that should be essentially what Fedora is using for theirs. They
probably have some out of tree patches they are applying, but I wouldn't
expect those to make a difference here. I think they now have a tree
somewhere that I can try to build from that has their patches applied
to the upstream kernel and if I can find it I will try building it just
to test this out.

I only have about 6 hours of physical access to the machine exhibiting
the problem, and after that I won't be able to do test boots until Monday.

2017-12-15 17:19:03

by Laura Abbott

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On 12/15/2017 08:30 AM, Bruno Wolff III wrote:
> On Fri, Dec 15, 2017 at 22:02:20 +0800,
>  weiping zhang <[email protected]> wrote:
>>
>> Yes, please help reproduce this issue include my debug patch. Reproduce means
>> we can see WARN_ON in device_add_disk caused by failure of bdi_register_owner.
>
> I'm not sure why yet, but I'm only getting the warning message you want with Fedora kernels, not the ones I am building (with or without your test patch). I'll attach a debug config file if you want to look there. But in theory that should be essentially what Fedora is using for theirs. They probably have some out of tree patches they are applying, but I wouldn't expect those to make a difference here. I think they now have a tree somewhere that I can try to build from that has their patches applied to the upstream kernel and if I can find it I will try building it just to test this out.
>
> I only have about 6 hours of physical access to the machine exhibiting the problem, and after that I won't be able to do test boots until Monday.


You can see the trees Fedora produces at https://git.kernel.org/pub/scm/linux/kernel/git/jwboyer/fedora.git
which includes the configs (you want to look at the ones withtout - debug)

2017-12-15 17:43:40

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Fri, Dec 15, 2017 at 09:18:56 -0800,
Laura Abbott <[email protected]> wrote:
>
>You can see the trees Fedora produces at https://git.kernel.org/pub/scm/linux/kernel/git/jwboyer/fedora.git
>which includes the configs (you want to look at the ones withtout - debug)

Thanks. I found it a little while ago and am already doing a test build
without weiping's test patch to see if that kernel provides what he(?)
needs. Doing a rebuild with the test patch will go pretty quickly. So
if I get the message with device_add_disk from these kernels, I should
be able to get the information this afternoon. If there is some other
reason I don't get that when I do the builds, I'm probably not going to be
able to figure it out and get a build done before I leave. I don't live
close enough to the office that I'm going to want to drive in just to
be able to do a reboot test. (And my hardware at home does exhibit the
problem.)

If you have some other idea about why I might not be seeing the
device_add_disk message, I'd be interested in hearing it.

2017-12-15 19:54:38

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Fri, Dec 15, 2017 at 22:02:20 +0800,
weiping zhang <[email protected]> wrote:
>Sorry to let you confuse, WARN_ON means we catch log as following:
>WARNING: CPU: 3 PID: 3486 at block/genhd.c:680 device_add_disk+0x3d9/0x460

I do not get this warning for any of the kernels I build, whether from
Linus' tree or Josh Boyer's Fedora tree. It shows up when I test kernels built
by Fedora, but those don't have your debug patch.

I do not know what is different. Do you have any ideas? Most likely I won't
be able to test any more kernels until Monday (unless I can use most of my
most recent build over again very soon).

2017-12-16 16:35:58

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Fri, Dec 15, 2017 at 13:51:22 -0600,
Bruno Wolff III <[email protected]> wrote:
>
>I do not know what is different. Do you have any ideas? Most likely I
>won't be able to test any more kernels until Monday (unless I can use
>most of my most recent build over again very soon).

The .config looks like it should be OK. I'll test setting loglevel on
boot in case the default is different than what the config file says.
I can't do that until Monday morning.

I think it is more likely the the WARN_ON macro code isn't being
compiled in for some reason. I haven't confirmed that, nor have I found
anything that would leave that code out when I do a make, but include
it during Fedora builds.

2017-12-17 13:43:54

by Weiping Zhang

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

2017-12-17 0:32 GMT+08:00 Bruno Wolff III <[email protected]>:
> On Fri, Dec 15, 2017 at 13:51:22 -0600,
> Bruno Wolff III <[email protected]> wrote:
>>
>>
>> I do not know what is different. Do you have any ideas? Most likely I
>> won't be able to test any more kernels until Monday (unless I can use most
>> of my most recent build over again very soon).
>
>
> The .config looks like it should be OK. I'll test setting loglevel on boot
> in case the default is different than what the config file says. I can't do
> that until Monday morning.
>
> I think it is more likely the the WARN_ON macro code isn't being compiled in
> for some reason. I haven't confirmed that, nor have I found anything that
> would leave that code out when I do a make, but include it during Fedora
> builds.
Hi, thanks for testing, I think you first reproduce this issue(got WARNING
at device_add_disk) by your own build, then add my debug patch.

2017-12-17 15:58:23

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Sun, Dec 17, 2017 at 21:43:50 +0800,
weiping zhang <[email protected]> wrote:
>Hi, thanks for testing, I think you first reproduce this issue(got WARNING
>at device_add_disk) by your own build, then add my debug patch.

No, the first log (that Laura copied) was from the Fedora bug and it was
from a Fedora kernel before I started doing the bisect. That warnings are
missing from all of my builds, not just the ones from the patches.

I didn't spot anything obvious going through the kernel spec file when
I looked, but I could easily have missed something.

Hopefully it is just a boot default that is different and that I can
override it with a kernel boot parameter.

2017-12-18 21:57:38

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Sun, Dec 17, 2017 at 21:43:50 +0800,
weiping zhang <[email protected]> wrote:
>Hi, thanks for testing, I think you first reproduce this issue(got WARNING
>at device_add_disk) by your own build, then add my debug patch.

I'm going to try testing warnings with a kernel I've built, to try to
determine if warnings are working at all for the ones I'm building. However
it might be that the WARN_ONs are not being reached for the kernels I've
built. If that turns out to be the case, I may not be able to get you both
the output from the WARN_ONs and the output from your debugging patch at
the same time.
My next kernel build isn't going to finish in time to test today.

2017-12-19 16:28:20

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Sun, Dec 17, 2017 at 21:43:50 +0800,
weiping zhang <[email protected]> wrote:
>Hi, thanks for testing, I think you first reproduce this issue(got WARNING
>at device_add_disk) by your own build, then add my debug patch.

The problem is still in rc4. Reverting the commit still fixes the problem.
I tested that warning level messages should appear using lkdtm. While
there could be something weird relating to the WARN_ON macro, more likely
there is something different about the boots with the kernels I build
(the exact way initramfs is built is probably different) and probably
that (WARN_ON) code is not getting executed.

2017-12-19 18:24:56

by Shaohua Li

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Tue, Dec 19, 2017 at 10:17:43AM -0600, Bruno Wolff III wrote:
> On Sun, Dec 17, 2017 at 21:43:50 +0800,
> weiping zhang <[email protected]> wrote:
> > Hi, thanks for testing, I think you first reproduce this issue(got WARNING
> > at device_add_disk) by your own build, then add my debug patch.
>
> The problem is still in rc4. Reverting the commit still fixes the problem. I
> tested that warning level messages should appear using lkdtm. While there
> could be something weird relating to the WARN_ON macro, more likely there is
> something different about the boots with the kernels I build (the exact way
> initramfs is built is probably different) and probably that (WARN_ON) code
> is not getting executed.

Not sure if this is MD related, but could you please check if this debug patch
changes anything?

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 4e4dee0..c365179 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -518,7 +518,6 @@ static void mddev_put(struct mddev *mddev)
mddev->ctime == 0 && !mddev->hold_active) {
/* Array is not configured at all, and not held active,
* so destroy it */
- list_del_init(&mddev->all_mddevs);
bs = mddev->bio_set;
sync_bs = mddev->sync_set;
mddev->bio_set = NULL;
@@ -5210,6 +5209,10 @@ static void md_free(struct kobject *ko)
}
percpu_ref_exit(&mddev->writes_pending);

+ spin_lock(&all_mddevs_lock);
+ list_del_init(&mddev->all_mddevs);
+ spin_unlock(&all_mddevs_lock);
+
kfree(mddev);
}


2017-12-19 19:52:38

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Tue, Dec 19, 2017 at 10:24:52 -0800,
Shaohua Li <[email protected]> wrote:
>
>Not sure if this is MD related, but could you please check if this debug patch
>changes anything?

I'm doing a build now. I do use md to mirror disk partitions between two disks. I do that on another machine that doesn't exhibit the problem, but it is
i686, not x86_64.

2017-12-20 00:24:34

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Tue, Dec 19, 2017 at 10:24:52 -0800,
Shaohua Li <[email protected]> wrote:
>
>Not sure if this is MD related, but could you please check if this debug patch
>changes anything?

The system still had cpu hangs. I've attached dmesg output saved by systemd
and retrieved after booting with a pre-rc2 kernel.


Attachments:
(No filename) (308.00 B)
boot11.log (145.54 kB)
Download all attachments

2017-12-21 13:03:25

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

After today, I won't have physical access to the problem machine until
January 2nd. So if you guys have any testing suggestions I need them soon
if they are to get done before my vacation.
I do plan to try booting to level 1 to see if I can get a login prompt
that might facilitate testing. The lockups do happen fairly late in the
boot process. I never get to X, but maybe it will get far enough for
a console login.

2017-12-21 14:01:37

by Weiping Zhang

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

2017-12-21 21:00 GMT+08:00 Bruno Wolff III <[email protected]>:
> After today, I won't have physical access to the problem machine until
> January 2nd. So if you guys have any testing suggestions I need them soon if
> they are to get done before my vacation.
> I do plan to try booting to level 1 to see if I can get a login prompt that
> might facilitate testing. The lockups do happen fairly late in the boot
> process. I never get to X, but maybe it will get far enough for a console
> login.
>
Hi,
how do you do bisect ?build all kernel commit one by one ?
as you did before:
https://bugzilla.redhat.com/show_bug.cgi?id=1520982

what kernel source code do you use that occur warning at device_add_disk?
from fedora or any official release ? if so ,could you provide web link?

if you use same kernel source code and same .config, why your own build
Cann't trigger that warning ?

2017-12-21 15:21:05

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Thu, Dec 21, 2017 at 22:01:33 +0800,
weiping zhang <[email protected]> wrote:
>Hi,
>how do you do bisect ?build all kernel commit one by one ?
>as you did before:
>https://bugzilla.redhat.com/show_bug.cgi?id=1520982

I just did the one bisect using Linus' tree. After each build, I would do
a test boot and see if the boot was normal or if I got errors and an
eventual hang before boot.

Since then I have used git revert to revert just the problem commit from
later kernels (such as v4.15-rc4) and when I do the system boots normally.
And when I don't do the revert or just use stock Fedora kernels the problem
occurs every time.

I also did a couple of tests with Josh Boyer's Fedora kernel tree that
has Fedora patches on top of the development kernel.

>what kernel source code do you use that occur warning at device_add_disk?
>from fedora or any official release ? if so ,could you provide web link?

That was from an offical Fedora kernel. I believe I got it from the
nodebug repo, but that kernel should be the same as the one that was
normally used for rawhide. It is at
https://koji.fedoraproject.org/koji/buildinfo?buildID=1007500
but I don't know how much longer the binaries will stay available in koji.

>if you use same kernel source code and same .config, why your own build
>Cann't trigger that warning ?

I don't know. The install script may build the initramfs differently. As
far as I can tell, if the WARN_ON was triggered, I should have gotten
output.

2017-12-21 15:31:44

by Weiping Zhang

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

2017-12-21 23:18 GMT+08:00 Bruno Wolff III <[email protected]>:
> On Thu, Dec 21, 2017 at 22:01:33 +0800,
> weiping zhang <[email protected]> wrote:
>>
>> Hi,
>> how do you do bisect ?build all kernel commit one by one ?
>> as you did before:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1520982
>
>
> I just did the one bisect using Linus' tree. After each build, I would do a
> test boot and see if the boot was normal or if I got errors and an eventual
> hang before boot.
>
> Since then I have used git revert to revert just the problem commit from
> later kernels (such as v4.15-rc4) and when I do the system boots normally.
> And when I don't do the revert or just use stock Fedora kernels the problem
> occurs every time.
does every time boot fail can trigger WANRING in device_add_disk ?

2017-12-21 15:38:54

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Thu, Dec 21, 2017 at 23:31:40 +0800,
weiping zhang <[email protected]> wrote:
>does every time boot fail can trigger WANRING in device_add_disk ?

Not that I see. But the message could scroll off the screen. The boot gets
far enough that systemd copies over dmesg output to permanent storage that
I can see on my next successful boot. That's where I looked for the warning
output you want. I never saw it for any kernels I compiled myself. Only
when I test kernels built by Fedora do I see it.

I just tried booting to single user and the boot still hangs.

When I build the kernels, the compiler options are probably a bit different
than when Fedora does. That might affect what happens during boot.

2017-12-21 15:48:24

by Weiping Zhang

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

2017-12-21 23:36 GMT+08:00 Bruno Wolff III <[email protected]>:
> On Thu, Dec 21, 2017 at 23:31:40 +0800,
> weiping zhang <[email protected]> wrote:
>>
>> does every time boot fail can trigger WANRING in device_add_disk ?
>
>
> Not that I see. But the message could scroll off the screen. The boot gets
> far enough that systemd copies over dmesg output to permanent storage that I
> can see on my next successful boot. That's where I looked for the warning
> output you want. I never saw it for any kernels I compiled myself. Only when
> I test kernels built by Fedora do I see it.
see it every boot ?

> I just tried booting to single user and the boot still hangs.
>
> When I build the kernels, the compiler options are probably a bit different
> than when Fedora does. That might affect what happens during boot.

2017-12-21 16:44:48

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Thu, Dec 21, 2017 at 23:48:19 +0800,
weiping zhang <[email protected]> wrote:
>> output you want. I never saw it for any kernels I compiled myself. Only when
>> I test kernels built by Fedora do I see it.
>see it every boot ?

I don't look every boot. The warning gets scrolled of the screen. Once I see
the CPU hang warnings I know the boot is failing. I don't always look
at journalctl later to see what's there.

2017-12-21 17:02:19

by Jens Axboe

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On 12/21/17 9:42 AM, Bruno Wolff III wrote:
> On Thu, Dec 21, 2017 at 23:48:19 +0800,
> weiping zhang <[email protected]> wrote:
>>> output you want. I never saw it for any kernels I compiled myself. Only when
>>> I test kernels built by Fedora do I see it.
>> see it every boot ?
>
> I don't look every boot. The warning gets scrolled of the screen. Once I see
> the CPU hang warnings I know the boot is failing. I don't always look
> at journalctl later to see what's there.

I'm going to revert a0747a859ef6 for now, since we're now 8 days into this
and no progress has been made on fixing it.

--
Jens Axboe

2017-12-21 17:46:07

by Weiping Zhang

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

2017-12-22 1:02 GMT+08:00 Jens Axboe <[email protected]>:
> On 12/21/17 9:42 AM, Bruno Wolff III wrote:
>> On Thu, Dec 21, 2017 at 23:48:19 +0800,
>> weiping zhang <[email protected]> wrote:
>>>> output you want. I never saw it for any kernels I compiled myself. Only when
>>>> I test kernels built by Fedora do I see it.
>>> see it every boot ?
>>
>> I don't look every boot. The warning gets scrolled of the screen. Once I see
>> the CPU hang warnings I know the boot is failing. I don't always look
>> at journalctl later to see what's there.
>
> I'm going to revert a0747a859ef6 for now, since we're now 8 days into this
> and no progress has been made on fixing it.
>
OK, you can revert it first.
it seems MD produce a duplicated major:minor pair, which lead to create
debugfs dir failed, but it's under debugging...
>

2017-12-21 18:18:15

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Thu, Dec 21, 2017 at 10:02:15 -0700,
Jens Axboe <[email protected]> wrote:
>On 12/21/17 9:42 AM, Bruno Wolff III wrote:
>> On Thu, Dec 21, 2017 at 23:48:19 +0800,
>> weiping zhang <[email protected]> wrote:
>>>> output you want. I never saw it for any kernels I compiled myself. Only when
>>>> I test kernels built by Fedora do I see it.
>>> see it every boot ?
>>
>> I don't look every boot. The warning gets scrolled of the screen. Once I see
>> the CPU hang warnings I know the boot is failing. I don't always look
>> at journalctl later to see what's there.
>
>I'm going to revert a0747a859ef6 for now, since we're now 8 days into this
>and no progress has been made on fixing it.

One important thing I have just found is that it looks like the problem
only happens when booting in enforcing mode. If I boot in permissive
mode it does not happen. My home machines are currently set to boot in
permissive mode and I'll test this evening to see if I can reproduce the
problem if I change them to enforcing mode. If so I'll be able to do lots
of testing during my vacation.

2017-12-21 23:18:33

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Thu, Dec 21, 2017 at 12:15:31 -0600,
Bruno Wolff III <[email protected]> wrote:
>
>One important thing I have just found is that it looks like the
>problem only happens when booting in enforcing mode. If I boot in
>permissive mode it does not happen. My home machines are currently set
>to boot in permissive mode and I'll test this evening to see if I can
>reproduce the problem if I change them to enforcing mode. If so I'll
>be able to do lots of testing during my vacation.

Enforcing mode alone isn't enough as I tested that one one machine at
home and it didn't trigger the problem. I'll try another machine late
tonight.

2017-12-22 00:29:36

by James Bottomley

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Thu, 2017-12-21 at 10:02 -0700, Jens Axboe wrote:
> On 12/21/17 9:42 AM, Bruno Wolff III wrote:
> >
> > On Thu, Dec 21, 2017 at 23:48:19 +0800,
> >   weiping zhang <[email protected]> wrote:
> > >
> > > >
> > > > output you want. I never saw it for any kernels I compiled
> > > > myself. Only when I test kernels built by Fedora do I see it.
> > > > see it every boot ?
> >
> > I don't look every boot. The warning gets scrolled of the screen.
> > Once I see the CPU hang warnings I know the boot is failing. I
> > don't always look at journalctl later to see what's there.
>
> I'm going to revert a0747a859ef6 for now, since we're now 8 days into
> this and no progress has been made on fixing it.

I think this is correct.  If you build the kernel with
CONFIG_DEBUG_FS=N, you're definitely going to get the same hang
(because the debugfs_ functions fail with -ENODEV and the bdi will
never get registered).  This alone leads me to suspect the commit is
bogus because it's a randconfig/test accident waiting to happen.

We should still root cause the debugfs failure in this case, but I
really think debugfs files should be treated as optional, so a failure
in setting them up should translate to some sort of warning not a
failure to set up the bdi.

James

2017-12-22 05:02:20

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Thu, Dec 21, 2017 at 17:16:03 -0600,
Bruno Wolff III <[email protected]> wrote:
>
>Enforcing mode alone isn't enough as I tested that one one machine at
>home and it didn't trigger the problem. I'll try another machine late
>tonight.

I got the problem to occur on my i686 machine when booting in enforcing
mode. This machine uses raid 1 vua mdraid which may or may not be a
factor in this problem. The boot log has a trace at the end and might be
helpful, so I'm attaching it here.


Attachments:
(No filename) (490.00 B)
boot-i686.log (79.44 kB)
Download all attachments

2017-12-22 13:20:16

by Weiping Zhang

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

2017-12-22 12:53 GMT+08:00 Bruno Wolff III <[email protected]>:
> On Thu, Dec 21, 2017 at 17:16:03 -0600,
> Bruno Wolff III <[email protected]> wrote:
>>
>>
>> Enforcing mode alone isn't enough as I tested that one one machine at home
>> and it didn't trigger the problem. I'll try another machine late tonight.
>
>
> I got the problem to occur on my i686 machine when booting in enforcing
> mode. This machine uses raid 1 vua mdraid which may or may not be a factor
> in this problem. The boot log has a trace at the end and might be helpful,
> so I'm attaching it here.
Hi Bruno,
I can reproduce this issue in my QEMU test VM easily, just add an soft
RAID1, always trigger
that warning, I'll debug it later.

2017-12-22 14:06:52

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Fri, Dec 22, 2017 at 21:20:10 +0800,
weiping zhang <[email protected]> wrote:
>2017-12-22 12:53 GMT+08:00 Bruno Wolff III <[email protected]>:
>> On Thu, Dec 21, 2017 at 17:16:03 -0600,
>> Bruno Wolff III <[email protected]> wrote:
>>>
>>>
>>> Enforcing mode alone isn't enough as I tested that one one machine at home
>>> and it didn't trigger the problem. I'll try another machine late tonight.
>>
>>
>> I got the problem to occur on my i686 machine when booting in enforcing
>> mode. This machine uses raid 1 vua mdraid which may or may not be a factor
>> in this problem. The boot log has a trace at the end and might be helpful,
>> so I'm attaching it here.
>Hi Bruno,
>I can reproduce this issue in my QEMU test VM easily, just add an soft
>RAID1, always trigger
>that warning, I'll debug it later.

Great. When you have a fix, I can test it.

2017-12-29 16:36:33

by Weiping Zhang

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Fri, Dec 22, 2017 at 08:04:23AM -0600, Bruno Wolff III wrote:
> On Fri, Dec 22, 2017 at 21:20:10 +0800,
> weiping zhang <[email protected]> wrote:
> >2017-12-22 12:53 GMT+08:00 Bruno Wolff III <[email protected]>:
> >>On Thu, Dec 21, 2017 at 17:16:03 -0600,
> >> Bruno Wolff III <[email protected]> wrote:
> >>>
> >>>
> >>>Enforcing mode alone isn't enough as I tested that one one machine at home
> >>>and it didn't trigger the problem. I'll try another machine late tonight.
> >>
> >>
> >>I got the problem to occur on my i686 machine when booting in enforcing
> >>mode. This machine uses raid 1 vua mdraid which may or may not be a factor
> >>in this problem. The boot log has a trace at the end and might be helpful,
> >>so I'm attaching it here.
> >Hi Bruno,
> >I can reproduce this issue in my QEMU test VM easily, just add an soft
> >RAID1, always trigger
> >that warning, I'll debug it later.
>
> Great. When you have a fix, I can test it.
This issue can trigger easily in Centos7.3, if meet two factors:
1. SELINUX in enforceing mode
2. mdadm try to create new gendisk.

if disable SELINUX or let it in permissive mode, issue disappear.
As Jens has revert that commit, it seems boot normally, actually
this is no diretor created under /sys/kernel/debug/bdi/, though
has no effect on disk workflow.

As james said before, "debugfs files should be treated as optional",
so kernel give warning here is enough.

So there are 2 ways to fix this issue:
1. Add proper SELINUX policy allow mdadm create dir at debugfs
2. mdadm don't create gendisk directly, first mdadm trigger a kwork and
wait it done, let kwork create gendisk.
A possible change for MD like following:

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 4e4dee0..86ead5a 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -90,6 +90,7 @@
EXPORT_SYMBOL(md_cluster_mod);

static DECLARE_WAIT_QUEUE_HEAD(resync_wait);
+static struct workqueue_struct *md_probe_wq;
static struct workqueue_struct *md_wq;
static struct workqueue_struct *md_misc_wq;

@@ -5367,10 +5368,27 @@ static int md_alloc(dev_t dev, char *name)
return error;
}

+static void md_probe_work_fn(struct work_struct *ws)
+{
+ struct md_probe_work *mpw = container_of(ws, struct md_probe_work,
+ work);
+ md_alloc(mpw->dev, NULL);
+ mpw->done = 1;
+ wake_up(&mpw->wait);
+}
+
static struct kobject *md_probe(dev_t dev, int *part, void *data)
{
- if (create_on_open)
- md_alloc(dev, NULL);
+ struct md_probe_work mpw;
+
+ if (create_on_open) {
+ init_waitqueue_head(&mpw.wait);
+ mpw.dev = dev;
+ mpw.done = 0;
+ INIT_WORK(&mpw.work, md_probe_work_fn);
+ queue_work(md_probe_wq, &mpw.work);
+ wait_event(mpw.wait, mpw.done);
+ }
return NULL;
}

@@ -9023,9 +9041,13 @@ static int __init md_init(void)
{
int ret = -ENOMEM;

+ md_probe_wq = alloc_workqueue("md_probe", 0, 0);
+ if (!md_probe_wq)
+ goto err_wq;
+
md_wq = alloc_workqueue("md", WQ_MEM_RECLAIM, 0);
if (!md_wq)
- goto err_wq;
+ goto err_probe_wq;

md_misc_wq = alloc_workqueue("md_misc", 0, 0);
if (!md_misc_wq)
@@ -9055,6 +9077,8 @@ static int __init md_init(void)
destroy_workqueue(md_misc_wq);
err_misc_wq:
destroy_workqueue(md_wq);
+err_probe_wq:
+ destroy_workqueue(md_probe_wq);
err_wq:
return ret;
}
@@ -9311,6 +9335,7 @@ static __exit void md_exit(void)
}
destroy_workqueue(md_misc_wq);
destroy_workqueue(md_wq);
+ destroy_workqueue(md_probe_wq);
}

subsys_initcall(md_init);
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 7d6bcf0..3953896 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -487,6 +487,13 @@ enum recovery_flags {
MD_RECOVERY_ERROR, /* sync-action interrupted because io-error */
};

+struct md_probe_work {
+ struct work_struct work;
+ wait_queue_head_t wait;
+ dev_t dev;
+ int done;
+};
+
static inline int __must_check mddev_lock(struct mddev *mddev)
{
return mutex_lock_interruptible(&mddev->reconfig_mutex);

2017-12-29 16:46:09

by Weiping Zhang

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Fri, Dec 22, 2017 at 08:04:23AM -0600, Bruno Wolff III wrote:
> On Fri, Dec 22, 2017 at 21:20:10 +0800,
> weiping zhang <[email protected]> wrote:
> >2017-12-22 12:53 GMT+08:00 Bruno Wolff III <[email protected]>:
> >>On Thu, Dec 21, 2017 at 17:16:03 -0600,
> >> Bruno Wolff III <[email protected]> wrote:
> >>>
> >>>
> >>>Enforcing mode alone isn't enough as I tested that one one machine at home
> >>>and it didn't trigger the problem. I'll try another machine late tonight.
> >>
> >>
> >>I got the problem to occur on my i686 machine when booting in enforcing
> >>mode. This machine uses raid 1 vua mdraid which may or may not be a factor
> >>in this problem. The boot log has a trace at the end and might be helpful,
> >>so I'm attaching it here.
> >Hi Bruno,
> >I can reproduce this issue in my QEMU test VM easily, just add an soft
> >RAID1, always trigger
> >that warning, I'll debug it later.
>
> Great. When you have a fix, I can test it.
This issue can trigger easily in Centos7.3 + kernel-4.15-rc3, if meet two factors:
1. SELINUX in enforceing mode
2. mdadm try to create new gendisk.

if disable SELINUX or let it in permissive mode, issue disappear.
As Jens has revert that commit, it seems boot normally, actually
there is no diretory created under /sys/kernel/debug/bdi/, though
has no effect on disk workflow.

As James said before, "debugfs files should be treated as optional",
so kernel give warning here is enough.

So, we may solve this issue in two ways:
1. Add proper SELINUX policy that give permission to mdadm for debugfs.
2. Split mdadm into 2 part, Firstly, user proccess mdadm trigger a kwork,
secondly kwork will create gendisk)and mdadm wait it done, Like
following:

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 4e4dee0..86ead5a 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -90,6 +90,7 @@
EXPORT_SYMBOL(md_cluster_mod);

static DECLARE_WAIT_QUEUE_HEAD(resync_wait);
+static struct workqueue_struct *md_probe_wq;
static struct workqueue_struct *md_wq;
static struct workqueue_struct *md_misc_wq;

@@ -5367,10 +5368,27 @@ static int md_alloc(dev_t dev, char *name)
return error;
}

+static void md_probe_work_fn(struct work_struct *ws)
+{
+ struct md_probe_work *mpw = container_of(ws, struct md_probe_work,
+ work);
+ md_alloc(mpw->dev, NULL);
+ mpw->done = 1;
+ wake_up(&mpw->wait);
+}
+
static struct kobject *md_probe(dev_t dev, int *part, void *data)
{
- if (create_on_open)
- md_alloc(dev, NULL);
+ struct md_probe_work mpw;
+
+ if (create_on_open) {
+ init_waitqueue_head(&mpw.wait);
+ mpw.dev = dev;
+ mpw.done = 0;
+ INIT_WORK(&mpw.work, md_probe_work_fn);
+ queue_work(md_probe_wq, &mpw.work);
+ wait_event(mpw.wait, mpw.done);
+ }
return NULL;
}

@@ -9023,9 +9041,13 @@ static int __init md_init(void)
{
int ret = -ENOMEM;

+ md_probe_wq = alloc_workqueue("md_probe", 0, 0);
+ if (!md_probe_wq)
+ goto err_wq;
+
md_wq = alloc_workqueue("md", WQ_MEM_RECLAIM, 0);
if (!md_wq)
- goto err_wq;
+ goto err_probe_wq;

md_misc_wq = alloc_workqueue("md_misc", 0, 0);
if (!md_misc_wq)
@@ -9055,6 +9077,8 @@ static int __init md_init(void)
destroy_workqueue(md_misc_wq);
err_misc_wq:
destroy_workqueue(md_wq);
+err_probe_wq:
+ destroy_workqueue(md_probe_wq);
err_wq:
return ret;
}
@@ -9311,6 +9335,7 @@ static __exit void md_exit(void)
}
destroy_workqueue(md_misc_wq);
destroy_workqueue(md_wq);
+ destroy_workqueue(md_probe_wq);
}

subsys_initcall(md_init);
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 7d6bcf0..3953896 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -487,6 +487,13 @@ enum recovery_flags {
MD_RECOVERY_ERROR, /* sync-action interrupted because io-error */
};

+struct md_probe_work {
+ struct work_struct work;
+ wait_queue_head_t wait;
+ dev_t dev;
+ int done;
+};
+
static inline int __must_check mddev_lock(struct mddev *mddev)
{
return mutex_lock_interruptible(&mddev->reconfig_mutex);

2017-12-29 18:48:37

by Bruno Wolff III

[permalink] [raw]
Subject: Re: Regression with a0747a859ef6 ("bdi: add error handle for bdi_debug_register")

On Sat, Dec 30, 2017 at 00:30:32 +0800,
weiping zhang <[email protected]> wrote:
>1. Add proper SELINUX policy that give permission to mdadm for debugfs.
>2. Split mdadm into 2 part, Firstly, user proccess mdadm trigger a kwork,
>secondly kwork will create gendisk)and mdadm wait it done, Like
>following:
>
>diff --git a/drivers/md/md.c b/drivers/md/md.c

Is that patch ready to be tested?

Fedora hasn't built an rc5 kernel yet, probably because a lot of people
are off work this week. So I haven't done that test yet.