2021-09-01 11:25:38

by Abdul Haleem

[permalink] [raw]
Subject: [next-20210827][ppc][multipathd] INFO: task hung in dm_table_add_target

Greeting's

multiple task hung while adding the vfc disk back to the multipath on my
powerpc box running linux-next kernel

Test:
$ multipathd -k"remove path sde"
$ multipathd -k"add path sde"

After above command multipathd task hung with continuous call traces on
console, it requires reboot to stop call traces

systemd-udevd[180359]: Process '/sbin/mdadm -I /dev/dm-8' failed with
exit code 1.
multipathd[180272]: mpathf: load table [0 62914560 multipath 1
queue_if_no_path 1 alua 2 1 round-robin 0 2 1 8:64 1 65:0 1 round-robin
0 2 1 8:144 1 65:80 1]
systemd[1]: Started Device-Mapper Multipath Device Controller.
multipathd[180272]: mpatha: sdl - tur checker reports path is up
multipathd[180272]: 8:176: reinstated
multipathd[180272]: mpatha: remaining active paths: 2
multipathd[180272]: sde: remove path (operator)
multipathd[180272]: mpathf: load table [0 62914560 multipath 1
queue_if_no_path 1 alua 2 1 round-robin 0 1 1 65:0 1 round-robin 0 2 1
8:144 1 65:80 1]
multipathd[180272]: sde [8:64]: path removed from map mpathf
multipathd[180272]: sdq: remove path (operator)
multipathd[180272]: mpathf: load table [0 62914560 multipath 1
queue_if_no_path 1 alua 1 1 round-robin 0 2 1 8:144 1 65:80 1]
multipathd[180272]: sdq [65:0]: path removed from map mpathf
multipathd[180272]: sdj: remove path (operator)
multipathd[180272]: mpathf: load table [0 62914560 multipath 1
queue_if_no_path 1 alua 1 1 round-robin 0 1 1 65:80 1]
multipathd[180272]: sdj [8:144]: path removed from map mpathf
multipathd[180272]: sdv: remove path (operator)
multipathd[180272]: mpathf: map in use
multipathd[180272]: mpathf: can't flush
multipathd[180272]: mpathf: load table [0 62914560 multipath 1
queue_if_no_path 0 0 0]
multipathd[180272]: sdv [65:80]: path removed from map mpathf
systemd[1]: Starting system activity accounting tool...
systemd[1]: sysstat-collect.service: Succeeded.
systemd[1]: Started system activity accounting tool.
systemd-udevd[1156]: seq 5678 '/devices/virtual/block/dm-10' is taking a
long time
systemd-udevd[1156]: seq 5678 '/devices/virtual/block/dm-10' killed
multipathd[180272]: sde: add path (operator)
systemd[1]: Stopping Device-Mapper Multipath Device Controller...
systemd[1]: multipathd.service: State 'stop-sigterm' timed out. Killing.
systemd[1]: multipathd.service: Killing process 180272 (multipathd) with
signal SIGKILL.
systemd[1]: multipathd.service: Processes still around after SIGKILL.
Ignoring.
dbus-daemon[1920]: [system] Activating via systemd: service
name='net.reactivated.Fprint' unit='fprintd.service' requested by
':1.255' (uid=0 pid=95173 comm="/bin/login -p --      ")
systemd[1]: Starting Fingerprint Authentication Daemon...
dbus-daemon[1920]: [system] Successfully activated service
'net.reactivated.Fprint'
systemd[1]: Started Fingerprint Authentication Daemon.
systemd-logind[2032]: New session 6 of user root.
systemd[1]: Started Session 6 of user root.
kernel: INFO: task multipathd:180274 blocked for more than 122 seconds.
kernel:      Not tainted 5.14.0-rc7-next-20210827-autotest #1
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
kernel: task:multipathd      state:D stack:    0 pid:180274 ppid:     1
flags:0x00040082
kernel: Call Trace:
kernel: [c00000002aaa72c0] [c00000002aaa73a0] 0xc00000002aaa73a0
(unreliable)
kernel: [c00000002aaa74b0] [c00000000001e638] __switch_to+0x278/0x490
kernel: [c00000002aaa7510] [c000000000c89cfc] __schedule+0x31c/0xa10
kernel: [c00000002aaa75d0] [c000000000c8a468] schedule+0x78/0x130
kernel: [c00000002aaa7600] [c000000000c8aa08]
schedule_preempt_disabled+0x18/0x30
kernel: [c00000002aaa7620] [c000000000c8cf0c]
__mutex_lock.isra.11+0x36c/0x700
kernel: [c00000002aaa76b0] [c00000000067efe4] bd_link_disk_holder+0x34/0x270
kernel: [c00000002aaa7700] [c0080000003e3b78]
dm_get_table_device+0x1e0/0x2c0 [dm_mod]
kernel: [c00000002aaa77a0] [c0080000003e7e48] dm_get_device+0x130/0x2f0
[dm_mod]
kernel: [c00000002aaa7850] [c008000000745234] multipath_ctr+0x9bc/0xff0
[dm_multipath]
kernel: [c00000002aaa79d0] [c0080000003e883c]
dm_table_add_target+0x1a4/0x420 [dm_mod]
kernel: [c00000002aaa7a90] [c0080000003ee874] table_load+0x15c/0x4a0
[dm_mod]
kernel: [c00000002aaa7b40] [c0080000003f1454] ctl_ioctl+0x27c/0x770 [dm_mod]
kernel: [c00000002aaa7d40] [c0080000003f1960] dm_ctl_ioctl+0x18/0x30
[dm_mod]
kernel: [c00000002aaa7d60] [c000000000481198] sys_ioctl+0xf8/0x150
kernel: [c00000002aaa7db0] [c00000000002ff28]
system_call_exception+0x158/0x2c0
kernel: [c00000002aaa7e10] [c00000000000c764] system_call_common+0xf4/0x258
kernel: --- interrupt: c00 at 0x7fffa06f4480
kernel: NIP:  00007fffa06f4480 LR: 00007fffa0ad6714 CTR: 0000000000000000
kernel: REGS: c00000002aaa7e80 TRAP: 0c00   Not tainted
(5.14.0-rc7-next-20210827-autotest)
kernel: MSR:  800000000000d033 <SF,EE,PR,ME,IR,DR,RI,LE>  CR: 24042204 
XER: 00000000
kernel: IRQMASK: 0 #012GPR00: 0000000000000036 00007fff9fd4c3a0
00007fffa07e7100 0000000000000005 #012GPR04: 00000000c138fd09
00007fff9806a0c0 00007fffa0ad9f18 00007fff9fd4a298 #012GPR08:
0000000000000005 0000000000000000 0000000000000000 0000000000000000
#012GPR12: 0000000000000000 00007fff9fd56300 00007fff9806a0c0
00007fffa0ad9c80 #012GPR16: 00007fffa0ad9c80 00007fffa0ad9c80
00007fffa0b13670 0000000000000000 #012GPR20: 00007fffa0ae3260
00007fffa0b12040 00007fff9806a0f0 00007fff9800eca0 #012GPR24:
00007fffa0ad9c80 00007fffa0ad9c80 00007fffa0ad9c80 0000000000000000
#012GPR28: 00007fffa0ad9c80 00007fffa0ad9c80 0000000000000000
00007fffa0ad9c80
kernel: NIP [00007fffa06f4480] 0x7fffa06f4480
kernel: LR [00007fffa0ad6714] 0x7fffa0ad6714
kernel: --- interrupt: c00
kernel: INFO: task systemd-udevd:180404 blocked for more than 122 seconds.
kernel:      Not tainted 5.14.0-rc7-next-20210827-autotest #1
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
kernel: task:systemd-udevd   state:D stack:    0 pid:180404 ppid:  1156
flags:0x00042482
kernel: Call Trace:

Before test
------------
$multipath -ll
mpathf (360050768108001b3a8000000000000e8) dm-10 IBM,2145
size=30G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| |- 1:0:0:4 sde 8:64  active ready running
| `- 2:0:0:4 sdq 65:0  active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  |- 1:0:1:4 sdj 8:144 active ready running
  `- 2:0:1:4 sdv 65:80 active ready running
$

After test fail
------------
$ multipath -ll
mpathf (360050768108001b3a8000000000000e8) dm-10 ##,##
size=30G features='1 queue_if_no_path' hwhandler='0' wp=rw
$

--
Regard's

Abdul Haleem
IBM Linux Technology Center


2021-09-01 18:57:27

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [next-20210827][ppc][multipathd] INFO: task hung in dm_table_add_target

On Wed, Sep 01, 2021 at 04:47:26PM +0530, Abdul Haleem wrote:
> Greeting's
>
> multiple task hung while adding the vfc disk back to the multipath on my
> powerpc box running linux-next kernel

Can you retry to reproduce this with lockdep enabled to see if there
is anything interesting holding this lock?

2021-09-28 10:28:01

by Abdul Haleem

[permalink] [raw]
Subject: Re: [next-20210827][ppc][multipathd] INFO: task hung in dm_table_add_target


On 9/1/21 7:06 PM, Christoph Hellwig wrote:
> On Wed, Sep 01, 2021 at 04:47:26PM +0530, Abdul Haleem wrote:
>> Greeting's
>>
>> multiple task hung while adding the vfc disk back to the multipath on my
>> powerpc box running linux-next kernel
> Can you retry to reproduce this with lockdep enabled to see if there
> is anything interesting holding this lock?

LOCKDEP was earlier enabled by default

# cat .config | grep LOCKDEP
CONFIG_LOCKDEP_SUPPORT=y

BTW, Recreated again on 5.15.0-rc2 mainline kernel and attaching the logs

--
Regard's

Abdul Haleem
IBM Linux Technology Center


Attachments:
hungtasklogs.txt (25.48 kB)

2021-09-29 06:57:31

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [next-20210827][ppc][multipathd] INFO: task hung in dm_table_add_target

On Tue, Sep 28, 2021 at 03:53:47PM +0530, Abdul Haleem wrote:
>
> On 9/1/21 7:06 PM, Christoph Hellwig wrote:
>> On Wed, Sep 01, 2021 at 04:47:26PM +0530, Abdul Haleem wrote:
>>> Greeting's
>>>
>>> multiple task hung while adding the vfc disk back to the multipath on my
>>> powerpc box running linux-next kernel
>> Can you retry to reproduce this with lockdep enabled to see if there
>> is anything interesting holding this lock?
>
> LOCKDEP was earlier enabled by default
>
> # cat .config | grep LOCKDEP
> CONFIG_LOCKDEP_SUPPORT=y
>
> BTW, Recreated again on 5.15.0-rc2 mainline kernel and attaching the logs

It seems the reinstate is blocking on the close which is blocking on
flushing dirty data. In other words it looks like the link blocking
looks like the symptom and not the cause.