2013-09-09 06:19:21

by Joe Jin

[permalink] [raw]
Subject: [PATCH] [scsi] enclosure: remove all possible sysfs entries before add device

When do disk pull/insert test we encountered below:

WARNING: at fs/sysfs/dir.c:455 sysfs_add_one+0xbc/0xe0()
Hardware name: SUN FIRE X4370 M2 SERVER
sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:03.0/0000:0d:00.0/host6/port-6:1/expander-6:1/port-6:1:14/end_device-6:1:14/target6:0:27/6:0:27:0/enclosure_device:HDD10'
Modules linked in: oracleacfs(P)(U) oracleadvm(P)(U) oracleoks(P)(U) mptctl mptbase autofs4 hidp bluetooth rfkill lockd sunrpc bonding be2iscsi iscsi_boot_sysfs ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp libiscsi scsi_transport_iscsi dm_round_robin dm_multipath video sbs sbshc acpi_pad acpi_memhotplug acpi_ipmi parport_pc lp parport ipmi_si ipmi_devintf ipmi_msghandler sg ses enclosure ixgbe e1000e hwmon igb snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc iTCO_wdt pcspkr i2c_i801 ioatdma ghes iTCO_vendor_support hed dca i2c_core i7core_edac edac_core dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod usb_storage shpchp mpt2sas scsi_transport_sas raid_class ahci libahci sd_mod crc_t10dif raid1 ext3 jbd mbcache
Pid: 23302, comm: kworker/u:2 Tainted: P 2.6.39-400.124.1.el5uek #1
Call Trace:
[<ffffffff811daf8c>] ? sysfs_add_one+0xbc/0xe0
[<ffffffff8106f030>] warn_slowpath_common+0x90/0xc0
[<ffffffff8106f15e>] warn_slowpath_fmt+0x6e/0x70
[<ffffffff81258bd4>] ? strlcat+0x54/0x70
[<ffffffff811daf8c>] sysfs_add_one+0xbc/0xe0
[<ffffffff811dbec8>] sysfs_do_create_link+0x148/0x1d0
[<ffffffff811dbf83>] sysfs_create_link+0x13/0x20
[<ffffffffa00de307>] enclosure_add_links+0xe7/0x110 [enclosure]
[<ffffffff8125325d>] ? kobject_release+0xd/0x10
[<ffffffff812549e7>] ? kref_put+0x37/0x70
[<ffffffffa00de3c3>] enclosure_add_device+0x93/0xa0 [enclosure]
[<ffffffffa00c8666>] ses_enclosure_find_by_addr+0x76/0xc0 [ses]
[<ffffffffa00c85f0>] ? ses_get_fault+0x40/0x40 [ses]
[<ffffffffa00de433>] enclosure_for_each_device+0x63/0x90 [enclosure]
[<ffffffffa00c8a8a>] ses_match_to_enclosure+0x11a/0x1d0 [ses]
[<ffffffffa00c8e08>] ses_intf_add+0x2c8/0x5c0 [ses]
[<ffffffff8125327a>] ? kobject_get+0x1a/0x30
[<ffffffff814e8b56>] ? add_tail+0x36/0x50
[<ffffffff81345ae4>] device_add+0x2d4/0x380
[<ffffffff8136b096>] scsi_sysfs_add_sdev+0xe6/0x2a0
[<ffffffff813682cc>] scsi_add_lun+0x41c/0x560
[<ffffffff81368a80>] scsi_probe_and_add_lun+0x1e0/0x3e0
[<ffffffff81041009>] ? default_spin_lock_flags+0x9/0x10
[<ffffffff813696e7>] __scsi_scan_target+0xe7/0x120
[<ffffffff81369b8d>] scsi_scan_target+0xcd/0xf0
[<ffffffffa003faab>] sas_rphy_add+0x11b/0x170 [scsi_transport_sas]
[<ffffffffa009a74f>] mpt2sas_transport_port_add+0x2cf/0x430 [mpt2sas]
[<ffffffffa008d437>] _scsih_sas_device_add+0x87/0x110 [mpt2sas]
[<ffffffffa0094eb8>] _scsih_add_device+0x248/0x340 [mpt2sas]
[<ffffffffa0098cb1>] ? mpt2sas_transport_update_links+0xf1/0x190 [mpt2sas]
[<ffffffffa00977b6>] _scsih_sas_topology_change_event+0x3c6/0x490 [mpt2sas]
[<ffffffff81080698>] ? add_timer+0x18/0x20
[<ffffffff8108a405>] ? queue_delayed_work_on+0xc5/0x170
[<ffffffffa0097a85>] _mpt2sas_fw_work+0x205/0x240 [mpt2sas]
[<ffffffffa0097ad9>] _firmware_event_work_delayed+0x19/0x20 [mpt2sas]
[<ffffffff8108c0d9>] process_one_work+0xf9/0x370
[<ffffffffa0097ac0>] ? _mpt2sas_fw_work+0x240/0x240 [mpt2sas]
[<ffffffff8108ca1a>] worker_thread+0xca/0x240
[<ffffffff8108c950>] ? manage_workers+0x90/0x90
[<ffffffff81090ff7>] kthread+0x97/0xa0
[<ffffffff8150fdc4>] kernel_thread_helper+0x4/0x10
[<ffffffff81090f60>] ? kthread_bind+0x80/0x80
[<ffffffff8150fdc0>] ? gs_change+0x13/0x13
---[ end trace 89a1351702ab360f ]---

During our test, multipath used, each LUN has 2 paths. when adding second
path enclousure did not check if will adding device's symlink existed or no.

Cc: James Bottomley <[email protected]>
Signed-off-by: Joe Jin <[email protected]>
---
drivers/misc/enclosure.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/drivers/misc/enclosure.c b/drivers/misc/enclosure.c
index 0e8df41..efc0e86 100644
--- a/drivers/misc/enclosure.c
+++ b/drivers/misc/enclosure.c
@@ -325,6 +325,13 @@ int enclosure_add_device(struct enclosure_device *edev, int component,
if (cdev->dev)
enclosure_remove_links(cdev);

+ if (dev) {
+ char name[ENCLOSURE_NAME_SIZE];
+
+ enclosure_link_name(cdev, name);
+ sysfs_remove_link(&dev->kobj, name);
+ }
+
put_device(cdev->dev);
cdev->dev = get_device(dev);
return enclosure_add_links(cdev);
--
1.8.3.1


2013-09-09 13:41:34

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] [scsi] enclosure: remove all possible sysfs entries before add device

> Modules linked in: oracleacfs(P)(U) oracleadvm(P)(U) oracleoks(P)(U)

Please reproduce without this weird crap loaded.

2013-09-09 14:50:10

by Brian King

[permalink] [raw]
Subject: Re: [PATCH] [scsi] enclosure: remove all possible sysfs entries before add device

On 09/09/2013 01:19 AM, Joe Jin wrote:
> When do disk pull/insert test we encountered below:
>
> WARNING: at fs/sysfs/dir.c:455 sysfs_add_one+0xbc/0xe0()
> Hardware name: SUN FIRE X4370 M2 SERVER
> sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:03.0/0000:0d:00.0/host6/port-6:1/expander-6:1/port-6:1:14/end_device-6:1:14/target6:0:27/6:0:27:0/enclosure_device:HDD10'
> Modules linked in: oracleacfs(P)(U) oracleadvm(P)(U) oracleoks(P)(U) mptctl mptbase autofs4 hidp bluetooth rfkill lockd sunrpc bonding be2iscsi iscsi_boot_sysfs ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp libiscsi scsi_transport_iscsi dm_round_robin dm_multipath video sbs sbshc acpi_pad acpi_memhotplug acpi_ipmi parport_pc lp parport ipmi_si ipmi_devintf ipmi_msghandler sg ses enclosure ixgbe e1000e hwmon igb snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc iTCO_wdt pcspkr i2c_i801 ioatdma ghes iTCO_vendor_support hed dca i2c_core i7core_edac edac_core dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod usb_storage shpchp mpt2sas scsi_transport_sas raid_class ahci libahci sd_mod crc_t10dif raid1 ext3 jbd mbcache
> Pid: 23302, comm: kworker/u:2 Tainted: P 2.6.39-400.124.1.el5uek #1
> Call Trace:
> [<ffffffff811daf8c>] ? sysfs_add_one+0xbc/0xe0
> [<ffffffff8106f030>] warn_slowpath_common+0x90/0xc0
> [<ffffffff8106f15e>] warn_slowpath_fmt+0x6e/0x70
> [<ffffffff81258bd4>] ? strlcat+0x54/0x70
> [<ffffffff811daf8c>] sysfs_add_one+0xbc/0xe0
> [<ffffffff811dbec8>] sysfs_do_create_link+0x148/0x1d0
> [<ffffffff811dbf83>] sysfs_create_link+0x13/0x20
> [<ffffffffa00de307>] enclosure_add_links+0xe7/0x110 [enclosure]
> [<ffffffff8125325d>] ? kobject_release+0xd/0x10
> [<ffffffff812549e7>] ? kref_put+0x37/0x70
> [<ffffffffa00de3c3>] enclosure_add_device+0x93/0xa0 [enclosure]
> [<ffffffffa00c8666>] ses_enclosure_find_by_addr+0x76/0xc0 [ses]
> [<ffffffffa00c85f0>] ? ses_get_fault+0x40/0x40 [ses]
> [<ffffffffa00de433>] enclosure_for_each_device+0x63/0x90 [enclosure]
> [<ffffffffa00c8a8a>] ses_match_to_enclosure+0x11a/0x1d0 [ses]
> [<ffffffffa00c8e08>] ses_intf_add+0x2c8/0x5c0 [ses]
> [<ffffffff8125327a>] ? kobject_get+0x1a/0x30
> [<ffffffff814e8b56>] ? add_tail+0x36/0x50
> [<ffffffff81345ae4>] device_add+0x2d4/0x380
> [<ffffffff8136b096>] scsi_sysfs_add_sdev+0xe6/0x2a0
> [<ffffffff813682cc>] scsi_add_lun+0x41c/0x560
> [<ffffffff81368a80>] scsi_probe_and_add_lun+0x1e0/0x3e0
> [<ffffffff81041009>] ? default_spin_lock_flags+0x9/0x10
> [<ffffffff813696e7>] __scsi_scan_target+0xe7/0x120
> [<ffffffff81369b8d>] scsi_scan_target+0xcd/0xf0
> [<ffffffffa003faab>] sas_rphy_add+0x11b/0x170 [scsi_transport_sas]
> [<ffffffffa009a74f>] mpt2sas_transport_port_add+0x2cf/0x430 [mpt2sas]
> [<ffffffffa008d437>] _scsih_sas_device_add+0x87/0x110 [mpt2sas]
> [<ffffffffa0094eb8>] _scsih_add_device+0x248/0x340 [mpt2sas]
> [<ffffffffa0098cb1>] ? mpt2sas_transport_update_links+0xf1/0x190 [mpt2sas]
> [<ffffffffa00977b6>] _scsih_sas_topology_change_event+0x3c6/0x490 [mpt2sas]
> [<ffffffff81080698>] ? add_timer+0x18/0x20
> [<ffffffff8108a405>] ? queue_delayed_work_on+0xc5/0x170
> [<ffffffffa0097a85>] _mpt2sas_fw_work+0x205/0x240 [mpt2sas]
> [<ffffffffa0097ad9>] _firmware_event_work_delayed+0x19/0x20 [mpt2sas]
> [<ffffffff8108c0d9>] process_one_work+0xf9/0x370
> [<ffffffffa0097ac0>] ? _mpt2sas_fw_work+0x240/0x240 [mpt2sas]
> [<ffffffff8108ca1a>] worker_thread+0xca/0x240
> [<ffffffff8108c950>] ? manage_workers+0x90/0x90
> [<ffffffff81090ff7>] kthread+0x97/0xa0
> [<ffffffff8150fdc4>] kernel_thread_helper+0x4/0x10
> [<ffffffff81090f60>] ? kthread_bind+0x80/0x80
> [<ffffffff8150fdc0>] ? gs_change+0x13/0x13
> ---[ end trace 89a1351702ab360f ]---
>
> During our test, multipath used, each LUN has 2 paths. when adding second
> path enclousure did not check if will adding device's symlink existed or no.
>
> Cc: James Bottomley <[email protected]>
> Signed-off-by: Joe Jin <[email protected]>
> ---
> drivers/misc/enclosure.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/drivers/misc/enclosure.c b/drivers/misc/enclosure.c
> index 0e8df41..efc0e86 100644
> --- a/drivers/misc/enclosure.c
> +++ b/drivers/misc/enclosure.c
> @@ -325,6 +325,13 @@ int enclosure_add_device(struct enclosure_device *edev, int component,
> if (cdev->dev)
> enclosure_remove_links(cdev);
>
> + if (dev) {
> + char name[ENCLOSURE_NAME_SIZE];
> +
> + enclosure_link_name(cdev, name);
> + sysfs_remove_link(&dev->kobj, name);
> + }
> +
> put_device(cdev->dev);
> cdev->dev = get_device(dev);
> return enclosure_add_links(cdev);
>

We've been looking at a similar issue with SAS multipath using the ipr driver.

Wendy, does this patch fix the issue you are seeing as well or is this a different
issue?

Thanks,

Brian

--
Brian King
Power Linux I/O
IBM Linux Technology Center

2013-09-10 01:51:17

by Joe Jin

[permalink] [raw]
Subject: Re: [PATCH] [scsi] enclosure: remove all possible sysfs entries before add device

On 09/09/13 21:41, Christoph Hellwig wrote:
>> Modules linked in: oracleacfs(P)(U) oracleadvm(P)(U) oracleoks(P)(U)
>
> Please reproduce without this weird crap loaded.
>
These modules is filesystem and will not impact enclosure.

Thanks,
Joe



2013-09-10 12:46:18

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH] [scsi] enclosure: remove all possible sysfs entries before add device

On Mon, 2013-09-09 at 14:19 +0800, Joe Jin wrote:
> When do disk pull/insert test we encountered below:
>
> WARNING: at fs/sysfs/dir.c:455 sysfs_add_one+0xbc/0xe0()
> Hardware name: SUN FIRE X4370 M2 SERVER
> sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:03.0/0000:0d:00.0/host6/port-6:1/expander-6:1/port-6:1:14/end_device-6:1:14/target6:0:27/6:0:27:0/enclosure_device:HDD10'
> Modules linked in: oracleacfs(P)(U) oracleadvm(P)(U) oracleoks(P)(U) mptctl mptbase autofs4 hidp bluetooth rfkill lockd sunrpc bonding be2iscsi iscsi_boot_sysfs ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp libiscsi scsi_transport_iscsi dm_round_robin dm_multipath video sbs sbshc acpi_pad acpi_memhotplug acpi_ipmi parport_pc lp parport ipmi_si ipmi_devintf ipmi_msghandler sg ses enclosure ixgbe e1000e hwmon igb snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc iTCO_wdt pcspkr i2c_i801 ioatdma ghes iTCO_vendor_support hed dca i2c_core i7core_edac edac_core dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod usb_storage shpchp mpt2sas scsi_transport_sas raid_class ahci libahci sd_mod crc_t10dif raid1 ext3 jbd mbcache
> Pid: 23302, comm: kworker/u:2 Tainted: P 2.6.39-400.124.1.el5uek #1
> Call Trace:
> [<ffffffff811daf8c>] ? sysfs_add_one+0xbc/0xe0
> [<ffffffff8106f030>] warn_slowpath_common+0x90/0xc0
> [<ffffffff8106f15e>] warn_slowpath_fmt+0x6e/0x70
> [<ffffffff81258bd4>] ? strlcat+0x54/0x70
> [<ffffffff811daf8c>] sysfs_add_one+0xbc/0xe0
> [<ffffffff811dbec8>] sysfs_do_create_link+0x148/0x1d0
> [<ffffffff811dbf83>] sysfs_create_link+0x13/0x20
> [<ffffffffa00de307>] enclosure_add_links+0xe7/0x110 [enclosure]
> [<ffffffff8125325d>] ? kobject_release+0xd/0x10
> [<ffffffff812549e7>] ? kref_put+0x37/0x70
> [<ffffffffa00de3c3>] enclosure_add_device+0x93/0xa0 [enclosure]
> [<ffffffffa00c8666>] ses_enclosure_find_by_addr+0x76/0xc0 [ses]
> [<ffffffffa00c85f0>] ? ses_get_fault+0x40/0x40 [ses]
> [<ffffffffa00de433>] enclosure_for_each_device+0x63/0x90 [enclosure]
> [<ffffffffa00c8a8a>] ses_match_to_enclosure+0x11a/0x1d0 [ses]
> [<ffffffffa00c8e08>] ses_intf_add+0x2c8/0x5c0 [ses]
> [<ffffffff8125327a>] ? kobject_get+0x1a/0x30
> [<ffffffff814e8b56>] ? add_tail+0x36/0x50
> [<ffffffff81345ae4>] device_add+0x2d4/0x380
> [<ffffffff8136b096>] scsi_sysfs_add_sdev+0xe6/0x2a0
> [<ffffffff813682cc>] scsi_add_lun+0x41c/0x560
> [<ffffffff81368a80>] scsi_probe_and_add_lun+0x1e0/0x3e0
> [<ffffffff81041009>] ? default_spin_lock_flags+0x9/0x10
> [<ffffffff813696e7>] __scsi_scan_target+0xe7/0x120
> [<ffffffff81369b8d>] scsi_scan_target+0xcd/0xf0
> [<ffffffffa003faab>] sas_rphy_add+0x11b/0x170 [scsi_transport_sas]
> [<ffffffffa009a74f>] mpt2sas_transport_port_add+0x2cf/0x430 [mpt2sas]
> [<ffffffffa008d437>] _scsih_sas_device_add+0x87/0x110 [mpt2sas]
> [<ffffffffa0094eb8>] _scsih_add_device+0x248/0x340 [mpt2sas]
> [<ffffffffa0098cb1>] ? mpt2sas_transport_update_links+0xf1/0x190 [mpt2sas]
> [<ffffffffa00977b6>] _scsih_sas_topology_change_event+0x3c6/0x490 [mpt2sas]
> [<ffffffff81080698>] ? add_timer+0x18/0x20
> [<ffffffff8108a405>] ? queue_delayed_work_on+0xc5/0x170
> [<ffffffffa0097a85>] _mpt2sas_fw_work+0x205/0x240 [mpt2sas]
> [<ffffffffa0097ad9>] _firmware_event_work_delayed+0x19/0x20 [mpt2sas]
> [<ffffffff8108c0d9>] process_one_work+0xf9/0x370
> [<ffffffffa0097ac0>] ? _mpt2sas_fw_work+0x240/0x240 [mpt2sas]
> [<ffffffff8108ca1a>] worker_thread+0xca/0x240
> [<ffffffff8108c950>] ? manage_workers+0x90/0x90
> [<ffffffff81090ff7>] kthread+0x97/0xa0
> [<ffffffff8150fdc4>] kernel_thread_helper+0x4/0x10
> [<ffffffff81090f60>] ? kthread_bind+0x80/0x80
> [<ffffffff8150fdc0>] ? gs_change+0x13/0x13
> ---[ end trace 89a1351702ab360f ]---
>
> During our test, multipath used, each LUN has 2 paths. when adding second
> path enclousure did not check if will adding device's symlink existed or no.

The description doesn't look helpful. The problem, presumably in a
remove/re-add test that the add event gets processed before the remove
event, which is why the link is still there?

> Cc: James Bottomley <[email protected]>
> Signed-off-by: Joe Jin <[email protected]>
> ---
> drivers/misc/enclosure.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/drivers/misc/enclosure.c b/drivers/misc/enclosure.c
> index 0e8df41..efc0e86 100644
> --- a/drivers/misc/enclosure.c
> +++ b/drivers/misc/enclosure.c
> @@ -325,6 +325,13 @@ int enclosure_add_device(struct enclosure_device *edev, int component,
> if (cdev->dev)
> enclosure_remove_links(cdev);
>
> + if (dev) {

This test is pointless. Adding a NULL device is illegal.

> + char name[ENCLOSURE_NAME_SIZE];
> +
> + enclosure_link_name(cdev, name);
> + sysfs_remove_link(&dev->kobj, name);

If we're really going to force eject the device, then this should be
enclosure_remove_device(edev, dev);

How do you prevent the case for remove re-add in the same slot? Surely
in that case, with your code, the link will get removed again when the
remove gets processed, so the slot will then look empty (even though
it's not).

James

2013-09-10 13:39:38

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH] [scsi] enclosure: remove all possible sysfs entries before add device

On Tue, 2013-09-10 at 09:50 +0800, Joe Jin wrote:
> On 09/09/13 21:41, Christoph Hellwig wrote:
> >> Modules linked in: oracleacfs(P)(U) oracleadvm(P)(U) oracleoks(P)(U)
> >
> > Please reproduce without this weird crap loaded.
> >
> These modules is filesystem and will not impact enclosure.

That's not really something you can guarantee, is it? Experience has
shown us that binary modules tend to do silly things which can
compromise the integrity of the whole system, so it's not unreasonable
to ask for a reproduction without them, particularly as most kernel
debugging time is volunteer best effort.

In this case the problem looks to be udev and timing related, so having
it reproduced without these modules eliminates a real source of
uncertainty.

James

2013-09-11 08:26:35

by Joe Jin

[permalink] [raw]
Subject: Re: [PATCH] [scsi] enclosure: remove all possible sysfs entries before add device

On 09/10/13 20:46, James Bottomley wrote:
>> > During our test, multipath used, each LUN has 2 paths. when adding second
>> > path enclousure did not check if will adding device's symlink existed or no.
> The description doesn't look helpful. The problem, presumably in a
> remove/re-add test that the add event gets processed before the remove
> event, which is why the link is still there?

Attach my debug info to here:

sd 7:0:27:0: [ses_intf_add]:cdev:ffff8817e81fcba0,intf:ffffffffa00c9400,sdev:ffff8817e81fc800
sd 7:0:27:0: [ses_intf_add] call ses_match_to_enclosure(edev=ffff8817e812c000,sdev=ffff8817e81fc800), cdev=ffff8817e81fcba0
sd 7:0:27:0: *** inq[6]: 48
sd 7:0:27:0: [sdq] 1172123568 512-byte logical blocks: (600 GB/558 GiB)
sd 7:0:27:0: [sdq] Write Protect is off
ADD: [enclosure_add_links]: kobj: <ffff8817e812cce8> target: <ffff8817e81fc948>, device
ADD: [enclosure_add_links]: kobj: <ffff8817e81fc948> target: <ffff8817e812cce8>, name: enclosure_device:HDD10
[ses_enclosure_find_by_addr] call enclosure_add_device(edev=ffff8817e812c000,i=4,efd->dev=ffff8817e81fc938),cdev=ffff8817e812ccd0
sd 7:0:27:0: [ses_intf_add] call ses_match_to_enclosure(edev=ffff8817ebd18000,sdev=ffff8817e81fc800), cdev=ffff8817e81fcba0
sd 7:0:27:0: *** inq[6]: 48
sd 7:0:27:0: [sdq] Write cache: disabled, read cache: enabled, supports DPO and FUA
[ses_enclosure_find_by_addr] call enclosure_add_device(edev=ffff8817e812c000,i=4,efd->dev=ffff8817e81fc938),cdev=ffff8817e812ccd0
sd 7:0:27:0: Attached scsi generic sg17 type 0
sdq: sdq1 sdq2
scsi 6:0:27:0: SSP: handle(0x001c), sas_addr(0x5000c500006bd15e), phy(2), device_name(0x5000c500006bd15e)
scsi 6:0:27:0: SSP: enclosure_logical_id(0x5080020000a3a510), slot(10)
scsi 6:0:27:0: serial_number(000934E00P0S 3SL00P0S)
scsi 6:0:27:0: qdepth(254), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1)
sd 6:0:27:0: [ses_intf_add]:cdev:ffff8817e8304ba0,intf:ffffffffa00c9400,sdev:ffff8817e8304800
sd 6:0:27:0: [ses_intf_add] call ses_match_to_enclosure(edev=ffff8817e0c5c000,sdev=ffff8817e8304800), cdev=ffff8817e8304ba0
sd 6:0:27:0: *** inq[6]: 48
sd 6:0:27:0: [sdac] 1172123568 512-byte logical blocks: (600 GB/558 GiB)
sd 7:0:27:0: [sdq] Attached SCSI disk
RM: [enclosure_remove_links]: kobj: <ffff8817e81fc948> name: [enclosure_device:HDD10]
RM: [enclosure_remove_links]: kobj: <ffff8817e812cce8> device
sd 6:0:27:0: [sdac] Write Protect is off
ADD: [enclosure_add_links]: kobj: <ffff8817e812cce8> target: <ffff8817e8304948>, device
ADD: [enclosure_add_links]: kobj: <ffff8817e8304948> target: <ffff8817e812cce8>, name: enclosure_device:HDD10
[ses_enclosure_find_by_addr] call enclosure_add_device(edev=ffff8817e812c000,i=4,efd->dev=ffff8817e8304938),cdev=ffff8817e812ccd0
sd 6:0:27:0: [ses_intf_add] call ses_match_to_enclosure(edev=ffff8817e4094000,sdev=ffff8817e8304800), cdev=ffff8817e8304ba0
sd 6:0:27:0: *** inq[6]: 48
RM: [enclosure_remove_links]: kobj: <ffff8817e9a80948> name: [enclosure_device:HDD10]
RM: [enclosure_remove_links]: kobj: <ffff8817e4094ce8> device
ADD: [enclosure_add_links]: kobj: <ffff8817e4094ce8> target: <ffff8817e8304948>, device
ADD: [enclosure_add_links]: kobj: <ffff8817e8304948> target: <ffff8817e4094ce8>, name: enclosure_device:HDD10
------------[ cut here ]------------
WARNING: at fs/sysfs/dir.c:455 sysfs_add_one+0xbc/0xe0()
Hardware name: SUN FIRE X4370 M2 SERVER
sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:03.0/0000:0d:00.0/host6/port-6:1/expander-6:1/port-6:1:14/end_device-6:1:14/target6:0:27/6:0:27:0/enclosure_device:HDD10'
Modules linked in: oracleacfs(P)(U) oracleadvm(P)(U) oracleoks(P)(U) mptctl mptbase autofs4 hidp bluetooth rfkill lockd sunrpc bonding be2iscsi iscsi_boot_sysfs ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp libiscsi scsi_transport_iscsi dm_round_robin dm_multipath video sbs sbshc acpi_pad acpi_memhotplug acpi_ipmi parport_pc lp parport ipmi_si ipmi_devintf ipmi_msghandler sg ses enclosure ixgbe e1000e hwmon igb snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc iTCO_wdt pcspkr i2c_i801 ioatdma ghes iTCO_vendor_support hed dca i2c_core i7core_edac edac_core dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod usb_storage shpchp mpt2sas scsi_transport_sas raid_class ahci libahci sd_mod crc_t10dif raid1 ext3 jbd mbcache
Pid: 23302, comm: kworker/u:2 Tainted: P 2.6.39-400.124.1.el5uek.bug17342873V2 #1
Call Trace:
[<ffffffff811daf8c>] ? sysfs_add_one+0xbc/0xe0
[<ffffffff8106f030>] warn_slowpath_common+0x90/0xc0
[<ffffffff8106f15e>] warn_slowpath_fmt+0x6e/0x70
[<ffffffff81258bd4>] ? strlcat+0x54/0x70
[<ffffffff811daf8c>] sysfs_add_one+0xbc/0xe0
[<ffffffff811dbec8>] sysfs_do_create_link+0x148/0x1d0
[<ffffffff811dbf83>] sysfs_create_link+0x13/0x20
[<ffffffffa00de307>] enclosure_add_links+0xe7/0x110 [enclosure]
[<ffffffff8125325d>] ? kobject_release+0xd/0x10
[<ffffffff812549e7>] ? kref_put+0x37/0x70
[<ffffffffa00de3c3>] enclosure_add_device+0x93/0xa0 [enclosure]
[<ffffffffa00c8666>] ses_enclosure_find_by_addr+0x76/0xc0 [ses]
[<ffffffffa00c85f0>] ? ses_get_fault+0x40/0x40 [ses]
[<ffffffffa00de433>] enclosure_for_each_device+0x63/0x90 [enclosure]
[<ffffffffa00c8a8a>] ses_match_to_enclosure+0x11a/0x1d0 [ses]
[<ffffffffa00c8e08>] ses_intf_add+0x2c8/0x5c0 [ses]
[<ffffffff8125327a>] ? kobject_get+0x1a/0x30
[<ffffffff814e8b56>] ? add_tail+0x36/0x50
[<ffffffff81345ae4>] device_add+0x2d4/0x380
[<ffffffff8136b096>] scsi_sysfs_add_sdev+0xe6/0x2a0
[<ffffffff813682cc>] scsi_add_lun+0x41c/0x560
[<ffffffff81368a80>] scsi_probe_and_add_lun+0x1e0/0x3e0
[<ffffffff81041009>] ? default_spin_lock_flags+0x9/0x10
[<ffffffff813696e7>] __scsi_scan_target+0xe7/0x120
[<ffffffff81369b8d>] scsi_scan_target+0xcd/0xf0
[<ffffffffa003faab>] sas_rphy_add+0x11b/0x170 [scsi_transport_sas]
[<ffffffffa009a74f>] mpt2sas_transport_port_add+0x2cf/0x430 [mpt2sas]
[<ffffffffa008d437>] _scsih_sas_device_add+0x87/0x110 [mpt2sas]
[<ffffffffa0094eb8>] _scsih_add_device+0x248/0x340 [mpt2sas]
[<ffffffffa0098cb1>] ? mpt2sas_transport_update_links+0xf1/0x190 [mpt2sas]
[<ffffffffa00977b6>] _scsih_sas_topology_change_event+0x3c6/0x490 [mpt2sas]
[<ffffffff81080698>] ? add_timer+0x18/0x20
[<ffffffff8108a405>] ? queue_delayed_work_on+0xc5/0x170
[<ffffffffa0097a85>] _mpt2sas_fw_work+0x205/0x240 [mpt2sas]
[<ffffffffa0097ad9>] _firmware_event_work_delayed+0x19/0x20 [mpt2sas]
[<ffffffff8108c0d9>] process_one_work+0xf9/0x370
[<ffffffffa0097ad9>] _firmware_event_work_delayed+0x19/0x20 [mpt2sas]
[<ffffffff8108c0d9>] process_one_work+0xf9/0x370
[<ffffffffa0097ac0>] ? _mpt2sas_fw_work+0x240/0x240 [mpt2sas]
[<ffffffff8108ca1a>] worker_thread+0xca/0x240
[<ffffffff8108c950>] ? manage_workers+0x90/0x90
[<ffffffff81090ff7>] kthread+0x97/0xa0
[<ffffffff8150fdc4>] kernel_thread_helper+0x4/0x10
[<ffffffff81090f60>] ? kthread_bind+0x80/0x80
[<ffffffff8150fdc0>] ? gs_change+0x13/0x13
---[ end trace 89a1351702ab360f ]---
[ses_enclosure_find_by_addr] call enclosure_add_device(edev=ffff8817e4094000,i=4,efd->dev=ffff8817e8304938),cdev=ffff8817e4094cd0

Per above message you can see the last tried for enclosure_device:HDD10,
the index of component is not same then conflicted.

BTW, 6:0:27:0 and 7:0:27:0 are same disk.

>
>> > Cc: James Bottomley <[email protected]>
>> > Signed-off-by: Joe Jin <[email protected]>
>> > ---
>> > drivers/misc/enclosure.c | 7 +++++++
>> > 1 file changed, 7 insertions(+)
>> >
>> > diff --git a/drivers/misc/enclosure.c b/drivers/misc/enclosure.c
>> > index 0e8df41..efc0e86 100644
>> > --- a/drivers/misc/enclosure.c
>> > +++ b/drivers/misc/enclosure.c
>> > @@ -325,6 +325,13 @@ int enclosure_add_device(struct enclosure_device *edev, int component,
>> > if (cdev->dev)
>> > enclosure_remove_links(cdev);
>> >
>> > + if (dev) {
> This test is pointless. Adding a NULL device is illegal.

Yes this is right.

Thanks,
Joe


>
>> > + char name[ENCLOSURE_NAME_SIZE];
>> > +
>> > + enclosure_link_name(cdev, name);
>> > + sysfs_remove_link(&dev->kobj, name);
> If we're really going to force eject the device, then this should be
> enclosure_remove_device(edev, dev);
>
> How do you prevent the case for remove re-add in the same slot? Surely
> in that case, with your code, the link will get removed again when the
> remove gets processed, so the slot will then look empty (even though
> it's not).