2015-04-10 07:16:26

by Calvin Owens

Subject: [PATCH] mpt2sas: mpt3sas: Fix memory corruption during initialization

While _scsih_probe_sas() is initializing devices, the hardware can trigger a
MPI2_EVENT_SAS_TOPO_RC_TARG_NOT_RESPONDING interrupt.

The handler for TARG_NOT_RESPONDING calls _scsih_device_remove_by_handle(),
which deletes the device in question from either ioc->sas_device_list or
ioc->sas_device_init_list. Since _scsih_probe_sas() uses no exclusion when
iterating over ioc->sas_device_init_list, this results in a use-after-free
in _scsih_probe_sas(), and also corrupts the list:

mpt2sas1: removing handle(0x0020), sas_addr(0x5f80f418573360e0)
mpt2sas1: log_info(0x31111000): originator(PL), code(0x11), sub_code(0x1000)
------------[ cut here ]------------
WARNING: at lib/list_debug.c:56 __list_del_entry+0xc3/0xd0()
list_del corruption, ffff88240012fa00->prev is LIST_POISON2 (dead000000200200)
<snip>
Workqueue: events work_for_cpu_fn
ffffffff810c4f17 ffff881214825b38 0000000000000009 ffff881214825ae8
ffffffff8169b61e ffff881214825b28 ffffffff8104a990 0000000000000002
ffff88240012f900 ffff88240012fa00 ffff881217595af8 0000000000000282
Call Trace:
[<ffffffff810c4f17>] ? print_modules+0xd7/0x120
[<ffffffff8169b61e>] dump_stack+0x19/0x1b
[<ffffffff8104a990>] warn_slowpath_common+0x70/0xa0
[<ffffffff8104aa76>] warn_slowpath_fmt+0x46/0x50
[<ffffffff816a2234>] ? _raw_spin_lock_irqsave+0x84/0xa0
[<ffffffffa0010e8e>] ? _scsih_probe_sas+0x8e/0x110 [mpt2sas]
[<ffffffff8132a5a3>] __list_del_entry+0xc3/0xd0
[<ffffffffa0010e99>] _scsih_probe_sas+0x99/0x110 [mpt2sas]
[<ffffffffa0011d5f>] _scsih_scan_finished+0x19f/0x2c0 [mpt2sas]
[<ffffffff81429d67>] do_scsi_scan_host+0x77/0xa0
[<ffffffff81429f20>] scsi_scan_host+0x190/0x1c0
[<ffffffffa0011402>] _scsih_probe+0x452/0x640 [mpt2sas]
[<ffffffff813444eb>] local_pci_probe+0x4b/0x80
[<ffffffff8106b848>] work_for_cpu_fn+0x18/0x30
[<ffffffff81070012>] process_one_work+0x212/0x6e0
[<ffffffff8106ffa6>] ? process_one_work+0x1a6/0x6e0
[<ffffffff8108ed1f>] ? local_clock+0x4f/0x60
[<ffffffff8107050c>] process_scheduled_works+0x2c/0x40
[<ffffffff81070ae2>] worker_thread+0x262/0x370
[<ffffffff81070880>] ? rescuer_thread+0x360/0x360
[<ffffffff81078f3b>] kthread+0xdb/0xe0
[<ffffffff810b5e8d>] ? trace_hardirqs_on+0xd/0x10
[<ffffffff81078e60>] ? kthread_create_on_node+0x140/0x140
[<ffffffff816ac01c>] ret_from_fork+0x7c/0xb0
[<ffffffff81078e60>] ? kthread_create_on_node+0x140/0x140
---[ end trace 41352a0bd2d0d61b ]---

This either results in an immediate panic, or corrupts random memory and
causes nasty problems later in the box's uptime.

This patch splices the discovered devices out of the global list while
holding the lock, since _scsih_probe_sas() always removes them from that
global list anyway (either deleting them if initialization fails, or
moving them onto ioc->sas_device_list if it succeeds). The interrupt that
caused this bug will no longer cause the device to be removed during
initialization, since it won't exist on the global lists, but
_scsih_probe_sas() will remove it anyway when it fails to initialize.

Cc: [email protected]
Signed-off-by: Calvin Owens <[email protected]>
---
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 14 +++++++++++---
drivers/scsi/mpt3sas/mpt3sas_scsih.c | 14 +++++++++++---
2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 3f26147..4d603cb 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -7977,11 +7977,19 @@ _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
{
struct _sas_device *sas_device, *next;
unsigned long flags;
+ LIST_HEAD(head);

- /* SAS Device List */
- list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
- list) {
+ /*
+ * Yank the entries out of the global list before attempting to iterate
+ * over them, since interrupts can delete sas_device entries out of the
+ * global list while we iterate.
+ */
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ list_splice_init(&ioc->sas_device_init_list, &head);
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

+ /* SAS Device List */
+ list_for_each_entry_safe(sas_device, next, &head, list) {
if (ioc->hide_drives)
continue;

diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 5a97e32..1a6a6a3 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -7609,11 +7609,19 @@ _scsih_probe_sas(struct MPT3SAS_ADAPTER *ioc)
{
struct _sas_device *sas_device, *next;
unsigned long flags;
+ LIST_HEAD(head);

- /* SAS Device List */
- list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
- list) {
+ /*
+ * Yank the entries out of the global list before attempting to iterate
+ * over them, since interrupts can delete sas_device entries out of the
+ * global list while we iterate.
+ */
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ list_splice_init(&ioc->sas_device_init_list, &head);
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

+ /* SAS Device List */
+ list_for_each_entry_safe(sas_device, next, &head, list) {
if (!mpt3sas_transport_port_add(ioc, sas_device->handle,
sas_device->sas_address_parent)) {
list_del(&sas_device->list);
--
1.8.1
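
The splice-under-lock idea in the patch above can be tried out in userspace. The following is only a sketch: `list_node`, `list_take_all()`, `shared_list` and `drain_pending()` are hypothetical names standing in for the kernel's `list_head`, `list_splice_init()`, the adapter's list plus `sas_device_lock`, and the new opening lines of `_scsih_probe_sas()`. A C11 `atomic_flag` spinlock stands in for `spin_lock_irqsave()`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_node {
	struct list_node *next, *prev;
};

void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

int list_is_empty(const struct list_node *head)
{
	return head->next == head;
}

void list_append(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Move every entry of 'from' to the front of 'to' and leave 'from' empty. */
void list_take_all(struct list_node *from, struct list_node *to)
{
	assert(from != to);
	if (list_is_empty(from))
		return;
	from->next->prev = to;
	from->prev->next = to->next;
	to->next->prev = from->prev;
	to->next = from->next;
	list_init(from);
}

/* Hypothetical stand-in for the adapter's list and the lock guarding it. */
struct shared_list {
	atomic_flag lock;
	struct list_node head;
};

void shared_init(struct shared_list *s)
{
	atomic_flag_clear(&s->lock);
	list_init(&s->head);
}

void shared_lock(struct shared_list *s)
{
	while (atomic_flag_test_and_set(&s->lock))
		; /* spin until the holder releases the lock */
}

void shared_unlock(struct shared_list *s)
{
	atomic_flag_clear(&s->lock);
}

void shared_add(struct shared_list *s, struct list_node *node)
{
	shared_lock(s);
	list_append(node, &s->head);
	shared_unlock(s);
}

/*
 * Detach all pending entries while holding the lock, then count them
 * without the lock: nobody else can reach the private list any more.
 */
size_t drain_pending(struct shared_list *s, struct list_node *private_head)
{
	struct list_node *pos;
	size_t n = 0;

	list_init(private_head);
	shared_lock(s);
	list_take_all(&s->head, private_head);
	shared_unlock(s);

	for (pos = private_head->next; pos != private_head; pos = pos->next)
		n++;
	return n;
}
```

The lock is held only for the constant-time pointer swap, which is the property that makes this pattern attractive when the per-entry work is slow or may sleep.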


2015-04-10 14:32:16

by James Bottomley

Subject: Re: [PATCH] mpt2sas: mpt3sas: Fix memory corruption during initialization

On Fri, 2015-04-10 at 00:14 -0700, Calvin Owens wrote:
> While _scsih_probe_sas() is initializing devices, the hardware can trigger a
> MPI2_EVENT_SAS_TOPO_RC_TARG_NOT_RESPONDING interrupt.
>
> The handler for TARG_NOT_RESPONDING calls _scsih_device_remove_by_handle(),
> which deletes the device in question from either ioc->sas_device_list or
> ioc->sas_device_init_list. Since _scsih_probe_sas() uses no exclusion when
> iterating over ioc->sas_device_init_list, this results in a use-after-free
> in _scsih_probe_sas(), and also corrupts the list:
>
> mpt2sas1: removing handle(0x0020), sas_addr(0x5f80f418573360e0)
> mpt2sas1: log_info(0x31111000): originator(PL), code(0x11), sub_code(0x1000)
> ------------[ cut here ]------------
> WARNING: at lib/list_debug.c:56 __list_del_entry+0xc3/0xd0()
> list_del corruption, ffff88240012fa00->prev is LIST_POISON2 (dead000000200200)
> <snip>
> Workqueue: events work_for_cpu_fn
> ffffffff810c4f17 ffff881214825b38 0000000000000009 ffff881214825ae8
> ffffffff8169b61e ffff881214825b28 ffffffff8104a990 0000000000000002
> ffff88240012f900 ffff88240012fa00 ffff881217595af8 0000000000000282
> Call Trace:
> [<ffffffff810c4f17>] ? print_modules+0xd7/0x120
> [<ffffffff8169b61e>] dump_stack+0x19/0x1b
> [<ffffffff8104a990>] warn_slowpath_common+0x70/0xa0
> [<ffffffff8104aa76>] warn_slowpath_fmt+0x46/0x50
> [<ffffffff816a2234>] ? _raw_spin_lock_irqsave+0x84/0xa0
> [<ffffffffa0010e8e>] ? _scsih_probe_sas+0x8e/0x110 [mpt2sas]
> [<ffffffff8132a5a3>] __list_del_entry+0xc3/0xd0
> [<ffffffffa0010e99>] _scsih_probe_sas+0x99/0x110 [mpt2sas]
> [<ffffffffa0011d5f>] _scsih_scan_finished+0x19f/0x2c0 [mpt2sas]
> [<ffffffff81429d67>] do_scsi_scan_host+0x77/0xa0
> [<ffffffff81429f20>] scsi_scan_host+0x190/0x1c0
> [<ffffffffa0011402>] _scsih_probe+0x452/0x640 [mpt2sas]
> [<ffffffff813444eb>] local_pci_probe+0x4b/0x80
> [<ffffffff8106b848>] work_for_cpu_fn+0x18/0x30
> [<ffffffff81070012>] process_one_work+0x212/0x6e0
> [<ffffffff8106ffa6>] ? process_one_work+0x1a6/0x6e0
> [<ffffffff8108ed1f>] ? local_clock+0x4f/0x60
> [<ffffffff8107050c>] process_scheduled_works+0x2c/0x40
> [<ffffffff81070ae2>] worker_thread+0x262/0x370
> [<ffffffff81070880>] ? rescuer_thread+0x360/0x360
> [<ffffffff81078f3b>] kthread+0xdb/0xe0
> [<ffffffff810b5e8d>] ? trace_hardirqs_on+0xd/0x10
> [<ffffffff81078e60>] ? kthread_create_on_node+0x140/0x140
> [<ffffffff816ac01c>] ret_from_fork+0x7c/0xb0
> [<ffffffff81078e60>] ? kthread_create_on_node+0x140/0x140
> ---[ end trace 41352a0bd2d0d61b ]---
>
> This either results in an immediate panic, or corrupts random memory and
> causes nasty problems later in the box's uptime.
>
> This patch splices the discovered devices out of the global list while
> holding the lock, since _scsih_probe_sas() always removes them from that
> global list anyway (either deleting them if initialization fails, or
> moving them onto ioc->sas_device_list if it succeeds). The interrupt that
> caused this bug will no longer cause the device to be removed during
> initialization, since it won't exist on the global lists, but
> _scsih_probe_sas() will remove it anyway when it fails to initialize.

Hopefully the avago team will curate this, but just in case they don't,
the correct list to make sure it gets the attention of storage people
should be

[email protected]

James


2015-04-10 16:43:58

by Sathya Prakash

Subject: RE: [PATCH] mpt2sas: mpt3sas: Fix memory corruption during initialization

James & Calvin,
Noted. We will review this and either ACK it or get back to you with further
questions in the next couple of weeks.

Thanks
Sathya

-----Original Message-----
From: James Bottomley [mailto:[email protected]]
Sent: Friday, April 10, 2015 9:31 AM
To: [email protected]
Cc: [email protected]; [email protected];
[email protected]; [email protected];
[email protected]; [email protected];
[email protected]; [email protected]
Subject: Re: [PATCH] mpt2sas: mpt3sas: Fix memory corruption during
initialization

On Fri, 2015-04-10 at 00:14 -0700, Calvin Owens wrote:
> While _scsih_probe_sas() is initializing devices, the hardware can
> trigger a MPI2_EVENT_SAS_TOPO_RC_TARG_NOT_RESPONDING interrupt.
>
> The handler for TARG_NOT_RESPONDING calls
> _scsih_device_remove_by_handle(), which deletes the device in question
> from either ioc->sas_device_list or
> ioc->sas_device_init_list. Since _scsih_probe_sas() uses no exclusion
> when iterating over ioc->sas_device_init_list, this results in a
> use-after-free in _scsih_probe_sas(), and also corrupts the list:
>
> mpt2sas1: removing handle(0x0020), sas_addr(0x5f80f418573360e0)
> mpt2sas1: log_info(0x31111000): originator(PL), code(0x11),
> sub_code(0x1000)
> ------------[ cut here ]------------
> WARNING: at lib/list_debug.c:56 __list_del_entry+0xc3/0xd0()
> list_del corruption, ffff88240012fa00->prev is LIST_POISON2
> (dead000000200200)
> <snip>
> Workqueue: events work_for_cpu_fn
> ffffffff810c4f17 ffff881214825b38 0000000000000009 ffff881214825ae8
> ffffffff8169b61e ffff881214825b28 ffffffff8104a990 0000000000000002
> ffff88240012f900 ffff88240012fa00 ffff881217595af8 0000000000000282
> Call Trace:
> [<ffffffff810c4f17>] ? print_modules+0xd7/0x120
> [<ffffffff8169b61e>] dump_stack+0x19/0x1b
> [<ffffffff8104a990>] warn_slowpath_common+0x70/0xa0
> [<ffffffff8104aa76>] warn_slowpath_fmt+0x46/0x50
> [<ffffffff816a2234>] ? _raw_spin_lock_irqsave+0x84/0xa0
> [<ffffffffa0010e8e>] ? _scsih_probe_sas+0x8e/0x110 [mpt2sas]
> [<ffffffff8132a5a3>] __list_del_entry+0xc3/0xd0
> [<ffffffffa0010e99>] _scsih_probe_sas+0x99/0x110 [mpt2sas]
> [<ffffffffa0011d5f>] _scsih_scan_finished+0x19f/0x2c0 [mpt2sas]
> [<ffffffff81429d67>] do_scsi_scan_host+0x77/0xa0
> [<ffffffff81429f20>] scsi_scan_host+0x190/0x1c0
> [<ffffffffa0011402>] _scsih_probe+0x452/0x640 [mpt2sas]
> [<ffffffff813444eb>] local_pci_probe+0x4b/0x80
> [<ffffffff8106b848>] work_for_cpu_fn+0x18/0x30
> [<ffffffff81070012>] process_one_work+0x212/0x6e0
> [<ffffffff8106ffa6>] ? process_one_work+0x1a6/0x6e0
> [<ffffffff8108ed1f>] ? local_clock+0x4f/0x60
> [<ffffffff8107050c>] process_scheduled_works+0x2c/0x40
> [<ffffffff81070ae2>] worker_thread+0x262/0x370
> [<ffffffff81070880>] ? rescuer_thread+0x360/0x360
> [<ffffffff81078f3b>] kthread+0xdb/0xe0
> [<ffffffff810b5e8d>] ? trace_hardirqs_on+0xd/0x10
> [<ffffffff81078e60>] ? kthread_create_on_node+0x140/0x140
> [<ffffffff816ac01c>] ret_from_fork+0x7c/0xb0
> [<ffffffff81078e60>] ? kthread_create_on_node+0x140/0x140
> ---[ end trace 41352a0bd2d0d61b ]---
>
> This either results in an immediate panic, or corrupts random memory
> and causes nasty problems later in the box's uptime.
>
> This patch splices the discovered devices out of the global list while
> holding the lock, since _scsih_probe_sas() always removes them from
> that global list anyway (either deleting them if initialization fails,
> or moving them onto ioc->sas_device_list if it succeeds). The
> interrupt that caused this bug will no longer cause the device to be
> removed during initialization, since it won't exist on the global
> lists, but
> _scsih_probe_sas() will remove it anyway when it fails to initialize.

Hopefully the avago team will curate this, but just in case they don't, the
correct list to make sure it gets the attention of storage people should be

[email protected]

James

2015-05-04 15:05:53

by Sreekanth Reddy

Subject: Re: [PATCH] mpt2sas: mpt3sas: Fix memory corruption during initialization

I applied this patch to the latest upstream mpt3sas driver, then compiled and loaded the driver.
In the driver logs I did not see any of the attached drives being added to the OS, and 'fdisk -l' does not list
the drives that are actually attached to the HBA.

While debugging this, I found that in '_scsih_target_alloc' the driver
searches the lists 'sas_device_init_list' and 'sas_device_list' for the sas_device,
based on the device SAS address, using mpt3sas_scsih_sas_device_find_by_sas_address().
Since the device is no longer on 'sas_device_init_list' (the patch moved it to the local 'head' list), the driver exits
from this function without updating the required device addition information.
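
The failure described above can be reproduced in miniature. This is a sketch only, with hypothetical names (`struct dev`, `struct adapter`, `find_by_sas_address()`, `detach_init_list()`) and singly linked lists in place of the driver's structures: once the init list has been detached onto a private list, a lookup that walks only the two adapter lists can no longer see the device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified device and adapter; not the driver's structs. */
struct dev {
	uint64_t sas_address;
	struct dev *next;
};

struct adapter {
	struct dev *init_list;   /* devices discovered but not yet registered */
	struct dev *device_list; /* devices registered with the mid layer */
};

static struct dev *find_in(struct dev *head, uint64_t addr)
{
	for (; head; head = head->next)
		if (head->sas_address == addr)
			return head;
	return NULL;
}

/* Searches only the two lists that hang off the adapter. */
struct dev *find_by_sas_address(struct adapter *ioc, uint64_t addr)
{
	struct dev *d;

	assert(ioc != NULL);
	d = find_in(ioc->init_list, addr);
	return d ? d : find_in(ioc->device_list, addr);
}

/* Models the splice: the caller now owns the entries privately. */
struct dev *detach_init_list(struct adapter *ioc)
{
	struct dev *private_list = ioc->init_list;

	ioc->init_list = NULL;
	return private_list;
}
```

Any callback that runs while the entries sit on the private list, and that relies on such a lookup, sees a missing device even though the structure is still allocated.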

To solve the original problem (i.e. memory corruption), I have attached a patch.
In this patch I have added an atomic flag, is_on_sas_device_init_list, to struct _sas_device,
and followed the algorithm below.

1. Whenever a device is added to sas_device_init_list, the driver sets this device's atomic flag to one.

2. While this device is being added to the SCSI mid layer (SML):
   a. If the device is successfully added to the OS, the driver moves it from sas_device_init_list to sas_device_list and resets the flag to zero.
   b. If the device fails to register with the SML, the driver also resets the flag to zero in _scsih_sas_device_remove(), removes the device entry from sas_device_init_list, and frees the device structure.

3. When a device is removed, the driver receives a target-not-responding event, and in _scsih_device_remove_by_handle():
   a. The driver checks whether the addition of discovered devices to the SML is currently running.
      i. If the addition (registration) is running, the driver checks whether the device is on sas_device_init_list by reading the atomic flag. If it is, the driver ignores this device removal event, since registration with the SML will fail and the device is removed in _scsih_sas_device_remove() as described in step 2.
      ii. If the device is not on sas_device_init_list, or the addition (registration) of discovered devices has already completed, the device entry is removed from sas_device_list and the device structure is removed by this function.

4. If the device removal event is received after the device structure has been freed because registration with the SML failed, _scsih_device_remove_by_handle() will not find the device on sas_device_list or sas_device_init_list, so the driver ignores the removal event.
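
The decision logic in steps 1 to 4 can be sketched as a small state model. The names here (`struct dev`, `struct adapter`, `init_add()`, `probe_finish()`, `handle_removal_event()`) are hypothetical, list manipulation is reduced to a `linked` boolean, and locking is left out; in the driver all of this would run under ioc->sas_device_lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: 'linked' means the device is on one of the lists. */
struct dev {
	bool on_init_list; /* models is_on_sas_device_init_list */
	bool linked;
};

struct adapter {
	bool probe_running; /* models discovered_device_addition_on */
};

/* Step 1: the device is placed on the init list and the flag is set. */
void init_add(struct dev *d)
{
	assert(d != NULL);
	d->on_init_list = true;
	d->linked = true;
}

/*
 * Step 2: registration finished. On success the device stays linked (it
 * moved to the main list); on failure it is unlinked. The flag is cleared
 * in both cases.
 */
void probe_finish(struct dev *d, bool registered)
{
	d->on_init_list = false;
	if (!registered)
		d->linked = false;
}

/*
 * Steps 3 and 4: returns true only when the removal handler itself should
 * unlink and tear down the device.
 */
bool handle_removal_event(const struct adapter *ioc, struct dev *d)
{
	if (!d->linked)
		return false; /* step 4: already gone, ignore the event */
	if (ioc->probe_running && d->on_init_list)
		return false; /* step 3.a.i: leave it to the probe path */
	d->linked = false;    /* step 3.a.ii: remove it here */
	return true;
}
```

The model makes one property easy to see: while the probe is running, exactly one path (the probe loop) is allowed to unlink a device that is still flagged as being on the init list.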

Signed-off-by: Sreekanth Reddy <[email protected]>
---
drivers/scsi/mpt2sas/mpt2sas_base.h | 2 ++
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 45 +++++++++++++++++++++++++++---------
drivers/scsi/mpt3sas/mpt3sas_base.h | 2 ++
drivers/scsi/mpt3sas/mpt3sas_scsih.c | 43 ++++++++++++++++++++++++++--------
4 files changed, 71 insertions(+), 21 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
index caff8d1..1aa10d2 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -376,6 +376,7 @@ struct _sas_device {
u8 phy;
u8 responding;
u8 pfa_led_on;
+ atomic_t is_on_sas_device_init_list;
};

/**
@@ -833,6 +834,7 @@ struct MPT2SAS_ADAPTER {
u8 broadcast_aen_busy;
u16 broadcast_aen_pending;
u8 shost_recovery;
+ u8 discovered_device_addition_on;

struct mutex reset_in_progress_mutex;
spinlock_t ioc_reset_in_progress_lock;
diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 3f26147..2a61286 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -590,13 +590,20 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
struct _sas_device *sas_device)
{
unsigned long flags;
+ struct _sas_device *same_sas_device;

if (!sas_device)
return;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- list_del(&sas_device->list);
- kfree(sas_device);
+ same_sas_device = _scsih_sas_device_find_by_handle(ioc,
+ sas_device->handle);
+ if (same_sas_device) {
+ list_del(&same_sas_device->list);
+ if (atomic_read(&sas_device->is_on_sas_device_init_list))
+ atomic_set(&sas_device->is_on_sas_device_init_list, 0);
+ kfree(same_sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}

@@ -658,6 +664,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
"(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
sas_device->handle, (unsigned long long)sas_device->sas_address));

+ atomic_set(&sas_device->is_on_sas_device_init_list, 1);
spin_lock_irqsave(&ioc->sas_device_lock, flags);
list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
_scsih_determine_boot_device(ioc, sas_device, 0);
@@ -5364,8 +5371,14 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)

spin_lock_irqsave(&ioc->sas_device_lock, flags);
sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
- if (sas_device)
- list_del(&sas_device->list);
+ if (sas_device) {
+ if (ioc->discovered_device_addition_on &&
+ atomic_read(&sas_device->is_on_sas_device_init_list)) {
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+ return;
+ } else
+ list_del(&sas_device->list);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
if (sas_device)
_scsih_remove_device(ioc, sas_device);
@@ -5391,8 +5404,14 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
spin_lock_irqsave(&ioc->sas_device_lock, flags);
sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
sas_address);
- if (sas_device)
- list_del(&sas_device->list);
+ if (sas_device) {
+ if (ioc->discovered_device_addition_on &&
+ atomic_read(&sas_device->is_on_sas_device_init_list)) {
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+ return;
+ } else
+ list_del(&sas_device->list);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
if (sas_device)
_scsih_remove_device(ioc, sas_device);
@@ -7978,32 +7997,36 @@ _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
struct _sas_device *sas_device, *next;
unsigned long flags;

+ ioc->discovered_device_addition_on = 1;
/* SAS Device List */
list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
list) {

if (ioc->hide_drives)
continue;
-
+
if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
sas_device->sas_address_parent)) {
- list_del(&sas_device->list);
- kfree(sas_device);
+ mpt2sas_transport_port_remove(ioc,
+ sas_device->sas_address,
+ sas_device->sas_address_parent);
+ _scsih_sas_device_remove(ioc, sas_device);
continue;
} else if (!sas_device->starget) {
if (!ioc->is_driver_loading) {
mpt2sas_transport_port_remove(ioc,
sas_device->sas_address,
sas_device->sas_address_parent);
- list_del(&sas_device->list);
- kfree(sas_device);
+ _scsih_sas_device_remove(ioc, sas_device);
continue;
}
}
spin_lock_irqsave(&ioc->sas_device_lock, flags);
list_move_tail(&sas_device->list, &ioc->sas_device_list);
+ atomic_dec(&sas_device->is_on_sas_device_init_list);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}
+ ioc->discovered_device_addition_on = 0;
}

/**
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
index afa8816..6188490 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -315,6 +315,7 @@ struct _sas_device {
u8 responding;
u8 fast_path;
u8 pfa_led_on;
+ atomic_t is_on_sas_device_init_list;
};

/**
@@ -766,6 +767,7 @@ struct MPT3SAS_ADAPTER {
u8 broadcast_aen_busy;
u16 broadcast_aen_pending;
u8 shost_recovery;
+ u8 discovered_device_addition_on;

struct mutex reset_in_progress_mutex;
spinlock_t ioc_reset_in_progress_lock;
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 5a97e32..53cc9ea 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -582,13 +582,20 @@ _scsih_sas_device_remove(struct MPT3SAS_ADAPTER *ioc,
struct _sas_device *sas_device)
{
unsigned long flags;
+ struct _sas_device *same_sas_device;

if (!sas_device)
return;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- list_del(&sas_device->list);
- kfree(sas_device);
+ same_sas_device = _scsih_sas_device_find_by_handle(ioc,
+ sas_device->handle);
+ if (same_sas_device) {
+ list_del(&same_sas_device->list);
+ if (atomic_read(&sas_device->is_on_sas_device_init_list))
+ atomic_set(&sas_device->is_on_sas_device_init_list, 0);
+ kfree(same_sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}

@@ -610,8 +616,14 @@ _scsih_device_remove_by_handle(struct MPT3SAS_ADAPTER *ioc, u16 handle)

spin_lock_irqsave(&ioc->sas_device_lock, flags);
sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
- if (sas_device)
- list_del(&sas_device->list);
+ if (sas_device) {
+ if (ioc->discovered_device_addition_on &&
+ atomic_read(&sas_device->is_on_sas_device_init_list)) {
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+ return;
+ } else
+ list_del(&sas_device->list);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
if (sas_device)
_scsih_remove_device(ioc, sas_device);
@@ -637,8 +649,14 @@ mpt3sas_device_remove_by_sas_address(struct MPT3SAS_ADAPTER *ioc,
spin_lock_irqsave(&ioc->sas_device_lock, flags);
sas_device = mpt3sas_scsih_sas_device_find_by_sas_address(ioc,
sas_address);
- if (sas_device)
- list_del(&sas_device->list);
+ if (sas_device) {
+ if (ioc->discovered_device_addition_on &&
+ atomic_read(&sas_device->is_on_sas_device_init_list)) {
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+ return;
+ } else
+ list_del(&sas_device->list);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
if (sas_device)
_scsih_remove_device(ioc, sas_device);
@@ -663,6 +681,7 @@ _scsih_sas_device_add(struct MPT3SAS_ADAPTER *ioc,
ioc->name, __func__, sas_device->handle,
(unsigned long long)sas_device->sas_address));

+ atomic_set(&sas_device->is_on_sas_device_init_list, 1);
spin_lock_irqsave(&ioc->sas_device_lock, flags);
list_add_tail(&sas_device->list, &ioc->sas_device_list);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -7610,14 +7629,17 @@ _scsih_probe_sas(struct MPT3SAS_ADAPTER *ioc)
struct _sas_device *sas_device, *next;
unsigned long flags;

+ ioc->discovered_device_addition_on = 1;
/* SAS Device List */
list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
list) {

if (!mpt3sas_transport_port_add(ioc, sas_device->handle,
sas_device->sas_address_parent)) {
- list_del(&sas_device->list);
- kfree(sas_device);
+ mpt3sas_transport_port_remove(ioc,
+ sas_device->sas_address,
+ sas_device->sas_address_parent);
+ _scsih_sas_device_remove(ioc, sas_device);
continue;
} else if (!sas_device->starget) {
/*
@@ -7630,16 +7652,17 @@ _scsih_probe_sas(struct MPT3SAS_ADAPTER *ioc)
mpt3sas_transport_port_remove(ioc,
sas_device->sas_address,
sas_device->sas_address_parent);
- list_del(&sas_device->list);
- kfree(sas_device);
+ _scsih_sas_device_remove(ioc, sas_device);
continue;
}
}

spin_lock_irqsave(&ioc->sas_device_lock, flags);
list_move_tail(&sas_device->list, &ioc->sas_device_list);
+ atomic_dec(&sas_device->is_on_sas_device_init_list);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}
+ ioc->discovered_device_addition_on = 0;
}

/**
--
2.0.2

2015-05-05 16:10:41

by Tomas Henzl

Subject: Re: [PATCH] mpt2sas: mpt3sas: Fix memory corruption during initialization

On 05/04/2015 05:05 PM, Sreekanth Reddy wrote:
> I have applied this patch on the latest upstream mpt3sas driver, then I have compiled and loaded the driver.
> In the driver logs I didn't see any attached drives are added to the OS, 'fdisk -l' command also doesn't list
> the drives which are actually attached to the HBA.
>
> When I debug this issue then I see that in '_scsih_target_alloc'
> driver is searching for sas_device from the lists 'sas_device_init_list' & 'sas_device_list'
> based on the device sas address using the function mpt3sas_scsih_sas_device_find_by_sas_address(),
> since this device is not in the 'sas_device_init_list' (as it is moved it to head list) driver exit
> from this function without updating the required device addition information.
>
> To solve the original problem (i.e memory corruption), here I have attached the patch,
> in this patch I have added one atomic flag is_on_sas_device_init_list in _sas_device_structure
> and I followed below algorithm.
>
> 1. when ever a device is added to sas_device_init_list then driver will set this atomic flag of this device to one.
>
> 2. And during the addition of this device to SCSI mid layer,
> if the device is successfully added to the OS then driver will move this device list in to sas_device_list list from sas_device_init_list list and at this time driver will reset this flag to zero.
> if device is failed to register with SCSI mid layer then also driver will reset this flag to zero in function _scsih_sas_device_remove and will remove the device entry from sas_device_init_list and will free the device structure.
>
> 3. Now when a device is removed then driver will receive target not responding event and in the function _scsih_device_remove_by_handle,
> a. driver will check whether addition of discovered devices to SML process is currently running or not,
> i. if addition (or registration) of discovered devices to SML process is running then driver will check whether device is in sas_device_init_list or not (by reading the atomic flag)?.
> if it is in a sas_device_init_list then driver will ignore this device removal event (since device registration with SML will fail and it is removed in function _scsih_sas_device_remove as mentioned in step 2).
> ii. if the device is not in a sas_device_init_list or addition (or registration) of discovered devices to SML process is already completed then device structure is removed from this function and this device entry is removed from sas_device_list.
>
> 4. if the device removal event is received after device structure is freed due to failure of device registration with SML them in the function _scsih_device_remove_by_handle driver won't find this device in the sas_device_list or in a sas_device_init_list and so driver will ignore this device removal event.
>
> Signed-off-by: Sreekanth Reddy <[email protected]>
> ---
> drivers/scsi/mpt2sas/mpt2sas_base.h | 2 ++
> drivers/scsi/mpt2sas/mpt2sas_scsih.c | 45 +++++++++++++++++++++++++++---------
> drivers/scsi/mpt3sas/mpt3sas_base.h | 2 ++
> drivers/scsi/mpt3sas/mpt3sas_scsih.c | 43 ++++++++++++++++++++++++++--------
> 4 files changed, 71 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
> index caff8d1..1aa10d2 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> @@ -376,6 +376,7 @@ struct _sas_device {
> u8 phy;
> u8 responding;
> u8 pfa_led_on;
> + atomic_t is_on_sas_device_init_list;

Hi Sreekanth,
wherever is_on_sas_device_init_list is used, it is protected
by ioc->sas_device_lock, so why do you need an atomic_t?
There is one exception, but it is easily fixable.

> };
>
> /**
> @@ -833,6 +834,7 @@ struct MPT2SAS_ADAPTER {
> u8 broadcast_aen_busy;
> u16 broadcast_aen_pending;
> u8 shost_recovery;
> + u8 discovered_device_addition_on;
>
> struct mutex reset_in_progress_mutex;
> spinlock_t ioc_reset_in_progress_lock;
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> index 3f26147..2a61286 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> @@ -590,13 +590,20 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> struct _sas_device *sas_device)
> {
> unsigned long flags;
> + struct _sas_device *same_sas_device;
>
> if (!sas_device)
> return;
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + same_sas_device = _scsih_sas_device_find_by_handle(ioc,
> + sas_device->handle);

Is it possible for same_sas_device to be non-NULL and yet
not be the same device as sas_device?

> + if (same_sas_device) {
> + list_del(&same_sas_device->list);
> + if (atomic_read(&sas_device->is_on_sas_device_init_list))

Seems easier to just set the variable without a test.

> + atomic_set(&sas_device->is_on_sas_device_init_list, 0);
> + kfree(same_sas_device);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
>
> @@ -658,6 +664,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
> "(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
> sas_device->handle, (unsigned long long)sas_device->sas_address));
>
> + atomic_set(&sas_device->is_on_sas_device_init_list, 1);
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
> _scsih_determine_boot_device(ioc, sas_device, 0);
> @@ -5364,8 +5371,14 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> - if (sas_device)
> - list_del(&sas_device->list);
> + if (sas_device) {
> + if (ioc->discovered_device_addition_on &&
> + atomic_read(&sas_device->is_on_sas_device_init_list)) {
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> + return;
> + } else
> + list_del(&sas_device->list);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> if (sas_device)
> _scsih_remove_device(ioc, sas_device);
> @@ -5391,8 +5404,14 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> sas_address);
> - if (sas_device)
> - list_del(&sas_device->list);
> + if (sas_device) {
> + if (ioc->discovered_device_addition_on &&
> + atomic_read(&sas_device->is_on_sas_device_init_list)) {
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> + return;
> + } else
> + list_del(&sas_device->list);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> if (sas_device)
> _scsih_remove_device(ioc, sas_device);
> @@ -7978,32 +7997,36 @@ _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
> struct _sas_device *sas_device, *next;
> unsigned long flags;
>
> + ioc->discovered_device_addition_on = 1;
> /* SAS Device List */
> list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
> list) {
>
> if (ioc->hide_drives)
> continue;
> -
> +
> if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
> sas_device->sas_address_parent)) {
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + mpt2sas_transport_port_remove(ioc,
> + sas_device->sas_address,
> + sas_device->sas_address_parent);
> + _scsih_sas_device_remove(ioc, sas_device);
> continue;
> } else if (!sas_device->starget) {
> if (!ioc->is_driver_loading) {
> mpt2sas_transport_port_remove(ioc,
> sas_device->sas_address,
> sas_device->sas_address_parent);
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + _scsih_sas_device_remove(ioc, sas_device);
> continue;
> }
> }
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> list_move_tail(&sas_device->list, &ioc->sas_device_list);
> + atomic_dec(&sas_device->is_on_sas_device_init_list);

Why not 'atomic_set(&sas_device->is_on_sas_device_init_list, 0);' ?
There is no place where you set the value of is_on_sas_device_init_list
higher than '1'.
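
To illustrate the difference, here is a minimal user-space sketch. It uses
C11 <stdatomic.h> as a stand-in for the kernel's atomic_t helpers, and the
struct and function names are hypothetical, not taken from the driver. A
decrement is only correct if it runs exactly once per set, while storing
zero is safe to repeat:

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the per-device flag; not the driver's struct. */
struct demo_device {
	atomic_int on_init_list;	/* 1 while on the init list, else 0 */
};

/* Clear by decrementing: only correct if called exactly once per set. */
static int clear_by_dec(struct demo_device *dev)
{
	/* atomic_fetch_sub() returns the old value, so subtract 1 again. */
	return atomic_fetch_sub(&dev->on_init_list, 1) - 1;
}

/* Clear by storing zero: idempotent, a second call changes nothing. */
static int clear_by_set(struct demo_device *dev)
{
	atomic_store(&dev->on_init_list, 0);
	return atomic_load(&dev->on_init_list);
}
```

If two paths ever cleared the same flag, the decrement variant would leave
it at -1, which still tests as true.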

Cheers,
Tomas

> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
> + ioc->discovered_device_addition_on = 0;
> }
>
> /**
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
> index afa8816..6188490 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_base.h
> +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
> @@ -315,6 +315,7 @@ struct _sas_device {
> u8 responding;
> u8 fast_path;
> u8 pfa_led_on;
> + atomic_t is_on_sas_device_init_list;
> };
>
> /**
> @@ -766,6 +767,7 @@ struct MPT3SAS_ADAPTER {
> u8 broadcast_aen_busy;
> u16 broadcast_aen_pending;
> u8 shost_recovery;
> + u8 discovered_device_addition_on;
>
> struct mutex reset_in_progress_mutex;
> spinlock_t ioc_reset_in_progress_lock;
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index 5a97e32..53cc9ea 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -582,13 +582,20 @@ _scsih_sas_device_remove(struct MPT3SAS_ADAPTER *ioc,
> struct _sas_device *sas_device)
> {
> unsigned long flags;
> + struct _sas_device *same_sas_device;
>
> if (!sas_device)
> return;
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + same_sas_device = _scsih_sas_device_find_by_handle(ioc,
> + sas_device->handle);
> + if (same_sas_device) {
> + list_del(&same_sas_device->list);
> + if (atomic_read(&sas_device->is_on_sas_device_init_list))
> + atomic_set(&sas_device->is_on_sas_device_init_list, 0);
> + kfree(same_sas_device);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
>
> @@ -610,8 +616,14 @@ _scsih_device_remove_by_handle(struct MPT3SAS_ADAPTER *ioc, u16 handle)
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> - if (sas_device)
> - list_del(&sas_device->list);
> + if (sas_device) {
> + if (ioc->discovered_device_addition_on &&
> + atomic_read(&sas_device->is_on_sas_device_init_list)) {
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> + return;
> + } else
> + list_del(&sas_device->list);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> if (sas_device)
> _scsih_remove_device(ioc, sas_device);
> @@ -637,8 +649,14 @@ mpt3sas_device_remove_by_sas_address(struct MPT3SAS_ADAPTER *ioc,
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> sas_device = mpt3sas_scsih_sas_device_find_by_sas_address(ioc,
> sas_address);
> - if (sas_device)
> - list_del(&sas_device->list);
> + if (sas_device) {
> + if (ioc->discovered_device_addition_on &&
> + atomic_read(&sas_device->is_on_sas_device_init_list)) {
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> + return;
> + } else
> + list_del(&sas_device->list);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> if (sas_device)
> _scsih_remove_device(ioc, sas_device);
> @@ -663,6 +681,7 @@ _scsih_sas_device_add(struct MPT3SAS_ADAPTER *ioc,
> ioc->name, __func__, sas_device->handle,
> (unsigned long long)sas_device->sas_address));
>
> + atomic_set(&sas_device->is_on_sas_device_init_list, 1);
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> list_add_tail(&sas_device->list, &ioc->sas_device_list);
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> @@ -7610,14 +7629,17 @@ _scsih_probe_sas(struct MPT3SAS_ADAPTER *ioc)
> struct _sas_device *sas_device, *next;
> unsigned long flags;
>
> + ioc->discovered_device_addition_on = 1;
> /* SAS Device List */
> list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
> list) {
>
> if (!mpt3sas_transport_port_add(ioc, sas_device->handle,
> sas_device->sas_address_parent)) {
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + mpt3sas_transport_port_remove(ioc,
> + sas_device->sas_address,
> + sas_device->sas_address_parent);
> + _scsih_sas_device_remove(ioc, sas_device);
> continue;
> } else if (!sas_device->starget) {
> /*
> @@ -7630,16 +7652,17 @@ _scsih_probe_sas(struct MPT3SAS_ADAPTER *ioc)
> mpt3sas_transport_port_remove(ioc,
> sas_device->sas_address,
> sas_device->sas_address_parent);
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + _scsih_sas_device_remove(ioc, sas_device);
> continue;
> }
> }
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> list_move_tail(&sas_device->list, &ioc->sas_device_list);
> + atomic_dec(&sas_device->is_on_sas_device_init_list);
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
> + ioc->discovered_device_addition_on = 0;
> }
>
> /**

2015-05-06 18:49:46

by Calvin Owens

Subject: Re: [PATCH] mpt2sas: mpt3sas: Fix memory corruption during initialization

On Monday 05/04 at 20:35 +0530, Sreekanth Reddy wrote:
> I have applied this patch on the latest upstream mpt3sas driver, then
> I have compiled and loaded the driver. In the driver logs I didn't
> see any attached drives are added to the OS, 'fdisk -l' command also
> doesn't list the drives which are actually attached to the HBA.
>
> When I debug this issue then I see that in '_scsih_target_alloc'
> driver is searching for sas_device from the lists
> 'sas_device_init_list' & 'sas_device_list' based on the device sas
> address using the function
> mpt3sas_scsih_sas_device_find_by_sas_address(), since this device is
> not in the 'sas_device_init_list' (as it is moved it to head list)
> driver exit from this function without updating the required device
> addition information.

Yes, I misunderstood that the initialization depended on the devices
still being on the init_list.

What's interesting about this is that when I tested it, it still worked.
I think the MPT2SAS_PORT_ENABLE_COMPLETE fw_event might zero
ioc->start_scan and allow _scsih_scan_finished() to start probing devices
before all the devices are actually on the init_list. Whether or not it
works seems to be very repeatable per machine.

But in any case, my patch was wrong.

> To solve the original problem (i.e memory corruption), here I have
> attached the patch, in this patch I have added one atomic flag
> is_on_sas_device_init_list in _sas_device_structure and I followed
> below algorithm.

The problem is that this only solves a single case. There isn't anything
to enforce that this or a similar chain of events can't happen elsewhere
in the code.

I think the best general solution would be to add a refcount to these
objects. They sit on a list that can be concurrently accessed from
multiple threads, so I think a refcount is the best way to ensure that
objects aren't freed out from under other users.

I'm working on a patchset that does this. I'm starting by adding a
refcount to the sas_device object only, and refactoring the code in
mpt2sas_scsih.c to use it. I should be able to send up a first version
of that pretty soon to get some feedback.
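
As a rough illustration of the idea (not the patchset itself, which is not
shown here), a refcounted object could look like the user-space sketch
below. It uses C11 atomics, and every name in it is hypothetical. In the
kernel, struct kref provides this pattern.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct _sas_device. */
struct demo_sas_device {
	atomic_int refcount;
	unsigned short handle;
};

/* Live-object counter for the demo only; not thread safe. */
static int demo_live_objects;

/* Allocate with one reference, owned by the list the object goes on. */
static struct demo_sas_device *demo_alloc(unsigned short handle)
{
	struct demo_sas_device *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	atomic_init(&dev->refcount, 1);
	dev->handle = handle;
	demo_live_objects++;
	return dev;
}

/* Take an extra reference before using the object outside the lock. */
static void demo_get(struct demo_sas_device *dev)
{
	atomic_fetch_add(&dev->refcount, 1);
}

/* Drop a reference; the object is freed only when the last one goes. */
static void demo_put(struct demo_sas_device *dev)
{
	if (atomic_fetch_sub(&dev->refcount, 1) == 1) {
		demo_live_objects--;
		free(dev);
	}
}
```

With this scheme, a removal path that drops the list's reference cannot
free the object while a probing thread still holds its own reference.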

Thanks,
Calvin


> 1. when ever a device is added to sas_device_init_list then driver
> will set this atomic flag of this device to one.
>
> 2. And during the addition of this device to SCSI mid layer, if the
> device is successfully added to the OS then driver will move this
> device list in to sas_device_list list from sas_device_init_list list
> and at this time driver will reset this flag to zero. if device is
> failed to register with SCSI mid layer then also driver will reset
> this flag to zero in function _scsih_sas_device_remove and will remove
> the device entry from sas_device_init_list and will free the device
> structure.
>
> 3. Now when a device is removed then driver will receive target not responding event and in the function _scsih_device_remove_by_handle,
> a. driver will check whether addition of discovered devices to SML process is currently running or not,
> i. if addition (or registration) of discovered devices to SML process is running then driver will check whether device is in sas_device_init_list or not (by reading the atomic flag)?.
> if it is in a sas_device_init_list then driver will ignore this device removal event (since device registration with SML will fail and it is removed in function _scsih_sas_device_remove as mentioned in step 2).
> ii. if the device is not in a sas_device_init_list or addition (or registration) of discovered devices to SML process is already completed then device structure is removed from this function and this device entry is removed from sas_device_list.
>
> 4. if the device removal event is received after device structure is freed due to failure of device registration with SML them in the function _scsih_device_remove_by_handle driver won't find this device in the sas_device_list or in a sas_device_init_list and so driver will ignore this device removal event.
>
> Signed-off-by: Sreekanth Reddy <[email protected]>
> ---
> drivers/scsi/mpt2sas/mpt2sas_base.h | 2 ++
> drivers/scsi/mpt2sas/mpt2sas_scsih.c | 45 +++++++++++++++++++++++++++---------
> drivers/scsi/mpt3sas/mpt3sas_base.h | 2 ++
> drivers/scsi/mpt3sas/mpt3sas_scsih.c | 43 ++++++++++++++++++++++++++--------
> 4 files changed, 71 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
> index caff8d1..1aa10d2 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> @@ -376,6 +376,7 @@ struct _sas_device {
> u8 phy;
> u8 responding;
> u8 pfa_led_on;
> + atomic_t is_on_sas_device_init_list;
> };
>
> /**
> @@ -833,6 +834,7 @@ struct MPT2SAS_ADAPTER {
> u8 broadcast_aen_busy;
> u16 broadcast_aen_pending;
> u8 shost_recovery;
> + u8 discovered_device_addition_on;
>
> struct mutex reset_in_progress_mutex;
> spinlock_t ioc_reset_in_progress_lock;
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> index 3f26147..2a61286 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> @@ -590,13 +590,20 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> struct _sas_device *sas_device)
> {
> unsigned long flags;
> + struct _sas_device *same_sas_device;
>
> if (!sas_device)
> return;
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + same_sas_device = _scsih_sas_device_find_by_handle(ioc,
> + sas_device->handle);
> + if (same_sas_device) {
> + list_del(&same_sas_device->list);
> + if (atomic_read(&sas_device->is_on_sas_device_init_list))
> + atomic_set(&sas_device->is_on_sas_device_init_list, 0);
> + kfree(same_sas_device);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
>
> @@ -658,6 +664,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
> "(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
> sas_device->handle, (unsigned long long)sas_device->sas_address));
>
> + atomic_set(&sas_device->is_on_sas_device_init_list, 1);
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
> _scsih_determine_boot_device(ioc, sas_device, 0);
> @@ -5364,8 +5371,14 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> - if (sas_device)
> - list_del(&sas_device->list);
> + if (sas_device) {
> + if (ioc->discovered_device_addition_on &&
> + atomic_read(&sas_device->is_on_sas_device_init_list)) {
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> + return;
> + } else
> + list_del(&sas_device->list);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> if (sas_device)
> _scsih_remove_device(ioc, sas_device);
> @@ -5391,8 +5404,14 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> sas_address);
> - if (sas_device)
> - list_del(&sas_device->list);
> + if (sas_device) {
> + if (ioc->discovered_device_addition_on &&
> + atomic_read(&sas_device->is_on_sas_device_init_list)) {
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> + return;
> + } else
> + list_del(&sas_device->list);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> if (sas_device)
> _scsih_remove_device(ioc, sas_device);
> @@ -7978,32 +7997,36 @@ _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
> struct _sas_device *sas_device, *next;
> unsigned long flags;
>
> + ioc->discovered_device_addition_on = 1;
> /* SAS Device List */
> list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
> list) {
>
> if (ioc->hide_drives)
> continue;
> -
> +
> if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
> sas_device->sas_address_parent)) {
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + mpt2sas_transport_port_remove(ioc,
> + sas_device->sas_address,
> + sas_device->sas_address_parent);
> + _scsih_sas_device_remove(ioc, sas_device);
> continue;
> } else if (!sas_device->starget) {
> if (!ioc->is_driver_loading) {
> mpt2sas_transport_port_remove(ioc,
> sas_device->sas_address,
> sas_device->sas_address_parent);
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + _scsih_sas_device_remove(ioc, sas_device);
> continue;
> }
> }
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> list_move_tail(&sas_device->list, &ioc->sas_device_list);
> + atomic_dec(&sas_device->is_on_sas_device_init_list);
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
> + ioc->discovered_device_addition_on = 0;
> }
>
> /**
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
> index afa8816..6188490 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_base.h
> +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
> @@ -315,6 +315,7 @@ struct _sas_device {
> u8 responding;
> u8 fast_path;
> u8 pfa_led_on;
> + atomic_t is_on_sas_device_init_list;
> };
>
> /**
> @@ -766,6 +767,7 @@ struct MPT3SAS_ADAPTER {
> u8 broadcast_aen_busy;
> u16 broadcast_aen_pending;
> u8 shost_recovery;
> + u8 discovered_device_addition_on;
>
> struct mutex reset_in_progress_mutex;
> spinlock_t ioc_reset_in_progress_lock;
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> index 5a97e32..53cc9ea 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> @@ -582,13 +582,20 @@ _scsih_sas_device_remove(struct MPT3SAS_ADAPTER *ioc,
> struct _sas_device *sas_device)
> {
> unsigned long flags;
> + struct _sas_device *same_sas_device;
>
> if (!sas_device)
> return;
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + same_sas_device = _scsih_sas_device_find_by_handle(ioc,
> + sas_device->handle);
> + if (same_sas_device) {
> + list_del(&same_sas_device->list);
> + if (atomic_read(&sas_device->is_on_sas_device_init_list))
> + atomic_set(&sas_device->is_on_sas_device_init_list, 0);
> + kfree(same_sas_device);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
>
> @@ -610,8 +616,14 @@ _scsih_device_remove_by_handle(struct MPT3SAS_ADAPTER *ioc, u16 handle)
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> - if (sas_device)
> - list_del(&sas_device->list);
> + if (sas_device) {
> + if (ioc->discovered_device_addition_on &&
> + atomic_read(&sas_device->is_on_sas_device_init_list)) {
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> + return;
> + } else
> + list_del(&sas_device->list);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> if (sas_device)
> _scsih_remove_device(ioc, sas_device);
> @@ -637,8 +649,14 @@ mpt3sas_device_remove_by_sas_address(struct MPT3SAS_ADAPTER *ioc,
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> sas_device = mpt3sas_scsih_sas_device_find_by_sas_address(ioc,
> sas_address);
> - if (sas_device)
> - list_del(&sas_device->list);
> + if (sas_device) {
> + if (ioc->discovered_device_addition_on &&
> + atomic_read(&sas_device->is_on_sas_device_init_list)) {
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> + return;
> + } else
> + list_del(&sas_device->list);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> if (sas_device)
> _scsih_remove_device(ioc, sas_device);
> @@ -663,6 +681,7 @@ _scsih_sas_device_add(struct MPT3SAS_ADAPTER *ioc,
> ioc->name, __func__, sas_device->handle,
> (unsigned long long)sas_device->sas_address));
>
> + atomic_set(&sas_device->is_on_sas_device_init_list, 1);
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> list_add_tail(&sas_device->list, &ioc->sas_device_list);
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> @@ -7610,14 +7629,17 @@ _scsih_probe_sas(struct MPT3SAS_ADAPTER *ioc)
> struct _sas_device *sas_device, *next;
> unsigned long flags;
>
> + ioc->discovered_device_addition_on = 1;
> /* SAS Device List */
> list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
> list) {
>
> if (!mpt3sas_transport_port_add(ioc, sas_device->handle,
> sas_device->sas_address_parent)) {
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + mpt3sas_transport_port_remove(ioc,
> + sas_device->sas_address,
> + sas_device->sas_address_parent);
> + _scsih_sas_device_remove(ioc, sas_device);
> continue;
> } else if (!sas_device->starget) {
> /*
> @@ -7630,16 +7652,17 @@ _scsih_probe_sas(struct MPT3SAS_ADAPTER *ioc)
> mpt3sas_transport_port_remove(ioc,
> sas_device->sas_address,
> sas_device->sas_address_parent);
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + _scsih_sas_device_remove(ioc, sas_device);
> continue;
> }
> }
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> list_move_tail(&sas_device->list, &ioc->sas_device_list);
> + atomic_dec(&sas_device->is_on_sas_device_init_list);
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
> + ioc->discovered_device_addition_on = 0;
> }
>
> /**
> --
> 2.0.2
>

2015-05-12 09:39:05

by Sreekanth Reddy

Subject: Re: [PATCH] mpt2sas: mpt3sas: Fix memory corruption during initialization

Hi Tomas & Calvin,

Thanks for reviewing this patch.

There is a problem with this patch. Because the driver ignores the
device removal event (when both the ioc's
discovered_device_addition_on flag and the device's
is_on_sas_device_init_list flag are set), it does not free the
sas_device structure on the sas_device_init_list list.

As a result, whenever the same device is hot plugged, it is not
added to the SML.
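
The failure mode can be modelled with a small user-space sketch. All names
here are hypothetical, and the duplicate check in demo_add() is an
assumption about how an add path would treat an entry that is already
present; it is a simplified model, not the driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX 4

/* Hypothetical, simplified device entry. */
struct demo_entry {
	bool in_use;
	bool on_init_list;
	unsigned long long sas_address;
};

/* Hypothetical, simplified adapter state. */
struct demo_ioc {
	bool discovery_running;
	struct demo_entry table[DEMO_MAX];
};

static struct demo_entry *demo_find(struct demo_ioc *ioc,
				    unsigned long long addr)
{
	for (size_t i = 0; i < DEMO_MAX; i++)
		if (ioc->table[i].in_use && ioc->table[i].sas_address == addr)
			return &ioc->table[i];
	return NULL;
}

/* Add a device; refuses an address that is already tracked (assumed). */
static bool demo_add(struct demo_ioc *ioc, unsigned long long addr)
{
	if (demo_find(ioc, addr))
		return false;
	for (size_t i = 0; i < DEMO_MAX; i++) {
		if (!ioc->table[i].in_use) {
			ioc->table[i] = (struct demo_entry){ true, true, addr };
			return true;
		}
	}
	return false;
}

/* Removal event, mirroring the check added by the quoted patch. */
static bool demo_remove(struct demo_ioc *ioc, unsigned long long addr)
{
	struct demo_entry *e = demo_find(ioc, addr);

	if (!e)
		return false;
	if (ioc->discovery_running && e->on_init_list)
		return false;	/* event ignored, entry stays behind */
	e->in_use = false;
	return true;
}
```

In this model, a removal that arrives while discovery is running leaves the
entry in place, so a later add for the same SAS address is rejected.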

I have one more patch to fix the original issue and I will post it today.

Regards,
Sreekanth

On Tue, May 5, 2015 at 9:05 PM, Tomas Henzl <[email protected]> wrote:
>
> On 05/04/2015 05:05 PM, Sreekanth Reddy wrote:
> > I have applied this patch on the latest upstream mpt3sas driver, then I have compiled and loaded the driver.
> > In the driver logs I didn't see any attached drives are added to the OS, 'fdisk -l' command also doesn't list
> > the drives which are actually attached to the HBA.
> >
> > When I debug this issue then I see that in '_scsih_target_alloc'
> > driver is searching for sas_device from the lists 'sas_device_init_list' & 'sas_device_list'
> > based on the device sas address using the function mpt3sas_scsih_sas_device_find_by_sas_address(),
> > since this device is not in the 'sas_device_init_list' (as it is moved it to head list) driver exit
> > from this function without updating the required device addition information.
> >
> > To solve the original problem (i.e memory corruption), here I have attached the patch,
> > in this patch I have added one atomic flag is_on_sas_device_init_list in _sas_device_structure
> > and I followed below algorithm.
> >
> > 1. when ever a device is added to sas_device_init_list then driver will set this atomic flag of this device to one.
> >
> > 2. And during the addition of this device to SCSI mid layer,
> > if the device is successfully added to the OS then driver will move this device list in to sas_device_list list from sas_device_init_list list and at this time driver will reset this flag to zero.
> > if device is failed to register with SCSI mid layer then also driver will reset this flag to zero in function _scsih_sas_device_remove and will remove the device entry from sas_device_init_list and will free the device structure.
> >
> > 3. Now when a device is removed then driver will receive target not responding event and in the function _scsih_device_remove_by_handle,
> > a. driver will check whether addition of discovered devices to SML process is currently running or not,
> > i. if addition (or registration) of discovered devices to SML process is running then driver will check whether device is in sas_device_init_list or not (by reading the atomic flag)?.
> > if it is in a sas_device_init_list then driver will ignore this device removal event (since device registration with SML will fail and it is removed in function _scsih_sas_device_remove as mentioned in step 2).
> > ii. if the device is not in a sas_device_init_list or addition (or registration) of discovered devices to SML process is already completed then device structure is removed from this function and this device entry is removed from sas_device_list.
> >
> > 4. if the device removal event is received after device structure is freed due to failure of device registration with SML them in the function _scsih_device_remove_by_handle driver won't find this device in the sas_device_list or in a sas_device_init_list and so driver will ignore this device removal event.
> >
> > Signed-off-by: Sreekanth Reddy <[email protected]>
> > ---
> > drivers/scsi/mpt2sas/mpt2sas_base.h | 2 ++
> > drivers/scsi/mpt2sas/mpt2sas_scsih.c | 45 +++++++++++++++++++++++++++---------
> > drivers/scsi/mpt3sas/mpt3sas_base.h | 2 ++
> > drivers/scsi/mpt3sas/mpt3sas_scsih.c | 43 ++++++++++++++++++++++++++--------
> > 4 files changed, 71 insertions(+), 21 deletions(-)
> >
> > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
> > index caff8d1..1aa10d2 100644
> > --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> > @@ -376,6 +376,7 @@ struct _sas_device {
> > u8 phy;
> > u8 responding;
> > u8 pfa_led_on;
> > + atomic_t is_on_sas_device_init_list;
>
> Hi Sreekanth,
> when is_on_sas_device_init_list is used it's protected
> by ioc->sas_device_lock - why do you need a atomic_t ?
> There is one exception, but easily fixable.
>
> > };
> >
> > /**
> > @@ -833,6 +834,7 @@ struct MPT2SAS_ADAPTER {
> > u8 broadcast_aen_busy;
> > u16 broadcast_aen_pending;
> > u8 shost_recovery;
> > + u8 discovered_device_addition_on;
> >
> > struct mutex reset_in_progress_mutex;
> > spinlock_t ioc_reset_in_progress_lock;
> > diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > index 3f26147..2a61286 100644
> > --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > @@ -590,13 +590,20 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> > struct _sas_device *sas_device)
> > {
> > unsigned long flags;
> > + struct _sas_device *same_sas_device;
> >
> > if (!sas_device)
> > return;
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - list_del(&sas_device->list);
> > - kfree(sas_device);
> > + same_sas_device = _scsih_sas_device_find_by_handle(ioc,
> > + sas_device->handle);
>
> Is it possible that when same_sas_device is not null, that the
> value is not the same as for the sas_device ?
>
> > + if (same_sas_device) {
> > + list_del(&same_sas_device->list);
> > + if (atomic_read(&sas_device->is_on_sas_device_init_list))
>
> Seems easier to just set the variable without a test.
>
> > + atomic_set(&sas_device->is_on_sas_device_init_list, 0);
> > + kfree(same_sas_device);
> > + }
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > }
> >
> > @@ -658,6 +664,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
> > "(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
> > sas_device->handle, (unsigned long long)sas_device->sas_address));
> >
> > + atomic_set(&sas_device->is_on_sas_device_init_list, 1);
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
> > _scsih_determine_boot_device(ioc, sas_device, 0);
> > @@ -5364,8 +5371,14 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > - if (sas_device)
> > - list_del(&sas_device->list);
> > + if (sas_device) {
> > + if (ioc->discovered_device_addition_on &&
> > + atomic_read(&sas_device->is_on_sas_device_init_list)) {
> > + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > + return;
> > + } else
> > + list_del(&sas_device->list);
> > + }
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > if (sas_device)
> > _scsih_remove_device(ioc, sas_device);
> > @@ -5391,8 +5404,14 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > sas_address);
> > - if (sas_device)
> > - list_del(&sas_device->list);
> > + if (sas_device) {
> > + if (ioc->discovered_device_addition_on &&
> > + atomic_read(&sas_device->is_on_sas_device_init_list)) {
> > + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > + return;
> > + } else
> > + list_del(&sas_device->list);
> > + }
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > if (sas_device)
> > _scsih_remove_device(ioc, sas_device);
> > @@ -7978,32 +7997,36 @@ _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
> > struct _sas_device *sas_device, *next;
> > unsigned long flags;
> >
> > + ioc->discovered_device_addition_on = 1;
> > /* SAS Device List */
> > list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
> > list) {
> >
> > if (ioc->hide_drives)
> > continue;
> > -
> > +
> > if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
> > sas_device->sas_address_parent)) {
> > - list_del(&sas_device->list);
> > - kfree(sas_device);
> > + mpt2sas_transport_port_remove(ioc,
> > + sas_device->sas_address,
> > + sas_device->sas_address_parent);
> > + _scsih_sas_device_remove(ioc, sas_device);
> > continue;
> > } else if (!sas_device->starget) {
> > if (!ioc->is_driver_loading) {
> > mpt2sas_transport_port_remove(ioc,
> > sas_device->sas_address,
> > sas_device->sas_address_parent);
> > - list_del(&sas_device->list);
> > - kfree(sas_device);
> > + _scsih_sas_device_remove(ioc, sas_device);
> > continue;
> > }
> > }
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > list_move_tail(&sas_device->list, &ioc->sas_device_list);
> > + atomic_dec(&sas_device->is_on_sas_device_init_list);
>
> Why not 'atomic_set(&sas_device->is_on_sas_device_init_list, 0);' ?
> There is no place where you set the value of is_on_sas_device_init_list
> higher than '1'.
>
> Cheers,
> Tomas
>
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > }
> > + ioc->discovered_device_addition_on = 0;
> > }
> >
> > /**
> > diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
> > index afa8816..6188490 100644
> > --- a/drivers/scsi/mpt3sas/mpt3sas_base.h
> > +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
> > @@ -315,6 +315,7 @@ struct _sas_device {
> > u8 responding;
> > u8 fast_path;
> > u8 pfa_led_on;
> > + atomic_t is_on_sas_device_init_list;
> > };
> >
> > /**
> > @@ -766,6 +767,7 @@ struct MPT3SAS_ADAPTER {
> > u8 broadcast_aen_busy;
> > u16 broadcast_aen_pending;
> > u8 shost_recovery;
> > + u8 discovered_device_addition_on;
> >
> > struct mutex reset_in_progress_mutex;
> > spinlock_t ioc_reset_in_progress_lock;
> > diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > index 5a97e32..53cc9ea 100644
> > --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
> > @@ -582,13 +582,20 @@ _scsih_sas_device_remove(struct MPT3SAS_ADAPTER *ioc,
> > struct _sas_device *sas_device)
> > {
> > unsigned long flags;
> > + struct _sas_device *same_sas_device;
> >
> > if (!sas_device)
> > return;
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - list_del(&sas_device->list);
> > - kfree(sas_device);
> > + same_sas_device = _scsih_sas_device_find_by_handle(ioc,
> > + sas_device->handle);
> > + if (same_sas_device) {
> > + list_del(&same_sas_device->list);
> > + if (atomic_read(&sas_device->is_on_sas_device_init_list))
> > + atomic_set(&sas_device->is_on_sas_device_init_list, 0);
> > + kfree(same_sas_device);
> > + }
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > }
> >
> > @@ -610,8 +616,14 @@ _scsih_device_remove_by_handle(struct MPT3SAS_ADAPTER *ioc, u16 handle)
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > - if (sas_device)
> > - list_del(&sas_device->list);
> > + if (sas_device) {
> > + if (ioc->discovered_device_addition_on &&
> > + atomic_read(&sas_device->is_on_sas_device_init_list)) {
> > + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > + return;
> > + } else
> > + list_del(&sas_device->list);
> > + }
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > if (sas_device)
> > _scsih_remove_device(ioc, sas_device);
> > @@ -637,8 +649,14 @@ mpt3sas_device_remove_by_sas_address(struct MPT3SAS_ADAPTER *ioc,
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > sas_device = mpt3sas_scsih_sas_device_find_by_sas_address(ioc,
> > sas_address);
> > - if (sas_device)
> > - list_del(&sas_device->list);
> > + if (sas_device) {
> > + if (ioc->discovered_device_addition_on &&
> > + atomic_read(&sas_device->is_on_sas_device_init_list)) {
> > + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > + return;
> > + } else
> > + list_del(&sas_device->list);
> > + }
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > if (sas_device)
> > _scsih_remove_device(ioc, sas_device);
> > @@ -663,6 +681,7 @@ _scsih_sas_device_add(struct MPT3SAS_ADAPTER *ioc,
> > ioc->name, __func__, sas_device->handle,
> > (unsigned long long)sas_device->sas_address));
> >
> > + atomic_set(&sas_device->is_on_sas_device_init_list, 1);
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > list_add_tail(&sas_device->list, &ioc->sas_device_list);
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > @@ -7610,14 +7629,17 @@ _scsih_probe_sas(struct MPT3SAS_ADAPTER *ioc)
> > struct _sas_device *sas_device, *next;
> > unsigned long flags;
> >
> > + ioc->discovered_device_addition_on = 1;
> > /* SAS Device List */
> > list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
> > list) {
> >
> > if (!mpt3sas_transport_port_add(ioc, sas_device->handle,
> > sas_device->sas_address_parent)) {
> > - list_del(&sas_device->list);
> > - kfree(sas_device);
> > + mpt3sas_transport_port_remove(ioc,
> > + sas_device->sas_address,
> > + sas_device->sas_address_parent);
> > + _scsih_sas_device_remove(ioc, sas_device);
> > continue;
> > } else if (!sas_device->starget) {
> > /*
> > @@ -7630,16 +7652,17 @@ _scsih_probe_sas(struct MPT3SAS_ADAPTER *ioc)
> > mpt3sas_transport_port_remove(ioc,
> > sas_device->sas_address,
> > sas_device->sas_address_parent);
> > - list_del(&sas_device->list);
> > - kfree(sas_device);
> > + _scsih_sas_device_remove(ioc, sas_device);
> > continue;
> > }
> > }
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > list_move_tail(&sas_device->list, &ioc->sas_device_list);
> > + atomic_dec(&sas_device->is_on_sas_device_init_list);
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > }
> > + ioc->discovered_device_addition_on = 0;
> > }
> >
> > /**
>

2015-05-15 03:42:45

by Calvin Owens

Subject: [PATCH 0/6] Fixes for memory corruption in mpt2sas

Hello all,

This patchset attempts to address problems we've been having with
panics due to memory corruption from the mpt2sas driver.

I will provide a similar set of fixes for mpt3sas, since we see
similar issues there as well. "Porting" this to mpt3sas will be
trivial since the part of the driver I'm touching is nearly identical
between the two, so I thought it would be simpler to review a patch
against mpt2sas alone at first.

I've tested this for a few days on a big storage box that seemed to be
very susceptible to the panics, and so far it seems to have eliminated
them.

Thanks,
Calvin


Total diffstat:

drivers/scsi/mpt2sas/mpt2sas_base.h | 20 +-
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 482 +++++++++++++++++++++----------
drivers/scsi/mpt2sas/mpt2sas_transport.c | 12 +-
3 files changed, 359 insertions(+), 155 deletions(-)

Patches:

* [PATCH 1/6] Add refcount to sas_device struct
* [PATCH 2/6] Refactor code to use new sas_device refcount
* [PATCH 3/6] Fix unsafe sas_device_list usage
* [PATCH 4/6] Add refcount to fw_event_work struct
* [PATCH 5/6] Refactor code to use new fw_event refcount
* [PATCH 6/6] Fix unsafe fw_event_list usage

2015-05-15 03:42:49

by Calvin Owens

Subject: [PATCH 1/6] Add refcount to sas_device struct

These objects can be referenced concurrently throughout the driver, so
we need a way to make sure threads can't delete them out from under
each other.

Signed-off-by: Calvin Owens <[email protected]>
---
drivers/scsi/mpt2sas/mpt2sas_base.h | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
index caff8d1..2e7dc33 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -376,8 +376,24 @@ struct _sas_device {
u8 phy;
u8 responding;
u8 pfa_led_on;
+ struct kref refcount;
};

+static inline void sas_device_get(struct _sas_device *s)
+{
+ kref_get(&s->refcount);
+}
+
+static inline void sas_device_free(struct kref *r)
+{
+ kfree(container_of(r, struct _sas_device, refcount));
+}
+
+static inline void sas_device_put(struct _sas_device *s)
+{
+ kref_put(&s->refcount, sas_device_free);
+}
+
/**
* struct _raid_device - raid volume link list
* @list: sas device list
--
1.8.1
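
As a rough illustration of the get/put discipline that the helpers in this patch wrap, here is a minimal userspace sketch. It does not use the kernel's kref API: `struct ref`, `struct dev`, `dev_get()`, `dev_put()`, `dev_alloc()` and `refcount_demo()` are hypothetical names made up for the example, and C11 atomics stand in for kref. The point is only that the object is freed by whichever holder drops the last reference, never earlier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct kref; not the kernel type. */
struct ref {
	atomic_int count;
};

/* Hypothetical stand-in for struct _sas_device. */
struct dev {
	struct ref refcount;
	int *freed;	/* set to 1 by the release path, for demonstration only */
};

static void ref_init(struct ref *r)
{
	atomic_init(&r->count, 1);	/* the creator owns the first reference */
}

static void dev_get(struct dev *d)
{
	atomic_fetch_add(&d->refcount.count, 1);
}

/* Drop one reference. Returns 1 if this call freed the object, else 0. */
static int dev_put(struct dev *d)
{
	if (atomic_fetch_sub(&d->refcount.count, 1) == 1) {
		*d->freed = 1;
		free(d);
		return 1;
	}
	return 0;
}

static struct dev *dev_alloc(int *freed)
{
	struct dev *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	ref_init(&d->refcount);
	d->freed = freed;
	*freed = 0;
	return d;
}

/*
 * Two holders share one object. Returns 1 if the object survived the
 * first put and was freed by the second, -1 on allocation failure.
 */
int refcount_demo(void)
{
	int freed;
	struct dev *d = dev_alloc(&freed);
	int early, still_alive, last;

	if (!d)
		return -1;

	dev_get(d);			/* a second holder appears */
	early = dev_put(d);		/* first holder is done */
	still_alive = !freed;		/* object must still be valid here */
	last = dev_put(d);		/* second holder is done: frees */

	return !early && still_alive && last && freed;
}
```

Calling `refcount_demo()` from a test harness exercises both the non-final and the final put.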

2015-05-15 03:43:08

by Calvin Owens

Subject: [PATCH 2/6] Refactor code to use new sas_device refcount

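This patch refactors the driver to use the new reference count on the
sas_device struct. The lookup helpers are renamed from *_find_* to
*_get_* and now return a referenced object, which callers release with
sas_device_put().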

Signed-off-by: Calvin Owens <[email protected]>
---
drivers/scsi/mpt2sas/mpt2sas_base.h | 4 +-
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 329 ++++++++++++++++++++-----------
drivers/scsi/mpt2sas/mpt2sas_transport.c | 12 +-
3 files changed, 220 insertions(+), 125 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
index 2e7dc33..dac0e8a 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -1111,7 +1111,9 @@ struct _sas_node *mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
u16 handle);
struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct MPT2SAS_ADAPTER
*ioc, u64 sas_address);
-struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
+struct _sas_device *mpt2sas_scsih_sas_device_get_by_sas_address(
+ struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
+struct _sas_device *mpt2sas_scsih_sas_device_get_by_sas_address_nolock(
struct MPT2SAS_ADAPTER *ioc, u64 sas_address);

void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 3f26147..ad6ceb7e 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -526,8 +526,31 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
}
}

+struct _sas_device *
+mpt2sas_scsih_sas_device_get_by_sas_address_nolock(struct MPT2SAS_ADAPTER *ioc,
+ u64 sas_address)
+{
+ struct _sas_device *sas_device;
+
+ BUG_ON(!spin_is_locked(&ioc->sas_device_lock));
+
+ list_for_each_entry(sas_device, &ioc->sas_device_list, list)
+ if (sas_device->sas_address == sas_address)
+ goto found_device;
+
+ list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
+ if (sas_device->sas_address == sas_address)
+ goto found_device;
+
+ return NULL;
+
+found_device:
+ sas_device_get(sas_device);
+ return sas_device;
+}
+
/**
- * mpt2sas_scsih_sas_device_find_by_sas_address - sas device search
+ * mpt2sas_scsih_sas_device_get_by_sas_address - sas device search
* @ioc: per adapter object
* @sas_address: sas address
* Context: Calling function should acquire ioc->sas_device_lock
@@ -536,24 +559,44 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
* object.
*/
struct _sas_device *
-mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
+mpt2sas_scsih_sas_device_get_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
u64 sas_address)
{
struct _sas_device *sas_device;
+ unsigned long flags;
+
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
+ sas_address);
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+ return sas_device;
+}
+
+static struct _sas_device *
+_scsih_sas_device_get_by_handle_nolock(struct MPT2SAS_ADAPTER *ioc, u16 handle)
+{
+ struct _sas_device *sas_device;
+
+ BUG_ON(!spin_is_locked(&ioc->sas_device_lock));

list_for_each_entry(sas_device, &ioc->sas_device_list, list)
- if (sas_device->sas_address == sas_address)
- return sas_device;
+ if (sas_device->handle == handle)
+ goto found_device;

list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
- if (sas_device->sas_address == sas_address)
- return sas_device;
+ if (sas_device->handle == handle)
+ goto found_device;

return NULL;
+
+found_device:
+ sas_device_get(sas_device);
+ return sas_device;
}

/**
- * _scsih_sas_device_find_by_handle - sas device search
+ * _scsih_sas_device_get_by_handle - sas device search
* @ioc: per adapter object
* @handle: sas device handle (assigned by firmware)
* Context: Calling function should acquire ioc->sas_device_lock
@@ -562,19 +605,16 @@ mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
* object.
*/
static struct _sas_device *
-_scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
+_scsih_sas_device_get_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
{
struct _sas_device *sas_device;
+ unsigned long flags;

- list_for_each_entry(sas_device, &ioc->sas_device_list, list)
- if (sas_device->handle == handle)
- return sas_device;
-
- list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
- if (sas_device->handle == handle)
- return sas_device;
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

- return NULL;
+ return sas_device;
}

/**
@@ -583,7 +623,7 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
* @sas_device: the sas_device object
* Context: This function will acquire ioc->sas_device_lock.
*
- * Removing object and freeing associated memory from the ioc->sas_device_list.
+ * If sas_device is on the list, remove it and decrement its reference count.
*/
static void
_scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
@@ -594,9 +634,15 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
if (!sas_device)
return;

+ /*
+ * The lock serializes access to the list, but we still need to verify
+ * that nobody removed the entry while we were waiting on the lock.
+ */
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- list_del(&sas_device->list);
- kfree(sas_device);
+ if (!list_empty(&sas_device->list)) {
+ list_del_init(&sas_device->list);
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}

@@ -620,6 +666,7 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
sas_device->handle, (unsigned long long)sas_device->sas_address));

spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ sas_device_get(sas_device);
list_add_tail(&sas_device->list, &ioc->sas_device_list);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

@@ -659,6 +706,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
sas_device->handle, (unsigned long long)sas_device->sas_address));

spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ sas_device_get(sas_device);
list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
_scsih_determine_boot_device(ioc, sas_device, 0);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -1208,12 +1256,15 @@ _scsih_change_queue_depth(struct scsi_device *sdev, int qdepth)
goto not_sata;
if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))
goto not_sata;
+
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
sas_device_priv_data->sas_target->sas_address);
- if (sas_device && sas_device->device_info &
- MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
+ if (sas_device && sas_device->device_info
+ & MPI2_SAS_DEVICE_INFO_SATA_DEVICE) {
max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

not_sata:
@@ -1271,7 +1322,7 @@ _scsih_target_alloc(struct scsi_target *starget)
/* sas/sata devices */
spin_lock_irqsave(&ioc->sas_device_lock, flags);
rphy = dev_to_rphy(starget->dev.parent);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
rphy->identify.sas_address);

if (sas_device) {
@@ -1283,6 +1334,8 @@ _scsih_target_alloc(struct scsi_target *starget)
if (test_bit(sas_device->handle, ioc->pd_handles))
sas_target_priv_data->flags |=
MPT_TARGET_FLAGS_RAID_COMPONENT;
+
+ sas_device_put(sas_device);
}
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

@@ -1324,13 +1377,15 @@ _scsih_target_destroy(struct scsi_target *starget)

spin_lock_irqsave(&ioc->sas_device_lock, flags);
rphy = dev_to_rphy(starget->dev.parent);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
rphy->identify.sas_address);
if (sas_device && (sas_device->starget == starget) &&
(sas_device->id == starget->id) &&
(sas_device->channel == starget->channel))
sas_device->starget = NULL;

+ if (sas_device)
+ sas_device_put(sas_device);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

out:
@@ -1386,7 +1441,7 @@ _scsih_slave_alloc(struct scsi_device *sdev)

if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
sas_target_priv_data->sas_address);
if (sas_device && (sas_device->starget == NULL)) {
sdev_printk(KERN_INFO, sdev,
@@ -1394,6 +1449,10 @@ _scsih_slave_alloc(struct scsi_device *sdev)
__func__, __LINE__);
sas_device->starget = starget;
}
+
+ if (sas_device)
+ sas_device_put(sas_device);
+
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}

@@ -1428,10 +1487,13 @@ _scsih_slave_destroy(struct scsi_device *sdev)

if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
sas_target_priv_data->sas_address);
if (sas_device && !sas_target_priv_data->num_luns)
sas_device->starget = NULL;
+
+ if (sas_device)
+ sas_device_put(sas_device);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}

@@ -2078,7 +2140,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
}

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
sas_device_priv_data->sas_target->sas_address);
if (!sas_device) {
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -2116,13 +2178,14 @@ _scsih_slave_configure(struct scsi_device *sdev)
if (!ssp_target)
_scsih_display_sata_capabilities(ioc, handle, sdev);

-
_scsih_change_queue_depth(sdev, qdepth);

if (ssp_target) {
sas_read_port_mode_page(sdev);
_scsih_enable_tlr(ioc, sdev);
}
+
+ sas_device_put(sas_device);
return 0;
}

@@ -2509,7 +2572,7 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
device_str, (unsigned long long)priv_target->sas_address);
} else {
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
priv_target->sas_address);
if (sas_device) {
if (priv_target->flags &
@@ -2529,6 +2592,8 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
"enclosure_logical_id(0x%016llx), slot(%d)\n",
(unsigned long long)sas_device->enclosure_logical_id,
sas_device->slot);
+
+ sas_device_put(sas_device);
}
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}
@@ -2604,8 +2669,7 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
{
struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
struct MPT2SAS_DEVICE *sas_device_priv_data;
- struct _sas_device *sas_device;
- unsigned long flags;
+ struct _sas_device *sas_device = NULL;
u16 handle;
int r;

@@ -2629,12 +2693,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
handle = 0;
if (sas_device_priv_data->sas_target->flags &
MPT_TARGET_FLAGS_RAID_COMPONENT) {
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc,
+ sas_device = _scsih_sas_device_get_by_handle(ioc,
sas_device_priv_data->sas_target->handle);
if (sas_device)
handle = sas_device->volume_handle;
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
} else
handle = sas_device_priv_data->sas_target->handle;

@@ -2651,6 +2713,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
out:
sdev_printk(KERN_INFO, scmd->device, "device reset: %s scmd(%p)\n",
((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
+
+ if (sas_device)
+ sas_device_put(sas_device);
+
return r;
}

@@ -2665,8 +2731,7 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
{
struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
struct MPT2SAS_DEVICE *sas_device_priv_data;
- struct _sas_device *sas_device;
- unsigned long flags;
+ struct _sas_device *sas_device = NULL;
u16 handle;
int r;
struct scsi_target *starget = scmd->device->sdev_target;
@@ -2689,12 +2754,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
handle = 0;
if (sas_device_priv_data->sas_target->flags &
MPT_TARGET_FLAGS_RAID_COMPONENT) {
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc,
+ sas_device = _scsih_sas_device_get_by_handle(ioc,
sas_device_priv_data->sas_target->handle);
if (sas_device)
handle = sas_device->volume_handle;
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
} else
handle = sas_device_priv_data->sas_target->handle;

@@ -2711,6 +2774,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
out:
starget_printk(KERN_INFO, starget, "target reset: %s scmd(%p)\n",
((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
+
+ if (sas_device)
+ sas_device_put(sas_device);
+
return r;
}

@@ -3002,15 +3069,15 @@ _scsih_block_io_to_children_attached_to_ex(struct MPT2SAS_ADAPTER *ioc,

list_for_each_entry(mpt2sas_port,
&sas_expander->sas_port_list, port_list) {
- if (mpt2sas_port->remote_identify.device_type ==
- SAS_END_DEVICE) {
+ if (mpt2sas_port->remote_identify.device_type == SAS_END_DEVICE) {
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device =
- mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- mpt2sas_port->remote_identify.sas_address);
- if (sas_device)
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
+ mpt2sas_port->remote_identify.sas_address);
+ if (sas_device) {
set_bit(sas_device->handle,
- ioc->blocking_handles);
+ ioc->blocking_handles);
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}
}
@@ -3080,7 +3147,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
{
Mpi2SCSITaskManagementRequest_t *mpi_request;
u16 smid;
- struct _sas_device *sas_device;
+ struct _sas_device *sas_device = NULL;
struct MPT2SAS_TARGET *sas_target_priv_data = NULL;
u64 sas_address = 0;
unsigned long flags;
@@ -3110,7 +3177,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
return;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
if (sas_device && sas_device->starget &&
sas_device->starget->hostdata) {
sas_target_priv_data = sas_device->starget->hostdata;
@@ -3131,14 +3198,14 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
if (!smid) {
delayed_tr = kzalloc(sizeof(*delayed_tr), GFP_ATOMIC);
if (!delayed_tr)
- return;
+ goto out;
INIT_LIST_HEAD(&delayed_tr->list);
delayed_tr->handle = handle;
list_add_tail(&delayed_tr->list, &ioc->delayed_tr_list);
dewtprintk(ioc, printk(MPT2SAS_INFO_FMT
"DELAYED:tr:handle(0x%04x), (open)\n",
ioc->name, handle));
- return;
+ goto out;
}

dewtprintk(ioc, printk(MPT2SAS_INFO_FMT "tr_send:handle(0x%04x), "
@@ -3150,6 +3217,9 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
mpi_request->DevHandle = cpu_to_le16(handle);
mpi_request->TaskType = MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET;
mpt2sas_base_put_smid_hi_priority(ioc, smid);
+out:
+ if (sas_device)
+ sas_device_put(sas_device);
}


@@ -4068,7 +4138,6 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
char *desc_scsi_state = ioc->tmp_string;
u32 log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
struct _sas_device *sas_device = NULL;
- unsigned long flags;
struct scsi_target *starget = scmd->device->sdev_target;
struct MPT2SAS_TARGET *priv_target = starget->hostdata;
char *device_str = NULL;
@@ -4200,8 +4269,7 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
printk(MPT2SAS_WARN_FMT "\t%s wwid(0x%016llx)\n", ioc->name,
device_str, (unsigned long long)priv_target->sas_address);
} else {
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address(ioc,
priv_target->sas_address);
if (sas_device) {
printk(MPT2SAS_WARN_FMT "\tsas_address(0x%016llx), "
@@ -4211,8 +4279,9 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
"\tenclosure_logical_id(0x%016llx), slot(%d)\n",
ioc->name, sas_device->enclosure_logical_id,
sas_device->slot);
+
+ sas_device_put(sas_device);
}
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}

printk(MPT2SAS_WARN_FMT "\thandle(0x%04x), ioc_status(%s)(0x%04x), "
@@ -4259,7 +4328,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
Mpi2SepRequest_t mpi_request;
struct _sas_device *sas_device;

- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = _scsih_sas_device_get_by_handle(ioc, handle);
if (!sas_device)
return;

@@ -4274,7 +4343,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
&mpi_request)) != 0) {
printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n", ioc->name,
__FILE__, __LINE__, __func__);
- return;
+ goto out;
}
sas_device->pfa_led_on = 1;

@@ -4284,8 +4353,10 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
"enclosure_processor: ioc_status (0x%04x), loginfo(0x%08x)\n",
ioc->name, le16_to_cpu(mpi_reply.IOCStatus),
le32_to_cpu(mpi_reply.IOCLogInfo)));
- return;
+ goto out;
}
+out:
+ sas_device_put(sas_device);
}

/**
@@ -4370,19 +4441,17 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)

/* only handle non-raid devices */
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
if (!sas_device) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
+ goto out_unlock;
}
starget = sas_device->starget;
sas_target_priv_data = starget->hostdata;

if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_RAID_COMPONENT) ||
- ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
- }
+ ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)))
+ goto out_unlock;
+
starget_printk(KERN_WARNING, starget, "predicted fault\n");
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

@@ -4396,7 +4465,7 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
if (!event_reply) {
printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
ioc->name, __FILE__, __LINE__, __func__);
- return;
+ goto out;
}

event_reply->Function = MPI2_FUNCTION_EVENT_NOTIFICATION;
@@ -4413,6 +4482,14 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
event_data->SASAddress = cpu_to_le64(sas_target_priv_data->sas_address);
mpt2sas_ctl_add_to_event_log(ioc, event_reply);
kfree(event_reply);
+out:
+ if (sas_device)
+ sas_device_put(sas_device);
+ return;
+
+out_unlock:
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+ goto out;
}

/**
@@ -5148,14 +5225,13 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)

spin_lock_irqsave(&ioc->sas_device_lock, flags);
sas_address = le64_to_cpu(sas_device_pg0.SASAddress);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
sas_address);

if (!sas_device) {
printk(MPT2SAS_ERR_FMT "device is not present "
"handle(0x%04x), no sas_device!!!\n", ioc->name, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
+ goto out_unlock;
}

if (unlikely(sas_device->handle != handle)) {
@@ -5172,19 +5248,22 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
MPI2_SAS_DEVICE0_FLAGS_DEVICE_PRESENT)) {
printk(MPT2SAS_ERR_FMT "device is not present "
"handle(0x%04x), flags!!!\n", ioc->name, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
+ goto out_unlock;
}

/* check if there were any issues with discovery */
if (_scsih_check_access_status(ioc, sas_address, handle,
- sas_device_pg0.AccessStatus)) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
- }
+ sas_device_pg0.AccessStatus))
+ goto out_unlock;
+
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
_scsih_ublock_io_device(ioc, sas_address);
+ return;

+out_unlock:
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+ if (sas_device)
+ sas_device_put(sas_device);
}

/**
@@ -5208,7 +5287,6 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
u32 ioc_status;
__le64 sas_address;
u32 device_info;
- unsigned long flags;

if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
@@ -5250,14 +5328,13 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
return -1;
}

-
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address(ioc,
sas_address);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

- if (sas_device)
+ if (sas_device) {
+ sas_device_put(sas_device);
return 0;
+ }

sas_device = kzalloc(sizeof(struct _sas_device),
GFP_KERNEL);
@@ -5267,6 +5344,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
return -1;
}

+ kref_init(&sas_device->refcount);
sas_device->handle = handle;
if (_scsih_get_sas_address(ioc, le16_to_cpu
(sas_device_pg0.ParentDevHandle),
@@ -5344,7 +5422,6 @@ _scsih_remove_device(struct MPT2SAS_ADAPTER *ioc,
"handle(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
sas_device->handle, (unsigned long long)
sas_device->sas_address));
- kfree(sas_device);
}
/**
* _scsih_device_remove_by_handle - removing device object by handle
@@ -5363,12 +5440,17 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
return;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
- if (sas_device)
- list_del(&sas_device->list);
+ sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
+ if (sas_device) {
+ list_del_init(&sas_device->list);
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+
+ if (sas_device) {
_scsih_remove_device(ioc, sas_device);
+ sas_device_put(sas_device);
+ }
}

/**
@@ -5389,13 +5471,17 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
return;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- sas_address);
- if (sas_device)
- list_del(&sas_device->list);
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc, sas_address);
+ if (sas_device) {
+ list_del_init(&sas_device->list);
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+
+ if (sas_device) {
_scsih_remove_device(ioc, sas_device);
+ sas_device_put(sas_device);
+ }
}
#ifdef CONFIG_SCSI_MPT2SAS_LOGGING
/**
@@ -5716,26 +5802,28 @@ _scsih_sas_device_status_change_event(struct MPT2SAS_ADAPTER *ioc,

spin_lock_irqsave(&ioc->sas_device_lock, flags);
sas_address = le64_to_cpu(event_data->SASAddress);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
sas_address);

- if (!sas_device || !sas_device->starget) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
- }
+ if (!sas_device || !sas_device->starget)
+ goto out;

target_priv_data = sas_device->starget->hostdata;
- if (!target_priv_data) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
- }
+ if (!target_priv_data)
+ goto out;

if (event_data->ReasonCode ==
MPI2_EVENT_SAS_DEV_STAT_RC_INTERNAL_DEVICE_RESET)
target_priv_data->tm_busy = 1;
else
target_priv_data->tm_busy = 0;
+
+out:
+ if (sas_device)
+ sas_device_put(sas_device);
+
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
}

#ifdef CONFIG_SCSI_MPT2SAS_LOGGING
@@ -6123,7 +6211,7 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
u16 handle = le16_to_cpu(element->PhysDiskDevHandle);

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
if (sas_device) {
sas_device->volume_handle = 0;
sas_device->volume_wwid = 0;
@@ -6142,6 +6230,8 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
/* exposing raid component */
if (starget)
starget_for_each_device(starget, NULL, _scsih_reprobe_lun);
+
+ sas_device_put(sas_device);
}

/**
@@ -6170,7 +6260,7 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
&volume_wwid);

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
if (sas_device) {
set_bit(handle, ioc->pd_handles);
if (sas_device->starget && sas_device->starget->hostdata) {
@@ -6189,6 +6279,8 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
/* hiding raid component */
if (starget)
starget_for_each_device(starget, (void *)1, _scsih_reprobe_lun);
+
+ sas_device_put(sas_device);
}

/**
@@ -6221,7 +6313,6 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
Mpi2EventIrConfigElement_t *element)
{
struct _sas_device *sas_device;
- unsigned long flags;
u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
Mpi2ConfigReply_t mpi_reply;
Mpi2SasDevicePage0_t sas_device_pg0;
@@ -6231,11 +6322,11 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,

set_bit(handle, ioc->pd_handles);

- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+ sas_device = _scsih_sas_device_get_by_handle(ioc, handle);
+ if (sas_device) {
+ sas_device_put(sas_device);
return;
+ }

if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
@@ -6509,7 +6600,6 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
u16 handle, parent_handle;
u32 state;
struct _sas_device *sas_device;
- unsigned long flags;
Mpi2ConfigReply_t mpi_reply;
Mpi2SasDevicePage0_t sas_device_pg0;
u32 ioc_status;
@@ -6542,12 +6632,11 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
if (!ioc->is_warpdrive)
set_bit(handle, ioc->pd_handles);

- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-
- if (sas_device)
+ sas_device = _scsih_sas_device_get_by_handle(ioc, handle);
+ if (sas_device) {
+ sas_device_put(sas_device);
return;
+ }

if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
&sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
@@ -7179,11 +7268,11 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
}
phys_disk_num = pd_pg0.PhysDiskNum;
handle = le16_to_cpu(pd_pg0.DevHandle);
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+ sas_device = _scsih_sas_device_get_by_handle(ioc, handle);
+ if (sas_device) {
+ sas_device_put(sas_device);
continue;
+ }
if (mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
&sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
handle) != 0)
@@ -7302,12 +7391,12 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
if (!(_scsih_is_end_device(
le32_to_cpu(sas_device_pg0.DeviceInfo))))
continue;
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address(ioc,
le64_to_cpu(sas_device_pg0.SASAddress));
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+ if (sas_device) {
+ sas_device_put(sas_device);
continue;
+ }
parent_handle = le16_to_cpu(sas_device_pg0.ParentDevHandle);
if (!_scsih_get_sas_address(ioc, parent_handle, &sas_address)) {
printk(MPT2SAS_INFO_FMT "\tBEFORE adding end device: "
diff --git a/drivers/scsi/mpt2sas/mpt2sas_transport.c b/drivers/scsi/mpt2sas/mpt2sas_transport.c
index ff2500a..ebfc827 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_transport.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_transport.c
@@ -1323,15 +1323,17 @@ _transport_get_enclosure_identifier(struct sas_rphy *rphy, u64 *identifier)
int rc;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
rphy->identify.sas_address);
if (sas_device) {
*identifier = sas_device->enclosure_logical_id;
rc = 0;
+ sas_device_put(sas_device);
} else {
*identifier = 0;
rc = -ENXIO;
}
+
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
return rc;
}
@@ -1351,12 +1353,14 @@ _transport_get_bay_identifier(struct sas_rphy *rphy)
int rc;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
rphy->identify.sas_address);
- if (sas_device)
+ if (sas_device) {
rc = sas_device->slot;
- else
+ sas_device_put(sas_device);
+ } else {
rc = -ENXIO;
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
return rc;
}
--
1.8.1

2015-05-15 03:42:53

by Calvin Owens

Subject: [PATCH 3/6] Fix unsafe sas_device_list usage

We cannot iterate over the list without holding a lock for the entire
duration, or we risk corrupting random memory if items are added or
deleted as we iterate.

This refactors code such that it always holds the lock when iterating
on or accessing the sas_device_list.

Signed-off-by: Calvin Owens <[email protected]>
---
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 83 +++++++++++++++++++++++++++---------
1 file changed, 62 insertions(+), 21 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index ad6ceb7e..9645055 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -7104,6 +7104,7 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
struct _raid_device *raid_device, *raid_device_next;
struct list_head tmp_list;
unsigned long flags;
+ LIST_HEAD(head);

printk(MPT2SAS_INFO_FMT "removing unresponding devices: start\n",
ioc->name);
@@ -7111,14 +7112,29 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
/* removing unresponding end devices */
printk(MPT2SAS_INFO_FMT "removing unresponding devices: end-devices\n",
ioc->name);
+
+ /*
+ * Iterate, pulling off devices marked as non-responding. We become the
+ * owner for the reference the list had on any object we prune.
+ */
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
list_for_each_entry_safe(sas_device, sas_device_next,
- &ioc->sas_device_list, list) {
+ &ioc->sas_device_list, list) {
if (!sas_device->responding)
- mpt2sas_device_remove_by_sas_address(ioc,
- sas_device->sas_address);
+ list_move_tail(&sas_device->list, &head);
else
sas_device->responding = 0;
}
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+ /*
+ * Now, uninitialize and remove the unresponding devices we pruned.
+ */
+ list_for_each_entry_safe(sas_device, sas_device_next, &head, list) {
+ _scsih_remove_device(ioc, sas_device);
+ list_del_init(&sas_device->list);
+ sas_device_put(sas_device);
+ }

/* removing unresponding volumes */
if (ioc->ir_firmware) {
@@ -8055,6 +8071,37 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
}
}

+static struct _sas_device *dequeue_next_sas_device(struct MPT2SAS_ADAPTER *ioc)
+{
+ struct _sas_device *sas_device = NULL;
+ unsigned long flags;
+
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ if (!list_empty(&ioc->sas_device_init_list)) {
+ sas_device = list_first_entry(&ioc->sas_device_init_list,
+ struct _sas_device, list);
+ list_del_init(&sas_device->list);
+ }
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+ /*
+ * If an item was dequeued, the caller now owns the reference that was
+ * previously owned by the list
+ */
+ return sas_device;
+}
+
+static void sas_device_make_active(struct MPT2SAS_ADAPTER *ioc,
+ struct _sas_device *sas_device)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ sas_device_get(sas_device);
+ list_add_tail(&sas_device->list, &ioc->sas_device_list);
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+}
+
/**
* _scsih_probe_sas - reporting sas devices to sas transport
* @ioc: per adapter object
@@ -8064,34 +8111,28 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
static void
_scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
{
- struct _sas_device *sas_device, *next;
- unsigned long flags;
-
- /* SAS Device List */
- list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
- list) {
+ struct _sas_device *sas_device;

- if (ioc->hide_drives)
- continue;
+ if (ioc->hide_drives)
+ return;

+ while ((sas_device = dequeue_next_sas_device(ioc))) {
if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
- sas_device->sas_address_parent)) {
- list_del(&sas_device->list);
- kfree(sas_device);
+ sas_device->sas_address_parent)) {
+ sas_device_put(sas_device);
continue;
} else if (!sas_device->starget) {
if (!ioc->is_driver_loading) {
mpt2sas_transport_port_remove(ioc,
- sas_device->sas_address,
- sas_device->sas_address_parent);
- list_del(&sas_device->list);
- kfree(sas_device);
+ sas_device->sas_address,
+ sas_device->sas_address_parent);
+ sas_device_put(sas_device);
continue;
}
}
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- list_move_tail(&sas_device->list, &ioc->sas_device_list);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+ sas_device_make_active(ioc, sas_device);
+ sas_device_put(sas_device);
}
}

--
1.8.1
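
The prune-then-process approach in _scsih_remove_unresponding_sas_devices() can be sketched in userspace roughly as follows. This is only an illustration under stated assumptions: struct dev, struct dev_list, add_dev(), prune_unresponding() and remove_pruned() are hypothetical names, a pthread mutex stands in for ioc->sas_device_lock, and a singly linked chain stands in for the kernel list_head.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct _sas_device. */
struct dev {
	struct dev *next;
	int responding;
	int id;
};

/* Hypothetical stand-in for the adapter's device list and its lock. */
struct dev_list {
	pthread_mutex_t lock;
	struct dev *head;
};

/* Push a new device at the head of the shared list. Returns 0 on success. */
static int add_dev(struct dev_list *l, int id, int responding)
{
	struct dev *d = calloc(1, sizeof(*d));

	if (!d)
		return -1;
	d->id = id;
	d->responding = responding;
	pthread_mutex_lock(&l->lock);
	d->next = l->head;
	l->head = d;
	pthread_mutex_unlock(&l->lock);
	return 0;
}

/*
 * Unlink every non-responding device while holding the lock and return them
 * as a private chain. Devices that stay get their flag cleared.
 */
static struct dev *prune_unresponding(struct dev_list *l)
{
	struct dev *pruned = NULL, **tail = &pruned;
	struct dev **pp;

	pthread_mutex_lock(&l->lock);
	pp = &l->head;
	while (*pp) {
		struct dev *d = *pp;

		if (!d->responding) {
			*pp = d->next;		/* unlink from the shared list */
			d->next = NULL;
			*tail = d;		/* append to the private chain */
			tail = &d->next;
		} else {
			d->responding = 0;
			pp = &d->next;
		}
	}
	pthread_mutex_unlock(&l->lock);

	return pruned;
}

/* Tear down the private chain without the lock; returns how many were freed. */
static int remove_pruned(struct dev *pruned)
{
	int n = 0;

	while (pruned) {
		struct dev *next = pruned->next;

		free(pruned);	/* slow teardown work would happen here */
		pruned = next;
		n++;
	}
	return n;
}
```

Nothing else can reach the pruned entries once the lock is dropped, so the slow teardown can run without holding it.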

2015-05-15 03:43:03

by Calvin Owens

Subject: [PATCH 4/6] Add refcount to fw_event_work struct

The fw_event_work struct is concurrently referenced at shutdown, so
add a refcount to protect it.

Signed-off-by: Calvin Owens <[email protected]>
---
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 9645055..611b34d 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -176,9 +176,37 @@ struct fw_event_work {
u8 VP_ID;
u8 ignore;
u16 event;
+ struct kref refcount;
char event_data[0] __aligned(4);
};

+static void fw_event_work_free(struct kref *r)
+{
+ kfree(container_of(r, struct fw_event_work, refcount));
+}
+
+static void fw_event_work_get(struct fw_event_work *fw_work)
+{
+ kref_get(&fw_work->refcount);
+}
+
+static void fw_event_work_put(struct fw_event_work *fw_work)
+{
+ kref_put(&fw_work->refcount, fw_event_work_free);
+}
+
+static struct fw_event_work *alloc_fw_event_work(int len)
+{
+ struct fw_event_work *fw_event;
+
+ fw_event = kzalloc(sizeof(*fw_event) + len, GFP_ATOMIC);
+ if (!fw_event)
+ return NULL;
+
+ kref_init(&fw_event->refcount);
+ return fw_event;
+}
+
/* raid transport support */
static struct raid_template *mpt2sas_raid_template;

--
1.8.1

2015-05-15 03:43:00

by Calvin Owens

Subject: [PATCH 5/6] Refactor code to use new fw_event refcount

This refactors the fw_event code to use the new refcount.

Signed-off-by: Calvin Owens <[email protected]>
---
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 611b34d..8d8c814 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -2863,6 +2863,7 @@ _scsih_fw_event_add(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work *fw_event)
return;

spin_lock_irqsave(&ioc->fw_event_lock, flags);
+ fw_event_work_get(fw_event);
list_add_tail(&fw_event->list, &ioc->fw_event_list);
INIT_DELAYED_WORK(&fw_event->delayed_work, _firmware_event_work);
queue_delayed_work(ioc->firmware_event_thread,
@@ -2887,12 +2888,13 @@ _scsih_fw_event_free(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work
unsigned long flags;

spin_lock_irqsave(&ioc->fw_event_lock, flags);
- list_del(&fw_event->list);
- kfree(fw_event);
+ if (!list_empty(&fw_event->list))
+ list_del_init(&fw_event->list);
+
+ fw_event_work_put(fw_event);
spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
}

-
/**
* _scsih_error_recovery_delete_devices - remove devices not responding
* @ioc: per adapter object
@@ -2907,13 +2909,14 @@ _scsih_error_recovery_delete_devices(struct MPT2SAS_ADAPTER *ioc)
if (ioc->is_driver_loading)
return;

- fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+ fw_event = alloc_fw_event_work(0);
if (!fw_event)
return;

fw_event->event = MPT2SAS_REMOVE_UNRESPONDING_DEVICES;
fw_event->ioc = ioc;
_scsih_fw_event_add(ioc, fw_event);
+ fw_event_work_put(fw_event);
}

/**
@@ -2927,12 +2930,13 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
{
struct fw_event_work *fw_event;

- fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+ fw_event = alloc_fw_event_work(0);
if (!fw_event)
return;
fw_event->event = MPT2SAS_PORT_ENABLE_COMPLETE;
fw_event->ioc = ioc;
_scsih_fw_event_add(ioc, fw_event);
+ fw_event_work_put(fw_event);
}

/**
@@ -4439,13 +4443,14 @@ _scsih_send_event_to_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
{
struct fw_event_work *fw_event;

- fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+ fw_event = alloc_fw_event_work(0);
if (!fw_event)
return;
fw_event->event = MPT2SAS_TURN_ON_PFA_LED;
fw_event->device_handle = handle;
fw_event->ioc = ioc;
_scsih_fw_event_add(ioc, fw_event);
+ fw_event_work_put(fw_event);
}

/**
@@ -7740,7 +7745,7 @@ mpt2sas_scsih_event_callback(struct MPT2SAS_ADAPTER *ioc, u8 msix_index,
}

sz = le16_to_cpu(mpi_reply->EventDataLength) * 4;
- fw_event = kzalloc(sizeof(*fw_event) + sz, GFP_ATOMIC);
+ fw_event = alloc_fw_event_work(sz);
if (!fw_event) {
printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
ioc->name, __FILE__, __LINE__, __func__);
@@ -7753,6 +7758,7 @@ mpt2sas_scsih_event_callback(struct MPT2SAS_ADAPTER *ioc, u8 msix_index,
fw_event->VP_ID = mpi_reply->VP_ID;
fw_event->event = event;
_scsih_fw_event_add(ioc, fw_event);
+ fw_event_work_put(fw_event);
return;
}

--
1.8.1
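
The reference handoff this patch sets up (allocation holds one reference, the list takes its own, the allocating function drops its reference after queueing) might look like this in a userspace sketch. It assumes a single thread, so the fw_event_lock is omitted, and uses a C11 atomic counter in place of struct kref. All names here (struct event, post_event(), consume_event(), events_freed) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical analogue of struct fw_event_work with its refcount. */
struct event {
	struct event *next;
	atomic_int refcount;
	int code;
};

static struct event *queue_head;	/* single-threaded sketch: no lock */
static int events_freed;		/* bookkeeping for the demonstration */

static struct event *event_alloc(int code)
{
	struct event *ev = calloc(1, sizeof(*ev));

	if (!ev)
		return NULL;
	atomic_init(&ev->refcount, 1);	/* the creator holds the first reference */
	ev->code = code;
	return ev;
}

static void event_get(struct event *ev)
{
	atomic_fetch_add(&ev->refcount, 1);
}

static void event_put(struct event *ev)
{
	/* fetch_sub returns the old value: 1 means this was the last reference. */
	if (atomic_fetch_sub(&ev->refcount, 1) == 1) {
		events_freed++;
		free(ev);
	}
}

/* The queue takes its own reference when the event is added. */
static void queue_add(struct event *ev)
{
	event_get(ev);
	ev->next = queue_head;
	queue_head = ev;
}

/* Producer: allocate, hand to the queue, then drop the creator's reference. */
static int post_event(int code)
{
	struct event *ev = event_alloc(code);

	if (!ev)
		return -1;
	queue_add(ev);
	event_put(ev);
	return 0;
}

/* Consumer: unlink the head and drop the queue's reference. */
static int consume_event(void)
{
	struct event *ev = queue_head;
	int code;

	if (!ev)
		return -1;
	queue_head = ev->next;
	code = ev->code;
	event_put(ev);
	return code;
}
```

After post_event() returns, the only remaining reference belongs to the queue, so the event is freed exactly once, by whoever removes it.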

2015-05-15 03:44:11

by Calvin Owens

Subject: [PATCH 6/6] Fix unsafe fw_event_list usage

Since the fw_event deletes itself from the list, cleanup_queue() can
walk onto garbage pointers or walk off into freed memory.

This refactors the code in _scsih_fw_event_cleanup_queue() to not
iterate over the fw_event_list without a lock.

Signed-off-by: Calvin Owens <[email protected]>
---
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 8d8c814..f504e28 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -2939,6 +2939,23 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
fw_event_work_put(fw_event);
}

+static struct fw_event_work *dequeue_next_fw_event(struct MPT2SAS_ADAPTER *ioc)
+{
+ unsigned long flags;
+ struct fw_event_work *fw_event = NULL;
+
+ spin_lock_irqsave(&ioc->fw_event_lock, flags);
+ if (!list_empty(&ioc->fw_event_list)) {
+ fw_event = list_first_entry(&ioc->fw_event_list,
+ struct fw_event_work, list);
+ list_del_init(&fw_event->list);
+ fw_event_work_get(fw_event);
+ }
+ spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
+
+ return fw_event;
+}
+
/**
* _scsih_fw_event_cleanup_queue - cleanup event queue
* @ioc: per adapter object
@@ -2951,17 +2968,18 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
static void
_scsih_fw_event_cleanup_queue(struct MPT2SAS_ADAPTER *ioc)
{
- struct fw_event_work *fw_event, *next;
+ struct fw_event_work *fw_event;

if (list_empty(&ioc->fw_event_list) ||
!ioc->firmware_event_thread || in_interrupt())
return;

- list_for_each_entry_safe(fw_event, next, &ioc->fw_event_list, list) {
+ while ((fw_event = dequeue_next_fw_event(ioc))) {
if (cancel_delayed_work_sync(&fw_event->delayed_work)) {
_scsih_fw_event_free(ioc, fw_event);
continue;
}
+ fw_event_work_put(fw_event);
}
}

--
1.8.1
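
The dequeue loop that replaces list_for_each_entry_safe() here can be illustrated in userspace as below. This is a simplified sketch rather than the driver code: struct work, struct work_queue, push_new(), dequeue_next() and drain() are hypothetical names, a pthread mutex stands in for ioc->fw_event_lock, and the reference counting from the earlier patches is left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical queued item. */
struct work {
	struct work *next;
	int id;
};

/* Hypothetical queue protected by a mutex. */
struct work_queue {
	pthread_mutex_t lock;
	struct work *head;
};

/* Allocate an item and push it on the queue. Returns 0 on success. */
static int push_new(struct work_queue *q, int id)
{
	struct work *w = calloc(1, sizeof(*w));

	if (!w)
		return -1;
	w->id = id;
	pthread_mutex_lock(&q->lock);
	w->next = q->head;
	q->head = w;
	pthread_mutex_unlock(&q->lock);
	return 0;
}

/*
 * Detach one item under the lock. Once it is off the queue, no other
 * thread walking the queue can reach it, so the caller owns it.
 */
static struct work *dequeue_next(struct work_queue *q)
{
	struct work *w;

	pthread_mutex_lock(&q->lock);
	w = q->head;
	if (w) {
		q->head = w->next;
		w->next = NULL;
	}
	pthread_mutex_unlock(&q->lock);

	return w;
}

/* Drain the queue one item at a time; the lock is never held while freeing. */
static int drain(struct work_queue *q)
{
	struct work *w;
	int n = 0;

	while ((w = dequeue_next(q))) {
		free(w);	/* cancellation or teardown would happen here */
		n++;
	}
	return n;
}
```

Unlike a cached "next" pointer, each pass re-reads the head under the lock, so entries that remove themselves concurrently cannot leave the loop holding a stale pointer.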

2015-06-09 03:51:26

by Calvin Owens

Subject: [RESEND][PATCH 0/6] Fixes for memory corruption in mpt2sas

Hello all,

This patchset attempts to address problems we've been having with
panics due to memory corruption from the mpt2sas driver.

I will provide a similar set of fixes for mpt3sas, since we see
similar issues there as well. "Porting" this to mpt3sas will be
trivial since the part of the driver I'm touching is nearly identical
between the two, so I thought it would be simpler to review a patch
against mpt2sas alone at first.

I've tested this on a handful of large storage boxes over the past few
weeks. So far it seems to have completely eliminated the memory
corruption panics.

Thanks,
Calvin


Total diffstat:

drivers/scsi/mpt2sas/mpt2sas_base.h | 20 +-
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 482 +++++++++++++++++++++----------
drivers/scsi/mpt2sas/mpt2sas_transport.c | 12 +-
3 files changed, 359 insertions(+), 155 deletions(-)

Patches:

* [PATCH 1/6] Add refcount to sas_device struct
* [PATCH 2/6] Refactor code to use new sas_device refcount
* [PATCH 3/6] Fix unsafe sas_device_list usage
* [PATCH 4/6] Add refcount to fw_event_work struct
* [PATCH 5/6] Refactor code to use new fw_event refcount
* [PATCH 6/6] Fix unsafe fw_event_list usage

2015-06-09 03:51:51

by Calvin Owens

Subject: [PATCH 1/6] Add refcount to sas_device struct

These objects can be referenced concurrently throughout the driver, so we
need a way to make sure threads can't delete them out from under each
other.

Signed-off-by: Calvin Owens <[email protected]>
---
drivers/scsi/mpt2sas/mpt2sas_base.h | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
index caff8d1..2e7dc33 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -376,8 +376,24 @@ struct _sas_device {
u8 phy;
u8 responding;
u8 pfa_led_on;
+ struct kref refcount;
};

+static inline void sas_device_get(struct _sas_device *s)
+{
+ kref_get(&s->refcount);
+}
+
+static inline void sas_device_free(struct kref *r)
+{
+ kfree(container_of(r, struct _sas_device, refcount));
+}
+
+static inline void sas_device_put(struct _sas_device *s)
+{
+ kref_put(&s->refcount, sas_device_free);
+}
+
/**
* struct _raid_device - raid volume link list
* @list: sas device list
--
1.8.1
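
To illustrate why a per-object reference count keeps one thread from freeing a device another is still using, here is a minimal userspace sketch. It is not the driver's code: struct node, struct table, table_add(), table_get(), table_remove() and node_put() are hypothetical names, a C11 atomic counter stands in for struct kref, and a pthread mutex stands in for ioc->sas_device_lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical refcounted object kept on a shared list. */
struct node {
	struct node *next;
	atomic_int refcount;
	unsigned long long addr;
};

struct table {
	pthread_mutex_t lock;
	struct node *head;
};

/* Drop one reference; the last one frees the object. */
static void node_put(struct node *n)
{
	if (atomic_fetch_sub(&n->refcount, 1) == 1)
		free(n);
}

/* Insert a new node; the list holds the initial reference. */
static int table_add(struct table *t, unsigned long long addr)
{
	struct node *n = calloc(1, sizeof(*n));

	if (!n)
		return -1;
	atomic_init(&n->refcount, 1);
	n->addr = addr;
	pthread_mutex_lock(&t->lock);
	n->next = t->head;
	t->head = n;
	pthread_mutex_unlock(&t->lock);
	return 0;
}

/* Look up by address and take a reference before dropping the lock. */
static struct node *table_get(struct table *t, unsigned long long addr)
{
	struct node *n;

	pthread_mutex_lock(&t->lock);
	for (n = t->head; n; n = n->next) {
		if (n->addr == addr) {
			atomic_fetch_add(&n->refcount, 1);
			break;
		}
	}
	pthread_mutex_unlock(&t->lock);

	return n;	/* the caller must node_put() a non-NULL result */
}

/* Unlink by address and drop the list's reference. Returns 1 if removed. */
static int table_remove(struct table *t, unsigned long long addr)
{
	struct node **pp, *n = NULL;

	pthread_mutex_lock(&t->lock);
	for (pp = &t->head; *pp; pp = &(*pp)->next) {
		if ((*pp)->addr == addr) {
			n = *pp;
			*pp = n->next;
			n->next = NULL;
			break;
		}
	}
	pthread_mutex_unlock(&t->lock);

	if (!n)
		return 0;
	node_put(n);
	return 1;
}
```

A caller that obtained the node through table_get() can keep dereferencing it after another thread has called table_remove(); the memory is released only when that caller does its own node_put().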

2015-06-09 03:52:46

by Calvin Owens

Subject: [PATCH 2/6] Refactor code to use new sas_device refcount

This patch refactors the code in the driver to use the new reference
count on the sas_device struct.

Signed-off-by: Calvin Owens <[email protected]>
---
drivers/scsi/mpt2sas/mpt2sas_base.h | 4 +-
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 329 ++++++++++++++++++++-----------
drivers/scsi/mpt2sas/mpt2sas_transport.c | 12 +-
3 files changed, 220 insertions(+), 125 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
index 2e7dc33..dac0e8a 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -1111,7 +1111,9 @@ struct _sas_node *mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
u16 handle);
struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct MPT2SAS_ADAPTER
*ioc, u64 sas_address);
-struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
+struct _sas_device *mpt2sas_scsih_sas_device_get_by_sas_address(
+ struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
+struct _sas_device *mpt2sas_scsih_sas_device_get_by_sas_address_nolock(
struct MPT2SAS_ADAPTER *ioc, u64 sas_address);

void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 3f26147..ad6ceb7e 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -526,8 +526,31 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
}
}

+struct _sas_device *
+mpt2sas_scsih_sas_device_get_by_sas_address_nolock(struct MPT2SAS_ADAPTER *ioc,
+ u64 sas_address)
+{
+ struct _sas_device *sas_device;
+
+ BUG_ON(!spin_is_locked(&ioc->sas_device_lock));
+
+ list_for_each_entry(sas_device, &ioc->sas_device_list, list)
+ if (sas_device->sas_address == sas_address)
+ goto found_device;
+
+ list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
+ if (sas_device->sas_address == sas_address)
+ goto found_device;
+
+ return NULL;
+
+found_device:
+ sas_device_get(sas_device);
+ return sas_device;
+}
+
/**
- * mpt2sas_scsih_sas_device_find_by_sas_address - sas device search
+ * mpt2sas_scsih_sas_device_get_by_sas_address - sas device search
* @ioc: per adapter object
* @sas_address: sas address
* Context: Calling function should acquire ioc->sas_device_lock
@@ -536,24 +559,44 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
* object.
*/
struct _sas_device *
-mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
+mpt2sas_scsih_sas_device_get_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
u64 sas_address)
{
struct _sas_device *sas_device;
+ unsigned long flags;
+
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
+ sas_address);
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+ return sas_device;
+}
+
+static struct _sas_device *
+_scsih_sas_device_get_by_handle_nolock(struct MPT2SAS_ADAPTER *ioc, u16 handle)
+{
+ struct _sas_device *sas_device;
+
+ BUG_ON(!spin_is_locked(&ioc->sas_device_lock));

list_for_each_entry(sas_device, &ioc->sas_device_list, list)
- if (sas_device->sas_address == sas_address)
- return sas_device;
+ if (sas_device->handle == handle)
+ goto found_device;

list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
- if (sas_device->sas_address == sas_address)
- return sas_device;
+ if (sas_device->handle == handle)
+ goto found_device;

return NULL;
+
+found_device:
+ sas_device_get(sas_device);
+ return sas_device;
}

/**
- * _scsih_sas_device_find_by_handle - sas device search
+ * _scsih_sas_device_get_by_handle - sas device search
* @ioc: per adapter object
* @handle: sas device handle (assigned by firmware)
* Context: Calling function should acquire ioc->sas_device_lock
@@ -562,19 +605,16 @@ mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
* object.
*/
static struct _sas_device *
-_scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
+_scsih_sas_device_get_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
{
struct _sas_device *sas_device;
+ unsigned long flags;

- list_for_each_entry(sas_device, &ioc->sas_device_list, list)
- if (sas_device->handle == handle)
- return sas_device;
-
- list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
- if (sas_device->handle == handle)
- return sas_device;
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

- return NULL;
+ return sas_device;
}

/**
@@ -583,7 +623,7 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
* @sas_device: the sas_device object
* Context: This function will acquire ioc->sas_device_lock.
*
- * Removing object and freeing associated memory from the ioc->sas_device_list.
+ * If sas_device is on the list, remove it and decrement its reference count.
*/
static void
_scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
@@ -594,9 +634,15 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
if (!sas_device)
return;

+ /*
+ * The lock serializes access to the list, but we still need to verify
+ * that nobody removed the entry while we were waiting on the lock.
+ */
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- list_del(&sas_device->list);
- kfree(sas_device);
+ if (!list_empty(&sas_device->list)) {
+ list_del_init(&sas_device->list);
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}

@@ -620,6 +666,7 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
sas_device->handle, (unsigned long long)sas_device->sas_address));

spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ sas_device_get(sas_device);
list_add_tail(&sas_device->list, &ioc->sas_device_list);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

@@ -659,6 +706,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
sas_device->handle, (unsigned long long)sas_device->sas_address));

spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ sas_device_get(sas_device);
list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
_scsih_determine_boot_device(ioc, sas_device, 0);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -1208,12 +1256,15 @@ _scsih_change_queue_depth(struct scsi_device *sdev, int qdepth)
goto not_sata;
if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))
goto not_sata;
+
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
sas_device_priv_data->sas_target->sas_address);
- if (sas_device && sas_device->device_info &
- MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
+ if (sas_device && sas_device->device_info
+ & MPI2_SAS_DEVICE_INFO_SATA_DEVICE) {
max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

not_sata:
@@ -1271,7 +1322,7 @@ _scsih_target_alloc(struct scsi_target *starget)
/* sas/sata devices */
spin_lock_irqsave(&ioc->sas_device_lock, flags);
rphy = dev_to_rphy(starget->dev.parent);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
rphy->identify.sas_address);

if (sas_device) {
@@ -1283,6 +1334,8 @@ _scsih_target_alloc(struct scsi_target *starget)
if (test_bit(sas_device->handle, ioc->pd_handles))
sas_target_priv_data->flags |=
MPT_TARGET_FLAGS_RAID_COMPONENT;
+
+ sas_device_put(sas_device);
}
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

@@ -1324,13 +1377,15 @@ _scsih_target_destroy(struct scsi_target *starget)

spin_lock_irqsave(&ioc->sas_device_lock, flags);
rphy = dev_to_rphy(starget->dev.parent);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
rphy->identify.sas_address);
if (sas_device && (sas_device->starget == starget) &&
(sas_device->id == starget->id) &&
(sas_device->channel == starget->channel))
sas_device->starget = NULL;

+ if (sas_device)
+ sas_device_put(sas_device);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

out:
@@ -1386,7 +1441,7 @@ _scsih_slave_alloc(struct scsi_device *sdev)

if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
sas_target_priv_data->sas_address);
if (sas_device && (sas_device->starget == NULL)) {
sdev_printk(KERN_INFO, sdev,
@@ -1394,6 +1449,10 @@ _scsih_slave_alloc(struct scsi_device *sdev)
__func__, __LINE__);
sas_device->starget = starget;
}
+
+ if (sas_device)
+ sas_device_put(sas_device);
+
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}

@@ -1428,10 +1487,13 @@ _scsih_slave_destroy(struct scsi_device *sdev)

if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
sas_target_priv_data->sas_address);
if (sas_device && !sas_target_priv_data->num_luns)
sas_device->starget = NULL;
+
+ if (sas_device)
+ sas_device_put(sas_device);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}

@@ -2078,7 +2140,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
}

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
sas_device_priv_data->sas_target->sas_address);
if (!sas_device) {
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -2116,13 +2178,14 @@ _scsih_slave_configure(struct scsi_device *sdev)
if (!ssp_target)
_scsih_display_sata_capabilities(ioc, handle, sdev);

-
_scsih_change_queue_depth(sdev, qdepth);

if (ssp_target) {
sas_read_port_mode_page(sdev);
_scsih_enable_tlr(ioc, sdev);
}
+
+ sas_device_put(sas_device);
return 0;
}

@@ -2509,7 +2572,7 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
device_str, (unsigned long long)priv_target->sas_address);
} else {
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
priv_target->sas_address);
if (sas_device) {
if (priv_target->flags &
@@ -2529,6 +2592,8 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
"enclosure_logical_id(0x%016llx), slot(%d)\n",
(unsigned long long)sas_device->enclosure_logical_id,
sas_device->slot);
+
+ sas_device_put(sas_device);
}
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}
@@ -2604,8 +2669,7 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
{
struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
struct MPT2SAS_DEVICE *sas_device_priv_data;
- struct _sas_device *sas_device;
- unsigned long flags;
+ struct _sas_device *sas_device = NULL;
u16 handle;
int r;

@@ -2629,12 +2693,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
handle = 0;
if (sas_device_priv_data->sas_target->flags &
MPT_TARGET_FLAGS_RAID_COMPONENT) {
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc,
+ sas_device = _scsih_sas_device_get_by_handle(ioc,
sas_device_priv_data->sas_target->handle);
if (sas_device)
handle = sas_device->volume_handle;
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
} else
handle = sas_device_priv_data->sas_target->handle;

@@ -2651,6 +2713,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
out:
sdev_printk(KERN_INFO, scmd->device, "device reset: %s scmd(%p)\n",
((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
+
+ if (sas_device)
+ sas_device_put(sas_device);
+
return r;
}

@@ -2665,8 +2731,7 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
{
struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
struct MPT2SAS_DEVICE *sas_device_priv_data;
- struct _sas_device *sas_device;
- unsigned long flags;
+ struct _sas_device *sas_device = NULL;
u16 handle;
int r;
struct scsi_target *starget = scmd->device->sdev_target;
@@ -2689,12 +2754,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
handle = 0;
if (sas_device_priv_data->sas_target->flags &
MPT_TARGET_FLAGS_RAID_COMPONENT) {
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc,
+ sas_device = _scsih_sas_device_get_by_handle(ioc,
sas_device_priv_data->sas_target->handle);
if (sas_device)
handle = sas_device->volume_handle;
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
} else
handle = sas_device_priv_data->sas_target->handle;

@@ -2711,6 +2774,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
out:
starget_printk(KERN_INFO, starget, "target reset: %s scmd(%p)\n",
((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
+
+ if (sas_device)
+ sas_device_put(sas_device);
+
return r;
}

@@ -3002,15 +3069,15 @@ _scsih_block_io_to_children_attached_to_ex(struct MPT2SAS_ADAPTER *ioc,

list_for_each_entry(mpt2sas_port,
&sas_expander->sas_port_list, port_list) {
- if (mpt2sas_port->remote_identify.device_type ==
- SAS_END_DEVICE) {
+ if (mpt2sas_port->remote_identify.device_type == SAS_END_DEVICE) {
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device =
- mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- mpt2sas_port->remote_identify.sas_address);
- if (sas_device)
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
+ mpt2sas_port->remote_identify.sas_address);
+ if (sas_device) {
set_bit(sas_device->handle,
- ioc->blocking_handles);
+ ioc->blocking_handles);
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}
}
@@ -3080,7 +3147,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
{
Mpi2SCSITaskManagementRequest_t *mpi_request;
u16 smid;
- struct _sas_device *sas_device;
+ struct _sas_device *sas_device = NULL;
struct MPT2SAS_TARGET *sas_target_priv_data = NULL;
u64 sas_address = 0;
unsigned long flags;
@@ -3110,7 +3177,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
return;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
if (sas_device && sas_device->starget &&
sas_device->starget->hostdata) {
sas_target_priv_data = sas_device->starget->hostdata;
@@ -3131,14 +3198,14 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
if (!smid) {
delayed_tr = kzalloc(sizeof(*delayed_tr), GFP_ATOMIC);
if (!delayed_tr)
- return;
+ goto out;
INIT_LIST_HEAD(&delayed_tr->list);
delayed_tr->handle = handle;
list_add_tail(&delayed_tr->list, &ioc->delayed_tr_list);
dewtprintk(ioc, printk(MPT2SAS_INFO_FMT
"DELAYED:tr:handle(0x%04x), (open)\n",
ioc->name, handle));
- return;
+ goto out;
}

dewtprintk(ioc, printk(MPT2SAS_INFO_FMT "tr_send:handle(0x%04x), "
@@ -3150,6 +3217,9 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
mpi_request->DevHandle = cpu_to_le16(handle);
mpi_request->TaskType = MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET;
mpt2sas_base_put_smid_hi_priority(ioc, smid);
+out:
+ if (sas_device)
+ sas_device_put(sas_device);
}


@@ -4068,7 +4138,6 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
char *desc_scsi_state = ioc->tmp_string;
u32 log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
struct _sas_device *sas_device = NULL;
- unsigned long flags;
struct scsi_target *starget = scmd->device->sdev_target;
struct MPT2SAS_TARGET *priv_target = starget->hostdata;
char *device_str = NULL;
@@ -4200,8 +4269,7 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
printk(MPT2SAS_WARN_FMT "\t%s wwid(0x%016llx)\n", ioc->name,
device_str, (unsigned long long)priv_target->sas_address);
} else {
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address(ioc,
priv_target->sas_address);
if (sas_device) {
printk(MPT2SAS_WARN_FMT "\tsas_address(0x%016llx), "
@@ -4211,8 +4279,9 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
"\tenclosure_logical_id(0x%016llx), slot(%d)\n",
ioc->name, sas_device->enclosure_logical_id,
sas_device->slot);
+
+ sas_device_put(sas_device);
}
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}

printk(MPT2SAS_WARN_FMT "\thandle(0x%04x), ioc_status(%s)(0x%04x), "
@@ -4259,7 +4328,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
Mpi2SepRequest_t mpi_request;
struct _sas_device *sas_device;

- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = _scsih_sas_device_get_by_handle(ioc, handle);
if (!sas_device)
return;

@@ -4274,7 +4343,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
&mpi_request)) != 0) {
printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n", ioc->name,
__FILE__, __LINE__, __func__);
- return;
+ goto out;
}
sas_device->pfa_led_on = 1;

@@ -4284,8 +4353,10 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
"enclosure_processor: ioc_status (0x%04x), loginfo(0x%08x)\n",
ioc->name, le16_to_cpu(mpi_reply.IOCStatus),
le32_to_cpu(mpi_reply.IOCLogInfo)));
- return;
+ goto out;
}
+out:
+ sas_device_put(sas_device);
}

/**
@@ -4370,19 +4441,17 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)

/* only handle non-raid devices */
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
if (!sas_device) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
+ goto out_unlock;
}
starget = sas_device->starget;
sas_target_priv_data = starget->hostdata;

if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_RAID_COMPONENT) ||
- ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
- }
+ ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)))
+ goto out_unlock;
+
starget_printk(KERN_WARNING, starget, "predicted fault\n");
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

@@ -4396,7 +4465,7 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
if (!event_reply) {
printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
ioc->name, __FILE__, __LINE__, __func__);
- return;
+ goto out;
}

event_reply->Function = MPI2_FUNCTION_EVENT_NOTIFICATION;
@@ -4413,6 +4482,14 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
event_data->SASAddress = cpu_to_le64(sas_target_priv_data->sas_address);
mpt2sas_ctl_add_to_event_log(ioc, event_reply);
kfree(event_reply);
+out:
+ if (sas_device)
+ sas_device_put(sas_device);
+ return;
+
+out_unlock:
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+ goto out;
}

/**
@@ -5148,14 +5225,13 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)

spin_lock_irqsave(&ioc->sas_device_lock, flags);
sas_address = le64_to_cpu(sas_device_pg0.SASAddress);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
sas_address);

if (!sas_device) {
printk(MPT2SAS_ERR_FMT "device is not present "
"handle(0x%04x), no sas_device!!!\n", ioc->name, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
+ goto out_unlock;
}

if (unlikely(sas_device->handle != handle)) {
@@ -5172,19 +5248,22 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
MPI2_SAS_DEVICE0_FLAGS_DEVICE_PRESENT)) {
printk(MPT2SAS_ERR_FMT "device is not present "
"handle(0x%04x), flags!!!\n", ioc->name, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
+ goto out_unlock;
}

/* check if there were any issues with discovery */
if (_scsih_check_access_status(ioc, sas_address, handle,
- sas_device_pg0.AccessStatus)) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
- }
+ sas_device_pg0.AccessStatus))
+ goto out_unlock;
+
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
_scsih_ublock_io_device(ioc, sas_address);
+ return;

+out_unlock:
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+ if (sas_device)
+ sas_device_put(sas_device);
}

/**
@@ -5208,7 +5287,6 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
u32 ioc_status;
__le64 sas_address;
u32 device_info;
- unsigned long flags;

if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
@@ -5250,14 +5328,13 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
return -1;
}

-
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address(ioc,
sas_address);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

- if (sas_device)
+ if (sas_device) {
+ sas_device_put(sas_device);
return 0;
+ }

sas_device = kzalloc(sizeof(struct _sas_device),
GFP_KERNEL);
@@ -5267,6 +5344,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
return -1;
}

+ kref_init(&sas_device->refcount);
sas_device->handle = handle;
if (_scsih_get_sas_address(ioc, le16_to_cpu
(sas_device_pg0.ParentDevHandle),
@@ -5344,7 +5422,6 @@ _scsih_remove_device(struct MPT2SAS_ADAPTER *ioc,
"handle(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
sas_device->handle, (unsigned long long)
sas_device->sas_address));
- kfree(sas_device);
}
/**
* _scsih_device_remove_by_handle - removing device object by handle
@@ -5363,12 +5440,17 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
return;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
- if (sas_device)
- list_del(&sas_device->list);
+ sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
+ if (sas_device) {
+ list_del_init(&sas_device->list);
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+
+ if (sas_device) {
_scsih_remove_device(ioc, sas_device);
+ sas_device_put(sas_device);
+ }
}

/**
@@ -5389,13 +5471,17 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
return;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- sas_address);
- if (sas_device)
- list_del(&sas_device->list);
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc, sas_address);
+ if (sas_device) {
+ list_del_init(&sas_device->list);
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+
+ if (sas_device) {
_scsih_remove_device(ioc, sas_device);
+ sas_device_put(sas_device);
+ }
}
#ifdef CONFIG_SCSI_MPT2SAS_LOGGING
/**
@@ -5716,26 +5802,28 @@ _scsih_sas_device_status_change_event(struct MPT2SAS_ADAPTER *ioc,

spin_lock_irqsave(&ioc->sas_device_lock, flags);
sas_address = le64_to_cpu(event_data->SASAddress);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
sas_address);

- if (!sas_device || !sas_device->starget) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
- }
+ if (!sas_device || !sas_device->starget)
+ goto out;

target_priv_data = sas_device->starget->hostdata;
- if (!target_priv_data) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
- }
+ if (!target_priv_data)
+ goto out;

if (event_data->ReasonCode ==
MPI2_EVENT_SAS_DEV_STAT_RC_INTERNAL_DEVICE_RESET)
target_priv_data->tm_busy = 1;
else
target_priv_data->tm_busy = 0;
+
+out:
+ if (sas_device)
+ sas_device_put(sas_device);
+
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
}

#ifdef CONFIG_SCSI_MPT2SAS_LOGGING
@@ -6123,7 +6211,7 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
u16 handle = le16_to_cpu(element->PhysDiskDevHandle);

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
if (sas_device) {
sas_device->volume_handle = 0;
sas_device->volume_wwid = 0;
@@ -6142,6 +6230,8 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
/* exposing raid component */
if (starget)
starget_for_each_device(starget, NULL, _scsih_reprobe_lun);
+
+ sas_device_put(sas_device);
}

/**
@@ -6170,7 +6260,7 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
&volume_wwid);

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = _scsih_sas_device_get_by_handle_nolock(ioc, handle);
if (sas_device) {
set_bit(handle, ioc->pd_handles);
if (sas_device->starget && sas_device->starget->hostdata) {
@@ -6189,6 +6279,8 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
/* hiding raid component */
if (starget)
starget_for_each_device(starget, (void *)1, _scsih_reprobe_lun);
+
+ sas_device_put(sas_device);
}

/**
@@ -6221,7 +6313,6 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
Mpi2EventIrConfigElement_t *element)
{
struct _sas_device *sas_device;
- unsigned long flags;
u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
Mpi2ConfigReply_t mpi_reply;
Mpi2SasDevicePage0_t sas_device_pg0;
@@ -6231,11 +6322,11 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,

set_bit(handle, ioc->pd_handles);

- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+ sas_device = _scsih_sas_device_get_by_handle(ioc, handle);
+ if (sas_device) {
+ sas_device_put(sas_device);
return;
+ }

if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
@@ -6509,7 +6600,6 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
u16 handle, parent_handle;
u32 state;
struct _sas_device *sas_device;
- unsigned long flags;
Mpi2ConfigReply_t mpi_reply;
Mpi2SasDevicePage0_t sas_device_pg0;
u32 ioc_status;
@@ -6542,12 +6632,11 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
if (!ioc->is_warpdrive)
set_bit(handle, ioc->pd_handles);

- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-
- if (sas_device)
+ sas_device = _scsih_sas_device_get_by_handle(ioc, handle);
+ if (sas_device) {
+ sas_device_put(sas_device);
return;
+ }

if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
&sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
@@ -7179,11 +7268,11 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
}
phys_disk_num = pd_pg0.PhysDiskNum;
handle = le16_to_cpu(pd_pg0.DevHandle);
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+ sas_device = _scsih_sas_device_get_by_handle(ioc, handle);
+ if (sas_device) {
+ sas_device_put(sas_device);
continue;
+ }
if (mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
&sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
handle) != 0)
@@ -7302,12 +7391,12 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
if (!(_scsih_is_end_device(
le32_to_cpu(sas_device_pg0.DeviceInfo))))
continue;
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address(ioc,
le64_to_cpu(sas_device_pg0.SASAddress));
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+ if (sas_device) {
+ sas_device_put(sas_device);
continue;
+ }
parent_handle = le16_to_cpu(sas_device_pg0.ParentDevHandle);
if (!_scsih_get_sas_address(ioc, parent_handle, &sas_address)) {
printk(MPT2SAS_INFO_FMT "\tBEFORE adding end device: "
diff --git a/drivers/scsi/mpt2sas/mpt2sas_transport.c b/drivers/scsi/mpt2sas/mpt2sas_transport.c
index ff2500a..ebfc827 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_transport.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_transport.c
@@ -1323,15 +1323,17 @@ _transport_get_enclosure_identifier(struct sas_rphy *rphy, u64 *identifier)
int rc;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
rphy->identify.sas_address);
if (sas_device) {
*identifier = sas_device->enclosure_logical_id;
rc = 0;
+ sas_device_put(sas_device);
} else {
*identifier = 0;
rc = -ENXIO;
}
+
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
return rc;
}
@@ -1351,12 +1353,14 @@ _transport_get_bay_identifier(struct sas_rphy *rphy)
int rc;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
rphy->identify.sas_address);
- if (sas_device)
+ if (sas_device) {
rc = sas_device->slot;
- else
+ sas_device_put(sas_device);
+ } else {
rc = -ENXIO;
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
return rc;
}
--
1.8.1

2015-06-09 03:52:22

by Calvin Owens

Subject: [PATCH 3/6] Fix unsafe sas_device_list usage

We cannot iterate over the list without holding a lock for the entire
duration, or we risk corrupting random memory if items are added or
deleted as we iterate.

This refactors code such that it always holds the lock when iterating
on or accessing the sas_device_list.

Signed-off-by: Calvin Owens <[email protected]>
---
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 83 +++++++++++++++++++++++++++---------
1 file changed, 62 insertions(+), 21 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index ad6ceb7e..9645055 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -7104,6 +7104,7 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
struct _raid_device *raid_device, *raid_device_next;
struct list_head tmp_list;
unsigned long flags;
+ LIST_HEAD(head);

printk(MPT2SAS_INFO_FMT "removing unresponding devices: start\n",
ioc->name);
@@ -7111,14 +7112,29 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
/* removing unresponding end devices */
printk(MPT2SAS_INFO_FMT "removing unresponding devices: end-devices\n",
ioc->name);
+
+ /*
+ * Iterate, pulling off devices marked as non-responding. We become the
+ * owner for the reference the list had on any object we prune.
+ */
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
list_for_each_entry_safe(sas_device, sas_device_next,
- &ioc->sas_device_list, list) {
+ &ioc->sas_device_list, list) {
if (!sas_device->responding)
- mpt2sas_device_remove_by_sas_address(ioc,
- sas_device->sas_address);
+ list_move_tail(&sas_device->list, &head);
else
sas_device->responding = 0;
}
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+ /*
+ * Now, uninitialize and remove the unresponding devices we pruned.
+ */
+ list_for_each_entry_safe(sas_device, sas_device_next, &head, list) {
+ _scsih_remove_device(ioc, sas_device);
+ list_del_init(&sas_device->list);
+ sas_device_put(sas_device);
+ }

/* removing unresponding volumes */
if (ioc->ir_firmware) {
@@ -8055,6 +8071,37 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
}
}

+static struct _sas_device *dequeue_next_sas_device(struct MPT2SAS_ADAPTER *ioc)
+{
+ struct _sas_device *sas_device = NULL;
+ unsigned long flags;
+
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ if (!list_empty(&ioc->sas_device_init_list)) {
+ sas_device = list_first_entry(&ioc->sas_device_init_list,
+ struct _sas_device, list);
+ list_del_init(&sas_device->list);
+ }
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+ /*
+ * If an item was dequeued, the caller now owns the reference that was
+ * previously owned by the list
+ */
+ return sas_device;
+}
+
+static void sas_device_make_active(struct MPT2SAS_ADAPTER *ioc,
+ struct _sas_device *sas_device)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ sas_device_get(sas_device);
+ list_add_tail(&sas_device->list, &ioc->sas_device_list);
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+}
+
/**
* _scsih_probe_sas - reporting sas devices to sas transport
* @ioc: per adapter object
@@ -8064,34 +8111,28 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
static void
_scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
{
- struct _sas_device *sas_device, *next;
- unsigned long flags;
-
- /* SAS Device List */
- list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
- list) {
+ struct _sas_device *sas_device;

- if (ioc->hide_drives)
- continue;
+ if (ioc->hide_drives)
+ return;

+ while ((sas_device = dequeue_next_sas_device(ioc))) {
if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
- sas_device->sas_address_parent)) {
- list_del(&sas_device->list);
- kfree(sas_device);
+ sas_device->sas_address_parent)) {
+ sas_device_put(sas_device);
continue;
} else if (!sas_device->starget) {
if (!ioc->is_driver_loading) {
mpt2sas_transport_port_remove(ioc,
- sas_device->sas_address,
- sas_device->sas_address_parent);
- list_del(&sas_device->list);
- kfree(sas_device);
+ sas_device->sas_address,
+ sas_device->sas_address_parent);
+ sas_device_put(sas_device);
continue;
}
}
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- list_move_tail(&sas_device->list, &ioc->sas_device_list);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+ sas_device_make_active(ioc, sas_device);
+ sas_device_put(sas_device);
}
}

--
1.8.1

2015-06-09 03:52:30

by Calvin Owens

Subject: [PATCH 4/6] Add refcount to fw_event_work struct

The fw_event_work struct is concurrently referenced at shutdown, so
add a refcount to protect it.

Signed-off-by: Calvin Owens <[email protected]>
---
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 9645055..611b34d 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -176,9 +176,37 @@ struct fw_event_work {
u8 VP_ID;
u8 ignore;
u16 event;
+ struct kref refcount;
char event_data[0] __aligned(4);
};

+static void fw_event_work_free(struct kref *r)
+{
+ kfree(container_of(r, struct fw_event_work, refcount));
+}
+
+static void fw_event_work_get(struct fw_event_work *fw_work)
+{
+ kref_get(&fw_work->refcount);
+}
+
+static void fw_event_work_put(struct fw_event_work *fw_work)
+{
+ kref_put(&fw_work->refcount, fw_event_work_free);
+}
+
+static struct fw_event_work *alloc_fw_event_work(int len)
+{
+ struct fw_event_work *fw_event;
+
+ fw_event = kzalloc(sizeof(*fw_event) + len, GFP_ATOMIC);
+ if (!fw_event)
+ return NULL;
+
+ kref_init(&fw_event->refcount);
+ return fw_event;
+}
+
/* raid transport support */
static struct raid_template *mpt2sas_raid_template;

--
1.8.1

2015-06-09 03:52:38

by Calvin Owens

Subject: [PATCH 5/6] Refactor code to use new fw_event refcount

This refactors the fw_event code to use the new refcount.

Signed-off-by: Calvin Owens <[email protected]>
---
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 611b34d..8d8c814 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -2863,6 +2863,7 @@ _scsih_fw_event_add(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work *fw_event)
return;

spin_lock_irqsave(&ioc->fw_event_lock, flags);
+ fw_event_work_get(fw_event);
list_add_tail(&fw_event->list, &ioc->fw_event_list);
INIT_DELAYED_WORK(&fw_event->delayed_work, _firmware_event_work);
queue_delayed_work(ioc->firmware_event_thread,
@@ -2887,12 +2888,13 @@ _scsih_fw_event_free(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work
unsigned long flags;

spin_lock_irqsave(&ioc->fw_event_lock, flags);
- list_del(&fw_event->list);
- kfree(fw_event);
+ if (!list_empty(&fw_event->list))
+ list_del_init(&fw_event->list);
+
+ fw_event_work_put(fw_event);
spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
}

-
/**
* _scsih_error_recovery_delete_devices - remove devices not responding
* @ioc: per adapter object
@@ -2907,13 +2909,14 @@ _scsih_error_recovery_delete_devices(struct MPT2SAS_ADAPTER *ioc)
if (ioc->is_driver_loading)
return;

- fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+ fw_event = alloc_fw_event_work(0);
if (!fw_event)
return;

fw_event->event = MPT2SAS_REMOVE_UNRESPONDING_DEVICES;
fw_event->ioc = ioc;
_scsih_fw_event_add(ioc, fw_event);
+ fw_event_work_put(fw_event);
}

/**
@@ -2927,12 +2930,13 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
{
struct fw_event_work *fw_event;

- fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+ fw_event = alloc_fw_event_work(0);
if (!fw_event)
return;
fw_event->event = MPT2SAS_PORT_ENABLE_COMPLETE;
fw_event->ioc = ioc;
_scsih_fw_event_add(ioc, fw_event);
+ fw_event_work_put(fw_event);
}

/**
@@ -4439,13 +4443,14 @@ _scsih_send_event_to_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
{
struct fw_event_work *fw_event;

- fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+ fw_event = alloc_fw_event_work(0);
if (!fw_event)
return;
fw_event->event = MPT2SAS_TURN_ON_PFA_LED;
fw_event->device_handle = handle;
fw_event->ioc = ioc;
_scsih_fw_event_add(ioc, fw_event);
+ fw_event_work_put(fw_event);
}

/**
@@ -7740,7 +7745,7 @@ mpt2sas_scsih_event_callback(struct MPT2SAS_ADAPTER *ioc, u8 msix_index,
}

sz = le16_to_cpu(mpi_reply->EventDataLength) * 4;
- fw_event = kzalloc(sizeof(*fw_event) + sz, GFP_ATOMIC);
+ fw_event = alloc_fw_event_work(sz);
if (!fw_event) {
printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
ioc->name, __FILE__, __LINE__, __func__);
@@ -7753,6 +7758,7 @@ mpt2sas_scsih_event_callback(struct MPT2SAS_ADAPTER *ioc, u8 msix_index,
fw_event->VP_ID = mpi_reply->VP_ID;
fw_event->event = event;
_scsih_fw_event_add(ioc, fw_event);
+ fw_event_work_put(fw_event);
return;
}

--
1.8.1

2015-06-09 03:52:06

by Calvin Owens

Subject: [PATCH 6/6] Fix unsafe fw_event_list usage

Since the fw_event deletes itself from the list, cleanup_queue() can
walk onto garbage pointers or walk off into freed memory.

This refactors the code in _scsih_fw_event_cleanup_queue() to not
iterate over the fw_event_list without a lock.

Signed-off-by: Calvin Owens <[email protected]>
---
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 8d8c814..f504e28 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -2939,6 +2939,23 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
fw_event_work_put(fw_event);
}

+static struct fw_event_work *dequeue_next_fw_event(struct MPT2SAS_ADAPTER *ioc)
+{
+ unsigned long flags;
+ struct fw_event_work *fw_event = NULL;
+
+ spin_lock_irqsave(&ioc->fw_event_lock, flags);
+ if (!list_empty(&ioc->fw_event_list)) {
+ fw_event = list_first_entry(&ioc->fw_event_list,
+ struct fw_event_work, list);
+ list_del_init(&fw_event->list);
+ fw_event_work_get(fw_event);
+ }
+ spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
+
+ return fw_event;
+}
+
/**
* _scsih_fw_event_cleanup_queue - cleanup event queue
* @ioc: per adapter object
@@ -2951,17 +2968,18 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
static void
_scsih_fw_event_cleanup_queue(struct MPT2SAS_ADAPTER *ioc)
{
- struct fw_event_work *fw_event, *next;
+ struct fw_event_work *fw_event;

if (list_empty(&ioc->fw_event_list) ||
!ioc->firmware_event_thread || in_interrupt())
return;

- list_for_each_entry_safe(fw_event, next, &ioc->fw_event_list, list) {
+ while ((fw_event = dequeue_next_fw_event(ioc))) {
if (cancel_delayed_work_sync(&fw_event->delayed_work)) {
_scsih_fw_event_free(ioc, fw_event);
continue;
}
+ fw_event_work_put(fw_event);
}
}

--
1.8.1

2015-07-02 19:23:04

by Jens Axboe

Subject: Re: [PATCH 0/6] Fixes for memory corruption in mpt2sas

On 05/14/2015 09:41 PM, Calvin Owens wrote:
> Hello all,
>
> This patchset attempts to address problems we've been having with
> panics due to memory corruption from the mpt2sas driver.
>
> I will provide a similar set of fixes for mpt3sas, since we see
> similar issues there as well. "Porting" this to mpt3sas will be
> trivial since the part of the driver I'm touching is nearly identical
> between the two, so I thought it would be simpler to review a patch
> against mpt2sas alone at first.
>
> I've tested this for a few days on a big storage box that seemed to be
> very susceptible to the panics, and so far it seems to have eliminated
> them.

Guys, can someone outside of FB please review this? We're hitting random
memory corruptions without these fixes.

--
Jens Axboe

2015-07-02 20:15:20

by Bart Van Assche

Subject: Re: [RESEND][PATCH 0/6] Fixes for memory corruption in mpt2sas

On 06/08/2015 08:50 PM, Calvin Owens wrote:
> This patchset attempts to address problems we've been having with
> panics due to memory corruption from the mpt2sas driver.
>
> I will provide a similar set of fixes for mpt3sas, since we see
> similar issues there as well. "Porting" this to mpt3sas will be
> trivial since the part of the driver I'm touching is nearly identical
> between the two, so I thought it would be simpler to review a patch
> against mpt2sas alone at first.
>
> I've tested this on a handful of large storage boxes over the past few
> weeks, so far it seems to have completely eliminated the memory
> corruption panics.

If you have to repost this series please convert
BUG_ON(!spin_is_locked(&ioc->sas_device_lock)); into
lockdep_is_held(...). Otherwise, for the whole series:

Reviewed-by: Bart Van Assche <[email protected]>

2015-07-03 15:24:10

by Christoph Hellwig

Subject: Re: [PATCH 1/6] Add refcount to sas_device struct

On Mon, Jun 08, 2015 at 08:50:51PM -0700, Calvin Owens wrote:
> These objects can be referenced concurrently throughout the driver, we
> need a way to make sure threads can't delete them out from under each
> other.
>
> Signed-off-by: Calvin Owens <[email protected]>

This doesn't make sense without users of the refcount, and should be
squashed into the patch actually using the refcounting.

2015-07-03 15:38:12

by Christoph Hellwig

Subject: Re: [PATCH 2/6] Refactor code to use new sas_device refcount

>
> +struct _sas_device *
> +mpt2sas_scsih_sas_device_get_by_sas_address_nolock(struct MPT2SAS_ADAPTER *ioc,
> + u64 sas_address)

Any chance to use a shorter name for this function? E.g.
__mpt2sas_get_sdev_by_addr ?

> +{
> + struct _sas_device *sas_device;
> +
> + BUG_ON(!spin_is_locked(&ioc->sas_device_lock));

This will blow on UP builds. Please use assert_spin_locked or
lockdep_assert_held instead. And don't ask me which of the two,
that's a mystery I don't understand myself either.
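
A toy user-space model of the UP concern, offered as a sketch rather than
kernel code. It assumes (worth checking against the tree in question) that
on uniprocessor builds without spinlock debugging the lock carries no state
and spin_is_locked() evaluates to a constant 0, so a
BUG_ON(!spin_is_locked(...)) check fires even when the caller did take the
lock. All toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock: only the "SMP" query below actually looks at its state. */
struct toy_lock {
	bool held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	l->held = true;
}

/* SMP model: the query reflects the real lock state. */
static bool toy_is_locked_smp(const struct toy_lock *l)
{
	return l->held;
}

/* UP model (assumption): the lock compiles away, the query is constant. */
static bool toy_is_locked_up(const struct toy_lock *l)
{
	(void)l;
	return false;
}

/* True when a BUG_ON(!is_locked(lock)) style check would fire. */
static bool bug_on_would_fire(bool (*is_locked)(const struct toy_lock *),
			      const struct toy_lock *l)
{
	return !is_locked(l);
}
```

With the lock taken, the SMP query keeps the check quiet while the UP query
trips it, which is why an assertion helper that degrades to a no-op is the
safer choice here.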

> struct _sas_device *
> -mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> +mpt2sas_scsih_sas_device_get_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> u64 sas_address)
> {

> +static struct _sas_device *
> +_scsih_sas_device_get_by_handle_nolock(struct MPT2SAS_ADAPTER *ioc, u16 handle)

> static struct _sas_device *
> -_scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> +_scsih_sas_device_get_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)

Same comments about the function names as above.

> + struct _sas_device *sas_device;
> +
> + BUG_ON(!spin_is_locked(&ioc->sas_device_lock));

Same comment about the right assert helpers as above.

> @@ -594,9 +634,15 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> if (!sas_device)
> return;
>
> + /*
> + * The lock serializes access to the list, but we still need to verify
> + * that nobody removed the entry while we were waiting on the lock.
> + */
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + if (!list_empty(&sas_device->list)) {
> + list_del_init(&sas_device->list);
> + sas_device_put(sas_device);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

This looks odd to me. Normally you'd have the lock from the list
iteration that finds the device. From looking at the code it seems
like this is only called from probe failure paths, though. It seems like
for this case the device simply shouldn't be added until the probe
succeeds and this function should go away?
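
For reference, the check in the quoted hunk relies on list_del_init()
leaving the node pointing at itself, so list_empty() on the node tells a
late caller that somebody else already unlinked it. A minimal user-space
sketch of that idiom follows. The list helpers are simplified stand-ins for
the kernel ones, struct dev and remove_once() are hypothetical, and the
caller is assumed to hold whatever lock protects the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink the node and make it self-pointing, so list_empty(n) is true. */
static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

struct dev {
	struct list_head list;
	int refs;		/* plain int stands in for a kref */
};

/* Returns true only for the caller that actually unlinked the device. */
static bool remove_once(struct dev *d)
{
	if (list_empty(&d->list))
		return false;	/* already removed by someone else */
	list_del_init(&d->list);
	d->refs--;		/* drop the reference the list held */
	return true;
}
```

A second call is harmless: it sees the self-pointing node and neither
touches the list nor drops another reference.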

> @@ -1208,12 +1256,15 @@ _scsih_change_queue_depth(struct scsi_device *sdev, int qdepth)
> goto not_sata;
> if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))
> goto not_sata;
> +
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
> sas_device_priv_data->sas_target->sas_address);
> - if (sas_device && sas_device->device_info &
> - MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
> + if (sas_device && sas_device->device_info
> + & MPI2_SAS_DEVICE_INFO_SATA_DEVICE) {
> max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
> + sas_device_put(sas_device);
> + }

Please store a pointer to the sas_device in struct scsi_target ->hostdata
in _scsih_target_alloc and avoid the need for this and other runtime
lookups where we have a scsi_device or scsi_target structure available.
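
One way to read this suggestion, sketched in user space under stated
assumptions: the per-target private data takes its own counted reference to
the device when the target is allocated, later accesses just follow the
pointer, and the reference is dropped when the target is destroyed. All
toy_* names are hypothetical, and the plain int counter stands in for a
kref.

```c
#include <assert.h>
#include <stddef.h>

struct toy_sas_device {
	int refs;
	int slot;
	int released;	/* set where the real code would kfree() */
};

struct toy_target_priv {
	struct toy_sas_device *sas_device;	/* cached, counted pointer */
};

static void toy_get(struct toy_sas_device *d)
{
	d->refs++;
}

static void toy_put(struct toy_sas_device *d)
{
	if (--d->refs == 0)
		d->released = 1;
}

/* Target setup: look the device up once and keep a reference to it. */
static void toy_target_alloc(struct toy_target_priv *priv,
			     struct toy_sas_device *dev)
{
	toy_get(dev);
	priv->sas_device = dev;
}

/* Later accesses need no list walk and no list lock. */
static int toy_target_slot(const struct toy_target_priv *priv)
{
	return priv->sas_device ? priv->sas_device->slot : -1;
}

/* Target teardown: give the cached reference back. */
static void toy_target_destroy(struct toy_target_priv *priv)
{
	if (priv->sas_device) {
		toy_put(priv->sas_device);
		priv->sas_device = NULL;
	}
}
```

The cached reference keeps the object alive even after the device list has
dropped its own reference, so the target never dereferences freed memory.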

> @@ -1324,13 +1377,15 @@ _scsih_target_destroy(struct scsi_target *starget)
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> rphy = dev_to_rphy(starget->dev.parent);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
> rphy->identify.sas_address);
> if (sas_device && (sas_device->starget == starget) &&
> (sas_device->id == starget->id) &&
> (sas_device->channel == starget->channel))
> sas_device->starget = NULL;
>
> + if (sas_device)
> + sas_device_put(sas_device);
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

.. like this one.

> out:
> @@ -1386,7 +1441,7 @@ _scsih_slave_alloc(struct scsi_device *sdev)
>
> if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
> sas_target_priv_data->sas_address);
> if (sas_device && (sas_device->starget == NULL)) {
> sdev_printk(KERN_INFO, sdev,

.. or this one ..

> @@ -1428,10 +1487,13 @@ _scsih_slave_destroy(struct scsi_device *sdev)
>
> if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
> sas_target_priv_data->sas_address);
> if (sas_device && !sas_target_priv_data->num_luns)
> sas_device->starget = NULL;
> +
> + if (sas_device)
> + sas_device_put(sas_device);
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

.. and this, and many more.

2015-07-03 15:39:18

by Christoph Hellwig

Subject: Re: [PATCH 4/6] Add refcount to fw_event_work struct

On Mon, Jun 08, 2015 at 08:50:54PM -0700, Calvin Owens wrote:
> The fw_event_work struct is concurrently referenced at shutdown, so
> add a refcount to protect it.

Same comment here - a refcount that isn't used isn't useful, please fold
into the next patch.

2015-07-03 16:00:53

by Christoph Hellwig

Subject: Re: [PATCH 5/6] Refactor code to use new fw_event refcount

On Mon, Jun 08, 2015 at 08:50:55PM -0700, Calvin Owens wrote:
> This refactors the fw_event code to use the new refcount.

I spent some time looking over this code because it's so convoluted.
In general I think code should either embed one work_struct (and it
really doesn't seem to need a delayed work here!) or if needed a list
and not both like this one. But it's probably too much work to sort
all this out, so let's go with your version.

2015-07-03 16:02:59

by Christoph Hellwig

Subject: Re: [PATCH 6/6] Fix unsafe fw_event_list usage

On Mon, Jun 08, 2015 at 08:50:56PM -0700, Calvin Owens wrote:
> Since the fw_event deletes itself from the list, cleanup_queue() can
> walk onto garbage pointers or walk off into freed memory.
>
> This refactors the code in _scsih_fw_event_cleanup_queue() to not
> iterate over the fw_event_list without a lock.

I think this really should be folded into the previous one; without the
fixes in this one the other refcounting changes don't make a whole lot
of sense.

> +static struct fw_event_work *dequeue_next_fw_event(struct MPT2SAS_ADAPTER *ioc)
> +{
> + unsigned long flags;
> + struct fw_event_work *fw_event = NULL;
> +
> + spin_lock_irqsave(&ioc->fw_event_lock, flags);
> + if (!list_empty(&ioc->fw_event_list)) {
> + fw_event = list_first_entry(&ioc->fw_event_list,
> + struct fw_event_work, list);
> + list_del_init(&fw_event->list);
> + fw_event_work_get(fw_event);
> + }
> + spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
> +
> + return fw_event;

Shouldn't we have a reference for each item on the list that gets
transferred to whoever removes it from the list?

Additionally _firmware_event_work should call dequeue_next_fw_event
first in the function so that item is off the list before we process
it, and can then just drop the reference once it's done.
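
The ownership scheme described here can be sketched in user space as
follows. This is an illustration under assumptions, not the driver's code:
the queue is a simple FIFO, a C11 atomic_flag spin loop stands in for
fw_event_lock, a plain int stands in for the kref, and every name is
hypothetical. The list holds one reference per queued item, and dequeueing
hands that reference to the caller without taking a new one.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

struct event {
	struct event *next;
	int refs;
	int id;
};

struct queue {
	atomic_flag lock;
	struct event *head, *tail;
	int freed;		/* demo counter: events released so far */
};

static void q_lock(struct queue *q)
{
	while (atomic_flag_test_and_set(&q->lock))
		;		/* spin */
}

static void q_unlock(struct queue *q)
{
	atomic_flag_clear(&q->lock);
}

static void event_put(struct queue *q, struct event *ev)
{
	if (--ev->refs == 0) {
		free(ev);
		q->freed++;
	}
}

/* Allocate an event and queue it; the list takes its own reference. */
static int enqueue(struct queue *q, int id)
{
	struct event *ev = calloc(1, sizeof(*ev));

	if (!ev)
		return -1;
	ev->refs = 1;		/* creator's reference */
	ev->id = id;

	q_lock(q);
	ev->refs++;		/* the list's reference */
	if (q->tail)
		q->tail->next = ev;
	else
		q->head = ev;
	q->tail = ev;
	q_unlock(q);

	event_put(q, ev);	/* creator is done with it */
	return 0;
}

/* Unlink the first event; the caller inherits the list's reference. */
static struct event *dequeue(struct queue *q)
{
	struct event *ev;

	q_lock(q);
	ev = q->head;
	if (ev) {
		q->head = ev->next;
		if (!q->head)
			q->tail = NULL;
		ev->next = NULL;
	}
	q_unlock(q);
	return ev;
}

/* Worker or cleanup path: take an item off the list, handle it, drop it. */
static int drain(struct queue *q)
{
	struct event *ev;
	int handled = 0;

	while ((ev = dequeue(q))) {
		handled++;	/* processing would happen here */
		event_put(q, ev);
	}
	return handled;
}
```

Because an item is always off the list before it is processed, nothing
walking the list can reach an object that is about to be freed.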

2015-07-03 16:03:48

by Christoph Hellwig

Subject: Re: [PATCH 3/6] Fix unsafe sas_device_list usage

On Mon, Jun 08, 2015 at 08:50:53PM -0700, Calvin Owens wrote:
> We cannot iterate over the list without holding a lock for the entire
> duration, or we risk corrupting random memory if items are added or
> deleted as we iterate.
>
> This refactors code such that it always holds the lock when iterating
> on or accessing the sas_device_list.

This looks sensible but should probably be folded into the previous
patch.

2015-07-12 04:14:32

by Calvin Owens

Subject: Re: [PATCH 5/6] Refactor code to use new fw_event refcount

Thanks for this, I'm sending a v2 shortly.

On Friday 07/03 at 09:00 -0700, Christoph Hellwig wrote:
> On Mon, Jun 08, 2015 at 08:50:55PM -0700, Calvin Owens wrote:
> > This refactors the fw_event code to use the new refcount.
>
> I spent some time looking over this code because it's so convoluted.
> In general I think code should either embed one work_struct (and it
> really doesn't seem to need a delayed work here!) or if needed a list
> and not both like this one. But it's probably too much work to sort
> all this out, so let's go with your version.

Yeah, I tried to get rid of fw_event_list altogether, since I think what
cleanup_queue() does could be simplified to calling flush_workqueue().

The problem is _scsih_check_topo_delete_events(), which looks at the
list and sometimes marks fw_events as "ignored" so they aren't executed.
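
A rough userspace sketch of why a driver-visible list is still needed for
this: a later event has to be able to reach events that are queued but not
yet running and flag them, which a plain flush of the workqueue cannot do.
All names here (struct pending_event, mark_ignored, run_event) are
hypothetical, the array stands in for fw_event_list, locking is left out,
and the actual matching rules of _scsih_check_topo_delete_events() are not
modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued firmware event. */
struct pending_event {
	unsigned short handle;	/* device the event refers to */
	bool ignore;		/* set when a later event supersedes this one */
	bool executed;		/* set once the worker has run the event */
};

/*
 * Flag every still-pending event that refers to `handle`.
 * In a real driver this walk would have to hold the list lock.
 * Returns the number of events that were flagged.
 */
static size_t mark_ignored(struct pending_event *q, size_t n,
			   unsigned short handle)
{
	size_t marked = 0;

	assert(q != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (q[i].handle == handle && !q[i].executed) {
			q[i].ignore = true;
			marked++;
		}
	}
	return marked;
}

/* Worker side: skip anything that was flagged while it waited. */
static bool run_event(struct pending_event *ev)
{
	if (ev->ignore)
		return false;
	ev->executed = true;
	return true;
}
```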

2015-07-12 04:16:03

by Calvin Owens

Subject: Re: [PATCH 2/6] Refactor code to use new sas_device refcount

On Friday 07/03 at 08:38 -0700, Christoph Hellwig wrote:
> >
> > +struct _sas_device *
> > +mpt2sas_scsih_sas_device_get_by_sas_address_nolock(struct MPT2SAS_ADAPTER *ioc,
> > + u64 sas_address)
>
> Any chance to use a shorter name for this function? E.g.
> __mpt2sas_get_sdev_by_addr ?

Will do.

> > +{
> > + struct _sas_device *sas_device;
> > +
> > + BUG_ON(!spin_is_locked(&ioc->sas_device_lock));
>
> This will blow up on UP builds. Please use assert_spin_locked or
> lockdep_assert_held instead. And don't ask me which of the two,
> that's a mystery I don't understand myself either.

Will do.
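
The locked/unlocked pairing under discussion can be modeled in userspace
roughly as follows. This is only a sketch under assumptions: struct
sketch_lock is a single-threaded stand-in that merely records whether the
lock is held, the fixed-size array replaces the two sas_device lists, and
the function names are hypothetical. The "_locked" variant asserts that the
caller holds the lock and returns the device with an extra reference; the
wrapper is for callers that do not hold the lock yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Single-threaded stand-in for a spinlock; it only tracks "held or not". */
struct sketch_lock {
	bool held;
};

static void sketch_lock_acquire(struct sketch_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void sketch_lock_release(struct sketch_lock *l)
{
	assert(l->held);
	l->held = false;
}

struct sdev {
	uint64_t addr;
	int refcount;
};

#define MAX_DEVS 4

struct adapter {
	struct sketch_lock lock;
	struct sdev *devs[MAX_DEVS];	/* NULL slots are empty */
};

/* Caller must hold a->lock. Returns the device with an extra reference, or NULL. */
static struct sdev *get_sdev_by_addr_locked(struct adapter *a, uint64_t addr)
{
	assert(a->lock.held);	/* plays the role of assert_spin_locked() */

	for (size_t i = 0; i < MAX_DEVS; i++) {
		if (a->devs[i] && a->devs[i]->addr == addr) {
			a->devs[i]->refcount++;
			return a->devs[i];
		}
	}
	return NULL;
}

/* Wrapper for callers that do not already hold the lock. */
static struct sdev *get_sdev_by_addr(struct adapter *a, uint64_t addr)
{
	struct sdev *d;

	sketch_lock_acquire(&a->lock);
	d = get_sdev_by_addr_locked(a, addr);
	sketch_lock_release(&a->lock);
	return d;
}
```

Because the reference is taken while the lock is still held, the returned
pointer stays valid after the lock is dropped, until the caller puts it.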

> > struct _sas_device *
> > -mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> > +mpt2sas_scsih_sas_device_get_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> > u64 sas_address)
> > {
>
> > +static struct _sas_device *
> > +_scsih_sas_device_get_by_handle_nolock(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>
> > static struct _sas_device *
> > -_scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > +_scsih_sas_device_get_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>
> Same comments about the function names as above.
>
> > + struct _sas_device *sas_device;
> > +
> > + BUG_ON(!spin_is_locked(&ioc->sas_device_lock));
>
> Same comment about the right assert helpers as above.
>
> > @@ -594,9 +634,15 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> > if (!sas_device)
> > return;
> >
> > + /*
> > + * The lock serializes access to the list, but we still need to verify
> > + * that nobody removed the entry while we were waiting on the lock.
> > + */
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - list_del(&sas_device->list);
> > - kfree(sas_device);
> > + if (!list_empty(&sas_device->list)) {
> > + list_del_init(&sas_device->list);
> > + sas_device_put(sas_device);
> > + }
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> This looks odd to me. Normally you'd have the lock from the list
> iteration that finds the device. From looking at the code it seems
> like this is only called from probe failure paths, though. It seems like
> for this case the device simply shouldn't be added until the probe
> succeeds and this function should go away?

There's a horrible maze of dependencies on things being on the lists
while being added that make this impossible: I spent some time trying
to get this to work, but I always end up with no drives. :(

(The path through _scsih_probe_sas() seems not to care)

I was hopeful your suggestion below about putting the sas_device
pointer in ->hostdata would eliminate the need for all the find_by_X()
lookups, but some won't go away.
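
For reference, the list_empty() check in the quoted
_scsih_sas_device_remove() hunk relies on list_del_init() leaving the
removed node pointing at itself, so a second removal attempt sees an
"empty" node and does nothing. A minimal userspace sketch of that
behavior, with a hand-rolled list and hypothetical names (struct device,
device_remove), a plain integer in place of the kref, and the locking
omitted:

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list, modeled on the kernel's list_head. */
struct list_node {
	struct list_node *prev;
	struct list_node *next;
};

static void list_init(struct list_node *n)
{
	n->prev = n;
	n->next = n;
}

/* True for an empty list head, and also for a node that is on no list. */
static bool list_is_empty(const struct list_node *n)
{
	return n->next == n;
}

static void list_add_tail_node(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink the node and re-initialize it so it points at itself again. */
static void list_del_init_node(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

struct device {
	struct list_node list;
	int refcount;	/* the list itself accounts for one of these */
};

/*
 * Remove the device from its list if it is still on one and drop the
 * list's reference. Returns true only for the call that did the removal.
 * A real implementation would hold the list lock around this.
 */
static bool device_remove(struct device *dev)
{
	if (list_is_empty(&dev->list))
		return false;	/* somebody else already removed it */

	list_del_init_node(&dev->list);
	dev->refcount--;
	return true;
}
```

Two racing removers therefore drop the list's reference exactly once,
whichever of them gets the lock first.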

> > @@ -1208,12 +1256,15 @@ _scsih_change_queue_depth(struct scsi_device *sdev, int qdepth)
> > goto not_sata;
> > if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))
> > goto not_sata;
> > +
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > + sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
> > sas_device_priv_data->sas_target->sas_address);
> > - if (sas_device && sas_device->device_info &
> > - MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
> > + if (sas_device && sas_device->device_info
> > + & MPI2_SAS_DEVICE_INFO_SATA_DEVICE) {
> > max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
> > + sas_device_put(sas_device);
> > + }
>
> Please store a pointer to the sas_device in struct scsi_target ->hostdata
> in _scsih_target_alloc and avoid the need for this and other runtime
> lookups where we have a scsi_device or scsi_target structure available.

Will do.
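
One way to picture the ->hostdata suggestion is a per-target structure that
caches the device pointer, so the runtime paths no longer walk a list. The
sketch below is a userspace model with hypothetical names (struct
target_priv, target_attach, target_get_sdev, target_detach). It assumes the
cached pointer holds its own reference for as long as it is stored, which
is a choice made for this sketch and not something settled in the thread.
Freeing at refcount zero is left out.

```c
#include <assert.h>
#include <stddef.h>

struct sdev {
	int refcount;
	int handle;
};

/* Hypothetical per-target private data, as would hang off ->hostdata. */
struct target_priv {
	struct sdev *sdev;	/* cached pointer, holds one reference */
};

static void sdev_get(struct sdev *d)
{
	d->refcount++;
}

static void sdev_put(struct sdev *d)
{
	assert(d->refcount > 0);
	d->refcount--;
}

/* Cache the device in the target; the cache takes its own reference. */
static void target_attach(struct target_priv *t, struct sdev *d)
{
	sdev_get(d);
	t->sdev = d;
}

/* Hand out the cached device with a reference owned by the caller. */
static struct sdev *target_get_sdev(struct target_priv *t)
{
	struct sdev *d = t->sdev;

	if (d)
		sdev_get(d);
	return d;
}

/* Drop the cache's reference when the target goes away. */
static void target_detach(struct target_priv *t)
{
	if (t->sdev) {
		sdev_put(t->sdev);
		t->sdev = NULL;
	}
}
```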

> > @@ -1324,13 +1377,15 @@ _scsih_target_destroy(struct scsi_target *starget)
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > rphy = dev_to_rphy(starget->dev.parent);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > + sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
> > rphy->identify.sas_address);
> > if (sas_device && (sas_device->starget == starget) &&
> > (sas_device->id == starget->id) &&
> > (sas_device->channel == starget->channel))
> > sas_device->starget = NULL;
> >
> > + if (sas_device)
> > + sas_device_put(sas_device);
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> .. like this one.
>
> > out:
> > @@ -1386,7 +1441,7 @@ _scsih_slave_alloc(struct scsi_device *sdev)
> >
> > if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > + sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
> > sas_target_priv_data->sas_address);
> > if (sas_device && (sas_device->starget == NULL)) {
> > sdev_printk(KERN_INFO, sdev,
>
> .. or this one ..
>
> > @@ -1428,10 +1487,13 @@ _scsih_slave_destroy(struct scsi_device *sdev)
> >
> > if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > + sas_device = mpt2sas_scsih_sas_device_get_by_sas_address_nolock(ioc,
> > sas_target_priv_data->sas_address);
> > if (sas_device && !sas_target_priv_data->num_luns)
> > sas_device->starget = NULL;
> > +
> > + if (sas_device)
> > + sas_device_put(sas_device);
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> .. and this, and many more.
>

2015-07-12 04:20:41

by Calvin Owens

Subject: Re: [PATCH 6/6] Fix unsafe fw_event_list usage

On Friday 07/03 at 09:02 -0700, Christoph Hellwig wrote:
> On Mon, Jun 08, 2015 at 08:50:56PM -0700, Calvin Owens wrote:
> > Since the fw_event deletes itself from the list, cleanup_queue() can
> > walk onto garbage pointers or walk off into freed memory.
> >
> > This refactors the code in _scsih_fw_event_cleanup_queue() to not
> > iterate over the fw_event_list without a lock.
>
> I think this really should be folded into the previous one; with the
> fixes in this one, the other refcounting change doesn't make a whole lot
> of sense on its own.
>
> > +static struct fw_event_work *dequeue_next_fw_event(struct MPT2SAS_ADAPTER *ioc)
> > +{
> > + unsigned long flags;
> > + struct fw_event_work *fw_event = NULL;
> > +
> > + spin_lock_irqsave(&ioc->fw_event_lock, flags);
> > + if (!list_empty(&ioc->fw_event_list)) {
> > + fw_event = list_first_entry(&ioc->fw_event_list,
> > + struct fw_event_work, list);
> > + list_del_init(&fw_event->list);
> > + fw_event_work_get(fw_event);
> > + }
> > + spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
> > +
> > + return fw_event;
>
> Shouldn't we have a reference for each item on the list that gets
> transferred to whoever removes it from the list?

Yes, this was a bit weird the way I did it. I redid this in v2, hopefully
it's clearer.

> Additionally _firmware_event_work should call dequeue_next_fw_event
> first in the function so that item is off the list before we process
> it, and can then just drop the reference once it's done.

That works: cleanup_queue() won't wait on some already-running events, but
destroy_workqueue() drains the wq, so we won't run ahead and free things
from under the fw_event when unwinding.

2015-07-12 04:25:46

by Calvin Owens

Subject: [PATCH 0/2 v2] Fixes for memory corruption in mpt2sas

Hello all,

This patchset attempts to address problems we've been having with
panics due to memory corruption from the mpt2sas driver.

Thanks,
Calvin

Patches in this series:
[PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage
[PATCH 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage

Changes since v1:
* Squished patches 1-3 and 4-6 into two patches
* s/BUG_ON(!spin_is_locked/assert_spin_locked/g
* Use more succinct function names
* Store a pointer to the sas_device object in ->hostdata to eliminate
the need for several lookups on the lists.
* Remove the fw_event from fw_event_list at the start of
_firmware_event_work()
* Explicitly separate fw_event_list removal from fw_event freeing

Total diffstat:

drivers/scsi/mpt2sas/mpt2sas_base.h | 22 +-
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 535 +++++++++++++++++++++----------
drivers/scsi/mpt2sas/mpt2sas_transport.c | 12 +-
3 files changed, 396 insertions(+), 173 deletions(-)

Diff showing changes v1 => v2:
http://jcalvinowens.github.io/stuff/mpt2sas-patchset-v1v2.patch

2015-07-12 04:26:16

by Calvin Owens

Subject: [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage

These objects can be referenced concurrently throughout the driver, so we
need a way to make sure threads can't delete them out from under each
other. This patch adds the refcount, and refactors the code to use it.

Additionally, we cannot iterate over the sas_device_list without
holding the lock, or we risk corrupting random memory if items are
added or deleted as we iterate. This patch refactors _scsih_probe_sas()
to use the sas_device_list in a safe way.

Cc: Christoph Hellwig <[email protected]>
Cc: Bart Van Assche <[email protected]>
Signed-off-by: Calvin Owens <[email protected]>
---
drivers/scsi/mpt2sas/mpt2sas_base.h | 22 +-
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 434 ++++++++++++++++++++-----------
drivers/scsi/mpt2sas/mpt2sas_transport.c | 12 +-
3 files changed, 315 insertions(+), 153 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
index caff8d1..78f41ac 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -238,6 +238,7 @@
* @flags: MPT_TARGET_FLAGS_XXX flags
* @deleted: target flaged for deletion
* @tm_busy: target is busy with TM request.
+ * @sdev: The sas_device associated with this target
*/
struct MPT2SAS_TARGET {
struct scsi_target *starget;
@@ -248,6 +249,7 @@ struct MPT2SAS_TARGET {
u32 flags;
u8 deleted;
u8 tm_busy;
+ struct _sas_device *sdev;
};


@@ -376,8 +378,24 @@ struct _sas_device {
u8 phy;
u8 responding;
u8 pfa_led_on;
+ struct kref refcount;
};

+static inline void sas_device_get(struct _sas_device *s)
+{
+ kref_get(&s->refcount);
+}
+
+static inline void sas_device_free(struct kref *r)
+{
+ kfree(container_of(r, struct _sas_device, refcount));
+}
+
+static inline void sas_device_put(struct _sas_device *s)
+{
+ kref_put(&s->refcount, sas_device_free);
+}
+
/**
* struct _raid_device - raid volume link list
* @list: sas device list
@@ -1095,7 +1113,9 @@ struct _sas_node *mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
u16 handle);
struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct MPT2SAS_ADAPTER
*ioc, u64 sas_address);
-struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
+struct _sas_device *mpt2sas_get_sdev_by_addr(
+ struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
+struct _sas_device *__mpt2sas_get_sdev_by_addr(
struct MPT2SAS_ADAPTER *ioc, u64 sas_address);

void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 3f26147..fad80ce 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -526,8 +526,43 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
}
}

+struct _sas_device *
+__mpt2sas_get_sdev_from_target(struct MPT2SAS_TARGET *tgt_priv)
+{
+ struct _sas_device *ret;
+
+ ret = tgt_priv->sdev;
+ if (ret)
+ sas_device_get(ret);
+
+ return ret;
+}
+
+struct _sas_device *
+__mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
+ u64 sas_address)
+{
+ struct _sas_device *sas_device;
+
+ assert_spin_locked(&ioc->sas_device_lock);
+
+ list_for_each_entry(sas_device, &ioc->sas_device_list, list)
+ if (sas_device->sas_address == sas_address)
+ goto found_device;
+
+ list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
+ if (sas_device->sas_address == sas_address)
+ goto found_device;
+
+ return NULL;
+
+found_device:
+ sas_device_get(sas_device);
+ return sas_device;
+}
+
/**
- * mpt2sas_scsih_sas_device_find_by_sas_address - sas device search
+ * mpt2sas_get_sdev_by_addr - sas device search
* @ioc: per adapter object
* @sas_address: sas address
* Context: Calling function should acquire ioc->sas_device_lock
@@ -536,24 +571,44 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
* object.
*/
struct _sas_device *
-mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
+mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
u64 sas_address)
{
struct _sas_device *sas_device;
+ unsigned long flags;
+
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
+ sas_address);
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+ return sas_device;
+}
+
+static struct _sas_device *
+__mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
+{
+ struct _sas_device *sas_device;
+
+ assert_spin_locked(&ioc->sas_device_lock);

list_for_each_entry(sas_device, &ioc->sas_device_list, list)
- if (sas_device->sas_address == sas_address)
- return sas_device;
+ if (sas_device->handle == handle)
+ goto found_device;

list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
- if (sas_device->sas_address == sas_address)
- return sas_device;
+ if (sas_device->handle == handle)
+ goto found_device;

return NULL;
+
+found_device:
+ sas_device_get(sas_device);
+ return sas_device;
}

/**
- * _scsih_sas_device_find_by_handle - sas device search
+ * mpt2sas_get_sdev_by_handle - sas device search
* @ioc: per adapter object
* @handle: sas device handle (assigned by firmware)
* Context: Calling function should acquire ioc->sas_device_lock
@@ -562,19 +617,16 @@ mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
* object.
*/
static struct _sas_device *
-_scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
+mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
{
struct _sas_device *sas_device;
+ unsigned long flags;

- list_for_each_entry(sas_device, &ioc->sas_device_list, list)
- if (sas_device->handle == handle)
- return sas_device;
-
- list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
- if (sas_device->handle == handle)
- return sas_device;
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

- return NULL;
+ return sas_device;
}

/**
@@ -583,7 +635,7 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
* @sas_device: the sas_device object
* Context: This function will acquire ioc->sas_device_lock.
*
- * Removing object and freeing associated memory from the ioc->sas_device_list.
+ * If sas_device is on the list, remove it and decrement its reference count.
*/
static void
_scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
@@ -594,9 +646,15 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
if (!sas_device)
return;

+ /*
+ * The lock serializes access to the list, but we still need to verify
+ * that nobody removed the entry while we were waiting on the lock.
+ */
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- list_del(&sas_device->list);
- kfree(sas_device);
+ if (!list_empty(&sas_device->list)) {
+ list_del_init(&sas_device->list);
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}

@@ -620,6 +678,7 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
sas_device->handle, (unsigned long long)sas_device->sas_address));

spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ sas_device_get(sas_device);
list_add_tail(&sas_device->list, &ioc->sas_device_list);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

@@ -659,6 +718,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
sas_device->handle, (unsigned long long)sas_device->sas_address));

spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ sas_device_get(sas_device);
list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
_scsih_determine_boot_device(ioc, sas_device, 0);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -1208,12 +1268,14 @@ _scsih_change_queue_depth(struct scsi_device *sdev, int qdepth)
goto not_sata;
if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))
goto not_sata;
+
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- sas_device_priv_data->sas_target->sas_address);
- if (sas_device && sas_device->device_info &
- MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
+ sas_device = __mpt2sas_get_sdev_from_target(sas_target_priv_data);
+ if (sas_device && sas_device->device_info
+ & MPI2_SAS_DEVICE_INFO_SATA_DEVICE) {
max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

not_sata:
@@ -1271,18 +1333,21 @@ _scsih_target_alloc(struct scsi_target *starget)
/* sas/sata devices */
spin_lock_irqsave(&ioc->sas_device_lock, flags);
rphy = dev_to_rphy(starget->dev.parent);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
rphy->identify.sas_address);

if (sas_device) {
sas_target_priv_data->handle = sas_device->handle;
sas_target_priv_data->sas_address = sas_device->sas_address;
+ sas_target_priv_data->sdev = sas_device;
sas_device->starget = starget;
sas_device->id = starget->id;
sas_device->channel = starget->channel;
if (test_bit(sas_device->handle, ioc->pd_handles))
sas_target_priv_data->flags |=
MPT_TARGET_FLAGS_RAID_COMPONENT;
+
+ sas_device_put(sas_device);
}
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

@@ -1324,13 +1389,14 @@ _scsih_target_destroy(struct scsi_target *starget)

spin_lock_irqsave(&ioc->sas_device_lock, flags);
rphy = dev_to_rphy(starget->dev.parent);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- rphy->identify.sas_address);
+ sas_device = __mpt2sas_get_sdev_from_target(sas_target_priv_data);
if (sas_device && (sas_device->starget == starget) &&
(sas_device->id == starget->id) &&
(sas_device->channel == starget->channel))
sas_device->starget = NULL;

+ if (sas_device)
+ sas_device_put(sas_device);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

out:
@@ -1386,7 +1452,7 @@ _scsih_slave_alloc(struct scsi_device *sdev)

if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
sas_target_priv_data->sas_address);
if (sas_device && (sas_device->starget == NULL)) {
sdev_printk(KERN_INFO, sdev,
@@ -1394,6 +1460,10 @@ _scsih_slave_alloc(struct scsi_device *sdev)
__func__, __LINE__);
sas_device->starget = starget;
}
+
+ if (sas_device)
+ sas_device_put(sas_device);
+
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}

@@ -1428,10 +1498,12 @@ _scsih_slave_destroy(struct scsi_device *sdev)

if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- sas_target_priv_data->sas_address);
+ sas_device = __mpt2sas_get_sdev_from_target(sas_target_priv_data);
if (sas_device && !sas_target_priv_data->num_luns)
sas_device->starget = NULL;
+
+ if (sas_device)
+ sas_device_put(sas_device);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}

@@ -2078,7 +2150,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
}

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
sas_device_priv_data->sas_target->sas_address);
if (!sas_device) {
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -2116,13 +2188,14 @@ _scsih_slave_configure(struct scsi_device *sdev)
if (!ssp_target)
_scsih_display_sata_capabilities(ioc, handle, sdev);

-
_scsih_change_queue_depth(sdev, qdepth);

if (ssp_target) {
sas_read_port_mode_page(sdev);
_scsih_enable_tlr(ioc, sdev);
}
+
+ sas_device_put(sas_device);
return 0;
}

@@ -2509,8 +2582,7 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
device_str, (unsigned long long)priv_target->sas_address);
} else {
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- priv_target->sas_address);
+ sas_device = __mpt2sas_get_sdev_from_target(priv_target);
if (sas_device) {
if (priv_target->flags &
MPT_TARGET_FLAGS_RAID_COMPONENT) {
@@ -2529,6 +2601,8 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
"enclosure_logical_id(0x%016llx), slot(%d)\n",
(unsigned long long)sas_device->enclosure_logical_id,
sas_device->slot);
+
+ sas_device_put(sas_device);
}
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}
@@ -2604,12 +2678,12 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
{
struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
struct MPT2SAS_DEVICE *sas_device_priv_data;
- struct _sas_device *sas_device;
- unsigned long flags;
+ struct _sas_device *sas_device = NULL;
u16 handle;
int r;

struct scsi_target *starget = scmd->device->sdev_target;
+ struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;

starget_printk(KERN_INFO, starget, "attempting device reset! "
"scmd(%p)\n", scmd);
@@ -2629,12 +2703,9 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
handle = 0;
if (sas_device_priv_data->sas_target->flags &
MPT_TARGET_FLAGS_RAID_COMPONENT) {
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc,
- sas_device_priv_data->sas_target->handle);
+ sas_device = __mpt2sas_get_sdev_from_target(target_priv_data);
if (sas_device)
handle = sas_device->volume_handle;
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
} else
handle = sas_device_priv_data->sas_target->handle;

@@ -2651,6 +2722,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
out:
sdev_printk(KERN_INFO, scmd->device, "device reset: %s scmd(%p)\n",
((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
+
+ if (sas_device)
+ sas_device_put(sas_device);
+
return r;
}

@@ -2665,11 +2740,11 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
{
struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
struct MPT2SAS_DEVICE *sas_device_priv_data;
- struct _sas_device *sas_device;
- unsigned long flags;
+ struct _sas_device *sas_device = NULL;
u16 handle;
int r;
struct scsi_target *starget = scmd->device->sdev_target;
+ struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;

starget_printk(KERN_INFO, starget, "attempting target reset! "
"scmd(%p)\n", scmd);
@@ -2689,12 +2764,9 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
handle = 0;
if (sas_device_priv_data->sas_target->flags &
MPT_TARGET_FLAGS_RAID_COMPONENT) {
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc,
- sas_device_priv_data->sas_target->handle);
+ sas_device = __mpt2sas_get_sdev_from_target(target_priv_data);
if (sas_device)
handle = sas_device->volume_handle;
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
} else
handle = sas_device_priv_data->sas_target->handle;

@@ -2711,6 +2783,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
out:
starget_printk(KERN_INFO, starget, "target reset: %s scmd(%p)\n",
((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
+
+ if (sas_device)
+ sas_device_put(sas_device);
+
return r;
}

@@ -3002,15 +3078,15 @@ _scsih_block_io_to_children_attached_to_ex(struct MPT2SAS_ADAPTER *ioc,

list_for_each_entry(mpt2sas_port,
&sas_expander->sas_port_list, port_list) {
- if (mpt2sas_port->remote_identify.device_type ==
- SAS_END_DEVICE) {
+ if (mpt2sas_port->remote_identify.device_type == SAS_END_DEVICE) {
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device =
- mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- mpt2sas_port->remote_identify.sas_address);
- if (sas_device)
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
+ mpt2sas_port->remote_identify.sas_address);
+ if (sas_device) {
set_bit(sas_device->handle,
- ioc->blocking_handles);
+ ioc->blocking_handles);
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}
}
@@ -3080,7 +3156,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
{
Mpi2SCSITaskManagementRequest_t *mpi_request;
u16 smid;
- struct _sas_device *sas_device;
+ struct _sas_device *sas_device = NULL;
struct MPT2SAS_TARGET *sas_target_priv_data = NULL;
u64 sas_address = 0;
unsigned long flags;
@@ -3110,7 +3186,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
return;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
if (sas_device && sas_device->starget &&
sas_device->starget->hostdata) {
sas_target_priv_data = sas_device->starget->hostdata;
@@ -3131,14 +3207,14 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
if (!smid) {
delayed_tr = kzalloc(sizeof(*delayed_tr), GFP_ATOMIC);
if (!delayed_tr)
- return;
+ goto out;
INIT_LIST_HEAD(&delayed_tr->list);
delayed_tr->handle = handle;
list_add_tail(&delayed_tr->list, &ioc->delayed_tr_list);
dewtprintk(ioc, printk(MPT2SAS_INFO_FMT
"DELAYED:tr:handle(0x%04x), (open)\n",
ioc->name, handle));
- return;
+ goto out;
}

dewtprintk(ioc, printk(MPT2SAS_INFO_FMT "tr_send:handle(0x%04x), "
@@ -3150,6 +3226,9 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
mpi_request->DevHandle = cpu_to_le16(handle);
mpi_request->TaskType = MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET;
mpt2sas_base_put_smid_hi_priority(ioc, smid);
+out:
+ if (sas_device)
+ sas_device_put(sas_device);
}


@@ -4068,7 +4147,6 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
char *desc_scsi_state = ioc->tmp_string;
u32 log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
struct _sas_device *sas_device = NULL;
- unsigned long flags;
struct scsi_target *starget = scmd->device->sdev_target;
struct MPT2SAS_TARGET *priv_target = starget->hostdata;
char *device_str = NULL;
@@ -4200,9 +4278,7 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
printk(MPT2SAS_WARN_FMT "\t%s wwid(0x%016llx)\n", ioc->name,
device_str, (unsigned long long)priv_target->sas_address);
} else {
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- priv_target->sas_address);
+ sas_device = __mpt2sas_get_sdev_from_target(priv_target);
if (sas_device) {
printk(MPT2SAS_WARN_FMT "\tsas_address(0x%016llx), "
"phy(%d)\n", ioc->name, sas_device->sas_address,
@@ -4211,8 +4287,9 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
"\tenclosure_logical_id(0x%016llx), slot(%d)\n",
ioc->name, sas_device->enclosure_logical_id,
sas_device->slot);
+
+ sas_device_put(sas_device);
}
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}

printk(MPT2SAS_WARN_FMT "\thandle(0x%04x), ioc_status(%s)(0x%04x), "
@@ -4259,7 +4336,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
Mpi2SepRequest_t mpi_request;
struct _sas_device *sas_device;

- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
if (!sas_device)
return;

@@ -4274,7 +4351,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
&mpi_request)) != 0) {
printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n", ioc->name,
__FILE__, __LINE__, __func__);
- return;
+ goto out;
}
sas_device->pfa_led_on = 1;

@@ -4284,8 +4361,10 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
"enclosure_processor: ioc_status (0x%04x), loginfo(0x%08x)\n",
ioc->name, le16_to_cpu(mpi_reply.IOCStatus),
le32_to_cpu(mpi_reply.IOCLogInfo)));
- return;
+ goto out;
}
+out:
+ sas_device_put(sas_device);
}

/**
@@ -4370,19 +4449,17 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)

/* only handle non-raid devices */
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
if (!sas_device) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
+ goto out_unlock;
}
starget = sas_device->starget;
sas_target_priv_data = starget->hostdata;

if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_RAID_COMPONENT) ||
- ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
- }
+ ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)))
+ goto out_unlock;
+
starget_printk(KERN_WARNING, starget, "predicted fault\n");
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

@@ -4396,7 +4473,7 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
if (!event_reply) {
printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
ioc->name, __FILE__, __LINE__, __func__);
- return;
+ goto out;
}

event_reply->Function = MPI2_FUNCTION_EVENT_NOTIFICATION;
@@ -4413,6 +4490,14 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
event_data->SASAddress = cpu_to_le64(sas_target_priv_data->sas_address);
mpt2sas_ctl_add_to_event_log(ioc, event_reply);
kfree(event_reply);
+out:
+ if (sas_device)
+ sas_device_put(sas_device);
+ return;
+
+out_unlock:
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+ goto out;
}

/**
@@ -5148,14 +5233,13 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)

spin_lock_irqsave(&ioc->sas_device_lock, flags);
sas_address = le64_to_cpu(sas_device_pg0.SASAddress);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
sas_address);

if (!sas_device) {
printk(MPT2SAS_ERR_FMT "device is not present "
"handle(0x%04x), no sas_device!!!\n", ioc->name, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
+ goto out_unlock;
}

if (unlikely(sas_device->handle != handle)) {
@@ -5172,19 +5256,22 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
MPI2_SAS_DEVICE0_FLAGS_DEVICE_PRESENT)) {
printk(MPT2SAS_ERR_FMT "device is not present "
"handle(0x%04x), flags!!!\n", ioc->name, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
+ goto out_unlock;
}

/* check if there were any issues with discovery */
if (_scsih_check_access_status(ioc, sas_address, handle,
- sas_device_pg0.AccessStatus)) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
- }
+ sas_device_pg0.AccessStatus))
+ goto out_unlock;
+
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
_scsih_ublock_io_device(ioc, sas_address);
+ return;

+out_unlock:
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+ if (sas_device)
+ sas_device_put(sas_device);
}

/**
@@ -5208,7 +5295,6 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
u32 ioc_status;
__le64 sas_address;
u32 device_info;
- unsigned long flags;

if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
@@ -5250,14 +5336,13 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
return -1;
}

-
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_get_sdev_by_addr(ioc,
sas_address);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

- if (sas_device)
+ if (sas_device) {
+ sas_device_put(sas_device);
return 0;
+ }

sas_device = kzalloc(sizeof(struct _sas_device),
GFP_KERNEL);
@@ -5267,6 +5352,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
return -1;
}

+ kref_init(&sas_device->refcount);
sas_device->handle = handle;
if (_scsih_get_sas_address(ioc, le16_to_cpu
(sas_device_pg0.ParentDevHandle),
@@ -5344,7 +5430,6 @@ _scsih_remove_device(struct MPT2SAS_ADAPTER *ioc,
"handle(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
sas_device->handle, (unsigned long long)
sas_device->sas_address));
- kfree(sas_device);
}
/**
* _scsih_device_remove_by_handle - removing device object by handle
@@ -5363,12 +5448,17 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
return;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
- if (sas_device)
- list_del(&sas_device->list);
+ sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
+ if (sas_device) {
+ list_del_init(&sas_device->list);
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+
+ if (sas_device) {
_scsih_remove_device(ioc, sas_device);
+ sas_device_put(sas_device);
+ }
}

/**
@@ -5389,13 +5479,17 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
return;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- sas_address);
- if (sas_device)
- list_del(&sas_device->list);
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc, sas_address);
+ if (sas_device) {
+ list_del_init(&sas_device->list);
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+
+ if (sas_device) {
_scsih_remove_device(ioc, sas_device);
+ sas_device_put(sas_device);
+ }
}
#ifdef CONFIG_SCSI_MPT2SAS_LOGGING
/**
@@ -5716,26 +5810,28 @@ _scsih_sas_device_status_change_event(struct MPT2SAS_ADAPTER *ioc,

spin_lock_irqsave(&ioc->sas_device_lock, flags);
sas_address = le64_to_cpu(event_data->SASAddress);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
sas_address);

- if (!sas_device || !sas_device->starget) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
- }
+ if (!sas_device || !sas_device->starget)
+ goto out;

target_priv_data = sas_device->starget->hostdata;
- if (!target_priv_data) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
- }
+ if (!target_priv_data)
+ goto out;

if (event_data->ReasonCode ==
MPI2_EVENT_SAS_DEV_STAT_RC_INTERNAL_DEVICE_RESET)
target_priv_data->tm_busy = 1;
else
target_priv_data->tm_busy = 0;
+
+out:
+ if (sas_device)
+ sas_device_put(sas_device);
+
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
}

#ifdef CONFIG_SCSI_MPT2SAS_LOGGING
@@ -6123,7 +6219,7 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
u16 handle = le16_to_cpu(element->PhysDiskDevHandle);

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
if (sas_device) {
sas_device->volume_handle = 0;
sas_device->volume_wwid = 0;
@@ -6142,6 +6238,8 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
/* exposing raid component */
if (starget)
starget_for_each_device(starget, NULL, _scsih_reprobe_lun);
+
+ sas_device_put(sas_device);
}

/**
@@ -6170,7 +6268,7 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
&volume_wwid);

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
if (sas_device) {
set_bit(handle, ioc->pd_handles);
if (sas_device->starget && sas_device->starget->hostdata) {
@@ -6189,6 +6287,8 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
/* hiding raid component */
if (starget)
starget_for_each_device(starget, (void *)1, _scsih_reprobe_lun);
+
+ sas_device_put(sas_device);
}

/**
@@ -6221,7 +6321,6 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
Mpi2EventIrConfigElement_t *element)
{
struct _sas_device *sas_device;
- unsigned long flags;
u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
Mpi2ConfigReply_t mpi_reply;
Mpi2SasDevicePage0_t sas_device_pg0;
@@ -6231,11 +6330,11 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,

set_bit(handle, ioc->pd_handles);

- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+ sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
+ if (sas_device) {
+ sas_device_put(sas_device);
return;
+ }

if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
@@ -6509,7 +6608,6 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
u16 handle, parent_handle;
u32 state;
struct _sas_device *sas_device;
- unsigned long flags;
Mpi2ConfigReply_t mpi_reply;
Mpi2SasDevicePage0_t sas_device_pg0;
u32 ioc_status;
@@ -6542,12 +6640,11 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
if (!ioc->is_warpdrive)
set_bit(handle, ioc->pd_handles);

- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-
- if (sas_device)
+ sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
+ if (sas_device) {
+ sas_device_put(sas_device);
return;
+ }

if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
&sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
@@ -7015,6 +7112,7 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
struct _raid_device *raid_device, *raid_device_next;
struct list_head tmp_list;
unsigned long flags;
+ LIST_HEAD(head);

printk(MPT2SAS_INFO_FMT "removing unresponding devices: start\n",
ioc->name);
@@ -7022,14 +7120,29 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
/* removing unresponding end devices */
printk(MPT2SAS_INFO_FMT "removing unresponding devices: end-devices\n",
ioc->name);
+
+ /*
+ * Iterate, pulling off devices marked as non-responding. We become the
+ * owner for the reference the list had on any object we prune.
+ */
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
list_for_each_entry_safe(sas_device, sas_device_next,
- &ioc->sas_device_list, list) {
+ &ioc->sas_device_list, list) {
if (!sas_device->responding)
- mpt2sas_device_remove_by_sas_address(ioc,
- sas_device->sas_address);
+ list_move_tail(&sas_device->list, &head);
else
sas_device->responding = 0;
}
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+ /*
+ * Now, uninitialize and remove the unresponding devices we pruned.
+ */
+ list_for_each_entry_safe(sas_device, sas_device_next, &head, list) {
+ _scsih_remove_device(ioc, sas_device);
+ list_del_init(&sas_device->list);
+ sas_device_put(sas_device);
+ }

/* removing unresponding volumes */
if (ioc->ir_firmware) {
@@ -7179,11 +7292,11 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
}
phys_disk_num = pd_pg0.PhysDiskNum;
handle = le16_to_cpu(pd_pg0.DevHandle);
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+ sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
+ if (sas_device) {
+ sas_device_put(sas_device);
continue;
+ }
if (mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
&sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
handle) != 0)
@@ -7302,12 +7415,12 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
if (!(_scsih_is_end_device(
le32_to_cpu(sas_device_pg0.DeviceInfo))))
continue;
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_get_sdev_by_addr(ioc,
le64_to_cpu(sas_device_pg0.SASAddress));
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+ if (sas_device) {
+ sas_device_put(sas_device);
continue;
+ }
parent_handle = le16_to_cpu(sas_device_pg0.ParentDevHandle);
if (!_scsih_get_sas_address(ioc, parent_handle, &sas_address)) {
printk(MPT2SAS_INFO_FMT "\tBEFORE adding end device: "
@@ -7966,6 +8079,37 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
}
}

+static struct _sas_device *dequeue_next_sas_device(struct MPT2SAS_ADAPTER *ioc)
+{
+ struct _sas_device *sas_device = NULL;
+ unsigned long flags;
+
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ if (!list_empty(&ioc->sas_device_init_list)) {
+ sas_device = list_first_entry(&ioc->sas_device_init_list,
+ struct _sas_device, list);
+ list_del_init(&sas_device->list);
+ }
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+ /*
+ * If an item was dequeued, the caller now owns the reference that was
+ * previously owned by the list
+ */
+ return sas_device;
+}
+
+static void sas_device_make_active(struct MPT2SAS_ADAPTER *ioc,
+ struct _sas_device *sas_device)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ sas_device_get(sas_device);
+ list_add_tail(&sas_device->list, &ioc->sas_device_list);
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+}
+
/**
* _scsih_probe_sas - reporting sas devices to sas transport
* @ioc: per adapter object
@@ -7975,34 +8119,28 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
static void
_scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
{
- struct _sas_device *sas_device, *next;
- unsigned long flags;
-
- /* SAS Device List */
- list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
- list) {
+ struct _sas_device *sas_device;

- if (ioc->hide_drives)
- continue;
+ if (ioc->hide_drives)
+ return;

+ while ((sas_device = dequeue_next_sas_device(ioc))) {
if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
- sas_device->sas_address_parent)) {
- list_del(&sas_device->list);
- kfree(sas_device);
+ sas_device->sas_address_parent)) {
+ sas_device_put(sas_device);
continue;
} else if (!sas_device->starget) {
if (!ioc->is_driver_loading) {
mpt2sas_transport_port_remove(ioc,
- sas_device->sas_address,
- sas_device->sas_address_parent);
- list_del(&sas_device->list);
- kfree(sas_device);
+ sas_device->sas_address,
+ sas_device->sas_address_parent);
+ sas_device_put(sas_device);
continue;
}
}
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- list_move_tail(&sas_device->list, &ioc->sas_device_list);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+ sas_device_make_active(ioc, sas_device);
+ sas_device_put(sas_device);
}
}

diff --git a/drivers/scsi/mpt2sas/mpt2sas_transport.c b/drivers/scsi/mpt2sas/mpt2sas_transport.c
index ff2500a..af86800 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_transport.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_transport.c
@@ -1323,15 +1323,17 @@ _transport_get_enclosure_identifier(struct sas_rphy *rphy, u64 *identifier)
int rc;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
rphy->identify.sas_address);
if (sas_device) {
*identifier = sas_device->enclosure_logical_id;
rc = 0;
+ sas_device_put(sas_device);
} else {
*identifier = 0;
rc = -ENXIO;
}
+
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
return rc;
}
@@ -1351,12 +1353,14 @@ _transport_get_bay_identifier(struct sas_rphy *rphy)
int rc;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
rphy->identify.sas_address);
- if (sas_device)
+ if (sas_device) {
rc = sas_device->slot;
- else
+ sas_device_put(sas_device);
+ } else {
rc = -ENXIO;
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
return rc;
}
--
1.8.1
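
The ownership hand-off that dequeue_next_sas_device() and sas_device_make_active() perform in the patch above can be modelled in userspace. The following is only a sketch: struct dev, struct queue, enqueue(), dequeue(), probe_all() and demo_probe() are hypothetical names, a plain int stands in for struct kref, and the locking around each list operation is left out entirely. It shows only the rule that a list owns one reference on every entry, and that whoever dequeues an entry inherits that reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names exist in the driver. */
struct dev {
	int refs;              /* models struct kref */
	struct dev *next;      /* models the list linkage */
};

struct queue {
	struct dev *head, *tail;
};

static int freed;              /* counts released objects, for the demo only */

static void dev_get(struct dev *d)
{
	d->refs++;
}

static void dev_put(struct dev *d)
{
	assert(d->refs > 0);
	if (--d->refs == 0) {
		free(d);
		freed++;
	}
}

/* The queue takes its own reference on every entry it holds. */
static void enqueue(struct queue *q, struct dev *d)
{
	dev_get(d);
	d->next = NULL;
	if (q->tail)
		q->tail->next = d;
	else
		q->head = d;
	q->tail = d;
}

/* The caller inherits the reference that the queue held. */
static struct dev *dequeue(struct queue *q)
{
	struct dev *d = q->head;

	if (!d)
		return NULL;
	q->head = d->next;
	if (!q->head)
		q->tail = NULL;
	d->next = NULL;
	return d;
}

/* Move every entry from init to active; returns the number moved. */
static int probe_all(struct queue *init, struct queue *active)
{
	struct dev *d;
	int moved = 0;

	while ((d = dequeue(init))) {
		enqueue(active, d);    /* active takes a new reference */
		dev_put(d);            /* drop the one inherited from init */
		moved++;
	}
	return moved;
}

static int demo_probe(void)
{
	struct queue init = { NULL, NULL }, active = { NULL, NULL };
	struct dev *d;
	int i, moved;

	for (i = 0; i < 3; i++) {
		d = calloc(1, sizeof(*d));
		if (!d)
			abort();
		d->refs = 1;           /* creator's reference, as kref_init() gives */
		enqueue(&init, d);
		dev_put(d);            /* creator is done; the queue keeps it alive */
	}

	moved = probe_all(&init, &active);

	/* Each entry should now be held only by the active queue. */
	for (d = active.head; d; d = d->next)
		assert(d->refs == 1);

	while ((d = dequeue(&active)))
		dev_put(d);            /* last reference: the object is freed */
	return moved;
}
```

Calling demo_probe() from a test harness exercises the whole cycle. Because no entry is ever reachable from a list without that list holding a reference, the loop never touches freed memory, which is the property the original list_for_each_entry_safe() walk lacked.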

2015-07-12 04:26:09

by Calvin Owens

Subject: [PATCH 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage

The fw_event_work struct is concurrently referenced at shutdown, so
add a refcount to protect it, and refactor the code to use it.

Additionally, refactor _scsih_fw_event_cleanup_queue() such that it
no longer iterates over the list without holding the lock, since
_firmware_event_work() concurrently deletes items from the list.

Cc: Christoph Hellwig <[email protected]>
Cc: Bart Van Assche <[email protected]>
Signed-off-by: Calvin Owens <[email protected]>
---
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 101 ++++++++++++++++++++++++++++-------
1 file changed, 81 insertions(+), 20 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index fad80ce..8b267af 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -176,9 +176,37 @@ struct fw_event_work {
u8 VP_ID;
u8 ignore;
u16 event;
+ struct kref refcount;
char event_data[0] __aligned(4);
};

+static void fw_event_work_free(struct kref *r)
+{
+ kfree(container_of(r, struct fw_event_work, refcount));
+}
+
+static void fw_event_work_get(struct fw_event_work *fw_work)
+{
+ kref_get(&fw_work->refcount);
+}
+
+static void fw_event_work_put(struct fw_event_work *fw_work)
+{
+ kref_put(&fw_work->refcount, fw_event_work_free);
+}
+
+static struct fw_event_work *alloc_fw_event_work(int len)
+{
+ struct fw_event_work *fw_event;
+
+ fw_event = kzalloc(sizeof(*fw_event) + len, GFP_ATOMIC);
+ if (!fw_event)
+ return NULL;
+
+ kref_init(&fw_event->refcount);
+ return fw_event;
+}
+
/* raid transport support */
static struct raid_template *mpt2sas_raid_template;

@@ -2844,36 +2872,39 @@ _scsih_fw_event_add(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work *fw_event)
return;

spin_lock_irqsave(&ioc->fw_event_lock, flags);
+ fw_event_work_get(fw_event);
list_add_tail(&fw_event->list, &ioc->fw_event_list);
INIT_DELAYED_WORK(&fw_event->delayed_work, _firmware_event_work);
+ fw_event_work_get(fw_event);
queue_delayed_work(ioc->firmware_event_thread,
&fw_event->delayed_work, 0);
spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
}

/**
- * _scsih_fw_event_free - delete fw_event
+ * _scsih_fw_event_del_from_list - delete fw_event from the list
* @ioc: per adapter object
* @fw_event: object describing the event
* Context: This function will acquire ioc->fw_event_lock.
*
- * This removes firmware event object from link list, frees associated memory.
+ * If the fw_event is on the fw_event_list, remove it and do a put.
*
* Return nothing.
*/
static void
-_scsih_fw_event_free(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work
+_scsih_fw_event_del_from_list(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work
*fw_event)
{
unsigned long flags;

spin_lock_irqsave(&ioc->fw_event_lock, flags);
- list_del(&fw_event->list);
- kfree(fw_event);
+ if (!list_empty(&fw_event->list)) {
+ list_del_init(&fw_event->list);
+ fw_event_work_put(fw_event);
+ }
spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
}

-
/**
* _scsih_error_recovery_delete_devices - remove devices not responding
* @ioc: per adapter object
@@ -2888,13 +2919,14 @@ _scsih_error_recovery_delete_devices(struct MPT2SAS_ADAPTER *ioc)
if (ioc->is_driver_loading)
return;

- fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+ fw_event = alloc_fw_event_work(0);
if (!fw_event)
return;

fw_event->event = MPT2SAS_REMOVE_UNRESPONDING_DEVICES;
fw_event->ioc = ioc;
_scsih_fw_event_add(ioc, fw_event);
+ fw_event_work_put(fw_event);
}

/**
@@ -2908,12 +2940,29 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
{
struct fw_event_work *fw_event;

- fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+ fw_event = alloc_fw_event_work(0);
if (!fw_event)
return;
fw_event->event = MPT2SAS_PORT_ENABLE_COMPLETE;
fw_event->ioc = ioc;
_scsih_fw_event_add(ioc, fw_event);
+ fw_event_work_put(fw_event);
+}
+
+static struct fw_event_work *dequeue_next_fw_event(struct MPT2SAS_ADAPTER *ioc)
+{
+ unsigned long flags;
+ struct fw_event_work *fw_event = NULL;
+
+ spin_lock_irqsave(&ioc->fw_event_lock, flags);
+ if (!list_empty(&ioc->fw_event_list)) {
+ fw_event = list_first_entry(&ioc->fw_event_list,
+ struct fw_event_work, list);
+ list_del_init(&fw_event->list);
+ }
+ spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
+
+ return fw_event;
}

/**
@@ -2928,17 +2977,25 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
static void
_scsih_fw_event_cleanup_queue(struct MPT2SAS_ADAPTER *ioc)
{
- struct fw_event_work *fw_event, *next;
+ struct fw_event_work *fw_event;

if (list_empty(&ioc->fw_event_list) ||
!ioc->firmware_event_thread || in_interrupt())
return;

- list_for_each_entry_safe(fw_event, next, &ioc->fw_event_list, list) {
- if (cancel_delayed_work_sync(&fw_event->delayed_work)) {
- _scsih_fw_event_free(ioc, fw_event);
- continue;
- }
+ while ((fw_event = dequeue_next_fw_event(ioc))) {
+ /*
+ * Wait on the fw_event to complete. If this returns 1, then
+ * the event was never executed, and we need a put for the
+ * reference the delayed_work had on the fw_event.
+ *
+ * If it did execute, we wait for it to finish, and the put will
+ * happen from _firmware_event_work()
+ */
+ if (cancel_delayed_work_sync(&fw_event->delayed_work))
+ fw_event_work_put(fw_event);
+
+ fw_event_work_put(fw_event);
}
}

@@ -4419,13 +4476,14 @@ _scsih_send_event_to_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
{
struct fw_event_work *fw_event;

- fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+ fw_event = alloc_fw_event_work(0);
if (!fw_event)
return;
fw_event->event = MPT2SAS_TURN_ON_PFA_LED;
fw_event->device_handle = handle;
fw_event->ioc = ioc;
_scsih_fw_event_add(ioc, fw_event);
+ fw_event_work_put(fw_event);
}

/**
@@ -7523,10 +7581,11 @@ _firmware_event_work(struct work_struct *work)
struct fw_event_work, delayed_work.work);
struct MPT2SAS_ADAPTER *ioc = fw_event->ioc;

+ _scsih_fw_event_del_from_list(ioc, fw_event);
+
/* the queue is being flushed so ignore this event */
- if (ioc->remove_host ||
- ioc->pci_error_recovery) {
- _scsih_fw_event_free(ioc, fw_event);
+ if (ioc->remove_host || ioc->pci_error_recovery) {
+ fw_event_work_put(fw_event);
return;
}

@@ -7582,7 +7641,8 @@ _firmware_event_work(struct work_struct *work)
_scsih_sas_ir_operation_status_event(ioc, fw_event);
break;
}
- _scsih_fw_event_free(ioc, fw_event);
+
+ fw_event_work_put(fw_event);
}

/**
@@ -7720,7 +7780,7 @@ mpt2sas_scsih_event_callback(struct MPT2SAS_ADAPTER *ioc, u8 msix_index,
}

sz = le16_to_cpu(mpi_reply->EventDataLength) * 4;
- fw_event = kzalloc(sizeof(*fw_event) + sz, GFP_ATOMIC);
+ fw_event = alloc_fw_event_work(sz);
if (!fw_event) {
printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
ioc->name, __FILE__, __LINE__, __func__);
@@ -7733,6 +7793,7 @@ mpt2sas_scsih_event_callback(struct MPT2SAS_ADAPTER *ioc, u8 msix_index,
fw_event->VP_ID = mpi_reply->VP_ID;
fw_event->event = event;
_scsih_fw_event_add(ioc, fw_event);
+ fw_event_work_put(fw_event);
return;
}

--
1.8.1
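
The reference accounting for fw_event objects in this patch (one reference for the list, one for the queued work, plus the creator's own) can be traced with a small userspace model. This is a sketch under assumptions: struct event, ev_add(), ev_cancel(), ev_run(), ev_cleanup() and demo_events() are hypothetical, a plain int replaces struct kref, flags replace the list and the workqueue, and nothing here runs concurrently.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model; the names are not taken from the driver. */
struct event {
	int refs;      /* models struct kref */
	int on_list;   /* models membership in fw_event_list */
	int pending;   /* models "delayed work queued but not yet run" */
};

static int freed_events;       /* counts released events, for the demo only */

static void ev_put(struct event *e)
{
	assert(e->refs > 0);
	if (--e->refs == 0) {
		free(e);
		freed_events++;
	}
}

static struct event *ev_alloc(void)
{
	struct event *e = calloc(1, sizeof(*e));

	if (!e)
		abort();
	e->refs = 1;               /* creator's reference */
	return e;
}

/* Adding takes two references: one for the list, one for the work. */
static void ev_add(struct event *e)
{
	e->refs++;
	e->on_list = 1;
	e->refs++;
	e->pending = 1;
}

/* Stand-in for cancelling the work: returns 1 if it had not run yet. */
static int ev_cancel(struct event *e)
{
	int was_pending = e->pending;

	e->pending = 0;
	return was_pending;
}

/* Stand-in for the work function running to completion. */
static void ev_run(struct event *e)
{
	e->pending = 0;
	if (e->on_list) {          /* still listed: unlink, drop the list's ref */
		e->on_list = 0;
		ev_put(e);
	}
	ev_put(e);                 /* drop the work's reference */
}

/* Stand-in for one iteration of the cleanup loop. */
static void ev_cleanup(struct event *e)
{
	e->on_list = 0;            /* dequeued: we now own the list's reference */
	if (ev_cancel(e))
		ev_put(e);         /* work never ran: drop its reference too */
	ev_put(e);                 /* drop the reference inherited from the list */
}

static int demo_events(void)
{
	struct event *a = ev_alloc();
	struct event *b = ev_alloc();

	ev_add(a);
	ev_put(a);                 /* creator's put, as after _scsih_fw_event_add() */
	ev_add(b);
	ev_put(b);

	ev_run(a);                 /* a executes normally and is freed */
	ev_cleanup(b);             /* b is cancelled during cleanup and is freed */
	return freed_events;
}
```

In this model each event is freed exactly once on either path: the count goes 1, 3, 2 after creation and queuing, then reaches zero either inside ev_run() or inside ev_cleanup().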

2015-07-13 06:52:11

by Christoph Hellwig

Subject: Re: [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage

On Sat, Jul 11, 2015 at 09:24:55PM -0700, Calvin Owens wrote:
> These objects can be referenced concurrently throughout the driver, we
> need a way to make sure threads can't delete them out from under each
> other. This patch adds the refcount, and refactors the code to use it.
>
> Additionally, we cannot iterate over the sas_device_list without
> holding the lock, or we risk corrupting random memory if items are
> added or deleted as we iterate. This patch refactors _scsih_probe_sas()
> to use the sas_device_list in a safe way.
>
> Cc: Christoph Hellwig <[email protected]>
> Cc: Bart Van Assche <[email protected]>
> Signed-off-by: Calvin Owens <[email protected]>
> ---
> drivers/scsi/mpt2sas/mpt2sas_base.h | 22 +-
> drivers/scsi/mpt2sas/mpt2sas_scsih.c | 434 ++++++++++++++++++++-----------
> drivers/scsi/mpt2sas/mpt2sas_transport.c | 12 +-
> 3 files changed, 315 insertions(+), 153 deletions(-)
>
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
> index caff8d1..78f41ac 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> @@ -238,6 +238,7 @@
> * @flags: MPT_TARGET_FLAGS_XXX flags
> * @deleted: target flaged for deletion
> * @tm_busy: target is busy with TM request.
> + * @sdev: The sas_device associated with this target
> */
> struct MPT2SAS_TARGET {
> struct scsi_target *starget;
> @@ -248,6 +249,7 @@ struct MPT2SAS_TARGET {
> u32 flags;
> u8 deleted;
> u8 tm_busy;
> + struct _sas_device *sdev;
> };
>
>
> @@ -376,8 +378,24 @@ struct _sas_device {
> u8 phy;
> u8 responding;
> u8 pfa_led_on;
> + struct kref refcount;
> };
>
> +static inline void sas_device_get(struct _sas_device *s)
> +{
> + kref_get(&s->refcount);
> +}
> +
> +static inline void sas_device_free(struct kref *r)
> +{
> + kfree(container_of(r, struct _sas_device, refcount));
> +}
> +
> +static inline void sas_device_put(struct _sas_device *s)
> +{
> + kref_put(&s->refcount, sas_device_free);
> +}
> +
> /**
> * struct _raid_device - raid volume link list
> * @list: sas device list
> @@ -1095,7 +1113,9 @@ struct _sas_node *mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
> u16 handle);
> struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct MPT2SAS_ADAPTER
> *ioc, u64 sas_address);
> -struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
> +struct _sas_device *mpt2sas_get_sdev_by_addr(
> + struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
> +struct _sas_device *__mpt2sas_get_sdev_by_addr(
> struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
>
> void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> index 3f26147..fad80ce 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> @@ -526,8 +526,43 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
> }
> }
>
> +struct _sas_device *
> +__mpt2sas_get_sdev_from_target(struct MPT2SAS_TARGET *tgt_priv)
> +{
> + struct _sas_device *ret;
> +

Does this need a:

assert_spin_locked(&ioc->sas_device_lock);

?

Otherwise this looks sensible to me.
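
For readers unfamiliar with the convention behind this question, the pairing of a lock-asserting helper with a self-locking wrapper (as __mpt2sas_get_sdev_by_addr() and mpt2sas_get_sdev_by_addr() do in the patch) can be sketched in userspace. Everything below is hypothetical: struct fake_lock only records whether it is held, and find_get_locked() / find_get() are illustrative names. Note also that, as quoted, __mpt2sas_get_sdev_from_target() receives only tgt_priv, so an assertion on ioc->sas_device_lock would presumably need some way to reach the adapter; the sketch does not address that.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock that only tracks whether it is currently held. */
struct fake_lock {
	int held;
};

static void fake_lock_acquire(struct fake_lock *l)
{
	assert(!l->held);
	l->held = 1;
}

static void fake_lock_release(struct fake_lock *l)
{
	assert(l->held);
	l->held = 0;
}

struct dev {
	unsigned long long addr;
	int refs;                  /* models struct kref */
};

struct adapter {
	struct fake_lock lock;
	struct dev *devs;
	size_t ndevs;
};

/* Caller must hold a->lock. Returns a referenced device, or NULL. */
static struct dev *find_get_locked(struct adapter *a, unsigned long long addr)
{
	size_t i;

	assert(a->lock.held);      /* plays the role of assert_spin_locked() */

	for (i = 0; i < a->ndevs; i++) {
		if (a->devs[i].addr == addr) {
			a->devs[i].refs++;
			return &a->devs[i];
		}
	}
	return NULL;
}

/* Self-locking wrapper for callers that do not already hold the lock. */
static struct dev *find_get(struct adapter *a, unsigned long long addr)
{
	struct dev *d;

	fake_lock_acquire(&a->lock);
	d = find_get_locked(a, addr);
	fake_lock_release(&a->lock);
	return d;
}

static int demo_lookup(void)
{
	struct dev devs[2] = { { 0x10, 1 }, { 0x20, 1 } };
	struct adapter a = { { 0 }, devs, 2 };
	struct dev *hit = find_get(&a, 0x20);
	struct dev *miss = find_get(&a, 0x30);

	/* A hit returns with one extra reference; a miss returns NULL. */
	return hit != NULL && hit->refs == 2 && miss == NULL;
}
```

Calling find_get_locked() without the lock trips the assertion immediately, which is the kind of early failure such a check is meant to provide.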

2015-07-13 15:05:18

by Joe Lawrence

Subject: Re: [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage

On 07/12/2015 12:24 AM, Calvin Owens wrote:
> These objects can be referenced concurrently throughout the driver, we
> need a way to make sure threads can't delete them out from under each
> other. This patch adds the refcount, and refactors the code to use it.
>
> Additionally, we cannot iterate over the sas_device_list without
> holding the lock, or we risk corrupting random memory if items are
> added or deleted as we iterate. This patch refactors _scsih_probe_sas()
> to use the sas_device_list in a safe way.
>
> Cc: Christoph Hellwig <[email protected]>
> Cc: Bart Van Assche <[email protected]>
> Signed-off-by: Calvin Owens <[email protected]>
> ---
> drivers/scsi/mpt2sas/mpt2sas_base.h | 22 +-
> drivers/scsi/mpt2sas/mpt2sas_scsih.c | 434 ++++++++++++++++++++-----------
> drivers/scsi/mpt2sas/mpt2sas_transport.c | 12 +-
> 3 files changed, 315 insertions(+), 153 deletions(-)

[ ... snip ... ]

> @@ -2078,7 +2150,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
> }
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> sas_device_priv_data->sas_target->sas_address);
> if (!sas_device) {
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> @@ -2116,13 +2188,14 @@ _scsih_slave_configure(struct scsi_device *sdev)
> if (!ssp_target)
> _scsih_display_sata_capabilities(ioc, handle, sdev);
>
> -
> _scsih_change_queue_depth(sdev, qdepth);
>
> if (ssp_target) {
> sas_read_port_mode_page(sdev);
> _scsih_enable_tlr(ioc, sdev);
> }
> +
> + sas_device_put(sas_device);
> return 0;
> }

Hi Calvin,

Any reason why this sas_device_put is placed outside the sas_device
lock? Most other instances in this patch were called just before unlocking.

BTW I attempted testing, but needed to port to mpt3 and ended up with a
driver that didn't boot :( Hopefully I can retry later this week, or
find an older mpt2 box lying around.

-- Joe
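
As background to this question, and without answering it on the author's behalf: in a scheme like the one in the patch, a reference taken while the lock is held keeps the object valid after the lock is dropped, so the matching put does not in itself need the list lock. The sketch below illustrates that general property only. It is a single-threaded userspace model with hypothetical names (struct dev, dev_get(), dev_put(), demo_put_after_unlock()), a C11 atomic in place of struct kref, and comments marking where a lock would be held.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical model; the names are not taken from the driver. */
struct dev {
	atomic_int refs;   /* models struct kref, which is also atomic */
	int on_list;
	int qdepth;
};

static int freed_count;        /* counts released objects, for the demo only */

static void dev_get(struct dev *d)
{
	atomic_fetch_add(&d->refs, 1);
}

static void dev_put(struct dev *d)
{
	/* atomic_fetch_sub() returns the previous value. */
	if (atomic_fetch_sub(&d->refs, 1) == 1) {
		free(d);
		freed_count++;
	}
}

static int demo_put_after_unlock(void)
{
	struct dev *d = calloc(1, sizeof(*d));
	int depth, alive;

	if (!d)
		abort();
	atomic_init(&d->refs, 1);  /* the list's reference */
	d->on_list = 1;
	d->qdepth = 32;

	/* Lock would be held here: look the device up and take a reference. */
	dev_get(d);
	/* Lock would be released here. */

	/* Meanwhile a remover unlinks the device and drops the list's ref. */
	d->on_list = 0;
	dev_put(d);

	/* Our reference keeps the object alive, so this read is still valid. */
	depth = d->qdepth;
	alive = (freed_count == 0);

	dev_put(d);                /* last reference: the object is freed here */
	return alive && depth == 32 && freed_count == 1;
}
```

Whether the put in _scsih_slave_configure() was deliberately placed after the unlock for this reason is not stated in the thread.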

2015-07-16 14:57:46

by Sreekanth Reddy

Subject: Re: [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage

On Sun, Jul 12, 2015 at 9:54 AM, Calvin Owens <[email protected]> wrote:
> These objects can be referenced concurrently throughout the driver, we
> need a way to make sure threads can't delete them out from under each
> other. This patch adds the refcount, and refactors the code to use it.
>
> Additionally, we cannot iterate over the sas_device_list without
> holding the lock, or we risk corrupting random memory if items are
> added or deleted as we iterate. This patch refactors _scsih_probe_sas()
> to use the sas_device_list in a safe way.
>
> Cc: Christoph Hellwig <[email protected]>
> Cc: Bart Van Assche <[email protected]>
> Signed-off-by: Calvin Owens <[email protected]>
> ---
> drivers/scsi/mpt2sas/mpt2sas_base.h | 22 +-
> drivers/scsi/mpt2sas/mpt2sas_scsih.c | 434 ++++++++++++++++++++-----------
> drivers/scsi/mpt2sas/mpt2sas_transport.c | 12 +-
> 3 files changed, 315 insertions(+), 153 deletions(-)
>
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
> index caff8d1..78f41ac 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> @@ -238,6 +238,7 @@
> * @flags: MPT_TARGET_FLAGS_XXX flags
> * @deleted: target flaged for deletion
> * @tm_busy: target is busy with TM request.
> + * @sdev: The sas_device associated with this target
> */
> struct MPT2SAS_TARGET {
> struct scsi_target *starget;
> @@ -248,6 +249,7 @@ struct MPT2SAS_TARGET {
> u32 flags;
> u8 deleted;
> u8 tm_busy;
> + struct _sas_device *sdev;
> };
>
>
> @@ -376,8 +378,24 @@ struct _sas_device {
> u8 phy;
> u8 responding;
> u8 pfa_led_on;
> + struct kref refcount;
> };
>
> +static inline void sas_device_get(struct _sas_device *s)
> +{
> + kref_get(&s->refcount);
> +}
> +
> +static inline void sas_device_free(struct kref *r)
> +{
> + kfree(container_of(r, struct _sas_device, refcount));
> +}
> +
> +static inline void sas_device_put(struct _sas_device *s)
> +{
> + kref_put(&s->refcount, sas_device_free);
> +}
> +
> /**
> * struct _raid_device - raid volume link list
> * @list: sas device list
> @@ -1095,7 +1113,9 @@ struct _sas_node *mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
> u16 handle);
> struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct MPT2SAS_ADAPTER
> *ioc, u64 sas_address);
> -struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
> +struct _sas_device *mpt2sas_get_sdev_by_addr(
> + struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
> +struct _sas_device *__mpt2sas_get_sdev_by_addr(
> struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
>
> void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> index 3f26147..fad80ce 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> @@ -526,8 +526,43 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
> }
> }
>
> +struct _sas_device *
> +__mpt2sas_get_sdev_from_target(struct MPT2SAS_TARGET *tgt_priv)
> +{
> + struct _sas_device *ret;
> +
> + ret = tgt_priv->sdev;
> + if (ret)
> + sas_device_get(ret);
> +
> + return ret;
> +}
> +
> +struct _sas_device *
> +__mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
> + u64 sas_address)
> +{
> + struct _sas_device *sas_device;
> +
> + assert_spin_locked(&ioc->sas_device_lock);
> +
> + list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> + if (sas_device->sas_address == sas_address)
> + goto found_device;
> +
> + list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> + if (sas_device->sas_address == sas_address)
> + goto found_device;
> +
> + return NULL;
> +
> +found_device:
> + sas_device_get(sas_device);
> + return sas_device;
> +}
> +
> /**
> - * mpt2sas_scsih_sas_device_find_by_sas_address - sas device search
> + * mpt2sas_get_sdev_by_addr - sas device search
> * @ioc: per adapter object
> * @sas_address: sas address
> * Context: Calling function should acquire ioc->sas_device_lock
> @@ -536,24 +571,44 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
> * object.
> */
> struct _sas_device *
> -mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> +mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
> u64 sas_address)
> {
> struct _sas_device *sas_device;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&ioc->sas_device_lock, flags);
> + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> + sas_address);
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> + return sas_device;
> +}
> +
> +static struct _sas_device *
> +__mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> +{
> + struct _sas_device *sas_device;
> +
> + assert_spin_locked(&ioc->sas_device_lock);
>
> list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> - if (sas_device->sas_address == sas_address)
> - return sas_device;
> + if (sas_device->handle == handle)
> + goto found_device;
>
> list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> - if (sas_device->sas_address == sas_address)
> - return sas_device;
> + if (sas_device->handle == handle)
> + goto found_device;
>
> return NULL;
> +
> +found_device:
> + sas_device_get(sas_device);
> + return sas_device;
> }
>
> /**
> - * _scsih_sas_device_find_by_handle - sas device search
> + * mpt2sas_get_sdev_by_handle - sas device search
> * @ioc: per adapter object
> * @handle: sas device handle (assigned by firmware)
> * Context: Calling function should acquire ioc->sas_device_lock
> @@ -562,19 +617,16 @@ mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> * object.
> */
> static struct _sas_device *
> -_scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> +mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> {
> struct _sas_device *sas_device;
> + unsigned long flags;
>
> - list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> - if (sas_device->handle == handle)
> - return sas_device;
> -
> - list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> - if (sas_device->handle == handle)
> - return sas_device;
> + spin_lock_irqsave(&ioc->sas_device_lock, flags);
> + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> - return NULL;
> + return sas_device;
> }
>
> /**
> @@ -583,7 +635,7 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> * @sas_device: the sas_device object
> * Context: This function will acquire ioc->sas_device_lock.
> *
> - * Removing object and freeing associated memory from the ioc->sas_device_list.
> + * If sas_device is on the list, remove it and decrement its reference count.
> */
> static void
> _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> @@ -594,9 +646,15 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> if (!sas_device)
> return;
>
> + /*
> + * The lock serializes access to the list, but we still need to verify
> + * that nobody removed the entry while we were waiting on the lock.
> + */
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + if (!list_empty(&sas_device->list)) {
> + list_del_init(&sas_device->list);
> + sas_device_put(sas_device);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
>
> @@ -620,6 +678,7 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
> sas_device->handle, (unsigned long long)sas_device->sas_address));
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> + sas_device_get(sas_device);
> list_add_tail(&sas_device->list, &ioc->sas_device_list);
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> @@ -659,6 +718,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
> sas_device->handle, (unsigned long long)sas_device->sas_address));
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> + sas_device_get(sas_device);
> list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
> _scsih_determine_boot_device(ioc, sas_device, 0);
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> @@ -1208,12 +1268,14 @@ _scsih_change_queue_depth(struct scsi_device *sdev, int qdepth)
> goto not_sata;
> if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))
> goto not_sata;
> +
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> - sas_device_priv_data->sas_target->sas_address);
> - if (sas_device && sas_device->device_info &
> - MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
> + sas_device = __mpt2sas_get_sdev_from_target(sas_target_priv_data);
> + if (sas_device && sas_device->device_info
> + & MPI2_SAS_DEVICE_INFO_SATA_DEVICE) {
> max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
> + sas_device_put(sas_device);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> not_sata:
> @@ -1271,18 +1333,21 @@ _scsih_target_alloc(struct scsi_target *starget)
> /* sas/sata devices */
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> rphy = dev_to_rphy(starget->dev.parent);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> rphy->identify.sas_address);
>
> if (sas_device) {
> sas_target_priv_data->handle = sas_device->handle;
> sas_target_priv_data->sas_address = sas_device->sas_address;
> + sas_target_priv_data->sdev = sas_device;
> sas_device->starget = starget;
> sas_device->id = starget->id;
> sas_device->channel = starget->channel;
> if (test_bit(sas_device->handle, ioc->pd_handles))
> sas_target_priv_data->flags |=
> MPT_TARGET_FLAGS_RAID_COMPONENT;
> +
> + sas_device_put(sas_device);
> }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> @@ -1324,13 +1389,14 @@ _scsih_target_destroy(struct scsi_target *starget)
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> rphy = dev_to_rphy(starget->dev.parent);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> - rphy->identify.sas_address);
> + sas_device = __mpt2sas_get_sdev_from_target(sas_target_priv_data);
> if (sas_device && (sas_device->starget == starget) &&
> (sas_device->id == starget->id) &&
> (sas_device->channel == starget->channel))
> sas_device->starget = NULL;
>
> + if (sas_device)
> + sas_device_put(sas_device);
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> out:
> @@ -1386,7 +1452,7 @@ _scsih_slave_alloc(struct scsi_device *sdev)
>
> if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> sas_target_priv_data->sas_address);
> if (sas_device && (sas_device->starget == NULL)) {
> sdev_printk(KERN_INFO, sdev,
> @@ -1394,6 +1460,10 @@ _scsih_slave_alloc(struct scsi_device *sdev)
> __func__, __LINE__);
> sas_device->starget = starget;
> }
> +
> + if (sas_device)
> + sas_device_put(sas_device);
> +
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
>
> @@ -1428,10 +1498,12 @@ _scsih_slave_destroy(struct scsi_device *sdev)
>
> if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> - sas_target_priv_data->sas_address);
> + sas_device = __mpt2sas_get_sdev_from_target(sas_target_priv_data);
> if (sas_device && !sas_target_priv_data->num_luns)
> sas_device->starget = NULL;
> +
> + if (sas_device)
> + sas_device_put(sas_device);
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
>
> @@ -2078,7 +2150,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
> }
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> sas_device_priv_data->sas_target->sas_address);
> if (!sas_device) {
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> @@ -2116,13 +2188,14 @@ _scsih_slave_configure(struct scsi_device *sdev)
> if (!ssp_target)
> _scsih_display_sata_capabilities(ioc, handle, sdev);
>
> -
> _scsih_change_queue_depth(sdev, qdepth);
>
> if (ssp_target) {
> sas_read_port_mode_page(sdev);
> _scsih_enable_tlr(ioc, sdev);
> }
> +
> + sas_device_put(sas_device);
> return 0;
> }
>
> @@ -2509,8 +2582,7 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
> device_str, (unsigned long long)priv_target->sas_address);
> } else {
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> - priv_target->sas_address);
> + sas_device = __mpt2sas_get_sdev_from_target(priv_target);
> if (sas_device) {
> if (priv_target->flags &
> MPT_TARGET_FLAGS_RAID_COMPONENT) {
> @@ -2529,6 +2601,8 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
> "enclosure_logical_id(0x%016llx), slot(%d)\n",
> (unsigned long long)sas_device->enclosure_logical_id,
> sas_device->slot);
> +
> + sas_device_put(sas_device);
> }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
> @@ -2604,12 +2678,12 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
> {
> struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
> struct MPT2SAS_DEVICE *sas_device_priv_data;
> - struct _sas_device *sas_device;
> - unsigned long flags;
> + struct _sas_device *sas_device = NULL;
> u16 handle;
> int r;
>
> struct scsi_target *starget = scmd->device->sdev_target;
> + struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
>
> starget_printk(KERN_INFO, starget, "attempting device reset! "
> "scmd(%p)\n", scmd);
> @@ -2629,12 +2703,9 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
> handle = 0;
> if (sas_device_priv_data->sas_target->flags &
> MPT_TARGET_FLAGS_RAID_COMPONENT) {
> - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = _scsih_sas_device_find_by_handle(ioc,
> - sas_device_priv_data->sas_target->handle);
> + sas_device = __mpt2sas_get_sdev_from_target(target_priv_data);
> if (sas_device)
> handle = sas_device->volume_handle;
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> } else
> handle = sas_device_priv_data->sas_target->handle;
>
> @@ -2651,6 +2722,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
> out:
> sdev_printk(KERN_INFO, scmd->device, "device reset: %s scmd(%p)\n",
> ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
> +
> + if (sas_device)
> + sas_device_put(sas_device);
> +
> return r;
> }
>
> @@ -2665,11 +2740,11 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
> {
> struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
> struct MPT2SAS_DEVICE *sas_device_priv_data;
> - struct _sas_device *sas_device;
> - unsigned long flags;
> + struct _sas_device *sas_device = NULL;
> u16 handle;
> int r;
> struct scsi_target *starget = scmd->device->sdev_target;
> + struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
>
> starget_printk(KERN_INFO, starget, "attempting target reset! "
> "scmd(%p)\n", scmd);
> @@ -2689,12 +2764,9 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
> handle = 0;
> if (sas_device_priv_data->sas_target->flags &
> MPT_TARGET_FLAGS_RAID_COMPONENT) {
> - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = _scsih_sas_device_find_by_handle(ioc,
> - sas_device_priv_data->sas_target->handle);
> + sas_device = __mpt2sas_get_sdev_from_target(target_priv_data);
> if (sas_device)
> handle = sas_device->volume_handle;
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> } else
> handle = sas_device_priv_data->sas_target->handle;
>
> @@ -2711,6 +2783,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
> out:
> starget_printk(KERN_INFO, starget, "target reset: %s scmd(%p)\n",
> ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
> +
> + if (sas_device)
> + sas_device_put(sas_device);
> +
> return r;
> }
>
> @@ -3002,15 +3078,15 @@ _scsih_block_io_to_children_attached_to_ex(struct MPT2SAS_ADAPTER *ioc,
>
> list_for_each_entry(mpt2sas_port,
> &sas_expander->sas_port_list, port_list) {
> - if (mpt2sas_port->remote_identify.device_type ==
> - SAS_END_DEVICE) {
> + if (mpt2sas_port->remote_identify.device_type == SAS_END_DEVICE) {
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device =
> - mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> - mpt2sas_port->remote_identify.sas_address);
> - if (sas_device)
> + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> + mpt2sas_port->remote_identify.sas_address);
> + if (sas_device) {
> set_bit(sas_device->handle,
> - ioc->blocking_handles);
> + ioc->blocking_handles);
> + sas_device_put(sas_device);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
> }
> @@ -3080,7 +3156,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> {
> Mpi2SCSITaskManagementRequest_t *mpi_request;
> u16 smid;
> - struct _sas_device *sas_device;
> + struct _sas_device *sas_device = NULL;
> struct MPT2SAS_TARGET *sas_target_priv_data = NULL;
> u64 sas_address = 0;
> unsigned long flags;
> @@ -3110,7 +3186,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> return;
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> if (sas_device && sas_device->starget &&
> sas_device->starget->hostdata) {
> sas_target_priv_data = sas_device->starget->hostdata;
> @@ -3131,14 +3207,14 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> if (!smid) {
> delayed_tr = kzalloc(sizeof(*delayed_tr), GFP_ATOMIC);
> if (!delayed_tr)
> - return;
> + goto out;
> INIT_LIST_HEAD(&delayed_tr->list);
> delayed_tr->handle = handle;
> list_add_tail(&delayed_tr->list, &ioc->delayed_tr_list);
> dewtprintk(ioc, printk(MPT2SAS_INFO_FMT
> "DELAYED:tr:handle(0x%04x), (open)\n",
> ioc->name, handle));
> - return;
> + goto out;
> }
>
> dewtprintk(ioc, printk(MPT2SAS_INFO_FMT "tr_send:handle(0x%04x), "
> @@ -3150,6 +3226,9 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> mpi_request->DevHandle = cpu_to_le16(handle);
> mpi_request->TaskType = MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET;
> mpt2sas_base_put_smid_hi_priority(ioc, smid);
> +out:
> + if (sas_device)
> + sas_device_put(sas_device);
> }
>
>
> @@ -4068,7 +4147,6 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
> char *desc_scsi_state = ioc->tmp_string;
> u32 log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
> struct _sas_device *sas_device = NULL;
> - unsigned long flags;
> struct scsi_target *starget = scmd->device->sdev_target;
> struct MPT2SAS_TARGET *priv_target = starget->hostdata;
> char *device_str = NULL;
> @@ -4200,9 +4278,7 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
> printk(MPT2SAS_WARN_FMT "\t%s wwid(0x%016llx)\n", ioc->name,
> device_str, (unsigned long long)priv_target->sas_address);
> } else {
> - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> - priv_target->sas_address);
> + sas_device = __mpt2sas_get_sdev_from_target(priv_target);
> if (sas_device) {
> printk(MPT2SAS_WARN_FMT "\tsas_address(0x%016llx), "
> "phy(%d)\n", ioc->name, sas_device->sas_address,
> @@ -4211,8 +4287,9 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
> "\tenclosure_logical_id(0x%016llx), slot(%d)\n",
> ioc->name, sas_device->enclosure_logical_id,
> sas_device->slot);
> +
> + sas_device_put(sas_device);
> }
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
>
> printk(MPT2SAS_WARN_FMT "\thandle(0x%04x), ioc_status(%s)(0x%04x), "
> @@ -4259,7 +4336,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> Mpi2SepRequest_t mpi_request;
> struct _sas_device *sas_device;
>
> - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> + sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> if (!sas_device)
> return;
>
> @@ -4274,7 +4351,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> &mpi_request)) != 0) {
> printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n", ioc->name,
> __FILE__, __LINE__, __func__);
> - return;
> + goto out;
> }
> sas_device->pfa_led_on = 1;
>
> @@ -4284,8 +4361,10 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> "enclosure_processor: ioc_status (0x%04x), loginfo(0x%08x)\n",
> ioc->name, le16_to_cpu(mpi_reply.IOCStatus),
> le32_to_cpu(mpi_reply.IOCLogInfo)));
> - return;
> + goto out;
> }
> +out:
> + sas_device_put(sas_device);
> }
>
> /**
> @@ -4370,19 +4449,17 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>
> /* only handle non-raid devices */
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> if (!sas_device) {
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - return;
> + goto out_unlock;
> }
> starget = sas_device->starget;
> sas_target_priv_data = starget->hostdata;
>
> if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_RAID_COMPONENT) ||
> - ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))) {
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - return;
> - }
> + ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)))
> + goto out_unlock;
> +
> starget_printk(KERN_WARNING, starget, "predicted fault\n");
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> @@ -4396,7 +4473,7 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> if (!event_reply) {
> printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
> ioc->name, __FILE__, __LINE__, __func__);
> - return;
> + goto out;
> }
>
> event_reply->Function = MPI2_FUNCTION_EVENT_NOTIFICATION;
> @@ -4413,6 +4490,14 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> event_data->SASAddress = cpu_to_le64(sas_target_priv_data->sas_address);
> mpt2sas_ctl_add_to_event_log(ioc, event_reply);
> kfree(event_reply);
> +out:
> + if (sas_device)
> + sas_device_put(sas_device);
> + return;
> +
> +out_unlock:
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> + goto out;
> }
>
> /**
> @@ -5148,14 +5233,13 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> sas_address = le64_to_cpu(sas_device_pg0.SASAddress);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> sas_address);
>
> if (!sas_device) {
> printk(MPT2SAS_ERR_FMT "device is not present "
> "handle(0x%04x), no sas_device!!!\n", ioc->name, handle);
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - return;
> + goto out_unlock;
> }
>
> if (unlikely(sas_device->handle != handle)) {
> @@ -5172,19 +5256,22 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> MPI2_SAS_DEVICE0_FLAGS_DEVICE_PRESENT)) {
> printk(MPT2SAS_ERR_FMT "device is not present "
> "handle(0x%04x), flags!!!\n", ioc->name, handle);
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - return;
> + goto out_unlock;
> }
>
> /* check if there were any issues with discovery */
> if (_scsih_check_access_status(ioc, sas_address, handle,
> - sas_device_pg0.AccessStatus)) {
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - return;
> - }
> + sas_device_pg0.AccessStatus))
> + goto out_unlock;
> +
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> _scsih_ublock_io_device(ioc, sas_address);
> + return;
>
> +out_unlock:
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> + if (sas_device)
> + sas_device_put(sas_device);
> }
>
> /**
> @@ -5208,7 +5295,6 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
> u32 ioc_status;
> __le64 sas_address;
> u32 device_info;
> - unsigned long flags;
>
> if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
> MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
> @@ -5250,14 +5336,13 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
> return -1;
> }
>
> -
> - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = mpt2sas_get_sdev_by_addr(ioc,
> sas_address);
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> - if (sas_device)
> + if (sas_device) {
> + sas_device_put(sas_device);
> return 0;
> + }
>
> sas_device = kzalloc(sizeof(struct _sas_device),
> GFP_KERNEL);
> @@ -5267,6 +5352,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
> return -1;
> }
>
> + kref_init(&sas_device->refcount);
> sas_device->handle = handle;
> if (_scsih_get_sas_address(ioc, le16_to_cpu
> (sas_device_pg0.ParentDevHandle),
> @@ -5344,7 +5430,6 @@ _scsih_remove_device(struct MPT2SAS_ADAPTER *ioc,
> "handle(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
> sas_device->handle, (unsigned long long)
> sas_device->sas_address));
> - kfree(sas_device);
> }
> /**
> * _scsih_device_remove_by_handle - removing device object by handle
> @@ -5363,12 +5448,17 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> return;
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> - if (sas_device)
> - list_del(&sas_device->list);
> + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> + if (sas_device) {
> + list_del_init(&sas_device->list);
> + sas_device_put(sas_device);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - if (sas_device)
> +
> + if (sas_device) {
> _scsih_remove_device(ioc, sas_device);
> + sas_device_put(sas_device);
> + }
> }
>
> /**
> @@ -5389,13 +5479,17 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> return;
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> - sas_address);
> - if (sas_device)
> - list_del(&sas_device->list);
> + sas_device = __mpt2sas_get_sdev_by_addr(ioc, sas_address);
> + if (sas_device) {
> + list_del_init(&sas_device->list);
> + sas_device_put(sas_device);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - if (sas_device)
> +
> + if (sas_device) {
> _scsih_remove_device(ioc, sas_device);
> + sas_device_put(sas_device);
> + }
> }
> #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
> /**
> @@ -5716,26 +5810,28 @@ _scsih_sas_device_status_change_event(struct MPT2SAS_ADAPTER *ioc,
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> sas_address = le64_to_cpu(event_data->SASAddress);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> sas_address);
>
> - if (!sas_device || !sas_device->starget) {
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - return;
> - }
> + if (!sas_device || !sas_device->starget)
> + goto out;
>
> target_priv_data = sas_device->starget->hostdata;
> - if (!target_priv_data) {
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - return;
> - }
> + if (!target_priv_data)
> + goto out;
>
> if (event_data->ReasonCode ==
> MPI2_EVENT_SAS_DEV_STAT_RC_INTERNAL_DEVICE_RESET)
> target_priv_data->tm_busy = 1;
> else
> target_priv_data->tm_busy = 0;
> +
> +out:
> + if (sas_device)
> + sas_device_put(sas_device);
> +
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> }
>
> #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
> @@ -6123,7 +6219,7 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
> u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> if (sas_device) {
> sas_device->volume_handle = 0;
> sas_device->volume_wwid = 0;
> @@ -6142,6 +6238,8 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
> /* exposing raid component */
> if (starget)
> starget_for_each_device(starget, NULL, _scsih_reprobe_lun);
> +
> + sas_device_put(sas_device);
> }
>
> /**
> @@ -6170,7 +6268,7 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
> &volume_wwid);
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> if (sas_device) {
> set_bit(handle, ioc->pd_handles);
> if (sas_device->starget && sas_device->starget->hostdata) {
> @@ -6189,6 +6287,8 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
> /* hiding raid component */
> if (starget)
> starget_for_each_device(starget, (void *)1, _scsih_reprobe_lun);
> +
> + sas_device_put(sas_device);
> }
>
> /**
> @@ -6221,7 +6321,6 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
> Mpi2EventIrConfigElement_t *element)
> {
> struct _sas_device *sas_device;
> - unsigned long flags;
> u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
> Mpi2ConfigReply_t mpi_reply;
> Mpi2SasDevicePage0_t sas_device_pg0;
> @@ -6231,11 +6330,11 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
>
> set_bit(handle, ioc->pd_handles);
>
> - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - if (sas_device)
> + sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> + if (sas_device) {
> + sas_device_put(sas_device);
> return;
> + }
>
> if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
> MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
> @@ -6509,7 +6608,6 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
> u16 handle, parent_handle;
> u32 state;
> struct _sas_device *sas_device;
> - unsigned long flags;
> Mpi2ConfigReply_t mpi_reply;
> Mpi2SasDevicePage0_t sas_device_pg0;
> u32 ioc_status;
> @@ -6542,12 +6640,11 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
> if (!ioc->is_warpdrive)
> set_bit(handle, ioc->pd_handles);
>
> - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -
> - if (sas_device)
> + sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> + if (sas_device) {
> + sas_device_put(sas_device);
> return;
> + }
>
> if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
> &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
> @@ -7015,6 +7112,7 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
> struct _raid_device *raid_device, *raid_device_next;
> struct list_head tmp_list;
> unsigned long flags;
> + LIST_HEAD(head);
>
> printk(MPT2SAS_INFO_FMT "removing unresponding devices: start\n",
> ioc->name);
> @@ -7022,14 +7120,29 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
> /* removing unresponding end devices */
> printk(MPT2SAS_INFO_FMT "removing unresponding devices: end-devices\n",
> ioc->name);
> +
> + /*
> + * Iterate, pulling off devices marked as non-responding. We become the
> + * owner for the reference the list had on any object we prune.
> + */
> + spin_lock_irqsave(&ioc->sas_device_lock, flags);
> list_for_each_entry_safe(sas_device, sas_device_next,
> - &ioc->sas_device_list, list) {
> + &ioc->sas_device_list, list) {
> if (!sas_device->responding)
> - mpt2sas_device_remove_by_sas_address(ioc,
> - sas_device->sas_address);
> + list_move_tail(&sas_device->list, &head);
> else
> sas_device->responding = 0;
> }
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> + /*
> + * Now, uninitialize and remove the unresponding devices we pruned.
> + */
> + list_for_each_entry_safe(sas_device, sas_device_next, &head, list) {
> + _scsih_remove_device(ioc, sas_device);
> + list_del_init(&sas_device->list);
> + sas_device_put(sas_device);
> + }
>
> /* removing unresponding volumes */
> if (ioc->ir_firmware) {
> @@ -7179,11 +7292,11 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
> }
> phys_disk_num = pd_pg0.PhysDiskNum;
> handle = le16_to_cpu(pd_pg0.DevHandle);
> - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - if (sas_device)
> + sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> + if (sas_device) {
> + sas_device_put(sas_device);
> continue;
> + }
> if (mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
> &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
> handle) != 0)
> @@ -7302,12 +7415,12 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
> if (!(_scsih_is_end_device(
> le32_to_cpu(sas_device_pg0.DeviceInfo))))
> continue;
> - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = mpt2sas_get_sdev_by_addr(ioc,
> le64_to_cpu(sas_device_pg0.SASAddress));
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - if (sas_device)
> + if (sas_device) {
> + sas_device_put(sas_device);
> continue;
> + }
> parent_handle = le16_to_cpu(sas_device_pg0.ParentDevHandle);
> if (!_scsih_get_sas_address(ioc, parent_handle, &sas_address)) {
> printk(MPT2SAS_INFO_FMT "\tBEFORE adding end device: "
> @@ -7966,6 +8079,37 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
> }
> }
>
> +static struct _sas_device *dequeue_next_sas_device(struct MPT2SAS_ADAPTER *ioc)
> +{
> + struct _sas_device *sas_device = NULL;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&ioc->sas_device_lock, flags);
> + if (!list_empty(&ioc->sas_device_init_list)) {
> + sas_device = list_first_entry(&ioc->sas_device_init_list,
> + struct _sas_device, list);
> + list_del_init(&sas_device->list);
> + }
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> + /*
> + * If an item was dequeued, the caller now owns the reference that was
> + * previously owned by the list
> + */
> + return sas_device;
> +}
> +
> +static void sas_device_make_active(struct MPT2SAS_ADAPTER *ioc,
> + struct _sas_device *sas_device)
> +{
> + unsigned long flags;
> +
> + spin_lock_irqsave(&ioc->sas_device_lock, flags);
> + sas_device_get(sas_device);
> + list_add_tail(&sas_device->list, &ioc->sas_device_list);
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +}
> +
> /**
> * _scsih_probe_sas - reporting sas devices to sas transport
> * @ioc: per adapter object
> @@ -7975,34 +8119,28 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
> static void
> _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
> {
> - struct _sas_device *sas_device, *next;
> - unsigned long flags;
> -
> - /* SAS Device List */
> - list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
> - list) {
> + struct _sas_device *sas_device;
>
> - if (ioc->hide_drives)
> - continue;
> + if (ioc->hide_drives)
> + return;
>
> + while ((sas_device = dequeue_next_sas_device(ioc))) {

I see an issue here. The sas_device is removed from
sas_device_init_list and then added to the STL by calling
sas_rphy_add(), which in turn invokes the driver's target_alloc,
slave_alloc and slave_configure callback routines. These routines check
whether the device is present in sas_device_init_list; since it is no
longer on that list, the device won't be added.
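
To make the window concrete, here is a minimal userspace sketch. It is an
illustration only, not driver code: the toy_* names, the singly linked
lists, and the SAS address are made up, and locking and refcounting are
left out. It models a lookup that searches both lists, as the by-handle
helper in the quoted patch does, and shows that a lookup made between the
dequeue and the re-add finds nothing.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct _sas_device. */
struct toy_dev {
	uint64_t sas_address;
	struct toy_dev *next;
};

/* Stand-ins for ioc->sas_device_init_list and ioc->sas_device_list. */
struct toy_ioc {
	struct toy_dev *init_list;
	struct toy_dev *active_list;
};

/* Pop the first entry off the init list, like dequeue_next_sas_device(). */
static struct toy_dev *toy_dequeue(struct toy_ioc *ioc)
{
	struct toy_dev *d = ioc->init_list;

	if (d) {
		ioc->init_list = d->next;
		d->next = NULL;
	}
	return d;
}

/* Put the entry on the active list, like sas_device_make_active().
 * (Pushed at the head here for brevity; ordering is irrelevant to the point.) */
static void toy_make_active(struct toy_ioc *ioc, struct toy_dev *d)
{
	d->next = ioc->active_list;
	ioc->active_list = d;
}

/* Walk one list looking for a matching address. */
static struct toy_dev *toy_find(struct toy_dev *head, uint64_t addr)
{
	for (; head; head = head->next)
		if (head->sas_address == addr)
			return head;
	return NULL;
}

/* Search the active list first, then the init list. */
static struct toy_dev *toy_find_by_addr(struct toy_ioc *ioc, uint64_t addr)
{
	struct toy_dev *d = toy_find(ioc->active_list, addr);

	return d ? d : toy_find(ioc->init_list, addr);
}

/*
 * Returns 1 if a lookup made between dequeue and make_active misses the
 * device (the point at which the transport callbacks would run) and a
 * lookup after make_active finds it again.
 */
int toy_lookup_misses_in_window(void)
{
	/* Made-up address, for illustration only. */
	struct toy_dev dev = { .sas_address = 0x5000c50012345678ULL, .next = NULL };
	struct toy_ioc ioc = { .init_list = &dev, .active_list = NULL };
	struct toy_dev *d = toy_dequeue(&ioc);
	int missed = (toy_find_by_addr(&ioc, dev.sas_address) == NULL);

	toy_make_active(&ioc, d);
	return missed && toy_find_by_addr(&ioc, dev.sas_address) == d;
}
```

In this model the device is on neither list while it is being reported, so
any callback that depends on a list lookup during that time sees no
device. One possible way out, again only as an assumption, would be to keep
the entry on a list (or otherwise reachable) until the transport add has
completed.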


Regards,
Sreekanth
> if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
> - sas_device->sas_address_parent)) {
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + sas_device->sas_address_parent)) {
> + sas_device_put(sas_device);
> continue;
> } else if (!sas_device->starget) {
> if (!ioc->is_driver_loading) {
> mpt2sas_transport_port_remove(ioc,
> - sas_device->sas_address,
> - sas_device->sas_address_parent);
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + sas_device->sas_address,
> + sas_device->sas_address_parent);
> + sas_device_put(sas_device);
> continue;
> }
> }
> - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - list_move_tail(&sas_device->list, &ioc->sas_device_list);
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> + sas_device_make_active(ioc, sas_device);
> + sas_device_put(sas_device);
> }
> }
>
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_transport.c b/drivers/scsi/mpt2sas/mpt2sas_transport.c
> index ff2500a..af86800 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_transport.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_transport.c
> @@ -1323,15 +1323,17 @@ _transport_get_enclosure_identifier(struct sas_rphy *rphy, u64 *identifier)
> int rc;
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> rphy->identify.sas_address);
> if (sas_device) {
> *identifier = sas_device->enclosure_logical_id;
> rc = 0;
> + sas_device_put(sas_device);
> } else {
> *identifier = 0;
> rc = -ENXIO;
> }
> +
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> return rc;
> }
> @@ -1351,12 +1353,14 @@ _transport_get_bay_identifier(struct sas_rphy *rphy)
> int rc;
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> rphy->identify.sas_address);
> - if (sas_device)
> + if (sas_device) {
> rc = sas_device->slot;
> - else
> + sas_device_put(sas_device);
> + } else {
> rc = -ENXIO;
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> return rc;
> }
> --
> 1.8.1
>



--

Regards,
Sreekanth

2015-07-21 07:04:21

by Calvin Owens

Subject: Re: [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage

On Thursday 07/16 at 20:27 +0530, Sreekanth Reddy wrote:
> On Sun, Jul 12, 2015 at 9:54 AM, Calvin Owens <[email protected]> wrote:
> > These objects can be referenced concurrently throughout the driver, so we
> > need a way to make sure threads can't delete them out from under each
> > other. This patch adds the refcount, and refactors the code to use it.
> >
> > Additionally, we cannot iterate over the sas_device_list without
> > holding the lock, or we risk corrupting random memory if items are
> > added or deleted as we iterate. This patch refactors _scsih_probe_sas()
> > to use the sas_device_list in a safe way.
> >
> > Cc: Christoph Hellwig <[email protected]>
> > Cc: Bart Van Assche <[email protected]>
> > Signed-off-by: Calvin Owens <[email protected]>
> > ---
> > drivers/scsi/mpt2sas/mpt2sas_base.h | 22 +-
> > drivers/scsi/mpt2sas/mpt2sas_scsih.c | 434 ++++++++++++++++++++-----------
> > drivers/scsi/mpt2sas/mpt2sas_transport.c | 12 +-
> > 3 files changed, 315 insertions(+), 153 deletions(-)
> >
> > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
> > index caff8d1..78f41ac 100644
> > --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> > @@ -238,6 +238,7 @@
> > * @flags: MPT_TARGET_FLAGS_XXX flags
> > * @deleted: target flaged for deletion
> > * @tm_busy: target is busy with TM request.
> > + * @sdev: The sas_device associated with this target
> > */
> > struct MPT2SAS_TARGET {
> > struct scsi_target *starget;
> > @@ -248,6 +249,7 @@ struct MPT2SAS_TARGET {
> > u32 flags;
> > u8 deleted;
> > u8 tm_busy;
> > + struct _sas_device *sdev;
> > };
> >
> >
> > @@ -376,8 +378,24 @@ struct _sas_device {
> > u8 phy;
> > u8 responding;
> > u8 pfa_led_on;
> > + struct kref refcount;
> > };
> >
> > +static inline void sas_device_get(struct _sas_device *s)
> > +{
> > + kref_get(&s->refcount);
> > +}
> > +
> > +static inline void sas_device_free(struct kref *r)
> > +{
> > + kfree(container_of(r, struct _sas_device, refcount));
> > +}
> > +
> > +static inline void sas_device_put(struct _sas_device *s)
> > +{
> > + kref_put(&s->refcount, sas_device_free);
> > +}
> > +
> > /**
> > * struct _raid_device - raid volume link list
> > * @list: sas device list
> > @@ -1095,7 +1113,9 @@ struct _sas_node *mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
> > u16 handle);
> > struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct MPT2SAS_ADAPTER
> > *ioc, u64 sas_address);
> > -struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
> > +struct _sas_device *mpt2sas_get_sdev_by_addr(
> > + struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
> > +struct _sas_device *__mpt2sas_get_sdev_by_addr(
> > struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
> >
> > void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
> > diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > index 3f26147..fad80ce 100644
> > --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > @@ -526,8 +526,43 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
> > }
> > }
> >
> > +struct _sas_device *
> > +__mpt2sas_get_sdev_from_target(struct MPT2SAS_TARGET *tgt_priv)
> > +{
> > + struct _sas_device *ret;
> > +
> > + ret = tgt_priv->sdev;
> > + if (ret)
> > + sas_device_get(ret);
> > +
> > + return ret;
> > +}
> > +
> > +struct _sas_device *
> > +__mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
> > + u64 sas_address)
> > +{
> > + struct _sas_device *sas_device;
> > +
> > + assert_spin_locked(&ioc->sas_device_lock);
> > +
> > + list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> > + if (sas_device->sas_address == sas_address)
> > + goto found_device;
> > +
> > + list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> > + if (sas_device->sas_address == sas_address)
> > + goto found_device;
> > +
> > + return NULL;
> > +
> > +found_device:
> > + sas_device_get(sas_device);
> > + return sas_device;
> > +}
> > +
> > /**
> > - * mpt2sas_scsih_sas_device_find_by_sas_address - sas device search
> > + * mpt2sas_get_sdev_by_addr - sas device search
> > * @ioc: per adapter object
> > * @sas_address: sas address
> > * Context: Calling function should acquire ioc->sas_device_lock
> > @@ -536,24 +571,44 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
> > * object.
> > */
> > struct _sas_device *
> > -mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> > +mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
> > u64 sas_address)
> > {
> > struct _sas_device *sas_device;
> > + unsigned long flags;
> > +
> > + spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> > + sas_address);
> > + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +
> > + return sas_device;
> > +}
> > +
> > +static struct _sas_device *
> > +__mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > +{
> > + struct _sas_device *sas_device;
> > +
> > + assert_spin_locked(&ioc->sas_device_lock);
> >
> > list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> > - if (sas_device->sas_address == sas_address)
> > - return sas_device;
> > + if (sas_device->handle == handle)
> > + goto found_device;
> >
> > list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> > - if (sas_device->sas_address == sas_address)
> > - return sas_device;
> > + if (sas_device->handle == handle)
> > + goto found_device;
> >
> > return NULL;
> > +
> > +found_device:
> > + sas_device_get(sas_device);
> > + return sas_device;
> > }
> >
> > /**
> > - * _scsih_sas_device_find_by_handle - sas device search
> > + * mpt2sas_get_sdev_by_handle - sas device search
> > * @ioc: per adapter object
> > * @handle: sas device handle (assigned by firmware)
> > * Context: Calling function should acquire ioc->sas_device_lock
> > @@ -562,19 +617,16 @@ mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> > * object.
> > */
> > static struct _sas_device *
> > -_scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > +mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > {
> > struct _sas_device *sas_device;
> > + unsigned long flags;
> >
> > - list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> > - if (sas_device->handle == handle)
> > - return sas_device;
> > -
> > - list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> > - if (sas_device->handle == handle)
> > - return sas_device;
> > + spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> > + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > - return NULL;
> > + return sas_device;
> > }
> >
> > /**
> > @@ -583,7 +635,7 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > * @sas_device: the sas_device object
> > * Context: This function will acquire ioc->sas_device_lock.
> > *
> > - * Removing object and freeing associated memory from the ioc->sas_device_list.
> > + * If sas_device is on the list, remove it and decrement its reference count.
> > */
> > static void
> > _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> > @@ -594,9 +646,15 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> > if (!sas_device)
> > return;
> >
> > + /*
> > + * The lock serializes access to the list, but we still need to verify
> > + * that nobody removed the entry while we were waiting on the lock.
> > + */
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - list_del(&sas_device->list);
> > - kfree(sas_device);
> > + if (!list_empty(&sas_device->list)) {
> > + list_del_init(&sas_device->list);
> > + sas_device_put(sas_device);
> > + }
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > }
> >
> > @@ -620,6 +678,7 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
> > sas_device->handle, (unsigned long long)sas_device->sas_address));
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > + sas_device_get(sas_device);
> > list_add_tail(&sas_device->list, &ioc->sas_device_list);
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > @@ -659,6 +718,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
> > sas_device->handle, (unsigned long long)sas_device->sas_address));
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > + sas_device_get(sas_device);
> > list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
> > _scsih_determine_boot_device(ioc, sas_device, 0);
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > @@ -1208,12 +1268,14 @@ _scsih_change_queue_depth(struct scsi_device *sdev, int qdepth)
> > goto not_sata;
> > if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))
> > goto not_sata;
> > +
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > - sas_device_priv_data->sas_target->sas_address);
> > - if (sas_device && sas_device->device_info &
> > - MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
> > + sas_device = __mpt2sas_get_sdev_from_target(sas_target_priv_data);
> > + if (sas_device && sas_device->device_info
> > + & MPI2_SAS_DEVICE_INFO_SATA_DEVICE) {
> > max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
> > + sas_device_put(sas_device);
> > + }
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > not_sata:
> > @@ -1271,18 +1333,21 @@ _scsih_target_alloc(struct scsi_target *starget)
> > /* sas/sata devices */
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > rphy = dev_to_rphy(starget->dev.parent);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> > rphy->identify.sas_address);
> >
> > if (sas_device) {
> > sas_target_priv_data->handle = sas_device->handle;
> > sas_target_priv_data->sas_address = sas_device->sas_address;
> > + sas_target_priv_data->sdev = sas_device;
> > sas_device->starget = starget;
> > sas_device->id = starget->id;
> > sas_device->channel = starget->channel;
> > if (test_bit(sas_device->handle, ioc->pd_handles))
> > sas_target_priv_data->flags |=
> > MPT_TARGET_FLAGS_RAID_COMPONENT;
> > +
> > + sas_device_put(sas_device);
> > }
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > @@ -1324,13 +1389,14 @@ _scsih_target_destroy(struct scsi_target *starget)
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > rphy = dev_to_rphy(starget->dev.parent);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > - rphy->identify.sas_address);
> > + sas_device = __mpt2sas_get_sdev_from_target(sas_target_priv_data);
> > if (sas_device && (sas_device->starget == starget) &&
> > (sas_device->id == starget->id) &&
> > (sas_device->channel == starget->channel))
> > sas_device->starget = NULL;
> >
> > + if (sas_device)
> > + sas_device_put(sas_device);
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > out:
> > @@ -1386,7 +1452,7 @@ _scsih_slave_alloc(struct scsi_device *sdev)
> >
> > if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> > sas_target_priv_data->sas_address);
> > if (sas_device && (sas_device->starget == NULL)) {
> > sdev_printk(KERN_INFO, sdev,
> > @@ -1394,6 +1460,10 @@ _scsih_slave_alloc(struct scsi_device *sdev)
> > __func__, __LINE__);
> > sas_device->starget = starget;
> > }
> > +
> > + if (sas_device)
> > + sas_device_put(sas_device);
> > +
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > }
> >
> > @@ -1428,10 +1498,12 @@ _scsih_slave_destroy(struct scsi_device *sdev)
> >
> > if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > - sas_target_priv_data->sas_address);
> > + sas_device = __mpt2sas_get_sdev_from_target(sas_target_priv_data);
> > if (sas_device && !sas_target_priv_data->num_luns)
> > sas_device->starget = NULL;
> > +
> > + if (sas_device)
> > + sas_device_put(sas_device);
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > }
> >
> > @@ -2078,7 +2150,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
> > }
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> > sas_device_priv_data->sas_target->sas_address);
> > if (!sas_device) {
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > @@ -2116,13 +2188,14 @@ _scsih_slave_configure(struct scsi_device *sdev)
> > if (!ssp_target)
> > _scsih_display_sata_capabilities(ioc, handle, sdev);
> >
> > -
> > _scsih_change_queue_depth(sdev, qdepth);
> >
> > if (ssp_target) {
> > sas_read_port_mode_page(sdev);
> > _scsih_enable_tlr(ioc, sdev);
> > }
> > +
> > + sas_device_put(sas_device);
> > return 0;
> > }
> >
> > @@ -2509,8 +2582,7 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
> > device_str, (unsigned long long)priv_target->sas_address);
> > } else {
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > - priv_target->sas_address);
> > + sas_device = __mpt2sas_get_sdev_from_target(priv_target);
> > if (sas_device) {
> > if (priv_target->flags &
> > MPT_TARGET_FLAGS_RAID_COMPONENT) {
> > @@ -2529,6 +2601,8 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
> > "enclosure_logical_id(0x%016llx), slot(%d)\n",
> > (unsigned long long)sas_device->enclosure_logical_id,
> > sas_device->slot);
> > +
> > + sas_device_put(sas_device);
> > }
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > }
> > @@ -2604,12 +2678,12 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
> > {
> > struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
> > struct MPT2SAS_DEVICE *sas_device_priv_data;
> > - struct _sas_device *sas_device;
> > - unsigned long flags;
> > + struct _sas_device *sas_device = NULL;
> > u16 handle;
> > int r;
> >
> > struct scsi_target *starget = scmd->device->sdev_target;
> > + struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
> >
> > starget_printk(KERN_INFO, starget, "attempting device reset! "
> > "scmd(%p)\n", scmd);
> > @@ -2629,12 +2703,9 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
> > handle = 0;
> > if (sas_device_priv_data->sas_target->flags &
> > MPT_TARGET_FLAGS_RAID_COMPONENT) {
> > - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = _scsih_sas_device_find_by_handle(ioc,
> > - sas_device_priv_data->sas_target->handle);
> > + sas_device = __mpt2sas_get_sdev_from_target(target_priv_data);
> > if (sas_device)
> > handle = sas_device->volume_handle;
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > } else
> > handle = sas_device_priv_data->sas_target->handle;
> >
> > @@ -2651,6 +2722,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
> > out:
> > sdev_printk(KERN_INFO, scmd->device, "device reset: %s scmd(%p)\n",
> > ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
> > +
> > + if (sas_device)
> > + sas_device_put(sas_device);
> > +
> > return r;
> > }
> >
> > @@ -2665,11 +2740,11 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
> > {
> > struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
> > struct MPT2SAS_DEVICE *sas_device_priv_data;
> > - struct _sas_device *sas_device;
> > - unsigned long flags;
> > + struct _sas_device *sas_device = NULL;
> > u16 handle;
> > int r;
> > struct scsi_target *starget = scmd->device->sdev_target;
> > + struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
> >
> > starget_printk(KERN_INFO, starget, "attempting target reset! "
> > "scmd(%p)\n", scmd);
> > @@ -2689,12 +2764,9 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
> > handle = 0;
> > if (sas_device_priv_data->sas_target->flags &
> > MPT_TARGET_FLAGS_RAID_COMPONENT) {
> > - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = _scsih_sas_device_find_by_handle(ioc,
> > - sas_device_priv_data->sas_target->handle);
> > + sas_device = __mpt2sas_get_sdev_from_target(target_priv_data);
> > if (sas_device)
> > handle = sas_device->volume_handle;
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > } else
> > handle = sas_device_priv_data->sas_target->handle;
> >
> > @@ -2711,6 +2783,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
> > out:
> > starget_printk(KERN_INFO, starget, "target reset: %s scmd(%p)\n",
> > ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
> > +
> > + if (sas_device)
> > + sas_device_put(sas_device);
> > +
> > return r;
> > }
> >
> > @@ -3002,15 +3078,15 @@ _scsih_block_io_to_children_attached_to_ex(struct MPT2SAS_ADAPTER *ioc,
> >
> > list_for_each_entry(mpt2sas_port,
> > &sas_expander->sas_port_list, port_list) {
> > - if (mpt2sas_port->remote_identify.device_type ==
> > - SAS_END_DEVICE) {
> > + if (mpt2sas_port->remote_identify.device_type == SAS_END_DEVICE) {
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device =
> > - mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > - mpt2sas_port->remote_identify.sas_address);
> > - if (sas_device)
> > + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> > + mpt2sas_port->remote_identify.sas_address);
> > + if (sas_device) {
> > set_bit(sas_device->handle,
> > - ioc->blocking_handles);
> > + ioc->blocking_handles);
> > + sas_device_put(sas_device);
> > + }
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > }
> > }
> > @@ -3080,7 +3156,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > {
> > Mpi2SCSITaskManagementRequest_t *mpi_request;
> > u16 smid;
> > - struct _sas_device *sas_device;
> > + struct _sas_device *sas_device = NULL;
> > struct MPT2SAS_TARGET *sas_target_priv_data = NULL;
> > u64 sas_address = 0;
> > unsigned long flags;
> > @@ -3110,7 +3186,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > return;
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> > if (sas_device && sas_device->starget &&
> > sas_device->starget->hostdata) {
> > sas_target_priv_data = sas_device->starget->hostdata;
> > @@ -3131,14 +3207,14 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > if (!smid) {
> > delayed_tr = kzalloc(sizeof(*delayed_tr), GFP_ATOMIC);
> > if (!delayed_tr)
> > - return;
> > + goto out;
> > INIT_LIST_HEAD(&delayed_tr->list);
> > delayed_tr->handle = handle;
> > list_add_tail(&delayed_tr->list, &ioc->delayed_tr_list);
> > dewtprintk(ioc, printk(MPT2SAS_INFO_FMT
> > "DELAYED:tr:handle(0x%04x), (open)\n",
> > ioc->name, handle));
> > - return;
> > + goto out;
> > }
> >
> > dewtprintk(ioc, printk(MPT2SAS_INFO_FMT "tr_send:handle(0x%04x), "
> > @@ -3150,6 +3226,9 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > mpi_request->DevHandle = cpu_to_le16(handle);
> > mpi_request->TaskType = MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET;
> > mpt2sas_base_put_smid_hi_priority(ioc, smid);
> > +out:
> > + if (sas_device)
> > + sas_device_put(sas_device);
> > }
> >
> >
> > @@ -4068,7 +4147,6 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
> > char *desc_scsi_state = ioc->tmp_string;
> > u32 log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
> > struct _sas_device *sas_device = NULL;
> > - unsigned long flags;
> > struct scsi_target *starget = scmd->device->sdev_target;
> > struct MPT2SAS_TARGET *priv_target = starget->hostdata;
> > char *device_str = NULL;
> > @@ -4200,9 +4278,7 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
> > printk(MPT2SAS_WARN_FMT "\t%s wwid(0x%016llx)\n", ioc->name,
> > device_str, (unsigned long long)priv_target->sas_address);
> > } else {
> > - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > - priv_target->sas_address);
> > + sas_device = __mpt2sas_get_sdev_from_target(priv_target);
> > if (sas_device) {
> > printk(MPT2SAS_WARN_FMT "\tsas_address(0x%016llx), "
> > "phy(%d)\n", ioc->name, sas_device->sas_address,
> > @@ -4211,8 +4287,9 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
> > "\tenclosure_logical_id(0x%016llx), slot(%d)\n",
> > ioc->name, sas_device->enclosure_logical_id,
> > sas_device->slot);
> > +
> > + sas_device_put(sas_device);
> > }
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > }
> >
> > printk(MPT2SAS_WARN_FMT "\thandle(0x%04x), ioc_status(%s)(0x%04x), "
> > @@ -4259,7 +4336,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > Mpi2SepRequest_t mpi_request;
> > struct _sas_device *sas_device;
> >
> > - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > + sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> > if (!sas_device)
> > return;
> >
> > @@ -4274,7 +4351,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > &mpi_request)) != 0) {
> > printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n", ioc->name,
> > __FILE__, __LINE__, __func__);
> > - return;
> > + goto out;
> > }
> > sas_device->pfa_led_on = 1;
> >
> > @@ -4284,8 +4361,10 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > "enclosure_processor: ioc_status (0x%04x), loginfo(0x%08x)\n",
> > ioc->name, le16_to_cpu(mpi_reply.IOCStatus),
> > le32_to_cpu(mpi_reply.IOCLogInfo)));
> > - return;
> > + goto out;
> > }
> > +out:
> > + sas_device_put(sas_device);
> > }
> >
> > /**
> > @@ -4370,19 +4449,17 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >
> > /* only handle non-raid devices */
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> > if (!sas_device) {
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - return;
> > + goto out_unlock;
> > }
> > starget = sas_device->starget;
> > sas_target_priv_data = starget->hostdata;
> >
> > if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_RAID_COMPONENT) ||
> > - ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))) {
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - return;
> > - }
> > + ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)))
> > + goto out_unlock;
> > +
> > starget_printk(KERN_WARNING, starget, "predicted fault\n");
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > @@ -4396,7 +4473,7 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > if (!event_reply) {
> > printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
> > ioc->name, __FILE__, __LINE__, __func__);
> > - return;
> > + goto out;
> > }
> >
> > event_reply->Function = MPI2_FUNCTION_EVENT_NOTIFICATION;
> > @@ -4413,6 +4490,14 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > event_data->SASAddress = cpu_to_le64(sas_target_priv_data->sas_address);
> > mpt2sas_ctl_add_to_event_log(ioc, event_reply);
> > kfree(event_reply);
> > +out:
> > + if (sas_device)
> > + sas_device_put(sas_device);
> > + return;
> > +
> > +out_unlock:
> > + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > + goto out;
> > }
> >
> > /**
> > @@ -5148,14 +5233,13 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > sas_address = le64_to_cpu(sas_device_pg0.SASAddress);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> > sas_address);
> >
> > if (!sas_device) {
> > printk(MPT2SAS_ERR_FMT "device is not present "
> > "handle(0x%04x), no sas_device!!!\n", ioc->name, handle);
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - return;
> > + goto out_unlock;
> > }
> >
> > if (unlikely(sas_device->handle != handle)) {
> > @@ -5172,19 +5256,22 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > MPI2_SAS_DEVICE0_FLAGS_DEVICE_PRESENT)) {
> > printk(MPT2SAS_ERR_FMT "device is not present "
> > "handle(0x%04x), flags!!!\n", ioc->name, handle);
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - return;
> > + goto out_unlock;
> > }
> >
> > /* check if there were any issues with discovery */
> > if (_scsih_check_access_status(ioc, sas_address, handle,
> > - sas_device_pg0.AccessStatus)) {
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - return;
> > - }
> > + sas_device_pg0.AccessStatus))
> > + goto out_unlock;
> > +
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > _scsih_ublock_io_device(ioc, sas_address);
> > + return;
> >
> > +out_unlock:
> > + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > + if (sas_device)
> > + sas_device_put(sas_device);
> > }
> >
> > /**
> > @@ -5208,7 +5295,6 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
> > u32 ioc_status;
> > __le64 sas_address;
> > u32 device_info;
> > - unsigned long flags;
> >
> > if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
> > MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
> > @@ -5250,14 +5336,13 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
> > return -1;
> > }
> >
> > -
> > - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > + sas_device = mpt2sas_get_sdev_by_addr(ioc,
> > sas_address);
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > - if (sas_device)
> > + if (sas_device) {
> > + sas_device_put(sas_device);
> > return 0;
> > + }
> >
> > sas_device = kzalloc(sizeof(struct _sas_device),
> > GFP_KERNEL);
> > @@ -5267,6 +5352,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
> > return -1;
> > }
> >
> > + kref_init(&sas_device->refcount);
> > sas_device->handle = handle;
> > if (_scsih_get_sas_address(ioc, le16_to_cpu
> > (sas_device_pg0.ParentDevHandle),
> > @@ -5344,7 +5430,6 @@ _scsih_remove_device(struct MPT2SAS_ADAPTER *ioc,
> > "handle(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
> > sas_device->handle, (unsigned long long)
> > sas_device->sas_address));
> > - kfree(sas_device);
> > }
> > /**
> > * _scsih_device_remove_by_handle - removing device object by handle
> > @@ -5363,12 +5448,17 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > return;
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > - if (sas_device)
> > - list_del(&sas_device->list);
> > + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> > + if (sas_device) {
> > + list_del_init(&sas_device->list);
> > + sas_device_put(sas_device);
> > + }
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - if (sas_device)
> > +
> > + if (sas_device) {
> > _scsih_remove_device(ioc, sas_device);
> > + sas_device_put(sas_device);
> > + }
> > }
> >
> > /**
> > @@ -5389,13 +5479,17 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> > return;
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > - sas_address);
> > - if (sas_device)
> > - list_del(&sas_device->list);
> > + sas_device = __mpt2sas_get_sdev_by_addr(ioc, sas_address);
> > + if (sas_device) {
> > + list_del_init(&sas_device->list);
> > + sas_device_put(sas_device);
> > + }
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - if (sas_device)
> > +
> > + if (sas_device) {
> > _scsih_remove_device(ioc, sas_device);
> > + sas_device_put(sas_device);
> > + }
> > }
> > #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
> > /**
> > @@ -5716,26 +5810,28 @@ _scsih_sas_device_status_change_event(struct MPT2SAS_ADAPTER *ioc,
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > sas_address = le64_to_cpu(event_data->SASAddress);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> > sas_address);
> >
> > - if (!sas_device || !sas_device->starget) {
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - return;
> > - }
> > + if (!sas_device || !sas_device->starget)
> > + goto out;
> >
> > target_priv_data = sas_device->starget->hostdata;
> > - if (!target_priv_data) {
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - return;
> > - }
> > + if (!target_priv_data)
> > + goto out;
> >
> > if (event_data->ReasonCode ==
> > MPI2_EVENT_SAS_DEV_STAT_RC_INTERNAL_DEVICE_RESET)
> > target_priv_data->tm_busy = 1;
> > else
> > target_priv_data->tm_busy = 0;
> > +
> > +out:
> > + if (sas_device)
> > + sas_device_put(sas_device);
> > +
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +
> > }
> >
> > #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
> > @@ -6123,7 +6219,7 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
> > u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> > if (sas_device) {
> > sas_device->volume_handle = 0;
> > sas_device->volume_wwid = 0;
> > @@ -6142,6 +6238,8 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
> > /* exposing raid component */
> > if (starget)
> > starget_for_each_device(starget, NULL, _scsih_reprobe_lun);
> > +
> > + sas_device_put(sas_device);
> > }
> >
> > /**
> > @@ -6170,7 +6268,7 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
> > &volume_wwid);
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> > if (sas_device) {
> > set_bit(handle, ioc->pd_handles);
> > if (sas_device->starget && sas_device->starget->hostdata) {
> > @@ -6189,6 +6287,8 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
> > /* hiding raid component */
> > if (starget)
> > starget_for_each_device(starget, (void *)1, _scsih_reprobe_lun);
> > +
> > + sas_device_put(sas_device);
> > }
> >
> > /**
> > @@ -6221,7 +6321,6 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
> > Mpi2EventIrConfigElement_t *element)
> > {
> > struct _sas_device *sas_device;
> > - unsigned long flags;
> > u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
> > Mpi2ConfigReply_t mpi_reply;
> > Mpi2SasDevicePage0_t sas_device_pg0;
> > @@ -6231,11 +6330,11 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
> >
> > set_bit(handle, ioc->pd_handles);
> >
> > - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - if (sas_device)
> > + sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> > + if (sas_device) {
> > + sas_device_put(sas_device);
> > return;
> > + }
> >
> > if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
> > MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
> > @@ -6509,7 +6608,6 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
> > u16 handle, parent_handle;
> > u32 state;
> > struct _sas_device *sas_device;
> > - unsigned long flags;
> > Mpi2ConfigReply_t mpi_reply;
> > Mpi2SasDevicePage0_t sas_device_pg0;
> > u32 ioc_status;
> > @@ -6542,12 +6640,11 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
> > if (!ioc->is_warpdrive)
> > set_bit(handle, ioc->pd_handles);
> >
> > - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -
> > - if (sas_device)
> > + sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> > + if (sas_device) {
> > + sas_device_put(sas_device);
> > return;
> > + }
> >
> > if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
> > &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
> > @@ -7015,6 +7112,7 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
> > struct _raid_device *raid_device, *raid_device_next;
> > struct list_head tmp_list;
> > unsigned long flags;
> > + LIST_HEAD(head);
> >
> > printk(MPT2SAS_INFO_FMT "removing unresponding devices: start\n",
> > ioc->name);
> > @@ -7022,14 +7120,29 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
> > /* removing unresponding end devices */
> > printk(MPT2SAS_INFO_FMT "removing unresponding devices: end-devices\n",
> > ioc->name);
> > +
> > + /*
> > + * Iterate, pulling off devices marked as non-responding. We become the
> > + * owner for the reference the list had on any object we prune.
> > + */
> > + spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > list_for_each_entry_safe(sas_device, sas_device_next,
> > - &ioc->sas_device_list, list) {
> > + &ioc->sas_device_list, list) {
> > if (!sas_device->responding)
> > - mpt2sas_device_remove_by_sas_address(ioc,
> > - sas_device->sas_address);
> > + list_move_tail(&sas_device->list, &head);
> > else
> > sas_device->responding = 0;
> > }
> > + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +
> > + /*
> > + * Now, uninitialize and remove the unresponding devices we pruned.
> > + */
> > + list_for_each_entry_safe(sas_device, sas_device_next, &head, list) {
> > + _scsih_remove_device(ioc, sas_device);
> > + list_del_init(&sas_device->list);
> > + sas_device_put(sas_device);
> > + }
> >
> > /* removing unresponding volumes */
> > if (ioc->ir_firmware) {
> > @@ -7179,11 +7292,11 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
> > }
> > phys_disk_num = pd_pg0.PhysDiskNum;
> > handle = le16_to_cpu(pd_pg0.DevHandle);
> > - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - if (sas_device)
> > + sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> > + if (sas_device) {
> > + sas_device_put(sas_device);
> > continue;
> > + }
> > if (mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
> > &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
> > handle) != 0)
> > @@ -7302,12 +7415,12 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
> > if (!(_scsih_is_end_device(
> > le32_to_cpu(sas_device_pg0.DeviceInfo))))
> > continue;
> > - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > + sas_device = mpt2sas_get_sdev_by_addr(ioc,
> > le64_to_cpu(sas_device_pg0.SASAddress));
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - if (sas_device)
> > + if (sas_device) {
> > + sas_device_put(sas_device);
> > continue;
> > + }
> > parent_handle = le16_to_cpu(sas_device_pg0.ParentDevHandle);
> > if (!_scsih_get_sas_address(ioc, parent_handle, &sas_address)) {
> > printk(MPT2SAS_INFO_FMT "\tBEFORE adding end device: "
> > @@ -7966,6 +8079,37 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
> > }
> > }
> >
> > +static struct _sas_device *dequeue_next_sas_device(struct MPT2SAS_ADAPTER *ioc)
> > +{
> > + struct _sas_device *sas_device = NULL;
> > + unsigned long flags;
> > +
> > + spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > + if (!list_empty(&ioc->sas_device_init_list)) {
> > + sas_device = list_first_entry(&ioc->sas_device_init_list,
> > + struct _sas_device, list);
> > + list_del_init(&sas_device->list);
> > + }
> > + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +
> > + /*
> > + * If an item was dequeued, the caller now owns the reference that was
> > + * previously owned by the list
> > + */
> > + return sas_device;
> > +}
> > +
> > +static void sas_device_make_active(struct MPT2SAS_ADAPTER *ioc,
> > + struct _sas_device *sas_device)
> > +{
> > + unsigned long flags;
> > +
> > + spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > + sas_device_get(sas_device);
> > + list_add_tail(&sas_device->list, &ioc->sas_device_list);
> > + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +}
> > +
> > /**
> > * _scsih_probe_sas - reporting sas devices to sas transport
> > * @ioc: per adapter object
> > @@ -7975,34 +8119,28 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
> > static void
> > _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
> > {
> > - struct _sas_device *sas_device, *next;
> > - unsigned long flags;
> > -
> > - /* SAS Device List */
> > - list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
> > - list) {
> > + struct _sas_device *sas_device;
> >
> > - if (ioc->hide_drives)
> > - continue;
> > + if (ioc->hide_drives)
> > + return;
> >
> > + while ((sas_device = dequeue_next_sas_device(ioc))) {
>
> I see an issue here. The sas_device is removed from
> sas_device_init_list and then added to the STL by calling
> sas_rphy_add, which in turn invokes the driver's target_alloc,
> slave_alloc and slave_configure callback routines. Those routines
> check whether the device is present in sas_device_init_list; since
> it is no longer on that list, the device won't be added.

Thanks for looking at this.

I think I can eliminate this problem without too much churn: Since we
hold the reference, it should be fine to leave the sas_device on the
list, as long as we're careful about the state of its list_head after
reacquiring the sas_device_lock (which we can't hold here because
mpt2sas_transport_port_add() calls things that sleep).

I'll send a v3 that does this in the next day or two.

(Lest it appear that I'm being incredibly sloppy about testing here, the
devices I'm using to test these patches don't seem to encounter this
problem at all. I can provide more detail about the hardware I'm using
if that's interesting.)
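
To make the idea concrete, here is a rough userspace sketch of the
approach described above. It is only a model under stated assumptions:
all names (struct dev, probe_one(), dev_remove(), and so on) are
hypothetical, the lock is shown only as comments, and the "race"
argument stands in for an event handler removing the device while the
lock is dropped. The point is that the prober holds its own reference,
leaves the entry on the list, and re-checks the entry's list_head after
retaking the lock.

```c
/* Userspace model only; every name here is hypothetical. */
#include <assert.h>
#include <stdlib.h>

struct node { struct node *prev, *next; };

static void node_init(struct node *n) { n->prev = n->next = n; }
static int node_is_alone(const struct node *n) { return n->next == n; }

static void node_del_init(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}

static void node_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

struct dev {
	struct node list;	/* first member, so a node pointer is a dev pointer */
	int refs;
};

static int nr_freed;

static void dev_get(struct dev *d) { d->refs++; }

static void dev_put(struct dev *d)
{
	if (--d->refs == 0) {
		nr_freed++;
		free(d);
	}
}

static void dev_enqueue(struct node *head)
{
	struct dev *d = malloc(sizeof(*d));

	assert(d);
	d->refs = 1;		/* this reference belongs to the list */
	node_add_tail(&d->list, head);
}

/* Models a removal event: unlink if still linked, drop the list's ref. */
static void dev_remove(struct dev *d)
{
	if (!node_is_alone(&d->list)) {
		node_del_init(&d->list);
		dev_put(d);
	}
}

/* Returns 1 if moved to the active list, 0 if removed meanwhile, -1 if empty. */
static int probe_one(struct node *init_list, struct node *active, int race)
{
	struct dev *d;
	int moved = 0;

	/* lock would be taken here */
	if (node_is_alone(init_list))
		return -1;
	d = (struct dev *)init_list->next;
	dev_get(d);		/* our reference; the entry stays on the list */
	/* lock dropped: the sleeping work would happen here */

	if (race)
		dev_remove(d);

	/* lock retaken: check the list_head before touching it */
	if (!node_is_alone(&d->list)) {
		node_del_init(&d->list);
		node_add_tail(&d->list, active);	/* list ref moves along */
		moved = 1;
	}
	/* lock dropped */
	dev_put(d);
	return moved;
}

int demo_probe(void)
{
	struct node init_list, active;

	nr_freed = 0;
	node_init(&init_list);
	node_init(&active);
	dev_enqueue(&init_list);
	dev_enqueue(&init_list);

	if (probe_one(&init_list, &active, 0) != 1)
		return 1;
	if (probe_one(&init_list, &active, 1) != 0)
		return 2;
	if (nr_freed != 1)	/* raced device freed by the final put */
		return 3;
	if (probe_one(&init_list, &active, 0) != -1)
		return 4;

	dev_remove((struct dev *)active.next);
	return nr_freed == 2 ? 0 : 5;
}
```

In this model the device that is removed mid-probe is freed exactly
once, by whichever put() happens last, and the prober never touches a
list_head that someone else has already unlinked.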

Thanks very much,
Calvin

> Regards,
> Sreekanth
> > if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
> > - sas_device->sas_address_parent)) {
> > - list_del(&sas_device->list);
> > - kfree(sas_device);
> > + sas_device->sas_address_parent)) {
> > + sas_device_put(sas_device);
> > continue;
> > } else if (!sas_device->starget) {
> > if (!ioc->is_driver_loading) {
> > mpt2sas_transport_port_remove(ioc,
> > - sas_device->sas_address,
> > - sas_device->sas_address_parent);
> > - list_del(&sas_device->list);
> > - kfree(sas_device);
> > + sas_device->sas_address,
> > + sas_device->sas_address_parent);
> > + sas_device_put(sas_device);
> > continue;
> > }
> > }
> > - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - list_move_tail(&sas_device->list, &ioc->sas_device_list);
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +
> > + sas_device_make_active(ioc, sas_device);
> > + sas_device_put(sas_device);
> > }
> > }
> >
> > diff --git a/drivers/scsi/mpt2sas/mpt2sas_transport.c b/drivers/scsi/mpt2sas/mpt2sas_transport.c
> > index ff2500a..af86800 100644
> > --- a/drivers/scsi/mpt2sas/mpt2sas_transport.c
> > +++ b/drivers/scsi/mpt2sas/mpt2sas_transport.c
> > @@ -1323,15 +1323,17 @@ _transport_get_enclosure_identifier(struct sas_rphy *rphy, u64 *identifier)
> > int rc;
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> > rphy->identify.sas_address);
> > if (sas_device) {
> > *identifier = sas_device->enclosure_logical_id;
> > rc = 0;
> > + sas_device_put(sas_device);
> > } else {
> > *identifier = 0;
> > rc = -ENXIO;
> > }
> > +
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > return rc;
> > }
> > @@ -1351,12 +1353,14 @@ _transport_get_bay_identifier(struct sas_rphy *rphy)
> > int rc;
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> > rphy->identify.sas_address);
> > - if (sas_device)
> > + if (sas_device) {
> > rc = sas_device->slot;
> > - else
> > + sas_device_put(sas_device);
> > + } else {
> > rc = -ENXIO;
> > + }
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > return rc;
> > }
> > --
> > 1.8.1
> >
>
>
>
> --
>
> Regards,
> Sreekanth

2015-07-21 07:05:11

by Calvin Owens

Subject: Re: [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage

On Monday 07/13 at 11:05 -0400, Joe Lawrence wrote:
> On 07/12/2015 12:24 AM, Calvin Owens wrote:
> > These objects can be referenced concurrently throughout the driver, we
> > need a way to make sure threads can't delete them out from under each
> > other. This patch adds the refcount, and refactors the code to use it.
> >
> > Additionally, we cannot iterate over the sas_device_list without
> > holding the lock, or we risk corrupting random memory if items are
> > added or deleted as we iterate. This patch refactors _scsih_probe_sas()
> > to use the sas_device_list in a safe way.
> >
> > Cc: Christoph Hellwig <[email protected]>
> > Cc: Bart Van Assche <[email protected]>
> > Signed-off-by: Calvin Owens <[email protected]>
> > ---
> > drivers/scsi/mpt2sas/mpt2sas_base.h | 22 +-
> > drivers/scsi/mpt2sas/mpt2sas_scsih.c | 434 ++++++++++++++++++++-----------
> > drivers/scsi/mpt2sas/mpt2sas_transport.c | 12 +-
> > 3 files changed, 315 insertions(+), 153 deletions(-)
>
> [ ... snip ... ]
>
> > @@ -2078,7 +2150,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
> > }
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> > sas_device_priv_data->sas_target->sas_address);
> > if (!sas_device) {
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > @@ -2116,13 +2188,14 @@ _scsih_slave_configure(struct scsi_device *sdev)
> > if (!ssp_target)
> > _scsih_display_sata_capabilities(ioc, handle, sdev);
> >
> > -
> > _scsih_change_queue_depth(sdev, qdepth);
> >
> > if (ssp_target) {
> > sas_read_port_mode_page(sdev);
> > _scsih_enable_tlr(ioc, sdev);
> > }
> > +
> > + sas_device_put(sas_device);
> > return 0;
> > }
>
> Hi Calvin,
>
> Any reason why this sas_device_put is placed outside the sas_device
> lock? Most other instances in this patch were called just before unlocking.

Thanks for looking at this.

I guess I thought that something below where we drop the sas_device_lock
referenced it, but it looks like nothing does. I'll move it up in v3.

I don't think it's strictly necessary that the put() happen under the
lock: the only way this could be the final put() is if both ->hostdata
and the sas_device_list had dropped their references, and in that case
it would be impossible to have a concurrent get(), since those are the
only two ways to lookup/get a sas_device. But absent any reason not to,
let's make it more consistent.
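
As a toy illustration of that argument (not driver code; struct dev,
struct adapter, lookup() and put() below are all made-up names), assume
the list and ->hostdata are the only two places a lookup can find the
object, and that each holds one reference. A temporary holder's put()
can then only be the final one after both are gone, at which point no
lookup can succeed:

```c
/* Userspace model only; every name here is hypothetical. */
#include <assert.h>
#include <stddef.h>

struct dev { int refs; int freed; };

/* The only two places a lookup can find the device in this model. */
struct adapter {
	struct dev *on_list;
	struct dev *hostdata;
};

/* Any successful lookup takes a reference. */
static struct dev *lookup(struct adapter *a)
{
	struct dev *d = a->on_list ? a->on_list : a->hostdata;

	if (d)
		d->refs++;
	return d;
}

static void put(struct dev *d)
{
	if (--d->refs == 0)
		d->freed = 1;	/* stands in for kfree() */
}

int demo_final_put(void)
{
	struct dev d = { 0, 0 };
	struct adapter a = { NULL, NULL };
	struct dev *mine;

	a.on_list = &d;
	d.refs++;		/* reference owned by the list */
	a.hostdata = &d;
	d.refs++;		/* reference owned by ->hostdata */

	mine = lookup(&a);	/* temporary holder */
	if (!mine || d.refs != 3)
		return 1;

	/* Both long-lived owners go away while we still hold ours. */
	a.on_list = NULL;
	put(&d);
	a.hostdata = NULL;
	put(&d);
	if (d.refs != 1 || d.freed)
		return 2;

	/* Nobody can find the device any more, so nobody can get() it. */
	if (lookup(&a) != NULL)
		return 3;

	put(mine);		/* final put, safe without the lock here */
	return d.freed ? 0 : 4;
}
```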

I'm really glad you pointed this out, because I realized I flubbed this
in _scsih_target_alloc(): I forgot to remove the sas_device_put() that
predates the ->hostdata lookup. I'll fix this in v3.

> BTW I attempted testing, but needed to port to mpt3 and ended up with a
> driver that didn't boot :( Hopefully I can retry later this week, or
> find an older mpt2 box lying around.

More testing would be fantastic if that's possible :)

Thanks very much,
Calvin

> -- Joe

2015-07-21 07:06:52

by Calvin Owens

Subject: Re: [PATCH 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage

On Sunday 07/12 at 23:52 -0700, Christoph Hellwig wrote:
> On Sat, Jul 11, 2015 at 09:24:55PM -0700, Calvin Owens wrote:
> > These objects can be referenced concurrently throughout the driver, we
> > need a way to make sure threads can't delete them out from under each
> > other. This patch adds the refcount, and refactors the code to use it.
> >
> > Additionally, we cannot iterate over the sas_device_list without
> > holding the lock, or we risk corrupting random memory if items are
> > added or deleted as we iterate. This patch refactors _scsih_probe_sas()
> > to use the sas_device_list in a safe way.
> >
> > Cc: Christoph Hellwig <[email protected]>
> > Cc: Bart Van Assche <[email protected]>
> > Signed-off-by: Calvin Owens <[email protected]>
> > ---
> > drivers/scsi/mpt2sas/mpt2sas_base.h | 22 +-
> > drivers/scsi/mpt2sas/mpt2sas_scsih.c | 434 ++++++++++++++++++++-----------
> > drivers/scsi/mpt2sas/mpt2sas_transport.c | 12 +-
> > 3 files changed, 315 insertions(+), 153 deletions(-)
> >
> > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
> > index caff8d1..78f41ac 100644
> > --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> > @@ -238,6 +238,7 @@
> > * @flags: MPT_TARGET_FLAGS_XXX flags
> > * @deleted: target flaged for deletion
> > * @tm_busy: target is busy with TM request.
> > + * @sdev: The sas_device associated with this target
> > */
> > struct MPT2SAS_TARGET {
> > struct scsi_target *starget;
> > @@ -248,6 +249,7 @@ struct MPT2SAS_TARGET {
> > u32 flags;
> > u8 deleted;
> > u8 tm_busy;
> > + struct _sas_device *sdev;
> > };
> >
> >
> > @@ -376,8 +378,24 @@ struct _sas_device {
> > u8 phy;
> > u8 responding;
> > u8 pfa_led_on;
> > + struct kref refcount;
> > };
> >
> > +static inline void sas_device_get(struct _sas_device *s)
> > +{
> > + kref_get(&s->refcount);
> > +}
> > +
> > +static inline void sas_device_free(struct kref *r)
> > +{
> > + kfree(container_of(r, struct _sas_device, refcount));
> > +}
> > +
> > +static inline void sas_device_put(struct _sas_device *s)
> > +{
> > + kref_put(&s->refcount, sas_device_free);
> > +}
> > +
> > /**
> > * struct _raid_device - raid volume link list
> > * @list: sas device list
> > @@ -1095,7 +1113,9 @@ struct _sas_node *mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
> > u16 handle);
> > struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct MPT2SAS_ADAPTER
> > *ioc, u64 sas_address);
> > -struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
> > +struct _sas_device *mpt2sas_get_sdev_by_addr(
> > + struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
> > +struct _sas_device *__mpt2sas_get_sdev_by_addr(
> > struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
> >
> > void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
> > diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > index 3f26147..fad80ce 100644
> > --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > @@ -526,8 +526,43 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
> > }
> > }
> >
> > +struct _sas_device *
> > +__mpt2sas_get_sdev_from_target(struct MPT2SAS_TARGET *tgt_priv)
> > +{
> > + struct _sas_device *ret;
> > +
>
> Does this need a:
>
> assert_spin_locked(&ioc->sas_device_lock);
>
> ?

Yeah: I'll add that.
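
For reference, the convention would look roughly like the sketch below.
This is a userspace approximation under stated assumptions: struct
toy_lock is a stand-in that only records whether the lock is held (it
is not a real spinlock), get_dev_locked() plays the role of the
"__"-prefixed helper, and all names are hypothetical.

```c
/* Userspace model only; every name here is hypothetical. */
#include <assert.h>
#include <stddef.h>

/* Stand-in for a spinlock: it only records whether it is held. */
struct toy_lock { int held; };

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = 1;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = 0;
}

struct dev { unsigned handle; int refs; };

struct adapter {
	struct toy_lock lock;
	struct dev *devs;
	size_t nr;
};

/* Caller must hold the lock; the assertion documents and enforces it. */
static struct dev *get_dev_locked(struct adapter *a, unsigned handle)
{
	size_t i;

	assert(a->lock.held);
	for (i = 0; i < a->nr; i++) {
		if (a->devs[i].handle == handle) {
			a->devs[i].refs++;	/* returned with a reference */
			return &a->devs[i];
		}
	}
	return NULL;
}

/* Wrapper for callers that do not already hold the lock. */
static struct dev *get_dev(struct adapter *a, unsigned handle)
{
	struct dev *d;

	toy_lock_acquire(&a->lock);
	d = get_dev_locked(a, handle);
	toy_lock_release(&a->lock);
	return d;
}

int demo_lookup(void)
{
	struct dev devs[] = { { 0x20, 1 }, { 0x21, 1 } };
	struct adapter a = { { 0 }, devs, 2 };
	struct dev *d;

	d = get_dev(&a, 0x21);
	if (d != &devs[1] || d->refs != 2)
		return 1;
	if (get_dev(&a, 0x99) != NULL)
		return 2;
	if (a.lock.held)
		return 3;

	/* A caller that already holds the lock uses the locked variant. */
	toy_lock_acquire(&a.lock);
	d = get_dev_locked(&a, 0x20);
	toy_lock_release(&a.lock);
	return (d == &devs[0] && d->refs == 2) ? 0 : 4;
}
```

Calling the locked variant without the lock trips the assertion right
away, which is the kind of misuse the check is meant to catch.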

Thanks very much,
Calvin

> Otherwise this looks sensible to me.

2015-08-01 05:04:50

by Calvin Owens

Subject: [PATCH v3 0/2] Fixes for memory corruption in mpt2sas

Hello all,

This patchset attempts to address problems we've been having with
panics due to memory corruption from the mpt2sas driver.

Changes are noted in the individual patches; I realized that putting
them in the cover letter was probably a bit confusing.

Thanks,
Calvin


Patches in this series:
[PATCH v3 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list
[PATCH v3 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage

Total diffstat:
drivers/scsi/mpt2sas/mpt2sas_base.h | 22 +-
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 579 ++++++++++++++++++++++---------
drivers/scsi/mpt2sas/mpt2sas_transport.c | 12 +-
3 files changed, 439 insertions(+), 174 deletions(-)

Diff showing changes v2 => v3:
http://jcalvinowens.github.io/stuff/mpt2sas-patchset-v2v3.patch

Diff showing changes v1 => v2:
http://jcalvinowens.github.io/stuff/mpt2sas-patchset-v1v2.patch

2015-08-01 05:05:30

by Calvin Owens

Subject: [PATCH v3 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage

These objects can be referenced concurrently throughout the driver, so
we need a way to make sure threads can't delete them out from under each
other. This patch adds the refcount, and refactors the code to use it.

Additionally, we cannot iterate over the sas_device_list without
holding the lock, or we risk corrupting random memory if items are
added or deleted as we iterate. This patch refactors _scsih_probe_sas()
to use the sas_device_list in a safe way.

Cc: Christoph Hellwig <[email protected]>
Cc: Bart Van Assche <[email protected]>
Cc: Joe Lawrence <[email protected]>
Signed-off-by: Calvin Owens <[email protected]>
---
Changes in v3:
* Drop the sas_device_lock while enabling devices, and leave the
sas_device object on the list, since it may need to be looked up
there while it is being enabled.
* Drop put() in _scsih_add_device(), because the ->hostdata now keeps a
reference (this was an oversight in v2).
* Be consistent about calling sas_device_put() while holding the
sas_device_lock where feasible.
* Take and assert_spin_locked() on the sas_device_lock from the newly
added __get_sdev_from_target(), add wrapper similar to other lookups
for callers which do not explicitly take the lock.

Changes in v2:
* Squished patches 1-3 into this one
* s/BUG_ON(!spin_is_locked/assert_spin_locked/g
* Store a pointer to the sas_device object in ->hostdata, to eliminate
the need for several lookups on the lists.

drivers/scsi/mpt2sas/mpt2sas_base.h | 22 +-
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 467 +++++++++++++++++++++----------
drivers/scsi/mpt2sas/mpt2sas_transport.c | 12 +-
3 files changed, 348 insertions(+), 153 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
index caff8d1..78f41ac 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -238,6 +238,7 @@
* @flags: MPT_TARGET_FLAGS_XXX flags
* @deleted: target flaged for deletion
* @tm_busy: target is busy with TM request.
+ * @sdev: The sas_device associated with this target
*/
struct MPT2SAS_TARGET {
struct scsi_target *starget;
@@ -248,6 +249,7 @@ struct MPT2SAS_TARGET {
u32 flags;
u8 deleted;
u8 tm_busy;
+ struct _sas_device *sdev;
};


@@ -376,8 +378,24 @@ struct _sas_device {
u8 phy;
u8 responding;
u8 pfa_led_on;
+ struct kref refcount;
};

+static inline void sas_device_get(struct _sas_device *s)
+{
+ kref_get(&s->refcount);
+}
+
+static inline void sas_device_free(struct kref *r)
+{
+ kfree(container_of(r, struct _sas_device, refcount));
+}
+
+static inline void sas_device_put(struct _sas_device *s)
+{
+ kref_put(&s->refcount, sas_device_free);
+}
+
/**
* struct _raid_device - raid volume link list
* @list: sas device list
@@ -1095,7 +1113,9 @@ struct _sas_node *mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
u16 handle);
struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct MPT2SAS_ADAPTER
*ioc, u64 sas_address);
-struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
+struct _sas_device *mpt2sas_get_sdev_by_addr(
+ struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
+struct _sas_device *__mpt2sas_get_sdev_by_addr(
struct MPT2SAS_ADAPTER *ioc, u64 sas_address);

void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 3f26147..a2af9a5 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -526,8 +526,61 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
}
}

+static struct _sas_device *
+__mpt2sas_get_sdev_from_target(struct MPT2SAS_ADAPTER *ioc,
+ struct MPT2SAS_TARGET *tgt_priv)
+{
+ struct _sas_device *ret;
+
+ assert_spin_locked(&ioc->sas_device_lock);
+
+ ret = tgt_priv->sdev;
+ if (ret)
+ sas_device_get(ret);
+
+ return ret;
+}
+
+static struct _sas_device *
+mpt2sas_get_sdev_from_target(struct MPT2SAS_ADAPTER *ioc,
+ struct MPT2SAS_TARGET *tgt_priv)
+{
+ struct _sas_device *ret;
+ unsigned long flags;
+
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ ret = __mpt2sas_get_sdev_from_target(ioc, tgt_priv);
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+ return ret;
+}
+
+
+struct _sas_device *
+__mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
+ u64 sas_address)
+{
+ struct _sas_device *sas_device;
+
+ assert_spin_locked(&ioc->sas_device_lock);
+
+ list_for_each_entry(sas_device, &ioc->sas_device_list, list)
+ if (sas_device->sas_address == sas_address)
+ goto found_device;
+
+ list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
+ if (sas_device->sas_address == sas_address)
+ goto found_device;
+
+ return NULL;
+
+found_device:
+ sas_device_get(sas_device);
+ return sas_device;
+}
+
/**
- * mpt2sas_scsih_sas_device_find_by_sas_address - sas device search
+ * mpt2sas_get_sdev_by_addr - sas device search
* @ioc: per adapter object
* @sas_address: sas address
* Context: Calling function should acquire ioc->sas_device_lock
@@ -536,24 +589,44 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
* object.
*/
struct _sas_device *
-mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
+mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
u64 sas_address)
{
struct _sas_device *sas_device;
+ unsigned long flags;
+
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
+ sas_address);
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+ return sas_device;
+}
+
+static struct _sas_device *
+__mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
+{
+ struct _sas_device *sas_device;
+
+ assert_spin_locked(&ioc->sas_device_lock);

list_for_each_entry(sas_device, &ioc->sas_device_list, list)
- if (sas_device->sas_address == sas_address)
- return sas_device;
+ if (sas_device->handle == handle)
+ goto found_device;

list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
- if (sas_device->sas_address == sas_address)
- return sas_device;
+ if (sas_device->handle == handle)
+ goto found_device;

return NULL;
+
+found_device:
+ sas_device_get(sas_device);
+ return sas_device;
}

/**
- * _scsih_sas_device_find_by_handle - sas device search
+ * mpt2sas_get_sdev_by_handle - sas device search
* @ioc: per adapter object
* @handle: sas device handle (assigned by firmware)
* Context: Calling function should acquire ioc->sas_device_lock
@@ -562,19 +635,16 @@ mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
* object.
*/
static struct _sas_device *
-_scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
+mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
{
struct _sas_device *sas_device;
+ unsigned long flags;

- list_for_each_entry(sas_device, &ioc->sas_device_list, list)
- if (sas_device->handle == handle)
- return sas_device;
-
- list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
- if (sas_device->handle == handle)
- return sas_device;
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

- return NULL;
+ return sas_device;
}

/**
@@ -583,7 +653,7 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
* @sas_device: the sas_device object
* Context: This function will acquire ioc->sas_device_lock.
*
- * Removing object and freeing associated memory from the ioc->sas_device_list.
+ * If sas_device is on the list, remove it and decrement its reference count.
*/
static void
_scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
@@ -594,9 +664,15 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
if (!sas_device)
return;

+ /*
+ * The lock serializes access to the list, but we still need to verify
+ * that nobody removed the entry while we were waiting on the lock.
+ */
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- list_del(&sas_device->list);
- kfree(sas_device);
+ if (!list_empty(&sas_device->list)) {
+ list_del_init(&sas_device->list);
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}

@@ -620,6 +696,7 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
sas_device->handle, (unsigned long long)sas_device->sas_address));

spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ sas_device_get(sas_device);
list_add_tail(&sas_device->list, &ioc->sas_device_list);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

@@ -659,6 +736,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
sas_device->handle, (unsigned long long)sas_device->sas_address));

spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ sas_device_get(sas_device);
list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
_scsih_determine_boot_device(ioc, sas_device, 0);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -1208,12 +1286,14 @@ _scsih_change_queue_depth(struct scsi_device *sdev, int qdepth)
goto not_sata;
if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))
goto not_sata;
+
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- sas_device_priv_data->sas_target->sas_address);
- if (sas_device && sas_device->device_info &
- MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
+ sas_device = __mpt2sas_get_sdev_from_target(ioc, sas_target_priv_data);
+ if (sas_device && sas_device->device_info
+ & MPI2_SAS_DEVICE_INFO_SATA_DEVICE) {
max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

not_sata:
@@ -1271,18 +1351,20 @@ _scsih_target_alloc(struct scsi_target *starget)
/* sas/sata devices */
spin_lock_irqsave(&ioc->sas_device_lock, flags);
rphy = dev_to_rphy(starget->dev.parent);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
rphy->identify.sas_address);

if (sas_device) {
sas_target_priv_data->handle = sas_device->handle;
sas_target_priv_data->sas_address = sas_device->sas_address;
+ sas_target_priv_data->sdev = sas_device;
sas_device->starget = starget;
sas_device->id = starget->id;
sas_device->channel = starget->channel;
if (test_bit(sas_device->handle, ioc->pd_handles))
sas_target_priv_data->flags |=
MPT_TARGET_FLAGS_RAID_COMPONENT;
+
}
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

@@ -1324,13 +1406,14 @@ _scsih_target_destroy(struct scsi_target *starget)

spin_lock_irqsave(&ioc->sas_device_lock, flags);
rphy = dev_to_rphy(starget->dev.parent);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- rphy->identify.sas_address);
+ sas_device = __mpt2sas_get_sdev_from_target(ioc, sas_target_priv_data);
if (sas_device && (sas_device->starget == starget) &&
(sas_device->id == starget->id) &&
(sas_device->channel == starget->channel))
sas_device->starget = NULL;

+ if (sas_device)
+ sas_device_put(sas_device);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

out:
@@ -1386,7 +1469,7 @@ _scsih_slave_alloc(struct scsi_device *sdev)

if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
sas_target_priv_data->sas_address);
if (sas_device && (sas_device->starget == NULL)) {
sdev_printk(KERN_INFO, sdev,
@@ -1394,6 +1477,10 @@ _scsih_slave_alloc(struct scsi_device *sdev)
__func__, __LINE__);
sas_device->starget = starget;
}
+
+ if (sas_device)
+ sas_device_put(sas_device);
+
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}

@@ -1428,10 +1515,13 @@ _scsih_slave_destroy(struct scsi_device *sdev)

if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- sas_target_priv_data->sas_address);
+ sas_device = __mpt2sas_get_sdev_from_target(ioc,
+ sas_target_priv_data);
if (sas_device && !sas_target_priv_data->num_luns)
sas_device->starget = NULL;
+
+ if (sas_device)
+ sas_device_put(sas_device);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}

@@ -2078,7 +2168,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
}

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
sas_device_priv_data->sas_target->sas_address);
if (!sas_device) {
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -2112,17 +2202,18 @@ _scsih_slave_configure(struct scsi_device *sdev)
(unsigned long long) sas_device->enclosure_logical_id,
sas_device->slot);

+ sas_device_put(sas_device);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
if (!ssp_target)
_scsih_display_sata_capabilities(ioc, handle, sdev);

-
_scsih_change_queue_depth(sdev, qdepth);

if (ssp_target) {
sas_read_port_mode_page(sdev);
_scsih_enable_tlr(ioc, sdev);
}
+
return 0;
}

@@ -2509,8 +2600,7 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
device_str, (unsigned long long)priv_target->sas_address);
} else {
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- priv_target->sas_address);
+ sas_device = __mpt2sas_get_sdev_from_target(ioc, priv_target);
if (sas_device) {
if (priv_target->flags &
MPT_TARGET_FLAGS_RAID_COMPONENT) {
@@ -2529,6 +2619,8 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
"enclosure_logical_id(0x%016llx), slot(%d)\n",
(unsigned long long)sas_device->enclosure_logical_id,
sas_device->slot);
+
+ sas_device_put(sas_device);
}
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}
@@ -2604,12 +2696,12 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
{
struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
struct MPT2SAS_DEVICE *sas_device_priv_data;
- struct _sas_device *sas_device;
- unsigned long flags;
+ struct _sas_device *sas_device = NULL;
u16 handle;
int r;

struct scsi_target *starget = scmd->device->sdev_target;
+ struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;

starget_printk(KERN_INFO, starget, "attempting device reset! "
"scmd(%p)\n", scmd);
@@ -2629,12 +2721,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
handle = 0;
if (sas_device_priv_data->sas_target->flags &
MPT_TARGET_FLAGS_RAID_COMPONENT) {
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc,
- sas_device_priv_data->sas_target->handle);
+ sas_device = mpt2sas_get_sdev_from_target(ioc,
+ target_priv_data);
if (sas_device)
handle = sas_device->volume_handle;
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
} else
handle = sas_device_priv_data->sas_target->handle;

@@ -2651,6 +2741,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
out:
sdev_printk(KERN_INFO, scmd->device, "device reset: %s scmd(%p)\n",
((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
+
+ if (sas_device)
+ sas_device_put(sas_device);
+
return r;
}

@@ -2665,11 +2759,11 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
{
struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
struct MPT2SAS_DEVICE *sas_device_priv_data;
- struct _sas_device *sas_device;
- unsigned long flags;
+ struct _sas_device *sas_device = NULL;
u16 handle;
int r;
struct scsi_target *starget = scmd->device->sdev_target;
+ struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;

starget_printk(KERN_INFO, starget, "attempting target reset! "
"scmd(%p)\n", scmd);
@@ -2689,12 +2783,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
handle = 0;
if (sas_device_priv_data->sas_target->flags &
MPT_TARGET_FLAGS_RAID_COMPONENT) {
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc,
- sas_device_priv_data->sas_target->handle);
+ sas_device = mpt2sas_get_sdev_from_target(ioc,
+ target_priv_data);
if (sas_device)
handle = sas_device->volume_handle;
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
} else
handle = sas_device_priv_data->sas_target->handle;

@@ -2711,6 +2803,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
out:
starget_printk(KERN_INFO, starget, "target reset: %s scmd(%p)\n",
((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
+
+ if (sas_device)
+ sas_device_put(sas_device);
+
return r;
}

@@ -3002,15 +3098,15 @@ _scsih_block_io_to_children_attached_to_ex(struct MPT2SAS_ADAPTER *ioc,

list_for_each_entry(mpt2sas_port,
&sas_expander->sas_port_list, port_list) {
- if (mpt2sas_port->remote_identify.device_type ==
- SAS_END_DEVICE) {
+ if (mpt2sas_port->remote_identify.device_type == SAS_END_DEVICE) {
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device =
- mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- mpt2sas_port->remote_identify.sas_address);
- if (sas_device)
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
+ mpt2sas_port->remote_identify.sas_address);
+ if (sas_device) {
set_bit(sas_device->handle,
- ioc->blocking_handles);
+ ioc->blocking_handles);
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}
}
@@ -3080,7 +3176,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
{
Mpi2SCSITaskManagementRequest_t *mpi_request;
u16 smid;
- struct _sas_device *sas_device;
+ struct _sas_device *sas_device = NULL;
struct MPT2SAS_TARGET *sas_target_priv_data = NULL;
u64 sas_address = 0;
unsigned long flags;
@@ -3110,7 +3206,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
return;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
if (sas_device && sas_device->starget &&
sas_device->starget->hostdata) {
sas_target_priv_data = sas_device->starget->hostdata;
@@ -3131,14 +3227,14 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
if (!smid) {
delayed_tr = kzalloc(sizeof(*delayed_tr), GFP_ATOMIC);
if (!delayed_tr)
- return;
+ goto out;
INIT_LIST_HEAD(&delayed_tr->list);
delayed_tr->handle = handle;
list_add_tail(&delayed_tr->list, &ioc->delayed_tr_list);
dewtprintk(ioc, printk(MPT2SAS_INFO_FMT
"DELAYED:tr:handle(0x%04x), (open)\n",
ioc->name, handle));
- return;
+ goto out;
}

dewtprintk(ioc, printk(MPT2SAS_INFO_FMT "tr_send:handle(0x%04x), "
@@ -3150,6 +3246,9 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
mpi_request->DevHandle = cpu_to_le16(handle);
mpi_request->TaskType = MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET;
mpt2sas_base_put_smid_hi_priority(ioc, smid);
+out:
+ if (sas_device)
+ sas_device_put(sas_device);
}


@@ -4068,7 +4167,6 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
char *desc_scsi_state = ioc->tmp_string;
u32 log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
struct _sas_device *sas_device = NULL;
- unsigned long flags;
struct scsi_target *starget = scmd->device->sdev_target;
struct MPT2SAS_TARGET *priv_target = starget->hostdata;
char *device_str = NULL;
@@ -4200,9 +4298,7 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
printk(MPT2SAS_WARN_FMT "\t%s wwid(0x%016llx)\n", ioc->name,
device_str, (unsigned long long)priv_target->sas_address);
} else {
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- priv_target->sas_address);
+ sas_device = mpt2sas_get_sdev_from_target(ioc, priv_target);
if (sas_device) {
printk(MPT2SAS_WARN_FMT "\tsas_address(0x%016llx), "
"phy(%d)\n", ioc->name, sas_device->sas_address,
@@ -4211,8 +4307,9 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
"\tenclosure_logical_id(0x%016llx), slot(%d)\n",
ioc->name, sas_device->enclosure_logical_id,
sas_device->slot);
+
+ sas_device_put(sas_device);
}
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}

printk(MPT2SAS_WARN_FMT "\thandle(0x%04x), ioc_status(%s)(0x%04x), "
@@ -4259,7 +4356,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
Mpi2SepRequest_t mpi_request;
struct _sas_device *sas_device;

- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
if (!sas_device)
return;

@@ -4274,7 +4371,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
&mpi_request)) != 0) {
printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n", ioc->name,
__FILE__, __LINE__, __func__);
- return;
+ goto out;
}
sas_device->pfa_led_on = 1;

@@ -4284,8 +4381,10 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
"enclosure_processor: ioc_status (0x%04x), loginfo(0x%08x)\n",
ioc->name, le16_to_cpu(mpi_reply.IOCStatus),
le32_to_cpu(mpi_reply.IOCLogInfo)));
- return;
+ goto out;
}
+out:
+ sas_device_put(sas_device);
}

/**
@@ -4370,19 +4469,17 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)

/* only handle non-raid devices */
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
if (!sas_device) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
+ goto out_unlock;
}
starget = sas_device->starget;
sas_target_priv_data = starget->hostdata;

if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_RAID_COMPONENT) ||
- ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
- }
+ ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)))
+ goto out_unlock;
+
starget_printk(KERN_WARNING, starget, "predicted fault\n");
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

@@ -4396,7 +4493,7 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
if (!event_reply) {
printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
ioc->name, __FILE__, __LINE__, __func__);
- return;
+ goto out;
}

event_reply->Function = MPI2_FUNCTION_EVENT_NOTIFICATION;
@@ -4413,6 +4510,14 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
event_data->SASAddress = cpu_to_le64(sas_target_priv_data->sas_address);
mpt2sas_ctl_add_to_event_log(ioc, event_reply);
kfree(event_reply);
+out:
+ if (sas_device)
+ sas_device_put(sas_device);
+ return;
+
+out_unlock:
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+ goto out;
}

/**
@@ -5148,14 +5253,13 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)

spin_lock_irqsave(&ioc->sas_device_lock, flags);
sas_address = le64_to_cpu(sas_device_pg0.SASAddress);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
sas_address);

if (!sas_device) {
printk(MPT2SAS_ERR_FMT "device is not present "
"handle(0x%04x), no sas_device!!!\n", ioc->name, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
+ goto out_unlock;
}

if (unlikely(sas_device->handle != handle)) {
@@ -5172,19 +5276,22 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
MPI2_SAS_DEVICE0_FLAGS_DEVICE_PRESENT)) {
printk(MPT2SAS_ERR_FMT "device is not present "
"handle(0x%04x), flags!!!\n", ioc->name, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
+ goto out_unlock;
}

/* check if there were any issues with discovery */
if (_scsih_check_access_status(ioc, sas_address, handle,
- sas_device_pg0.AccessStatus)) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
- }
+ sas_device_pg0.AccessStatus))
+ goto out_unlock;
+
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
_scsih_ublock_io_device(ioc, sas_address);
+ return;

+out_unlock:
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+ if (sas_device)
+ sas_device_put(sas_device);
}

/**
@@ -5208,7 +5315,6 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
u32 ioc_status;
__le64 sas_address;
u32 device_info;
- unsigned long flags;

if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
@@ -5250,14 +5356,13 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
return -1;
}

-
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_get_sdev_by_addr(ioc,
sas_address);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

- if (sas_device)
+ if (sas_device) {
+ sas_device_put(sas_device);
return 0;
+ }

sas_device = kzalloc(sizeof(struct _sas_device),
GFP_KERNEL);
@@ -5267,6 +5372,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
return -1;
}

+ kref_init(&sas_device->refcount);
sas_device->handle = handle;
if (_scsih_get_sas_address(ioc, le16_to_cpu
(sas_device_pg0.ParentDevHandle),
@@ -5344,7 +5450,6 @@ _scsih_remove_device(struct MPT2SAS_ADAPTER *ioc,
"handle(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
sas_device->handle, (unsigned long long)
sas_device->sas_address));
- kfree(sas_device);
}
/**
* _scsih_device_remove_by_handle - removing device object by handle
@@ -5363,12 +5468,17 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
return;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
- if (sas_device)
- list_del(&sas_device->list);
+ sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
+ if (sas_device) {
+ list_del_init(&sas_device->list);
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+
+ if (sas_device) {
_scsih_remove_device(ioc, sas_device);
+ sas_device_put(sas_device);
+ }
}

/**
@@ -5389,13 +5499,17 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
return;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- sas_address);
- if (sas_device)
- list_del(&sas_device->list);
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc, sas_address);
+ if (sas_device) {
+ list_del_init(&sas_device->list);
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+
+ if (sas_device) {
_scsih_remove_device(ioc, sas_device);
+ sas_device_put(sas_device);
+ }
}
#ifdef CONFIG_SCSI_MPT2SAS_LOGGING
/**
@@ -5716,26 +5830,28 @@ _scsih_sas_device_status_change_event(struct MPT2SAS_ADAPTER *ioc,

spin_lock_irqsave(&ioc->sas_device_lock, flags);
sas_address = le64_to_cpu(event_data->SASAddress);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
sas_address);

- if (!sas_device || !sas_device->starget) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
- }
+ if (!sas_device || !sas_device->starget)
+ goto out;

target_priv_data = sas_device->starget->hostdata;
- if (!target_priv_data) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
- }
+ if (!target_priv_data)
+ goto out;

if (event_data->ReasonCode ==
MPI2_EVENT_SAS_DEV_STAT_RC_INTERNAL_DEVICE_RESET)
target_priv_data->tm_busy = 1;
else
target_priv_data->tm_busy = 0;
+
+out:
+ if (sas_device)
+ sas_device_put(sas_device);
+
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
}

#ifdef CONFIG_SCSI_MPT2SAS_LOGGING
@@ -6123,7 +6239,7 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
u16 handle = le16_to_cpu(element->PhysDiskDevHandle);

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
if (sas_device) {
sas_device->volume_handle = 0;
sas_device->volume_wwid = 0;
@@ -6142,6 +6258,8 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
/* exposing raid component */
if (starget)
starget_for_each_device(starget, NULL, _scsih_reprobe_lun);
+
+ sas_device_put(sas_device);
}

/**
@@ -6170,7 +6288,7 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
&volume_wwid);

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
if (sas_device) {
set_bit(handle, ioc->pd_handles);
if (sas_device->starget && sas_device->starget->hostdata) {
@@ -6189,6 +6307,8 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
/* hiding raid component */
if (starget)
starget_for_each_device(starget, (void *)1, _scsih_reprobe_lun);
+
+ sas_device_put(sas_device);
}

/**
@@ -6221,7 +6341,6 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
Mpi2EventIrConfigElement_t *element)
{
struct _sas_device *sas_device;
- unsigned long flags;
u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
Mpi2ConfigReply_t mpi_reply;
Mpi2SasDevicePage0_t sas_device_pg0;
@@ -6231,11 +6350,11 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,

set_bit(handle, ioc->pd_handles);

- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+ sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
+ if (sas_device) {
+ sas_device_put(sas_device);
return;
+ }

if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
@@ -6509,7 +6628,6 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
u16 handle, parent_handle;
u32 state;
struct _sas_device *sas_device;
- unsigned long flags;
Mpi2ConfigReply_t mpi_reply;
Mpi2SasDevicePage0_t sas_device_pg0;
u32 ioc_status;
@@ -6542,12 +6660,11 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
if (!ioc->is_warpdrive)
set_bit(handle, ioc->pd_handles);

- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-
- if (sas_device)
+ sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
+ if (sas_device) {
+ sas_device_put(sas_device);
return;
+ }

if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
&sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
@@ -7015,6 +7132,7 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
struct _raid_device *raid_device, *raid_device_next;
struct list_head tmp_list;
unsigned long flags;
+ LIST_HEAD(head);

printk(MPT2SAS_INFO_FMT "removing unresponding devices: start\n",
ioc->name);
@@ -7022,14 +7140,29 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
/* removing unresponding end devices */
printk(MPT2SAS_INFO_FMT "removing unresponding devices: end-devices\n",
ioc->name);
+
+ /*
+ * Iterate, pulling off devices marked as non-responding. We become the
+ * owner for the reference the list had on any object we prune.
+ */
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
list_for_each_entry_safe(sas_device, sas_device_next,
- &ioc->sas_device_list, list) {
+ &ioc->sas_device_list, list) {
if (!sas_device->responding)
- mpt2sas_device_remove_by_sas_address(ioc,
- sas_device->sas_address);
+ list_move_tail(&sas_device->list, &head);
else
sas_device->responding = 0;
}
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+ /*
+ * Now, uninitialize and remove the unresponding devices we pruned.
+ */
+ list_for_each_entry_safe(sas_device, sas_device_next, &head, list) {
+ _scsih_remove_device(ioc, sas_device);
+ list_del_init(&sas_device->list);
+ sas_device_put(sas_device);
+ }

/* removing unresponding volumes */
if (ioc->ir_firmware) {
@@ -7179,11 +7312,11 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
}
phys_disk_num = pd_pg0.PhysDiskNum;
handle = le16_to_cpu(pd_pg0.DevHandle);
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+ sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
+ if (sas_device) {
+ sas_device_put(sas_device);
continue;
+ }
if (mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
&sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
handle) != 0)
@@ -7302,12 +7435,12 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
if (!(_scsih_is_end_device(
le32_to_cpu(sas_device_pg0.DeviceInfo))))
continue;
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_get_sdev_by_addr(ioc,
le64_to_cpu(sas_device_pg0.SASAddress));
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+ if (sas_device) {
+ sas_device_put(sas_device);
continue;
+ }
parent_handle = le16_to_cpu(sas_device_pg0.ParentDevHandle);
if (!_scsih_get_sas_address(ioc, parent_handle, &sas_address)) {
printk(MPT2SAS_INFO_FMT "\tBEFORE adding end device: "
@@ -7966,6 +8099,48 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
}
}

+static struct _sas_device *get_next_sas_device(struct MPT2SAS_ADAPTER *ioc)
+{
+ struct _sas_device *sas_device = NULL;
+ unsigned long flags;
+
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ if (!list_empty(&ioc->sas_device_init_list)) {
+ sas_device = list_first_entry(&ioc->sas_device_init_list,
+ struct _sas_device, list);
+ sas_device_get(sas_device);
+ }
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+ return sas_device;
+}
+
+static void sas_device_make_active(struct MPT2SAS_ADAPTER *ioc,
+ struct _sas_device *sas_device)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+
+ /*
+ * Since we dropped the lock during the call to port_add(), we need to
+ * be careful here that somebody else didn't move or delete this item
+ * while we were busy with other things.
+ *
+ * If it was on the list, we need a put() for the reference the list
+ * had. Either way, we need a get() for the destination list.
+ */
+ if (!list_empty(&sas_device->list)) {
+ list_del_init(&sas_device->list);
+ sas_device_put(sas_device);
+ }
+
+ sas_device_get(sas_device);
+ list_add_tail(&sas_device->list, &ioc->sas_device_list);
+
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+}
+
/**
* _scsih_probe_sas - reporting sas devices to sas transport
* @ioc: per adapter object
@@ -7975,34 +8150,30 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
static void
_scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
{
- struct _sas_device *sas_device, *next;
- unsigned long flags;
-
- /* SAS Device List */
- list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
- list) {
+ struct _sas_device *sas_device;

- if (ioc->hide_drives)
- continue;
+ if (ioc->hide_drives)
+ return;

+ while ((sas_device = get_next_sas_device(ioc))) {
if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
- sas_device->sas_address_parent)) {
- list_del(&sas_device->list);
- kfree(sas_device);
+ sas_device->sas_address_parent)) {
+ _scsih_sas_device_remove(ioc, sas_device);
+ sas_device_put(sas_device);
continue;
} else if (!sas_device->starget) {
if (!ioc->is_driver_loading) {
mpt2sas_transport_port_remove(ioc,
- sas_device->sas_address,
- sas_device->sas_address_parent);
- list_del(&sas_device->list);
- kfree(sas_device);
+ sas_device->sas_address,
+ sas_device->sas_address_parent);
+ _scsih_sas_device_remove(ioc, sas_device);
+ sas_device_put(sas_device);
continue;
}
}
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- list_move_tail(&sas_device->list, &ioc->sas_device_list);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+ sas_device_make_active(ioc, sas_device);
+ sas_device_put(sas_device);
}
}

diff --git a/drivers/scsi/mpt2sas/mpt2sas_transport.c b/drivers/scsi/mpt2sas/mpt2sas_transport.c
index ff2500a..af86800 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_transport.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_transport.c
@@ -1323,15 +1323,17 @@ _transport_get_enclosure_identifier(struct sas_rphy *rphy, u64 *identifier)
int rc;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
rphy->identify.sas_address);
if (sas_device) {
*identifier = sas_device->enclosure_logical_id;
rc = 0;
+ sas_device_put(sas_device);
} else {
*identifier = 0;
rc = -ENXIO;
}
+
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
return rc;
}
@@ -1351,12 +1353,14 @@ _transport_get_bay_identifier(struct sas_rphy *rphy)
int rc;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
rphy->identify.sas_address);
- if (sas_device)
+ if (sas_device) {
rc = sas_device->slot;
- else
+ sas_device_put(sas_device);
+ } else {
rc = -ENXIO;
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
return rc;
}
--
1.8.5.6
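
For readers following the _scsih_probe_sas() rework above, here is a minimal user-space sketch of the same pattern: peek at the first entry under the lock while taking a reference, work on it unlocked, then move it to the active list. It is not driver code. All toy_* names, the tiny list helpers and the simulated "racing" removal are made up for illustration, and locking is only indicated by comments since the sketch is single-threaded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *prev, *next; };

static void list_init(struct list_head *h) { h->prev = h->next = h; }
static int list_is_empty(const struct list_head *h) { return h->next == h; }

static void list_append(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_unlink_init(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

struct toy_dev {
	struct list_head list;
	int refs;
	int handle;
};

static struct list_head init_list = { &init_list, &init_list };
static struct list_head active_list = { &active_list, &active_list };
static int toy_frees;	/* counts frees so a caller can check for leaks */

#define first_dev(head) \
	((struct toy_dev *)((char *)(head)->next - offsetof(struct toy_dev, list)))

static void toy_get(struct toy_dev *d) { d->refs++; }

static void toy_put(struct toy_dev *d)
{
	assert(d->refs > 0);
	if (--d->refs == 0) {
		free(d);
		toy_frees++;
	}
}

/* Create a device; its single initial reference is owned by init_list. */
static int toy_add(int handle)
{
	struct toy_dev *d = calloc(1, sizeof(*d));

	if (!d)
		return -1;
	d->refs = 1;
	d->handle = handle;
	list_append(&d->list, &init_list);
	return 0;
}

/* Peek at the first pending device and take a reference (lock held here). */
static struct toy_dev *toy_next(void)
{
	struct toy_dev *d = NULL;

	if (!list_is_empty(&init_list)) {
		d = first_dev(&init_list);
		toy_get(d);
	}
	return d;
}

/* What a concurrent removal does: unlink and drop the list's reference. */
static void toy_remove(struct toy_dev *d)
{
	if (!list_is_empty(&d->list)) {
		list_unlink_init(&d->list);
		toy_put(d);
	}
}

/* Move to the active list, whether or not somebody unlinked it meanwhile. */
static void toy_make_active(struct toy_dev *d)
{
	toy_remove(d);		/* put for the old list, if still on it */
	toy_get(d);		/* get for the destination list */
	list_append(&d->list, &active_list);
}

/* Drain init_list; the device with racing_handle is removed mid-probe. */
static int toy_probe(int racing_handle)
{
	struct toy_dev *d;
	int activated = 0;

	while ((d = toy_next())) {
		if (d->handle == racing_handle)
			toy_remove(d);	/* simulated racing event handler */
		toy_make_active(d);
		toy_put(d);		/* drop the reference from toy_next() */
		activated++;
	}
	return activated;
}

/* Release everything on the active list; returns how many were dropped. */
static int toy_teardown(void)
{
	int n = 0;

	while (!list_is_empty(&active_list)) {
		struct toy_dev *d = first_dev(&active_list);

		list_unlink_init(&d->list);
		toy_put(d);
		n++;
	}
	return n;
}
```

The point of the sketch is that the simulated removal in the middle of toy_probe() only drops the list's reference, so the object stays valid until the prober does its own put.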

2015-08-01 05:04:55

by Calvin Owens

Subject: [PATCH v3 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage

The fw_event_work struct is concurrently referenced at shutdown, so
add a refcount to protect it, and refactor the code to use it.

Additionally, refactor _scsih_fw_event_cleanup_queue() such that it
no longer iterates over the list without holding the lock, since
_firmware_event_work() concurrently deletes items from the list.

Cc: Christoph Hellwig <[email protected]>
Signed-off-by: Calvin Owens <[email protected]>
---

Changes in v3:
* Add a break condition to the REMOVE_UNRESPONDING_DEVICES fw_event,
which can otherwise loop on ssleep() indefinitely (5+ minutes at least)
when the driver is unloaded. I don't think anything prevented this
before, but taking the fw_event object off the list at the top of
_firmware_event_work() seems to have made it more likely to happen.
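
To illustrate the break condition described above, here is a small user-space sketch. It is not the driver code: struct toy_host, the tick counters and all toy_* names are invented for the example, and each loop iteration stands in for one ssleep(1).

```c
#include <assert.h>
#include <stdbool.h>

struct toy_host {
	int recovery_ticks;	/* polls until "recovery" ends; < 0 means never */
	int unload_at_tick;	/* poll on which remove_host is set; < 0 means never */
	bool remove_host;
};

static bool toy_in_recovery(const struct toy_host *h)
{
	return h->recovery_ticks != 0;
}

/* One simulated one-second sleep: time passes, state may change. */
static void toy_tick(struct toy_host *h, int tick)
{
	if (h->recovery_ticks > 0)
		h->recovery_ticks--;
	if (tick == h->unload_at_tick)
		h->remove_host = true;
}

/* Returns true if recovery finished, false if we bailed out for unload. */
static bool toy_wait_for_recovery(struct toy_host *h)
{
	int tick = 0;

	assert(h);
	while (toy_in_recovery(h)) {
		/* Without this check the loop never ends if recovery never does. */
		if (h->remove_host)
			return false;
		toy_tick(h, tick++);
	}
	return true;
}
```

With recovery_ticks set to a negative value the loop only terminates because of the remove_host check, which is the case the v3 change is meant to cover.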

Changes in v2:
* Squished patches 4-6 into one patch
* Remove the fw_event from fw_event_list at the start of
_firmware_event_work()
* Explicitly separate fw_event_list removal from fw_event freeing
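
As a rough illustration of the reference ownership this patch sets up (one reference for the creator, one for the list, one for the queued work), here is a user-space sketch. It is an approximation under stated assumptions rather than the driver's code: the toy_event_* names are hypothetical, on_list stands in for list membership, and work_pending stands in for a queued delayed_work that cancel_delayed_work_sync() would report as cancelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_event {
	int refs;
	bool on_list;		/* stands in for !list_empty(&fw_event->list) */
	bool work_pending;	/* stands in for a queued delayed_work */
};

static int toy_event_frees;

static void toy_event_get(struct toy_event *e) { e->refs++; }

static void toy_event_put(struct toy_event *e)
{
	assert(e->refs > 0);
	if (--e->refs == 0) {
		free(e);
		toy_event_frees++;
	}
}

static struct toy_event *toy_event_alloc(void)
{
	struct toy_event *e = calloc(1, sizeof(*e));

	if (e)
		e->refs = 1;	/* the creator's reference */
	return e;
}

/* Queue the event: one reference for the list, one for the work item. */
static void toy_event_add(struct toy_event *e)
{
	toy_event_get(e);
	e->on_list = true;
	toy_event_get(e);
	e->work_pending = true;
}

/* The work function: leave the list first, handle, then drop the work's ref. */
static void toy_event_work(struct toy_event *e)
{
	e->work_pending = false;
	if (e->on_list) {
		e->on_list = false;
		toy_event_put(e);	/* the list's reference */
	}
	/* ... event handling would go here ... */
	toy_event_put(e);		/* the work's reference */
}

/* Cleanup: dequeue, cancel the work if it never ran, drop both references. */
static void toy_event_cleanup(struct toy_event *e)
{
	if (!e->on_list)
		return;
	e->on_list = false;		/* we now own the list's reference */
	if (e->work_pending) {		/* "cancel" succeeded: work never ran */
		e->work_pending = false;
		toy_event_put(e);	/* the work's reference */
	}
	toy_event_put(e);		/* the list's reference */
}

/* Returns how many times the event was freed; exactly 1 means balanced. */
static int toy_event_demo(bool run_before_cleanup)
{
	int before = toy_event_frees;
	struct toy_event *e = toy_event_alloc();

	if (!e)
		return -1;
	toy_event_add(e);
	toy_event_put(e);		/* creator is done with it */
	if (run_before_cleanup)
		toy_event_work(e);
	else
		toy_event_cleanup(e);
	return toy_event_frees - before;
}
```

Either path ends with exactly one free, which is the property the get/put pairs in the diff below are meant to guarantee.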

drivers/scsi/mpt2sas/mpt2sas_scsih.c | 112 ++++++++++++++++++++++++++++-------
1 file changed, 91 insertions(+), 21 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index a2af9a5..cdc647d 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -176,9 +176,37 @@ struct fw_event_work {
u8 VP_ID;
u8 ignore;
u16 event;
+ struct kref refcount;
char event_data[0] __aligned(4);
};

+static void fw_event_work_free(struct kref *r)
+{
+ kfree(container_of(r, struct fw_event_work, refcount));
+}
+
+static void fw_event_work_get(struct fw_event_work *fw_work)
+{
+ kref_get(&fw_work->refcount);
+}
+
+static void fw_event_work_put(struct fw_event_work *fw_work)
+{
+ kref_put(&fw_work->refcount, fw_event_work_free);
+}
+
+static struct fw_event_work *alloc_fw_event_work(int len)
+{
+ struct fw_event_work *fw_event;
+
+ fw_event = kzalloc(sizeof(*fw_event) + len, GFP_ATOMIC);
+ if (!fw_event)
+ return NULL;
+
+ kref_init(&fw_event->refcount);
+ return fw_event;
+}
+
/* raid transport support */
static struct raid_template *mpt2sas_raid_template;

@@ -2864,36 +2892,39 @@ _scsih_fw_event_add(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work *fw_event)
return;

spin_lock_irqsave(&ioc->fw_event_lock, flags);
+ fw_event_work_get(fw_event);
list_add_tail(&fw_event->list, &ioc->fw_event_list);
INIT_DELAYED_WORK(&fw_event->delayed_work, _firmware_event_work);
+ fw_event_work_get(fw_event);
queue_delayed_work(ioc->firmware_event_thread,
&fw_event->delayed_work, 0);
spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
}

/**
- * _scsih_fw_event_free - delete fw_event
+ * _scsih_fw_event_del_from_list - delete fw_event from the list
* @ioc: per adapter object
* @fw_event: object describing the event
* Context: This function will acquire ioc->fw_event_lock.
*
- * This removes firmware event object from link list, frees associated memory.
+ * If the fw_event is on the fw_event_list, remove it and do a put.
*
* Return nothing.
*/
static void
-_scsih_fw_event_free(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work
+_scsih_fw_event_del_from_list(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work
*fw_event)
{
unsigned long flags;

spin_lock_irqsave(&ioc->fw_event_lock, flags);
- list_del(&fw_event->list);
- kfree(fw_event);
+ if (!list_empty(&fw_event->list)) {
+ list_del_init(&fw_event->list);
+ fw_event_work_put(fw_event);
+ }
spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
}

-
/**
* _scsih_error_recovery_delete_devices - remove devices not responding
* @ioc: per adapter object
@@ -2908,13 +2939,14 @@ _scsih_error_recovery_delete_devices(struct MPT2SAS_ADAPTER *ioc)
if (ioc->is_driver_loading)
return;

- fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+ fw_event = alloc_fw_event_work(0);
if (!fw_event)
return;

fw_event->event = MPT2SAS_REMOVE_UNRESPONDING_DEVICES;
fw_event->ioc = ioc;
_scsih_fw_event_add(ioc, fw_event);
+ fw_event_work_put(fw_event);
}

/**
@@ -2928,12 +2960,29 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
{
struct fw_event_work *fw_event;

- fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+ fw_event = alloc_fw_event_work(0);
if (!fw_event)
return;
fw_event->event = MPT2SAS_PORT_ENABLE_COMPLETE;
fw_event->ioc = ioc;
_scsih_fw_event_add(ioc, fw_event);
+ fw_event_work_put(fw_event);
+}
+
+static struct fw_event_work *dequeue_next_fw_event(struct MPT2SAS_ADAPTER *ioc)
+{
+ unsigned long flags;
+ struct fw_event_work *fw_event = NULL;
+
+ spin_lock_irqsave(&ioc->fw_event_lock, flags);
+ if (!list_empty(&ioc->fw_event_list)) {
+ fw_event = list_first_entry(&ioc->fw_event_list,
+ struct fw_event_work, list);
+ list_del_init(&fw_event->list);
+ }
+ spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
+
+ return fw_event;
}

/**
@@ -2948,17 +2997,25 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
static void
_scsih_fw_event_cleanup_queue(struct MPT2SAS_ADAPTER *ioc)
{
- struct fw_event_work *fw_event, *next;
+ struct fw_event_work *fw_event;

if (list_empty(&ioc->fw_event_list) ||
!ioc->firmware_event_thread || in_interrupt())
return;

- list_for_each_entry_safe(fw_event, next, &ioc->fw_event_list, list) {
- if (cancel_delayed_work_sync(&fw_event->delayed_work)) {
- _scsih_fw_event_free(ioc, fw_event);
- continue;
- }
+ while ((fw_event = dequeue_next_fw_event(ioc))) {
+ /*
+ * Wait on the fw_event to complete. If this returns 1, then
+ * the event was never executed, and we need a put for the
+ * reference the delayed_work had on the fw_event.
+ *
+ * If it did execute, we wait for it to finish, and the put will
+ * happen from _firmware_event_work()
+ */
+ if (cancel_delayed_work_sync(&fw_event->delayed_work))
+ fw_event_work_put(fw_event);
+
+ fw_event_work_put(fw_event);
}
}

@@ -4439,13 +4496,14 @@ _scsih_send_event_to_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
{
struct fw_event_work *fw_event;

- fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+ fw_event = alloc_fw_event_work(0);
if (!fw_event)
return;
fw_event->event = MPT2SAS_TURN_ON_PFA_LED;
fw_event->device_handle = handle;
fw_event->ioc = ioc;
_scsih_fw_event_add(ioc, fw_event);
+ fw_event_work_put(fw_event);
}

/**
@@ -7543,17 +7601,27 @@ _firmware_event_work(struct work_struct *work)
struct fw_event_work, delayed_work.work);
struct MPT2SAS_ADAPTER *ioc = fw_event->ioc;

+ _scsih_fw_event_del_from_list(ioc, fw_event);
+
/* the queue is being flushed so ignore this event */
- if (ioc->remove_host ||
- ioc->pci_error_recovery) {
- _scsih_fw_event_free(ioc, fw_event);
+ if (ioc->remove_host || ioc->pci_error_recovery) {
+ fw_event_work_put(fw_event);
return;
}

switch (fw_event->event) {
case MPT2SAS_REMOVE_UNRESPONDING_DEVICES:
- while (scsi_host_in_recovery(ioc->shost) || ioc->shost_recovery)
+ while (scsi_host_in_recovery(ioc->shost) ||
+ ioc->shost_recovery) {
+ /*
+ * If we're unloading, bail. Otherwise, this can become
+ * an infinite loop.
+ */
+ if (ioc->remove_host)
+ goto out;
+
ssleep(1);
+ }
_scsih_remove_unresponding_sas_devices(ioc);
_scsih_scan_for_devices_after_reset(ioc);
break;
@@ -7602,7 +7670,8 @@ _firmware_event_work(struct work_struct *work)
_scsih_sas_ir_operation_status_event(ioc, fw_event);
break;
}
- _scsih_fw_event_free(ioc, fw_event);
+out:
+ fw_event_work_put(fw_event);
}

/**
@@ -7740,7 +7809,7 @@ mpt2sas_scsih_event_callback(struct MPT2SAS_ADAPTER *ioc, u8 msix_index,
}

sz = le16_to_cpu(mpi_reply->EventDataLength) * 4;
- fw_event = kzalloc(sizeof(*fw_event) + sz, GFP_ATOMIC);
+ fw_event = alloc_fw_event_work(sz);
if (!fw_event) {
printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
ioc->name, __FILE__, __LINE__, __func__);
@@ -7753,6 +7822,7 @@ mpt2sas_scsih_event_callback(struct MPT2SAS_ADAPTER *ioc, u8 msix_index,
fw_event->VP_ID = mpi_reply->VP_ID;
fw_event->event = event;
_scsih_fw_event_add(ioc, fw_event);
+ fw_event_work_put(fw_event);
return;
}

--
1.8.5.6

2015-08-10 13:15:51

by Sreekanth Reddy

Subject: Re: [PATCH v3 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage

On Sat, Aug 1, 2015 at 10:32 AM, Calvin Owens <[email protected]> wrote:
> These objects can be referenced concurrently throughout the driver, we
> need a way to make sure threads can't delete them out from under each
> other. This patch adds the refcount, and refactors the code to use it.
>
> Additionally, we cannot iterate over the sas_device_list without
> holding the lock, or we risk corrupting random memory if items are
> added or deleted as we iterate. This patch refactors _scsih_probe_sas()
> to use the sas_device_list in a safe way.
>
> Cc: Christoph Hellwig <[email protected]>
> Cc: Bart Van Assche <[email protected]>
> Cc: Joe Lawrence <[email protected]>
> Signed-off-by: Calvin Owens <[email protected]>
> ---
> Changes in v3:
> * Drop the sas_device_lock while enabling devices, and leave the
> sas_device object on the list, since it may need to be looked up
> there while it is being enabled.
> * Drop put() in _scsih_add_device(), because the ->hostdata now keeps a
> reference (this was an oversight in v2).
> * Be consistent about calling sas_device_put() while holding the
> sas_device_lock where feasible.
> * Take and assert_spin_locked() on the sas_device_lock from the newly
> added __get_sdev_from_target(), add wrapper similar to other lookups
> for callers which do not explicitly take the lock.
>
> Changes in v2:
> * Squished patches 1-3 into this one
> * s/BUG_ON(!spin_is_locked/assert_spin_locked/g
> * Store a pointer to the sas_device object in ->hostdata, to eliminate
> the need for several lookups on the lists.
>
> drivers/scsi/mpt2sas/mpt2sas_base.h | 22 +-
> drivers/scsi/mpt2sas/mpt2sas_scsih.c | 467 +++++++++++++++++++++----------
> drivers/scsi/mpt2sas/mpt2sas_transport.c | 12 +-
> 3 files changed, 348 insertions(+), 153 deletions(-)
>
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
> index caff8d1..78f41ac 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> @@ -238,6 +238,7 @@
> * @flags: MPT_TARGET_FLAGS_XXX flags
> * @deleted: target flaged for deletion
> * @tm_busy: target is busy with TM request.
> + * @sdev: The sas_device associated with this target
> */
> struct MPT2SAS_TARGET {
> struct scsi_target *starget;
> @@ -248,6 +249,7 @@ struct MPT2SAS_TARGET {
> u32 flags;
> u8 deleted;
> u8 tm_busy;
> + struct _sas_device *sdev;
> };
>
>
> @@ -376,8 +378,24 @@ struct _sas_device {
> u8 phy;
> u8 responding;
> u8 pfa_led_on;
> + struct kref refcount;
> };
>
> +static inline void sas_device_get(struct _sas_device *s)
> +{
> + kref_get(&s->refcount);
> +}
> +
> +static inline void sas_device_free(struct kref *r)
> +{
> + kfree(container_of(r, struct _sas_device, refcount));
> +}
> +
> +static inline void sas_device_put(struct _sas_device *s)
> +{
> + kref_put(&s->refcount, sas_device_free);
> +}
> +
> /**
> * struct _raid_device - raid volume link list
> * @list: sas device list
> @@ -1095,7 +1113,9 @@ struct _sas_node *mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
> u16 handle);
> struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct MPT2SAS_ADAPTER
> *ioc, u64 sas_address);
> -struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
> +struct _sas_device *mpt2sas_get_sdev_by_addr(
> + struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
> +struct _sas_device *__mpt2sas_get_sdev_by_addr(
> struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
>
> void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> index 3f26147..a2af9a5 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> @@ -526,8 +526,61 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
> }
> }
>
> +static struct _sas_device *
> +__mpt2sas_get_sdev_from_target(struct MPT2SAS_ADAPTER *ioc,
> + struct MPT2SAS_TARGET *tgt_priv)
> +{
> + struct _sas_device *ret;
> +
> + assert_spin_locked(&ioc->sas_device_lock);
> +
> + ret = tgt_priv->sdev;
> + if (ret)
> + sas_device_get(ret);
> +
> + return ret;
> +}
> +
> +static struct _sas_device *
> +mpt2sas_get_sdev_from_target(struct MPT2SAS_ADAPTER *ioc,
> + struct MPT2SAS_TARGET *tgt_priv)
> +{
> + struct _sas_device *ret;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&ioc->sas_device_lock, flags);
> + ret = __mpt2sas_get_sdev_from_target(ioc, tgt_priv);
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> + return ret;
> +}
> +
> +
> +struct _sas_device *
> +__mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
> + u64 sas_address)
> +{
> + struct _sas_device *sas_device;
> +
> + assert_spin_locked(&ioc->sas_device_lock);
> +
> + list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> + if (sas_device->sas_address == sas_address)
> + goto found_device;
> +
> + list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> + if (sas_device->sas_address == sas_address)
> + goto found_device;
> +
> + return NULL;
> +
> +found_device:
> + sas_device_get(sas_device);
> + return sas_device;
> +}
> +
> /**
> - * mpt2sas_scsih_sas_device_find_by_sas_address - sas device search
> + * mpt2sas_get_sdev_by_addr - sas device search
> * @ioc: per adapter object
> * @sas_address: sas address
> * Context: Calling function should acquire ioc->sas_device_lock
> @@ -536,24 +589,44 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
> * object.
> */
> struct _sas_device *
> -mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> +mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
> u64 sas_address)
> {
> struct _sas_device *sas_device;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&ioc->sas_device_lock, flags);
> + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> + sas_address);
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> + return sas_device;
> +}
> +
> +static struct _sas_device *
> +__mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> +{
> + struct _sas_device *sas_device;
> +
> + assert_spin_locked(&ioc->sas_device_lock);
>
> list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> - if (sas_device->sas_address == sas_address)
> - return sas_device;
> + if (sas_device->handle == handle)
> + goto found_device;
>
> list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> - if (sas_device->sas_address == sas_address)
> - return sas_device;
> + if (sas_device->handle == handle)
> + goto found_device;
>
> return NULL;
> +
> +found_device:
> + sas_device_get(sas_device);
> + return sas_device;
> }
>
> /**
> - * _scsih_sas_device_find_by_handle - sas device search
> + * mpt2sas_get_sdev_by_handle - sas device search
> * @ioc: per adapter object
> * @handle: sas device handle (assigned by firmware)
> * Context: Calling function should acquire ioc->sas_device_lock
> @@ -562,19 +635,16 @@ mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> * object.
> */
> static struct _sas_device *
> -_scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> +mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> {
> struct _sas_device *sas_device;
> + unsigned long flags;
>
> - list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> - if (sas_device->handle == handle)
> - return sas_device;
> -
> - list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> - if (sas_device->handle == handle)
> - return sas_device;
> + spin_lock_irqsave(&ioc->sas_device_lock, flags);
> + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> - return NULL;
> + return sas_device;
> }
>
> /**
> @@ -583,7 +653,7 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> * @sas_device: the sas_device object
> * Context: This function will acquire ioc->sas_device_lock.
> *
> - * Removing object and freeing associated memory from the ioc->sas_device_list.
> + * If sas_device is on the list, remove it and decrement its reference count.
> */
> static void
> _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> @@ -594,9 +664,15 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> if (!sas_device)
> return;
>
> + /*
> + * The lock serializes access to the list, but we still need to verify
> + * that nobody removed the entry while we were waiting on the lock.
> + */
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + if (!list_empty(&sas_device->list)) {
> + list_del_init(&sas_device->list);
> + sas_device_put(sas_device);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
>
> @@ -620,6 +696,7 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
> sas_device->handle, (unsigned long long)sas_device->sas_address));
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> + sas_device_get(sas_device);

[Sreekanth] I think we are taking an unnecessary extra reference here: the
device's reference count is already initialized to one by kref_init() in
_scsih_add_device().
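
To make the counting concrete, here is a minimal userspace sketch (not
kernel code). model_ref and its helpers are hypothetical stand-ins for
struct kref, kref_init(), kref_get() and kref_put(). It only shows that an
object initialized to one and then given an extra get on list insertion
needs two puts before it is released; whether every path in the patch
supplies that second put is the open question.

```c
/* Hypothetical userspace stand-in for struct kref; not the kernel API. */
struct model_ref {
	int count;
};

/* Mirrors kref_init(): the creator starts out holding one reference. */
static void model_ref_init(struct model_ref *r)
{
	r->count = 1;
}

/* Mirrors kref_get(): a new holder takes its own reference. */
static void model_ref_get(struct model_ref *r)
{
	r->count++;
}

/* Mirrors kref_put(): returns 1 when the last reference was dropped. */
static int model_ref_put(struct model_ref *r)
{
	return --r->count == 0;
}

/*
 * Count how many put() calls are needed to release an object that was
 * initialized and then, optionally, given an extra get() when it was
 * added to a list.
 */
static int puts_needed(int extra_get_on_add)
{
	struct model_ref r;
	int puts = 0;

	model_ref_init(&r);
	if (extra_get_on_add)
		model_ref_get(&r);

	do {
		puts++;
	} while (!model_ref_put(&r));

	return puts;
}
```

If the creator drops its initial reference once the device is on the list,
the extra get is balanced; if not, the object is never freed.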

> list_add_tail(&sas_device->list, &ioc->sas_device_list);
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> @@ -659,6 +736,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
> sas_device->handle, (unsigned long long)sas_device->sas_address));
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> + sas_device_get(sas_device);

[Sreekanth] Same comment as above.

> list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
> _scsih_determine_boot_device(ioc, sas_device, 0);
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> @@ -1208,12 +1286,14 @@ _scsih_change_queue_depth(struct scsi_device *sdev, int qdepth)
> goto not_sata;
> if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))
> goto not_sata;
> +
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> - sas_device_priv_data->sas_target->sas_address);
> - if (sas_device && sas_device->device_info &
> - MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
> + sas_device = __mpt2sas_get_sdev_from_target(ioc, sas_target_priv_data);
> + if (sas_device && sas_device->device_info
> + & MPI2_SAS_DEVICE_INFO_SATA_DEVICE) {
> max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
> + sas_device_put(sas_device);

[Sreekanth] This appears to drop the reference only for SATA drives. What
happens if the device is a SAS device?
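
A small userspace model of the two shapes, as a sketch only. model_dev,
model_get(), model_put() and MODEL_INFO_SATA are hypothetical names, not
the driver's types. The first function follows the hunk as posted (put
inside the SATA branch); the second drops the reference whenever the
lookup succeeded.

```c
/* Hypothetical stand-ins; not the driver's real types or flags. */
#define MODEL_INFO_SATA 0x1u

struct model_dev {
	int refcount;
	unsigned int device_info;
};

/* Models a lookup that returns the device with a reference held. */
static struct model_dev *model_get(struct model_dev *d)
{
	if (d)
		d->refcount++;
	return d;
}

static void model_put(struct model_dev *d)
{
	d->refcount--;
}

/* Shape of the hunk as posted: the put sits inside the SATA branch. */
static int max_depth_put_in_branch(struct model_dev *d, int dflt,
				   int sata_depth)
{
	struct model_dev *ref = model_get(d);
	int depth = dflt;

	if (ref && (ref->device_info & MODEL_INFO_SATA)) {
		depth = sata_depth;
		model_put(ref);
	}
	return depth;
}

/* Alternative: drop the reference whenever the lookup succeeded. */
static int max_depth_put_always(struct model_dev *d, int dflt,
				int sata_depth)
{
	struct model_dev *ref = model_get(d);
	int depth = dflt;

	if (ref) {
		if (ref->device_info & MODEL_INFO_SATA)
			depth = sata_depth;
		model_put(ref);
	}
	return depth;
}
```

With a non-SATA device, the first variant leaves the count one higher
than it started; the second returns it to its starting value.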

> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> not_sata:
> @@ -1271,18 +1351,20 @@ _scsih_target_alloc(struct scsi_target *starget)
> /* sas/sata devices */
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> rphy = dev_to_rphy(starget->dev.parent);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> rphy->identify.sas_address);
>
> if (sas_device) {
> sas_target_priv_data->handle = sas_device->handle;
> sas_target_priv_data->sas_address = sas_device->sas_address;
> + sas_target_priv_data->sdev = sas_device;
> sas_device->starget = starget;
> sas_device->id = starget->id;
> sas_device->channel = starget->channel;
> if (test_bit(sas_device->handle, ioc->pd_handles))
> sas_target_priv_data->flags |=
> MPT_TARGET_FLAGS_RAID_COMPONENT;
> +

[Sreekanth] I think a sas_device_put() call is missing here.
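
Since this hunk also stores the pointer in sas_target_priv_data->sdev, one
possible reading is that the reference is meant to be handed over to that
pointer. That is an assumption, not something the patch states. Under that
reading the matching put would belong in the destroy path, roughly as in
this userspace sketch (all names hypothetical):

```c
#include <stddef.h>

/* Hypothetical stand-ins for _sas_device and MPT2SAS_TARGET. */
struct tgt_model_dev {
	int refcount;
};

struct tgt_model_target {
	struct tgt_model_dev *sdev;
};

/* The lookup takes a reference; here it is handed over to tgt->sdev. */
static void model_target_alloc(struct tgt_model_target *tgt,
			       struct tgt_model_dev *dev)
{
	dev->refcount++;	/* reference taken by the lookup */
	tgt->sdev = dev;	/* the stored pointer now owns that reference */
}

/* Destroy clears the pointer and drops the reference it owned. */
static void model_target_destroy(struct tgt_model_target *tgt)
{
	struct tgt_model_dev *dev = tgt->sdev;

	if (!dev)
		return;
	tgt->sdev = NULL;
	dev->refcount--;
}
```

Either way, each get needs exactly one owner responsible for the put.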


> }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> @@ -1324,13 +1406,14 @@ _scsih_target_destroy(struct scsi_target *starget)
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> rphy = dev_to_rphy(starget->dev.parent);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> - rphy->identify.sas_address);
> + sas_device = __mpt2sas_get_sdev_from_target(ioc, sas_target_priv_data);
> if (sas_device && (sas_device->starget == starget) &&
> (sas_device->id == starget->id) &&
> (sas_device->channel == starget->channel))
> sas_device->starget = NULL;
>
> + if (sas_device)
> + sas_device_put(sas_device);
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> out:
> @@ -1386,7 +1469,7 @@ _scsih_slave_alloc(struct scsi_device *sdev)
>
> if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> sas_target_priv_data->sas_address);
> if (sas_device && (sas_device->starget == NULL)) {
> sdev_printk(KERN_INFO, sdev,
> @@ -1394,6 +1477,10 @@ _scsih_slave_alloc(struct scsi_device *sdev)
> __func__, __LINE__);
> sas_device->starget = starget;
> }
> +
> + if (sas_device)
> + sas_device_put(sas_device);
> +
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
>
> @@ -1428,10 +1515,13 @@ _scsih_slave_destroy(struct scsi_device *sdev)
>
> if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> - sas_target_priv_data->sas_address);
> + sas_device = __mpt2sas_get_sdev_from_target(ioc,
> + sas_target_priv_data);
> if (sas_device && !sas_target_priv_data->num_luns)
> sas_device->starget = NULL;
> +
> + if (sas_device)
> + sas_device_put(sas_device);
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
>
> @@ -2078,7 +2168,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
> }
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> sas_device_priv_data->sas_target->sas_address);
> if (!sas_device) {
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> @@ -2112,17 +2202,18 @@ _scsih_slave_configure(struct scsi_device *sdev)
> (unsigned long long) sas_device->enclosure_logical_id,
> sas_device->slot);
>
> + sas_device_put(sas_device);
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> if (!ssp_target)
> _scsih_display_sata_capabilities(ioc, handle, sdev);
>
> -
> _scsih_change_queue_depth(sdev, qdepth);
>
> if (ssp_target) {
> sas_read_port_mode_page(sdev);
> _scsih_enable_tlr(ioc, sdev);
> }
> +
> return 0;
> }
>
> @@ -2509,8 +2600,7 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
> device_str, (unsigned long long)priv_target->sas_address);
> } else {
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> - priv_target->sas_address);
> + sas_device = __mpt2sas_get_sdev_from_target(ioc, priv_target);
> if (sas_device) {
> if (priv_target->flags &
> MPT_TARGET_FLAGS_RAID_COMPONENT) {
> @@ -2529,6 +2619,8 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
> "enclosure_logical_id(0x%016llx), slot(%d)\n",
> (unsigned long long)sas_device->enclosure_logical_id,
> sas_device->slot);
> +
> + sas_device_put(sas_device);
> }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
> @@ -2604,12 +2696,12 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
> {
> struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
> struct MPT2SAS_DEVICE *sas_device_priv_data;
> - struct _sas_device *sas_device;
> - unsigned long flags;
> + struct _sas_device *sas_device = NULL;
> u16 handle;
> int r;
>
> struct scsi_target *starget = scmd->device->sdev_target;
> + struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
>
> starget_printk(KERN_INFO, starget, "attempting device reset! "
> "scmd(%p)\n", scmd);
> @@ -2629,12 +2721,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
> handle = 0;
> if (sas_device_priv_data->sas_target->flags &
> MPT_TARGET_FLAGS_RAID_COMPONENT) {
> - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = _scsih_sas_device_find_by_handle(ioc,
> - sas_device_priv_data->sas_target->handle);
> + sas_device = mpt2sas_get_sdev_from_target(ioc,
> + target_priv_data);
> if (sas_device)
> handle = sas_device->volume_handle;
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> } else
> handle = sas_device_priv_data->sas_target->handle;
>
> @@ -2651,6 +2741,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
> out:
> sdev_printk(KERN_INFO, scmd->device, "device reset: %s scmd(%p)\n",
> ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
> +
> + if (sas_device)
> + sas_device_put(sas_device);
> +
> return r;
> }
>
> @@ -2665,11 +2759,11 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
> {
> struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
> struct MPT2SAS_DEVICE *sas_device_priv_data;
> - struct _sas_device *sas_device;
> - unsigned long flags;
> + struct _sas_device *sas_device = NULL;
> u16 handle;
> int r;
> struct scsi_target *starget = scmd->device->sdev_target;
> + struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
>
> starget_printk(KERN_INFO, starget, "attempting target reset! "
> "scmd(%p)\n", scmd);
> @@ -2689,12 +2783,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
> handle = 0;
> if (sas_device_priv_data->sas_target->flags &
> MPT_TARGET_FLAGS_RAID_COMPONENT) {
> - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = _scsih_sas_device_find_by_handle(ioc,
> - sas_device_priv_data->sas_target->handle);
> + sas_device = mpt2sas_get_sdev_from_target(ioc,
> + target_priv_data);
> if (sas_device)
> handle = sas_device->volume_handle;
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> } else
> handle = sas_device_priv_data->sas_target->handle;
>
> @@ -2711,6 +2803,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
> out:
> starget_printk(KERN_INFO, starget, "target reset: %s scmd(%p)\n",
> ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
> +
> + if (sas_device)
> + sas_device_put(sas_device);
> +
> return r;
> }
>
> @@ -3002,15 +3098,15 @@ _scsih_block_io_to_children_attached_to_ex(struct MPT2SAS_ADAPTER *ioc,
>
> list_for_each_entry(mpt2sas_port,
> &sas_expander->sas_port_list, port_list) {
> - if (mpt2sas_port->remote_identify.device_type ==
> - SAS_END_DEVICE) {
> + if (mpt2sas_port->remote_identify.device_type == SAS_END_DEVICE) {
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device =
> - mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> - mpt2sas_port->remote_identify.sas_address);
> - if (sas_device)
> + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> + mpt2sas_port->remote_identify.sas_address);
> + if (sas_device) {
> set_bit(sas_device->handle,
> - ioc->blocking_handles);
> + ioc->blocking_handles);
> + sas_device_put(sas_device);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
> }
> @@ -3080,7 +3176,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> {
> Mpi2SCSITaskManagementRequest_t *mpi_request;
> u16 smid;
> - struct _sas_device *sas_device;
> + struct _sas_device *sas_device = NULL;
> struct MPT2SAS_TARGET *sas_target_priv_data = NULL;
> u64 sas_address = 0;
> unsigned long flags;
> @@ -3110,7 +3206,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> return;
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> if (sas_device && sas_device->starget &&
> sas_device->starget->hostdata) {
> sas_target_priv_data = sas_device->starget->hostdata;
> @@ -3131,14 +3227,14 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> if (!smid) {
> delayed_tr = kzalloc(sizeof(*delayed_tr), GFP_ATOMIC);
> if (!delayed_tr)
> - return;
> + goto out;
> INIT_LIST_HEAD(&delayed_tr->list);
> delayed_tr->handle = handle;
> list_add_tail(&delayed_tr->list, &ioc->delayed_tr_list);
> dewtprintk(ioc, printk(MPT2SAS_INFO_FMT
> "DELAYED:tr:handle(0x%04x), (open)\n",
> ioc->name, handle));
> - return;
> + goto out;
> }
>
> dewtprintk(ioc, printk(MPT2SAS_INFO_FMT "tr_send:handle(0x%04x), "
> @@ -3150,6 +3246,9 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> mpi_request->DevHandle = cpu_to_le16(handle);
> mpi_request->TaskType = MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET;
> mpt2sas_base_put_smid_hi_priority(ioc, smid);
> +out:
> + if (sas_device)
> + sas_device_put(sas_device);
> }
>
>
> @@ -4068,7 +4167,6 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
> char *desc_scsi_state = ioc->tmp_string;
> u32 log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
> struct _sas_device *sas_device = NULL;
> - unsigned long flags;
> struct scsi_target *starget = scmd->device->sdev_target;
> struct MPT2SAS_TARGET *priv_target = starget->hostdata;
> char *device_str = NULL;
> @@ -4200,9 +4298,7 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
> printk(MPT2SAS_WARN_FMT "\t%s wwid(0x%016llx)\n", ioc->name,
> device_str, (unsigned long long)priv_target->sas_address);
> } else {
> - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> - priv_target->sas_address);
> + sas_device = mpt2sas_get_sdev_from_target(ioc, priv_target);
> if (sas_device) {
> printk(MPT2SAS_WARN_FMT "\tsas_address(0x%016llx), "
> "phy(%d)\n", ioc->name, sas_device->sas_address,
> @@ -4211,8 +4307,9 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
> "\tenclosure_logical_id(0x%016llx), slot(%d)\n",
> ioc->name, sas_device->enclosure_logical_id,
> sas_device->slot);
> +
> + sas_device_put(sas_device);
> }
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> }
>
> printk(MPT2SAS_WARN_FMT "\thandle(0x%04x), ioc_status(%s)(0x%04x), "
> @@ -4259,7 +4356,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> Mpi2SepRequest_t mpi_request;
> struct _sas_device *sas_device;
>
> - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> + sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> if (!sas_device)
> return;
>
> @@ -4274,7 +4371,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> &mpi_request)) != 0) {
> printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n", ioc->name,
> __FILE__, __LINE__, __func__);
> - return;
> + goto out;
> }
> sas_device->pfa_led_on = 1;
>
> @@ -4284,8 +4381,10 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> "enclosure_processor: ioc_status (0x%04x), loginfo(0x%08x)\n",
> ioc->name, le16_to_cpu(mpi_reply.IOCStatus),
> le32_to_cpu(mpi_reply.IOCLogInfo)));
> - return;
> + goto out;
> }
> +out:
> + sas_device_put(sas_device);
> }
>
> /**
> @@ -4370,19 +4469,17 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>
> /* only handle non-raid devices */
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> if (!sas_device) {
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - return;
> + goto out_unlock;
> }
> starget = sas_device->starget;
> sas_target_priv_data = starget->hostdata;
>
> if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_RAID_COMPONENT) ||
> - ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))) {
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - return;
> - }
> + ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)))
> + goto out_unlock;
> +
> starget_printk(KERN_WARNING, starget, "predicted fault\n");
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> @@ -4396,7 +4493,7 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> if (!event_reply) {
> printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
> ioc->name, __FILE__, __LINE__, __func__);
> - return;
> + goto out;
> }
>
> event_reply->Function = MPI2_FUNCTION_EVENT_NOTIFICATION;
> @@ -4413,6 +4510,14 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> event_data->SASAddress = cpu_to_le64(sas_target_priv_data->sas_address);
> mpt2sas_ctl_add_to_event_log(ioc, event_reply);
> kfree(event_reply);
> +out:
> + if (sas_device)
> + sas_device_put(sas_device);
> + return;
> +
> +out_unlock:
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> + goto out;
> }
>
> /**
> @@ -5148,14 +5253,13 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> sas_address = le64_to_cpu(sas_device_pg0.SASAddress);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> sas_address);
>
> if (!sas_device) {
> printk(MPT2SAS_ERR_FMT "device is not present "
> "handle(0x%04x), no sas_device!!!\n", ioc->name, handle);
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - return;
> + goto out_unlock;
> }
>
> if (unlikely(sas_device->handle != handle)) {
> @@ -5172,19 +5276,22 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> MPI2_SAS_DEVICE0_FLAGS_DEVICE_PRESENT)) {
> printk(MPT2SAS_ERR_FMT "device is not present "
> "handle(0x%04x), flags!!!\n", ioc->name, handle);
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - return;
> + goto out_unlock;
> }
>
> /* check if there were any issues with discovery */
> if (_scsih_check_access_status(ioc, sas_address, handle,
> - sas_device_pg0.AccessStatus)) {
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - return;
> - }
> + sas_device_pg0.AccessStatus))
> + goto out_unlock;
> +
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> _scsih_ublock_io_device(ioc, sas_address);
> + return;

[Sreekanth] I think the driver returns from this function here without
dropping the reference.
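
One common way to avoid this is to route the success path through the same
exit label as the failure paths, so the put happens exactly once on every
return. A minimal userspace sketch, with hypothetical names (chk_dev,
check_device_model) and a plain counter standing in for the kref:

```c
#include <stddef.h>

/* Hypothetical stand-in for _sas_device. */
struct chk_dev {
	int refcount;
	int present;
};

/*
 * Returns 0 if the device is present, -1 otherwise. Every path that took
 * the lookup reference leaves through "out", which drops it.
 */
static int check_device_model(struct chk_dev *d)
{
	int rc = -1;

	if (d == NULL)
		return rc;	/* lookup failed: no reference was taken */

	d->refcount++;		/* reference taken by the lookup */

	if (!d->present)
		goto out;	/* failure path */

	rc = 0;			/* success path falls through to out */
out:
	d->refcount--;
	return rc;
}
```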

>
> +out_unlock:
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> + if (sas_device)
> + sas_device_put(sas_device);
> }
>
> /**
> @@ -5208,7 +5315,6 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
> u32 ioc_status;
> __le64 sas_address;
> u32 device_info;
> - unsigned long flags;
>
> if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
> MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
> @@ -5250,14 +5356,13 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
> return -1;
> }
>
> -
> - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = mpt2sas_get_sdev_by_addr(ioc,
> sas_address);
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
>
> - if (sas_device)
> + if (sas_device) {
> + sas_device_put(sas_device);
> return 0;
> + }
>
> sas_device = kzalloc(sizeof(struct _sas_device),
> GFP_KERNEL);
> @@ -5267,6 +5372,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
> return -1;
> }
>
> + kref_init(&sas_device->refcount);
> sas_device->handle = handle;
> if (_scsih_get_sas_address(ioc, le16_to_cpu
> (sas_device_pg0.ParentDevHandle),
> @@ -5344,7 +5450,6 @@ _scsih_remove_device(struct MPT2SAS_ADAPTER *ioc,
> "handle(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
> sas_device->handle, (unsigned long long)
> sas_device->sas_address));
> - kfree(sas_device);
> }
> /**
> * _scsih_device_remove_by_handle - removing device object by handle
> @@ -5363,12 +5468,17 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> return;
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> - if (sas_device)
> - list_del(&sas_device->list);
> + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> + if (sas_device) {
> + list_del_init(&sas_device->list);
> + sas_device_put(sas_device);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - if (sas_device)
> +
> + if (sas_device) {
> _scsih_remove_device(ioc, sas_device);
> + sas_device_put(sas_device);
> + }
> }
>
> /**
> @@ -5389,13 +5499,17 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> return;
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> - sas_address);
> - if (sas_device)
> - list_del(&sas_device->list);
> + sas_device = __mpt2sas_get_sdev_by_addr(ioc, sas_address);
> + if (sas_device) {
> + list_del_init(&sas_device->list);
> + sas_device_put(sas_device);
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - if (sas_device)
> +
> + if (sas_device) {
> _scsih_remove_device(ioc, sas_device);
> + sas_device_put(sas_device);
> + }
> }
> #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
> /**
> @@ -5716,26 +5830,28 @@ _scsih_sas_device_status_change_event(struct MPT2SAS_ADAPTER *ioc,
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> sas_address = le64_to_cpu(event_data->SASAddress);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> sas_address);
>
> - if (!sas_device || !sas_device->starget) {
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - return;
> - }
> + if (!sas_device || !sas_device->starget)
> + goto out;
>
> target_priv_data = sas_device->starget->hostdata;
> - if (!target_priv_data) {
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - return;
> - }
> + if (!target_priv_data)
> + goto out;
>
> if (event_data->ReasonCode ==
> MPI2_EVENT_SAS_DEV_STAT_RC_INTERNAL_DEVICE_RESET)
> target_priv_data->tm_busy = 1;
> else
> target_priv_data->tm_busy = 0;
> +
> +out:
> + if (sas_device)
> + sas_device_put(sas_device);
> +
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> }
>
> #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
> @@ -6123,7 +6239,7 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
> u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> if (sas_device) {
> sas_device->volume_handle = 0;
> sas_device->volume_wwid = 0;
> @@ -6142,6 +6258,8 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
> /* exposing raid component */
> if (starget)
> starget_for_each_device(starget, NULL, _scsih_reprobe_lun);
> +
> + sas_device_put(sas_device);
> }
>
> /**
> @@ -6170,7 +6288,7 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
> &volume_wwid);
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> if (sas_device) {
> set_bit(handle, ioc->pd_handles);
> if (sas_device->starget && sas_device->starget->hostdata) {
> @@ -6189,6 +6307,8 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
> /* hiding raid component */
> if (starget)
> starget_for_each_device(starget, (void *)1, _scsih_reprobe_lun);
> +
> + sas_device_put(sas_device);
> }
>
> /**
> @@ -6221,7 +6341,6 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
> Mpi2EventIrConfigElement_t *element)
> {
> struct _sas_device *sas_device;
> - unsigned long flags;
> u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
> Mpi2ConfigReply_t mpi_reply;
> Mpi2SasDevicePage0_t sas_device_pg0;
> @@ -6231,11 +6350,11 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
>
> set_bit(handle, ioc->pd_handles);
>
> - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - if (sas_device)
> + sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> + if (sas_device) {
> + sas_device_put(sas_device);
> return;
> + }
>
> if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
> MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
> @@ -6509,7 +6628,6 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
> u16 handle, parent_handle;
> u32 state;
> struct _sas_device *sas_device;
> - unsigned long flags;
> Mpi2ConfigReply_t mpi_reply;
> Mpi2SasDevicePage0_t sas_device_pg0;
> u32 ioc_status;
> @@ -6542,12 +6660,11 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
> if (!ioc->is_warpdrive)
> set_bit(handle, ioc->pd_handles);
>
> - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> -
> - if (sas_device)
> + sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> + if (sas_device) {
> + sas_device_put(sas_device);
> return;
> + }
>
> if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
> &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
> @@ -7015,6 +7132,7 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
> struct _raid_device *raid_device, *raid_device_next;
> struct list_head tmp_list;
> unsigned long flags;
> + LIST_HEAD(head);
>
> printk(MPT2SAS_INFO_FMT "removing unresponding devices: start\n",
> ioc->name);
> @@ -7022,14 +7140,29 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
> /* removing unresponding end devices */
> printk(MPT2SAS_INFO_FMT "removing unresponding devices: end-devices\n",
> ioc->name);
> +
> + /*
> + * Iterate, pulling off devices marked as non-responding. We become the
> + * owner for the reference the list had on any object we prune.
> + */
> + spin_lock_irqsave(&ioc->sas_device_lock, flags);
> list_for_each_entry_safe(sas_device, sas_device_next,
> - &ioc->sas_device_list, list) {
> + &ioc->sas_device_list, list) {
> if (!sas_device->responding)
> - mpt2sas_device_remove_by_sas_address(ioc,
> - sas_device->sas_address);
> + list_move_tail(&sas_device->list, &head);
> else
> sas_device->responding = 0;
> }
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> + /*
> + * Now, uninitialize and remove the unresponding devices we pruned.
> + */
> + list_for_each_entry_safe(sas_device, sas_device_next, &head, list) {
> + _scsih_remove_device(ioc, sas_device);
> + list_del_init(&sas_device->list);
> + sas_device_put(sas_device);
> + }
>
> /* removing unresponding volumes */
> if (ioc->ir_firmware) {
> @@ -7179,11 +7312,11 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
> }
> phys_disk_num = pd_pg0.PhysDiskNum;
> handle = le16_to_cpu(pd_pg0.DevHandle);
> - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - if (sas_device)
> + sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> + if (sas_device) {
> + sas_device_put(sas_device);
> continue;
> + }
> if (mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
> &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
> handle) != 0)
> @@ -7302,12 +7435,12 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
> if (!(_scsih_is_end_device(
> le32_to_cpu(sas_device_pg0.DeviceInfo))))
> continue;
> - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = mpt2sas_get_sdev_by_addr(ioc,
> le64_to_cpu(sas_device_pg0.SASAddress));
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> - if (sas_device)
> + if (sas_device) {
> + sas_device_put(sas_device);
> continue;
> + }
> parent_handle = le16_to_cpu(sas_device_pg0.ParentDevHandle);
> if (!_scsih_get_sas_address(ioc, parent_handle, &sas_address)) {
> printk(MPT2SAS_INFO_FMT "\tBEFORE adding end device: "
> @@ -7966,6 +8099,48 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
> }
> }
>
> +static struct _sas_device *get_next_sas_device(struct MPT2SAS_ADAPTER *ioc)
> +{
> + struct _sas_device *sas_device = NULL;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&ioc->sas_device_lock, flags);
> + if (!list_empty(&ioc->sas_device_init_list)) {
> + sas_device = list_first_entry(&ioc->sas_device_init_list,
> + struct _sas_device, list);
> + sas_device_get(sas_device);
> + }
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> + return sas_device;
> +}
> +
> +static void sas_device_make_active(struct MPT2SAS_ADAPTER *ioc,
> + struct _sas_device *sas_device)
> +{
> + unsigned long flags;
> +
> + spin_lock_irqsave(&ioc->sas_device_lock, flags);
> +
> + /*
> + * Since we dropped the lock during the call to port_add(), we need to
> + * be careful here that somebody else didn't move or delete this item
> + * while we were busy with other things.
> + *
> + * If it was on the list, we need a put() for the reference the list
> + * had. Either way, we need a get() for the destination list.
> + */
> + if (!list_empty(&sas_device->list)) {
> + list_del_init(&sas_device->list);
> + sas_device_put(sas_device);
> + }
> +
> + sas_device_get(sas_device);
> + list_add_tail(&sas_device->list, &ioc->sas_device_list);
> +
> + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +}
> +
> /**
> * _scsih_probe_sas - reporting sas devices to sas transport
> * @ioc: per adapter object
> @@ -7975,34 +8150,30 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
> static void
> _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
> {
> - struct _sas_device *sas_device, *next;
> - unsigned long flags;
> -
> - /* SAS Device List */
> - list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
> - list) {
> + struct _sas_device *sas_device;
>
> - if (ioc->hide_drives)
> - continue;
> + if (ioc->hide_drives)
> + return;
>
> + while ((sas_device = get_next_sas_device(ioc))) {
> if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
> - sas_device->sas_address_parent)) {
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + sas_device->sas_address_parent)) {
> + _scsih_sas_device_remove(ioc, sas_device);
> + sas_device_put(sas_device);
> continue;
> } else if (!sas_device->starget) {
> if (!ioc->is_driver_loading) {
> mpt2sas_transport_port_remove(ioc,
> - sas_device->sas_address,
> - sas_device->sas_address_parent);
> - list_del(&sas_device->list);
> - kfree(sas_device);
> + sas_device->sas_address,
> + sas_device->sas_address_parent);
> + _scsih_sas_device_remove(ioc, sas_device);
> + sas_device_put(sas_device);
> continue;
> }
> }
> - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - list_move_tail(&sas_device->list, &ioc->sas_device_list);
> - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> +
> + sas_device_make_active(ioc, sas_device);
> + sas_device_put(sas_device);
> }
> }
>
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_transport.c b/drivers/scsi/mpt2sas/mpt2sas_transport.c
> index ff2500a..af86800 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_transport.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_transport.c
> @@ -1323,15 +1323,17 @@ _transport_get_enclosure_identifier(struct sas_rphy *rphy, u64 *identifier)
> int rc;
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> rphy->identify.sas_address);
> if (sas_device) {
> *identifier = sas_device->enclosure_logical_id;
> rc = 0;
> + sas_device_put(sas_device);
> } else {
> *identifier = 0;
> rc = -ENXIO;
> }
> +
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> return rc;
> }
> @@ -1351,12 +1353,14 @@ _transport_get_bay_identifier(struct sas_rphy *rphy)
> int rc;
>
> spin_lock_irqsave(&ioc->sas_device_lock, flags);
> - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> rphy->identify.sas_address);
> - if (sas_device)
> + if (sas_device) {
> rc = sas_device->slot;
> - else
> + sas_device_put(sas_device);
> + } else {
> rc = -ENXIO;
> + }
> spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> return rc;
> }
> --
> 1.8.5.6
>



--

Regards,
Sreekanth

2015-08-14 01:44:39

by Calvin Owens

Subject: Re: [PATCH v3 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage

On Monday 08/10 at 18:45 +0530, Sreekanth Reddy wrote:
> On Sat, Aug 1, 2015 at 10:32 AM, Calvin Owens <[email protected]> wrote:

Sreekanth,

Thanks for the review, responses below. I'll have a v4 out shortly.

Calvin

> > These objects can be referenced concurrently throughout the driver, we
> > need a way to make sure threads can't delete them out from under each
> > other. This patch adds the refcount, and refactors the code to use it.
> >
> > Additionally, we cannot iterate over the sas_device_list without
> > holding the lock, or we risk corrupting random memory if items are
> > added or deleted as we iterate. This patch refactors _scsih_probe_sas()
> > to use the sas_device_list in a safe way.
> >
> > Cc: Christoph Hellwig <[email protected]>
> > Cc: Bart Van Assche <[email protected]>
> > Cc: Joe Lawrence <[email protected]>
> > Signed-off-by: Calvin Owens <[email protected]>
> > ---
> > Changes in v3:
> > * Drop the sas_device_lock while enabling devices, and leave the
> > sas_device object on the list, since it may need to be looked up
> > there while it is being enabled.
> > * Drop put() in _scsih_add_device(), because the ->hostdata now keeps a
> > reference (this was an oversight in v2).
> > * Be consistent about calling sas_device_put() while holding the
> > sas_device_lock where feasible.
> > * Take and assert_spin_locked() on the sas_device_lock from the newly
> > added __get_sdev_from_target(), add wrapper similar to other lookups
> > for callers which do not explicitly take the lock.
> >
> > Changes in v2:
> > * Squished patches 1-3 into this one
> > * s/BUG_ON(!spin_is_locked/assert_spin_locked/g
> > * Store a pointer to the sas_device object in ->hostdata, to eliminate
> > the need for several lookups on the lists.
> >
> > drivers/scsi/mpt2sas/mpt2sas_base.h | 22 +-
> > drivers/scsi/mpt2sas/mpt2sas_scsih.c | 467 +++++++++++++++++++++----------
> > drivers/scsi/mpt2sas/mpt2sas_transport.c | 12 +-
> > 3 files changed, 348 insertions(+), 153 deletions(-)
> >
> > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
> > index caff8d1..78f41ac 100644
> > --- a/drivers/scsi/mpt2sas/mpt2sas_base.h
> > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
> > @@ -238,6 +238,7 @@
> > * @flags: MPT_TARGET_FLAGS_XXX flags
> > * @deleted: target flaged for deletion
> > * @tm_busy: target is busy with TM request.
> > + * @sdev: The sas_device associated with this target
> > */
> > struct MPT2SAS_TARGET {
> > struct scsi_target *starget;
> > @@ -248,6 +249,7 @@ struct MPT2SAS_TARGET {
> > u32 flags;
> > u8 deleted;
> > u8 tm_busy;
> > + struct _sas_device *sdev;
> > };
> >
> >
> > @@ -376,8 +378,24 @@ struct _sas_device {
> > u8 phy;
> > u8 responding;
> > u8 pfa_led_on;
> > + struct kref refcount;
> > };
> >
> > +static inline void sas_device_get(struct _sas_device *s)
> > +{
> > + kref_get(&s->refcount);
> > +}
> > +
> > +static inline void sas_device_free(struct kref *r)
> > +{
> > + kfree(container_of(r, struct _sas_device, refcount));
> > +}
> > +
> > +static inline void sas_device_put(struct _sas_device *s)
> > +{
> > + kref_put(&s->refcount, sas_device_free);
> > +}
> > +
> > /**
> > * struct _raid_device - raid volume link list
> > * @list: sas device list
> > @@ -1095,7 +1113,9 @@ struct _sas_node *mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
> > u16 handle);
> > struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct MPT2SAS_ADAPTER
> > *ioc, u64 sas_address);
> > -struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
> > +struct _sas_device *mpt2sas_get_sdev_by_addr(
> > + struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
> > +struct _sas_device *__mpt2sas_get_sdev_by_addr(
> > struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
> >
> > void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
> > diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > index 3f26147..a2af9a5 100644
> > --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
> > @@ -526,8 +526,61 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
> > }
> > }
> >
> > +static struct _sas_device *
> > +__mpt2sas_get_sdev_from_target(struct MPT2SAS_ADAPTER *ioc,
> > + struct MPT2SAS_TARGET *tgt_priv)
> > +{
> > + struct _sas_device *ret;
> > +
> > + assert_spin_locked(&ioc->sas_device_lock);
> > +
> > + ret = tgt_priv->sdev;
> > + if (ret)
> > + sas_device_get(ret);
> > +
> > + return ret;
> > +}
> > +
> > +static struct _sas_device *
> > +mpt2sas_get_sdev_from_target(struct MPT2SAS_ADAPTER *ioc,
> > + struct MPT2SAS_TARGET *tgt_priv)
> > +{
> > + struct _sas_device *ret;
> > + unsigned long flags;
> > +
> > + spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > + ret = __mpt2sas_get_sdev_from_target(ioc, tgt_priv);
> > + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +
> > + return ret;
> > +}
> > +
> > +
> > +struct _sas_device *
> > +__mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
> > + u64 sas_address)
> > +{
> > + struct _sas_device *sas_device;
> > +
> > + assert_spin_locked(&ioc->sas_device_lock);
> > +
> > + list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> > + if (sas_device->sas_address == sas_address)
> > + goto found_device;
> > +
> > + list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> > + if (sas_device->sas_address == sas_address)
> > + goto found_device;
> > +
> > + return NULL;
> > +
> > +found_device:
> > + sas_device_get(sas_device);
> > + return sas_device;
> > +}
> > +
> > /**
> > - * mpt2sas_scsih_sas_device_find_by_sas_address - sas device search
> > + * mpt2sas_get_sdev_by_addr - sas device search
> > * @ioc: per adapter object
> > * @sas_address: sas address
> > * Context: Calling function should acquire ioc->sas_device_lock
> > @@ -536,24 +589,44 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
> > * object.
> > */
> > struct _sas_device *
> > -mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> > +mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
> > u64 sas_address)
> > {
> > struct _sas_device *sas_device;
> > + unsigned long flags;
> > +
> > + spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> > + sas_address);
> > + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +
> > + return sas_device;
> > +}
> > +
> > +static struct _sas_device *
> > +__mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > +{
> > + struct _sas_device *sas_device;
> > +
> > + assert_spin_locked(&ioc->sas_device_lock);
> >
> > list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> > - if (sas_device->sas_address == sas_address)
> > - return sas_device;
> > + if (sas_device->handle == handle)
> > + goto found_device;
> >
> > list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> > - if (sas_device->sas_address == sas_address)
> > - return sas_device;
> > + if (sas_device->handle == handle)
> > + goto found_device;
> >
> > return NULL;
> > +
> > +found_device:
> > + sas_device_get(sas_device);
> > + return sas_device;
> > }
> >
> > /**
> > - * _scsih_sas_device_find_by_handle - sas device search
> > + * mpt2sas_get_sdev_by_handle - sas device search
> > * @ioc: per adapter object
> > * @handle: sas device handle (assigned by firmware)
> > * Context: Calling function should acquire ioc->sas_device_lock
> > @@ -562,19 +635,16 @@ mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> > * object.
> > */
> > static struct _sas_device *
> > -_scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > +mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > {
> > struct _sas_device *sas_device;
> > + unsigned long flags;
> >
> > - list_for_each_entry(sas_device, &ioc->sas_device_list, list)
> > - if (sas_device->handle == handle)
> > - return sas_device;
> > -
> > - list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
> > - if (sas_device->handle == handle)
> > - return sas_device;
> > + spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> > + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > - return NULL;
> > + return sas_device;
> > }
> >
> > /**
> > @@ -583,7 +653,7 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > * @sas_device: the sas_device object
> > * Context: This function will acquire ioc->sas_device_lock.
> > *
> > - * Removing object and freeing associated memory from the ioc->sas_device_list.
> > + * If sas_device is on the list, remove it and decrement its reference count.
> > */
> > static void
> > _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> > @@ -594,9 +664,15 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
> > if (!sas_device)
> > return;
> >
> > + /*
> > + * The lock serializes access to the list, but we still need to verify
> > + * that nobody removed the entry while we were waiting on the lock.
> > + */
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - list_del(&sas_device->list);
> > - kfree(sas_device);
> > + if (!list_empty(&sas_device->list)) {
> > + list_del_init(&sas_device->list);
> > + sas_device_put(sas_device);
> > + }
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > }
> >
> > @@ -620,6 +696,7 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
> > sas_device->handle, (unsigned long long)sas_device->sas_address));
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > + sas_device_get(sas_device);
>
> [Sreekanth] I think we are taking an unnecessary extra reference here: the
> device's reference count is already initialized to one in
> _scsih_add_device() by kref_init().

The reference here is for the list itself. The corresponding put() is in
_scsih_sas_device_remove().
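
To illustrate the ownership rule (a userspace sketch only, not driver code:
every toy_* name below is hypothetical, a plain int stands in for the kref,
and locking is left out entirely), the list takes its own reference on add
and gives it back on remove, independently of the creator's reference:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct _sas_device. */
struct toy_dev {
	int refcount;		/* stand-in for struct kref */
	struct toy_dev *next;	/* stand-in for the list linkage */
	int on_list;
};

static int toy_freed;		/* counts frees so the effect is observable */

static struct toy_dev *toy_alloc(void)
{
	struct toy_dev *d = calloc(1, sizeof(*d));

	if (d)
		d->refcount = 1;	/* like kref_init(): creator's reference */
	return d;
}

static void toy_get(struct toy_dev *d)
{
	d->refcount++;
}

static void toy_put(struct toy_dev *d)
{
	if (--d->refcount == 0) {
		toy_freed++;
		free(d);
	}
}

/* The list owns one reference for as long as the entry is linked. */
static void toy_list_add(struct toy_dev **head, struct toy_dev *d)
{
	toy_get(d);
	d->next = *head;
	*head = d;
	d->on_list = 1;
}

/* Unlink if still linked, then drop the reference the list held. */
static void toy_list_remove(struct toy_dev **head, struct toy_dev *d)
{
	struct toy_dev **p;

	if (!d->on_list)
		return;
	for (p = head; *p; p = &(*p)->next) {
		if (*p == d) {
			*p = d->next;
			d->next = NULL;
			d->on_list = 0;
			toy_put(d);	/* may free d; do not touch it after */
			return;
		}
	}
}
```

With this shape the creator can drop its own reference right after adding
the object, and the object stays alive until it is unlinked.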

> > list_add_tail(&sas_device->list, &ioc->sas_device_list);
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > @@ -659,6 +736,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
> > sas_device->handle, (unsigned long long)sas_device->sas_address));
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > + sas_device_get(sas_device);
>
> [Sreekanth] Same comment as above.

Again, this is a reference for the list. The corresponding put() happens
in sas_device_make_active(), or in _scsih_sas_device_remove() if
mpt2sas_transport_port_add() fails in _scsih_probe_sas().

> > list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
> > _scsih_determine_boot_device(ioc, sas_device, 0);
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > @@ -1208,12 +1286,14 @@ _scsih_change_queue_depth(struct scsi_device *sdev, int qdepth)
> > goto not_sata;
> > if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))
> > goto not_sata;
> > +
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > - sas_device_priv_data->sas_target->sas_address);
> > - if (sas_device && sas_device->device_info &
> > - MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
> > + sas_device = __mpt2sas_get_sdev_from_target(ioc, sas_target_priv_data);
> > + if (sas_device && sas_device->device_info
> > + & MPI2_SAS_DEVICE_INFO_SATA_DEVICE) {
> > max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
> > + sas_device_put(sas_device);
>
> [Sreekanth] This appears to drop the reference only for SATA drives.
> What happens if the device is a SAS device?

Yeah, you're right. Will fix.
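
Roughly the shape I have in mind, as a userspace sketch (not the actual v4
code: the toy_* names and TOY_INFO_SATA are hypothetical, the depth clamp is
simplified, and the lock is omitted). The put() is keyed on the lookup
having succeeded, not on the device type:

```c
#include <assert.h>
#include <stddef.h>

#define TOY_INFO_SATA 0x1	/* hypothetical device_info flag */

/* Hypothetical stand-in for struct _sas_device. */
struct toy_dev {
	int refcount;
	unsigned int device_info;
};

/* Stand-in for a __get_sdev_*() lookup: takes a reference on success. */
static struct toy_dev *toy_lookup(struct toy_dev *d)
{
	if (d)
		d->refcount++;
	return d;
}

static void toy_put(struct toy_dev *d)
{
	d->refcount--;
}

/* Clamp the requested depth for SATA devices; drop the ref on every path. */
static int toy_pick_depth(struct toy_dev *target, int requested, int sata_max)
{
	int depth = requested;
	struct toy_dev *d = toy_lookup(target);

	if (d) {
		if ((d->device_info & TOY_INFO_SATA) && depth > sata_max)
			depth = sata_max;
		toy_put(d);	/* dropped for SAS and SATA alike */
	}
	return depth;
}
```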

> > + }
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > not_sata:
> > @@ -1271,18 +1351,20 @@ _scsih_target_alloc(struct scsi_target *starget)
> > /* sas/sata devices */
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > rphy = dev_to_rphy(starget->dev.parent);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> > rphy->identify.sas_address);
> >
> > if (sas_device) {
> > sas_target_priv_data->handle = sas_device->handle;
> > sas_target_priv_data->sas_address = sas_device->sas_address;
> > + sas_target_priv_data->sdev = sas_device;
> > sas_device->starget = starget;
> > sas_device->id = starget->id;
> > sas_device->channel = starget->channel;
> > if (test_bit(sas_device->handle, ioc->pd_handles))
> > sas_target_priv_data->flags |=
> > MPT_TARGET_FLAGS_RAID_COMPONENT;
> > +
>
> [Sreekanth] I think a sas_device_put() call is missing here.

The reference here is for the pointer to the sas_device in the
->hostdata.

However, the corresponding put() is missing in _scsih_target_destroy(),
so it's definitely confusing. I'll fix that.
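
As a userspace sketch of the intended pairing (hypothetical toy_* names, a
plain int instead of a kref, no locking; not the actual fix), the stored
pointer owns one reference from alloc until destroy:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct _sas_device and struct MPT2SAS_TARGET. */
struct toy_dev {
	int refcount;
};

struct toy_target {
	struct toy_dev *sdev;	/* holds one reference while non-NULL */
};

/* Storing the pointer takes a reference that belongs to the pointer. */
static void toy_target_alloc(struct toy_target *t, struct toy_dev *d)
{
	d->refcount++;
	t->sdev = d;
}

/* Clearing the pointer gives that reference back, exactly once. */
static void toy_target_destroy(struct toy_target *t)
{
	if (t->sdev) {
		t->sdev->refcount--;
		t->sdev = NULL;
	}
}
```

Any temporary reference taken by a lookup inside the destroy path would
still need its own put(), separate from the one shown here.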

> > }
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > @@ -1324,13 +1406,14 @@ _scsih_target_destroy(struct scsi_target *starget)
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > rphy = dev_to_rphy(starget->dev.parent);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > - rphy->identify.sas_address);
> > + sas_device = __mpt2sas_get_sdev_from_target(ioc, sas_target_priv_data);
> > if (sas_device && (sas_device->starget == starget) &&
> > (sas_device->id == starget->id) &&
> > (sas_device->channel == starget->channel))
> > sas_device->starget = NULL;
> >
> > + if (sas_device)
> > + sas_device_put(sas_device);
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > out:
> > @@ -1386,7 +1469,7 @@ _scsih_slave_alloc(struct scsi_device *sdev)
> >
> > if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> > sas_target_priv_data->sas_address);
> > if (sas_device && (sas_device->starget == NULL)) {
> > sdev_printk(KERN_INFO, sdev,
> > @@ -1394,6 +1477,10 @@ _scsih_slave_alloc(struct scsi_device *sdev)
> > __func__, __LINE__);
> > sas_device->starget = starget;
> > }
> > +
> > + if (sas_device)
> > + sas_device_put(sas_device);
> > +
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > }
> >
> > @@ -1428,10 +1515,13 @@ _scsih_slave_destroy(struct scsi_device *sdev)
> >
> > if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > - sas_target_priv_data->sas_address);
> > + sas_device = __mpt2sas_get_sdev_from_target(ioc,
> > + sas_target_priv_data);
> > if (sas_device && !sas_target_priv_data->num_luns)
> > sas_device->starget = NULL;
> > +
> > + if (sas_device)
> > + sas_device_put(sas_device);
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > }
> >
> > @@ -2078,7 +2168,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
> > }
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> > sas_device_priv_data->sas_target->sas_address);
> > if (!sas_device) {
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > @@ -2112,17 +2202,18 @@ _scsih_slave_configure(struct scsi_device *sdev)
> > (unsigned long long) sas_device->enclosure_logical_id,
> > sas_device->slot);
> >
> > + sas_device_put(sas_device);
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > if (!ssp_target)
> > _scsih_display_sata_capabilities(ioc, handle, sdev);
> >
> > -
> > _scsih_change_queue_depth(sdev, qdepth);
> >
> > if (ssp_target) {
> > sas_read_port_mode_page(sdev);
> > _scsih_enable_tlr(ioc, sdev);
> > }
> > +
> > return 0;
> > }
> >
> > @@ -2509,8 +2600,7 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
> > device_str, (unsigned long long)priv_target->sas_address);
> > } else {
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > - priv_target->sas_address);
> > + sas_device = __mpt2sas_get_sdev_from_target(ioc, priv_target);
> > if (sas_device) {
> > if (priv_target->flags &
> > MPT_TARGET_FLAGS_RAID_COMPONENT) {
> > @@ -2529,6 +2619,8 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
> > "enclosure_logical_id(0x%016llx), slot(%d)\n",
> > (unsigned long long)sas_device->enclosure_logical_id,
> > sas_device->slot);
> > +
> > + sas_device_put(sas_device);
> > }
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > }
> > @@ -2604,12 +2696,12 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
> > {
> > struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
> > struct MPT2SAS_DEVICE *sas_device_priv_data;
> > - struct _sas_device *sas_device;
> > - unsigned long flags;
> > + struct _sas_device *sas_device = NULL;
> > u16 handle;
> > int r;
> >
> > struct scsi_target *starget = scmd->device->sdev_target;
> > + struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
> >
> > starget_printk(KERN_INFO, starget, "attempting device reset! "
> > "scmd(%p)\n", scmd);
> > @@ -2629,12 +2721,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
> > handle = 0;
> > if (sas_device_priv_data->sas_target->flags &
> > MPT_TARGET_FLAGS_RAID_COMPONENT) {
> > - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = _scsih_sas_device_find_by_handle(ioc,
> > - sas_device_priv_data->sas_target->handle);
> > + sas_device = mpt2sas_get_sdev_from_target(ioc,
> > + target_priv_data);
> > if (sas_device)
> > handle = sas_device->volume_handle;
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > } else
> > handle = sas_device_priv_data->sas_target->handle;
> >
> > @@ -2651,6 +2741,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
> > out:
> > sdev_printk(KERN_INFO, scmd->device, "device reset: %s scmd(%p)\n",
> > ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
> > +
> > + if (sas_device)
> > + sas_device_put(sas_device);
> > +
> > return r;
> > }
> >
> > @@ -2665,11 +2759,11 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
> > {
> > struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
> > struct MPT2SAS_DEVICE *sas_device_priv_data;
> > - struct _sas_device *sas_device;
> > - unsigned long flags;
> > + struct _sas_device *sas_device = NULL;
> > u16 handle;
> > int r;
> > struct scsi_target *starget = scmd->device->sdev_target;
> > + struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;
> >
> > starget_printk(KERN_INFO, starget, "attempting target reset! "
> > "scmd(%p)\n", scmd);
> > @@ -2689,12 +2783,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
> > handle = 0;
> > if (sas_device_priv_data->sas_target->flags &
> > MPT_TARGET_FLAGS_RAID_COMPONENT) {
> > - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = _scsih_sas_device_find_by_handle(ioc,
> > - sas_device_priv_data->sas_target->handle);
> > + sas_device = mpt2sas_get_sdev_from_target(ioc,
> > + target_priv_data);
> > if (sas_device)
> > handle = sas_device->volume_handle;
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > } else
> > handle = sas_device_priv_data->sas_target->handle;
> >
> > @@ -2711,6 +2803,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
> > out:
> > starget_printk(KERN_INFO, starget, "target reset: %s scmd(%p)\n",
> > ((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
> > +
> > + if (sas_device)
> > + sas_device_put(sas_device);
> > +
> > return r;
> > }
> >
> > @@ -3002,15 +3098,15 @@ _scsih_block_io_to_children_attached_to_ex(struct MPT2SAS_ADAPTER *ioc,
> >
> > list_for_each_entry(mpt2sas_port,
> > &sas_expander->sas_port_list, port_list) {
> > - if (mpt2sas_port->remote_identify.device_type ==
> > - SAS_END_DEVICE) {
> > + if (mpt2sas_port->remote_identify.device_type == SAS_END_DEVICE) {
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device =
> > - mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > - mpt2sas_port->remote_identify.sas_address);
> > - if (sas_device)
> > + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> > + mpt2sas_port->remote_identify.sas_address);
> > + if (sas_device) {
> > set_bit(sas_device->handle,
> > - ioc->blocking_handles);
> > + ioc->blocking_handles);
> > + sas_device_put(sas_device);
> > + }
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > }
> > }
> > @@ -3080,7 +3176,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > {
> > Mpi2SCSITaskManagementRequest_t *mpi_request;
> > u16 smid;
> > - struct _sas_device *sas_device;
> > + struct _sas_device *sas_device = NULL;
> > struct MPT2SAS_TARGET *sas_target_priv_data = NULL;
> > u64 sas_address = 0;
> > unsigned long flags;
> > @@ -3110,7 +3206,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > return;
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> > if (sas_device && sas_device->starget &&
> > sas_device->starget->hostdata) {
> > sas_target_priv_data = sas_device->starget->hostdata;
> > @@ -3131,14 +3227,14 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > if (!smid) {
> > delayed_tr = kzalloc(sizeof(*delayed_tr), GFP_ATOMIC);
> > if (!delayed_tr)
> > - return;
> > + goto out;
> > INIT_LIST_HEAD(&delayed_tr->list);
> > delayed_tr->handle = handle;
> > list_add_tail(&delayed_tr->list, &ioc->delayed_tr_list);
> > dewtprintk(ioc, printk(MPT2SAS_INFO_FMT
> > "DELAYED:tr:handle(0x%04x), (open)\n",
> > ioc->name, handle));
> > - return;
> > + goto out;
> > }
> >
> > dewtprintk(ioc, printk(MPT2SAS_INFO_FMT "tr_send:handle(0x%04x), "
> > @@ -3150,6 +3246,9 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > mpi_request->DevHandle = cpu_to_le16(handle);
> > mpi_request->TaskType = MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET;
> > mpt2sas_base_put_smid_hi_priority(ioc, smid);
> > +out:
> > + if (sas_device)
> > + sas_device_put(sas_device);
> > }
> >
> >
> > @@ -4068,7 +4167,6 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
> > char *desc_scsi_state = ioc->tmp_string;
> > u32 log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
> > struct _sas_device *sas_device = NULL;
> > - unsigned long flags;
> > struct scsi_target *starget = scmd->device->sdev_target;
> > struct MPT2SAS_TARGET *priv_target = starget->hostdata;
> > char *device_str = NULL;
> > @@ -4200,9 +4298,7 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
> > printk(MPT2SAS_WARN_FMT "\t%s wwid(0x%016llx)\n", ioc->name,
> > device_str, (unsigned long long)priv_target->sas_address);
> > } else {
> > - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > - priv_target->sas_address);
> > + sas_device = mpt2sas_get_sdev_from_target(ioc, priv_target);
> > if (sas_device) {
> > printk(MPT2SAS_WARN_FMT "\tsas_address(0x%016llx), "
> > "phy(%d)\n", ioc->name, sas_device->sas_address,
> > @@ -4211,8 +4307,9 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
> > "\tenclosure_logical_id(0x%016llx), slot(%d)\n",
> > ioc->name, sas_device->enclosure_logical_id,
> > sas_device->slot);
> > +
> > + sas_device_put(sas_device);
> > }
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > }
> >
> > printk(MPT2SAS_WARN_FMT "\thandle(0x%04x), ioc_status(%s)(0x%04x), "
> > @@ -4259,7 +4356,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > Mpi2SepRequest_t mpi_request;
> > struct _sas_device *sas_device;
> >
> > - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > + sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> > if (!sas_device)
> > return;
> >
> > @@ -4274,7 +4371,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > &mpi_request)) != 0) {
> > printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n", ioc->name,
> > __FILE__, __LINE__, __func__);
> > - return;
> > + goto out;
> > }
> > sas_device->pfa_led_on = 1;
> >
> > @@ -4284,8 +4381,10 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > "enclosure_processor: ioc_status (0x%04x), loginfo(0x%08x)\n",
> > ioc->name, le16_to_cpu(mpi_reply.IOCStatus),
> > le32_to_cpu(mpi_reply.IOCLogInfo)));
> > - return;
> > + goto out;
> > }
> > +out:
> > + sas_device_put(sas_device);
> > }
> >
> > /**
> > @@ -4370,19 +4469,17 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >
> > /* only handle non-raid devices */
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> > if (!sas_device) {
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - return;
> > + goto out_unlock;
> > }
> > starget = sas_device->starget;
> > sas_target_priv_data = starget->hostdata;
> >
> > if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_RAID_COMPONENT) ||
> > - ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))) {
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - return;
> > - }
> > + ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)))
> > + goto out_unlock;
> > +
> > starget_printk(KERN_WARNING, starget, "predicted fault\n");
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > @@ -4396,7 +4493,7 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > if (!event_reply) {
> > printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
> > ioc->name, __FILE__, __LINE__, __func__);
> > - return;
> > + goto out;
> > }
> >
> > event_reply->Function = MPI2_FUNCTION_EVENT_NOTIFICATION;
> > @@ -4413,6 +4510,14 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > event_data->SASAddress = cpu_to_le64(sas_target_priv_data->sas_address);
> > mpt2sas_ctl_add_to_event_log(ioc, event_reply);
> > kfree(event_reply);
> > +out:
> > + if (sas_device)
> > + sas_device_put(sas_device);
> > + return;
> > +
> > +out_unlock:
> > + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > + goto out;
> > }
> >
> > /**
> > @@ -5148,14 +5253,13 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > sas_address = le64_to_cpu(sas_device_pg0.SASAddress);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> > sas_address);
> >
> > if (!sas_device) {
> > printk(MPT2SAS_ERR_FMT "device is not present "
> > "handle(0x%04x), no sas_device!!!\n", ioc->name, handle);
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - return;
> > + goto out_unlock;
> > }
> >
> > if (unlikely(sas_device->handle != handle)) {
> > @@ -5172,19 +5276,22 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > MPI2_SAS_DEVICE0_FLAGS_DEVICE_PRESENT)) {
> > printk(MPT2SAS_ERR_FMT "device is not present "
> > "handle(0x%04x), flags!!!\n", ioc->name, handle);
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - return;
> > + goto out_unlock;
> > }
> >
> > /* check if there were any issues with discovery */
> > if (_scsih_check_access_status(ioc, sas_address, handle,
> > - sas_device_pg0.AccessStatus)) {
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - return;
> > - }
> > + sas_device_pg0.AccessStatus))
> > + goto out_unlock;
> > +
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > _scsih_ublock_io_device(ioc, sas_address);
> > + return;
>
> [Sreekanth] I think the driver returns from this function here without
> dropping the reference.

Yes, it does. Will fix.
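
One way to avoid this class of leak is to route the success path through
the same exit label as the error paths. A userspace sketch of that control
flow (hypothetical toy_* names, locking omitted, not the actual v4 code):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct _sas_device. */
struct toy_dev {
	int refcount;
	int present;
};

static int toy_unblocked;	/* counts calls to the "unblock" step */

static void toy_check_device(struct toy_dev *found)
{
	struct toy_dev *d = found;

	if (d)
		d->refcount++;	/* the lookup took a reference */

	if (!d)
		goto out;
	if (!d->present)
		goto out;

	toy_unblocked++;	/* stand-in for _scsih_ublock_io_device() */
out:
	if (d)
		d->refcount--;	/* single put() shared by all paths */
}
```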

> >
> > +out_unlock:
> > + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > + if (sas_device)
> > + sas_device_put(sas_device);
> > }
> >
> > /**
> > @@ -5208,7 +5315,6 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
> > u32 ioc_status;
> > __le64 sas_address;
> > u32 device_info;
> > - unsigned long flags;
> >
> > if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
> > MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
> > @@ -5250,14 +5356,13 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
> > return -1;
> > }
> >
> > -
> > - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > + sas_device = mpt2sas_get_sdev_by_addr(ioc,
> > sas_address);
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> >
> > - if (sas_device)
> > + if (sas_device) {
> > + sas_device_put(sas_device);
> > return 0;
> > + }
> >
> > sas_device = kzalloc(sizeof(struct _sas_device),
> > GFP_KERNEL);
> > @@ -5267,6 +5372,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
> > return -1;
> > }
> >
> > + kref_init(&sas_device->refcount);
> > sas_device->handle = handle;
> > if (_scsih_get_sas_address(ioc, le16_to_cpu
> > (sas_device_pg0.ParentDevHandle),
> > @@ -5344,7 +5450,6 @@ _scsih_remove_device(struct MPT2SAS_ADAPTER *ioc,
> > "handle(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
> > sas_device->handle, (unsigned long long)
> > sas_device->sas_address));
> > - kfree(sas_device);
> > }
> > /**
> > * _scsih_device_remove_by_handle - removing device object by handle
> > @@ -5363,12 +5468,17 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
> > return;
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > - if (sas_device)
> > - list_del(&sas_device->list);
> > + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> > + if (sas_device) {
> > + list_del_init(&sas_device->list);
> > + sas_device_put(sas_device);
> > + }
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - if (sas_device)
> > +
> > + if (sas_device) {
> > _scsih_remove_device(ioc, sas_device);
> > + sas_device_put(sas_device);
> > + }
> > }
> >
> > /**
> > @@ -5389,13 +5499,17 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
> > return;
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > - sas_address);
> > - if (sas_device)
> > - list_del(&sas_device->list);
> > + sas_device = __mpt2sas_get_sdev_by_addr(ioc, sas_address);
> > + if (sas_device) {
> > + list_del_init(&sas_device->list);
> > + sas_device_put(sas_device);
> > + }
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - if (sas_device)
> > +
> > + if (sas_device) {
> > _scsih_remove_device(ioc, sas_device);
> > + sas_device_put(sas_device);
> > + }
> > }
> > #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
> > /**
> > @@ -5716,26 +5830,28 @@ _scsih_sas_device_status_change_event(struct MPT2SAS_ADAPTER *ioc,
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > sas_address = le64_to_cpu(event_data->SASAddress);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> > sas_address);
> >
> > - if (!sas_device || !sas_device->starget) {
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - return;
> > - }
> > + if (!sas_device || !sas_device->starget)
> > + goto out;
> >
> > target_priv_data = sas_device->starget->hostdata;
> > - if (!target_priv_data) {
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - return;
> > - }
> > + if (!target_priv_data)
> > + goto out;
> >
> > if (event_data->ReasonCode ==
> > MPI2_EVENT_SAS_DEV_STAT_RC_INTERNAL_DEVICE_RESET)
> > target_priv_data->tm_busy = 1;
> > else
> > target_priv_data->tm_busy = 0;
> > +
> > +out:
> > + if (sas_device)
> > + sas_device_put(sas_device);
> > +
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +
> > }
> >
> > #ifdef CONFIG_SCSI_MPT2SAS_LOGGING
> > @@ -6123,7 +6239,7 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
> > u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> > if (sas_device) {
> > sas_device->volume_handle = 0;
> > sas_device->volume_wwid = 0;
> > @@ -6142,6 +6258,8 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
> > /* exposing raid component */
> > if (starget)
> > starget_for_each_device(starget, NULL, _scsih_reprobe_lun);
> > +
> > + sas_device_put(sas_device);
> > }
> >
> > /**
> > @@ -6170,7 +6288,7 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
> > &volume_wwid);
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > + sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
> > if (sas_device) {
> > set_bit(handle, ioc->pd_handles);
> > if (sas_device->starget && sas_device->starget->hostdata) {
> > @@ -6189,6 +6307,8 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
> > /* hiding raid component */
> > if (starget)
> > starget_for_each_device(starget, (void *)1, _scsih_reprobe_lun);
> > +
> > + sas_device_put(sas_device);
> > }
> >
> > /**
> > @@ -6221,7 +6341,6 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
> > Mpi2EventIrConfigElement_t *element)
> > {
> > struct _sas_device *sas_device;
> > - unsigned long flags;
> > u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
> > Mpi2ConfigReply_t mpi_reply;
> > Mpi2SasDevicePage0_t sas_device_pg0;
> > @@ -6231,11 +6350,11 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
> >
> > set_bit(handle, ioc->pd_handles);
> >
> > - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - if (sas_device)
> > + sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> > + if (sas_device) {
> > + sas_device_put(sas_device);
> > return;
> > + }
> >
> > if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
> > MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
> > @@ -6509,7 +6628,6 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
> > u16 handle, parent_handle;
> > u32 state;
> > struct _sas_device *sas_device;
> > - unsigned long flags;
> > Mpi2ConfigReply_t mpi_reply;
> > Mpi2SasDevicePage0_t sas_device_pg0;
> > u32 ioc_status;
> > @@ -6542,12 +6660,11 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
> > if (!ioc->is_warpdrive)
> > set_bit(handle, ioc->pd_handles);
> >
> > - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > -
> > - if (sas_device)
> > + sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> > + if (sas_device) {
> > + sas_device_put(sas_device);
> > return;
> > + }
> >
> > if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
> > &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
> > @@ -7015,6 +7132,7 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
> > struct _raid_device *raid_device, *raid_device_next;
> > struct list_head tmp_list;
> > unsigned long flags;
> > + LIST_HEAD(head);
> >
> > printk(MPT2SAS_INFO_FMT "removing unresponding devices: start\n",
> > ioc->name);
> > @@ -7022,14 +7140,29 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
> > /* removing unresponding end devices */
> > printk(MPT2SAS_INFO_FMT "removing unresponding devices: end-devices\n",
> > ioc->name);
> > +
> > + /*
> > + * Iterate, pulling off devices marked as non-responding. We become the
> > + * owner for the reference the list had on any object we prune.
> > + */
> > + spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > list_for_each_entry_safe(sas_device, sas_device_next,
> > - &ioc->sas_device_list, list) {
> > + &ioc->sas_device_list, list) {
> > if (!sas_device->responding)
> > - mpt2sas_device_remove_by_sas_address(ioc,
> > - sas_device->sas_address);
> > + list_move_tail(&sas_device->list, &head);
> > else
> > sas_device->responding = 0;
> > }
> > + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +
> > + /*
> > + * Now, uninitialize and remove the unresponding devices we pruned.
> > + */
> > + list_for_each_entry_safe(sas_device, sas_device_next, &head, list) {
> > + _scsih_remove_device(ioc, sas_device);
> > + list_del_init(&sas_device->list);
> > + sas_device_put(sas_device);
> > + }
> >
> > /* removing unresponding volumes */
> > if (ioc->ir_firmware) {
> > @@ -7179,11 +7312,11 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
> > }
> > phys_disk_num = pd_pg0.PhysDiskNum;
> > handle = le16_to_cpu(pd_pg0.DevHandle);
> > - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - if (sas_device)
> > + sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
> > + if (sas_device) {
> > + sas_device_put(sas_device);
> > continue;
> > + }
> > if (mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
> > &sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
> > handle) != 0)
> > @@ -7302,12 +7435,12 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
> > if (!(_scsih_is_end_device(
> > le32_to_cpu(sas_device_pg0.DeviceInfo))))
> > continue;
> > - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > + sas_device = mpt2sas_get_sdev_by_addr(ioc,
> > le64_to_cpu(sas_device_pg0.SASAddress));
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > - if (sas_device)
> > + if (sas_device) {
> > + sas_device_put(sas_device);
> > continue;
> > + }
> > parent_handle = le16_to_cpu(sas_device_pg0.ParentDevHandle);
> > if (!_scsih_get_sas_address(ioc, parent_handle, &sas_address)) {
> > printk(MPT2SAS_INFO_FMT "\tBEFORE adding end device: "
> > @@ -7966,6 +8099,48 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
> > }
> > }
> >
> > +static struct _sas_device *get_next_sas_device(struct MPT2SAS_ADAPTER *ioc)
> > +{
> > + struct _sas_device *sas_device = NULL;
> > + unsigned long flags;
> > +
> > + spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > + if (!list_empty(&ioc->sas_device_init_list)) {
> > + sas_device = list_first_entry(&ioc->sas_device_init_list,
> > + struct _sas_device, list);
> > + sas_device_get(sas_device);
> > + }
> > + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +
> > + return sas_device;
> > +}
> > +
> > +static void sas_device_make_active(struct MPT2SAS_ADAPTER *ioc,
> > + struct _sas_device *sas_device)
> > +{
> > + unsigned long flags;
> > +
> > + spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > +
> > + /*
> > + * Since we dropped the lock during the call to port_add(), we need to
> > + * be careful here that somebody else didn't move or delete this item
> > + * while we were busy with other things.
> > + *
> > + * If it was on the list, we need a put() for the reference the list
> > + * had. Either way, we need a get() for the destination list.
> > + */
> > + if (!list_empty(&sas_device->list)) {
> > + list_del_init(&sas_device->list);
> > + sas_device_put(sas_device);
> > + }
> > +
> > + sas_device_get(sas_device);
> > + list_add_tail(&sas_device->list, &ioc->sas_device_list);
> > +
> > + spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +}
> > +
> > /**
> > * _scsih_probe_sas - reporting sas devices to sas transport
> > * @ioc: per adapter object
> > @@ -7975,34 +8150,30 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
> > static void
> > _scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
> > {
> > - struct _sas_device *sas_device, *next;
> > - unsigned long flags;
> > -
> > - /* SAS Device List */
> > - list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
> > - list) {
> > + struct _sas_device *sas_device;
> >
> > - if (ioc->hide_drives)
> > - continue;
> > + if (ioc->hide_drives)
> > + return;
> >
> > + while ((sas_device = get_next_sas_device(ioc))) {
> > if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
> > - sas_device->sas_address_parent)) {
> > - list_del(&sas_device->list);
> > - kfree(sas_device);
> > + sas_device->sas_address_parent)) {
> > + _scsih_sas_device_remove(ioc, sas_device);
> > + sas_device_put(sas_device);
> > continue;
> > } else if (!sas_device->starget) {
> > if (!ioc->is_driver_loading) {
> > mpt2sas_transport_port_remove(ioc,
> > - sas_device->sas_address,
> > - sas_device->sas_address_parent);
> > - list_del(&sas_device->list);
> > - kfree(sas_device);
> > + sas_device->sas_address,
> > + sas_device->sas_address_parent);
> > + _scsih_sas_device_remove(ioc, sas_device);
> > + sas_device_put(sas_device);
> > continue;
> > }
> > }
> > - spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - list_move_tail(&sas_device->list, &ioc->sas_device_list);
> > - spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > +
> > + sas_device_make_active(ioc, sas_device);
> > + sas_device_put(sas_device);
> > }
> > }
> >
> > diff --git a/drivers/scsi/mpt2sas/mpt2sas_transport.c b/drivers/scsi/mpt2sas/mpt2sas_transport.c
> > index ff2500a..af86800 100644
> > --- a/drivers/scsi/mpt2sas/mpt2sas_transport.c
> > +++ b/drivers/scsi/mpt2sas/mpt2sas_transport.c
> > @@ -1323,15 +1323,17 @@ _transport_get_enclosure_identifier(struct sas_rphy *rphy, u64 *identifier)
> > int rc;
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> > rphy->identify.sas_address);
> > if (sas_device) {
> > *identifier = sas_device->enclosure_logical_id;
> > rc = 0;
> > + sas_device_put(sas_device);
> > } else {
> > *identifier = 0;
> > rc = -ENXIO;
> > }
> > +
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > return rc;
> > }
> > @@ -1351,12 +1353,14 @@ _transport_get_bay_identifier(struct sas_rphy *rphy)
> > int rc;
> >
> > spin_lock_irqsave(&ioc->sas_device_lock, flags);
> > - sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
> > + sas_device = __mpt2sas_get_sdev_by_addr(ioc,
> > rphy->identify.sas_address);
> > - if (sas_device)
> > + if (sas_device) {
> > rc = sas_device->slot;
> > - else
> > + sas_device_put(sas_device);
> > + } else {
> > rc = -ENXIO;
> > + }
> > spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
> > return rc;
> > }
> > --
> > 1.8.5.6
> >
>
>
>
> --
>
> Regards,
> Sreekanth

2015-08-14 01:48:29

by Calvin Owens

Subject: [PATCH v4 0/2] Fixes for memory corruption in mpt2sas

Hello all,

This patchset attempts to address problems we've been having with
panics due to memory corruption from the mpt2sas driver.

Thanks,
Calvin


[PATCH v4 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list
[PATCH v4 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage

Total diffstat:
drivers/scsi/mpt2sas/mpt2sas_base.h | 22 +-
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 592 ++++++++++++++++++++++---------
drivers/scsi/mpt2sas/mpt2sas_transport.c | 12 +-
3 files changed, 451 insertions(+), 175 deletions(-)

Diff showing changes v3 => v4:
http://jcalvinowens.github.io/stuff/mpt2sas-patchset-v3v4.patch

Diff showing changes v2 => v3:
http://jcalvinowens.github.io/stuff/mpt2sas-patchset-v2v3.patch

Diff showing changes v1 => v2:
http://jcalvinowens.github.io/stuff/mpt2sas-patchset-v1v2.patch

2015-08-14 01:48:55

by Calvin Owens

Subject: [PATCH v4 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage

These objects can be referenced concurrently throughout the driver, so we
need a way to make sure threads can't delete them out from under each
other. This patch adds the refcount and refactors the code to use it.

Additionally, we cannot iterate over the sas_device_list without
holding the lock, or we risk corrupting random memory if items are
added or deleted as we iterate. This patch refactors _scsih_probe_sas()
to use the sas_device_list in a safe way.
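
The ownership rule this implies can be sketched in plain userspace C. This is
only an illustration, not driver code: struct demo_dev and the demo_* helpers
are hypothetical stand-ins, a plain int replaces struct kref, and the sketch
is single-threaded, so the locking is left out. The list owns one reference,
every lookup takes its own, and the object is freed only on the final put.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct _sas_device; not the driver's layout. */
struct demo_dev {
	int refcount;		/* plain int: the sketch is single-threaded */
	int on_list;		/* 1 while the (imaginary) list references us */
	int *freed_flag;	/* set to 1 by the release path, for checking */
};

/* Like kzalloc() + kref_init(): the creator starts with one reference. */
static struct demo_dev *demo_alloc(int *freed_flag)
{
	struct demo_dev *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	d->refcount = 1;
	d->freed_flag = freed_flag;
	return d;
}

/* Like sas_device_get(): the caller already holds a valid reference. */
static void demo_get(struct demo_dev *d)
{
	d->refcount++;
}

/* Like sas_device_put(): the last put frees the object. */
static void demo_put(struct demo_dev *d)
{
	if (--d->refcount == 0) {
		*d->freed_flag = 1;
		free(d);
	}
}

/* Adding to the list takes a reference on behalf of the list. */
static void demo_list_add(struct demo_dev *d)
{
	demo_get(d);
	d->on_list = 1;
}

/* Removal drops the list's reference, and only if we were still listed. */
static void demo_list_remove(struct demo_dev *d)
{
	if (d->on_list) {
		d->on_list = 0;
		demo_put(d);
	}
}
```

With this rule, a thread that looked a device up keeps a usable pointer even
if another thread unlinks the device in the meantime; the memory goes away
only when both the list's reference and every lookup reference are dropped.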

Cc: Christoph Hellwig <[email protected]>
Cc: Bart Van Assche <[email protected]>
Cc: Joe Lawrence <[email protected]>
Signed-off-by: Calvin Owens <[email protected]>
---
Changes in v4:
* Fix lack of put() in non-SATA case in _scsih_change_queue_depth()
* Fix lack of put() in the non-error case in _scsih_check_device()
* Add missing put() at bottom of _scsih_add_device()
* Add put for ->hostdata pointer in _scsih_target_destroy() for the
get() in _scsih_target_alloc()

Changes in v3:
* Drop the sas_device_lock while enabling devices, and leave the
sas_device object on the list, since it may need to be looked up there
while it is being enabled.
* Drop put() in _scsih_add_device(), because the ->hostdata now keeps a
reference (this was an oversight in v2).
* Be consistent about calling sas_device_put() while holding the
sas_device_lock where feasible.
* Take and assert_spin_locked() on the sas_device_lock from the newly
added __get_sdev_from_target(), add wrapper similar to other lookups
for callers which do not explicitly take the lock.

Changes in v2:
* Squished patches 1-3 into this one
* s/BUG_ON(!spin_is_locked/assert_spin_locked/g
* Store a pointer to the sas_device object in ->hostdata, to eliminate
the need for several lookups on the lists.
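
For readers following the v3 change to _scsih_probe_sas(), here is a rough
userspace sketch of the loop shape: re-read the head of the init list on every
pass instead of keeping an iterator across the unlocked section, and check
whether the entry is still linked before moving it. Everything here is an
assumption made for illustration: struct probe_dev, the probe_* helpers and
the tiny list implementation are not the kernel's, and the places where the
driver would hold sas_device_lock are only marked in comments.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modeled loosely on list_head. */
struct node { struct node *prev, *next; };

static void list_init(struct node *n) { n->prev = n->next = n; }
static int list_is_empty(const struct node *n) { return n->next == n; }

static void list_append(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink and re-initialize, so list_is_empty(&n) is true afterwards. */
static void list_unlink(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Hypothetical device object; refcount is a plain int in this sketch. */
struct probe_dev { struct node link; int refcount; int id; };

static void probe_get(struct probe_dev *d) { d->refcount++; }
static void probe_put(struct probe_dev *d) { assert(d->refcount > 0); d->refcount--; }

/* Put a device on the init list; the list holds its own reference. */
static void probe_init_add(struct probe_dev *d, struct node *init_list)
{
	/* lock would be taken here */
	probe_get(d);
	list_append(&d->link, init_list);
	/* lock would be released here */
}

/* What a concurrent remover does: unlink and drop the list's reference. */
static void probe_remove(struct probe_dev *d)
{
	/* lock would be taken here */
	if (!list_is_empty(&d->link)) {
		list_unlink(&d->link);
		probe_put(d);
	}
	/* lock would be released here */
}

/* Return the first init-list entry with a reference taken, or NULL. */
static struct probe_dev *probe_get_next(struct node *init_list)
{
	struct probe_dev *d = NULL;

	/* lock would be taken here */
	if (!list_is_empty(init_list)) {
		d = (struct probe_dev *)((char *)init_list->next -
					 offsetof(struct probe_dev, link));
		probe_get(d);
	}
	/* lock would be released here */
	return d;
}

/* Move to the active list, tolerating removal while the lock was dropped. */
static void probe_make_active(struct probe_dev *d, struct node *active_list)
{
	/* lock would be taken here */
	if (!list_is_empty(&d->link)) {
		list_unlink(&d->link);
		probe_put(d);		/* reference the old list held */
	}
	probe_get(d);			/* reference for the new list */
	list_append(&d->link, active_list);
	/* lock would be released here */
}
```

The caller loops with `while ((d = probe_get_next(&init_list)))`, does the slow
work with no lock held, calls probe_make_active(), and then drops its own
reference. Because nothing is cached between passes, an entry removed by
another thread cannot leave the loop holding a stale next pointer.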

drivers/scsi/mpt2sas/mpt2sas_base.h | 22 +-
drivers/scsi/mpt2sas/mpt2sas_scsih.c | 480 +++++++++++++++++++++----------
drivers/scsi/mpt2sas/mpt2sas_transport.c | 12 +-
3 files changed, 360 insertions(+), 154 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h
index caff8d1..78f41ac 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.h
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.h
@@ -238,6 +238,7 @@
* @flags: MPT_TARGET_FLAGS_XXX flags
* @deleted: target flaged for deletion
* @tm_busy: target is busy with TM request.
+ * @sdev: The sas_device associated with this target
*/
struct MPT2SAS_TARGET {
struct scsi_target *starget;
@@ -248,6 +249,7 @@ struct MPT2SAS_TARGET {
u32 flags;
u8 deleted;
u8 tm_busy;
+ struct _sas_device *sdev;
};


@@ -376,8 +378,24 @@ struct _sas_device {
u8 phy;
u8 responding;
u8 pfa_led_on;
+ struct kref refcount;
};

+static inline void sas_device_get(struct _sas_device *s)
+{
+ kref_get(&s->refcount);
+}
+
+static inline void sas_device_free(struct kref *r)
+{
+ kfree(container_of(r, struct _sas_device, refcount));
+}
+
+static inline void sas_device_put(struct _sas_device *s)
+{
+ kref_put(&s->refcount, sas_device_free);
+}
+
/**
* struct _raid_device - raid volume link list
* @list: sas device list
@@ -1095,7 +1113,9 @@ struct _sas_node *mpt2sas_scsih_expander_find_by_handle(struct MPT2SAS_ADAPTER *
u16 handle);
struct _sas_node *mpt2sas_scsih_expander_find_by_sas_address(struct MPT2SAS_ADAPTER
*ioc, u64 sas_address);
-struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address(
+struct _sas_device *mpt2sas_get_sdev_by_addr(
+ struct MPT2SAS_ADAPTER *ioc, u64 sas_address);
+struct _sas_device *__mpt2sas_get_sdev_by_addr(
struct MPT2SAS_ADAPTER *ioc, u64 sas_address);

void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc);
diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 3f26147..5eca3a4 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -526,8 +526,61 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
}
}

+static struct _sas_device *
+__mpt2sas_get_sdev_from_target(struct MPT2SAS_ADAPTER *ioc,
+ struct MPT2SAS_TARGET *tgt_priv)
+{
+ struct _sas_device *ret;
+
+ assert_spin_locked(&ioc->sas_device_lock);
+
+ ret = tgt_priv->sdev;
+ if (ret)
+ sas_device_get(ret);
+
+ return ret;
+}
+
+static struct _sas_device *
+mpt2sas_get_sdev_from_target(struct MPT2SAS_ADAPTER *ioc,
+ struct MPT2SAS_TARGET *tgt_priv)
+{
+ struct _sas_device *ret;
+ unsigned long flags;
+
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ ret = __mpt2sas_get_sdev_from_target(ioc, tgt_priv);
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+ return ret;
+}
+
+
+struct _sas_device *
+__mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
+ u64 sas_address)
+{
+ struct _sas_device *sas_device;
+
+ assert_spin_locked(&ioc->sas_device_lock);
+
+ list_for_each_entry(sas_device, &ioc->sas_device_list, list)
+ if (sas_device->sas_address == sas_address)
+ goto found_device;
+
+ list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
+ if (sas_device->sas_address == sas_address)
+ goto found_device;
+
+ return NULL;
+
+found_device:
+ sas_device_get(sas_device);
+ return sas_device;
+}
+
/**
- * mpt2sas_scsih_sas_device_find_by_sas_address - sas device search
+ * mpt2sas_get_sdev_by_addr - sas device search
* @ioc: per adapter object
* @sas_address: sas address
* Context: Calling function should acquire ioc->sas_device_lock
@@ -536,24 +589,44 @@ _scsih_determine_boot_device(struct MPT2SAS_ADAPTER *ioc,
* object.
*/
struct _sas_device *
-mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
+mpt2sas_get_sdev_by_addr(struct MPT2SAS_ADAPTER *ioc,
u64 sas_address)
{
struct _sas_device *sas_device;
+ unsigned long flags;
+
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
+ sas_address);
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+ return sas_device;
+}
+
+static struct _sas_device *
+__mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
+{
+ struct _sas_device *sas_device;
+
+ assert_spin_locked(&ioc->sas_device_lock);

list_for_each_entry(sas_device, &ioc->sas_device_list, list)
- if (sas_device->sas_address == sas_address)
- return sas_device;
+ if (sas_device->handle == handle)
+ goto found_device;

list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
- if (sas_device->sas_address == sas_address)
- return sas_device;
+ if (sas_device->handle == handle)
+ goto found_device;

return NULL;
+
+found_device:
+ sas_device_get(sas_device);
+ return sas_device;
}

/**
- * _scsih_sas_device_find_by_handle - sas device search
+ * mpt2sas_get_sdev_by_handle - sas device search
* @ioc: per adapter object
* @handle: sas device handle (assigned by firmware)
* Context: Calling function should acquire ioc->sas_device_lock
@@ -562,19 +635,16 @@ mpt2sas_scsih_sas_device_find_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
* object.
*/
static struct _sas_device *
-_scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
+mpt2sas_get_sdev_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
{
struct _sas_device *sas_device;
+ unsigned long flags;

- list_for_each_entry(sas_device, &ioc->sas_device_list, list)
- if (sas_device->handle == handle)
- return sas_device;
-
- list_for_each_entry(sas_device, &ioc->sas_device_init_list, list)
- if (sas_device->handle == handle)
- return sas_device;
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

- return NULL;
+ return sas_device;
}

/**
@@ -583,7 +653,7 @@ _scsih_sas_device_find_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
* @sas_device: the sas_device object
* Context: This function will acquire ioc->sas_device_lock.
*
- * Removing object and freeing associated memory from the ioc->sas_device_list.
+ * If sas_device is on the list, remove it and decrement its reference count.
*/
static void
_scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
@@ -594,9 +664,15 @@ _scsih_sas_device_remove(struct MPT2SAS_ADAPTER *ioc,
if (!sas_device)
return;

+ /*
+ * The lock serializes access to the list, but we still need to verify
+ * that nobody removed the entry while we were waiting on the lock.
+ */
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- list_del(&sas_device->list);
- kfree(sas_device);
+ if (!list_empty(&sas_device->list)) {
+ list_del_init(&sas_device->list);
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}

@@ -620,6 +696,7 @@ _scsih_sas_device_add(struct MPT2SAS_ADAPTER *ioc,
sas_device->handle, (unsigned long long)sas_device->sas_address));

spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ sas_device_get(sas_device);
list_add_tail(&sas_device->list, &ioc->sas_device_list);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

@@ -659,6 +736,7 @@ _scsih_sas_device_init_add(struct MPT2SAS_ADAPTER *ioc,
sas_device->handle, (unsigned long long)sas_device->sas_address));

spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ sas_device_get(sas_device);
list_add_tail(&sas_device->list, &ioc->sas_device_init_list);
_scsih_determine_boot_device(ioc, sas_device, 0);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -1208,12 +1286,15 @@ _scsih_change_queue_depth(struct scsi_device *sdev, int qdepth)
goto not_sata;
if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))
goto not_sata;
+
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- sas_device_priv_data->sas_target->sas_address);
- if (sas_device && sas_device->device_info &
- MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
- max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
+ sas_device = __mpt2sas_get_sdev_from_target(ioc, sas_target_priv_data);
+ if (sas_device) {
+ if (sas_device->device_info & MPI2_SAS_DEVICE_INFO_SATA_DEVICE)
+ max_depth = MPT2SAS_SATA_QUEUE_DEPTH;
+
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

not_sata:
@@ -1271,18 +1352,20 @@ _scsih_target_alloc(struct scsi_target *starget)
/* sas/sata devices */
spin_lock_irqsave(&ioc->sas_device_lock, flags);
rphy = dev_to_rphy(starget->dev.parent);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
rphy->identify.sas_address);

if (sas_device) {
sas_target_priv_data->handle = sas_device->handle;
sas_target_priv_data->sas_address = sas_device->sas_address;
+ sas_target_priv_data->sdev = sas_device;
sas_device->starget = starget;
sas_device->id = starget->id;
sas_device->channel = starget->channel;
if (test_bit(sas_device->handle, ioc->pd_handles))
sas_target_priv_data->flags |=
MPT_TARGET_FLAGS_RAID_COMPONENT;
+
}
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

@@ -1324,13 +1407,21 @@ _scsih_target_destroy(struct scsi_target *starget)

spin_lock_irqsave(&ioc->sas_device_lock, flags);
rphy = dev_to_rphy(starget->dev.parent);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- rphy->identify.sas_address);
+ sas_device = __mpt2sas_get_sdev_from_target(ioc, sas_target_priv_data);
if (sas_device && (sas_device->starget == starget) &&
(sas_device->id == starget->id) &&
(sas_device->channel == starget->channel))
sas_device->starget = NULL;

+ if (sas_device) {
+ /*
+ * Corresponding get() is in _scsih_target_alloc()
+ */
+ sas_target_priv_data->sdev = NULL;
+ sas_device_put(sas_device);
+
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

out:
@@ -1386,7 +1477,7 @@ _scsih_slave_alloc(struct scsi_device *sdev)

if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
sas_target_priv_data->sas_address);
if (sas_device && (sas_device->starget == NULL)) {
sdev_printk(KERN_INFO, sdev,
@@ -1394,6 +1485,10 @@ _scsih_slave_alloc(struct scsi_device *sdev)
__func__, __LINE__);
sas_device->starget = starget;
}
+
+ if (sas_device)
+ sas_device_put(sas_device);
+
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}

@@ -1428,10 +1523,13 @@ _scsih_slave_destroy(struct scsi_device *sdev)

if (!(sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)) {
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- sas_target_priv_data->sas_address);
+ sas_device = __mpt2sas_get_sdev_from_target(ioc,
+ sas_target_priv_data);
if (sas_device && !sas_target_priv_data->num_luns)
sas_device->starget = NULL;
+
+ if (sas_device)
+ sas_device_put(sas_device);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}

@@ -2078,7 +2176,7 @@ _scsih_slave_configure(struct scsi_device *sdev)
}

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
sas_device_priv_data->sas_target->sas_address);
if (!sas_device) {
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
@@ -2112,17 +2210,18 @@ _scsih_slave_configure(struct scsi_device *sdev)
(unsigned long long) sas_device->enclosure_logical_id,
sas_device->slot);

+ sas_device_put(sas_device);
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
if (!ssp_target)
_scsih_display_sata_capabilities(ioc, handle, sdev);

-
_scsih_change_queue_depth(sdev, qdepth);

if (ssp_target) {
sas_read_port_mode_page(sdev);
_scsih_enable_tlr(ioc, sdev);
}
+
return 0;
}

@@ -2509,8 +2608,7 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
device_str, (unsigned long long)priv_target->sas_address);
} else {
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- priv_target->sas_address);
+ sas_device = __mpt2sas_get_sdev_from_target(ioc, priv_target);
if (sas_device) {
if (priv_target->flags &
MPT_TARGET_FLAGS_RAID_COMPONENT) {
@@ -2529,6 +2627,8 @@ _scsih_tm_display_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd)
"enclosure_logical_id(0x%016llx), slot(%d)\n",
(unsigned long long)sas_device->enclosure_logical_id,
sas_device->slot);
+
+ sas_device_put(sas_device);
}
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}
@@ -2604,12 +2704,12 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
{
struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
struct MPT2SAS_DEVICE *sas_device_priv_data;
- struct _sas_device *sas_device;
- unsigned long flags;
+ struct _sas_device *sas_device = NULL;
u16 handle;
int r;

struct scsi_target *starget = scmd->device->sdev_target;
+ struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;

starget_printk(KERN_INFO, starget, "attempting device reset! "
"scmd(%p)\n", scmd);
@@ -2629,12 +2729,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
handle = 0;
if (sas_device_priv_data->sas_target->flags &
MPT_TARGET_FLAGS_RAID_COMPONENT) {
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc,
- sas_device_priv_data->sas_target->handle);
+ sas_device = mpt2sas_get_sdev_from_target(ioc,
+ target_priv_data);
if (sas_device)
handle = sas_device->volume_handle;
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
} else
handle = sas_device_priv_data->sas_target->handle;

@@ -2651,6 +2749,10 @@ _scsih_dev_reset(struct scsi_cmnd *scmd)
out:
sdev_printk(KERN_INFO, scmd->device, "device reset: %s scmd(%p)\n",
((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
+
+ if (sas_device)
+ sas_device_put(sas_device);
+
return r;
}

@@ -2665,11 +2767,11 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
{
struct MPT2SAS_ADAPTER *ioc = shost_priv(scmd->device->host);
struct MPT2SAS_DEVICE *sas_device_priv_data;
- struct _sas_device *sas_device;
- unsigned long flags;
+ struct _sas_device *sas_device = NULL;
u16 handle;
int r;
struct scsi_target *starget = scmd->device->sdev_target;
+ struct MPT2SAS_TARGET *target_priv_data = starget->hostdata;

starget_printk(KERN_INFO, starget, "attempting target reset! "
"scmd(%p)\n", scmd);
@@ -2689,12 +2791,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
handle = 0;
if (sas_device_priv_data->sas_target->flags &
MPT_TARGET_FLAGS_RAID_COMPONENT) {
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc,
- sas_device_priv_data->sas_target->handle);
+ sas_device = mpt2sas_get_sdev_from_target(ioc,
+ target_priv_data);
if (sas_device)
handle = sas_device->volume_handle;
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
} else
handle = sas_device_priv_data->sas_target->handle;

@@ -2711,6 +2811,10 @@ _scsih_target_reset(struct scsi_cmnd *scmd)
out:
starget_printk(KERN_INFO, starget, "target reset: %s scmd(%p)\n",
((r == SUCCESS) ? "SUCCESS" : "FAILED"), scmd);
+
+ if (sas_device)
+ sas_device_put(sas_device);
+
return r;
}

@@ -3002,15 +3106,15 @@ _scsih_block_io_to_children_attached_to_ex(struct MPT2SAS_ADAPTER *ioc,

list_for_each_entry(mpt2sas_port,
&sas_expander->sas_port_list, port_list) {
- if (mpt2sas_port->remote_identify.device_type ==
- SAS_END_DEVICE) {
+ if (mpt2sas_port->remote_identify.device_type == SAS_END_DEVICE) {
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device =
- mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- mpt2sas_port->remote_identify.sas_address);
- if (sas_device)
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
+ mpt2sas_port->remote_identify.sas_address);
+ if (sas_device) {
set_bit(sas_device->handle,
- ioc->blocking_handles);
+ ioc->blocking_handles);
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}
}
@@ -3080,7 +3184,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
{
Mpi2SCSITaskManagementRequest_t *mpi_request;
u16 smid;
- struct _sas_device *sas_device;
+ struct _sas_device *sas_device = NULL;
struct MPT2SAS_TARGET *sas_target_priv_data = NULL;
u64 sas_address = 0;
unsigned long flags;
@@ -3110,7 +3214,7 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
return;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
if (sas_device && sas_device->starget &&
sas_device->starget->hostdata) {
sas_target_priv_data = sas_device->starget->hostdata;
@@ -3131,14 +3235,14 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
if (!smid) {
delayed_tr = kzalloc(sizeof(*delayed_tr), GFP_ATOMIC);
if (!delayed_tr)
- return;
+ goto out;
INIT_LIST_HEAD(&delayed_tr->list);
delayed_tr->handle = handle;
list_add_tail(&delayed_tr->list, &ioc->delayed_tr_list);
dewtprintk(ioc, printk(MPT2SAS_INFO_FMT
"DELAYED:tr:handle(0x%04x), (open)\n",
ioc->name, handle));
- return;
+ goto out;
}

dewtprintk(ioc, printk(MPT2SAS_INFO_FMT "tr_send:handle(0x%04x), "
@@ -3150,6 +3254,9 @@ _scsih_tm_tr_send(struct MPT2SAS_ADAPTER *ioc, u16 handle)
mpi_request->DevHandle = cpu_to_le16(handle);
mpi_request->TaskType = MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET;
mpt2sas_base_put_smid_hi_priority(ioc, smid);
+out:
+ if (sas_device)
+ sas_device_put(sas_device);
}


@@ -4068,7 +4175,6 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
char *desc_scsi_state = ioc->tmp_string;
u32 log_info = le32_to_cpu(mpi_reply->IOCLogInfo);
struct _sas_device *sas_device = NULL;
- unsigned long flags;
struct scsi_target *starget = scmd->device->sdev_target;
struct MPT2SAS_TARGET *priv_target = starget->hostdata;
char *device_str = NULL;
@@ -4200,9 +4306,7 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
printk(MPT2SAS_WARN_FMT "\t%s wwid(0x%016llx)\n", ioc->name,
device_str, (unsigned long long)priv_target->sas_address);
} else {
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- priv_target->sas_address);
+ sas_device = mpt2sas_get_sdev_from_target(ioc, priv_target);
if (sas_device) {
printk(MPT2SAS_WARN_FMT "\tsas_address(0x%016llx), "
"phy(%d)\n", ioc->name, sas_device->sas_address,
@@ -4211,8 +4315,9 @@ _scsih_scsi_ioc_info(struct MPT2SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
"\tenclosure_logical_id(0x%016llx), slot(%d)\n",
ioc->name, sas_device->enclosure_logical_id,
sas_device->slot);
+
+ sas_device_put(sas_device);
}
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
}

printk(MPT2SAS_WARN_FMT "\thandle(0x%04x), ioc_status(%s)(0x%04x), "
@@ -4259,7 +4364,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
Mpi2SepRequest_t mpi_request;
struct _sas_device *sas_device;

- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
if (!sas_device)
return;

@@ -4274,7 +4379,7 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
&mpi_request)) != 0) {
printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n", ioc->name,
__FILE__, __LINE__, __func__);
- return;
+ goto out;
}
sas_device->pfa_led_on = 1;

@@ -4284,8 +4389,10 @@ _scsih_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
"enclosure_processor: ioc_status (0x%04x), loginfo(0x%08x)\n",
ioc->name, le16_to_cpu(mpi_reply.IOCStatus),
le32_to_cpu(mpi_reply.IOCLogInfo)));
- return;
+ goto out;
}
+out:
+ sas_device_put(sas_device);
}

/**
@@ -4370,19 +4477,17 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)

/* only handle non-raid devices */
spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
if (!sas_device) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
+ goto out_unlock;
}
starget = sas_device->starget;
sas_target_priv_data = starget->hostdata;

if ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_RAID_COMPONENT) ||
- ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME))) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
- }
+ ((sas_target_priv_data->flags & MPT_TARGET_FLAGS_VOLUME)))
+ goto out_unlock;
+
starget_printk(KERN_WARNING, starget, "predicted fault\n");
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

@@ -4396,7 +4501,7 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
if (!event_reply) {
printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
ioc->name, __FILE__, __LINE__, __func__);
- return;
+ goto out;
}

event_reply->Function = MPI2_FUNCTION_EVENT_NOTIFICATION;
@@ -4413,6 +4518,14 @@ _scsih_smart_predicted_fault(struct MPT2SAS_ADAPTER *ioc, u16 handle)
event_data->SASAddress = cpu_to_le64(sas_target_priv_data->sas_address);
mpt2sas_ctl_add_to_event_log(ioc, event_reply);
kfree(event_reply);
+out:
+ if (sas_device)
+ sas_device_put(sas_device);
+ return;
+
+out_unlock:
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+ goto out;
}

/**
@@ -5148,14 +5261,13 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)

spin_lock_irqsave(&ioc->sas_device_lock, flags);
sas_address = le64_to_cpu(sas_device_pg0.SASAddress);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
sas_address);

if (!sas_device) {
printk(MPT2SAS_ERR_FMT "device is not present "
"handle(0x%04x), no sas_device!!!\n", ioc->name, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
+ goto out_unlock;
}

if (unlikely(sas_device->handle != handle)) {
@@ -5172,19 +5284,24 @@ _scsih_check_device(struct MPT2SAS_ADAPTER *ioc, u16 handle)
MPI2_SAS_DEVICE0_FLAGS_DEVICE_PRESENT)) {
printk(MPT2SAS_ERR_FMT "device is not present "
"handle(0x%04x), flags!!!\n", ioc->name, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
+ goto out_unlock;
}

/* check if there were any issues with discovery */
if (_scsih_check_access_status(ioc, sas_address, handle,
- sas_device_pg0.AccessStatus)) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
- }
+ sas_device_pg0.AccessStatus))
+ goto out_unlock;
+
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
_scsih_ublock_io_device(ioc, sas_address);
+ if (sas_device)
+ sas_device_put(sas_device);
+ return;

+out_unlock:
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+ if (sas_device)
+ sas_device_put(sas_device);
}

/**
@@ -5208,7 +5325,6 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
u32 ioc_status;
__le64 sas_address;
u32 device_info;
- unsigned long flags;

if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
@@ -5250,14 +5366,13 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
return -1;
}

-
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_get_sdev_by_addr(ioc,
sas_address);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);

- if (sas_device)
+ if (sas_device) {
+ sas_device_put(sas_device);
return 0;
+ }

sas_device = kzalloc(sizeof(struct _sas_device),
GFP_KERNEL);
@@ -5267,6 +5382,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
return -1;
}

+ kref_init(&sas_device->refcount);
sas_device->handle = handle;
if (_scsih_get_sas_address(ioc, le16_to_cpu
(sas_device_pg0.ParentDevHandle),
@@ -5296,6 +5412,7 @@ _scsih_add_device(struct MPT2SAS_ADAPTER *ioc, u16 handle, u8 phy_num, u8 is_pd)
else
_scsih_sas_device_add(ioc, sas_device);

+ sas_device_put(sas_device);
return 0;
}

@@ -5344,7 +5461,6 @@ _scsih_remove_device(struct MPT2SAS_ADAPTER *ioc,
"handle(0x%04x), sas_addr(0x%016llx)\n", ioc->name, __func__,
sas_device->handle, (unsigned long long)
sas_device->sas_address));
- kfree(sas_device);
}
/**
* _scsih_device_remove_by_handle - removing device object by handle
@@ -5363,12 +5479,17 @@ _scsih_device_remove_by_handle(struct MPT2SAS_ADAPTER *ioc, u16 handle)
return;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
- if (sas_device)
- list_del(&sas_device->list);
+ sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
+ if (sas_device) {
+ list_del_init(&sas_device->list);
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+
+ if (sas_device) {
_scsih_remove_device(ioc, sas_device);
+ sas_device_put(sas_device);
+ }
}

/**
@@ -5389,13 +5510,17 @@ mpt2sas_device_remove_by_sas_address(struct MPT2SAS_ADAPTER *ioc,
return;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
- sas_address);
- if (sas_device)
- list_del(&sas_device->list);
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc, sas_address);
+ if (sas_device) {
+ list_del_init(&sas_device->list);
+ sas_device_put(sas_device);
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+
+ if (sas_device) {
_scsih_remove_device(ioc, sas_device);
+ sas_device_put(sas_device);
+ }
}
#ifdef CONFIG_SCSI_MPT2SAS_LOGGING
/**
@@ -5716,26 +5841,28 @@ _scsih_sas_device_status_change_event(struct MPT2SAS_ADAPTER *ioc,

spin_lock_irqsave(&ioc->sas_device_lock, flags);
sas_address = le64_to_cpu(event_data->SASAddress);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
sas_address);

- if (!sas_device || !sas_device->starget) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
- }
+ if (!sas_device || !sas_device->starget)
+ goto out;

target_priv_data = sas_device->starget->hostdata;
- if (!target_priv_data) {
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- return;
- }
+ if (!target_priv_data)
+ goto out;

if (event_data->ReasonCode ==
MPI2_EVENT_SAS_DEV_STAT_RC_INTERNAL_DEVICE_RESET)
target_priv_data->tm_busy = 1;
else
target_priv_data->tm_busy = 0;
+
+out:
+ if (sas_device)
+ sas_device_put(sas_device);
+
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
}

#ifdef CONFIG_SCSI_MPT2SAS_LOGGING
@@ -6123,7 +6250,7 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
u16 handle = le16_to_cpu(element->PhysDiskDevHandle);

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
if (sas_device) {
sas_device->volume_handle = 0;
sas_device->volume_wwid = 0;
@@ -6142,6 +6269,8 @@ _scsih_sas_pd_expose(struct MPT2SAS_ADAPTER *ioc,
/* exposing raid component */
if (starget)
starget_for_each_device(starget, NULL, _scsih_reprobe_lun);
+
+ sas_device_put(sas_device);
}

/**
@@ -6170,7 +6299,7 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
&volume_wwid);

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
+ sas_device = __mpt2sas_get_sdev_by_handle(ioc, handle);
if (sas_device) {
set_bit(handle, ioc->pd_handles);
if (sas_device->starget && sas_device->starget->hostdata) {
@@ -6189,6 +6318,8 @@ _scsih_sas_pd_hide(struct MPT2SAS_ADAPTER *ioc,
/* hiding raid component */
if (starget)
starget_for_each_device(starget, (void *)1, _scsih_reprobe_lun);
+
+ sas_device_put(sas_device);
}

/**
@@ -6221,7 +6352,6 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,
Mpi2EventIrConfigElement_t *element)
{
struct _sas_device *sas_device;
- unsigned long flags;
u16 handle = le16_to_cpu(element->PhysDiskDevHandle);
Mpi2ConfigReply_t mpi_reply;
Mpi2SasDevicePage0_t sas_device_pg0;
@@ -6231,11 +6361,11 @@ _scsih_sas_pd_add(struct MPT2SAS_ADAPTER *ioc,

set_bit(handle, ioc->pd_handles);

- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+ sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
+ if (sas_device) {
+ sas_device_put(sas_device);
return;
+ }

if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply, &sas_device_pg0,
MPI2_SAS_DEVICE_PGAD_FORM_HANDLE, handle))) {
@@ -6509,7 +6639,6 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
u16 handle, parent_handle;
u32 state;
struct _sas_device *sas_device;
- unsigned long flags;
Mpi2ConfigReply_t mpi_reply;
Mpi2SasDevicePage0_t sas_device_pg0;
u32 ioc_status;
@@ -6542,12 +6671,11 @@ _scsih_sas_ir_physical_disk_event(struct MPT2SAS_ADAPTER *ioc,
if (!ioc->is_warpdrive)
set_bit(handle, ioc->pd_handles);

- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
-
- if (sas_device)
+ sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
+ if (sas_device) {
+ sas_device_put(sas_device);
return;
+ }

if ((mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
&sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
@@ -7015,6 +7143,7 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
struct _raid_device *raid_device, *raid_device_next;
struct list_head tmp_list;
unsigned long flags;
+ LIST_HEAD(head);

printk(MPT2SAS_INFO_FMT "removing unresponding devices: start\n",
ioc->name);
@@ -7022,14 +7151,29 @@ _scsih_remove_unresponding_sas_devices(struct MPT2SAS_ADAPTER *ioc)
/* removing unresponding end devices */
printk(MPT2SAS_INFO_FMT "removing unresponding devices: end-devices\n",
ioc->name);
+
+ /*
+ * Iterate, pulling off devices marked as non-responding. We become the
+ * owner for the reference the list had on any object we prune.
+ */
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
list_for_each_entry_safe(sas_device, sas_device_next,
- &ioc->sas_device_list, list) {
+ &ioc->sas_device_list, list) {
if (!sas_device->responding)
- mpt2sas_device_remove_by_sas_address(ioc,
- sas_device->sas_address);
+ list_move_tail(&sas_device->list, &head);
else
sas_device->responding = 0;
}
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+ /*
+ * Now, uninitialize and remove the unresponding devices we pruned.
+ */
+ list_for_each_entry_safe(sas_device, sas_device_next, &head, list) {
+ _scsih_remove_device(ioc, sas_device);
+ list_del_init(&sas_device->list);
+ sas_device_put(sas_device);
+ }

/* removing unresponding volumes */
if (ioc->ir_firmware) {
@@ -7179,11 +7323,11 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
}
phys_disk_num = pd_pg0.PhysDiskNum;
handle = le16_to_cpu(pd_pg0.DevHandle);
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = _scsih_sas_device_find_by_handle(ioc, handle);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+ sas_device = mpt2sas_get_sdev_by_handle(ioc, handle);
+ if (sas_device) {
+ sas_device_put(sas_device);
continue;
+ }
if (mpt2sas_config_get_sas_device_pg0(ioc, &mpi_reply,
&sas_device_pg0, MPI2_SAS_DEVICE_PGAD_FORM_HANDLE,
handle) != 0)
@@ -7302,12 +7446,12 @@ _scsih_scan_for_devices_after_reset(struct MPT2SAS_ADAPTER *ioc)
if (!(_scsih_is_end_device(
le32_to_cpu(sas_device_pg0.DeviceInfo))))
continue;
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = mpt2sas_get_sdev_by_addr(ioc,
le64_to_cpu(sas_device_pg0.SASAddress));
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
- if (sas_device)
+ if (sas_device) {
+ sas_device_put(sas_device);
continue;
+ }
parent_handle = le16_to_cpu(sas_device_pg0.ParentDevHandle);
if (!_scsih_get_sas_address(ioc, parent_handle, &sas_address)) {
printk(MPT2SAS_INFO_FMT "\tBEFORE adding end device: "
@@ -7966,6 +8110,48 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
}
}

+static struct _sas_device *get_next_sas_device(struct MPT2SAS_ADAPTER *ioc)
+{
+ struct _sas_device *sas_device = NULL;
+ unsigned long flags;
+
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+ if (!list_empty(&ioc->sas_device_init_list)) {
+ sas_device = list_first_entry(&ioc->sas_device_init_list,
+ struct _sas_device, list);
+ sas_device_get(sas_device);
+ }
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+ return sas_device;
+}
+
+static void sas_device_make_active(struct MPT2SAS_ADAPTER *ioc,
+ struct _sas_device *sas_device)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&ioc->sas_device_lock, flags);
+
+ /*
+ * Since we dropped the lock during the call to port_add(), we need to
+ * be careful here that somebody else didn't move or delete this item
+ * while we were busy with other things.
+ *
+ * If it was on the list, we need a put() for the reference the list
+ * had. Either way, we need a get() for the destination list.
+ */
+ if (!list_empty(&sas_device->list)) {
+ list_del_init(&sas_device->list);
+ sas_device_put(sas_device);
+ }
+
+ sas_device_get(sas_device);
+ list_add_tail(&sas_device->list, &ioc->sas_device_list);
+
+ spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+}
+
/**
* _scsih_probe_sas - reporting sas devices to sas transport
* @ioc: per adapter object
@@ -7975,34 +8161,30 @@ _scsih_probe_raid(struct MPT2SAS_ADAPTER *ioc)
static void
_scsih_probe_sas(struct MPT2SAS_ADAPTER *ioc)
{
- struct _sas_device *sas_device, *next;
- unsigned long flags;
-
- /* SAS Device List */
- list_for_each_entry_safe(sas_device, next, &ioc->sas_device_init_list,
- list) {
+ struct _sas_device *sas_device;

- if (ioc->hide_drives)
- continue;
+ if (ioc->hide_drives)
+ return;

+ while ((sas_device = get_next_sas_device(ioc))) {
if (!mpt2sas_transport_port_add(ioc, sas_device->handle,
- sas_device->sas_address_parent)) {
- list_del(&sas_device->list);
- kfree(sas_device);
+ sas_device->sas_address_parent)) {
+ _scsih_sas_device_remove(ioc, sas_device);
+ sas_device_put(sas_device);
continue;
} else if (!sas_device->starget) {
if (!ioc->is_driver_loading) {
mpt2sas_transport_port_remove(ioc,
- sas_device->sas_address,
- sas_device->sas_address_parent);
- list_del(&sas_device->list);
- kfree(sas_device);
+ sas_device->sas_address,
+ sas_device->sas_address_parent);
+ _scsih_sas_device_remove(ioc, sas_device);
+ sas_device_put(sas_device);
continue;
}
}
- spin_lock_irqsave(&ioc->sas_device_lock, flags);
- list_move_tail(&sas_device->list, &ioc->sas_device_list);
- spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
+
+ sas_device_make_active(ioc, sas_device);
+ sas_device_put(sas_device);
}
}

diff --git a/drivers/scsi/mpt2sas/mpt2sas_transport.c b/drivers/scsi/mpt2sas/mpt2sas_transport.c
index ff2500a..af86800 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_transport.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_transport.c
@@ -1323,15 +1323,17 @@ _transport_get_enclosure_identifier(struct sas_rphy *rphy, u64 *identifier)
int rc;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
rphy->identify.sas_address);
if (sas_device) {
*identifier = sas_device->enclosure_logical_id;
rc = 0;
+ sas_device_put(sas_device);
} else {
*identifier = 0;
rc = -ENXIO;
}
+
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
return rc;
}
@@ -1351,12 +1353,14 @@ _transport_get_bay_identifier(struct sas_rphy *rphy)
int rc;

spin_lock_irqsave(&ioc->sas_device_lock, flags);
- sas_device = mpt2sas_scsih_sas_device_find_by_sas_address(ioc,
+ sas_device = __mpt2sas_get_sdev_by_addr(ioc,
rphy->identify.sas_address);
- if (sas_device)
+ if (sas_device) {
rc = sas_device->slot;
- else
+ sas_device_put(sas_device);
+ } else {
rc = -ENXIO;
+ }
spin_unlock_irqrestore(&ioc->sas_device_lock, flags);
return rc;
}
--
2.5.0
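
The reworked _scsih_probe_sas() loop in the patch above can be sketched outside the kernel. What follows is only a userspace approximation, not the driver code: a pthread mutex stands in for ioc->sas_device_lock, a C11 atomic counter stands in for the kref, and every name (struct dev, dev_settle(), probe_all(), fail_handle) is hypothetical. It shows the three ideas the patch relies on: take a reference on the list head while holding the lock, do the slow work with the lock dropped, and re-check list membership before moving the object.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal circular list, standing in for the kernel's struct list_head. */
struct node { struct node *prev, *next; };
#define entry_of(p, type, member) ((type *)((char *)(p) - offsetof(type, member)))

static int node_is_empty(const struct node *n) { return n->next == n; }
static void node_del_init(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}
static void node_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Hypothetical device object; refcount plays the role of the kref. */
struct dev { struct node list; atomic_int refcount; int handle; };

static pthread_mutex_t dev_lock = PTHREAD_MUTEX_INITIALIZER;
static struct node init_list = { &init_list, &init_list };
static struct node active_list = { &active_list, &active_list };
static atomic_int devs_freed;

static void dev_get(struct dev *d) { atomic_fetch_add(&d->refcount, 1); }
static void dev_put(struct dev *d)
{
	if (atomic_fetch_sub(&d->refcount, 1) == 1) {	/* last reference gone */
		atomic_fetch_add(&devs_freed, 1);
		free(d);
	}
}

/* Allocate a device; its initial reference becomes the init list's. */
static void dev_add(int handle)
{
	struct dev *d = calloc(1, sizeof(*d));
	assert(d != NULL);
	atomic_init(&d->refcount, 1);
	d->handle = handle;
	pthread_mutex_lock(&dev_lock);
	node_add_tail(&d->list, &init_list);
	pthread_mutex_unlock(&dev_lock);
}

/* Return the head of init_list with an extra reference, or NULL if empty. */
static struct dev *get_next_dev(void)
{
	struct dev *d = NULL;
	pthread_mutex_lock(&dev_lock);
	if (!node_is_empty(&init_list)) {
		d = entry_of(init_list.next, struct dev, list);
		dev_get(d);
	}
	pthread_mutex_unlock(&dev_lock);
	return d;
}

/* Under the lock: unlink d if it is still linked (dropping that list's
 * reference), then optionally give active_list its own reference. */
static void dev_settle(struct dev *d, int activate)
{
	pthread_mutex_lock(&dev_lock);
	if (!node_is_empty(&d->list)) {	/* someone may have removed it already */
		node_del_init(&d->list);
		dev_put(d);
	}
	if (activate) {
		dev_get(d);
		node_add_tail(&d->list, &active_list);
	}
	pthread_mutex_unlock(&dev_lock);
}

/* Probe loop: returns how many devices ended up on active_list. */
static int probe_all(int fail_handle)
{
	struct dev *d;
	int moved = 0;
	while ((d = get_next_dev())) {
		int ok = d->handle != fail_handle;	/* stand-in for port_add() */
		dev_settle(d, ok);
		moved += ok;
		dev_put(d);	/* drop the loop's own reference */
	}
	return moved;
}
```

Because the loop always holds its own reference while the lock is dropped, a concurrent removal can unlink the object but cannot free it until the loop's final put.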

2015-08-14 01:48:36

by Calvin Owens

Subject: [PATCH v4 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage

The fw_event_work struct is concurrently referenced at shutdown, so
add a refcount to protect it, and refactor the code to use it.

Additionally, refactor _scsih_fw_event_cleanup_queue() such that it
no longer iterates over the list without holding the lock, since
_firmware_event_work() concurrently deletes items from the list.

Cc: Christoph Hellwig <[email protected]>
Signed-off-by: Calvin Owens <[email protected]>
---
Changes in v4: None

Changes in v3:
* Add a break condition to the REMOVE_UNRESPONDING_DEVICES fw_event,
which can loop over a sleep forever (5m+ at least) at unloading. I
don't think anything prevented this before, but taking the fw_event
object off the list at the top of _firmware_event_work() seems to have
made it more likely to happen.

Changes in v2:
* Squished patches 4-6 into one patch
* Remove the fw_event from fw_event_list at the start of
_firmware_event_work()
* Explicitly separate fw_event_list removal from fw_event freeing

drivers/scsi/mpt2sas/mpt2sas_scsih.c | 112 ++++++++++++++++++++++++++++-------
1 file changed, 91 insertions(+), 21 deletions(-)

diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
index 5eca3a4..c0ff55b 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c
@@ -176,9 +176,37 @@ struct fw_event_work {
u8 VP_ID;
u8 ignore;
u16 event;
+ struct kref refcount;
char event_data[0] __aligned(4);
};

+static void fw_event_work_free(struct kref *r)
+{
+ kfree(container_of(r, struct fw_event_work, refcount));
+}
+
+static void fw_event_work_get(struct fw_event_work *fw_work)
+{
+ kref_get(&fw_work->refcount);
+}
+
+static void fw_event_work_put(struct fw_event_work *fw_work)
+{
+ kref_put(&fw_work->refcount, fw_event_work_free);
+}
+
+static struct fw_event_work *alloc_fw_event_work(int len)
+{
+ struct fw_event_work *fw_event;
+
+ fw_event = kzalloc(sizeof(*fw_event) + len, GFP_ATOMIC);
+ if (!fw_event)
+ return NULL;
+
+ kref_init(&fw_event->refcount);
+ return fw_event;
+}
+
/* raid transport support */
static struct raid_template *mpt2sas_raid_template;

@@ -2872,36 +2900,39 @@ _scsih_fw_event_add(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work *fw_event)
return;

spin_lock_irqsave(&ioc->fw_event_lock, flags);
+ fw_event_work_get(fw_event);
list_add_tail(&fw_event->list, &ioc->fw_event_list);
INIT_DELAYED_WORK(&fw_event->delayed_work, _firmware_event_work);
+ fw_event_work_get(fw_event);
queue_delayed_work(ioc->firmware_event_thread,
&fw_event->delayed_work, 0);
spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
}

/**
- * _scsih_fw_event_free - delete fw_event
+ * _scsih_fw_event_del_from_list - delete fw_event from the list
* @ioc: per adapter object
* @fw_event: object describing the event
* Context: This function will acquire ioc->fw_event_lock.
*
- * This removes firmware event object from link list, frees associated memory.
+ * If the fw_event is on the fw_event_list, remove it and do a put.
*
* Return nothing.
*/
static void
-_scsih_fw_event_free(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work
+_scsih_fw_event_del_from_list(struct MPT2SAS_ADAPTER *ioc, struct fw_event_work
*fw_event)
{
unsigned long flags;

spin_lock_irqsave(&ioc->fw_event_lock, flags);
- list_del(&fw_event->list);
- kfree(fw_event);
+ if (!list_empty(&fw_event->list)) {
+ list_del_init(&fw_event->list);
+ fw_event_work_put(fw_event);
+ }
spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
}

-
/**
* _scsih_error_recovery_delete_devices - remove devices not responding
* @ioc: per adapter object
@@ -2916,13 +2947,14 @@ _scsih_error_recovery_delete_devices(struct MPT2SAS_ADAPTER *ioc)
if (ioc->is_driver_loading)
return;

- fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+ fw_event = alloc_fw_event_work(0);
if (!fw_event)
return;

fw_event->event = MPT2SAS_REMOVE_UNRESPONDING_DEVICES;
fw_event->ioc = ioc;
_scsih_fw_event_add(ioc, fw_event);
+ fw_event_work_put(fw_event);
}

/**
@@ -2936,12 +2968,29 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
{
struct fw_event_work *fw_event;

- fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+ fw_event = alloc_fw_event_work(0);
if (!fw_event)
return;
fw_event->event = MPT2SAS_PORT_ENABLE_COMPLETE;
fw_event->ioc = ioc;
_scsih_fw_event_add(ioc, fw_event);
+ fw_event_work_put(fw_event);
+}
+
+static struct fw_event_work *dequeue_next_fw_event(struct MPT2SAS_ADAPTER *ioc)
+{
+ unsigned long flags;
+ struct fw_event_work *fw_event = NULL;
+
+ spin_lock_irqsave(&ioc->fw_event_lock, flags);
+ if (!list_empty(&ioc->fw_event_list)) {
+ fw_event = list_first_entry(&ioc->fw_event_list,
+ struct fw_event_work, list);
+ list_del_init(&fw_event->list);
+ }
+ spin_unlock_irqrestore(&ioc->fw_event_lock, flags);
+
+ return fw_event;
}

/**
@@ -2956,17 +3005,25 @@ mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc)
static void
_scsih_fw_event_cleanup_queue(struct MPT2SAS_ADAPTER *ioc)
{
- struct fw_event_work *fw_event, *next;
+ struct fw_event_work *fw_event;

if (list_empty(&ioc->fw_event_list) ||
!ioc->firmware_event_thread || in_interrupt())
return;

- list_for_each_entry_safe(fw_event, next, &ioc->fw_event_list, list) {
- if (cancel_delayed_work_sync(&fw_event->delayed_work)) {
- _scsih_fw_event_free(ioc, fw_event);
- continue;
- }
+ while ((fw_event = dequeue_next_fw_event(ioc))) {
+ /*
+ * Wait on the fw_event to complete. If this returns 1, then
+ * the event was never executed, and we need a put for the
+ * reference the delayed_work had on the fw_event.
+ *
+ * If it did execute, we wait for it to finish, and the put will
+ * happen from _firmware_event_work()
+ */
+ if (cancel_delayed_work_sync(&fw_event->delayed_work))
+ fw_event_work_put(fw_event);
+
+ fw_event_work_put(fw_event);
}
}

@@ -4447,13 +4504,14 @@ _scsih_send_event_to_turn_on_pfa_led(struct MPT2SAS_ADAPTER *ioc, u16 handle)
{
struct fw_event_work *fw_event;

- fw_event = kzalloc(sizeof(struct fw_event_work), GFP_ATOMIC);
+ fw_event = alloc_fw_event_work(0);
if (!fw_event)
return;
fw_event->event = MPT2SAS_TURN_ON_PFA_LED;
fw_event->device_handle = handle;
fw_event->ioc = ioc;
_scsih_fw_event_add(ioc, fw_event);
+ fw_event_work_put(fw_event);
}

/**
@@ -7554,17 +7612,27 @@ _firmware_event_work(struct work_struct *work)
struct fw_event_work, delayed_work.work);
struct MPT2SAS_ADAPTER *ioc = fw_event->ioc;

+ _scsih_fw_event_del_from_list(ioc, fw_event);
+
/* the queue is being flushed so ignore this event */
- if (ioc->remove_host ||
- ioc->pci_error_recovery) {
- _scsih_fw_event_free(ioc, fw_event);
+ if (ioc->remove_host || ioc->pci_error_recovery) {
+ fw_event_work_put(fw_event);
return;
}

switch (fw_event->event) {
case MPT2SAS_REMOVE_UNRESPONDING_DEVICES:
- while (scsi_host_in_recovery(ioc->shost) || ioc->shost_recovery)
+ while (scsi_host_in_recovery(ioc->shost) ||
+ ioc->shost_recovery) {
+ /*
+ * If we're unloading, bail. Otherwise, this can become
+ * an infinite loop.
+ */
+ if (ioc->remove_host)
+ goto out;
+
ssleep(1);
+ }
_scsih_remove_unresponding_sas_devices(ioc);
_scsih_scan_for_devices_after_reset(ioc);
break;
@@ -7613,7 +7681,8 @@ _firmware_event_work(struct work_struct *work)
_scsih_sas_ir_operation_status_event(ioc, fw_event);
break;
}
- _scsih_fw_event_free(ioc, fw_event);
+out:
+ fw_event_work_put(fw_event);
}

/**
@@ -7751,7 +7820,7 @@ mpt2sas_scsih_event_callback(struct MPT2SAS_ADAPTER *ioc, u8 msix_index,
}

sz = le16_to_cpu(mpi_reply->EventDataLength) * 4;
- fw_event = kzalloc(sizeof(*fw_event) + sz, GFP_ATOMIC);
+ fw_event = alloc_fw_event_work(sz);
if (!fw_event) {
printk(MPT2SAS_ERR_FMT "failure at %s:%d/%s()!\n",
ioc->name, __FILE__, __LINE__, __func__);
@@ -7764,6 +7833,7 @@ mpt2sas_scsih_event_callback(struct MPT2SAS_ADAPTER *ioc, u8 msix_index,
fw_event->VP_ID = mpi_reply->VP_ID;
fw_event->event = event;
_scsih_fw_event_add(ioc, fw_event);
+ fw_event_work_put(fw_event);
return;
}

--
2.5.0
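
The reference accounting in this patch is easier to follow with the locking stripped away. The sketch below is a single-threaded userspace model under stated assumptions, not the driver code: all names (struct event, event_submit(), event_cancel(), live_events) are hypothetical, the work_pending flag merely imitates what cancel_delayed_work_sync() reports, and every queue operation here would happen under ioc->fw_event_lock in the real driver. It tracks the three references an event has at submission (creator, list, scheduled work) and shows that both the normal path and the cleanup path release exactly what they own.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical event; locking is omitted to keep the accounting visible. */
struct event {
	struct event *next;
	int on_list;		/* still linked on the pending queue? */
	int work_pending;	/* scheduled for the worker but not yet run? */
	int refcount;
};

static struct event *queue;	/* stand-in for fw_event_list */
static int live_events;		/* allocated and not yet freed */

static void event_get(struct event *e) { e->refcount++; }
static void event_put(struct event *e)
{
	assert(e->refcount > 0);
	if (--e->refcount == 0) {
		live_events--;
		free(e);
	}
}

/* Unlink e if it is still queued, dropping the queue's reference. */
static void event_del_from_list(struct event *e)
{
	struct event **pp;
	if (!e->on_list)
		return;
	for (pp = &queue; *pp != e; pp = &(*pp)->next)
		;
	*pp = e->next;
	e->next = NULL;
	e->on_list = 0;
	event_put(e);
}

/* Allocate, queue and schedule an event; the pointer is returned only
 * so the demo can drive the worker by hand. */
static struct event *event_submit(void)
{
	struct event *e = calloc(1, sizeof(*e)), **pp;
	assert(e != NULL);
	e->refcount = 1;	/* creator's reference */
	live_events++;
	event_get(e);		/* reference held by the queue */
	for (pp = &queue; *pp; pp = &(*pp)->next)
		;
	*pp = e;
	e->on_list = 1;
	event_get(e);		/* reference held by the scheduled work */
	e->work_pending = 1;
	event_put(e);		/* creator is done with it */
	return e;
}

/* What the worker does when the event runs. */
static void event_work(struct event *e)
{
	e->work_pending = 0;
	event_del_from_list(e);
	/* ... handle the event ... */
	event_put(e);		/* the work's reference */
}

/* Imitates cancel_delayed_work_sync(): 1 if the work had not run yet. */
static int event_cancel(struct event *e)
{
	int was_pending = e->work_pending;
	e->work_pending = 0;
	return was_pending;
}

/* Pop the head; the queue's reference passes to the caller. */
static struct event *dequeue_next(void)
{
	struct event *e = queue;
	if (e) {
		queue = e->next;
		e->next = NULL;
		e->on_list = 0;
	}
	return e;
}

static void cleanup_queue(void)
{
	struct event *e;
	while ((e = dequeue_next())) {
		if (event_cancel(e))
			event_put(e);	/* work never ran: drop its reference */
		event_put(e);		/* reference inherited from the queue */
	}
}
```

Popping one entry at a time, instead of walking the list, is what lets the cleanup path tolerate the worker deleting entries concurrently: each pop re-reads the head under the lock rather than trusting a saved next pointer.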

2015-08-25 21:03:40

by Nicholas A. Bellinger

Subject: Re: [PATCH v4 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list usage

Hi Calvin,

On Thu, 2015-08-13 at 18:48 -0700, Calvin Owens wrote:
> These objects can be referenced concurrently throughout the driver, so we
> need a way to make sure threads can't delete them out from under each
> other. This patch adds the refcount, and refactors the code to use it.
>
> Additionally, we cannot iterate over the sas_device_list without
> holding the lock, or we risk corrupting random memory if items are
> added or deleted as we iterate. This patch refactors _scsih_probe_sas()
> to use the sas_device_list in a safe way.
>
> Cc: Christoph Hellwig <[email protected]>
> Cc: Bart Van Assche <[email protected]>
> Cc: Joe Lawrence <[email protected]>
> Signed-off-by: Calvin Owens <[email protected]>
> ---
> Changes in v4:
> * Fix lack of put() in non-SATA case in _scsih_change_queue_depth()
> * Fix lack of put() in the non-error case in _scsih_check_device()
> * Add missing put() at bottom of _scsih_add_device()
> * Add put for ->hostdata pointer in _scsih_target_destroy() for the
> get() in _scsih_target_alloc()
>
> Changes in v3:
> * Drop the sas_device_lock while enabling devices, and leave the
> sas_device object on the list, since it may need to be looked up there
> while it is being enabled.
> * Drop put() in _scsih_add_device(), because the ->hostdata now keeps a
> reference (this was an oversight in v2).
> * Be consistent about calling sas_device_put() while holding the
> sas_device_lock where feasible.
> * Take and assert_spin_locked() on the sas_device_lock from the newly
> added __get_sdev_from_target(), add wrapper similar to other lookups
> for callers which do not explicitly take the lock.
>
> Changes in v2:
> * Squished patches 1-3 into this one
> * s/BUG_ON(!spin_is_locked/assert_spin_locked/g
> * Store a pointer to the sas_device object in ->hostdata, to eliminate
> the need for several lookups on the lists.
>
> drivers/scsi/mpt2sas/mpt2sas_base.h | 22 +-
> drivers/scsi/mpt2sas/mpt2sas_scsih.c | 480 +++++++++++++++++++++----------
> drivers/scsi/mpt2sas/mpt2sas_transport.c | 12 +-
> 3 files changed, 360 insertions(+), 154 deletions(-)
>

Looks good.

Reviewed-by: Nicholas Bellinger <[email protected]>
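
One further technique from patch 1/2, the rework of _scsih_remove_unresponding_sas_devices(), can be illustrated separately: entries are moved onto a private list while the lock is held, and the slow teardown then runs on that private list with the lock released. The following is a minimal userspace sketch under stated assumptions, with a pthread mutex in place of the spinlock, a singly linked list in place of list_head, and hypothetical names throughout (struct item, prune_unresponding(), remove_pruned()). Reference counting is left out here; in the patch, the pruning code inherits the list's reference on each entry and puts it after removal.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical list entry; "responding" mirrors sas_device->responding. */
struct item {
	struct item *next;
	int responding;
	int id;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * With the lock held, detach every non-responding item onto a private
 * list and clear the flag on the survivors. Returns the private list.
 */
static struct item *prune_unresponding(struct item **head)
{
	struct item *pruned = NULL, **tail = &pruned;

	pthread_mutex_lock(&list_lock);
	while (*head) {
		struct item *it = *head;

		if (!it->responding) {
			*head = it->next;	/* unlink from the shared list */
			it->next = NULL;
			*tail = it;		/* append to the private list */
			tail = &it->next;
		} else {
			it->responding = 0;	/* re-armed for the next scan */
			head = &it->next;
		}
	}
	pthread_mutex_unlock(&list_lock);
	return pruned;
}

/*
 * Tear down the private list without the lock; nobody else can see these
 * items any more. Returns how many were removed.
 */
static int remove_pruned(struct item *pruned)
{
	int n = 0;

	while (pruned) {
		struct item *next = pruned->next;

		pruned->next = NULL;	/* slow per-device teardown would go here */
		n++;
		pruned = next;
	}
	return n;
}
```

The shared list is only ever walked with the lock held, and the work that may sleep is done on a list no other thread can reach.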

2015-08-25 21:06:18

by Nicholas A. Bellinger

Subject: Re: [PATCH v4 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage

On Thu, 2015-08-13 at 18:48 -0700, Calvin Owens wrote:
> The fw_event_work struct is concurrently referenced at shutdown, so
> add a refcount to protect it, and refactor the code to use it.
>
> Additionally, refactor _scsih_fw_event_cleanup_queue() such that it
> no longer iterates over the list without holding the lock, since
> _firmware_event_work() concurrently deletes items from the list.
>
> Cc: Christoph Hellwig <[email protected]>
> Signed-off-by: Calvin Owens <[email protected]>
> ---
> Changes in v4: None
>
> Changes in v3:
> * Add a break condition to the REMOVE_UNRESPONDING_DEVICES fw_event,
> which can loop over a sleep forever (5m+ at least) at unloading. I
> don't think anything prevented this before, but taking the fw_event
> object off the list at the top of _firmware_event_work() seems to have
> made it more likely to happen.
>
> Changes in v2:
> * Squished patches 4-6 into one patch
> * Remove the fw_event from fw_event_list at the start of
> _firmware_event_work()
> * Explicitly separate fw_event_list removal from fw_event freeing
>
> drivers/scsi/mpt2sas/mpt2sas_scsih.c | 112 ++++++++++++++++++++++++++++-------
> 1 file changed, 91 insertions(+), 21 deletions(-)
>

Looks good.

Reviewed-by: Nicholas Bellinger <[email protected]>

2015-08-25 21:21:29

by Nicholas A. Bellinger

Subject: Re: [PATCH v4 0/2] Fixes for memory corruption in mpt2sas

On Thu, 2015-08-13 at 18:48 -0700, Calvin Owens wrote:
> Hello all,
>
> This patchset attempts to address problems we've been having with
> panics due to memory corruption from the mpt2sas driver.
>
> Thanks,
> Calvin
>
>
> [PATCH v4 1/2] mpt2sas: Refcount sas_device objects and fix unsafe list
> [PATCH v4 2/2] mpt2sas: Refcount fw_events and fix unsafe list usage
>
> Total diffstat:
> drivers/scsi/mpt2sas/mpt2sas_base.h | 22 +-
> drivers/scsi/mpt2sas/mpt2sas_scsih.c | 592 ++++++++++++++++++++++---------
> drivers/scsi/mpt2sas/mpt2sas_transport.c | 12 +-
> 3 files changed, 451 insertions(+), 175 deletions(-)
>
> Diff showing changes v3 => v4:
> http://jcalvinowens.github.io/stuff/mpt2sas-patchset-v3v4.patch
>
> Diff showing changes v2 => v3:
> http://jcalvinowens.github.io/stuff/mpt2sas-patchset-v2v3.patch
>
> Diff showing changes v1 => v2:
> http://jcalvinowens.github.io/stuff/mpt2sas-patchset-v1v2.patch
> --

(Adding JEJB to CC)

James, please consider picking this up for v4.3-rc1.

Btw, I'm seeing the same type of issues on mpt3sas, and unless someone
at Avago is already working on a similar patch series, I'll end up
forward porting these to mpt3sas code.

Thank you,

--nab