2012-08-31 20:01:06

by Dongsu Park

Subject: [PATCH 0/5] Fix bugs in ib_srp patches for H.A. purposes

From: Dongsu Park <[email protected]>

Hi Bart,

This patchset fixes the bugs we have found so far in our own
SRP test environment. The patches are based on v4 of your patchset,
"Make ib_srp better suited for H.A. purposes" (09 Aug 2012).

The 5th patch, "fix an error accessing invalid memory in
rport_dev_loss_timedout", includes your suggestion (30 Aug 2012).

You can also pull the patches from the following git repository:

git://github.com/advance38/linux.git srp-ha

Our test setup consists of two systems:
kernel 3.2.15 with SCST v4193 on the target,
and kernel 3.2.8 with ib_srp-ha on the initiator.
I have since rebased the patches onto 3.6-rc3, but I do not
expect any significant differences.

According to our internal tests over the last few weeks, all of
the known critical issues seem to have been resolved.

Thanks,

Dongsu Park (3):
ib_srp: free memory correctly in srp_free_iu()
ib_srp: hold a mutex when adding a new target port
ib_srp: check if rport->lld_data is NULL before removing rport

Sebastian Riemer (1):
ib_srp: removed superfluous warning in send timeout case

Bart Van Assche (1):
ib_srp: fix an error accessing invalid memory in rport_dev_loss_timedout

drivers/infiniband/ulp/srp/ib_srp.c | 23 +++++++++++++++++++----
drivers/scsi/scsi_transport_srp.c | 11 ++++++++++-
2 files changed, 29 insertions(+), 5 deletions(-)

--
1.7.11.1


2012-08-31 20:01:15

by Dongsu Park

Subject: [PATCH 4/5] ib_srp: check if rport->lld_data is NULL before removing rport

From: Dongsu Park <[email protected]>

After an rport is removed through rport_delete(), rport->lld_data
has to be set to NULL. In addition, both srp_rport_delete() and
rport_dev_loss_timedout() must check whether rport->lld_data is NULL
before accessing rport->lld_data or any of the rport's target data.
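
As a rough userspace illustration of the guard described above (not the
driver code itself), the check-then-clear pattern can be sketched as
follows. The struct names and rport_delete_once() are hypothetical
stand-ins, and the sketch is single-threaded, so it says nothing about
the locking a real driver would still need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for struct srp_target_port and struct srp_rport. */
struct target {
	int remove_queued;
};

struct rport {
	void *lld_data;
};

/* Returns 1 if removal was queued, 0 if the rport was already detached. */
static int rport_delete_once(struct rport *rport)
{
	struct target *target;

	assert(rport != NULL);

	if (!rport->lld_data) {
		fprintf(stderr, "skipping delete: lld_data is NULL\n");
		return 0;
	}

	target = rport->lld_data;
	target->remove_queued++;	/* stands in for srp_queue_remove_work() */
	rport->lld_data = NULL;		/* later callers take the early return */
	return 1;
}
```

A second call on the same rport returns 0 instead of touching the
target again, which is the behaviour the NULL check is meant to give.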

Without this patch, the initiator's kernel can crash with the
following call trace, especially when remote ports are deleted
or when the IB link goes down.

How to reproduce:
1. Configure 500+ vdisks on the target and connect the initiator.
2. Exchange data intensively, which works well.
3. (On the initiator) Occasionally delete an SRP remote port, e.g.
# echo "1" > /sys/class/srp_remote_ports/port-6\:1/delete
Then configure the SRP target again.
4. (On the target) Disable the InfiniBand interface and enable it again.
5. Repeat steps 3 and 4.

The initiator's kernel then crashes, though not every time.

Kernel Call Trace:

BUG: unable to handle kernel paging request at 0000000000010001
IP: [<ffffffff8139ec55>] strnlen+0x5/0x40 PGD 212fea067 PUD 2162f8067 PMD 0
Oops: 0000 [#1] SMP CPU 0
Pid: 2311, comm: kworker/0:2 Not tainted 3.2.8 #1 Supermicro H8DGU/H8DGU
RIP: 0010:[<ffffffff8139ec55>] [<ffffffff8139ec55>] strnlen+0x5/0x40
Process kworker/0:2 (pid: 2311, threadinfo ffff880215fe2000, task
ffff88020f2ce540)
Call Trace:
[<ffffffff813a023c>] ? string+0x4c/0xe0
[<ffffffff813a142d>] ? vsnprintf+0x1ed/0x5b0
[<ffffffffa0131900>] ? do_srp_rport_del+0x30/0x30 [scsi_transport_srp]
[<ffffffff813a18a9>] ? vscnprintf+0x9/0x20
[<ffffffff81049b7f>] ? vprintk+0xaf/0x440
[<ffffffff810f3cc0>] ? next_online_pgdat+0x20/0x50
[<ffffffff810f3d20>] ? next_zone+0x30/0x40
[<ffffffff810f4c60>] ? refresh_cpu_vm_stats+0xf0/0x160
[<ffffffffa0131900>] ? do_srp_rport_del+0x30/0x30 [scsi_transport_srp]
[<ffffffff816533b6>] ? printk+0x40/0x4a
[<ffffffffa013192d>] ? rport_dev_loss_timedout+0x2d/0xa0 [scsi_transport_srp]
[<ffffffff81063383>] ? process_one_work+0x113/0x470
[<ffffffff81065c73>] ? worker_thread+0x163/0x3e0
[<ffffffff81065b10>] ? manage_workers+0x200/0x200
[<ffffffff81065b10>] ? manage_workers+0x200/0x200
[<ffffffff8106a126>] ? kthread+0x96/0xa0
[<ffffffff8165f674>] ? kernel_thread_helper+0x4/0x10
[<ffffffff8106a090>] ? kthread_worker_fn+0x180/0x180
[<ffffffff8165f670>] ? gs_change+0x13/0x13
RIP [<ffffffff8139ec55>] strnlen+0x5/0x40
RSP <ffff880215fe3c28>
CR2: 0000000000010001
---[ end trace d55b61cd78c54a0a ]---
IP: [<ffffffff81069cb7>] kthread_data+0x7/0x10
Oops: 0000 [#2] SMP
CPU 3
Pid: 16745, comm: kworker/3:4 Tainted: G D O 3.2.8-pserver+
#51 System manufacturer System Product Name/M4A89GTD-PRO
RIP: 0010:[<ffffffff81069cb7>] [<ffffffff81069cb7>] kthread_data+0x7/0x10
Process kworker/3:4 (pid: 16745, threadinfo ffff8801f8162000, task
ffff88020ff91440)
Call Trace:
[<ffffffff81062fc8>] ? wq_worker_sleeping+0x8/0x90
[<ffffffff81653ca2>] ? __schedule+0x432/0x7e0
[<ffffffff8104ce54>] ? do_exit+0x5d4/0x8a0
[<ffffffff816533b6>] ? printk+0x40/0x4a
[<ffffffff81656d13>] ? oops_end+0xa3/0xf0
[<ffffffff8102b33d>] ? no_context+0xfd/0x270
[<ffffffff8103dd55>] ? check_preempt_wakeup+0x155/0x1d0
[<ffffffff8165951a>] ? do_page_fault+0x31a/0x440
[<ffffffff810406d2>] ? select_task_rq_fair+0x432/0x9d0
[<ffffffff81396b32>] ? cpumask_next_and+0x22/0x40
[<ffffffff810391f3>] ? find_busiest_group+0x1f3/0xb30
[<ffffffff81656285>] ? page_fault+0x25/0x30
[<ffffffff8139ec55>] ? strnlen+0x5/0x40
[<ffffffff813a023c>] ? string+0x4c/0xe0
[<ffffffff813a142d>] ? vsnprintf+0x1ed/0x5b0
[<ffffffffa0121900>] ? do_srp_rport_del+0x30/0x30 [scsi_transport_srp]
[<ffffffff813a18a9>] ? vscnprintf+0x9/0x20
[<ffffffff81049b7f>] ? vprintk+0xaf/0x440
[<ffffffff8104e429>] ? ns_to_timeval+0x9/0x40
[<ffffffff81062f87>] ? queue_delayed_work_on+0x157/0x170
[<ffffffffa0121900>] ? do_srp_rport_del+0x30/0x30 [scsi_transport_srp]
[<ffffffff816533b6>] ? printk+0x40/0x4a
[<ffffffffa012192d>] ? rport_dev_loss_timedout+0x2d/0xa0 [scsi_transport_srp]
[<ffffffff814f43f0>] ? cpufreq_governor_dbs+0x4b0/0x4b0
[<ffffffff81063383>] ? process_one_work+0x113/0x470
[<ffffffff81065c73>] ? worker_thread+0x163/0x3e0
[<ffffffff81065b10>] ? manage_workers+0x200/0x200
[<ffffffff81065b10>] ? manage_workers+0x200/0x200
[<ffffffff8106a126>] ? kthread+0x96/0xa0
[<ffffffff8165f674>] ? kernel_thread_helper+0x4/0x10
[<ffffffff8106a090>] ? kthread_worker_fn+0x180/0x180
[<ffffffff8165f670>] ? gs_change+0x13/0x13
RIP [<ffffffff81069cb7>] kthread_data+0x7/0x10
RSP <ffff8801f81638d0>
CR2: fffffffffffffff8
---[ end trace cab7f2c38a7f7ba9 ]---

Signed-off-by: Dongsu Park <[email protected]>
---
drivers/infiniband/ulp/srp/ib_srp.c | 12 +++++++++++-
drivers/scsi/scsi_transport_srp.c | 6 ++++++
2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 1b274484..ba7bbfd 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -647,9 +647,19 @@ static void srp_remove_work(struct work_struct *work)

 static void srp_rport_delete(struct srp_rport *rport)
 {
-	struct srp_target_port *target = rport->lld_data;
+	struct srp_target_port *target;
+
+	if (!rport->lld_data) {
+		pr_warn("skipping srp_rport_delete. rport->lld_data=%p\n",
+			rport->lld_data);
+		return;
+	}
+
+	target = rport->lld_data;

 	srp_queue_remove_work(target);
+
+	rport->lld_data = NULL;
 }

/**
diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c
index af3cb56..915b355 100644
--- a/drivers/scsi/scsi_transport_srp.c
+++ b/drivers/scsi/scsi_transport_srp.c
@@ -272,6 +272,12 @@ static void rport_dev_loss_timedout(struct work_struct *work)
 	struct Scsi_Host *shost;
 	struct srp_internal *i;

+	if (!rport->lld_data) {
+		pr_warn("skipping rport_delete, rport->lld_data=%p\n",
+			rport->lld_data);
+		return;
+	}
+
 	pr_err("SRP transport: dev_loss_tmo (%ds) expired - removing %s.\n",
 	       rport->dev_loss_tmo, dev_name(&rport->dev));

--
1.7.11.1

2012-08-31 20:01:13

by Dongsu Park

Subject: [PATCH 3/5] ib_srp: hold a mutex when adding a new target port

From: Dongsu Park <[email protected]>

Under certain circumstances, srp_rport_add() can conflict with
srp_rport_delete(), producing the call traces shown below.
This does not always occur. A possible cause is that sysfs entries
for the SRP target are added too quickly, before the preceding
deletion has finished.

A possible solution is therefore to hold scan_mutex while
calling device_add().
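
The locking idea can be illustrated with a small userspace sketch using
a pthread mutex in place of the host's scan_mutex. Everything here
(struct host, host_init(), fake_device_add(), host_add_port()) is a
hypothetical stand-in rather than kernel API. The point is only that
the lock is dropped on both the success and the error path.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical host with a scan lock, standing in for shost->scan_mutex. */
struct host {
	pthread_mutex_t scan_mutex;
	int nr_ports;
	int max_ports;
};

static void host_init(struct host *h, int max_ports)
{
	pthread_mutex_init(&h->scan_mutex, NULL);
	h->nr_ports = 0;
	h->max_ports = max_ports;
}

/* Stand-in for device_add(): fails once the host is full. */
static int fake_device_add(struct host *h)
{
	if (h->nr_ports >= h->max_ports)
		return -ENOSPC;
	h->nr_ports++;
	return 0;
}

/* Add a port with the scan lock held; returns 0 or a negative errno. */
static int host_add_port(struct host *h)
{
	int ret;

	assert(h != NULL);

	pthread_mutex_lock(&h->scan_mutex);
	ret = fake_device_add(h);
	/* One unlock covers both outcomes; cleanup runs without the lock. */
	pthread_mutex_unlock(&h->scan_mutex);

	return ret;
}
```

The sketch unlocks once, directly after the add, whereas the patch
unlocks separately in the error branch and after it. The two forms
release the lock at the same points.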

Example call trace:

------------[ cut here ]------------
WARNING: at block/genhd.c:1466 __disk_unblock_events+0x10f/0x120()
Pid: 17238, comm: scsi_id Not tainted 3.2.8-pserver #1
Call Trace:
[<ffffffff81048dbb>] ? warn_slowpath_common+0x7b/0xc0
[<ffffffff813879bf>] ? __disk_unblock_events+0x10f/0x120
[<ffffffff81162b30>] ? __blkdev_get+0x190/0x410
[<ffffffff811630c0>] ? blkdev_get+0x310/0x310
[<ffffffff81162dfb>] ? blkdev_get+0x4b/0x310
[<ffffffff811630c0>] ? blkdev_get+0x310/0x310
[<ffffffff8112d513>] ? __dentry_open+0x263/0x370
[<ffffffff8113a0fe>] ? path_get+0x1e/0x30
[<ffffffff8113b4a0>] ? do_last+0x3e0/0x800
[<ffffffff8113c21b>] ? path_openat+0xdb/0x400
[<ffffffff8113c66d>] ? do_filp_open+0x4d/0xc0
[<ffffffff81148c13>] ? alloc_fd+0x43/0x130
[<ffffffff8112d915>] ? do_sys_open+0x105/0x1e0
[<ffffffff8165d512>] ? system_call_fastpath+0x16/0x1b
---[ end trace 4edc2747f936431c ]---
------------[ cut here ]------------
WARNING: at fs/sysfs/inode.c:323 sysfs_hash_and_remove+0xa4/0xb0()
Hardware name: H8DGU
sysfs: can not remove 'bsg', no directory
Pid: 15816, comm: kworker/4:8 Tainted: G W 3.2.8 #1
Call Trace:
[<ffffffff81048dbb>] ? warn_slowpath_common+0x7b/0xc0
[<ffffffff81048eb5>] ? warn_slowpath_fmt+0x45/0x50
[<ffffffff8119a854>] ? sysfs_hash_and_remove+0xa4/0xb0
[<ffffffff8138aaaf>] ? bsg_unregister_queue+0x3f/0x80
[<ffffffffa000eda9>] ? __scsi_remove_device+0x99/0xc0 [scsi_mod]
[<ffffffffa000b3b4>] ? scsi_forget_host+0x64/0x70 [scsi_mod]
[<ffffffffa00035b1>] ? scsi_remove_host+0x61/0x100 [scsi_mod]
[<ffffffffa0643097>] ? srp_remove_work+0x137/0x1c0 [ib_srp]
[<ffffffffa0642f60>] ? srp_free_req_data+0xd0/0xd0 [ib_srp]
[<ffffffff81063383>] ? process_one_work+0x113/0x470
[<ffffffff81065a90>] ? manage_workers+0x180/0x200
[<ffffffff81065c73>] ? worker_thread+0x163/0x3e0
[<ffffffff81065b10>] ? manage_workers+0x200/0x200
[<ffffffff81065b10>] ? manage_workers+0x200/0x200
[<ffffffff8106a126>] ? kthread+0x96/0xa0
[<ffffffff8165f674>] ? kernel_thread_helper+0x4/0x10
[<ffffffff8106a090>] ? kthread_worker_fn+0x180/0x180
[<ffffffff8165f670>] ? gs_change+0x13/0x13
---[ end trace 4edc2747f936431d ]---
------------[ cut here ]------------

Signed-off-by: Dongsu Park <[email protected]>
---
drivers/scsi/scsi_transport_srp.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c
index 7f17686..af3cb56 100644
--- a/drivers/scsi/scsi_transport_srp.c
+++ b/drivers/scsi/scsi_transport_srp.c
@@ -407,12 +407,15 @@ struct srp_rport *srp_rport_add(struct Scsi_Host *shost,

 	transport_setup_device(&rport->dev);

+	mutex_lock(&shost->scan_mutex);
 	ret = device_add(&rport->dev);
 	if (ret) {
+		mutex_unlock(&shost->scan_mutex);
 		transport_destroy_device(&rport->dev);
 		put_device(&rport->dev);
 		return ERR_PTR(ret);
 	}
+	mutex_unlock(&shost->scan_mutex);

 	if (shost->active_mode & MODE_TARGET &&
 	    ids->roles == SRP_RPORT_ROLE_INITIATOR) {
--
1.7.11.1

2012-08-31 20:01:46

by Dongsu Park

Subject: [PATCH 5/5] ib_srp: fix an error accessing invalid memory in rport_dev_loss_timedout

From: Bart Van Assche <[email protected]>

In rport_dev_loss_timedout(), rport must be obtained by accessing
the member entry dev_loss_work, not fast_io_fail_work.
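
To show why the member name matters, here is a userspace sketch of the
container_of() arithmetic. The macro below is a simplified equivalent
of the kernel one (no type checking), and struct work and struct rport
are hypothetical cut-down types: plain embedded structs are used
instead of struct delayed_work and to_delayed_work().

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct work {
	int pending;
};

/* Hypothetical cut-down rport with two embedded work items. */
struct rport {
	int id;
	struct work fast_io_fail_work;
	struct work dev_loss_work;
};

/* Correct: the work pointer really is the dev_loss_work member. */
static struct rport *rport_from_dev_loss(struct work *w)
{
	assert(w != NULL);
	return container_of(w, struct rport, dev_loss_work);
}

/* The bug: same pointer, wrong member name, so the offset is wrong. */
static struct rport *rport_from_wrong_member(struct work *w)
{
	assert(w != NULL);
	return container_of(w, struct rport, fast_io_fail_work);
}
```

Given a pointer to dev_loss_work, the first helper returns the address
of the enclosing struct, while the second returns an address shifted by
the distance between the two members. Dereferencing that shifted
pointer reads unrelated memory.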

Signed-off-by: Bart Van Assche <[email protected]>
---
drivers/scsi/scsi_transport_srp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c
index 915b355..d796413 100644
--- a/drivers/scsi/scsi_transport_srp.c
+++ b/drivers/scsi/scsi_transport_srp.c
@@ -242,7 +242,7 @@ static void rport_fast_io_fail_timedout(struct work_struct *work)
 {
 	struct srp_rport *rport =
 		container_of(to_delayed_work(work), struct srp_rport,
-			     fast_io_fail_work);
+			     dev_loss_work);
 	struct Scsi_Host *shost;
 	struct srp_internal *i;

--
1.7.11.1

2012-08-31 20:02:10

by Dongsu Park

Subject: [PATCH 2/5] ib_srp: removed superfluous warning in send timeout case

From: Dongsu Park <[email protected]>

Signed-off-by: Sebastian Riemer <[email protected]>
---
drivers/infiniband/ulp/srp/ib_srp.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index a0d0ca2..1b274484 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -534,7 +534,6 @@ static void srp_wait_last_send_wqe(struct srp_target_port *target)
 		msleep(20);
 	}

-	WARN_ON(!target->last_send_wqe);
 }

 static void srp_disconnect_target(struct srp_target_port *target)
--
1.7.11.1

2012-08-31 20:02:36

by Dongsu Park

Subject: [PATCH 1/5] ib_srp: free memory correctly in srp_free_iu()

From: Dongsu Park <[email protected]>

As a potential fix for a race condition in srp_free_iu(),
hold a mutex in srp_free_target_ib() around the calls to srp_free_iu().

In addition, clear the rx/tx rings after freeing their memory.
Each entry of rx_ring[] and tx_ring[] should be reset to NULL
to prevent other tasks from accessing the freed memory.
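
The free-then-clear pattern can be sketched in userspace as below, with
malloc()/free() standing in for the IU allocation helpers and a pthread
mutex standing in for target->mutex. RING_SIZE, struct ring_owner and
both functions are hypothetical names chosen for illustration only.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define RING_SIZE 4	/* illustrative; not the driver's SRP_RQ_SIZE */

/* Hypothetical owner of a ring of heap buffers, guarded by a mutex. */
struct ring_owner {
	pthread_mutex_t mutex;
	void *ring[RING_SIZE];
};

static void ring_owner_init(struct ring_owner *o)
{
	int i;

	pthread_mutex_init(&o->mutex, NULL);
	for (i = 0; i < RING_SIZE; ++i)
		o->ring[i] = malloc(16);
}

/* Free every slot and reset it to NULL while holding the mutex. */
static void ring_free_all(struct ring_owner *o)
{
	int i;

	assert(o != NULL);

	pthread_mutex_lock(&o->mutex);
	for (i = 0; i < RING_SIZE; ++i) {
		free(o->ring[i]);	/* free(NULL) is a no-op */
		o->ring[i] = NULL;	/* no stale pointer is left behind */
	}
	pthread_mutex_unlock(&o->mutex);
}
```

Because each slot is reset to NULL under the lock, a second call to
ring_free_all() is harmless, and any other code that takes the same
mutex sees either a valid pointer or NULL.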

Signed-off-by: Dongsu Park <[email protected]>
---
drivers/infiniband/ulp/srp/ib_srp.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 7ae5a00..a0d0ca2 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -291,10 +291,16 @@ static void srp_free_target_ib(struct srp_target_port *target)
 	ib_destroy_cq(target->send_cq);
 	ib_destroy_cq(target->recv_cq);

-	for (i = 0; i < SRP_RQ_SIZE; ++i)
+	mutex_lock(&target->mutex);
+	for (i = 0; i < SRP_RQ_SIZE; ++i) {
 		srp_free_iu(target->srp_host, target->rx_ring[i]);
-	for (i = 0; i < SRP_SQ_SIZE; ++i)
+		target->rx_ring[i] = NULL;
+	}
+	for (i = 0; i < SRP_SQ_SIZE; ++i) {
 		srp_free_iu(target->srp_host, target->tx_ring[i]);
+		target->tx_ring[i] = NULL;
+	}
+	mutex_unlock(&target->mutex);
 }

 static void srp_path_rec_completion(int status,
--
1.7.11.1