2013-10-23 09:09:21

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 0/5] Some DRBD fixes for the 3.13 merge window

Hi Jens,

Please merge these patches to you for-3.13/drivers branch.
They contain of fixes in random places over the DRBD code

Best,
Phil


Lars Ellenberg (3):
drbd: fix NULL pointer deref in module init error path
drbd: fix decoding of bitmap vli rle for device sizes > 64 TB
drbd: avoid to shrink max_bio_size due to peer re-configuration

Philipp Reisner (2):
drbd: Fix an connection drop issue after enabling allow-two-primaries
drbd: Fix adding of new minors with freshly created meta data

drivers/block/drbd/drbd_int.h | 3 ++-
drivers/block/drbd/drbd_main.c | 19 ++++++++-------
drivers/block/drbd/drbd_nl.c | 6 ++---
drivers/block/drbd/drbd_receiver.c | 45 +++++++++++++++++-------------------
drivers/block/drbd/drbd_req.c | 3 +++
5 files changed, 38 insertions(+), 38 deletions(-)

--
1.7.9.5


2013-10-23 09:09:23

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 5/5] drbd: avoid to shrink max_bio_size due to peer re-configuration

From: Lars Ellenberg <[email protected]>

For a long time, the receiving side has spread "too large" incoming
requests over multiple bios. No need to shrink our max_bio_size
(max_hw_sectors) if the peer is reconfigured to use a different storage.

The problem manifests itself if we are not the top of the device stack
(DRBD is used a LVM PV).

A hardware reconfiguration on the peer may cause the supported
max_bio_size to shrink, and the connection handshake would now
unnecessarily shrink the max_bio_size on the active node.

There is no way to notify upper layers that they have to "re-stack"
their limits. So they won't notice at all, and may keep submitting bios
that are suddenly considered "too large for device".

We already check for compatibility and ignore changes on the peer,
the code only was masked out unless we have a fully established connection.
We just need to allow it a bit earlier during the handshake.

Also consider max_hw_sectors in our merge bvec function, just in case.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_nl.c | 4 ++--
drivers/block/drbd/drbd_req.c | 3 +++
2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 37dad18..c706d50 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -1132,9 +1132,9 @@ void drbd_reconsider_max_bio_size(struct drbd_conf *mdev)
/* We may ignore peer limits if the peer is modern enough.
Because new from 8.3.8 onwards the peer can use multiple
BIOs for a single peer_request */
- if (mdev->state.conn >= C_CONNECTED) {
+ if (mdev->state.conn >= C_WF_REPORT_PARAMS) {
if (mdev->tconn->agreed_pro_version < 94)
- peer = min( mdev->peer_max_bio_size, DRBD_MAX_SIZE_H80_PACKET);
+ peer = min(mdev->peer_max_bio_size, DRBD_MAX_SIZE_H80_PACKET);
/* Correct old drbd (up to 8.3.7) if it believes it can do more than 32KiB */
else if (mdev->tconn->agreed_pro_version == 94)
peer = DRBD_MAX_SIZE_H80_PACKET;
diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
index c24379f..fec7bef 100644
--- a/drivers/block/drbd/drbd_req.c
+++ b/drivers/block/drbd/drbd_req.c
@@ -1306,6 +1306,7 @@ int drbd_merge_bvec(struct request_queue *q, struct bvec_merge_data *bvm, struct
int backing_limit;

if (bio_size && get_ldev(mdev)) {
+ unsigned int max_hw_sectors = queue_max_hw_sectors(q);
struct request_queue * const b =
mdev->ldev->backing_bdev->bd_disk->queue;
if (b->merge_bvec_fn) {
@@ -1313,6 +1314,8 @@ int drbd_merge_bvec(struct request_queue *q, struct bvec_merge_data *bvm, struct
limit = min(limit, backing_limit);
}
put_ldev(mdev);
+ if ((limit >> 9) > max_hw_sectors)
+ limit = max_hw_sectors << 9;
}
return limit;
}
--
1.7.9.5

2013-10-23 09:09:22

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 1/5] drbd: fix NULL pointer deref in module init error path

From: Lars Ellenberg <[email protected]>

If we want to iterate over the (as of yet still empty) list in the
cleanup path, we need to initialize the list before the first goto fail.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_main.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 55635ed..9e3818b 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -2750,13 +2750,6 @@ int __init drbd_init(void)
return err;
}

- err = drbd_genl_register();
- if (err) {
- printk(KERN_ERR "drbd: unable to register generic netlink family\n");
- goto fail;
- }
-
-
register_reboot_notifier(&drbd_notifier);

/*
@@ -2767,6 +2760,15 @@ int __init drbd_init(void)
drbd_proc = NULL; /* play safe for drbd_cleanup */
idr_init(&minors);

+ rwlock_init(&global_state_lock);
+ INIT_LIST_HEAD(&drbd_tconns);
+
+ err = drbd_genl_register();
+ if (err) {
+ printk(KERN_ERR "drbd: unable to register generic netlink family\n");
+ goto fail;
+ }
+
err = drbd_create_mempools();
if (err)
goto fail;
@@ -2778,9 +2780,6 @@ int __init drbd_init(void)
goto fail;
}

- rwlock_init(&global_state_lock);
- INIT_LIST_HEAD(&drbd_tconns);
-
retry.wq = create_singlethread_workqueue("drbd-reissue");
if (!retry.wq) {
printk(KERN_ERR "drbd: unable to create retry workqueue\n");
--
1.7.9.5

2013-10-23 09:10:01

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 3/5] drbd: Fix adding of new minors with freshly created meta data

Online adding of new minors with freshly created meta data
to an resource with an established connection failed, with a
wrong state transition on one side on one side of the new minor.

Freshly created meta-data has a la_size (last agreed size) of 0.
When we online add such devices, the code wrongly got into
the code path for resyncing new storage that was added while
the disk was detached.

Fixed that by making the GREW from ZERO a special case.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_int.h | 3 ++-
drivers/block/drbd/drbd_nl.c | 2 +-
2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index 2d7f608..0e06f0c 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -1474,7 +1474,8 @@ enum determine_dev_size {
DS_ERROR = -1,
DS_UNCHANGED = 0,
DS_SHRUNK = 1,
- DS_GREW = 2
+ DS_GREW = 2,
+ DS_GREW_FROM_ZERO = 3,
};
extern enum determine_dev_size
drbd_determine_dev_size(struct drbd_conf *, enum dds_flags, struct resize_parms *) __must_hold(local);
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 8cc1e64..37dad18 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -955,7 +955,7 @@ drbd_determine_dev_size(struct drbd_conf *mdev, enum dds_flags flags, struct res
}

if (size > la_size_sect)
- rv = DS_GREW;
+ rv = la_size_sect ? DS_GREW : DS_GREW_FROM_ZERO;
if (size < la_size_sect)
rv = DS_SHRUNK;

--
1.7.9.5

2013-10-23 09:10:00

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 4/5] drbd: fix decoding of bitmap vli rle for device sizes > 64 TB

From: Lars Ellenberg <[email protected]>

Symptoms: disconnect after bitmap exchange due to
bitmap overflow (e:49731075554) while decoding bm RLE packet

In the decoding step of the variable length integer run length encoding
there was potentially an uncatched bitshift by wordsize (variable >> 64).

The result of which is "undefined" :(
(only "sometimes" the result is the desired 0)

Fix: don't do any bit shift magic for shift == 64, just assign.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_receiver.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index 12c59eb..6fa6673 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -4125,7 +4125,11 @@ recv_bm_rle_bits(struct drbd_conf *mdev,
(unsigned int)bs.buf_len);
return -EIO;
}
- look_ahead >>= bits;
+ /* if we consumed all 64 bits, assign 0; >> 64 is "undefined"; */
+ if (likely(bits < 64))
+ look_ahead >>= bits;
+ else
+ look_ahead = 0;
have -= bits;

bits = bitstream_get_bits(&bs, &tmp, 64 - have);
--
1.7.9.5

2013-10-23 09:09:59

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 2/5] drbd: Fix an connection drop issue after enabling allow-two-primaries

Since drbd-8.4.0 it is possible to change the allow-two-primaries
network option while the connection is established.

The sequence code used to partially order packets from the
data socket with packets from the meta-data socket, still assued
that the allow-two-primaries option is constant while the
connection is established.

I.e.
On a node that has the RESOLVE_CONFLICTS bits set, after enabling
allow-two-primaries, when receiving the next data packet it timed out
while waiting for the necessary packets on the data socket to arrive
(wait_for_and_update_peer_seq() function).

Fixed that by always tracking the sequence number, but only waiting
for it if allow-two-primaries is set.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_receiver.c | 39 +++++++++++++++---------------------
1 file changed, 16 insertions(+), 23 deletions(-)

diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index cc29cd3..12c59eb 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -1890,29 +1890,11 @@ static u32 seq_max(u32 a, u32 b)
return seq_greater(a, b) ? a : b;
}

-static bool need_peer_seq(struct drbd_conf *mdev)
-{
- struct drbd_tconn *tconn = mdev->tconn;
- int tp;
-
- /*
- * We only need to keep track of the last packet_seq number of our peer
- * if we are in dual-primary mode and we have the resolve-conflicts flag set; see
- * handle_write_conflicts().
- */
-
- rcu_read_lock();
- tp = rcu_dereference(mdev->tconn->net_conf)->two_primaries;
- rcu_read_unlock();
-
- return tp && test_bit(RESOLVE_CONFLICTS, &tconn->flags);
-}
-
static void update_peer_seq(struct drbd_conf *mdev, unsigned int peer_seq)
{
unsigned int newest_peer_seq;

- if (need_peer_seq(mdev)) {
+ if (test_bit(RESOLVE_CONFLICTS, &mdev->tconn->flags)) {
spin_lock(&mdev->peer_seq_lock);
newest_peer_seq = seq_max(mdev->peer_seq, peer_seq);
mdev->peer_seq = newest_peer_seq;
@@ -1972,22 +1954,31 @@ static int wait_for_and_update_peer_seq(struct drbd_conf *mdev, const u32 peer_s
{
DEFINE_WAIT(wait);
long timeout;
- int ret;
+ int ret = 0, tp;

- if (!need_peer_seq(mdev))
+ if (!test_bit(RESOLVE_CONFLICTS, &mdev->tconn->flags))
return 0;

spin_lock(&mdev->peer_seq_lock);
for (;;) {
if (!seq_greater(peer_seq - 1, mdev->peer_seq)) {
mdev->peer_seq = seq_max(mdev->peer_seq, peer_seq);
- ret = 0;
break;
}
+
if (signal_pending(current)) {
ret = -ERESTARTSYS;
break;
}
+
+ rcu_read_lock();
+ tp = rcu_dereference(mdev->tconn->net_conf)->two_primaries;
+ rcu_read_unlock();
+
+ if (!tp)
+ break;
+
+ /* Only need to wait if two_primaries is enabled */
prepare_to_wait(&mdev->seq_wait, &wait, TASK_INTERRUPTIBLE);
spin_unlock(&mdev->peer_seq_lock);
rcu_read_lock();
@@ -2228,8 +2219,10 @@ static int receive_Data(struct drbd_tconn *tconn, struct packet_info *pi)
}
goto out_interrupted;
}
- } else
+ } else {
+ update_peer_seq(mdev, peer_seq);
spin_lock_irq(&mdev->tconn->req_lock);
+ }
list_add(&peer_req->w.list, &mdev->active_ee);
spin_unlock_irq(&mdev->tconn->req_lock);

--
1.7.9.5

2013-10-23 09:33:00

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH 0/5] Some DRBD fixes for the 3.13 merge window

On Wed, Oct 23 2013, Philipp Reisner wrote:
> Hi Jens,
>
> Please merge these patches to you for-3.13/drivers branch.
> They contain of fixes in random places over the DRBD code

Thanks Philipp, all applied for 3.13.

--
Jens Axboe