2014-10-29 16:38:08

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 0/7] RFC DRBD updates for 3.19

Hi,

these are the proposed DRBD changes for 3.19. All minor stuff
and fixes.

I will repost it in one week with the appropriate CC: stable
header set, and eventual feedback integrated.

Thanks!


Andreas Gruenbacher (2):
drbd: Minor cleanups
drbd: Only use drbd_msg_put_info() in drbd_nl.c

Lars Ellenberg (2):
drbd: fix resync throttling initialization
drbd: merge_bvec_fn: properly remap bvm->bi_bdev

Philipp Reisner (3):
drbd: fix race between role change and handshake
drbd: Fix state change in case of connection timeout
drbd: Remove an useless copy of kernel_setsockopt()

drivers/block/drbd/drbd_actlog.c | 3 +-
drivers/block/drbd/drbd_int.h | 39 +++++------------------
drivers/block/drbd/drbd_main.c | 23 ++++----------
drivers/block/drbd/drbd_nl.c | 64 +++++++++++++++++---------------------
drivers/block/drbd/drbd_receiver.c | 2 +-
drivers/block/drbd/drbd_req.c | 3 +-
drivers/block/drbd/drbd_state.c | 42 ++++++++++++++++++++-----
drivers/block/drbd/drbd_state.h | 5 +++
drivers/block/drbd/drbd_worker.c | 5 ++-
include/linux/drbd.h | 2 +-
10 files changed, 90 insertions(+), 98 deletions(-)

--
1.9.1


2014-10-29 16:38:06

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 1/7] drbd: Minor cleanups

From: Andreas Gruenbacher <[email protected]>

. Update comments
. drbd_set_{in,out_of}_sync(): Remove unused parameters
. Move common code into adm_del_resource()
. Redefine ERR_MINOR_EXISTS -> ERR_MINOR_OR_VOLUME_EXISTS

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_actlog.c | 3 +-
drivers/block/drbd/drbd_int.h | 9 +++---
drivers/block/drbd/drbd_main.c | 6 ++--
drivers/block/drbd/drbd_nl.c | 60 +++++++++++++++++-----------------------
include/linux/drbd.h | 2 +-
5 files changed, 35 insertions(+), 45 deletions(-)

diff --git a/drivers/block/drbd/drbd_actlog.c b/drivers/block/drbd/drbd_actlog.c
index a2dfa16..1318e32 100644
--- a/drivers/block/drbd/drbd_actlog.c
+++ b/drivers/block/drbd/drbd_actlog.c
@@ -827,8 +827,7 @@ static int update_sync_bits(struct drbd_device *device,
*
*/
int __drbd_change_sync(struct drbd_device *device, sector_t sector, int size,
- enum update_sync_bits_mode mode,
- const char *file, const unsigned int line)
+ enum update_sync_bits_mode mode)
{
/* Is called from worker and receiver context _only_ */
unsigned long sbnr, ebnr, lbnr;
diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index 9b22f8f..c14b718 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -1662,14 +1662,13 @@ extern void drbd_advance_rs_marks(struct drbd_device *device, unsigned long stil

enum update_sync_bits_mode { RECORD_RS_FAILED, SET_OUT_OF_SYNC, SET_IN_SYNC };
extern int __drbd_change_sync(struct drbd_device *device, sector_t sector, int size,
- enum update_sync_bits_mode mode,
- const char *file, const unsigned int line);
+ enum update_sync_bits_mode mode);
#define drbd_set_in_sync(device, sector, size) \
- __drbd_change_sync(device, sector, size, SET_IN_SYNC, __FILE__, __LINE__)
+ __drbd_change_sync(device, sector, size, SET_IN_SYNC)
#define drbd_set_out_of_sync(device, sector, size) \
- __drbd_change_sync(device, sector, size, SET_OUT_OF_SYNC, __FILE__, __LINE__)
+ __drbd_change_sync(device, sector, size, SET_OUT_OF_SYNC)
#define drbd_rs_failed_io(device, sector, size) \
- __drbd_change_sync(device, sector, size, RECORD_RS_FAILED, __FILE__, __LINE__)
+ __drbd_change_sync(device, sector, size, RECORD_RS_FAILED)
extern void drbd_al_shrink(struct drbd_device *device);
extern int drbd_initialize_al(struct drbd_device *, void *);

diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 973c185..cce25a5 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -2731,7 +2731,7 @@ enum drbd_ret_code drbd_create_device(struct drbd_config_context *adm_ctx, unsig

device = minor_to_device(minor);
if (device)
- return ERR_MINOR_EXISTS;
+ return ERR_MINOR_OR_VOLUME_EXISTS;

/* GFP_KERNEL, we are outside of all write-out paths */
device = kzalloc(sizeof(struct drbd_device), GFP_KERNEL);
@@ -2794,7 +2794,7 @@ enum drbd_ret_code drbd_create_device(struct drbd_config_context *adm_ctx, unsig
id = idr_alloc(&drbd_devices, device, minor, minor + 1, GFP_KERNEL);
if (id < 0) {
if (id == -ENOSPC) {
- err = ERR_MINOR_EXISTS;
+ err = ERR_MINOR_OR_VOLUME_EXISTS;
drbd_msg_put_info(adm_ctx->reply_skb, "requested minor exists already");
}
goto out_no_minor_idr;
@@ -2804,7 +2804,7 @@ enum drbd_ret_code drbd_create_device(struct drbd_config_context *adm_ctx, unsig
id = idr_alloc(&resource->devices, device, vnr, vnr + 1, GFP_KERNEL);
if (id < 0) {
if (id == -ENOSPC) {
- err = ERR_MINOR_EXISTS;
+ err = ERR_MINOR_OR_VOLUME_EXISTS;
drbd_msg_put_info(adm_ctx->reply_skb, "requested minor exists already");
}
goto out_idr_remove_minor;
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 1cd47df..c145619 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -2052,7 +2052,7 @@ check_net_options(struct drbd_connection *connection, struct net_conf *new_net_c
rv = _check_net_options(connection, rcu_dereference(connection->net_conf), new_net_conf);
rcu_read_unlock();

- /* connection->volumes protected by genl_lock() here */
+ /* connection->peer_devices protected by genl_lock() here */
idr_for_each_entry(&connection->peer_devices, peer_device, i) {
struct drbd_device *device = peer_device->device;
if (!device->bitmap) {
@@ -3483,7 +3483,7 @@ int drbd_adm_new_minor(struct sk_buff *skb, struct genl_info *info)
* that first_peer_device(device)->connection and device->vnr match the request. */
if (adm_ctx.device) {
if (info->nlhdr->nlmsg_flags & NLM_F_EXCL)
- retcode = ERR_MINOR_EXISTS;
+ retcode = ERR_MINOR_OR_VOLUME_EXISTS;
/* else: still NO_ERROR */
goto out;
}
@@ -3530,6 +3530,27 @@ out:
return 0;
}

+static int adm_del_resource(struct drbd_resource *resource)
+{
+ struct drbd_connection *connection;
+
+ for_each_connection(connection, resource) {
+ if (connection->cstate > C_STANDALONE)
+ return ERR_NET_CONFIGURED;
+ }
+ if (!idr_is_empty(&resource->devices))
+ return ERR_RES_IN_USE;
+
+ list_del_rcu(&resource->resources);
+ /* Make sure all threads have actually stopped: state handling only
+ * does drbd_thread_stop_nowait(). */
+ list_for_each_entry(connection, &resource->connections, connections)
+ drbd_thread_stop(&connection->worker);
+ synchronize_rcu();
+ drbd_free_resource(resource);
+ return NO_ERROR;
+}
+
int drbd_adm_down(struct sk_buff *skb, struct genl_info *info)
{
struct drbd_config_context adm_ctx;
@@ -3575,14 +3596,6 @@ int drbd_adm_down(struct sk_buff *skb, struct genl_info *info)
}
}

- /* If we reach this, all volumes (of this connection) are Secondary,
- * Disconnected, Diskless, aka Unconfigured. Make sure all threads have
- * actually stopped, state handling only does drbd_thread_stop_nowait(). */
- for_each_connection(connection, resource)
- drbd_thread_stop(&connection->worker);
-
- /* Now, nothing can fail anymore */
-
/* delete volumes */
idr_for_each_entry(&resource->devices, device, i) {
retcode = adm_del_minor(device);
@@ -3593,10 +3606,7 @@ int drbd_adm_down(struct sk_buff *skb, struct genl_info *info)
}
}

- list_del_rcu(&resource->resources);
- synchronize_rcu();
- drbd_free_resource(resource);
- retcode = NO_ERROR;
+ retcode = adm_del_resource(resource);
out:
mutex_unlock(&resource->adm_mutex);
finish:
@@ -3608,7 +3618,6 @@ int drbd_adm_del_resource(struct sk_buff *skb, struct genl_info *info)
{
struct drbd_config_context adm_ctx;
struct drbd_resource *resource;
- struct drbd_connection *connection;
enum drbd_ret_code retcode;

retcode = drbd_adm_prepare(&adm_ctx, skb, info, DRBD_ADM_NEED_RESOURCE);
@@ -3616,27 +3625,10 @@ int drbd_adm_del_resource(struct sk_buff *skb, struct genl_info *info)
return retcode;
if (retcode != NO_ERROR)
goto finish;
-
resource = adm_ctx.resource;
- mutex_lock(&resource->adm_mutex);
- for_each_connection(connection, resource) {
- if (connection->cstate > C_STANDALONE) {
- retcode = ERR_NET_CONFIGURED;
- goto out;
- }
- }
- if (!idr_is_empty(&resource->devices)) {
- retcode = ERR_RES_IN_USE;
- goto out;
- }

- list_del_rcu(&resource->resources);
- for_each_connection(connection, resource)
- drbd_thread_stop(&connection->worker);
- synchronize_rcu();
- drbd_free_resource(resource);
- retcode = NO_ERROR;
-out:
+ mutex_lock(&resource->adm_mutex);
+ retcode = adm_del_resource(resource);
mutex_unlock(&resource->adm_mutex);
finish:
drbd_adm_finish(&adm_ctx, info, retcode);
diff --git a/include/linux/drbd.h b/include/linux/drbd.h
index debb70d..8723f2a 100644
--- a/include/linux/drbd.h
+++ b/include/linux/drbd.h
@@ -172,7 +172,7 @@ enum drbd_ret_code {
ERR_RES_NOT_KNOWN = 158,
ERR_RES_IN_USE = 159,
ERR_MINOR_CONFIGURED = 160,
- ERR_MINOR_EXISTS = 161,
+ ERR_MINOR_OR_VOLUME_EXISTS = 161,
ERR_INVALID_REQUEST = 162,
ERR_NEED_APV_100 = 163,
ERR_NEED_ALLOW_TWO_PRI = 164,
--
1.9.1

2014-10-29 16:38:04

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 4/7] drbd: fix resync throttling initialization

From: Lars Ellenberg <[email protected]>

If for some reason DRBD resync was the only activity on a backend
device, drbd_rs_c_min_rate_throttle() would mistakenly decide that it is
still initialization time, and keep throttling the resync.

This patch explicitly initializes ->rs_last_events to the current
backend event counters, and drops the rs_last_events == 0 from the
throttle condition.

Reported-by: Mikhail Sugakov <[email protected]>

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_receiver.c | 2 +-
drivers/block/drbd/drbd_state.c | 1 -
drivers/block/drbd/drbd_worker.c | 5 ++++-
3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index 6960fb0..d169b4a 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -2482,7 +2482,7 @@ bool drbd_rs_c_min_rate_throttle(struct drbd_device *device)
atomic_read(&device->rs_sect_ev);

if (atomic_read(&device->ap_actlog_cnt)
- || !device->rs_last_events || curr_events - device->rs_last_events > 64) {
+ || curr_events - device->rs_last_events > 64) {
unsigned long rs_left;
int i;

diff --git a/drivers/block/drbd/drbd_state.c b/drivers/block/drbd/drbd_state.c
index 4529d92..2d7dd26 100644
--- a/drivers/block/drbd/drbd_state.c
+++ b/drivers/block/drbd/drbd_state.c
@@ -1099,7 +1099,6 @@ __drbd_set_state(struct drbd_device *device, union drbd_state ns,

set_ov_position(device, ns.conn);
device->rs_start = now;
- device->rs_last_events = 0;
device->rs_last_sect_ev = 0;
device->ov_last_oos_size = 0;
device->ov_last_oos_start = 0;
diff --git a/drivers/block/drbd/drbd_worker.c b/drivers/block/drbd/drbd_worker.c
index d2d1f97..d0fae55 100644
--- a/drivers/block/drbd/drbd_worker.c
+++ b/drivers/block/drbd/drbd_worker.c
@@ -1592,11 +1592,15 @@ void drbd_resync_after_changed(struct drbd_device *device)

void drbd_rs_controller_reset(struct drbd_device *device)
{
+ struct gendisk *disk = device->ldev->backing_bdev->bd_contains->bd_disk;
struct fifo_buffer *plan;

atomic_set(&device->rs_sect_in, 0);
atomic_set(&device->rs_sect_ev, 0);
device->rs_in_flight = 0;
+ device->rs_last_events =
+ (int)part_stat_read(&disk->part0, sectors[0]) +
+ (int)part_stat_read(&disk->part0, sectors[1]);

/* Updating the RCU protected object in place is necessary since
this function gets called from atomic context.
@@ -1743,7 +1747,6 @@ void drbd_start_resync(struct drbd_device *device, enum drbd_conns side)
device->rs_failed = 0;
device->rs_paused = 0;
device->rs_same_csum = 0;
- device->rs_last_events = 0;
device->rs_last_sect_ev = 0;
device->rs_total = tw;
device->rs_start = now;
--
1.9.1

2014-10-29 16:38:47

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 7/7] drbd: Remove an useless copy of kernel_setsockopt()

Old backward-compat cruft

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_int.h | 29 ++++-------------------------
1 file changed, 4 insertions(+), 25 deletions(-)

diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index d2a7589..b905e98 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -1557,52 +1557,31 @@ extern void drbd_set_recv_tcq(struct drbd_device *device, int tcq_enabled);
extern void _drbd_clear_done_ee(struct drbd_device *device, struct list_head *to_be_freed);
extern int drbd_connected(struct drbd_peer_device *);

-/* Yes, there is kernel_setsockopt, but only since 2.6.18.
- * So we have our own copy of it here. */
-static inline int drbd_setsockopt(struct socket *sock, int level, int optname,
- char *optval, int optlen)
-{
- mm_segment_t oldfs = get_fs();
- char __user *uoptval;
- int err;
-
- uoptval = (char __user __force *)optval;
-
- set_fs(KERNEL_DS);
- if (level == SOL_SOCKET)
- err = sock_setsockopt(sock, level, optname, uoptval, optlen);
- else
- err = sock->ops->setsockopt(sock, level, optname, uoptval,
- optlen);
- set_fs(oldfs);
- return err;
-}
-
static inline void drbd_tcp_cork(struct socket *sock)
{
int val = 1;
- (void) drbd_setsockopt(sock, SOL_TCP, TCP_CORK,
+ (void) kernel_setsockopt(sock, SOL_TCP, TCP_CORK,
(char*)&val, sizeof(val));
}

static inline void drbd_tcp_uncork(struct socket *sock)
{
int val = 0;
- (void) drbd_setsockopt(sock, SOL_TCP, TCP_CORK,
+ (void) kernel_setsockopt(sock, SOL_TCP, TCP_CORK,
(char*)&val, sizeof(val));
}

static inline void drbd_tcp_nodelay(struct socket *sock)
{
int val = 1;
- (void) drbd_setsockopt(sock, SOL_TCP, TCP_NODELAY,
+ (void) kernel_setsockopt(sock, SOL_TCP, TCP_NODELAY,
(char*)&val, sizeof(val));
}

static inline void drbd_tcp_quickack(struct socket *sock)
{
int val = 2;
- (void) drbd_setsockopt(sock, SOL_TCP, TCP_QUICKACK,
+ (void) kernel_setsockopt(sock, SOL_TCP, TCP_QUICKACK,
(char*)&val, sizeof(val));
}

--
1.9.1

2014-10-29 16:39:26

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 5/7] drbd: merge_bvec_fn: properly remap bvm->bi_bdev

From: Lars Ellenberg <[email protected]>

This was not noticed for many years. Affects operation if
md raid is used a backing device for DRBD.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_req.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
index 5a01c53..90319b1 100644
--- a/drivers/block/drbd/drbd_req.c
+++ b/drivers/block/drbd/drbd_req.c
@@ -1545,6 +1545,7 @@ int drbd_merge_bvec(struct request_queue *q, struct bvec_merge_data *bvm, struct
struct request_queue * const b =
device->ldev->backing_bdev->bd_disk->queue;
if (b->merge_bvec_fn) {
+ bvm->bi_bdev = device->ldev->backing_bdev;
backing_limit = b->merge_bvec_fn(b, bvm, bvec);
limit = min(limit, backing_limit);
}
--
1.9.1

2014-10-29 16:39:29

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 3/7] drbd: fix race between role change and handshake

Symptoms:
If DRBD was "cleanly shut down" (all in sync, both Secondary before
disconnect, identical data generation uuids), and then one side was
promoted *during* the next connection handshake, the role change
could confuse the handshake.

The Primary would get stuck in WFBitmapS, the Secondary would log
unexpected cstate (Connected) in receive_bitmap
and get stuck in WFBitmapT.

Fix:
The test in is_valid_soft_transition wrong. It works because
the not allowed actions (promote/attach) do not touch the
cstate. The previous condition failed to demand a cstate change
in one clause.

In order to avoid deadlocks give up the state_mutex while waiting
for the transient state to go away.

Conflicts:
drbd/drbd_state.c
drbd/drbd_state.h
drbd/drbd_wrappers.h

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_nl.c | 2 +-
drivers/block/drbd/drbd_state.c | 41 ++++++++++++++++++++++++++++++++++-------
drivers/block/drbd/drbd_state.h | 5 +++++
3 files changed, 40 insertions(+), 8 deletions(-)

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 4782d07..74df8cf 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -588,7 +588,7 @@ drbd_set_role(struct drbd_device *const device, enum drbd_role new_role, int for
val.i = 0; val.role = new_role;

while (try++ < max_tries) {
- rv = _drbd_request_state(device, mask, val, CS_WAIT_COMPLETE);
+ rv = _drbd_request_state_holding_state_mutex(device, mask, val, CS_WAIT_COMPLETE);

/* in case we first succeeded to outdate,
* but now suddenly could establish a connection */
diff --git a/drivers/block/drbd/drbd_state.c b/drivers/block/drbd/drbd_state.c
index 84b11f8..4529d92 100644
--- a/drivers/block/drbd/drbd_state.c
+++ b/drivers/block/drbd/drbd_state.c
@@ -215,6 +215,18 @@ static bool no_peer_wf_report_params(struct drbd_connection *connection)
return rv;
}

+static void wake_up_all_devices(struct drbd_connection *connection)
+{
+ struct drbd_peer_device *peer_device;
+ int vnr;
+
+ rcu_read_lock();
+ idr_for_each_entry(&connection->peer_devices, peer_device, vnr)
+ wake_up(&peer_device->device->state_wait);
+ rcu_read_unlock();
+
+}
+

/**
* cl_wide_st_chg() - true if the state change is a cluster wide one
@@ -410,6 +422,22 @@ _drbd_request_state(struct drbd_device *device, union drbd_state mask,
return rv;
}

+enum drbd_state_rv
+_drbd_request_state_holding_state_mutex(struct drbd_device *device, union drbd_state mask,
+ union drbd_state val, enum chg_state_flags f)
+{
+ enum drbd_state_rv rv;
+
+ BUG_ON(f & CS_SERIALIZE);
+
+ wait_event_cmd(device->state_wait,
+ (rv = drbd_req_state(device, mask, val, f)) != SS_IN_TRANSIENT_STATE,
+ mutex_unlock(device->state_mutex),
+ mutex_lock(device->state_mutex));
+
+ return rv;
+}
+
static void print_st(struct drbd_device *device, const char *name, union drbd_state ns)
{
drbd_err(device, " %s = { cs:%s ro:%s/%s ds:%s/%s %c%c%c%c%c%c }\n",
@@ -629,14 +657,11 @@ is_valid_soft_transition(union drbd_state os, union drbd_state ns, struct drbd_c
if (ns.conn == C_DISCONNECTING && os.conn == C_UNCONNECTED)
rv = SS_IN_TRANSIENT_STATE;

- /* if (ns.conn == os.conn && ns.conn == C_WF_REPORT_PARAMS)
- rv = SS_IN_TRANSIENT_STATE; */
-
/* While establishing a connection only allow cstate to change.
- Delay/refuse role changes, detach attach etc... */
+ Delay/refuse role changes, detach attach etc... (they do not touch cstate) */
if (test_bit(STATE_SENT, &connection->flags) &&
- !(os.conn == C_WF_REPORT_PARAMS ||
- (ns.conn == C_WF_REPORT_PARAMS && os.conn == C_WF_CONNECTION)))
+ !((ns.conn == C_WF_REPORT_PARAMS && os.conn == C_WF_CONNECTION) ||
+ (ns.conn >= C_CONNECTED && os.conn == C_WF_REPORT_PARAMS)))
rv = SS_IN_TRANSIENT_STATE;

if ((ns.conn == C_VERIFY_S || ns.conn == C_VERIFY_T) && os.conn < C_CONNECTED)
@@ -1032,8 +1057,10 @@ __drbd_set_state(struct drbd_device *device, union drbd_state ns,

/* Wake up role changes, that were delayed because of connection establishing */
if (os.conn == C_WF_REPORT_PARAMS && ns.conn != C_WF_REPORT_PARAMS &&
- no_peer_wf_report_params(connection))
+ no_peer_wf_report_params(connection)) {
clear_bit(STATE_SENT, &connection->flags);
+ wake_up_all_devices(connection);
+ }

wake_up(&device->misc_wait);
wake_up(&device->state_wait);
diff --git a/drivers/block/drbd/drbd_state.h b/drivers/block/drbd/drbd_state.h
index cc41605..7f53c40 100644
--- a/drivers/block/drbd/drbd_state.h
+++ b/drivers/block/drbd/drbd_state.h
@@ -117,6 +117,11 @@ extern enum drbd_state_rv _drbd_request_state(struct drbd_device *,
union drbd_state,
union drbd_state,
enum chg_state_flags);
+
+extern enum drbd_state_rv
+_drbd_request_state_holding_state_mutex(struct drbd_device *, union drbd_state,
+ union drbd_state, enum chg_state_flags);
+
extern enum drbd_state_rv __drbd_set_state(struct drbd_device *, union drbd_state,
enum chg_state_flags,
struct completion *done);
--
1.9.1

2014-10-29 16:39:28

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 2/7] drbd: Only use drbd_msg_put_info() in drbd_nl.c

From: Andreas Gruenbacher <[email protected]>

Avoid generic netlink calls in other parts of the code base.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_int.h | 1 -
drivers/block/drbd/drbd_main.c | 17 +++--------------
drivers/block/drbd/drbd_nl.c | 2 +-
3 files changed, 4 insertions(+), 16 deletions(-)

diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index c14b718..d2a7589 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -1454,7 +1454,6 @@ extern int is_valid_ar_handle(struct drbd_request *, sector_t);


/* drbd_nl.c */
-extern int drbd_msg_put_info(struct sk_buff *skb, const char *info);
extern void drbd_suspend_io(struct drbd_device *device);
extern void drbd_resume_io(struct drbd_device *device);
extern char *ppsize(char *buf, unsigned long long size);
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index cce25a5..1fc8342 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -2532,10 +2532,6 @@ int set_resource_options(struct drbd_resource *resource, struct res_opts *res_op

if (!zalloc_cpumask_var(&new_cpu_mask, GFP_KERNEL))
return -ENOMEM;
- /*
- retcode = ERR_NOMEM;
- drbd_msg_put_info("unable to allocate cpumask");
- */

/* silently ignore cpu mask on UP kernel */
if (nr_cpu_ids > 1 && res_opts->cpu_mask[0] != 0) {
@@ -2793,20 +2789,16 @@ enum drbd_ret_code drbd_create_device(struct drbd_config_context *adm_ctx, unsig

id = idr_alloc(&drbd_devices, device, minor, minor + 1, GFP_KERNEL);
if (id < 0) {
- if (id == -ENOSPC) {
+ if (id == -ENOSPC)
err = ERR_MINOR_OR_VOLUME_EXISTS;
- drbd_msg_put_info(adm_ctx->reply_skb, "requested minor exists already");
- }
goto out_no_minor_idr;
}
kref_get(&device->kref);

id = idr_alloc(&resource->devices, device, vnr, vnr + 1, GFP_KERNEL);
if (id < 0) {
- if (id == -ENOSPC) {
+ if (id == -ENOSPC)
err = ERR_MINOR_OR_VOLUME_EXISTS;
- drbd_msg_put_info(adm_ctx->reply_skb, "requested minor exists already");
- }
goto out_idr_remove_minor;
}
kref_get(&device->kref);
@@ -2825,10 +2817,8 @@ enum drbd_ret_code drbd_create_device(struct drbd_config_context *adm_ctx, unsig

id = idr_alloc(&connection->peer_devices, peer_device, vnr, vnr + 1, GFP_KERNEL);
if (id < 0) {
- if (id == -ENOSPC) {
+ if (id == -ENOSPC)
err = ERR_INVALID_REQUEST;
- drbd_msg_put_info(adm_ctx->reply_skb, "requested volume exists already");
- }
goto out_idr_remove_from_resource;
}
kref_get(&connection->kref);
@@ -2836,7 +2826,6 @@ enum drbd_ret_code drbd_create_device(struct drbd_config_context *adm_ctx, unsig

if (init_submitter(device)) {
err = ERR_NOMEM;
- drbd_msg_put_info(adm_ctx->reply_skb, "unable to create submit workqueue");
goto out_idr_remove_vol;
}

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index c145619..4782d07 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -92,7 +92,7 @@ static void drbd_adm_send_reply(struct sk_buff *skb, struct genl_info *info)

/* Used on a fresh "drbd_adm_prepare"d reply_skb, this cannot fail: The only
* reason it could fail was no space in skb, and there are 4k available. */
-int drbd_msg_put_info(struct sk_buff *skb, const char *info)
+static int drbd_msg_put_info(struct sk_buff *skb, const char *info)
{
struct nlattr *nla;
int err = -EMSGSIZE;
--
1.9.1

2014-10-29 16:39:25

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 6/7] drbd: Fix state change in case of connection timeout

A connection timeout affects all volumes of a resource!
Under the following conditions:

A resource with multiple volumes
AND
ko-count >=1
AND
a write request triggers the timeout (ko-count * timeout)

DRBD's internal state gets confused. That in turn may
lead to very miss leading follow up failures. E.g.
"BUG: scheduling while atomic"

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_req.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
index 90319b1..3b797cd 100644
--- a/drivers/block/drbd/drbd_req.c
+++ b/drivers/block/drbd/drbd_req.c
@@ -1629,7 +1629,7 @@ void request_timer_fn(unsigned long data)
time_after(now, req_peer->pre_send_jif + ent) &&
!time_in_range(now, connection->last_reconnect_jif, connection->last_reconnect_jif + ent)) {
drbd_warn(device, "Remote failed to finish a request within ko-count * timeout\n");
- _drbd_set_state(_NS(device, conn, C_TIMEOUT), CS_VERBOSE | CS_HARD, NULL);
+ _conn_request_state(connection, NS(conn, C_TIMEOUT), CS_VERBOSE | CS_HARD);
}
if (dt && oldest_submit_jif != now &&
time_after(now, oldest_submit_jif + dt) &&
--
1.9.1