2014-07-01 16:16:54

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 00/20] RFC DRBD fixes

Hi,

this series has a number of fixes to DRBD in random places. In this
empties the queue before the debug-fs part.

These patches are targeted at the 3.17 merge window, after review.


Andreas Gruenbacher (1):
drbd: Remove unnecessary/unused code

Joe Perches (1):
block: Convert last uses of __FUNCTION__ to __func__

Lars Ellenberg (15):
drbd: drop wrong debugging aid
drbd: silence -Wmissing-prototypes warnings
drbd: fix bogus resync stats in /proc/drbd
drbd: don't implicitly resize Diskless node beyond end of device
drbd: implement csums-after-crash-only
drbd: application writes may set-in-sync in protocol != C
drbd: short-circuit in maybe_pull_ahead
drbd: improve resync request throttling due to sendbuf size
drbd: clear CRASHED_PRIMARY only after successful resync
drbd: cosmetic: change all printk(level, ...) to pr_<level>(...)
drbd: drbd_rs_number_requests: fix unit mismatch in comparison
drbd: add drbd_queue_work_if_unqueued helper
drbd: drop drbd_md_flush
drbd: consistently use list_add_tail for peer_request tracking
drbd: also keep track of trim -> zero-out fallback peer_requests

Monam Agarwal (1):
drivers/block: Use RCU_INIT_POINTER(x, NULL) in drbd/drbd_state.c

Philipp Reisner (2):
drbd: Limit the time we are waiting for the first packet on an
accepted socket
drbd: New net configuration option socket-check-timeout

drivers/block/drbd/drbd_bitmap.c | 13 ++--
drivers/block/drbd/drbd_int.h | 93 ++++++----------------------
drivers/block/drbd/drbd_interval.h | 4 +-
drivers/block/drbd/drbd_main.c | 53 ++++++----------
drivers/block/drbd/drbd_nl.c | 6 +-
drivers/block/drbd/drbd_proc.c | 122 +++++++++++++++++++++++--------------
drivers/block/drbd/drbd_receiver.c | 92 +++++++++++++++++++++-------
drivers/block/drbd/drbd_req.c | 71 ++++++++++++---------
drivers/block/drbd/drbd_req.h | 1 +
drivers/block/drbd/drbd_state.c | 11 ++--
drivers/block/drbd/drbd_worker.c | 70 ++++++++++++++-------
include/linux/drbd_genl.h | 4 ++
include/linux/drbd_limits.h | 6 ++
lib/lru_cache.c | 2 +-
14 files changed, 301 insertions(+), 247 deletions(-)

--
1.9.1


2014-07-01 16:17:04

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 07/20] drbd: Limit the time we are waiting for the first packet on an accepted socket

Before the patch
'drbd: Keep the listening socket open while trying to connect to the peer'

the newly created socket inherited the receive timeout from the listen
socket. The listen socket had a receive timeout of connect-intervall
+- 30% random jitter.

The real issue is that after the mentioned patch we had no timeout at all.
Now use 4 times the ping-timeout.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_receiver.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index d326af6..7da83f3 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -791,8 +791,18 @@ static int receive_first_packet(struct drbd_connection *connection, struct socke
{
unsigned int header_size = drbd_header_size(connection);
struct packet_info pi;
+ struct net_conf *nc;
int err;

+ rcu_read_lock();
+ nc = rcu_dereference(connection->net_conf);
+ if (!nc) {
+ rcu_read_unlock();
+ return -EIO;
+ }
+ sock->sk->sk_rcvtimeo = nc->ping_timeo * 4 * HZ / 10;
+ rcu_read_unlock();
+
err = drbd_recv_short(sock, connection->data.rbuf, header_size, 0);
if (err != header_size) {
if (err >= 0)
--
1.9.1

2014-07-01 16:17:01

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 05/20] drbd: don't implicitly resize Diskless node beyond end of device

From: Lars Ellenberg <[email protected]>

During handshake, we compare backend sizes, and user set limits,
and agree on what device size we are going to expose.

We remember that last-agreed-size in our meta data.

But if we come up diskless, we have to accept what the peer
presents us with. We used to accept the peers maximum potential
capacity (backend size), which is wrong, and could lead to IO errors
due to access beyond end of device.

Instead, we need to accept the peer's current size.
Unless that is communicated as 0, in which case we
accept the backend size, or the user set limit, if set.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_receiver.c | 19 ++++++++++++++++---
1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index 2fcc3af..5626c5b 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -3657,7 +3657,7 @@ static int receive_sizes(struct drbd_connection *connection, struct packet_info
struct drbd_device *device;
struct p_sizes *p = pi->data;
enum determine_dev_size dd = DS_UNCHANGED;
- sector_t p_size, p_usize, my_usize;
+ sector_t p_size, p_usize, p_csize, my_usize;
int ldsc = 0; /* local disk size changed */
enum dds_flags ddsf;

@@ -3668,6 +3668,7 @@ static int receive_sizes(struct drbd_connection *connection, struct packet_info

p_size = be64_to_cpu(p->d_size);
p_usize = be64_to_cpu(p->u_size);
+ p_csize = be64_to_cpu(p->c_size);

/* just store the peer's disk size for now.
* we still need to figure out whether we accept that. */
@@ -3742,9 +3743,21 @@ static int receive_sizes(struct drbd_connection *connection, struct packet_info
return -EIO;
drbd_md_sync(device);
} else {
- /* I am diskless, need to accept the peer's size. */
+ /*
+ * I am diskless, need to accept the peer's *current* size.
+ * I must NOT accept the peers backing disk size,
+ * it may have been larger than mine all along...
+ *
+ * At this point, the peer knows more about my disk, or at
+ * least about what we last agreed upon, than myself.
+ * So if his c_size is less than his d_size, the most likely
+ * reason is that *my* d_size was smaller last time we checked.
+ *
+ * However, if he sends a zero current size,
+ * take his (user-capped or) backing disk size anyways.
+ */
drbd_reconsider_max_bio_size(device, NULL);
- drbd_set_my_capacity(device, p_size);
+ drbd_set_my_capacity(device, p_csize ?: p_usize ?: p_size);
}

if (get_ldev(device)) {
--
1.9.1

2014-07-01 16:16:59

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 02/20] drbd: silence -Wmissing-prototypes warnings

From: Lars Ellenberg <[email protected]>

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_main.c | 2 +-
drivers/block/drbd/drbd_nl.c | 2 +-
drivers/block/drbd/drbd_receiver.c | 3 +--
drivers/block/drbd/drbd_req.h | 1 +
drivers/block/drbd/drbd_state.c | 6 +++---
lib/lru_cache.c | 2 +-
6 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index ed35d52..68ec774 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -2877,7 +2877,7 @@ void drbd_delete_device(struct drbd_device *device)
kref_sub(&device->kref, refs, drbd_destroy_device);
}

-int __init drbd_init(void)
+static int __init drbd_init(void)
{
int err;

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 628167d..81949d6 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -2913,7 +2913,7 @@ static struct drbd_connection *the_only_connection(struct drbd_resource *resourc
return list_first_entry(&resource->connections, struct drbd_connection, connections);
}

-int nla_put_status_info(struct sk_buff *skb, struct drbd_device *device,
+static int nla_put_status_info(struct sk_buff *skb, struct drbd_device *device,
const struct sib_info *sib)
{
struct drbd_resource *resource = device->resource;
diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index cfc3d43..2fcc3af 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -3371,8 +3371,7 @@ disconnect:
* return: NULL (alg name was "")
* ERR_PTR(error) if something goes wrong
* or the crypto hash ptr, if it worked out ok. */
-static
-struct crypto_hash *drbd_crypto_alloc_digest_safe(const struct drbd_device *device,
+static struct crypto_hash *drbd_crypto_alloc_digest_safe(const struct drbd_device *device,
const char *alg, const char *name)
{
struct crypto_hash *tfm;
diff --git a/drivers/block/drbd/drbd_req.h b/drivers/block/drbd/drbd_req.h
index 8566cd5..9f6a040 100644
--- a/drivers/block/drbd/drbd_req.h
+++ b/drivers/block/drbd/drbd_req.h
@@ -288,6 +288,7 @@ extern void complete_master_bio(struct drbd_device *device,
extern void request_timer_fn(unsigned long data);
extern void tl_restart(struct drbd_connection *connection, enum drbd_req_event what);
extern void _tl_restart(struct drbd_connection *connection, enum drbd_req_event what);
+extern void tl_abort_disk_io(struct drbd_device *device);

/* this is in drbd_main.c */
extern void drbd_restart_request(struct drbd_request *req);
diff --git a/drivers/block/drbd/drbd_state.c b/drivers/block/drbd/drbd_state.c
index 6629f46..a7772e2 100644
--- a/drivers/block/drbd/drbd_state.c
+++ b/drivers/block/drbd/drbd_state.c
@@ -410,7 +410,7 @@ _drbd_request_state(struct drbd_device *device, union drbd_state mask,
return rv;
}

-static void print_st(struct drbd_device *device, char *name, union drbd_state ns)
+static void print_st(struct drbd_device *device, const char *name, union drbd_state ns)
{
drbd_err(device, " %s = { cs:%s ro:%s/%s ds:%s/%s %c%c%c%c%c%c }\n",
name,
@@ -1606,7 +1606,7 @@ static int w_after_conn_state_ch(struct drbd_work *w, int unused)
return 0;
}

-void conn_old_common_state(struct drbd_connection *connection, union drbd_state *pcs, enum chg_state_flags *pf)
+static void conn_old_common_state(struct drbd_connection *connection, union drbd_state *pcs, enum chg_state_flags *pf)
{
enum chg_state_flags flags = ~0;
struct drbd_peer_device *peer_device;
@@ -1695,7 +1695,7 @@ conn_is_valid_transition(struct drbd_connection *connection, union drbd_state ma
return rv;
}

-void
+static void
conn_set_state(struct drbd_connection *connection, union drbd_state mask, union drbd_state val,
union drbd_state *pns_min, union drbd_state *pns_max, enum chg_state_flags flags)
{
diff --git a/lib/lru_cache.c b/lib/lru_cache.c
index 4a83ecd..6111cd1 100644
--- a/lib/lru_cache.c
+++ b/lib/lru_cache.c
@@ -169,7 +169,7 @@ out_fail:
return NULL;
}

-void lc_free_by_index(struct lru_cache *lc, unsigned i)
+static void lc_free_by_index(struct lru_cache *lc, unsigned i)
{
void *p = lc->lc_element[i];
WARN_ON(!p);
--
1.9.1

2014-07-01 16:16:57

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 01/20] drbd: drop wrong debugging aid

From: Lars Ellenberg <[email protected]>

The textual representation of resync extents in /proc/drbd presented
with proc_details >= 3 was wrong, it used bitnumbers as bitmasks.

It was not particularly useful either, and I doubt anyone has even tried
to look at it in the last few years. Drop it.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_proc.c | 17 -----------------
1 file changed, 17 deletions(-)

diff --git a/drivers/block/drbd/drbd_proc.c b/drivers/block/drbd/drbd_proc.c
index f11e573..46bb8dd 100644
--- a/drivers/block/drbd/drbd_proc.c
+++ b/drivers/block/drbd/drbd_proc.c
@@ -188,16 +188,6 @@ static void drbd_syncer_progress(struct drbd_device *device, struct seq_file *se
}
}

-static void resync_dump_detail(struct seq_file *seq, struct lc_element *e)
-{
- struct bm_extent *bme = lc_entry(e, struct bm_extent, lce);
-
- seq_printf(seq, "%5d %s %s\n", bme->rs_left,
- bme->flags & BME_NO_WRITES ? "NO_WRITES" : "---------",
- bme->flags & BME_LOCKED ? "LOCKED" : "------"
- );
-}
-
static int drbd_seq_show(struct seq_file *seq, void *v)
{
int i, prev_i = -1;
@@ -298,13 +288,6 @@ static int drbd_seq_show(struct seq_file *seq, void *v)
lc_seq_printf_stats(seq, device->act_log);
put_ldev(device);
}
-
- if (proc_details >= 2) {
- if (device->resync) {
- lc_seq_dump_details(seq, device->resync, "rs_left",
- resync_dump_detail);
- }
- }
}
rcu_read_unlock();

--
1.9.1

2014-07-01 16:18:09

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 18/20] drbd: drop drbd_md_flush

From: Lars Ellenberg <[email protected]>

The only user of drbd_md_flush was bm_rw(),
and it is always followed by either a drbd_md_sync(),
or an al_write_transaction(), which, if so configured,
both end up submiting a FLUSH|FUA request anyways.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_bitmap.c | 4 +---
drivers/block/drbd/drbd_int.h | 19 -------------------
2 files changed, 1 insertion(+), 22 deletions(-)

diff --git a/drivers/block/drbd/drbd_bitmap.c b/drivers/block/drbd/drbd_bitmap.c
index 7e6f924..7303433 100644
--- a/drivers/block/drbd/drbd_bitmap.c
+++ b/drivers/block/drbd/drbd_bitmap.c
@@ -1153,9 +1153,7 @@ static int bm_rw(struct drbd_device *device, int rw, unsigned flags, unsigned la
err = -EIO; /* Disk timeout/force-detach during IO... */

now = jiffies;
- if (rw == WRITE) {
- drbd_md_flush(device);
- } else /* rw == READ */ {
+ if (rw == READ) {
b->bm_set = bm_count_bits(b);
drbd_info(device, "recounting of set bits took additional %lu jiffies\n",
jiffies - now);
diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index a71f8bb..d85f43c 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -2182,25 +2182,6 @@ static inline int drbd_queue_order_type(struct drbd_device *device)
return QUEUE_ORDERED_NONE;
}

-static inline void drbd_md_flush(struct drbd_device *device)
-{
- int r;
-
- if (device->ldev == NULL) {
- drbd_warn(device, "device->ldev == NULL in drbd_md_flush\n");
- return;
- }
-
- if (test_bit(MD_NO_FUA, &device->flags))
- return;
-
- r = blkdev_issue_flush(device->ldev->md_bdev, GFP_NOIO, NULL);
- if (r) {
- set_bit(MD_NO_FUA, &device->flags);
- drbd_err(device, "meta data flush failed with status %d, disabling md-flushes\n", r);
- }
-}
-
static inline struct drbd_connection *first_connection(struct drbd_resource *resource)
{
return list_first_entry_or_null(&resource->connections,
--
1.9.1

2014-07-01 16:18:06

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 16/20] drbd: drbd_rs_number_requests: fix unit mismatch in comparison

From: Lars Ellenberg <[email protected]>

We try to limit the number of "in-flight" resync requests.
One condition for that is the amount of requested data should not exceed
half of what can be covered by our "max-buffers" setting.

However we compared number of 4k pages with number of in-flight 512 Byte
sectors, and this extra throttle triggered much earlier than intended.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_worker.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/block/drbd/drbd_worker.c b/drivers/block/drbd/drbd_worker.c
index 0b5e429..9f0acb0 100644
--- a/drivers/block/drbd/drbd_worker.c
+++ b/drivers/block/drbd/drbd_worker.c
@@ -504,9 +504,9 @@ struct fifo_buffer *fifo_alloc(int fifo_size)
static int drbd_rs_controller(struct drbd_device *device, unsigned int sect_in)
{
struct disk_conf *dc;
- unsigned int want; /* The number of sectors we want in the proxy */
+ unsigned int want; /* The number of sectors we want in-flight */
int req_sect; /* Number of sectors to request in this turn */
- int correction; /* Number of sectors more we need in the proxy*/
+ int correction; /* Number of sectors more we need in-flight */
int cps; /* correction per invocation of drbd_rs_controller() */
int steps; /* Number of time steps to plan ahead */
int curr_corr;
@@ -577,8 +577,13 @@ static int drbd_rs_number_requests(struct drbd_device *device)
* potentially causing a distributed deadlock on congestion during
* online-verify or (checksum-based) resync, if max-buffers,
* socket buffer sizes and resync rate settings are mis-configured. */
- if (mxb - device->rs_in_flight < number)
- number = mxb - device->rs_in_flight;
+
+ /* note that "number" is in units of "BM_BLOCK_SIZE" (which is 4k),
+ * mxb (as used here, and in drbd_alloc_pages on the peer) is
+ * "number of pages" (typically also 4k),
+ * but "rs_in_flight" is in "sectors" (512 Byte). */
+ if (mxb - device->rs_in_flight/8 < number)
+ number = mxb - device->rs_in_flight/8;

return number;
}
--
1.9.1

2014-07-01 16:18:04

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 19/20] drbd: consistently use list_add_tail for peer_request tracking

From: Lars Ellenberg <[email protected]>

Keep the epoch entry lists (active_ee, read_ee, sync_ee, ...)
consistently "oldest first". That way finding the oldest not yet
successfully processed request is simply list_first_entry_or_null.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_receiver.c | 4 ++--
drivers/block/drbd/drbd_worker.c | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index 3a3c489..db5c580 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -1777,7 +1777,7 @@ static int recv_resync_read(struct drbd_peer_device *peer_device, sector_t secto
peer_req->w.cb = e_end_resync_block;

spin_lock_irq(&device->resource->req_lock);
- list_add(&peer_req->w.list, &device->sync_ee);
+ list_add_tail(&peer_req->w.list, &device->sync_ee);
spin_unlock_irq(&device->resource->req_lock);

atomic_add(pi->size >> 9, &device->rs_sect_ev);
@@ -2341,7 +2341,7 @@ static int receive_Data(struct drbd_connection *connection, struct packet_info *
* active_ee to become empty in drbd_submit_peer_request();
* better not add ourselves here. */
if ((peer_req->flags & EE_IS_TRIM_USE_ZEROOUT) == 0)
- list_add(&peer_req->w.list, &device->active_ee);
+ list_add_tail(&peer_req->w.list, &device->active_ee);
spin_unlock_irq(&device->resource->req_lock);

if (device->state.conn == C_SYNC_TARGET)
diff --git a/drivers/block/drbd/drbd_worker.c b/drivers/block/drbd/drbd_worker.c
index 49b8873..ad57129 100644
--- a/drivers/block/drbd/drbd_worker.c
+++ b/drivers/block/drbd/drbd_worker.c
@@ -410,7 +410,7 @@ static int read_for_csum(struct drbd_peer_device *peer_device, sector_t sector,

peer_req->w.cb = w_e_send_csum;
spin_lock_irq(&device->resource->req_lock);
- list_add(&peer_req->w.list, &device->read_ee);
+ list_add_tail(&peer_req->w.list, &device->read_ee);
spin_unlock_irq(&device->resource->req_lock);

atomic_add(size >> 9, &device->rs_sect_ev);
--
1.9.1

2014-07-01 16:18:02

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 17/20] drbd: add drbd_queue_work_if_unqueued helper

From: Lars Ellenberg <[email protected]>

We sometimes do
if (list_empty(&w.list))
drbd_queue_work(&q, &w.list);

Removal (list_del_init) may happen outside all locks, after all
pending work entries have been moved to an on-stack local work list.

For not dynamically allocated, but embeded, work structs,
we must avoid to re-add until it really was removed.

Move that list_empty check inside the spin_lock(&q->q_lock)
within the helper function, and change to list_empty_careful().

This may have been the reason for a list_add corruption
inside drbd_queue_work().

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_int.h | 11 +++++++++++
drivers/block/drbd/drbd_worker.c | 8 ++++----
2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index c88d6c6..a71f8bb 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -1778,6 +1778,17 @@ drbd_queue_work(struct drbd_work_queue *q, struct drbd_work *w)
}

static inline void
+drbd_queue_work_if_unqueued(struct drbd_work_queue *q, struct drbd_work *w)
+{
+ unsigned long flags;
+ spin_lock_irqsave(&q->q_lock, flags);
+ if (list_empty_careful(&w->list))
+ list_add_tail(&w->list, &q->q);
+ spin_unlock_irqrestore(&q->q_lock, flags);
+ wake_up(&q->q_wait);
+}
+
+static inline void
drbd_device_post_work(struct drbd_device *device, int work_bit)
{
if (!test_and_set_bit(work_bit, &device->flags)) {
diff --git a/drivers/block/drbd/drbd_worker.c b/drivers/block/drbd/drbd_worker.c
index 9f0acb0..49b8873 100644
--- a/drivers/block/drbd/drbd_worker.c
+++ b/drivers/block/drbd/drbd_worker.c
@@ -452,9 +452,9 @@ void resync_timer_fn(unsigned long data)
{
struct drbd_device *device = (struct drbd_device *) data;

- if (list_empty(&device->resync_work.list))
- drbd_queue_work(&first_peer_device(device)->connection->sender_work,
- &device->resync_work);
+ drbd_queue_work_if_unqueued(
+ &first_peer_device(device)->connection->sender_work,
+ &device->resync_work);
}

static void fifo_set(struct fifo_buffer *fb, int value)
@@ -1968,7 +1968,7 @@ static void do_unqueued_work(struct drbd_connection *connection)
static bool dequeue_work_batch(struct drbd_work_queue *queue, struct list_head *work_list)
{
spin_lock_irq(&queue->q_lock);
- list_splice_init(&queue->q, work_list);
+ list_splice_tail_init(&queue->q, work_list);
spin_unlock_irq(&queue->q_lock);
return !list_empty(work_list);
}
--
1.9.1

2014-07-01 16:18:00

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 20/20] drbd: also keep track of trim -> zero-out fallback peer_requests

From: Lars Ellenberg <[email protected]>

To be able to find and present such zero-out fallback peer_requests
in debugfs, we add those to "active_ee", once that list drained.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_receiver.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index db5c580..7a1078d 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -1370,6 +1370,11 @@ int drbd_submit_peer_request(struct drbd_device *device,
/* wait for all pending IO completions, before we start
* zeroing things out. */
conn_wait_active_ee_empty(first_peer_device(device)->connection);
+ /* add it to the active list now,
+ * so we can find it to present it in debugfs */
+ spin_lock_irq(&device->resource->req_lock);
+ list_add_tail(&peer_req->w.list, &device->active_ee);
+ spin_unlock_irq(&device->resource->req_lock);
if (blkdev_issue_zeroout(device->ldev->backing_bdev,
sector, ds >> 9, GFP_NOIO))
peer_req->flags |= EE_WAS_ERROR;
--
1.9.1

2014-07-01 16:19:30

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 14/20] drbd: clear CRASHED_PRIMARY only after successful resync

From: Lars Ellenberg <[email protected]>

If we lost a disk during the first resync after primary crash,
we could have prematurely cleared the CRASHED_PRIMARY flag.
Testing on C_CONNECTED is not what we meant there,
but testing for both peers to become D_UP_TO_DATE.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_state.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/block/drbd/drbd_state.c b/drivers/block/drbd/drbd_state.c
index 4a75b7a..c35c0f0 100644
--- a/drivers/block/drbd/drbd_state.c
+++ b/drivers/block/drbd/drbd_state.c
@@ -1238,7 +1238,8 @@ static void after_state_ch(struct drbd_device *device, union drbd_state os,
sib.os = os;
sib.ns = ns;

- if (os.conn != C_CONNECTED && ns.conn == C_CONNECTED) {
+ if ((os.disk != D_UP_TO_DATE || os.pdsk != D_UP_TO_DATE)
+ && (ns.disk == D_UP_TO_DATE && ns.pdsk == D_UP_TO_DATE)) {
clear_bit(CRASHED_PRIMARY, &device->flags);
if (device->p_uuid)
device->p_uuid[UI_FLAGS] &= ~((u64)2);
--
1.9.1

2014-07-01 16:19:33

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 12/20] block: Convert last uses of __FUNCTION__ to __func__

From: Joe Perches <[email protected]>

Just about all of these have been converted to __func__,
so convert the last uses.

Signed-off-by: Joe Perches <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_int.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index fe6595a..b0ab358 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -1867,7 +1867,7 @@ static inline void inc_ap_pending(struct drbd_device *device)
func, line, \
atomic_read(&device->which))

-#define dec_ap_pending(device) _dec_ap_pending(device, __FUNCTION__, __LINE__)
+#define dec_ap_pending(device) _dec_ap_pending(device, __func__, __LINE__)
static inline void _dec_ap_pending(struct drbd_device *device, const char *func, int line)
{
if (atomic_dec_and_test(&device->ap_pending_cnt))
@@ -1886,7 +1886,7 @@ static inline void inc_rs_pending(struct drbd_device *device)
atomic_inc(&device->rs_pending_cnt);
}

-#define dec_rs_pending(device) _dec_rs_pending(device, __FUNCTION__, __LINE__)
+#define dec_rs_pending(device) _dec_rs_pending(device, __func__, __LINE__)
static inline void _dec_rs_pending(struct drbd_device *device, const char *func, int line)
{
atomic_dec(&device->rs_pending_cnt);
@@ -1907,14 +1907,14 @@ static inline void inc_unacked(struct drbd_device *device)
atomic_inc(&device->unacked_cnt);
}

-#define dec_unacked(device) _dec_unacked(device, __FUNCTION__, __LINE__)
+#define dec_unacked(device) _dec_unacked(device, __func__, __LINE__)
static inline void _dec_unacked(struct drbd_device *device, const char *func, int line)
{
atomic_dec(&device->unacked_cnt);
ERR_IF_CNT_IS_NEGATIVE(unacked_cnt, func, line);
}

-#define sub_unacked(device, n) _sub_unacked(device, n, __FUNCTION__, __LINE__)
+#define sub_unacked(device, n) _sub_unacked(device, n, __func__, __LINE__)
static inline void _sub_unacked(struct drbd_device *device, int n, const char *func, int line)
{
atomic_sub(n, &device->unacked_cnt);
--
1.9.1

2014-07-01 16:19:31

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 13/20] drbd: improve resync request throttling due to sendbuf size

From: Lars Ellenberg <[email protected]>

If we throttle resync because the socket sendbuffer is filling up,
tell TCP about it, so it may expand the sendbuffer for us.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_worker.c | 23 ++++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/drivers/block/drbd/drbd_worker.c b/drivers/block/drbd/drbd_worker.c
index 6532a69..0b5e429 100644
--- a/drivers/block/drbd/drbd_worker.c
+++ b/drivers/block/drbd/drbd_worker.c
@@ -592,7 +592,7 @@ static int make_resync_request(struct drbd_device *const device, int cancel)
const sector_t capacity = drbd_get_capacity(device->this_bdev);
int max_bio_size;
int number, rollback_i, size;
- int align, queued, sndbuf;
+ int align, requeue = 0;
int i = 0;

if (unlikely(cancel))
@@ -619,17 +619,22 @@ static int make_resync_request(struct drbd_device *const device, int cancel)
goto requeue;

for (i = 0; i < number; i++) {
- /* Stop generating RS requests, when half of the send buffer is filled */
+ /* Stop generating RS requests when half of the send buffer is filled,
+ * but notify TCP that we'd like to have more space. */
mutex_lock(&connection->data.mutex);
if (connection->data.socket) {
- queued = connection->data.socket->sk->sk_wmem_queued;
- sndbuf = connection->data.socket->sk->sk_sndbuf;
- } else {
- queued = 1;
- sndbuf = 0;
- }
+ struct sock *sk = connection->data.socket->sk;
+ int queued = sk->sk_wmem_queued;
+ int sndbuf = sk->sk_sndbuf;
+ if (queued > sndbuf / 2) {
+ requeue = 1;
+ if (sk->sk_socket)
+ set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
+ }
+ } else
+ requeue = 1;
mutex_unlock(&connection->data.mutex);
- if (queued > sndbuf / 2)
+ if (requeue)
goto requeue;

next_sector:
--
1.9.1

2014-07-01 16:19:29

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 15/20] drbd: cosmetic: change all printk(level, ...) to pr_<level>(...)

From: Lars Ellenberg <[email protected]>

Cosmetic change only.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_bitmap.c | 9 ++++-----
drivers/block/drbd/drbd_int.h | 4 +---
drivers/block/drbd/drbd_main.c | 28 +++++++++++++---------------
drivers/block/drbd/drbd_nl.c | 4 +++-
4 files changed, 21 insertions(+), 24 deletions(-)

diff --git a/drivers/block/drbd/drbd_bitmap.c b/drivers/block/drbd/drbd_bitmap.c
index 3da6730..7e6f924 100644
--- a/drivers/block/drbd/drbd_bitmap.c
+++ b/drivers/block/drbd/drbd_bitmap.c
@@ -353,9 +353,8 @@ static void bm_free_pages(struct page **pages, unsigned long number)

for (i = 0; i < number; i++) {
if (!pages[i]) {
- printk(KERN_ALERT "drbd: bm_free_pages tried to free "
- "a NULL pointer; i=%lu n=%lu\n",
- i, number);
+ pr_alert("bm_free_pages tried to free a NULL pointer; i=%lu n=%lu\n",
+ i, number);
continue;
}
__free_page(pages[i]);
@@ -592,7 +591,7 @@ static void bm_memset(struct drbd_bitmap *b, size_t offset, int c, size_t len)
end = offset + len;

if (end > b->bm_words) {
- printk(KERN_ALERT "drbd: bm_memset end > bm_words\n");
+ pr_alert("bm_memset end > bm_words\n");
return;
}

@@ -602,7 +601,7 @@ static void bm_memset(struct drbd_bitmap *b, size_t offset, int c, size_t len)
p_addr = bm_map_pidx(b, idx);
bm = p_addr + MLPP(offset);
if (bm+do_now > p_addr + LWPP) {
- printk(KERN_ALERT "drbd: BUG BUG BUG! p_addr:%p bm:%p do_now:%d\n",
+ pr_alert("BUG BUG BUG! p_addr:%p bm:%p do_now:%d\n",
p_addr, bm, (int)do_now);
} else
memset(bm, c, do_now * sizeof(long));
diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index b0ab358..c88d6c6 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -1467,9 +1467,7 @@ static inline void drbd_generic_make_request(struct drbd_device *device,
{
__release(local);
if (!bio->bi_bdev) {
- printk(KERN_ERR "drbd%d: drbd_generic_make_request: "
- "bio->bi_bdev == NULL\n",
- device_to_minor(device));
+ drbd_err(device, "drbd_generic_make_request: bio->bi_bdev == NULL\n");
bio_endio(bio, -ENODEV);
return;
}
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 7ada5d3..ce387bb 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -26,6 +26,8 @@

*/

+#define pr_fmt(fmt) "drbd: " fmt
+
#include <linux/module.h>
#include <linux/drbd.h>
#include <asm/uaccess.h>
@@ -2333,7 +2335,7 @@ static void drbd_cleanup(void)

idr_destroy(&drbd_devices);

- printk(KERN_INFO "drbd: module cleanup done.\n");
+ pr_info("module cleanup done.\n");
}

/**
@@ -2869,8 +2871,7 @@ static int __init drbd_init(void)
int err;

if (minor_count < DRBD_MINOR_COUNT_MIN || minor_count > DRBD_MINOR_COUNT_MAX) {
- printk(KERN_ERR
- "drbd: invalid minor_count (%d)\n", minor_count);
+ pr_err("invalid minor_count (%d)\n", minor_count);
#ifdef MODULE
return -EINVAL;
#else
@@ -2880,8 +2881,7 @@ static int __init drbd_init(void)

err = register_blkdev(DRBD_MAJOR, "drbd");
if (err) {
- printk(KERN_ERR
- "drbd: unable to register block device major %d\n",
+ pr_err("unable to register block device major %d\n",
DRBD_MAJOR);
return err;
}
@@ -2899,7 +2899,7 @@ static int __init drbd_init(void)

err = drbd_genl_register();
if (err) {
- printk(KERN_ERR "drbd: unable to register generic netlink family\n");
+ pr_err("unable to register generic netlink family\n");
goto fail;
}

@@ -2910,34 +2910,32 @@ static int __init drbd_init(void)
err = -ENOMEM;
drbd_proc = proc_create_data("drbd", S_IFREG | S_IRUGO , NULL, &drbd_proc_fops, NULL);
if (!drbd_proc) {
- printk(KERN_ERR "drbd: unable to register proc file\n");
+ pr_err("unable to register proc file\n");
goto fail;
}

retry.wq = create_singlethread_workqueue("drbd-reissue");
if (!retry.wq) {
- printk(KERN_ERR "drbd: unable to create retry workqueue\n");
+ pr_err("unable to create retry workqueue\n");
goto fail;
}
INIT_WORK(&retry.worker, do_retry);
spin_lock_init(&retry.lock);
INIT_LIST_HEAD(&retry.writes);

- printk(KERN_INFO "drbd: initialized. "
+ pr_info("initialized. "
"Version: " REL_VERSION " (api:%d/proto:%d-%d)\n",
API_VERSION, PRO_VERSION_MIN, PRO_VERSION_MAX);
- printk(KERN_INFO "drbd: %s\n", drbd_buildtag());
- printk(KERN_INFO "drbd: registered as block device major %d\n",
- DRBD_MAJOR);
-
+ pr_info("%s\n", drbd_buildtag());
+ pr_info("registered as block device major %d\n", DRBD_MAJOR);
return 0; /* Success! */

fail:
drbd_cleanup();
if (err == -ENOMEM)
- printk(KERN_ERR "drbd: ran out of memory\n");
+ pr_err("ran out of memory\n");
else
- printk(KERN_ERR "drbd: initialization failure\n");
+ pr_err("initialization failure\n");
return err;
}

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 81949d6..e2f535f 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -23,6 +23,8 @@

*/

+#define pr_fmt(fmt) "drbd: " fmt
+
#include <linux/module.h>
#include <linux/drbd.h>
#include <linux/in.h>
@@ -85,7 +87,7 @@ static void drbd_adm_send_reply(struct sk_buff *skb, struct genl_info *info)
{
genlmsg_end(skb, genlmsg_data(nlmsg_data(nlmsg_hdr(skb))));
if (genlmsg_reply(skb, info))
- printk(KERN_ERR "drbd: error sending genl reply\n");
+ pr_err("error sending genl reply\n");
}

/* Used on a fresh "drbd_adm_prepare"d reply_skb, this cannot fail: The only
--
1.9.1

2014-07-01 16:20:54

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 09/20] drbd: application writes may set-in-sync in protocol != C

From: Lars Ellenberg <[email protected]>

If "dirty" blocks are written to during resync,
that brings them in-sync.

By explicitly requesting write-acks during resync even in protocol != C,
we now can actually respect this.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_interval.h | 4 ++-
drivers/block/drbd/drbd_main.c | 5 ++-
drivers/block/drbd/drbd_receiver.c | 3 ++
drivers/block/drbd/drbd_req.c | 68 ++++++++++++++++++++++----------------
4 files changed, 49 insertions(+), 31 deletions(-)

diff --git a/drivers/block/drbd/drbd_interval.h b/drivers/block/drbd/drbd_interval.h
index f38fcb0..f210543 100644
--- a/drivers/block/drbd/drbd_interval.h
+++ b/drivers/block/drbd/drbd_interval.h
@@ -10,7 +10,9 @@ struct drbd_interval {
unsigned int size; /* size in bytes */
sector_t end; /* highest interval end in subtree */
int local:1 /* local or remote request? */;
- int waiting:1;
+ int waiting:1; /* someone is waiting for this to complete */
+ int completed:1; /* this has been completed already;
+ * ignore for conflict detection */
};

static inline void drbd_clear_interval(struct drbd_interval *i)
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 7c06024..7ada5d3 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -1639,7 +1639,10 @@ int drbd_send_dblock(struct drbd_peer_device *peer_device, struct drbd_request *
if (peer_device->connection->agreed_pro_version >= 100) {
if (req->rq_state & RQ_EXP_RECEIVE_ACK)
dp_flags |= DP_SEND_RECEIVE_ACK;
- if (req->rq_state & RQ_EXP_WRITE_ACK)
+ /* During resync, request an explicit write ack,
+ * even in protocol != C */
+ if (req->rq_state & RQ_EXP_WRITE_ACK
+ || (dp_flags & DP_MAY_SET_IN_SYNC))
dp_flags |= DP_SEND_WRITE_ACK;
}
p->dp_flags = cpu_to_be32(dp_flags);
diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index b89e6fb..3a3c489 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -1930,6 +1930,7 @@ static int e_end_block(struct drbd_work *w, int cancel)
}
dec_unacked(device);
}
+
/* we delete from the conflict detection hash _after_ we sent out the
* P_WRITE_ACK / P_NEG_ACK, to get the sequence number right. */
if (peer_req->flags & EE_IN_INTERVAL_TREE) {
@@ -2156,6 +2157,8 @@ static int handle_write_conflicts(struct drbd_device *device,
drbd_for_each_overlap(i, &device->write_requests, sector, size) {
if (i == &peer_req->i)
continue;
+ if (i->completed)
+ continue;

if (!i->local) {
/*
diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
index 1ee7355..f07a724 100644
--- a/drivers/block/drbd/drbd_req.c
+++ b/drivers/block/drbd/drbd_req.c
@@ -92,6 +92,19 @@ static struct drbd_request *drbd_req_new(struct drbd_device *device,
return req;
}

+static void drbd_remove_request_interval(struct rb_root *root,
+ struct drbd_request *req)
+{
+ struct drbd_device *device = req->device;
+ struct drbd_interval *i = &req->i;
+
+ drbd_remove_interval(root, i);
+
+ /* Wake up any processes waiting for this request to complete. */
+ if (i->waiting)
+ wake_up(&device->misc_wait);
+}
+
void drbd_req_destroy(struct kref *kref)
{
struct drbd_request *req = container_of(kref, struct drbd_request, kref);
@@ -115,6 +128,20 @@ void drbd_req_destroy(struct kref *kref)
* here unconditionally */
list_del_init(&req->tl_requests);

+ /* finally remove the request from the conflict detection
+ * respective block_id verification interval tree. */
+ if (!drbd_interval_empty(&req->i)) {
+ struct rb_root *root;
+
+ if (s & RQ_WRITE)
+ root = &device->write_requests;
+ else
+ root = &device->read_requests;
+ drbd_remove_request_interval(root, req);
+ } else if (s & (RQ_NET_MASK & ~RQ_NET_DONE) && req->i.size != 0)
+ drbd_err(device, "drbd_req_destroy: Logic BUG: interval empty, but: rq_state=0x%x, sect=%llu, size=%u\n",
+ s, (unsigned long long)req->i.sector, req->i.size);
+
/* if it was a write, we may have to set the corresponding
* bit(s) out-of-sync first. If it had a local part, we need to
* release the reference to the activity log. */
@@ -188,19 +215,6 @@ void complete_master_bio(struct drbd_device *device,
}


-static void drbd_remove_request_interval(struct rb_root *root,
- struct drbd_request *req)
-{
- struct drbd_device *device = req->device;
- struct drbd_interval *i = &req->i;
-
- drbd_remove_interval(root, i);
-
- /* Wake up any processes waiting for this request to complete. */
- if (i->waiting)
- wake_up(&device->misc_wait);
-}
-
/* Helper for __req_mod().
* Set m->bio to the master bio, if it is fit to be completed,
* or leave it alone (it is initialized to NULL in __req_mod),
@@ -254,18 +268,6 @@ void drbd_req_complete(struct drbd_request *req, struct bio_and_error *m)
ok = (s & RQ_LOCAL_OK) || (s & RQ_NET_OK);
error = PTR_ERR(req->private_bio);

- /* remove the request from the conflict detection
- * respective block_id verification hash */
- if (!drbd_interval_empty(&req->i)) {
- struct rb_root *root;
-
- if (rw == WRITE)
- root = &device->write_requests;
- else
- root = &device->read_requests;
- drbd_remove_request_interval(root, req);
- }
-
/* Before we can signal completion to the upper layers,
* we may need to close the current transfer log epoch.
* We are within the request lock, so we can simply compare
@@ -301,7 +303,15 @@ void drbd_req_complete(struct drbd_request *req, struct bio_and_error *m)
m->error = ok ? 0 : (error ?: -EIO);
m->bio = req->master_bio;
req->master_bio = NULL;
+ /* We leave it in the tree, to be able to verify later
+ * write-acks in protocol != C during resync.
+ * But we mark it as "complete", so it won't be counted as
+ * conflict in a multi-primary setup. */
+ req->i.completed = true;
}
+
+ if (req->i.waiting)
+ wake_up(&device->misc_wait);
}

static int drbd_req_put_completion_ref(struct drbd_request *req, struct bio_and_error *m, int put)
@@ -660,12 +670,13 @@ int __req_mod(struct drbd_request *req, enum drbd_req_event what,
case WRITE_ACKED_BY_PEER_AND_SIS:
req->rq_state |= RQ_NET_SIS;
case WRITE_ACKED_BY_PEER:
- D_ASSERT(device, req->rq_state & RQ_EXP_WRITE_ACK);
- /* protocol C; successfully written on peer.
+ /* Normal operation protocol C: successfully written on peer.
+ * During resync, even in protocol != C,
+ * we requested an explicit write ack anyways.
+ * Which means we cannot even assert anything here.
* Nothing more to do here.
* We want to keep the tl in place for all protocols, to cater
* for volatile write-back caches on lower level devices. */
-
goto ack_common;
case RECV_ACKED_BY_PEER:
D_ASSERT(device, req->rq_state & RQ_EXP_RECEIVE_ACK);
@@ -673,7 +684,6 @@ int __req_mod(struct drbd_request *req, enum drbd_req_event what,
* see also notes above in HANDED_OVER_TO_NETWORK about
* protocol != C */
ack_common:
- D_ASSERT(device, req->rq_state & RQ_NET_PENDING);
mod_rq_state(req, m, RQ_NET_PENDING, RQ_NET_OK);
break;

--
1.9.1

2014-07-01 16:20:53

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 11/20] drivers/block: Use RCU_INIT_POINTER(x, NULL) in drbd/drbd_state.c

From: Monam Agarwal <[email protected]>

This patch replaces rcu_assign_pointer(x, NULL) with RCU_INIT_POINTER(x, NULL)

The rcu_assign_pointer() ensures that the initialization of a structure
is carried out before storing a pointer to that structure.
And in the case of the NULL pointer, there is no structure to initialize.
So, rcu_assign_pointer(p, NULL) can be safely converted to RCU_INIT_POINTER(p, NULL)

Signed-off-by: Monam Agarwal <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_state.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/drbd/drbd_state.c b/drivers/block/drbd/drbd_state.c
index a7772e2..4a75b7a 100644
--- a/drivers/block/drbd/drbd_state.c
+++ b/drivers/block/drbd/drbd_state.c
@@ -1570,7 +1570,7 @@ static int w_after_conn_state_ch(struct drbd_work *w, int unused)
old_conf = connection->net_conf;
connection->my_addr_len = 0;
connection->peer_addr_len = 0;
- rcu_assign_pointer(connection->net_conf, NULL);
+ RCU_INIT_POINTER(connection->net_conf, NULL);
conn_free_crypto(connection);
mutex_unlock(&connection->resource->conf_update);

--
1.9.1

2014-07-01 16:21:33

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 06/20] drbd: implement csums-after-crash-only

From: Lars Ellenberg <[email protected]>

Checksum based resync trades CPU cycles for network bandwidth,
in situations where we expect much of the to-be-resynced blocks
to be actually identical on both sides already.

In a "network hickup" scenario, it won't help:
all to-be-resynced blocks will typically be different.

The use case is for the resync of *potentially* different blocks
after crash recovery -- the crash recovery had marked larger areas
(those covered by the activity log) as need-to-be-resynced,
just in case. Most of those blocks will be identical.

This option makes it possible to configure checksum based resync,
but only actually use it for the first resync after primary crash.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_int.h | 2 ++
drivers/block/drbd/drbd_receiver.c | 2 ++
drivers/block/drbd/drbd_worker.c | 24 ++++++++++++++++++++----
include/linux/drbd_genl.h | 3 +++
include/linux/drbd_limits.h | 1 +
5 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index abf5aef..fe6595a 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -738,6 +738,8 @@ struct drbd_device {
struct rb_root read_requests;
struct rb_root write_requests;

+ /* use checksums for *this* resync */
+ bool use_csums;
/* blocks to resync in this run [unit BM_BLOCK_SIZE] */
unsigned long rs_total;
/* number of resync blocks that failed in this run */
diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index 5626c5b..d326af6 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -2555,6 +2555,8 @@ static int receive_DataRequest(struct drbd_connection *connection, struct packet
peer_req->w.cb = w_e_end_csum_rs_req;
/* used in the sector offset progress display */
device->bm_resync_fo = BM_SECT_TO_BIT(sector);
+ /* remember to report stats in drbd_resync_finished */
+ device->use_csums = true;
} else if (pi->cmd == P_OV_REPLY) {
/* track progress, we may need to throttle */
atomic_add(size >> 9, &device->rs_sect_in);
diff --git a/drivers/block/drbd/drbd_worker.c b/drivers/block/drbd/drbd_worker.c
index 2ff5fd4..6532a69 100644
--- a/drivers/block/drbd/drbd_worker.c
+++ b/drivers/block/drbd/drbd_worker.c
@@ -698,8 +698,8 @@ next_sector:
/* adjust very last sectors, in case we are oddly sized */
if (sector + (size>>9) > capacity)
size = (capacity-sector)<<9;
- if (connection->agreed_pro_version >= 89 &&
- connection->csums_tfm) {
+
+ if (device->use_csums) {
switch (read_for_csum(peer_device, sector, size)) {
case -EIO: /* Disk failure */
put_ldev(device);
@@ -913,7 +913,7 @@ int drbd_resync_finished(struct drbd_device *device)
if (os.conn == C_SYNC_TARGET || os.conn == C_PAUSED_SYNC_T)
khelper_cmd = "after-resync-target";

- if (first_peer_device(device)->connection->csums_tfm && device->rs_total) {
+ if (device->use_csums && device->rs_total) {
const unsigned long s = device->rs_same_csum;
const unsigned long t = device->rs_total;
const int ratio =
@@ -1622,6 +1622,18 @@ static void do_start_resync(struct drbd_device *device)
clear_bit(AHEAD_TO_SYNC_SOURCE, &device->flags);
}

+static bool use_checksum_based_resync(struct drbd_connection *connection, struct drbd_device *device)
+{
+ bool csums_after_crash_only;
+ rcu_read_lock();
+ csums_after_crash_only = rcu_dereference(connection->net_conf)->csums_after_crash_only;
+ rcu_read_unlock();
+ return connection->agreed_pro_version >= 89 && /* supported? */
+ connection->csums_tfm && /* configured? */
+ (csums_after_crash_only == 0 /* use for each resync? */
+ || test_bit(CRASHED_PRIMARY, &device->flags)); /* or only after Primary crash? */
+}
+
/**
* drbd_start_resync() - Start the resync process
* @device: DRBD device.
@@ -1756,8 +1768,12 @@ void drbd_start_resync(struct drbd_device *device, enum drbd_conns side)
drbd_conn_str(ns.conn),
(unsigned long) device->rs_total << (BM_BLOCK_SHIFT-10),
(unsigned long) device->rs_total);
- if (side == C_SYNC_TARGET)
+ if (side == C_SYNC_TARGET) {
device->bm_resync_fo = 0;
+ device->use_csums = use_checksum_based_resync(connection, device);
+ } else {
+ device->use_csums = 0;
+ }

/* Since protocol 96, we must serialize drbd_gen_and_send_sync_uuid
* with w_send_oos, or the sync target will get confused as to
diff --git a/include/linux/drbd_genl.h b/include/linux/drbd_genl.h
index 4193f5f..71fc924 100644
--- a/include/linux/drbd_genl.h
+++ b/include/linux/drbd_genl.h
@@ -171,6 +171,9 @@ GENL_struct(DRBD_NLA_NET_CONF, 5, net_conf,
__flg_field(28, DRBD_GENLA_F_MANDATORY | DRBD_F_INVARIANT, tentative)
__flg_field_def(29, DRBD_GENLA_F_MANDATORY, use_rle, DRBD_USE_RLE_DEF)
/* 9: __u32_field_def(30, DRBD_GENLA_F_MANDATORY, fencing_policy, DRBD_FENCING_DEF) */
+ /* 9: __str_field_def(31, DRBD_GENLA_F_MANDATORY, name, SHARED_SECRET_MAX) */
+ /* 9: __u32_field(32, DRBD_F_REQUIRED | DRBD_F_INVARIANT, peer_node_id) */
+ __flg_field_def(33, 0 /* OPTIONAL */, csums_after_crash_only, DRBD_CSUMS_AFTER_CRASH_ONLY_DEF)
)

GENL_struct(DRBD_NLA_SET_ROLE_PARMS, 6, set_role_parms,
diff --git a/include/linux/drbd_limits.h b/include/linux/drbd_limits.h
index 17e50bb..9d2df1d 100644
--- a/include/linux/drbd_limits.h
+++ b/include/linux/drbd_limits.h
@@ -214,6 +214,7 @@
#define DRBD_ALLOW_TWO_PRIMARIES_DEF 0
#define DRBD_ALWAYS_ASBP_DEF 0
#define DRBD_USE_RLE_DEF 1
+#define DRBD_CSUMS_AFTER_CRASH_ONLY_DEF 0

#define DRBD_AL_STRIPES_MIN 1
#define DRBD_AL_STRIPES_MAX 1024
--
1.9.1

2014-07-01 16:21:32

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 08/20] drbd: New net configuration option socket-check-timeout

In setups involving a DRBD-proxy and connections that experience a lot of
buffer-bloat it might be necessary to set ping-timeout to an
unusual high value. By default DRBD uses the same value to wait if a newly
established TCP-connection is stable. Since the DRBD-proxy is usually located
in the same data center such a long wait time may hinder DRBD's connect process.

In such setups socket-check-timeout should be set to
at least to the round trip time between DRBD and DRBD-proxy. I.e. in most
cases to 1.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_receiver.c | 46 +++++++++++++++++++++++++-------------
include/linux/drbd_genl.h | 1 +
include/linux/drbd_limits.h | 5 +++++
3 files changed, 36 insertions(+), 16 deletions(-)

diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index 7da83f3..b89e6fb 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -819,7 +819,7 @@ static int receive_first_packet(struct drbd_connection *connection, struct socke
* drbd_socket_okay() - Free the socket if its connection is not okay
* @sock: pointer to the pointer to the socket.
*/
-static int drbd_socket_okay(struct socket **sock)
+static bool drbd_socket_okay(struct socket **sock)
{
int rr;
char tb[4];
@@ -837,6 +837,30 @@ static int drbd_socket_okay(struct socket **sock)
return false;
}
}
+
+static bool connection_established(struct drbd_connection *connection,
+ struct socket **sock1,
+ struct socket **sock2)
+{
+ struct net_conf *nc;
+ int timeout;
+ bool ok;
+
+ if (!*sock1 || !*sock2)
+ return false;
+
+ rcu_read_lock();
+ nc = rcu_dereference(connection->net_conf);
+ timeout = (nc->sock_check_timeo ?: nc->ping_timeo) * HZ / 10;
+ rcu_read_unlock();
+ schedule_timeout_interruptible(timeout);
+
+ ok = drbd_socket_okay(sock1);
+ ok = drbd_socket_okay(sock2) && ok;
+
+ return ok;
+}
+
/* Gets called if a connection is established, or if a new minor gets created
in a connection */
int drbd_connected(struct drbd_peer_device *peer_device)
@@ -878,8 +902,8 @@ static int conn_connect(struct drbd_connection *connection)
struct drbd_socket sock, msock;
struct drbd_peer_device *peer_device;
struct net_conf *nc;
- int vnr, timeout, h, ok;
- bool discard_my_data;
+ int vnr, timeout, h;
+ bool discard_my_data, ok;
enum drbd_state_rv rv;
struct accept_wait_data ad = {
.connection = connection,
@@ -923,17 +947,8 @@ static int conn_connect(struct drbd_connection *connection)
}
}

- if (sock.socket && msock.socket) {
- rcu_read_lock();
- nc = rcu_dereference(connection->net_conf);
- timeout = nc->ping_timeo * HZ / 10;
- rcu_read_unlock();
- schedule_timeout_interruptible(timeout);
- ok = drbd_socket_okay(&sock.socket);
- ok = drbd_socket_okay(&msock.socket) && ok;
- if (ok)
- break;
- }
+ if (connection_established(connection, &sock.socket, &msock.socket))
+ break;

retry:
s = drbd_wait_for_connect(connection, &ad);
@@ -979,8 +994,7 @@ randomize:
goto out_release_sockets;
}

- ok = drbd_socket_okay(&sock.socket);
- ok = drbd_socket_okay(&msock.socket) && ok;
+ ok = connection_established(connection, &sock.socket, &msock.socket);
} while (!ok);

if (ad.s_listen)
diff --git a/include/linux/drbd_genl.h b/include/linux/drbd_genl.h
index 71fc924..7b131ed 100644
--- a/include/linux/drbd_genl.h
+++ b/include/linux/drbd_genl.h
@@ -174,6 +174,7 @@ GENL_struct(DRBD_NLA_NET_CONF, 5, net_conf,
/* 9: __str_field_def(31, DRBD_GENLA_F_MANDATORY, name, SHARED_SECRET_MAX) */
/* 9: __u32_field(32, DRBD_F_REQUIRED | DRBD_F_INVARIANT, peer_node_id) */
__flg_field_def(33, 0 /* OPTIONAL */, csums_after_crash_only, DRBD_CSUMS_AFTER_CRASH_ONLY_DEF)
+ __u32_field_def(34, 0 /* OPTIONAL */, sock_check_timeo, DRBD_SOCKET_CHECK_TIMEO_DEF)
)

GENL_struct(DRBD_NLA_SET_ROLE_PARMS, 6, set_role_parms,
diff --git a/include/linux/drbd_limits.h b/include/linux/drbd_limits.h
index 9d2df1d..8ac8c5d 100644
--- a/include/linux/drbd_limits.h
+++ b/include/linux/drbd_limits.h
@@ -225,4 +225,9 @@
#define DRBD_AL_STRIPE_SIZE_MAX 16777216
#define DRBD_AL_STRIPE_SIZE_DEF 32
#define DRBD_AL_STRIPE_SIZE_SCALE 'k' /* kilobytes */
+
+#define DRBD_SOCKET_CHECK_TIMEO_MIN 0
+#define DRBD_SOCKET_CHECK_TIMEO_MAX DRBD_PING_TIMEO_MAX
+#define DRBD_SOCKET_CHECK_TIMEO_DEF 0
+#define DRBD_SOCKET_CHECK_TIMEO_SCALE '1'
#endif
--
1.9.1

2014-07-01 16:21:29

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 10/20] drbd: short-circuit in maybe_pull_ahead

From: Lars Ellenberg <[email protected]>

If we already "pulled ahead", we can short-circuit,
and avoid logging the same messages over and over again.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_req.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
index f07a724..3824d5c 100644
--- a/drivers/block/drbd/drbd_req.c
+++ b/drivers/block/drbd/drbd_req.c
@@ -899,6 +899,9 @@ static void maybe_pull_ahead(struct drbd_device *device)
connection->agreed_pro_version < 96)
return;

+ if (on_congestion == OC_PULL_AHEAD && device->state.conn == C_AHEAD)
+ return; /* nothing to do ... */
+
/* If I don't even have good local storage, we can not reasonably try
* to pull ahead of the peer. We also need the local reference to make
* sure device->act_log is there.
--
1.9.1

2014-07-01 16:22:27

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 04/20] drbd: fix bogus resync stats in /proc/drbd

From: Lars Ellenberg <[email protected]>

We intentionally do not serialize /proc/drbd access with
internal state changes or statistic updates.

Because of that, cat /proc/drbd may race with resync just being
finished, still see the sync state, and find information about
number of blocks still to go, but then find the total number
of blocks within this resync has just been reset to 0
when accessing it.

This now produces bogus numbers in the resync speed estimates.

Fix by accessing all relevant data only once,
and fixing it up if "still to go" happens to be more than "total".

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_int.h | 48 -------------------
drivers/block/drbd/drbd_proc.c | 103 ++++++++++++++++++++++++++++++-----------
2 files changed, 75 insertions(+), 76 deletions(-)

diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index e306a22..abf5aef 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -1982,54 +1982,6 @@ static inline int _get_ldev_if_state(struct drbd_device *device, enum drbd_disk_
extern int _get_ldev_if_state(struct drbd_device *device, enum drbd_disk_state mins);
#endif

-/* you must have an "get_ldev" reference */
-static inline void drbd_get_syncer_progress(struct drbd_device *device,
- unsigned long *bits_left, unsigned int *per_mil_done)
-{
- /* this is to break it at compile time when we change that, in case we
- * want to support more than (1<<32) bits on a 32bit arch. */
- typecheck(unsigned long, device->rs_total);
-
- /* note: both rs_total and rs_left are in bits, i.e. in
- * units of BM_BLOCK_SIZE.
- * for the percentage, we don't care. */
-
- if (device->state.conn == C_VERIFY_S || device->state.conn == C_VERIFY_T)
- *bits_left = device->ov_left;
- else
- *bits_left = drbd_bm_total_weight(device) - device->rs_failed;
- /* >> 10 to prevent overflow,
- * +1 to prevent division by zero */
- if (*bits_left > device->rs_total) {
- /* doh. maybe a logic bug somewhere.
- * may also be just a race condition
- * between this and a disconnect during sync.
- * for now, just prevent in-kernel buffer overflow.
- */
- smp_rmb();
- drbd_warn(device, "cs:%s rs_left=%lu > rs_total=%lu (rs_failed %lu)\n",
- drbd_conn_str(device->state.conn),
- *bits_left, device->rs_total, device->rs_failed);
- *per_mil_done = 0;
- } else {
- /* Make sure the division happens in long context.
- * We allow up to one petabyte storage right now,
- * at a granularity of 4k per bit that is 2**38 bits.
- * After shift right and multiplication by 1000,
- * this should still fit easily into a 32bit long,
- * so we don't need a 64bit division on 32bit arch.
- * Note: currently we don't support such large bitmaps on 32bit
- * arch anyways, but no harm done to be prepared for it here.
- */
- unsigned int shift = device->rs_total > UINT_MAX ? 16 : 10;
- unsigned long left = *bits_left >> shift;
- unsigned long total = 1UL + (device->rs_total >> shift);
- unsigned long tmp = 1000UL - left * 1000UL/total;
- *per_mil_done = tmp;
- }
-}
-
-
/* this throttles on-the-fly application requests
* according to max_buffers settings;
* maybe re-implement using semaphores? */
diff --git a/drivers/block/drbd/drbd_proc.c b/drivers/block/drbd/drbd_proc.c
index 886f6be..9059d7b 100644
--- a/drivers/block/drbd/drbd_proc.c
+++ b/drivers/block/drbd/drbd_proc.c
@@ -60,20 +60,65 @@ static void seq_printf_with_thousands_grouping(struct seq_file *seq, long v)
seq_printf(seq, "%ld", v);
}

+static void drbd_get_syncer_progress(struct drbd_device *device,
+ union drbd_dev_state state, unsigned long *rs_total,
+ unsigned long *bits_left, unsigned int *per_mil_done)
+{
+ /* this is to break it at compile time when we change that, in case we
+ * want to support more than (1<<32) bits on a 32bit arch. */
+ typecheck(unsigned long, device->rs_total);
+ *rs_total = device->rs_total;
+
+ /* note: both rs_total and rs_left are in bits, i.e. in
+ * units of BM_BLOCK_SIZE.
+ * for the percentage, we don't care. */
+
+ if (state.conn == C_VERIFY_S || state.conn == C_VERIFY_T)
+ *bits_left = device->ov_left;
+ else
+ *bits_left = drbd_bm_total_weight(device) - device->rs_failed;
+ /* >> 10 to prevent overflow,
+ * +1 to prevent division by zero */
+ if (*bits_left > *rs_total) {
+ /* D'oh. Maybe a logic bug somewhere. More likely just a race
+ * between state change and reset of rs_total.
+ */
+ *bits_left = *rs_total;
+ *per_mil_done = *rs_total ? 0 : 1000;
+ } else {
+ /* Make sure the division happens in long context.
+ * We allow up to one petabyte storage right now,
+ * at a granularity of 4k per bit that is 2**38 bits.
+ * After shift right and multiplication by 1000,
+ * this should still fit easily into a 32bit long,
+ * so we don't need a 64bit division on 32bit arch.
+ * Note: currently we don't support such large bitmaps on 32bit
+ * arch anyways, but no harm done to be prepared for it here.
+ */
+ unsigned int shift = *rs_total > UINT_MAX ? 16 : 10;
+ unsigned long left = *bits_left >> shift;
+ unsigned long total = 1UL + (*rs_total >> shift);
+ unsigned long tmp = 1000UL - left * 1000UL/total;
+ *per_mil_done = tmp;
+ }
+}
+
+
/*lge
* progress bars shamelessly adapted from driver/md/md.c
* output looks like
* [=====>..............] 33.5% (23456/123456)
* finish: 2:20:20 speed: 6,345 (6,456) K/sec
*/
-static void drbd_syncer_progress(struct drbd_device *device, struct seq_file *seq)
+static void drbd_syncer_progress(struct drbd_device *device, struct seq_file *seq,
+ union drbd_dev_state state)
{
- unsigned long db, dt, dbdt, rt, rs_left;
+ unsigned long db, dt, dbdt, rt, rs_total, rs_left;
unsigned int res;
int i, x, y;
int stalled = 0;

- drbd_get_syncer_progress(device, &rs_left, &res);
+ drbd_get_syncer_progress(device, state, &rs_total, &rs_left, &res);

x = res/50;
y = 20-x;
@@ -85,21 +130,21 @@ static void drbd_syncer_progress(struct drbd_device *device, struct seq_file *se
seq_printf(seq, ".");
seq_printf(seq, "] ");

- if (device->state.conn == C_VERIFY_S || device->state.conn == C_VERIFY_T)
+ if (state.conn == C_VERIFY_S || state.conn == C_VERIFY_T)
seq_printf(seq, "verified:");
else
seq_printf(seq, "sync'ed:");
seq_printf(seq, "%3u.%u%% ", res / 10, res % 10);

/* if more than a few GB, display in MB */
- if (device->rs_total > (4UL << (30 - BM_BLOCK_SHIFT)))
+ if (rs_total > (4UL << (30 - BM_BLOCK_SHIFT)))
seq_printf(seq, "(%lu/%lu)M",
(unsigned long) Bit2KB(rs_left >> 10),
- (unsigned long) Bit2KB(device->rs_total >> 10));
+ (unsigned long) Bit2KB(rs_total >> 10));
else
seq_printf(seq, "(%lu/%lu)K\n\t",
(unsigned long) Bit2KB(rs_left),
- (unsigned long) Bit2KB(device->rs_total));
+ (unsigned long) Bit2KB(rs_total));

/* see drivers/md/md.c
* We do not want to overflow, so the order of operands and
@@ -150,13 +195,13 @@ static void drbd_syncer_progress(struct drbd_device *device, struct seq_file *se
dt = (jiffies - device->rs_start - device->rs_paused) / HZ;
if (dt == 0)
dt = 1;
- db = device->rs_total - rs_left;
+ db = rs_total - rs_left;
dbdt = Bit2KB(db/dt);
seq_printf_with_thousands_grouping(seq, dbdt);
seq_printf(seq, ")");

- if (device->state.conn == C_SYNC_TARGET ||
- device->state.conn == C_VERIFY_S) {
+ if (state.conn == C_SYNC_TARGET ||
+ state.conn == C_VERIFY_S) {
seq_printf(seq, " want: ");
seq_printf_with_thousands_grouping(seq, device->c_sync_rate);
}
@@ -168,8 +213,8 @@ static void drbd_syncer_progress(struct drbd_device *device, struct seq_file *se
unsigned long bm_bits = drbd_bm_bits(device);
unsigned long bit_pos;
unsigned long long stop_sector = 0;
- if (device->state.conn == C_VERIFY_S ||
- device->state.conn == C_VERIFY_T) {
+ if (state.conn == C_VERIFY_S ||
+ state.conn == C_VERIFY_T) {
bit_pos = bm_bits - device->ov_left;
if (verify_can_do_stop_sector(device))
stop_sector = device->ov_stop_sector;
@@ -194,6 +239,7 @@ static int drbd_seq_show(struct seq_file *seq, void *v)
const char *sn;
struct drbd_device *device;
struct net_conf *nc;
+ union drbd_dev_state state;
char wp;

static char write_ordering_chars[] = {
@@ -231,11 +277,12 @@ static int drbd_seq_show(struct seq_file *seq, void *v)
seq_printf(seq, "\n");
prev_i = i;

- sn = drbd_conn_str(device->state.conn);
+ state = device->state;
+ sn = drbd_conn_str(state.conn);

- if (device->state.conn == C_STANDALONE &&
- device->state.disk == D_DISKLESS &&
- device->state.role == R_SECONDARY) {
+ if (state.conn == C_STANDALONE &&
+ state.disk == D_DISKLESS &&
+ state.role == R_SECONDARY) {
seq_printf(seq, "%2d: cs:Unconfigured\n", i);
} else {
/* reset device->congestion_reason */
@@ -248,15 +295,15 @@ static int drbd_seq_show(struct seq_file *seq, void *v)
" ns:%u nr:%u dw:%u dr:%u al:%u bm:%u "
"lo:%d pe:%d ua:%d ap:%d ep:%d wo:%c",
i, sn,
- drbd_role_str(device->state.role),
- drbd_role_str(device->state.peer),
- drbd_disk_str(device->state.disk),
- drbd_disk_str(device->state.pdsk),
+ drbd_role_str(state.role),
+ drbd_role_str(state.peer),
+ drbd_disk_str(state.disk),
+ drbd_disk_str(state.pdsk),
wp,
drbd_suspended(device) ? 's' : 'r',
- device->state.aftr_isp ? 'a' : '-',
- device->state.peer_isp ? 'p' : '-',
- device->state.user_isp ? 'u' : '-',
+ state.aftr_isp ? 'a' : '-',
+ state.peer_isp ? 'p' : '-',
+ state.user_isp ? 'u' : '-',
device->congestion_reason ?: '-',
test_bit(AL_SUSPENDED, &device->flags) ? 's' : '-',
device->send_cnt/2,
@@ -277,11 +324,11 @@ static int drbd_seq_show(struct seq_file *seq, void *v)
Bit2KB((unsigned long long)
drbd_bm_total_weight(device)));
}
- if (device->state.conn == C_SYNC_SOURCE ||
- device->state.conn == C_SYNC_TARGET ||
- device->state.conn == C_VERIFY_S ||
- device->state.conn == C_VERIFY_T)
- drbd_syncer_progress(device, seq);
+ if (state.conn == C_SYNC_SOURCE ||
+ state.conn == C_SYNC_TARGET ||
+ state.conn == C_VERIFY_S ||
+ state.conn == C_VERIFY_T)
+ drbd_syncer_progress(device, seq, state);

if (proc_details >= 1 && get_ldev_if_state(device, D_FAILED)) {
lc_seq_printf_stats(seq, device->resync);
--
1.9.1

2014-07-01 16:22:42

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 03/20] drbd: Remove unnecessary/unused code

From: Andreas Gruenbacher <[email protected]>

Get rid of dump_stack() debug statements.

There is no point whatsoever in registering and unregistering a reboot
notifier that doesn't do anything.

The intention was to switch to an "emergency read-only" mode,
so we won't have to resync the full activity log just because
we had been Primary before the reboot.

Once we have that implemented, we may re-introduce the reboot notifier.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_int.h | 1 -
drivers/block/drbd/drbd_main.c | 18 ------------------
drivers/block/drbd/drbd_proc.c | 2 +-
3 files changed, 1 insertion(+), 20 deletions(-)

diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index 72b5001..e306a22 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -1468,7 +1468,6 @@ static inline void drbd_generic_make_request(struct drbd_device *device,
printk(KERN_ERR "drbd%d: drbd_generic_make_request: "
"bio->bi_bdev == NULL\n",
device_to_minor(device));
- dump_stack();
bio_endio(bio, -ENODEV);
return;
}
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 68ec774..7c06024 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -2125,20 +2125,6 @@ Enomem:
return -ENOMEM;
}

-static int drbd_notify_sys(struct notifier_block *this, unsigned long code,
- void *unused)
-{
- /* just so we have it. you never know what interesting things we
- * might want to do here some day...
- */
-
- return NOTIFY_DONE;
-}
-
-static struct notifier_block drbd_notifier = {
- .notifier_call = drbd_notify_sys,
-};
-
static void drbd_release_all_peer_reqs(struct drbd_device *device)
{
int rr;
@@ -2314,8 +2300,6 @@ static void drbd_cleanup(void)
struct drbd_device *device;
struct drbd_resource *resource, *tmp;

- unregister_reboot_notifier(&drbd_notifier);
-
/* first remove proc,
* drbdsetup uses it's presence to detect
* whether DRBD is loaded.
@@ -2899,8 +2883,6 @@ static int __init drbd_init(void)
return err;
}

- register_reboot_notifier(&drbd_notifier);
-
/*
* allocate all necessary structs
*/
diff --git a/drivers/block/drbd/drbd_proc.c b/drivers/block/drbd/drbd_proc.c
index 46bb8dd..886f6be 100644
--- a/drivers/block/drbd/drbd_proc.c
+++ b/drivers/block/drbd/drbd_proc.c
@@ -299,7 +299,7 @@ static int drbd_proc_open(struct inode *inode, struct file *file)
int err;

if (try_module_get(THIS_MODULE)) {
- err = single_open(file, drbd_seq_show, PDE_DATA(inode));
+ err = single_open(file, drbd_seq_show, NULL);
if (err)
module_put(THIS_MODULE);
return err;
--
1.9.1

2014-07-01 16:30:44

by Joe Perches

[permalink] [raw]
Subject: Re: [PATCH 15/20] drbd: cosmetic: change all printk(level, ...) to pr_<level>(...)

On Tue, 2014-07-01 at 18:16 +0200, Philipp Reisner wrote:
> From: Lars Ellenberg <[email protected]>
>
> Cosmetic change only.
[]
> diff --git a/drivers/block/drbd/drbd_bitmap.c b/drivers/block/drbd/drbd_bitmap.c
[]
> @@ -353,9 +353,8 @@ static void bm_free_pages(struct page **pages, unsigned long number)
>
> for (i = 0; i < number; i++) {
> if (!pages[i]) {
> - printk(KERN_ALERT "drbd: bm_free_pages tried to free "
> - "a NULL pointer; i=%lu n=%lu\n",
> - i, number);
> + pr_alert("bm_free_pages tried to free a NULL pointer; i=%lu n=%lu\n",
> + i, number);

Cosmetic yes, but these won't be prefixed with "drbd: "
unless there's a #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
somewhere above this and before any other #include.

> diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
[]
> @@ -26,6 +26,8 @@
>
> */
>
> +#define pr_fmt(fmt) "drbd: " fmt

like this one, though I think KBUILD_MODNAME would be better.


2014-07-02 09:59:03

by Philipp Reisner

[permalink] [raw]
Subject: Re: [Drbd-dev] [PATCH 15/20] drbd: cosmetic: change all printk(level, ...) to pr_<level>(...)

Am Dienstag, 1. Juli 2014, 09:30:37 schrieb Joe Perches:
> On Tue, 2014-07-01 at 18:16 +0200, Philipp Reisner wrote:
> > From: Lars Ellenberg <[email protected]>
> >
> > Cosmetic change only.
>
> []
>
> > diff --git a/drivers/block/drbd/drbd_bitmap.c
> > b/drivers/block/drbd/drbd_bitmap.c
> []
>
> > @@ -353,9 +353,8 @@ static void bm_free_pages(struct page **pages,
> > unsigned long number)>
> > for (i = 0; i < number; i++) {
> >
> > if (!pages[i]) {
> >
> > - printk(KERN_ALERT "drbd: bm_free_pages tried to free "
> > - "a NULL pointer; i=%lu n=%lu\n",
> > - i, number);
> > + pr_alert("bm_free_pages tried to free a NULL pointer; i=%lu n=%lu\n",
> > + i, number);
>
> Cosmetic yes, but these won't be prefixed with "drbd: "
> unless there's a #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> somewhere above this and before any other #include.
>

Right. Thanks for pointing that out!
I ammend it with this:

Subject: [PATCH] fixup! drbd: cosmetic: change all printk(level, ...) to
pr_<level>(...)

---
drivers/block/drbd/drbd_bitmap.c | 2 ++
drivers/block/drbd/drbd_main.c | 2 +-
drivers/block/drbd/drbd_nl.c | 2 +-
3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/block/drbd/drbd_bitmap.c b/drivers/block/drbd/drbd_bitmap.c
index 72370e2..f435e7e 100644
--- a/drivers/block/drbd/drbd_bitmap.c
+++ b/drivers/block/drbd/drbd_bitmap.c
@@ -22,6 +22,8 @@
the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.
*/

+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
#include <linux/bitops.h>
#include <linux/vmalloc.h>
#include <linux/string.h>
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 3788ac1..9b465bb 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -26,7 +26,7 @@

*/

-#define pr_fmt(fmt) "drbd: " fmt
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

#include <linux/module.h>
#include <linux/jiffies.h>
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index bb9b8e5..7fcdc54 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -23,7 +23,7 @@

*/

-#define pr_fmt(fmt) "drbd: " fmt
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

#include <linux/module.h>
#include <linux/drbd.h>
--
1.9.1