2015-08-04 12:57:11

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 00/19] RFC DRBD updates for the 4.3 merge window (part II)

Hi,

these patches are intended for the 4.3 merge window.

The first two patches split on of kernel threads (drbd_asender)
into one that mainly receives packets on the control connection,
and a workqueue that only sends packets on the control connection.

All the other patches are self-contained fixes or minor
improvements described in the commit message.

Lars Ellenberg (13):
drbd: prevent NULL pointer deref when resuming diskless primary
drbd: debugfs: expose ed_data_gen_id
drbd: use resource name in workqueue
drbd: avoid redefinition of BITS_PER_PAGE
drbd: use bitmap_weight() helper, don't open code
drbd: fix spurious alert level printk
drbd: fix queue limit setup for discard
drbd: make drbd known to lsblk: use bd_link_disk_holder
drbd: don't block forever in disconnect during resync if
fencing=r-a-stonith
drbd: fix "endless" transfer log walk in protocol A
drbd: separate out __al_write_transaction helper function
drbd: avoid potential deadlock during handshake
drbd: fix error path during resize

Oleg Drokin (1):
drbd: fix memory leak in drbd_adm_resize

Philipp Reisner (3):
drbd: Rename asender to ack_receiver
drbd: Create a dedicated workqueue for sending acks on the control
connection
drbd: make suspend_io() / resume_io() must be thread and recursion
safe

Roland Kammerer (2):
lru_cache: Converted lc_seq_printf_status to return void
MAINTAINERS: Updated information for DRBD DRIVER

MAINTAINERS | 11 +-
drivers/block/drbd/drbd_actlog.c | 323 +++++++++++++++++++------------------
drivers/block/drbd/drbd_bitmap.c | 22 ++-
drivers/block/drbd/drbd_debugfs.c | 10 ++
drivers/block/drbd/drbd_int.h | 40 ++---
drivers/block/drbd/drbd_main.c | 48 ++----
drivers/block/drbd/drbd_nl.c | 285 +++++++++++++++++++++-----------
drivers/block/drbd/drbd_protocol.h | 2 +-
drivers/block/drbd/drbd_receiver.c | 212 +++++++++++++-----------
drivers/block/drbd/drbd_req.c | 4 +-
drivers/block/drbd/drbd_state.c | 2 +-
drivers/block/drbd/drbd_worker.c | 10 +-
include/linux/drbd.h | 2 +-
include/linux/lru_cache.h | 2 +-
lib/lru_cache.c | 4 +-
15 files changed, 560 insertions(+), 417 deletions(-)

--
1.9.1


2015-08-04 13:02:25

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 01/19] drbd: Rename asender to ack_receiver

This prepares the next patch where the sending on the meta (or
control) socket is moved to a dedicated workqueue.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_int.h | 6 +++---
drivers/block/drbd/drbd_main.c | 10 +++++-----
drivers/block/drbd/drbd_receiver.c | 6 +++---
3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index cb5fb68..3d5f3c0 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -754,7 +754,7 @@ struct drbd_connection {
unsigned long last_reconnect_jif;
struct drbd_thread receiver;
struct drbd_thread worker;
- struct drbd_thread asender;
+ struct drbd_thread ack_receiver;

/* cached pointers,
* so we can look up the oldest pending requests more quickly.
@@ -1558,7 +1558,7 @@ extern void drbd_endio_write_sec_final(struct drbd_peer_request *peer_req);

/* drbd_receiver.c */
extern int drbd_receiver(struct drbd_thread *thi);
-extern int drbd_asender(struct drbd_thread *thi);
+extern int drbd_ack_receiver(struct drbd_thread *thi);
extern bool drbd_rs_c_min_rate_throttle(struct drbd_device *device);
extern bool drbd_rs_should_slow_down(struct drbd_device *device, sector_t sector,
bool throttle_if_app_is_waiting);
@@ -1971,7 +1971,7 @@ extern void drbd_flush_workqueue(struct drbd_work_queue *work_queue);
static inline void wake_asender(struct drbd_connection *connection)
{
if (test_bit(SIGNAL_ASENDER, &connection->flags))
- force_sig(DRBD_SIG, connection->asender.task);
+ force_sig(DRBD_SIG, connection->ack_receiver.task);
}

static inline void request_ping(struct drbd_connection *connection)
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 81a8679..3f7371d 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -1436,8 +1436,8 @@ static int we_should_drop_the_connection(struct drbd_connection *connection, str
/* long elapsed = (long)(jiffies - device->last_received); */

drop_it = connection->meta.socket == sock
- || !connection->asender.task
- || get_t_state(&connection->asender) != RUNNING
+ || !connection->ack_receiver.task
+ || get_t_state(&connection->ack_receiver) != RUNNING
|| connection->cstate < C_WF_REPORT_PARAMS;

if (drop_it)
@@ -2564,7 +2564,7 @@ int set_resource_options(struct drbd_resource *resource, struct res_opts *res_op
cpumask_copy(resource->cpu_mask, new_cpu_mask);
for_each_connection_rcu(connection, resource) {
connection->receiver.reset_cpu_mask = 1;
- connection->asender.reset_cpu_mask = 1;
+ connection->ack_receiver.reset_cpu_mask = 1;
connection->worker.reset_cpu_mask = 1;
}
}
@@ -2653,8 +2653,8 @@ struct drbd_connection *conn_create(const char *name, struct res_opts *res_opts)
connection->receiver.connection = connection;
drbd_thread_init(resource, &connection->worker, drbd_worker, "worker");
connection->worker.connection = connection;
- drbd_thread_init(resource, &connection->asender, drbd_asender, "asender");
- connection->asender.connection = connection;
+ drbd_thread_init(resource, &connection->ack_receiver, drbd_ack_receiver, "ack_recv");
+ connection->ack_receiver.connection = connection;

kref_init(&connection->kref);

diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index d4c61af..5521465 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -1098,7 +1098,7 @@ randomize:
return 0;
}

- drbd_thread_start(&connection->asender);
+ drbd_thread_start(&connection->ack_receiver);

mutex_lock(&connection->resource->conf_update);
/* The discard_my_data flag is a single-shot modifier to the next
@@ -4655,7 +4655,7 @@ static void conn_disconnect(struct drbd_connection *connection)
conn_request_state(connection, NS(conn, C_NETWORK_FAILURE), CS_HARD);

/* asender does not clean up anything. it must not interfere, either */
- drbd_thread_stop(&connection->asender);
+ drbd_thread_stop(&connection->ack_receiver);
drbd_free_sock(connection);

rcu_read_lock();
@@ -5486,7 +5486,7 @@ static struct asender_cmd asender_tbl[] = {
[P_RETRY_WRITE] = { sizeof(struct p_block_ack), got_BlockAck },
};

-int drbd_asender(struct drbd_thread *thi)
+int drbd_ack_receiver(struct drbd_thread *thi)
{
struct drbd_connection *connection = thi->connection;
struct asender_cmd *cmd = NULL;
--
1.9.1

2015-08-04 13:02:21

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 02/19] drbd: Create a dedicated workqueue for sending acks on the control connection

The intention is to reduce CPU utilization. Recent measurements
unveiled that the current performance bottleneck is CPU utilization
on the receiving node. The asender thread became CPU limited.

One of the main points is to eliminate the idr_for_each_entry() loop
from the sending acks code path.

One exception in that is sending back ping_acks. These stay
in the ack-receiver thread. Otherwise the logic becomes too
complicated for no added value.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_int.h | 27 ++---
drivers/block/drbd/drbd_main.c | 10 +-
drivers/block/drbd/drbd_nl.c | 4 +-
drivers/block/drbd/drbd_protocol.h | 2 +-
drivers/block/drbd/drbd_receiver.c | 203 +++++++++++++++++++++----------------
drivers/block/drbd/drbd_req.c | 2 +-
drivers/block/drbd/drbd_worker.c | 8 +-
7 files changed, 141 insertions(+), 115 deletions(-)

diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index 3d5f3c0..9bf3178 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -77,13 +77,6 @@ extern int fault_devs;
extern char usermode_helper[];


-/* I don't remember why XCPU ...
- * This is used to wake the asender,
- * and to interrupt sending the sending task
- * on disconnect.
- */
-#define DRBD_SIG SIGXCPU
-
/* This is used to stop/restart our threads.
* Cannot use SIGTERM nor SIGKILL, since these
* are sent out by init on runlevel changes
@@ -647,8 +640,7 @@ extern struct fifo_buffer *fifo_alloc(int fifo_size);
enum {
NET_CONGESTED, /* The data socket is congested */
RESOLVE_CONFLICTS, /* Set on one node, cleared on the peer! */
- SEND_PING, /* whether asender should send a ping asap */
- SIGNAL_ASENDER, /* whether asender wants to be interrupted */
+ SEND_PING,
GOT_PING_ACK, /* set when we receive a ping_ack packet, ping_wait gets woken */
CONN_WD_ST_CHG_REQ, /* A cluster wide state change on the connection is active */
CONN_WD_ST_CHG_OKAY,
@@ -755,6 +747,7 @@ struct drbd_connection {
struct drbd_thread receiver;
struct drbd_thread worker;
struct drbd_thread ack_receiver;
+ struct workqueue_struct *ack_sender;

/* cached pointers,
* so we can look up the oldest pending requests more quickly.
@@ -823,6 +816,7 @@ struct drbd_peer_device {
struct list_head peer_devices;
struct drbd_device *device;
struct drbd_connection *connection;
+ struct work_struct send_acks_work;
#ifdef CONFIG_DEBUG_FS
struct dentry *debugfs_peer_dev;
#endif
@@ -1559,6 +1553,8 @@ extern void drbd_endio_write_sec_final(struct drbd_peer_request *peer_req);
/* drbd_receiver.c */
extern int drbd_receiver(struct drbd_thread *thi);
extern int drbd_ack_receiver(struct drbd_thread *thi);
+extern void drbd_send_ping_wf(struct work_struct *ws);
+extern void drbd_send_acks_wf(struct work_struct *ws);
extern bool drbd_rs_c_min_rate_throttle(struct drbd_device *device);
extern bool drbd_rs_should_slow_down(struct drbd_device *device, sector_t sector,
bool throttle_if_app_is_waiting);
@@ -1968,16 +1964,21 @@ drbd_device_post_work(struct drbd_device *device, int work_bit)

extern void drbd_flush_workqueue(struct drbd_work_queue *work_queue);

-static inline void wake_asender(struct drbd_connection *connection)
+/* To get the ack_receiver out of the blocking network stack,
+ * so it can change its sk_rcvtimeo from idle- to ping-timeout,
+ * and send a ping, we need to send a signal.
+ * Which signal we send is irrelevant. */
+static inline void wake_ack_receiver(struct drbd_connection *connection)
{
- if (test_bit(SIGNAL_ASENDER, &connection->flags))
- force_sig(DRBD_SIG, connection->ack_receiver.task);
+ struct task_struct *task = connection->ack_receiver.task;
+ if (task && get_t_state(&connection->ack_receiver) == RUNNING)
+ force_sig(SIGXCPU, task);
}

static inline void request_ping(struct drbd_connection *connection)
{
set_bit(SEND_PING, &connection->flags);
- wake_asender(connection);
+ wake_ack_receiver(connection);
}

extern void *conn_prepare_command(struct drbd_connection *, struct drbd_socket *);
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 3f7371d..99b457b 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -1794,15 +1794,6 @@ int drbd_send(struct drbd_connection *connection, struct socket *sock,
drbd_update_congested(connection);
}
do {
- /* STRANGE
- * tcp_sendmsg does _not_ use its size parameter at all ?
- *
- * -EAGAIN on timeout, -EINTR on signal.
- */
-/* THINK
- * do we need to block DRBD_SIG if sock == &meta.socket ??
- * otherwise wake_asender() might interrupt some send_*Ack !
- */
rv = kernel_sendmsg(sock, &msg, &iov, 1, size);
if (rv == -EAGAIN) {
if (we_should_drop_the_connection(connection, sock))
@@ -2822,6 +2813,7 @@ enum drbd_ret_code drbd_create_device(struct drbd_config_context *adm_ctx, unsig
goto out_idr_remove_from_resource;
}
kref_get(&connection->kref);
+ INIT_WORK(&peer_device->send_acks_work, drbd_send_acks_wf);
}

if (init_submitter(device)) {
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 4c8d8e7..0152f0f 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -1258,8 +1258,8 @@ static void conn_reconfig_done(struct drbd_connection *connection)
connection->cstate == C_STANDALONE;
spin_unlock_irq(&connection->resource->req_lock);
if (stop_threads) {
- /* asender is implicitly stopped by receiver
- * in conn_disconnect() */
+ /* ack_receiver thread and ack_sender workqueue are implicitly
+ * stopped by receiver in conn_disconnect() */
drbd_thread_stop(&connection->receiver);
drbd_thread_stop(&connection->worker);
}
diff --git a/drivers/block/drbd/drbd_protocol.h b/drivers/block/drbd/drbd_protocol.h
index 2da9104a..ef92453 100644
--- a/drivers/block/drbd/drbd_protocol.h
+++ b/drivers/block/drbd/drbd_protocol.h
@@ -23,7 +23,7 @@ enum drbd_packet {
P_AUTH_RESPONSE = 0x11,
P_STATE_CHG_REQ = 0x12,

- /* asender (meta socket */
+ /* (meta socket) */
P_PING = 0x13,
P_PING_ACK = 0x14,
P_RECV_ACK = 0x15, /* Used in protocol B */
diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index 5521465..9b70853 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -215,7 +215,7 @@ static void reclaim_finished_net_peer_reqs(struct drbd_device *device,
}
}

-static void drbd_kick_lo_and_reclaim_net(struct drbd_device *device)
+static void drbd_reclaim_net_peer_reqs(struct drbd_device *device)
{
LIST_HEAD(reclaimed);
struct drbd_peer_request *peer_req, *t;
@@ -223,11 +223,30 @@ static void drbd_kick_lo_and_reclaim_net(struct drbd_device *device)
spin_lock_irq(&device->resource->req_lock);
reclaim_finished_net_peer_reqs(device, &reclaimed);
spin_unlock_irq(&device->resource->req_lock);
-
list_for_each_entry_safe(peer_req, t, &reclaimed, w.list)
drbd_free_net_peer_req(device, peer_req);
}

+static void conn_reclaim_net_peer_reqs(struct drbd_connection *connection)
+{
+ struct drbd_peer_device *peer_device;
+ int vnr;
+
+ rcu_read_lock();
+ idr_for_each_entry(&connection->peer_devices, peer_device, vnr) {
+ struct drbd_device *device = peer_device->device;
+ if (!atomic_read(&device->pp_in_use_by_net))
+ continue;
+
+ kref_get(&device->kref);
+ rcu_read_unlock();
+ drbd_reclaim_net_peer_reqs(device);
+ kref_put(&device->kref, drbd_destroy_device);
+ rcu_read_lock();
+ }
+ rcu_read_unlock();
+}
+
/**
* drbd_alloc_pages() - Returns @number pages, retries forever (or until signalled)
* @device: DRBD device.
@@ -265,10 +284,15 @@ struct page *drbd_alloc_pages(struct drbd_peer_device *peer_device, unsigned int
if (atomic_read(&device->pp_in_use) < mxb)
page = __drbd_alloc_pages(device, number);

+ /* Try to keep the fast path fast, but occasionally we need
+ * to reclaim the pages we lended to the network stack. */
+ if (page && atomic_read(&device->pp_in_use_by_net) > 512)
+ drbd_reclaim_net_peer_reqs(device);
+
while (page == NULL) {
prepare_to_wait(&drbd_pp_wait, &wait, TASK_INTERRUPTIBLE);

- drbd_kick_lo_and_reclaim_net(device);
+ drbd_reclaim_net_peer_reqs(device);

if (atomic_read(&device->pp_in_use) < mxb) {
page = __drbd_alloc_pages(device, number);
@@ -1099,6 +1123,11 @@ randomize:
}

drbd_thread_start(&connection->ack_receiver);
+ connection->ack_sender = create_singlethread_workqueue("drbd_ack_sender");
+ if (!connection->ack_sender) {
+ drbd_err(connection, "Failed to create workqueue ack_sender\n");
+ return 0;
+ }

mutex_lock(&connection->resource->conf_update);
/* The discard_my_data flag is a single-shot modifier to the next
@@ -1745,7 +1774,7 @@ static int recv_dless_read(struct drbd_peer_device *peer_device, struct drbd_req
}

/*
- * e_end_resync_block() is called in asender context via
+ * e_end_resync_block() is called in ack_sender context via
* drbd_finish_peer_reqs().
*/
static int e_end_resync_block(struct drbd_work *w, int unused)
@@ -1919,7 +1948,7 @@ static void restart_conflicting_writes(struct drbd_device *device,
}

/*
- * e_end_block() is called in asender context via drbd_finish_peer_reqs().
+ * e_end_block() is called in ack_sender context via drbd_finish_peer_reqs().
*/
static int e_end_block(struct drbd_work *w, int cancel)
{
@@ -2210,7 +2239,7 @@ static int handle_write_conflicts(struct drbd_device *device,
peer_req->w.cb = superseded ? e_send_superseded :
e_send_retry_write;
list_add_tail(&peer_req->w.list, &device->done_ee);
- wake_asender(connection);
+ queue_work(connection->ack_sender, &peer_req->peer_device->send_acks_work);

err = -ENOENT;
goto out;
@@ -4049,7 +4078,7 @@ static int receive_state(struct drbd_connection *connection, struct packet_info
os = ns = drbd_read_state(device);
spin_unlock_irq(&device->resource->req_lock);

- /* If some other part of the code (asender thread, timeout)
+ /* If some other part of the code (ack_receiver thread, timeout)
* already decided to close the connection again,
* we must not "re-establish" it here. */
if (os.conn <= C_TEAR_DOWN)
@@ -4654,8 +4683,12 @@ static void conn_disconnect(struct drbd_connection *connection)
*/
conn_request_state(connection, NS(conn, C_NETWORK_FAILURE), CS_HARD);

- /* asender does not clean up anything. it must not interfere, either */
+ /* ack_receiver does not clean up anything. it must not interfere, either */
drbd_thread_stop(&connection->ack_receiver);
+ if (connection->ack_sender) {
+ destroy_workqueue(connection->ack_sender);
+ connection->ack_sender = NULL;
+ }
drbd_free_sock(connection);

rcu_read_lock();
@@ -5424,49 +5457,39 @@ static int got_skip(struct drbd_connection *connection, struct packet_info *pi)
return 0;
}

-static int connection_finish_peer_reqs(struct drbd_connection *connection)
+struct meta_sock_cmd {
+ size_t pkt_size;
+ int (*fn)(struct drbd_connection *connection, struct packet_info *);
+};
+
+static void set_rcvtimeo(struct drbd_connection *connection, bool ping_timeout)
{
- struct drbd_peer_device *peer_device;
- int vnr, not_empty = 0;
+ long t;
+ struct net_conf *nc;

- do {
- clear_bit(SIGNAL_ASENDER, &connection->flags);
- flush_signals(current);
+ rcu_read_lock();
+ nc = rcu_dereference(connection->net_conf);
+ t = ping_timeout ? nc->ping_timeo : nc->ping_int;
+ rcu_read_unlock();

- rcu_read_lock();
- idr_for_each_entry(&connection->peer_devices, peer_device, vnr) {
- struct drbd_device *device = peer_device->device;
- kref_get(&device->kref);
- rcu_read_unlock();
- if (drbd_finish_peer_reqs(device)) {
- kref_put(&device->kref, drbd_destroy_device);
- return 1;
- }
- kref_put(&device->kref, drbd_destroy_device);
- rcu_read_lock();
- }
- set_bit(SIGNAL_ASENDER, &connection->flags);
+ t *= HZ;
+ if (ping_timeout)
+ t /= 10;

- spin_lock_irq(&connection->resource->req_lock);
- idr_for_each_entry(&connection->peer_devices, peer_device, vnr) {
- struct drbd_device *device = peer_device->device;
- not_empty = !list_empty(&device->done_ee);
- if (not_empty)
- break;
- }
- spin_unlock_irq(&connection->resource->req_lock);
- rcu_read_unlock();
- } while (not_empty);
+ connection->meta.socket->sk->sk_rcvtimeo = t;
+}

- return 0;
+static void set_ping_timeout(struct drbd_connection *connection)
+{
+ set_rcvtimeo(connection, 1);
}

-struct asender_cmd {
- size_t pkt_size;
- int (*fn)(struct drbd_connection *connection, struct packet_info *);
-};
+static void set_idle_timeout(struct drbd_connection *connection)
+{
+ set_rcvtimeo(connection, 0);
+}

-static struct asender_cmd asender_tbl[] = {
+static struct meta_sock_cmd ack_receiver_tbl[] = {
[P_PING] = { 0, got_Ping },
[P_PING_ACK] = { 0, got_PingAck },
[P_RECV_ACK] = { sizeof(struct p_block_ack), got_BlockAck },
@@ -5489,61 +5512,37 @@ static struct asender_cmd asender_tbl[] = {
int drbd_ack_receiver(struct drbd_thread *thi)
{
struct drbd_connection *connection = thi->connection;
- struct asender_cmd *cmd = NULL;
+ struct meta_sock_cmd *cmd = NULL;
struct packet_info pi;
+ unsigned long pre_recv_jif;
int rv;
void *buf = connection->meta.rbuf;
int received = 0;
unsigned int header_size = drbd_header_size(connection);
int expect = header_size;
bool ping_timeout_active = false;
- struct net_conf *nc;
- int ping_timeo, tcp_cork, ping_int;
struct sched_param param = { .sched_priority = 2 };

rv = sched_setscheduler(current, SCHED_RR, &param);
if (rv < 0)
- drbd_err(connection, "drbd_asender: ERROR set priority, ret=%d\n", rv);
+ drbd_err(connection, "drbd_ack_receiver: ERROR set priority, ret=%d\n", rv);

while (get_t_state(thi) == RUNNING) {
drbd_thread_current_set_cpu(thi);

- rcu_read_lock();
- nc = rcu_dereference(connection->net_conf);
- ping_timeo = nc->ping_timeo;
- tcp_cork = nc->tcp_cork;
- ping_int = nc->ping_int;
- rcu_read_unlock();
+ conn_reclaim_net_peer_reqs(connection);

if (test_and_clear_bit(SEND_PING, &connection->flags)) {
if (drbd_send_ping(connection)) {
drbd_err(connection, "drbd_send_ping has failed\n");
goto reconnect;
}
- connection->meta.socket->sk->sk_rcvtimeo = ping_timeo * HZ / 10;
+ set_ping_timeout(connection);
ping_timeout_active = true;
}

- /* TODO: conditionally cork; it may hurt latency if we cork without
- much to send */
- if (tcp_cork)
- drbd_tcp_cork(connection->meta.socket);
- if (connection_finish_peer_reqs(connection)) {
- drbd_err(connection, "connection_finish_peer_reqs() failed\n");
- goto reconnect;
- }
- /* but unconditionally uncork unless disabled */
- if (tcp_cork)
- drbd_tcp_uncork(connection->meta.socket);
-
- /* short circuit, recv_msg would return EINTR anyways. */
- if (signal_pending(current))
- continue;
-
+ pre_recv_jif = jiffies;
rv = drbd_recv_short(connection->meta.socket, buf, expect-received, 0);
- clear_bit(SIGNAL_ASENDER, &connection->flags);
-
- flush_signals(current);

/* Note:
* -EINTR (on meta) we got a signal
@@ -5555,7 +5554,6 @@ int drbd_ack_receiver(struct drbd_thread *thi)
* rv < expected: "woken" by signal during receive
* rv == 0 : "connection shut down by peer"
*/
-received_more:
if (likely(rv > 0)) {
received += rv;
buf += rv;
@@ -5577,8 +5575,7 @@ received_more:
} else if (rv == -EAGAIN) {
/* If the data socket received something meanwhile,
* that is good enough: peer is still alive. */
- if (time_after(connection->last_received,
- jiffies - connection->meta.socket->sk->sk_rcvtimeo))
+ if (time_after(connection->last_received, pre_recv_jif))
continue;
if (ping_timeout_active) {
drbd_err(connection, "PingAck did not arrive in time.\n");
@@ -5587,6 +5584,10 @@ received_more:
set_bit(SEND_PING, &connection->flags);
continue;
} else if (rv == -EINTR) {
+ /* maybe drbd_thread_stop(): the while condition will notice.
+ * maybe woken for send_ping: we'll send a ping above,
+ * and change the rcvtimeo */
+ flush_signals(current);
continue;
} else {
drbd_err(connection, "sock_recvmsg returned %d\n", rv);
@@ -5596,8 +5597,8 @@ received_more:
if (received == expect && cmd == NULL) {
if (decode_header(connection, connection->meta.rbuf, &pi))
goto reconnect;
- cmd = &asender_tbl[pi.cmd];
- if (pi.cmd >= ARRAY_SIZE(asender_tbl) || !cmd->fn) {
+ cmd = &ack_receiver_tbl[pi.cmd];
+ if (pi.cmd >= ARRAY_SIZE(ack_receiver_tbl) || !cmd->fn) {
drbd_err(connection, "Unexpected meta packet %s (0x%04x)\n",
cmdname(pi.cmd), pi.cmd);
goto disconnect;
@@ -5620,9 +5621,8 @@ received_more:

connection->last_received = jiffies;

- if (cmd == &asender_tbl[P_PING_ACK]) {
- /* restore idle timeout */
- connection->meta.socket->sk->sk_rcvtimeo = ping_int * HZ;
+ if (cmd == &ack_receiver_tbl[P_PING_ACK]) {
+ set_idle_timeout(connection);
ping_timeout_active = false;
}

@@ -5631,11 +5631,6 @@ received_more:
expect = header_size;
cmd = NULL;
}
- if (test_bit(SEND_PING, &connection->flags))
- continue;
- rv = drbd_recv_short(connection->meta.socket, buf, expect-received, MSG_DONTWAIT);
- if (rv > 0)
- goto received_more;
}

if (0) {
@@ -5647,9 +5642,41 @@ reconnect:
disconnect:
conn_request_state(connection, NS(conn, C_DISCONNECTING), CS_HARD);
}
- clear_bit(SIGNAL_ASENDER, &connection->flags);

- drbd_info(connection, "asender terminated\n");
+ drbd_info(connection, "ack_receiver terminated\n");

return 0;
}
+
+void drbd_send_acks_wf(struct work_struct *ws)
+{
+ struct drbd_peer_device *peer_device =
+ container_of(ws, struct drbd_peer_device, send_acks_work);
+ struct drbd_connection *connection = peer_device->connection;
+ struct drbd_device *device = peer_device->device;
+ struct net_conf *nc;
+ int tcp_cork, err;
+
+ rcu_read_lock();
+ nc = rcu_dereference(connection->net_conf);
+ tcp_cork = nc->tcp_cork;
+ rcu_read_unlock();
+
+ if (tcp_cork)
+ drbd_tcp_cork(connection->meta.socket);
+
+ err = drbd_finish_peer_reqs(device);
+ kref_put(&device->kref, drbd_destroy_device);
+ /* get is in drbd_endio_write_sec_final(). That is necessary to keep the
+ struct work_struct send_acks_work alive, which is in the peer_device object */
+
+ if (err) {
+ conn_request_state(connection, NS(conn, C_NETWORK_FAILURE), CS_HARD);
+ return;
+ }
+
+ if (tcp_cork)
+ drbd_tcp_uncork(connection->meta.socket);
+
+ return;
+}
diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
index 3c8ade7..03d5b09 100644
--- a/drivers/block/drbd/drbd_req.c
+++ b/drivers/block/drbd/drbd_req.c
@@ -452,7 +452,7 @@ static void mod_rq_state(struct drbd_request *req, struct bio_and_error *m,
kref_get(&req->kref); /* wait for the DONE */

if (!(s & RQ_NET_SENT) && (set & RQ_NET_SENT)) {
- /* potentially already completed in the asender thread */
+ /* potentially already completed in the ack_receiver thread */
if (!(s & RQ_NET_DONE)) {
atomic_add(req->i.size >> 9, &device->ap_in_flight);
set_if_null_req_not_net_done(peer_device, req);
diff --git a/drivers/block/drbd/drbd_worker.c b/drivers/block/drbd/drbd_worker.c
index 02fad6c..fe18c66 100644
--- a/drivers/block/drbd/drbd_worker.c
+++ b/drivers/block/drbd/drbd_worker.c
@@ -113,6 +113,7 @@ void drbd_endio_write_sec_final(struct drbd_peer_request *peer_req) __releases(l
unsigned long flags = 0;
struct drbd_peer_device *peer_device = peer_req->peer_device;
struct drbd_device *device = peer_device->device;
+ struct drbd_connection *connection = peer_device->connection;
struct drbd_interval i;
int do_wake;
u64 block_id;
@@ -145,6 +146,12 @@ void drbd_endio_write_sec_final(struct drbd_peer_request *peer_req) __releases(l
* ((peer_req->flags & (EE_WAS_ERROR|EE_IS_TRIM)) == EE_WAS_ERROR) */
if (peer_req->flags & EE_WAS_ERROR)
__drbd_chk_io_error(device, DRBD_WRITE_ERROR);
+
+ if (connection->cstate >= C_WF_REPORT_PARAMS) {
+ kref_get(&device->kref); /* put is in drbd_send_acks_wf() */
+ if (!queue_work(connection->ack_sender, &peer_device->send_acks_work))
+ kref_put(&device->kref, drbd_destroy_device);
+ }
spin_unlock_irqrestore(&device->resource->req_lock, flags);

if (block_id == ID_SYNCER)
@@ -156,7 +163,6 @@ void drbd_endio_write_sec_final(struct drbd_peer_request *peer_req) __releases(l
if (do_al_complete_io)
drbd_al_complete_io(device, &i);

- wake_asender(peer_device->connection);
put_ldev(device);
}

--
1.9.1

2015-08-04 12:57:07

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 03/19] drbd: prevent NULL pointer deref when resuming diskless primary

From: Lars Ellenberg <[email protected]>

In a multiple error scenario, we may end up with a "frozen" Primary,
that has no access to any data (no local disk, no replication link).

If we then resume-io, we try to generate a new data generation id,
which will fail if there is no longer a local disk.

Double check for available local data,
which prevents the NULL pointer deref.

If we are diskless, turn the resume-io in this situation
into the first stage of a "force down", by bumping the "effective" data
gen id, which will prevent later attach or connect to the former data
set without first being demoted (deconfigured).

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_nl.c | 25 ++++++++++++++++++++++++-
1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 0152f0f..66e8acd 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -2920,7 +2920,30 @@ int drbd_adm_resume_io(struct sk_buff *skb, struct genl_info *info)
mutex_lock(&adm_ctx.resource->adm_mutex);
device = adm_ctx.device;
if (test_bit(NEW_CUR_UUID, &device->flags)) {
- drbd_uuid_new_current(device);
+ if (get_ldev_if_state(device, D_ATTACHING)) {
+ drbd_uuid_new_current(device);
+ put_ldev(device);
+ } else {
+ /* This is effectively a multi-stage "forced down".
+ * The NEW_CUR_UUID bit is supposedly only set, if we
+ * lost the replication connection, and are configured
+ * to freeze IO and wait for some fence-peer handler.
+ * So we still don't have a replication connection.
+ * And now we don't have a local disk either. After
+ * resume, we will fail all pending and new IO, because
+ * we don't have any data anymore. Which means we will
+ * eventually be able to terminate all users of this
+ * device, and then take it down. By bumping the
+ * "effective" data uuid, we make sure that you really
+ * need to tear down before you reconfigure, we will
+ * the refuse to re-connect or re-attach (because no
+ * matching real data uuid exists).
+ */
+ u64 val;
+ get_random_bytes(&val, sizeof(u64));
+ drbd_set_ed_uuid(device, val);
+ drbd_warn(device, "Resumed without access to data; please tear down before attempting to re-configure.\n");
+ }
clear_bit(NEW_CUR_UUID, &device->flags);
}
drbd_suspend_io(device);
--
1.9.1

2015-08-04 13:02:23

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 04/19] drbd: debugfs: expose ed_data_gen_id

From: Lars Ellenberg <[email protected]>

The effective data generation ID may be interesting for debugging
purposes of scenarios involving diskless states.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_debugfs.c | 10 ++++++++++
drivers/block/drbd/drbd_int.h | 1 +
2 files changed, 11 insertions(+)

diff --git a/drivers/block/drbd/drbd_debugfs.c b/drivers/block/drbd/drbd_debugfs.c
index 6b88a35..96a0107 100644
--- a/drivers/block/drbd/drbd_debugfs.c
+++ b/drivers/block/drbd/drbd_debugfs.c
@@ -771,6 +771,13 @@ static int device_data_gen_id_show(struct seq_file *m, void *ignored)
return 0;
}

+static int device_ed_gen_id_show(struct seq_file *m, void *ignored)
+{
+ struct drbd_device *device = m->private;
+ seq_printf(m, "0x%016llX\n", (unsigned long long)device->ed_uuid);
+ return 0;
+}
+
#define drbd_debugfs_device_attr(name) \
static int device_ ## name ## _open(struct inode *inode, struct file *file) \
{ \
@@ -796,6 +803,7 @@ drbd_debugfs_device_attr(oldest_requests)
drbd_debugfs_device_attr(act_log_extents)
drbd_debugfs_device_attr(resync_extents)
drbd_debugfs_device_attr(data_gen_id)
+drbd_debugfs_device_attr(ed_gen_id)

void drbd_debugfs_device_add(struct drbd_device *device)
{
@@ -839,6 +847,7 @@ void drbd_debugfs_device_add(struct drbd_device *device)
DCF(act_log_extents);
DCF(resync_extents);
DCF(data_gen_id);
+ DCF(ed_gen_id);
#undef DCF
return;

@@ -854,6 +863,7 @@ void drbd_debugfs_device_cleanup(struct drbd_device *device)
drbd_debugfs_remove(&device->debugfs_vol_act_log_extents);
drbd_debugfs_remove(&device->debugfs_vol_resync_extents);
drbd_debugfs_remove(&device->debugfs_vol_data_gen_id);
+ drbd_debugfs_remove(&device->debugfs_vol_ed_gen_id);
drbd_debugfs_remove(&device->debugfs_vol);
}

diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index 9bf3178..249a909 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -835,6 +835,7 @@ struct drbd_device {
struct dentry *debugfs_vol_act_log_extents;
struct dentry *debugfs_vol_resync_extents;
struct dentry *debugfs_vol_data_gen_id;
+ struct dentry *debugfs_vol_ed_gen_id;
#endif

unsigned int vnr; /* volume number within the connection */
--
1.9.1

2015-08-04 13:02:01

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 05/19] drbd: use resource name in workqueue

From: Lars Ellenberg <[email protected]>

Since kernel 3.3, we can use snprintf-style arguments
to create a workqueue.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_main.c | 4 ++--
drivers/block/drbd/drbd_receiver.c | 5 ++++-
2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 99b457b..30c1120 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -2694,8 +2694,8 @@ static int init_submitter(struct drbd_device *device)
{
/* opencoded create_singlethread_workqueue(),
* to be able to say "drbd%d", ..., minor */
- device->submit.wq = alloc_workqueue("drbd%u_submit",
- WQ_UNBOUND | WQ_MEM_RECLAIM, 1, device->minor);
+ device->submit.wq =
+ alloc_ordered_workqueue("drbd%u_submit", WQ_MEM_RECLAIM, device->minor);
if (!device->submit.wq)
return -ENOMEM;

diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index 9b70853..251c862 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -1123,7 +1123,10 @@ randomize:
}

drbd_thread_start(&connection->ack_receiver);
- connection->ack_sender = create_singlethread_workqueue("drbd_ack_sender");
+ /* opencoded create_singlethread_workqueue(),
+ * to be able to use format string arguments */
+ connection->ack_sender =
+ alloc_ordered_workqueue("drbd_as_%s", WQ_MEM_RECLAIM, connection->resource->name);
if (!connection->ack_sender) {
drbd_err(connection, "Failed to create workqueue ack_sender\n");
return 0;
--
1.9.1

2015-08-04 13:00:19

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 06/19] drbd: avoid redefinition of BITS_PER_PAGE

From: Lars Ellenberg <[email protected]>

Apparently we now implicitly get definitions for BITS_PER_PAGE and
BITS_PER_PAGE_MASK from the pid_namespace.h

Instead of renaming our defines, I chose to define only if not yet
defined, but to double check the value if already defined.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_bitmap.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/block/drbd/drbd_bitmap.c b/drivers/block/drbd/drbd_bitmap.c
index 434c77d..227deb0 100644
--- a/drivers/block/drbd/drbd_bitmap.c
+++ b/drivers/block/drbd/drbd_bitmap.c
@@ -479,8 +479,14 @@ void drbd_bm_cleanup(struct drbd_device *device)
* this masks out the remaining bits.
* Returns the number of bits cleared.
*/
+#ifndef BITS_PER_PAGE
#define BITS_PER_PAGE (1UL << (PAGE_SHIFT + 3))
#define BITS_PER_PAGE_MASK (BITS_PER_PAGE - 1)
+#else
+# if BITS_PER_PAGE != (1UL << (PAGE_SHIFT + 3))
+# error "ambiguous BITS_PER_PAGE"
+# endif
+#endif
#define BITS_PER_LONG_MASK (BITS_PER_LONG - 1)
static int bm_clear_surplus(struct drbd_bitmap *b)
{
--
1.9.1

2015-08-04 13:00:15

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 07/19] drbd: use bitmap_weight() helper, don't open code

From: Lars Ellenberg <[email protected]>

Suggested by Akinobu Mita <[email protected]>

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_bitmap.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/block/drbd/drbd_bitmap.c b/drivers/block/drbd/drbd_bitmap.c
index 227deb0..c09c571 100644
--- a/drivers/block/drbd/drbd_bitmap.c
+++ b/drivers/block/drbd/drbd_bitmap.c
@@ -24,7 +24,7 @@

#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

-#include <linux/bitops.h>
+#include <linux/bitmap.h>
#include <linux/vmalloc.h>
#include <linux/string.h>
#include <linux/drbd.h>
@@ -565,21 +565,19 @@ static unsigned long bm_count_bits(struct drbd_bitmap *b)
unsigned long *p_addr;
unsigned long bits = 0;
unsigned long mask = (1UL << (b->bm_bits & BITS_PER_LONG_MASK)) -1;
- int idx, i, last_word;
+ int idx, last_word;

/* all but last page */
for (idx = 0; idx < b->bm_number_of_pages - 1; idx++) {
p_addr = __bm_map_pidx(b, idx);
- for (i = 0; i < LWPP; i++)
- bits += hweight_long(p_addr[i]);
+ bits += bitmap_weight(p_addr, BITS_PER_PAGE);
__bm_unmap(p_addr);
cond_resched();
}
/* last (or only) page */
last_word = ((b->bm_bits - 1) & BITS_PER_PAGE_MASK) >> LN2_BPL;
p_addr = __bm_map_pidx(b, idx);
- for (i = 0; i < last_word; i++)
- bits += hweight_long(p_addr[i]);
+ bits += bitmap_weight(p_addr, last_word * BITS_PER_LONG);
p_addr[last_word] &= cpu_to_lel(mask);
bits += hweight_long(p_addr[last_word]);
/* 32bit arch, may have an unused padding long */
@@ -1434,6 +1432,9 @@ static inline void bm_set_full_words_within_one_page(struct drbd_bitmap *b,
int bits;
int changed = 0;
unsigned long *paddr = kmap_atomic(b->bm_pages[page_nr]);
+
+ /* I think it is more cache line friendly to hweight_long then set to ~0UL,
+ * than to first bitmap_weight() all words, then bitmap_fill() all words */
for (i = first_word; i < last_word; i++) {
bits = hweight_long(paddr[i]);
paddr[i] = ~0UL;
@@ -1643,8 +1644,7 @@ int drbd_bm_e_weight(struct drbd_device *device, unsigned long enr)
int n = e-s;
p_addr = bm_map_pidx(b, bm_word_to_page_idx(b, s));
bm = p_addr + MLPP(s);
- while (n--)
- count += hweight_long(*bm++);
+ count += bitmap_weight(bm, n * BITS_PER_LONG);
bm_unmap(p_addr);
} else {
drbd_err(device, "start offset (%d) too large in drbd_bm_e_weight\n", s);
--
1.9.1

2015-08-04 13:00:17

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 08/19] drbd: fix spurious alert level printk

From: Lars Ellenberg <[email protected]>

When accessing out meta data area on disk, we double check the
plausibility of the requested sector offsets, and are very noisy about
it if they look suspicious.

During initial read of our "superblock", for "external" meta data,
this triggered because the range estimate returned by
drbd_md_last_sector() was still wrong.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_main.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 30c1120..903d22e 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -3271,6 +3271,10 @@ int drbd_md_read(struct drbd_device *device, struct drbd_backing_dev *bdev)
* and read it. */
bdev->md.meta_dev_idx = bdev->disk_conf->meta_dev_idx;
bdev->md.md_offset = drbd_md_ss(bdev);
+ /* Even for (flexible or indexed) external meta data,
+ * initially restrict us to the 4k superblock for now.
+ * Affects the paranoia out-of-range access check in drbd_md_sync_page_io(). */
+ bdev->md.md_size_sect = 8;

if (drbd_md_sync_page_io(device, bdev, bdev->md.md_offset, READ)) {
/* NOTE: can't do normal error processing here as this is
--
1.9.1

2015-08-04 12:59:12

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 09/19] drbd: fix queue limit setup for discard

From: Lars Ellenberg <[email protected]>

We cannot possibly support SECDISCARD, even if all backend devices would
support it: if our peer is currently unreachable, some instance of the
data may obviously still be recoverable.

We did not set discard_granularity at all. We don't really care (yet),
we only pass them on, so for now, set our granularity to one sector.
blkdev_stack_limits() takes care of the rest.

If we decide we cannot support discards,
not only clear the (not user visible) QUEUE_FLAG_DISCARD,
but set both (user visible) discard_granularity and max_discard_sectors
to zero, to avoid confusion with e.g. lsblk -D.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_nl.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 66e8acd..9f60a68 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -1168,21 +1168,19 @@ static void drbd_setup_queue_param(struct drbd_device *device, struct drbd_backi
if (b) {
struct drbd_connection *connection = first_peer_device(device)->connection;

+ q->limits.max_discard_sectors = DRBD_MAX_DISCARD_SECTORS;
+
if (blk_queue_discard(b) &&
(connection->cstate < C_CONNECTED || connection->agreed_features & FF_TRIM)) {
- /* For now, don't allow more than one activity log extent worth of data
- * to be discarded in one go. We may need to rework drbd_al_begin_io()
- * to allow for even larger discard ranges */
- q->limits.max_discard_sectors = DRBD_MAX_DISCARD_SECTORS;
-
+ /* We don't care, stacking below should fix it for the local device.
+ * Whether or not it is a suitable granularity on the remote device
+ * is not our problem, really. If you care, you need to
+ * use devices with similar topology on all peers. */
+ q->limits.discard_granularity = 512;
queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, q);
- /* REALLY? Is stacking secdiscard "legal"? */
- if (blk_queue_secdiscard(b))
- queue_flag_set_unlocked(QUEUE_FLAG_SECDISCARD, q);
} else {
- q->limits.max_discard_sectors = 0;
queue_flag_clear_unlocked(QUEUE_FLAG_DISCARD, q);
- queue_flag_clear_unlocked(QUEUE_FLAG_SECDISCARD, q);
+ q->limits.discard_granularity = 0;
}

blk_queue_stack_limits(q, b);
@@ -1194,6 +1192,12 @@ static void drbd_setup_queue_param(struct drbd_device *device, struct drbd_backi
q->backing_dev_info.ra_pages = b->backing_dev_info.ra_pages;
}
}
+ /* To avoid confusion, if this queue does not support discard, clear
+ * max_discard_sectors, which is what lsblk -D reports to the user. */
+ if (!blk_queue_discard(q)) {
+ q->limits.max_discard_sectors = 0;
+ q->limits.discard_granularity = 0;
+ }
}

void drbd_reconsider_max_bio_size(struct drbd_device *device, struct drbd_backing_dev *bdev)
--
1.9.1

2015-08-04 12:59:14

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 10/19] drbd: make drbd known to lsblk: use bd_link_disk_holder

From: Lars Ellenberg <[email protected]>

lsblk should be able to pick up stacking device driver relations
involving DRBD conveniently.

Even though upstream kernel since 2011 says
"DON'T USE THIS UNLESS YOU'RE ALREADY USING IT."
a new user has been added since (bcache),
which sets the precedences for us to use it as well.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_int.h | 2 +-
drivers/block/drbd/drbd_main.c | 16 +-----
drivers/block/drbd/drbd_nl.c | 121 ++++++++++++++++++++++++++++-----------
drivers/block/drbd/drbd_worker.c | 2 +-
include/linux/drbd.h | 2 +-
5 files changed, 91 insertions(+), 52 deletions(-)

diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index 249a909..e15776e 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -1126,7 +1126,7 @@ extern int drbd_send_ov_request(struct drbd_peer_device *, sector_t sector, int
extern int drbd_send_bitmap(struct drbd_device *device);
extern void drbd_send_sr_reply(struct drbd_peer_device *, enum drbd_state_rv retcode);
extern void conn_send_sr_reply(struct drbd_connection *connection, enum drbd_state_rv retcode);
-extern void drbd_free_ldev(struct drbd_backing_dev *ldev);
+extern void drbd_backing_dev_free(struct drbd_device *device, struct drbd_backing_dev *ldev);
extern void drbd_device_cleanup(struct drbd_device *device);
void drbd_print_uuids(struct drbd_device *device, const char *text);

diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 903d22e..fea97fe 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -1992,7 +1992,7 @@ void drbd_device_cleanup(struct drbd_device *device)
drbd_bm_cleanup(device);
}

- drbd_free_ldev(device->ldev);
+ drbd_backing_dev_free(device, device->ldev);
device->ldev = NULL;

clear_bit(AL_SUSPENDED, &device->flags);
@@ -2171,7 +2171,7 @@ void drbd_destroy_device(struct kref *kref)
if (device->this_bdev)
bdput(device->this_bdev);

- drbd_free_ldev(device->ldev);
+ drbd_backing_dev_free(device, device->ldev);
device->ldev = NULL;

drbd_release_all_peer_reqs(device);
@@ -2965,18 +2965,6 @@ fail:
return err;
}

-void drbd_free_ldev(struct drbd_backing_dev *ldev)
-{
- if (ldev == NULL)
- return;
-
- blkdev_put(ldev->backing_bdev, FMODE_READ | FMODE_WRITE | FMODE_EXCL);
- blkdev_put(ldev->md_bdev, FMODE_READ | FMODE_WRITE | FMODE_EXCL);
-
- kfree(ldev->disk_conf);
- kfree(ldev);
-}
-
static void drbd_free_one_sock(struct drbd_socket *ds)
{
struct socket *s;
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 9f60a68..3925f8e 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -1470,6 +1470,88 @@ success:
return 0;
}

+static struct block_device *open_backing_dev(struct drbd_device *device,
+ const char *bdev_path, void *claim_ptr, bool do_bd_link)
+{
+ struct block_device *bdev;
+ int err = 0;
+
+ bdev = blkdev_get_by_path(bdev_path,
+ FMODE_READ | FMODE_WRITE | FMODE_EXCL, claim_ptr);
+ if (IS_ERR(bdev)) {
+ drbd_err(device, "open(\"%s\") failed with %ld\n",
+ bdev_path, PTR_ERR(bdev));
+ return bdev;
+ }
+
+ if (!do_bd_link)
+ return bdev;
+
+ err = bd_link_disk_holder(bdev, device->vdisk);
+ if (err) {
+ blkdev_put(bdev, FMODE_READ | FMODE_WRITE | FMODE_EXCL);
+ drbd_err(device, "bd_link_disk_holder(\"%s\", ...) failed with %d\n",
+ bdev_path, err);
+ bdev = ERR_PTR(err);
+ }
+ return bdev;
+}
+
+static int open_backing_devices(struct drbd_device *device,
+ struct disk_conf *new_disk_conf,
+ struct drbd_backing_dev *nbc)
+{
+ struct block_device *bdev;
+
+ bdev = open_backing_dev(device, new_disk_conf->backing_dev, device, true);
+ if (IS_ERR(bdev))
+ return ERR_OPEN_DISK;
+ nbc->backing_bdev = bdev;
+
+ /*
+ * meta_dev_idx >= 0: external fixed size, possibly multiple
+ * drbd sharing one meta device. TODO in that case, paranoia
+ * check that [md_bdev, meta_dev_idx] is not yet used by some
+ * other drbd minor! (if you use drbd.conf + drbdadm, that
+ * should check it for you already; but if you don't, or
+ * someone fooled it, we need to double check here)
+ */
+ bdev = open_backing_dev(device, new_disk_conf->meta_dev,
+ /* claim ptr: device, if claimed exclusively; shared drbd_m_holder,
+ * if potentially shared with other drbd minors */
+ (new_disk_conf->meta_dev_idx < 0) ? (void*)device : (void*)drbd_m_holder,
+ /* avoid double bd_claim_by_disk() for the same (source,target) tuple,
+ * as would happen with internal metadata. */
+ (new_disk_conf->meta_dev_idx != DRBD_MD_INDEX_FLEX_INT &&
+ new_disk_conf->meta_dev_idx != DRBD_MD_INDEX_INTERNAL));
+ if (IS_ERR(bdev))
+ return ERR_OPEN_MD_DISK;
+ nbc->md_bdev = bdev;
+ return NO_ERROR;
+}
+
+static void close_backing_dev(struct drbd_device *device, struct block_device *bdev,
+ bool do_bd_unlink)
+{
+ if (!bdev)
+ return;
+ if (do_bd_unlink)
+ bd_unlink_disk_holder(bdev, device->vdisk);
+ blkdev_put(bdev, FMODE_READ | FMODE_WRITE | FMODE_EXCL);
+}
+
+void drbd_backing_dev_free(struct drbd_device *device, struct drbd_backing_dev *ldev)
+{
+ if (ldev == NULL)
+ return;
+
+ close_backing_dev(device, ldev->md_bdev, ldev->md_bdev != ldev->backing_bdev);
+ close_backing_dev(device, ldev->backing_bdev, true);
+
+ kfree(ldev->disk_conf);
+ kfree(ldev);
+}
+
int drbd_adm_attach(struct sk_buff *skb, struct genl_info *info)
{
struct drbd_config_context adm_ctx;
@@ -1483,7 +1565,6 @@ int drbd_adm_attach(struct sk_buff *skb, struct genl_info *info)
sector_t min_md_device_sectors;
struct drbd_backing_dev *nbc = NULL; /* new_backing_conf */
struct disk_conf *new_disk_conf = NULL;
- struct block_device *bdev;
struct lru_cache *resync_lru = NULL;
struct fifo_buffer *new_plan = NULL;
union drbd_state ns, os;
@@ -1571,35 +1652,9 @@ int drbd_adm_attach(struct sk_buff *skb, struct genl_info *info)
}
rcu_read_unlock();

- bdev = blkdev_get_by_path(new_disk_conf->backing_dev,
- FMODE_READ | FMODE_WRITE | FMODE_EXCL, device);
- if (IS_ERR(bdev)) {
- drbd_err(device, "open(\"%s\") failed with %ld\n", new_disk_conf->backing_dev,
- PTR_ERR(bdev));
- retcode = ERR_OPEN_DISK;
- goto fail;
- }
- nbc->backing_bdev = bdev;
-
- /*
- * meta_dev_idx >= 0: external fixed size, possibly multiple
- * drbd sharing one meta device. TODO in that case, paranoia
- * check that [md_bdev, meta_dev_idx] is not yet used by some
- * other drbd minor! (if you use drbd.conf + drbdadm, that
- * should check it for you already; but if you don't, or
- * someone fooled it, we need to double check here)
- */
- bdev = blkdev_get_by_path(new_disk_conf->meta_dev,
- FMODE_READ | FMODE_WRITE | FMODE_EXCL,
- (new_disk_conf->meta_dev_idx < 0) ?
- (void *)device : (void *)drbd_m_holder);
- if (IS_ERR(bdev)) {
- drbd_err(device, "open(\"%s\") failed with %ld\n", new_disk_conf->meta_dev,
- PTR_ERR(bdev));
- retcode = ERR_OPEN_MD_DISK;
+ retcode = open_backing_devices(device, new_disk_conf, nbc);
+ if (retcode != NO_ERROR)
goto fail;
- }
- nbc->md_bdev = bdev;

if ((nbc->backing_bdev == nbc->md_bdev) !=
(new_disk_conf->meta_dev_idx == DRBD_MD_INDEX_INTERNAL ||
@@ -1899,12 +1954,8 @@ int drbd_adm_attach(struct sk_buff *skb, struct genl_info *info)
fail:
conn_reconfig_done(connection);
if (nbc) {
- if (nbc->backing_bdev)
- blkdev_put(nbc->backing_bdev,
- FMODE_READ | FMODE_WRITE | FMODE_EXCL);
- if (nbc->md_bdev)
- blkdev_put(nbc->md_bdev,
- FMODE_READ | FMODE_WRITE | FMODE_EXCL);
+ close_backing_dev(device, nbc->md_bdev, nbc->md_bdev != nbc->backing_bdev);
+ close_backing_dev(device, nbc->backing_bdev, true);
kfree(nbc);
}
kfree(new_disk_conf);
diff --git a/drivers/block/drbd/drbd_worker.c b/drivers/block/drbd/drbd_worker.c
index fe18c66..302dec0 100644
--- a/drivers/block/drbd/drbd_worker.c
+++ b/drivers/block/drbd/drbd_worker.c
@@ -1863,7 +1863,7 @@ static void drbd_ldev_destroy(struct drbd_device *device)
device->act_log = NULL;

__acquire(local);
- drbd_free_ldev(device->ldev);
+ drbd_backing_dev_free(device, device->ldev);
device->ldev = NULL;
__release(local);

diff --git a/include/linux/drbd.h b/include/linux/drbd.h
index 392fc0e..d6b3c99 100644
--- a/include/linux/drbd.h
+++ b/include/linux/drbd.h
@@ -51,7 +51,7 @@
#endif

extern const char *drbd_buildtag(void);
-#define REL_VERSION "8.4.5"
+#define REL_VERSION "8.4.6"
#define API_VERSION 1
#define PRO_VERSION_MIN 86
#define PRO_VERSION_MAX 101
--
1.9.1

2015-08-04 12:59:07

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 11/19] lru_cache: Converted lc_seq_printf_status to return void

From: Roland Kammerer <[email protected]>

Fix the semantic of lc_seq_printf. Currently, it always returns 0 and
the return value is unused, therefore, convert the return type to void.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
include/linux/lru_cache.h | 2 +-
lib/lru_cache.c | 4 +---
2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/include/linux/lru_cache.h b/include/linux/lru_cache.h
index 4626228..04fc6e6 100644
--- a/include/linux/lru_cache.h
+++ b/include/linux/lru_cache.h
@@ -264,7 +264,7 @@ extern unsigned int lc_put(struct lru_cache *lc, struct lc_element *e);
extern void lc_committed(struct lru_cache *lc);

struct seq_file;
-extern size_t lc_seq_printf_stats(struct seq_file *seq, struct lru_cache *lc);
+extern void lc_seq_printf_stats(struct seq_file *seq, struct lru_cache *lc);

extern void lc_seq_dump_details(struct seq_file *seq, struct lru_cache *lc, char *utext,
void (*detail) (struct seq_file *, struct lc_element *));
diff --git a/lib/lru_cache.c b/lib/lru_cache.c
index 028f5d9..28ba40b 100644
--- a/lib/lru_cache.c
+++ b/lib/lru_cache.c
@@ -238,7 +238,7 @@ void lc_reset(struct lru_cache *lc)
* @seq: the seq_file to print into
* @lc: the lru cache to print statistics of
*/
-size_t lc_seq_printf_stats(struct seq_file *seq, struct lru_cache *lc)
+void lc_seq_printf_stats(struct seq_file *seq, struct lru_cache *lc)
{
/* NOTE:
* total calls to lc_get are
@@ -250,8 +250,6 @@ size_t lc_seq_printf_stats(struct seq_file *seq, struct lru_cache *lc)
seq_printf(seq, "\t%s: used:%u/%u hits:%lu misses:%lu starving:%lu locked:%lu changed:%lu\n",
lc->name, lc->used, lc->nr_elements,
lc->hits, lc->misses, lc->starving, lc->locked, lc->changed);
-
- return 0;
}

static struct hlist_head *lc_hash_slot(struct lru_cache *lc, unsigned int enr)
--
1.9.1

2015-08-04 12:59:09

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 12/19] drbd: don't block forever in disconnect during resync if fencing=r-a-stonith

From: Lars Ellenberg <[email protected]>

Disconnect should wait for pending bitmap IO.
But if that bitmap IO is not happening, because it is waiting for
pending application IO, and there is no progress, because the fencing
policy suspended application IO because of the disconnect,
then we deadlock.

The bitmap writeout in this case does not care for concurrent
application IO, so there is no point waiting for it.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_main.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index fea97fe..d26ddb6 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -3564,7 +3564,9 @@ void drbd_queue_bitmap_io(struct drbd_device *device,

spin_lock_irq(&device->resource->req_lock);
set_bit(BITMAP_IO, &device->flags);
- if (atomic_read(&device->ap_bio_cnt) == 0) {
+ /* don't wait for pending application IO if the caller indicates that
+ * application IO does not conflict anyways. */
+ if (flags == BM_LOCKED_CHANGE_ALLOWED || atomic_read(&device->ap_bio_cnt) == 0) {
if (!test_and_set_bit(BITMAP_IO_QUEUED, &device->flags))
drbd_queue_work(&first_peer_device(device)->connection->sender_work,
&device->bm_io_work.w);
--
1.9.1

2015-08-04 12:58:07

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 13/19] drbd: fix memory leak in drbd_adm_resize

From: Oleg Drokin <[email protected]>

new_disk_conf could be leaked if the follow on checks fail,
so make sure to free it on error if it was not assigned yet.

Found with smatch.

Signed-off-by: Oleg Drokin <[email protected]>
Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_nl.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 3925f8e..1d42c4f 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -2705,6 +2705,7 @@ int drbd_adm_resize(struct sk_buff *skb, struct genl_info *info)
mutex_unlock(&device->resource->conf_update);
synchronize_rcu();
kfree(old_disk_conf);
+ new_disk_conf = NULL;
}

ddsf = (rs.resize_force ? DDSF_FORCED : 0) | (rs.no_resync ? DDSF_NO_RESYNC : 0);
@@ -2738,6 +2739,7 @@ int drbd_adm_resize(struct sk_buff *skb, struct genl_info *info)

fail_ldev:
put_ldev(device);
+ kfree(new_disk_conf);
goto fail;
}

--
1.9.1

2015-08-04 12:58:05

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 14/19] drbd: fix "endless" transfer log walk in protocol A

From: Lars Ellenberg <[email protected]>

Don't remember a DRBD request as ack_pending, if it is not.

In protocol A, we usually clear RQ_NET_PENDING at the same time we set
RQ_NET_SENT, so when deciding to remember it as ack_pending,
mod_rq_state needs to look at the current request state,
not at the previous state before the current modification was applied.

This should prevent advance_conn_req_ack_pending() from walking the full
transfer log just to find NULL in protocol A, which would cause serious
performance degradation with many "in-flight" requests, e.g. when
working via DRBD-proxy, or with a huge bandwidth-delay product.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_req.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
index 03d5b09..c0ec559 100644
--- a/drivers/block/drbd/drbd_req.c
+++ b/drivers/block/drbd/drbd_req.c
@@ -457,7 +457,7 @@ static void mod_rq_state(struct drbd_request *req, struct bio_and_error *m,
atomic_add(req->i.size >> 9, &device->ap_in_flight);
set_if_null_req_not_net_done(peer_device, req);
}
- if (s & RQ_NET_PENDING)
+ if (req->rq_state & RQ_NET_PENDING)
set_if_null_req_ack_pending(peer_device, req);
}

--
1.9.1

2015-08-04 12:58:09

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 15/19] drbd: make suspend_io() / resume_io() must be thread and recursion safe

Avoid to prematurely resume application IO: don't set/clear a single
bit, but inc/dec an atomic counter.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_int.h | 4 ++--
drivers/block/drbd/drbd_nl.c | 8 +++++---
drivers/block/drbd/drbd_state.c | 2 +-
3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index e15776e..349b147 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -500,7 +500,6 @@ enum {

MD_NO_FUA, /* Users wants us to not use FUA/FLUSH on meta data dev */

- SUSPEND_IO, /* suspend application io */
BITMAP_IO, /* suspend application io;
once no more io in flight, start bitmap io */
BITMAP_IO_QUEUED, /* Started bitmap IO */
@@ -880,6 +879,7 @@ struct drbd_device {
atomic_t rs_pending_cnt; /* RS request/data packets on the wire */
atomic_t unacked_cnt; /* Need to send replies for */
atomic_t local_cnt; /* Waiting for local completion */
+ atomic_t suspend_cnt;

/* Interval tree of pending local requests */
struct rb_root read_requests;
@@ -2263,7 +2263,7 @@ static inline bool may_inc_ap_bio(struct drbd_device *device)

if (drbd_suspended(device))
return false;
- if (test_bit(SUSPEND_IO, &device->flags))
+ if (atomic_read(&device->suspend_cnt))
return false;

/* to avoid potential deadlock or bitmap corruption,
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 1d42c4f..06ef824 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -865,9 +865,11 @@ char *ppsize(char *buf, unsigned long long size)
* and can be long lived.
* This changes an device->flag, is triggered by drbd internals,
* and should be short-lived. */
+/* It needs to be a counter, since multiple threads might
+ independently suspend and resume IO. */
void drbd_suspend_io(struct drbd_device *device)
{
- set_bit(SUSPEND_IO, &device->flags);
+ atomic_inc(&device->suspend_cnt);
if (drbd_suspended(device))
return;
wait_event(device->misc_wait, !atomic_read(&device->ap_bio_cnt));
@@ -875,8 +877,8 @@ void drbd_suspend_io(struct drbd_device *device)

void drbd_resume_io(struct drbd_device *device)
{
- clear_bit(SUSPEND_IO, &device->flags);
- wake_up(&device->misc_wait);
+ if (atomic_dec_and_test(&device->suspend_cnt))
+ wake_up(&device->misc_wait);
}

/**
diff --git a/drivers/block/drbd/drbd_state.c b/drivers/block/drbd/drbd_state.c
index f022e99..5a7ef78 100644
--- a/drivers/block/drbd/drbd_state.c
+++ b/drivers/block/drbd/drbd_state.c
@@ -1484,7 +1484,7 @@ int drbd_bitmap_io_from_worker(struct drbd_device *device,
D_ASSERT(device, current == first_peer_device(device)->connection->worker.task);

/* open coded non-blocking drbd_suspend_io(device); */
- set_bit(SUSPEND_IO, &device->flags);
+ atomic_inc(&device->suspend_cnt);

drbd_bm_lock(device, why, flags);
rv = io_fn(device);
--
1.9.1

2015-08-04 12:57:17

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 16/19] drbd: separate out __al_write_transaction helper function

From: Lars Ellenberg <[email protected]>

To be able to "force out" an activity log transaction,
even if there are no pending updates.

This will be used to relocate the on-disk activity log,
if the on-disk offsets have to be changed,
without the need to empty the activity log first.

While at it, move the definition,
so we can drop the forward declaration of a static helper.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_actlog.c | 304 ++++++++++++++++++++-------------------
1 file changed, 156 insertions(+), 148 deletions(-)

diff --git a/drivers/block/drbd/drbd_actlog.c b/drivers/block/drbd/drbd_actlog.c
index 1318e32..f83dcca 100644
--- a/drivers/block/drbd/drbd_actlog.c
+++ b/drivers/block/drbd/drbd_actlog.c
@@ -288,7 +288,162 @@ bool drbd_al_begin_io_prepare(struct drbd_device *device, struct drbd_interval *
return need_transaction;
}

-static int al_write_transaction(struct drbd_device *device);
+#if (PAGE_SHIFT + 3) < (AL_EXTENT_SHIFT - BM_BLOCK_SHIFT)
+/* Currently BM_BLOCK_SHIFT, BM_EXT_SHIFT and AL_EXTENT_SHIFT
+ * are still coupled, or assume too much about their relation.
+ * Code below will not work if this is violated.
+ * Will be cleaned up with some followup patch.
+ */
+# error FIXME
+#endif
+
+static unsigned int al_extent_to_bm_page(unsigned int al_enr)
+{
+ return al_enr >>
+ /* bit to page */
+ ((PAGE_SHIFT + 3) -
+ /* al extent number to bit */
+ (AL_EXTENT_SHIFT - BM_BLOCK_SHIFT));
+}
+
+static sector_t al_tr_number_to_on_disk_sector(struct drbd_device *device)
+{
+ const unsigned int stripes = device->ldev->md.al_stripes;
+ const unsigned int stripe_size_4kB = device->ldev->md.al_stripe_size_4k;
+
+ /* transaction number, modulo on-disk ring buffer wrap around */
+ unsigned int t = device->al_tr_number % (device->ldev->md.al_size_4k);
+
+ /* ... to aligned 4k on disk block */
+ t = ((t % stripes) * stripe_size_4kB) + t/stripes;
+
+ /* ... to 512 byte sector in activity log */
+ t *= 8;
+
+ /* ... plus offset to the on disk position */
+ return device->ldev->md.md_offset + device->ldev->md.al_offset + t;
+}
+
+static int __al_write_transaction(struct drbd_device *device, struct al_transaction_on_disk *buffer)
+{
+ struct lc_element *e;
+ sector_t sector;
+ int i, mx;
+ unsigned extent_nr;
+ unsigned crc = 0;
+ int err = 0;
+
+ memset(buffer, 0, sizeof(*buffer));
+ buffer->magic = cpu_to_be32(DRBD_AL_MAGIC);
+ buffer->tr_number = cpu_to_be32(device->al_tr_number);
+
+ i = 0;
+
+ /* Even though no one can start to change this list
+ * once we set the LC_LOCKED -- from drbd_al_begin_io(),
+ * lc_try_lock_for_transaction() --, someone may still
+ * be in the process of changing it. */
+ spin_lock_irq(&device->al_lock);
+ list_for_each_entry(e, &device->act_log->to_be_changed, list) {
+ if (i == AL_UPDATES_PER_TRANSACTION) {
+ i++;
+ break;
+ }
+ buffer->update_slot_nr[i] = cpu_to_be16(e->lc_index);
+ buffer->update_extent_nr[i] = cpu_to_be32(e->lc_new_number);
+ if (e->lc_number != LC_FREE)
+ drbd_bm_mark_for_writeout(device,
+ al_extent_to_bm_page(e->lc_number));
+ i++;
+ }
+ spin_unlock_irq(&device->al_lock);
+ BUG_ON(i > AL_UPDATES_PER_TRANSACTION);
+
+ buffer->n_updates = cpu_to_be16(i);
+ for ( ; i < AL_UPDATES_PER_TRANSACTION; i++) {
+ buffer->update_slot_nr[i] = cpu_to_be16(-1);
+ buffer->update_extent_nr[i] = cpu_to_be32(LC_FREE);
+ }
+
+ buffer->context_size = cpu_to_be16(device->act_log->nr_elements);
+ buffer->context_start_slot_nr = cpu_to_be16(device->al_tr_cycle);
+
+ mx = min_t(int, AL_CONTEXT_PER_TRANSACTION,
+ device->act_log->nr_elements - device->al_tr_cycle);
+ for (i = 0; i < mx; i++) {
+ unsigned idx = device->al_tr_cycle + i;
+ extent_nr = lc_element_by_index(device->act_log, idx)->lc_number;
+ buffer->context[i] = cpu_to_be32(extent_nr);
+ }
+ for (; i < AL_CONTEXT_PER_TRANSACTION; i++)
+ buffer->context[i] = cpu_to_be32(LC_FREE);
+
+ device->al_tr_cycle += AL_CONTEXT_PER_TRANSACTION;
+ if (device->al_tr_cycle >= device->act_log->nr_elements)
+ device->al_tr_cycle = 0;
+
+ sector = al_tr_number_to_on_disk_sector(device);
+
+ crc = crc32c(0, buffer, 4096);
+ buffer->crc32c = cpu_to_be32(crc);
+
+ if (drbd_bm_write_hinted(device))
+ err = -EIO;
+ else {
+ bool write_al_updates;
+ rcu_read_lock();
+ write_al_updates = rcu_dereference(device->ldev->disk_conf)->al_updates;
+ rcu_read_unlock();
+ if (write_al_updates) {
+ if (drbd_md_sync_page_io(device, device->ldev, sector, WRITE)) {
+ err = -EIO;
+ drbd_chk_io_error(device, 1, DRBD_META_IO_ERROR);
+ } else {
+ device->al_tr_number++;
+ device->al_writ_cnt++;
+ }
+ }
+ }
+
+ return err;
+}
+
+static int al_write_transaction(struct drbd_device *device)
+{
+ struct al_transaction_on_disk *buffer;
+ int err;
+
+ if (!get_ldev(device)) {
+ drbd_err(device, "disk is %s, cannot start al transaction\n",
+ drbd_disk_str(device->state.disk));
+ return -EIO;
+ }
+
+ /* The bitmap write may have failed, causing a state change. */
+ if (device->state.disk < D_INCONSISTENT) {
+ drbd_err(device,
+ "disk is %s, cannot write al transaction\n",
+ drbd_disk_str(device->state.disk));
+ put_ldev(device);
+ return -EIO;
+ }
+
+ /* protects md_io_buffer, al_tr_cycle, ... */
+ buffer = drbd_md_get_buffer(device, __func__);
+ if (!buffer) {
+ drbd_err(device, "disk failed while waiting for md_io buffer\n");
+ put_ldev(device);
+ return -ENODEV;
+ }
+
+ err = __al_write_transaction(device, buffer);
+
+ drbd_md_put_buffer(device);
+ put_ldev(device);
+
+ return err;
+}
+

void drbd_al_begin_io_commit(struct drbd_device *device)
{
@@ -420,153 +575,6 @@ void drbd_al_complete_io(struct drbd_device *device, struct drbd_interval *i)
wake_up(&device->al_wait);
}

-#if (PAGE_SHIFT + 3) < (AL_EXTENT_SHIFT - BM_BLOCK_SHIFT)
-/* Currently BM_BLOCK_SHIFT, BM_EXT_SHIFT and AL_EXTENT_SHIFT
- * are still coupled, or assume too much about their relation.
- * Code below will not work if this is violated.
- * Will be cleaned up with some followup patch.
- */
-# error FIXME
-#endif
-
-static unsigned int al_extent_to_bm_page(unsigned int al_enr)
-{
- return al_enr >>
- /* bit to page */
- ((PAGE_SHIFT + 3) -
- /* al extent number to bit */
- (AL_EXTENT_SHIFT - BM_BLOCK_SHIFT));
-}
-
-static sector_t al_tr_number_to_on_disk_sector(struct drbd_device *device)
-{
- const unsigned int stripes = device->ldev->md.al_stripes;
- const unsigned int stripe_size_4kB = device->ldev->md.al_stripe_size_4k;
-
- /* transaction number, modulo on-disk ring buffer wrap around */
- unsigned int t = device->al_tr_number % (device->ldev->md.al_size_4k);
-
- /* ... to aligned 4k on disk block */
- t = ((t % stripes) * stripe_size_4kB) + t/stripes;
-
- /* ... to 512 byte sector in activity log */
- t *= 8;
-
- /* ... plus offset to the on disk position */
- return device->ldev->md.md_offset + device->ldev->md.al_offset + t;
-}
-
-int al_write_transaction(struct drbd_device *device)
-{
- struct al_transaction_on_disk *buffer;
- struct lc_element *e;
- sector_t sector;
- int i, mx;
- unsigned extent_nr;
- unsigned crc = 0;
- int err = 0;
-
- if (!get_ldev(device)) {
- drbd_err(device, "disk is %s, cannot start al transaction\n",
- drbd_disk_str(device->state.disk));
- return -EIO;
- }
-
- /* The bitmap write may have failed, causing a state change. */
- if (device->state.disk < D_INCONSISTENT) {
- drbd_err(device,
- "disk is %s, cannot write al transaction\n",
- drbd_disk_str(device->state.disk));
- put_ldev(device);
- return -EIO;
- }
-
- /* protects md_io_buffer, al_tr_cycle, ... */
- buffer = drbd_md_get_buffer(device, __func__);
- if (!buffer) {
- drbd_err(device, "disk failed while waiting for md_io buffer\n");
- put_ldev(device);
- return -ENODEV;
- }
-
- memset(buffer, 0, sizeof(*buffer));
- buffer->magic = cpu_to_be32(DRBD_AL_MAGIC);
- buffer->tr_number = cpu_to_be32(device->al_tr_number);
-
- i = 0;
-
- /* Even though no one can start to change this list
- * once we set the LC_LOCKED -- from drbd_al_begin_io(),
- * lc_try_lock_for_transaction() --, someone may still
- * be in the process of changing it. */
- spin_lock_irq(&device->al_lock);
- list_for_each_entry(e, &device->act_log->to_be_changed, list) {
- if (i == AL_UPDATES_PER_TRANSACTION) {
- i++;
- break;
- }
- buffer->update_slot_nr[i] = cpu_to_be16(e->lc_index);
- buffer->update_extent_nr[i] = cpu_to_be32(e->lc_new_number);
- if (e->lc_number != LC_FREE)
- drbd_bm_mark_for_writeout(device,
- al_extent_to_bm_page(e->lc_number));
- i++;
- }
- spin_unlock_irq(&device->al_lock);
- BUG_ON(i > AL_UPDATES_PER_TRANSACTION);
-
- buffer->n_updates = cpu_to_be16(i);
- for ( ; i < AL_UPDATES_PER_TRANSACTION; i++) {
- buffer->update_slot_nr[i] = cpu_to_be16(-1);
- buffer->update_extent_nr[i] = cpu_to_be32(LC_FREE);
- }
-
- buffer->context_size = cpu_to_be16(device->act_log->nr_elements);
- buffer->context_start_slot_nr = cpu_to_be16(device->al_tr_cycle);
-
- mx = min_t(int, AL_CONTEXT_PER_TRANSACTION,
- device->act_log->nr_elements - device->al_tr_cycle);
- for (i = 0; i < mx; i++) {
- unsigned idx = device->al_tr_cycle + i;
- extent_nr = lc_element_by_index(device->act_log, idx)->lc_number;
- buffer->context[i] = cpu_to_be32(extent_nr);
- }
- for (; i < AL_CONTEXT_PER_TRANSACTION; i++)
- buffer->context[i] = cpu_to_be32(LC_FREE);
-
- device->al_tr_cycle += AL_CONTEXT_PER_TRANSACTION;
- if (device->al_tr_cycle >= device->act_log->nr_elements)
- device->al_tr_cycle = 0;
-
- sector = al_tr_number_to_on_disk_sector(device);
-
- crc = crc32c(0, buffer, 4096);
- buffer->crc32c = cpu_to_be32(crc);
-
- if (drbd_bm_write_hinted(device))
- err = -EIO;
- else {
- bool write_al_updates;
- rcu_read_lock();
- write_al_updates = rcu_dereference(device->ldev->disk_conf)->al_updates;
- rcu_read_unlock();
- if (write_al_updates) {
- if (drbd_md_sync_page_io(device, device->ldev, sector, WRITE)) {
- err = -EIO;
- drbd_chk_io_error(device, 1, DRBD_META_IO_ERROR);
- } else {
- device->al_tr_number++;
- device->al_writ_cnt++;
- }
- }
- }
-
- drbd_md_put_buffer(device);
- put_ldev(device);
-
- return err;
-}
-
static int _try_lc_del(struct drbd_device *device, struct lc_element *al_ext)
{
int rv;
--
1.9.1

2015-08-04 12:57:14

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 17/19] drbd: avoid potential deadlock during handshake

From: Lars Ellenberg <[email protected]>

During handshake communication, we also reconsider our device size,
using drbd_determine_dev_size(). Just in case we need to change the
offsets or layout of our on-disk metadata, we lock out application
and other meta data IO, and wait for the activity log to be "idle"
(no more referenced extents).

If this handshake happens just after a connection loss, with a fencing
policy of "resource-and-stonith", we have frozen IO.

If, additionally, the activity log was "starving" (too many incoming
random writes at that point in time), it won't become idle, ever,
because of the frozen IO, and this would be a lockup of the receiver
thread, and consquentially of DRBD.

Previous logic (re-)initialized with a special "empty" transaction
block, which required the activity log to fully drain first.

Instead, write out some standard activity log transactions.
Using lc_try_lock_for_transaction() instead of lc_try_lock() does not
care about pending activity log references, avoiding the potential
deadlock.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_actlog.c | 19 +++++++++++--------
drivers/block/drbd/drbd_int.h | 2 +-
drivers/block/drbd/drbd_nl.c | 33 +++++++++++++++++++--------------
3 files changed, 31 insertions(+), 23 deletions(-)

diff --git a/drivers/block/drbd/drbd_actlog.c b/drivers/block/drbd/drbd_actlog.c
index f83dcca..5c3f35b 100644
--- a/drivers/block/drbd/drbd_actlog.c
+++ b/drivers/block/drbd/drbd_actlog.c
@@ -614,21 +614,24 @@ void drbd_al_shrink(struct drbd_device *device)
wake_up(&device->al_wait);
}

-int drbd_initialize_al(struct drbd_device *device, void *buffer)
+int drbd_al_initialize(struct drbd_device *device, void *buffer)
{
struct al_transaction_on_disk *al = buffer;
struct drbd_md *md = &device->ldev->md;
- sector_t al_base = md->md_offset + md->al_offset;
int al_size_4k = md->al_stripes * md->al_stripe_size_4k;
int i;

- memset(al, 0, 4096);
- al->magic = cpu_to_be32(DRBD_AL_MAGIC);
- al->transaction_type = cpu_to_be16(AL_TR_INITIALIZED);
- al->crc32c = cpu_to_be32(crc32c(0, al, 4096));
+ __al_write_transaction(device, al);
+ /* There may or may not have been a pending transaction. */
+ spin_lock_irq(&device->al_lock);
+ lc_committed(device->act_log);
+ spin_unlock_irq(&device->al_lock);

- for (i = 0; i < al_size_4k; i++) {
- int err = drbd_md_sync_page_io(device, device->ldev, al_base + i * 8, WRITE);
+ /* The rest of the transactions will have an empty "updates" list, and
+ * are written out only to provide the context, and to initialize the
+ * on-disk ring buffer. */
+ for (i = 1; i < al_size_4k; i++) {
+ int err = __al_write_transaction(device, al);
if (err)
return err;
}
diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index 349b147..9f1ff2b 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -1667,7 +1667,7 @@ extern int __drbd_change_sync(struct drbd_device *device, sector_t sector, int s
#define drbd_rs_failed_io(device, sector, size) \
__drbd_change_sync(device, sector, size, RECORD_RS_FAILED)
extern void drbd_al_shrink(struct drbd_device *device);
-extern int drbd_initialize_al(struct drbd_device *, void *);
+extern int drbd_al_initialize(struct drbd_device *, void *);

/* drbd_nl.c */
/* state info broadcast */
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 06ef824..8650906 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -903,15 +903,14 @@ drbd_determine_dev_size(struct drbd_device *device, enum dds_flags flags, struct
int md_moved, la_size_changed;
enum determine_dev_size rv = DS_UNCHANGED;

- /* race:
- * application request passes inc_ap_bio,
- * but then cannot get an AL-reference.
- * this function later may wait on ap_bio_cnt == 0. -> deadlock.
+ /* We may change the on-disk offsets of our meta data below. Lock out
+ * anything that may cause meta data IO, to avoid acting on incomplete
+ * layout changes or scribbling over meta data that is in the process
+ * of being moved.
*
- * to avoid that:
- * Suspend IO right here.
- * still lock the act_log to not trigger ASSERTs there.
- */
+ * Move is not exactly correct, btw, currently we have all our meta
+ * data in core memory, to "move" it we just write it all out, there
+ * are no reads. */
drbd_suspend_io(device);
buffer = drbd_md_get_buffer(device, __func__); /* Lock meta-data IO */
if (!buffer) {
@@ -919,9 +918,6 @@ drbd_determine_dev_size(struct drbd_device *device, enum dds_flags flags, struct
return DS_ERROR;
}

- /* no wait necessary anymore, actually we could assert that */
- wait_event(device->al_wait, lc_try_lock(device->act_log));
-
prev_first_sect = drbd_md_first_sector(device->ldev);
prev_size = device->ldev->md.md_size_sect;
la_size_sect = device->ldev->md.la_size_sect;
@@ -997,20 +993,29 @@ drbd_determine_dev_size(struct drbd_device *device, enum dds_flags flags, struct
* Clear the timer, to avoid scary "timer expired!" messages,
* "Superblock" is written out at least twice below, anyways. */
del_timer(&device->md_sync_timer);
- drbd_al_shrink(device); /* All extents inactive. */

+ /* We won't change the "al-extents" setting, we just may need
+ * to move the on-disk location of the activity log ringbuffer.
+ * Lock for transaction is good enough, it may well be "dirty"
+ * or even "starving". */
+ wait_event(device->al_wait, lc_try_lock_for_transaction(device->act_log));
+
+ /* mark current on-disk bitmap and activity log as unreliable */
prev_flags = md->flags;
- md->flags &= ~MDF_PRIMARY_IND;
+ md->flags |= MDF_FULL_SYNC | MDF_AL_DISABLED;
drbd_md_write(device, buffer);

+ drbd_al_initialize(device, buffer);
+
drbd_info(device, "Writing the whole bitmap, %s\n",
la_size_changed && md_moved ? "size changed and md moved" :
la_size_changed ? "size changed" : "md moved");
/* next line implicitly does drbd_suspend_io()+drbd_resume_io() */
drbd_bitmap_io(device, md_moved ? &drbd_bm_write_all : &drbd_bm_write,
"size changed", BM_LOCKED_MASK);
- drbd_initialize_al(device, buffer);

+ /* on-disk bitmap and activity log is authoritative again
+ * (unless there was an IO error meanwhile...) */
md->flags = prev_flags;
drbd_md_write(device, buffer);

--
1.9.1

2015-08-04 12:58:03

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 18/19] drbd: fix error path during resize

From: Lars Ellenberg <[email protected]>

In case the lower level device size changed, but some other internal
details of the resize did not work out, drbd_determine_dev_size() would
try to restore the previous settings, trusting
drbd_md_set_sector_offsets() to "do the right thing", but overlooked
that this internally may set the meta data base offset based on device size.

This could end up with incomplete on-disk meta data layout change, and
ultimately lead to data corruption (if the failure was not noticed or
ignored by the operator, and other things go wrong as well).

Just remember all meta data related offsets/sizes,
and on error restore them all.

Signed-off-by: Philipp Reisner <[email protected]>
Signed-off-by: Lars Ellenberg <[email protected]>
---
drivers/block/drbd/drbd_nl.c | 68 +++++++++++++++++++++++++-------------------
1 file changed, 38 insertions(+), 30 deletions(-)

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 8650906..a581fa1 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -891,12 +891,18 @@ void drbd_resume_io(struct drbd_device *device)
enum determine_dev_size
drbd_determine_dev_size(struct drbd_device *device, enum dds_flags flags, struct resize_parms *rs) __must_hold(local)
{
- sector_t prev_first_sect, prev_size; /* previous meta location */
- sector_t la_size_sect, u_size;
+ struct md_offsets_and_sizes {
+ u64 last_agreed_sect;
+ u64 md_offset;
+ s32 al_offset;
+ s32 bm_offset;
+ u32 md_size_sect;
+
+ u32 al_stripes;
+ u32 al_stripe_size_4k;
+ } prev;
+ sector_t u_size, size;
struct drbd_md *md = &device->ldev->md;
- u32 prev_al_stripe_size_4k;
- u32 prev_al_stripes;
- sector_t size;
char ppb[10];
void *buffer;

@@ -918,16 +924,17 @@ drbd_determine_dev_size(struct drbd_device *device, enum dds_flags flags, struct
return DS_ERROR;
}

- prev_first_sect = drbd_md_first_sector(device->ldev);
- prev_size = device->ldev->md.md_size_sect;
- la_size_sect = device->ldev->md.la_size_sect;
+ /* remember current offset and sizes */
+ prev.last_agreed_sect = md->la_size_sect;
+ prev.md_offset = md->md_offset;
+ prev.al_offset = md->al_offset;
+ prev.bm_offset = md->bm_offset;
+ prev.md_size_sect = md->md_size_sect;
+ prev.al_stripes = md->al_stripes;
+ prev.al_stripe_size_4k = md->al_stripe_size_4k;

if (rs) {
/* rs is non NULL if we should change the AL layout only */
-
- prev_al_stripes = md->al_stripes;
- prev_al_stripe_size_4k = md->al_stripe_size_4k;
-
md->al_stripes = rs->al_stripes;
md->al_stripe_size_4k = rs->al_stripe_size / 4;
md->al_size_4k = (u64)rs->al_stripes * rs->al_stripe_size / 4;
@@ -940,7 +947,7 @@ drbd_determine_dev_size(struct drbd_device *device, enum dds_flags flags, struct
rcu_read_unlock();
size = drbd_new_dev_size(device, device->ldev, u_size, flags & DDSF_FORCED);

- if (size < la_size_sect) {
+ if (size < prev.last_agreed_sect) {
if (rs && u_size == 0) {
/* Remove "rs &&" later. This check should always be active, but
right now the receiver expects the permissive behavior */
@@ -961,30 +968,29 @@ drbd_determine_dev_size(struct drbd_device *device, enum dds_flags flags, struct
err = drbd_bm_resize(device, size, !(flags & DDSF_NO_RESYNC));
if (unlikely(err)) {
/* currently there is only one error: ENOMEM! */
- size = drbd_bm_capacity(device)>>1;
+ size = drbd_bm_capacity(device);
if (size == 0) {
drbd_err(device, "OUT OF MEMORY! "
"Could not allocate bitmap!\n");
} else {
drbd_err(device, "BM resizing failed. "
- "Leaving size unchanged at size = %lu KB\n",
- (unsigned long)size);
+ "Leaving size unchanged\n");
}
rv = DS_ERROR;
}
/* racy, see comments above. */
drbd_set_my_capacity(device, size);
- device->ldev->md.la_size_sect = size;
+ md->la_size_sect = size;
drbd_info(device, "size = %s (%llu KB)\n", ppsize(ppb, size>>1),
(unsigned long long)size>>1);
}
if (rv <= DS_ERROR)
goto err_out;

- la_size_changed = (la_size_sect != device->ldev->md.la_size_sect);
+ la_size_changed = (prev.last_agreed_sect != md->la_size_sect);

- md_moved = prev_first_sect != drbd_md_first_sector(device->ldev)
- || prev_size != device->ldev->md.md_size_sect;
+ md_moved = prev.md_offset != md->md_offset
+ || prev.md_size_sect != md->md_size_sect;

if (la_size_changed || md_moved || rs) {
u32 prev_flags;
@@ -1024,20 +1030,22 @@ drbd_determine_dev_size(struct drbd_device *device, enum dds_flags flags, struct
md->al_stripes, md->al_stripe_size_4k * 4);
}

- if (size > la_size_sect)
- rv = la_size_sect ? DS_GREW : DS_GREW_FROM_ZERO;
- if (size < la_size_sect)
+ if (size > prev.last_agreed_sect)
+ rv = prev.last_agreed_sect ? DS_GREW : DS_GREW_FROM_ZERO;
+ if (size < prev.last_agreed_sect)
rv = DS_SHRUNK;

if (0) {
err_out:
- if (rs) {
- md->al_stripes = prev_al_stripes;
- md->al_stripe_size_4k = prev_al_stripe_size_4k;
- md->al_size_4k = (u64)prev_al_stripes * prev_al_stripe_size_4k;
-
- drbd_md_set_sector_offsets(device, device->ldev);
- }
+ /* restore previous offset and sizes */
+ md->la_size_sect = prev.last_agreed_sect;
+ md->md_offset = prev.md_offset;
+ md->al_offset = prev.al_offset;
+ md->bm_offset = prev.bm_offset;
+ md->md_size_sect = prev.md_size_sect;
+ md->al_stripes = prev.al_stripes;
+ md->al_stripe_size_4k = prev.al_stripe_size_4k;
+ md->al_size_4k = (u64)prev.al_stripes * prev.al_stripe_size_4k;
}
lc_unlock(device->act_log);
wake_up(&device->al_wait);
--
1.9.1

2015-08-04 12:57:19

by Philipp Reisner

[permalink] [raw]
Subject: [PATCH 19/19] MAINTAINERS: Updated information for DRBD DRIVER

From: Roland Kammerer <[email protected]>

- Changed obsoleted 'P' to 'M' entries.
- Removed the user related mailing list.
- Changed git repos to current versions

Signed-off-by: Roland Kammerer <[email protected]>
---
MAINTAINERS | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 9289ecb..d5ce1c5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3472,13 +3472,12 @@ F: drivers/scsi/dpt*
F: drivers/scsi/dpt/

DRBD DRIVER
-P: Philipp Reisner
-P: Lars Ellenberg
-M: [email protected]
-L: [email protected]
+M: Philipp Reisner <[email protected]>
+M: Lars Ellenberg <[email protected]>
+L: [email protected]
W: http://www.drbd.org
-T: git git://git.drbd.org/linux-2.6-drbd.git drbd
-T: git git://git.drbd.org/drbd-8.3.git
+T: git git://git.linbit.com/linux-drbd.git
+T: git git://git.linbit.com/drbd-8.4.git
S: Supported
F: drivers/block/drbd/
F: lib/lru_cache.c
--
1.9.1