Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932440AbcDYMUJ (ORCPT ); Mon, 25 Apr 2016 08:20:09 -0400 Received: from zimbra13.linbit.com ([212.69.166.240]:39615 "EHLO zimbra13.linbit.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932312AbcDYMQ2 (ORCPT ); Mon, 25 Apr 2016 08:16:28 -0400 From: Philipp Reisner To: Jens Axboe , linux-kernel@vger.kernel.org Cc: drbd-dev@lists.linbit.com, Lars Ellenberg , Philipp Reisner Subject: [PATCH 15/30] drbd: finish resync on sync source only by notification from sync target Date: Mon, 25 Apr 2016 14:07:42 +0200 Message-Id: <1461586077-11581-16-git-send-email-philipp.reisner@linbit.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1461586077-11581-1-git-send-email-philipp.reisner@linbit.com> References: <1461586077-11581-1-git-send-email-philipp.reisner@linbit.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2786 Lines: 77 From: Lars Ellenberg If the replication link breaks exactly during "resync finished" detection, finishing too early on the sync source could again lead to UUIDs rotated too fast, and potentially a spurious full resync on next handshake. Always wait for explicit resync finished state change notification from the sync target. Signed-off-by: Philipp Reisner Signed-off-by: Lars Ellenberg --- drivers/block/drbd/drbd_actlog.c | 16 ++++++++++++---- drivers/block/drbd/drbd_int.h | 19 ++++++++++++++----- 2 files changed, 26 insertions(+), 9 deletions(-) diff --git a/drivers/block/drbd/drbd_actlog.c b/drivers/block/drbd/drbd_actlog.c index 1664762..4e07cff 100644 --- a/drivers/block/drbd/drbd_actlog.c +++ b/drivers/block/drbd/drbd_actlog.c @@ -768,10 +768,18 @@ static bool lazy_bitmap_update_due(struct drbd_device *device) static void maybe_schedule_on_disk_bitmap_update(struct drbd_device *device, bool rs_done) { - if (rs_done) - set_bit(RS_DONE, &device->flags); - /* and also set RS_PROGRESS below */ - else if (!lazy_bitmap_update_due(device)) + if (rs_done) { + struct drbd_connection *connection = first_peer_device(device)->connection; + if (connection->agreed_pro_version <= 95 || + is_sync_target_state(device->state.conn)) + set_bit(RS_DONE, &device->flags); + /* and also set RS_PROGRESS below */ + + /* Else: rather wait for explicit notification via receive_state, + * to avoid uuids-rotated-too-fast causing full resync + * in next handshake, in case the replication link breaks + * at the most unfortunate time... */ + } else if (!lazy_bitmap_update_due(device)) return; drbd_device_post_work(device, RS_PROGRESS); diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h index d82e531..451a745 100644 --- a/drivers/block/drbd/drbd_int.h +++ b/drivers/block/drbd/drbd_int.h @@ -2102,13 +2102,22 @@ static inline void _sub_unacked(struct drbd_device *device, int n, const char *f ERR_IF_CNT_IS_NEGATIVE(unacked_cnt, func, line); } +static inline bool is_sync_target_state(enum drbd_conns connection_state) +{ + return connection_state == C_SYNC_TARGET || + connection_state == C_PAUSED_SYNC_T; +} + +static inline bool is_sync_source_state(enum drbd_conns connection_state) +{ + return connection_state == C_SYNC_SOURCE || + connection_state == C_PAUSED_SYNC_S; +} + static inline bool is_sync_state(enum drbd_conns connection_state) { - return - (connection_state == C_SYNC_SOURCE - || connection_state == C_SYNC_TARGET - || connection_state == C_PAUSED_SYNC_S - || connection_state == C_PAUSED_SYNC_T); + return is_sync_source_state(connection_state) || + is_sync_target_state(connection_state); } /** -- 1.9.1