From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Ilya Dryomov, Jason Dillaman
Subject: [PATCH 4.16 087/113] libceph: reschedule a tick in finish_hunting()
Date: Mon, 30 Apr 2018 12:24:58 -0700
Message-Id: <20180430184018.874669650@linuxfoundation.org>
In-Reply-To: <20180430184015.043892819@linuxfoundation.org>
References: <20180430184015.043892819@linuxfoundation.org>

4.16-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ilya Dryomov

commit 7b4c443d139f1d2b5570da475f7a9cbcef86740c upstream.

If we go without an established session for a while, backoff delay will
climb to 30 seconds.  The keepalive timeout is also 30 seconds, so it's
pretty easily hit after a prolonged hunting for a monitor: we don't get
a chance to send out a keepalive in time, which means we never get back
a keepalive ack in time, cutting an established session and attempting
to connect to a different monitor every 30 seconds:

  [Sun Apr 1 23:37:05 2018] libceph: mon0 10.80.20.99:6789 session established
  [Sun Apr 1 23:37:36 2018] libceph: mon0 10.80.20.99:6789 session lost, hunting for new mon
  [Sun Apr 1 23:37:36 2018] libceph: mon2 10.80.20.103:6789 session established
  [Sun Apr 1 23:38:07 2018] libceph: mon2 10.80.20.103:6789 session lost, hunting for new mon
  [Sun Apr 1 23:38:07 2018] libceph: mon1 10.80.20.100:6789 session established
  [Sun Apr 1 23:38:37 2018] libceph: mon1 10.80.20.100:6789 session lost, hunting for new mon
  [Sun Apr 1 23:38:37 2018] libceph: mon2 10.80.20.103:6789 session established
  [Sun Apr 1 23:39:08 2018] libceph: mon2 10.80.20.103:6789 session lost, hunting for new mon

The regular keepalive interval is 10 seconds.  After ->hunting is
cleared in finish_hunting(), call __schedule_delayed() to ensure we
send out a keepalive after 10 seconds.

Cc: stable@vger.kernel.org # 4.7+
Link: http://tracker.ceph.com/issues/23537
Signed-off-by: Ilya Dryomov
Reviewed-by: Jason Dillaman
Signed-off-by: Greg Kroah-Hartman

---
 net/ceph/mon_client.c |    1 +
 1 file changed, 1 insertion(+)

--- a/net/ceph/mon_client.c
+++ b/net/ceph/mon_client.c
@@ -1133,6 +1133,7 @@ static void finish_hunting(struct ceph_m
 		monc->hunting = false;
 		monc->had_a_connection = true;
 		un_backoff(monc);
+		__schedule_delayed(monc);
 	}
 }
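
The timing argument in the commit message can be sketched as a small
standalone model. This is not kernel code: first_tick_delay() and
session_survives() are hypothetical names, the constants are just the
10 and 30 second values quoted above, and the model makes the
simplifying assumption that a session is cut whenever the first tick
after session establishment fires no earlier than the keepalive timeout.

```c
#include <assert.h>
#include <stdbool.h>

/* Values quoted in the commit message, in seconds. */
enum {
	KEEPALIVE_INTERVAL_S = 10,
	KEEPALIVE_TIMEOUT_S = 30,
};

/*
 * Hypothetical helper (not a kernel function): delay until the first
 * tick after a session is established.
 *
 * pending_delay_s: delay of the tick that was armed while still hunting
 *                  (up to 30 seconds once backoff has climbed).
 * reschedule:      true models the patched finish_hunting(), which
 *                  re-arms the tick with the regular keepalive interval.
 */
int first_tick_delay(int pending_delay_s, bool reschedule)
{
	assert(pending_delay_s >= 0);
	return reschedule ? KEEPALIVE_INTERVAL_S : pending_delay_s;
}

/*
 * Simplified assumption: the session survives only if the first tick,
 * and with it the first keepalive, happens before the timeout elapses.
 */
bool session_survives(int pending_delay_s, bool reschedule)
{
	return first_tick_delay(pending_delay_s, reschedule) < KEEPALIVE_TIMEOUT_S;
}
```

Under these assumptions, session_survives(30, false) is false (the first
tick lands at 30 seconds, matching the roughly 30 second session losses
in the log), while session_survives(30, true) is true because the first
keepalive goes out after 10 seconds.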