Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp3100717imu; Wed, 7 Nov 2018 05:10:08 -0800 (PST) X-Google-Smtp-Source: AJdET5dPREhPE0G1Kt1qdZnRSusIr5Ir1b/NQxY6XqfmtscFCA/COGaHmBuC6vSWD1tprhB973xc X-Received: by 2002:a63:ae4d:: with SMTP id e13-v6mr109279pgp.315.1541596208025; Wed, 07 Nov 2018 05:10:08 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1541596207; cv=none; d=google.com; s=arc-20160816; b=O1rwDIgXGR7U/IWMPkroL1EsLwMujhIt4KDeZTevUnnJ2ponfBgfEUZeLXRd6WioWA Ub6eieHC6MSSkp5BzXrlx+SLKb7R/A2EjXD3VZHWEuPfwy0QR7L1dlPhsuHaUREQS+Dx 8PDuHCawivaf2aD5nkO8jkSD71E9xH59XlzX8IRnyGH0/vtF7Y6CHRo6uAFdWiqHhF3N gGju9eOjoSDiAlbjy1fckkw2BCyyoVVcjo2AT4vr4OFq/ng9GvYrNqMTwGqVDGyyBSl7 wYWC6SvoptSZ266MkRPG5h/fWCBPNyR+KtKgvL9C/zZ3Q0A2wdDUy7OvnHRcmZ3qNJwY FxLQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:dkim-signature; bh=LO8LVs4QnK5pY1XerlzRfE3ckOfZjyQ5f4s67TviexY=; b=UihhouoxS+UUJT4Atk7CQ4Ri3vhsp7HaDIKoDhRV+ZinY7NsEIvSBhvUpkJyyWwBYZ GdSFEaL9akJSzf9aLft4mhFNV3PwWzlla6HpIAPw5yDN8ztkVO6SJGThfrehnv/DBPbf axfdJ0Rx7SOItbsbRxRDMZtKcjpmhyaan4kZiw9pUj+2YfVnNuyWzgvUfE+w8RB3AVNj PPR+Jm10do/JfS5Qzpr8kwRnxLPunVbyMLQ02osmDj/jBmddPUwHmW8X23WRaj52aK3H D4Ghxup3J/qyxOXroDuvmN5xFYNNwoHaEJxW8ovV85RasfyGudn7XZzC5jotAUOmJFxV t4+w== ARC-Authentication-Results: i=1; mx.google.com; dkim=temperror (no key for signature) header.i=@szeredi.hu header.s=google header.b=NZfS58F3; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id l62-v6si576237pfc.114.2018.11.07.05.09.52; Wed, 07 Nov 2018 05:10:07 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=temperror (no key for signature) header.i=@szeredi.hu header.s=google header.b=NZfS58F3; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726886AbeKGWjo (ORCPT + 99 others); Wed, 7 Nov 2018 17:39:44 -0500 Received: from mail-io1-f68.google.com ([209.85.166.68]:42994 "EHLO mail-io1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726635AbeKGWjn (ORCPT ); Wed, 7 Nov 2018 17:39:43 -0500 Received: by mail-io1-f68.google.com with SMTP id h19-v6so11814315iog.9 for ; Wed, 07 Nov 2018 05:09:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=szeredi.hu; s=google; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=LO8LVs4QnK5pY1XerlzRfE3ckOfZjyQ5f4s67TviexY=; b=NZfS58F3QTwytkXCvAhKkgYWvQNA0cK2SZVKZ05CmNyBOvxA0v6apYv1Q/5uuLQXSP rAbjS9ZQB5W6I9KNdIACUNW/AnMhxp8waIi0wuJkI6WpBiIMfOsOkJ3/MEt8+ljrXhkc nRDk1tDWrzsyU980kdAFBxxfZGcuo2DVym5hY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=LO8LVs4QnK5pY1XerlzRfE3ckOfZjyQ5f4s67TviexY=; b=DwLrvHRSVXASFimphKsXQhKm9ktjpVdLoXAcwcsSHqe3kfVC+pD77LkoamK1rTbJ4j zZ0sSXWVkY3/oB1iL86i1mpO1G8GX9MfL/ULYX6wJSXWzjuoAyuRRcgEAEFA/5zBIsdy jI/xWi1Q8dOKfMcKNM5iRNo3buxTrlJcEylL6+L2nK0hA1OKrQ8RMbiJNTakQfs9jlx0 lXu245HGpgeqvH5ee8uSS9lv1+NDP68mdLtWFpOzzI8WLcqjldEI4TTv5zqnYJEJjtg/ fm2C2GS+pwZEVBz6fRbH/yMVvwXDdBSzv2PuKEPz57zx4N39g/HcmJsaqzYPKgFSo9Pc QtLg== X-Gm-Message-State: AGRZ1gJ7EzuQ8OcpQGsUKueCtgrcLdlZX6IPvkI+3OQH09rS7V/QWgRi uvbJHThmQV6lqk+FaUSfFbsstmc898nLhNi5VfdXtVcb X-Received: by 2002:a6b:15c5:: with SMTP id 188-v6mr121993iov.246.1541596163708; Wed, 07 Nov 2018 05:09:23 -0800 (PST) MIME-Version: 1.0 Received: by 2002:a6b:ac42:0:0:0:0:0 with HTTP; Wed, 7 Nov 2018 05:09:23 -0800 (PST) X-Originating-IP: [212.96.48.140] In-Reply-To: <154149663862.17764.9649077325029198892.stgit@localhost.localdomain> References: <154149586524.17764.5252013294539109287.stgit@localhost.localdomain> <154149663862.17764.9649077325029198892.stgit@localhost.localdomain> From: Miklos Szeredi Date: Wed, 7 Nov 2018 14:09:23 +0100 Message-ID: Subject: Re: [PATCH 2/6] fuse: Optimize request_end() by not taking fiq->waitq.lock To: Kirill Tkhai Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Nov 6, 2018 at 10:30 AM, Kirill Tkhai wrote: > We take global fiq->waitq.lock every time, when we are > in this function, but interrupted requests are just small > subset of all requests. This patch optimizes request_end() > and makes it to take the lock when it's really needed. > > queue_interrupt() needs small change for that. After req > is linked to interrupt list, we do smp_mb() and check > for FR_FINISHED again. In case of FR_FINISHED bit has > appeared, we remove req and leave the function: > > CPU 0 CPU 1 > queue_interrupt() request_end() > > spin_lock(&fiq->waitq.lock) > > > list_add_tail(&req->intr_entry, &fiq->interrupts) test_and_set_bit(FR_FINISHED, &req->flags) > smp_mb() > if (test_bit(FR_FINISHED, &req->flags)) if (!list_empty(&req->intr_entry)) > > list_del_init(&req->intr_entry) spin_lock(&fiq->waitq.lock) > list_del_init(&req->intr_entry) > > Check the change is visible in perf report: > > 1)Firstly mount fusexmp_fh: > $fuse/example/.libs/fusexmp_fh mnt > > 2)Run test doing > futimes(fd, tv1); > futimes(fd, tv2); > in many threads endlessly. > > 3)perf record -g (all the system load) > > Without the patch in request_end() we spend 62.58% of do_write() time: > (= 12.58 / 20.10 * 100%) > > 55,22% entry_SYSCALL_64 > 20,10% do_writev > ... > 18,08% fuse_dev_do_write > 12,58% request_end > 10,08% __wake_up_common_lock > 1,97% queued_spin_lock_slowpath > 1,31% fuse_copy_args > 1,04% fuse_copy_one > 0,85% queued_spin_lock_slowpath > > With the patch, the perf report becomes better, and only 58.16% > of do_write() time we spend in request_end(): > > 54,15% entry_SYSCALL_64 > 18,24% do_writev > ... > 16,25% fuse_dev_do_write > 10,61% request_end > 10,25% __wake_up_common_lock > 1,34% fuse_copy_args > 1,06% fuse_copy_one > 0,88% queued_spin_lock_slowpath > > Signed-off-by: Kirill Tkhai > --- > fs/fuse/dev.c | 30 ++++++++++++++++++++++-------- > 1 file changed, 22 insertions(+), 8 deletions(-) > > diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c > index 7705f75c77a3..391498e680ec 100644 > --- a/fs/fuse/dev.c > +++ b/fs/fuse/dev.c > @@ -427,10 +427,16 @@ static void request_end(struct fuse_conn *fc, struct fuse_req *req) > > if (test_and_set_bit(FR_FINISHED, &req->flags)) > goto put_request; > - > - spin_lock(&fiq->waitq.lock); > - list_del_init(&req->intr_entry); > - spin_unlock(&fiq->waitq.lock); > + /* > + * test_and_set_bit() implies smp_mb() between bit > + * changing and below intr_entry check. Pairs with > + * smp_mb() from queue_interrupt(). > + */ > + if (!list_empty(&req->intr_entry)) { > + spin_lock(&fiq->waitq.lock); > + list_del_init(&req->intr_entry); > + spin_unlock(&fiq->waitq.lock); > + } > WARN_ON(test_bit(FR_PENDING, &req->flags)); > WARN_ON(test_bit(FR_SENT, &req->flags)); > if (test_bit(FR_BACKGROUND, &req->flags)) { > @@ -470,13 +476,21 @@ static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req) > { > bool kill = false; > > - spin_lock(&fiq->waitq.lock); > - if (test_bit(FR_FINISHED, &req->flags)) { > - spin_unlock(&fiq->waitq.lock); > + if (test_bit(FR_FINISHED, &req->flags)) The only case this test would make sense if this was a performance sensitive path, which it's not. > return; > - } > + spin_lock(&fiq->waitq.lock); > if (list_empty(&req->intr_entry)) { > list_add_tail(&req->intr_entry, &fiq->interrupts); > + /* > + * Pairs with smp_mb() implied by test_and_set_bit() > + * from request_end(). > + */ > + smp_mb(); > + if (test_bit(FR_FINISHED, &req->flags)) { > + list_del_init(&req->intr_entry); > + spin_unlock(&fiq->waitq.lock); > + return; > + } > wake_up_locked(&fiq->waitq); > kill = true; > } >