From: Ben Hutchings
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
CC: akpm@linux-foundation.org, "Jens Axboe", "Jeff Moyer"
Date: Thu, 07 Jun 2018 15:05:21 +0100
Subject: [PATCH 3.16 356/410] aio: fix serial draining in exit_aio()

3.16.57-rc1 review patch.
If anyone has any objections, please let me know.

------------------

From: Jens Axboe

commit dc48e56d761610da4ea1088d1bea0a030b8e3e43 upstream.

exit_aio() currently serializes killing io contexts. Each context
killing ends up having to do percpu_ref_kill(), which in turn has
to wait for an RCU grace period. This can take a long time, depending
on the number of contexts. And there's no point in doing them serially,
when we could be waiting for all of them in one fell swoop.

This patch makes my fio thread offload test case exit in 0.2s instead
of almost 6s.

Reviewed-by: Jeff Moyer
Signed-off-by: Jens Axboe
Signed-off-by: Ben Hutchings
---
 fs/aio.c | 45 ++++++++++++++++++++++++++++++---------------
 1 file changed, 30 insertions(+), 15 deletions(-)

--- a/fs/aio.c
+++ b/fs/aio.c
@@ -77,6 +77,11 @@ struct kioctx_cpu {
 	unsigned reqs_available;
 };
 
+struct ctx_rq_wait {
+	struct completion comp;
+	atomic_t count;
+};
+
 struct kioctx {
 	struct percpu_ref users;
 	atomic_t dead;
@@ -115,7 +120,7 @@ struct kioctx {
 	/*
 	 * signals when all in-flight requests are done
 	 */
-	struct completion *requests_done;
+	struct ctx_rq_wait *rq_wait;
 
 	struct {
 		/*
@@ -523,8 +528,8 @@ static void free_ioctx_reqs(struct percp
 	struct kioctx *ctx = container_of(ref, struct kioctx, reqs);
 
 	/* At this point we know that there are no any in-flight requests */
-	if (ctx->requests_done)
-		complete(ctx->requests_done);
+	if (ctx->rq_wait && atomic_dec_and_test(&ctx->rq_wait->count))
+		complete(&ctx->rq_wait->comp);
 
 	INIT_WORK(&ctx->free_work, free_ioctx);
 	schedule_work(&ctx->free_work);
@@ -735,7 +740,7 @@ err:
  * the rapid destruction of the kioctx.
  */
 static int kill_ioctx(struct mm_struct *mm, struct kioctx *ctx,
-		struct completion *requests_done)
+		struct ctx_rq_wait *wait)
 {
 	struct kioctx_table *table;
 
@@ -764,7 +769,7 @@ static int kill_ioctx(struct mm_struct *
 	if (ctx->mmap_size)
 		vm_munmap(ctx->mmap_base, ctx->mmap_size);
 
-	ctx->requests_done = requests_done;
+	ctx->rq_wait = wait;
 	percpu_ref_kill(&ctx->users);
 	return 0;
 }
@@ -796,18 +801,24 @@ EXPORT_SYMBOL(wait_on_sync_kiocb);
 void exit_aio(struct mm_struct *mm)
 {
 	struct kioctx_table *table = rcu_dereference_raw(mm->ioctx_table);
-	int i;
+	struct ctx_rq_wait wait;
+	int i, skipped;
 
 	if (!table)
 		return;
 
+	atomic_set(&wait.count, table->nr);
+	init_completion(&wait.comp);
+
+	skipped = 0;
 	for (i = 0; i < table->nr; ++i) {
 		struct kioctx *ctx = table->table[i];
-		struct completion requests_done =
-			COMPLETION_INITIALIZER_ONSTACK(requests_done);
 
-		if (!ctx)
+		if (!ctx) {
+			skipped++;
 			continue;
+		}
+
 		/*
 		 * We don't need to bother with munmap() here - exit_mmap(mm)
 		 * is coming and it'll unmap everything. And we simply can't,
@@ -816,10 +827,12 @@ void exit_aio(struct mm_struct *mm)
 		 * that it needs to unmap the area, just set it to 0.
 		 */
 		ctx->mmap_size = 0;
-		kill_ioctx(mm, ctx, &requests_done);
+		kill_ioctx(mm, ctx, &wait);
+	}
 
+	if (!atomic_sub_and_test(skipped, &wait.count)) {
 		/* Wait until all IO for the context are done. */
-		wait_for_completion(&requests_done);
+		wait_for_completion(&wait.comp);
 	}
 
 	RCU_INIT_POINTER(mm->ioctx_table, NULL);
@@ -1299,15 +1312,17 @@ SYSCALL_DEFINE1(io_destroy, aio_context_
 {
 	struct kioctx *ioctx = lookup_ioctx(ctx);
 	if (likely(NULL != ioctx)) {
-		struct completion requests_done =
-			COMPLETION_INITIALIZER_ONSTACK(requests_done);
+		struct ctx_rq_wait wait;
 		int ret;
 
+		init_completion(&wait.comp);
+		atomic_set(&wait.count, 1);
+
 		/* Pass requests_done to kill_ioctx() where it can be set
 		 * in a thread-safe way. If we try to set it here then we have
 		 * a race condition if two io_destroy() called simultaneously.
 		 */
-		ret = kill_ioctx(current->mm, ioctx, &requests_done);
+		ret = kill_ioctx(current->mm, ioctx, &wait);
 		percpu_ref_put(&ioctx->users);
 
 		/* Wait until all IO for the context are done. Otherwise kernel
@@ -1315,7 +1330,7 @@ SYSCALL_DEFINE1(io_destroy, aio_context_
 		 * is destroyed.
 		 */
 		if (!ret)
-			wait_for_completion(&requests_done);
+			wait_for_completion(&wait.comp);
 
 		return ret;
 	}
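
For readers who want to see the counting scheme in isolation, here is a
rough userspace sketch. It is not kernel code and not part of the patch.
It assumes POSIX threads and C11 atomics in place of struct completion
and atomic_t, and the names rq_wait, fake_ctx, teardown() and drain_all()
are made up for illustration. The idea mirrors exit_aio() above: set the
count to the table size, start every teardown without waiting, subtract
the empty slots, and wait once only if that subtraction did not reach
zero.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace stand-in for the patch's struct ctx_rq_wait (hypothetical). */
struct rq_wait {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	int done;		/* plays the role of struct completion */
	atomic_int count;	/* teardowns still outstanding */
};

/* Hypothetical context; it only carries what the sketch needs. */
struct fake_ctx {
	struct rq_wait *wait;
	pthread_t thread;
};

static void rq_wait_init(struct rq_wait *w, int count)
{
	pthread_mutex_init(&w->lock, NULL);
	pthread_cond_init(&w->cond, NULL);
	w->done = 0;
	atomic_init(&w->count, count);
}

static void rq_wait_complete(struct rq_wait *w)
{
	pthread_mutex_lock(&w->lock);
	w->done = 1;
	pthread_cond_signal(&w->cond);
	pthread_mutex_unlock(&w->lock);
}

static void rq_wait_wait(struct rq_wait *w)
{
	pthread_mutex_lock(&w->lock);
	while (!w->done)
		pthread_cond_wait(&w->cond, &w->lock);
	pthread_mutex_unlock(&w->lock);
}

/* Like free_ioctx_reqs(): whoever drops the count to zero completes. */
static void *teardown(void *arg)
{
	struct fake_ctx *ctx = arg;

	/* Real teardown work (the slow part) would happen here. */
	if (atomic_fetch_sub(&ctx->wait->count, 1) == 1)
		rq_wait_complete(ctx->wait);
	return NULL;
}

/* Like exit_aio(): start all teardowns, then wait once. Returns the
 * number of contexts that were torn down. */
static int drain_all(struct fake_ctx **table, int nr)
{
	struct rq_wait wait;
	int i, skipped = 0;

	rq_wait_init(&wait, nr);
	for (i = 0; i < nr; i++) {
		int rc;

		if (!table[i]) {
			skipped++;
			continue;
		}
		table[i]->wait = &wait;
		rc = pthread_create(&table[i]->thread, NULL, teardown, table[i]);
		assert(rc == 0);
		(void)rc;
	}

	/* Same test as !atomic_sub_and_test(skipped, &wait.count):
	 * fetch_sub returns the old value, so the result is zero
	 * exactly when the old value equals skipped. */
	if (atomic_fetch_sub(&wait.count, skipped) != skipped)
		rq_wait_wait(&wait);

	/* Join before the on-stack wait object goes out of scope. */
	for (i = 0; i < nr; i++)
		if (table[i])
			pthread_join(table[i]->thread, NULL);
	pthread_cond_destroy(&wait.cond);
	pthread_mutex_destroy(&wait.lock);
	return nr - skipped;
}
```

Calling drain_all() on a five-slot table with two NULL entries starts
three teardown threads and blocks at most once, whichever order the
threads finish in. The joins exist only because the sketch keeps the
wait object on the stack; they are not meant to model anything in the
kernel path.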