Received: by 2002:ac0:a5b6:0:0:0:0:0 with SMTP id m51-v6csp2222809imm; Thu, 7 Jun 2018 07:18:20 -0700 (PDT) X-Google-Smtp-Source: ADUXVKItVA0XBIWPpjel+Y13G9ak8gNbNz3UtV3FGEYOVZSuyYJtYazW/fZs4Plyd3SQmEWj4N4y X-Received: by 2002:a17:902:7888:: with SMTP id q8-v6mr2235326pll.79.1528381100627; Thu, 07 Jun 2018 07:18:20 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1528381100; cv=none; d=google.com; s=arc-20160816; b=GflPml+Wd84bWiFKn5paaMTk3z1bdkYPYMmAI0m+mCnrt/Ow7oi3BJIavWAGqo4bgb cKTXH/P7r8B2R7E+NwWHGe+RRs+I8t26JOD3oWesVZVn9nKxNbxIEDG0kBfC5B2VKzoC xZOi9wnG1Dx3lPjn3T7bCenAYJ/0gOva6i5hmYAo2rY0d5k/vcZyifZNENh8uZU61WJ4 CsW/0hp84/MlJQ0vBwmr3lMjjXHT46drClL0mxoY2s/WA660oG3RYBgW7Qv3gQLyBpdg IH64B1XNaOVHr9SLrmOqdmKJw9JAFVlZo0p/Y082mPDL7GoibHD/aT8TMdTTzfkbYOd8 lH+w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:subject:message-id:date:cc:to :from:mime-version:content-transfer-encoding:content-disposition :arc-authentication-results; bh=pYHJr+SaSbi5YwQ0lmsVzT5KmOqpNbw+p8+H+4jm0sY=; b=eQFmhboXuMjUfPUIBfp/moR1QV3pR3czLBMXxXpKQ4neM/vsBnUoJaWMmURf/7ZWVk E/dgGhWvTcJYPf/aZDEyjVTkZK5eSC82Lmt3oKH8VBAGkil8+1mjDxwG8kA9iwqI5Jc1 lOOuhOFMldWUZSL2OXS6DiiGVXvLZH12Gvg7odl6d8d6wYvTZmpbGI8w/6MFbhfs1cdV RCBJ24oGLt1epV47ouv1Bd87as1zYGtDlkKgL6XNLv7yZ51tfPIBcLHjXG/I90IjthiG 8pdd3t5ggs3qHqJ02cuOXR8JIYlPcgSvdmAY34f9lvb1CO3msH/pxZPEzx2iFrbIq7JF 568A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id r67-v6si13706963pfr.134.2018.06.07.07.18.06; Thu, 07 Jun 2018 07:18:20 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933320AbeFGOQT (ORCPT + 99 others); Thu, 7 Jun 2018 10:16:19 -0400 Received: from shadbolt.e.decadent.org.uk ([88.96.1.126]:39592 "EHLO shadbolt.e.decadent.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933103AbeFGOJP (ORCPT ); Thu, 7 Jun 2018 10:09:15 -0400 Received: from [148.252.241.226] (helo=deadeye) by shadbolt.decadent.org.uk with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1fQvbF-0005hL-QP; Thu, 07 Jun 2018 15:09:13 +0100 Received: from ben by deadeye with local (Exim 4.91) (envelope-from ) id 1fQvbC-0003FC-OX; Thu, 07 Jun 2018 15:09:10 +0100 Content-Type: text/plain; charset="UTF-8" Content-Disposition: inline Content-Transfer-Encoding: 8bit MIME-Version: 1.0 From: Ben Hutchings To: linux-kernel@vger.kernel.org, stable@vger.kernel.org CC: akpm@linux-foundation.org, "Oleg Nesterov" , "Benjamin LaHaise" Date: Thu, 07 Jun 2018 15:05:21 +0100 Message-ID: X-Mailer: LinuxStableQueue (scripts by bwh) Subject: [PATCH 3.16 354/410] aio: change exit_aio() to load mm->ioctx_table once and avoid rcu_read_lock() In-Reply-To: X-SA-Exim-Connect-IP: 148.252.241.226 X-SA-Exim-Mail-From: ben@decadent.org.uk X-SA-Exim-Scanned: No (on shadbolt.decadent.org.uk); SAEximRunCond expanded to false Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org 3.16.57-rc1 review patch. If anyone has any objections, please let me know. ------------------ From: Oleg Nesterov commit 4b70ac5fd9b58bfaa5f25b4ea48f528aefbf3308 upstream. On 04/30, Benjamin LaHaise wrote: > > > - ctx->mmap_size = 0; > > - > > - kill_ioctx(mm, ctx, NULL); > > + if (ctx) { > > + ctx->mmap_size = 0; > > + kill_ioctx(mm, ctx, NULL); > > + } > > Rather than indenting and moving the two lines changing mmap_size and the > kill_ioctx() call, why not just do "if (!ctx) ... continue;"? That reduces > the number of lines changed and avoid excessive indentation. OK. To me the code looks better/simpler with "if (ctx)", but this is subjective of course, I won't argue. The patch still removes the empty line between mmap_size = 0 and kill_ioctx(), we reset mmap_size only for kill_ioctx(). But feel free to remove this change. ------------------------------------------------------------------------------- Subject: [PATCH v3 1/2] aio: change exit_aio() to load mm->ioctx_table once and avoid rcu_read_lock() 1. We can read ->ioctx_table only once and we do not read rcu_read_lock() or even rcu_dereference(). This mm has no users, nobody else can play with ->ioctx_table. Otherwise the code is buggy anyway, if we need rcu_read_lock() in a loop because ->ioctx_table can be updated then kfree(table) is obviously wrong. 2. Update the comment. "exit_mmap(mm) is coming" is the good reason to avoid munmap(), but another reason is that we simply can't do vm_munmap() unless current->mm == mm and this is not true in general, the caller is mmput(). 3. We do not really need to nullify mm->ioctx_table before return, probably the current code does this to catch the potential problems. But in this case RCU_INIT_POINTER(NULL) looks better. Signed-off-by: Oleg Nesterov Signed-off-by: Benjamin LaHaise [bwh: Backported to 3.16: Adjust context to apply after backport of commit 6098b45b32e6 "aio: block exit_aio() until all context requests are completed"] Signed-off-by: Ben Hutchings --- --- a/fs/aio.c +++ b/fs/aio.c @@ -803,46 +803,35 @@ EXPORT_SYMBOL(wait_on_sync_kiocb); */ void exit_aio(struct mm_struct *mm) { - struct kioctx_table *table; - struct kioctx *ctx; - unsigned i = 0; + struct kioctx_table *table = rcu_dereference_raw(mm->ioctx_table); + int i; - while (1) { + if (!table) + return; + + for (i = 0; i < table->nr; ++i) { + struct kioctx *ctx = table->table[i]; struct completion requests_done = COMPLETION_INITIALIZER_ONSTACK(requests_done); - rcu_read_lock(); - table = rcu_dereference(mm->ioctx_table); - - do { - if (!table || i >= table->nr) { - rcu_read_unlock(); - rcu_assign_pointer(mm->ioctx_table, NULL); - if (table) - kfree(table); - return; - } - - ctx = table->table[i++]; - } while (!ctx); - - rcu_read_unlock(); - + if (!ctx) + continue; /* - * We don't need to bother with munmap() here - - * exit_mmap(mm) is coming and it'll unmap everything. - * Since aio_free_ring() uses non-zero ->mmap_size - * as indicator that it needs to unmap the area, - * just set it to 0; aio_free_ring() is the only - * place that uses ->mmap_size, so it's safe. + * We don't need to bother with munmap() here - exit_mmap(mm) + * is coming and it'll unmap everything. And we simply can't, + * this is not necessarily our ->mm. + * Since kill_ioctx() uses non-zero ->mmap_size as indicator + * that it needs to unmap the area, just set it to 0. */ ctx->mmap_size = 0; - kill_ioctx(mm, ctx, &requests_done); /* Wait until all IO for the context are done. */ wait_for_completion(&requests_done); } + + RCU_INIT_POINTER(mm->ioctx_table, NULL); + kfree(table); } static void put_reqs_available(struct kioctx *ctx, unsigned nr)