Date: Tue, 29 Apr 2014 16:42:17 -0400
From: Benjamin LaHaise
To: Oleg Nesterov
Cc: Andrew Morton, Kent Overstreet, Al Viro, linux-aio@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/1] aio: change exit_aio() to load mm->ioctx_table once and avoid rcu_read_lock()
Message-ID: <20140429204217.GO14608@kvack.org>
References: <20140429183915.GA32513@redhat.com> <20140429184004.GB32521@redhat.com>
In-Reply-To: <20140429184004.GB32521@redhat.com>

On Tue, Apr 29, 2014 at 08:40:04PM +0200, Oleg Nesterov wrote:
> 1. We can read ->ioctx_table only once and we do not need rcu_read_lock()
>    or even rcu_dereference().
>
>    This mm has no users, nobody else can play with ->ioctx_table. Otherwise
>    the code is buggy anyway, if we need rcu_read_lock() in a loop because
>    ->ioctx_table can be updated then kfree(table) is obviously wrong.
>
> 2. Update the comment. "exit_mmap(mm) is coming" is the good reason to avoid
>    munmap(), but another reason is that we simply can't do vm_munmap() unless
>    current->mm == mm and this is not true in general, the caller is mmput().
>
> Signed-off-by: Oleg Nesterov

Your patch does not apply because it is whitespace damaged.  Please resend
and verify that it applies with 'git am'.

		-ben

> ---
>  fs/aio.c |   47 ++++++++++++++++++-----------------------------
>  1 files changed, 18 insertions(+), 29 deletions(-)
>
> diff --git a/fs/aio.c b/fs/aio.c
> index 12a3de0..5fd1fe7 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -777,40 +777,29 @@ EXPORT_SYMBOL(wait_on_sync_kiocb);
>   */
>  void exit_aio(struct mm_struct *mm)
>  {
> -	struct kioctx_table *table;
> -	struct kioctx *ctx;
> -	unsigned i = 0;
> -
> -	while (1) {
> -		rcu_read_lock();
> -		table = rcu_dereference(mm->ioctx_table);
> -
> -		do {
> -			if (!table || i >= table->nr) {
> -				rcu_read_unlock();
> -				rcu_assign_pointer(mm->ioctx_table, NULL);
> -				if (table)
> -					kfree(table);
> -				return;
> -			}
> -
> -			ctx = table->table[i++];
> -		} while (!ctx);
> +	struct kioctx_table *table = rcu_dereference_raw(mm->ioctx_table);
> +	int i;
>
> -		rcu_read_unlock();
> +	if (!table)
> +		return;
>
> +	for (i = 0; i < table->nr; ++i) {
> +		struct kioctx *ctx = table->table[i];
>  		/*
> -		 * We don't need to bother with munmap() here -
> -		 * exit_mmap(mm) is coming and it'll unmap everything.
> -		 * Since aio_free_ring() uses non-zero ->mmap_size
> -		 * as indicator that it needs to unmap the area,
> -		 * just set it to 0; aio_free_ring() is the only
> -		 * place that uses ->mmap_size, so it's safe.
> +		 * We don't need to bother with munmap() here - exit_mmap(mm)
> +		 * is coming and it'll unmap everything. And we simply can't,
> +		 * this is not necessarily our ->mm.
> +		 * Since kill_ioctx() uses non-zero ->mmap_size as indicator
> +		 * that it needs to unmap the area, just set it to 0.
>  		 */
> -		ctx->mmap_size = 0;
> -
> -		kill_ioctx(mm, ctx);
> +		if (ctx) {
> +			ctx->mmap_size = 0;
> +			kill_ioctx(mm, ctx);
> +		}
>  	}
> +
> +	rcu_assign_pointer(mm->ioctx_table, NULL);
> +	kfree(table);
>  }
>
>  static void put_reqs_available(struct kioctx *ctx, unsigned nr)
> --
> 1.5.5.1
>

-- 
"Thought is the essence of where you are now."
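
For anyone following the thread without a kernel tree at hand, the control
flow the quoted patch is aiming for can be modelled in user space. This is
only a rough sketch under stated assumptions: model_ctx, model_table,
model_mm, model_kill_ctx() and model_table_alloc() are hypothetical stand-ins
invented for illustration, not the kernel's struct kioctx, struct
kioctx_table, struct mm_struct or kill_ioctx(), and no RCU is involved. It
shows the shape of the loop only: read the table pointer once, skip empty
slots, zero ->mmap_size before teardown, then clear the pointer and free the
table.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; these are NOT the kernel types. */
struct model_ctx {
	unsigned long mmap_size;	/* non-zero would mean "ring still mapped" */
	int killed;			/* set once teardown has been requested */
};

struct model_table {
	unsigned nr;			/* number of slots in table[] */
	struct model_ctx *table[];	/* slots may be NULL */
};

struct model_mm {
	struct model_table *ioctx_table;
};

/* Allocate a table with nr empty (NULL) slots; returns NULL on failure. */
struct model_table *model_table_alloc(unsigned nr)
{
	struct model_table *t;

	t = calloc(1, sizeof(*t) + nr * sizeof(t->table[0]));
	if (t)
		t->nr = nr;
	return t;
}

/* Stand-in for kill_ioctx(): only records that teardown was requested. */
void model_kill_ctx(struct model_ctx *ctx)
{
	ctx->killed = 1;
}

/*
 * Model of the simplified teardown: the caller is assumed to be the last
 * user of mm, so the table pointer is loaded exactly once and no locking
 * is modelled here.
 */
void model_exit_aio(struct model_mm *mm)
{
	struct model_table *table;
	unsigned i;

	assert(mm != NULL);

	table = mm->ioctx_table;
	if (!table)
		return;

	for (i = 0; i < table->nr; ++i) {
		struct model_ctx *ctx = table->table[i];

		if (!ctx)
			continue;

		/* Tell the teardown path there is nothing left to unmap. */
		ctx->mmap_size = 0;
		model_kill_ctx(ctx);
	}

	mm->ioctx_table = NULL;
	free(table);
}
```

Calling model_exit_aio() a second time on the same model_mm is a no-op,
because the pointer has already been cleared. That mirrors the early return
on a NULL table in the quoted diff.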