Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752412AbbFNUz0 (ORCPT ); Sun, 14 Jun 2015 16:55:26 -0400
Received: from mx1.redhat.com ([209.132.183.28]:49689 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751752AbbFNUzU (ORCPT ); Sun, 14 Jun 2015 16:55:20 -0400
Date: Sun, 14 Jun 2015 22:54:12 +0200
From: Oleg Nesterov
To: Ingo Molnar
Cc: linux-kernel@vger.kernel.org, linux-mml@vger.kernel.org,
	Andy Lutomirski, Andrew Morton, Denys Vlasenko, Brian Gerst,
	Peter Zijlstra, Borislav Petkov, "H. Peter Anvin", Linus Torvalds,
	Thomas Gleixner, Waiman Long
Subject: Re: [PATCH 06/12] x86/mm: Enable and use the arch_pgd_init_late() method
Message-ID: <20150614205412.GC19582@redhat.com>
References: <1434031637-9091-1-git-send-email-mingo@kernel.org>
	<1434031637-9091-7-git-send-email-mingo@kernel.org>
	<20150612225000.GA24699@redhat.com>
	<20150613064705.GA13835@gmail.com>
	<20150613065255.GA16018@gmail.com>
	<20150613174527.GA29379@redhat.com>
	<20150614081352.GA3446@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150614081352.GA3446@gmail.com>
User-Agent: Mutt/1.5.18 (2008-05-17)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1641
Lines: 55

On 06/14, Ingo Molnar wrote:
>
> So since we have a spin_lock() there already,

Yeeeees, I thought about task_lock() or pgd_lock too.

> Also, since this is x86 specific code we could rely on the fact that
> spinlock-acquire is a full memory barrier?

we do not really need the full barrier if we rely on spinlock_t,
we can rely on acquire+release semantics.

Let's forget about exec_mmap().
If we add, say,

	// or unlock_wait() + barriers
	task_lock(current->group_leader);
	task_unlock(current->group_leader);

at the start of arch_pgd_init_late() we will fix the problems with
fork() even if pgd_none() below can leak into the critical section.

We rely on the fact that find_lock_task_mm() does lock/unlock too
and always starts with the group leader. If sync_global_pgds() takes
this lock first, we must see the change in *PGD after task_unlock().
Actually right after task_lock().

Otherwise, sync_global_pgds() should see the result of list addition
if it takes this (the same) ->group_leader->alloc_lock after us.

But this is not nice, and exec_mmap() calls arch_pgd_init_late() under
task_lock().

So, unless you are going to remove pgd_lock altogether perhaps we can
rely on it the same way

	mb();
	spin_unlock_wait(&pgd_lock);
	rmb();

Avoids the barriers (and comments) on another side, but I can't say
I really like this...

So I won't argue with 2 mb's on both sides.

Oleg.