Date: Fri, 12 Jun 2015 10:04:25 +0200
From: Ingo Molnar
To: Linus Torvalds
Cc: Waiman Long, Thomas Gleixner, Denys Vlasenko, Borislav Petkov,
	Andrew Morton, Oleg Nesterov, Andy Lutomirski,
	linux-mml@vger.kernel.org, Linux Kernel Mailing List, Brian Gerst,
	"H. Peter Anvin", Peter Zijlstra
Subject: Re: [PATCH 07/12] x86/virt/guest/xen: Remove use of pgd_list from the Xen guest code
Message-ID: <20150612080425.GC8759@gmail.com>
References: <1434031637-9091-1-git-send-email-mingo@kernel.org>
	<1434031637-9091-8-git-send-email-mingo@kernel.org>
	<20150612072302.GA7509@gmail.com>

* Linus Torvalds wrote:

> On Jun 12, 2015 00:23, "Ingo Molnar" wrote:
> >
> > We might make it so: but that would mean restricting certain clone_flags
> > variants - not sure that's possible with our current ABI usage?
>
> We already do that. You can't share signal info unless you share the mm. And a
> shared signal state is what defines a thread group.
>
> So I think the only issue is that ->mm can become NULL when the thread group
> leader dies - a non-NULL mm should always be shared among all threads.

Indeed, we do that in exit_mm().
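
To make the leader case concrete, here is a minimal user-space sketch, not
kernel code: struct toy_mm, struct toy_task and toy_group_mm() are made-up
names for illustration only, and no locking or RCU is modelled. It assumes the
property quoted above, that every non-NULL ->mm in a thread group is the same
mm, and finds it even when the leader's pointer has already been cleared:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for mm_struct and task_struct (hypothetical, user-space only). */
struct toy_mm {
	int id;
};

struct toy_task {
	struct toy_mm *mm;	/* NULL once this thread has dropped its mm */
	struct toy_task *next;	/* next thread in the same thread group */
};

/*
 * Walk the thread group starting at 'leader' and return the mm the group
 * shares, or NULL if every thread has already dropped it.
 */
static struct toy_mm *toy_group_mm(struct toy_task *leader)
{
	struct toy_mm *found = NULL;
	struct toy_task *t;

	for (t = leader; t != NULL; t = t->next) {
		if (t->mm == NULL)
			continue;		/* e.g. a dead leader */
		if (found == NULL)
			found = t->mm;
		else
			assert(t->mm == found);	/* a non-NULL mm must be shared */
	}
	return found;
}
```

The walk costs one pass over the threads of the group, which is the price of
not keeping a separate leader pointer around.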

So we could add tsk->mm_leader or so, which does not get cleared and which the
scheduler does not look at, but I'm not sure it's entirely safe that way: we
don't have a refcount, and when the last thread exits it becomes bogus for a
small window until the zombie leader is unlinked from the task list.

To close that race we'd have __mmdrop() or so clear out tsk->mm_leader - but the
task doing the mmdrop() might be a lazy thread totally unrelated to the original
thread group so we don't know which tsk->mm_leader to clear out.

To solve that we'd have to track the leader owning an MM in mm_struct - which
gets interesting for the exec() case where the thread group gets a new leader,
so we'd have to re-link the mm's leader pointer there.

So unless I missed some simpler solution there are a good number of steps where
this could go wrong, in small-looking race windows - how about we just live with
iterating through all tasks instead of just all processes, once per 512 GB of
memory mapped?

Thanks,

	Ingo