Message-ID: <5579E9B4.7080601@oracle.com>
Date: Thu, 11 Jun 2015 16:04:04 -0400
From: Boris Ostrovsky
To: Ingo Molnar, linux-kernel@vger.kernel.org
CC: linux-mml@vger.kernel.org, Andy Lutomirski, Andrew Morton,
    Denys Vlasenko, Brian Gerst, Peter Zijlstra, Borislav Petkov,
    "H. Peter Anvin", Linus Torvalds, Oleg Nesterov, Thomas Gleixner,
    Waiman Long, Konrad Rzeszutek Wilk, David Vrabel
Subject: Re: [PATCH 06/12] x86/mm: Enable and use the arch_pgd_init_late() method
References: <1434031637-9091-1-git-send-email-mingo@kernel.org>
    <1434031637-9091-7-git-send-email-mingo@kernel.org>
In-Reply-To: <1434031637-9091-7-git-send-email-mingo@kernel.org>

On 06/11/2015 10:07 AM, Ingo Molnar wrote:
> diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
> index fb0a9dd1d6e4..e0bf90470d70 100644
> --- a/arch/x86/mm/pgtable.c
> +++ b/arch/x86/mm/pgtable.c
> @@ -391,6 +391,63 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
>  	return NULL;
>  }
>
> +/*
> + * Initialize the kernel portion of the PGD.
> + *
> + * This is done separately, because pgd_alloc() happens when
> + * the task is not on the task list yet - and PGD updates
> + * happen by walking the task list.
> + *
> + * No locking is needed here, as we just copy over the reference
> + * PGD. The reference PGD (swapper_pg_dir) is only ever expanded
> + * at the highest, PGD level. Thus any other task extending it
> + * will first update the reference PGD, then modify the task PGDs.
> + */
> +void arch_pgd_init_late(struct mm_struct *mm, pgd_t *pgd)
> +{
> +	/*
> +	 * This is called after a new MM has been made visible
> +	 * in fork() or exec().
> +	 *
> +	 * This barrier makes sure the MM is visible to new RCU
> +	 * walkers before we initialize it, so that we don't miss
> +	 * updates:
> +	 */
> +	smp_wmb();
> +
> +	/*
> +	 * If the pgd points to a shared pagetable level (either the
> +	 * ptes in non-PAE, or shared PMD in PAE), then just copy the
> +	 * references from swapper_pg_dir:
> +	 */
> +	if ( CONFIG_PGTABLE_LEVELS == 2 ||
> +	    (CONFIG_PGTABLE_LEVELS == 3 && SHARED_KERNEL_PMD) ||
> +	     CONFIG_PGTABLE_LEVELS == 4) {
> +
> +		pgd_t *pgd_src = swapper_pg_dir + KERNEL_PGD_BOUNDARY;
> +		pgd_t *pgd_dst =            pgd + KERNEL_PGD_BOUNDARY;
> +		int i;
> +
> +		for (i = 0; i < KERNEL_PGD_PTRS; i++, pgd_src++, pgd_dst++) {
> +			/*
> +			 * This is lock-less, so it can race with PGD updates
> +			 * coming from vmalloc() or CPA methods, but it's safe,
> +			 * because:
> +			 *
> +			 * 1) this PGD is not in use yet, we have still not
> +			 *    scheduled this task.
> +			 * 2) we only ever extend PGD entries
> +			 *
> +			 * So if we observe a non-zero PGD entry we can copy it,
> +			 * it won't change from under us. Parallel updates (new
> +			 * allocations) will modify our (already visible) PGD:
> +			 */
> +			if (pgd_val(*pgd_src))
> +				WRITE_ONCE(*pgd_dst, *pgd_src);

This should be set_pgd(pgd_dst, *pgd_src) in order for it to work as a
Xen PV guest.

I don't know whether anything would need to be done wrt WRITE_ONCE().
Perhaps put it into native_set_pgd()?

-boris
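
To make the suggestion concrete, here is a rough user-space sketch of the shape it implies: the copy loop calls set_pgd(), and the single-store guarantee lives inside native_set_pgd(). This is not kernel code and is not taken from the patch. pgd_t, WRITE_ONCE_U64(), set_pgd_hook, mock_pv_set_pgd() and copy_kernel_pgds() are all simplified stand-ins invented for illustration, and the mock hook only counts calls where a real PV backend would presumably hand the update to the hypervisor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for kernel types and helpers (assumptions, not the real definitions). */
typedef struct { uint64_t pgd; } pgd_t;
#define pgd_val(x) ((x).pgd)

/* Stand-in for the kernel's WRITE_ONCE(): one volatile 64-bit store. */
#define WRITE_ONCE_U64(x, val) (*(volatile uint64_t *)&(x) = (val))

/* Native path: the single store is done here, so callers only need set_pgd(). */
static void native_set_pgd(pgd_t *pgdp, pgd_t pgd)
{
	WRITE_ONCE_U64(pgdp->pgd, pgd.pgd);
}

/* Hypothetical PV hook: counts calls instead of talking to a hypervisor. */
static unsigned int pv_hook_calls;

static void mock_pv_set_pgd(pgd_t *pgdp, pgd_t pgd)
{
	pv_hook_calls++;
	native_set_pgd(pgdp, pgd);
}

/* In this sketch the backend is a plain function pointer, native by default. */
static void (*set_pgd_hook)(pgd_t *, pgd_t) = native_set_pgd;

static void set_pgd(pgd_t *pgdp, pgd_t pgd)
{
	set_pgd_hook(pgdp, pgd);
}

/*
 * The copy loop from the quoted hunk, with WRITE_ONCE() replaced by set_pgd().
 * Only non-zero source entries are copied. Returns the number of entries copied.
 */
static size_t copy_kernel_pgds(pgd_t *dst, const pgd_t *src, size_t n)
{
	size_t i, copied = 0;

	assert(dst != NULL && src != NULL);

	for (i = 0; i < n; i++) {
		if (pgd_val(src[i])) {
			set_pgd(&dst[i], src[i]);
			copied++;
		}
	}
	return copied;
}
```

With set_pgd_hook left at native_set_pgd the loop behaves like the original WRITE_ONCE() version. Pointing it at mock_pv_set_pgd shows every entry update passing through the hook, which is the property the Xen PV case depends on.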