Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932216AbbFKOHh (ORCPT ); Thu, 11 Jun 2015 10:07:37 -0400
Received: from mail-wi0-f181.google.com ([209.85.212.181]:36793 "EHLO
	mail-wi0-f181.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753315AbbFKOHd (ORCPT ); Thu, 11 Jun 2015 10:07:33 -0400
From: Ingo Molnar
To: linux-kernel@vger.kernel.org
Cc: linux-mml@vger.kernel.org, Andy Lutomirski, Andrew Morton,
	Denys Vlasenko, Brian Gerst, Peter Zijlstra, Borislav Petkov,
	"H. Peter Anvin", Linus Torvalds, Oleg Nesterov, Thomas Gleixner,
	Waiman Long
Subject: [RFC PATCH 00/12] x86/mm: Implement lockless pgd_alloc()/pgd_free()
Date: Thu, 11 Jun 2015 16:07:05 +0200
Message-Id: <1434031637-9091-1-git-send-email-mingo@kernel.org>
X-Mailer: git-send-email 2.1.4
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2651
Lines: 67

Waiman Long reported 'pgd_lock' contention on high CPU count systems and
proposed moving pgd_lock onto a separate cacheline to eliminate false sharing
and to reduce some of the lock bouncing overhead.

I think we can do much better: this series eliminates the pgd_list and makes
pgd_alloc()/pgd_free() lockless.

Now the lockless initialization of the PGD has a few preconditions, which the
initial part of the series implements:

 - no PGD clearing is allowed, only additions. This makes sense as a single PGD
   entry covers 512 GB of RAM, so the 4K overhead per 0.5 TB of RAM mapped is
   minuscule.

The patches after that convert existing pgd_list users to walk the task list.

PGD locking is kept intact: coherency guarantees between the CPA, vmalloc,
hotplug, etc. code are unchanged.

The final patches eliminate the pgd_list and thus make pgd_alloc()/pgd_free()
lockless.

The patches have been boot tested on 64-bit and 32-bit x86 systems.

Architectures not making use of the new facility are unaffected.

Thanks,

	Ingo

=====
Ingo Molnar (12):
  x86/mm/pat: Don't free PGD entries on memory unmap
  x86/mm/hotplug: Remove pgd_list use from the memory hotplug code
  x86/mm/hotplug: Don't remove PGD entries in remove_pagetable()
  x86/mm/hotplug: Simplify sync_global_pgds()
  mm: Introduce arch_pgd_init_late()
  x86/mm: Enable and use the arch_pgd_init_late() method
  x86/virt/guest/xen: Remove use of pgd_list from the Xen guest code
  x86/mm: Remove pgd_list use from vmalloc_sync_all()
  x86/mm/pat/32: Remove pgd_list use from the PAT code
  x86/mm: Make pgd_alloc()/pgd_free() lockless
  x86/mm: Remove pgd_list leftovers
  x86/mm: Simplify pgd_alloc()

 arch/Kconfig                      |   9 ++++
 arch/x86/Kconfig                  |   1 +
 arch/x86/include/asm/pgtable.h    |   3 --
 arch/x86/include/asm/pgtable_64.h |   3 +-
 arch/x86/mm/fault.c               |  24 +++++++----
 arch/x86/mm/init_64.c             |  73 +++++++++++---------------------
 arch/x86/mm/pageattr.c            |  34 ++++++++--------
 arch/x86/mm/pgtable.c             | 129 +++++++++++++++++++++++++++++----------------------------
 arch/x86/xen/mmu.c                |  34 +++++++++++++---
 fs/exec.c                         |   3 ++
 include/linux/mm.h                |   6 +++
 kernel/fork.c                     |  16 ++++++++
 12 files changed, 183 insertions(+), 152 deletions(-)

-- 
2.1.4