From: Juerg Haefliger
Date: Thu, 13 Sep 2018 08:11:49 +0200
Subject: Re: Redoing eXclusive Page Frame Ownership (XPFO) with isolated CPUs in mind (for KVM to isolate its guests per CPU)
To: Julian Stecklina
Cc: Linus Torvalds, David Woodhouse, Konrad Rzeszutek Wilk,
    deepa.srinivasan@oracle.com, Jim Mattson, Andrew Cooper,
    Linux Kernel Mailing List, Boris Ostrovsky, linux-mm, Thomas Gleixner,
    joao.m.martins@oracle.com, pradeep.vincent@oracle.com, Andi Kleen,
    Khalid Aziz, kanth.ghatraju@oracle.com, Liran Alon, Kees Cook,
    Kernel Hardening, chris.hyser@oracle.com, Tyler Hicks, John Haxby,
    Jon Masters
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Sep 12, 2018 at 5:37 PM, Julian Stecklina wrote:
> Julian Stecklina writes:
>
>> Linus Torvalds writes:
>>
>>> On Fri, Aug 31, 2018 at 12:45 AM Julian Stecklina wrote:
>>>>
>>>> I've been spending some cycles on the XPFO patch set this week. For the
>>>> patch set as it was posted for v4.13, the performance overhead of
>>>> compiling a Linux kernel is ~40% on x86_64[1]. The overhead comes almost
>>>> completely from TLB flushing.
>>>> If we can live with stale TLB entries
>>>> allowing temporary access (which I think is reasonable), we can remove
>>>> all TLB flushing (on x86). This reduces the overhead to 2-3% for
>>>> kernel compile.
>>>
>>> I have to say, even 2-3% for a kernel compile sounds absolutely horrendous.
>>
>> Well, it's at least in a range where it doesn't look hopeless.
>>
>>> Kernel builds are 90% user space at least for me, so a 2-3% slowdown
>>> from a kernel is not some small unnoticeable thing.
>>
>> The overhead seems to come from the hooks that XPFO adds to
>> alloc/free_pages. These hooks add a couple of atomic operations per
>> allocated (4K) page for bookkeeping. Some of these atomic ops are only
>> for debugging and could be removed. There is also some opportunity to
>> streamline the per-page space overhead of XPFO.
>
> I've updated my XPFO branch[1] to make some of the debugging optional
> and also integrated the XPFO bookkeeping with struct page, instead of
> requiring CONFIG_PAGE_EXTENSION, which removes some checks in the hot
> path.

FWIW, that was my original design, but there was some resistance to
adding more to the page struct, and page extension was suggested
instead.

> These changes push the overhead down to somewhere between 1.5 and
> 2% for my quad core box in kernel compile. This is close to the
> measurement noise, so I take suggestions for a better benchmark here.
>
> Of course, if you hit contention on the xpfo spinlock then performance
> will suffer. I guess this is what happened on Khalid's large box.
>
> I'll try to remove the spinlocks and add fixup code to the pagefault
> handler to see whether this improves the situation on large boxes. This
> might turn out to be ugly, though.

I'm wondering how much performance we're losing by having to split
hugepages. Any chance this can be quantified somehow?
Maybe we can have a pool of some sort reserved for user pages and group
allocations so that we can track the XPFO state at the hugepage level
instead of at the 4k level to prevent/reduce page splitting. Not sure
if that causes issues or has any unwanted side effects though...

...Juerg

> Julian
>
> [1] http://git.infradead.org/users/jsteckli/linux-xpfo.git/shortlog/refs/heads/xpfo-master
> --
> Amazon Development Center Germany GmbH
> Berlin - Dresden - Aachen
> main office: Krausenstr. 38, 10117 Berlin
> Managing directors: Dr. Ralf Herbrich, Christian Schlaeger
> VAT ID: DE289237879
> Registered at Amtsgericht Charlottenburg, HRB 149173 B
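
To make the pool idea above a bit more concrete, here is a rough userspace
model in C. It is only a sketch under stated assumptions, not code from the
XPFO patch set: all names (hp_track, pool_alloc_user_page, hp_classify,
POOL_HUGEPAGES) are hypothetical, and it assumes 2 MiB hugepages made of 512
4K pages. The pool hands out user pages hugepage by hugepage and keeps one
counter per hugepage, so a hugepage that is entirely user-owned or entirely
kernel-owned could in principle be handled as one unit, and only a partially
filled one would need 4K granularity.

```c
#include <assert.h>
#include <stdatomic.h>

#define PAGES_PER_HUGEPAGE 512u  /* assumes 2 MiB hugepages and 4 KiB base pages */
#define POOL_HUGEPAGES     4u    /* hypothetical size of the reserved user pool */

enum hp_state { HP_KERNEL, HP_USER, HP_MIXED };

/* Hypothetical bookkeeping: one counter per hugepage instead of per-4K state. */
struct hp_track {
    atomic_uint user_pages;      /* 4K pages of this hugepage owned by user space */
};

static struct hp_track pool[POOL_HUGEPAGES];

/*
 * Take one 4K user page from the pool, filling hugepages in order.
 * Returns the index of the hugepage used, or -1 if the pool is exhausted.
 */
static int pool_alloc_user_page(void)
{
    for (unsigned i = 0; i < POOL_HUGEPAGES; i++) {
        unsigned cur = atomic_load(&pool[i].user_pages);

        while (cur < PAGES_PER_HUGEPAGE) {
            /* On failure cur is refreshed with the current value; retry. */
            if (atomic_compare_exchange_weak(&pool[i].user_pages, &cur, cur + 1))
                return (int)i;
        }
    }
    return -1;
}

/* Return one 4K user page to the hugepage it was taken from. */
static void pool_free_user_page(int idx)
{
    assert(idx >= 0 && (unsigned)idx < POOL_HUGEPAGES);
    unsigned old = atomic_fetch_sub(&pool[idx].user_pages, 1);
    assert(old > 0);
    (void)old;
}

/*
 * Classify a hugepage in this model:
 *   HP_KERNEL - no user pages, nothing to track at 4K level
 *   HP_USER   - fully user-owned, could be tracked as a single unit
 *   HP_MIXED  - partially user-owned, would still need 4K-level handling
 */
static enum hp_state hp_classify(struct hp_track *hp)
{
    unsigned n = atomic_load(&hp->user_pages);

    if (n == 0)
        return HP_KERNEL;
    if (n == PAGES_PER_HUGEPAGE)
        return HP_USER;
    return HP_MIXED;
}
```

With an allocation-only workload at most one hugepage in the pool is in the
mixed state at any time. Frees will fragment the pool again, which is one of
the possible side effects mentioned above, and the model says nothing about
what that would cost in a real kernel.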