Date: Mon, 12 Feb 2018 15:51:25 +0100
From: Joerg Roedel
To: Ingo Molnar
Cc: Joerg Roedel, Andy Lutomirski, Thomas Gleixner, "H . Peter Anvin",
    X86 ML, LKML, Linux-MM, Linus Torvalds, Dave Hansen, Josh Poimboeuf,
    Juergen Gross, Peter Zijlstra, Borislav Petkov, Jiri Kosina,
    Boris Ostrovsky, Brian Gerst, David Laight, Denys Vlasenko,
    Eduardo Valentin, Greg KH, Will Deacon, "Liguori, Anthony",
    Daniel Gruss, Hugh Dickins, Kees Cook, Andrea Arcangeli, Waiman Long,
    Pavel Machek
Subject: Re: [PATCH 00/31 v2] PTI support for x86_32
Message-ID: <20180212145125.GE16484@8bytes.org>
References: <1518168340-9392-1-git-send-email-joro@8bytes.org>
    <20180209191112.55zyjf4njum75brd@suse.de>
    <20180211191312.54apu5edk3olsfz3@gmail.com>
In-Reply-To: <20180211191312.54apu5edk3olsfz3@gmail.com>

Hi Ingo,

On Sun, Feb 11, 2018 at 08:13:12PM +0100, Ingo Molnar wrote:
> Could you please measure the PTI kernel vs. vanilla kernel?

Okay, did that, here is the data. The test machine is a Xeon E5-1620v2,
which is Ivy Bridge based (no PCIE) and has 4C/8T.
I ran the 2 tests you suggested:

	* Test-1: perf stat --null --sync --repeat 10 perf bench sched messaging -g 20

	* Test-2: perf stat --null --sync --repeat 10 perf bench sched messaging -g 20 -t

The tests ran on these kernels:

	* tip-32-pae: current top of tip/x86-tip-for-linus branch,
	              compiled as a 32 bit kernel with PAE
	              (commit b2ac58f90540e39324e7a29a7ad471407ae0bf48)

	* pti-32-pae: Same as above with my patches on top, as in

	              git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git pti-x32-v2

	              compiled as a 32 bit kernel with PAE
	              (commit dbb0074f778b396a11e0c897fef9d0c4583e7ccb)

	* pti-off-64: current top of tip/x86-tip-for-linus branch,
	              compiled as a 64 bit kernel, booted with pti=off
	              (commit b2ac58f90540e39324e7a29a7ad471407ae0bf48)

	* pti-on-64:  current top of tip/x86-tip-for-linus branch,
	              compiled as a 64 bit kernel, booted with pti=on
	              (commit b2ac58f90540e39324e7a29a7ad471407ae0bf48)

Results are:

            | Test-1             | Test-2
------------+--------------------+-----------------
tip-32-pae  | 0.28s (+-0.44%)    | 0.27s (+-2.15%)
------------+--------------------+-----------------
pti-32-pae  | 0.44s (+-0.40%)    | 0.42s (+-0.48%)
------------+--------------------+-----------------
pti-off-64  | 0.24s (+-0.40%)    | 0.25s (+-1.31%)
------------+--------------------+-----------------
pti-on-64   | 0.30s (+-0.47%)    | 0.31s (+-0.95%)

On 32 bit with PTI enabled the test needs 157% (non-threaded) and 156%
(threaded) of the time compared to the non-PTI baseline. On 64 bit these
numbers are 125% (non-threaded) and 124% (threaded).

The pti-32-pae kernel still used 'rep movsb' in the entry code. I
replaced that with 'rep movsl' and measured again, but the overhead is
still around 152%.

I also measured cycles with 'perf record' to see where the additional
time is spent. The report showed around 25% in entry_SYSENTER_32 for the
pti-32-pae kernel. The same report on the tip-32-pae kernel shows around
2.5% for the same symbol.
The entry_SYSENTER_32 path does no stack-copy on entry (it only
pushes/pops 8 bytes for the cr3 switch), but one full pt_regs copy on
exit. The exit-path was easy to optimize; I got it to the point where it
only copies 8 bytes to the entry stack (flags and eax). This way I got
the 'perf report' numbers for entry_SYSENTER_32 down to around 20%, but
the overall numbers for Test-1 and Test-2 are still at around 150% of
the baseline.

So it seems that most of the additional time is actually spent switching
the cr3s.

Regards,

	Joerg
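
For anyone who wants to re-check the percentages quoted for the results
table, here is a minimal sketch of the arithmetic. The runtimes are
copied from the table; the dictionary layout and the helper name
relative_runtime() are only illustrative assumptions and are not part
of perf or of the patch-set.

```python
# Mean runtimes in seconds, copied from the results table in this mail.
# Test-1 is the non-threaded run, Test-2 the threaded run (-t).
RUNTIMES = {
    "tip-32-pae": {"Test-1": 0.28, "Test-2": 0.27},
    "pti-32-pae": {"Test-1": 0.44, "Test-2": 0.42},
    "pti-off-64": {"Test-1": 0.24, "Test-2": 0.25},
    "pti-on-64": {"Test-1": 0.30, "Test-2": 0.31},
}


def relative_runtime(pti, baseline, test):
    """Return the PTI kernel's runtime as a percentage of the baseline's.

    Hypothetical helper, named here for illustration only.
    """
    return 100.0 * RUNTIMES[pti][test] / RUNTIMES[baseline][test]


if __name__ == "__main__":
    pairs = (("pti-32-pae", "tip-32-pae"), ("pti-on-64", "pti-off-64"))
    for pti, base in pairs:
        for test in ("Test-1", "Test-2"):
            pct = relative_runtime(pti, base, test)
            print(f"{pti} vs {base}, {test}: {pct:.0f}%")
```

The sketch only uses the rounded means from the table, so it ignores the
+- run-to-run variation that perf stat reported.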