From: Brijesh Singh
To: Radim Krčmář
Subject: Re: [PATCH] x86: kvm: Avoid guest page table walk when gpa_available is set
Date: Tue, 25 Apr 2017 17:02:58 -0500
Message-ID: <6e453f35-cd26-1df8-5f8e-68fa09c6a1a3@amd.com>
In-Reply-To: <20170425140351.GF5713@potion>
References: <1493049146-19261-1-git-send-email-brijesh.singh@amd.com> <20170424205236.GE5713@potion> <77f51978-5937-0c94-13b6-885345921b03@amd.com> <20170425140351.GF5713@potion>
X-Mailing-List: linux-kernel@vger.kernel.org

>>
>> I also wanted to avoid adding yet another variable but we can't depend on
>> cr2 parameters passed into x86_emulate_instruction().
>>
>> The x86_emulate_instruction() function is called from two places:
>>
>> 1) handling the page-fault.
>>    pf_interception [svm.c]
>>     kvm_mmu_page_fault [mmu.c]
>>      x86_emulate_instruction [x86.c]
>>
>> 2) completing the IO/MMIO's from previous instruction decode
>>    kvm_arch_vcpu_ioctl_run
>>     complete_emulated_io
>>      emulate_instruction
>>       x86_emulate_instruction(vcpu, 0, emulation_type, NULL, 0)
>>
>> In #1, we are guaranteed that cr2 variable will contain a valid GPA but
>> in #2, CR2 is set to zero.
>
> We are setting up the completion in #1 x86_emulate_instruction(), where
> the gpa (cr2) is available, so we could store the value while arming
> vcpu->arch.complete_userspace_io.
>
> emulator_read_write_onepage() already saves gpa in frag->gpa, which is
> then passed into complete_emulated_mmio -- isn't that mechanism
> sufficient?
>

I see that complete_emulated_mmio() saves frag->gpa into
run->mmio.phys_addr, so based on the exit_reason we should be able to get
the saved gpa. In my debug patch below, I tried doing something similar to
verify that frag->gpa contains the valid CR2 value, but I saw a bunch of
mismatches. So it seems like we may not be able to use the frag->gpa
mechanism. Additionally, we also need to handle the PIO cases (e.g. what if
we are called from complete_emulated_pio), which take a similar code path:

complete_emulated_pio
 complete_emulated_io
  emulate_instruction
   x86_emulate_instruction(vcpu, 0, emulation_type, NULL, 0)

@@ -5682,13 +5686,20 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
 restart:
 	/*
-	 * Save the faulting GPA (cr2) in the address field
-	 * NOTE: If gpa_available is set then gpa_val will contain a valid GPA
+	 * if previous exit was due to userspace mmio completion then actual
+	 * cr2 is stored in mmio.phys_addr.
 	 */
-	if (vcpu->arch.gpa_available)
-		ctxt->exception.address = vcpu->arch.gpa_val;
-	else
-		ctxt->exception.address = cr2;
+	if (vcpu->run->exit_reason == KVM_EXIT_MMIO) {
+		cr2 = vcpu->run->mmio.phys_addr;
+		if (cr2 != vcpu->arch.gpa_val)
+			pr_err("** mismatch %llx %llx\n",
+				vcpu->run->mmio.phys_addr, vcpu->arch.gpa_val);
+	}
+
+	/* Save the faulting GPA (cr2) in the address field */
+	ctxt->exception.address = cr2;

>>
>> handle_exit [svm.c]
>>  pf_interception [svm.c]
>>   /* it invokes the fault handler with CR2 = svm->vmcb->control.exit_info_2 */
>>   kvm_mmu_page_fault [mmu.c]
>>    x86_emulate_instruction [x86.c]
>>     emulator_read_write_onepage [x86.c]
>>      /*
>>       * this is where we walk the guest page table to translate
>>       * a GVA to GPA. If gpa_available is set then we use the
>>       * gpa_val instead of walking the pgtable.
>>       */
>
> pf_interception is the NPF exit handler -- please move the setting
> there, at least. handle_exit() is a hot path that shouldn't contain
> code that isn't applicable to all exits.
>

Sure, will do.

> Btw. we need some other guarantees to declare it as GPA (cr2 is GPA in
> NPT exits, but might not be in other) ... isn't arch.mmu.direct_map a
> condition we are interested in?
>
> The other code uses it to interpret cr2 directly as gpa, so we might be
> able to avoid setting the arch.gpa_available in a hot path too.
>

Hmm, looking at the call trace, I am not sure how arch.mmu.direct_map will
help, but I will investigate a bit more.

>>
>> See my previous comment. In some cases CR2 may be set to zero
>> (e.g. when completing the instruction from previous io/mmio page-fault).
>>
>> If we decide to add the gpa_val then we can remove the above if
>> statement from x86_emulate_instruction() and update emulator_read_write_onepage
>> to use the vcpu->arch.gpa_val instead of exception->address.
>
> Yeah, that would be nicer than setting exception->address at a random
> place.
>
> We could also pass the value as cr2 in all cases if it made something
> better.
>
>> if (vcpu->arch.gpa_available &&
>>     emulator_can_use_gpa(ctxt) &&
>>     (addr & ~PAGE_MASK) == (exception->address & ~PAGE_MASK)) {
>>         gpa = vcpu->arch.gpa_val;
>>         ...
>>         ...
>> }
>
> If at all possible, I'd like to have the gpa passed with other relevant
> data, instead of having it isolated like this ... and if we can't manage
> that, then at least good benchmark results to excuse the bad code.
>

I ran two tests to see how many times we walk the guest page table.

Test1: run kvm-unit-tests
Test2: launch Ubuntu 16.06 guest, run a stressapptest for 20 seconds, shut down the VM

Before patch:
 * Test1: 10419
 * Test2: 243365

After patch:
 * Test1: 1259
 * Test2: 1221

Please let me know if you want me to run any other benchmarks and capture
the data.

-Brijesh
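
For illustration, the fast path discussed above (use gpa_val when
gpa_available is set and the page offsets match, otherwise walk the guest
page table) can be modelled in userspace roughly as follows. This is only a
sketch, not the kernel code: struct vcpu_model, walk_guest_page_table(),
translate(), replay() and DEMO_FRAME are hypothetical names, the "walk" is a
stub that maps every address into one fixed frame, and the
emulator_can_use_gpa() check from the quoted snippet is left out. The page
offset is also compared against gpa_val here, rather than against
exception->address as in the quoted snippet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE  4096ULL
#define PAGE_MASK  (~(PAGE_SIZE - 1))
#define DEMO_FRAME 0xfed00000ULL /* hypothetical frame used by the stub walk */

/* Hypothetical stand-in for the per-vcpu state discussed in the thread. */
struct vcpu_model {
	bool gpa_available;  /* set when the exit reported a faulting GPA */
	uint64_t gpa_val;    /* the GPA reported for that exit */
	unsigned long walks; /* number of simulated page table walks */
};

/* Stub for the software GVA->GPA walk: counts calls, keeps the page offset. */
uint64_t walk_guest_page_table(struct vcpu_model *v, uint64_t gva)
{
	v->walks++;
	return DEMO_FRAME | (gva & ~PAGE_MASK);
}

/* Use the reported GPA when it is available and the page offsets agree. */
uint64_t translate(struct vcpu_model *v, uint64_t gva)
{
	if (v->gpa_available &&
	    (gva & ~PAGE_MASK) == (v->gpa_val & ~PAGE_MASK))
		return v->gpa_val;

	return walk_guest_page_table(v, gva);
}

/* Replay n faults on one address and return how many walks were needed. */
unsigned long replay(bool hw_reports_gpa, unsigned int n)
{
	const uint64_t gva = 0x7f0000001008ULL;
	const uint64_t expected = DEMO_FRAME | (gva & ~PAGE_MASK);
	struct vcpu_model v = { .gpa_available = false, .gpa_val = 0, .walks = 0 };
	uint64_t gpa;
	unsigned int i;

	for (i = 0; i < n; i++) {
		v.gpa_available = hw_reports_gpa;
		v.gpa_val = hw_reports_gpa ? expected : 0;

		gpa = translate(&v, gva);
		/* Both paths must agree on the result. */
		assert(gpa == expected);
	}

	return v.walks;
}
```

In this model replay(true, n) performs no walks and replay(false, n)
performs n of them. The real numbers above do not drop to zero, which is
expected since not every emulated access comes with a usable GPA.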