From: Peter Feiner
Date: Tue, 12 Sep 2017 13:06:39 -0700
Subject: Re: [PATCH] KVM: MMU: speedup update_permission_bitmask
To: Paolo Bonzini
Cc: Jim Mattson, LKML, kvm list, David Hildenbrand
References: <1503590171-41030-1-git-send-email-pbonzini@redhat.com>

On Tue, Sep 12, 2017 at 12:55 PM, Paolo Bonzini wrote:
> On 12/09/2017 18:48, Peter Feiner wrote:
>>>>
>>>> Because update_permission_bitmask is actually the top item in the profile
>>>> for nested vmexits, this speeds up an L2->L1 vmexit by about ten thousand
>>>> clock cycles, or up to 30%:
>>
>> This is a great improvement! Why not take it a step further and
>> compute the whole table once at module init time and be done with it?
>> There are only 5 extra input bits (nx, ept, smep, smap, wp),
>
> 4 actually, nx could be ignored (because unlike WP, the bit is reserved
> when nx is disabled). It is only handled for clarity.
>
>> so the
>> whole table would only take up (1 << 5) * 16 = 512 bytes. Moreover, if
>> you had 32 VMs on the host, you'd actually save memory!
>
> Indeed; my thought was to write a script or something to generate the
> tables at compile time, but doing it at module init time would be clever
> and easier.
>
> That said, the generated code for the function, right now, is pretty
> good. If it saved 1000 clock cycles per nested vmexit it would be very
> convincing, but if it were 50 or even 100 a bit less so.

ACK. I'm good with either approach :-)

Please consider this one Reviewed-By: Peter Feiner
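
For reference, the init-time table idea discussed above could look roughly
like the sketch below. This is a standalone illustration, not code from the
patch: every identifier (perm_table, compute_entry, the CFG_* flags) is
hypothetical, and compute_entry is only a placeholder standing in for the
real per-entry permission computation. It just shows the sizing from the
thread, (1 << 5) * 16 = 512 bytes, and a single fill pass at init.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; none of these come from the patch. */
enum {
    CFG_NX   = 1 << 0,
    CFG_EPT  = 1 << 1,
    CFG_SMEP = 1 << 2,
    CFG_SMAP = 1 << 3,
    CFG_WP   = 1 << 4,
    CFG_BITS = 5,
    NUM_CONFIGS = 1 << CFG_BITS,   /* 32 possible configurations */
    ENTRIES_PER_CONFIG = 16,       /* 16 one-byte entries per configuration */
};

/* (1 << 5) * 16 = 512 bytes, shared by every user of the table. */
static uint8_t perm_table[NUM_CONFIGS][ENTRIES_PER_CONFIG];

/*
 * Placeholder (assumption): stands in for whatever the real code would
 * compute for one entry. It is NOT the actual permission logic.
 */
static uint8_t compute_entry(unsigned cfg, unsigned idx)
{
    return (uint8_t)((cfg << 3) ^ idx);
}

/* Fill every configuration once, e.g. from a module init hook. */
static void perm_table_init(void)
{
    for (unsigned cfg = 0; cfg < NUM_CONFIGS; cfg++)
        for (unsigned idx = 0; idx < ENTRIES_PER_CONFIG; idx++)
            perm_table[cfg][idx] = compute_entry(cfg, idx);
}

/* At runtime, picking the right 16-byte row is a single index operation. */
static const uint8_t *perm_table_lookup(unsigned cfg)
{
    assert(cfg < NUM_CONFIGS);
    return perm_table[cfg];
}
```

If nx is dropped as an input, as suggested in the reply, CFG_BITS becomes 4
and the table shrinks to (1 << 4) * 16 = 256 bytes.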