Date: Wed, 18 Jul 2018 16:43:11 +0200
From: Radim Krčmář
To: Wanpeng Li
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini, Eduardo Habkost, Peng Hao
Subject: Re: [PATCH v2] KVM: Add coalesced PIO support
Message-ID: <20180718144311.GB6348@flask>
References: <1531360773-32304-1-git-send-email-wanpengli@tencent.com>
In-Reply-To: <1531360773-32304-1-git-send-email-wanpengli@tencent.com>

2018-07-12 09:59+0800, Wanpeng Li:
> From: Peng Hao
>
> Windows I/O, such as the real-time clock. The address register (port
> 0x70 in the RTC case) can use coalesced I/O, cutting the number of
> userspace exits by half when reading or writing the RTC.
>
> Guest access rtc like this: write register index to 0x70, then write or
> read data from 0x71. writing 0x70 port is just as index and do nothing
> else. So we can use coalesced mmio to handle this scene to reduce VM-EXIT
> time.
>
> In our environment, 12 windows guests running on a Skylake server:
>
> Before patch:
>
> IO Port Access  Samples  Samples%  Time%   Avg time
>
> 0x70:POUT         20675    46.04%  92.72%  67.15us ( +- 7.93% )
>
> After patch:
>
> IO Port Access  Samples  Samples%  Time%   Avg time
>
> 0x70:POUT         17509    45.42%  42.08%  6.37us ( +- 20.37% )
>
> Cc: Paolo Bonzini
> Cc: Radim Krčmář
> Cc: Eduardo Habkost
> Cc: Peng Hao
> Signed-off-by: Peng Hao
> Signed-off-by: Wanpeng Li
> ---
> v1 -> v2:
>  * add the original author
>
>  Documentation/virtual/kvm/00-INDEX         |  2 ++
>  Documentation/virtual/kvm/api.txt          |  7 +++++++
>  Documentation/virtual/kvm/coalesced-io.txt | 17 +++++++++++++++++
>  include/uapi/linux/kvm.h                   |  5 +++--
>  virt/kvm/coalesced_mmio.c                  | 16 +++++++++++++---
>  virt/kvm/kvm_main.c                        |  2 ++
>  6 files changed, 44 insertions(+), 5 deletions(-)
>  create mode 100644 Documentation/virtual/kvm/coalesced-io.txt
>
> diff --git a/Documentation/virtual/kvm/00-INDEX b/Documentation/virtual/kvm/00-INDEX
> index 3492458..4160620 100644
> --- a/Documentation/virtual/kvm/00-INDEX
> +++ b/Documentation/virtual/kvm/00-INDEX
> @@ -9,6 +9,8 @@ arm
>          - internal ABI between the kernel and HYP (for arm/arm64)
>  cpuid.txt
>          - KVM-specific cpuid leaves (x86).
> +coalesced-io.txt
> +        - Coalesced MMIO and coalesced PIO.
>  devices/
>          - KVM_CAP_DEVICE_CTRL userspace API.
>  halt-polling.txt
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index d10944e..4190796 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -4618,3 +4618,10 @@ This capability indicates that KVM supports paravirtualized Hyper-V TLB Flush
>  hypercalls:
>  HvFlushVirtualAddressSpace, HvFlushVirtualAddressSpaceEx,
>  HvFlushVirtualAddressList, HvFlushVirtualAddressListEx.
> +
> +8.19 KVM_CAP_COALESCED_PIO
> +
> +Architectures: x86, s390, ppc, arm64
> +
> +This Capability indicates that kvm supports writing to a coalesced-pio region
> +is not reported to userspace until the next non-coalesced pio is issued.
> diff --git a/Documentation/virtual/kvm/coalesced-io.txt b/Documentation/virtual/kvm/coalesced-io.txt
> new file mode 100644
> index 0000000..4a96eaf
> --- /dev/null
> +++ b/Documentation/virtual/kvm/coalesced-io.txt
> @@ -0,0 +1,17 @@
> +----
> +Coalesced MMIO and coalesced PIO can be used to optimize writes to
> +simple device registers. Writes to a coalesced-I/O region are not
> +reported to userspace until the next non-coalesced I/O is issued,
> +in a similar fashion to write combining hardware. In KVM, coalesced
> +writes are handled in the kernel without exits to userspace, and
> +are thus several times faster.
> +
> +Examples of devices that can benefit from coalesced I/O include:
> +
> +- devices whose memory is accessed with many consecutive writes, for
> +  example the EGA/VGA video RAM.
> +
> +- windows I/O, such as the real-time clock. The address register (port
> +  0x70 in the RTC case) can use coalesced I/O, cutting the number of
> +  userspace exits by half when reading or writing the RTC.
> +----
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index b6270a3..9cc56d3 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -420,13 +420,13 @@ struct kvm_run {
>  struct kvm_coalesced_mmio_zone {
>          __u64 addr;
>          __u32 size;
> -        __u32 pad;
> +        __u32 pio;

Paolo, do you think we can rename the field without breaking userspace
builds?
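
On the rename question: one possible way to keep existing userspace
compiling is to expose both names over the same storage. The following is
only a sketch under assumptions, not a proposed hunk: zone_old and zone_new
are hypothetical stand-ins for struct kvm_coalesced_mmio_zone before and
after the change, and it assumes C11 anonymous unions are acceptable to
every consumer of the header.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the layout before the patch. */
struct zone_old {
        uint64_t addr;
        uint32_t size;
        uint32_t pad;
};

/* Hypothetical stand-in that keeps the old name next to the new one. */
struct zone_new {
        uint64_t addr;
        uint32_t size;
        union {
                uint32_t pad;   /* old name, existing callers still build */
                uint32_t pio;   /* new name, same storage */
        };
};

/* The binary layout must not change, otherwise the ABI breaks. */
_Static_assert(sizeof(struct zone_old) == sizeof(struct zone_new),
               "struct size changed");
_Static_assert(offsetof(struct zone_old, pad) == offsetof(struct zone_new, pio),
               "field offset changed");

/* Old-style caller that only knows the field as "pad". */
static void legacy_init(struct zone_new *z, uint64_t addr, uint32_t size)
{
        z->addr = addr;
        z->size = size;
        z->pad = 0;
}

/* New-style reader that uses the "pio" name. */
static int zone_is_pio(const struct zone_new *z)
{
        return z->pio != 0;
}
```

A caller that zeroes pad, as legacy_init() does, ends up selecting the MMIO
behaviour in this sketch, which matches what it got before the field had a
meaning.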
>  };
>
>  struct kvm_coalesced_mmio {
>          __u64 phys_addr;
>          __u32 len;
> -        __u32 pad;
> +        __u32 pio;
>          __u8  data[8];
>  };
>
> diff --git a/virt/kvm/coalesced_mmio.c b/virt/kvm/coalesced_mmio.c
> @@ -149,8 +150,12 @@ int kvm_vm_ioctl_register_coalesced_mmio(struct kvm *kvm,
>          dev->zone = *zone;
>
>          mutex_lock(&kvm->slots_lock);
> -        ret = kvm_io_bus_register_dev(kvm, KVM_MMIO_BUS, zone->addr,
> -                                      zone->size, &dev->dev);
> +        if (zone->pio)
> +                ret = kvm_io_bus_register_dev(kvm, KVM_PIO_BUS, zone->addr,
> +                                        zone->size, &dev->dev);
> +        else
> +                ret = kvm_io_bus_register_dev(kvm, KVM_MMIO_BUS, zone->addr,
> +                                        zone->size, &dev->dev);

This would be more readable as

        ret = kvm_io_bus_register_dev(kvm,
                                zone->pio ? KVM_PIO_BUS : KVM_MMIO_BUS,
                                zone->addr, zone->size, &dev->dev);

thanks.
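
For reference, the "half the userspace exits" claim in the quoted commit
message can be illustrated with a toy model. This is a sketch only: every
name below (toy_vm, toy_outb, toy_inb, toy_rtc_reads, RING_MAX) is
hypothetical and none of it is KVM code. It assumes the guest reads an RTC
register by writing the index to port 0x70 and then reading port 0x71, and
that a coalesced write is buffered until the next non-coalesced access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_MAX 64     /* arbitrary capacity for the toy buffer */

struct toy_vm {
        bool coalesce_0x70;     /* is port 0x70 a coalesced zone? */
        size_t ring_len;        /* buffered writes not yet seen by userspace */
        uint8_t ring[RING_MAX]; /* buffered index values */
        unsigned long exits;    /* number of exits to userspace */
        uint8_t rtc_index;      /* index register in the userspace device model */
};

/* Exit to userspace: replay the buffered writes, then count the exit. */
static void toy_exit(struct toy_vm *vm)
{
        assert(vm->ring_len <= RING_MAX);
        for (size_t i = 0; i < vm->ring_len; i++)
                vm->rtc_index = vm->ring[i];
        vm->ring_len = 0;
        vm->exits++;
}

static void toy_outb(struct toy_vm *vm, uint16_t port, uint8_t val)
{
        if (port == 0x70 && vm->coalesce_0x70 && vm->ring_len < RING_MAX) {
                vm->ring[vm->ring_len++] = val; /* buffered, no exit */
                return;
        }
        toy_exit(vm);
        if (port == 0x70)
                vm->rtc_index = val;
}

static uint8_t toy_inb(struct toy_vm *vm, uint16_t port)
{
        (void)port;
        toy_exit(vm);           /* reads are never coalesced in this model */
        return vm->rtc_index;   /* toy device: a read returns the selected index */
}

/* Read n RTC registers and return how many exits that took. */
static unsigned long toy_rtc_reads(bool coalesce, unsigned int n)
{
        struct toy_vm vm = { .coalesce_0x70 = coalesce };

        for (unsigned int i = 0; i < n; i++) {
                toy_outb(&vm, 0x70, (uint8_t)i);
                (void)toy_inb(&vm, 0x71);
        }
        return vm.exits;
}
```

In this model toy_rtc_reads(false, n) counts two exits per register read and
toy_rtc_reads(true, n) counts one, because only the access to 0x71 leaves the
kernel. The model says nothing about the per-exit latency figures in the
quoted table.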