From: Wanpeng Li
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Radim Krčmář, Eduardo Habkost, Peng Hao
Subject: [PATCH v2] KVM: Add coalesced PIO support
Date: Thu, 12 Jul 2018 09:59:33 +0800
Message-Id: <1531360773-32304-1-git-send-email-wanpengli@tencent.com>
X-Mailer: git-send-email 2.7.4
X-Mailing-List: linux-kernel@vger.kernel.org

From: Peng Hao

Coalesced PIO can benefit windows I/O, such as the real-time clock. The
address register (port 0x70 in the RTC case) can use coalesced I/O,
cutting the number of userspace exits by half when reading or writing
the RTC.

A guest accesses the RTC as follows: it writes the register index to
port 0x70, then reads or writes the data at port 0x71. The write to port
0x70 only selects the index and does nothing else, so the coalesced MMIO
mechanism can handle it and reduce VM-exit time.
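
To make the "half the exits" claim concrete, here is a toy userspace model
of the RTC access pattern described above. It is only a sketch, not kernel
code: struct toy_vm, RING_SLOTS and every function name are made up for
illustration, and the ring is a simplified stand-in for the coalesced ring.
Index writes to 0x70 are buffered when coalescing is on; every access to
0x71 costs one simulated exit, during which the buffered writes are
replayed in order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: names and sizes below are illustrative, not KVM's. */
#define RING_SLOTS     16    /* hypothetical ring capacity */
#define RTC_INDEX_PORT 0x70  /* coalesced in this model */
#define RTC_DATA_PORT  0x71  /* always exits to userspace */

struct pio_write {
	uint16_t port;
	uint8_t  val;
};

struct toy_vm {
	struct pio_write ring[RING_SLOTS];
	size_t   first, last;     /* consumer / producer positions */
	bool     coalesce_index;  /* is port 0x70 registered as coalesced? */
	unsigned exits;           /* simulated userspace exits */
	uint8_t  rtc_index;       /* device state kept by "userspace" */
};

/* Userspace side: replay buffered index writes in order, emptying the ring. */
static void drain_ring(struct toy_vm *vm)
{
	while (vm->first != vm->last) {
		vm->rtc_index = vm->ring[vm->first].val;
		vm->first = (vm->first + 1) % RING_SLOTS;
	}
}

/* One simulated exit: userspace drains the ring before handling the access. */
static void exit_to_userspace(struct toy_vm *vm)
{
	vm->exits++;
	drain_ring(vm);
}

/* Guest OUT instruction. */
static void guest_out(struct toy_vm *vm, uint16_t port, uint8_t val)
{
	size_t next = (vm->last + 1) % RING_SLOTS;

	if (vm->coalesce_index && port == RTC_INDEX_PORT && next != vm->first) {
		/* Coalesced: record the write and keep running the guest. */
		vm->ring[vm->last] = (struct pio_write){ .port = port, .val = val };
		vm->last = next;
		return;
	}
	/* Not coalesced (or ring full): take a normal exit. */
	exit_to_userspace(vm);
	if (port == RTC_INDEX_PORT)
		vm->rtc_index = val;
}

/* Guest IN from the data port; returns the selected index for checking. */
static uint8_t guest_in_data(struct toy_vm *vm)
{
	exit_to_userspace(vm);
	return vm->rtc_index;
}

/* Read n RTC registers and return how many exits that cost. */
unsigned rtc_read_exits(bool coalesce, unsigned n)
{
	struct toy_vm vm = { .coalesce_index = coalesce };

	for (unsigned i = 0; i < n; i++) {
		guest_out(&vm, RTC_INDEX_PORT, (uint8_t)i);
		assert(guest_in_data(&vm) == (uint8_t)i); /* index arrived in order */
	}
	return vm.exits;
}
```

In this model rtc_read_exits(false, n) costs two exits per register read
and rtc_read_exits(true, n) costs one. The model says nothing about the
per-exit latency reported in the measurements.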

In our environment, with 12 Windows guests running on a Skylake server:

Before patch:

IO Port Access    Samples  Samples%    Time%    Avg time
0x70:POUT           20675    46.04%   92.72%    67.15us ( +- 7.93% )

After patch:

IO Port Access    Samples  Samples%    Time%    Avg time
0x70:POUT           17509    45.42%   42.08%     6.37us ( +- 20.37% )

Cc: Paolo Bonzini
Cc: Radim Krčmář
Cc: Eduardo Habkost
Cc: Peng Hao
Signed-off-by: Peng Hao
Signed-off-by: Wanpeng Li
---
v1 -> v2:
 * add the original author

 Documentation/virtual/kvm/00-INDEX         |  2 ++
 Documentation/virtual/kvm/api.txt          |  7 +++++++
 Documentation/virtual/kvm/coalesced-io.txt | 17 +++++++++++++++++
 include/uapi/linux/kvm.h                   |  5 +++--
 virt/kvm/coalesced_mmio.c                  | 16 +++++++++++++---
 virt/kvm/kvm_main.c                        |  2 ++
 6 files changed, 44 insertions(+), 5 deletions(-)
 create mode 100644 Documentation/virtual/kvm/coalesced-io.txt

diff --git a/Documentation/virtual/kvm/00-INDEX b/Documentation/virtual/kvm/00-INDEX
index 3492458..4160620 100644
--- a/Documentation/virtual/kvm/00-INDEX
+++ b/Documentation/virtual/kvm/00-INDEX
@@ -9,6 +9,8 @@ arm
 	- internal ABI between the kernel and HYP (for arm/arm64)
 cpuid.txt
 	- KVM-specific cpuid leaves (x86).
+coalesced-io.txt
+	- Coalesced MMIO and coalesced PIO.
 devices/
 	- KVM_CAP_DEVICE_CTRL userspace API.
 halt-polling.txt
diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index d10944e..4190796 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -4618,3 +4618,10 @@ This capability indicates that KVM supports paravirtualized Hyper-V TLB Flush
 hypercalls:
 HvFlushVirtualAddressSpace, HvFlushVirtualAddressSpaceEx,
 HvFlushVirtualAddressList, HvFlushVirtualAddressListEx.
+
+8.19 KVM_CAP_COALESCED_PIO
+
+Architectures: x86, s390, ppc, arm64
+
+This capability indicates that KVM supports coalesced PIO. Writes to a
+coalesced-pio region reach userspace only at the next non-coalesced pio.
diff --git a/Documentation/virtual/kvm/coalesced-io.txt b/Documentation/virtual/kvm/coalesced-io.txt
new file mode 100644
index 0000000..4a96eaf
--- /dev/null
+++ b/Documentation/virtual/kvm/coalesced-io.txt
@@ -0,0 +1,17 @@
+----
+Coalesced MMIO and coalesced PIO can be used to optimize writes to
+simple device registers. Writes to a coalesced-I/O region are not
+reported to userspace until the next non-coalesced I/O is issued,
+in a similar fashion to write combining hardware. In KVM, coalesced
+writes are handled in the kernel without exits to userspace, and
+are thus several times faster.
+
+Examples of devices that can benefit from coalesced I/O include:
+
+- devices whose memory is accessed with many consecutive writes, for
+  example the EGA/VGA video RAM.
+
+- windows I/O, such as the real-time clock. The address register (port
+  0x70 in the RTC case) can use coalesced I/O, cutting the number of
+  userspace exits by half when reading or writing the RTC.
+----
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index b6270a3..9cc56d3 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -420,13 +420,13 @@ struct kvm_run {
 struct kvm_coalesced_mmio_zone {
 	__u64 addr;
 	__u32 size;
-	__u32 pad;
+	__u32 pio;
 };
 
 struct kvm_coalesced_mmio {
 	__u64 phys_addr;
 	__u32 len;
-	__u32 pad;
+	__u32 pio;
 	__u8  data[8];
 };
 
@@ -949,6 +949,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_GET_MSR_FEATURES 153
 #define KVM_CAP_HYPERV_EVENTFD 154
 #define KVM_CAP_HYPERV_TLBFLUSH 155
+#define KVM_CAP_COALESCED_PIO 156
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/virt/kvm/coalesced_mmio.c b/virt/kvm/coalesced_mmio.c
index 9e65feb..fc66a834 100644
--- a/virt/kvm/coalesced_mmio.c
+++ b/virt/kvm/coalesced_mmio.c
@@ -83,6 +83,7 @@ static int coalesced_mmio_write(struct kvm_vcpu *vcpu,
 	ring->coalesced_mmio[ring->last].phys_addr = addr;
 	ring->coalesced_mmio[ring->last].len = len;
 	memcpy(ring->coalesced_mmio[ring->last].data, val, len);
+	ring->coalesced_mmio[ring->last].pio = dev->zone.pio;
 	smp_wmb();
 	ring->last = (ring->last + 1) % KVM_COALESCED_MMIO_MAX;
 	spin_unlock(&dev->kvm->ring_lock);
@@ -149,8 +150,12 @@ int kvm_vm_ioctl_register_coalesced_mmio(struct kvm *kvm,
 	dev->zone = *zone;
 
 	mutex_lock(&kvm->slots_lock);
-	ret = kvm_io_bus_register_dev(kvm, KVM_MMIO_BUS, zone->addr,
-				      zone->size, &dev->dev);
+	if (zone->pio)
+		ret = kvm_io_bus_register_dev(kvm, KVM_PIO_BUS, zone->addr,
+					      zone->size, &dev->dev);
+	else
+		ret = kvm_io_bus_register_dev(kvm, KVM_MMIO_BUS, zone->addr,
+					      zone->size, &dev->dev);
 	if (ret < 0)
 		goto out_free_dev;
 	list_add_tail(&dev->list, &kvm->coalesced_zones);
@@ -174,7 +179,12 @@ int kvm_vm_ioctl_unregister_coalesced_mmio(struct kvm *kvm,
 
 	list_for_each_entry_safe(dev, tmp, &kvm->coalesced_zones, list)
 		if (coalesced_mmio_in_range(dev, zone->addr, zone->size)) {
-			kvm_io_bus_unregister_dev(kvm, KVM_MMIO_BUS, &dev->dev);
+			if (zone->pio)
+				kvm_io_bus_unregister_dev(kvm, KVM_PIO_BUS,
+						&dev->dev);
+			else
+				kvm_io_bus_unregister_dev(kvm, KVM_MMIO_BUS,
+						&dev->dev);
 			kvm_iodevice_destructor(&dev->dev);
 		}
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 8b47507f..a587fb9 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2936,6 +2936,8 @@ static long kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
 #ifdef CONFIG_KVM_MMIO
 	case KVM_CAP_COALESCED_MMIO:
 		return KVM_COALESCED_MMIO_PAGE_OFFSET;
+	case KVM_CAP_COALESCED_PIO:
+		return 1;
 #endif
 #ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
 	case KVM_CAP_IRQ_ROUTING:
-- 
2.7.4
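
Since each ring entry now carries a pio field, a userspace VMM has to look
at it when it replays the ring. The following is only a sketch of what such
a drain loop might look like, not code from this patch or from any existing
VMM. struct coalesced_entry is a local mirror of the patched struct
kvm_coalesced_mmio, and RING_MAX, struct io_handlers, flush_coalesced() and
the counting helpers are hypothetical names. It also leaves out the memory
barriers a real consumer needs when it shares the ring with the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define RING_MAX 8  /* hypothetical; the real ring uses KVM_COALESCED_MMIO_MAX */

/* Local mirror of the patched struct kvm_coalesced_mmio, for illustration. */
struct coalesced_entry {
	uint64_t phys_addr;
	uint32_t len;
	uint32_t pio;      /* nonzero: port I/O, zero: MMIO */
	uint8_t  data[8];
};

struct coalesced_ring {
	uint32_t first, last;
	struct coalesced_entry entries[RING_MAX];
};

/* Hypothetical device-model callbacks supplied by the VMM. */
struct io_handlers {
	void (*pio_write)(void *opaque, uint16_t port,
			  const uint8_t *data, uint32_t len);
	void (*mmio_write)(void *opaque, uint64_t addr,
			   const uint8_t *data, uint32_t len);
	void *opaque;
};

/* Replay every buffered write in order; returns the number replayed. */
unsigned flush_coalesced(struct coalesced_ring *ring,
			 const struct io_handlers *h)
{
	unsigned n = 0;

	while (ring->first != ring->last) {
		const struct coalesced_entry *e = &ring->entries[ring->first];

		if (e->pio)
			h->pio_write(h->opaque, (uint16_t)e->phys_addr,
				     e->data, e->len);
		else
			h->mmio_write(h->opaque, e->phys_addr,
				      e->data, e->len);

		ring->first = (ring->first + 1) % RING_MAX;
		n++;
	}
	return n;
}

/* Example handlers that only count calls, used to exercise the loop. */
struct counters {
	unsigned pio, mmio;
	uint16_t last_port;
};

static void count_pio(void *opaque, uint16_t port,
		      const uint8_t *data, uint32_t len)
{
	struct counters *c = opaque;

	(void)data;
	(void)len;
	c->pio++;
	c->last_port = port;
}

static void count_mmio(void *opaque, uint64_t addr,
		       const uint8_t *data, uint32_t len)
{
	struct counters *c = opaque;

	(void)addr;
	(void)data;
	(void)len;
	c->mmio++;
}

/* Queue one PIO write (RTC index) and one MMIO write, then flush both. */
struct counters demo_flush(void)
{
	struct counters c = { 0 };
	struct io_handlers h = { count_pio, count_mmio, &c };
	struct coalesced_ring ring = { 0 };
	unsigned n;

	ring.entries[0] = (struct coalesced_entry){
		.phys_addr = 0x70, .len = 1, .pio = 1, .data = { 0x0a } };
	ring.entries[1] = (struct coalesced_entry){
		.phys_addr = 0xa0000, .len = 1, .pio = 0, .data = { 0xff } };
	ring.last = 2;

	n = flush_coalesced(&ring, &h);
	assert(n == 2 && ring.first == ring.last);
	return c;
}
```

The dispatch on e->pio mirrors the KVM_PIO_BUS / KVM_MMIO_BUS choice made at
registration time in the patch, so a write is replayed into the same
address space it was captured from.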