From: Vitaly Mayatskih
Date: Tue, 6 Nov 2018 15:08:26 -0500
Subject: Re: [PATCH 0/1] vhost: add vhost_blk driver
To: stefanha@gmail.com
Cc: Jason Wang, Paolo Bonzini, kvm@vger.kernel.org,
    virtualization@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
    Kevin Wolf, "Michael S. Tsirkin", den@virtuozzo.com
In-Reply-To: <20181106154048.GB31579@stefanha-x1.localdomain>
References: <20181102182123.29420-1-v.mayatskih@gmail.com>
    <20181102142446-mutt-send-email-mst@kernel.org>
    <20181106154048.GB31579@stefanha-x1.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Nov 6, 2018 at 10:40 AM Stefan Hajnoczi wrote:

> Previously vhost_blk.ko implementations were basically the same thing as
> the QEMU x-data-plane=on (dedicated thread using Linux AIO), except they
> were using a kernel thread and maybe submitted bios.
>
> The performance differences weren't convincing enough that it seemed
> worthwhile maintaining another code path which loses live migration, I/O
> throttling, image file formats, etc (all the things that QEMU's block
> layer supports).
>
> Two changes since then:
>
> 1. x-data-plane=on has been replaced with a full trip down QEMU's block
>    layer (-object iothread,id=iothread0 -device
>    virtio-blk-pci,iothread=iothread0,...). It's slower and not truly
>    multiqueue (yet!).
>
> So from this perspective vhost_blk.ko might be more attractive again, at
> least until further QEMU block layer work eliminates the multiqueue and
> performance overheads.

Yes, this work is a direct consequence of the insufficient performance of
virtio-blk's host side. I'm working on a storage driver, but there is no
good way to feed all of that IO into one disk of one VM. The nature of the
storage design dictates the need for very high IOPS as seen by the VM.
This is only one tiny use case, of course, but the vhost/QEMU change is
small enough to share.

> 2. SPDK has become available for users who want the best I/O performance
>    and are willing to sacrifice CPU cores for polling.
>
> If you want better performance and don't care about QEMU block layer
> features, could you use SPDK? People who are the target market for
> vhost_blk.ko would probably be willing to use SPDK and it already
> exists...

Yes, though in my experience SPDK creates more problems than it solves
most of the time ;)

What I find very compelling about using a plain Linux block device is that
it is really fast these days (blk-mq), and the device mapper can be used
for even greater flexibility. The device mapper is less than perfect
performance-wise and will need some work at some point, for sure, but it
can still push a few million IOPS through. And it is all standard code
with decades-old user APIs.

In fact, the Linux kernel is so good now that our pure-software solution
can push IO at rates up to the limits of fat hardware (x00 GbE, a bunch of
NVMes) without an apparent need for hardware acceleration. And, without
hardware dependencies, it is much more flexible. The disk interface
between the host and the VM was the only major bottleneck.

> From the QEMU userspace perspective, I think the best way to integrate
> vhost_blk.ko is to transparently switch to it when possible. If the user
> enables QEMU block layer features that are incompatible with
> vhost_blk.ko, then it should fall back to the QEMU block layer
> transparently.
Sounds like an excellent idea! I'll do that. Most of the vhost-blk support
in QEMU is boilerplate code anyway.

> I'm not keen on yet another code path with its own set of limitations
> and having to educate users about how to make the choice. But if it can
> be integrated transparently as an "accelerator", then it could be
> valuable.

Understood. Agreed. Thanks!

-- 
wbr, Vitaly
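[Editor's note: for concreteness, the iothread-based setup Stefan refers
to in point 1 corresponds to a QEMU invocation roughly like the sketch
below. The image path, object/drive IDs, and the cache/aio options are
illustrative placeholders, not details given in this thread.]

```shell
# Route a virtio-blk device through QEMU's block layer in a dedicated
# iothread (the replacement for the old x-data-plane=on). disk.img,
# iothread0, and drive0 are hypothetical names for this sketch.
qemu-system-x86_64 \
  -machine accel=kvm \
  -object iothread,id=iothread0 \
  -drive file=disk.img,format=raw,if=none,id=drive0,cache=none,aio=native \
  -device virtio-blk-pci,iothread=iothread0,drive=drive0
```

The `-object iothread` creates the dedicated event-loop thread, and
`iothread=iothread0` on the device binds that device's I/O processing to
it instead of the main loop.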