From: Stefan Hajnoczi
Date: Sat, 6 Aug 2022 17:47:50 -0400
Subject: Re: IOTLB support for vhost/vsock breaks crosvm on Android
To: Will Deacon
Cc: Stefano Garzarella, Michael Tsirkin, Stefan Hajnoczi, Jason Wang,
    torvalds@linux-foundation.org, ascull@google.com, maz@kernel.org,
    keirf@google.com, jiyong@google.com, kernel-team@android.com,
    linux-kernel@vger.kernel.org, virtualization@lists.linux-foundation.org,
    kvm@vger.kernel.org
In-Reply-To: <20220806143443.GA30658@willie-the-truck>
References: <20220805181105.GA29848@willie-the-truck>
    <20220806074828.zwzgn5gj47gjx5og@sgarzare-redhat>
    <20220806094239.GA30268@willie-the-truck>
    <20220806143443.GA30658@willie-the-truck>

On Sat, Aug 6, 2022 at 10:35 AM Will Deacon wrote:
>
> On Sat, Aug 06, 2022 at 06:52:15AM -0400, Stefan Hajnoczi wrote:
> > On Sat, Aug 6, 2022 at 5:50 AM Will Deacon wrote:
> > > On Sat, Aug 06, 2022 at 09:48:28AM +0200, Stefano Garzarella wrote:
> > > > On Fri, Aug 05, 2022 at 07:11:06PM +0100, Will Deacon wrote:
> > > > If the VMM implements the translation feature, it is right in my opinion
> > > > that it does not enable the feature for the vhost device. Otherwise, if it
> > > > wants the vhost device to do the translation, enable the feature and send
> > > > the IOTLB messages to set the translation.
> > > >
> > > > QEMU for example masks features when not required or supported.
> > > > crosvm should negotiate only the features it supports.
> > > >
> > > > @Michael and @Jason can correct me, but if a vhost device negotiates
> > > > VIRTIO_F_ACCESS_PLATFORM, then it expects the VMM to send IOTLB messages to
> > > > set the translation.
> > >
> > > As above, the issue is that vhost now unconditionally advertises this in
> > > VHOST_GET_FEATURES and so a VMM with no knowledge of IOTLB can end up
> > > enabling it by accident.
> >
> > Unconditionally exposing all vhost feature bits to the guest is
> > incorrect. The emulator must filter out only the feature bits that it
> > supports.
>
> I've evidently done a bad job of explaining this, sorry.
>
> crosvm _does_ filter the feature bits which it passes to vhost.
> It takes the feature set which it has negotiated with the guest and then
> takes the intersection of this set with the set of features which vhost
> advertises. The result is what is passed to VHOST_SET_FEATURES. I included
> the rust for this in my initial mail, but in C it might look something like:
>
>         u64 features = negotiate_features_with_guest(dev);
>
>         ioctl(vhost_fd, VHOST_GET_FEATURES, &vhost_features);
>         vhost_features &= features;     /* Mask out unsupported features */
>         ioctl(vhost_fd, VHOST_SET_FEATURES, &vhost_features);

This is unrelated to the current issue, but this code looks wrong.
VHOST_GET_FEATURES must be called before negotiating with the guest. The
device features must be restricted by vhost before advertising them to
the guest. For example, if a new crosvm binary runs on an old kernel,
then feature bits that crosvm negotiated with the guest may not be
supported by the vhost kernel module, and the device will be broken.

> The problem is that crosvm has negotiated VIRTIO_F_ACCESS_PLATFORM with
> the guest so that restricted DMA is used for the virtio devices. With
> e13a6915a03f, VIRTIO_F_ACCESS_PLATFORM is now advertised by
> VHOST_GET_FEATURES and so IOTLB is enabled by the sequence above.
>
> > For example, see QEMU's vhost-net device's vhost feature bit allowlist:
> > https://gitlab.com/qemu-project/qemu/-/blob/master/hw/net/vhost_net.c#L40
>
> I agree that changing crosvm to use an allowlist would fix the problem,
> I'm just questioning whether we should be changing userspace at all to
> resolve this issue.
>
> > The reason why the emulator (crosvm/QEMU/etc) must filter out feature
> > bits is that vhost devices are full VIRTIO devices. They are a subset
> > of a VIRTIO device and the emulator is responsible for the rest of the
> > device. Some features will require both vhost and emulator support.
> > Therefore it is incorrect to expect the device to work correctly if
> > the vhost feature bits are passed through to the guest.
>
> I think crosvm is trying to cater for this by masking out the features
> it doesn't know about.

Can you point to the guest driver code for restricted DMA? It's unclear
to me what the guest drivers are doing and whether that is VIRTIO spec
compliant.

Is the driver compliant with VIRTIO 1.2 "6.1 Driver Requirements:
Reserved Feature Bits":

  A driver SHOULD accept VIRTIO_F_ACCESS_PLATFORM if it is offered, and
  it MUST then either disable the IOMMU or configure the IOMMU to
  translate bus addresses passed to the device into physical addresses
  in memory. If VIRTIO_F_ACCESS_PLATFORM is not offered, then a driver
  MUST pass only physical addresses to the device.

Stefan
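
For illustration only, the ordering described above (query vhost first, restrict what is offered to the guest, then hand vhost only allowlisted bits) could be sketched roughly as below. This is a sketch under assumptions, not crosvm or QEMU code: the allowlist contents and the helper names features_to_offer() and features_for_vhost() are hypothetical. Real code would obtain vhost_features from ioctl(vhost_fd, VHOST_GET_FEATURES, ...) and pass the final mask to VHOST_SET_FEATURES.

```c
#include <stdint.h>

/* Feature bit numbers as defined by the VIRTIO specification. */
#define VIRTIO_RING_F_INDIRECT_DESC 28
#define VIRTIO_RING_F_EVENT_IDX     29
#define VIRTIO_F_VERSION_1          32
#define VIRTIO_F_ACCESS_PLATFORM    33

#define FEATURE(bit) (1ULL << (bit))

/*
 * Hypothetical allowlist: bits this imaginary VMM is prepared to hand to
 * vhost. VIRTIO_F_ACCESS_PLATFORM is deliberately absent because the VMM
 * in this sketch does not send IOTLB messages.
 */
static const uint64_t vhost_allowlist =
    FEATURE(VIRTIO_RING_F_INDIRECT_DESC) |
    FEATURE(VIRTIO_RING_F_EVENT_IDX) |
    FEATURE(VIRTIO_F_VERSION_1);

/*
 * Step 1, before negotiating with the guest: drop every allowlisted bit
 * that this kernel's vhost does not advertise. Bits outside the allowlist
 * are handled by the VMM alone and are left untouched.
 */
static inline uint64_t features_to_offer(uint64_t vmm_features,
                                         uint64_t vhost_features)
{
    uint64_t missing_in_vhost = vhost_allowlist & ~vhost_features;

    return vmm_features & ~missing_in_vhost;
}

/*
 * Step 2, after the guest has acked its features: pass vhost only the
 * acked bits that are on the allowlist.
 */
static inline uint64_t features_for_vhost(uint64_t guest_acked)
{
    return guest_acked & vhost_allowlist;
}
```

In this model, a VMM that supports VIRTIO_F_VERSION_1 and VIRTIO_F_ACCESS_PLATFORM still offers both to the guest, but features_for_vhost() returns only VIRTIO_F_VERSION_1, so vhost never sees VIRTIO_F_ACCESS_PLATFORM. Likewise, if an older vhost does not advertise VIRTIO_RING_F_EVENT_IDX, that bit is removed before the guest can accept it.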