Received: by 2002:ad5:4acb:0:0:0:0:0 with SMTP id n11csp2025948imw; Tue, 5 Jul 2022 21:22:42 -0700 (PDT) X-Google-Smtp-Source: AGRyM1ucGr6LHnOvyqm/3lNq/l1pe919YpL/0h8IPLyS0Y69AgGp8UNeT+h+k5BEb5eN23209u7L X-Received: by 2002:a05:6402:b3b:b0:439:b0cc:d409 with SMTP id bo27-20020a0564020b3b00b00439b0ccd409mr29186238edb.325.1657081362569; Tue, 05 Jul 2022 21:22:42 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1657081362; cv=none; d=google.com; s=arc-20160816; b=ObUnFQA6U+/62jpm8q5J2p5/YM/PS0ScNjJbrrmCrTjjfx2ILUKlbtL2A68hV9OS++ p05K5n8z9pg9GZn0HqLWVLYMP6Bln2nbam5z63RaIJ8Nw1IlvU0thX+KWBnuDGY0yDaj hdS5yqUiQ33sb9tVCquCmmr0LcOo5jNPmKPydYAmWU51/NjE/cqr5V9l80/muBtgczu7 EnP6he1Cgnqdwfs2m4HXzeMGrPxb+ZDp+I/lSEf5feupqZYJPOV1t1OzniKssTk6mFjs bEcW3VQzLrFNpluVR2dCFQladNQJLaZnPrmmFFpkMcrQxd17jn4V9DUAY3iFB9uZWmVy f2dg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=bgioBYMSEhDMqtphKUl3XC15o9rR3khwbKqKnXOUfUc=; b=xDIHVk/mHM+MW6ESSUHE7ltrae0wwEkfkbpVk/tCERF6iJY7VB1X1F9IB6gSHgWyJ+ dz4gqjZyk0q7NQolXrBcWKaS8DyAMUKmgSlP74iVLzrtbrk3gSzdhsjqryFfVbgpSLTM y4os82Tv1mYa54HIX0rI8vXSPNH/UCrkwVeDcR5uNlqWskP81+fzc3GJoGpeDt2Jw4zW QMuzpZLBDeC/7eLUhXWcFQqJqAOsSqiE3j6NAQSHRwPrIyk5wRsPofqCLZALUNtQ6ZVx wtQaNnJ6mZIIfEEUrcjPVJYvq72Z99ksi4lK1UeB9dFbLySSgKHMdp8+dtcqjeUMS1tf UI1A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=L0FBpkND; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id hc40-20020a17090716a800b007235c2c8a7bsi51843201ejc.855.2022.07.05.21.22.17; Tue, 05 Jul 2022 21:22:42 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=L0FBpkND; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231731AbiGFEDx (ORCPT + 99 others); Wed, 6 Jul 2022 00:03:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35432 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231789AbiGFEDt (ORCPT ); Wed, 6 Jul 2022 00:03:49 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 36BAD1F2F5 for ; Tue, 5 Jul 2022 21:03:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1657080224; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=bgioBYMSEhDMqtphKUl3XC15o9rR3khwbKqKnXOUfUc=; b=L0FBpkND7LDgAh6Ggp0esymT+Twgod8k7DlfGzfZqBlxKyRMVnEzcgGrZwkBE4EzNNnTEc b7sufrE7q7VAsq80vLZhKKcmbspYRy0xEzt/OuqRc/RYGtKwOqoMjn9ZvHNfyVy+/TUBAo Bk7QCAUHTNxky7U9q1caWaf5G/2ePZY= Received: from mail-lf1-f71.google.com (mail-lf1-f71.google.com [209.85.167.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-563-vtF4fWhHP6-Jlwtpl0yvCw-1; Wed, 06 Jul 2022 00:03:42 -0400 X-MC-Unique: vtF4fWhHP6-Jlwtpl0yvCw-1 Received: by mail-lf1-f71.google.com with SMTP id f40-20020a0565123b2800b0048454c5aec2so1010763lfv.1 for ; Tue, 05 Jul 2022 21:03:42 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=bgioBYMSEhDMqtphKUl3XC15o9rR3khwbKqKnXOUfUc=; b=00V0s7ZdSanUdtjNJSn6HYpuzMk7F3Gw4vT5MI3g3/T4iOAPt0f4VHBGsys99lIFzR LZ7n/DLDYL0zt9tioa3bQnqRlUop6IoRq49l73g0s6hg9XKXB3u5NQNo02QunbMUjZlH kKaRQLU47e2rfhgqxwz9QSXENAH7UuYbHLvB9s1tmWeNc8lbVHl0jFzQNwmTQZSclw2F hBkkCXFWM0ephswOwKDnX2pmANDaZUBqmJrJj0z0TKNgFqxmKhJuzMYMNLuL9H58wbSS LsqjNC6MiMsIQjwN02wbR3zv2qC1p5E8Y3fgDsT9+PAVkwrCPtyHOuZUYVwdkfIUBEqS 0gWw== X-Gm-Message-State: AJIora9Te03kTo9eZOPj8ovNAG5CvMLW5p4VH8opFZ1I9VgGzGtjxLbr 89nodSaH5ksZPRaHUqdPIFPH6e3VyA2lf7bw70eOu7J8aIFnjZ4n/CwWfyWSgdluHvAdIcOiH5V kYLott+fp+AVmaD3/NBSzRb7h38krCT9SC/c1EMI3 X-Received: by 2002:a05:6512:158d:b0:47f:718c:28b5 with SMTP id bp13-20020a056512158d00b0047f718c28b5mr25875408lfb.397.1657080219362; Tue, 05 Jul 2022 21:03:39 -0700 (PDT) X-Received: by 2002:a05:6512:158d:b0:47f:718c:28b5 with SMTP id bp13-20020a056512158d00b0047f718c28b5mr25875395lfb.397.1657080219115; Tue, 05 Jul 2022 21:03:39 -0700 (PDT) MIME-Version: 1.0 References: <20220608171334.730739-1-apatel@ventanamicro.com> <20220629174318.GB2018382@p14s> <20220630152003-mutt-send-email-mst@kernel.org> <20220701021536-mutt-send-email-mst@kernel.org> <66323a79-48a7-853e-1c44-9e62fcc5b775@foss.st.com> In-Reply-To: <66323a79-48a7-853e-1c44-9e62fcc5b775@foss.st.com> From: Jason Wang Date: Wed, 6 Jul 2022 12:03:27 +0800 Message-ID: Subject: Re: [PATCH] rpmsg: virtio: Fix broken rpmsg_probe() To: Arnaud POULIQUEN Cc: "Michael S. Tsirkin" , Mathieu Poirier , Anup Patel , Bjorn Andersson , Atish Patra , Alistair Francis , Anup Patel , linux-remoteproc@vger.kernel.org, kvm-riscv@lists.infradead.org, linux-kernel , virtualization Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-3.5 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Jul 4, 2022 at 5:45 PM Arnaud POULIQUEN wrote: > > Hello Jason, > > On 7/4/22 06:35, Jason Wang wrote: > > On Fri, Jul 1, 2022 at 2:16 PM Michael S. Tsirkin wrote: > >> > >> On Fri, Jul 01, 2022 at 09:22:15AM +0800, Jason Wang wrote: > >>> On Fri, Jul 1, 2022 at 3:20 AM Michael S. Tsirkin wrote: > >>>> > >>>> On Thu, Jun 30, 2022 at 11:51:30AM -0600, Mathieu Poirier wrote: > >>>>> + virtualization@lists.linux-foundation.org > >>>>> + jasowang@redhat.com > >>>>> + mst@redhat.com > >>>>> > >>>>> On Thu, 30 Jun 2022 at 10:20, Arnaud POULIQUEN > >>>>> wrote: > >>>>>> > >>>>>> Hi, > >>>>>> > >>>>>> On 6/29/22 19:43, Mathieu Poirier wrote: > >>>>>>> Hi Anup, > >>>>>>> > >>>>>>> On Wed, Jun 08, 2022 at 10:43:34PM +0530, Anup Patel wrote: > >>>>>>>> The rpmsg_probe() is broken at the moment because virtqueue_add_inbuf() > >>>>>>>> fails due to both virtqueues (Rx and Tx) marked as broken by the > >>>>>>>> __vring_new_virtqueue() function. To solve this, virtio_device_ready() > >>>>>>>> (which unbreaks queues) should be called before virtqueue_add_inbuf(). > >>>>>>>> > >>>>>>>> Fixes: 8b4ec69d7e09 ("virtio: harden vring IRQ") > >>>>>>>> Signed-off-by: Anup Patel > >>>>>>>> --- > >>>>>>>> drivers/rpmsg/virtio_rpmsg_bus.c | 6 +++--- > >>>>>>>> 1 file changed, 3 insertions(+), 3 deletions(-) > >>>>>>>> > >>>>>>>> diff --git a/drivers/rpmsg/virtio_rpmsg_bus.c b/drivers/rpmsg/virtio_rpmsg_bus.c > >>>>>>>> index 905ac7910c98..71a64d2c7644 100644 > >>>>>>>> --- a/drivers/rpmsg/virtio_rpmsg_bus.c > >>>>>>>> +++ b/drivers/rpmsg/virtio_rpmsg_bus.c > >>>>>>>> @@ -929,6 +929,9 @@ static int rpmsg_probe(struct virtio_device *vdev) > >>>>>>>> /* and half is dedicated for TX */ > >>>>>>>> vrp->sbufs = bufs_va + total_buf_space / 2; > >>>>>>>> > >>>>>>>> + /* From this point on, we can notify and get callbacks. */ > >>>>>>>> + virtio_device_ready(vdev); > >>>>>>>> + > >>>>>>> > >>>>>>> Calling virtio_device_ready() here means that virtqueue_get_buf_ctx_split() can > >>>>>>> potentially be called (by way of rpmsg_recv_done()), which will race with > >>>>>>> virtqueue_add_inbuf(). If buffers in the virtqueue aren't available then > >>>>>>> rpmsg_recv_done() will fail, potentially breaking remote processors' state > >>>>>>> machines that don't expect their initial name service to fail when the "device" > >>>>>>> has been marked as ready. > >>>>>>> > >>>>>>> What does make me curious though is that nobody on the remoteproc mailing list > >>>>>>> has complained about commit 8b4ec69d7e09 breaking their environment... By now, > >>>>>>> i.e rc4, that should have happened. Anyone from TI, ST and Xilinx care to test this on > >>>>>>> their rig? > >>>>>> > >>>>>> I tested on STm32mp1 board using tag v5.19-rc4(03c765b0e3b4) > >>>>>> I confirm the issue! > >>>>>> > >>>>>> Concerning the solution, I share Mathieu's concern. This could break legacy. > >>>>>> I made a short test and I would suggest to use __virtio_unbreak_device instead, tounbreak the virtqueues without changing the init sequence. > >>>>>> > >>>>>> I this case the patch would be: > >>>>>> > >>>>>> + /* > >>>>>> + * Unbreak the virtqueues to allow to add buffers before setting the vdev status > >>>>>> + * to ready > >>>>>> + */ > >>>>>> + __virtio_unbreak_device(vdev); > >>>>>> + > >>>>>> > >>>>>> /* set up the receive buffers */ > >>>>>> for (i = 0; i < vrp->num_bufs / 2; i++) { > >>>>>> struct scatterlist sg; > >>>>>> void *cpu_addr = vrp->rbufs + i * vrp->buf_size; > >>>>> > >>>>> This will indeed fix the problem. On the flip side the kernel > >>>>> documentation for __virtio_unbreak_device() puzzles me... > >>>>> It clearly states that it should be used for probing and restoring but > >>>>> _not_ directly by the driver. Function rpmsg_probe() is part of > >>>>> probing but also the entry point to a driver. > >>>>> > >>>>> Michael and virtualisation folks, is this the right way to move forward? > >>>> > >>>> I don't think it is, __virtio_unbreak_device is intended for core use. > >>> > >>> Can we fill the rx after virtio_device_ready() in this case? > >>> > >>> Btw, the driver set driver ok after registering, we probably get a svq > >>> kick before DRIVER_OK? > > By "registering" you mean calling rpmsg_virtio_add_ctrl_dev and > rpmsg_ns_register_device? Yes. > > The rpmsg_ns_register_device has to be called before. Because it has to be > probed to handle the first message coming from the remote side to create > associated rpmsg local device. I couldn't find the code to do this, maybe you can give me some hint on this. > It doesn't send message. I see the function register the device to the bus, I wonder if this means the device could be probed and used by the driver before virtio_device_ready(). > > The risk could be for the rpmsg_ctrl device. Registering it > after the virtio_device_ready(vdev) call could make sense... I see. > > >>> > >>> Thanks > >> > >> Is this an ack for the original patch? > > > > Nope, I meant, instead of moving virtio_device_ready() a little bit > > earlier, can we only move the rvq filling after virtio_device_ready(). > > > > Thanks > > Please find some concerns about this inversion here: > https://lore.kernel.org/lkml/20220701053813-mutt-send-email-mst@kernel.org/ > > Regarding __virtio_unbreak_device. The pending virtio_break_device is > used by some virtio driver. > Could we consider that it makes sense to also have a > virtio_unbreak_device interface? We don't want to allow the driver to unbreak a device since it's easier to have bugs. > > > I do not well understand the reason of the commit: > 8b4ec69d7e09 ("virtio: harden vring IRQ", 2022-05-27) It tries to forbid the virtqueue callbacks to be called before virtio_device_ready(). This helps to prevent the malicious device from attacking the driver. But unfortunately, it breaks several driver because: 1) some driver have races in probe/remove 2) it tries to reuse vq->broken which may break the driver that call virqueue_add() before virtio_device_ready() which is allowed by the spec There's a discussion to have a better behavior that doesn't break the existing drivers. And the IRQ hardening feature is marked as broken now, so rpmsg should be fine without any extra effort. > So following alternative is probably pretty naive: > Is the use of virtqueue_disable_cb could be an alternative to the > vq->broken usage allowing to register buffer while preventing virtqueue IRQ? Probably not, there's no guarantee that the device will not send notification after virqtueue_disable_cb(). Thanks > > Thanks, > Arnaud > > > > >> > >>>> > >>>>>> > >>>>>> Regards, > >>>>>> Arnaud > >>>>>> > >>>>>>> > >>>>>>> Thanks, > >>>>>>> Mathieu > >>>>>>> > >>>>>>>> /* set up the receive buffers */ > >>>>>>>> for (i = 0; i < vrp->num_bufs / 2; i++) { > >>>>>>>> struct scatterlist sg; > >>>>>>>> @@ -983,9 +986,6 @@ static int rpmsg_probe(struct virtio_device *vdev) > >>>>>>>> */ > >>>>>>>> notify = virtqueue_kick_prepare(vrp->rvq); > >>>>>>>> > >>>>>>>> - /* From this point on, we can notify and get callbacks. */ > >>>>>>>> - virtio_device_ready(vdev); > >>>>>>>> - > >>>>>>>> /* tell the remote processor it can start sending messages */ > >>>>>>>> /* > >>>>>>>> * this might be concurrent with callbacks, but we are only > >>>>>>>> -- > >>>>>>>> 2.34.1 > >>>>>>>> > >>>> > >> > > >