Received: by 2002:a05:6358:3188:b0:123:57c1:9b43 with SMTP id q8csp367460rwd; Thu, 1 Jun 2023 01:04:43 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ56y57M53JdQMahKERg56JwxsSR7U9FNH4kg7pltyPeuZxGOeYiaFwcoGD+cgJNWw2rMKhb X-Received: by 2002:a17:902:ceca:b0:1b0:6c10:6836 with SMTP id d10-20020a170902ceca00b001b06c106836mr3933589plg.33.1685606682781; Thu, 01 Jun 2023 01:04:42 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1685606682; cv=none; d=google.com; s=arc-20160816; b=iRIx7+Ugw/EKd6kb+iIeq4bSstG7yu3OuXa8tKq8ElSplnnZ0ZUKvqxoJheSUT+h96 NMA0vVCiHqSEFVnTsbc0Gn+kRtKgl8C9fi6WJ8cKrG1YGJEtGBuIb7nFC4fgJHvpG4Kj CjJURfc8wLdiImNwCQaz7HXhRX12gLwJBdV2IQZ/Y9FkGPokSr9Q+QlOCHPqnlaALNmM nWCUU4lRoKYDVK2wusqNhidKC5I3P25qjyH5c7IdipGbYU1WZYl2B4Ee+k8L7Q294NH6 JymSGoKxB8tmTZ6CU/WQ7IKs+ObHMroVGUeqJ2qb3Ux0+D+D4aJg7ywpHjKvyUEZuu9n HH/A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=LJY0ffzy9d7zjgW/HEvqIO2dS8yVBrmwr6J73PuSOuU=; b=a2nnafKh/kIIxcDgNcI8/d1c0dZCGD2PgcevLcAqtkUtzHSn0oD8jqZbDDDCHXVp9O w+cRara62GaJaivzL2GprDQ0D02J0fApU5EaGAkzxp7snqgVO+7iCIAAUwPV/fpgUzNq 19RmIWPOaaRed+mdDqyhC/jFAcGTt+7565iAdDgvMrCHGUxWKYhFvi5iIFFBJ9VBWCNE p+53ZbAIqCYlKw0EeXDziqZmeIzbSRAn18fzPRkTEpWnt41zi5XYhjOuh/e2Jo/NHCa0 pbjqek6YWjNaqBsEd4gX2MAtsCMebr3bj4x1lYjmt8rqktraAp1jAsecwXtG5anEgrig j68w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=CgGko6px; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id jb18-20020a170903259200b001b02ce8470dsi2212945plb.543.2023.06.01.01.04.28; Thu, 01 Jun 2023 01:04:42 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=CgGko6px; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229927AbjFAH5b (ORCPT + 99 others); Thu, 1 Jun 2023 03:57:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52668 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230046AbjFAHzZ (ORCPT ); Thu, 1 Jun 2023 03:55:25 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 76B85170D for ; Thu, 1 Jun 2023 00:47:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1685605672; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=LJY0ffzy9d7zjgW/HEvqIO2dS8yVBrmwr6J73PuSOuU=; b=CgGko6pxX5vQUtwvveXVGwMkmHEdL/1TxFwGQGOQTVaJM3XcLWQ3rr+Oa/5vLiRh9IvZTZ Av/mqrbLDZ5AQCXEKSgx12paf/pxL1kPDEiC3C+KwIfjJI2cpmSIQtPeQKuVErI4y/OT83 IRBcmFWtPJs90iQU7c1h1G/rs1Lsd0g= Received: from mail-lj1-f199.google.com (mail-lj1-f199.google.com [209.85.208.199]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-343-xiAgqL4hNf-L1kX4hbbJmA-1; Thu, 01 Jun 2023 03:47:51 -0400 X-MC-Unique: xiAgqL4hNf-L1kX4hbbJmA-1 Received: by mail-lj1-f199.google.com with SMTP id 38308e7fff4ca-2b057e763a3so3920581fa.3 for ; Thu, 01 Jun 2023 00:47:51 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1685605670; x=1688197670; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=LJY0ffzy9d7zjgW/HEvqIO2dS8yVBrmwr6J73PuSOuU=; b=H5FsB80GdMoq/J9kjw/U4Iy0YWuUNTepMOArlHWsZtJO7Zz+stZVX/5bgbdZTdWpXg DpPUkvg3YySUqxh2ci/pb9ST6EHLQurb4nWk8mbMwKDZSL6Hyz1qAkK7+VlQvTZGOjqV KzrEgvdlqfx3mIJ97xQVjZ9e6ti8UW74hgc9RmYL+BiNAkk2EViIqF7weAmYyRHkuh0O WjCQTZXeEIb7fSX9vZpOmjdkE5i3mvAw0DnEnY2mDWd5C/ssDXetLyBV/JVPsMORCPIo 1ufwVLFQzeJ7Pe2EOXkCCk3j/K7NelRL27wmoXbRbLh+TgQHabuHed3ILIJhBhc0J90F KZvA== X-Gm-Message-State: AC+VfDwtei6zP4ffhpvCIJVGFWfP+5OUflafYLW3HmYlmLshjbCkXreX fCaPw2uH6Wf5BmcRjnW6ZewjcjSryY9wXccWRN2Hgjefnv/FKU0xxY/lR5uICZTGEFxHUUytjEd 8KZSBnKYOl3FOQeqE8QtGtsfV X-Received: by 2002:a2e:8782:0:b0:2ad:8c4a:ef7e with SMTP id n2-20020a2e8782000000b002ad8c4aef7emr4339346lji.43.1685605670000; Thu, 01 Jun 2023 00:47:50 -0700 (PDT) X-Received: by 2002:a2e:8782:0:b0:2ad:8c4a:ef7e with SMTP id n2-20020a2e8782000000b002ad8c4aef7emr4339335lji.43.1685605669610; Thu, 01 Jun 2023 00:47:49 -0700 (PDT) Received: from sgarzare-redhat ([134.0.3.103]) by smtp.gmail.com with ESMTPSA id k12-20020a056402048c00b00514c4350243sm1763867edv.56.2023.06.01.00.47.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 01 Jun 2023 00:47:48 -0700 (PDT) Date: Thu, 1 Jun 2023 09:47:45 +0200 From: Stefano Garzarella To: Mike Christie Cc: "Michael S. Tsirkin" , syzbot , jasowang@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@vger.kernel.org, syzkaller-bugs@googlegroups.com, virtualization@lists.linux-foundation.org, stefanha@redhat.com Subject: Re: [syzbot] [kvm?] [net?] [virt?] general protection fault in vhost_work_queue Message-ID: <7vk2uizpmf4fi54tmmopnbwwb7fs2xg6vae6ynrcvs26hjmshb@hpjzu4jfj35i> References: <0000000000001777f605fce42c5f@google.com> <20230530072310-mutt-send-email-mst@kernel.org> <6p7pi6mf3db3gp3xqarap4uzrgwlzqiz7wgg5kn2ep7hvrw5pg@wxowhbw4e7w7> <035e3423-c003-3de9-0805-2091b9efb45d@oracle.com> <43f67549-fe4d-e3ca-fbb0-33bea6e2b534@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Disposition: inline In-Reply-To: X-Spam-Status: No, score=-2.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, May 31, 2023 at 11:27:12AM -0500, Mike Christie wrote: >On 5/31/23 10:15 AM, Mike Christie wrote: >>>> rcu would work for your case and for what Jason had requested. >>> Yeah, so you already have some patches? >>> >>> Do you want to send it to solve this problem? >>> >> Yeah, I'll break them out and send them later today when I can retest >> rebased patches. >> > >Just one question. Do you core vhost developers consider RCU more complex >or switching to READ_ONCE/WRITE_ONCE? I am asking because for this immediate >regression fix we could just switch to the latter like below to just fix >the crash if we think that is more simple. > >I think RCU is just a little more complex/invasive because it will have the >extra synchronize_rcu calls. Yes, you may be right, in this case we should just need READ_ONCE/WRITE_ONCE if dev->worker is no longer a pointer. > > >diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c >index a92af08e7864..03fd47a22a73 100644 >--- a/drivers/vhost/vhost.c >+++ b/drivers/vhost/vhost.c >@@ -235,7 +235,7 @@ void vhost_dev_flush(struct vhost_dev *dev) > { > struct vhost_flush_struct flush; > >- if (dev->worker) { >+ if (READ_ONCE(dev->worker.vtsk)) { > init_completion(&flush.wait_event); > vhost_work_init(&flush.work, vhost_flush_work); > >@@ -247,7 +247,9 @@ EXPORT_SYMBOL_GPL(vhost_dev_flush); > > void vhost_work_queue(struct vhost_dev *dev, struct vhost_work *work) > { >- if (!dev->worker) >+ struct vhost_task *vtsk = READ_ONCE(dev->worker.vtsk); >+ >+ if (!vtsk) > return; > > if (!test_and_set_bit(VHOST_WORK_QUEUED, &work->flags)) { >@@ -255,8 +257,8 @@ void vhost_work_queue(struct vhost_dev *dev, struct vhost_work *work) > * sure it was not in the list. > * test_and_set_bit() implies a memory barrier. > */ >- llist_add(&work->node, &dev->worker->work_list); >- wake_up_process(dev->worker->vtsk->task); >+ llist_add(&work->node, &dev->worker.work_list); >+ wake_up_process(vtsk->task); > } > } > EXPORT_SYMBOL_GPL(vhost_work_queue); >@@ -264,7 +266,7 @@ EXPORT_SYMBOL_GPL(vhost_work_queue); > /* A lockless hint for busy polling code to exit the loop */ > bool vhost_has_work(struct vhost_dev *dev) > { >- return dev->worker && !llist_empty(&dev->worker->work_list); >+ return !llist_empty(&dev->worker.work_list); > } > EXPORT_SYMBOL_GPL(vhost_has_work); > >@@ -468,7 +470,7 @@ void vhost_dev_init(struct vhost_dev *dev, > dev->umem = NULL; > dev->iotlb = NULL; > dev->mm = NULL; >- dev->worker = NULL; >+ memset(&dev->worker, 0, sizeof(dev->worker)); > dev->iov_limit = iov_limit; > dev->weight = weight; > dev->byte_weight = byte_weight; >@@ -542,46 +544,38 @@ static void vhost_detach_mm(struct vhost_dev *dev) > > static void vhost_worker_free(struct vhost_dev *dev) > { >- struct vhost_worker *worker = dev->worker; >+ struct vhost_task *vtsk = READ_ONCE(dev->worker.vtsk); > >- if (!worker) >+ if (!vtsk) > return; > >- dev->worker = NULL; >- WARN_ON(!llist_empty(&worker->work_list)); >- vhost_task_stop(worker->vtsk); >- kfree(worker); >+ vhost_task_stop(vtsk); >+ WARN_ON(!llist_empty(&dev->worker.work_list)); >+ WRITE_ONCE(dev->worker.vtsk, NULL); The patch LGTM, I just wonder if we should set dev->worker to zero here, but maybe we don't need to. Thanks, Stefano > } > > static int vhost_worker_create(struct vhost_dev *dev) > { >- struct vhost_worker *worker; > struct vhost_task *vtsk; > char name[TASK_COMM_LEN]; > int ret; > >- worker = kzalloc(sizeof(*worker), GFP_KERNEL_ACCOUNT); >- if (!worker) >- return -ENOMEM; >- >- dev->worker = worker; >- worker->kcov_handle = kcov_common_handle(); >- init_llist_head(&worker->work_list); >+ dev->worker.kcov_handle = kcov_common_handle(); >+ init_llist_head(&dev->worker.work_list); > snprintf(name, sizeof(name), "vhost-%d", current->pid); > >- vtsk = vhost_task_create(vhost_worker, worker, name); >+ vtsk = vhost_task_create(vhost_worker, &dev->worker, name); > if (!vtsk) { > ret = -ENOMEM; > goto free_worker; > } > >- worker->vtsk = vtsk; >+ WRITE_ONCE(dev->worker.vtsk, vtsk); > vhost_task_start(vtsk); > return 0; > > free_worker: >- kfree(worker); >- dev->worker = NULL; >+ WRITE_ONCE(dev->worker.vtsk, NULL); > return ret; > } > >diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h >index 0308638cdeee..305ec8593d46 100644 >--- a/drivers/vhost/vhost.h >+++ b/drivers/vhost/vhost.h >@@ -154,7 +154,7 @@ struct vhost_dev { > struct vhost_virtqueue **vqs; > int nvqs; > struct eventfd_ctx *log_ctx; >- struct vhost_worker *worker; >+ struct vhost_worker worker; > struct vhost_iotlb *umem; > struct vhost_iotlb *iotlb; > spinlock_t iotlb_lock; >