Received: by 2002:a6b:fb09:0:0:0:0:0 with SMTP id h9csp1095570iog; Wed, 15 Jun 2022 21:13:44 -0700 (PDT) X-Google-Smtp-Source: AGRyM1tCCqk8X7DmQ82hk70BtMiOuEIVPkJuECFYOwGqvkYCUFcHf/bPXByiQ+TzV0Ddt411nMvT X-Received: by 2002:a17:902:b714:b0:168:f72d:b8d5 with SMTP id d20-20020a170902b71400b00168f72db8d5mr2826180pls.66.1655352824629; Wed, 15 Jun 2022 21:13:44 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1655352824; cv=none; d=google.com; s=arc-20160816; b=t9P2LYY8DqTTJWGaL4hnNoenW61bkFcgNM6i1n2Fj//UC5BowkEbP9vKH5oUdIFBR5 cqkLH/gQH3cU5m20uwxTQsYjUxU2JlV0jJ3B23nDZ8k5VIhyQe21CNNyOBRrfOyxAD4A SmZ4NpM/VVnM/UxxgepTWtvGbQLMhx/kTUKDxKmwSyJajVfaFOAtdcCiYHkLJLKBMjy7 pjdUO8WnejGoUqCp+SmSuH7B1jPXH20sWpU75MuDX0F+BhCMi5LpeA2zexYoPVVwK5am tWZN+mn+1/OzzLe8u2JIaaXuNSSJwd2fI1OpXVxN7a2HMkgWq3NWs3AcBqmVfEzCNwCQ YJsA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-language:content-transfer-encoding :in-reply-to:mime-version:user-agent:date:message-id:from:references :cc:to:subject:dkim-signature; bh=WzJNadh1hSnmmL0dW8LI/MHOGJubcN11puGnqaQm174=; b=UnJRf0joGUpuY2FOchAq4/6Iy68cZgrirbDdiAVK1c/LVDy9zO1nUwR/q82s3ox2He rIWtRSbIbhDw2Wn05PsaR1yFnCjWW6LMgkHlPvfc/SHJsReHN0HHDGpWivsvGtDJeqHt DhJEgTgxffD3NKM6Ga0yNrY8H/TRSxW2hep3ryCtneSNQXklkGK4dOp7thSI+PugUcsp UTUtcwERb8fYqEuHE/SFzb9MN/dnOwOxeP73QVRkA+E5xlHqOSBq76iPS8bLWfhBScl7 JL7qKFkGKHi3CZOeh77ydXEUmt7sIOeQzkrWjbWGbYmmZ9NzPaZdzXlWgrYmwNnRhBaw UVJQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=YgP+guyi; spf=pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id q6-20020a170902dac600b00161c545c5c3si1371400plx.601.2022.06.15.21.13.17; Wed, 15 Jun 2022 21:13:44 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=YgP+guyi; spf=pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1358607AbiFPEKd (ORCPT + 99 others); Thu, 16 Jun 2022 00:10:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54202 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1358620AbiFPEKZ (ORCPT ); Thu, 16 Jun 2022 00:10:25 -0400 Received: from mail-pf1-x42b.google.com (mail-pf1-x42b.google.com [IPv6:2607:f8b0:4864:20::42b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 66B0757990 for ; Wed, 15 Jun 2022 21:10:24 -0700 (PDT) Received: by mail-pf1-x42b.google.com with SMTP id e11so414426pfj.5 for ; Wed, 15 Jun 2022 21:10:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-transfer-encoding:content-language; bh=WzJNadh1hSnmmL0dW8LI/MHOGJubcN11puGnqaQm174=; b=YgP+guyirvc2wR0eokmvk7hxnf7zaDI1NQnUMfVym3nkeqVioUdT0MkHkHtCM5Nmmv K7rzxJjZNn5BcBT5Tw7y29hcawM66158z5lOYoXPgw0SD8qDEIHodoP6/0lgOwQ97/bk BjZbwFE29Ljfquda8NqLiQrKRG5h6Lxre5G6MEgEleq2gE7gojxULZ4vbbgTmcaFE1AK gHgskvBnpglx5aL22vxohlAsbkYwBJc70n3ch08ugpKpNO0ka/56L6fndfl9z//45aS8 NAdI9LQDmWXh61yKB4P3eCj4x0gqnIQ5JVG3gk+2zN2Vao457kr90g0Bp4C54dLTpTCG Etbg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-transfer-encoding :content-language; bh=WzJNadh1hSnmmL0dW8LI/MHOGJubcN11puGnqaQm174=; b=3nNt+pRrI02a+RR+vcAEZtBz91FH9NZ1k63q0UWOObJfG9474Iyx3bLndOFDFgatv4 FmAtM4xQzbyOJ5CqlDw2T9PWwHmfVsvjdsR5zoczEfiFIo+OJPQOV9Pa6LDhfmhFJi7I NSNJ9K4pO0N2pCjdvdvLe+2EakZk+MWgmeA/6Zpe/O4mpsacCo9hQo/JCmaJk0O0WhHX reQsF7zkH26mGG7/HWyYTPgDeFlbDJqw9ppXuTv1uRtPuYFkGkTIawId8Z5ZdnUluJYe gxqJNinrsn+SGahleXlG/3FT9vUbga+zeskG96lDHIqUMwMksUz2iuS5zjMI5yf3/I5P MM7w== X-Gm-Message-State: AJIora+GMESwr1vTT0DZazHwetxFXEWKtKqeZfelVMTcQMTYlNy9O/hV gemNlspEhGP93jJxYnCFuPNe2w== X-Received: by 2002:aa7:9e9c:0:b0:51b:e1b8:271c with SMTP id p28-20020aa79e9c000000b0051be1b8271cmr2885255pfq.73.1655352623853; Wed, 15 Jun 2022 21:10:23 -0700 (PDT) Received: from [10.31.0.6] ([199.101.192.15]) by smtp.gmail.com with ESMTPSA id n2-20020a63b442000000b003c265b7d4f6sm449755pgu.44.2022.06.15.21.10.20 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 15 Jun 2022 21:10:23 -0700 (PDT) Subject: Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove To: Jean-Philippe Brucker Cc: Greg Kroah-Hartman , Arnd Bergmann , Herbert Xu , Wangzhou , Jonathan Cameron , linux-accelerators@lists.ozlabs.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org, iommu@lists.linux-foundation.org, Yang Shen References: <20220610123423.27496-1-zhangfei.gao@linaro.org> From: Zhangfei Gao Message-ID: Date: Thu, 16 Jun 2022 12:10:18 +0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US X-Spam-Status: No, score=-0.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A,RCVD_IN_DNSWL_NONE, RCVD_IN_SBL_CSS,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Hi, Jean On 2022/6/15 下午11:16, Jean-Philippe Brucker wrote: > Hi, > > On Fri, Jun 10, 2022 at 08:34:23PM +0800, Zhangfei Gao wrote: >> The uacce parent's module can be removed when uacce is working, >> which may cause troubles. >> >> If rmmod/uacce_remove happens just after fops_open: bind_queue, >> the uacce_remove can not remove the bound queue since it is not >> added to the queue list yet, which blocks the uacce_disable_sva. >> >> Change queues_lock area to make sure the bound queue is added to >> the list thereby can be searched in uacce_remove. >> >> And uacce->parent->driver is checked immediately in case rmmod is >> just happening. >> >> Also the parent driver must always stop DMA before calling >> uacce_remove. >> >> Signed-off-by: Yang Shen >> Signed-off-by: Zhangfei Gao >> --- >> drivers/misc/uacce/uacce.c | 19 +++++++++++++------ >> 1 file changed, 13 insertions(+), 6 deletions(-) >> >> diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c >> index 281c54003edc..b6219c6bfb48 100644 >> --- a/drivers/misc/uacce/uacce.c >> +++ b/drivers/misc/uacce/uacce.c >> @@ -136,9 +136,16 @@ static int uacce_fops_open(struct inode *inode, struct file *filep) >> if (!q) >> return -ENOMEM; >> >> + mutex_lock(&uacce->queues_lock); >> + >> + if (!uacce->parent->driver) { > I don't think this is useful, because the core clears parent->driver after > having run uacce_remove(): > > rmmod hisi_zip open() > ... uacce_fops_open() > __device_release_driver() ... > pci_device_remove() > hisi_zip_remove() > hisi_qm_uninit() > uacce_remove() > ... ... > mutex_lock(uacce->queues_lock) > ... if (!uacce->parent->driver) > device_unbind_cleanup() /* driver still valid, proceed */ > dev->driver = NULL The check  if (!uacce->parent->driver) is required, otherwise NULL pointer may happen. iommu_sva_bind_device const struct iommu_ops *ops = dev_iommu_ops(dev);  -> dev->iommu->iommu_dev->ops rmmod has no issue, but remove parent pci device has the issue. Test: sleep in fops_open before mutex. estuary:/mnt$ ./work/a.out & //sleep in fops_open echo 1 > /sys/bus/pci/devices/0000:00:02.0/remove & estuary:/mnt$ [   22.594348] uacce_remove! [   22.594663] pci 0000:00:02.0: Removing from iommu group 2 [   22.595073] iommu_release_device dev->iommu=0 [   22.595076] CPU: 2 PID: 229 Comm: ash Not tainted 5.19.0-rc1-15071-gcbcf098c5257-dirty #633 [   22.595079] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 [   22.595080] Call trace: [   22.595080]  dump_backtrace+0xe4/0xf0 [   22.595085]  show_stack+0x20/0x70 [   22.595086]  dump_stack_lvl+0x8c/0xb8 [   22.595089]  dump_stack+0x18/0x34 [   22.595091]  iommu_release_device+0x90/0x98 [   22.595095]  iommu_bus_notifier+0x58/0x68 [   22.595097]  blocking_notifier_call_chain+0x74/0xa8 [   22.595100]  device_del+0x268/0x3b0 [   22.595102]  pci_remove_bus_device+0x84/0x110 [   22.595106]  pci_stop_and_remove_bus_device_locked+0x30/0x60 ... estuary:/mnt$ [   31.466360] uacce: sleep end! [   31.466362] uacce->parent->driver=0 [   31.466364] uacce->parent->iommu=0 [   31.466365] uacce_bind_queue! [   31.466366] uacce_bind_queue call iommu_sva_bind_device! [   31.466367] uacce->parent=d120d0 [   31.466371] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038 [   31.472870] Mem abort info: [   31.473450]   ESR = 0x0000000096000004 [   31.474223]   EC = 0x25: DABT (current EL), IL = 32 bits [   31.475390]   SET = 0, FnV = 0 [   31.476031]   EA = 0, S1PTW = 0 [   31.476680]   FSC = 0x04: level 0 translation fault [   31.477687] Data abort info: [   31.478294]   ISV = 0, ISS = 0x00000004 [   31.479152]   CM = 0, WnR = 0 [   31.479785] user pgtable: 4k pages, 48-bit VAs, pgdp=00000000714d8000 [   31.481144] [0000000000000038] pgd=0000000000000000, p4d=0000000000000000 [   31.482622] Internal error: Oops: 96000004 [#1] PREEMPT SMP [   31.483784] Modules linked in: hisi_zip [   31.484590] CPU: 2 PID: 228 Comm: a.out Not tainted 5.19.0-rc1-15071-gcbcf098c5257-dirty #633 [   31.486374] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 [   31.487862] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [   31.489390] pc : iommu_sva_bind_device+0x44/0xf4 [   31.490404] lr : uacce_fops_open+0x128/0x234 > > Since uacce_remove() disabled SVA, the following uacce_bind_queue() will > fail anyway. However, if uacce->flags does not have UACCE_DEV_SVA set, > we'll proceed further and call uacce->ops->get_queue(), which does not > exist anymore since the parent module is gone. > > I think we need the global uacce_mutex to serialize uacce_remove() and > uacce_fops_open(). uacce_remove() would do everything, including > xa_erase(), while holding that mutex. And uacce_fops_open() would try to > obtain the uacce object from the xarray while holding the mutex, which > fails if the uacce object is being removed. Since fops_open get char device refcount, uacce_release will not happen until open returns. So either uacce = xa_load(&uacce_xa, iminor(inode)) is got, uacce_release release uacce after fops_release. Or uacce is not got and return -ENODEV. open:         uacce = xa_load(&uacce_xa, iminor(inode));         if (!uacce)                 return -ENODEV; uacce->dev.release = uacce_release; uacce_release: kfree(uacce); Thanks