Message-ID: <38e416b47bb30fa161e52f24ecbcf95015480fed.camel@redhat.com>
Subject: Re: [PATCH v2] dmaengine: idxd: avoid deadlock in process_misc_interrupts()
From: Jerry Snitselaar
To: Dave Jiang
Cc: linux-kernel@vger.kernel.org, Fenghua Yu, Vinod Koul, dmaengine@vger.kernel.org
Date: Wed, 24 Aug 2022 11:42:11 -0700
In-Reply-To: <223e5a43-95a5-da54-0ff7-c2e088a072e3@intel.com>
References: <20220823162435.2099389-1-jsnitsel@redhat.com>
 <20220823163709.2102468-1-jsnitsel@redhat.com>
 <905d3feb-f75b-e91c-f3de-b69718aa5c69@intel.com>
 <20220824005435.jyexxvjxj3z7tc2f@cantor>
 <223e5a43-95a5-da54-0ff7-c2e088a072e3@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2022-08-24 at 10:45 -0700, Dave Jiang wrote:
>
> On 8/23/2022 5:54 PM, Jerry Snitselaar wrote:
> > On Tue, Aug 23, 2022 at 09:46:19AM -0700, Dave Jiang wrote:
> > > On 8/23/2022 9:37 AM, Jerry Snitselaar wrote:
> > > > idxd_device_clear_state() now grabs the idxd->dev_lock
> > > > itself, so don't grab the lock prior to calling it.
> > > >
> > > > This was seen in testing after dmar fault occurred on system,
> > > > resulting in lockup stack traces.
> > > >
> > > > Cc: Fenghua Yu
> > > > Cc: Dave Jiang
> > > > Cc: Vinod Koul
> > > > Cc: dmaengine@vger.kernel.org
> > > > Fixes: cf4ac3fef338 ("dmaengine: idxd: fix lockdep warning on device driver removal")
> > > > Signed-off-by: Jerry Snitselaar
> > > Thanks Jerry!
> > >
> > > Reviewed-by: Dave Jiang
> > >
> > I noticed another problem while looking at this. When the device ends
> > up in the halted state, and needs an flr or system reset, it calls
> > idxd_wqs_unmap_portal(). Then if you do a modprobe -r idxd, you hit
> > the WARN_ON in devm_iounmap(), because the remove code path calls
> > idxd_wq_portal_unmap(), and wq->portal is null. I'm not sure if it
> > just needs a simple sanity check in drv_disable_wq() to avoid the call
> > in the case that it has already been unmapped, or if more cleanup
> > needs to be done, and possibly a state to differentiate between
> > halted + soft reset possible, versus halted + flr or system reset
> > needed. You get multiple "Device is HALTED" messages during the
> > removal as well.
>
> Thanks!
>
> Fenghua, can you please take a look at this when you have a chance?
> Thank you!
>

Fenghua,

I see another potential issue. If a software reset is attempted,
idxd_device_reinit() will be called, which walks the wqs. If a wq has
the state IDXD_WQ_ENABLED, it calls idxd_wq_enable(), but the first
thing idxd_wq_enable() does is see that the state is IDXD_WQ_ENABLED
and return 0. Without the wq enable command being sent, it will not be
re-enabled, yes?

Regards,
Jerry

> >
> > Regards,
> > Jerry
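
The re-enable concern above can be sketched as a small userspace model.
This is not the driver code: every model_* name is hypothetical, and the
sketch assumes that a reset clears the hardware-side enable while the
software state stays at "enabled", which is exactly the point being
asked about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; they only imitate the states named in the mail. */
enum model_wq_state { MODEL_WQ_DISABLED, MODEL_WQ_ENABLED };

struct model_wq {
	enum model_wq_state sw_state;	/* what software believes */
	bool hw_enabled;		/* what the device actually has (assumed) */
	int enable_cmds_sent;		/* number of enable commands issued */
};

/*
 * Models the early return described above: if the software state already
 * says "enabled", no enable command is sent to the device.
 */
int model_wq_enable(struct model_wq *wq)
{
	assert(wq != NULL);

	if (wq->sw_state == MODEL_WQ_ENABLED)
		return 0;

	wq->enable_cmds_sent++;
	wq->hw_enabled = true;
	wq->sw_state = MODEL_WQ_ENABLED;
	return 0;
}

/* Assumption: a reset drops the hardware enable but not the software state. */
void model_device_reset(struct model_wq *wqs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		wqs[i].hw_enabled = false;
}

/* Walk the wqs and try to re-enable those software still marks as enabled. */
int model_device_reinit(struct model_wq *wqs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (wqs[i].sw_state == MODEL_WQ_ENABLED) {
			int rc = model_wq_enable(&wqs[i]);

			if (rc)
				return rc;
		}
	}
	return 0;
}
```

In this model, enabling a wq, calling model_device_reset(), and then
calling model_device_reinit() leaves hw_enabled false and
enable_cmds_sent unchanged. Whether the real driver behaves this way
depends on whether the wq state is reset somewhere before the reinit
path runs.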