Received: by 2002:a25:e74b:0:0:0:0:0 with SMTP id e72csp833301ybh; Mon, 13 Jul 2020 02:26:15 -0700 (PDT) X-Google-Smtp-Source: ABdhPJx+Xduk6HeRSlvCjbLlM4ceJv5qMQcuqhd5GHUZtEYTZe9oR21OUatYUWUsXSo7jR6V3O29 X-Received: by 2002:aa7:d650:: with SMTP id v16mr90011498edr.361.1594632375157; Mon, 13 Jul 2020 02:26:15 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1594632375; cv=none; d=google.com; s=arc-20160816; b=0aBWNn0iInbCMr4ZN+ydYGP0JZNmyavKb1OmsWK0pCGxfJauD7BNB4KHZWhbl6nm+r RjLWMwgBf2+5lGIKUjV0mZQeWCiFcUnth/gXhUcm34Hy18tIWYDA3UgN8hGk6k3j6luL hYybSiL6h2HdvvtKzOA/r8Ib5iHfmP+B3mrYIh7WE5xJ1zvK4qimwIZRQQsXilHJ1Ubb QOSZ2mvLmcKQtxeUX3fU0GKT4JGEGDU7DoQRz0jU9kYpoQgjBCSdH1ZtTeSePVSH+Mz/ MNspCg2ILJWJvIup0ayIvuXUjxqt7nZE5T0OMy/1Jgh+8IUnh6V+5MHWBJzG5ttSxvi6 dmmg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-transfer-encoding :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=J3dzjEkpsPZKRk0cuzKhyX+FVpcBcp+ECm2hgm8oHtY=; b=XhZMmbmpUN5gwNQNxnNK8UJ3XK0ThvV8UPEOEQA/UyXFuRwQwUmQeQfE6oMOb8ahK+ M+9uQ5KLGs1K7ZiRDXyWKEXW5juuymahdHZfoO7u62OeG8EP8Ae5eBCAS+Ne8uiV1zHy 6hH9rzy6ppTlKHrpWDcCTPziuFLLwQC/Mw0nXGthdRLnfh2CnJS11ZF76lov/V8P+Qdl m3jNsC7FHs+HKFfRDf2IAsUMwCLm8TeqZo6aM+9WbFG6LOVbaylRdBz+XV+sbaVkYUNC qYkitCGg4QyC9h7SWvd5sAuTGCEm4orjHilcMwYZBd2KofnooRMmO/BzQqyaeN9Rlz67 FRTA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=SaaK99rF; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id dk6si9054565ejb.129.2020.07.13.02.25.52; Mon, 13 Jul 2020 02:26:15 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=SaaK99rF; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729308AbgGMJYv (ORCPT + 99 others); Mon, 13 Jul 2020 05:24:51 -0400 Received: from mail.kernel.org ([198.145.29.99]:56174 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727035AbgGMJYv (ORCPT ); Mon, 13 Jul 2020 05:24:51 -0400 Received: from localhost (unknown [122.182.251.219]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 377CC2076D; Mon, 13 Jul 2020 09:24:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1594632290; bh=EH5TrWdmq6I1jgYbH1HZ5kpFdNMmEWTWCUX1Y6ss3iA=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=SaaK99rF2HP7TKwycgbM63GKNl3Q3bd18O4sRxKaBGhqzmtguyP756uJYsXv7c3lt g6l9QbunOWXoxLu5ZAd8eWU74L36C/pXJikzoJRfP7CVMRt5V0+gWNHWOA+aLXEUIQ UUk/ZMIhSIyYdmd2XuMc7q+1K9VKM/4dybY1l//c= Date: Mon, 13 Jul 2020 14:54:45 +0530 From: Vinod Koul To: Dave Jiang Cc: Jiri Slaby , Swathi Kovvuri , peter.ujfalusi@ti.com, Linux kernel mailing list Subject: Re: [PATCH v2] dmaengine: check device and channel list for empty Message-ID: <20200713092445.GH34333@vkoul-mobl> References: <159319496403.69045.16298280729899651363.stgit@djiang5-desk3.ch.intel.com> <852318ec-9e18-3dee-a91d-1cf4dddb8906@kernel.org> <83932426-d52a-2e62-9d4b-5abb134a64df@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <83932426-d52a-2e62-9d4b-5abb134a64df@intel.com> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 09-07-20, 08:23, Dave Jiang wrote: > > > On 7/8/2020 10:35 PM, Jiri Slaby wrote: > > On 07. 07. 20, 17:42, Dave Jiang wrote: > > > On 7/6/2020 11:05 PM, Jiri Slaby wrote: > > > > On 26. 06. 20, 20:09, Dave Jiang wrote: > > > > > Check dma device list and channel list for empty before iterate as the > > > > > iteration function assume the list to be not empty. With devices and > > > > > channels now being hot pluggable this is a condition that needs to be > > > > > checked. Otherwise it can cause the iterator to spin forever. > > > > > > > > Could you be a little bit more specific how this can spin forever? I.e. > > > > can you attach a stacktrace of such a behaviour? > > > > > > I can't seem to find the original splat that lead me to the conclusion > > > of it's spinning forever. As I recall, the issue seems to produce > > > different splats and not always consistent in being reproduced. Here's a > > > partial splat that was tracked by the internal bug database. Since with > > > the dma device and channel list being are hot added and removed, the > > > device and channel lists can be empty. The list_entry() and friends > > > expect the list to not be empty (according to header comment), I added > > > the check to ensure that isn't the case before using them in dmaengine. > > > > Yes, the comment states that as it is true: you receive a > > wild/non-checkable pointer if you do list_entry on an empty list. BUT > > have you actually read what I wrote: > > > > > > As in the empty case, "&pos->member" is "head" (look into > > > > list_for_each_entry) and the for loop should loop exactly zero times. > > > > HERE ^^^^ > > > > > With the fix, we can no longer produce any of the splats. So maybe the > > > above was a bad description of the issue. > > > > No, not only the description, worse, the patch proper looks wrong. > > > > > [ 4216.048375]? ? dma_channel_rebalance+0x7b/0x250 > > > [ 4216.056360]? dma_async_device_register+0x349/0x3a0 > > > [ 4216.064604]? idxd_register_dma_device+0x90/0xc0 [idxd] > > > [ 4216.073175]? idxd_config_bus_probe.cold+0x7d/0x1fc [idxd] > > > > So, the good part in the patch is the fixed locking in > > dma_async_device_register. Otherwise it adds nonsense checks. So you > > fixed the issue only by a chance, by a side effect as Peter pointed out. > > Leaving aside that you broke dma_request_chan -- that could happen to > > anybody. > > > > Vinod, please drop/revert this patch. Then start over only with > > dma_async_device_register fixed locking. > > I'll start on the proper fix. Dropped -- ~Vinod