Received: by 2002:a05:6a10:f3d0:0:0:0:0 with SMTP id a16csp1069340pxv; Fri, 25 Jun 2021 04:55:31 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxjt/ZUPq87XEzntI7oMT7D4L78oo7GEwzCkfAcYUyh6ho7d8ISXRDvbFUIj5byiYd7HoQn X-Received: by 2002:a05:6402:b17:: with SMTP id bm23mr14186886edb.173.1624622130834; Fri, 25 Jun 2021 04:55:30 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1624622130; cv=none; d=google.com; s=arc-20160816; b=o1HGPLoXi9xq/tuW598wDH6SrDVejOPyC17zbnyW+YZ43fOBoCoXK8ZCiuEXeaiREM fHe/EqYlR615lWVhJUMaROcTHS7nw16riyKrVUYLhmZlPaH0ypCLIQ39TzRAMeuokdCu C5YvigSk4ZZpREvylGQlZwjjmuNUxdh8Q5lLzBbIDm1ECMyqXdreuaK50xlhJbpnrMQX Bs0VY0ONgIFxB6FDRZGyznRdqWH01kI5EGcbVJ4ZSHl2i9hUDUAk+FwdHPJsSRzO4aWL 3OrWAJGA87quucw544P3xGUzJ+wVHm0Vl/d/IeXQvEtydKiWWyjCCLCchCHIt04bdJ7/ 0nRA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:user-agent:in-reply-to:content-transfer-encoding :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=dp28K2daX0ajZR9CYlxx3oHiJ/jfL+vyCuHjXFGuA5E=; b=XTLUZxCZ/v/RhY4PnkQFpegv1bQPhzwyZwgigMo3W9p23Gs9KM4bG/FmCDl+HOgWWx TrqomW0lfB9OD7GCS/vZDjNCavJkYpIL72AD5D3TSiZurCaXJBux6VJCEh6Opqlw9x+r 5ZBqJr2Fm2ZOi/PRqMry/y0YGepjctxn4xSoUew5MusUV9Ag74bRZEmzBprCNgHajPug eC0m/IjpupA29iJFa+PnAbgST9PpCFlWTfHjXDM9Z56AtBhZlB8TbQH5+AcvVl64jYPh Tk6WZ+H3uR2hzKtGWxncSQTenJ3umbCKejqH9zUlAIvhn3ncvharPnhNjBHifXvelWLu BlzA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=QFSo0RtC; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id df5si5771494edb.598.2021.06.25.04.55.08; Fri, 25 Jun 2021 04:55:30 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=QFSo0RtC; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230114AbhFYL40 (ORCPT + 99 others); Fri, 25 Jun 2021 07:56:26 -0400 Received: from mail.kernel.org ([198.145.29.99]:49260 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229498AbhFYL4Z (ORCPT ); Fri, 25 Jun 2021 07:56:25 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id CAC7D61464; Fri, 25 Jun 2021 11:54:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1624622045; bh=eKwvSIEBtV9yZ29mPOO8grGRX6O2AyCWp0i7Tkuc/eM=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=QFSo0RtCEX5GX6dxNXbQPazYaVXnToZoTw5wAT9kzABmT8tlQbNsI1XLHqgD94n8i OVyd/zpV1wDiEjb7Jjf72vfpx/tR91Omxa/B+VTUci7uCDvoBLoSVJBGVtVAY8T8Pp XuOMj1cBGeeONi3D8PXntWB+tqzF2qXfZlO4Kr139M4Qy1rHg0LgR5KPQNBM+pA8Ye 85iPWTiuNn3Z6kAPZ0q8vwytmfGfK62g+9swaY+lqTUxR/n858VD7ckLYkLXAn7Fz7 OV/EBGLBwEMvY3YXZvFdRAIyYB6y5ALIJLwKYxRNSI4YCEgz4/jOuHJpOpyAqCJXA4 yrlT5YcDA1j1A== Received: by pali.im (Postfix) id 7061360E; Fri, 25 Jun 2021 13:54:02 +0200 (CEST) Date: Fri, 25 Jun 2021 13:54:02 +0200 From: Pali =?utf-8?B?Um9ow6Fy?= To: Koen Vandeputte Cc: Petr =?utf-8?Q?=C5=A0tetiar?= , Bjorn Helgaas , Bjorn Helgaas , linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, Lorenzo Pieralisi , Gregory Clement , Andrew Lunn , Krzysztof =?utf-8?Q?Wilczy=C5=84ski?= , Yinghai Lu Subject: Re: PCI: Race condition in pci_create_sysfs_dev_files Message-ID: <20210625115402.jwga35xmknmo4vdk@pali> References: <20200909112850.hbtgkvwqy2rlixst@pali> <20201006222222.GA3221382@bjorn-Precision-5520> <20201007081227.d6f6otfsnmlgvegi@pali> <20210407142538.GA12387@meh.true.cz> <20210407145147.bvtubdmz4k6nulf7@pali> <20210407153041.GA17046@meh.true.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: NeoMutt/20180716 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hello! On Friday 25 June 2021 13:29:00 Koen Vandeputte wrote: > Hi all, > > > Adding some more info regarding this issue: > > I've cooked up this simple patch to get some more info: > > > Index: linux-5.10.44/drivers/pci/pci-sysfs.c > =================================================================== > --- linux-5.10.44.orig/drivers/pci/pci-sysfs.c > +++ linux-5.10.44/drivers/pci/pci-sysfs.c > @@ -1335,6 +1335,8 @@ int __must_check pci_create_sysfs_dev_fi >      int rom_size; >      struct bin_attribute *attr; > > +    dump_stack(); > + >      if (!sysfs_initialized) >          return -EACCES; > > > Which shows this: > > > [    1.847384] Key type .fscrypt registered > [    1.854288] pci 0000:01:00.0: PCI bridge to [bus 02-05] > [    1.859242] Key type fscrypt-provisioning registered > [    1.863252] pci 0000:01:00.0:   bridge window [mem 0x01100000-0x011fffff] > [    1.874096] Key type encrypted registered > [    1.879290] pci 0000:00:00.0: PCI bridge to [bus 01-ff] > [    1.879306] pci 0000:00:00.0:   bridge window [mem 0x01100000-0x012fffff] > > --> patch kicking in here showing first creation now > > [    1.879346] CPU: 1 PID: 7 Comm: kworker/u4:0 Not tainted 5.10.44 #0 > [    1.913354] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) > [    1.919908] Workqueue: events_unbound async_run_entry_fn > [    1.925255] [<8010d5e0>] (unwind_backtrace) from [<801099f0>] > (show_stack+0x10/0x14) > [    1.933008] [<801099f0>] (show_stack) from [<804ab92c>] > (dump_stack+0x94/0xa8) > [    1.940249] [<804ab92c>] (dump_stack) from [<804ea96c>] > (pci_create_sysfs_dev_files+0x10/0x28c) > [    1.948969] [<804ea96c>] (pci_create_sysfs_dev_files) from [<804dc648>] > (pci_bus_add_device+0x20/0x84) > [    1.958287] [<804dc648>] (pci_bus_add_device) from [<804dc6d8>] > (pci_bus_add_devices+0x2c/0x70) > [    1.966996] [<804dc6d8>] (pci_bus_add_devices) from [<804dffc8>] > (pci_host_probe+0x40/0x94) > [    1.975362] [<804dffc8>] (pci_host_probe) from [<804fd5f0>] > (dw_pcie_host_init+0x1c0/0x3fc) > [    1.983720] [<804fd5f0>] (dw_pcie_host_init) from [<804fdcb0>] > (imx6_pcie_probe+0x358/0x63c) > [    1.992179] [<804fdcb0>] (imx6_pcie_probe) from [<8054c79c>] > (platform_drv_probe+0x48/0x98) > [    2.000542] [<8054c79c>] (platform_drv_probe) from [<8054a9fc>] > (really_probe+0xfc/0x4dc) > [    2.008732] [<8054a9fc>] (really_probe) from [<8054b0bc>] > (driver_probe_device+0x5c/0xb4) > [    2.016916] [<8054b0bc>] (driver_probe_device) from [<8054b1bc>] > (__driver_attach_async_helper+0xa8/0xac) > [    2.026497] [<8054b1bc>] (__driver_attach_async_helper) from [<8014beec>] > (async_run_entry_fn+0x44/0x108) > [    2.036082] [<8014beec>] (async_run_entry_fn) from [<80141b64>] > (process_one_work+0x1d8/0x43c) > [    2.044704] [<80141b64>] (process_one_work) from [<80141e2c>] > (worker_thread+0x64/0x5b0) > [    2.052802] [<80141e2c>] (worker_thread) from [<80147698>] > (kthread+0x148/0x14c) > [    2.060207] [<80147698>] (kthread) from [<80100148>] > (ret_from_fork+0x14/0x2c) > [    2.067433] Exception stack(0x81075fb0 to 0x81075ff8) > [    2.072493] 5fa0:                                     00000000 00000000 > 00000000 00000000 > [    2.080678] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 > 00000000 00000000 > [    2.088864] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000 > > --> getting called again ..  different caller this time .. seems unimportant > ? > > [    2.095490] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.44 #0 > [    2.095924] pcieport 0000:00:00.0: PME: Signaling with IRQ 307 > [    2.101510] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) > [    2.113883] [<8010d5e0>] (unwind_backtrace) from [<801099f0>] > (show_stack+0x10/0x14) > [    2.121638] [<801099f0>] (show_stack) from [<804ab92c>] > (dump_stack+0x94/0xa8) > [    2.128875] [<804ab92c>] (dump_stack) from [<804ea96c>] > (pci_create_sysfs_dev_files+0x10/0x28c) > [    2.137588] [<804ea96c>] (pci_create_sysfs_dev_files) from [<80b1b920>] > (pci_sysfs_init+0x34/0x54) > [    2.146559] [<80b1b920>] (pci_sysfs_init) from [<80101820>] > (do_one_initcall+0x54/0x1e8) > [    2.154667] [<80101820>] (do_one_initcall) from [<80b01108>] > (kernel_init_freeable+0x23c/0x290) > [    2.163386] [<80b01108>] (kernel_init_freeable) from [<80837e3c>] > (kernel_init+0x8/0x118) > [    2.171578] [<80837e3c>] (kernel_init) from [<80100148>] > (ret_from_fork+0x14/0x2c) > [    2.179151] Exception stack(0x81053fb0 to 0x81053ff8) > [    2.184210] 3fa0:                                     00000000 00000000 > 00000000 00000000 > [    2.192393] 3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 > 00000000 00000000 > [    2.200575] 3fe0: 00000000 00000000 00000000 00000000 00000013 00000000 > > --> And then finally, the error occurs as it's already been added before.  > same cpu and same PID trying to add the same stuff again to sysfs This just proves that you have hit this race condition. > [    2.207200] CPU: 1 PID: 7 Comm: kworker/u4:0 Not tainted 5.10.44 #0 > [    2.207263] sysfs: cannot create duplicate filename > '/devices/platform/soc/1ffc000.pcie/pci0000:00/0000:00:00.0/config' <------ > [    2.213478] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) > [    2.230798] Workqueue: events_unbound async_run_entry_fn > [    2.236129] [<8010d5e0>] (unwind_backtrace) from [<801099f0>] > (show_stack+0x10/0x14) > [    2.243882] [<801099f0>] (show_stack) from [<804ab92c>] > (dump_stack+0x94/0xa8) > [    2.251118] [<804ab92c>] (dump_stack) from [<804ea96c>] > (pci_create_sysfs_dev_files+0x10/0x28c) > [    2.259837] [<804ea96c>] (pci_create_sysfs_dev_files) from [<804dc648>] > (pci_bus_add_device+0x20/0x84) > [    2.269153] [<804dc648>] (pci_bus_add_device) from [<804dc6d8>] > (pci_bus_add_devices+0x2c/0x70) > [    2.277861] [<804dc6d8>] (pci_bus_add_devices) from [<804dc70c>] > (pci_bus_add_devices+0x60/0x70) > [    2.286656] [<804dc70c>] (pci_bus_add_devices) from [<804dffc8>] > (pci_host_probe+0x40/0x94) > [    2.295018] [<804dffc8>] (pci_host_probe) from [<804fd5f0>] > (dw_pcie_host_init+0x1c0/0x3fc) > [    2.303377] [<804fd5f0>] (dw_pcie_host_init) from [<804fdcb0>] > (imx6_pcie_probe+0x358/0x63c) > [    2.311832] [<804fdcb0>] (imx6_pcie_probe) from [<8054c79c>] > (platform_drv_probe+0x48/0x98) > [    2.320196] [<8054c79c>] (platform_drv_probe) from [<8054a9fc>] > (really_probe+0xfc/0x4dc) > [    2.328382] [<8054a9fc>] (really_probe) from [<8054b0bc>] > (driver_probe_device+0x5c/0xb4) > [    2.336567] [<8054b0bc>] (driver_probe_device) from [<8054b1bc>] > (__driver_attach_async_helper+0xa8/0xac) > [    2.346145] [<8054b1bc>] (__driver_attach_async_helper) from [<8014beec>] > (async_run_entry_fn+0x44/0x108) > [    2.355727] [<8014beec>] (async_run_entry_fn) from [<80141b64>] > (process_one_work+0x1d8/0x43c) > [    2.364350] [<80141b64>] (process_one_work) from [<80141e2c>] > (worker_thread+0x64/0x5b0) > [    2.372449] [<80141e2c>] (worker_thread) from [<80147698>] > (kthread+0x148/0x14c) > [    2.379853] [<80147698>] (kthread) from [<80100148>] > (ret_from_fork+0x14/0x2c) > [    2.387077] Exception stack(0x81075fb0 to 0x81075ff8) > [    2.392134] 5fa0:                                     00000000 00000000 > 00000000 00000000 > [    2.400317] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 > 00000000 00000000 > [    2.408498] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000 > > > > Any idea? > > Thanks, > > Koen > It is same race condition which I described in the first email of this email thread. I described exactly when it happens. I'm not able to trigger it with new kernels but as we know race conditions are hard to trigger... So result is that we know about it, just fix is not ready yet as it is not easy to fix it. Krzysztof has been working on fixing it, so maybe can provide an update...