Message-ID: <67c92052-24d0-4bce-858a-ebed0aab5738@163.com>
Date: Tue, 4 Jun 2024 18:00:04 +0800
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
User-Agent: 
Mozilla Thunderbird
Subject: Re: [PATCH] PCI: vmd: Create domain symlink before pci_bus_add_devices
To: Paul M Stillwell Jr, nirmal.patel@linux.intel.com, jonathan.derrick@linux.dev
Cc: lpieralisi@kernel.org, kw@linux.com, robh@kernel.org, bhelgaas@google.com, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, sunjw10@lenovo.com, ahuang12@lenovo.com
References: <20240603140329.7222-1-sjiwei@163.com>
Content-Language: en-US
From: Jiwei Sun
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On 6/3/24 23:47, Paul M Stillwell Jr wrote:
> On 6/3/2024 7:03 AM, Jiwei Sun wrote:
>> From: Jiwei Sun
>>
>> During booting into the kernel, the following error message appears:
>>
>>   (udev-worker)[2149]: nvme1n1: '/sbin/mdadm -I /dev/nvme1n1'(err) 'mdadm: Unable to get real path for '/sys/bus/pci/drivers/vmd/0000:c7:00.5/domain/device''
>>   (udev-worker)[2149]: nvme1n1: '/sbin/mdadm -I /dev/nvme1n1'(err) 'mdadm: /dev/nvme1n1 is not attached to Intel(R) RAID controller.'
>>   (udev-worker)[2149]: nvme1n1: '/sbin/mdadm -I /dev/nvme1n1'(err) 'mdadm: No OROM/EFI properties for /dev/nvme1n1'
>>   (udev-worker)[2149]: nvme1n1: '/sbin/mdadm -I /dev/nvme1n1'(err) 'mdadm: no RAID superblock on /dev/nvme1n1.'
>>   (udev-worker)[2149]: nvme1n1: Process '/sbin/mdadm -I /dev/nvme1n1' failed with exit code 1.
>>
>> This symptom prevents the OS from booting successfully.
>>
>
> I'm just curious: has this been doing this forever or has this just
> started recently?

Thanks for your reply.
The issue has only been reproduced on certain servers (a VROC
configuration with RAID1 across two NVMe drives, using the 7mm NVMe
2-bay rear RAID enablement kit), with SLES15.6 (kernel 6.4) installed
on the VROC RAID1 disk. In our tests, the issue has been easy to
reproduce on servers with this configuration since kernel 6.2.

According to the journalctl log, systemd-udevd starts running before
the NVMe device is added, which exposes this timing issue.

Thanks,
Regards,
Jiwei

>
> Paul
>
>> After a NVMe disk is probed/added by the nvme driver, the udevd executes
>> some rule scripts by invoking mdadm command to detect if there is a
>> mdraid associated with this NVMe disk. The mdadm determines if one
>> NVMe device is connected to a particular VMD domain by checking the
>> domain symlink. Here is the root cause:
>>
>> Thread A                   Thread B             Thread mdadm
>> vmd_enable_domain
>>   pci_bus_add_devices
>>     __driver_probe_device
>>      ...
>>      work_on_cpu
>>        schedule_work_on
>>        : wakeup Thread B
>>                            nvme_probe
>>                            : wakeup scan_work
>>                              to scan nvme disk
>>                              and add nvme disk
>>                              then wakeup udevd
>>                                                 : udevd executes
>>                                                   mdadm command
>>        flush_work                               main
>>        : wait for nvme_probe done                ...
>>     __driver_probe_device                        find_driver_devices
>>     : probe next nvme device                     : 1) Detect the domain
>>     ...                                            symlink; 2) Find the
>>     ...                                            domain symlink from
>>     ...                                            vmd sysfs; 3) The
>>     ...                                            domain symlink is not
>>     ...                                            created yet, failed
>>   sysfs_create_link
>>   : create domain symlink
>>
>> sysfs_create_link is invoked at the end of vmd_enable_domain. However,
>> this implementation introduces a timing issue, where mdadm might fail
>> to retrieve the vmd symlink path because the symlink has not been
>> created yet.
>>
>> Fix the issue by creating VMD domain symlinks before invoking
>> pci_bus_add_devices.
>>
>> Signed-off-by: Jiwei Sun
>> Suggested-by: Adrian Huang
>> ---
>>  drivers/pci/controller/vmd.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
>> index 87b7856f375a..3f208c5f9ec9 100644
>> --- a/drivers/pci/controller/vmd.c
>> +++ b/drivers/pci/controller/vmd.c
>> @@ -961,12 +961,12 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
>>      list_for_each_entry(child, &vmd->bus->children, node)
>>          pcie_bus_configure_settings(child);
>>  
>> +    WARN(sysfs_create_link(&vmd->dev->dev.kobj, &vmd->bus->dev.kobj,
>> +                   "domain"), "Can't create symlink to domain\n");
>> +
>>      pci_bus_add_devices(vmd->bus);
>>  
>>      vmd_acpi_end();
>> -
>> -    WARN(sysfs_create_link(&vmd->dev->dev.kobj, &vmd->bus->dev.kobj,
>> -                   "domain"), "Can't create symlink to domain\n");
>>      return 0;
>>  }
>>
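
For anyone trying to observe the window from userspace, the failing lookup
can be approximated with a small helper. This is only a sketch and is not
taken from mdadm: resolve_vmd_domain() is a hypothetical name, and the use
of realpath() is an assumption based on the "Unable to get real path"
message in the log above. While the "domain" symlink has not been created
yet, the call is expected to fail with ENOENT.

```c
#define _XOPEN_SOURCE 700	/* for realpath() under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not mdadm code): resolve "<vmd_dir>/domain/device"
 * to its canonical path. vmd_dir would be something like
 * /sys/bus/pci/drivers/vmd/0000:c7:00.5 on an affected system.
 * Returns 0 on success or a negative errno value on failure.
 */
static int resolve_vmd_domain(const char *vmd_dir, char *out, size_t out_len)
{
	char link[PATH_MAX];
	char resolved[PATH_MAX];
	int n;

	assert(vmd_dir && out);

	n = snprintf(link, sizeof(link), "%s/domain/device", vmd_dir);
	if (n < 0 || (size_t)n >= sizeof(link))
		return -ENAMETOOLONG;

	/* Fails with ENOENT while the "domain" symlink does not exist. */
	if (!realpath(link, resolved))
		return -errno;

	/* Copy the result out, rejecting buffers that are too small. */
	n = snprintf(out, out_len, "%s", resolved);
	if (n < 0 || (size_t)n >= out_len)
		return -ENAMETOOLONG;

	return 0;
}
```

Calling this from a udev-triggered test program on an unpatched kernel
should return -ENOENT whenever it runs before sysfs_create_link; with the
symlink created ahead of pci_bus_add_devices that window should be closed.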