Date: Tue, 3 Jan 2023 11:35:04 +0200
From: Leon Romanovsky
To: Petr Pavlu
Cc: tariqt@nvidia.com, yishaih@nvidia.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Part of devices not initialized with mlx4
References: <0a361ac2-c6bd-2b18-4841-b1b991f0635e@suse.com>

On Mon, Jan 02, 2023 at 11:33:15AM +0100, Petr Pavlu wrote:
> On 12/18/22 10:53, Leon Romanovsky wrote:
> > On Thu, Dec 15, 2022 at 10:51:15AM +0100, Petr Pavlu wrote:
> >> Hello,
> >>
> >> We have seen an issue when some of ConnectX-3 devices are not initialized
> >> when mlx4 drivers are a part of initrd.
> >
> > <...>
> >
> >> * Systemd stops running services and then sends SIGTERM to "unmanaged" tasks
> >>   on the system to terminate them too. This includes the modprobe task.
> >> * Initialization of mlx4_en is interrupted in the middle of its init function.
> >
> > And why do you think that this systemd behaviour is correct one?
>
> My view is that this is an issue between the kernel and initrd/systemd.
> Switching the root is a delicate operation and both parts need to carefully
> cooperate for it to work correctly.
>
> I think it is generally sensible that systemd tries to terminate any remaining
> processes started from the initrd. They would have troubles when the root is
> switched under their hands anyway, unless they are specifically prepared for
> it. Systemd only skips terminating kthreads and allows to exclude root storage
> daemons. A modprobe helper could be excluded from being terminated too but the
> problem with the root switch remains.
>
> It looks to me that a good approach is to complete all running module loads
> before switching the root and continue with any further loads after the
> operation is done. Leaving module loads to udevd assures this, hence the idea
> to use an auxiliary bus.

I'm not sure about it. Everything above is user-space trouble that is invited
once systemd does the root switch.

Anyway, if you want to do aux bus for mlx4, go for it. Feel free to send me
patches off-list and I will add them to our regression, but be aware that you
are stepping into a minefield here.

Thanks