Received: by 2002:a05:6358:3188:b0:123:57c1:9b43 with SMTP id q8csp25974090rwd; Mon, 3 Jul 2023 03:50:09 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ5gw2lHVmY3NlNnQJ6VfwbrwmIRrNAUFsp+krPd/dLJDv4+aBoprb3SmDGDW6M7UZZImUHp X-Received: by 2002:a05:6808:b2a:b0:3a3:7505:700a with SMTP id t10-20020a0568080b2a00b003a37505700amr8634906oij.44.1688381409535; Mon, 03 Jul 2023 03:50:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1688381409; cv=none; d=google.com; s=arc-20160816; b=bFmptOnEXorCRw5IAEtxivFJulyPpTVwUKUUZ1Vy4WI+cYDXq3521dCEuiDVH5QU9j tSxUVM5fBzWEYTAFWyijM+s9V35Eg16pmL16GM2UgKnvFMljM+zVWAFT+xnymb43OKf7 6l/mrsSlvVOQDuTKZNO3SrtDA4T9yevncX0prPCMQ9tD/KI6aJK+jBj9yYTwdpan0o2y DMuVfxwv2eHmyN9e/e9FG5vytaxDJnRgpoLzL0vyreu3mF+NtNd0oHacRgwAxua/6h9d 2oAzE4cxiHghMbdNYibFJOB4TFfi/QyXpkTD/+KHg+hAFhH6qaV3iM9b+/+R4YYMb2UW l3Vw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :organization:references:in-reply-to:message-id:subject:cc:to:from :date:dkim-signature:dkim-signature; bh=caF63ymnbWmBVrCY3MqQ90BV6mN7gkQAVN98pVotegQ=; fh=1jTypjpGUsKe4trrIP5JxHi8HYxRxKg5wQ8OVPRT3hg=; b=fgoX6WC/4lvp/JB1G++Sgwp8ZNSvFocjTxf6HUvlBCs/qttBNB8pQ4CWy48EjozP8u BTe5uISI/vmkoVPTY+1yxpLFjyv5oHdSKMzpTe7PBecXU76bxfxhJGxvZnF4pdnF6bmd qnBhxpYwefKzaRf+kp442sXMBDtYb1/5GHYSh0mIfE84GlSfznfMYMMVv9espM/tWxb6 yP2bdtqbS9qwEO6mQjVgmexScPROTHgxcl05ksCXUaP03Z4DUSXAGSIBqQjiEPDaC9FN VaDkoCuI8iWkgqBpRE2TjNYKT1tPErnX0Zvx4IIhhvC4d9AU8sTaf/aPo4PnvllmgRcR ndJw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@suse.de header.s=susede2_rsa header.b="uDAp/WRF"; dkim=neutral (no key) header.i=@suse.de; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=suse.de Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id 22-20020a17090a019600b0026101590c32si18076424pjc.2.2023.07.03.03.49.55; Mon, 03 Jul 2023 03:50:09 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@suse.de header.s=susede2_rsa header.b="uDAp/WRF"; dkim=neutral (no key) header.i=@suse.de; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=suse.de Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231281AbjGCKpl (ORCPT + 99 others); Mon, 3 Jul 2023 06:45:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59876 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231168AbjGCKpi (ORCPT ); Mon, 3 Jul 2023 06:45:38 -0400 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 83893C1; Mon, 3 Jul 2023 03:45:36 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 2D96A1FD74; Mon, 3 Jul 2023 10:45:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1688381135; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=caF63ymnbWmBVrCY3MqQ90BV6mN7gkQAVN98pVotegQ=; b=uDAp/WRFWp1Id23i5jD0oFSMtqv/8ajkG5z5ytxiqke1V33N0kBA1HEtg5gXCHiGDOQm1C IRBOlvYazIHCDAy+5oKcXmu2pZSeqNLZ5AFuSivXAQiVGqO4GwyhIfvXELO5CpyNeGG1lO pxH8mJYQxwGipMI2oNTIxQPfUUTAWMg= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1688381135; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=caF63ymnbWmBVrCY3MqQ90BV6mN7gkQAVN98pVotegQ=; b=0B6S5TivaSxyV+31peKJ5USlMrF1kX3/PNpYUF9qF79iG7tsgm9nNXWSGaPD1sAHW62lIM BsqDgmKgYkcix4Dg== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id F3AFF138FC; Mon, 3 Jul 2023 10:45:34 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id 07++Oc6momSHPQAAMHmgww (envelope-from ); Mon, 03 Jul 2023 10:45:34 +0000 Date: Mon, 3 Jul 2023 12:45:33 +0200 From: Jean Delvare To: Luis Chamberlain Cc: Michal Hocko , linux-modules@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH] module: print module name on refcount error Message-ID: <20230703124533.5fc7396d@endymion.delvare> In-Reply-To: <20230701175727.292b84f6@endymion.delvare> References: <20230626123252.73dbc139@endymion.delvare> <20230701175727.292b84f6@endymion.delvare> Organization: SUSE Linux X-Mailer: Claws Mail 4.0.0 (GTK+ 3.24.34; x86_64-suse-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sat, 1 Jul 2023 17:57:27 +0200, Jean Delvare wrote: > On Fri, 30 Jun 2023 16:05:33 -0700, Luis Chamberlain wrote: > > On Mon, Jun 26, 2023 at 12:32:52PM +0200, Jean Delvare wrote: > > > If module_put() triggers a refcount error, include the culprit > > > module name in the warning message, to easy further investigation of > > > the issue. > > > > > > Signed-off-by: Jean Delvare > > > Suggested-by: Michal Hocko > > > Cc: Luis Chamberlain > > > --- > > > kernel/module/main.c | 4 +++- > > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > > > --- linux-6.3.orig/kernel/module/main.c > > > +++ linux-6.3/kernel/module/main.c > > > @@ -850,7 +850,9 @@ void module_put(struct module *module) > > > if (module) { > > > preempt_disable(); > > > ret = atomic_dec_if_positive(&module->refcnt); > > > - WARN_ON(ret < 0); /* Failed to put refcount */ > > > + WARN(ret < 0, > > > + KERN_WARNING "Failed to put refcount for module %s\n", > > > + module->name); > > > trace_module_put(module, _RET_IP_); > > > preempt_enable(); > > > } > > > > The mod struct ends up actually being allocated, we first read the ELF > > passed by userspace and we end up allocating space for struct module > > when reading the ELF section ".gnu.linkonce.this_module". We cache > > the ELF section index in info->index.mod, we finally copy the module > > into the allocated space with move_module(). > > > > In linux-next code this is much more clear now. > > > > What prevents us from racing to free the module and thus invalidating > > the name? > > > > For instance the system call to delete_module() could hammer and > > so have tons of threads racing try_stop_module(), eventually one of > > them could win and free_module() would kick in gear. > > > > What prevents code from racing the free with a random module_put() > > called by some other piece of code? > > > > I realize this may implicate even the existing code seems racy. > > You are the maintainer so I'll trust your expertise, but this is how I > understand it: if we hit this WARN, this means reference counting is > screwed. If this is an underflow, we still have a reference to the > module while refcnt is zero, meaning the module could be removed at any > time. This is inherent to the issue we are reporting, and not related > to the proposed change. The name is just one field of struct module, > refcnt is in the very same situation already. > > So the whole piece of code is best effort reporting and assumes (both > before and after my proposed change) that nobody attempted to unload > the module yet. I thought some more about it and one potential problem with my proposed change is if the module has indeed already been freed and the memory already reused for a different purpose. We are in trouble already (we just called atomic_dec_if_positive on a random memory location) but the WARN message could become very messy if the memory where module.name used to reside no longer contains any string terminator (binary zero). So we probably want to play it safe and add a length limitation when printing the module name. Something like: WARN(ret < 0, KERN_WARNING "Failed to put refcount for module %.*s\n", (int)MODULE_NAME_LEN, module->name); -- Jean Delvare SUSE L3 Support