Date: Mon, 22 Jan 2024 16:24:20 -0500
From: Rodrigo Vivi
To: "Souza, Jose", johannes@sipsolutions.net
Cc: "Vivi, Rodrigo", "linux-kernel@vger.kernel.org", "maarten.lankhorst@linux.intel.com", "johannes@sipsolutions.net"
Subject: Re: [PATCH] devcoredump: Remove devcoredump device if failing device is gone
References: <20240117195349.343083-1-rodrigo.vivi@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jan 19, 2024 at 01:13:45PM -0500, Souza, Jose wrote:
> On Wed, 2024-01-17 at 14:53 -0500, Rodrigo Vivi wrote:
> > Make dev_coredumpm a real device managed helper, that not only
> > frees the device after a scheduled delay (DEVCD_TIMEOUT), but
> > also when the failing/crashed device is gone.
> >
> > The module remove for the drivers using devcoredump are currently
> > broken if attempted between the crash and the DEVCD_TIMEOUT, since
> > the symbolic sysfs link won't be deleted.
> >
> > On top of that, for PCI devices, the unbind of the device will
> > call the pci .remove void function, that cannot fail. At that
> > time, our device is pretty much gone, but the read and free
> > functions are alive through the devcoredump device and they
> > can get some NULL dereferences or use after free.
> >
> > So, if the failing-device is gone, let's cancel the scheduled
> > work and remove devcoredump-device immediately.
> >
> > Cc: Jose Souza
> > Cc: Maarten Lankhorst
> > Cc: Johannes Berg
> > Signed-off-by: Rodrigo Vivi
> > ---
> >  drivers/base/devcoredump.c | 29 ++++++++++++++++++++++-------
> >  1 file changed, 22 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/base/devcoredump.c b/drivers/base/devcoredump.c
> > index 7e2d1f0d903a..6db7a2fd9a02 100644
> > --- a/drivers/base/devcoredump.c
> > +++ b/drivers/base/devcoredump.c
> > @@ -8,6 +8,7 @@
> >  #include
> >  #include
> >  #include
> > +#include
> >  #include
> >  #include
> >  #include
> > @@ -118,19 +119,24 @@ static ssize_t devcd_data_read(struct file *filp, struct kobject *kobj,
> >  	return devcd->read(buffer, offset, count, devcd->data, devcd->datalen);
> >  }
> >
> > -static ssize_t devcd_data_write(struct file *filp, struct kobject *kobj,
> > -				struct bin_attribute *bin_attr,
> > -				char *buffer, loff_t offset, size_t count)
> > +static void devcd_remove_now(struct devcd_entry *devcd)
>
> this function can also be used by devcd_free().

Well, indeed. And perhaps it could use
flush_delayed_work(&devcd->del_wk);
instead of
mod_delayed_work(system_wq, &devcd->del_wk, 0);

Then I wouldn't even need to switch from
INIT_DELAYED_WORK(&devcd->del_wk, devcd_del);
to
devm_delayed_work_autocancel()
since the work will already have been flushed, so there is nothing
left to autocancel.

Johannes, any hard preference or request from your side?

Thanks,
Rodrigo.

>
> Other than that LGTM.
>
> >  {
> > -	struct device *dev = kobj_to_dev(kobj);
> > -	struct devcd_entry *devcd = dev_to_devcd(dev);
> > -
> >  	mutex_lock(&devcd->mutex);
> >  	if (!devcd->delete_work) {
> >  		devcd->delete_work = true;
> >  		mod_delayed_work(system_wq, &devcd->del_wk, 0);
> >  	}
> >  	mutex_unlock(&devcd->mutex);
> > +}
> > +
> > +static ssize_t devcd_data_write(struct file *filp, struct kobject *kobj,
> > +				struct bin_attribute *bin_attr,
> > +				char *buffer, loff_t offset, size_t count)
> > +{
> > +	struct device *dev = kobj_to_dev(kobj);
> > +	struct devcd_entry *devcd = dev_to_devcd(dev);
> > +
> > +	devcd_remove_now(devcd);
> >
> >  	return count;
> >  }
> > @@ -304,6 +310,12 @@ static ssize_t devcd_read_from_sgtable(char *buffer, loff_t offset,
> >  				  offset);
> >  }
> >
> > +static void devcd_remove(void *data)
> > +{
> > +	struct devcd_entry *devcd = data;
> > +	devcd_remove_now(devcd);
> > +}
> > +
> >  /**
> >   * dev_coredumpm - create device coredump with read/free methods
> >   * @dev: the struct device for the crashed device
> > @@ -379,7 +391,10 @@ void dev_coredumpm(struct device *dev, struct module *owner,
> >
> >  	dev_set_uevent_suppress(&devcd->devcd_dev, false);
> >  	kobject_uevent(&devcd->devcd_dev.kobj, KOBJ_ADD);
> > -	INIT_DELAYED_WORK(&devcd->del_wk, devcd_del);
> > +	if (devm_add_action(dev, devcd_remove, devcd))
> > +		dev_warn(dev, "devcoredump managed auto-removal registration failed\n");
> > +	if (devm_delayed_work_autocancel(dev, &devcd->del_wk, devcd_del))
> > +		dev_warn(dev, "devcoredump managed autocancel work failed\n");
> >  	schedule_delayed_work(&devcd->del_wk, DEVCD_TIMEOUT);
> >  	mutex_unlock(&devcd->mutex);
> >  	return;
>
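
For reference, the ordering discussed above can be modelled in plain
userspace C. This is only a sketch under stated assumptions, not kernel
code: every model_* name is hypothetical, the mutex is left out, and the
deletion runs synchronously (closer to the flush_delayed_work() idea
than to mod_delayed_work() with a zero delay). It shows a removal action
registered on the failing device, run when that device goes away, and
the delete_work flag turning any later removal request into a no-op.

```c
/* Userspace model only: none of these are the kernel's real types or APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ACTIONS 4

/* Hypothetical stand-in for struct devcd_entry. */
struct model_devcd {
	bool delete_work;   /* mirrors devcd->delete_work: removal already requested */
	bool work_pending;  /* models the delayed work still waiting for DEVCD_TIMEOUT */
	int del_calls;      /* how many times the deletion callback ran */
};

/* Hypothetical stand-in for the managed-action list of the failing device. */
struct model_device {
	void (*action[MAX_ACTIONS])(void *);
	void *data[MAX_ACTIONS];
	size_t nr_actions;
};

/* Models devcd_del(): the work function that tears the coredump down. */
static void model_devcd_del(struct model_devcd *devcd)
{
	devcd->work_pending = false;
	devcd->del_calls++;
}

/* Models devcd_remove_now(): run the pending deletion right away, only once. */
static void model_devcd_remove_now(struct model_devcd *devcd)
{
	if (!devcd->delete_work) {
		devcd->delete_work = true;
		if (devcd->work_pending)
			model_devcd_del(devcd);
	}
}

/* Adapter with a void (*)(void *) shape, like devcd_remove() in the patch. */
static void model_devcd_remove(void *data)
{
	model_devcd_remove_now(data);
}

/* Models devm_add_action(): returns 0 on success, -1 when the list is full. */
static int model_add_action(struct model_device *dev,
			    void (*fn)(void *), void *data)
{
	if (dev->nr_actions == MAX_ACTIONS)
		return -1;
	dev->action[dev->nr_actions] = fn;
	dev->data[dev->nr_actions] = data;
	dev->nr_actions++;
	return 0;
}

/* Models device teardown: run the registered actions in reverse order. */
static void model_device_release(struct model_device *dev)
{
	while (dev->nr_actions > 0) {
		dev->nr_actions--;
		dev->action[dev->nr_actions](dev->data[dev->nr_actions]);
	}
}

/* Returns how many times the deletion ran, or -1 if registration failed. */
int demo_device_removal(void)
{
	struct model_device dev = { .nr_actions = 0 };
	struct model_devcd devcd = { .work_pending = true };

	if (model_add_action(&dev, model_devcd_remove, &devcd))
		return -1;

	model_device_release(&dev);      /* the failing device goes away first */
	model_devcd_remove_now(&devcd);  /* a late sysfs write must do nothing */

	return devcd.del_calls;
}
```

Calling demo_device_removal() from a small main() and checking the
return value is enough to see whether the deletion ran more than once.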