Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754733AbcJEHtG convert rfc822-to-8bit (ORCPT ); Wed, 5 Oct 2016 03:49:06 -0400 Received: from mga07.intel.com ([134.134.136.100]:64858 "EHLO mga07.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751393AbcJEHtE (ORCPT ); Wed, 5 Oct 2016 03:49:04 -0400 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.31,448,1473145200"; d="scan'208";a="1040536056" From: "Winkler, Tomas" To: Jason Gunthorpe CC: Jarkko Sakkinen , "tpmdd-devel@lists.sourceforge.net" , "linux-kernel@vger.kernel.org" Subject: RE: [PATCH] tpm: don't destroy chip device prematurely Thread-Topic: [PATCH] tpm: don't destroy chip device prematurely Thread-Index: AQHSHJY7jYiWgZcWb0y7/x76d69Sl6CUwviAgAC3bgCAANJWAIAAMLYAgAEU7gCAAMAwAIAAgWOg///ptoCAAKl48A== Date: Wed, 5 Oct 2016 07:48:59 +0000 Message-ID: <5B8DA87D05A7694D9FA63FD143655C1B542F5084@hasmsx108.ger.corp.intel.com> References: <1475393971-12715-1-git-send-email-tomas.winkler@intel.com> <20161002101755.GA25844@intel.com> <20161002102455.GA27464@intel.com> <20161002212126.GA25872@obsidianresearch.com> <5B8DA87D05A7694D9FA63FD143655C1B542F466B@hasmsx108.ger.corp.intel.com> <20161003124836.GE9990@intel.com> <20161004051946.GA10572@intel.com> <20161004164738.GA17149@obsidianresearch.com> <5B8DA87D05A7694D9FA63FD143655C1B542F4C92@hasmsx108.ger.corp.intel.com> <20161004231057.GA20062@obsidianresearch.com> In-Reply-To: <20161004231057.GA20062@obsidianresearch.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ctpclassification: CTP_IC x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiYTZiODJhMjItNTIxZS00YzIzLThmNDMtODQ5MjhhMzQ1M2UzIiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX0lDIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE1LjkuNi42IiwiVHJ1c3RlZExhYmVsSGFzaCI6IndZWWtjNGwyXC9xZ3lYNmdMSWJVK043Njlzem1NRTRscHkwdEhHbGRBVGowPSJ9 x-originating-ip: [10.184.70.11] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 8BIT MIME-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4768 Lines: 123 > > > > > On Tue, Oct 04, 2016 at 08:19:46AM +0300, Jarkko Sakkinen wrote: > > > > > > > > Make the driver uncallable first. The worst race that can happen > > > > > is that open("/dev/tpm0", ...) returns -EPIPE. I do not consider > > > > > this fatal at all. > > > > > > > > No responses for this reasonable proposal so I'll show what I mean: > > > > > > How is this any better than what Thomas proposed? It seems much > > > worse to me since now we have even more stuff in the wrong order. > > > > > > There are three purposes to the ordering as it stands today > > > 1) To guarantee that tpm2_shutdown is the last command delivered to > > > the TPM. When it is issued all other ways to access the device > > > are hard fenced off. > > > > I'm not sure where are you taking this requirements from simple bit is > > just enough to make the HW inaccessible if the interface is designed > > right. > > I'm having a hard time understanding the english in your email. (Jarkko do you > know what Tomas is talking about??) Will try to do better. > > > The ordering can be resolved, like this > > > > down_write(&chip->ops_sem); > > if (chip->flags & TPM_CHIP_FLAG_TPM2) > > tpm2_shutdown(chip, TPM2_SU_CLEAR); > > up_write(&chip->ops_sem); > > > > device_del(&chip->dev); > > > > down_write(&chip->ops_sem); > > chip->ops = NULL; > > up_write(&chip->ops_sem); > > No, that is wrong as well, another thread can issue a TPM command between > the device_del and the ops = NULL. Presumably that will fail the same as > tpm2_shutdown does. > Right, but that's why we need the TPM_CHIP_FLAG_REGISTERED bit to stay. Second, tmp2_shutdown only assure that the tpm state is saved, we are taking too hardline here. If another command is issued, this is a problem of the upper layers and it has to be fixed in the upper layer. On the other hand it is much worse if tpm2_shutdown is not sent at all. > > > I still haven't heard an explanation why Thomas's other patches need > > > this, or why trying to change this ordering makes any sense at all > > > considering how the subsystem is constructed. > > > > I thought it's quite clear form the commit message, the device_del > > Not clear at all the commit message describes the 'solution' not the problem. > This doesn't help.. This is the problem statement form the commit message: ' But with the introduction of runtime power management it will result in shutting down the parent device while it still in use.' We are talking about https://lkml.org/lkml/2016/9/12/352 https://sourceforge.net/p/tpmdd/mailman/message/35395799/ But again, the real bug is in design, where a device is used after device_del() is called. > > > naturally toggles runtime_pm of the parent device, it tries to resume > > the parent device so it can perform denationalization and then suspend > > the parent device back which caused tpm2_shutdown to fail. > > What code actually fails? I don't see anything in the runtime pm patch that > relies on chip->dev at all. "chip-dev.parent'' dev_get_drvdata(&chip->dev); > > What code fails and why? device_del(dev) bus_remove_device(dev) device_release_driver(dev) __device_release_driver(dev) pm_runtime_reinit(dev) { if (dev->parent) pm_runtime_put(dev->parent); } > > I general we can not to implement power management via runtime_pm and > > resolve the issue within tpm_crb driver but it's not abouth tpm_crb. > > tpm2_shutdown is a tpm stack call it's not tpm_crb function, it uses > > tpm_transmit_cmd and friends it should have valid tpm_chip initialized > > and valid. I'm not sure what could be more clearer than that. > > I'll say it again, the tpm_transmit_cmd path must not require a registered chip- > >dev. I rephrase it again as well, this requirement is just abusing of the device interface. > device_del only unregisters the dev, it does not deinitialize it, nor does it free > any memory. Someone will do a legitimate fix in device_del and all you house will crash on you, the device should not be used after device_del is called. There is nothing in the interface that promises that nothing is destructed. I still don't understand how this has any impact on the pm stuff > when all the pm stuff is attached only to the pdev. There is device hierarchy which is important for power management, please read the code. > > I have to admit that I'm not sure what the vtpm does yet, but I have a > > feeling that a simple flag can fix this. > > What flag? Fix what? TPM_CHIP_FLAG_REGISTERED Tomas