Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753550AbcJDQrq (ORCPT ); Tue, 4 Oct 2016 12:47:46 -0400 Received: from quartz.orcorp.ca ([184.70.90.242]:47649 "EHLO quartz.orcorp.ca" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752617AbcJDQrp (ORCPT ); Tue, 4 Oct 2016 12:47:45 -0400 Date: Tue, 4 Oct 2016 10:47:38 -0600 From: Jason Gunthorpe To: Jarkko Sakkinen Cc: "Winkler, Tomas" , "tpmdd-devel@lists.sourceforge.net" , "linux-kernel@vger.kernel.org" Subject: Re: [PATCH] tpm: don't destroy chip device prematurely Message-ID: <20161004164738.GA17149@obsidianresearch.com> References: <1475393971-12715-1-git-send-email-tomas.winkler@intel.com> <20161002101755.GA25844@intel.com> <20161002102455.GA27464@intel.com> <20161002212126.GA25872@obsidianresearch.com> <5B8DA87D05A7694D9FA63FD143655C1B542F466B@hasmsx108.ger.corp.intel.com> <20161003124836.GE9990@intel.com> <20161004051946.GA10572@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20161004051946.GA10572@intel.com> User-Agent: Mutt/1.5.23 (2014-03-12) X-Broken-Reverse-DNS: no host name found for IP address 10.0.0.151 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2163 Lines: 50 On Tue, Oct 04, 2016 at 08:19:46AM +0300, Jarkko Sakkinen wrote: > > Make the driver uncallable first. The worst race that can happen is that > > open("/dev/tpm0", ...) returns -EPIPE. I do not consider this fatal at > > all. > > No responses for this reasonable proposal so I'll show what I mean: How is this any better than what Thomas proposed? It seems much worse to me since now we have even more stuff in the wrong order. There are three purposes to the ordering as it stands today 1) To guarantee that tpm2_shutdown is the last command delivered to the TPM. When it is issued all other ways to access the device are hard fenced off. 2) To hard fence the tpm subsystem for the 'platform' driver. Once tpm_del_char_device completes no callback into the driver is possible *at all*. The driver can destroy everything (iounmap, dereg irq, etc) and the driver module can be unloaded. 3) To prevent oopsing with the sysfs code. Recall this comment /* The sysfs routines rely on an implicit tpm_try_get_ops, device_del * is called before ops is null'd and the sysfs core synchronizes this * removal so that no callbacks are running or can run again */ device_del is what eliminates the sysfs access path, so ordering device_del after ops = null is just unconditionally wrong. I still haven't heard an explanation why Thomas's other patches need this, or why trying to change this ordering makes any sense at all considering how the subsystem is constructed. Further, if tpm_crb now needs a registered device, how on earth do all the chip ops we call work *before* registration? Or is that another bug? Why can't tpm_crb return to the pre-registration operating state in the driver remove function before calling unregister? None of this makes any sense to me. This whole thing was very carefully constructed to work *correctly* during unregister. Many other subsystems have races and bugs during remove (eg see the securityfs discussion). TPM has a hard requirement to support safe unregister due to the vtpm stuff, so we don't get to screw it up just to support one driver. Jason