Received: by 2002:ac0:a5a7:0:0:0:0:0 with SMTP id m36-v6csp6340317imm; Mon, 23 Jul 2018 16:20:08 -0700 (PDT) X-Google-Smtp-Source: AAOMgpcOpxjiFPO02IOx7qR8XieTddP97Pm9+ekjo+bgP4/iCJqAiicnWBUSwABQlw/kSrELUolC X-Received: by 2002:a63:2b15:: with SMTP id r21-v6mr14162134pgr.262.1532388008571; Mon, 23 Jul 2018 16:20:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1532388008; cv=none; d=google.com; s=arc-20160816; b=IL4S5hwonbuGcJjRu4/2dStO74R/i/cWZ+fkGr+W5DNqROUTvLHXJS9AoSJG/elP14 cfSAP8iiOa9CpzryaueadDvLhGgJO3Maa70l90CUCycB4Gwm9dVGWtcBIc/FoTRTaBIR 7oZ7pOauFLIN7s4UreFnyXqOSrvGhliJSJ/Biq5KA+pQDc/T8cGP0ECJSArRXgfHu+HT k2DdfRLC/QPs4U85t+2wMpaLENGpNJ7QsSFH5GjhWlr0zAbI2h1hHdQaQhL4sQ8olCVw 95+/SJLUG9M4K5GRjChUANKXyOicHzc+GiNEROKWRQuEdRbyNmHuzO2kOFjA6DjgmmJa vUfA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:cc:to:from:date :arc-authentication-results; bh=rybsu0xBSfm6PMP15kJUMxIgnp42PjIRe3OveQ+vZAo=; b=zDkO6r5ojQ1MeTnqZBxFPnuSCT3Aw4o1V7jE17Y9dTIwHGJ8XBiTb+ZhphOQROvyhR nrzMbRugukad7eBPCcoU8OBWYdgD2d0aOqFEG+y1CJFfHRi4f7Vs6oK/568OPJqEp5cc DV+8KVExtahBjJFsnzQ085wiS1zoLBu4Wkr2hn5NQx4YYb8QUtLbFpVfNPxgN0QZqV+D f8UxKub3YVms3wZzWSaBMe8sTAshp3lBF8lyC7cZVDcG27plfYSs4Bda4az4b31V/6rn xKY4z9/pM6oD0D260+7uViX4Nk1eR6668ag7XaGYEdBq88TeAPZp0FOgg9waH1JX3Yr2 ioxA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id y23-v6si9563240pgl.132.2018.07.23.16.19.53; Mon, 23 Jul 2018 16:20:08 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388107AbeGXAVq (ORCPT + 99 others); Mon, 23 Jul 2018 20:21:46 -0400 Received: from mail.bootlin.com ([62.4.15.54]:40390 "EHLO mail.bootlin.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728425AbeGXAVq (ORCPT ); Mon, 23 Jul 2018 20:21:46 -0400 Received: by mail.bootlin.com (Postfix, from userid 110) id EC5E820733; Tue, 24 Jul 2018 01:18:12 +0200 (CEST) X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on mail.bootlin.com X-Spam-Level: X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED,SHORTCIRCUIT, URIBL_BLOCKED shortcircuit=ham autolearn=disabled version=3.4.0 Received: from bbrezillon (91-160-177-164.subs.proxad.net [91.160.177.164]) by mail.bootlin.com (Postfix) with ESMTPSA id 88F242069C; Tue, 24 Jul 2018 01:18:12 +0200 (CEST) Date: Tue, 24 Jul 2018 01:18:11 +0200 From: Boris Brezillon To: Brian Norris Cc: Zhiqiang Hou , linux-mtd@lists.infradead.org, Linux Kernel , David Woodhouse , Boris BREZILLON , Marek Vasut , Richard Weinberger , Cyrille Pitchen , NeilBrown Subject: Re: [PATCHv3 2/2] mtd: m25p80: restore the status of SPI flash when exiting Message-ID: <20180724011811.40165a35@bbrezillon> In-Reply-To: References: <20171206025342.7266-1-Zhiqiang.Hou@nxp.com> <20171206025342.7266-3-Zhiqiang.Hou@nxp.com> <20180723181350.GA58212@ban.mtv.corp.google.com> <20180723221002.736e0830@bbrezillon> X-Mailer: Claws Mail 3.15.0-dirty (GTK+ 2.24.31; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org +Neil On Mon, 23 Jul 2018 15:06:43 -0700 Brian Norris wrote: > Hi Boris, > > On Mon, Jul 23, 2018 at 1:10 PM, Boris Brezillon > wrote: > > On Mon, 23 Jul 2018 11:13:50 -0700 > > Brian Norris wrote: > >> I noticed this got merged, but I wanted to put my 2 cents in here: > > > > I wish you had replied to this thread when it was posted (more than > > 6 months ago). Reverting the patch now implies making some people > > unhappy because they'll have to resort to their old out-of-tree > > hacks :-(. > > I'd say I'm sorry for not following things closely these days, but I'm > not really that sorry. There are plenty of other capable hands. And if > y'all shoot yourselves in the foot, so be it. This patch isn't going > to blow things up, but now that I did finally notice it (because it > happened to show up in a list of backports I was looking at), I > thought better late than never to remind you. > > For way of notification: Marek already noticed that we've started down > a slippery slope months ago: > > https://lkml.org/lkml/2018/4/8/141 > Re: [PATCH] mtd: spi-nor: clear Extended Address Reg on switch to > 3-byte addressing. > > I'm not quite sure why that wasn't taken to its logical conclusion -- > that the hack should be reverted. > > This problem has been noted many times already, and we've always > stayed on the side of *avoiding* this hack. A few references from a > search of my email: > > http://lists.infradead.org/pipermail/linux-mtd/2013-March/046343.html > [PATCH 1/3] mtd: m25p80: utilize dedicated 4-byte addressing commands > > http://lists.infradead.org/pipermail/barebox/2014-September/020682.html > [RFC] MTD m25p80 3-byte addressing and boot problem > > http://lists.infradead.org/pipermail/linux-mtd/2015-February/057683.html > [PATCH 2/2] m25p80: if supported put chip to deep power down if not used To my defense, I was not actively following SPI NOR topics at this time, but, in the light of those discussions, I still keep thinking we should try our best to improve support for broken HW. So, ideally, we should find a way to support getting back to 3-byte addressing mode when resetting, while generating enough noise to make people well aware that their board design is broken (I think the proposal of the new DT prop + a WARN_ON() is a good idea). > > >> On Wed, Dec 06, 2017 at 10:53:42AM +0800, Zhiqiang Hou wrote: > >> > From: Hou Zhiqiang > >> > > >> > Restore the status to be compatible with legacy devices. > >> > Take Freescale eSPI boot for example, it copies (in 3 Byte > >> > addressing mode) the RCW and bootloader images from SPI flash > >> > without firing a reset signal previously, so the reboot command > >> > will fail without reseting the addressing mode of SPI flash. > >> > This patch implement .shutdown function to restore the status > >> > in reboot process, and add the same operation to the .remove > >> > function. > >> > >> We have previously rejected this patch multiple times, because the above > >> comment demonstrates a broken product. > > > > If we were to only support working HW parts, I fear Linux would not > > support a lot of HW (that's even more true when it comes to flashes :P). > > You stopped allowing UBI to attach to MLC NAND recently, no? That > sounds like almost the same boat -- you've probably killed quite a few > shitty products, if they were to use mainline directly. Well, that's a bit different in that it's purely a SW issue, and we also try to encourage people to take over the work we started with Richard to fix that problem. But I get your point, we broke setups that were working in an unreliable way, pretty much what you're suggesting to do here. > > Anyway, that's derailing the issue. Supporting broken hardware isn't > something you try to do by applying the same hack to all systems. You > normally try to apply your hack as narrowly as possible. This, I agree with. > You seem to > imply that below. So maybe that's a solution to move forward with. But > I'd personally be just as happy to see the patch reverted. > > >> You cannot guarantee that all > >> reboots will invoke the .shutdown() method -- what about crashes? What > >> about watchdog resets? IIUC, those will hit the same broken behavior, > >> and have unexepcted behavior in your bootloader. > > > > Yes, there are corner cases that are not addressed with this approach, > > Is a system crash really a corner case? :D Well, nothing forces you to reset the platform using a HW reset when that happens :P. > > > but it still seems to improve things. Of course, that means the > > user should try to re-route all HW reset sources to SW ones (RESET input > > pin muxed to the GPIO controller, watchdog generating an interrupt > > instead of directly asserting the RESET output pin), which is not always > > possible, but even when it's not, isn't it better to have a setup that > > works fine 99% of the time instead of 50% of the time? > > Perhaps, but not at the expense of future development. And > realistically, no one is doing that if they have this hack. Most > people won't even know that this hack is protecting them at all (so > again, they won't try to mitigate the problem any further). Unless we add a huge backtrace at probe time which forces them to look closer at what they did wrong (like you seem to suggest below). > > >> I suppose one could argue for doing this in remove(), but AIUI you're > >> just papering over system bugs by introducing the shutdown() function > >> here. Thus, I'd prefer we drop the shutdown() method to avoid misleading > >> other users of this driver. > > > > I understand your point. But if the problem is about making sure people > > designing new boards get that right, why not complaining at probe time > > when things are wrong? > > > > I mean, spi_nor_restore() seems to only do something on very specific > > NORs (those on which a SW RESET does not resets the addressing > > mode). > > The point isn't that SW RESET doesn't reset the addressing mode -- it > does on any flash I've seen. The point is that most systems are built > around a stateless assumption in these flash. IIRC, there wasn't even > a SW RESET command at all until these "huge" flash came around and > stateful addressing modes came about. So boot ROMs and bootloaders > would have to be updated to start figuring out when/how to do this SW > RESET. And once two vendors start doing it differently (I'm not sure: > have they done this already? I think so) it's no longer something a > boot ROM will get right. > > The only way to get this stuff right is to have a hardware reset, or > else to avoid all of the stateful modes in software. Okay. > > > So, how about adding a flag that says "my board has the NOR HW > > RESET pin wired" (there would be a DT props to set that flag). Then you > > add a WARN_ON() when this flag is not set and a NOR chip impacted by > > this bug is detected. > > I'd kinda prefer the reverse. There really isn't a need to document > anything for a working system (software usually can't control this > RESET pin). The burden should be on the b0rked system to document > where it needs unsound hacks to survive. That's true. > > > This way you make sure people are informed that > > they're doing something wrong, and for those who can't change their HW > > (because it's already widely deployed), you have a fix that improve > > things. > > Or even better: put this hack behind a DT flag, so that one has to > admit that their board design is broken before it will even do > anything. Proposal: "linux,badly-designed-flash-reset". I think we can remove the "linux," prefix. If it's badly designed, it applies to all OSes, don't you think? Regards, Boris