From: Marc Gonzalez
Subject: Re: [RFC] Improving udelay/ndelay on platforms where that is possible
To: Alan Cox
Cc: Linus Torvalds, LKML, Linux ARM, Steven Rostedt, Ingo Molnar, Thomas Gleixner, Peter Zijlstra, John Stultz, Douglas Anderson, Nicolas Pitre, Mark Rutland, Will Deacon, Jonathan Austin, Arnd Bergmann, Kevin Hilman, Russell King, Michael Turquette, Stephen Boyd, Mason
Date: Wed, 1 Nov 2017 20:03:20 +0100
Message-ID: <4b707ce0-6067-ab36-e167-1acf348d26bf@free.fr>
In-Reply-To: <20171101175325.2557ce85@alans-desktop>

On 01/11/2017 18:53, Alan Cox wrote:

> On Tue, 31 Oct 2017 17:15:34 +0100
>
>> Therefore, users are accustomed to having delays be longer (within a reasonable margin).
>> However, very few users would expect delays to be *shorter* than requested.
>
> If your udelay can be under by 10% then just bump the number by 10%.

Except it's not *quite* that simple. Error has both an absolute
and a relative component. So the actual value matters, and it's
not always a constant.

For example:
http://elixir.free-electrons.com/linux/latest/source/drivers/mtd/nand/nand_base.c#L814

> However at that level most hardware isn't that predictable anyway because
> the fabric between the CPU core and the device isn't some clunky
> serialized link. Writes get delayed, they can bunch together, busses do
> posting and queueing.

Are you talking about the actual delay operation, or the pokes around it?

> Then there is virtualisation 8)
>
>> A typical driver writer has some HW spec in front of them, which e.g. states:
>>
>> * poke register A
>> * wait 1 microsecond for the dust to settle
>> * poke register B
>
> Rarely because of posting. It's usually
>
> 	write
> 	while(read() != READY);
> 	write
>
> and even when you've got a legacy device with timeouts its
>
> 	write
> 	read
> 	delay
> 	write
>
> and for sub 1ms delays I suspect the read and bus latency actually add a
> randomization sufficient that it's not much of an optimization to worry
> about an accurate ndelay().

I don't think "accurate" is the proper term.
Over-delays are fine, under-delays are problematic.

>> This "off-by-one" error is systematic over the entire range of allowed
>> delay_us input (1 to 2000), so it is easy to fix, by adding 1 to the result.
>
> And that + 1 might be worth adding but really there isn't a lot of
> modern hardware that has a bus that behaves like software folks imagine
> and everything has percentage errors factored into published numbers.

I guess I'm a software folk, but the designer of the system bus
sits across my desk, and we do talk often.
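
To make the "+ 1" concrete, here is a minimal sketch, assuming a
timer-based delay that spins on a free-running counter. The helper
names (delay_us_to_ticks, min_wait_ns) and the timer frequency passed
in are hypothetical, chosen for illustration; this is not the actual
kernel code. It shows why rounding up alone is not enough: the start
timestamp may be sampled just before the counter increments, so
waiting N ticks can take as little as N - 1 tick periods.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): convert a delay in
 * microseconds to ticks of a free-running counter running at timer_hz.
 */
static uint64_t delay_us_to_ticks(uint32_t delay_us, uint64_t timer_hz)
{
	/* Round up so the conversion itself never shortens the delay. */
	uint64_t ticks = ((uint64_t)delay_us * timer_hz + 999999u) / 1000000u;

	/*
	 * Guard tick: the first tick boundary may arrive immediately
	 * after the start timestamp is read, so one extra tick is
	 * needed to guarantee the requested duration.
	 */
	return ticks + 1;
}

/*
 * Hypothetical helper: shortest real wait, in nanoseconds, when
 * spinning for `ticks` counter increments (worst case loses one tick).
 */
static uint64_t min_wait_ns(uint64_t ticks, uint64_t timer_hz)
{
	return (ticks - 1) * 1000000000u / timer_hz;
}
```

With an assumed 27 MHz counter, a 1 us request becomes 28 ticks, and
the shortest possible wait is then exactly 1000 ns. Without the guard
tick it could be about 963 ns, i.e. an under-delay. The guard is an
absolute term (one tick period), which is why a fixed percentage bump
does not cover every input value.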
>> 3) Why does all this even matter?
>>
>> At boot, the NAND framework scans the NAND chips for bad blocks;
>> this operation generates approximately 10^5 calls to ndelay(100);
>> which cause a 100 ms delay, because ndelay is implemented as a
>> call to the nearest udelay (rounded up).
>
> So why aren't you doing that on both NANDs in parallel and asynchronous
> to other parts of boot ? If you start scanning at early boot time do you
> need the bad block list before mounting / - or are you stuck with a
> single threaded CPU and PIO ?

There might be some low(ish) hanging fruit to improve the performance
of the NAND framework, such as multi-page reads/writes. But the NAND
controller on my SoC muxes access to the two NAND chips, so no parallel
access, and this requires PIO.

> For that matter given the bad blocks don't randomly change why not cache
> them ?

That's a good question, I'll ask the NAND framework maintainer.
Store them where, by the way? On the NAND chip itself?

>> My current NAND chips are tiny (2 x 512 MB) but with larger chips,
>> the number of calls to ndelay would climb to 10^6 and the delay
>> increase to 1 second, which is starting to be a problem.
>>
>> One solution is to implement ndelay, but ndelay is more prone to
>> under-delays, and thus a prerequisite is fixing under-delays.
>
> For ndelay you probably have to make it platform specific or just use
> udelay if not. We do have a few cases we wanted 400ns delays in the PC
> world (ATA) but not many.

By default, ndelay is implemented in terms of udelay.

Regards.
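
For reference, the boot-time numbers quoted above can be checked with
a small model. This is only a sketch of the arithmetic: the function
names are hypothetical, the fallback is modelled as "round the
nanosecond count up to whole microseconds", and the "native" variant
assumes an idealized ndelay that waits exactly the requested time.

```c
#include <stdint.h>

/*
 * Hypothetical model of the default fallback: ndelay(ns) turns into
 * udelay() of the nanosecond count rounded up to whole microseconds.
 */
static uint64_t ndelay_fallback_us(uint64_t ns)
{
	return (ns + 999) / 1000;
}

/* Total busy-wait time, in ns, for `calls` calls through the fallback. */
static uint64_t total_fallback_ns(uint64_t calls, uint64_t ns)
{
	return calls * ndelay_fallback_us(ns) * 1000;
}

/* Total busy-wait time, in ns, for an idealized native ndelay. */
static uint64_t total_native_ns(uint64_t calls, uint64_t ns)
{
	return calls * ns;
}
```

In this model, 10^5 calls to ndelay(100) cost 100 ms through the
fallback versus 10 ms with an exact ndelay, and 10^6 calls cost
1 second versus 100 ms.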