Date: Fri, 20 Sep 2019 21:37:40 +0200
From: Willy Tarreau
To: Andy Lutomirski
Cc: Linus Torvalds, "Ahmed S. Darwish", Lennart Poettering, "Theodore Y. Ts'o", "Eric W. Biederman", "Alexander E. Patrakov", Michael Kerrisk, Matthew Garrett, lkml, Ext4 Developers List, Linux API, linux-man
Subject: Re: [PATCH RFC v4 1/1] random: WARN on large getrandom() waits and introduce getrandom2()
Message-ID: <20190920193740.GD1889@1wt.eu>
References: <20190918211503.GA1808@darwi-home-pc> <20190918211713.GA2225@darwi-home-pc> <20190920134609.GA2113@pc> <20190920181216.GA1889@1wt.eu>
X-Mailing-List: linux-ext4@vger.kernel.org

On Fri, Sep 20, 2019 at 12:22:17PM -0700, Andy Lutomirski wrote:
> Perhaps userland could register a helper that takes over and does
> something better?

If userland sees the failure, it can do whatever the developer or distro
packager thought suitable for a system facing this condition.

> But I think the kernel really should do something
> vaguely reasonable all by itself.

Definitely, that's what Linus' proposal was doing. Sleeping for some time
is what I call "vaguely reasonable".

> If nothing else, we want the ext4
> patch that provoked this whole discussion to be applied,

Oh absolutely!

> which means
> that we need to unbreak userspace somehow, and returning garbage to it
> is not a good choice.

It depends how it's used. I'd claim that we certainly use random numbers
for other things (such as ASLR/hashtables) *before* using them to generate
long-lived keys, so we have a bit more time to gather entropy before
reaching the point of producing these keys.

> Here are some possible approaches that come to mind:
>
> int count;
> while (crng isn't inited) {
>     msleep(1);
> }
>
> and modify add_timer_randomness() to at least credit a tiny bit to
> crng_init_cnt.

Without a timeout, we will certainly still face situations where it
blocks forever, which is the current problem.

> Or we do something like intentionally triggering readahead on some
> offset on the root block device.

You don't necessarily have such a device, especially when you're in an
initramfs. That is precisely where userland can be smarter. When the
caller is sfdisk, for example, it has more opportunities to perform I/O
than a tiny http server starting up to present a configuration page.

> We should definitely not trigger *blocking* IO.

I think I agree.

> Also, I wonder if the real problem preventing the RNG from starting up
> is that the crng_init_cnt threshold is too high. We have a rather
> baroque accounting system, and it seems like we can accumulate and
> credit entropy for a very long time indeed without actually
> considering ourselves done.

I have no opinion on this, lacking the skills to evaluate the situation.
What I can say for sure is that I've faced the non-booting issue quite a
number of times on headless systems, and conversely, in the 2.4 era, my
front reverse-proxy at the time had the same SSH key as 89 other machines
on the net. So there's surely a sweet spot to find between those two
extremes.

I tend to think that waiting *a little bit* for the *first* random number
is acceptable, even 10-15s: by the time the user starts to think about
pressing the reset button, the system might have finished booting.

Hashing some RAM locations and the RTC when present can also help a little
bit. If my machine back then had at least combined the RTC's date and time
with the hash, the chances of a key collision would have gone down to one
in many thousands.

Willy
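
As an illustration of the bounded wait discussed above (this is not code
from the thread), a userland caller could poll getrandom() with
GRND_NONBLOCK and give up after a deadline instead of blocking forever.
The helper name getrandom_bounded(), the 100 ms polling interval and the
use of ETIMEDOUT are assumptions made for this sketch only.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>
#include <time.h>

/*
 * Hypothetical helper (name and policy are assumptions, not kernel or
 * libc API): fill buf with len random bytes, polling the kernel every
 * 100 ms while the CRNG is not yet initialized, and giving up once
 * roughly timeout_ms milliseconds have been spent waiting.
 *
 * Returns 0 on success, -1 on error with errno set (ETIMEDOUT if the
 * pool never became ready within the allowed time).
 */
static int getrandom_bounded(void *buf, size_t len, long timeout_ms)
{
    const struct timespec pause = { .tv_sec = 0, .tv_nsec = 100L * 1000 * 1000 };
    unsigned char *p = buf;
    long waited_ms = 0;

    while (len > 0) {
        /* GRND_NONBLOCK: fail with EAGAIN instead of sleeping in the kernel. */
        ssize_t n = getrandom(p, len, GRND_NONBLOCK);

        if (n > 0) {
            /* Got some bytes; keep going until the buffer is full. */
            p += n;
            len -= (size_t)n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;               /* interrupted by a signal, retry */
        if (n < 0 && errno != EAGAIN)
            return -1;              /* real error, e.g. ENOSYS on old kernels */

        /* Pool not ready yet: wait a little, but never forever. */
        if (waited_ms >= timeout_ms) {
            errno = ETIMEDOUT;
            return -1;
        }
        nanosleep(&pause, NULL);    /* approximate: an early wakeup still counts as 100 ms */
        waited_ms += 100;
    }
    return 0;
}
```

A caller generating a long-lived key might pass a timeout in the 10-15s
range mentioned above, e.g. getrandom_bounded(key, sizeof(key), 15000),
and on ETIMEDOUT decide for itself what is suitable: retry later, try to
generate some I/O, or report the failure.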