Received: by 2002:a25:824b:0:0:0:0:0 with SMTP id d11csp1666688ybn; Wed, 2 Oct 2019 21:11:50 -0700 (PDT) X-Google-Smtp-Source: APXvYqy9XSq/orKRbXIEhbLe3+hn9pQAU5+BQ5DTztRDh5DQ0FftQEjHmpq/RxR/I3AMSM+CKhYe X-Received: by 2002:a17:906:3495:: with SMTP id g21mr6167905ejb.100.1570075910243; Wed, 02 Oct 2019 21:11:50 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1570075910; cv=none; d=google.com; s=arc-20160816; b=HnKl132AwRJ4uAnTLxl176CgkthJG8cjMcCFwRd3DrW5lmfPR8olsftPxFGNStSxWk KmJguoaFRqC4wczDL5QHa4IASH9JWmrmgE3ze57POt1626MblGsnqb1/aqvk455BGr4G bemWB/ghv0lVfQGTQcR9XjCx04ybsrQ6BPcUelk9bqZHjMF63T4/7gVIac5iojjfBAkm NvhJxnRNxoDnHl+OGMK2NUMX0Nbpj3fUvOPzzUPkRaZfwpvDWBTMwoERO0zm5aq9Trem cIMHfVaYDDsklUHCEB/4RpbI7sntCTMJBa6/xN4J41nnUSGVIsxSPhwHxfE632bRN2pV g8AQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-transfer-encoding:content-disposition:mime-version :references:message-id:subject:cc:to:from:date; bh=sYtkrFdT/o0bN1f18JwCr4yI9zR29jQJXBHud8isiUc=; b=dOU6TbIAHhFV8EOIkXp1Pdq7tS8Vl9d2SS9RQz+7YRSHwCiSeI8CWhY6o1R/PeDD/X mPeg1Zmi2A+DYX+M+/HM3jJZGlCxwCccvlo46zNmyd+ngti7Cq7UmhaR7/UqbaQ+woB2 DZmcojhpUPfyGdYo1+TUMpK+tQre2zXFfC0IXkom/XSMYdAFFKS0JjZrVEYWln2rw6TE JQMFIDZGv9tXVvtWyP9q+4K0MD966/Gnt3NyeCsXrYTCgookH6WX/B2Vcr6Wl8iUY6Bm 1MFC+52wOA2xqtBr0cH8LVCdk91BmAB3i26LiiLDjRh3IN5dU82b499pb4TVoRAa95k7 5GfQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id l45si770347edb.18.2019.10.02.21.11.25; Wed, 02 Oct 2019 21:11:50 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727207AbfJCDhL (ORCPT + 99 others); Wed, 2 Oct 2019 23:37:11 -0400 Received: from outgoing-auth-1.mit.edu ([18.9.28.11]:50401 "EHLO outgoing.mit.edu" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1725860AbfJCDhL (ORCPT ); Wed, 2 Oct 2019 23:37:11 -0400 Received: from callcc.thunk.org (pool-72-93-95-157.bstnma.fios.verizon.net [72.93.95.157]) (authenticated bits=0) (User authenticated as tytso@ATHENA.MIT.EDU) by outgoing.mit.edu (8.14.7/8.12.4) with ESMTP id x933atx5019088 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 2 Oct 2019 23:36:56 -0400 Received: by callcc.thunk.org (Postfix, from userid 15806) id 523C142088C; Wed, 2 Oct 2019 23:36:55 -0400 (EDT) Date: Wed, 2 Oct 2019 23:36:55 -0400 From: "Theodore Y. Ts'o" To: Kurt Roeckx Cc: linux-kernel@vger.kernel.org Subject: Re: Stop breaking the CSRNG Message-ID: <20191003033655.GA3226@mit.edu> References: <20191002165533.GA18282@roeckx.be> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20191002165533.GA18282@roeckx.be> User-Agent: Mutt/1.10.1 (2018-07-13) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Oct 02, 2019 at 06:55:33PM +0200, Kurt Roeckx wrote: > > But it seems people are now thinking about breaking getrandom() too, > to let it return data when it's not initialized by default. Please > don't. "It's complicated" The problem is that whether a CRNG can be considered secure is a property of the entire system, including the hardware, and given the large number of hardware configurations which the kernel and OpenSSL can be used, in practice, we can't assure that getrandom(2) is "secure" without making certain assumptions. For example, if we assume that the CPU is an x86 processor new enough to support RDRAND, and that RDRAND is competently implemented (e.g., it won't disappear after a suspend/resume) and doesn't have any backdoors implanted in it, then it's easy to say that getrandom() will always be secure. But if you assume that there is no hardware random number generator, and everything is driven from a single master oscillator, with no exernal input, and the CPU is utterly simple, with speculation or anything else that might be non-determinstic, AND if we assume that the idiots who make an IOT device use the same random seed across millions of devices all cloned off of the same master imagine, there is ***absoutely*** nothing the kernel can do to guarantee, with 100% certainty, that the CRNG will be initialzied. (This is especially true if the idiots who design the IOT device call OpenSSL to generate their long-term private key the moment the device is first plugged in, before any networking device is brought on-line.) The point with all of this is that both the kernel and OpenSSL, and whether or not they can be combined to create a secure overall solution is going to be dependent on the hardware choices, and choices of the distribution and the application programmers in terms of what other software components are used, and when and where those components try to request random numbers, especially super-early in the boot process. Historically, I've tried to work around this problem by being super paranoid about the choices of thresholds before declaring the CRNG to be initialized, while *also* making sure that at least on most common x86 systems, the CRNG could be considered initialized before the root file system was mounted read/write. But over time, assumptions of what is common hardware changes. SSD's replace HDD's; NAPI and other polling techniques are more common to reduce the number of interrupts; the use of a single master oscillator to drive the all of the various clocks on the system, etc. And software changes --- systemd running boot scripts in parallel means that boot times are reduced, which is good, but it also means the time to when the root is mounted read/write is much shortened. So in the absence of a hardware RNG, or a hardware random number generator which is considered trusted (i.e., should RDRAND beconsidered trusted?), there *will* be times when we will simply fail to be able to generate secure random numbers (at least by our hueristics, which can potentially be overly optimistic on some hardware platforms, and overly conservative on others). The question is then, what do we do? Do we hang the boot --- at which point users will complain to Linus? Or do we just hope that things are "good enough", and that even if the user has elected to say that they don't trust RDRAND, that we'll hope it's competently implement and not backdoored? Or do we assume that using a jitter entropy scheme is actually secure, as opposed to security through obscurity (and maybe is completely pointless on a simple and completely open architecture with no speculation such as RISC-V)? There really are no good choices here. The one thing which Linus has made very clear is that hanging at boot is Not Acceptable. Long term, the best we can do is to through the kitchen sink at the problem. So we should try to use UEFI's RNG if available; use the TPM's RNG if available; use RDRAND if available; try to use a seed file if available (and hope it's not cloned to be identical on a million IOT devices); and so on. Hopefully, they won't *all* incompetently implemented and/or implanted with a backdoor from the NSA or MSS or the KGB. The only words of hope that I can give you is that it's likely that there are so many zero day bugs in the kernel, in userspace applications, and crypto libraries (including maybe OpenSSL), that we don't have to make the CRNG impossible to attack in order to make a difference. We just have to make it harder than finding and exploiting zero day security bugs in *other* parts of the system. "When a mountain bear is chasing after you, you don’t have to outrun the bear. You only have to outrun the person running next to you." :-) Bottom line, we can do the best we can with each of our various components, but without control over the hardware that will be in use, or for OpenSSL, what applications are trying to call OpenSSL for, and when they might try to generate long-term public keys during the first boot, perfection is always going to be impossible to achieve. The only thing we can choose is how do we handle failure. And Linus has laid down the law that a performance improving commit should never cause boot-ups to hang due to the lack of randomness. Given that I can't control when some application might try to call OpenSSL to generate a long-term public key, and OpenSSL certainly can't control if it gets called during early boot, if getrandom(2) ever boots, we can't meet Linus's demand. And given that many users are just installing some kind of userspace jitter entropy to square this particular circle, even though I don't trust a jitter entropy scheme, even if it is insecure, we're also using RDRAND, and ultimately I'll trust RDRAND more than I trust a jitter entropy scheme. And that's where we are right now. Linus has introduced a simple in-kernel jitter entropy system so getrandom(2) will never boot. Is it secure? Who can say? I have my doubts on RISC-V, but I don't use a RISC-V, and hopefully this will be a spur to encourage all RISC-V implementations to include the cryptographic extensions which include a RDRAND-like hardware random number generator into ISA. And since all of *my* x86 systems have RDRAND, I'm at least personally comfortable enough with where we've landed. Your mileage may vary. Regards, - Ted