Received: by 2002:a05:6358:d09b:b0:dc:cd0c:909e with SMTP id jc27csp2459440rwb; Sat, 19 Nov 2022 16:54:45 -0800 (PST) X-Google-Smtp-Source: AA0mqf4AicVjgSP/IbT6eMA1kHi3ZveVi5DJOKl/PXj2XOO6jWvZ8yiowzDZI53HdFLdItY+OYua X-Received: by 2002:a17:906:9d16:b0:7ae:c45b:98fb with SMTP id fn22-20020a1709069d1600b007aec45b98fbmr1233011ejc.478.1668905684833; Sat, 19 Nov 2022 16:54:44 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1668905684; cv=none; d=google.com; s=arc-20160816; b=hOfxVj635ISCCq9dc8ySPpZvYXEQi5Oy+gxQKtjdNnSra4MJjS5I8sMpw0rBNxWrY2 Wbm2Tuof4eiMYRGXF2HwvTicyJNBnU6MfZFZNsSwm0AY8GCnpxOmaQOQMbtnmkOyWIUS FF5J9H+HajCwFjXBoRtcIdGYLeBolPZhzDTguf99U8maZIGU8CaCISNg5YiMgB/SceCN uoQDpcOXa5cFKTmSus9JyLWITL1hAjZ062S6o/X2yHzUfm+Uvh4Z9jRagHXBE1fNbQsg 1JtZ7NOGwz3gNpgAuay9wdC3ga/8KsGW0FHBPjUBm8DoDL9+fZww20Zu0pwbZXCUarbF sUfA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-transfer-encoding :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=FULu1Q/R0+VeycCL68dPfbDP/Ga3miiXDWri1BQ6mXA=; b=xPEdJrfNlmuhI9SKUfd/bXI7uaV+0zvUrHyknn690kTI8Jv4WKGu04DAnvJa+l2h5G aoKOfBXzgearVpXqB0vfDb8pCjrNlkudk111MA9/iCkjlqqBA6ifN4k5MS6Ent68tjJu r8IlPnfpamFhsjj3TmX7O9LEVO0k9Cnr8AM+gtp5XhuaBJIuMBL3Xd4gohQSIz+sS5WC E83WjK6TTu+EE+PLB+cFpUpI5+OJIyNp29F8lOqz/gPmA+Bcu5Aei1I9FJybid8hvbfg XrygCCmQYoEkmztrzDJsyiS8vu5bPLNS2ZKDESzhlDy1GYcKvyBMQPYEQxzq6l69pOKl X7UA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@zx2c4.com header.s=20210105 header.b=pvoHQHqV; spf=pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=zx2c4.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id sz15-20020a1709078b0f00b0077951929340si5662148ejc.271.2022.11.19.16.54.10; Sat, 19 Nov 2022 16:54:44 -0800 (PST) Received-SPF: pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@zx2c4.com header.s=20210105 header.b=pvoHQHqV; spf=pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=zx2c4.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229455AbiKTAyH (ORCPT + 99 others); Sat, 19 Nov 2022 19:54:07 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56604 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229437AbiKTAyG (ORCPT ); Sat, 19 Nov 2022 19:54:06 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 794D0B10ED; Sat, 19 Nov 2022 16:54:02 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 2572CB80A3A; Sun, 20 Nov 2022 00:54:01 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8CAD1C433D6; Sun, 20 Nov 2022 00:53:58 +0000 (UTC) Authentication-Results: smtp.kernel.org; dkim=pass (1024-bit key) header.d=zx2c4.com header.i=@zx2c4.com header.b="pvoHQHqV" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=zx2c4.com; s=20210105; t=1668905636; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FULu1Q/R0+VeycCL68dPfbDP/Ga3miiXDWri1BQ6mXA=; b=pvoHQHqV36ye0LiiuIq8hjj8mpZn7fHMacpZSb+LT50mP6qaDpeVxij5gR2hs0Kh7RNFHJ WDc5GchA3f9donHrYRZLx6RkUOs+KIVLb9lv2g/pZIuRCu2RQ09Zn1EdZEcPglV8JbC0VT VL1C3Xpk868v/xDnvgAnebMb+pTMvuE= Received: by mail.zx2c4.com (ZX2C4 Mail Server) with ESMTPSA id ac7f67a7 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Sun, 20 Nov 2022 00:53:55 +0000 (UTC) Date: Sun, 20 Nov 2022 01:53:53 +0100 From: "Jason A. Donenfeld" To: Eric Biggers Cc: linux-kernel@vger.kernel.org, patches@lists.linux.dev, linux-crypto@vger.kernel.org, x86@kernel.org, Thomas Gleixner , Greg Kroah-Hartman , Adhemerval Zanella Netto , Carlos O'Donell Subject: Re: [PATCH v5 2/3] random: introduce generic vDSO getrandom() implementation Message-ID: References: <20221119120929.2963813-1-Jason@zx2c4.com> <20221119120929.2963813-3-Jason@zx2c4.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Spam-Status: No, score=-6.8 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, RCVD_IN_DNSWL_HI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Hi Eric, On Sat, Nov 19, 2022 at 03:10:12PM -0800, Eric Biggers wrote: > > + if (IS_ENABLED(CONFIG_HAVE_VDSO_GETRANDOM)) > > + smp_store_release(&_vdso_rng_data.generation, next_gen + 1); > > Is the purpose of the smp_store_release() here to order the writes of > base_crng.generation and _vdso_rng_data.generation? It could use a comment. > > > + if (IS_ENABLED(CONFIG_HAVE_VDSO_GETRANDOM)) > > + smp_store_release(&_vdso_rng_data.is_ready, true); > > Similarly, is the purpose of this smp_store_release() to order the writes to the > the generation counters and is_ready? It could use a comment. Yes, I guess so. Actually this comes from an unexplored IRC comment from Andy back in July: 2022-07-29 21:21:56 zx2c4: WRITE_ONCE(_vdso_rng_data.generation, next_gen + 1); 2022-07-29 21:22:23 For x86 it shouldn’t matter much. For portability, smp_store_release Though maybe that doesn't actually matter much? When the userspace CPU learns about a change to vdso_rng_data, it's only course of action is make a syscall to getrandom(), anyway, and those paths should be consistent with themselves, thanks to the same locking and synchronization there's always been there. So maybe I actually should move back to WRITE_ONCE() here? Hm? > > +static void memcpy_and_zero(void *dst, void *src, size_t len) > > +{ > > +#define CASCADE(type) \ > > + while (len >= sizeof(type)) { \ > > + *(type *)dst = *(type *)src; \ > > + *(type *)src = 0; \ > > + dst += sizeof(type); \ > > + src += sizeof(type); \ > > + len -= sizeof(type); \ > > + } > > +#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS > > +#if BITS_PER_LONG == 64 > > + CASCADE(u64); > > +#endif > > + CASCADE(u32); > > + CASCADE(u16); > > +#endif > > + CASCADE(u8); > > +#undef CASCADE > > +} > > CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS doesn't mean that dereferencing > misaligned pointers is okay. You still need to use get_unaligned() and > put_unaligned(). Take a look at crypto_xor(), for example. Right, thanks. Will do. > There's a lot of subtle stuff going on here. Adding some more comments would be > helpful. Maybe bring in some of the explanation that's currently only in the > commit message. Good idea. > One question I have is about forking. So, when a thread calls fork(), in the > child the kernel automatically replaces all vgetrandom_state pages with zeroed > pages (due to MADV_WIPEONFORK). If the child calls __cvdso_getrandom_data() > afterwards, it sees the zeroed state. But that's indistinguishable from the > state at the very beginning, after sys_vgetrandom_alloc() was just called, > right? So as long as this code handles initializing the state at the beginning, > then I'd think it would naturally handle fork() as well. Right, for this simple fork() case, it works fine. There are other cases though that are trickier... > However, it seems you have something a bit more subtle in mind, where the thread > calls fork() *while* it's in the middle of __cvdso_getrandom_data(). I guess > you are thinking of the case where a signal is sent to the thread while it's > executing __cvdso_getrandom_data(), and then the signal handler calls fork()? > Note that it doesn't matter if a different thread in the *process* calls fork(). > > If it's possible for the thread to fork() (and hence for the vgetrandom_state to > be zeroed) at absolutely any time, it probably would be a good idea to mark that > whole struct as volatile. Actually, this isn't something that matters, I don't think. If state->key_batch is zeroed, the result will be wrong, but the function logic will be fine. If state->pos is zeroed, it'll write to the beginning of the batch, which might be wrong, but the function logic will still be fine. That is, in both of these cases, even if the calculation is wrong, there's no memory corruption or anything. So then, the remaining member is state->generation. If this is zeroed, then it's actually something we detect with that READ_ONCE()! And in this case, it's a sign that something is off -- we forked -- and so we should start over from the beginning. So I don't think there's a reason to mark the whole struct as volatile. The one we care about is state->generation, and for that we READ_ONCE() it at the place that matters. There's actually a different scenario, though, that I'm concerned about, and this is the case in which a multithreaded program forks in the middle of one of its threads running this. Indeed, only the calling thread will carry forward into the child process, but all the memory is still left around from any concurrent threads in the middle of vgetrandom(). And if they're in the middle of a vgetrandom() call, that means they haven't yet done erasure and cleaned up the stack to prevent their state from leaking, and so forward secrecy is potentially lost, since the child process now has some state from the parent. I'm not quite sure what the best approach here is. One idea would be to just note that libcs should wait until vgetrandom() has returned everywhere before forking, using its atfork functionality. Another approach would be to say that multithreaded programs using this shouldn't fork or something, but that seems disappointing. Or more state could be allocated in the zeroing region, to hold a chacha state, so another 64 bytes, which would be sort of unfortunate. Or something else? I'd be interested to hear your impression of this quandary. Jason