Date: Sun, 15 Sep 2019 21:12:58 +0200
From: Willy Tarreau
To: Linus Torvalds
Cc: "Theodore Y. Ts'o", "Alexander E. Patrakov", "Ahmed S.
 Darwish", Michael Kerrisk, Andreas Dilger, Jan Kara, Ray Strode,
 William Jon McCann, zhangjs, linux-ext4@vger.kernel.org, lkml,
 Lennart Poettering
Subject: Re: [PATCH RFC v2] random: optionally block in getrandom(2) when the CRNG is uninitialized
Message-ID: <20190915191258.GA23212@1wt.eu>
References: <20190911173624.GI2740@mit.edu> <20190912034421.GA2085@darwi-home-pc>
 <20190912082530.GA27365@mit.edu> <20190914122500.GA1425@darwi-home-pc>
 <008f17bc-102b-e762-a17c-e2766d48f515@gmail.com> <20190915052242.GG19710@mit.edu>
 <20190915183240.GA23155@1wt.eu>
X-Mailing-List: linux-ext4@vger.kernel.org

On Sun, Sep 15, 2019 at 11:59:41AM -0700, Linus Torvalds wrote:
> > In addition, since you're leaving the door open to bikeshed around
> > the timeout value, I'd say that while 30s is usually not huge in a
> > desktop system's life, it actually is a lot in network environments
> > when it delays a switchover.
>
> Oh, absolutely.
>
> But in that situation you have a MIS person on call, and somebody who
> can fix it.
>
> It's not like switchovers happen in a vacuum. What we should care
> about is that updating a kernel _works_. No regressions. But if you
> have some five-nines setup with switchover, you'd better have some
> competent MIS people there too. You don't just switch kernels without
> testing ;)

I mean maybe I didn't use the right term, but typically in networked
environments you'll have watchdogs on sensitive devices (e.g. the
default gateways and load balancers), which will trigger an instant
reboot of the system if something really bad happens. It can range
from a dirty oops, FS remounted R/O, pure freeze, OOM, missing process,
panic etc. And here the reset, which used to take roughly 10s to get
all the services back up for operations, suddenly takes 40s.
My point is that I won't have issues explaining to users that 10s or 13s
is the same when they rely on five nines, but trying to argue that 40s
is identical to 10s will be a hard position to stand by.

And actually there are other dirty cases. Such systems often work in
active-backup or active-active modes. One typical issue is that the
primary system reboots, the second takes over within one second, and
once the primary system is back *apparently* operating, some processes
which appear to be present and which possibly have already bound their
listening ports are waiting for 30s in getrandom() while the monitoring
systems around see them as ready. Thus the primary machine goes back to
its role and cannot reliably run the service for the first 30 seconds,
which roughly multiplies the downtime by 30. That's why I'd like to make
it possible to lower this value (either permanently or by cmdline, as I
think it can be fine for all those who care about down time).

Willy
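
The downtime arithmetic in the message can be sketched as a small model. This is only an illustration, not anything taken from the patch: the function name and its parameters are hypothetical, and it assumes that monitoring hands the role back to the primary as soon as its ports are bound, and that the affected processes then spend the full wait blocked in getrandom().

```python
def downtime_seconds(takeover_s, entropy_wait_s, premature_failback):
    """Rough service downtime, in seconds, for one reboot of the primary.

    Hypothetical model for illustration only; all names are assumptions.
    """
    # The backup needs takeover_s to take over once the primary goes away.
    downtime = takeover_s
    # If monitoring fails back while services still block in getrandom(),
    # the primary cannot serve for the duration of the entropy wait.
    if premature_failback:
        downtime += entropy_wait_s
    return downtime


# Figures from the message: 1s takeover, 30s spent waiting in getrandom().
clean = downtime_seconds(1, 30, premature_failback=False)
stalled = downtime_seconds(1, 30, premature_failback=True)
print(clean, stalled, stalled / clean)

# Standalone watchdog reboot: roughly 10s of boot plus the 30s wait.
print(10 + 30)
```

Under these assumptions the failover case costs 31 seconds instead of 1, which is the "roughly 30 times" figure, and the standalone reboot goes from about 10s to about 40s. Lowering the wait shrinks both numbers directly, which is the argument for making it tunable.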