Date: Sun, 18 Aug 2002 09:59:41 -0700 (PDT)
From: Linus Torvalds
To: Oliver Xymoron
cc: linux-kernel
Subject: Re: [PATCH] (0/4) Entropy accounting fixes
In-Reply-To: <20020818052417.GL21643@waste.org>

On Sun, 18 Aug 2002, Oliver Xymoron wrote:
>
> The key word is actually conservative, as in conservative estimate.
> Conservative here means less than or equal to.

My argument is that even with a gigahertz logic analyzer on the network
line, you would still see randomness that is worth considering.

I dare you to actually show perfect correlation from it: the interrupt
may be synchronized to the PCI clock, but the code executed thereafter
certainly will not be. And even if the machine is 100% idle, and the
whole working set fits in the L1 cache, the DMA generated by the packet
itself will result in cache invalidations.

In other words, in order for you to actually be able to predict the TSC
from the outside, you'd have to not just have the gigahertz logic
analyzer on the network line, you'd also have to be able to correlate
the ethernet heartbeat to the PCI clock (which you probably could do by
looking at the timing of the reply packets from a ping flood, although
it would be "interesting" to say the least and probably depends on how
the network card generates the ethernet clock), _and_ you'd have to be
able to do a cache eviction analysis (which in turn requires knowing the
initial memory layout for the kernel data structures for networking).

And your argument that there is zero randomness in the TSC _depends_ on
your ability to perfectly estimate what the TSC is. If you cannot do it,
there is obviously at least one bit of randomness there. So I don't
think your "zero" is a good conservative estimate. At some point being
conservative turns into being useless [ insert obligatory political joke
here ].

[ Side note: the most common source of pseudo-random numbers is the old
  linear congruential generator, which really is a sampling of a "beat"
  between two frequencies that are supposed to be "close", but prime.
  That's a fairly simple and accepted pseudo-random generator _despite_
  the fact that the two frequencies are totally known, and there is zero
  noise inserted. I'll bet you'll see a _very_ hard-to-predict stream
  from something like the PCI clock / CPU clock thing, with noise
  inserted thanks to things like cache misses and shared bus
  interactions. Never mind the _real_ noise of having a work-load. ]

> No, it says /dev/random is primarily useful for generating large
> (>>160 bit) keys.

Which is exactly what something like sshd would want to use for
generating keys for the machine, right? That is _the_ primary reason to
use /dev/random.

Yet apparently our /dev/random has been too conservative to be actually
useful, because (as you point out somewhere else) even sshd uses
/dev/urandom for the host key generation by default. That is really sad.
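For concreteness, the kind of thing a key-generating program ends up
doing is roughly the sketch below. This is only an illustration of
pulling seed bytes out of /dev/urandom, not sshd's actual code:

#include <stdio.h>

int main(void)
{
        unsigned char seed[32];         /* 256 bits of seed material */
        FILE *f = fopen("/dev/urandom", "rb");

        if (!f || fread(seed, 1, sizeof(seed), f) != sizeof(seed)) {
                perror("/dev/urandom");
                return 1;
        }
        fclose(f);

        /* ... hand 'seed' to the actual key generation ... */
        return 0;
}

Point that at /dev/random instead and the same program can sit blocked
for a long time on a quiet box, which is presumably exactly why the
sshd default is what it is.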
Host key generation is the _one_ application that is common and that
should really have a reason to maybe care about /dev/random vs urandom.
And that application uses urandom. To me that says that /dev/random has
turned out to be less than useful in real life.

Is there anything that actually uses /dev/random at all (except for
clueless programs that really don't need to)?

Please realize that _this_ is my worry: making /dev/random so useless
that any practical program has no choice but to look elsewhere.

> Actually, half of the point here is in fact to make /dev/urandom safer
> too, by allowing mixing of untrusted data that would otherwise
> compromise /dev/random.

Now this I absolutely agree with. The xor'ing of the buffer data is
clearly a good idea (see the little sketch at the end of this mail). I
agree 100% with this part. You'll see no arguments against this part at
all.

> 99.9% of users aren't using network sampling currently; after these
> patches we can turn it on for everyone and still sleep well at night.
> See?

Oh, that's the _good_ part. Yes. The bad part is that I think our
current /dev/random is close to useless already, and I'd like to
reverse that trend.

> That is an interesting point. A counterpoint is that if we account so
> much as 1 bit of entropy per network interrupt on a typical system,
> the system will basically _always_ feel comfortable (see
> /proc/interrupts). It will practically never block and thus it is
> again identical to /dev/urandom.

But what's the problem with that? The "/dev/random may block" property
is not the intrinsic value of /dev/random - if people want to wait they
are much better off just using "sleep(1)" than trying to read from
/dev/random.

My argument is that on a typical system there really _is_ so much
randomness that /dev/random is actually a useful thing. I think that
you'd have to _work_ at finding a system where /dev/random should block
under any normal use (where "normal use" also obviously means that only
programs that really need it would use it, ie ssh-keygen etc).

		Linus
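PS. To be concrete about the xor'ing point above: the idea is just to
fold the untrusted bytes into the pool without crediting any entropy
for them. A deliberately dumbed-down sketch of the idea (this is not
the real drivers/char/random.c code, and the names are made up):

#include <stddef.h>

#define POOL_BYTES 512

static unsigned char pool[POOL_BYTES];
static unsigned int pool_pos;
static unsigned int entropy_count;      /* bits, bumped only for trusted sources */

/* Fold untrusted bytes (e.g. network interrupt timing) into the pool. */
static void mix_untrusted(const unsigned char *buf, size_t len)
{
        size_t i;

        for (i = 0; i < len; i++) {
                pool[pool_pos] ^= buf[i];
                pool_pos = (pool_pos + 1) % POOL_BYTES;
        }
        /* entropy_count is deliberately left alone here */
}

Mixing in data that an attacker may already know cannot make the pool
easier to guess, so it is safe to do even for completely untrusted
sources; it is only the entropy _accounting_ that has to stay
conservative.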