Date: Sun, 22 Sep 2013 22:43:38 -0400
From: "Theodore Ts'o" <tytso@thunk.org>
To: Jörn Engel
Cc: Jörg-Volker Peetz, John Stultz, Stephan Mueller, LKML,
	dave.taht@bufferbloat.net, Frederic Weisbecker, Thomas Gleixner
Subject: Re: [PATCH,RFC] random: make fast_mix() honor its name
Message-ID: <20130923024338.GF7321@thunk.org>
In-Reply-To: <20130923001623.GD4584@logfs.org>

On Sun, Sep 22, 2013 at 08:16:23PM -0400, Jörn Engel wrote:
> How about we switch between the two mixing functions depending on the
> interrupt load?  If this CPU has seen fewer than 1000 interrupts in
> the last second, use the better one, otherwise use the cheaper one?

I guess the question here is whether it's worth it.  On a 2.8 GHz
Ivy Bridge laptop chip the numbers are:

   Original fast_mix:  84 ns
   tytso's fast_mix:   14 ns
   joern's fast_mix:    8 ns

In terms of absolute overhead, even at an insane 100k interrupts per
second, that works out to only 0.84%, 0.14%, and 0.08% of one CPU,
respectively (84 ns x 100,000/sec = 8.4 ms of CPU time per second,
i.e. 0.84%).  Granted, an embedded CPU will be (much) slower, but so
will the rest of the interrupt handling code path, plus whatever
overhead the device driver adds.  The real bug is the 100k
interrupts-per-second workload itself.

How about this as a compromise?  We add an #ifdef to the random.c
code carrying the alternate fast_mix algorithm, so that in the very
rare case where some embedded software engineer, under time pressure
and stuck with a pathetically broken hardware design, goes looking in
random.c, he or she will find the alternate version.  That way we
avoid the complexity of a dynamic switching system, plus the overhead
of measuring the number of interrupts per second.
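As a rough sketch of what that compile-time switch could look like
(illustrative only: the config symbol and both mix bodies below are
invented for this example, not the actual code from the patches in
this thread):

	/*
	 * Sketch only -- NOT the real drivers/char/random.c code.
	 * CONFIG_RANDOM_CHEAP_FAST_MIX and both mix functions are
	 * made up; the real candidates are in this thread's patches.
	 */
	#include <stdint.h>
	#include <stdio.h>

	struct fast_pool {
		uint32_t pool[4];	/* deliberately tiny; see below */
	};

	static inline uint32_t rol32(uint32_t w, unsigned s)
	{
		return (w << s) | (w >> (32 - s));
	}

	#ifdef CONFIG_RANDOM_CHEAP_FAST_MIX
	/* Cheap variant: a few xors and rotates per input word. */
	static void fast_mix(struct fast_pool *f, uint32_t input)
	{
		f->pool[0] = rol32(f->pool[0] ^ input, 7);
		f->pool[1] ^= f->pool[0];
		f->pool[2] ^= rol32(f->pool[1], 13);
		f->pool[3] ^= f->pool[2];
	}
	#else
	/* Heavier variant: each word is stirred into its neighbor. */
	static void fast_mix(struct fast_pool *f, uint32_t input)
	{
		int i;

		f->pool[0] ^= input;
		for (i = 0; i < 4; i++)
			f->pool[(i + 1) & 3] ^= rol32(f->pool[i], 7 + 5 * i);
	}
	#endif

	int main(void)
	{
		struct fast_pool f = { { 1, 2, 3, 4 } };

		fast_mix(&f, 0xdeadbeef);	/* e.g. a timestamp */
		printf("%08x %08x %08x %08x\n",
		       (unsigned) f.pool[0], (unsigned) f.pool[1],
		       (unsigned) f.pool[2], (unsigned) f.pool[3]);
		return 0;
	}

The point is just that the choice is made once at build time, so the
interrupt path pays for exactly one function and never for any load
measurement.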
I am very strongly of the opinion that an embedded system with that
kind of insane interrupt rate is the 0.00000000001% case.  So IMHO
it's not even worth having a dynamic switching system, especially
when it's only going to improve things slightly.

						- Ted

P.S.  The real reason for the original fast_mix() function is that it
has a separate pool for each CPU, so there's no spinlock contention
and no cache line bouncing.  And that's why the fast_mix pool is so
small --- so that the entire struct fast_pool can fit in a single CPU
cache line.  So on normal, common systems --- even mobile handsets
have multiple cores these days --- fast_mix() *is* actually much
faster than the standard entropy pool, even before any optimizations.
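To make that cache-line point concrete, here is a sketch of the
sizing constraint (the exact fields and the 64-byte line size are
assumptions for illustration, not the real random.c declarations):

	/*
	 * Sketch of the sizing argument only.  64 bytes is an
	 * assumption (the common x86/ARM L1 line size), and the
	 * field mix is invented; the point is just that the whole
	 * per-CPU pool fits in one cache line, so interrupt-time
	 * mixing touches a single line and takes no spinlock.
	 */
	#include <stdint.h>
	#include <assert.h>

	#define CACHE_LINE_BYTES 64	/* assumed typical line size */

	struct fast_pool {
		uint32_t pool[4];	/* 16 bytes of entropy state */
		unsigned long last;	/* last spill to the input pool */
		unsigned short count;	/* events mixed since that spill */
	};

	static_assert(sizeof(struct fast_pool) <= CACHE_LINE_BYTES,
		      "per-CPU fast_pool must fit in one cache line");

	int main(void) { return 0; }

In the kernel proper, each CPU would get its own instance (e.g. via
DEFINE_PER_CPU), which is what eliminates both the locking and the
cache-line bouncing: a core only ever touches its own pool's line.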