Message-Id: <202003281643.02SGhAEl008453@sdf.org>
From: George Spelvin <lkml@sdf.org>
Date: Fri, 20 Dec 2019 22:30:32 -0500
Subject: [RFC PATCH v1 10/50] Make prandom_u32_max() exact for constant ranges
To: linux-kernel@vger.kernel.org, lkml@sdf.org
Cc: Stephen Hemminger, Hannes Frederic Sowa

This is a cute hack I'm not sure should go into the kernel.
Comments very much requested!

Getting rid of the last tiny bit of non-uniformity is quite cheap
*if* the range is known at compile time: a compare with an immediate
and a (rarely taken) conditional branch to retry.  So for hack value,
implement this for compile-time constant ranges.
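To make the mechanics concrete, here is a rough user-space sketch of
the constant-range case (illustrative only: rng32() and uniform_below()
are names invented for this example, standing in for prandom_u32() and
the code in the patch below):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for prandom_u32(); any 32-bit generator will do here. */
static uint32_t rng32(void)
{
	return ((uint32_t)rand() << 16) ^ (uint32_t)rand();
}

/* Uniform value in [0, range): reject the few biased products. */
static uint32_t uniform_below(uint32_t range)
{
	uint32_t lim = -range % range;	/* 2^32 % range */
	uint64_t prod;

	do
		prod = (uint64_t)rng32() * range;
	while ((uint32_t)prod < lim);

	return prod >> 32;
}

int main(void)
{
	unsigned long counts[6] = { 0 };
	int i;

	for (i = 0; i < 6000000; i++)
		counts[uniform_below(6)]++;

	for (i = 0; i < 6; i++)
		printf("%d: %lu\n", i, counts[i]);
	return 0;
}

Each bucket should come out at about a million; the do/while retries
only when the low half of the product falls below lim.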
For variable ranges, computing the rejection limit involves a division,
so just live with the slight non-uniformity.

The style questions are:
- Is a more accurate result *some* of the time actually worth anything?
- Is the benefit enough to justify the ugly inconsistency?

Signed-off-by: George Spelvin <lkml@sdf.org>
Cc: Stephen Hemminger
Cc: Hannes Frederic Sowa
---
 include/linux/random.h | 63 ++++++++++++++++++++++++++++++------------
 1 file changed, 46 insertions(+), 17 deletions(-)

diff --git a/include/linux/random.h b/include/linux/random.h
index 109772175c833..82dd0d613e75c 100644
--- a/include/linux/random.h
+++ b/include/linux/random.h
@@ -9,6 +9,7 @@
 
 #include 
 #include 
+#include <linux/math64.h>
 
 #include 
 
@@ -147,9 +148,11 @@ void prandom_seed_full_state(struct rnd_state __percpu *pcpu_state);
  * a property that is provided by prandom_u32().)
  *
  * It is possible to extend this to avoid all bias and return perfectly
- * uniform pseudorandom numbers by discarding and regenerating sometimes,
- * but if your needs are that stringent, you might want to use a stronger
- * generator (like get_random_u32()).
+ * uniform pseudorandom numbers by discarding and regenerating sometimes.
+ * That's easy to do for constant ranges, so this code does it in that
+ * case, but settles for the approximation for variable ranges.  If you
+ * need exact uniformity, you might also want to use a stronger generator
+ * (like get_random_u32()).
  *
  * Ref:
  *	Fast Random Integer Generation in an Interval
@@ -160,21 +163,47 @@ void prandom_seed_full_state(struct rnd_state __percpu *pcpu_state);
  */
 static inline u32 prandom_u32_max(u32 range)
 {
-	/*
-	 * If the range is a compile-time constant power of 2, then use
-	 * a simple shift.  This is mathematically equivalent to the
-	 * multiplication, but GCC 8.3 doesn't optimize that perfectly.
-	 *
-	 * We could do an AND with a mask, but
-	 * 1) The shift is the same speed on a decent CPU,
-	 * 2) It's generally smaller code (smaller immediate), and
-	 * 3) Many PRNGs have trouble with their low-order bits;
-	 *    using the msbits is generaly preferred.
-	 */
-	if (__builtin_constant_p(range) && (range & (range - 1)) == 0)
-		return prandom_u32() / (u32)(0x100000000 / range);
-	else
+	if (!__builtin_constant_p(range)) {
+		/*
+		 * If range is a variable, prioritize speed over
+		 * perfect uniformity.
+		 */
 		return reciprocal_scale(prandom_u32(), range);
+	} else if (range & (range - 1)) {
+		/*
+		 * If the range is a compile-time constant, then achieving
+		 * perfect uniformity is one compare with immediate and
+		 * one unlikely branch, so go ahead and do it.
+		 *
+		 * Some 32-bit processors require an additional
+		 * instruction to get the low half of a 64-bit product.
+		 * PowerPC and NIOS have separate "multiply high" and
+		 * "multiply low" instructions.  MIPS32 and ARC need to
+		 * move the result from a special-purpose register to
+		 * a GPR.
+		 */
+		u32 const lim = -range % range;
+		u64 prod;
+
+		do
+			prod = mul_u32_u32(prandom_u32(), range);
+		while (unlikely((u32)prod < lim));
+		return prod >> 32;
+	} else {
+		/*
+		 * If the range is a compile-time constant power of 2,
+		 * then use a simple shift.  This is mathematically
+		 * equivalent to the multiplication, but GCC 8.3 doesn't
+		 * optimize that perfectly.
+		 *
+		 * We could do an AND with a mask, but
+		 * 1) The shift is the same speed on a decent CPU,
+		 * 2) It's generally smaller code (smaller immediate), and
+		 * 3) Many PRNGs have trouble with their low-order bits;
+		 *    using the msbits is generally preferred.
+		 */
+		return prandom_u32() / (u32)(0x100000000 / range);
+	}
 }
 
 /*
-- 
2.26.0
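P.S. A note for reviewers on how often the retry fires.  This little
user-space check (again illustrative, not part of the patch) prints the
rejection threshold and retry probability for a few sample ranges:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/* Sample ranges: typical small ones plus a near-worst case. */
	static const uint32_t ranges[] = { 3, 6, 1000, 0x80000001 };
	unsigned int i;

	for (i = 0; i < sizeof(ranges) / sizeof(ranges[0]); i++) {
		uint32_t range = ranges[i];
		uint32_t lim = -range % range;	/* 2^32 % range */

		printf("range %10" PRIu32 ": lim %10" PRIu32
		       ", retry prob %.3g\n",
		       range, lim, (double)lim / 4294967296.0);
	}
	return 0;
}

The retry probability is lim / 2^32: negligible for the small constant
ranges this path is aimed at (4/2^32 for range 6), and below 1/2 even
for the worst-case range just above 2^31.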