Date: Mon, 25 Mar 2019 00:28:13 GMT
From: George Spelvin
Message-Id: <201903250028.x2P0SDH3026861@sdf.org>
To: Jason@zx2c4.com, lkml@sdf.org
Cc: linux-kernel@vger.kernel.org, tytso@mit.edu
Subject: Re: [RFC PATCH] random: add get_random_max() function
References: <201903241244.x2OCiL8P011277@sdf.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, 24 Mar 2019 at 21:47:50 +0100, Jason A.
Donenfeld wrote:
> I generally use a slightly simpler algorithm in various different projects:
>
> //[0, bound)
> static unsigned long random_bounded(unsigned long bound)
> {
> 	unsigned long ret;
> 	const unsigned long max_mod_bound = (1 + ~bound) % bound;
>
> 	if (bound < 2)
> 		return 0;
> 	do
> 		ret = random_integer();
> 	while (ret < max_mod_bound);
> 	return ret % bound;
> }
>
> Is the motivation behind using Lemire that you avoid the division (via
> the modulo) in favor of a multiplication?

Yes.

If we define eps = max_mod_bound * ldexp(1.0, -BITS_PER_LONG) as the
probability of one retry, and retries = eps / (1 - eps) as the expected
number of retries, then both algorithms take 1+retries random_integer()s.

The above algorithm takes 2 divisions, always.  Divides are slow, and
usually not pipelined, so two in short succession incur a latency penalty.

Lemire's multiplicative algorithm takes 1 multiplication on the fast path
(probability 1 - 2*eps on average), 1 additional division on the slow path
(probability 2*eps), and 1 multiplication per retry.

In the common case, when bound is much less than ULONG_MAX, eps is tiny,
the fast path is taken almost all the time, and it's a huge win.

Even in the absolute worst case of bound = ULONG_MAX/2 + 2, when
eps ~ 0.5 (2 multiplies, 0.5 divides; there's no 2*eps penalty in this
case), it's faster as long as 2 multiplies cost less than 1.5 divides.

If you want simpler code, we could omit the fast path and still get a
speedup.  But a predictable branch for a divide seemed like a worthwhile
trade.

(FYI, this all came about as a side project of a kernel-janitor project
to replace "prandom_u32() % range" by "prandom_u32() * range >> 32".
I'm also annoyed that get_random_u32() and get_random_u64() have
separate buffers, even if EFFICIENT_UNALIGNED_ACCESS, but that's a
separate complaint.)
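For reference, here is a self-contained sketch of the multiplicative
technique described above, assuming a 64-bit unsigned long and a 128-bit
intermediate product.  The random_max_sketch() name is mine, and the
splitmix64 generator is only a stand-in so the example runs on its own;
in the kernel, random_integer() would be the entropy source instead.

```c
#include <stdint.h>

/*
 * Stand-in generator (splitmix64), NOT part of the proposed patch;
 * it just makes the sketch self-contained and deterministic.
 */
static uint64_t rng_state = 0x9e3779b97f4a7c15ULL;

static uint64_t random_integer(void)
{
	uint64_t z = (rng_state += 0x9e3779b97f4a7c15ULL);
	z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
	z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
	return z ^ (z >> 31);
}

/*
 * Lemire-style nearly-divisionless rejection for bound >= 1: multiply a
 * 64-bit random word by bound, take the high 64 bits of the product as
 * the result, and reject only when the low 64 bits fall below
 * 2^64 mod bound.  The "-bound % bound" division runs only on the slow
 * path, which is reached with probability bound/2^64.
 */
static uint64_t random_max_sketch(uint64_t bound)
{
	__uint128_t m = (__uint128_t)random_integer() * bound;
	uint64_t low = (uint64_t)m;

	if (low < bound) {			/* slow path: possible bias */
		uint64_t threshold = -bound % bound;	/* 2^64 mod bound */
		while (low < threshold) {	/* expected eps retries */
			m = (__uint128_t)random_integer() * bound;
			low = (uint64_t)m;
		}
	}
	return (uint64_t)(m >> 64);
}
```

Note the fast-path check "low < bound": since 2^64 mod bound is always
less than bound, any low half >= bound can be accepted without computing
the threshold at all, which is what keeps the division off the common path.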