Received: by 2002:ad5:4acb:0:0:0:0:0 with SMTP id n11csp2779495imw; Wed, 6 Jul 2022 11:36:02 -0700 (PDT) X-Google-Smtp-Source: AGRyM1ui2fNCpgnrunj1CSvlVqxf1p3K408TSQTSqmT8G85up/BPQVhQ1eLqLQL1/1r6LCOvKylE X-Received: by 2002:a17:907:9605:b0:6f5:c66:7c13 with SMTP id gb5-20020a170907960500b006f50c667c13mr41739998ejc.66.1657132562100; Wed, 06 Jul 2022 11:36:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1657132562; cv=none; d=google.com; s=arc-20160816; b=bo1HF+I0EH3fKW22yINWqbRT8aQ5kvFxmSriJkZT22aq2bbugfjSGOEJSwUMmRXIX9 HFcNCDxYVIHZPxLyFD6ZMQi7XjWwKd4ZLFclQVmZ74naD3Bs89/wJbMm1KJ+7utkK3W6 IS2GhRvSRsZAAem+EgQJYvpB3V+rMdg7ZJQDTsQeukMX07NQJx8PTpynnpWZt8DLK2YH RvBcPWgGn7xrJblHkmBLUaqzBCRGToge36YkPkSR83UCiXMPVG2KmcaO+BZNCXswOgCz XbEwGtrxAPxI4NHdG93rZpvPR6uFpsjwT11o9pSkasO+Bel6JhgEglq5ehBv7kKWPzWA 3gxw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:from :references:cc:to:content-language:subject:user-agent:mime-version :date:message-id:dkim-signature; bh=5q6jbx68SS0unOLPjlUpw6Z3TUhj8RD1hsIdZAHbL2k=; b=XOtWlMfqjB7zrkaRMIOd526WwckRnU+qLNCi3rZbU9DsjLtJyMSKCi7SUaT/Y3L0Jf YynJBzpsmStyy0ROHgcilFMlF0+EpCUDhKt93AT1FLSCOCXPiHgGTXOyUr5qcK744lJZ T9R6TjrFS3xVGma44scWZsDtCxs8H6iSbceInKVIhC1VCieoGwYwG5A5X/ww0RorPV0O RHSmIrS4o0Cvi4YcUh6Qu3j0DVn2U2WjIm66YE6iG5z2BETx2AnqPB95sqlP05WzSQvx vtlUozT9pAsKcAw+kTGuNIcLKIefvej52VFqWO/5VvebKdt9sVHXe02955gVPvc73kWT 22FQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b=UZp3tHdP; spf=pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=ibm.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id g8-20020a1709065d0800b006feef404001si20688289ejt.801.2022.07.06.11.35.19; Wed, 06 Jul 2022 11:36:02 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@ibm.com header.s=pp1 header.b=UZp3tHdP; spf=pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=ibm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234510AbiGFS36 (ORCPT + 99 others); Wed, 6 Jul 2022 14:29:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43764 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230443AbiGFS35 (ORCPT ); Wed, 6 Jul 2022 14:29:57 -0400 Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 618CBE6; Wed, 6 Jul 2022 11:29:56 -0700 (PDT) Received: from pps.filterd (m0098417.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 266I2pNt026835; Wed, 6 Jul 2022 18:29:55 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=message-id : date : mime-version : subject : to : cc : references : from : in-reply-to : content-type : content-transfer-encoding; s=pp1; bh=5q6jbx68SS0unOLPjlUpw6Z3TUhj8RD1hsIdZAHbL2k=; b=UZp3tHdPL2h0j1lqkou+6iNTQJVuWChyAI3hfcnelNF6GABBL0zZmVQBI22y0bK1ZVa3 jga0TN2U2KVYtH0FLbGlhgKBAdRfs0Uv1476wem0n1Vct3HTi8/7OgJXl2oSVvwxvmLT OJ44LxidBvfxFidVigekJ34VsleyiwuikmX/n8VyEmaQCjE/8wmbeeg6ESJFLiel4GmP cDhJ1Zb2QhcGYJOeY3de5qdFottXDtk1sr0ZDn24+w9IzkgJ4FSmHUuKC6OMVc8F8NP+ iOBHF95QvbpY2D+12A75Kq+ck6C/58dWnTEJGz6xuoA6ox/OlhAZkoANmUVFeSAqsot5 cw== Received: from ppma04fra.de.ibm.com (6a.4a.5195.ip4.static.sl-reverse.com [149.81.74.106]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3h5et81fn5-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 06 Jul 2022 18:29:55 +0000 Received: from pps.filterd (ppma04fra.de.ibm.com [127.0.0.1]) by ppma04fra.de.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 266ILJkc015356; Wed, 6 Jul 2022 18:29:53 GMT Received: from b06cxnps3074.portsmouth.uk.ibm.com (d06relay09.portsmouth.uk.ibm.com [9.149.109.194]) by ppma04fra.de.ibm.com with ESMTP id 3h4v6592hb-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 06 Jul 2022 18:29:53 +0000 Received: from d06av23.portsmouth.uk.ibm.com (d06av23.portsmouth.uk.ibm.com [9.149.105.59]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 266ITn2825035128 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 6 Jul 2022 18:29:49 GMT Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id CD598A404D; Wed, 6 Jul 2022 18:29:49 +0000 (GMT) Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 6D8C9A4053; Wed, 6 Jul 2022 18:29:49 +0000 (GMT) Received: from [9.171.61.129] (unknown [9.171.61.129]) by d06av23.portsmouth.uk.ibm.com (Postfix) with ESMTP; Wed, 6 Jul 2022 18:29:49 +0000 (GMT) Message-ID: <386ec16c-2561-2fcf-2eea-deaab45f349c@linux.ibm.com> Date: Wed, 6 Jul 2022 20:29:49 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.10.0 Subject: Re: [PATCH v1 1/1] s390/arch_random: Buffer true random data Content-Language: en-US To: "Jason A. Donenfeld" , Harald Freudenberger Cc: Holger Dengler , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Juergen Christ , linux-crypto@vger.kernel.org, linux-s390@vger.kernel.org References: <20220705112712.4433-1-dengler@linux.ibm.com> <20220705112712.4433-2-dengler@linux.ibm.com> <9a0561c0-68f7-b630-4440-3ca32bf28dc2@linux.ibm.com> <7e65130c6e66ce7a9f9eb469eb7e64e0@linux.ibm.com> From: Christian Borntraeger In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: gxHJEsEoH6RK9Iq8mDNXIpP5NADBNeKX X-Proofpoint-GUID: gxHJEsEoH6RK9Iq8mDNXIpP5NADBNeKX X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.883,Hydra:6.0.517,FMLib:17.11.122.1 definitions=2022-07-06_09,2022-06-28_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 malwarescore=0 mlxlogscore=951 clxscore=1011 impostorscore=0 bulkscore=0 suspectscore=0 priorityscore=1501 adultscore=0 phishscore=0 spamscore=0 lowpriorityscore=0 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2206140000 definitions=main-2207060071 X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,NICE_REPLY_A,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org Am 06.07.22 um 18:26 schrieb Jason A. Donenfeld: > Hi Harald, > > On Wed, Jul 06, 2022 at 06:18:27PM +0200, Harald Freudenberger wrote: >> On 2022-07-05 18:27, Holger Dengler wrote: >>> Hi Jason, >>> >>> On 05/07/2022 17:11, Jason A. Donenfeld wrote: >>>> Hi Holger, >>>> >>>> On Tue, Jul 05, 2022 at 04:58:30PM +0200, Holger Dengler wrote: >>>>> It is true, that the performance of the instruction is not really >>>>> relevant, but only for calls outside of an interrupt context. I did >>>>> some ftrace logging for the s390_random_get_seed_long() calls, and - >>>>> as you said - there are a few calls per minute. But there was also >>>>> some repeating calls in interrupt context. On systems with a huge >>>>> interrupt load, this can cause severe performance impacts. I've no >>>> >>>> It'd be interesting to know more about this. The way you get >>>> arch_random_get_seed_long() from irq context is: >>>> >>>> get_random_{bytes,int,long,u32,u64}() >>>> crng_make_state() >>>> crng_reseed() <-- Rarely >>>> extract_entropy() >>>> arch_get_random_seed_long() >>>> >>>> So if an irq user of get_random_xx() is the unlucky one in the minute >>>> span who has to call crng_reseed() then, yea, that'll happen. But I >>>> wonder about this luck aspect. What scenarios are you seeing where >>>> this >>>> happens all the time? Which driver is using random bytes *so* commonly >>>> from irq context? Not that, per say, there's anything wrong with that, >>>> but it could be eyebrow raising, and might point to de facto solutions >>>> that mostly take care of this. >>> >>> I saw a few calls in interrupt context during my tracing, but I didn't >>> look to see which ones they were. Let me figure that out in the next >>> few days and provide more information on that. >>> >>>> One such direction might be making a driver that does such a thing do >>>> it >>>> a little bit less, somehow. Another direction would be preferring >>>> non-irqs to handle crng_reseed(), but not disallowing irqs entirely, >>>> with a patch something like the one below. Or maybe there are other >>>> ideas. >>> >>> Reduce the number of trng in interrupt context is a possibility, but - >>> in my opinion - only one single trng instruction call in interrupt >>> context in one too much. >>> >>> For the moment, I would propose to drop the buffering but also return >>> false, if arch_random_get_seed_long() is called in interrupt context. >>> >>> diff --git a/arch/s390/include/asm/archrandom.h >>> b/arch/s390/include/asm/archrandom.h >>> index 2c6e1c6ecbe7..711357bdc464 100644 >>> --- a/arch/s390/include/asm/archrandom.h >>> +++ b/arch/s390/include/asm/archrandom.h >>> @@ -32,7 +32,8 @@ static inline bool __must_check >>> arch_get_random_int(unsigned int *v) >>> >>> static inline bool __must_check arch_get_random_seed_long(unsigned >>> long *v) >>> { >>> - if (static_branch_likely(&s390_arch_random_available)) { >>> + if (static_branch_likely(&s390_arch_random_available) && >>> + !in_interrupt()) { >>> cpacf_trng(NULL, 0, (u8 *)v, sizeof(*v)); >>> atomic64_add(sizeof(*v), &s390_arch_random_counter); >>> return true; >>> >>> (on-top of your commit, without our buffering patch) >>> >>>> >>>> But all this is to say that having some more of the "mundane" details >>>> about this might actually help us. >>>> >>>> Jason >>>> >>>> diff --git a/drivers/char/random.c b/drivers/char/random.c >>>> index e3dd1dd3dd22..81df8cdf2a62 100644 >>>> --- a/drivers/char/random.c >>>> +++ b/drivers/char/random.c >>>> @@ -270,6 +270,9 @@ static bool crng_has_old_seed(void) >>>> static bool early_boot = true; >>>> unsigned long interval = CRNG_RESEED_INTERVAL; >>>> >>>> + if (in_hardirq()) >>>> + interval += HZ * 10; >>>> + >>>> if (unlikely(READ_ONCE(early_boot))) { >>>> time64_t uptime = ktime_get_seconds(); >>>> if (uptime >= CRNG_RESEED_INTERVAL / HZ * 2) >>>> >> >> Hi Holger and Jason >> I tried to find out what is the reason of the invocations in interrupt >> context. >> First I have to admit that there is in fact not much of >> arch_get_random_seed_long() >> invocation any more in the recent kernel (5.19-rc5). I see about 100 >> invocations >> within 10 minutes with an LPAR running some qperf and dd dumps on dasds >> test load. >> About half of these invocations is in interrupt context. I >> dump_stack()ed some of >> these and I always catch the function >> kfence_guarded_alloc() >> prandom_u32_max() >> prandom_u32() >> get_random_u32() >> _get_random_bytes() >> crng_make_state() >> crng_reseed() >> extract_entropy() >> arch_get_random_seed_long() >> >> However, with so few invocations it should not make any harm when there >> is a >> even very expensive trng() invocation in interrupt context. >> >> But I think we should check, if this is really something to backport to >> the older >> kernels where arch_get_random_seed_long() is called really frequency. > > I backported the current random.c design to old kernels, so the > situation there should be the same as in 5.19-rc5. > > So if you feel such rare usage is find even in_hardirq(), then I suppose > there's nothing more to do here? I think up to 190µs in interrupt can result in unwanted latencies. Yes it does not happen that often and it is smaller than most timeslices of hypervisors. So I would likely turn that question around what happens if we return false if in_hardirq is true? Any noticeable delays in code that uses random numbers?