From: Daniel Wagner
To: linux-kernel@vger.kernel.org
Cc: linux-rt-users, Steven Rostedt, Thomas Gleixner, Carsten Emde, John Kacur, Paul Gortmaker, Julia Cartwright, Daniel Wagner, tom.zanussi@linux.intel.com, Sebastian Andrzej Siewior, stable-rt@vger.kernel.org
Subject: [PATCH RT 5/7] crypto: limit more FPU-enabled sections
Date: Wed, 4 Apr 2018 09:16:50 +0200
Message-Id: <20180404071652.24196-6-wagi@monom.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180404071652.24196-1-wagi@monom.org>
References:
 <20180404071652.24196-1-wagi@monom.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Sebastian Andrzej Siewior

These crypto drivers use SSE/AVX/… for their crypto work. To do so in
the kernel they need to enable the "FPU" in kernel mode, which disables
preemption.
There are two problems with the way they are used:
- the while loop which processes X bytes may create latency spikes and
  should be avoided or limited.
- the cipher-walk-next part may allocate/free memory and may use
  kmap_atomic().

The whole kernel_fpu_begin()/end() processing probably isn't that cheap.
It most likely makes sense to process as much of those as possible in one
go. The new *_fpu_sched_rt() schedules only if an RT task is pending.

Probably we should measure the performance of those ciphers in pure SW
mode and with these optimisations to see if it makes sense to keep them
for RT.

This kernel_fpu_resched() makes the code more preemptible, which might hurt
performance.

Cc: stable-rt@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior
---
 arch/x86/crypto/camellia_aesni_avx2_glue.c | 20 ++++++++++++++++++++
 arch/x86/crypto/camellia_aesni_avx_glue.c  | 19 +++++++++++++++++++
 arch/x86/crypto/cast6_avx_glue.c           | 24 +++++++++++++++++++-----
 arch/x86/crypto/chacha20_glue.c            |  8 +++++---
 arch/x86/crypto/serpent_avx2_glue.c        | 19 +++++++++++++++++++
 arch/x86/crypto/serpent_avx_glue.c         | 23 +++++++++++++++++++----
 arch/x86/crypto/serpent_sse2_glue.c        | 23 +++++++++++++++++++----
 arch/x86/crypto/twofish_avx_glue.c         | 27 +++++++++++++++++++++++++--
 arch/x86/include/asm/fpu/api.h             |  1 +
 arch/x86/kernel/fpu/core.c                 | 12 ++++++++++++
 10 files changed, 158 insertions(+), 18 deletions(-)

diff --git a/arch/x86/crypto/camellia_aesni_avx2_glue.c b/arch/x86/crypto/camellia_aesni_avx2_glue.c
index d84456924563..c54536d9932c 100644
--- a/arch/x86/crypto/camellia_aesni_avx2_glue.c
+++ b/arch/x86/crypto/camellia_aesni_avx2_glue.c
@@ -206,6 +206,20 @@ struct crypt_priv {
 	bool fpu_enabled;
 };
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+static void camellia_fpu_end_rt(struct crypt_priv *ctx)
+{
+	bool fpu_enabled = ctx->fpu_enabled;
+
+	if (!fpu_enabled)
+		return;
+	camellia_fpu_end(fpu_enabled);
+	ctx->fpu_enabled = false;
+}
+#else
+static void camellia_fpu_end_rt(struct crypt_priv *ctx) { }
+#endif
+
 static void encrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 {
 	const unsigned int bsize = CAMELLIA_BLOCK_SIZE;
@@ -221,16 +235,19 @@ static void encrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 	}
 
 	if (nbytes >= CAMELLIA_AESNI_PARALLEL_BLOCKS * bsize) {
+		kernel_fpu_resched();
 		camellia_ecb_enc_16way(ctx->ctx, srcdst, srcdst);
 		srcdst += bsize * CAMELLIA_AESNI_PARALLEL_BLOCKS;
 		nbytes -= bsize * CAMELLIA_AESNI_PARALLEL_BLOCKS;
 	}
 
 	while (nbytes >= CAMELLIA_PARALLEL_BLOCKS * bsize) {
+		kernel_fpu_resched();
 		camellia_enc_blk_2way(ctx->ctx, srcdst, srcdst);
 		srcdst += bsize * CAMELLIA_PARALLEL_BLOCKS;
 		nbytes -= bsize * CAMELLIA_PARALLEL_BLOCKS;
 	}
+	camellia_fpu_end_rt(ctx);
 
 	for (i = 0; i < nbytes / bsize; i++, srcdst += bsize)
 		camellia_enc_blk(ctx->ctx, srcdst, srcdst);
@@ -251,16 +268,19 @@ static void decrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 	}
 
 	if (nbytes >= CAMELLIA_AESNI_PARALLEL_BLOCKS * bsize) {
+		kernel_fpu_resched();
 		camellia_ecb_dec_16way(ctx->ctx, srcdst, srcdst);
 		srcdst += bsize * CAMELLIA_AESNI_PARALLEL_BLOCKS;
 		nbytes -= bsize * CAMELLIA_AESNI_PARALLEL_BLOCKS;
 	}
 
 	while (nbytes >= CAMELLIA_PARALLEL_BLOCKS * bsize) {
+		kernel_fpu_resched();
 		camellia_dec_blk_2way(ctx->ctx, srcdst, srcdst);
 		srcdst += bsize * CAMELLIA_PARALLEL_BLOCKS;
 		nbytes -= bsize * CAMELLIA_PARALLEL_BLOCKS;
 	}
+	camellia_fpu_end_rt(ctx);
 
 	for (i = 0; i < nbytes / bsize; i++, srcdst += bsize)
 		camellia_dec_blk(ctx->ctx, srcdst, srcdst);
diff --git a/arch/x86/crypto/camellia_aesni_avx_glue.c b/arch/x86/crypto/camellia_aesni_avx_glue.c
index 93d8f295784e..a1666a58eee5 100644
--- a/arch/x86/crypto/camellia_aesni_avx_glue.c
+++ b/arch/x86/crypto/camellia_aesni_avx_glue.c
@@ -210,6 +210,21 @@ struct crypt_priv {
 	bool fpu_enabled;
 };
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+static void camellia_fpu_end_rt(struct crypt_priv *ctx)
+{
+	bool fpu_enabled = ctx->fpu_enabled;
+
+	if (!fpu_enabled)
+		return;
+	camellia_fpu_end(fpu_enabled);
+	ctx->fpu_enabled = false;
+}
+
+#else
+static void camellia_fpu_end_rt(struct crypt_priv *ctx) { }
+#endif
+
 static void encrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 {
 	const unsigned int bsize = CAMELLIA_BLOCK_SIZE;
@@ -225,10 +240,12 @@ static void encrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 	}
 
 	while (nbytes >= CAMELLIA_PARALLEL_BLOCKS * bsize) {
+		kernel_fpu_resched();
 		camellia_enc_blk_2way(ctx->ctx, srcdst, srcdst);
 		srcdst += bsize * CAMELLIA_PARALLEL_BLOCKS;
 		nbytes -= bsize * CAMELLIA_PARALLEL_BLOCKS;
 	}
+	camellia_fpu_end_rt(ctx);
 
 	for (i = 0; i < nbytes / bsize; i++, srcdst += bsize)
 		camellia_enc_blk(ctx->ctx, srcdst, srcdst);
@@ -249,10 +266,12 @@ static void decrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 	}
 
 	while (nbytes >= CAMELLIA_PARALLEL_BLOCKS * bsize) {
+		kernel_fpu_resched();
 		camellia_dec_blk_2way(ctx->ctx, srcdst, srcdst);
 		srcdst += bsize * CAMELLIA_PARALLEL_BLOCKS;
 		nbytes -= bsize * CAMELLIA_PARALLEL_BLOCKS;
 	}
+	camellia_fpu_end_rt(ctx);
 
 	for (i = 0; i < nbytes / bsize; i++, srcdst += bsize)
 		camellia_dec_blk(ctx->ctx, srcdst, srcdst);
diff --git a/arch/x86/crypto/cast6_avx_glue.c b/arch/x86/crypto/cast6_avx_glue.c
index fca459578c35..9eb469b9836a 100644
--- a/arch/x86/crypto/cast6_avx_glue.c
+++ b/arch/x86/crypto/cast6_avx_glue.c
@@ -205,19 +205,33 @@ struct crypt_priv {
 	bool fpu_enabled;
 };
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+static void cast6_fpu_end_rt(struct crypt_priv *ctx)
+{
+	bool fpu_enabled = ctx->fpu_enabled;
+
+	if (!fpu_enabled)
+		return;
+	cast6_fpu_end(fpu_enabled);
+	ctx->fpu_enabled = false;
+}
+
+#else
+static void cast6_fpu_end_rt(struct crypt_priv *ctx) { }
+#endif
+
 static void encrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 {
 	const unsigned int bsize = CAST6_BLOCK_SIZE;
 	struct crypt_priv *ctx = priv;
 	int i;
 
-	ctx->fpu_enabled = cast6_fpu_begin(ctx->fpu_enabled, nbytes);
-
 	if (nbytes == bsize * CAST6_PARALLEL_BLOCKS) {
+		ctx->fpu_enabled = cast6_fpu_begin(ctx->fpu_enabled, nbytes);
 		cast6_ecb_enc_8way(ctx->ctx, srcdst, srcdst);
+		cast6_fpu_end_rt(ctx);
 		return;
 	}
-
 	for (i = 0; i < nbytes / bsize; i++, srcdst += bsize)
 		__cast6_encrypt(ctx->ctx, srcdst, srcdst);
 }
@@ -228,10 +242,10 @@ static void decrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 	struct crypt_priv *ctx = priv;
 	int i;
 
-	ctx->fpu_enabled = cast6_fpu_begin(ctx->fpu_enabled, nbytes);
-
 	if (nbytes == bsize * CAST6_PARALLEL_BLOCKS) {
+		ctx->fpu_enabled = cast6_fpu_begin(ctx->fpu_enabled, nbytes);
 		cast6_ecb_dec_8way(ctx->ctx, srcdst, srcdst);
+		cast6_fpu_end_rt(ctx);
 		return;
 	}
 
diff --git a/arch/x86/crypto/chacha20_glue.c b/arch/x86/crypto/chacha20_glue.c
index 722bacea040e..0e3eb53a87cd 100644
--- a/arch/x86/crypto/chacha20_glue.c
+++ b/arch/x86/crypto/chacha20_glue.c
@@ -80,22 +80,24 @@ static int chacha20_simd(struct blkcipher_desc *desc, struct scatterlist *dst,
 
 	crypto_chacha20_init(state, crypto_blkcipher_ctx(desc->tfm), walk.iv);
 
-	kernel_fpu_begin();
-
 	while (walk.nbytes >= CHACHA20_BLOCK_SIZE) {
+		kernel_fpu_begin();
+
 		chacha20_dosimd(state, walk.dst.virt.addr, walk.src.virt.addr,
 				rounddown(walk.nbytes, CHACHA20_BLOCK_SIZE));
+		kernel_fpu_end();
 		err = blkcipher_walk_done(desc, &walk,
 					  walk.nbytes % CHACHA20_BLOCK_SIZE);
 	}
 
 	if (walk.nbytes) {
+		kernel_fpu_begin();
 		chacha20_dosimd(state, walk.dst.virt.addr, walk.src.virt.addr,
 				walk.nbytes);
+		kernel_fpu_end();
 		err = blkcipher_walk_done(desc, &walk, 0);
 	}
 
-	kernel_fpu_end();
 
 	return err;
 }
diff --git a/arch/x86/crypto/serpent_avx2_glue.c b/arch/x86/crypto/serpent_avx2_glue.c
index 6d198342e2de..0954bae9a995 100644
--- a/arch/x86/crypto/serpent_avx2_glue.c
+++ b/arch/x86/crypto/serpent_avx2_glue.c
@@ -184,6 +184,21 @@ struct crypt_priv {
 	bool fpu_enabled;
 };
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+static void serpent_fpu_end_rt(struct crypt_priv *ctx)
+{
+	bool fpu_enabled = ctx->fpu_enabled;
+
+	if (!fpu_enabled)
+		return;
+	serpent_fpu_end(fpu_enabled);
+	ctx->fpu_enabled = false;
+}
+
+#else
+static void serpent_fpu_end_rt(struct crypt_priv *ctx) { }
+#endif
+
 static void encrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 {
 	const unsigned int bsize = SERPENT_BLOCK_SIZE;
@@ -199,10 +214,12 @@ static void encrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 	}
 
 	while (nbytes >= SERPENT_PARALLEL_BLOCKS * bsize) {
+		kernel_fpu_resched();
 		serpent_ecb_enc_8way_avx(ctx->ctx, srcdst, srcdst);
 		srcdst += bsize * SERPENT_PARALLEL_BLOCKS;
 		nbytes -= bsize * SERPENT_PARALLEL_BLOCKS;
 	}
+	serpent_fpu_end_rt(ctx);
 
 	for (i = 0; i < nbytes / bsize; i++, srcdst += bsize)
 		__serpent_encrypt(ctx->ctx, srcdst, srcdst);
@@ -223,10 +240,12 @@ static void decrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 	}
 
 	while (nbytes >= SERPENT_PARALLEL_BLOCKS * bsize) {
+		kernel_fpu_resched();
 		serpent_ecb_dec_8way_avx(ctx->ctx, srcdst, srcdst);
 		srcdst += bsize * SERPENT_PARALLEL_BLOCKS;
 		nbytes -= bsize * SERPENT_PARALLEL_BLOCKS;
 	}
+	serpent_fpu_end_rt(ctx);
 
 	for (i = 0; i < nbytes / bsize; i++, srcdst += bsize)
 		__serpent_decrypt(ctx->ctx, srcdst, srcdst);
diff --git a/arch/x86/crypto/serpent_avx_glue.c b/arch/x86/crypto/serpent_avx_glue.c
index 5dc37026c7ce..8c812a05a084 100644
--- a/arch/x86/crypto/serpent_avx_glue.c
+++ b/arch/x86/crypto/serpent_avx_glue.c
@@ -218,16 +218,31 @@ struct crypt_priv {
 	bool fpu_enabled;
 };
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+static void serpent_fpu_end_rt(struct crypt_priv *ctx)
+{
+	bool fpu_enabled = ctx->fpu_enabled;
+
+	if (!fpu_enabled)
+		return;
+	serpent_fpu_end(fpu_enabled);
+	ctx->fpu_enabled = false;
+}
+
+#else
+static void serpent_fpu_end_rt(struct crypt_priv *ctx) { }
+#endif
+
 static void encrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 {
 	const unsigned int bsize = SERPENT_BLOCK_SIZE;
 	struct crypt_priv *ctx = priv;
 	int i;
 
-	ctx->fpu_enabled = serpent_fpu_begin(ctx->fpu_enabled, nbytes);
-
 	if (nbytes == bsize * SERPENT_PARALLEL_BLOCKS) {
+		ctx->fpu_enabled = serpent_fpu_begin(ctx->fpu_enabled, nbytes);
 		serpent_ecb_enc_8way_avx(ctx->ctx, srcdst, srcdst);
+		serpent_fpu_end_rt(ctx);
 		return;
 	}
 
@@ -241,10 +256,10 @@ static void decrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 	struct crypt_priv *ctx = priv;
 	int i;
 
-	ctx->fpu_enabled = serpent_fpu_begin(ctx->fpu_enabled, nbytes);
-
 	if (nbytes == bsize * SERPENT_PARALLEL_BLOCKS) {
+		ctx->fpu_enabled = serpent_fpu_begin(ctx->fpu_enabled, nbytes);
 		serpent_ecb_dec_8way_avx(ctx->ctx, srcdst, srcdst);
+		serpent_fpu_end_rt(ctx);
 		return;
 	}
 
diff --git a/arch/x86/crypto/serpent_sse2_glue.c b/arch/x86/crypto/serpent_sse2_glue.c
index 3643dd508f45..1ef09f8b5cc7 100644
--- a/arch/x86/crypto/serpent_sse2_glue.c
+++ b/arch/x86/crypto/serpent_sse2_glue.c
@@ -187,16 +187,31 @@ struct crypt_priv {
 	bool fpu_enabled;
 };
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+static void serpent_fpu_end_rt(struct crypt_priv *ctx)
+{
+	bool fpu_enabled = ctx->fpu_enabled;
+
+	if (!fpu_enabled)
+		return;
+	serpent_fpu_end(fpu_enabled);
+	ctx->fpu_enabled = false;
+}
+
+#else
+static void serpent_fpu_end_rt(struct crypt_priv *ctx) { }
+#endif
+
 static void encrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 {
 	const unsigned int bsize = SERPENT_BLOCK_SIZE;
 	struct crypt_priv *ctx = priv;
 	int i;
 
-	ctx->fpu_enabled = serpent_fpu_begin(ctx->fpu_enabled, nbytes);
-
 	if (nbytes == bsize * SERPENT_PARALLEL_BLOCKS) {
+		ctx->fpu_enabled = serpent_fpu_begin(ctx->fpu_enabled, nbytes);
 		serpent_enc_blk_xway(ctx->ctx, srcdst, srcdst);
+		serpent_fpu_end_rt(ctx);
 		return;
 	}
 
@@ -210,10 +225,10 @@ static void decrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 	struct crypt_priv *ctx = priv;
 	int i;
 
-	ctx->fpu_enabled = serpent_fpu_begin(ctx->fpu_enabled, nbytes);
-
 	if (nbytes == bsize * SERPENT_PARALLEL_BLOCKS) {
+		ctx->fpu_enabled = serpent_fpu_begin(ctx->fpu_enabled, nbytes);
 		serpent_dec_blk_xway(ctx->ctx, srcdst, srcdst);
+		serpent_fpu_end_rt(ctx);
 		return;
 	}
 
diff --git a/arch/x86/crypto/twofish_avx_glue.c b/arch/x86/crypto/twofish_avx_glue.c
index b7a3904b953c..de00fe24927e 100644
--- a/arch/x86/crypto/twofish_avx_glue.c
+++ b/arch/x86/crypto/twofish_avx_glue.c
@@ -218,6 +218,21 @@ struct crypt_priv {
 	bool fpu_enabled;
 };
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+static void twofish_fpu_end_rt(struct crypt_priv *ctx)
+{
+	bool fpu_enabled = ctx->fpu_enabled;
+
+	if (!fpu_enabled)
+		return;
+	twofish_fpu_end(fpu_enabled);
+	ctx->fpu_enabled = false;
+}
+
+#else
+static void twofish_fpu_end_rt(struct crypt_priv *ctx) { }
+#endif
+
 static void encrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 {
 	const unsigned int bsize = TF_BLOCK_SIZE;
@@ -228,12 +243,16 @@ static void encrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 
 	if (nbytes == bsize * TWOFISH_PARALLEL_BLOCKS) {
 		twofish_ecb_enc_8way(ctx->ctx, srcdst, srcdst);
+		twofish_fpu_end_rt(ctx);
 		return;
 	}
 
-	for (i = 0; i < nbytes / (bsize * 3); i++, srcdst += bsize * 3)
+	for (i = 0; i < nbytes / (bsize * 3); i++, srcdst += bsize * 3) {
+		kernel_fpu_resched();
 		twofish_enc_blk_3way(ctx->ctx, srcdst, srcdst);
+	}
+	twofish_fpu_end_rt(ctx);
 
 	nbytes %= bsize * 3;
 
 	for (i = 0; i < nbytes / bsize; i++, srcdst += bsize)
@@ -250,11 +269,15 @@ static void decrypt_callback(void *priv, u8 *srcdst, unsigned int nbytes)
 
 	if (nbytes == bsize * TWOFISH_PARALLEL_BLOCKS) {
 		twofish_ecb_dec_8way(ctx->ctx, srcdst, srcdst);
+		twofish_fpu_end_rt(ctx);
 		return;
 	}
 
-	for (i = 0; i < nbytes / (bsize * 3); i++, srcdst += bsize * 3)
+	for (i = 0; i < nbytes / (bsize * 3); i++, srcdst += bsize * 3) {
+		kernel_fpu_resched();
 		twofish_dec_blk_3way(ctx->ctx, srcdst, srcdst);
+	}
+	twofish_fpu_end_rt(ctx);
 
 	nbytes %= bsize * 3;
 
diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
index 1429a7c736db..85428df40a22 100644
--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -24,6 +24,7 @@ extern void __kernel_fpu_begin(void);
 extern void __kernel_fpu_end(void);
 extern void kernel_fpu_begin(void);
 extern void kernel_fpu_end(void);
+extern void kernel_fpu_resched(void);
 extern bool irq_fpu_usable(void);
 
 /*
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index d25097c3fc1d..67ae18aeaf8b 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -149,6 +149,18 @@ void kernel_fpu_end(void)
 }
 EXPORT_SYMBOL_GPL(kernel_fpu_end);
 
+void kernel_fpu_resched(void)
+{
+	WARN_ON_FPU(!this_cpu_read(in_kernel_fpu));
+
+	if (should_resched(PREEMPT_OFFSET)) {
+		kernel_fpu_end();
+		cond_resched();
+		kernel_fpu_begin();
+	}
+}
+EXPORT_SYMBOL_GPL(kernel_fpu_resched);
+
 /*
  * CR0::TS save/restore functions:
  */
-- 
2.14.3
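
For readers who want to follow the kernel_fpu_resched() pattern outside the kernel, here is a minimal userspace sketch. It is not part of the patch and is not kernel code. All sim_* names, struct sim_state and the resched_at parameter are made up for illustration. The resched_pending flag only stands in for should_resched(PREEMPT_OFFSET), and clearing it stands in for cond_resched().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
struct sim_state {
	bool in_fpu_section;	/* plays the role of the in_kernel_fpu flag */
	bool resched_pending;	/* plays the role of should_resched() */
	unsigned int begins;	/* how often the section was (re-)entered */
	unsigned int yields;	/* how often the CPU was given up */
};

static void sim_fpu_begin(struct sim_state *s)
{
	assert(!s->in_fpu_section);
	s->in_fpu_section = true;
	s->begins++;
}

static void sim_fpu_end(struct sim_state *s)
{
	assert(s->in_fpu_section);
	s->in_fpu_section = false;
}

/* Same shape as kernel_fpu_resched(): leave, yield, re-enter, only on demand. */
static void sim_fpu_resched(struct sim_state *s)
{
	assert(s->in_fpu_section);

	if (s->resched_pending) {
		sim_fpu_end(s);
		s->resched_pending = false;	/* stands in for cond_resched() */
		s->yields++;
		sim_fpu_begin(s);
	}
}

/* Process nblocks; a reschedule request shows up just before block resched_at. */
static void sim_process(struct sim_state *s, size_t nblocks, size_t resched_at)
{
	size_t i;

	sim_fpu_begin(s);
	for (i = 0; i < nblocks; i++) {
		if (i == resched_at)
			s->resched_pending = true;
		sim_fpu_resched(s);
		/* the SIMD work on block i would run here */
	}
	sim_fpu_end(s);
}
```

As long as nothing is pending, the loop stays inside a single begin/end pair, so the cost of entering and leaving the section is only paid when a reschedule is actually requested.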