Currently pcrypt_aead_init_tfm() will pick callback cpus (ctx->cb_cpu)
from any online cpus. Later padata_reorder() will queue_work_on() the
chosen cb_cpu.
This is undesired if the chosen cb_cpu is listed as isolated (i.e. using
isolcpus=... kernel parameter), since the work queued will interfere with
the workload on the isolated cpu.
Make sure isolated cpus are not used for pcrypt.
Signed-off-by: Leonardo Bras <[email protected]>
---
crypto/pcrypt.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/crypto/pcrypt.c b/crypto/pcrypt.c
index 9d10b846ccf73..9017d08c91a8d 100644
--- a/crypto/pcrypt.c
+++ b/crypto/pcrypt.c
@@ -16,6 +16,7 @@
#include <linux/kobject.h>
#include <linux/cpu.h>
#include <crypto/pcrypt.h>
+#include <linux/sched/isolation.h>
static struct padata_instance *pencrypt;
static struct padata_instance *pdecrypt;
@@ -175,13 +176,16 @@ static int pcrypt_aead_init_tfm(struct crypto_aead *tfm)
struct pcrypt_instance_ctx *ictx = aead_instance_ctx(inst);
struct pcrypt_aead_ctx *ctx = crypto_aead_ctx(tfm);
struct crypto_aead *cipher;
+ struct cpumask non_isolated;
+
+ cpumask_and(&non_isolated, cpu_online_mask, housekeeping_cpumask(HK_TYPE_DOMAIN));
cpu_index = (unsigned int)atomic_inc_return(&ictx->tfm_count) %
- cpumask_weight(cpu_online_mask);
+ cpumask_weight(&non_isolated);
- ctx->cb_cpu = cpumask_first(cpu_online_mask);
+ ctx->cb_cpu = cpumask_first(&non_isolated);
for (cpu = 0; cpu < cpu_index; cpu++)
- ctx->cb_cpu = cpumask_next(ctx->cb_cpu, cpu_online_mask);
+ ctx->cb_cpu = cpumask_next(ctx->cb_cpu, &non_isolated);
cipher = crypto_spawn_aead(&ictx->spawn);
--
2.37.3
On Tue, Oct 04, 2022 at 03:25:37AM -0300, Leonardo Bras wrote:
> Currently pcrypt_aead_init_tfm() will pick callback cpus (ctx->cb_cpu)
> from any online cpus. Later padata_reorder() will queue_work_on() the
> chosen cb_cpu.
>
> This is undesired if the chosen cb_cpu is listed as isolated (i.e. using
> isolcpus=... kernel parameter), since the work queued will interfere with
> the workload on the isolated cpu.
>
> Make sure isolated cpus are not used for pcrypt.
>
> Signed-off-by: Leonardo Bras <[email protected]>
> ---
> crypto/pcrypt.c | 10 +++++++---
> 1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/crypto/pcrypt.c b/crypto/pcrypt.c
> index 9d10b846ccf73..9017d08c91a8d 100644
> --- a/crypto/pcrypt.c
> +++ b/crypto/pcrypt.c
> @@ -16,6 +16,7 @@
> #include <linux/kobject.h>
> #include <linux/cpu.h>
> #include <crypto/pcrypt.h>
> +#include <linux/sched/isolation.h>
>
> static struct padata_instance *pencrypt;
> static struct padata_instance *pdecrypt;
> @@ -175,13 +176,16 @@ static int pcrypt_aead_init_tfm(struct crypto_aead *tfm)
> struct pcrypt_instance_ctx *ictx = aead_instance_ctx(inst);
> struct pcrypt_aead_ctx *ctx = crypto_aead_ctx(tfm);
> struct crypto_aead *cipher;
> + struct cpumask non_isolated;
> +
> + cpumask_and(&non_isolated, cpu_online_mask, housekeeping_cpumask(HK_TYPE_DOMAIN));
Since certain systems do not use isolcpus=domain, so please use a flag
that is setup by nohz_full=, for example HK_FLAG_MISC:
static int __init housekeeping_nohz_full_setup(char *str)
{
unsigned long flags;
flags = HK_FLAG_TICK | HK_FLAG_WQ | HK_FLAG_TIMER | HK_FLAG_RCU |
HK_FLAG_MISC | HK_FLAG_KTHREAD;
return housekeeping_setup(str, flags);
}
__setup("nohz_full=", housekeeping_nohz_full_setup);
Also, shouldnt you use cpumask_t ?
Looks good otherwise.
Thanks!
>
> cpu_index = (unsigned int)atomic_inc_return(&ictx->tfm_count) %
> - cpumask_weight(cpu_online_mask);
> + cpumask_weight(&non_isolated);
>
> - ctx->cb_cpu = cpumask_first(cpu_online_mask);
> + ctx->cb_cpu = cpumask_first(&non_isolated);
> for (cpu = 0; cpu < cpu_index; cpu++)
> - ctx->cb_cpu = cpumask_next(ctx->cb_cpu, cpu_online_mask);
> + ctx->cb_cpu = cpumask_next(ctx->cb_cpu, &non_isolated);
>
> cipher = crypto_spawn_aead(&ictx->spawn);
>
> --
> 2.37.3
>
>
On Wed, 2022-10-05 at 09:57 -0300, Marcelo Tosatti wrote:
> On Tue, Oct 04, 2022 at 03:25:37AM -0300, Leonardo Bras wrote:
> > Currently pcrypt_aead_init_tfm() will pick callback cpus (ctx->cb_cpu)
> > from any online cpus. Later padata_reorder() will queue_work_on() the
> > chosen cb_cpu.
> >
> > This is undesired if the chosen cb_cpu is listed as isolated (i.e. using
> > isolcpus=... kernel parameter), since the work queued will interfere with
> > the workload on the isolated cpu.
> >
> > Make sure isolated cpus are not used for pcrypt.
> >
> > Signed-off-by: Leonardo Bras <[email protected]>
> > ---
> > crypto/pcrypt.c | 10 +++++++---
> > 1 file changed, 7 insertions(+), 3 deletions(-)
> >
> > diff --git a/crypto/pcrypt.c b/crypto/pcrypt.c
> > index 9d10b846ccf73..9017d08c91a8d 100644
> > --- a/crypto/pcrypt.c
> > +++ b/crypto/pcrypt.c
> > @@ -16,6 +16,7 @@
> > #include <linux/kobject.h>
> > #include <linux/cpu.h>
> > #include <crypto/pcrypt.h>
> > +#include <linux/sched/isolation.h>
> >
> > static struct padata_instance *pencrypt;
> > static struct padata_instance *pdecrypt;
> > @@ -175,13 +176,16 @@ static int pcrypt_aead_init_tfm(struct crypto_aead *tfm)
> > struct pcrypt_instance_ctx *ictx = aead_instance_ctx(inst);
> > struct pcrypt_aead_ctx *ctx = crypto_aead_ctx(tfm);
> > struct crypto_aead *cipher;
> > + struct cpumask non_isolated;
> > +
> > + cpumask_and(&non_isolated, cpu_online_mask, housekeeping_cpumask(HK_TYPE_DOMAIN));
>
> Since certain systems do not use isolcpus=domain, so please use a flag
> that is setup by nohz_full=, for example HK_FLAG_MISC:
>
> static int __init housekeeping_nohz_full_setup(char *str)
> {
> unsigned long flags;
>
> flags = HK_FLAG_TICK | HK_FLAG_WQ | HK_FLAG_TIMER | HK_FLAG_RCU |
> HK_FLAG_MISC | HK_FLAG_KTHREAD;
>
> return housekeeping_setup(str, flags);
> }
> __setup("nohz_full=", housekeeping_nohz_full_setup);
Oh, sure.
Since we are talking about WorkQueues, I think it makes sense to pick
HK_FLAG_WQ.
>
> Also, shouldnt you use cpumask_t ?/
Yeah, I think so.
I was quick to choose the 'struct cpumask' because all functions would operate
in this variable type, but yeah, I think it makes sense to have this variable
being opaque here.
>
> Looks good otherwise.
>
> Thanks!
Thank you for reviewing!
Leo
>
>
> >
> > cpu_index = (unsigned int)atomic_inc_return(&ictx->tfm_count) %
> > - cpumask_weight(cpu_online_mask);
> > + cpumask_weight(&non_isolated);
> >
> > - ctx->cb_cpu = cpumask_first(cpu_online_mask);
> > + ctx->cb_cpu = cpumask_first(&non_isolated);
> > for (cpu = 0; cpu < cpu_index; cpu++)
> > - ctx->cb_cpu = cpumask_next(ctx->cb_cpu, cpu_online_mask);
> > + ctx->cb_cpu = cpumask_next(ctx->cb_cpu, &non_isolated);
> >
> > cipher = crypto_spawn_aead(&ictx->spawn);
> >
> > --
> > 2.37.3
> >
> >
>
On Fri, 2022-10-07 at 18:42 -0300, Leonardo Brás wrote:
> On Wed, 2022-10-05 at 09:57 -0300, Marcelo Tosatti wrote:
> > On Tue, Oct 04, 2022 at 03:25:37AM -0300, Leonardo Bras wrote:
> > > Currently pcrypt_aead_init_tfm() will pick callback cpus (ctx->cb_cpu)
> > > from any online cpus. Later padata_reorder() will queue_work_on() the
> > > chosen cb_cpu.
> > >
> > > This is undesired if the chosen cb_cpu is listed as isolated (i.e. using
> > > isolcpus=... kernel parameter), since the work queued will interfere with
> > > the workload on the isolated cpu.
> > >
> > > Make sure isolated cpus are not used for pcrypt.
> > >
> > > Signed-off-by: Leonardo Bras <[email protected]>
> > > ---
> > > crypto/pcrypt.c | 10 +++++++---
> > > 1 file changed, 7 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/crypto/pcrypt.c b/crypto/pcrypt.c
> > > index 9d10b846ccf73..9017d08c91a8d 100644
> > > --- a/crypto/pcrypt.c
> > > +++ b/crypto/pcrypt.c
> > > @@ -16,6 +16,7 @@
> > > #include <linux/kobject.h>
> > > #include <linux/cpu.h>
> > > #include <crypto/pcrypt.h>
> > > +#include <linux/sched/isolation.h>
> > >
> > > static struct padata_instance *pencrypt;
> > > static struct padata_instance *pdecrypt;
> > > @@ -175,13 +176,16 @@ static int pcrypt_aead_init_tfm(struct crypto_aead *tfm)
> > > struct pcrypt_instance_ctx *ictx = aead_instance_ctx(inst);
> > > struct pcrypt_aead_ctx *ctx = crypto_aead_ctx(tfm);
> > > struct crypto_aead *cipher;
> > > + struct cpumask non_isolated;
> > > +
> > > + cpumask_and(&non_isolated, cpu_online_mask, housekeeping_cpumask(HK_TYPE_DOMAIN));
> >
> > Since certain systems do not use isolcpus=domain, so please use a flag
> > that is setup by nohz_full=, for example HK_FLAG_MISC:
> >
> > static int __init housekeeping_nohz_full_setup(char *str)
> > {
> > unsigned long flags;
> >
> > flags = HK_FLAG_TICK | HK_FLAG_WQ | HK_FLAG_TIMER | HK_FLAG_RCU |
> > HK_FLAG_MISC | HK_FLAG_KTHREAD;
> >
> > return housekeeping_setup(str, flags);
> > }
> > __setup("nohz_full=", housekeeping_nohz_full_setup);
>
> Oh, sure.
> Since we are talking about WorkQueues, I think it makes sense to pick
> HK_FLAG_WQ.
>
> >
> > Also, shouldnt you use cpumask_t ?/
>
> Yeah, I think so.
> I was quick to choose the 'struct cpumask' because all functions would operate
> in this variable type, but yeah, I think it makes sense to have this variable
> being opaque here.
In fact, it seems neither 'struct cpumask' nor 'cpumask_t' are recommended to be
used allocated in the stack, due to the large size it can get (up to 1kB).
At include/linux/cpumask.h we have:
'cpumask_var_t: struct cpumask for stack usage'
which should work better at least for init functions like this.
In other cases, I see 'static cpumask_t' being used to avoid the allocation
overhead, but it's probably due to the functions being called in very specific
scenarios. It could mean trouble if multiple cpus try to use it at once.
What do you recommend on it?
Best regards,
Leo
>
> >
> > Looks good otherwise.
> >
> > Thanks!
>
> Thank you for reviewing!
>
> Leo
>
> >
> >
> > >
> > > cpu_index = (unsigned int)atomic_inc_return(&ictx->tfm_count) %
> > > - cpumask_weight(cpu_online_mask);
> > > + cpumask_weight(&non_isolated);
> > >
> > > - ctx->cb_cpu = cpumask_first(cpu_online_mask);
> > > + ctx->cb_cpu = cpumask_first(&non_isolated);
> > > for (cpu = 0; cpu < cpu_index; cpu++)
> > > - ctx->cb_cpu = cpumask_next(ctx->cb_cpu, cpu_online_mask);
> > > + ctx->cb_cpu = cpumask_next(ctx->cb_cpu, &non_isolated);
> > >
> > > cipher = crypto_spawn_aead(&ictx->spawn);
> > >
> > > --
> > > 2.37.3
> > >
> > >
> >
>
On Tue, Oct 11, 2022 at 03:20:39PM -0300, Leonardo Br?s wrote:
> On Fri, 2022-10-07 at 18:42 -0300, Leonardo Br?s wrote:
> > On Wed, 2022-10-05 at 09:57 -0300, Marcelo Tosatti wrote:
> > > On Tue, Oct 04, 2022 at 03:25:37AM -0300, Leonardo Bras wrote:
> > > > Currently pcrypt_aead_init_tfm() will pick callback cpus (ctx->cb_cpu)
> > > > from any online cpus. Later padata_reorder() will queue_work_on() the
> > > > chosen cb_cpu.
> > > >
> > > > This is undesired if the chosen cb_cpu is listed as isolated (i.e. using
> > > > isolcpus=... kernel parameter), since the work queued will interfere with
> > > > the workload on the isolated cpu.
> > > >
> > > > Make sure isolated cpus are not used for pcrypt.
> > > >
> > > > Signed-off-by: Leonardo Bras <[email protected]>
> > > > ---
> > > > crypto/pcrypt.c | 10 +++++++---
> > > > 1 file changed, 7 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/crypto/pcrypt.c b/crypto/pcrypt.c
> > > > index 9d10b846ccf73..9017d08c91a8d 100644
> > > > --- a/crypto/pcrypt.c
> > > > +++ b/crypto/pcrypt.c
> > > > @@ -16,6 +16,7 @@
> > > > #include <linux/kobject.h>
> > > > #include <linux/cpu.h>
> > > > #include <crypto/pcrypt.h>
> > > > +#include <linux/sched/isolation.h>
> > > >
> > > > static struct padata_instance *pencrypt;
> > > > static struct padata_instance *pdecrypt;
> > > > @@ -175,13 +176,16 @@ static int pcrypt_aead_init_tfm(struct crypto_aead *tfm)
> > > > struct pcrypt_instance_ctx *ictx = aead_instance_ctx(inst);
> > > > struct pcrypt_aead_ctx *ctx = crypto_aead_ctx(tfm);
> > > > struct crypto_aead *cipher;
> > > > + struct cpumask non_isolated;
> > > > +
> > > > + cpumask_and(&non_isolated, cpu_online_mask, housekeeping_cpumask(HK_TYPE_DOMAIN));
> > >
> > > Since certain systems do not use isolcpus=domain, so please use a flag
> > > that is setup by nohz_full=, for example HK_FLAG_MISC:
> > >
> > > static int __init housekeeping_nohz_full_setup(char *str)
> > > {
> > > unsigned long flags;
> > >
> > > flags = HK_FLAG_TICK | HK_FLAG_WQ | HK_FLAG_TIMER | HK_FLAG_RCU |
> > > HK_FLAG_MISC | HK_FLAG_KTHREAD;
> > >
> > > return housekeeping_setup(str, flags);
> > > }
> > > __setup("nohz_full=", housekeeping_nohz_full_setup);
> >
> > Oh, sure.?
> > Since we are talking about WorkQueues, I think it makes sense to pick
> > HK_FLAG_WQ.
> >
> > >
> > > Also, shouldnt you use cpumask_t ?/
> >
> > Yeah, I think so.?
> > I was quick to choose the 'struct cpumask' because all functions would operate
> > in this variable type, but yeah, I think it makes sense to have this variable
> > being opaque here.
>
> In fact, it seems neither 'struct cpumask' nor 'cpumask_t' are recommended to be
> used allocated in the stack, due to the large size it can get (up to 1kB).?
>
> At include/linux/cpumask.h we have:
> 'cpumask_var_t: struct cpumask for stack usage'
> which should work better at least for init functions like this.
>
> In other cases, I see 'static cpumask_t' being used to avoid the allocation
> overhead, but it's probably due to the functions being called in very specific
> scenarios. It could mean trouble if multiple cpus try to use it at once.
>
> What do you recommend on it?
Sorry for the delay. I suppose allocating and freeing is OK in this context, since
its initialization time and not a hot path?
On Tue, Nov 1, 2022 at 2:42 PM Marcelo Tosatti <[email protected]> wrote:
>
> On Tue, Oct 11, 2022 at 03:20:39PM -0300, Leonardo Brás wrote:
> > On Fri, 2022-10-07 at 18:42 -0300, Leonardo Brás wrote:
> > > On Wed, 2022-10-05 at 09:57 -0300, Marcelo Tosatti wrote:
> > > > On Tue, Oct 04, 2022 at 03:25:37AM -0300, Leonardo Bras wrote:
> > > > > Currently pcrypt_aead_init_tfm() will pick callback cpus (ctx->cb_cpu)
> > > > > from any online cpus. Later padata_reorder() will queue_work_on() the
> > > > > chosen cb_cpu.
> > > > >
> > > > > This is undesired if the chosen cb_cpu is listed as isolated (i.e. using
> > > > > isolcpus=... kernel parameter), since the work queued will interfere with
> > > > > the workload on the isolated cpu.
> > > > >
> > > > > Make sure isolated cpus are not used for pcrypt.
> > > > >
> > > > > Signed-off-by: Leonardo Bras <[email protected]>
> > > > > ---
> > > > > crypto/pcrypt.c | 10 +++++++---
> > > > > 1 file changed, 7 insertions(+), 3 deletions(-)
> > > > >
> > > > > diff --git a/crypto/pcrypt.c b/crypto/pcrypt.c
> > > > > index 9d10b846ccf73..9017d08c91a8d 100644
> > > > > --- a/crypto/pcrypt.c
> > > > > +++ b/crypto/pcrypt.c
> > > > > @@ -16,6 +16,7 @@
> > > > > #include <linux/kobject.h>
> > > > > #include <linux/cpu.h>
> > > > > #include <crypto/pcrypt.h>
> > > > > +#include <linux/sched/isolation.h>
> > > > >
> > > > > static struct padata_instance *pencrypt;
> > > > > static struct padata_instance *pdecrypt;
> > > > > @@ -175,13 +176,16 @@ static int pcrypt_aead_init_tfm(struct crypto_aead *tfm)
> > > > > struct pcrypt_instance_ctx *ictx = aead_instance_ctx(inst);
> > > > > struct pcrypt_aead_ctx *ctx = crypto_aead_ctx(tfm);
> > > > > struct crypto_aead *cipher;
> > > > > + struct cpumask non_isolated;
> > > > > +
> > > > > + cpumask_and(&non_isolated, cpu_online_mask, housekeeping_cpumask(HK_TYPE_DOMAIN));
> > > >
> > > > Since certain systems do not use isolcpus=domain, so please use a flag
> > > > that is setup by nohz_full=, for example HK_FLAG_MISC:
> > > >
> > > > static int __init housekeeping_nohz_full_setup(char *str)
> > > > {
> > > > unsigned long flags;
> > > >
> > > > flags = HK_FLAG_TICK | HK_FLAG_WQ | HK_FLAG_TIMER | HK_FLAG_RCU |
> > > > HK_FLAG_MISC | HK_FLAG_KTHREAD;
> > > >
> > > > return housekeeping_setup(str, flags);
> > > > }
> > > > __setup("nohz_full=", housekeeping_nohz_full_setup);
> > >
> > > Oh, sure.
> > > Since we are talking about WorkQueues, I think it makes sense to pick
> > > HK_FLAG_WQ.
> > >
> > > >
> > > > Also, shouldnt you use cpumask_t ?/
> > >
> > > Yeah, I think so.
> > > I was quick to choose the 'struct cpumask' because all functions would operate
> > > in this variable type, but yeah, I think it makes sense to have this variable
> > > being opaque here.
> >
> > In fact, it seems neither 'struct cpumask' nor 'cpumask_t' are recommended to be
> > used allocated in the stack, due to the large size it can get (up to 1kB).
> >
> > At include/linux/cpumask.h we have:
> > 'cpumask_var_t: struct cpumask for stack usage'
> > which should work better at least for init functions like this.
> >
> > In other cases, I see 'static cpumask_t' being used to avoid the allocation
> > overhead, but it's probably due to the functions being called in very specific
> > scenarios. It could mean trouble if multiple cpus try to use it at once.
> >
> > What do you recommend on it?
>
> Sorry for the delay. I suppose allocating and freeing is OK in this context, since
> its initialization time and not a hot path?
Yeah, it makes sense this way. I will allocate as suggested!
(Unless my other change gets in, so this new variable and a lot of
overhead can be avoided.)
Thanks Marcelo!