Received: by 2002:a25:868d:0:0:0:0:0 with SMTP id z13csp1724066ybk; Thu, 21 May 2020 13:43:50 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyOMs3DSfu45n1/o27AnFnG7pflLtN8uEGCN56X91sx9ni3zWmG5FjJKhnwxBQFZezK+NU4 X-Received: by 2002:a17:906:c308:: with SMTP id s8mr5202316ejz.366.1590093830228; Thu, 21 May 2020 13:43:50 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1590093830; cv=none; d=google.com; s=arc-20160816; b=nd3Nh1STA5kooFK/mqcdA7Y2TotcR+vM1frJJvWHDkn6krMhQb14lo7rLvpPsC7K+g qr66mxCnwcxbXsWNhIkyymm9QflKbWwnz+0uu/HhCQt78r7NvM7CoMN26t6ze7tuv9Cw YwDWiuKryxjXYhbK5QLnPHof6zFiU23SJYuw/8Pvfhuif/OeiKxFpG9EhnJC2Xg/TMu0 I1T/D53ZhqJX5FpID7D575w1PjEr6T2i27wPr8DXn5lkmzVaGy7UtU5KJ9VYbRaZ+dXe fUilrgYbtu1EwPqvr2046d82vfoK2m+nZQMHJ4dJamtuguRAtH74+fah0Ce+OCXKuA+p 209A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=ltC/0+34XiBDhZ+hg/k0+0fG/bHjOpRze8zbah8tonU=; b=Gp/AnIJC3KIgNsUisdLD1q5nKjpyybXQuJ4CIH+0ktanzKevFE3vffN9MLSsngo8W9 VLt9A+n86aD+V7YhJXbgkyb/jis7bod8V6R93XYHCBmsg5EXQGqP9R9q4wc4Tg3d+PCg MaRuSL8PfpLZjp0VTGZnpWiSuE0/45wsADNot70RE4jfjyMy7k+GaryIYoUDOaU3OMwk 9KuJwpsp+6UeCeja7unQseHmaHT2RdDfz+T3Z/3Hcutt8GasTYkQqoZYlNPHuw3ekzuU DyoHT15ZFSfyckaLXjMJiz2RQE7UHWoc+WeQDYM7rL0nSsJeZ95bMkWALhVujZ6bL/9O /4hQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@oracle.com header.s=corp-2020-01-29 header.b=ZPY+PABC; spf=pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=oracle.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id aq18si3887368ejc.123.2020.05.21.13.43.26; Thu, 21 May 2020 13:43:50 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@oracle.com header.s=corp-2020-01-29 header.b=ZPY+PABC; spf=pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=oracle.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728943AbgEUUnP (ORCPT + 99 others); Thu, 21 May 2020 16:43:15 -0400 Received: from aserp2120.oracle.com ([141.146.126.78]:46014 "EHLO aserp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728547AbgEUUnP (ORCPT ); Thu, 21 May 2020 16:43:15 -0400 Received: from pps.filterd (aserp2120.oracle.com [127.0.0.1]) by aserp2120.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 04LKgEEZ113461; Thu, 21 May 2020 20:43:01 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : mime-version : content-transfer-encoding; s=corp-2020-01-29; bh=ltC/0+34XiBDhZ+hg/k0+0fG/bHjOpRze8zbah8tonU=; b=ZPY+PABCo2++JzZ1b4e5l5QR4yPljIooOoaKGrPmx7VeXdAVS5GWAbMQat05O3TXjcC7 Z5r1J3g1A/T1S8q0w9mRpEkmVx1Cpc7gqoeypRrBuf5SdZpXNx6zXTu5OJUIb0mKDGmd 5Ww9Vnu7m17a0dDdLUPG6uPKdJfTzO6qTQkVAkzLutp+qulu6ctTMxj24WXDdJm0TFIL 5H2WERh2EEAtMMTK0HJdazBtmv18Zgy4rOPuFux/8ocraSLrjjf4u/I/xF+L+BAGlFsZ GJ4G2f/wfxN2as80WQO7rqufTMVl0+RCtymUNeRutlqsZSa5K0E8WVHxYOs+L7048TNx kA== Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by aserp2120.oracle.com with ESMTP id 31284manhw-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Thu, 21 May 2020 20:43:00 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 04LKYWgF010759; Thu, 21 May 2020 20:41:00 GMT Received: from aserv0121.oracle.com (aserv0121.oracle.com [141.146.126.235]) by userp3030.oracle.com with ESMTP id 314gm9wjg0-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 21 May 2020 20:40:59 +0000 Received: from abhmp0014.oracle.com (abhmp0014.oracle.com [141.146.116.20]) by aserv0121.oracle.com (8.14.4/8.13.8) with ESMTP id 04LKeuDb029896; Thu, 21 May 2020 20:40:56 GMT Received: from localhost.localdomain (/98.229.125.203) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Thu, 21 May 2020 13:40:56 -0700 From: Daniel Jordan To: Greg Kroah-Hartman , Sasha Levin Cc: Ben Hutchings , Herbert Xu , Steffen Klassert , stable@vger.kernel.org, linux-crypto@vger.kernel.org, Daniel Jordan Subject: [stable-4.19 1/3] padata: Replace delayed timer with immediate workqueue in padata_reorder Date: Thu, 21 May 2020 16:40:49 -0400 Message-Id: <20200521204051.1952184-1-daniel.m.jordan@oracle.com> X-Mailer: git-send-email 2.26.2 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9628 signatures=668686 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 malwarescore=0 mlxlogscore=999 adultscore=0 phishscore=0 mlxscore=0 spamscore=0 suspectscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2004280000 definitions=main-2005210152 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9628 signatures=668686 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 mlxscore=0 cotscore=-2147483648 impostorscore=0 malwarescore=0 mlxlogscore=999 lowpriorityscore=0 phishscore=0 spamscore=0 bulkscore=0 adultscore=0 priorityscore=1501 clxscore=1015 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2004280000 definitions=main-2005210153 Sender: linux-crypto-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org From: Herbert Xu [ Upstream commit 6fc4dbcf0276279d488c5fbbfabe94734134f4fa ] The function padata_reorder will use a timer when it cannot progress while completed jobs are outstanding (pd->reorder_objects > 0). This is suboptimal as if we do end up using the timer then it would have introduced a gratuitous delay of one second. In fact we can easily distinguish between whether completed jobs are outstanding and whether we can make progress. All we have to do is look at the next pqueue list. This patch does that by replacing pd->processed with pd->cpu so that the next pqueue is more accessible. A work queue is used instead of the original try_again to avoid hogging the CPU. Note that we don't bother removing the work queue in padata_flush_queues because the whole premise is broken. You cannot flush async crypto requests so it makes no sense to even try. A subsequent patch will fix it by replacing it with a ref counting scheme. Signed-off-by: Herbert Xu [dj: - adjust context - corrected setup_timer -> timer_setup to delete hunk - skip padata_flush_queues() hunk, function already removed in 4.19] Signed-off-by: Daniel Jordan --- include/linux/padata.h | 13 ++---- kernel/padata.c | 95 ++++++++---------------------------------- 2 files changed, 22 insertions(+), 86 deletions(-) diff --git a/include/linux/padata.h b/include/linux/padata.h index 5d13d25da2c8..d803397a28f7 100644 --- a/include/linux/padata.h +++ b/include/linux/padata.h @@ -24,7 +24,6 @@ #include #include #include -#include #include #include @@ -85,18 +84,14 @@ struct padata_serial_queue { * @serial: List to wait for serialization after reordering. * @pwork: work struct for parallelization. * @swork: work struct for serialization. - * @pd: Backpointer to the internal control structure. * @work: work struct for parallelization. - * @reorder_work: work struct for reordering. * @num_obj: Number of objects that are processed by this cpu. * @cpu_index: Index of the cpu. */ struct padata_parallel_queue { struct padata_list parallel; struct padata_list reorder; - struct parallel_data *pd; struct work_struct work; - struct work_struct reorder_work; atomic_t num_obj; int cpu_index; }; @@ -122,10 +117,10 @@ struct padata_cpumask { * @reorder_objects: Number of objects waiting in the reorder queues. * @refcnt: Number of objects holding a reference on this parallel_data. * @max_seq_nr: Maximal used sequence number. + * @cpu: Next CPU to be processed. * @cpumask: The cpumasks in use for parallel and serial workers. + * @reorder_work: work struct for reordering. * @lock: Reorder lock. - * @processed: Number of already processed objects. - * @timer: Reorder timer. */ struct parallel_data { struct padata_instance *pinst; @@ -134,10 +129,10 @@ struct parallel_data { atomic_t reorder_objects; atomic_t refcnt; atomic_t seq_nr; + int cpu; struct padata_cpumask cpumask; + struct work_struct reorder_work; spinlock_t lock ____cacheline_aligned; - unsigned int processed; - struct timer_list timer; }; /** diff --git a/kernel/padata.c b/kernel/padata.c index c280cb153915..47dc31ce15ac 100644 --- a/kernel/padata.c +++ b/kernel/padata.c @@ -167,23 +167,12 @@ EXPORT_SYMBOL(padata_do_parallel); */ static struct padata_priv *padata_get_next(struct parallel_data *pd) { - int cpu, num_cpus; - unsigned int next_nr, next_index; struct padata_parallel_queue *next_queue; struct padata_priv *padata; struct padata_list *reorder; + int cpu = pd->cpu; - num_cpus = cpumask_weight(pd->cpumask.pcpu); - - /* - * Calculate the percpu reorder queue and the sequence - * number of the next object. - */ - next_nr = pd->processed; - next_index = next_nr % num_cpus; - cpu = padata_index_to_cpu(pd, next_index); next_queue = per_cpu_ptr(pd->pqueue, cpu); - reorder = &next_queue->reorder; spin_lock(&reorder->lock); @@ -194,7 +183,8 @@ static struct padata_priv *padata_get_next(struct parallel_data *pd) list_del_init(&padata->list); atomic_dec(&pd->reorder_objects); - pd->processed++; + pd->cpu = cpumask_next_wrap(cpu, pd->cpumask.pcpu, -1, + false); spin_unlock(&reorder->lock); goto out; @@ -217,6 +207,7 @@ static void padata_reorder(struct parallel_data *pd) struct padata_priv *padata; struct padata_serial_queue *squeue; struct padata_instance *pinst = pd->pinst; + struct padata_parallel_queue *next_queue; /* * We need to ensure that only one cpu can work on dequeueing of @@ -248,7 +239,6 @@ static void padata_reorder(struct parallel_data *pd) * so exit immediately. */ if (PTR_ERR(padata) == -ENODATA) { - del_timer(&pd->timer); spin_unlock_bh(&pd->lock); return; } @@ -267,70 +257,29 @@ static void padata_reorder(struct parallel_data *pd) /* * The next object that needs serialization might have arrived to - * the reorder queues in the meantime, we will be called again - * from the timer function if no one else cares for it. + * the reorder queues in the meantime. * - * Ensure reorder_objects is read after pd->lock is dropped so we see - * an increment from another task in padata_do_serial. Pairs with + * Ensure reorder queue is read after pd->lock is dropped so we see + * new objects from another task in padata_do_serial. Pairs with * smp_mb__after_atomic in padata_do_serial. */ smp_mb(); - if (atomic_read(&pd->reorder_objects) - && !(pinst->flags & PADATA_RESET)) - mod_timer(&pd->timer, jiffies + HZ); - else - del_timer(&pd->timer); - return; + next_queue = per_cpu_ptr(pd->pqueue, pd->cpu); + if (!list_empty(&next_queue->reorder.list)) + queue_work(pinst->wq, &pd->reorder_work); } static void invoke_padata_reorder(struct work_struct *work) { - struct padata_parallel_queue *pqueue; struct parallel_data *pd; local_bh_disable(); - pqueue = container_of(work, struct padata_parallel_queue, reorder_work); - pd = pqueue->pd; + pd = container_of(work, struct parallel_data, reorder_work); padata_reorder(pd); local_bh_enable(); } -static void padata_reorder_timer(struct timer_list *t) -{ - struct parallel_data *pd = from_timer(pd, t, timer); - unsigned int weight; - int target_cpu, cpu; - - cpu = get_cpu(); - - /* We don't lock pd here to not interfere with parallel processing - * padata_reorder() calls on other CPUs. We just need any CPU out of - * the cpumask.pcpu set. It would be nice if it's the right one but - * it doesn't matter if we're off to the next one by using an outdated - * pd->processed value. - */ - weight = cpumask_weight(pd->cpumask.pcpu); - target_cpu = padata_index_to_cpu(pd, pd->processed % weight); - - /* ensure to call the reorder callback on the correct CPU */ - if (cpu != target_cpu) { - struct padata_parallel_queue *pqueue; - struct padata_instance *pinst; - - /* The timer function is serialized wrt itself -- no locking - * needed. - */ - pinst = pd->pinst; - pqueue = per_cpu_ptr(pd->pqueue, target_cpu); - queue_work_on(target_cpu, pinst->wq, &pqueue->reorder_work); - } else { - padata_reorder(pd); - } - - put_cpu(); -} - static void padata_serial_worker(struct work_struct *serial_work) { struct padata_serial_queue *squeue; @@ -384,9 +333,8 @@ void padata_do_serial(struct padata_priv *padata) cpu = get_cpu(); - /* We need to run on the same CPU padata_do_parallel(.., padata, ..) - * was called on -- or, at least, enqueue the padata object into the - * correct per-cpu queue. + /* We need to enqueue the padata object into the correct + * per-cpu queue. */ if (cpu != padata->cpu) { reorder_via_wq = 1; @@ -396,12 +344,12 @@ void padata_do_serial(struct padata_priv *padata) pqueue = per_cpu_ptr(pd->pqueue, cpu); spin_lock(&pqueue->reorder.lock); - atomic_inc(&pd->reorder_objects); list_add_tail(&padata->list, &pqueue->reorder.list); + atomic_inc(&pd->reorder_objects); spin_unlock(&pqueue->reorder.lock); /* - * Ensure the atomic_inc of reorder_objects above is ordered correctly + * Ensure the addition to the reorder list is ordered correctly * with the trylock of pd->lock in padata_reorder. Pairs with smp_mb * in padata_reorder. */ @@ -409,13 +357,7 @@ void padata_do_serial(struct padata_priv *padata) put_cpu(); - /* If we're running on the wrong CPU, call padata_reorder() via a - * kernel worker. - */ - if (reorder_via_wq) - queue_work_on(cpu, pd->pinst->wq, &pqueue->reorder_work); - else - padata_reorder(pd); + padata_reorder(pd); } EXPORT_SYMBOL(padata_do_serial); @@ -471,14 +413,12 @@ static void padata_init_pqueues(struct parallel_data *pd) continue; } - pqueue->pd = pd; pqueue->cpu_index = cpu_index; cpu_index++; __padata_list_init(&pqueue->reorder); __padata_list_init(&pqueue->parallel); INIT_WORK(&pqueue->work, padata_parallel_worker); - INIT_WORK(&pqueue->reorder_work, invoke_padata_reorder); atomic_set(&pqueue->num_obj, 0); } } @@ -506,12 +446,13 @@ static struct parallel_data *padata_alloc_pd(struct padata_instance *pinst, padata_init_pqueues(pd); padata_init_squeues(pd); - timer_setup(&pd->timer, padata_reorder_timer, 0); atomic_set(&pd->seq_nr, -1); atomic_set(&pd->reorder_objects, 0); atomic_set(&pd->refcnt, 1); pd->pinst = pinst; spin_lock_init(&pd->lock); + pd->cpu = cpumask_first(pcpumask); + INIT_WORK(&pd->reorder_work, invoke_padata_reorder); return pd; -- 2.26.2