Date: Wed, 12 Apr 2017 10:40:03 -0700
From: "Paul E. McKenney"
To: linux-kernel@vger.kernel.org
Cc: mingo@kernel.org, jiangshanlai@gmail.com, dipankar@in.ibm.com,
    akpm@linux-foundation.org, mathieu.desnoyers@efficios.com,
    josh@joshtriplett.org, tglx@linutronix.de, peterz@infradead.org,
    rostedt@goodmis.org, dhowells@redhat.com, edumazet@google.com,
    fweisbec@gmail.com, oleg@redhat.com, bobby.prani@gmail.com
Subject: [PATCH tip/core/rcu 0/40] SRCU callback parallelization for 4.12
Reply-To: paulmck@linux.vnet.ibm.com
Message-Id: <20170412174003.GA23207@linux.vnet.ibm.com>

Hello!

This series moves SRCU from its traditional single per-srcu_struct
callback queue to per-srcu_struct/per-CPU callback queues.  This involves
abstracting functionality from Tree RCU, which results in a large
conflict footprint, which in turn results in some otherwise unrelated
patches coming along for the ride.

1.      Maintain special bits at bottom of ->dynticks counter.  This is
        for some upcoming MM work.  My intent was to hold it until that
        work was ready, but merge conflicts dictated otherwise.  If the
        MM work does not appear soonish, I will manually revert this
        patch.

2.      Make arch select smp_mb__after_unlock_lock() strength, which
        gets rid of an arch-specific #ifdef.

3.      Consolidate SRCU batch checking into rcu_all_batches_empty().

4.      Check for tardy grace-period activity in cleanup_srcu_struct().

5-7.    Semicolon inside RCU_TRACE() for various parts of RCU.

8-10.   Make various parts of RCU do deferred NOCB wakeups in order to
        prevent callback blockages, and thus hangs.

11.     Pull rcu_sched_qs_mask into rcu_dynticks structure in order to
        eliminate an isolated per-CPU variable.

12.     Pull rcu_qs_ctr into rcu_dynticks structure.

13.     Eliminate flavor scan in rcu_momentary_dyntick_idle() to reduce
        semi-common-case context-switch overhead.

14.     Place guard on rcu_all_qs() and rcu_note_context_switch() actions
        to reduce common-case scheduler-fastpath overhead.

15.     Default RCU_FANOUT_LEAF to 16 unless explicitly changed.

16.     Abstract multi-tail callback list handling for SRCU.

17.     Allow SRCU to access rcu_scheduler_active.

18.     Allow early boot use of synchronize_srcu(), though not yet
        mid-boot use.

19.     Add single-element dequeue functions to rcu_segcblist for debug
        use.

20.     Move rcu_seq_start() and friends to rcu.h for SRCU's benefit.

21.     Expedited wakeups need to be fully ordered.

22.     Fix warning in rcu_seq_end().

23.     Push srcu_advance_batches() fastpath into common case as a step
        towards callback parallelization.

24.     Move to state-based grace-period sequencing, also as a step
        towards callback parallelization.

25.     Add grace-period sequence numbers to SRCU.

26.     Use rcu_segcblist to track SRCU callbacks.

27.     Move combining-tree definitions for SRCU's benefit.

28.     Move rcu_init_levelspread() to rcu_tree_node.h for SRCU's
        benefit.

29.     Remove redundant levelcnt[] array from rcu_init_one().

30.     Move rcu_node traversal macros to rcu.h for SRCU's benefit.

31.     Make num_rcu_lvl[] array be external for SRCU's benefit.

32.     Fix bogus try_check_zero() comment.

33.     Improve rcu_seq grace-period-counter abstraction for SRCU's
        benefit.

34.     Allow a second bit in rcu_seq for SRCU state.

35.     Merge ->srcu_state into ->srcu_gp_seq to allow atomic updates.

36.     Provide crude control of expedited SRCU grace periods.

37.     Create a tiny SRCU for bloatwatch/tinification.

38.     Print Tiny SRCU reader statistics in rcutorture.

39.     Introduce CLASSIC_SRCU Kconfig option for those who do not wish
        to help debug Tree SRCU.

40.     Parallelize SRCU callback handling.

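
For readers unfamiliar with the sequence-counter idea behind items 33-35
(a grace-period count and a small state field sharing one word, so that
both change in a single store), here is a minimal user-space sketch.
It is not the code from this series: the seq_*() names and the
SEQ_STATE_BITS constant are hypothetical, the two-bit state width is
assumed from item 34, and the memory barriers and READ_ONCE()/WRITE_ONCE()
accesses that kernel code would need are omitted.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical layout: low bits hold the state, upper bits the count. */
#define SEQ_STATE_BITS 2
#define SEQ_STATE_MASK ((1UL << SEQ_STATE_BITS) - 1)

/* Extract the state (0 means idle, nonzero means in progress). */
static unsigned long seq_state(unsigned long s)
{
	return s & SEQ_STATE_MASK;
}

/* Extract the number of completed grace periods. */
static unsigned long seq_count(unsigned long s)
{
	return s >> SEQ_STATE_BITS;
}

/* Start a grace period: state moves from idle (0) to 1. */
static void seq_start(unsigned long *sp)
{
	assert(seq_state(*sp) == 0);
	*sp += 1;
}

/* Change the state within the current grace period, count unchanged. */
static void seq_set_state(unsigned long *sp, unsigned long newstate)
{
	assert(newstate <= SEQ_STATE_MASK);
	*sp = (*sp & ~SEQ_STATE_MASK) | newstate;
}

/* End a grace period: count advances by one and state returns to 0. */
static void seq_end(unsigned long *sp)
{
	assert(seq_state(*sp) != 0);
	*sp = (*sp | SEQ_STATE_MASK) + 1;
}

/*
 * Given the current value, return the value the counter will have
 * reached once a full grace period that starts after now has ended.
 * A grace period already in progress does not count.
 */
static unsigned long seq_snap(unsigned long cur)
{
	return (cur + 2 * SEQ_STATE_MASK + 1) & ~SEQ_STATE_MASK;
}

/* Wraparound-tolerant check: has the counter reached the snapshot? */
static bool seq_done(unsigned long cur, unsigned long snap)
{
	return cur - snap <= ULONG_MAX / 2;
}
```

Because the state lives in the same word as the count, ending a grace
period both bumps the count and resets the state in one update, which
is the property item 35 is after.
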
                                                        Thanx, Paul

------------------------------------------------------------------------

 /kernel/rcu/rcu_segcblist.h                                     |  671 -----
 b/Documentation/RCU/Design/Data-Structures/Data-Structures.html |   36
 b/arch/Kconfig                                                  |    3
 b/arch/powerpc/Kconfig                                          |    1
 b/include/linux/rcu_node_tree.h                                 |  105
 b/include/linux/rcu_segcblist.h                                 |  720 +++++
 b/include/linux/rcupdate.h                                      |    6
 b/include/linux/rcutiny.h                                       |   11
 b/include/linux/srcu.h                                          |  112
 b/include/linux/srcuclassic.h                                   |  101
 b/include/linux/srcutiny.h                                      |   81
 b/include/linux/srcutree.h                                      |  171 +
 b/init/Kconfig                                                  |   33
 b/kernel/rcu/Makefile                                           |    6
 b/kernel/rcu/rcu.h                                              |  165 +
 b/kernel/rcu/rcu_segcblist.h                                    |  671 +++++
 b/kernel/rcu/rcutorture.c                                       |   39
 b/kernel/rcu/srcu.c                                             |  846 +++---
 b/kernel/rcu/srcutiny.c                                         |  215 +
 b/kernel/rcu/srcutree.c                                         | 1252 ++++++++--
 b/kernel/rcu/tiny.c                                             |   20
 b/kernel/rcu/tiny_plugin.h                                      |   13
 b/kernel/rcu/tree.c                                             |  650 +----
 b/kernel/rcu/tree.h                                             |  174 -
 b/kernel/rcu/tree_exp.h                                         |   25
 b/kernel/rcu/tree_plugin.h                                      |   70
 b/kernel/rcu/tree_trace.c                                       |   26
 b/kernel/rcu/update.c                                           |   52
 28 files changed, 4261 insertions(+), 2014 deletions(-)