LinuxLists.cc - [PATCH v2 00/12] plat-eznps upstream cont. set 2

2017-06-13 14:04:45

Subject: [PATCH v2 00/12] plat-eznps upstream cont. set 2

From: Noam Camus <[email protected]>

Changelog:
V1 -> V2
1) I added the "Handle memory error as an exception" patch from the previous set.
It now turns do_memory_error() into a weak symbol,
which the NPS400 platform then overrides to simply call die().
2) This set is now based on the arc-next branch.

Summary:
With this patch set I continue the effort of upstreaming the eznps platform for arch/arc.

It comprises a couple of patches from the last set that were not yet accepted,
patches for HW errata, and some misc extensions such as for HIGHMEM / NUMA.

This set has more generic ARC changes than the previous set.
The additional ifdefs seem unavoidable, although they may look ugly.
Let's see if we need to make this more elegant.

Elad Kanfi (1):
ARC: [plat-eznps] avoid toggling of DPC register

Liav Rehana (2):
ARC: [plat-eznps] Update the init sequence of aux regs per cpu.
ARC: [plat-eznps] handle dedicated AUX registers

Noam Camus (9):
ARC: [plat-eznps] Handle memory error as an exception
ARC: set level of log per CPU during boot to be debug level
ARC: send ipi to all cpus sharing task mm in case of page fault
ARC: Allow irq threading
ARC: Add CPU topology
ARC: Support more than one PGDIR for KVADDR
ARC: [NUMA] added CONFIG_NUMA for plat-eznps
ARC: [plat-eznps] new command line argument for HW scheduler at MTM
ARC: [plat-eznps] Save/Restore extra auxiliary registers

Documentation/admin-guide/kernel-parameters.txt | 9 ++
arch/arc/Kconfig | 48 +++++++++
arch/arc/include/asm/Kbuild | 1 -
arch/arc/include/asm/arcregs.h | 7 ++
arch/arc/include/asm/cacheflush.h | 3 +-
arch/arc/include/asm/entry-compact.h | 24 +++++
arch/arc/include/asm/highmem.h | 8 +-
arch/arc/include/asm/pgtable.h | 9 ++
arch/arc/include/asm/processor.h | 8 +-
arch/arc/include/asm/ptrace.h | 5 +
arch/arc/include/asm/switch_to.h | 11 ++
arch/arc/include/asm/topology.h | 40 +++++++
arch/arc/kernel/Makefile | 1 +
arch/arc/kernel/process.c | 4 +
arch/arc/kernel/setup.c | 13 ++-
arch/arc/kernel/smp.c | 9 ++-
arch/arc/kernel/topology.c | 125 +++++++++++++++++++++++
arch/arc/kernel/traps.c | 2 +-
arch/arc/mm/cache.c | 14 ++-
arch/arc/mm/fault.c | 8 ++
arch/arc/mm/highmem.c | 16 ++-
arch/arc/mm/init.c | 6 +
arch/arc/mm/tlb.c | 4 +-
arch/arc/mm/tlbex.S | 31 ++++++
arch/arc/plat-eznps/Kconfig | 23 ++++
arch/arc/plat-eznps/Makefile | 2 +-
arch/arc/plat-eznps/ctop.c | 33 ++++++
arch/arc/plat-eznps/entry.S | 2 +-
arch/arc/plat-eznps/include/plat/ctop.h | 1 +
arch/arc/plat-eznps/mtm.c | 72 +++++++++++++-
30 files changed, 510 insertions(+), 29 deletions(-)
create mode 100644 arch/arc/include/asm/topology.h
create mode 100644 arch/arc/kernel/topology.c
create mode 100644 arch/arc/plat-eznps/ctop.c

2017-06-13 14:04:14

by Noam Camus

Subject: [PATCH v2 02/12] ARC: set level of log per CPU during boot to be debug level

From: Noam Camus <[email protected]>

The reasons are:
1) Speeding up boot time, which becomes critical on machines with many CPUs,
e.g. NPS400 with 4K CPUs.
2) Shortening the kernel log at boot time, which makes it easier to scan on
large-scale machines such as NPS400.

Signed-off-by: Noam Camus <[email protected]>
---
arch/arc/kernel/setup.c | 6 +++---
arch/arc/kernel/smp.c | 4 ++--
arch/arc/mm/cache.c | 2 +-
arch/arc/mm/tlb.c | 2 +-
4 files changed, 7 insertions(+), 7 deletions(-)
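
As background for the two reasons above: the kernel prints a message on the console only when its level is numerically lower than the console loglevel, and pr_debug() is additionally compiled out unless DEBUG or dynamic debug is enabled. The following is a minimal user-space sketch of the console filter only. The helper names are hypothetical and are not kernel API; the numeric levels mirror KERN_INFO (6) and KERN_DEBUG (7).

```c
#include <stdbool.h>

/* Numeric values mirror the kernel's KERN_INFO and KERN_DEBUG levels. */
enum { LEVEL_INFO_SKETCH = 6, LEVEL_DEBUG_SKETCH = 7 };

/*
 * Hypothetical helper: a message reaches the console only if its level
 * is numerically lower than the console loglevel.
 */
bool reaches_console(int msg_level, int console_loglevel)
{
	return msg_level < console_loglevel;
}

/*
 * Hypothetical helper: number of per-CPU boot lines that end up on the
 * console for a machine with nr_cpus CPUs.
 */
unsigned int boot_lines_on_console(unsigned int nr_cpus,
				   unsigned int lines_per_cpu,
				   int msg_level, int console_loglevel)
{
	if (!reaches_console(msg_level, console_loglevel))
		return 0;
	return nr_cpus * lines_per_cpu;
}
```

With a console loglevel of 7, which is a common default, info-level lines are printed for every CPU while debug-level lines are filtered out.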

diff --git a/arch/arc/kernel/setup.c b/arch/arc/kernel/setup.c
index fc8211f..8494b31 100644
--- a/arch/arc/kernel/setup.c
+++ b/arch/arc/kernel/setup.c
@@ -385,13 +385,13 @@ void setup_processor(void)
read_arc_build_cfg_regs();
arc_init_IRQ();

- printk(arc_cpu_mumbojumbo(cpu_id, str, sizeof(str)));
+ pr_debug("%s", arc_cpu_mumbojumbo(cpu_id, str, sizeof(str)));

arc_mmu_init();
arc_cache_init();

- printk(arc_extn_mumbojumbo(cpu_id, str, sizeof(str)));
- printk(arc_platform_smp_cpuinfo());
+ pr_debug("%s", arc_extn_mumbojumbo(cpu_id, str, sizeof(str)));
+ pr_debug("%s", arc_platform_smp_cpuinfo());

arc_chk_core_config();
}
diff --git a/arch/arc/kernel/smp.c b/arch/arc/kernel/smp.c
index f462671..d1aa917 100644
--- a/arch/arc/kernel/smp.c
+++ b/arch/arc/kernel/smp.c
@@ -177,8 +177,8 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)

secondary_idle_tsk = idle;

- pr_info("Idle Task [%d] %p", cpu, idle);
- pr_info("Trying to bring up CPU%u ...\n", cpu);
+ pr_debug("Idle Task [%d] %p", cpu, idle);
+ pr_debug("Trying to bring up CPU%u ...\n", cpu);

if (plat_smp_ops.cpu_kick)
plat_smp_ops.cpu_kick(cpu,
diff --git a/arch/arc/mm/cache.c b/arch/arc/mm/cache.c
index a867575..7d3e79b 100644
--- a/arch/arc/mm/cache.c
+++ b/arch/arc/mm/cache.c
@@ -1188,7 +1188,7 @@ void __ref arc_cache_init(void)
unsigned int __maybe_unused cpu = smp_processor_id();
char str[256];

- printk(arc_cache_mumbojumbo(0, str, sizeof(str)));
+ pr_debug("%s", arc_cache_mumbojumbo(0, str, sizeof(str)));

/*
* Only master CPU needs to execute rest of function:
diff --git a/arch/arc/mm/tlb.c b/arch/arc/mm/tlb.c
index d0126fd..c5e70d8 100644
--- a/arch/arc/mm/tlb.c
+++ b/arch/arc/mm/tlb.c
@@ -814,7 +814,7 @@ void arc_mmu_init(void)
char str[256];
struct cpuinfo_arc_mmu *mmu = &cpuinfo_arc700[smp_processor_id()].mmu;

- printk(arc_mmu_mumbojumbo(0, str, sizeof(str)));
+ pr_debug("%s", arc_mmu_mumbojumbo(0, str, sizeof(str)));

/*
* Can't be done in processor.h due to header include depenedencies
--
1.7.1

2017-06-13 14:04:22

by Noam Camus

Subject: [PATCH v2 05/12] ARC: Add CPU topology

From: Noam Camus <[email protected]>

It is currently used by the NPS SoC, which has 256 cores
with 16 SMT HW threads per core.

With the topology in place, the scheduler is much more efficient at
creating scheduling domains and later using them.

Signed-off-by: Noam Camus <[email protected]>
---
arch/arc/Kconfig | 27 ++++++++
arch/arc/include/asm/Kbuild | 1 -
arch/arc/include/asm/topology.h | 34 +++++++++++
arch/arc/kernel/Makefile | 1 +
arch/arc/kernel/setup.c | 4 +-
arch/arc/kernel/smp.c | 5 ++
arch/arc/kernel/topology.c | 125 +++++++++++++++++++++++++++++++++++++++
7 files changed, 194 insertions(+), 3 deletions(-)
create mode 100644 arch/arc/include/asm/topology.h
create mode 100644 arch/arc/kernel/topology.c
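
To illustrate how a logical CPU number maps onto the topology, here is a minimal user-space sketch. It assumes the CPU id is laid out as cluster[11:8] | core[7:4] | thread[3:0]; that layout is an assumption made for this sketch (the real one is defined by struct global_id, which is not part of this patch). The core id formula matches store_cpu_topology(), and the sibling checks follow update_siblings_masks(). All function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of a logical CPU id: cluster[11:8] | core[7:4] | thread[3:0]. */
struct nps_cpu_pos {
	unsigned int cluster;
	unsigned int core;
	unsigned int thread;
};

struct nps_cpu_pos decode_cpu(uint32_t cpuid)
{
	struct nps_cpu_pos pos = {
		.cluster = (cpuid >> 8) & 0xF,
		.core    = (cpuid >> 4) & 0xF,
		.thread  = cpuid & 0xF,
	};
	return pos;
}

/* Same formula as store_cpu_topology(): a core id unique across clusters. */
unsigned int core_id_of(uint32_t cpuid)
{
	struct nps_cpu_pos pos = decode_cpu(cpuid);

	return (pos.cluster << 4) | pos.core;
}

/* Thread siblings share a core (and therefore a cluster). */
bool is_thread_sibling(uint32_t a, uint32_t b)
{
	return core_id_of(a) == core_id_of(b);
}

/* Core siblings share a cluster, as in update_siblings_masks(). */
bool is_core_sibling(uint32_t a, uint32_t b)
{
	return decode_cpu(a).cluster == decode_cpu(b).cluster;
}
```

Under this assumed layout, CPU 0x123 is thread 3 of core 2 in cluster 1, so its core id is 0x12.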

diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index f464f97..08a9003 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -202,6 +202,33 @@ config ARC_SMP_HALT_ON_RESET
at designated entry point. For other case, all jump to common
entry point and spin wait for Master's signal.

+config NPS_CPU_TOPOLOGY
+ bool "Support cpu topology definition"
+ depends on EZNPS_MTM_EXT
+ default y
+ help
+ Support NPS CPU topology definition.
+ NPS400 has 16 clusters of cores.
+ Each NPS400 cluster has 16 cores.
+ Each NPS core has 16 symmetrical threads.
+ In total there are 4096 such threads (NR_CPUS=4096).
+
+config SCHED_MC
+ bool "Multi-core scheduler support"
+ depends on NPS_CPU_TOPOLOGY
+ help
+ Multi-core scheduler support improves the CPU scheduler's decision
+ making when dealing with multi-core CPU chips at a cost of slightly
+ increased overhead in some places. If unsure say N here.
+
+config SCHED_SMT
+ bool "SMT scheduler support"
+ depends on NPS_CPU_TOPOLOGY
+ help
+ Improves the CPU scheduler's decision making when dealing with
+ MultiThreading at a cost of slightly increased overhead in some
+ places. If unsure say N here.
+
endif #SMP

config ARC_MCIP
diff --git a/arch/arc/include/asm/Kbuild b/arch/arc/include/asm/Kbuild
index 7bee4e4..d8cb607 100644
--- a/arch/arc/include/asm/Kbuild
+++ b/arch/arc/include/asm/Kbuild
@@ -43,7 +43,6 @@ generic-y += stat.h
generic-y += statfs.h
generic-y += termbits.h
generic-y += termios.h
-generic-y += topology.h
generic-y += trace_clock.h
generic-y += types.h
generic-y += ucontext.h
diff --git a/arch/arc/include/asm/topology.h b/arch/arc/include/asm/topology.h
new file mode 100644
index 0000000..a9be3f8
--- /dev/null
+++ b/arch/arc/include/asm/topology.h
@@ -0,0 +1,34 @@
+#ifndef _ASM_ARC_TOPOLOGY_H
+#define _ASM_ARC_TOPOLOGY_H
+
+#ifdef CONFIG_NPS_CPU_TOPOLOGY
+
+#include <linux/cpumask.h>
+
+struct cputopo_nps {
+ int thread_id;
+ int core_id;
+ cpumask_t thread_sibling;
+ cpumask_t core_sibling;
+};
+
+extern struct cputopo_nps cpu_topology[NR_CPUS];
+
+#define topology_core_id(cpu) (cpu_topology[cpu].core_id)
+#define topology_core_cpumask(cpu) (&cpu_topology[cpu].core_sibling)
+#define topology_sibling_cpumask(cpu) (&cpu_topology[cpu].thread_sibling)
+
+void init_cpu_topology(void);
+void store_cpu_topology(unsigned int cpuid);
+const struct cpumask *cpu_coregroup_mask(int cpu);
+
+#else
+
+static inline void init_cpu_topology(void) { }
+static inline void store_cpu_topology(unsigned int cpuid) { }
+
+#endif
+
+#include <asm-generic/topology.h>
+
+#endif /* _ASM_ARC_TOPOLOGY_H */
diff --git a/arch/arc/kernel/Makefile b/arch/arc/kernel/Makefile
index 8942c5c..46af80a 100644
--- a/arch/arc/kernel/Makefile
+++ b/arch/arc/kernel/Makefile
@@ -23,6 +23,7 @@ obj-$(CONFIG_ARC_EMUL_UNALIGNED) += unaligned.o
obj-$(CONFIG_KGDB) += kgdb.o
obj-$(CONFIG_ARC_METAWARE_HLINK) += arc_hostlink.o
obj-$(CONFIG_PERF_EVENTS) += perf_event.o
+obj-$(CONFIG_NPS_CPU_TOPOLOGY) += topology.o

obj-$(CONFIG_ARC_FPU_SAVE_RESTORE) += fpu.o
CFLAGS_fpu.o += -mdpfp
diff --git a/arch/arc/kernel/setup.c b/arch/arc/kernel/setup.c
index 8494b31..5256205 100644
--- a/arch/arc/kernel/setup.c
+++ b/arch/arc/kernel/setup.c
@@ -571,14 +571,14 @@ static void c_stop(struct seq_file *m, void *v)
.show = show_cpuinfo
};

-static DEFINE_PER_CPU(struct cpu, cpu_topology);
+static DEFINE_PER_CPU(struct cpu, cpu_topo_info);

static int __init topology_init(void)
{
int cpu;

for_each_present_cpu(cpu)
- register_cpu(&per_cpu(cpu_topology, cpu), cpu);
+ register_cpu(&per_cpu(cpu_topo_info, cpu), cpu);

return 0;
}
diff --git a/arch/arc/kernel/smp.c b/arch/arc/kernel/smp.c
index d1aa917..167a620 100644
--- a/arch/arc/kernel/smp.c
+++ b/arch/arc/kernel/smp.c
@@ -67,6 +67,9 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
{
int i;

+ init_cpu_topology();
+ store_cpu_topology(smp_processor_id());
+
/*
* if platform didn't set the present map already, do it now
* boot cpu is set to present already by init/main.c
@@ -151,6 +154,8 @@ void start_kernel_secondary(void)
if (machine_desc->init_per_cpu)
machine_desc->init_per_cpu(cpu);

+ store_cpu_topology(cpu);
+
notify_cpu_starting(cpu);
set_cpu_online(cpu, true);

diff --git a/arch/arc/kernel/topology.c b/arch/arc/kernel/topology.c
new file mode 100644
index 0000000..3feb7c9
--- /dev/null
+++ b/arch/arc/kernel/topology.c
@@ -0,0 +1,125 @@
+/*
+ * Copyright (C) 2015 Synopsys, Inc. (http://www.synopsys.com)
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/cpu.h>
+#include <linux/cpumask.h>
+#include <linux/export.h>
+#include <linux/init.h>
+#include <linux/percpu.h>
+#include <linux/node.h>
+#include <linux/nodemask.h>
+#include <linux/of.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/sched/topology.h>
+#include <plat/smp.h>
+
+/*
+ * cpu topology table
+ */
+struct cputopo_nps cpu_topology[NR_CPUS];
+EXPORT_SYMBOL_GPL(cpu_topology);
+
+const struct cpumask *cpu_coregroup_mask(int cpu)
+{
+ return &cpu_topology[cpu].core_sibling;
+}
+
+static void update_siblings_masks(unsigned int cpuid)
+{
+ struct cputopo_nps *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
+ int cpu;
+ struct global_id global_topo, global_id_topo;
+
+ global_id_topo.value = cpuid;
+
+ /* update core and thread sibling masks */
+ for_each_possible_cpu(cpu) {
+ cpu_topo = &cpu_topology[cpu];
+ global_topo.value = cpu;
+
+ if (global_id_topo.cluster != global_topo.cluster)
+ continue;
+
+ cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
+ if (cpu != cpuid)
+ cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
+
+ if (cpuid_topo->core_id != cpu_topo->core_id)
+ continue;
+
+ cpumask_set_cpu(cpuid, &cpu_topo->thread_sibling);
+ if (cpu != cpuid)
+ cpumask_set_cpu(cpu, &cpuid_topo->thread_sibling);
+ }
+
+ /* Do not proceed before masks are written */
+ smp_wmb();
+}
+
+/*
+ * store_cpu_topology is called at boot when only one cpu is running
+ * and with the mutex cpu_hotplug.lock locked, when several cpus have booted,
+ * which prevents simultaneous write access to cpu_topology array
+ */
+void store_cpu_topology(unsigned int cpuid)
+{
+ struct cputopo_nps *cpuid_topo = &cpu_topology[cpuid];
+ struct global_id gid;
+
+ /* If the cpu topology has been already set, just return */
+ if (cpuid_topo->core_id != -1)
+ return;
+
+ gid.value = cpuid;
+
+ cpuid_topo->thread_id = gid.thread;
+ cpuid_topo->core_id = ((gid.cluster << 4) | gid.core);
+
+ update_siblings_masks(cpuid);
+
+ pr_debug("CPU%u: thread %d, core %d\n",
+ cpuid, cpu_topology[cpuid].thread_id,
+ cpu_topology[cpuid].core_id);
+}
+
+static struct sched_domain_topology_level nps_topology[] = {
+#ifdef CONFIG_SCHED_SMT
+ { cpu_smt_mask, cpu_smt_flags, SD_INIT_NAME(SMT) },
+#endif
+#ifdef CONFIG_SCHED_MC
+ { cpu_coregroup_mask, cpu_core_flags, SD_INIT_NAME(MC) },
+#endif
+ { cpu_cpu_mask, SD_INIT_NAME(DIE) },
+ { NULL, },
+};
+
+/*
+ * init_cpu_topology is called at boot when only one cpu is running
+ * which prevent simultaneous write access to cpu_topology array
+ */
+void __init init_cpu_topology(void)
+{
+ unsigned int cpu;
+
+ /* init core mask */
+ for_each_possible_cpu(cpu) {
+ struct cputopo_nps *cpu_topo = &(cpu_topology[cpu]);
+
+ cpu_topo->thread_id = -1;
+ cpu_topo->core_id = -1;
+ cpumask_clear(&cpu_topo->core_sibling);
+ cpumask_clear(&cpu_topo->thread_sibling);
+ }
+
+ /* Do not proceed before masks are written */
+ smp_wmb();
+
+ /* Set scheduler topology descriptor */
+ set_sched_topology(nps_topology);
+}
--
1.7.1

2017-06-13 14:04:12

by Noam Camus

Subject: [PATCH v2 12/12] ARC: [plat-eznps] avoid toggling of DPC register

From: Elad Kanfi <[email protected]>

HW bug description: on a HW thread context switch,
the DPC configuration of the exiting thread is carried
one cycle into the next thread.
To avoid the consequences of this bug, the DPC register
is set to an initial value and never changed afterwards.

Signed-off-by: Elad Kanfi <[email protected]>
Signed-off-by: Noam Camus <[email protected]>
---
arch/arc/plat-eznps/include/plat/ctop.h | 1 +
arch/arc/plat-eznps/mtm.c | 12 ++++++++++++
2 files changed, 13 insertions(+), 0 deletions(-)
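
The "set once, never toggle" approach described above can be sketched in user space as follows. The bit positions and the register stub are hypothetical; the real field layout lives in struct nps_host_reg_aux_dpc and the real write goes through write_aux_reg(). The sketch builds the value from zero so that every bit other than the two enables is well defined.

```c
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this sketch. */
#define DPC_IEN_SKETCH (1u << 0)
#define DPC_MEN_SKETCH (1u << 1)

/* Stand-ins for the auxiliary register so the sketch runs in user space. */
uint32_t fake_dpc_reg;
unsigned int fake_dpc_writes;

void fake_write_dpc(uint32_t value)
{
	fake_dpc_reg = value;
	fake_dpc_writes++;
}

/*
 * Compose the initial value starting from zero, then write it exactly
 * once. Nothing else in the sketch touches the register afterwards.
 */
void dpc_init_once(void)
{
	uint32_t value = 0;

	value |= DPC_IEN_SKETCH;
	value |= DPC_MEN_SKETCH;
	fake_write_dpc(value);
}
```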

diff --git a/arch/arc/plat-eznps/include/plat/ctop.h b/arch/arc/plat-eznps/include/plat/ctop.h
index 7729d3d..0c7d110 100644
--- a/arch/arc/plat-eznps/include/plat/ctop.h
+++ b/arch/arc/plat-eznps/include/plat/ctop.h
@@ -39,6 +39,7 @@
#define CTOP_AUX_LOGIC_CORE_ID (CTOP_AUX_BASE + 0x018)
#define CTOP_AUX_MT_CTRL (CTOP_AUX_BASE + 0x020)
#define CTOP_AUX_HW_COMPLY (CTOP_AUX_BASE + 0x024)
+#define CTOP_AUX_DPC (CTOP_AUX_BASE + 0x02C)
#define CTOP_AUX_LPC (CTOP_AUX_BASE + 0x030)
#define CTOP_AUX_EFLAGS (CTOP_AUX_BASE + 0x080)
#define CTOP_AUX_IACK (CTOP_AUX_BASE + 0x088)
diff --git a/arch/arc/plat-eznps/mtm.c b/arch/arc/plat-eznps/mtm.c
index dd1ea1f..777231d 100644
--- a/arch/arc/plat-eznps/mtm.c
+++ b/arch/arc/plat-eznps/mtm.c
@@ -112,6 +112,18 @@ void mtm_enable_core(unsigned int cpu)
int i;
struct nps_host_reg_aux_mt_ctrl mt_ctrl;
struct nps_host_reg_mtm_cfg mtm_cfg;
+ struct nps_host_reg_aux_dpc dpc;
+
+ /*
+ * Initializing dpc register in each CPU.
+ * Overwriting the init value of the DPC
+ * register so that CMEM and FMT virtual address
+ * spaces are accessible, and Data Plane HW
+ * facilities are enabled.
+ */
+ dpc.ien = 1;
+ dpc.men = 1;
+ write_aux_reg(CTOP_AUX_DPC, dpc.value);

if (NPS_CPU_TO_THREAD_NUM(cpu) != 0)
return;
--
1.7.1

2017-06-13 14:04:47

by Noam Camus

Subject: [PATCH v2 01/12] ARC: [plat-eznps] Handle memory error as an exception

From: Noam Camus <[email protected]>

On ARC700, a user mode memory error is treated as an L2 interrupt, but NPS
hardware treats it as a Machine Check exception.

Address this by defining an NPS-specific bus error handler.

Signed-off-by: Noam Camus <[email protected]>
Signed-off-by: Elad Kanfi <[email protected]>
---
arch/arc/kernel/traps.c | 2 +-
arch/arc/plat-eznps/Kconfig | 12 ++++++++++++
arch/arc/plat-eznps/mtm.c | 11 +++++++++++
3 files changed, 24 insertions(+), 1 deletions(-)
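
The weak-symbol mechanism this patch relies on can be sketched outside the kernel as follows. This is a minimal illustration using the GCC/Clang `__attribute__((weak))` syntax (the kernel's `__weak` is a wrapper around it). The function names and return codes are hypothetical and only mimic the shape of do_memory_error().

```c
/* Return codes used only by this sketch. */
enum { HANDLED_BY_DEFAULT = 0, HANDLED_BY_PLATFORM = 1 };

/*
 * Generic default, marked weak. If another object file in the link
 * defines a non-weak do_memory_error_sketch(), the linker picks that
 * definition and this body is not used.
 */
__attribute__((weak)) int do_memory_error_sketch(unsigned long address)
{
	(void)address;
	return HANDLED_BY_DEFAULT;
}

/*
 * A platform file would carry the strong override, for example:
 *
 *   int do_memory_error_sketch(unsigned long address)
 *   {
 *           return HANDLED_BY_PLATFORM;
 *   }
 *
 * It is left as a comment because a weak and a strong definition of the
 * same function cannot live in one translation unit.
 */

int report_memory_error(unsigned long address)
{
	return do_memory_error_sketch(address);
}
```

Built as a single file, only the weak default exists, so the caller gets the default behavior.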

diff --git a/arch/arc/kernel/traps.c b/arch/arc/kernel/traps.c
index ff83e78..62675b9 100644
--- a/arch/arc/kernel/traps.c
+++ b/arch/arc/kernel/traps.c
@@ -80,7 +80,7 @@ int name(unsigned long address, struct pt_regs *regs) \
DO_ERROR_INFO(SIGILL, "Priv Op/Disabled Extn", do_privilege_fault, ILL_PRVOPC)
DO_ERROR_INFO(SIGILL, "Invalid Extn Insn", do_extension_fault, ILL_ILLOPC)
DO_ERROR_INFO(SIGILL, "Illegal Insn (or Seq)", insterror_is_error, ILL_ILLOPC)
-DO_ERROR_INFO(SIGBUS, "Invalid Mem Access", do_memory_error, BUS_ADRERR)
+DO_ERROR_INFO(SIGBUS, "Invalid Mem Access", __weak do_memory_error, BUS_ADRERR)
DO_ERROR_INFO(SIGTRAP, "Breakpoint Set", trap_is_brkpt, TRAP_BRKPT)
DO_ERROR_INFO(SIGBUS, "Misaligned Access", do_misaligned_error, BUS_ADRALN)

diff --git a/arch/arc/plat-eznps/Kconfig b/arch/arc/plat-eznps/Kconfig
index feaa471..fa25136 100644
--- a/arch/arc/plat-eznps/Kconfig
+++ b/arch/arc/plat-eznps/Kconfig
@@ -32,3 +32,15 @@ config EZNPS_MTM_EXT
any of them seem like CPU from Linux point of view.
All threads within same core share the execution unit of the
core and HW scheduler round robin between them.
+
+config EZNPS_MEM_ERROR
+ bool "ARC-EZchip Memory error as an exception"
+ depends on EZNPS_MTM_EXT
+ default n
+ help
+ On real NPS silicon, user memory errors are handled as a
+ machine check exception, whereas on the NPS simulator platform
+ they are handled as a level 2 interrupt (as on legacy ARC
+ chips). This option makes the kernel handle a memory error
+ like a machine check exception: instead of sending SIGBUS,
+ it panics the system.
diff --git a/arch/arc/plat-eznps/mtm.c b/arch/arc/plat-eznps/mtm.c
index e0cb36b..59a0162 100644
--- a/arch/arc/plat-eznps/mtm.c
+++ b/arch/arc/plat-eznps/mtm.c
@@ -25,6 +25,17 @@
#define MT_CTRL_ST_CNT 0xF
#define NPS_NUM_HW_THREADS 0x10

+#ifdef CONFIG_EZNPS_MEM_ERROR
+int do_memory_error(unsigned long address, struct pt_regs *regs)
+{
+ char *str = "Invalid Mem Access";
+
+ die(str, regs, address);
+
+ return 1;
+}
+#endif
+
static void mtm_init_nat(int cpu)
{
struct nps_host_reg_mtm_cfg mtm_cfg;
--
1.7.1

2017-06-13 14:04:51

by Noam Camus

Subject: [PATCH v2 07/12] ARC: [NUMA] added CONFIG_NUMA for plat-eznps

From: Noam Camus <[email protected]>

This is needed for NPS400, where high memory is assigned to node1
even though its addresses are lower than those of node0.
This use case is not typical, and using discontigmem alone is not enough
because nodes are assumed to have increasing address ranges,
i.e. the address range of node0 is assumed to be lower than that of node1.

Signed-off-by: Noam Camus <[email protected]>
---
arch/arc/Kconfig | 9 +++++++++
arch/arc/include/asm/topology.h | 6 ++++++
arch/arc/kernel/setup.c | 3 +++
arch/arc/mm/init.c | 6 ++++++
4 files changed, 24 insertions(+), 0 deletions(-)
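
The ordering problem described above can be illustrated with a small user-space sketch. The addresses and sizes are hypothetical, picked only to show a layout in which node 1 sits below node 0; the lookup goes by explicit range rather than assuming that node ids grow with addresses.

```c
#include <stddef.h>
#include <stdint.h>

struct mem_range {
	uint64_t start;	/* first byte of the range */
	uint64_t size;	/* length in bytes */
	int nid;	/* owning node id */
};

/* Hypothetical layout: node 1 (highmem) sits below node 0 (normal memory). */
const struct mem_range ranges_sketch[] = {
	{ .start = 0x80000000ULL, .size = 0x40000000ULL, .nid = 0 },
	{ .start = 0x00000000ULL, .size = 0x40000000ULL, .nid = 1 },
};

/*
 * Look the address up by range. Returns the owning node id, or -1 when
 * no node owns the address.
 */
int addr_to_nid(uint64_t addr)
{
	size_t i;

	for (i = 0; i < sizeof(ranges_sketch) / sizeof(ranges_sketch[0]); i++) {
		const struct mem_range *r = &ranges_sketch[i];

		if (addr >= r->start && addr - r->start < r->size)
			return r->nid;
	}
	return -1;
}
```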

diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index 982bd18..18c37de 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -378,6 +378,15 @@ config ARC_HUGEPAGE_16M

endchoice

+config NUMA
+ bool "NUMA Memory Allocation and Scheduler Support"
+ depends on SMP && DISCONTIGMEM
+ default y if ARC_PLAT_EZNPS
+ ---help---
+ NUMA memory allocation is required for NPS400 processors.
+ The reason is that node1 in NPS400 is assigned to lower
+ addresses than node0, which is not a typical scenario.
+
config NODES_SHIFT
int "Maximum NUMA Nodes (as a power of 2)"
default "0" if !DISCONTIGMEM
diff --git a/arch/arc/include/asm/topology.h b/arch/arc/include/asm/topology.h
index a9be3f8..dfbc2ab 100644
--- a/arch/arc/include/asm/topology.h
+++ b/arch/arc/include/asm/topology.h
@@ -1,6 +1,12 @@
#ifndef _ASM_ARC_TOPOLOGY_H
#define _ASM_ARC_TOPOLOGY_H

+#ifdef CONFIG_NUMA
+#define cpu_to_node(cpu) ((void)(cpu), 0)
+#define parent_node(node) (node)
+#define cpumask_of_node(node) ((void)node, cpu_online_mask)
+#endif
+
#ifdef CONFIG_NPS_CPU_TOPOLOGY

#include <linux/cpumask.h>
diff --git a/arch/arc/kernel/setup.c b/arch/arc/kernel/setup.c
index 5256205..5f04635 100644
--- a/arch/arc/kernel/setup.c
+++ b/arch/arc/kernel/setup.c
@@ -577,6 +577,9 @@ static int __init topology_init(void)
{
int cpu;

+ for_each_online_node(cpu)
+ register_one_node(cpu);
+
for_each_present_cpu(cpu)
register_cpu(&per_cpu(cpu_topo_info, cpu), cpu);

diff --git a/arch/arc/mm/init.c b/arch/arc/mm/init.c
index 8c9415e..f9f80d9 100644
--- a/arch/arc/mm/init.c
+++ b/arch/arc/mm/init.c
@@ -113,6 +113,10 @@ void __init setup_arch_memory(void)
init_mm.end_data = (unsigned long)_edata;
init_mm.brk = (unsigned long)_end;

+ node_set_online(0);
+ node_set_state(0, N_MEMORY);
+ node_set_state(0, N_NORMAL_MEMORY);
+
/* first page of system - kernel .vector starts here */
min_low_pfn = ARCH_PFN_OFFSET;

@@ -182,6 +186,8 @@ void __init setup_arch_memory(void)
* populated with normal memory zone while node 1 only has highmem
*/
node_set_online(1);
+ node_set_state(1, N_MEMORY);
+ node_set_state(1, N_HIGH_MEMORY);

min_high_pfn = PFN_DOWN(high_mem_start);
max_high_pfn = PFN_DOWN(high_mem_start + high_mem_sz);
--
1.7.1

2017-06-13 14:04:53

by Noam Camus

Subject: [PATCH v2 04/12] ARC: Allow irq threading

From: Noam Camus <[email protected]>

While working with NPS400 we noticed that L1 interrupts can nest
deeply enough to exhaust the kernel stack.
The scenario involves serving invoke_softirq() from irq_exit():
once local_irq_enable() is called, another interrupt can hit before we
have finished restoring the previous one and freed its kernel stack space.

Serving softirqs in a dedicated kernel thread may mitigate this.
Many architectures, including x86, already behave this way.

Note 1: All interrupts which must not be threaded
should be marked IRQF_NO_THREAD.
Note 2: The kernel parameter "threadirqs" is needed to actually
turn this on. This configuration is only a preparation.

Signed-off-by: Noam Camus <[email protected]>
---
arch/arc/Kconfig | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index a545969..f464f97 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -33,6 +33,7 @@ config ARC
select HAVE_OPROFILE
select HAVE_PERF_EVENTS
select HANDLE_DOMAIN_IRQ
+ select IRQ_FORCED_THREADING
select IRQ_DOMAIN
select MODULES_USE_ELF_RELA
select NO_BOOTMEM
--
1.7.1

2017-06-13 14:04:52

by Noam Camus

Subject: [PATCH v2 11/12] ARC: [plat-eznps] handle dedicated AUX registers

From: Liav Rehana <[email protected]>

Preserve the EFLAGS and GPA1 auxiliary registers across exceptions
and interrupts. These registers are used by the compare-exchange
instructions: GPA1 holds the compare value, and EFLAGS has a bit that
reflects the result of the atomic operation.

EFLAGS is zeroed for each new user task so that it does not inherit
its parent's value.

Signed-off-by: Noam Camus <[email protected]>
---
arch/arc/include/asm/entry-compact.h | 24 ++++++++++++++++++++++++
arch/arc/include/asm/ptrace.h | 5 +++++
arch/arc/kernel/process.c | 4 ++++
3 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/arch/arc/include/asm/entry-compact.h b/arch/arc/include/asm/entry-compact.h
index 14c310f..9e4458a 100644
--- a/arch/arc/include/asm/entry-compact.h
+++ b/arch/arc/include/asm/entry-compact.h
@@ -192,6 +192,12 @@
PUSHAX lp_start
PUSHAX erbta

+#ifdef CONFIG_ARC_PLAT_EZNPS
+ .word CTOP_INST_SCHD_RW
+ PUSHAX CTOP_AUX_GPA1
+ PUSHAX CTOP_AUX_EFLAGS
+#endif
+`
lr r9, [ecr]
st r9, [sp, PT_event] /* EV_Trap expects r9 to have ECR */
.endm
@@ -208,6 +214,12 @@
* by hardware and that is not good.
*-------------------------------------------------------------*/
.macro EXCEPTION_EPILOGUE
+#ifdef CONFIG_ARC_PLAT_EZNPS
+ .word CTOP_INST_SCHD_RW
+ POPAX CTOP_AUX_EFLAGS
+ POPAX CTOP_AUX_GPA1
+#endif
+
POPAX erbta
POPAX lp_start
POPAX lp_end
@@ -265,6 +277,12 @@
PUSHAX lp_end
PUSHAX lp_start
PUSHAX bta_l\LVL\()
+
+#ifdef CONFIG_ARC_PLAT_EZNPS
+ .word CTOP_INST_SCHD_RW
+ PUSHAX CTOP_AUX_GPA1
+ PUSHAX CTOP_AUX_EFLAGS
+#endif
.endm

/*--------------------------------------------------------------
@@ -277,6 +295,12 @@
* by hardware and that is not good.
*-------------------------------------------------------------*/
.macro INTERRUPT_EPILOGUE LVL
+#ifdef CONFIG_ARC_PLAT_EZNPS
+ .word CTOP_INST_SCHD_RW
+ POPAX CTOP_AUX_EFLAGS
+ POPAX CTOP_AUX_GPA1
+#endif
+
POPAX bta_l\LVL\()
POPAX lp_start
POPAX lp_end
diff --git a/arch/arc/include/asm/ptrace.h b/arch/arc/include/asm/ptrace.h
index 5297faa..5a8cb22 100644
--- a/arch/arc/include/asm/ptrace.h
+++ b/arch/arc/include/asm/ptrace.h
@@ -19,6 +19,11 @@
#ifdef CONFIG_ISA_ARCOMPACT
struct pt_regs {

+#ifdef CONFIG_ARC_PLAT_EZNPS
+ unsigned long eflags; /* Extended FLAGS */
+ unsigned long gpa1; /* General Purpose Aux */
+#endif
+
/* Real registers */
unsigned long bta; /* bta_l1, bta_l2, erbta */

diff --git a/arch/arc/kernel/process.c b/arch/arc/kernel/process.c
index 5c631a1..5ac3b54 100644
--- a/arch/arc/kernel/process.c
+++ b/arch/arc/kernel/process.c
@@ -234,6 +234,10 @@ void start_thread(struct pt_regs * regs, unsigned long pc, unsigned long usp)
*/
regs->status32 = STATUS_U_MASK | STATUS_L_MASK | ISA_INIT_STATUS_BITS;

+#ifdef CONFIG_EZNPS_MTM_EXT
+ regs->eflags = 0;
+#endif
+
/* bogus seed values for debugging */
regs->lp_start = 0x10;
regs->lp_end = 0x80;
--
1.7.1

2017-06-13 14:04:49

by Noam Camus

Subject: [PATCH v2 03/12] ARC: send ipi to all cpus sharing task mm in case of page fault

From: Noam Camus <[email protected]>

This patch is motivated by a performance issue.
The use case is a page fault on a mapping whose mm is in use on more
than the local CPU.
Broadcasting to all CPUs degrades performance,
so we avoid it by sending the IPI only to the relevant CPUs.

Signed-off-by: Noam Camus <[email protected]>
Reviewed-by: Alexey Brodkin <[email protected]>
---
arch/arc/include/asm/cacheflush.h | 3 ++-
arch/arc/mm/cache.c | 12 ++++++++++--
arch/arc/mm/tlb.c | 2 +-
3 files changed, 13 insertions(+), 4 deletions(-)
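
The idea of targeting only the CPUs that share the mm, rather than broadcasting, can be sketched in user space as below. A 64-bit word stands in for the cpumask, and run_on_mask() is a hypothetical stand-in for on_each_cpu_mask(); it calls the function directly instead of sending IPIs.

```c
#include <stdint.h>

typedef void (*cpu_fn_sketch)(unsigned int cpu, void *info);

/*
 * Hypothetical stand-in for on_each_cpu_mask(): run fn once for every
 * CPU whose bit is set in mask. Returns the number of CPUs targeted.
 */
unsigned int run_on_mask(uint64_t mask, cpu_fn_sketch fn, void *info)
{
	unsigned int cpu, targeted = 0;

	for (cpu = 0; cpu < 64; cpu++) {
		if (mask & (1ULL << cpu)) {
			fn(cpu, info);
			targeted++;
		}
	}
	return targeted;
}

/* Example callback: record which CPUs were asked to invalidate. */
void note_invalidate(unsigned int cpu, void *info)
{
	uint64_t *seen = info;

	*seen |= 1ULL << cpu;
}
```

With a mask that has only CPUs 0 and 2 set, two CPUs are targeted and the rest are left undisturbed.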

diff --git a/arch/arc/include/asm/cacheflush.h b/arch/arc/include/asm/cacheflush.h
index fc662f4..716dba1 100644
--- a/arch/arc/include/asm/cacheflush.h
+++ b/arch/arc/include/asm/cacheflush.h
@@ -33,7 +33,8 @@

void flush_icache_range(unsigned long kstart, unsigned long kend);
void __sync_icache_dcache(phys_addr_t paddr, unsigned long vaddr, int len);
-void __inv_icache_page(phys_addr_t paddr, unsigned long vaddr);
+void __inv_icache_page(struct vm_area_struct *vma,
+ phys_addr_t paddr, unsigned long vaddr);
void __flush_dcache_page(phys_addr_t paddr, unsigned long vaddr);

#define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
diff --git a/arch/arc/mm/cache.c b/arch/arc/mm/cache.c
index 7d3e79b..e1ea57f 100644
--- a/arch/arc/mm/cache.c
+++ b/arch/arc/mm/cache.c
@@ -934,9 +934,17 @@ void __sync_icache_dcache(phys_addr_t paddr, unsigned long vaddr, int len)
}

/* wrapper to compile time eliminate alignment checks in flush loop */
-void __inv_icache_page(phys_addr_t paddr, unsigned long vaddr)
+void __inv_icache_page(struct vm_area_struct *vma,
+ phys_addr_t paddr, unsigned long vaddr)
{
- __ic_line_inv_vaddr(paddr, vaddr, PAGE_SIZE);
+ struct ic_inv_args ic_inv = {
+ .paddr = paddr,
+ .vaddr = vaddr,
+ .sz = PAGE_SIZE
+ };
+
+ on_each_cpu_mask(mm_cpumask(vma->vm_mm),
+ __ic_line_inv_vaddr_helper, &ic_inv, 1);
}

/*
diff --git a/arch/arc/mm/tlb.c b/arch/arc/mm/tlb.c
index c5e70d8..a095608 100644
--- a/arch/arc/mm/tlb.c
+++ b/arch/arc/mm/tlb.c
@@ -626,7 +626,7 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long vaddr_unaligned,

/* invalidate any existing icache lines (U-mapping) */
if (vma->vm_flags & VM_EXEC)
- __inv_icache_page(paddr, vaddr);
+ __inv_icache_page(vma, paddr, vaddr);
}
}
}
--
1.7.1

2017-06-13 14:04:48

by Noam Camus

Subject: [PATCH v2 06/12] ARC: Support more than one PGDIR for KVADDR

From: Noam Camus <[email protected]>

This way FIXMAP can have 2 PTEs per CPU even for
NR_CPUS=4096.

For an extreme case such as the eznps platform, we use
the whole gutter between kernel and user.

Signed-off-by: Noam Camus <[email protected]>
---
arch/arc/Kconfig | 11 +++++++++++
arch/arc/include/asm/highmem.h | 8 +++++---
arch/arc/include/asm/pgtable.h | 9 +++++++++
arch/arc/include/asm/processor.h | 5 +++--
arch/arc/mm/fault.c | 8 ++++++++
arch/arc/mm/highmem.c | 16 +++++++++++-----
arch/arc/mm/tlbex.S | 31 +++++++++++++++++++++++++++++++
7 files changed, 78 insertions(+), 10 deletions(-)
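
The sizing behind the default of 5 for HIGHMEM_PGDS_SHIFT can be worked through with a short sketch. The page size (8K) and the PGD span (2M) are taken from the Kconfig help text in this patch; the helper names are hypothetical.

```c
/* Values from the Kconfig help text: 8K pages, 2M covered per PGD entry. */
#define PAGE_SHIFT_SKETCH	13
#define PGDIR_SHIFT_SKETCH	21
#define PTES_PER_PGD_SKETCH	(1u << (PGDIR_SHIFT_SKETCH - PAGE_SHIFT_SKETCH))

/* PGD entries needed so that every CPU gets ptes_per_cpu slots (rounded up). */
unsigned int pgds_needed(unsigned int nr_cpus, unsigned int ptes_per_cpu)
{
	unsigned int ptes = nr_cpus * ptes_per_cpu;

	return (ptes + PTES_PER_PGD_SKETCH - 1) / PTES_PER_PGD_SKETCH;
}

/* Smallest shift such that (1 << shift) >= pgds. */
unsigned int pgds_shift(unsigned int pgds)
{
	unsigned int shift = 0;

	while ((1u << shift) < pgds)
		shift++;
	return shift;
}
```

For NR_CPUS=4096 and 2 PTEs per CPU this gives 8192 PTEs, i.e. 32 PGDs and a shift of 5, while a small system fits in a single PGD with a shift of 0.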

diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index 08a9003..982bd18 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -477,6 +477,17 @@ config ARC_HAS_PAE40
Enable access to physical memory beyond 4G, only supported on
ARC cores with 40 bit Physical Addressing support

+config HIGHMEM_PGDS_SHIFT
+ int "log num of PGDs for HIGHMEM"
+ range 0 5
+ default "0" if !ARC_PLAT_EZNPS || !HIGHMEM
+ default "5" if ARC_PLAT_EZNPS
+ help
+ This way we can map more pages for HIGHMEM.
+ A single PGD (2M) supports 256 PTEs (8K PAGE_SIZE).
+ For FIXMAP, where at least 2 PTEs are needed per CPU,
+ a large NR_CPUS, e.g. 4096, will consume 32 PGDs.
+
config ARCH_PHYS_ADDR_T_64BIT
def_bool ARC_HAS_PAE40

diff --git a/arch/arc/include/asm/highmem.h b/arch/arc/include/asm/highmem.h
index b1585c9..c5cb473 100644
--- a/arch/arc/include/asm/highmem.h
+++ b/arch/arc/include/asm/highmem.h
@@ -17,13 +17,13 @@

/* start after vmalloc area */
#define FIXMAP_BASE (PAGE_OFFSET - FIXMAP_SIZE - PKMAP_SIZE)
-#define FIXMAP_SIZE PGDIR_SIZE /* only 1 PGD worth */
-#define KM_TYPE_NR ((FIXMAP_SIZE >> PAGE_SHIFT)/NR_CPUS)
+#define FIXMAP_SIZE (PGDIR_SIZE * _BITUL(CONFIG_HIGHMEM_PGDS_SHIFT))
+#define KM_TYPE_NR (((FIXMAP_SIZE >> PAGE_SHIFT)/NR_CPUS) > 2 ? ((FIXMAP_SIZE >> PAGE_SHIFT)/NR_CPUS) : 2)
#define FIXMAP_ADDR(nr) (FIXMAP_BASE + ((nr) << PAGE_SHIFT))

/* start after fixmap area */
#define PKMAP_BASE (FIXMAP_BASE + FIXMAP_SIZE)
-#define PKMAP_SIZE PGDIR_SIZE
+#define PKMAP_SIZE (PGDIR_SIZE * _BITUL(CONFIG_HIGHMEM_PGDS_SHIFT))
#define LAST_PKMAP (PKMAP_SIZE >> PAGE_SHIFT)
#define LAST_PKMAP_MASK (LAST_PKMAP - 1)
#define PKMAP_ADDR(nr) (PKMAP_BASE + ((nr) << PAGE_SHIFT))
@@ -32,6 +32,7 @@
#define kmap_prot PAGE_KERNEL

+#ifndef __ASSEMBLY__
#include <asm/cacheflush.h>

extern void *kmap(struct page *page);
@@ -54,6 +55,7 @@ static inline void kunmap(struct page *page)
return;
kunmap_high(page);
}
+#endif /* __ASSEMBLY__ */

#endif
diff --git a/arch/arc/include/asm/pgtable.h b/arch/arc/include/asm/pgtable.h
index 08fe338..d08e207 100644
--- a/arch/arc/include/asm/pgtable.h
+++ b/arch/arc/include/asm/pgtable.h
@@ -224,6 +224,8 @@
#define PTRS_PER_PTE _BITUL(BITS_FOR_PTE)
#define PTRS_PER_PGD _BITUL(BITS_FOR_PGD)

+#define PTRS_HMEM_PTE _BITUL(BITS_FOR_PTE + CONFIG_HIGHMEM_PGDS_SHIFT)
+
/*
* Number of entries a user land program use.
* TASK_SIZE is the maximum vaddr that can be used by a userland program.
@@ -285,7 +287,14 @@ static inline void pmd_set(pmd_t *pmdp, pte_t *ptep)

/* Don't use virt_to_pfn for macros below: could cause truncations for PAE40*/
#define pte_pfn(pte) (pte_val(pte) >> PAGE_SHIFT)
+#if CONFIG_HIGHMEM_PGDS_SHIFT
+#define __pte_index(addr) (((addr) >= VMALLOC_END) ? \
+ (((addr) >> PAGE_SHIFT) & (PTRS_HMEM_PTE - 1)) \
+ : \
+ (((addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)))
+#else
#define __pte_index(addr) (((addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))
+#endif

/*
* pte_offset gets a @ptr to PMD entry (PGD in our 2-tier paging system)
diff --git a/arch/arc/include/asm/processor.h b/arch/arc/include/asm/processor.h
index 6e1242d..fd7bdfa 100644
--- a/arch/arc/include/asm/processor.h
+++ b/arch/arc/include/asm/processor.h
@@ -121,8 +121,9 @@ extern void start_thread(struct pt_regs * regs, unsigned long pc,

#define VMALLOC_START (PAGE_OFFSET - (CONFIG_ARC_KVADDR_SIZE << 20))

-/* 1 PGDIR_SIZE each for fixmap/pkmap, 2 PGDIR_SIZE gutter (see asm/highmem.h) */
-#define VMALLOC_SIZE ((CONFIG_ARC_KVADDR_SIZE << 20) - PGDIR_SIZE * 4)
+/* 1 << CONFIG_HIGHMEM_PGDS_SHIFT PGDIR_SIZE each for fixmap/pkmap */
+#define VMALLOC_SIZE ((CONFIG_ARC_KVADDR_SIZE << 20) - \
+ PGDIR_SIZE * _BITUL(CONFIG_HIGHMEM_PGDS_SHIFT) * 2)

#define VMALLOC_END (VMALLOC_START + VMALLOC_SIZE)

diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index a0b7bd6..fd89c9a 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -17,6 +17,7 @@
#include <linux/perf_event.h>
#include <asm/pgalloc.h>
#include <asm/mmu.h>
+#include <asm/highmem.h>

/*
* kernel virtual address is required to implement vmalloc/pkmap/fixmap
@@ -35,6 +36,13 @@ noinline static int handle_kernel_vaddr_fault(unsigned long address)
pud_t *pud, *pud_k;
pmd_t *pmd, *pmd_k;

+#if defined(CONFIG_HIGHMEM) && (CONFIG_HIGHMEM_PGDS_SHIFT)
+ if (address > FIXMAP_BASE && address < (FIXMAP_BASE + FIXMAP_SIZE))
+ address = FIXMAP_BASE;
+ else if (address > PKMAP_BASE && address < (PKMAP_BASE + PKMAP_SIZE))
+ address = PKMAP_BASE;
+#endif
+
pgd = pgd_offset_fast(current->active_mm, address);
pgd_k = pgd_offset_k(address);

diff --git a/arch/arc/mm/highmem.c b/arch/arc/mm/highmem.c
index 77ff64a..1d4804d 100644
--- a/arch/arc/mm/highmem.c
+++ b/arch/arc/mm/highmem.c
@@ -112,7 +112,8 @@ void __kunmap_atomic(void *kv)
}
EXPORT_SYMBOL(__kunmap_atomic);

-static noinline pte_t * __init alloc_kmap_pgtable(unsigned long kvaddr)
+static noinline pte_t * __init alloc_kmap_pgtable(unsigned long kvaddr,
+ unsigned long pgnum)
{
pgd_t *pgd_k;
pud_t *pud_k;
@@ -123,19 +124,24 @@ void __kunmap_atomic(void *kv)
pud_k = pud_offset(pgd_k, kvaddr);
pmd_k = pmd_offset(pud_k, kvaddr);

- pte_k = (pte_t *)alloc_bootmem_low_pages(PAGE_SIZE);
+ pte_k = (pte_t *)alloc_bootmem_low_pages(pgnum * PAGE_SIZE);
pmd_populate_kernel(&init_mm, pmd_k, pte_k);
return pte_k;
}

void __init kmap_init(void)
{
+ unsigned int pgnum;
+
/* Due to recursive include hell, we can't do this in processor.h */
BUILD_BUG_ON(PAGE_OFFSET < (VMALLOC_END + FIXMAP_SIZE + PKMAP_SIZE));

BUILD_BUG_ON(KM_TYPE_NR > PTRS_PER_PTE);
- pkmap_page_table = alloc_kmap_pgtable(PKMAP_BASE);
+ pgnum = DIV_ROUND_UP(PKMAP_SIZE, PAGE_SIZE * PTRS_PER_PTE);
+ pkmap_page_table = alloc_kmap_pgtable(PKMAP_BASE, pgnum);

- BUILD_BUG_ON(LAST_PKMAP > PTRS_PER_PTE);
- fixmap_page_table = alloc_kmap_pgtable(FIXMAP_BASE);
+ BUILD_BUG_ON(LAST_PKMAP > (PTRS_PER_PTE *
+ _BITUL(CONFIG_HIGHMEM_PGDS_SHIFT)));
+ pgnum = DIV_ROUND_UP(FIXMAP_SIZE, PAGE_SIZE * PTRS_PER_PTE);
+ fixmap_page_table = alloc_kmap_pgtable(FIXMAP_BASE, pgnum);
}
diff --git a/arch/arc/mm/tlbex.S b/arch/arc/mm/tlbex.S
index 0e1e47a..e21aecc 100644
--- a/arch/arc/mm/tlbex.S
+++ b/arch/arc/mm/tlbex.S
@@ -43,6 +43,7 @@
#include <asm/cache.h>
#include <asm/processor.h>
#include <asm/tlb-mmu1.h>
+#include <asm/highmem.h>

#ifdef CONFIG_ISA_ARCOMPACT
;-----------------------------------------------------------------
@@ -204,6 +205,12 @@ ex_saved_reg1:
ld r1, [r1, MM_PGD]
#endif

+#if defined(CONFIG_HIGHMEM) && defined(CONFIG_HIGHMEM_PGDS_SHIFT)
+ ; handle pkmap/fixmap with more than one pte table
+ cmp_s r2, VMALLOC_END
+ b.hs 4f
+#endif
+
lsr r0, r2, PGDIR_SHIFT ; Bits for indexing into PGD
ld.as r3, [r1, r0] ; PGD entry corresp to faulting addr
tst r3, r3
@@ -237,6 +244,30 @@ ex_saved_reg1:

2:

+#if defined(CONFIG_HIGHMEM) && defined(CONFIG_HIGHMEM_PGDS_SHIFT)
+ b 6f
+
+4:
+ lsr r0, r2, PGDIR_SHIFT ; Bits for indexing into KMAP_PGD
+ and r0, r0, ~(_BITUL(CONFIG_HIGHMEM_PGDS_SHIFT) - 1)
+ ld.as r1, [r1, r0] ; PGD entry corresp to faulting addr
+ tst r1, r1
+ bz do_slow_path_pf ; if no Page Table, do page fault
+ and r1, r1, PAGE_MASK
+
+ cmp_s r2, PKMAP_BASE
+ mov.hs r0, ( PKMAP_BASE >> PAGE_SHIFT )
+ b.hs 5f
+ mov r0, ( FIXMAP_BASE >> PAGE_SHIFT )
+
+5:
+ lsr r3, r2, PAGE_SHIFT
+ sub r0, r3, r0
+ asl r0, r0, PTE_SIZE_LOG
+ ld.aw r0, [r1, r0]
+6:
+#endif
+
.endm

;-----------------------------------------------------------------
--
1.7.1
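
For readers following the index arithmetic in this patch, here is a minimal user-space sketch of the extended `__pte_index()` selection and of the `pgnum` computation in `kmap_init()`. All `DEMO_*` names and values (8K pages, 8 PTE index bits, a shift of 1, and the boundary address) are illustrative assumptions, not values taken from the patch or from a real configuration.

```c
#include <assert.h>

/* Illustrative constants only; real values come from the kernel config. */
#define DEMO_PAGE_SHIFT    13            /* assumed 8K pages */
#define DEMO_BITS_FOR_PTE  8             /* assumed PTE index width */
#define DEMO_PGDS_SHIFT    1             /* stands in for CONFIG_HIGHMEM_PGDS_SHIFT */
#define DEMO_VMALLOC_END   0x7f800000UL  /* hypothetical boundary address */

#define DEMO_PAGE_SIZE     (1UL << DEMO_PAGE_SHIFT)
#define DEMO_PTRS_PER_PTE  (1UL << DEMO_BITS_FOR_PTE)
#define DEMO_PTRS_HMEM_PTE (1UL << (DEMO_BITS_FOR_PTE + DEMO_PGDS_SHIFT))
#define DEMO_PGDIR_SIZE    (DEMO_PAGE_SIZE * DEMO_PTRS_PER_PTE)
#define DEMO_PKMAP_SIZE    (DEMO_PGDIR_SIZE << DEMO_PGDS_SHIFT)

/* Mirrors the patched __pte_index(): addresses at or above the boundary
 * index into a larger, contiguous group of PTE tables. */
unsigned long demo_pte_index(unsigned long addr)
{
	unsigned long entries = (addr >= DEMO_VMALLOC_END) ?
				DEMO_PTRS_HMEM_PTE : DEMO_PTRS_PER_PTE;

	/* The mask trick below only works for power-of-two table sizes. */
	assert((entries & (entries - 1)) == 0);
	return (addr >> DEMO_PAGE_SHIFT) & (entries - 1);
}

/* Mirrors DIV_ROUND_UP(size, PAGE_SIZE * PTRS_PER_PTE) from kmap_init(). */
unsigned long demo_pgnum(unsigned long region_size)
{
	unsigned long span = DEMO_PAGE_SIZE * DEMO_PTRS_PER_PTE;

	return (region_size + span - 1) / span;
}
```

With these assumed values, an address 257 pages past the boundary keeps index 257 instead of wrapping to 1, which is the effect the wider mask is meant to have.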

2017-06-13 14:06:26

by Noam Camus

Subject: [PATCH v2 08/12] ARC: [plat-eznps] new command line argument for HW scheduler at MTM

From: Noam Camus <[email protected]>

Add the ability for all cores of the NPS SoC to control the number of cycles
a HW thread can execute before it is replaced with another eligible
HW thread within the same core. The replacement is done by the
HW scheduler.

Signed-off-by: Noam Camus <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 9 ++++
arch/arc/plat-eznps/mtm.c | 49 ++++++++++++++++++++++-
2 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 15f79c2..5b551f7 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2693,6 +2693,15 @@
If the dependencies are under your control, you can
turn on cpu0_hotplug.

+ nps_mtm_hs_ctr= [KNL,ARC]
+ This parameter sets the maximum duration, in
+ cycles, each HW thread of the CTOP can run
+ without interruptions, before HW switches it.
+ The actual maximum duration is 16 times this
+ parameter's value.
+ Format: integer between 1 and 255
+ Default: 255
+
nptcg= [IA-64] Override max number of concurrent global TLB
purges which is reported from either PAL_VM_SUMMARY or
SAL PALO.
diff --git a/arch/arc/plat-eznps/mtm.c b/arch/arc/plat-eznps/mtm.c
index 59a0162..dd1ea1f 100644
--- a/arch/arc/plat-eznps/mtm.c
+++ b/arch/arc/plat-eznps/mtm.c
@@ -21,10 +21,13 @@
#include <plat/mtm.h>
#include <plat/smp.h>

-#define MT_CTRL_HS_CNT 0xFF
+#define MT_HS_CNT_MIN 0x01
+#define MT_HS_CNT_MAX 0xFF
#define MT_CTRL_ST_CNT 0xF
#define NPS_NUM_HW_THREADS 0x10

+static int mtm_hs_ctr = MT_HS_CNT_MAX;
+
#ifdef CONFIG_EZNPS_MEM_ERROR
int do_memory_error(unsigned long address, struct pt_regs *regs)
{
@@ -129,7 +132,7 @@ void mtm_enable_core(unsigned int cpu)
/* Enable HW schedule, stall counter, mtm */
mt_ctrl.value = 0;
mt_ctrl.hsen = 1;
- mt_ctrl.hs_cnt = MT_CTRL_HS_CNT;
+ mt_ctrl.hs_cnt = mtm_hs_ctr;
mt_ctrl.mten = 1;
write_aux_reg(CTOP_AUX_MT_CTRL, mt_ctrl.value);

@@ -140,3 +143,45 @@ void mtm_enable_core(unsigned int cpu)
*/
cpu_relax();
}
+
+/* Handle an out of bounds mtm hs counter value */
+static void __init handle_mtm_hs_ctr_out_of_bounds_error(uint8_t val)
+{
+ pr_err("** The value of mtm_hs_ctr is out of bounds!\n"
+ "** It must be in the range [%d,%d] (inclusive)\n"
+ "Setting mtm_hs_ctr to %d\n", MT_HS_CNT_MIN, MT_HS_CNT_MAX, val);
+
+ mtm_hs_ctr = val;
+}
+
+/* Verify and set the value of the mtm hs counter */
+static int __init set_mtm_hs_ctr(char *ctr_str)
+{
+ int ret;
+ long hs_ctr;
+
+ ret = kstrtol(ctr_str, 0, &hs_ctr);
+ if (ret) {
+ pr_err("** Error parsing the value of mtm_hs_ctr\n"
+ "** Make sure you entered a valid integer value\n"
+ "Setting mtm_hs_ctr to default value: %d\n",
+ MT_HS_CNT_MAX);
+ mtm_hs_ctr = MT_HS_CNT_MAX;
+ return -EINVAL;
+ }
+
+ if (hs_ctr > MT_HS_CNT_MAX) {
+ handle_mtm_hs_ctr_out_of_bounds_error(MT_HS_CNT_MAX);
+ return -EDOM;
+ }
+
+ if (hs_ctr < MT_HS_CNT_MIN) {
+ handle_mtm_hs_ctr_out_of_bounds_error(MT_HS_CNT_MIN);
+ return -EDOM;
+ }
+
+ mtm_hs_ctr = hs_ctr;
+
+ return 0;
+}
+early_param("nps_mtm_hs_ctr", set_mtm_hs_ctr);
--
1.7.1
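
The parsing and range check in set_mtm_hs_ctr() above can be modeled in user space roughly as follows. This is only a sketch under stated assumptions: `strtol()` stands in for the kernel's `kstrtol()` (the two differ in details), the `demo_*` names are hypothetical, and the factor of 16 is taken from the parameter description in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define HS_CNT_MIN          1
#define HS_CNT_MAX          255
#define HS_CYCLES_PER_COUNT 16  /* "16 times this parameter's value" */

/* User-space stand-in for set_mtm_hs_ctr().
 * Returns 0 on success, -EINVAL on a parse error (default applied),
 * or -EDOM when the value had to be clamped into range. */
int demo_parse_hs_ctr(const char *str, int *hs_ctr)
{
	char *end;
	long val;

	assert(str && hs_ctr);

	errno = 0;
	val = strtol(str, &end, 0);	/* base 0: accepts decimal, hex, octal */
	if (errno || end == str || *end != '\0') {
		*hs_ctr = HS_CNT_MAX;	/* fall back to the default */
		return -EINVAL;
	}
	if (val > HS_CNT_MAX) {
		*hs_ctr = HS_CNT_MAX;
		return -EDOM;
	}
	if (val < HS_CNT_MIN) {
		*hs_ctr = HS_CNT_MIN;
		return -EDOM;
	}
	*hs_ctr = (int)val;
	return 0;
}

/* Maximum uninterrupted run length, in cycles, for a given counter value. */
long demo_max_cycles(int hs_ctr)
{
	return (long)hs_ctr * HS_CYCLES_PER_COUNT;
}
```

Under this model the default of 255 corresponds to at most 4080 cycles before the HW scheduler switches threads.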

2017-06-13 14:06:23

by Noam Camus

Subject: [PATCH v2 09/12] ARC: [plat-eznps] Update the init sequence of aux regs per cpu.

From: Liav Rehana <[email protected]>

This commit adds a config option that lets us distinguish
between building the kernel for platforms that have a separate set
of auxiliary registers for each cpu and platforms that share one
set of auxiliary registers across all threads of each core.
On platforms that implement a separate set of auxiliary registers
per cpu, they must be initialized on every cpu and not just on the
first thread of the core.

Signed-off-by: Liav Rehana <[email protected]>
Signed-off-by: Noam Camus <[email protected]>
---
arch/arc/plat-eznps/Kconfig | 11 +++++++++++
arch/arc/plat-eznps/entry.S | 2 +-
2 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/arch/arc/plat-eznps/Kconfig b/arch/arc/plat-eznps/Kconfig
index fa25136..019de58 100644
--- a/arch/arc/plat-eznps/Kconfig
+++ b/arch/arc/plat-eznps/Kconfig
@@ -44,3 +44,14 @@ config EZNPS_MEM_ERROR
real chip architecture).This configuration will cause the kernel
to handle memory error similar to a machine check exception.
It means NOT sending a SIGBUS, but panic the system.
+
+config EZNPS_SHARED_AUX_REGS
+ bool "ARC-EZchip Shared Auxiliary Registers Per Core"
+ depends on ARC_PLAT_EZNPS
+ default y
+ help
+ On the real chip of the NPS, auxiliary registers are shared between
+ all the cpus of the core, whereas on simulator platform for NPS,
+ each cpu has a different set of auxiliary registers. Configuration
+ should be unset if auxiliary registers are not shared between the cpus
+ of the core, so there will be a need to initialize them per cpu.
diff --git a/arch/arc/plat-eznps/entry.S b/arch/arc/plat-eznps/entry.S
index 328261c..091c92c 100644
--- a/arch/arc/plat-eznps/entry.S
+++ b/arch/arc/plat-eznps/entry.S
@@ -27,7 +27,7 @@
.align 1024 ; HW requierment for restart first PC

ENTRY(res_service)
-#ifdef CONFIG_EZNPS_MTM_EXT
+#if defined(CONFIG_EZNPS_MTM_EXT) && defined(CONFIG_EZNPS_SHARED_AUX_REGS)
; There is no work for HW thread id != 0
lr r3, [CTOP_AUX_THREAD_ID]
cmp r3, 0
--
1.7.1
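
The effect of the new guard in res_service can be restated as a small truth function. This is a hypothetical run-time model of what is really a compile-time `#if`; the function name and its boolean parameters are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Models "#if defined(CONFIG_EZNPS_MTM_EXT) &&
 *         defined(CONFIG_EZNPS_SHARED_AUX_REGS)" plus the thread-id check:
 * only when both options are set do HW threads other than 0 skip the
 * per-core auxiliary register initialization. */
bool demo_thread_skips_init(unsigned int thread_id, bool mtm_ext,
			    bool shared_aux_regs)
{
	return mtm_ext && shared_aux_regs && thread_id != 0;
}
```

So with shared auxiliary registers unset (the simulator case described in the Kconfig help), every HW thread runs the initialization.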

2017-06-13 14:06:25

by Noam Camus

Subject: [PATCH v2 10/12] ARC: [plat-eznps] Save/Restore extra auxiliary registers

From: Noam Camus <[email protected]>

thread_struct gets a new field for the data plane of the eznps platform.
This field holds the data plane auxiliary registers and any extra
registers that might be changed in kernel code.

We save the EFLAGS and GPA1 auxiliary registers since they may be
changed by the new task while using atomic operations, e.g. cmpxchg.

Signed-off-by: Noam Camus <[email protected]>
---
arch/arc/include/asm/arcregs.h | 7 +++++++
arch/arc/include/asm/processor.h | 3 +++
arch/arc/include/asm/switch_to.h | 11 +++++++++++
arch/arc/plat-eznps/Makefile | 2 +-
arch/arc/plat-eznps/ctop.c | 33 +++++++++++++++++++++++++++++++++
5 files changed, 55 insertions(+), 1 deletions(-)
create mode 100644 arch/arc/plat-eznps/ctop.c

diff --git a/arch/arc/include/asm/arcregs.h b/arch/arc/include/asm/arcregs.h
index ba8e802..9437d42 100644
--- a/arch/arc/include/asm/arcregs.h
+++ b/arch/arc/include/asm/arcregs.h
@@ -123,6 +123,13 @@
#define PAGES_TO_MB(n_pages) (PAGES_TO_KB(n_pages) >> 10)

+#ifdef CONFIG_ARC_PLAT_EZNPS
+struct eznps_dp {
+ unsigned int eflags;
+ unsigned int gpa1;
+};
+#endif
+
/*
***************************************************************
* Build Configuration Registers, with encoded hardware config
diff --git a/arch/arc/include/asm/processor.h b/arch/arc/include/asm/processor.h
index fd7bdfa..130bb55 100644
--- a/arch/arc/include/asm/processor.h
+++ b/arch/arc/include/asm/processor.h
@@ -38,6 +38,9 @@ struct thread_struct {
#ifdef CONFIG_ARC_FPU_SAVE_RESTORE
struct arc_fpu fpu;
#endif
+#ifdef CONFIG_ARC_PLAT_EZNPS
+ struct eznps_dp dp;
+#endif
};

#define INIT_THREAD { \
diff --git a/arch/arc/include/asm/switch_to.h b/arch/arc/include/asm/switch_to.h
index 1b171ab..4c53080 100644
--- a/arch/arc/include/asm/switch_to.h
+++ b/arch/arc/include/asm/switch_to.h
@@ -26,13 +26,24 @@

#endif /* !CONFIG_ARC_FPU_SAVE_RESTORE */

+#ifdef CONFIG_ARC_PLAT_EZNPS
+extern void dp_save_restore(struct task_struct *p, struct task_struct *n);
+#define ARC_DP_PREV(p, n) dp_save_restore(p, n)
+#define ARC_DP_NEXT(t)
+#else
+#define ARC_DP_PREV(p, n)
+#define ARC_DP_NEXT(n)
+#endif /* !CONFIG_ARC_PLAT_EZNPS */
+
struct task_struct *__switch_to(struct task_struct *p, struct task_struct *n);

#define switch_to(prev, next, last) \
do { \
+ ARC_DP_PREV(prev, next); \
ARC_FPU_PREV(prev, next); \
last = __switch_to(prev, next);\
ARC_FPU_NEXT(next); \
+ ARC_DP_NEXT(next); \
mb(); \
} while (0)

diff --git a/arch/arc/plat-eznps/Makefile b/arch/arc/plat-eznps/Makefile
index 21091b1..8d43717 100644
--- a/arch/arc/plat-eznps/Makefile
+++ b/arch/arc/plat-eznps/Makefile
@@ -2,6 +2,6 @@
# Makefile for the linux kernel.
#

-obj-y := entry.o platform.o
+obj-y := entry.o platform.o ctop.o
obj-$(CONFIG_SMP) += smp.o
obj-$(CONFIG_EZNPS_MTM_EXT) += mtm.o
diff --git a/arch/arc/plat-eznps/ctop.c b/arch/arc/plat-eznps/ctop.c
new file mode 100644
index 0000000..8b13a08
--- /dev/null
+++ b/arch/arc/plat-eznps/ctop.c
@@ -0,0 +1,33 @@
+/*
+ * Copyright(c) 2015 EZchip Technologies.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ */
+
+#include <linux/sched.h>
+#include <asm/arcregs.h>
+#include <plat/ctop.h>
+
+void dp_save_restore(struct task_struct *prev, struct task_struct *next)
+{
+ struct eznps_dp *prev_task_dp = &prev->thread.dp;
+ struct eznps_dp *next_task_dp = &next->thread.dp;
+
+ /* Here we save all Data Plane related auxiliary registers */
+ prev_task_dp->eflags = read_aux_reg(CTOP_AUX_EFLAGS);
+ write_aux_reg(CTOP_AUX_EFLAGS, next_task_dp->eflags);
+
+ prev_task_dp->gpa1 = read_aux_reg(CTOP_AUX_GPA1);
+ write_aux_reg(CTOP_AUX_GPA1, next_task_dp->gpa1);
+}
+
--
1.7.1
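
The save/restore pattern in dp_save_restore() can be tried out in user space with a simulated register file. Everything prefixed `demo_` or `DEMO_` below is hypothetical: a plain array replaces the real auxiliary registers, and the helpers only imitate `read_aux_reg()` / `write_aux_reg()`.

```c
#include <assert.h>

/* Hypothetical register indices; the real CTOP_AUX_* addresses differ. */
enum { DEMO_AUX_EFLAGS, DEMO_AUX_GPA1, DEMO_AUX_COUNT };

/* Simulated per-CPU auxiliary register file. */
unsigned int demo_aux[DEMO_AUX_COUNT];

struct demo_dp {
	unsigned int eflags;
	unsigned int gpa1;
};

struct demo_task {
	struct demo_dp dp;
};

unsigned int demo_read_aux(int reg)
{
	assert(reg >= 0 && reg < DEMO_AUX_COUNT);
	return demo_aux[reg];
}

void demo_write_aux(int reg, unsigned int val)
{
	assert(reg >= 0 && reg < DEMO_AUX_COUNT);
	demo_aux[reg] = val;
}

/* Same shape as dp_save_restore(): stash the outgoing task's live
 * register values, then load the incoming task's saved values. */
void demo_dp_save_restore(struct demo_task *prev, struct demo_task *next)
{
	prev->dp.eflags = demo_read_aux(DEMO_AUX_EFLAGS);
	demo_write_aux(DEMO_AUX_EFLAGS, next->dp.eflags);

	prev->dp.gpa1 = demo_read_aux(DEMO_AUX_GPA1);
	demo_write_aux(DEMO_AUX_GPA1, next->dp.gpa1);
}
```

After one call, the outgoing task's structure holds what was in the registers and the registers hold the incoming task's saved values, so switching back later restores the original state.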

2017-06-14 22:39:45

by Vineet Gupta

Subject: Re: [PATCH v2 01/12] ARC: [plat-eznps] Handle memory error as an exception

On 06/13/2017 07:03 AM, Noam Camus wrote:
> From: Noam Camus <[email protected]>
>
> On ARC700, user mode memory error is treated as L2 interrupt, but NPS
> hardware treats it as Machine Check exception.
>
> Address this by defining an NPS specific bus error handler.

This still leaves too much to dig through. I've rewritten the changelog here and pushed!

-Vineet

2017-06-14 22:46:07

by Vineet Gupta

Subject: Re: [PATCH v2 02/12] ARC: set level of log per CPU during boot to be debug level

On 06/13/2017 07:03 AM, Noam Camus wrote:
> From: Noam Camus <[email protected]>
>
> The reasons are:
> 1) speeding up boot time, becomes critical for many CPUs machine,
> e.g. NPS400 with 4K CPUs
> 2) shorten kernel log at boot time, again easy to scan for large
> scale machines such NPS400
>
> Signed-off-by: Noam Camus <[email protected]>
> ---
> arch/arc/kernel/setup.c | 6 +++---
> arch/arc/kernel/smp.c | 4 ++--
> arch/arc/mm/cache.c | 2 +-
> arch/arc/mm/tlb.c | 2 +-
> 4 files changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/arch/arc/kernel/setup.c b/arch/arc/kernel/setup.c
> index fc8211f..8494b31 100644
> --- a/arch/arc/kernel/setup.c
> +++ b/arch/arc/kernel/setup.c
> @@ -385,13 +385,13 @@ void setup_processor(void)
> read_arc_build_cfg_regs();
> arc_init_IRQ();
>
> - printk(arc_cpu_mumbojumbo(cpu_id, str, sizeof(str)));
> + pr_debug("%s", arc_cpu_mumbojumbo(cpu_id, str, sizeof(str)));

I understand your issue, but as Alexey mentioned before we can't switch the normal
kernel boot log to debug only.

At best you can convert the current printk to pr_info and then set the log level
in your cmdline to something higher than info !
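
The suggestion above relies on console filtering: a message reaches the console only if its level is numerically lower than the console log level, which the `loglevel=` command line parameter sets. The following user-space model sketches that comparison; the `demo_*` and `DEMO_*` names are hypothetical, while the numeric levels follow the kernel's KERN_* values.

```c
#include <assert.h>
#include <stdbool.h>

/* Numeric severities as used by the kernel (lower is more severe). */
enum demo_level {
	DEMO_ERR     = 3,
	DEMO_WARNING = 4,
	DEMO_NOTICE  = 5,
	DEMO_INFO    = 6,
	DEMO_DEBUG   = 7
};

/* A message is printed on the console only when its level is
 * strictly lower than the console log level. */
bool demo_console_prints(int msg_level, int console_loglevel)
{
	assert(msg_level >= 0 && msg_level <= 7);
	return msg_level < console_loglevel;
}
```

In this model, booting with `loglevel=6` keeps pr_info() output off the console while warnings and errors still appear; the pr_info() messages remain in the kernel log buffer.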

>
> arc_mmu_init();
> arc_cache_init();
>
> - printk(arc_extn_mumbojumbo(cpu_id, str, sizeof(str)));
> - printk(arc_platform_smp_cpuinfo());
> + pr_debug("%s", arc_extn_mumbojumbo(cpu_id, str, sizeof(str)));
> + pr_debug("%s", arc_platform_smp_cpuinfo());
>
> arc_chk_core_config();
> }
> diff --git a/arch/arc/kernel/smp.c b/arch/arc/kernel/smp.c
> index f462671..d1aa917 100644
> --- a/arch/arc/kernel/smp.c
> +++ b/arch/arc/kernel/smp.c
> @@ -177,8 +177,8 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
>
> secondary_idle_tsk = idle;
>
> - pr_info("Idle Task [%d] %p", cpu, idle);
> - pr_info("Trying to bring up CPU%u ...\n", cpu);
> + pr_debug("Idle Task [%d] %p", cpu, idle);
> + pr_debug("Trying to bring up CPU%u ...\n", cpu);
>
> if (plat_smp_ops.cpu_kick)
> plat_smp_ops.cpu_kick(cpu,
> diff --git a/arch/arc/mm/cache.c b/arch/arc/mm/cache.c
> index a867575..7d3e79b 100644
> --- a/arch/arc/mm/cache.c
> +++ b/arch/arc/mm/cache.c
> @@ -1188,7 +1188,7 @@ void __ref arc_cache_init(void)
> unsigned int __maybe_unused cpu = smp_processor_id();
> char str[256];
>
> - printk(arc_cache_mumbojumbo(0, str, sizeof(str)));
> + pr_debug("%s", arc_cache_mumbojumbo(0, str, sizeof(str)));
>
> /*
> * Only master CPU needs to execute rest of function:
> diff --git a/arch/arc/mm/tlb.c b/arch/arc/mm/tlb.c
> index d0126fd..c5e70d8 100644
> --- a/arch/arc/mm/tlb.c
> +++ b/arch/arc/mm/tlb.c
> @@ -814,7 +814,7 @@ void arc_mmu_init(void)
> char str[256];
> struct cpuinfo_arc_mmu *mmu = &cpuinfo_arc700[smp_processor_id()].mmu;
>
> - printk(arc_mmu_mumbojumbo(0, str, sizeof(str)));
> + pr_debug("%s", arc_mmu_mumbojumbo(0, str, sizeof(str)));
>
> /*
> * Can't be done in processor.h due to header include depenedencies
>

2017-06-14 22:50:49

by Vineet Gupta

Subject: Re: [PATCH v2 08/12] ARC: [plat-eznps] new command line argument for HW scheduler at MTM

On 06/13/2017 07:03 AM, Noam Camus wrote:
> From: Noam Camus <[email protected]>
>
> We add ability for all cores at NPS SoC to control the number of cycles
> HW thread can execute before it is replace with another eligible
> HW thread within the same core. The replacement is done by the
> HE scheduler.
>
> Signed-off-by: Noam Camus <[email protected]>
> ---
> Documentation/admin-guide/kernel-parameters.txt | 9 ++++
> arch/arc/plat-eznps/mtm.c | 49 ++++++++++++++++++++++-
> 2 files changed, 56 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 15f79c2..5b551f7 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -2693,6 +2693,15 @@
> If the dependencies are under your control, you can
> turn on cpu0_hotplug.
>
> + nps_mtm_hs_ctr= [KNL,ARC]
> + This parameter sets the maximum duration, in
> + cycles, each HW thread of the CTOP can run
> + without interruptions, before HW switches it.
> + The actual maximum duration is 16 times this
> + parameter's value.
> + Format: integer between 1 and 255
> + Default: 255
> +
> nptcg= [IA-64] Override max number of concurrent global TLB
> purges which is reported from either PAL_VM_SUMMARY or
> SAL PALO.
> diff --git a/arch/arc/plat-eznps/mtm.c b/arch/arc/plat-eznps/mtm.c
> index 59a0162..dd1ea1f 100644
> --- a/arch/arc/plat-eznps/mtm.c
> +++ b/arch/arc/plat-eznps/mtm.c
> @@ -21,10 +21,13 @@
> #include <plat/mtm.h>
> #include <plat/smp.h>
>
> -#define MT_CTRL_HS_CNT 0xFF
> +#define MT_HS_CNT_MIN 0x01
> +#define MT_HS_CNT_MAX 0xFF
> #define MT_CTRL_ST_CNT 0xF
> #define NPS_NUM_HW_THREADS 0x10
>
> +static int mtm_hs_ctr = MT_HS_CNT_MAX;
> +
> #ifdef CONFIG_EZNPS_MEM_ERROR
> int do_memory_error(unsigned long address, struct pt_regs *regs)
> {
> @@ -129,7 +132,7 @@ void mtm_enable_core(unsigned int cpu)
> /* Enable HW schedule, stall counter, mtm */
> mt_ctrl.value = 0;
> mt_ctrl.hsen = 1;
> - mt_ctrl.hs_cnt = MT_CTRL_HS_CNT;
> + mt_ctrl.hs_cnt = mtm_hs_ctr;
> mt_ctrl.mten = 1;
> write_aux_reg(CTOP_AUX_MT_CTRL, mt_ctrl.value);
>
> @@ -140,3 +143,45 @@ void mtm_enable_core(unsigned int cpu)
> */
> cpu_relax();
> }
> +
> +/* Handle an out of bounds mtm hs counter value */
> +static void __init handle_mtm_hs_ctr_out_of_bounds_error(uint8_t val)
> +{
> + pr_err("** The value of mtm_hs_ctr is out of bounds!\n"
> + "** It must be in the range [%d,%d] (inclusive)\n"
> + "Setting mtm_hs_ctr to %d\n", MT_HS_CNT_MIN, MT_HS_CNT_MAX, val);

Do you really need such elaborate / verbose strings ! Please trim them !
Try to fit it in one line - breaking strings into multiple lines hurts sooner or
later - when grep'ing etc
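
One possible shape for a trimmed message, as a user-space sketch rather than a proposed replacement for the patch: the helper name, the message wording, and the use of `snprintf()` in place of `pr_err()` are all assumptions. The point is only that the whole format string sits on one source line, so it can be found with grep.

```c
#include <assert.h>
#include <stdio.h>

#define HS_CNT_MIN 1
#define HS_CNT_MAX 255

/* Hypothetical helper: clamp val into [HS_CNT_MIN, HS_CNT_MAX] and, if
 * clamping happened, describe it in a single-line message. msg is left
 * empty when val was already in range. */
int demo_clamp_hs_ctr(long val, char *msg, size_t len)
{
	long clamped = val < HS_CNT_MIN ? HS_CNT_MIN :
		       val > HS_CNT_MAX ? HS_CNT_MAX : val;

	assert(msg && len > 0);
	msg[0] = '\0';
	if (clamped != val)
		snprintf(msg, len, "nps_mtm_hs_ctr %ld out of range [%d,%d], using %ld", val, HS_CNT_MIN, HS_CNT_MAX, clamped);
	return (int)clamped;
}
```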

> +
> + mtm_hs_ctr = val;
> +}
> +
> +/* Verify and set the value of the mtm hs counter */
> +static int __init set_mtm_hs_ctr(char *ctr_str)
> +{
> + int ret;
> + long hs_ctr;
> +
> + ret = kstrtol(ctr_str, 0, &hs_ctr);
> + if (ret) {
> + pr_err("** Error parsing the value of mtm_hs_ctr\n"
> + "** Make sure you entered a valid integer value\n"
> + "Setting mtm_hs_ctr to default value: %d\n",
> + MT_HS_CNT_MAX);

Ditto !

> + mtm_hs_ctr = MT_HS_CNT_MAX;
> + return -EINVAL;
> + }
> +
> + if (hs_ctr > MT_HS_CNT_MAX) {
> + handle_mtm_hs_ctr_out_of_bounds_error(MT_HS_CNT_MAX);
> + return -EDOM;
> + }
> +
> + if (hs_ctr < MT_HS_CNT_MIN) {
> + handle_mtm_hs_ctr_out_of_bounds_error(MT_HS_CNT_MIN);
> + return -EDOM;
> + }
> +
> + mtm_hs_ctr = hs_ctr;
> +
> + return 0;
> +}
> +early_param("nps_mtm_hs_ctr", set_mtm_hs_ctr);
>