2015-06-10 15:34:32

by Borislav Petkov

[permalink] [raw]
Subject: [PATCH 1/3] EDAC, mce_amd_inj: Cleanup and simplify README

From: Borislav Petkov <[email protected]>

Save us an indentation level, widen to 80 cols, make the text more
succinct and slender. Use i as the bank variable, same as what the
documentation uses.

Signed-off-by: Borislav Petkov <[email protected]>
---
drivers/edac/mce_amd_inj.c | 79 ++++++++++++++++++++++++----------------------
1 file changed, 41 insertions(+), 38 deletions(-)

diff --git a/drivers/edac/mce_amd_inj.c b/drivers/edac/mce_amd_inj.c
index 012b61a33a60..30227f30e7c6 100644
--- a/drivers/edac/mce_amd_inj.c
+++ b/drivers/edac/mce_amd_inj.c
@@ -256,10 +256,10 @@ static void do_inject(void)
if (inj_type == DFR_INT_INJ) {
/*
* Ensure necessary status bits for deferred errors:
- * a. MCx_STATUS[Deferred] is set -
+ * a. MCi_STATUS[Deferred] is set -
* This is to ensure the error will be handled by the
* interrupt handler
- * b. unset MCx_STATUS[UC]
+ * b. unset MCi_STATUS[UC]
* As deferred errors are _not_ UC
*/
i_mce.status |= MCI_STATUS_DEFERRED;
@@ -346,42 +346,45 @@ MCE_INJECT_GET(bank);
DEFINE_SIMPLE_ATTRIBUTE(bank_fops, inj_bank_get, inj_bank_set, "%llu\n");

static const char readme_msg[] =
- "\nDescription of the files and their usages:\n\n"
- "status: Set a value to be programmed into MCx_STATUS(bank)\n"
- "\t The status bits provide insight into the type of\n"
- "\t error that caused the MCE.\n\n"
- "misc: Set value of MCx_MISC(bank)\n"
- "\t misc register provides auxiliary info about the error. This\n"
- "\t register is typically used for error thresholding purpose and\n"
- "\t validity of the register is indicated by MCx_STATUS[MiscV]\n\n"
- "addr: Error address value to be written to MCx_ADDR(bank)\n"
- "\t This register is used to log address information associated\n"
- "\t with the error.\n\n"
- "Note: See respective BKDGs for the exact bit definitions of the\n"
- "\t above registers as they mirror the MCi_[STATUS | MISC | ADDR]\n"
- "\t hardware registers.\n\n"
- "bank: Specify the bank you want to inject the error into.\n"
- "\t The number of banks in a processor varies and is family/model\n"
- "\t dependent. So, a sanity check performed while writing.\n"
- "\t Writing to this file will trigger a #MC or APIC interrupts or\n"
- "\t invoke the error decoder routines for AMD processors. The value\n"
- "\t in 'flags' file decides which of above actions is triggered.\n\n"
- "flags: Write to this file to speficy the error injection policy.\n"
- "\t Allowed values:\n"
- "\t\t\"sw\" - SW error injection, Only calls error decoder\n"
- "\t\t\troutines to print error info in human readable format\n"
- "\t\t\"hw\" - HW error injection, Forces a #MC,\n"
- "\t\t\tcauses exception handler to handle the error\n"
- "\t\t\tif UC or poll handler catches it if CE\n"
- "\t\t\tWarning: Might cause system panic if MCx_STATUS[PCC]\n"
- "\t\t\tis set. For debug purposes, consider setting\n"
- "\t\t\t/<debugfs_mountpoint>/mce/fake_panic\n"
- "\t\t\"dfr\" - Trigger APIC interrupt for Deferred error\n"
- "\t\t\tError is handled by deferred error apic handler if\n"
- "\t\t\tfeature is present in HW.\n"
- "\t\t\"thr\" - Trigger APIC interrupt for threshold error\n"
- "\t\t\tError is handled by threshold apic handler\n\n"
- "cpu: The cpu to inject the error on.\n\n";
+"Description of the files and their usages:\n"
+"\n"
+"Note1: i refers to the bank number below.\n"
+"Note2: See respective BKDGs for the exact bit definitions of the files below\n"
+"as they mirror the hardware registers.\n"
+"\n"
+"status:\t Set MCi_STATUS: the bits in that MSR control the error type and\n"
+"\t attributes of the error which caused the MCE.\n"
+"\n"
+"misc:\t Set MCi_MISC: provide auxiliary info about the error. It is mostly\n"
+"\t used for error thresholding purposes and its validity is indicated by\n"
+"\t MCi_STATUS[MiscV].\n"
+"\n"
+"addr:\t Error address value to be written to MCi_ADDR. Log address information\n"
+"\t associated with the error.\n"
+"\n"
+"cpu:\t The CPU to inject the error on.\n"
+"\n"
+"bank:\t Specify the bank you want to inject the error into: the number of\n"
+"\t banks in a processor varies and is family/model-specific, therefore, the\n"
+"\t supplied value is sanity-checked. Setting the bank value also triggers the\n"
+"\t injection.\n"
+"\n"
+"flags:\t Injection type to be performed. Writing to this file will trigger a\n"
+"\t real machine check, an APIC interrupt or invoke the error decoder routines\n"
+"\t for AMD processors.\n"
+"\n"
+"\t Allowed error injection types:\n"
+"\t - \"sw\": Software error injection. Decode error to a human-readable \n"
+"\t format only. Safe to use.\n"
+"\t - \"hw\": Hardware error injection. Causes the #MC exception handler to \n"
+"\t handle the error. Be warned: might cause system panic if MCi_STATUS[PCC] \n"
+"\t is set. Therefore, consider setting (debugfs_mountpoint)/mce/fake_panic \n"
+"\t before injecting.\n"
+"\t - \"dfr\": Trigger Deferred error handled by the deferred error APIC handler \n"
+"\t if the feature is present in HW.\n"
+"\t - \"thr\": Trigger Thresholding error handled by the thresholding error \n"
+"\t APIC handler.\n"
+"\n";

static ssize_t
inj_readme_read(struct file *filp, char __user *ubuf,
--
2.3.5


2015-06-10 15:34:08

by Borislav Petkov

[permalink] [raw]
Subject: [PATCH 2/3] EDAC, mce_amd_inj: Move bit preparations before the injection

From: Borislav Petkov <[email protected]>

We do get_online_cpus() and then start noodling with the bits. Do that
*before* we grab the hotplug lock. This accidentally fixes the other
issue where we check if a CPU is online but then go ahead and find out
the NBC core which might not be the same CPU as the CPU we want to
inject on.

Signed-off-by: Borislav Petkov <[email protected]>
---
drivers/edac/mce_amd_inj.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/edac/mce_amd_inj.c b/drivers/edac/mce_amd_inj.c
index 30227f30e7c6..5001d166f54f 100644
--- a/drivers/edac/mce_amd_inj.c
+++ b/drivers/edac/mce_amd_inj.c
@@ -266,10 +266,6 @@ static void do_inject(void)
i_mce.status &= ~MCI_STATUS_UC;
}

- get_online_cpus();
- if (!cpu_online(cpu))
- goto err;
-
/* prep MCE global settings for the injection */
mcg_status = MCG_STATUS_MCIP | MCG_STATUS_EIPV;

@@ -290,6 +286,10 @@ static void do_inject(void)
cpu = get_nbc_for_node(amd_get_nb_id(cpu));
}

+ get_online_cpus();
+ if (!cpu_online(cpu))
+ goto err;
+
toggle_hw_mce_inject(cpu, true);

wrmsr_on_cpu(cpu, MSR_IA32_MCG_STATUS,
--
2.3.5

2015-06-10 15:34:21

by Borislav Petkov

[permalink] [raw]
Subject: [PATCH 3/3] EDAC, mce_amd_inj: Set MISCV on injection

From: Borislav Petkov <[email protected]>

When during injection we populate MCi_MISC by writing into misc, we need
to set the MiscV bit in the corresponding MCi_STATUS register which
denotes that there's valid info in the MCi_MISC register.

Signed-off-by: Borislav Petkov <[email protected]>
---
drivers/edac/mce_amd_inj.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/edac/mce_amd_inj.c b/drivers/edac/mce_amd_inj.c
index 5001d166f54f..021f256b7f11 100644
--- a/drivers/edac/mce_amd_inj.c
+++ b/drivers/edac/mce_amd_inj.c
@@ -248,6 +248,9 @@ static void do_inject(void)
unsigned int cpu = i_mce.extcpu;
u8 b = i_mce.bank;

+ if (i_mce.misc)
+ i_mce.status |= MCI_STATUS_MISCV;
+
if (inj_type == SW_INJ) {
amd_decode_mce(NULL, 0, &i_mce);
return;
--
2.3.5

2015-06-10 16:31:00

by Aravind Gopalakrishnan

[permalink] [raw]
Subject: Re: [PATCH 3/3] EDAC, mce_amd_inj: Set MISCV on injection

On 6/10/2015 10:33 AM, Borislav Petkov wrote:
> From: Borislav Petkov <[email protected]>
>
> When during injection we populate MCi_MISC by writing into misc, we need
> to set the MiscV bit in the corresponding MCi_STATUS register which
> denotes that there's valid info in the MCi_MISC register.
>
> Signed-off-by: Borislav Petkov <[email protected]>
> ---
> drivers/edac/mce_amd_inj.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/edac/mce_amd_inj.c b/drivers/edac/mce_amd_inj.c
> index 5001d166f54f..021f256b7f11 100644
> --- a/drivers/edac/mce_amd_inj.c
> +++ b/drivers/edac/mce_amd_inj.c
> @@ -248,6 +248,9 @@ static void do_inject(void)
> unsigned int cpu = i_mce.extcpu;
> u8 b = i_mce.bank;
>
> + if (i_mce.misc)
> + i_mce.status |= MCI_STATUS_MISCV;
> +
> if (inj_type == SW_INJ) {
> amd_decode_mce(NULL, 0, &i_mce);
> return;


Thanks for the catch here and on Patch 2:)

Looks good to me. So you can slap a <Reviewed-by> on them if needed..

Thanks,
-Aravind.

2015-06-10 16:51:31

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/3] EDAC, mce_amd_inj: Set MISCV on injection

On Wed, Jun 10, 2015 at 11:30:42AM -0500, Aravind Gopalakrishnan wrote:
> Thanks for the catch here and on Patch 2:)
>
> Looks good to me. So you can slap a <Reviewed-by> on them if needed..

Sure, done. :)

Thanks.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--

2015-06-13 14:11:23

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/3] EDAC, mce_amd_inj: Set MISCV on injection

On Wed, Jun 10, 2015 at 06:51:15PM +0200, Borislav Petkov wrote:
> On Wed, Jun 10, 2015 at 11:30:42AM -0500, Aravind Gopalakrishnan wrote:
> > Thanks for the catch here and on Patch 2:)
> >
> > Looks good to me. So you can slap a <Reviewed-by> on them if needed..
>
> Sure, done. :)

Whoops, one more:

---
From: Borislav Petkov <[email protected]>
Date: Sat, 13 Jun 2015 16:01:21 +0200
Subject: [PATCH] EDAC, mce_amd_inj: Make it depend on AMD_NB
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

... now that it uses its facilities.

It, too, takes care of randconfig splats like this one:

drivers/edac/mce_amd_inj.c: In function ‘toggle_nb_mca_mst_cpu’:
drivers/edac/mce_amd_inj.c:218:42: warning: dereferencing ‘void *’ pointer [enabled by default]
struct pci_dev *F3 = node_to_amd_nb(nid)->misc;
^
drivers/edac/mce_amd_inj.c:218:42: error: request for member ‘misc’ in something not a structure or union
make[2]: *** [drivers/edac/mce_amd_inj.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [drivers/edac] Error 2

Signed-off-by: Borislav Petkov <[email protected]>
Cc: Aravind Gopalakrishnan <[email protected]>
---
drivers/edac/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
index 8677ead2a8e1..84604310c826 100644
--- a/drivers/edac/Kconfig
+++ b/drivers/edac/Kconfig
@@ -63,7 +63,7 @@ config EDAC_DECODE_MCE

config EDAC_MCE_INJ
tristate "Simple MCE injection interface"
- depends on EDAC_DECODE_MCE && DEBUG_FS
+ depends on EDAC_DECODE_MCE && DEBUG_FS && AMD_NB
default n
help
This is a simple debugfs interface to inject MCEs and test different
--
2.3.5

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--