2013-10-02 16:11:24

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [PATCH V5 0/3] perf/x86/amd: AMD Family 16h Data Breakpoint Extensions

From: Suravee Suthikulpanit <[email protected]>

Frederic, this is the rebase of the V4 patch onto the linux-3.12.0-rc3 (linux.git),
and retest.

The following patchset enables hardware breakpoint bp_len greater than
HW_BREAKPOINT_LEN_8 on AMD Family 16h and later.

$ perf stat -e mem:0x1000/16:w a.out
^^
bp_len

This will count writes to [0x1000 ~ 0x1010)

V5:
* Rebase onto 3.12.0-rc3.
* Modify the tools/perf/util/parse-events.y due to change in
parse_events_add_breakpoint().

V4:
Even more per Oleg's suggestion:
* Further simplify info->len and info->mask setting switch statement

V3:
More per Oleg's suggestion:
* Use already existing bp_len instead of changing userland API and
in kernel turn bp_len into proper AMD hardware breakpoint address
mask.

V2:
Per Oleg's suggestions:
* Moved testing of bp_addr_mask to validate_hw_breakpoint()
* Changed perf tool syntax to mem:<addr>[/addr_mask][:access]


Jacob Shin (3):
perf/x86/amd: AMD support for bp_len > HW_BREAKPOINT_LEN_8
perf tools: allow user to specify hardware breakpoint bp_len
perf tools: add hardware breakpoint bp_len test cases

arch/x86/include/asm/cpufeature.h | 2 ++
arch/x86/include/asm/debugreg.h | 5 +++
arch/x86/include/asm/hw_breakpoint.h | 1 +
arch/x86/include/uapi/asm/msr-index.h | 4 +++
arch/x86/kernel/cpu/amd.c | 19 +++++++++++
arch/x86/kernel/hw_breakpoint.c | 47 +++++++++++----------------
tools/perf/Documentation/perf-record.txt | 7 ++--
tools/perf/tests/parse-events.c | 55 ++++++++++++++++++++++++++++++++
tools/perf/util/parse-events.c | 17 ++++------
tools/perf/util/parse-events.h | 2 +-
tools/perf/util/parse-events.l | 1 +
tools/perf/util/parse-events.y | 26 +++++++++++++--
12 files changed, 142 insertions(+), 44 deletions(-)

--
1.8.1.2


2013-10-02 16:11:23

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [PATCH 3/3] perf tools: add hardware breakpoint bp_len test cases

From: Jacob Shin <[email protected]>

Signed-off-by: Jacob Shin <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
tools/perf/tests/parse-events.c | 55 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 55 insertions(+)

diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
index 48114d1..9b44722 100644
--- a/tools/perf/tests/parse-events.c
+++ b/tools/perf/tests/parse-events.c
@@ -1115,6 +1115,49 @@ static int test__pinned_group(struct perf_evlist *evlist)
return 0;
}

+static int test__checkevent_breakpoint_len(struct perf_evlist *evlist)
+{
+ struct perf_evsel *evsel = perf_evlist__first(evlist);
+
+ TEST_ASSERT_VAL("wrong number of entries", 1 == evlist->nr_entries);
+ TEST_ASSERT_VAL("wrong type", PERF_TYPE_BREAKPOINT == evsel->attr.type);
+ TEST_ASSERT_VAL("wrong config", 0 == evsel->attr.config);
+ TEST_ASSERT_VAL("wrong bp_type", (HW_BREAKPOINT_R | HW_BREAKPOINT_W) ==
+ evsel->attr.bp_type);
+ TEST_ASSERT_VAL("wrong bp_len", HW_BREAKPOINT_LEN_1 ==
+ evsel->attr.bp_len);
+
+ return 0;
+}
+
+static int test__checkevent_breakpoint_len_w(struct perf_evlist *evlist)
+{
+ struct perf_evsel *evsel = perf_evlist__first(evlist);
+
+ TEST_ASSERT_VAL("wrong number of entries", 1 == evlist->nr_entries);
+ TEST_ASSERT_VAL("wrong type", PERF_TYPE_BREAKPOINT == evsel->attr.type);
+ TEST_ASSERT_VAL("wrong config", 0 == evsel->attr.config);
+ TEST_ASSERT_VAL("wrong bp_type", HW_BREAKPOINT_W ==
+ evsel->attr.bp_type);
+ TEST_ASSERT_VAL("wrong bp_len", HW_BREAKPOINT_LEN_2 ==
+ evsel->attr.bp_len);
+
+ return 0;
+}
+
+static int
+test__checkevent_breakpoint_len_rw_modifier(struct perf_evlist *evlist)
+{
+ struct perf_evsel *evsel = perf_evlist__first(evlist);
+
+ TEST_ASSERT_VAL("wrong exclude_user", !evsel->attr.exclude_user);
+ TEST_ASSERT_VAL("wrong exclude_kernel", evsel->attr.exclude_kernel);
+ TEST_ASSERT_VAL("wrong exclude_hv", evsel->attr.exclude_hv);
+ TEST_ASSERT_VAL("wrong precise_ip", !evsel->attr.precise_ip);
+
+ return test__checkevent_breakpoint_rw(evlist);
+}
+
static int count_tracepoints(void)
{
char events_path[PATH_MAX];
@@ -1347,6 +1390,18 @@ static struct evlist_test test__events[] = {
.name = "{cycles,cache-misses,branch-misses}:D",
.check = test__pinned_group,
},
+ [42] = {
+ .name = "mem:0/1",
+ .check = test__checkevent_breakpoint_len,
+ },
+ [43] = {
+ .name = "mem:0/2:w",
+ .check = test__checkevent_breakpoint_len_w,
+ },
+ [44] = {
+ .name = "mem:0/4:rw:u",
+ .check = test__checkevent_breakpoint_len_rw_modifier,
+ },
};

static struct evlist_test test__events_pmu[] = {
--
1.8.1.2

2013-10-02 16:11:22

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [PATCH 1/3] perf/x86/amd: AMD support for bp_len > HW_BREAKPOINT_LEN_8

From: Jacob Shin <[email protected]>

Implement hardware breakpoint address mask for AMD Family 16h and
above processors. CPUID feature bit indicates hardware support for
DRn_ADDR_MASK MSRs. These masks further qualify DRn/DR7 hardware
breakpoint addresses to allow matching of larger addresses ranges.

Valuable advice and pseudo code from Oleg Nesterov <[email protected]>

Signed-off-by: Jacob Shin <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/include/asm/cpufeature.h | 2 ++
arch/x86/include/asm/debugreg.h | 5 ++++
arch/x86/include/asm/hw_breakpoint.h | 1 +
arch/x86/include/uapi/asm/msr-index.h | 4 +++
arch/x86/kernel/cpu/amd.c | 19 ++++++++++++++
arch/x86/kernel/hw_breakpoint.c | 47 ++++++++++++++---------------------
6 files changed, 49 insertions(+), 29 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index d3f5c63..26609bb 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -170,6 +170,7 @@
#define X86_FEATURE_TOPOEXT (6*32+22) /* topology extensions CPUID leafs */
#define X86_FEATURE_PERFCTR_CORE (6*32+23) /* core performance counter extensions */
#define X86_FEATURE_PERFCTR_NB (6*32+24) /* NB performance counter extensions */
+#define X86_FEATURE_BPEXT (6*32+26) /* data breakpoint extension */
#define X86_FEATURE_PERFCTR_L2 (6*32+28) /* L2 performance counter extensions */

/*
@@ -332,6 +333,7 @@ extern const char * const x86_power_flags[32];
#define cpu_has_cx16 boot_cpu_has(X86_FEATURE_CX16)
#define cpu_has_eager_fpu boot_cpu_has(X86_FEATURE_EAGER_FPU)
#define cpu_has_topoext boot_cpu_has(X86_FEATURE_TOPOEXT)
+#define cpu_has_bpext boot_cpu_has(X86_FEATURE_BPEXT)

#ifdef CONFIG_X86_64

diff --git a/arch/x86/include/asm/debugreg.h b/arch/x86/include/asm/debugreg.h
index 4b528a9..145b009 100644
--- a/arch/x86/include/asm/debugreg.h
+++ b/arch/x86/include/asm/debugreg.h
@@ -114,5 +114,10 @@ static inline void debug_stack_usage_inc(void) { }
static inline void debug_stack_usage_dec(void) { }
#endif /* X86_64 */

+#ifdef CONFIG_CPU_SUP_AMD
+extern void set_dr_addr_mask(unsigned long mask, int dr);
+#else
+static inline void set_dr_addr_mask(unsigned long mask, int dr) { }
+#endif

#endif /* _ASM_X86_DEBUGREG_H */
diff --git a/arch/x86/include/asm/hw_breakpoint.h b/arch/x86/include/asm/hw_breakpoint.h
index ef1c4d2..6c98be8 100644
--- a/arch/x86/include/asm/hw_breakpoint.h
+++ b/arch/x86/include/asm/hw_breakpoint.h
@@ -12,6 +12,7 @@
*/
struct arch_hw_breakpoint {
unsigned long address;
+ unsigned long mask;
u8 len;
u8 type;
};
diff --git a/arch/x86/include/uapi/asm/msr-index.h b/arch/x86/include/uapi/asm/msr-index.h
index bb04650..1f04f6c 100644
--- a/arch/x86/include/uapi/asm/msr-index.h
+++ b/arch/x86/include/uapi/asm/msr-index.h
@@ -205,6 +205,10 @@
/* Fam 16h MSRs */
#define MSR_F16H_L2I_PERF_CTL 0xc0010230
#define MSR_F16H_L2I_PERF_CTR 0xc0010231
+#define MSR_F16H_DR1_ADDR_MASK 0xc0011019
+#define MSR_F16H_DR2_ADDR_MASK 0xc001101a
+#define MSR_F16H_DR3_ADDR_MASK 0xc001101b
+#define MSR_F16H_DR0_ADDR_MASK 0xc0011027

/* Fam 15h MSRs */
#define MSR_F15H_PERF_CTL 0xc0010200
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 903a264..fffc9cb 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -909,3 +909,22 @@ static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum)

return false;
}
+
+void set_dr_addr_mask(unsigned long mask, int dr)
+{
+ if (!cpu_has_bpext)
+ return;
+
+ switch (dr) {
+ case 0:
+ wrmsr(MSR_F16H_DR0_ADDR_MASK, mask, 0);
+ break;
+ case 1:
+ case 2:
+ case 3:
+ wrmsr(MSR_F16H_DR1_ADDR_MASK - 1 + dr, mask, 0);
+ break;
+ default:
+ break;
+ }
+}
diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index f66ff16..3cb1951 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -128,6 +128,8 @@ int arch_install_hw_breakpoint(struct perf_event *bp)
*dr7 |= encode_dr7(i, info->len, info->type);

set_debugreg(*dr7, 7);
+ if (info->mask)
+ set_dr_addr_mask(info->mask, i);

return 0;
}
@@ -163,29 +165,8 @@ void arch_uninstall_hw_breakpoint(struct perf_event *bp)
*dr7 &= ~__encode_dr7(i, info->len, info->type);

set_debugreg(*dr7, 7);
-}
-
-static int get_hbp_len(u8 hbp_len)
-{
- unsigned int len_in_bytes = 0;
-
- switch (hbp_len) {
- case X86_BREAKPOINT_LEN_1:
- len_in_bytes = 1;
- break;
- case X86_BREAKPOINT_LEN_2:
- len_in_bytes = 2;
- break;
- case X86_BREAKPOINT_LEN_4:
- len_in_bytes = 4;
- break;
-#ifdef CONFIG_X86_64
- case X86_BREAKPOINT_LEN_8:
- len_in_bytes = 8;
- break;
-#endif
- }
- return len_in_bytes;
+ if (info->mask)
+ set_dr_addr_mask(0, i);
}

/*
@@ -198,7 +179,7 @@ int arch_check_bp_in_kernelspace(struct perf_event *bp)
struct arch_hw_breakpoint *info = counter_arch_bp(bp);

va = info->address;
- len = get_hbp_len(info->len);
+ len = bp->attr.bp_len;

return (va >= TASK_SIZE) && ((va + len - 1) >= TASK_SIZE);
}
@@ -279,7 +260,16 @@ static int arch_build_bp_info(struct perf_event *bp)
}

/* Len */
+ info->mask = 0;
+
switch (bp->attr.bp_len) {
+ default:
+ if (!is_power_of_2(bp->attr.bp_len))
+ return -EINVAL;
+ if (!cpu_has_bpext)
+ return -EOPNOTSUPP;
+ info->mask = bp->attr.bp_len - 1;
+ /* fall through */
case HW_BREAKPOINT_LEN_1:
info->len = X86_BREAKPOINT_LEN_1;
break;
@@ -294,12 +284,11 @@ static int arch_build_bp_info(struct perf_event *bp)
info->len = X86_BREAKPOINT_LEN_8;
break;
#endif
- default:
- return -EINVAL;
}

return 0;
}
+
/*
* Validate the arch-specific HW Breakpoint register settings
*/
@@ -314,11 +303,11 @@ int arch_validate_hwbkpt_settings(struct perf_event *bp)
if (ret)
return ret;

- ret = -EINVAL;
-
switch (info->len) {
case X86_BREAKPOINT_LEN_1:
align = 0;
+ if (info->mask)
+ align = info->mask;
break;
case X86_BREAKPOINT_LEN_2:
align = 1;
@@ -332,7 +321,7 @@ int arch_validate_hwbkpt_settings(struct perf_event *bp)
break;
#endif
default:
- return ret;
+ BUG();
}

/*
--
1.8.1.2

2013-10-02 16:12:43

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [PATCH 2/3] perf tools: allow user to specify hardware breakpoint bp_len

From: Jacob Shin <[email protected]>

Currently bp_len is given a default value of 4. Allow user to override it:

$ perf stat -e mem:0x1000/8
^
bp_len

If no value is given, it will default to 4 as it did before.

Signed-off-by: Jacob Shin <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
tools/perf/Documentation/perf-record.txt | 7 +++++--
tools/perf/util/parse-events.c | 17 +++++++----------
tools/perf/util/parse-events.h | 2 +-
tools/perf/util/parse-events.l | 1 +
tools/perf/util/parse-events.y | 26 ++++++++++++++++++++++++--
5 files changed, 38 insertions(+), 15 deletions(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index e297b74..e07491f 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -33,12 +33,15 @@ OPTIONS
- a raw PMU event (eventsel+umask) in the form of rNNN where NNN is a
hexadecimal event descriptor.

- - a hardware breakpoint event in the form of '\mem:addr[:access]'
+ - a hardware breakpoint event in the form of '\mem:addr[/len][:access]'
where addr is the address in memory you want to break in.
Access is the memory access type (read, write, execute) it can
- be passed as follows: '\mem:addr[:[r][w][x]]'.
+ be passed as follows: '\mem:addr[:[r][w][x]]'. len is the range,
+ number of bytes from specified addr, which the breakpoint will cover.
If you want to profile read-write accesses in 0x1000, just set
'mem:0x1000:rw'.
+ If you want to profile write accesses in [0x1000~1008), just set
+ 'mem:0x1000/8:w'.

--filter=<filter>::
Event filter.
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 9812531..65d81ae 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -515,7 +515,7 @@ do { \
}

int parse_events_add_breakpoint(struct list_head *list, int *idx,
- void *ptr, char *type)
+ void *ptr, char *type, u64 len)
{
struct perf_event_attr attr;

@@ -525,14 +525,11 @@ int parse_events_add_breakpoint(struct list_head *list, int *idx,
if (parse_breakpoint_type(type, &attr))
return -EINVAL;

- /*
- * We should find a nice way to override the access length
- * Provide some defaults for now
- */
- if (attr.bp_type == HW_BREAKPOINT_X)
- attr.bp_len = sizeof(long);
- else
- attr.bp_len = HW_BREAKPOINT_LEN_4;
+ /* Provide some defaults if len is not specified */
+ if (!len)
+ len = attr.bp_type == HW_BREAKPOINT_X ? sizeof(long) :
+ HW_BREAKPOINT_LEN_4;
+ attr.bp_len = len;

attr.type = PERF_TYPE_BREAKPOINT;
attr.sample_period = 1;
@@ -1238,7 +1235,7 @@ void print_events(const char *event_glob, bool name_only)
printf("\n");

printf(" %-50s [%s]\n",
- "mem:<addr>[:access]",
+ "mem:<addr>[/len][:access]",
event_type_descriptors[PERF_TYPE_BREAKPOINT]);
printf("\n");
}
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index f1cb4c4..54a20df 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -93,7 +93,7 @@ int parse_events_add_numeric(struct list_head *list, int *idx,
int parse_events_add_cache(struct list_head *list, int *idx,
char *type, char *op_result1, char *op_result2);
int parse_events_add_breakpoint(struct list_head *list, int *idx,
- void *ptr, char *type);
+ void *ptr, char *type, u64 len);
int parse_events_add_pmu(struct list_head *list, int *idx,
char *pmu , struct list_head *head_config);
void parse_events__set_leader(char *name, struct list_head *list);
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 91346b7..2fde75f 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -193,6 +193,7 @@ r{num_raw_hex} { return raw(yyscanner); }
<mem>{
{modifier_bp} { return str(yyscanner, PE_MODIFIER_BP); }
: { return ':'; }
+"/" { return '/'; }
{num_dec} { return value(yyscanner, 10); }
{num_hex} { return value(yyscanner, 16); }
/*
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index 4eb67ec..c9adf4d 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -276,6 +276,28 @@ PE_NAME_CACHE_TYPE
}

event_legacy_mem:
+PE_PREFIX_MEM PE_VALUE '/' PE_VALUE ':' PE_MODIFIER_BP sep_dc
+{
+ struct parse_events_evlist *data = _data;
+ struct list_head *list;
+
+ ALLOC_LIST(list);
+ ABORT_ON(parse_events_add_breakpoint(list, &data->idx,
+ (void *) $2, $6, $4));
+ $$ = list;
+}
+|
+PE_PREFIX_MEM PE_VALUE '/' PE_VALUE sep_dc
+{
+ struct parse_events_evlist *data = _data;
+ struct list_head *list;
+
+ ALLOC_LIST(list);
+ ABORT_ON(parse_events_add_breakpoint(list, &data->idx,
+ (void *) $2, NULL, $4));
+ $$ = list;
+}
+|
PE_PREFIX_MEM PE_VALUE ':' PE_MODIFIER_BP sep_dc
{
struct parse_events_evlist *data = _data;
@@ -283,7 +305,7 @@ PE_PREFIX_MEM PE_VALUE ':' PE_MODIFIER_BP sep_dc

ALLOC_LIST(list);
ABORT_ON(parse_events_add_breakpoint(list, &data->idx,
- (void *) $2, $4));
+ (void *) $2, $4, 0));
$$ = list;
}
|
@@ -294,7 +316,7 @@ PE_PREFIX_MEM PE_VALUE sep_dc

ALLOC_LIST(list);
ABORT_ON(parse_events_add_breakpoint(list, &data->idx,
- (void *) $2, NULL));
+ (void *) $2, NULL, 0));
$$ = list;
}

--
1.8.1.2

2013-10-02 16:22:32

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH V5 0/3] perf/x86/amd: AMD Family 16h Data Breakpoint Extensions

On 10/02, [email protected] wrote:
>
> From: Suravee Suthikulpanit <[email protected]>
>
> Frederic, this is the rebase of the V4 patch onto the linux-3.12.0-rc3 (linux.git),
> and retest.

But the code is the same? If yes,

Reviewed-by: Oleg Nesterov <[email protected]>


To remind, a range on executable breakpoint is problematic, but I do
not blame this series. We need to fix the generic code and kill
HW_BREAKPOINT_X.

And, heh, it would be nice if the hardware could tell us the _exact_
address (from the range) which should be blamed.

Oleg.

2013-10-02 16:54:11

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: Re: [PATCH V5 0/3] perf/x86/amd: AMD Family 16h Data Breakpoint Extensions

On 10/2/2013 11:15 AM, Oleg Nesterov wrote:
> On 10/02,[email protected] wrote:
>> >
>> >From: Suravee Suthikulpanit<[email protected]>
>> >
>> >Frederic, this is the rebase of the V4 patch onto the linux-3.12.0-rc3 (linux.git),
>> >and retest.
> But the code is the same? If yes,
>
> Reviewed-by: Oleg Nesterov<[email protected]>

The only change I made was in the tools/perf/util/parse-events.y due to change in parse_events_add_breakpoint(), but should be the same logic.

Suravee.

2013-10-31 09:43:35

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH V5 0/3] perf/x86/amd: AMD Family 16h Data Breakpoint Extensions

On Wed, Oct 02, 2013 at 11:11:05AM -0500, [email protected] wrote:
> From: Suravee Suthikulpanit <[email protected]>
>
> Frederic, this is the rebase of the V4 patch onto the linux-3.12.0-rc3 (linux.git),
> and retest.
>
> The following patchset enables hardware breakpoint bp_len greater than
> HW_BREAKPOINT_LEN_8 on AMD Family 16h and later.
>
> $ perf stat -e mem:0x1000/16:w a.out
> ^^
> bp_len
>
> This will count writes to [0x1000 ~ 0x1010)

This interface looks good to me. I mean it seems flexible enough to provide
a range of address that fits anyway whenever the architecture supports breakpoint
ranges through either a length or a mask.

Thanks.

>
> V5:
> * Rebase onto 3.12.0-rc3.
> * Modify the tools/perf/util/parse-events.y due to change in
> parse_events_add_breakpoint().
>
> V4:
> Even more per Oleg's suggestion:
> * Further simplify info->len and info->mask setting switch statement
>
> V3:
> More per Oleg's suggestion:
> * Use already existing bp_len instead of changing userland API and
> in kernel turn bp_len into proper AMD hardware breakpoint address
> mask.
>
> V2:
> Per Oleg's suggestions:
> * Moved testing of bp_addr_mask to validate_hw_breakpoint()
> * Changed perf tool syntax to mem:<addr>[/addr_mask][:access]
>
>
> Jacob Shin (3):
> perf/x86/amd: AMD support for bp_len > HW_BREAKPOINT_LEN_8
> perf tools: allow user to specify hardware breakpoint bp_len
> perf tools: add hardware breakpoint bp_len test cases
>
> arch/x86/include/asm/cpufeature.h | 2 ++
> arch/x86/include/asm/debugreg.h | 5 +++
> arch/x86/include/asm/hw_breakpoint.h | 1 +
> arch/x86/include/uapi/asm/msr-index.h | 4 +++
> arch/x86/kernel/cpu/amd.c | 19 +++++++++++
> arch/x86/kernel/hw_breakpoint.c | 47 +++++++++++----------------
> tools/perf/Documentation/perf-record.txt | 7 ++--
> tools/perf/tests/parse-events.c | 55 ++++++++++++++++++++++++++++++++
> tools/perf/util/parse-events.c | 17 ++++------
> tools/perf/util/parse-events.h | 2 +-
> tools/perf/util/parse-events.l | 1 +
> tools/perf/util/parse-events.y | 26 +++++++++++++--
> 12 files changed, 142 insertions(+), 44 deletions(-)
>
> --
> 1.8.1.2
>
>

2013-10-31 09:58:36

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH 1/3] perf/x86/amd: AMD support for bp_len > HW_BREAKPOINT_LEN_8

On Wed, Oct 02, 2013 at 11:11:06AM -0500, [email protected] wrote:
> From: Jacob Shin <[email protected]>
>
> Implement hardware breakpoint address mask for AMD Family 16h and
> above processors. CPUID feature bit indicates hardware support for
> DRn_ADDR_MASK MSRs. These masks further qualify DRn/DR7 hardware
> breakpoint addresses to allow matching of larger addresses ranges.
>
> Valuable advice and pseudo code from Oleg Nesterov <[email protected]>
>
> Signed-off-by: Jacob Shin <[email protected]>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/include/asm/cpufeature.h | 2 ++
> arch/x86/include/asm/debugreg.h | 5 ++++
> arch/x86/include/asm/hw_breakpoint.h | 1 +
> arch/x86/include/uapi/asm/msr-index.h | 4 +++
> arch/x86/kernel/cpu/amd.c | 19 ++++++++++++++
> arch/x86/kernel/hw_breakpoint.c | 47 ++++++++++++++---------------------
> 6 files changed, 49 insertions(+), 29 deletions(-)
>
> diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
> index d3f5c63..26609bb 100644
> --- a/arch/x86/include/asm/cpufeature.h
> +++ b/arch/x86/include/asm/cpufeature.h
> @@ -170,6 +170,7 @@
> #define X86_FEATURE_TOPOEXT (6*32+22) /* topology extensions CPUID leafs */
> #define X86_FEATURE_PERFCTR_CORE (6*32+23) /* core performance counter extensions */
> #define X86_FEATURE_PERFCTR_NB (6*32+24) /* NB performance counter extensions */
> +#define X86_FEATURE_BPEXT (6*32+26) /* data breakpoint extension */

Is this perhaps a bit too generic? Extension can mean about everything and who knows
what other feature Intel and Amd will add to breakpoints in the future.

How about X86_FEATURE_BP_RANGE?

> #define X86_FEATURE_PERFCTR_L2 (6*32+28) /* L2 performance counter extensions */
>
> /*
> @@ -332,6 +333,7 @@ extern const char * const x86_power_flags[32];
> #define cpu_has_cx16 boot_cpu_has(X86_FEATURE_CX16)
> #define cpu_has_eager_fpu boot_cpu_has(X86_FEATURE_EAGER_FPU)
> #define cpu_has_topoext boot_cpu_has(X86_FEATURE_TOPOEXT)
> +#define cpu_has_bpext boot_cpu_has(X86_FEATURE_BPEXT)
>
> #ifdef CONFIG_X86_64
>
> diff --git a/arch/x86/include/asm/debugreg.h b/arch/x86/include/asm/debugreg.h
> index 4b528a9..145b009 100644
> --- a/arch/x86/include/asm/debugreg.h
> +++ b/arch/x86/include/asm/debugreg.h
> @@ -114,5 +114,10 @@ static inline void debug_stack_usage_inc(void) { }
> static inline void debug_stack_usage_dec(void) { }
> #endif /* X86_64 */
>
> +#ifdef CONFIG_CPU_SUP_AMD
> +extern void set_dr_addr_mask(unsigned long mask, int dr);
> +#else
> +static inline void set_dr_addr_mask(unsigned long mask, int dr) { }
> +#endif
>
> #endif /* _ASM_X86_DEBUGREG_H */
> diff --git a/arch/x86/include/asm/hw_breakpoint.h b/arch/x86/include/asm/hw_breakpoint.h
> index ef1c4d2..6c98be8 100644
> --- a/arch/x86/include/asm/hw_breakpoint.h
> +++ b/arch/x86/include/asm/hw_breakpoint.h
> @@ -12,6 +12,7 @@
> */
> struct arch_hw_breakpoint {
> unsigned long address;
> + unsigned long mask;
> u8 len;

So it's a bit sad that we have both len and mask. OTOH it's my fault because we have that len
thing that is actually buggy for instruction breakpoints and needs to be sizeof(long) (who knows
what kind of fluorescent bier I drank before writing that).

But Oleg had a patch to fix that.

Oleg?

> u8 type;
> };
> diff --git a/arch/x86/include/uapi/asm/msr-index.h b/arch/x86/include/uapi/asm/msr-index.h
> index bb04650..1f04f6c 100644
> --- a/arch/x86/include/uapi/asm/msr-index.h
> +++ b/arch/x86/include/uapi/asm/msr-index.h
> @@ -205,6 +205,10 @@
> /* Fam 16h MSRs */
> #define MSR_F16H_L2I_PERF_CTL 0xc0010230
> #define MSR_F16H_L2I_PERF_CTR 0xc0010231
> +#define MSR_F16H_DR1_ADDR_MASK 0xc0011019
> +#define MSR_F16H_DR2_ADDR_MASK 0xc001101a
> +#define MSR_F16H_DR3_ADDR_MASK 0xc001101b
> +#define MSR_F16H_DR0_ADDR_MASK 0xc0011027
>
> /* Fam 15h MSRs */
> #define MSR_F15H_PERF_CTL 0xc0010200
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index 903a264..fffc9cb 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -909,3 +909,22 @@ static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum)
>
> return false;
> }
> +
> +void set_dr_addr_mask(unsigned long mask, int dr)
> +{
> + if (!cpu_has_bpext)
> + return;
> +
> + switch (dr) {
> + case 0:
> + wrmsr(MSR_F16H_DR0_ADDR_MASK, mask, 0);
> + break;
> + case 1:
> + case 2:
> + case 3:
> + wrmsr(MSR_F16H_DR1_ADDR_MASK - 1 + dr, mask, 0);
> + break;
> + default:
> + break;
> + }
> +}
> diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
> index f66ff16..3cb1951 100644
> --- a/arch/x86/kernel/hw_breakpoint.c
> +++ b/arch/x86/kernel/hw_breakpoint.c
> @@ -128,6 +128,8 @@ int arch_install_hw_breakpoint(struct perf_event *bp)
> *dr7 |= encode_dr7(i, info->len, info->type);
>
> set_debugreg(*dr7, 7);
> + if (info->mask)
> + set_dr_addr_mask(info->mask, i);
>
> return 0;
> }
> @@ -163,29 +165,8 @@ void arch_uninstall_hw_breakpoint(struct perf_event *bp)
> *dr7 &= ~__encode_dr7(i, info->len, info->type);
>
> set_debugreg(*dr7, 7);
> -}
> -
> -static int get_hbp_len(u8 hbp_len)
> -{
> - unsigned int len_in_bytes = 0;
> -
> - switch (hbp_len) {
> - case X86_BREAKPOINT_LEN_1:
> - len_in_bytes = 1;
> - break;
> - case X86_BREAKPOINT_LEN_2:
> - len_in_bytes = 2;
> - break;
> - case X86_BREAKPOINT_LEN_4:
> - len_in_bytes = 4;
> - break;
> -#ifdef CONFIG_X86_64
> - case X86_BREAKPOINT_LEN_8:
> - len_in_bytes = 8;
> - break;
> -#endif
> - }
> - return len_in_bytes;
> + if (info->mask)
> + set_dr_addr_mask(0, i);
> }
>
> /*
> @@ -198,7 +179,7 @@ int arch_check_bp_in_kernelspace(struct perf_event *bp)
> struct arch_hw_breakpoint *info = counter_arch_bp(bp);
>
> va = info->address;
> - len = get_hbp_len(info->len);
> + len = bp->attr.bp_len;
>
> return (va >= TASK_SIZE) && ((va + len - 1) >= TASK_SIZE);
> }
> @@ -279,7 +260,16 @@ static int arch_build_bp_info(struct perf_event *bp)
> }
>
> /* Len */
> + info->mask = 0;
> +
> switch (bp->attr.bp_len) {
> + default:
> + if (!is_power_of_2(bp->attr.bp_len))
> + return -EINVAL;
> + if (!cpu_has_bpext)
> + return -EOPNOTSUPP;
> + info->mask = bp->attr.bp_len - 1;
> + /* fall through */
> case HW_BREAKPOINT_LEN_1:
> info->len = X86_BREAKPOINT_LEN_1;
> break;
> @@ -294,12 +284,11 @@ static int arch_build_bp_info(struct perf_event *bp)
> info->len = X86_BREAKPOINT_LEN_8;
> break;
> #endif
> - default:
> - return -EINVAL;
> }
>
> return 0;
> }
> +
> /*
> * Validate the arch-specific HW Breakpoint register settings
> */
> @@ -314,11 +303,11 @@ int arch_validate_hwbkpt_settings(struct perf_event *bp)
> if (ret)
> return ret;
>
> - ret = -EINVAL;
> -
> switch (info->len) {
> case X86_BREAKPOINT_LEN_1:
> align = 0;
> + if (info->mask)
> + align = info->mask;
> break;
> case X86_BREAKPOINT_LEN_2:
> align = 1;
> @@ -332,7 +321,7 @@ int arch_validate_hwbkpt_settings(struct perf_event *bp)
> break;
> #endif
> default:
> - return ret;
> + BUG();

Please use WARN_ON_ONCE(1) instead. BUG() makes the machine unusable, which may make the
user unhappy and the warning hard to log. BUG() is good only when letting things run may
compromize user persistent storage data integrity or so, like a scary filesystem bug for example.


> }
>
> /*
> --
> 1.8.1.2
>
>

2013-10-31 10:48:50

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 1/3] perf/x86/amd: AMD support for bp_len > HW_BREAKPOINT_LEN_8

fix tglx's address.

On Thu, Oct 31, 2013 at 10:58:31AM +0100, Frederic Weisbecker wrote:
> Is this perhaps a bit too generic? Extension can mean about everything
> and who knows what other feature Intel and Amd will add to breakpoints
> in the future.

Yeah, but that's the name of the feature. When they come up with another
extension, they'll call it BP_EXT2, most probably...

:-)

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-10-31 11:23:36

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH 1/3] perf/x86/amd: AMD support for bp_len > HW_BREAKPOINT_LEN_8

On Thu, Oct 31, 2013 at 11:48:36AM +0100, Borislav Petkov wrote:
> fix tglx's address.
>
> On Thu, Oct 31, 2013 at 10:58:31AM +0100, Frederic Weisbecker wrote:
> > Is this perhaps a bit too generic? Extension can mean about everything
> > and who knows what other feature Intel and Amd will add to breakpoints
> > in the future.
>
> Yeah, but that's the name of the feature. When they come up with another
> extension, they'll call it BP_EXT2, most probably...
>
> :-)

:-)

Ok we can keep that naming then, at least on the feature symbol. But add a comment
on it.

2013-10-31 16:48:25

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 1/3] perf/x86/amd: AMD support for bp_len > HW_BREAKPOINT_LEN_8

On 10/31, Frederic Weisbecker wrote:
>
> On Wed, Oct 02, 2013 at 11:11:06AM -0500, [email protected] wrote:
> > --- a/arch/x86/include/asm/hw_breakpoint.h
> > +++ b/arch/x86/include/asm/hw_breakpoint.h
> > @@ -12,6 +12,7 @@
> > */
> > struct arch_hw_breakpoint {
> > unsigned long address;
> > + unsigned long mask;
> > u8 len;
>
> So it's a bit sad that we have both len and mask.

Yes, we can probably remove it later, in fact iirc it is not strictly
necessary right now. But this is minor, we can do this later and the
code looks simpler this way.

> thing that is actually buggy for instruction breakpoints and needs to
> be sizeof(long) (who knows
> what kind of fluorescent bier I drank before writing that).
>
> But Oleg had a patch to fix that.

Yes, we already discussed some draft patches. And one of the problems
was this series. I mean, the changes we discussed conflict with these
patches, I think we should fix this after this series is merged.

Oleg.