2010-08-26 12:02:20

by DDD

[permalink] [raw]
Subject: [PATCH] perf: fix possible divide-by-zero in perf_swevent_overflow()

The event->hw.last_period is possible to zero, thus it will
cause divide_by_zero later in perf_swevent_set_period().

This patch checks event->hw.last_period before invoke
perf_swevent_set_period() and replaces "event->hw" with "hwc".

Signed-off-by: Dongdong Deng <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Paul Mackerras <[email protected]>
---
kernel/perf_event.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 403d180..ccba741 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -4050,8 +4050,8 @@ static void perf_swevent_overflow(struct perf_event *event, u64 overflow,
struct hw_perf_event *hwc = &event->hw;
int throttle = 0;

- data->period = event->hw.last_period;
- if (!overflow)
+ data->period = hwc->last_period;
+ if (!overflow && hwc->last_period)
overflow = perf_swevent_set_period(event);

if (hwc->interrupts == MAX_INTERRUPTS)
--
1.6.0.4


2010-08-26 12:12:21

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] perf: fix possible divide-by-zero in perf_swevent_overflow()

On Thu, 2010-08-26 at 20:07 +0800, Dongdong Deng wrote:
> The event->hw.last_period is possible to zero, thus it will
> cause divide_by_zero later in perf_swevent_set_period().

How can it be zero?

> This patch checks event->hw.last_period before invoke
> perf_swevent_set_period() and replaces "event->hw" with "hwc".
>
> Signed-off-by: Dongdong Deng <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Frederic Weisbecker <[email protected]>
> Cc: Arnaldo Carvalho de Melo <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> ---
> kernel/perf_event.c | 4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/perf_event.c b/kernel/perf_event.c
> index 403d180..ccba741 100644
> --- a/kernel/perf_event.c
> +++ b/kernel/perf_event.c
> @@ -4050,8 +4050,8 @@ static void perf_swevent_overflow(struct perf_event *event, u64 overflow,
> struct hw_perf_event *hwc = &event->hw;
> int throttle = 0;
>
> - data->period = event->hw.last_period;
> - if (!overflow)
> + data->period = hwc->last_period;
> + if (!overflow && hwc->last_period)
> overflow = perf_swevent_set_period(event);
>
> if (hwc->interrupts == MAX_INTERRUPTS)

2010-08-26 12:31:32

by DDD

[permalink] [raw]
Subject: Re: [PATCH] perf: fix possible divide-by-zero in perf_swevent_overflow()

Peter Zijlstra wrote:
> On Thu, 2010-08-26 at 20:07 +0800, Dongdong Deng wrote:
>> The event->hw.last_period is possible to zero, thus it will
>> cause divide_by_zero later in perf_swevent_set_period().
>
> How can it be zero?

When I am running the kgdbts to test the hw_breakpoint_layer with kgdb,
I get a call trace as following and this problem is hardly to reproduce.

Maybe the root cause was from kgdb/hw_breakpoint_layer,
but add a checking is good to us and harmless. :-)

Thanks,
Dongdong



------------[ cut here ]------------
WARNING: at /buildarea/ddeng/build/linux/drivers/misc/kgdbts.c:703
run_simple_test+0x1fc/0x2a0()
Hardware name: Moon Creek platform
Modules linked in:
Pid: 668, comm: sh Tainted: G W 2.6.36-rc1-00133-g47ade1d #43
Call Trace:
[<c103b1fd>] warn_slowpath_common+0x6d/0xa0
[<c124e15c>] ? run_simple_test+0x1fc/0x2a0
[<c124e15c>] ? run_simple_test+0x1fc/0x2a0
[<c103b245>] warn_slowpath_null+0x15/0x20
[<c124e15c>] run_simple_test+0x1fc/0x2a0
[<c124daf4>] kgdbts_put_char+0x14/0x20
[<c107bab8>] gdb_serial_stub+0x678/0xbd0
[<c101aced>] ? default_send_IPI_allbutself+0x7d/0x90
[<c107a487>] kgdb_cpu_enter+0x197/0x470
[<c107a8e7>] kgdb_handle_exception+0x47/0x140
[<c13be9f0>] ? _raw_read_unlock+0x10/0x30
[<c101eb48>] __kgdb_notify+0x38/0x170
[<c130f1e0>] ? sock_queue_rcv_skb+0xe0/0x130
[<c101ec8e>] kgdb_notify+0xe/0x20
[<c105b815>] notifier_call_chain+0x35/0x70
[<c1003910>] ? do_divide_error+0x0/0xa0
[<c105bce8>] __atomic_notifier_call_chain+0x28/0x50
[<c105bd2a>] atomic_notifier_call_chain+0x1a/0x20
[<c105bd7d>] notify_die+0x2d/0x30
[<c1003964>] do_divide_error+0x54/0xa0
[<c11cf3d5>] ? div64_u64+0x55/0x80
[<c1361c22>] ? udp_rcv+0x12/0x20
[<c133e6e7>] ? ip_local_deliver_finish+0x127/0x240
[<c1361c22>] ? udp_rcv+0x12/0x20
[<c133e887>] ? ip_local_deliver+0x87/0x90
[<c109bb8e>] ? perf_output_begin+0x1fe/0x230
[<c133e0c3>] ? ip_rcv_finish+0xc3/0x320
[<c13bf822>] error_code+0x66/0x6c
[<c1003910>] ? do_divide_error+0x0/0xa0
[<c11cf3d5>] ? div64_u64+0x55/0x80
[<c1097cac>] perf_swevent_set_period+0x7c/0x100
[<c109c797>] perf_swevent_overflow+0x87/0x90
[<c109c85f>] perf_swevent_add+0xbf/0xd0
[<c109c8ce>] perf_bp_event+0x5e/0x80
[<c1289be7>] ? e1000_clean+0xf7/0x260
[<c1008941>] hw_breakpoint_exceptions_notify+0x111/0x170
[<c105b815>] notifier_call_chain+0x35/0x70
[<c105bce8>] __atomic_notifier_call_chain+0x28/0x50
[<c105bd2a>] atomic_notifier_call_chain+0x1a/0x20
[<c105bd7d>] notify_die+0x2d/0x30
[<c1004088>] do_debug+0x68/0x270
[<c101a4d4>] ? smp_apic_timer_interrupt+0x74/0x140
[<c13bf871>] debug_stack_correct+0x29/0x30
[<c101007b>] ? hw_perf_event_init+0x1bb/0x820
[<c124de60>] ? kgdbts_break_test+0x0/0x30
[<c124e870>] ? configure_kgdbts+0x1d0/0x4c0
[<c10544f0>] ? param_attr_store+0x0/0x20
[<c124ebad>] param_set_kgdbts_var+0x4d/0xb0
[<c124eb60>] ? param_set_kgdbts_var+0x0/0xb0
[<c1054507>] param_attr_store+0x17/0x20
[<c105458c>] module_attr_store+0x2c/0x40
[<c111fe54>] sysfs_write_file+0x94/0xf0
[<c10d42c6>] vfs_write+0x96/0x130
[<c111fdc0>] ? sysfs_write_file+0x0/0xf0
[<c10d44a6>] sys_write+0x46/0xd0
[<c13bf209>] system_call_done+0x0/0x4
---[ end trace e0a820ecc2c65c24 ]---


>
>> This patch checks event->hw.last_period before invoke
>> perf_swevent_set_period() and replaces "event->hw" with "hwc".
>>
>> Signed-off-by: Dongdong Deng <[email protected]>
>> Cc: Peter Zijlstra <[email protected]>
>> Cc: Frederic Weisbecker <[email protected]>
>> Cc: Arnaldo Carvalho de Melo <[email protected]>
>> Cc: Paul Mackerras <[email protected]>
>> ---
>> kernel/perf_event.c | 4 ++--
>> 1 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/perf_event.c b/kernel/perf_event.c
>> index 403d180..ccba741 100644
>> --- a/kernel/perf_event.c
>> +++ b/kernel/perf_event.c
>> @@ -4050,8 +4050,8 @@ static void perf_swevent_overflow(struct perf_event *event, u64 overflow,
>> struct hw_perf_event *hwc = &event->hw;
>> int throttle = 0;
>>
>> - data->period = event->hw.last_period;
>> - if (!overflow)
>> + data->period = hwc->last_period;
>> + if (!overflow && hwc->last_period)
>> overflow = perf_swevent_set_period(event);
>>
>> if (hwc->interrupts == MAX_INTERRUPTS)
>
>
>

2010-08-26 12:58:21

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] perf: fix possible divide-by-zero in perf_swevent_overflow()

On Thu, 2010-08-26 at 20:36 +0800, DDD wrote:
> Peter Zijlstra wrote:
> > On Thu, 2010-08-26 at 20:07 +0800, Dongdong Deng wrote:
> >> The event->hw.last_period is possible to zero, thus it will
> >> cause divide_by_zero later in perf_swevent_set_period().
> >
> > How can it be zero?
>
> When I am running the kgdbts to test the hw_breakpoint_layer with kgdb,
> I get a call trace as following and this problem is hardly to reproduce.
>
> Maybe the root cause was from kgdb/hw_breakpoint_layer,

Yeah, I think there's a bug in the hw_breakpoint stuff, does something
like the below fix it?


> but add a checking is good to us and harmless. :-)

Except that code path is already too bloated and should be reduced not
added to and conditionals are expensive.


---
kernel/hw_breakpoint.c | 13 ++++++++++++-
1 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/kernel/hw_breakpoint.c b/kernel/hw_breakpoint.c
index d71a987..f57ebee 100644
--- a/kernel/hw_breakpoint.c
+++ b/kernel/hw_breakpoint.c
@@ -600,9 +600,20 @@ static int __init init_hw_breakpoint(void)
}
core_initcall(init_hw_breakpoint);

+static int hw_breakpoint_enable(struct perf_event *event)
+{
+ struct hw_perf_event *hwc = &event->hw;
+
+ if (hwc->sample_period) {
+ hwc->last_period = hwc->sample_period;
+ perf_swevent_set_period(event);
+ }
+
+ return arch_install_hw_breakpoint(event);
+}

struct pmu perf_ops_bp = {
- .enable = arch_install_hw_breakpoint,
+ .enable = hw_breakpoint_enable,
.disable = arch_uninstall_hw_breakpoint,
.read = hw_breakpoint_pmu_read,
};

2010-08-27 12:16:25

by DDD

[permalink] [raw]
Subject: Re: [PATCH] perf: fix possible divide-by-zero in perf_swevent_overflow()

Peter Zijlstra wrote:
> On Thu, 2010-08-26 at 20:36 +0800, DDD wrote:
>> Peter Zijlstra wrote:
>>> On Thu, 2010-08-26 at 20:07 +0800, Dongdong Deng wrote:
>>>> The event->hw.last_period is possible to zero, thus it will
>>>> cause divide_by_zero later in perf_swevent_set_period().
>>> How can it be zero?
>> When I am running the kgdbts to test the hw_breakpoint_layer with kgdb,
>> I get a call trace as following and this problem is hardly to reproduce.
>>
>> Maybe the root cause was from kgdb/hw_breakpoint_layer,
>
> Yeah, I think there's a bug in the hw_breakpoint stuff, does something
> like the below fix it?

hello Peter,

Thanks for your patch, but I still could reproduce the problem with your
patch.


Thanks,
Dongdong




>
>
>> but add a checking is good to us and harmless. :-)
>
> Except that code path is already too bloated and should be reduced not
> added to and conditionals are expensive.
>
>
> ---
> kernel/hw_breakpoint.c | 13 ++++++++++++-
> 1 files changed, 12 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/hw_breakpoint.c b/kernel/hw_breakpoint.c
> index d71a987..f57ebee 100644
> --- a/kernel/hw_breakpoint.c
> +++ b/kernel/hw_breakpoint.c
> @@ -600,9 +600,20 @@ static int __init init_hw_breakpoint(void)
> }
> core_initcall(init_hw_breakpoint);
>
> +static int hw_breakpoint_enable(struct perf_event *event)
> +{
> + struct hw_perf_event *hwc = &event->hw;
> +
> + if (hwc->sample_period) {
> + hwc->last_period = hwc->sample_period;
> + perf_swevent_set_period(event);
> + }
> +
> + return arch_install_hw_breakpoint(event);
> +}
>
> struct pmu perf_ops_bp = {
> - .enable = arch_install_hw_breakpoint,
> + .enable = hw_breakpoint_enable,
> .disable = arch_uninstall_hw_breakpoint,
> .read = hw_breakpoint_pmu_read,
> };
>
>

2010-08-27 12:51:01

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] perf: fix possible divide-by-zero in perf_swevent_overflow()

On Fri, 2010-08-27 at 20:21 +0800, DDD wrote:

> >> Maybe the root cause was from kgdb/hw_breakpoint_layer,
> >
> > Yeah, I think there's a bug in the hw_breakpoint stuff, does something
> > like the below fix it?

> Thanks for your patch, but I still could reproduce the problem with your
> patch.

Frederic, any clue as to what makes hw breakpoints go funny and have
last_period == 0?

> > ---
> > kernel/hw_breakpoint.c | 13 ++++++++++++-
> > 1 files changed, 12 insertions(+), 1 deletions(-)
> >
> > diff --git a/kernel/hw_breakpoint.c b/kernel/hw_breakpoint.c
> > index d71a987..f57ebee 100644
> > --- a/kernel/hw_breakpoint.c
> > +++ b/kernel/hw_breakpoint.c
> > @@ -600,9 +600,20 @@ static int __init init_hw_breakpoint(void)
> > }
> > core_initcall(init_hw_breakpoint);
> >
> > +static int hw_breakpoint_enable(struct perf_event *event)
> > +{
> > + struct hw_perf_event *hwc = &event->hw;
> > +
> > + if (hwc->sample_period) {
> > + hwc->last_period = hwc->sample_period;
> > + perf_swevent_set_period(event);
> > + }
> > +
> > + return arch_install_hw_breakpoint(event);
> > +}
> >
> > struct pmu perf_ops_bp = {
> > - .enable = arch_install_hw_breakpoint,
> > + .enable = hw_breakpoint_enable,
> > .disable = arch_uninstall_hw_breakpoint,
> > .read = hw_breakpoint_pmu_read,
> > };
> >
> >

2010-08-27 13:14:26

by DDD

[permalink] [raw]
Subject: Re: [PATCH] perf: fix possible divide-by-zero in perf_swevent_overflow()

Peter Zijlstra wrote:
> On Fri, 2010-08-27 at 20:21 +0800, DDD wrote:
>
>>>> Maybe the root cause was from kgdb/hw_breakpoint_layer,
>>> Yeah, I think there's a bug in the hw_breakpoint stuff, does something
>>> like the below fix it?
>
>> Thanks for your patch, but I still could reproduce the problem with your
>> patch.
>
> Frederic, any clue as to what makes hw breakpoints go funny and have
> last_period == 0?

Hi Peter,

Thanks for you take care of it, I have got the root cause of it now.

It is the kgdb using hw_breakpoint_layer's API problem, and I have a RFC
patch for it.(maybe it is not correctly), I will contact with Jason to
fix this problem.

Thank you very much,
Dongdong

>
>>> ---
>>> kernel/hw_breakpoint.c | 13 ++++++++++++-
>>> 1 files changed, 12 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/kernel/hw_breakpoint.c b/kernel/hw_breakpoint.c
>>> index d71a987..f57ebee 100644
>>> --- a/kernel/hw_breakpoint.c
>>> +++ b/kernel/hw_breakpoint.c
>>> @@ -600,9 +600,20 @@ static int __init init_hw_breakpoint(void)
>>> }
>>> core_initcall(init_hw_breakpoint);
>>>
>>> +static int hw_breakpoint_enable(struct perf_event *event)
>>> +{
>>> + struct hw_perf_event *hwc = &event->hw;
>>> +
>>> + if (hwc->sample_period) {
>>> + hwc->last_period = hwc->sample_period;
>>> + perf_swevent_set_period(event);
>>> + }
>>> +
>>> + return arch_install_hw_breakpoint(event);
>>> +}
>>>
>>> struct pmu perf_ops_bp = {
>>> - .enable = arch_install_hw_breakpoint,
>>> + .enable = hw_breakpoint_enable,
>>> .disable = arch_uninstall_hw_breakpoint,
>>> .read = hw_breakpoint_pmu_read,
>>> };
>>>
>>>
>

2010-08-27 13:37:44

by Frederic Weisbecker

[permalink] [raw]
Subject: Re: [PATCH] perf: fix possible divide-by-zero in perf_swevent_overflow()

On Fri, Aug 27, 2010 at 09:19:39PM +0800, DDD wrote:
> Peter Zijlstra wrote:
>> On Fri, 2010-08-27 at 20:21 +0800, DDD wrote:
>>
>>>>> Maybe the root cause was from kgdb/hw_breakpoint_layer,
>>>> Yeah, I think there's a bug in the hw_breakpoint stuff, does something
>>>> like the below fix it?
>>
>>> Thanks for your patch, but I still could reproduce the problem with
>>> your patch.
>>
>> Frederic, any clue as to what makes hw breakpoints go funny and have
>> last_period == 0?
>
> Hi Peter,
>
> Thanks for you take care of it, I have got the root cause of it now.
>
> It is the kgdb using hw_breakpoint_layer's API problem, and I have a RFC
> patch for it.(maybe it is not correctly), I will contact with Jason to
> fix this problem.
>
> Thank you very much,
> Dongdong


Thanks!

We are waiting for your patch then.