2022-12-29 13:01:12

by Yang Jihong

Subject: [PATCH v2] perf record: Fix coredump with --overwrite and --max-size

When --overwrite and --max-size options of perf record are used together,
a segmentation fault occurs. The following is an example:

# perf record -e sched:sched* --overwrite --max-size 1M -a -- sleep 1
[ perf record: Woken up 1 times to write data ]
perf: Segmentation fault
Obtained 1 stack frames.
[0xc4c67f]
Segmentation fault (core dumped)

The backtrace of the core file is as follows:

#0 0x0000000000417990 in process_locked_synthesized_event (tool=0x0, event=0x15, sample=0x1de0, machine=0xf8) at builtin-record.c:630
#1 0x000000000057ee53 in perf_event__synthesize_threads (nr_threads_synthesize=21, mmap_data=<optimized out>, needs_mmap=<optimized out>, machine=0x17ad9b0, process=<optimized out>, tool=0x0) at util/synthetic-events.c:1950
#2 __machine__synthesize_threads (nr_threads_synthesize=0, data_mmap=<optimized out>, needs_mmap=<optimized out>, process=<optimized out>, threads=0x8, target=0x8, tool=0x0, machine=0x17ad9b0) at util/synthetic-events.c:1936
#3 machine__synthesize_threads (machine=0x17ad9b0, target=0x8, threads=0x8, needs_mmap=<optimized out>, data_mmap=<optimized out>, nr_threads_synthesize=0) at util/synthetic-events.c:1947
#4 0x000000000040165d in record__synthesize (tail=<optimized out>, rec=0xbe2520 <record>) at builtin-record.c:2010
#5 0x0000000000403989 in __cmd_record (argc=<optimized out>, argv=<optimized out>, rec=0xbe2520 <record>) at builtin-record.c:2810
#6 0x00000000004196ba in record__init_thread_user_masks (rec=0xbe2520 <record>, cpus=0x17a65f0) at builtin-record.c:3837
#7 record__init_thread_masks (rec=0xbe2520 <record>) at builtin-record.c:3938
#8 cmd_record (argc=1, argv=0x7ffdd692dc60) at builtin-record.c:4241
#9 0x00000000004b701d in pager_command_config (var=0x0, value=0x15 <error: Cannot access memory at address 0x15>, data=0x1de0) at perf.c:117
#10 0x00000000004b732b in get_leaf_frame_caller_aarch64 (sample=0xfffffffb, thread=0x0, usr_idx=<optimized out>) at util/arm64-frame-pointer-unwind-support.c:56
#11 0x0000000000406331 in execv_dashed_external (argv=0x7ffdd692d9e8) at perf.c:410
#12 run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:431
#13 main (argc=<optimized out>, argv=0x7ffdd692d9e8) at perf.c:562

The reason is that record__bytes_written() accesses rec->thread_data after it
has been freed. The call sequence is as follows:
  __cmd_record
    -> record__free_thread_data
      -> zfree(&rec->thread_data) // free rec->thread_data
    -> record__synthesize
      -> perf_event__synthesize_id_index
        -> process_synthesized_event
          -> record__write
            -> record__bytes_written // access rec->thread_data

To fix this, we only need to check the value of done first.
Also add a NULL check of thread_data in record__bytes_written() for code hardening,
and store bytes_written in a local variable so that it is computed only once.

Fixes: 6d57581659f7 ("perf record: Add support for limit perf output file size")
Signed-off-by: Yang Jihong <[email protected]>
---

Changes since v1:
- Add variable check in record__bytes_written for code hardening.
- Save bytes_written separately to reduce one calculation.
- Remove rec->opts.tail_synthesize check.

tools/perf/builtin-record.c | 26 +++++++++++++++++---------
1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 29dcd454b8e2..acba9e43e519 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -230,16 +230,29 @@ static u64 record__bytes_written(struct record *rec)
 	u64 bytes_written = rec->bytes_written;
 	struct record_thread *thread_data = rec->thread_data;

+	if (thread_data == NULL)
+		return bytes_written;
+
 	for (t = 0; t < rec->nr_threads; t++)
 		bytes_written += thread_data[t].bytes_written;

 	return bytes_written;
 }

-static bool record__output_max_size_exceeded(struct record *rec)
+static void record__check_output_max_size_exceeded(struct record *rec)
 {
-	return rec->output_max_size &&
-	       (record__bytes_written(rec) >= rec->output_max_size);
+	u64 bytes_written;
+
+	if (rec->output_max_size == 0 || done)
+		return;
+
+	bytes_written = record__bytes_written(rec);
+	if (bytes_written >= rec->output_max_size) {
+		fprintf(stderr, "[ perf record: perf size limit reached (%" PRIu64 " KB),"
+				" stopping session ]\n", bytes_written >> 10);
+
+		done = 1;
+	}
 }

 static int record__write(struct record *rec, struct mmap *map __maybe_unused,
@@ -260,12 +273,7 @@ static int record__write(struct record *rec, struct mmap *map __maybe_unused,
 	else
 		rec->bytes_written += size;

-	if (record__output_max_size_exceeded(rec) && !done) {
-		fprintf(stderr, "[ perf record: perf size limit reached (%" PRIu64 " KB),"
-				" stopping session ]\n",
-				record__bytes_written(rec) >> 10);
-		done = 1;
-	}
+	record__check_output_max_size_exceeded(rec);

 	if (switch_output_size(rec))
 		trigger_hit(&switch_output_trigger);
--
2.30.GIT
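
For readers who want to try the idea outside the perf tree, the following is a minimal standalone sketch of the two guards in the patch above. It is not the perf code itself: the struct layouts are cut down to the fields the diff touches, and the names record_free_thread_data, record_bytes_written, record_check_output_max_size and the plain global done are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-ins for perf's struct record_thread and struct record. */
struct record_thread {
	uint64_t bytes_written;
};

struct record {
	uint64_t bytes_written;            /* bytes written by the main thread */
	int nr_threads;
	struct record_thread *thread_data; /* NULL once freed */
	uint64_t output_max_size;          /* 0 means "no limit" */
};

/* Stand-in for the 'done' flag used in builtin-record.c (assumed global). */
int done;

/* Mimics the effect of zfree(): free the array and reset the pointer. */
void record_free_thread_data(struct record *rec)
{
	free(rec->thread_data);
	rec->thread_data = NULL;
}

uint64_t record_bytes_written(const struct record *rec)
{
	uint64_t total;
	int t;

	assert(rec != NULL);
	total = rec->bytes_written;

	/* Guard from the patch: per-thread data may already be freed. */
	if (rec->thread_data == NULL)
		return total;

	for (t = 0; t < rec->nr_threads; t++)
		total += rec->thread_data[t].bytes_written;

	return total;
}

/* Returns 1 if this call tripped the size limit, 0 otherwise. */
int record_check_output_max_size(struct record *rec)
{
	/* Checking 'done' first avoids touching thread_data after shutdown. */
	if (rec->output_max_size == 0 || done)
		return 0;

	if (record_bytes_written(rec) >= rec->output_max_size) {
		done = 1;
		return 1;
	}
	return 0;
}
```

In this sketch, once record_free_thread_data() has run, record_bytes_written() reports only the main thread's counter instead of dereferencing a NULL pointer, and the limit check fires at most once because it returns early when done is already set.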


2023-01-02 17:11:27

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH v2] perf record: Fix coredump with --overwrite and --max-size

On Thu, Dec 29, 2022 at 12:47:28PM +0000, Yang Jihong wrote:
> Changes since v1:
> - Add variable check in record__bytes_written for code hardening.
> - Save bytes_written separately to reduce one calculation.
> - Remove rec->opts.tail_synthesize check.

Namhyung, are you ok with this now?

- Arnaldo


2023-01-03 17:52:12

by Namhyung Kim

Subject: Re: [PATCH v2] perf record: Fix coredump with --overwrite and --max-size

On Mon, Jan 2, 2023 at 8:20 AM Arnaldo Carvalho de Melo <[email protected]> wrote:
>
> On Thu, Dec 29, 2022 at 12:47:28PM +0000, Yang Jihong wrote:
> > @@ -230,16 +230,29 @@ static u64 record__bytes_written(struct record *rec)
> > u64 bytes_written = rec->bytes_written;
> > struct record_thread *thread_data = rec->thread_data;
> >
> > + if (thread_data == NULL)
> > + return bytes_written;
> > +

Then it won't count bytes written by threads, right?
I think it needs to be saved somewhere.

Thanks,
Namhyung



2023-01-05 04:20:24

by Yang Jihong

Subject: Re: [PATCH v2] perf record: Fix coredump with --overwrite and --max-size

Hello,

On 2023/1/4 0:50, Namhyung Kim wrote:
> On Mon, Jan 2, 2023 at 8:20 AM Arnaldo Carvalho de Melo <[email protected]> wrote:
>>
>> On Thu, Dec 29, 2022 at 12:47:28PM +0000, Yang Jihong wrote:
>>> @@ -230,16 +230,29 @@ static u64 record__bytes_written(struct record *rec)
>>> u64 bytes_written = rec->bytes_written;
>>> struct record_thread *thread_data = rec->thread_data;
>>>
>>> + if (thread_data == NULL)
>>> + return bytes_written;
>>> +
>
> Then it won't count bytes written by threads, right?
> I think it needs to be saved somewhere.
>
I'm not sure I follow. Could you explain in more detail? Thanks :)
I can modify it accordingly.

I think that if thread_data == NULL, there is no per-thread data.
In that case, we just return rec->bytes_written.

Thanks,
Yang

2023-01-06 21:27:02

by Namhyung Kim

Subject: Re: [PATCH v2] perf record: Fix coredump with --overwrite and --max-size

Hello,

On Wed, Jan 4, 2023 at 8:09 PM Yang Jihong <[email protected]> wrote:
>
> Hello,
>
> On 2023/1/4 0:50, Namhyung Kim wrote:
> > On Mon, Jan 2, 2023 at 8:20 AM Arnaldo Carvalho de Melo <[email protected]> wrote:
> >>
> >> On Thu, Dec 29, 2022 at 12:47:28PM +0000, Yang Jihong wrote:
> >>> @@ -230,16 +230,29 @@ static u64 record__bytes_written(struct record *rec)
> >>> u64 bytes_written = rec->bytes_written;
> >>> struct record_thread *thread_data = rec->thread_data;
> >>>
> >>> + if (thread_data == NULL)
> >>> + return bytes_written;
> >>> +
> >
> > Then it won't count bytes written by threads, right?
> > I think it needs to be saved somewhere.
> >
> I'm not sure here. Can you explain it more clearly, thanks :)
> I can modify it accordingly.
>
> I think if thread_data == NULL, it is not thread data.
> In this case, we just return rec->bytes_written.

It can be thread data but freed before tail synthesis, right?
In that case, I think it needs to add bytes_written by threads
to calculate the correct data size.

Thanks,
Namhyung
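
One way to picture the "save it somewhere" suggestion is to keep a running total in the record itself, so the sum survives after the per-thread array is freed. The sketch below is only an illustration under that assumption: the field thread_bytes_written and the helpers record_account_write and record_total_bytes are hypothetical names, not taken from the patch, and the code is single-threaded (a real multi-threaded writer would need atomic or otherwise synchronized updates).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-ins; only the fields needed for the accounting idea. */
struct record_thread {
	uint64_t bytes_written;
};

struct record {
	uint64_t bytes_written;        /* written by the main thread */
	uint64_t thread_bytes_written; /* hypothetical: total for all worker threads */
	int nr_threads;
	struct record_thread *thread_data;
};

/*
 * Account 'size' bytes. thread_idx < 0 means the main thread.
 * Worker writes update both the per-thread counter and the running
 * total kept in the record.
 */
void record_account_write(struct record *rec, int thread_idx, uint64_t size)
{
	assert(rec != NULL);

	if (thread_idx >= 0 && thread_idx < rec->nr_threads &&
	    rec->thread_data != NULL) {
		rec->thread_data[thread_idx].bytes_written += size;
		rec->thread_bytes_written += size;
	} else {
		rec->bytes_written += size;
	}
}

/* The total no longer needs to walk thread_data, so freeing it is harmless. */
uint64_t record_total_bytes(const struct record *rec)
{
	return rec->bytes_written + rec->thread_bytes_written;
}
```

With this layout the size reported after the thread data is freed still includes what the worker threads wrote, which is the case being asked about here.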

2023-01-09 03:28:42

by Yang Jihong

Subject: Re: [PATCH v2] perf record: Fix coredump with --overwrite and --max-size

Hello,

On 2023/1/7 5:12, Namhyung Kim wrote:
> Hello,
>
> On Wed, Jan 4, 2023 at 8:09 PM Yang Jihong <[email protected]> wrote:
>>
>> Hello,
>>
>> On 2023/1/4 0:50, Namhyung Kim wrote:
>>> On Mon, Jan 2, 2023 at 8:20 AM Arnaldo Carvalho de Melo <[email protected]> wrote:
>>>>
>>>> On Thu, Dec 29, 2022 at 12:47:28PM +0000, Yang Jihong wrote:
>>>>> @@ -230,16 +230,29 @@ static u64 record__bytes_written(struct record *rec)
>>>>> u64 bytes_written = rec->bytes_written;
>>>>> struct record_thread *thread_data = rec->thread_data;
>>>>>
>>>>> + if (thread_data == NULL)
>>>>> + return bytes_written;
>>>>> +
>>>
>>> Then it won't count bytes written by threads, right?
>>> I think it needs to be saved somewhere.
>>>
>> I'm not sure here. Can you explain it more clearly, thanks :)
>> I can modify it accordingly.
>>
>> I think if thread_data == NULL, it is not thread data.
>> In this case, we just return rec->bytes_written.
>
> It can be thread data but freed before tail synthesis, right?
> In that case, I think it needs to add bytes_written by threads
> to calculate the correct data size.
Hmm... In the __cmd_record function, record__stop_threads is called
before record__free_thread_data, so once thread_data has been freed, the
threads have already stopped and there is no more thread data.
I think it's okay to ignore the situation you mentioned above.

Thanks,
Yang

2023-01-10 19:57:56

by Namhyung Kim

Subject: Re: [PATCH v2] perf record: Fix coredump with --overwrite and --max-size

On Sun, Jan 8, 2023 at 6:47 PM Yang Jihong <[email protected]> wrote:
>
> Hello,
>
> On 2023/1/7 5:12, Namhyung Kim wrote:
> > Hello,
> >
> > On Wed, Jan 4, 2023 at 8:09 PM Yang Jihong <[email protected]> wrote:
> >>
> >> Hello,
> >>
> >> On 2023/1/4 0:50, Namhyung Kim wrote:
> >>> On Mon, Jan 2, 2023 at 8:20 AM Arnaldo Carvalho de Melo <[email protected]> wrote:
> >>>>
> >>>> On Thu, Dec 29, 2022 at 12:47:28PM +0000, Yang Jihong wrote:
> >>>>> When --overwrite and --max-size options of perf record are used together,
> >>>>> a segmentation fault occurs. The following is an example:
> >>>>>
> >>>>> # perf record -e sched:sched* --overwrite --max-size 1M -a -- sleep 1
> >>>>> [ perf record: Woken up 1 times to write data ]
> >>>>> perf: Segmentation fault
> >>>>> Obtained 1 stack frames.
> >>>>> [0xc4c67f]
> >>>>> Segmentation fault (core dumped)
> >>>>>
> >>>>> backtrace of the core file is as follows:
> >>>>>
> >>>>> #0 0x0000000000417990 in process_locked_synthesized_event (tool=0x0, event=0x15, sample=0x1de0, machine=0xf8) at builtin-record.c:630
> >>>>> #1 0x000000000057ee53 in perf_event__synthesize_threads (nr_threads_synthesize=21, mmap_data=<optimized out>, needs_mmap=<optimized out>, machine=0x17ad9b0, process=<optimized out>, tool=0x0) at util/synthetic-events.c:1950
> >>>>> #2 __machine__synthesize_threads (nr_threads_synthesize=0, data_mmap=<optimized out>, needs_mmap=<optimized out>, process=<optimized out>, threads=0x8, target=0x8, tool=0x0, machine=0x17ad9b0) at util/synthetic-events.c:1936
> >>>>> #3 machine__synthesize_threads (machine=0x17ad9b0, target=0x8, threads=0x8, needs_mmap=<optimized out>, data_mmap=<optimized out>, nr_threads_synthesize=0) at util/synthetic-events.c:1947
> >>>>> #4 0x000000000040165d in record__synthesize (tail=<optimized out>, rec=0xbe2520 <record>) at builtin-record.c:2010
> >>>>> #5 0x0000000000403989 in __cmd_record (argc=<optimized out>, argv=<optimized out>, rec=0xbe2520 <record>) at builtin-record.c:2810
> >>>>> #6 0x00000000004196ba in record__init_thread_user_masks (rec=0xbe2520 <record>, cpus=0x17a65f0) at builtin-record.c:3837
> >>>>> #7 record__init_thread_masks (rec=0xbe2520 <record>) at builtin-record.c:3938
> >>>>> #8 cmd_record (argc=1, argv=0x7ffdd692dc60) at builtin-record.c:4241
> >>>>> #9 0x00000000004b701d in pager_command_config (var=0x0, value=0x15 <error: Cannot access memory at address 0x15>, data=0x1de0) at perf.c:117
> >>>>> #10 0x00000000004b732b in get_leaf_frame_caller_aarch64 (sample=0xfffffffb, thread=0x0, usr_idx=<optimized out>) at util/arm64-frame-pointer-unwind-support.c:56
> >>>>> #11 0x0000000000406331 in execv_dashed_external (argv=0x7ffdd692d9e8) at perf.c:410
> >>>>> #12 run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:431
> >>>>> #13 main (argc=<optimized out>, argv=0x7ffdd692d9e8) at perf.c:562
> >>>>>
> >>>>> The reason is that record__bytes_written accesses the freed memory rec->thread_data,
> >>>>> The process is as follows:
> >>>>> __cmd_record
> >>>>> -> record__free_thread_data
> >>>>> -> zfree(&rec->thread_data) // free rec->thread_data
> >>>>> -> record__synthesize
> >>>>> -> perf_event__synthesize_id_index
> >>>>> -> process_synthesized_event
> >>>>> -> record__write
> >>>>> -> record__bytes_written // access rec->thread_data
> >>>>>
> >>>>> we only need to check the value of done first.
> >>>>> Also add variable check in record__bytes_written for code hardening,
> >>>>> and save bytes_written separately to reduce one calculation.
> >>>>>
> >>>>> Fixes: 6d57581659f7 ("perf record: Add support for limit perf output file size")
> >>>>> Signed-off-by: Yang Jihong <[email protected]>
> >>>>> ---
> >>>>>
> >>>>> Changes since v1:
> >>>>> - Add variable check in record__bytes_written for code hardening.
> >>>>> - Save bytes_written separately to reduce one calculation.
> >>>>> - Remove rec->opts.tail_synthesize check.
> >>>>
> >>>> Namhyung, are you ok with this now?
> >>>>
> >>>> - Arnaldo
> >>>>
> >>>>> tools/perf/builtin-record.c | 26 +++++++++++++++++---------
> >>>>> 1 file changed, 17 insertions(+), 9 deletions(-)
> >>>>>
> >>>>> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> >>>>> index 29dcd454b8e2..acba9e43e519 100644
> >>>>> --- a/tools/perf/builtin-record.c
> >>>>> +++ b/tools/perf/builtin-record.c
> >>>>> @@ -230,16 +230,29 @@ static u64 record__bytes_written(struct record *rec)
> >>>>> u64 bytes_written = rec->bytes_written;
> >>>>> struct record_thread *thread_data = rec->thread_data;
> >>>>>
> >>>>> + if (thread_data == NULL)
> >>>>> + return bytes_written;
> >>>>> +
> >>>
> >>> Then it won't count bytes written by threads, right?
> >>> I think it needs to be saved somewhere.
> >>>
> >> I'm not sure here. Can you explain it more clearly, thanks :)
> >> I can modify it accordingly.
> >>
> >> I think if thread_data == NULL, it is not thread data.
> >> In this case, we just return rec->bytes_written.
> >
> > It can be thread data but freed before tail synthesis, right?
> > In that case, I think it needs to add bytes_written by threads
> > to calculate the correct data size.
> Em... In the __cmd_record function, record__stop_threads is called
> before record__free_thread_data, so if the thread has been freed, there
> will be no thread data.
> I think it's okay to ignore the situation you mentioned above.

Right, the thread data is already freed, but we need the size.

I think rec->bytes_written was not (and will not be) updated with the data
written by the threads (the data.X files), because it only tracks the main
'data' file. So record__bytes_written() will return a smaller number
after the threads are gone. But I think it should return the total
data size.

Thanks,
Namhyung
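
The accounting concern above can be illustrated with a minimal sketch: fold
each thread's counter into a saved total before the thread data is freed, so
the size reported during tail synthesis still includes the data.X files. The
structures below are simplified stand-ins for the ones in builtin-record.c,
and the thread_bytes_written field is a hypothetical name used only for
illustration; it is not taken from the posted patch and is not necessarily how
a later version solves it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in: only the counter discussed in the thread. */
struct record_thread {
	uint64_t bytes_written;	/* bytes this thread wrote to its data.X file */
};

/* Simplified stand-in for struct record. */
struct record {
	uint64_t bytes_written;		/* bytes written to the main 'data' file */
	uint64_t thread_bytes_written;	/* hypothetical: total saved from freed threads */
	int nr_threads;
	struct record_thread *thread_data;
};

/* Total output size: main file plus per-thread files, live or already freed. */
static uint64_t record__bytes_written(struct record *rec)
{
	uint64_t total = rec->bytes_written + rec->thread_bytes_written;
	int t;

	/* Thread data already freed: the saved total covers the threads. */
	if (rec->thread_data == NULL)
		return total;

	for (t = 0; t < rec->nr_threads; t++)
		total += rec->thread_data[t].bytes_written;

	return total;
}

/* Save the per-thread counters before freeing them, then clear the pointer. */
static void record__free_thread_data(struct record *rec)
{
	int t;

	assert(rec != NULL);

	if (rec->thread_data == NULL)
		return;

	for (t = 0; t < rec->nr_threads; t++)
		rec->thread_bytes_written += rec->thread_data[t].bytes_written;

	free(rec->thread_data);
	rec->thread_data = NULL;
}
```

With this shape, a call made after record__free_thread_data() neither
dereferences a NULL pointer nor reports a smaller size than a call made just
before it.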

2023-01-13 07:25:15

by Yang Jihong

Subject: Re: [PATCH v2] perf record: Fix coredump with --overwrite and --max-size

Hello,

On 2023/1/11 3:21, Namhyung Kim wrote:
> On Sun, Jan 8, 2023 at 6:47 PM Yang Jihong <[email protected]> wrote:
>>
>> Hello,
>>
>> On 2023/1/7 5:12, Namhyung Kim wrote:
>>> Hello,
>>>
>>> On Wed, Jan 4, 2023 at 8:09 PM Yang Jihong <[email protected]> wrote:
>>>>
>>>> Hello,
>>>>
>>>> On 2023/1/4 0:50, Namhyung Kim wrote:
>>>>> On Mon, Jan 2, 2023 at 8:20 AM Arnaldo Carvalho de Melo <[email protected]> wrote:
>>>>>>
>>>>>> Em Thu, Dec 29, 2022 at 12:47:28PM +0000, Yang Jihong escreveu:
>>>>>>> When --overwrite and --max-size options of perf record are used together,
>>>>>>> a segmentation fault occurs. The following is an example:
>>>>>>>
>>>>>>> # perf record -e sched:sched* --overwrite --max-size 1M -a -- sleep 1
>>>>>>> [ perf record: Woken up 1 times to write data ]
>>>>>>> perf: Segmentation fault
>>>>>>> Obtained 1 stack frames.
>>>>>>> [0xc4c67f]
>>>>>>> Segmentation fault (core dumped)
>>>>>>>
>>>>>>> backtrace of the core file is as follows:
>>>>>>>
>>>>>>> #0 0x0000000000417990 in process_locked_synthesized_event (tool=0x0, event=0x15, sample=0x1de0, machine=0xf8) at builtin-record.c:630
>>>>>>> #1 0x000000000057ee53 in perf_event__synthesize_threads (nr_threads_synthesize=21, mmap_data=<optimized out>, needs_mmap=<optimized out>, machine=0x17ad9b0, process=<optimized out>, tool=0x0) at util/synthetic-events.c:1950
>>>>>>> #2 __machine__synthesize_threads (nr_threads_synthesize=0, data_mmap=<optimized out>, needs_mmap=<optimized out>, process=<optimized out>, threads=0x8, target=0x8, tool=0x0, machine=0x17ad9b0) at util/synthetic-events.c:1936
>>>>>>> #3 machine__synthesize_threads (machine=0x17ad9b0, target=0x8, threads=0x8, needs_mmap=<optimized out>, data_mmap=<optimized out>, nr_threads_synthesize=0) at util/synthetic-events.c:1947
>>>>>>> #4 0x000000000040165d in record__synthesize (tail=<optimized out>, rec=0xbe2520 <record>) at builtin-record.c:2010
>>>>>>> #5 0x0000000000403989 in __cmd_record (argc=<optimized out>, argv=<optimized out>, rec=0xbe2520 <record>) at builtin-record.c:2810
>>>>>>> #6 0x00000000004196ba in record__init_thread_user_masks (rec=0xbe2520 <record>, cpus=0x17a65f0) at builtin-record.c:3837
>>>>>>> #7 record__init_thread_masks (rec=0xbe2520 <record>) at builtin-record.c:3938
>>>>>>> #8 cmd_record (argc=1, argv=0x7ffdd692dc60) at builtin-record.c:4241
>>>>>>> #9 0x00000000004b701d in pager_command_config (var=0x0, value=0x15 <error: Cannot access memory at address 0x15>, data=0x1de0) at perf.c:117
>>>>>>> #10 0x00000000004b732b in get_leaf_frame_caller_aarch64 (sample=0xfffffffb, thread=0x0, usr_idx=<optimized out>) at util/arm64-frame-pointer-unwind-support.c:56
>>>>>>> #11 0x0000000000406331 in execv_dashed_external (argv=0x7ffdd692d9e8) at perf.c:410
>>>>>>> #12 run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:431
>>>>>>> #13 main (argc=<optimized out>, argv=0x7ffdd692d9e8) at perf.c:562
>>>>>>>
>>>>>>> The reason is that record__bytes_written accesses the freed memory rec->thread_data,
>>>>>>> The process is as follows:
>>>>>>> __cmd_record
>>>>>>> -> record__free_thread_data
>>>>>>> -> zfree(&rec->thread_data) // free rec->thread_data
>>>>>>> -> record__synthesize
>>>>>>> -> perf_event__synthesize_id_index
>>>>>>> -> process_synthesized_event
>>>>>>> -> record__write
>>>>>>> -> record__bytes_written // access rec->thread_data
>>>>>>>
>>>>>>> we only need to check the value of done first.
>>>>>>> Also add variable check in record__bytes_written for code hardening,
>>>>>>> and save bytes_written separately to reduce one calculation.
>>>>>>>
>>>>>>> Fixes: 6d57581659f7 ("perf record: Add support for limit perf output file size")
>>>>>>> Signed-off-by: Yang Jihong <[email protected]>
>>>>>>> ---
>>>>>>>
>>>>>>> Changes since v1:
>>>>>>> - Add variable check in record__bytes_written for code hardening.
>>>>>>> - Save bytes_written separately to reduce one calculation.
>>>>>>> - Remove rec->opts.tail_synthesize check.
>>>>>>
>>>>>> Namhyung, are you ok with this now?
>>>>>>
>>>>>> - Arnaldo
>>>>>>
>>>>>>> tools/perf/builtin-record.c | 26 +++++++++++++++++---------
>>>>>>> 1 file changed, 17 insertions(+), 9 deletions(-)
>>>>>>>
>>>>>>> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
>>>>>>> index 29dcd454b8e2..acba9e43e519 100644
>>>>>>> --- a/tools/perf/builtin-record.c
>>>>>>> +++ b/tools/perf/builtin-record.c
>>>>>>> @@ -230,16 +230,29 @@ static u64 record__bytes_written(struct record *rec)
>>>>>>> u64 bytes_written = rec->bytes_written;
>>>>>>> struct record_thread *thread_data = rec->thread_data;
>>>>>>>
>>>>>>> + if (thread_data == NULL)
>>>>>>> + return bytes_written;
>>>>>>> +
>>>>>
>>>>> Then it won't count bytes written by threads, right?
>>>>> I think it needs to be saved somewhere.
>>>>>
>>>> I'm not sure here. Can you explain it more clearly, thanks :)
>>>> I can modify it accordingly.
>>>>
>>>> I think if thread_data == NULL, it is not thread data.
>>>> In this case, we just return rec->bytes_written.
>>>
>>> It can be thread data but freed before tail synthesis, right?
>>> In that case, I think it needs to add bytes_written by threads
>>> to calculate the correct data size.
>> Em... In the __cmd_record function, record__stop_threads is called
>> before record__free_thread_data, so if the thread has been freed, there
>> will be no thread data.
>> I think it's okay to ignore the situation you mentioned above.
>
> Right, the thread data is already freed, but we need the size.
>
> I think it didn't (and won't) update to rec->bytes_written for the data
> written by the threads (data.X file) because it's only for the main
> 'data' file. So record__bytes_written() will return a smaller number
> after the threads are gone. But I think it should return the total
> data size.
>
Yes, the total data size, including the data.X files, should be returned here
to match the semantics, so there is a problem here too. I will fix it in the
next version.

Thanks,
Yang