2014-10-30 22:28:26

by Peter Zijlstra

[permalink] [raw]
Subject: [RFD] perf syscall error handling

Hi,

Earlier today I was reminded that perf syscall error handling sucks arse
-- albeit not in those words.

Now I know we've had this discussion before, but nothing really
happened. I think back then the suggestion was having the kernel write a
string back or somesuch.

The problem with a string is, its hard for machines to interpret, its
English, so near impossible for some humans too.

So would something simple, like an offset into the struct
perf_event_attr pointing at the current field we're trying to process
make sense? Maybe with negative offsets to indicate the syscall
arguments?

That would narrow down the 'WTF is wrong noaw' a lot I think. But then,
I've not actually done a lot of userspace the last few years, so maybe
I'm just dreaming things.

Anybody?


2014-10-31 01:15:54

by Vince Weaver

[permalink] [raw]
Subject: Re: [RFD] perf syscall error handling

On Thu, 30 Oct 2014, Peter Zijlstra wrote:

> So would something simple, like an offset into the struct
> perf_event_attr pointing at the current field we're trying to process
> make sense? Maybe with negative offsets to indicate the syscall
> arguments?
>
> That would narrow down the 'WTF is wrong noaw' a lot I think. But then,
> I've not actually done a lot of userspace the last few years, so maybe
> I'm just dreaming things.

well, as someone who spends a lot of time in userspace trying to help
people who report probems like 'perf_event_open() returns EINVAL, what's
wrong' I can say pretty much anything will be an improvement.

What would really help is if we could somehow return the
filename/line-number of whatever source code file that's setting errno.

Even if perf_event_open() told me that hey, we're getting EOPNOTSUPP due
to the precise_ip parameter (something that happened just yesterday) it's
still a lot of grepping and poking around source files to find out what's
going on. It would be much better if it just told me the issue was at
kernel/events/core.c line 995 or so, but I'm not sure how you could pass
that back to the user, and one could argue it wouldn't help much the
average user without a kernel tree lying around.

Vince

2014-10-31 07:21:21

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFD] perf syscall error handling

On Thu, Oct 30, 2014 at 09:16:36PM -0400, Vince Weaver wrote:
> On Thu, 30 Oct 2014, Peter Zijlstra wrote:
>
> > So would something simple, like an offset into the struct
> > perf_event_attr pointing at the current field we're trying to process
> > make sense? Maybe with negative offsets to indicate the syscall
> > arguments?
> >
> > That would narrow down the 'WTF is wrong noaw' a lot I think. But then,
> > I've not actually done a lot of userspace the last few years, so maybe
> > I'm just dreaming things.
>
> well, as someone who spends a lot of time in userspace trying to help
> people who report probems like 'perf_event_open() returns EINVAL, what's
> wrong' I can say pretty much anything will be an improvement.

Right, the situation is dire indeed :/

> What would really help is if we could somehow return the
> filename/line-number of whatever source code file that's setting errno.
>
> Even if perf_event_open() told me that hey, we're getting EOPNOTSUPP due
> to the precise_ip parameter (something that happened just yesterday) it's
> still a lot of grepping and poking around source files to find out what's
> going on. It would be much better if it just told me the issue was at
> kernel/events/core.c line 995 or so, but I'm not sure how you could pass
> that back to the user, and one could argue it wouldn't help much the
> average user without a kernel tree lying around.

Would an additional bit mask help? With that we'd be able to finger
the exact flag that causes pain.

2014-10-31 09:27:20

by Ingo Molnar

[permalink] [raw]
Subject: Re: [RFD] perf syscall error handling


* Peter Zijlstra <[email protected]> wrote:

> On Thu, Oct 30, 2014 at 09:16:36PM -0400, Vince Weaver wrote:
> > On Thu, 30 Oct 2014, Peter Zijlstra wrote:
> >
> > > So would something simple, like an offset into the struct
> > > perf_event_attr pointing at the current field we're trying
> > > to process make sense? Maybe with negative offsets to
> > > indicate the syscall arguments?
> > >
> > > That would narrow down the 'WTF is wrong noaw' a lot I
> > > think. But then, I've not actually done a lot of userspace
> > > the last few years, so maybe I'm just dreaming things.
> >
> > well, as someone who spends a lot of time in userspace trying
> > to help people who report probems like 'perf_event_open()
> > returns EINVAL, what's wrong' I can say pretty much anything
> > will be an improvement.
>
> Right, the situation is dire indeed :/
>
> > What would really help is if we could somehow return the
> > filename/line-number of whatever source code file that's
> > setting errno.
> >
> > Even if perf_event_open() told me that hey, we're getting
> > EOPNOTSUPP due to the precise_ip parameter (something that
> > happened just yesterday) it's still a lot of grepping and
> > poking around source files to find out what's going on. It
> > would be much better if it just told me the issue was at
> > kernel/events/core.c line 995 or so, but I'm not sure how you
> > could pass that back to the user, and one could argue it
> > wouldn't help much the average user without a kernel tree
> > lying around.
>
> Would an additional bit mask help? With that we'd be able to
> finger the exact flag that causes pain.

Well, I don't really like bitmasks nor __LINE__/__FILE__
obscurity, those are non-starters because they are user
unfriendly.

What would work best is something like:

- user-space could request 'extended error code' passing from
kernel to user-space via a (default off) feature bit in
perf_attr, plus a user-space provided pointer to a string
buffer, and a maximum length value.

- old user-space or user-space that does not want it would not
be unaffected by the new 'extended error code' facility

- user-space gets back the regular errno (-EOPNOTSUPP or -ENOSYS
or -EINVAL, etc.) and a string. Strings are really the most
helpful information, because tools can just print that. They
can also match on specific strings and programmatically react
to them if they want to: we can promise to not arbitrarily
change error strings once they are introduced. (but even if
they change, user-space can still print them out.)

- in the kernel, instead of doing:

return -EOPNOTSUPP;

we'd do something like:

return perf_errno(-EOPNOTSUPP, "BTS not supported by this CPU architecture");

here the kernel would in the regular case just ignore the
string, but if user-space requested the 'extended errno'
facility, it would copy the (zero-delimited) error string to
perf_attr->errno_str, taking errno_len into account.

If no extended string was written then user-space can detect
this through the string not having been written to. (it can
initialize it to a zero string.)

Note the various advantages of this approach:

- it's hard to get the facility wrong on the user-space side: in
the simplest usage user-space simply prints out the error,
which will be obvious to the user in most cases.

- it's hard to get it wrong on the kernel side: it's really
similar to what we do today, plus a descriptive error string.
Developers should take care to use descriptive and unique
messages (but even duplicate messages will help in informing
the user).

- it's infinitely extensible, does not involve magic numbers nor
ever changing __LINE__ numbers and obscure source code file
names.

- it's really _easy_ to add good error information on the kernel
side: just add a perf_err() string passing method instead of a
dumb return -EINVAL. Also, the information is in a single
place, exactly where the problem occurs, so it will be easily
maintainable going forward.

Thanks,

Ingo

2014-10-31 10:00:24

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFD] perf syscall error handling

Em Thu, Oct 30, 2014 at 09:16:36PM -0400, Vince Weaver escreveu:
> On Thu, 30 Oct 2014, Peter Zijlstra wrote:
>
> > So would something simple, like an offset into the struct
> > perf_event_attr pointing at the current field we're trying to process
> > make sense? Maybe with negative offsets to indicate the syscall
> > arguments?
> >
> > That would narrow down the 'WTF is wrong noaw' a lot I think. But then,
> > I've not actually done a lot of userspace the last few years, so maybe
> > I'm just dreaming things.
>
> well, as someone who spends a lot of time in userspace trying to help
> people who report probems like 'perf_event_open() returns EINVAL, what's
> wrong' I can say pretty much anything will be an improvement.
>
> What would really help is if we could somehow return the
> filename/line-number of whatever source code file that's setting errno.
>
> Even if perf_event_open() told me that hey, we're getting EOPNOTSUPP due
> to the precise_ip parameter (something that happened just yesterday) it's
> still a lot of grepping and poking around source files to find out what's
> going on. It would be much better if it just told me the issue was at
> kernel/events/core.c line 995 or so, but I'm not sure how you could pass
> that back to the user, and one could argue it wouldn't help much the
> average user without a kernel tree lying around.

But perhaps we can have a mode where we can say to perf to setup
function tracing in the sys_perf_event_open() for the perf tool pid,
that would output the lines leading to the return ERRNO.

Or use 'perf probe' like stuff to insert/enable some kretprobes, both
ideas need some prototyping and would require root privilege :-\

- Arnaldo

2014-10-31 21:22:23

by Stephane Eranian

[permalink] [raw]
Subject: Re: [RFD] perf syscall error handling

Hi,

On Fri, Oct 31, 2014 at 1:28 PM, Matt Fleming <[email protected]> wrote:
> On Fri, 31 Oct, at 10:27:13AM, Ingo Molnar wrote:
>>
>> - user-space gets back the regular errno (-EOPNOTSUPP or -ENOSYS
>> or -EINVAL, etc.) and a string. Strings are really the most
>> helpful information, because tools can just print that. They
>> can also match on specific strings and programmatically react
>> to them if they want to: we can promise to not arbitrarily
>> change error strings once they are introduced. (but even if
>> they change, user-space can still print them out.)
>
> I guess we'd run into a problem if userspace doesn't want to just print
> the kernel string but instead wants to parse it in some fashion.
>
> That may or may not be a problem in practice, Vince can probably comment
> on that. I'm just thinking along the lines of making the perf syscall
> interface as useful as possible for tools other than tools/perf.
>
Maybe I missed something in the earlier thread, but I am trying to understand
why perf_event_open() would need such extended error retrieval system when
no other syscall does.

In any case, I would go with Ingo's proposal.

2014-11-01 05:29:44

by Vince Weaver

[permalink] [raw]
Subject: Re: [RFD] perf syscall error handling

On Fri, 31 Oct 2014, Stephane Eranian wrote:

> On Fri, Oct 31, 2014 at 1:28 PM, Matt Fleming <[email protected]> wrote:
> >
> > I guess we'd run into a problem if userspace doesn't want to just print
> > the kernel string but instead wants to parse it in some fashion.

If the string interface went in it would be a help when debugging
perf_event_open().

Support would probably get added to PAPI, but PAPI has its own error
reporting issues and it's not always easy to pass a string the whole way
back to the user.

Also with PAPI many of the users reporting obscure perf_event_open()
problems are often running 2.6.32-vendor-patched kernels, so sadly it will
be years before any improved error handling filters down to many of the
users.

This proposal also doesn't help with other weird failure modes in the
interface, the most annoying of which is the watchdog stealing a counter
so event groups perf_event_open() and start just fine but fail at read
time.

> > That may or may not be a problem in practice, Vince can probably comment
> > on that. I'm just thinking along the lines of making the perf syscall
> > interface as useful as possible for tools other than tools/perf.
> >
> Maybe I missed something in the earlier thread, but I am trying to understand
> why perf_event_open() would need such extended error retrieval system when
> no other syscall does.

well perf_event_open() is so complex, with it's 40+ different parameters.
Having a wrong value in any one of those (or one of the combinations of
those) can trigger EINVAL and it's not clear where the issue is.
I think other system calls tend to have slighly more straightforward
interfaces.

Vince

2014-11-01 17:32:06

by Matt Fleming

[permalink] [raw]
Subject: Re: [RFD] perf syscall error handling

On Fri, 31 Oct, at 10:27:13AM, Ingo Molnar wrote:
>
> - user-space gets back the regular errno (-EOPNOTSUPP or -ENOSYS
> or -EINVAL, etc.) and a string. Strings are really the most
> helpful information, because tools can just print that. They
> can also match on specific strings and programmatically react
> to them if they want to: we can promise to not arbitrarily
> change error strings once they are introduced. (but even if
> they change, user-space can still print them out.)

I guess we'd run into a problem if userspace doesn't want to just print
the kernel string but instead wants to parse it in some fashion.

That may or may not be a problem in practice, Vince can probably comment
on that. I'm just thinking along the lines of making the perf syscall
interface as useful as possible for tools other than tools/perf.

--
Matt Fleming, Intel Open Source Technology Center

2014-11-03 16:25:57

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFD] perf syscall error handling

Em Sat, Nov 01, 2014 at 01:30:39AM -0400, Vince Weaver escreveu:
> On Fri, 31 Oct 2014, Stephane Eranian wrote:
>
> > On Fri, Oct 31, 2014 at 1:28 PM, Matt Fleming <[email protected]> wrote:
> > >
> > > I guess we'd run into a problem if userspace doesn't want to just print
> > > the kernel string but instead wants to parse it in some fashion.

> If the string interface went in it would be a help when debugging
> perf_event_open().

The way that peterz suggested, i.e. returning information about which
perf_event_attr and which of the parameters was invalid/had issues could
help with fallbacking/capability querying, i.e. tooling may want to use
some features if available automagically, fallbacking to something else
when that fails.

We already do that to some degree in various cases, but for some if the
only way that becomes available to disambiguate some EINVAL return is a
string, code will start having strcmps :-\

> Support would probably get added to PAPI, but PAPI has its own error
> reporting issues and it's not always easy to pass a string the whole way
> back to the user.

Having the last resort being an string instead of EINVAL is indeed much
better than what we have now.

> Also with PAPI many of the users reporting obscure perf_event_open()
> problems are often running 2.6.32-vendor-patched kernels, so sadly it will
> be years before any improved error handling filters down to many of the
> users.

> This proposal also doesn't help with other weird failure modes in the
> interface, the most annoying of which is the watchdog stealing a counter
> so event groups perf_event_open() and start just fine but fail at read
> time.

> > > That may or may not be a problem in practice, Vince can probably comment
> > > on that. I'm just thinking along the lines of making the perf syscall
> > > interface as useful as possible for tools other than tools/perf.

> > Maybe I missed something in the earlier thread, but I am trying to understand
> > why perf_event_open() would need such extended error retrieval system when
> > no other syscall does.

> well perf_event_open() is so complex, with it's 40+ different parameters.
> Having a wrong value in any one of those (or one of the combinations of
> those) can trigger EINVAL and it's not clear where the issue is.
> I think other system calls tend to have slighly more straightforward
> interfaces.

Yes, it is a PITA to figure out which codepath returned -EINVAL.

Its just a "No comprendo", we're left wondering what is it that is not
being understood or accepted...

- Arnaldo

2014-11-03 16:50:31

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFD] perf syscall error handling

On Mon, Nov 03, 2014 at 02:25:48PM -0200, Arnaldo Carvalho de Melo wrote:

> The way that peterz suggested, i.e. returning information about which
> perf_event_attr and which of the parameters was invalid/had issues could
> help with fallbacking/capability querying, i.e. tooling may want to use
> some features if available automagically, fallbacking to something else
> when that fails.
>
> We already do that to some degree in various cases, but for some if the
> only way that becomes available to disambiguate some EINVAL return is a
> string, code will start having strcmps :-\

OK, so how about we do both, the offset+mask for the tools and the
string for the humans?

2014-11-03 17:01:07

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFD] perf syscall error handling

Em Mon, Nov 03, 2014 at 05:50:19PM +0100, Peter Zijlstra escreveu:
> On Mon, Nov 03, 2014 at 02:25:48PM -0200, Arnaldo Carvalho de Melo wrote:

> > The way that peterz suggested, i.e. returning information about which
> > perf_event_attr and which of the parameters was invalid/had issues could
> > help with fallbacking/capability querying, i.e. tooling may want to use
> > some features if available automagically, fallbacking to something else
> > when that fails.

> > We already do that to some degree in various cases, but for some if the
> > only way that becomes available to disambiguate some EINVAL return is a
> > string, code will start having strcmps :-\

> OK, so how about we do both, the offset+mask for the tools and the
> string for the humans?

Yeah, tooling tries to provide the best it can with the offset+mask, and
if doesn't manage to do anything smart with it, just show the string and
hope that helps the user to figure out what is happening.

- Arnaldo

2014-11-03 17:11:34

by Vince Weaver

[permalink] [raw]
Subject: Re: [RFD] perf syscall error handling

On Mon, 3 Nov 2014, Arnaldo Carvalho de Melo wrote:

> Em Mon, Nov 03, 2014 at 05:50:19PM +0100, Peter Zijlstra escreveu:
> > On Mon, Nov 03, 2014 at 02:25:48PM -0200, Arnaldo Carvalho de Melo wrote:
>
> > > The way that peterz suggested, i.e. returning information about which
> > > perf_event_attr and which of the parameters was invalid/had issues could
> > > help with fallbacking/capability querying, i.e. tooling may want to use
> > > some features if available automagically, fallbacking to something else
> > > when that fails.
>
> > > We already do that to some degree in various cases, but for some if the
> > > only way that becomes available to disambiguate some EINVAL return is a
> > > string, code will start having strcmps :-\
>
> > OK, so how about we do both, the offset+mask for the tools and the
> > string for the humans?
>
> Yeah, tooling tries to provide the best it can with the offset+mask, and
> if doesn't manage to do anything smart with it, just show the string and
> hope that helps the user to figure out what is happening.

I don't know if having an offset/mask helps much. Knowing your EINVAL
comes from ->config is nice to know, but if there's 30 different ways
to get an EINVAL from an improper config then you still can waste a lot
of time narrowing things down.

The string solution might be nice, but it is going to take major changes
to the code and increase the size a bit. For example:

$ cat arch/x86/kernel/cpu/perf* kernel/events/* | grep EINVAL | wc -l
100

And some of the code is passing the return values back through various
long callchains (and overloaded pointers via casts) where it's not clear
how you could also pass a string value.

Vince

2014-11-03 17:39:19

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFD] perf syscall error handling

On Mon, Nov 03, 2014 at 12:12:18PM -0500, Vince Weaver wrote:
> I don't know if having an offset/mask helps much. Knowing your EINVAL
> comes from ->config is nice to know, but if there's 30 different ways
> to get an EINVAL from an improper config then you still can waste a lot
> of time narrowing things down.
>
> The string solution might be nice, but it is going to take major changes
> to the code and increase the size a bit. For example:
>
> $ cat arch/x86/kernel/cpu/perf* kernel/events/* | grep EINVAL | wc -l
> 100
>
> And some of the code is passing the return values back through various
> long callchains (and overloaded pointers via casts) where it's not clear
> how you could also pass a string value.

Yes, nobody said this would be a quick and easy exercise. But I figure
something needs to happen.

2014-11-10 10:27:32

by Ingo Molnar

[permalink] [raw]
Subject: Re: [RFD] perf syscall error handling


* Arnaldo Carvalho de Melo <[email protected]> wrote:

> Em Mon, Nov 03, 2014 at 05:50:19PM +0100, Peter Zijlstra escreveu:
> > On Mon, Nov 03, 2014 at 02:25:48PM -0200, Arnaldo Carvalho de Melo wrote:
>
> > > The way that peterz suggested, i.e. returning information about which
> > > perf_event_attr and which of the parameters was invalid/had issues could
> > > help with fallbacking/capability querying, i.e. tooling may want to use
> > > some features if available automagically, fallbacking to something else
> > > when that fails.
>
> > > We already do that to some degree in various cases, but for some if the
> > > only way that becomes available to disambiguate some EINVAL return is a
> > > string, code will start having strcmps :-\
>
> > OK, so how about we do both, the offset+mask for the tools
> > and the string for the humans?
>
> Yeah, tooling tries to provide the best it can with the
> offset+mask, and if doesn't manage to do anything smart with
> it, just show the string and hope that helps the user to figure
> out what is happening.

Almost: tooling should generally always consider the string as
well, for the (not so uncommon) case where there can be multiple
problems with the same field.

Really, I think the string will give the most bang for the buck,
because it's really simple and straightforward on the kernel side
(so that we have a good chance of achieving full coverage
relatively quickly), and later on we could still complicate it
all with offset+mask if there's really a need.

So lets start with an error string...

Thanks,

Ingo

2014-11-10 10:38:41

by Ingo Molnar

[permalink] [raw]
Subject: Re: [RFD] perf syscall error handling


* Stephane Eranian <[email protected]> wrote:

> Hi,
>
> On Fri, Oct 31, 2014 at 1:28 PM, Matt Fleming <[email protected]> wrote:
> > On Fri, 31 Oct, at 10:27:13AM, Ingo Molnar wrote:
> >>
> >> - user-space gets back the regular errno (-EOPNOTSUPP or -ENOSYS
> >> or -EINVAL, etc.) and a string. Strings are really the most
> >> helpful information, because tools can just print that. They
> >> can also match on specific strings and programmatically react
> >> to them if they want to: we can promise to not arbitrarily
> >> change error strings once they are introduced. (but even if
> >> they change, user-space can still print them out.)
> >
> > I guess we'd run into a problem if userspace doesn't want to just print
> > the kernel string but instead wants to parse it in some fashion.
> >
> > That may or may not be a problem in practice, Vince can probably comment
> > on that. I'm just thinking along the lines of making the perf syscall
> > interface as useful as possible for tools other than tools/perf.
>
> Maybe I missed something in the earlier thread, but I am trying
> to understand why perf_event_open() would need such extended
> error retrieval system when no other syscall does.

Frankly, Linux kind of sucks in the 'error codes and diagnostics'
area, as we used the old Unix method of returning a single small
integer and never got around further organizing errors properly,
for a couple of good (and a handful of bad) reasons.

The good reasons: not having finegrained error codes is just fine
if you organize your APIs and objects via other means: file
system structure, split-up system calls, separate fds for
separate objects, etc.

The perf ABI is complex mostly because it provides bleeding edge
interfaces to bleeding edge hardware, while trying to be
transparent to the probed processes: i.e. no 'everything is a
file' and 'just use poll() to pass events' API simplifications
are possible, mostly for performance reasons.

A comparable ABI, ptrace, is considered complex as well, while
perf is much faster, exposes much more hardware capabilities and
is more capable as well.

But even outside of perf, with 'other' system calls I have
experienced dozens of incidents where some app failed with a
-EINVAL in an ambiguous way and it would have been a blessing to
get more extended error description from the kernel. There's a
few meaningful error codes like -EIO or -ENOMEM, but there's tons
of overlapping use of -EINVAL.

The VFS and VM error codes are pretty much self explanatory (and
that's where we have most of Unix heritage), but for example the
networking code and various drivers and also perf suffers from
not giving finegrained enough error explanations.

> In any case, I would go with Ingo's proposal.

Ok, cool!

Thanks,

Ingo

2014-11-10 12:15:35

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFD] perf syscall error handling

Em Mon, Nov 10, 2014 at 11:27:25AM +0100, Ingo Molnar escreveu:
>
> * Arnaldo Carvalho de Melo <[email protected]> wrote:
>
> > Em Mon, Nov 03, 2014 at 05:50:19PM +0100, Peter Zijlstra escreveu:
> > > On Mon, Nov 03, 2014 at 02:25:48PM -0200, Arnaldo Carvalho de Melo wrote:
> >
> > > > The way that peterz suggested, i.e. returning information about which
> > > > perf_event_attr and which of the parameters was invalid/had issues could
> > > > help with fallbacking/capability querying, i.e. tooling may want to use
> > > > some features if available automagically, fallbacking to something else
> > > > when that fails.
> >
> > > > We already do that to some degree in various cases, but for some if the
> > > > only way that becomes available to disambiguate some EINVAL return is a
> > > > string, code will start having strcmps :-\
> >
> > > OK, so how about we do both, the offset+mask for the tools
> > > and the string for the humans?
> >
> > Yeah, tooling tries to provide the best it can with the
> > offset+mask, and if doesn't manage to do anything smart with
> > it, just show the string and hope that helps the user to figure
> > out what is happening.
>
> Almost: tooling should generally always consider the string as
> well, for the (not so uncommon) case where there can be multiple
> problems with the same field.
>
> Really, I think the string will give the most bang for the buck,
> because it's really simple and straightforward on the kernel side
> (so that we have a good chance of achieving full coverage
> relatively quickly), and later on we could still complicate it
> all with offset+mask if there's really a need.
>
> So lets start with an error string...

I don't have a problem with the order of introduction of new error
reporting mechanisms, or at least I can't think of one right now.

So if we introduce strings now then tools/perf/ will trow them to the
user when it still don't have fallbacks or any other UI indication of
such an error.

I wonder tho if we have any previous experience on some other project
(or even in the kernel?) and how userspace ended up using it, if just
presenting those strings to the user or if trying to parse it, etc,
anybody?

- Arnaldo

2014-11-10 12:24:53

by Ingo Molnar

[permalink] [raw]
Subject: Re: [RFD] perf syscall error handling


* Arnaldo Carvalho de Melo <[email protected]> wrote:

> Em Mon, Nov 10, 2014 at 11:27:25AM +0100, Ingo Molnar escreveu:
> >
> > * Arnaldo Carvalho de Melo <[email protected]> wrote:
> >
> > > Em Mon, Nov 03, 2014 at 05:50:19PM +0100, Peter Zijlstra escreveu:
> > > > On Mon, Nov 03, 2014 at 02:25:48PM -0200, Arnaldo Carvalho de Melo wrote:
> > >
> > > > > The way that peterz suggested, i.e. returning information about which
> > > > > perf_event_attr and which of the parameters was invalid/had issues could
> > > > > help with fallbacking/capability querying, i.e. tooling may want to use
> > > > > some features if available automagically, fallbacking to something else
> > > > > when that fails.
> > >
> > > > > We already do that to some degree in various cases, but for some if the
> > > > > only way that becomes available to disambiguate some EINVAL return is a
> > > > > string, code will start having strcmps :-\
> > >
> > > > OK, so how about we do both, the offset+mask for the tools
> > > > and the string for the humans?
> > >
> > > Yeah, tooling tries to provide the best it can with the
> > > offset+mask, and if doesn't manage to do anything smart with
> > > it, just show the string and hope that helps the user to figure
> > > out what is happening.
> >
> > Almost: tooling should generally always consider the string as
> > well, for the (not so uncommon) case where there can be multiple
> > problems with the same field.
> >
> > Really, I think the string will give the most bang for the buck,
> > because it's really simple and straightforward on the kernel side
> > (so that we have a good chance of achieving full coverage
> > relatively quickly), and later on we could still complicate it
> > all with offset+mask if there's really a need.
> >
> > So lets start with an error string...
>
> I don't have a problem with the order of introduction of new
> error reporting mechanisms, or at least I can't think of one
> right now.
>
> So if we introduce strings now then tools/perf/ will trow them
> to the user when it still don't have fallbacks or any other UI
> indication of such an error.
>
> I wonder tho if we have any previous experience on some other
> project (or even in the kernel?) and how userspace ended up
> using it, if just presenting those strings to the user or if
> trying to parse it, etc, anybody?

I'm not aware of any such efforts in the Linux space - subsystems
with administrative interfaces generally just tend to printk() a
reason - that's obviously suboptimal in several ways.

Programmatic use in user-spaec is very simple - go with my
initial example, tooling can either just display the error string
and bail out, or do:

if (unlikely(error)) {
if (!strcmp(attr->error_str, "x86/bts: BTS not supported by this CPU architecture")) {
fprintf(stderr, "x86/BTS: No hardware support falling back to branch sampling\n");
activate_x86_bts_fallback_code();
goto out;
}
if (!strcmp(attr->error_str, "x86/lbr: LBR not supported by this CPU architecture"))
goto out_err;
}

or it may do any number of other things, such as convert it to
its internal error code. Note that the error messages should have
some minimal structure (the 'x86/bts:' and 'x86/lbr' prefixes) to
organize things nicely and to make string clashes less likely.

as this is a slowpath the performance of strcmp() doesn't matter,
and in any case it's hardware accelerated or optimized well on
most platforms.

Thanks,

Ingo

2014-11-10 13:54:26

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [RFD] perf syscall error handling

Em Mon, Nov 10, 2014 at 01:24:47PM +0100, Ingo Molnar escreveu:
> * Arnaldo Carvalho de Melo <[email protected]> wrote:
> > Em Mon, Nov 10, 2014 at 11:27:25AM +0100, Ingo Molnar escreveu:
> > > * Arnaldo Carvalho de Melo <[email protected]> wrote:
> > > > Em Mon, Nov 03, 2014 at 05:50:19PM +0100, Peter Zijlstra escreveu:
> > > > > OK, so how about we do both, the offset+mask for the tools
> > > > > and the string for the humans?

It looks like machines don't have problems with strings 8-)

> > > > Yeah, tooling tries to provide the best it can with the
> > > > offset+mask, and if doesn't manage to do anything smart with
> > > > it, just show the string and hope that helps the user to figure
> > > > out what is happening.

> > > Almost: tooling should generally always consider the string as
> > > well, for the (not so uncommon) case where there can be multiple
> > > problems with the same field.

> > > Really, I think the string will give the most bang for the buck,
> > > because it's really simple and straightforward on the kernel side
> > > (so that we have a good chance of achieving full coverage
> > > relatively quickly), and later on we could still complicate it
> > > all with offset+mask if there's really a need.

> > > So lets start with an error string...

> > I don't have a problem with the order of introduction of new
> > error reporting mechanisms, or at least I can't think of one
> > right now.

> > So if we introduce strings now then tools/perf/ will trow them
> > to the user when it still don't have fallbacks or any other UI
> > indication of such an error.

> > I wonder tho if we have any previous experience on some other
> > project (or even in the kernel?) and how userspace ended up
> > using it, if just presenting those strings to the user or if
> > trying to parse it, etc, anybody?

> I'm not aware of any such efforts in the Linux space - subsystems
> with administrative interfaces generally just tend to printk() a
> reason - that's obviously suboptimal in several ways.

> Programmatic use in user-spaec is very simple - go with my
> initial example, tooling can either just display the error string
> and bail out, or do:

> if (unlikely(error)) {
> if (!strcmp(attr->error_str, "x86/bts: BTS not supported by this CPU architecture")) {
> fprintf(stderr, "x86/BTS: No hardware support falling back to branch sampling\n");
> activate_x86_bts_fallback_code();
> goto out;
> }
> if (!strcmp(attr->error_str, "x86/lbr: LBR not supported by this CPU architecture"))
> goto out_err;
> }

> or it may do any number of other things, such as convert it to
> its internal error code. Note that the error messages should have
> some minimal structure (the 'x86/bts:' and 'x86/lbr' prefixes) to
> organize things nicely and to make string clashes less likely.

Right, focus on the string format: Can we just have this two level
thing, first part separated by a slash, followed by colon, to identify
the origin of the message, and then a message, that can have further,
unspecified at this time, parser tokens as the need arises?

> as this is a slowpath the performance of strcmp() doesn't matter,
> and in any case it's hardware accelerated or optimized well on
> most platforms.

- Arnaldo

2014-11-10 14:14:15

by David Ahern

[permalink] [raw]
Subject: Re: [RFD] perf syscall error handling

On 11/10/14, 5:24 AM, Ingo Molnar wrote:
> Programmatic use in user-spaec is very simple - go with my
> initial example, tooling can either just display the error string
> and bail out, or do:
>
> if (unlikely(error)) {
> if (!strcmp(attr->error_str, "x86/bts: BTS not supported by this CPU architecture")) {
> fprintf(stderr, "x86/BTS: No hardware support falling back to branch sampling\n");
> activate_x86_bts_fallback_code();
> goto out;
> }

That makes the exact string content part of the ABI. As I recall ftrace
had a problem with format strings changing and tooling then limiting the
ability to change it.

David

2014-11-10 14:47:11

by Ingo Molnar

[permalink] [raw]
Subject: Re: [RFD] perf syscall error handling


* David Ahern <[email protected]> wrote:

> On 11/10/14, 5:24 AM, Ingo Molnar wrote:
> >Programmatic use in user-spaec is very simple - go with my
> >initial example, tooling can either just display the error string
> >and bail out, or do:
> >
> > if (unlikely(error)) {
> > if (!strcmp(attr->error_str, "x86/bts: BTS not supported by this CPU architecture")) {
> > fprintf(stderr, "x86/BTS: No hardware support falling back to branch sampling\n");
> > activate_x86_bts_fallback_code();
> > goto out;
> > }
>
> That makes the exact string content part of the ABI. [...]

That's ok: messages might still disappear, new messages might
still be introduced.

> [...] As I recall ftrace had a problem with format strings
> changing and tooling then limiting the ability to change it.

I think there's a real difference between extended strings
provided by an error/exception path (perf) and strings provided
by the primary ABI (ftrace).

In the first case tooling is still functional without extended
strings, in the second case ftrace tooling is useless if it
cannot interpret the ftrace string output.

So it's apples to oranges.

Thanks,

Ingo