2020-09-07 17:15:51

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v8 3/3] binder: add transaction latency tracer

On Mon, Sep 07, 2020 at 09:51:15PM +0800, Frankie Chang wrote:
> On Mon, 2020-09-07 at 14:25 +0200, Greg Kroah-Hartman wrote:
> > On Mon, Sep 07, 2020 at 08:00:55PM +0800, Frankie Chang wrote:
> > > From: "Frankie.Chang" <[email protected]>
> > >
> > > Record start/end timestamp for binder transaction.
> > > When transaction is completed or transaction is free,
> > > it would be checked if transaction latency over threshold
> > > (default 2 sec), if yes, printing related information for tracing.
> > >
> > > /* Implement details */
> > > - Add latency tracer module to monitor transaction
> > > by attaching to new tracepoints introduced
> > > when transactions are allocated and freed.
> > > The trace_binder_txn_latency_free would not be enabled
> > > by default. Monitoring which transaction is too slow to
> > > cause some of exceptions is important. So we hook the
> > > tracepoint to call the monitor function.
> > >
> > > - Since some of modules would trigger timeout NE
> > > if their binder transaction don't finish in time,
> > > such as audio timeout (5 sec), even BT command
> > > timeout (2 sec), etc.
> > > Therefore, setting the timeout threshold as default
> > > 2 seconds could be helpful to debug.
> > > But this timeout threshold is configurable, to let
> > > all users determine the more suitable threshold.
> > >
> > > - The reason why printing the related information to
> > > kernel information log but not trace buffer is that
> > > some abnormal transactions may be pending for a long
> > > time ago, they could not be recorded due to buffer
> > > limited.
> > >
> > > Signed-off-by: Frankie.Chang <[email protected]>
> > > Acked-by: Todd Kjos <[email protected]>
> > > ---
> > > drivers/android/Kconfig | 8 +++
> > > drivers/android/Makefile | 1 +
> > > drivers/android/binder.c | 6 ++
> > > drivers/android/binder_internal.h | 13 ++++
> > > drivers/android/binder_latency_tracer.c | 112 +++++++++++++++++++++++++++++++
> > > drivers/android/binder_trace.h | 26 ++++++-
> > > 6 files changed, 163 insertions(+), 3 deletions(-)
> > > create mode 100644 drivers/android/binder_latency_tracer.c
> > >
> > > diff --git a/drivers/android/Kconfig b/drivers/android/Kconfig
> > > index 53b22e2..8aadaf4 100644
> > > --- a/drivers/android/Kconfig
> > > +++ b/drivers/android/Kconfig
> > > @@ -54,6 +54,14 @@ config ANDROID_BINDER_IPC_SELFTEST
> > > exhaustively with combinations of various buffer sizes and
> > > alignments.
> > >
> > > +config BINDER_TRANSACTION_LATENCY_TRACKING
> > > + tristate "Android Binder transaction tracking"
> > > + help
> > > + Used for track abnormal binder transaction which is over threshold,
> > > + when the transaction is done or be free, this transaction would be
> > > + checked whether it executed overtime.
> > > + If yes, printing out the detailed info.
> > > +
> > > endif # if ANDROID
> > >
> > > endmenu
> > > diff --git a/drivers/android/Makefile b/drivers/android/Makefile
> > > index c9d3d0c9..c2ffdb6 100644
> > > --- a/drivers/android/Makefile
> > > +++ b/drivers/android/Makefile
> > > @@ -4,3 +4,4 @@ ccflags-y += -I$(src) # needed for trace events
> > > obj-$(CONFIG_ANDROID_BINDERFS) += binderfs.o
> > > obj-$(CONFIG_ANDROID_BINDER_IPC) += binder.o binder_alloc.o
> > > obj-$(CONFIG_ANDROID_BINDER_IPC_SELFTEST) += binder_alloc_selftest.o
> > > +obj-$(CONFIG_BINDER_TRANSACTION_LATENCY_TRACKING) += binder_latency_tracer.o
> > > diff --git a/drivers/android/binder.c b/drivers/android/binder.c
> > > index 0cab900..65ad38c 100644
> > > --- a/drivers/android/binder.c
> > > +++ b/drivers/android/binder.c
> > > @@ -2674,6 +2674,7 @@ static void binder_transaction(struct binder_proc *proc,
> > > return_error_line = __LINE__;
> > > goto err_alloc_t_failed;
> > > }
> > > + trace_binder_txn_latency_alloc(t);
> > > INIT_LIST_HEAD(&t->fd_fixups);
> > > binder_stats_created(BINDER_STAT_TRANSACTION);
> > > spin_lock_init(&t->lock);
> > > @@ -5177,6 +5178,7 @@ static void print_binder_transaction_ilocked(struct seq_file *m,
> > > to_proc ? to_proc->pid : 0,
> > > t->to_thread ? t->to_thread->pid : 0,
> > > t->code, t->flags, t->priority, t->need_reply);
> > > + trace_binder_txn_latency_info(m, t);
> > > spin_unlock(&t->lock);
> > >
> > > if (proc != to_proc) {
> > > @@ -5818,4 +5820,8 @@ static int __init binder_init(void)
> > > #define CREATE_TRACE_POINTS
> > > #include "binder_trace.h"
> > >
> > > +EXPORT_TRACEPOINT_SYMBOL(binder_txn_latency_alloc);
> > > +EXPORT_TRACEPOINT_SYMBOL(binder_txn_latency_info);
> > > +EXPORT_TRACEPOINT_SYMBOL(binder_txn_latency_free);
> > > +
> > > MODULE_LICENSE("GPL v2");
> > > diff --git a/drivers/android/binder_internal.h b/drivers/android/binder_internal.h
> > > index 872232f..309d1b10 100644
> > > --- a/drivers/android/binder_internal.h
> > > +++ b/drivers/android/binder_internal.h
> > > @@ -15,6 +15,11 @@
> > > #include <uapi/linux/android/binderfs.h>
> > > #include "binder_alloc.h"
> > >
> > > +#if IS_ENABLED(CONFIG_BINDER_TRANSACTION_LATENCY_TRACKING)
> > > +#include <uapi/linux/time_types.h>
> > > +#include <linux/rtc.h>
> > > +#endif
> > > +
> > > struct binder_context {
> > > struct binder_node *binder_context_mgr_node;
> > > struct mutex context_mgr_node_lock;
> > > @@ -524,6 +529,14 @@ struct binder_transaction {
> > > * during thread teardown
> > > */
> > > spinlock_t lock;
> > > + /**
> > > + * @timestamp and @tv are used to record the time
> > > + * that the binder transaction startup
> > > + */
> > > +#if IS_ENABLED(CONFIG_BINDER_TRANSACTION_LATENCY_TRACKING)
> > > + struct timespec64 timestamp;
> > > + struct __kernel_old_timeval tv;
> >
> > As I stated in the 0/3 email, please use the correct in-kernel structure
> > for this. That structure is _ONLY_ for old userspace api compatibility
> > issues. Which is not the case here at all.
> >
> > thanks,
> > greg k-h
>
> Since it's not proper to use timeval & __kernel_old_timeval.
> The remaining structure with tv_usec that I found is
> __kernel_timex_timeval, but I'm not sure if this is proper one or not.
> All I can see is in
> https://lwn.net/ml/linux-kernel/[email protected]/
>
> I'm afraid of using wrong structure again.
> Could you give me some advices about this?
>
> Or is there another suitable way to replace timeval

What exactly are you trying to measure such that you feel you need a
timeval structure?

thanks,

greg k-h


2020-09-08 05:39:59

by Frankie Chang

[permalink] [raw]
Subject: Re: [PATCH v8 3/3] binder: add transaction latency tracer

On Mon, 2020-09-07 at 16:09 +0200, Greg Kroah-Hartman wrote:
> On Mon, Sep 07, 2020 at 09:51:15PM +0800, Frankie Chang wrote:
> > On Mon, 2020-09-07 at 14:25 +0200, Greg Kroah-Hartman wrote:
> > > On Mon, Sep 07, 2020 at 08:00:55PM +0800, Frankie Chang wrote:
> > > > From: "Frankie.Chang" <[email protected]>
> > > >
> > > > Record start/end timestamp for binder transaction.
> > > > When transaction is completed or transaction is free,
> > > > it would be checked if transaction latency over threshold
> > > > (default 2 sec), if yes, printing related information for tracing.
> > > >
> > > > /* Implement details */
> > > > - Add latency tracer module to monitor transaction
> > > > by attaching to new tracepoints introduced
> > > > when transactions are allocated and freed.
> > > > The trace_binder_txn_latency_free would not be enabled
> > > > by default. Monitoring which transaction is too slow to
> > > > cause some of exceptions is important. So we hook the
> > > > tracepoint to call the monitor function.
> > > >
> > > > - Since some of modules would trigger timeout NE
> > > > if their binder transaction don't finish in time,
> > > > such as audio timeout (5 sec), even BT command
> > > > timeout (2 sec), etc.
> > > > Therefore, setting the timeout threshold as default
> > > > 2 seconds could be helpful to debug.
> > > > But this timeout threshold is configurable, to let
> > > > all users determine the more suitable threshold.
> > > >
> > > > - The reason why printing the related information to
> > > > kernel information log but not trace buffer is that
> > > > some abnormal transactions may be pending for a long
> > > > time ago, they could not be recorded due to buffer
> > > > limited.
> > > >
> > > > Signed-off-by: Frankie.Chang <[email protected]>
> > > > Acked-by: Todd Kjos <[email protected]>
> > > > ---
> > > > drivers/android/Kconfig | 8 +++
> > > > drivers/android/Makefile | 1 +
> > > > drivers/android/binder.c | 6 ++
> > > > drivers/android/binder_internal.h | 13 ++++
> > > > drivers/android/binder_latency_tracer.c | 112 +++++++++++++++++++++++++++++++
> > > > drivers/android/binder_trace.h | 26 ++++++-
> > > > 6 files changed, 163 insertions(+), 3 deletions(-)
> > > > create mode 100644 drivers/android/binder_latency_tracer.c
> > > >
> > > > diff --git a/drivers/android/Kconfig b/drivers/android/Kconfig
> > > > index 53b22e2..8aadaf4 100644
> > > > --- a/drivers/android/Kconfig
> > > > +++ b/drivers/android/Kconfig
> > > > @@ -54,6 +54,14 @@ config ANDROID_BINDER_IPC_SELFTEST
> > > > exhaustively with combinations of various buffer sizes and
> > > > alignments.
> > > >
> > > > +config BINDER_TRANSACTION_LATENCY_TRACKING
> > > > + tristate "Android Binder transaction tracking"
> > > > + help
> > > > + Used for track abnormal binder transaction which is over threshold,
> > > > + when the transaction is done or be free, this transaction would be
> > > > + checked whether it executed overtime.
> > > > + If yes, printing out the detailed info.
> > > > +
> > > > endif # if ANDROID
> > > >
> > > > endmenu
> > > > diff --git a/drivers/android/Makefile b/drivers/android/Makefile
> > > > index c9d3d0c9..c2ffdb6 100644
> > > > --- a/drivers/android/Makefile
> > > > +++ b/drivers/android/Makefile
> > > > @@ -4,3 +4,4 @@ ccflags-y += -I$(src) # needed for trace events
> > > > obj-$(CONFIG_ANDROID_BINDERFS) += binderfs.o
> > > > obj-$(CONFIG_ANDROID_BINDER_IPC) += binder.o binder_alloc.o
> > > > obj-$(CONFIG_ANDROID_BINDER_IPC_SELFTEST) += binder_alloc_selftest.o
> > > > +obj-$(CONFIG_BINDER_TRANSACTION_LATENCY_TRACKING) += binder_latency_tracer.o
> > > > diff --git a/drivers/android/binder.c b/drivers/android/binder.c
> > > > index 0cab900..65ad38c 100644
> > > > --- a/drivers/android/binder.c
> > > > +++ b/drivers/android/binder.c
> > > > @@ -2674,6 +2674,7 @@ static void binder_transaction(struct binder_proc *proc,
> > > > return_error_line = __LINE__;
> > > > goto err_alloc_t_failed;
> > > > }
> > > > + trace_binder_txn_latency_alloc(t);
> > > > INIT_LIST_HEAD(&t->fd_fixups);
> > > > binder_stats_created(BINDER_STAT_TRANSACTION);
> > > > spin_lock_init(&t->lock);
> > > > @@ -5177,6 +5178,7 @@ static void print_binder_transaction_ilocked(struct seq_file *m,
> > > > to_proc ? to_proc->pid : 0,
> > > > t->to_thread ? t->to_thread->pid : 0,
> > > > t->code, t->flags, t->priority, t->need_reply);
> > > > + trace_binder_txn_latency_info(m, t);
> > > > spin_unlock(&t->lock);
> > > >
> > > > if (proc != to_proc) {
> > > > @@ -5818,4 +5820,8 @@ static int __init binder_init(void)
> > > > #define CREATE_TRACE_POINTS
> > > > #include "binder_trace.h"
> > > >
> > > > +EXPORT_TRACEPOINT_SYMBOL(binder_txn_latency_alloc);
> > > > +EXPORT_TRACEPOINT_SYMBOL(binder_txn_latency_info);
> > > > +EXPORT_TRACEPOINT_SYMBOL(binder_txn_latency_free);
> > > > +
> > > > MODULE_LICENSE("GPL v2");
> > > > diff --git a/drivers/android/binder_internal.h b/drivers/android/binder_internal.h
> > > > index 872232f..309d1b10 100644
> > > > --- a/drivers/android/binder_internal.h
> > > > +++ b/drivers/android/binder_internal.h
> > > > @@ -15,6 +15,11 @@
> > > > #include <uapi/linux/android/binderfs.h>
> > > > #include "binder_alloc.h"
> > > >
> > > > +#if IS_ENABLED(CONFIG_BINDER_TRANSACTION_LATENCY_TRACKING)
> > > > +#include <uapi/linux/time_types.h>
> > > > +#include <linux/rtc.h>
> > > > +#endif
> > > > +
> > > > struct binder_context {
> > > > struct binder_node *binder_context_mgr_node;
> > > > struct mutex context_mgr_node_lock;
> > > > @@ -524,6 +529,14 @@ struct binder_transaction {
> > > > * during thread teardown
> > > > */
> > > > spinlock_t lock;
> > > > + /**
> > > > + * @timestamp and @tv are used to record the time
> > > > + * that the binder transaction startup
> > > > + */
> > > > +#if IS_ENABLED(CONFIG_BINDER_TRANSACTION_LATENCY_TRACKING)
> > > > + struct timespec64 timestamp;
> > > > + struct __kernel_old_timeval tv;
> > >
> > > As I stated in the 0/3 email, please use the correct in-kernel structure
> > > for this. That structure is _ONLY_ for old userspace api compatibility
> > > issues. Which is not the case here at all.
> > >
> > > thanks,
> > > greg k-h
> >
> > Since it's not proper to use timeval & __kernel_old_timeval.
> > The remaining structure with tv_usec that I found is
> > __kernel_timex_timeval, but I'm not sure if this is proper one or not.
> > All I can see is in
> > https://lwn.net/ml/linux-kernel/[email protected]/
> >
> > I'm afraid of using wrong structure again.
> > Could you give me some advices about this?
> >
> > Or is there another suitable way to replace timeval
>
> What exactly are you trying to measure such that you feel you need a
> timeval structure?
>
Actually I'd like to record timestamp with standard time structure, so I
used 'struct timeval' in the past patch. However it's deprecated, so I
am looking for the alternatives.

Additionally, I have already looked some patches that replacing timeval
with timespec64. I'll do the same way in next version patch.

Many thanks,
Frankie

2020-09-08 20:18:09

by Frankie Chang

[permalink] [raw]
Subject: [PATCH v9] binder: transaction latency tracking for user build

Change from v9:
- rename timestamp to ts in binder_internal.h for conciseness.
- change 'struct timeval' to 'struct timespec64' in binder_internal.h.

Change from v8:
- change rtc_time_to_tm to rtc_time64_to_tm.
- change timeval to __kernel_old_timeval due to
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c766d1472c70d25ad475cf56042af1652e792b23
- export tracepoint symbol for binder_txn_latency_* which binder_transaction_latency_tracer to be ko needed.

Change from v7:
- Use the passed-in values instead of accessing via t->from/to_proc/to_thread
for trace_binder_txn_latency_free, when trace_binder_txn_latency_free_enable() return true.
- make a helper function to do the above.

Change from v6:
- change CONFIG_BINDER_TRANSACTION_LATENCY_TRACKING type from bool to tristate
- add comments to @timestamp and @tv under struct binder_transaction
- make binder_txn_latency threshold configurable
- enhance lock protection

Change from v5:
- change config name to the proper one, CONFIG_BINDER_TRANSACTION_LATENCY_TRACKING.
- change tracepoint name to more descriptive one, trace_binder_txn_latency_(alloc|info|free)
- enhance some lock protection.

Change from v4:
- split up into patch series.

Change from v3:
- use tracepoints for binder_update_info and print_binder_transaction_ext,
instead of custom registration functions.

Change from v2:
- create transaction latency module to monitor slow transaction.

Change from v1:
- first patchset.


Frankie.Chang (3):
binder: move structs from core file to header file
binder: add trace at free transaction.
binder: add transaction latency tracer

drivers/android/Kconfig | 8 +
drivers/android/Makefile | 1 +
drivers/android/binder.c | 430 ++----------------------
drivers/android/binder_internal.h | 418 +++++++++++++++++++++++
drivers/android/binder_latency_tracer.c | 108 ++++++
drivers/android/binder_trace.h | 49 +++
6 files changed, 608 insertions(+), 406 deletions(-)
create mode 100644 drivers/android/binder_latency_tracer.c

2020-09-16 15:47:13

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v9] binder: transaction latency tracking for user build

On Tue, Sep 08, 2020 at 10:06:45PM +0800, Frankie Chang wrote:
> Change from v9:
> - rename timestamp to ts in binder_internal.h for conciseness.
> - change 'struct timeval' to 'struct timespec64' in binder_internal.h.

These just blew up in the 0-day testing, again, so I'm dropping them
from my tree.

Can you please fix them up and resend when you have them properly
building?

thanks,

greg k-h