Received: by 2002:a05:6a10:eb17:0:0:0:0 with SMTP id hx23csp1157063pxb; Thu, 9 Sep 2021 22:39:09 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyV/pfowrmNerC9c7bMKZl1zTj1PplHLcHzzLZRqameIwVKuZxFyp/8CMRmbnT0NS2n/E+3 X-Received: by 2002:a05:6602:584:: with SMTP id v4mr5769011iox.85.1631252348836; Thu, 09 Sep 2021 22:39:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1631252348; cv=none; d=google.com; s=arc-20160816; b=SYtGeMjbGCeC3kPlWZaDEjeygAccxF4Fzl+Dxo4Qz0WtvaolL59oiElWX+76ybkX1y n2nOTYz6FJPzlUIWhA4xe6UqCbXOaGTMKx4SlkxVLrDDZaz9xWD/LGglCauJbC/kMITT sHCrLsSENEM9A4yioJxFdtv0O26OL/wM4mduYWN9MV3BnmpodzWVYCl0rr8GPSk8MAro URKlD//Eu0qjiGYBpuZp44jsG03ORKqJ8qVvjyRU/LDjwe6735Iv66FT8uZetwQnWKzW QJYiDRPUPZ5tBZwwO3jadSEslomIxzk8z9WaiEQ5Wq1LG7NCQR2wGbVVsFpForgmQ4rC wfgA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=tXExFUv34Tdm4rZ6L7kjjjK/2E0WJiLKbvGC53cZojM=; b=ytzbyD6/rvHHgotKrUhrZyxwK5A3YPNXg/0mmJnRAX7auEvriGImpKssMBNyUdV8+U rHwNvvG27cvgEdfgfAHUD53QAWLRmUZlaoUSVny0e1kOhM9VolNkJl75gQ4yFBI/q2nK mGI0zyrmlc9+sT9MZTGtcDJ1i3h0dbYcsAHRidRVQ9dT9p2TXYD8n4KbnJepFb/QuXDk 3MUzMJxMiHnr4SQNwuQ+csZVcT0tM2lJaM2MjoYMcQDTtXMOdapaAHGZsKTpAfusgjWA Ip8vB0kpg50IaXnROz+IBvH8fucnvmVnAxK8pav+dIq/WAcjevS3mBvfzI1mCt2QNvLs +cRg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linuxfoundation.org header.s=korg header.b=0Yn70zDD; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id g34si3605142jaa.104.2021.09.09.22.38.57; Thu, 09 Sep 2021 22:39:08 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@linuxfoundation.org header.s=korg header.b=0Yn70zDD; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230382AbhIJFj1 (ORCPT + 99 others); Fri, 10 Sep 2021 01:39:27 -0400 Received: from mail.kernel.org ([198.145.29.99]:59904 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230037AbhIJFj0 (ORCPT ); Fri, 10 Sep 2021 01:39:26 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 884B9610A4; Fri, 10 Sep 2021 05:38:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1631252296; bh=UkvzRsFG9WTg1yXv7VFyTbQl8WUU/1hcYazJMYgXs3Y=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=0Yn70zDDQH3hEZCqTWA55JqRaFgOh+xxJANfCfTkEttvL9q2YJm0Jly/xbAlNc+1a p6S+rn7ujgWEhtceYi390v5MwC7xhvbRyLyjoGyrnegkT2SQ1kkNg/ydAnl5oVhboh Tt6opXU792IwGtzlJ7ZbQ0pfEKNUgKK6i/AxDeB4= Date: Fri, 10 Sep 2021 07:37:53 +0200 From: Greg KH To: Li Li Cc: dualli@google.com, tkjos@google.com, christian@brauner.io, arve@android.com, devel@driverdev.osuosl.org, linux-kernel@vger.kernel.org, maco@google.com, hridya@google.com, surenb@google.com, joel@joelfernandes.org, kernel-team@android.com Subject: Re: [PATCH v2 1/1] binder: fix freeze race Message-ID: References: <20210910035316.2873554-1-dualli@chromium.org> <20210910035316.2873554-2-dualli@chromium.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20210910035316.2873554-2-dualli@chromium.org> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Sep 09, 2021 at 08:53:16PM -0700, Li Li wrote: > From: Li Li > > Currently cgroup freezer is used to freeze the application threads, and > BINDER_FREEZE is used to freeze the corresponding binder interface. > There's already a mechanism in ioctl(BINDER_FREEZE) to wait for any > existing transactions to drain out before actually freezing the binder > interface. > > But freezing an app requires 2 steps, freezing the binder interface with > ioctl(BINDER_FREEZE) and then freezing the application main threads with > cgroupfs. This is not an atomic operation. The following race issue > might happen. > > 1) Binder interface is frozen by ioctl(BINDER_FREEZE); > 2) Main thread A initiates a new sync binder transaction to process B; > 3) Main thread A is frozen by "echo 1 > cgroup.freeze"; > 4) The response from process B reaches the frozen thread, which will > unexpectedly fail. > > This patch provides a mechanism to check if there's any new pending > transaction happening between ioctl(BINDER_FREEZE) and freezing the > main thread. If there's any, the main thread freezing operation can > be rolled back to finish the pending transaction. > > Furthermore, the response might reach the binder driver before the > rollback actually happens. That will still cause failed transaction. > > As the other process doesn't wait for another response of the response, > the response transaction failure can be fixed by treating the response > transaction like an oneway/async one, allowing it to reach the frozen > thread. And it will be consumed when the thread gets unfrozen later. > > Fixes: 432ff1e91694 ("binder: BINDER_FREEZE ioctl") > Signed-off-by: Li Li > --- > drivers/android/binder.c | 34 +++++++++++++++++++++++++---- > drivers/android/binder_internal.h | 2 ++ > include/uapi/linux/android/binder.h | 7 ++++++ > 3 files changed, 39 insertions(+), 4 deletions(-) > > diff --git a/drivers/android/binder.c b/drivers/android/binder.c > index d9030cb6b1e4..eaffdf5f692c 100644 > --- a/drivers/android/binder.c > +++ b/drivers/android/binder.c > @@ -3038,9 +3038,8 @@ static void binder_transaction(struct binder_proc *proc, > if (reply) { > binder_enqueue_thread_work(thread, tcomplete); > binder_inner_proc_lock(target_proc); > - if (target_thread->is_dead || target_proc->is_frozen) { > - return_error = target_thread->is_dead ? > - BR_DEAD_REPLY : BR_FROZEN_REPLY; > + if (target_thread->is_dead) { > + return_error = BR_DEAD_REPLY; > binder_inner_proc_unlock(target_proc); > goto err_dead_proc_or_thread; > } > @@ -4648,6 +4647,22 @@ static int binder_ioctl_get_node_debug_info(struct binder_proc *proc, > return 0; > } > > +static int binder_txns_pending_ilocked(struct binder_proc *proc) > +{ > + struct rb_node *n; > + struct binder_thread *thread; > + > + if (proc->outstanding_txns > 0) > + return 1; > + > + for (n = rb_first(&proc->threads); n; n = rb_next(n)) { > + thread = rb_entry(n, struct binder_thread, rb_node); > + if (thread->transaction_stack) > + return 1; > + } > + return 0; > +} > + > static int binder_ioctl_freeze(struct binder_freeze_info *info, > struct binder_proc *target_proc) > { > @@ -4682,6 +4697,14 @@ static int binder_ioctl_freeze(struct binder_freeze_info *info, > if (!ret && target_proc->outstanding_txns) > ret = -EAGAIN; > > + /* Also check pending transactions that wait for reply */ > + if (ret >= 0) { > + binder_inner_proc_lock(target_proc); > + if (binder_txns_pending_ilocked(target_proc)) > + ret = -EAGAIN; > + binder_inner_proc_unlock(target_proc); > + } > + > if (ret < 0) { > binder_inner_proc_lock(target_proc); > target_proc->is_frozen = false; > @@ -4696,6 +4719,7 @@ static int binder_ioctl_get_freezer_info( > { > struct binder_proc *target_proc; > bool found = false; > + int txns_pending; > > info->sync_recv = 0; > info->async_recv = 0; > @@ -4705,7 +4729,9 @@ static int binder_ioctl_get_freezer_info( > if (target_proc->pid == info->pid) { > found = true; > binder_inner_proc_lock(target_proc); > - info->sync_recv |= target_proc->sync_recv; > + txns_pending = binder_txns_pending_ilocked(target_proc); > + info->sync_recv |= target_proc->sync_recv | > + (txns_pending << 1); > info->async_recv |= target_proc->async_recv; > binder_inner_proc_unlock(target_proc); > } > diff --git a/drivers/android/binder_internal.h b/drivers/android/binder_internal.h > index 810c0b84d3f8..402c4d4362a8 100644 > --- a/drivers/android/binder_internal.h > +++ b/drivers/android/binder_internal.h > @@ -378,6 +378,8 @@ struct binder_ref { > * binder transactions > * (protected by @inner_lock) > * @sync_recv: process received sync transactions since last frozen > + * bit 0: received sync transaction after being frozen > + * bit 1: new pending sync transaction during freezing > * (protected by @inner_lock) > * @async_recv: process received async transactions since last frozen > * (protected by @inner_lock) > diff --git a/include/uapi/linux/android/binder.h b/include/uapi/linux/android/binder.h > index 20e435fe657a..3246f2c74696 100644 > --- a/include/uapi/linux/android/binder.h > +++ b/include/uapi/linux/android/binder.h > @@ -225,7 +225,14 @@ struct binder_freeze_info { > > struct binder_frozen_status_info { > __u32 pid; > + > + /* process received sync transactions since last frozen > + * bit 0: received sync transaction after being frozen > + * bit 1: new pending sync transaction during freezing > + */ > __u32 sync_recv; You just changed a user/kernel api here, did you just break existing userspace applications? If not, how did that not happen? thanks, greg k-h