Received: by 2002:a05:6a10:6d25:0:0:0:0 with SMTP id gq37csp2028691pxb; Mon, 13 Sep 2021 10:20:56 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxrgrL+VR/dYOEYr5rmrPEglIohNAhUuUynxiocaIKy4+gRiyiZGKk/i31dIJUmWvU9ai+I X-Received: by 2002:aa7:d7c8:: with SMTP id e8mr14172040eds.381.1631553656086; Mon, 13 Sep 2021 10:20:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1631553656; cv=none; d=google.com; s=arc-20160816; b=PxpVHWbz5UYSqMCB9/L/7kKP/jGPDbXBtF3zcK2elcP+LV+jBlmzcIO79RLqfPijip m0k6TJfpL0sU27sJWHq1qNlMcApNj5tD1Y/cnMM7gtd9CezLnx3fYt7qAfd8srQUYrAl MGcw4Jy7cm4apfp9OhznnaJ9YL87ZBmaNZjA1sQlESC+StQ0QVc6E9Xr4XpaFDs0aeFy lP7eFIObuD1zOBQIOeHtO1wC1Uu+RuA3wWDclktx5c+MuGshKHf6kBgdRtsOB+icX25l 3sRbHr4ayzagTq9/g9S/icNb08gKojZXAH8iuQrRt5aAPnhgHsXh8z53RM7xA6iLpzyj Ttxg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=E4YV9pWz0lzIhd+YfUEnVg61Zq9n7qhIbQA2SNDsEUM=; b=VbX0rZi5TJmY57lRESEjEaI3GL8DSHiSfwui36DjWwavea6J5sBmhzJNiPKXJAVk8q etB0M6C+F3WMTu2ICKHCozs827TtcEJ1RL1MGX1CjUd/iDah4HHoMDE0/z8MLd62SA/M 6JDCoFSIrIdDUPJzxp1KlosGu4LPMwa7U6Qw/dhQN+3cyVffygrhEb81tenDFgp1fVS/ D+KjQWSwUmDk8o40EZOYiW8HLy5KTRgHScX+6RgYFHLXXDhkRNUvRtfojgqjpdlZY86l NapmTbjZggMqxGy5MdlIGl20dCknK0z3Ho2R4TWMZ/XXke1G5v8+2nh2izBgSwYISoRf 3c6Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=aSiTbWmd; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id x1si7360198ejd.605.2021.09.13.10.20.27; Mon, 13 Sep 2021 10:20:56 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=aSiTbWmd; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240075AbhIMRUG (ORCPT + 99 others); Mon, 13 Sep 2021 13:20:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57124 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234182AbhIMRUF (ORCPT ); Mon, 13 Sep 2021 13:20:05 -0400 Received: from mail-ej1-x630.google.com (mail-ej1-x630.google.com [IPv6:2a00:1450:4864:20::630]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D989FC061574 for ; Mon, 13 Sep 2021 10:18:49 -0700 (PDT) Received: by mail-ej1-x630.google.com with SMTP id x11so22700182ejv.0 for ; Mon, 13 Sep 2021 10:18:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=E4YV9pWz0lzIhd+YfUEnVg61Zq9n7qhIbQA2SNDsEUM=; b=aSiTbWmdIw42qZxE1KiBIhTgkbMfBjVrSueGpx83IEEmRKPYwwdohg4G0zgNEjiaBf A3N97HrRIEH2iD6tSJfHS8j5OGEK+66a511MEjXsXtNWqdezha/rvXl5wamCr7dJwxQx d9CeFkTGUlgcU9SxmAR/UJ8k4pcs4AJ06INUU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=E4YV9pWz0lzIhd+YfUEnVg61Zq9n7qhIbQA2SNDsEUM=; b=HpLi9DlP5Tm1Q31ravwWsDV66ltvLMtWa/BvQKUokdZ6r7+lcg3yzP8b7/wd66HbEN 6tx0dtbRn+52UAW1WRoKAM8X7E/QmDNGUzzXGxXVQUKU4i0fQuxpDSTIU6GLMLpABO9A FmbG88ZdTn/kZicD/ORXjG3yE4wKC3obN2u4m1r3X2YwrpHS8APgbt30ZqEqLUBT28Lt 971WOHM3pShXntT2WkKABPzgDBcHL9POviote14veAFnI6jEkRIASCIBKGFUIHD9jbUO GwnOWA+noXqu9Z5Z7uxbWDXkbnkUtuRH5yayCfDVSczXC/VbpNJbKkJJLVEjTo2r1zXY 0hzg== X-Gm-Message-State: AOAM532vGQ8iNopEmlNFiqoQBfRQQ7UWMeclNDnbcfqK1T3qnPjuqc8Q ZluHwQ0Xh6Vtdk7DE32Ii2eMNLHw36ypxIBBqNdfUQ== X-Received: by 2002:a17:906:180a:: with SMTP id v10mr14105396eje.112.1631553528438; Mon, 13 Sep 2021 10:18:48 -0700 (PDT) MIME-Version: 1.0 References: <20210910035316.2873554-1-dualli@chromium.org> <20210910035316.2873554-2-dualli@chromium.org> In-Reply-To: From: Li Li Date: Mon, 13 Sep 2021 10:18:37 -0700 Message-ID: Subject: Re: [PATCH v2 1/1] binder: fix freeze race To: Greg KH Cc: Li Li , "open list:ANDROID DRIVERS" , Android Kernel Team , LKML , "Joel Fernandes (Google)" , =?UTF-8?B?QXJ2ZSBIasO4bm5ldsOlZw==?= , Martijn Coenen , Hridya Valsaraju , Suren Baghdasaryan , Christian Brauner , Todd Kjos Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Sep 10, 2021 at 12:15 AM Greg KH wrote: > > On Thu, Sep 09, 2021 at 11:17:42PM -0700, Li Li wrote: > > On Thu, Sep 9, 2021 at 10:38 PM Greg KH wrote: > > > > > > On Thu, Sep 09, 2021 at 08:53:16PM -0700, Li Li wrote: > > > > struct binder_frozen_status_info { > > > > __u32 pid; > > > > + > > > > + /* process received sync transactions since last frozen > > > > + * bit 0: received sync transaction after being frozen > > > > + * bit 1: new pending sync transaction during freezing > > > > + */ > > > > __u32 sync_recv; > > > > > > You just changed a user/kernel api here, did you just break existing > > > userspace applications? If not, how did that not happen? > > > > > > > That's a good question. This design does keep backward compatibility. > > > > The existing userspace applications call ioctl(BINDER_GET_FROZEN_INFO) > > to check if there's sync or async binder transactions sent to a frozen > > process. > > > > If the existing userspace app runs on a new kernel, a sync binder call > > still sets bit 1 of sync_recv (as it's a bool variable) so the ioctl > > will return the expected value (TRUE). The app just doesn't check bit > > 1 intentionally so it doesn't have the ability to tell if there's a > > race - this behavior is aligned with what happens on an old kernel as > > the old kernel doesn't have bit 1 set at all. > > > > The bit 1 of sync_recv enables new userspace app the ability to tell > > 1) there's a sync binder transaction happened when being frozen - same > > as before; and 2) if that sync binder transaction happens exactly when > > there's a race - a new information for rollback decision. > > Ah, can you add that to the changelog text to make it more obvious? > Sure, added that to V3, plus other minor improvements listed in the cover letter. Please let me know if there's anything else I should continue to improve. https://lore.kernel.org/lkml/20210910164210.2282716-1-dualli@chromium.org/ BTW, I had a stress test running, repeatedly freezing and unfreezing a couple apps every second, which at the same time initiates new binder transactions in a loop. The overnight stress test during the past weekend showed positive results. Without this kernel patch, the reply transaction will fail in tens of iterations. With this kernel patch and the corresponding user space fix (rescheduling the freeze op to next second in case race happens), the stress test runs for 24hrs without a single failure. Thanks, Li