Received: by 2002:a25:e74b:0:0:0:0:0 with SMTP id e72csp1225667ybh; Thu, 16 Jul 2020 06:42:12 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxofsMMerb3l2y4feNWAFtx302oVl/PC40ussu61dEQa+wGVPgbbpJ6EYpePpe+1VGwPqCH X-Received: by 2002:a17:906:6d56:: with SMTP id a22mr3980686ejt.440.1594906932483; Thu, 16 Jul 2020 06:42:12 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1594906932; cv=none; d=google.com; s=arc-20160816; b=aVs3MFDR1+hXairbNcUtpTVxII3GvTdUuXSZDLxUUcQIBhC8rGPegd+r394BKDraVZ SzhrEf0MOeYqp+gJTf3TZp7BG3qMvfR6itWMpAOwcrFImKnZsdkXAdjJY7fzdHcxByGv 0/wvqqD8+0wbmJxsHcSY8Ad81fnyfy4DIyN9VJjDwJJqG3rKHyYNvckUe7kGPQChHA6t M5txQATthFanSQDJKJlTZaLRXBHGzWMV/1lf5radAnLfUFQ+BNyQsnR+W5u1Kc4CkOiV DYnV3aCwbRGTXgrgRQcOJ6BOLwWv+eD1TJv6vcjvgG/AY75Q+Zr1SNhy47s15HkFAB94 zwfA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject; bh=MqZQi1Y5c7G6d49ZPqTltiEtpJ/41gtBaelPZBFz/yA=; b=RXI3iSiPfpUo13Xt2ymGLk+wyk84xsLcisfNF9ICsIQr2c0drFwzSAbrS9NcJkI/WX aEj+pbuAh0Xz4jirE2Isw6NJjiEuwfEKu1Xvm7NEiWkfqeJBZPIe2Z35USqIplu1RjBz 5ontguBJRqETq1w03RyVNQfnyPtFod7HERBAVJ5kL1uUChbi7Eiu/pcoYmsICF0hWHal lii1MIFHMZGMgDzwqaIg8ncUpDVryXkRLthiwEqk0y0S5qKn5aPrhO8E8SBKU3pSyv13 AhR1O6l8DmTEGQLzemRsvq8Jk7SecpZk3hDVhcjGk4KWvW8pTjiUbRdU04VoKXhKBzpt f60A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id b5si3252819edy.563.2020.07.16.06.41.49; Thu, 16 Jul 2020 06:42:12 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728762AbgGPNlU (ORCPT + 99 others); Thu, 16 Jul 2020 09:41:20 -0400 Received: from www262.sakura.ne.jp ([202.181.97.72]:49244 "EHLO www262.sakura.ne.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725975AbgGPNlT (ORCPT ); Thu, 16 Jul 2020 09:41:19 -0400 Received: from fsav108.sakura.ne.jp (fsav108.sakura.ne.jp [27.133.134.235]) by www262.sakura.ne.jp (8.15.2/8.15.2) with ESMTP id 06GDfHDn085146; Thu, 16 Jul 2020 22:41:17 +0900 (JST) (envelope-from penguin-kernel@i-love.sakura.ne.jp) Received: from www262.sakura.ne.jp (202.181.97.72) by fsav108.sakura.ne.jp (F-Secure/fsigk_smtp/550/fsav108.sakura.ne.jp); Thu, 16 Jul 2020 22:41:17 +0900 (JST) X-Virus-Status: clean(F-Secure/fsigk_smtp/550/fsav108.sakura.ne.jp) Received: from [192.168.1.9] (M106072142033.v4.enabler.ne.jp [106.72.142.33]) (authenticated bits=0) by www262.sakura.ne.jp (8.15.2/8.15.2) with ESMTPSA id 06GDfGUv085143 (version=TLSv1.2 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 16 Jul 2020 22:41:17 +0900 (JST) (envelope-from penguin-kernel@i-love.sakura.ne.jp) Subject: Re: [PATCH] binder: Don't use mmput() from shrinker function. To: Michal Hocko Cc: Greg Kroah-Hartman , Arve Hjonnevag , Todd Kjos , Martijn Coenen , Joel Fernandes , Christian Brauner , syzbot , acme@kernel.org, alexander.shishkin@linux.intel.com, jolsa@redhat.com, linux-kernel@vger.kernel.org, mark.rutland@arm.com, mingo@redhat.com, namhyung@kernel.org, peterz@infradead.org, syzkaller-bugs@googlegroups.com, "open list:ANDROID DRIVERS" , linux-mm References: <0000000000001fbbb605aa805c9b@google.com> <5ce3ee90-333e-638d-ac8c-cd6d7ab7aa3b@I-love.SAKURA.ne.jp> <20200716083506.GA20915@dhcp22.suse.cz> From: Tetsuo Handa Message-ID: <36db7016-98d6-2c6b-110b-b2481fd480ac@i-love.sakura.ne.jp> Date: Thu, 16 Jul 2020 22:41:14 +0900 User-Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: <20200716083506.GA20915@dhcp22.suse.cz> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2020/07/16 17:35, Michal Hocko wrote: > On Thu 16-07-20 08:36:52, Tetsuo Handa wrote: >> syzbot is reporting that mmput() from shrinker function has a risk of >> deadlock [1]. Don't start synchronous teardown of mm when called from >> shrinker function. > > Please add the actual lock dependency to the changelog. > > Anyway is this deadlock real? Mayve I have missed some details but the > call graph points to these two paths. > uprobe_mmap do_shrink_slab > uprobes_mmap_hash #lock > install_breakpoint binder_shrink_scan > set_swbp binder_alloc_free_page > uprobe_write_opcode __mmput > update_ref_ctr uprobe_clear_state > mutex_lock(&delayed_uprobe_lock) mutex_lock(&delayed_uprobe_lock); > allocation -> reclaim > static int update_ref_ctr(struct uprobe *uprobe, struct mm_struct *mm, short d) { mutex_lock(&delayed_uprobe_lock); ret = delayed_uprobe_add(uprobe, mm1) { du = kzalloc(sizeof(*du), GFP_KERNEL) { do_shrink_slab() { binder_shrink_scan() { binder_alloc_free_page() { mmget_not_zero(mm2); mmput(mm2) { __mmput(mm2) { uprobe_clear_state(mm2) { mutex_lock(&delayed_uprobe_lock); delayed_uprobe_remove(NULL, mm2); mutex_unlock(&delayed_uprobe_lock); } } } } } } } } mutex_unlock(&delayed_uprobe_lock); } > But in order for this to happen the shrinker would have to do the last > put on the mm. But mm cannot go away from under uprobe_mmap so those two > paths cannot race with each other. and mm1 != mm2 is possible, isn't it? > > Unless I am missing something this is a false positive. I do not mind > using mmput_async from the shrinker as a workaround but the changelog > should be explicit about the fact. > binder_alloc_free_page() is already using mmput_async() 14 lines later. It just took 18 months to hit this race for the third time, for it is quite difficult to let the owner of mm2 to call mmput(mm2) between binder_alloc_free_page() calls mmget_not_zero(mm2) and mmput(mm2). The reason I added you is to see whether we can do void mmput(struct mm_struct *mm) { might_sleep(); + /* Calling mmput() from shrinker context can deadlock. */ + WARN_ON(current->flags & PF_MEMALLOC); if (atomic_dec_and_test(&mm->mm_users)) __mmput(mm); } in order to catch this bug easier.