From: ebiederm@xmission.com (Eric W. Biederman)
To: Dave Jones
Cc: Linus Torvalds, Al Viro, Linux Kernel, syzkaller-bugs@googlegroups.com,
    Gargi Sharma, Alexey Dobriyan, Oleg Nesterov, Rik van Riel, Andrew Morton
Date: Thu, 21 Dec 2017 02:26:53 -0600
Message-ID: <87po78trjm.fsf@xmission.com>
In-Reply-To: <20171221031606.GA4636@codemonkey.org.uk> (Dave Jones's message of "Wed, 20 Dec 2017 22:16:06 -0500")
References: <20171218221541.GP21978@ZenIV.linux.org.uk> <20171218231013.GA9481@codemonkey.org.uk>
    <20171219033926.GA26981@codemonkey.org.uk> <87lghy7eul.fsf@xmission.com>
    <20171219193020.GA9237@codemonkey.org.uk> <878tdy5r5t.fsf@xmission.com>
    <87mv2e17vz.fsf@xmission.com> <20171220052803.GA17079@codemonkey.org.uk>
    <871sjp1cjz.fsf@xmission.com> <20171221031606.GA4636@codemonkey.org.uk>
Subject: Re: proc_flush_task oops

Dave Jones writes:

> On Wed, Dec 20, 2017 at 12:25:52PM -0600, Eric W. Biederman wrote:
> > > >
> > > > If the warning triggers it means the bug is in alloc_pid and somehow
> > > > something has gotten past the is_child_reaper check.
> > >
> > > You're onto something.
> > >
> > I am not seeing where things go wrong, but that puts the recent pid bitmap, bit
> > hash to idr change in the suspect zone.
> >
> > Can you try reverting that change:
> >
> > e8cfbc245e24 ("pid: remove pidhash")
> > 95846ecf9dac ("pid: replace pid bitmap implementation with IDR API")
> >
> > While keeping the warning in place so we can see if this fixes the
> > allocation problem?
>
> So I can't trigger this any more with those reverted. I seem to hit a
> bunch of other long-standing bugs first.
> I'll keep running it overnight, but it looks like this is where the
> problem lies.

I would really like to hear from the people who made this change if they
are interested in tracking down this failure.

It might be as simple as the locking changed enough that the locking
instrumentation is now slowing things down, and opening up an old race.

I have stared at this code and written some test programs, and I can't
see what is going on. alloc_pid by design and in implementation (as far
as I can see) is always single threaded when allocating the first pid in
a pid namespace. idr_init always initializes idr_next to 0.

So how we can get past:

	if (unlikely(is_child_reaper(pid))) {
		if (pid_ns_prepare_proc(ns)) {
			disable_pid_allocation(ns);
			goto out_free;
		}
	}

with proc_mnt still set to NULL is a mystery to me.

Is there any chance the idr code doesn't always return the lowest valid
free number? So init gets assigned something other than 1?

Eric
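
To make the closing question concrete, here is a minimal user-space sketch
of a cursor-based ("cyclic") id allocator. It is a toy model and not the
kernel idr code: the names toy_ids, toy_alloc_cyclic, toy_free and
TOY_MAX_ID are hypothetical and exist only for illustration. It shows that
such an allocator hands out the lowest free id only while its cursor sits
at its initial value.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_ID 8	/* hypothetical, deliberately tiny id space */

/* Toy model only; this is NOT the kernel's struct idr. */
struct toy_ids {
	bool used[TOY_MAX_ID];
	int next;	/* cursor: where the next search starts */
};

/*
 * Allocate an id in [min, TOY_MAX_ID). The search starts at the cursor
 * (or at min, whichever is larger) and wraps around to min.
 * Returns -1 when every id in the range is taken.
 */
static int toy_alloc_cyclic(struct toy_ids *t, int min)
{
	int span = TOY_MAX_ID - min;
	int start = t->next > min ? t->next : min;

	if (start >= TOY_MAX_ID)
		start = min;

	for (int i = 0; i < span; i++) {
		int id = min + (start - min + i) % span;

		if (!t->used[id]) {
			t->used[id] = true;
			t->next = id + 1;
			return id;
		}
	}
	return -1;
}

/* Release an id. The cursor is deliberately left where it was. */
static void toy_free(struct toy_ids *t, int id)
{
	assert(id >= 0 && id < TOY_MAX_ID);
	t->used[id] = false;
}
```

With a zero-initialized struct toy_ids, the first toy_alloc_cyclic(&t, 1)
returns 1. If that id is then freed and another allocation is made, the toy
allocator returns 2 even though 1 is free again, because the cursor was not
rewound. Whether anything comparable can happen in the real alloc_pid path
is exactly the open question above; the sketch makes no claim about it.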