Date: Tue, 22 May 2018 12:40:05 -0700
From: Sodagudi Prasad
To: keescook@chromium.org, luto@amacapital.net, wad@chromium.org,
    akpm@linux-foundation.org, riel@redhat.com, tglx@linutronix.de,
    mingo@kernel.org, peterz@infradead.org, ebiggers@google.com,
    fweisbec@gmail.com, sherryy@android.com, vegard.nossum@oracle.com,
    cl@linux.com, aarcange@redhat.com, alexander.levin@verizon.com
Cc: linux-kernel@vger.kernel.org, torvalds@linux-foundation.org
Subject: write_lock_irq(&tasklist_lock)
Message-ID: <0879f797135033e05e8e9166a3c85628@codeaurora.org>

Hi All,

When the following test is executed on the 4.14.41 stable kernel, I
observed that one of the cores waits for tasklist_lock for a long time
with IRQs disabled.

./stress-ng-64 --get 8 -t 3h --times --metrics-brief

Every time the device crashed, I observed that one task was stuck in the
fork system call, waiting for tasklist_lock as a writer with IRQs
disabled.
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/tree/kernel/fork.c?h=linux-4.14.y#n1843

Some other tasks are making getrlimit and prlimit system calls, so these
readers are continuously taking the tasklist_lock read lock.

The writer has disabled local IRQs for a long time and is waiting for the
readers to finish, but the readers keep tasklist_lock busy for quite a
long time.
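A rough user-space approximation of the scenario above, as a sketch only:
a few processes call getrlimit and prlimit in a loop (the readers, per the
description above) while the parent keeps calling fork (the writer) and
records the slowest fork it sees. This is not how stress-ng is
implemented; Python stands in for it here, and the names reader_loop and
run_workload are made up for illustration. It is not claimed that this
reproduces the crash.

```python
import os
import resource
import time


def reader_loop(deadline, write_fd):
    """Child process: call getrlimit() and prlimit() until the deadline."""
    calls = 0
    try:
        while True:
            resource.getrlimit(resource.RLIMIT_NOFILE)
            resource.prlimit(0, resource.RLIMIT_NOFILE)  # pid 0: this process
            calls += 2
            if time.monotonic() >= deadline:
                break
        os.write(write_fd, str(calls).encode())
    finally:
        os._exit(0)  # never fall back into the parent's code path


def run_workload(n_readers=8, duration_s=1.0):
    """Return (total_reader_calls, fork_count, worst_fork_seconds)."""
    deadline = time.monotonic() + duration_s
    readers = []
    for _ in range(n_readers):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            os.close(read_fd)
            reader_loop(deadline, write_fd)  # does not return
        os.close(write_fd)
        readers.append((pid, read_fd))

    forks = 0
    worst = 0.0
    while True:
        start = time.monotonic()
        pid = os.fork()  # the path that needs tasklist_lock as a writer
        if pid == 0:
            os._exit(0)
        worst = max(worst, time.monotonic() - start)
        os.waitpid(pid, 0)
        forks += 1
        if time.monotonic() >= deadline:
            break

    total_calls = 0
    for pid, read_fd in readers:
        data = os.read(read_fd, 64)  # empty if the reader died early
        os.close(read_fd)
        os.waitpid(pid, 0)
        total_calls += int(data.decode() or "0")
    return total_calls, forks, worst


if __name__ == "__main__":
    print(run_workload(n_readers=8, duration_s=1.0))
```

The worst fork time is measured in the parent only, so it also includes
interpreter overhead; it is meant as a relative indicator when varying
the number of readers, not as a lock hold time.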

I think the --get N option creates N threads, and they make the following
system calls.
========================================================================
start N workers that call system calls that fetch data from the kernel,
currently these are: getpid, getppid, getcwd, getgid, getegid, getuid,
getgroups, getpgrp, getpgid, getpriority, getresgid, getresuid,
getrlimit, prlimit, getrusage, getsid, gettid, getcpu, gettimeofday,
uname, adjtimex, sysfs. Some of these system calls are OS specific.
========================================================================

Have you observed this type of issue with tasklist_lock?
Do we need write_lock_irq(&tasklist_lock) in the portion of code below?
Can I use write_lock instead of write_lock_irq in that portion of code?
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/tree/kernel/fork.c?h=linux-4.14.y#n1843

-Thanks,
Prasad

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
Linux Foundation Collaborative Project