Date: Thu, 22 Oct 2020 22:02:14 -0700
From: Sean Christopherson
To: Linus Torvalds
Cc: Daniel Díaz, Naresh Kamboju, Stephen Rothwell,
    "Matthew Wilcox (Oracle)", zenglg.jy@cn.fujitsu.com,
    "Peter Zijlstra (Intel)", Viresh Kumar, X86 ML, open list,
    lkft-triage@lists.linaro.org, "Eric W. Biederman", linux-mm,
    linux-m68k, Linux-Next Mailing List, Thomas Gleixner, kasan-dev,
    Dmitry Vyukov, Geert Uytterhoeven, Christian Brauner, Ingo Molnar,
    LTP List, Al Viro
Subject: Re: [LTP] mmstress[1309]: segfault at 7f3d71a36ee8 ip 00007f3d77132bdf sp 00007f3d71a36ee8 error 4 in libc-2.27.so[7f3d77058000+1aa000]
Message-ID: <20201023050214.GG23681@linux.intel.com>

On Thu, Oct 22, 2020 at 08:05:05PM -0700, Linus Torvalds wrote:
> On Thu, Oct 22, 2020 at 6:36 PM Daniel Díaz wrote:
> >
> > The kernel Naresh originally referred to is here:
> >    https://builds.tuxbuild.com/SCI7Xyjb7V2NbfQ2lbKBZw/
>
> Thanks.
>
> And when I started looking at it, I realized that my original idea
> ("just look for __put_user_nocheck_X calls, there aren't so many of
> those") was garbage, and that I was just being stupid.
>
> Yes, the commit that broke was about __put_user(), but in order to not
> duplicate all the code, it re-used the regular put_user()
> infrastructure, and so all the normal put_user() calls are potential
> problem spots too if this is about the compiler interaction with KASAN
> and the asm changes.
>
> So it's not just a couple of special cases to look at, it's all the
> normal cases too.
>
> Ok, back to the drawing board, but I think reverting it is probably
> the right thing to do if I can't think of something smart.
>
> That said, since you see this on x86-64, where the whole ugly trick with that
>
>        register asm("%"_ASM_AX)
>
> is unnecessary (because the 8-byte case is still just a single
> register, no %eax:%edx games needed), it would be interesting to hear
> if the attached patch fixes it.
> That would confirm that the problem
> really is due to some register allocation issue interaction (or,
> alternatively, it would tell me that there's something else going on).

I haven't reproduced the crash, but I did find a smoking gun that confirms the
"register shenanigans are evil shenanigans" theory.  I ran into a similar thing
recently where a seemingly innocuous line of code after loading a value into a
register variable wreaked havoc because it clobbered the input register.

This put_user() in schedule_tail():

	if (current->set_child_tid)
		put_user(task_pid_vnr(current), current->set_child_tid);

generates the following assembly with KASAN out-of-line:

   0xffffffff810dccc9 <+73>:	xor    %edx,%edx
   0xffffffff810dcccb <+75>:	xor    %esi,%esi
   0xffffffff810dcccd <+77>:	mov    %rbp,%rdi
   0xffffffff810dccd0 <+80>:	callq  0xffffffff810bf5e0 <__task_pid_nr_ns>
   0xffffffff810dccd5 <+85>:	mov    %r12,%rdi
   0xffffffff810dccd8 <+88>:	callq  0xffffffff81388c60 <__asan_load8>
   0xffffffff810dccdd <+93>:	mov    0x590(%rbp),%rcx
   0xffffffff810dcce4 <+100>:	callq  0xffffffff817708a0 <__put_user_4>
   0xffffffff810dcce9 <+105>:	pop    %rbx
   0xffffffff810dccea <+106>:	pop    %rbp
   0xffffffff810dcceb <+107>:	pop    %r12

__task_pid_nr_ns() returns the pid in %rax, which gets clobbered by
__asan_load8()'s check on current for the current->set_child_tid dereference.
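
For illustration only, here is a minimal user-space sketch of the ordering
that avoids this class of clobber.  It is not the kernel's put_user() code and
not the patch under discussion.  It assumes GCC or Clang on x86-64 (with a
plain C fallback elsewhere), and fake_pid(), load_slot() and store_pid() are
hypothetical stand-ins.  The idea is to evaluate every operand into an
ordinary local first, so that any call, explicit or compiler-inserted, happens
before the values are bound to fixed registers, and then to put nothing
between the register bindings and the asm.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for task_pid_vnr(current): a call whose result
 * comes back in %rax on x86-64. */
static long fake_pid(void)
{
	return 1309;
}

/* Hypothetical stand-in for the current->set_child_tid load.  With
 * out-of-line KASAN, such a load is preceded by a call to __asan_load8(),
 * which is free to clobber %rax. */
static long *load_slot(long **pp)
{
	return *pp;
}

/* Store fake_pid() through *pp and return the value that was stored. */
static long store_pid(long **pp)
{
	assert(pp != NULL && *pp != NULL);

	/* Step 1: evaluate both operands into ordinary locals.  All calls
	 * happen here, before any fixed register is involved. */
	long val = fake_pid();
	long *ptr = load_slot(pp);

#if defined(__GNUC__) && defined(__x86_64__)
	/* Step 2: bind the locals to fixed registers immediately before
	 * the asm, with no intervening code that could call out. */
	register long val_reg __asm__("rax") = val;
	register long *ptr_reg __asm__("rcx") = ptr;

	__asm__ __volatile__("movq %0, (%1)"
			     : /* no outputs */
			     : "r"(val_reg), "r"(ptr_reg)
			     : "memory");
#else
	/* Fallback for other compilers/architectures: plain store. */
	*ptr = val;
#endif
	return *ptr;
}
```

Calling store_pid() with the address of a pointer to a long leaves the value
returned by fake_pid() in that long.  The fragile variant would assign
fake_pid() directly to val_reg and only then call load_slot(), which leaves a
call between the register assignment and the asm.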