2022-08-01 15:23:21

by Helge Deller

[permalink] [raw]
Subject: [PATCH 2/3] lib/dump_stack: Add dump_stack_print_cmdline() and wire up in dump_stack_print_info()

Add the function dump_stack_print_cmdline() which can be used by arch
code to print the command line of the current processs. This function
is useful in arch code when dumping information for a faulting process.

Wire this function up in the dump_stack_print_info() function to include
the dumping of the command line for architectures which use
dump_stack_print_info().

As an example, with this patch a failing glibc testcase (which uses
ld.so.1 as starting program) up to now reported just "ld.so.1" failing:

do_page_fault() command='ld.so.1' type=15 address=0x565921d8 in libc.so[f7339000+1bb000]
trap #15: Data TLB miss fault, vm_start = 0x0001a000, vm_end = 0x0001b000

and now it reports in addition:

ld.so.1[1151] cmdline: /home/gnu/glibc/objdir/elf/ld.so.1 --library-path /home/gnu/glibc/objdir:/home/gnu/glibc/objdir/math:/home/gnu/
/home/gnu/glibc/objdir/malloc/tst-safe-linking-malloc-hugetlb1

Signed-off-by: Helge Deller <[email protected]>
---
include/linux/printk.h | 5 +++++
lib/dump_stack.c | 17 +++++++++++++++++
2 files changed, 22 insertions(+)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index cf7d666ab1f8..5290a32a197d 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -191,6 +191,7 @@ u32 log_buf_len_get(void);
void log_buf_vmcoreinfo_setup(void);
void __init setup_log_buf(int early);
__printf(1, 2) void dump_stack_set_arch_desc(const char *fmt, ...);
+void dump_stack_print_cmdline(const char *log_lvl);
void dump_stack_print_info(const char *log_lvl);
void show_regs_print_info(const char *log_lvl);
extern asmlinkage void dump_stack_lvl(const char *log_lvl) __cold;
@@ -262,6 +263,10 @@ static inline __printf(1, 2) void dump_stack_set_arch_desc(const char *fmt, ...)
{
}

+static inline void dump_stack_print_cmdline(const char *log_lvl)
+{
+}
+
static inline void dump_stack_print_info(const char *log_lvl)
{
}
diff --git a/lib/dump_stack.c b/lib/dump_stack.c
index 83471e81501a..c9b6a4c0c4e1 100644
--- a/lib/dump_stack.c
+++ b/lib/dump_stack.c
@@ -14,6 +14,7 @@
#include <linux/kexec.h>
#include <linux/utsname.h>
#include <linux/stop_machine.h>
+#include <linux/proc_fs.h>

static char dump_stack_arch_desc_str[128];

@@ -45,6 +46,20 @@ void __init dump_stack_set_arch_desc(const char *fmt, ...)
#define BUILD_ID_VAL ""
#endif

+/**
+ * dump_stack_print_cmdline - print the command line of current process
+ * @log_lvl: log level
+ */
+void dump_stack_print_cmdline(const char *log_lvl)
+{
+ char cmdline[256];
+
+ /* get command line */
+ get_task_cmdline_kernel(current, cmdline, sizeof(cmdline));
+ printk("%s%s[%d] cmdline: %s\n", log_lvl, current->comm,
+ current->pid, cmdline);
+}
+
/**
* dump_stack_print_info - print generic debug info for dump_stack()
* @log_lvl: log level
@@ -62,6 +77,8 @@ void dump_stack_print_info(const char *log_lvl)
(int)strcspn(init_utsname()->version, " "),
init_utsname()->version, BUILD_ID_VAL);

+ dump_stack_print_cmdline(log_lvl);
+
if (dump_stack_arch_desc_str[0] != '\0')
printk("%sHardware name: %s\n",
log_lvl, dump_stack_arch_desc_str);
--
2.37.1



2022-08-06 15:11:07

by kernel test robot

[permalink] [raw]
Subject: [lib/dump_stack] 9ac328fc55: BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/rwsem.c



Greeting,

FYI, we noticed the following commit (built with gcc-11):

commit: 9ac328fc556b6cead5bcdcb82737ac2713821ccd ("[PATCH 2/3] lib/dump_stack: Add dump_stack_print_cmdline() and wire up in dump_stack_print_info()")
url: https://github.com/intel-lab-lkp/linux/commits/Helge-Deller/Dump-command-line-of-faulting-process-to-syslog/20220801-232209
base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git 8f1d56f64f8d6b80dea2d1978d10071132a695c5
patch link: https://lore.kernel.org/linux-arch/[email protected]

in testcase: xfstests
version: xfstests-x86_64-c1144bf-1_20220725
with following parameters:

disk: 4HDD
fs: xfs
test: xfs-group-34
ucode: 0x21

test-description: xfstests is a regression test suite for xfs and other files ystems.
test-url: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git


on test machine: 4 threads 1 sockets Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz with 8G memory

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):



If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>


[ 54.342713][ T4368] BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1511
[ 54.342716][ T4368] in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 4368, name: ls
[ 54.342719][ T4368] preempt_count: 0, expected: 0
[ 54.342722][ T4368] CPU: 0 PID: 4368 Comm: ls Not tainted 5.18.0-01262-g9ac328fc556b #1
[ 54.342733][ T4368] ls[4368] cmdline: ls /fs/scratch/test/DATA
[ 54.342737][ T4368] Hardware name: Hewlett-Packard p6-1451cx/2ADA, BIOS 8.15 02/05/2013
[ 54.342740][ T4368] Call Trace:
[ 54.342742][ T4368] <TASK>
[ 54.342745][ T4368] dump_stack_lvl (lib/dump_stack.c:124 (discriminator 1))
[ 54.342753][ T4368] __might_resched.cold (kernel/sched/core.c:9755)
[ 54.342759][ T4368] down_read_killable (kernel/locking/rwsem.c:1511)
[ 54.342862][ T58] XFS (sdb4): First 128 bytes of corrupted metadata buffer:
[ 54.343029][ T4368] ? _raw_read_unlock_bh (kernel/locking/spinlock.c:137)
[ 54.343177][ T58] 00000000: 31 32 33 0a 00 00 00 00 00 00 00 00 00 00 00 00 123.............
[ 54.343267][ T4368] ? down_read (kernel/locking/rwsem.c:1510)
[ 54.343412][ T58] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[ 54.343520][ T4368] __access_remote_vm (include/linux/mmap_lock.h:126 mm/memory.c:5184)
[ 54.343678][ T58] 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[ 54.343730][ T4368] ? follow_phys (mm/memory.c:5179)
[ 54.343789][ T58] 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[ 54.343874][ T4368] ? _raw_spin_lock (arch/x86/include/asm/atomic.h:202 include/linux/atomic/atomic-instrumented.h:543 include/asm-generic/qspinlock.h:82 include/linux/spinlock.h:185 include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154)
[ 54.344167][ T58] 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[ 54.344261][ T4368] ? _raw_write_lock_irq (kernel/locking/spinlock.c:153)
[ 54.344388][ T58] 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[ 54.344485][ T4368] get_mm_cmdline (fs/proc/base.c:299)
[ 54.344638][ T58] 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[ 54.344722][ T4368] ? _raw_spin_lock (arch/x86/include/asm/atomic.h:202 include/linux/atomic/atomic-instrumented.h:543 include/asm-generic/qspinlock.h:82 include/linux/spinlock.h:185 include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154)
[ 54.344728][ T4368] ? timerslack_ns_write (fs/proc/base.c:255)
[ 54.344734][ T4368] ? console_trylock (kernel/printk/printk.c:2661)
[ 54.344891][ T58] 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[ 54.344984][ T4368] ? llist_add_batch (lib/llist.c:33)
[ 54.346587][ T4368] get_task_cmdline_kernel (fs/proc/base.c:347 fs/proc/base.c:361)
[ 54.346692][ T4368] dump_stack_print_cmdline (lib/dump_stack.c:59)
[ 54.346793][ T4368] ? _atomic_dec_and_lock_irqsave (lib/dump_stack.c:54)
[ 54.346908][ T4368] dump_stack_print_info (lib/dump_stack.c:82 (discriminator 4))
[ 54.347005][ T4368] dump_stack_lvl (lib/dump_stack.c:105 (discriminator 1) lib/dump_stack.c:123 (discriminator 1))
[ 54.347093][ T4368] xfs_corruption_error (fs/xfs/xfs_error.c:340 fs/xfs/xfs_error.c:325 fs/xfs/xfs_error.c:355) xfs
[ 54.347342][ T4368] ? xfs_dir2_isblock (fs/xfs/libxfs/xfs_dir2.c:613) xfs
[ 54.347570][ T4368] ? __alloc_pages_slowpath+0x1540/0x1540
[ 54.347694][ T4368] xfs_dir2_isblock (fs/xfs/libxfs/xfs_dir2.c:616) xfs
[ 54.347911][ T4368] ? xfs_dir2_isblock (fs/xfs/libxfs/xfs_dir2.c:613) xfs
[ 54.348133][ T4368] ? xfs_dir2_grow_inode (fs/xfs/libxfs/xfs_dir2.c:606) xfs
[ 54.348363][ T4368] xfs_readdir (fs/xfs/xfs_dir2_readdir.c:536) xfs
[ 54.348576][ T4368] ? xfs_dir3_get_dtype (fs/xfs/xfs_dir2_readdir.c:512) xfs
[ 54.348808][ T4368] ? copy_page_range (mm/memory.c:4678)
[ 54.349227][ T4368] ? fsnotify_perm+0x13b/0x4c0
[ 54.349329][ T4368] iterate_dir (fs/readdir.c:65)
[ 54.349417][ T4368] __x64_sys_getdents64 (fs/readdir.c:370 fs/readdir.c:354 fs/readdir.c:354)
[ 54.349517][ T4368] ? filldir (fs/readdir.c:354)
[ 54.349597][ T4368] ? handle_mm_fault (mm/memory.c:4861)
[ 54.349690][ T4368] ? __ia32_sys_getdents (fs/readdir.c:312)
[ 54.349787][ T4368] ? do_user_addr_fault (arch/x86/mm/fault.c:1422)
[ 54.349887][ T4368] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
[ 54.349972][ T4368] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:115)
[ 54.350085][ T4368] RIP: 0033:0x7f3ce13439c7
[ 54.350170][ T4368] Code: 0f 1f 00 48 8b 47 20 c3 66 2e 0f 1f 84 00 00 00 00 00 90 48 81 fa ff ff ff 7f b8 ff ff ff 7f 48 0f 47 d0 b8 d9 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 99 74 0f 00 f7 d8 64 89 02 48
All code
========
0: 0f 1f 00 nopl (%rax)
3: 48 8b 47 20 mov 0x20(%rdi),%rax
7: c3 retq
8: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
f: 00 00 00
12: 90 nop
13: 48 81 fa ff ff ff 7f cmp $0x7fffffff,%rdx
1a: b8 ff ff ff 7f mov $0x7fffffff,%eax
1f: 48 0f 47 d0 cmova %rax,%rdx
23: b8 d9 00 00 00 mov $0xd9,%eax
28: 0f 05 syscall
2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
30: 77 01 ja 0x33
32: c3 retq
33: 48 8b 15 99 74 0f 00 mov 0xf7499(%rip),%rdx # 0xf74d3
3a: f7 d8 neg %eax
3c: 64 89 02 mov %eax,%fs:(%rdx)
3f: 48 rex.W

Code starting with the faulting instruction
===========================================
0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
6: 77 01 ja 0x9
8: c3 retq
9: 48 8b 15 99 74 0f 00 mov 0xf7499(%rip),%rdx # 0xf74a9
10: f7 d8 neg %eax
12: 64 89 02 mov %eax,%fs:(%rdx)
15: 48 rex.W
[ 54.350503][ T4368] RSP: 002b:00007ffec02690d8 EFLAGS: 00000293 ORIG_RAX: 00000000000000d9
[ 54.350655][ T4368] RAX: ffffffffffffffda RBX: 0000560265b3c930 RCX: 00007f3ce13439c7
[ 54.350803][ T4368] RDX: 0000000000008000 RSI: 0000560265b3c960 RDI: 0000000000000004
[ 54.350947][ T4368] RBP: 0000560265b3c960 R08: 0000000000000030 R09: 00007f3ce143bbe0
[ 54.351086][ T4368] R10: fffffffffffff28c R11: 0000000000000293 R12: fffffffffffffe98
[ 54.351226][ T4368] R13: 0000560265b3c934 R14: 0000000000000000 R15: 0000560264bca888
[ 54.351366][ T4368] </TASK>
[ 54.351679][ T4368] ls[4368] cmdline: ls /fs/scratch/test/DATA
[ 54.351804][ T4368] Hardware name: Hewlett-Packard p6-1451cx/2ADA, BIOS 8.15 02/05/2013
[ 54.351964][ T4368] Call Trace:
[ 54.352041][ T4368] <TASK>
[ 54.352100][ T4368] dump_stack_lvl (lib/dump_stack.c:124 (discriminator 1))
[ 54.352196][ T4368] xfs_corruption_error (fs/xfs/xfs_error.c:340 fs/xfs/xfs_error.c:325 fs/xfs/xfs_error.c:355) xfs
[ 54.352456][ T4368] ? xfs_dir2_isblock (fs/xfs/libxfs/xfs_dir2.c:613) xfs
[ 54.352724][ T4368] ? __alloc_pages_slowpath+0x1540/0x1540
[ 54.352855][ T4368] xfs_dir2_isblock (fs/xfs/libxfs/xfs_dir2.c:616) xfs
[ 54.353083][ T4368] ? xfs_dir2_isblock (fs/xfs/libxfs/xfs_dir2.c:613) xfs
[ 54.353350][ T4368] ? xfs_dir2_grow_inode (fs/xfs/libxfs/xfs_dir2.c:606) xfs
[ 54.353577][ T4368] xfs_readdir (fs/xfs/xfs_dir2_readdir.c:536) xfs
[ 54.353837][ T4368] ? xfs_dir3_get_dtype (fs/xfs/xfs_dir2_readdir.c:512) xfs
[ 54.354069][ T4368] ? copy_page_range (mm/memory.c:4678)
[ 54.354164][ T4368] ? fsnotify_perm+0x13b/0x4c0
[ 54.354267][ T4368] iterate_dir (fs/readdir.c:65)
[ 54.354350][ T4368] __x64_sys_getdents64 (fs/readdir.c:370 fs/readdir.c:354 fs/readdir.c:354)
[ 54.354449][ T4368] ? filldir (fs/readdir.c:354)
[ 54.354530][ T4368] ? handle_mm_fault (mm/memory.c:4861)
[ 54.354621][ T4368] ? __ia32_sys_getdents (fs/readdir.c:312)
[ 54.354735][ T4368] ? do_user_addr_fault (arch/x86/mm/fault.c:1422)
[ 54.354832][ T4368] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
[ 54.354918][ T4368] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:115)
[ 54.355041][ T4368] RIP: 0033:0x7f3ce13439c7
[ 54.355126][ T4368] Code: 0f 1f 00 48 8b 47 20 c3 66 2e 0f 1f 84 00 00 00 00 00 90 48 81 fa ff ff ff 7f b8 ff ff ff 7f 48 0f 47 d0 b8 d9 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 99 74 0f 00 f7 d8 64 89 02 48
All code
========
0: 0f 1f 00 nopl (%rax)
3: 48 8b 47 20 mov 0x20(%rdi),%rax
7: c3 retq
8: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
f: 00 00 00
12: 90 nop
13: 48 81 fa ff ff ff 7f cmp $0x7fffffff,%rdx
1a: b8 ff ff ff 7f mov $0x7fffffff,%eax
1f: 48 0f 47 d0 cmova %rax,%rdx
23: b8 d9 00 00 00 mov $0xd9,%eax
28: 0f 05 syscall
2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
30: 77 01 ja 0x33
32: c3 retq
33: 48 8b 15 99 74 0f 00 mov 0xf7499(%rip),%rdx # 0xf74d3
3a: f7 d8 neg %eax
3c: 64 89 02 mov %eax,%fs:(%rdx)
3f: 48 rex.W

Code starting with the faulting instruction
===========================================
0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
6: 77 01 ja 0x9
8: c3 retq
9: 48 8b 15 99 74 0f 00 mov 0xf7499(%rip),%rdx # 0xf74a9
10: f7 d8 neg %eax
12: 64 89 02 mov %eax,%fs:(%rdx)
15: 48 rex.W


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file

# if come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.



--
0-DAY CI Kernel Test Service
https://01.org/lkp



Attachments:
(No filename) (11.30 kB)
config-5.18.0-01262-g9ac328fc556b (169.08 kB)
job-script (5.98 kB)
dmesg.xz (13.78 kB)
xfstests (1.69 kB)
job.yaml (4.80 kB)
reproduce (929.00 B)
Download all attachments