2021-12-29 15:53:06

by Daniel Wagner

Subject: [PATCH] nvmet: add support reading with offset from ANA log

Add support for reading the ANA log page at a caller-supplied offset.

The controller claims to support extended data for the Get Log Page
command (including the extended Number of Dwords and Log Page Offset
fields):

lpa : 0x7
[2:2] : 0x1 Extended data for Get Log Page Supported
[1:1] : 0x1 Command Effects Log Page Supported
[0:0] : 0x1 SMART/Health Log Page per NS Supported

Signed-off-by: Daniel Wagner <[email protected]>
---
drivers/nvme/target/admin-cmd.c | 37 +++++++++++++++++++--------------
1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index 6fb24746de06..7c8806f477e2 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -263,35 +263,40 @@ static u32 nvmet_format_ana_group(struct nvmet_req *req, u32 grpid,
desc->nnsids = cpu_to_le32(count);
desc->chgcnt = cpu_to_le64(nvmet_ana_chgcnt);
desc->state = req->port->ana_state[grpid];
- memset(desc->rsvd17, 0, sizeof(desc->rsvd17));
return struct_size(desc, nsids, count);
}

static void nvmet_execute_get_log_page_ana(struct nvmet_req *req)
{
- struct nvme_ana_rsp_hdr hdr = { 0, };
+ struct nvme_ana_rsp_hdr *hdr;
struct nvme_ana_group_desc *desc;
- size_t offset = sizeof(struct nvme_ana_rsp_hdr); /* start beyond hdr */
+ u64 offset = nvmet_get_log_page_offset(req->cmd);
size_t len;
+ void *buffer;
u32 grpid;
u16 ngrps = 0;
u16 status;

+ if (offset & 0x3) {
+ req->error_loc =
+ offsetof(struct nvme_get_log_page_command, lpo);
+ status = NVME_SC_INVALID_FIELD | NVME_SC_DNR;
+ goto out;
+ }
+
status = NVME_SC_INTERNAL;
- desc = kmalloc(struct_size(desc, nsids, NVMET_MAX_NAMESPACES),
- GFP_KERNEL);
- if (!desc)
+ len = sizeof(*hdr) + struct_size(desc, nsids, NVMET_MAX_NAMESPACES);
+ buffer = kzalloc(len, GFP_KERNEL);
+ if (!buffer)
goto out;
+ hdr = buffer;
+ desc = buffer + sizeof(*hdr);

down_read(&nvmet_ana_sem);
for (grpid = 1; grpid <= NVMET_MAX_ANAGRPS; grpid++) {
if (!nvmet_ana_group_enabled[grpid])
continue;
- len = nvmet_format_ana_group(req, grpid, desc);
- status = nvmet_copy_to_sgl(req, offset, desc, len);
- if (status)
- break;
- offset += len;
+ nvmet_format_ana_group(req, grpid, desc);
ngrps++;
}
for ( ; grpid <= NVMET_MAX_ANAGRPS; grpid++) {
@@ -299,15 +304,15 @@ static void nvmet_execute_get_log_page_ana(struct nvmet_req *req)
ngrps++;
}

- hdr.chgcnt = cpu_to_le64(nvmet_ana_chgcnt);
- hdr.ngrps = cpu_to_le16(ngrps);
+ hdr->chgcnt = cpu_to_le64(nvmet_ana_chgcnt);
+ hdr->ngrps = cpu_to_le16(ngrps);
nvmet_clear_aen_bit(req, NVME_AEN_BIT_ANA_CHANGE);
up_read(&nvmet_ana_sem);

- kfree(desc);
+ status = nvmet_copy_to_sgl(req, 0, buffer + offset,
+ nvmet_get_log_page_len(req->cmd));

- /* copy the header last once we know the number of groups */
- status = nvmet_copy_to_sgl(req, 0, &hdr, sizeof(hdr));
+ kfree(buffer);
out:
nvmet_req_complete(req, status);
}
--
2.29.2



2022-01-05 14:40:03

by kernel test robot

Subject: [nvmet] d4f2899b84: BUG:KASAN:slab-out-of-bounds_in_sg_copy_buffer



Greetings,

FYI, we noticed the following commit (built with gcc-9):

commit: d4f2899b84cf5018484603ebc0e6edf6a1689b60 ("[PATCH] nvmet: add support reading with offset from ANA log")
url: https://github.com/0day-ci/linux/commits/Daniel-Wagner/nvmet-add-support-reading-with-offset-from-ANA-log/20211229-235321
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 136057256686de39cc3a07c2e39ef6bc43003ff6
patch link: https://lore.kernel.org/linux-nvme/[email protected]

in testcase: blktests
version: blktests-x86_64-f51ee53-1_20211226
with following parameters:

disk: 1SSD
test: nvme-group-02
ucode: 0x11



on test machine: 288 threads 2 sockets Intel(R) Xeon Phi(TM) CPU 7295 @ 1.50GHz with 80G memory

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):



If you fix the issue, kindly add the following tag
Reported-by: kernel test robot <[email protected]>


[ 163.438288][ T2044] BUG: KASAN: slab-out-of-bounds in sg_copy_buffer (lib/scatterlist.c:975)
[ 163.458459][ T2044] Read of size 4096 at addr ffff88810bfa1000 by task kworker/170:2/2044
[ 163.479251][ T2044]
[ 163.492289][ T2044] CPU: 170 PID: 2044 Comm: kworker/170:2 Not tainted 5.16.0-rc2-00001-gd4f2899b84cf #1
[ 163.514356][ T2044] Hardware name: Intel Corp. GROVEPORT/GROVEPORT, BIOS GVPRCRB1.86B.0018.D06.1710190403 10/19/2017
[ 163.537278][ T2044] Workqueue: events nvme_loop_execute_work [nvme_loop]
[ 163.556545][ T2044] Call Trace:
[ 163.572253][ T2044] <TASK>
[ 163.587253][ T2044] dump_stack_lvl (lib/dump_stack.c:107)
[ 163.603290][ T2044] print_address_description+0x21/0x140
[ 163.622941][ T2044] ? sg_copy_buffer (lib/scatterlist.c:975)
[ 163.640252][ T2044] kasan_report.cold (mm/kasan/report.c:434 mm/kasan/report.c:450)
[ 163.657346][ T2044] ? sg_copy_buffer (lib/scatterlist.c:975)
[ 163.674324][ T2044] kasan_check_range (mm/kasan/generic.c:190)
[ 163.691361][ T2044] memcpy (mm/kasan/shadow.c:65)
[ 163.707892][ T2044] sg_copy_buffer (lib/scatterlist.c:975)
[ 163.724283][ T2044] ? sg_miter_next (lib/scatterlist.c:950)
[ 163.740510][ T2044] nvmet_copy_to_sgl (drivers/nvme/target/core.c:92) nvmet
[ 163.758527][ T2044] ? up_read (arch/x86/include/asm/atomic64_64.h:160 include/linux/atomic/atomic-long.h:71 include/linux/atomic/atomic-instrumented.h:1233 kernel/locking/rwsem.c:1306 kernel/locking/rwsem.c:1570)
[ 163.774551][ T2044] nvmet_execute_get_log_page (drivers/nvme/target/admin-cmd.c:315 drivers/nvme/target/admin-cmd.c:342 drivers/nvme/target/admin-cmd.c:320) nvmet
[ 163.793281][ T2044] ? nvmet_execute_identify_ctrl (drivers/nvme/target/admin-cmd.c:321) nvmet
[ 163.812398][ T2044] ? __schedule (kernel/sched/core.c:6132)
[ 163.829776][ T2044] process_one_work (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:212 include/trace/events/workqueue.h:108 kernel/workqueue.c:2303)
[ 163.848284][ T2044] worker_thread (include/linux/list.h:284 kernel/workqueue.c:2446)
[ 163.864441][ T2044] ? __kthread_parkme (arch/x86/include/asm/bitops.h:207 (discriminator 4) include/asm-generic/bitops/instrumented-non-atomic.h:135 (discriminator 4) kernel/kthread.c:249 (discriminator 4))
[ 163.881270][ T2044] ? schedule (arch/x86/include/asm/bitops.h:207 (discriminator 1) include/asm-generic/bitops/instrumented-non-atomic.h:135 (discriminator 1) include/linux/thread_info.h:118 (discriminator 1) include/linux/sched.h:2120 (discriminator 1) kernel/sched/core.c:6328 (discriminator 1))
[ 163.898263][ T2044] ? process_one_work (kernel/workqueue.c:2388)
[ 163.916262][ T2044] ? process_one_work (kernel/workqueue.c:2388)
[ 163.933339][ T2044] kthread (kernel/kthread.c:327)
[ 163.948352][ T2044] ? set_kthread_struct (kernel/kthread.c:272)
[ 163.966274][ T2044] ret_from_fork (arch/x86/entry/entry_64.S:301)
[ 163.983276][ T2044] </TASK>
[ 163.997384][ T2044]
[ 164.011328][ T2044] Allocated by task 2044:
[ 164.027406][ T2044] kasan_save_stack (mm/kasan/common.c:38)
[ 164.042515][ T2044] __kasan_kmalloc (mm/kasan/common.c:46 mm/kasan/common.c:434 mm/kasan/common.c:513 mm/kasan/common.c:522)
[ 164.058288][ T2044] nvmet_execute_get_log_page (include/linux/slab.h:590 include/linux/slab.h:724 drivers/nvme/target/admin-cmd.c:289 drivers/nvme/target/admin-cmd.c:342 drivers/nvme/target/admin-cmd.c:320) nvmet
[ 164.077273][ T2044] process_one_work (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:212 include/trace/events/workqueue.h:108 kernel/workqueue.c:2303)
[ 164.093373][ T2044] worker_thread (include/linux/list.h:284 kernel/workqueue.c:2446)
[ 164.107943][ T2044] kthread (kernel/kthread.c:327)
[ 164.123255][ T2044] ret_from_fork (arch/x86/entry/entry_64.S:301)
[ 164.138263][ T2044]
[ 164.151266][ T2044] The buggy address belongs to the object at ffff88810bfa0000
[ 164.151266][ T2044] which belongs to the cache kmalloc-8k of size 8192
[ 164.188242][ T2044] The buggy address is located 4096 bytes inside of
[ 164.188242][ T2044] 8192-byte region [ffff88810bfa0000, ffff88810bfa2000)
[ 164.223069][ T2044] The buggy address belongs to the page:
[ 164.239419][ T2044] page:00000000a016574e refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10bfa0
[ 164.261522][ T2044] head:00000000a016574e order:3 compound_mapcount:0 compound_pincount:0
[ 164.281451][ T2044] flags: 0x17ffffc0010200(slab|head|node=0|zone=2|lastcpupid=0x1fffff)
[ 164.300485][ T2044] raw: 0017ffffc0010200 0000000000000000 dead000000000122 ffff88810004d180
[ 164.320284][ T2044] raw: 0000000000000000 0000000080020002 00000001ffffffff 0000000000000000
[ 164.339289][ T2044] page dumped because: kasan: bad access detected
[ 164.356507][ T2044]
[ 164.367914][ T2044] Memory state around the buggy address:
[ 164.384344][ T2044] ffff88810bfa0f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 164.403300][ T2044] ffff88810bfa0f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 164.424275][ T2044] >ffff88810bfa1000: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc
[ 164.443391][ T2044] ^
[ 164.458281][ T2044] ffff88810bfa1080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 164.477088][ T2044] ffff88810bfa1100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 164.496753][ T2044] ==================================================================
[ 164.516436][ T2044] Disabling lock debugging due to kernel taint
[ 164.535019][ T3687] nvme nvme2: creating 128 I/O queues.
[ 164.750268][ T3687] nvme nvme2: new ctrl: "blktests-subsystem-1"
[ 166.125046][ T3799] nvme nvme2: Removing ctrl: NQN "blktests-subsystem-1"
[ 167.610366][ T2735] RESULT_ROOT=/result/blktests/1SSD-nvme-group-02-ucode=0x11/lkp-knm01/debian-10.4-x86_64-20200603.cgz/x86_64-rhel-8.3-func/gcc-9/d4f2899b84cf5018484603ebc0e6edf6a1689b60/5
[ 167.610429][ T2735]
[ 168.183331][ T2735] job=/lkp/jobs/scheduled/lkp-knm01/blktests-1SSD-nvme-group-02-ucode=0x11-debian-10.4-x86_64-20200603.cgz-d4f2899b84cf5018484603ebc0e6edf6a1689b60-20220101-63377-meduul-2.yaml
[ 168.183394][ T2735]
[ 168.640158][ T3834] run blktests nvme/005 at 2022-01-01 08:29:07
[ 168.946383][ T3860] loop0: detected capacity change from 0 to 2097152
[ 169.043044][ T3864] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[ 169.142269][ T1705] nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:4fa5f0a8-d9a3-4154-971b-e4f2612165cc.
[ 169.178250][ T3867] nvme nvme2: creating 128 I/O queues.
[ 169.334821][ T3867] nvme nvme2: new ctrl: "blktests-subsystem-1"
[ 172.964977][ T1737] nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:4fa5f0a8-d9a3-4154-971b-e4f2612165cc.
[ 173.000623][ T1505] nvme nvme2: creating 128 I/O queues.
[ 173.097557][ T3904] nvme nvme2: Removing ctrl: NQN "blktests-subsystem-1"
[ 173.632045][ T3677] block nvme2n1: no available path - failing I/O
[ 173.650413][ T3677] block nvme2n1: no available path - failing I/O
[ 173.670779][ T3677] Buffer I/O error on dev nvme2n1, logical block 262128, async page read
[ 180.758798][ T2735] result_service: raw_upload, RESULT_MNT: /internal-lkp-server/result, RESULT_ROOT: /internal-lkp-server/result/blktests/1SSD-nvme-group-02-ucode=0x11/lkp-knm01/debian-10.4-x86_64-20200603.cgz/x86_64-rhel-8.3-func/gcc-9/d4f2899b84cf5018484603ebc0e6edf6a1689b60/5, TMP_RESULT_ROOT: /tmp/lkp/result
[ 180.758860][ T2735]
[ 180.857259][ T2735] run-job /lkp/jobs/scheduled/lkp-knm01/blktests-1SSD-nvme-group-02-ucode=0x11-debian-10.4-x86_64-20200603.cgz-d4f2899b84cf5018484603ebc0e6edf6a1689b60-20220101-63377-meduul-2.yaml
[ 180.857322][ T2735]
[ 184.829351][ T2735] /usr/bin/wget -q --timeout=1800 --tries=1 --local-encoding=UTF-8 http://internal-lkp-server:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/lkp-knm01/blktests-1SSD-nvme-group-02-ucode=0x11-debian-10.4-x86_64-20200603.cgz-d4f2899b84cf5018484603ebc0e6edf6a1689b60-20220101-63377-meduul-2.yaml&job_state=running -O /dev/null
[ 184.829417][ T2735]
[ 184.909503][ T2735] target ucode: 0x11
[ 184.909644][ T2735]
[ 184.954751][ T2735] current_version: 11, target_version: 11
[ 184.954818][ T2735]
[ 184.998255][ T2735] IPMI Device Information
[ 184.998330][ T2735]
[ 185.046732][ T2735] BMC ARP Control : ARP Responses Enabled, Gratuitous ARP Disabled
[ 185.046798][ T2735]
[ 185.100309][ T2735] 2022-01-01 08:29:01 sed "s:^:nvme/:" /lkp/benchmarks/blktests/tests/nvme-group-02
[ 185.100375][ T2735]
[ 185.154281][ T2735] 2022-01-01 08:29:01 ./check -o /mnt/nvme-group-02 nvme/004 nvme/005
[ 185.154348][ T2735]
[ 185.206741][ T2735] nvme/004 (test nvme and nvmet UUID NS descriptors)


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file

# if come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.



---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation

Thanks,
Oliver Sang


Attachments:
(No filename) (10.11 kB)
config-5.16.0-rc2-00001-gd4f2899b84cf (173.57 kB)
job-script (5.84 kB)
dmesg.xz (27.05 kB)
blktests (1.39 kB)
job.yaml (4.88 kB)
reproduce (110.00 B)