2018-09-11 07:14:54

by Randy Dunlap

Subject: 4.19-rc[23] iwlwifi: BUG in swiotlb

Hi,

Any ideas?

This is on a common (older) Toshiba Portege laptop.



2018-09-10T18:47:54.532836-07:00 dragon kernel: [ 31.471708] ------------[ cut here ]------------
2018-09-10T18:47:54.532837-07:00 dragon kernel: [ 31.472371] kernel BUG at ../kernel/dma/swiotlb.c:521!
2018-09-10T18:47:54.532838-07:00 dragon kernel: [ 31.473057] invalid opcode: 0000 [#1] PREEMPT SMP PTI
2018-09-10T18:47:54.613627-07:00 dragon kernel: [ 31.473734] CPU: 2 PID: 893 Comm: NetworkManager Not tainted 4.19.0-rc3rdd #1
2018-09-10T18:47:54.613640-07:00 dragon kernel: [ 31.473735] Hardware name: TOSHIBA PORTEGE R835/Portable PC, BIOS Version 4.10 01/08/2013
2018-09-10T18:47:54.613641-07:00 dragon kernel: [ 31.473740] RIP: 0010:swiotlb_tbl_map_single+0x296/0x2c0
2018-09-10T18:47:54.613643-07:00 dragon kernel: [ 31.473743] Code: fe ff ff 83 7d a0 01 0f 87 e2 fe ff ff 48 8b 35 e0 e0 df 00 48 8b 55 d0 49 8d 3c 36 48 03 75 b0 e8 df 5f 65 00 e9 c5 fe ff ff <0f> 0b 48 8b 55 d0 48 8b 7d c8 48 c7 c6 c8 74 e4 ab e8 f4 37 40 00
2018-09-10T18:47:54.613645-07:00 dragon kernel: [ 31.473744] RSP: 0018:ffffb42480cab0f0 EFLAGS: 00010246
2018-09-10T18:47:54.613646-07:00 dragon kernel: [ 31.473747] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
2018-09-10T18:47:54.613647-07:00 dragon kernel: [ 31.473749] RDX: 0000000000000000 RSI: 00000000a65d0000 RDI: ffff97e38a5a6890
2018-09-10T18:47:54.613648-07:00 dragon kernel: [ 31.473750] RBP: ffffb42480cab150 R08: 0000000000000000 R09: 0000000000000000
2018-09-10T18:47:54.613649-07:00 dragon kernel: [ 31.473752] R10: 0000000000000002 R11: 0000000000000000 R12: ffff97e38a5a6890
2018-09-10T18:47:54.613650-07:00 dragon kernel: [ 31.473753] R13: 000000000014cba0 R14: 0000000000000001 R15: 0000000000200000
2018-09-10T18:47:54.613651-07:00 dragon kernel: [ 31.473756] FS: 00007f9eaafe2980(0000) GS:ffff97e38ae00000(0000) knlGS:0000000000000000
2018-09-10T18:47:54.613652-07:00 dragon kernel: [ 31.473759] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2018-09-10T18:47:54.613653-07:00 dragon kernel: [ 31.488350] CR2: 00005648e6cb5138 CR3: 0000000136ec6005 CR4: 00000000000606e0
2018-09-10T18:47:54.613654-07:00 dragon kernel: [ 31.488353] Call Trace:
2018-09-10T18:47:54.613655-07:00 dragon kernel: [ 31.490325] swiotlb_alloc+0x88/0x170
2018-09-10T18:47:54.613656-07:00 dragon kernel: [ 31.490329] ? __kmalloc+0x1cc/0x200
2018-09-10T18:47:54.613657-07:00 dragon kernel: [ 31.491652] iwl_pcie_txq_alloc+0x1d4/0x3b0 [iwlwifi]
2018-09-10T18:47:54.613658-07:00 dragon kernel: [ 31.491656] ? __kmalloc+0x1ae/0x200
2018-09-10T18:47:54.613659-07:00 dragon kernel: [ 31.491663] iwl_pcie_tx_init+0x338/0x3a0 [iwlwifi]
2018-09-10T18:47:54.613660-07:00 dragon kernel: [ 31.491671] iwl_trans_pcie_start_fw+0x252/0x580 [iwlwifi]
2018-09-10T18:47:54.613660-07:00 dragon kernel: [ 31.491681] iwl_load_ucode_wait_alive+0xd6/0x1c0 [iwldvm]
2018-09-10T18:47:54.613661-07:00 dragon kernel: [ 31.494968] ? iwl_alloc_all+0x30/0x30 [iwldvm]
2018-09-10T18:47:54.613662-07:00 dragon kernel: [ 31.494975] iwl_run_init_ucode+0x85/0x120 [iwldvm]
2018-09-10T18:47:54.613663-07:00 dragon kernel: [ 31.496293] ? iwl_run_init_ucode+0x85/0x120 [iwldvm]
2018-09-10T18:47:54.613664-07:00 dragon kernel: [ 31.496298] ? iwl_send_calib_cfg+0xb0/0xb0 [iwldvm]
2018-09-10T18:47:54.613664-07:00 dragon kernel: [ 31.497620] iwlagn_mac_start+0x11e/0x220 [iwldvm]
2018-09-10T18:47:54.613665-07:00 dragon kernel: [ 31.497625] ? iwlagn_mac_start+0x11e/0x220 [iwldvm]
2018-09-10T18:47:54.613666-07:00 dragon kernel: [ 31.498957] drv_start+0x44/0x60 [mac80211]
2018-09-10T18:47:54.613667-07:00 dragon kernel: [ 31.498970] ieee80211_do_open+0x31b/0x850 [mac80211]
2018-09-10T18:47:54.613668-07:00 dragon kernel: [ 31.498974] ? mutex_unlock+0xd/0x10
2018-09-10T18:47:54.613668-07:00 dragon kernel: [ 31.500960] ieee80211_open+0x4d/0x50 [mac80211]
2018-09-10T18:47:54.613669-07:00 dragon kernel: [ 31.500964] __dev_open+0xb7/0x150
2018-09-10T18:47:54.613670-07:00 dragon kernel: [ 31.502280] __dev_change_flags+0x15b/0x1a0
2018-09-10T18:47:54.613670-07:00 dragon kernel: [ 31.502284] dev_change_flags+0x24/0x60
2018-09-10T18:47:54.613671-07:00 dragon kernel: [ 31.503602] do_setlink+0x30e/0xed0
2018-09-10T18:47:54.613672-07:00 dragon kernel: [ 31.503607] ? sched_clock+0x9/0x10
2018-09-10T18:47:54.613673-07:00 dragon kernel: [ 31.504923] ? sched_clock+0x9/0x10
2018-09-10T18:47:54.613674-07:00 dragon kernel: [ 31.504927] ? sched_clock_cpu+0x11/0xd0
2018-09-10T18:47:54.613674-07:00 dragon kernel: [ 31.506253] ? sched_clock+0x9/0x10
2018-09-10T18:47:54.613675-07:00 dragon kernel: [ 31.506913] ? nla_parse+0x35/0x110
2018-09-10T18:47:54.613676-07:00 dragon kernel: [ 31.507572] rtnl_newlink+0x51b/0x8d0
2018-09-10T18:47:54.613677-07:00 dragon kernel: [ 31.508231] ? unwind_get_return_address+0x1a/0x30
2018-09-10T18:47:54.613678-07:00 dragon kernel: [ 31.508891] ? sched_clock+0x9/0x10
2018-09-10T18:47:54.613678-07:00 dragon kernel: [ 31.509547] ? sched_clock+0x9/0x10
2018-09-10T18:47:54.613679-07:00 dragon kernel: [ 31.510203] ? sched_clock_cpu+0x11/0xd0
2018-09-10T18:47:54.613680-07:00 dragon kernel: [ 31.510861] ? __lock_acquire.isra.32+0x16e/0x870
2018-09-10T18:47:54.613680-07:00 dragon kernel: [ 31.511521] ? sched_clock+0x9/0x10
2018-09-10T18:47:54.613681-07:00 dragon kernel: [ 31.512179] ? sched_clock+0x9/0x10
2018-09-10T18:47:54.613682-07:00 dragon kernel: [ 31.512834] ? sched_clock+0x9/0x10
2018-09-10T18:47:54.613683-07:00 dragon kernel: [ 31.513491] ? sched_clock_cpu+0x11/0xd0
2018-09-10T18:47:54.613684-07:00 dragon kernel: [ 31.514149] ? sched_clock+0x9/0x10
2018-09-10T18:47:54.613684-07:00 dragon kernel: [ 31.514806] ? sched_clock+0x9/0x10
2018-09-10T18:47:54.613685-07:00 dragon kernel: [ 31.515463] ? sched_clock+0x9/0x10
2018-09-10T18:47:54.613686-07:00 dragon kernel: [ 31.516120] ? sched_clock_cpu+0x11/0xd0
2018-09-10T18:47:54.613687-07:00 dragon kernel: [ 31.516778] ? sched_clock+0x9/0x10
2018-09-10T18:47:54.613687-07:00 dragon kernel: [ 31.517434] ? sched_clock+0x9/0x10
2018-09-10T18:47:54.613688-07:00 dragon kernel: [ 31.518091] ? sched_clock_cpu+0x11/0xd0
2018-09-10T18:47:54.613689-07:00 dragon kernel: [ 31.518751] ? sched_clock+0x9/0x10
2018-09-10T18:47:54.613690-07:00 dragon kernel: [ 31.519408] ? sched_clock+0x9/0x10
2018-09-10T18:47:54.613690-07:00 dragon kernel: [ 31.520064] ? sched_clock+0x9/0x10
2018-09-10T18:47:54.613691-07:00 dragon kernel: [ 31.520721] ? sched_clock_cpu+0x11/0xd0
2018-09-10T18:47:54.613692-07:00 dragon kernel: [ 31.521390] rtnetlink_rcv_msg+0x170/0x3e0
2018-09-10T18:47:54.613692-07:00 dragon kernel: [ 31.522049] ? rtnl_dellink+0x2a0/0x2a0
2018-09-10T18:47:54.613693-07:00 dragon kernel: [ 31.522709] netlink_rcv_skb+0x4c/0x120
2018-09-10T18:47:54.613694-07:00 dragon kernel: [ 31.523367] rtnetlink_rcv+0x10/0x20
2018-09-10T18:47:54.613695-07:00 dragon kernel: [ 31.524023] netlink_unicast+0x169/0x1f0
2018-09-10T18:47:54.613695-07:00 dragon kernel: [ 31.524755] netlink_sendmsg+0x287/0x380
2018-09-10T18:47:54.613696-07:00 dragon kernel: [ 31.525415] ? netlink_unicast+0x1f0/0x1f0
2018-09-10T18:47:54.613697-07:00 dragon kernel: [ 31.526075] ___sys_sendmsg+0x29b/0x300
2018-09-10T18:47:54.613697-07:00 dragon kernel: [ 31.526732] ? sched_clock+0x9/0x10
2018-09-10T18:47:54.613698-07:00 dragon kernel: [ 31.527389] ? sched_clock_cpu+0x11/0xd0
2018-09-10T18:47:54.613699-07:00 dragon kernel: [ 31.528045] ? sched_clock+0x9/0x10
2018-09-10T18:47:54.613700-07:00 dragon kernel: [ 31.528701] ? sched_clock_cpu+0x11/0xd0
2018-09-10T18:47:54.613700-07:00 dragon kernel: [ 31.529360] ? sched_clock+0x9/0x10
2018-09-10T18:47:54.613701-07:00 dragon kernel: [ 31.530018] ? __fget+0xb7/0xf0
2018-09-10T18:47:54.613702-07:00 dragon kernel: [ 31.530675] __sys_sendmsg+0x4f/0x90
2018-09-10T18:47:54.613703-07:00 dragon kernel: [ 31.531331] ? __sys_sendmsg+0x4f/0x90
2018-09-10T18:47:54.613703-07:00 dragon kernel: [ 31.531990] __x64_sys_sendmsg+0x1a/0x20
2018-09-10T18:47:54.613704-07:00 dragon kernel: [ 31.532649] do_syscall_64+0x65/0x1a0
2018-09-10T18:47:54.613705-07:00 dragon kernel: [ 31.533308] entry_SYSCALL_64_after_hwframe+0x44/0xa9
2018-09-10T18:47:54.613706-07:00 dragon kernel: [ 31.533968] RIP: 0033:0x7f9ea87c9014
2018-09-10T18:47:54.613706-07:00 dragon kernel: [ 31.534626] Code: 89 f3 48 83 ec 10 48 89 7c 24 08 48 89 14 24 e8 42 eb ff ff 48 8b 14 24 41 89 c0 48 89 de 48 8b 7c 24 08 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 30 44 89 c7 48 89 04 24 e8 78 eb ff ff 48 8b
2018-09-10T18:47:54.613707-07:00 dragon kernel: [ 31.536605] RSP: 002b:00007ffcb0ca3130 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
2018-09-10T18:47:54.613708-07:00 dragon kernel: [ 31.537916] RAX: ffffffffffffffda RBX: 00007ffcb0ca3180 RCX: 00007f9ea87c9014
2018-09-10T18:47:54.613709-07:00 dragon kernel: [ 31.539223] RDX: 0000000000000000 RSI: 00007ffcb0ca3180 RDI: 0000000000000007
2018-09-10T18:47:54.613710-07:00 dragon kernel: [ 31.540531] RBP: 00005648e6cb26f0 R08: 0000000000000000 R09: 00005648e6cb3120
2018-09-10T18:47:54.613710-07:00 dragon kernel: [ 31.541840] R10: fffffffffffffe88 R11: 0000000000000293 R12: 00005648e6beb410
2018-09-10T18:47:54.613711-07:00 dragon kernel: [ 31.543147] R13: 00007ffcb0ca3180 R14: 00007ffcb0ca3304 R15: 0000000000000000
2018-09-10T18:47:54.613712-07:00 dragon kernel: [ 31.544456] Modules linked in: snd_hda_codec_hdmi snd_hda_codec_realtek nls_iso8859_1 nls_cp437 snd_hda_codec_generic i915 btrfs uvcvideo xor zstd_compress videobuf2_vmalloc videobuf2_memops raid6_pq hid_generic usbmouse videobuf2_v4l2 videobuf2_common arc4 usbhid coretemp iwldvm videodev libcrc32c hwmon hid media intel_rapl zstd_decompress mac80211 xxhash x86_pkg_temp_thermal msr intel_powerclamp snd_hda_intel snd_hda_codec kvm_intel crct10dif_pclmul snd_hwdep crc32_pclmul snd_hda_core crc32c_intel iwlwifi ghash_clmulni_intel snd_pcm pcbc kvmgt vfio_mdev snd_timer mdev aesni_intel vfio_iommu_type1 iTCO_wdt aes_x86_64 crypto_simd gpio_ich cfg80211 snd vfio iTCO_vendor_support kvm rfkill soundcore cryptd sdhci_pci cqhci sdhci sr_mod glue_helper irqbypass evdev input_leds joydev mousedev intel_cstate
2018-09-10T18:47:54.613713-07:00 dragon kernel: [ 31.551785] thermal intel_uncore mei_me mmc_core pcc_cpufreq intel_rapl_perf lpc_ich mei ac video led_class cdrom e1000e serio_raw pcspkr toshiba_haps battery button sg scsi_dh_rdac scsi_dh_emc scsi_dh_alua autofs4
2018-09-10T18:47:54.613714-07:00 dragon kernel: [ 31.553809] ---[ end trace d3ae93ce8608d128 ]---
2018-09-10T18:47:54.614851-07:00 dragon kernel: [ 31.554512] RIP: 0010:swiotlb_tbl_map_single+0x296/0x2c0
2018-09-10T18:47:54.614856-07:00 dragon kernel: [ 31.554514] Code: fe ff ff 83 7d a0 01 0f 87 e2 fe ff ff 48 8b 35 e0 e0 df 00 48 8b 55 d0 49 8d 3c 36 48 03 75 b0 e8 df 5f 65 00 e9 c5 fe ff ff <0f> 0b 48 8b 55 d0 48 8b 7d c8 48 c7 c6 c8 74 e4 ab e8 f4 37 40 00
2018-09-10T18:47:54.614858-07:00 dragon kernel: [ 31.554516] RSP: 0018:ffffb42480cab0f0 EFLAGS: 00010246
2018-09-10T18:47:54.614859-07:00 dragon kernel: [ 31.554519] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
2018-09-10T18:47:54.614861-07:00 dragon kernel: [ 31.554520] RDX: 0000000000000000 RSI: 00000000a65d0000 RDI: ffff97e38a5a6890
2018-09-10T18:47:54.614862-07:00 dragon kernel: [ 31.554522] RBP: ffffb42480cab150 R08: 0000000000000000 R09: 0000000000000000
2018-09-10T18:47:54.614863-07:00 dragon kernel: [ 31.554523] R10: 0000000000000002 R11: 0000000000000000 R12: ffff97e38a5a6890
2018-09-10T18:47:54.614864-07:00 dragon kernel: [ 31.554525] R13: 000000000014cba0 R14: 0000000000000001 R15: 0000000000200000
2018-09-10T18:47:54.614864-07:00 dragon kernel: [ 31.554527] FS: 00007f9eaafe2980(0000) GS:ffff97e38ae00000(0000) knlGS:0000000000000000
2018-09-10T18:47:54.614865-07:00 dragon kernel: [ 31.554529] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2018-09-10T18:47:54.614866-07:00 dragon kernel: [ 31.554530] CR2: 00005648e6cb5138 CR3: 0000000136ec6005 CR4: 00000000000606e0


thanks,
--
~Randy


2018-09-16 15:18:01

by Pavel Machek

Subject: Re: 4.19-rc[23] iwlwifi: BUG in swiotlb

Hi!

> > > IO_TLB_SHIFT is 11, so we get 2k alignment, so even the smallest size
> > > (32*64) should result in nslots being 1?
> > >
> > > In fact, unless the driver passed *ZERO* as the size, this should never
> > > happen (hence the BUG_ON), since ALIGN() would take care of rounding up
> > > any smaller allocation here.
> > >
> > > Presumably you can reproduce this pretty easily (and I don't know what
> > > specific model of NIC you have etc.), so perhaps you can do something
> > > like this?
> > >
> > > https://p.sipsolutions.net/aa0dccd7a60fe176.txt
> >
> > That results in: ... if I'm not mistaken. Tested on top of today's
> > mainline. (-rc3.95 :-)
>
> Hold on. I was confused by my build system. Let me retry.
>
> Are you sure you are not mistaking WARN and WARN_ON?

I changed WARNs to printks, and yes, we seem to be pushing 0s where we
should not.
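
For anyone following along, here is a minimal userspace sketch of the slot computation discussed in the quoted text above. It only illustrates why a zero size is the one input that can trip the BUG_ON. IO_TLB_SHIFT = 11 is taken from the thread; align_up() and nslots_for() are hypothetical stand-ins, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* 2 KiB slots, as stated in the thread (IO_TLB_SHIFT is 11). */
#define IO_TLB_SHIFT 11

/* Userspace stand-in for the kernel's ALIGN(): round x up to a multiple
 * of a, where a must be a power of two. */
static inline size_t align_up(size_t x, size_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (x + a - 1) & ~(a - 1);
}

/* Hypothetical helper mirroring the quoted expression:
 *   nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
 */
static inline size_t nslots_for(size_t size)
{
    return align_up(size, (size_t)1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
}
```

With these definitions nslots_for(0) is 0, while nslots_for(1) and nslots_for(32 * 64) are both 1 and nslots_for(256 * 128) is 16, so every non-zero size passes the !nslots check.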

Looks simple to me...
Pavel

[ 6.307381] device-mapper: ioctl: error adding target to table
[ 8.882203] e1000e: eth2 NIC Link is Up 100 Mbps Full Duplex, Flow Control: Rx/Tx
[ 8.882211] e1000e 0000:00:19.0 eth2: 10/100 speed: disabling TSO
[ 9.850102] random: crng init done
[ 9.850119] random: 7 urandom warning(s) missed due to ratelimiting
[ 34.443033] iwlwifi 0000:03:00.0: RF_KILL bit toggled to enable radio.
[ 34.443053] iwlwifi 0000:03:00.0: reporting RF_KILL (radio enabled)
[ 34.467728] iwlwifi 0000:03:00.0: Radio type=0x0-0x0-0x3
[ 34.468122] tfd_sz is 0 - tfh:0, slots:256, tfd_size:128, maxq:0
[ 34.468129] ------------[ cut here ]------------
[ 34.468132] kernel BUG at kernel/dma/swiotlb.c:521!
[ 34.468156] invalid opcode: 0000 [#1] SMP PTI
[ 34.468160] CPU: 0 PID: 3126 Comm: NetworkManager Not tainted 4.19.0-rc3 #8
[ 34.468162] Hardware name: LENOVO 42872WU/42872WU, BIOS 8DET74WW (1.44 ) 03/13/2018
[ 34.468170] RIP: 0010:swiotlb_tbl_map_single+0x17f/0x2c0
[ 34.468175] Code: 21 c6 49 89 f5 49 81 c5 ff 07 00 00 49 c1 ed 0b 48 83 f8 ff 0f 84 f2 fe ff ff 48 8d 90 00 08 00 00 48 c1 ea 0b e9 e2 fe ff ff <0f> 0b 42 8d 0c 3b 89 d8 39 cb 7d 12 48 63 d0 83 c0 01 39 c8 41 c7
[ 34.468179] RSP: 0000:ffffc90000ab3070 EFLAGS: 00010246
[ 34.468183] RAX: 00000000ffffffff RBX: 0000000000000000 RCX: 0000000000000000
[ 34.468188] RDX: 0000000000200000 RSI: 00000000d699f000 RDI: ffff8801970d10a8
[ 34.468190] RBP: ffffc90000ab30c8 R08: 0000000000000002 R09: 0000000000000000
[ 34.468192] R10: 0000000000000034 R11: 303a7178616d2000 R12: 0000000000000001
[ 34.468194] R13: 00000000001ad33e R14: 0000000000000000 R15: 0000000000000000
[ 34.468196] FS: 0000000000000000(0000) GS:ffff88019e200000(0063) knlGS:00000000f70617c0
[ 34.468199] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
[ 34.468201] CR2: 0000000008227c48 CR3: 0000000193a9e006 CR4: 00000000000606b0
[ 34.468203] Call Trace:
[ 34.468208] ? dma_direct_alloc+0x6f/0x140
[ 34.468212] swiotlb_alloc+0x88/0x170
[ 34.468216] iwl_pcie_txq_alloc+0x2aa/0x450
[ 34.468220] iwl_pcie_tx_init+0x325/0x390
[ 34.468223] iwl_trans_pcie_start_fw+0x267/0x590
[ 34.468228] iwl_load_ucode_wait_alive+0xde/0x1b0
[ 34.468231] ? iwl_init_notification_wait+0x78/0x90
[ 34.468235] ? iwl_alloc_all+0x30/0x30
[ 34.468239] iwl_run_init_ucode+0xa3/0x130
[ 34.468242] ? iwl_run_init_ucode+0xa3/0x130
[ 34.468246] ? iwl_alive_notify+0x1b0/0x1b0
[ 34.468251] ? mutex_unlock+0xd/0x10
[ 34.468254] iwlagn_mac_start+0x112/0x200
[ 34.468257] ? iwlagn_mac_start+0x112/0x200
[ 34.468262] drv_start+0x2e/0x50
[ 34.468267] ieee80211_do_open+0x356/0x920
[ 34.468270] ? mutex_unlock+0xd/0x10
[ 34.468274] ieee80211_open+0x4e/0x60
[ 34.468279] __dev_open+0xba/0x130
[ 34.468282] __dev_change_flags+0x19c/0x200
[ 34.468286] ? __switch_to_asm+0x34/0x70
[ 34.468289] ? __switch_to_asm+0x40/0x70
[ 34.468293] dev_change_flags+0x24/0x60
[ 34.468297] do_setlink+0x2f4/0xce0
[ 34.468301] ? _raw_spin_unlock_irq+0x22/0x30
[ 34.468304] ? finish_task_switch+0xa3/0x250
[ 34.468308] ? finish_task_switch+0x76/0x250
[ 34.468311] ? __schedule+0x36c/0x830
[ 34.468317] ? blk_flush_plug_list+0xdd/0x250
[ 34.468322] ? nla_parse+0x36/0x130
[ 34.468325] rtnl_newlink+0x483/0x770
[ 34.468330] ? update_group_capacity+0x27/0x2f0
[ 34.468333] ? find_busiest_group+0x141/0xad0
[ 34.468339] ? cpumask_next_and+0x1d/0x20
[ 34.468342] ? load_balance+0x204/0xb80
[ 34.468346] ? find_held_lock+0x39/0xb0
[ 34.468350] ? find_held_lock+0x39/0xb0
[ 34.468353] ? __lock_acquire.isra.25+0x39e/0xa50
[ 34.468358] rtnetlink_rcv_msg+0x316/0x3e0
[ 34.468362] ? rtnl_calcit.isra.40+0x140/0x140
[ 34.468366] netlink_rcv_skb+0xcd/0x100
[ 34.468369] rtnetlink_rcv+0x10/0x20
[ 34.468372] netlink_unicast+0x179/0x210
[ 34.468375] netlink_sendmsg+0x307/0x3a0
[ 34.468379] sock_sendmsg+0x18/0x30
[ 34.468382] ___sys_sendmsg+0x2a5/0x2c0
[ 34.468386] ? sock_def_readable+0xce/0xe0
[ 34.468392] ? unix_dgram_sendmsg+0x46b/0x6a0
[ 34.468396] ? find_held_lock+0x39/0xb0
[ 34.468401] ? __fget+0x8a/0xd0
[ 34.468405] ? __fget+0xa2/0xd0
[ 34.468408] __sys_sendmsg+0x63/0xa0
[ 34.468411] ? __sys_sendmsg+0x63/0xa0
[ 34.468415] __ia32_compat_sys_socketcall+0xde/0x220
[ 34.468418] ? __ia32_compat_sys_time+0x10/0x40
[ 34.468424] do_int80_syscall_32+0x50/0x100
[ 34.468428] entry_INT80_compat+0x7d/0x82
[ 34.468431] RIP: 0023:0xf7fb6c42
[ 34.468434] Code: 65 8b 15 04 00 00 00 8b 0e 8b 0c ca 83 f9 ff 75 0c 89 04 24 89 f0 e8 b3 fe ff ff eb 05 8b 46 04 01 c8 83 c4 14 5b 5e c3 cd 80 <c3> 8d b6 00 00 00 00 8d bc 27 00 00 00 00 8b 1c 24 c3 8d b6 00 00
[ 34.468436] RSP: 002b:00000000ff93a304 EFLAGS: 00200293 ORIG_RAX: 0000000000000066
[ 34.468440] RAX: ffffffffffffffda RBX: 0000000000000010 RCX: 00000000ff93a310
[ 34.468442] RDX: 00000000f7c27000 RSI: 0000000000000000 RDI: 00000000081ae170
[ 34.468444] RBP: 00000000081b8080 R08: 0000000000000000 R09: 0000000000000000
[ 34.468446] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 34.468448] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 34.468451] Modules linked in:
[ 34.468457] ---[ end trace 301c76c6cfaad410 ]---
[ 34.468462] RIP: 0010:swiotlb_tbl_map_single+0x17f/0x2c0
[ 34.468466] Code: 21 c6 49 89 f5 49 81 c5 ff 07 00 00 49 c1 ed 0b 48 83 f8 ff 0f 84 f2 fe ff ff 48 8d 90 00 08 00 00 48 c1 ea 0b e9 e2 fe ff ff <0f> 0b 42 8d 0c 3b 89 d8 39 cb 7d 12 48 63 d0 83 c0 01 39 c8 41 c7
[ 34.468469] RSP: 0000:ffffc90000ab3070 EFLAGS: 00010246
[ 34.468472] RAX: 00000000ffffffff RBX: 0000000000000000 RCX: 0000000000000000
[ 34.468474] RDX: 0000000000200000 RSI: 00000000d699f000 RDI: ffff8801970d10a8
[ 34.468476] RBP: ffffc90000ab30c8 R08: 0000000000000002 R09: 0000000000000000
[ 34.468478] R10: 0000000000000034 R11: 303a7178616d2000 R12: 0000000000000001
[ 34.468480] R13: 00000000001ad33e R14: 0000000000000000 R15: 0000000000000000
[ 34.468483] FS: 0000000000000000(0000) GS:ffff88019e200000(0063) knlGS:00000000f70617c0
[ 34.468486] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
[ 34.468488] CR2: 0000000008227c48 CR3: 0000000193a9e006 CR4: 00000000000606b0
[ 34.928276] usb 1-1.4: new full-speed USB device number 5 using ehci-pci
[ 35.043018] usb 1-1.4: New USB device found, idVendor=0a5c, idProduct=217f, bcdDevice= 7.48
[ 35.043032] usb 1-1.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 35.043040] usb 1-1.4: Product: Broadcom Bluetooth Device
[ 35.043046] usb 1-1.4: Manufacturer: Broadcom Corp
[ 35.043052] usb 1-1.4: SerialNumber: 7CE9D3B855AA




--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



2018-09-16 15:04:04

by Pavel Machek

Subject: Re: 4.19-rc[23] iwlwifi: BUG in swiotlb

On Sun 2018-09-16 11:34:14, Pavel Machek wrote:
> Hi!
>
> > > Any ideas?
> >
> > Hmm. Is this new?
> >
> > > 2018-09-10T18:47:54.532837-07:00 dragon kernel: [ 31.472371] kernel BUG at ../kernel/dma/swiotlb.c:521!
> >
> > nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
> > [...]
> > BUG_ON(!nslots)
> >
> > > 2018-09-10T18:47:54.613655-07:00 dragon kernel: [ 31.490325] swiotlb_alloc+0x88/0x170
> > > 2018-09-10T18:47:54.613656-07:00 dragon kernel: [ 31.490329] ? __kmalloc+0x1cc/0x200
> > > 2018-09-10T18:47:54.613657-07:00 dragon kernel: [ 31.491652] iwl_pcie_txq_alloc+0x1d4/0x3b0 [iwlwifi]
> >
> > There are two calls to dma_alloc_coherent() here, should those even hit
> > swiotlb? The sizes of those should be
> > * 256 x 128 (32k)
> > * 32 x 256 (8k) [TFH, unlikely to be the case here]
> > > * 256 x 256 (64k) [TFH]
> > * 32 x 64 (2k)
> > * 256 x 64 (16k)
> >
> >
> > IO_TLB_SHIFT is 11, so we get 2k alignment, so even the smallest size
> > (32*64) should result in nslots being 1?
> >
> > In fact, unless the driver passed *ZERO* as the size, this should never
> > happen (hence the BUG_ON), since ALIGN() would take care of rounding up
> > any smaller allocation here.
> >
> > Presumably you can reproduce this pretty easily (and I don't know what
> > specific model of NIC you have etc.), so perhaps you can do something
> > like this?
> >
> > https://p.sipsolutions.net/aa0dccd7a60fe176.txt
>
> That results in: ... if I'm not mistaken. Tested on top of today's
> mainline. (-rc3.95 :-)

Hold on. I was confused by my build system. Let me retry.

Are you sure you are not mistaking WARN and WARN_ON?



--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



2018-09-16 14:56:35

by Pavel Machek

Subject: Re: 4.19-rc[23] iwlwifi: BUG in swiotlb

Hi!

> > Any ideas?
>
> Hmm. Is this new?
>
> > 2018-09-10T18:47:54.532837-07:00 dragon kernel: [ 31.472371] kernel BUG at ../kernel/dma/swiotlb.c:521!
>
> nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
> [...]
> BUG_ON(!nslots)
>
> > 2018-09-10T18:47:54.613655-07:00 dragon kernel: [ 31.490325] swiotlb_alloc+0x88/0x170
> > 2018-09-10T18:47:54.613656-07:00 dragon kernel: [ 31.490329] ? __kmalloc+0x1cc/0x200
> > 2018-09-10T18:47:54.613657-07:00 dragon kernel: [ 31.491652] iwl_pcie_txq_alloc+0x1d4/0x3b0 [iwlwifi]
>
> There are two calls to dma_alloc_coherent() here, should those even hit
> swiotlb? The sizes of those should be
> * 256 x 128 (32k)
> * 32 x 256 (8k) [TFH, unlikely to be the case here]
> * 256 x 256 (64k) [TFH]
> * 32 x 64 (2k)
> * 256 x 64 (16k)
>
>
> IO_TLB_SHIFT is 11, so we get 2k alignment, so even the smallest size
> (32*64) should result in nslots being 1?
>
> In fact, unless the driver passed *ZERO* as the size, this should never
> happen (hence the BUG_ON), since ALIGN() would take care of rounding up
> any smaller allocation here.
>
> Presumably you can reproduce this pretty easily (and I don't know what
> specific model of NIC you have etc.), so perhaps you can do something
> like this?
>
> https://p.sipsolutions.net/aa0dccd7a60fe176.txt

That results in: ... if I'm not mistaken. Tested on top of today's
mainline. (-rc3.95 :-)

[ 9.318335] e1000e: eth2 NIC Link is Up 100 Mbps Full Duplex, Flow Control: Rx/Tx
[ 9.318342] e1000e 0000:00:19.0 eth2: 10/100 speed: disabling TSO
[ 10.078165] random: crng init done
[ 10.078170] random: 7 urandom warning(s) missed due to ratelimiting
[ 89.607425] iwlwifi 0000:03:00.0: RF_KILL bit toggled to enable radio.
[ 89.609870] iwlwifi 0000:03:00.0: reporting RF_KILL (radio enabled)
[ 89.634418] iwlwifi 0000:03:00.0: Radio type=0x0-0x0-0x3
[ 89.635668] ------------[ cut here ]------------
[ 89.636445] kernel BUG at kernel/dma/swiotlb.c:521!
[ 89.637220] invalid opcode: 0000 [#1] SMP PTI
[ 89.637937] CPU: 1 PID: 3126 Comm: NetworkManager Not tainted 4.19.0-rc3 #7
[ 89.638665] Hardware name: LENOVO 42872WU/42872WU, BIOS 8DET74WW (1.44 ) 03/13/2018
[ 89.639415] RIP: 0010:swiotlb_tbl_map_single+0x17f/0x2c0
[ 89.640147] Code: 21 c6 49 89 f5 49 81 c5 ff 07 00 00 49 c1 ed 0b 48 83 f8 ff 0f 84 f2 fe ff ff 48 8d 90 00 08 00 00 48 c1 ea 0b e9 e2 fe ff ff <0f> 0b 42 8d 0c 3b 89 d8 39 cb 7d 12 48 63 d0 83 c0 01 39 c8 41 c7
[ 89.641746] RSP: 0000:ffffc9000092f070 EFLAGS: 00010246
[ 89.642560] RAX: 00000000ffffffff RBX: 0000000000000000 RCX: 0000000000000000
[ 89.643399] RDX: 0000000000200000 RSI: 00000000d699f000 RDI: ffff8801970960a8
[ 89.644235] RBP: ffffc9000092f0c8 R08: 0000000000000002 R09: 0000000000000000
[ 89.645080] R10: 0000000000000034 R11: 0000000000000000 R12: 0000000000000001
[ 89.645917] R13: 00000000001ad33e R14: 0000000000000000 R15: 0000000000000000
[ 89.646750] FS: 0000000000000000(0000) GS:ffff88019e240000(0063) knlGS:00000000f70437c0
[ 89.647599] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
[ 89.648442] CR2: 0000000056767120 CR3: 00000001938fa003 CR4: 00000000000606a0
[ 89.649302] Call Trace:
[ 89.650150] ? dma_direct_alloc+0x6f/0x140
[ 89.651001] swiotlb_alloc+0x88/0x170
[ 89.651838] iwl_pcie_txq_alloc+0x205/0x420
[ 89.652669] ? iwl_pcie_tx_init+0x28d/0x390
[ 89.653502] iwl_pcie_tx_init+0x325/0x390
[ 89.654338] iwl_trans_pcie_start_fw+0x267/0x590
[ 89.655185] iwl_load_ucode_wait_alive+0xde/0x1b0
[ 89.656024] ? iwl_init_notification_wait+0x78/0x90
[ 89.656865] ? iwl_alloc_all+0x30/0x30
[ 89.657701] iwl_run_init_ucode+0xa3/0x130
[ 89.658528] ? iwl_run_init_ucode+0xa3/0x130
[ 89.659352] ? iwl_alive_notify+0x1b0/0x1b0
[ 89.660167] ? mutex_unlock+0xd/0x10
[ 89.660975] iwlagn_mac_start+0x112/0x200
[ 89.661785] ? iwlagn_mac_start+0x112/0x200
[ 89.662600] drv_start+0x2e/0x50
[ 89.663424] ieee80211_do_open+0x356/0x920
[ 89.664230] ? mutex_unlock+0xd/0x10
[ 89.665027] ieee80211_open+0x4e/0x60
[ 89.665809] __dev_open+0xba/0x130
[ 89.666572] __dev_change_flags+0x19c/0x200
[ 89.667330] ? __switch_to_asm+0x34/0x70
[ 89.668070] ? __switch_to_asm+0x40/0x70
[ 89.668800] dev_change_flags+0x24/0x60
[ 89.669518] do_setlink+0x2f4/0xce0
[ 89.670216] ? _raw_spin_unlock_irq+0x22/0x30
[ 89.670933] ? finish_task_switch+0xa3/0x250
[ 89.671631] ? finish_task_switch+0x76/0x250
[ 89.672322] ? __schedule+0x36c/0x830
[ 89.673006] ? blk_flush_plug_list+0xdd/0x250
[ 89.673694] ? nla_parse+0x36/0x130
[ 89.674374] rtnl_newlink+0x483/0x770
[ 89.675061] ? update_group_capacity+0x27/0x2f0
[ 89.675735] ? find_busiest_group+0x141/0xad0
[ 89.676398] ? find_held_lock+0x39/0xb0
[ 89.677044] ? load_balance+0x709/0xb80
[ 89.677647] ? find_held_lock+0x39/0xb0
[ 89.678200] ? cache_alloc_refill+0x4c1/0xc80
[ 89.678735] ? find_held_lock+0x39/0xb0
[ 89.679265] ? __lock_acquire.isra.25+0x39e/0xa50
[ 89.679786] rtnetlink_rcv_msg+0x316/0x3e0
[ 89.680290] ? rtnl_calcit.isra.40+0x140/0x140
[ 89.680792] netlink_rcv_skb+0xcd/0x100
[ 89.681291] rtnetlink_rcv+0x10/0x20
[ 89.681779] netlink_unicast+0x179/0x210
[ 89.682253] netlink_sendmsg+0x307/0x3a0
[ 89.682713] sock_sendmsg+0x18/0x30
[ 89.683168] ___sys_sendmsg+0x2a5/0x2c0
[ 89.683619] ? find_held_lock+0x39/0xb0
[ 89.684071] ? find_held_lock+0x39/0xb0
[ 89.684511] ? __fget+0x8a/0xd0
[ 89.684947] ? __fget+0xa2/0xd0
[ 89.685377] __sys_sendmsg+0x63/0xa0
[ 89.685804] ? __sys_sendmsg+0x63/0xa0
[ 89.686232] __ia32_compat_sys_socketcall+0xde/0x220
[ 89.686660] do_int80_syscall_32+0x50/0x100
[ 89.687099] entry_INT80_compat+0x7d/0x82
[ 89.687527] RIP: 0023:0xf7f98c42
[ 89.687950] Code: 65 8b 15 04 00 00 00 8b 0e 8b 0c ca 83 f9 ff 75 0c 89 04 24 89 f0 e8 b3 fe ff ff eb 05 8b 46 04 01 c8 83 c4 14 5b 5e c3 cd 80 <c3> 8d b6 00 00 00 00 8d bc 27 00 00 00 00 8b 1c 24 c3 8d b6 00 00
[ 89.688990] RSP: 002b:00000000ff933894 EFLAGS: 00200293 ORIG_RAX: 0000000000000066
[ 89.689535] RAX: ffffffffffffffda RBX: 0000000000000010 RCX: 00000000ff9338a0
[ 89.690093] RDX: 00000000f7c09000 RSI: 0000000000000000 RDI: 00000000081ae170
[ 89.690653] RBP: 0000000008248118 R08: 0000000000000000 R09: 0000000000000000
[ 89.691226] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 89.691792] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 89.692354] Modules linked in:
[ 89.692929] ---[ end trace 3906e4f171da79b4 ]---
[ 89.693651] RIP: 0010:swiotlb_tbl_map_single+0x17f/0x2c0
[ 89.693653] Code: 21 c6 49 89 f5 49 81 c5 ff 07 00 00 49 c1 ed 0b 48 83 f8 ff 0f 84 f2 fe ff ff 48 8d 90 00 08 00 00 48 c1 ea 0b e9 e2 fe ff ff <0f> 0b 42 8d 0c 3b 89 d8 39 cb 7d 12 48 63 d0 83 c0 01 39 c8 41 c7
[ 89.693654] RSP: 0000:ffffc9000092f070 EFLAGS: 00010246
[ 89.693656] RAX: 00000000ffffffff RBX: 0000000000000000 RCX: 0000000000000000
[ 89.693657] RDX: 0000000000200000 RSI: 00000000d699f000 RDI: ffff8801970960a8
[ 89.693666] RBP: ffffc9000092f0c8 R08: 0000000000000002 R09: 0000000000000000
[ 89.693667] R10: 0000000000000034 R11: 0000000000000000 R12: 0000000000000001
[ 89.693668] R13: 00000000001ad33e R14: 0000000000000000 R15: 0000000000000000
[ 89.693670] FS: 0000000000000000(0000) GS:ffff88019e240000(0063) knlGS:00000000f70437c0
[ 89.693671] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
[ 89.693672] CR2: 0000000056767120 CR3: 00000001938fa003 CR4: 00000000000606a0
[ 90.235267] usb 1-1.4: new full-speed USB device number 5 using ehci-pci
[ 90.349748] usb 1-1.4: New USB device found, idVendor=0a5c, idProduct=217f, bcdDevice= 7.48
[ 90.351888] usb 1-1.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 90.353967] usb 1-1.4: Product: Broadcom Bluetooth Device
[ 90.356097] usb 1-1.4: Manufacturer: Broadcom Corp
[ 90.356794] usb 1-1.4: SerialNumber: 7CE9D3B855AA


--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



2018-09-16 15:22:01

by Pavel Machek

Subject: [PATCH] fix iwlwifi on old cards in v4.19 was Re: 4.19-rc[23] iwlwifi: BUG in swiotlb


.max_tfd_queue_size was omitted for old cards, leading to oops in
swiotlb.

Signed-off-by: Pavel Machek <[email protected]>

--- linux/drivers/net/wireless/intel/iwlwifi/cfg/1000.c 2018-09-05 13:12:40.453164067 +0200
+++ linux-64/drivers/net/wireless/intel/iwlwifi/cfg/1000.c 2018-09-16 11:54:04.010970756 +0200
@@ -51,6 +51,7 @@

static const struct iwl_base_params iwl1000_base_params = {
.num_of_queues = IWLAGN_NUM_QUEUES,
+ .max_tfd_queue_size = 256,
.eeprom_size = OTP_LOW_IMAGE_SIZE,
.pll_cfg = true,
.max_ll_items = OTP_MAX_LL_ITEMS_1000,
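
Why the missing initializer matters can be sketched in plain C: a member left out of a designated initializer is zero-initialized, so any size computed from it becomes 0. The struct below is a hypothetical, trimmed-down stand-in for struct iwl_base_params, and the placeholder values are made up. The multiplication in tfd_bytes() is an assumption inferred from the debug line "tfd_sz is 0 - tfh:0, slots:256, tfd_size:128, maxq:0" posted in this thread; it is not copied from the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct iwl_base_params. */
struct base_params_sketch {
    int num_of_queues;
    unsigned int max_tfd_queue_size;
    int eeprom_size;
};

/* Before the patch: max_tfd_queue_size is not named, so C sets it to 0. */
static const struct base_params_sketch params_before = {
    .num_of_queues = 20,   /* placeholder value */
    .eeprom_size = 1024,   /* placeholder value */
};

/* After the patch: the queue size is given explicitly. */
static const struct base_params_sketch params_after = {
    .num_of_queues = 20,   /* placeholder value */
    .max_tfd_queue_size = 256,
    .eeprom_size = 1024,   /* placeholder value */
};

/* Assumed size calculation: bytes per descriptor times queue depth. */
static inline size_t tfd_bytes(size_t tfd_size,
                               const struct base_params_sketch *p)
{
    assert(p != NULL);
    return tfd_size * p->max_tfd_queue_size;
}
```

Under these assumptions tfd_bytes(128, &params_before) is 0, which matches the zero size seen in the debug output, and tfd_bytes(128, &params_after) is 32768, i.e. the 256 x 128 (32k) case listed earlier in the thread.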


--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



2018-09-16 16:28:56

by Pavel Machek

Subject: Re: [linuxwifi] [PATCH] fix iwlwifi on old cards in v4.19 was Re: 4.19-rc[23] iwlwifi: BUG in swiotlb

On Sun 2018-09-16 10:14:22, Grumbach, Emmanuel wrote:
> >
> > >
> > >
> > > .max_tfd_queue_size was omitted for old cards, leading to oops in
> > > swiotlb.
> > >
> > > Signed-off-by: Pavel Machek <[email protected]>
> > >
> >
> > I picked it up in our tree with minor commit message fixes.
> > I also added the Fixes tag for stable.
>
> Ah, of course... not needed... Sorry...

That was quick, thanks!

Ouch, and I should mention... I'm not sure 256 is the right value to
use. I just... guessed so based on the other files :-).

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



2018-09-17 03:52:00

by Randy Dunlap

Subject: Re: [PATCH] fix iwlwifi on old cards in v4.19 was Re: 4.19-rc[23] iwlwifi: BUG in swiotlb

On 9/16/18 2:59 AM, Pavel Machek wrote:
>
> .max_tfd_queue_size was omitted for old cards, leading to oops in
> swiotlb.
>
> Signed-off-by: Pavel Machek <[email protected]>
>

Hi,
Thanks. Works for me.

Tested-by: Randy Dunlap <[email protected]>
Acked-by: Randy Dunlap <[email protected]>


PS: I started the b-word yesterday but my old laptop is slow...


> --- linux/drivers/net/wireless/intel/iwlwifi/cfg/1000.c 2018-09-05 13:12:40.453164067 +0200
> +++ linux-64/drivers/net/wireless/intel/iwlwifi/cfg/1000.c 2018-09-16 11:54:04.010970756 +0200
> @@ -51,6 +51,7 @@
>
>  static const struct iwl_base_params iwl1000_base_params = {
>  	.num_of_queues = IWLAGN_NUM_QUEUES,
> +	.max_tfd_queue_size = 256,
>  	.eeprom_size = OTP_LOW_IMAGE_SIZE,
>  	.pll_cfg = true,
>  	.max_ll_items = OTP_MAX_LL_ITEMS_1000,
>
>


--
~Randy

2018-09-16 15:37:15

by Grumbach, Emmanuel

Subject: RE: [linuxwifi] [PATCH] fix iwlwifi on old cards in v4.19 was Re: 4.19-rc[23] iwlwifi: BUG in swiotlb

>
> >
> >
> > .max_tfd_queue_size was omitted for old cards, leading to oops in
> swiotlb.
> >
> > Signed-off-by: Pavel Machek <[email protected]>
> >
>
> I picked it up in our tree with minor commit message fixes.
> I also added the Fixes tag for stable.
>

Ah, of course... not needed... Sorry...

2018-09-12 01:58:53

by Randy Dunlap

Subject: Re: 4.19-rc[23] iwlwifi: BUG in swiotlb

On 9/11/18 12:32 AM, Johannes Berg wrote:
> On Mon, 2018-09-10 at 19:17 -0700, Randy Dunlap wrote:
>> Hi,
>>
>> Any ideas?
>
> Hmm. Is this new?

I can't be sure. I've been having problems booting this laptop for a few
weeks now but haven't tracked it down yet.

>> 2018-09-10T18:47:54.532837-07:00 dragon kernel: [ 31.472371] kernel BUG at ../kernel/dma/swiotlb.c:521!
>
> nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
> [...]
> BUG_ON(!nslots);
>
>> 2018-09-10T18:47:54.613655-07:00 dragon kernel: [ 31.490325] swiotlb_alloc+0x88/0x170
>> 2018-09-10T18:47:54.613656-07:00 dragon kernel: [ 31.490329] ? __kmalloc+0x1cc/0x200
>> 2018-09-10T18:47:54.613657-07:00 dragon kernel: [ 31.491652] iwl_pcie_txq_alloc+0x1d4/0x3b0 [iwlwifi]
>
> There are two calls to dma_alloc_coherent() here, should those even hit
> swiotlb? The sizes of those should be
> * 256 x 128 (32k)
> * 32 x 256 (8k) [TFH, unlikely to be the case here]
> * 256 x 256 (64k) [TFH]
> * 32 x 64 (2k)
> * 256 x 64 (16k)
>
>
> IO_TLB_SHIFT is 11, so we get 2k alignment, so even the smallest size
> (32*64) should result in nslots being 1?
>
> In fact, unless the driver passed *ZERO* as the size, this should never
> happen (hence the BUG_ON), since ALIGN() would take care of rounding up
> any smaller allocation here.
>
> Presumably you can reproduce this pretty easily (and I don't know what
> specific model of NIC you have etc.), so perhaps you can do something
> like this?

The wireless NIC is Condor Peak:

04:00.0 Network controller: Intel Corporation Centrino Wireless-N 1000 [Condor Peak]
    Subsystem: Intel Corporation Centrino Wireless-N 1000 BGN
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 31
    Region 0: Memory at c2600000 (64-bit, non-prefetchable) [size=8K]
    Capabilities: [c8] Power Management version 3
        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Address: 00000000fee0800c Data: 4162
    Capabilities: [e0] Express (v1) Endpoint, MSI 00
        DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 unlimited
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
            MaxPayload 128 bytes, MaxReadReq 128 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
        LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <128ns, L1 <32us
            ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp-
        LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
            ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
    Capabilities: [100 v1] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
            MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
        HeaderLog: 00000000 00000000 00000000 00000000
    Capabilities: [140 v1] Device Serial Number 74-e5-0b-ff-ff-2d-dc-28
    Kernel driver in use: iwlwifi
    Kernel modules: iwlwifi



sigh. I can reproduce it without the patch:

> https://p.sipsolutions.net/aa0dccd7a60fe176.txt

but with that patch, it just hangs after about 25 seconds of booting
(slow hard drive, not SSD).

I'll try some other ways.

--
~Randy

2018-09-16 15:35:17

by Grumbach, Emmanuel

Subject: RE: [linuxwifi] [PATCH] fix iwlwifi on old cards in v4.19 was Re: 4.19-rc[23] iwlwifi: BUG in swiotlb

>
>
> .max_tfd_queue_size was omitted for old cards, leading to oops in swiotlb.
>
> Signed-off-by: Pavel Machek <[email protected]>
>

I picked it up in our tree with minor commit message fixes.
I also added the Fixes tag for stable.

Thanks!

2018-09-12 02:03:18

by Johannes Berg

Subject: Re: 4.19-rc[23] iwlwifi: BUG in swiotlb

On Tue, 2018-09-11 at 13:57 -0700, Randy Dunlap wrote:

> I can't be sure. I've been having problems booting this laptop for a few
> weeks now but haven't tracked it down yet.

Ok.

> The wireless NIC is Condor Peak:
>
> 04:00.0 Network controller: Intel Corporation Centrino Wireless-N 1000 [Condor Peak]

Wow, that's old. I should have one somewhere, but we haven't worked on
this NIC in many years. We've touched the driver, of course, but the
configuration for this wouldn't have changed.

> sigh. I can reproduce it without the patch:
>
> > https://p.sipsolutions.net/aa0dccd7a60fe176.txt
>
> but with that patch, it just hangs after about 25 seconds of booting
> (slow hard drive, not SSD).
>
> I'll try some other ways.

Hmm. That makes me think you have some corruption going on, rather than
something really being set to 0, because otherwise you should've seen
the warning at least?

johannes

2018-09-11 12:31:06

by Johannes Berg

Subject: Re: 4.19-rc[23] iwlwifi: BUG in swiotlb

On Mon, 2018-09-10 at 19:17 -0700, Randy Dunlap wrote:
> Hi,
>
> Any ideas?

Hmm. Is this new?

> 2018-09-10T18:47:54.532837-07:00 dragon kernel: [ 31.472371] kernel BUG at ../kernel/dma/swiotlb.c:521!

nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
[...]
BUG_ON(!nslots);

> 2018-09-10T18:47:54.613655-07:00 dragon kernel: [ 31.490325] swiotlb_alloc+0x88/0x170
> 2018-09-10T18:47:54.613656-07:00 dragon kernel: [ 31.490329] ? __kmalloc+0x1cc/0x200
> 2018-09-10T18:47:54.613657-07:00 dragon kernel: [ 31.491652] iwl_pcie_txq_alloc+0x1d4/0x3b0 [iwlwifi]

There are two calls to dma_alloc_coherent() here, should those even hit
swiotlb? The sizes of those should be
* 256 x 128 (32k)
* 32 x 256 (8k) [TFH, unlikely to be the case here]
* 256 x 256 (64k) [TFH]
* 32 x 64 (2k)
* 256 x 64 (16k)


IO_TLB_SHIFT is 11, so we get 2k alignment, so even the smallest size
(32*64) should result in nslots being 1?

In fact, unless the driver passed *ZERO* as the size, this should never
happen (hence the BUG_ON), since ALIGN() would take care of rounding up
any smaller allocation here.
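
The arithmetic above can be checked with a small user-space sketch. align_up() is a hypothetical stand-in for the kernel's ALIGN() macro and nslots_for() is an illustrative name; only IO_TLB_SHIFT = 11 and the formula itself come from this mail.

```c
#include <stddef.h>

#define IO_TLB_SHIFT 11	/* 2 KiB slots, as stated above */

/*
 * Round x up to the next multiple of a. a must be a power of two.
 * This mimics what the kernel's ALIGN() macro does.
 */
size_t align_up(size_t x, size_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Number of swiotlb slots needed for an allocation of `size` bytes. */
size_t nslots_for(size_t size)
{
	return align_up(size, (size_t)1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;
}
```

Under these assumptions nslots_for(32 * 64) is 1 and nslots_for(1) is also 1, while nslots_for(0) is 0, so only a zero-sized request would trip the BUG_ON.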

Presumably you can reproduce this pretty easily (and I don't know what
specific model of NIC you have etc.), so perhaps you can do something
like this?

https://p.sipsolutions.net/aa0dccd7a60fe176.txt

johannes

2018-10-07 00:44:23

by Randy Dunlap

Subject: Re: [linuxwifi] [PATCH] fix iwlwifi on old cards in v4.19 was Re: 4.19-rc[23] iwlwifi: BUG in swiotlb

On 9/16/18 3:12 AM, Grumbach, Emmanuel wrote:
>>
>>
>> .max_tfd_queue_size was omitted for old cards, leading to oops in swiotlb.
>>
>> Signed-off-by: Pavel Machek <[email protected]>
>>
>
> I picked it up in our tree with minor commit message fixes.
> I also added the Fixes tag for stable.
>
> Thanks!

Hi,
Are we going to see this fix in 4.19? Hopefully.

thanks,
--
~Randy

2018-10-07 00:57:18

by Randy Dunlap

Subject: Re: [linuxwifi] [PATCH] fix iwlwifi on old cards in v4.19 was Re: 4.19-rc[23] iwlwifi: BUG in swiotlb

On 10/6/18 5:44 PM, Randy Dunlap wrote:
> On 9/16/18 3:12 AM, Grumbach, Emmanuel wrote:
>>>
>>>
>>> .max_tfd_queue_size was omitted for old cards, leading to oops in swiotlb.
>>>
>>> Signed-off-by: Pavel Machek <[email protected]>
>>>
>>
>> I picked it up in our tree with minor commit message fixes.
>> I also added the Fixes tag for stable.
>>
>> Thanks!
>
> Hi,
> Are we going to see this fix in 4.19? Hopefully.

Sorry, I see it now.

thanks,
--
~Randy