This is the start of the stable review cycle for the 3.17.4 release.
There are 141 patches in this series, all of which will be posted as
responses to this one. If anyone has any issues with these being
applied, please let me know.
Responses should be made by Fri Nov 21 20:51:28 UTC 2014.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.17.4-rc1.gz
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <[email protected]>
Linux 3.17.4-rc1
Emmanuel Grumbach <[email protected]>
iwlwifi: fix RFkill while calibrating
David Howells <[email protected]>
KEYS: Reinstate EPERM for a key type name beginning with a '.'
Stanislaw Gruszka <[email protected]>
asus-nb-wmi: Add wapf4 quirk for the X550VB
Daniel Borkmann <[email protected]>
net: sctp: fix skb_over_panic when receiving malformed ASCONF chunks
Daniel Borkmann <[email protected]>
net: sctp: fix panic on duplicate ASCONF chunks
Daniel Borkmann <[email protected]>
net: sctp: fix remote memory pressure from excessive queueing
Stephan Mueller <[email protected]>
quirk for Lenovo Yoga 3: no rfkill switch
Nadav Amit <[email protected]>
KVM: x86: Don't report guest userspace emulation error to userspace
David Rientjes <[email protected]>
mm, thp: fix collapsing of hugepages on madvise
Joe Perches <[email protected]>
checkpatch: remove unnecessary + after {8,8}
Michal Marek <[email protected]>
builddeb: put the dbg files into the correct directory
Pali Rohár <[email protected]>
dell-wmi: Fix access out of memory
Pranith Kumar <[email protected]>
rcu: Use rcu_gp_kthread_wake() to wake up grace period kthreads
Bob Peterson <[email protected]>
GFS2: Make rename not save dirent location
Pablo Neira <[email protected]>
netfilter: xt_bpf: add missing opaque struct sk_filter definition
Arturo Borrero <[email protected]>
netfilter: nft_compat: fix wrong target lookup in nft_target_select_ops()
Houcheng Lin <[email protected]>
netfilter: nf_log: release skbuff on nlmsg put failure
Florian Westphal <[email protected]>
netfilter: nfnetlink_log: fix maximum packet length logged to userspace
Florian Westphal <[email protected]>
netfilter: nf_log: account for size of NLMSG_DONE attribute
Sabrina Dubroca <[email protected]>
netfilter: nf_tables: check for NULL in nf_tables_newchain pcpu stats allocation
Dan Carpenter <[email protected]>
netfilter: ipset: off by one in ip_set_nfnl_get_byindex()
Andrey Vagin <[email protected]>
ipc: always handle a new value of auto_msgmni
Devesh Sharma <[email protected]>
IB/core: Clear AH attr variable to prevent garbage data
Geert Uytterhoeven <[email protected]>
pwm: Fix uninitialized warnings in pwm_get()
Bjorn Helgaas <[email protected]>
clocksource: Remove "weak" from clocksource_default_clock() declaration
Bjorn Helgaas <[email protected]>
kgdb: Remove "weak" from kgdb_arch_pc() declaration
Bjorn Helgaas <[email protected]>
vmcore: Remove "weak" from function declarations
Bjorn Helgaas <[email protected]>
memory-hotplug: Remove "weak" from memory_block_size_bytes() declaration
Florian Fainelli <[email protected]>
net: systemport: reset UniMAC coming out of a suspend cycle
Florian Fainelli <[email protected]>
net: systemport: enable RX interrupts after NAPI
Anish Bhatt <[email protected]>
cxgb4 : Handle dcb enable correctly
Dan Carpenter <[email protected]>
media: ttusb-dec: buffer overflow in ioctl
Trond Myklebust <[email protected]>
NFSv4.1: nfs41_clear_delegation_stateid shouldn't trust NFS_DELEGATED_STATE
Trond Myklebust <[email protected]>
NFSv4: Fix races between nfs_remove_bad_delegation() and delegation return
Jan Kara <[email protected]>
nfs: Fix use of uninitialized variable in nfs_getattr()
Trond Myklebust <[email protected]>
NFS: Don't try to reclaim delegation open state if recovery failed
Trond Myklebust <[email protected]>
NFSv4: Ensure that we call FREE_STATEID when NFSv4.x stateids are revoked
Trond Myklebust <[email protected]>
NFSv4: Ensure that we remove NFSv4.0 delegations when state has expired
NeilBrown <[email protected]>
md: Always set RECOVERY_NEEDED when clearing RECOVERY_FROZEN
Junjie Mao <[email protected]>
x86, kaslr: Prevent .bss from overlapping initrd
Borislav Petkov <[email protected]>
x86, microcode, AMD: Fix ucode patch stashing on 32-bit
Borislav Petkov <[email protected]>
x86, microcode: Fix accessing dis_ucode_ldr on 32-bit
Borislav Petkov <[email protected]>
x86, microcode, AMD: Fix early ucode loading on 32-bit
Krzysztof Kozlowski <[email protected]>
power: bq2415x_charger: Fix memory leak on DTS parsing error
Krzysztof Kozlowski <[email protected]>
power: bq2415x_charger: Properly handle ENODEV from power_supply_get_by_phandle
Krzysztof Kozlowski <[email protected]>
power: charger-manager: Fix accessing invalidated power supply after charger unbind
Krzysztof Kozlowski <[email protected]>
power: charger-manager: Fix accessing invalidated power supply after fuel gauge unbind
Jeff Layton <[email protected]>
sunrpc: fix sleeping under rcu_read_lock in gss_stringify_acceptor
Geert Uytterhoeven <[email protected]>
cpufreq: Avoid crash in resume on SMP without OPP
Pali Rohár <[email protected]>
Input: alps - ignore bad data on Dell Latitudes E6440 and E7440
Pali Rohár <[email protected]>
Input: alps - allow up to 2 invalid packets without resetting device
Pali Rohár <[email protected]>
Input: alps - ignore potential bare packets when device is out of sync
Takashi Iwai <[email protected]>
Input: synaptics - add min/max quirk for Lenovo T440s
Heinz Mauelshagen <[email protected]>
dm raid: ensure superblock's size matches device's logical block size
Joe Thornber <[email protected]>
dm btree: fix a recursion depth bug in btree walking code
Mikulas Patocka <[email protected]>
dm bufio: change __GFP_IO to __GFP_FS in shrinker callbacks
Jan Kara <[email protected]>
block: Fix computation of merged request priority
Helge Deller <[email protected]>
parisc: Use compat layer for msgctl, shmat, shmctl and semtimedop syscalls
Christoph Hellwig <[email protected]>
scsi: only re-lock door after EH on devices that were reset
William Cohen <[email protected]>
Correct the race condition in aarch64_insn_patch_text_sync()
Peng Tao <[email protected]>
nfs: fix pnfs direct write memory leak
Simon Horman <[email protected]>
ata: sata_rcar: Disable DIPM mode for r8a7790 ES1
Stefan Richter <[email protected]>
firewire: cdev: prevent kernel stack leaking into ioctl arguments
Mark Rutland <[email protected]>
arm64: efi: Fix stub cache maintenance
Kyle McMartin <[email protected]>
arm64: __clear_user: handle exceptions on strb
Joe Thornber <[email protected]>
dm thin: grab a virtual cell before looking up the mapping
Paul Mackerras <[email protected]>
Fix thinko in iov_iter_single_seg_count
Roger Quadros <[email protected]>
pinctrl: dra: dt-bindings: Fix output pull up/down
Andrew Lunn <[email protected]>
ARM: mvebu: armada xp: Generalize use of i2c quirk
Roger Quadros <[email protected]>
ARM: dts: am335x-evm: Fix 5th NAND partition's name
Will Deacon <[email protected]>
ARM: 8191/1: decompressor: ensure I-side picks up relocated code
Nathan Lynch <[email protected]>
ARM: 8198/1: make kuser helpers depend on MMU
Dave Airlie <[email protected]>
drm/radeon: add locking around atombios scratch space usage
Alex Deucher <[email protected]>
drm/radeon: add missing crtc unlock when setting up the MC
Alex Deucher <[email protected]>
drm/radeon: use gart for DMA IB tests
Alex Deucher <[email protected]>
drm/radeon: make sure mode init is complete in bandwidth_update
Jammy Zhou <[email protected]>
drm/radeon: set correct CE ram size for CIK
Jani Nikula <[email protected]>
drm/i915/dp: only use training pattern 3 on platforms that support it
Rodrigo Vivi <[email protected]>
drm/i915: Disable caches for Global GTT.
Jani Nikula <[email protected]>
drm/i915: safeguard against too high minimum brightness
Johannes Berg <[email protected]>
mac80211: fix use-after-free in defragmentation
Luciano Coelho <[email protected]>
mac80211: schedule the actual switch of the station before CSA count 0
Luciano Coelho <[email protected]>
mac80211: use secondary channel offset IE also in beacons during CSA
Johannes Berg <[email protected]>
mac80211: properly flush delayed scan work on interface removal
Junjie Mao <[email protected]>
mac80211_hwsim: release driver when ieee80211_register_hw fails
Herbert Xu <[email protected]>
macvtap: Fix csum_start when VLAN tags are present
Ilya Dryomov <[email protected]>
libceph: do not crash on large auth tickets
Max Filippov <[email protected]>
xtensa: re-wire umount syscall to sys_oldumount
Takashi Iwai <[email protected]>
ALSA: usb-audio: Fix memory leak in FTU quirk
Takashi Iwai <[email protected]>
ALSA: hda - Add mute LED control for Lenovo Ideapad Z560
Tejun Heo <[email protected]>
ahci: disable MSI instead of NCQ on Samsung pci-e SSDs on macbooks
Antoine Tenart <[email protected]>
ahci: fix AHCI parameters not taken into account
James Ralston <[email protected]>
ahci: Add Device IDs for Intel Sunrise Point PCH
Daniel Thompson <[email protected]>
param: fix crash on bad kernel arguments
Rabin Vincent <[email protected]>
tracing: Do not busy wait in buffer splice
Miklos Szeredi <[email protected]>
audit: keep inode pinned
Richard Guy Briggs <[email protected]>
audit: AUDIT_FEATURE_CHANGE message format missing delimiting space
Richard Guy Briggs <[email protected]>
audit: correct AUDIT_GET_FEATURE return message type
Andy Lutomirski <[email protected]>
x86, x32, audit: Fix x32's AUDIT_ARCH wrt audit
Herbert Xu <[email protected]>
tun: Fix csum_start with VLAN acceleration
Nadav Amit <[email protected]>
KVM: x86: Fix uninitialized op->type for some immediate values
Tang Chen <[email protected]>
mem-hotplug: reset node present pages when hot-adding a new pgdat
Tang Chen <[email protected]>
mem-hotplug: reset node managed pages when hot-adding a new pgdat
Greg Kurz <[email protected]>
hwrng: pseries - port to new read API and fix stack corruption
Krzysztof Kozlowski <[email protected]>
mfd: max77693: Fix always masked MUIC interrupts
Krzysztof Kozlowski <[email protected]>
mfd: max77693: Use proper regmap for handling MUIC interrupts
Tony Lindgren <[email protected]>
mfd: twl4030-power: Fix poweroff with PM configuration enabled
Cristian Stoica <[email protected]>
crypto: caam - remove duplicated sg copy functions
Tadeusz Struk <[email protected]>
crypto: qat - Enforce valid numa configuration
Tadeusz Struk <[email protected]>
crypto: qat - Prevent dma mapping zero length assoc data
Cristian Stoica <[email protected]>
crypto: caam - fix missing dma unmap on error path
Joonsoo Kim <[email protected]>
mm/page_alloc: restrict max order of merging on isolated pageblock
Joonsoo Kim <[email protected]>
mm/page_alloc: move freepage counting logic to __free_one_page()
Joonsoo Kim <[email protected]>
mm/page_alloc: add freepage on isolate pageblock to correct buddy list
Joonsoo Kim <[email protected]>
mm/page_alloc: fix incorrect isolation behavior by rechecking migratetype
Weijie Yang <[email protected]>
zram: avoid kunmap_atomic() of a NULL pointer
Andreas Larsson <[email protected]>
sparc32: Implement xchg and atomic_xchg using ATOMIC_HASH locks
David S. Miller <[email protected]>
sparc64: Do irq_{enter,exit}() around generic_smp_call_function*().
David S. Miller <[email protected]>
sparc64: Fix crashes in schizo_pcierr_intr_other().
Dwight Engen <[email protected]>
sunvdc: don't call VD_OP_GET_VTOC
Dwight Engen <[email protected]>
vio: fix reuse of vio_dring slot
Dwight Engen <[email protected]>
sunvdc: limit each sg segment to a page
Allen Pais <[email protected]>
sunvdc: compute vdisk geometry from capacity
Allen Pais <[email protected]>
sunvdc: add cdrom and v1.1 protocol support
Enric Balletbo i Serra <[email protected]>
smsc911x: power-up phydev before doing a software reset.
Hiroaki SHIMODA <[email protected]>
netlink: Properly unbind in error conditions.
Richard Cochran <[email protected]>
net: ptp: fix time stamp matching logic for VLAN packets.
Eric Dumazet <[email protected]>
ipv6: fix IPV6_PKTINFO with v4 mapped
Daniel Borkmann <[email protected]>
net: sctp: fix memory leak in auth key management
Daniel Borkmann <[email protected]>
net: sctp: fix NULL pointer dereference in af->from_addr_param on malformed packet
Takashi Iwai <[email protected]>
net: ppp: Don't call bpf_prog_create() in ppp_lock
Marcelo Leitner <[email protected]>
vxlan: Do not reuse sockets for a different address family
Jesse Gross <[email protected]>
udptunnel: Add SKB_GSO_UDP_TUNNEL during gro_complete.
Karl Beldan <[email protected]>
net: mv643xx_eth: reclaim TX skbs only when released by the HW
Steffen Klassert <[email protected]>
gre6: Move the setting of dev->iflink into the ndo_init functions.
Steffen Klassert <[email protected]>
sit: Use ipip6_tunnel_init as the ndo_init function.
Steffen Klassert <[email protected]>
vti6: Use vti6_dev_init as the ndo_init function.
Steffen Klassert <[email protected]>
ip6_tunnel: Use ip6_tnl_dev_init as the ndo_init function.
Nikolay Aleksandrov <[email protected]>
inet: frags: remove the WARN_ON from inet_evict_bucket
Nikolay Aleksandrov <[email protected]>
inet: frags: fix a race between inet_evict_bucket and inet_frag_kill
Shuah Khan <[email protected]>
x86/build: Add arch/x86/purgatory/ make generated files to gitignore
-------------
Diffstat:
.../devicetree/bindings/ata/sata_rcar.txt | 3 +-
Makefile | 4 +-
arch/arm/boot/compressed/head.S | 20 ++-
arch/arm/boot/dts/am335x-evm.dts | 2 +-
arch/arm/mach-mvebu/board-v7.c | 2 +-
arch/arm/mm/Kconfig | 1 +
arch/arm64/kernel/efi-entry.S | 27 +++-
arch/arm64/kernel/insn.c | 5 +-
arch/arm64/lib/clear_user.S | 2 +-
arch/parisc/include/uapi/asm/shmbuf.h | 25 ++-
arch/parisc/kernel/syscall_table.S | 8 +-
arch/sparc/include/asm/atomic_32.h | 2 +-
arch/sparc/include/asm/cmpxchg_32.h | 12 +-
arch/sparc/include/asm/vio.h | 14 +-
arch/sparc/kernel/pci_schizo.c | 6 +-
arch/sparc/kernel/smp_64.c | 4 +
arch/sparc/lib/atomic32.c | 27 ++++
arch/x86/.gitignore | 2 +
arch/x86/boot/compressed/Makefile | 4 +-
arch/x86/boot/compressed/head_32.S | 5 +-
arch/x86/boot/compressed/head_64.S | 5 +-
arch/x86/boot/compressed/misc.c | 13 +-
arch/x86/boot/compressed/mkpiggy.c | 9 +-
arch/x86/kernel/cpu/microcode/amd_early.c | 33 ++--
arch/x86/kernel/cpu/microcode/core_early.c | 2 +-
arch/x86/kernel/ptrace.c | 11 +-
arch/x86/kvm/emulate.c | 8 +
arch/x86/kvm/x86.c | 2 +-
arch/x86/tools/calc_run_size.pl | 30 ++++
arch/xtensa/include/uapi/asm/unistd.h | 3 +-
block/ioprio.c | 14 +-
drivers/ata/ahci.c | 28 ++--
drivers/ata/sata_rcar.c | 10 ++
drivers/block/sunvdc.c | 176 +++++++++++++++------
drivers/block/zram/zram_drv.c | 3 +-
drivers/char/hw_random/pseries-rng.c | 11 +-
drivers/cpufreq/cpufreq.c | 3 +-
drivers/crypto/caam/caamhash.c | 22 ++-
drivers/crypto/caam/key_gen.c | 29 ++--
drivers/crypto/caam/sg_sw_sec4.h | 54 -------
drivers/crypto/qat/qat_common/adf_accel_devices.h | 3 +-
drivers/crypto/qat/qat_common/adf_transport.c | 12 +-
drivers/crypto/qat/qat_common/qat_algs.c | 7 +-
drivers/crypto/qat/qat_common/qat_crypto.c | 8 +-
drivers/crypto/qat/qat_dh895xcc/adf_admin.c | 2 +-
drivers/crypto/qat/qat_dh895xcc/adf_drv.c | 32 ++--
drivers/crypto/qat/qat_dh895xcc/adf_isr.c | 2 +-
drivers/firewire/core-cdev.c | 3 +-
drivers/gpu/drm/i915/i915_gem_gtt.c | 16 ++
drivers/gpu/drm/i915/intel_dp.c | 5 +-
drivers/gpu/drm/i915/intel_panel.c | 17 +-
drivers/gpu/drm/radeon/atom.c | 11 +-
drivers/gpu/drm/radeon/atom.h | 2 +
drivers/gpu/drm/radeon/atombios_dp.c | 4 +-
drivers/gpu/drm/radeon/atombios_i2c.c | 4 +-
drivers/gpu/drm/radeon/cik.c | 7 +-
drivers/gpu/drm/radeon/cik_sdma.c | 21 +--
drivers/gpu/drm/radeon/evergreen.c | 4 +
drivers/gpu/drm/radeon/r100.c | 3 +
drivers/gpu/drm/radeon/r600_dma.c | 20 +--
drivers/gpu/drm/radeon/radeon_device.c | 1 +
drivers/gpu/drm/radeon/rs600.c | 3 +
drivers/gpu/drm/radeon/rs690.c | 3 +
drivers/gpu/drm/radeon/rv515.c | 3 +
drivers/gpu/drm/radeon/si.c | 3 +
drivers/infiniband/core/uverbs_cmd.c | 2 +
drivers/input/mouse/alps.c | 28 +++-
drivers/input/mouse/synaptics.c | 5 +-
drivers/md/dm-bufio.c | 12 +-
drivers/md/dm-raid.c | 11 +-
drivers/md/dm-thin.c | 16 +-
drivers/md/md.c | 4 +
drivers/md/persistent-data/dm-btree-internal.h | 6 +
drivers/md/persistent-data/dm-btree-spine.c | 2 +-
drivers/md/persistent-data/dm-btree.c | 24 ++-
drivers/media/usb/ttusb-dec/ttusbdecfe.c | 3 +
drivers/mfd/max77693.c | 14 +-
drivers/mfd/twl4030-power.c | 52 ++++++
drivers/net/ethernet/broadcom/bcmsysport.c | 11 +-
drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c | 7 +-
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 6 +-
drivers/net/ethernet/marvell/mv643xx_eth.c | 18 ++-
drivers/net/ethernet/smsc/smsc911x.c | 46 ++++++
drivers/net/ethernet/sun/sunvnet.c | 4 +-
drivers/net/ethernet/ti/cpts.c | 2 +-
drivers/net/macvtap.c | 2 +
drivers/net/phy/dp83640.c | 4 +-
drivers/net/ppp/ppp_generic.c | 40 ++---
drivers/net/tun.c | 16 +-
drivers/net/vxlan.c | 31 ++--
drivers/net/wireless/iwlwifi/mvm/fw.c | 10 +-
drivers/net/wireless/iwlwifi/mvm/mac80211.c | 1 +
drivers/net/wireless/iwlwifi/mvm/mvm.h | 1 +
drivers/net/wireless/iwlwifi/mvm/ops.c | 11 +-
drivers/net/wireless/iwlwifi/pcie/trans.c | 4 +-
drivers/net/wireless/mac80211_hwsim.c | 4 +-
drivers/platform/x86/asus-nb-wmi.c | 9 ++
drivers/platform/x86/dell-wmi.c | 12 +-
drivers/platform/x86/ideapad-laptop.c | 7 +
drivers/power/bq2415x_charger.c | 23 ++-
drivers/power/charger-manager.c | 163 ++++++++++++-------
drivers/pwm/core.c | 29 ++--
drivers/scsi/scsi_error.c | 4 +-
fs/gfs2/dir.c | 9 +-
fs/gfs2/dir.h | 1 +
fs/gfs2/inode.c | 6 +-
fs/nfs/delegation.c | 25 ++-
fs/nfs/delegation.h | 1 +
fs/nfs/direct.c | 1 +
fs/nfs/filelayout/filelayout.c | 3 -
fs/nfs/inode.c | 2 +-
fs/nfs/nfs4proc.c | 76 +++++----
include/dt-bindings/pinctrl/dra.h | 4 +-
include/linux/bootmem.h | 1 +
include/linux/clocksource.h | 2 +-
include/linux/crash_dump.h | 15 +-
include/linux/kgdb.h | 2 +-
include/linux/khugepaged.h | 17 +-
include/linux/memory.h | 2 +-
include/linux/mfd/max77693-private.h | 7 +
include/linux/mmzone.h | 9 ++
include/linux/nfs_xdr.h | 11 ++
include/linux/page-isolation.h | 8 +
include/linux/power/charger-manager.h | 3 -
include/linux/ring_buffer.h | 2 +-
include/net/sctp/sctp.h | 5 +
include/net/sctp/sm.h | 6 +-
include/net/udp_tunnel.h | 9 ++
include/uapi/linux/netfilter/xt_bpf.h | 2 +
init/main.c | 2 +-
ipc/ipc_sysctl.c | 3 +-
kernel/audit.c | 4 +-
kernel/audit_tree.c | 1 +
kernel/rcu/tree.c | 4 +-
kernel/trace/ring_buffer.c | 81 ++++++----
kernel/trace/trace.c | 23 +--
mm/bootmem.c | 9 +-
mm/huge_memory.c | 11 +-
mm/internal.h | 25 +++
mm/iov_iter.c | 4 +-
mm/memory_hotplug.c | 26 +++
mm/mmap.c | 8 +-
mm/nobootmem.c | 8 +-
mm/page_alloc.c | 55 +++----
mm/page_isolation.c | 43 ++++-
net/ceph/crypto.c | 169 +++++++++++++++-----
net/ipv4/inet_fragment.c | 4 +-
net/ipv4/ip_sockglue.c | 2 +-
net/ipv6/ip6_gre.c | 5 +-
net/ipv6/ip6_tunnel.c | 10 +-
net/ipv6/ip6_vti.c | 11 +-
net/ipv6/sit.c | 15 +-
net/mac80211/ibss.c | 2 +-
net/mac80211/ieee80211_i.h | 3 +-
net/mac80211/iface.c | 7 +-
net/mac80211/mesh.c | 2 +-
net/mac80211/mlme.c | 5 +-
net/mac80211/rx.c | 14 +-
net/mac80211/spectmgmt.c | 18 +--
net/netfilter/ipset/ip_set_core.c | 2 +-
net/netfilter/nf_tables_api.c | 4 +-
net/netfilter/nfnetlink_log.c | 31 ++--
net/netfilter/nft_compat.c | 2 +-
net/netlink/af_netlink.c | 5 +-
net/sctp/associola.c | 2 +
net/sctp/auth.c | 2 -
net/sctp/inqueue.c | 33 +---
net/sctp/sm_make_chunk.c | 102 ++++++------
net/sctp/sm_statefuns.c | 21 +--
net/sunrpc/auth_gss/auth_gss.c | 35 +++-
scripts/checkpatch.pl | 2 +-
scripts/package/builddeb | 22 ++-
security/keys/keyctl.c | 2 +
sound/pci/hda/patch_conexant.c | 31 ++++
sound/usb/mixer_quirks.c | 6 +
175 files changed, 1711 insertions(+), 875 deletions(-)
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Enric Balletbo i Serra <[email protected]>
[ Upstream commit ccf899a27c08038db91765ff12bb0380dcd85887 ]
With commit be9dad1f9f26604fb ("net: phy: suspend phydev when going
to HALTED"), the PHY device is put into a low-power mode using
BMCR_PDOWN when the interface is set down. The smsc911x driver performs
a software reset when the device is opened (ndo_open). In that case,
the PHY must be powered up before any register is accessed and before
the software_reset function is called. Otherwise, with the PHY still
powered down, the software reset fails and the interface cannot be
enabled again.
This patch fixes that scenario, which is easy to reproduce by setting
the network interface down and then up again:
$ ifconfig eth0 down
$ ifconfig eth0 up
ifconfig: SIOCSIFFLAGS: Input/output error
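The helper added below, smsc911x_phy_general_power_up(), boils down to the
standard phylib pattern of clearing BMCR_PDOWN before touching the MAC. A
minimal sketch of that pattern (hedged: phy_exit_power_down() is an
illustrative name, not a kernel API; <linux/phy.h>, <linux/mii.h> and
<linux/delay.h> are assumed):

static int phy_exit_power_down(struct phy_device *phydev)
{
        int bmcr = phy_read(phydev, MII_BMCR); /* BMCR carries the power-down bit */

        if (bmcr < 0)
                return bmcr;

        if (bmcr & BMCR_PDOWN) {
                int rc = phy_write(phydev, MII_BMCR, bmcr & ~BMCR_PDOWN);

                if (rc < 0)
                        return rc;
                /* give the PHY a moment to leave General Power-Down mode */
                usleep_range(1000, 1500);
        }

        return 0;
}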
Signed-off-by: Enric Balletbo i Serra <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/smsc/smsc911x.c | 46 +++++++++++++++++++++++++++++++++++
1 file changed, 46 insertions(+)
--- a/drivers/net/ethernet/smsc/smsc911x.c
+++ b/drivers/net/ethernet/smsc/smsc911x.c
@@ -1342,6 +1342,42 @@ static void smsc911x_rx_multicast_update
spin_unlock(&pdata->mac_lock);
}
+static int smsc911x_phy_general_power_up(struct smsc911x_data *pdata)
+{
+ int rc = 0;
+
+ if (!pdata->phy_dev)
+ return rc;
+
+ /* If the internal PHY is in General Power-Down mode, all, except the
+ * management interface, is powered-down and stays in that condition as
+ * long as Phy register bit 0.11 is HIGH.
+ *
+ * In that case, clear the bit 0.11, so the PHY powers up and we can
+ * access to the phy registers.
+ */
+ rc = phy_read(pdata->phy_dev, MII_BMCR);
+ if (rc < 0) {
+ SMSC_WARN(pdata, drv, "Failed reading PHY control reg");
+ return rc;
+ }
+
+ /* If the PHY general power-down bit is not set is not necessary to
+ * disable the general power down-mode.
+ */
+ if (rc & BMCR_PDOWN) {
+ rc = phy_write(pdata->phy_dev, MII_BMCR, rc & ~BMCR_PDOWN);
+ if (rc < 0) {
+ SMSC_WARN(pdata, drv, "Failed writing PHY control reg");
+ return rc;
+ }
+
+ usleep_range(1000, 1500);
+ }
+
+ return 0;
+}
+
static int smsc911x_phy_disable_energy_detect(struct smsc911x_data *pdata)
{
int rc = 0;
@@ -1415,6 +1451,16 @@ static int smsc911x_soft_reset(struct sm
int ret;
/*
+ * Make sure to power-up the PHY chip before doing a reset, otherwise
+ * the reset fails.
+ */
+ ret = smsc911x_phy_general_power_up(pdata);
+ if (ret) {
+ SMSC_WARN(pdata, drv, "Failed to power-up the PHY chip");
+ return ret;
+ }
+
+ /*
* LAN9210/LAN9211/LAN9220/LAN9221 chips have an internal PHY that
* are initialized in a Energy Detect Power-Down mode that prevents
* the MAC chip to be software reseted. So we have to wakeup the PHY
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johannes Berg <[email protected]>
commit b8fff407a180286aa683d543d878d98d9fc57b13 upstream.
Upon receiving the last fragment, all but the first fragment
are freed, but the multicast check for statistics at the end
of the function still refers to the current skb (the last
fragment), causing a use-after-free.
Since multicast frames cannot be fragmented and we check for
this early in the function, just modify that check to also
do the accounting to fix the issue.
Reported-by: Yosef Khyal <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/mac80211/rx.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -1667,11 +1667,14 @@ ieee80211_rx_h_defragment(struct ieee802
sc = le16_to_cpu(hdr->seq_ctrl);
frag = sc & IEEE80211_SCTL_FRAG;
- if (likely((!ieee80211_has_morefrags(fc) && frag == 0) ||
- is_multicast_ether_addr(hdr->addr1))) {
- /* not fragmented */
+ if (likely(!ieee80211_has_morefrags(fc) && frag == 0))
+ goto out;
+
+ if (is_multicast_ether_addr(hdr->addr1)) {
+ rx->local->dot11MulticastReceivedFrameCount++;
goto out;
}
+
I802_DEBUG_INC(rx->local->rx_handlers_fragments);
if (skb_linearize(rx->skb))
@@ -1764,10 +1767,7 @@ ieee80211_rx_h_defragment(struct ieee802
out:
if (rx->sta)
rx->sta->rx_packets++;
- if (is_multicast_ether_addr(hdr->addr1))
- rx->local->dot11MulticastReceivedFrameCount++;
- else
- ieee80211_led_rx(rx->local);
+ ieee80211_led_rx(rx->local);
return RX_CONTINUE;
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Luciano Coelho <[email protected]>
commit ff1e417c7c239b7abfe70aa90460a77eaafc7f83 upstream.
Due to the time it takes to process the beacon that started the CSA
process, we may be late for the switch if we try to reach exactly
beacon 0. To avoid that, use count - 1 when calculating the switch time.
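As a rough worked example (numbers illustrative, not from the commit): with
a beacon interval of 100 TU and a CSA count of 5, the old code armed the
timer for 5 * 100 = 500 TU after the beacon was processed, while the AP
switches 500 TU after the beacon was transmitted, so any processing delay
makes the station late. Using (count - 1) * beacon_interval = 400 TU leaves
a full beacon interval of margin, so the local switch completes no later
than the AP's.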
Reported-by: Jouni Malinen <[email protected]>
Signed-off-by: Luciano Coelho <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/mac80211/mlme.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -1154,7 +1154,8 @@ ieee80211_sta_process_chanswitch(struct
ieee80211_queue_work(&local->hw, &ifmgd->chswitch_work);
else
mod_timer(&ifmgd->chswitch_timer,
- TU_TO_EXP_TIME(csa_ie.count * cbss->beacon_interval));
+ TU_TO_EXP_TIME((csa_ie.count - 1) *
+ cbss->beacon_interval));
}
static u32 ieee80211_handle_pwr_constr(struct ieee80211_sub_if_data *sdata,
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bjorn Helgaas <[email protected]>
commit 5ab03ac5aaa1f032e071f1b3dc433b7839359c03 upstream.
For the following functions:
elfcorehdr_alloc()
elfcorehdr_free()
elfcorehdr_read()
elfcorehdr_read_notes()
remap_oldmem_pfn_range()
fs/proc/vmcore.c provides default definitions explicitly marked "weak".
arch/s390 provides its own definitions intended to override the default
ones, but the "weak" attribute on the declarations applied to the s390
definitions as well, so the linker chose one based on link order (see
10629d711ed7 ("PCI: Remove __weak annotation from pcibios_get_phb_of_node
decl")).
Remove the "weak" attribute from the declarations so we always prefer a
non-weak definition over the weak one, independent of link order.
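A minimal sketch of the pitfall, using the plain GCC attribute and
hypothetical files hdr.h, default.c and arch.c (not taken from the kernel
tree):

/* hdr.h */
extern int foo(void) __attribute__((weak)); /* weakens every definition below */

/* default.c */
#include "hdr.h"
int foo(void) { return 0; } /* intended default -- now weak */

/* arch.c */
#include "hdr.h"
int foo(void) { return 1; } /* intended override -- also weak, so the
                             * linker picks one based on link order */

/*
 * With a plain 'extern int foo(void);' declaration and the weak attribute
 * kept only on the default definition, the strong definition in arch.c
 * always wins, independent of link order.
 */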
Fixes: be8a8d069e50 ("vmcore: introduce ELF header in new memory feature")
Fixes: 9cb218131de1 ("vmcore: introduce remap_oldmem_pfn_range()")
Signed-off-by: Bjorn Helgaas <[email protected]>
Acked-by: Andrew Morton <[email protected]>
Acked-by: Vivek Goyal <[email protected]>
CC: Michael Holzheu <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/crash_dump.h | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)
--- a/include/linux/crash_dump.h
+++ b/include/linux/crash_dump.h
@@ -14,14 +14,13 @@
extern unsigned long long elfcorehdr_addr;
extern unsigned long long elfcorehdr_size;
-extern int __weak elfcorehdr_alloc(unsigned long long *addr,
- unsigned long long *size);
-extern void __weak elfcorehdr_free(unsigned long long addr);
-extern ssize_t __weak elfcorehdr_read(char *buf, size_t count, u64 *ppos);
-extern ssize_t __weak elfcorehdr_read_notes(char *buf, size_t count, u64 *ppos);
-extern int __weak remap_oldmem_pfn_range(struct vm_area_struct *vma,
- unsigned long from, unsigned long pfn,
- unsigned long size, pgprot_t prot);
+extern int elfcorehdr_alloc(unsigned long long *addr, unsigned long long *size);
+extern void elfcorehdr_free(unsigned long long addr);
+extern ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos);
+extern ssize_t elfcorehdr_read_notes(char *buf, size_t count, u64 *ppos);
+extern int remap_oldmem_pfn_range(struct vm_area_struct *vma,
+ unsigned long from, unsigned long pfn,
+ unsigned long size, pgprot_t prot);
extern ssize_t copy_oldmem_page(unsigned long, char *, size_t,
unsigned long, int);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bjorn Helgaas <[email protected]>
commit 107bcc6d566cb40184068d888637f9aefe6252dd upstream.
kernel/debug/debug_core.c provides a default kgdb_arch_pc() definition
explicitly marked "weak". Several architectures provide their own
definitions intended to override the default, but the "weak" attribute on
the declaration applied to the arch definitions as well, so the linker
chose one based on link order (see 10629d711ed7 ("PCI: Remove __weak
annotation from pcibios_get_phb_of_node decl")).
Remove the "weak" attribute from the declaration so we always prefer a
non-weak definition over the weak one, independent of link order.
Fixes: 688b744d8bc8 ("kgdb: fix signedness mixmatches, add statics, add declaration to header")
Tested-by: Vineet Gupta <[email protected]> # for ARC build
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Harvey Harrison <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/kgdb.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/linux/kgdb.h
+++ b/include/linux/kgdb.h
@@ -283,7 +283,7 @@ struct kgdb_io {
extern struct kgdb_arch arch_kgdb_ops;
-extern unsigned long __weak kgdb_arch_pc(int exception, struct pt_regs *regs);
+extern unsigned long kgdb_arch_pc(int exception, struct pt_regs *regs);
#ifdef CONFIG_SERIAL_KGDB_NMI
extern int kgdb_register_nmi_console(void);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bjorn Helgaas <[email protected]>
commit 96a2adbc6f501996418da9f7afe39bf0e4d006a9 upstream.
kernel/time/jiffies.c provides a default clocksource_default_clock()
definition explicitly marked "weak". arch/s390 provides its own definition
intended to override the default, but the "weak" attribute on the
declaration applied to the s390 definition as well, so the linker chose one
based on link order (see 10629d711ed7 ("PCI: Remove __weak annotation from
pcibios_get_phb_of_node decl")).
Remove the "weak" attribute from the clocksource_default_clock()
declaration so we always prefer a non-weak definition over the weak one,
independent of link order.
Fixes: f1b82746c1e9 ("clocksource: Cleanup clocksource selection")
Signed-off-by: Bjorn Helgaas <[email protected]>
Acked-by: John Stultz <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
CC: Daniel Lezcano <[email protected]>
CC: Martin Schwidefsky <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/clocksource.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -287,7 +287,7 @@ extern struct clocksource* clocksource_g
extern void clocksource_change_rating(struct clocksource *cs, int rating);
extern void clocksource_suspend(void);
extern void clocksource_resume(void);
-extern struct clocksource * __init __weak clocksource_default_clock(void);
+extern struct clocksource * __init clocksource_default_clock(void);
extern void clocksource_mark_unstable(struct clocksource *cs);
extern u64
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai <[email protected]>
commit e4742b1e786ca386e88e6cfb2801e14e15e365cd upstream.
The new Lenovo T440s laptop has a different PnP ID "LEN0039", and it
needs a similar min/max quirk to make its clickpad work.
BugLink: https://bugzilla.opensuse.org/show_bug.cgi?id=903748
Reported-and-tested-by: Joschi Brauchle <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Dmitry Torokhov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/input/mouse/synaptics.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/input/mouse/synaptics.c
+++ b/drivers/input/mouse/synaptics.c
@@ -135,8 +135,8 @@ static const struct min_max_quirk min_ma
1232, 5710, 1156, 4696
},
{
- (const char * const []){"LEN0034", "LEN0036", "LEN2002",
- "LEN2004", NULL},
+ (const char * const []){"LEN0034", "LEN0036", "LEN0039",
+ "LEN2002", "LEN2004", NULL},
1024, 5112, 2024, 4832
},
{
@@ -163,6 +163,7 @@ static const char * const topbuttonpad_p
"LEN0036", /* T440 */
"LEN0037",
"LEN0038",
+ "LEN0039", /* T440s */
"LEN0041",
"LEN0042", /* Yoga */
"LEN0045",
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Luciano Coelho <[email protected]>
commit 84469a45a1bedec9918e94ab2f78c5dc0739e4a7 upstream.
If we are switching from an HT40+ to an HT40- channel (or vice versa),
we need the secondary channel offset IE to specify the post-CSA offset
to be used. This applies both to beacons and to probe responses.
In ieee80211_parse_ch_switch_ie() we were ignoring this IE in beacons
and using the *current* HT information IE instead, which caused us to
keep using the same offset as before the switch.
Fix that by using the secondary channel offset IE for beacons as well,
and never use the pre-switch offset. Additionally, remove the "beacon"
argument from ieee80211_parse_ch_switch_ie(), since it is no longer
needed.
Reported-by: Jouni Malinen <[email protected]>
Signed-off-by: Luciano Coelho <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/mac80211/ibss.c | 2 +-
net/mac80211/ieee80211_i.h | 3 +--
net/mac80211/mesh.c | 2 +-
net/mac80211/mlme.c | 2 +-
net/mac80211/spectmgmt.c | 18 ++++++------------
5 files changed, 10 insertions(+), 17 deletions(-)
--- a/net/mac80211/ibss.c
+++ b/net/mac80211/ibss.c
@@ -804,7 +804,7 @@ ieee80211_ibss_process_chanswitch(struct
memset(&params, 0, sizeof(params));
memset(&csa_ie, 0, sizeof(csa_ie));
- err = ieee80211_parse_ch_switch_ie(sdata, elems, beacon,
+ err = ieee80211_parse_ch_switch_ie(sdata, elems,
ifibss->chandef.chan->band,
sta_flags, ifibss->bssid, &csa_ie);
/* can't switch to destination channel, fail */
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -1639,7 +1639,6 @@ void ieee80211_process_measurement_req(s
* ieee80211_parse_ch_switch_ie - parses channel switch IEs
* @sdata: the sdata of the interface which has received the frame
* @elems: parsed 802.11 elements received with the frame
- * @beacon: indicates if the frame was a beacon or probe response
* @current_band: indicates the current band
* @sta_flags: contains information about own capabilities and restrictions
* to decide which channel switch announcements can be accepted. Only the
@@ -1653,7 +1652,7 @@ void ieee80211_process_measurement_req(s
* Return: 0 on success, <0 on error and >0 if there is nothing to parse.
*/
int ieee80211_parse_ch_switch_ie(struct ieee80211_sub_if_data *sdata,
- struct ieee802_11_elems *elems, bool beacon,
+ struct ieee802_11_elems *elems,
enum ieee80211_band current_band,
u32 sta_flags, u8 *bssid,
struct ieee80211_csa_ie *csa_ie);
--- a/net/mac80211/mesh.c
+++ b/net/mac80211/mesh.c
@@ -874,7 +874,7 @@ ieee80211_mesh_process_chnswitch(struct
memset(&params, 0, sizeof(params));
memset(&csa_ie, 0, sizeof(csa_ie));
- err = ieee80211_parse_ch_switch_ie(sdata, elems, beacon, band,
+ err = ieee80211_parse_ch_switch_ie(sdata, elems, band,
sta_flags, sdata->vif.addr,
&csa_ie);
if (err < 0)
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -1058,7 +1058,7 @@ ieee80211_sta_process_chanswitch(struct
current_band = cbss->channel->band;
memset(&csa_ie, 0, sizeof(csa_ie));
- res = ieee80211_parse_ch_switch_ie(sdata, elems, beacon, current_band,
+ res = ieee80211_parse_ch_switch_ie(sdata, elems, current_band,
ifmgd->flags,
ifmgd->associated->bssid, &csa_ie);
if (res < 0)
--- a/net/mac80211/spectmgmt.c
+++ b/net/mac80211/spectmgmt.c
@@ -22,7 +22,7 @@
#include "wme.h"
int ieee80211_parse_ch_switch_ie(struct ieee80211_sub_if_data *sdata,
- struct ieee802_11_elems *elems, bool beacon,
+ struct ieee802_11_elems *elems,
enum ieee80211_band current_band,
u32 sta_flags, u8 *bssid,
struct ieee80211_csa_ie *csa_ie)
@@ -91,19 +91,13 @@ int ieee80211_parse_ch_switch_ie(struct
return -EINVAL;
}
- if (!beacon && sec_chan_offs) {
+ if (sec_chan_offs) {
secondary_channel_offset = sec_chan_offs->sec_chan_offs;
- } else if (beacon && ht_oper) {
- secondary_channel_offset =
- ht_oper->ht_param & IEEE80211_HT_PARAM_CHA_SEC_OFFSET;
} else if (!(sta_flags & IEEE80211_STA_DISABLE_HT)) {
- /* If it's not a beacon, HT is enabled and the IE not present,
- * it's 20 MHz, 802.11-2012 8.5.2.6:
- * This element [the Secondary Channel Offset Element] is
- * present when switching to a 40 MHz channel. It may be
- * present when switching to a 20 MHz channel (in which
- * case the secondary channel offset is set to SCN).
- */
+ /* If the secondary channel offset IE is not present,
+ * we can't know what's the post-CSA offset, so the
+ * best we can do is use 20MHz.
+ */
secondary_channel_offset = IEEE80211_HT_PARAM_CHA_SEC_NONE;
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Anish Bhatt <[email protected]>
commit 3bb062613b1ecbd0c388106f61344d699f7859ec upstream.
Disabling DCBx in firmware automatically enables DCBx for control via host
lldp agents. Wait for an explicit setstate call from an lldp agent to enable
DCBx instead.
Fixes: 76bcb31efc06 ("cxgb4 : Add DCBx support codebase and dcbnl_ops")
Signed-off-by: Anish Bhatt <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c | 7 ++++++-
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 6 +++++-
2 files changed, 11 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c
@@ -80,7 +80,6 @@ void cxgb4_dcb_state_fsm(struct net_devi
/* we're going to use Host DCB */
dcb->state = CXGB4_DCB_STATE_HOST;
dcb->supported = CXGB4_DCBX_HOST_SUPPORT;
- dcb->enabled = 1;
break;
}
@@ -349,6 +348,12 @@ static u8 cxgb4_setstate(struct net_devi
{
struct port_info *pi = netdev2pinfo(dev);
+ /* If DCBx is host-managed, dcb is enabled by outside lldp agents */
+ if (pi->dcb.state == CXGB4_DCB_STATE_HOST) {
+ pi->dcb.enabled = enabled;
+ return 0;
+ }
+
/* Firmware doesn't provide any mechanism to control the DCB state.
*/
if (enabled != (pi->dcb.state == CXGB4_DCB_STATE_FW_ALLSYNCED))
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -688,7 +688,11 @@ int cxgb4_dcb_enabled(const struct net_d
#ifdef CONFIG_CHELSIO_T4_DCB
struct port_info *pi = netdev_priv(dev);
- return pi->dcb.state == CXGB4_DCB_STATE_FW_ALLSYNCED;
+ if (!pi->dcb.enabled)
+ return 0;
+
+ return ((pi->dcb.state == CXGB4_DCB_STATE_FW_ALLSYNCED) ||
+ (pi->dcb.state == CXGB4_DCB_STATE_HOST));
#else
return 0;
#endif
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Fainelli <[email protected]>
commit 8edf0047f4b8e03d94ef88f5a7dec146cce03a06 upstream.
There is currently a small window during which the SYSTEMPORT adapter
enables its RX interrupts without having enabled its NAPI handler, which
can result in packets being discarded during interface bringup.
A similar but more serious window exists in bcm_sysport_resume(), during
which the RDMA engine may not be fully prepared to receive packets while
RX interrupts are already enabled.
Fix this by moving the RX interrupt enable down to
bcm_sysport_netif_start(), after napi_enable() for the RX path is called,
which fixes both call sites: bcm_sysport_open() and
bcm_sysport_resume().
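Condensed from the diff below, the resulting bring-up order in
bcm_sysport_netif_start() is (sketch only; identifiers as in the driver):

        /* Enable NAPI first, so the RX path is able to run... */
        napi_enable(&priv->napi);

        /* ...and only then unmask the RX and TX-ring-full interrupts */
        intrl2_0_mask_clear(priv, INTRL2_0_RDMA_MBDONE | INTRL2_0_TX_RING_FULL);

        phy_start(priv->phydev);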
Fixes: b02e6d9ba7ad ("net: systemport: add bcm_sysport_netif_{enable,stop}")
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/broadcom/bcmsysport.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -1384,6 +1384,9 @@ static void bcm_sysport_netif_start(stru
/* Enable NAPI */
napi_enable(&priv->napi);
+ /* Enable RX interrupt and TX ring full interrupt */
+ intrl2_0_mask_clear(priv, INTRL2_0_RDMA_MBDONE | INTRL2_0_TX_RING_FULL);
+
phy_start(priv->phydev);
/* Enable TX interrupts for the 32 TXQs */
@@ -1486,9 +1489,6 @@ static int bcm_sysport_open(struct net_d
if (ret)
goto out_free_rx_ring;
- /* Enable RX interrupt and TX ring full interrupt */
- intrl2_0_mask_clear(priv, INTRL2_0_RDMA_MBDONE | INTRL2_0_TX_RING_FULL);
-
/* Turn on TDMA */
ret = tdma_enable_set(priv, 1);
if (ret)
@@ -1872,9 +1872,6 @@ static int bcm_sysport_resume(struct dev
netif_device_attach(dev);
- /* Enable RX interrupt and TX ring full interrupt */
- intrl2_0_mask_clear(priv, INTRL2_0_RDMA_MBDONE | INTRL2_0_TX_RING_FULL);
-
/* RX pipe enable */
topctrl_writel(priv, 0, RX_FLUSH_CNTL);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Fainelli <[email protected]>
commit 704d33e7006f20f9b4fa7d24a0f08c4b5919b131 upstream.
bcm_sysport_resume() was missing a UniMAC reset, which can lead to
various receive FIFO corruptions when coming out of a suspend cycle. If
the RX FIFO is stuck, it will deliver corrupted/duplicate packets towards
the host CPU interface.
This could be reproduced on a crowded network with Wake-on-LAN enabled
for this particular interface, because the switch still forwards packets
towards the host CPU interface (SYSTEMPORT) and we had to leave the
UniMAC RX enable bit on to allow matching MagicPackets.
Once we re-enter the resume function, there is a small window during
which UniMAC receive is still enabled and we start queueing packets, but
the RDMA and RBUF engines are not yet ready; this leads to packets being
stuck in the UniMAC RX FIFO and ultimately delivered to the host CPU
corrupted.
Fixes: 40755a0fce17 ("net: systemport: add suspend and resume support")
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/broadcom/bcmsysport.c | 2 ++
1 file changed, 2 insertions(+)
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -1845,6 +1845,8 @@ static int bcm_sysport_resume(struct dev
if (!netif_running(dev))
return 0;
+ umac_reset(priv);
+
/* We may have been suspended and never received a WOL event that
* would turn off MPD detection, take care of that now
*/
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bjorn Helgaas <[email protected]>
commit e0a8400c6923a163265d52798cdd4c33f3f8ab5a upstream.
drivers/base/memory.c provides a default memory_block_size_bytes()
definition explicitly marked "weak". Several architectures provide their
own definitions intended to override the default, but the "weak" attribute
on the declaration applied to the arch definitions as well, so the linker
chose one based on link order (see 10629d711ed7 ("PCI: Remove __weak
annotation from pcibios_get_phb_of_node decl")).
Remove the "weak" attribute from the declaration so we always prefer a
non-weak definition over the weak one, independent of link order.
Fixes: 41f107266b19 ("drivers: base: Add prototype declaration to the header file")
Signed-off-by: Bjorn Helgaas <[email protected]>
Acked-by: Andrew Morton <[email protected]>
CC: Rashika Kheria <[email protected]>
CC: Nathan Fontenot <[email protected]>
CC: Anton Blanchard <[email protected]>
CC: Heiko Carstens <[email protected]>
CC: Yinghai Lu <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/memory.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -35,7 +35,7 @@ struct memory_block {
};
int arch_get_memory_phys_device(unsigned long start_pfn);
-unsigned long __weak memory_block_size_bytes(void);
+unsigned long memory_block_size_bytes(void);
/* These states are exposed to userspace as text strings in sysfs */
#define MEM_ONLINE (1<<0) /* exposed to userspace */
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilya Dryomov <[email protected]>
commit aaef31703a0cf6a733e651885bfb49edc3ac6774 upstream.
Large (greater than 32k, the value of PAGE_ALLOC_COSTLY_ORDER) auth
tickets will have their buffers vmalloc'ed, which leads to the
following crash in crypto:
[ 28.685082] BUG: unable to handle kernel paging request at ffffeb04000032c0
[ 28.686032] IP: [<ffffffff81392b42>] scatterwalk_pagedone+0x22/0x80
[ 28.686032] PGD 0
[ 28.688088] Oops: 0000 [#1] PREEMPT SMP
[ 28.688088] Modules linked in:
[ 28.688088] CPU: 0 PID: 878 Comm: kworker/0:2 Not tainted 3.17.0-vm+ #305
[ 28.688088] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[ 28.688088] Workqueue: ceph-msgr con_work
[ 28.688088] task: ffff88011a7f9030 ti: ffff8800d903c000 task.ti: ffff8800d903c000
[ 28.688088] RIP: 0010:[<ffffffff81392b42>] [<ffffffff81392b42>] scatterwalk_pagedone+0x22/0x80
[ 28.688088] RSP: 0018:ffff8800d903f688 EFLAGS: 00010286
[ 28.688088] RAX: ffffeb04000032c0 RBX: ffff8800d903f718 RCX: ffffeb04000032c0
[ 28.688088] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8800d903f750
[ 28.688088] RBP: ffff8800d903f688 R08: 00000000000007de R09: ffff8800d903f880
[ 28.688088] R10: 18df467c72d6257b R11: 0000000000000000 R12: 0000000000000010
[ 28.688088] R13: ffff8800d903f750 R14: ffff8800d903f8a0 R15: 0000000000000000
[ 28.688088] FS: 00007f50a41c7700(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
[ 28.688088] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 28.688088] CR2: ffffeb04000032c0 CR3: 00000000da3f3000 CR4: 00000000000006b0
[ 28.688088] Stack:
[ 28.688088] ffff8800d903f698 ffffffff81392ca8 ffff8800d903f6e8 ffffffff81395d32
[ 28.688088] ffff8800dac96000 ffff880000000000 ffff8800d903f980 ffff880119b7e020
[ 28.688088] ffff880119b7e010 0000000000000000 0000000000000010 0000000000000010
[ 28.688088] Call Trace:
[ 28.688088] [<ffffffff81392ca8>] scatterwalk_done+0x38/0x40
[ 28.688088] [<ffffffff81392ca8>] scatterwalk_done+0x38/0x40
[ 28.688088] [<ffffffff81395d32>] blkcipher_walk_done+0x182/0x220
[ 28.688088] [<ffffffff813990bf>] crypto_cbc_encrypt+0x15f/0x180
[ 28.688088] [<ffffffff81399780>] ? crypto_aes_set_key+0x30/0x30
[ 28.688088] [<ffffffff8156c40c>] ceph_aes_encrypt2+0x29c/0x2e0
[ 28.688088] [<ffffffff8156d2a3>] ceph_encrypt2+0x93/0xb0
[ 28.688088] [<ffffffff8156d7da>] ceph_x_encrypt+0x4a/0x60
[ 28.688088] [<ffffffff8155b39d>] ? ceph_buffer_new+0x5d/0xf0
[ 28.688088] [<ffffffff8156e837>] ceph_x_build_authorizer.isra.6+0x297/0x360
[ 28.688088] [<ffffffff8112089b>] ? kmem_cache_alloc_trace+0x11b/0x1c0
[ 28.688088] [<ffffffff8156b496>] ? ceph_auth_create_authorizer+0x36/0x80
[ 28.688088] [<ffffffff8156ed83>] ceph_x_create_authorizer+0x63/0xd0
[ 28.688088] [<ffffffff8156b4b4>] ceph_auth_create_authorizer+0x54/0x80
[ 28.688088] [<ffffffff8155f7c0>] get_authorizer+0x80/0xd0
[ 28.688088] [<ffffffff81555a8b>] prepare_write_connect+0x18b/0x2b0
[ 28.688088] [<ffffffff81559289>] try_read+0x1e59/0x1f10
This is because we set up crypto scatterlists as if all buffers were
kmalloc'ed. Fix it.
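For context (hedged sketch, not the patch itself): sg_set_buf() resolves the
backing page with virt_to_page(), which is only valid for directly mapped
(kmalloc'ed) memory. A vmalloc'ed buffer is virtually contiguous but
physically scattered, so every page has to be looked up individually, which
is roughly what the new setup_sgtable() helper below does:

/* Illustrative only: map 'buf' into a scatterlist already sized and
 * initialized by the caller (sg_init_table()), one entry per page.
 * sketch_buf_to_sg() is not a kernel function; error handling is omitted.
 * Assumes <linux/scatterlist.h>, <linux/vmalloc.h> and <linux/mm.h>.
 */
static void sketch_buf_to_sg(struct scatterlist *sg, const void *buf,
                             unsigned int buf_len)
{
        while (buf_len) {
                unsigned int off = offset_in_page(buf);
                unsigned int len = min(buf_len,
                                       (unsigned int)PAGE_SIZE - off);
                struct page *page = is_vmalloc_addr(buf) ?
                                    vmalloc_to_page(buf) : /* per-page lookup */
                                    virt_to_page(buf);     /* direct mapping */

                sg_set_page(sg++, page, len, off);
                buf += len;
                buf_len -= len;
        }
}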
Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Sage Weil <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ceph/crypto.c | 169 ++++++++++++++++++++++++++++++++++++++++++------------
1 file changed, 132 insertions(+), 37 deletions(-)
--- a/net/ceph/crypto.c
+++ b/net/ceph/crypto.c
@@ -90,11 +90,82 @@ static struct crypto_blkcipher *ceph_cry
static const u8 *aes_iv = (u8 *)CEPH_AES_IV;
+/*
+ * Should be used for buffers allocated with ceph_kvmalloc().
+ * Currently these are encrypt out-buffer (ceph_buffer) and decrypt
+ * in-buffer (msg front).
+ *
+ * Dispose of @sgt with teardown_sgtable().
+ *
+ * @prealloc_sg is to avoid memory allocation inside sg_alloc_table()
+ * in cases where a single sg is sufficient. No attempt to reduce the
+ * number of sgs by squeezing physically contiguous pages together is
+ * made though, for simplicity.
+ */
+static int setup_sgtable(struct sg_table *sgt, struct scatterlist *prealloc_sg,
+ const void *buf, unsigned int buf_len)
+{
+ struct scatterlist *sg;
+ const bool is_vmalloc = is_vmalloc_addr(buf);
+ unsigned int off = offset_in_page(buf);
+ unsigned int chunk_cnt = 1;
+ unsigned int chunk_len = PAGE_ALIGN(off + buf_len);
+ int i;
+ int ret;
+
+ if (buf_len == 0) {
+ memset(sgt, 0, sizeof(*sgt));
+ return -EINVAL;
+ }
+
+ if (is_vmalloc) {
+ chunk_cnt = chunk_len >> PAGE_SHIFT;
+ chunk_len = PAGE_SIZE;
+ }
+
+ if (chunk_cnt > 1) {
+ ret = sg_alloc_table(sgt, chunk_cnt, GFP_NOFS);
+ if (ret)
+ return ret;
+ } else {
+ WARN_ON(chunk_cnt != 1);
+ sg_init_table(prealloc_sg, 1);
+ sgt->sgl = prealloc_sg;
+ sgt->nents = sgt->orig_nents = 1;
+ }
+
+ for_each_sg(sgt->sgl, sg, sgt->orig_nents, i) {
+ struct page *page;
+ unsigned int len = min(chunk_len - off, buf_len);
+
+ if (is_vmalloc)
+ page = vmalloc_to_page(buf);
+ else
+ page = virt_to_page(buf);
+
+ sg_set_page(sg, page, len, off);
+
+ off = 0;
+ buf += len;
+ buf_len -= len;
+ }
+ WARN_ON(buf_len != 0);
+
+ return 0;
+}
+
+static void teardown_sgtable(struct sg_table *sgt)
+{
+ if (sgt->orig_nents > 1)
+ sg_free_table(sgt);
+}
+
static int ceph_aes_encrypt(const void *key, int key_len,
void *dst, size_t *dst_len,
const void *src, size_t src_len)
{
- struct scatterlist sg_in[2], sg_out[1];
+ struct scatterlist sg_in[2], prealloc_sg;
+ struct sg_table sg_out;
struct crypto_blkcipher *tfm = ceph_crypto_alloc_cipher();
struct blkcipher_desc desc = { .tfm = tfm, .flags = 0 };
int ret;
@@ -110,16 +181,18 @@ static int ceph_aes_encrypt(const void *
*dst_len = src_len + zero_padding;
- crypto_blkcipher_setkey((void *)tfm, key, key_len);
sg_init_table(sg_in, 2);
sg_set_buf(&sg_in[0], src, src_len);
sg_set_buf(&sg_in[1], pad, zero_padding);
- sg_init_table(sg_out, 1);
- sg_set_buf(sg_out, dst, *dst_len);
+ ret = setup_sgtable(&sg_out, &prealloc_sg, dst, *dst_len);
+ if (ret)
+ goto out_tfm;
+
+ crypto_blkcipher_setkey((void *)tfm, key, key_len);
iv = crypto_blkcipher_crt(tfm)->iv;
ivsize = crypto_blkcipher_ivsize(tfm);
-
memcpy(iv, aes_iv, ivsize);
+
/*
print_hex_dump(KERN_ERR, "enc key: ", DUMP_PREFIX_NONE, 16, 1,
key, key_len, 1);
@@ -128,16 +201,22 @@ static int ceph_aes_encrypt(const void *
print_hex_dump(KERN_ERR, "enc pad: ", DUMP_PREFIX_NONE, 16, 1,
pad, zero_padding, 1);
*/
- ret = crypto_blkcipher_encrypt(&desc, sg_out, sg_in,
+ ret = crypto_blkcipher_encrypt(&desc, sg_out.sgl, sg_in,
src_len + zero_padding);
- crypto_free_blkcipher(tfm);
- if (ret < 0)
+ if (ret < 0) {
pr_err("ceph_aes_crypt failed %d\n", ret);
+ goto out_sg;
+ }
/*
print_hex_dump(KERN_ERR, "enc out: ", DUMP_PREFIX_NONE, 16, 1,
dst, *dst_len, 1);
*/
- return 0;
+
+out_sg:
+ teardown_sgtable(&sg_out);
+out_tfm:
+ crypto_free_blkcipher(tfm);
+ return ret;
}
static int ceph_aes_encrypt2(const void *key, int key_len, void *dst,
@@ -145,7 +224,8 @@ static int ceph_aes_encrypt2(const void
const void *src1, size_t src1_len,
const void *src2, size_t src2_len)
{
- struct scatterlist sg_in[3], sg_out[1];
+ struct scatterlist sg_in[3], prealloc_sg;
+ struct sg_table sg_out;
struct crypto_blkcipher *tfm = ceph_crypto_alloc_cipher();
struct blkcipher_desc desc = { .tfm = tfm, .flags = 0 };
int ret;
@@ -161,17 +241,19 @@ static int ceph_aes_encrypt2(const void
*dst_len = src1_len + src2_len + zero_padding;
- crypto_blkcipher_setkey((void *)tfm, key, key_len);
sg_init_table(sg_in, 3);
sg_set_buf(&sg_in[0], src1, src1_len);
sg_set_buf(&sg_in[1], src2, src2_len);
sg_set_buf(&sg_in[2], pad, zero_padding);
- sg_init_table(sg_out, 1);
- sg_set_buf(sg_out, dst, *dst_len);
+ ret = setup_sgtable(&sg_out, &prealloc_sg, dst, *dst_len);
+ if (ret)
+ goto out_tfm;
+
+ crypto_blkcipher_setkey((void *)tfm, key, key_len);
iv = crypto_blkcipher_crt(tfm)->iv;
ivsize = crypto_blkcipher_ivsize(tfm);
-
memcpy(iv, aes_iv, ivsize);
+
/*
print_hex_dump(KERN_ERR, "enc key: ", DUMP_PREFIX_NONE, 16, 1,
key, key_len, 1);
@@ -182,23 +264,30 @@ static int ceph_aes_encrypt2(const void
print_hex_dump(KERN_ERR, "enc pad: ", DUMP_PREFIX_NONE, 16, 1,
pad, zero_padding, 1);
*/
- ret = crypto_blkcipher_encrypt(&desc, sg_out, sg_in,
+ ret = crypto_blkcipher_encrypt(&desc, sg_out.sgl, sg_in,
src1_len + src2_len + zero_padding);
- crypto_free_blkcipher(tfm);
- if (ret < 0)
+ if (ret < 0) {
pr_err("ceph_aes_crypt2 failed %d\n", ret);
+ goto out_sg;
+ }
/*
print_hex_dump(KERN_ERR, "enc out: ", DUMP_PREFIX_NONE, 16, 1,
dst, *dst_len, 1);
*/
- return 0;
+
+out_sg:
+ teardown_sgtable(&sg_out);
+out_tfm:
+ crypto_free_blkcipher(tfm);
+ return ret;
}
static int ceph_aes_decrypt(const void *key, int key_len,
void *dst, size_t *dst_len,
const void *src, size_t src_len)
{
- struct scatterlist sg_in[1], sg_out[2];
+ struct sg_table sg_in;
+ struct scatterlist sg_out[2], prealloc_sg;
struct crypto_blkcipher *tfm = ceph_crypto_alloc_cipher();
struct blkcipher_desc desc = { .tfm = tfm };
char pad[16];
@@ -210,16 +299,16 @@ static int ceph_aes_decrypt(const void *
if (IS_ERR(tfm))
return PTR_ERR(tfm);
- crypto_blkcipher_setkey((void *)tfm, key, key_len);
- sg_init_table(sg_in, 1);
sg_init_table(sg_out, 2);
- sg_set_buf(sg_in, src, src_len);
sg_set_buf(&sg_out[0], dst, *dst_len);
sg_set_buf(&sg_out[1], pad, sizeof(pad));
+ ret = setup_sgtable(&sg_in, &prealloc_sg, src, src_len);
+ if (ret)
+ goto out_tfm;
+ crypto_blkcipher_setkey((void *)tfm, key, key_len);
iv = crypto_blkcipher_crt(tfm)->iv;
ivsize = crypto_blkcipher_ivsize(tfm);
-
memcpy(iv, aes_iv, ivsize);
/*
@@ -228,12 +317,10 @@ static int ceph_aes_decrypt(const void *
print_hex_dump(KERN_ERR, "dec in: ", DUMP_PREFIX_NONE, 16, 1,
src, src_len, 1);
*/
-
- ret = crypto_blkcipher_decrypt(&desc, sg_out, sg_in, src_len);
- crypto_free_blkcipher(tfm);
+ ret = crypto_blkcipher_decrypt(&desc, sg_out, sg_in.sgl, src_len);
if (ret < 0) {
pr_err("ceph_aes_decrypt failed %d\n", ret);
- return ret;
+ goto out_sg;
}
if (src_len <= *dst_len)
@@ -251,7 +338,12 @@ static int ceph_aes_decrypt(const void *
print_hex_dump(KERN_ERR, "dec out: ", DUMP_PREFIX_NONE, 16, 1,
dst, *dst_len, 1);
*/
- return 0;
+
+out_sg:
+ teardown_sgtable(&sg_in);
+out_tfm:
+ crypto_free_blkcipher(tfm);
+ return ret;
}
static int ceph_aes_decrypt2(const void *key, int key_len,
@@ -259,7 +351,8 @@ static int ceph_aes_decrypt2(const void
void *dst2, size_t *dst2_len,
const void *src, size_t src_len)
{
- struct scatterlist sg_in[1], sg_out[3];
+ struct sg_table sg_in;
+ struct scatterlist sg_out[3], prealloc_sg;
struct crypto_blkcipher *tfm = ceph_crypto_alloc_cipher();
struct blkcipher_desc desc = { .tfm = tfm };
char pad[16];
@@ -271,17 +364,17 @@ static int ceph_aes_decrypt2(const void
if (IS_ERR(tfm))
return PTR_ERR(tfm);
- sg_init_table(sg_in, 1);
- sg_set_buf(sg_in, src, src_len);
sg_init_table(sg_out, 3);
sg_set_buf(&sg_out[0], dst1, *dst1_len);
sg_set_buf(&sg_out[1], dst2, *dst2_len);
sg_set_buf(&sg_out[2], pad, sizeof(pad));
+ ret = setup_sgtable(&sg_in, &prealloc_sg, src, src_len);
+ if (ret)
+ goto out_tfm;
crypto_blkcipher_setkey((void *)tfm, key, key_len);
iv = crypto_blkcipher_crt(tfm)->iv;
ivsize = crypto_blkcipher_ivsize(tfm);
-
memcpy(iv, aes_iv, ivsize);
/*
@@ -290,12 +383,10 @@ static int ceph_aes_decrypt2(const void
print_hex_dump(KERN_ERR, "dec in: ", DUMP_PREFIX_NONE, 16, 1,
src, src_len, 1);
*/
-
- ret = crypto_blkcipher_decrypt(&desc, sg_out, sg_in, src_len);
- crypto_free_blkcipher(tfm);
+ ret = crypto_blkcipher_decrypt(&desc, sg_out, sg_in.sgl, src_len);
if (ret < 0) {
pr_err("ceph_aes_decrypt failed %d\n", ret);
- return ret;
+ goto out_sg;
}
if (src_len <= *dst1_len)
@@ -325,7 +416,11 @@ static int ceph_aes_decrypt2(const void
dst2, *dst2_len, 1);
*/
- return 0;
+out_sg:
+ teardown_sgtable(&sg_in);
+out_tfm:
+ crypto_free_blkcipher(tfm);
+ return ret;
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust <[email protected]>
commit 869f9dfa4d6d57b79e0afc3af14772c2a023eeb1 upstream.
Any attempt to call nfs_remove_bad_delegation() while a delegation is being
returned is currently a no-op. This means that we can end up looping
forever in nfs_end_delegation_return() if something causes the delegation
to be revoked.
This patch adds a mechanism whereby the state recovery code can communicate
to the delegation return code that the delegation is no longer valid and
that it should not be used when reclaiming state.
It also changes the return value for nfs4_handle_delegation_recall_error()
to ensure that nfs_end_delegation_return() does not reattempt the lock
reclaim before state recovery is done.
http://lkml.kernel.org/r/CAN-5tyHwG=Cn2Q9KsHWadewjpTTy_K26ee+UnSvHvG4192p-Xw@mail.gmail.com
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/nfs/delegation.c | 23 +++++++++++++++++++++--
fs/nfs/delegation.h | 1 +
fs/nfs/nfs4proc.c | 2 +-
3 files changed, 23 insertions(+), 3 deletions(-)
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -195,7 +195,11 @@ static int nfs_do_return_delegation(stru
{
int res = 0;
- res = nfs4_proc_delegreturn(inode, delegation->cred, &delegation->stateid, issync);
+ if (!test_bit(NFS_DELEGATION_REVOKED, &delegation->flags))
+ res = nfs4_proc_delegreturn(inode,
+ delegation->cred,
+ &delegation->stateid,
+ issync);
nfs_free_delegation(delegation);
return res;
}
@@ -382,11 +386,13 @@ static int nfs_end_delegation_return(str
{
struct nfs_client *clp = NFS_SERVER(inode)->nfs_client;
struct nfs_inode *nfsi = NFS_I(inode);
- int err;
+ int err = 0;
if (delegation == NULL)
return 0;
do {
+ if (test_bit(NFS_DELEGATION_REVOKED, &delegation->flags))
+ break;
err = nfs_delegation_claim_opens(inode, &delegation->stateid);
if (!issync || err != -EAGAIN)
break;
@@ -607,10 +613,23 @@ static void nfs_client_mark_return_unuse
rcu_read_unlock();
}
+static void nfs_revoke_delegation(struct inode *inode)
+{
+ struct nfs_delegation *delegation;
+ rcu_read_lock();
+ delegation = rcu_dereference(NFS_I(inode)->delegation);
+ if (delegation != NULL) {
+ set_bit(NFS_DELEGATION_REVOKED, &delegation->flags);
+ nfs_mark_return_delegation(NFS_SERVER(inode), delegation);
+ }
+ rcu_read_unlock();
+}
+
void nfs_remove_bad_delegation(struct inode *inode)
{
struct nfs_delegation *delegation;
+ nfs_revoke_delegation(inode);
delegation = nfs_inode_detach_delegation(inode);
if (delegation) {
nfs_inode_find_state_and_recover(inode, &delegation->stateid);
--- a/fs/nfs/delegation.h
+++ b/fs/nfs/delegation.h
@@ -31,6 +31,7 @@ enum {
NFS_DELEGATION_RETURN_IF_CLOSED,
NFS_DELEGATION_REFERENCED,
NFS_DELEGATION_RETURNING,
+ NFS_DELEGATION_REVOKED,
};
int nfs_inode_set_delegation(struct inode *inode, struct rpc_cred *cred, struct nfs_openres *res);
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1642,7 +1642,7 @@ static int nfs4_handle_delegation_recall
nfs_inode_find_state_and_recover(state->inode,
stateid);
nfs4_schedule_stateid_recovery(server, state);
- return 0;
+ return -EAGAIN;
case -NFS4ERR_DELAY:
case -NFS4ERR_GRACE:
set_bit(NFS_DELEGATED_STATE, &state->flags);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust <[email protected]>
commit 0c116cadd94b16b30b1dd90d38b2784d9b39b01a upstream.
This patch removes the assumption made previously, that we only need to
check the delegation stateid when it matches the stateid on a cached
open.
If we believe that we hold a delegation for this file, then we must assume
that its stateid may have been revoked or expired too. If we don't test it
then our state recovery process may end up caching open/lock state in a
situation where it should not.
We therefore rename the function nfs41_clear_delegation_stateid as
nfs41_check_delegation_stateid, and change it to always run through the
delegation stateid test and recovery process as outlined in RFC5661.
http://lkml.kernel.org/r/CAN-5tyHwG=Cn2Q9KsHWadewjpTTy_K26ee+UnSvHvG4192p-Xw@mail.gmail.com
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/nfs/nfs4proc.c | 42 +++++++++++++++++-------------------------
1 file changed, 17 insertions(+), 25 deletions(-)
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2120,45 +2120,37 @@ static int nfs40_open_expired(struct nfs
}
#if defined(CONFIG_NFS_V4_1)
-static void nfs41_clear_delegation_stateid(struct nfs4_state *state)
+static void nfs41_check_delegation_stateid(struct nfs4_state *state)
{
struct nfs_server *server = NFS_SERVER(state->inode);
- nfs4_stateid *stateid = &state->stateid;
+ nfs4_stateid stateid;
struct nfs_delegation *delegation;
- struct rpc_cred *cred = NULL;
- int status = -NFS4ERR_BAD_STATEID;
-
- /* If a state reset has been done, test_stateid is unneeded */
- if (test_bit(NFS_DELEGATED_STATE, &state->flags) == 0)
- return;
+ struct rpc_cred *cred;
+ int status;
/* Get the delegation credential for use by test/free_stateid */
rcu_read_lock();
delegation = rcu_dereference(NFS_I(state->inode)->delegation);
- if (delegation != NULL &&
- nfs4_stateid_match(&delegation->stateid, stateid)) {
- cred = get_rpccred(delegation->cred);
- rcu_read_unlock();
- status = nfs41_test_stateid(server, stateid, cred);
- trace_nfs4_test_delegation_stateid(state, NULL, status);
- } else
+ if (delegation == NULL) {
rcu_read_unlock();
+ return;
+ }
+
+ nfs4_stateid_copy(&stateid, &delegation->stateid);
+ cred = get_rpccred(delegation->cred);
+ rcu_read_unlock();
+ status = nfs41_test_stateid(server, &stateid, cred);
+ trace_nfs4_test_delegation_stateid(state, NULL, status);
if (status != NFS_OK) {
/* Free the stateid unless the server explicitly
* informs us the stateid is unrecognized. */
if (status != -NFS4ERR_BAD_STATEID)
- nfs41_free_stateid(server, stateid, cred);
- nfs_remove_bad_delegation(state->inode);
-
- write_seqlock(&state->seqlock);
- nfs4_stateid_copy(&state->stateid, &state->open_stateid);
- write_sequnlock(&state->seqlock);
- clear_bit(NFS_DELEGATED_STATE, &state->flags);
+ nfs41_free_stateid(server, &stateid, cred);
+ nfs_finish_clear_delegation_stateid(state);
}
- if (cred != NULL)
- put_rpccred(cred);
+ put_rpccred(cred);
}
/**
@@ -2202,7 +2194,7 @@ static int nfs41_open_expired(struct nfs
{
int status;
- nfs41_clear_delegation_stateid(state);
+ nfs41_check_delegation_stateid(state);
status = nfs41_check_open_stateid(state);
if (status != NFS_OK)
status = nfs4_open_expired(sp, state);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter <[email protected]>
commit f2e323ec96077642d397bb1c355def536d489d16 upstream.
We need to add a limit check here so we don't overflow the buffer.
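As a rough standalone illustration of that kind of check (made-up buffer size and names, not the driver code): the caller-supplied length must fit in the space left after the fixed header before the memcpy() is allowed to run.

#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Reject any caller-supplied message that would overflow the fixed
 * command buffer, whose first 4 bytes are a header. */
static int send_cmd(const unsigned char *msg, size_t msg_len)
{
	unsigned char b[12] = { 0x00, 0xff, 0x00, 0x00 };

	if (msg_len > sizeof(b) - 4)	/* only sizeof(b) - 4 payload bytes fit */
		return -EINVAL;

	memcpy(&b[4], msg, msg_len);
	printf("sent %zu payload bytes\n", msg_len);
	return 0;
}

int main(void)
{
	unsigned char ok[6] = { 0xe0, 0x10, 0x38, 0xf0, 0x00, 0x00 };
	unsigned char too_long[16] = { 0 };

	printf("%d\n", send_cmd(ok, sizeof(ok)));		/* 0 */
	printf("%d\n", send_cmd(too_long, sizeof(too_long)));	/* -EINVAL */
	return 0;
}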
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/media/usb/ttusb-dec/ttusbdecfe.c | 3 +++
1 file changed, 3 insertions(+)
--- a/drivers/media/usb/ttusb-dec/ttusbdecfe.c
+++ b/drivers/media/usb/ttusb-dec/ttusbdecfe.c
@@ -156,6 +156,9 @@ static int ttusbdecfe_dvbs_diseqc_send_m
0x00, 0x00, 0x00, 0x00,
0x00, 0x00 };
+ if (cmd->msg_len > sizeof(b) - 4)
+ return -EINVAL;
+
memcpy(&b[4], cmd->msg, cmd->msg_len);
state->config->send_command(fe, 0x72,
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust <[email protected]>
commit f8ebf7a8ca35dde321f0cd385fee6f1950609367 upstream.
If state recovery failed, then we should not attempt to reclaim delegated
state.
http://lkml.kernel.org/r/CAN-5tyHwG=Cn2Q9KsHWadewjpTTy_K26ee+UnSvHvG4192p-Xw@mail.gmail.com
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/nfs/delegation.c | 2 ++
1 file changed, 2 insertions(+)
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -125,6 +125,8 @@ again:
continue;
if (!test_bit(NFS_DELEGATED_STATE, &state->flags))
continue;
+ if (!nfs4_valid_open_stateid(state))
+ continue;
if (!nfs4_stateid_match(&state->stateid, stateid))
continue;
get_nfs_open_context(ctx);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara <[email protected]>
commit 16caf5b6101d03335b386e77e9e14136f989be87 upstream.
Variable 'err' may not have been initialized by the time nfs_getattr() uses
it to check whether it should call generic_fillattr() or not. That can result
in spurious error returns. Initialize 'err' properly.
Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/nfs/inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -624,7 +624,7 @@ int nfs_getattr(struct vfsmount *mnt, st
{
struct inode *inode = dentry->d_inode;
int need_atime = NFS_I(inode)->cache_validity & NFS_INO_INVALID_ATIME;
- int err;
+ int err = 0;
trace_nfs_getattr_enter(inode);
/* Flush out writes to the server in order to update c/mtime. */
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust <[email protected]>
commit c606bb8857921d3ecf4d353942d6cc7e116cc75a upstream.
NFSv4.x (x>0) requires us to call TEST_STATEID+FREE_STATEID if a stateid is
revoked. We will currently fail to do this if the stateid is a delegation.
http://lkml.kernel.org/r/CAN-5tyHwG=Cn2Q9KsHWadewjpTTy_K26ee+UnSvHvG4192p-Xw@mail.gmail.com
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/nfs/filelayout/filelayout.c | 3 ---
fs/nfs/nfs4proc.c | 8 --------
2 files changed, 11 deletions(-)
--- a/fs/nfs/filelayout/filelayout.c
+++ b/fs/nfs/filelayout/filelayout.c
@@ -145,9 +145,6 @@ static int filelayout_async_handle_error
case -NFS4ERR_DELEG_REVOKED:
case -NFS4ERR_ADMIN_REVOKED:
case -NFS4ERR_BAD_STATEID:
- if (state == NULL)
- break;
- nfs_remove_bad_delegation(state->inode);
case -NFS4ERR_OPENMODE:
if (state == NULL)
break;
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -360,11 +360,6 @@ static int nfs4_handle_exception(struct
case -NFS4ERR_DELEG_REVOKED:
case -NFS4ERR_ADMIN_REVOKED:
case -NFS4ERR_BAD_STATEID:
- if (inode != NULL && nfs4_have_delegation(inode, FMODE_READ)) {
- nfs_remove_bad_delegation(inode);
- exception->retry = 1;
- break;
- }
if (state == NULL)
break;
ret = nfs4_schedule_stateid_recovery(server, state);
@@ -4849,9 +4844,6 @@ nfs4_async_handle_error(struct rpc_task
case -NFS4ERR_DELEG_REVOKED:
case -NFS4ERR_ADMIN_REVOKED:
case -NFS4ERR_BAD_STATEID:
- if (state == NULL)
- break;
- nfs_remove_bad_delegation(state->inode);
case -NFS4ERR_OPENMODE:
if (state == NULL)
break;
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust <[email protected]>
commit 4dfd4f7af0afd201706ad186352ca423b0f17d4b upstream.
NFSv4.0 does not have TEST_STATEID/FREE_STATEID functionality, so
unlike NFSv4.1, the recovery procedure when stateids have expired or
have been revoked requires us to just forget the delegation.
http://lkml.kernel.org/r/CAN-5tyHwG=Cn2Q9KsHWadewjpTTy_K26ee+UnSvHvG4192p-Xw@mail.gmail.com
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/nfs/nfs4proc.c | 24 +++++++++++++++++++++++-
1 file changed, 23 insertions(+), 1 deletion(-)
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2102,6 +2102,28 @@ static int nfs4_open_expired(struct nfs4
return ret;
}
+static void nfs_finish_clear_delegation_stateid(struct nfs4_state *state)
+{
+ nfs_remove_bad_delegation(state->inode);
+ write_seqlock(&state->seqlock);
+ nfs4_stateid_copy(&state->stateid, &state->open_stateid);
+ write_sequnlock(&state->seqlock);
+ clear_bit(NFS_DELEGATED_STATE, &state->flags);
+}
+
+static void nfs40_clear_delegation_stateid(struct nfs4_state *state)
+{
+ if (rcu_access_pointer(NFS_I(state->inode)->delegation) != NULL)
+ nfs_finish_clear_delegation_stateid(state);
+}
+
+static int nfs40_open_expired(struct nfs4_state_owner *sp, struct nfs4_state *state)
+{
+ /* NFSv4.0 doesn't allow for delegation recovery on open expire */
+ nfs40_clear_delegation_stateid(state);
+ return nfs4_open_expired(sp, state);
+}
+
#if defined(CONFIG_NFS_V4_1)
static void nfs41_clear_delegation_stateid(struct nfs4_state *state)
{
@@ -8366,7 +8388,7 @@ static const struct nfs4_state_recovery_
static const struct nfs4_state_recovery_ops nfs40_nograce_recovery_ops = {
.owner_flag_bit = NFS_OWNER_RECLAIM_NOGRACE,
.state_flag_bit = NFS_STATE_RECLAIM_NOGRACE,
- .recover_open = nfs4_open_expired,
+ .recover_open = nfs40_open_expired,
.recover_lock = nfs4_lock_expired,
.establish_clid = nfs4_init_clientid,
};
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: NeilBrown <[email protected]>
commit 45eaf45dfa4850df16bc2e8e7903d89021137f40 upstream.
md_check_recovery will skip any recovery and also clear
MD_RECOVERY_NEEDED if MD_RECOVERY_FROZEN is set.
So when we clear _FROZEN, we must set _NEEDED and ensure that
md_check_recovery gets run.
Otherwise we could miss out on something that is needed.
In particular, this can make it impossible to remove a
failed device from an array if the 'recovery-needed' processing
didn't happen.
Suitable for stable kernels since 3.13.
Reported-and-tested-by: Joe Lawrence <[email protected]>
Fixes: 30b8feb730f9b9b3c5de02580897da03f59b6b16
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/md/md.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5313,6 +5313,7 @@ static int md_set_readonly(struct mddev
printk("md: %s still in use.\n",mdname(mddev));
if (did_freeze) {
clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+ set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
md_wakeup_thread(mddev->thread);
}
err = -EBUSY;
@@ -5327,6 +5328,8 @@ static int md_set_readonly(struct mddev
mddev->ro = 1;
set_disk_ro(mddev->gendisk, 1);
clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+ set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
+ md_wakeup_thread(mddev->thread);
sysfs_notify_dirent_safe(mddev->sysfs_state);
err = 0;
}
@@ -5370,6 +5373,7 @@ static int do_md_stop(struct mddev * mdd
mutex_unlock(&mddev->open_mutex);
if (did_freeze) {
clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+ set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
md_wakeup_thread(mddev->thread);
}
return -EBUSY;
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heinz Mauelshagen <[email protected]>
commit 40d43c4b4cac4c2647bf07110d7b07d35f399a84 upstream.
The dm-raid superblock (struct dm_raid_superblock) is padded to 512
bytes and that size is being used to read it in from the metadata
device into one preallocated page.
Reading or writing this on a 512-byte sector device works fine but on
a 4096-byte sector device this fails.
Set the dm-raid superblock's size to the logical block size of the
metadata device, because IO at that size is guaranteed to work. Also
add a size check to avoid silent partial metadata loss in case the
superblock should ever grow past the logical block size or PAGE_SIZE.
[includes pointer math fix from Dan Carpenter]
Reported-by: "Liuhua Wang" <[email protected]>
Signed-off-by: Heinz Mauelshagen <[email protected]>
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/md/dm-raid.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -785,8 +785,7 @@ struct dm_raid_superblock {
__le32 layout;
__le32 stripe_sectors;
- __u8 pad[452]; /* Round struct to 512 bytes. */
- /* Always set to 0 when writing. */
+ /* Remainder of a logical block is zero-filled when writing (see super_sync()). */
} __packed;
static int read_disk_sb(struct md_rdev *rdev, int size)
@@ -823,7 +822,7 @@ static void super_sync(struct mddev *mdd
test_bit(Faulty, &(rs->dev[i].rdev.flags)))
failed_devices |= (1ULL << i);
- memset(sb, 0, sizeof(*sb));
+ memset(sb + 1, 0, rdev->sb_size - sizeof(*sb));
sb->magic = cpu_to_le32(DM_RAID_MAGIC);
sb->features = cpu_to_le32(0); /* No features yet */
@@ -858,7 +857,11 @@ static int super_load(struct md_rdev *rd
uint64_t events_sb, events_refsb;
rdev->sb_start = 0;
- rdev->sb_size = sizeof(*sb);
+ rdev->sb_size = bdev_logical_block_size(rdev->meta_bdev);
+ if (rdev->sb_size < sizeof(*sb) || rdev->sb_size > PAGE_SIZE) {
+ DMERR("superblock size of a logical block is no longer valid");
+ return -EINVAL;
+ }
ret = read_disk_sb(rdev, rdev->sb_size);
if (ret)
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johannes Berg <[email protected]>
commit 46238845bd609a5c0fbe076e1b82b4c5b33360b2 upstream.
When an interface is deleted, an ongoing hardware scan is canceled and
the driver must abort the scan, at the very least reporting completion
while the interface is removed.
However, reporting completion may only have scheduled the scan work, which
might run after everything is said and done; this leads to cfg80211 warning
that the scan isn't reported as finished yet. This is no fault of the
driver, which already reported it, but mac80211 hasn't processed it yet.
To fix this situation, flush the delayed work when the interface being
removed is the one that was executing the scan.
Reported-by: Sujith Manoharan <[email protected]>
Tested-by: Sujith Manoharan <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/mac80211/iface.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -765,10 +765,12 @@ static void ieee80211_do_stop(struct iee
int i, flushed;
struct ps_data *ps;
struct cfg80211_chan_def chandef;
+ bool cancel_scan;
clear_bit(SDATA_STATE_RUNNING, &sdata->state);
- if (rcu_access_pointer(local->scan_sdata) == sdata)
+ cancel_scan = rcu_access_pointer(local->scan_sdata) == sdata;
+ if (cancel_scan)
ieee80211_scan_cancel(local);
/*
@@ -990,6 +992,9 @@ static void ieee80211_do_stop(struct iee
ieee80211_recalc_ps(local, -1);
+ if (cancel_scan)
+ flush_delayed_work(&local->scan_work);
+
if (local->open_count == 0) {
ieee80211_stop_device(local);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Borislav Petkov <[email protected]>
commit c0a717f23dccdb6e3b03471bc846fdc636f2b353 upstream.
Save the patch while we're running on the BSP instead of later, before
the initrd has been jettisoned. More importantly, on 32-bit we need to
access the physical address instead of the virtual.
This way we actually do find it on the APs instead of having to go
through the initrd each time.
Tested-by: Richard Hendershot <[email protected]>
Fixes: 5335ba5cf475 ("x86, microcode, AMD: Fix early ucode loading")
Signed-off-by: Borislav Petkov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kernel/cpu/microcode/amd_early.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)
--- a/arch/x86/kernel/cpu/microcode/amd_early.c
+++ b/arch/x86/kernel/cpu/microcode/amd_early.c
@@ -108,12 +108,13 @@ static size_t compute_container_size(u8
* load_microcode_amd() to save equivalent cpu table and microcode patches in
* kernel heap memory.
*/
-static void apply_ucode_in_initrd(void *ucode, size_t size)
+static void apply_ucode_in_initrd(void *ucode, size_t size, bool save_patch)
{
struct equiv_cpu_entry *eq;
size_t *cont_sz;
u32 *header;
u8 *data, **cont;
+ u8 (*patch)[PATCH_MAX_SIZE];
u16 eq_id = 0;
int offset, left;
u32 rev, eax, ebx, ecx, edx;
@@ -123,10 +124,12 @@ static void apply_ucode_in_initrd(void *
new_rev = (u32 *)__pa_nodebug(&ucode_new_rev);
cont_sz = (size_t *)__pa_nodebug(&container_size);
cont = (u8 **)__pa_nodebug(&container);
+ patch = (u8 (*)[PATCH_MAX_SIZE])__pa_nodebug(&amd_ucode_patch);
#else
new_rev = &ucode_new_rev;
cont_sz = &container_size;
cont = &container;
+ patch = &amd_ucode_patch;
#endif
data = ucode;
@@ -213,9 +216,9 @@ static void apply_ucode_in_initrd(void *
rev = mc->hdr.patch_id;
*new_rev = rev;
- /* save ucode patch */
- memcpy(amd_ucode_patch, mc,
- min_t(u32, header[1], PATCH_MAX_SIZE));
+ if (save_patch)
+ memcpy(patch, mc,
+ min_t(u32, header[1], PATCH_MAX_SIZE));
}
}
@@ -246,7 +249,7 @@ void __init load_ucode_amd_bsp(void)
*data = cp.data;
*size = cp.size;
- apply_ucode_in_initrd(cp.data, cp.size);
+ apply_ucode_in_initrd(cp.data, cp.size, true);
}
#ifdef CONFIG_X86_32
@@ -263,7 +266,7 @@ void load_ucode_amd_ap(void)
size_t *usize;
void **ucode;
- mc = (struct microcode_amd *)__pa(amd_ucode_patch);
+ mc = (struct microcode_amd *)__pa_nodebug(amd_ucode_patch);
if (mc->hdr.patch_id && mc->hdr.processor_rev_id) {
__apply_microcode_amd(mc);
return;
@@ -275,7 +278,7 @@ void load_ucode_amd_ap(void)
if (!*ucode || !*usize)
return;
- apply_ucode_in_initrd(*ucode, *usize);
+ apply_ucode_in_initrd(*ucode, *usize, false);
}
static void __init collect_cpu_sig_on_bsp(void *arg)
@@ -339,7 +342,7 @@ void load_ucode_amd_ap(void)
* AP has a different equivalence ID than BSP, looks like
* mixed-steppings silicon so go through the ucode blob anew.
*/
- apply_ucode_in_initrd(ucode_cpio.data, ucode_cpio.size);
+ apply_ucode_in_initrd(ucode_cpio.data, ucode_cpio.size, false);
}
}
#endif
@@ -347,6 +350,7 @@ void load_ucode_amd_ap(void)
int __init save_microcode_in_initrd_amd(void)
{
unsigned long cont;
+ int retval = 0;
enum ucode_state ret;
u8 *cont_va;
u32 eax;
@@ -387,7 +391,7 @@ int __init save_microcode_in_initrd_amd(
ret = load_microcode_amd(eax, container, container_size);
if (ret != UCODE_OK)
- return -EINVAL;
+ retval = -EINVAL;
/*
* This will be freed any msec now, stash patches for the current
@@ -396,5 +400,5 @@ int __init save_microcode_in_initrd_amd(
container = NULL;
container_size = 0;
- return 0;
+ return retval;
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Houcheng Lin <[email protected]>
commit b51d3fa364885a2c1e1668f88776c67c95291820 upstream.
The kernel should reserve enough room in the skb so that the DONE
message can always be appended. However, if e.g. a new attribute is
erroneously not size-accounted for, __nfulnl_send() will still try to put
the next nlmsg into this full skb, causing the skb to be stuck forever and
blocking delivery of further messages.
Fix this by releasing the skb immediately after the nlmsg_put() error, and
WARN() so we can track down the cause of such a size mismatch.
[ [email protected]: add tailroom/len info to WARN ]
Signed-off-by: Houcheng Lin <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/netfilter/nfnetlink_log.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
--- a/net/netfilter/nfnetlink_log.c
+++ b/net/netfilter/nfnetlink_log.c
@@ -346,26 +346,25 @@ nfulnl_alloc_skb(struct net *net, u32 pe
return skb;
}
-static int
+static void
__nfulnl_send(struct nfulnl_instance *inst)
{
- int status = -1;
-
if (inst->qlen > 1) {
struct nlmsghdr *nlh = nlmsg_put(inst->skb, 0, 0,
NLMSG_DONE,
sizeof(struct nfgenmsg),
0);
- if (!nlh)
+ if (WARN_ONCE(!nlh, "bad nlskb size: %u, tailroom %d\n",
+ inst->skb->len, skb_tailroom(inst->skb))) {
+ kfree_skb(inst->skb);
goto out;
+ }
}
- status = nfnetlink_unicast(inst->skb, inst->net, inst->peer_portid,
- MSG_DONTWAIT);
-
+ nfnetlink_unicast(inst->skb, inst->net, inst->peer_portid,
+ MSG_DONTWAIT);
+out:
inst->qlen = 0;
inst->skb = NULL;
-out:
- return status;
}
static void
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter <[email protected]>
commit 0f9f5e1b83abd2b37c67658e02a6fc9001831fa5 upstream.
The ->ip_set_list[] array is initialized in ip_set_net_init() and it
has ->ip_set_max elements, so this check should be >= instead of >;
otherwise we are off by one.
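As a rough standalone illustration of the bounds rule (made-up names, not the ipset code): an array with max elements has valid indices 0 .. max - 1, so the rejection test must be >=.

#include <stdio.h>

/* For an array initialized with `max` entries the valid indices are
 * 0 .. max - 1, so the rejection test must be `index >= max`. */
static int get_byindex(const int *list, unsigned int max, unsigned int index)
{
	if (index >= max)	/* with `>` an index equal to max slips through */
		return -1;	/* invalid id */
	return list[index];
}

int main(void)
{
	int sets[4] = { 100, 101, 102, 103 };

	printf("%d\n", get_byindex(sets, 4, 3));	/* 103, last valid slot */
	printf("%d\n", get_byindex(sets, 4, 4));	/* -1, one past the end */
	return 0;
}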
Signed-off-by: Dan Carpenter <[email protected]>
Acked-by: Jozsef Kadlecsik <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/netfilter/ipset/ip_set_core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -635,7 +635,7 @@ ip_set_nfnl_get_byindex(struct net *net,
struct ip_set *set;
struct ip_set_net *inst = ip_set_pernet(net);
- if (index > inst->ip_set_max)
+ if (index >= inst->ip_set_max)
return IPSET_INVALID_ID;
nfnl_lock(NFNL_SUBSYS_IPSET);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal <[email protected]>
commit 9dfa1dfe4d5e5e66a991321ab08afe69759d797a upstream.
We currently neither account for the nlattr size, nor do we consider
the size of the trailing NLMSG_DONE when allocating nlmsg skb.
This can cause nflog to stop working, as __nfulnl_send() retries
sending forever if it failed to append NLMSG_DONE (which will never
succeed if the buffer is not large enough).
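The accounting the fix adds can be sketched in plain userspace C; the only assumptions are the 4-byte sizes of struct nlattr and struct nfgenmsg. The per-packet size estimate must also reserve nla_total_size(sizeof(struct nfgenmsg)) so the trailing NLMSG_DONE always fits.

#include <stdio.h>

/* Netlink attribute rounding, as in <linux/netlink.h>. */
#define NLA_ALIGNTO	4
#define NLA_ALIGN(len)	(((len) + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1))
#define NLA_HDRLEN	NLA_ALIGN(4)	/* struct nlattr is 4 bytes */
#define NFGENMSG_SIZE	4		/* struct nfgenmsg is 4 bytes */

/* What the kernel's nla_total_size() evaluates to. */
static unsigned int nla_total_size(unsigned int payload)
{
	return NLA_ALIGN(NLA_HDRLEN + payload);
}

int main(void)
{
	unsigned int size = 256;	/* attribute space for one logged packet (example) */

	/* Reserve room for the trailing NLMSG_DONE up front, as the fix does. */
	size += nla_total_size(NFGENMSG_SIZE);
	printf("allocate at least %u bytes\n", size);
	return 0;
}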
Reported-by: Houcheng Lin <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/netfilter/nfnetlink_log.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/net/netfilter/nfnetlink_log.c
+++ b/net/netfilter/nfnetlink_log.c
@@ -649,7 +649,8 @@ nfulnl_log_packet(struct net *net,
+ nla_total_size(sizeof(u_int32_t)) /* gid */
+ nla_total_size(plen) /* prefix */
+ nla_total_size(sizeof(struct nfulnl_msg_packet_hw))
- + nla_total_size(sizeof(struct nfulnl_msg_packet_timestamp));
+ + nla_total_size(sizeof(struct nfulnl_msg_packet_timestamp))
+ + nla_total_size(sizeof(struct nfgenmsg)); /* NLMSG_DONE */
if (in && skb_mac_header_was_set(skb)) {
size += nla_total_size(skb->dev->hard_header_len)
@@ -692,8 +693,7 @@ nfulnl_log_packet(struct net *net,
goto unlock_and_release;
}
- if (inst->skb &&
- size > skb_tailroom(inst->skb) - sizeof(struct nfgenmsg)) {
+ if (inst->skb && size > skb_tailroom(inst->skb)) {
/* either the queue len is too high or we don't have
* enough room in the skb left. flush to userspace. */
__nfulnl_flush(inst);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal <[email protected]>
commit c1e7dc91eed0ed1a51c9b814d648db18bf8fc6e9 upstream.
Don't try to queue payloads > 0xffff - NLA_HDRLEN; it does not work.
The nla length includes the size of the nla struct, so anything larger
results in u16 integer overflow.
This patch is similar to
9cefbbc9c8f9abe (netfilter: nfnetlink_queue: cleanup copy_range usage).
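A rough standalone demonstration of that wrap (assuming NLA_HDRLEN is 4): the 16-bit nla_len covers both the attribute header and the payload, so the payload cap must leave room for the header.

#include <stdint.h>
#include <stdio.h>

#define NLA_HDRLEN 4	/* aligned size of struct nlattr */

int main(void)
{
	unsigned int payload;
	uint16_t nla_len;

	payload = 0xFFFF;			/* old NFULNL_COPY_RANGE_MAX */
	nla_len = (uint16_t)(NLA_HDRLEN + payload);
	printf("payload %#x -> nla_len %#x (wrapped)\n", payload, nla_len);

	payload = 0xFFFF - NLA_HDRLEN;		/* new cap */
	nla_len = (uint16_t)(NLA_HDRLEN + payload);
	printf("payload %#x -> nla_len %#x (fits)\n", payload, nla_len);
	return 0;
}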
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/netfilter/nfnetlink_log.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
--- a/net/netfilter/nfnetlink_log.c
+++ b/net/netfilter/nfnetlink_log.c
@@ -43,7 +43,8 @@
#define NFULNL_NLBUFSIZ_DEFAULT NLMSG_GOODSIZE
#define NFULNL_TIMEOUT_DEFAULT 100 /* every second */
#define NFULNL_QTHRESH_DEFAULT 100 /* 100 packets */
-#define NFULNL_COPY_RANGE_MAX 0xFFFF /* max packet size is limited by 16-bit struct nfattr nfa_len field */
+/* max packet size is limited by 16-bit struct nfattr nfa_len field */
+#define NFULNL_COPY_RANGE_MAX (0xFFFF - NLA_HDRLEN)
#define PRINTR(x, args...) do { if (net_ratelimit()) \
printk(x, ## args); } while (0);
@@ -252,6 +253,8 @@ nfulnl_set_mode(struct nfulnl_instance *
case NFULNL_COPY_PACKET:
inst->copy_mode = mode;
+ if (range == 0)
+ range = NFULNL_COPY_RANGE_MAX;
inst->copy_range = min_t(unsigned int,
range, NFULNL_COPY_RANGE_MAX);
break;
@@ -679,8 +682,7 @@ nfulnl_log_packet(struct net *net,
break;
case NFULNL_COPY_PACKET:
- if (inst->copy_range == 0
- || inst->copy_range > skb->len)
+ if (inst->copy_range > skb->len)
data_len = skb->len;
else
data_len = inst->copy_range;
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sabrina Dubroca <[email protected]>
commit c123bb7163043bb8f33858cf8e45b01c17dbd171 upstream.
alloc_percpu returns NULL on failure, not a negative error code.
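As a rough standalone sketch of the convention at issue (stub allocator and made-up names, not the nf_tables code): a function that signals failure by returning NULL has to be tested with a NULL check; IS_ERR() would not catch it.

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Stub allocator that signals failure the way alloc_percpu() does:
 * by returning NULL, never an ERR_PTR() value. */
static void *alloc_stats(int fail)
{
	return fail ? NULL : malloc(64);
}

static int newchain(int fail)
{
	void *stats = alloc_stats(fail);

	if (stats == NULL)	/* an IS_ERR()-style test would miss this failure */
		return -ENOMEM;

	free(stats);
	return 0;
}

int main(void)
{
	printf("%d\n", newchain(0));	/* 0 */
	printf("%d\n", newchain(1));	/* -ENOMEM */
	return 0;
}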
Fixes: ff3cd7b3c922 ("netfilter: nf_tables: refactor chain statistic routines")
Signed-off-by: Sabrina Dubroca <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/netfilter/nf_tables_api.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -1102,10 +1102,10 @@ static int nf_tables_newchain(struct soc
basechain->stats = stats;
} else {
stats = netdev_alloc_pcpu_stats(struct nft_stats);
- if (IS_ERR(stats)) {
+ if (stats == NULL) {
module_put(type->owner);
kfree(basechain);
- return PTR_ERR(stats);
+ return -ENOMEM;
}
rcu_assign_pointer(basechain->stats, stats);
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bob Peterson <[email protected]>
commit 19aeb5a65f1a6504fc665466c188241e7393d66f upstream.
This patch fixes a regression in the patch "GFS2: Remember directory
insert point", commit 2b47dad866d04f14c328f888ba5406057b8c7d33.
The problem had to do with the rename function: The function found
space for the new dirent, and remembered that location. But then the
old dirent was removed, which often moved the eligible location for
the renamed dirent. Putting the new dirent at the saved location
caused file system corruption.
This patch adds a new "save_loc" variable to struct gfs2_diradd.
If 1, the dirent location is saved. If 0, the dirent location is not
saved and the buffer_head is released as per previous behavior.
Signed-off-by: Bob Peterson <[email protected]>
Signed-off-by: Steven Whitehouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/gfs2/dir.c | 9 +++++++--
fs/gfs2/dir.h | 1 +
fs/gfs2/inode.c | 6 +++---
3 files changed, 11 insertions(+), 5 deletions(-)
--- a/fs/gfs2/dir.c
+++ b/fs/gfs2/dir.c
@@ -2100,8 +2100,13 @@ int gfs2_diradd_alloc_required(struct in
}
if (IS_ERR(dent))
return PTR_ERR(dent);
- da->bh = bh;
- da->dent = dent;
+
+ if (da->save_loc) {
+ da->bh = bh;
+ da->dent = dent;
+ } else {
+ brelse(bh);
+ }
return 0;
}
--- a/fs/gfs2/dir.h
+++ b/fs/gfs2/dir.h
@@ -23,6 +23,7 @@ struct gfs2_diradd {
unsigned nr_blocks;
struct gfs2_dirent *dent;
struct buffer_head *bh;
+ int save_loc;
};
extern struct inode *gfs2_dir_search(struct inode *dir,
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -600,7 +600,7 @@ static int gfs2_create_inode(struct inod
int error, free_vfs_inode = 0;
u32 aflags = 0;
unsigned blocks = 1;
- struct gfs2_diradd da = { .bh = NULL, };
+ struct gfs2_diradd da = { .bh = NULL, .save_loc = 1, };
if (!name->len || name->len > GFS2_FNAMESIZE)
return -ENAMETOOLONG;
@@ -899,7 +899,7 @@ static int gfs2_link(struct dentry *old_
struct gfs2_inode *ip = GFS2_I(inode);
struct gfs2_holder ghs[2];
struct buffer_head *dibh;
- struct gfs2_diradd da = { .bh = NULL, };
+ struct gfs2_diradd da = { .bh = NULL, .save_loc = 1, };
int error;
if (S_ISDIR(inode->i_mode))
@@ -1337,7 +1337,7 @@ static int gfs2_rename(struct inode *odi
struct gfs2_rgrpd *nrgd;
unsigned int num_gh;
int dir_rename = 0;
- struct gfs2_diradd da = { .nr_blocks = 0, };
+ struct gfs2_diradd da = { .nr_blocks = 0, .save_loc = 0, };
unsigned int x;
int error;
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arturo Borrero <[email protected]>
commit 7965ee93719921ea5978f331da653dfa2d7b99f5 upstream.
The code looks for an already loaded target, and the correct list to search
is nft_target_list, not nft_match_list.
Signed-off-by: Arturo Borrero Gonzalez <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/netfilter/nft_compat.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/netfilter/nft_compat.c
+++ b/net/netfilter/nft_compat.c
@@ -696,7 +696,7 @@ nft_target_select_ops(const struct nft_c
family = ctx->afi->family;
/* Re-use the existing target if it's already loaded. */
- list_for_each_entry(nft_target, &nft_match_list, head) {
+ list_for_each_entry(nft_target, &nft_target_list, head) {
struct xt_target *target = nft_target->ops.data;
if (strcmp(target->name, tg_name) == 0 &&
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira <[email protected]>
commit e10038a8ec06ac819b7552bb67aaa6d2d6f850c1 upstream.
This structure is not exposed to userspace, so fix this by defining
struct sk_filter; so we skip the casting in kernelspace. This is safe
since userspace has no way to lurk with that internal pointer.
Fixes: e6f30c7 ("netfilter: x_tables: add xt_bpf match")
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Acked-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/uapi/linux/netfilter/xt_bpf.h | 2 ++
1 file changed, 2 insertions(+)
--- a/include/uapi/linux/netfilter/xt_bpf.h
+++ b/include/uapi/linux/netfilter/xt_bpf.h
@@ -8,6 +8,8 @@
struct bpf_prog;
+struct sk_filter;
+
struct xt_bpf_info {
__u16 bpf_program_num_elem;
struct sock_filter bpf_program[XT_BPF_MAX_NUM_INSTR];
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Junjie Mao <[email protected]>
commit e6023367d779060fddc9a52d1f474085b2b36298 upstream.
When choosing a random address, the current implementation does not take into
account the reserved space for the .bss and .brk sections. Thus the relocated
kernel may overlap other components in memory. Here is an example of the overlap
from an x86_64 kernel in qemu (the ranges of physical addresses are presented):
                              Physical Address
   0x0fe00000              --+--------------------+  <-- randomized base
                  /          |  relocated kernel  |
     vmlinux.bin             | (from vmlinux.bin) |
   0x1336d000 (an ELF file)  +--------------------+--
                  \          |                    |  \
   0x1376d870              --+--------------------+   |
                             |    relocs table    |   |
   0x13c1c2a8                +--------------------+  .bss and .brk
                             |                    |   |
   0x13ce6000                +--------------------+   |
                             |                    |  /
   0x13f77000                |       initrd       |--
                             |                    |
   0x13fef374                +--------------------+
The initrd image will then be overwritten by the memset during early
initialization:
[ 1.655204] Unpacking initramfs...
[ 1.662831] Initramfs unpacking failed: junk in compressed archive
This patch prevents the above situation by requiring a larger space when looking
for a random kernel base, so that the existing logic effectively avoids the
overlap.
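In effect, the hole handed to the base-address search is the larger of the two footprints; a trivial sketch with made-up sizes (the real values come from mkpiggy's z_output_len and the objdump-derived z_run_size):

#include <stdio.h>

int main(void)
{
	/* Made-up example sizes, in bytes. */
	unsigned long output_len = 0x396d870;	/* decompressed kernel + relocs */
	unsigned long run_size   = 0x3ee6000;	/* decompressed kernel + .bss and .brk */

	/* The memory hole handed to choose_kernel_location(). */
	unsigned long hole = output_len > run_size ? output_len : run_size;

	printf("reserve %#lx bytes at the randomized base\n", hole);
	return 0;
}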
[kees: switched to perl to avoid hex translation pain in mawk vs gawk]
[kees: calculated overlap without relocs table]
Fixes: 82fa9637a2 ("x86, kaslr: Select random position from e820 maps")
Reported-by: Fengguang Wu <[email protected]>
Signed-off-by: Junjie Mao <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Matt Fleming <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Andi Kleen <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/boot/compressed/Makefile | 4 +++-
arch/x86/boot/compressed/head_32.S | 5 +++--
arch/x86/boot/compressed/head_64.S | 5 ++++-
arch/x86/boot/compressed/misc.c | 13 ++++++++++---
arch/x86/boot/compressed/mkpiggy.c | 9 +++++++--
arch/x86/tools/calc_run_size.pl | 30 ++++++++++++++++++++++++++++++
6 files changed, 57 insertions(+), 9 deletions(-)
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -75,8 +75,10 @@ suffix-$(CONFIG_KERNEL_XZ) := xz
suffix-$(CONFIG_KERNEL_LZO) := lzo
suffix-$(CONFIG_KERNEL_LZ4) := lz4
+RUN_SIZE = $(shell objdump -h vmlinux | \
+ perl $(srctree)/arch/x86/tools/calc_run_size.pl)
quiet_cmd_mkpiggy = MKPIGGY $@
- cmd_mkpiggy = $(obj)/mkpiggy $< > $@ || ( rm -f $@ ; false )
+ cmd_mkpiggy = $(obj)/mkpiggy $< $(RUN_SIZE) > $@ || ( rm -f $@ ; false )
targets += piggy.S
$(obj)/piggy.S: $(obj)/vmlinux.bin.$(suffix-y) $(obj)/mkpiggy FORCE
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -207,7 +207,8 @@ relocated:
* Do the decompression, and jump to the new kernel..
*/
/* push arguments for decompress_kernel: */
- pushl $z_output_len /* decompressed length */
+ pushl $z_run_size /* size of kernel with .bss and .brk */
+ pushl $z_output_len /* decompressed length, end of relocs */
leal z_extract_offset_negative(%ebx), %ebp
pushl %ebp /* output address */
pushl $z_input_len /* input_len */
@@ -217,7 +218,7 @@ relocated:
pushl %eax /* heap area */
pushl %esi /* real mode pointer */
call decompress_kernel /* returns kernel location in %eax */
- addl $24, %esp
+ addl $28, %esp
/*
* Jump to the decompressed kernel.
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -402,13 +402,16 @@ relocated:
* Do the decompression, and jump to the new kernel..
*/
pushq %rsi /* Save the real mode argument */
+ movq $z_run_size, %r9 /* size of kernel with .bss and .brk */
+ pushq %r9
movq %rsi, %rdi /* real mode address */
leaq boot_heap(%rip), %rsi /* malloc area for uncompression */
leaq input_data(%rip), %rdx /* input_data */
movl $z_input_len, %ecx /* input_len */
movq %rbp, %r8 /* output target address */
- movq $z_output_len, %r9 /* decompressed length */
+ movq $z_output_len, %r9 /* decompressed length, end of relocs */
call decompress_kernel /* returns kernel location in %rax */
+ popq %r9
popq %rsi
/*
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -358,7 +358,8 @@ asmlinkage __visible void *decompress_ke
unsigned char *input_data,
unsigned long input_len,
unsigned char *output,
- unsigned long output_len)
+ unsigned long output_len,
+ unsigned long run_size)
{
real_mode = rmode;
@@ -381,8 +382,14 @@ asmlinkage __visible void *decompress_ke
free_mem_ptr = heap; /* Heap */
free_mem_end_ptr = heap + BOOT_HEAP_SIZE;
- output = choose_kernel_location(input_data, input_len,
- output, output_len);
+ /*
+ * The memory hole needed for the kernel is the larger of either
+ * the entire decompressed kernel plus relocation table, or the
+ * entire decompressed kernel plus .bss and .brk sections.
+ */
+ output = choose_kernel_location(input_data, input_len, output,
+ output_len > run_size ? output_len
+ : run_size);
/* Validate memory location choices. */
if ((unsigned long)output & (MIN_KERNEL_ALIGN - 1))
--- a/arch/x86/boot/compressed/mkpiggy.c
+++ b/arch/x86/boot/compressed/mkpiggy.c
@@ -36,11 +36,13 @@ int main(int argc, char *argv[])
uint32_t olen;
long ilen;
unsigned long offs;
+ unsigned long run_size;
FILE *f = NULL;
int retval = 1;
- if (argc < 2) {
- fprintf(stderr, "Usage: %s compressed_file\n", argv[0]);
+ if (argc < 3) {
+ fprintf(stderr, "Usage: %s compressed_file run_size\n",
+ argv[0]);
goto bail;
}
@@ -74,6 +76,7 @@ int main(int argc, char *argv[])
offs += olen >> 12; /* Add 8 bytes for each 32K block */
offs += 64*1024 + 128; /* Add 64K + 128 bytes slack */
offs = (offs+4095) & ~4095; /* Round to a 4K boundary */
+ run_size = atoi(argv[2]);
printf(".section \".rodata..compressed\",\"a\",@progbits\n");
printf(".globl z_input_len\n");
@@ -85,6 +88,8 @@ int main(int argc, char *argv[])
/* z_extract_offset_negative allows simplification of head_32.S */
printf(".globl z_extract_offset_negative\n");
printf("z_extract_offset_negative = -0x%lx\n", offs);
+ printf(".globl z_run_size\n");
+ printf("z_run_size = %lu\n", run_size);
printf(".globl input_data, input_data_end\n");
printf("input_data:\n");
--- /dev/null
+++ b/arch/x86/tools/calc_run_size.pl
@@ -0,0 +1,30 @@
+#!/usr/bin/perl
+#
+# Calculate the amount of space needed to run the kernel, including room for
+# the .bss and .brk sections.
+#
+# Usage:
+# objdump -h a.out | perl calc_run_size.pl
+use strict;
+
+my $mem_size = 0;
+my $file_offset = 0;
+
+my $sections=" *[0-9]+ \.(?:bss|brk) +";
+while (<>) {
+ if (/^$sections([0-9a-f]+) +(?:[0-9a-f]+ +){2}([0-9a-f]+)/) {
+ my $size = hex($1);
+ my $offset = hex($2);
+ $mem_size += $size;
+ if ($file_offset == 0) {
+ $file_offset = $offset;
+ } elsif ($file_offset != $offset) {
+ die ".bss and .brk lack common file offset\n";
+ }
+ }
+}
+
+if ($file_offset == 0) {
+ die "Never found .bss or .brk file offset\n";
+}
+printf("%d\n", $mem_size + $file_offset);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Devesh Sharma <[email protected]>
commit 8b0f93d9490653a7b9fc91f3570089132faed1c0 upstream.
During create-ah from userspace, uverbs is sending garbage data in
attr.dmac and attr.vlan_id. This patch sets attr.dmac and
attr.vlan_id to zero.
Fixes: dd5f03beb4f7 ("IB/core: Ethernet L2 attributes in verbs/cm structures")
Signed-off-by: Devesh Sharma <[email protected]>
Signed-off-by: Roland Dreier <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/infiniband/core/uverbs_cmd.c | 2 ++
1 file changed, 2 insertions(+)
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -2518,6 +2518,8 @@ ssize_t ib_uverbs_create_ah(struct ib_uv
attr.grh.sgid_index = cmd.attr.grh.sgid_index;
attr.grh.hop_limit = cmd.attr.grh.hop_limit;
attr.grh.traffic_class = cmd.attr.grh.traffic_class;
+ attr.vlan_id = 0;
+ memset(&attr.dmac, 0, sizeof(attr.dmac));
memcpy(attr.grh.dgid.raw, cmd.attr.grh.dgid, 16);
ah = ib_create_ah(pd, &attr);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrey Vagin <[email protected]>
commit 1195d94e006b23c6292e78857e154872e33b6d7e upstream.
proc_dointvec_minmax() returns zero if a new value has been set. So we
don't need to check that all characters have been handled.
Below you can find two examples. In the first one the new value has not
been handled properly.
$ strace ./a.out
open("/proc/sys/kernel/auto_msgmni", O_WRONLY) = 3
write(3, "0\n\0", 3) = 2
close(3) = 0
exit_group(0)
$ cat /sys/kernel/debug/tracing/trace
$strace ./a.out
open("/proc/sys/kernel/auto_msgmni", O_WRONLY) = 3
write(3, "0\n", 2) = 2
close(3) = 0
$ cat /sys/kernel/debug/tracing/trace
a.out-697 [000] .... 3280.998235: unregister_ipcns_notifier <-proc_ipcauto_dointvec_minmax
Fixes: 9eefe520c814 ("ipc: do not use a negative value to re-enable msgmni automatic recomputin")
Signed-off-by: Andrey Vagin <[email protected]>
Cc: Mathias Krause <[email protected]>
Cc: Manfred Spraul <[email protected]>
Cc: Joe Perches <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
ipc/ipc_sysctl.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--- a/ipc/ipc_sysctl.c
+++ b/ipc/ipc_sysctl.c
@@ -123,7 +123,6 @@ static int proc_ipcauto_dointvec_minmax(
void __user *buffer, size_t *lenp, loff_t *ppos)
{
struct ctl_table ipc_table;
- size_t lenp_bef = *lenp;
int oldval;
int rc;
@@ -133,7 +132,7 @@ static int proc_ipcauto_dointvec_minmax(
rc = proc_dointvec_minmax(&ipc_table, write, buffer, lenp, ppos);
- if (write && !rc && lenp_bef == *lenp) {
+ if (write && !rc) {
int newval = *((int *)(ipc_table.data));
/*
* The file "auto_msgmni" has correctly been set.
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pranith Kumar <[email protected]>
commit 2aa792e6faf1a00f5accf1f69e87e11a390ba2cd upstream.
The rcu_gp_kthread_wake() function checks for three conditions before
waking up grace period kthreads:
* Is the thread we are trying to wake up the current thread?
* Are the gp_flags zero? (all threads wait on non-zero gp_flags condition)
* Is there no thread created for this flavour, hence nothing to wake up?
If any one of these conditions is true, we do not call wake_up().
It was found that there are quite a few avoidable wake ups both during
idle time and under stress induced by rcutorture.
Idle:
Total:66000, unnecessary:66000, case1:61827, case2:66000, case3:0
Total:68000, unnecessary:68000, case1:63696, case2:68000, case3:0
rcutorture:
Total:254000, unnecessary:254000, case1:199913, case2:254000, case3:0
Total:256000, unnecessary:256000, case1:201784, case2:256000, case3:0
Here case{1-3} are the cases listed above. We can avoid these wake
ups by using rcu_gp_kthread_wake() to conditionally wake up the grace
period kthreads.
There is a comment about an implied barrier supplied by the wake_up()
logic. This barrier is necessary for the awakened thread to see the
updated ->gp_flags. This flag is always being updated with the root node
lock held. Also, the awakened thread tries to acquire the root node lock
before reading ->gp_flags because of which there is proper ordering.
Hence this commit tries to avoid calling wake_up() whenever we can by
using the rcu_gp_kthread_wake() function.
Signed-off-by: Pranith Kumar <[email protected]>
CC: Mathieu Desnoyers <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Cc: Kamal Mostafa <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/rcu/tree.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1928,7 +1928,7 @@ static void rcu_report_qs_rsp(struct rcu
{
WARN_ON_ONCE(!rcu_gp_in_progress(rsp));
raw_spin_unlock_irqrestore(&rcu_get_root(rsp)->lock, flags);
- wake_up(&rsp->gp_wq); /* Memory barrier implied by wake_up() path. */
+ rcu_gp_kthread_wake(rsp);
}
/*
@@ -2507,7 +2507,7 @@ static void force_quiescent_state(struct
}
ACCESS_ONCE(rsp->gp_flags) |= RCU_GP_FLAG_FQS;
raw_spin_unlock_irqrestore(&rnp_old->lock, flags);
- wake_up(&rsp->gp_wq); /* Memory barrier implied by wake_up() path. */
+ rcu_gp_kthread_wake(rsp);
}
/*
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Rientjes <[email protected]>
commit 6d50e60cd2edb5a57154db5a6f64eef5aa59b751 upstream.
If an anonymous mapping is not allowed to fault thp memory and then
madvise(MADV_HUGEPAGE) is used after fault, khugepaged will never
collapse this memory into thp memory.
This occurs because the madvise(2) handler for thp, hugepage_madvise(),
clears VM_NOHUGEPAGE on the stack and it isn't stored in vma->vm_flags
until the final action of madvise_behavior(). This causes the
khugepaged_enter_vma_merge() to be a no-op in hugepage_madvise() when
the vma had previously had VM_NOHUGEPAGE set.
Fix this by passing the correct vma flags to the khugepaged mm slot
handler. There's no chance khugepaged can run on this vma until after
madvise_behavior() returns since we hold mm->mmap_sem.
It would be possible to clear VM_NOHUGEPAGE directly from vma->vm_flags
in hugepage_madvise(), but I didn't want to introduce special case
behavior into madvise_behavior(). I think it's best to just let it
always set vma->vm_flags itself.
Signed-off-by: David Rientjes <[email protected]>
Reported-by: Suleiman Souhlal <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/khugepaged.h | 17 ++++++++++-------
mm/huge_memory.c | 11 ++++++-----
mm/mmap.c | 8 ++++----
3 files changed, 20 insertions(+), 16 deletions(-)
--- a/include/linux/khugepaged.h
+++ b/include/linux/khugepaged.h
@@ -6,7 +6,8 @@
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
extern int __khugepaged_enter(struct mm_struct *mm);
extern void __khugepaged_exit(struct mm_struct *mm);
-extern int khugepaged_enter_vma_merge(struct vm_area_struct *vma);
+extern int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
+ unsigned long vm_flags);
#define khugepaged_enabled() \
(transparent_hugepage_flags & \
@@ -35,13 +36,13 @@ static inline void khugepaged_exit(struc
__khugepaged_exit(mm);
}
-static inline int khugepaged_enter(struct vm_area_struct *vma)
+static inline int khugepaged_enter(struct vm_area_struct *vma,
+ unsigned long vm_flags)
{
if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags))
if ((khugepaged_always() ||
- (khugepaged_req_madv() &&
- vma->vm_flags & VM_HUGEPAGE)) &&
- !(vma->vm_flags & VM_NOHUGEPAGE))
+ (khugepaged_req_madv() && (vm_flags & VM_HUGEPAGE))) &&
+ !(vm_flags & VM_NOHUGEPAGE))
if (__khugepaged_enter(vma->vm_mm))
return -ENOMEM;
return 0;
@@ -54,11 +55,13 @@ static inline int khugepaged_fork(struct
static inline void khugepaged_exit(struct mm_struct *mm)
{
}
-static inline int khugepaged_enter(struct vm_area_struct *vma)
+static inline int khugepaged_enter(struct vm_area_struct *vma,
+ unsigned long vm_flags)
{
return 0;
}
-static inline int khugepaged_enter_vma_merge(struct vm_area_struct *vma)
+static inline int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
+ unsigned long vm_flags)
{
return 0;
}
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -803,7 +803,7 @@ int do_huge_pmd_anonymous_page(struct mm
return VM_FAULT_FALLBACK;
if (unlikely(anon_vma_prepare(vma)))
return VM_FAULT_OOM;
- if (unlikely(khugepaged_enter(vma)))
+ if (unlikely(khugepaged_enter(vma, vma->vm_flags)))
return VM_FAULT_OOM;
if (!(flags & FAULT_FLAG_WRITE) &&
transparent_hugepage_use_zero_page()) {
@@ -1970,7 +1970,7 @@ int hugepage_madvise(struct vm_area_stru
* register it here without waiting a page fault that
* may not happen any time soon.
*/
- if (unlikely(khugepaged_enter_vma_merge(vma)))
+ if (unlikely(khugepaged_enter_vma_merge(vma, *vm_flags)))
return -ENOMEM;
break;
case MADV_NOHUGEPAGE:
@@ -2071,7 +2071,8 @@ int __khugepaged_enter(struct mm_struct
return 0;
}
-int khugepaged_enter_vma_merge(struct vm_area_struct *vma)
+int khugepaged_enter_vma_merge(struct vm_area_struct *vma,
+ unsigned long vm_flags)
{
unsigned long hstart, hend;
if (!vma->anon_vma)
@@ -2083,11 +2084,11 @@ int khugepaged_enter_vma_merge(struct vm
if (vma->vm_ops)
/* khugepaged not yet working on file or special mappings */
return 0;
- VM_BUG_ON(vma->vm_flags & VM_NO_THP);
+ VM_BUG_ON(vm_flags & VM_NO_THP);
hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
hend = vma->vm_end & HPAGE_PMD_MASK;
if (hstart < hend)
- return khugepaged_enter(vma);
+ return khugepaged_enter(vma, vm_flags);
return 0;
}
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1056,7 +1056,7 @@ struct vm_area_struct *vma_merge(struct
end, prev->vm_pgoff, NULL);
if (err)
return NULL;
- khugepaged_enter_vma_merge(prev);
+ khugepaged_enter_vma_merge(prev, vm_flags);
return prev;
}
@@ -1075,7 +1075,7 @@ struct vm_area_struct *vma_merge(struct
next->vm_pgoff - pglen, NULL);
if (err)
return NULL;
- khugepaged_enter_vma_merge(area);
+ khugepaged_enter_vma_merge(area, vm_flags);
return area;
}
@@ -2192,7 +2192,7 @@ int expand_upwards(struct vm_area_struct
}
}
vma_unlock_anon_vma(vma);
- khugepaged_enter_vma_merge(vma);
+ khugepaged_enter_vma_merge(vma, vma->vm_flags);
validate_mm(vma->vm_mm);
return error;
}
@@ -2261,7 +2261,7 @@ int expand_downwards(struct vm_area_stru
}
}
vma_unlock_anon_vma(vma);
- khugepaged_enter_vma_merge(vma);
+ khugepaged_enter_vma_merge(vma, vma->vm_flags);
validate_mm(vma->vm_mm);
return error;
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michal Marek <[email protected]>
commit 2d0871396995139b37f9ceb153c8b07589148343 upstream.
Since the conversion of objtree to use relative pathnames (commit
7e1c04779e, "kbuild: Use relative path for $(objtree)"), the debug
info files have been ending up in /debian/dbgtmp/ in the regular
linux-image package instead of the debug files package. Fix up the
paths so that the debug files end up in the -dbg package.
This is based on a similar patch by Darrick.
Reported-and-tested-by: "Darrick J. Wong" <[email protected]>
Signed-off-by: Michal Marek <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
scripts/package/builddeb | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)
--- a/scripts/package/builddeb
+++ b/scripts/package/builddeb
@@ -152,18 +152,16 @@ if grep -q '^CONFIG_MODULES=y' $KCONFIG_
rmdir "$tmpdir/lib/modules/$version"
fi
if [ -n "$BUILD_DEBUG" ] ; then
- (
- cd $tmpdir
- for module in $(find lib/modules/ -name *.ko); do
- mkdir -p $(dirname $dbg_dir/usr/lib/debug/$module)
- # only keep debug symbols in the debug file
- $OBJCOPY --only-keep-debug $module $dbg_dir/usr/lib/debug/$module
- # strip original module from debug symbols
- $OBJCOPY --strip-debug $module
- # then add a link to those
- $OBJCOPY --add-gnu-debuglink=$dbg_dir/usr/lib/debug/$module $module
- done
- )
+ for module in $(find $tmpdir/lib/modules/ -name *.ko -printf '%P\n'); do
+ module=lib/modules/$module
+ mkdir -p $(dirname $dbg_dir/usr/lib/debug/$module)
+ # only keep debug symbols in the debug file
+ $OBJCOPY --only-keep-debug $tmpdir/$module $dbg_dir/usr/lib/debug/$module
+ # strip original module from debug symbols
+ $OBJCOPY --strip-debug $tmpdir/$module
+ # then add a link to those
+ $OBJCOPY --add-gnu-debuglink=$dbg_dir/usr/lib/debug/$module $tmpdir/$module
+ done
fi
fi
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Howells <[email protected]>
commit 54e2c2c1a9d6cbb270b0999a38545fa9a69bee43 upstream.
Reinstate the generation of EPERM for a key type name beginning with a '.' in
a userspace call. Types whose name begins with a '.' are internal only.
The test was removed by:
commit a4e3b8d79a5c6d40f4a9703abf7fe3abcc6c3b8d
Author: Mimi Zohar <[email protected]>
Date: Thu May 22 14:02:23 2014 -0400
Subject: KEYS: special dot prefixed keyring name bug fix
I think we want to keep the restriction on type name so that userspace can't
add keys of a special internal type.
Note that removal of the test causes several of the tests in the keyutils
testsuite to fail.
Signed-off-by: David Howells <[email protected]>
Acked-by: Vivek Goyal <[email protected]>
cc: Mimi Zohar <[email protected]>
Cc: Josh Boyer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
security/keys/keyctl.c | 2 ++
1 file changed, 2 insertions(+)
--- a/security/keys/keyctl.c
+++ b/security/keys/keyctl.c
@@ -37,6 +37,8 @@ static int key_get_type_from_user(char *
return ret;
if (ret == 0 || ret >= len)
return -EINVAL;
+ if (type[0] == '.')
+ return -EPERM;
type[len - 1] = '\0';
return 0;
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Emmanuel Grumbach <[email protected]>
commit 31b8b343e019e0a0c57ca9c13520a87f9cab884b upstream.
If the RFkill interrupt fires while we calibrate, it would
make the firmware fail and the driver wasn't able to recover.
Change the flow so that the driver will kill the firmware
in that case.
Since we have now two flows that are calling
trans_stop_device (the RFkill interrupt and the
op_mode_mvm_start function) - we need to better sync this.
Use the STATUS_DEVICE_ENABLED in the pcie transport in an
atomic way to achieve this.
This fixes: https://bugzilla.kernel.org/show_bug.cgi?id=86231
Reviewed-by: Johannes Berg <[email protected]>
Reviewed-by: Luciano Coelho <[email protected]>
Signed-off-by: Emmanuel Grumbach <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/wireless/iwlwifi/mvm/fw.c | 10 +++++++++-
drivers/net/wireless/iwlwifi/mvm/mac80211.c | 1 +
drivers/net/wireless/iwlwifi/mvm/mvm.h | 1 +
drivers/net/wireless/iwlwifi/mvm/ops.c | 11 ++++++++++-
drivers/net/wireless/iwlwifi/pcie/trans.c | 4 ++--
5 files changed, 23 insertions(+), 4 deletions(-)
--- a/drivers/net/wireless/iwlwifi/mvm/fw.c
+++ b/drivers/net/wireless/iwlwifi/mvm/fw.c
@@ -282,7 +282,7 @@ int iwl_run_init_mvm_ucode(struct iwl_mv
lockdep_assert_held(&mvm->mutex);
- if (WARN_ON_ONCE(mvm->init_ucode_complete))
+ if (WARN_ON_ONCE(mvm->init_ucode_complete || mvm->calibrating))
return 0;
iwl_init_notification_wait(&mvm->notif_wait,
@@ -332,6 +332,8 @@ int iwl_run_init_mvm_ucode(struct iwl_mv
goto out;
}
+ mvm->calibrating = true;
+
/* Send TX valid antennas before triggering calibrations */
ret = iwl_send_tx_ant_cfg(mvm, mvm->fw->valid_tx_ant);
if (ret)
@@ -356,11 +358,17 @@ int iwl_run_init_mvm_ucode(struct iwl_mv
MVM_UCODE_CALIB_TIMEOUT);
if (!ret)
mvm->init_ucode_complete = true;
+
+ if (ret && iwl_mvm_is_radio_killed(mvm)) {
+ IWL_DEBUG_RF_KILL(mvm, "RFKILL while calibrating.\n");
+ ret = 1;
+ }
goto out;
error:
iwl_remove_notification(&mvm->notif_wait, &calib_wait);
out:
+ mvm->calibrating = false;
if (iwlmvm_mod_params.init_dbg && !mvm->nvm_data) {
/* we want to debug INIT and we have no NVM - fake */
mvm->nvm_data = kzalloc(sizeof(struct iwl_nvm_data) +
--- a/drivers/net/wireless/iwlwifi/mvm/mac80211.c
+++ b/drivers/net/wireless/iwlwifi/mvm/mac80211.c
@@ -778,6 +778,7 @@ static void iwl_mvm_restart_cleanup(stru
iwl_trans_stop_device(mvm->trans);
mvm->scan_status = IWL_MVM_SCAN_NONE;
+ mvm->calibrating = false;
/* just in case one was running */
ieee80211_remain_on_channel_expired(mvm->hw);
--- a/drivers/net/wireless/iwlwifi/mvm/mvm.h
+++ b/drivers/net/wireless/iwlwifi/mvm/mvm.h
@@ -541,6 +541,7 @@ struct iwl_mvm {
enum iwl_ucode_type cur_ucode;
bool ucode_loaded;
bool init_ucode_complete;
+ bool calibrating;
u32 error_event_table;
u32 log_event_table;
u32 umac_error_event_table;
--- a/drivers/net/wireless/iwlwifi/mvm/ops.c
+++ b/drivers/net/wireless/iwlwifi/mvm/ops.c
@@ -745,6 +745,7 @@ void iwl_mvm_set_hw_ctkill_state(struct
static bool iwl_mvm_set_hw_rfkill_state(struct iwl_op_mode *op_mode, bool state)
{
struct iwl_mvm *mvm = IWL_OP_MODE_GET_MVM(op_mode);
+ bool calibrating = ACCESS_ONCE(mvm->calibrating);
if (state)
set_bit(IWL_MVM_STATUS_HW_RFKILL, &mvm->status);
@@ -753,7 +754,15 @@ static bool iwl_mvm_set_hw_rfkill_state(
wiphy_rfkill_set_hw_state(mvm->hw->wiphy, iwl_mvm_is_radio_killed(mvm));
- return state && mvm->cur_ucode != IWL_UCODE_INIT;
+ /* iwl_run_init_mvm_ucode is waiting for results, abort it */
+ if (calibrating)
+ iwl_abort_notification_waits(&mvm->notif_wait);
+
+ /*
+ * Stop the device if we run OPERATIONAL firmware or if we are in the
+ * middle of the calibrations.
+ */
+ return state && (mvm->cur_ucode != IWL_UCODE_INIT || calibrating);
}
static void iwl_mvm_free_skb(struct iwl_op_mode *op_mode, struct sk_buff *skb)
--- a/drivers/net/wireless/iwlwifi/pcie/trans.c
+++ b/drivers/net/wireless/iwlwifi/pcie/trans.c
@@ -913,7 +913,8 @@ static void iwl_trans_pcie_stop_device(s
* restart. So don't process again if the device is
* already dead.
*/
- if (test_bit(STATUS_DEVICE_ENABLED, &trans->status)) {
+ if (test_and_clear_bit(STATUS_DEVICE_ENABLED, &trans->status)) {
+ IWL_DEBUG_INFO(trans, "DEVICE_ENABLED bit was set and is now cleared\n");
iwl_pcie_tx_stop(trans);
iwl_pcie_rx_stop(trans);
@@ -943,7 +944,6 @@ static void iwl_trans_pcie_stop_device(s
/* clear all status bits */
clear_bit(STATUS_SYNC_HCMD_ACTIVE, &trans->status);
clear_bit(STATUS_INT_ENABLED, &trans->status);
- clear_bit(STATUS_DEVICE_ENABLED, &trans->status);
clear_bit(STATUS_TPOWER_PMI, &trans->status);
clear_bit(STATUS_RFKILL, &trans->status);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Borkmann <[email protected]>
commit 9de7922bc709eee2f609cd01d98aaedc4cf5ea74 upstream.
Commit 6f4c618ddb0 ("SCTP : Add paramters validity check for
ASCONF chunk") added basic verification of ASCONF chunks, however,
it is still possible to remotely crash a server by sending a
special crafted ASCONF chunk, even up to pre 2.6.12 kernels:
skb_over_panic: text:ffffffffa01ea1c3 len:31056 put:30768
head:ffff88011bd81800 data:ffff88011bd81800 tail:0x7950
end:0x440 dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:129!
[...]
Call Trace:
<IRQ>
[<ffffffff8144fb1c>] skb_put+0x5c/0x70
[<ffffffffa01ea1c3>] sctp_addto_chunk+0x63/0xd0 [sctp]
[<ffffffffa01eadaf>] sctp_process_asconf+0x1af/0x540 [sctp]
[<ffffffff8152d025>] ? _read_unlock_bh+0x15/0x20
[<ffffffffa01e0038>] sctp_sf_do_asconf+0x168/0x240 [sctp]
[<ffffffffa01e3751>] sctp_do_sm+0x71/0x1210 [sctp]
[<ffffffff8147645d>] ? fib_rules_lookup+0xad/0xf0
[<ffffffffa01e6b22>] ? sctp_cmp_addr_exact+0x32/0x40 [sctp]
[<ffffffffa01e8393>] sctp_assoc_bh_rcv+0xd3/0x180 [sctp]
[<ffffffffa01ee986>] sctp_inq_push+0x56/0x80 [sctp]
[<ffffffffa01fcc42>] sctp_rcv+0x982/0xa10 [sctp]
[<ffffffffa01d5123>] ? ipt_local_in_hook+0x23/0x28 [iptable_filter]
[<ffffffff8148bdc9>] ? nf_iterate+0x69/0xb0
[<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
[<ffffffff8148bf86>] ? nf_hook_slow+0x76/0x120
[<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
[<ffffffff81496ded>] ip_local_deliver_finish+0xdd/0x2d0
[<ffffffff81497078>] ip_local_deliver+0x98/0xa0
[<ffffffff8149653d>] ip_rcv_finish+0x12d/0x440
[<ffffffff81496ac5>] ip_rcv+0x275/0x350
[<ffffffff8145c88b>] __netif_receive_skb+0x4ab/0x750
[<ffffffff81460588>] netif_receive_skb+0x58/0x60
This can be triggered, for example, through a simple scripted nmap
connection scan injecting the chunk after the handshake, ...
-------------- INIT[ASCONF; ASCONF_ACK] ------------->
<----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------
-------------------- COOKIE-ECHO -------------------->
<-------------------- COOKIE-ACK ---------------------
------------------ ASCONF; UNKNOWN ------------------>
... where ASCONF chunk of length 280 contains 2 parameters ...
1) Add IP address parameter (param length: 16)
2) Add/del IP address parameter (param length: 255)
... followed by an UNKNOWN chunk of e.g. 4 bytes. Here, the
Address Parameter in the ASCONF chunk is missing entirely.
This is just an example; similarly crafted ASCONF chunks
could be used just as well.
The ASCONF chunk passes through sctp_verify_asconf() since all
parameters pass the sanity checks and, after walking, we end
up exactly at the chunk end boundary, and thus may invoke
sctp_process_asconf(). Parameter walking is done with
WORD_ROUND() to take padding into account.
In sctp_process_asconf()'s TLV processing, we may fail in
sctp_process_asconf_param(), e.g. due to removal of the IP
address that is also the source address of the packet containing
the ASCONF chunk. We then need to add all TLVs after the
failure to our ASCONF response to the remote peer via the helper
function sctp_add_asconf_response(), which basically invokes
sctp_addto_chunk() to add the error parameters to the given
skb.
When walking to the next parameter this time, we proceed
with ...
length = ntohs(asconf_param->param_hdr.length);
asconf_param = (void *)asconf_param + length;
... instead of the WORD_ROUND()'ed length, resulting in an
off-by-one that leads to reading a follow-up garbage
parameter length of 12336 and thus to an skb_over_panic
for the reply the next time sctp_addto_chunk() is called,
which implicitly calls skb_put() with that length.
Fix it by using the sctp_walk_params() macro [which is also used
in INIT parameter processing] in the verification *and* in ASCONF
processing: it makes sure we don't spill over and that we walk
parameters WORD_ROUND()'ed. Moreover, we are being more defensive
and guard against unknown parameter types and missized addresses.
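For illustration, a minimal, self-contained sketch of the padding-aware
TLV walk the fix switches to (simplified types, not the kernel's
sctp_walk_params() itself): the cursor is bounds-checked before every
dereference, always advances by the length rounded up to a 4-byte
boundary, and the walk must land exactly on the chunk end.

  #include <stdbool.h>
  #include <stddef.h>
  #include <stdint.h>
  #include <arpa/inet.h>

  struct tlv_hdr {                 /* simplified sctp_paramhdr_t */
          uint16_t type;
          uint16_t length;         /* network order, includes header */
  };

  #define WORD_ROUND(len)  (((len) + 3) & ~(size_t)3)

  static bool walk_tlvs(const uint8_t *start, const uint8_t *end)
  {
          const uint8_t *p = start;

          while (p + sizeof(struct tlv_hdr) <= end) {
                  const struct tlv_hdr *h = (const void *)p;
                  size_t len = ntohs(h->length);

                  /* reject runt or overflowing parameters */
                  if (len < sizeof(*h) || len > (size_t)(end - p))
                          return false;

                  /* ... inspect the parameter here ... */

                  p += WORD_ROUND(len);    /* padding-aware advance */
          }
          return p == end;         /* must end exactly on the boundary */
  }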
Joint work with Vlad Yasevich.
Fixes: b896b82be4ae ("[SCTP] ADDIP: Support for processing incoming ASCONF_ACK chunks.")
Signed-off-by: Daniel Borkmann <[email protected]>
Signed-off-by: Vlad Yasevich <[email protected]>
Acked-by: Neil Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Cc: Josh Boyer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/net/sctp/sm.h | 6 +-
net/sctp/sm_make_chunk.c | 99 ++++++++++++++++++++++++++---------------------
net/sctp/sm_statefuns.c | 18 --------
3 files changed, 60 insertions(+), 63 deletions(-)
--- a/include/net/sctp/sm.h
+++ b/include/net/sctp/sm.h
@@ -248,9 +248,9 @@ struct sctp_chunk *sctp_make_asconf_upda
int, __be16);
struct sctp_chunk *sctp_make_asconf_set_prim(struct sctp_association *asoc,
union sctp_addr *addr);
-int sctp_verify_asconf(const struct sctp_association *asoc,
- struct sctp_paramhdr *param_hdr, void *chunk_end,
- struct sctp_paramhdr **errp);
+bool sctp_verify_asconf(const struct sctp_association *asoc,
+ struct sctp_chunk *chunk, bool addr_param_needed,
+ struct sctp_paramhdr **errp);
struct sctp_chunk *sctp_process_asconf(struct sctp_association *asoc,
struct sctp_chunk *asconf);
int sctp_process_asconf_ack(struct sctp_association *asoc,
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -3113,50 +3113,63 @@ static __be16 sctp_process_asconf_param(
return SCTP_ERROR_NO_ERROR;
}
-/* Verify the ASCONF packet before we process it. */
-int sctp_verify_asconf(const struct sctp_association *asoc,
- struct sctp_paramhdr *param_hdr, void *chunk_end,
- struct sctp_paramhdr **errp) {
- sctp_addip_param_t *asconf_param;
+/* Verify the ASCONF packet before we process it. */
+bool sctp_verify_asconf(const struct sctp_association *asoc,
+ struct sctp_chunk *chunk, bool addr_param_needed,
+ struct sctp_paramhdr **errp)
+{
+ sctp_addip_chunk_t *addip = (sctp_addip_chunk_t *) chunk->chunk_hdr;
union sctp_params param;
- int length, plen;
-
- param.v = (sctp_paramhdr_t *) param_hdr;
- while (param.v <= chunk_end - sizeof(sctp_paramhdr_t)) {
- length = ntohs(param.p->length);
- *errp = param.p;
+ bool addr_param_seen = false;
- if (param.v > chunk_end - length ||
- length < sizeof(sctp_paramhdr_t))
- return 0;
+ sctp_walk_params(param, addip, addip_hdr.params) {
+ size_t length = ntohs(param.p->length);
+ *errp = param.p;
switch (param.p->type) {
+ case SCTP_PARAM_ERR_CAUSE:
+ break;
+ case SCTP_PARAM_IPV4_ADDRESS:
+ if (length != sizeof(sctp_ipv4addr_param_t))
+ return false;
+ addr_param_seen = true;
+ break;
+ case SCTP_PARAM_IPV6_ADDRESS:
+ if (length != sizeof(sctp_ipv6addr_param_t))
+ return false;
+ addr_param_seen = true;
+ break;
case SCTP_PARAM_ADD_IP:
case SCTP_PARAM_DEL_IP:
case SCTP_PARAM_SET_PRIMARY:
- asconf_param = (sctp_addip_param_t *)param.v;
- plen = ntohs(asconf_param->param_hdr.length);
- if (plen < sizeof(sctp_addip_param_t) +
- sizeof(sctp_paramhdr_t))
- return 0;
+ /* In ASCONF chunks, these need to be first. */
+ if (addr_param_needed && !addr_param_seen)
+ return false;
+ length = ntohs(param.addip->param_hdr.length);
+ if (length < sizeof(sctp_addip_param_t) +
+ sizeof(sctp_paramhdr_t))
+ return false;
break;
case SCTP_PARAM_SUCCESS_REPORT:
case SCTP_PARAM_ADAPTATION_LAYER_IND:
if (length != sizeof(sctp_addip_param_t))
- return 0;
-
+ return false;
break;
default:
- break;
+ /* This is unkown to us, reject! */
+ return false;
}
-
- param.v += WORD_ROUND(length);
}
- if (param.v != chunk_end)
- return 0;
+ /* Remaining sanity checks. */
+ if (addr_param_needed && !addr_param_seen)
+ return false;
+ if (!addr_param_needed && addr_param_seen)
+ return false;
+ if (param.v != chunk->chunk_end)
+ return false;
- return 1;
+ return true;
}
/* Process an incoming ASCONF chunk with the next expected serial no. and
@@ -3165,16 +3178,17 @@ int sctp_verify_asconf(const struct sctp
struct sctp_chunk *sctp_process_asconf(struct sctp_association *asoc,
struct sctp_chunk *asconf)
{
+ sctp_addip_chunk_t *addip = (sctp_addip_chunk_t *) asconf->chunk_hdr;
+ bool all_param_pass = true;
+ union sctp_params param;
sctp_addiphdr_t *hdr;
union sctp_addr_param *addr_param;
sctp_addip_param_t *asconf_param;
struct sctp_chunk *asconf_ack;
-
__be16 err_code;
int length = 0;
int chunk_len;
__u32 serial;
- int all_param_pass = 1;
chunk_len = ntohs(asconf->chunk_hdr->length) - sizeof(sctp_chunkhdr_t);
hdr = (sctp_addiphdr_t *)asconf->skb->data;
@@ -3202,9 +3216,14 @@ struct sctp_chunk *sctp_process_asconf(s
goto done;
/* Process the TLVs contained within the ASCONF chunk. */
- while (chunk_len > 0) {
+ sctp_walk_params(param, addip, addip_hdr.params) {
+ /* Skip preceeding address parameters. */
+ if (param.p->type == SCTP_PARAM_IPV4_ADDRESS ||
+ param.p->type == SCTP_PARAM_IPV6_ADDRESS)
+ continue;
+
err_code = sctp_process_asconf_param(asoc, asconf,
- asconf_param);
+ param.addip);
/* ADDIP 4.1 A7)
* If an error response is received for a TLV parameter,
* all TLVs with no response before the failed TLV are
@@ -3212,28 +3231,20 @@ struct sctp_chunk *sctp_process_asconf(s
* the failed response are considered unsuccessful unless
* a specific success indication is present for the parameter.
*/
- if (SCTP_ERROR_NO_ERROR != err_code)
- all_param_pass = 0;
-
+ if (err_code != SCTP_ERROR_NO_ERROR)
+ all_param_pass = false;
if (!all_param_pass)
- sctp_add_asconf_response(asconf_ack,
- asconf_param->crr_id, err_code,
- asconf_param);
+ sctp_add_asconf_response(asconf_ack, param.addip->crr_id,
+ err_code, param.addip);
/* ADDIP 4.3 D11) When an endpoint receiving an ASCONF to add
* an IP address sends an 'Out of Resource' in its response, it
* MUST also fail any subsequent add or delete requests bundled
* in the ASCONF.
*/
- if (SCTP_ERROR_RSRC_LOW == err_code)
+ if (err_code == SCTP_ERROR_RSRC_LOW)
goto done;
-
- /* Move to the next ASCONF param. */
- length = ntohs(asconf_param->param_hdr.length);
- asconf_param = (void *)asconf_param + length;
- chunk_len -= length;
}
-
done:
asoc->peer.addip_serial++;
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -3594,9 +3594,7 @@ sctp_disposition_t sctp_sf_do_asconf(str
struct sctp_chunk *asconf_ack = NULL;
struct sctp_paramhdr *err_param = NULL;
sctp_addiphdr_t *hdr;
- union sctp_addr_param *addr_param;
__u32 serial;
- int length;
if (!sctp_vtag_verify(chunk, asoc)) {
sctp_add_cmd_sf(commands, SCTP_CMD_REPORT_BAD_TAG,
@@ -3621,17 +3619,8 @@ sctp_disposition_t sctp_sf_do_asconf(str
hdr = (sctp_addiphdr_t *)chunk->skb->data;
serial = ntohl(hdr->serial);
- addr_param = (union sctp_addr_param *)hdr->params;
- length = ntohs(addr_param->p.length);
- if (length < sizeof(sctp_paramhdr_t))
- return sctp_sf_violation_paramlen(net, ep, asoc, type, arg,
- (void *)addr_param, commands);
-
/* Verify the ASCONF chunk before processing it. */
- if (!sctp_verify_asconf(asoc,
- (sctp_paramhdr_t *)((void *)addr_param + length),
- (void *)chunk->chunk_end,
- &err_param))
+ if (!sctp_verify_asconf(asoc, chunk, true, &err_param))
return sctp_sf_violation_paramlen(net, ep, asoc, type, arg,
(void *)err_param, commands);
@@ -3748,10 +3737,7 @@ sctp_disposition_t sctp_sf_do_asconf_ack
rcvd_serial = ntohl(addip_hdr->serial);
/* Verify the ASCONF-ACK chunk before processing it. */
- if (!sctp_verify_asconf(asoc,
- (sctp_paramhdr_t *)addip_hdr->params,
- (void *)asconf_ack->chunk_end,
- &err_param))
+ if (!sctp_verify_asconf(asoc, asconf_ack, false, &err_param))
return sctp_sf_violation_paramlen(net, ep, asoc, type, arg,
(void *)err_param, commands);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stanislaw Gruszka <[email protected]>
commit 4ec7a45b51a32ee513898e2f1e42bb681b340fcf upstream.
The X550VB, like many other Asus laptops, needs the wapf4 quirk to make
the RFKILL switch functional. Otherwise the system boots with the wireless
card disabled and it is only possible to enable it by suspend/resume.
Bug report:
http://bugzilla.redhat.com/show_bug.cgi?id=1089731#c23
Reported-and-tested-by: Vratislav Podzimek <[email protected]>
Signed-off-by: Stanislaw Gruszka <[email protected]>
Signed-off-by: Darren Hart <[email protected]>
Cc: Josh Boyer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/platform/x86/asus-nb-wmi.c | 9 +++++++++
1 file changed, 9 insertions(+)
--- a/drivers/platform/x86/asus-nb-wmi.c
+++ b/drivers/platform/x86/asus-nb-wmi.c
@@ -182,6 +182,15 @@ static const struct dmi_system_id asus_q
},
{
.callback = dmi_matched,
+ .ident = "ASUSTeK COMPUTER INC. X550VB",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+ DMI_MATCH(DMI_PRODUCT_NAME, "X550VB"),
+ },
+ .driver_data = &quirk_asus_wapf4,
+ },
+ {
+ .callback = dmi_matched,
.ident = "ASUSTeK COMPUTER INC. X55A",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Borkmann <[email protected]>
commit 26b87c7881006311828bb0ab271a551a62dcceb4 upstream.
This scenario is not limited to ASCONF, just taken as one
example triggering the issue. When receiving ASCONF probes
in the form of ...
-------------- INIT[ASCONF; ASCONF_ACK] ------------->
<----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------
-------------------- COOKIE-ECHO -------------------->
<-------------------- COOKIE-ACK ---------------------
---- ASCONF_a; [ASCONF_b; ...; ASCONF_n;] JUNK ------>
[...]
---- ASCONF_m; [ASCONF_o; ...; ASCONF_z;] JUNK ------>
... where ASCONF_a, ASCONF_b, ..., ASCONF_z are well-formed
ASCONFs with increasing serial numbers, we process such
ASCONF chunk(s) marked with !end_of_packet and !singleton,
since we have not yet reached the SCTP packet end. SCTP only
does verification on a chunk-by-chunk basis, as an SCTP
packet is nothing more than a container for a stream of
chunks which it eats up one by one.
We could run into the case that we receive a packet with a
malformed tail, marked above as trailing JUNK. All previous
chunks here are well-formed, so the stack will eat up all
of them up to this point. In case JUNK does not fit
into a chunk header and there are no other chunks in
the input queue, or in case JUNK contains a garbage chunk
header but the encoded chunk length would exceed the skb
tail, or we came here from an entirely different scenario
and the chunk has the pdiscard=1 mark (without having had a flush
point), we end up excessively queueing up
the association's output queue (a correct final chunk may
then turn it into a response flood when flushing the
queue ;)): I ran a simple script with incremental ASCONF
serial numbers and could see the server side consuming an
excessive amount of RAM [before/after: up to 2GB and more].
The issue at heart is that the chunk train basically ends
with !end_of_packet and !singleton markers, and since commit
2e3216cd54b1 ("sctp: Follow security requirement of responding
with 1 packet") this prevents an output queue flush
point in sctp_do_sm() -> sctp_cmd_interpreter() on the input
chunk (chunk = event_arg) even though local_cork is set,
because the precedence has changed since then. In the normal
case, the last chunk with end_of_packet=1 would trigger the
queue flush to accommodate possible outgoing bundling.
In the input queue, sctp_inq_pop() seems to do the right thing
in terms of discarding invalid chunks. So the JUNK above will
not enter the state machine; instead it is released and we exit
the sctp_assoc_bh_rcv() chunk processing loop. It's simply
the flush point that is missing at loop exit. Adding a try-flush
approach on the output queue might not work as the underlying
infrastructure might be long gone at this point due to the
side-effect interpreter run.
One possibility, albeit a bit of a kludge, would be to defer
invalid chunk freeing into the state machine in order to
possibly trigger packet discards and thus indirectly a queue
flush on error. It would surely be better to discard chunks
as in the current, perhaps better controlled environment, but
going back and forth, it's simply architecturally not possible.
I tried various trailing JUNK attack cases and it seems to
look good now.
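For illustration, the core of that approach, condensed from the patch
below (kernel names kept for readability; a sketch, not a drop-in): a
partial trailing chunk is only marked in the input queue, and the state
machine's length check then rejects it, so the normal error handling -
and with it the output queue flush - still runs.

  /* input queue side: do not free the chunk, just mark and clamp it */
  if (chunk->chunk_end > skb_tail_pointer(chunk->skb)) {
          chunk->pdiscard = 1;
          chunk->chunk_end = skb_tail_pointer(chunk->skb);
  }

  /* state machine side: a marked chunk never passes length validation */
  if (unlikely(chunk->pdiscard))
          return 0;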
Joint work with Vlad Yasevich.
Fixes: 2e3216cd54b1 ("sctp: Follow security requirement of responding with 1 packet")
Signed-off-by: Daniel Borkmann <[email protected]>
Signed-off-by: Vlad Yasevich <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Cc: Josh Boyer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/sctp/inqueue.c | 33 +++++++--------------------------
net/sctp/sm_statefuns.c | 3 +++
2 files changed, 10 insertions(+), 26 deletions(-)
--- a/net/sctp/inqueue.c
+++ b/net/sctp/inqueue.c
@@ -140,18 +140,9 @@ struct sctp_chunk *sctp_inq_pop(struct s
} else {
/* Nothing to do. Next chunk in the packet, please. */
ch = (sctp_chunkhdr_t *) chunk->chunk_end;
-
/* Force chunk->skb->data to chunk->chunk_end. */
- skb_pull(chunk->skb,
- chunk->chunk_end - chunk->skb->data);
-
- /* Verify that we have at least chunk headers
- * worth of buffer left.
- */
- if (skb_headlen(chunk->skb) < sizeof(sctp_chunkhdr_t)) {
- sctp_chunk_free(chunk);
- chunk = queue->in_progress = NULL;
- }
+ skb_pull(chunk->skb, chunk->chunk_end - chunk->skb->data);
+ /* We are guaranteed to pull a SCTP header. */
}
}
@@ -187,24 +178,14 @@ struct sctp_chunk *sctp_inq_pop(struct s
skb_pull(chunk->skb, sizeof(sctp_chunkhdr_t));
chunk->subh.v = NULL; /* Subheader is no longer valid. */
- if (chunk->chunk_end < skb_tail_pointer(chunk->skb)) {
+ if (chunk->chunk_end + sizeof(sctp_chunkhdr_t) <
+ skb_tail_pointer(chunk->skb)) {
/* This is not a singleton */
chunk->singleton = 0;
} else if (chunk->chunk_end > skb_tail_pointer(chunk->skb)) {
- /* RFC 2960, Section 6.10 Bundling
- *
- * Partial chunks MUST NOT be placed in an SCTP packet.
- * If the receiver detects a partial chunk, it MUST drop
- * the chunk.
- *
- * Since the end of the chunk is past the end of our buffer
- * (which contains the whole packet, we can freely discard
- * the whole packet.
- */
- sctp_chunk_free(chunk);
- chunk = queue->in_progress = NULL;
-
- return NULL;
+ /* Discard inside state machine. */
+ chunk->pdiscard = 1;
+ chunk->chunk_end = skb_tail_pointer(chunk->skb);
} else {
/* We are at the end of the packet, so mark the chunk
* in case we need to send a SACK.
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -170,6 +170,9 @@ sctp_chunk_length_valid(struct sctp_chun
{
__u16 chunk_length = ntohs(chunk->chunk_hdr->length);
+ /* Previously already marked? */
+ if (unlikely(chunk->pdiscard))
+ return 0;
if (unlikely(chunk_length < required_length))
return 0;
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Borkmann <[email protected]>
commit b69040d8e39f20d5215a03502a8e8b4c6ab78395 upstream.
When receiving a e.g. semi-good formed connection scan in the
form of ...
-------------- INIT[ASCONF; ASCONF_ACK] ------------->
<----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------
-------------------- COOKIE-ECHO -------------------->
<-------------------- COOKIE-ACK ---------------------
---------------- ASCONF_a; ASCONF_b ----------------->
... where the ASCONF_a chunk equals ASCONF_b (at least both serials
need to be equal), we can panic an SCTP server!
The problem is that well-formed ASCONF chunks that we reply to with
ASCONF_ACK chunks are cached per serial. Thus, when we receive the
same ASCONF chunk twice (e.g. through a lost ASCONF_ACK), we do
not need to process it again on the server side (that was the
idea, also proposed in the RFC). Instead, we know it was cached
and we just resend the cached chunk. So far, so good.
Where things get nasty is in SCTP's side effect interpreter, that
is, sctp_cmd_interpreter():
While incoming ASCONF_a (chunk = event_arg) is being marked
!end_of_packet and !singleton, and we have an association context,
we do not flush the outqueue the first time after processing the
ASCONF_ACK singleton chunk via SCTP_CMD_REPLY. Instead, we keep it
queued up, although we set local_cork to 1. Commit 2e3216cd54b1
changed the precedence, so that as long as we get bundled incoming
chunks, we try possible bundling on the outgoing queue as well. Before
this commit, we would just flush the output queue.
Now, while ASCONF_a's ASCONF_ACK sits in the corked outq, we
continue to process the same ASCONF_b chunk from the packet. As
we have cached the previous ASCONF_ACK, we find it, grab it and
do another SCTP_CMD_REPLY command on it. So, effectively, we rip
the chunk->list pointers and requeue the same ASCONF_ACK chunk
another time. Since we process ASCONF_b, it's correctly marked
with end_of_packet, so we enforce an uncork and thus a flush, which
crashes the kernel.
Fix it by testing if the ASCONF_ACK is currently pending and if
that is the case, do not requeue it. When flushing the output
queue we may relink the chunk for preparing an outgoing packet,
but eventually unlink it when it's copied into the skb right
before transmission.
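For illustration, the pending test condensed from the patch below: a
cached ASCONF_ACK whose list linkage is non-empty is still queued for
transmission and must be skipped by the lookup instead of being
requeued.

  static inline bool sctp_chunk_pending(const struct sctp_chunk *chunk)
  {
          /* non-empty linkage means the chunk sits on an output list */
          return !list_empty(&chunk->list);
  }

  list_for_each_entry(ack, &asoc->asconf_ack_list, transmitted_list) {
          if (sctp_chunk_pending(ack))
                  continue;        /* still in flight, don't requeue */
          if (ack->subh.addip_hdr->serial == serial) {
                  sctp_chunk_hold(ack);
                  return ack;
          }
  }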
Joint work with Vlad Yasevich.
Fixes: 2e3216cd54b1 ("sctp: Follow security requirement of responding with 1 packet")
Signed-off-by: Daniel Borkmann <[email protected]>
Signed-off-by: Vlad Yasevich <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Cc: Josh Boyer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/net/sctp/sctp.h | 5 +++++
net/sctp/associola.c | 2 ++
2 files changed, 7 insertions(+)
--- a/include/net/sctp/sctp.h
+++ b/include/net/sctp/sctp.h
@@ -426,6 +426,11 @@ static inline void sctp_assoc_pending_pm
asoc->pmtu_pending = 0;
}
+static inline bool sctp_chunk_pending(const struct sctp_chunk *chunk)
+{
+ return !list_empty(&chunk->list);
+}
+
/* Walk through a list of TLV parameters. Don't trust the
* individual parameter lengths and instead depend on
* the chunk length to indicate when to stop. Make sure
--- a/net/sctp/associola.c
+++ b/net/sctp/associola.c
@@ -1668,6 +1668,8 @@ struct sctp_chunk *sctp_assoc_lookup_asc
* ack chunk whose serial number matches that of the request.
*/
list_for_each_entry(ack, &asoc->asconf_ack_list, transmitted_list) {
+ if (sctp_chunk_pending(ack))
+ continue;
if (ack->subh.addip_hdr->serial == serial) {
sctp_chunk_hold(ack);
return ack;
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nadav Amit <[email protected]>
commit a2b9e6c1a35afcc0973acb72e591c714e78885ff upstream.
Commit fc3a9157d314 ("KVM: X86: Don't report L2 emulation failures to
user-space") disabled the reporting of L2 (nested guest) emulation failures to
userspace due to a race condition between a vmexit and the instruction emulator.
The same rationale also applies to userspace applications that are permitted by
the guest OS to access MMIO areas or perform PIO.
This patch extends the current behavior - injecting a #UD instead of
reporting the failure to userspace - to guest userspace code as well.
Signed-off-by: Nadav Amit <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kvm/x86.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5002,7 +5002,7 @@ static int handle_emulation_failure(stru
++vcpu->stat.insn_emulation_fail;
trace_kvm_emulate_insn_failed(vcpu);
- if (!is_guest_mode(vcpu)) {
+ if (!is_guest_mode(vcpu) && kvm_x86_ops->get_cpl(vcpu) == 0) {
vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
vcpu->run->internal.ndata = 0;
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stephan Mueller <[email protected]>
commit 725c7f619e20f5051bba627fca11dc107c2a93b1 upstream.
The Yoga 3 does not contain any physical rfkill switch. Therefore,
disable the rfkill switch using the same approach as for the Yoga 2.
Signed-off-by: Stephan Mueller <[email protected]>
Signed-off-by: Darren Hart <[email protected]>
Cc: Josh Boyer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/platform/x86/ideapad-laptop.c | 7 +++++++
1 file changed, 7 insertions(+)
--- a/drivers/platform/x86/ideapad-laptop.c
+++ b/drivers/platform/x86/ideapad-laptop.c
@@ -837,6 +837,13 @@ static const struct dmi_system_id no_hw_
DMI_MATCH(DMI_PRODUCT_VERSION, "Lenovo Yoga 2"),
},
},
+ {
+ .ident = "Lenovo Yoga 3 Pro 1370",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_VERSION, "Lenovo YOGA 3 Pro-1370"),
+ },
+ },
{}
};
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joe Perches <[email protected]>
commit d2207ccbc59900311c88bb9150b24253cd4ddd49 upstream.
There's a useless use of "+" that needs to be removed, as perl 5.20 emits a
"Useless use of greediness modifier '+'" message each time it's hit.
Signed-off-by: Joe Perches <[email protected]>
Reported-by: Greg KH <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
scripts/checkpatch.pl | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2424,7 +2424,7 @@ sub process {
"please, no space before tabs\n" . $herevet) &&
$fix) {
while ($fixed[$fixlinenr] =~
- s/(^\+.*) {8,8}+\t/$1\t\t/) {}
+ s/(^\+.*) {8,8}\t/$1\t\t/) {}
while ($fixed[$fixlinenr] =~
s/(^\+.*) +\t/$1\t/) {}
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mikulas Patocka <[email protected]>
commit 9d28eb12447ee08bb5d1e8bb3195cf20e1ecd1c0 upstream.
The shrinker uses gfp flags to indicate what kind of operation the
driver can wait for. If the __GFP_IO flag is present, the driver can wait
for block I/O operations; if the __GFP_FS flag is present, the driver can
wait on operations involving the filesystem.
dm-bufio tested for __GFP_IO. However, dm-bufio can run on a loop block
device that makes calls into the filesystem. If __GFP_IO is present and
__GFP_FS isn't, dm-bufio could still block on filesystem operations if it
runs on a loop block device.
The change from __GFP_IO to __GFP_FS supposedly fixes one observed (though
unreproducible) deadlock involving dm-bufio and a loop device.
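For illustration, the resulting gate in the shrinker callbacks, condensed
from the hunks below: only take the client lock blocking when the caller
is allowed to wait on filesystem activity, otherwise fall back to a
trylock.

  if (sc->gfp_mask & __GFP_FS)
          dm_bufio_lock(c);               /* safe to sleep here */
  else if (!dm_bufio_trylock(c))
          return SHRINK_STOP;             /* caller must not wait on the FS */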
Signed-off-by: Mikulas Patocka <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/md/dm-bufio.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -1435,9 +1435,9 @@ static void drop_buffers(struct dm_bufio
/*
* Test if the buffer is unused and too old, and commit it.
- * At if noio is set, we must not do any I/O because we hold
- * dm_bufio_clients_lock and we would risk deadlock if the I/O gets rerouted to
- * different bufio client.
+ * And if GFP_NOFS is used, we must not do any I/O because we hold
+ * dm_bufio_clients_lock and we would risk deadlock if the I/O gets
+ * rerouted to different bufio client.
*/
static int __cleanup_old_buffer(struct dm_buffer *b, gfp_t gfp,
unsigned long max_jiffies)
@@ -1445,7 +1445,7 @@ static int __cleanup_old_buffer(struct d
if (jiffies - b->last_accessed < max_jiffies)
return 0;
- if (!(gfp & __GFP_IO)) {
+ if (!(gfp & __GFP_FS)) {
if (test_bit(B_READING, &b->state) ||
test_bit(B_WRITING, &b->state) ||
test_bit(B_DIRTY, &b->state))
@@ -1487,7 +1487,7 @@ dm_bufio_shrink_scan(struct shrinker *sh
unsigned long freed;
c = container_of(shrink, struct dm_bufio_client, shrinker);
- if (sc->gfp_mask & __GFP_IO)
+ if (sc->gfp_mask & __GFP_FS)
dm_bufio_lock(c);
else if (!dm_bufio_trylock(c))
return SHRINK_STOP;
@@ -1504,7 +1504,7 @@ dm_bufio_shrink_count(struct shrinker *s
unsigned long count;
c = container_of(shrink, struct dm_bufio_client, shrinker);
- if (sc->gfp_mask & __GFP_IO)
+ if (sc->gfp_mask & __GFP_FS)
dm_bufio_lock(c);
else if (!dm_bufio_trylock(c))
return 0;
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski <[email protected]>
commit bdbe81445407644492b9ac69a24d35e3202d773b upstream.
The charger manager obtained a reference to the fuel gauge power supply in
probe with power_supply_get_by_name() for later usage. However, if the fuel
gauge driver was removed and re-added, this reference would point to the old
power supply (from the driver which was removed).
This led to accessing old (and probably invalid) memory, which could be
observed with:
$ echo "12-0036" > /sys/bus/i2c/drivers/max17042/unbind
$ echo "12-0036" > /sys/bus/i2c/drivers/max17042/bind
$ cat /sys/devices/virtual/power_supply/battery/capacity
[ 240.480084] INFO: task cat:1393 blocked for more than 120 seconds.
[ 240.484799] Not tainted 3.17.0-next-20141007-00028-ge60b6dd79570 #203
[ 240.491782] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 240.499589] cat D c0469530 0 1393 1 0x00000000
[ 240.505947] [<c0469530>] (__schedule) from [<c0469d3c>] (schedule_preempt_disabled+0x14/0x20)
[ 240.514449] [<c0469d3c>] (schedule_preempt_disabled) from [<c046af08>] (mutex_lock_nested+0x1bc/0x458)
[ 240.523736] [<c046af08>] (mutex_lock_nested) from [<c0287a98>] (regmap_read+0x30/0x60)
[ 240.531647] [<c0287a98>] (regmap_read) from [<c032238c>] (max17042_get_property+0x2e8/0x350)
[ 240.540055] [<c032238c>] (max17042_get_property) from [<c03247d8>] (charger_get_property+0x264/0x348)
[ 240.549252] [<c03247d8>] (charger_get_property) from [<c0320764>] (power_supply_show_property+0x48/0x1e0)
[ 240.558808] [<c0320764>] (power_supply_show_property) from [<c027308c>] (dev_attr_show+0x1c/0x48)
[ 240.567664] [<c027308c>] (dev_attr_show) from [<c0141fb0>] (sysfs_kf_seq_show+0x84/0x104)
[ 240.575814] [<c0141fb0>] (sysfs_kf_seq_show) from [<c0140b18>] (kernfs_seq_show+0x24/0x28)
[ 240.584061] [<c0140b18>] (kernfs_seq_show) from [<c0104574>] (seq_read+0x1b0/0x484)
[ 240.591702] [<c0104574>] (seq_read) from [<c00e1e24>] (vfs_read+0x88/0x144)
[ 240.598640] [<c00e1e24>] (vfs_read) from [<c00e1f20>] (SyS_read+0x40/0x8c)
[ 240.605507] [<c00e1f20>] (SyS_read) from [<c000e760>] (ret_fast_syscall+0x0/0x48)
[ 240.612952] 4 locks held by cat/1393:
[ 240.616589] #0: (&p->lock){+.+.+.}, at: [<c01043f4>] seq_read+0x30/0x484
[ 240.623414] #1: (&of->mutex){+.+.+.}, at: [<c01417dc>] kernfs_seq_start+0x1c/0x8c
[ 240.631086] #2: (s_active#31){++++.+}, at: [<c01417e4>] kernfs_seq_start+0x24/0x8c
[ 240.638777] #3: (&map->mutex){+.+...}, at: [<c0287a98>] regmap_read+0x30/0x60
The charger-manager should get a reference to the fuel gauge power supply
on each use of the get_property callback. The thermal zone 'tzd' field of
the power supply should not be used, for the same reason.
Additionally, this change also solves the issue of nested
thermal_zone_get_temp() calls and the related false lockdep positive for a
deadlock on the thermal zone's mutex [1]. When the fuel gauge is used as the
source of temperature, the charger manager forwards its get_temp calls to
the fuel gauge's thermal zone. So actually different mutexes are used (one
for the charger manager's thermal zone and a second for the fuel gauge's
thermal zone), but for lockdep this is one class of mutex.
The recursion is removed by retrieving the temperature through the power
supply's get_property().
In case an external thermal zone is used (the 'cm-thermal-zone' property is
present in the DTS), the recursion does not exist. The charger manager simply
exports the POWER_SUPPLY_PROP_TEMP_AMBIENT property (instead of
POWER_SUPPLY_PROP_TEMP), thus no thermal zone is created for this power
supply.
[1] https://lkml.org/lkml/2014/10/6/309
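For illustration, a minimal sketch of the per-call lookup pattern the fix
uses (the helper name is made up; the real patch open-codes this in the
individual callbacks below):

  static int cm_get_fuel_gauge_prop(struct charger_manager *cm,
                                    enum power_supply_property psp,
                                    union power_supply_propval *val)
  {
          struct power_supply *fuel_gauge;

          /* look the supply up by name on every call, never cache it */
          fuel_gauge = power_supply_get_by_name(cm->desc->psy_fuel_gauge);
          if (!fuel_gauge)
                  return -ENODEV;

          return fuel_gauge->get_property(fuel_gauge, psp, val);
  }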
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Fixes: 3bb3dbbd56ea ("power_supply: Add initial Charger-Manager driver")
Signed-off-by: Sebastian Reichel <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/power/charger-manager.c | 99 ++++++++++++++++++++++++----------
include/linux/power/charger-manager.h | 1
2 files changed, 71 insertions(+), 29 deletions(-)
--- a/drivers/power/charger-manager.c
+++ b/drivers/power/charger-manager.c
@@ -97,6 +97,7 @@ static struct charger_global_desc *g_des
static bool is_batt_present(struct charger_manager *cm)
{
union power_supply_propval val;
+ struct power_supply *psy;
bool present = false;
int i, ret;
@@ -107,7 +108,11 @@ static bool is_batt_present(struct charg
case CM_NO_BATTERY:
break;
case CM_FUEL_GAUGE:
- ret = cm->fuel_gauge->get_property(cm->fuel_gauge,
+ psy = power_supply_get_by_name(cm->desc->psy_fuel_gauge);
+ if (!psy)
+ break;
+
+ ret = psy->get_property(psy,
POWER_SUPPLY_PROP_PRESENT, &val);
if (ret == 0 && val.intval)
present = true;
@@ -167,12 +172,14 @@ static bool is_ext_pwr_online(struct cha
static int get_batt_uV(struct charger_manager *cm, int *uV)
{
union power_supply_propval val;
+ struct power_supply *fuel_gauge;
int ret;
- if (!cm->fuel_gauge)
+ fuel_gauge = power_supply_get_by_name(cm->desc->psy_fuel_gauge);
+ if (!fuel_gauge)
return -ENODEV;
- ret = cm->fuel_gauge->get_property(cm->fuel_gauge,
+ ret = fuel_gauge->get_property(fuel_gauge,
POWER_SUPPLY_PROP_VOLTAGE_NOW, &val);
if (ret)
return ret;
@@ -248,6 +255,7 @@ static bool is_full_charged(struct charg
{
struct charger_desc *desc = cm->desc;
union power_supply_propval val;
+ struct power_supply *fuel_gauge;
int ret = 0;
int uV;
@@ -255,11 +263,15 @@ static bool is_full_charged(struct charg
if (!is_batt_present(cm))
return false;
- if (cm->fuel_gauge && desc->fullbatt_full_capacity > 0) {
+ fuel_gauge = power_supply_get_by_name(cm->desc->psy_fuel_gauge);
+ if (!fuel_gauge)
+ return false;
+
+ if (desc->fullbatt_full_capacity > 0) {
val.intval = 0;
/* Not full if capacity of fuel gauge isn't full */
- ret = cm->fuel_gauge->get_property(cm->fuel_gauge,
+ ret = fuel_gauge->get_property(fuel_gauge,
POWER_SUPPLY_PROP_CHARGE_FULL, &val);
if (!ret && val.intval > desc->fullbatt_full_capacity)
return true;
@@ -273,10 +285,10 @@ static bool is_full_charged(struct charg
}
/* Full, if the capacity is more than fullbatt_soc */
- if (cm->fuel_gauge && desc->fullbatt_soc > 0) {
+ if (desc->fullbatt_soc > 0) {
val.intval = 0;
- ret = cm->fuel_gauge->get_property(cm->fuel_gauge,
+ ret = fuel_gauge->get_property(fuel_gauge,
POWER_SUPPLY_PROP_CAPACITY, &val);
if (!ret && val.intval >= desc->fullbatt_soc)
return true;
@@ -551,6 +563,20 @@ static int check_charging_duration(struc
return ret;
}
+static int cm_get_battery_temperature_by_psy(struct charger_manager *cm,
+ int *temp)
+{
+ struct power_supply *fuel_gauge;
+
+ fuel_gauge = power_supply_get_by_name(cm->desc->psy_fuel_gauge);
+ if (!fuel_gauge)
+ return -ENODEV;
+
+ return fuel_gauge->get_property(fuel_gauge,
+ POWER_SUPPLY_PROP_TEMP,
+ (union power_supply_propval *)temp);
+}
+
static int cm_get_battery_temperature(struct charger_manager *cm,
int *temp)
{
@@ -560,15 +586,18 @@ static int cm_get_battery_temperature(st
return -ENODEV;
#ifdef CONFIG_THERMAL
- ret = thermal_zone_get_temp(cm->tzd_batt, (unsigned long *)temp);
- if (!ret)
- /* Calibrate temperature unit */
- *temp /= 100;
-#else
- ret = cm->fuel_gauge->get_property(cm->fuel_gauge,
- POWER_SUPPLY_PROP_TEMP,
- (union power_supply_propval *)temp);
+ if (cm->tzd_batt) {
+ ret = thermal_zone_get_temp(cm->tzd_batt, (unsigned long *)temp);
+ if (!ret)
+ /* Calibrate temperature unit */
+ *temp /= 100;
+ } else
#endif
+ {
+ /* if-else continued from CONFIG_THERMAL */
+ ret = cm_get_battery_temperature_by_psy(cm, temp);
+ }
+
return ret;
}
@@ -827,6 +856,7 @@ static int charger_get_property(struct p
struct charger_manager *cm = container_of(psy,
struct charger_manager, charger_psy);
struct charger_desc *desc = cm->desc;
+ struct power_supply *fuel_gauge;
int ret = 0;
int uV;
@@ -857,14 +887,20 @@ static int charger_get_property(struct p
ret = get_batt_uV(cm, &val->intval);
break;
case POWER_SUPPLY_PROP_CURRENT_NOW:
- ret = cm->fuel_gauge->get_property(cm->fuel_gauge,
+ fuel_gauge = power_supply_get_by_name(cm->desc->psy_fuel_gauge);
+ if (!fuel_gauge) {
+ ret = -ENODEV;
+ break;
+ }
+ ret = fuel_gauge->get_property(fuel_gauge,
POWER_SUPPLY_PROP_CURRENT_NOW, val);
break;
case POWER_SUPPLY_PROP_TEMP:
case POWER_SUPPLY_PROP_TEMP_AMBIENT:
return cm_get_battery_temperature(cm, &val->intval);
case POWER_SUPPLY_PROP_CAPACITY:
- if (!cm->fuel_gauge) {
+ fuel_gauge = power_supply_get_by_name(cm->desc->psy_fuel_gauge);
+ if (!fuel_gauge) {
ret = -ENODEV;
break;
}
@@ -875,7 +911,7 @@ static int charger_get_property(struct p
break;
}
- ret = cm->fuel_gauge->get_property(cm->fuel_gauge,
+ ret = fuel_gauge->get_property(fuel_gauge,
POWER_SUPPLY_PROP_CAPACITY, val);
if (ret)
break;
@@ -924,7 +960,14 @@ static int charger_get_property(struct p
break;
case POWER_SUPPLY_PROP_CHARGE_NOW:
if (is_charging(cm)) {
- ret = cm->fuel_gauge->get_property(cm->fuel_gauge,
+ fuel_gauge = power_supply_get_by_name(
+ cm->desc->psy_fuel_gauge);
+ if (!fuel_gauge) {
+ ret = -ENODEV;
+ break;
+ }
+
+ ret = fuel_gauge->get_property(fuel_gauge,
POWER_SUPPLY_PROP_CHARGE_NOW,
val);
if (ret) {
@@ -1485,14 +1528,15 @@ err:
return ret;
}
-static int cm_init_thermal_data(struct charger_manager *cm)
+static int cm_init_thermal_data(struct charger_manager *cm,
+ struct power_supply *fuel_gauge)
{
struct charger_desc *desc = cm->desc;
union power_supply_propval val;
int ret;
/* Verify whether fuel gauge provides battery temperature */
- ret = cm->fuel_gauge->get_property(cm->fuel_gauge,
+ ret = fuel_gauge->get_property(fuel_gauge,
POWER_SUPPLY_PROP_TEMP, &val);
if (!ret) {
@@ -1502,8 +1546,6 @@ static int cm_init_thermal_data(struct c
cm->desc->measure_battery_temp = true;
}
#ifdef CONFIG_THERMAL
- cm->tzd_batt = cm->fuel_gauge->tzd;
-
if (ret && desc->thermal_zone) {
cm->tzd_batt =
thermal_zone_get_zone_by_name(desc->thermal_zone);
@@ -1666,6 +1708,7 @@ static int charger_manager_probe(struct
int ret = 0, i = 0;
int j = 0;
union power_supply_propval val;
+ struct power_supply *fuel_gauge;
if (g_desc && !rtc_dev && g_desc->rtc_name) {
rtc_dev = rtc_class_open(g_desc->rtc_name);
@@ -1744,8 +1787,8 @@ static int charger_manager_probe(struct
}
}
- cm->fuel_gauge = power_supply_get_by_name(desc->psy_fuel_gauge);
- if (!cm->fuel_gauge) {
+ fuel_gauge = power_supply_get_by_name(desc->psy_fuel_gauge);
+ if (!fuel_gauge) {
dev_err(&pdev->dev, "Cannot find power supply \"%s\"\n",
desc->psy_fuel_gauge);
return -ENODEV;
@@ -1788,13 +1831,13 @@ static int charger_manager_probe(struct
cm->charger_psy.num_properties = psy_default.num_properties;
/* Find which optional psy-properties are available */
- if (!cm->fuel_gauge->get_property(cm->fuel_gauge,
+ if (!fuel_gauge->get_property(fuel_gauge,
POWER_SUPPLY_PROP_CHARGE_NOW, &val)) {
cm->charger_psy.properties[cm->charger_psy.num_properties] =
POWER_SUPPLY_PROP_CHARGE_NOW;
cm->charger_psy.num_properties++;
}
- if (!cm->fuel_gauge->get_property(cm->fuel_gauge,
+ if (!fuel_gauge->get_property(fuel_gauge,
POWER_SUPPLY_PROP_CURRENT_NOW,
&val)) {
cm->charger_psy.properties[cm->charger_psy.num_properties] =
@@ -1802,7 +1845,7 @@ static int charger_manager_probe(struct
cm->charger_psy.num_properties++;
}
- ret = cm_init_thermal_data(cm);
+ ret = cm_init_thermal_data(cm, fuel_gauge);
if (ret) {
dev_err(&pdev->dev, "Failed to initialize thermal data\n");
cm->desc->measure_battery_temp = false;
--- a/include/linux/power/charger-manager.h
+++ b/include/linux/power/charger-manager.h
@@ -253,7 +253,6 @@ struct charger_manager {
struct device *dev;
struct charger_desc *desc;
- struct power_supply *fuel_gauge;
struct power_supply **charger_stat;
#ifdef CONFIG_THERMAL
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jeff Layton <[email protected]>
commit b3ecba096729f521312d1863ad22530695527aed upstream.
Bruce reported that he was seeing the following BUG pop:
BUG: sleeping function called from invalid context at mm/slab.c:2846
in_atomic(): 0, irqs_disabled(): 0, pid: 4539, name: mount.nfs
2 locks held by mount.nfs/4539:
#0: (nfs_clid_init_mutex){+.+.+.}, at: [<ffffffffa01c0a9a>] nfs4_discover_server_trunking+0x4a/0x2f0 [nfsv4]
#1: (rcu_read_lock){......}, at: [<ffffffffa00e3185>] gss_stringify_acceptor+0x5/0xb0 [auth_rpcgss]
Preemption disabled at:[<ffffffff81a4f082>] printk+0x4d/0x4f
CPU: 3 PID: 4539 Comm: mount.nfs Not tainted 3.18.0-rc1-00013-g5b095e9 #3393
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
ffff880021499390 ffff8800381476a8 ffffffff81a534cf 0000000000000001
0000000000000000 ffff8800381476c8 ffffffff81097854 00000000000000d0
0000000000000018 ffff880038147718 ffffffff8118e4f3 0000000020479f00
Call Trace:
[<ffffffff81a534cf>] dump_stack+0x4f/0x7c
[<ffffffff81097854>] __might_sleep+0x114/0x180
[<ffffffff8118e4f3>] __kmalloc+0x1a3/0x280
[<ffffffffa00e31d8>] gss_stringify_acceptor+0x58/0xb0 [auth_rpcgss]
[<ffffffffa00e3185>] ? gss_stringify_acceptor+0x5/0xb0 [auth_rpcgss]
[<ffffffffa006b438>] rpcauth_stringify_acceptor+0x18/0x30 [sunrpc]
[<ffffffffa01b0469>] nfs4_proc_setclientid+0x199/0x380 [nfsv4]
[<ffffffffa01b04d0>] ? nfs4_proc_setclientid+0x200/0x380 [nfsv4]
[<ffffffffa01bdf1a>] nfs40_discover_server_trunking+0xda/0x150 [nfsv4]
[<ffffffffa01bde45>] ? nfs40_discover_server_trunking+0x5/0x150 [nfsv4]
[<ffffffffa01c0acf>] nfs4_discover_server_trunking+0x7f/0x2f0 [nfsv4]
[<ffffffffa01c8e24>] nfs4_init_client+0x104/0x2f0 [nfsv4]
[<ffffffffa01539b4>] nfs_get_client+0x314/0x3f0 [nfs]
[<ffffffffa0153780>] ? nfs_get_client+0xe0/0x3f0 [nfs]
[<ffffffffa01c83aa>] nfs4_set_client+0x8a/0x110 [nfsv4]
[<ffffffffa0069708>] ? __rpc_init_priority_wait_queue+0xa8/0xf0 [sunrpc]
[<ffffffffa01c9b2f>] nfs4_create_server+0x12f/0x390 [nfsv4]
[<ffffffffa01c1472>] nfs4_remote_mount+0x32/0x60 [nfsv4]
[<ffffffff81196489>] mount_fs+0x39/0x1b0
[<ffffffff81166145>] ? __alloc_percpu+0x15/0x20
[<ffffffff811b276b>] vfs_kern_mount+0x6b/0x150
[<ffffffffa01c1396>] nfs_do_root_mount+0x86/0xc0 [nfsv4]
[<ffffffffa01c1784>] nfs4_try_mount+0x44/0xc0 [nfsv4]
[<ffffffffa01549b7>] ? get_nfs_version+0x27/0x90 [nfs]
[<ffffffffa0161a2d>] nfs_fs_mount+0x47d/0xd60 [nfs]
[<ffffffff81a59c5e>] ? mutex_unlock+0xe/0x10
[<ffffffffa01606a0>] ? nfs_remount+0x430/0x430 [nfs]
[<ffffffffa01609c0>] ? nfs_clone_super+0x140/0x140 [nfs]
[<ffffffff81196489>] mount_fs+0x39/0x1b0
[<ffffffff81166145>] ? __alloc_percpu+0x15/0x20
[<ffffffff811b276b>] vfs_kern_mount+0x6b/0x150
[<ffffffff811b5830>] do_mount+0x210/0xbe0
[<ffffffff811b54ca>] ? copy_mount_options+0x3a/0x160
[<ffffffff811b651f>] SyS_mount+0x6f/0xb0
[<ffffffff81a5c852>] system_call_fastpath+0x12/0x17
Sleeping under the rcu_read_lock is bad. This patch fixes it by dropping
the rcu_read_lock before doing the allocation and then reacquiring it
and redoing the dereference before doing the copy. If we find that the
string has somehow grown in the meantime, we'll reallocate and try again.
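For illustration, the shape of that pattern with hypothetical names
(obj/head/buf are placeholders, not the sunrpc code itself): never call a
sleeping allocator under rcu_read_lock(); snapshot the length, drop the
lock, allocate with GFP_KERNEL, then retake the lock and revalidate before
copying.

  rcu_read_lock();
  obj = rcu_dereference(head);
  len = obj ? obj->len : 0;
  rcu_read_unlock();
  if (!len)
          return NULL;
  realloc:
  buf = kmalloc(len + 1, GFP_KERNEL);     /* may sleep, RCU lock dropped */
  if (!buf)
          return NULL;

  rcu_read_lock();
  obj = rcu_dereference(head);            /* may have changed meanwhile */
  if (!obj || !obj->len) {
          kfree(buf);
          buf = NULL;
          goto out;
  }
  if (len < obj->len) {                   /* grew in the meantime: retry */
          len = obj->len;
          rcu_read_unlock();
          kfree(buf);
          goto realloc;
  }
  memcpy(buf, obj->data, obj->len);
  buf[obj->len] = '\0';
  out:
  rcu_read_unlock();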
Reported-by: "J. Bruce Fields" <[email protected]>
Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/sunrpc/auth_gss/auth_gss.c | 35 ++++++++++++++++++++++++++++++-----
1 file changed, 30 insertions(+), 5 deletions(-)
--- a/net/sunrpc/auth_gss/auth_gss.c
+++ b/net/sunrpc/auth_gss/auth_gss.c
@@ -1353,6 +1353,7 @@ gss_stringify_acceptor(struct rpc_cred *
char *string = NULL;
struct gss_cred *gss_cred = container_of(cred, struct gss_cred, gc_base);
struct gss_cl_ctx *ctx;
+ unsigned int len;
struct xdr_netobj *acceptor;
rcu_read_lock();
@@ -1360,15 +1361,39 @@ gss_stringify_acceptor(struct rpc_cred *
if (!ctx)
goto out;
- acceptor = &ctx->gc_acceptor;
+ len = ctx->gc_acceptor.len;
+ rcu_read_unlock();
/* no point if there's no string */
- if (!acceptor->len)
- goto out;
-
- string = kmalloc(acceptor->len + 1, GFP_KERNEL);
+ if (!len)
+ return NULL;
+realloc:
+ string = kmalloc(len + 1, GFP_KERNEL);
if (!string)
+ return NULL;
+
+ rcu_read_lock();
+ ctx = rcu_dereference(gss_cred->gc_ctx);
+
+ /* did the ctx disappear or was it replaced by one with no acceptor? */
+ if (!ctx || !ctx->gc_acceptor.len) {
+ kfree(string);
+ string = NULL;
goto out;
+ }
+
+ acceptor = &ctx->gc_acceptor;
+
+ /*
+ * Did we find a new acceptor that's longer than the original? Allocate
+ * a longer buffer and try again.
+ */
+ if (len < acceptor->len) {
+ len = acceptor->len;
+ rcu_read_unlock();
+ kfree(string);
+ goto realloc;
+ }
memcpy(string, acceptor->data, acceptor->len);
string[acceptor->len] = '\0';
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Geert Uytterhoeven <[email protected]>
commit 09712f557b31838092e1f22a5f2dd131a843a3de upstream.
When resuming from s2ram on an SMP system without cpufreq operating
points (e.g. there's no "operating-points" property for the CPU node in
DT, or the platform doesn't use DT yet), the kernel crashes when
bringing CPU 1 online:
Enabling non-boot CPUs ...
CPU1: Booted secondary processor
Unable to handle kernel NULL pointer dereference at virtual address 0000003c
pgd = ee5e6b00
[0000003c] *pgd=6e579003, *pmd=6e588003, *pte=00000000
Internal error: Oops: a07 [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 1246 Comm: s2ram Tainted: G W 3.18.0-rc3-koelsch-01614-g0377af242bb175c8-dirty #589
task: eeec5240 ti: ee704000 task.ti: ee704000
PC is at __cpufreq_add_dev.isra.24+0x24c/0x77c
LR is at __cpufreq_add_dev.isra.24+0x244/0x77c
pc : [<c0298efc>] lr : [<c0298ef4>] psr: 60000153
sp : ee705d48 ip : ee705d48 fp : ee705d84
r10: c04e0450 r9 : 00000000 r8 : 00000001
r7 : c05426a8 r6 : 00000001 r5 : 00000001 r4 : 00000000
r3 : 00000000 r2 : 00000000 r1 : 20000153 r0 : c0542734
Verify that policy is not NULL before dereferencing it to fix this.
Signed-off-by: Geert Uytterhoeven <[email protected]>
Fixes: 8414809c6a1e (cpufreq: Preserve policy structure across suspend/resume)
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/cpufreq/cpufreq.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1022,7 +1022,8 @@ static struct cpufreq_policy *cpufreq_po
read_unlock_irqrestore(&cpufreq_driver_lock, flags);
- policy->governor = NULL;
+ if (policy)
+ policy->governor = NULL;
return policy;
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski <[email protected]>
commit 21e863b233553998737e1b506c823a00bf012e00 upstream.
Memory allocated for 'name' was leaking if required binding properties
were not present.
The memory for 'name' was allocated early in probe with kasprintf(). It
was freed in the error paths executed before and after parsing the DTS,
but not in the error paths taken while parsing it.
Fix the error paths for parsing the device tree properties.
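For illustration, a generic sketch of the cleanup pattern the fix restores
(the label and surrounding code are illustrative, not necessarily the
driver's exact ones): every failure after the kasprintf() must jump to a
label that frees 'name' instead of returning directly.

  name = kasprintf(GFP_KERNEL, "%s-%d", client->name, num);
  if (!name)
          return -ENOMEM;

  ret = of_property_read_u32(np, "ti,current-limit",
                             &bq->init_data.current_limit);
  if (ret)
          goto error_2;           /* not a bare 'return ret' */

  /* ... remaining required properties, each with 'goto error_2' ... */

  return 0;

  error_2:
  kfree(name);
  return ret;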
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Fixes: faffd234cf85 ("bq2415x_charger: Add DT support")
Signed-off-by: Sebastian Reichel <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/power/bq2415x_charger.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/power/bq2415x_charger.c
+++ b/drivers/power/bq2415x_charger.c
@@ -1609,27 +1609,27 @@ static int bq2415x_probe(struct i2c_clie
ret = of_property_read_u32(np, "ti,current-limit",
&bq->init_data.current_limit);
if (ret)
- return ret;
+ goto error_2;
ret = of_property_read_u32(np, "ti,weak-battery-voltage",
&bq->init_data.weak_battery_voltage);
if (ret)
- return ret;
+ goto error_2;
ret = of_property_read_u32(np, "ti,battery-regulation-voltage",
&bq->init_data.battery_regulation_voltage);
if (ret)
- return ret;
+ goto error_2;
ret = of_property_read_u32(np, "ti,charge-current",
&bq->init_data.charge_current);
if (ret)
- return ret;
+ goto error_2;
ret = of_property_read_u32(np, "ti,termination-current",
&bq->init_data.termination_current);
if (ret)
- return ret;
+ goto error_2;
ret = of_property_read_u32(np, "ti,resistor-sense",
&bq->init_data.resistor_sense);
if (ret)
- return ret;
+ goto error_2;
} else {
memcpy(&bq->init_data, pdata, sizeof(bq->init_data));
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Borislav Petkov <[email protected]>
commit 4750a0d112cbfcc744929f1530ffe3193436766c upstream.
Konrad triggered the splat below in a 32-bit guest on an AMD
box. As it turns out, in save_microcode_in_initrd_amd() we're using the
*physical* address of the container *after* we have enabled paging and
thus we #PF in load_microcode_amd() when trying to access the microcode
container in the ramdisk range.
Because the ramdisk is exactly there:
[ 0.000000] RAMDISK: [mem 0x35e04000-0x36ef9fff]
and we fault at 0x35e04304.
And since this guest doesn't relocate the ramdisk, we don't do the
computation which will give us the correct virtual address and we end up
with the PA.
So, we should actually be using virtual addresses on 32-bit too by the
time we're freeing the initrd. Do that then!
Unpacking initramfs...
BUG: unable to handle kernel paging request at 35d4e304
IP: [<c042e905>] load_microcode_amd+0x25/0x4a0
*pde = 00000000
Oops: 0000 [#1] SMP
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.17.1-302.fc21.i686 #1
Hardware name: Xen HVM domU, BIOS 4.4.1 10/01/2014
task: f5098000 ti: f50d0000 task.ti: f50d0000
EIP: 0060:[<c042e905>] EFLAGS: 00010246 CPU: 0
EIP is at load_microcode_amd+0x25/0x4a0
EAX: 00000000 EBX: f6e9ec4c ECX: 00001ec4 EDX: 00000000
ESI: f5d4e000 EDI: 35d4e2fc EBP: f50d1ed0 ESP: f50d1e94
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
CR0: 8005003b CR2: 35d4e304 CR3: 00e33000 CR4: 000406d0
Stack:
00000000 00000000 f50d1ebc f50d1ec4 f5d4e000 c0d7735a f50d1ed0 15a3d17f
f50d1ec4 00600f20 00001ec4 bfb83203 f6e9ec4c f5d4e000 c0d7735a f50d1ed8
c0d80861 f50d1ee0 c0d80429 f50d1ef0 c0d889a9 f5d4e000 c0000000 f50d1f04
Call Trace:
? unpack_to_rootfs
? unpack_to_rootfs
save_microcode_in_initrd_amd
save_microcode_in_initrd
free_initrd_mem
populate_rootfs
? unpack_to_rootfs
do_one_initcall
? unpack_to_rootfs
? repair_env_string
? proc_mkdir
kernel_init_freeable
kernel_init
ret_from_kernel_thread
? rest_init
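For reference, the address-space rule the fix follows, as a short sketch
(check_container() is a placeholder): once paging is enabled, a physical
address such as the ramdisk location must be converted with __va() before
it is dereferenced.

  unsigned long pa = boot_params.hdr.ramdisk_image;   /* physical address */
  u8 *p = (u8 *)__va(pa);                             /* kernel virtual mapping */

  /* dereferencing 'p' is fine; dereferencing (u8 *)pa would fault */
  check_container(p);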
Reported-and-tested-by: Konrad Rzeszutek Wilk <[email protected]>
References: https://bugzilla.redhat.com/show_bug.cgi?id=1158204
Fixes: 75a1ba5b2c52 ("x86, microcode, AMD: Unify valid container checks")
Signed-off-by: Borislav Petkov <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kernel/cpu/microcode/amd_early.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
--- a/arch/x86/kernel/cpu/microcode/amd_early.c
+++ b/arch/x86/kernel/cpu/microcode/amd_early.c
@@ -348,6 +348,7 @@ int __init save_microcode_in_initrd_amd(
{
unsigned long cont;
enum ucode_state ret;
+ u8 *cont_va;
u32 eax;
if (!container)
@@ -355,13 +356,15 @@ int __init save_microcode_in_initrd_amd(
#ifdef CONFIG_X86_32
get_bsp_sig();
- cont = (unsigned long)container;
+ cont = (unsigned long)container;
+ cont_va = __va(container);
#else
/*
* We need the physical address of the container for both bitness since
* boot_params.hdr.ramdisk_image is a physical address.
*/
- cont = __pa(container);
+ cont = __pa(container);
+ cont_va = container;
#endif
/*
@@ -372,6 +375,8 @@ int __init save_microcode_in_initrd_amd(
if (relocated_ramdisk)
container = (u8 *)(__va(relocated_ramdisk) +
(cont - boot_params.hdr.ramdisk_image));
+ else
+ container = cont_va;
if (ucode_new_rev)
pr_info("microcode: updated early to new patch_level=0x%08x\n",
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Borislav Petkov <[email protected]>
commit 85be07c32496dc264661308e4d9d4e9ccaff8072 upstream.
We should be accessing it through a pointer, like on the BSP.
Tested-by: Richard Hendershot <[email protected]>
Fixes: 65cef1311d5d ("x86, microcode: Add a disable chicken bit")
Signed-off-by: Borislav Petkov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kernel/cpu/microcode/core_early.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kernel/cpu/microcode/core_early.c
+++ b/arch/x86/kernel/cpu/microcode/core_early.c
@@ -124,7 +124,7 @@ void __init load_ucode_bsp(void)
static bool check_loader_disabled_ap(void)
{
#ifdef CONFIG_X86_32
- return __pa_nodebug(dis_ucode_ldr);
+ return *((bool *)__pa_nodebug(&dis_ucode_ldr));
#else
return dis_ucode_ldr;
#endif
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tang Chen <[email protected]>
commit 0bd854200873894a76f32603ff2c4c988ad6b5b5 upstream.
When memory is hot-added, all of the memory is initially in the offline
state. So clear all zones' present_pages, because they will be updated in
online_pages() and offline_pages(). Otherwise, the /proc/zoneinfo output
will be corrupted:
When the memory of node2 is offline:
# cat /proc/zoneinfo
......
Node 2, zone Movable
......
spanned 8388608
present 8388608
managed 0
When we online memory on node2:
# cat /proc/zoneinfo
......
Node 2, zone Movable
......
spanned 8388608
present 16777216
managed 8388608
Signed-off-by: Tang Chen <[email protected]>
Reviewed-by: Yasuaki Ishimatsu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
mm/memory_hotplug.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1067,6 +1067,16 @@ out:
}
#endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */
+static void reset_node_present_pages(pg_data_t *pgdat)
+{
+ struct zone *z;
+
+ for (z = pgdat->node_zones; z < pgdat->node_zones + MAX_NR_ZONES; z++)
+ z->present_pages = 0;
+
+ pgdat->node_present_pages = 0;
+}
+
/* we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */
static pg_data_t __ref *hotadd_new_pgdat(int nid, u64 start)
{
@@ -1105,6 +1115,13 @@ static pg_data_t __ref *hotadd_new_pgdat
*/
reset_node_managed_pages(pgdat);
+ /*
+ * When memory is hot-added, all the memory is in offline state. So
+ * clear all zones' present_pages because they will be updated in
+ * online_pages() and offline_pages().
+ */
+ reset_node_present_pages(pgdat);
+
return pgdat;
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski <[email protected]>
commit cdaf3e15385d3232b52287e50692506f8fd01a09 upstream.
In probe, the charger manager obtained references to the power supplies of
all chargers with power_supply_get_by_name() for later usage. However,
if such a charger driver was removed, this reference would point to the
old power supply (from the driver which was removed).
This led to accessing invalid memory, which could be observed with:
$ echo "max77693-charger" > /sys/bus/platform/drivers/max77693-charger/unbind
$ grep . /sys/devices/virtual/power_supply/battery/charger.0/*
$ grep . /sys/devices/virtual/power_supply/battery/*
[ 15.339817] Unable to handle kernel paging request at virtual address 0001c12c
[ 15.346187] pgd = edd08000
[ 15.348814] [0001c12c] *pgd=6dce2831, *pte=00000000, *ppte=00000000
[ 15.355075] Internal error: Oops: 80000007 [#1] PREEMPT SMP ARM
[ 15.360967] Modules linked in:
[ 15.364010] CPU: 2 PID: 1388 Comm: grep Not tainted 3.17.0-next-20141007-00027-ga95e761db1b0 #245
[ 15.372859] task: ee03ad00 ti: edcf6000 task.ti: edcf6000
[ 15.378241] PC is at 0x1c12c
[ 15.381113] LR is at is_ext_pwr_online+0x30/0x6c
[ 15.385706] pc : [<0001c12c>] lr : [<c0339fc4>] psr: a0000013
[ 15.385706] sp : edcf7e88 ip : 00000000 fp : 00000000
[ 15.397161] r10: eeb02c08 r9 : c04b1f84 r8 : eeb02c00
[ 15.402369] r7 : edc69a10 r6 : eea6ac10 r5 : eea6ac10 r4 : 00000004
[ 15.408878] r3 : 0001c12c r2 : edcf7e8c r1 : 00000004 r0 : ee914418
[ 15.415390] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
[ 15.422506] Control: 10c5387d Table: 6dd0804a DAC: 00000015
[ 15.428236] Process grep (pid: 1388, stack limit = 0xedcf6240)
[ 15.434050] Stack: (0xedcf7e88 to 0xedcf8000)
[ 15.438395] 7e80: ee03ad00 00000000 edcf7f80 eea6aca8 edcf7ec4 c033b7b0
[ 15.446554] 7ea0: 00000001 ee1cc3f0 00000004 c06e1e44 eebdc000 c06e1e44 eeb02c00 c0337144
[ 15.454713] 7ec0: ee2dac68 c005cffc ee1cc3c0 c06e1e44 00000fff 00001000 eebdc000 c0278ca8
[ 15.462872] 7ee0: c0278c8c ee1cc3c0 eeb7ce00 c014422c edcf7f20 00008000 ee1cc3c0 ee9a48c0
[ 15.471030] 7f00: 00000001 00000001 edcf7f80 c0142d94 c0142d70 c01060f4 00021000 ee1cc3f0
[ 15.479190] 7f20: 00000000 00000000 c06a2150 eebdc000 2e7ec000 ee9a48c0 00008000 00021000
[ 15.487349] 7f40: edcf7f80 00008000 edcf6000 00021000 00021000 c00e39a4 00000000 ee9a48c0
[ 15.495508] 7f60: 00004000 00000000 00000000 ee9a48c0 ee9a48c0 00008000 00021000 c00e3aa0
[ 15.503668] 7f80: 00000000 00000000 0001f2e0 0001f2e0 00021000 00001000 00000003 c000f364
[ 15.511826] 7fa0: 00000000 c000f1a0 0001f2e0 00021000 00000003 00021000 00008000 00000000
[ 15.519986] 7fc0: 0001f2e0 00021000 00001000 00000003 00000001 000205e8 00000000 00021000
[ 15.528145] 7fe0: 00008000 bebbe910 0000a7ad b6edc49c 60000010 00000003 aaaaaaaa aaaaaaaa
[ 15.536320] [<c0339fc4>] (is_ext_pwr_online) from [<c033b7b0>] (charger_get_property+0x170/0x314)
[ 15.545164] [<c033b7b0>] (charger_get_property) from [<c0337144>] (power_supply_show_property+0x48/0x20c)
[ 15.554719] [<c0337144>] (power_supply_show_property) from [<c0278ca8>] (dev_attr_show+0x1c/0x48)
[ 15.563577] [<c0278ca8>] (dev_attr_show) from [<c014422c>] (sysfs_kf_seq_show+0x84/0x104)
[ 15.571725] [<c014422c>] (sysfs_kf_seq_show) from [<c0142d94>] (kernfs_seq_show+0x24/0x28)
[ 15.579973] [<c0142d94>] (kernfs_seq_show) from [<c01060f4>] (seq_read+0x1b0/0x484)
[ 15.587614] [<c01060f4>] (seq_read) from [<c00e39a4>] (vfs_read+0x88/0x144)
[ 15.594552] [<c00e39a4>] (vfs_read) from [<c00e3aa0>] (SyS_read+0x40/0x8c)
[ 15.601417] [<c00e3aa0>] (SyS_read) from [<c000f1a0>] (ret_fast_syscall+0x0/0x48)
[ 15.608877] Code: bad PC value
[ 15.611991] ---[ end trace a88fcc95208db283 ]---
The charger-manager should instead get a reference to the charger power
supply on each use of the get_property callback.
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Fixes: 3bb3dbbd56ea ("power_supply: Add initial Charger-Manager driver")
Signed-off-by: Sebastian Reichel <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/power/charger-manager.c | 64 ++++++++++++++++++++--------------
include/linux/power/charger-manager.h | 2 -
2 files changed, 39 insertions(+), 27 deletions(-)
--- a/drivers/power/charger-manager.c
+++ b/drivers/power/charger-manager.c
@@ -118,10 +118,17 @@ static bool is_batt_present(struct charg
present = true;
break;
case CM_CHARGER_STAT:
- for (i = 0; cm->charger_stat[i]; i++) {
- ret = cm->charger_stat[i]->get_property(
- cm->charger_stat[i],
- POWER_SUPPLY_PROP_PRESENT, &val);
+ for (i = 0; cm->desc->psy_charger_stat[i]; i++) {
+ psy = power_supply_get_by_name(
+ cm->desc->psy_charger_stat[i]);
+ if (!psy) {
+ dev_err(cm->dev, "Cannot find power supply \"%s\"\n",
+ cm->desc->psy_charger_stat[i]);
+ continue;
+ }
+
+ ret = psy->get_property(psy, POWER_SUPPLY_PROP_PRESENT,
+ &val);
if (ret == 0 && val.intval) {
present = true;
break;
@@ -144,14 +151,20 @@ static bool is_batt_present(struct charg
static bool is_ext_pwr_online(struct charger_manager *cm)
{
union power_supply_propval val;
+ struct power_supply *psy;
bool online = false;
int i, ret;
/* If at least one of them has one, it's yes. */
- for (i = 0; cm->charger_stat[i]; i++) {
- ret = cm->charger_stat[i]->get_property(
- cm->charger_stat[i],
- POWER_SUPPLY_PROP_ONLINE, &val);
+ for (i = 0; cm->desc->psy_charger_stat[i]; i++) {
+ psy = power_supply_get_by_name(cm->desc->psy_charger_stat[i]);
+ if (!psy) {
+ dev_err(cm->dev, "Cannot find power supply \"%s\"\n",
+ cm->desc->psy_charger_stat[i]);
+ continue;
+ }
+
+ ret = psy->get_property(psy, POWER_SUPPLY_PROP_ONLINE, &val);
if (ret == 0 && val.intval) {
online = true;
break;
@@ -196,6 +209,7 @@ static bool is_charging(struct charger_m
{
int i, ret;
bool charging = false;
+ struct power_supply *psy;
union power_supply_propval val;
/* If there is no battery, it cannot be charged */
@@ -203,17 +217,22 @@ static bool is_charging(struct charger_m
return false;
/* If at least one of the charger is charging, return yes */
- for (i = 0; cm->charger_stat[i]; i++) {
+ for (i = 0; cm->desc->psy_charger_stat[i]; i++) {
/* 1. The charger sholuld not be DISABLED */
if (cm->emergency_stop)
continue;
if (!cm->charger_enabled)
continue;
+ psy = power_supply_get_by_name(cm->desc->psy_charger_stat[i]);
+ if (!psy) {
+ dev_err(cm->dev, "Cannot find power supply \"%s\"\n",
+ cm->desc->psy_charger_stat[i]);
+ continue;
+ }
+
/* 2. The charger should be online (ext-power) */
- ret = cm->charger_stat[i]->get_property(
- cm->charger_stat[i],
- POWER_SUPPLY_PROP_ONLINE, &val);
+ ret = psy->get_property(psy, POWER_SUPPLY_PROP_ONLINE, &val);
if (ret) {
dev_warn(cm->dev, "Cannot read ONLINE value from %s\n",
cm->desc->psy_charger_stat[i]);
@@ -226,9 +245,7 @@ static bool is_charging(struct charger_m
* 3. The charger should not be FULL, DISCHARGING,
* or NOT_CHARGING.
*/
- ret = cm->charger_stat[i]->get_property(
- cm->charger_stat[i],
- POWER_SUPPLY_PROP_STATUS, &val);
+ ret = psy->get_property(psy, POWER_SUPPLY_PROP_STATUS, &val);
if (ret) {
dev_warn(cm->dev, "Cannot read STATUS value from %s\n",
cm->desc->psy_charger_stat[i]);
@@ -1772,15 +1789,12 @@ static int charger_manager_probe(struct
while (desc->psy_charger_stat[i])
i++;
- cm->charger_stat = devm_kzalloc(&pdev->dev,
- sizeof(struct power_supply *) * i, GFP_KERNEL);
- if (!cm->charger_stat)
- return -ENOMEM;
-
+ /* Check if charger's supplies are present at probe */
for (i = 0; desc->psy_charger_stat[i]; i++) {
- cm->charger_stat[i] = power_supply_get_by_name(
- desc->psy_charger_stat[i]);
- if (!cm->charger_stat[i]) {
+ struct power_supply *psy;
+
+ psy = power_supply_get_by_name(desc->psy_charger_stat[i]);
+ if (!psy) {
dev_err(&pdev->dev, "Cannot find power supply \"%s\"\n",
desc->psy_charger_stat[i]);
return -ENODEV;
@@ -2102,8 +2116,8 @@ static bool find_power_supply(struct cha
int i;
bool found = false;
- for (i = 0; cm->charger_stat[i]; i++) {
- if (psy == cm->charger_stat[i]) {
+ for (i = 0; cm->desc->psy_charger_stat[i]; i++) {
+ if (!strcmp(psy->name, cm->desc->psy_charger_stat[i])) {
found = true;
break;
}
--- a/include/linux/power/charger-manager.h
+++ b/include/linux/power/charger-manager.h
@@ -253,8 +253,6 @@ struct charger_manager {
struct device *dev;
struct charger_desc *desc;
- struct power_supply **charger_stat;
-
#ifdef CONFIG_THERMAL
struct thermal_zone_device *tzd_batt;
#endif
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski <[email protected]>
commit 0eaf437aa14949d2230aeab7364f4ab47901304a upstream.
On error, power_supply_get_by_phandle() returns either an ERR_PTR-encoded
ENODEV or NULL. The driver later expects the obtained power supply pointer
to be valid or NULL. If it is not NULL, it dereferences it in
bq2415x_notifier_call(), which would lead to dereferencing an
ENODEV-valued pointer.
Properly handle the power_supply_get_by_phandle() error case by replacing
the error value with NULL. This indicates that USB charger detection will
not be used.
Also fix the memory leak of 'name' if power_supply_get_by_phandle() fails
with NULL and the probe should be deferred.
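For illustration only, a minimal stand-alone C sketch of the three-way result
this idiom has to handle (the ERR_PTR/IS_ERR helpers are re-implemented in
user space and get_supply() is a made-up stand-in for
power_supply_get_by_phandle(); this is not driver code):

#include <stdio.h>
#include <errno.h>

/* toy re-implementations of the kernel's ERR_PTR helpers, for illustration */
#define MAX_ERRNO 4095
static void *ERR_PTR(long error) { return (void *)error; }
static long PTR_ERR(const void *ptr) { return (long)ptr; }
static int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* made-up stand-in: may return a valid pointer, NULL (supply not
 * registered yet) or an ERR_PTR (property absent) */
static void *get_supply(int what)
{
	static int dummy_supply;

	switch (what) {
	case 0: return &dummy_supply;
	case 1: return NULL;
	default: return ERR_PTR(-ENODEV);
	}
}

int main(void)
{
	for (int what = 0; what < 3; what++) {
		void *psy = get_supply(what);

		if (IS_ERR(psy))	/* property absent: fall back to NULL */
			printf("case %d: error %ld, continue without detection\n",
			       what, PTR_ERR(psy));
		else if (!psy)		/* not registered yet: defer the probe */
			printf("case %d: NULL, would return -EPROBE_DEFER\n", what);
		else			/* valid pointer: safe to dereference */
			printf("case %d: got a usable power supply\n", what);
	}
	return 0;
}

The old code only distinguished NULL from non-NULL, so the ERR_PTR case ended
up in the "usable pointer" branch and was later dereferenced.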
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Fixes: faffd234cf85 ("bq2415x_charger: Add DT support")
[small fix regarding the missing ti,usb-charger-detection info message]
Signed-off-by: Sebastian Reichel <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/power/bq2415x_charger.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
--- a/drivers/power/bq2415x_charger.c
+++ b/drivers/power/bq2415x_charger.c
@@ -1579,8 +1579,15 @@ static int bq2415x_probe(struct i2c_clie
if (np) {
bq->notify_psy = power_supply_get_by_phandle(np, "ti,usb-charger-detection");
- if (!bq->notify_psy)
- return -EPROBE_DEFER;
+ if (IS_ERR(bq->notify_psy)) {
+ dev_info(&client->dev,
+ "no 'ti,usb-charger-detection' property (err=%ld)\n",
+ PTR_ERR(bq->notify_psy));
+ bq->notify_psy = NULL;
+ } else if (!bq->notify_psy) {
+ ret = -EPROBE_DEFER;
+ goto error_2;
+ }
}
else if (pdata->notify_device)
bq->notify_psy = power_supply_get_by_name(pdata->notify_device);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joe Thornber <[email protected]>
commit 9b460d3699324d570a4d4161c3741431887f102f upstream.
The walk code was using a 'ro_spine' to hold its locked btree nodes.
But this data structure is designed for the rolling lock scheme, and
as such automatically unlocks blocks that are two steps up the call
chain. This is not suitable for the simple recursive walk algorithm,
which retraces its steps.
This code is only used by the persistent array code, which in turn is
only used by dm-cache. In order to trigger it you need to have a
mapping tree that is more than 2 levels deep, which equates to 8-16
million cache blocks. For instance, a 4T SSD with a very small block
size of 32k only just triggers this bug.
The fix just places the locked blocks on the stack, and stops using
the ro_spine altogether.
Signed-off-by: Joe Thornber <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/md/persistent-data/dm-btree-internal.h | 6 ++++++
drivers/md/persistent-data/dm-btree-spine.c | 2 +-
drivers/md/persistent-data/dm-btree.c | 24 ++++++++++--------------
3 files changed, 17 insertions(+), 15 deletions(-)
--- a/drivers/md/persistent-data/dm-btree-internal.h
+++ b/drivers/md/persistent-data/dm-btree-internal.h
@@ -42,6 +42,12 @@ struct btree_node {
} __packed;
+/*
+ * Locks a block using the btree node validator.
+ */
+int bn_read_lock(struct dm_btree_info *info, dm_block_t b,
+ struct dm_block **result);
+
void inc_children(struct dm_transaction_manager *tm, struct btree_node *n,
struct dm_btree_value_type *vt);
--- a/drivers/md/persistent-data/dm-btree-spine.c
+++ b/drivers/md/persistent-data/dm-btree-spine.c
@@ -92,7 +92,7 @@ struct dm_block_validator btree_node_val
/*----------------------------------------------------------------*/
-static int bn_read_lock(struct dm_btree_info *info, dm_block_t b,
+int bn_read_lock(struct dm_btree_info *info, dm_block_t b,
struct dm_block **result)
{
return dm_tm_read_lock(info->tm, b, &btree_node_validator, result);
--- a/drivers/md/persistent-data/dm-btree.c
+++ b/drivers/md/persistent-data/dm-btree.c
@@ -847,22 +847,26 @@ EXPORT_SYMBOL_GPL(dm_btree_find_lowest_k
* FIXME: We shouldn't use a recursive algorithm when we have limited stack
* space. Also this only works for single level trees.
*/
-static int walk_node(struct ro_spine *s, dm_block_t block,
+static int walk_node(struct dm_btree_info *info, dm_block_t block,
int (*fn)(void *context, uint64_t *keys, void *leaf),
void *context)
{
int r;
unsigned i, nr;
+ struct dm_block *node;
struct btree_node *n;
uint64_t keys;
- r = ro_step(s, block);
- n = ro_node(s);
+ r = bn_read_lock(info, block, &node);
+ if (r)
+ return r;
+
+ n = dm_block_data(node);
nr = le32_to_cpu(n->header.nr_entries);
for (i = 0; i < nr; i++) {
if (le32_to_cpu(n->header.flags) & INTERNAL_NODE) {
- r = walk_node(s, value64(n, i), fn, context);
+ r = walk_node(info, value64(n, i), fn, context);
if (r)
goto out;
} else {
@@ -874,7 +878,7 @@ static int walk_node(struct ro_spine *s,
}
out:
- ro_pop(s);
+ dm_tm_unlock(info->tm, node);
return r;
}
@@ -882,15 +886,7 @@ int dm_btree_walk(struct dm_btree_info *
int (*fn)(void *context, uint64_t *keys, void *leaf),
void *context)
{
- int r;
- struct ro_spine spine;
-
BUG_ON(info->levels > 1);
-
- init_ro_spine(&spine, info);
- r = walk_node(&spine, root, fn, context);
- exit_ro_spine(&spine);
-
- return r;
+ return walk_node(info, root, fn, context);
}
EXPORT_SYMBOL_GPL(dm_btree_walk);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miklos Szeredi <[email protected]>
commit 799b601451b21ebe7af0e6e8f6e2ccd4683c5064 upstream.
Audit rules disappear when an inode they watch is evicted from the cache.
This is likely not what we want.
The guilty commit is "fsnotify: allow marks to not pin inodes in core",
which didn't take into account that audit_tree adds watches with a zero
mask.
Adding any mask should fix this.
Fixes: 90b1e7a57880 ("fsnotify: allow marks to not pin inodes in core")
Signed-off-by: Miklos Szeredi <[email protected]>
Signed-off-by: Paul Moore <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/audit_tree.c | 1 +
1 file changed, 1 insertion(+)
--- a/kernel/audit_tree.c
+++ b/kernel/audit_tree.c
@@ -154,6 +154,7 @@ static struct audit_chunk *alloc_chunk(i
chunk->owners[i].index = i;
}
fsnotify_init_mark(&chunk->mark, audit_tree_destroy_watch);
+ chunk->mark.mask = FS_IN_IGNORED;
return chunk;
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rabin Vincent <[email protected]>
commit e30f53aad2202b5526c40c36d8eeac8bf290bde5 upstream.
On a !PREEMPT kernel, attempting to use trace-cmd results in a soft
lockup:
# trace-cmd record -e raw_syscalls:* -F false
NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [trace-cmd:61]
...
Call Trace:
[<ffffffff8105b580>] ? __wake_up_common+0x90/0x90
[<ffffffff81092e25>] wait_on_pipe+0x35/0x40
[<ffffffff810936e3>] tracing_buffers_splice_read+0x2e3/0x3c0
[<ffffffff81093300>] ? tracing_stats_read+0x2a0/0x2a0
[<ffffffff812d10ab>] ? _raw_spin_unlock+0x2b/0x40
[<ffffffff810dc87b>] ? do_read_fault+0x21b/0x290
[<ffffffff810de56a>] ? handle_mm_fault+0x2ba/0xbd0
[<ffffffff81095c80>] ? trace_event_buffer_lock_reserve+0x40/0x80
[<ffffffff810951e2>] ? trace_buffer_lock_reserve+0x22/0x60
[<ffffffff81095c80>] ? trace_event_buffer_lock_reserve+0x40/0x80
[<ffffffff8112415d>] do_splice_to+0x6d/0x90
[<ffffffff81126971>] SyS_splice+0x7c1/0x800
[<ffffffff812d1edd>] tracesys_phase2+0xd3/0xd8
The problem is this: tracing_buffers_splice_read() calls
ring_buffer_wait() to wait for data in the ring buffers. The buffers
are not empty so ring_buffer_wait() returns immediately. But
tracing_buffers_splice_read() calls ring_buffer_read_page() with full=1,
meaning it only wants to read a full page. When the full page is not
available, tracing_buffers_splice_read() tries to wait again with
ring_buffer_wait(), which again returns immediately, and so on.
Fix this by adding a "full" argument to ring_buffer_wait() which will
make ring_buffer_wait() wait until the writer has left the reader's
page, i.e. until full-page reads will succeed.
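To see why the old wait condition spins, consider this toy stand-alone C model
(names and sizes are made up; it is not the kernel code): an any-data wait
returns immediately because the buffer is not empty, yet a full page never
becomes available, so the caller loops forever, which is exactly the soft
lockup above. The new "full" argument makes the wait condition match what the
caller actually needs.

#include <stdio.h>
#include <stdbool.h>

#define PAGE_SIZE 4096

static int bytes_available = 100;	/* some data, but less than a page */

/* models the old ring_buffer_wait(): returns as soon as any data exists */
static void wait_for_any_data(void)
{
	if (bytes_available > 0)
		return;			/* nothing to wait for */
	/* (a real implementation would sleep here) */
}

/* models ring_buffer_read_page() with full=1 */
static bool read_full_page(void)
{
	if (bytes_available < PAGE_SIZE)
		return false;		/* caller wants a whole page */
	bytes_available -= PAGE_SIZE;
	return true;
}

int main(void)
{
	long spins = 0;

	/* models the loop in tracing_buffers_splice_read() */
	while (!read_full_page()) {
		wait_for_any_data();	/* returns immediately: busy loop */
		if (++spins == 1000000) {
			printf("still spinning after %ld iterations\n", spins);
			break;		/* the real kernel loop never breaks */
		}
	}
	return 0;
}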
Link: http://lkml.kernel.org/r/[email protected]
Fixes: b1169cc69ba9 ("tracing: Remove mock up poll wait function")
Signed-off-by: Rabin Vincent <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
index 49a4d6f59108..e2c13cd863bd 100644
--- a/include/linux/ring_buffer.h
+++ b/include/linux/ring_buffer.h
@@ -97,7 +97,7 @@ __ring_buffer_alloc(unsigned long size, unsigned flags, struct lock_class_key *k
__ring_buffer_alloc((size), (flags), &__key); \
})
-int ring_buffer_wait(struct ring_buffer *buffer, int cpu);
+int ring_buffer_wait(struct ring_buffer *buffer, int cpu, bool full);
int ring_buffer_poll_wait(struct ring_buffer *buffer, int cpu,
struct file *filp, poll_table *poll_table);
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 2d75c94ae87d..a56e07c8d15b 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -538,16 +538,18 @@ static void rb_wake_up_waiters(struct irq_work *work)
* ring_buffer_wait - wait for input to the ring buffer
* @buffer: buffer to wait on
* @cpu: the cpu buffer to wait on
+ * @full: wait until a full page is available, if @cpu != RING_BUFFER_ALL_CPUS
*
* If @cpu == RING_BUFFER_ALL_CPUS then the task will wake up as soon
* as data is added to any of the @buffer's cpu buffers. Otherwise
* it will wait for data to be added to a specific cpu buffer.
*/
-int ring_buffer_wait(struct ring_buffer *buffer, int cpu)
+int ring_buffer_wait(struct ring_buffer *buffer, int cpu, bool full)
{
- struct ring_buffer_per_cpu *cpu_buffer;
+ struct ring_buffer_per_cpu *uninitialized_var(cpu_buffer);
DEFINE_WAIT(wait);
struct rb_irq_work *work;
+ int ret = 0;
/*
* Depending on what the caller is waiting for, either any
@@ -564,36 +566,61 @@ int ring_buffer_wait(struct ring_buffer *buffer, int cpu)
}
- prepare_to_wait(&work->waiters, &wait, TASK_INTERRUPTIBLE);
+ while (true) {
+ prepare_to_wait(&work->waiters, &wait, TASK_INTERRUPTIBLE);
- /*
- * The events can happen in critical sections where
- * checking a work queue can cause deadlocks.
- * After adding a task to the queue, this flag is set
- * only to notify events to try to wake up the queue
- * using irq_work.
- *
- * We don't clear it even if the buffer is no longer
- * empty. The flag only causes the next event to run
- * irq_work to do the work queue wake up. The worse
- * that can happen if we race with !trace_empty() is that
- * an event will cause an irq_work to try to wake up
- * an empty queue.
- *
- * There's no reason to protect this flag either, as
- * the work queue and irq_work logic will do the necessary
- * synchronization for the wake ups. The only thing
- * that is necessary is that the wake up happens after
- * a task has been queued. It's OK for spurious wake ups.
- */
- work->waiters_pending = true;
+ /*
+ * The events can happen in critical sections where
+ * checking a work queue can cause deadlocks.
+ * After adding a task to the queue, this flag is set
+ * only to notify events to try to wake up the queue
+ * using irq_work.
+ *
+ * We don't clear it even if the buffer is no longer
+ * empty. The flag only causes the next event to run
+ * irq_work to do the work queue wake up. The worse
+ * that can happen if we race with !trace_empty() is that
+ * an event will cause an irq_work to try to wake up
+ * an empty queue.
+ *
+ * There's no reason to protect this flag either, as
+ * the work queue and irq_work logic will do the necessary
+ * synchronization for the wake ups. The only thing
+ * that is necessary is that the wake up happens after
+ * a task has been queued. It's OK for spurious wake ups.
+ */
+ work->waiters_pending = true;
+
+ if (signal_pending(current)) {
+ ret = -EINTR;
+ break;
+ }
+
+ if (cpu == RING_BUFFER_ALL_CPUS && !ring_buffer_empty(buffer))
+ break;
+
+ if (cpu != RING_BUFFER_ALL_CPUS &&
+ !ring_buffer_empty_cpu(buffer, cpu)) {
+ unsigned long flags;
+ bool pagebusy;
+
+ if (!full)
+ break;
+
+ raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
+ pagebusy = cpu_buffer->reader_page == cpu_buffer->commit_page;
+ raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
+
+ if (!pagebusy)
+ break;
+ }
- if ((cpu == RING_BUFFER_ALL_CPUS && ring_buffer_empty(buffer)) ||
- (cpu != RING_BUFFER_ALL_CPUS && ring_buffer_empty_cpu(buffer, cpu)))
schedule();
+ }
finish_wait(&work->waiters, &wait);
- return 0;
+
+ return ret;
}
/**
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 8a528392b1f4..15209335888d 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1076,13 +1076,14 @@ update_max_tr_single(struct trace_array *tr, struct task_struct *tsk, int cpu)
}
#endif /* CONFIG_TRACER_MAX_TRACE */
-static int wait_on_pipe(struct trace_iterator *iter)
+static int wait_on_pipe(struct trace_iterator *iter, bool full)
{
/* Iterators are static, they should be filled or empty */
if (trace_buffer_iter(iter, iter->cpu_file))
return 0;
- return ring_buffer_wait(iter->trace_buffer->buffer, iter->cpu_file);
+ return ring_buffer_wait(iter->trace_buffer->buffer, iter->cpu_file,
+ full);
}
#ifdef CONFIG_FTRACE_STARTUP_TEST
@@ -4434,15 +4435,12 @@ static int tracing_wait_pipe(struct file *filp)
mutex_unlock(&iter->mutex);
- ret = wait_on_pipe(iter);
+ ret = wait_on_pipe(iter, false);
mutex_lock(&iter->mutex);
if (ret)
return ret;
-
- if (signal_pending(current))
- return -EINTR;
}
return 1;
@@ -5372,16 +5370,12 @@ tracing_buffers_read(struct file *filp, char __user *ubuf,
goto out_unlock;
}
mutex_unlock(&trace_types_lock);
- ret = wait_on_pipe(iter);
+ ret = wait_on_pipe(iter, false);
mutex_lock(&trace_types_lock);
if (ret) {
size = ret;
goto out_unlock;
}
- if (signal_pending(current)) {
- size = -EINTR;
- goto out_unlock;
- }
goto again;
}
size = 0;
@@ -5587,14 +5581,11 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos,
goto out;
}
mutex_unlock(&trace_types_lock);
- ret = wait_on_pipe(iter);
+ ret = wait_on_pipe(iter, true);
mutex_lock(&trace_types_lock);
if (ret)
goto out;
- if (signal_pending(current)) {
- ret = -EINTR;
- goto out;
- }
+
goto again;
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Greg Kurz <[email protected]>
commit 24c65bc7037e7d0f362c0df70d17dd72ee64b8b9 upstream.
The add_early_randomness() function in drivers/char/hw_random/core.c passes
a 16-byte buffer to pseries_rng_data_read(). Unfortunately, plpar_hcall()
returns four 64-bit values and trashes 16 bytes on the stack.
This bug has been lying around for a long time. It got unveiled by:
commit d3cc7996473a7bdd33256029988ea690754e4e2a
Author: Amit Shah <[email protected]>
Date: Thu Jul 10 15:42:34 2014 +0530
hwrng: fetch randomness only after device init
It may trigger an oops while loading or unloading the pseries-rng module on both
PowerVM and PowerKVM guests.
This patch does two things:
- pass an intermediate, well-sized buffer to plpar_hcall(), as sketched below.
This is acceptable since we're not on a hot path.
- move to the new read API so that we know the return buffer size for sure.
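The bounce-buffer pattern from the first point can be sketched in isolation as
follows (the hypervisor call is faked and the names are made up; only the
sizes matter): the callee always fills four 64-bit words, so it must be given
at least that much room, and only the requested number of bytes is copied out.

#include <stdio.h>
#include <string.h>
#include <stdint.h>

#define HCALL_RETBUF_WORDS 4		/* plpar_hcall() returns four 64-bit values */

/* fake hypervisor call: always fills retbuf[0..3] */
static int fake_h_random(uint64_t *retbuf)
{
	for (int i = 0; i < HCALL_RETBUF_WORDS; i++)
		retbuf[i] = 0x1122334455667788ull + i;
	return 0;
}

/* safe pattern: bounce through a correctly sized local buffer */
static size_t rng_read(void *data, size_t max)
{
	uint64_t buffer[HCALL_RETBUF_WORDS];
	size_t size = max < 8 ? max : 8;	/* only 8 bytes are random data */

	fake_h_random(buffer);
	memcpy(data, buffer, size);
	return size;
}

int main(void)
{
	unsigned char out[16];			/* the 16-byte buffer the core passes */
	size_t n = rng_read(out, sizeof(out));

	printf("copied %zu bytes; the 32-byte hcall result stayed in the callee's buffer\n", n);
	return 0;
}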
Signed-off-by: Greg Kurz <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/char/hw_random/pseries-rng.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
--- a/drivers/char/hw_random/pseries-rng.c
+++ b/drivers/char/hw_random/pseries-rng.c
@@ -25,18 +25,21 @@
#include <asm/vio.h>
-static int pseries_rng_data_read(struct hwrng *rng, u32 *data)
+static int pseries_rng_read(struct hwrng *rng, void *data, size_t max, bool wait)
{
+ u64 buffer[PLPAR_HCALL_BUFSIZE];
+ size_t size = max < 8 ? max : 8;
int rc;
- rc = plpar_hcall(H_RANDOM, (unsigned long *)data);
+ rc = plpar_hcall(H_RANDOM, (unsigned long *)buffer);
if (rc != H_SUCCESS) {
pr_err_ratelimited("H_RANDOM call failed %d\n", rc);
return -EIO;
}
+ memcpy(data, buffer, size);
/* The hypervisor interface returns 64 bits */
- return 8;
+ return size;
}
/**
@@ -55,7 +58,7 @@ static unsigned long pseries_rng_get_des
static struct hwrng pseries_rng = {
.name = KBUILD_MODNAME,
- .data_read = pseries_rng_data_read,
+ .read = pseries_rng_read,
};
static int __init pseries_rng_probe(struct vio_dev *dev,
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Lutomirski <[email protected]>
commit 81f49a8fd7088cfcb588d182eeede862c0e3303e upstream.
is_compat_task() is the wrong check for audit arch; the check should
be is_ia32_task(): x32 syscalls should be AUDIT_ARCH_X86_64, not
AUDIT_ARCH_I386.
CONFIG_AUDITSYSCALL is currently incompatible with x32, so this has
no visible effect.
Signed-off-by: Andy Lutomirski <[email protected]>
Link: http://lkml.kernel.org/r/a0138ed8c709882aec06e4acc30bfa9b623b8717.1409954077.git.luto@amacapital.net
Signed-off-by: H. Peter Anvin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kernel/ptrace.c | 11 +----------
1 file changed, 1 insertion(+), 10 deletions(-)
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -1441,15 +1441,6 @@ void send_sigtrap(struct task_struct *ts
force_sig_info(SIGTRAP, &info, tsk);
}
-
-#ifdef CONFIG_X86_32
-# define IS_IA32 1
-#elif defined CONFIG_IA32_EMULATION
-# define IS_IA32 is_compat_task()
-#else
-# define IS_IA32 0
-#endif
-
/*
* We must return the syscall number to actually look up in the table.
* This can be -1L to skip running any syscall at all.
@@ -1487,7 +1478,7 @@ long syscall_trace_enter(struct pt_regs
if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
trace_sys_enter(regs, regs->orig_ax);
- if (IS_IA32)
+ if (is_ia32_task())
audit_syscall_entry(AUDIT_ARCH_I386,
regs->orig_ax,
regs->bx, regs->cx,
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Richard Guy Briggs <[email protected]>
commit 9ef91514774a140e468f99d73d7593521e6d25dc upstream.
When an AUDIT_GET_FEATURE message is sent from userspace to the kernel, it
should reply with a message tagged as an AUDIT_GET_FEATURE type with a struct
audit_feature. The current reply is a message tagged as an AUDIT_GET
type with a struct audit_feature.
This appears to have been a cut-and-paste error introduced in commit b0fed40.
Reported-by: Steve Grubb <[email protected]>
Signed-off-by: Richard Guy Briggs <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/audit.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -724,7 +724,7 @@ static int audit_get_feature(struct sk_b
seq = nlmsg_hdr(skb)->nlmsg_seq;
- audit_send_reply(skb, seq, AUDIT_GET, 0, 0, &af, sizeof(af));
+ audit_send_reply(skb, seq, AUDIT_GET_FEATURE, 0, 0, &af, sizeof(af));
return 0;
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Richard Guy Briggs <[email protected]>
commit 897f1acbb6702ddaa953e8d8436eee3b12016c7e upstream.
Add a space between subj= and feature= fields to make them parsable.
Signed-off-by: Richard Guy Briggs <[email protected]>
Signed-off-by: Paul Moore <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/audit.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -739,7 +739,7 @@ static void audit_log_feature_change(int
ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_FEATURE_CHANGE);
audit_log_task_info(ab, current);
- audit_log_format(ab, "feature=%s old=%u new=%u old_lock=%u new_lock=%u res=%d",
+ audit_log_format(ab, " feature=%s old=%u new=%u old_lock=%u new_lock=%u res=%d",
audit_feature_names[which], !!old_feature, !!new_feature,
!!old_lock, !!new_lock, res);
audit_log_end(ab);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Hellwig <[email protected]>
commit 48379270fe6808cf4612ee094adc8da2b7a83baa upstream.
Setups that use the blk-mq I/O path can lock up if a host with a single
device that has its door locked enters EH. Make sure to only send the
command to re-lock the door to devices that actually were reset and thus
might have lost their state. Otherwise the EH code might get blocked in
blk_get_request, as all requests for non-reset devices might be in use.
Signed-off-by: Christoph Hellwig <[email protected]>
Reported-by: Meelis Roos <[email protected]>
Tested-by: Meelis Roos <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/scsi/scsi_error.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1998,8 +1998,10 @@ static void scsi_restart_operations(stru
* is no point trying to lock the door of an off-line device.
*/
shost_for_each_device(sdev, shost) {
- if (scsi_device_online(sdev) && sdev->locked)
+ if (scsi_device_online(sdev) && sdev->was_reset && sdev->locked) {
scsi_eh_lock_door(sdev);
+ sdev->was_reset = 0;
+ }
}
/*
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Helge Deller <[email protected]>
commit 2fe749f50b0bec07650ef135b29b1f55bf543869 upstream.
Switch over the msgctl, shmat, shmctl and semtimedop syscalls to use the compat
layer. The problem was found with the debian procenv package, which called
shmctl(0, SHM_INFO, &info);
in which the shmctl syscall then overwrote parts of the stack surrounding the
area where the info variable was stored and thus led to a segfault later on.
Additionally fix the definition of struct shminfo64 to use unsigned longs like
the other architectures. This has no impact on userspace since we only have a
32-bit userspace up to now.
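The width mismatch is easy to demonstrate in isolation. A minimal sketch on a
64-bit build (field names shortened, not the real uapi header) shows how
switching the nine fields from unsigned int to unsigned long doubles the
structure size, which is the kind of mismatch that lets the kernel write past
the space a 32-bit caller reserved:

#include <stdio.h>

/* old parisc layout: nine 32-bit fields (36 bytes) */
struct shminfo64_int {
	unsigned int shmmax, shmmin, shmmni, shmseg, shmall;
	unsigned int unused1, unused2, unused3, unused4;
};

/* layout used by the other architectures: nine native longs (72 bytes on 64-bit) */
struct shminfo64_long {
	unsigned long shmmax, shmmin, shmmni, shmseg, shmall;
	unsigned long unused1, unused2, unused3, unused4;
};

int main(void)
{
	/* if the kernel fills in the wide layout while the caller only
	 * reserved room for the narrow one, the difference lands on the
	 * caller's stack */
	printf("int-based layout : %zu bytes\n", sizeof(struct shminfo64_int));
	printf("long-based layout: %zu bytes\n", sizeof(struct shminfo64_long));
	return 0;
}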
Signed-off-by: Helge Deller <[email protected]>
Cc: John David Anglin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/parisc/include/uapi/asm/shmbuf.h | 25 +++++++++----------------
arch/parisc/kernel/syscall_table.S | 8 ++++----
2 files changed, 13 insertions(+), 20 deletions(-)
--- a/arch/parisc/include/uapi/asm/shmbuf.h
+++ b/arch/parisc/include/uapi/asm/shmbuf.h
@@ -36,23 +36,16 @@ struct shmid64_ds {
unsigned int __unused2;
};
-#ifdef CONFIG_64BIT
-/* The 'unsigned int' (formerly 'unsigned long') data types below will
- * ensure that a 32-bit app calling shmctl(*,IPC_INFO,*) will work on
- * a wide kernel, but if some of these values are meant to contain pointers
- * they may need to be 'long long' instead. -PB XXX FIXME
- */
-#endif
struct shminfo64 {
- unsigned int shmmax;
- unsigned int shmmin;
- unsigned int shmmni;
- unsigned int shmseg;
- unsigned int shmall;
- unsigned int __unused1;
- unsigned int __unused2;
- unsigned int __unused3;
- unsigned int __unused4;
+ unsigned long shmmax;
+ unsigned long shmmin;
+ unsigned long shmmni;
+ unsigned long shmseg;
+ unsigned long shmall;
+ unsigned long __unused1;
+ unsigned long __unused2;
+ unsigned long __unused3;
+ unsigned long __unused4;
};
#endif /* _PARISC_SHMBUF_H */
--- a/arch/parisc/kernel/syscall_table.S
+++ b/arch/parisc/kernel/syscall_table.S
@@ -286,11 +286,11 @@
ENTRY_COMP(msgsnd)
ENTRY_COMP(msgrcv)
ENTRY_SAME(msgget) /* 190 */
- ENTRY_SAME(msgctl)
- ENTRY_SAME(shmat)
+ ENTRY_COMP(msgctl)
+ ENTRY_COMP(shmat)
ENTRY_SAME(shmdt)
ENTRY_SAME(shmget)
- ENTRY_SAME(shmctl) /* 195 */
+ ENTRY_COMP(shmctl) /* 195 */
ENTRY_SAME(ni_syscall) /* streams1 */
ENTRY_SAME(ni_syscall) /* streams2 */
ENTRY_SAME(lstat64)
@@ -323,7 +323,7 @@
ENTRY_SAME(epoll_ctl) /* 225 */
ENTRY_SAME(epoll_wait)
ENTRY_SAME(remap_file_pages)
- ENTRY_SAME(semtimedop)
+ ENTRY_COMP(semtimedop)
ENTRY_COMP(mq_open)
ENTRY_SAME(mq_unlink) /* 230 */
ENTRY_COMP(mq_timedsend)
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara <[email protected]>
commit ece9c72accdc45c3a9484dacb1125ce572647288 upstream.
The priority of a merged request is computed by ioprio_best(). If one of the
requests has an undefined priority (IOPRIO_CLASS_NONE) and the other request
has a priority from IOPRIO_CLASS_BE, the function will return the
undefined priority, which is wrong. Fix the function to properly return
the priority of the request with the defined priority.
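A worked example of the difference, with the IOPRIO_* helpers re-implemented
in user space purely for illustration (the constants mirror
include/linux/ioprio.h; treat this as a sketch, not kernel code):

#include <stdio.h>

/* user-space copies of the IOPRIO_* helpers, re-implemented for illustration */
#define IOPRIO_CLASS_SHIFT	13
#define IOPRIO_PRIO_CLASS(p)	((p) >> IOPRIO_CLASS_SHIFT)
#define IOPRIO_PRIO_VALUE(c, d)	(((c) << IOPRIO_CLASS_SHIFT) | (d))
#define ioprio_valid(p)		(IOPRIO_PRIO_CLASS(p) != IOPRIO_CLASS_NONE)
#define IOPRIO_NORM		4
#define min(a, b)		((a) < (b) ? (a) : (b))

enum { IOPRIO_CLASS_NONE, IOPRIO_CLASS_RT, IOPRIO_CLASS_BE, IOPRIO_CLASS_IDLE };

static unsigned short ioprio_best_old(unsigned short aprio, unsigned short bprio)
{
	unsigned short aclass = IOPRIO_PRIO_CLASS(aprio);
	unsigned short bclass = IOPRIO_PRIO_CLASS(bprio);

	if (aclass == IOPRIO_CLASS_NONE)
		aclass = IOPRIO_CLASS_BE;	/* promotes the class ... */
	if (bclass == IOPRIO_CLASS_NONE)
		bclass = IOPRIO_CLASS_BE;	/* ... but not the value itself */

	if (aclass == bclass)
		return min(aprio, bprio);
	return aclass > bclass ? bprio : aprio;
}

static unsigned short ioprio_best_new(unsigned short aprio, unsigned short bprio)
{
	if (!ioprio_valid(aprio))
		aprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, IOPRIO_NORM);
	if (!ioprio_valid(bprio))
		bprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, IOPRIO_NORM);

	if (IOPRIO_PRIO_CLASS(aprio) == IOPRIO_PRIO_CLASS(bprio))
		return min(aprio, bprio);
	return IOPRIO_PRIO_CLASS(aprio) > IOPRIO_PRIO_CLASS(bprio) ? bprio : aprio;
}

int main(void)
{
	unsigned short none = 0;	/* request with IOPRIO_CLASS_NONE */
	unsigned short be0 = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, 0);

	/* old: classes compare equal, min(0x0000, 0x4000) == 0x0000,
	 * i.e. the undefined priority wins */
	printf("old: 0x%04x\n", (unsigned)ioprio_best_old(none, be0));
	/* new: the undefined side is normalised to BE/IOPRIO_NORM first,
	 * so a valid best-effort priority (0x4000) is returned */
	printf("new: 0x%04x\n", (unsigned)ioprio_best_new(none, be0));
	return 0;
}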
Fixes: d58cdfb89ce0c6bd5f81ae931a984ef298dbda20
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Jeff Moyer <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
block/ioprio.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
--- a/block/ioprio.c
+++ b/block/ioprio.c
@@ -157,14 +157,16 @@ out:
int ioprio_best(unsigned short aprio, unsigned short bprio)
{
- unsigned short aclass = IOPRIO_PRIO_CLASS(aprio);
- unsigned short bclass = IOPRIO_PRIO_CLASS(bprio);
+ unsigned short aclass;
+ unsigned short bclass;
- if (aclass == IOPRIO_CLASS_NONE)
- aclass = IOPRIO_CLASS_BE;
- if (bclass == IOPRIO_CLASS_NONE)
- bclass = IOPRIO_CLASS_BE;
+ if (!ioprio_valid(aprio))
+ aprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, IOPRIO_NORM);
+ if (!ioprio_valid(bprio))
+ bprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, IOPRIO_NORM);
+ aclass = IOPRIO_PRIO_CLASS(aprio);
+ bclass = IOPRIO_PRIO_CLASS(bprio);
if (aclass == bclass)
return min(aprio, bprio);
if (aclass > bclass)
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Simon Horman <[email protected]>
commit aa1cf25887099bba68f1f3879c0d394e08b8779f upstream.
Unlike other SATA R-Car r8a7790 controllers, the r8a7790 ES1 SATA R-Car
controller needs to be run with DIPM disabled.
Signed-off-by: Simon Horman <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
Documentation/devicetree/bindings/ata/sata_rcar.txt | 3 ++-
drivers/ata/sata_rcar.c | 10 ++++++++++
2 files changed, 12 insertions(+), 1 deletion(-)
--- a/Documentation/devicetree/bindings/ata/sata_rcar.txt
+++ b/Documentation/devicetree/bindings/ata/sata_rcar.txt
@@ -3,7 +3,8 @@
Required properties:
- compatible : should contain one of the following:
- "renesas,sata-r8a7779" for R-Car H1
- - "renesas,sata-r8a7790" for R-Car H2
+ - "renesas,sata-r8a7790-es1" for R-Car H2 ES1
+ - "renesas,sata-r8a7790" for R-Car H2 other than ES1
- "renesas,sata-r8a7791" for R-Car M2
- reg : address and length of the SATA registers;
- interrupts : must consist of one interrupt specifier.
--- a/drivers/ata/sata_rcar.c
+++ b/drivers/ata/sata_rcar.c
@@ -146,6 +146,7 @@
enum sata_rcar_type {
RCAR_GEN1_SATA,
RCAR_GEN2_SATA,
+ RCAR_R8A7790_ES1_SATA,
};
struct sata_rcar_priv {
@@ -763,6 +764,9 @@ static void sata_rcar_setup_port(struct
ap->udma_mask = ATA_UDMA6;
ap->flags |= ATA_FLAG_SATA;
+ if (priv->type == RCAR_R8A7790_ES1_SATA)
+ ap->flags |= ATA_FLAG_NO_DIPM;
+
ioaddr->cmd_addr = base + SDATA_REG;
ioaddr->ctl_addr = base + SSDEVCON_REG;
ioaddr->scr_addr = base + SCRSSTS_REG;
@@ -792,6 +796,7 @@ static void sata_rcar_init_controller(st
sata_rcar_gen1_phy_init(priv);
break;
case RCAR_GEN2_SATA:
+ case RCAR_R8A7790_ES1_SATA:
sata_rcar_gen2_phy_init(priv);
break;
default:
@@ -838,6 +843,10 @@ static struct of_device_id sata_rcar_mat
.data = (void *)RCAR_GEN2_SATA
},
{
+ .compatible = "renesas,sata-r8a7790-es1",
+ .data = (void *)RCAR_R8A7790_ES1_SATA
+ },
+ {
.compatible = "renesas,sata-r8a7791",
.data = (void *)RCAR_GEN2_SATA
},
@@ -849,6 +858,7 @@ static const struct platform_device_id s
{ "sata_rcar", RCAR_GEN1_SATA }, /* Deprecated by "sata-r8a7779" */
{ "sata-r8a7779", RCAR_GEN1_SATA },
{ "sata-r8a7790", RCAR_GEN2_SATA },
+ { "sata-r8a7790-es1", RCAR_R8A7790_ES1_SATA },
{ "sata-r8a7791", RCAR_GEN2_SATA },
{ },
};
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peng Tao <[email protected]>
commit 8c393f9a721c30a030049a680e1bf896669bb279 upstream.
For pNFS direct writes, the layout driver may dynamically allocate
ds_cinfo.buckets, so we need to take care to free them when freeing the dreq.
Ideally this would be done inside the layout driver where ds_cinfo.buckets
are allocated, but the buckets are attached to the dreq and reused across LD
IO iterations, so it is acceptable to free them in the generic layer.
Signed-off-by: Peng Tao <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/nfs/direct.c | 1 +
include/linux/nfs_xdr.h | 11 +++++++++++
2 files changed, 12 insertions(+)
--- a/fs/nfs/direct.c
+++ b/fs/nfs/direct.c
@@ -270,6 +270,7 @@ static void nfs_direct_req_free(struct k
{
struct nfs_direct_req *dreq = container_of(kref, struct nfs_direct_req, kref);
+ nfs_free_pnfs_ds_cinfo(&dreq->ds_cinfo);
if (dreq->l_ctx != NULL)
nfs_put_lock_context(dreq->l_ctx);
if (dreq->ctx != NULL)
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1232,11 +1232,22 @@ struct nfs41_free_stateid_res {
unsigned int status;
};
+static inline void
+nfs_free_pnfs_ds_cinfo(struct pnfs_ds_commit_info *cinfo)
+{
+ kfree(cinfo->buckets);
+}
+
#else
struct pnfs_ds_commit_info {
};
+static inline void
+nfs_free_pnfs_ds_cinfo(struct pnfs_ds_commit_info *cinfo)
+{
+}
+
#endif /* CONFIG_NFS_V4_1 */
struct nfs_page;
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: William Cohen <[email protected]>
commit 899d5933b2dd2720f2b20b01eaa07871aa6ad096 upstream.
When experimenting with patches to provide kprobes support for aarch64,
SMP machines would hang when inserting breakpoints into kernel code.
The hangs were caused by a race condition in the code called by
aarch64_insn_patch_text_sync(). The first processor in the
aarch64_insn_patch_text_cb() function would patch the code while other
processors were still entering the function and incrementing the
cpu_count field. This resulted in some processors never observing the
exit condition and therefore never leaving the function. Thus, processors
in the system hung.
The first processor to enter the patching function performs the
patching and signals that the patching is complete with an increment
of the cpu_count field. When all the processors have incremented the
cpu_count field the cpu_count will be num_cpus_online()+1 and they
will return to normal execution.
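The rendezvous can be modelled in user space with a plain atomic counter (a
stand-alone sketch, built with -pthread; it is not the kernel code): the first
thread to arrive does the patching and then adds one extra increment, and the
others spin until the counter exceeds the number of threads.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define NTHREADS 4			/* stands in for num_online_cpus() */

static atomic_int cpu_count = 0;	/* zero-initialised, like the kernel field */
static int patched;

static void *patch_text_cb(void *arg)
{
	(void)arg;

	if (atomic_fetch_add(&cpu_count, 1) == 0) {
		/* first thread in: do the "patching" ... */
		patched = 1;
		/* ... then signal completion with one extra increment */
		atomic_fetch_add(&cpu_count, 1);
	} else {
		/* everyone else: spin until all NTHREADS arrivals plus the
		 * extra completion increment are visible */
		while (atomic_load(&cpu_count) <= NTHREADS)
			;		/* cpu_relax() in the kernel */
	}
	return NULL;
}

int main(void)
{
	pthread_t t[NTHREADS];

	for (int i = 0; i < NTHREADS; i++)
		pthread_create(&t[i], NULL, patch_text_cb, NULL);
	for (int i = 0; i < NTHREADS; i++)
		pthread_join(t[i], NULL);

	printf("patched=%d cpu_count=%d (NTHREADS + 1)\n",
	       patched, atomic_load(&cpu_count));
	return 0;
}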
Fixes: ae16480785de ("arm64: introduce interfaces to hotpatch kernel and module code")
Signed-off-by: William Cohen <[email protected]>
Acked-by: Will Deacon <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/arm64/kernel/insn.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -156,9 +156,10 @@ static int __kprobes aarch64_insn_patch_
* which ends with "dsb; isb" pair guaranteeing global
* visibility.
*/
- atomic_set(&pp->cpu_count, -1);
+ /* Notify other processors with an additional increment. */
+ atomic_inc(&pp->cpu_count);
} else {
- while (atomic_read(&pp->cpu_count) != -1)
+ while (atomic_read(&pp->cpu_count) <= num_online_cpus())
cpu_relax();
isb();
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Herbert Xu <[email protected]>
commit 3ce9b20f1971690b8b3b620e735ec99431573b39 upstream.
When VLAN is in use in macvtap_put_user, we end up setting
csum_start to the wrong place. The result is that whoever ends up
doing the checksum setting will corrupt the packet instead of writing
the checksum to the expected location; usually this means writing the
checksum with an offset of -4.
This patch fixes this by adjusting csum_start when VLAN tags are
detected.
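The offset arithmetic can be checked in isolation. The sketch below assumes an
untagged IPv4 frame with a minimal header, purely for illustration; the only
point that matters is that the 4-byte 802.1Q tag is reinserted in front of the
checksum start, so the csum_start reported to user space must grow by VLAN_HLEN.

#include <stdio.h>

#define ETH_HLEN	14	/* dst mac + src mac + ethertype */
#define VLAN_HLEN	4	/* 802.1Q tag inserted after the mac addresses */
#define IP_HLEN		20	/* minimal IPv4 header, for illustration only */

int main(void)
{
	/* checksum start of the untagged skb: right after the L2+L3 headers */
	int csum_start = ETH_HLEN + IP_HLEN;

	/* the frame handed to user space has the VLAN tag reinserted, so
	 * every offset past the mac addresses grows by VLAN_HLEN */
	int csum_start_as_seen_by_user = csum_start + VLAN_HLEN;

	printf("csum_start without tag           : %d\n", csum_start);
	printf("csum_start in the delivered frame: %d\n", csum_start_as_seen_by_user);
	printf("using the old value writes the checksum %d bytes too early\n",
	       VLAN_HLEN);
	return 0;
}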
Fixes: f09e2249c4f5 ("macvtap: restore vlan header on user read")
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
---
drivers/net/macvtap.c | 2 ++
1 file changed, 2 insertions(+)
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -629,6 +629,8 @@ static void macvtap_skb_to_vnet_hdr(cons
if (skb->ip_summed == CHECKSUM_PARTIAL) {
vnet_hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
vnet_hdr->csum_start = skb_checksum_start_offset(skb);
+ if (vlan_tx_tag_present(skb))
+ vnet_hdr->csum_start += VLAN_HLEN;
vnet_hdr->csum_offset = skb->csum_offset;
} else if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
vnet_hdr->flags = VIRTIO_NET_HDR_F_DATA_VALID;
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Herbert Xu <[email protected]>
commit a8f9bfdf982e2b1fb9f094e4de9ab08c57f3d2fd upstream.
When VLAN acceleration is in use on the xmit path, we end up
setting csum_start to the wrong place. The result is that whoever
ends up doing the checksum setting will corrupt the packet instead of
writing the checksum to the expected location; usually this means
writing the checksum with an offset of -4.
This patch fixes this by adjusting csum_start when VLAN acceleration
is detected.
Fixes: 6680ec68eff4 ("tuntap: hardware vlan tx support")
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/tun.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1225,6 +1225,10 @@ static ssize_t tun_put_user(struct tun_s
struct tun_pi pi = { 0, skb->protocol };
ssize_t total = 0;
int vlan_offset = 0, copied;
+ int vlan_hlen = 0;
+
+ if (vlan_tx_tag_present(skb))
+ vlan_hlen = VLAN_HLEN;
if (!(tun->flags & TUN_NO_PI)) {
if ((len -= sizeof(pi)) < 0)
@@ -1276,7 +1280,8 @@ static ssize_t tun_put_user(struct tun_s
if (skb->ip_summed == CHECKSUM_PARTIAL) {
gso.flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
- gso.csum_start = skb_checksum_start_offset(skb);
+ gso.csum_start = skb_checksum_start_offset(skb) +
+ vlan_hlen;
gso.csum_offset = skb->csum_offset;
} else if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
gso.flags = VIRTIO_NET_HDR_F_DATA_VALID;
@@ -1289,10 +1294,9 @@ static ssize_t tun_put_user(struct tun_s
}
copied = total;
- total += skb->len;
- if (!vlan_tx_tag_present(skb)) {
- len = min_t(int, skb->len, len);
- } else {
+ len = min_t(int, skb->len + vlan_hlen, len);
+ total += skb->len + vlan_hlen;
+ if (vlan_hlen) {
int copy, ret;
struct {
__be16 h_vlan_proto;
@@ -1303,8 +1307,6 @@ static ssize_t tun_put_user(struct tun_s
veth.h_vlan_TCI = htons(vlan_tx_tag_get(skb));
vlan_offset = offsetof(struct vlan_ethhdr, h_vlan_proto);
- len = min_t(int, skb->len + VLAN_HLEN, len);
- total += VLAN_HLEN;
copy = min_t(int, vlan_offset, len);
ret = skb_copy_datagram_const_iovec(skb, 0, iv, copied, copy);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mark Rutland <[email protected]>
commit 9b0b26580a753d4d6bdd2b8b4ca9a8f3f2d39065 upstream.
While efi-entry.S mentions that efi_entry() will have relocated the
kernel image, it actually means that efi_entry will have placed a copy
of the kernel in the appropriate location, and until this is branched to
at the end of efi_entry.S, all instructions are executed from the
original image.
Thus while the flush in efi_entry.S does ensure that the copy is visible
to noncacheable accesses, it does not guarantee that this is true for
the image the instructions are being executed from. This could have
disastrous effects when the MMU and caches are disabled if the image
has not been naturally evicted to the PoC.
Additionally, due to a missing dsb following the ic ialluis, the new
kernel image is not necessarily clean in the I-cache when it is branched
to, with similar potentially disastrous effects.
This patch adds additional flushing to ensure that the currently
executing stub text is flushed to the PoC and is thus visible to
noncacheable accesses. As it is placed after the instruction cache
maintenance for the new image and __flush_dcache_area already contains a
dsb, we do not need to add a separate barrier to ensure completion of
the icache maintenance.
Comments are updated to clarify the situation with regard to the two
images and the maintenance required for both.
Fixes: 3c7f255039a2ad6ee1e3890505caf0d029b22e29
Signed-off-by: Mark Rutland <[email protected]>
Acked-by: Joel Schopp <[email protected]>
Reviewed-by: Roy Franz <[email protected]>
Tested-by: Tom Lendacky <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Ian Campbell <[email protected]>
Cc: Leif Lindholm <[email protected]>
Cc: Mark Salter <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/arm64/kernel/efi-entry.S | 27 +++++++++++++++++++++------
1 file changed, 21 insertions(+), 6 deletions(-)
--- a/arch/arm64/kernel/efi-entry.S
+++ b/arch/arm64/kernel/efi-entry.S
@@ -54,18 +54,17 @@ ENTRY(efi_stub_entry)
b.eq efi_load_fail
/*
- * efi_entry() will have relocated the kernel image if necessary
- * and we return here with device tree address in x0 and the kernel
- * entry point stored at *image_addr. Save those values in registers
- * which are callee preserved.
+ * efi_entry() will have copied the kernel image if necessary and we
+ * return here with device tree address in x0 and the kernel entry
+ * point stored at *image_addr. Save those values in registers which
+ * are callee preserved.
*/
mov x20, x0 // DTB address
ldr x0, [sp, #16] // relocated _text address
mov x21, x0
/*
- * Flush dcache covering current runtime addresses
- * of kernel text/data. Then flush all of icache.
+ * Calculate size of the kernel Image (same for original and copy).
*/
adrp x1, _text
add x1, x1, #:lo12:_text
@@ -73,9 +72,24 @@ ENTRY(efi_stub_entry)
add x2, x2, #:lo12:_edata
sub x1, x2, x1
+ /*
+ * Flush the copied Image to the PoC, and ensure it is not shadowed by
+ * stale icache entries from before relocation.
+ */
bl __flush_dcache_area
ic ialluis
+ /*
+ * Ensure that the rest of this function (in the original Image) is
+ * visible when the caches are disabled. The I-cache can't have stale
+ * entries for the VA range of the current image, so no maintenance is
+ * necessary.
+ */
+ adr x0, efi_stub_entry
+ adr x1, efi_stub_entry_end
+ sub x1, x1, x0
+ bl __flush_dcache_area
+
/* Turn off Dcache and MMU */
mrs x0, CurrentEL
cmp x0, #CurrentEL_EL2
@@ -105,4 +119,5 @@ efi_load_fail:
ldp x29, x30, [sp], #32
ret
+efi_stub_entry_end:
ENDPROC(efi_stub_entry)
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefan Richter <[email protected]>
commit eaca2d8e75e90a70a63a6695c9f61932609db212 upstream.
Found by the UC-KLEE tool: A user could supply less input to
firewire-cdev ioctls than write- or write/read-type ioctl handlers
expect. The handlers then used data from uninitialized kernel stack.
This could partially leak back to the user if the kernel subsequently
generated fw_cdev_event_'s (to be read from the firewire-cdev fd)
which notably would contain the _u64 closure field which many of the
ioctl argument structures contain.
The fact that the handlers would act on random garbage input is a
lesser issue since all handlers must check their input anyway.
The fix simply always null-initializes the entire ioctl argument buffer
regardless of the actual length of expected user input. That is, a
runtime overhead of memset(..., 40) is added to each firewire-cdev
ioctl() call. [Comment from Clemens Ladisch: This part of the stack is
most likely to be already in the cache.]
Remarks:
- There was never any leak from kernel stack to the ioctl output
buffer itself. IOW, it was not possible to read kernel stack by a
read-type or write/read-type ioctl alone; the leak could at most
happen in combination with read()ing subsequent event data.
- The actual expected minimum user input of each ioctl from
include/uapi/linux/firewire-cdev.h is, in bytes:
[0x00] = 32, [0x05] = 4, [0x0a] = 16, [0x0f] = 20, [0x14] = 16,
[0x01] = 36, [0x06] = 20, [0x0b] = 4, [0x10] = 20, [0x15] = 20,
[0x02] = 20, [0x07] = 4, [0x0c] = 0, [0x11] = 0, [0x16] = 8,
[0x03] = 4, [0x08] = 24, [0x0d] = 20, [0x12] = 36, [0x17] = 12,
[0x04] = 20, [0x09] = 24, [0x0e] = 4, [0x13] = 40, [0x18] = 4.
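A stand-alone sketch of the pattern the fix adopts (the buffer size and field
names are made up, and copy_from_user() is replaced by memcpy()): clear the
whole fixed-size argument buffer first, then copy in only as many bytes as the
ioctl actually supplies, so any field the caller did not provide reads as zero
rather than stale stack contents.

#include <stdio.h>
#include <string.h>
#include <stdint.h>

/* stand-in for the union of all ioctl argument structs (40 bytes here) */
union ioctl_arg {
	struct { uint32_t handle; uint32_t pad; uint64_t closure; } big;
	struct { uint32_t handle; } small;
	unsigned char raw[40];
};

/* models the dispatcher: always clear the whole buffer, then copy only
 * the bytes the caller actually passed (copy_from_user() in the kernel) */
static void dispatch(const void *user_arg, size_t user_len, union ioctl_arg *buf)
{
	memset(buf, 0, sizeof(*buf));
	memcpy(buf, user_arg, user_len);
}

int main(void)
{
	union ioctl_arg buf;
	struct { uint32_t handle; } small_input = { .handle = 7 };

	/* the caller supplied only 4 bytes; the 64-bit closure field was not provided */
	dispatch(&small_input, sizeof(small_input), &buf);

	printf("handle=%u closure=%llu (zero, not leftover stack)\n",
	       (unsigned)buf.small.handle,
	       (unsigned long long)buf.big.closure);
	return 0;
}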
Reported-by: David Ramos <[email protected]>
Signed-off-by: Stefan Richter <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/firewire/core-cdev.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/firewire/core-cdev.c
+++ b/drivers/firewire/core-cdev.c
@@ -1637,8 +1637,7 @@ static int dispatch_ioctl(struct client
_IOC_SIZE(cmd) > sizeof(buffer))
return -ENOTTY;
- if (_IOC_DIR(cmd) == _IOC_READ)
- memset(&buffer, 0, _IOC_SIZE(cmd));
+ memset(&buffer, 0, sizeof(buffer));
if (_IOC_DIR(cmd) & _IOC_WRITE)
if (copy_from_user(&buffer, arg, _IOC_SIZE(cmd)))
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul Mackerras <[email protected]>
commit ad0eab9293485d1c06237e9249f6d4dfa3d93d4d upstream.
The branches of the if (i->type & ITER_BVEC) statement in
iov_iter_single_seg_count() are the wrong way around; if ITER_BVEC is
clear then we use i->bvec, when we should be using i->iov. This fixes
it.
In my case, the symptom that this caused was that a KVM guest doing
filesystem operations on a virtual disk would result in one of qemu's
threads on the host going into an infinite loop in
generic_perform_write(). The loop would hit the copied == 0 case and
call iov_iter_single_seg_count() to reduce the number of bytes to try
to process, but because of the error, iov_iter_single_seg_count()
would just return i->count and the loop made no progress and continued
forever.
Signed-off-by: Paul Mackerras <[email protected]>
Signed-off-by: Al Viro <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
mm/iov_iter.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/iov_iter.c
+++ b/mm/iov_iter.c
@@ -699,9 +699,9 @@ size_t iov_iter_single_seg_count(const s
if (i->nr_segs == 1)
return i->count;
else if (i->type & ITER_BVEC)
- return min(i->count, i->iov->iov_len - i->iov_offset);
- else
return min(i->count, i->bvec->bv_len - i->iov_offset);
+ else
+ return min(i->count, i->iov->iov_len - i->iov_offset);
}
EXPORT_SYMBOL(iov_iter_single_seg_count);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joe Thornber <[email protected]>
commit c822ed967cba38505713d59ed40a114386ef6c01 upstream.
Avoids normal IO racing with discard.
Signed-off-by: Joe Thornber <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/md/dm-thin.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -1936,6 +1936,14 @@ static int thin_bio_map(struct dm_target
return DM_MAPIO_SUBMITTED;
}
+ /*
+ * We must hold the virtual cell before doing the lookup, otherwise
+ * there's a race with discard.
+ */
+ build_virtual_key(tc->td, block, &key);
+ if (dm_bio_detain(tc->pool->prison, &key, bio, &cell1, &cell_result))
+ return DM_MAPIO_SUBMITTED;
+
r = dm_thin_find_block(td, block, 0, &result);
/*
@@ -1959,13 +1967,10 @@ static int thin_bio_map(struct dm_target
* shared flag will be set in their case.
*/
thin_defer_bio(tc, bio);
+ cell_defer_no_holder_no_free(tc, &cell1);
return DM_MAPIO_SUBMITTED;
}
- build_virtual_key(tc->td, block, &key);
- if (dm_bio_detain(tc->pool->prison, &key, bio, &cell1, &cell_result))
- return DM_MAPIO_SUBMITTED;
-
build_data_key(tc->td, result.block, &key);
if (dm_bio_detain(tc->pool->prison, &key, bio, &cell2, &cell_result)) {
cell_defer_no_holder_no_free(tc, &cell1);
@@ -1986,6 +1991,7 @@ static int thin_bio_map(struct dm_target
* of doing so.
*/
handle_unserviceable_bio(tc->pool, bio);
+ cell_defer_no_holder_no_free(tc, &cell1);
return DM_MAPIO_SUBMITTED;
}
/* fall through */
@@ -1996,6 +2002,7 @@ static int thin_bio_map(struct dm_target
* provide the hint to load the metadata into cache.
*/
thin_defer_bio(tc, bio);
+ cell_defer_no_holder_no_free(tc, &cell1);
return DM_MAPIO_SUBMITTED;
default:
@@ -2005,6 +2012,7 @@ static int thin_bio_map(struct dm_target
* pool is switched to fail-io mode.
*/
bio_io_error(bio);
+ cell_defer_no_holder_no_free(tc, &cell1);
return DM_MAPIO_SUBMITTED;
}
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrew Lunn <[email protected]>
commit 5129ee22ce4aff7c5907d4c3d67d23f86cd6db9b upstream.
A second product has come to light which makes use of the A0 stepping
of the Armada XP SoC. A0 stepping has a hardware bug in the i2c core
meaning that hardware offload does not work, resulting in the kernel
failing to boot. The quirk detects that the kernel is running on an A0
stepping SoC and disables the use of hardware offload.
Currently the quirk is only enabled for PlatHome Openblocks AX3. The
AX3 has been produced with both A0 and B0 stepping SoCs. The second
product is the Lenovo Iomega IX4-300d. It seems likely that this
device will also swap from A0 to B0 SoC sometime during its life.
If there are two products using A0, it seems likely there are more
products with A0. Also, since the number of A0 SoCs is limited, these
products are also likely to transition to B0. Hence detecting at run
time is the safest option. So enable the quirk for all Armada XP
boards.
Tested on an AX3 with A0 stepping.
Signed-off-by: Andrew Lunn <[email protected]>
Acked-by: Gregory CLEMENT <[email protected]>
Acked-by: Thomas Petazzoni <[email protected]>
Fixes: 930ab3d403ae ("i2c: mv64xxx: Add I2C Transaction Generator support")
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Jason Cooper <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/arm/mach-mvebu/board-v7.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm/mach-mvebu/board-v7.c
+++ b/arch/arm/mach-mvebu/board-v7.c
@@ -188,7 +188,7 @@ static void __init thermal_quirk(void)
static void __init mvebu_dt_init(void)
{
- if (of_machine_is_compatible("plathome,openblocks-ax3-4"))
+ if (of_machine_is_compatible("marvell,armadaxp"))
i2c_quirk();
if (of_machine_is_compatible("marvell,a375-db")) {
external_abort_quirk();
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roger Quadros <[email protected]>
commit 73b3a6657a88ef5348a0d69c9a8107d6f01ae862 upstream.
For PIN_OUTPUT_PULLUP and PIN_OUTPUT_PULLDOWN we must not set the
PULL_DIS bit which disables the PULLs.
PULL_ENA is 0, and using it in an OR operation is a NOP, so don't
use it in the PIN_OUTPUT_PULLUP/DOWN macros.
Fixes: 23d9cec07c58 ("pinctrl: dra: dt-bindings: Fix pull enable/disable")
Signed-off-by: Roger Quadros <[email protected]>
Acked-by: Nishanth Menon <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/dt-bindings/pinctrl/dra.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/include/dt-bindings/pinctrl/dra.h
+++ b/include/dt-bindings/pinctrl/dra.h
@@ -40,8 +40,8 @@
/* Active pin states */
#define PIN_OUTPUT (0 | PULL_DIS)
-#define PIN_OUTPUT_PULLUP (PIN_OUTPUT | PULL_ENA | PULL_UP)
-#define PIN_OUTPUT_PULLDOWN (PIN_OUTPUT | PULL_ENA)
+#define PIN_OUTPUT_PULLUP (PULL_UP)
+#define PIN_OUTPUT_PULLDOWN (0)
#define PIN_INPUT (INPUT_EN | PULL_DIS)
#define PIN_INPUT_SLEW (INPUT_EN | SLEWCONTROL)
#define PIN_INPUT_PULLUP (PULL_ENA | INPUT_EN | PULL_UP)
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Junjie Mao <[email protected]>
commit 805dbe17d1c832ad341f14fae8cedf41b67ca6fa upstream.
The driver is not released when ieee80211_register_hw fails in
mac80211_hwsim_create_radio, leading to the access to the unregistered (and
possibly freed) device in platform_driver_unregister:
[ 0.447547] mac80211_hwsim: ieee80211_register_hw failed (-2)
[ 0.448292] ------------[ cut here ]------------
[ 0.448854] WARNING: CPU: 0 PID: 1 at ../include/linux/kref.h:47 kobject_get+0x33/0x50()
[ 0.449839] CPU: 0 PID: 1 Comm: swapper Not tainted 3.17.0-00001-gdd46990-dirty #2
[ 0.450813] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 0.451512] 00000000 00000000 78025e38 7967c6c6 78025e68 7905e09b 7988b480 00000000
[ 0.452579] 00000001 79887d62 0000002f 79170bb3 79170bb3 78397008 79ac9d74 00000001
[ 0.453614] 78025e78 7905e15d 00000009 00000000 78025e84 79170bb3 78397000 78025e8c
[ 0.454632] Call Trace:
[ 0.454921] [<7967c6c6>] dump_stack+0x16/0x18
[ 0.455453] [<7905e09b>] warn_slowpath_common+0x6b/0x90
[ 0.456067] [<79170bb3>] ? kobject_get+0x33/0x50
[ 0.456612] [<79170bb3>] ? kobject_get+0x33/0x50
[ 0.457155] [<7905e15d>] warn_slowpath_null+0x1d/0x20
[ 0.457748] [<79170bb3>] kobject_get+0x33/0x50
[ 0.458274] [<7925824f>] get_device+0xf/0x20
[ 0.458779] [<7925b5cd>] driver_detach+0x3d/0xa0
[ 0.459331] [<7925a3ff>] bus_remove_driver+0x8f/0xb0
[ 0.459927] [<7925bf80>] ? class_unregister+0x40/0x80
[ 0.460660] [<7925bad7>] driver_unregister+0x47/0x50
[ 0.461248] [<7925c033>] ? class_destroy+0x13/0x20
[ 0.461824] [<7925d07b>] platform_driver_unregister+0xb/0x10
[ 0.462507] [<79b51ba0>] init_mac80211_hwsim+0x3e8/0x3f9
[ 0.463161] [<79b30c58>] do_one_initcall+0x106/0x1a9
[ 0.463758] [<79b517b8>] ? if_spi_init_module+0xac/0xac
[ 0.464393] [<79b517b8>] ? if_spi_init_module+0xac/0xac
[ 0.465001] [<79071935>] ? parse_args+0x2f5/0x480
[ 0.465569] [<7906b41e>] ? __usermodehelper_set_disable_depth+0x3e/0x50
[ 0.466345] [<79b30dd9>] kernel_init_freeable+0xde/0x17d
[ 0.466972] [<79b304d6>] ? do_early_param+0x7a/0x7a
[ 0.467546] [<79677b1b>] kernel_init+0xb/0xe0
[ 0.468072] [<79075f42>] ? schedule_tail+0x12/0x40
[ 0.468658] [<79686580>] ret_from_kernel_thread+0x20/0x30
[ 0.469303] [<79677b10>] ? rest_init+0xc0/0xc0
[ 0.469829] ---[ end trace ad8ac403ff8aef5c ]---
[ 0.470509] ------------[ cut here ]------------
[ 0.471047] WARNING: CPU: 0 PID: 1 at ../kernel/locking/lockdep.c:3161 __lock_acquire.isra.22+0x7aa/0xb00()
[ 0.472163] DEBUG_LOCKS_WARN_ON(id >= MAX_LOCKDEP_KEYS)
[ 0.472774] CPU: 0 PID: 1 Comm: swapper Tainted: G W 3.17.0-00001-gdd46990-dirty #2
[ 0.473815] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 0.474492] 78025de0 78025de0 78025da0 7967c6c6 78025dd0 7905e09b 79888931 78025dfc
[ 0.475515] 00000001 79888a93 00000c59 7907f33a 7907f33a 78028000 fffe9d09 00000000
[ 0.476519] 78025de8 7905e10e 00000009 78025de0 79888931 78025dfc 78025e24 7907f33a
[ 0.477523] Call Trace:
[ 0.477821] [<7967c6c6>] dump_stack+0x16/0x18
[ 0.478352] [<7905e09b>] warn_slowpath_common+0x6b/0x90
[ 0.478976] [<7907f33a>] ? __lock_acquire.isra.22+0x7aa/0xb00
[ 0.479658] [<7907f33a>] ? __lock_acquire.isra.22+0x7aa/0xb00
[ 0.480417] [<7905e10e>] warn_slowpath_fmt+0x2e/0x30
[ 0.480479] [<7907f33a>] __lock_acquire.isra.22+0x7aa/0xb00
[ 0.480479] [<79078aa5>] ? sched_clock_cpu+0xb5/0xf0
[ 0.480479] [<7907fd06>] lock_acquire+0x56/0x70
[ 0.480479] [<7925b5e8>] ? driver_detach+0x58/0xa0
[ 0.480479] [<79682d11>] mutex_lock_nested+0x61/0x2a0
[ 0.480479] [<7925b5e8>] ? driver_detach+0x58/0xa0
[ 0.480479] [<7925b5e8>] ? driver_detach+0x58/0xa0
[ 0.480479] [<7925b5e8>] driver_detach+0x58/0xa0
[ 0.480479] [<7925a3ff>] bus_remove_driver+0x8f/0xb0
[ 0.480479] [<7925bf80>] ? class_unregister+0x40/0x80
[ 0.480479] [<7925bad7>] driver_unregister+0x47/0x50
[ 0.480479] [<7925c033>] ? class_destroy+0x13/0x20
[ 0.480479] [<7925d07b>] platform_driver_unregister+0xb/0x10
[ 0.480479] [<79b51ba0>] init_mac80211_hwsim+0x3e8/0x3f9
[ 0.480479] [<79b30c58>] do_one_initcall+0x106/0x1a9
[ 0.480479] [<79b517b8>] ? if_spi_init_module+0xac/0xac
[ 0.480479] [<79b517b8>] ? if_spi_init_module+0xac/0xac
[ 0.480479] [<79071935>] ? parse_args+0x2f5/0x480
[ 0.480479] [<7906b41e>] ? __usermodehelper_set_disable_depth+0x3e/0x50
[ 0.480479] [<79b30dd9>] kernel_init_freeable+0xde/0x17d
[ 0.480479] [<79b304d6>] ? do_early_param+0x7a/0x7a
[ 0.480479] [<79677b1b>] kernel_init+0xb/0xe0
[ 0.480479] [<79075f42>] ? schedule_tail+0x12/0x40
[ 0.480479] [<79686580>] ret_from_kernel_thread+0x20/0x30
[ 0.480479] [<79677b10>] ? rest_init+0xc0/0xc0
[ 0.480479] ---[ end trace ad8ac403ff8aef5d ]---
[ 0.495478] BUG: unable to handle kernel paging request at 00200200
[ 0.496257] IP: [<79682de5>] mutex_lock_nested+0x135/0x2a0
[ 0.496923] *pde = 00000000
[ 0.497290] Oops: 0002 [#1]
[ 0.497653] CPU: 0 PID: 1 Comm: swapper Tainted: G W 3.17.0-00001-gdd46990-dirty #2
[ 0.498659] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 0.499321] task: 78028000 ti: 78024000 task.ti: 78024000
[ 0.499955] EIP: 0060:[<79682de5>] EFLAGS: 00010097 CPU: 0
[ 0.500620] EIP is at mutex_lock_nested+0x135/0x2a0
[ 0.501145] EAX: 00200200 EBX: 78397434 ECX: 78397460 EDX: 78025e70
[ 0.501816] ESI: 00000246 EDI: 78028000 EBP: 78025e8c ESP: 78025e54
[ 0.502497] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
[ 0.503076] CR0: 8005003b CR2: 00200200 CR3: 01b9d000 CR4: 00000690
[ 0.503773] Stack:
[ 0.503998] 00000000 00000001 00000000 7925b5e8 78397460 7925b5e8 78397474 78397460
[ 0.504944] 00200200 11111111 78025e70 78397000 79ac9d74 00000001 78025ea0 7925b5e8
[ 0.505451] 79ac9d74 fffffffe 00000001 78025ebc 7925a3ff 7a251398 78025ec8 7925bf80
[ 0.505451] Call Trace:
[ 0.505451] [<7925b5e8>] ? driver_detach+0x58/0xa0
[ 0.505451] [<7925b5e8>] ? driver_detach+0x58/0xa0
[ 0.505451] [<7925b5e8>] driver_detach+0x58/0xa0
[ 0.505451] [<7925a3ff>] bus_remove_driver+0x8f/0xb0
[ 0.505451] [<7925bf80>] ? class_unregister+0x40/0x80
[ 0.505451] [<7925bad7>] driver_unregister+0x47/0x50
[ 0.505451] [<7925c033>] ? class_destroy+0x13/0x20
[ 0.505451] [<7925d07b>] platform_driver_unregister+0xb/0x10
[ 0.505451] [<79b51ba0>] init_mac80211_hwsim+0x3e8/0x3f9
[ 0.505451] [<79b30c58>] do_one_initcall+0x106/0x1a9
[ 0.505451] [<79b517b8>] ? if_spi_init_module+0xac/0xac
[ 0.505451] [<79b517b8>] ? if_spi_init_module+0xac/0xac
[ 0.505451] [<79071935>] ? parse_args+0x2f5/0x480
[ 0.505451] [<7906b41e>] ? __usermodehelper_set_disable_depth+0x3e/0x50
[ 0.505451] [<79b30dd9>] kernel_init_freeable+0xde/0x17d
[ 0.505451] [<79b304d6>] ? do_early_param+0x7a/0x7a
[ 0.505451] [<79677b1b>] kernel_init+0xb/0xe0
[ 0.505451] [<79075f42>] ? schedule_tail+0x12/0x40
[ 0.505451] [<79686580>] ret_from_kernel_thread+0x20/0x30
[ 0.505451] [<79677b10>] ? rest_init+0xc0/0xc0
[ 0.505451] Code: 89 d8 e8 cf 9b 9f ff 8b 4f 04 8d 55 e4 89 d8 e8 72 9d 9f ff 8d 43 2c 89 c1 89 45 d8 8b 43 30 8d 55 e4 89 53 30 89 4d e4 89 45 e8 <89> 10 8b 55 dc 8b 45 e0 89 7d ec e8 db af 9f ff eb 11 90 31 c0
[ 0.505451] EIP: [<79682de5>] mutex_lock_nested+0x135/0x2a0 SS:ESP 0068:78025e54
[ 0.505451] CR2: 0000000000200200
[ 0.505451] ---[ end trace ad8ac403ff8aef5e ]---
[ 0.505451] Kernel panic - not syncing: Fatal exception
Fixes: 9ea927748ced ("mac80211_hwsim: Register and bind to driver")
Reported-by: Fengguang Wu <[email protected]>
Signed-off-by: Junjie Mao <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/wireless/mac80211_hwsim.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/net/wireless/mac80211_hwsim.c
+++ b/drivers/net/wireless/mac80211_hwsim.c
@@ -1987,7 +1987,7 @@ static int mac80211_hwsim_create_radio(i
if (err != 0) {
printk(KERN_DEBUG "mac80211_hwsim: device_bind_driver failed (%d)\n",
err);
- goto failed_hw;
+ goto failed_bind;
}
skb_queue_head_init(&data->pending);
@@ -2183,6 +2183,8 @@ static int mac80211_hwsim_create_radio(i
return idx;
failed_hw:
+ device_release_driver(data->dev);
+failed_bind:
device_unregister(data->dev);
failed_drvdata:
ieee80211_free_hw(hw);
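The fix is the usual layered goto-unwind idiom: each label undoes only the steps that had already succeeded, so a failing device_bind_driver() must skip device_release_driver(), while later failures must release the driver before unregistering the device. Below is a minimal, generic sketch of that pattern; the helper names are hypothetical stand-ins, not the hwsim code.
    /* hypothetical helpers standing in for the real driver calls */
    static int  register_thing(void)   { return 0; }
    static int  bind_thing(void)       { return 0; }
    static int  start_thing(void)      { return -1; }  /* pretend step 3 fails */
    static void unbind_thing(void)     { }
    static void unregister_thing(void) { }

    static int setup(void)
    {
            int err;

            err = register_thing();         /* step 1 */
            if (err)
                    return err;
            err = bind_thing();             /* step 2 */
            if (err)
                    goto failed_bind;       /* undo step 1 only */
            err = start_thing();            /* step 3 */
            if (err)
                    goto failed_start;      /* undo steps 2 and 1 */
            return 0;

    failed_start:
            unbind_thing();
    failed_bind:
            unregister_thing();
            return err;
    }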
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roger Quadros <[email protected]>
commit a8ead0ecb9d4ce472f4cdab936d6f18e41e3a9ee upstream.
The 5th NAND partition should be named "NAND.u-boot-spl-os"
instead of "NAND.u-boot-spl". This is to be consistent with other
TI boards as well as u-boot.
Fixes: 91994facdd2d ("ARM: dts: am335x-evm: NAND: update MTD partition table")
Signed-off-by: Roger Quadros <[email protected]>
Signed-off-by: Sekhar Nori <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/arm/boot/dts/am335x-evm.dts | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm/boot/dts/am335x-evm.dts
+++ b/arch/arm/boot/dts/am335x-evm.dts
@@ -489,7 +489,7 @@
reg = <0x00060000 0x00020000>;
};
partition@4 {
- label = "NAND.u-boot-spl";
+ label = "NAND.u-boot-spl-os";
reg = <0x00080000 0x00040000>;
};
partition@5 {
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dave Airlie <[email protected]>
commit 1c9498425453bb65ef339a57705c5ef59fe1541d upstream.
While developing MST support I noticed I often got the wrong data
back from a transaction, in a racy fashion. I noticed the scratch
space wasn't locked against concurrent users.
Based on a patch by Alex, but I've made it a bit more obvious when
things are locked.
Signed-off-by: Dave Airlie <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/radeon/atom.c | 11 ++++++++++-
drivers/gpu/drm/radeon/atom.h | 2 ++
drivers/gpu/drm/radeon/atombios_dp.c | 4 +++-
drivers/gpu/drm/radeon/atombios_i2c.c | 4 +++-
drivers/gpu/drm/radeon/radeon_device.c | 1 +
5 files changed, 19 insertions(+), 3 deletions(-)
--- a/drivers/gpu/drm/radeon/atom.c
+++ b/drivers/gpu/drm/radeon/atom.c
@@ -1217,7 +1217,7 @@ free:
return ret;
}
-int atom_execute_table(struct atom_context *ctx, int index, uint32_t * params)
+int atom_execute_table_scratch_unlocked(struct atom_context *ctx, int index, uint32_t * params)
{
int r;
@@ -1238,6 +1238,15 @@ int atom_execute_table(struct atom_conte
return r;
}
+int atom_execute_table(struct atom_context *ctx, int index, uint32_t * params)
+{
+ int r;
+ mutex_lock(&ctx->scratch_mutex);
+ r = atom_execute_table_scratch_unlocked(ctx, index, params);
+ mutex_unlock(&ctx->scratch_mutex);
+ return r;
+}
+
static int atom_iio_len[] = { 1, 2, 3, 3, 3, 3, 4, 4, 4, 3 };
static void atom_index_iio(struct atom_context *ctx, int base)
--- a/drivers/gpu/drm/radeon/atom.h
+++ b/drivers/gpu/drm/radeon/atom.h
@@ -125,6 +125,7 @@ struct card_info {
struct atom_context {
struct card_info *card;
struct mutex mutex;
+ struct mutex scratch_mutex;
void *bios;
uint32_t cmd_table, data_table;
uint16_t *iio;
@@ -145,6 +146,7 @@ extern int atom_debug;
struct atom_context *atom_parse(struct card_info *, void *);
int atom_execute_table(struct atom_context *, int, uint32_t *);
+int atom_execute_table_scratch_unlocked(struct atom_context *, int, uint32_t *);
int atom_asic_init(struct atom_context *);
void atom_destroy(struct atom_context *);
bool atom_parse_data_header(struct atom_context *ctx, int index, uint16_t *size,
--- a/drivers/gpu/drm/radeon/atombios_dp.c
+++ b/drivers/gpu/drm/radeon/atombios_dp.c
@@ -100,6 +100,7 @@ static int radeon_process_aux_ch(struct
memset(&args, 0, sizeof(args));
mutex_lock(&chan->mutex);
+ mutex_lock(&rdev->mode_info.atom_context->scratch_mutex);
base = (unsigned char *)(rdev->mode_info.atom_context->scratch + 1);
@@ -113,7 +114,7 @@ static int radeon_process_aux_ch(struct
if (ASIC_IS_DCE4(rdev))
args.v2.ucHPD_ID = chan->rec.hpd;
- atom_execute_table(rdev->mode_info.atom_context, index, (uint32_t *)&args);
+ atom_execute_table_scratch_unlocked(rdev->mode_info.atom_context, index, (uint32_t *)&args);
*ack = args.v1.ucReplyStatus;
@@ -147,6 +148,7 @@ static int radeon_process_aux_ch(struct
r = recv_bytes;
done:
+ mutex_unlock(&rdev->mode_info.atom_context->scratch_mutex);
mutex_unlock(&chan->mutex);
return r;
--- a/drivers/gpu/drm/radeon/atombios_i2c.c
+++ b/drivers/gpu/drm/radeon/atombios_i2c.c
@@ -48,6 +48,7 @@ static int radeon_process_i2c_ch(struct
memset(&args, 0, sizeof(args));
mutex_lock(&chan->mutex);
+ mutex_lock(&rdev->mode_info.atom_context->scratch_mutex);
base = (unsigned char *)rdev->mode_info.atom_context->scratch;
@@ -82,7 +83,7 @@ static int radeon_process_i2c_ch(struct
args.ucSlaveAddr = slave_addr << 1;
args.ucLineNumber = chan->rec.i2c_id;
- atom_execute_table(rdev->mode_info.atom_context, index, (uint32_t *)&args);
+ atom_execute_table_scratch_unlocked(rdev->mode_info.atom_context, index, (uint32_t *)&args);
/* error */
if (args.ucStatus != HW_ASSISTED_I2C_STATUS_SUCCESS) {
@@ -95,6 +96,7 @@ static int radeon_process_i2c_ch(struct
radeon_atom_copy_swap(buf, base, num, false);
done:
+ mutex_unlock(&rdev->mode_info.atom_context->scratch_mutex);
mutex_unlock(&chan->mutex);
return r;
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -952,6 +952,7 @@ int radeon_atombios_init(struct radeon_d
}
mutex_init(&rdev->mode_info.atom_context->mutex);
+ mutex_init(&rdev->mode_info.atom_context->scratch_mutex);
radeon_atom_initialize_bios_scratch_regs(rdev->ddev);
atom_allocate_fb_scratch(rdev->mode_info.atom_context);
return 0;
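The patch applies a common locking split: the public atom_execute_table() becomes a thin wrapper that takes scratch_mutex, while a *_scratch_unlocked() variant is exposed for callers (the DP and I2C paths above) that must hold scratch_mutex across both their own scratch-buffer accesses and the table execution. A minimal sketch of that idiom follows; the names and struct are hypothetical, not the radeon code.
    #include <linux/mutex.h>
    #include <linux/types.h>

    /* hypothetical context; 'scratch' stands in for the shared scratch buffer */
    struct ctx {
            struct mutex scratch_mutex;
            u32 scratch[4];
    };

    static int do_work_scratch_unlocked(struct ctx *c, u32 arg)
    {
            /* caller holds c->scratch_mutex */
            c->scratch[0] = arg;    /* request goes through the scratch buffer */
            return 0;               /* the real driver executes the table here */
    }

    /* callers that only execute a table take the lock via the wrapper */
    static int do_work(struct ctx *c, u32 arg)
    {
            int r;

            mutex_lock(&c->scratch_mutex);
            r = do_work_scratch_unlocked(c, arg);
            mutex_unlock(&c->scratch_mutex);
            return r;
    }

    /* a caller like the DP/I2C paths holds the lock itself, so the reply in
     * the scratch buffer cannot be clobbered between execution and readback */
    static u32 transfer(struct ctx *c, u32 arg)
    {
            u32 reply;

            mutex_lock(&c->scratch_mutex);
            do_work_scratch_unlocked(c, arg);
            reply = c->scratch[0];
            mutex_unlock(&c->scratch_mutex);
            return reply;
    }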
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Will Deacon <[email protected]>
commit 238962ac71910d6c20162ea5230685fead1836a4 upstream.
To speed up decompression, the decompressor sets up a flat, cacheable
mapping of memory. However, when there is insufficient space to hold
the page tables for this mapping, we don't bother to enable the caches
and subsequently skip all the cache maintenance hooks.
Skipping the cache maintenance before jumping to the relocated code
allows the processor to predict the branch and populate the I-cache
with stale data before the relocation loop has completed (since a
bootloader may have SCTLR.I set, which permits normal, cacheable
instruction fetches regardless of SCTLR.M).
This patch moves the cache maintenance check into the maintenance
routines themselves, allowing the v6/v7 versions to invalidate the
I-cache regardless of the MMU state.
Reported-by: Marc Carino <[email protected]>
Tested-by: Julien Grall <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Russell King <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/arm/boot/compressed/head.S | 20 ++++++++++++++++----
1 file changed, 16 insertions(+), 4 deletions(-)
--- a/arch/arm/boot/compressed/head.S
+++ b/arch/arm/boot/compressed/head.S
@@ -397,8 +397,7 @@ dtb_check_done:
add sp, sp, r6
#endif
- tst r4, #1
- bleq cache_clean_flush
+ bl cache_clean_flush
adr r0, BSYM(restart)
add r0, r0, r6
@@ -1047,6 +1046,8 @@ cache_clean_flush:
b call_cache_fn
__armv4_mpu_cache_flush:
+ tst r4, #1
+ movne pc, lr
mov r2, #1
mov r3, #0
mcr p15, 0, ip, c7, c6, 0 @ invalidate D cache
@@ -1064,6 +1065,8 @@ __armv4_mpu_cache_flush:
mov pc, lr
__fa526_cache_flush:
+ tst r4, #1
+ movne pc, lr
mov r1, #0
mcr p15, 0, r1, c7, c14, 0 @ clean and invalidate D cache
mcr p15, 0, r1, c7, c5, 0 @ flush I cache
@@ -1072,13 +1075,16 @@ __fa526_cache_flush:
__armv6_mmu_cache_flush:
mov r1, #0
- mcr p15, 0, r1, c7, c14, 0 @ clean+invalidate D
+ tst r4, #1
+ mcreq p15, 0, r1, c7, c14, 0 @ clean+invalidate D
mcr p15, 0, r1, c7, c5, 0 @ invalidate I+BTB
- mcr p15, 0, r1, c7, c15, 0 @ clean+invalidate unified
+ mcreq p15, 0, r1, c7, c15, 0 @ clean+invalidate unified
mcr p15, 0, r1, c7, c10, 4 @ drain WB
mov pc, lr
__armv7_mmu_cache_flush:
+ tst r4, #1
+ bne iflush
mrc p15, 0, r10, c0, c1, 5 @ read ID_MMFR1
tst r10, #0xf << 16 @ hierarchical cache (ARMv7)
mov r10, #0
@@ -1139,6 +1145,8 @@ iflush:
mov pc, lr
__armv5tej_mmu_cache_flush:
+ tst r4, #1
+ movne pc, lr
1: mrc p15, 0, r15, c7, c14, 3 @ test,clean,invalidate D cache
bne 1b
mcr p15, 0, r0, c7, c5, 0 @ flush I cache
@@ -1146,6 +1154,8 @@ __armv5tej_mmu_cache_flush:
mov pc, lr
__armv4_mmu_cache_flush:
+ tst r4, #1
+ movne pc, lr
mov r2, #64*1024 @ default: 32K dcache size (*2)
mov r11, #32 @ default: 32 byte line size
mrc p15, 0, r3, c0, c0, 1 @ read cache type
@@ -1179,6 +1189,8 @@ no_cache_id:
__armv3_mmu_cache_flush:
__armv3_mpu_cache_flush:
+ tst r4, #1
+ movne pc, lr
mov r1, #0
mcr p15, 0, r1, c7, c0, 0 @ invalidate whole cache v3
mov pc, lr
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher <[email protected]>
commit 0b021c5802fbe5addf6f89f5030f684adf04f7b7 upstream.
Use gart rather than vram to avoid having to deal with
the HDP cache.
Port of adfed2b0587289013f8143c54913ddfd44ac1fd3
(drm/radeon: use gart memory for DMA ring tests)
to the IB tests.
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/radeon/cik_sdma.c | 21 ++++++++++++---------
drivers/gpu/drm/radeon/r600_dma.c | 20 ++++++++++----------
2 files changed, 22 insertions(+), 19 deletions(-)
--- a/drivers/gpu/drm/radeon/cik_sdma.c
+++ b/drivers/gpu/drm/radeon/cik_sdma.c
@@ -666,17 +666,20 @@ int cik_sdma_ib_test(struct radeon_devic
{
struct radeon_ib ib;
unsigned i;
+ unsigned index;
int r;
- void __iomem *ptr = (void *)rdev->vram_scratch.ptr;
u32 tmp = 0;
+ u64 gpu_addr;
- if (!ptr) {
- DRM_ERROR("invalid vram scratch pointer\n");
- return -EINVAL;
- }
+ if (ring->idx == R600_RING_TYPE_DMA_INDEX)
+ index = R600_WB_DMA_RING_TEST_OFFSET;
+ else
+ index = CAYMAN_WB_DMA1_RING_TEST_OFFSET;
+
+ gpu_addr = rdev->wb.gpu_addr + index;
tmp = 0xCAFEDEAD;
- writel(tmp, ptr);
+ rdev->wb.wb[index/4] = cpu_to_le32(tmp);
r = radeon_ib_get(rdev, ring->idx, &ib, NULL, 256);
if (r) {
@@ -685,8 +688,8 @@ int cik_sdma_ib_test(struct radeon_devic
}
ib.ptr[0] = SDMA_PACKET(SDMA_OPCODE_WRITE, SDMA_WRITE_SUB_OPCODE_LINEAR, 0);
- ib.ptr[1] = rdev->vram_scratch.gpu_addr & 0xfffffffc;
- ib.ptr[2] = upper_32_bits(rdev->vram_scratch.gpu_addr);
+ ib.ptr[1] = lower_32_bits(gpu_addr);
+ ib.ptr[2] = upper_32_bits(gpu_addr);
ib.ptr[3] = 1;
ib.ptr[4] = 0xDEADBEEF;
ib.length_dw = 5;
@@ -703,7 +706,7 @@ int cik_sdma_ib_test(struct radeon_devic
return r;
}
for (i = 0; i < rdev->usec_timeout; i++) {
- tmp = readl(ptr);
+ tmp = le32_to_cpu(rdev->wb.wb[index/4]);
if (tmp == 0xDEADBEEF)
break;
DRM_UDELAY(1);
--- a/drivers/gpu/drm/radeon/r600_dma.c
+++ b/drivers/gpu/drm/radeon/r600_dma.c
@@ -338,17 +338,17 @@ int r600_dma_ib_test(struct radeon_devic
{
struct radeon_ib ib;
unsigned i;
+ unsigned index;
int r;
- void __iomem *ptr = (void *)rdev->vram_scratch.ptr;
u32 tmp = 0;
+ u64 gpu_addr;
- if (!ptr) {
- DRM_ERROR("invalid vram scratch pointer\n");
- return -EINVAL;
- }
+ if (ring->idx == R600_RING_TYPE_DMA_INDEX)
+ index = R600_WB_DMA_RING_TEST_OFFSET;
+ else
+ index = CAYMAN_WB_DMA1_RING_TEST_OFFSET;
- tmp = 0xCAFEDEAD;
- writel(tmp, ptr);
+ gpu_addr = rdev->wb.gpu_addr + index;
r = radeon_ib_get(rdev, ring->idx, &ib, NULL, 256);
if (r) {
@@ -357,8 +357,8 @@ int r600_dma_ib_test(struct radeon_devic
}
ib.ptr[0] = DMA_PACKET(DMA_PACKET_WRITE, 0, 0, 1);
- ib.ptr[1] = rdev->vram_scratch.gpu_addr & 0xfffffffc;
- ib.ptr[2] = upper_32_bits(rdev->vram_scratch.gpu_addr) & 0xff;
+ ib.ptr[1] = lower_32_bits(gpu_addr);
+ ib.ptr[2] = upper_32_bits(gpu_addr) & 0xff;
ib.ptr[3] = 0xDEADBEEF;
ib.length_dw = 4;
@@ -374,7 +374,7 @@ int r600_dma_ib_test(struct radeon_devic
return r;
}
for (i = 0; i < rdev->usec_timeout; i++) {
- tmp = readl(ptr);
+ tmp = le32_to_cpu(rdev->wb.wb[index/4]);
if (tmp == 0xDEADBEEF)
break;
DRM_UDELAY(1);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nadav Amit <[email protected]>
commit d29b9d7ed76c0b961603ca692b8a562556a20212 upstream.
The emulator could reuse an op->type from a previous instruction for some
immediate values. If it mistakenly considers the operands as memory
operands, it will perform a memory read and overwrite op->val.
Consider for instance the ROR instruction - src2 (the number of times)
would be read from memory instead of being used as an immediate.
Mark every immediate operand as such to avoid this problem.
Fixes: c44b4c6ab80eef3a9c52c7b3f0c632942e6489aa
Signed-off-by: Nadav Amit <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kvm/emulate.c | 8 ++++++++
1 file changed, 8 insertions(+)
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -4272,6 +4272,7 @@ static int decode_operand(struct x86_emu
fetch_register_operand(op);
break;
case OpCL:
+ op->type = OP_IMM;
op->bytes = 1;
op->val = reg_read(ctxt, VCPU_REGS_RCX) & 0xff;
break;
@@ -4279,6 +4280,7 @@ static int decode_operand(struct x86_emu
rc = decode_imm(ctxt, op, 1, true);
break;
case OpOne:
+ op->type = OP_IMM;
op->bytes = 1;
op->val = 1;
break;
@@ -4337,21 +4339,27 @@ static int decode_operand(struct x86_emu
ctxt->memop.bytes = ctxt->op_bytes + 2;
goto mem_common;
case OpES:
+ op->type = OP_IMM;
op->val = VCPU_SREG_ES;
break;
case OpCS:
+ op->type = OP_IMM;
op->val = VCPU_SREG_CS;
break;
case OpSS:
+ op->type = OP_IMM;
op->val = VCPU_SREG_SS;
break;
case OpDS:
+ op->type = OP_IMM;
op->val = VCPU_SREG_DS;
break;
case OpFS:
+ op->type = OP_IMM;
op->val = VCPU_SREG_FS;
break;
case OpGS:
+ op->type = OP_IMM;
op->val = VCPU_SREG_GS;
break;
case OpImplicit:
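The hazard is easiest to see with a simplified model of the decoder. The struct and helpers below are hypothetical (the real code is in arch/x86/kvm/emulate.c), but they show the shape of the bug: decode filled op->val for these operand kinds without setting op->type, so a stale OP_MEM left by the previous instruction let a later generic fetch step overwrite the freshly decoded immediate.
    /* hypothetical, simplified model -- not the real emulator structures */
    enum op_type { OP_NONE, OP_REG, OP_MEM, OP_IMM };

    struct operand {
            enum op_type type;
            unsigned long val;
    };

    static void decode_opcl(struct operand *op, unsigned long rcx)
    {
            op->type = OP_IMM;      /* the fix: without this, op->type keeps
                                     * whatever the previous instruction left */
            op->val  = rcx & 0xff;  /* e.g. the ROR rotation count */
    }

    static void fetch_operand(struct operand *op)
    {
            if (op->type == OP_MEM)
                    op->val = 0;    /* stands in for a guest memory read that
                                     * would clobber the immediate value */
    }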
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher <[email protected]>
commit f0d7bfb9407fccb6499ec01c33afe43512a439a2 upstream.
Need to unlock the crtc after updating the blanking state.
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/radeon/evergreen.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -2556,6 +2556,7 @@ void evergreen_mc_stop(struct radeon_dev
WREG32(EVERGREEN_CRTC_UPDATE_LOCK + crtc_offsets[i], 1);
tmp |= EVERGREEN_CRTC_BLANK_DATA_EN;
WREG32(EVERGREEN_CRTC_BLANK_CONTROL + crtc_offsets[i], tmp);
+ WREG32(EVERGREEN_CRTC_UPDATE_LOCK + crtc_offsets[i], 0);
}
} else {
tmp = RREG32(EVERGREEN_CRTC_CONTROL + crtc_offsets[i]);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jammy Zhou <[email protected]>
commit dc4edad6530a9b7b66c3d905e2bc06021a05dcad upstream.
CE ram size is 32k/0k/0k for GFX/CS0/CS1 with CIK
Ported from amdgpu driver.
Signed-off-by: Jammy Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/radeon/cik.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -4315,8 +4315,8 @@ static int cik_cp_gfx_start(struct radeo
/* init the CE partitions. CE only used for gfx on CIK */
radeon_ring_write(ring, PACKET3(PACKET3_SET_BASE, 2));
radeon_ring_write(ring, PACKET3_BASE_INDEX(CE_PARTITION_BASE));
- radeon_ring_write(ring, 0xc000);
- radeon_ring_write(ring, 0xc000);
+ radeon_ring_write(ring, 0x8000);
+ radeon_ring_write(ring, 0x8000);
/* setup clear context state */
radeon_ring_write(ring, PACKET3(PACKET3_PREAMBLE_CNTL, 0));
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher <[email protected]>
commit 8efe82ca908400785253c8f0dfcf301e6bd93488 upstream.
The power management code calls into the display code for
certain things. If certain power management sysfs attributes
are called before the driver has finished initializing all of
the hardware, we can run into problems with uninitialized
modesetting state. Add a check to the bandwidth update callbacks
to make sure modesetting init has completed, to fix this. This can
be triggered by the tlp and laptop start-up scripts, depending on
the timing.
bugs:
https://bugzilla.kernel.org/show_bug.cgi?id=83611
https://bugs.freedesktop.org/show_bug.cgi?id=85771
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/radeon/cik.c | 3 +++
drivers/gpu/drm/radeon/evergreen.c | 3 +++
drivers/gpu/drm/radeon/r100.c | 3 +++
drivers/gpu/drm/radeon/rs600.c | 3 +++
drivers/gpu/drm/radeon/rs690.c | 3 +++
drivers/gpu/drm/radeon/rv515.c | 3 +++
drivers/gpu/drm/radeon/si.c | 3 +++
7 files changed, 21 insertions(+)
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -9447,6 +9447,9 @@ void dce8_bandwidth_update(struct radeon
u32 num_heads = 0, lb_size;
int i;
+ if (!rdev->mode_info.mode_config_initialized)
+ return;
+
radeon_update_display_priority(rdev);
for (i = 0; i < rdev->num_crtc; i++) {
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -2346,6 +2346,9 @@ void evergreen_bandwidth_update(struct r
u32 num_heads = 0, lb_size;
int i;
+ if (!rdev->mode_info.mode_config_initialized)
+ return;
+
radeon_update_display_priority(rdev);
for (i = 0; i < rdev->num_crtc; i++) {
--- a/drivers/gpu/drm/radeon/r100.c
+++ b/drivers/gpu/drm/radeon/r100.c
@@ -3204,6 +3204,9 @@ void r100_bandwidth_update(struct radeon
uint32_t pixel_bytes1 = 0;
uint32_t pixel_bytes2 = 0;
+ if (!rdev->mode_info.mode_config_initialized)
+ return;
+
radeon_update_display_priority(rdev);
if (rdev->mode_info.crtcs[0]->base.enabled) {
--- a/drivers/gpu/drm/radeon/rs600.c
+++ b/drivers/gpu/drm/radeon/rs600.c
@@ -879,6 +879,9 @@ void rs600_bandwidth_update(struct radeo
u32 d1mode_priority_a_cnt, d2mode_priority_a_cnt;
/* FIXME: implement full support */
+ if (!rdev->mode_info.mode_config_initialized)
+ return;
+
radeon_update_display_priority(rdev);
if (rdev->mode_info.crtcs[0]->base.enabled)
--- a/drivers/gpu/drm/radeon/rs690.c
+++ b/drivers/gpu/drm/radeon/rs690.c
@@ -579,6 +579,9 @@ void rs690_bandwidth_update(struct radeo
u32 d1mode_priority_a_cnt, d1mode_priority_b_cnt;
u32 d2mode_priority_a_cnt, d2mode_priority_b_cnt;
+ if (!rdev->mode_info.mode_config_initialized)
+ return;
+
radeon_update_display_priority(rdev);
if (rdev->mode_info.crtcs[0]->base.enabled)
--- a/drivers/gpu/drm/radeon/rv515.c
+++ b/drivers/gpu/drm/radeon/rv515.c
@@ -1277,6 +1277,9 @@ void rv515_bandwidth_update(struct radeo
struct drm_display_mode *mode0 = NULL;
struct drm_display_mode *mode1 = NULL;
+ if (!rdev->mode_info.mode_config_initialized)
+ return;
+
radeon_update_display_priority(rdev);
if (rdev->mode_info.crtcs[0]->base.enabled)
--- a/drivers/gpu/drm/radeon/si.c
+++ b/drivers/gpu/drm/radeon/si.c
@@ -2384,6 +2384,9 @@ void dce6_bandwidth_update(struct radeon
u32 num_heads = 0, lb_size;
int i;
+ if (!rdev->mode_info.mode_config_initialized)
+ return;
+
radeon_update_display_priority(rdev);
for (i = 0; i < rdev->num_crtc; i++) {
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jani Nikula <[email protected]>
commit e1c412e75754ab7b7002f3e18a2652d999c40d4b upstream.
Never trust (your interpretation of) the VBT. Regression from
commit 6dda730e55f412a6dfb181cae6784822ba463847
Author: Jani Nikula <[email protected]>
Date: Tue Jun 24 18:27:40 2014 +0300
drm/i915: respect the VBT minimum backlight brightness
causing div by zero if VBT minimum brightness equals maximum brightness.
Despite my attempts I've failed in my detective work to figure out what
the root cause is. This is not the real fix, but we have to do
something.
Reported-by: Mike Auty <[email protected]>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=86551
Reviewed-by: Daniel Vetter <[email protected]>
Signed-off-by: Jani Nikula <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/i915/intel_panel.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/i915/intel_panel.c
+++ b/drivers/gpu/drm/i915/intel_panel.c
@@ -1074,12 +1074,25 @@ static u32 get_backlight_min_vbt(struct
struct drm_device *dev = connector->base.dev;
struct drm_i915_private *dev_priv = dev->dev_private;
struct intel_panel *panel = &connector->panel;
+ int min;
WARN_ON(panel->backlight.max == 0);
+ /*
+ * XXX: If the vbt value is 255, it makes min equal to max, which leads
+ * to problems. There are such machines out there. Either our
+ * interpretation is wrong or the vbt has bogus data. Or both. Safeguard
+ * against this by letting the minimum be at most (arbitrarily chosen)
+ * 25% of the max.
+ */
+ min = clamp_t(int, dev_priv->vbt.backlight.min_brightness, 0, 64);
+ if (min != dev_priv->vbt.backlight.min_brightness) {
+ DRM_DEBUG_KMS("clamping VBT min backlight %d/255 to %d/255\n",
+ dev_priv->vbt.backlight.min_brightness, min);
+ }
+
/* vbt value is a coefficient in range [0..255] */
- return scale(dev_priv->vbt.backlight.min_brightness, 0, 255,
- 0, panel->backlight.max);
+ return scale(min, 0, 255, 0, panel->backlight.max);
}
static int bdw_setup_backlight(struct intel_connector *connector)
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rodrigo Vivi <[email protected]>
commit d6a8b72edc92471283925ceb4ba12799b67c3ff8 upstream.
Global GTT doesn't have pat_sel[2:0], so it always points to pat_sel = 000;
So the only way to avoid screen corruptions is setting PAT 0 to Uncached.
MOCS can still be used though. But if userspace is trusting the PTE for
cache selection, the safest thing to do is to leave caches disabled.
BSpec: "For GGTT, there is NO pat_sel[2:0] from the entry,
so RTL will always use the value corresponding to pat_sel = 000"
- System agent ggtt writes (i.e. cpu gtt mmaps) already work before
this patch, i.e. the same uncached + snooping access like on gen6/7
seems to be in effect.
- So this just fixes blitter/render access. Again it looks like it's
not just uncached access, but uncached + snooping. So we can still
hold onto all our assumptions wrt cpu clflushing on LLC machines.
v2: Cleaner patch as suggested by Chris.
v3: Add Daniel's comment
Reference: https://bugs.freedesktop.org/show_bug.cgi?id=85576
Cc: Chris Wilson <[email protected]>
Cc: James Ausmus <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Jani Nikula <[email protected]>
Tested-by: James Ausmus <[email protected]>
Reviewed-by: James Ausmus <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Jani Nikula <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/i915/i915_gem_gtt.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1901,6 +1901,22 @@ static void bdw_setup_private_ppat(struc
GEN8_PPAT(6, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(2)) |
GEN8_PPAT(7, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(3));
+ if (!USES_PPGTT(dev_priv->dev))
+ /* Spec: "For GGTT, there is NO pat_sel[2:0] from the entry,
+ * so RTL will always use the value corresponding to
+ * pat_sel = 000".
+ * So let's disable cache for GGTT to avoid screen corruptions.
+ * MOCS still can be used though.
+ * - System agent ggtt writes (i.e. cpu gtt mmaps) already work
+ * before this patch, i.e. the same uncached + snooping access
+ * like on gen6/7 seems to be in effect.
+ * - So this just fixes blitter/render access. Again it looks
+ * like it's not just uncached access, but uncached + snooping.
+ * So we can still hold onto all our assumptions wrt cpu
+ * clflushing on LLC machines.
+ */
+ pat = GEN8_PPAT(0, GEN8_PPAT_UC);
+
/* XXX: spec defines this as 2 distinct registers. It's unclear if a 64b
* write would work. */
I915_WRITE(GEN8_PRIVATE_PAT, pat);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai <[email protected]>
commit 3542aed7480925eb859f7ce101982209cc19a126 upstream.
The Lenovo IdeaPad Z560 has a mute LED that is controlled via EAPD pin
0x1b on the CX20585 codec. (EAPD bit on corresponds to mute LED on.)
The machine doesn't need the other EAPDs, so the fixup concentrates on
controlling EAPD 0x1b following the vmaster state (but inversely).
Bugzilla: https://bugzilla.novell.com/show_bug.cgi?id=665315
Reported-by: Szymon Kowalczyk <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
sound/pci/hda/patch_conexant.c | 31 +++++++++++++++++++++++++++++++
1 file changed, 31 insertions(+)
--- a/sound/pci/hda/patch_conexant.c
+++ b/sound/pci/hda/patch_conexant.c
@@ -44,6 +44,7 @@ struct conexant_spec {
unsigned int num_eapds;
hda_nid_t eapds[4];
bool dynamic_eapd;
+ hda_nid_t mute_led_eapd;
unsigned int parse_flags; /* flag for snd_hda_parse_pin_defcfg() */
@@ -164,6 +165,17 @@ static void cx_auto_vmaster_hook(void *p
cx_auto_turn_eapd(codec, spec->num_eapds, spec->eapds, enabled);
}
+/* turn on/off EAPD according to Master switch (inversely!) for mute LED */
+static void cx_auto_vmaster_hook_mute_led(void *private_data, int enabled)
+{
+ struct hda_codec *codec = private_data;
+ struct conexant_spec *spec = codec->spec;
+
+ snd_hda_codec_write(codec, spec->mute_led_eapd, 0,
+ AC_VERB_SET_EAPD_BTLENABLE,
+ enabled ? 0x00 : 0x02);
+}
+
static int cx_auto_build_controls(struct hda_codec *codec)
{
int err;
@@ -224,6 +236,7 @@ enum {
CXT_FIXUP_TOSHIBA_P105,
CXT_FIXUP_HP_530,
CXT_FIXUP_CAP_MIX_AMP_5047,
+ CXT_FIXUP_MUTE_LED_EAPD,
};
/* for hda_fixup_thinkpad_acpi() */
@@ -557,6 +570,18 @@ static void cxt_fixup_olpc_xo(struct hda
}
}
+static void cxt_fixup_mute_led_eapd(struct hda_codec *codec,
+ const struct hda_fixup *fix, int action)
+{
+ struct conexant_spec *spec = codec->spec;
+
+ if (action == HDA_FIXUP_ACT_PRE_PROBE) {
+ spec->mute_led_eapd = 0x1b;
+ spec->dynamic_eapd = 1;
+ spec->gen.vmaster_mute.hook = cx_auto_vmaster_hook_mute_led;
+ }
+}
+
/*
* Fix max input level on mixer widget to 0dB
* (originally it has 0x2b steps with 0dB offset 0x14)
@@ -705,6 +730,10 @@ static const struct hda_fixup cxt_fixups
.type = HDA_FIXUP_FUNC,
.v.func = cxt_fixup_cap_mix_amp_5047,
},
+ [CXT_FIXUP_MUTE_LED_EAPD] = {
+ .type = HDA_FIXUP_FUNC,
+ .v.func = cxt_fixup_mute_led_eapd,
+ },
};
static const struct snd_pci_quirk cxt5045_fixups[] = {
@@ -761,6 +790,7 @@ static const struct snd_pci_quirk cxt506
SND_PCI_QUIRK(0x17aa, 0x21cf, "Lenovo T520", CXT_PINCFG_LENOVO_TP410),
SND_PCI_QUIRK(0x17aa, 0x21da, "Lenovo X220", CXT_PINCFG_LENOVO_TP410),
SND_PCI_QUIRK(0x17aa, 0x21db, "Lenovo X220-tablet", CXT_PINCFG_LENOVO_TP410),
+ SND_PCI_QUIRK(0x17aa, 0x38af, "Lenovo IdeaPad Z560", CXT_FIXUP_MUTE_LED_EAPD),
SND_PCI_QUIRK(0x17aa, 0x3975, "Lenovo U300s", CXT_FIXUP_STEREO_DMIC),
SND_PCI_QUIRK(0x17aa, 0x3977, "Lenovo IdeaPad U310", CXT_FIXUP_STEREO_DMIC),
SND_PCI_QUIRK(0x17aa, 0x397b, "Lenovo S205", CXT_FIXUP_STEREO_DMIC),
@@ -779,6 +809,7 @@ static const struct hda_model_fixup cxt5
{ .id = CXT_PINCFG_LEMOTE_A1004, .name = "lemote-a1004" },
{ .id = CXT_PINCFG_LEMOTE_A1205, .name = "lemote-a1205" },
{ .id = CXT_FIXUP_OLPC_XO, .name = "olpc-xo" },
+ { .id = CXT_FIXUP_MUTE_LED_EAPD, .name = "mute-led-eapd" },
{}
};
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai <[email protected]>
commit 1a290581ded60e87276741f8ca97b161d2b226fc upstream.
The M-Audio FastTrack Ultra quirk doesn't release the kzalloc'ed memory.
This patch adds the private_free callback to release it properly.
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
sound/usb/mixer_quirks.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/sound/usb/mixer_quirks.c
+++ b/sound/usb/mixer_quirks.c
@@ -885,6 +885,11 @@ static int snd_ftu_eff_switch_put(struct
return changed;
}
+static void kctl_private_value_free(struct snd_kcontrol *kctl)
+{
+ kfree((void *)kctl->private_value);
+}
+
static int snd_ftu_create_effect_switch(struct usb_mixer_interface *mixer,
int validx, int bUnitID)
{
@@ -919,6 +924,7 @@ static int snd_ftu_create_effect_switch(
return -ENOMEM;
}
+ kctl->private_free = kctl_private_value_free;
err = snd_ctl_add(mixer->chip->card, kctl);
if (err < 0)
return err;
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Max Filippov <[email protected]>
commit 2651cc6974d47fc43bef1cd8cd26966e4f5ba306 upstream.
Userspace actually passes a single parameter (the path name) to the umount
syscall, so the new umount just fails. Fix it by requesting the old umount
syscall implementation and re-wiring umount to it.
Signed-off-by: Max Filippov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/xtensa/include/uapi/asm/unistd.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/arch/xtensa/include/uapi/asm/unistd.h
+++ b/arch/xtensa/include/uapi/asm/unistd.h
@@ -384,7 +384,8 @@ __SYSCALL(174, sys_chroot, 1)
#define __NR_pivot_root 175
__SYSCALL(175, sys_pivot_root, 2)
#define __NR_umount 176
-__SYSCALL(176, sys_umount, 2)
+__SYSCALL(176, sys_oldumount, 1)
+#define __ARCH_WANT_SYS_OLDUMOUNT
#define __NR_swapoff 177
__SYSCALL(177, sys_swapoff, 1)
#define __NR_sync 178
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Antoine Tenart <[email protected]>
commit 9a23c1d6f0f5dbac4c9b73fa6cea7c9ee3d29074 upstream.
Changes to the AHCI subsystem introduced a bug by not taking into
account the force_port_map and mask_port_map parameters when using the
ahci_pci_save_initial_config function. This commit fixes it by setting
the internal parameters of the ahci_host_priv structure directly.
Fixes: 725c7b570fda
Reported-and-tested-by: Zlatko Calusic <[email protected]>
Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/ata/ahci.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -519,12 +519,9 @@ MODULE_PARM_DESC(marvell_enable, "Marvel
static void ahci_pci_save_initial_config(struct pci_dev *pdev,
struct ahci_host_priv *hpriv)
{
- unsigned int force_port_map = 0;
- unsigned int mask_port_map = 0;
-
if (pdev->vendor == PCI_VENDOR_ID_JMICRON && pdev->device == 0x2361) {
dev_info(&pdev->dev, "JMB361 has only one port\n");
- force_port_map = 1;
+ hpriv->force_port_map = 1;
}
/*
@@ -534,9 +531,9 @@ static void ahci_pci_save_initial_config
*/
if (hpriv->flags & AHCI_HFLAG_MV_PATA) {
if (pdev->device == 0x6121)
- mask_port_map = 0x3;
+ hpriv->mask_port_map = 0x3;
else
- mask_port_map = 0xf;
+ hpriv->mask_port_map = 0xf;
dev_info(&pdev->dev,
"Disabling your PATA port. Use the boot option 'ahci.marvell_enable=0' to avoid this.\n");
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tejun Heo <[email protected]>
commit 66a7cbc303f4d28f201529b06061944d51ab530c upstream.
Samsung pci-e SSDs on macbooks failed miserably on NCQ commands, so
67809f85d31e ("ahci: disable NCQ on Samsung pci-e SSDs on macbooks")
disabled NCQ on them. It turns out that NCQ is fine as long as MSI is
not used, so let's turn off MSI and leave NCQ on.
Signed-off-by: Tejun Heo <[email protected]>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=60731
Tested-by: <[email protected]>
Tested-by: Imre Kaloz <[email protected]>
Fixes: 67809f85d31e ("ahci: disable NCQ on Samsung pci-e SSDs on macbooks")
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/ata/ahci.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -60,6 +60,7 @@ enum board_ids {
/* board IDs by feature in alphabetical order */
board_ahci,
board_ahci_ign_iferr,
+ board_ahci_nomsi,
board_ahci_noncq,
board_ahci_nosntf,
board_ahci_yes_fbs,
@@ -121,6 +122,13 @@ static const struct ata_port_info ahci_p
.udma_mask = ATA_UDMA6,
.port_ops = &ahci_ops,
},
+ [board_ahci_nomsi] = {
+ AHCI_HFLAGS (AHCI_HFLAG_NO_MSI),
+ .flags = AHCI_FLAG_COMMON,
+ .pio_mask = ATA_PIO4,
+ .udma_mask = ATA_UDMA6,
+ .port_ops = &ahci_ops,
+ },
[board_ahci_noncq] = {
AHCI_HFLAGS (AHCI_HFLAG_NO_NCQ),
.flags = AHCI_FLAG_COMMON,
@@ -480,10 +488,10 @@ static const struct pci_device_id ahci_p
{ PCI_VDEVICE(ASMEDIA, 0x0612), board_ahci }, /* ASM1062 */
/*
- * Samsung SSDs found on some macbooks. NCQ times out.
- * https://bugzilla.kernel.org/show_bug.cgi?id=60731
+ * Samsung SSDs found on some macbooks. NCQ times out if MSI is
+ * enabled. https://bugzilla.kernel.org/show_bug.cgi?id=60731
*/
- { PCI_VDEVICE(SAMSUNG, 0x1600), board_ahci_noncq },
+ { PCI_VDEVICE(SAMSUNG, 0x1600), board_ahci_nomsi },
/* Enmotus */
{ PCI_DEVICE(0x1c44, 0x8000), board_ahci },
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tang Chen <[email protected]>
commit f784a3f19613901ca4539a5b0eed3bdc700e6ee7 upstream.
In free_area_init_core(), zone->managed_pages is set to an approximate
value for lowmem, and will be adjusted when the bootmem allocator frees
pages into the buddy system.
But free_area_init_core() is also called by hotadd_new_pgdat() when
hot-adding memory. As a result, zone->managed_pages of the newly added
node's pgdat is set to an approximate value in the very beginning.
Even if the memory on that node has not been onlined,
/sys/device/system/node/nodeXXX/meminfo has wrong value:
hot-add node2 (memory not onlined)
cat /sys/device/system/node/node2/meminfo
Node 2 MemTotal: 33554432 kB
Node 2 MemFree: 0 kB
Node 2 MemUsed: 33554432 kB
Node 2 Active: 0 kB
This patch fixes this problem by resetting node managed pages to 0 after
hot-adding a new node.
1. Move reset_managed_pages_done from reset_node_managed_pages() to
reset_all_zones_managed_pages()
2. Make reset_node_managed_pages() non-static
3. Call reset_node_managed_pages() in hotadd_new_pgdat() after pgdat
is initialized
Signed-off-by: Tang Chen <[email protected]>
Signed-off-by: Yasuaki Ishimatsu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/bootmem.h | 1 +
mm/bootmem.c | 9 +++++----
mm/memory_hotplug.c | 9 +++++++++
mm/nobootmem.c | 8 +++++---
4 files changed, 20 insertions(+), 7 deletions(-)
--- a/include/linux/bootmem.h
+++ b/include/linux/bootmem.h
@@ -46,6 +46,7 @@ extern unsigned long init_bootmem_node(p
extern unsigned long init_bootmem(unsigned long addr, unsigned long memend);
extern unsigned long free_all_bootmem(void);
+extern void reset_node_managed_pages(pg_data_t *pgdat);
extern void reset_all_zones_managed_pages(void);
extern void free_bootmem_node(pg_data_t *pgdat,
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -243,13 +243,10 @@ static unsigned long __init free_all_boo
static int reset_managed_pages_done __initdata;
-static inline void __init reset_node_managed_pages(pg_data_t *pgdat)
+void reset_node_managed_pages(pg_data_t *pgdat)
{
struct zone *z;
- if (reset_managed_pages_done)
- return;
-
for (z = pgdat->node_zones; z < pgdat->node_zones + MAX_NR_ZONES; z++)
z->managed_pages = 0;
}
@@ -258,8 +255,12 @@ void __init reset_all_zones_managed_page
{
struct pglist_data *pgdat;
+ if (reset_managed_pages_done)
+ return;
+
for_each_online_pgdat(pgdat)
reset_node_managed_pages(pgdat);
+
reset_managed_pages_done = 1;
}
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -31,6 +31,7 @@
#include <linux/stop_machine.h>
#include <linux/hugetlb.h>
#include <linux/memblock.h>
+#include <linux/bootmem.h>
#include <asm/tlbflush.h>
@@ -1096,6 +1097,14 @@ static pg_data_t __ref *hotadd_new_pgdat
build_all_zonelists(pgdat, NULL);
mutex_unlock(&zonelists_mutex);
+ /*
+ * zone->managed_pages is set to an approximate value in
+ * free_area_init_core(), which will cause
+ * /sys/device/system/node/nodeX/meminfo has wrong data.
+ * So reset it to 0 before any memory is onlined.
+ */
+ reset_node_managed_pages(pgdat);
+
return pgdat;
}
--- a/mm/nobootmem.c
+++ b/mm/nobootmem.c
@@ -145,12 +145,10 @@ static unsigned long __init free_low_mem
static int reset_managed_pages_done __initdata;
-static inline void __init reset_node_managed_pages(pg_data_t *pgdat)
+void reset_node_managed_pages(pg_data_t *pgdat)
{
struct zone *z;
- if (reset_managed_pages_done)
- return;
for (z = pgdat->node_zones; z < pgdat->node_zones + MAX_NR_ZONES; z++)
z->managed_pages = 0;
}
@@ -159,8 +157,12 @@ void __init reset_all_zones_managed_page
{
struct pglist_data *pgdat;
+ if (reset_managed_pages_done)
+ return;
+
for_each_online_pgdat(pgdat)
reset_node_managed_pages(pgdat);
+
reset_managed_pages_done = 1;
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Thompson <[email protected]>
commit 3438cf549d2f3ee8e52c82acc8e2a9710ac21a5b upstream.
Currently if the user passes an invalid value on the kernel command line
then the kernel will crash during argument parsing. On most systems this
is very hard to debug because the console hasn't been initialized yet.
This is a regression due to commit 51e158c12aca ("param: hand arguments
after -- straight to init") which, in response to the systemd debug
controversy, made it possible to explicitly pass arguments to init. To
achieve this, parse_args() was extended from simply returning an error
code to returning a pointer. Regrettably, the new init args logic does not
perform a proper validity check on the pointer, resulting in a crash.
This patch fixes the validity check. Should the check fail then no arguments
will be passed to init. This is reasonable and matches how the kernel treats
its own arguments (i.e. no error recovery).
Signed-off-by: Daniel Thompson <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
init/main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/init/main.c
+++ b/init/main.c
@@ -544,7 +544,7 @@ asmlinkage __visible void __init start_k
static_command_line, __start___param,
__stop___param - __start___param,
-1, -1, &unknown_bootoption);
- if (after_dashes)
+ if (!IS_ERR_OR_NULL(after_dashes))
parse_args("Setting init args", after_dashes, NULL, 0, -1, -1,
set_init_arg);
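For context, parse_args() here can hand back a real pointer (the arguments found after "--"), NULL (nothing after "--"), or an ERR_PTR-encoded error. A bare "if (after_dashes)" is also true for an error pointer, which is how a bad command line ended up being passed to init. A small sketch of the check the patch switches to, using <linux/err.h> (the wrapper function is hypothetical, added only to make the point self-contained):
    #include <linux/err.h>
    #include <linux/types.h>

    static bool args_usable(char *after_dashes)
    {
            /*
             * after_dashes may be:
             *   - a real string  (arguments found after "--")
             *   - NULL           (nothing after "--")
             *   - ERR_PTR(-E...) (parsing failed)
             *
             * A bare "if (after_dashes)" treats the ERR_PTR case as valid,
             * which is the crash this patch fixes; IS_ERR_OR_NULL() rejects
             * both NULL and error-encoded pointers.
             */
            return !IS_ERR_OR_NULL(after_dashes);
    }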
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: James Ralston <[email protected]>
commit 690000b930456a98663567d35dd5c54b688d1e3f upstream.
This patch adds the AHCI-mode SATA Device IDs for the Intel Sunrise Point PCH.
Signed-off-by: James Ralston <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/ata/ahci.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -313,6 +313,11 @@ static const struct pci_device_id ahci_p
{ PCI_VDEVICE(INTEL, 0x8c87), board_ahci }, /* 9 Series RAID */
{ PCI_VDEVICE(INTEL, 0x8c8e), board_ahci }, /* 9 Series RAID */
{ PCI_VDEVICE(INTEL, 0x8c8f), board_ahci }, /* 9 Series RAID */
+ { PCI_VDEVICE(INTEL, 0xa103), board_ahci }, /* Sunrise Point-H AHCI */
+ { PCI_VDEVICE(INTEL, 0xa103), board_ahci }, /* Sunrise Point-H RAID */
+ { PCI_VDEVICE(INTEL, 0xa105), board_ahci }, /* Sunrise Point-H RAID */
+ { PCI_VDEVICE(INTEL, 0xa107), board_ahci }, /* Sunrise Point-H RAID */
+ { PCI_VDEVICE(INTEL, 0xa10f), board_ahci }, /* Sunrise Point-H RAID */
/* JMicron 360/1/3/5/6, match class to avoid IDE function */
{ PCI_VENDOR_ID_JMICRON, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Karl Beldan <[email protected]>
[ Upstream commit 2c2a9cbd64387d6b70ac5db013e9bfe9412c7354 ]
At the moment, txq_reclaim will dequeue and free an skb for each tx desc released
by the hw that has TX_LAST_DESC set. However, in case of TSO, each
hw desc embedding the last part of a segment has TX_LAST_DESC set,
losing the one-to-one 'last skb frag'/'TX_LAST_DESC set' correspondence,
which causes data corruption.
Fix this by checking TX_ENABLE_INTERRUPT instead of TX_LAST_DESC, and
warn when trying to dequeue from an empty txq (which can be symptomatic
of releasing skbs prematurely).
Fixes: 3ae8f4e0b98 ('net: mv643xx_eth: Implement software TSO')
Reported-by: Slawomir Gajzner <[email protected]>
Reported-by: Julien D'Ascenzio <[email protected]>
Signed-off-by: Karl Beldan <[email protected]>
Cc: Ian Campbell <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Ezequiel Garcia <[email protected]>
Cc: Sebastian Hesselbarth <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/marvell/mv643xx_eth.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
--- a/drivers/net/ethernet/marvell/mv643xx_eth.c
+++ b/drivers/net/ethernet/marvell/mv643xx_eth.c
@@ -1047,7 +1047,6 @@ static int txq_reclaim(struct tx_queue *
int tx_index;
struct tx_desc *desc;
u32 cmd_sts;
- struct sk_buff *skb;
tx_index = txq->tx_used_desc;
desc = &txq->tx_desc_area[tx_index];
@@ -1066,19 +1065,22 @@ static int txq_reclaim(struct tx_queue *
reclaimed++;
txq->tx_desc_count--;
- skb = NULL;
- if (cmd_sts & TX_LAST_DESC)
- skb = __skb_dequeue(&txq->tx_skb);
+ if (!IS_TSO_HEADER(txq, desc->buf_ptr))
+ dma_unmap_single(mp->dev->dev.parent, desc->buf_ptr,
+ desc->byte_cnt, DMA_TO_DEVICE);
+
+ if (cmd_sts & TX_ENABLE_INTERRUPT) {
+ struct sk_buff *skb = __skb_dequeue(&txq->tx_skb);
+
+ if (!WARN_ON(!skb))
+ dev_kfree_skb(skb);
+ }
if (cmd_sts & ERROR_SUMMARY) {
netdev_info(mp->dev, "tx error\n");
mp->dev->stats.tx_errors++;
}
- if (!IS_TSO_HEADER(txq, desc->buf_ptr))
- dma_unmap_single(mp->dev->dev.parent, desc->buf_ptr,
- desc->byte_cnt, DMA_TO_DEVICE);
- dev_kfree_skb(skb);
}
__netif_tx_unlock_bh(nq);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jesse Gross <[email protected]>
[ Upstream commit cfdf1e1ba5bf55e095cf4bcaa9585c4759f239e8 ]
When doing GRO processing for UDP tunnels, we never add
SKB_GSO_UDP_TUNNEL to gso_type - only the type of the inner protocol
is added (such as SKB_GSO_TCPV4). The result is that if the packet is
later resegmented we will do GSO but not treat it as a tunnel. This
results in UDP fragmentation of the outer header instead of (e.g.) TCP
segmentation of the inner header as was originally on the wire.
Signed-off-by: Jesse Gross <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/vxlan.c | 2 ++
include/net/udp_tunnel.h | 9 +++++++++
2 files changed, 11 insertions(+)
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -620,6 +620,8 @@ static int vxlan_gro_complete(struct sk_
int vxlan_len = sizeof(struct vxlanhdr) + sizeof(struct ethhdr);
int err = -ENOSYS;
+ udp_tunnel_gro_complete(skb, nhoff);
+
eh = (struct ethhdr *)(skb->data + nhoff + sizeof(struct vxlanhdr));
type = eh->h_proto;
--- a/include/net/udp_tunnel.h
+++ b/include/net/udp_tunnel.h
@@ -26,6 +26,15 @@ struct udp_port_cfg {
use_udp6_rx_checksums:1;
};
+static inline void udp_tunnel_gro_complete(struct sk_buff *skb, int nhoff)
+{
+ struct udphdr *uh;
+
+ uh = (struct udphdr *)(skb->data + nhoff - sizeof(struct udphdr));
+ skb_shinfo(skb)->gso_type |= uh->check ?
+ SKB_GSO_UDP_TUNNEL_CSUM : SKB_GSO_UDP_TUNNEL;
+}
+
int udp_sock_create(struct net *net, struct udp_port_cfg *cfg,
struct socket **sockp);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steffen Klassert <[email protected]>
[ Upstream commit ebe084aafb7e93adf210e80043c9f69adf56820d ]
ipip6_tunnel_init() sets the dev->iflink via a call to
ipip6_tunnel_bind_dev(). After that, register_netdevice()
sets dev->iflink = -1. So we lose the iflink configuration
for ipv6 tunnels. Fix this by using ipip6_tunnel_init() as the
ndo_init function. Then ipip6_tunnel_init() is called after
dev->iflink is set to -1 from register_netdevice().
Signed-off-by: Steffen Klassert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv6/sit.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -195,10 +195,8 @@ static int ipip6_tunnel_create(struct ne
struct sit_net *sitn = net_generic(net, sit_net_id);
int err;
- err = ipip6_tunnel_init(dev);
- if (err < 0)
- goto out;
- ipip6_tunnel_clone_6rd(dev, sitn);
+ memcpy(dev->dev_addr, &t->parms.iph.saddr, 4);
+ memcpy(dev->broadcast, &t->parms.iph.daddr, 4);
if ((__force u16)t->parms.i_flags & SIT_ISATAP)
dev->priv_flags |= IFF_ISATAP;
@@ -207,7 +205,8 @@ static int ipip6_tunnel_create(struct ne
if (err < 0)
goto out;
- strcpy(t->parms.name, dev->name);
+ ipip6_tunnel_clone_6rd(dev, sitn);
+
dev->rtnl_link_ops = &sit_link_ops;
dev_hold(dev);
@@ -1314,6 +1313,7 @@ static int ipip6_tunnel_change_mtu(struc
}
static const struct net_device_ops ipip6_netdev_ops = {
+ .ndo_init = ipip6_tunnel_init,
.ndo_uninit = ipip6_tunnel_uninit,
.ndo_start_xmit = sit_tunnel_xmit,
.ndo_do_ioctl = ipip6_tunnel_ioctl,
@@ -1359,9 +1359,7 @@ static int ipip6_tunnel_init(struct net_
tunnel->dev = dev;
tunnel->net = dev_net(dev);
-
- memcpy(dev->dev_addr, &tunnel->parms.iph.saddr, 4);
- memcpy(dev->broadcast, &tunnel->parms.iph.daddr, 4);
+ strcpy(tunnel->parms.name, dev->name);
ipip6_tunnel_bind_dev(dev);
dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
@@ -1386,7 +1384,6 @@ static int __net_init ipip6_fb_tunnel_in
tunnel->dev = dev;
tunnel->net = dev_net(dev);
- strcpy(tunnel->parms.name, dev->name);
iph->version = 4;
iph->protocol = IPPROTO_IPV6;
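The four tunnel patches in this series (sit, ip6_gre, ip6_tnl, ip6_vti) all fix the same ordering problem. Per the changelogs, register_netdevice() at this kernel version does, roughly, the equivalent of the sketch below, so any dev->iflink value written before registration is wiped, while a value written from ndo_init() sticks. The function is a paraphrase for illustration only, not the actual net/core/dev.c code.
    #include <linux/netdevice.h>

    /* paraphrase of the relevant ordering inside register_netdevice() */
    static int register_netdevice_ordering_sketch(struct net_device *dev)
    {
            /* ... validation, name assignment ... */

            dev->iflink = -1;       /* clobbers anything set before registration */

            if (dev->netdev_ops->ndo_init) {
                    int err = dev->netdev_ops->ndo_init(dev);  /* runs afterwards */
                    if (err)
                            return err;
            }

            /* ... rest of registration ... */
            return 0;
    }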
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steffen Klassert <[email protected]>
[ Upstream commit f03eb128e3f4276f46442d14f3b8f864f3775821 ]
Otherwise it gets overwritten by register_netdev().
Signed-off-by: Steffen Klassert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv6/ip6_gre.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -957,8 +957,6 @@ static void ip6gre_tnl_link_config(struc
else
dev->flags &= ~IFF_POINTOPOINT;
- dev->iflink = p->link;
-
/* Precalculate GRE options length */
if (t->parms.o_flags&(GRE_CSUM|GRE_KEY|GRE_SEQ)) {
if (t->parms.o_flags&GRE_CSUM)
@@ -1268,6 +1266,7 @@ static int ip6gre_tunnel_init(struct net
u64_stats_init(&ip6gre_tunnel_stats->syncp);
}
+ dev->iflink = tunnel->parms.link;
return 0;
}
@@ -1477,6 +1476,8 @@ static int ip6gre_tap_init(struct net_de
if (!dev->tstats)
return -ENOMEM;
+ dev->iflink = tunnel->parms.link;
+
return 0;
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steffen Klassert <[email protected]>
[ Upstream commit 6c6151daaf2d8dc2046d9926539feed5f66bf74e ]
ip6_tnl_dev_init() sets the dev->iflink via a call to
ip6_tnl_link_config(). After that, register_netdevice()
sets dev->iflink = -1. So we lose the iflink configuration
for ipv6 tunnels. Fix this by using ip6_tnl_dev_init() as the
ndo_init function. Then ip6_tnl_dev_init() is called after
dev->iflink is set to -1 from register_netdevice().
Signed-off-by: Steffen Klassert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv6/ip6_tunnel.c | 10 +---------
1 file changed, 1 insertion(+), 9 deletions(-)
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -272,9 +272,6 @@ static int ip6_tnl_create2(struct net_de
int err;
t = netdev_priv(dev);
- err = ip6_tnl_dev_init(dev);
- if (err < 0)
- goto out;
err = register_netdevice(dev);
if (err < 0)
@@ -1462,6 +1459,7 @@ ip6_tnl_change_mtu(struct net_device *de
static const struct net_device_ops ip6_tnl_netdev_ops = {
+ .ndo_init = ip6_tnl_dev_init,
.ndo_uninit = ip6_tnl_dev_uninit,
.ndo_start_xmit = ip6_tnl_xmit,
.ndo_do_ioctl = ip6_tnl_ioctl,
@@ -1546,16 +1544,10 @@ static int __net_init ip6_fb_tnl_dev_ini
struct ip6_tnl *t = netdev_priv(dev);
struct net *net = dev_net(dev);
struct ip6_tnl_net *ip6n = net_generic(net, ip6_tnl_net_id);
- int err = ip6_tnl_dev_init_gen(dev);
-
- if (err)
- return err;
t->parms.proto = IPPROTO_IPV6;
dev_hold(dev);
- ip6_tnl_link_config(t);
-
rcu_assign_pointer(ip6n->tnls_wc[0], t);
return 0;
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steffen Klassert <[email protected]>
[ Upstream commit 16a0231bf7dc3fb37e9b1f1cb1a277dc220b5c5e ]
vti6_dev_init() sets the dev->iflink via a call to
vti6_link_config(). After that, register_netdevice()
sets dev->iflink = -1. So we lose the iflink configuration
for vti6 tunnels. Fix this by using vti6_dev_init() as the
ndo_init function. Then vti6_dev_init() is called after
dev->iflink is set to -1 from register_netdevice().
Signed-off-by: Steffen Klassert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv6/ip6_vti.c | 11 +----------
1 file changed, 1 insertion(+), 10 deletions(-)
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -172,10 +172,6 @@ static int vti6_tnl_create2(struct net_d
struct vti6_net *ip6n = net_generic(net, vti6_net_id);
int err;
- err = vti6_dev_init(dev);
- if (err < 0)
- goto out;
-
err = register_netdevice(dev);
if (err < 0)
goto out;
@@ -783,6 +779,7 @@ static int vti6_change_mtu(struct net_de
}
static const struct net_device_ops vti6_netdev_ops = {
+ .ndo_init = vti6_dev_init,
.ndo_uninit = vti6_dev_uninit,
.ndo_start_xmit = vti6_tnl_xmit,
.ndo_do_ioctl = vti6_ioctl,
@@ -852,16 +849,10 @@ static int __net_init vti6_fb_tnl_dev_in
struct ip6_tnl *t = netdev_priv(dev);
struct net *net = dev_net(dev);
struct vti6_net *ip6n = net_generic(net, vti6_net_id);
- int err = vti6_dev_init_gen(dev);
-
- if (err)
- return err;
t->parms.proto = IPPROTO_IPV6;
dev_hold(dev);
- vti6_link_config(t);
-
rcu_assign_pointer(ip6n->tnls_wc[0], t);
return 0;
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski <[email protected]>
commit c0acb8144bd6d8d88aee1dab33364b7353e9a903 upstream.
All interrupts coming from the MUIC were ignored because the interrupt source
register was masked.
The Maxim 77693 has an "interrupt source" - a separate register whose
interrupts indicate which PMIC block (charger, topsys, MUIC, flash LED)
triggered the individual interrupt.
By default the bootloader could initialize this register to a "mask all"
value. In such a case (observed on the Trats2 board) MUIC interrupts won't be
generated regardless of their mask status. The regmap irq chip was unmasking
individual MUIC interrupts, but the source remained masked.
Before the regmap irq chip was introduced, this interrupt source was unmasked,
read and acked. Reading and acking are not necessary, but unmasking is.
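For reference, regmap_update_bits(map, reg, mask, val) rewrites only the bits
selected by mask, setting them to the corresponding bits of val. A small
worked example with the values used in the hunk below, assuming an 8-bit
register that the bootloader left fully masked:

#include <stdio.h>

int main(void)
{
        unsigned int old  = 0xff;        /* bootloader default: all sources masked */
        unsigned int mask = 0x0f;        /* SRC_IRQ_ALL: the four source-mask bits */
        unsigned int val  = ~0x0fu;      /* ~SRC_IRQ_ALL: write 0 into those bits */

        unsigned int updated = (old & ~mask) | (val & mask);
        printf("0x%02x -> 0x%02x\n", old, updated);  /* 0xff -> 0xf0: sources unmasked */
        return 0;
}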
Fixes: 342d669c1ee4 ("mfd: max77693: Handle IRQs using regmap")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Reviewed-by: Chanwoo Choi <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/mfd/max77693.c | 12 ++++++++++++
include/linux/mfd/max77693-private.h | 7 +++++++
2 files changed, 19 insertions(+)
--- a/drivers/mfd/max77693.c
+++ b/drivers/mfd/max77693.c
@@ -247,6 +247,17 @@ static int max77693_i2c_probe(struct i2c
goto err_irq_muic;
}
+ /* Unmask interrupts from all blocks in interrupt source register */
+ ret = regmap_update_bits(max77693->regmap,
+ MAX77693_PMIC_REG_INTSRC_MASK,
+ SRC_IRQ_ALL, (unsigned int)~SRC_IRQ_ALL);
+ if (ret < 0) {
+ dev_err(max77693->dev,
+ "Could not unmask interrupts in INTSRC: %d\n",
+ ret);
+ goto err_intsrc;
+ }
+
pm_runtime_set_active(max77693->dev);
ret = mfd_add_devices(max77693->dev, -1, max77693_devs,
@@ -258,6 +269,7 @@ static int max77693_i2c_probe(struct i2c
err_mfd:
mfd_remove_devices(max77693->dev);
+err_intsrc:
regmap_del_irq_chip(max77693->irq, max77693->irq_data_muic);
err_irq_muic:
regmap_del_irq_chip(max77693->irq, max77693->irq_data_charger);
--- a/include/linux/mfd/max77693-private.h
+++ b/include/linux/mfd/max77693-private.h
@@ -262,6 +262,13 @@ enum max77693_irq_source {
MAX77693_IRQ_GROUP_NR,
};
+#define SRC_IRQ_CHARGER BIT(0)
+#define SRC_IRQ_TOP BIT(1)
+#define SRC_IRQ_FLASH BIT(2)
+#define SRC_IRQ_MUIC BIT(3)
+#define SRC_IRQ_ALL (SRC_IRQ_CHARGER | SRC_IRQ_TOP \
+ | SRC_IRQ_FLASH | SRC_IRQ_MUIC)
+
#define LED_IRQ_FLED2_OPEN BIT(0)
#define LED_IRQ_FLED2_SHORT BIT(1)
#define LED_IRQ_FLED1_OPEN BIT(2)
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nikolay Aleksandrov <[email protected]>
[ Upstream commit 65ba1f1ec0eff1c25933468e1d238201c0c2cb29 ]
When the evictor is running it adds some chosen frags to a local list to
be evicted once the chain lock has been released, but at the same time
the *frag_queue can be running for some of the same queues and may call
inet_frag_kill, which will wait on the chain lock and will then delete the
queue from the wrong list, since it was added to the eviction one. The fix
is simple - check whether the queue has the evict flag set under the chain
lock before deleting it. This is safe because the evict flag is set only
under that lock, and having the flag set also means that the queue has been
detached from the chain list, so there is no need to delete it again.
An important note is that we're safe w.r.t. the refcnt because
inet_frag_kill and inet_evict_bucket will sync on the del_timer operation,
where only one of the two can succeed (or, if the timer is executing,
neither of them); the cases are:
1. inet_frag_kill succeeds in del_timer
- then the timer ref is removed, but inet_evict_bucket will not add
this queue to its expire list but will restart eviction in that chain
2. inet_evict_bucket succeeds in del_timer
- then the timer ref is kept until the evictor "expires" the queue, but
inet_frag_kill will remove the initial ref and will set
INET_FRAG_COMPLETE, which will make the frag_expire fn just remove
its ref.
In the end all of the queue users will do an inet_frag_put and the one
that reaches 0 will free it. The refcount balance should be okay.
CC: Florian Westphal <[email protected]>
CC: Eric Dumazet <[email protected]>
CC: Patrick McLean <[email protected]>
Fixes: b13d3cbfb8e8 ("inet: frag: move eviction of queues to work queue")
Suggested-by: Eric Dumazet <[email protected]>
Reported-by: Patrick McLean <[email protected]>
Tested-by: Patrick McLean <[email protected]>
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Reviewed-by: Florian Westphal <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/inet_fragment.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -285,7 +285,8 @@ static inline void fq_unlink(struct inet
struct inet_frag_bucket *hb;
hb = get_frag_bucket_locked(fq, f);
- hlist_del(&fq->list);
+ if (!(fq->flags & INET_FRAG_EVICTED))
+ hlist_del(&fq->list);
spin_unlock(&hb->chain_lock);
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cristian Stoica <[email protected]>
commit 307fd543f3d23f8f56850eca1b27b1be2fe71017 upstream.
Replace equivalent (and partially incorrect) scatter-gather functions
with ones from the crypto API.
The replacement is motivated by page-faults in sg_copy_part triggered
by successive calls to crypto_hash_update. The following fault appears
after calling crypto_ahash_update twice, first with 13 and then
with 285 bytes:
Unable to handle kernel paging request for data at address 0x00000008
Faulting instruction address: 0xf9bf9a8c
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=8 CoreNet Generic
Modules linked in: tcrypt(+) caamhash caam_jr caam tls
CPU: 6 PID: 1497 Comm: cryptomgr_test Not tainted
3.12.19-rt30-QorIQ-SDK-V1.6+g9fda9f2 #75
task: e9308530 ti: e700e000 task.ti: e700e000
NIP: f9bf9a8c LR: f9bfcf28 CTR: c0019ea0
REGS: e700fb80 TRAP: 0300 Not tainted
(3.12.19-rt30-QorIQ-SDK-V1.6+g9fda9f2)
MSR: 00029002 <CE,EE,ME> CR: 44f92024 XER: 20000000
DEAR: 00000008, ESR: 00000000
GPR00: f9bfcf28 e700fc30 e9308530 e70b1e55 00000000 ffffffdd e70b1e54 0bebf888
GPR08: 902c7ef5 c0e771e2 00000002 00000888 c0019ea0 00000000 00000000 c07a4154
GPR16: c08d0000 e91a8f9c 00000001 e98fb400 00000100 e9c83028 e70b1e08 e70b1d48
GPR24: e992ce10 e70b1dc8 f9bfe4f4 e70b1e55 ffffffdd e70b1ce0 00000000 00000000
NIP [f9bf9a8c] sg_copy+0x1c/0x100 [caamhash]
LR [f9bfcf28] ahash_update_no_ctx+0x628/0x660 [caamhash]
Call Trace:
[e700fc30] [f9bf9c50] sg_copy_part+0xe0/0x160 [caamhash] (unreliable)
[e700fc50] [f9bfcf28] ahash_update_no_ctx+0x628/0x660 [caamhash]
[e700fcb0] [f954e19c] crypto_tls_genicv+0x13c/0x300 [tls]
[e700fd10] [f954e65c] crypto_tls_encrypt+0x5c/0x260 [tls]
[e700fd40] [c02250ec] __test_aead.constprop.9+0x2bc/0xb70
[e700fe40] [c02259f0] alg_test_aead+0x50/0xc0
[e700fe60] [c02241e4] alg_test+0x114/0x2e0
[e700fee0] [c022276c] cryptomgr_test+0x4c/0x60
[e700fef0] [c004f658] kthread+0x98/0xa0
[e700ff40] [c000fd04] ret_from_kernel_thread+0x5c/0x64
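The conversion is mostly an argument-convention change: the removed
sg_copy_part() took an absolute end offset, while the crypto-API helper takes
a byte count plus a direction flag. A hedged, kernel-style sketch of the new
call pattern (the wrapper name is made up; this is not a drop-in hunk):

#include <linux/scatterlist.h>
#include <crypto/scatterwalk.h>

/* Copy 'count' bytes out of the scatterlist, starting 'skip' bytes in;
 * the final 0 selects the sg -> buffer direction. */
static void copy_sg_tail(u8 *dst, struct scatterlist *src,
                         unsigned int skip, unsigned int count)
{
        scatterwalk_map_and_copy(dst, src, skip, count, 0);
}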
Signed-off-by: Herbert Xu <[email protected]>
Cc: Cristian Stoica <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/crypto/caam/caamhash.c | 22 ++++++++++-----
drivers/crypto/caam/sg_sw_sec4.h | 54 ---------------------------------------
2 files changed, 14 insertions(+), 62 deletions(-)
--- a/drivers/crypto/caam/caamhash.c
+++ b/drivers/crypto/caam/caamhash.c
@@ -836,8 +836,9 @@ static int ahash_update_ctx(struct ahash
edesc->sec4_sg + sec4_sg_src_index,
chained);
if (*next_buflen) {
- sg_copy_part(next_buf, req->src, to_hash -
- *buflen, req->nbytes);
+ scatterwalk_map_and_copy(next_buf, req->src,
+ to_hash - *buflen,
+ *next_buflen, 0);
state->current_buf = !state->current_buf;
}
} else {
@@ -878,7 +879,8 @@ static int ahash_update_ctx(struct ahash
kfree(edesc);
}
} else if (*next_buflen) {
- sg_copy(buf + *buflen, req->src, req->nbytes);
+ scatterwalk_map_and_copy(buf + *buflen, req->src, 0,
+ req->nbytes, 0);
*buflen = *next_buflen;
*next_buflen = last_buflen;
}
@@ -1262,8 +1264,9 @@ static int ahash_update_no_ctx(struct ah
src_map_to_sec4_sg(jrdev, req->src, src_nents,
edesc->sec4_sg + 1, chained);
if (*next_buflen) {
- sg_copy_part(next_buf, req->src, to_hash - *buflen,
- req->nbytes);
+ scatterwalk_map_and_copy(next_buf, req->src,
+ to_hash - *buflen,
+ *next_buflen, 0);
state->current_buf = !state->current_buf;
}
@@ -1304,7 +1307,8 @@ static int ahash_update_no_ctx(struct ah
kfree(edesc);
}
} else if (*next_buflen) {
- sg_copy(buf + *buflen, req->src, req->nbytes);
+ scatterwalk_map_and_copy(buf + *buflen, req->src, 0,
+ req->nbytes, 0);
*buflen = *next_buflen;
*next_buflen = 0;
}
@@ -1476,7 +1480,8 @@ static int ahash_update_first(struct aha
}
if (*next_buflen)
- sg_copy_part(next_buf, req->src, to_hash, req->nbytes);
+ scatterwalk_map_and_copy(next_buf, req->src, to_hash,
+ *next_buflen, 0);
sh_len = desc_len(sh_desc);
desc = edesc->hw_desc;
@@ -1511,7 +1516,8 @@ static int ahash_update_first(struct aha
state->update = ahash_update_no_ctx;
state->finup = ahash_finup_no_ctx;
state->final = ahash_final_no_ctx;
- sg_copy(next_buf, req->src, req->nbytes);
+ scatterwalk_map_and_copy(next_buf, req->src, 0,
+ req->nbytes, 0);
}
#ifdef DEBUG
print_hex_dump(KERN_ERR, "next buf@"__stringify(__LINE__)": ",
--- a/drivers/crypto/caam/sg_sw_sec4.h
+++ b/drivers/crypto/caam/sg_sw_sec4.h
@@ -116,57 +116,3 @@ static int dma_unmap_sg_chained(struct d
}
return nents;
}
-
-/* Map SG page in kernel virtual address space and copy */
-static inline void sg_map_copy(u8 *dest, struct scatterlist *sg,
- int len, int offset)
-{
- u8 *mapped_addr;
-
- /*
- * Page here can be user-space pinned using get_user_pages
- * Same must be kmapped before use and kunmapped subsequently
- */
- mapped_addr = kmap_atomic(sg_page(sg));
- memcpy(dest, mapped_addr + offset, len);
- kunmap_atomic(mapped_addr);
-}
-
-/* Copy from len bytes of sg to dest, starting from beginning */
-static inline void sg_copy(u8 *dest, struct scatterlist *sg, unsigned int len)
-{
- struct scatterlist *current_sg = sg;
- int cpy_index = 0, next_cpy_index = current_sg->length;
-
- while (next_cpy_index < len) {
- sg_map_copy(dest + cpy_index, current_sg, current_sg->length,
- current_sg->offset);
- current_sg = scatterwalk_sg_next(current_sg);
- cpy_index = next_cpy_index;
- next_cpy_index += current_sg->length;
- }
- if (cpy_index < len)
- sg_map_copy(dest + cpy_index, current_sg, len-cpy_index,
- current_sg->offset);
-}
-
-/* Copy sg data, from to_skip to end, to dest */
-static inline void sg_copy_part(u8 *dest, struct scatterlist *sg,
- int to_skip, unsigned int end)
-{
- struct scatterlist *current_sg = sg;
- int sg_index, cpy_index, offset;
-
- sg_index = current_sg->length;
- while (sg_index <= to_skip) {
- current_sg = scatterwalk_sg_next(current_sg);
- sg_index += current_sg->length;
- }
- cpy_index = sg_index - to_skip;
- offset = current_sg->offset + current_sg->length - cpy_index;
- sg_map_copy(dest, current_sg, cpy_index, offset);
- if (end - sg_index) {
- current_sg = scatterwalk_sg_next(current_sg);
- sg_copy(dest + cpy_index, current_sg, end - sg_index);
- }
-}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tony Lindgren <[email protected]>
commit 481c7f868c6d855f31a29c69b445ac4aee9625a6 upstream.
Commit e7cd1d1eb16f ("mfd: twl4030-power: Add generic reset
configuration") enabled configuring the PM features for twl4030.
This caused the poweroff command to fail on devices that have the
BCI charger on the twl4030 wired, or have power wired for VBUS.
Instead of powering off, the device reboots. This is because
voltage is detected on the charger or VBUS while the default bits
are enabled for the power transition registers.
To fix the issue, let's just clear the VBUS and CHG bits, as we want
the poweroff command to keep the system powered off.
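The helper added below is a plain read-modify-write over the three transition
registers: val = (~bitmask & val) | (bitmask & bitvalues). A small worked
example with the values the poweroff path passes (bitmask = STARTON_VBUS |
STARTON_CHG = 0x22, bitvalues = 0; the initial register value is made up):

#include <stdio.h>

int main(void)
{
        unsigned char val       = 0x3f;  /* example current register contents */
        unsigned char bitmask   = 0x22;  /* STARTON_VBUS | STARTON_CHG */
        unsigned char bitvalues = 0x00;  /* clear both start-on conditions */

        val = (~bitmask & val) | (bitmask & bitvalues);
        printf("0x%02x\n", val);         /* 0x1d: VBUS/CHG cleared, rest kept */
        return 0;
}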
Fixes: e7cd1d1eb16f ("mfd: twl4030-power: Add generic reset configuration")
Reported-by: Russell King <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/mfd/twl4030-power.c | 52 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 52 insertions(+)
--- a/drivers/mfd/twl4030-power.c
+++ b/drivers/mfd/twl4030-power.c
@@ -44,6 +44,15 @@ static u8 twl4030_start_script_address =
#define PWR_DEVSLP BIT(1)
#define PWR_DEVOFF BIT(0)
+/* Register bits for CFG_P1_TRANSITION (also for P2 and P3) */
+#define STARTON_SWBUG BIT(7) /* Start on watchdog */
+#define STARTON_VBUS BIT(5) /* Start on VBUS */
+#define STARTON_VBAT BIT(4) /* Start on battery insert */
+#define STARTON_RTC BIT(3) /* Start on RTC */
+#define STARTON_USB BIT(2) /* Start on USB host */
+#define STARTON_CHG BIT(1) /* Start on charger */
+#define STARTON_PWON BIT(0) /* Start on PWRON button */
+
#define SEQ_OFFSYNC (1 << 0)
#define PHY_TO_OFF_PM_MASTER(p) (p - 0x36)
@@ -606,6 +615,44 @@ twl4030_power_configure_resources(const
return 0;
}
+static int twl4030_starton_mask_and_set(u8 bitmask, u8 bitvalues)
+{
+ u8 regs[3] = { TWL4030_PM_MASTER_CFG_P1_TRANSITION,
+ TWL4030_PM_MASTER_CFG_P2_TRANSITION,
+ TWL4030_PM_MASTER_CFG_P3_TRANSITION, };
+ u8 val;
+ int i, err;
+
+ err = twl_i2c_write_u8(TWL_MODULE_PM_MASTER, TWL4030_PM_MASTER_KEY_CFG1,
+ TWL4030_PM_MASTER_PROTECT_KEY);
+ if (err)
+ goto relock;
+ err = twl_i2c_write_u8(TWL_MODULE_PM_MASTER,
+ TWL4030_PM_MASTER_KEY_CFG2,
+ TWL4030_PM_MASTER_PROTECT_KEY);
+ if (err)
+ goto relock;
+
+ for (i = 0; i < sizeof(regs); i++) {
+ err = twl_i2c_read_u8(TWL_MODULE_PM_MASTER,
+ &val, regs[i]);
+ if (err)
+ break;
+ val = (~bitmask & val) | (bitmask & bitvalues);
+ err = twl_i2c_write_u8(TWL_MODULE_PM_MASTER,
+ val, regs[i]);
+ if (err)
+ break;
+ }
+
+ if (err)
+ pr_err("TWL4030 Register access failed: %i\n", err);
+
+relock:
+ return twl_i2c_write_u8(TWL_MODULE_PM_MASTER, 0,
+ TWL4030_PM_MASTER_PROTECT_KEY);
+}
+
/*
* In master mode, start the power off sequence.
* After a successful execution, TWL shuts down the power to the SoC
@@ -615,6 +662,11 @@ void twl4030_power_off(void)
{
int err;
+ /* Disable start on charger or VBUS as it can break poweroff */
+ err = twl4030_starton_mask_and_set(STARTON_VBUS | STARTON_CHG, 0);
+ if (err)
+ pr_err("TWL4030 Unable to configure start-up\n");
+
err = twl_i2c_write_u8(TWL_MODULE_PM_MASTER, PWR_DEVOFF,
TWL4030_PM_MASTER_P1_SW_EVENTS);
if (err)
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski <[email protected]>
commit 43fc9396cac3f7498e07a22e6a987b911462fa58 upstream.
Interrupts coming from the Maxim 77693 MUIC block (MicroUSB Interface
Controller) were not handled at all because the wrong regmap was used for
the MUIC's regmap_irq_chip.
The MUIC component of the Maxim 77693 uses a different I2C address, so a
second regmap is created and used by the max77693 extcon driver. The
registers for MUIC interrupts are also in that block and should be handled
by that second regmap.
However, the regmap irq chip for the MUIC was configured with the default
regmap, which could not read the MUIC registers.
Fixes: 342d669c1ee4 ("mfd: max77693: Handle IRQs using regmap")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Reviewed-by: Chanwoo Choi <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/mfd/max77693.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/mfd/max77693.c
+++ b/drivers/mfd/max77693.c
@@ -237,7 +237,7 @@ static int max77693_i2c_probe(struct i2c
goto err_irq_charger;
}
- ret = regmap_add_irq_chip(max77693->regmap, max77693->irq,
+ ret = regmap_add_irq_chip(max77693->regmap_muic, max77693->irq,
IRQF_ONESHOT | IRQF_SHARED |
IRQF_TRIGGER_FALLING, 0,
&max77693_muic_irq_chip,
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dwight Engen <[email protected]>
[ Upstream commit 5eed69ffd248c9f68f56c710caf07db134aef28b ]
ldc_map_sg() could fail its check that the number of pages referred to
by the sg scatterlist was <= the number of cookies.
This fixes the issue by doing something similar to the xen-blkfront driver:
ensuring that the scatterlist will only ever contain a segment count <=
port->ring_cookies, with each segment page aligned and <= page size. This
ensures that the scatterlist is always mappable.
Orabug: 19347817
OraBZ: 15945
Signed-off-by: Dwight Engen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/block/sunvdc.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/drivers/block/sunvdc.c
+++ b/drivers/block/sunvdc.c
@@ -747,6 +747,10 @@ static int probe_disk(struct vdc_port *p
port->disk = g;
+ /* Each segment in a request is up to an aligned page in size. */
+ blk_queue_segment_boundary(q, PAGE_SIZE - 1);
+ blk_queue_max_segment_size(q, PAGE_SIZE);
+
blk_queue_max_segments(q, port->ring_cookies);
blk_queue_max_hw_sectors(q, port->max_xfer_size);
g->major = vdc_major;
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tadeusz Struk <[email protected]>
commit 09adc8789c4e895d7548fa9eb5d24ad9a5d91c5d upstream.
In a system with a NUMA configuration we want to enforce that the accelerator
is connected to a node with memory, to avoid cross-QPI memory transactions.
Otherwise there is no point in using the accelerator, as encryption in
software will be faster.
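The mechanical change is to drop the driver's bus-number heuristic and ask the
PCI core which node the device sits on. A minimal hedged sketch of the
allocation pattern used throughout the hunks below (the wrapper name is
hypothetical):

#include <linux/device.h>
#include <linux/slab.h>

/* Allocate zeroed memory on the device's own NUMA node; when firmware
 * reports no node, dev_to_node() returns NUMA_NO_NODE and the allocator
 * falls back to any node. */
static void *zalloc_on_dev_node(struct device *dev, size_t size)
{
        return kzalloc_node(size, GFP_KERNEL, dev_to_node(dev));
}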
Signed-off-by: Tadeusz Struk <[email protected]>
Tested-by: Nikolay Aleksandrov <[email protected]>
Reviewed-by: Prarit Bhargava <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/crypto/qat/qat_common/adf_accel_devices.h | 3 --
drivers/crypto/qat/qat_common/adf_transport.c | 12 ++++----
drivers/crypto/qat/qat_common/qat_algs.c | 5 ++-
drivers/crypto/qat/qat_common/qat_crypto.c | 8 +++--
drivers/crypto/qat/qat_dh895xcc/adf_admin.c | 2 -
drivers/crypto/qat/qat_dh895xcc/adf_drv.c | 32 ++++++++--------------
drivers/crypto/qat/qat_dh895xcc/adf_isr.c | 2 -
7 files changed, 30 insertions(+), 34 deletions(-)
--- a/drivers/crypto/qat/qat_common/adf_accel_devices.h
+++ b/drivers/crypto/qat/qat_common/adf_accel_devices.h
@@ -198,8 +198,7 @@ struct adf_accel_dev {
struct dentry *debugfs_dir;
struct list_head list;
struct module *owner;
- uint8_t accel_id;
- uint8_t numa_node;
struct adf_accel_pci accel_pci_dev;
+ uint8_t accel_id;
} __packed;
#endif
--- a/drivers/crypto/qat/qat_common/adf_transport.c
+++ b/drivers/crypto/qat/qat_common/adf_transport.c
@@ -419,9 +419,10 @@ static int adf_init_bank(struct adf_acce
WRITE_CSR_RING_BASE(csr_addr, bank_num, i, 0);
ring = &bank->rings[i];
if (hw_data->tx_rings_mask & (1 << i)) {
- ring->inflights = kzalloc_node(sizeof(atomic_t),
- GFP_KERNEL,
- accel_dev->numa_node);
+ ring->inflights =
+ kzalloc_node(sizeof(atomic_t),
+ GFP_KERNEL,
+ dev_to_node(&GET_DEV(accel_dev)));
if (!ring->inflights)
goto err;
} else {
@@ -469,13 +470,14 @@ int adf_init_etr_data(struct adf_accel_d
int i, ret;
etr_data = kzalloc_node(sizeof(*etr_data), GFP_KERNEL,
- accel_dev->numa_node);
+ dev_to_node(&GET_DEV(accel_dev)));
if (!etr_data)
return -ENOMEM;
num_banks = GET_MAX_BANKS(accel_dev);
size = num_banks * sizeof(struct adf_etr_bank_data);
- etr_data->banks = kzalloc_node(size, GFP_KERNEL, accel_dev->numa_node);
+ etr_data->banks = kzalloc_node(size, GFP_KERNEL,
+ dev_to_node(&GET_DEV(accel_dev)));
if (!etr_data->banks) {
ret = -ENOMEM;
goto err_bank;
--- a/drivers/crypto/qat/qat_common/qat_algs.c
+++ b/drivers/crypto/qat/qat_common/qat_algs.c
@@ -641,7 +641,8 @@ static int qat_alg_sgl_to_bufl(struct qa
if (unlikely(!n))
return -EINVAL;
- bufl = kmalloc_node(sz, GFP_ATOMIC, inst->accel_dev->numa_node);
+ bufl = kmalloc_node(sz, GFP_ATOMIC,
+ dev_to_node(&GET_DEV(inst->accel_dev)));
if (unlikely(!bufl))
return -ENOMEM;
@@ -687,7 +688,7 @@ static int qat_alg_sgl_to_bufl(struct qa
struct qat_alg_buf *bufers;
buflout = kmalloc_node(sz, GFP_ATOMIC,
- inst->accel_dev->numa_node);
+ dev_to_node(&GET_DEV(inst->accel_dev)));
if (unlikely(!buflout))
goto err;
bloutp = dma_map_single(dev, buflout, sz, DMA_TO_DEVICE);
--- a/drivers/crypto/qat/qat_common/qat_crypto.c
+++ b/drivers/crypto/qat/qat_common/qat_crypto.c
@@ -109,12 +109,14 @@ struct qat_crypto_instance *qat_crypto_g
list_for_each(itr, adf_devmgr_get_head()) {
accel_dev = list_entry(itr, struct adf_accel_dev, list);
- if (accel_dev->numa_node == node && adf_dev_started(accel_dev))
+ if ((node == dev_to_node(&GET_DEV(accel_dev)) ||
+ dev_to_node(&GET_DEV(accel_dev)) < 0)
+ && adf_dev_started(accel_dev))
break;
accel_dev = NULL;
}
if (!accel_dev) {
- pr_err("QAT: Could not find device on give node\n");
+ pr_err("QAT: Could not find device on node %d\n", node);
accel_dev = adf_devmgr_get_first();
}
if (!accel_dev || !adf_dev_started(accel_dev))
@@ -164,7 +166,7 @@ static int qat_crypto_create_instances(s
for (i = 0; i < num_inst; i++) {
inst = kzalloc_node(sizeof(*inst), GFP_KERNEL,
- accel_dev->numa_node);
+ dev_to_node(&GET_DEV(accel_dev)));
if (!inst)
goto err;
--- a/drivers/crypto/qat/qat_dh895xcc/adf_admin.c
+++ b/drivers/crypto/qat/qat_dh895xcc/adf_admin.c
@@ -108,7 +108,7 @@ int adf_init_admin_comms(struct adf_acce
uint64_t reg_val;
admin = kzalloc_node(sizeof(*accel_dev->admin), GFP_KERNEL,
- accel_dev->numa_node);
+ dev_to_node(&GET_DEV(accel_dev)));
if (!admin)
return -ENOMEM;
admin->virt_addr = dma_zalloc_coherent(&GET_DEV(accel_dev), PAGE_SIZE,
--- a/drivers/crypto/qat/qat_dh895xcc/adf_drv.c
+++ b/drivers/crypto/qat/qat_dh895xcc/adf_drv.c
@@ -119,21 +119,6 @@ static void adf_cleanup_accel(struct adf
kfree(accel_dev);
}
-static uint8_t adf_get_dev_node_id(struct pci_dev *pdev)
-{
- unsigned int bus_per_cpu = 0;
- struct cpuinfo_x86 *c = &cpu_data(num_online_cpus() - 1);
-
- if (!c->phys_proc_id)
- return 0;
-
- bus_per_cpu = 256 / (c->phys_proc_id + 1);
-
- if (bus_per_cpu != 0)
- return pdev->bus->number / bus_per_cpu;
- return 0;
-}
-
static int qat_dev_start(struct adf_accel_dev *accel_dev)
{
int cpus = num_online_cpus();
@@ -235,7 +220,6 @@ static int adf_probe(struct pci_dev *pde
void __iomem *pmisc_bar_addr = NULL;
char name[ADF_DEVICE_NAME_LENGTH];
unsigned int i, bar_nr;
- uint8_t node;
int ret;
switch (ent->device) {
@@ -246,12 +230,19 @@ static int adf_probe(struct pci_dev *pde
return -ENODEV;
}
- node = adf_get_dev_node_id(pdev);
- accel_dev = kzalloc_node(sizeof(*accel_dev), GFP_KERNEL, node);
+ if (num_possible_nodes() > 1 && dev_to_node(&pdev->dev) < 0) {
+ /* If the accelerator is connected to a node with no memory
+ * there is no point in using the accelerator since the remote
+ * memory transaction will be very slow. */
+ dev_err(&pdev->dev, "Invalid NUMA configuration.\n");
+ return -EINVAL;
+ }
+
+ accel_dev = kzalloc_node(sizeof(*accel_dev), GFP_KERNEL,
+ dev_to_node(&pdev->dev));
if (!accel_dev)
return -ENOMEM;
- accel_dev->numa_node = node;
INIT_LIST_HEAD(&accel_dev->crypto_list);
/* Add accel device to accel table.
@@ -264,7 +255,8 @@ static int adf_probe(struct pci_dev *pde
accel_dev->owner = THIS_MODULE;
/* Allocate and configure device configuration structure */
- hw_data = kzalloc_node(sizeof(*hw_data), GFP_KERNEL, node);
+ hw_data = kzalloc_node(sizeof(*hw_data), GFP_KERNEL,
+ dev_to_node(&pdev->dev));
if (!hw_data) {
ret = -ENOMEM;
goto out_err;
--- a/drivers/crypto/qat/qat_dh895xcc/adf_isr.c
+++ b/drivers/crypto/qat/qat_dh895xcc/adf_isr.c
@@ -168,7 +168,7 @@ static int adf_isr_alloc_msix_entry_tabl
uint32_t msix_num_entries = hw_data->num_banks + 1;
entries = kzalloc_node(msix_num_entries * sizeof(*entries),
- GFP_KERNEL, accel_dev->numa_node);
+ GFP_KERNEL, dev_to_node(&GET_DEV(accel_dev)));
if (!entries)
return -ENOMEM;
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tadeusz Struk <[email protected]>
commit 923a6e5e5f171317ac8bb462ac4b814fa7880d3c upstream.
Do not attempt to dma map associated data if it is zero length.
Signed-off-by: Tadeusz Struk <[email protected]>
Tested-by: Nikolay Aleksandrov <[email protected]>
Reviewed-by: Prarit Bhargava <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/crypto/qat/qat_common/qat_algs.c | 2 ++
1 file changed, 2 insertions(+)
--- a/drivers/crypto/qat/qat_common/qat_algs.c
+++ b/drivers/crypto/qat/qat_common/qat_algs.c
@@ -650,6 +650,8 @@ static int qat_alg_sgl_to_bufl(struct qa
goto err;
for_each_sg(assoc, sg, assoc_n, i) {
+ if (!sg->length)
+ continue;
bufl->bufers[bufs].addr = dma_map_single(dev,
sg_virt(sg),
sg->length,
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cristian Stoica <[email protected]>
commit 738459e3f88538f2ece263424dafe5d91799e46b upstream.
If the DMA mapping for dma_addr_out fails, the descriptor memory is freed
but the previous DMA mapping for dma_addr_in remains.
This patch adds the missing DMA unmap and groups resource
allocations at the start of the function.
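The restructured function follows the usual goto-unwind shape: acquire every
resource up front, then release in reverse order from labelled exit points so
no acquisition is leaked. A minimal userspace illustration of that shape
(stand-in resources, not the CAAM code):

#include <stdio.h>
#include <stdlib.h>

static int do_pair(int fail_second)
{
        int ret = -1;
        void *first, *second;

        first = malloc(16);                        /* stands in for the input mapping */
        if (!first)
                goto out;

        second = fail_second ? NULL : malloc(16);  /* stands in for the output mapping */
        if (!second)
                goto out_free_first;

        ret = 0;                                   /* the real work would happen here */
        free(second);                              /* then everything is undone ... */
out_free_first:
        free(first);                               /* ... in reverse order */
out:
        return ret;
}

int main(void)
{
        printf("ok = %d, simulated failure = %d\n", do_pair(0), do_pair(1));
        return 0;
}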
Signed-off-by: Cristian Stoica <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/crypto/caam/key_gen.c | 29 ++++++++++++++---------------
1 file changed, 14 insertions(+), 15 deletions(-)
--- a/drivers/crypto/caam/key_gen.c
+++ b/drivers/crypto/caam/key_gen.c
@@ -48,23 +48,29 @@ int gen_split_key(struct device *jrdev,
u32 *desc;
struct split_key_result result;
dma_addr_t dma_addr_in, dma_addr_out;
- int ret = 0;
+ int ret = -ENOMEM;
desc = kmalloc(CAAM_CMD_SZ * 6 + CAAM_PTR_SZ * 2, GFP_KERNEL | GFP_DMA);
if (!desc) {
dev_err(jrdev, "unable to allocate key input memory\n");
- return -ENOMEM;
+ return ret;
}
- init_job_desc(desc, 0);
-
dma_addr_in = dma_map_single(jrdev, (void *)key_in, keylen,
DMA_TO_DEVICE);
if (dma_mapping_error(jrdev, dma_addr_in)) {
dev_err(jrdev, "unable to map key input memory\n");
- kfree(desc);
- return -ENOMEM;
+ goto out_free;
}
+
+ dma_addr_out = dma_map_single(jrdev, key_out, split_key_pad_len,
+ DMA_FROM_DEVICE);
+ if (dma_mapping_error(jrdev, dma_addr_out)) {
+ dev_err(jrdev, "unable to map key output memory\n");
+ goto out_unmap_in;
+ }
+
+ init_job_desc(desc, 0);
append_key(desc, dma_addr_in, keylen, CLASS_2 | KEY_DEST_CLASS_REG);
/* Sets MDHA up into an HMAC-INIT */
@@ -81,13 +87,6 @@ int gen_split_key(struct device *jrdev,
* FIFO_STORE with the explicit split-key content store
* (0x26 output type)
*/
- dma_addr_out = dma_map_single(jrdev, key_out, split_key_pad_len,
- DMA_FROM_DEVICE);
- if (dma_mapping_error(jrdev, dma_addr_out)) {
- dev_err(jrdev, "unable to map key output memory\n");
- kfree(desc);
- return -ENOMEM;
- }
append_fifo_store(desc, dma_addr_out, split_key_len,
LDST_CLASS_2_CCB | FIFOST_TYPE_SPLIT_KEK);
@@ -115,10 +114,10 @@ int gen_split_key(struct device *jrdev,
dma_unmap_single(jrdev, dma_addr_out, split_key_pad_len,
DMA_FROM_DEVICE);
+out_unmap_in:
dma_unmap_single(jrdev, dma_addr_in, keylen, DMA_TO_DEVICE);
-
+out_free:
kfree(desc);
-
return ret;
}
EXPORT_SYMBOL(gen_split_key);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joonsoo Kim <[email protected]>
commit 3c605096d3158216ba9326a16266f6ba128c2c8d upstream.
The current pageblock isolation logic can isolate each pageblock
individually. This causes a freepage accounting problem if a freepage with
pageblock order on an isolate pageblock is merged with another freepage on
a normal pageblock. We can prevent merging by restricting the max order of
merging to pageblock order if the freepage is on an isolate pageblock.
A side-effect of this change is that there could be a non-merged buddy
freepage even after finishing pageblock isolation, because undoing
pageblock isolation just moves a freepage from the isolate buddy list to
the normal buddy list rather than considering merging. So, the patch also
makes undoing pageblock isolation consider freepage merging. When
un-isolating, a freepage with more than pageblock order and its buddy are
checked. If they are on a normal pageblock, instead of just moving it, we
isolate the freepage and free it so that it gets merged.
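For concreteness, a small worked example of the arithmetic involved, assuming
MAX_ORDER = 11 and pageblock_order = 9 (a common x86 configuration): the merge
loop now runs while order < min(MAX_ORDER, pageblock_order + 1) - 1, so the
largest block formed on an isolate pageblock is exactly one pageblock, and the
buddy index at a given order is found by XOR.

#include <stdio.h>

int main(void)
{
        unsigned int MAX_ORDER = 11, pageblock_order = 9;     /* assumed values */
        unsigned int limit = MAX_ORDER < pageblock_order + 1 ?
                             MAX_ORDER : pageblock_order + 1; /* max_order = 10 */

        unsigned long page_idx = 8, order = 1;
        printf("merging stops once order reaches %u\n", limit - 1);   /* 9 */
        printf("buddy of #%lu at order %lu is #%lu\n",
               page_idx, order, page_idx ^ (1UL << order));           /* #10 */
        return 0;
}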
Signed-off-by: Joonsoo Kim <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Yasuaki Ishimatsu <[email protected]>
Cc: Zhang Yanfei <[email protected]>
Cc: Tang Chen <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Bartlomiej Zolnierkiewicz <[email protected]>
Cc: Wen Congyang <[email protected]>
Cc: Marek Szyprowski <[email protected]>
Cc: Michal Nazarewicz <[email protected]>
Cc: Laura Abbott <[email protected]>
Cc: Heesub Shin <[email protected]>
Cc: "Aneesh Kumar K.V" <[email protected]>
Cc: Ritesh Harjani <[email protected]>
Cc: Gioh Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
mm/internal.h | 25 +++++++++++++++++++++++++
mm/page_alloc.c | 41 ++++++++++++++---------------------------
mm/page_isolation.c | 41 +++++++++++++++++++++++++++++++++++++++--
3 files changed, 78 insertions(+), 29 deletions(-)
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -108,6 +108,31 @@ extern pmd_t *mm_find_pmd(struct mm_stru
/*
* in mm/page_alloc.c
*/
+
+/*
+ * Locate the struct page for both the matching buddy in our
+ * pair (buddy1) and the combined O(n+1) page they form (page).
+ *
+ * 1) Any buddy B1 will have an order O twin B2 which satisfies
+ * the following equation:
+ * B2 = B1 ^ (1 << O)
+ * For example, if the starting buddy (buddy2) is #8 its order
+ * 1 buddy is #10:
+ * B2 = 8 ^ (1 << 1) = 8 ^ 2 = 10
+ *
+ * 2) Any buddy B will have an order O+1 parent P which
+ * satisfies the following equation:
+ * P = B & ~(1 << O)
+ *
+ * Assumption: *_mem_map is contiguous at least up to MAX_ORDER
+ */
+static inline unsigned long
+__find_buddy_index(unsigned long page_idx, unsigned int order)
+{
+ return page_idx ^ (1 << order);
+}
+
+extern int __isolate_free_page(struct page *page, unsigned int order);
extern void __free_pages_bootmem(struct page *page, unsigned int order);
extern void prep_compound_page(struct page *page, unsigned long order);
#ifdef CONFIG_MEMORY_FAILURE
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -468,29 +468,6 @@ static inline void rmv_page_order(struct
}
/*
- * Locate the struct page for both the matching buddy in our
- * pair (buddy1) and the combined O(n+1) page they form (page).
- *
- * 1) Any buddy B1 will have an order O twin B2 which satisfies
- * the following equation:
- * B2 = B1 ^ (1 << O)
- * For example, if the starting buddy (buddy2) is #8 its order
- * 1 buddy is #10:
- * B2 = 8 ^ (1 << 1) = 8 ^ 2 = 10
- *
- * 2) Any buddy B will have an order O+1 parent P which
- * satisfies the following equation:
- * P = B & ~(1 << O)
- *
- * Assumption: *_mem_map is contiguous at least up to MAX_ORDER
- */
-static inline unsigned long
-__find_buddy_index(unsigned long page_idx, unsigned int order)
-{
- return page_idx ^ (1 << order);
-}
-
-/*
* This function checks whether a page is free && is the buddy
* we can do coalesce a page and its buddy if
* (a) the buddy is not in a hole &&
@@ -570,6 +547,7 @@ static inline void __free_one_page(struc
unsigned long combined_idx;
unsigned long uninitialized_var(buddy_idx);
struct page *buddy;
+ int max_order = MAX_ORDER;
VM_BUG_ON(!zone_is_initialized(zone));
@@ -578,15 +556,24 @@ static inline void __free_one_page(struc
return;
VM_BUG_ON(migratetype == -1);
- if (!is_migrate_isolate(migratetype))
+ if (is_migrate_isolate(migratetype)) {
+ /*
+ * We restrict max order of merging to prevent merge
+ * between freepages on isolate pageblock and normal
+ * pageblock. Without this, pageblock isolation
+ * could cause incorrect freepage accounting.
+ */
+ max_order = min(MAX_ORDER, pageblock_order + 1);
+ } else {
__mod_zone_freepage_state(zone, 1 << order, migratetype);
+ }
- page_idx = pfn & ((1 << MAX_ORDER) - 1);
+ page_idx = pfn & ((1 << max_order) - 1);
VM_BUG_ON_PAGE(page_idx & ((1 << order) - 1), page);
VM_BUG_ON_PAGE(bad_range(zone, page), page);
- while (order < MAX_ORDER-1) {
+ while (order < max_order - 1) {
buddy_idx = __find_buddy_index(page_idx, order);
buddy = page + (buddy_idx - page_idx);
if (!page_is_buddy(page, buddy, order))
@@ -1487,7 +1474,7 @@ void split_page(struct page *page, unsig
}
EXPORT_SYMBOL_GPL(split_page);
-static int __isolate_free_page(struct page *page, unsigned int order)
+int __isolate_free_page(struct page *page, unsigned int order)
{
unsigned long watermark;
struct zone *zone;
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -76,17 +76,54 @@ void unset_migratetype_isolate(struct pa
{
struct zone *zone;
unsigned long flags, nr_pages;
+ struct page *isolated_page = NULL;
+ unsigned int order;
+ unsigned long page_idx, buddy_idx;
+ struct page *buddy;
zone = page_zone(page);
spin_lock_irqsave(&zone->lock, flags);
if (get_pageblock_migratetype(page) != MIGRATE_ISOLATE)
goto out;
- nr_pages = move_freepages_block(zone, page, migratetype);
- __mod_zone_freepage_state(zone, nr_pages, migratetype);
+
+ /*
+ * Because freepage with more than pageblock_order on isolated
+ * pageblock is restricted to merge due to freepage counting problem,
+ * it is possible that there is free buddy page.
+ * move_freepages_block() doesn't care of merge so we need other
+ * approach in order to merge them. Isolation and free will make
+ * these pages to be merged.
+ */
+ if (PageBuddy(page)) {
+ order = page_order(page);
+ if (order >= pageblock_order) {
+ page_idx = page_to_pfn(page) & ((1 << MAX_ORDER) - 1);
+ buddy_idx = __find_buddy_index(page_idx, order);
+ buddy = page + (buddy_idx - page_idx);
+
+ if (!is_migrate_isolate_page(buddy)) {
+ __isolate_free_page(page, order);
+ set_page_refcounted(page);
+ isolated_page = page;
+ }
+ }
+ }
+
+ /*
+ * If we isolate freepage with more than pageblock_order, there
+ * should be no freepage in the range, so we could avoid costly
+ * pageblock scanning for freepage moving.
+ */
+ if (!isolated_page) {
+ nr_pages = move_freepages_block(zone, page, migratetype);
+ __mod_zone_freepage_state(zone, nr_pages, migratetype);
+ }
set_pageblock_migratetype(page, migratetype);
zone->nr_isolate_pageblock--;
out:
spin_unlock_irqrestore(&zone->lock, flags);
+ if (isolated_page)
+ __free_pages(isolated_page, order);
}
static inline struct page *
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Allen Pais <[email protected]>
[ Upstream commit de5b73f08468b4fc5e2f6d1505f650262622f78b ]
The LDom diskserver doesn't return reliable geometry data. In addition,
the types of all fields in vio_disk_geom are u16, and they were being
truncated in the cast to the u8 fields of the Linux struct hd_geometry.
Modify vdc_getgeo() to compute the geometry from the disk's capacity in a
manner consistent with xen-blkfront::blkif_getgeo().
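A worked example of the capacity-derived geometry (the 255-head/63-sector
convention mirrors xen-blkfront; the disk size is made up):

#include <stdio.h>

int main(void)
{
        unsigned long long nsect = 1048576;      /* 512 MiB disk, 512-byte sectors */
        unsigned int heads = 0xff, sectors = 0x3f;
        unsigned long long cylinders = nsect / (heads * sectors);

        if ((cylinders + 1) * heads * sectors < nsect)
                cylinders = 0xffff;              /* clamp, as the hunk below does */
        printf("C/H/S = %llu/%u/%u\n", cylinders, heads, sectors);   /* 65/255/63 */
        return 0;
}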
Signed-off-by: Dwight Engen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/block/sunvdc.c | 23 ++++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)
--- a/drivers/block/sunvdc.c
+++ b/drivers/block/sunvdc.c
@@ -70,7 +70,6 @@ struct vdc_port {
char disk_name[32];
- struct vio_disk_geom geom;
struct vio_disk_vtoc label;
};
@@ -103,11 +102,15 @@ static inline u32 vdc_tx_dring_avail(str
static int vdc_getgeo(struct block_device *bdev, struct hd_geometry *geo)
{
struct gendisk *disk = bdev->bd_disk;
- struct vdc_port *port = disk->private_data;
+ sector_t nsect = get_capacity(disk);
+ sector_t cylinders = nsect;
- geo->heads = (u8) port->geom.num_hd;
- geo->sectors = (u8) port->geom.num_sec;
- geo->cylinders = port->geom.num_cyl;
+ geo->heads = 0xff;
+ geo->sectors = 0x3f;
+ sector_div(cylinders, geo->heads * geo->sectors);
+ geo->cylinders = cylinders;
+ if ((sector_t)(geo->cylinders + 1) * geo->heads * geo->sectors < nsect)
+ geo->cylinders = 0xffff;
return 0;
}
@@ -714,16 +717,18 @@ static int probe_disk(struct vdc_port *p
if (port->vdisk_size == -1)
return -ENODEV;
} else {
+ struct vio_disk_geom geom;
+
err = generic_request(port, VD_OP_GET_DISKGEOM,
- &port->geom, sizeof(port->geom));
+ &geom, sizeof(geom));
if (err < 0) {
printk(KERN_ERR PFX "VD_OP_GET_DISKGEOM returns "
"error %d\n", err);
return err;
}
- port->vdisk_size = ((u64)port->geom.num_cyl *
- (u64)port->geom.num_hd *
- (u64)port->geom.num_sec);
+ port->vdisk_size = ((u64)geom.num_cyl *
+ (u64)geom.num_hd *
+ (u64)geom.num_sec);
}
q = blk_init_queue(do_vdc_request, &port->vio.lock);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joonsoo Kim <[email protected]>
commit 8f82b55dd558a74fc33d69a1f2c2605d0cd2c908 upstream.
All the callers of __free_one_page() have similar freepage counting logic,
so we can move it into __free_one_page(). This reduces the line count and
helps future maintenance.
This is also a preparation step for "mm/page_alloc: restrict max order of
merging on isolated pageblock", which fixes the freepage counting problem
for freepages with more than pageblock order.
Signed-off-by: Joonsoo Kim <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Yasuaki Ishimatsu <[email protected]>
Cc: Zhang Yanfei <[email protected]>
Cc: Tang Chen <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Bartlomiej Zolnierkiewicz <[email protected]>
Cc: Wen Congyang <[email protected]>
Cc: Marek Szyprowski <[email protected]>
Cc: Michal Nazarewicz <[email protected]>
Cc: Laura Abbott <[email protected]>
Cc: Heesub Shin <[email protected]>
Cc: "Aneesh Kumar K.V" <[email protected]>
Cc: Ritesh Harjani <[email protected]>
Cc: Gioh Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
mm/page_alloc.c | 14 +++-----------
1 file changed, 3 insertions(+), 11 deletions(-)
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -578,6 +578,8 @@ static inline void __free_one_page(struc
return;
VM_BUG_ON(migratetype == -1);
+ if (!is_migrate_isolate(migratetype))
+ __mod_zone_freepage_state(zone, 1 << order, migratetype);
page_idx = pfn & ((1 << MAX_ORDER) - 1);
@@ -716,14 +718,9 @@ static void free_pcppages_bulk(struct zo
/* must delete as __free_one_page list manipulates */
list_del(&page->lru);
mt = get_freepage_migratetype(page);
- if (unlikely(has_isolate_pageblock(zone))) {
+ if (unlikely(has_isolate_pageblock(zone)))
mt = get_pageblock_migratetype(page);
- if (is_migrate_isolate(mt))
- goto skip_counting;
- }
- __mod_zone_freepage_state(zone, 1, mt);
-skip_counting:
/* MIGRATE_MOVABLE list may include MIGRATE_RESERVEs */
__free_one_page(page, page_to_pfn(page), zone, 0, mt);
trace_mm_page_pcpu_drain(page, 0, mt);
@@ -746,12 +743,7 @@ static void free_one_page(struct zone *z
if (unlikely(has_isolate_pageblock(zone) ||
is_migrate_isolate(migratetype))) {
migratetype = get_pfnblock_migratetype(page, pfn);
- if (is_migrate_isolate(migratetype))
- goto skip_counting;
}
- __mod_zone_freepage_state(zone, 1 << order, migratetype);
-
-skip_counting:
__free_one_page(page, pfn, zone, order, migratetype);
spin_unlock(&zone->lock);
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nikolay Aleksandrov <[email protected]>
[ Upstream commit d70127e8a942364de8dd140fe73893efda363293 ]
The WARN_ON in inet_evict_bucket can be triggered by a valid case:
inet_frag_kill and inet_evict_bucket can be running in parallel on the
same queue, which means that there has been at least one more ref added
by a previous inet_frag_find call, but inet_frag_kill can delete the
timer before inet_evict_bucket, which will cause the WARN_ON() there to
trigger since we'll have refcnt != 1. Now, this case is valid because the
queue is being "killed" for some reason (removed from the chain list and
its timer deleted), so it will get destroyed in the end by one of the
inet_frag_put() calls that reaches 0, i.e. the refcnt is still valid.
CC: Florian Westphal <[email protected]>
CC: Eric Dumazet <[email protected]>
CC: Patrick McLean <[email protected]>
Fixes: b13d3cbfb8e8 ("inet: frag: move eviction of queues to work queue")
Reported-by: Patrick McLean <[email protected]>
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/inet_fragment.c | 1 -
1 file changed, 1 deletion(-)
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -146,7 +146,6 @@ evict_again:
atomic_inc(&fq->refcnt);
spin_unlock(&hb->chain_lock);
del_timer_sync(&fq->timer);
- WARN_ON(atomic_read(&fq->refcnt) != 1);
inet_frag_put(fq, f);
goto evict_again;
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joonsoo Kim <[email protected]>
commit 51bb1a4093cc68bc16b282548d9cee6104be0ef1 upstream.
In free_pcppages_bulk(), we use the cached migratetype of a freepage to
determine the type of buddy list where the freepage will be added. This
information is stored when the freepage is added to the pcp list, so if
isolation of this freepage's pageblock begins after storing, this
cached information could be stale. In other words, it has the original
migratetype rather than MIGRATE_ISOLATE.
There are two problems caused by this stale information.
One is that we can't keep these freepages from being allocated.
Although this pageblock is isolated, the freepage will be added to a normal
buddy list so that it could be allocated without any restriction. And
the other problem is incorrect freepage accounting. Freepages on an
isolate pageblock should not be counted in the number of freepages.
Following is the code snippet in free_pcppages_bulk().
/* MIGRATE_MOVABLE list may include MIGRATE_RESERVEs */
__free_one_page(page, page_to_pfn(page), zone, 0, mt);
trace_mm_page_pcpu_drain(page, 0, mt);
if (likely(!is_migrate_isolate_page(page))) {
__mod_zone_page_state(zone, NR_FREE_PAGES, 1);
if (is_migrate_cma(mt))
__mod_zone_page_state(zone, NR_FREE_CMA_PAGES, 1);
}
As you can see in the above snippet, the current code already handles the
second problem, incorrect freepage accounting, by re-fetching the pageblock
migratetype through is_migrate_isolate_page(page).
But, because this re-fetched information isn't used for
__free_one_page(), the first problem would not be solved. This patch tries
to solve this situation by re-fetching the pageblock migratetype before
__free_one_page() and using it for __free_one_page().
In addition to moving up the position of this re-fetch, this patch uses an
optimization technique, re-fetching the migratetype only if there is an
isolate pageblock. Pageblock isolation is a rare event, so we can avoid
re-fetching in the common case with this optimization.
This patch also corrects the migratetype in the tracepoint output.
Signed-off-by: Joonsoo Kim <[email protected]>
Acked-by: Minchan Kim <[email protected]>
Acked-by: Michal Nazarewicz <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Yasuaki Ishimatsu <[email protected]>
Cc: Zhang Yanfei <[email protected]>
Cc: Tang Chen <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Bartlomiej Zolnierkiewicz <[email protected]>
Cc: Wen Congyang <[email protected]>
Cc: Marek Szyprowski <[email protected]>
Cc: Laura Abbott <[email protected]>
Cc: Heesub Shin <[email protected]>
Cc: "Aneesh Kumar K.V" <[email protected]>
Cc: Ritesh Harjani <[email protected]>
Cc: Gioh Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
mm/page_alloc.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -716,14 +716,17 @@ static void free_pcppages_bulk(struct zo
/* must delete as __free_one_page list manipulates */
list_del(&page->lru);
mt = get_freepage_migratetype(page);
+ if (unlikely(has_isolate_pageblock(zone))) {
+ mt = get_pageblock_migratetype(page);
+ if (is_migrate_isolate(mt))
+ goto skip_counting;
+ }
+ __mod_zone_freepage_state(zone, 1, mt);
+
+skip_counting:
/* MIGRATE_MOVABLE list may include MIGRATE_RESERVEs */
__free_one_page(page, page_to_pfn(page), zone, 0, mt);
trace_mm_page_pcpu_drain(page, 0, mt);
- if (likely(!is_migrate_isolate_page(page))) {
- __mod_zone_page_state(zone, NR_FREE_PAGES, 1);
- if (is_migrate_cma(mt))
- __mod_zone_page_state(zone, NR_FREE_CMA_PAGES, 1);
- }
} while (--to_free && --batch_free && !list_empty(list));
}
spin_unlock(&zone->lock);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joonsoo Kim <[email protected]>
commit ad53f92eb416d81e469fa8ea57153e59455e7175 upstream.
Before describing bugs itself, I first explain definition of freepage.
1. pages on buddy list are counted as freepage.
2. pages on isolate migratetype buddy list are *not* counted as freepage.
3. pages on cma buddy list are counted as CMA freepage, too.
Now, I describe problems and related patch.
Patch 1: There are race conditions on getting the pageblock migratetype
that result in misplacement of freepages on the buddy list, an incorrect
freepage count and unavailability of freepages.
Patch 2: Freepages on the pcp list could have stale cached information used
to determine which migratetype of buddy list to go to. This causes
misplacement of freepages on the buddy list and an incorrect freepage count.
Patch 4: Merging between freepages on pageblocks of different migratetypes
will cause a freepage accounting problem. This patch fixes it.
Without patchset [3], the above problem doesn't happen on my CMA allocation
test, because CMA reserved pages aren't used at all. So there is no
chance for the above race.
With patchset [3], I did simple CMA allocation test and get below
result:
- Virtual machine, 4 cpus, 1024 MB memory, 256 MB CMA reservation
- run kernel build (make -j16) on background
- 30 times CMA allocation(8MB * 30 = 240MB) attempts in 5 sec interval
- Result: more than 5000 pages are missing from the freepage count
With patchset [3] and this patchset, I found that no freepage counts are
missed, so I conclude that the problems are solved.
These problems also occur on my simple memory offlining test
environment.
This patch (of 4):
There are two paths to reach core free function of buddy allocator,
__free_one_page(), one is free_one_page()->__free_one_page() and the
other is free_hot_cold_page()->free_pcppages_bulk()->__free_one_page().
Each path has a race condition causing serious problems. At first, this
patch is focused on the first type of freepath, and then a following patch
will solve the problem in the second type of freepath.
In the first type of freepath, we get the migratetype of the freeing page
without holding the zone lock, so it could be racy. There are two cases
of this race.
1. pages are added to the isolate buddy list after restoring the original
migratetype
CPU1 CPU2
get migratetype => return MIGRATE_ISOLATE
call free_one_page() with MIGRATE_ISOLATE
grab the zone lock
unisolate pageblock
release the zone lock
grab the zone lock
call __free_one_page() with MIGRATE_ISOLATE
freepage go into isolate buddy list,
although pageblock is already unisolated
This may cause two problems. One is that we can't use this page anymore
until the next isolation attempt on this pageblock, because the freepage is
on the isolate buddy list. The other is that the freepage accounting could
be wrong due to merging between different buddy lists. Freepages on the
isolate buddy list aren't counted as freepages, but ones on the normal buddy
list are counted as freepages. If a merge happens, the buddy freepage on the
normal buddy list is inevitably moved to the isolate buddy list without any
consideration of freepage accounting, so it could be incorrect.
2. pages are added to the normal buddy list while the pageblock is isolated.
It is similar to the above case.
This also may cause two problems. One is that we can't keep these
freepages from being allocated. Although this pageblock is isolated, the
freepage would be added to a normal buddy list so that it could be
allocated without any restriction. And the other problem is the same as
in case 1, that is, incorrect freepage accounting.
This race condition can be prevented by checking the migratetype again
while holding the zone lock. Because it is a somewhat heavy operation and
it isn't needed in the common case, we want to avoid rechecking as much as
possible. So this patch introduces a new variable, nr_isolate_pageblock in
struct zone, to check whether there is an isolated pageblock. With this, we
can avoid re-checking the migratetype in the common case and do it only if
there is an isolated pageblock or the migratetype is MIGRATE_ISOLATE. This
solves the above mentioned problems.
Changes from v3:
Add one more check in free_one_page() that checks whether migratetype is
MIGRATE_ISOLATE or not. Without this, the abovementioned case 1 could happen.
Signed-off-by: Joonsoo Kim <[email protected]>
Acked-by: Minchan Kim <[email protected]>
Acked-by: Michal Nazarewicz <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Yasuaki Ishimatsu <[email protected]>
Cc: Zhang Yanfei <[email protected]>
Cc: Tang Chen <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Bartlomiej Zolnierkiewicz <[email protected]>
Cc: Wen Congyang <[email protected]>
Cc: Marek Szyprowski <[email protected]>
Cc: Laura Abbott <[email protected]>
Cc: Heesub Shin <[email protected]>
Cc: "Aneesh Kumar K.V" <[email protected]>
Cc: Ritesh Harjani <[email protected]>
Cc: Gioh Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/mmzone.h | 9 +++++++++
include/linux/page-isolation.h | 8 ++++++++
mm/page_alloc.c | 11 +++++++++--
mm/page_isolation.c | 2 ++
4 files changed, 28 insertions(+), 2 deletions(-)
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -431,6 +431,15 @@ struct zone {
*/
int nr_migrate_reserve_block;
+#ifdef CONFIG_MEMORY_ISOLATION
+ /*
+ * Number of isolated pageblock. It is used to solve incorrect
+ * freepage counting problem due to racy retrieving migratetype
+ * of pageblock. Protected by zone->lock.
+ */
+ unsigned long nr_isolate_pageblock;
+#endif
+
#ifdef CONFIG_MEMORY_HOTPLUG
/* see spanned/present_pages for more description */
seqlock_t span_seqlock;
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -2,6 +2,10 @@
#define __LINUX_PAGEISOLATION_H
#ifdef CONFIG_MEMORY_ISOLATION
+static inline bool has_isolate_pageblock(struct zone *zone)
+{
+ return zone->nr_isolate_pageblock;
+}
static inline bool is_migrate_isolate_page(struct page *page)
{
return get_pageblock_migratetype(page) == MIGRATE_ISOLATE;
@@ -11,6 +15,10 @@ static inline bool is_migrate_isolate(in
return migratetype == MIGRATE_ISOLATE;
}
#else
+static inline bool has_isolate_pageblock(struct zone *zone)
+{
+ return false;
+}
static inline bool is_migrate_isolate_page(struct page *page)
{
return false;
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -740,9 +740,16 @@ static void free_one_page(struct zone *z
if (nr_scanned)
__mod_zone_page_state(zone, NR_PAGES_SCANNED, -nr_scanned);
+ if (unlikely(has_isolate_pageblock(zone) ||
+ is_migrate_isolate(migratetype))) {
+ migratetype = get_pfnblock_migratetype(page, pfn);
+ if (is_migrate_isolate(migratetype))
+ goto skip_counting;
+ }
+ __mod_zone_freepage_state(zone, 1 << order, migratetype);
+
+skip_counting:
__free_one_page(page, pfn, zone, order, migratetype);
- if (unlikely(!is_migrate_isolate(migratetype)))
- __mod_zone_freepage_state(zone, 1 << order, migratetype);
spin_unlock(&zone->lock);
}
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -60,6 +60,7 @@ out:
int migratetype = get_pageblock_migratetype(page);
set_pageblock_migratetype(page, MIGRATE_ISOLATE);
+ zone->nr_isolate_pageblock++;
nr_pages = move_freepages_block(zone, page, MIGRATE_ISOLATE);
__mod_zone_freepage_state(zone, -nr_pages, migratetype);
@@ -83,6 +84,7 @@ void unset_migratetype_isolate(struct pa
nr_pages = move_freepages_block(zone, page, migratetype);
__mod_zone_freepage_state(zone, nr_pages, migratetype);
set_pageblock_migratetype(page, migratetype);
+ zone->nr_isolate_pageblock--;
out:
spin_unlock_irqrestore(&zone->lock, flags);
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Weijie Yang <[email protected]>
commit c406515239376fc93a30d5d03192182160cbd3fb upstream.
zram could kunmap_atomic() a NULL pointer in a rare situation: a zram
page becomes a full-zeroed page after a partial write io. The current
code doesn't handle this case and performs kunmap_atomic() on a NULL
pointer, which panics the kernel.
This patch fixes this issue.
Signed-off-by: Weijie Yang <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Dan Streetman <[email protected]>
Cc: Nitin Gupta <[email protected]>
Cc: Weijie Yang <[email protected]>
Acked-by: Jerome Marchand <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/block/zram/zram_drv.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -476,7 +476,8 @@ static int zram_bvec_write(struct zram *
}
if (page_zero_filled(uncmem)) {
- kunmap_atomic(user_mem);
+ if (user_mem)
+ kunmap_atomic(user_mem);
/* Free memory associated with this sector now. */
bit_spin_lock(ZRAM_ACCESS, &meta->table[index].value);
zram_free_page(zram, index);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andreas Larsson <[email protected]>
[ Upstream commit 1a17fdc4f4ed06b63fac1937470378a5441a663a ]
Atomicity between xchg and cmpxchg cannot be guaranteed when xchg is
implemented with a swap and cmpxchg is implemented with locks.
Without this, e.g. mcs_spin_lock and mcs_spin_unlock are broken.
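An illustrative interleaving (not from the changelog) of why a raw swap-based
xchg and a lock-based cmpxchg do not compose, and why routing both through the
same ATOMIC_HASH spinlock - as the hunks below do - restores atomicity:

/*
 *   CPU0: atomic_cmpxchg(v, old, new)        CPU1: xchg(&v->counter, x)
 *         spin_lock(ATOMIC_HASH(v))
 *         ret = v->counter      (== old)
 *                                                  swap [v->counter], x
 *         v->counter = new      <-- silently overwrites CPU1's store
 *         spin_unlock(ATOMIC_HASH(v))
 *
 * With atomic_xchg()/__xchg_u32() also taking ATOMIC_HASH(), CPU1 has to
 * wait for the lock, so the two updates can no longer interleave.
 */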
Signed-off-by: Andreas Larsson <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/include/asm/atomic_32.h | 2 +-
arch/sparc/include/asm/cmpxchg_32.h | 12 ++----------
arch/sparc/lib/atomic32.c | 27 +++++++++++++++++++++++++++
3 files changed, 30 insertions(+), 11 deletions(-)
--- a/arch/sparc/include/asm/atomic_32.h
+++ b/arch/sparc/include/asm/atomic_32.h
@@ -22,7 +22,7 @@
int __atomic_add_return(int, atomic_t *);
int atomic_cmpxchg(atomic_t *, int, int);
-#define atomic_xchg(v, new) (xchg(&((v)->counter), new))
+int atomic_xchg(atomic_t *, int);
int __atomic_add_unless(atomic_t *, int, int);
void atomic_set(atomic_t *, int);
--- a/arch/sparc/include/asm/cmpxchg_32.h
+++ b/arch/sparc/include/asm/cmpxchg_32.h
@@ -11,22 +11,14 @@
#ifndef __ARCH_SPARC_CMPXCHG__
#define __ARCH_SPARC_CMPXCHG__
-static inline unsigned long xchg_u32(__volatile__ unsigned long *m, unsigned long val)
-{
- __asm__ __volatile__("swap [%2], %0"
- : "=&r" (val)
- : "0" (val), "r" (m)
- : "memory");
- return val;
-}
-
+unsigned long __xchg_u32(volatile u32 *m, u32 new);
void __xchg_called_with_bad_pointer(void);
static inline unsigned long __xchg(unsigned long x, __volatile__ void * ptr, int size)
{
switch (size) {
case 4:
- return xchg_u32(ptr, x);
+ return __xchg_u32(ptr, x);
}
__xchg_called_with_bad_pointer();
return x;
--- a/arch/sparc/lib/atomic32.c
+++ b/arch/sparc/lib/atomic32.c
@@ -40,6 +40,19 @@ int __atomic_add_return(int i, atomic_t
}
EXPORT_SYMBOL(__atomic_add_return);
+int atomic_xchg(atomic_t *v, int new)
+{
+ int ret;
+ unsigned long flags;
+
+ spin_lock_irqsave(ATOMIC_HASH(v), flags);
+ ret = v->counter;
+ v->counter = new;
+ spin_unlock_irqrestore(ATOMIC_HASH(v), flags);
+ return ret;
+}
+EXPORT_SYMBOL(atomic_xchg);
+
int atomic_cmpxchg(atomic_t *v, int old, int new)
{
int ret;
@@ -132,3 +145,17 @@ unsigned long __cmpxchg_u32(volatile u32
return (unsigned long)prev;
}
EXPORT_SYMBOL(__cmpxchg_u32);
+
+unsigned long __xchg_u32(volatile u32 *ptr, u32 new)
+{
+ unsigned long flags;
+ u32 prev;
+
+ spin_lock_irqsave(ATOMIC_HASH(ptr), flags);
+ prev = *ptr;
+ *ptr = new;
+ spin_unlock_irqrestore(ATOMIC_HASH(ptr), flags);
+
+ return (unsigned long)prev;
+}
+EXPORT_SYMBOL(__xchg_u32);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Allen Pais <[email protected]>
[ Upstream commit 9bce21828d54a95143f1b74619705c2dd8e88b92 ]
Interpret the media type from v1.1 protocol to support CDROM/DVD.
For v1.0 protocol, a disk's size continues to be calculated from the
geometry returned by the vdisk server. The geometry returned by the server
can be less than the actual number of sectors available in the backing
image/device due to the rounding in the division used to compute the
geometry in the vdisk server.
In the v1.1 protocol, a disk's actual size in sectors is returned during the
handshake. Use this size when the v1.1 protocol is negotiated. Since this size
will always be larger than the geometry-computed size, disks created
under v1.0 will be forward compatible with v1.1, but not vice versa.
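A condensed sketch of the two sizing paths described above (field names
follow the driver, error handling elided):

        if (vdc_version_supported(port, 1, 1)) {
                /* v1.1: the server reports the exact sector count in the
                 * attribute packet during the handshake */
                port->vdisk_size = pkt->vdisk_size;
        } else {
                /* v1.0: fall back to geometry; the server's integer
                 * division can round this below the real capacity */
                port->vdisk_size = (u64)port->geom.num_cyl *
                                   (u64)port->geom.num_hd *
                                   (u64)port->geom.num_sec;
        }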
Signed-off-by: Dwight Engen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/include/asm/vio.h | 12 +++-
drivers/block/sunvdc.c | 109 ++++++++++++++++++++++++++++++++++++-------
2 files changed, 101 insertions(+), 20 deletions(-)
--- a/arch/sparc/include/asm/vio.h
+++ b/arch/sparc/include/asm/vio.h
@@ -118,12 +118,18 @@ struct vio_disk_attr_info {
u8 vdisk_type;
#define VD_DISK_TYPE_SLICE 0x01 /* Slice in block device */
#define VD_DISK_TYPE_DISK 0x02 /* Entire block device */
- u16 resv1;
+ u8 vdisk_mtype; /* v1.1 */
+#define VD_MEDIA_TYPE_FIXED 0x01 /* Fixed device */
+#define VD_MEDIA_TYPE_CD 0x02 /* CD Device */
+#define VD_MEDIA_TYPE_DVD 0x03 /* DVD Device */
+ u8 resv1;
u32 vdisk_block_size;
u64 operations;
- u64 vdisk_size;
+ u64 vdisk_size; /* v1.1 */
u64 max_xfer_size;
- u64 resv2[2];
+ u32 phys_block_size; /* v1.2 */
+ u32 resv2;
+ u64 resv3[1];
};
struct vio_disk_desc {
--- a/drivers/block/sunvdc.c
+++ b/drivers/block/sunvdc.c
@@ -9,6 +9,7 @@
#include <linux/blkdev.h>
#include <linux/hdreg.h>
#include <linux/genhd.h>
+#include <linux/cdrom.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/completion.h>
@@ -22,8 +23,8 @@
#define DRV_MODULE_NAME "sunvdc"
#define PFX DRV_MODULE_NAME ": "
-#define DRV_MODULE_VERSION "1.0"
-#define DRV_MODULE_RELDATE "June 25, 2007"
+#define DRV_MODULE_VERSION "1.1"
+#define DRV_MODULE_RELDATE "February 13, 2013"
static char version[] =
DRV_MODULE_NAME ".c:v" DRV_MODULE_VERSION " (" DRV_MODULE_RELDATE ")\n";
@@ -65,6 +66,7 @@ struct vdc_port {
u64 operations;
u32 vdisk_size;
u8 vdisk_type;
+ u8 vdisk_mtype;
char disk_name[32];
@@ -79,9 +81,16 @@ static inline struct vdc_port *to_vdc_po
/* Ordered from largest major to lowest */
static struct vio_version vdc_versions[] = {
+ { .major = 1, .minor = 1 },
{ .major = 1, .minor = 0 },
};
+static inline int vdc_version_supported(struct vdc_port *port,
+ u16 major, u16 minor)
+{
+ return port->vio.ver.major == major && port->vio.ver.minor >= minor;
+}
+
#define VDCBLK_NAME "vdisk"
static int vdc_major;
#define PARTITION_SHIFT 3
@@ -103,9 +112,41 @@ static int vdc_getgeo(struct block_devic
return 0;
}
+/* Add ioctl/CDROM_GET_CAPABILITY to support cdrom_id in udev
+ * when vdisk_mtype is VD_MEDIA_TYPE_CD or VD_MEDIA_TYPE_DVD.
+ * Needed to be able to install inside an ldom from an iso image.
+ */
+static int vdc_ioctl(struct block_device *bdev, fmode_t mode,
+ unsigned command, unsigned long argument)
+{
+ int i;
+ struct gendisk *disk;
+
+ switch (command) {
+ case CDROMMULTISESSION:
+ pr_debug(PFX "Multisession CDs not supported\n");
+ for (i = 0; i < sizeof(struct cdrom_multisession); i++)
+ if (put_user(0, (char __user *)(argument + i)))
+ return -EFAULT;
+ return 0;
+
+ case CDROM_GET_CAPABILITY:
+ disk = bdev->bd_disk;
+
+ if (bdev->bd_disk && (disk->flags & GENHD_FL_CD))
+ return 0;
+ return -EINVAL;
+
+ default:
+ pr_debug(PFX "ioctl %08x not supported\n", command);
+ return -EINVAL;
+ }
+}
+
static const struct block_device_operations vdc_fops = {
.owner = THIS_MODULE,
.getgeo = vdc_getgeo,
+ .ioctl = vdc_ioctl,
};
static void vdc_finish(struct vio_driver_state *vio, int err, int waiting_for)
@@ -165,9 +206,9 @@ static int vdc_handle_attr(struct vio_dr
struct vio_disk_attr_info *pkt = arg;
viodbg(HS, "GOT ATTR stype[0x%x] ops[%llx] disk_size[%llu] disk_type[%x] "
- "xfer_mode[0x%x] blksz[%u] max_xfer[%llu]\n",
+ "mtype[0x%x] xfer_mode[0x%x] blksz[%u] max_xfer[%llu]\n",
pkt->tag.stype, pkt->operations,
- pkt->vdisk_size, pkt->vdisk_type,
+ pkt->vdisk_size, pkt->vdisk_type, pkt->vdisk_mtype,
pkt->xfer_mode, pkt->vdisk_block_size,
pkt->max_xfer_size);
@@ -192,8 +233,11 @@ static int vdc_handle_attr(struct vio_dr
}
port->operations = pkt->operations;
- port->vdisk_size = pkt->vdisk_size;
port->vdisk_type = pkt->vdisk_type;
+ if (vdc_version_supported(port, 1, 1)) {
+ port->vdisk_size = pkt->vdisk_size;
+ port->vdisk_mtype = pkt->vdisk_mtype;
+ }
if (pkt->max_xfer_size < port->max_xfer_size)
port->max_xfer_size = pkt->max_xfer_size;
port->vdisk_block_size = pkt->vdisk_block_size;
@@ -663,18 +707,25 @@ static int probe_disk(struct vdc_port *p
return err;
}
- err = generic_request(port, VD_OP_GET_DISKGEOM,
- &port->geom, sizeof(port->geom));
- if (err < 0) {
- printk(KERN_ERR PFX "VD_OP_GET_DISKGEOM returns "
- "error %d\n", err);
- return err;
+ if (vdc_version_supported(port, 1, 1)) {
+ /* vdisk_size should be set during the handshake, if it wasn't
+ * then the underlying disk is reserved by another system
+ */
+ if (port->vdisk_size == -1)
+ return -ENODEV;
+ } else {
+ err = generic_request(port, VD_OP_GET_DISKGEOM,
+ &port->geom, sizeof(port->geom));
+ if (err < 0) {
+ printk(KERN_ERR PFX "VD_OP_GET_DISKGEOM returns "
+ "error %d\n", err);
+ return err;
+ }
+ port->vdisk_size = ((u64)port->geom.num_cyl *
+ (u64)port->geom.num_hd *
+ (u64)port->geom.num_sec);
}
- port->vdisk_size = ((u64)port->geom.num_cyl *
- (u64)port->geom.num_hd *
- (u64)port->geom.num_sec);
-
q = blk_init_queue(do_vdc_request, &port->vio.lock);
if (!q) {
printk(KERN_ERR PFX "%s: Could not allocate queue.\n",
@@ -704,9 +755,32 @@ static int probe_disk(struct vdc_port *p
set_capacity(g, port->vdisk_size);
- printk(KERN_INFO PFX "%s: %u sectors (%u MB)\n",
+ if (vdc_version_supported(port, 1, 1)) {
+ switch (port->vdisk_mtype) {
+ case VD_MEDIA_TYPE_CD:
+ pr_info(PFX "Virtual CDROM %s\n", port->disk_name);
+ g->flags |= GENHD_FL_CD;
+ g->flags |= GENHD_FL_REMOVABLE;
+ set_disk_ro(g, 1);
+ break;
+
+ case VD_MEDIA_TYPE_DVD:
+ pr_info(PFX "Virtual DVD %s\n", port->disk_name);
+ g->flags |= GENHD_FL_CD;
+ g->flags |= GENHD_FL_REMOVABLE;
+ set_disk_ro(g, 1);
+ break;
+
+ case VD_MEDIA_TYPE_FIXED:
+ pr_info(PFX "Virtual Hard disk %s\n", port->disk_name);
+ break;
+ }
+ }
+
+ pr_info(PFX "%s: %u sectors (%u MB) protocol %d.%d\n",
g->disk_name,
- port->vdisk_size, (port->vdisk_size >> (20 - 9)));
+ port->vdisk_size, (port->vdisk_size >> (20 - 9)),
+ port->vio.ver.major, port->vio.ver.minor);
add_disk(g);
@@ -765,6 +839,7 @@ static int vdc_port_probe(struct vio_dev
else
snprintf(port->disk_name, sizeof(port->disk_name),
VDCBLK_NAME "%c", 'a' + ((int)vdev->dev_no % 26));
+ port->vdisk_size = -1;
err = vio_driver_init(&port->vio, vdev, VDEV_DISK,
vdc_versions, ARRAY_SIZE(vdc_versions),
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: "David S. Miller" <[email protected]>
[ Upstream commit ab5c780913bca0a5763ca05dd5c2cb5cb08ccb26 ]
Otherwise rcu_irq_{enter,exit}() do not happen and we get dumps like:
====================
[ 188.275021] ===============================
[ 188.309351] [ INFO: suspicious RCU usage. ]
[ 188.343737] 3.18.0-rc3-00068-g20f3963-dirty #54 Not tainted
[ 188.394786] -------------------------------
[ 188.429170] include/linux/rcupdate.h:883 rcu_read_lock() used
illegally while idle!
[ 188.505235]
other info that might help us debug this:
[ 188.554230]
RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 0
[ 188.637587] RCU used illegally from extended quiescent state!
[ 188.690684] 3 locks held by swapper/7/0:
[ 188.721932] #0: (&x->wait#11){......}, at: [<0000000000495de8>] complete+0x8/0x60
[ 188.797994] #1: (&p->pi_lock){-.-.-.}, at: [<000000000048510c>] try_to_wake_up+0xc/0x400
[ 188.881343] #2: (rcu_read_lock){......}, at: [<000000000048a910>] select_task_rq_fair+0x90/0xb40
[ 188.973043]stack backtrace:
[ 188.993879] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 3.18.0-rc3-00068-g20f3963-dirty #54
[ 189.076187] Call Trace:
[ 189.089719] [0000000000499360] lockdep_rcu_suspicious+0xe0/0x100
[ 189.147035] [000000000048a99c] select_task_rq_fair+0x11c/0xb40
[ 189.202253] [00000000004852d8] try_to_wake_up+0x1d8/0x400
[ 189.252258] [000000000048554c] default_wake_function+0xc/0x20
[ 189.306435] [0000000000495554] __wake_up_common+0x34/0x80
[ 189.356448] [00000000004955b4] __wake_up_locked+0x14/0x40
[ 189.406456] [0000000000495e08] complete+0x28/0x60
[ 189.448142] [0000000000636e28] blk_end_sync_rq+0x8/0x20
[ 189.496057] [0000000000639898] __blk_mq_end_request+0x18/0x60
[ 189.550249] [00000000006ee014] scsi_end_request+0x94/0x180
[ 189.601286] [00000000006ee334] scsi_io_completion+0x1d4/0x600
[ 189.655463] [00000000006e51c4] scsi_finish_command+0xc4/0xe0
[ 189.708598] [00000000006ed958] scsi_softirq_done+0x118/0x140
[ 189.761735] [00000000006398ec] __blk_mq_complete_request_remote+0xc/0x20
[ 189.827383] [00000000004c75d0] generic_smp_call_function_single_interrupt+0x150/0x1c0
[ 189.906581] [000000000043e514] smp_call_function_single_client+0x14/0x40
====================
Based almost entirely upon a patch by Paul E. McKenney.
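A minimal sketch of the bracketing pattern the patch applies (handler name
illustrative): any low-level interrupt entry that can fire while the CPU is
idle must be wrapped in irq_enter()/irq_exit() so that
rcu_irq_enter()/rcu_irq_exit() run and RCU read-side sections inside the
handler become legal.

        void __irq_entry example_call_function_client(int irq, struct pt_regs *regs)
        {
                clear_softint(1 << irq);        /* ack the sparc software interrupt */
                irq_enter();                    /* marks the CPU as non-idle for RCU */
                generic_smp_call_function_interrupt();
                irq_exit();                     /* may also run pending softirqs */
        }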
Reported-by: Meelis Roos <[email protected]>
Tested-by: Meelis Roos <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/kernel/smp_64.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/arch/sparc/kernel/smp_64.c
+++ b/arch/sparc/kernel/smp_64.c
@@ -816,13 +816,17 @@ void arch_send_call_function_single_ipi(
void __irq_entry smp_call_function_client(int irq, struct pt_regs *regs)
{
clear_softint(1 << irq);
+ irq_enter();
generic_smp_call_function_interrupt();
+ irq_exit();
}
void __irq_entry smp_call_function_single_client(int irq, struct pt_regs *regs)
{
clear_softint(1 << irq);
+ irq_enter();
generic_smp_call_function_single_interrupt();
+ irq_exit();
}
static void tsb_sync(void *info)
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: "David S. Miller" <[email protected]>
[ Upstream commit 7da89a2a3776442a57e918ca0b8678d1b16a7072 ]
Meelis Roos reports crashes during bootup on a V480 that look like
this:
====================
[ 61.300577] PCI: Scanning PBM /pci@9,600000
[ 61.304867] schizo f009b070: PCI host bridge to bus 0003:00
[ 61.310385] pci_bus 0003:00: root bus resource [io 0x7ffe9000000-0x7ffe9ffffff] (bus address [0x0000-0xffffff])
[ 61.320515] pci_bus 0003:00: root bus resource [mem 0x7fb00000000-0x7fbffffffff] (bus address [0x00000000-0xffffffff])
[ 61.331173] pci_bus 0003:00: root bus resource [bus 00]
[ 61.385344] Unable to handle kernel NULL pointer dereference
[ 61.390970] tsk->{mm,active_mm}->context = 0000000000000000
[ 61.396515] tsk->{mm,active_mm}->pgd = fff000b000002000
[ 61.401716] \|/ ____ \|/
[ 61.401716] "@'/ .. \`@"
[ 61.401716] /_| \__/ |_\
[ 61.401716] \__U_/
[ 61.416362] swapper/0(0): Oops [#1]
[ 61.419837] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc1-00422-g2cc9188-dirty #24
[ 61.427975] task: fff000b0fd8e9c40 ti: fff000b0fd928000 task.ti: fff000b0fd928000
[ 61.435426] TSTATE: 0000004480e01602 TPC: 00000000004455e4 TNPC: 00000000004455e8 Y: 00000000 Not tainted
[ 61.445230] TPC: <schizo_pcierr_intr+0x104/0x560>
[ 61.449897] g0: 0000000000000000 g1: 0000000000000000 g2: 0000000000a10f78 g3: 000000000000000a
[ 61.458563] g4: fff000b0fd8e9c40 g5: fff000b0fdd82000 g6: fff000b0fd928000 g7: 000000000000000a
[ 61.467229] o0: 000000000000003d o1: 0000000000000000 o2: 0000000000000006 o3: fff000b0ffa5fc7e
[ 61.475894] o4: 0000000000060000 o5: c000000000000000 sp: fff000b0ffa5f3c1 ret_pc: 00000000004455cc
[ 61.484909] RPC: <schizo_pcierr_intr+0xec/0x560>
[ 61.489500] l0: fff000b0fd8e9c40 l1: 0000000000a20800 l2: 0000000000000000 l3: 000000000119a430
[ 61.498164] l4: 0000000001742400 l5: 00000000011cfbe0 l6: 00000000011319c0 l7: fff000b0fd8ea348
[ 61.506830] i0: 0000000000000000 i1: fff000b0fdb34000 i2: 0000000320000000 i3: 0000000000000000
[ 61.515497] i4: 00060002010b003f i5: 0000040004e02000 i6: fff000b0ffa5f481 i7: 00000000004a9920
[ 61.524175] I7: <handle_irq_event_percpu+0x40/0x140>
[ 61.529099] Call Trace:
[ 61.531531] [00000000004a9920] handle_irq_event_percpu+0x40/0x140
[ 61.537681] [00000000004a9a58] handle_irq_event+0x38/0x80
[ 61.543145] [00000000004ac77c] handle_fasteoi_irq+0xbc/0x200
[ 61.548860] [00000000004a9084] generic_handle_irq+0x24/0x40
[ 61.554500] [000000000042be0c] handler_irq+0xac/0x100
====================
The problem is that pbm->pci_bus->self is NULL.
This code is trying to go through the standard PCI config space
interfaces to read the PCI controller's PCI_STATUS register.
This doesn't work, because more often than not we do not enumerate
the PCI controller as a bona fide PCI device during the OF device
node scan. Therefore bus->self remains NULL.
Existing common code for PSYCHO and PSYCHO-like PCI controllers
handles this properly, by doing the config space access directly.
Do the same here, using pbm->pci_ops->{read,write}().
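A kernel-context sketch of the resulting access pattern (condensed from the
hunks below); the local variable also becomes a u32 because the pci_ops
read/write methods take a size argument and operate on u32 values:

        u32 stat;

        /* devfn 0 here addresses the controller itself; size 2 reads and
         * writes the 16-bit PCI_STATUS register without going through
         * the nonexistent bus->self device */
        pbm->pci_ops->read(pbm->pci_bus, 0, PCI_STATUS, 2, &stat);
        if (stat & (PCI_STATUS_PARITY | PCI_STATUS_SIG_SYSTEM_ERROR))
                pbm->pci_ops->write(pbm->pci_bus, 0, PCI_STATUS, 2, 0xffff);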
Reported-by: Meelis Roos <[email protected]>
Tested-by: Meelis Roos <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/kernel/pci_schizo.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/arch/sparc/kernel/pci_schizo.c
+++ b/arch/sparc/kernel/pci_schizo.c
@@ -581,7 +581,7 @@ static irqreturn_t schizo_pcierr_intr_ot
{
unsigned long csr_reg, csr, csr_error_bits;
irqreturn_t ret = IRQ_NONE;
- u16 stat;
+ u32 stat;
csr_reg = pbm->pbm_regs + SCHIZO_PCI_CTRL;
csr = upa_readq(csr_reg);
@@ -617,7 +617,7 @@ static irqreturn_t schizo_pcierr_intr_ot
pbm->name);
ret = IRQ_HANDLED;
}
- pci_read_config_word(pbm->pci_bus->self, PCI_STATUS, &stat);
+ pbm->pci_ops->read(pbm->pci_bus, 0, PCI_STATUS, 2, &stat);
if (stat & (PCI_STATUS_PARITY |
PCI_STATUS_SIG_TARGET_ABORT |
PCI_STATUS_REC_TARGET_ABORT |
@@ -625,7 +625,7 @@ static irqreturn_t schizo_pcierr_intr_ot
PCI_STATUS_SIG_SYSTEM_ERROR)) {
printk("%s: PCI bus error, PCI_STATUS[%04x]\n",
pbm->name, stat);
- pci_write_config_word(pbm->pci_bus->self, PCI_STATUS, 0xffff);
+ pbm->pci_ops->write(pbm->pci_bus, 0, PCI_STATUS, 2, 0xffff);
ret = IRQ_HANDLED;
}
return ret;
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shuah Khan <[email protected]>
commit 4ea48a01bb1a99f4185b77cd90cf962730336cc4 upstream.
The following generated files are missing from .gitignore
and show up in git status after an x86_64 build. Add them
to .gitignore.
arch/x86/purgatory/kexec-purgatory.c
arch/x86/purgatory/purgatory.ro
Signed-off-by: Shuah Khan <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/.gitignore | 2 ++
1 file changed, 2 insertions(+)
--- a/arch/x86/.gitignore
+++ b/arch/x86/.gitignore
@@ -1,4 +1,6 @@
boot/compressed/vmlinux
tools/test_get_len
tools/insn_sanity
+purgatory/kexec-purgatory.c
+purgatory/purgatory.ro
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dwight Engen <[email protected]>
[ Upstream commit d0aedcd4f14a22e23b313f42b7e6e6ebfc0fbc31 ]
vio_dring_avail() will allow use of every dring entry, but when the last
entry is allocated then dr->prod == dr->cons which is indistinguishable from
the ring empty condition. This causes the next allocation to reuse an entry.
When this happens in sunvdc, the server side vds driver begins nack'ing the
messages and ends up resetting the ldc channel. This problem does not affect
sunvnet since it checks for < 2.
The fix here is to just never allocate the very last dring slot so that full
and empty are not the same condition. The request start path was changed to
check for the ring being full a bit earlier, and to stop the blk_queue if
there is no space left. The blk_queue will be restarted once the ring is
only half full again. The number of ring entries was increased to 512 which
matches the sunvnet and Solaris vdc drivers, and greatly reduces the
frequency of hitting the ring full condition and the associated blk_queue
stop/starting. The checks in sunvnet were adjusted to account for
vio_dring_avail() returning 1 less.
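A self-contained illustration (not the driver code) of the ambiguity being
avoided: with prod and cons masked into a power-of-two ring, prod == cons
means both "empty" and "completely full", so one slot is sacrificed to keep
the two states distinct.

#include <stdio.h>

#define RING_SIZE 512   /* must be a power of two */

static unsigned int avail(unsigned int pending, unsigned int prod, unsigned int cons)
{
        /* mirrors vio_dring_avail() after the fix: never hand out the last slot */
        return pending - ((prod - cons) & (RING_SIZE - 1)) - 1;
}

int main(void)
{
        unsigned int prod = 0, cons = 0;

        /* fill every reported free slot */
        while (avail(RING_SIZE, prod, cons) > 0)
                prod = (prod + 1) & (RING_SIZE - 1);

        /* 511 entries are in flight, yet prod != cons, so "full" is still
         * distinguishable from "empty" */
        printf("in flight: %u, prod=%u cons=%u\n",
               (prod - cons) & (RING_SIZE - 1), prod, cons);
        return 0;
}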
Orabug: 19441666
OraBZ: 14983
Signed-off-by: Dwight Engen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/include/asm/vio.h | 2 -
drivers/block/sunvdc.c | 39 +++++++++++++++++++++----------------
drivers/net/ethernet/sun/sunvnet.c | 4 +--
3 files changed, 26 insertions(+), 19 deletions(-)
--- a/arch/sparc/include/asm/vio.h
+++ b/arch/sparc/include/asm/vio.h
@@ -265,7 +265,7 @@ static inline u32 vio_dring_avail(struct
unsigned int ring_size)
{
return (dr->pending -
- ((dr->prod - dr->cons) & (ring_size - 1)));
+ ((dr->prod - dr->cons) & (ring_size - 1)) - 1);
}
#define VIO_MAX_TYPE_LEN 32
--- a/drivers/block/sunvdc.c
+++ b/drivers/block/sunvdc.c
@@ -33,7 +33,7 @@ MODULE_DESCRIPTION("Sun LDOM virtual dis
MODULE_LICENSE("GPL");
MODULE_VERSION(DRV_MODULE_VERSION);
-#define VDC_TX_RING_SIZE 256
+#define VDC_TX_RING_SIZE 512
#define WAITING_FOR_LINK_UP 0x01
#define WAITING_FOR_TX_SPACE 0x02
@@ -283,7 +283,9 @@ static void vdc_end_one(struct vdc_port
__blk_end_request(req, (desc->status ? -EIO : 0), desc->size);
- if (blk_queue_stopped(port->disk->queue))
+ /* restart blk queue when ring is half emptied */
+ if (blk_queue_stopped(port->disk->queue) &&
+ vdc_tx_dring_avail(dr) * 100 / VDC_TX_RING_SIZE >= 50)
blk_start_queue(port->disk->queue);
}
@@ -435,12 +437,6 @@ static int __send_request(struct request
for (i = 0; i < nsg; i++)
len += sg[i].length;
- if (unlikely(vdc_tx_dring_avail(dr) < 1)) {
- blk_stop_queue(port->disk->queue);
- err = -ENOMEM;
- goto out;
- }
-
desc = vio_dring_cur(dr);
err = ldc_map_sg(port->vio.lp, sg, nsg,
@@ -480,21 +476,32 @@ static int __send_request(struct request
port->req_id++;
dr->prod = (dr->prod + 1) & (VDC_TX_RING_SIZE - 1);
}
-out:
return err;
}
-static void do_vdc_request(struct request_queue *q)
+static void do_vdc_request(struct request_queue *rq)
{
- while (1) {
- struct request *req = blk_fetch_request(q);
+ struct request *req;
- if (!req)
+ while ((req = blk_peek_request(rq)) != NULL) {
+ struct vdc_port *port;
+ struct vio_dring_state *dr;
+
+ port = req->rq_disk->private_data;
+ dr = &port->vio.drings[VIO_DRIVER_TX_RING];
+ if (unlikely(vdc_tx_dring_avail(dr) < 1))
+ goto wait;
+
+ blk_start_request(req);
+
+ if (__send_request(req) < 0) {
+ blk_requeue_request(rq, req);
+wait:
+ /* Avoid pointless unplugs. */
+ blk_stop_queue(rq);
break;
-
- if (__send_request(req) < 0)
- __blk_end_request_all(req, -EIO);
+ }
}
}
--- a/drivers/net/ethernet/sun/sunvnet.c
+++ b/drivers/net/ethernet/sun/sunvnet.c
@@ -693,7 +693,7 @@ static int vnet_start_xmit(struct sk_buf
spin_lock_irqsave(&port->vio.lock, flags);
dr = &port->vio.drings[VIO_DRIVER_TX_RING];
- if (unlikely(vnet_tx_dring_avail(dr) < 2)) {
+ if (unlikely(vnet_tx_dring_avail(dr) < 1)) {
if (!netif_queue_stopped(dev)) {
netif_stop_queue(dev);
@@ -749,7 +749,7 @@ static int vnet_start_xmit(struct sk_buf
dev->stats.tx_bytes += skb->len;
dr->prod = (dr->prod + 1) & (VNET_TX_RING_SIZE - 1);
- if (unlikely(vnet_tx_dring_avail(dr) < 2)) {
+ if (unlikely(vnet_tx_dring_avail(dr) < 1)) {
netif_stop_queue(dev);
if (vnet_tx_dring_avail(dr) > VNET_TX_WAKEUP_THRESH(dr))
netif_wake_queue(dev);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dwight Engen <[email protected]>
[ Upstream commit 85b0c6e62c48bb9179fd5b3e954f362fb346cbd5 ]
The VD_OP_GET_VTOC operation will succeed only if the vdisk backend has a
VTOC label, otherwise it will fail. In particular, it will return error
48 (ENOTSUP) if the disk has an EFI label. VTOC disk labels are already
handled by directly reading the disk in block/partitions/sun.c (enabled by
CONFIG_SUN_PARTITION which defaults to y on SPARC). Since port->label is
unused in the driver, remove the call and the field.
Signed-off-by: Dwight Engen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/block/sunvdc.c | 9 ---------
1 file changed, 9 deletions(-)
--- a/drivers/block/sunvdc.c
+++ b/drivers/block/sunvdc.c
@@ -69,8 +69,6 @@ struct vdc_port {
u8 vdisk_mtype;
char disk_name[32];
-
- struct vio_disk_vtoc label;
};
static inline struct vdc_port *to_vdc_port(struct vio_driver_state *vio)
@@ -710,13 +708,6 @@ static int probe_disk(struct vdc_port *p
if (comp.err)
return comp.err;
- err = generic_request(port, VD_OP_GET_VTOC,
- &port->label, sizeof(port->label));
- if (err < 0) {
- printk(KERN_ERR PFX "VD_OP_GET_VTOC returns error %d\n", err);
- return err;
- }
-
if (vdc_version_supported(port, 1, 1)) {
/* vdisk_size should be set during the handshake, if it wasn't
* then the underlying disk is reserved by another system
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet <[email protected]>
[ Upstream commit 5337b5b75cd9bd3624a6820e3c2a084d2480061c ]
Use IS_ENABLED(CONFIG_IPV6) so that this code is also enabled when IPv6 is
built as a module.
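An illustration of why the plain defined() test misfires for modular builds
(the second #if below is a hand-expanded approximation of what
IS_ENABLED() in <linux/kconfig.h> effectively tests): with CONFIG_IPV6=m the
preprocessor sees CONFIG_IPV6_MODULE, not CONFIG_IPV6.

#define CONFIG_IPV6_MODULE 1            /* what CONFIG_IPV6=m generates */

#if defined(CONFIG_IPV6)
/* never reached for a modular IPv6 build: the v4-mapped IPV6_PKTINFO
 * handling silently compiles out */
#endif

#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
/* effectively what IS_ENABLED(CONFIG_IPV6) evaluates, so the code stays
 * compiled in for both =y and =m */
#endif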
Signed-off-by: Eric Dumazet <[email protected]>
Fixes: c8e6ad0829a7 ("ipv6: honor IPV6_PKTINFO with v4 mapped addresses on sendmsg")
Acked-by: Hannes Frederic Sowa <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/ip_sockglue.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -195,7 +195,7 @@ int ip_cmsg_send(struct net *net, struct
for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
if (!CMSG_OK(msg, cmsg))
return -EINVAL;
-#if defined(CONFIG_IPV6)
+#if IS_ENABLED(CONFIG_IPV6)
if (allow_ipv6 &&
cmsg->cmsg_level == SOL_IPV6 &&
cmsg->cmsg_type == IPV6_PKTINFO) {
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Richard Cochran <[email protected]>
[ Upstream commit cca04b2854ecfb7cd1b8ee84ab38bc99af59f526 ]
Commit ae5c6c6d "ptp: Classify ptp over ip over vlan packets" changed the
code in two drivers that match time stamps with PTP frames, with the goal
of allowing VLAN-tagged PTP packets to receive hardware time stamps.
However, that commit failed to account for the VLAN header when parsing
IPv4 packets. This patch fixes those two drivers to correctly match
VLAN-tagged IPv4/UDP PTP messages with their time stamps.
This patch should also be applied to v3.17.
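A worked example with illustrative constants (the real macros live in
<linux/ptp_classify.h>, where IPV4_HLEN(p) reads the IHL field a fixed
ETH_HLEN bytes past p): for a VLAN-tagged frame the IP header sits VLAN_HLEN
bytes later, so the IHL must be read relative to data + offset, not data.

#include <stdio.h>

#define ETH_HLEN   14
#define VLAN_HLEN   4

int main(void)
{
        unsigned int offset  = VLAN_HLEN;          /* classifier saw a VLAN tag    */
        unsigned int ihl_old = ETH_HLEN;           /* where IPV4_HLEN(data) looks  */
        unsigned int ihl_new = offset + ETH_HLEN;  /* IPV4_HLEN(data + offset)     */

        /* reading the IHL 4 bytes early yields a bogus header length, so the
         * time stamp is never matched to its frame */
        printf("IHL read at byte %u (wrong) vs %u (right)\n", ihl_old, ihl_new);
        return 0;
}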
Signed-off-by: Richard Cochran <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/ti/cpts.c | 2 +-
drivers/net/phy/dp83640.c | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/net/ethernet/ti/cpts.c
+++ b/drivers/net/ethernet/ti/cpts.c
@@ -264,7 +264,7 @@ static int cpts_match(struct sk_buff *sk
switch (ptp_class & PTP_CLASS_PMASK) {
case PTP_CLASS_IPV4:
- offset += ETH_HLEN + IPV4_HLEN(data) + UDP_HLEN;
+ offset += ETH_HLEN + IPV4_HLEN(data + offset) + UDP_HLEN;
break;
case PTP_CLASS_IPV6:
offset += ETH_HLEN + IP6_HLEN + UDP_HLEN;
--- a/drivers/net/phy/dp83640.c
+++ b/drivers/net/phy/dp83640.c
@@ -784,7 +784,7 @@ static int match(struct sk_buff *skb, un
switch (type & PTP_CLASS_PMASK) {
case PTP_CLASS_IPV4:
- offset += ETH_HLEN + IPV4_HLEN(data) + UDP_HLEN;
+ offset += ETH_HLEN + IPV4_HLEN(data + offset) + UDP_HLEN;
break;
case PTP_CLASS_IPV6:
offset += ETH_HLEN + IP6_HLEN + UDP_HLEN;
@@ -927,7 +927,7 @@ static int is_sync(struct sk_buff *skb,
switch (type & PTP_CLASS_PMASK) {
case PTP_CLASS_IPV4:
- offset += ETH_HLEN + IPV4_HLEN(data) + UDP_HLEN;
+ offset += ETH_HLEN + IPV4_HLEN(data + offset) + UDP_HLEN;
break;
case PTP_CLASS_IPV6:
offset += ETH_HLEN + IP6_HLEN + UDP_HLEN;
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hiroaki SHIMODA <[email protected]>
[ Upstream commit 6251edd932ce3faadbfe27b0a0fe79780e0972e9 ]
Even if netlink_kernel_cfg::unbind is implemented, the unbind() method is
never called, because cfg->unbind is not copied in __netlink_kernel_create().
Also fix the wrong argument passed to test_bit() in the undo loop and an
off-by-one in its upper bound: the loop runs for undo < group, so the caller
must pass nlk->ngroups rather than nlk->ngroups - 1 to cover the last group.
At this point no unbind() method is implemented, so there is no real issue.
Fixes: 4f520900522f ("netlink: have netlink per-protocol bind function return an error code.")
Signed-off-by: Hiroaki SHIMODA <[email protected]>
Cc: Richard Guy Briggs <[email protected]>
Acked-by: Richard Guy Briggs <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/netlink/af_netlink.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1440,7 +1440,7 @@ static void netlink_unbind(int group, lo
return;
for (undo = 0; undo < group; undo++)
- if (test_bit(group, &groups))
+ if (test_bit(undo, &groups))
nlk->netlink_unbind(undo);
}
@@ -1492,7 +1492,7 @@ static int netlink_bind(struct socket *s
netlink_insert(sk, net, nladdr->nl_pid) :
netlink_autobind(sock);
if (err) {
- netlink_unbind(nlk->ngroups - 1, groups, nlk);
+ netlink_unbind(nlk->ngroups, groups, nlk);
return err;
}
}
@@ -2509,6 +2509,7 @@ __netlink_kernel_create(struct net *net,
nl_table[unit].module = module;
if (cfg) {
nl_table[unit].bind = cfg->bind;
+ nl_table[unit].unbind = cfg->unbind;
nl_table[unit].flags = cfg->flags;
if (cfg->compare)
nl_table[unit].compare = cfg->compare;
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Borkmann <[email protected]>
[ Upstream commit 4184b2a79a7612a9272ce20d639934584a1f3786 ]
A very minimal and simple user space application that allocates an SCTP
socket, sets the SCTP_AUTH_KEY option on it via setsockopt(2) and then
closes the socket again will leak the kernel memory holding the
authentication key supplied from user space:
unreferenced object 0xffff8800837047c0 (size 16):
comm "a.out", pid 2789, jiffies 4296954322 (age 192.258s)
hex dump (first 16 bytes):
01 00 00 00 04 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffff816d7e8e>] kmemleak_alloc+0x4e/0xb0
[<ffffffff811c88d8>] __kmalloc+0xe8/0x270
[<ffffffffa0870c23>] sctp_auth_create_key+0x23/0x50 [sctp]
[<ffffffffa08718b1>] sctp_auth_set_key+0xa1/0x140 [sctp]
[<ffffffffa086b383>] sctp_setsockopt+0xd03/0x1180 [sctp]
[<ffffffff815bfd94>] sock_common_setsockopt+0x14/0x20
[<ffffffff815beb61>] SyS_setsockopt+0x71/0xd0
[<ffffffff816e58a9>] system_call_fastpath+0x12/0x17
[<ffffffffffffffff>] 0xffffffffffffffff
This is bad for two reasons: we can bring down a machine from
user space when auth_enable=1, and we leave security-sensitive
keying material in memory without clearing it after use. The issue is
that sctp_auth_create_key() already sets the refcount to 1, but after
allocation sctp_auth_set_key() takes an additional reference on it,
thus leaving the key around when we free the socket.
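A self-contained user-space illustration of the refcounting slip (this models
the pattern, not the sctp code): the constructor already hands back one
reference, so taking another hold means the final put can never free the
object.

#include <stdio.h>
#include <stdlib.h>

struct key { int refcnt; };

static struct key *key_create(void)     /* like sctp_auth_create_key() */
{
        struct key *k = calloc(1, sizeof(*k));
        if (k)
                k->refcnt = 1;          /* caller already owns a reference */
        return k;
}

static void key_hold(struct key *k) { k->refcnt++; }

static void key_put(struct key *k)
{
        if (--k->refcnt == 0)
                free(k);                /* real code also wipes the key material */
}

int main(void)
{
        struct key *k = key_create();

        if (!k)
                return 1;
        key_hold(k);    /* the redundant sctp_auth_key_hold() */
        key_put(k);     /* socket close: refcnt drops 2 -> 1, never freed */
        printf("leaked key, refcnt=%d\n", k->refcnt);
        return 0;
}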
Fixes: 65b07e5d0d0 ("[SCTP]: API updates to suport SCTP-AUTH extensions.")
Signed-off-by: Daniel Borkmann <[email protected]>
Cc: Vlad Yasevich <[email protected]>
Acked-by: Neil Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/sctp/auth.c | 2 --
1 file changed, 2 deletions(-)
--- a/net/sctp/auth.c
+++ b/net/sctp/auth.c
@@ -862,8 +862,6 @@ int sctp_auth_set_key(struct sctp_endpoi
list_add(&cur_key->key_list, sh_keys);
cur_key->key = key;
- sctp_auth_key_hold(key);
-
return 0;
nomem:
if (!replace)
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai <[email protected]>
[ Upstream commit 5748eb8f8e989a9da1ac7c96dc73d68cbdedf7df ]
In ppp_ioctl(), bpf_prog_create() is called inside ppp_lock; it
eventually calls vmalloc() and hits a BUG_ON() in vmalloc.c. This patch
works around the problem by moving the allocation outside the lock.
The bug was revealed by the recent change in net/core/filter.c, as it
allocates via vmalloc() instead of kmalloc() now.
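A condensed kernel-context sketch of the allocate-then-swap pattern the fix
adopts (one of the two filters shown, error handling trimmed): the
potentially sleeping bpf_prog_create() runs before ppp_lock is taken, and
only the pointer swap happens under the lock.

        struct bpf_prog *pass_filter = NULL;
        int err = 0;

        if (fprog.filter)
                err = bpf_prog_create(&pass_filter, &fprog);    /* may vmalloc/sleep */

        if (!err) {
                ppp_lock(ppp);                                  /* no allocation in here */
                if (ppp->pass_filter)
                        bpf_prog_destroy(ppp->pass_filter);
                ppp->pass_filter = pass_filter;
                ppp_unlock(ppp);
        }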
Reported-and-tested-by: Stefan Seyfried <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ppp/ppp_generic.c | 40 ++++++++++++++++++++--------------------
1 file changed, 20 insertions(+), 20 deletions(-)
--- a/drivers/net/ppp/ppp_generic.c
+++ b/drivers/net/ppp/ppp_generic.c
@@ -755,23 +755,23 @@ static long ppp_ioctl(struct file *file,
err = get_filter(argp, &code);
if (err >= 0) {
+ struct bpf_prog *pass_filter = NULL;
struct sock_fprog_kern fprog = {
.len = err,
.filter = code,
};
- ppp_lock(ppp);
- if (ppp->pass_filter) {
- bpf_prog_destroy(ppp->pass_filter);
- ppp->pass_filter = NULL;
+ err = 0;
+ if (fprog.filter)
+ err = bpf_prog_create(&pass_filter, &fprog);
+ if (!err) {
+ ppp_lock(ppp);
+ if (ppp->pass_filter)
+ bpf_prog_destroy(ppp->pass_filter);
+ ppp->pass_filter = pass_filter;
+ ppp_unlock(ppp);
}
- if (fprog.filter != NULL)
- err = bpf_prog_create(&ppp->pass_filter,
- &fprog);
- else
- err = 0;
kfree(code);
- ppp_unlock(ppp);
}
break;
}
@@ -781,23 +781,23 @@ static long ppp_ioctl(struct file *file,
err = get_filter(argp, &code);
if (err >= 0) {
+ struct bpf_prog *active_filter = NULL;
struct sock_fprog_kern fprog = {
.len = err,
.filter = code,
};
- ppp_lock(ppp);
- if (ppp->active_filter) {
- bpf_prog_destroy(ppp->active_filter);
- ppp->active_filter = NULL;
+ err = 0;
+ if (fprog.filter)
+ err = bpf_prog_create(&active_filter, &fprog);
+ if (!err) {
+ ppp_lock(ppp);
+ if (ppp->active_filter)
+ bpf_prog_destroy(ppp->active_filter);
+ ppp->active_filter = active_filter;
+ ppp_unlock(ppp);
}
- if (fprog.filter != NULL)
- err = bpf_prog_create(&ppp->active_filter,
- &fprog);
- else
- err = 0;
kfree(code);
- ppp_unlock(ppp);
}
break;
}
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Borkmann <[email protected]>
[ Upstream commit e40607cbe270a9e8360907cb1e62ddf0736e4864 ]
An SCTP server doing ASCONF will panic on a malformed INIT ping-of-death
of the form:
------------ INIT[PARAM: SET_PRIMARY_IP] ------------>
While the INIT chunk parameter verification dissects through many things
in order to detect malformed input, it fails to check parameters that are
nested inside other parameters. E.g. RFC 5061, section 4.2.4 proposes a
'set primary IP address' parameter in ASCONF, which carries an address
parameter as a subparameter.
So an attacker may send a parameter type other than SCTP_PARAM_IPV4_ADDRESS
or SCTP_PARAM_IPV6_ADDRESS; param_type2af() will then return 0, so
sctp_get_af_specific() returns NULL as well, which we then happily
dereference unconditionally through af->from_addr_param().
The trace for the log:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
IP: [<ffffffffa01e9c62>] sctp_process_init+0x492/0x990 [sctp]
PGD 0
Oops: 0000 [#1] SMP
[...]
Pid: 0, comm: swapper Not tainted 2.6.32-504.el6.x86_64 #1 Bochs Bochs
RIP: 0010:[<ffffffffa01e9c62>] [<ffffffffa01e9c62>] sctp_process_init+0x492/0x990 [sctp]
[...]
Call Trace:
<IRQ>
[<ffffffffa01f2add>] ? sctp_bind_addr_copy+0x5d/0xe0 [sctp]
[<ffffffffa01e1fcb>] sctp_sf_do_5_1B_init+0x21b/0x340 [sctp]
[<ffffffffa01e3751>] sctp_do_sm+0x71/0x1210 [sctp]
[<ffffffffa01e5c09>] ? sctp_endpoint_lookup_assoc+0xc9/0xf0 [sctp]
[<ffffffffa01e61f6>] sctp_endpoint_bh_rcv+0x116/0x230 [sctp]
[<ffffffffa01ee986>] sctp_inq_push+0x56/0x80 [sctp]
[<ffffffffa01fcc42>] sctp_rcv+0x982/0xa10 [sctp]
[<ffffffffa01d5123>] ? ipt_local_in_hook+0x23/0x28 [iptable_filter]
[<ffffffff8148bdc9>] ? nf_iterate+0x69/0xb0
[<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
[<ffffffff8148bf86>] ? nf_hook_slow+0x76/0x120
[<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
[...]
A minimal way to address this is to check for NULL as we do on all
other such occasions where we know sctp_get_af_specific() could
possibly return NULL.
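The resulting defensive pattern, as a kernel-context sketch (the break exits
the parameter walk, mirroring the hunk below):

        af = sctp_get_af_specific(param_type2af(param.p->type));
        if (af == NULL)
                break;  /* unknown or hostile nested address type: skip it */

        af->from_addr_param(&addr, addr_param, htons(asoc->peer.port), 0);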
Fixes: d6de3097592b ("[SCTP]: Add the handling of "Set Primary IP Address" parameter to INIT")
Signed-off-by: Daniel Borkmann <[email protected]>
Cc: Vlad Yasevich <[email protected]>
Acked-by: Neil Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/sctp/sm_make_chunk.c | 3 +++
1 file changed, 3 insertions(+)
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -2609,6 +2609,9 @@ do_addr_param:
addr_param = param.v + sizeof(sctp_addip_param_t);
af = sctp_get_af_specific(param_type2af(param.p->type));
+ if (af == NULL)
+ break;
+
af->from_addr_param(&addr, addr_param,
htons(asoc->peer.port), 0);
3.17-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marcelo Leitner <[email protected]>
[ Upstream commit 19ca9fc1445b76b60d34148f7ff837b055f5dcf3 ]
Currently, we only match against the local port number in order to reuse
a socket. But if a new vxlan device wants an IPv6 socket and an IPv4 one is
already bound to that port, vxlan will reuse the IPv4 socket as IPv6 and a
panic will follow. The following steps reproduce it:
# ip link add vxlan6 type vxlan id 42 group 229.10.10.10 \
srcport 5000 6000 dev eth0
# ip link add vxlan7 type vxlan id 43 group ff0e::110 \
srcport 5000 6000 dev eth0
# ip link set vxlan6 up
# ip link set vxlan7 up
<panic>
[ 4.187481] BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
...
[ 4.188076] Call Trace:
[ 4.188085] [<ffffffff81667c4a>] ? ipv6_sock_mc_join+0x3a/0x630
[ 4.188098] [<ffffffffa05a6ad6>] vxlan_igmp_join+0x66/0xd0 [vxlan]
[ 4.188113] [<ffffffff810a3430>] process_one_work+0x220/0x710
[ 4.188125] [<ffffffff810a33c4>] ? process_one_work+0x1b4/0x710
[ 4.188138] [<ffffffff810a3a3b>] worker_thread+0x11b/0x3a0
[ 4.188149] [<ffffffff810a3920>] ? process_one_work+0x710/0x710
So the address family must also match in order to reuse a socket.
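A condensed sketch of the corrected lookup predicate (kernel context,
simplified from the hunk below): a socket is only reused when both the UDP
source port and the address family of the candidate socket match.

        hlist_for_each_entry_rcu(vs, vs_head(net, port), hlist) {
                if (inet_sk(vs->sock->sk)->inet_sport == port &&
                    inet_sk(vs->sock->sk)->sk.sk_family == family)
                        return vs;      /* same port and same family */
        }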
Reported-by: Jean-Tsung Hsiao <[email protected]>
Signed-off-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/vxlan.c | 29 +++++++++++++++++++----------
1 file changed, 19 insertions(+), 10 deletions(-)
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -274,13 +274,15 @@ static inline struct vxlan_rdst *first_r
return list_first_entry(&fdb->remotes, struct vxlan_rdst, list);
}
-/* Find VXLAN socket based on network namespace and UDP port */
-static struct vxlan_sock *vxlan_find_sock(struct net *net, __be16 port)
+/* Find VXLAN socket based on network namespace, address family and UDP port */
+static struct vxlan_sock *vxlan_find_sock(struct net *net,
+ sa_family_t family, __be16 port)
{
struct vxlan_sock *vs;
hlist_for_each_entry_rcu(vs, vs_head(net, port), hlist) {
- if (inet_sk(vs->sock->sk)->inet_sport == port)
+ if (inet_sk(vs->sock->sk)->inet_sport == port &&
+ inet_sk(vs->sock->sk)->sk.sk_family == family)
return vs;
}
return NULL;
@@ -299,11 +301,12 @@ static struct vxlan_dev *vxlan_vs_find_v
}
/* Look up VNI in a per net namespace table */
-static struct vxlan_dev *vxlan_find_vni(struct net *net, u32 id, __be16 port)
+static struct vxlan_dev *vxlan_find_vni(struct net *net, u32 id,
+ sa_family_t family, __be16 port)
{
struct vxlan_sock *vs;
- vs = vxlan_find_sock(net, port);
+ vs = vxlan_find_sock(net, family, port);
if (!vs)
return NULL;
@@ -1822,7 +1825,8 @@ static void vxlan_xmit_one(struct sk_buf
struct vxlan_dev *dst_vxlan;
ip_rt_put(rt);
- dst_vxlan = vxlan_find_vni(vxlan->net, vni, dst_port);
+ dst_vxlan = vxlan_find_vni(vxlan->net, vni,
+ dst->sa.sa_family, dst_port);
if (!dst_vxlan)
goto tx_error;
vxlan_encap_bypass(skb, vxlan, dst_vxlan);
@@ -1876,7 +1880,8 @@ static void vxlan_xmit_one(struct sk_buf
struct vxlan_dev *dst_vxlan;
dst_release(ndst);
- dst_vxlan = vxlan_find_vni(vxlan->net, vni, dst_port);
+ dst_vxlan = vxlan_find_vni(vxlan->net, vni,
+ dst->sa.sa_family, dst_port);
if (!dst_vxlan)
goto tx_error;
vxlan_encap_bypass(skb, vxlan, dst_vxlan);
@@ -2036,13 +2041,15 @@ static int vxlan_init(struct net_device
struct vxlan_dev *vxlan = netdev_priv(dev);
struct vxlan_net *vn = net_generic(vxlan->net, vxlan_net_id);
struct vxlan_sock *vs;
+ bool ipv6 = vxlan->flags & VXLAN_F_IPV6;
dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
if (!dev->tstats)
return -ENOMEM;
spin_lock(&vn->sock_lock);
- vs = vxlan_find_sock(vxlan->net, vxlan->dst_port);
+ vs = vxlan_find_sock(vxlan->net, ipv6 ? AF_INET6 : AF_INET,
+ vxlan->dst_port);
if (vs) {
/* If we have a socket with same port already, reuse it */
atomic_inc(&vs->refcnt);
@@ -2441,6 +2448,7 @@ struct vxlan_sock *vxlan_sock_add(struct
{
struct vxlan_net *vn = net_generic(net, vxlan_net_id);
struct vxlan_sock *vs;
+ bool ipv6 = flags & VXLAN_F_IPV6;
vs = vxlan_socket_create(net, port, rcv, data, flags);
if (!IS_ERR(vs))
@@ -2450,7 +2458,7 @@ struct vxlan_sock *vxlan_sock_add(struct
return vs;
spin_lock(&vn->sock_lock);
- vs = vxlan_find_sock(net, port);
+ vs = vxlan_find_sock(net, ipv6 ? AF_INET6 : AF_INET, port);
if (vs) {
if (vs->rcv == rcv)
atomic_inc(&vs->refcnt);
@@ -2609,7 +2617,8 @@ static int vxlan_newlink(struct net *net
nla_get_u8(data[IFLA_VXLAN_UDP_ZERO_CSUM6_RX]))
vxlan->flags |= VXLAN_F_UDP_ZERO_CSUM6_RX;
- if (vxlan_find_vni(net, vni, vxlan->dst_port)) {
+ if (vxlan_find_vni(net, vni, use_ipv6 ? AF_INET6 : AF_INET,
+ vxlan->dst_port)) {
pr_info("duplicate VNI %u\n", vni);
return -EEXIST;
}
On 11/19/2014 12:50 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.17.4 release.
> There are 141 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri Nov 21 20:51:28 UTC 2014.
> Anything received after that time might be too late.
>
Build results:
total: 133 pass: 132 fail: 1
Failed builds:
avr32:atngw100mkii_evklcd101_defconfig
Qemu test results:
total: 30 pass: 30 fail: 0
Details at http://server.roeck-us.net:8010/builders.
Guenter
On Wed, Nov 19, 2014 at 09:38:18PM -0800, Guenter Roeck wrote:
> On 11/19/2014 12:50 PM, Greg Kroah-Hartman wrote:
> >This is the start of the stable review cycle for the 3.17.4 release.
> >There are 141 patches in this series, all will be posted as a response
> >to this one. If anyone has any issues with these being applied, please
> >let me know.
> >
> >Responses should be made by Fri Nov 21 20:51:28 UTC 2014.
> >Anything received after that time might be too late.
> >
>
> Build results:
> total: 133 pass: 132 fail: 1
> Failed builds:
> avr32:atngw100mkii_evklcd101_defconfig
>
> Qemu test results:
> total: 30 pass: 30 fail: 0
>
> Details at http://server.roeck-us.net:8010/builders.
Thanks for testing all of these and letting me know.
greg k-h
On 11/19/2014 01:50 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.17.4 release.
> There are 141 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri Nov 21 20:51:28 UTC 2014.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.17.4-rc1.gz
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>
Compiled and booted on my test system. No dmesg regressions.
-- Shuah
--
Shuah Khan
Sr. Linux Kernel Developer
Samsung Research America (Silicon Valley)
[email protected] | (970) 217-8978
On Thu, Nov 20, 2014 at 06:36:13PM -0700, Shuah Khan wrote:
> On 11/19/2014 01:50 PM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 3.17.4 release.
> > There are 141 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Fri Nov 21 20:51:28 UTC 2014.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> > kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.17.4-rc1.gz
> > and the diffstat can be found below.
> >
> > thanks,
> >
> > greg k-h
> >
>
> Compiled and booted on my test system. No dmesg regressions.
Thanks for testing all 3 of these and letting me know.
greg k-h
On 11/19/2014 03:52 PM, Greg Kroah-Hartman wrote:
> 3.17-stable review patch. If anyone has any objections, please let me know.
This breaks PV on Xen.
-boris
>
> ------------------
>
> From: Borislav Petkov <[email protected]>
>
> commit 85be07c32496dc264661308e4d9d4e9ccaff8072 upstream.
>
> We should be accessing it through a pointer, like on the BSP.
>
> Tested-by: Richard Hendershot <[email protected]>
> Fixes: 65cef1311d5d ("x86, microcode: Add a disable chicken bit")
> Signed-off-by: Borislav Petkov <[email protected]>
> Signed-off-by: Greg Kroah-Hartman <[email protected]>
>
> ---
> arch/x86/kernel/cpu/microcode/core_early.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- a/arch/x86/kernel/cpu/microcode/core_early.c
> +++ b/arch/x86/kernel/cpu/microcode/core_early.c
> @@ -124,7 +124,7 @@ void __init load_ucode_bsp(void)
> static bool check_loader_disabled_ap(void)
> {
> #ifdef CONFIG_X86_32
> - return __pa_nodebug(dis_ucode_ldr);
> + return *((bool *)__pa_nodebug(&dis_ucode_ldr));
> #else
> return dis_ucode_ldr;
> #endif
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
On Tue, Nov 25, 2014 at 01:12:10PM -0500, Boris Ostrovsky wrote:
> On 11/19/2014 03:52 PM, Greg Kroah-Hartman wrote:
> >3.17-stable review patch. If anyone has any objections, please let me know.
>
>
> This breaks PV on Xen.
>
> -boris
>
> >
> >------------------
> >
> >From: Borislav Petkov <[email protected]>
> >
> >commit 85be07c32496dc264661308e4d9d4e9ccaff8072 upstream.
> >
> >We should be accessing it through a pointer, like on the BSP.
> >
> >Tested-by: Richard Hendershot <[email protected]>
> >Fixes: 65cef1311d5d ("x86, microcode: Add a disable chicken bit")
> >Signed-off-by: Borislav Petkov <[email protected]>
> >Signed-off-by: Greg Kroah-Hartman <[email protected]>
> >
> >---
> > arch/x86/kernel/cpu/microcode/core_early.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> >--- a/arch/x86/kernel/cpu/microcode/core_early.c
> >+++ b/arch/x86/kernel/cpu/microcode/core_early.c
> >@@ -124,7 +124,7 @@ void __init load_ucode_bsp(void)
> > static bool check_loader_disabled_ap(void)
> > {
> > #ifdef CONFIG_X86_32
> >- return __pa_nodebug(dis_ucode_ldr);
> >+ return *((bool *)__pa_nodebug(&dis_ucode_ldr));
And practically the same line in check_loader_disabled_bsp() doesn't?
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.
--
On 11/25/2014 01:24 PM, Borislav Petkov wrote:
> On Tue, Nov 25, 2014 at 01:12:10PM -0500, Boris Ostrovsky wrote:
>> On 11/19/2014 03:52 PM, Greg Kroah-Hartman wrote:
>>> 3.17-stable review patch. If anyone has any objections, please let me know.
>>
>> This breaks PV on Xen.
>>
>> -boris
>>
>>> ------------------
>>>
>>> From: Borislav Petkov <[email protected]>
>>>
>>> commit 85be07c32496dc264661308e4d9d4e9ccaff8072 upstream.
>>>
>>> We should be accessing it through a pointer, like on the BSP.
>>>
>>> Tested-by: Richard Hendershot <[email protected]>
>>> Fixes: 65cef1311d5d ("x86, microcode: Add a disable chicken bit")
>>> Signed-off-by: Borislav Petkov <[email protected]>
>>> Signed-off-by: Greg Kroah-Hartman <[email protected]>
>>>
>>> ---
>>> arch/x86/kernel/cpu/microcode/core_early.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> --- a/arch/x86/kernel/cpu/microcode/core_early.c
>>> +++ b/arch/x86/kernel/cpu/microcode/core_early.c
>>> @@ -124,7 +124,7 @@ void __init load_ucode_bsp(void)
>>> static bool check_loader_disabled_ap(void)
>>> {
>>> #ifdef CONFIG_X86_32
>>> - return __pa_nodebug(dis_ucode_ldr);
>>> + return *((bool *)__pa_nodebug(&dis_ucode_ldr));
> And practically the same line in check_loader_disabled_bsp() doesn't?
I don't think this routine is called on PV.
-boris
On Tue, Nov 25, 2014 at 01:43:26PM -0500, Boris Ostrovsky wrote:
> I don't think this routine is called on PV.
They're either both called or none is. At least on baremetal, that is.
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.
--
On Tue, Nov 25, 2014 at 01:12:10PM -0500, Boris Ostrovsky wrote:
> On 11/19/2014 03:52 PM, Greg Kroah-Hartman wrote:
> >3.17-stable review patch. If anyone has any objections, please let me know.
>
>
> This breaks PV on Xen.
Does that mean it is also broken in Linus's tree? If so, please fix it
there. If not, is there some other patch I am missing for 3.17-stable
to resolve this?
thanks,
greg k-h
On Tue, Nov 25, 2014 at 10:45:01AM -0800, Greg Kroah-Hartman wrote:
> Does that mean it is also broken in Linus's tree?
Should be.
> If so, please fix it there.
Gladly, if Boris would share some more info as to why it breaks the PV
gunk...
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.
--
On 11/25/2014 01:45 PM, Greg Kroah-Hartman wrote:
> On Tue, Nov 25, 2014 at 01:12:10PM -0500, Boris Ostrovsky wrote:
>> On 11/19/2014 03:52 PM, Greg Kroah-Hartman wrote:
>>> 3.17-stable review patch. If anyone has any objections, please let me know.
>>
>> This breaks PV on Xen.
> Does that mean it is also broken in Linus's tree? If so, please fix it
> there. If not, is there some other patch I am missing for 3.17-stable
> to resolve this?
Yes, it is broken in Linus's tree. That's the only tree that I tested,
and before we have a fix I wanted to avoid having this trickle into
stable trees as well (although I may be late).
-boris
On 11/25/2014 01:43 PM, Borislav Petkov wrote:
> On Tue, Nov 25, 2014 at 01:43:26PM -0500, Boris Ostrovsky wrote:
>> I don't think this routine is called on PV.
> They're either both called or none is. At least on baremetal, that is.
>
PV guests don't start with startup_32.
We are coming from a resume into load_ucode_ap as:
[ 38.644599] BUG: unable to handle kernel paging request at 0197eec0
[ 38.644599] IP: [<c1071fa6>] load_ucode_ap+0x6/0xe0
[ 38.644599] *pdpt = 0000000003267007 *pde = 0000000000000000
[ 38.644599] Oops: 0000 [#1] SMP
[ 38.644599] Modules linked in: sg sd_mod dm_multipath dm_mod
xen_evtchn iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi
scsi_transport_iscsi scsi_mod libcrc32c crc32c_generic radeon fbcon
tileblit font bitblit ttm softcursor drm_kms_helper x86_pkg_temp_thermal
crc32c_intel xen_blkfront xen_netfront xen_fbfront fb_sys_fops sysimgblt
sysfillrect syscopyarea xen_kbdfront xenfs xen_privcmd
[ 38.644599] CPU: 0 PID: 9 Comm: migration/0 Tainted: G W
3.18.0-rc6upstream-00001-g0de9524 #1
[ 38.644599] task: eb894650 ti: eb89c000 task.ti: eb89c000
[ 38.644599] EIP: 0061:[<c1071fa6>] EFLAGS: 00010082 CPU: 0
[ 38.644599] EIP is at load_ucode_ap+0x6/0xe0
[ 38.644599] EAX: 00000000 EBX: c1823160 ECX: 00000000 EDX: c197eee0
[ 38.644599] ESI: eb9bded0 EDI: c1793e95 EBP: eb89de28 ESP: eb89de20
[ 38.644599] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[ 38.644599] CR0: 80050033 CR2: 0197eec0 CR3: 03280000 CR4: 00042660
[ 38.644599] Stack:
[ 38.644599] eb89de30 c10539b0 eb89de30 c1070f9d eb89de54 c140de9e
eb9bded0 eb89de54
[ 38.644599] c103dbcc 00000008 deadbeef eb9bded0 eb9bdee4 eb89de80
c1397f57 00000000
[ 38.644599] 00000000 80000002 eb9bdf1c 00000000 00000002 00000003
eb9bded0 eb9bdee4
[ 38.644599] Call Trace:
[ 38.644599] [<c10539b0>] ? i8237A_resume+0xb0/0xe0
[ 38.644599] [<c1070f9d>] mc_bp_resume+0x3d/0x50
[ 38.644599] [<c140de9e>] syscore_resume+0x4e/0x190
[ 38.644599] [<c103dbcc>] ? xen_timer_resume+0x3c/0x60
[ 38.644599] [<c1397f57>] xen_suspend+0x77/0xf0
[ 38.644599] [<c110dc63>] multi_cpu_stop+0x93/0xc0
[ 38.644599] [<c110de76>] cpu_stopper_thread+0x46/0x170
[ 38.644599] [<c110dbd0>] ? irq_cpu_stop_queue_work+0x20/0x20
[ 38.644599] [<c162c536>] ? __schedule+0x356/0x880
[ 38.644599] [<c10b7aab>] ? default_wake_function+0xb/0x10
[ 38.644599] [<c10c7c00>] ? __wake_up_common+0x40/0x70
[ 38.644599] [<c1630730>] ? _raw_spin_unlock_irqrestore+0x20/0x90
[ 38.644599] [<c1630730>] ? _raw_spin_unlock_irqrestore+0x20/0x90
[ 38.644599] [<c16304ef>] ? _raw_spin_lock_irqsave+0x1f/0x80
[ 38.644599] [<c16306bf>] ? _raw_spin_lock_irq+0xf/0x60
[ 38.644599] [<c10aff47>] smpboot_thread_fn+0x117/0x1a0
[ 38.644599] [<c10ac824>] kthread+0xa4/0xc0
[ 38.644599] [<c10afe30>] ? smpboot_create_threads+0x60/0x60
[ 38.644599] [<c1630bc1>] ret_from_kernel_thread+0x21/0x30
[ 38.644599] [<c10ac780>] ? kthread_freezable_should_stop+0x60/0x60
[ 38.644599] Code: 55 cc eb 93 89 44 24 04 66 31 f6 c7 04 24 78 f0 74
c1 e8 84 9f 5b 00 eb ae 90 90 90 90 90 90 90 90 90 90 90 90 55 89 e5 83
ec 08 <80> 3d c0 ee 97 01 00 89 1c 24 89 74 24 04 74 12 8b 1c 24 8b 74
[ 38.644599] EIP: [<c1071fa6>] load_ucode_ap+0x6/0xe0 SS:ESP 0069:eb89de20
[ 38.644599] CR2: 000000000197eec0
[ 38.644599] ---[ end trace 0ad7358b42202518 ]---
[ 38.644599] Kernel panic - not syncing: Fatal exception
[ 38.644599] Kernel Offset: 0x0 from 0xc1000000 (relocation range:
0xc0000000-0xed7fdfff)
-boris
On Tue, Nov 25, 2014 at 01:55:29PM -0500, Boris Ostrovsky wrote:
> On 11/25/2014 01:43 PM, Borislav Petkov wrote:
> >On Tue, Nov 25, 2014 at 01:43:26PM -0500, Boris Ostrovsky wrote:
> >>I don't think this routine is called on PV.
> >They're either both called or none is. At least on baremetal, that is.
> >
>
> PV guests don't start with startup_32.
>
> We are coming from a resume into load_ucode_ap as:
>
> [ 38.644599] BUG: unable to handle kernel paging request at 0197eec0
> [ 38.644599] IP: [<c1071fa6>] load_ucode_ap+0x6/0xe0
Aha, and at that point, the APs have enabled paging and switched to
virtual addresses, correct?
Does that fix it?
---
diff --git a/arch/x86/kernel/cpu/microcode/core_early.c b/arch/x86/kernel/cpu/microcode/core_early.c
index 2c017f242a78..11ff39fe9d88 100644
--- a/arch/x86/kernel/cpu/microcode/core_early.c
+++ b/arch/x86/kernel/cpu/microcode/core_early.c
@@ -123,11 +123,7 @@ void __init load_ucode_bsp(void)
static bool check_loader_disabled_ap(void)
{
-#ifdef CONFIG_X86_32
- return *((bool *)__pa_nodebug(&dis_ucode_ldr));
-#else
return dis_ucode_ldr;
-#endif
}
void load_ucode_ap(void)
> [ 38.644599] *pdpt = 0000000003267007 *pde = 0000000000000000
> [ 38.644599] Oops: 0000 [#1] SMP
> [ 38.644599] Modules linked in: sg sd_mod dm_multipath dm_mod xen_evtchn
> iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi
> scsi_mod libcrc32c crc32c_generic radeon fbcon tileblit font bitblit ttm
> softcursor drm_kms_helper x86_pkg_temp_thermal crc32c_intel xen_blkfront
> xen_netfront xen_fbfront fb_sys_fops sysimgblt sysfillrect syscopyarea
> xen_kbdfront xenfs xen_privcmd
> [ 38.644599] CPU: 0 PID: 9 Comm: migration/0 Tainted: G W
> 3.18.0-rc6upstream-00001-g0de9524 #1
> [ 38.644599] task: eb894650 ti: eb89c000 task.ti: eb89c000
> [ 38.644599] EIP: 0061:[<c1071fa6>] EFLAGS: 00010082 CPU: 0
> [ 38.644599] EIP is at load_ucode_ap+0x6/0xe0
> [ 38.644599] EAX: 00000000 EBX: c1823160 ECX: 00000000 EDX: c197eee0
> [ 38.644599] ESI: eb9bded0 EDI: c1793e95 EBP: eb89de28 ESP: eb89de20
> [ 38.644599] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
> [ 38.644599] CR0: 80050033 CR2: 0197eec0 CR3: 03280000 CR4: 00042660
> [ 38.644599] Stack:
> [ 38.644599] eb89de30 c10539b0 eb89de30 c1070f9d eb89de54 c140de9e
> eb9bded0 eb89de54
> [ 38.644599] c103dbcc 00000008 deadbeef eb9bded0 eb9bdee4 eb89de80
> c1397f57 00000000
> [ 38.644599] 00000000 80000002 eb9bdf1c 00000000 00000002 00000003
> eb9bded0 eb9bdee4
> [ 38.644599] Call Trace:
> [ 38.644599] [<c10539b0>] ? i8237A_resume+0xb0/0xe0
> [ 38.644599] [<c1070f9d>] mc_bp_resume+0x3d/0x50
> [ 38.644599] [<c140de9e>] syscore_resume+0x4e/0x190
> [ 38.644599] [<c103dbcc>] ? xen_timer_resume+0x3c/0x60
> [ 38.644599] [<c1397f57>] xen_suspend+0x77/0xf0
> [ 38.644599] [<c110dc63>] multi_cpu_stop+0x93/0xc0
> [ 38.644599] [<c110de76>] cpu_stopper_thread+0x46/0x170
> [ 38.644599] [<c110dbd0>] ? irq_cpu_stop_queue_work+0x20/0x20
> [ 38.644599] [<c162c536>] ? __schedule+0x356/0x880
> [ 38.644599] [<c10b7aab>] ? default_wake_function+0xb/0x10
> [ 38.644599] [<c10c7c00>] ? __wake_up_common+0x40/0x70
> [ 38.644599] [<c1630730>] ? _raw_spin_unlock_irqrestore+0x20/0x90
> [ 38.644599] [<c1630730>] ? _raw_spin_unlock_irqrestore+0x20/0x90
> [ 38.644599] [<c16304ef>] ? _raw_spin_lock_irqsave+0x1f/0x80
> [ 38.644599] [<c16306bf>] ? _raw_spin_lock_irq+0xf/0x60
> [ 38.644599] [<c10aff47>] smpboot_thread_fn+0x117/0x1a0
> [ 38.644599] [<c10ac824>] kthread+0xa4/0xc0
> [ 38.644599] [<c10afe30>] ? smpboot_create_threads+0x60/0x60
> [ 38.644599] [<c1630bc1>] ret_from_kernel_thread+0x21/0x30
> [ 38.644599] [<c10ac780>] ? kthread_freezable_should_stop+0x60/0x60
> [ 38.644599] Code: 55 cc eb 93 89 44 24 04 66 31 f6 c7 04 24 78 f0 74 c1
> e8 84 9f 5b 00 eb ae 90 90 90 90 90 90 90 90 90 90 90 90 55 89 e5 83 ec 08
> <80> 3d c0 ee 97 01 00 89 1c 24 89 74 24 04 74 12 8b 1c 24 8b 74
> [ 38.644599] EIP: [<c1071fa6>] load_ucode_ap+0x6/0xe0 SS:ESP 0069:eb89de20
> [ 38.644599] CR2: 000000000197eec0
> [ 38.644599] ---[ end trace 0ad7358b42202518 ]---
> [ 38.644599] Kernel panic - not syncing: Fatal exception
> [ 38.644599] Kernel Offset: 0x0 from 0xc1000000 (relocation range:
> 0xc0000000-0xed7fdfff)
>
>
>
>
> -boris
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.
--
On Tue, Nov 25, 2014 at 01:55:29PM -0500, Boris Ostrovsky wrote:
> On 11/25/2014 01:43 PM, Borislav Petkov wrote:
> >On Tue, Nov 25, 2014 at 01:43:26PM -0500, Boris Ostrovsky wrote:
> >>I don't think this routine is called on PV.
> >They're either both called or none is. At least on baremetal, that is.
> >
>
> PV guests don't start with startup_32.
>
> We are coming from a resume into load_ucode_ap as:
Btw, why is this thing even running on xen? I'd like to make
CONFIG_MICROCODE depend on !PARAVIRT and be done with it.
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.
--
On 11/25/2014 02:03 PM, Borislav Petkov wrote:
> On Tue, Nov 25, 2014 at 01:55:29PM -0500, Boris Ostrovsky wrote:
>> On 11/25/2014 01:43 PM, Borislav Petkov wrote:
>>> On Tue, Nov 25, 2014 at 01:43:26PM -0500, Boris Ostrovsky wrote:
>>>> I don't think this routine is called on PV.
>>> They're either both called or none is. At least on baremetal, that is.
>>>
>> PV guests don't start with startup_32.
>>
>> We are coming from a resume into load_ucode_ap as:
>>
>> [ 38.644599] BUG: unable to handle kernel paging request at 0197eec0
>> [ 38.644599] IP: [<c1071fa6>] load_ucode_ap+0x6/0xe0
> Aha, and at that point, the APs have enabled paging and switched to
> virtual addresses, correct?
Right.
>
> Does that fix it?
Hmm... no, although I expected this to fix it.
-boris
>
> ---
> diff --git a/arch/x86/kernel/cpu/microcode/core_early.c b/arch/x86/kernel/cpu/microcode/core_early.c
> index 2c017f242a78..11ff39fe9d88 100644
> --- a/arch/x86/kernel/cpu/microcode/core_early.c
> +++ b/arch/x86/kernel/cpu/microcode/core_early.c
> @@ -123,11 +123,7 @@ void __init load_ucode_bsp(void)
>
> static bool check_loader_disabled_ap(void)
> {
> -#ifdef CONFIG_X86_32
> - return *((bool *)__pa_nodebug(&dis_ucode_ldr));
> -#else
> return dis_ucode_ldr;
> -#endif
> }
>
> void load_ucode_ap(void)
>
>
>> [ 38.644599] *pdpt = 0000000003267007 *pde = 0000000000000000
>> [ 38.644599] Oops: 0000 [#1] SMP
>> [ 38.644599] Modules linked in: sg sd_mod dm_multipath dm_mod xen_evtchn
>> iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi
>> scsi_mod libcrc32c crc32c_generic radeon fbcon tileblit font bitblit ttm
>> softcursor drm_kms_helper x86_pkg_temp_thermal crc32c_intel xen_blkfront
>> xen_netfront xen_fbfront fb_sys_fops sysimgblt sysfillrect syscopyarea
>> xen_kbdfront xenfs xen_privcmd
>> [ 38.644599] CPU: 0 PID: 9 Comm: migration/0 Tainted: G W
>> 3.18.0-rc6upstream-00001-g0de9524 #1
>> [ 38.644599] task: eb894650 ti: eb89c000 task.ti: eb89c000
>> [ 38.644599] EIP: 0061:[<c1071fa6>] EFLAGS: 00010082 CPU: 0
>> [ 38.644599] EIP is at load_ucode_ap+0x6/0xe0
>> [ 38.644599] EAX: 00000000 EBX: c1823160 ECX: 00000000 EDX: c197eee0
>> [ 38.644599] ESI: eb9bded0 EDI: c1793e95 EBP: eb89de28 ESP: eb89de20
>> [ 38.644599] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
>> [ 38.644599] CR0: 80050033 CR2: 0197eec0 CR3: 03280000 CR4: 00042660
>> [ 38.644599] Stack:
>> [ 38.644599] eb89de30 c10539b0 eb89de30 c1070f9d eb89de54 c140de9e
>> eb9bded0 eb89de54
>> [ 38.644599] c103dbcc 00000008 deadbeef eb9bded0 eb9bdee4 eb89de80
>> c1397f57 00000000
>> [ 38.644599] 00000000 80000002 eb9bdf1c 00000000 00000002 00000003
>> eb9bded0 eb9bdee4
>> [ 38.644599] Call Trace:
>> [ 38.644599] [<c10539b0>] ? i8237A_resume+0xb0/0xe0
>> [ 38.644599] [<c1070f9d>] mc_bp_resume+0x3d/0x50
>> [ 38.644599] [<c140de9e>] syscore_resume+0x4e/0x190
>> [ 38.644599] [<c103dbcc>] ? xen_timer_resume+0x3c/0x60
>> [ 38.644599] [<c1397f57>] xen_suspend+0x77/0xf0
>> [ 38.644599] [<c110dc63>] multi_cpu_stop+0x93/0xc0
>> [ 38.644599] [<c110de76>] cpu_stopper_thread+0x46/0x170
>> [ 38.644599] [<c110dbd0>] ? irq_cpu_stop_queue_work+0x20/0x20
>> [ 38.644599] [<c162c536>] ? __schedule+0x356/0x880
>> [ 38.644599] [<c10b7aab>] ? default_wake_function+0xb/0x10
>> [ 38.644599] [<c10c7c00>] ? __wake_up_common+0x40/0x70
>> [ 38.644599] [<c1630730>] ? _raw_spin_unlock_irqrestore+0x20/0x90
>> [ 38.644599] [<c1630730>] ? _raw_spin_unlock_irqrestore+0x20/0x90
>> [ 38.644599] [<c16304ef>] ? _raw_spin_lock_irqsave+0x1f/0x80
>> [ 38.644599] [<c16306bf>] ? _raw_spin_lock_irq+0xf/0x60
>> [ 38.644599] [<c10aff47>] smpboot_thread_fn+0x117/0x1a0
>> [ 38.644599] [<c10ac824>] kthread+0xa4/0xc0
>> [ 38.644599] [<c10afe30>] ? smpboot_create_threads+0x60/0x60
>> [ 38.644599] [<c1630bc1>] ret_from_kernel_thread+0x21/0x30
>> [ 38.644599] [<c10ac780>] ? kthread_freezable_should_stop+0x60/0x60
>> [ 38.644599] Code: 55 cc eb 93 89 44 24 04 66 31 f6 c7 04 24 78 f0 74 c1
>> e8 84 9f 5b 00 eb ae 90 90 90 90 90 90 90 90 90 90 90 90 55 89 e5 83 ec 08
>> <80> 3d c0 ee 97 01 00 89 1c 24 89 74 24 04 74 12 8b 1c 24 8b 74
>> [ 38.644599] EIP: [<c1071fa6>] load_ucode_ap+0x6/0xe0 SS:ESP 0069:eb89de20
>> [ 38.644599] CR2: 000000000197eec0
>> [ 38.644599] ---[ end trace 0ad7358b42202518 ]---
>> [ 38.644599] Kernel panic - not syncing: Fatal exception
>> [ 38.644599] Kernel Offset: 0x0 from 0xc1000000 (relocation range:
>> 0xc0000000-0xed7fdfff)
>>
>>
>>
>>
>> -boris
On 11/25/2014 02:08 PM, Borislav Petkov wrote:
> On Tue, Nov 25, 2014 at 01:55:29PM -0500, Boris Ostrovsky wrote:
>> On 11/25/2014 01:43 PM, Borislav Petkov wrote:
>>> On Tue, Nov 25, 2014 at 01:43:26PM -0500, Boris Ostrovsky wrote:
>>>> I don't think this routine is called on PV.
>>> They're either both called or none is. At least on baremetal, that is.
>>>
>> PV guests don't start with startup_32.
>>
>> We are coming from a resume into load_ucode_ap as:
> Btw, why is this thing even running on xen? I'd like to make
> CONFIG_MICROCODE depend on !PARAVIRT and be done with it.
You'd have to decide at runtime --- many baremetal systems are compiled
with PARAVIRT.
-boris
On Tue, Nov 25, 2014 at 02:28:46PM -0500, Boris Ostrovsky wrote:
> You'd have to decide at runtime --- many baremetal systems are
> compiled with PARAVIRT.
Right, but the microcode loader is not used at all on PV, right?
If so, I'd like to add an arch_something_blabla_disabled_loader()
function which is run in the loader init path and returns false on
baremetal and true when running as a xen guest. I'm not sure how the
detection should be done, though... CPUID with the hypervisor leaf?
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.
--
On Tue, Nov 25, 2014 at 09:26:28PM +0100, Borislav Petkov wrote:
> On Tue, Nov 25, 2014 at 02:28:46PM -0500, Boris Ostrovsky wrote:
> > You'd have to decide at runtime --- many baremetal systems are
> > compiled with PARAVIRT.
>
> Right, but the microcode loader is not used at all on PV, right?
Is there a use-case for this in virtualization at all?
>
> If so, I'd like to add a arch_something_blabla_disabled_loader()
> function which is run in the loader init path and returns false on
> baremetal and a true when running as a xen guest. I'm not sure how the
> detection should be done, though... CPUID with the hypervisor leaf?
Why not make it in general then? Like:
if (cpu_has_hypervisor)
return;
?
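For reference, cpu_has_hypervisor boils down to the generic "running on
a hypervisor" CPUID bit, CPUID(1).ECX[31]. Roughly, paraphrasing
cpufeature.h from memory rather than quoting the tree:

	#define X86_FEATURE_HYPERVISOR	(4*32+31)	/* Running on a hypervisor */
	#define cpu_has_hypervisor	boot_cpu_has(X86_FEATURE_HYPERVISOR)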
>
> --
> Regards/Gruss,
> Boris.
>
> Sent from a fat crate under my desk. Formatting is fine.
> --
On Tue, Nov 25, 2014 at 03:36:34PM -0500, Konrad Rzeszutek Wilk wrote:
> Is there an use-case for this in virtualization at all?
Not that I know of...
> Why not make it in general then? Like:
>
> if (cpu_has_hypervisor)
> return;
Ah, good idea. Although we need to do it by hand because the cpu_has
stuff hasn't been initialized yet that early. Boris, here's something
I'm guessing should work... ?
---
diff --git a/arch/x86/kernel/cpu/microcode/core_early.c b/arch/x86/kernel/cpu/microcode/core_early.c
index 2c017f242a78..77137b317e2a 100644
--- a/arch/x86/kernel/cpu/microcode/core_early.c
+++ b/arch/x86/kernel/cpu/microcode/core_early.c
@@ -74,6 +74,16 @@ static int x86_family(void)
return x86;
}
+static bool x86_guest(void)
+{
+ u32 eax = 0x1;
+ u32 ebx, ecx = 0, edx;
+
+ native_cpuid(&eax, &ebx, &ecx, &edx);
+
+ return !!(ecx & BIT(31));
+}
+
static bool __init check_loader_disabled_bsp(void)
{
#ifdef CONFIG_X86_32
@@ -98,6 +108,9 @@ void __init load_ucode_bsp(void)
{
int vendor, x86;
+ if (x86_guest())
+ return;
+
if (check_loader_disabled_bsp())
return;
@@ -134,6 +147,9 @@ void load_ucode_ap(void)
{
int vendor, x86;
+ if (x86_guest())
+ return;
+
if (check_loader_disabled_ap())
return;
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.
--
On 11/25/2014 04:17 PM, Borislav Petkov wrote:
> On Tue, Nov 25, 2014 at 03:36:34PM -0500, Konrad Rzeszutek Wilk wrote:
>> Is there an use-case for this in virtualization at all?
> Not that I know of...
>
>> Why not make it in general then? Like:
>>
>> if (cpu_has_hypervisor)
>> return;
> Ah, good idea. Although we need to do it by-foot because the cpu_has
> stuff hasn't been initialized yet that early. Boris, I'm guessing
> something that should work... ?
>
> ---
> diff --git a/arch/x86/kernel/cpu/microcode/core_early.c b/arch/x86/kernel/cpu/microcode/core_early.c
> index 2c017f242a78..77137b317e2a 100644
> --- a/arch/x86/kernel/cpu/microcode/core_early.c
> +++ b/arch/x86/kernel/cpu/microcode/core_early.c
> @@ -74,6 +74,16 @@ static int x86_family(void)
> return x86;
> }
>
> +static bool x86_guest(void)
> +{
> + u32 eax = 0x1;
> + u32 ebx, ecx = 0, edx;
> +
> + native_cpuid(&eax, &ebx, &ecx, &edx);
This should be cpuid(0x1, &eax, &ebx, &ecx, &edx). Otherwise we are not
getting the bits that the hypervisor wants the guest to see (on Xen, cpuid()
turns into a hypercall; on baremetal it's native).
With that change it works and
Tested-by: Boris Ostrovsky <[email protected]>
(May be worth adding a comment as to what is_guest() is checking for
since 31 is a magic number).
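Spelled out with such a comment, the helper might look something like
this (sketch only, not the patch as posted; the cpuid()/BIT() usage is
assumed from the diffs above):

	/*
	 * CPUID(1).ECX bit 31 is the "running on a hypervisor" bit
	 * (X86_FEATURE_HYPERVISOR): hypervisors set it for their guests,
	 * real hardware leaves it clear.
	 */
	static bool x86_guest(void)
	{
		u32 eax, ebx, ecx, edx;

		/* cpuid(), not native_cpuid(), so Xen's pv_op/hypercall path is used */
		cpuid(0x1, &eax, &ebx, &ecx, &edx);

		return !!(ecx & BIT(31));
	}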
BTW, the crash had nothing to do with accessing dis_ucode_ldr; we are
crashing much later, in load_ucode_intel_ap(), trying to access
*initrd_start_p. And the reason we didn't crash before was that the
compiler optimized out the whole of load_ucode_ap(), since
check_loader_disabled_ap() was always true.
Thanks.
-boris
> +
> + return !!(ecx & BIT(31));
> +}
> +
> static bool __init check_loader_disabled_bsp(void)
> {
> #ifdef CONFIG_X86_32
> @@ -98,6 +108,9 @@ void __init load_ucode_bsp(void)
> {
> int vendor, x86;
>
> + if (x86_guest())
> + return;
> +
> if (check_loader_disabled_bsp())
> return;
>
> @@ -134,6 +147,9 @@ void load_ucode_ap(void)
> {
> int vendor, x86;
>
> + if (x86_guest())
> + return;
> +
> if (check_loader_disabled_ap())
> return;
>
Adding x86 people.
On Tue, Nov 25, 2014 at 04:59:34PM -0500, Boris Ostrovsky wrote:
> This should be cpuid(0x1, &eax, &ebx, &ecx, &edx). Otherwise we are not
> getting bits that the hypervisor wants the guest to see (on Xen cpuid()
> turns into hypercall, on baremetal it's native).
>
> With that change it works and
>
> Tested-by: Boris Ostrovsky <[email protected]>
Thanks for testing.
> (May be worth adding a comment as to what is_guest() is checking for since
> 31 is a magic number).
See below.
> BTW, the crash had nothing to do with accessing dis_ucode_ldr, we are
> crashing much later, in load_ucode_intel_ap(), trying to access
> *initrd_start_p. And the reason we didn't crash before was because compiler
> optimized out whole load_ucode_ap() since check_loader_disabled_ap() was
> always true.
Right, and my fix actually uncovered the original issue :-\
Ok, here's a v2 which adds the check to the late loader too, for
completeness. I'll write a proper commit message tomorrow.
---
diff --git a/arch/x86/include/asm/microcode.h b/arch/x86/include/asm/microcode.h
index 64dc362506b7..654907db5f09 100644
--- a/arch/x86/include/asm/microcode.h
+++ b/arch/x86/include/asm/microcode.h
@@ -87,4 +87,9 @@ static inline int __init save_microcode_in_initrd(void)
}
#endif
+/* Check whether we're running as a guest on a hypervisor. */
+static inline bool x86_guest(void)
+{
+ return !!(cpuid_ecx(1) & BIT(31));
+}
#endif /* _ASM_X86_MICROCODE_H */
diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index 2ce9051174e6..0b6db2a97f61 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -557,6 +557,9 @@ static int __init microcode_init(void)
struct cpuinfo_x86 *c = &cpu_data(0);
int error;
+ if (x86_guest())
+ return 0;
+
if (dis_ucode_ldr)
return 0;
diff --git a/arch/x86/kernel/cpu/microcode/core_early.c b/arch/x86/kernel/cpu/microcode/core_early.c
index 2c017f242a78..dfa93e74c370 100644
--- a/arch/x86/kernel/cpu/microcode/core_early.c
+++ b/arch/x86/kernel/cpu/microcode/core_early.c
@@ -98,6 +98,9 @@ void __init load_ucode_bsp(void)
{
int vendor, x86;
+ if (x86_guest())
+ return;
+
if (check_loader_disabled_bsp())
return;
@@ -134,6 +137,9 @@ void load_ucode_ap(void)
{
int vendor, x86;
+ if (x86_guest())
+ return;
+
if (check_loader_disabled_ap())
return;
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.
--
On 11/25/2014 05:18 PM, Borislav Petkov wrote:
> Adding x86 people.
>
> On Tue, Nov 25, 2014 at 04:59:34PM -0500, Boris Ostrovsky wrote:
>> This should be cpuid(0x1, &eax, &ebx, &ecx, &edx). Otherwise we are not
>> getting bits that the hypervisor wants the guest to see (on Xen cpuid()
>> turns into hypercall, on baremetal it's native).
>>
>> With that change it works and
>>
>> Tested-by: Boris Ostrovsky <[email protected]>
Sigh... I take this back. It breaks 32-bit baremetal. I haven't looked
any further, but it seems to be dying very early. I suspect the cpuid pv_op
is not set up yet. If that's true, perhaps you could check whether it is
valid in x86_guest()?
I won't be able to do anything tomorrow morning; the best I can hope for
is the evening.
-boris
> Thanks for testing.
>
>> (May be worth adding a comment as to what is_guest() is checking for since
>> 31 is a magic number).
> See below.
>
>> BTW, the crash had nothing to do with accessing dis_ucode_ldr, we are
>> crashing much later, in load_ucode_intel_ap(), trying to access
>> *initrd_start_p. And the reason we didn't crash before was because compiler
>> optimized out whole load_ucode_ap() since check_loader_disabled_ap() was
>> always true.
> Right, and my fix actually uncovered the original issue :-\
>
> Ok, here's a v2 which adds the check to the late loader too, for
> completeness. I'll write a proper commit message tomorrow.
>
> ---
> diff --git a/arch/x86/include/asm/microcode.h b/arch/x86/include/asm/microcode.h
> index 64dc362506b7..654907db5f09 100644
> --- a/arch/x86/include/asm/microcode.h
> +++ b/arch/x86/include/asm/microcode.h
> @@ -87,4 +87,9 @@ static inline int __init save_microcode_in_initrd(void)
> }
> #endif
>
> +/* Check whether we're running as a guest on a hypervisor. */
> +static inline bool x86_guest(void)
> +{
> + return !!(cpuid_ecx(1) & BIT(31));
> +}
> #endif /* _ASM_X86_MICROCODE_H */
> diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
> index 2ce9051174e6..0b6db2a97f61 100644
> --- a/arch/x86/kernel/cpu/microcode/core.c
> +++ b/arch/x86/kernel/cpu/microcode/core.c
> @@ -557,6 +557,9 @@ static int __init microcode_init(void)
> struct cpuinfo_x86 *c = &cpu_data(0);
> int error;
>
> + if (x86_guest())
> + return 0;
> +
> if (dis_ucode_ldr)
> return 0;
>
> diff --git a/arch/x86/kernel/cpu/microcode/core_early.c b/arch/x86/kernel/cpu/microcode/core_early.c
> index 2c017f242a78..dfa93e74c370 100644
> --- a/arch/x86/kernel/cpu/microcode/core_early.c
> +++ b/arch/x86/kernel/cpu/microcode/core_early.c
> @@ -98,6 +98,9 @@ void __init load_ucode_bsp(void)
> {
> int vendor, x86;
>
> + if (x86_guest())
> + return;
> +
> if (check_loader_disabled_bsp())
> return;
>
> @@ -134,6 +137,9 @@ void load_ucode_ap(void)
> {
> int vendor, x86;
>
> + if (x86_guest())
> + return;
> +
> if (check_loader_disabled_ap())
> return;
>
>
On Wed, Nov 26, 2014 at 12:00:45AM -0500, Boris Ostrovsky wrote:
> Sigh... I take this back. It breaks 32-bit baremetal. I haven't looked any
> further but it seems to be dying very early. I suspect cpuid pv_op is not
> set up yet. If that's true, perhaps you could check whether it is valid in
> x86_guest()?
Right, this is why we're using the native variants in the early loader.
So we need a different method for detecting very early whether we're
running as a guest.
What I'd like more, though, is to continue debugging the original
issue where we fail in load_ucode_intel_ap(). Does it happen on this line:
initrd_start_addr = (unsigned long)__pa_nodebug(*initrd_start_p);
where we dereference initrd_start_p? Do you have a full splat with a Code:
section?
Thanks.
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.
--
On 11/26/2014 5:55 AM, Borislav Petkov wrote:
> On Wed, Nov 26, 2014 at 12:00:45AM -0500, Boris Ostrovsky wrote:
>> Sigh... I take this back. It breaks 32-bit baremetal. I haven't looked any
>> further but it seems to be dying very early. I suspect cpuid pv_op is not
>> set up yet. If that's true, perhaps you could check whether it is valid in
>> x86_guest()?
> Right, this is why we're using the native variants in the early loader.
> So we need a different method for detecting very early whether we're
> running as a guest.
>
> What I'd like more, though, is if we continue debugging the original
> issue where we fail in load_ucode_intel_ap(). Does it happen on this line:
>
> initrd_start_addr = (unsigned long)__pa_nodebug(*initrd_start_p);
I don't have access to my test setup right now (and won't have any until
late today at best), but I am pretty sure this was the line when I was
looking at this yesterday.
>
> where we deref the initrd_start_p? Do you have a full splat with a Code:
> section?
https://lkml.org/lkml/2014/11/25/973 is all I have right now.
-boris
On Wed, Nov 26, 2014 at 07:39:26AM -0500, boris ostrovsky wrote:
> https://lkml.org/lkml/2014/11/25/973 is all I have right now.
Ok, so the Code: section from this splat says:
25: 55 push %ebp
26: 89 e5 mov %esp,%ebp
28: 83 ec 08 sub $0x8,%esp
2b:* 80 3d c0 ee 97 01 00 cmpb $0x0,0x197eec0 <-- trapping instruction
32: 89 1c 24 mov %ebx,(%esp)
35: 89 74 24 04 mov %esi,0x4(%esp)
39: 74 12 je 0x4d
3b: 8b 1c 24 mov (%esp),%ebx
3e: 8b .byte 0x8b
3f: 74 .byte 0x74
which I can correlate to the dis_ucode_ldr test here:
.loc 1 134 0
.loc 1 137 0
cmpb $0, dis_ucode_ldr+1073741824 #, *_11
je .L46 #,
so we must be faulting when accessing that dis_ucode_ldr thing. But you
said that accessing it through its virtual address doesn't fix the issue
either. Which is very very strange...
Hmmm.
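For what it's worth, 1073741824 is 0x40000000, and adding it modulo 2^32
is the same as subtracting the default 32-bit PAGE_OFFSET of 0xc0000000,
i.e. exactly the __pa_nodebug() translation. A throwaway userspace check
(assuming the default PAGE_OFFSET, so treat it as an illustration only):

	#include <stdio.h>

	int main(void)
	{
		unsigned int fault_addr = 0x0197eec0;	/* cmpb operand == CR2 in the splat */

		/* undo the __pa_nodebug()-style translation by adding PAGE_OFFSET back */
		printf("%#x\n", fault_addr + 0xc0000000u);	/* 0xc197eec0, a plausible &dis_ucode_ldr */
		return 0;
	}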
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.
--
----- [email protected] wrote:
> On Wed, Nov 26, 2014 at 07:39:26AM -0500, boris ostrovsky wrote:
> > https://lkml.org/lkml/2014/11/25/973 is all I have right now.
>
> Ok, so the Code: section from this splat says:
>
> 25: 55 push %ebp
> 26: 89 e5 mov %esp,%ebp
> 28: 83 ec 08 sub $0x8,%esp
> 2b:* 80 3d c0 ee 97 01 00 cmpb $0x0,0x197eec0 <-- trapping instruction
> 32: 89 1c 24 mov %ebx,(%esp)
> 35: 89 74 24 04 mov %esi,0x4(%esp)
> 39: 74 12 je 0x4d
> 3b: 8b 1c 24 mov (%esp),%ebx
> 3e: 8b .byte 0x8b
> 3f: 74 .byte 0x74
>
> which I can correlate to the dis_ucode_ldr test here:
>
> .loc 1 134 0
> .loc 1 137 0
> cmpb $0, dis_ucode_ldr+1073741824 #, *_11
> je .L46 #,
>
>
> so we must be faulting when accessing that dis_ucode_ldr thing. But you
> said that accessing it through its virtual address doesn't fix the issue
> either. Which is very very strange...
I was confusing you: accessing dis_ucode_ldr by virtual address does
work on PV. But we then fail later, in load_ucode_intel_ap(), because it
also tries to use __pa_nodebug() which can't be used by PV.
So if accessing dis_ucode_ldr by virtual address is acceptable (although
I don't think it is?) then we can stick dis_ucode_ldr=1 into
xen_start_kernel() and then things look OK.
A better solution may be to replace cpuid in x86_guest() with 'return
pv_info.paravirt_enabled' (or paravirt_enabled(), I guess). I gave it a
quick spin (32-bit only) and it seems to work. I'll see how my overnight
tests behave.
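Roughly, what I mean is something like this (sketch only, not the exact
diff I am testing; paravirt_enabled() compiles to a constant 0 on
!CONFIG_PARAVIRT kernels, as far as I can tell):

	static bool x86_guest(void)
	{
		/*
		 * pv_info.paravirt_enabled stays 0 on baremetal even in
		 * CONFIG_PARAVIRT builds and is set by xen_start_kernel()
		 * and friends, so this only fires for real PV guests.
		 */
		return paravirt_enabled();
	}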
-boris
On Wed, Nov 26, 2014 at 07:13:02PM -0800, Boris Ostrovsky wrote:
> I was confusing you: accessing dis_ucode_ldr by virtual address does
> work on PV. But we then fail later, in load_ucode_intel_ap(), because
> it also tries to use __pa_nodebug() which can't be used by PV.
>
> So if accessing dis_ucode_ldr by virtual address is acceptable
> (although I don't think it is?) then we can stick dis_ucode_ldr=1 into
> xen_start_kernel() and then things look OK.
>
> A better solution may be to replace cpuid in x86_guest() with 'return
> pv_info.paravirt_enabled' (or paravirt_enabled(), I guess). I gave
> it a quick spin (32-bit only) and it seems to work. I'll see how my
> overnight tests behave.
Ok, but let's have a clean design: maybe have a weak default stub which
returns false when PARAVIRT is not enabled in the .config and then add
an override in, say, arch/x86/kernel/paravirt.c which returns true when
running as a guest. Something like that, at least.
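Roughly this shape, say (the function name is invented here purely for
illustration, this is not a real patch):

	/* arch/x86/kernel/cpu/microcode/core_early.c */
	bool __weak microcode_loader_disabled(void)
	{
		return false;
	}

	/* arch/x86/kernel/paravirt.c, overriding the weak default on
	 * CONFIG_PARAVIRT builds */
	bool microcode_loader_disabled(void)
	{
		return pv_info.paravirt_enabled;
	}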
I can imagine other stuff wanting to use the dynamic checking at runtime
too...
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.
--
On Thu, Nov 27, 2014 at 10:12:28AM +0100, Borislav Petkov wrote:
> On Wed, Nov 26, 2014 at 07:13:02PM -0800, Boris Ostrovsky wrote:
> > I was confusing you: accessing dis_ucode_ldr by virtual address does
> > work on PV. But we then fail later, in load_ucode_intel_ap(), because
> > it also tries to use __pa_nodebug() which can't be used by PV.
> >
> > So if accessing dis_ucode_ldr by virtual address is acceptable
> > (although I don't think it is?) then we can stick dis_ucode_ldr=1 into
> > xen_start_kernel() and then things look OK.
> >
> > A better solution may be to replace cpuid in x86_guest() with 'return
> > pv_info.paravirt_enabled' (or paravirt_enabled(), I guess). I gave
> > it a quick spin (32-bit only) and it seems to work. I'll see how my
> > overnight tests behave.
>
> Ok, but let's have a clean design: maybe have a weak default stub which
> returns false when PARAVIRT is not enabled in the .config and then add
> an override in, say, arch/x86/kernel/paravirt.c which returns true when
> running as a guest. Something like that, at least.
You are describing 'paravirt_enabled()' :-)
>
> I can imagine other stuff wanting to use the dynamic checking at runtime
> too...
>
> --
> Regards/Gruss,
> Boris.
>
> Sent from a fat crate under my desk. Formatting is fine.
> --
On Thu, Nov 27, 2014 at 11:21:19AM -0500, Konrad Rzeszutek Wilk wrote:
> > Ok, but let's have a clean design: maybe have a weak default stub which
> > returns false when PARAVIRT is not enabled in the .config and then add
> > an override in, say, arch/x86/kernel/paravirt.c which returns true when
> > running as a guest. Something like that, at least.
>
> You are describing 'paravirt_enabled()' :-)
Haha.
Although I have a suspicion this won't work either because we're loading
microcode very early on 32-bit, before paging has been enabled, and
accessing pv_info.paravirt_enabled will probably go boom. AFAICT, from a
quick glance.
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.
--
----- [email protected] wrote:
> On Thu, Nov 27, 2014 at 11:21:19AM -0500, Konrad Rzeszutek Wilk
> wrote:
> > > Ok, but let's have a clean design: maybe have a weak default stub which
> > > returns false when PARAVIRT is not enabled in the .config and then add
> > > an override in, say, arch/x86/kernel/paravirt.c which returns true when
> > > running as a guest. Something like that, at least.
> >
> > You are describing 'paravirt_enabled()' :-)
>
> Haha.
>
> Although I have a suspicion this won't work either because we're loading
> microcode very early on 32-bit, before paging has been enabled, and
> accessing pv_info.paravirt_enabled will probably go boom. AFAICT, from a
> quick glance.
The overnight tests passed. This includes baremetal, HVM and PV(H), 32- and 64-bit.
-boris
On Thu, Nov 27, 2014 at 09:14:11AM -0800, Boris Ostrovsky wrote:
> The overnight tests passed. This includes baremetal, HVM and PV(H),
> 32- and 64-bit.
Cool. Want to send a proper patch for 3.18 and CC: stable?
Thanks.
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.
--
----- [email protected] wrote:
> On Thu, Nov 27, 2014 at 09:14:11AM -0800, Boris Ostrovsky wrote:
> > The overnight tests passed. This includes baremetal, HVM and PV(H),
> > 32- and 64-bit.
>
> Cool. Want to send a proper patch for 3.18 and CC: stable?
>
So I am not convinced that we actually update microcode on baremetal now.
I will look into this, but it will have to wait until Monday.
-boris