2017-03-20 18:35:09

by Greg Kroah-Hartman

Subject: [PATCH 4.9 00/93] 4.9.17-stable review

This is the start of the stable review cycle for the 4.9.17 release.
There are 93 patches in this series, all of which will be posted as
responses to this one. If anyone has any issues with these being applied,
please let me know.

Responses should be made by Wed Mar 22 17:47:16 UTC 2017.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.17-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <[email protected]>
Linux 4.9.17-rc1

Daniel Axtens <[email protected]>
crypto: powerpc - Fix initialisation of crc32c context

Niklas Cassel <[email protected]>
locking/rwsem: Fix down_write_killable() for CONFIG_RWSEM_GENERIC_SPINLOCK=y

Peter Zijlstra <[email protected]>
futex: Add missing error handling to FUTEX_REQUEUE_PI

Peter Zijlstra <[email protected]>
futex: Fix potential use-after-free in FUTEX_REQUEUE_PI

Andy Lutomirski <[email protected]>
x86/perf: Fix CR4.PCE propagation to use active_mm instead of mm

Andrey Ryabinin <[email protected]>
x86/kasan: Fix boot with KASAN=y and PROFILE_ANNOTATED_BRANCHES=y

Peter Zijlstra <[email protected]>
x86/tsc: Fix ART for TSC_KNOWN_FREQ

Shanker Donthineni <[email protected]>
irqchip/gicv3-its: Add workaround for QDF2400 ITS erratum 0065

Marc Zyngier <[email protected]>
arm64: KVM: VHE: Clear HCR_TGE when invalidating guest TLBs

Boris Brezillon <[email protected]>
drm/vc4: Fix ->clock_select setting for the VEC encoder

Derek Foreman <[email protected]>
drm/vc4: Fix race between page flip completion event and clean-up

Boris Brezillon <[email protected]>
clk: bcm2835: Fix ->fixed_divider of pllh_aux

Michael Ellerman <[email protected]>
powerpc/mm: Fix build break when CMA=n && SPAPR_TCE_IOMMU=y

Alexandre Belloni <[email protected]>
usb: gadget: udc: atmel: remove memory leak

Gabriel Krisman Bertazi <[email protected]>
serial: 8250_pci: Detach low-level driver during PCI error recovery

Michael Pobega <[email protected]>
ACPI / blacklist: Make Dell Latitude 3350 ethernet work

Alex Hung <[email protected]>
ACPI / blacklist: add _REV quirks for Dell Precision 5520 and 3520

Vladimir Davydov <[email protected]>
slub: move synchronize_sched out of slab_mutex on shrink

Henrik Ingo <[email protected]>
uvcvideo: uvc_scan_fallback() for webcams with broken chain

Harald Freudenberger <[email protected]>
s390/zcrypt: Introduce CEX6 toleration

Mauricio Faria de Oliveira <[email protected]>
block: allow WRITE_SAME commands with the SG_IO ioctl

Ben Skeggs <[email protected]>
drm/nouveau/disp/nv50-: specify ctrl/user separately when constructing classes

Ben Skeggs <[email protected]>
drm/nouveau/disp/nv50-: split chid into chid.ctrl and chid.user

Ben Skeggs <[email protected]>
drm/nouveau/disp/gp102: fix cursor/overlay immediate channel indices

Alexey Kardashevskiy <[email protected]>
vfio/spapr: Postpone default window creation

Alexey Kardashevskiy <[email protected]>
vfio/spapr: Add a helper to create default DMA window

Alexey Kardashevskiy <[email protected]>
powerpc/mm/iommu, vfio/spapr: Put pages on VFIO container shutdown

Alexey Kardashevskiy <[email protected]>
vfio/spapr: Reference mm in tce_container

Alexey Kardashevskiy <[email protected]>
powerpc/iommu: Stop using @current in mm_iommu_xxx

Alexey Kardashevskiy <[email protected]>
powerpc/iommu: Pass mm_struct to init/cleanup helpers

Alexey Kardashevskiy <[email protected]>
vfio/spapr: Postpone allocation of userspace version of TCE table

Vitaly Kuznetsov <[email protected]>
Drivers: hv: ring_buffer: count on wrap around mappings in get_next_pkt_raw() (v2)

Thomas Falcon <[email protected]>
ibmveth: calculate gso_segs for large packets

Gavin Shan <[email protected]>
PCI: Do any VF BAR updates before enabling the BARs

Bjorn Helgaas <[email protected]>
PCI: Ignore BAR updates on virtual functions

Bjorn Helgaas <[email protected]>
PCI: Update BARs using property bits appropriate for type

Bjorn Helgaas <[email protected]>
PCI: Don't update VF BARs while VF memory space is enabled

Bjorn Helgaas <[email protected]>
PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE

Bjorn Helgaas <[email protected]>
PCI: Add comments about ROM BAR updating

Bjorn Helgaas <[email protected]>
PCI: Remove pci_resource_bar() and pci_iov_resource_bar()

Bjorn Helgaas <[email protected]>
PCI: Separate VF BAR updates from standard BAR updates

Vitaly Kuznetsov <[email protected]>
x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic

Michael Cyr <[email protected]>
scsi: ibmvscsis: Synchronize cmds at remove time

Michael Cyr <[email protected]>
scsi: ibmvscsis: Synchronize cmds at tpg_enable_store time

Michael Cyr <[email protected]>
scsi: ibmvscsis: Rearrange functions for future patches

Michael Cyr <[email protected]>
scsi: ibmvscsis: Clean up properly if target_submit_cmd/tmr fails

Michael Cyr <[email protected]>
scsi: ibmvscsis: Return correct partition name/# to client

Michael Cyr <[email protected]>
scsi: ibmvscsis: Issues from Dan Carpenter/Smatch

Todd Fujinaka <[email protected]>
igb: add i211 to i210 PHY workaround

Chris J Arges <[email protected]>
igb: Workaround for igb i210 firmware issue

Dan Streetman <[email protected]>
xen: do not re-use pirq number cached in pci device msi msg data

Krister Johansen <[email protected]>
dmaengine: iota: ioat_alloc_chan_resources should not perform sleeping allocations.

Daniel Borkmann <[email protected]>
bpf: fix mark_reg_unknown_value for spilled regs on map value marking

Daniel Borkmann <[email protected]>
bpf: fix regression on verifier pruning wrt map lookups

Alexei Starovoitov <[email protected]>
bpf: fix state equivalence

Thomas Graf <[email protected]>
bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers

Hannes Frederic Sowa <[email protected]>
dccp: fix memory leak during tear-down of unsuccessful connection request

Hannes Frederic Sowa <[email protected]>
tun: fix premature POLLOUT notification on tun devices

Jon Maxwell <[email protected]>
dccp/tcp: fix routing redirect race

Florian Westphal <[email protected]>
bridge: drop netfilter fake rtable unconditionally

Florian Westphal <[email protected]>
ipv6: avoid write to a possibly cloned skb

Sabrina Dubroca <[email protected]>
ipv6: make ECMP route replacement less greedy

David Ahern <[email protected]>
mpls: Do not decrement alive counter for unregister events

David Ahern <[email protected]>
mpls: Send route delete notifications when router module is unloaded

Etienne Noss <[email protected]>
act_connmark: avoid crashing on malformed nlattrs with null parms

Dmitry V. Levin <[email protected]>
uapi: fix linux/packet_diag.h userspace compilation error

Paolo Abeni <[email protected]>
net/tunnel: set inner protocol in network gro hooks

David Ahern <[email protected]>
vrf: Fix use-after-free in vrf_xmit

Eric Dumazet <[email protected]>
dccp: fix use-after-free in dccp_feat_activate_values

Alexey Khoroshilov <[email protected]>
net/sched: act_skbmod: remove unneeded rcu_read_unlock in tcf_skbmod_dump

Eric Dumazet <[email protected]>
net: fix socket refcounting in skb_complete_tx_timestamp()

Eric Dumazet <[email protected]>
net: fix socket refcounting in skb_complete_wifi_ack()

Eric Dumazet <[email protected]>
tcp: fix various issues for sockets morphing to listen state

WANG Cong <[email protected]>
strparser: destroy workqueue on module exit

Arnaldo Carvalho de Melo <[email protected]>
dccp: Unlock sock before calling sk_free()

Eric Dumazet <[email protected]>
ipv6: orphan skbs in reassembly unit

Eric Dumazet <[email protected]>
net: net_enable_timestamp() can be called from irq contexts

Alexander Potapenko <[email protected]>
net: don't call strlen() on the user buffer in packet_bind_spkt()

Mike Manning <[email protected]>
net: bridge: allow IPv6 when multicast flood is disabled

Eric Dumazet <[email protected]>
tcp/dccp: block BH for SYN processing

Ido Schimmel <[email protected]>
mlxsw: spectrum_router: Avoid potential packets loss

Jakub Kicinski <[email protected]>
geneve: lock RCU on TX path

Jakub Kicinski <[email protected]>
vxlan: lock RCU on TX path

Florian Fainelli <[email protected]>
net: phy: Avoid deadlock during phy_error()

Paul Hüber <[email protected]>
l2tp: avoid use-after-free caused by l2tp_ip_backlog_recv

Roman Mashak <[email protected]>
net sched actions: decrement module reference count after table flush.

Julian Anastasov <[email protected]>
ipv4: mask tos for input route

Brian Russell <[email protected]>
vxlan: don't allow overwrite of config src addr

David Forster <[email protected]>
vti6: return GRE_KEY for vti6

Matthias Schiffer <[email protected]>
vxlan: correctly validate VXLAN ID against VXLAN_N_VID

Tariq Toukan <[email protected]>
net/mlx5e: Fix wrong CQE decompression

Tariq Toukan <[email protected]>
net/mlx5e: Do not reduce LRO WQE size when not using build_skb

Saeed Mahameed <[email protected]>
net/mlx5e: Register/unregister vport representors on interface attach/detach


-------------

Diffstat:

Documentation/arm64/silicon-errata.txt | 44 +-
Makefile | 4 +-
arch/arm64/Kconfig | 10 +
arch/arm64/kvm/hyp/tlb.c | 64 +-
arch/powerpc/crypto/crc32c-vpmsum_glue.c | 2 +-
arch/powerpc/include/asm/mmu_context.h | 20 +-
arch/powerpc/kernel/setup-common.c | 2 +-
arch/powerpc/mm/mmu_context_book3s64.c | 6 +-
arch/powerpc/mm/mmu_context_iommu.c | 62 +-
arch/x86/events/core.c | 4 +-
arch/x86/kernel/cpu/mshyperv.c | 24 +
arch/x86/kernel/head64.c | 1 +
arch/x86/kernel/tsc.c | 2 +
arch/x86/mm/kasan_init_64.c | 1 +
arch/x86/pci/xen.c | 23 +-
block/scsi_ioctl.c | 3 +
drivers/acpi/blacklist.c | 28 +
drivers/clk/bcm/clk-bcm2835.c | 2 +-
drivers/dma/ioat/init.c | 4 +-
drivers/gpu/drm/nouveau/nvkm/engine/disp/Kbuild | 2 +
.../gpu/drm/nouveau/nvkm/engine/disp/channv50.c | 30 +-
.../gpu/drm/nouveau/nvkm/engine/disp/channv50.h | 23 +-
drivers/gpu/drm/nouveau/nvkm/engine/disp/cursg84.c | 2 +-
.../gpu/drm/nouveau/nvkm/engine/disp/cursgf119.c | 2 +-
.../gpu/drm/nouveau/nvkm/engine/disp/cursgk104.c | 2 +-
.../gpu/drm/nouveau/nvkm/engine/disp/cursgp102.c | 37 +
.../gpu/drm/nouveau/nvkm/engine/disp/cursgt215.c | 2 +-
.../gpu/drm/nouveau/nvkm/engine/disp/cursnv50.c | 6 +-
.../gpu/drm/nouveau/nvkm/engine/disp/dmacgf119.c | 44 +-
.../gpu/drm/nouveau/nvkm/engine/disp/dmacgp104.c | 23 +-
.../gpu/drm/nouveau/nvkm/engine/disp/dmacnv50.c | 46 +-
drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmg84.c | 2 +-
.../gpu/drm/nouveau/nvkm/engine/disp/oimmgf119.c | 2 +-
.../gpu/drm/nouveau/nvkm/engine/disp/oimmgk104.c | 2 +-
.../gpu/drm/nouveau/nvkm/engine/disp/oimmgp102.c | 37 +
.../gpu/drm/nouveau/nvkm/engine/disp/oimmgt215.c | 2 +-
.../gpu/drm/nouveau/nvkm/engine/disp/oimmnv50.c | 6 +-
.../gpu/drm/nouveau/nvkm/engine/disp/piocgf119.c | 28 +-
.../gpu/drm/nouveau/nvkm/engine/disp/piocnv50.c | 30 +-
.../gpu/drm/nouveau/nvkm/engine/disp/rootgp104.c | 4 +-
.../gpu/drm/nouveau/nvkm/engine/disp/rootnv50.c | 4 +-
drivers/gpu/drm/vc4/vc4_crtc.c | 46 +-
drivers/gpu/drm/vc4/vc4_drv.h | 2 +
drivers/gpu/drm/vc4/vc4_kms.c | 33 +-
drivers/gpu/drm/vc4/vc4_regs.h | 3 +-
drivers/irqchip/irq-gic-v3-its.c | 16 +
drivers/media/usb/uvc/uvc_driver.c | 118 ++-
drivers/net/ethernet/ibm/ibmveth.c | 12 +-
drivers/net/ethernet/intel/igb/e1000_phy.c | 4 +
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 34 +-
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 13 +-
.../net/ethernet/mellanox/mlxsw/spectrum_router.c | 30 +-
drivers/net/geneve.c | 10 +-
drivers/net/phy/phy.c | 14 +-
drivers/net/tun.c | 18 +-
drivers/net/vrf.c | 3 +-
drivers/net/vxlan.c | 27 +-
drivers/pci/iov.c | 70 +-
drivers/pci/pci.c | 34 -
drivers/pci/pci.h | 7 +-
drivers/pci/probe.c | 3 +-
drivers/pci/rom.c | 5 +
drivers/pci/setup-res.c | 48 +-
drivers/s390/crypto/ap_bus.c | 3 +
drivers/s390/crypto/ap_bus.h | 1 +
drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c | 1096 +++++++++-----------
drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h | 5 +-
drivers/tty/serial/8250/8250_pci.c | 23 +-
drivers/usb/gadget/udc/atmel_usba_udc.c | 3 +-
drivers/usb/gadget/udc/atmel_usba_udc.h | 1 +
drivers/vfio/vfio_iommu_spapr_tce.c | 328 ++++--
include/linux/bpf_verifier.h | 14 +-
include/linux/dccp.h | 1 +
include/linux/hyperv.h | 32 +-
include/uapi/linux/packet_diag.h | 2 +-
kernel/bpf/verifier.c | 77 +-
kernel/futex.c | 22 +-
kernel/locking/rwsem-spinlock.c | 15 +-
mm/slab.c | 4 +-
mm/slab.h | 2 +-
mm/slab_common.c | 27 +-
mm/slob.c | 2 +-
mm/slub.c | 19 +-
net/bridge/br_forward.c | 3 +-
net/bridge/br_input.c | 1 +
net/bridge/br_netfilter_hooks.c | 21 -
net/core/dev.c | 35 +-
net/core/skbuff.c | 30 +-
net/dccp/ccids/ccid2.c | 1 +
net/dccp/input.c | 10 +-
net/dccp/ipv4.c | 3 +-
net/dccp/ipv6.c | 8 +-
net/dccp/minisocks.c | 25 +-
net/ipv4/af_inet.c | 4 +-
net/ipv4/route.c | 1 +
net/ipv4/tcp_input.c | 10 +-
net/ipv4/tcp_ipv4.c | 10 +-
net/ipv4/tcp_timer.c | 6 +-
net/ipv6/ip6_fib.c | 2 +
net/ipv6/ip6_offload.c | 4 +-
net/ipv6/ip6_output.c | 7 +-
net/ipv6/ip6_vti.c | 4 +
net/ipv6/netfilter/nf_conntrack_reasm.c | 1 +
net/ipv6/tcp_ipv6.c | 8 +-
net/l2tp/l2tp_ip.c | 2 +-
net/mpls/af_mpls.c | 4 +-
net/openvswitch/conntrack.c | 1 -
net/packet/af_packet.c | 8 +-
net/sched/act_api.c | 5 +-
net/sched/act_connmark.c | 3 +
net/sched/act_skbmod.c | 1 -
net/strparser/strparser.c | 1 +
112 files changed, 1816 insertions(+), 1272 deletions(-)



2017-03-20 17:54:39

by Greg Kroah-Hartman

Subject: [PATCH 4.9 10/93] net: phy: Avoid deadlock during phy_error()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Florian Fainelli <[email protected]>


[ Upstream commit eab127717a6af54401ba534790c793ec143cd1fc ]

phy_error() is called in the PHY state machine workqueue context, and
calls phy_trigger_machine() which does a cancel_delayed_work_sync() of
the workqueue we execute from, causing a deadlock situation.

Augment phy_trigger_machine() machine with a sync boolean indicating
whether we should use cancel_*_sync() or just cancel_*_work().
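
To make the hazard concrete, here is a minimal C sketch of the
self-deadlock (simplified from the 4.9 driver, not the literal code;
error handling elided):

	/* phy_state_machine() is the work function of phydev->state_queue */
	static void phy_state_machine(struct work_struct *work)
	{
		struct delayed_work *dwork = to_delayed_work(work);
		struct phy_device *phydev =
			container_of(dwork, struct phy_device, state_queue);

		/* ... on a fatal error ... */
		phy_error(phydev);	/* still inside state_queue's work */
	}

	static void phy_trigger_machine(struct phy_device *phydev)
	{
		/* Waits for the currently running state_queue work item to
		 * finish. When we get here via phy_error(), that running
		 * work item is *us*, so we wait on ourselves forever.
		 */
		cancel_delayed_work_sync(&phydev->state_queue);
		queue_delayed_work(system_power_efficient_wq,
				   &phydev->state_queue, 0);
	}

Passing sync=false from phy_error() swaps the blocking cancel for the
non-blocking cancel_delayed_work(), which is safe to call from within
the work function itself.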

Fixes: 3c293f4e08b5 ("net: phy: Trigger state machine on state change and not polling.")
Reported-by: Russell King <[email protected]>
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/phy/phy.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)

--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -611,14 +611,18 @@ void phy_start_machine(struct phy_device
* phy_trigger_machine - trigger the state machine to run
*
* @phydev: the phy_device struct
+ * @sync: indicate whether we should wait for the workqueue cancelation
*
* Description: There has been a change in state which requires that the
* state machine runs.
*/

-static void phy_trigger_machine(struct phy_device *phydev)
+static void phy_trigger_machine(struct phy_device *phydev, bool sync)
{
- cancel_delayed_work_sync(&phydev->state_queue);
+ if (sync)
+ cancel_delayed_work_sync(&phydev->state_queue);
+ else
+ cancel_delayed_work(&phydev->state_queue);
queue_delayed_work(system_power_efficient_wq, &phydev->state_queue, 0);
}

@@ -655,7 +659,7 @@ static void phy_error(struct phy_device
phydev->state = PHY_HALTED;
mutex_unlock(&phydev->lock);

- phy_trigger_machine(phydev);
+ phy_trigger_machine(phydev, false);
}

/**
@@ -817,7 +821,7 @@ void phy_change(struct work_struct *work
}

/* reschedule state queue work to run as soon as possible */
- phy_trigger_machine(phydev);
+ phy_trigger_machine(phydev, true);
return;

ignore:
@@ -907,7 +911,7 @@ void phy_start(struct phy_device *phydev
if (do_resume)
phy_resume(phydev);

- phy_trigger_machine(phydev);
+ phy_trigger_machine(phydev, true);
}
EXPORT_SYMBOL(phy_start);



2017-03-20 17:54:42

by Greg Kroah-Hartman

Subject: [PATCH 4.9 01/93] net/mlx5e: Register/unregister vport representors on interface attach/detach

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Saeed Mahameed <[email protected]>


[ Upstream commit 6f08a22c5fb2b9aefb8ecd8496758e7a677c1fde ]

Currently vport representors are added only on driver load and removed on
driver unload. Apparently we forgot to handle them when we added the
seamless reset flow feature. This left the representor netdevs alive
and active, with open HW resources, on PCI shutdown and on error reset
flows.

To overcome this, we move their handling to interface attach/detach, so
they are cleaned up on shutdown and recreated on reset flows.

Fixes: 26e59d8077a3 ("net/mlx5e: Implement mlx5e interface attach/detach callbacks")
Signed-off-by: Saeed Mahameed <[email protected]>
Reviewed-by: Hadar Hen Zion <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 23 ++++++++++++++--------
1 file changed, 15 insertions(+), 8 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3936,6 +3936,19 @@ static void mlx5e_register_vport_rep(str
}
}

+static void mlx5e_unregister_vport_rep(struct mlx5_core_dev *mdev)
+{
+ struct mlx5_eswitch *esw = mdev->priv.eswitch;
+ int total_vfs = MLX5_TOTAL_VPORTS(mdev);
+ int vport;
+
+ if (!MLX5_CAP_GEN(mdev, vport_group_manager))
+ return;
+
+ for (vport = 1; vport < total_vfs; vport++)
+ mlx5_eswitch_unregister_vport_rep(esw, vport);
+}
+
void mlx5e_detach_netdev(struct mlx5_core_dev *mdev, struct net_device *netdev)
{
struct mlx5e_priv *priv = netdev_priv(netdev);
@@ -3983,6 +3996,7 @@ static int mlx5e_attach(struct mlx5_core
return err;
}

+ mlx5e_register_vport_rep(mdev);
return 0;
}

@@ -3994,6 +4008,7 @@ static void mlx5e_detach(struct mlx5_cor
if (!netif_device_present(netdev))
return;

+ mlx5e_unregister_vport_rep(mdev);
mlx5e_detach_netdev(mdev, netdev);
mlx5e_destroy_mdev_resources(mdev);
}
@@ -4012,8 +4027,6 @@ static void *mlx5e_add(struct mlx5_core_
if (err)
return NULL;

- mlx5e_register_vport_rep(mdev);
-
if (MLX5_CAP_GEN(mdev, vport_group_manager))
ppriv = &esw->offloads.vport_reps[0];

@@ -4065,13 +4078,7 @@ void mlx5e_destroy_netdev(struct mlx5_co

static void mlx5e_remove(struct mlx5_core_dev *mdev, void *vpriv)
{
- struct mlx5_eswitch *esw = mdev->priv.eswitch;
- int total_vfs = MLX5_TOTAL_VPORTS(mdev);
struct mlx5e_priv *priv = vpriv;
- int vport;
-
- for (vport = 1; vport < total_vfs; vport++)
- mlx5_eswitch_unregister_vport_rep(esw, vport);

unregister_netdev(priv->netdev);
mlx5e_detach(mdev, vpriv);


2017-03-20 17:54:55

by Greg Kroah-Hartman

Subject: [PATCH 4.9 11/93] vxlan: lock RCU on TX path

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jakub Kicinski <[email protected]>


[ Upstream commit 56de859e9967c070464a9a9f4f18d73f9447298e ]

There are no guarantees that callers of the TX path will hold
the RCU lock. Grab it explicitly.
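
As a sketch of the resulting pattern (simplified; vn4_sock is the
RCU-protected socket pointer in 4.9's struct vxlan_dev):

	rcu_read_lock();
	sock4 = rcu_dereference(vxlan->vn4_sock);	/* only valid under RCU */
	/* ... resolve route, build headers, hand off the skb ... */
	rcu_read_unlock();

Note that every early return in the TX function must become a jump to a
common unlock label, which is why the patch below converts the bare
"return" statements into "goto out_unlock".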

Fixes: c6fcc4fc5f8b ("vxlan: avoid using stale vxlan socket.")
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/vxlan.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)

--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1955,6 +1955,7 @@ static void vxlan_xmit_one(struct sk_buf

info = skb_tunnel_info(skb);

+ rcu_read_lock();
if (rdst) {
dst_port = rdst->remote_port ? rdst->remote_port : vxlan->cfg.dst_port;
vni = rdst->remote_vni;
@@ -1985,7 +1986,7 @@ static void vxlan_xmit_one(struct sk_buf
if (did_rsc) {
/* short-circuited back to local bridge */
vxlan_encap_bypass(skb, vxlan, vxlan);
- return;
+ goto out_unlock;
}
goto drop;
}
@@ -2054,7 +2055,7 @@ static void vxlan_xmit_one(struct sk_buf
if (!dst_vxlan)
goto tx_error;
vxlan_encap_bypass(skb, vxlan, dst_vxlan);
- return;
+ goto out_unlock;
}

if (!info)
@@ -2115,7 +2116,7 @@ static void vxlan_xmit_one(struct sk_buf
if (!dst_vxlan)
goto tx_error;
vxlan_encap_bypass(skb, vxlan, dst_vxlan);
- return;
+ goto out_unlock;
}

if (!info)
@@ -2129,7 +2130,7 @@ static void vxlan_xmit_one(struct sk_buf
if (err < 0) {
dst_release(ndst);
dev->stats.tx_errors++;
- return;
+ goto out_unlock;
}
udp_tunnel6_xmit_skb(ndst, sk, skb, dev,
&local_ip.sin6.sin6_addr,
@@ -2137,7 +2138,8 @@ static void vxlan_xmit_one(struct sk_buf
label, src_port, dst_port, !udp_sum);
#endif
}
-
+out_unlock:
+ rcu_read_unlock();
return;

drop:
@@ -2153,6 +2155,7 @@ tx_error:
dev->stats.tx_errors++;
tx_free:
dev_kfree_skb(skb);
+ rcu_read_unlock();
}

/* Transmit local packets over Vxlan


2017-03-20 17:54:48

by Greg Kroah-Hartman

Subject: [PATCH 4.9 12/93] geneve: lock RCU on TX path

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jakub Kicinski <[email protected]>


[ Upstream commit a717e3f740803cc88bd5c9a70c93504f6a368663 ]

There are no guarantees that callers of the TX path will hold
the RCU lock. Grab it explicitly.

Fixes: fceb9c3e3825 ("geneve: avoid using stale geneve socket.")
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/geneve.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -1039,16 +1039,22 @@ static netdev_tx_t geneve_xmit(struct sk
{
struct geneve_dev *geneve = netdev_priv(dev);
struct ip_tunnel_info *info = NULL;
+ int err;

if (geneve->collect_md)
info = skb_tunnel_info(skb);

+ rcu_read_lock();
#if IS_ENABLED(CONFIG_IPV6)
if ((info && ip_tunnel_info_af(info) == AF_INET6) ||
(!info && geneve->remote.sa.sa_family == AF_INET6))
- return geneve6_xmit_skb(skb, dev, info);
+ err = geneve6_xmit_skb(skb, dev, info);
+ else
#endif
- return geneve_xmit_skb(skb, dev, info);
+ err = geneve_xmit_skb(skb, dev, info);
+ rcu_read_unlock();
+
+ return err;
}

static int __geneve_change_mtu(struct net_device *dev, int new_mtu, bool strict)


2017-03-20 17:54:58

by Greg Kroah-Hartman

Subject: [PATCH 4.9 13/93] mlxsw: spectrum_router: Avoid potential packets loss

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ido Schimmel <[email protected]>


[ Upstream commit f7df4923fa986247e93ec2cdff5ca168fff14dcf ]

When the structure of the LPM tree changes (e.g., due to the addition of
a new prefix), we unbind the old tree and then bind the new one. This
may result in temporary packet loss.

Instead, overwrite the old binding with the new one.
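
In outline, the make-before-break sequence looks like this (a sketch
using the functions from the hunk below; 'old' and 'new' are
illustrative names):

	new = mlxsw_sp_lpm_tree_get(mlxsw_sp, req_prefix_usage, vr->proto, false);
	vr->lpm_tree = new;
	err = mlxsw_sp_vr_lpm_tree_bind(mlxsw_sp, vr);	/* overwrites the binding */
	if (err) {
		vr->lpm_tree = old;	/* old tree was never unbound: safe rollback */
		mlxsw_sp_lpm_tree_put(mlxsw_sp, new);
		return err;
	}
	mlxsw_sp_lpm_tree_put(mlxsw_sp, old);

The virtual router is never left without a bound tree, so no packets
are dropped during the switch-over.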

Fixes: 6b75c4807db3 ("mlxsw: spectrum_router: Add virtual router management")
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c | 30 ++++++++++++------
1 file changed, 20 insertions(+), 10 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -500,30 +500,40 @@ static int
mlxsw_sp_vr_lpm_tree_check(struct mlxsw_sp *mlxsw_sp, struct mlxsw_sp_vr *vr,
struct mlxsw_sp_prefix_usage *req_prefix_usage)
{
- struct mlxsw_sp_lpm_tree *lpm_tree;
+ struct mlxsw_sp_lpm_tree *lpm_tree = vr->lpm_tree;
+ struct mlxsw_sp_lpm_tree *new_tree;
+ int err;

- if (mlxsw_sp_prefix_usage_eq(req_prefix_usage,
- &vr->lpm_tree->prefix_usage))
+ if (mlxsw_sp_prefix_usage_eq(req_prefix_usage, &lpm_tree->prefix_usage))
return 0;

- lpm_tree = mlxsw_sp_lpm_tree_get(mlxsw_sp, req_prefix_usage,
+ new_tree = mlxsw_sp_lpm_tree_get(mlxsw_sp, req_prefix_usage,
vr->proto, false);
- if (IS_ERR(lpm_tree)) {
+ if (IS_ERR(new_tree)) {
/* We failed to get a tree according to the required
* prefix usage. However, the current tree might be still good
* for us if our requirement is subset of the prefixes used
* in the tree.
*/
if (mlxsw_sp_prefix_usage_subset(req_prefix_usage,
- &vr->lpm_tree->prefix_usage))
+ &lpm_tree->prefix_usage))
return 0;
- return PTR_ERR(lpm_tree);
+ return PTR_ERR(new_tree);
}

- mlxsw_sp_vr_lpm_tree_unbind(mlxsw_sp, vr);
- mlxsw_sp_lpm_tree_put(mlxsw_sp, vr->lpm_tree);
+ /* Prevent packet loss by overwriting existing binding */
+ vr->lpm_tree = new_tree;
+ err = mlxsw_sp_vr_lpm_tree_bind(mlxsw_sp, vr);
+ if (err)
+ goto err_tree_bind;
+ mlxsw_sp_lpm_tree_put(mlxsw_sp, lpm_tree);
+
+ return 0;
+
+err_tree_bind:
vr->lpm_tree = lpm_tree;
- return mlxsw_sp_vr_lpm_tree_bind(mlxsw_sp, vr);
+ mlxsw_sp_lpm_tree_put(mlxsw_sp, new_tree);
+ return err;
}

static struct mlxsw_sp_vr *mlxsw_sp_vr_get(struct mlxsw_sp *mlxsw_sp,


2017-03-20 17:55:09

by Greg Kroah-Hartman

Subject: [PATCH 4.9 16/93] net: dont call strlen() on the user buffer in packet_bind_spkt()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Alexander Potapenko <[email protected]>


[ Upstream commit 540e2894f7905538740aaf122bd8e0548e1c34a4 ]

KMSAN (KernelMemorySanitizer, a new error detection tool) reports use of
uninitialized memory in packet_bind_spkt():

==================================================================
BUG: KMSAN: use of unitialized memory
CPU: 0 PID: 1074 Comm: packet Not tainted 4.8.0-rc6+ #1891
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
01/01/2011
0000000000000000 ffff88006b6dfc08 ffffffff82559ae8 ffff88006b6dfb48
ffffffff818a7c91 ffffffff85b9c870 0000000000000092 ffffffff85b9c550
0000000000000000 0000000000000092 00000000ec400911 0000000000000002
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff82559ae8>] dump_stack+0x238/0x290 lib/dump_stack.c:51
[<ffffffff818a6626>] kmsan_report+0x276/0x2e0 mm/kmsan/kmsan.c:1003
[<ffffffff818a783b>] __msan_warning+0x5b/0xb0
mm/kmsan/kmsan_instr.c:424
[< inline >] strlen lib/string.c:484
[<ffffffff8259b58d>] strlcpy+0x9d/0x200 lib/string.c:144
[<ffffffff84b2eca4>] packet_bind_spkt+0x144/0x230
net/packet/af_packet.c:3132
[<ffffffff84242e4d>] SYSC_bind+0x40d/0x5f0 net/socket.c:1370
[<ffffffff84242a22>] SyS_bind+0x82/0xa0 net/socket.c:1356
[<ffffffff8515991b>] entry_SYSCALL_64_fastpath+0x13/0x8f
arch/x86/entry/entry_64.o:?
chained origin: 00000000eba00911
[<ffffffff810bb787>] save_stack_trace+0x27/0x50
arch/x86/kernel/stacktrace.c:67
[< inline >] kmsan_save_stack_with_flags mm/kmsan/kmsan.c:322
[< inline >] kmsan_save_stack mm/kmsan/kmsan.c:334
[<ffffffff818a59f8>] kmsan_internal_chain_origin+0x118/0x1e0
mm/kmsan/kmsan.c:527
[<ffffffff818a7773>] __msan_set_alloca_origin4+0xc3/0x130
mm/kmsan/kmsan_instr.c:380
[<ffffffff84242b69>] SYSC_bind+0x129/0x5f0 net/socket.c:1356
[<ffffffff84242a22>] SyS_bind+0x82/0xa0 net/socket.c:1356
[<ffffffff8515991b>] entry_SYSCALL_64_fastpath+0x13/0x8f
arch/x86/entry/entry_64.o:?
origin description: ----address@SYSC_bind (origin=00000000eb400911)
==================================================================
(the line numbers are relative to 4.8-rc6, but the bug persists
upstream)

The report triggers when I run the following program as root:

=====================================
#include <string.h>
#include <sys/socket.h>
#include <netpacket/packet.h>
#include <net/ethernet.h>
#include <arpa/inet.h>	/* htons() */

int main() {
	struct sockaddr addr;

	/* Fill the whole sockaddr with 0xff: sa_data gets no NUL terminator. */
	memset(&addr, 0xff, sizeof(addr));
	addr.sa_family = AF_PACKET;

	int fd = socket(PF_PACKET, SOCK_PACKET, htons(ETH_P_ALL));
	bind(fd, &addr, sizeof(addr));
	return 0;
}
=====================================

This happens because addr.sa_data copied from userspace is not
zero-terminated, and copying it with strlcpy() in packet_bind_spkt()
results in calling strlen() on the kernel copy of that non-terminated
buffer.

Acked-by: Eric Dumazet <[email protected]>
Signed-off-by: Alexander Potapenko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/packet/af_packet.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -3140,7 +3140,7 @@ static int packet_bind_spkt(struct socke
int addr_len)
{
struct sock *sk = sock->sk;
- char name[15];
+ char name[sizeof(uaddr->sa_data) + 1];

/*
* Check legality
@@ -3148,7 +3148,11 @@ static int packet_bind_spkt(struct socke

if (addr_len != sizeof(struct sockaddr))
return -EINVAL;
- strlcpy(name, uaddr->sa_data, sizeof(name));
+ /* uaddr->sa_data comes from the userspace, it's not guaranteed to be
+ * zero-terminated.
+ */
+ memcpy(name, uaddr->sa_data, sizeof(uaddr->sa_data));
+ name[sizeof(uaddr->sa_data)] = 0;

return packet_do_bind(sk, name, 0, pkt_sk(sk)->num);
}


2017-03-20 17:55:26

by Greg Kroah-Hartman

Subject: [PATCH 4.9 14/93] tcp/dccp: block BH for SYN processing

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <[email protected]>


[ Upstream commit 449809a66c1d0b1563dee84493e14bf3104d2d7e ]

SYN processing really was meant to be handled from BH.

When I got rid of BH blocking while processing socket backlog
in commit 5413d1babe8f ("net: do not block BH while processing socket
backlog"), I forgot that a malicious user could transition to TCP_LISTEN
from a state that allowed (SYN) packets to be parked in the socket
backlog while the socket is owned by the thread doing the listen() call.

Sure enough syzkaller found this and reported the bug ;)
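
The offending sequence, sketched (simplified; not a literal reproducer):

	/* process context, socket owned by the user during setsockopt()/listen(): */
	lock_sock(sk);		/* incoming SYNs are parked in the socket backlog */
	/* ... socket transitions to LISTEN ... */
	release_sock(sk);	/* drains the backlog in process context:
				 * conn_request() -> inet_ehash_insert() takes
				 * the ehash bucket lock with BH enabled, while
				 * softirq RX takes the same lock -- hence the
				 * lockdep splat below.
				 */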

=================================
[ INFO: inconsistent lock state ]
4.10.0+ #60 Not tainted
---------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
syz-executor0/5090 [HC0[0]:SC0[0]:HE1:SE1] takes:
(&(&hashinfo->ehash_locks[i])->rlock){+.?...}, at:
[<ffffffff83a6a370>] spin_lock include/linux/spinlock.h:299 [inline]
(&(&hashinfo->ehash_locks[i])->rlock){+.?...}, at:
[<ffffffff83a6a370>] inet_ehash_insert+0x240/0xad0
net/ipv4/inet_hashtables.c:407
{IN-SOFTIRQ-W} state was registered at:
mark_irqflags kernel/locking/lockdep.c:2923 [inline]
__lock_acquire+0xbcf/0x3270 kernel/locking/lockdep.c:3295
lock_acquire+0x241/0x580 kernel/locking/lockdep.c:3753
__raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
_raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
spin_lock include/linux/spinlock.h:299 [inline]
inet_ehash_insert+0x240/0xad0 net/ipv4/inet_hashtables.c:407
reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:753 [inline]
inet_csk_reqsk_queue_hash_add+0x1b7/0x2a0 net/ipv4/inet_connection_sock.c:764
tcp_conn_request+0x25cc/0x3310 net/ipv4/tcp_input.c:6399
tcp_v4_conn_request+0x157/0x220 net/ipv4/tcp_ipv4.c:1262
tcp_rcv_state_process+0x802/0x4130 net/ipv4/tcp_input.c:5889
tcp_v4_do_rcv+0x56b/0x940 net/ipv4/tcp_ipv4.c:1433
tcp_v4_rcv+0x2e12/0x3210 net/ipv4/tcp_ipv4.c:1711
ip_local_deliver_finish+0x4ce/0xc40 net/ipv4/ip_input.c:216
NF_HOOK include/linux/netfilter.h:257 [inline]
ip_local_deliver+0x1ce/0x710 net/ipv4/ip_input.c:257
dst_input include/net/dst.h:492 [inline]
ip_rcv_finish+0xb1d/0x2110 net/ipv4/ip_input.c:396
NF_HOOK include/linux/netfilter.h:257 [inline]
ip_rcv+0xd90/0x19c0 net/ipv4/ip_input.c:487
__netif_receive_skb_core+0x1ad1/0x3400 net/core/dev.c:4179
__netif_receive_skb+0x2a/0x170 net/core/dev.c:4217
netif_receive_skb_internal+0x1d6/0x430 net/core/dev.c:4245
napi_skb_finish net/core/dev.c:4602 [inline]
napi_gro_receive+0x4e6/0x680 net/core/dev.c:4636
e1000_receive_skb drivers/net/ethernet/intel/e1000/e1000_main.c:4033 [inline]
e1000_clean_rx_irq+0x5e0/0x1490
drivers/net/ethernet/intel/e1000/e1000_main.c:4489
e1000_clean+0xb9a/0x2910 drivers/net/ethernet/intel/e1000/e1000_main.c:3834
napi_poll net/core/dev.c:5171 [inline]
net_rx_action+0xe70/0x1900 net/core/dev.c:5236
__do_softirq+0x2fb/0xb7d kernel/softirq.c:284
invoke_softirq kernel/softirq.c:364 [inline]
irq_exit+0x19e/0x1d0 kernel/softirq.c:405
exiting_irq arch/x86/include/asm/apic.h:658 [inline]
do_IRQ+0x81/0x1a0 arch/x86/kernel/irq.c:250
ret_from_intr+0x0/0x20
native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
default_idle+0x8f/0x410 arch/x86/kernel/process.c:271
arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:262
default_idle_call+0x36/0x60 kernel/sched/idle.c:96
cpuidle_idle_call kernel/sched/idle.c:154 [inline]
do_idle+0x348/0x440 kernel/sched/idle.c:243
cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:345
start_secondary+0x344/0x440 arch/x86/kernel/smpboot.c:272
verify_cpu+0x0/0xfc
irq event stamp: 1741
hardirqs last enabled at (1741): [<ffffffff84d49d77>]
__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160
[inline]
hardirqs last enabled at (1741): [<ffffffff84d49d77>]
_raw_spin_unlock_irqrestore+0xf7/0x1a0 kernel/locking/spinlock.c:191
hardirqs last disabled at (1740): [<ffffffff84d4a732>]
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline]
hardirqs last disabled at (1740): [<ffffffff84d4a732>]
_raw_spin_lock_irqsave+0xa2/0x110 kernel/locking/spinlock.c:159
softirqs last enabled at (1738): [<ffffffff84d4deff>]
__do_softirq+0x7cf/0xb7d kernel/softirq.c:310
softirqs last disabled at (1571): [<ffffffff84d4b92c>]
do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:902

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(&(&hashinfo->ehash_locks[i])->rlock);
<Interrupt>
lock(&(&hashinfo->ehash_locks[i])->rlock);

*** DEADLOCK ***

1 lock held by syz-executor0/5090:
#0: (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83406b43>] lock_sock
include/net/sock.h:1460 [inline]
#0: (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83406b43>]
sock_setsockopt+0x233/0x1e40 net/core/sock.c:683

stack backtrace:
CPU: 1 PID: 5090 Comm: syz-executor0 Not tainted 4.10.0+ #60
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:15 [inline]
dump_stack+0x292/0x398 lib/dump_stack.c:51
print_usage_bug+0x3ef/0x450 kernel/locking/lockdep.c:2387
valid_state kernel/locking/lockdep.c:2400 [inline]
mark_lock_irq kernel/locking/lockdep.c:2602 [inline]
mark_lock+0xf30/0x1410 kernel/locking/lockdep.c:3065
mark_irqflags kernel/locking/lockdep.c:2941 [inline]
__lock_acquire+0x6dc/0x3270 kernel/locking/lockdep.c:3295
lock_acquire+0x241/0x580 kernel/locking/lockdep.c:3753
__raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
_raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
spin_lock include/linux/spinlock.h:299 [inline]
inet_ehash_insert+0x240/0xad0 net/ipv4/inet_hashtables.c:407
reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:753 [inline]
inet_csk_reqsk_queue_hash_add+0x1b7/0x2a0 net/ipv4/inet_connection_sock.c:764
dccp_v6_conn_request+0xada/0x11b0 net/dccp/ipv6.c:380
dccp_rcv_state_process+0x51e/0x1660 net/dccp/input.c:606
dccp_v6_do_rcv+0x213/0x350 net/dccp/ipv6.c:632
sk_backlog_rcv include/net/sock.h:896 [inline]
__release_sock+0x127/0x3a0 net/core/sock.c:2052
release_sock+0xa5/0x2b0 net/core/sock.c:2539
sock_setsockopt+0x60f/0x1e40 net/core/sock.c:1016
SYSC_setsockopt net/socket.c:1782 [inline]
SyS_setsockopt+0x2fb/0x3a0 net/socket.c:1765
entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x4458b9
RSP: 002b:00007fe8b26c2b58 EFLAGS: 00000292 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00000000004458b9
RDX: 000000000000001a RSI: 0000000000000001 RDI: 0000000000000006
RBP: 00000000006e2110 R08: 0000000000000010 R09: 0000000000000000
R10: 00000000208c3000 R11: 0000000000000292 R12: 0000000000708000
R13: 0000000020000000 R14: 0000000000001000 R15: 0000000000000000

Fixes: 5413d1babe8f ("net: do not block BH while processing socket backlog")
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: Andrey Konovalov <[email protected]>
Acked-by: Soheil Hassas Yeganeh <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/dccp/input.c | 10 ++++++++--
net/ipv4/tcp_input.c | 10 ++++++++--
2 files changed, 16 insertions(+), 4 deletions(-)

--- a/net/dccp/input.c
+++ b/net/dccp/input.c
@@ -577,6 +577,7 @@ int dccp_rcv_state_process(struct sock *
struct dccp_sock *dp = dccp_sk(sk);
struct dccp_skb_cb *dcb = DCCP_SKB_CB(skb);
const int old_state = sk->sk_state;
+ bool acceptable;
int queued = 0;

/*
@@ -603,8 +604,13 @@ int dccp_rcv_state_process(struct sock *
*/
if (sk->sk_state == DCCP_LISTEN) {
if (dh->dccph_type == DCCP_PKT_REQUEST) {
- if (inet_csk(sk)->icsk_af_ops->conn_request(sk,
- skb) < 0)
+ /* It is possible that we process SYN packets from backlog,
+ * so we need to make sure to disable BH right there.
+ */
+ local_bh_disable();
+ acceptable = inet_csk(sk)->icsk_af_ops->conn_request(sk, skb) >= 0;
+ local_bh_enable();
+ if (!acceptable)
return 1;
consume_skb(skb);
return 0;
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5916,9 +5916,15 @@ int tcp_rcv_state_process(struct sock *s
if (th->syn) {
if (th->fin)
goto discard;
- if (icsk->icsk_af_ops->conn_request(sk, skb) < 0)
- return 1;
+ /* It is possible that we process SYN packets from backlog,
+ * so we need to make sure to disable BH right there.
+ */
+ local_bh_disable();
+ acceptable = icsk->icsk_af_ops->conn_request(sk, skb) >= 0;
+ local_bh_enable();

+ if (!acceptable)
+ return 1;
consume_skb(skb);
return 0;
}


2017-03-20 17:55:53

by Greg Kroah-Hartman

Subject: [PATCH 4.9 15/93] net: bridge: allow IPv6 when multicast flood is disabled

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mike Manning <[email protected]>


[ Upstream commit 8953de2f02ad7b15e4964c82f9afd60f128e4e98 ]

Even with multicast flooding turned off, IPv6 ND should still work so
that IPv6 connectivity is provided. Allow this by continuing to flood
multicast traffic originated by us.

Fixes: b6cb5ac8331b ("net: bridge: add per-port multicast flood flag")
Cc: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Mike Manning <[email protected]>
Acked-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/bridge/br_forward.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -186,8 +186,9 @@ void br_flood(struct net_bridge *br, str
/* Do not flood unicast traffic to ports that turn it off */
if (pkt_type == BR_PKT_UNICAST && !(p->flags & BR_FLOOD))
continue;
+ /* Do not flood if mc off, except for traffic we originate */
if (pkt_type == BR_PKT_MULTICAST &&
- !(p->flags & BR_MCAST_FLOOD))
+ !(p->flags & BR_MCAST_FLOOD) && skb->dev != br->dev)
continue;

/* Do not flood to ports that enable proxy ARP */


2017-03-20 17:58:23

by Greg Kroah-Hartman

Subject: [PATCH 4.9 34/93] bridge: drop netfilter fake rtable unconditionally

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Florian Westphal <[email protected]>


[ Upstream commit a13b2082ece95247779b9995c4e91b4246bed023 ]

Andreas reports a kernel oops during rmmod of the br_netfilter module.
Hannes debugged the oops down to a NULL rt6_info->rt6i_idev.

Problem is that br_netfilter has the nasty concept of adding a fake
rtable to skb->dst; this happens in a br_netfilter prerouting hook.

A second hook (in bridge LOCAL_IN) is supposed to remove these again
before the skb is handed up the stack.

However, on module unload the hooks get unregistered, which means an skb
could traverse the prerouting hook that attaches the fake_rtable while
the 'fake rtable remove' hook has already been taken off the hook list.

Fixes: 34666d467cbf1e2e3c7 ("netfilter: bridge: move br_netfilter out of the core")
Reported-by: Andreas Karis <[email protected]>
Debugged-by: Hannes Frederic Sowa <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Acked-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/bridge/br_input.c | 1 +
net/bridge/br_netfilter_hooks.c | 21 ---------------------
2 files changed, 1 insertion(+), 21 deletions(-)

--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -29,6 +29,7 @@ EXPORT_SYMBOL(br_should_route_hook);
static int
br_netif_receive_skb(struct net *net, struct sock *sk, struct sk_buff *skb)
{
+ br_drop_fake_rtable(skb);
return netif_receive_skb(skb);
}

--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -521,21 +521,6 @@ static unsigned int br_nf_pre_routing(vo
}


-/* PF_BRIDGE/LOCAL_IN ************************************************/
-/* The packet is locally destined, which requires a real
- * dst_entry, so detach the fake one. On the way up, the
- * packet would pass through PRE_ROUTING again (which already
- * took place when the packet entered the bridge), but we
- * register an IPv4 PRE_ROUTING 'sabotage' hook that will
- * prevent this from happening. */
-static unsigned int br_nf_local_in(void *priv,
- struct sk_buff *skb,
- const struct nf_hook_state *state)
-{
- br_drop_fake_rtable(skb);
- return NF_ACCEPT;
-}
-
/* PF_BRIDGE/FORWARD *************************************************/
static int br_nf_forward_finish(struct net *net, struct sock *sk, struct sk_buff *skb)
{
@@ -906,12 +891,6 @@ static struct nf_hook_ops br_nf_ops[] __
.priority = NF_BR_PRI_BRNF,
},
{
- .hook = br_nf_local_in,
- .pf = NFPROTO_BRIDGE,
- .hooknum = NF_BR_LOCAL_IN,
- .priority = NF_BR_PRI_BRNF,
- },
- {
.hook = br_nf_forward_ip,
.pf = NFPROTO_BRIDGE,
.hooknum = NF_BR_FORWARD,


2017-03-20 17:58:26

by Greg Kroah-Hartman

Subject: [PATCH 4.9 38/93] bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Thomas Graf <[email protected]>


[ Upstream commit 57a09bf0a416700676e77102c28f9cfcb48267e0 ]

A BPF program is required to check the return register of a
map_elem_lookup() call before accessing memory. The verifier keeps
track of this by converting the type of the result register from
PTR_TO_MAP_VALUE_OR_NULL to PTR_TO_MAP_VALUE after a conditional
jump ensures safety. This check is currently exclusively performed
for the result register 0.

In the event the compiler reorders instructions, BPF_MOV64_REG
instructions may be moved before the conditional jump, which causes
them to keep their type PTR_TO_MAP_VALUE_OR_NULL, to which the
verifier objects when the register is accessed:

0: (b7) r1 = 10
1: (7b) *(u64 *)(r10 -8) = r1
2: (bf) r2 = r10
3: (07) r2 += -8
4: (18) r1 = 0x59c00000
6: (85) call 1
7: (bf) r4 = r0
8: (15) if r0 == 0x0 goto pc+1
R0=map_value(ks=8,vs=8) R4=map_value_or_null(ks=8,vs=8) R10=fp
9: (7a) *(u64 *)(r4 +0) = 0
R4 invalid mem access 'map_value_or_null'

This commit extends the verifier to keep track of all identical
PTR_TO_MAP_VALUE_OR_NULL registers after a map_elem_lookup() by
assigning them an ID and then marking them all when the conditional
jump is observed.
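
At the C level the reordered pattern looks roughly like this (a
hypothetical illustration; my_map and key are made-up names):

	long *val = bpf_map_lookup_elem(&my_map, &key);	/* PTR_TO_MAP_VALUE_OR_NULL */
	long *copy = val;	/* compiler emits BPF_MOV64_REG before the check */
	if (!val)
		return 0;	/* the branch proves val != NULL ...             */
	*copy = 0;		/* ... but copy kept _OR_NULL without ID tracking */

With ID tracking, val and copy share the same id, so the NULL check on
val also promotes copy to PTR_TO_MAP_VALUE.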

Signed-off-by: Thomas Graf <[email protected]>
Reviewed-by: Josef Bacik <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/bpf_verifier.h | 2 -
kernel/bpf/verifier.c | 61 +++++++++++++++++++++++++++++++------------
2 files changed, 46 insertions(+), 17 deletions(-)

--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -24,13 +24,13 @@ struct bpf_reg_state {
*/
s64 min_value;
u64 max_value;
+ u32 id;
union {
/* valid when type == CONST_IMM | PTR_TO_STACK | UNKNOWN_VALUE */
s64 imm;

/* valid when type == PTR_TO_PACKET* */
struct {
- u32 id;
u16 off;
u16 range;
};
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -212,9 +212,10 @@ static void print_verifier_state(struct
else if (t == CONST_PTR_TO_MAP || t == PTR_TO_MAP_VALUE ||
t == PTR_TO_MAP_VALUE_OR_NULL ||
t == PTR_TO_MAP_VALUE_ADJ)
- verbose("(ks=%d,vs=%d)",
+ verbose("(ks=%d,vs=%d,id=%u)",
reg->map_ptr->key_size,
- reg->map_ptr->value_size);
+ reg->map_ptr->value_size,
+ reg->id);
if (reg->min_value != BPF_REGISTER_MIN_RANGE)
verbose(",min_value=%lld",
(long long)reg->min_value);
@@ -447,6 +448,7 @@ static void mark_reg_unknown_value(struc
{
BUG_ON(regno >= MAX_BPF_REG);
regs[regno].type = UNKNOWN_VALUE;
+ regs[regno].id = 0;
regs[regno].imm = 0;
}

@@ -1252,6 +1254,7 @@ static int check_call(struct bpf_verifie
return -EINVAL;
}
regs[BPF_REG_0].map_ptr = meta.map_ptr;
+ regs[BPF_REG_0].id = ++env->id_gen;
} else {
verbose("unknown return type %d of func %d\n",
fn->ret_type, func_id);
@@ -1668,8 +1671,7 @@ static int check_alu_op(struct bpf_verif
insn->src_reg);
return -EACCES;
}
- regs[insn->dst_reg].type = UNKNOWN_VALUE;
- regs[insn->dst_reg].map_ptr = NULL;
+ mark_reg_unknown_value(regs, insn->dst_reg);
}
} else {
/* case: R = imm
@@ -1931,6 +1933,38 @@ static void reg_set_min_max_inv(struct b
check_reg_overflow(true_reg);
}

+static void mark_map_reg(struct bpf_reg_state *regs, u32 regno, u32 id,
+ enum bpf_reg_type type)
+{
+ struct bpf_reg_state *reg = &regs[regno];
+
+ if (reg->type == PTR_TO_MAP_VALUE_OR_NULL && reg->id == id) {
+ reg->type = type;
+ if (type == UNKNOWN_VALUE)
+ mark_reg_unknown_value(regs, regno);
+ }
+}
+
+/* The logic is similar to find_good_pkt_pointers(), both could eventually
+ * be folded together at some point.
+ */
+static void mark_map_regs(struct bpf_verifier_state *state, u32 regno,
+ enum bpf_reg_type type)
+{
+ struct bpf_reg_state *regs = state->regs;
+ int i;
+
+ for (i = 0; i < MAX_BPF_REG; i++)
+ mark_map_reg(regs, i, regs[regno].id, type);
+
+ for (i = 0; i < MAX_BPF_STACK; i += BPF_REG_SIZE) {
+ if (state->stack_slot_type[i] != STACK_SPILL)
+ continue;
+ mark_map_reg(state->spilled_regs, i / BPF_REG_SIZE,
+ regs[regno].id, type);
+ }
+}
+
static int check_cond_jmp_op(struct bpf_verifier_env *env,
struct bpf_insn *insn, int *insn_idx)
{
@@ -2018,18 +2052,13 @@ static int check_cond_jmp_op(struct bpf_
if (BPF_SRC(insn->code) == BPF_K &&
insn->imm == 0 && (opcode == BPF_JEQ || opcode == BPF_JNE) &&
dst_reg->type == PTR_TO_MAP_VALUE_OR_NULL) {
- if (opcode == BPF_JEQ) {
- /* next fallthrough insn can access memory via
- * this register
- */
- regs[insn->dst_reg].type = PTR_TO_MAP_VALUE;
- /* branch targer cannot access it, since reg == 0 */
- mark_reg_unknown_value(other_branch->regs,
- insn->dst_reg);
- } else {
- other_branch->regs[insn->dst_reg].type = PTR_TO_MAP_VALUE;
- mark_reg_unknown_value(regs, insn->dst_reg);
- }
+ /* Mark all identical map registers in each branch as either
+ * safe or unknown depending R == 0 or R != 0 conditional.
+ */
+ mark_map_regs(this_branch, insn->dst_reg,
+ opcode == BPF_JEQ ? PTR_TO_MAP_VALUE : UNKNOWN_VALUE);
+ mark_map_regs(other_branch, insn->dst_reg,
+ opcode == BPF_JEQ ? UNKNOWN_VALUE : PTR_TO_MAP_VALUE);
} else if (BPF_SRC(insn->code) == BPF_X && opcode == BPF_JGT &&
dst_reg->type == PTR_TO_PACKET &&
regs[insn->src_reg].type == PTR_TO_PACKET_END) {


2017-03-20 17:58:42

by Greg Kroah-Hartman

Subject: [PATCH 4.9 55/93] PCI: Add comments about ROM BAR updating

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Bjorn Helgaas <[email protected]>

[ Upstream commit 0b457dde3cf8b7c76a60f8e960f21bbd4abdc416 ]

pci_update_resource() updates a hardware BAR so its address matches the
kernel's struct resource UNLESS it's a disabled ROM BAR. We only update
those when we enable the ROM.

It's not obvious from the code why ROM BARs should be handled specially.
Apparently there are Matrox devices with defective ROM BARs that read as
zero when disabled. That means that if pci_enable_rom() reads the disabled
BAR, sets PCI_ROM_ADDRESS_ENABLE (without re-inserting the address), and
writes it back, it would enable the ROM at address zero.
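
As a sketch, the naive enable sequence the comments warn against would
be (using the accessors visible in the hunk below):

	u32 rom;

	pci_read_config_dword(pdev, pdev->rom_base_reg, &rom);	/* buggy BAR reads 0 */
	rom |= PCI_ROM_ADDRESS_ENABLE;
	pci_write_config_dword(pdev, pdev->rom_base_reg, rom);	/* ROM enabled at 0! */

This is why pci_enable_rom() re-inserts the address from the struct
resource instead of trusting the value read back from the hardware.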

Add comments and references to explain why we can't make the code look more
rational.

The code changes are from 755528c860b0 ("Ignore disabled ROM resources at
setup") and 8085ce084c0f ("[PATCH] Fix PCI ROM mapping").

Link: https://lkml.org/lkml/2005/8/30/138
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/pci/rom.c | 5 +++++
drivers/pci/setup-res.c | 6 ++++++
2 files changed, 11 insertions(+)

--- a/drivers/pci/rom.c
+++ b/drivers/pci/rom.c
@@ -35,6 +35,11 @@ int pci_enable_rom(struct pci_dev *pdev)
if (res->flags & IORESOURCE_ROM_SHADOW)
return 0;

+ /*
+ * Ideally pci_update_resource() would update the ROM BAR address,
+ * and we would only set the enable bit here. But apparently some
+ * devices have buggy ROM BARs that read as zero when disabled.
+ */
pcibios_resource_to_bus(pdev->bus, &region, res);
pci_read_config_dword(pdev, pdev->rom_base_reg, &rom_addr);
rom_addr &= ~PCI_ROM_ADDRESS_MASK;
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -68,6 +68,12 @@ static void pci_std_update_resource(stru
if (resno < PCI_ROM_RESOURCE) {
reg = PCI_BASE_ADDRESS_0 + 4 * resno;
} else if (resno == PCI_ROM_RESOURCE) {
+
+ /*
+ * Apparently some Matrox devices have ROM BARs that read
+ * as zero when disabled, so don't update ROM BARs unless
+ * they're enabled. See https://lkml.org/lkml/2005/8/30/138.
+ */
if (!(res->flags & IORESOURCE_ROM_ENABLE))
return;



2017-03-20 17:58:56

by Greg Kroah-Hartman

Subject: [PATCH 4.9 77/93] ACPI / blacklist: add _REV quirks for Dell Precision 5520 and 3520

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Alex Hung <[email protected]>

[ Upstream commit 9523b9bf6dceef6b0215e90b2348cd646597f796 ]

Precision 5520 and 3520 hang at login and during suspend or reboot.

It turns out that adding them to acpi_rev_dmi_table[] helps to work
around those issues.

Signed-off-by: Alex Hung <[email protected]>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <[email protected]>

Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/acpi/blacklist.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

--- a/drivers/acpi/blacklist.c
+++ b/drivers/acpi/blacklist.c
@@ -160,6 +160,22 @@ static struct dmi_system_id acpi_rev_dmi
DMI_MATCH(DMI_PRODUCT_NAME, "XPS 13 9343"),
},
},
+ {
+ .callback = dmi_enable_rev_override,
+ .ident = "DELL Precision 5520",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+ DMI_MATCH(DMI_PRODUCT_NAME, "Precision 5520"),
+ },
+ },
+ {
+ .callback = dmi_enable_rev_override,
+ .ident = "DELL Precision 3520",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+ DMI_MATCH(DMI_PRODUCT_NAME, "Precision 3520"),
+ },
+ },
#endif
{}
};


2017-03-20 17:59:01

by Greg Kroah-Hartman

Subject: [PATCH 4.9 79/93] serial: 8250_pci: Detach low-level driver during PCI error recovery

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Gabriel Krisman Bertazi <[email protected]>

[ Upstream commit f209fa03fc9d131b3108c2e4936181eabab87416 ]

During a PCI error recovery, like the ones provoked by EEH in the ppc64
platform, all IO to the device must be blocked while the recovery is
completed. The current 8250_pci implementation only suspends the port
instead of detaching it, which doesn't prevent incoming accesses like
TIOCMGET and TIOCMSET calls from reaching the device. Those end up
racing with the EEH recovery, crashing it. Similar races were also
observed when opening the device and when shutting it down during
recovery.

This patch implements a more robust IO blockage for the 8250_pci
recovery by unregistering the port at the beginning of the procedure and
re-adding it afterwards. Since the port is detached from the uart
layer, we can be sure that no request will make it through to the device
during recovery. This is similar to the solution used by the JSM serial
driver.

I thank Peter Hurley <[email protected]> for valuable input on
this one over one year ago.

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/tty/serial/8250/8250_pci.c | 23 +++++++++++++++++++----
1 file changed, 19 insertions(+), 4 deletions(-)

--- a/drivers/tty/serial/8250/8250_pci.c
+++ b/drivers/tty/serial/8250/8250_pci.c
@@ -52,6 +52,7 @@ struct serial_private {
struct pci_dev *dev;
unsigned int nr;
struct pci_serial_quirk *quirk;
+ const struct pciserial_board *board;
int line[0];
};

@@ -3871,6 +3872,7 @@ pciserial_init_ports(struct pci_dev *dev
}
}
priv->nr = i;
+ priv->board = board;
return priv;

err_deinit:
@@ -3881,7 +3883,7 @@ err_out:
}
EXPORT_SYMBOL_GPL(pciserial_init_ports);

-void pciserial_remove_ports(struct serial_private *priv)
+void pciserial_detach_ports(struct serial_private *priv)
{
struct pci_serial_quirk *quirk;
int i;
@@ -3895,7 +3897,11 @@ void pciserial_remove_ports(struct seria
quirk = find_quirk(priv->dev);
if (quirk->exit)
quirk->exit(priv->dev);
+}

+void pciserial_remove_ports(struct serial_private *priv)
+{
+ pciserial_detach_ports(priv);
kfree(priv);
}
EXPORT_SYMBOL_GPL(pciserial_remove_ports);
@@ -5590,7 +5596,7 @@ static pci_ers_result_t serial8250_io_er
return PCI_ERS_RESULT_DISCONNECT;

if (priv)
- pciserial_suspend_ports(priv);
+ pciserial_detach_ports(priv);

pci_disable_device(dev);

@@ -5615,9 +5621,18 @@ static pci_ers_result_t serial8250_io_sl
static void serial8250_io_resume(struct pci_dev *dev)
{
struct serial_private *priv = pci_get_drvdata(dev);
+ const struct pciserial_board *board;

- if (priv)
- pciserial_resume_ports(priv);
+ if (!priv)
+ return;
+
+ board = priv->board;
+ kfree(priv);
+ priv = pciserial_init_ports(dev, board);
+
+ if (!IS_ERR(priv)) {
+ pci_set_drvdata(dev, priv);
+ }
}

static const struct pci_error_handlers serial8250_err_handler = {


2017-03-20 17:59:15

by Greg Kroah-Hartman

Subject: [PATCH 4.9 89/93] x86/perf: Fix CR4.PCE propagation to use active_mm instead of mm

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <[email protected]>

commit 5dc855d44c2ad960a86f593c60461f1ae1566b6d upstream.

If one thread mmaps a perf event while another thread in the same mm
is in some context where active_mm != mm (which can happen in the
scheduler, for example), refresh_pce() would write the wrong value
to CR4.PCE. This broke some PAPI tests.

Reported-and-tested-by: Vince Weaver <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Fixes: 7911d3f7af14 ("perf/x86: Only allow rdpmc if a perf_event is mapped")
Link: http://lkml.kernel.org/r/0c5b38a76ea50e405f9abe07a13dfaef87c173a1.1489694270.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/events/core.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2096,8 +2096,8 @@ static int x86_pmu_event_init(struct per

static void refresh_pce(void *ignored)
{
- if (current->mm)
- load_mm_cr4(current->mm);
+ if (current->active_mm)
+ load_mm_cr4(current->active_mm);
}

static void x86_pmu_event_mapped(struct perf_event *event)


2017-03-20 17:59:07

by Greg Kroah-Hartman

Subject: [PATCH 4.9 63/93] vfio/spapr: Postpone allocation of userspace version of TCE table

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Alexey Kardashevskiy <[email protected]>

[ Upstream commit 39701e56f5f16ea0cf8fc9e8472e645f8de91d23 ]

The iommu_table struct manages a hardware TCE table and a vmalloc'd
table with corresponding userspace addresses. Both are allocated when
the default DMA window is created and this happens when the very first
group is attached to a container.

As we are going to allow userspace to configure a container in one
memory context and pass the container fd to another, we have to postpone
such allocations until the container fd is passed to the destination
user process, so that the locked memory limit is accounted against the
actual container user's constraints.

This postpones the it_userspace array allocation until it is first used
for mapping. The unmapping path already checks whether the array is
allocated.

Signed-off-by: Alexey Kardashevskiy <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Acked-by: Alex Williamson <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/vfio/vfio_iommu_spapr_tce.c | 20 +++++++-------------
1 file changed, 7 insertions(+), 13 deletions(-)

--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -509,6 +509,12 @@ static long tce_iommu_build_v2(struct tc
unsigned long hpa;
enum dma_data_direction dirtmp;

+ if (!tbl->it_userspace) {
+ ret = tce_iommu_userspace_view_alloc(tbl);
+ if (ret)
+ return ret;
+ }
+
for (i = 0; i < pages; ++i) {
struct mm_iommu_table_group_mem_t *mem = NULL;
unsigned long *pua = IOMMU_TABLE_USERSPACE_ENTRY(tbl,
@@ -582,15 +588,6 @@ static long tce_iommu_create_table(struc
WARN_ON(!ret && !(*ptbl)->it_ops->free);
WARN_ON(!ret && ((*ptbl)->it_allocated_size != table_size));

- if (!ret && container->v2) {
- ret = tce_iommu_userspace_view_alloc(*ptbl);
- if (ret)
- (*ptbl)->it_ops->free(*ptbl);
- }
-
- if (ret)
- decrement_locked_vm(table_size >> PAGE_SHIFT);
-
return ret;
}

@@ -1062,10 +1059,7 @@ static int tce_iommu_take_ownership(stru
if (!tbl || !tbl->it_map)
continue;

- rc = tce_iommu_userspace_view_alloc(tbl);
- if (!rc)
- rc = iommu_take_ownership(tbl);
-
+ rc = iommu_take_ownership(tbl);
if (rc) {
for (j = 0; j < i; ++j)
iommu_release_ownership(


2017-03-20 17:59:22

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 91/93] futex: Add missing error handling to FUTEX_REQUEUE_PI

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Peter Zijlstra <[email protected]>

commit 9bbb25afeb182502ca4f2c4f3f88af0681b34cae upstream.

Thomas spotted that fixup_pi_state_owner() can return errors and we
fail to unlock the rt_mutex in that case.

Reported-by: Thomas Gleixner <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Darren Hart <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/futex.c | 2 ++
1 file changed, 2 insertions(+)

--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2896,6 +2896,8 @@ static int futex_wait_requeue_pi(u32 __u
if (q.pi_state && (q.pi_state->owner != current)) {
spin_lock(q.lock_ptr);
ret = fixup_pi_state_owner(uaddr2, &q, current);
+ if (ret && rt_mutex_owner(&q.pi_state->pi_mutex) == current)
+ rt_mutex_unlock(&q.pi_state->pi_mutex);
/*
* Drop the reference to the pi state which
* the requeue_pi() code acquired for us.


2017-03-20 17:59:26

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 90/93] futex: Fix potential use-after-free in FUTEX_REQUEUE_PI

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Peter Zijlstra <[email protected]>

commit c236c8e95a3d395b0494e7108f0d41cf36ec107c upstream.

While working on the futex code, I stumbled over this potential
use-after-free scenario. Dmitry triggered it later with syzkaller.

pi_mutex is a pointer into pi_state, which we drop the reference on in
unqueue_me_pi(). So any access to that pointer after that is bad.

Since other sites already do rt_mutex_unlock() with hb->lock held, see
for example futex_lock_pi(), simply move the unlock before
unqueue_me_pi().

Reported-by: Dmitry Vyukov <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Darren Hart <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/futex.c | 20 +++++++++++---------
1 file changed, 11 insertions(+), 9 deletions(-)

--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2813,7 +2813,6 @@ static int futex_wait_requeue_pi(u32 __u
{
struct hrtimer_sleeper timeout, *to = NULL;
struct rt_mutex_waiter rt_waiter;
- struct rt_mutex *pi_mutex = NULL;
struct futex_hash_bucket *hb;
union futex_key key2 = FUTEX_KEY_INIT;
struct futex_q q = futex_q_init;
@@ -2905,6 +2904,8 @@ static int futex_wait_requeue_pi(u32 __u
spin_unlock(q.lock_ptr);
}
} else {
+ struct rt_mutex *pi_mutex;
+
/*
* We have been woken up by futex_unlock_pi(), a timeout, or a
* signal. futex_unlock_pi() will not destroy the lock_ptr nor
@@ -2928,18 +2929,19 @@ static int futex_wait_requeue_pi(u32 __u
if (res)
ret = (res < 0) ? res : 0;

+ /*
+ * If fixup_pi_state_owner() faulted and was unable to handle
+ * the fault, unlock the rt_mutex and return the fault to
+ * userspace.
+ */
+ if (ret && rt_mutex_owner(pi_mutex) == current)
+ rt_mutex_unlock(pi_mutex);
+
/* Unqueue and drop the lock. */
unqueue_me_pi(&q);
}

- /*
- * If fixup_pi_state_owner() faulted and was unable to handle the
- * fault, unlock the rt_mutex and return the fault to userspace.
- */
- if (ret == -EFAULT) {
- if (pi_mutex && rt_mutex_owner(pi_mutex) == current)
- rt_mutex_unlock(pi_mutex);
- } else if (ret == -EINTR) {
+ if (ret == -EINTR) {
/*
* We've already been requeued, but cannot restart by calling
* futex_lock_pi() directly. We could restart this syscall, but


2017-03-20 18:02:53

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 08/93] net sched actions: decrement module reference count after table flush.

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Roman Mashak <[email protected]>


[ Upstream commit edb9d1bff4bbe19b8ae0e71b1f38732591a9eeb2 ]

When tc actions are loaded as a module and no actions have been
installed, flushing them leaves the module reference count incremented,
so the module can never be unloaded.

Following is example with GACT action:

% sudo modprobe act_gact
% lsmod
Module Size Used by
act_gact 16384 0
%
% sudo tc actions ls action gact
%
% sudo tc actions flush action gact
% lsmod
Module Size Used by
act_gact 16384 1
% sudo tc actions flush action gact
% lsmod
Module Size Used by
act_gact 16384 2
% sudo rmmod act_gact
rmmod: ERROR: Module act_gact is in use
....

After the fix:
% lsmod
Module Size Used by
act_gact 16384 0
%
% sudo tc actions add action pass index 1
% sudo tc actions add action pass index 2
% sudo tc actions add action pass index 3
% lsmod
Module Size Used by
act_gact 16384 3
%
% sudo tc actions flush action gact
% lsmod
Module Size Used by
act_gact 16384 0
%
% sudo tc actions flush action gact
% lsmod
Module Size Used by
act_gact 16384 0
% sudo rmmod act_gact
% lsmod
Module Size Used by
%

Fixes: f97017cdefef ("net-sched: Fix actions flushing")
Signed-off-by: Roman Mashak <[email protected]>
Signed-off-by: Jamal Hadi Salim <[email protected]>
Acked-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/sched/act_api.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)

--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -820,10 +820,8 @@ static int tca_action_flush(struct net *
goto out_module_put;

err = ops->walk(net, skb, &dcb, RTM_DELACTION, ops);
- if (err < 0)
+ if (err <= 0)
goto out_module_put;
- if (err == 0)
- goto noflush_out;

nla_nest_end(skb, nest);

@@ -840,7 +838,6 @@ static int tca_action_flush(struct net *
out_module_put:
module_put(ops->owner);
err_out:
-noflush_out:
kfree_skb(skb);
return err;
}
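
The shape of the fix is the classic take-reference/goto-release pattern:
once the module reference is taken, every exit, including the "nothing to
flush" case that the walk reports as 0, must funnel through module_put().
A runnable toy model (stand-in names, not the tc code itself):

#include <stdio.h>
#include <errno.h>

static int refcnt;

static int try_get(void) { refcnt++; return 1; }	/* try_module_get() stand-in */
static void put(void)    { refcnt--; }			/* module_put() stand-in */

/* a walk result of 0 means "no entries"; the bug treated that as a
 * separate exit that skipped put(), leaking one reference per flush */
static int flush_sketch(int walk_ret)
{
	int err;

	if (!try_get())
		return -ENOENT;

	err = walk_ret;
	if (err <= 0)		/* fold "error" and "nothing to do" together */
		goto out_put;

	/* ... build and send the flush notification here ... */

out_put:
	put();
	return err;
}

int main(void)
{
	flush_sketch(0);	/* flush an empty table */
	printf("refcnt after empty flush: %d\n", refcnt);	/* 0, not 1 */
	return 0;
}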


2017-03-20 18:03:02

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 03/93] net/mlx5e: Fix wrong CQE decompression

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Tariq Toukan <[email protected]>


[ Upstream commit 36154be40a28e4afaa0416da2681d80b7e2ca319 ]

In CQE compression with striding RQ, the decompression of the CQE field
wqe_counter was done with a wrong wraparound value.
This caused CQEs to be handled with a wrong pointer to the WQE (RX
descriptor), creating SKBs with wrong data that pointed to wrong (and
already consumed) strides/pages.

In striding RQ, the CQE field wqe_counter holds the stride index
instead of the WQE index. Hence, when decompressing a CQE, wqe_counter
should wrap around the number of strides in a single multi-packet WQE.

We now drop this wrap-around mask entirely in CQE decompression for
striding RQ. It is not needed there, since the CQE compression session
would break anyway on a differing wqe_id field, starting a new
compression session.

Tested:
ethtool -K ethxx lro off/on
ethtool --set-priv-flags ethxx rx_cqe_compress on
super_netperf 16 {ipv4,ipv6} -t TCP_STREAM -m 50 -D
verified no csum errors and no page refcount issues.

Fixes: 7219ab34f184 ("net/mlx5e: CQE compression")
Signed-off-by: Tariq Toukan <[email protected]>
Reported-by: Tom Herbert <[email protected]>
Cc: [email protected]
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -92,19 +92,18 @@ static inline void mlx5e_cqes_update_own
static inline void mlx5e_decompress_cqe(struct mlx5e_rq *rq,
struct mlx5e_cq *cq, u32 cqcc)
{
- u16 wqe_cnt_step;
-
cq->title.byte_cnt = cq->mini_arr[cq->mini_arr_idx].byte_cnt;
cq->title.check_sum = cq->mini_arr[cq->mini_arr_idx].checksum;
cq->title.op_own &= 0xf0;
cq->title.op_own |= 0x01 & (cqcc >> cq->wq.log_sz);
cq->title.wqe_counter = cpu_to_be16(cq->decmprs_wqe_counter);

- wqe_cnt_step =
- rq->wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ ?
- mpwrq_get_cqe_consumed_strides(&cq->title) : 1;
- cq->decmprs_wqe_counter =
- (cq->decmprs_wqe_counter + wqe_cnt_step) & rq->wq.sz_m1;
+ if (rq->wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ)
+ cq->decmprs_wqe_counter +=
+ mpwrq_get_cqe_consumed_strides(&cq->title);
+ else
+ cq->decmprs_wqe_counter =
+ (cq->decmprs_wqe_counter + 1) & rq->wq.sz_m1;
}

static inline void mlx5e_decompress_cqe_no_hash(struct mlx5e_rq *rq,
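
A small standalone illustration of the two counter styles (values
invented): a linked-list RQ advances one WQE at a time and wraps with the
power-of-two ring mask, while a striding RQ accumulates consumed strides
and must not be masked by the WQE ring size:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	const uint16_t sz_m1 = 7;	/* ring of 8 WQEs */
	uint16_t wqe_counter = 7;

	/* linked-list RQ: advance by one and wrap with the ring mask */
	wqe_counter = (wqe_counter + 1) & sz_m1;
	printf("linked-list counter wraps to: %u\n", wqe_counter);	/* 0 */

	/* striding RQ: strides accumulate unmasked; masking them with the
	 * WQE ring size is exactly the bug this patch removes */
	uint16_t stride_counter = 6;
	stride_counter += 5;
	printf("stride counter: %u\n", stride_counter);			/* 11 */
	return 0;
}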


2017-03-20 18:02:55

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 06/93] vxlan: don't allow overwrite of config src addr

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Brian Russell <[email protected]>


[ Upstream commit 1158632b5a2dcce0786c1b1b99654e81cc867981 ]

When using IPv6 transport and a default dst, a pointer to the configured
source address is passed into the route lookup. If no source address is
configured, then the value is overwritten.

IPv6 route lookup ignores egress ifindex match if the source address is set,
so if egress ifindex match is desired, the source address must be passed
as any. The overwrite breaks this for subsequent lookups.

Avoid this by copying the configured address to an existing stack
variable and passing a pointer to that instead.

Fixes: 272d96a5ab10 ("net: vxlan: lwt: Use source ip address during route lookup.")

Signed-off-by: Brian Russell <[email protected]>
Acked-by: Jiri Benc <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/vxlan.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)

--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1942,7 +1942,6 @@ static void vxlan_xmit_one(struct sk_buf
const struct iphdr *old_iph;
union vxlan_addr *dst;
union vxlan_addr remote_ip, local_ip;
- union vxlan_addr *src;
struct vxlan_metadata _md;
struct vxlan_metadata *md = &_md;
__be16 src_port = 0, dst_port;
@@ -1960,7 +1959,7 @@ static void vxlan_xmit_one(struct sk_buf
dst_port = rdst->remote_port ? rdst->remote_port : vxlan->cfg.dst_port;
vni = rdst->remote_vni;
dst = &rdst->remote_ip;
- src = &vxlan->cfg.saddr;
+ local_ip = vxlan->cfg.saddr;
dst_cache = &rdst->dst_cache;
} else {
if (!info) {
@@ -1979,7 +1978,6 @@ static void vxlan_xmit_one(struct sk_buf
local_ip.sin6.sin6_addr = info->key.u.ipv6.src;
}
dst = &remote_ip;
- src = &local_ip;
dst_cache = &info->dst_cache;
}

@@ -2028,7 +2026,7 @@ static void vxlan_xmit_one(struct sk_buf
rt = vxlan_get_route(vxlan, skb,
rdst ? rdst->remote_ifindex : 0, tos,
dst->sin.sin_addr.s_addr,
- &src->sin.sin_addr.s_addr,
+ &local_ip.sin.sin_addr.s_addr,
dst_cache, info);
if (IS_ERR(rt)) {
netdev_dbg(dev, "no route to %pI4\n",
@@ -2071,7 +2069,7 @@ static void vxlan_xmit_one(struct sk_buf
if (err < 0)
goto xmit_tx_error;

- udp_tunnel_xmit_skb(rt, sk, skb, src->sin.sin_addr.s_addr,
+ udp_tunnel_xmit_skb(rt, sk, skb, local_ip.sin.sin_addr.s_addr,
dst->sin.sin_addr.s_addr, tos, ttl, df,
src_port, dst_port, xnet, !udp_sum);
#if IS_ENABLED(CONFIG_IPV6)
@@ -2087,7 +2085,7 @@ static void vxlan_xmit_one(struct sk_buf
ndst = vxlan6_get_route(vxlan, skb,
rdst ? rdst->remote_ifindex : 0, tos,
label, &dst->sin6.sin6_addr,
- &src->sin6.sin6_addr,
+ &local_ip.sin6.sin6_addr,
dst_cache, info);
if (IS_ERR(ndst)) {
netdev_dbg(dev, "no route to %pI6\n",
@@ -2134,7 +2132,7 @@ static void vxlan_xmit_one(struct sk_buf
return;
}
udp_tunnel6_xmit_skb(ndst, sk, skb, dev,
- &src->sin6.sin6_addr,
+ &local_ip.sin6.sin6_addr,
&dst->sin6.sin6_addr, tos, ttl,
label, src_port, dst_port, !udp_sum);
#endif
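
The underlying pitfall is aliasing: handing the lookup a pointer into
long-lived configuration lets the callee's write-back corrupt that
configuration. A runnable toy model (types invented for illustration):

#include <stdio.h>

struct addr { unsigned int s_addr; };
struct cfg  { struct addr saddr; };

/* stand-in for a route lookup that writes the chosen source address
 * back through the pointer it was given (as the IPv6 path may do) */
static void lookup(struct addr *src)
{
	if (src->s_addr == 0)
		src->s_addr = 0x0100007f;	/* choice leaks back to caller */
}

int main(void)
{
	struct cfg cfg = { { 0 } };		/* source configured as "any" */

	struct addr local_ip = cfg.saddr;	/* the fix: copy first */
	lookup(&local_ip);

	/* cfg.saddr is still 0, so later lookups still match on ifindex */
	printf("config saddr after lookup: %u\n", cfg.saddr.s_addr);
	return 0;
}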


2017-03-20 18:03:08

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 07/93] ipv4: mask tos for input route

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Julian Anastasov <[email protected]>


[ Upstream commit 6e28099d38c0e50d62c1afc054e37e573adf3d21 ]

Restore the lost masking of TOS in input route code to
allow ip rules to match it properly.

Problem [1] noticed by Shmulik Ladkani <[email protected]>

[1] http://marc.info/?t=137331755300040&r=1&w=2

Fixes: 89aef8921bfb ("ipv4: Delete routing cache.")
Signed-off-by: Julian Anastasov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/route.c | 1 +
1 file changed, 1 insertion(+)

--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1968,6 +1968,7 @@ int ip_route_input_noref(struct sk_buff
{
int res;

+ tos &= IPTOS_RT_MASK;
rcu_read_lock();

/* Multicast recognition logic is moved from route cache to here.
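
For reference, IPTOS_RT_MASK keeps only the TOS bits that routing may
match on, clearing the two low ECN bits. A quick standalone check
(constants assumed to match include/uapi/linux/ip.h):

#include <stdio.h>

#define IPTOS_TOS_MASK	0x1E			/* assumed, from uapi ip.h */
#define IPTOS_RT_MASK	(IPTOS_TOS_MASK & ~3)	/* 0x1c: ECN bits dropped */

int main(void)
{
	unsigned char tos = 0x17;	/* TOS value with ECN bits set */

	printf("masked tos: 0x%02x\n", tos & IPTOS_RT_MASK);	/* 0x14 */
	return 0;
}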


2017-03-20 18:03:04

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 21/93] tcp: fix various issues for sockets morphing to listen state

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <[email protected]>


[ Upstream commit 02b2faaf0af1d85585f6d6980e286d53612acfc2 ]

Dmitry Vyukov reported a divide by 0 triggered by syzkaller, exploiting
the tcp_disconnect() path, which was never really considered and/or used
before syzkaller ;)

I was not able to reproduce the bug, but it seems the issues here are
three possible actions that assumed they would never trigger on a
listener:

1) tcp_write_timer_handler
2) tcp_delack_timer_handler
3) MTU reduction

Only IPv6 MTU reduction was properly testing the TCP_CLOSE and TCP_LISTEN
states, in tcp_v6_mtu_reduced().

Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: Dmitry Vyukov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/tcp_ipv4.c | 7 +++++--
net/ipv4/tcp_timer.c | 6 ++++--
2 files changed, 9 insertions(+), 4 deletions(-)

--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -269,10 +269,13 @@ EXPORT_SYMBOL(tcp_v4_connect);
*/
void tcp_v4_mtu_reduced(struct sock *sk)
{
- struct dst_entry *dst;
struct inet_sock *inet = inet_sk(sk);
- u32 mtu = tcp_sk(sk)->mtu_info;
+ struct dst_entry *dst;
+ u32 mtu;

+ if ((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE))
+ return;
+ mtu = tcp_sk(sk)->mtu_info;
dst = inet_csk_update_pmtu(sk, mtu);
if (!dst)
return;
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -249,7 +249,8 @@ void tcp_delack_timer_handler(struct soc

sk_mem_reclaim_partial(sk);

- if (sk->sk_state == TCP_CLOSE || !(icsk->icsk_ack.pending & ICSK_ACK_TIMER))
+ if (((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN)) ||
+ !(icsk->icsk_ack.pending & ICSK_ACK_TIMER))
goto out;

if (time_after(icsk->icsk_ack.timeout, jiffies)) {
@@ -552,7 +553,8 @@ void tcp_write_timer_handler(struct sock
struct inet_connection_sock *icsk = inet_csk(sk);
int event;

- if (sk->sk_state == TCP_CLOSE || !icsk->icsk_pending)
+ if (((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN)) ||
+ !icsk->icsk_pending)
goto out;

if (time_after(icsk->icsk_timeout, jiffies)) {
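
The `(1 << sk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN)` idiom works because
each TCPF_* flag is defined as 1 shifted by the corresponding TCP_* state,
so a whole set of states is tested with a single AND. A standalone sketch
(state values quoted from memory of include/net/tcp_states.h):

#include <stdio.h>

enum { TCP_ESTABLISHED = 1, TCP_CLOSE = 7, TCP_LISTEN = 10 };

#define TCPF_CLOSE	(1 << TCP_CLOSE)
#define TCPF_LISTEN	(1 << TCP_LISTEN)

static int timer_should_bail(int sk_state)
{
	return !!((1 << sk_state) & (TCPF_CLOSE | TCPF_LISTEN));
}

int main(void)
{
	printf("LISTEN bails: %d\n", timer_should_bail(TCP_LISTEN));		/* 1 */
	printf("ESTABLISHED bails: %d\n", timer_should_bail(TCP_ESTABLISHED));	/* 0 */
	return 0;
}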


2017-03-20 18:02:49

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 20/93] strparser: destroy workqueue on module exit

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: WANG Cong <[email protected]>


[ Upstream commit f78ef7cd9a0686b979679d0de061c6dbfd8d649e ]

Fixes: 43a0c6751a32 ("strparser: Stream parser for messages")
Cc: Tom Herbert <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/strparser/strparser.c | 1 +
1 file changed, 1 insertion(+)

--- a/net/strparser/strparser.c
+++ b/net/strparser/strparser.c
@@ -504,6 +504,7 @@ static int __init strp_mod_init(void)

static void __exit strp_mod_exit(void)
{
+ destroy_workqueue(strp_wq);
}
module_init(strp_mod_init);
module_exit(strp_mod_exit);
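
The rule being enforced is simple pairing: a workqueue created at module
init must be destroyed at module exit, otherwise its worker infrastructure
outlives the module. A minimal sketch of that pairing (a hypothetical
module, not the strparser code):

#include <linux/module.h>
#include <linux/workqueue.h>

static struct workqueue_struct *demo_wq;

static int __init demo_init(void)
{
	demo_wq = create_singlethread_workqueue("demo");
	return demo_wq ? 0 : -ENOMEM;
}

static void __exit demo_exit(void)
{
	destroy_workqueue(demo_wq);	/* flushes queued work, then frees */
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");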


2017-03-20 18:04:58

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 04/93] vxlan: correctly validate VXLAN ID against VXLAN_N_VID

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Matthias Schiffer <[email protected]>


[ Upstream commit 4e37d6911f36545b286d15073f6f2222f840e81c ]

The incorrect check caused an off-by-one error: the maximum VID 0xffffff
was unusable.

Fixes: d342894c5d2f ("vxlan: virtual extensible lan")
Signed-off-by: Matthias Schiffer <[email protected]>
Acked-by: Jiri Benc <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/vxlan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2637,7 +2637,7 @@ static int vxlan_validate(struct nlattr

if (data[IFLA_VXLAN_ID]) {
__u32 id = nla_get_u32(data[IFLA_VXLAN_ID]);
- if (id >= VXLAN_VID_MASK)
+ if (id >= VXLAN_N_VID)
return -ERANGE;
}
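
A standalone check of the boundary (macro definitions assumed to match
include/net/vxlan.h, with VXLAN_VID_MASK == VXLAN_N_VID - 1):

#include <stdio.h>

#define VXLAN_N_VID	(1u << 24)		/* IDs 0 .. 0xffffff are valid */
#define VXLAN_VID_MASK	(VXLAN_N_VID - 1)	/* 0xffffff */

int main(void)
{
	unsigned int id = 0xffffff;	/* the maximum VNI */

	printf("old check rejects max VNI: %d\n", id >= VXLAN_VID_MASK);	/* 1 */
	printf("new check rejects max VNI: %d\n", id >= VXLAN_N_VID);		/* 0 */
	return 0;
}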



2017-03-20 18:05:16

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 17/93] net: net_enable_timestamp() can be called from irq contexts

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <[email protected]>


[ Upstream commit 13baa00ad01bb3a9f893e3a08cbc2d072fc0c15d ]

It is now very clear that silly TCP listeners might play with
enabling/disabling timestamping while new children are added
to their accept queue.

This means net_enable_timestamp() can be called from BH context
while the current state of the static key is not enabled.

Let's play it safe and allow all contexts.

The work queue is scheduled only in the problematic cases, i.e. the
static key enable/disable transitions, so that critical paths are not
slowed down.

This extends and improves what we did in commit 5fa8bbda38c6 ("net: use
a work queue to defer net_disable_timestamp() work")

Fixes: b90e5794c5bd ("net: dont call jump_label_dec from irq context")
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: Dmitry Vyukov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/core/dev.c | 35 +++++++++++++++++++++++++++++++----
1 file changed, 31 insertions(+), 4 deletions(-)

--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1697,27 +1697,54 @@ EXPORT_SYMBOL_GPL(net_dec_egress_queue);
static struct static_key netstamp_needed __read_mostly;
#ifdef HAVE_JUMP_LABEL
static atomic_t netstamp_needed_deferred;
+static atomic_t netstamp_wanted;
static void netstamp_clear(struct work_struct *work)
{
int deferred = atomic_xchg(&netstamp_needed_deferred, 0);
+ int wanted;

- while (deferred--)
- static_key_slow_dec(&netstamp_needed);
+ wanted = atomic_add_return(deferred, &netstamp_wanted);
+ if (wanted > 0)
+ static_key_enable(&netstamp_needed);
+ else
+ static_key_disable(&netstamp_needed);
}
static DECLARE_WORK(netstamp_work, netstamp_clear);
#endif

void net_enable_timestamp(void)
{
+#ifdef HAVE_JUMP_LABEL
+ int wanted;
+
+ while (1) {
+ wanted = atomic_read(&netstamp_wanted);
+ if (wanted <= 0)
+ break;
+ if (atomic_cmpxchg(&netstamp_wanted, wanted, wanted + 1) == wanted)
+ return;
+ }
+ atomic_inc(&netstamp_needed_deferred);
+ schedule_work(&netstamp_work);
+#else
static_key_slow_inc(&netstamp_needed);
+#endif
}
EXPORT_SYMBOL(net_enable_timestamp);

void net_disable_timestamp(void)
{
#ifdef HAVE_JUMP_LABEL
- /* net_disable_timestamp() can be called from non process context */
- atomic_inc(&netstamp_needed_deferred);
+ int wanted;
+
+ while (1) {
+ wanted = atomic_read(&netstamp_wanted);
+ if (wanted <= 1)
+ break;
+ if (atomic_cmpxchg(&netstamp_wanted, wanted, wanted - 1) == wanted)
+ return;
+ }
+ atomic_dec(&netstamp_needed_deferred);
schedule_work(&netstamp_work);
#else
static_key_slow_dec(&netstamp_needed);
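
The heart of the patch is an "increment only while positive" cmpxchg
loop: when the counter is already positive the static key needs no
flipping, so the operation completes lock-free in any context, and only
the 0 <-> 1 transitions are deferred to the workqueue. A user-space model
of the pattern with C11 atomics (a sketch, not the kernel code):

#include <stdatomic.h>
#include <stdio.h>

static atomic_int wanted;

/* returns 1 if the enable completed lock-free (key already on),
 * 0 if the 0 -> 1 transition must be deferred to process context */
static int try_fast_enable(void)
{
	int w = atomic_load(&wanted);

	while (w > 0) {
		/* on failure, w is reloaded with the current value */
		if (atomic_compare_exchange_weak(&wanted, &w, w + 1))
			return 1;
	}
	return 0;
}

int main(void)
{
	printf("first enable fast-pathed: %d\n", try_fast_enable());	/* 0 */
	atomic_store(&wanted, 1);	/* pretend the deferred work ran */
	printf("second enable fast-pathed: %d\n", try_fast_enable());	/* 1 */
	return 0;
}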


2017-03-20 18:06:06

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 22/93] net: fix socket refcounting in skb_complete_wifi_ack()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <[email protected]>


[ Upstream commit dd4f10722aeb10f4f582948839f066bebe44e5fb ]

TX skbs do not necessarily hold a reference on skb->sk->sk_refcnt
By the time TX completion happens, sk_refcnt might be already 0.

sock_hold()/sock_put() would then corrupt critical state, like
sk_wmem_alloc.

Fixes: bf7fa551e0ce ("mac80211: Resolve sk_refcnt/sk_wmem_alloc issue in wifi ack path")
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Alexander Duyck <[email protected]>
Cc: Johannes Berg <[email protected]>
Cc: Soheil Hassas Yeganeh <[email protected]>
Cc: Willem de Bruijn <[email protected]>
Acked-by: Soheil Hassas Yeganeh <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/core/skbuff.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)

--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3871,7 +3871,7 @@ void skb_complete_wifi_ack(struct sk_buf
{
struct sock *sk = skb->sk;
struct sock_exterr_skb *serr;
- int err;
+ int err = 1;

skb->wifi_acked_valid = 1;
skb->wifi_acked = acked;
@@ -3881,14 +3881,15 @@ void skb_complete_wifi_ack(struct sk_buf
serr->ee.ee_errno = ENOMSG;
serr->ee.ee_origin = SO_EE_ORIGIN_TXSTATUS;

- /* take a reference to prevent skb_orphan() from freeing the socket */
- sock_hold(sk);
-
- err = sock_queue_err_skb(sk, skb);
+ /* Take a reference to prevent skb_orphan() from freeing the socket,
+ * but only if the socket refcount is not zero.
+ */
+ if (likely(atomic_inc_not_zero(&sk->sk_refcnt))) {
+ err = sock_queue_err_skb(sk, skb);
+ sock_put(sk);
+ }
if (err)
kfree_skb(skb);
-
- sock_put(sk);
}
EXPORT_SYMBOL_GPL(skb_complete_wifi_ack);
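
atomic_inc_not_zero() is the canonical "take a reference only if one
still exists" primitive: it refuses to resurrect an object whose count
already reached zero. A user-space model of its semantics with C11
atomics:

#include <stdatomic.h>
#include <stdio.h>

/* take a reference only if at least one still exists; never move 0 -> 1 */
static int inc_not_zero(atomic_int *ref)
{
	int v = atomic_load(ref);

	while (v != 0) {
		if (atomic_compare_exchange_weak(ref, &v, v + 1))
			return 1;	/* got a reference */
	}
	return 0;			/* object already dying */
}

int main(void)
{
	atomic_int live = ATOMIC_VAR_INIT(1);
	atomic_int dead = ATOMIC_VAR_INIT(0);

	printf("live socket: %d, dead socket: %d\n",
	       inc_not_zero(&live), inc_not_zero(&dead));	/* 1, 0 */
	return 0;
}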



2017-03-20 18:06:37

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 19/93] dccp: Unlock sock before calling sk_free()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Arnaldo Carvalho de Melo <[email protected]>


[ Upstream commit d5afb6f9b6bb2c57bd0c05e76e12489dc0d037d9 ]

The code where sk_clone() came from created a new socket and locked it,
but then, on the error path, didn't unlock it.

This problem stayed there for a long while, till b0691c8ee7c2 ("net:
Unlock sock before calling sk_free()") fixed it, but unfortunately the
callers of sk_clone() (now sk_clone_locked()) were not audited and the
one in dccp_create_openreq_child() remained.

Now in the age of the syzkaller fuzzer, this was finally uncovered, as
reported by Dmitry:

---- 8< ----

I've got the following report while running syzkaller fuzzer on
86292b33d4b7 ("Merge branch 'akpm' (patches from Andrew)")

[ BUG: held lock freed! ]
4.10.0+ #234 Not tainted
-------------------------
syz-executor6/6898 is freeing memory
ffff88006286cac0-ffff88006286d3b7, with a lock still held there!
(slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>] spin_lock
include/linux/spinlock.h:299 [inline]
(slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>]
sk_clone_lock+0x3d9/0x12c0 net/core/sock.c:1504
5 locks held by syz-executor6/6898:
#0: (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff839a34b4>] lock_sock
include/net/sock.h:1460 [inline]
#0: (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff839a34b4>]
inet_stream_connect+0x44/0xa0 net/ipv4/af_inet.c:681
#1: (rcu_read_lock){......}, at: [<ffffffff83bc1c2a>]
inet6_csk_xmit+0x12a/0x5d0 net/ipv6/inet6_connection_sock.c:126
#2: (rcu_read_lock){......}, at: [<ffffffff8369b424>] __skb_unlink
include/linux/skbuff.h:1767 [inline]
#2: (rcu_read_lock){......}, at: [<ffffffff8369b424>] __skb_dequeue
include/linux/skbuff.h:1783 [inline]
#2: (rcu_read_lock){......}, at: [<ffffffff8369b424>]
process_backlog+0x264/0x730 net/core/dev.c:4835
#3: (rcu_read_lock){......}, at: [<ffffffff83aeb5c0>]
ip6_input_finish+0x0/0x1700 net/ipv6/ip6_input.c:59
#4: (slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>] spin_lock
include/linux/spinlock.h:299 [inline]
#4: (slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>]
sk_clone_lock+0x3d9/0x12c0 net/core/sock.c:1504

Fix it just like was done by b0691c8ee7c2 ("net: Unlock sock before calling
sk_free()").

Reported-by: Dmitry Vyukov <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Gerrit Renker <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/dccp/minisocks.c | 1 +
1 file changed, 1 insertion(+)

--- a/net/dccp/minisocks.c
+++ b/net/dccp/minisocks.c
@@ -122,6 +122,7 @@ struct sock *dccp_create_openreq_child(c
/* It is still raw copy of parent, so invalidate
* destructor and make plain sk_free() */
newsk->sk_destruct = NULL;
+ bh_unlock_sock(newsk);
sk_free(newsk);
return NULL;
}


2017-03-20 18:06:34

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 05/93] vti6: return GRE_KEY for vti6

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Forster <[email protected]>


[ Upstream commit 7dcdf941cdc96692ab99fd790c8cc68945514851 ]

Align vti6 with vti by returning the GRE_KEY flag. This enables iproute2
to display tunnel keys with "ip -6 tunnel show".

Signed-off-by: David Forster <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv6/ip6_vti.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -691,6 +691,10 @@ vti6_parm_to_user(struct ip6_tnl_parm2 *
u->link = p->link;
u->i_key = p->i_key;
u->o_key = p->o_key;
+ if (u->i_key)
+ u->i_flags |= GRE_KEY;
+ if (u->o_key)
+ u->o_flags |= GRE_KEY;
u->proto = p->proto;

memcpy(u->name, p->name, sizeof(u->name));


2017-03-20 18:07:28

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 02/93] net/mlx5e: Do not reduce LRO WQE size when not using build_skb

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Tariq Toukan <[email protected]>


[ Upstream commit 4078e637c12f1e0a74293f1ec9563f42bff14a03 ]

When rq_type is striding RQ, no SKB_RESERVE room is needed,
as SKB allocation is not done via build_skb().

Fixes: e4b85508072b ("net/mlx5e: Slightly reduce hardware LRO size")
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -81,6 +81,7 @@ static bool mlx5e_check_fragmented_strid
static void mlx5e_set_rq_type_params(struct mlx5e_priv *priv, u8 rq_type)
{
priv->params.rq_wq_type = rq_type;
+ priv->params.lro_wqe_sz = MLX5E_PARAMS_DEFAULT_LRO_WQE_SZ;
switch (priv->params.rq_wq_type) {
case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
priv->params.log_rq_size = MLX5E_PARAMS_DEFAULT_LOG_RQ_SIZE_MPW;
@@ -92,6 +93,10 @@ static void mlx5e_set_rq_type_params(str
break;
default: /* MLX5_WQ_TYPE_LINKED_LIST */
priv->params.log_rq_size = MLX5E_PARAMS_DEFAULT_LOG_RQ_SIZE;
+
+ /* Extra room needed for build_skb */
+ priv->params.lro_wqe_sz -= MLX5_RX_HEADROOM +
+ SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
}
priv->params.min_rx_wqes = mlx5_min_rx_wqes(priv->params.rq_wq_type,
BIT(priv->params.log_rq_size));
@@ -3473,12 +3478,6 @@ static void mlx5e_build_nic_netdev_priv(
mlx5e_build_default_indir_rqt(mdev, priv->params.indirection_rqt,
MLX5E_INDIR_RQT_SIZE, profile->max_nch(mdev));

- priv->params.lro_wqe_sz =
- MLX5E_PARAMS_DEFAULT_LRO_WQE_SZ -
- /* Extra room needed for build_skb */
- MLX5_RX_HEADROOM -
- SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
-
/* Initialize pflags */
MLX5E_SET_PRIV_FLAG(priv, MLX5E_PFLAG_RX_CQE_BASED_MODER,
priv->params.rx_cq_period_mode == MLX5_CQ_PERIOD_MODE_START_FROM_CQE);


2017-03-20 18:07:26

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 18/93] ipv6: orphan skbs in reassembly unit

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <[email protected]>


[ Upstream commit 48cac18ecf1de82f76259a54402c3adb7839ad01 ]

Andrey reported a use-after-free in IPv6 stack.

Issue here is that we free the socket while it still has skb
in TX path and in some queues.

It happens here because IPv6 reassembly unit messes skb->truesize,
breaking skb_set_owner_w() badly.

We fixed a similar issue for IPV4 in commit 8282f27449bf ("inet: frag:
Always orphan skbs inside ip_defrag()")
Acked-by: Joe Stringer <[email protected]>

==================================================================
BUG: KASAN: use-after-free in sock_wfree+0x118/0x120
Read of size 8 at addr ffff880062da0060 by task a.out/4140

page:ffffea00018b6800 count:1 mapcount:0 mapping: (null)
index:0x0 compound_mapcount: 0
flags: 0x100000000008100(slab|head)
raw: 0100000000008100 0000000000000000 0000000000000000 0000000180130013
raw: dead000000000100 dead000000000200 ffff88006741f140 0000000000000000
page dumped because: kasan: bad access detected

CPU: 0 PID: 4140 Comm: a.out Not tainted 4.10.0-rc3+ #59
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:15
dump_stack+0x292/0x398 lib/dump_stack.c:51
describe_address mm/kasan/report.c:262
kasan_report_error+0x121/0x560 mm/kasan/report.c:370
kasan_report mm/kasan/report.c:392
__asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:413
sock_flag ./arch/x86/include/asm/bitops.h:324
sock_wfree+0x118/0x120 net/core/sock.c:1631
skb_release_head_state+0xfc/0x250 net/core/skbuff.c:655
skb_release_all+0x15/0x60 net/core/skbuff.c:668
__kfree_skb+0x15/0x20 net/core/skbuff.c:684
kfree_skb+0x16e/0x4e0 net/core/skbuff.c:705
inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
inet_frag_put ./include/net/inet_frag.h:133
nf_ct_frag6_gather+0x1125/0x38b0 net/ipv6/netfilter/nf_conntrack_reasm.c:617
ipv6_defrag+0x21b/0x350 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
nf_hook_entry_hookfn ./include/linux/netfilter.h:102
nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
nf_hook ./include/linux/netfilter.h:212
__ip6_local_out+0x52c/0xaf0 net/ipv6/output_core.c:160
ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722
ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742
rawv6_push_pending_frames net/ipv6/raw.c:613
rawv6_sendmsg+0x2cff/0x4130 net/ipv6/raw.c:927
inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
sock_sendmsg_nosec net/socket.c:635
sock_sendmsg+0xca/0x110 net/socket.c:645
sock_write_iter+0x326/0x620 net/socket.c:848
new_sync_write fs/read_write.c:499
__vfs_write+0x483/0x760 fs/read_write.c:512
vfs_write+0x187/0x530 fs/read_write.c:560
SYSC_write fs/read_write.c:607
SyS_write+0xfb/0x230 fs/read_write.c:599
entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203
RIP: 0033:0x7ff26e6f5b79
RSP: 002b:00007ff268e0ed98 EFLAGS: 00000206 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007ff268e0f9c0 RCX: 00007ff26e6f5b79
RDX: 0000000000000010 RSI: 0000000020f50fe1 RDI: 0000000000000003
RBP: 00007ff26ebc1220 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
R13: 00007ff268e0f9c0 R14: 00007ff26efec040 R15: 0000000000000003

The buggy address belongs to the object at ffff880062da0000
which belongs to the cache RAWv6 of size 1504
The buggy address ffff880062da0060 is located 96 bytes inside
of 1504-byte region [ffff880062da0000, ffff880062da05e0)

Freed by task 4113:
save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
save_stack+0x43/0xd0 mm/kasan/kasan.c:502
set_track mm/kasan/kasan.c:514
kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
slab_free_hook mm/slub.c:1352
slab_free_freelist_hook mm/slub.c:1374
slab_free mm/slub.c:2951
kmem_cache_free+0xb2/0x2c0 mm/slub.c:2973
sk_prot_free net/core/sock.c:1377
__sk_destruct+0x49c/0x6e0 net/core/sock.c:1452
sk_destruct+0x47/0x80 net/core/sock.c:1460
__sk_free+0x57/0x230 net/core/sock.c:1468
sk_free+0x23/0x30 net/core/sock.c:1479
sock_put ./include/net/sock.h:1638
sk_common_release+0x31e/0x4e0 net/core/sock.c:2782
rawv6_close+0x54/0x80 net/ipv6/raw.c:1214
inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425
inet6_release+0x50/0x70 net/ipv6/af_inet6.c:431
sock_release+0x8d/0x1e0 net/socket.c:599
sock_close+0x16/0x20 net/socket.c:1063
__fput+0x332/0x7f0 fs/file_table.c:208
____fput+0x15/0x20 fs/file_table.c:244
task_work_run+0x19b/0x270 kernel/task_work.c:116
exit_task_work ./include/linux/task_work.h:21
do_exit+0x186b/0x2800 kernel/exit.c:839
do_group_exit+0x149/0x420 kernel/exit.c:943
SYSC_exit_group kernel/exit.c:954
SyS_exit_group+0x1d/0x20 kernel/exit.c:952
entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203

Allocated by task 4115:
save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
save_stack+0x43/0xd0 mm/kasan/kasan.c:502
set_track mm/kasan/kasan.c:514
kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:544
slab_post_alloc_hook mm/slab.h:432
slab_alloc_node mm/slub.c:2708
slab_alloc mm/slub.c:2716
kmem_cache_alloc+0x1af/0x250 mm/slub.c:2721
sk_prot_alloc+0x65/0x2a0 net/core/sock.c:1334
sk_alloc+0x105/0x1010 net/core/sock.c:1396
inet6_create+0x44d/0x1150 net/ipv6/af_inet6.c:183
__sock_create+0x4f6/0x880 net/socket.c:1199
sock_create net/socket.c:1239
SYSC_socket net/socket.c:1269
SyS_socket+0xf9/0x230 net/socket.c:1249
entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203

Memory state around the buggy address:
ffff880062d9ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff880062d9ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff880062da0000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff880062da0080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff880062da0100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

Reported-by: Andrey Konovalov <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv6/netfilter/nf_conntrack_reasm.c | 1 +
net/openvswitch/conntrack.c | 1 -
2 files changed, 1 insertion(+), 1 deletion(-)

--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -589,6 +589,7 @@ int nf_ct_frag6_gather(struct net *net,
hdr = ipv6_hdr(skb);
fhdr = (struct frag_hdr *)skb_transport_header(skb);

+ skb_orphan(skb);
fq = fq_find(net, fhdr->identification, user, &hdr->saddr, &hdr->daddr,
skb->dev ? skb->dev->ifindex : 0, ip6_frag_ecn(hdr));
if (fq == NULL) {
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -367,7 +367,6 @@ static int handle_fragments(struct net *
} else if (key->eth.type == htons(ETH_P_IPV6)) {
enum ip6_defrag_users user = IP6_DEFRAG_CONNTRACK_IN + zone;

- skb_orphan(skb);
memset(IP6CB(skb), 0, sizeof(struct inet6_skb_parm));
err = nf_ct_frag6_gather(net, skb, user);
if (err) {


2017-03-20 17:59:35

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 92/93] locking/rwsem: Fix down_write_killable() for CONFIG_RWSEM_GENERIC_SPINLOCK=y

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Niklas Cassel <[email protected]>

commit 17fcbd590d0c3e35bd9646e2215f86586378bc42 upstream.

We hang if SIGKILL has been sent, but the task is stuck in down_read()
(after do_exit()), even though no task is doing down_write() on the
rwsem in question:

INFO: task libupnp:21868 blocked for more than 120 seconds.
libupnp D 0 21868 1 0x08100008
...
Call Trace:
__schedule()
schedule()
__down_read()
do_exit()
do_group_exit()
__wake_up_parent()

This bug has already been fixed for CONFIG_RWSEM_XCHGADD_ALGORITHM=y in
the following commit:

04cafed7fc19 ("locking/rwsem: Fix down_write_killable()")

... however, this bug also exists for CONFIG_RWSEM_GENERIC_SPINLOCK=y.

Signed-off-by: Niklas Cassel <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Niklas Cassel <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Fixes: d47996082f52 ("locking/rwsem: Introduce basis for down_write_killable()")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/locking/rwsem-spinlock.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)

--- a/kernel/locking/rwsem-spinlock.c
+++ b/kernel/locking/rwsem-spinlock.c
@@ -216,10 +216,8 @@ int __sched __down_write_common(struct r
*/
if (sem->count == 0)
break;
- if (signal_pending_state(state, current)) {
- ret = -EINTR;
- goto out;
- }
+ if (signal_pending_state(state, current))
+ goto out_nolock;
set_task_state(tsk, state);
raw_spin_unlock_irqrestore(&sem->wait_lock, flags);
schedule();
@@ -227,12 +225,19 @@ int __sched __down_write_common(struct r
}
/* got the lock */
sem->count = -1;
-out:
list_del(&waiter.list);

raw_spin_unlock_irqrestore(&sem->wait_lock, flags);

return ret;
+
+out_nolock:
+ list_del(&waiter.list);
+ if (!list_empty(&sem->wait_list))
+ __rwsem_do_wake(sem, 1);
+ raw_spin_unlock_irqrestore(&sem->wait_lock, flags);
+
+ return -EINTR;
}

void __sched __down_write(struct rw_semaphore *sem)


2017-03-20 18:20:22

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 88/93] x86/kasan: Fix boot with KASAN=y and PROFILE_ANNOTATED_BRANCHES=y

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andrey Ryabinin <[email protected]>

commit be3606ff739d1c1be36389f8737c577ad87e1f57 upstream.

The kernel doesn't boot with both PROFILE_ANNOTATED_BRANCHES=y and KASAN=y
options selected. With branch profiling enabled we end up calling
ftrace_likely_update() before kasan_early_init(). ftrace_likely_update() is
built with KASAN instrumentation, so calling it before KASAN has been
initialized leads to a crash.

Use the DISABLE_BRANCH_PROFILING define to make sure that we don't call
ftrace_likely_update() from early code before kasan_early_init().

Fixes: ef7f0d6a6ca8 ("x86_64: add KASan support")
Reported-by: Fengguang Wu <[email protected]>
Signed-off-by: Andrey Ryabinin <[email protected]>
Cc: [email protected]
Cc: Alexander Potapenko <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: [email protected]
Cc: Dmitry Vyukov <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kernel/head64.c | 1 +
arch/x86/mm/kasan_init_64.c | 1 +
2 files changed, 2 insertions(+)

--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -4,6 +4,7 @@
* Copyright (C) 2000 Andrea Arcangeli <[email protected]> SuSE
*/

+#define DISABLE_BRANCH_PROFILING
#include <linux/init.h>
#include <linux/linkage.h>
#include <linux/types.h>
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -1,3 +1,4 @@
+#define DISABLE_BRANCH_PROFILING
#define pr_fmt(fmt) "kasan: " fmt
#include <linux/bootmem.h>
#include <linux/kasan.h>


2017-03-20 18:20:37

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 93/93] crypto: powerpc - Fix initialisation of crc32c context

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Daniel Axtens <[email protected]>

commit aa2be9b3d6d2d699e9ca7cbfc00867c80e5da213 upstream.

Turning on crypto self-tests on a POWER8 shows:

alg: hash: Test 1 failed for crc32c-vpmsum
00000000: ff ff ff ff

Comparing the code with the Intel CRC32c implementation on which
ours is based shows that we are doing an init with 0, not ~0
as CRC32c requires.

This probably wasn't caught because btrfs does its own weird
open-coded initialisation.

Initialise our internal context to ~0 on init.

This makes the self-tests pass, and btrfs continues to work.

Fixes: 6dd7a82cc54e ("crypto: powerpc - Add POWER8 optimised crc32c")
Cc: Anton Blanchard <[email protected]>
Signed-off-by: Daniel Axtens <[email protected]>
Acked-by: Anton Blanchard <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/powerpc/crypto/crc32c-vpmsum_glue.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/powerpc/crypto/crc32c-vpmsum_glue.c
+++ b/arch/powerpc/crypto/crc32c-vpmsum_glue.c
@@ -52,7 +52,7 @@ static int crc32c_vpmsum_cra_init(struct
{
u32 *key = crypto_tfm_ctx(tfm);

- *key = 0;
+ *key = ~0;

return 0;
}
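
Both inversions are part of the CRC32c definition: seed the running CRC
with all ones and invert again at the end. A self-contained bitwise
implementation for comparison (reflected polynomial 0x82F63B78; the
classic check value for "123456789" should come out 0xe3069283):

#include <stdint.h>
#include <stdio.h>

/* bitwise CRC32c, reflected polynomial 0x82F63B78 */
static uint32_t crc32c(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint32_t crc = ~0u;	/* the fix: initialise to ~0, not 0 */

	while (len--) {
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & -(crc & 1u));
	}
	return ~crc;		/* final inversion, also part of the spec */
}

int main(void)
{
	printf("crc32c(\"123456789\") = 0x%08x\n", crc32c("123456789", 9));
	return 0;
}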


2017-03-20 18:21:09

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 56/93] PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Bjorn Helgaas <[email protected]>

[ Upstream commit 7a6d312b50e63f598f5b5914c4fd21878ac2b595 ]

Remove the assumption that IORESOURCE_ROM_ENABLE == PCI_ROM_ADDRESS_ENABLE.
PCI_ROM_ADDRESS_ENABLE is the ROM enable bit defined by the PCI spec, so if
we're reading or writing a BAR register value, that's what we should use.
IORESOURCE_ROM_ENABLE is a corresponding bit in struct resource flags.

Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/pci/probe.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -227,7 +227,8 @@ int __pci_read_base(struct pci_dev *dev,
mask64 = (u32)PCI_BASE_ADDRESS_MEM_MASK;
}
} else {
- res->flags |= (l & IORESOURCE_ROM_ENABLE);
+ if (l & PCI_ROM_ADDRESS_ENABLE)
+ res->flags |= IORESOURCE_ROM_ENABLE;
l64 = l & PCI_ROM_ADDRESS_MASK;
sz64 = sz & PCI_ROM_ADDRESS_MASK;
mask64 = (u32)PCI_ROM_ADDRESS_MASK;
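
The point generalizes: a register-layout constant (PCI_ROM_ADDRESS_ENABLE)
and a struct-resource flag (IORESOURCE_ROM_ENABLE) live in independent
namespaces, so even where their values coincide today, the translation
should be spelled out. A toy model (flag values illustrative only):

#include <stdio.h>

#define PCI_ROM_ADDRESS_ENABLE	0x01		/* bit in the ROM BAR (PCI spec) */
#define IORESOURCE_ROM_ENABLE	(1 << 12)	/* illustrative flag value only */

int main(void)
{
	unsigned int l = 0x000c0001;	/* BAR read-back with enable set */
	unsigned long flags = 0;

	/* explicit translation instead of OR-ing the raw register bit */
	if (l & PCI_ROM_ADDRESS_ENABLE)
		flags |= IORESOURCE_ROM_ENABLE;

	printf("resource flags: %#lx\n", flags);
	return 0;
}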


2017-03-20 18:21:28

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 64/93] powerpc/iommu: Pass mm_struct to init/cleanup helpers

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Alexey Kardashevskiy <[email protected]>

[ Upstream commit 88f54a3581eb9deaa3bd1aade40aef266d782385 ]

We are going to get rid of @current references in mmu_context_book3s64.c
and cache mm_struct in the VFIO container. Since mm_context_t does not
have reference counting, we will be using mm_struct, which does have
a reference counter.

This changes mm_iommu_init/mm_iommu_cleanup to receive mm_struct rather
than mm_context_t (which is embedded into mm).

This should not cause any behavioral change.

Signed-off-by: Alexey Kardashevskiy <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/powerpc/include/asm/mmu_context.h | 4 ++--
arch/powerpc/kernel/setup-common.c | 2 +-
arch/powerpc/mm/mmu_context_book3s64.c | 4 ++--
arch/powerpc/mm/mmu_context_iommu.c | 9 +++++----
4 files changed, 10 insertions(+), 9 deletions(-)

--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -23,8 +23,8 @@ extern bool mm_iommu_preregistered(void)
extern long mm_iommu_get(unsigned long ua, unsigned long entries,
struct mm_iommu_table_group_mem_t **pmem);
extern long mm_iommu_put(struct mm_iommu_table_group_mem_t *mem);
-extern void mm_iommu_init(mm_context_t *ctx);
-extern void mm_iommu_cleanup(mm_context_t *ctx);
+extern void mm_iommu_init(struct mm_struct *mm);
+extern void mm_iommu_cleanup(struct mm_struct *mm);
extern struct mm_iommu_table_group_mem_t *mm_iommu_lookup(unsigned long ua,
unsigned long size);
extern struct mm_iommu_table_group_mem_t *mm_iommu_find(unsigned long ua,
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -915,7 +915,7 @@ void __init setup_arch(char **cmdline_p)
init_mm.context.pte_frag = NULL;
#endif
#ifdef CONFIG_SPAPR_TCE_IOMMU
- mm_iommu_init(&init_mm.context);
+ mm_iommu_init(&init_mm);
#endif
irqstack_early_init();
exc_lvl_early_init();
--- a/arch/powerpc/mm/mmu_context_book3s64.c
+++ b/arch/powerpc/mm/mmu_context_book3s64.c
@@ -115,7 +115,7 @@ int init_new_context(struct task_struct
mm->context.pte_frag = NULL;
#endif
#ifdef CONFIG_SPAPR_TCE_IOMMU
- mm_iommu_init(&mm->context);
+ mm_iommu_init(mm);
#endif
return 0;
}
@@ -160,7 +160,7 @@ static inline void destroy_pagetable_pag
void destroy_context(struct mm_struct *mm)
{
#ifdef CONFIG_SPAPR_TCE_IOMMU
- mm_iommu_cleanup(&mm->context);
+ mm_iommu_cleanup(mm);
#endif

#ifdef CONFIG_PPC_ICSWX
--- a/arch/powerpc/mm/mmu_context_iommu.c
+++ b/arch/powerpc/mm/mmu_context_iommu.c
@@ -373,16 +373,17 @@ void mm_iommu_mapped_dec(struct mm_iommu
}
EXPORT_SYMBOL_GPL(mm_iommu_mapped_dec);

-void mm_iommu_init(mm_context_t *ctx)
+void mm_iommu_init(struct mm_struct *mm)
{
- INIT_LIST_HEAD_RCU(&ctx->iommu_group_mem_list);
+ INIT_LIST_HEAD_RCU(&mm->context.iommu_group_mem_list);
}

-void mm_iommu_cleanup(mm_context_t *ctx)
+void mm_iommu_cleanup(struct mm_struct *mm)
{
struct mm_iommu_table_group_mem_t *mem, *tmp;

- list_for_each_entry_safe(mem, tmp, &ctx->iommu_group_mem_list, next) {
+ list_for_each_entry_safe(mem, tmp, &mm->context.iommu_group_mem_list,
+ next) {
list_del_rcu(&mem->next);
mm_iommu_do_free(mem);
}


2017-03-20 18:21:49

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 65/93] powerpc/iommu: Stop using @current in mm_iommu_xxx

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Alexey Kardashevskiy <[email protected]>

[ Upstream commit d7baee6901b34c4895eb78efdbf13a49079d7404 ]

This changes mm_iommu_xxx helpers to take mm_struct as a parameter
instead of getting it from @current which in some situations may
not have a valid reference to mm.

This changes helpers to receive @mm and moves all references to @current
to the caller, including checks for !current and !current->mm;
checks in mm_iommu_preregistered() are removed as there is no caller
yet.

This moves the mm_iommu_adjust_locked_vm() call to the caller, as it
receives mm_iommu_table_group_mem_t but needs mm.

This should cause no behavioral change.

Signed-off-by: Alexey Kardashevskiy <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Acked-by: Alex Williamson <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/powerpc/include/asm/mmu_context.h | 16 ++++++-----
arch/powerpc/mm/mmu_context_iommu.c | 46 ++++++++++++---------------------
drivers/vfio/vfio_iommu_spapr_tce.c | 14 +++++++---
3 files changed, 36 insertions(+), 40 deletions(-)

--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -19,16 +19,18 @@ extern void destroy_context(struct mm_st
struct mm_iommu_table_group_mem_t;

extern int isolate_lru_page(struct page *page); /* from internal.h */
-extern bool mm_iommu_preregistered(void);
-extern long mm_iommu_get(unsigned long ua, unsigned long entries,
+extern bool mm_iommu_preregistered(struct mm_struct *mm);
+extern long mm_iommu_get(struct mm_struct *mm,
+ unsigned long ua, unsigned long entries,
struct mm_iommu_table_group_mem_t **pmem);
-extern long mm_iommu_put(struct mm_iommu_table_group_mem_t *mem);
+extern long mm_iommu_put(struct mm_struct *mm,
+ struct mm_iommu_table_group_mem_t *mem);
extern void mm_iommu_init(struct mm_struct *mm);
extern void mm_iommu_cleanup(struct mm_struct *mm);
-extern struct mm_iommu_table_group_mem_t *mm_iommu_lookup(unsigned long ua,
- unsigned long size);
-extern struct mm_iommu_table_group_mem_t *mm_iommu_find(unsigned long ua,
- unsigned long entries);
+extern struct mm_iommu_table_group_mem_t *mm_iommu_lookup(struct mm_struct *mm,
+ unsigned long ua, unsigned long size);
+extern struct mm_iommu_table_group_mem_t *mm_iommu_find(struct mm_struct *mm,
+ unsigned long ua, unsigned long entries);
extern long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
unsigned long ua, unsigned long *hpa);
extern long mm_iommu_mapped_inc(struct mm_iommu_table_group_mem_t *mem);
--- a/arch/powerpc/mm/mmu_context_iommu.c
+++ b/arch/powerpc/mm/mmu_context_iommu.c
@@ -56,7 +56,7 @@ static long mm_iommu_adjust_locked_vm(st
}

pr_debug("[%d] RLIMIT_MEMLOCK HASH64 %c%ld %ld/%ld\n",
- current->pid,
+ current ? current->pid : 0,
incr ? '+' : '-',
npages << PAGE_SHIFT,
mm->locked_vm << PAGE_SHIFT,
@@ -66,12 +66,9 @@ static long mm_iommu_adjust_locked_vm(st
return ret;
}

-bool mm_iommu_preregistered(void)
+bool mm_iommu_preregistered(struct mm_struct *mm)
{
- if (!current || !current->mm)
- return false;
-
- return !list_empty(&current->mm->context.iommu_group_mem_list);
+ return !list_empty(&mm->context.iommu_group_mem_list);
}
EXPORT_SYMBOL_GPL(mm_iommu_preregistered);

@@ -124,19 +121,16 @@ static int mm_iommu_move_page_from_cma(s
return 0;
}

-long mm_iommu_get(unsigned long ua, unsigned long entries,
+long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries,
struct mm_iommu_table_group_mem_t **pmem)
{
struct mm_iommu_table_group_mem_t *mem;
long i, j, ret = 0, locked_entries = 0;
struct page *page = NULL;

- if (!current || !current->mm)
- return -ESRCH; /* process exited */
-
mutex_lock(&mem_list_mutex);

- list_for_each_entry_rcu(mem, &current->mm->context.iommu_group_mem_list,
+ list_for_each_entry_rcu(mem, &mm->context.iommu_group_mem_list,
next) {
if ((mem->ua == ua) && (mem->entries == entries)) {
++mem->used;
@@ -154,7 +148,7 @@ long mm_iommu_get(unsigned long ua, unsi

}

- ret = mm_iommu_adjust_locked_vm(current->mm, entries, true);
+ ret = mm_iommu_adjust_locked_vm(mm, entries, true);
if (ret)
goto unlock_exit;

@@ -215,11 +209,11 @@ populate:
mem->entries = entries;
*pmem = mem;

- list_add_rcu(&mem->next, &current->mm->context.iommu_group_mem_list);
+ list_add_rcu(&mem->next, &mm->context.iommu_group_mem_list);

unlock_exit:
if (locked_entries && ret)
- mm_iommu_adjust_locked_vm(current->mm, locked_entries, false);
+ mm_iommu_adjust_locked_vm(mm, locked_entries, false);

mutex_unlock(&mem_list_mutex);

@@ -264,17 +258,13 @@ static void mm_iommu_free(struct rcu_hea
static void mm_iommu_release(struct mm_iommu_table_group_mem_t *mem)
{
list_del_rcu(&mem->next);
- mm_iommu_adjust_locked_vm(current->mm, mem->entries, false);
call_rcu(&mem->rcu, mm_iommu_free);
}

-long mm_iommu_put(struct mm_iommu_table_group_mem_t *mem)
+long mm_iommu_put(struct mm_struct *mm, struct mm_iommu_table_group_mem_t *mem)
{
long ret = 0;

- if (!current || !current->mm)
- return -ESRCH; /* process exited */
-
mutex_lock(&mem_list_mutex);

if (mem->used == 0) {
@@ -297,6 +287,8 @@ long mm_iommu_put(struct mm_iommu_table_
/* @mapped became 0 so now mappings are disabled, release the region */
mm_iommu_release(mem);

+ mm_iommu_adjust_locked_vm(mm, mem->entries, false);
+
unlock_exit:
mutex_unlock(&mem_list_mutex);

@@ -304,14 +296,12 @@ unlock_exit:
}
EXPORT_SYMBOL_GPL(mm_iommu_put);

-struct mm_iommu_table_group_mem_t *mm_iommu_lookup(unsigned long ua,
- unsigned long size)
+struct mm_iommu_table_group_mem_t *mm_iommu_lookup(struct mm_struct *mm,
+ unsigned long ua, unsigned long size)
{
struct mm_iommu_table_group_mem_t *mem, *ret = NULL;

- list_for_each_entry_rcu(mem,
- &current->mm->context.iommu_group_mem_list,
- next) {
+ list_for_each_entry_rcu(mem, &mm->context.iommu_group_mem_list, next) {
if ((mem->ua <= ua) &&
(ua + size <= mem->ua +
(mem->entries << PAGE_SHIFT))) {
@@ -324,14 +314,12 @@ struct mm_iommu_table_group_mem_t *mm_io
}
EXPORT_SYMBOL_GPL(mm_iommu_lookup);

-struct mm_iommu_table_group_mem_t *mm_iommu_find(unsigned long ua,
- unsigned long entries)
+struct mm_iommu_table_group_mem_t *mm_iommu_find(struct mm_struct *mm,
+ unsigned long ua, unsigned long entries)
{
struct mm_iommu_table_group_mem_t *mem, *ret = NULL;

- list_for_each_entry_rcu(mem,
- &current->mm->context.iommu_group_mem_list,
- next) {
+ list_for_each_entry_rcu(mem, &mm->context.iommu_group_mem_list, next) {
if ((mem->ua == ua) && (mem->entries == entries)) {
ret = mem;
break;
--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -107,14 +107,17 @@ static long tce_iommu_unregister_pages(s
{
struct mm_iommu_table_group_mem_t *mem;

+ if (!current || !current->mm)
+ return -ESRCH; /* process exited */
+
if ((vaddr & ~PAGE_MASK) || (size & ~PAGE_MASK))
return -EINVAL;

- mem = mm_iommu_find(vaddr, size >> PAGE_SHIFT);
+ mem = mm_iommu_find(current->mm, vaddr, size >> PAGE_SHIFT);
if (!mem)
return -ENOENT;

- return mm_iommu_put(mem);
+ return mm_iommu_put(current->mm, mem);
}

static long tce_iommu_register_pages(struct tce_container *container,
@@ -124,11 +127,14 @@ static long tce_iommu_register_pages(str
struct mm_iommu_table_group_mem_t *mem = NULL;
unsigned long entries = size >> PAGE_SHIFT;

+ if (!current || !current->mm)
+ return -ESRCH; /* process exited */
+
if ((vaddr & ~PAGE_MASK) || (size & ~PAGE_MASK) ||
((vaddr + size) < vaddr))
return -EINVAL;

- ret = mm_iommu_get(vaddr, entries, &mem);
+ ret = mm_iommu_get(current->mm, vaddr, entries, &mem);
if (ret)
return ret;

@@ -375,7 +381,7 @@ static int tce_iommu_prereg_ua_to_hpa(un
long ret = 0;
struct mm_iommu_table_group_mem_t *mem;

- mem = mm_iommu_lookup(tce, size);
+ mem = mm_iommu_lookup(current->mm, tce, size);
if (!mem)
return -EINVAL;



2017-03-20 17:58:58

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 78/93] ACPI / blacklist: Make Dell Latitude 3350 ethernet work

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Michael Pobega <[email protected]>

[ Upstream commit 708f5dcc21ae9b35f395865fc154b0105baf4de4 ]

The Dell Latitude 3350's ethernet card attempts to use a reserved
IRQ (18), resulting in ACPI being unable to enable the ethernet.

Adding it to acpi_rev_dmi_table[] helps to work around this problem.

Signed-off-by: Michael Pobega <[email protected]>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <[email protected]>

Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/acpi/blacklist.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

--- a/drivers/acpi/blacklist.c
+++ b/drivers/acpi/blacklist.c
@@ -176,6 +176,18 @@ static struct dmi_system_id acpi_rev_dmi
DMI_MATCH(DMI_PRODUCT_NAME, "Precision 3520"),
},
},
+ /*
+ * Resolves a quirk with the Dell Latitude 3350 that
+ * causes the ethernet adapter to not function.
+ */
+ {
+ .callback = dmi_enable_rev_override,
+ .ident = "DELL Latitude 3350",
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+ DMI_MATCH(DMI_PRODUCT_NAME, "Latitude 3350"),
+ },
+ },
#endif
{}
};
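
A rough standalone model of the mechanism this entry plugs into (hypothetical
names and reduced matching logic; the kernel uses struct dmi_system_id and
dmi_check_system()): the table is walked comparing firmware DMI strings, and
a matching entry triggers the _REV override callback.

#include <stdio.h>
#include <string.h>

struct quirk {
        const char *vendor;
        const char *product;
        const char *ident;
};

static const struct quirk acpi_rev_quirks[] = {
        { "Dell Inc.", "Precision 5520", "DELL Precision 5520" },
        { "Dell Inc.", "Precision 3520", "DELL Precision 3520" },
        { "Dell Inc.", "Latitude 3350",  "DELL Latitude 3350" },
        { NULL, NULL, NULL }
};

/* returns 1 and applies the override when both firmware strings match */
static int check_quirks(const char *vendor, const char *product)
{
        const struct quirk *q;

        for (q = acpi_rev_quirks; q->vendor; q++) {
                if (!strcmp(q->vendor, vendor) &&
                    !strcmp(q->product, product)) {
                        printf("_REV override for %s\n", q->ident);
                        return 1;
                }
        }
        return 0;
}

int main(void)
{
        check_quirks("Dell Inc.", "Latitude 3350");  /* matches new entry */
        check_quirks("Dell Inc.", "XPS 13 9360");    /* no quirk applied */
        return 0;
}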


2017-03-20 18:22:32

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 84/93] drm/vc4: Fix ->clock_select setting for the VEC encoder

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Boris Brezillon <[email protected]>

commit ab8df60e3a3b68420d0d4477c5f07c00fbfb078b upstream.

PV_CONTROL_CLK_SELECT_VEC is actually 2 and not 0. Fix the definition and
rework vc4_set_crtc_possible_masks() to cover the full range of the
PV_CONTROL_CLK_SELECT field.

Signed-off-by: Boris Brezillon <[email protected]>
Signed-off-by: Eric Anholt <[email protected]>
Cc: Amit Pundir <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>


---
drivers/gpu/drm/vc4/vc4_crtc.c | 36 ++++++++++++++++++++++--------------
drivers/gpu/drm/vc4/vc4_drv.h | 1 +
drivers/gpu/drm/vc4/vc4_regs.h | 3 ++-
3 files changed, 25 insertions(+), 15 deletions(-)

--- a/drivers/gpu/drm/vc4/vc4_crtc.c
+++ b/drivers/gpu/drm/vc4/vc4_crtc.c
@@ -83,8 +83,7 @@ struct vc4_crtc_data {
/* Which channel of the HVS this pixelvalve sources from. */
int hvs_channel;

- enum vc4_encoder_type encoder0_type;
- enum vc4_encoder_type encoder1_type;
+ enum vc4_encoder_type encoder_types[4];
};

#define CRTC_WRITE(offset, val) writel(val, vc4_crtc->regs + (offset))
@@ -867,20 +866,26 @@ static const struct drm_crtc_helper_func

static const struct vc4_crtc_data pv0_data = {
.hvs_channel = 0,
- .encoder0_type = VC4_ENCODER_TYPE_DSI0,
- .encoder1_type = VC4_ENCODER_TYPE_DPI,
+ .encoder_types = {
+ [PV_CONTROL_CLK_SELECT_DSI] = VC4_ENCODER_TYPE_DSI0,
+ [PV_CONTROL_CLK_SELECT_DPI_SMI_HDMI] = VC4_ENCODER_TYPE_DPI,
+ },
};

static const struct vc4_crtc_data pv1_data = {
.hvs_channel = 2,
- .encoder0_type = VC4_ENCODER_TYPE_DSI1,
- .encoder1_type = VC4_ENCODER_TYPE_SMI,
+ .encoder_types = {
+ [PV_CONTROL_CLK_SELECT_DSI] = VC4_ENCODER_TYPE_DSI1,
+ [PV_CONTROL_CLK_SELECT_DPI_SMI_HDMI] = VC4_ENCODER_TYPE_SMI,
+ },
};

static const struct vc4_crtc_data pv2_data = {
.hvs_channel = 1,
- .encoder0_type = VC4_ENCODER_TYPE_VEC,
- .encoder1_type = VC4_ENCODER_TYPE_HDMI,
+ .encoder_types = {
+ [PV_CONTROL_CLK_SELECT_DPI_SMI_HDMI] = VC4_ENCODER_TYPE_HDMI,
+ [PV_CONTROL_CLK_SELECT_VEC] = VC4_ENCODER_TYPE_VEC,
+ },
};

static const struct of_device_id vc4_crtc_dt_match[] = {
@@ -894,17 +899,20 @@ static void vc4_set_crtc_possible_masks(
struct drm_crtc *crtc)
{
struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
+ const struct vc4_crtc_data *crtc_data = vc4_crtc->data;
+ const enum vc4_encoder_type *encoder_types = crtc_data->encoder_types;
struct drm_encoder *encoder;

drm_for_each_encoder(encoder, drm) {
struct vc4_encoder *vc4_encoder = to_vc4_encoder(encoder);
+ int i;

- if (vc4_encoder->type == vc4_crtc->data->encoder0_type) {
- vc4_encoder->clock_select = 0;
- encoder->possible_crtcs |= drm_crtc_mask(crtc);
- } else if (vc4_encoder->type == vc4_crtc->data->encoder1_type) {
- vc4_encoder->clock_select = 1;
- encoder->possible_crtcs |= drm_crtc_mask(crtc);
+ for (i = 0; i < ARRAY_SIZE(crtc_data->encoder_types); i++) {
+ if (vc4_encoder->type == encoder_types[i]) {
+ vc4_encoder->clock_select = i;
+ encoder->possible_crtcs |= drm_crtc_mask(crtc);
+ break;
+ }
}
}
}
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -194,6 +194,7 @@ to_vc4_plane(struct drm_plane *plane)
}

enum vc4_encoder_type {
+ VC4_ENCODER_TYPE_NONE,
VC4_ENCODER_TYPE_HDMI,
VC4_ENCODER_TYPE_VEC,
VC4_ENCODER_TYPE_DSI0,
--- a/drivers/gpu/drm/vc4/vc4_regs.h
+++ b/drivers/gpu/drm/vc4/vc4_regs.h
@@ -177,8 +177,9 @@
# define PV_CONTROL_WAIT_HSTART BIT(12)
# define PV_CONTROL_PIXEL_REP_MASK VC4_MASK(5, 4)
# define PV_CONTROL_PIXEL_REP_SHIFT 4
-# define PV_CONTROL_CLK_SELECT_DSI_VEC 0
+# define PV_CONTROL_CLK_SELECT_DSI 0
# define PV_CONTROL_CLK_SELECT_DPI_SMI_HDMI 1
+# define PV_CONTROL_CLK_SELECT_VEC 2
# define PV_CONTROL_CLK_SELECT_MASK VC4_MASK(3, 2)
# define PV_CONTROL_CLK_SELECT_SHIFT 2
# define PV_CONTROL_FIFO_CLR BIT(1)
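
A small standalone sketch of the data-structure change (register values taken
from the patch, the rest hypothetical): the clock-select value itself becomes
the array index, and unset slots stay 0, which is why VC4_ENCODER_TYPE_NONE is
added as the first enum value.

#include <stdio.h>

#define CLK_SELECT_DSI          0
#define CLK_SELECT_DPI_SMI_HDMI 1
#define CLK_SELECT_VEC          2

/* TYPE_NONE == 0, so zero-initialized slots cannot alias a real encoder */
enum encoder_type { TYPE_NONE, TYPE_HDMI, TYPE_VEC };

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

int main(void)
{
        /* pv2: index == value programmed into PV_CONTROL_CLK_SELECT */
        enum encoder_type encoder_types[4] = {
                [CLK_SELECT_DPI_SMI_HDMI] = TYPE_HDMI,
                [CLK_SELECT_VEC] = TYPE_VEC,
        };
        enum encoder_type wanted = TYPE_VEC;
        unsigned int i;

        for (i = 0; i < ARRAY_SIZE(encoder_types); i++) {
                if (encoder_types[i] == wanted) {
                        printf("clock_select = %u\n", i);  /* 2, not 0 */
                        break;
                }
        }
        return 0;
}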


2017-03-20 18:22:46

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 62/93] Drivers: hv: ring_buffer: count on wrap around mappings in get_next_pkt_raw() (v2)

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vitaly Kuznetsov <[email protected]>

[ Upstream commit fa32ff6576623616c1751562edaed8c164ca5199 ]

With wrap around mappings in place we can always provide drivers with
direct links to packets on the ring buffer, even when they wrap around.
Do the required updates to get_next_pkt_raw()/put_pkt_raw().

The first version of this commit was reverted (65a532f3d50a) to deal with
cross-tree merge issues which are (hopefully) resolved now.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
Tested-by: Dexuan Cui <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/hyperv.h | 32 +++++++++++---------------------
1 file changed, 11 insertions(+), 21 deletions(-)

--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1548,31 +1548,23 @@ static inline struct vmpacket_descriptor
get_next_pkt_raw(struct vmbus_channel *channel)
{
struct hv_ring_buffer_info *ring_info = &channel->inbound;
- u32 read_loc = ring_info->priv_read_index;
+ u32 priv_read_loc = ring_info->priv_read_index;
void *ring_buffer = hv_get_ring_buffer(ring_info);
- struct vmpacket_descriptor *cur_desc;
- u32 packetlen;
u32 dsize = ring_info->ring_datasize;
- u32 delta = read_loc - ring_info->ring_buffer->read_index;
+ /*
+ * delta is the difference between what is available to read and
+ * what was already consumed in place. We commit read index after
+ * the whole batch is processed.
+ */
+ u32 delta = priv_read_loc >= ring_info->ring_buffer->read_index ?
+ priv_read_loc - ring_info->ring_buffer->read_index :
+ (dsize - ring_info->ring_buffer->read_index) + priv_read_loc;
u32 bytes_avail_toread = (hv_get_bytes_to_read(ring_info) - delta);

if (bytes_avail_toread < sizeof(struct vmpacket_descriptor))
return NULL;

- if ((read_loc + sizeof(*cur_desc)) > dsize)
- return NULL;
-
- cur_desc = ring_buffer + read_loc;
- packetlen = cur_desc->len8 << 3;
-
- /*
- * If the packet under consideration is wrapping around,
- * return failure.
- */
- if ((read_loc + packetlen + VMBUS_PKT_TRAILER) > (dsize - 1))
- return NULL;
-
- return cur_desc;
+ return ring_buffer + priv_read_loc;
}

/*
@@ -1584,16 +1576,14 @@ static inline void put_pkt_raw(struct vm
struct vmpacket_descriptor *desc)
{
struct hv_ring_buffer_info *ring_info = &channel->inbound;
- u32 read_loc = ring_info->priv_read_index;
u32 packetlen = desc->len8 << 3;
u32 dsize = ring_info->ring_datasize;

- if ((read_loc + packetlen + VMBUS_PKT_TRAILER) > dsize)
- BUG();
/*
* Include the packet trailer.
*/
ring_info->priv_read_index += packetlen + VMBUS_PKT_TRAILER;
+ ring_info->priv_read_index %= dsize;
}

/*
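
The wrap-aware distance computation above is easy to check in isolation; a
minimal standalone sketch with made-up ring sizes (not kernel code):

#include <stdio.h>
#include <stdint.h>

/* bytes consumed in place between the committed and the private index,
 * correct even when the private index has wrapped past the ring end */
static uint32_t ring_delta(uint32_t priv_read, uint32_t read, uint32_t dsize)
{
        return priv_read >= read ? priv_read - read
                                 : (dsize - read) + priv_read;
}

int main(void)
{
        /* no wrap: consumed bytes 100..150 of a 4096-byte ring */
        printf("%u\n", ring_delta(150, 100, 4096));  /* 50 */
        /* wrapped: committed index near the end, private index past it */
        printf("%u\n", ring_delta(20, 4080, 4096));  /* 36 */
        return 0;
}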


2017-03-20 18:23:06

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 61/93] ibmveth: calculate gso_segs for large packets

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Thomas Falcon <[email protected]>

[ Upstream commit 94acf164dc8f1184e8d0737be7125134c2701dbe ]

Include calculations to compute the number of segments
that comprise an aggregated large packet.

Signed-off-by: Thomas Falcon <[email protected]>
Reviewed-by: Marcelo Ricardo Leitner <[email protected]>
Reviewed-by: Jonathan Maxwell <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/ibm/ibmveth.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -1181,7 +1181,9 @@ map_failed:

static void ibmveth_rx_mss_helper(struct sk_buff *skb, u16 mss, int lrg_pkt)
{
+ struct tcphdr *tcph;
int offset = 0;
+ int hdr_len;

/* only TCP packets will be aggregated */
if (skb->protocol == htons(ETH_P_IP)) {
@@ -1208,14 +1210,20 @@ static void ibmveth_rx_mss_helper(struct
/* if mss is not set through Large Packet bit/mss in rx buffer,
* expect that the mss will be written to the tcp header checksum.
*/
+ tcph = (struct tcphdr *)(skb->data + offset);
if (lrg_pkt) {
skb_shinfo(skb)->gso_size = mss;
} else if (offset) {
- struct tcphdr *tcph = (struct tcphdr *)(skb->data + offset);
-
skb_shinfo(skb)->gso_size = ntohs(tcph->check);
tcph->check = 0;
}
+
+ if (skb_shinfo(skb)->gso_size) {
+ hdr_len = offset + tcph->doff * 4;
+ skb_shinfo(skb)->gso_segs =
+ DIV_ROUND_UP(skb->len - hdr_len,
+ skb_shinfo(skb)->gso_size);
+ }
}

static int ibmveth_poll(struct napi_struct *napi, int budget)
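
A standalone sketch of the added computation with made-up sample values: the
segment count is the TCP payload length divided by the MSS, rounded up.

#include <stdio.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

int main(void)
{
        unsigned int skb_len = 64000;   /* aggregated packet incl. headers */
        unsigned int offset = 14 + 20;  /* Ethernet + IPv4 header bytes */
        unsigned int doff = 5;          /* TCP data offset, 32-bit words */
        unsigned int gso_size = 1448;   /* MSS */
        unsigned int hdr_len = offset + doff * 4;
        unsigned int gso_segs = DIV_ROUND_UP(skb_len - hdr_len, gso_size);

        printf("hdr_len=%u gso_segs=%u\n", hdr_len, gso_segs);  /* 54 45 */
        return 0;
}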


2017-03-20 18:23:17

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 60/93] PCI: Do any VF BAR updates before enabling the BARs

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Gavin Shan <[email protected]>

[ Upstream commit f40ec3c748c6912f6266c56a7f7992de61b255ed ]

Previously we enabled VFs and enabled their memory space before calling
pcibios_sriov_enable(). But pcibios_sriov_enable() may update the VF BARs:
for example, on PPC PowerNV we may change them to manage the association of
VFs to PEs.

Because 64-bit BARs cannot be updated atomically, it's unsafe to update
them while they're enabled. The half-updated state may conflict with other
devices in the system.

Call pcibios_sriov_enable() before enabling the VFs so any BAR updates
happen while the VF BARs are disabled.

[bhelgaas: changelog]
Tested-by: Carol Soto <[email protected]>
Signed-off-by: Gavin Shan <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>

Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/pci/iov.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -306,13 +306,6 @@ static int sriov_enable(struct pci_dev *
return rc;
}

- pci_iov_set_numvfs(dev, nr_virtfn);
- iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
- pci_cfg_access_lock(dev);
- pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
- msleep(100);
- pci_cfg_access_unlock(dev);
-
iov->initial_VFs = initial;
if (nr_virtfn < initial)
initial = nr_virtfn;
@@ -323,6 +316,13 @@ static int sriov_enable(struct pci_dev *
goto err_pcibios;
}

+ pci_iov_set_numvfs(dev, nr_virtfn);
+ iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
+ pci_cfg_access_lock(dev);
+ pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
+ msleep(100);
+ pci_cfg_access_unlock(dev);
+
for (i = 0; i < initial; i++) {
rc = pci_iov_add_virtfn(dev, i, 0);
if (rc)
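
To see why the ordering matters: a 64-bit BAR is programmed with two 32-bit
config writes, so there is a moment where the register holds half old, half
new address. A standalone model with arbitrary example addresses:

#include <stdio.h>
#include <stdint.h>

static uint32_t bar_lo, bar_hi;         /* models one 64-bit BAR pair */

static void write_bar64(uint64_t addr)
{
        bar_lo = (uint32_t)addr;
        /* if memory decode were enabled here, the device would respond
         * at this mixed address (old high dword, new low dword) */
        printf("intermediate: %#llx\n",
               ((unsigned long long)bar_hi << 32) | bar_lo);
        bar_hi = (uint32_t)(addr >> 32);
}

int main(void)
{
        bar_lo = 0xfe000000;
        bar_hi = 0x1;                           /* old: 0x1fe000000 */
        write_bar64(0x2a0000000ULL);            /* new: 0x2a0000000 */
        printf("final: %#llx\n",
               ((unsigned long long)bar_hi << 32) | bar_lo);
        return 0;
}

Moving pcibios_sriov_enable() before the VFE/MSE write keeps this window
closed: the BARs only change while VF memory decode is still off.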


2017-03-20 18:23:22

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 87/93] x86/tsc: Fix ART for TSC_KNOWN_FREQ

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Peter Zijlstra <[email protected]>

commit 44fee88cea43d3c2cac962e0439cb10a3cabff6d upstream.

Subhransu reported that convert_art_to_tsc() isn't working for him.

The ART to TSC relation is only set up for systems which use the refined
TSC calibration. Systems with known TSC frequency (available via CPUID 15)
are not using the refined calibration and therefore the ART to TSC relation
is never established.

Add the setup to the known frequency init path which skips ART
calibration. The init code needs to be duplicated as for systems which use
refined calibration the ART setup must be delayed until calibration has
been done.

The problem has been there since ART support was introduced, but was only
detected now because Subhransu was the first to test on hardware which
has the TSC frequency enumerated via CPUID 15.

Note for stable: The conditional has changed from TSC_RELIABLE to
TSC_KNOWN_FREQUENCY.

[ tglx: Rewrote changelog and identified the proper 'Fixes' commit ]

Fixes: f9677e0f8308 ("x86/tsc: Always Running Timer (ART) correlated clocksource")
Reported-by: "Prusty, Subhransu S" <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kernel/tsc.c | 2 ++
1 file changed, 2 insertions(+)

--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1287,6 +1287,8 @@ static int __init init_tsc_clocksource(v
* exporting a reliable TSC.
*/
if (boot_cpu_has(X86_FEATURE_TSC_RELIABLE)) {
+ if (boot_cpu_has(X86_FEATURE_ART))
+ art_related_clocksource = &clocksource_tsc;
clocksource_register_khz(&clocksource_tsc, tsc_khz);
return 0;
}
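
For reference, a standalone sketch of the relation this sets up (example
numbers; the kernel's convert_art_to_tsc() additionally guards against
multiplication overflow). CPUID leaf 0x15 reports the ratio denominator in
EAX and the numerator in EBX, and TSC = ART * numerator / denominator + offset.

#include <stdio.h>
#include <stdint.h>

static uint64_t art_to_tsc(uint64_t art, uint32_t num, uint32_t den,
                           uint64_t offset)
{
        return art * num / den + offset;
}

int main(void)
{
        /* e.g. a 24 MHz ART crystal driving a 2.4 GHz TSC: ratio 100/1 */
        printf("%llu\n",
               (unsigned long long)art_to_tsc(24000000, 100, 1, 0));
        return 0;
}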


2017-03-20 18:23:27

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 59/93] PCI: Ignore BAR updates on virtual functions

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Bjorn Helgaas <[email protected]>

[ Upstream commit 63880b230a4af502c56dde3d4588634c70c66006 ]

VF BARs are read-only zero, so updating VF BARs will not have any effect.
See the SR-IOV spec r1.1, sec 3.4.1.11.

We already ignore these updates because of 70675e0b6a1a ("PCI: Don't try to
restore VF BARs"); this merely restructures it slightly to make it easier
to split updates for standard and SR-IOV BARs.

Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/pci/pci.c | 4 ----
drivers/pci/setup-res.c | 5 ++---
2 files changed, 2 insertions(+), 7 deletions(-)

--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -564,10 +564,6 @@ static void pci_restore_bars(struct pci_
{
int i;

- /* Per SR-IOV spec 3.4.1.11, VF BARs are RO zero */
- if (dev->is_virtfn)
- return;
-
for (i = 0; i < PCI_BRIDGE_RESOURCES; i++)
pci_update_resource(dev, i);
}
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -34,10 +34,9 @@ static void pci_std_update_resource(stru
int reg;
struct resource *res = dev->resource + resno;

- if (dev->is_virtfn) {
- dev_warn(&dev->dev, "can't update VF BAR%d\n", resno);
+ /* Per SR-IOV spec 3.4.1.11, VF BARs are RO zero */
+ if (dev->is_virtfn)
return;
- }

/*
* Ignore resources for unimplemented BARs and unused resource slots


2017-03-20 18:23:25

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 86/93] irqchip/gicv3-its: Add workaround for QDF2400 ITS erratum 0065

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Shanker Donthineni <[email protected]>

commit 90922a2d03d84de36bf8a9979d62580102f31a92 upstream.

On Qualcomm Datacenter Technologies QDF2400 SoCs, the ITS hardware
implementation uses 16 bytes for each Interrupt Translation Entry (ITE),
but reports an incorrect value of 8 bytes in GITS_TYPER.ITTE_size.

This might cause kernel memory corruption, depending on the number
of MSI(-X) vectors that are configured and the amount of memory that
has been allocated for ITEs in its_create_device().

This patch fixes the potential memory corruption by setting the
correct ITE size of 16 bytes.

Cc: [email protected]
Signed-off-by: Shanker Donthineni <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
Documentation/arm64/silicon-errata.txt | 44 +++++++++++++++++----------------
arch/arm64/Kconfig | 10 +++++++
drivers/irqchip/irq-gic-v3-its.c | 16 ++++++++++++
3 files changed, 49 insertions(+), 21 deletions(-)

--- a/Documentation/arm64/silicon-errata.txt
+++ b/Documentation/arm64/silicon-errata.txt
@@ -42,24 +42,26 @@ file acts as a registry of software work
will be updated when new workarounds are committed and backported to
stable kernels.

-| Implementor | Component | Erratum ID | Kconfig |
-+----------------+-----------------+-----------------+-------------------------+
-| ARM | Cortex-A53 | #826319 | ARM64_ERRATUM_826319 |
-| ARM | Cortex-A53 | #827319 | ARM64_ERRATUM_827319 |
-| ARM | Cortex-A53 | #824069 | ARM64_ERRATUM_824069 |
-| ARM | Cortex-A53 | #819472 | ARM64_ERRATUM_819472 |
-| ARM | Cortex-A53 | #845719 | ARM64_ERRATUM_845719 |
-| ARM | Cortex-A53 | #843419 | ARM64_ERRATUM_843419 |
-| ARM | Cortex-A57 | #832075 | ARM64_ERRATUM_832075 |
-| ARM | Cortex-A57 | #852523 | N/A |
-| ARM | Cortex-A57 | #834220 | ARM64_ERRATUM_834220 |
-| ARM | Cortex-A72 | #853709 | N/A |
-| ARM | MMU-500 | #841119,#826419 | N/A |
-| | | | |
-| Cavium | ThunderX ITS | #22375, #24313 | CAVIUM_ERRATUM_22375 |
-| Cavium | ThunderX ITS | #23144 | CAVIUM_ERRATUM_23144 |
-| Cavium | ThunderX GICv3 | #23154 | CAVIUM_ERRATUM_23154 |
-| Cavium | ThunderX Core | #27456 | CAVIUM_ERRATUM_27456 |
-| Cavium | ThunderX SMMUv2 | #27704 | N/A |
-| | | | |
-| Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 |
+| Implementor | Component | Erratum ID | Kconfig |
++----------------+-----------------+-----------------+-----------------------------+
+| ARM | Cortex-A53 | #826319 | ARM64_ERRATUM_826319 |
+| ARM | Cortex-A53 | #827319 | ARM64_ERRATUM_827319 |
+| ARM | Cortex-A53 | #824069 | ARM64_ERRATUM_824069 |
+| ARM | Cortex-A53 | #819472 | ARM64_ERRATUM_819472 |
+| ARM | Cortex-A53 | #845719 | ARM64_ERRATUM_845719 |
+| ARM | Cortex-A53 | #843419 | ARM64_ERRATUM_843419 |
+| ARM | Cortex-A57 | #832075 | ARM64_ERRATUM_832075 |
+| ARM | Cortex-A57 | #852523 | N/A |
+| ARM | Cortex-A57 | #834220 | ARM64_ERRATUM_834220 |
+| ARM | Cortex-A72 | #853709 | N/A |
+| ARM | MMU-500 | #841119,#826419 | N/A |
+| | | | |
+| Cavium | ThunderX ITS | #22375, #24313 | CAVIUM_ERRATUM_22375 |
+| Cavium | ThunderX ITS | #23144 | CAVIUM_ERRATUM_23144 |
+| Cavium | ThunderX GICv3 | #23154 | CAVIUM_ERRATUM_23154 |
+| Cavium | ThunderX Core | #27456 | CAVIUM_ERRATUM_27456 |
+| Cavium | ThunderX SMMUv2 | #27704 | N/A |
+| | | | |
+| Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 |
+| | | | |
+| Qualcomm Tech. | QDF2400 ITS | E0065 | QCOM_QDF2400_ERRATUM_0065 |
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -474,6 +474,16 @@ config CAVIUM_ERRATUM_27456

If unsure, say Y.

+config QCOM_QDF2400_ERRATUM_0065
+ bool "QDF2400 E0065: Incorrect GITS_TYPER.ITT_Entry_size"
+ default y
+ help
+ On Qualcomm Datacenter Technologies QDF2400 SoC, ITS hardware reports
+ ITE size incorrectly. The GITS_TYPER.ITT_Entry_size field should have
+ been indicated as 16Bytes (0xf), not 8Bytes (0x7).
+
+ If unsure, say Y.
+
endmenu


--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1598,6 +1598,14 @@ static void __maybe_unused its_enable_qu
its->flags |= ITS_FLAGS_WORKAROUND_CAVIUM_23144;
}

+static void __maybe_unused its_enable_quirk_qdf2400_e0065(void *data)
+{
+ struct its_node *its = data;
+
+ /* On QDF2400, the size of the ITE is 16Bytes */
+ its->ite_size = 16;
+}
+
static const struct gic_quirk its_quirks[] = {
#ifdef CONFIG_CAVIUM_ERRATUM_22375
{
@@ -1615,6 +1623,14 @@ static const struct gic_quirk its_quirks
.init = its_enable_quirk_cavium_23144,
},
#endif
+#ifdef CONFIG_QCOM_QDF2400_ERRATUM_0065
+ {
+ .desc = "ITS: QDF2400 erratum 0065",
+ .iidr = 0x00001070, /* QDF2400 ITS rev 1.x */
+ .mask = 0xffffffff,
+ .init = its_enable_quirk_qdf2400_e0065,
+ },
+#endif
{
}
};
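
A back-of-the-envelope standalone sketch of the failure mode (made-up vector
count): the ITE table is sized as nr_ites * ite_size, so trusting the
under-reported entry size allocates half of what the hardware will write.

#include <stdio.h>

int main(void)
{
        unsigned int nr_ites = 256;     /* MSI(-X) vectors on one device */
        unsigned int reported = 8;      /* broken GITS_TYPER.ITTE_size */
        unsigned int actual = 16;       /* what QDF2400 really uses */

        printf("allocated: %u bytes\n", nr_ites * reported);   /* 2048 */
        printf("written:   %u bytes\n", nr_ites * actual);     /* 4096 */
        printf("overrun:   %u bytes\n", nr_ites * (actual - reported));
        return 0;
}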


2017-03-20 18:23:15

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 83/93] drm/vc4: Fix race between page flip completion event and clean-up

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Derek Foreman <[email protected]>

commit 26fc78f6fef39b9d7a15def5e7e9826ff68303f4 upstream.

There was a small window where a userspace program could submit
a pageflip after receiving a pageflip completion event yet still
receive EBUSY.

Signed-off-by: Derek Foreman <[email protected]>
Signed-off-by: Eric Anholt <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Daniel Stone <[email protected]>
Cc: Amit Pundir <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/gpu/drm/vc4/vc4_crtc.c | 8 ++++++++
drivers/gpu/drm/vc4/vc4_drv.h | 1 +
drivers/gpu/drm/vc4/vc4_kms.c | 33 +++++++++++++++++++++++++--------
3 files changed, 34 insertions(+), 8 deletions(-)

--- a/drivers/gpu/drm/vc4/vc4_crtc.c
+++ b/drivers/gpu/drm/vc4/vc4_crtc.c
@@ -669,6 +669,14 @@ void vc4_disable_vblank(struct drm_devic
CRTC_WRITE(PV_INTEN, 0);
}

+/* Must be called with the event lock held */
+bool vc4_event_pending(struct drm_crtc *crtc)
+{
+ struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
+
+ return !!vc4_crtc->event;
+}
+
static void vc4_crtc_handle_page_flip(struct vc4_crtc *vc4_crtc)
{
struct drm_crtc *crtc = &vc4_crtc->base;
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -440,6 +440,7 @@ int vc4_bo_stats_debugfs(struct seq_file
extern struct platform_driver vc4_crtc_driver;
int vc4_enable_vblank(struct drm_device *dev, unsigned int crtc_id);
void vc4_disable_vblank(struct drm_device *dev, unsigned int crtc_id);
+bool vc4_event_pending(struct drm_crtc *crtc);
int vc4_crtc_debugfs_regs(struct seq_file *m, void *arg);
int vc4_crtc_get_scanoutpos(struct drm_device *dev, unsigned int crtc_id,
unsigned int flags, int *vpos, int *hpos,
--- a/drivers/gpu/drm/vc4/vc4_kms.c
+++ b/drivers/gpu/drm/vc4/vc4_kms.c
@@ -119,17 +119,34 @@ static int vc4_atomic_commit(struct drm_

/* Make sure that any outstanding modesets have finished. */
if (nonblock) {
- ret = down_trylock(&vc4->async_modeset);
- if (ret) {
+ struct drm_crtc *crtc;
+ struct drm_crtc_state *crtc_state;
+ unsigned long flags;
+ bool busy = false;
+
+ /*
+ * If there's an undispatched event to send then we're
+ * obviously still busy. If there isn't, then we can
+ * unconditionally wait for the semaphore because it
+ * shouldn't be contended (for long).
+ *
+ * This is to prevent a race where queuing a new flip
+ * from userspace immediately on receipt of an event
+ * beats our clean-up and returns EBUSY.
+ */
+ spin_lock_irqsave(&dev->event_lock, flags);
+ for_each_crtc_in_state(state, crtc, crtc_state, i)
+ busy |= vc4_event_pending(crtc);
+ spin_unlock_irqrestore(&dev->event_lock, flags);
+ if (busy) {
kfree(c);
return -EBUSY;
}
- } else {
- ret = down_interruptible(&vc4->async_modeset);
- if (ret) {
- kfree(c);
- return ret;
- }
+ }
+ ret = down_interruptible(&vc4->async_modeset);
+ if (ret) {
+ kfree(c);
+ return ret;
}

ret = drm_atomic_helper_prepare_planes(dev, state);
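
A much-simplified standalone model of the new nonblocking-commit logic
(locking and the semaphore elided): EBUSY is returned only while a completion
event is still undispatched, never merely because clean-up hasn't finished.

#include <stdio.h>

static int event_pending;       /* models vc4_crtc->event != NULL */

static int atomic_commit(int nonblock)
{
        if (nonblock && event_pending)
                return -16;     /* -EBUSY: flip not yet signalled */
        /* both paths would now sleep briefly on vc4->async_modeset */
        return 0;
}

int main(void)
{
        event_pending = 1;
        printf("%d\n", atomic_commit(1));  /* -16 */
        event_pending = 0;                 /* event delivered to userspace */
        printf("%d\n", atomic_commit(1));  /* 0: no spurious EBUSY */
        return 0;
}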


2017-03-20 18:23:11

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 75/93] [media] uvcvideo: uvc_scan_fallback() for webcams with broken chain

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Henrik Ingo <[email protected]>

[ Upstream commit e950267ab802c8558f1100eafd4087fd039ad634 ]

Some devices have invalid baSourceID references, causing uvc_scan_chain()
to fail. However, if we just take the entities we can find and put them
together in the most sensible chain we can think of, it turns out they do
work anyway. Note: this heuristic assumes there is a single chain.

At the time of writing, devices known to have such a broken chain are
- Acer Integrated Camera (5986:055a)
- Realtek rtl157a7 (0bda:57a7)

Signed-off-by: Henrik Ingo <[email protected]>
Signed-off-by: Laurent Pinchart <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/media/usb/uvc/uvc_driver.c | 118 +++++++++++++++++++++++++++++++++++--
1 file changed, 112 insertions(+), 6 deletions(-)

--- a/drivers/media/usb/uvc/uvc_driver.c
+++ b/drivers/media/usb/uvc/uvc_driver.c
@@ -1595,6 +1595,114 @@ static const char *uvc_print_chain(struc
return buffer;
}

+static struct uvc_video_chain *uvc_alloc_chain(struct uvc_device *dev)
+{
+ struct uvc_video_chain *chain;
+
+ chain = kzalloc(sizeof(*chain), GFP_KERNEL);
+ if (chain == NULL)
+ return NULL;
+
+ INIT_LIST_HEAD(&chain->entities);
+ mutex_init(&chain->ctrl_mutex);
+ chain->dev = dev;
+ v4l2_prio_init(&chain->prio);
+
+ return chain;
+}
+
+/*
+ * Fallback heuristic for devices that don't connect units and terminals in a
+ * valid chain.
+ *
+ * Some devices have invalid baSourceID references, causing uvc_scan_chain()
+ * to fail, but if we just take the entities we can find and put them together
+ * in the most sensible chain we can think of, turns out they do work anyway.
+ * Note: This heuristic assumes there is a single chain.
+ *
+ * At the time of writing, devices known to have such a broken chain are
+ * - Acer Integrated Camera (5986:055a)
+ * - Realtek rtl157a7 (0bda:57a7)
+ */
+static int uvc_scan_fallback(struct uvc_device *dev)
+{
+ struct uvc_video_chain *chain;
+ struct uvc_entity *iterm = NULL;
+ struct uvc_entity *oterm = NULL;
+ struct uvc_entity *entity;
+ struct uvc_entity *prev;
+
+ /*
+ * Start by locating the input and output terminals. We only support
+ * devices with exactly one of each for now.
+ */
+ list_for_each_entry(entity, &dev->entities, list) {
+ if (UVC_ENTITY_IS_ITERM(entity)) {
+ if (iterm)
+ return -EINVAL;
+ iterm = entity;
+ }
+
+ if (UVC_ENTITY_IS_OTERM(entity)) {
+ if (oterm)
+ return -EINVAL;
+ oterm = entity;
+ }
+ }
+
+ if (iterm == NULL || oterm == NULL)
+ return -EINVAL;
+
+ /* Allocate the chain and fill it. */
+ chain = uvc_alloc_chain(dev);
+ if (chain == NULL)
+ return -ENOMEM;
+
+ if (uvc_scan_chain_entity(chain, oterm) < 0)
+ goto error;
+
+ prev = oterm;
+
+ /*
+ * Add all Processing and Extension Units with two pads. The order
+ * doesn't matter much, use reverse list traversal to connect units in
+ * UVC descriptor order as we build the chain from output to input. This
+ * leads to units appearing in the order meant by the manufacturer for
+ * the cameras known to require this heuristic.
+ */
+ list_for_each_entry_reverse(entity, &dev->entities, list) {
+ if (entity->type != UVC_VC_PROCESSING_UNIT &&
+ entity->type != UVC_VC_EXTENSION_UNIT)
+ continue;
+
+ if (entity->num_pads != 2)
+ continue;
+
+ if (uvc_scan_chain_entity(chain, entity) < 0)
+ goto error;
+
+ prev->baSourceID[0] = entity->id;
+ prev = entity;
+ }
+
+ if (uvc_scan_chain_entity(chain, iterm) < 0)
+ goto error;
+
+ prev->baSourceID[0] = iterm->id;
+
+ list_add_tail(&chain->list, &dev->chains);
+
+ uvc_trace(UVC_TRACE_PROBE,
+ "Found a video chain by fallback heuristic (%s).\n",
+ uvc_print_chain(chain));
+
+ return 0;
+
+error:
+ kfree(chain);
+ return -EINVAL;
+}
+
/*
* Scan the device for video chains and register video devices.
*
@@ -1617,15 +1725,10 @@ static int uvc_scan_device(struct uvc_de
if (term->chain.next || term->chain.prev)
continue;

- chain = kzalloc(sizeof(*chain), GFP_KERNEL);
+ chain = uvc_alloc_chain(dev);
if (chain == NULL)
return -ENOMEM;

- INIT_LIST_HEAD(&chain->entities);
- mutex_init(&chain->ctrl_mutex);
- chain->dev = dev;
- v4l2_prio_init(&chain->prio);
-
term->flags |= UVC_ENTITY_FLAG_DEFAULT;

if (uvc_scan_chain(chain, term) < 0) {
@@ -1639,6 +1742,9 @@ static int uvc_scan_device(struct uvc_de
list_add_tail(&chain->list, &dev->chains);
}

+ if (list_empty(&dev->chains))
+ uvc_scan_fallback(dev);
+
if (list_empty(&dev->chains)) {
uvc_printk(KERN_INFO, "No valid video chain found.\n");
return -1;


2017-03-20 18:25:07

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 82/93] clk: bcm2835: Fix ->fixed_divider of pllh_aux

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Boris Brezillon <[email protected]>

commit f2a46926aba1f0c33944901d2420a6a887455ddc upstream.

There is no fixed divider on pllh_aux.

Signed-off-by: Boris Brezillon <[email protected]>
Signed-off-by: Eric Anholt <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>
Cc: Amit Pundir <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/clk/bcm/clk-bcm2835.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/clk/bcm/clk-bcm2835.c
+++ b/drivers/clk/bcm/clk-bcm2835.c
@@ -1598,7 +1598,7 @@ static const struct bcm2835_clk_desc clk
.a2w_reg = A2W_PLLH_AUX,
.load_mask = CM_PLLH_LOADAUX,
.hold_mask = 0,
- .fixed_divider = 10),
+ .fixed_divider = 1),
[BCM2835_PLLH_PIX] = REGISTER_PLL_DIV(
.name = "pllh_pix",
.source_pll = "pllh",


2017-03-20 18:25:09

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 80/93] usb: gadget: udc: atmel: remove memory leak

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Alexandre Belloni <[email protected]>

[ Upstream commit 32856eea7bf75dfb99b955ada6e147f553a11366 ]

Commit bbe097f092b0 ("usb: gadget: udc: atmel: fix endpoint name")
introduced a memory leak when unbinding the driver. The endpoint names
would not be freed. Solve that by including the name as a string in struct
usba_ep so it is freed when the endpoint is.

Signed-off-by: Alexandre Belloni <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/usb/gadget/udc/atmel_usba_udc.c | 3 ++-
drivers/usb/gadget/udc/atmel_usba_udc.h | 1 +
2 files changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/usb/gadget/udc/atmel_usba_udc.c
+++ b/drivers/usb/gadget/udc/atmel_usba_udc.c
@@ -1978,7 +1978,8 @@ static struct usba_ep * atmel_udc_of_ini
dev_err(&pdev->dev, "of_probe: name error(%d)\n", ret);
goto err;
}
- ep->ep.name = kasprintf(GFP_KERNEL, "ep%d", ep->index);
+ sprintf(ep->name, "ep%d", ep->index);
+ ep->ep.name = ep->name;

ep->ep_regs = udc->regs + USBA_EPT_BASE(i);
ep->dma_regs = udc->regs + USBA_DMA_BASE(i);
--- a/drivers/usb/gadget/udc/atmel_usba_udc.h
+++ b/drivers/usb/gadget/udc/atmel_usba_udc.h
@@ -280,6 +280,7 @@ struct usba_ep {
void __iomem *ep_regs;
void __iomem *dma_regs;
void __iomem *fifo;
+ char name[8];
struct usb_ep ep;
struct usba_udc *udc;
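
A small userspace sketch of the fix (illustrative names): an embedded buffer's
lifetime matches the containing endpoint, so nothing needs freeing on unbind.

#include <stdio.h>

/* mirrors the fixed struct usba_ep: the name storage is embedded, so it is
 * released together with the endpoint instead of needing its own kfree() */
struct ep {
        char name[8];
        const char *ep_name;    /* stands in for ep->ep.name */
};

int main(void)
{
        struct ep ep;

        /* before the fix this was kasprintf(), whose result was never freed */
        snprintf(ep.name, sizeof(ep.name), "ep%d", 3);
        ep.ep_name = ep.name;
        printf("%s\n", ep.ep_name);
        return 0;               /* nothing to free */
}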



2017-03-20 18:25:06

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 74/93] s390/zcrypt: Introduce CEX6 toleration

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Harald Freudenberger <[email protected]>

[ Upstream commit b3e8652bcbfa04807e44708d4d0c8cdad39c9215 ]

Signed-off-by: Harald Freudenberger <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/s390/crypto/ap_bus.c | 3 +++
drivers/s390/crypto/ap_bus.h | 1 +
2 files changed, 4 insertions(+)

--- a/drivers/s390/crypto/ap_bus.c
+++ b/drivers/s390/crypto/ap_bus.c
@@ -1712,6 +1712,9 @@ static void ap_scan_bus(struct work_stru
ap_dev->queue_depth = queue_depth;
ap_dev->raw_hwtype = device_type;
ap_dev->device_type = device_type;
+ /* CEX6 toleration: map to CEX5 */
+ if (device_type == AP_DEVICE_TYPE_CEX6)
+ ap_dev->device_type = AP_DEVICE_TYPE_CEX5;
ap_dev->functions = device_functions;
spin_lock_init(&ap_dev->lock);
INIT_LIST_HEAD(&ap_dev->pendingq);
--- a/drivers/s390/crypto/ap_bus.h
+++ b/drivers/s390/crypto/ap_bus.h
@@ -105,6 +105,7 @@ static inline int ap_test_bit(unsigned i
#define AP_DEVICE_TYPE_CEX3C 9
#define AP_DEVICE_TYPE_CEX4 10
#define AP_DEVICE_TYPE_CEX5 11
+#define AP_DEVICE_TYPE_CEX6 12

/*
* Known function facilities


2017-03-20 18:25:02

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 85/93] arm64: KVM: VHE: Clear HCR_TGE when invalidating guest TLBs

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Marc Zyngier <[email protected]>

commit 68925176296a8b995e503349200e256674bfe5ac upstream.

When invalidating guest TLBs, special care must be taken to
actually shoot the guest TLBs and not the host ones if we're
running on a VHE system. This is controlled by the HCR_EL2.TGE
bit, which we forgot to clear before invalidating TLBs.

Address the issue by introducing two wrappers (__tlb_switch_to_guest
and __tlb_switch_to_host) that take care of both the VTTBR_EL2
and HCR_EL2.TGE switching.

Reported-by: Tomasz Nowicki <[email protected]>
Tested-by: Tomasz Nowicki <[email protected]>
Reviewed-by: Christoffer Dall <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm64/kvm/hyp/tlb.c | 64 ++++++++++++++++++++++++++++++++++++++++-------
1 file changed, 55 insertions(+), 9 deletions(-)

--- a/arch/arm64/kvm/hyp/tlb.c
+++ b/arch/arm64/kvm/hyp/tlb.c
@@ -17,14 +17,62 @@

#include <asm/kvm_hyp.h>

+static void __hyp_text __tlb_switch_to_guest_vhe(struct kvm *kvm)
+{
+ u64 val;
+
+ /*
+ * With VHE enabled, we have HCR_EL2.{E2H,TGE} = {1,1}, and
+ * most TLB operations target EL2/EL0. In order to affect the
+ * guest TLBs (EL1/EL0), we need to change one of these two
+ * bits. Changing E2H is impossible (goodbye TTBR1_EL2), so
+ * let's flip TGE before executing the TLB operation.
+ */
+ write_sysreg(kvm->arch.vttbr, vttbr_el2);
+ val = read_sysreg(hcr_el2);
+ val &= ~HCR_TGE;
+ write_sysreg(val, hcr_el2);
+ isb();
+}
+
+static void __hyp_text __tlb_switch_to_guest_nvhe(struct kvm *kvm)
+{
+ write_sysreg(kvm->arch.vttbr, vttbr_el2);
+ isb();
+}
+
+static hyp_alternate_select(__tlb_switch_to_guest,
+ __tlb_switch_to_guest_nvhe,
+ __tlb_switch_to_guest_vhe,
+ ARM64_HAS_VIRT_HOST_EXTN);
+
+static void __hyp_text __tlb_switch_to_host_vhe(struct kvm *kvm)
+{
+ /*
+ * We're done with the TLB operation, let's restore the host's
+ * view of HCR_EL2.
+ */
+ write_sysreg(0, vttbr_el2);
+ write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2);
+}
+
+static void __hyp_text __tlb_switch_to_host_nvhe(struct kvm *kvm)
+{
+ write_sysreg(0, vttbr_el2);
+}
+
+static hyp_alternate_select(__tlb_switch_to_host,
+ __tlb_switch_to_host_nvhe,
+ __tlb_switch_to_host_vhe,
+ ARM64_HAS_VIRT_HOST_EXTN);
+
void __hyp_text __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
{
dsb(ishst);

/* Switch to requested VMID */
kvm = kern_hyp_va(kvm);
- write_sysreg(kvm->arch.vttbr, vttbr_el2);
- isb();
+ __tlb_switch_to_guest()(kvm);

/*
* We could do so much better if we had the VA as well.
@@ -45,7 +93,7 @@ void __hyp_text __kvm_tlb_flush_vmid_ipa
dsb(ish);
isb();

- write_sysreg(0, vttbr_el2);
+ __tlb_switch_to_host()(kvm);
}

void __hyp_text __kvm_tlb_flush_vmid(struct kvm *kvm)
@@ -54,14 +102,13 @@ void __hyp_text __kvm_tlb_flush_vmid(str

/* Switch to requested VMID */
kvm = kern_hyp_va(kvm);
- write_sysreg(kvm->arch.vttbr, vttbr_el2);
- isb();
+ __tlb_switch_to_guest()(kvm);

asm volatile("tlbi vmalls12e1is" : : );
dsb(ish);
isb();

- write_sysreg(0, vttbr_el2);
+ __tlb_switch_to_host()(kvm);
}

void __hyp_text __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu)
@@ -69,14 +116,13 @@ void __hyp_text __kvm_tlb_flush_local_vm
struct kvm *kvm = kern_hyp_va(kern_hyp_va(vcpu)->kvm);

/* Switch to requested VMID */
- write_sysreg(kvm->arch.vttbr, vttbr_el2);
- isb();
+ __tlb_switch_to_guest()(kvm);

asm volatile("tlbi vmalle1" : : );
dsb(nsh);
isb();

- write_sysreg(0, vttbr_el2);
+ __tlb_switch_to_host()(kvm);
}

void __hyp_text __kvm_flush_vm_context(void)


2017-03-20 18:26:34

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 58/93] PCI: Update BARs using property bits appropriate for type

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Bjorn Helgaas <[email protected]>

[ Upstream commit 45d004f4afefdd8d79916ee6d97a9ecd94bb1ffe ]

The BAR property bits (0-3 for memory BARs, 0-1 for I/O BARs) are supposed
to be read-only, but we do save them in res->flags and include them when
updating the BAR.

Mask the I/O property bits with ~PCI_BASE_ADDRESS_IO_MASK (0x3) instead of
PCI_REGION_FLAG_MASK (0xf) to make it obvious that we can't corrupt bits
2-3 of I/O addresses.

Use PCI_ROM_ADDRESS_MASK for ROM BARs. This means we'll only check the top
21 bits (instead of the 28 bits we used to check) of a ROM BAR to see if
the update was successful.

Signed-off-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/pci/setup-res.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)

--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -58,12 +58,17 @@ static void pci_std_update_resource(stru
return;

pcibios_resource_to_bus(dev->bus, &region, res);
+ new = region.start;

- new = region.start | (res->flags & PCI_REGION_FLAG_MASK);
- if (res->flags & IORESOURCE_IO)
+ if (res->flags & IORESOURCE_IO) {
mask = (u32)PCI_BASE_ADDRESS_IO_MASK;
- else
+ new |= res->flags & ~PCI_BASE_ADDRESS_IO_MASK;
+ } else if (resno == PCI_ROM_RESOURCE) {
+ mask = (u32)PCI_ROM_ADDRESS_MASK;
+ } else {
mask = (u32)PCI_BASE_ADDRESS_MEM_MASK;
+ new |= res->flags & ~PCI_BASE_ADDRESS_MEM_MASK;
+ }

if (resno < PCI_ROM_RESOURCE) {
reg = PCI_BASE_ADDRESS_0 + 4 * resno;
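
A worked standalone example of the masking (mask constants as in
include/uapi/linux/pci_regs.h, example addresses made up): only the property
bits defined for the BAR type are merged back into the written value.

#include <stdio.h>
#include <stdint.h>

#define PCI_BASE_ADDRESS_IO_MASK        (~0x03UL)
#define PCI_BASE_ADDRESS_MEM_MASK       (~0x0fUL)
#define PCI_ROM_ADDRESS_MASK            (~0x7ffUL)

int main(void)
{
        unsigned long start = 0xe000;   /* bus address of an I/O BAR */
        unsigned long flags = 0x1;      /* I/O space indicator bit */
        uint32_t val = start;

        /* keep only bits 0-1 for I/O BARs; bits 2-3 can never leak in */
        val |= flags & ~PCI_BASE_ADDRESS_IO_MASK;
        printf("value written: %#x\n", val);            /* 0xe001 */

        /* ROM BARs compare only the top 21 bits after readback */
        printf("ROM mask: %#x\n", (uint32_t)PCI_ROM_ADDRESS_MASK);
        return 0;
}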


2017-03-20 18:26:31

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 76/93] slub: move synchronize_sched out of slab_mutex on shrink

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vladimir Davydov <[email protected]>

[ Upstream commit 89e364db71fb5e7fc8d93228152abfa67daf35fa ]

synchronize_sched() is a heavy operation, and calling it for each cache
owned by a memory cgroup being destroyed may take quite some time. What
is worse, it's currently called under the slab_mutex, stalling all work
items doing cache creation/destruction.

Actually, there isn't much point in calling synchronize_sched() for each
cache - it's enough to call it just once - after setting cpu_partial for
all caches and before shrinking them. This way, we can also move it out
of the slab_mutex, which we have to hold for iterating over the slab
cache list.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=172991
Link: http://lkml.kernel.org/r/0a10d71ecae3db00fb4421bcd3f82bcc911f4be4.1475329751.git.vdavydov.dev@gmail.com
Signed-off-by: Vladimir Davydov <[email protected]>
Reported-by: Doug Smythies <[email protected]>
Acked-by: Joonsoo Kim <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Pekka Enberg <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
mm/slab.c | 4 ++--
mm/slab.h | 2 +-
mm/slab_common.c | 27 +++++++++++++++++++++++++--
mm/slob.c | 2 +-
mm/slub.c | 19 ++-----------------
5 files changed, 31 insertions(+), 23 deletions(-)

--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2332,7 +2332,7 @@ out:
return nr_freed;
}

-int __kmem_cache_shrink(struct kmem_cache *cachep, bool deactivate)
+int __kmem_cache_shrink(struct kmem_cache *cachep)
{
int ret = 0;
int node;
@@ -2352,7 +2352,7 @@ int __kmem_cache_shrink(struct kmem_cach

int __kmem_cache_shutdown(struct kmem_cache *cachep)
{
- return __kmem_cache_shrink(cachep, false);
+ return __kmem_cache_shrink(cachep);
}

void __kmem_cache_release(struct kmem_cache *cachep)
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -146,7 +146,7 @@ static inline unsigned long kmem_cache_f

int __kmem_cache_shutdown(struct kmem_cache *);
void __kmem_cache_release(struct kmem_cache *);
-int __kmem_cache_shrink(struct kmem_cache *, bool);
+int __kmem_cache_shrink(struct kmem_cache *);
void slab_kmem_cache_release(struct kmem_cache *);

struct seq_file;
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -573,6 +573,29 @@ void memcg_deactivate_kmem_caches(struct
get_online_cpus();
get_online_mems();

+#ifdef CONFIG_SLUB
+ /*
+ * In case of SLUB, we need to disable empty slab caching to
+ * avoid pinning the offline memory cgroup by freeable kmem
+ * pages charged to it. SLAB doesn't need this, as it
+ * periodically purges unused slabs.
+ */
+ mutex_lock(&slab_mutex);
+ list_for_each_entry(s, &slab_caches, list) {
+ c = is_root_cache(s) ? cache_from_memcg_idx(s, idx) : NULL;
+ if (c) {
+ c->cpu_partial = 0;
+ c->min_partial = 0;
+ }
+ }
+ mutex_unlock(&slab_mutex);
+ /*
+ * kmem_cache->cpu_partial is checked locklessly (see
+ * put_cpu_partial()). Make sure the change is visible.
+ */
+ synchronize_sched();
+#endif
+
mutex_lock(&slab_mutex);
list_for_each_entry(s, &slab_caches, list) {
if (!is_root_cache(s))
@@ -584,7 +607,7 @@ void memcg_deactivate_kmem_caches(struct
if (!c)
continue;

- __kmem_cache_shrink(c, true);
+ __kmem_cache_shrink(c);
arr->entries[idx] = NULL;
}
mutex_unlock(&slab_mutex);
@@ -755,7 +778,7 @@ int kmem_cache_shrink(struct kmem_cache
get_online_cpus();
get_online_mems();
kasan_cache_shrink(cachep);
- ret = __kmem_cache_shrink(cachep, false);
+ ret = __kmem_cache_shrink(cachep);
put_online_mems();
put_online_cpus();
return ret;
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -634,7 +634,7 @@ void __kmem_cache_release(struct kmem_ca
{
}

-int __kmem_cache_shrink(struct kmem_cache *d, bool deactivate)
+int __kmem_cache_shrink(struct kmem_cache *d)
{
return 0;
}
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3887,7 +3887,7 @@ EXPORT_SYMBOL(kfree);
* being allocated from last increasing the chance that the last objects
* are freed in them.
*/
-int __kmem_cache_shrink(struct kmem_cache *s, bool deactivate)
+int __kmem_cache_shrink(struct kmem_cache *s)
{
int node;
int i;
@@ -3899,21 +3899,6 @@ int __kmem_cache_shrink(struct kmem_cach
unsigned long flags;
int ret = 0;

- if (deactivate) {
- /*
- * Disable empty slabs caching. Used to avoid pinning offline
- * memory cgroups by kmem pages that can be freed.
- */
- s->cpu_partial = 0;
- s->min_partial = 0;
-
- /*
- * s->cpu_partial is checked locklessly (see put_cpu_partial),
- * so we have to make sure the change is visible.
- */
- synchronize_sched();
- }
-
flush_all(s);
for_each_kmem_cache_node(s, node, n) {
INIT_LIST_HEAD(&discard);
@@ -3970,7 +3955,7 @@ static int slab_mem_going_offline_callba

mutex_lock(&slab_mutex);
list_for_each_entry(s, &slab_caches, list)
- __kmem_cache_shrink(s, false);
+ __kmem_cache_shrink(s);
mutex_unlock(&slab_mutex);

return 0;
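
A standalone sketch of the batching pattern described in the changelog (the
grace-period wait is simulated; userspace has no synchronize_sched()): flip
the lockless-read flags for every cache under the mutex, drop the mutex, wait
once, then do the shrink pass.

#include <stdio.h>

#define NCACHES 3

static int cpu_partial[NCACHES];

static void synchronize_sched_stub(void)
{
        /* stands in for the single RCU-sched grace period */
        printf("one grace period for all %d caches\n", NCACHES);
}

int main(void)
{
        int i;

        /* pass 1 (under slab_mutex in the kernel): disable empty-slab caching */
        for (i = 0; i < NCACHES; i++)
                cpu_partial[i] = 0;

        /* mutex dropped: one wait instead of one per cache */
        synchronize_sched_stub();

        /* pass 2 (mutex retaken): shrink, readers now see the change */
        for (i = 0; i < NCACHES; i++)
                printf("shrinking cache %d\n", i);
        return 0;
}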


2017-03-20 18:27:11

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 71/93] drm/nouveau/disp/nv50-: split chid into chid.ctrl and chid.user

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ben Skeggs <[email protected]>

[ Upstream commit 4391d7f5c79a9fe6fa11cf6c160ca7f7bdb49d2a ]

GP102/GP104 make life difficult by redefining the channel indices for
some registers, but not others.

Signed-off-by: Ben Skeggs <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.c | 23 +++++----
drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.h | 6 ++
drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacgf119.c | 44 +++++++++----------
drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacgp104.c | 23 +++++----
drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacnv50.c | 44 +++++++++----------
drivers/gpu/drm/nouveau/nvkm/engine/disp/piocgf119.c | 28 ++++++------
drivers/gpu/drm/nouveau/nvkm/engine/disp/piocnv50.c | 30 ++++++------
7 files changed, 106 insertions(+), 92 deletions(-)

--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.c
@@ -82,7 +82,7 @@ nv50_disp_chan_mthd(struct nv50_disp_cha

if (mthd->addr) {
snprintf(cname_, sizeof(cname_), "%s %d",
- mthd->name, chan->chid);
+ mthd->name, chan->chid.user);
cname = cname_;
}

@@ -139,7 +139,7 @@ nv50_disp_chan_uevent_ctor(struct nvkm_o
if (!(ret = nvif_unvers(ret, &data, &size, args->none))) {
notify->size = sizeof(struct nvif_notify_uevent_rep);
notify->types = 1;
- notify->index = chan->chid;
+ notify->index = chan->chid.user;
return 0;
}

@@ -159,7 +159,7 @@ nv50_disp_chan_rd32(struct nvkm_object *
struct nv50_disp_chan *chan = nv50_disp_chan(object);
struct nv50_disp *disp = chan->root->disp;
struct nvkm_device *device = disp->base.engine.subdev.device;
- *data = nvkm_rd32(device, 0x640000 + (chan->chid * 0x1000) + addr);
+ *data = nvkm_rd32(device, 0x640000 + (chan->chid.user * 0x1000) + addr);
return 0;
}

@@ -169,7 +169,7 @@ nv50_disp_chan_wr32(struct nvkm_object *
struct nv50_disp_chan *chan = nv50_disp_chan(object);
struct nv50_disp *disp = chan->root->disp;
struct nvkm_device *device = disp->base.engine.subdev.device;
- nvkm_wr32(device, 0x640000 + (chan->chid * 0x1000) + addr, data);
+ nvkm_wr32(device, 0x640000 + (chan->chid.user * 0x1000) + addr, data);
return 0;
}

@@ -196,7 +196,7 @@ nv50_disp_chan_map(struct nvkm_object *o
struct nv50_disp *disp = chan->root->disp;
struct nvkm_device *device = disp->base.engine.subdev.device;
*addr = device->func->resource_addr(device, 0) +
- 0x640000 + (chan->chid * 0x1000);
+ 0x640000 + (chan->chid.user * 0x1000);
*size = 0x001000;
return 0;
}
@@ -243,8 +243,8 @@ nv50_disp_chan_dtor(struct nvkm_object *
{
struct nv50_disp_chan *chan = nv50_disp_chan(object);
struct nv50_disp *disp = chan->root->disp;
- if (chan->chid >= 0)
- disp->chan[chan->chid] = NULL;
+ if (chan->chid.user >= 0)
+ disp->chan[chan->chid.user] = NULL;
return chan->func->dtor ? chan->func->dtor(chan) : chan;
}

@@ -273,14 +273,15 @@ nv50_disp_chan_ctor(const struct nv50_di
chan->func = func;
chan->mthd = mthd;
chan->root = root;
- chan->chid = chid;
+ chan->chid.ctrl = chid;
+ chan->chid.user = chid;
chan->head = head;

- if (disp->chan[chan->chid]) {
- chan->chid = -1;
+ if (disp->chan[chan->chid.user]) {
+ chan->chid.user = -1;
return -EBUSY;
}
- disp->chan[chan->chid] = chan;
+ disp->chan[chan->chid.user] = chan;
return 0;
}

--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.h
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.h
@@ -7,7 +7,11 @@ struct nv50_disp_chan {
const struct nv50_disp_chan_func *func;
const struct nv50_disp_chan_mthd *mthd;
struct nv50_disp_root *root;
- int chid;
+
+ struct {
+ int ctrl;
+ int user;
+ } chid;
int head;

struct nvkm_object object;
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacgf119.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacgf119.c
@@ -32,8 +32,8 @@ gf119_disp_dmac_bind(struct nv50_disp_dm
struct nvkm_object *object, u32 handle)
{
return nvkm_ramht_insert(chan->base.root->ramht, object,
- chan->base.chid, -9, handle,
- chan->base.chid << 27 | 0x00000001);
+ chan->base.chid.user, -9, handle,
+ chan->base.chid.user << 27 | 0x00000001);
}

void
@@ -42,22 +42,23 @@ gf119_disp_dmac_fini(struct nv50_disp_dm
struct nv50_disp *disp = chan->base.root->disp;
struct nvkm_subdev *subdev = &disp->base.engine.subdev;
struct nvkm_device *device = subdev->device;
- int chid = chan->base.chid;
+ int ctrl = chan->base.chid.ctrl;
+ int user = chan->base.chid.user;

/* deactivate channel */
- nvkm_mask(device, 0x610490 + (chid * 0x0010), 0x00001010, 0x00001000);
- nvkm_mask(device, 0x610490 + (chid * 0x0010), 0x00000003, 0x00000000);
+ nvkm_mask(device, 0x610490 + (ctrl * 0x0010), 0x00001010, 0x00001000);
+ nvkm_mask(device, 0x610490 + (ctrl * 0x0010), 0x00000003, 0x00000000);
if (nvkm_msec(device, 2000,
- if (!(nvkm_rd32(device, 0x610490 + (chid * 0x10)) & 0x001e0000))
+ if (!(nvkm_rd32(device, 0x610490 + (ctrl * 0x10)) & 0x001e0000))
break;
) < 0) {
- nvkm_error(subdev, "ch %d fini: %08x\n", chid,
- nvkm_rd32(device, 0x610490 + (chid * 0x10)));
+ nvkm_error(subdev, "ch %d fini: %08x\n", user,
+ nvkm_rd32(device, 0x610490 + (ctrl * 0x10)));
}

/* disable error reporting and completion notification */
- nvkm_mask(device, 0x610090, 0x00000001 << chid, 0x00000000);
- nvkm_mask(device, 0x6100a0, 0x00000001 << chid, 0x00000000);
+ nvkm_mask(device, 0x610090, 0x00000001 << user, 0x00000000);
+ nvkm_mask(device, 0x6100a0, 0x00000001 << user, 0x00000000);
}

static int
@@ -66,26 +67,27 @@ gf119_disp_dmac_init(struct nv50_disp_dm
struct nv50_disp *disp = chan->base.root->disp;
struct nvkm_subdev *subdev = &disp->base.engine.subdev;
struct nvkm_device *device = subdev->device;
- int chid = chan->base.chid;
+ int ctrl = chan->base.chid.ctrl;
+ int user = chan->base.chid.user;

/* enable error reporting */
- nvkm_mask(device, 0x6100a0, 0x00000001 << chid, 0x00000001 << chid);
+ nvkm_mask(device, 0x6100a0, 0x00000001 << user, 0x00000001 << user);

/* initialise channel for dma command submission */
- nvkm_wr32(device, 0x610494 + (chid * 0x0010), chan->push);
- nvkm_wr32(device, 0x610498 + (chid * 0x0010), 0x00010000);
- nvkm_wr32(device, 0x61049c + (chid * 0x0010), 0x00000001);
- nvkm_mask(device, 0x610490 + (chid * 0x0010), 0x00000010, 0x00000010);
- nvkm_wr32(device, 0x640000 + (chid * 0x1000), 0x00000000);
- nvkm_wr32(device, 0x610490 + (chid * 0x0010), 0x00000013);
+ nvkm_wr32(device, 0x610494 + (ctrl * 0x0010), chan->push);
+ nvkm_wr32(device, 0x610498 + (ctrl * 0x0010), 0x00010000);
+ nvkm_wr32(device, 0x61049c + (ctrl * 0x0010), 0x00000001);
+ nvkm_mask(device, 0x610490 + (ctrl * 0x0010), 0x00000010, 0x00000010);
+ nvkm_wr32(device, 0x640000 + (ctrl * 0x1000), 0x00000000);
+ nvkm_wr32(device, 0x610490 + (ctrl * 0x0010), 0x00000013);

/* wait for it to go inactive */
if (nvkm_msec(device, 2000,
- if (!(nvkm_rd32(device, 0x610490 + (chid * 0x10)) & 0x80000000))
+ if (!(nvkm_rd32(device, 0x610490 + (ctrl * 0x10)) & 0x80000000))
break;
) < 0) {
- nvkm_error(subdev, "ch %d init: %08x\n", chid,
- nvkm_rd32(device, 0x610490 + (chid * 0x10)));
+ nvkm_error(subdev, "ch %d init: %08x\n", user,
+ nvkm_rd32(device, 0x610490 + (ctrl * 0x10)));
return -EBUSY;
}

--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacgp104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacgp104.c
@@ -32,26 +32,27 @@ gp104_disp_dmac_init(struct nv50_disp_dm
struct nv50_disp *disp = chan->base.root->disp;
struct nvkm_subdev *subdev = &disp->base.engine.subdev;
struct nvkm_device *device = subdev->device;
- int chid = chan->base.chid;
+ int ctrl = chan->base.chid.ctrl;
+ int user = chan->base.chid.user;

/* enable error reporting */
- nvkm_mask(device, 0x6100a0, 0x00000001 << chid, 0x00000001 << chid);
+ nvkm_mask(device, 0x6100a0, 0x00000001 << user, 0x00000001 << user);

/* initialise channel for dma command submission */
- nvkm_wr32(device, 0x611494 + (chid * 0x0010), chan->push);
- nvkm_wr32(device, 0x611498 + (chid * 0x0010), 0x00010000);
- nvkm_wr32(device, 0x61149c + (chid * 0x0010), 0x00000001);
- nvkm_mask(device, 0x610490 + (chid * 0x0010), 0x00000010, 0x00000010);
- nvkm_wr32(device, 0x640000 + (chid * 0x1000), 0x00000000);
- nvkm_wr32(device, 0x610490 + (chid * 0x0010), 0x00000013);
+ nvkm_wr32(device, 0x611494 + (ctrl * 0x0010), chan->push);
+ nvkm_wr32(device, 0x611498 + (ctrl * 0x0010), 0x00010000);
+ nvkm_wr32(device, 0x61149c + (ctrl * 0x0010), 0x00000001);
+ nvkm_mask(device, 0x610490 + (ctrl * 0x0010), 0x00000010, 0x00000010);
+ nvkm_wr32(device, 0x640000 + (ctrl * 0x1000), 0x00000000);
+ nvkm_wr32(device, 0x610490 + (ctrl * 0x0010), 0x00000013);

/* wait for it to go inactive */
if (nvkm_msec(device, 2000,
- if (!(nvkm_rd32(device, 0x610490 + (chid * 0x10)) & 0x80000000))
+ if (!(nvkm_rd32(device, 0x610490 + (ctrl * 0x10)) & 0x80000000))
break;
) < 0) {
- nvkm_error(subdev, "ch %d init: %08x\n", chid,
- nvkm_rd32(device, 0x610490 + (chid * 0x10)));
+ nvkm_error(subdev, "ch %d init: %08x\n", user,
+ nvkm_rd32(device, 0x610490 + (ctrl * 0x10)));
return -EBUSY;
}

--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacnv50.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacnv50.c
@@ -179,9 +179,9 @@ nv50_disp_dmac_bind(struct nv50_disp_dma
struct nvkm_object *object, u32 handle)
{
return nvkm_ramht_insert(chan->base.root->ramht, object,
- chan->base.chid, -10, handle,
- chan->base.chid << 28 |
- chan->base.chid);
+ chan->base.chid.user, -10, handle,
+ chan->base.chid.user << 28 |
+ chan->base.chid.user);
}

static void
@@ -190,21 +190,22 @@ nv50_disp_dmac_fini(struct nv50_disp_dma
struct nv50_disp *disp = chan->base.root->disp;
struct nvkm_subdev *subdev = &disp->base.engine.subdev;
struct nvkm_device *device = subdev->device;
- int chid = chan->base.chid;
+ int ctrl = chan->base.chid.ctrl;
+ int user = chan->base.chid.user;

/* deactivate channel */
- nvkm_mask(device, 0x610200 + (chid * 0x0010), 0x00001010, 0x00001000);
- nvkm_mask(device, 0x610200 + (chid * 0x0010), 0x00000003, 0x00000000);
+ nvkm_mask(device, 0x610200 + (ctrl * 0x0010), 0x00001010, 0x00001000);
+ nvkm_mask(device, 0x610200 + (ctrl * 0x0010), 0x00000003, 0x00000000);
if (nvkm_msec(device, 2000,
- if (!(nvkm_rd32(device, 0x610200 + (chid * 0x10)) & 0x001e0000))
+ if (!(nvkm_rd32(device, 0x610200 + (ctrl * 0x10)) & 0x001e0000))
break;
) < 0) {
- nvkm_error(subdev, "ch %d fini timeout, %08x\n", chid,
- nvkm_rd32(device, 0x610200 + (chid * 0x10)));
+ nvkm_error(subdev, "ch %d fini timeout, %08x\n", user,
+ nvkm_rd32(device, 0x610200 + (ctrl * 0x10)));
}

/* disable error reporting and completion notifications */
- nvkm_mask(device, 0x610028, 0x00010001 << chid, 0x00000000 << chid);
+ nvkm_mask(device, 0x610028, 0x00010001 << user, 0x00000000 << user);
}

static int
@@ -213,26 +214,27 @@ nv50_disp_dmac_init(struct nv50_disp_dma
struct nv50_disp *disp = chan->base.root->disp;
struct nvkm_subdev *subdev = &disp->base.engine.subdev;
struct nvkm_device *device = subdev->device;
- int chid = chan->base.chid;
+ int ctrl = chan->base.chid.ctrl;
+ int user = chan->base.chid.user;

/* enable error reporting */
- nvkm_mask(device, 0x610028, 0x00010000 << chid, 0x00010000 << chid);
+ nvkm_mask(device, 0x610028, 0x00010000 << user, 0x00010000 << user);

/* initialise channel for dma command submission */
- nvkm_wr32(device, 0x610204 + (chid * 0x0010), chan->push);
- nvkm_wr32(device, 0x610208 + (chid * 0x0010), 0x00010000);
- nvkm_wr32(device, 0x61020c + (chid * 0x0010), chid);
- nvkm_mask(device, 0x610200 + (chid * 0x0010), 0x00000010, 0x00000010);
- nvkm_wr32(device, 0x640000 + (chid * 0x1000), 0x00000000);
- nvkm_wr32(device, 0x610200 + (chid * 0x0010), 0x00000013);
+ nvkm_wr32(device, 0x610204 + (ctrl * 0x0010), chan->push);
+ nvkm_wr32(device, 0x610208 + (ctrl * 0x0010), 0x00010000);
+ nvkm_wr32(device, 0x61020c + (ctrl * 0x0010), ctrl);
+ nvkm_mask(device, 0x610200 + (ctrl * 0x0010), 0x00000010, 0x00000010);
+ nvkm_wr32(device, 0x640000 + (ctrl * 0x1000), 0x00000000);
+ nvkm_wr32(device, 0x610200 + (ctrl * 0x0010), 0x00000013);

/* wait for it to go inactive */
if (nvkm_msec(device, 2000,
- if (!(nvkm_rd32(device, 0x610200 + (chid * 0x10)) & 0x80000000))
+ if (!(nvkm_rd32(device, 0x610200 + (ctrl * 0x10)) & 0x80000000))
break;
) < 0) {
- nvkm_error(subdev, "ch %d init timeout, %08x\n", chid,
- nvkm_rd32(device, 0x610200 + (chid * 0x10)));
+ nvkm_error(subdev, "ch %d init timeout, %08x\n", user,
+ nvkm_rd32(device, 0x610200 + (ctrl * 0x10)));
return -EBUSY;
}

--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/piocgf119.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/piocgf119.c
@@ -32,20 +32,21 @@ gf119_disp_pioc_fini(struct nv50_disp_ch
struct nv50_disp *disp = chan->root->disp;
struct nvkm_subdev *subdev = &disp->base.engine.subdev;
struct nvkm_device *device = subdev->device;
- int chid = chan->chid;
+ int ctrl = chan->chid.ctrl;
+ int user = chan->chid.user;

- nvkm_mask(device, 0x610490 + (chid * 0x10), 0x00000001, 0x00000000);
+ nvkm_mask(device, 0x610490 + (ctrl * 0x10), 0x00000001, 0x00000000);
if (nvkm_msec(device, 2000,
- if (!(nvkm_rd32(device, 0x610490 + (chid * 0x10)) & 0x00030000))
+ if (!(nvkm_rd32(device, 0x610490 + (ctrl * 0x10)) & 0x00030000))
break;
) < 0) {
- nvkm_error(subdev, "ch %d fini: %08x\n", chid,
- nvkm_rd32(device, 0x610490 + (chid * 0x10)));
+ nvkm_error(subdev, "ch %d fini: %08x\n", user,
+ nvkm_rd32(device, 0x610490 + (ctrl * 0x10)));
}

/* disable error reporting and completion notification */
- nvkm_mask(device, 0x610090, 0x00000001 << chid, 0x00000000);
- nvkm_mask(device, 0x6100a0, 0x00000001 << chid, 0x00000000);
+ nvkm_mask(device, 0x610090, 0x00000001 << user, 0x00000000);
+ nvkm_mask(device, 0x6100a0, 0x00000001 << user, 0x00000000);
}

static int
@@ -54,20 +55,21 @@ gf119_disp_pioc_init(struct nv50_disp_ch
struct nv50_disp *disp = chan->root->disp;
struct nvkm_subdev *subdev = &disp->base.engine.subdev;
struct nvkm_device *device = subdev->device;
- int chid = chan->chid;
+ int ctrl = chan->chid.ctrl;
+ int user = chan->chid.user;

/* enable error reporting */
- nvkm_mask(device, 0x6100a0, 0x00000001 << chid, 0x00000001 << chid);
+ nvkm_mask(device, 0x6100a0, 0x00000001 << user, 0x00000001 << user);

/* activate channel */
- nvkm_wr32(device, 0x610490 + (chid * 0x10), 0x00000001);
+ nvkm_wr32(device, 0x610490 + (ctrl * 0x10), 0x00000001);
if (nvkm_msec(device, 2000,
- u32 tmp = nvkm_rd32(device, 0x610490 + (chid * 0x10));
+ u32 tmp = nvkm_rd32(device, 0x610490 + (ctrl * 0x10));
if ((tmp & 0x00030000) == 0x00010000)
break;
) < 0) {
- nvkm_error(subdev, "ch %d init: %08x\n", chid,
- nvkm_rd32(device, 0x610490 + (chid * 0x10)));
+ nvkm_error(subdev, "ch %d init: %08x\n", user,
+ nvkm_rd32(device, 0x610490 + (ctrl * 0x10)));
return -EBUSY;
}

--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/piocnv50.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/piocnv50.c
@@ -32,15 +32,16 @@ nv50_disp_pioc_fini(struct nv50_disp_cha
struct nv50_disp *disp = chan->root->disp;
struct nvkm_subdev *subdev = &disp->base.engine.subdev;
struct nvkm_device *device = subdev->device;
- int chid = chan->chid;
+ int ctrl = chan->chid.ctrl;
+ int user = chan->chid.user;

- nvkm_mask(device, 0x610200 + (chid * 0x10), 0x00000001, 0x00000000);
+ nvkm_mask(device, 0x610200 + (ctrl * 0x10), 0x00000001, 0x00000000);
if (nvkm_msec(device, 2000,
- if (!(nvkm_rd32(device, 0x610200 + (chid * 0x10)) & 0x00030000))
+ if (!(nvkm_rd32(device, 0x610200 + (ctrl * 0x10)) & 0x00030000))
break;
) < 0) {
- nvkm_error(subdev, "ch %d timeout: %08x\n", chid,
- nvkm_rd32(device, 0x610200 + (chid * 0x10)));
+ nvkm_error(subdev, "ch %d timeout: %08x\n", user,
+ nvkm_rd32(device, 0x610200 + (ctrl * 0x10)));
}
}

@@ -50,26 +51,27 @@ nv50_disp_pioc_init(struct nv50_disp_cha
struct nv50_disp *disp = chan->root->disp;
struct nvkm_subdev *subdev = &disp->base.engine.subdev;
struct nvkm_device *device = subdev->device;
- int chid = chan->chid;
+ int ctrl = chan->chid.ctrl;
+ int user = chan->chid.user;

- nvkm_wr32(device, 0x610200 + (chid * 0x10), 0x00002000);
+ nvkm_wr32(device, 0x610200 + (ctrl * 0x10), 0x00002000);
if (nvkm_msec(device, 2000,
- if (!(nvkm_rd32(device, 0x610200 + (chid * 0x10)) & 0x00030000))
+ if (!(nvkm_rd32(device, 0x610200 + (ctrl * 0x10)) & 0x00030000))
break;
) < 0) {
- nvkm_error(subdev, "ch %d timeout0: %08x\n", chid,
- nvkm_rd32(device, 0x610200 + (chid * 0x10)));
+ nvkm_error(subdev, "ch %d timeout0: %08x\n", user,
+ nvkm_rd32(device, 0x610200 + (ctrl * 0x10)));
return -EBUSY;
}

- nvkm_wr32(device, 0x610200 + (chid * 0x10), 0x00000001);
+ nvkm_wr32(device, 0x610200 + (ctrl * 0x10), 0x00000001);
if (nvkm_msec(device, 2000,
- u32 tmp = nvkm_rd32(device, 0x610200 + (chid * 0x10));
+ u32 tmp = nvkm_rd32(device, 0x610200 + (ctrl * 0x10));
if ((tmp & 0x00030000) == 0x00010000)
break;
) < 0) {
- nvkm_error(subdev, "ch %d timeout1: %08x\n", chid,
- nvkm_rd32(device, 0x610200 + (chid * 0x10)));
+ nvkm_error(subdev, "ch %d timeout1: %08x\n", user,
+ nvkm_rd32(device, 0x610200 + (ctrl * 0x10)));
return -EBUSY;
}



2017-03-20 18:27:13

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 69/93] vfio/spapr: Postpone default window creation

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Alexey Kardashevskiy <[email protected]>

[ Upstream commit d9c728949ddc9de5734bf3b12ea906ca8a77f2a0 ]

We are going to allow the userspace to configure a container in
one memory context and pass the container fd to another, so
we are postponing memory allocations accounted against
the locked memory limit. One of the previous patches took care of
it_userspace.

At the moment we create the default DMA window when the first group is
attached to a container; this is done for userspace which is not
DDW-aware but is familiar with the memory pre-registration part of the
SPAPR TCE IOMMU v2 - such a client expects the default DMA window to exist.

This postpones the default DMA window allocation till one of
the following happens:
1. the first map/unmap request arrives;
2. a new window is requested.
This adds a no-op for the case when the userspace requested removal
of the default window which has not been created yet.
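
To make the new ordering easier to follow, here is a minimal user-space
model of the state machine this patch introduces (plain C with
simplified error codes; it only mirrors the logic of the diff below and
is not the kernel code itself):

#include <stdbool.h>
#include <stdio.h>

struct container {
	bool def_window_pending;	/* set at group attach time */
	bool window_exists;
};

/* Called from the map/unmap and create-window paths; a no-op once the
 * window exists or was explicitly removed before being created. */
static int create_default_window(struct container *c)
{
	if (!c->def_window_pending)
		return 0;
	c->window_exists = true;	/* stands in for tce_iommu_create_window() */
	c->def_window_pending = false;
	return 0;
}

/* Removing the default window before it was materialised succeeds as
 * a no-op, exactly as the new check in the remove-window path does. */
static int remove_window(struct container *c, unsigned long start_addr)
{
	if (c->def_window_pending && !start_addr) {
		c->def_window_pending = false;
		return 0;
	}
	if (!c->window_exists)
		return -19;		/* -ENODEV */
	c->window_exists = false;
	return 0;
}

int main(void)
{
	struct container a = { .def_window_pending = true };
	struct container b = { .def_window_pending = true };

	create_default_window(&a);	/* first map/unmap request */
	printf("a: window_exists=%d\n", a.window_exists);	/* 1 */
	printf("b: remove=%d\n", remove_window(&b, 0));		/* 0, no-op */
	return 0;
}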

Signed-off-by: Alexey Kardashevskiy <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Acked-by: Alex Williamson <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/vfio/vfio_iommu_spapr_tce.c | 40 ++++++++++++++++++++++--------------
1 file changed, 25 insertions(+), 15 deletions(-)

--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -106,6 +106,7 @@ struct tce_container {
struct mutex lock;
bool enabled;
bool v2;
+ bool def_window_pending;
unsigned long locked_pages;
struct mm_struct *mm;
struct iommu_table *tables[IOMMU_TABLE_GROUP_MAX_TABLES];
@@ -791,6 +792,9 @@ static long tce_iommu_create_default_win
struct tce_iommu_group *tcegrp;
struct iommu_table_group *table_group;

+ if (!container->def_window_pending)
+ return 0;
+
if (!tce_groups_attached(container))
return -ENODEV;

@@ -804,6 +808,9 @@ static long tce_iommu_create_default_win
table_group->tce32_size, 1, &start_addr);
WARN_ON_ONCE(!ret && start_addr);

+ if (!ret)
+ container->def_window_pending = false;
+
return ret;
}

@@ -907,6 +914,10 @@ static long tce_iommu_ioctl(void *iommu_
VFIO_DMA_MAP_FLAG_WRITE))
return -EINVAL;

+ ret = tce_iommu_create_default_window(container);
+ if (ret)
+ return ret;
+
num = tce_iommu_find_table(container, param.iova, &tbl);
if (num < 0)
return -ENXIO;
@@ -970,6 +981,10 @@ static long tce_iommu_ioctl(void *iommu_
if (param.flags)
return -EINVAL;

+ ret = tce_iommu_create_default_window(container);
+ if (ret)
+ return ret;
+
num = tce_iommu_find_table(container, param.iova, &tbl);
if (num < 0)
return -ENXIO;
@@ -1107,6 +1122,10 @@ static long tce_iommu_ioctl(void *iommu_

mutex_lock(&container->lock);

+ ret = tce_iommu_create_default_window(container);
+ if (ret)
+ return ret;
+
ret = tce_iommu_create_window(container, create.page_shift,
create.window_size, create.levels,
&create.start_addr);
@@ -1143,6 +1162,11 @@ static long tce_iommu_ioctl(void *iommu_
if (remove.flags)
return -EINVAL;

+ if (container->def_window_pending && !remove.start_addr) {
+ container->def_window_pending = false;
+ return 0;
+ }
+
mutex_lock(&container->lock);

ret = tce_iommu_remove_window(container, remove.start_addr);
@@ -1240,7 +1264,6 @@ static int tce_iommu_attach_group(void *
struct tce_container *container = iommu_data;
struct iommu_table_group *table_group;
struct tce_iommu_group *tcegrp = NULL;
- bool create_default_window = false;

mutex_lock(&container->lock);

@@ -1288,25 +1311,12 @@ static int tce_iommu_attach_group(void *
} else {
ret = tce_iommu_take_ownership_ddw(container, table_group);
if (!tce_groups_attached(container) && !container->tables[0])
- create_default_window = true;
+ container->def_window_pending = true;
}

if (!ret) {
tcegrp->grp = iommu_group;
list_add(&tcegrp->next, &container->group_list);
- /*
- * If it the first group attached, check if there is
- * a default DMA window and create one if none as
- * the userspace expects it to exist.
- */
- if (create_default_window) {
- ret = tce_iommu_create_default_window(container);
- if (ret) {
- list_del(&tcegrp->next);
- tce_iommu_release_ownership_ddw(container,
- table_group);
- }
- }
}

unlock_exit:


2017-03-20 18:27:21

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 70/93] drm/nouveau/disp/gp102: fix cursor/overlay immediate channel indices

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ben Skeggs <[email protected]>

[ Upstream commit e50fcff15fe120ef2103a9e18af6644235c2b14d ]

Signed-off-by: Ben Skeggs <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/nouveau/nvkm/engine/disp/Kbuild | 2 +
drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.h | 2 +
drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgp102.c | 37 +++++++++++++++++++
drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgp102.c | 37 +++++++++++++++++++
drivers/gpu/drm/nouveau/nvkm/engine/disp/rootgp104.c | 4 +-
5 files changed, 80 insertions(+), 2 deletions(-)
create mode 100644 drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgp102.c
create mode 100644 drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgp102.c

--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/Kbuild
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/Kbuild
@@ -95,9 +95,11 @@ nvkm-y += nvkm/engine/disp/cursg84.o
nvkm-y += nvkm/engine/disp/cursgt215.o
nvkm-y += nvkm/engine/disp/cursgf119.o
nvkm-y += nvkm/engine/disp/cursgk104.o
+nvkm-y += nvkm/engine/disp/cursgp102.o

nvkm-y += nvkm/engine/disp/oimmnv50.o
nvkm-y += nvkm/engine/disp/oimmg84.o
nvkm-y += nvkm/engine/disp/oimmgt215.o
nvkm-y += nvkm/engine/disp/oimmgf119.o
nvkm-y += nvkm/engine/disp/oimmgk104.o
+nvkm-y += nvkm/engine/disp/oimmgp102.o
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.h
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.h
@@ -114,6 +114,8 @@ extern const struct nv50_disp_pioc_oclas
extern const struct nv50_disp_pioc_oclass gk104_disp_oimm_oclass;
extern const struct nv50_disp_pioc_oclass gk104_disp_curs_oclass;

+extern const struct nv50_disp_pioc_oclass gp102_disp_oimm_oclass;
+extern const struct nv50_disp_pioc_oclass gp102_disp_curs_oclass;

int nv50_disp_curs_new(const struct nv50_disp_chan_func *,
const struct nv50_disp_chan_mthd *,
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgp102.c
@@ -0,0 +1,37 @@
+/*
+ * Copyright 2016 Red Hat Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: Ben Skeggs <[email protected]>
+ */
+#include "channv50.h"
+#include "rootnv50.h"
+
+#include <nvif/class.h>
+
+const struct nv50_disp_pioc_oclass
+gp102_disp_curs_oclass = {
+ .base.oclass = GK104_DISP_CURSOR,
+ .base.minver = 0,
+ .base.maxver = 0,
+ .ctor = nv50_disp_curs_new,
+ .func = &gf119_disp_pioc_func,
+ .chid = { 13, 17 },
+};
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgp102.c
@@ -0,0 +1,37 @@
+/*
+ * Copyright 2016 Red Hat Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: Ben Skeggs <[email protected]>
+ */
+#include "channv50.h"
+#include "rootnv50.h"
+
+#include <nvif/class.h>
+
+const struct nv50_disp_pioc_oclass
+gp102_disp_oimm_oclass = {
+ .base.oclass = GK104_DISP_OVERLAY,
+ .base.minver = 0,
+ .base.maxver = 0,
+ .ctor = nv50_disp_oimm_new,
+ .func = &gf119_disp_pioc_func,
+ .chid = { 9, 13 },
+};
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/rootgp104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/rootgp104.c
@@ -36,8 +36,8 @@ gp104_disp_root = {
&gp104_disp_ovly_oclass,
},
.pioc = {
- &gk104_disp_oimm_oclass,
- &gk104_disp_curs_oclass,
+ &gp102_disp_oimm_oclass,
+ &gp102_disp_curs_oclass,
},
};



2017-03-20 18:27:17

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 68/93] vfio/spapr: Add a helper to create default DMA window

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Alexey Kardashevskiy <[email protected]>

[ Upstream commit 6f01cc692a16405235d5c34056455b182682123c ]

There is already a helper to create a DMA window which allocates
a table and programs it into the IOMMU group. However
tce_iommu_take_ownership_ddw() did not use it and made these two calls
itself to simplify the error path.

Since we are going to delay the default window creation till
the default window is accessed/removed or a new window is added,
we need a helper that can create the default window from all these paths.

This adds tce_iommu_create_default_window(). Since it relies on
a VFIO container to have at least one IOMMU group (for future use),
this changes tce_iommu_attach_group() to add a group to the container
first and then call the new helper.
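
As a minimal illustration of that ordering (a user-space sketch of the
logic only, with a counter standing in for the group list and
simplified error codes - not the kernel code itself):

#include <stdio.h>

struct container {
	int ngroups;		/* stands in for container->group_list */
	int window_ok;
};

/* Relies on at least one attached group, like the new helper does. */
static int create_default_window(struct container *c)
{
	if (!c->ngroups)
		return -19;	/* -ENODEV */
	c->window_ok = 1;
	return 0;
}

static int attach_group(struct container *c)
{
	int ret;

	c->ngroups++;		/* add the group to the container first */
	ret = create_default_window(c);
	if (ret)
		c->ngroups--;	/* unwind, as the error path list_del() does */
	return ret;
}

int main(void)
{
	struct container c = { 0 };

	printf("attach=%d window_ok=%d\n", attach_group(&c), c.window_ok);
	return 0;
}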

Signed-off-by: Alexey Kardashevskiy <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Acked-by: Alex Williamson <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/vfio/vfio_iommu_spapr_tce.c | 87 +++++++++++++++++-------------------
1 file changed, 42 insertions(+), 45 deletions(-)

--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -784,6 +784,29 @@ static long tce_iommu_remove_window(stru
return 0;
}

+static long tce_iommu_create_default_window(struct tce_container *container)
+{
+ long ret;
+ __u64 start_addr = 0;
+ struct tce_iommu_group *tcegrp;
+ struct iommu_table_group *table_group;
+
+ if (!tce_groups_attached(container))
+ return -ENODEV;
+
+ tcegrp = list_first_entry(&container->group_list,
+ struct tce_iommu_group, next);
+ table_group = iommu_group_get_iommudata(tcegrp->grp);
+ if (!table_group)
+ return -ENODEV;
+
+ ret = tce_iommu_create_window(container, IOMMU_PAGE_SHIFT_4K,
+ table_group->tce32_size, 1, &start_addr);
+ WARN_ON_ONCE(!ret && start_addr);
+
+ return ret;
+}
+
static long tce_iommu_ioctl(void *iommu_data,
unsigned int cmd, unsigned long arg)
{
@@ -1199,9 +1222,6 @@ static void tce_iommu_release_ownership_
static long tce_iommu_take_ownership_ddw(struct tce_container *container,
struct iommu_table_group *table_group)
{
- long i, ret = 0;
- struct iommu_table *tbl = NULL;
-
if (!table_group->ops->create_table || !table_group->ops->set_window ||
!table_group->ops->release_ownership) {
WARN_ON_ONCE(1);
@@ -1210,47 +1230,7 @@ static long tce_iommu_take_ownership_ddw

table_group->ops->take_ownership(table_group);

- /*
- * If it the first group attached, check if there is
- * a default DMA window and create one if none as
- * the userspace expects it to exist.
- */
- if (!tce_groups_attached(container) && !container->tables[0]) {
- ret = tce_iommu_create_table(container,
- table_group,
- 0, /* window number */
- IOMMU_PAGE_SHIFT_4K,
- table_group->tce32_size,
- 1, /* default levels */
- &tbl);
- if (ret)
- goto release_exit;
- else
- container->tables[0] = tbl;
- }
-
- /* Set all windows to the new group */
- for (i = 0; i < IOMMU_TABLE_GROUP_MAX_TABLES; ++i) {
- tbl = container->tables[i];
-
- if (!tbl)
- continue;
-
- /* Set the default window to a new group */
- ret = table_group->ops->set_window(table_group, i, tbl);
- if (ret)
- goto release_exit;
- }
-
return 0;
-
-release_exit:
- for (i = 0; i < IOMMU_TABLE_GROUP_MAX_TABLES; ++i)
- table_group->ops->unset_window(table_group, i);
-
- table_group->ops->release_ownership(table_group);
-
- return ret;
}

static int tce_iommu_attach_group(void *iommu_data,
@@ -1260,6 +1240,7 @@ static int tce_iommu_attach_group(void *
struct tce_container *container = iommu_data;
struct iommu_table_group *table_group;
struct tce_iommu_group *tcegrp = NULL;
+ bool create_default_window = false;

mutex_lock(&container->lock);

@@ -1302,14 +1283,30 @@ static int tce_iommu_attach_group(void *
}

if (!table_group->ops || !table_group->ops->take_ownership ||
- !table_group->ops->release_ownership)
+ !table_group->ops->release_ownership) {
ret = tce_iommu_take_ownership(container, table_group);
- else
+ } else {
ret = tce_iommu_take_ownership_ddw(container, table_group);
+ if (!tce_groups_attached(container) && !container->tables[0])
+ create_default_window = true;
+ }

if (!ret) {
tcegrp->grp = iommu_group;
list_add(&tcegrp->next, &container->group_list);
+ /*
+ * If it the first group attached, check if there is
+ * a default DMA window and create one if none as
+ * the userspace expects it to exist.
+ */
+ if (create_default_window) {
+ ret = tce_iommu_create_default_window(container);
+ if (ret) {
+ list_del(&tcegrp->next);
+ tce_iommu_release_ownership_ddw(container,
+ table_group);
+ }
+ }
}

unlock_exit:


2017-03-20 18:27:09

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 72/93] drm/nouveau/disp/nv50-: specify ctrl/user separately when constructing classes

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ben Skeggs <[email protected]>

[ Upstream commit 2a32b9b1866a2ee9f01fbf2a48d99012f0120739 ]

Signed-off-by: Ben Skeggs <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.c | 11 ++++++-----
drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.h | 15 +++++++++------
drivers/gpu/drm/nouveau/nvkm/engine/disp/cursg84.c | 2 +-
drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgf119.c | 2 +-
drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgk104.c | 2 +-
drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgt215.c | 2 +-
drivers/gpu/drm/nouveau/nvkm/engine/disp/cursnv50.c | 6 +++---
drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacnv50.c | 2 +-
drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmg84.c | 2 +-
drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgf119.c | 2 +-
drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgk104.c | 2 +-
drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgt215.c | 2 +-
drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmnv50.c | 6 +++---
drivers/gpu/drm/nouveau/nvkm/engine/disp/rootnv50.c | 4 ++--
14 files changed, 32 insertions(+), 28 deletions(-)

--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.c
@@ -263,7 +263,7 @@ nv50_disp_chan = {
int
nv50_disp_chan_ctor(const struct nv50_disp_chan_func *func,
const struct nv50_disp_chan_mthd *mthd,
- struct nv50_disp_root *root, int chid, int head,
+ struct nv50_disp_root *root, int ctrl, int user, int head,
const struct nvkm_oclass *oclass,
struct nv50_disp_chan *chan)
{
@@ -273,8 +273,8 @@ nv50_disp_chan_ctor(const struct nv50_di
chan->func = func;
chan->mthd = mthd;
chan->root = root;
- chan->chid.ctrl = chid;
- chan->chid.user = chid;
+ chan->chid.ctrl = ctrl;
+ chan->chid.user = user;
chan->head = head;

if (disp->chan[chan->chid.user]) {
@@ -288,7 +288,7 @@ nv50_disp_chan_ctor(const struct nv50_di
int
nv50_disp_chan_new_(const struct nv50_disp_chan_func *func,
const struct nv50_disp_chan_mthd *mthd,
- struct nv50_disp_root *root, int chid, int head,
+ struct nv50_disp_root *root, int ctrl, int user, int head,
const struct nvkm_oclass *oclass,
struct nvkm_object **pobject)
{
@@ -298,5 +298,6 @@ nv50_disp_chan_new_(const struct nv50_di
return -ENOMEM;
*pobject = &chan->object;

- return nv50_disp_chan_ctor(func, mthd, root, chid, head, oclass, chan);
+ return nv50_disp_chan_ctor(func, mthd, root, ctrl, user,
+ head, oclass, chan);
}
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.h
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.h
@@ -29,11 +29,11 @@ struct nv50_disp_chan_func {

int nv50_disp_chan_ctor(const struct nv50_disp_chan_func *,
const struct nv50_disp_chan_mthd *,
- struct nv50_disp_root *, int chid, int head,
+ struct nv50_disp_root *, int ctrl, int user, int head,
const struct nvkm_oclass *, struct nv50_disp_chan *);
int nv50_disp_chan_new_(const struct nv50_disp_chan_func *,
const struct nv50_disp_chan_mthd *,
- struct nv50_disp_root *, int chid, int head,
+ struct nv50_disp_root *, int ctrl, int user, int head,
const struct nvkm_oclass *, struct nvkm_object **);

extern const struct nv50_disp_chan_func nv50_disp_pioc_func;
@@ -94,13 +94,16 @@ extern const struct nv50_disp_chan_mthd
struct nv50_disp_pioc_oclass {
int (*ctor)(const struct nv50_disp_chan_func *,
const struct nv50_disp_chan_mthd *,
- struct nv50_disp_root *, int chid,
+ struct nv50_disp_root *, int ctrl, int user,
const struct nvkm_oclass *, void *data, u32 size,
struct nvkm_object **);
struct nvkm_sclass base;
const struct nv50_disp_chan_func *func;
const struct nv50_disp_chan_mthd *mthd;
- int chid;
+ struct {
+ int ctrl;
+ int user;
+ } chid;
};

extern const struct nv50_disp_pioc_oclass nv50_disp_oimm_oclass;
@@ -123,12 +126,12 @@ extern const struct nv50_disp_pioc_oclas

int nv50_disp_curs_new(const struct nv50_disp_chan_func *,
const struct nv50_disp_chan_mthd *,
- struct nv50_disp_root *, int chid,
+ struct nv50_disp_root *, int ctrl, int user,
const struct nvkm_oclass *, void *data, u32 size,
struct nvkm_object **);
int nv50_disp_oimm_new(const struct nv50_disp_chan_func *,
const struct nv50_disp_chan_mthd *,
- struct nv50_disp_root *, int chid,
+ struct nv50_disp_root *, int ctrl, int user,
const struct nvkm_oclass *, void *data, u32 size,
struct nvkm_object **);
#endif
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/cursg84.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/cursg84.c
@@ -33,5 +33,5 @@ g84_disp_curs_oclass = {
.base.maxver = 0,
.ctor = nv50_disp_curs_new,
.func = &nv50_disp_pioc_func,
- .chid = 7,
+ .chid = { 7, 7 },
};
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgf119.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgf119.c
@@ -33,5 +33,5 @@ gf119_disp_curs_oclass = {
.base.maxver = 0,
.ctor = nv50_disp_curs_new,
.func = &gf119_disp_pioc_func,
- .chid = 13,
+ .chid = { 13, 13 },
};
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgk104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgk104.c
@@ -33,5 +33,5 @@ gk104_disp_curs_oclass = {
.base.maxver = 0,
.ctor = nv50_disp_curs_new,
.func = &gf119_disp_pioc_func,
- .chid = 13,
+ .chid = { 13, 13 },
};
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgt215.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/cursgt215.c
@@ -33,5 +33,5 @@ gt215_disp_curs_oclass = {
.base.maxver = 0,
.ctor = nv50_disp_curs_new,
.func = &nv50_disp_pioc_func,
- .chid = 7,
+ .chid = { 7, 7 },
};
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/cursnv50.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/cursnv50.c
@@ -33,7 +33,7 @@
int
nv50_disp_curs_new(const struct nv50_disp_chan_func *func,
const struct nv50_disp_chan_mthd *mthd,
- struct nv50_disp_root *root, int chid,
+ struct nv50_disp_root *root, int ctrl, int user,
const struct nvkm_oclass *oclass, void *data, u32 size,
struct nvkm_object **pobject)
{
@@ -54,7 +54,7 @@ nv50_disp_curs_new(const struct nv50_dis
} else
return ret;

- return nv50_disp_chan_new_(func, mthd, root, chid + head,
+ return nv50_disp_chan_new_(func, mthd, root, ctrl + head, user + head,
head, oclass, pobject);
}

@@ -65,5 +65,5 @@ nv50_disp_curs_oclass = {
.base.maxver = 0,
.ctor = nv50_disp_curs_new,
.func = &nv50_disp_pioc_func,
- .chid = 7,
+ .chid = { 7, 7 },
};
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacnv50.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/dmacnv50.c
@@ -149,7 +149,7 @@ nv50_disp_dmac_new_(const struct nv50_di
chan->func = func;

ret = nv50_disp_chan_ctor(&nv50_disp_dmac_func_, mthd, root,
- chid, head, oclass, &chan->base);
+ chid, chid, head, oclass, &chan->base);
if (ret)
return ret;

--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmg84.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmg84.c
@@ -33,5 +33,5 @@ g84_disp_oimm_oclass = {
.base.maxver = 0,
.ctor = nv50_disp_oimm_new,
.func = &nv50_disp_pioc_func,
- .chid = 5,
+ .chid = { 5, 5 },
};
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgf119.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgf119.c
@@ -33,5 +33,5 @@ gf119_disp_oimm_oclass = {
.base.maxver = 0,
.ctor = nv50_disp_oimm_new,
.func = &gf119_disp_pioc_func,
- .chid = 9,
+ .chid = { 9, 9 },
};
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgk104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgk104.c
@@ -33,5 +33,5 @@ gk104_disp_oimm_oclass = {
.base.maxver = 0,
.ctor = nv50_disp_oimm_new,
.func = &gf119_disp_pioc_func,
- .chid = 9,
+ .chid = { 9, 9 },
};
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgt215.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmgt215.c
@@ -33,5 +33,5 @@ gt215_disp_oimm_oclass = {
.base.maxver = 0,
.ctor = nv50_disp_oimm_new,
.func = &nv50_disp_pioc_func,
- .chid = 5,
+ .chid = { 5, 5 },
};
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmnv50.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/oimmnv50.c
@@ -33,7 +33,7 @@
int
nv50_disp_oimm_new(const struct nv50_disp_chan_func *func,
const struct nv50_disp_chan_mthd *mthd,
- struct nv50_disp_root *root, int chid,
+ struct nv50_disp_root *root, int ctrl, int user,
const struct nvkm_oclass *oclass, void *data, u32 size,
struct nvkm_object **pobject)
{
@@ -54,7 +54,7 @@ nv50_disp_oimm_new(const struct nv50_dis
} else
return ret;

- return nv50_disp_chan_new_(func, mthd, root, chid + head,
+ return nv50_disp_chan_new_(func, mthd, root, ctrl + head, user + head,
head, oclass, pobject);
}

@@ -65,5 +65,5 @@ nv50_disp_oimm_oclass = {
.base.maxver = 0,
.ctor = nv50_disp_oimm_new,
.func = &nv50_disp_pioc_func,
- .chid = 5,
+ .chid = { 5, 5 },
};
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/rootnv50.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/rootnv50.c
@@ -207,8 +207,8 @@ nv50_disp_root_pioc_new_(const struct nv
{
const struct nv50_disp_pioc_oclass *sclass = oclass->priv;
struct nv50_disp_root *root = nv50_disp_root(oclass->parent);
- return sclass->ctor(sclass->func, sclass->mthd, root, sclass->chid,
- oclass, data, size, pobject);
+ return sclass->ctor(sclass->func, sclass->mthd, root, sclass->chid.ctrl,
+ sclass->chid.user, oclass, data, size, pobject);
}

static int


2017-03-20 18:27:06

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 73/93] block: allow WRITE_SAME commands with the SG_IO ioctl

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mauricio Faria de Oliveira <[email protected]>

[ Upstream commit 25cdb64510644f3e854d502d69c73f21c6df88a9 ]

The WRITE_SAME commands are not present in the blk_default_cmd_filter
write_ok list, and thus are failed with -EPERM when the SG_IO ioctl()
is executed without CAP_SYS_RAWIO capability (e.g., unprivileged users).
[ sg_io() -> blk_fill_sghdr_rq() > blk_verify_command() -> -EPERM ]

The problem can be reproduced with the sg_write_same command:

# sg_write_same --num 1 --xferlen 512 /dev/sda
#

# capsh --drop=cap_sys_rawio -- -c \
'sg_write_same --num 1 --xferlen 512 /dev/sda'
Write same: pass through os error: Operation not permitted
#

For comparison, the WRITE_VERIFY command does not observe this problem,
since it is in that list:

# capsh --drop=cap_sys_rawio -- -c \
'sg_write_verify --num 1 --ilen 512 --lba 0 /dev/sda'
#

So, this patch adds the WRITE_SAME commands to the list, in order
for the SG_IO ioctl to finish successfully:

# capsh --drop=cap_sys_rawio -- -c \
'sg_write_same --num 1 --xferlen 512 /dev/sda'
#

That case happens to be exercised by QEMU KVM guests with 'scsi-block' devices
(qemu "-device scsi-block" [1], libvirt "<disk type='block' device='lun'>" [2]),
which employ the SG_IO ioctl() and run as an unprivileged user (libvirt-qemu).

In that scenario, when a filesystem (e.g., ext4) performs its zero-out calls,
which are translated to write-same calls in the guest kernel, and then into
SG_IO ioctls to the host kernel, SCSI I/O errors may be observed in the guest:

[...] sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[...] sd 0:0:0:0: [sda] tag#0 Sense Key : Aborted Command [current]
[...] sd 0:0:0:0: [sda] tag#0 Add. Sense: I/O process terminated
[...] sd 0:0:0:0: [sda] tag#0 CDB: Write Same(10) 41 00 01 04 e0 78 00 00 08 00
[...] blk_update_request: I/O error, dev sda, sector 17096824

Links:
[1] http://git.qemu.org/?p=qemu.git;a=commit;h=336a6915bc7089fb20fea4ba99972ad9a97c5f52
[2] https://libvirt.org/formatdomain.html#elementsDisks (see 'disk' -> 'device')
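
For readers unfamiliar with the filter, a minimal model of the
blk_verify_command() bitmap check follows (user-space C; the opcode
0x41 for WRITE SAME(10) matches the CDB in the log above, but the
bitmap layout is simplified compared to the kernel's):

#include <stdio.h>

#define WRITE_SAME 0x41		/* WRITE SAME(10) opcode */

static unsigned char write_ok[256 / 8];	/* one bit per SCSI opcode */

static void set_ok(unsigned char op)
{
	write_ok[op / 8] |= 1u << (op % 8);	/* __set_bit() equivalent */
}

static int is_ok(unsigned char op)
{
	return write_ok[op / 8] & (1u << (op % 8));
}

int main(void)
{
	printf("before: %s\n", is_ok(WRITE_SAME) ? "allowed" : "-EPERM");
	set_ok(WRITE_SAME);	/* what this patch's __set_bit() calls add */
	printf("after:  %s\n", is_ok(WRITE_SAME) ? "allowed" : "-EPERM");
	return 0;
}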

Signed-off-by: Mauricio Faria de Oliveira <[email protected]>
Signed-off-by: Brahadambal Srinivasan <[email protected]>
Reported-by: Manjunatha H R <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
block/scsi_ioctl.c | 3 +++
1 file changed, 3 insertions(+)

--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -182,6 +182,9 @@ static void blk_set_cmd_filter_defaults(
__set_bit(WRITE_16, filter->write_ok);
__set_bit(WRITE_LONG, filter->write_ok);
__set_bit(WRITE_LONG_2, filter->write_ok);
+ __set_bit(WRITE_SAME, filter->write_ok);
+ __set_bit(WRITE_SAME_16, filter->write_ok);
+ __set_bit(WRITE_SAME_32, filter->write_ok);
__set_bit(ERASE, filter->write_ok);
__set_bit(GPCMD_MODE_SELECT_10, filter->write_ok);
__set_bit(MODE_SELECT, filter->write_ok);


2017-03-20 18:27:03

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 67/93] powerpc/mm/iommu, vfio/spapr: Put pages on VFIO container shutdown

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Alexey Kardashevskiy <[email protected]>

[ Upstream commit 4b6fad7097f883335b6d9627c883cb7f276d94c9 ]

At the moment the userspace tool is expected to request pinning of
the entire guest RAM when VFIO IOMMU SPAPR v2 driver is present.
When the userspace process finishes, all the pinned pages need to
be put; this is done as a part of the userspace memory context (MM)
destruction which happens on the very last mmdrop().

This approach has a problem: the MM of the userspace process
may live longer than the userspace process itself, as kernel threads
use the userspace process MM which was running on the CPU where
the kernel thread was scheduled. If this happens, the MM remains
referenced until that exact kernel thread wakes up again
and releases the very last reference to the MM; on an idle system this
can take hours.

This moves preregistered regions tracking from MM to VFIO; instead of
using mm_iommu_table_group_mem_t::used, tce_container::prereg_list is
added so each container releases the regions which it has pre-registered.

This changes the userspace interface to return EBUSY if a memory
region is already registered in a container. However, it should not
have any practical effect as the only userspace tool available now
registers a memory region only once per container anyway.

As tce_iommu_register_pages/tce_iommu_unregister_pages are called
under container->lock, this does not need additional locking.
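
A minimal user-space model of the per-container tracking described
above (a plain singly-linked list standing in for prereg_list, with
simplified error codes; not the kernel code itself):

#include <stdio.h>
#include <stdlib.h>

struct prereg {
	struct prereg *next;
	void *mem;		/* stands in for mm_iommu_table_group_mem_t */
};

struct container {
	struct prereg *prereg_list;
};

static int register_pages(struct container *c, void *mem)
{
	struct prereg *p;

	for (p = c->prereg_list; p; p = p->next)
		if (p->mem == mem)
			return -16;	/* -EBUSY: already registered here */

	p = malloc(sizeof(*p));
	if (!p)
		return -12;		/* -ENOMEM */
	p->mem = mem;
	p->next = c->prereg_list;
	c->prereg_list = p;
	return 0;
}

/* Container teardown now puts every region it referenced itself,
 * instead of leaving that to the final mmdrop(). */
static void release(struct container *c)
{
	while (c->prereg_list) {
		struct prereg *p = c->prereg_list;

		c->prereg_list = p->next;
		free(p);		/* real code: mm_iommu_put() first */
	}
}

int main(void)
{
	struct container c = { 0 };
	int region;

	printf("first:  %d\n", register_pages(&c, &region));	/* 0 */
	printf("again:  %d\n", register_pages(&c, &region));	/* -16 */
	release(&c);
	return 0;
}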

Signed-off-by: Alexey Kardashevskiy <[email protected]>
Reviewed-by: Nicholas Piggin <[email protected]>
Acked-by: Alex Williamson <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/powerpc/mm/mmu_context_book3s64.c | 4 --
arch/powerpc/mm/mmu_context_iommu.c | 11 -----
drivers/vfio/vfio_iommu_spapr_tce.c | 61 ++++++++++++++++++++++++++++++++-
3 files changed, 61 insertions(+), 15 deletions(-)

--- a/arch/powerpc/mm/mmu_context_book3s64.c
+++ b/arch/powerpc/mm/mmu_context_book3s64.c
@@ -156,13 +156,11 @@ static inline void destroy_pagetable_pag
}
#endif

-
void destroy_context(struct mm_struct *mm)
{
#ifdef CONFIG_SPAPR_TCE_IOMMU
- mm_iommu_cleanup(mm);
+ WARN_ON_ONCE(!list_empty(&mm->context.iommu_group_mem_list));
#endif
-
#ifdef CONFIG_PPC_ICSWX
drop_cop(mm->context.acop, mm);
kfree(mm->context.cop_lockp);
--- a/arch/powerpc/mm/mmu_context_iommu.c
+++ b/arch/powerpc/mm/mmu_context_iommu.c
@@ -365,14 +365,3 @@ void mm_iommu_init(struct mm_struct *mm)
{
INIT_LIST_HEAD_RCU(&mm->context.iommu_group_mem_list);
}
-
-void mm_iommu_cleanup(struct mm_struct *mm)
-{
- struct mm_iommu_table_group_mem_t *mem, *tmp;
-
- list_for_each_entry_safe(mem, tmp, &mm->context.iommu_group_mem_list,
- next) {
- list_del_rcu(&mem->next);
- mm_iommu_do_free(mem);
- }
-}
--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -89,6 +89,15 @@ struct tce_iommu_group {
};

/*
+ * A container needs to remember which preregistered region it has
+ * referenced to do proper cleanup at the userspace process exit.
+ */
+struct tce_iommu_prereg {
+ struct list_head next;
+ struct mm_iommu_table_group_mem_t *mem;
+};
+
+/*
* The container descriptor supports only a single group per container.
* Required by the API as the container is not supplied with the IOMMU group
* at the moment of initialization.
@@ -101,6 +110,7 @@ struct tce_container {
struct mm_struct *mm;
struct iommu_table *tables[IOMMU_TABLE_GROUP_MAX_TABLES];
struct list_head group_list;
+ struct list_head prereg_list;
};

static long tce_iommu_mm_set(struct tce_container *container)
@@ -117,10 +127,27 @@ static long tce_iommu_mm_set(struct tce_
return 0;
}

+static long tce_iommu_prereg_free(struct tce_container *container,
+ struct tce_iommu_prereg *tcemem)
+{
+ long ret;
+
+ ret = mm_iommu_put(container->mm, tcemem->mem);
+ if (ret)
+ return ret;
+
+ list_del(&tcemem->next);
+ kfree(tcemem);
+
+ return 0;
+}
+
static long tce_iommu_unregister_pages(struct tce_container *container,
__u64 vaddr, __u64 size)
{
struct mm_iommu_table_group_mem_t *mem;
+ struct tce_iommu_prereg *tcemem;
+ bool found = false;

if ((vaddr & ~PAGE_MASK) || (size & ~PAGE_MASK))
return -EINVAL;
@@ -129,7 +156,17 @@ static long tce_iommu_unregister_pages(s
if (!mem)
return -ENOENT;

- return mm_iommu_put(container->mm, mem);
+ list_for_each_entry(tcemem, &container->prereg_list, next) {
+ if (tcemem->mem == mem) {
+ found = true;
+ break;
+ }
+ }
+
+ if (!found)
+ return -ENOENT;
+
+ return tce_iommu_prereg_free(container, tcemem);
}

static long tce_iommu_register_pages(struct tce_container *container,
@@ -137,16 +174,29 @@ static long tce_iommu_register_pages(str
{
long ret = 0;
struct mm_iommu_table_group_mem_t *mem = NULL;
+ struct tce_iommu_prereg *tcemem;
unsigned long entries = size >> PAGE_SHIFT;

if ((vaddr & ~PAGE_MASK) || (size & ~PAGE_MASK) ||
((vaddr + size) < vaddr))
return -EINVAL;

+ mem = mm_iommu_find(container->mm, vaddr, entries);
+ if (mem) {
+ list_for_each_entry(tcemem, &container->prereg_list, next) {
+ if (tcemem->mem == mem)
+ return -EBUSY;
+ }
+ }
+
ret = mm_iommu_get(container->mm, vaddr, entries, &mem);
if (ret)
return ret;

+ tcemem = kzalloc(sizeof(*tcemem), GFP_KERNEL);
+ tcemem->mem = mem;
+ list_add(&tcemem->next, &container->prereg_list);
+
container->enabled = true;

return 0;
@@ -333,6 +383,7 @@ static void *tce_iommu_open(unsigned lon

mutex_init(&container->lock);
INIT_LIST_HEAD_RCU(&container->group_list);
+ INIT_LIST_HEAD_RCU(&container->prereg_list);

container->v2 = arg == VFIO_SPAPR_TCE_v2_IOMMU;

@@ -371,6 +422,14 @@ static void tce_iommu_release(void *iomm
tce_iommu_free_table(container, tbl);
}

+ while (!list_empty(&container->prereg_list)) {
+ struct tce_iommu_prereg *tcemem;
+
+ tcemem = list_first_entry(&container->prereg_list,
+ struct tce_iommu_prereg, next);
+ WARN_ON_ONCE(tce_iommu_prereg_free(container, tcemem));
+ }
+
tce_iommu_disable(container);
if (container->mm)
mmdrop(container->mm);


2017-03-20 18:29:17

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 66/93] vfio/spapr: Reference mm in tce_container

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Alexey Kardashevskiy <[email protected]>

[ Upstream commit bc82d122ae4a0e9f971f13403995898fcfa0c09e ]

In some situations the userspace memory context may live longer than
the userspace process itself, so if we need to do proper memory context
cleanup, we had better have tce_container take a reference to the
mm_struct and use it later when the process is gone (@current or
@current->mm is NULL).

This references mm and stores the pointer in the container; this is done
in a new helper - tce_iommu_mm_set() - when one of the following happens:
- a container is enabled (IOMMU v1);
- a first attempt to pre-register memory is made (IOMMU v2);
- a DMA window is created (IOMMU v2).
The @mm stays referenced till the container is destroyed.

This replaces current->mm with container->mm everywhere except debug
prints.

This adds a check that current->mm is the same as the one stored in
the container to prevent userspace from making changes to a memory
context of other processes.

DMA map/unmap ioctls() do not check for @mm as they already check
for @enabled which is set after tce_iommu_mm_set() is called.

This does not reference a task as multiple threads within the same mm
are allowed to ioctl() to vfio; supposedly they will have the same limits
and capabilities, and if they do not, we'll just fail with no harm done.
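
The referencing rules above boil down to a small state machine; here is
a user-space sketch of tce_iommu_mm_set() and the release path (logic
only, with a plain counter standing in for mm_count - not kernel code):

#include <stdio.h>

struct mm { int mm_count; };
struct container { struct mm *mm; };

static int mm_set(struct container *c, struct mm *current_mm)
{
	if (c->mm)
		return c->mm == current_mm ? 0 : -1;	/* -EPERM */
	c->mm = current_mm;
	c->mm->mm_count++;	/* atomic_inc(&container->mm->mm_count) */
	return 0;
}

static void container_release(struct container *c)
{
	if (c->mm)
		c->mm->mm_count--;	/* mmdrop(container->mm) */
}

int main(void)
{
	struct mm a = { 1 }, b = { 1 };
	struct container c = { 0 };

	printf("first enable: %d\n", mm_set(&c, &a));	/* 0, mm pinned */
	printf("same mm:      %d\n", mm_set(&c, &a));	/* 0 */
	printf("other mm:     %d\n", mm_set(&c, &b));	/* -1 (-EPERM) */
	container_release(&c);
	return 0;
}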

Signed-off-by: Alexey Kardashevskiy <[email protected]>
Acked-by: Alex Williamson <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/vfio/vfio_iommu_spapr_tce.c | 160 ++++++++++++++++++++++--------------
1 file changed, 100 insertions(+), 60 deletions(-)

--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -31,49 +31,49 @@
static void tce_iommu_detach_group(void *iommu_data,
struct iommu_group *iommu_group);

-static long try_increment_locked_vm(long npages)
+static long try_increment_locked_vm(struct mm_struct *mm, long npages)
{
long ret = 0, locked, lock_limit;

- if (!current || !current->mm)
- return -ESRCH; /* process exited */
+ if (WARN_ON_ONCE(!mm))
+ return -EPERM;

if (!npages)
return 0;

- down_write(&current->mm->mmap_sem);
- locked = current->mm->locked_vm + npages;
+ down_write(&mm->mmap_sem);
+ locked = mm->locked_vm + npages;
lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
if (locked > lock_limit && !capable(CAP_IPC_LOCK))
ret = -ENOMEM;
else
- current->mm->locked_vm += npages;
+ mm->locked_vm += npages;

pr_debug("[%d] RLIMIT_MEMLOCK +%ld %ld/%ld%s\n", current->pid,
npages << PAGE_SHIFT,
- current->mm->locked_vm << PAGE_SHIFT,
+ mm->locked_vm << PAGE_SHIFT,
rlimit(RLIMIT_MEMLOCK),
ret ? " - exceeded" : "");

- up_write(&current->mm->mmap_sem);
+ up_write(&mm->mmap_sem);

return ret;
}

-static void decrement_locked_vm(long npages)
+static void decrement_locked_vm(struct mm_struct *mm, long npages)
{
- if (!current || !current->mm || !npages)
- return; /* process exited */
+ if (!mm || !npages)
+ return;

- down_write(&current->mm->mmap_sem);
- if (WARN_ON_ONCE(npages > current->mm->locked_vm))
- npages = current->mm->locked_vm;
- current->mm->locked_vm -= npages;
+ down_write(&mm->mmap_sem);
+ if (WARN_ON_ONCE(npages > mm->locked_vm))
+ npages = mm->locked_vm;
+ mm->locked_vm -= npages;
pr_debug("[%d] RLIMIT_MEMLOCK -%ld %ld/%ld\n", current->pid,
npages << PAGE_SHIFT,
- current->mm->locked_vm << PAGE_SHIFT,
+ mm->locked_vm << PAGE_SHIFT,
rlimit(RLIMIT_MEMLOCK));
- up_write(&current->mm->mmap_sem);
+ up_write(&mm->mmap_sem);
}

/*
@@ -98,26 +98,38 @@ struct tce_container {
bool enabled;
bool v2;
unsigned long locked_pages;
+ struct mm_struct *mm;
struct iommu_table *tables[IOMMU_TABLE_GROUP_MAX_TABLES];
struct list_head group_list;
};

+static long tce_iommu_mm_set(struct tce_container *container)
+{
+ if (container->mm) {
+ if (container->mm == current->mm)
+ return 0;
+ return -EPERM;
+ }
+ BUG_ON(!current->mm);
+ container->mm = current->mm;
+ atomic_inc(&container->mm->mm_count);
+
+ return 0;
+}
+
static long tce_iommu_unregister_pages(struct tce_container *container,
__u64 vaddr, __u64 size)
{
struct mm_iommu_table_group_mem_t *mem;

- if (!current || !current->mm)
- return -ESRCH; /* process exited */
-
if ((vaddr & ~PAGE_MASK) || (size & ~PAGE_MASK))
return -EINVAL;

- mem = mm_iommu_find(current->mm, vaddr, size >> PAGE_SHIFT);
+ mem = mm_iommu_find(container->mm, vaddr, size >> PAGE_SHIFT);
if (!mem)
return -ENOENT;

- return mm_iommu_put(current->mm, mem);
+ return mm_iommu_put(container->mm, mem);
}

static long tce_iommu_register_pages(struct tce_container *container,
@@ -127,14 +139,11 @@ static long tce_iommu_register_pages(str
struct mm_iommu_table_group_mem_t *mem = NULL;
unsigned long entries = size >> PAGE_SHIFT;

- if (!current || !current->mm)
- return -ESRCH; /* process exited */
-
if ((vaddr & ~PAGE_MASK) || (size & ~PAGE_MASK) ||
((vaddr + size) < vaddr))
return -EINVAL;

- ret = mm_iommu_get(current->mm, vaddr, entries, &mem);
+ ret = mm_iommu_get(container->mm, vaddr, entries, &mem);
if (ret)
return ret;

@@ -143,7 +152,8 @@ static long tce_iommu_register_pages(str
return 0;
}

-static long tce_iommu_userspace_view_alloc(struct iommu_table *tbl)
+static long tce_iommu_userspace_view_alloc(struct iommu_table *tbl,
+ struct mm_struct *mm)
{
unsigned long cb = _ALIGN_UP(sizeof(tbl->it_userspace[0]) *
tbl->it_size, PAGE_SIZE);
@@ -152,13 +162,13 @@ static long tce_iommu_userspace_view_all

BUG_ON(tbl->it_userspace);

- ret = try_increment_locked_vm(cb >> PAGE_SHIFT);
+ ret = try_increment_locked_vm(mm, cb >> PAGE_SHIFT);
if (ret)
return ret;

uas = vzalloc(cb);
if (!uas) {
- decrement_locked_vm(cb >> PAGE_SHIFT);
+ decrement_locked_vm(mm, cb >> PAGE_SHIFT);
return -ENOMEM;
}
tbl->it_userspace = uas;
@@ -166,7 +176,8 @@ static long tce_iommu_userspace_view_all
return 0;
}

-static void tce_iommu_userspace_view_free(struct iommu_table *tbl)
+static void tce_iommu_userspace_view_free(struct iommu_table *tbl,
+ struct mm_struct *mm)
{
unsigned long cb = _ALIGN_UP(sizeof(tbl->it_userspace[0]) *
tbl->it_size, PAGE_SIZE);
@@ -176,7 +187,7 @@ static void tce_iommu_userspace_view_fre

vfree(tbl->it_userspace);
tbl->it_userspace = NULL;
- decrement_locked_vm(cb >> PAGE_SHIFT);
+ decrement_locked_vm(mm, cb >> PAGE_SHIFT);
}

static bool tce_page_is_contained(struct page *page, unsigned page_shift)
@@ -236,9 +247,6 @@ static int tce_iommu_enable(struct tce_c
struct iommu_table_group *table_group;
struct tce_iommu_group *tcegrp;

- if (!current->mm)
- return -ESRCH; /* process exited */
-
if (container->enabled)
return -EBUSY;

@@ -283,8 +291,12 @@ static int tce_iommu_enable(struct tce_c
if (!table_group->tce32_size)
return -EPERM;

+ ret = tce_iommu_mm_set(container);
+ if (ret)
+ return ret;
+
locked = table_group->tce32_size >> PAGE_SHIFT;
- ret = try_increment_locked_vm(locked);
+ ret = try_increment_locked_vm(container->mm, locked);
if (ret)
return ret;

@@ -302,10 +314,8 @@ static void tce_iommu_disable(struct tce

container->enabled = false;

- if (!current->mm)
- return;
-
- decrement_locked_vm(container->locked_pages);
+ BUG_ON(!container->mm);
+ decrement_locked_vm(container->mm, container->locked_pages);
}

static void *tce_iommu_open(unsigned long arg)
@@ -332,7 +342,8 @@ static void *tce_iommu_open(unsigned lon
static int tce_iommu_clear(struct tce_container *container,
struct iommu_table *tbl,
unsigned long entry, unsigned long pages);
-static void tce_iommu_free_table(struct iommu_table *tbl);
+static void tce_iommu_free_table(struct tce_container *container,
+ struct iommu_table *tbl);

static void tce_iommu_release(void *iommu_data)
{
@@ -357,10 +368,12 @@ static void tce_iommu_release(void *iomm
continue;

tce_iommu_clear(container, tbl, tbl->it_offset, tbl->it_size);
- tce_iommu_free_table(tbl);
+ tce_iommu_free_table(container, tbl);
}

tce_iommu_disable(container);
+ if (container->mm)
+ mmdrop(container->mm);
mutex_destroy(&container->lock);

kfree(container);
@@ -375,13 +388,14 @@ static void tce_iommu_unuse_page(struct
put_page(page);
}

-static int tce_iommu_prereg_ua_to_hpa(unsigned long tce, unsigned long size,
+static int tce_iommu_prereg_ua_to_hpa(struct tce_container *container,
+ unsigned long tce, unsigned long size,
unsigned long *phpa, struct mm_iommu_table_group_mem_t **pmem)
{
long ret = 0;
struct mm_iommu_table_group_mem_t *mem;

- mem = mm_iommu_lookup(current->mm, tce, size);
+ mem = mm_iommu_lookup(container->mm, tce, size);
if (!mem)
return -EINVAL;

@@ -394,18 +408,18 @@ static int tce_iommu_prereg_ua_to_hpa(un
return 0;
}

-static void tce_iommu_unuse_page_v2(struct iommu_table *tbl,
- unsigned long entry)
+static void tce_iommu_unuse_page_v2(struct tce_container *container,
+ struct iommu_table *tbl, unsigned long entry)
{
struct mm_iommu_table_group_mem_t *mem = NULL;
int ret;
unsigned long hpa = 0;
unsigned long *pua = IOMMU_TABLE_USERSPACE_ENTRY(tbl, entry);

- if (!pua || !current || !current->mm)
+ if (!pua)
return;

- ret = tce_iommu_prereg_ua_to_hpa(*pua, IOMMU_PAGE_SIZE(tbl),
+ ret = tce_iommu_prereg_ua_to_hpa(container, *pua, IOMMU_PAGE_SIZE(tbl),
&hpa, &mem);
if (ret)
pr_debug("%s: tce %lx at #%lx was not cached, ret=%d\n",
@@ -435,7 +449,7 @@ static int tce_iommu_clear(struct tce_co
continue;

if (container->v2) {
- tce_iommu_unuse_page_v2(tbl, entry);
+ tce_iommu_unuse_page_v2(container, tbl, entry);
continue;
}

@@ -516,7 +530,7 @@ static long tce_iommu_build_v2(struct tc
enum dma_data_direction dirtmp;

if (!tbl->it_userspace) {
- ret = tce_iommu_userspace_view_alloc(tbl);
+ ret = tce_iommu_userspace_view_alloc(tbl, container->mm);
if (ret)
return ret;
}
@@ -526,8 +540,8 @@ static long tce_iommu_build_v2(struct tc
unsigned long *pua = IOMMU_TABLE_USERSPACE_ENTRY(tbl,
entry + i);

- ret = tce_iommu_prereg_ua_to_hpa(tce, IOMMU_PAGE_SIZE(tbl),
- &hpa, &mem);
+ ret = tce_iommu_prereg_ua_to_hpa(container,
+ tce, IOMMU_PAGE_SIZE(tbl), &hpa, &mem);
if (ret)
break;

@@ -548,7 +562,7 @@ static long tce_iommu_build_v2(struct tc
ret = iommu_tce_xchg(tbl, entry + i, &hpa, &dirtmp);
if (ret) {
/* dirtmp cannot be DMA_NONE here */
- tce_iommu_unuse_page_v2(tbl, entry + i);
+ tce_iommu_unuse_page_v2(container, tbl, entry + i);
pr_err("iommu_tce: %s failed ioba=%lx, tce=%lx, ret=%ld\n",
__func__, entry << tbl->it_page_shift,
tce, ret);
@@ -556,7 +570,7 @@ static long tce_iommu_build_v2(struct tc
}

if (dirtmp != DMA_NONE)
- tce_iommu_unuse_page_v2(tbl, entry + i);
+ tce_iommu_unuse_page_v2(container, tbl, entry + i);

*pua = tce;

@@ -584,7 +598,7 @@ static long tce_iommu_create_table(struc
if (!table_size)
return -EINVAL;

- ret = try_increment_locked_vm(table_size >> PAGE_SHIFT);
+ ret = try_increment_locked_vm(container->mm, table_size >> PAGE_SHIFT);
if (ret)
return ret;

@@ -597,13 +611,14 @@ static long tce_iommu_create_table(struc
return ret;
}

-static void tce_iommu_free_table(struct iommu_table *tbl)
+static void tce_iommu_free_table(struct tce_container *container,
+ struct iommu_table *tbl)
{
unsigned long pages = tbl->it_allocated_size >> PAGE_SHIFT;

- tce_iommu_userspace_view_free(tbl);
+ tce_iommu_userspace_view_free(tbl, container->mm);
tbl->it_ops->free(tbl);
- decrement_locked_vm(pages);
+ decrement_locked_vm(container->mm, pages);
}

static long tce_iommu_create_window(struct tce_container *container,
@@ -666,7 +681,7 @@ unset_exit:
table_group = iommu_group_get_iommudata(tcegrp->grp);
table_group->ops->unset_window(table_group, num);
}
- tce_iommu_free_table(tbl);
+ tce_iommu_free_table(container, tbl);

return ret;
}
@@ -704,7 +719,7 @@ static long tce_iommu_remove_window(stru

/* Free table */
tce_iommu_clear(container, tbl, tbl->it_offset, tbl->it_size);
- tce_iommu_free_table(tbl);
+ tce_iommu_free_table(container, tbl);
container->tables[num] = NULL;

return 0;
@@ -730,7 +745,17 @@ static long tce_iommu_ioctl(void *iommu_
}

return (ret < 0) ? 0 : ret;
+ }
+
+ /*
+ * Sanity check to prevent one userspace from manipulating
+ * another userspace mm.
+ */
+ BUG_ON(!container);
+ if (container->mm && container->mm != current->mm)
+ return -EPERM;

+ switch (cmd) {
case VFIO_IOMMU_SPAPR_TCE_GET_INFO: {
struct vfio_iommu_spapr_tce_info info;
struct tce_iommu_group *tcegrp;
@@ -891,6 +916,10 @@ static long tce_iommu_ioctl(void *iommu_
minsz = offsetofend(struct vfio_iommu_spapr_register_memory,
size);

+ ret = tce_iommu_mm_set(container);
+ if (ret)
+ return ret;
+
if (copy_from_user(&param, (void __user *)arg, minsz))
return -EFAULT;

@@ -914,6 +943,9 @@ static long tce_iommu_ioctl(void *iommu_
if (!container->v2)
break;

+ if (!container->mm)
+ return -EPERM;
+
minsz = offsetofend(struct vfio_iommu_spapr_register_memory,
size);

@@ -972,6 +1004,10 @@ static long tce_iommu_ioctl(void *iommu_
if (!container->v2)
break;

+ ret = tce_iommu_mm_set(container);
+ if (ret)
+ return ret;
+
if (!tce_groups_attached(container))
return -ENXIO;

@@ -1006,6 +1042,10 @@ static long tce_iommu_ioctl(void *iommu_
if (!container->v2)
break;

+ ret = tce_iommu_mm_set(container);
+ if (ret)
+ return ret;
+
if (!tce_groups_attached(container))
return -ENXIO;

@@ -1046,7 +1086,7 @@ static void tce_iommu_release_ownership(
continue;

tce_iommu_clear(container, tbl, tbl->it_offset, tbl->it_size);
- tce_iommu_userspace_view_free(tbl);
+ tce_iommu_userspace_view_free(tbl, container->mm);
if (tbl->it_map)
iommu_release_ownership(tbl);



2017-03-20 18:29:13

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 48/93] scsi: ibmvscsis: Clean up properly if target_submit_cmd/tmr fails

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Michael Cyr <[email protected]>

[ Upstream commit 7435b32e2d2fb5da6c2ae9b9c8ce56d8a3cb3bc3 ]

Signed-off-by: Michael Cyr <[email protected]>
Signed-off-by: Bryant G. Ly <[email protected]>
Tested-by: Steven Royer <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c | 7 +++++++
1 file changed, 7 insertions(+)

--- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
+++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
@@ -2552,6 +2552,10 @@ static void ibmvscsis_parse_cmd(struct s
data_len, attr, dir, 0);
if (rc) {
dev_err(&vscsi->dev, "target_submit_cmd failed, rc %d\n", rc);
+ spin_lock_bh(&vscsi->intr_lock);
+ list_del(&cmd->list);
+ ibmvscsis_free_cmd_resources(vscsi, cmd);
+ spin_unlock_bh(&vscsi->intr_lock);
goto fail;
}
return;
@@ -2631,6 +2635,9 @@ static void ibmvscsis_parse_task(struct
if (rc) {
dev_err(&vscsi->dev, "target_submit_tmr failed, rc %d\n",
rc);
+ spin_lock_bh(&vscsi->intr_lock);
+ list_del(&cmd->list);
+ spin_unlock_bh(&vscsi->intr_lock);
cmd->se_cmd.se_tmr_req->response =
TMR_FUNCTION_REJECTED;
}


2017-03-20 18:29:21

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 54/93] PCI: Remove pci_resource_bar() and pci_iov_resource_bar()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Bjorn Helgaas <[email protected]>

[ Upstream commit 286c2378aaccc7343ebf17ec6cd86567659caf70 ]

pci_std_update_resource() only deals with standard BARs, so we don't have
to worry about the complications of VF BARs in an SR-IOV capability.

Compute the BAR address inline and remove pci_resource_bar(). That makes
pci_iov_resource_bar() unused, so remove that as well.

Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/pci/iov.c | 18 ------------------
drivers/pci/pci.c | 30 ------------------------------
drivers/pci/pci.h | 6 ------
drivers/pci/setup-res.c | 13 +++++++------
4 files changed, 7 insertions(+), 60 deletions(-)

--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -554,24 +554,6 @@ void pci_iov_release(struct pci_dev *dev
}

/**
- * pci_iov_resource_bar - get position of the SR-IOV BAR
- * @dev: the PCI device
- * @resno: the resource number
- *
- * Returns position of the BAR encapsulated in the SR-IOV capability.
- */
-int pci_iov_resource_bar(struct pci_dev *dev, int resno)
-{
- if (resno < PCI_IOV_RESOURCES || resno > PCI_IOV_RESOURCE_END)
- return 0;
-
- BUG_ON(!dev->is_physfn);
-
- return dev->sriov->pos + PCI_SRIOV_BAR +
- 4 * (resno - PCI_IOV_RESOURCES);
-}
-
-/**
* pci_iov_update_resource - update a VF BAR
* @dev: the PCI device
* @resno: the resource number
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4835,36 +4835,6 @@ int pci_select_bars(struct pci_dev *dev,
}
EXPORT_SYMBOL(pci_select_bars);

-/**
- * pci_resource_bar - get position of the BAR associated with a resource
- * @dev: the PCI device
- * @resno: the resource number
- * @type: the BAR type to be filled in
- *
- * Returns BAR position in config space, or 0 if the BAR is invalid.
- */
-int pci_resource_bar(struct pci_dev *dev, int resno, enum pci_bar_type *type)
-{
- int reg;
-
- if (resno < PCI_ROM_RESOURCE) {
- *type = pci_bar_unknown;
- return PCI_BASE_ADDRESS_0 + 4 * resno;
- } else if (resno == PCI_ROM_RESOURCE) {
- *type = pci_bar_mem32;
- return dev->rom_base_reg;
- } else if (resno < PCI_BRIDGE_RESOURCES) {
- /* device specific resource */
- *type = pci_bar_unknown;
- reg = pci_iov_resource_bar(dev, resno);
- if (reg)
- return reg;
- }
-
- dev_err(&dev->dev, "BAR %d: invalid resource\n", resno);
- return 0;
-}
-
/* Some architectures require additional programming to enable VGA */
static arch_set_vga_state_t arch_set_vga_state;

--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -245,7 +245,6 @@ bool pci_bus_read_dev_vendor_id(struct p
int pci_setup_device(struct pci_dev *dev);
int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
struct resource *res, unsigned int reg);
-int pci_resource_bar(struct pci_dev *dev, int resno, enum pci_bar_type *type);
void pci_configure_ari(struct pci_dev *dev);
void __pci_bus_size_bridges(struct pci_bus *bus,
struct list_head *realloc_head);
@@ -289,7 +288,6 @@ static inline void pci_restore_ats_state
#ifdef CONFIG_PCI_IOV
int pci_iov_init(struct pci_dev *dev);
void pci_iov_release(struct pci_dev *dev);
-int pci_iov_resource_bar(struct pci_dev *dev, int resno);
void pci_iov_update_resource(struct pci_dev *dev, int resno);
resource_size_t pci_sriov_resource_alignment(struct pci_dev *dev, int resno);
void pci_restore_iov_state(struct pci_dev *dev);
@@ -304,10 +302,6 @@ static inline void pci_iov_release(struc

{
}
-static inline int pci_iov_resource_bar(struct pci_dev *dev, int resno)
-{
- return 0;
-}
static inline void pci_restore_iov_state(struct pci_dev *dev)
{
}
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -32,7 +32,6 @@ static void pci_std_update_resource(stru
u16 cmd;
u32 new, check, mask;
int reg;
- enum pci_bar_type type;
struct resource *res = dev->resource + resno;

if (dev->is_virtfn) {
@@ -66,14 +65,16 @@ static void pci_std_update_resource(stru
else
mask = (u32)PCI_BASE_ADDRESS_MEM_MASK;

- reg = pci_resource_bar(dev, resno, &type);
- if (!reg)
- return;
- if (type != pci_bar_unknown) {
+ if (resno < PCI_ROM_RESOURCE) {
+ reg = PCI_BASE_ADDRESS_0 + 4 * resno;
+ } else if (resno == PCI_ROM_RESOURCE) {
if (!(res->flags & IORESOURCE_ROM_ENABLE))
return;
+
+ reg = dev->rom_base_reg;
new |= PCI_ROM_ADDRESS_ENABLE;
- }
+ } else
+ return;

/*
* We can't update a 64-bit BAR atomically, so when possible,


2017-03-20 18:29:44

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 23/93] net: fix socket refcounting in skb_complete_tx_timestamp()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <[email protected]>


[ Upstream commit 9ac25fc063751379cb77434fef9f3b088cd3e2f7 ]

TX skbs do not necessarily hold a reference on skb->sk->sk_refcnt.
By the time TX completion happens, sk_refcnt might already be 0.

sock_hold()/sock_put() would then corrupt critical state, such as
sk_wmem_alloc, and lead to leaks or use-after-free.
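
A minimal sketch of the safe pattern (kernel context assumed; the function
name below is hypothetical, the real change is in the diff that follows):

	/* Only touch the socket if its refcount is still non-zero;
	 * otherwise it is already being torn down and must not be used.
	 */
	static void tx_complete_sketch(struct sock *sk)
	{
		if (likely(atomic_inc_not_zero(&sk->sk_refcnt))) {
			/* sk is pinned here; safe to do the work */
			sock_put(sk);	/* drop the reference we took */
		}
	}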

Fixes: 62bccb8cdb69 ("net-timestamp: Make the clone operation stand-alone from phy timestamping")
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Alexander Duyck <[email protected]>
Cc: Johannes Berg <[email protected]>
Cc: Soheil Hassas Yeganeh <[email protected]>
Cc: Willem de Bruijn <[email protected]>
Acked-by: Soheil Hassas Yeganeh <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/core/skbuff.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)

--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3814,13 +3814,14 @@ void skb_complete_tx_timestamp(struct sk
if (!skb_may_tx_timestamp(sk, false))
return;

- /* take a reference to prevent skb_orphan() from freeing the socket */
- sock_hold(sk);
-
- *skb_hwtstamps(skb) = *hwtstamps;
- __skb_complete_tx_timestamp(skb, sk, SCM_TSTAMP_SND);
-
- sock_put(sk);
+ /* Take a reference to prevent skb_orphan() from freeing the socket,
+ * but only if the socket refcount is not zero.
+ */
+ if (likely(atomic_inc_not_zero(&sk->sk_refcnt))) {
+ *skb_hwtstamps(skb) = *hwtstamps;
+ __skb_complete_tx_timestamp(skb, sk, SCM_TSTAMP_SND);
+ sock_put(sk);
+ }
}
EXPORT_SYMBOL_GPL(skb_complete_tx_timestamp);



2017-03-20 18:29:26

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 57/93] PCI: Don't update VF BARs while VF memory space is enabled

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Bjorn Helgaas <[email protected]>

[ Upstream commit 546ba9f8f22f71b0202b6ba8967be5cc6dae4e21 ]

If we update a VF BAR while it's enabled, there are two potential problems:

1) Any driver that's using the VF has a cached BAR value that is stale
after the update, and

2) We can't update 64-bit BARs atomically, so the intermediate state
(new lower dword with old upper dword) may conflict with another
device, and an access by a driver unrelated to the VF may cause a bus
error.

Warn about attempts to update VF BARs while they are enabled. This is a
programming error, so use dev_WARN() to get a backtrace.
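
As a side note, the reason problem 2) is observable at all is that a 64-bit
BAR update is two separate config-space writes. A sketch of the generic
sequence (illustrative only, not the driver's exact code):

	/* Updating a 64-bit BAR takes two writes, so another agent can
	 * observe new-lower/old-upper in the window between them.
	 */
	pci_write_config_dword(dev, reg, lower_32_bits(addr));
	/* window: the BAR decodes a mix of old and new address */
	pci_write_config_dword(dev, reg + 4, upper_32_bits(addr));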

Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/pci/iov.c | 8 ++++++++
1 file changed, 8 insertions(+)

--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -566,6 +566,7 @@ void pci_iov_update_resource(struct pci_
struct resource *res = dev->resource + resno;
int vf_bar = resno - PCI_IOV_RESOURCES;
struct pci_bus_region region;
+ u16 cmd;
u32 new;
int reg;

@@ -577,6 +578,13 @@ void pci_iov_update_resource(struct pci_
if (!iov)
return;

+ pci_read_config_word(dev, iov->pos + PCI_SRIOV_CTRL, &cmd);
+ if ((cmd & PCI_SRIOV_CTRL_VFE) && (cmd & PCI_SRIOV_CTRL_MSE)) {
+ dev_WARN(&dev->dev, "can't update enabled VF BAR%d %pR\n",
+ vf_bar, res);
+ return;
+ }
+
/*
* Ignore unimplemented BARs, unused resource slots for 64-bit
* BARs, and non-movable resources, e.g., those described via


2017-03-20 18:29:41

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 47/93] scsi: ibmvscsis: Return correct partition name/# to client

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Michael Cyr <[email protected]>

[ Upstream commit 9c93cf03d4eb3dc58931ff7cac0af9c344fe5e0b ]

Signed-off-by: Michael Cyr <[email protected]>
Signed-off-by: Bryant G. Ly <[email protected]>
Tested-by: Steven Royer <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

--- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
+++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
@@ -3387,6 +3387,9 @@ static int ibmvscsis_probe(struct vio_de
strncat(vscsi->eye, vdev->name, MAX_EYE);

vscsi->dds.unit_id = vdev->unit_address;
+ strncpy(vscsi->dds.partition_name, partition_name,
+ sizeof(vscsi->dds.partition_name));
+ vscsi->dds.partition_num = partition_number;

spin_lock_bh(&ibmvscsis_dev_lock);
list_add_tail(&vscsi->list, &ibmvscsis_dev_list);
@@ -3603,7 +3606,7 @@ static int ibmvscsis_get_system_info(voi

num = of_get_property(rootdn, "ibm,partition-no", NULL);
if (num)
- partition_number = *num;
+ partition_number = of_read_number(num, 1);

of_node_put(rootdn);



2017-03-20 18:29:38

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 30/93] mpls: Send route delete notifications when router module is unloaded

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Ahern <[email protected]>


[ Upstream commit e37791ec1ad785b59022ae211f63a16189bacebf ]

When the mpls_router module is unloaded, mpls routes are deleted but
notifications are not sent to userspace, leaving userspace caches
out of sync. Add the call to mpls_notify_route in mpls_net_exit as
routes are freed.

Fixes: 0189197f44160 ("mpls: Basic routing support")
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/mpls/af_mpls.c | 1 +
1 file changed, 1 insertion(+)

--- a/net/mpls/af_mpls.c
+++ b/net/mpls/af_mpls.c
@@ -1696,6 +1696,7 @@ static void mpls_net_exit(struct net *ne
for (index = 0; index < platform_labels; index++) {
struct mpls_route *rt = rtnl_dereference(platform_label[index]);
RCU_INIT_POINTER(platform_label[index], NULL);
+ mpls_notify_route(net, index, rt, NULL, NULL);
mpls_rt_free(rt);
}
rtnl_unlock();


2017-03-20 18:29:35

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 50/93] scsi: ibmvscsis: Synchronize cmds at tpg_enable_store time

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Michael Cyr <[email protected]>

[ Upstream commit c9b3379f60a83288a5e2f8ea75476460978689b0 ]

This patch changes the way the IBM vSCSI server driver manages its
Command/Response Queue (CRQ). We used to register the CRQ with phyp at
probe time. Now we wait until tpg_enable_store. Similarly, when
tpg_enable_store is called to "disable" (i.e. the stored value is 0),
we unregister the queue with phyp.

One consequence of this is that we have no need for the PART_UP_WAIT_ENAB
state: we can't get an Init Message from the client in our CRQ while
we're waiting to be enabled, because we haven't registered the queue yet.

Signed-off-by: Michael Cyr <[email protected]>
Signed-off-by: Bryant G. Ly <[email protected]>
Tested-by: Steven Royer <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c | 224 +++++--------------------------
drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h | 2
2 files changed, 38 insertions(+), 188 deletions(-)

--- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
+++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
@@ -62,8 +62,6 @@ static long ibmvscsis_parse_command(stru

static void ibmvscsis_adapter_idle(struct scsi_info *vscsi);

-static void ibmvscsis_reset_queue(struct scsi_info *vscsi, uint new_state);
-
static void ibmvscsis_determine_resid(struct se_cmd *se_cmd,
struct srp_rsp *rsp)
{
@@ -418,7 +416,6 @@ static void ibmvscsis_disconnect(struct
proc_work);
u16 new_state;
bool wait_idle = false;
- long rc = ADAPT_SUCCESS;

spin_lock_bh(&vscsi->intr_lock);
new_state = vscsi->new_state;
@@ -471,30 +468,12 @@ static void ibmvscsis_disconnect(struct
vscsi->state = new_state;
break;

- /*
- * If this is a transition into an error state.
- * a client is attempting to establish a connection
- * and has violated the RPA protocol.
- * There can be nothing pending on the adapter although
- * there can be requests in the command queue.
- */
case WAIT_ENABLED:
- case PART_UP_WAIT_ENAB:
switch (new_state) {
+ /* should never happen */
case ERR_DISCONNECT:
- vscsi->flags |= RESPONSE_Q_DOWN;
- vscsi->state = new_state;
- vscsi->flags &= ~(SCHEDULE_DISCONNECT |
- DISCONNECT_SCHEDULED);
- ibmvscsis_free_command_q(vscsi);
- break;
case ERR_DISCONNECT_RECONNECT:
- ibmvscsis_reset_queue(vscsi, WAIT_ENABLED);
- break;
-
- /* should never happen */
case WAIT_IDLE:
- rc = ERROR;
dev_err(&vscsi->dev, "disconnect: invalid state %d for WAIT_IDLE\n",
vscsi->state);
break;
@@ -631,7 +610,6 @@ static void ibmvscsis_post_disconnect(st
break;

case WAIT_ENABLED:
- case PART_UP_WAIT_ENAB:
case WAIT_IDLE:
case WAIT_CONNECTION:
case CONNECTED:
@@ -676,7 +654,6 @@ static long ibmvscsis_handle_init_compl_
case SRP_PROCESSING:
case CONNECTED:
case WAIT_ENABLED:
- case PART_UP_WAIT_ENAB:
default:
rc = ERROR;
dev_err(&vscsi->dev, "init_msg: invalid state %d to get init compl msg\n",
@@ -699,10 +676,6 @@ static long ibmvscsis_handle_init_msg(st
long rc = ADAPT_SUCCESS;

switch (vscsi->state) {
- case WAIT_ENABLED:
- vscsi->state = PART_UP_WAIT_ENAB;
- break;
-
case WAIT_CONNECTION:
rc = ibmvscsis_send_init_message(vscsi, INIT_COMPLETE_MSG);
switch (rc) {
@@ -738,7 +711,7 @@ static long ibmvscsis_handle_init_msg(st
case UNCONFIGURING:
break;

- case PART_UP_WAIT_ENAB:
+ case WAIT_ENABLED:
case CONNECTED:
case SRP_PROCESSING:
case WAIT_IDLE:
@@ -801,11 +774,10 @@ static long ibmvscsis_init_msg(struct sc
/**
* ibmvscsis_establish_new_q() - Establish new CRQ queue
* @vscsi: Pointer to our adapter structure
- * @new_state: New state being established after resetting the queue
*
* Must be called with interrupt lock held.
*/
-static long ibmvscsis_establish_new_q(struct scsi_info *vscsi, uint new_state)
+static long ibmvscsis_establish_new_q(struct scsi_info *vscsi)
{
long rc = ADAPT_SUCCESS;
uint format;
@@ -817,19 +789,19 @@ static long ibmvscsis_establish_new_q(st

rc = vio_enable_interrupts(vscsi->dma_dev);
if (rc) {
- pr_warn("reset_queue: failed to enable interrupts, rc %ld\n",
+ pr_warn("establish_new_q: failed to enable interrupts, rc %ld\n",
rc);
return rc;
}

rc = ibmvscsis_check_init_msg(vscsi, &format);
if (rc) {
- dev_err(&vscsi->dev, "reset_queue: check_init_msg failed, rc %ld\n",
+ dev_err(&vscsi->dev, "establish_new_q: check_init_msg failed, rc %ld\n",
rc);
return rc;
}

- if (format == UNUSED_FORMAT && new_state == WAIT_CONNECTION) {
+ if (format == UNUSED_FORMAT) {
rc = ibmvscsis_send_init_message(vscsi, INIT_MSG);
switch (rc) {
case H_SUCCESS:
@@ -847,6 +819,8 @@ static long ibmvscsis_establish_new_q(st
rc = H_HARDWARE;
break;
}
+ } else if (format == INIT_MSG) {
+ rc = ibmvscsis_handle_init_msg(vscsi);
}

return rc;
@@ -855,7 +829,6 @@ static long ibmvscsis_establish_new_q(st
/**
* ibmvscsis_reset_queue() - Reset CRQ Queue
* @vscsi: Pointer to our adapter structure
- * @new_state: New state to establish after resetting the queue
*
* This function calls h_free_q and then calls h_reg_q and does all
* of the bookkeeping to get us back to where we can communicate.
@@ -872,7 +845,7 @@ static long ibmvscsis_establish_new_q(st
* EXECUTION ENVIRONMENT:
* Process environment, called with interrupt lock held
*/
-static void ibmvscsis_reset_queue(struct scsi_info *vscsi, uint new_state)
+static void ibmvscsis_reset_queue(struct scsi_info *vscsi)
{
int bytes;
long rc = ADAPT_SUCCESS;
@@ -885,19 +858,18 @@ static void ibmvscsis_reset_queue(struct
vscsi->rsp_q_timer.timer_pops = 0;
vscsi->debit = 0;
vscsi->credit = 0;
- vscsi->state = new_state;
+ vscsi->state = WAIT_CONNECTION;
vio_enable_interrupts(vscsi->dma_dev);
} else {
rc = ibmvscsis_free_command_q(vscsi);
if (rc == ADAPT_SUCCESS) {
- vscsi->state = new_state;
+ vscsi->state = WAIT_CONNECTION;

bytes = vscsi->cmd_q.size * PAGE_SIZE;
rc = h_reg_crq(vscsi->dds.unit_id,
vscsi->cmd_q.crq_token, bytes);
if (rc == H_CLOSED || rc == H_SUCCESS) {
- rc = ibmvscsis_establish_new_q(vscsi,
- new_state);
+ rc = ibmvscsis_establish_new_q(vscsi);
}

if (rc != ADAPT_SUCCESS) {
@@ -1016,10 +988,6 @@ static long ibmvscsis_trans_event(struct
TRANS_EVENT));
break;

- case PART_UP_WAIT_ENAB:
- vscsi->state = WAIT_ENABLED;
- break;
-
case SRP_PROCESSING:
if ((vscsi->debit > 0) ||
!list_empty(&vscsi->schedule_q) ||
@@ -1220,15 +1188,18 @@ static void ibmvscsis_adapter_idle(struc

switch (vscsi->state) {
case ERR_DISCONNECT_RECONNECT:
- ibmvscsis_reset_queue(vscsi, WAIT_CONNECTION);
+ ibmvscsis_reset_queue(vscsi);
pr_debug("adapter_idle, disc_rec: flags 0x%x\n", vscsi->flags);
break;

case ERR_DISCONNECT:
ibmvscsis_free_command_q(vscsi);
- vscsi->flags &= ~DISCONNECT_SCHEDULED;
+ vscsi->flags &= ~(SCHEDULE_DISCONNECT | DISCONNECT_SCHEDULED);
vscsi->flags |= RESPONSE_Q_DOWN;
- vscsi->state = ERR_DISCONNECTED;
+ if (vscsi->tport.enabled)
+ vscsi->state = ERR_DISCONNECTED;
+ else
+ vscsi->state = WAIT_ENABLED;
pr_debug("adapter_idle, disc: flags 0x%x, state 0x%hx\n",
vscsi->flags, vscsi->state);
break;
@@ -1773,8 +1744,8 @@ static void ibmvscsis_send_messages(stru
be64_to_cpu(msg_hi),
be64_to_cpu(cmd->rsp.tag));

- pr_debug("send_messages: tag 0x%llx, rc %ld\n",
- be64_to_cpu(cmd->rsp.tag), rc);
+ pr_debug("send_messages: cmd %p, tag 0x%llx, rc %ld\n",
+ cmd, be64_to_cpu(cmd->rsp.tag), rc);

/* if all ok free up the command element resources */
if (rc == H_SUCCESS) {
@@ -2788,36 +2759,6 @@ static irqreturn_t ibmvscsis_interrupt(i
}

/**
- * ibmvscsis_check_q() - Helper function to Check Init Message Valid
- * @vscsi: Pointer to our adapter structure
- *
- * Checks if a initialize message was queued by the initiatior
- * while the timing window was open. This function is called from
- * probe after the CRQ is created and interrupts are enabled.
- * It would only be used by adapters who wait for some event before
- * completing the init handshake with the client. For ibmvscsi, this
- * event is waiting for the port to be enabled.
- *
- * EXECUTION ENVIRONMENT:
- * Process level only, interrupt lock held
- */
-static long ibmvscsis_check_q(struct scsi_info *vscsi)
-{
- uint format;
- long rc;
-
- rc = ibmvscsis_check_init_msg(vscsi, &format);
- if (rc)
- ibmvscsis_post_disconnect(vscsi, ERR_DISCONNECT_RECONNECT, 0);
- else if (format == UNUSED_FORMAT)
- vscsi->state = WAIT_ENABLED;
- else
- vscsi->state = PART_UP_WAIT_ENAB;
-
- return rc;
-}
-
-/**
* ibmvscsis_enable_change_state() - Set new state based on enabled status
* @vscsi: Pointer to our adapter structure
*
@@ -2828,77 +2769,19 @@ static long ibmvscsis_check_q(struct scs
*/
static long ibmvscsis_enable_change_state(struct scsi_info *vscsi)
{
+ int bytes;
long rc = ADAPT_SUCCESS;

-handle_state_change:
- switch (vscsi->state) {
- case WAIT_ENABLED:
- rc = ibmvscsis_send_init_message(vscsi, INIT_MSG);
- switch (rc) {
- case H_SUCCESS:
- case H_DROPPED:
- case H_CLOSED:
- vscsi->state = WAIT_CONNECTION;
- rc = ADAPT_SUCCESS;
- break;
-
- case H_PARAMETER:
- break;
-
- case H_HARDWARE:
- break;
-
- default:
- vscsi->state = UNDEFINED;
- rc = H_HARDWARE;
- break;
- }
- break;
- case PART_UP_WAIT_ENAB:
- rc = ibmvscsis_send_init_message(vscsi, INIT_COMPLETE_MSG);
- switch (rc) {
- case H_SUCCESS:
- vscsi->state = CONNECTED;
- rc = ADAPT_SUCCESS;
- break;
-
- case H_DROPPED:
- case H_CLOSED:
- vscsi->state = WAIT_ENABLED;
- goto handle_state_change;
-
- case H_PARAMETER:
- break;
-
- case H_HARDWARE:
- break;
-
- default:
- rc = H_HARDWARE;
- break;
- }
- break;
-
- case WAIT_CONNECTION:
- case WAIT_IDLE:
- case SRP_PROCESSING:
- case CONNECTED:
- rc = ADAPT_SUCCESS;
- break;
- /* should not be able to get here */
- case UNCONFIGURING:
- rc = ERROR;
- vscsi->state = UNDEFINED;
- break;
+ bytes = vscsi->cmd_q.size * PAGE_SIZE;
+ rc = h_reg_crq(vscsi->dds.unit_id, vscsi->cmd_q.crq_token, bytes);
+ if (rc == H_CLOSED || rc == H_SUCCESS) {
+ vscsi->state = WAIT_CONNECTION;
+ rc = ibmvscsis_establish_new_q(vscsi);
+ }

- /* driver should never allow this to happen */
- case ERR_DISCONNECT:
- case ERR_DISCONNECT_RECONNECT:
- default:
- dev_err(&vscsi->dev, "in invalid state %d during enable_change_state\n",
- vscsi->state);
- rc = ADAPT_SUCCESS;
- break;
+ if (rc != ADAPT_SUCCESS) {
+ vscsi->state = ERR_DISCONNECTED;
+ vscsi->flags |= RESPONSE_Q_DOWN;
}

return rc;
@@ -2918,7 +2801,6 @@ handle_state_change:
*/
static long ibmvscsis_create_command_q(struct scsi_info *vscsi, int num_cmds)
{
- long rc = 0;
int pages;
struct vio_dev *vdev = vscsi->dma_dev;

@@ -2942,22 +2824,7 @@ static long ibmvscsis_create_command_q(s
return -ENOMEM;
}

- rc = h_reg_crq(vscsi->dds.unit_id, vscsi->cmd_q.crq_token, PAGE_SIZE);
- if (rc) {
- if (rc == H_CLOSED) {
- vscsi->state = WAIT_ENABLED;
- rc = 0;
- } else {
- dma_unmap_single(&vdev->dev, vscsi->cmd_q.crq_token,
- PAGE_SIZE, DMA_BIDIRECTIONAL);
- free_page((unsigned long)vscsi->cmd_q.base_addr);
- rc = -ENODEV;
- }
- } else {
- vscsi->state = WAIT_ENABLED;
- }
-
- return rc;
+ return 0;
}

/**
@@ -3491,31 +3358,12 @@ static int ibmvscsis_probe(struct vio_de
goto destroy_WQ;
}

- spin_lock_bh(&vscsi->intr_lock);
- vio_enable_interrupts(vdev);
- if (rc) {
- dev_err(&vscsi->dev, "enabling interrupts failed, rc %d\n", rc);
- rc = -ENODEV;
- spin_unlock_bh(&vscsi->intr_lock);
- goto free_irq;
- }
-
- if (ibmvscsis_check_q(vscsi)) {
- rc = ERROR;
- dev_err(&vscsi->dev, "probe: check_q failed, rc %d\n", rc);
- spin_unlock_bh(&vscsi->intr_lock);
- goto disable_interrupt;
- }
- spin_unlock_bh(&vscsi->intr_lock);
+ vscsi->state = WAIT_ENABLED;

dev_set_drvdata(&vdev->dev, vscsi);

return 0;

-disable_interrupt:
- vio_disable_interrupts(vdev);
-free_irq:
- free_irq(vdev->irq, vscsi);
destroy_WQ:
destroy_workqueue(vscsi->work_q);
unmap_buf:
@@ -3909,18 +3757,22 @@ static ssize_t ibmvscsis_tpg_enable_stor
}

if (tmp) {
- tport->enabled = true;
spin_lock_bh(&vscsi->intr_lock);
+ tport->enabled = true;
lrc = ibmvscsis_enable_change_state(vscsi);
if (lrc)
pr_err("enable_change_state failed, rc %ld state %d\n",
lrc, vscsi->state);
spin_unlock_bh(&vscsi->intr_lock);
} else {
+ spin_lock_bh(&vscsi->intr_lock);
tport->enabled = false;
+ /* This simulates the server going down */
+ ibmvscsis_post_disconnect(vscsi, ERR_DISCONNECT, 0);
+ spin_unlock_bh(&vscsi->intr_lock);
}

- pr_debug("tpg_enable_store, state %d\n", vscsi->state);
+ pr_debug("tpg_enable_store, tmp %ld, state %d\n", tmp, vscsi->state);

return count;
}
--- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h
+++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h
@@ -204,8 +204,6 @@ struct scsi_info {
struct list_head waiting_rsp;
#define NO_QUEUE 0x00
#define WAIT_ENABLED 0X01
- /* driver has received an initialize command */
-#define PART_UP_WAIT_ENAB 0x02
#define WAIT_CONNECTION 0x04
/* have established a connection */
#define CONNECTED 0x08


2017-03-20 18:29:33

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 49/93] scsi: ibmvscsis: Rearrange functions for future patches

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Michael Cyr <[email protected]>

[ Upstream commit 79fac9c9b74f4951c9ce82b22e714bcc34ae4a56 ]

This patch reorders functions in a manner necessary for a follow-on
patch. It also makes some minor styling changes (mostly removing extra
spaces) and fixes some typos.

There are no code changes in this patch, with one exception: due to the
reordering of the functions, I needed to explicitly declare a function
at the top of the file. However, this will be removed in the next patch,
since the code requiring the predeclaration will be removed.

Signed-off-by: Michael Cyr <[email protected]>
Signed-off-by: Bryant G. Ly <[email protected]>
Tested-by: Steven Royer <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c | 658 +++++++++++++++----------------
1 file changed, 330 insertions(+), 328 deletions(-)

--- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
+++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
@@ -22,7 +22,7 @@
*
****************************************************************************/

-#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

#include <linux/module.h>
#include <linux/kernel.h>
@@ -62,6 +62,8 @@ static long ibmvscsis_parse_command(stru

static void ibmvscsis_adapter_idle(struct scsi_info *vscsi);

+static void ibmvscsis_reset_queue(struct scsi_info *vscsi, uint new_state);
+
static void ibmvscsis_determine_resid(struct se_cmd *se_cmd,
struct srp_rsp *rsp)
{
@@ -82,7 +84,7 @@ static void ibmvscsis_determine_resid(st
}
} else if (se_cmd->se_cmd_flags & SCF_OVERFLOW_BIT) {
if (se_cmd->data_direction == DMA_TO_DEVICE) {
- /* residual data from an overflow write */
+ /* residual data from an overflow write */
rsp->flags = SRP_RSP_FLAG_DOOVER;
rsp->data_out_res_cnt = cpu_to_be32(residual_count);
} else if (se_cmd->data_direction == DMA_FROM_DEVICE) {
@@ -102,7 +104,7 @@ static void ibmvscsis_determine_resid(st
* and the function returns TRUE.
*
* EXECUTION ENVIRONMENT:
- * Interrupt or Process environment
+ * Interrupt or Process environment
*/
static bool connection_broken(struct scsi_info *vscsi)
{
@@ -325,7 +327,7 @@ static struct viosrp_crq *ibmvscsis_cmd_
}

/**
- * ibmvscsis_send_init_message() - send initialize message to the client
+ * ibmvscsis_send_init_message() - send initialize message to the client
* @vscsi: Pointer to our adapter structure
* @format: Which Init Message format to send
*
@@ -383,13 +385,13 @@ static long ibmvscsis_check_init_msg(str
vscsi->cmd_q.base_addr);
if (crq) {
*format = (uint)(crq->format);
- rc = ERROR;
+ rc = ERROR;
crq->valid = INVALIDATE_CMD_RESP_EL;
dma_rmb();
}
} else {
*format = (uint)(crq->format);
- rc = ERROR;
+ rc = ERROR;
crq->valid = INVALIDATE_CMD_RESP_EL;
dma_rmb();
}
@@ -398,166 +400,6 @@ static long ibmvscsis_check_init_msg(str
}

/**
- * ibmvscsis_establish_new_q() - Establish new CRQ queue
- * @vscsi: Pointer to our adapter structure
- * @new_state: New state being established after resetting the queue
- *
- * Must be called with interrupt lock held.
- */
-static long ibmvscsis_establish_new_q(struct scsi_info *vscsi, uint new_state)
-{
- long rc = ADAPT_SUCCESS;
- uint format;
-
- vscsi->flags &= PRESERVE_FLAG_FIELDS;
- vscsi->rsp_q_timer.timer_pops = 0;
- vscsi->debit = 0;
- vscsi->credit = 0;
-
- rc = vio_enable_interrupts(vscsi->dma_dev);
- if (rc) {
- pr_warn("reset_queue: failed to enable interrupts, rc %ld\n",
- rc);
- return rc;
- }
-
- rc = ibmvscsis_check_init_msg(vscsi, &format);
- if (rc) {
- dev_err(&vscsi->dev, "reset_queue: check_init_msg failed, rc %ld\n",
- rc);
- return rc;
- }
-
- if (format == UNUSED_FORMAT && new_state == WAIT_CONNECTION) {
- rc = ibmvscsis_send_init_message(vscsi, INIT_MSG);
- switch (rc) {
- case H_SUCCESS:
- case H_DROPPED:
- case H_CLOSED:
- rc = ADAPT_SUCCESS;
- break;
-
- case H_PARAMETER:
- case H_HARDWARE:
- break;
-
- default:
- vscsi->state = UNDEFINED;
- rc = H_HARDWARE;
- break;
- }
- }
-
- return rc;
-}
-
-/**
- * ibmvscsis_reset_queue() - Reset CRQ Queue
- * @vscsi: Pointer to our adapter structure
- * @new_state: New state to establish after resetting the queue
- *
- * This function calls h_free_q and then calls h_reg_q and does all
- * of the bookkeeping to get us back to where we can communicate.
- *
- * Actually, we don't always call h_free_crq. A problem was discovered
- * where one partition would close and reopen his queue, which would
- * cause his partner to get a transport event, which would cause him to
- * close and reopen his queue, which would cause the original partition
- * to get a transport event, etc., etc. To prevent this, we don't
- * actually close our queue if the client initiated the reset, (i.e.
- * either we got a transport event or we have detected that the client's
- * queue is gone)
- *
- * EXECUTION ENVIRONMENT:
- * Process environment, called with interrupt lock held
- */
-static void ibmvscsis_reset_queue(struct scsi_info *vscsi, uint new_state)
-{
- int bytes;
- long rc = ADAPT_SUCCESS;
-
- pr_debug("reset_queue: flags 0x%x\n", vscsi->flags);
-
- /* don't reset, the client did it for us */
- if (vscsi->flags & (CLIENT_FAILED | TRANS_EVENT)) {
- vscsi->flags &= PRESERVE_FLAG_FIELDS;
- vscsi->rsp_q_timer.timer_pops = 0;
- vscsi->debit = 0;
- vscsi->credit = 0;
- vscsi->state = new_state;
- vio_enable_interrupts(vscsi->dma_dev);
- } else {
- rc = ibmvscsis_free_command_q(vscsi);
- if (rc == ADAPT_SUCCESS) {
- vscsi->state = new_state;
-
- bytes = vscsi->cmd_q.size * PAGE_SIZE;
- rc = h_reg_crq(vscsi->dds.unit_id,
- vscsi->cmd_q.crq_token, bytes);
- if (rc == H_CLOSED || rc == H_SUCCESS) {
- rc = ibmvscsis_establish_new_q(vscsi,
- new_state);
- }
-
- if (rc != ADAPT_SUCCESS) {
- pr_debug("reset_queue: reg_crq rc %ld\n", rc);
-
- vscsi->state = ERR_DISCONNECTED;
- vscsi->flags |= RESPONSE_Q_DOWN;
- ibmvscsis_free_command_q(vscsi);
- }
- } else {
- vscsi->state = ERR_DISCONNECTED;
- vscsi->flags |= RESPONSE_Q_DOWN;
- }
- }
-}
-
-/**
- * ibmvscsis_free_cmd_resources() - Free command resources
- * @vscsi: Pointer to our adapter structure
- * @cmd: Command which is not longer in use
- *
- * Must be called with interrupt lock held.
- */
-static void ibmvscsis_free_cmd_resources(struct scsi_info *vscsi,
- struct ibmvscsis_cmd *cmd)
-{
- struct iu_entry *iue = cmd->iue;
-
- switch (cmd->type) {
- case TASK_MANAGEMENT:
- case SCSI_CDB:
- /*
- * When the queue goes down this value is cleared, so it
- * cannot be cleared in this general purpose function.
- */
- if (vscsi->debit)
- vscsi->debit -= 1;
- break;
- case ADAPTER_MAD:
- vscsi->flags &= ~PROCESSING_MAD;
- break;
- case UNSET_TYPE:
- break;
- default:
- dev_err(&vscsi->dev, "free_cmd_resources unknown type %d\n",
- cmd->type);
- break;
- }
-
- cmd->iue = NULL;
- list_add_tail(&cmd->list, &vscsi->free_cmd);
- srp_iu_put(iue);
-
- if (list_empty(&vscsi->active_q) && list_empty(&vscsi->schedule_q) &&
- list_empty(&vscsi->waiting_rsp) && (vscsi->flags & WAIT_FOR_IDLE)) {
- vscsi->flags &= ~WAIT_FOR_IDLE;
- complete(&vscsi->wait_idle);
- }
-}
-
-/**
* ibmvscsis_disconnect() - Helper function to disconnect
* @work: Pointer to work_struct, gives access to our adapter structure
*
@@ -590,7 +432,7 @@ static void ibmvscsis_disconnect(struct
* should transitition to the new state
*/
switch (vscsi->state) {
- /* Should never be called while in this state. */
+ /* Should never be called while in this state. */
case NO_QUEUE:
/*
* Can never transition from this state;
@@ -807,6 +649,316 @@ static void ibmvscsis_post_disconnect(st
}

/**
+ * ibmvscsis_handle_init_compl_msg() - Respond to an Init Complete Message
+ * @vscsi: Pointer to our adapter structure
+ *
+ * Must be called with interrupt lock held.
+ */
+static long ibmvscsis_handle_init_compl_msg(struct scsi_info *vscsi)
+{
+ long rc = ADAPT_SUCCESS;
+
+ switch (vscsi->state) {
+ case NO_QUEUE:
+ case ERR_DISCONNECT:
+ case ERR_DISCONNECT_RECONNECT:
+ case ERR_DISCONNECTED:
+ case UNCONFIGURING:
+ case UNDEFINED:
+ rc = ERROR;
+ break;
+
+ case WAIT_CONNECTION:
+ vscsi->state = CONNECTED;
+ break;
+
+ case WAIT_IDLE:
+ case SRP_PROCESSING:
+ case CONNECTED:
+ case WAIT_ENABLED:
+ case PART_UP_WAIT_ENAB:
+ default:
+ rc = ERROR;
+ dev_err(&vscsi->dev, "init_msg: invalid state %d to get init compl msg\n",
+ vscsi->state);
+ ibmvscsis_post_disconnect(vscsi, ERR_DISCONNECT_RECONNECT, 0);
+ break;
+ }
+
+ return rc;
+}
+
+/**
+ * ibmvscsis_handle_init_msg() - Respond to an Init Message
+ * @vscsi: Pointer to our adapter structure
+ *
+ * Must be called with interrupt lock held.
+ */
+static long ibmvscsis_handle_init_msg(struct scsi_info *vscsi)
+{
+ long rc = ADAPT_SUCCESS;
+
+ switch (vscsi->state) {
+ case WAIT_ENABLED:
+ vscsi->state = PART_UP_WAIT_ENAB;
+ break;
+
+ case WAIT_CONNECTION:
+ rc = ibmvscsis_send_init_message(vscsi, INIT_COMPLETE_MSG);
+ switch (rc) {
+ case H_SUCCESS:
+ vscsi->state = CONNECTED;
+ break;
+
+ case H_PARAMETER:
+ dev_err(&vscsi->dev, "init_msg: failed to send, rc %ld\n",
+ rc);
+ ibmvscsis_post_disconnect(vscsi, ERR_DISCONNECT, 0);
+ break;
+
+ case H_DROPPED:
+ dev_err(&vscsi->dev, "init_msg: failed to send, rc %ld\n",
+ rc);
+ rc = ERROR;
+ ibmvscsis_post_disconnect(vscsi,
+ ERR_DISCONNECT_RECONNECT, 0);
+ break;
+
+ case H_CLOSED:
+ pr_warn("init_msg: failed to send, rc %ld\n", rc);
+ rc = 0;
+ break;
+ }
+ break;
+
+ case UNDEFINED:
+ rc = ERROR;
+ break;
+
+ case UNCONFIGURING:
+ break;
+
+ case PART_UP_WAIT_ENAB:
+ case CONNECTED:
+ case SRP_PROCESSING:
+ case WAIT_IDLE:
+ case NO_QUEUE:
+ case ERR_DISCONNECT:
+ case ERR_DISCONNECT_RECONNECT:
+ case ERR_DISCONNECTED:
+ default:
+ rc = ERROR;
+ dev_err(&vscsi->dev, "init_msg: invalid state %d to get init msg\n",
+ vscsi->state);
+ ibmvscsis_post_disconnect(vscsi, ERR_DISCONNECT_RECONNECT, 0);
+ break;
+ }
+
+ return rc;
+}
+
+/**
+ * ibmvscsis_init_msg() - Respond to an init message
+ * @vscsi: Pointer to our adapter structure
+ * @crq: Pointer to CRQ element containing the Init Message
+ *
+ * EXECUTION ENVIRONMENT:
+ * Interrupt, interrupt lock held
+ */
+static long ibmvscsis_init_msg(struct scsi_info *vscsi, struct viosrp_crq *crq)
+{
+ long rc = ADAPT_SUCCESS;
+
+ pr_debug("init_msg: state 0x%hx\n", vscsi->state);
+
+ rc = h_vioctl(vscsi->dds.unit_id, H_GET_PARTNER_INFO,
+ (u64)vscsi->map_ioba | ((u64)PAGE_SIZE << 32), 0, 0, 0,
+ 0);
+ if (rc == H_SUCCESS) {
+ vscsi->client_data.partition_number =
+ be64_to_cpu(*(u64 *)vscsi->map_buf);
+ pr_debug("init_msg, part num %d\n",
+ vscsi->client_data.partition_number);
+ } else {
+ pr_debug("init_msg h_vioctl rc %ld\n", rc);
+ rc = ADAPT_SUCCESS;
+ }
+
+ if (crq->format == INIT_MSG) {
+ rc = ibmvscsis_handle_init_msg(vscsi);
+ } else if (crq->format == INIT_COMPLETE_MSG) {
+ rc = ibmvscsis_handle_init_compl_msg(vscsi);
+ } else {
+ rc = ERROR;
+ dev_err(&vscsi->dev, "init_msg: invalid format %d\n",
+ (uint)crq->format);
+ ibmvscsis_post_disconnect(vscsi, ERR_DISCONNECT_RECONNECT, 0);
+ }
+
+ return rc;
+}
+
+/**
+ * ibmvscsis_establish_new_q() - Establish new CRQ queue
+ * @vscsi: Pointer to our adapter structure
+ * @new_state: New state being established after resetting the queue
+ *
+ * Must be called with interrupt lock held.
+ */
+static long ibmvscsis_establish_new_q(struct scsi_info *vscsi, uint new_state)
+{
+ long rc = ADAPT_SUCCESS;
+ uint format;
+
+ vscsi->flags &= PRESERVE_FLAG_FIELDS;
+ vscsi->rsp_q_timer.timer_pops = 0;
+ vscsi->debit = 0;
+ vscsi->credit = 0;
+
+ rc = vio_enable_interrupts(vscsi->dma_dev);
+ if (rc) {
+ pr_warn("reset_queue: failed to enable interrupts, rc %ld\n",
+ rc);
+ return rc;
+ }
+
+ rc = ibmvscsis_check_init_msg(vscsi, &format);
+ if (rc) {
+ dev_err(&vscsi->dev, "reset_queue: check_init_msg failed, rc %ld\n",
+ rc);
+ return rc;
+ }
+
+ if (format == UNUSED_FORMAT && new_state == WAIT_CONNECTION) {
+ rc = ibmvscsis_send_init_message(vscsi, INIT_MSG);
+ switch (rc) {
+ case H_SUCCESS:
+ case H_DROPPED:
+ case H_CLOSED:
+ rc = ADAPT_SUCCESS;
+ break;
+
+ case H_PARAMETER:
+ case H_HARDWARE:
+ break;
+
+ default:
+ vscsi->state = UNDEFINED;
+ rc = H_HARDWARE;
+ break;
+ }
+ }
+
+ return rc;
+}
+
+/**
+ * ibmvscsis_reset_queue() - Reset CRQ Queue
+ * @vscsi: Pointer to our adapter structure
+ * @new_state: New state to establish after resetting the queue
+ *
+ * This function calls h_free_q and then calls h_reg_q and does all
+ * of the bookkeeping to get us back to where we can communicate.
+ *
+ * Actually, we don't always call h_free_crq. A problem was discovered
+ * where one partition would close and reopen his queue, which would
+ * cause his partner to get a transport event, which would cause him to
+ * close and reopen his queue, which would cause the original partition
+ * to get a transport event, etc., etc. To prevent this, we don't
+ * actually close our queue if the client initiated the reset, (i.e.
+ * either we got a transport event or we have detected that the client's
+ * queue is gone)
+ *
+ * EXECUTION ENVIRONMENT:
+ * Process environment, called with interrupt lock held
+ */
+static void ibmvscsis_reset_queue(struct scsi_info *vscsi, uint new_state)
+{
+ int bytes;
+ long rc = ADAPT_SUCCESS;
+
+ pr_debug("reset_queue: flags 0x%x\n", vscsi->flags);
+
+ /* don't reset, the client did it for us */
+ if (vscsi->flags & (CLIENT_FAILED | TRANS_EVENT)) {
+ vscsi->flags &= PRESERVE_FLAG_FIELDS;
+ vscsi->rsp_q_timer.timer_pops = 0;
+ vscsi->debit = 0;
+ vscsi->credit = 0;
+ vscsi->state = new_state;
+ vio_enable_interrupts(vscsi->dma_dev);
+ } else {
+ rc = ibmvscsis_free_command_q(vscsi);
+ if (rc == ADAPT_SUCCESS) {
+ vscsi->state = new_state;
+
+ bytes = vscsi->cmd_q.size * PAGE_SIZE;
+ rc = h_reg_crq(vscsi->dds.unit_id,
+ vscsi->cmd_q.crq_token, bytes);
+ if (rc == H_CLOSED || rc == H_SUCCESS) {
+ rc = ibmvscsis_establish_new_q(vscsi,
+ new_state);
+ }
+
+ if (rc != ADAPT_SUCCESS) {
+ pr_debug("reset_queue: reg_crq rc %ld\n", rc);
+
+ vscsi->state = ERR_DISCONNECTED;
+ vscsi->flags |= RESPONSE_Q_DOWN;
+ ibmvscsis_free_command_q(vscsi);
+ }
+ } else {
+ vscsi->state = ERR_DISCONNECTED;
+ vscsi->flags |= RESPONSE_Q_DOWN;
+ }
+ }
+}
+
+/**
+ * ibmvscsis_free_cmd_resources() - Free command resources
+ * @vscsi: Pointer to our adapter structure
+ * @cmd: Command which is not longer in use
+ *
+ * Must be called with interrupt lock held.
+ */
+static void ibmvscsis_free_cmd_resources(struct scsi_info *vscsi,
+ struct ibmvscsis_cmd *cmd)
+{
+ struct iu_entry *iue = cmd->iue;
+
+ switch (cmd->type) {
+ case TASK_MANAGEMENT:
+ case SCSI_CDB:
+ /*
+ * When the queue goes down this value is cleared, so it
+ * cannot be cleared in this general purpose function.
+ */
+ if (vscsi->debit)
+ vscsi->debit -= 1;
+ break;
+ case ADAPTER_MAD:
+ vscsi->flags &= ~PROCESSING_MAD;
+ break;
+ case UNSET_TYPE:
+ break;
+ default:
+ dev_err(&vscsi->dev, "free_cmd_resources unknown type %d\n",
+ cmd->type);
+ break;
+ }
+
+ cmd->iue = NULL;
+ list_add_tail(&cmd->list, &vscsi->free_cmd);
+ srp_iu_put(iue);
+
+ if (list_empty(&vscsi->active_q) && list_empty(&vscsi->schedule_q) &&
+ list_empty(&vscsi->waiting_rsp) && (vscsi->flags & WAIT_FOR_IDLE)) {
+ vscsi->flags &= ~WAIT_FOR_IDLE;
+ complete(&vscsi->wait_idle);
+ }
+}
+
+/**
* ibmvscsis_trans_event() - Handle a Transport Event
* @vscsi: Pointer to our adapter structure
* @crq: Pointer to CRQ entry containing the Transport Event
@@ -896,7 +1048,7 @@ static long ibmvscsis_trans_event(struct
}
}

- rc = vscsi->flags & SCHEDULE_DISCONNECT;
+ rc = vscsi->flags & SCHEDULE_DISCONNECT;

pr_debug("Leaving trans_event: flags 0x%x, state 0x%hx, rc %ld\n",
vscsi->flags, vscsi->state, rc);
@@ -1221,7 +1373,7 @@ static long ibmvscsis_copy_crq_packet(st
* @iue: Information Unit containing the Adapter Info MAD request
*
* EXECUTION ENVIRONMENT:
- * Interrupt adpater lock is held
+ * Interrupt adapter lock is held
*/
static long ibmvscsis_adapter_info(struct scsi_info *vscsi,
struct iu_entry *iue)
@@ -1692,7 +1844,7 @@ static void ibmvscsis_send_mad_resp(stru
* @crq: Pointer to the CRQ entry containing the MAD request
*
* EXECUTION ENVIRONMENT:
- * Interrupt called with adapter lock held
+ * Interrupt, called with adapter lock held
*/
static long ibmvscsis_mad(struct scsi_info *vscsi, struct viosrp_crq *crq)
{
@@ -1858,7 +2010,7 @@ static long ibmvscsis_srp_login_rej(stru
break;
case H_PERMISSION:
if (connection_broken(vscsi))
- flag_bits = RESPONSE_Q_DOWN | CLIENT_FAILED;
+ flag_bits = RESPONSE_Q_DOWN | CLIENT_FAILED;
dev_err(&vscsi->dev, "login_rej: error copying to client, rc %ld\n",
rc);
ibmvscsis_post_disconnect(vscsi, ERR_DISCONNECT_RECONNECT,
@@ -2181,156 +2333,6 @@ static long ibmvscsis_ping_response(stru
}

/**
- * ibmvscsis_handle_init_compl_msg() - Respond to an Init Complete Message
- * @vscsi: Pointer to our adapter structure
- *
- * Must be called with interrupt lock held.
- */
-static long ibmvscsis_handle_init_compl_msg(struct scsi_info *vscsi)
-{
- long rc = ADAPT_SUCCESS;
-
- switch (vscsi->state) {
- case NO_QUEUE:
- case ERR_DISCONNECT:
- case ERR_DISCONNECT_RECONNECT:
- case ERR_DISCONNECTED:
- case UNCONFIGURING:
- case UNDEFINED:
- rc = ERROR;
- break;
-
- case WAIT_CONNECTION:
- vscsi->state = CONNECTED;
- break;
-
- case WAIT_IDLE:
- case SRP_PROCESSING:
- case CONNECTED:
- case WAIT_ENABLED:
- case PART_UP_WAIT_ENAB:
- default:
- rc = ERROR;
- dev_err(&vscsi->dev, "init_msg: invalid state %d to get init compl msg\n",
- vscsi->state);
- ibmvscsis_post_disconnect(vscsi, ERR_DISCONNECT_RECONNECT, 0);
- break;
- }
-
- return rc;
-}
-
-/**
- * ibmvscsis_handle_init_msg() - Respond to an Init Message
- * @vscsi: Pointer to our adapter structure
- *
- * Must be called with interrupt lock held.
- */
-static long ibmvscsis_handle_init_msg(struct scsi_info *vscsi)
-{
- long rc = ADAPT_SUCCESS;
-
- switch (vscsi->state) {
- case WAIT_ENABLED:
- vscsi->state = PART_UP_WAIT_ENAB;
- break;
-
- case WAIT_CONNECTION:
- rc = ibmvscsis_send_init_message(vscsi, INIT_COMPLETE_MSG);
- switch (rc) {
- case H_SUCCESS:
- vscsi->state = CONNECTED;
- break;
-
- case H_PARAMETER:
- dev_err(&vscsi->dev, "init_msg: failed to send, rc %ld\n",
- rc);
- ibmvscsis_post_disconnect(vscsi, ERR_DISCONNECT, 0);
- break;
-
- case H_DROPPED:
- dev_err(&vscsi->dev, "init_msg: failed to send, rc %ld\n",
- rc);
- rc = ERROR;
- ibmvscsis_post_disconnect(vscsi,
- ERR_DISCONNECT_RECONNECT, 0);
- break;
-
- case H_CLOSED:
- pr_warn("init_msg: failed to send, rc %ld\n", rc);
- rc = 0;
- break;
- }
- break;
-
- case UNDEFINED:
- rc = ERROR;
- break;
-
- case UNCONFIGURING:
- break;
-
- case PART_UP_WAIT_ENAB:
- case CONNECTED:
- case SRP_PROCESSING:
- case WAIT_IDLE:
- case NO_QUEUE:
- case ERR_DISCONNECT:
- case ERR_DISCONNECT_RECONNECT:
- case ERR_DISCONNECTED:
- default:
- rc = ERROR;
- dev_err(&vscsi->dev, "init_msg: invalid state %d to get init msg\n",
- vscsi->state);
- ibmvscsis_post_disconnect(vscsi, ERR_DISCONNECT_RECONNECT, 0);
- break;
- }
-
- return rc;
-}
-
-/**
- * ibmvscsis_init_msg() - Respond to an init message
- * @vscsi: Pointer to our adapter structure
- * @crq: Pointer to CRQ element containing the Init Message
- *
- * EXECUTION ENVIRONMENT:
- * Interrupt, interrupt lock held
- */
-static long ibmvscsis_init_msg(struct scsi_info *vscsi, struct viosrp_crq *crq)
-{
- long rc = ADAPT_SUCCESS;
-
- pr_debug("init_msg: state 0x%hx\n", vscsi->state);
-
- rc = h_vioctl(vscsi->dds.unit_id, H_GET_PARTNER_INFO,
- (u64)vscsi->map_ioba | ((u64)PAGE_SIZE << 32), 0, 0, 0,
- 0);
- if (rc == H_SUCCESS) {
- vscsi->client_data.partition_number =
- be64_to_cpu(*(u64 *)vscsi->map_buf);
- pr_debug("init_msg, part num %d\n",
- vscsi->client_data.partition_number);
- } else {
- pr_debug("init_msg h_vioctl rc %ld\n", rc);
- rc = ADAPT_SUCCESS;
- }
-
- if (crq->format == INIT_MSG) {
- rc = ibmvscsis_handle_init_msg(vscsi);
- } else if (crq->format == INIT_COMPLETE_MSG) {
- rc = ibmvscsis_handle_init_compl_msg(vscsi);
- } else {
- rc = ERROR;
- dev_err(&vscsi->dev, "init_msg: invalid format %d\n",
- (uint)crq->format);
- ibmvscsis_post_disconnect(vscsi, ERR_DISCONNECT_RECONNECT, 0);
- }
-
- return rc;
-}
-
-/**
* ibmvscsis_parse_command() - Parse an element taken from the cmd rsp queue.
* @vscsi: Pointer to our adapter structure
* @crq: Pointer to CRQ element containing the SRP request
@@ -2385,7 +2387,7 @@ static long ibmvscsis_parse_command(stru
break;

case VALID_TRANS_EVENT:
- rc = ibmvscsis_trans_event(vscsi, crq);
+ rc = ibmvscsis_trans_event(vscsi, crq);
break;

case VALID_INIT_MSG:
@@ -3270,7 +3272,7 @@ static void ibmvscsis_handle_crq(unsigne
/*
* if we are in a path where we are waiting for all pending commands
* to complete because we received a transport event and anything in
- * the command queue is for a new connection, do nothing
+ * the command queue is for a new connection, do nothing
*/
if (TARGET_STOP(vscsi)) {
vio_enable_interrupts(vscsi->dma_dev);
@@ -3314,7 +3316,7 @@ cmd_work:
* everything but transport events on the queue
*
* need to decrement the queue index so we can
- * look at the elment again
+ * look at the element again
*/
if (vscsi->cmd_q.index)
vscsi->cmd_q.index -= 1;
@@ -3988,10 +3990,10 @@ static struct attribute *ibmvscsis_dev_a
ATTRIBUTE_GROUPS(ibmvscsis_dev);

static struct class ibmvscsis_class = {
- .name = "ibmvscsis",
- .dev_release = ibmvscsis_dev_release,
- .class_attrs = ibmvscsis_class_attrs,
- .dev_groups = ibmvscsis_dev_groups,
+ .name = "ibmvscsis",
+ .dev_release = ibmvscsis_dev_release,
+ .class_attrs = ibmvscsis_class_attrs,
+ .dev_groups = ibmvscsis_dev_groups,
};

static struct vio_device_id ibmvscsis_device_table[] = {


2017-03-20 18:29:29

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 31/93] mpls: Do not decrement alive counter for unregister events

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Ahern <[email protected]>


[ Upstream commit 79099aab38c8f5c746748b066ae74ba984fe2cc8 ]

Multipath routes can be rendered useless when a device in one of the
paths is deleted. For example:

$ ip -f mpls ro ls
100
nexthop as to 200 via inet 172.16.2.2 dev virt12
nexthop as to 300 via inet 172.16.3.2 dev br0
101
nexthop as to 201 via inet6 2000:2::2 dev virt12
nexthop as to 301 via inet6 2000:3::2 dev br0

$ ip li del br0

When br0 is deleted the other hop is not considered in
mpls_select_multipath because of the alive check -- rt_nhn_alive
is 0.

rt_nhn_alive is decremented once in mpls_ifdown when the device is taken
down (NETDEV_DOWN) and again when it is deleted (NETDEV_UNREGISTER). For
a 2 hop route, deleting one device drops the alive count to 0. Since
devices are taken down before unregistering, the decrement on
NETDEV_UNREGISTER is redundant.

Fixes: c89359a42e2a4 ("mpls: support for dead routes")
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/mpls/af_mpls.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/net/mpls/af_mpls.c
+++ b/net/mpls/af_mpls.c
@@ -956,7 +956,8 @@ static void mpls_ifdown(struct net_devic
/* fall through */
case NETDEV_CHANGE:
nh->nh_flags |= RTNH_F_LINKDOWN;
- ACCESS_ONCE(rt->rt_nhn_alive) = rt->rt_nhn_alive - 1;
+ if (event != NETDEV_UNREGISTER)
+ ACCESS_ONCE(rt->rt_nhn_alive) = rt->rt_nhn_alive - 1;
break;
}
if (event == NETDEV_UNREGISTER)


2017-03-20 18:29:28

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 32/93] ipv6: make ECMP route replacement less greedy

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Sabrina Dubroca <[email protected]>


[ Upstream commit 67e194007be08d071294456274dd53e0a04fdf90 ]

Commit 27596472473a ("ipv6: fix ECMP route replacement") introduced a
loop that removes all siblings of an ECMP route that is being
replaced. However, this loop doesn't stop when it has replaced
siblings, and keeps removing other routes with a higher metric.
We also end up triggering the WARN_ON after the loop, because by that
point nsiblings < 0.

Instead, stop the loop when we have taken care of all routes with the
same metric as the route being replaced.

Reproducer:
===========
#!/bin/sh

ip netns add ns1
ip netns add ns2
ip -net ns1 link set lo up

for x in 0 1 2 ; do
ip link add veth$x netns ns2 type veth peer name eth$x netns ns1
ip -net ns1 link set eth$x up
ip -net ns2 link set veth$x up
done

ip -net ns1 -6 r a 2000::/64 nexthop via fe80::0 dev eth0 \
nexthop via fe80::1 dev eth1 nexthop via fe80::2 dev eth2
ip -net ns1 -6 r a 2000::/64 via fe80::42 dev eth0 metric 256
ip -net ns1 -6 r a 2000::/64 via fe80::43 dev eth0 metric 2048

echo "before replace, 3 routes"
ip -net ns1 -6 r | grep -v '^fe80\|^ff00'
echo

ip -net ns1 -6 r c 2000::/64 nexthop via fe80::4 dev eth0 \
nexthop via fe80::5 dev eth1 nexthop via fe80::6 dev eth2

echo "after replace, only 2 routes, metric 2048 is gone"
ip -net ns1 -6 r | grep -v '^fe80\|^ff00'

Fixes: 27596472473a ("ipv6: fix ECMP route replacement")
Signed-off-by: Sabrina Dubroca <[email protected]>
Acked-by: Nicolas Dichtel <[email protected]>
Reviewed-by: Xin Long <[email protected]>
Reviewed-by: Michal Kubecek <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv6/ip6_fib.c | 2 ++
1 file changed, 2 insertions(+)

--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -908,6 +908,8 @@ add:
ins = &rt->dst.rt6_next;
iter = *ins;
while (iter) {
+ if (iter->rt6i_metric > rt->rt6i_metric)
+ break;
if (rt6_qualify_for_ecmp(iter)) {
*ins = iter->dst.rt6_next;
fib6_purge_rt(iter, fn, info->nl_net);


2017-03-20 18:33:02

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 27/93] net/tunnel: set inner protocol in network gro hooks

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Paolo Abeni <[email protected]>


[ Upstream commit 294acf1c01bace5cea5d30b510504238bf5f7c25 ]

The gso code of several tunnel types (gre and udp tunnels)
takes for granted that skb->inner_protocol is properly
initialized, and drops the packet otherwise.

On the forwarding path no one is initializing such field,
so gro encapsulated packets are dropped on forward.

Since commit 38720352412a ("gre: Use inner_proto to obtain
inner header protocol"), this can be reproduced when the
encapsulated packets use gre as the tunneling protocol.

The issue happens also with vxlan and geneve tunnels since
commit 8bce6d7d0d1e ("udp: Generalize skb_udp_segment"), if the
forwarding host's ingress nic has h/w offload for such tunnel
and a vxlan/geneve device is configured on top of it, regardless
of the configured peer address and vni.

To address the issue, this change initializes the inner_protocol
field for encapsulated packets in both the ipv4 and ipv6 gro complete
callbacks.

Fixes: 38720352412a ("gre: Use inner_proto to obtain inner header protocol")
Fixes: 8bce6d7d0d1e ("udp: Generalize skb_udp_segment")
Signed-off-by: Paolo Abeni <[email protected]>
Acked-by: Alexander Duyck <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/af_inet.c | 4 +++-
net/ipv6/ip6_offload.c | 4 +++-
2 files changed, 6 insertions(+), 2 deletions(-)

--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1460,8 +1460,10 @@ int inet_gro_complete(struct sk_buff *sk
int proto = iph->protocol;
int err = -ENOSYS;

- if (skb->encapsulation)
+ if (skb->encapsulation) {
+ skb_set_inner_protocol(skb, cpu_to_be16(ETH_P_IP));
skb_set_inner_network_header(skb, nhoff);
+ }

csum_replace2(&iph->check, iph->tot_len, newlen);
iph->tot_len = newlen;
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -294,8 +294,10 @@ static int ipv6_gro_complete(struct sk_b
struct ipv6hdr *iph = (struct ipv6hdr *)(skb->data + nhoff);
int err = -ENOSYS;

- if (skb->encapsulation)
+ if (skb->encapsulation) {
+ skb_set_inner_protocol(skb, cpu_to_be16(ETH_P_IPV6));
skb_set_inner_network_header(skb, nhoff);
+ }

iph->payload_len = htons(skb->len - nhoff - sizeof(*iph));



2017-03-20 18:33:00

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 28/93] uapi: fix linux/packet_diag.h userspace compilation error

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: "Dmitry V. Levin" <[email protected]>


[ Upstream commit 745cb7f8a5de0805cade3de3991b7a95317c7c73 ]

Replace MAX_ADDR_LEN with its numeric value to fix the following
linux/packet_diag.h userspace compilation error:

/usr/include/linux/packet_diag.h:67:17: error: 'MAX_ADDR_LEN' undeclared here (not in a function)
__u8 pdmc_addr[MAX_ADDR_LEN];

This is not the first case in the UAPI where the numeric value
of MAX_ADDR_LEN is used instead of the symbolic one; uapi/linux/if_link.h
already does the same:

$ grep MAX_ADDR_LEN include/uapi/linux/if_link.h
__u8 mac[32]; /* MAX_ADDR_LEN */

There are no UAPI headers besides these two that use MAX_ADDR_LEN.
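
A hypothetical userspace test (not part of the patch) that fails to compile
against the old header and builds cleanly after this change:

	#include <linux/packet_diag.h>

	int main(void)
	{
		/* Before the fix, the header referenced MAX_ADDR_LEN, which
		 * userspace does not define, so this failed to compile.
		 */
		struct packet_diag_mclist m;

		return sizeof(m.pdmc_addr) == 32 ? 0 : 1;
	}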

Signed-off-by: Dmitry V. Levin <[email protected]>
Acked-by: Pavel Emelyanov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/uapi/linux/packet_diag.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/uapi/linux/packet_diag.h
+++ b/include/uapi/linux/packet_diag.h
@@ -64,7 +64,7 @@ struct packet_diag_mclist {
__u32 pdmc_count;
__u16 pdmc_type;
__u16 pdmc_alen;
- __u8 pdmc_addr[MAX_ADDR_LEN];
+ __u8 pdmc_addr[32]; /* MAX_ADDR_LEN */
};

struct packet_diag_ring {


2017-03-20 18:33:32

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 53/93] PCI: Separate VF BAR updates from standard BAR updates

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Bjorn Helgaas <[email protected]>

[ Upstream commit 6ffa2489c51da77564a0881a73765ea2169f955d ]

Previously pci_update_resource() used the same code path for updating
standard BARs and VF BARs in SR-IOV capabilities.

Split the VF BAR update into a new pci_iov_update_resource() internal
interface, which makes it simpler to compute the BAR address (we can get
rid of pci_resource_bar() and pci_iov_resource_bar()).

This patch:

- Renames pci_update_resource() to pci_std_update_resource(),
- Adds pci_iov_update_resource(),
- Makes pci_update_resource() a wrapper that calls the appropriate one.

No functional change intended.

Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Gavin Shan <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/pci/iov.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++
drivers/pci/pci.h | 1
drivers/pci/setup-res.c | 13 ++++++++++--
3 files changed, 62 insertions(+), 2 deletions(-)

--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -571,6 +571,56 @@ int pci_iov_resource_bar(struct pci_dev
4 * (resno - PCI_IOV_RESOURCES);
}

+/**
+ * pci_iov_update_resource - update a VF BAR
+ * @dev: the PCI device
+ * @resno: the resource number
+ *
+ * Update a VF BAR in the SR-IOV capability of a PF.
+ */
+void pci_iov_update_resource(struct pci_dev *dev, int resno)
+{
+ struct pci_sriov *iov = dev->is_physfn ? dev->sriov : NULL;
+ struct resource *res = dev->resource + resno;
+ int vf_bar = resno - PCI_IOV_RESOURCES;
+ struct pci_bus_region region;
+ u32 new;
+ int reg;
+
+ /*
+ * The generic pci_restore_bars() path calls this for all devices,
+ * including VFs and non-SR-IOV devices. If this is not a PF, we
+ * have nothing to do.
+ */
+ if (!iov)
+ return;
+
+ /*
+ * Ignore unimplemented BARs, unused resource slots for 64-bit
+ * BARs, and non-movable resources, e.g., those described via
+ * Enhanced Allocation.
+ */
+ if (!res->flags)
+ return;
+
+ if (res->flags & IORESOURCE_UNSET)
+ return;
+
+ if (res->flags & IORESOURCE_PCI_FIXED)
+ return;
+
+ pcibios_resource_to_bus(dev->bus, &region, res);
+ new = region.start;
+ new |= res->flags & ~PCI_BASE_ADDRESS_MEM_MASK;
+
+ reg = iov->pos + PCI_SRIOV_BAR + 4 * vf_bar;
+ pci_write_config_dword(dev, reg, new);
+ if (res->flags & IORESOURCE_MEM_64) {
+ new = region.start >> 16 >> 16;
+ pci_write_config_dword(dev, reg + 4, new);
+ }
+}
+
resource_size_t __weak pcibios_iov_resource_alignment(struct pci_dev *dev,
int resno)
{
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -290,6 +290,7 @@ static inline void pci_restore_ats_state
int pci_iov_init(struct pci_dev *dev);
void pci_iov_release(struct pci_dev *dev);
int pci_iov_resource_bar(struct pci_dev *dev, int resno);
+void pci_iov_update_resource(struct pci_dev *dev, int resno);
resource_size_t pci_sriov_resource_alignment(struct pci_dev *dev, int resno);
void pci_restore_iov_state(struct pci_dev *dev);
int pci_iov_bus_range(struct pci_bus *bus);
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -25,8 +25,7 @@
#include <linux/slab.h>
#include "pci.h"

-
-void pci_update_resource(struct pci_dev *dev, int resno)
+static void pci_std_update_resource(struct pci_dev *dev, int resno)
{
struct pci_bus_region region;
bool disable;
@@ -110,6 +109,16 @@ void pci_update_resource(struct pci_dev
pci_write_config_word(dev, PCI_COMMAND, cmd);
}

+void pci_update_resource(struct pci_dev *dev, int resno)
+{
+ if (resno <= PCI_ROM_RESOURCE)
+ pci_std_update_resource(dev, resno);
+#ifdef CONFIG_PCI_IOV
+ else if (resno >= PCI_IOV_RESOURCES && resno <= PCI_IOV_RESOURCE_END)
+ pci_iov_update_resource(dev, resno);
+#endif
+}
+
int pci_claim_resource(struct pci_dev *dev, int resource)
{
struct resource *res = &dev->resource[resource];


2017-03-20 18:33:52

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 26/93] vrf: Fix use-after-free in vrf_xmit

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Ahern <[email protected]>


[ Upstream commit f7887d40e541f74402df0684a1463c0a0bb68c68 ]

KASAN detected a use-after-free:

[ 269.467067] BUG: KASAN: use-after-free in vrf_xmit+0x7f1/0x827 [vrf] at addr ffff8800350a21c0
[ 269.467067] Read of size 4 by task ssh/1879
[ 269.467067] CPU: 1 PID: 1879 Comm: ssh Not tainted 4.10.0+ #249
[ 269.467067] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 269.467067] Call Trace:
[ 269.467067] dump_stack+0x81/0xb6
[ 269.467067] kasan_object_err+0x21/0x78
[ 269.467067] kasan_report+0x2f7/0x450
[ 269.467067] ? vrf_xmit+0x7f1/0x827 [vrf]
[ 269.467067] ? ip_output+0xa4/0xdb
[ 269.467067] __asan_load4+0x6b/0x6d
[ 269.467067] vrf_xmit+0x7f1/0x827 [vrf]
...

This corresponds to the skb access after the xmit handling. Fix this by
saving skb->len and using the saved value to update the stats.

Fixes: 193125dbd8eb2 ("net: Introduce VRF device driver")
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/vrf.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -346,6 +346,7 @@ static netdev_tx_t is_ip_tx_frame(struct

static netdev_tx_t vrf_xmit(struct sk_buff *skb, struct net_device *dev)
{
+ int len = skb->len;
netdev_tx_t ret = is_ip_tx_frame(skb, dev);

if (likely(ret == NET_XMIT_SUCCESS || ret == NET_XMIT_CN)) {
@@ -353,7 +354,7 @@ static netdev_tx_t vrf_xmit(struct sk_bu

u64_stats_update_begin(&dstats->syncp);
dstats->tx_pkts++;
- dstats->tx_bytes += skb->len;
+ dstats->tx_bytes += len;
u64_stats_update_end(&dstats->syncp);
} else {
this_cpu_inc(dev->dstats->tx_drps);


2017-03-20 18:33:56

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 52/93] x86/hyperv: Handle unknown NMIs on one CPU when unknown_nmi_panic

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vitaly Kuznetsov <[email protected]>

[ Upstream commit 59107e2f48831daedc46973ce4988605ab066de3 ]

There is a feature in Hyper-V ('Debug-VM --InjectNonMaskableInterrupt')
which injects an NMI into the guest. We may want to crash the guest and do kdump
on this NMI by enabling unknown_nmi_panic. To make kdump succeed we need to
allow the kdump kernel to re-establish VMBus connection so it will see
VMBus devices (storage, network,..).

To properly unload VMBus, making it possible to start over during kdump, we
need to do the following:

- Send an 'unload' message to the hypervisor. This can be done on any CPU,
so we do it on the crashing CPU.

- Receive the 'unload finished' reply message. WS2012R2 delivers this
message to the CPU which was used to establish the VMBus connection during
module load, and this CPU may differ from the CPU sending 'unload'.

Receiving a VMBus message means the following:

- There is a per-CPU slot in memory for one message. This slot can in
theory be accessed by any CPU.

- We get an interrupt on the CPU when a message is placed into the slot.

- When we read the message we need to clear the slot and signal that fact
to the hypervisor. In case there are more messages pending for this CPU,
the hypervisor will deliver the next one. The signaling is done by
writing to an MSR, so this can only be done on the appropriate CPU.

To avoid doing cross-CPU work on crash we have the vmbus_wait_for_unload()
function, which checks message slots for all CPUs in a loop, waiting for the
'unload finished' message. However, an issue arises when these
conditions are met:

- We're crashing on a CPU which is different from the one which was used
to initially contact the hypervisor.

- The CPU which was used for the initial contact is blocked with interrupts
disabled and there is a message pending in the message slot.

In this case we won't be able to read the 'unload finished' message on the
crashing CPU. This is reproducible when we receive unknown NMIs on all CPUs
simultaneously: the first CPU entering panic() will proceed to crash and
all other CPUs will stop themselves with interrupts disabled.

The suggested solution is to handle unknown NMIs for Hyper-V guests on the
first CPU which gets them only. This allows us to rely on the VMBus
interrupt handler being able to receive the 'unload finished' message even
when it is delivered to a different CPU.

The issue is not reproducible on WS2016 as Debug-VM delivers the NMI to the
boot CPU only; WS2012R2 and earlier Hyper-V versions are affected.
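
As an aside (illustration only, not kernel code): the "first CPU claims the
unknown NMI" gate in hv_nmi_unknown() boils down to a single
compare-and-swap on a shared variable. A compilable C11 sketch of the same
pattern:

#include <stdatomic.h>
#include <stdio.h>

static atomic_int nmi_cpu = -1;         /* -1: nobody has claimed the NMI */

/* First caller wins the claim (and would go on to panic); everyone
 * else sees the slot already taken and swallows the NMI as handled. */
static int claim_nmi(int cpu)
{
        int expected = -1;

        return atomic_compare_exchange_strong(&nmi_cpu, &expected, cpu);
}

int main(void)
{
        printf("cpu 1 claims: %d\n", claim_nmi(1));     /* 1: first claimant */
        printf("cpu 2 claims: %d\n", claim_nmi(2));     /* 0: already claimed */
        return 0;
}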

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Acked-by: K. Y. Srinivasan <[email protected]>
Cc: [email protected]
Cc: Haiyang Zhang <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kernel/cpu/mshyperv.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)

--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -31,6 +31,7 @@
#include <asm/apic.h>
#include <asm/timer.h>
#include <asm/reboot.h>
+#include <asm/nmi.h>

struct ms_hyperv_info ms_hyperv;
EXPORT_SYMBOL_GPL(ms_hyperv);
@@ -158,6 +159,26 @@ static unsigned char hv_get_nmi_reason(v
return 0;
}

+#ifdef CONFIG_X86_LOCAL_APIC
+/*
+ * Prior to WS2016 Debug-VM sends NMIs to all CPUs which makes
+ * it dificult to process CHANNELMSG_UNLOAD in case of crash. Handle
+ * unknown NMI on the first CPU which gets it.
+ */
+static int hv_nmi_unknown(unsigned int val, struct pt_regs *regs)
+{
+ static atomic_t nmi_cpu = ATOMIC_INIT(-1);
+
+ if (!unknown_nmi_panic)
+ return NMI_DONE;
+
+ if (atomic_cmpxchg(&nmi_cpu, -1, raw_smp_processor_id()) != -1)
+ return NMI_HANDLED;
+
+ return NMI_DONE;
+}
+#endif
+
static void __init ms_hyperv_init_platform(void)
{
/*
@@ -183,6 +204,9 @@ static void __init ms_hyperv_init_platfo
pr_info("HyperV: LAPIC Timer Frequency: %#x\n",
lapic_timer_frequency);
}
+
+ register_nmi_handler(NMI_UNKNOWN, hv_nmi_unknown, NMI_FLAG_FIRST,
+ "hv_nmi_unknown");
#endif

if (ms_hyperv.features & HV_X64_MSR_TIME_REF_COUNT_AVAILABLE)


2017-03-20 18:33:58

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 51/93] scsi: ibmvscsis: Synchronize cmds at remove time

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Michael Cyr <[email protected]>

[ Upstream commit 8bf11557d44d00562360d370de8aa70ba89aa0d5 ]

This patch adds code to disconnect from the client, which makes sure any
outstanding commands have completed before continuing with the remove
operation.
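
The synchronization itself is the classic completion handshake: remove()
posts a disconnect, marks itself as sleeping, and blocks until the
disconnect/idle path signals that all commands have drained. A
self-contained userspace model of that handshake (pthread-based,
illustrative names; the kernel's struct completion is conceptually the same
mutex/condvar/flag triple):

#include <pthread.h>
#include <stdio.h>

struct completion {
        pthread_mutex_t lock;
        pthread_cond_t cond;
        int done;
};

static void init_completion(struct completion *c)
{
        pthread_mutex_init(&c->lock, NULL);
        pthread_cond_init(&c->cond, NULL);
        c->done = 0;
}

static void complete(struct completion *c)
{
        pthread_mutex_lock(&c->lock);
        c->done = 1;
        pthread_cond_signal(&c->cond);
        pthread_mutex_unlock(&c->lock);
}

static void wait_for_completion(struct completion *c)
{
        pthread_mutex_lock(&c->lock);
        while (!c->done)
                pthread_cond_wait(&c->cond, &c->lock);
        pthread_mutex_unlock(&c->lock);
}

static struct completion unconfig;

static void *worker(void *arg)
{
        /* ... drain outstanding commands ... */
        complete(&unconfig);            /* wake the remove path */
        return NULL;
}

int main(void)
{
        pthread_t t;

        init_completion(&unconfig);
        pthread_create(&t, NULL, worker, NULL);
        wait_for_completion(&unconfig); /* remove() blocks here */
        pthread_join(t, NULL);
        puts("all commands drained; safe to tear down");
        return 0;
}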

Signed-off-by: Michael Cyr <[email protected]>
Signed-off-by: Bryant G. Ly <[email protected]>
Tested-by: Steven Royer <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c | 39 +++++++++++++++++++++++++++----
drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h | 3 ++
2 files changed, 37 insertions(+), 5 deletions(-)

--- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
+++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
@@ -470,6 +470,18 @@ static void ibmvscsis_disconnect(struct

case WAIT_ENABLED:
switch (new_state) {
+ case UNCONFIGURING:
+ vscsi->state = new_state;
+ vscsi->flags |= RESPONSE_Q_DOWN;
+ vscsi->flags &= ~(SCHEDULE_DISCONNECT |
+ DISCONNECT_SCHEDULED);
+ dma_rmb();
+ if (vscsi->flags & CFG_SLEEPING) {
+ vscsi->flags &= ~CFG_SLEEPING;
+ complete(&vscsi->unconfig);
+ }
+ break;
+
/* should never happen */
case ERR_DISCONNECT:
case ERR_DISCONNECT_RECONNECT:
@@ -482,6 +494,13 @@ static void ibmvscsis_disconnect(struct

case WAIT_IDLE:
switch (new_state) {
+ case UNCONFIGURING:
+ vscsi->flags |= RESPONSE_Q_DOWN;
+ vscsi->state = new_state;
+ vscsi->flags &= ~(SCHEDULE_DISCONNECT |
+ DISCONNECT_SCHEDULED);
+ ibmvscsis_free_command_q(vscsi);
+ break;
case ERR_DISCONNECT:
case ERR_DISCONNECT_RECONNECT:
vscsi->state = new_state;
@@ -1187,6 +1206,15 @@ static void ibmvscsis_adapter_idle(struc
free_qs = true;

switch (vscsi->state) {
+ case UNCONFIGURING:
+ ibmvscsis_free_command_q(vscsi);
+ dma_rmb();
+ isync();
+ if (vscsi->flags & CFG_SLEEPING) {
+ vscsi->flags &= ~CFG_SLEEPING;
+ complete(&vscsi->unconfig);
+ }
+ break;
case ERR_DISCONNECT_RECONNECT:
ibmvscsis_reset_queue(vscsi);
pr_debug("adapter_idle, disc_rec: flags 0x%x\n", vscsi->flags);
@@ -3342,6 +3370,7 @@ static int ibmvscsis_probe(struct vio_de
(unsigned long)vscsi);

init_completion(&vscsi->wait_idle);
+ init_completion(&vscsi->unconfig);

snprintf(wq_name, 24, "ibmvscsis%s", dev_name(&vdev->dev));
vscsi->work_q = create_workqueue(wq_name);
@@ -3397,10 +3426,11 @@ static int ibmvscsis_remove(struct vio_d

pr_debug("remove (%s)\n", dev_name(&vscsi->dma_dev->dev));

- /*
- * TBD: Need to handle if there are commands on the waiting_rsp q
- * Actually, can there still be cmds outstanding to tcm?
- */
+ spin_lock_bh(&vscsi->intr_lock);
+ ibmvscsis_post_disconnect(vscsi, UNCONFIGURING, 0);
+ vscsi->flags |= CFG_SLEEPING;
+ spin_unlock_bh(&vscsi->intr_lock);
+ wait_for_completion(&vscsi->unconfig);

vio_disable_interrupts(vdev);
free_irq(vdev->irq, vscsi);
@@ -3409,7 +3439,6 @@ static int ibmvscsis_remove(struct vio_d
DMA_BIDIRECTIONAL);
kfree(vscsi->map_buf);
tasklet_kill(&vscsi->work_task);
- ibmvscsis_unregister_command_q(vscsi);
ibmvscsis_destroy_command_q(vscsi);
ibmvscsis_freetimer(vscsi);
ibmvscsis_free_cmds(vscsi);
--- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h
+++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.h
@@ -257,6 +257,8 @@ struct scsi_info {
#define SCHEDULE_DISCONNECT 0x00400
/* disconnect handler is scheduled */
#define DISCONNECT_SCHEDULED 0x00800
+ /* remove function is sleeping */
+#define CFG_SLEEPING 0x01000
u32 flags;
/* adapter lock */
spinlock_t intr_lock;
@@ -285,6 +287,7 @@ struct scsi_info {

struct workqueue_struct *work_q;
struct completion wait_idle;
+ struct completion unconfig;
struct device dev;
struct vio_dev *dma_dev;
struct srp_target target;


2017-03-20 18:34:50

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 46/93] scsi: ibmvscsis: Issues from Dan Carpenter/Smatch

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Michael Cyr <[email protected]>

[ Upstream commit 11950d70b52d2bc5e3580da8cd63909ef38d67db ]

Signed-off-by: Michael Cyr <[email protected]>
Signed-off-by: Bryant G. Ly <[email protected]>
Tested-by: Steven Royer <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c | 13 +++----------
1 file changed, 3 insertions(+), 10 deletions(-)

--- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
+++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c
@@ -1746,14 +1746,7 @@ static long ibmvscsis_mad(struct scsi_in

pr_debug("mad: type %d\n", be32_to_cpu(mad->type));

- if (be16_to_cpu(mad->length) < 0) {
- dev_err(&vscsi->dev, "mad: length is < 0\n");
- ibmvscsis_post_disconnect(vscsi,
- ERR_DISCONNECT_RECONNECT, 0);
- rc = SRP_VIOLATION;
- } else {
- rc = ibmvscsis_process_mad(vscsi, iue);
- }
+ rc = ibmvscsis_process_mad(vscsi, iue);

pr_debug("mad: status %hd, rc %ld\n", be16_to_cpu(mad->status),
rc);
@@ -2523,7 +2516,6 @@ static void ibmvscsis_parse_cmd(struct s
dev_err(&vscsi->dev, "0x%llx: parsing SRP descriptor table failed.\n",
srp->tag);
goto fail;
- return;
}

cmd->rsp.sol_not = srp->sol_not;
@@ -3379,7 +3371,8 @@ static int ibmvscsis_probe(struct vio_de
INIT_LIST_HEAD(&vscsi->waiting_rsp);
INIT_LIST_HEAD(&vscsi->active_q);

- snprintf(vscsi->tport.tport_name, 256, "%s", dev_name(&vdev->dev));
+ snprintf(vscsi->tport.tport_name, IBMVSCSIS_NAMELEN, "%s",
+ dev_name(&vdev->dev));

pr_debug("probe tport_name: %s\n", vscsi->tport.tport_name);



2017-03-20 18:35:21

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 43/93] xen: do not re-use pirq number cached in pci device msi msg data

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dan Streetman <[email protected]>

[ Upstream commit c74fd80f2f41d05f350bb478151021f88551afe8 ]

Revert the main part of commit:
af42b8d12f8a ("xen: fix MSI setup and teardown for PV on HVM guests")

That commit introduced reading the pci device's msi message data to see
if a pirq was previously configured for the device's msi/msix, and re-using
that pirq. At the time, that was the correct behavior. However, a
later change to Qemu caused it to call into the Xen hypervisor to unmap
all pirqs for a pci device when the pci device disables its MSI/MSIX
vectors; specifically the Qemu commit:
c976437c7dba9c7444fb41df45468968aaa326ad
("qemu-xen: free all the pirqs for msi/msix when driver unload")

Once Qemu added this pirq unmapping, it was no longer correct for the
kernel to re-use the pirq number cached in the pci device msi message
data. All Qemu releases since 2.1.0 contain the patch that unmaps the
pirqs when the pci device disables its MSI/MSIX vectors.

This bug is causing failures to initialize multiple NVMe controllers
under Xen, because the NVMe driver sets up a single MSIX vector for
each controller (concurrently), and then after using that to talk to
the controller for some configuration data, it disables the single MSIX
vector and re-configures all the MSIX vectors it needs. So the MSIX
setup code tries to re-use the cached pirq from the first vector
for each controller, but the hypervisor has already given away that
pirq to another controller, and its initialization fails.

This is discussed in more detail at:
https://lists.xen.org/archives/html/xen-devel/2017-01/msg00447.html

Fixes: af42b8d12f8a ("xen: fix MSI setup and teardown for PV on HVM guests")
Signed-off-by: Dan Streetman <[email protected]>
Reviewed-by: Stefano Stabellini <[email protected]>
Acked-by: Konrad Rzeszutek Wilk <[email protected]>
Signed-off-by: Boris Ostrovsky <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/pci/xen.c | 23 +++++++----------------
1 file changed, 7 insertions(+), 16 deletions(-)

--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -234,23 +234,14 @@ static int xen_hvm_setup_msi_irqs(struct
return 1;

for_each_pci_msi_entry(msidesc, dev) {
- __pci_read_msi_msg(msidesc, &msg);
- pirq = MSI_ADDR_EXT_DEST_ID(msg.address_hi) |
- ((msg.address_lo >> MSI_ADDR_DEST_ID_SHIFT) & 0xff);
- if (msg.data != XEN_PIRQ_MSI_DATA ||
- xen_irq_from_pirq(pirq) < 0) {
- pirq = xen_allocate_pirq_msi(dev, msidesc);
- if (pirq < 0) {
- irq = -ENODEV;
- goto error;
- }
- xen_msi_compose_msg(dev, pirq, &msg);
- __pci_write_msi_msg(msidesc, &msg);
- dev_dbg(&dev->dev, "xen: msi bound to pirq=%d\n", pirq);
- } else {
- dev_dbg(&dev->dev,
- "xen: msi already bound to pirq=%d\n", pirq);
+ pirq = xen_allocate_pirq_msi(dev, msidesc);
+ if (pirq < 0) {
+ irq = -ENODEV;
+ goto error;
}
+ xen_msi_compose_msg(dev, pirq, &msg);
+ __pci_write_msi_msg(msidesc, &msg);
+ dev_dbg(&dev->dev, "xen: msi bound to pirq=%d\n", pirq);
irq = xen_bind_pirq_msi_to_irq(dev, msidesc, pirq,
(type == PCI_CAP_ID_MSI) ? nvec : 1,
(type == PCI_CAP_ID_MSIX) ?


2017-03-20 18:35:27

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 25/93] dccp: fix use-after-free in dccp_feat_activate_values

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <[email protected]>


[ Upstream commit 62f8f4d9066c1c6f2474845d1ca7e2891f2ae3fd ]

Dmitry reported crashes in the DCCP stack [1]

The problem here is that when I got rid of the listener spinlock, I missed
the fact that DCCP stores complex state in struct dccp_request_sock,
while TCP does not.

Since multiple CPUs could access it at the same time, we need to add
protection.

[1]
BUG: KASAN: use-after-free in dccp_feat_activate_values+0x967/0xab0
net/dccp/feat.c:1541 at addr ffff88003713be68
Read of size 8 by task syz-executor2/8457
CPU: 2 PID: 8457 Comm: syz-executor2 Not tainted 4.10.0-rc7+ #127
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:15 [inline]
dump_stack+0x292/0x398 lib/dump_stack.c:51
kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
print_address_description mm/kasan/report.c:200 [inline]
kasan_report_error mm/kasan/report.c:289 [inline]
kasan_report.part.1+0x20e/0x4e0 mm/kasan/report.c:311
kasan_report mm/kasan/report.c:332 [inline]
__asan_report_load8_noabort+0x29/0x30 mm/kasan/report.c:332
dccp_feat_activate_values+0x967/0xab0 net/dccp/feat.c:1541
dccp_create_openreq_child+0x464/0x610 net/dccp/minisocks.c:121
dccp_v6_request_recv_sock+0x1f6/0x1960 net/dccp/ipv6.c:457
dccp_check_req+0x335/0x5a0 net/dccp/minisocks.c:186
dccp_v6_rcv+0x69e/0x1d00 net/dccp/ipv6.c:711
ip6_input_finish+0x46d/0x17a0 net/ipv6/ip6_input.c:279
NF_HOOK include/linux/netfilter.h:257 [inline]
ip6_input+0xdb/0x590 net/ipv6/ip6_input.c:322
dst_input include/net/dst.h:507 [inline]
ip6_rcv_finish+0x289/0x890 net/ipv6/ip6_input.c:69
NF_HOOK include/linux/netfilter.h:257 [inline]
ipv6_rcv+0x12ec/0x23d0 net/ipv6/ip6_input.c:203
__netif_receive_skb_core+0x1ae5/0x3400 net/core/dev.c:4190
__netif_receive_skb+0x2a/0x170 net/core/dev.c:4228
process_backlog+0xe5/0x6c0 net/core/dev.c:4839
napi_poll net/core/dev.c:5202 [inline]
net_rx_action+0xe70/0x1900 net/core/dev.c:5267
__do_softirq+0x2fb/0xb7d kernel/softirq.c:284
do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:902
</IRQ>
do_softirq.part.17+0x1e8/0x230 kernel/softirq.c:328
do_softirq kernel/softirq.c:176 [inline]
__local_bh_enable_ip+0x1f2/0x200 kernel/softirq.c:181
local_bh_enable include/linux/bottom_half.h:31 [inline]
rcu_read_unlock_bh include/linux/rcupdate.h:971 [inline]
ip6_finish_output2+0xbb0/0x23d0 net/ipv6/ip6_output.c:123
ip6_finish_output+0x302/0x960 net/ipv6/ip6_output.c:148
NF_HOOK_COND include/linux/netfilter.h:246 [inline]
ip6_output+0x1cb/0x8d0 net/ipv6/ip6_output.c:162
ip6_xmit+0xcdf/0x20d0 include/net/dst.h:501
inet6_csk_xmit+0x320/0x5f0 net/ipv6/inet6_connection_sock.c:179
dccp_transmit_skb+0xb09/0x1120 net/dccp/output.c:141
dccp_xmit_packet+0x215/0x760 net/dccp/output.c:280
dccp_write_xmit+0x168/0x1d0 net/dccp/output.c:362
dccp_sendmsg+0x79c/0xb10 net/dccp/proto.c:796
inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
sock_sendmsg_nosec net/socket.c:635 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:645
SYSC_sendto+0x660/0x810 net/socket.c:1687
SyS_sendto+0x40/0x50 net/socket.c:1655
entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x4458b9
RSP: 002b:00007f8ceb77bb58 EFLAGS: 00000282 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000000000017 RCX: 00000000004458b9
RDX: 0000000000000023 RSI: 0000000020e60000 RDI: 0000000000000017
RBP: 00000000006e1b90 R08: 00000000200f9fe1 R09: 0000000000000020
R10: 0000000000008010 R11: 0000000000000282 R12: 00000000007080a8
R13: 0000000000000000 R14: 00007f8ceb77c9c0 R15: 00007f8ceb77c700
Object at ffff88003713be50, in cache kmalloc-64 size: 64
Allocated:
PID = 8446
save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
save_stack+0x43/0xd0 mm/kasan/kasan.c:502
set_track mm/kasan/kasan.c:514 [inline]
kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
kmem_cache_alloc_trace+0x82/0x270 mm/slub.c:2738
kmalloc include/linux/slab.h:490 [inline]
dccp_feat_entry_new+0x214/0x410 net/dccp/feat.c:467
dccp_feat_push_change+0x38/0x220 net/dccp/feat.c:487
__feat_register_sp+0x223/0x2f0 net/dccp/feat.c:741
dccp_feat_propagate_ccid+0x22b/0x2b0 net/dccp/feat.c:949
dccp_feat_server_ccid_dependencies+0x1b3/0x250 net/dccp/feat.c:1012
dccp_make_response+0x1f1/0xc90 net/dccp/output.c:423
dccp_v6_send_response+0x4ec/0xc20 net/dccp/ipv6.c:217
dccp_v6_conn_request+0xaba/0x11b0 net/dccp/ipv6.c:377
dccp_rcv_state_process+0x51e/0x1650 net/dccp/input.c:606
dccp_v6_do_rcv+0x213/0x350 net/dccp/ipv6.c:632
sk_backlog_rcv include/net/sock.h:893 [inline]
__sk_receive_skb+0x36f/0xcc0 net/core/sock.c:479
dccp_v6_rcv+0xba5/0x1d00 net/dccp/ipv6.c:742
ip6_input_finish+0x46d/0x17a0 net/ipv6/ip6_input.c:279
NF_HOOK include/linux/netfilter.h:257 [inline]
ip6_input+0xdb/0x590 net/ipv6/ip6_input.c:322
dst_input include/net/dst.h:507 [inline]
ip6_rcv_finish+0x289/0x890 net/ipv6/ip6_input.c:69
NF_HOOK include/linux/netfilter.h:257 [inline]
ipv6_rcv+0x12ec/0x23d0 net/ipv6/ip6_input.c:203
__netif_receive_skb_core+0x1ae5/0x3400 net/core/dev.c:4190
__netif_receive_skb+0x2a/0x170 net/core/dev.c:4228
process_backlog+0xe5/0x6c0 net/core/dev.c:4839
napi_poll net/core/dev.c:5202 [inline]
net_rx_action+0xe70/0x1900 net/core/dev.c:5267
__do_softirq+0x2fb/0xb7d kernel/softirq.c:284
Freed:
PID = 15
save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
save_stack+0x43/0xd0 mm/kasan/kasan.c:502
set_track mm/kasan/kasan.c:514 [inline]
kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
slab_free_hook mm/slub.c:1355 [inline]
slab_free_freelist_hook mm/slub.c:1377 [inline]
slab_free mm/slub.c:2954 [inline]
kfree+0xe8/0x2b0 mm/slub.c:3874
dccp_feat_entry_destructor.part.4+0x48/0x60 net/dccp/feat.c:418
dccp_feat_entry_destructor net/dccp/feat.c:416 [inline]
dccp_feat_list_pop net/dccp/feat.c:541 [inline]
dccp_feat_activate_values+0x57f/0xab0 net/dccp/feat.c:1543
dccp_create_openreq_child+0x464/0x610 net/dccp/minisocks.c:121
dccp_v6_request_recv_sock+0x1f6/0x1960 net/dccp/ipv6.c:457
dccp_check_req+0x335/0x5a0 net/dccp/minisocks.c:186
dccp_v6_rcv+0x69e/0x1d00 net/dccp/ipv6.c:711
ip6_input_finish+0x46d/0x17a0 net/ipv6/ip6_input.c:279
NF_HOOK include/linux/netfilter.h:257 [inline]
ip6_input+0xdb/0x590 net/ipv6/ip6_input.c:322
dst_input include/net/dst.h:507 [inline]
ip6_rcv_finish+0x289/0x890 net/ipv6/ip6_input.c:69
NF_HOOK include/linux/netfilter.h:257 [inline]
ipv6_rcv+0x12ec/0x23d0 net/ipv6/ip6_input.c:203
__netif_receive_skb_core+0x1ae5/0x3400 net/core/dev.c:4190
__netif_receive_skb+0x2a/0x170 net/core/dev.c:4228
process_backlog+0xe5/0x6c0 net/core/dev.c:4839
napi_poll net/core/dev.c:5202 [inline]
net_rx_action+0xe70/0x1900 net/core/dev.c:5267
__do_softirq+0x2fb/0xb7d kernel/softirq.c:284
Memory state around the buggy address:
ffff88003713bd00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88003713bd80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88003713be00: fc fc fc fc fc fc fc fc fc fc fb fb fb fb fb fb
^
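
As an illustration (not the patch itself): the fix gives each request its
own spinlock, since the listener no longer serializes access. A minimal
userspace sketch of the per-request lock pattern, with illustrative names:

#include <pthread.h>

/* Per-request state that the listener lock used to protect. */
struct request {
        pthread_spinlock_t lock;        /* plays the role of dreq_lock */
        int feat_state;                 /* "complex state" seen by several CPUs */
};

static void request_init(struct request *req)
{
        pthread_spin_init(&req->lock, PTHREAD_PROCESS_PRIVATE);
        req->feat_state = 0;
}

/* Every path that touches the per-request state now takes the
 * per-request lock, mirroring the spin_lock_bh() in dccp_check_req(). */
static void request_process(struct request *req)
{
        pthread_spin_lock(&req->lock);
        req->feat_state++;
        pthread_spin_unlock(&req->lock);
}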

Fixes: 079096f103fa ("tcp/dccp: install syn_recv requests into ehash table")
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: Dmitry Vyukov <[email protected]>
Tested-by: Dmitry Vyukov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/dccp.h | 1 +
net/dccp/minisocks.c | 24 ++++++++++++++++--------
2 files changed, 17 insertions(+), 8 deletions(-)

--- a/include/linux/dccp.h
+++ b/include/linux/dccp.h
@@ -163,6 +163,7 @@ struct dccp_request_sock {
__u64 dreq_isr;
__u64 dreq_gsr;
__be32 dreq_service;
+ spinlock_t dreq_lock;
struct list_head dreq_featneg;
__u32 dreq_timestamp_echo;
__u32 dreq_timestamp_time;
--- a/net/dccp/minisocks.c
+++ b/net/dccp/minisocks.c
@@ -146,6 +146,13 @@ struct sock *dccp_check_req(struct sock
struct dccp_request_sock *dreq = dccp_rsk(req);
bool own_req;

+ /* TCP/DCCP listeners became lockless.
+ * DCCP stores complex state in its request_sock, so we need
+ * a protection for them, now this code runs without being protected
+ * by the parent (listener) lock.
+ */
+ spin_lock_bh(&dreq->dreq_lock);
+
/* Check for retransmitted REQUEST */
if (dccp_hdr(skb)->dccph_type == DCCP_PKT_REQUEST) {

@@ -160,7 +167,7 @@ struct sock *dccp_check_req(struct sock
inet_rtx_syn_ack(sk, req);
}
/* Network Duplicate, discard packet */
- return NULL;
+ goto out;
}

DCCP_SKB_CB(skb)->dccpd_reset_code = DCCP_RESET_CODE_PACKET_ERROR;
@@ -186,20 +193,20 @@ struct sock *dccp_check_req(struct sock

child = inet_csk(sk)->icsk_af_ops->syn_recv_sock(sk, skb, req, NULL,
req, &own_req);
- if (!child)
- goto listen_overflow;
-
- return inet_csk_complete_hashdance(sk, child, req, own_req);
+ if (child) {
+ child = inet_csk_complete_hashdance(sk, child, req, own_req);
+ goto out;
+ }

-listen_overflow:
- dccp_pr_debug("listen_overflow!\n");
DCCP_SKB_CB(skb)->dccpd_reset_code = DCCP_RESET_CODE_TOO_BUSY;
drop:
if (dccp_hdr(skb)->dccph_type != DCCP_PKT_RESET)
req->rsk_ops->send_reset(sk, skb);

inet_csk_reqsk_queue_drop(sk, req);
- return NULL;
+out:
+ spin_unlock_bh(&dreq->dreq_lock);
+ return child;
}

EXPORT_SYMBOL_GPL(dccp_check_req);
@@ -250,6 +257,7 @@ int dccp_reqsk_init(struct request_sock
{
struct dccp_request_sock *dreq = dccp_rsk(req);

+ spin_lock_init(&dreq->dreq_lock);
inet_rsk(req)->ir_rmt_port = dccp_hdr(skb)->dccph_sport;
inet_rsk(req)->ir_num = ntohs(dccp_hdr(skb)->dccph_dport);
inet_rsk(req)->acked = 0;


2017-03-20 18:35:25

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 37/93] dccp: fix memory leak during tear-down of unsuccessful connection request

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Hannes Frederic Sowa <[email protected]>


[ Upstream commit 72ef9c4125c7b257e3a714d62d778ab46583d6a3 ]

This patch fixes a memory leak that happens if the connection request
is not fulfilled between parsing the DCCP options and handling the SYN
(e.g. because the backlog is full): we forgot to free the
list of ack vectors.

Reported-by: Jianwen Ji <[email protected]>
Signed-off-by: Hannes Frederic Sowa <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/dccp/ccids/ccid2.c | 1 +
1 file changed, 1 insertion(+)

--- a/net/dccp/ccids/ccid2.c
+++ b/net/dccp/ccids/ccid2.c
@@ -749,6 +749,7 @@ static void ccid2_hc_tx_exit(struct sock
for (i = 0; i < hc->tx_seqbufc; i++)
kfree(hc->tx_seqbuf[i]);
hc->tx_seqbufc = 0;
+ dccp_ackvec_parsed_cleanup(&hc->tx_av_chunks);
}

static void ccid2_hc_rx_packet_recv(struct sock *sk, struct sk_buff *skb)


2017-03-20 18:35:35

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 45/93] igb: add i211 to i210 PHY workaround

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Todd Fujinaka <[email protected]>

[ Upstream commit 5bc8c230e2a993b49244f9457499f17283da9ec7 ]

i210 and i211 share the same PHY but have different PCI IDs. Don't
forget i211 for any i210 workarounds.

Signed-off-by: Todd Fujinaka <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/intel/igb/e1000_phy.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/ethernet/intel/igb/e1000_phy.c
+++ b/drivers/net/ethernet/intel/igb/e1000_phy.c
@@ -78,7 +78,7 @@ s32 igb_get_phy_id(struct e1000_hw *hw)
u16 phy_id;

/* ensure PHY page selection to fix misconfigured i210 */
- if (hw->mac.type == e1000_i210)
+ if ((hw->mac.type == e1000_i210) || (hw->mac.type == e1000_i211))
phy->ops.write_reg(hw, I347AT4_PAGE_SELECT, 0);

ret_val = phy->ops.read_reg(hw, PHY_ID1, &phy_id);


2017-03-20 18:35:44

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 44/93] igb: Workaround for igb i210 firmware issue

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Chris J Arges <[email protected]>

[ Upstream commit 4e684f59d760a2c7c716bb60190783546e2d08a1 ]

Sometimes firmware may not properly initialize I347AT4_PAGE_SELECT, causing
the probe of an igb i210 NIC to fail. This patch adds an additional zeroing
of this register during igb_get_phy_id to work around this issue.

Thanks to Jochen Henneberg for the idea and the original patch.

Signed-off-by: Chris J Arges <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/intel/igb/e1000_phy.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/drivers/net/ethernet/intel/igb/e1000_phy.c
+++ b/drivers/net/ethernet/intel/igb/e1000_phy.c
@@ -77,6 +77,10 @@ s32 igb_get_phy_id(struct e1000_hw *hw)
s32 ret_val = 0;
u16 phy_id;

+ /* ensure PHY page selection to fix misconfigured i210 */
+ if (hw->mac.type == e1000_i210)
+ phy->ops.write_reg(hw, I347AT4_PAGE_SELECT, 0);
+
ret_val = phy->ops.read_reg(hw, PHY_ID1, &phy_id);
if (ret_val)
goto out;


2017-03-20 18:35:55

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 33/93] ipv6: avoid write to a possibly cloned skb

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Florian Westphal <[email protected]>


[ Upstream commit 79e49503efe53a8c51d8b695bedc8a346c5e4a87 ]

ip6_fragment, in case the skb has a fraglist, checks if the
skb is cloned. If it is, it will move to the 'slow path' and allocate
new skbs for each fragment.

However, right before entering the slowpath loop, it updates the
nexthdr value of the last ipv6 extension header to NEXTHDR_FRAGMENT,
to account for the fragment header that will be inserted in the new
ipv6-fragment skbs.

In case original skb is cloned this munges nexthdr value of another
skb. Avoid this by doing the nexthdr update for each of the new fragment
skbs separately.

This was observed with tcpdump on a bridge device where netfilter ipv6
reassembly is active: tcpdump shows malformed fragment headers as
the l4 header (icmpv6, tcp, etc.) is decoded as a fragment header.
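
As an illustration (toy model, not the patch): the fix computes the offset
of the nexthdr byte relative to the original skb's network header, and then
performs the write through each fragment's own private copy instead:

#include <stdio.h>
#include <string.h>

int main(void)
{
        unsigned char orig[16] = { 0 };         /* shared (cloned) header data */
        unsigned char *prevhdr = &orig[6];      /* nexthdr byte inside it */
        unsigned char frag[16];

        memcpy(frag, orig, sizeof(frag));       /* private per-fragment copy */

        unsigned char *fixup = frag + (prevhdr - orig);
        *fixup = 44;                            /* NEXTHDR_FRAGMENT */

        /* the shared original stays untouched; only the copy is patched */
        printf("orig[6]=%u frag[6]=%u\n", orig[6], frag[6]);
        return 0;
}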

Cc: Hannes Frederic Sowa <[email protected]>
Reported-by: Andreas Karis <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv6/ip6_output.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -757,13 +757,14 @@ slow_path:
* Fragment the datagram.
*/

- *prevhdr = NEXTHDR_FRAGMENT;
troom = rt->dst.dev->needed_tailroom;

/*
* Keep copying data until we run out.
*/
while (left > 0) {
+ u8 *fragnexthdr_offset;
+
len = left;
/* IF: it doesn't fit, use 'mtu' - the data space left */
if (len > mtu)
@@ -808,6 +809,10 @@ slow_path:
*/
skb_copy_from_linear_data(skb, skb_network_header(frag), hlen);

+ fragnexthdr_offset = skb_network_header(frag);
+ fragnexthdr_offset += prevhdr - skb_network_header(skb);
+ *fragnexthdr_offset = NEXTHDR_FRAGMENT;
+
/*
* Build fragment header.
*/


2017-03-20 18:35:47

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 41/93] bpf: fix mark_reg_unknown_value for spilled regs on map value marking

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <[email protected]>


[ Upstream commit 6760bf2ddde8ad64f8205a651223a93de3a35494 ]

Martin reported a verifier issue that hit the BUG_ON() for his
test case in the mark_reg_unknown_value() function:

[ 202.861380] kernel BUG at kernel/bpf/verifier.c:467!
[...]
[ 203.291109] Call Trace:
[ 203.296501] [<ffffffff811364d5>] mark_map_reg+0x45/0x50
[ 203.308225] [<ffffffff81136558>] mark_map_regs+0x78/0x90
[ 203.320140] [<ffffffff8113938d>] do_check+0x226d/0x2c90
[ 203.331865] [<ffffffff8113a6ab>] bpf_check+0x48b/0x780
[ 203.343403] [<ffffffff81134c8e>] bpf_prog_load+0x27e/0x440
[ 203.355705] [<ffffffff8118a38f>] ? handle_mm_fault+0x11af/0x1230
[ 203.369158] [<ffffffff812d8188>] ? security_capable+0x48/0x60
[ 203.382035] [<ffffffff811351a4>] SyS_bpf+0x124/0x960
[ 203.393185] [<ffffffff810515f6>] ? __do_page_fault+0x276/0x490
[ 203.406258] [<ffffffff816db320>] entry_SYSCALL_64_fastpath+0x13/0x94

This issue got uncovered after the fix in a08dd0da5307 ("bpf: fix
regression on verifier pruning wrt map lookups"). The reason why it
wasn't noticed before is that, as mentioned in a08dd0da5307,
mark_map_regs() was doing the id matching incorrectly based on the
uncached regs[regno].id. So, in the first loop, we walked all regs,
and as soon as we found regno == i, this reg's id was cleared by the
call to mark_reg_unknown_value(), so that every subsequent register
was probed against an id of 0 (which, in combination with the
PTR_TO_MAP_VALUE_OR_NULL type, is an invalid condition that no other
register state can hold), and therefore wasn't type transitioned such
as in the spilled register case for the second loop.

Now since that got fixed, it turned out that 57a09bf0a416 ("bpf:
Detect identical PTR_TO_MAP_VALUE_OR_NULL registers") used
mark_reg_unknown_value() incorrectly for the spilled regs, and thus
hitting the BUG_ON() in some cases due to regno >= MAX_BPF_REG.

Although spilled regs have the same type as the non-spilled regs
for the verifier state, that is, struct bpf_reg_state, they are
semantically different from the non-spilled regs. In other words,
there can be up to 64 (MAX_BPF_STACK / BPF_REG_SIZE) spilled regs
in the stack, for example, register R<x> could have been spilled by
the program to stack location X, Y, Z, and in mark_map_regs() we
need to scan these stack slots of type STACK_SPILL for potential
registers that we have to transition from PTR_TO_MAP_VALUE_OR_NULL.
Therefore, depending on the location, the spilled_regs regno can
be a lot higher than just MAX_BPF_REG's value since we operate on
stack instead. The reset in mark_reg_unknown_value() itself is
just fine, only that the BUG_ON() was inappropriate for this. Fix
it by making a __mark_reg_unknown_value() version that can be
called from mark_map_reg() generically; we know for the non-spilled
case that the regno is always < MAX_BPF_REG anyway.
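
As an illustration of the resulting structure (simplified types, assert in
place of BUG_ON): a raw helper that skips the bounds check for stack slots,
plus a checked wrapper for the register file proper:

#include <assert.h>

#define MAX_BPF_REG 11

struct reg_state { int type; unsigned id; long imm; };

/* Raw helper: no bounds check, usable for spilled-register slots whose
 * index is derived from a stack offset and may exceed MAX_BPF_REG. */
static void __mark_unknown(struct reg_state *regs, unsigned regno)
{
        regs[regno].type = 0;   /* UNKNOWN_VALUE */
        regs[regno].id = 0;
        regs[regno].imm = 0;
}

/* Checked wrapper: only for the real register file, where the index
 * must stay below MAX_BPF_REG. */
static void mark_unknown(struct reg_state *regs, unsigned regno)
{
        assert(regno < MAX_BPF_REG);
        __mark_unknown(regs, regno);
}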

Fixes: 57a09bf0a416 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
Reported-by: Martin KaFai Lau <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/bpf/verifier.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)

--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -444,14 +444,19 @@ static void init_reg_state(struct bpf_re
regs[BPF_REG_1].type = PTR_TO_CTX;
}

-static void mark_reg_unknown_value(struct bpf_reg_state *regs, u32 regno)
+static void __mark_reg_unknown_value(struct bpf_reg_state *regs, u32 regno)
{
- BUG_ON(regno >= MAX_BPF_REG);
regs[regno].type = UNKNOWN_VALUE;
regs[regno].id = 0;
regs[regno].imm = 0;
}

+static void mark_reg_unknown_value(struct bpf_reg_state *regs, u32 regno)
+{
+ BUG_ON(regno >= MAX_BPF_REG);
+ __mark_reg_unknown_value(regs, regno);
+}
+
static void reset_reg_range_values(struct bpf_reg_state *regs, u32 regno)
{
regs[regno].min_value = BPF_REGISTER_MIN_RANGE;
@@ -1946,7 +1951,7 @@ static void mark_map_reg(struct bpf_reg_
*/
reg->id = 0;
if (type == UNKNOWN_VALUE)
- mark_reg_unknown_value(regs, regno);
+ __mark_reg_unknown_value(regs, regno);
}
}



2017-03-20 18:36:00

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 24/93] net/sched: act_skbmod: remove unneeded rcu_read_unlock in tcf_skbmod_dump

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Alexey Khoroshilov <[email protected]>


[ Upstream commit 6c4dc75c251721f517e9daeb5370ea606b5b35ce ]

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/sched/act_skbmod.c | 1 -
1 file changed, 1 deletion(-)

--- a/net/sched/act_skbmod.c
+++ b/net/sched/act_skbmod.c
@@ -228,7 +228,6 @@ static int tcf_skbmod_dump(struct sk_buf

return skb->len;
nla_put_failure:
- rcu_read_unlock();
nlmsg_trim(skb, b);
return -1;
}


2017-03-20 18:35:51

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 39/93] bpf: fix state equivalence

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Alexei Starovoitov <[email protected]>


[ Upstream commit d2a4dd37f6b41fbcad76efbf63124eb3126c66fe ]

Commits 57a09bf0a416 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
and 484611357c19 ("bpf: allow access into map value arrays") by themselves
are correct, but in combination they make state equivalence ignore the 'id' field
of the register state, which can lead to accepting an invalid program.
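
For reference, a compilable sketch of the comparison trick the fix relies
on (layout simplified relative to the real struct bpf_reg_state): with 'id'
placed before min_value/max_value, a single memcmp() over the prefix ending
at 'id' compares everything except the range bounds:

#include <stddef.h>
#include <string.h>

/* Same definition the kernel uses: one past the end of a member. */
#define offsetofend(TYPE, MEMBER) \
        (offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

struct reg_state {
        int type;
        long imm;
        unsigned id;            /* must sit inside the compared prefix */
        long min_value;         /* deliberately outside the compare */
        unsigned long max_value;
};

static int regs_equal(const struct reg_state *a, const struct reg_state *b)
{
        return memcmp(a, b, offsetofend(struct reg_state, id)) == 0;
}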

Fixes: 57a09bf0a416 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
Fixes: 484611357c19 ("bpf: allow access into map value arrays")
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Acked-by: Thomas Graf <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/bpf_verifier.h | 14 +++++++-------
kernel/bpf/verifier.c | 2 +-
2 files changed, 8 insertions(+), 8 deletions(-)

--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -18,13 +18,6 @@

struct bpf_reg_state {
enum bpf_reg_type type;
- /*
- * Used to determine if any memory access using this register will
- * result in a bad access.
- */
- s64 min_value;
- u64 max_value;
- u32 id;
union {
/* valid when type == CONST_IMM | PTR_TO_STACK | UNKNOWN_VALUE */
s64 imm;
@@ -40,6 +33,13 @@ struct bpf_reg_state {
*/
struct bpf_map *map_ptr;
};
+ u32 id;
+ /* Used to determine if any memory access using this register will
+ * result in a bad access. These two fields must be last.
+ * See states_equal()
+ */
+ s64 min_value;
+ u64 max_value;
};

enum bpf_stack_slot_type {
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2498,7 +2498,7 @@ static bool states_equal(struct bpf_veri
* we didn't do a variable access into a map then we are a-ok.
*/
if (!varlen_map_access &&
- rold->type == rcur->type && rold->imm == rcur->imm)
+ memcmp(rold, rcur, offsetofend(struct bpf_reg_state, id)) == 0)
continue;

/* If we didn't map access then again we don't care about the


2017-03-20 18:35:33

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 42/93] dmaengine: ioat: ioat_alloc_chan_resources should not perform sleeping allocations.

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Krister Johansen <[email protected]>

commit 21d25f6a4217e755906cb548b55ddab39d0e88b9 upstream.

On a kernel with DEBUG_LOCKS, ioat_free_chan_resources triggers an
in_interrupt() warning. With PROVE_LOCKING, it reports detecting a
SOFTIRQ-safe to SOFTIRQ-unsafe lock ordering in the same code path.

This is because dma_generic_alloc_coherent() checks if the GFP flags
permit blocking. It allocates from different subsystems if blocking is
permitted. The free path knows how to return the memory to the correct
allocator. If GFP_KERNEL is specified then the alloc and free end up
going through cma_alloc(), which uses mutexes.

Given that ioat_free_chan_resources() can be called in interrupt
context, ioat_alloc_chan_resources() must specify GFP_NOWAIT so that the
allocations do not block and instead use an allocator that uses
spinlocks.
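
A kernel-style sketch of the rule (illustrative, not the driver code): an
allocation whose matching free path can run in atomic context has to use a
non-sleeping GFP mask and handle failure, e.g.:

#include <linux/dmapool.h>
#include <linux/gfp.h>

static void *alloc_completion(struct dma_pool *pool, dma_addr_t *dma)
{
        /*
         * GFP_KERNEL may sleep (and can route the allocation through
         * CMA, which takes mutexes); GFP_NOWAIT fails fast instead,
         * which is the only safe choice when the memory may be
         * returned from interrupt context.
         */
        return dma_pool_zalloc(pool, GFP_NOWAIT, dma);
}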

Signed-off-by: Krister Johansen <[email protected]>
Acked-by: Dave Jiang <[email protected]>
Signed-off-by: Vinod Koul <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/dma/ioat/init.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/dma/ioat/init.c
+++ b/drivers/dma/ioat/init.c
@@ -691,7 +691,7 @@ static int ioat_alloc_chan_resources(str
/* doing 2 32bit writes to mmio since 1 64b write doesn't work */
ioat_chan->completion =
dma_pool_zalloc(ioat_chan->ioat_dma->completion_pool,
- GFP_KERNEL, &ioat_chan->completion_dma);
+ GFP_NOWAIT, &ioat_chan->completion_dma);
if (!ioat_chan->completion)
return -ENOMEM;

@@ -701,7 +701,7 @@ static int ioat_alloc_chan_resources(str
ioat_chan->reg_base + IOAT_CHANCMP_OFFSET_HIGH);

order = IOAT_MAX_ORDER;
- ring = ioat_alloc_ring(c, order, GFP_KERNEL);
+ ring = ioat_alloc_ring(c, order, GFP_NOWAIT);
if (!ring)
return -ENOMEM;



2017-03-20 18:35:31

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 40/93] bpf: fix regression on verifier pruning wrt map lookups

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <[email protected]>


[ Upstream commit a08dd0da5307ba01295c8383923e51e7997c3576 ]

Commit 57a09bf0a416 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL
registers") introduced a regression where existing programs stopped
loading due to reaching the verifier's maximum complexity limit,
whereas prior to this commit they were loading just fine; the affected
program has roughly 2k instructions.

What was found is that state pruning couldn't be performed effectively
anymore due to mismatches of the verifier's register state, in particular
in the id tracking. It doesn't mean that 57a09bf0a416 is incorrect per
se, but rather that the verifier needs to perform a lot more work for the
same program with regard to the involved map lookups.

Since commit 57a09bf0a416 is only about tracking registers with type
PTR_TO_MAP_VALUE_OR_NULL, the id is only needed to follow registers
until they are promoted through pattern matching with a NULL check to
either PTR_TO_MAP_VALUE or UNKNOWN_VALUE type. After that point, the
id becomes irrelevant for the transitioned types.

For UNKNOWN_VALUE, id is already reset to 0 via mark_reg_unknown_value(),
but not so for PTR_TO_MAP_VALUE, where the id becomes stale. It's even
transferred further into other types that don't make use of it. Among
others, one example is where UNKNOWN_VALUE is set on function call
return with RET_INTEGER return type.

states_equal() will then fall through the memcmp() on register state;
note that the second memcmp() uses offsetofend(), so the id is part of
that since d2a4dd37f6b4 ("bpf: fix state equivalence"). But the bisect
pointed already to 57a09bf0a416, where we really reach beyond complexity
limit. What I found was that states_equal() often failed in this
case due to id mismatches in spilled regs with registers of type
PTR_TO_MAP_VALUE. Unlike non-spilled regs, spilled regs just perform
a memcmp() on their reg state and don't have any other optimizations
in place, therefore also id was relevant in this case for making a
pruning decision.

We can safely reset id to 0 as well when converting to PTR_TO_MAP_VALUE.
For the affected program, it resulted in a ~17 fold reduction of
complexity and let the program load fine again. Selftest suite also
runs fine. The only other place where env->id_gen is used currently is
through direct packet access, but for these cases id is long living, thus
a different scenario.

Also, the current logic in mark_map_regs() is not fully correct when
marking the NULL branch with UNKNOWN_VALUE. We need to cache the destination
reg's id in any case. Otherwise, once we marked that reg as UNKNOWN_VALUE,
its id is reset and any subsequent registers that hold the original id
and are of type PTR_TO_MAP_VALUE_OR_NULL won't be marked UNKNOWN_VALUE
anymore, since mark_map_reg() reuses the uncached regs[regno].id that
was just overridden. Note that we don't need to cache it outside of
mark_map_regs(), since it's called once on this_branch and the other
time on other_branch, which are two independent verifier states.
A test case for this is added here, too.
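
The caching bug in miniature (illustrative sketch, simplified types): the
loop both compares against the id and, via the marking, may clear it, so
the id must be read into a local before the loop starts:

struct reg_state { unsigned id; };

static void mark_all(struct reg_state *regs, int nregs, int regno)
{
        unsigned id = regs[regno].id;   /* cache: iteration i == regno clears it */
        int i;

        for (i = 0; i < nregs; i++) {
                if (regs[i].id == id)   /* compare against the cached copy */
                        regs[i].id = 0; /* marking resets the reg's id */
        }
}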

Fixes: 57a09bf0a416 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Thomas Graf <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/bpf/verifier.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)

--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1940,6 +1940,11 @@ static void mark_map_reg(struct bpf_reg_

if (reg->type == PTR_TO_MAP_VALUE_OR_NULL && reg->id == id) {
reg->type = type;
+ /* We don't need id from this point onwards anymore, thus we
+ * should better reset it, so that state pruning has chances
+ * to take effect.
+ */
+ reg->id = 0;
if (type == UNKNOWN_VALUE)
mark_reg_unknown_value(regs, regno);
}
@@ -1952,16 +1957,16 @@ static void mark_map_regs(struct bpf_ver
enum bpf_reg_type type)
{
struct bpf_reg_state *regs = state->regs;
+ u32 id = regs[regno].id;
int i;

for (i = 0; i < MAX_BPF_REG; i++)
- mark_map_reg(regs, i, regs[regno].id, type);
+ mark_map_reg(regs, i, id, type);

for (i = 0; i < MAX_BPF_STACK; i += BPF_REG_SIZE) {
if (state->stack_slot_type[i] != STACK_SPILL)
continue;
- mark_map_reg(state->spilled_regs, i / BPF_REG_SIZE,
- regs[regno].id, type);
+ mark_map_reg(state->spilled_regs, i / BPF_REG_SIZE, id, type);
}
}



2017-03-21 00:21:30

by Shuah Khan

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/93] 4.9.17-stable review

On 03/20/2017 11:50 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.17 release.
> There are 93 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed Mar 22 17:47:16 UTC 2017.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.17-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah

2017-03-21 02:13:55

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/93] 4.9.17-stable review

On 03/20/2017 10:50 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.17 release.
> There are 93 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed Mar 22 17:47:16 UTC 2017.
> Anything received after that time might be too late.
>

Build results:
total: 149 pass: 149 fail: 0
Qemu test results:
total: 122 pass: 122 fail: 0

Details are available at http://kerneltests.org/builders.

Guenter