2015-07-08 08:04:54

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 00/55] 4.0.8-stable review

This is the start of the stable review cycle for the 4.0.8 release.
There are 55 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Fri Jul 10 07:31:26 UTC 2015.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.0.8-rc1.gz
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <[email protected]>
Linux 4.0.8-rc1

Eric W. Biederman <[email protected]>
vfs: Ignore unlocked mounts in fs_fully_visible

Eric W. Biederman <[email protected]>
vfs: Remove incorrect debugging WARN in prepend_path

Fabian Frederick <[email protected]>
fs/ufs: restore s_lock mutex

Fabian Frederick <[email protected]>
fs/ufs: revert "ufs: fix deadlocks introduced by sb mutex merge"

Jan Kara <[email protected]>
fs: Fix S_NOSEC handling

Radim Krčmář <[email protected]>
KVM: x86: make vapics_in_nmi_mode atomic

Radim Krčmář <[email protected]>
KVM: x86: properly restore LVT0

Cornelia Huck <[email protected]>
KVM: s390: virtio-ccw: don't overwrite config space values

Michael Holzheu <[email protected]>
s390/kdump: fix REGSET_VX_LOW vector register ELF notes

David Hildenbrand <[email protected]>
KVM: s390: fix external call injection without sigp interpretation

James Hogan <[email protected]>
MIPS: Fix KVM guest fixmap address

Paolo Bonzini <[email protected]>
KVM: mips: use id_to_memslot correctly

Bjorn Helgaas <[email protected]>
x86/PCI: Use host bridge _CRS info on Foxconn K8M890-8237A

Bjorn Helgaas <[email protected]>
x86/PCI: Use host bridge _CRS info on systems with >32 bit addressing

Anton Blanchard <[email protected]>
powerpc/perf: Fix book3s kernel to userspace backtraces

preeti <[email protected]>
tick/idle/powerpc: Do not register idle states with CPUIDLE_FLAG_TIMER_STOP set in periodic mode

Thomas Petazzoni <[email protected]>
ARM: mvebu: fix suspend to RAM on big-endian configurations

Dmitry Osipenko <[email protected]>
ARM: tegra20: Store CPU "resettable" status in IRAM

Lorenzo Pieralisi <[email protected]>
ARM: kvm: psci: fix handling of unimplemented functions

Marc Zyngier <[email protected]>
arm: KVM: force execution of HCPTR access on VM exit

J. Bruce Fields <[email protected]>
selinux: fix setting of security labels on NFS

Joe Konno <[email protected]>
intel_pstate: set BYT MSR with wrmsrl_on_cpu()

Jiri Slaby <[email protected]>
mmc: sdhci: fix low memory corruption

Joerg Roedel <[email protected]>
iommu/amd: Handle large pages correctly in free_pagetable

Will Deacon <[email protected]>
iommu/arm-smmu: Fix broken ATOS check

Horia Geant? <[email protected]>
Revert "crypto: talitos - convert to use be16_add_cpu()"

Horia Geant? <[email protected]>
crypto: talitos - avoid memleak in talitos_alg_alloc()

Rui Miguel Silva <[email protected]>
usb: gadget: f_fs: add extra check before unregister_gadget_item

Rui Miguel Silva <[email protected]>
usb: gadget: f_fs: fix check in read operation

Simon Guinot <[email protected]>
net: mvneta: disable IP checksum with jumbo frames for Armada 370

Simon Guinot <[email protected]>
ARM: mvebu: update Ethernet compatible string for Armada XP

Simon Guinot <[email protected]>
net: mvneta: introduce compatible string "marvell, armada-xp-neta"

Tom Lendacky <[email protected]>
amd-xgbe: Add the __GFP_NOWARN flag to Rx buffer allocation

Alexander Sverdlin <[email protected]>
sctp: Fix race between OOTB responce and route removal

Eric Dumazet <[email protected]>
bnx2x: fix lockdep splat

Mugunthan V N <[email protected]>
net: phy: fix phy link up when limiting speed via device tree

Or Gerlitz <[email protected]>
mlx4: Disable HA for SRIOV PF RoCE devices

Ido Shamay <[email protected]>
net/mlx4_en: Fix wrong csum complete report when rxvlan offload is disabled

Ido Shamay <[email protected]>
net/mlx4_en: Wake TX queues only when there's enough room

Eran Ben Elisha <[email protected]>
net/mlx4_en: Release TX QP when destroying TX ring

Julian Anastasov <[email protected]>
ip: report the original address of ICMP messages

Christoph Paasch <[email protected]>
tcp: Do not call tcp_fastopen_reset_cipher from interrupt context

Julian Anastasov <[email protected]>
neigh: do not modify unlinked entries

Willem de Bruijn <[email protected]>
packet: avoid out of bounds read in round robin fanout

Eric Dumazet <[email protected]>
packet: read num_members once in packet_rcv_fanout()

Nikolay Aleksandrov <[email protected]>
bridge: fix br_stp_set_bridge_priority race conditions

Marcelo Ricardo Leitner <[email protected]>
sctp: fix ASCONF list handling

Shaohua Li <[email protected]>
net: don't wait for order-3 page allocation

Richard Cochran <[email protected]>
net: igb: fix the start time for periodic output signals

Nikolay Aleksandrov <[email protected]>
bridge: fix multicast router rlist endless loop

Sowmini Varadhan <[email protected]>
sparc: Use GFP_ATOMIC in ldc_alloc_exp_dring() as it can be called in softirq context

Bandan Das <[email protected]>
KVM: nSVM: Check for NRIPS support before updating control field

Sebastien Szymanski <[email protected]>
ARM: clk-imx6q: refine sata's parent

Patrick McHardy <[email protected]>
netfilter: nft_rbtree: fix locking

Konrad Rzeszutek Wilk <[email protected]>
config: Enable NEED_DMA_MAP_STATE by default when SWIOTLB is selected


-------------

Diffstat:

.../bindings/net/marvell-armada-370-neta.txt | 2 +-
Makefile | 4 +-
arch/arm/boot/dts/armada-370-xp.dtsi | 2 -
arch/arm/boot/dts/armada-370.dtsi | 8 ++++
arch/arm/boot/dts/armada-xp-mv78260.dtsi | 2 +-
arch/arm/boot/dts/armada-xp-mv78460.dtsi | 2 +-
arch/arm/boot/dts/armada-xp.dtsi | 10 ++++-
arch/arm/kvm/interrupts.S | 10 ++---
arch/arm/kvm/interrupts_head.S | 20 +++++++++-
arch/arm/kvm/psci.c | 16 ++------
arch/arm/mach-imx/clk-imx6q.c | 2 +-
arch/arm/mach-mvebu/pm-board.c | 3 ++
arch/arm/mach-tegra/cpuidle-tegra20.c | 5 +--
arch/arm/mach-tegra/reset-handler.S | 10 +++--
arch/arm/mach-tegra/reset.h | 4 ++
arch/arm/mach-tegra/sleep-tegra20.S | 37 +++++++++++--------
arch/arm/mach-tegra/sleep.h | 4 ++
arch/mips/include/asm/mach-generic/spaces.h | 4 ++
arch/mips/kvm/mips.c | 2 +-
arch/powerpc/perf/core-book3s.c | 11 +++++-
arch/s390/kernel/crash_dump.c | 2 +-
arch/s390/kvm/interrupt.c | 2 +-
arch/sparc/kernel/ldc.c | 2 +-
arch/x86/Kconfig | 2 +-
arch/x86/include/asm/kvm_host.h | 2 +-
arch/x86/kvm/i8254.c | 2 +-
arch/x86/kvm/lapic.c | 5 ++-
arch/x86/kvm/svm.c | 8 +++-
arch/x86/pci/acpi.c | 17 ++++++++-
drivers/cpufreq/intel_pstate.c | 2 +-
drivers/cpuidle/cpuidle-powernv.c | 15 ++++++--
drivers/crypto/talitos.c | 4 +-
drivers/iommu/amd_iommu.c | 6 +++
drivers/iommu/arm-smmu.c | 2 +-
drivers/mmc/host/sdhci.c | 2 +-
drivers/net/ethernet/amd/xgbe/xgbe-desc.c | 2 +-
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 3 +-
drivers/net/ethernet/intel/igb/igb_ptp.c | 4 +-
drivers/net/ethernet/marvell/mvneta.c | 27 +++++++++++++-
drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 4 --
drivers/net/ethernet/mellanox/mlx4/en_rx.c | 17 +++------
drivers/net/ethernet/mellanox/mlx4/en_tx.c | 20 ++++++----
drivers/net/ethernet/mellanox/mlx4/intf.c | 8 +++-
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 2 +-
drivers/net/phy/phy_device.c | 5 ++-
drivers/s390/kvm/virtio_ccw.c | 11 +++++-
drivers/usb/gadget/function/f_fs.c | 12 ++++--
fs/dcache.c | 11 ------
fs/inode.c | 4 +-
fs/namespace.c | 8 +++-
fs/ufs/balloc.c | 34 ++++++++---------
fs/ufs/ialloc.c | 16 ++++----
fs/ufs/inode.c | 5 ++-
fs/ufs/namei.c | 14 ++++---
fs/ufs/super.c | 10 +++++
fs/ufs/ufs.h | 1 +
include/net/netns/sctp.h | 1 +
include/net/sctp/structs.h | 4 ++
net/bridge/br_ioctl.c | 2 -
net/bridge/br_multicast.c | 7 ++--
net/bridge/br_stp_if.c | 4 +-
net/core/neighbour.c | 13 +++++++
net/core/skbuff.c | 2 +-
net/core/sock.c | 2 +-
net/ipv4/af_inet.c | 2 +
net/ipv4/ip_sockglue.c | 11 +++++-
net/ipv4/tcp.c | 7 +++-
net/ipv4/tcp_fastopen.c | 2 -
net/ipv6/datagram.c | 12 +++++-
net/netfilter/nft_rbtree.c | 6 +--
net/packet/af_packet.c | 20 ++--------
net/sctp/output.c | 4 +-
net/sctp/socket.c | 43 ++++++++++++++++------
security/selinux/hooks.c | 1 +
74 files changed, 387 insertions(+), 205 deletions(-)


2015-07-08 08:19:20

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 01/55] config: Enable NEED_DMA_MAP_STATE by default when SWIOTLB is selected

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Konrad Rzeszutek Wilk <[email protected]>

commit a6dfa128ce5c414ab46b1d690f7a1b8decb8526d upstream.

A huge amount of NIC drivers use the DMA API, however if
compiled under 32-bit an very important part of the DMA API can
be ommitted leading to the drivers not working at all
(especially if used with 'swiotlb=force iommu=soft').

As Prashant Sreedharan explains it: "the driver [tg3] uses
DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of
the dma "mapping" and dma_unmap_addr() to get the "mapping"
value. On most of the platforms this is a no-op, but ... with
"iommu=soft and swiotlb=force" this house keeping is required,
... otherwise we pass 0 while calling pci_unmap_/pci_dma_sync_
instead of the DMA address."

As such enable this even when using 32-bit kernels.

Reported-by: Ian Jackson <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
Acked-by: David S. Miller <[email protected]>
Acked-by: Prashant Sreedharan <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Michael Chan <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Ben Hutchings <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -177,7 +177,7 @@ config SBUS

config NEED_DMA_MAP_STATE
def_bool y
- depends on X86_64 || INTEL_IOMMU || DMA_API_DEBUG
+ depends on X86_64 || INTEL_IOMMU || DMA_API_DEBUG || SWIOTLB

config NEED_SG_DMA_LENGTH
def_bool y

2015-07-08 08:16:18

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 02/55] netfilter: nft_rbtree: fix locking

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Patrick McHardy <[email protected]>

commit 16c45eda96038aae848b6cfd42e2bf4b5e80f365 upstream.

Fix a race condition and unnecessary locking:

* the root rb_node must only be accessed under the lock in nft_rbtree_lookup()
* the lock is not needed in lookup functions in netlink context

Signed-off-by: Patrick McHardy <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/netfilter/nft_rbtree.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

--- a/net/netfilter/nft_rbtree.c
+++ b/net/netfilter/nft_rbtree.c
@@ -37,10 +37,11 @@ static bool nft_rbtree_lookup(const stru
{
const struct nft_rbtree *priv = nft_set_priv(set);
const struct nft_rbtree_elem *rbe, *interval = NULL;
- const struct rb_node *parent = priv->root.rb_node;
+ const struct rb_node *parent;
int d;

spin_lock_bh(&nft_rbtree_lock);
+ parent = priv->root.rb_node;
while (parent != NULL) {
rbe = rb_entry(parent, struct nft_rbtree_elem, node);

@@ -158,7 +159,6 @@ static int nft_rbtree_get(const struct n
struct nft_rbtree_elem *rbe;
int d;

- spin_lock_bh(&nft_rbtree_lock);
while (parent != NULL) {
rbe = rb_entry(parent, struct nft_rbtree_elem, node);

@@ -173,11 +173,9 @@ static int nft_rbtree_get(const struct n
!(rbe->flags & NFT_SET_ELEM_INTERVAL_END))
nft_data_copy(&elem->data, rbe->data);
elem->flags = rbe->flags;
- spin_unlock_bh(&nft_rbtree_lock);
return 0;
}
}
- spin_unlock_bh(&nft_rbtree_lock);
return -ENOENT;
}


2015-07-08 08:12:20

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 03/55] ARM: clk-imx6q: refine satas parent

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Sebastien Szymanski <[email protected]>

commit da946aeaeadcd24ff0cda9984c6fb8ed2bfd462a upstream.

According to IMX6D/Q RM, table 18-3, sata clock's parent is ahb, not ipg.

Signed-off-by: Sebastien Szymanski <[email protected]>
Reviewed-by: Fabio Estevam <[email protected]>
Signed-off-by: Shawn Guo <[email protected]>
[dirk.behme: Adjust moved file]
Signed-off-by: Dirk Behme <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/mach-imx/clk-imx6q.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/arm/mach-imx/clk-imx6q.c
+++ b/arch/arm/mach-imx/clk-imx6q.c
@@ -439,7 +439,7 @@ static void __init imx6q_clocks_init(str
clk[IMX6QDL_CLK_GPMI_IO] = imx_clk_gate2("gpmi_io", "enfc", base + 0x78, 28);
clk[IMX6QDL_CLK_GPMI_APB] = imx_clk_gate2("gpmi_apb", "usdhc3", base + 0x78, 30);
clk[IMX6QDL_CLK_ROM] = imx_clk_gate2("rom", "ahb", base + 0x7c, 0);
- clk[IMX6QDL_CLK_SATA] = imx_clk_gate2("sata", "ipg", base + 0x7c, 4);
+ clk[IMX6QDL_CLK_SATA] = imx_clk_gate2("sata", "ahb", base + 0x7c, 4);
clk[IMX6QDL_CLK_SDMA] = imx_clk_gate2("sdma", "ahb", base + 0x7c, 6);
clk[IMX6QDL_CLK_SPBA] = imx_clk_gate2("spba", "ipg", base + 0x7c, 12);
clk[IMX6QDL_CLK_SPDIF] = imx_clk_gate2("spdif", "spdif_podf", base + 0x7c, 14);

2015-07-08 08:07:33

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 04/55] KVM: nSVM: Check for NRIPS support before updating control field

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Bandan Das <[email protected]>

commit f104765b4f81fd74d69e0eb161e89096deade2db upstream.

If hardware doesn't support DecodeAssist - a feature that provides
more information about the intercept in the VMCB, KVM decodes the
instruction and then updates the next_rip vmcb control field.
However, NRIP support itself depends on cpuid Fn8000_000A_EDX[NRIPS].
Since skip_emulated_instruction() doesn't verify nrip support
before accepting control.next_rip as valid, avoid writing this
field if support isn't present.

Signed-off-by: Bandan Das <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kvm/svm.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -511,8 +511,10 @@ static void skip_emulated_instruction(st
{
struct vcpu_svm *svm = to_svm(vcpu);

- if (svm->vmcb->control.next_rip != 0)
+ if (svm->vmcb->control.next_rip != 0) {
+ WARN_ON(!static_cpu_has(X86_FEATURE_NRIPS));
svm->next_rip = svm->vmcb->control.next_rip;
+ }

if (!svm->next_rip) {
if (emulate_instruction(vcpu, EMULTYPE_SKIP) !=
@@ -4310,7 +4312,9 @@ static int svm_check_intercept(struct kv
break;
}

- vmcb->control.next_rip = info->next_rip;
+ /* TODO: Advertise NRIPS to guest hypervisor unconditionally */
+ if (static_cpu_has(X86_FEATURE_NRIPS))
+ vmcb->control.next_rip = info->next_rip;
vmcb->control.exit_code = icpt_info.exit_code;
vmexit = nested_svm_exit_handled(svm);


2015-07-08 08:06:11

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 05/55] sparc: Use GFP_ATOMIC in ldc_alloc_exp_dring() as it can be called in softirq context

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Sowmini Varadhan <[email protected]>

Upstream commit 671d773297969bebb1732e1cdc1ec03aa53c6be2

Since it is possible for vnet_event_napi to end up doing
vnet_control_pkt_engine -> ... -> vnet_send_attr ->
vnet_port_alloc_tx_ring -> ldc_alloc_exp_dring -> kzalloc()
(i.e., in softirq context), kzalloc() should be called with
GFP_ATOMIC from ldc_alloc_exp_dring.

Signed-off-by: Sowmini Varadhan <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/sparc/kernel/ldc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/sparc/kernel/ldc.c
+++ b/arch/sparc/kernel/ldc.c
@@ -2313,7 +2313,7 @@ void *ldc_alloc_exp_dring(struct ldc_cha
if (len & (8UL - 1))
return ERR_PTR(-EINVAL);

- buf = kzalloc(len, GFP_KERNEL);
+ buf = kzalloc(len, GFP_ATOMIC);
if (!buf)
return ERR_PTR(-ENOMEM);


2015-07-08 08:05:56

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 06/55] bridge: fix multicast router rlist endless loop

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Nikolay Aleksandrov <[email protected]>

[ Upstream commit 1a040eaca1a22f8da8285ceda6b5e4a2cb704867 ]

Since the addition of sysfs multicast router support if one set
multicast_router to "2" more than once, then the port would be added to
the hlist every time and could end up linking to itself and thus causing an
endless loop for rlist walkers.
So to reproduce just do:
echo 2 > multicast_router; echo 2 > multicast_router;
in a bridge port and let some igmp traffic flow, for me it hangs up
in br_multicast_flood().
Fix this by adding a check in br_multicast_add_router() if the port is
already linked.
The reason this didn't happen before the addition of multicast_router
sysfs entries is because there's a !hlist_unhashed check that prevents
it.

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Fixes: 0909e11758bd ("bridge: Add multicast_router sysfs entries")
Acked-by: Herbert Xu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/bridge/br_multicast.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)

--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -1166,6 +1166,9 @@ static void br_multicast_add_router(stru
struct net_bridge_port *p;
struct hlist_node *slot = NULL;

+ if (!hlist_unhashed(&port->rlist))
+ return;
+
hlist_for_each_entry(p, &br->router_list, rlist) {
if ((unsigned long) port >= (unsigned long) p)
break;
@@ -1193,12 +1196,8 @@ static void br_multicast_mark_router(str
if (port->multicast_router != 1)
return;

- if (!hlist_unhashed(&port->rlist))
- goto timer;
-
br_multicast_add_router(br, port);

-timer:
mod_timer(&port->multicast_router_timer,
now + br->multicast_querier_interval);
}

2015-07-08 08:05:39

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 07/55] net: igb: fix the start time for periodic output signals

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Richard Cochran <[email protected]>

[ Upstream commit 58c98be137830d34b79024cc5dc95ef54fcd7ffe ]

When programming the start of a periodic output, the code wrongly places
the seconds value into the "low" register and the nanoseconds into the
"high" register. Even though this is backwards, it slipped through my
testing, because the re-arming code in the interrupt service routine is
correct, and the signal does appear starting with the second edge.

This patch fixes the issue by programming the registers correctly.

Signed-off-by: Richard Cochran <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Acked-by: Jeff Kirsher <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/intel/igb/igb_ptp.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/net/ethernet/intel/igb/igb_ptp.c
+++ b/drivers/net/ethernet/intel/igb/igb_ptp.c
@@ -540,8 +540,8 @@ static int igb_ptp_feature_enable_i210(s
igb->perout[i].start.tv_nsec = rq->perout.start.nsec;
igb->perout[i].period.tv_sec = ts.tv_sec;
igb->perout[i].period.tv_nsec = ts.tv_nsec;
- wr32(trgttiml, rq->perout.start.sec);
- wr32(trgttimh, rq->perout.start.nsec);
+ wr32(trgttimh, rq->perout.start.sec);
+ wr32(trgttiml, rq->perout.start.nsec);
tsauxc |= tsauxc_mask;
tsim |= tsim_mask;
} else {

2015-07-08 08:05:28

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 08/55] net: dont wait for order-3 page allocation

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Shaohua Li <[email protected]>

[ Upstream commit fb05e7a89f500cfc06ae277bdc911b281928995d ]

We saw excessive direct memory compaction triggered by skb_page_frag_refill.
This causes performance issues and add latency. Commit 5640f7685831e0
introduces the order-3 allocation. According to the changelog, the order-3
allocation isn't a must-have but to improve performance. But direct memory
compaction has high overhead. The benefit of order-3 allocation can't
compensate the overhead of direct memory compaction.

This patch makes the order-3 page allocation atomic. If there is no memory
pressure and memory isn't fragmented, the alloction will still success, so we
don't sacrifice the order-3 benefit here. If the atomic allocation fails,
direct memory compaction will not be triggered, skb_page_frag_refill will
fallback to order-0 immediately, hence the direct memory compaction overhead is
avoided. In the allocation failure case, kswapd is waken up and doing
compaction, so chances are allocation could success next time.

alloc_skb_with_frags is the same.

The mellanox driver does similar thing, if this is accepted, we must fix
the driver too.

V3: fix the same issue in alloc_skb_with_frags as pointed out by Eric
V2: make the changelog clearer

Cc: Eric Dumazet <[email protected]>
Cc: Chris Mason <[email protected]>
Cc: Debabrata Banerjee <[email protected]>
Signed-off-by: Shaohua Li <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/core/skbuff.c | 2 +-
net/core/sock.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4443,7 +4443,7 @@ struct sk_buff *alloc_skb_with_frags(uns

while (order) {
if (npages >= 1 << order) {
- page = alloc_pages(gfp_mask |
+ page = alloc_pages((gfp_mask & ~__GFP_WAIT) |
__GFP_COMP |
__GFP_NOWARN |
__GFP_NORETRY,
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1895,7 +1895,7 @@ bool skb_page_frag_refill(unsigned int s

pfrag->offset = 0;
if (SKB_FRAG_PAGE_ORDER) {
- pfrag->page = alloc_pages(gfp | __GFP_COMP |
+ pfrag->page = alloc_pages((gfp & ~__GFP_WAIT) | __GFP_COMP |
__GFP_NOWARN | __GFP_NORETRY,
SKB_FRAG_PAGE_ORDER);
if (likely(pfrag->page)) {

2015-07-08 08:05:12

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 09/55] sctp: fix ASCONF list handling

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Marcelo Ricardo Leitner <[email protected]>

[ Upstream commit 2d45a02d0166caf2627fe91897c6ffc3b19514c4 ]

->auto_asconf_splist is per namespace and mangled by functions like
sctp_setsockopt_auto_asconf() which doesn't guarantee any serialization.

Also, the call to inet_sk_copy_descendant() was backuping
->auto_asconf_list through the copy but was not honoring
->do_auto_asconf, which could lead to list corruption if it was
different between both sockets.

This commit thus fixes the list handling by using ->addr_wq_lock
spinlock to protect the list. A special handling is done upon socket
creation and destruction for that. Error handlig on sctp_init_sock()
will never return an error after having initialized asconf, so
sctp_destroy_sock() can be called without addrq_wq_lock. The lock now
will be take on sctp_close_sock(), before locking the socket, so we
don't do it in inverse order compared to sctp_addr_wq_timeout_handler().

Instead of taking the lock on sctp_sock_migrate() for copying and
restoring the list values, it's preferred to avoid rewritting it by
implementing sctp_copy_descendant().

Issue was found with a test application that kept flipping sysctl
default_auto_asconf on and off, but one could trigger it by issuing
simultaneous setsockopt() calls on multiple sockets or by
creating/destroying sockets fast enough. This is only triggerable
locally.

Fixes: 9f7d653b67ae ("sctp: Add Auto-ASCONF support (core).")
Reported-by: Ji Jianwen <[email protected]>
Suggested-by: Neil Horman <[email protected]>
Suggested-by: Hannes Frederic Sowa <[email protected]>
Acked-by: Hannes Frederic Sowa <[email protected]>
Signed-off-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/net/netns/sctp.h | 1 +
include/net/sctp/structs.h | 4 ++++
net/sctp/socket.c | 43 ++++++++++++++++++++++++++++++++-----------
3 files changed, 37 insertions(+), 11 deletions(-)

--- a/include/net/netns/sctp.h
+++ b/include/net/netns/sctp.h
@@ -31,6 +31,7 @@ struct netns_sctp {
struct list_head addr_waitq;
struct timer_list addr_wq_timer;
struct list_head auto_asconf_splist;
+ /* Lock that protects both addr_waitq and auto_asconf_splist */
spinlock_t addr_wq_lock;

/* Lock that protects the local_addr_list writers */
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -223,6 +223,10 @@ struct sctp_sock {
atomic_t pd_mode;
/* Receive to here while partial delivery is in effect. */
struct sk_buff_head pd_lobby;
+
+ /* These must be the last fields, as they will skipped on copies,
+ * like on accept and peeloff operations
+ */
struct list_head auto_asconf_list;
int do_auto_asconf;
};
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1533,8 +1533,10 @@ static void sctp_close(struct sock *sk,

/* Supposedly, no process has access to the socket, but
* the net layers still may.
+ * Also, sctp_destroy_sock() needs to be called with addr_wq_lock
+ * held and that should be grabbed before socket lock.
*/
- local_bh_disable();
+ spin_lock_bh(&net->sctp.addr_wq_lock);
bh_lock_sock(sk);

/* Hold the sock, since sk_common_release() will put sock_put()
@@ -1544,7 +1546,7 @@ static void sctp_close(struct sock *sk,
sk_common_release(sk);

bh_unlock_sock(sk);
- local_bh_enable();
+ spin_unlock_bh(&net->sctp.addr_wq_lock);

sock_put(sk);

@@ -3587,6 +3589,7 @@ static int sctp_setsockopt_auto_asconf(s
if ((val && sp->do_auto_asconf) || (!val && !sp->do_auto_asconf))
return 0;

+ spin_lock_bh(&sock_net(sk)->sctp.addr_wq_lock);
if (val == 0 && sp->do_auto_asconf) {
list_del(&sp->auto_asconf_list);
sp->do_auto_asconf = 0;
@@ -3595,6 +3598,7 @@ static int sctp_setsockopt_auto_asconf(s
&sock_net(sk)->sctp.auto_asconf_splist);
sp->do_auto_asconf = 1;
}
+ spin_unlock_bh(&sock_net(sk)->sctp.addr_wq_lock);
return 0;
}

@@ -4128,18 +4132,28 @@ static int sctp_init_sock(struct sock *s
local_bh_disable();
percpu_counter_inc(&sctp_sockets_allocated);
sock_prot_inuse_add(net, sk->sk_prot, 1);
+
+ /* Nothing can fail after this block, otherwise
+ * sctp_destroy_sock() will be called without addr_wq_lock held
+ */
if (net->sctp.default_auto_asconf) {
+ spin_lock(&sock_net(sk)->sctp.addr_wq_lock);
list_add_tail(&sp->auto_asconf_list,
&net->sctp.auto_asconf_splist);
sp->do_auto_asconf = 1;
- } else
+ spin_unlock(&sock_net(sk)->sctp.addr_wq_lock);
+ } else {
sp->do_auto_asconf = 0;
+ }
+
local_bh_enable();

return 0;
}

-/* Cleanup any SCTP per socket resources. */
+/* Cleanup any SCTP per socket resources. Must be called with
+ * sock_net(sk)->sctp.addr_wq_lock held if sp->do_auto_asconf is true
+ */
static void sctp_destroy_sock(struct sock *sk)
{
struct sctp_sock *sp;
@@ -7202,6 +7216,19 @@ void sctp_copy_sock(struct sock *newsk,
newinet->mc_list = NULL;
}

+static inline void sctp_copy_descendant(struct sock *sk_to,
+ const struct sock *sk_from)
+{
+ int ancestor_size = sizeof(struct inet_sock) +
+ sizeof(struct sctp_sock) -
+ offsetof(struct sctp_sock, auto_asconf_list);
+
+ if (sk_from->sk_family == PF_INET6)
+ ancestor_size += sizeof(struct ipv6_pinfo);
+
+ __inet_sk_copy_descendant(sk_to, sk_from, ancestor_size);
+}
+
/* Populate the fields of the newsk from the oldsk and migrate the assoc
* and its messages to the newsk.
*/
@@ -7216,7 +7243,6 @@ static void sctp_sock_migrate(struct soc
struct sk_buff *skb, *tmp;
struct sctp_ulpevent *event;
struct sctp_bind_hashbucket *head;
- struct list_head tmplist;

/* Migrate socket buffer sizes and all the socket level options to the
* new socket.
@@ -7224,12 +7250,7 @@ static void sctp_sock_migrate(struct soc
newsk->sk_sndbuf = oldsk->sk_sndbuf;
newsk->sk_rcvbuf = oldsk->sk_rcvbuf;
/* Brute force copy old sctp opt. */
- if (oldsp->do_auto_asconf) {
- memcpy(&tmplist, &newsp->auto_asconf_list, sizeof(tmplist));
- inet_sk_copy_descendant(newsk, oldsk);
- memcpy(&newsp->auto_asconf_list, &tmplist, sizeof(tmplist));
- } else
- inet_sk_copy_descendant(newsk, oldsk);
+ sctp_copy_descendant(newsk, oldsk);

/* Restore the ep value that was overwritten with the above structure
* copy.

2015-07-08 08:19:08

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 10/55] bridge: fix br_stp_set_bridge_priority race conditions

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Nikolay Aleksandrov <[email protected]>

[ Upstream commit 2dab80a8b486f02222a69daca6859519e05781d9 ]

After the ->set() spinlocks were removed br_stp_set_bridge_priority
was left running without any protection when used via sysfs. It can
race with port add/del and could result in use-after-free cases and
corrupted lists. Tested by running port add/del in a loop with stp
enabled while setting priority in a loop, crashes are easily
reproducible.
The spinlocks around sysfs ->set() were removed in commit:
14f98f258f19 ("bridge: range check STP parameters")
There's also a race condition in the netlink priority support that is
fixed by this change, but it was introduced recently and the fixes tag
covers it, just in case it's needed the commit is:
af615762e972 ("bridge: add ageing_time, stp_state, priority over netlink")

Signed-off-by: Nikolay Aleksandrov <[email protected]>
Fixes: 14f98f258f19 ("bridge: range check STP parameters")
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/bridge/br_ioctl.c | 2 --
net/bridge/br_stp_if.c | 4 +++-
2 files changed, 3 insertions(+), 3 deletions(-)

--- a/net/bridge/br_ioctl.c
+++ b/net/bridge/br_ioctl.c
@@ -247,9 +247,7 @@ static int old_dev_ioctl(struct net_devi
if (!ns_capable(dev_net(dev)->user_ns, CAP_NET_ADMIN))
return -EPERM;

- spin_lock_bh(&br->lock);
br_stp_set_bridge_priority(br, args[1]);
- spin_unlock_bh(&br->lock);
return 0;

case BRCTL_SET_PORT_PRIORITY:
--- a/net/bridge/br_stp_if.c
+++ b/net/bridge/br_stp_if.c
@@ -243,12 +243,13 @@ bool br_stp_recalculate_bridge_id(struct
return true;
}

-/* called under bridge lock */
+/* Acquires and releases bridge lock */
void br_stp_set_bridge_priority(struct net_bridge *br, u16 newprio)
{
struct net_bridge_port *p;
int wasroot;

+ spin_lock_bh(&br->lock);
wasroot = br_is_root_bridge(br);

list_for_each_entry(p, &br->port_list, list) {
@@ -266,6 +267,7 @@ void br_stp_set_bridge_priority(struct n
br_port_state_selection(br);
if (br_is_root_bridge(br) && !wasroot)
br_become_root_bridge(br);
+ spin_unlock_bh(&br->lock);
}

/* called under bridge lock */

2015-07-08 08:18:54

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 11/55] packet: read num_members once in packet_rcv_fanout()

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <[email protected]>

[ Upstream commit f98f4514d07871da7a113dd9e3e330743fd70ae4 ]

We need to tell compiler it must not read f->num_members multiple
times. Otherwise testing if num is not zero is flaky, and we could
attempt an invalid divide by 0 in fanout_demux_cpu()

Note bug was present in packet_rcv_fanout_hash() and
packet_rcv_fanout_lb() but final 3.1 had a simple location
after commit 95ec3eb417115fb ("packet: Add 'cpu' fanout policy.")

Fixes: dc99f600698dc ("packet: Add fanout support.")
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/packet/af_packet.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1347,7 +1347,7 @@ static int packet_rcv_fanout(struct sk_b
struct packet_type *pt, struct net_device *orig_dev)
{
struct packet_fanout *f = pt->af_packet_priv;
- unsigned int num = f->num_members;
+ unsigned int num = READ_ONCE(f->num_members);
struct packet_sock *po;
unsigned int idx;


2015-07-08 08:18:41

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 12/55] packet: avoid out of bounds read in round robin fanout

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Willem de Bruijn <[email protected]>

[ Upstream commit 468479e6043c84f5a65299cc07cb08a22a28c2b1 ]

PACKET_FANOUT_LB computes f->rr_cur such that it is modulo
f->num_members. It returns the old value unconditionally, but
f->num_members may have changed since the last store. Ensure
that the return value is always < num.

When modifying the logic, simplify it further by replacing the loop
with an unconditional atomic increment.

Fixes: dc99f600698d ("packet: Add fanout support.")
Suggested-by: Eric Dumazet <[email protected]>
Signed-off-by: Willem de Bruijn <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/packet/af_packet.c | 18 ++----------------
1 file changed, 2 insertions(+), 16 deletions(-)

--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1266,16 +1266,6 @@ static void packet_sock_destruct(struct
sk_refcnt_debug_dec(sk);
}

-static int fanout_rr_next(struct packet_fanout *f, unsigned int num)
-{
- int x = atomic_read(&f->rr_cur) + 1;
-
- if (x >= num)
- x = 0;
-
- return x;
-}
-
static unsigned int fanout_demux_hash(struct packet_fanout *f,
struct sk_buff *skb,
unsigned int num)
@@ -1287,13 +1277,9 @@ static unsigned int fanout_demux_lb(stru
struct sk_buff *skb,
unsigned int num)
{
- int cur, old;
+ unsigned int val = atomic_inc_return(&f->rr_cur);

- cur = atomic_read(&f->rr_cur);
- while ((old = atomic_cmpxchg(&f->rr_cur, cur,
- fanout_rr_next(f, num))) != cur)
- cur = old;
- return cur;
+ return val % num;
}

static unsigned int fanout_demux_cpu(struct packet_fanout *f,

2015-07-08 08:18:25

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 13/55] neigh: do not modify unlinked entries

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Julian Anastasov <[email protected]>

[ Upstream commit 2c51a97f76d20ebf1f50fef908b986cb051fdff9 ]

The lockless lookups can return entry that is unlinked.
Sometimes they get reference before last neigh_cleanup_and_release,
sometimes they do not need reference. Later, any
modification attempts may result in the following problems:

1. entry is not destroyed immediately because neigh_update
can start the timer for dead entry, eg. on change to NUD_REACHABLE
state. As result, entry lives for some time but is invisible
and out of control.

2. __neigh_event_send can run in parallel with neigh_destroy
while refcnt=0 but if timer is started and expired refcnt can
reach 0 for second time leading to second neigh_destroy and
possible crash.

Thanks to Eric Dumazet and Ying Xue for their work and analyze
on the __neigh_event_send change.

Fixes: 767e97e1e0db ("neigh: RCU conversion of struct neighbour")
Fixes: a263b3093641 ("ipv4: Make neigh lookups directly in output packet path.")
Fixes: 6fd6ce2056de ("ipv6: Do not depend on rt->n in ip6_finish_output2().")
Cc: Eric Dumazet <[email protected]>
Cc: Ying Xue <[email protected]>
Signed-off-by: Julian Anastasov <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/core/neighbour.c | 13 +++++++++++++
1 file changed, 13 insertions(+)

--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -971,6 +971,8 @@ int __neigh_event_send(struct neighbour
rc = 0;
if (neigh->nud_state & (NUD_CONNECTED | NUD_DELAY | NUD_PROBE))
goto out_unlock_bh;
+ if (neigh->dead)
+ goto out_dead;

if (!(neigh->nud_state & (NUD_STALE | NUD_INCOMPLETE))) {
if (NEIGH_VAR(neigh->parms, MCAST_PROBES) +
@@ -1027,6 +1029,13 @@ out_unlock_bh:
write_unlock(&neigh->lock);
local_bh_enable();
return rc;
+
+out_dead:
+ if (neigh->nud_state & NUD_STALE)
+ goto out_unlock_bh;
+ write_unlock_bh(&neigh->lock);
+ kfree_skb(skb);
+ return 1;
}
EXPORT_SYMBOL(__neigh_event_send);

@@ -1090,6 +1099,8 @@ int neigh_update(struct neighbour *neigh
if (!(flags & NEIGH_UPDATE_F_ADMIN) &&
(old & (NUD_NOARP | NUD_PERMANENT)))
goto out;
+ if (neigh->dead)
+ goto out;

if (!(new & NUD_VALID)) {
neigh_del_timer(neigh);
@@ -1239,6 +1250,8 @@ EXPORT_SYMBOL(neigh_update);
*/
void __neigh_set_probe_once(struct neighbour *neigh)
{
+ if (neigh->dead)
+ return;
neigh->updated = jiffies;
if (!(neigh->nud_state & NUD_FAILED))
return;

2015-07-08 08:18:10

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 14/55] tcp: Do not call tcp_fastopen_reset_cipher from interrupt context

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Christoph Paasch <[email protected]>

[ Upstream commit dfea2aa654243f70dc53b8648d0bbdeec55a7df1 ]

tcp_fastopen_reset_cipher really cannot be called from interrupt
context. It allocates the tcp_fastopen_context with GFP_KERNEL and
calls crypto_alloc_cipher, which allocates all kind of stuff with
GFP_KERNEL.

Thus, we might sleep when the key-generation is triggered by an
incoming TFO cookie-request which would then happen in interrupt-
context, as shown by enabling CONFIG_DEBUG_ATOMIC_SLEEP:

[ 36.001813] BUG: sleeping function called from invalid context at mm/slub.c:1266
[ 36.003624] in_atomic(): 1, irqs_disabled(): 0, pid: 1016, name: packetdrill
[ 36.004859] CPU: 1 PID: 1016 Comm: packetdrill Not tainted 4.1.0-rc7 #14
[ 36.006085] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[ 36.008250] 00000000000004f2 ffff88007f8838a8 ffffffff8171d53a ffff880075a084a8
[ 36.009630] ffff880075a08000 ffff88007f8838c8 ffffffff810967d3 ffff88007f883928
[ 36.011076] 0000000000000000 ffff88007f8838f8 ffffffff81096892 ffff88007f89be00
[ 36.012494] Call Trace:
[ 36.012953] <IRQ> [<ffffffff8171d53a>] dump_stack+0x4f/0x6d
[ 36.014085] [<ffffffff810967d3>] ___might_sleep+0x103/0x170
[ 36.015117] [<ffffffff81096892>] __might_sleep+0x52/0x90
[ 36.016117] [<ffffffff8118e887>] kmem_cache_alloc_trace+0x47/0x190
[ 36.017266] [<ffffffff81680d82>] ? tcp_fastopen_reset_cipher+0x42/0x130
[ 36.018485] [<ffffffff81680d82>] tcp_fastopen_reset_cipher+0x42/0x130
[ 36.019679] [<ffffffff81680f01>] tcp_fastopen_init_key_once+0x61/0x70
[ 36.020884] [<ffffffff81680f2c>] __tcp_fastopen_cookie_gen+0x1c/0x60
[ 36.022058] [<ffffffff816814ff>] tcp_try_fastopen+0x58f/0x730
[ 36.023118] [<ffffffff81671788>] tcp_conn_request+0x3e8/0x7b0
[ 36.024185] [<ffffffff810e3872>] ? __module_text_address+0x12/0x60
[ 36.025327] [<ffffffff8167b2e1>] tcp_v4_conn_request+0x51/0x60
[ 36.026410] [<ffffffff816727e0>] tcp_rcv_state_process+0x190/0xda0
[ 36.027556] [<ffffffff81661f97>] ? __inet_lookup_established+0x47/0x170
[ 36.028784] [<ffffffff8167c2ad>] tcp_v4_do_rcv+0x16d/0x3d0
[ 36.029832] [<ffffffff812e6806>] ? security_sock_rcv_skb+0x16/0x20
[ 36.030936] [<ffffffff8167cc8a>] tcp_v4_rcv+0x77a/0x7b0
[ 36.031875] [<ffffffff816af8c3>] ? iptable_filter_hook+0x33/0x70
[ 36.032953] [<ffffffff81657d22>] ip_local_deliver_finish+0x92/0x1f0
[ 36.034065] [<ffffffff81657f1a>] ip_local_deliver+0x9a/0xb0
[ 36.035069] [<ffffffff81657c90>] ? ip_rcv+0x3d0/0x3d0
[ 36.035963] [<ffffffff81657569>] ip_rcv_finish+0x119/0x330
[ 36.036950] [<ffffffff81657ba7>] ip_rcv+0x2e7/0x3d0
[ 36.037847] [<ffffffff81610652>] __netif_receive_skb_core+0x552/0x930
[ 36.038994] [<ffffffff81610a57>] __netif_receive_skb+0x27/0x70
[ 36.040033] [<ffffffff81610b72>] process_backlog+0xd2/0x1f0
[ 36.041025] [<ffffffff81611482>] net_rx_action+0x122/0x310
[ 36.042007] [<ffffffff81076743>] __do_softirq+0x103/0x2f0
[ 36.042978] [<ffffffff81723e3c>] do_softirq_own_stack+0x1c/0x30

This patch moves the call to tcp_fastopen_init_key_once to the places
where a listener socket creates its TFO-state, which always happens in
user-context (either from the setsockopt, or implicitly during the
listen()-call)

Cc: Eric Dumazet <[email protected]>
Cc: Hannes Frederic Sowa <[email protected]>
Fixes: 222e83d2e0ae ("tcp: switch tcp_fastopen key generation to net_get_random_once")
Signed-off-by: Christoph Paasch <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/af_inet.c | 2 ++
net/ipv4/tcp.c | 7 +++++--
net/ipv4/tcp_fastopen.c | 2 --
3 files changed, 7 insertions(+), 4 deletions(-)

--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -228,6 +228,8 @@ int inet_listen(struct socket *sock, int
err = 0;
if (err)
goto out;
+
+ tcp_fastopen_init_key_once(true);
}
err = inet_csk_listen_start(sk, backlog);
if (err)
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2541,10 +2541,13 @@ static int do_tcp_setsockopt(struct sock

case TCP_FASTOPEN:
if (val >= 0 && ((1 << sk->sk_state) & (TCPF_CLOSE |
- TCPF_LISTEN)))
+ TCPF_LISTEN))) {
+ tcp_fastopen_init_key_once(true);
+
err = fastopen_init_queue(sk, val);
- else
+ } else {
err = -EINVAL;
+ }
break;
case TCP_TIMESTAMP:
if (!tp->repair)
--- a/net/ipv4/tcp_fastopen.c
+++ b/net/ipv4/tcp_fastopen.c
@@ -78,8 +78,6 @@ static bool __tcp_fastopen_cookie_gen(co
struct tcp_fastopen_context *ctx;
bool ok = false;

- tcp_fastopen_init_key_once(true);
-
rcu_read_lock();
ctx = rcu_dereference(tcp_fastopen_ctx);
if (ctx) {

2015-07-08 08:16:59

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 15/55] ip: report the original address of ICMP messages

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Julian Anastasov <[email protected]>

[ Upstream commit 34b99df4e6256ddafb663c6de0711dceceddfe0e ]

ICMP messages can trigger ICMP and local errors. In this case
serr->port is 0 and starting from Linux 4.0 we do not return
the original target address to the error queue readers.
Add function to define which errors provide addr_offset.
With this fix my ping command is not silent anymore.

Fixes: c247f0534cc5 ("ip: fix error queue empty skb handling")
Signed-off-by: Julian Anastasov <[email protected]>
Acked-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/ip_sockglue.c | 11 ++++++++++-
net/ipv6/datagram.c | 12 +++++++++++-
2 files changed, 21 insertions(+), 2 deletions(-)

--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -432,6 +432,15 @@ void ip_local_error(struct sock *sk, int
kfree_skb(skb);
}

+/* For some errors we have valid addr_offset even with zero payload and
+ * zero port. Also, addr_offset should be supported if port is set.
+ */
+static inline bool ipv4_datagram_support_addr(struct sock_exterr_skb *serr)
+{
+ return serr->ee.ee_origin == SO_EE_ORIGIN_ICMP ||
+ serr->ee.ee_origin == SO_EE_ORIGIN_LOCAL || serr->port;
+}
+
/* IPv4 supports cmsg on all imcp errors and some timestamps
*
* Timestamp code paths do not initialize the fields expected by cmsg:
@@ -498,7 +507,7 @@ int ip_recv_error(struct sock *sk, struc

serr = SKB_EXT_ERR(skb);

- if (sin && serr->port) {
+ if (sin && ipv4_datagram_support_addr(serr)) {
sin->sin_family = AF_INET;
sin->sin_addr.s_addr = *(__be32 *)(skb_network_header(skb) +
serr->addr_offset);
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -325,6 +325,16 @@ void ipv6_local_rxpmtu(struct sock *sk,
kfree_skb(skb);
}

+/* For some errors we have valid addr_offset even with zero payload and
+ * zero port. Also, addr_offset should be supported if port is set.
+ */
+static inline bool ipv6_datagram_support_addr(struct sock_exterr_skb *serr)
+{
+ return serr->ee.ee_origin == SO_EE_ORIGIN_ICMP6 ||
+ serr->ee.ee_origin == SO_EE_ORIGIN_ICMP ||
+ serr->ee.ee_origin == SO_EE_ORIGIN_LOCAL || serr->port;
+}
+
/* IPv6 supports cmsg on all origins aside from SO_EE_ORIGIN_LOCAL.
*
* At one point, excluding local errors was a quick test to identify icmp/icmp6
@@ -389,7 +399,7 @@ int ipv6_recv_error(struct sock *sk, str

serr = SKB_EXT_ERR(skb);

- if (sin && serr->port) {
+ if (sin && ipv6_datagram_support_addr(serr)) {
const unsigned char *nh = skb_network_header(skb);
sin->sin6_family = AF_INET6;
sin->sin6_flowinfo = 0;

2015-07-08 07:37:23

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 16/55] net/mlx4_en: Release TX QP when destroying TX ring

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eran Ben Elisha <[email protected]>

[ Upstream commit 0eb08514fdbdcd16fd6870680cd638f203662e9d ]

TX ring QP wasn't released at mlx4_en_destroy_tx_ring. Instead, the code
used the deprecated base_tx_qpn field. Move TX QP release to
mlx4_en_destroy_tx_ring and remove the base_tx_qpn field.

Fixes: ddae0349fdb7 ('net/mlx4: Change QP allocation scheme')
Signed-off-by: Eran Ben Elisha <[email protected]>
Signed-off-by: Or Gerlitz <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 4 ----
drivers/net/ethernet/mellanox/mlx4/en_tx.c | 1 +
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 1 -
3 files changed, 1 insertion(+), 5 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -1971,10 +1971,6 @@ void mlx4_en_free_resources(struct mlx4_
mlx4_en_destroy_cq(priv, &priv->rx_cq[i]);
}

- if (priv->base_tx_qpn) {
- mlx4_qp_release_range(priv->mdev->dev, priv->base_tx_qpn, priv->tx_ring_num);
- priv->base_tx_qpn = 0;
- }
}

int mlx4_en_alloc_resources(struct mlx4_en_priv *priv)
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -180,6 +180,7 @@ void mlx4_en_destroy_tx_ring(struct mlx4
mlx4_bf_free(mdev->dev, &ring->bf);
mlx4_qp_remove(mdev->dev, &ring->qp);
mlx4_qp_free(mdev->dev, &ring->qp);
+ mlx4_qp_release_range(priv->mdev->dev, ring->qpn, 1);
mlx4_en_unmap_buffer(&ring->wqres.buf);
mlx4_free_hwq_res(mdev->dev, &ring->wqres, ring->buf_size);
kfree(ring->bounce_buf);
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -601,7 +601,6 @@ struct mlx4_en_priv {
int vids[128];
bool wol;
struct device *ddev;
- int base_tx_qpn;
struct hlist_head mac_hash[MLX4_EN_MAC_HASH_SIZE];
struct hwtstamp_config hwtstamp_config;


2015-07-08 08:16:52

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 17/55] net/mlx4_en: Wake TX queues only when theres enough room

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ido Shamay <[email protected]>

[ Upstream commit 488a9b48e398b157703766e2cd91ea45ac6997c5 ]

Indication of a single completed packet, marked by txbbs_skipped
being bigger then zero, in not enough in order to wake up a
stopped TX queue. The completed packet may contain a single TXBB,
while next packet to be sent (after the wake up) may have multiple
TXBBs (LSO/TSO packets for example), causing overflow in queue followed
by WQE corruption and TX queue timeout.
Instead, wake the stopped queue only when there's enough room for the
worst case (maximum sized WQE) packet that we should need to handle after
the queue is opened again.

Also created an helper routine - mlx4_en_is_tx_ring_full, which checks
if the current TX ring is full or not. It provides better code readability
and removes code duplication.

Signed-off-by: Ido Shamay <[email protected]>
Signed-off-by: Or Gerlitz <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/mellanox/mlx4/en_tx.c | 19 +++++++++++--------
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 1 +
2 files changed, 12 insertions(+), 8 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -66,6 +66,7 @@ int mlx4_en_create_tx_ring(struct mlx4_e
ring->size = size;
ring->size_mask = size - 1;
ring->stride = stride;
+ ring->full_size = ring->size - HEADROOM - MAX_DESC_TXBBS;

tmp = size * sizeof(struct mlx4_en_tx_info);
ring->tx_info = kmalloc_node(tmp, GFP_KERNEL | __GFP_NOWARN, node);
@@ -232,6 +233,11 @@ void mlx4_en_deactivate_tx_ring(struct m
MLX4_QP_STATE_RST, NULL, 0, 0, &ring->qp);
}

+static inline bool mlx4_en_is_tx_ring_full(struct mlx4_en_tx_ring *ring)
+{
+ return ring->prod - ring->cons > ring->full_size;
+}
+
static void mlx4_en_stamp_wqe(struct mlx4_en_priv *priv,
struct mlx4_en_tx_ring *ring, int index,
u8 owner)
@@ -474,11 +480,10 @@ static bool mlx4_en_process_tx_cq(struct

netdev_tx_completed_queue(ring->tx_queue, packets, bytes);

- /*
- * Wakeup Tx queue if this stopped, and at least 1 packet
- * was completed
+ /* Wakeup Tx queue if this stopped, and ring is not full.
*/
- if (netif_tx_queue_stopped(ring->tx_queue) && txbbs_skipped > 0) {
+ if (netif_tx_queue_stopped(ring->tx_queue) &&
+ !mlx4_en_is_tx_ring_full(ring)) {
netif_tx_wake_queue(ring->tx_queue);
ring->wake_queue++;
}
@@ -922,8 +927,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff
skb_tx_timestamp(skb);

/* Check available TXBBs And 2K spare for prefetch */
- stop_queue = (int)(ring->prod - ring_cons) >
- ring->size - HEADROOM - MAX_DESC_TXBBS;
+ stop_queue = mlx4_en_is_tx_ring_full(ring);
if (unlikely(stop_queue)) {
netif_tx_stop_queue(ring->tx_queue);
ring->queue_stopped++;
@@ -992,8 +996,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff
smp_rmb();

ring_cons = ACCESS_ONCE(ring->cons);
- if (unlikely(((int)(ring->prod - ring_cons)) <=
- ring->size - HEADROOM - MAX_DESC_TXBBS)) {
+ if (unlikely(!mlx4_en_is_tx_ring_full(ring))) {
netif_tx_wake_queue(ring->tx_queue);
ring->wake_queue++;
}
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -280,6 +280,7 @@ struct mlx4_en_tx_ring {
u32 size; /* number of TXBBs */
u32 size_mask;
u16 stride;
+ u32 full_size;
u16 cqn; /* index of port CQ associated with this ring */
u32 buf_size;
__be32 doorbell_qpn;

2015-07-08 08:16:27

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 18/55] net/mlx4_en: Fix wrong csum complete report when rxvlan offload is disabled

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ido Shamay <[email protected]>

[ Upstream commit 79a258526ce1051cb9684018c25a89d51ac21be8 ]

The check_csum() function relied on hwtstamp_rx_filter to know if rxvlan
offload is disabled. This is wrong since rxvlan offload can be switched
on/off regardless of hwtstamp_rx_filter.

Also moved check_csum to query CQE information to identify VLAN packets
and removed the check of IP packets, since it has been validated before.

Fixes: f8c6455bb04b ('net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE')
Signed-off-by: Ido Shamay <[email protected]>
Signed-off-by: Or Gerlitz <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/mellanox/mlx4/en_rx.c | 17 ++++++-----------
1 file changed, 6 insertions(+), 11 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -723,7 +723,7 @@ static int get_fixed_ipv6_csum(__wsum hw
}
#endif
static int check_csum(struct mlx4_cqe *cqe, struct sk_buff *skb, void *va,
- int hwtstamp_rx_filter)
+ netdev_features_t dev_features)
{
__wsum hw_checksum = 0;

@@ -731,14 +731,8 @@ static int check_csum(struct mlx4_cqe *c

hw_checksum = csum_unfold((__force __sum16)cqe->checksum);

- if (((struct ethhdr *)va)->h_proto == htons(ETH_P_8021Q) &&
- hwtstamp_rx_filter != HWTSTAMP_FILTER_NONE) {
- /* next protocol non IPv4 or IPv6 */
- if (((struct vlan_hdr *)hdr)->h_vlan_encapsulated_proto
- != htons(ETH_P_IP) &&
- ((struct vlan_hdr *)hdr)->h_vlan_encapsulated_proto
- != htons(ETH_P_IPV6))
- return -1;
+ if (cqe->vlan_my_qpn & cpu_to_be32(MLX4_CQE_VLAN_PRESENT_MASK) &&
+ !(dev_features & NETIF_F_HW_VLAN_CTAG_RX)) {
hw_checksum = get_fixed_vlan_csum(hw_checksum, hdr);
hdr += sizeof(struct vlan_hdr);
}
@@ -901,7 +895,8 @@ int mlx4_en_process_rx_cq(struct net_dev

if (ip_summed == CHECKSUM_COMPLETE) {
void *va = skb_frag_address(skb_shinfo(gro_skb)->frags);
- if (check_csum(cqe, gro_skb, va, ring->hwtstamp_rx_filter)) {
+ if (check_csum(cqe, gro_skb, va,
+ dev->features)) {
ip_summed = CHECKSUM_NONE;
ring->csum_none++;
ring->csum_complete--;
@@ -956,7 +951,7 @@ int mlx4_en_process_rx_cq(struct net_dev
}

if (ip_summed == CHECKSUM_COMPLETE) {
- if (check_csum(cqe, skb, skb->data, ring->hwtstamp_rx_filter)) {
+ if (check_csum(cqe, skb, skb->data, dev->features)) {
ip_summed = CHECKSUM_NONE;
ring->csum_complete--;
ring->csum_none++;

2015-07-08 08:16:37

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 19/55] mlx4: Disable HA for SRIOV PF RoCE devices

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Or Gerlitz <[email protected]>

[ Upstream commit 7254acffeeec3c0a75b9c5364c29a6eb00014930 ]

When in HA mode, the driver exposes an IB (RoCE) device instance with only
one port. Under SRIOV, the existing implementation doesn't go well with
the PF RoCE driver's role of Special QPs Para-Virtualization, etc.

As such, disable HA for the mlx4 PF RoCE device in SRIOV mode.

Fixes: a57500903093 ('IB/mlx4: Add port aggregation support')
Signed-off-by: Or Gerlitz <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/mellanox/mlx4/intf.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

--- a/drivers/net/ethernet/mellanox/mlx4/intf.c
+++ b/drivers/net/ethernet/mellanox/mlx4/intf.c
@@ -93,8 +93,14 @@ int mlx4_register_interface(struct mlx4_
mutex_lock(&intf_mutex);

list_add_tail(&intf->list, &intf_list);
- list_for_each_entry(priv, &dev_list, dev_list)
+ list_for_each_entry(priv, &dev_list, dev_list) {
+ if (mlx4_is_mfunc(&priv->dev) && (intf->flags & MLX4_INTFF_BONDING)) {
+ mlx4_dbg(&priv->dev,
+ "SRIOV, disabling HA mode for intf proto %d\n", intf->protocol);
+ intf->flags &= ~MLX4_INTFF_BONDING;
+ }
mlx4_add_device(intf, priv);
+ }

mutex_unlock(&intf_mutex);


2015-07-08 08:16:01

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 20/55] net: phy: fix phy link up when limiting speed via device tree

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mugunthan V N <[email protected]>

[ Upstream commit eb686231fce3770299760f24fdcf5ad041f44153 ]

When limiting phy link speed using "max-speed" to 100mbps or less on a
giga bit phy, phy never completes auto negotiation and phy state
machine is held in PHY_AN. Fixing this issue by comparing the giga
bit advertise though phydev->supported doesn't have it but phy has
BMSR_ESTATEN set. So that auto negotiation is restarted as old and
new advertise are different and link comes up fine.

Signed-off-by: Mugunthan V N <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/phy/phy_device.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -796,10 +796,11 @@ static int genphy_config_advert(struct p
if (phydev->supported & (SUPPORTED_1000baseT_Half |
SUPPORTED_1000baseT_Full)) {
adv |= ethtool_adv_to_mii_ctrl1000_t(advertise);
- if (adv != oldadv)
- changed = 1;
}

+ if (adv != oldadv)
+ changed = 1;
+
err = phy_write(phydev, MII_CTRL1000, adv);
if (err < 0)
return err;

2015-07-08 08:13:16

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 21/55] bnx2x: fix lockdep splat

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <[email protected]>

[ Upstream commit d53c66a5b80698620f7c9ba2372fff4017e987b8 ]

Michel reported following lockdep splat

[ 44.718117] INFO: trying to register non-static key.
[ 44.723081] the code is fine but needs lockdep annotation.
[ 44.728559] turning off the locking correctness validator.
[ 44.734036] CPU: 8 PID: 5483 Comm: ethtool Not tainted 4.1.0
[ 44.770289] Call Trace:
[ 44.772741] [<ffffffff816eb1cd>] dump_stack+0x4c/0x65
[ 44.777879] [<ffffffff8111d921>] ? console_unlock+0x1f1/0x510
[ 44.783708] [<ffffffff811121f5>] __lock_acquire+0x1d05/0x1f10
[ 44.789538] [<ffffffff8111370a>] ? mark_held_locks+0x6a/0x90
[ 44.795276] [<ffffffff81113835>] ? trace_hardirqs_on_caller+0x105/0x1d0
[ 44.801967] [<ffffffff8111390d>] ? trace_hardirqs_on+0xd/0x10
[ 44.807793] [<ffffffff811330fa>] ? hrtimer_try_to_cancel+0x4a/0x250
[ 44.814142] [<ffffffff81112ba6>] lock_acquire+0xb6/0x290
[ 44.819537] [<ffffffff810d6675>] ? flush_work+0x5/0x280
[ 44.824844] [<ffffffff810d66ad>] flush_work+0x3d/0x280
[ 44.830061] [<ffffffff810d6675>] ? flush_work+0x5/0x280
[ 44.835366] [<ffffffff816f3c43>] ? schedule_hrtimeout_range+0x13/0x20
[ 44.841889] [<ffffffff8112ec9b>] ? usleep_range+0x4b/0x50
[ 44.847365] [<ffffffff8111370a>] ? mark_held_locks+0x6a/0x90
[ 44.853102] [<ffffffff810d8585>] ? __cancel_work_timer+0x105/0x1c0
[ 44.859359] [<ffffffff81113835>] ? trace_hardirqs_on_caller+0x105/0x1d0
[ 44.866045] [<ffffffff810d851f>] __cancel_work_timer+0x9f/0x1c0
[ 44.872048] [<ffffffffa0010982>] ? bnx2x_func_stop+0x42/0x90 [bnx2x]
[ 44.878481] [<ffffffff810d8670>] cancel_work_sync+0x10/0x20
[ 44.884134] [<ffffffffa00259e5>] bnx2x_chip_cleanup+0x245/0x730 [bnx2x]
[ 44.890829] [<ffffffff8110ce02>] ? up+0x32/0x50
[ 44.895439] [<ffffffff811306b5>] ? del_timer_sync+0x5/0xd0
[ 44.901005] [<ffffffffa005596d>] bnx2x_nic_unload+0x20d/0x8e0 [bnx2x]
[ 44.907527] [<ffffffff811f1aef>] ? might_fault+0x5f/0xb0
[ 44.912921] [<ffffffffa005851c>] bnx2x_reload_if_running+0x2c/0x50 [bnx2x]
[ 44.919879] [<ffffffffa005a3c5>] bnx2x_set_ringparam+0x2b5/0x460 [bnx2x]
[ 44.926664] [<ffffffff815d498b>] dev_ethtool+0x55b/0x1c40
[ 44.932148] [<ffffffff815dfdc7>] ? rtnl_lock+0x17/0x20
[ 44.937364] [<ffffffff815e7f8b>] dev_ioctl+0x17b/0x630
[ 44.942582] [<ffffffff815abf8d>] sock_do_ioctl+0x5d/0x70
[ 44.947972] [<ffffffff815ac013>] sock_ioctl+0x73/0x280
[ 44.953192] [<ffffffff8124c1c8>] do_vfs_ioctl+0x88/0x5b0
[ 44.958587] [<ffffffff8110d0b3>] ? up_read+0x23/0x40
[ 44.963631] [<ffffffff812584cc>] ? __fget_light+0x6c/0xa0
[ 44.969105] [<ffffffff8124c781>] SyS_ioctl+0x91/0xb0
[ 44.974149] [<ffffffff816f4dd7>] system_call_fastpath+0x12/0x6f

As bnx2x_init_ptp() is only called if bp->flags contains PTP_SUPPORTED,
we also need to guard bnx2x_stop_ptp() with same condition, otherwise
ptp_task workqueue is not initialized and kernel barfs on
cancel_work_sync()

Fixes: eeed018cbfa30 ("bnx2x: Add timestamping and PTP hardware clock support")
Reported-by: Michel Lespinasse <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Michal Kalderon <[email protected]>
Cc: Ariel Elior <[email protected]>
Cc: Yuval Mintz <[email protected]>
Cc: David Decotigny <[email protected]>
Acked-by: Sony Chacko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -9323,7 +9323,8 @@ unload_error:
* function stop ramrod is sent, since as part of this ramrod FW access
* PTP registers.
*/
- bnx2x_stop_ptp(bp);
+ if (bp->flags & PTP_SUPPORTED)
+ bnx2x_stop_ptp(bp);

/* Disable HW interrupts, NAPI */
bnx2x_netif_stop(bp, 1);

2015-07-08 08:13:08

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 22/55] sctp: Fix race between OOTB responce and route removal

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Alexander Sverdlin <[email protected]>

[ Upstream commit 29c4afc4e98f4dc0ea9df22c631841f9c220b944 ]

There is NULL pointer dereference possible during statistics update if the route
used for OOTB responce is removed at unfortunate time. If the route exists when
we receive OOTB packet and we finally jump into sctp_packet_transmit() to send
ABORT, but in the meantime route is removed under our feet, we take "no_route"
path and try to update stats with IP_INC_STATS(sock_net(asoc->base.sk), ...).

But sctp_ootb_pkt_new() used to prepare responce packet doesn't call
sctp_transport_set_owner() and therefore there is no asoc associated with this
packet. Probably temporary asoc just for OOTB responces is overkill, so just
introduce a check like in all other places in sctp_packet_transmit(), where
"asoc" is dereferenced.

To reproduce this, one needs to
0. ensure that sctp module is loaded (otherwise ABORT is not generated)
1. remove default route on the machine
2. while true; do
ip route del [interface-specific route]
ip route add [interface-specific route]
done
3. send enough OOTB packets (i.e. HB REQs) from another host to trigger ABORT
responce

On x86_64 the crash looks like this:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp]
PGD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: ...
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 4.0.5-1-ARCH #1
Hardware name: ...
task: ffffffff818124c0 ti: ffffffff81800000 task.ti: ffffffff81800000
RIP: 0010:[<ffffffffa05ec9ac>] [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp]
RSP: 0018:ffff880127c037b8 EFLAGS: 00010296
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000015ff66b480
RDX: 00000015ff66b400 RSI: ffff880127c17200 RDI: ffff880123403700
RBP: ffff880127c03888 R08: 0000000000017200 R09: ffffffff814625af
R10: ffffea00047e4680 R11: 00000000ffffff80 R12: ffff8800b0d38a28
R13: ffff8800b0d38a28 R14: ffff8800b3e88000 R15: ffffffffa05f24e0
FS: 0000000000000000(0000) GS:ffff880127c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 00000000c855b000 CR4: 00000000000007f0
Stack:
ffff880127c03910 ffff8800b0d38a28 ffffffff8189d240 ffff88011f91b400
ffff880127c03828 ffffffffa05c94c5 0000000000000000 ffff8800baa1c520
0000000000000000 0000000000000001 0000000000000000 0000000000000000
Call Trace:
<IRQ>
[<ffffffffa05c94c5>] ? sctp_sf_tabort_8_4_8.isra.20+0x85/0x140 [sctp]
[<ffffffffa05d6b42>] ? sctp_transport_put+0x52/0x80 [sctp]
[<ffffffffa05d0bfc>] sctp_do_sm+0xb8c/0x19a0 [sctp]
[<ffffffff810b0e00>] ? trigger_load_balance+0x90/0x210
[<ffffffff810e0329>] ? update_process_times+0x59/0x60
[<ffffffff812c7a40>] ? timerqueue_add+0x60/0xb0
[<ffffffff810e0549>] ? enqueue_hrtimer+0x29/0xa0
[<ffffffff8101f599>] ? read_tsc+0x9/0x10
[<ffffffff8116d4b5>] ? put_page+0x55/0x60
[<ffffffff810ee1ad>] ? clockevents_program_event+0x6d/0x100
[<ffffffff81462b68>] ? skb_free_head+0x58/0x80
[<ffffffffa029a10b>] ? chksum_update+0x1b/0x27 [crc32c_generic]
[<ffffffff81283f3e>] ? crypto_shash_update+0xce/0xf0
[<ffffffffa05d3993>] sctp_endpoint_bh_rcv+0x113/0x280 [sctp]
[<ffffffffa05dd4e6>] sctp_inq_push+0x46/0x60 [sctp]
[<ffffffffa05ed7a0>] sctp_rcv+0x880/0x910 [sctp]
[<ffffffffa05ecb50>] ? sctp_packet_transmit_chunk+0xb0/0xb0 [sctp]
[<ffffffffa05ecb70>] ? sctp_csum_update+0x20/0x20 [sctp]
[<ffffffff814b05a5>] ? ip_route_input_noref+0x235/0xd30
[<ffffffff81051d6b>] ? ack_ioapic_level+0x7b/0x150
[<ffffffff814b27be>] ip_local_deliver_finish+0xae/0x210
[<ffffffff814b2e15>] ip_local_deliver+0x35/0x90
[<ffffffff814b2a15>] ip_rcv_finish+0xf5/0x370
[<ffffffff814b3128>] ip_rcv+0x2b8/0x3a0
[<ffffffff81474193>] __netif_receive_skb_core+0x763/0xa50
[<ffffffff81476c28>] __netif_receive_skb+0x18/0x60
[<ffffffff81476cb0>] netif_receive_skb_internal+0x40/0xd0
[<ffffffff814776c8>] napi_gro_receive+0xe8/0x120
[<ffffffffa03946aa>] rtl8169_poll+0x2da/0x660 [r8169]
[<ffffffff8147896a>] net_rx_action+0x21a/0x360
[<ffffffff81078dc1>] __do_softirq+0xe1/0x2d0
[<ffffffff8107912d>] irq_exit+0xad/0xb0
[<ffffffff8157d158>] do_IRQ+0x58/0xf0
[<ffffffff8157b06d>] common_interrupt+0x6d/0x6d
<EOI>
[<ffffffff810e1218>] ? hrtimer_start+0x18/0x20
[<ffffffffa05d65f9>] ? sctp_transport_destroy_rcu+0x29/0x30 [sctp]
[<ffffffff81020c50>] ? mwait_idle+0x60/0xa0
[<ffffffff810216ef>] arch_cpu_idle+0xf/0x20
[<ffffffff810b731c>] cpu_startup_entry+0x3ec/0x480
[<ffffffff8156b365>] rest_init+0x85/0x90
[<ffffffff818eb035>] start_kernel+0x48b/0x4ac
[<ffffffff818ea120>] ? early_idt_handlers+0x120/0x120
[<ffffffff818ea339>] x86_64_start_reservations+0x2a/0x2c
[<ffffffff818ea49c>] x86_64_start_kernel+0x161/0x184
Code: 90 48 8b 80 b8 00 00 00 48 89 85 70 ff ff ff 48 83 bd 70 ff ff ff 00 0f 85 cd fa ff ff 48 89 df 31 db e8 18 63 e7 e0 48 8b 45 80 <48> 8b 40 20 48 8b 40 30 48 8b 80 68 01 00 00 65 48 ff 40 78 e9
RIP [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp]
RSP <ffff880127c037b8>
CR2: 0000000000000020
---[ end trace 5aec7fd2dc983574 ]---
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
drm_kms_helper: panic occurred, switching back to text console
---[ end Kernel panic - not syncing: Fatal exception in interrupt

Signed-off-by: Alexander Sverdlin <[email protected]>
Acked-by: Neil Horman <[email protected]>
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Acked-by: Vlad Yasevich <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/sctp/output.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -599,7 +599,9 @@ out:
return err;
no_route:
kfree_skb(nskb);
- IP_INC_STATS(sock_net(asoc->base.sk), IPSTATS_MIB_OUTNOROUTES);
+
+ if (asoc)
+ IP_INC_STATS(sock_net(asoc->base.sk), IPSTATS_MIB_OUTNOROUTES);

/* FIXME: Returning the 'err' will effect all the associations
* associated with a socket, although only one of the paths of the

2015-07-08 08:12:58

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 23/55] amd-xgbe: Add the __GFP_NOWARN flag to Rx buffer allocation

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Tom Lendacky <[email protected]>

[ Upstream commit 472cfe7127760d68b819cf35a26e5a1b44b30f4e ]

When allocating Rx related buffers, alloc_pages is called using an order
number that is decreased until successful. A system under stress can
experience failures during this allocation process resulting in a warning
being issued. This message can be of concern to end users even though the
failure is not fatal. Since the failure is not fatal and can occur
multiple times, the driver should include the __GFP_NOWARN flag to
suppress the warning message from being issued.

Signed-off-by: Tom Lendacky <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/amd/xgbe/xgbe-desc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/ethernet/amd/xgbe/xgbe-desc.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-desc.c
@@ -263,7 +263,7 @@ static int xgbe_alloc_pages(struct xgbe_
int ret;

/* Try to obtain pages, decreasing order if necessary */
- gfp |= __GFP_COLD | __GFP_COMP;
+ gfp |= __GFP_COLD | __GFP_COMP | __GFP_NOWARN;
while (order >= 0) {
pages = alloc_pages(gfp, order);
if (pages)

2015-07-08 08:12:38

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 24/55] net: mvneta: introduce compatible string "marvell, armada-xp-neta"

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Simon Guinot <[email protected]>

[ Upstream commit f522a975a8101895a85354b9c143f41b8248e71a ]

The mvneta driver supports the Ethernet IP found in the Armada 370, XP,
380 and 385 SoCs. Since at least one more hardware feature is available
for the Armada XP SoCs then a way to identify them is needed.

This patch introduces a new compatible string "marvell,armada-xp-neta".

Signed-off-by: Simon Guinot <[email protected]>
Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Cc: <[email protected]> # v3.8+
Acked-by: Gregory CLEMENT <[email protected]>
Acked-by: Thomas Petazzoni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt | 2 +-
drivers/net/ethernet/marvell/mvneta.c | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)

--- a/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt
+++ b/Documentation/devicetree/bindings/net/marvell-armada-370-neta.txt
@@ -1,7 +1,7 @@
* Marvell Armada 370 / Armada XP Ethernet Controller (NETA)

Required properties:
-- compatible: should be "marvell,armada-370-neta".
+- compatible: "marvell,armada-370-neta" or "marvell,armada-xp-neta".
- reg: address and length of the register set for the device.
- interrupts: interrupt for the device
- phy: See ethernet.txt file in the same directory.
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -3095,6 +3095,7 @@ static int mvneta_remove(struct platform

static const struct of_device_id mvneta_match[] = {
{ .compatible = "marvell,armada-370-neta" },
+ { .compatible = "marvell,armada-xp-neta" },
{ }
};
MODULE_DEVICE_TABLE(of, mvneta_match);

2015-07-08 08:12:29

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 25/55] ARM: mvebu: update Ethernet compatible string for Armada XP

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Simon Guinot <[email protected]>

[ Upstream commit ea3b55fe83b5fcede82d183164b9d6831b26e33b ]

This patch updates the Ethernet DT nodes for Armada XP SoCs with the
compatible string "marvell,armada-xp-neta".

Signed-off-by: Simon Guinot <[email protected]>
Fixes: 77916519cba3 ("arm: mvebu: Armada XP MV78230 has only three Ethernet interfaces")
Cc: <[email protected]> # v3.8+
Acked-by: Gregory CLEMENT <[email protected]>
Reviewed-by: Thomas Petazzoni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/arm/boot/dts/armada-370-xp.dtsi | 2 --
arch/arm/boot/dts/armada-370.dtsi | 8 ++++++++
arch/arm/boot/dts/armada-xp-mv78260.dtsi | 2 +-
arch/arm/boot/dts/armada-xp-mv78460.dtsi | 2 +-
arch/arm/boot/dts/armada-xp.dtsi | 10 +++++++++-
5 files changed, 19 insertions(+), 5 deletions(-)

--- a/arch/arm/boot/dts/armada-370-xp.dtsi
+++ b/arch/arm/boot/dts/armada-370-xp.dtsi
@@ -265,7 +265,6 @@
};

eth0: ethernet@70000 {
- compatible = "marvell,armada-370-neta";
reg = <0x70000 0x4000>;
interrupts = <8>;
clocks = <&gateclk 4>;
@@ -281,7 +280,6 @@
};

eth1: ethernet@74000 {
- compatible = "marvell,armada-370-neta";
reg = <0x74000 0x4000>;
interrupts = <10>;
clocks = <&gateclk 3>;
--- a/arch/arm/boot/dts/armada-370.dtsi
+++ b/arch/arm/boot/dts/armada-370.dtsi
@@ -306,6 +306,14 @@
dmacap,memset;
};
};
+
+ ethernet@70000 {
+ compatible = "marvell,armada-370-neta";
+ };
+
+ ethernet@74000 {
+ compatible = "marvell,armada-370-neta";
+ };
};
};
};
--- a/arch/arm/boot/dts/armada-xp-mv78260.dtsi
+++ b/arch/arm/boot/dts/armada-xp-mv78260.dtsi
@@ -319,7 +319,7 @@
};

eth3: ethernet@34000 {
- compatible = "marvell,armada-370-neta";
+ compatible = "marvell,armada-xp-neta";
reg = <0x34000 0x4000>;
interrupts = <14>;
clocks = <&gateclk 1>;
--- a/arch/arm/boot/dts/armada-xp-mv78460.dtsi
+++ b/arch/arm/boot/dts/armada-xp-mv78460.dtsi
@@ -357,7 +357,7 @@
};

eth3: ethernet@34000 {
- compatible = "marvell,armada-370-neta";
+ compatible = "marvell,armada-xp-neta";
reg = <0x34000 0x4000>;
interrupts = <14>;
clocks = <&gateclk 1>;
--- a/arch/arm/boot/dts/armada-xp.dtsi
+++ b/arch/arm/boot/dts/armada-xp.dtsi
@@ -175,7 +175,7 @@
};

eth2: ethernet@30000 {
- compatible = "marvell,armada-370-neta";
+ compatible = "marvell,armada-xp-neta";
reg = <0x30000 0x4000>;
interrupts = <12>;
clocks = <&gateclk 2>;
@@ -218,6 +218,14 @@
};
};

+ ethernet@70000 {
+ compatible = "marvell,armada-xp-neta";
+ };
+
+ ethernet@74000 {
+ compatible = "marvell,armada-xp-neta";
+ };
+
xor@f0900 {
compatible = "marvell,orion-xor";
reg = <0xF0900 0x100

2015-07-08 08:12:45

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 26/55] net: mvneta: disable IP checksum with jumbo frames for Armada 370

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Simon Guinot <[email protected]>

[ Upstream commit b65657fc240ae6c1d2a1e62db9a0e61ac9631d7a ]

The Ethernet controller found in the Armada 370, 380 and 385 SoCs don't
support TCP/IP checksumming with frame sizes larger than 1600 bytes.

This patch fixes the issue by disabling the features NETIF_F_IP_CSUM and
NETIF_F_TSO for the Armada 370 and compatibles SoCs when the MTU is set
to a value greater than 1600 bytes.

Signed-off-by: Simon Guinot <[email protected]>
Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Cc: <[email protected]> # v3.8+
Acked-by: Thomas Petazzoni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/marvell/mvneta.c | 26 +++++++++++++++++++++++++-
1 file changed, 25 insertions(+), 1 deletion(-)

--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -304,6 +304,7 @@ struct mvneta_port {
unsigned int link;
unsigned int duplex;
unsigned int speed;
+ unsigned int tx_csum_limit;
};

/* The mvneta_tx_desc and mvneta_rx_desc structures describe the
@@ -2441,8 +2442,10 @@ static int mvneta_change_mtu(struct net_

dev->mtu = mtu;

- if (!netif_running(dev))
+ if (!netif_running(dev)) {
+ netdev_update_features(dev);
return 0;
+ }

/* The interface is running, so we have to force a
* reallocation of the queues
@@ -2471,9 +2474,26 @@ static int mvneta_change_mtu(struct net_
mvneta_start_dev(pp);
mvneta_port_up(pp);

+ netdev_update_features(dev);
+
return 0;
}

+static netdev_features_t mvneta_fix_features(struct net_device *dev,
+ netdev_features_t features)
+{
+ struct mvneta_port *pp = netdev_priv(dev);
+
+ if (pp->tx_csum_limit && dev->mtu > pp->tx_csum_limit) {
+ features &= ~(NETIF_F_IP_CSUM | NETIF_F_TSO);
+ netdev_info(dev,
+ "Disable IP checksum for MTU greater than %dB\n",
+ pp->tx_csum_limit);
+ }
+
+ return features;
+}
+
/* Get mac address */
static void mvneta_get_mac_addr(struct mvneta_port *pp, unsigned char *addr)
{
@@ -2785,6 +2805,7 @@ static const struct net_device_ops mvnet
.ndo_set_rx_mode = mvneta_set_rx_mode,
.ndo_set_mac_address = mvneta_set_mac_addr,
.ndo_change_mtu = mvneta_change_mtu,
+ .ndo_fix_features = mvneta_fix_features,
.ndo_get_stats64 = mvneta_get_stats64,
.ndo_do_ioctl = mvneta_ioctl,
};
@@ -3023,6 +3044,9 @@ static int mvneta_probe(struct platform_
}
}

+ if (of_device_is_compatible(dn, "marvell,armada-370-neta"))
+ pp->tx_csum_limit = 1600;
+
pp->tx_ring_size = MVNETA_MAX_TXD;
pp->rx_ring_size = MVNETA_MAX_RXD;


2015-07-08 08:11:42

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 27/55] usb: gadget: f_fs: fix check in read operation

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Rui Miguel Silva <[email protected]>

commit 342f39a6c8d34d638a87b7d5f2156adc4db2585c upstream.

when copying to iter the size can be different then the iov count,
the check for full iov is wrong and make any read on request which
is not the exactly size of iov to return -EFAULT.

So, just check the success of the copy.

Signed-off-by: Rui Miguel Silva <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/usb/gadget/function/f_fs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -845,7 +845,7 @@ static ssize_t ffs_epfile_io(struct file
ret = ep->status;
if (io_data->read && ret > 0) {
ret = copy_to_iter(data, ret, &io_data->data);
- if (unlikely(iov_iter_count(&io_data->data)))
+ if (!ret)
ret = -EFAULT;
}
}

2015-07-08 08:11:50

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 28/55] usb: gadget: f_fs: add extra check before unregister_gadget_item

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Rui Miguel Silva <[email protected]>

commit f14e9ad17f46051b02bffffac2036486097de19e upstream.

ffs_closed can race with configfs_rmdir which will call config_item_release, so
add an extra check to avoid calling the unregister_gadget_item with an null
gadget item.

Signed-off-by: Rui Miguel Silva <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/usb/gadget/function/f_fs.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -3433,6 +3433,7 @@ done:
static void ffs_closed(struct ffs_data *ffs)
{
struct ffs_dev *ffs_obj;
+ struct f_fs_opts *opts;

ENTER();
ffs_dev_lock();
@@ -3446,8 +3447,13 @@ static void ffs_closed(struct ffs_data *
if (ffs_obj->ffs_closed_callback)
ffs_obj->ffs_closed_callback(ffs);

- if (!ffs_obj->opts || ffs_obj->opts->no_configfs
- || !ffs_obj->opts->func_inst.group.cg_item.ci_parent)
+ if (ffs_obj->opts)
+ opts = ffs_obj->opts;
+ else
+ goto done;
+
+ if (opts->no_configfs || !opts->func_inst.group.cg_item.ci_parent
+ || !atomic_read(&opts->func_inst.group.cg_item.ci_kref.refcount))
goto done;

unregister_gadget_item(ffs_obj->opts->

2015-07-08 08:12:06

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 29/55] crypto: talitos - avoid memleak in talitos_alg_alloc()

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Horia Geant? <[email protected]>

commit 5fa7dadc898567ce14d6d6d427e7bd8ce6eb5d39 upstream.

Fixes: 1d11911a8c57 ("crypto: talitos - fix warning: 'alg' may be used uninitialized in this function")
Signed-off-by: Horia Geanta <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/crypto/talitos.c | 1 +
1 file changed, 1 insertion(+)

--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -2563,6 +2563,7 @@ static struct talitos_crypto_alg *talito
break;
default:
dev_err(dev, "unknown algorithm type %d\n", t_alg->algt.type);
+ kfree(t_alg);
return ERR_PTR(-EINVAL);
}


2015-07-08 08:10:55

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 30/55] Revert "crypto: talitos - convert to use be16_add_cpu()"

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Horia Geant? <[email protected]>

commit 69d9cd8c592f1abce820dbce7181bbbf6812cfbd upstream.

This reverts commit 7291a932c6e27d9768e374e9d648086636daf61c.

The conversion to be16_add_cpu() is incorrect in case cryptlen is
negative due to premature (i.e. before addition / subtraction)
implicit conversion of cryptlen (int -> u16) leading to sign loss.

Cc: Wei Yongjun <[email protected]>
Signed-off-by: Horia Geanta <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/crypto/talitos.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -927,7 +927,8 @@ static int sg_to_link_tbl(struct scatter
sg_count--;
link_tbl_ptr--;
}
- be16_add_cpu(&link_tbl_ptr->len, cryptlen);
+ link_tbl_ptr->len = cpu_to_be16(be16_to_cpu(link_tbl_ptr->len)
+ + cryptlen);

/* tag end of link table */
link_tbl_ptr->j_extent = DESC_PTR_LNKTBL_RETURN;

2015-07-08 08:11:04

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 31/55] iommu/arm-smmu: Fix broken ATOS check

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Will Deacon <[email protected]>

commit d38f0ff9ab35414644995bae187d015c31aae19c upstream.

Commit 83a60ed8f0b5 ("iommu/arm-smmu: fix ARM_SMMU_FEAT_TRANS_OPS
condition") accidentally negated the ID0_ATOSNS predicate in the ATOS
feature check, causing the driver to attempt ATOS requests on SMMUv2
hardware without the ATOS feature implemented.

This patch restores the predicate to the correct value.

Reported-by: Varun Sethi <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/iommu/arm-smmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1533,7 +1533,7 @@ static int arm_smmu_device_cfg_probe(str
return -ENODEV;
}

- if ((id & ID0_S1TS) && ((smmu->version == 1) || (id & ID0_ATOSNS))) {
+ if ((id & ID0_S1TS) && ((smmu->version == 1) || !(id & ID0_ATOSNS))) {
smmu->features |= ARM_SMMU_FEAT_TRANS_OPS;
dev_notice(smmu->dev, "\taddress translation ops\n");
}

2015-07-08 08:08:57

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 32/55] iommu/amd: Handle large pages correctly in free_pagetable

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Joerg Roedel <[email protected]>

commit 0b3fff54bc01e8e6064d222a33e6fa7adabd94cd upstream.

Make sure that we are skipping over large PTEs while walking
the page-table tree.

Fixes: 5c34c403b723 ("iommu/amd: Fix memory leak in free_pagetable")
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/iommu/amd_iommu.c | 6 ++++++
1 file changed, 6 insertions(+)

--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1870,9 +1870,15 @@ static void free_pt_##LVL (unsigned long
pt = (u64 *)__pt; \
\
for (i = 0; i < 512; ++i) { \
+ /* PTE present? */ \
if (!IOMMU_PTE_PRESENT(pt[i])) \
continue; \
\
+ /* Large PTE? */ \
+ if (PM_PTE_LEVEL(pt[i]) == 0 || \
+ PM_PTE_LEVEL(pt[i]) == 7) \
+ continue; \
+ \
p = (unsigned long)IOMMU_PTE_PAGE(pt[i]); \
FN(p); \
} \

2015-07-08 08:08:29

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 33/55] mmc: sdhci: fix low memory corruption

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jiri Slaby <[email protected]>

commit 62a7f368ffbc13d9aedfdd7aeae711b177db69ac upstream.

When dma mapping (dma_map_sg) fails in sdhci_pre_dma_transfer, -EINVAL
is returned. There are 3 callers of sdhci_pre_dma_transfer:
* sdhci_pre_req and sdhci_adma_table_pre: handle negative return
* sdhci_prepare_data: handles 0 (error) and "else" (good) only

sdhci_prepare_data is therefore broken. When it receives -EINVAL from
sdhci_pre_dma_transfer, it assumes 1 sg mapping was mapped. Later,
this non-existent mapping with address 0 is kmap'ped and written to:
Corrupted low memory at ffff880000001000 (1000 phys) = 22b7d67df2f6d1cf
Corrupted low memory at ffff880000001008 (1008 phys) = 63848a5216b7dd95
Corrupted low memory at ffff880000001010 (1010 phys) = 330eb7ddef39e427
Corrupted low memory at ffff880000001018 (1018 phys) = 8017ac7295039bda
Corrupted low memory at ffff880000001020 (1020 phys) = 8ce039eac119074f
...

So teach sdhci_prepare_data to understand negative return values from
sdhci_pre_dma_transfer and disable DMA in that case, as well as for
zero.

It was introduced in 348487cb28e66b032bae1b38424d81bf5b444408 (mmc:
sdhci: use pipeline mmc requests to improve performance). The commit
seems to be suspicious also by assigning host->sg_count both in
sdhci_pre_dma_transfer and sdhci_adma_table_pre.

Signed-off-by: Jiri Slaby <[email protected]>
Fixes: 348487cb28e6
Cc: Ulf Hansson <[email protected]>
Cc: Haibo Chen <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/mmc/host/sdhci.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -846,7 +846,7 @@ static void sdhci_prepare_data(struct sd
int sg_cnt;

sg_cnt = sdhci_pre_dma_transfer(host, data, NULL);
- if (sg_cnt == 0) {
+ if (sg_cnt <= 0) {
/*
* This only happens when someone fed
* us an invalid request.

2015-07-08 08:08:48

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 34/55] intel_pstate: set BYT MSR with wrmsrl_on_cpu()

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Joe Konno <[email protected]>

commit 0dd23f94251f49da99a6cbfb22418b2d757d77d6 upstream.

Commit 007bea098b86 (intel_pstate: Add setting voltage value for
baytrail P states.) introduced byt_set_pstate() with the assumption that
it would always be run by the CPU whose MSR is to be written by it. It
turns out, however, that is not always the case in practice, so modify
byt_set_pstate() to enforce the MSR write done by it to always happen on
the right CPU.

Fixes: 007bea098b86 (intel_pstate: Add setting voltage value for baytrail P states.)
Signed-off-by: Joe Konno <[email protected]>
Acked-by: Kristen Carlson Accardi <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/cpufreq/intel_pstate.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -534,7 +534,7 @@ static void byt_set_pstate(struct cpudat

val |= vid;

- wrmsrl(MSR_IA32_PERF_CTL, val);
+ wrmsrl_on_cpu(cpudata->cpu, MSR_IA32_PERF_CTL, val);
}

#define BYT_BCLK_FREQS 5

2015-07-08 08:08:14

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 35/55] selinux: fix setting of security labels on NFS

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: "J. Bruce Fields" <[email protected]>

commit 9fc2b4b436cff7d8403034676014f1be9d534942 upstream.

Before calling into the filesystem, vfs_setxattr calls
security_inode_setxattr, which ends up calling selinux_inode_setxattr in
our case. That returns -EOPNOTSUPP whenever SBLABEL_MNT is not set.
SBLABEL_MNT was supposed to be set by sb_finish_set_opts, which sets it
only if selinux_is_sblabel_mnt returns true.

The selinux_is_sblabel_mnt logic was broken by eadcabc697e9 "SELinux: do
all flags twiddling in one place", which didn't take into the account
the SECURITY_FS_USE_NATIVE behavior that had been introduced for nfs
with eb9ae686507b "SELinux: Add new labeling type native labels".

This caused setxattr's of security labels over NFSv4.2 to fail.

Cc: Eric Paris <[email protected]>
Cc: David Quigley <[email protected]>
Reported-by: Richard Chan <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
Acked-by: Stephen Smalley <[email protected]>
[PM: added the stable dependency]
Signed-off-by: Paul Moore <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
security/selinux/hooks.c | 1 +
1 file changed, 1 insertion(+)

--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -404,6 +404,7 @@ static int selinux_is_sblabel_mnt(struct
return sbsec->behavior == SECURITY_FS_USE_XATTR ||
sbsec->behavior == SECURITY_FS_USE_TRANS ||
sbsec->behavior == SECURITY_FS_USE_TASK ||
+ sbsec->behavior == SECURITY_FS_USE_NATIVE ||
/* Special handling. Genfs but also in-core setxattr handler */
!strcmp(sb->s_type->name, "sysfs") ||
!strcmp(sb->s_type->name, "pstore") ||

2015-07-08 08:08:21

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 36/55] arm: KVM: force execution of HCPTR access on VM exit

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Marc Zyngier <[email protected]>

commit 85e84ba31039595995dae80b277378213602891b upstream.

On VM entry, we disable access to the VFP registers in order to
perform a lazy save/restore of these registers.

On VM exit, we restore access, test if we did enable them before,
and save/restore the guest/host registers if necessary. In this
sequence, the FPEXC register is always accessed, irrespective
of the trapping configuration.

If the guest didn't touch the VFP registers, then the HCPTR access
has now enabled such access, but we're missing a barrier to ensure
architectural execution of the new HCPTR configuration. If the HCPTR
access has been delayed/reordered, the subsequent access to FPEXC
will cause a trap, which we aren't prepared to handle at all.

The same condition exists when trapping to enable VFP for the guest.

The fix is to introduce a barrier after enabling VFP access. In the
vmexit case, it can be relaxed to only takes place if the guest hasn't
accessed its view of the VFP registers, making the access to FPEXC safe.

The set_hcptr macro is modified to deal with both vmenter/vmexit and
vmtrap operations, and now takes an optional label that is branched to
when the guest hasn't touched the VFP registers.

Reported-by: Vikram Sethi <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/kvm/interrupts.S | 10 ++++------
arch/arm/kvm/interrupts_head.S | 20 ++++++++++++++++++--
2 files changed, 22 insertions(+), 8 deletions(-)

--- a/arch/arm/kvm/interrupts.S
+++ b/arch/arm/kvm/interrupts.S
@@ -170,13 +170,9 @@ __kvm_vcpu_return:
@ Don't trap coprocessor accesses for host kernel
set_hstr vmexit
set_hdcr vmexit
- set_hcptr vmexit, (HCPTR_TTA | HCPTR_TCP(10) | HCPTR_TCP(11))
+ set_hcptr vmexit, (HCPTR_TTA | HCPTR_TCP(10) | HCPTR_TCP(11)), after_vfp_restore

#ifdef CONFIG_VFPv3
- @ Save floating point registers we if let guest use them.
- tst r2, #(HCPTR_TCP(10) | HCPTR_TCP(11))
- bne after_vfp_restore
-
@ Switch VFP/NEON hardware state to the host's
add r7, vcpu, #VCPU_VFP_GUEST
store_vfp_state r7
@@ -188,6 +184,8 @@ after_vfp_restore:
@ Restore FPEXC_EN which we clobbered on entry
pop {r2}
VFPFMXR FPEXC, r2
+#else
+after_vfp_restore:
#endif

@ Reset Hyp-role
@@ -483,7 +481,7 @@ switch_to_guest_vfp:
push {r3-r7}

@ NEON/VFP used. Turn on VFP access.
- set_hcptr vmexit, (HCPTR_TCP(10) | HCPTR_TCP(11))
+ set_hcptr vmtrap, (HCPTR_TCP(10) | HCPTR_TCP(11))

@ Switch VFP/NEON hardware state to the guest's
add r7, r0, #VCPU_VFP_HOST
--- a/arch/arm/kvm/interrupts_head.S
+++ b/arch/arm/kvm/interrupts_head.S
@@ -599,8 +599,13 @@ ARM_BE8(rev r6, r6 )
.endm

/* Configures the HCPTR (Hyp Coprocessor Trap Register) on entry/return
- * (hardware reset value is 0). Keep previous value in r2. */
-.macro set_hcptr operation, mask
+ * (hardware reset value is 0). Keep previous value in r2.
+ * An ISB is emited on vmexit/vmtrap, but executed on vmexit only if
+ * VFP wasn't already enabled (always executed on vmtrap).
+ * If a label is specified with vmexit, it is branched to if VFP wasn't
+ * enabled.
+ */
+.macro set_hcptr operation, mask, label = none
mrc p15, 4, r2, c1, c1, 2
ldr r3, =\mask
.if \operation == vmentry
@@ -609,6 +614,17 @@ ARM_BE8(rev r6, r6 )
bic r3, r2, r3 @ Don't trap defined coproc-accesses
.endif
mcr p15, 4, r3, c1, c1, 2
+ .if \operation != vmentry
+ .if \operation == vmexit
+ tst r2, #(HCPTR_TCP(10) | HCPTR_TCP(11))
+ beq 1f
+ .endif
+ isb
+ .if \label != none
+ b \label
+ .endif
+1:
+ .endif
.endm

/* Configures the HDCR (Hyp Debug Configuration Register) on entry/return

2015-07-08 08:07:55

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 37/55] ARM: kvm: psci: fix handling of unimplemented functions

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Lorenzo Pieralisi <[email protected]>

commit e2d997366dc5b6c9d14035867f73957f93e7578c upstream.

According to the PSCI specification and the SMC/HVC calling
convention, PSCI function_ids that are not implemented must
return NOT_SUPPORTED as return value.

Current KVM implementation takes an unhandled PSCI function_id
as an error and injects an undefined instruction into the guest
if PSCI implementation is called with a function_id that is not
handled by the resident PSCI version (ie it is not implemented),
which is not the behaviour expected by a guest when calling a
PSCI function_id that is not implemented.

This patch fixes this issue by returning NOT_SUPPORTED whenever
the kvm PSCI call is executed for a function_id that is not
implemented by the PSCI kvm layer.

Cc: Christoffer Dall <[email protected]>
Acked-by: Sudeep Holla <[email protected]>
Signed-off-by: Lorenzo Pieralisi <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/kvm/psci.c | 16 +++-------------
1 file changed, 3 insertions(+), 13 deletions(-)

--- a/arch/arm/kvm/psci.c
+++ b/arch/arm/kvm/psci.c
@@ -230,10 +230,6 @@ static int kvm_psci_0_2_call(struct kvm_
case PSCI_0_2_FN64_AFFINITY_INFO:
val = kvm_psci_vcpu_affinity_info(vcpu);
break;
- case PSCI_0_2_FN_MIGRATE:
- case PSCI_0_2_FN64_MIGRATE:
- val = PSCI_RET_NOT_SUPPORTED;
- break;
case PSCI_0_2_FN_MIGRATE_INFO_TYPE:
/*
* Trusted OS is MP hence does not require migration
@@ -242,10 +238,6 @@ static int kvm_psci_0_2_call(struct kvm_
*/
val = PSCI_0_2_TOS_MP;
break;
- case PSCI_0_2_FN_MIGRATE_INFO_UP_CPU:
- case PSCI_0_2_FN64_MIGRATE_INFO_UP_CPU:
- val = PSCI_RET_NOT_SUPPORTED;
- break;
case PSCI_0_2_FN_SYSTEM_OFF:
kvm_psci_system_off(vcpu);
/*
@@ -271,7 +263,8 @@ static int kvm_psci_0_2_call(struct kvm_
ret = 0;
break;
default:
- return -EINVAL;
+ val = PSCI_RET_NOT_SUPPORTED;
+ break;
}

*vcpu_reg(vcpu, 0) = val;
@@ -291,12 +284,9 @@ static int kvm_psci_0_1_call(struct kvm_
case KVM_PSCI_FN_CPU_ON:
val = kvm_psci_vcpu_on(vcpu);
break;
- case KVM_PSCI_FN_CPU_SUSPEND:
- case KVM_PSCI_FN_MIGRATE:
+ default:
val = PSCI_RET_NOT_SUPPORTED;
break;
- default:
- return -EINVAL;
}

*vcpu_reg(vcpu, 0) = val;

2015-07-08 08:07:46

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 38/55] ARM: tegra20: Store CPU "resettable" status in IRAM

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dmitry Osipenko <[email protected]>

commit 4d48edb3c3e1234d6b3fcdfb9ac24d7c6de449cb upstream.

Commit 7232398abc6a ("ARM: tegra: Convert PMC to a driver") changed tegra_resume()
location storing from late to early and, as a result, broke suspend on Tegra20.
PMC scratch register 41 is used by tegra LP1 resume code for retrieving stored
physical memory address of common resume function and in the same time used by
tegra20_cpu_shutdown() (shared by Tegra20 cpuidle driver and platform SMP code),
which is storing CPU1 "resettable" status. It implies strict order of scratch
register usage, otherwise resume function address is lost on Tegra20 after
disabling non-boot CPU's on suspend. Fix it by storing "resettable" status in
IRAM instead of PMC scratch register.

Signed-off-by: Dmitry Osipenko <[email protected]>
Fixes: 7232398abc6a (ARM: tegra: Convert PMC to a driver)
Signed-off-by: Thierry Reding <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/mach-tegra/cpuidle-tegra20.c | 5 +---
arch/arm/mach-tegra/reset-handler.S | 10 ++++++---
arch/arm/mach-tegra/reset.h | 4 +++
arch/arm/mach-tegra/sleep-tegra20.S | 37 +++++++++++++++++++---------------
arch/arm/mach-tegra/sleep.h | 4 +++
5 files changed, 38 insertions(+), 22 deletions(-)

--- a/arch/arm/mach-tegra/cpuidle-tegra20.c
+++ b/arch/arm/mach-tegra/cpuidle-tegra20.c
@@ -35,6 +35,7 @@
#include "iomap.h"
#include "irq.h"
#include "pm.h"
+#include "reset.h"
#include "sleep.h"

#ifdef CONFIG_PM_SLEEP
@@ -71,15 +72,13 @@ static struct cpuidle_driver tegra_idle_

#ifdef CONFIG_PM_SLEEP
#ifdef CONFIG_SMP
-static void __iomem *pmc = IO_ADDRESS(TEGRA_PMC_BASE);
-
static int tegra20_reset_sleeping_cpu_1(void)
{
int ret = 0;

tegra_pen_lock();

- if (readl(pmc + PMC_SCRATCH41) == CPU_RESETTABLE)
+ if (readb(tegra20_cpu1_resettable_status) == CPU_RESETTABLE)
tegra20_cpu_shutdown(1);
else
ret = -EINVAL;
--- a/arch/arm/mach-tegra/reset-handler.S
+++ b/arch/arm/mach-tegra/reset-handler.S
@@ -169,10 +169,10 @@ after_errata:
cmp r6, #TEGRA20
bne 1f
/* If not CPU0, don't let CPU0 reset CPU1 now that CPU1 is coming up. */
- mov32 r5, TEGRA_PMC_BASE
- mov r0, #0
+ mov32 r5, TEGRA_IRAM_BASE + TEGRA_IRAM_RESET_HANDLER_OFFSET
+ mov r0, #CPU_NOT_RESETTABLE
cmp r10, #0
- strne r0, [r5, #PMC_SCRATCH41]
+ strneb r0, [r5, #__tegra20_cpu1_resettable_status_offset]
1:
#endif

@@ -281,6 +281,10 @@ __tegra_cpu_reset_handler_data:
.rept TEGRA_RESET_DATA_SIZE
.long 0
.endr
+ .globl __tegra20_cpu1_resettable_status_offset
+ .equ __tegra20_cpu1_resettable_status_offset, \
+ . - __tegra_cpu_reset_handler_start
+ .byte 0
.align L1_CACHE_SHIFT

ENTRY(__tegra_cpu_reset_handler_end)
--- a/arch/arm/mach-tegra/reset.h
+++ b/arch/arm/mach-tegra/reset.h
@@ -35,6 +35,7 @@ extern unsigned long __tegra_cpu_reset_h

void __tegra_cpu_reset_handler_start(void);
void __tegra_cpu_reset_handler(void);
+void __tegra20_cpu1_resettable_status_offset(void);
void __tegra_cpu_reset_handler_end(void);
void tegra_secondary_startup(void);

@@ -47,6 +48,9 @@ void tegra_secondary_startup(void);
(IO_ADDRESS(TEGRA_IRAM_BASE + TEGRA_IRAM_RESET_HANDLER_OFFSET + \
((u32)&__tegra_cpu_reset_handler_data[TEGRA_RESET_MASK_LP2] - \
(u32)__tegra_cpu_reset_handler_start)))
+#define tegra20_cpu1_resettable_status \
+ (IO_ADDRESS(TEGRA_IRAM_BASE + TEGRA_IRAM_RESET_HANDLER_OFFSET + \
+ (u32)__tegra20_cpu1_resettable_status_offset))
#endif

#define tegra_cpu_reset_handler_offset \
--- a/arch/arm/mach-tegra/sleep-tegra20.S
+++ b/arch/arm/mach-tegra/sleep-tegra20.S
@@ -97,9 +97,10 @@ ENDPROC(tegra20_hotplug_shutdown)
ENTRY(tegra20_cpu_shutdown)
cmp r0, #0
reteq lr @ must not be called for CPU 0
- mov32 r1, TEGRA_PMC_VIRT + PMC_SCRATCH41
+ mov32 r1, TEGRA_IRAM_RESET_BASE_VIRT
+ ldr r2, =__tegra20_cpu1_resettable_status_offset
mov r12, #CPU_RESETTABLE
- str r12, [r1]
+ strb r12, [r1, r2]

cpu_to_halt_reg r1, r0
ldr r3, =TEGRA_FLOW_CTRL_VIRT
@@ -182,38 +183,41 @@ ENDPROC(tegra_pen_unlock)
/*
* tegra20_cpu_clear_resettable(void)
*
- * Called to clear the "resettable soon" flag in PMC_SCRATCH41 when
+ * Called to clear the "resettable soon" flag in IRAM variable when
* it is expected that the secondary CPU will be idle soon.
*/
ENTRY(tegra20_cpu_clear_resettable)
- mov32 r1, TEGRA_PMC_VIRT + PMC_SCRATCH41
+ mov32 r1, TEGRA_IRAM_RESET_BASE_VIRT
+ ldr r2, =__tegra20_cpu1_resettable_status_offset
mov r12, #CPU_NOT_RESETTABLE
- str r12, [r1]
+ strb r12, [r1, r2]
ret lr
ENDPROC(tegra20_cpu_clear_resettable)

/*
* tegra20_cpu_set_resettable_soon(void)
*
- * Called to set the "resettable soon" flag in PMC_SCRATCH41 when
+ * Called to set the "resettable soon" flag in IRAM variable when
* it is expected that the secondary CPU will be idle soon.
*/
ENTRY(tegra20_cpu_set_resettable_soon)
- mov32 r1, TEGRA_PMC_VIRT + PMC_SCRATCH41
+ mov32 r1, TEGRA_IRAM_RESET_BASE_VIRT
+ ldr r2, =__tegra20_cpu1_resettable_status_offset
mov r12, #CPU_RESETTABLE_SOON
- str r12, [r1]
+ strb r12, [r1, r2]
ret lr
ENDPROC(tegra20_cpu_set_resettable_soon)

/*
* tegra20_cpu_is_resettable_soon(void)
*
- * Returns true if the "resettable soon" flag in PMC_SCRATCH41 has been
+ * Returns true if the "resettable soon" flag in IRAM variable has been
* set because it is expected that the secondary CPU will be idle soon.
*/
ENTRY(tegra20_cpu_is_resettable_soon)
- mov32 r1, TEGRA_PMC_VIRT + PMC_SCRATCH41
- ldr r12, [r1]
+ mov32 r1, TEGRA_IRAM_RESET_BASE_VIRT
+ ldr r2, =__tegra20_cpu1_resettable_status_offset
+ ldrb r12, [r1, r2]
cmp r12, #CPU_RESETTABLE_SOON
moveq r0, #1
movne r0, #0
@@ -256,9 +260,10 @@ ENTRY(tegra20_sleep_cpu_secondary_finish
mov r0, #TEGRA_FLUSH_CACHE_LOUIS
bl tegra_disable_clean_inv_dcache

- mov32 r0, TEGRA_PMC_VIRT + PMC_SCRATCH41
+ mov32 r0, TEGRA_IRAM_RESET_BASE_VIRT
+ ldr r4, =__tegra20_cpu1_resettable_status_offset
mov r3, #CPU_RESETTABLE
- str r3, [r0]
+ strb r3, [r0, r4]

bl tegra_cpu_do_idle

@@ -274,10 +279,10 @@ ENTRY(tegra20_sleep_cpu_secondary_finish

bl tegra_pen_lock

- mov32 r3, TEGRA_PMC_VIRT
- add r0, r3, #PMC_SCRATCH41
+ mov32 r0, TEGRA_IRAM_RESET_BASE_VIRT
+ ldr r4, =__tegra20_cpu1_resettable_status_offset
mov r3, #CPU_NOT_RESETTABLE
- str r3, [r0]
+ strb r3, [r0, r4]

bl tegra_pen_unlock

--- a/arch/arm/mach-tegra/sleep.h
+++ b/arch/arm/mach-tegra/sleep.h
@@ -18,6 +18,7 @@
#define __MACH_TEGRA_SLEEP_H

#include "iomap.h"
+#include "irammap.h"

#define TEGRA_ARM_PERIF_VIRT (TEGRA_ARM_PERIF_BASE - IO_CPU_PHYS \
+ IO_CPU_VIRT)
@@ -29,6 +30,9 @@
+ IO_APB_VIRT)
#define TEGRA_PMC_VIRT (TEGRA_PMC_BASE - IO_APB_PHYS + IO_APB_VIRT)

+#define TEGRA_IRAM_RESET_BASE_VIRT (IO_IRAM_VIRT + \
+ TEGRA_IRAM_RESET_HANDLER_OFFSET)
+
/* PMC_SCRATCH37-39 and 41 are used for tegra_pen_lock and idle */
#define PMC_SCRATCH37 0x130
#define PMC_SCRATCH38 0x134

2015-07-08 07:59:40

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 39/55] ARM: mvebu: fix suspend to RAM on big-endian configurations

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Thomas Petazzoni <[email protected]>

commit 2f5bc307be2480ba89e4c5d118f406f04a4a7299 upstream.

The current Armada XP suspend to RAM implementation, as added in
commit 27432825ae19f ("ARM: mvebu: Armada XP GP specific
suspend/resume code") does not handle big-endian configurations
properly: the small bit of assembly code putting the DRAM in
self-refresh and toggling the GPIOs to turn off power forgets to
convert the values to little-endian.

This commit fixes that by making sure the two values we will write to
the DRAM controller register and GPIO register are already in
little-endian before entering the critical assembly code.

Signed-off-by: Thomas Petazzoni <[email protected]>
Fixes: 27432825ae19f ("ARM: mvebu: Armada XP GP specific suspend/resume code")
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/mach-mvebu/pm-board.c | 3 +++
1 file changed, 3 insertions(+)

--- a/arch/arm/mach-mvebu/pm-board.c
+++ b/arch/arm/mach-mvebu/pm-board.c
@@ -43,6 +43,9 @@ static void mvebu_armada_xp_gp_pm_enter(
for (i = 0; i < ARMADA_XP_GP_PIC_NR_GPIOS; i++)
ackcmd |= BIT(pic_raw_gpios[i]);

+ srcmd = cpu_to_le32(srcmd);
+ ackcmd = cpu_to_le32(ackcmd);
+
/*
* Wait a while, the PIC needs quite a bit of time between the
* two GPIO commands.

2015-07-08 08:04:42

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 40/55] tick/idle/powerpc: Do not register idle states with CPUIDLE_FLAG_TIMER_STOP set in periodic mode

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: preeti <[email protected]>

commit cc5a2f7b8f39e7db559778f7913a2410257b3e50 upstream.

On some archs, the local clockevent device stops in deep cpuidle states.
The broadcast framework is used to wakeup cpus in these idle states, in
which either an external clockevent device is used to send wakeup ipis
or the hrtimer broadcast framework kicks in in the absence of such a
device. One cpu is nominated as the broadcast cpu and this cpu sends
wakeup ipis to sleeping cpus at the appropriate time. This is the
implementation in the oneshot mode of broadcast.

In periodic mode of broadcast however, the presence of such cpuidle
states results in the cpuidle driver calling tick_broadcast_enable()
which shuts down the local clockevent devices of all the cpus and
appoints the tick broadcast device as the clockevent device for each of
them. This works on those archs where the tick broadcast device is a
real clockevent device. But on archs which depend on the hrtimer mode
of broadcast, the tick broadcast device hapens to be a pseudo device.
The consequence is that the local clockevent devices of all cpus are
shutdown and the kernel hangs at boot time in periodic mode.

Let us thus not register the cpuidle states which have
CPUIDLE_FLAG_TIMER_STOP flag set, on archs which depend on the hrtimer
mode of broadcast in periodic mode. This patch takes care of doing this
on powerpc. The cpus would not have entered into such deep cpuidle
states in periodic mode on powerpc anyway. So there is no loss here.

Signed-off-by: Preeti U Murthy <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/cpuidle/cpuidle-powernv.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)

--- a/drivers/cpuidle/cpuidle-powernv.c
+++ b/drivers/cpuidle/cpuidle-powernv.c
@@ -60,6 +60,8 @@ static int nap_loop(struct cpuidle_devic
return index;
}

+/* Register for fastsleep only in oneshot mode of broadcast */
+#ifdef CONFIG_TICK_ONESHOT
static int fastsleep_loop(struct cpuidle_device *dev,
struct cpuidle_driver *drv,
int index)
@@ -83,7 +85,7 @@ static int fastsleep_loop(struct cpuidle

return index;
}
-
+#endif
/*
* States for dedicated partition case.
*/
@@ -209,7 +211,14 @@ static int powernv_add_idle_states(void)
powernv_states[nr_idle_states].flags = 0;
powernv_states[nr_idle_states].target_residency = 100;
powernv_states[nr_idle_states].enter = &nap_loop;
- } else if (flags[i] & OPAL_PM_SLEEP_ENABLED ||
+ }
+
+ /*
+ * All cpuidle states with CPUIDLE_FLAG_TIMER_STOP set must come
+ * within this config dependency check.
+ */
+#ifdef CONFIG_TICK_ONESHOT
+ if (flags[i] & OPAL_PM_SLEEP_ENABLED ||
flags[i] & OPAL_PM_SLEEP_ENABLED_ER1) {
/* Add FASTSLEEP state */
strcpy(powernv_states[nr_idle_states].name, "FastSleep");
@@ -218,7 +227,7 @@ static int powernv_add_idle_states(void)
powernv_states[nr_idle_states].target_residency = 300000;
powernv_states[nr_idle_states].enter = &fastsleep_loop;
}
-
+#endif
powernv_states[nr_idle_states].exit_latency =
((unsigned int)latency_ns[i]) / 1000;


2015-07-08 08:02:35

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 41/55] powerpc/perf: Fix book3s kernel to userspace backtraces

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Anton Blanchard <[email protected]>

commit 72e349f1124a114435e599479c9b8d14bfd1ebcd upstream.

When we take a PMU exception or a software event we call
perf_read_regs(). This overloads regs->result with a boolean that
describes if we should use the sampled instruction address register
(SIAR) or the regs.

If the exception is in kernel, we start with the kernel regs and
backtrace through the kernel stack. At this point we switch to the
userspace regs and backtrace the user stack with perf_callchain_user().

Unfortunately these regs have not got the perf_read_regs() treatment,
so regs->result could be anything. If it is non zero,
perf_instruction_pointer() decides to use the SIAR, and we get issues
like this:

0.11% qemu-system-ppc [kernel.kallsyms] [k] _raw_spin_lock_irqsave
|
---_raw_spin_lock_irqsave
|
|--52.35%-- 0
| |
| |--46.39%-- __hrtimer_start_range_ns
| | kvmppc_run_core
| | kvmppc_vcpu_run_hv
| | kvmppc_vcpu_run
| | kvm_arch_vcpu_ioctl_run
| | kvm_vcpu_ioctl
| | do_vfs_ioctl
| | sys_ioctl
| | system_call
| | |
| | |--67.08%-- _raw_spin_lock_irqsave <--- hi mum
| | | |
| | | --100.00%-- 0x7e714
| | | 0x7e714

Notice the bogus _raw_spin_irqsave when we transition from kernel
(system_call) to userspace (0x7e714). We inserted what was in the SIAR.

Add a check in regs_use_siar() to check that the regs in question
are from a PMU exception. With this fix the backtrace makes sense:

0.47% qemu-system-ppc [kernel.vmlinux] [k] _raw_spin_lock_irqsave
|
---_raw_spin_lock_irqsave
|
|--53.83%-- 0
| |
| |--44.73%-- hrtimer_try_to_cancel
| | kvmppc_start_thread
| | kvmppc_run_core
| | kvmppc_vcpu_run_hv
| | kvmppc_vcpu_run
| | kvm_arch_vcpu_ioctl_run
| | kvm_vcpu_ioctl
| | do_vfs_ioctl
| | sys_ioctl
| | system_call
| | __ioctl
| | 0x7e714
| | 0x7e714

Signed-off-by: Anton Blanchard <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/powerpc/perf/core-book3s.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -131,7 +131,16 @@ static void pmao_restore_workaround(bool

static bool regs_use_siar(struct pt_regs *regs)
{
- return !!regs->result;
+ /*
+ * When we take a performance monitor exception the regs are setup
+ * using perf_read_regs() which overloads some fields, in particular
+ * regs->result to tell us whether to use SIAR.
+ *
+ * However if the regs are from another exception, eg. a syscall, then
+ * they have not been setup using perf_read_regs() and so regs->result
+ * is something random.
+ */
+ return ((TRAP(regs) == 0xf00) && regs->result);
}

/*

2015-07-08 08:02:05

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 42/55] x86/PCI: Use host bridge _CRS info on systems with >32 bit addressing

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Bjorn Helgaas <[email protected]>

commit 3d9fecf6bfb8b12bc2f9a4c7109895a2a2bb9436 upstream.

We enable _CRS on all systems from 2008 and later. On older systems, we
ignore _CRS and assume the whole physical address space (excluding RAM and
other devices) is available for PCI devices, but on systems that support
physical address spaces larger than 4GB, it's doubtful that the area above
4GB is really available for PCI.

After d56dbf5bab8c ("PCI: Allocate 64-bit BARs above 4G when possible"), we
try to use that space above 4GB *first*, so we're more likely to put a
device there.

On Juan's Toshiba Satellite Pro U200, BIOS left the graphics, sound, 1394,
and card reader devices unassigned (but only after Windows had been
booted). Only the sound device had a 64-bit BAR, so it was the only device
placed above 4GB, and hence the only device that didn't work.

Keep _CRS enabled even on pre-2008 systems if they support physical address
space larger than 4GB.

Fixes: d56dbf5bab8c ("PCI: Allocate 64-bit BARs above 4G when possible")
Reported-and-tested-by: Juan Dayer <[email protected]>
Reported-and-tested-by: Alan Horsfield <[email protected]>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=99221
Link: https://bugzilla.opensuse.org/show_bug.cgi?id=907092
Signed-off-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/pci/acpi.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

--- a/arch/x86/pci/acpi.c
+++ b/arch/x86/pci/acpi.c
@@ -121,8 +121,10 @@ void __init pci_acpi_crs_quirks(void)
{
int year;

- if (dmi_get_date(DMI_BIOS_DATE, &year, NULL, NULL) && year < 2008)
- pci_use_crs = false;
+ if (dmi_get_date(DMI_BIOS_DATE, &year, NULL, NULL) && year < 2008) {
+ if (iomem_resource.end <= 0xffffffff)
+ pci_use_crs = false;
+ }

dmi_check_system(pci_crs_quirks);


2015-07-08 08:01:54

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 43/55] x86/PCI: Use host bridge _CRS info on Foxconn K8M890-8237A

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Bjorn Helgaas <[email protected]>

commit 1dace0116d0b05c967d94644fc4dfe96be2ecd3d upstream.

The Foxconn K8M890-8237A has two PCI host bridges, and we can't assign
resources correctly without the information from _CRS that tells us which
address ranges are claimed by which bridge. In the bugs mentioned below,
we incorrectly assign a sound card address (this example is from 1033299):

bus: 00 index 2 [mem 0x80000000-0xfcffffffff]
ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-7f])
pci_root PNP0A08:00: host bridge window [mem 0x80000000-0xbfefffff] (ignored)
pci_root PNP0A08:00: host bridge window [mem 0xc0000000-0xdfffffff] (ignored)
pci_root PNP0A08:00: host bridge window [mem 0xf0000000-0xfebfffff] (ignored)
ACPI: PCI Root Bridge [PCI1] (domain 0000 [bus 80-ff])
pci_root PNP0A08:01: host bridge window [mem 0xbff00000-0xbfffffff] (ignored)
pci 0000:80:01.0: [1106:3288] type 0 class 0x000403
pci 0000:80:01.0: reg 10: [mem 0xbfffc000-0xbfffffff 64bit]
pci 0000:80:01.0: address space collision: [mem 0xbfffc000-0xbfffffff 64bit] conflicts with PCI Bus #00 [mem 0x80000000-0xfcffffffff]
pci 0000:80:01.0: BAR 0: assigned [mem 0xfd00000000-0xfd00003fff 64bit]
BUG: unable to handle kernel paging request at ffffc90000378000
IP: [<ffffffffa0345f63>] azx_create+0x37c/0x822 [snd_hda_intel]

We assigned 0xfd_0000_0000, but that is not in any of the host bridge
windows, and the sound card doesn't work.

Turn on pci=use_crs automatically for this system.

Link: https://bugs.launchpad.net/ubuntu/+source/alsa-driver/+bug/931368
Link: https://bugs.launchpad.net/ubuntu/+source/alsa-driver/+bug/1033299
Signed-off-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/pci/acpi.c | 11 +++++++++++
1 file changed, 11 insertions(+)

--- a/arch/x86/pci/acpi.c
+++ b/arch/x86/pci/acpi.c
@@ -81,6 +81,17 @@ static const struct dmi_system_id pci_cr
DMI_MATCH(DMI_BIOS_VENDOR, "Phoenix Technologies, LTD"),
},
},
+ /* https://bugs.launchpad.net/ubuntu/+source/alsa-driver/+bug/931368 */
+ /* https://bugs.launchpad.net/ubuntu/+source/alsa-driver/+bug/1033299 */
+ {
+ .callback = set_use_crs,
+ .ident = "Foxconn K8M890-8237A",
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "Foxconn"),
+ DMI_MATCH(DMI_BOARD_NAME, "K8M890-8237A"),
+ DMI_MATCH(DMI_BIOS_VENDOR, "Phoenix Technologies, LTD"),
+ },
+ },

/* Now for the blacklist.. */


2015-07-08 08:01:47

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 44/55] KVM: mips: use id_to_memslot correctly

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Paolo Bonzini <[email protected]>

commit 69a1220060c1523fd0515216eaa29e22f133b894 upstream.

The argument to KVM_GET_DIRTY_LOG is a memslot id; it may not match the
position in the memslots array, which is sorted by gfn.

Reviewed-by: James Hogan <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/mips/kvm/mips.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -785,7 +785,7 @@ int kvm_vm_ioctl_get_dirty_log(struct kv

/* If nothing is dirty, don't bother messing with page tables. */
if (is_dirty) {
- memslot = &kvm->memslots->memslots[log->slot];
+ memslot = id_to_memslot(kvm->memslots, log->slot);

ga = memslot->base_gfn << PAGE_SHIFT;
ga_end = ga + (memslot->npages << PAGE_SHIFT);

2015-07-08 08:00:23

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 45/55] MIPS: Fix KVM guest fixmap address

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: James Hogan <[email protected]>

commit 8e748c8d09a9314eedb5c6367d9acfaacddcdc88 upstream.

KVM guest kernels for trap & emulate run in user mode, with a modified
set of kernel memory segments. However the fixmap address is still in
the normal KSeg3 region at 0xfffe0000 regardless, causing problems when
cache alias handling makes use of them when handling copy on write.

Therefore define FIXADDR_TOP as 0x7ffe0000 in the guest kernel mapped
region when CONFIG_KVM_GUEST is defined.

Signed-off-by: James Hogan <[email protected]>
Cc: [email protected]
Patchwork: https://patchwork.linux-mips.org/patch/9887/
Signed-off-by: Ralf Baechle <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/mips/include/asm/mach-generic/spaces.h | 4 ++++
1 file changed, 4 insertions(+)

--- a/arch/mips/include/asm/mach-generic/spaces.h
+++ b/arch/mips/include/asm/mach-generic/spaces.h
@@ -94,7 +94,11 @@
#endif

#ifndef FIXADDR_TOP
+#ifdef CONFIG_KVM_GUEST
+#define FIXADDR_TOP ((unsigned long)(long)(int)0x7ffe0000)
+#else
#define FIXADDR_TOP ((unsigned long)(long)(int)0xfffe0000)
#endif
+#endif

#endif /* __ASM_MACH_GENERIC_SPACES_H */

2015-07-08 08:00:12

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 46/55] KVM: s390: fix external call injection without sigp interpretation

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Hildenbrand <[email protected]>

commit b938eacea0b6881f2116a061e6da3ec840e75137 upstream.

Commit ea5f49692575 ("KVM: s390: only one external call may be pending
at a time") introduced a bug on machines that don't have SIGP
interpretation facility installed.
The injection of an external call will now always fail with -EBUSY
(if none is already pending).

This leads to the following symptoms:
- An external call will be injected but with the wrong "src cpu id",
as this id will not be remembered.
- The target vcpu will not be woken up, therefore the guest will hang if
it cannot deal with unexpected failures of the SIGP EXTERNAL CALL
instruction.
- If an external call is already pending, -EBUSY will not be reported.

Reviewed-by: Christian Borntraeger <[email protected]>
Reviewed-by: Jens Freimann <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/s390/kvm/interrupt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -1037,7 +1037,7 @@ static int __inject_extcall(struct kvm_v
if (sclp_has_sigpif())
return __inject_extcall_sigpif(vcpu, src_id);

- if (!test_and_set_bit(IRQ_PEND_EXT_EXTERNAL, &li->pending_irqs))
+ if (test_and_set_bit(IRQ_PEND_EXT_EXTERNAL, &li->pending_irqs))
return -EBUSY;
*extcall = irq->u.extcall;
atomic_set_mask(CPUSTAT_EXT_INT, li->cpuflags);

2015-07-08 08:00:01

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 47/55] s390/kdump: fix REGSET_VX_LOW vector register ELF notes

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Michael Holzheu <[email protected]>

commit 3c8e5105e759e7b2d88ea8a85b1285e535bc7500 upstream.

The REGSET_VX_LOW ELF notes should contain the lower 64 bit halfes of the
first sixteen 128 bit vector registers. Unfortunately currently we copy
the upper halfes.

Fix this and correctly copy the lower halfes.

Fixes: a62bc0739253 ("s390/kdump: add support for vector extension")
Signed-off-by: Michael Holzheu <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/s390/kernel/crash_dump.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/s390/kernel/crash_dump.c
+++ b/arch/s390/kernel/crash_dump.c
@@ -415,7 +415,7 @@ static void *nt_s390_vx_low(void *ptr, _
ptr += len;
/* Copy lower halves of SIMD registers 0-15 */
for (i = 0; i < 16; i++) {
- memcpy(ptr, &vx_regs[i], 8);
+ memcpy(ptr, &vx_regs[i].u[2], 8);
ptr += 8;
}
return ptr;

2015-07-08 07:59:50

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 48/55] KVM: s390: virtio-ccw: dont overwrite config space values

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Cornelia Huck <[email protected]>

commit 431dae778aea4eed31bd12e5ee82edc571cd4d70 upstream.

Eric noticed problems with vhost-scsi and virtio-ccw: vhost-scsi
complained about overwriting values in the config space, which
was triggered by a broken implementation of virtio-ccw's config
get/set routines. It was probably sheer luck that we did not hit
this before.

When writing a value to the config space, the WRITE_CONF ccw will
always write from the beginning of the config space up to and
including the value to be set. If the config space up to the value
has not yet been retrieved from the device, however, we'll end up
overwriting values. Keep track of the known config space and update
if needed to avoid this.

Moreover, READ_CONF will only read the number of bytes it has been
instructed to retrieve, so we must not copy more than that to the
buffer, or we might overwrite trailing values.

Reported-by: Eric Farman <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>
Reviewed-by: Eric Farman <[email protected]>
Tested-by: Eric Farman <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/s390/kvm/virtio_ccw.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

--- a/drivers/s390/kvm/virtio_ccw.c
+++ b/drivers/s390/kvm/virtio_ccw.c
@@ -65,6 +65,7 @@ struct virtio_ccw_device {
bool is_thinint;
bool going_away;
bool device_lost;
+ unsigned int config_ready;
void *airq_info;
};

@@ -833,8 +834,11 @@ static void virtio_ccw_get_config(struct
if (ret)
goto out_free;

- memcpy(vcdev->config, config_area, sizeof(vcdev->config));
- memcpy(buf, &vcdev->config[offset], len);
+ memcpy(vcdev->config, config_area, offset + len);
+ if (buf)
+ memcpy(buf, &vcdev->config[offset], len);
+ if (vcdev->config_ready < offset + len)
+ vcdev->config_ready = offset + len;

out_free:
kfree(config_area);
@@ -857,6 +861,9 @@ static void virtio_ccw_set_config(struct
if (!config_area)
goto out_free;

+ /* Make sure we don't overwrite fields. */
+ if (vcdev->config_ready < offset)
+ virtio_ccw_get_config(vdev, 0, NULL, offset);
memcpy(&vcdev->config[offset], buf, len);
/* Write the config area to the host. */
memcpy(config_area, vcdev->config, sizeof(vcdev->config));

2015-07-08 07:37:36

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 51/55] fs: Fix S_NOSEC handling

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jan Kara <[email protected]>

commit 2426f3910069ed47c0cc58559a6d088af7920201 upstream.

file_remove_suid() could mistakenly set S_NOSEC inode bit when root was
modifying the file. As a result following writes to the file by ordinary
user would avoid clearing suid or sgid bits.

Fix the bug by checking actual mode bits before setting S_NOSEC.

Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Al Viro <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/inode.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1693,8 +1693,8 @@ int file_remove_suid(struct file *file)
error = security_inode_killpriv(dentry);
if (!error && killsuid)
error = __remove_suid(dentry, killsuid);
- if (!error && (inode->i_sb->s_flags & MS_NOSEC))
- inode->i_flags |= S_NOSEC;
+ if (!error)
+ inode_has_no_xattr(inode);

return error;
}

2015-07-08 08:03:18

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 52/55] fs/ufs: revert "ufs: fix deadlocks introduced by sb mutex merge"

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Fabian Frederick <[email protected]>

commit 13b987ea275840d74d9df9a44326632fab1894da upstream.

This reverts commit 9ef7db7f38d0 ("ufs: fix deadlocks introduced by sb
mutex merge") That patch tried to solve commit 0244756edc4b98c ("ufs: sb
mutex merge + mutex_destroy") which is itself partially reverted due to
multiple deadlocks.

Signed-off-by: Fabian Frederick <[email protected]>
Suggested-by: Jan Kara <[email protected]>
Cc: Ian Campbell <[email protected]>
Cc: Evgeniy Dushistov <[email protected]>
Cc: Alexey Khoroshilov <[email protected]>
Cc: Roger Pau Monne <[email protected]>
Cc: Ian Jackson <[email protected]>
Cc: Al Viro <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/ufs/inode.c | 5 ++++-
fs/ufs/namei.c | 14 ++++++++------
2 files changed, 12 insertions(+), 7 deletions(-)

--- a/fs/ufs/inode.c
+++ b/fs/ufs/inode.c
@@ -902,6 +902,9 @@ void ufs_evict_inode(struct inode * inod
invalidate_inode_buffers(inode);
clear_inode(inode);

- if (want_delete)
+ if (want_delete) {
+ lock_ufs(inode->i_sb);
ufs_free_inode(inode);
+ unlock_ufs(inode->i_sb);
+ }
}
--- a/fs/ufs/namei.c
+++ b/fs/ufs/namei.c
@@ -128,12 +128,12 @@ static int ufs_symlink (struct inode * d
if (l > sb->s_blocksize)
goto out_notlocked;

+ lock_ufs(dir->i_sb);
inode = ufs_new_inode(dir, S_IFLNK | S_IRWXUGO);
err = PTR_ERR(inode);
if (IS_ERR(inode))
- goto out_notlocked;
+ goto out;

- lock_ufs(dir->i_sb);
if (l > UFS_SB(sb)->s_uspi->s_maxsymlinklen) {
/* slow symlink */
inode->i_op = &ufs_symlink_inode_operations;
@@ -184,9 +184,13 @@ static int ufs_mkdir(struct inode * dir,
struct inode * inode;
int err;

+ lock_ufs(dir->i_sb);
+ inode_inc_link_count(dir);
+
inode = ufs_new_inode(dir, S_IFDIR|mode);
+ err = PTR_ERR(inode);
if (IS_ERR(inode))
- return PTR_ERR(inode);
+ goto out_dir;

inode->i_op = &ufs_dir_inode_operations;
inode->i_fop = &ufs_dir_operations;
@@ -194,9 +198,6 @@ static int ufs_mkdir(struct inode * dir,

inode_inc_link_count(inode);

- lock_ufs(dir->i_sb);
- inode_inc_link_count(dir);
-
err = ufs_make_empty(inode, dir);
if (err)
goto out_fail;
@@ -215,6 +216,7 @@ out_fail:
inode_dec_link_count(inode);
unlock_new_inode(inode);
iput (inode);
+out_dir:
inode_dec_link_count(dir);
unlock_ufs(dir->i_sb);
goto out;

2015-07-08 08:02:47

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 53/55] fs/ufs: restore s_lock mutex

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: Fabian Frederick <[email protected]>

commit cdd9eefdf905e92e7fc6cc393314efe68dc6ff66 upstream.

Commit 0244756edc4b98c ("ufs: sb mutex merge + mutex_destroy") generated
deadlocks in read/write mode on mkdir.

This patch partially reverts it keeping fixes by Andrew Morton and
mutex_destroy()

[AV: fixed a missing bit in ufs_remount()]

Signed-off-by: Fabian Frederick <[email protected]>
Reported-by: Ian Campbell <[email protected]>
Suggested-by: Jan Kara <[email protected]>
Cc: Ian Campbell <[email protected]>
Cc: Evgeniy Dushistov <[email protected]>
Cc: Alexey Khoroshilov <[email protected]>
Cc: Roger Pau Monne <[email protected]>
Cc: Ian Jackson <[email protected]>
Cc: Al Viro <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Al Viro <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/ufs/balloc.c | 34 +++++++++++++++++-----------------
fs/ufs/ialloc.c | 16 ++++++++--------
fs/ufs/super.c | 10 ++++++++++
fs/ufs/ufs.h | 1 +
4 files changed, 36 insertions(+), 25 deletions(-)

--- a/fs/ufs/balloc.c
+++ b/fs/ufs/balloc.c
@@ -51,8 +51,8 @@ void ufs_free_fragments(struct inode *in

if (ufs_fragnum(fragment) + count > uspi->s_fpg)
ufs_error (sb, "ufs_free_fragments", "internal error");
-
- lock_ufs(sb);
+
+ mutex_lock(&UFS_SB(sb)->s_lock);

cgno = ufs_dtog(uspi, fragment);
bit = ufs_dtogd(uspi, fragment);
@@ -115,13 +115,13 @@ void ufs_free_fragments(struct inode *in
if (sb->s_flags & MS_SYNCHRONOUS)
ubh_sync_block(UCPI_UBH(ucpi));
ufs_mark_sb_dirty(sb);
-
- unlock_ufs(sb);
+
+ mutex_unlock(&UFS_SB(sb)->s_lock);
UFSD("EXIT\n");
return;

failed:
- unlock_ufs(sb);
+ mutex_unlock(&UFS_SB(sb)->s_lock);
UFSD("EXIT (FAILED)\n");
return;
}
@@ -151,7 +151,7 @@ void ufs_free_blocks(struct inode *inode
goto failed;
}

- lock_ufs(sb);
+ mutex_lock(&UFS_SB(sb)->s_lock);

do_more:
overflow = 0;
@@ -211,12 +211,12 @@ do_more:
}

ufs_mark_sb_dirty(sb);
- unlock_ufs(sb);
+ mutex_unlock(&UFS_SB(sb)->s_lock);
UFSD("EXIT\n");
return;

failed_unlock:
- unlock_ufs(sb);
+ mutex_unlock(&UFS_SB(sb)->s_lock);
failed:
UFSD("EXIT (FAILED)\n");
return;
@@ -357,7 +357,7 @@ u64 ufs_new_fragments(struct inode *inod
usb1 = ubh_get_usb_first(uspi);
*err = -ENOSPC;

- lock_ufs(sb);
+ mutex_lock(&UFS_SB(sb)->s_lock);
tmp = ufs_data_ptr_to_cpu(sb, p);

if (count + ufs_fragnum(fragment) > uspi->s_fpb) {
@@ -378,19 +378,19 @@ u64 ufs_new_fragments(struct inode *inod
"fragment %llu, tmp %llu\n",
(unsigned long long)fragment,
(unsigned long long)tmp);
- unlock_ufs(sb);
+ mutex_unlock(&UFS_SB(sb)->s_lock);
return INVBLOCK;
}
if (fragment < UFS_I(inode)->i_lastfrag) {
UFSD("EXIT (ALREADY ALLOCATED)\n");
- unlock_ufs(sb);
+ mutex_unlock(&UFS_SB(sb)->s_lock);
return 0;
}
}
else {
if (tmp) {
UFSD("EXIT (ALREADY ALLOCATED)\n");
- unlock_ufs(sb);
+ mutex_unlock(&UFS_SB(sb)->s_lock);
return 0;
}
}
@@ -399,7 +399,7 @@ u64 ufs_new_fragments(struct inode *inod
* There is not enough space for user on the device
*/
if (!capable(CAP_SYS_RESOURCE) && ufs_freespace(uspi, UFS_MINFREE) <= 0) {
- unlock_ufs(sb);
+ mutex_unlock(&UFS_SB(sb)->s_lock);
UFSD("EXIT (FAILED)\n");
return 0;
}
@@ -424,7 +424,7 @@ u64 ufs_new_fragments(struct inode *inod
ufs_clear_frags(inode, result + oldcount,
newcount - oldcount, locked_page != NULL);
}
- unlock_ufs(sb);
+ mutex_unlock(&UFS_SB(sb)->s_lock);
UFSD("EXIT, result %llu\n", (unsigned long long)result);
return result;
}
@@ -439,7 +439,7 @@ u64 ufs_new_fragments(struct inode *inod
fragment + count);
ufs_clear_frags(inode, result + oldcount, newcount - oldcount,
locked_page != NULL);
- unlock_ufs(sb);
+ mutex_unlock(&UFS_SB(sb)->s_lock);
UFSD("EXIT, result %llu\n", (unsigned long long)result);
return result;
}
@@ -477,7 +477,7 @@ u64 ufs_new_fragments(struct inode *inod
*err = 0;
UFS_I(inode)->i_lastfrag = max(UFS_I(inode)->i_lastfrag,
fragment + count);
- unlock_ufs(sb);
+ mutex_unlock(&UFS_SB(sb)->s_lock);
if (newcount < request)
ufs_free_fragments (inode, result + newcount, request - newcount);
ufs_free_fragments (inode, tmp, oldcount);
@@ -485,7 +485,7 @@ u64 ufs_new_fragments(struct inode *inod
return result;
}

- unlock_ufs(sb);
+ mutex_unlock(&UFS_SB(sb)->s_lock);
UFSD("EXIT (FAILED)\n");
return 0;
}
--- a/fs/ufs/ialloc.c
+++ b/fs/ufs/ialloc.c
@@ -69,11 +69,11 @@ void ufs_free_inode (struct inode * inod

ino = inode->i_ino;

- lock_ufs(sb);
+ mutex_lock(&UFS_SB(sb)->s_lock);

if (!((ino > 1) && (ino < (uspi->s_ncg * uspi->s_ipg )))) {
ufs_warning(sb, "ufs_free_inode", "reserved inode or nonexistent inode %u\n", ino);
- unlock_ufs(sb);
+ mutex_unlock(&UFS_SB(sb)->s_lock);
return;
}

@@ -81,7 +81,7 @@ void ufs_free_inode (struct inode * inod
bit = ufs_inotocgoff (ino);
ucpi = ufs_load_cylinder (sb, cg);
if (!ucpi) {
- unlock_ufs(sb);
+ mutex_unlock(&UFS_SB(sb)->s_lock);
return;
}
ucg = ubh_get_ucg(UCPI_UBH(ucpi));
@@ -115,7 +115,7 @@ void ufs_free_inode (struct inode * inod
ubh_sync_block(UCPI_UBH(ucpi));

ufs_mark_sb_dirty(sb);
- unlock_ufs(sb);
+ mutex_unlock(&UFS_SB(sb)->s_lock);
UFSD("EXIT\n");
}

@@ -193,7 +193,7 @@ struct inode *ufs_new_inode(struct inode
sbi = UFS_SB(sb);
uspi = sbi->s_uspi;

- lock_ufs(sb);
+ mutex_lock(&sbi->s_lock);

/*
* Try to place the inode in its parent directory
@@ -331,21 +331,21 @@ cg_found:
sync_dirty_buffer(bh);
brelse(bh);
}
- unlock_ufs(sb);
+ mutex_unlock(&sbi->s_lock);

UFSD("allocating inode %lu\n", inode->i_ino);
UFSD("EXIT\n");
return inode;

fail_remove_inode:
- unlock_ufs(sb);
+ mutex_unlock(&sbi->s_lock);
clear_nlink(inode);
unlock_new_inode(inode);
iput(inode);
UFSD("EXIT (FAILED): err %d\n", err);
return ERR_PTR(err);
failed:
- unlock_ufs(sb);
+ mutex_unlock(&sbi->s_lock);
make_bad_inode(inode);
iput (inode);
UFSD("EXIT (FAILED): err %d\n", err);
--- a/fs/ufs/super.c
+++ b/fs/ufs/super.c
@@ -694,6 +694,7 @@ static int ufs_sync_fs(struct super_bloc
unsigned flags;

lock_ufs(sb);
+ mutex_lock(&UFS_SB(sb)->s_lock);

UFSD("ENTER\n");

@@ -711,6 +712,7 @@ static int ufs_sync_fs(struct super_bloc
ufs_put_cstotal(sb);

UFSD("EXIT\n");
+ mutex_unlock(&UFS_SB(sb)->s_lock);
unlock_ufs(sb);

return 0;
@@ -1277,6 +1279,7 @@ static int ufs_remount (struct super_blo

sync_filesystem(sb);
lock_ufs(sb);
+ mutex_lock(&UFS_SB(sb)->s_lock);
uspi = UFS_SB(sb)->s_uspi;
flags = UFS_SB(sb)->s_flags;
usb1 = ubh_get_usb_first(uspi);
@@ -1290,6 +1293,7 @@ static int ufs_remount (struct super_blo
new_mount_opt = 0;
ufs_set_opt (new_mount_opt, ONERROR_LOCK);
if (!ufs_parse_options (data, &new_mount_opt)) {
+ mutex_unlock(&UFS_SB(sb)->s_lock);
unlock_ufs(sb);
return -EINVAL;
}
@@ -1297,12 +1301,14 @@ static int ufs_remount (struct super_blo
new_mount_opt |= ufstype;
} else if ((new_mount_opt & UFS_MOUNT_UFSTYPE) != ufstype) {
pr_err("ufstype can't be changed during remount\n");
+ mutex_unlock(&UFS_SB(sb)->s_lock);
unlock_ufs(sb);
return -EINVAL;
}

if ((*mount_flags & MS_RDONLY) == (sb->s_flags & MS_RDONLY)) {
UFS_SB(sb)->s_mount_opt = new_mount_opt;
+ mutex_unlock(&UFS_SB(sb)->s_lock);
unlock_ufs(sb);
return 0;
}
@@ -1326,6 +1332,7 @@ static int ufs_remount (struct super_blo
*/
#ifndef CONFIG_UFS_FS_WRITE
pr_err("ufs was compiled with read-only support, can't be mounted as read-write\n");
+ mutex_unlock(&UFS_SB(sb)->s_lock);
unlock_ufs(sb);
return -EINVAL;
#else
@@ -1335,11 +1342,13 @@ static int ufs_remount (struct super_blo
ufstype != UFS_MOUNT_UFSTYPE_SUNx86 &&
ufstype != UFS_MOUNT_UFSTYPE_UFS2) {
pr_err("this ufstype is read-only supported\n");
+ mutex_unlock(&UFS_SB(sb)->s_lock);
unlock_ufs(sb);
return -EINVAL;
}
if (!ufs_read_cylinder_structures(sb)) {
pr_err("failed during remounting\n");
+ mutex_unlock(&UFS_SB(sb)->s_lock);
unlock_ufs(sb);
return -EPERM;
}
@@ -1347,6 +1356,7 @@ static int ufs_remount (struct super_blo
#endif
}
UFS_SB(sb)->s_mount_opt = new_mount_opt;
+ mutex_unlock(&UFS_SB(sb)->s_lock);
unlock_ufs(sb);
return 0;
}
--- a/fs/ufs/ufs.h
+++ b/fs/ufs/ufs.h
@@ -30,6 +30,7 @@ struct ufs_sb_info {
int work_queued; /* non-zero if the delayed work is queued */
struct delayed_work sync_work; /* FS sync delayed work */
spinlock_t work_lock; /* protects sync_work and work_queued */
+ struct mutex s_lock;
};

struct ufs_inode_info {

2015-07-08 08:02:19

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 54/55] vfs: Remove incorrect debugging WARN in prepend_path

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: "Eric W. Biederman" <[email protected]>

commit 93e3bce6287e1fb3e60d3324ed08555b5bbafa89 upstream.

The warning message in prepend_path is unclear and outdated. It was
added as a warning that the mechanism for generating names of pseudo
files had been removed from prepend_path and d_dname should be used
instead. Unfortunately the warning reads like a general warning,
making it unclear what to do with it.

Remove the warning. The transition it was added to warn about is long
over, and I added code several years ago which in rare cases causes
the warning to fire on legitimate code, and the warning is now firing
and scaring people for no good reason.

Reported-by: Ivan Delalande <[email protected]>
Reported-by: Omar Sandoval <[email protected]>
Fixes: f48cfddc6729e ("vfs: In d_path don't call d_dname on a mount point")
Signed-off-by: "Eric W. Biederman" <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/dcache.c | 11 -----------
1 file changed, 11 deletions(-)

--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -2896,17 +2896,6 @@ restart:
vfsmnt = &mnt->mnt;
continue;
}
- /*
- * Filesystems needing to implement special "root names"
- * should do so with ->d_dname()
- */
- if (IS_ROOT(dentry) &&
- (dentry->d_name.len != 1 ||
- dentry->d_name.name[0] != '/')) {
- WARN(1, "Root dentry has weird name <%.*s>\n",
- (int) dentry->d_name.len,
- dentry->d_name.name);
- }
if (!error)
error = is_mounted(vfsmnt) ? 1 : 2;
break;

2015-07-08 08:03:00

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.0 55/55] vfs: Ignore unlocked mounts in fs_fully_visible

4.0-stable review patch. If anyone has any objections, please let me know.

------------------

From: "Eric W. Biederman" <[email protected]>

commit ceeb0e5d39fcdf4dca2c997bf225c7fc49200b37 upstream.

Limit the mounts fs_fully_visible considers to locked mounts.
Unlocked can always be unmounted so considering them adds hassle
but no security benefit.

Signed-off-by: "Eric W. Biederman" <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/namespace.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -3187,11 +3187,15 @@ bool fs_fully_visible(struct file_system
if (mnt->mnt.mnt_root != mnt->mnt.mnt_sb->s_root)
continue;

- /* This mount is not fully visible if there are any child mounts
- * that cover anything except for empty directories.
+ /* This mount is not fully visible if there are any
+ * locked child mounts that cover anything except for
+ * empty directories.
*/
list_for_each_entry(child, &mnt->mnt_mounts, mnt_child) {
struct inode *inode = child->mnt_mountpoint->d_inode;
+ /* Only worry about locked mounts */
+ if (!(mnt->mnt.mnt_flags & MNT_LOCKED))
+ continue;
if (!S_ISDIR(inode->i_mode))
goto next;
if (inode->i_nlink > 2)

2015-07-08 14:09:59

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 4.0 00/55] 4.0.8-stable review

On 07/08/2015 12:34 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.0.8 release.
> There are 55 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri Jul 10 07:31:26 UTC 2015.
> Anything received after that time might be too late.
>

Build results:
total: 124 pass: 124 fail: 0
Qemu test results:
total: 30 pass: 30 fail: 0

Details are available at http://server.roeck-us.net:8010/builders.

Guenter

2015-07-08 16:34:16

by Shuah Khan

[permalink] [raw]
Subject: Re: [PATCH 4.0 00/55] 4.0.8-stable review

On 07/08/2015 01:34 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.0.8 release.
> There are 55 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri Jul 10 07:31:26 UTC 2015.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.0.8-rc1.gz
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah


--
Shuah Khan
Sr. Linux Kernel Developer
Open Source Innovation Group
Samsung Research America (Silicon Valley)
[email protected] | (970) 217-8978

2015-07-09 04:21:37

by Sudip Mukherjee

[permalink] [raw]
Subject: Re: [PATCH 4.0 00/55] 4.0.8-stable review

On Wed, Jul 08, 2015 at 12:34:35AM -0700, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.0.8 release.
> There are 55 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri Jul 10 07:31:26 UTC 2015.
> Anything received after that time might be too late.
Compiled and booted on x86_32. No errors in dmesg.

But when tried sudo make kselftest, it gave:
Makefile:29: *** missing separator. Stop.
make: *** [kselftest] Error 2

gcc version is 4.8.2

regards
sudip

2015-07-10 16:05:12

by Kevin Hilman

[permalink] [raw]
Subject: Re: [PATCH 4.0 00/55] 4.0.8-stable review

Greg Kroah-Hartman <[email protected]> writes:

> This is the start of the stable review cycle for the 4.0.8 release.
> There are 55 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.

Results from kernelci.org below.

Short version: PASS

Long version: while there are some boot failures, none of these are new
failures. I've just submitted a backport patch for v4.0.y to fix the
OMAP THUMB2_KERNEL=y failures, and the exynos5 failures are due to
missing NFS support which didn't hit upstream until v4.1. The NFSroot
tests will be removed from subsequent tests for v4.0.y

Kevin


stable-queue boot: 364 boots: 9 failed, 354 passed with 1 offline (v4.0.7-55-g279a3342322b)

Full Boot Summary: http://kernelci.org/boot/all/job/stable-queue/kernel/v4.0.7-55-g279a3342322b/
Full Build Summary: http://kernelci.org/build/stable-queue/kernel/v4.0.7-55-g279a3342322b/

Tree: stable-queue
Branch: local/linux-4.0.y-queue
Git Describe: v4.0.7-55-g279a3342322b
Git Commit: 279a3342322b9418c703e39b8418e0b925d0f508
Git URL: git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-stable.git
Tested: 96 unique boards, 22 SoC families, 28 builds out of 134

Boot Failures Detected: http://kernelci.org/boot/?v4.0.7-55-g279a3342322b&fail

arm:

multi_v7_defconfig+CONFIG_THUMB2_KERNEL=y:
omap3-beagle: 1 failed lab
omap3-beagle-xm: 2 failed labs
omap3-n900: 1 failed lab
omap3-overo-storm-tobi: 1 failed lab
omap3-overo-tobi: 1 failed lab

exynos_defconfig:
exynos5422-odroidxu3_rootfs:nfs: 2 failed labs
exynos5800-peach-pi_rootfs:nfs: 1 failed lab

Offline Platforms:

arm:

omap2plus_defconfig:
am335x-boneblack: 1 offline lab

---
For more info write to <[email protected]>

2015-07-10 17:31:19

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 4.0 00/55] 4.0.8-stable review

On Thu, Jul 09, 2015 at 09:51:04AM +0530, Sudip Mukherjee wrote:
> On Wed, Jul 08, 2015 at 12:34:35AM -0700, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.0.8 release.
> > There are 55 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Fri Jul 10 07:31:26 UTC 2015.
> > Anything received after that time might be too late.
> Compiled and booted on x86_32. No errors in dmesg.
>
> But when tried sudo make kselftest, it gave:
> Makefile:29: *** missing separator. Stop.
> make: *** [kselftest] Error 2
>
> gcc version is 4.8.2

Odd, is this a regression from 4.0.7?

thanks,

greg k-h

2015-07-10 17:34:17

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 4.0 00/55] 4.0.8-stable review

On Fri, Jul 10, 2015 at 09:05:01AM -0700, Kevin Hilman wrote:
> Greg Kroah-Hartman <[email protected]> writes:
>
> > This is the start of the stable review cycle for the 4.0.8 release.
> > There are 55 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
>
> Results from kernelci.org below.
>
> Short version: PASS
>
> Long version: while there are some boot failures, none of these are new
> failures. I've just submitted a backport patch for v4.0.y to fix the
> OMAP THUMB2_KERNEL=y failures, and the exynos5 failures are due to
> missing NFS support which didn't hit upstream until v4.1. The NFSroot
> tests will be removed from subsequent tests for v4.0.y

I'm only going to do one more 4.0-stable release after the next one next
week, so don't spend a lot of time on it if you don't really want to.

Thanks for testing these.

greg k-h

2015-07-10 18:53:41

by Kevin Hilman

[permalink] [raw]
Subject: Re: [PATCH 4.0 00/55] 4.0.8-stable review

Greg Kroah-Hartman <[email protected]> writes:

> On Fri, Jul 10, 2015 at 09:05:01AM -0700, Kevin Hilman wrote:
>> Greg Kroah-Hartman <[email protected]> writes:
>>
>> > This is the start of the stable review cycle for the 4.0.8 release.
>> > There are 55 patches in this series, all will be posted as a response
>> > to this one. If anyone has any issues with these being applied, please
>> > let me know.
>>
>> Results from kernelci.org below.
>>
>> Short version: PASS
>>
>> Long version: while there are some boot failures, none of these are new
>> failures. I've just submitted a backport patch for v4.0.y to fix the
>> OMAP THUMB2_KERNEL=y failures, and the exynos5 failures are due to
>> missing NFS support which didn't hit upstream until v4.1. The NFSroot
>> tests will be removed from subsequent tests for v4.0.y
>
> I'm only going to do one more 4.0-stable release after the next one next
> week, so don't spend a lot of time on it if you don't really want to.

I've already submitted the THUMB2_KERNEL=y fix earlier today, so that
should fix most of these failures. For the NFS ones, we just won't test
that on v4.0 anymore.

Kevin

2015-07-11 09:47:24

by Sudip Mukherjee

[permalink] [raw]
Subject: Re: [PATCH 4.0 00/55] 4.0.8-stable review

On Fri, Jul 10, 2015 at 10:31:13AM -0700, Greg Kroah-Hartman wrote:
> On Thu, Jul 09, 2015 at 09:51:04AM +0530, Sudip Mukherjee wrote:
> > On Wed, Jul 08, 2015 at 12:34:35AM -0700, Greg Kroah-Hartman wrote:
> > > This is the start of the stable review cycle for the 4.0.8 release.
> > > There are 55 patches in this series, all will be posted as a response
> > > to this one. If anyone has any issues with these being applied, please
> > > let me know.
> > >
> > > Responses should be made by Fri Jul 10 07:31:26 UTC 2015.
> > > Anything received after that time might be too late.
> > Compiled and booted on x86_32. No errors in dmesg.
> >
> > But when tried sudo make kselftest, it gave:
> > Makefile:29: *** missing separator. Stop.
> > make: *** [kselftest] Error 2
> >
> > gcc version is 4.8.2
>
> Odd, is this a regression from 4.0.7?
No. Much older. I tested till 4.0.6 and the error was still there.
60df4642a835 ("tools selftests: Fix 'clean' target with make 3.81") solved
the problem.
should i submit it for stable 4.0.9 or you will add it?

After this, kselftest had some more problem like:
1) kcmp_test.c:13:24: fatal error: linux/kcmp.h: No such file or directory
2) memfd_test.c:26:17: error: ‘__NR_memfd_create’ undeclared (first use
in this function)
3) hugetlbfstest: hugetlbfstest.c:60: open_file: Assertion `fd > 2' failed.
Aborted (core dumped)

I will see if something is already there in Linus tree regarding these.

regards
sudip

2015-07-11 11:03:56

by Sudip Mukherjee

[permalink] [raw]
Subject: Re: [PATCH 4.0 00/55] 4.0.8-stable review

On Sat, Jul 11, 2015 at 03:17:12PM +0530, Sudip Mukherjee wrote:
>
> After this, kselftest had some more problem like:
> 1) kcmp_test.c:13:24: fatal error: linux/kcmp.h: No such file or directory
> 2) memfd_test.c:26:17: error: ‘__NR_memfd_create’ undeclared (first use
> in this function)
> 3) hugetlbfstest: hugetlbfstest.c:60: open_file: Assertion `fd > 2' failed.
> Aborted (core dumped)

1 and 2 were just problems with headers in the userspace.
3 because I don't have "/dev/shm/hugetlbhog" node, maybe something in
my udev.

regards
sudip

2015-07-11 12:51:43

by Sudip Mukherjee

[permalink] [raw]
Subject: Re: [PATCH 4.0 00/55] 4.0.8-stable review

On Sat, Jul 11, 2015 at 04:33:42PM +0530, Sudip Mukherjee wrote:
> On Sat, Jul 11, 2015 at 03:17:12PM +0530, Sudip Mukherjee wrote:
> >
> > After this, kselftest had some more problem like:
> > 1) kcmp_test.c:13:24: fatal error: linux/kcmp.h: No such file or directory
> > 2) memfd_test.c:26:17: error: ‘__NR_memfd_create’ undeclared (first use
> > in this function)
> > 3) hugetlbfstest: hugetlbfstest.c:60: open_file: Assertion `fd > 2' failed.
> > Aborted (core dumped)
>
> 1 and 2 were just problems with headers in the userspace.
> 3 because I don't have "/dev/shm/hugetlbhog" node, maybe something in
> my udev.
Just an update. hugetlbfstest is failing because it is not able to create
/hugepages/hugetlbhog as i don't have /hugepages.

regards
sudip

2015-07-11 14:39:16

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 4.0 00/55] 4.0.8-stable review

On Sat, Jul 11, 2015 at 03:17:12PM +0530, Sudip Mukherjee wrote:
> On Fri, Jul 10, 2015 at 10:31:13AM -0700, Greg Kroah-Hartman wrote:
> > On Thu, Jul 09, 2015 at 09:51:04AM +0530, Sudip Mukherjee wrote:
> > > On Wed, Jul 08, 2015 at 12:34:35AM -0700, Greg Kroah-Hartman wrote:
> > > > This is the start of the stable review cycle for the 4.0.8 release.
> > > > There are 55 patches in this series, all will be posted as a response
> > > > to this one. If anyone has any issues with these being applied, please
> > > > let me know.
> > > >
> > > > Responses should be made by Fri Jul 10 07:31:26 UTC 2015.
> > > > Anything received after that time might be too late.
> > > Compiled and booted on x86_32. No errors in dmesg.
> > >
> > > But when tried sudo make kselftest, it gave:
> > > Makefile:29: *** missing separator. Stop.
> > > make: *** [kselftest] Error 2
> > >
> > > gcc version is 4.8.2
> >
> > Odd, is this a regression from 4.0.7?
> No. Much older. I tested till 4.0.6 and the error was still there.
> 60df4642a835 ("tools selftests: Fix 'clean' target with make 3.81") solved
> the problem.
> should i submit it for stable 4.0.9 or you will add it?

I'll add it, thanks.

greg k-h