2015-04-26 13:46:59

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 00/31] 3.10.76-stable review

This is the start of the stable review cycle for the 3.10.76 release.
There are 31 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Tue Apr 28 13:41:43 UTC 2015.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.10.76-rc1.gz
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <[email protected]>
Linux 3.10.76-rc1

Seth Jennings <[email protected]>
sb_edac: avoid INTERNAL ERROR message in EDAC with unspecified channel

Linus Torvalds <[email protected]>
x86: mm: move mmap_sem unlock from mm_fault_error() to caller

Linus Torvalds <[email protected]>
vm: make stack guard page errors return VM_FAULT_SIGSEGV rather than SIGBUS

Linus Torvalds <[email protected]>
vm: add VM_FAULT_SIGSEGV handling support

Al Viro <[email protected]>
deal with deadlock in d_walk()

Al Viro <[email protected]>
move d_rcu from overlapping d_child to overlapping d_alias

Peter Kümmel <[email protected]>
kconfig: Fix warning "‘jump’ may be used uninitialized"

Nadav Amit <[email protected]>
KVM: x86: SYSENTER emulation is broken

Florian Westphal <[email protected]>
netfilter: conntrack: disable generic tracking for known protocols

Marcel Holtmann <[email protected]>
Bluetooth: Ignore isochronous endpoints for Intel USB bootloader

Marcel Holtmann <[email protected]>
Bluetooth: Add support for Intel bootloader devices

Jurgen Kramer <[email protected]>
Bluetooth: btusb: Add IMC Networks (Broadcom based)

Oliver Neukum <[email protected]>
Bluetooth: Add firmware update for Atheros 0cf3:311f

Oliver Neukum <[email protected]>
Bluetooth: Enable Atheros 0cf3:311e for firmware upload

Kirill A. Shutemov <[email protected]>
mm: Fix NULL pointer dereference in madvise(MADV_WILLNEED) support

Ben Hutchings <[email protected]>
splice: Apply generic position and size checks to each write

Dave Kleikamp <[email protected]>
jfs: fix readdir regression

Peter Hurley <[email protected]>
serial: 8250_dw: Fix deadlock in LCR workaround

Eric W. Biederman <[email protected]>
benet: Call dev_kfree_skby_any instead of kfree_skb.

Eric W. Biederman <[email protected]>
ixgb: Call dev_kfree_skby_any instead of dev_kfree_skb.

Eric W. Biederman <[email protected]>
tg3: Call dev_kfree_skby_any instead of dev_kfree_skb.

Eric W. Biederman <[email protected]>
bnx2: Call dev_kfree_skby_any instead of dev_kfree_skb.

Eric W. Biederman <[email protected]>
r8169: Call dev_kfree_skby_any instead of dev_kfree_skb.

Eric W. Biederman <[email protected]>
8139too: Call dev_kfree_skby_any instead of dev_kfree_skb.

Eric W. Biederman <[email protected]>
8139cp: Call dev_kfree_skby_any instead of kfree_skb.

Eric Dumazet <[email protected]>
tcp: tcp_make_synack() should clear skb->tstamp

Neal Cardwell <[email protected]>
tcp: fix FRTO undo on cumulative ACK of SACKed range

D.S. Ljungmark <[email protected]>
ipv6: Don't reduce hop limit for an interface

Michal Kubeček <[email protected]>
tcp: prevent fetching dst twice in early demux code

Alex Elder <[email protected]>
remove extra definitions of U32_MAX

Alex Elder <[email protected]>
conditionally define U32_MAX


-------------

Diffstat:

Makefile | 4 +-
arch/alpha/mm/fault.c | 2 +
arch/arc/mm/fault.c | 2 +
arch/avr32/mm/fault.c | 2 +
arch/cris/mm/fault.c | 2 +
arch/frv/mm/fault.c | 2 +
arch/ia64/mm/fault.c | 2 +
arch/m32r/mm/fault.c | 2 +
arch/m68k/mm/fault.c | 2 +
arch/metag/mm/fault.c | 2 +
arch/microblaze/mm/fault.c | 2 +
arch/mips/mm/fault.c | 2 +
arch/mn10300/mm/fault.c | 2 +
arch/openrisc/mm/fault.c | 2 +
arch/parisc/mm/fault.c | 2 +
arch/powerpc/mm/fault.c | 2 +
arch/powerpc/platforms/cell/spu_fault.c | 2 +-
arch/powerpc/platforms/cell/spufs/inode.c | 2 +-
arch/s390/mm/fault.c | 6 +
arch/score/mm/fault.c | 2 +
arch/sh/mm/fault.c | 2 +
arch/sparc/mm/fault_32.c | 2 +
arch/sparc/mm/fault_64.c | 2 +
arch/tile/mm/fault.c | 2 +
arch/um/kernel/trap.c | 2 +
arch/x86/kvm/emulate.c | 27 ++---
arch/x86/mm/fault.c | 10 +-
arch/xtensa/mm/fault.c | 2 +
drivers/bluetooth/ath3k.c | 4 +
drivers/bluetooth/btusb.c | 13 +++
drivers/edac/sb_edac.c | 8 +-
drivers/net/ethernet/broadcom/bnx2.c | 6 +-
drivers/net/ethernet/broadcom/tg3.c | 14 +--
drivers/net/ethernet/emulex/benet/be_main.c | 2 +-
drivers/net/ethernet/intel/ixgb/ixgb_main.c | 6 +-
drivers/net/ethernet/realtek/8139cp.c | 2 +-
drivers/net/ethernet/realtek/8139too.c | 4 +-
drivers/net/ethernet/realtek/r8169.c | 6 +-
drivers/tty/serial/8250/8250_dw.c | 10 +-
fs/affs/amigaffs.c | 2 +-
fs/autofs4/expire.c | 12 +-
fs/autofs4/root.c | 2 +-
fs/ceph/dir.c | 8 +-
fs/ceph/inode.c | 6 +-
fs/cifs/inode.c | 2 +-
fs/coda/cache.c | 2 +-
fs/dcache.c | 172 ++++++++++++++++------------
fs/debugfs/inode.c | 6 +-
fs/exportfs/expfs.c | 2 +-
fs/jfs/jfs_dtree.c | 4 +-
fs/libfs.c | 12 +-
fs/ncpfs/dir.c | 2 +-
fs/ncpfs/ncplib_kernel.h | 4 +-
fs/nfs/getroot.c | 2 +-
fs/notify/fsnotify.c | 4 +-
fs/ocfs2/dcache.c | 2 +-
fs/ocfs2/file.c | 8 +-
fs/reiserfs/reiserfs.h | 2 -
fs/splice.c | 8 +-
include/asm-generic/pgtable.h | 5 +-
include/linux/ceph/decode.h | 17 ---
include/linux/dcache.h | 8 +-
include/linux/mm.h | 5 +-
kernel/cgroup.c | 2 +-
kernel/trace/trace.c | 4 +-
kernel/trace/trace_events.c | 2 +-
mm/ksm.c | 2 +-
mm/memory.c | 7 +-
net/ipv4/tcp_illinois.c | 1 -
net/ipv4/tcp_input.c | 7 +-
net/ipv4/tcp_ipv4.c | 2 +-
net/ipv4/tcp_output.c | 2 +
net/ipv6/ndisc.c | 9 +-
net/ipv6/tcp_ipv6.c | 2 +-
net/netfilter/nf_conntrack_proto_generic.c | 26 ++++-
scripts/kconfig/menu.c | 4 +-
security/selinux/selinuxfs.c | 6 +-
77 files changed, 324 insertions(+), 219 deletions(-)


2015-04-26 13:47:11

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 01/31] conditionally define U32_MAX

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: Alex Elder <[email protected]>

commit 77719536dc00f8fd8f5abe6dadbde5331c37f996 upstream.

The symbol U32_MAX is defined in several spots. Change these
definitions to be conditional. This is in preparation for the next
patch, which centralizes the definition in <linux/kernel.h>.

Signed-off-by: Alex Elder <[email protected]>
Cc: Sage Weil <[email protected]>
Cc: David Miller <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/reiserfs/reiserfs.h | 2 ++
include/linux/ceph/decode.h | 2 ++
net/ipv4/tcp_illinois.c | 2 ++
3 files changed, 6 insertions(+)

--- a/fs/reiserfs/reiserfs.h
+++ b/fs/reiserfs/reiserfs.h
@@ -1954,7 +1954,9 @@ struct treepath var = {.path_length = IL
#define MAX_US_INT 0xffff

// reiserfs version 2 has max offset 60 bits. Version 1 - 32 bit offset
+#ifndef U32_MAX
#define U32_MAX (~(__u32)0)
+#endif /* !U32_MAX */

static inline loff_t max_reiserfs_offset(struct inode *inode)
{
--- a/include/linux/ceph/decode.h
+++ b/include/linux/ceph/decode.h
@@ -10,6 +10,7 @@

/* This seemed to be the easiest place to define these */

+#ifndef U32_MAX
#define U8_MAX ((u8)(~0U))
#define U16_MAX ((u16)(~0U))
#define U32_MAX ((u32)(~0U))
@@ -24,6 +25,7 @@
#define S16_MIN ((s16)(-S16_MAX - 1))
#define S32_MIN ((s32)(-S32_MAX - 1))
#define S64_MIN ((s64)(-S64_MAX - 1LL))
+#endif /* !U32_MAX */

/*
* in all cases,
--- a/net/ipv4/tcp_illinois.c
+++ b/net/ipv4/tcp_illinois.c
@@ -23,7 +23,9 @@
#define ALPHA_MIN ((3*ALPHA_SCALE)/10) /* ~0.3 */
#define ALPHA_MAX (10*ALPHA_SCALE) /* 10.0 */
#define ALPHA_BASE ALPHA_SCALE /* 1.0 */
+#ifndef U32_MAX
#define U32_MAX ((u32)~0U)
+#endif /* !U32_MAX */
#define RTT_MAX (U32_MAX / ALPHA_MAX) /* 3.3 secs */

#define BETA_SHIFT 6

2015-04-26 13:47:43

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 02/31] remove extra definitions of U32_MAX

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: Alex Elder <[email protected]>

commit 04f9b74e4d96d349de12fdd4e6626af4a9f75e09 upstream.

Now that the definition is centralized in <linux/kernel.h>, the
definitions of U32_MAX (and related) elsewhere in the kernel can be
removed.

Signed-off-by: Alex Elder <[email protected]>
Acked-by: Sage Weil <[email protected]>
Acked-by: David S. Miller <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/reiserfs/reiserfs.h | 4 ----
include/linux/ceph/decode.h | 19 -------------------
net/ipv4/tcp_illinois.c | 3 ---
3 files changed, 26 deletions(-)

--- a/fs/reiserfs/reiserfs.h
+++ b/fs/reiserfs/reiserfs.h
@@ -1954,10 +1954,6 @@ struct treepath var = {.path_length = IL
#define MAX_US_INT 0xffff

// reiserfs version 2 has max offset 60 bits. Version 1 - 32 bit offset
-#ifndef U32_MAX
-#define U32_MAX (~(__u32)0)
-#endif /* !U32_MAX */
-
static inline loff_t max_reiserfs_offset(struct inode *inode)
{
if (get_inode_item_key_version(inode) == KEY_FORMAT_3_5)
--- a/include/linux/ceph/decode.h
+++ b/include/linux/ceph/decode.h
@@ -8,25 +8,6 @@

#include <linux/ceph/types.h>

-/* This seemed to be the easiest place to define these */
-
-#ifndef U32_MAX
-#define U8_MAX ((u8)(~0U))
-#define U16_MAX ((u16)(~0U))
-#define U32_MAX ((u32)(~0U))
-#define U64_MAX ((u64)(~0ULL))
-
-#define S8_MAX ((s8)(U8_MAX >> 1))
-#define S16_MAX ((s16)(U16_MAX >> 1))
-#define S32_MAX ((s32)(U32_MAX >> 1))
-#define S64_MAX ((s64)(U64_MAX >> 1LL))
-
-#define S8_MIN ((s8)(-S8_MAX - 1))
-#define S16_MIN ((s16)(-S16_MAX - 1))
-#define S32_MIN ((s32)(-S32_MAX - 1))
-#define S64_MIN ((s64)(-S64_MAX - 1LL))
-#endif /* !U32_MAX */
-
/*
* in all cases,
* void **p pointer to position pointer
--- a/net/ipv4/tcp_illinois.c
+++ b/net/ipv4/tcp_illinois.c
@@ -23,9 +23,6 @@
#define ALPHA_MIN ((3*ALPHA_SCALE)/10) /* ~0.3 */
#define ALPHA_MAX (10*ALPHA_SCALE) /* 10.0 */
#define ALPHA_BASE ALPHA_SCALE /* 1.0 */
-#ifndef U32_MAX
-#define U32_MAX ((u32)~0U)
-#endif /* !U32_MAX */
#define RTT_MAX (U32_MAX / ALPHA_MAX) /* 3.3 secs */

#define BETA_SHIFT 6

2015-04-26 13:59:42

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 03/31] tcp: prevent fetching dst twice in early demux code

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: =?UTF-8?q?Michal=20Kube=C4=8Dek?= <[email protected]>

[ Upstream commit d0c294c53a771ae7e84506dfbd8c18c30f078735 ]

On s390x, gcc 4.8 compiles this part of tcp_v6_early_demux()

struct dst_entry *dst = sk->sk_rx_dst;

if (dst)
dst = dst_check(dst, inet6_sk(sk)->rx_dst_cookie);

to code reading sk->sk_rx_dst twice, once for the test and once for
the argument of ip6_dst_check() (dst_check() is inline). This allows
ip6_dst_check() to be called with null first argument, causing a crash.

Protect sk->sk_rx_dst access by ACCESS_ONCE() both in IPv4 and IPv6
TCP early demux code.

Fixes: 41063e9dd119 ("ipv4: Early TCP socket demux.")
Fixes: c7109986db3c ("ipv6: Early TCP socket demux")
Signed-off-by: Michal Kubecek <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/tcp_ipv4.c | 2 +-
net/ipv6/tcp_ipv6.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1901,7 +1901,7 @@ void tcp_v4_early_demux(struct sk_buff *
skb->sk = sk;
skb->destructor = sock_edemux;
if (sk->sk_state != TCP_TIME_WAIT) {
- struct dst_entry *dst = sk->sk_rx_dst;
+ struct dst_entry *dst = ACCESS_ONCE(sk->sk_rx_dst);

if (dst)
dst = dst_check(dst, 0);
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1616,7 +1616,7 @@ static void tcp_v6_early_demux(struct sk
skb->sk = sk;
skb->destructor = sock_edemux;
if (sk->sk_state != TCP_TIME_WAIT) {
- struct dst_entry *dst = sk->sk_rx_dst;
+ struct dst_entry *dst = ACCESS_ONCE(sk->sk_rx_dst);

if (dst)
dst = dst_check(dst, inet6_sk(sk)->rx_dst_cookie);

2015-04-26 13:47:59

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 04/31] ipv6: Dont reduce hop limit for an interface

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: "D.S. Ljungmark" <[email protected]>

[ Upstream commit 6fd99094de2b83d1d4c8457f2c83483b2828e75a ]

A local route may have a lower hop_limit set than global routes do.

RFC 3756, Section 4.2.7, "Parameter Spoofing"

> 1. The attacker includes a Current Hop Limit of one or another small
> number which the attacker knows will cause legitimate packets to
> be dropped before they reach their destination.

> As an example, one possible approach to mitigate this threat is to
> ignore very small hop limits. The nodes could implement a
> configurable minimum hop limit, and ignore attempts to set it below
> said limit.

Signed-off-by: D.S. Ljungmark <[email protected]>
Acked-by: Hannes Frederic Sowa <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv6/ndisc.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1193,7 +1193,14 @@ static void ndisc_router_discovery(struc
if (rt)
rt6_set_expires(rt, jiffies + (HZ * lifetime));
if (ra_msg->icmph.icmp6_hop_limit) {
- in6_dev->cnf.hop_limit = ra_msg->icmph.icmp6_hop_limit;
+ /* Only set hop_limit on the interface if it is higher than
+ * the current hop_limit.
+ */
+ if (in6_dev->cnf.hop_limit < ra_msg->icmph.icmp6_hop_limit) {
+ in6_dev->cnf.hop_limit = ra_msg->icmph.icmp6_hop_limit;
+ } else {
+ ND_PRINTK(2, warn, "RA: Got route advertisement with lower hop_limit than current\n");
+ }
if (rt)
dst_metric_set(&rt->dst, RTAX_HOPLIMIT,
ra_msg->icmph.icmp6_hop_limit);

2015-04-26 13:59:14

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 05/31] tcp: fix FRTO undo on cumulative ACK of SACKed range

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: Neal Cardwell <[email protected]>

[ Upstream commit 666b805150efd62f05810ff0db08f44a2370c937 ]

On processing cumulative ACKs, the FRTO code was not checking the
SACKed bit, meaning that there could be a spurious FRTO undo on a
cumulative ACK of a previously SACKed skb.

The FRTO code should only consider a cumulative ACK to indicate that
an original/unretransmitted skb is newly ACKed if the skb was not yet
SACKed.

The effect of the spurious FRTO undo would typically be to make the
connection think that all previously-sent packets were in flight when
they really weren't, leading to a stall and an RTO.

Signed-off-by: Neal Cardwell <[email protected]>
Signed-off-by: Yuchung Cheng <[email protected]>
Fixes: e33099f96d99c ("tcp: implement RFC5682 F-RTO")
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/tcp_input.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3076,10 +3076,11 @@ static int tcp_clean_rtx_queue(struct so
if (seq_rtt < 0) {
seq_rtt = ca_seq_rtt;
}
- if (!(sacked & TCPCB_SACKED_ACKED))
+ if (!(sacked & TCPCB_SACKED_ACKED)) {
reord = min(pkts_acked, reord);
- if (!after(scb->end_seq, tp->high_seq))
- flag |= FLAG_ORIG_SACK_ACKED;
+ if (!after(scb->end_seq, tp->high_seq))
+ flag |= FLAG_ORIG_SACK_ACKED;
+ }
}

if (sacked & TCPCB_SACKED_ACKED)

2015-04-26 13:48:05

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 06/31] tcp: tcp_make_synack() should clear skb->tstamp

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <[email protected]>

[ Upstream commit b50edd7812852d989f2ef09dcfc729690f54a42d ]

I noticed tcpdump was giving funky timestamps for locally
generated SYNACK messages on loopback interface.

11:42:46.938990 IP 127.0.0.1.48245 > 127.0.0.2.23850: S
945476042:945476042(0) win 43690 <mss 65495,nop,nop,sackOK,nop,wscale 7>

20:28:58.502209 IP 127.0.0.2.23850 > 127.0.0.1.48245: S
3160535375:3160535375(0) ack 945476043 win 43690 <mss
65495,nop,nop,sackOK,nop,wscale 7>

This is because we need to clear skb->tstamp before
entering lower stack, otherwise net_timestamp_check()
does not set skb->tstamp.

Fixes: 7faee5c0d514 ("tcp: remove TCP_SKB_CB(skb)->when")
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/tcp_output.c | 2 ++
1 file changed, 2 insertions(+)

--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2772,6 +2772,8 @@ struct sk_buff *tcp_make_synack(struct s
}
#endif

+ /* Do not fool tcpdump (if any), clean our debris */
+ skb->tstamp.tv64 = 0;
return skb;
}
EXPORT_SYMBOL(tcp_make_synack);

2015-04-26 13:58:53

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 07/31] 8139cp: Call dev_kfree_skby_any instead of kfree_skb.

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: "Eric W. Biederman" <[email protected]>

Replace kfree_skb with dev_kfree_skb_any in cp_start_xmit
as it can be called in both hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/realtek/8139cp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/ethernet/realtek/8139cp.c
+++ b/drivers/net/ethernet/realtek/8139cp.c
@@ -899,7 +899,7 @@ out_unlock:

return NETDEV_TX_OK;
out_dma_error:
- kfree_skb(skb);
+ dev_kfree_skb_any(skb);
cp->dev->stats.tx_dropped++;
goto out_unlock;
}

2015-04-26 13:48:13

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 08/31] 8139too: Call dev_kfree_skby_any instead of dev_kfree_skb.

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: "Eric W. Biederman" <[email protected]>

Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/realtek/8139too.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/net/ethernet/realtek/8139too.c
+++ b/drivers/net/ethernet/realtek/8139too.c
@@ -1715,9 +1715,9 @@ static netdev_tx_t rtl8139_start_xmit (s
if (len < ETH_ZLEN)
memset(tp->tx_buf[entry], 0, ETH_ZLEN);
skb_copy_and_csum_dev(skb, tp->tx_buf[entry]);
- dev_kfree_skb(skb);
+ dev_kfree_skb_any(skb);
} else {
- dev_kfree_skb(skb);
+ dev_kfree_skb_any(skb);
dev->stats.tx_dropped++;
return NETDEV_TX_OK;
}

2015-04-26 13:58:33

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 09/31] r8169: Call dev_kfree_skby_any instead of dev_kfree_skb.

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: "Eric W. Biederman" <[email protected]>

Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/realtek/r8169.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -5768,7 +5768,7 @@ static void rtl8169_tx_clear_range(struc
tp->TxDescArray + entry);
if (skb) {
tp->dev->stats.tx_dropped++;
- dev_kfree_skb(skb);
+ dev_kfree_skb_any(skb);
tx_skb->skb = NULL;
}
}
@@ -5993,7 +5993,7 @@ static netdev_tx_t rtl8169_start_xmit(st
err_dma_1:
rtl8169_unmap_tx_skb(d, tp->tx_skb + entry, txd);
err_dma_0:
- dev_kfree_skb(skb);
+ dev_kfree_skb_any(skb);
err_update_stats:
dev->stats.tx_dropped++;
return NETDEV_TX_OK;
@@ -6076,7 +6076,7 @@ static void rtl_tx(struct net_device *de
tp->tx_stats.packets++;
tp->tx_stats.bytes += tx_skb->skb->len;
u64_stats_update_end(&tp->tx_stats.syncp);
- dev_kfree_skb(tx_skb->skb);
+ dev_kfree_skb_any(tx_skb->skb);
tx_skb->skb = NULL;
}
dirty_tx++;

2015-04-26 13:47:16

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 10/31] bnx2: Call dev_kfree_skby_any instead of dev_kfree_skb.

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: "Eric W. Biederman" <[email protected]>

Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/broadcom/bnx2.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -2869,7 +2869,7 @@ bnx2_tx_int(struct bnx2 *bp, struct bnx2
sw_cons = BNX2_NEXT_TX_BD(sw_cons);

tx_bytes += skb->len;
- dev_kfree_skb(skb);
+ dev_kfree_skb_any(skb);
tx_pkt++;
if (tx_pkt == budget)
break;
@@ -6610,7 +6610,7 @@ bnx2_start_xmit(struct sk_buff *skb, str

mapping = dma_map_single(&bp->pdev->dev, skb->data, len, PCI_DMA_TODEVICE);
if (dma_mapping_error(&bp->pdev->dev, mapping)) {
- dev_kfree_skb(skb);
+ dev_kfree_skb_any(skb);
return NETDEV_TX_OK;
}

@@ -6703,7 +6703,7 @@ dma_error:
PCI_DMA_TODEVICE);
}

- dev_kfree_skb(skb);
+ dev_kfree_skb_any(skb);
return NETDEV_TX_OK;
}


2015-04-26 13:47:19

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 11/31] tg3: Call dev_kfree_skby_any instead of dev_kfree_skb.

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: "Eric W. Biederman" <[email protected]>

Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/broadcom/tg3.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -6437,7 +6437,7 @@ static void tg3_tx(struct tg3_napi *tnap
pkts_compl++;
bytes_compl += skb->len;

- dev_kfree_skb(skb);
+ dev_kfree_skb_any(skb);

if (unlikely(tx_bug)) {
tg3_tx_recover(tp);
@@ -6769,7 +6769,7 @@ static int tg3_rx(struct tg3_napi *tnapi
if (len > (tp->dev->mtu + ETH_HLEN) &&
skb->protocol != htons(ETH_P_8021Q) &&
skb->protocol != htons(ETH_P_8021AD)) {
- dev_kfree_skb(skb);
+ dev_kfree_skb_any(skb);
goto drop_it_no_recycle;
}

@@ -7652,7 +7652,7 @@ static int tigon3_dma_hwbug_workaround(s
PCI_DMA_TODEVICE);
/* Make sure the mapping succeeded */
if (pci_dma_mapping_error(tp->pdev, new_addr)) {
- dev_kfree_skb(new_skb);
+ dev_kfree_skb_any(new_skb);
ret = -1;
} else {
u32 save_entry = *entry;
@@ -7667,13 +7667,13 @@ static int tigon3_dma_hwbug_workaround(s
new_skb->len, base_flags,
mss, vlan)) {
tg3_tx_skb_unmap(tnapi, save_entry, -1);
- dev_kfree_skb(new_skb);
+ dev_kfree_skb_any(new_skb);
ret = -1;
}
}
}

- dev_kfree_skb(skb);
+ dev_kfree_skb_any(skb);
*pskb = new_skb;
return ret;
}
@@ -7716,7 +7716,7 @@ static int tg3_tso_bug(struct tg3 *tp, s
} while (segs);

tg3_tso_bug_end:
- dev_kfree_skb(skb);
+ dev_kfree_skb_any(skb);

return NETDEV_TX_OK;
}
@@ -7954,7 +7954,7 @@ dma_error:
tg3_tx_skb_unmap(tnapi, tnapi->tx_prod, --i);
tnapi->tx_buffers[tnapi->tx_prod].skb = NULL;
drop:
- dev_kfree_skb(skb);
+ dev_kfree_skb_any(skb);
drop_nofree:
tp->tx_dropped++;
return NETDEV_TX_OK;

2015-04-26 14:22:27

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 12/31] ixgb: Call dev_kfree_skby_any instead of dev_kfree_skb.

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: "Eric W. Biederman" <[email protected]>

Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/intel/ixgb/ixgb_main.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

--- a/drivers/net/ethernet/intel/ixgb/ixgb_main.c
+++ b/drivers/net/ethernet/intel/ixgb/ixgb_main.c
@@ -1527,12 +1527,12 @@ ixgb_xmit_frame(struct sk_buff *skb, str
int tso;

if (test_bit(__IXGB_DOWN, &adapter->flags)) {
- dev_kfree_skb(skb);
+ dev_kfree_skb_any(skb);
return NETDEV_TX_OK;
}

if (skb->len <= 0) {
- dev_kfree_skb(skb);
+ dev_kfree_skb_any(skb);
return NETDEV_TX_OK;
}

@@ -1549,7 +1549,7 @@ ixgb_xmit_frame(struct sk_buff *skb, str

tso = ixgb_tso(adapter, skb);
if (tso < 0) {
- dev_kfree_skb(skb);
+ dev_kfree_skb_any(skb);
return NETDEV_TX_OK;
}


2015-04-26 14:02:03

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 13/31] benet: Call dev_kfree_skby_any instead of kfree_skb.

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: "Eric W. Biederman" <[email protected]>

Replace free_skb with dev_kfree_skb_any in be_tx_compl_process as
which can be called in hard irq by netpoll, softirq context
by normal napi polling, and in normal sleepable context
by the network device close method.

Signed-off-by: "Eric W. Biederman" <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/emulex/benet/be_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -1767,7 +1767,7 @@ static u16 be_tx_compl_process(struct be
queue_tail_inc(txq);
} while (cur_index != last_index);

- kfree_skb(sent_skb);
+ dev_kfree_skb_any(sent_skb);
return num_wrbs;
}


2015-04-26 13:47:23

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 14/31] serial: 8250_dw: Fix deadlock in LCR workaround

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: Peter Hurley <[email protected]>

commit 7fd6f640f2dd17dac6ddd6702c378cb0bb9cfa11 upstream.

Trying to write console output from within the serial console driver
while the port->lock is held causes recursive deadlock:

CPU 0
spin_lock_irqsave(&port->lock)
printk()
console_unlock()
call_console_drivers()
serial8250_console_write()
spin_lock_irqsave(&port->lock)
** DEADLOCK **

The 8250_dw i/o accessors try to write a console error message if the
LCR workaround was unsuccessful. When the port->lock is already held
(eg., when called from serial8250_set_termios()), this deadlocks.

Make the error message a FIXME until a general solution is devised.

Cc: Tim Kryger <[email protected]>
Reported-by: Zhang Zhen <[email protected]>
Signed-off-by: Peter Hurley <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/tty/serial/8250/8250_dw.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

--- a/drivers/tty/serial/8250/8250_dw.c
+++ b/drivers/tty/serial/8250/8250_dw.c
@@ -98,7 +98,10 @@ static void dw8250_serial_out(struct uar
dw8250_force_idle(p);
writeb(value, p->membase + (UART_LCR << p->regshift));
}
- dev_err(p->dev, "Couldn't set LCR to %d\n", value);
+ /*
+ * FIXME: this deadlocks if port->lock is already held
+ * dev_err(p->dev, "Couldn't set LCR to %d\n", value);
+ */
}
}

@@ -128,7 +131,10 @@ static void dw8250_serial_out32(struct u
dw8250_force_idle(p);
writel(value, p->membase + (UART_LCR << p->regshift));
}
- dev_err(p->dev, "Couldn't set LCR to %d\n", value);
+ /*
+ * FIXME: this deadlocks if port->lock is already held
+ * dev_err(p->dev, "Couldn't set LCR to %d\n", value);
+ */
}
}


2015-04-26 13:47:26

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 15/31] jfs: fix readdir regression

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dave Kleikamp <[email protected]>

Upstream commit 44512449, "jfs: fix readdir cookie incompatibility
with NFSv4", was backported incorrectly into the stable trees which
used the filldir callback (rather than dir_emit). The position is
being incorrectly passed to filldir for the . and .. entries.

The still-maintained stable trees that need to be fixed are 3.2.y,
3.4.y and 3.10.y.

https://bugzilla.kernel.org/show_bug.cgi?id=94741

Signed-off-by: Dave Kleikamp <[email protected]>
Cc: [email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/jfs/jfs_dtree.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/fs/jfs/jfs_dtree.c
+++ b/fs/jfs/jfs_dtree.c
@@ -3103,7 +3103,7 @@ int jfs_readdir(struct file *filp, void
* self "."
*/
filp->f_pos = 1;
- if (filldir(dirent, ".", 1, 0, ip->i_ino,
+ if (filldir(dirent, ".", 1, 1, ip->i_ino,
DT_DIR))
return 0;
}
@@ -3111,7 +3111,7 @@ int jfs_readdir(struct file *filp, void
* parent ".."
*/
filp->f_pos = 2;
- if (filldir(dirent, "..", 2, 1, PARENT(ip), DT_DIR))
+ if (filldir(dirent, "..", 2, 2, PARENT(ip), DT_DIR))
return 0;

/*

2015-04-26 14:01:21

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 16/31] splice: Apply generic position and size checks to each write

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ben Hutchings <[email protected]>

commit 894c6350eaad7e613ae267504014a456e00a3e2a from the 3.2-stable branch.

We need to check the position and size of file writes against various
limits, using generic_write_check(). This was not being done for
the splice write path. It was fixed upstream by commit 8d0207652cbe
("->splice_write() via ->write_iter()") but we can't apply that.

CVE-2014-7822

Signed-off-by: Ben Hutchings <[email protected]>
[Ben fixed it in 3.2 stable, i ported it to 3.10 stable]
Signed-off-by: Zhang Zhen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/ocfs2/file.c | 8 +++++---
fs/splice.c | 8 ++++++--
2 files changed, 11 insertions(+), 5 deletions(-)

--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -2459,12 +2459,14 @@ static ssize_t ocfs2_file_splice_write(s
struct address_space *mapping = out->f_mapping;
struct inode *inode = mapping->host;
struct splice_desc sd = {
- .total_len = len,
.flags = flags,
- .pos = *ppos,
.u.file = out,
};
-
+ ret = generic_write_checks(out, ppos, &len, 0);
+ if(ret)
+ return ret;
+ sd.total_len = len;
+ sd.pos = *ppos;

trace_ocfs2_file_splice_write(inode, out, out->f_path.dentry,
(unsigned long long)OCFS2_I(inode)->ip_blkno,
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -1012,13 +1012,17 @@ generic_file_splice_write(struct pipe_in
struct address_space *mapping = out->f_mapping;
struct inode *inode = mapping->host;
struct splice_desc sd = {
- .total_len = len,
.flags = flags,
- .pos = *ppos,
.u.file = out,
};
ssize_t ret;

+ ret = generic_write_checks(out, ppos, &len, S_ISBLK(inode->i_mode));
+ if (ret)
+ return ret;
+ sd.total_len = len;
+ sd.pos = *ppos;
+
pipe_lock(pipe);

splice_from_pipe_begin(&sd);

2015-04-26 13:47:33

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 17/31] mm: Fix NULL pointer dereference in madvise(MADV_WILLNEED) support

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: "Kirill A. Shutemov" <[email protected]>

commit ee53664bda169f519ce3c6a22d378f0b946c8178 upstream.

Sasha Levin found a NULL pointer dereference that is due to a missing
page table lock, which in turn is due to the pmd entry in question being
a transparent huge-table entry.

The code - introduced in commit 1998cc048901 ("mm: make
madvise(MADV_WILLNEED) support swap file prefetch") - correctly checks
for this situation using pmd_none_or_trans_huge_or_clear_bad(), but it
turns out that that function doesn't work correctly.

pmd_none_or_trans_huge_or_clear_bad() expected that pmd_bad() would
trigger if the transparent hugepage bit was set, but it doesn't do that
if pmd_numa() is also set. Note that the NUMA bit only gets set on real
NUMA machines, so people trying to reproduce this on most normal
development systems would never actually trigger this.

Fix it by removing the very subtle (and subtly incorrect) expectation,
and instead just checking pmd_trans_huge() explicitly.

Reported-by: Sasha Levin <[email protected]>
Acked-by: Andrea Arcangeli <[email protected]>
[ Additionally remove the now stale test for pmd_trans_huge() inside the
pmd_bad() case - Linus ]
Signed-off-by: Linus Torvalds <[email protected]>
Cc: Wang Long <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
include/asm-generic/pgtable.h | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -550,11 +550,10 @@ static inline int pmd_none_or_trans_huge
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
barrier();
#endif
- if (pmd_none(pmdval))
+ if (pmd_none(pmdval) || pmd_trans_huge(pmdval))
return 1;
if (unlikely(pmd_bad(pmdval))) {
- if (!pmd_trans_huge(pmdval))
- pmd_clear_bad(pmd);
+ pmd_clear_bad(pmd);
return 1;
}
return 0;

2015-04-26 14:00:54

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 18/31] Bluetooth: Enable Atheros 0cf3:311e for firmware upload

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: Oliver Neukum <[email protected]>

commit b131237ca3995edad9efc162d0bc959c3b1dddc2 upstream.

The device will bind to btusb without firmware, but with the original
buggy firmware device discovery does not work. No devices are detected.

Device descriptor without firmware:
T: Bus=03 Lev=01 Prnt=01 Port=02 Cnt=01 Dev#= 2 Spd=12 MxCh= 0
D: Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1
P: Vendor=0cf3 ProdID=311e Rev= 0.01
C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms
E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms
I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms
I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms
I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms
I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms
I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms

with firmware:
T: Bus=03 Lev=01 Prnt=01 Port=02 Cnt=01 Dev#= 3 Spd=12 MxCh= 0
D: Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1
P: Vendor=0cf3 ProdID=311e Rev= 0.02
C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms
E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms
I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms
I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms
I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms
I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms
I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms

Signed-off-by: Oliver Neukum <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/bluetooth/ath3k.c | 2 ++
drivers/bluetooth/btusb.c | 1 +
2 files changed, 3 insertions(+)

--- a/drivers/bluetooth/ath3k.c
+++ b/drivers/bluetooth/ath3k.c
@@ -77,6 +77,7 @@ static struct usb_device_id ath3k_table[
{ USB_DEVICE(0x0CF3, 0x3004) },
{ USB_DEVICE(0x0CF3, 0x3008) },
{ USB_DEVICE(0x0CF3, 0x311D) },
+ { USB_DEVICE(0x0CF3, 0x311E) },
{ USB_DEVICE(0x0CF3, 0x817a) },
{ USB_DEVICE(0x13d3, 0x3375) },
{ USB_DEVICE(0x04CA, 0x3004) },
@@ -120,6 +121,7 @@ static struct usb_device_id ath3k_blist_
{ USB_DEVICE(0x0cf3, 0x3004), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x0cf3, 0x3008), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x0cf3, 0x311D), .driver_info = BTUSB_ATH3012 },
+ { USB_DEVICE(0x0cf3, 0x311E), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x0CF3, 0x817a), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x13d3, 0x3375), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x04ca, 0x3004), .driver_info = BTUSB_ATH3012 },
--- a/drivers/bluetooth/btusb.c
+++ b/drivers/bluetooth/btusb.c
@@ -141,6 +141,7 @@ static struct usb_device_id blacklist_ta
{ USB_DEVICE(0x0cf3, 0x3004), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x0cf3, 0x3008), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x0cf3, 0x311d), .driver_info = BTUSB_ATH3012 },
+ { USB_DEVICE(0x0cf3, 0x311e), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x0cf3, 0x817a), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x13d3, 0x3375), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x04ca, 0x3004), .driver_info = BTUSB_ATH3012 },

2015-04-26 13:47:39

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 19/31] Bluetooth: Add firmware update for Atheros 0cf3:311f

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: Oliver Neukum <[email protected]>

commit 1e56f1eb2bbeab0ddc3a1e536d2a0065cfe4c131 upstream.

The device is not functional without firmware.

The device without firmware:
T: Bus=02 Lev=02 Prnt=02 Port=05 Cnt=01 Dev#= 3 Spd=12 MxCh= 0
D: Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1
P: Vendor=0cf3 ProdID=311f Rev=00.01
C: #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
I: If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
I: If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb

The device with firmware:
T: Bus=02 Lev=02 Prnt=02 Port=05 Cnt=01 Dev#= 4 Spd=12 MxCh= 0
D: Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1
P: Vendor=0cf3 ProdID=3007 Rev=00.01
C: #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
I: If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
I: If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb

Signed-off-by: Oliver Neukum <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/bluetooth/ath3k.c | 2 ++
drivers/bluetooth/btusb.c | 1 +
2 files changed, 3 insertions(+)

--- a/drivers/bluetooth/ath3k.c
+++ b/drivers/bluetooth/ath3k.c
@@ -78,6 +78,7 @@ static struct usb_device_id ath3k_table[
{ USB_DEVICE(0x0CF3, 0x3008) },
{ USB_DEVICE(0x0CF3, 0x311D) },
{ USB_DEVICE(0x0CF3, 0x311E) },
+ { USB_DEVICE(0x0CF3, 0x311F) },
{ USB_DEVICE(0x0CF3, 0x817a) },
{ USB_DEVICE(0x13d3, 0x3375) },
{ USB_DEVICE(0x04CA, 0x3004) },
@@ -122,6 +123,7 @@ static struct usb_device_id ath3k_blist_
{ USB_DEVICE(0x0cf3, 0x3008), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x0cf3, 0x311D), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x0cf3, 0x311E), .driver_info = BTUSB_ATH3012 },
+ { USB_DEVICE(0x0cf3, 0x311F), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x0CF3, 0x817a), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x13d3, 0x3375), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x04ca, 0x3004), .driver_info = BTUSB_ATH3012 },
--- a/drivers/bluetooth/btusb.c
+++ b/drivers/bluetooth/btusb.c
@@ -142,6 +142,7 @@ static struct usb_device_id blacklist_ta
{ USB_DEVICE(0x0cf3, 0x3008), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x0cf3, 0x311d), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x0cf3, 0x311e), .driver_info = BTUSB_ATH3012 },
+ { USB_DEVICE(0x0cf3, 0x311f), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x0cf3, 0x817a), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x13d3, 0x3375), .driver_info = BTUSB_ATH3012 },
{ USB_DEVICE(0x04ca, 0x3004), .driver_info = BTUSB_ATH3012 },

2015-04-26 13:47:47

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 20/31] Bluetooth: btusb: Add IMC Networks (Broadcom based)

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jurgen Kramer <[email protected]>

commit 9113bfd82dc8ece9cbb898df8794f58a78a36e97 upstream.

Add support for IMC Networks (Broadcom based) to btusb driver.

Below the output of /sys/kernel/debug/usb/devices for this device:

T: Bus=01 Lev=02 Prnt=02 Port=04 Cnt=01 Dev#= 3 Spd=12 MxCh= 0
D: Ver= 2.00 Cls=ff(vend.) Sub=01 Prot=01 MxPS=64 #Cfgs= 1
P: Vendor=13d3 ProdID=3404 Rev= 1.12
S: Manufacturer=Broadcom Corp
S: Product=BCM20702A0
S: SerialNumber=240A649F8246
C:* #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr= 0mA
I:* If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb
E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms
E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms
I: If#= 1 Alt= 1 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms
I: If#= 1 Alt= 2 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms
I: If#= 1 Alt= 3 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms
I: If#= 1 Alt= 4 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms
I: If#= 1 Alt= 5 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb
E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms
E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms
I:* If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
E: Ad=84(I) Atr=02(Bulk) MxPS= 32 Ivl=0ms
E: Ad=04(O) Atr=02(Bulk) MxPS= 32 Ivl=0ms
I:* If#= 3 Alt= 0 #EPs= 0 Cls=fe(app. ) Sub=01 Prot=01 Driver=(none)

Signed-off-by: Jurgen Kramer <[email protected]>
Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/bluetooth/btusb.c | 3 +++
1 file changed, 3 insertions(+)

--- a/drivers/bluetooth/btusb.c
+++ b/drivers/bluetooth/btusb.c
@@ -113,6 +113,9 @@ static struct usb_device_id btusb_table[
/*Broadcom devices with vendor specific id */
{ USB_VENDOR_AND_INTERFACE_INFO(0x0a5c, 0xff, 0x01, 0x01) },

+ /* IMC Networks - Broadcom based */
+ { USB_VENDOR_AND_INTERFACE_INFO(0x13d3, 0xff, 0x01, 0x01) },
+
{ } /* Terminating entry */
};


2015-04-26 14:00:01

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 21/31] Bluetooth: Add support for Intel bootloader devices

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: Marcel Holtmann <[email protected]>

commit 40df783d1ef1989ac454e3dfcda017270b8950e6 upstream.

Intel Bluetooth devices that boot up in bootloader mode can not
be used as generic HCI devices, but their HCI transport is still
valuable and so bring that up as raw-only devices.

T: Bus=02 Lev=02 Prnt=03 Port=00 Cnt=01 Dev#= 14 Spd=12 MxCh= 0
D: Ver= 1.10 Cls=ff(vend.) Sub=00 Prot=00 MxPS=64 #Cfgs= 1
P: Vendor=8087 ProdID=0a5a Rev= 0.00
S: Manufacturer=Intel(R) Corporation
S: Product=Intel(R) Wilkins Peak 2x2
S: SerialNumber=001122334455 WP_A0
C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
I:* If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
E: Ad=81(I) Atr=03(Int.) MxPS= 64 Ivl=1ms
E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms
E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms
E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms
I: If#= 1 Alt= 1 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms
E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms
I: If#= 1 Alt= 2 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms
E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms
I: If#= 1 Alt= 3 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms
E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms
I: If#= 1 Alt= 4 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms
E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms
I: If#= 1 Alt= 5 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms
E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms

Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Johan Hedberg <[email protected]>
[bwh: Backported to 3.14: adjust context]
Signed-off-by: Johan Hedberg <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/bluetooth/btusb.c | 7 +++++++
1 file changed, 7 insertions(+)

--- a/drivers/bluetooth/btusb.c
+++ b/drivers/bluetooth/btusb.c
@@ -49,6 +49,7 @@ static struct usb_driver btusb_driver;
#define BTUSB_WRONG_SCO_MTU 0x40
#define BTUSB_ATH3012 0x80
#define BTUSB_INTEL 0x100
+#define BTUSB_INTEL_BOOT 0x200

static struct usb_device_id btusb_table[] = {
/* Generic Bluetooth USB device */
@@ -116,6 +117,9 @@ static struct usb_device_id btusb_table[
/* IMC Networks - Broadcom based */
{ USB_VENDOR_AND_INTERFACE_INFO(0x13d3, 0xff, 0x01, 0x01) },

+ /* Intel Bluetooth USB Bootloader (RAM module) */
+ { USB_DEVICE(0x8087, 0x0a5a), .driver_info = BTUSB_INTEL_BOOT },
+
{ } /* Terminating entry */
};

@@ -1449,6 +1453,9 @@ static int btusb_probe(struct usb_interf
if (id->driver_info & BTUSB_INTEL)
hdev->setup = btusb_setup_intel;

+ if (id->driver_info & BTUSB_INTEL_BOOT)
+ set_bit(HCI_QUIRK_RAW_DEVICE, &hdev->quirks);
+
/* Interface numbers are hardcoded in the specification */
data->isoc = usb_ifnum_to_if(data->udev, 1);


2015-04-26 13:47:54

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 22/31] Bluetooth: Ignore isochronous endpoints for Intel USB bootloader

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: Marcel Holtmann <[email protected]>

commit d92f2df0565ea04101d6ac04bdc10feeb1d93c94 upstream.

The isochronous endpoints are not valid when the Intel Bluetooth
controller boots up in bootloader mode. So just mark these endpoints
as broken and then they will not be configured.

Signed-off-by: Marcel Holtmann <[email protected]>
Signed-off-by: Johan Hedberg <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/bluetooth/btusb.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/bluetooth/btusb.c
+++ b/drivers/bluetooth/btusb.c
@@ -118,7 +118,8 @@ static struct usb_device_id btusb_table[
{ USB_VENDOR_AND_INTERFACE_INFO(0x13d3, 0xff, 0x01, 0x01) },

/* Intel Bluetooth USB Bootloader (RAM module) */
- { USB_DEVICE(0x8087, 0x0a5a), .driver_info = BTUSB_INTEL_BOOT },
+ { USB_DEVICE(0x8087, 0x0a5a),
+ .driver_info = BTUSB_INTEL_BOOT | BTUSB_BROKEN_ISOC },

{ } /* Terminating entry */
};

2015-04-26 13:48:45

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 23/31] netfilter: conntrack: disable generic tracking for known protocols

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: Florian Westphal <[email protected]>

commit db29a9508a9246e77087c5531e45b2c88ec6988b upstream.

Given following iptables ruleset:

-P FORWARD DROP
-A FORWARD -m sctp --dport 9 -j ACCEPT
-A FORWARD -p tcp --dport 80 -j ACCEPT
-A FORWARD -p tcp -m conntrack -m state ESTABLISHED,RELATED -j ACCEPT

One would assume that this allows SCTP on port 9 and TCP on port 80.
Unfortunately, if the SCTP conntrack module is not loaded, this allows
*all* SCTP communication, to pass though, i.e. -p sctp -j ACCEPT,
which we think is a security issue.

This is because on the first SCTP packet on port 9, we create a dummy
"generic l4" conntrack entry without any port information (since
conntrack doesn't know how to extract this information).

All subsequent packets that are unknown will then be in established
state since they will fallback to proto_generic and will match the
'generic' entry.

Our originally proposed version [1] completely disabled generic protocol
tracking, but Jozsef suggests to not track protocols for which a more
suitable helper is available, hence we now mitigate the issue for in
tree known ct protocol helpers only, so that at least NAT and direction
information will still be preserved for others.

[1] http://www.spinics.net/lists/netfilter-devel/msg33430.html

Joint work with Daniel Borkmann.

Fixes CVE-2014-8160.

Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Jozsef Kadlecsik <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Zhiqiang Zhang <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/netfilter/nf_conntrack_proto_generic.c | 26 +++++++++++++++++++++++++-
1 file changed, 25 insertions(+), 1 deletion(-)

--- a/net/netfilter/nf_conntrack_proto_generic.c
+++ b/net/netfilter/nf_conntrack_proto_generic.c
@@ -14,6 +14,30 @@

static unsigned int nf_ct_generic_timeout __read_mostly = 600*HZ;

+static bool nf_generic_should_process(u8 proto)
+{
+ switch (proto) {
+#ifdef CONFIG_NF_CT_PROTO_SCTP_MODULE
+ case IPPROTO_SCTP:
+ return false;
+#endif
+#ifdef CONFIG_NF_CT_PROTO_DCCP_MODULE
+ case IPPROTO_DCCP:
+ return false;
+#endif
+#ifdef CONFIG_NF_CT_PROTO_GRE_MODULE
+ case IPPROTO_GRE:
+ return false;
+#endif
+#ifdef CONFIG_NF_CT_PROTO_UDPLITE_MODULE
+ case IPPROTO_UDPLITE:
+ return false;
+#endif
+ default:
+ return true;
+ }
+}
+
static inline struct nf_generic_net *generic_pernet(struct net *net)
{
return &net->ct.nf_ct_proto.generic;
@@ -67,7 +91,7 @@ static int generic_packet(struct nf_conn
static bool generic_new(struct nf_conn *ct, const struct sk_buff *skb,
unsigned int dataoff, unsigned int *timeouts)
{
- return true;
+ return nf_generic_should_process(nf_ct_protonum(ct));
}

#if IS_ENABLED(CONFIG_NF_CT_NETLINK_TIMEOUT)

2015-04-26 13:48:17

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 24/31] KVM: x86: SYSENTER emulation is broken

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: Nadav Amit <[email protected]>

commit f3747379accba8e95d70cec0eae0582c8c182050 upstream.

SYSENTER emulation is broken in several ways:
1. It misses the case of 16-bit code segments completely (CVE-2015-0239).
2. MSR_IA32_SYSENTER_CS is checked in 64-bit mode incorrectly (bits 0 and 1 can
still be set without causing #GP).
3. MSR_IA32_SYSENTER_EIP and MSR_IA32_SYSENTER_ESP are not masked in
legacy-mode.
4. There is some unneeded code.

Fix it.

Signed-off-by: Nadav Amit <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
[zhangzhiqiang: backport to 3.10:
- adjust context
- in 3.10 context "ctxt->eflags &= ~(EFLG_VM | EFLG_IF | EFLG_RF)" is replaced by
"ctxt->eflags &= ~(EFLG_VM | EFLG_IF)" in upstream, which was changed by another commit.
- After the above adjustments, becomes same to the original patch:
https://github.com/torvalds/linux/commit/f3747379accba8e95d70cec0eae0582c8c182050
]
Signed-off-by: Zhiqiang Zhang <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kvm/emulate.c | 27 ++++++++-------------------
1 file changed, 8 insertions(+), 19 deletions(-)

--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2450,7 +2450,7 @@ static int em_sysenter(struct x86_emulat
* Not recognized on AMD in compat mode (but is recognized in legacy
* mode).
*/
- if ((ctxt->mode == X86EMUL_MODE_PROT32) && (efer & EFER_LMA)
+ if ((ctxt->mode != X86EMUL_MODE_PROT64) && (efer & EFER_LMA)
&& !vendor_intel(ctxt))
return emulate_ud(ctxt);

@@ -2463,25 +2463,13 @@ static int em_sysenter(struct x86_emulat
setup_syscalls_segments(ctxt, &cs, &ss);

ops->get_msr(ctxt, MSR_IA32_SYSENTER_CS, &msr_data);
- switch (ctxt->mode) {
- case X86EMUL_MODE_PROT32:
- if ((msr_data & 0xfffc) == 0x0)
- return emulate_gp(ctxt, 0);
- break;
- case X86EMUL_MODE_PROT64:
- if (msr_data == 0x0)
- return emulate_gp(ctxt, 0);
- break;
- default:
- break;
- }
+ if ((msr_data & 0xfffc) == 0x0)
+ return emulate_gp(ctxt, 0);

ctxt->eflags &= ~(EFLG_VM | EFLG_IF | EFLG_RF);
- cs_sel = (u16)msr_data;
- cs_sel &= ~SELECTOR_RPL_MASK;
+ cs_sel = (u16)msr_data & ~SELECTOR_RPL_MASK;
ss_sel = cs_sel + 8;
- ss_sel &= ~SELECTOR_RPL_MASK;
- if (ctxt->mode == X86EMUL_MODE_PROT64 || (efer & EFER_LMA)) {
+ if (efer & EFER_LMA) {
cs.d = 0;
cs.l = 1;
}
@@ -2490,10 +2478,11 @@ static int em_sysenter(struct x86_emulat
ops->set_segment(ctxt, ss_sel, &ss, 0, VCPU_SREG_SS);

ops->get_msr(ctxt, MSR_IA32_SYSENTER_EIP, &msr_data);
- ctxt->_eip = msr_data;
+ ctxt->_eip = (efer & EFER_LMA) ? msr_data : (u32)msr_data;

ops->get_msr(ctxt, MSR_IA32_SYSENTER_ESP, &msr_data);
- *reg_write(ctxt, VCPU_REGS_RSP) = msr_data;
+ *reg_write(ctxt, VCPU_REGS_RSP) = (efer & EFER_LMA) ? msr_data :
+ (u32)msr_data;

return X86EMUL_CONTINUE;
}

2015-04-26 13:48:25

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 26/31] move d_rcu from overlapping d_child to overlapping d_alias

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: Al Viro <[email protected]>

commit 946e51f2bf37f1656916eb75bd0742ba33983c28 upstream.

Signed-off-by: Al Viro <[email protected]>
Cc: Ben Hutchings <[email protected]>
[hujianyang: Backported to 3.10 refer to the work of Ben Hutchings in 3.2:
- Apply name changes in all the different places we use d_alias and d_child
- Move the WARN_ON() in __d_free() to d_free() as we don't have dentry_free()]
Signed-off-by: hujianyang <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/powerpc/platforms/cell/spufs/inode.c | 2
fs/affs/amigaffs.c | 2
fs/autofs4/expire.c | 12 ++---
fs/autofs4/root.c | 2
fs/ceph/dir.c | 8 +--
fs/ceph/inode.c | 6 +-
fs/cifs/inode.c | 2
fs/coda/cache.c | 2
fs/dcache.c | 72 +++++++++++++++---------------
fs/debugfs/inode.c | 6 +-
fs/exportfs/expfs.c | 2
fs/libfs.c | 12 ++---
fs/ncpfs/dir.c | 2
fs/ncpfs/ncplib_kernel.h | 4 -
fs/nfs/getroot.c | 2
fs/notify/fsnotify.c | 4 -
fs/ocfs2/dcache.c | 2
include/linux/dcache.h | 8 +--
kernel/cgroup.c | 2
kernel/trace/trace.c | 4 -
kernel/trace/trace_events.c | 2
security/selinux/selinuxfs.c | 6 +-
22 files changed, 82 insertions(+), 82 deletions(-)

--- a/arch/powerpc/platforms/cell/spufs/inode.c
+++ b/arch/powerpc/platforms/cell/spufs/inode.c
@@ -164,7 +164,7 @@ static void spufs_prune_dir(struct dentr
struct dentry *dentry, *tmp;

mutex_lock(&dir->d_inode->i_mutex);
- list_for_each_entry_safe(dentry, tmp, &dir->d_subdirs, d_u.d_child) {
+ list_for_each_entry_safe(dentry, tmp, &dir->d_subdirs, d_child) {
spin_lock(&dentry->d_lock);
if (!(d_unhashed(dentry)) && dentry->d_inode) {
dget_dlock(dentry);
--- a/fs/affs/amigaffs.c
+++ b/fs/affs/amigaffs.c
@@ -126,7 +126,7 @@ affs_fix_dcache(struct inode *inode, u32
{
struct dentry *dentry;
spin_lock(&inode->i_lock);
- hlist_for_each_entry(dentry, &inode->i_dentry, d_alias) {
+ hlist_for_each_entry(dentry, &inode->i_dentry, d_u.d_alias) {
if (entry_ino == (u32)(long)dentry->d_fsdata) {
dentry->d_fsdata = (void *)inode->i_ino;
break;
--- a/fs/autofs4/expire.c
+++ b/fs/autofs4/expire.c
@@ -91,7 +91,7 @@ static struct dentry *get_next_positive_
spin_lock(&root->d_lock);

if (prev)
- next = prev->d_u.d_child.next;
+ next = prev->d_child.next;
else {
prev = dget_dlock(root);
next = prev->d_subdirs.next;
@@ -105,13 +105,13 @@ cont:
return NULL;
}

- q = list_entry(next, struct dentry, d_u.d_child);
+ q = list_entry(next, struct dentry, d_child);

spin_lock_nested(&q->d_lock, DENTRY_D_LOCK_NESTED);
/* Already gone or negative dentry (under construction) - try next */
if (q->d_count == 0 || !simple_positive(q)) {
spin_unlock(&q->d_lock);
- next = q->d_u.d_child.next;
+ next = q->d_child.next;
goto cont;
}
dget_dlock(q);
@@ -161,13 +161,13 @@ again:
goto relock;
}
spin_unlock(&p->d_lock);
- next = p->d_u.d_child.next;
+ next = p->d_child.next;
p = parent;
if (next != &parent->d_subdirs)
break;
}
}
- ret = list_entry(next, struct dentry, d_u.d_child);
+ ret = list_entry(next, struct dentry, d_child);

spin_lock_nested(&ret->d_lock, DENTRY_D_LOCK_NESTED);
/* Negative dentry - try next */
@@ -447,7 +447,7 @@ found:
spin_lock(&sbi->lookup_lock);
spin_lock(&expired->d_parent->d_lock);
spin_lock_nested(&expired->d_lock, DENTRY_D_LOCK_NESTED);
- list_move(&expired->d_parent->d_subdirs, &expired->d_u.d_child);
+ list_move(&expired->d_parent->d_subdirs, &expired->d_child);
spin_unlock(&expired->d_lock);
spin_unlock(&expired->d_parent->d_lock);
spin_unlock(&sbi->lookup_lock);
--- a/fs/autofs4/root.c
+++ b/fs/autofs4/root.c
@@ -655,7 +655,7 @@ static void autofs_clear_leaf_automount_
/* only consider parents below dentrys in the root */
if (IS_ROOT(parent->d_parent))
return;
- d_child = &dentry->d_u.d_child;
+ d_child = &dentry->d_child;
/* Set parent managed if it's becoming empty */
if (d_child->next == &parent->d_subdirs &&
d_child->prev == &parent->d_subdirs)
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -103,7 +103,7 @@ static unsigned fpos_off(loff_t p)
/*
* When possible, we try to satisfy a readdir by peeking at the
* dcache. We make this work by carefully ordering dentries on
- * d_u.d_child when we initially get results back from the MDS, and
+ * d_child when we initially get results back from the MDS, and
* falling back to a "normal" sync readdir if any dentries in the dir
* are dropped.
*
@@ -139,11 +139,11 @@ static int __dcache_readdir(struct file
p = parent->d_subdirs.prev;
dout(" initial p %p/%p\n", p->prev, p->next);
} else {
- p = last->d_u.d_child.prev;
+ p = last->d_child.prev;
}

more:
- dentry = list_entry(p, struct dentry, d_u.d_child);
+ dentry = list_entry(p, struct dentry, d_child);
di = ceph_dentry(dentry);
while (1) {
dout(" p %p/%p %s d_subdirs %p/%p\n", p->prev, p->next,
@@ -165,7 +165,7 @@ more:
!dentry->d_inode ? " null" : "");
spin_unlock(&dentry->d_lock);
p = p->prev;
- dentry = list_entry(p, struct dentry, d_u.d_child);
+ dentry = list_entry(p, struct dentry, d_child);
di = ceph_dentry(dentry);
}

--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -867,9 +867,9 @@ static void ceph_set_dentry_offset(struc

spin_lock(&dir->d_lock);
spin_lock_nested(&dn->d_lock, DENTRY_D_LOCK_NESTED);
- list_move(&dn->d_u.d_child, &dir->d_subdirs);
+ list_move(&dn->d_child, &dir->d_subdirs);
dout("set_dentry_offset %p %lld (%p %p)\n", dn, di->offset,
- dn->d_u.d_child.prev, dn->d_u.d_child.next);
+ dn->d_child.prev, dn->d_child.next);
spin_unlock(&dn->d_lock);
spin_unlock(&dir->d_lock);
}
@@ -1296,7 +1296,7 @@ retry_lookup:
/* reorder parent's d_subdirs */
spin_lock(&parent->d_lock);
spin_lock_nested(&dn->d_lock, DENTRY_D_LOCK_NESTED);
- list_move(&dn->d_u.d_child, &parent->d_subdirs);
+ list_move(&dn->d_child, &parent->d_subdirs);
spin_unlock(&dn->d_lock);
spin_unlock(&parent->d_lock);
}
--- a/fs/cifs/inode.c
+++ b/fs/cifs/inode.c
@@ -832,7 +832,7 @@ inode_has_hashed_dentries(struct inode *
struct dentry *dentry;

spin_lock(&inode->i_lock);
- hlist_for_each_entry(dentry, &inode->i_dentry, d_alias) {
+ hlist_for_each_entry(dentry, &inode->i_dentry, d_u.d_alias) {
if (!d_unhashed(dentry) || IS_ROOT(dentry)) {
spin_unlock(&inode->i_lock);
return true;
--- a/fs/coda/cache.c
+++ b/fs/coda/cache.c
@@ -92,7 +92,7 @@ static void coda_flag_children(struct de
struct dentry *de;

spin_lock(&parent->d_lock);
- list_for_each_entry(de, &parent->d_subdirs, d_u.d_child) {
+ list_for_each_entry(de, &parent->d_subdirs, d_child) {
/* don't know what to do with negative dentries */
if (de->d_inode )
coda_flag_inode(de->d_inode, flag);
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -43,7 +43,7 @@
/*
* Usage:
* dcache->d_inode->i_lock protects:
- * - i_dentry, d_alias, d_inode of aliases
+ * - i_dentry, d_u.d_alias, d_inode of aliases
* dcache_hash_bucket lock protects:
* - the dcache hash table
* s_anon bl list spinlock protects:
@@ -58,7 +58,7 @@
* - d_unhashed()
* - d_parent and d_subdirs
* - childrens' d_child and d_parent
- * - d_alias, d_inode
+ * - d_u.d_alias, d_inode
*
* Ordering:
* dentry->d_inode->i_lock
@@ -215,7 +215,6 @@ static void __d_free(struct rcu_head *he
{
struct dentry *dentry = container_of(head, struct dentry, d_u.d_rcu);

- WARN_ON(!hlist_unhashed(&dentry->d_alias));
if (dname_external(dentry))
kfree(dentry->d_name.name);
kmem_cache_free(dentry_cache, dentry);
@@ -226,6 +225,7 @@ static void __d_free(struct rcu_head *he
*/
static void d_free(struct dentry *dentry)
{
+ WARN_ON(!hlist_unhashed(&dentry->d_u.d_alias));
BUG_ON(dentry->d_count);
this_cpu_dec(nr_dentry);
if (dentry->d_op && dentry->d_op->d_release)
@@ -264,7 +264,7 @@ static void dentry_iput(struct dentry *
struct inode *inode = dentry->d_inode;
if (inode) {
dentry->d_inode = NULL;
- hlist_del_init(&dentry->d_alias);
+ hlist_del_init(&dentry->d_u.d_alias);
spin_unlock(&dentry->d_lock);
spin_unlock(&inode->i_lock);
if (!inode->i_nlink)
@@ -288,7 +288,7 @@ static void dentry_unlink_inode(struct d
{
struct inode *inode = dentry->d_inode;
dentry->d_inode = NULL;
- hlist_del_init(&dentry->d_alias);
+ hlist_del_init(&dentry->d_u.d_alias);
dentry_rcuwalk_barrier(dentry);
spin_unlock(&dentry->d_lock);
spin_unlock(&inode->i_lock);
@@ -364,7 +364,7 @@ static struct dentry *d_kill(struct dent
__releases(parent->d_lock)
__releases(dentry->d_inode->i_lock)
{
- list_del(&dentry->d_u.d_child);
+ list_del(&dentry->d_child);
/*
* Inform try_to_ascend() that we are no longer attached to the
* dentry tree
@@ -660,7 +660,7 @@ static struct dentry *__d_find_alias(str

again:
discon_alias = NULL;
- hlist_for_each_entry(alias, &inode->i_dentry, d_alias) {
+ hlist_for_each_entry(alias, &inode->i_dentry, d_u.d_alias) {
spin_lock(&alias->d_lock);
if (S_ISDIR(inode->i_mode) || !d_unhashed(alias)) {
if (IS_ROOT(alias) &&
@@ -713,7 +713,7 @@ void d_prune_aliases(struct inode *inode
struct dentry *dentry;
restart:
spin_lock(&inode->i_lock);
- hlist_for_each_entry(dentry, &inode->i_dentry, d_alias) {
+ hlist_for_each_entry(dentry, &inode->i_dentry, d_u.d_alias) {
spin_lock(&dentry->d_lock);
if (!dentry->d_count) {
__dget_dlock(dentry);
@@ -893,7 +893,7 @@ static void shrink_dcache_for_umount_sub
/* descend to the first leaf in the current subtree */
while (!list_empty(&dentry->d_subdirs))
dentry = list_entry(dentry->d_subdirs.next,
- struct dentry, d_u.d_child);
+ struct dentry, d_child);

/* consume the dentries from this leaf up through its parents
* until we find one with children or run out altogether */
@@ -927,17 +927,17 @@ static void shrink_dcache_for_umount_sub

if (IS_ROOT(dentry)) {
parent = NULL;
- list_del(&dentry->d_u.d_child);
+ list_del(&dentry->d_child);
} else {
parent = dentry->d_parent;
parent->d_count--;
- list_del(&dentry->d_u.d_child);
+ list_del(&dentry->d_child);
}

inode = dentry->d_inode;
if (inode) {
dentry->d_inode = NULL;
- hlist_del_init(&dentry->d_alias);
+ hlist_del_init(&dentry->d_u.d_alias);
if (dentry->d_op && dentry->d_op->d_iput)
dentry->d_op->d_iput(dentry, inode);
else
@@ -955,7 +955,7 @@ static void shrink_dcache_for_umount_sub
} while (list_empty(&dentry->d_subdirs));

dentry = list_entry(dentry->d_subdirs.next,
- struct dentry, d_u.d_child);
+ struct dentry, d_child);
}
}

@@ -1048,7 +1048,7 @@ repeat:
resume:
while (next != &this_parent->d_subdirs) {
struct list_head *tmp = next;
- struct dentry *dentry = list_entry(tmp, struct dentry, d_u.d_child);
+ struct dentry *dentry = list_entry(tmp, struct dentry, d_child);
next = tmp->next;

spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
@@ -1075,7 +1075,7 @@ resume:
this_parent = try_to_ascend(this_parent, locked, seq);
if (!this_parent)
goto rename_retry;
- next = child->d_u.d_child.next;
+ next = child->d_child.next;
goto resume;
}
spin_unlock(&this_parent->d_lock);
@@ -1131,7 +1131,7 @@ repeat:
resume:
while (next != &this_parent->d_subdirs) {
struct list_head *tmp = next;
- struct dentry *dentry = list_entry(tmp, struct dentry, d_u.d_child);
+ struct dentry *dentry = list_entry(tmp, struct dentry, d_child);
next = tmp->next;

spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
@@ -1182,7 +1182,7 @@ resume:
this_parent = try_to_ascend(this_parent, locked, seq);
if (!this_parent)
goto rename_retry;
- next = child->d_u.d_child.next;
+ next = child->d_child.next;
goto resume;
}
out:
@@ -1278,8 +1278,8 @@ struct dentry *__d_alloc(struct super_bl
INIT_HLIST_BL_NODE(&dentry->d_hash);
INIT_LIST_HEAD(&dentry->d_lru);
INIT_LIST_HEAD(&dentry->d_subdirs);
- INIT_HLIST_NODE(&dentry->d_alias);
- INIT_LIST_HEAD(&dentry->d_u.d_child);
+ INIT_HLIST_NODE(&dentry->d_u.d_alias);
+ INIT_LIST_HEAD(&dentry->d_child);
d_set_d_op(dentry, dentry->d_sb->s_d_op);

this_cpu_inc(nr_dentry);
@@ -1309,7 +1309,7 @@ struct dentry *d_alloc(struct dentry * p
*/
__dget_dlock(parent);
dentry->d_parent = parent;
- list_add(&dentry->d_u.d_child, &parent->d_subdirs);
+ list_add(&dentry->d_child, &parent->d_subdirs);
spin_unlock(&parent->d_lock);

return dentry;
@@ -1369,7 +1369,7 @@ static void __d_instantiate(struct dentr
if (inode) {
if (unlikely(IS_AUTOMOUNT(inode)))
dentry->d_flags |= DCACHE_NEED_AUTOMOUNT;
- hlist_add_head(&dentry->d_alias, &inode->i_dentry);
+ hlist_add_head(&dentry->d_u.d_alias, &inode->i_dentry);
}
dentry->d_inode = inode;
dentry_rcuwalk_barrier(dentry);
@@ -1394,7 +1394,7 @@ static void __d_instantiate(struct dentr

void d_instantiate(struct dentry *entry, struct inode * inode)
{
- BUG_ON(!hlist_unhashed(&entry->d_alias));
+ BUG_ON(!hlist_unhashed(&entry->d_u.d_alias));
if (inode)
spin_lock(&inode->i_lock);
__d_instantiate(entry, inode);
@@ -1433,7 +1433,7 @@ static struct dentry *__d_instantiate_un
return NULL;
}

- hlist_for_each_entry(alias, &inode->i_dentry, d_alias) {
+ hlist_for_each_entry(alias, &inode->i_dentry, d_u.d_alias) {
/*
* Don't need alias->d_lock here, because aliases with
* d_parent == entry->d_parent are not subject to name or
@@ -1459,7 +1459,7 @@ struct dentry *d_instantiate_unique(stru
{
struct dentry *result;

- BUG_ON(!hlist_unhashed(&entry->d_alias));
+ BUG_ON(!hlist_unhashed(&entry->d_u.d_alias));

if (inode)
spin_lock(&inode->i_lock);
@@ -1502,7 +1502,7 @@ static struct dentry * __d_find_any_alia

if (hlist_empty(&inode->i_dentry))
return NULL;
- alias = hlist_entry(inode->i_dentry.first, struct dentry, d_alias);
+ alias = hlist_entry(inode->i_dentry.first, struct dentry, d_u.d_alias);
__dget(alias);
return alias;
}
@@ -1576,7 +1576,7 @@ struct dentry *d_obtain_alias(struct ino
spin_lock(&tmp->d_lock);
tmp->d_inode = inode;
tmp->d_flags |= DCACHE_DISCONNECTED;
- hlist_add_head(&tmp->d_alias, &inode->i_dentry);
+ hlist_add_head(&tmp->d_u.d_alias, &inode->i_dentry);
hlist_bl_lock(&tmp->d_sb->s_anon);
hlist_bl_add_head(&tmp->d_hash, &tmp->d_sb->s_anon);
hlist_bl_unlock(&tmp->d_sb->s_anon);
@@ -2019,7 +2019,7 @@ int d_validate(struct dentry *dentry, st
struct dentry *child;

spin_lock(&dparent->d_lock);
- list_for_each_entry(child, &dparent->d_subdirs, d_u.d_child) {
+ list_for_each_entry(child, &dparent->d_subdirs, d_child) {
if (dentry == child) {
spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
__dget_dlock(dentry);
@@ -2266,8 +2266,8 @@ static void __d_move(struct dentry * den
/* Unhash the target: dput() will then get rid of it */
__d_drop(target);

- list_del(&dentry->d_u.d_child);
- list_del(&target->d_u.d_child);
+ list_del(&dentry->d_child);
+ list_del(&target->d_child);

/* Switch the names.. */
switch_names(dentry, target);
@@ -2277,15 +2277,15 @@ static void __d_move(struct dentry * den
if (IS_ROOT(dentry)) {
dentry->d_parent = target->d_parent;
target->d_parent = target;
- INIT_LIST_HEAD(&target->d_u.d_child);
+ INIT_LIST_HEAD(&target->d_child);
} else {
swap(dentry->d_parent, target->d_parent);

/* And add them back to the (new) parent lists */
- list_add(&target->d_u.d_child, &target->d_parent->d_subdirs);
+ list_add(&target->d_child, &target->d_parent->d_subdirs);
}

- list_add(&dentry->d_u.d_child, &dentry->d_parent->d_subdirs);
+ list_add(&dentry->d_child, &dentry->d_parent->d_subdirs);

write_seqcount_end(&target->d_seq);
write_seqcount_end(&dentry->d_seq);
@@ -2392,9 +2392,9 @@ static void __d_materialise_dentry(struc
swap(dentry->d_name.hash, anon->d_name.hash);

dentry->d_parent = dentry;
- list_del_init(&dentry->d_u.d_child);
+ list_del_init(&dentry->d_child);
anon->d_parent = dparent;
- list_move(&anon->d_u.d_child, &dparent->d_subdirs);
+ list_move(&anon->d_child, &dparent->d_subdirs);

write_seqcount_end(&dentry->d_seq);
write_seqcount_end(&anon->d_seq);
@@ -2933,7 +2933,7 @@ repeat:
resume:
while (next != &this_parent->d_subdirs) {
struct list_head *tmp = next;
- struct dentry *dentry = list_entry(tmp, struct dentry, d_u.d_child);
+ struct dentry *dentry = list_entry(tmp, struct dentry, d_child);
next = tmp->next;

spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
@@ -2963,7 +2963,7 @@ resume:
this_parent = try_to_ascend(this_parent, locked, seq);
if (!this_parent)
goto rename_retry;
- next = child->d_u.d_child.next;
+ next = child->d_child.next;
goto resume;
}
spin_unlock(&this_parent->d_lock);
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -545,7 +545,7 @@ void debugfs_remove_recursive(struct den
parent = dentry;
down:
mutex_lock(&parent->d_inode->i_mutex);
- list_for_each_entry_safe(child, next, &parent->d_subdirs, d_u.d_child) {
+ list_for_each_entry_safe(child, next, &parent->d_subdirs, d_child) {
if (!debugfs_positive(child))
continue;

@@ -566,8 +566,8 @@ void debugfs_remove_recursive(struct den
mutex_lock(&parent->d_inode->i_mutex);

if (child != dentry) {
- next = list_entry(child->d_u.d_child.next, struct dentry,
- d_u.d_child);
+ next = list_entry(child->d_child.next, struct dentry,
+ d_child);
goto up;
}

--- a/fs/exportfs/expfs.c
+++ b/fs/exportfs/expfs.c
@@ -50,7 +50,7 @@ find_acceptable_alias(struct dentry *res

inode = result->d_inode;
spin_lock(&inode->i_lock);
- hlist_for_each_entry(dentry, &inode->i_dentry, d_alias) {
+ hlist_for_each_entry(dentry, &inode->i_dentry, d_u.d_alias) {
dget(dentry);
spin_unlock(&inode->i_lock);
if (toput)
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -104,18 +104,18 @@ loff_t dcache_dir_lseek(struct file *fil

spin_lock(&dentry->d_lock);
/* d_lock not required for cursor */
- list_del(&cursor->d_u.d_child);
+ list_del(&cursor->d_child);
p = dentry->d_subdirs.next;
while (n && p != &dentry->d_subdirs) {
struct dentry *next;
- next = list_entry(p, struct dentry, d_u.d_child);
+ next = list_entry(p, struct dentry, d_child);
spin_lock_nested(&next->d_lock, DENTRY_D_LOCK_NESTED);
if (simple_positive(next))
n--;
spin_unlock(&next->d_lock);
p = p->next;
}
- list_add_tail(&cursor->d_u.d_child, p);
+ list_add_tail(&cursor->d_child, p);
spin_unlock(&dentry->d_lock);
}
}
@@ -139,7 +139,7 @@ int dcache_readdir(struct file * filp, v
{
struct dentry *dentry = filp->f_path.dentry;
struct dentry *cursor = filp->private_data;
- struct list_head *p, *q = &cursor->d_u.d_child;
+ struct list_head *p, *q = &cursor->d_child;
ino_t ino;
int i = filp->f_pos;

@@ -165,7 +165,7 @@ int dcache_readdir(struct file * filp, v

for (p=q->next; p != &dentry->d_subdirs; p=p->next) {
struct dentry *next;
- next = list_entry(p, struct dentry, d_u.d_child);
+ next = list_entry(p, struct dentry, d_child);
spin_lock_nested(&next->d_lock, DENTRY_D_LOCK_NESTED);
if (!simple_positive(next)) {
spin_unlock(&next->d_lock);
@@ -289,7 +289,7 @@ int simple_empty(struct dentry *dentry)
int ret = 0;

spin_lock(&dentry->d_lock);
- list_for_each_entry(child, &dentry->d_subdirs, d_u.d_child) {
+ list_for_each_entry(child, &dentry->d_subdirs, d_child) {
spin_lock_nested(&child->d_lock, DENTRY_D_LOCK_NESTED);
if (simple_positive(child)) {
spin_unlock(&child->d_lock);
--- a/fs/ncpfs/dir.c
+++ b/fs/ncpfs/dir.c
@@ -391,7 +391,7 @@ ncp_dget_fpos(struct dentry *dentry, str
spin_lock(&parent->d_lock);
next = parent->d_subdirs.next;
while (next != &parent->d_subdirs) {
- dent = list_entry(next, struct dentry, d_u.d_child);
+ dent = list_entry(next, struct dentry, d_child);
if ((unsigned long)dent->d_fsdata == fpos) {
if (dent->d_inode)
dget(dent);
--- a/fs/ncpfs/ncplib_kernel.h
+++ b/fs/ncpfs/ncplib_kernel.h
@@ -194,7 +194,7 @@ ncp_renew_dentries(struct dentry *parent
spin_lock(&parent->d_lock);
next = parent->d_subdirs.next;
while (next != &parent->d_subdirs) {
- dentry = list_entry(next, struct dentry, d_u.d_child);
+ dentry = list_entry(next, struct dentry, d_child);

if (dentry->d_fsdata == NULL)
ncp_age_dentry(server, dentry);
@@ -216,7 +216,7 @@ ncp_invalidate_dircache_entries(struct d
spin_lock(&parent->d_lock);
next = parent->d_subdirs.next;
while (next != &parent->d_subdirs) {
- dentry = list_entry(next, struct dentry, d_u.d_child);
+ dentry = list_entry(next, struct dentry, d_child);
dentry->d_fsdata = NULL;
ncp_age_dentry(server, dentry);
next = next->next;
--- a/fs/nfs/getroot.c
+++ b/fs/nfs/getroot.c
@@ -58,7 +58,7 @@ static int nfs_superblock_set_dummy_root
*/
spin_lock(&sb->s_root->d_inode->i_lock);
spin_lock(&sb->s_root->d_lock);
- hlist_del_init(&sb->s_root->d_alias);
+ hlist_del_init(&sb->s_root->d_u.d_alias);
spin_unlock(&sb->s_root->d_lock);
spin_unlock(&sb->s_root->d_inode->i_lock);
}
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -63,14 +63,14 @@ void __fsnotify_update_child_dentry_flag
spin_lock(&inode->i_lock);
/* run all of the dentries associated with this inode. Since this is a
* directory, there damn well better only be one item on this list */
- hlist_for_each_entry(alias, &inode->i_dentry, d_alias) {
+ hlist_for_each_entry(alias, &inode->i_dentry, d_u.d_alias) {
struct dentry *child;

/* run all of the children of the original inode and fix their
* d_flags to indicate parental interest (their parent is the
* original inode) */
spin_lock(&alias->d_lock);
- list_for_each_entry(child, &alias->d_subdirs, d_u.d_child) {
+ list_for_each_entry(child, &alias->d_subdirs, d_child) {
if (!child->d_inode)
continue;

--- a/fs/ocfs2/dcache.c
+++ b/fs/ocfs2/dcache.c
@@ -172,7 +172,7 @@ struct dentry *ocfs2_find_local_alias(st
struct dentry *dentry;

spin_lock(&inode->i_lock);
- hlist_for_each_entry(dentry, &inode->i_dentry, d_alias) {
+ hlist_for_each_entry(dentry, &inode->i_dentry, d_u.d_alias) {
spin_lock(&dentry->d_lock);
if (ocfs2_match_dentry(dentry, parent_blkno, skip_unhashed)) {
trace_ocfs2_find_local_alias(dentry->d_name.len,
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -120,15 +120,15 @@ struct dentry {
void *d_fsdata; /* fs-specific data */

struct list_head d_lru; /* LRU list */
+ struct list_head d_child; /* child of parent list */
+ struct list_head d_subdirs; /* our children */
/*
- * d_child and d_rcu can share memory
+ * d_alias and d_rcu can share memory
*/
union {
- struct list_head d_child; /* child of parent list */
+ struct hlist_node d_alias; /* inode alias list */
struct rcu_head d_rcu;
} d_u;
- struct list_head d_subdirs; /* our children */
- struct hlist_node d_alias; /* inode alias list */
};

/*
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -984,7 +984,7 @@ static void cgroup_d_remove_dir(struct d
parent = dentry->d_parent;
spin_lock(&parent->d_lock);
spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
- list_del_init(&dentry->d_u.d_child);
+ list_del_init(&dentry->d_child);
spin_unlock(&dentry->d_lock);
spin_unlock(&parent->d_lock);
remove_dir(dentry);
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -6063,7 +6063,7 @@ static int instance_mkdir (struct inode
int ret;

/* Paranoid: Make sure the parent is the "instances" directory */
- parent = hlist_entry(inode->i_dentry.first, struct dentry, d_alias);
+ parent = hlist_entry(inode->i_dentry.first, struct dentry, d_u.d_alias);
if (WARN_ON_ONCE(parent != trace_instance_dir))
return -ENOENT;

@@ -6090,7 +6090,7 @@ static int instance_rmdir(struct inode *
int ret;

/* Paranoid: Make sure the parent is the "instances" directory */
- parent = hlist_entry(inode->i_dentry.first, struct dentry, d_alias);
+ parent = hlist_entry(inode->i_dentry.first, struct dentry, d_u.d_alias);
if (WARN_ON_ONCE(parent != trace_instance_dir))
return -ENOENT;

--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -425,7 +425,7 @@ static void remove_event_file_dir(struct

if (dir) {
spin_lock(&dir->d_lock); /* probably unneeded */
- list_for_each_entry(child, &dir->d_subdirs, d_u.d_child) {
+ list_for_each_entry(child, &dir->d_subdirs, d_child) {
if (child->d_inode) /* probably unneeded */
child->d_inode->i_private = NULL;
}
--- a/security/selinux/selinuxfs.c
+++ b/security/selinux/selinuxfs.c
@@ -1190,7 +1190,7 @@ static void sel_remove_entries(struct de
spin_lock(&de->d_lock);
node = de->d_subdirs.next;
while (node != &de->d_subdirs) {
- struct dentry *d = list_entry(node, struct dentry, d_u.d_child);
+ struct dentry *d = list_entry(node, struct dentry, d_child);

spin_lock_nested(&d->d_lock, DENTRY_D_LOCK_NESTED);
list_del_init(node);
@@ -1664,12 +1664,12 @@ static void sel_remove_classes(void)

list_for_each(class_node, &class_dir->d_subdirs) {
struct dentry *class_subdir = list_entry(class_node,
- struct dentry, d_u.d_child);
+ struct dentry, d_child);
struct list_head *class_subdir_node;

list_for_each(class_subdir_node, &class_subdir->d_subdirs) {
struct dentry *d = list_entry(class_subdir_node,
- struct dentry, d_u.d_child);
+ struct dentry, d_child);

if (d->d_inode)
if (d->d_inode->i_mode & S_IFDIR)

2015-04-26 13:57:57

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 27/31] deal with deadlock in d_walk()

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: Al Viro <[email protected]>

commit ca5358ef75fc69fee5322a38a340f5739d997c10 upstream.

... by not hitting rename_retry for reasons other than rename having
happened. In other words, do _not_ restart when finding that
between unlocking the child and locking the parent the former got
into __dentry_kill(). Skip the killed siblings instead...

Signed-off-by: Al Viro <[email protected]>
Cc: Ben Hutchings <[email protected]>
[hujianyang: Backported to 3.10 refer to the work of Ben Hutchings in 3.2:
- As we only have try_to_ascend() and not d_walk(), apply this
change to all callers of try_to_ascend()
- Adjust context to make __dentry_kill() apply to d_kill()]
Signed-off-by: hujianyang <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/dcache.c | 102 ++++++++++++++++++++++++++++++++++++------------------------
1 file changed, 62 insertions(+), 40 deletions(-)

--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -364,9 +364,9 @@ static struct dentry *d_kill(struct dent
__releases(parent->d_lock)
__releases(dentry->d_inode->i_lock)
{
- list_del(&dentry->d_child);
+ __list_del_entry(&dentry->d_child);
/*
- * Inform try_to_ascend() that we are no longer attached to the
+ * Inform ascending readers that we are no longer attached to the
* dentry tree
*/
dentry->d_flags |= DCACHE_DENTRY_KILLED;
@@ -988,35 +988,6 @@ void shrink_dcache_for_umount(struct sup
}

/*
- * This tries to ascend one level of parenthood, but
- * we can race with renaming, so we need to re-check
- * the parenthood after dropping the lock and check
- * that the sequence number still matches.
- */
-static struct dentry *try_to_ascend(struct dentry *old, int locked, unsigned seq)
-{
- struct dentry *new = old->d_parent;
-
- rcu_read_lock();
- spin_unlock(&old->d_lock);
- spin_lock(&new->d_lock);
-
- /*
- * might go back up the wrong parent if we have had a rename
- * or deletion
- */
- if (new != old->d_parent ||
- (old->d_flags & DCACHE_DENTRY_KILLED) ||
- (!locked && read_seqretry(&rename_lock, seq))) {
- spin_unlock(&new->d_lock);
- new = NULL;
- }
- rcu_read_unlock();
- return new;
-}
-
-
-/*
* Search for at least 1 mount point in the dentry's subdirs.
* We descend to the next level whenever the d_subdirs
* list is non-empty and continue searching.
@@ -1070,17 +1041,32 @@ resume:
/*
* All done at this level ... ascend and resume the search.
*/
+ rcu_read_lock();
+ascend:
if (this_parent != parent) {
struct dentry *child = this_parent;
- this_parent = try_to_ascend(this_parent, locked, seq);
- if (!this_parent)
+ this_parent = child->d_parent;
+
+ spin_unlock(&child->d_lock);
+ spin_lock(&this_parent->d_lock);
+
+ /* might go back up the wrong parent if we have had a rename. */
+ if (!locked && read_seqretry(&rename_lock, seq))
goto rename_retry;
next = child->d_child.next;
+ while (unlikely(child->d_flags & DCACHE_DENTRY_KILLED)) {
+ if (next == &this_parent->d_subdirs)
+ goto ascend;
+ child = list_entry(next, struct dentry, d_child);
+ next = next->next;
+ }
+ rcu_read_unlock();
goto resume;
}
- spin_unlock(&this_parent->d_lock);
if (!locked && read_seqretry(&rename_lock, seq))
goto rename_retry;
+ spin_unlock(&this_parent->d_lock);
+ rcu_read_unlock();
if (locked)
write_sequnlock(&rename_lock);
return 0; /* No mount points found in tree */
@@ -1092,6 +1078,8 @@ positive:
return 1;

rename_retry:
+ spin_unlock(&this_parent->d_lock);
+ rcu_read_unlock();
if (locked)
goto again;
locked = 1;
@@ -1177,23 +1165,40 @@ resume:
/*
* All done at this level ... ascend and resume the search.
*/
+ rcu_read_lock();
+ascend:
if (this_parent != parent) {
struct dentry *child = this_parent;
- this_parent = try_to_ascend(this_parent, locked, seq);
- if (!this_parent)
+ this_parent = child->d_parent;
+
+ spin_unlock(&child->d_lock);
+ spin_lock(&this_parent->d_lock);
+
+ /* might go back up the wrong parent if we have had a rename. */
+ if (!locked && read_seqretry(&rename_lock, seq))
goto rename_retry;
next = child->d_child.next;
+ while (unlikely(child->d_flags & DCACHE_DENTRY_KILLED)) {
+ if (next == &this_parent->d_subdirs)
+ goto ascend;
+ child = list_entry(next, struct dentry, d_child);
+ next = next->next;
+ }
+ rcu_read_unlock();
goto resume;
}
out:
- spin_unlock(&this_parent->d_lock);
if (!locked && read_seqretry(&rename_lock, seq))
goto rename_retry;
+ spin_unlock(&this_parent->d_lock);
+ rcu_read_unlock();
if (locked)
write_sequnlock(&rename_lock);
return found;

rename_retry:
+ spin_unlock(&this_parent->d_lock);
+ rcu_read_unlock();
if (found)
return found;
if (locked)
@@ -2954,26 +2959,43 @@ resume:
}
spin_unlock(&dentry->d_lock);
}
+ rcu_read_lock();
+ascend:
if (this_parent != root) {
struct dentry *child = this_parent;
if (!(this_parent->d_flags & DCACHE_GENOCIDE)) {
this_parent->d_flags |= DCACHE_GENOCIDE;
this_parent->d_count--;
}
- this_parent = try_to_ascend(this_parent, locked, seq);
- if (!this_parent)
+ this_parent = child->d_parent;
+
+ spin_unlock(&child->d_lock);
+ spin_lock(&this_parent->d_lock);
+
+ /* might go back up the wrong parent if we have had a rename. */
+ if (!locked && read_seqretry(&rename_lock, seq))
goto rename_retry;
next = child->d_child.next;
+ while (unlikely(child->d_flags & DCACHE_DENTRY_KILLED)) {
+ if (next == &this_parent->d_subdirs)
+ goto ascend;
+ child = list_entry(next, struct dentry, d_child);
+ next = next->next;
+ }
+ rcu_read_unlock();
goto resume;
}
- spin_unlock(&this_parent->d_lock);
if (!locked && read_seqretry(&rename_lock, seq))
goto rename_retry;
+ spin_unlock(&this_parent->d_lock);
+ rcu_read_unlock();
if (locked)
write_sequnlock(&rename_lock);
return;

rename_retry:
+ spin_unlock(&this_parent->d_lock);
+ rcu_read_unlock();
if (locked)
goto again;
locked = 1;

2015-04-26 13:48:33

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 28/31] vm: add VM_FAULT_SIGSEGV handling support

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: Linus Torvalds <[email protected]>

commit 33692f27597fcab536d7cbbcc8f52905133e4aa7 upstream.

The core VM already knows about VM_FAULT_SIGBUS, but cannot return a
"you should SIGSEGV" error, because the SIGSEGV case was generally
handled by the caller - usually the architecture fault handler.

That results in lots of duplication - all the architecture fault
handlers end up doing very similar "look up vma, check permissions, do
retries etc" - but it generally works. However, there are cases where
the VM actually wants to SIGSEGV, and applications _expect_ SIGSEGV.

In particular, when accessing the stack guard page, libsigsegv expects a
SIGSEGV. And it usually got one, because the stack growth is handled by
that duplicated architecture fault handler.

However, when the generic VM layer started propagating the error return
from the stack expansion in commit fee7e49d4514 ("mm: propagate error
from stack expansion even for guard page"), that now exposed the
existing VM_FAULT_SIGBUS result to user space. And user space really
expected SIGSEGV, not SIGBUS.

To fix that case, we need to add a VM_FAULT_SIGSEGV, and teach all those
duplicate architecture fault handlers about it. They all already have
the code to handle SIGSEGV, so it's about just tying that new return
value to the existing code, but it's all a bit annoying.

This is the mindless minimal patch to do this. A more extensive patch
would be to try to gather up the mostly shared fault handling logic into
one generic helper routine, and long-term we really should do that
cleanup.

Just from this patch, you can generally see that most architectures just
copied (directly or indirectly) the old x86 way of doing things, but in
the meantime that original x86 model has been improved to hold the VM
semaphore for shorter times etc and to handle VM_FAULT_RETRY and other
"newer" things, so it would be a good idea to bring all those
improvements to the generic case and teach other architectures about
them too.

Reported-and-tested-by: Takashi Iwai <[email protected]>
Tested-by: Jan Engelhardt <[email protected]>
Acked-by: Heiko Carstens <[email protected]> # "s390 still compiles and boots"
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
[shengyong: Backport to 3.10
- adjust context
- ignore modification for arch nios2, because 3.10 does not support it
- ignore modification for driver lustre, because 3.10 does not support it
- ignore VM_FAULT_FALLBACK in VM_FAULT_ERROR, becase 3.10 does not support
this flag
- add SIGSEGV handling to powerpc/cell spu_fault.c, because 3.10 does not
separate it to copro_fault.c
- add SIGSEGV handling in mm/memory.c, because 3.10 does not separate it
to gup.c
]
Signed-off-by: Sheng Yong <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/alpha/mm/fault.c | 2 ++
arch/arc/mm/fault.c | 2 ++
arch/avr32/mm/fault.c | 2 ++
arch/cris/mm/fault.c | 2 ++
arch/frv/mm/fault.c | 2 ++
arch/ia64/mm/fault.c | 2 ++
arch/m32r/mm/fault.c | 2 ++
arch/m68k/mm/fault.c | 2 ++
arch/metag/mm/fault.c | 2 ++
arch/microblaze/mm/fault.c | 2 ++
arch/mips/mm/fault.c | 2 ++
arch/mn10300/mm/fault.c | 2 ++
arch/openrisc/mm/fault.c | 2 ++
arch/parisc/mm/fault.c | 2 ++
arch/powerpc/mm/fault.c | 2 ++
arch/powerpc/platforms/cell/spu_fault.c | 2 +-
arch/s390/mm/fault.c | 6 ++++++
arch/score/mm/fault.c | 2 ++
arch/sh/mm/fault.c | 2 ++
arch/sparc/mm/fault_32.c | 2 ++
arch/sparc/mm/fault_64.c | 2 ++
arch/tile/mm/fault.c | 2 ++
arch/um/kernel/trap.c | 2 ++
arch/x86/mm/fault.c | 2 ++
arch/xtensa/mm/fault.c | 2 ++
include/linux/mm.h | 5 +++--
mm/ksm.c | 2 +-
mm/memory.c | 5 +++--
28 files changed, 60 insertions(+), 6 deletions(-)

--- a/arch/alpha/mm/fault.c
+++ b/arch/alpha/mm/fault.c
@@ -156,6 +156,8 @@ retry:
if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM)
goto out_of_memory;
+ else if (fault & VM_FAULT_SIGSEGV)
+ goto bad_area;
else if (fault & VM_FAULT_SIGBUS)
goto do_sigbus;
BUG();
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -160,6 +160,8 @@ good_area:
/* TBD: switch to pagefault_out_of_memory() */
if (fault & VM_FAULT_OOM)
goto out_of_memory;
+ else if (fault & VM_FAULT_SIGSEV)
+ goto bad_area;
else if (fault & VM_FAULT_SIGBUS)
goto do_sigbus;

--- a/arch/avr32/mm/fault.c
+++ b/arch/avr32/mm/fault.c
@@ -142,6 +142,8 @@ good_area:
if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM)
goto out_of_memory;
+ else if (fault & VM_FAULT_SIGSEGV)
+ goto bad_area;
else if (fault & VM_FAULT_SIGBUS)
goto do_sigbus;
BUG();
--- a/arch/cris/mm/fault.c
+++ b/arch/cris/mm/fault.c
@@ -176,6 +176,8 @@ retry:
if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM)
goto out_of_memory;
+ else if (fault & VM_FAULT_SIGSEGV)
+ goto bad_area;
else if (fault & VM_FAULT_SIGBUS)
goto do_sigbus;
BUG();
--- a/arch/frv/mm/fault.c
+++ b/arch/frv/mm/fault.c
@@ -168,6 +168,8 @@ asmlinkage void do_page_fault(int datamm
if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM)
goto out_of_memory;
+ else if (fault & VM_FAULT_SIGSEGV)
+ goto bad_area;
else if (fault & VM_FAULT_SIGBUS)
goto do_sigbus;
BUG();
--- a/arch/ia64/mm/fault.c
+++ b/arch/ia64/mm/fault.c
@@ -172,6 +172,8 @@ retry:
*/
if (fault & VM_FAULT_OOM) {
goto out_of_memory;
+ } else if (fault & VM_FAULT_SIGSEGV) {
+ goto bad_area;
} else if (fault & VM_FAULT_SIGBUS) {
signal = SIGBUS;
goto bad_area;
--- a/arch/m32r/mm/fault.c
+++ b/arch/m32r/mm/fault.c
@@ -200,6 +200,8 @@ good_area:
if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM)
goto out_of_memory;
+ else if (fault & VM_FAULT_SIGSEGV)
+ goto bad_area;
else if (fault & VM_FAULT_SIGBUS)
goto do_sigbus;
BUG();
--- a/arch/m68k/mm/fault.c
+++ b/arch/m68k/mm/fault.c
@@ -153,6 +153,8 @@ good_area:
if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM)
goto out_of_memory;
+ else if (fault & VM_FAULT_SIGSEGV)
+ goto map_err;
else if (fault & VM_FAULT_SIGBUS)
goto bus_err;
BUG();
--- a/arch/metag/mm/fault.c
+++ b/arch/metag/mm/fault.c
@@ -141,6 +141,8 @@ good_area:
if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM)
goto out_of_memory;
+ else if (fault & VM_FAULT_SIGSEGV)
+ goto bad_area;
else if (fault & VM_FAULT_SIGBUS)
goto do_sigbus;
BUG();
--- a/arch/microblaze/mm/fault.c
+++ b/arch/microblaze/mm/fault.c
@@ -224,6 +224,8 @@ good_area:
if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM)
goto out_of_memory;
+ else if (fault & VM_FAULT_SIGSEGV)
+ goto bad_area;
else if (fault & VM_FAULT_SIGBUS)
goto do_sigbus;
BUG();
--- a/arch/mips/mm/fault.c
+++ b/arch/mips/mm/fault.c
@@ -157,6 +157,8 @@ good_area:
if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM)
goto out_of_memory;
+ else if (fault & VM_FAULT_SIGSEGV)
+ goto bad_area;
else if (fault & VM_FAULT_SIGBUS)
goto do_sigbus;
BUG();
--- a/arch/mn10300/mm/fault.c
+++ b/arch/mn10300/mm/fault.c
@@ -262,6 +262,8 @@ good_area:
if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM)
goto out_of_memory;
+ else if (fault & VM_FAULT_SIGSEGV)
+ goto bad_area;
else if (fault & VM_FAULT_SIGBUS)
goto do_sigbus;
BUG();
--- a/arch/openrisc/mm/fault.c
+++ b/arch/openrisc/mm/fault.c
@@ -171,6 +171,8 @@ good_area:
if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM)
goto out_of_memory;
+ else if (fault & VM_FAULT_SIGSEGV)
+ goto bad_area;
else if (fault & VM_FAULT_SIGBUS)
goto do_sigbus;
BUG();
--- a/arch/parisc/mm/fault.c
+++ b/arch/parisc/mm/fault.c
@@ -220,6 +220,8 @@ good_area:
*/
if (fault & VM_FAULT_OOM)
goto out_of_memory;
+ else if (fault & VM_FAULT_SIGSEGV)
+ goto bad_area;
else if (fault & VM_FAULT_SIGBUS)
goto bad_area;
BUG();
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -425,6 +425,8 @@ good_area:
*/
fault = handle_mm_fault(mm, vma, address, flags);
if (unlikely(fault & (VM_FAULT_RETRY|VM_FAULT_ERROR))) {
+ if (fault & VM_FAULT_SIGSEGV)
+ goto bad_area;
rc = mm_fault_error(regs, address, fault);
if (rc >= MM_FAULT_RETURN)
goto bail;
--- a/arch/powerpc/platforms/cell/spu_fault.c
+++ b/arch/powerpc/platforms/cell/spu_fault.c
@@ -75,7 +75,7 @@ int spu_handle_mm_fault(struct mm_struct
if (*flt & VM_FAULT_OOM) {
ret = -ENOMEM;
goto out_unlock;
- } else if (*flt & VM_FAULT_SIGBUS) {
+ } else if (*flt & (VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV)) {
ret = -EFAULT;
goto out_unlock;
}
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -244,6 +244,12 @@ static noinline void do_fault_error(stru
do_no_context(regs);
else
pagefault_out_of_memory();
+ } else if (fault & VM_FAULT_SIGSEGV) {
+ /* Kernel mode? Handle exceptions or die */
+ if (!user_mode(regs))
+ do_no_context(regs);
+ else
+ do_sigsegv(regs, SEGV_MAPERR);
} else if (fault & VM_FAULT_SIGBUS) {
/* Kernel mode? Handle exceptions or die */
if (!user_mode(regs))
--- a/arch/score/mm/fault.c
+++ b/arch/score/mm/fault.c
@@ -114,6 +114,8 @@ good_area:
if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM)
goto out_of_memory;
+ else if (fault & VM_FAULT_SIGSEGV)
+ goto bad_area;
else if (fault & VM_FAULT_SIGBUS)
goto do_sigbus;
BUG();
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -353,6 +353,8 @@ mm_fault_error(struct pt_regs *regs, uns
} else {
if (fault & VM_FAULT_SIGBUS)
do_sigbus(regs, error_code, address);
+ else if (fault & VM_FAULT_SIGSEGV)
+ bad_area(regs, error_code, address);
else
BUG();
}
--- a/arch/sparc/mm/fault_32.c
+++ b/arch/sparc/mm/fault_32.c
@@ -252,6 +252,8 @@ good_area:
if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM)
goto out_of_memory;
+ else if (fault & VM_FAULT_SIGSEGV)
+ goto bad_area;
else if (fault & VM_FAULT_SIGBUS)
goto do_sigbus;
BUG();
--- a/arch/sparc/mm/fault_64.c
+++ b/arch/sparc/mm/fault_64.c
@@ -443,6 +443,8 @@ good_area:
if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM)
goto out_of_memory;
+ else if (fault & VM_FAULT_SIGSEGV)
+ goto bad_area;
else if (fault & VM_FAULT_SIGBUS)
goto do_sigbus;
BUG();
--- a/arch/tile/mm/fault.c
+++ b/arch/tile/mm/fault.c
@@ -446,6 +446,8 @@ good_area:
if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM)
goto out_of_memory;
+ else if (fault & VM_FAULT_SIGSEGV)
+ goto bad_area;
else if (fault & VM_FAULT_SIGBUS)
goto do_sigbus;
BUG();
--- a/arch/um/kernel/trap.c
+++ b/arch/um/kernel/trap.c
@@ -80,6 +80,8 @@ good_area:
if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM) {
goto out_of_memory;
+ } else if (fault & VM_FAULT_SIGSEGV) {
+ goto out;
} else if (fault & VM_FAULT_SIGBUS) {
err = -EACCES;
goto out;
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -873,6 +873,8 @@ mm_fault_error(struct pt_regs *regs, uns
if (fault & (VM_FAULT_SIGBUS|VM_FAULT_HWPOISON|
VM_FAULT_HWPOISON_LARGE))
do_sigbus(regs, error_code, address, fault);
+ else if (fault & VM_FAULT_SIGSEGV)
+ bad_area_nosemaphore(regs, error_code, address);
else
BUG();
}
--- a/arch/xtensa/mm/fault.c
+++ b/arch/xtensa/mm/fault.c
@@ -117,6 +117,8 @@ good_area:
if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM)
goto out_of_memory;
+ else if (fault & VM_FAULT_SIGSEGV)
+ goto bad_area;
else if (fault & VM_FAULT_SIGBUS)
goto do_sigbus;
BUG();
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -891,6 +891,7 @@ static inline int page_mapped(struct pag
#define VM_FAULT_WRITE 0x0008 /* Special case for get_user_pages */
#define VM_FAULT_HWPOISON 0x0010 /* Hit poisoned small page */
#define VM_FAULT_HWPOISON_LARGE 0x0020 /* Hit poisoned large page. Index encoded in upper bits */
+#define VM_FAULT_SIGSEGV 0x0040

#define VM_FAULT_NOPAGE 0x0100 /* ->fault installed the pte, not return page */
#define VM_FAULT_LOCKED 0x0200 /* ->fault locked the returned page */
@@ -898,8 +899,8 @@ static inline int page_mapped(struct pag

#define VM_FAULT_HWPOISON_LARGE_MASK 0xf000 /* encodes hpage index for large hwpoison */

-#define VM_FAULT_ERROR (VM_FAULT_OOM | VM_FAULT_SIGBUS | VM_FAULT_HWPOISON | \
- VM_FAULT_HWPOISON_LARGE)
+#define VM_FAULT_ERROR (VM_FAULT_OOM | VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV | \
+ VM_FAULT_HWPOISON | VM_FAULT_HWPOISON_LARGE)

/* Encode hstate index for a hwpoisoned large page */
#define VM_FAULT_SET_HINDEX(x) ((x) << 12)
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -376,7 +376,7 @@ static int break_ksm(struct vm_area_stru
else
ret = VM_FAULT_WRITE;
put_page(page);
- } while (!(ret & (VM_FAULT_WRITE | VM_FAULT_SIGBUS | VM_FAULT_OOM)));
+ } while (!(ret & (VM_FAULT_WRITE | VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV | VM_FAULT_OOM)));
/*
* We must loop because handle_mm_fault() may back out if there's
* any difficulty e.g. if pte accessed bit gets updated concurrently.
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1844,7 +1844,8 @@ long __get_user_pages(struct task_struct
else
return -EFAULT;
}
- if (ret & VM_FAULT_SIGBUS)
+ if (ret & (VM_FAULT_SIGBUS |
+ VM_FAULT_SIGSEGV))
return i ? i : -EFAULT;
BUG();
}
@@ -1954,7 +1955,7 @@ int fixup_user_fault(struct task_struct
return -ENOMEM;
if (ret & (VM_FAULT_HWPOISON | VM_FAULT_HWPOISON_LARGE))
return -EHWPOISON;
- if (ret & VM_FAULT_SIGBUS)
+ if (ret & (VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV))
return -EFAULT;
BUG();
}

2015-04-26 13:48:40

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 29/31] vm: make stack guard page errors return VM_FAULT_SIGSEGV rather than SIGBUS

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: Linus Torvalds <[email protected]>

commit 9c145c56d0c8a0b62e48c8d71e055ad0fb2012ba upstream.

The stack guard page error case has long incorrectly caused a SIGBUS
rather than a SIGSEGV, but nobody actually noticed until commit
fee7e49d4514 ("mm: propagate error from stack expansion even for guard
page") because that error case was never actually triggered in any
normal situations.

Now that we actually report the error, people noticed the wrong signal
that resulted. So far, only the test suite of libsigsegv seems to have
actually cared, but there are real applications that use libsigsegv, so
let's not wait for any of those to break.

Reported-and-tested-by: Takashi Iwai <[email protected]>
Tested-by: Jan Engelhardt <[email protected]>
Acked-by: Heiko Carstens <[email protected]> # "s390 still compiles and boots"
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
mm/memory.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3232,7 +3232,7 @@ static int do_anonymous_page(struct mm_s

/* Check if we need to add a guard page to the stack */
if (check_stack_guard_page(vma, address) < 0)
- return VM_FAULT_SIGBUS;
+ return VM_FAULT_SIGSEGV;

/* Use the zero-page for reads */
if (!(flags & FAULT_FLAG_WRITE)) {

2015-04-26 13:49:25

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 30/31] x86: mm: move mmap_sem unlock from mm_fault_error() to caller

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: Linus Torvalds <[email protected]>

commit 7fb08eca45270d0ae86e1ad9d39c40b7a55d0190 upstream.

This replaces four copies in various stages of mm_fault_error() handling
with just a single one. It will also allow for more natural placement
of the unlocking after some further cleanup.

Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/mm/fault.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)

--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -812,11 +812,8 @@ do_sigbus(struct pt_regs *regs, unsigned
unsigned int fault)
{
struct task_struct *tsk = current;
- struct mm_struct *mm = tsk->mm;
int code = BUS_ADRERR;

- up_read(&mm->mmap_sem);
-
/* Kernel mode? Handle exceptions or die: */
if (!(error_code & PF_USER)) {
no_context(regs, error_code, address, SIGBUS, BUS_ADRERR);
@@ -847,7 +844,6 @@ mm_fault_error(struct pt_regs *regs, uns
unsigned long address, unsigned int fault)
{
if (fatal_signal_pending(current) && !(error_code & PF_USER)) {
- up_read(&current->mm->mmap_sem);
no_context(regs, error_code, address, 0, 0);
return;
}
@@ -855,14 +851,11 @@ mm_fault_error(struct pt_regs *regs, uns
if (fault & VM_FAULT_OOM) {
/* Kernel mode? Handle exceptions or die: */
if (!(error_code & PF_USER)) {
- up_read(&current->mm->mmap_sem);
no_context(regs, error_code, address,
SIGSEGV, SEGV_MAPERR);
return;
}

- up_read(&current->mm->mmap_sem);
-
/*
* We ran out of memory, call the OOM killer, and return the
* userspace (which will retry the fault, or kill us if we got
@@ -1195,6 +1188,7 @@ good_area:
return;

if (unlikely(fault & VM_FAULT_ERROR)) {
+ up_read(&mm->mmap_sem);
mm_fault_error(regs, error_code, address, fault);
return;
}

2015-04-26 13:49:06

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 3.10 31/31] sb_edac: avoid INTERNAL ERROR message in EDAC with unspecified channel

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: Seth Jennings <[email protected]>

commit 351fc4a99d49fde63fe5ab7412beb35c40d27269 upstream.

Intel IA32 SDM Table 15-14 defines channel 0xf as 'not specified', but
EDAC doesn't know about this and returns and INTERNAL ERROR when the
channel is greater than NUM_CHANNELS:

kernel: [ 1538.886456] CPU 0: Machine Check Exception: 0 Bank 1: 940000000000009f
kernel: [ 1538.886669] TSC 2bc68b22e7e812 ADDR 46dae7000 MISC 0 PROCESSOR 0:306e4 TIME 1390414572 SOCKET 0 APIC 0
kernel: [ 1538.971948] EDAC MC1: INTERNAL ERROR: channel value is out of range (15 >= 4)
kernel: [ 1538.972203] EDAC MC1: 0 CE memory read error on unknown memory (slot:0 page:0x46dae7 offset:0x0 grain:0 syndrome:0x0 - area:DRAM err_code:0000:009f socket:1 channel_mask:1 rank:0)

This commit changes sb_edac to forward a channel of -1 to EDAC if the
channel is not specified. edac_mc_handle_error() sets the channel to -1
internally after the error message anyway, so this commit should have no
effect other than avoiding the INTERNAL ERROR message when the channel
is not specified.

Signed-off-by: Seth Jennings <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Cc: Vinson Lee <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/edac/sb_edac.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -270,8 +270,9 @@ static const u32 correrrthrsld[] = {
* sbridge structs
*/

-#define NUM_CHANNELS 4
-#define MAX_DIMMS 3 /* Max DIMMS per channel */
+#define NUM_CHANNELS 4
+#define MAX_DIMMS 3 /* Max DIMMS per channel */
+#define CHANNEL_UNSPECIFIED 0xf /* Intel IA32 SDM 15-14 */

struct sbridge_info {
u32 mcmtr;
@@ -1451,6 +1452,9 @@ static void sbridge_mce_output_error(str

/* FIXME: need support for channel mask */

+ if (channel == CHANNEL_UNSPECIFIED)
+ channel = -1;
+
/* Call the helper to output message */
edac_mc_handle_error(tp_event, mci, core_err_cnt,
m->addr >> PAGE_SHIFT, m->addr & ~PAGE_MASK, 0,

2015-04-26 15:16:17

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 3.10 00/31] 3.10.76-stable review

On 04/26/2015 06:46 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.10.76 release.
> There are 31 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Tue Apr 28 13:41:43 UTC 2015.
> Anything received after that time might be too late.
>


Building arc:defconfig ... failed
--------------
Error log:
arch/arc/mm/fault.c: In function 'do_page_fault':
arch/arc/mm/fault.c:163: error: 'VM_FAULT_SIGSEV' undeclared (first use in this function)

Caused by the backport of 'vm: add VM_FAULT_SIGSEGV handling support' which introduced
above misspelling. Affected are 3.10.y-stable-queue and 3.14.y-stable-queue.

Guenter

2015-04-26 17:13:02

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 3.10 00/31] 3.10.76-stable review

On Sun, Apr 26, 2015 at 08:15:49AM -0700, Guenter Roeck wrote:
> On 04/26/2015 06:46 AM, Greg Kroah-Hartman wrote:
> >This is the start of the stable review cycle for the 3.10.76 release.
> >There are 31 patches in this series, all will be posted as a response
> >to this one. If anyone has any issues with these being applied, please
> >let me know.
> >
> >Responses should be made by Tue Apr 28 13:41:43 UTC 2015.
> >Anything received after that time might be too late.
> >
>
>
> Building arc:defconfig ... failed
> --------------
> Error log:
> arch/arc/mm/fault.c: In function 'do_page_fault':
> arch/arc/mm/fault.c:163: error: 'VM_FAULT_SIGSEV' undeclared (first use in this function)
>
> Caused by the backport of 'vm: add VM_FAULT_SIGSEGV handling support' which introduced
> above misspelling. Affected are 3.10.y-stable-queue and 3.14.y-stable-queue.

Thanks, should now be fixed, I've applied commit e262eb9381ad ("arc: mm:
Fix build failure")

2015-04-26 17:15:10

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 3.10 00/31] 3.10.76-stable review

On 04/26/2015 10:12 AM, Greg Kroah-Hartman wrote:
> On Sun, Apr 26, 2015 at 08:15:49AM -0700, Guenter Roeck wrote:
>> On 04/26/2015 06:46 AM, Greg Kroah-Hartman wrote:
>>> This is the start of the stable review cycle for the 3.10.76 release.
>>> There are 31 patches in this series, all will be posted as a response
>>> to this one. If anyone has any issues with these being applied, please
>>> let me know.
>>>
>>> Responses should be made by Tue Apr 28 13:41:43 UTC 2015.
>>> Anything received after that time might be too late.
>>>
>>
>>
>> Building arc:defconfig ... failed
>> --------------
>> Error log:
>> arch/arc/mm/fault.c: In function 'do_page_fault':
>> arch/arc/mm/fault.c:163: error: 'VM_FAULT_SIGSEV' undeclared (first use in this function)
>>
>> Caused by the backport of 'vm: add VM_FAULT_SIGSEGV handling support' which introduced
>> above misspelling. Affected are 3.10.y-stable-queue and 3.14.y-stable-queue.
>
> Thanks, should now be fixed, I've applied commit e262eb9381ad ("arc: mm:
> Fix build failure")
>
>
Ah, now I know why that rang a bell.

Guenter

2015-04-26 20:01:58

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 3.10 00/31] 3.10.76-stable review

On 04/26/2015 06:48 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.10.76 release.
> There are 31 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Tue Apr 28 13:41:43 UTC 2015.
> Anything received after that time might be too late.
>

Build results:
total: 126 pass: 124 fail: 2
Failed builds:
arm64:allmodconfig
s390:allmodconfig

Qemu tests:
total: 27 pass: 27 fail: 0

Results are as expected; arm64:allmodconfig was previously known to fail,
and s390:allmodconfig is a new addition to the list of builds. It is also
known to fail in 3.10.

Details are available at http://server.roeck-us.net:8010/builders.

Guenter

2015-04-27 01:20:52

by Ben Hutchings

[permalink] [raw]
Subject: Re: [PATCH 3.10 27/31] deal with deadlock in d_walk()

On Sun, 2015-04-26 at 15:49 +0200, Greg Kroah-Hartman wrote:
> 3.10-stable review patch. If anyone has any objections, please let me know.
>
> ------------------
>
> From: Al Viro <[email protected]>
>
> commit ca5358ef75fc69fee5322a38a340f5739d997c10 upstream.
>
> ... by not hitting rename_retry for reasons other than rename having
> happened. In other words, do _not_ restart when finding that
> between unlocking the child and locking the parent the former got
> into __dentry_kill(). Skip the killed siblings instead...
>
> Signed-off-by: Al Viro <[email protected]>
> Cc: Ben Hutchings <[email protected]>
> [hujianyang: Backported to 3.10 refer to the work of Ben Hutchings in 3.2:
> - As we only have try_to_ascend() and not d_walk(), apply this
> change to all callers of try_to_ascend()
> - Adjust context to make __dentry_kill() apply to d_kill()]
> Signed-off-by: hujianyang <[email protected]>
> Signed-off-by: Greg Kroah-Hartman <[email protected]>

This is broken; you need to fold in commit 20defcec264c from 3.2.y
("dcache: Fix locking bugs in backported "deal with deadlock in
d_walk()"").

Ben.

> ---
> fs/dcache.c | 102 ++++++++++++++++++++++++++++++++++++------------------------
> 1 file changed, 62 insertions(+), 40 deletions(-)
>
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -364,9 +364,9 @@ static struct dentry *d_kill(struct dent
> __releases(parent->d_lock)
> __releases(dentry->d_inode->i_lock)
> {
> - list_del(&dentry->d_child);
> + __list_del_entry(&dentry->d_child);
> /*
> - * Inform try_to_ascend() that we are no longer attached to the
> + * Inform ascending readers that we are no longer attached to the
> * dentry tree
> */
> dentry->d_flags |= DCACHE_DENTRY_KILLED;
> @@ -988,35 +988,6 @@ void shrink_dcache_for_umount(struct sup
> }
>
> /*
> - * This tries to ascend one level of parenthood, but
> - * we can race with renaming, so we need to re-check
> - * the parenthood after dropping the lock and check
> - * that the sequence number still matches.
> - */
> -static struct dentry *try_to_ascend(struct dentry *old, int locked, unsigned seq)
> -{
> - struct dentry *new = old->d_parent;
> -
> - rcu_read_lock();
> - spin_unlock(&old->d_lock);
> - spin_lock(&new->d_lock);
> -
> - /*
> - * might go back up the wrong parent if we have had a rename
> - * or deletion
> - */
> - if (new != old->d_parent ||
> - (old->d_flags & DCACHE_DENTRY_KILLED) ||
> - (!locked && read_seqretry(&rename_lock, seq))) {
> - spin_unlock(&new->d_lock);
> - new = NULL;
> - }
> - rcu_read_unlock();
> - return new;
> -}
> -
> -
> -/*
> * Search for at least 1 mount point in the dentry's subdirs.
> * We descend to the next level whenever the d_subdirs
> * list is non-empty and continue searching.
> @@ -1070,17 +1041,32 @@ resume:
> /*
> * All done at this level ... ascend and resume the search.
> */
> + rcu_read_lock();
> +ascend:
> if (this_parent != parent) {
> struct dentry *child = this_parent;
> - this_parent = try_to_ascend(this_parent, locked, seq);
> - if (!this_parent)
> + this_parent = child->d_parent;
> +
> + spin_unlock(&child->d_lock);
> + spin_lock(&this_parent->d_lock);
> +
> + /* might go back up the wrong parent if we have had a rename. */
> + if (!locked && read_seqretry(&rename_lock, seq))
> goto rename_retry;
> next = child->d_child.next;
> + while (unlikely(child->d_flags & DCACHE_DENTRY_KILLED)) {
> + if (next == &this_parent->d_subdirs)
> + goto ascend;
> + child = list_entry(next, struct dentry, d_child);
> + next = next->next;
> + }
> + rcu_read_unlock();
> goto resume;
> }
> - spin_unlock(&this_parent->d_lock);
> if (!locked && read_seqretry(&rename_lock, seq))
> goto rename_retry;
> + spin_unlock(&this_parent->d_lock);
> + rcu_read_unlock();
> if (locked)
> write_sequnlock(&rename_lock);
> return 0; /* No mount points found in tree */
> @@ -1092,6 +1078,8 @@ positive:
> return 1;
>
> rename_retry:
> + spin_unlock(&this_parent->d_lock);
> + rcu_read_unlock();
> if (locked)
> goto again;
> locked = 1;
> @@ -1177,23 +1165,40 @@ resume:
> /*
> * All done at this level ... ascend and resume the search.
> */
> + rcu_read_lock();
> +ascend:
> if (this_parent != parent) {
> struct dentry *child = this_parent;
> - this_parent = try_to_ascend(this_parent, locked, seq);
> - if (!this_parent)
> + this_parent = child->d_parent;
> +
> + spin_unlock(&child->d_lock);
> + spin_lock(&this_parent->d_lock);
> +
> + /* might go back up the wrong parent if we have had a rename. */
> + if (!locked && read_seqretry(&rename_lock, seq))
> goto rename_retry;
> next = child->d_child.next;
> + while (unlikely(child->d_flags & DCACHE_DENTRY_KILLED)) {
> + if (next == &this_parent->d_subdirs)
> + goto ascend;
> + child = list_entry(next, struct dentry, d_child);
> + next = next->next;
> + }
> + rcu_read_unlock();
> goto resume;
> }
> out:
> - spin_unlock(&this_parent->d_lock);
> if (!locked && read_seqretry(&rename_lock, seq))
> goto rename_retry;
> + spin_unlock(&this_parent->d_lock);
> + rcu_read_unlock();
> if (locked)
> write_sequnlock(&rename_lock);
> return found;
>
> rename_retry:
> + spin_unlock(&this_parent->d_lock);
> + rcu_read_unlock();
> if (found)
> return found;
> if (locked)
> @@ -2954,26 +2959,43 @@ resume:
> }
> spin_unlock(&dentry->d_lock);
> }
> + rcu_read_lock();
> +ascend:
> if (this_parent != root) {
> struct dentry *child = this_parent;
> if (!(this_parent->d_flags & DCACHE_GENOCIDE)) {
> this_parent->d_flags |= DCACHE_GENOCIDE;
> this_parent->d_count--;
> }
> - this_parent = try_to_ascend(this_parent, locked, seq);
> - if (!this_parent)
> + this_parent = child->d_parent;
> +
> + spin_unlock(&child->d_lock);
> + spin_lock(&this_parent->d_lock);
> +
> + /* might go back up the wrong parent if we have had a rename. */
> + if (!locked && read_seqretry(&rename_lock, seq))
> goto rename_retry;
> next = child->d_child.next;
> + while (unlikely(child->d_flags & DCACHE_DENTRY_KILLED)) {
> + if (next == &this_parent->d_subdirs)
> + goto ascend;
> + child = list_entry(next, struct dentry, d_child);
> + next = next->next;
> + }
> + rcu_read_unlock();
> goto resume;
> }
> - spin_unlock(&this_parent->d_lock);
> if (!locked && read_seqretry(&rename_lock, seq))
> goto rename_retry;
> + spin_unlock(&this_parent->d_lock);
> + rcu_read_unlock();
> if (locked)
> write_sequnlock(&rename_lock);
> return;
>
> rename_retry:
> + spin_unlock(&this_parent->d_lock);
> + rcu_read_unlock();
> if (locked)
> goto again;
> locked = 1;
>
>

--
Ben Hutchings
I'm not a reverse psychological virus. Please don't copy me into your sig.


Attachments:
signature.asc (811.00 B)
This is a digitally signed message part

2015-04-27 04:02:33

by Willy Tarreau

[permalink] [raw]
Subject: Re: [PATCH 3.10 06/31] tcp: tcp_make_synack() should clear skb->tstamp

Hi Greg,

On Sun, Apr 26, 2015 at 03:46:26PM +0200, Greg Kroah-Hartman wrote:
> 3.10-stable review patch. If anyone has any objections, please let me know.
>
> ------------------
>
> From: Eric Dumazet <[email protected]>
>
> [ Upstream commit b50edd7812852d989f2ef09dcfc729690f54a42d ]
>
> I noticed tcpdump was giving funky timestamps for locally
> generated SYNACK messages on loopback interface.
>
> 11:42:46.938990 IP 127.0.0.1.48245 > 127.0.0.2.23850: S
> 945476042:945476042(0) win 43690 <mss 65495,nop,nop,sackOK,nop,wscale 7>
>
> 20:28:58.502209 IP 127.0.0.2.23850 > 127.0.0.1.48245: S
> 3160535375:3160535375(0) ack 945476043 win 43690 <mss
> 65495,nop,nop,sackOK,nop,wscale 7>
>
> This is because we need to clear skb->tstamp before
> entering lower stack, otherwise net_timestamp_check()
> does not set skb->tstamp.
>
> Fixes: 7faee5c0d514 ("tcp: remove TCP_SKB_CB(skb)->when")
> Signed-off-by: Eric Dumazet <[email protected]>
> Signed-off-by: David S. Miller <[email protected]>
> Signed-off-by: Greg Kroah-Hartman <[email protected]>
> ---

Unless I missed something, the commit this patch fixes was not
backported to 3.10 so I think we don't need this one. I have no
idea whether it can have a side effect there though, Eric will
probably know better.

Thanks,
Willy

2015-04-27 04:23:21

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH 3.10 06/31] tcp: tcp_make_synack() should clear skb->tstamp

Right.

Bug was introduced in 3.18, the Fixes: tag tells us ;)

git describe --contains 7faee5c0d514
v3.18-rc1~52^2~148^2

Note that it does not hurt having this backport to prior kernel versions.

Field is already 0 after skb allocation/cloning.



On Sun, Apr 26, 2015 at 9:02 PM, Willy Tarreau <[email protected]> wrote:
> Hi Greg,
>
> On Sun, Apr 26, 2015 at 03:46:26PM +0200, Greg Kroah-Hartman wrote:
>> 3.10-stable review patch. If anyone has any objections, please let me know.
>>
>> ------------------
>>
>> From: Eric Dumazet <[email protected]>
>>
>> [ Upstream commit b50edd7812852d989f2ef09dcfc729690f54a42d ]
>>
>> I noticed tcpdump was giving funky timestamps for locally
>> generated SYNACK messages on loopback interface.
>>
>> 11:42:46.938990 IP 127.0.0.1.48245 > 127.0.0.2.23850: S
>> 945476042:945476042(0) win 43690 <mss 65495,nop,nop,sackOK,nop,wscale 7>
>>
>> 20:28:58.502209 IP 127.0.0.2.23850 > 127.0.0.1.48245: S
>> 3160535375:3160535375(0) ack 945476043 win 43690 <mss
>> 65495,nop,nop,sackOK,nop,wscale 7>
>>
>> This is because we need to clear skb->tstamp before
>> entering lower stack, otherwise net_timestamp_check()
>> does not set skb->tstamp.
>>
>> Fixes: 7faee5c0d514 ("tcp: remove TCP_SKB_CB(skb)->when")
>> Signed-off-by: Eric Dumazet <[email protected]>
>> Signed-off-by: David S. Miller <[email protected]>
>> Signed-off-by: Greg Kroah-Hartman <[email protected]>
>> ---
>
> Unless I missed something, the commit this patch fixes was not
> backported to 3.10 so I think we don't need this one. I have no
> idea whether it can have a side effect there though, Eric will
> probably know better.
>
> Thanks,
> Willy
>

2015-04-27 04:45:31

by David Miller

[permalink] [raw]
Subject: Re: [PATCH 3.10 06/31] tcp: tcp_make_synack() should clear skb->tstamp

From: Willy Tarreau <[email protected]>
Date: Mon, 27 Apr 2015 06:02:22 +0200

> Hi Greg,
>
> On Sun, Apr 26, 2015 at 03:46:26PM +0200, Greg Kroah-Hartman wrote:
>> 3.10-stable review patch. If anyone has any objections, please let me know.
>>
>> ------------------
>>
>> From: Eric Dumazet <[email protected]>
>>
>> [ Upstream commit b50edd7812852d989f2ef09dcfc729690f54a42d ]
>>
>> I noticed tcpdump was giving funky timestamps for locally
>> generated SYNACK messages on loopback interface.
>>
>> 11:42:46.938990 IP 127.0.0.1.48245 > 127.0.0.2.23850: S
>> 945476042:945476042(0) win 43690 <mss 65495,nop,nop,sackOK,nop,wscale 7>
>>
>> 20:28:58.502209 IP 127.0.0.2.23850 > 127.0.0.1.48245: S
>> 3160535375:3160535375(0) ack 945476043 win 43690 <mss
>> 65495,nop,nop,sackOK,nop,wscale 7>
>>
>> This is because we need to clear skb->tstamp before
>> entering lower stack, otherwise net_timestamp_check()
>> does not set skb->tstamp.
>>
>> Fixes: 7faee5c0d514 ("tcp: remove TCP_SKB_CB(skb)->when")
>> Signed-off-by: Eric Dumazet <[email protected]>
>> Signed-off-by: David S. Miller <[email protected]>
>> Signed-off-by: Greg Kroah-Hartman <[email protected]>
>> ---
>
> Unless I missed something, the commit this patch fixes was not
> backported to 3.10 so I think we don't need this one. I have no
> idea whether it can have a side effect there though, Eric will
> probably know better.

Eric Dumazet mentioned this and said it's harmless.

2015-04-27 07:53:25

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 3.10 27/31] deal with deadlock in d_walk()

On Mon, Apr 27, 2015 at 02:20:32AM +0100, Ben Hutchings wrote:
> On Sun, 2015-04-26 at 15:49 +0200, Greg Kroah-Hartman wrote:
> > 3.10-stable review patch. If anyone has any objections, please let me know.
> >
> > ------------------
> >
> > From: Al Viro <[email protected]>
> >
> > commit ca5358ef75fc69fee5322a38a340f5739d997c10 upstream.
> >
> > ... by not hitting rename_retry for reasons other than rename having
> > happened. In other words, do _not_ restart when finding that
> > between unlocking the child and locking the parent the former got
> > into __dentry_kill(). Skip the killed siblings instead...
> >
> > Signed-off-by: Al Viro <[email protected]>
> > Cc: Ben Hutchings <[email protected]>
> > [hujianyang: Backported to 3.10 refer to the work of Ben Hutchings in 3.2:
> > - As we only have try_to_ascend() and not d_walk(), apply this
> > change to all callers of try_to_ascend()
> > - Adjust context to make __dentry_kill() apply to d_kill()]
> > Signed-off-by: hujianyang <[email protected]>
> > Signed-off-by: Greg Kroah-Hartman <[email protected]>
>
> This is broken; you need to fold in commit 20defcec264c from 3.2.y
> ("dcache: Fix locking bugs in backported "deal with deadlock in
> d_walk()"").

Thanks for letting me know, now applied.

greg k-h

2015-04-27 17:19:20

by Shuah Khan

[permalink] [raw]
Subject: Re: [PATCH 3.10 00/31] 3.10.76-stable review

On 04/26/2015 07:46 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.10.76 release.
> There are 31 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Tue Apr 28 13:41:43 UTC 2015.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.10.76-rc1.gz
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Compiled and booted on my test system. No dmesg regressions.

-- Shuah


--
Shuah Khan
Sr. Linux Kernel Developer
Open Source Innovation Group
Samsung Research America (Silicon Valley)
[email protected] | (970) 217-8978