Commit graph

4250 commits

Author SHA1 Message Date
Gianluca Borello
a5e8c07059 bpf: add bpf_probe_read_str helper
Provide a simple helper with the same semantics as strncpy_from_unsafe():

int bpf_probe_read_str(void *dst, int size, const void *unsafe_addr)

This gives more flexibility to a bpf program. A typical use case is
intercepting a file name during sys_open(). The current approach is:

SEC("kprobe/sys_open")
void bpf_sys_open(struct pt_regs *ctx)
{
	char buf[PATHLEN]; // PATHLEN is defined to 256
	bpf_probe_read(buf, sizeof(buf), ctx->di);

	/* consume buf */
}

This is suboptimal because the size of the string needs to be estimated
at compile time, causing more memory to be copied than often necessary,
and can become more problematic if further processing on buf is done,
for example by pushing it to userspace via bpf_perf_event_output(),
since the real length of the string is unknown and the entire buffer
must be copied (and defining an unrolled strnlen() inside the bpf
program is a very inefficient and unfeasible approach).

With the new helper, the code can easily operate on the actual string
length rather than the buffer size:

SEC("kprobe/sys_open")
void bpf_sys_open(struct pt_regs *ctx)
{
	char buf[PATHLEN]; // PATHLEN is defined to 256
	int res = bpf_probe_read_str(buf, sizeof(buf), ctx->di);

	/* consume buf, for example push it to userspace via
	 * bpf_perf_event_output(), but this time we can use
	 * res (the string length) as event size, after checking
	 * its boundaries.
	 */
}
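
As a rough illustration of the "after checking its boundaries" step, the
sketch below models the helper's return-value contract in plain userspace C.
It is not the kernel implementation: model_probe_read_str() and event_size()
are hypothetical names, and the model assumes the helper returns the number of
bytes written including the trailing NUL, or a negative value on error.

```c
#include <stddef.h>

#define PATHLEN 256

/* Userspace model of the assumed contract: copy at most size - 1 bytes,
 * always NUL-terminate, and return the number of bytes written including
 * the trailing NUL. The real helper can also fail on a faulting address,
 * which this model cannot reproduce. */
static int model_probe_read_str(char *dst, int size, const char *src)
{
	int i;

	if (size <= 0)
		return -1;
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';
	return i + 1;
}

/* Hypothetical helper: number of bytes of the buffer an event should carry.
 * Returns 0 when res is an error or lies outside the buffer. */
static size_t event_size(int res, size_t buf_size)
{
	if (res <= 0 || (size_t)res > buf_size)
		return 0;
	return (size_t)res;
}
```

With a 256-byte buffer and a short path, event_size() yields only the bytes
actually used, which is the saving the commit message describes.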

Another useful use case is when parsing individual process arguments or
individual environment variables navigating current->mm->arg_start and
current->mm->env_start: using this helper and the return value, one can
quickly iterate at the right offset of the memory area.
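
The iteration described above might look roughly like the following userspace
sketch, which walks a NUL-separated argument area by advancing the offset by
each return value. The names model_probe_read_str() and count_args() are
hypothetical, the model assumes the return value counts the trailing NUL, and
the area is assumed to end with a NUL byte.

```c
#include <stddef.h>

/* Userspace model of the assumed helper contract: returns bytes written,
 * including the trailing NUL, or a negative value on error. */
static int model_probe_read_str(char *dst, int size, const char *src)
{
	int i;

	if (size <= 0)
		return -1;
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';
	return i + 1;
}

/* Count the strings packed back to back in area[0..len). */
static int count_args(const char *area, size_t len)
{
	char buf[64];
	size_t off = 0;
	int n = 0;

	while (off < len) {
		int res = model_probe_read_str(buf, sizeof(buf), area + off);

		if (res <= 0)
			break;
		n++;
		/* A full buffer may mean truncation; stop rather than guess
		 * where the next string starts. */
		if ((size_t)res >= sizeof(buf))
			break;
		off += (size_t)res;   /* next string starts right after the NUL */
	}
	return n;
}
```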

The code changes simply leverage the existing strncpy_from_unsafe()
kernel function, which is safe to call from a bpf program as it is
already used in bpf_trace_printk().

Signed-off-by: Gianluca Borello <g.borello@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20 12:08:43 -05:00
Xin Long
7f9d68ac94 sctp: implement sender-side procedures for SSN Reset Request Parameter
This patch implements the sender-side procedures for the Outgoing
and Incoming SSN Reset Request Parameter described in rfc6525 sections
5.1.2 and 5.1.3.

It also adds the sockopt SCTP_RESET_STREAMS described in rfc6525
section 6.3.2 for users.

Note that the new asoc member strreset_outstanding makes sure that
only one reconf request chunk is in flight, as rfc6525 section 5.1.1
demands.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-18 14:55:11 -05:00
Xin Long
9fb657aec0 sctp: add sockopt SCTP_ENABLE_STREAM_RESET
This patch adds the sockopt SCTP_ENABLE_STREAM_RESET to get/set
strreset_enable, which indicates which reconf request types are
supported, as described in rfc6525 section 6.3.1.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-18 14:55:10 -05:00
Lance Richardson
53631a5f9c bridge: sparse fixes in br_ip6_multicast_alloc_query()
Changed type of csum field in struct igmpv3_query from __be16 to
__sum16 to eliminate type warning, made same change in struct
igmpv3_report for consistency.

Fixed up an ntohs() where htons() should have been used instead.
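
For readers unfamiliar with the distinction, here is a minimal userspace
sketch of the two directions. The helper names put_be16() and get_be16() are
made up for illustration. On common platforms htons() and ntohs() perform the
same byte swap, so mixing them up typically produces correct output and only
shows up as a type warning from a checker such as sparse.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Store a host-order 16-bit value into a wire buffer in network order. */
static void put_be16(uint8_t *wire, uint16_t host_value)
{
	uint16_t net = htons(host_value);   /* host -> network */

	memcpy(wire, &net, sizeof(net));
}

/* Load a 16-bit network-order value from a wire buffer into host order. */
static uint16_t get_be16(const uint8_t *wire)
{
	uint16_t net;

	memcpy(&net, wire, sizeof(net));
	return ntohs(net);                  /* network -> host */
}
```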

Signed-off-by: Lance Richardson <lrichard@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-17 15:22:05 -05:00
David S. Miller
580bdf5650 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
2017-01-17 15:19:37 -05:00
Robert Shearman
27d691056b mpls: Packet stats
Having MPLS packet stats is useful for observing network operation and
for diagnosing network problems. In the absence of anything better,
RFC2863 and RFC3813 are used for guidance for which stats to expose
and the semantics of them. In particular rx_noroutes maps to in
unknown protos in RFC2863. The stats are exposed to userspace via
AF_MPLS attributes embedded in the IFLA_STATS_AF_SPEC attribute of
RTM_GETSTATS messages.

All the introduced fields are 64-bit, even error ones, to ensure no
overflow with long uptimes. Per-CPU counters are used to avoid
cache-line contention on the commonly used fields. The other fields
have also been made per-CPU for code to avoid performance problems in
error conditions on the assumption that on some platforms the cost of
atomic operations could be more expensive than sending the packet
(which is what would be done in the success case). If that's not the
case, we could instead not use per-CPU counters for these fields.
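
The per-CPU idea can be sketched in userspace as one counter block per CPU
that is only summed when read. This is an illustration only: it does not use
the kernel's per-CPU API, and NR_CPUS_MODEL, struct link_stats_model, and the
field names are placeholders rather than the patch's actual definitions.

```c
#include <stdint.h>

#define NR_CPUS_MODEL 4   /* hypothetical CPU count for the sketch */

struct link_stats_model {
	uint64_t rx_packets;
	uint64_t rx_noroute;
};

/* One block per CPU: a CPU only ever writes its own slot, so the hot path
 * needs no atomic operations and no shared cache line. */
static struct link_stats_model percpu_stats[NR_CPUS_MODEL];

static void count_rx(int cpu, int routed)
{
	percpu_stats[cpu].rx_packets++;
	if (!routed)
		percpu_stats[cpu].rx_noroute++;
}

/* The reader pays the cost instead: it sums every CPU's block. */
static struct link_stats_model sum_stats(void)
{
	struct link_stats_model total = { 0, 0 };
	int cpu;

	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		total.rx_packets += percpu_stats[cpu].rx_packets;
		total.rx_noroute += percpu_stats[cpu].rx_noroute;
	}
	return total;
}
```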

Only unicast and non-fragment are exposed at the moment, but other
counters can be exposed in the future either by adding to the end of
struct mpls_link_stats or by additional netlink attributes in the
AF_MPLS IFLA_STATS_AF_SPEC nested attribute.

Signed-off-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-17 14:38:43 -05:00
Robert Shearman
aefb4d4ad8 net: AF-specific RTM_GETSTATS attributes
Add the functionality for including address-family-specific per-link
stats in RTM_GETSTATS messages. This is done through adding a new
IFLA_STATS_AF_SPEC attribute under which address family attributes are
nested and then the AF-specific attributes can be further nested. This
follows the model of IFLA_AF_SPEC on RTM_*LINK messages and it has the
advantage of presenting an easily extended hierarchy. The rtnl_af_ops
structure is extended to provide AFs with the opportunity to fill and
provide the size of their stats attributes.
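
To make the nesting concrete, the sketch below builds a three-level attribute
hierarchy in a byte buffer using the usual netlink attribute layout (16-bit
length, 16-bit type, payload, padding to 4 bytes). It is a userspace model and
not the kernel's nla_* API; the SKETCH_* type numbers and function names are
invented for illustration.

```c
#include <stdint.h>
#include <string.h>

#define ATTR_HDRLEN 4u
#define ATTR_ALIGN(len) (((len) + 3u) & ~3u)

/* Hypothetical attribute type numbers, for illustration only. */
#define SKETCH_STATS_AF_SPEC 1
#define SKETCH_AF_FAMILY     2
#define SKETCH_RX_PACKETS    1

/* Append one attribute at offset off; returns the offset after it. */
static size_t put_attr(uint8_t *buf, size_t off, uint16_t type,
		       const void *data, uint16_t len)
{
	uint16_t total = (uint16_t)(ATTR_HDRLEN + len);

	memcpy(buf + off, &total, 2);
	memcpy(buf + off + 2, &type, 2);
	if (len)
		memcpy(buf + off + ATTR_HDRLEN, data, len);
	memset(buf + off + total, 0, ATTR_ALIGN(total) - total);
	return off + ATTR_ALIGN(total);
}

/* Open a nest: a header whose length is patched later by nest_end(). */
static size_t nest_start(uint8_t *buf, size_t off, uint16_t type)
{
	return put_attr(buf, off, type, NULL, 0);
}

/* Close the nest whose header sits at start; children end at end. */
static void nest_end(uint8_t *buf, size_t start, size_t end)
{
	uint16_t total = (uint16_t)(end - start);

	memcpy(buf + start, &total, 2);
}
```

A message would open the outer nest, open one nest per address family inside
it, add that family's stats attributes, and then close both nests in reverse
order.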

One alternative would have been to provide AFs with the ability to add
attributes directly into the RTM_GETSTATS message without a nested
hierarchy. I discounted this approach as it increases the rate at
which the 32 attribute number space is used up and it makes
implementation a little more tricky for stats dump resuming (at the
moment the order in which attributes are added to the message has to
match the numeric order of the attributes).

Another alternative would have been to register per-AF RTM_GETSTATS
handlers. I discounted this approach as I perceived a common use-case
to be getting all the stats for an interface and this approach would
necessitate multiple requests/dumps to retrieve them all.

Signed-off-by: Robert Shearman <rshearma@brocade.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-17 14:38:43 -05:00
David Lebrun
a50a05f497 ipv6: sr: add missing Kbuild export for header files
Add missing IPv6-SR header files in include/uapi/linux/Kbuild.

Also, prevent seg6_lwt_headroom() from being exported and add
missing linux/types.h include.

Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-16 14:47:21 -05:00
Daniel Borkmann
f1f7714ea5 bpf: rework prog_digest into prog_tag
Commit 7bd509e311 ("bpf: add prog_digest and expose it via
fdinfo/netlink") was recently discussed, partly because of the
admittedly suboptimal name "prog_digest" in combination with
sha1 hash usage; inevitably and rightfully, concerns were
raised about its security in terms of collision resistance
for some use cases.

The intended use cases are debugging and introspection only:
providing a stable "tag" over the instruction sequence
that both kernel and user space can calculate independently.
It's not usable at all for making a security-relevant decision.
So collisions where two different instruction sequences generate
the same tag can happen, but ideally at a rather low rate. The
"tag" will be dumped in hex and is short enough to introspect
in tracepoints or kallsyms output along with other data such
as stack trace, etc. Thus, this patch performs a rename into
prog_tag and truncates the tag to a short output (64 bits) to
make it obvious it's not collision-free.
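
As a small illustration of "dumped in hex" with a 64-bit tag, the sketch below
renders the first 8 bytes of a longer digest as 16 hex characters. It assumes
the truncation keeps the leading bytes of the digest; TAG_SIZE and
tag_to_hex() are hypothetical names, and no hashing is performed here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define TAG_SIZE 8   /* 64 bits, per the commit message */

/* Render the first TAG_SIZE bytes of digest as lowercase hex.
 * out must have room for 2 * TAG_SIZE + 1 characters. */
static void tag_to_hex(const uint8_t *digest, char *out)
{
	size_t i;

	for (i = 0; i < TAG_SIZE; i++)
		snprintf(out + 2 * i, 3, "%02x", digest[i]);
}
```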

Should in future a hash or facility be needed with a security
relevant focus, then we can think about requirements, constraints,
etc that would fit to that situation. For now, rework the exposed
parts for the current use cases as long as nothing has been
released yet. Tested on x86_64 and s390x.

Fixes: 7bd509e311 ("bpf: add prog_digest and expose it via fdinfo/netlink")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-16 14:03:31 -05:00
David S. Miller
1a717fcf8b Merge tag 'mac80211-for-davem-2017-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
We have a number of fixes, in part because I was late
in actually sending them out - will try to do better in
the future:
 * handle VHT opmode properly when hostapd is controlling
   full station state
 * two fixes for minimum channel width in mac80211
 * don't leave SMPS set to junk in HT capabilities
 * fix headroom when forwarding mesh packets, recently
   broken by another fix that failed to take into account
   frame encryption
 * fix the TID in null-data packets indicating EOSP (end
   of service period) in U-APSD
 * prevent attempting to use (and then failing which
   results in crashes) TXQs on stations that aren't added
   to the driver yet
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-15 22:17:59 -05:00
David S. Miller
bb60b8b35a Merge tag 'mac80211-next-for-davem-2017-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
For 4.11, we seem to have more than in the past few releases:
 * socket owner support for connections, so when the wifi
   manager (e.g. wpa_supplicant) is killed, connections are
   torn down - wpa_supplicant is critical to managing certain
   operations, and can opt in to this where applicable
 * minstrel & minstrel_ht updates to be more efficient (time and space)
 * set wifi_acked/wifi_acked_valid for skb->destructor use in the
   kernel, which was already available to userspace
 * don't indicate new mesh peers that might be used if there's no
   room to add them
 * multicast-to-unicast support in mac80211, for better medium usage
   (since unicast frames can use *much* higher rates, by ~3 orders of
   magnitude)
 * add API to read channel (frequency) limitations from DT
 * add infrastructure to allow randomizing public action frames for
   MAC address privacy (still requires driver support)
 * many cleanups and small improvements/fixes across the board
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-14 12:02:15 -05:00
Purushottam Kushwaha
3093ebbeab cfg80211: Specify the reason for connect timeout
This enhances the connect timeout API to also carry the reason for the
timeout. These reason codes for the connect time out are represented by
enum nl80211_timeout_reason and are passed to user space through a new
attribute NL80211_ATTR_TIMEOUT_REASON (u32).

Signed-off-by: Purushottam Kushwaha <pkushwah@qti.qualcomm.com>
Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
[keep gfp_t argument last]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-01-13 09:46:18 +01:00
vamsi krishna
bf95ecdba9 cfg80211: Add support to sched scan to report better BSSs
Enhance sched scan to support option of finding a better BSS while in
connected state. Firmware scans the medium and reports when it finds a
known BSS which has better RSSI than the current connected BSS. New
attributes to specify the relative RSSI (compared to the current BSS)
are added to the sched scan to implement this.

Signed-off-by: vamsi krishna <vamsin@qti.qualcomm.com>
Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-01-13 09:40:41 +01:00
vamsi krishna
ab5bb2d51b cfg80211: Add support for randomizing TA of Public Action frames
Add support to use a random local address (Address 2 = TA in transmit
and the same address in receive functionality) for Public Action frames
in order to improve privacy of WLAN clients. Applications fill the
random transmit address in the frame buffer in the NL80211_CMD_FRAME
command. This can be used only with the drivers that indicate support
for random local address by setting the new
NL80211_EXT_FEATURE_MGMT_TX_RANDOM_TA and/or
NL80211_EXT_FEATURE_MGMT_TX_RANDOM_TA_CONNECTED in ext_features.

The driver needs to configure receive behavior to accept frames to the
specified random address during the time the frame exchange is pending
and such frames need to be acknowledged similarly to frames sent to the
local permanent address when this random address functionality is not
used.

Signed-off-by: vamsi krishna <vamsin@qti.qualcomm.com>
Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-01-13 09:39:47 +01:00
David S. Miller
02ac5d1487 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Two AF_* families adding entries to the lockdep tables
at the same time.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-11 14:43:39 -05:00
Simon Horman
99d31326cb net/sched: cls_flower: Support matching on ARP
Support matching on ARP operation, and hardware and protocol addresses
for Ethernet hardware and IPv4 protocol addresses.

Example usage:

tc qdisc add dev eth0 ingress

tc filter add dev eth0 protocol arp parent ffff: flower indev eth0 \
	arp_op request arp_sip 10.0.0.1 action drop
tc filter add dev eth0 protocol rarp parent ffff: flower indev eth0 \
	arp_op reply arp_tha 52:54:3f:00:00:00/24 action drop

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-11 11:02:47 -05:00
Beni Lev
06f7c88c10 cfg80211: consider VHT opmode on station update
Currently, this attribute is only fetched on station addition, but
not on station change. Since this info is only present in the assoc
request, with full station state support in the driver it cannot be
present when the station is added.

Thus, add support for changing the VHT opmode on station update if
done before (or while) the station is marked as associated. After
this, ignore it, since it used to be ignored.

Signed-off-by: Beni Lev <beni.lev@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-01-11 16:34:25 +01:00
Mike Frysinger
575b1967e1 timerfd: export defines to userspace
Since userspace is expected to call timerfd syscalls directly with these
flags/ioctls, make sure we export them so they don't have to duplicate
the values themselves.

Link: http://lkml.kernel.org/r/20161219064052.7196-1-vapier@gentoo.org
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-01-10 18:31:55 -08:00
David S. Miller
bda65b4255 Merge tag 'mlx5-4kuar-for-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

Saeed Mahameed says:

====================
mlx5 4K UAR

The following series of patches optimizes the usage of the UAR area which is
contained within the BAR 0-1. Previous versions of the firmware and the driver
assumed each system page contains a single UAR. This patch set will query the
firmware for a new capability that if published, means that the firmware can
support UARs of fixed 4K regardless of system page size. In the case of
powerpc, where page size equals 64KB, this means we can utilize 16 UARs per
system page. Since user space processes by default consume eight UARs per
context this means that with this change a process will need a single system
page to fulfill that requirement and in fact make use of more UARs which is
better in terms of performance.
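
The arithmetic in the paragraph above can be written out as a short sketch.
The function names and the fw_supports_4k parameter are illustrative and are
not taken from the driver.

```c
#include <stddef.h>

#define UAR_SIZE_4K 4096u

/* UARs that fit in one system page: one per page on older firmware,
 * page_size / 4K when the firmware advertises fixed 4K UARs. */
static size_t uars_per_page(size_t page_size, int fw_supports_4k)
{
	return fw_supports_4k ? page_size / UAR_SIZE_4K : 1;
}

/* System pages needed to provide uars_needed UARs (rounded up). */
static size_t pages_for_context(size_t uars_needed, size_t page_size,
				int fw_supports_4k)
{
	size_t per_page = uars_per_page(page_size, fw_supports_4k);

	return (uars_needed + per_page - 1) / per_page;
}
```

With 64KB pages this gives 16 UARs per page, so the default of eight UARs per
context fits in one page instead of eight.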

In addition to optimizing user-space processes, we introduce an allocator
that can be used by kernel consumers to allocate blue flame registers
(which are areas within a UAR that are used to write doorbells). This provides
further optimization on using the UAR area since the Ethernet driver makes
use of a single blue flame register per system page and now it will use two
blue flame registers per 4K.

The series also makes changes to naming conventions and now the terms used in
the driver code match the terms used in the PRM (programmers reference manual).
Thus, what used to be called UUAR (micro UAR) is now called BFREG (blue flame
register).

In order to support compatibility between different versions of
library/driver/firmware, the library has now means to notify the kernel driver
that it supports the new scheme and the kernel can notify the library if it
supports this extension. So mixed versions of libraries can run concurrently
without any issues.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-09 17:09:31 -05:00
Ursula Braun
f16a7dd5cf smc: netlink interface for SMC sockets
Support for SMC socket monitoring via netlink sockets of protocol
NETLINK_SOCK_DIAG.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-09 16:07:41 -05:00
Thomas Richter
6812baabf2 smc: establish pnet table management
Connection creation with SMC-R starts through an internal
TCP-connection. The Ethernet interface for this TCP-connection is not
restricted to the Ethernet interface of a RoCE device. Any existing
Ethernet interface belonging to the same physical net can be used, as
long as there is a defined relation between the Ethernet interface and
some RoCE devices. This relation is defined with the help of an
identification string called "Physical Net ID" or short "pnet ID".
Information about defined pnet IDs and their related Ethernet
interfaces and RoCE devices is stored in the SMC-R pnet table.

A pnet table entry consists of the identifying pnet ID and the
associated network and IB device.
This patch adds pnet table configuration support using the
generic netlink message interface referring to network and IB device
by their names. Commands exist to add, delete, and display pnet table
entries, and to flush or display the entire pnet table.

There are cross-checks to verify whether the ethernet interfaces
or infiniband devices really exist in the system. If either device
is not available, the pnet ID entry is not created.
Loss of network devices and IB devices is also monitored;
a pnet ID entry is removed when an associated network or
IB device is removed.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-09 16:07:38 -05:00
David S. Miller
bb1d303444 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
2017-01-09 15:39:11 -05:00
Davide Caratti
c008b33f3e net/sched: act_csum: compute crc32c on SCTP packets
Modify act_csum to compute crc32c on IPv4/IPv6 packets having SCTP in
their payload, and extend UAPI definitions accordingly.
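
For reference, a plain bitwise CRC32c (Castagnoli polynomial, reflected form
0x82F63B78) looks roughly like this. It is a slow, table-free sketch for
illustration and not the kernel's implementation. For SCTP, the checksum is
computed over the whole packet with the checksum field set to zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC32c: initial value 0xFFFFFFFF, reflected input and output,
 * final XOR with 0xFFFFFFFF. */
static uint32_t crc32c(const uint8_t *data, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;
	size_t i;
	int bit;

	for (i = 0; i < len; i++) {
		crc ^= data[i];
		for (bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return ~crc;
}
```

The standard check value for the ASCII string "123456789" is 0xE3069283.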

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-09 14:36:57 -05:00
Eli Cohen
30aa60b3bd IB/mlx5: Support 4k UAR for libmlx5
Add fields to structs to convey to kernel an indication whether the
library supports multi UARs per page and return to the library the size
of a UAR based on the queried value.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-09 20:25:09 +02:00
Andrzej Zaborowski
bd2522b168 cfg80211: NL80211_ATTR_SOCKET_OWNER support for CMD_CONNECT
Disconnect or deauthenticate when the owning socket is closed if this
flag is supplied to CMD_CONNECT or CMD_ASSOCIATE.  This may be used
to ensure that a userspace daemon doesn't leave an unmanaged connection
behind.

In some situations it would be possible to account for that, to some
degree, in the daemon restart code or in the up/down scripts without
the use of this attribute.  But there will be systems where the daemon
can go away for varying periods without warning due to local resource
management.

Signed-off-by: Andrew Zaborowski <andrew.zaborowski@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-01-09 13:08:47 +01:00
Willem de Bruijn
bc31c905e9 net-tc: convert tc_from to tc_from_ingress and tc_redirected
The tc_from field fulfills two roles. It encodes whether a packet was
redirected by an act_mirred device and, if so, whether act_mirred was
called on ingress or egress. Split it into separate fields.

The information is needed by the special IFB loop, where packets are
taken out of the normal path by act_mirred, forwarded to IFB, then
reinjected at their original location (ingress or egress) by IFB.

The IFB device cannot use skb->tc_at_ingress, because that may have
been overwritten as the packet travels from act_mirred to ifb_xmit,
when it passes through tc_classify on the IFB egress path. Cache this
value in skb->tc_from_ingress.

That field is valid only if a packet arriving at ifb_xmit came from
act_mirred. Other packets can be crafted to reach ifb_xmit. These
must be dropped. Set tc_redirected on redirection and drop all packets
that do not have this bit set.
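
The split into two single-bit fields can be modeled as follows. struct
pkt_model and both functions are simplified stand-ins, not the kernel's
sk_buff or its act_mirred/IFB code.

```c
#include <stdbool.h>

/* Minimal stand-in for the relevant packet metadata bits. */
struct pkt_model {
	unsigned int tc_at_ingress:1;    /* may be overwritten along the way */
	unsigned int tc_redirected:1;    /* set only by the redirect action */
	unsigned int tc_from_ingress:1;  /* cached origin, valid if redirected */
};

/* Model of the redirect step: mark the clone and cache where it came from. */
static void model_redirect(struct pkt_model *clone)
{
	clone->tc_redirected = 1;
	clone->tc_from_ingress = clone->tc_at_ingress;
}

/* Model of the receiving device's check: packets that were not redirected
 * are rejected, because their cached origin cannot be trusted. */
static bool model_ifb_accepts(const struct pkt_model *pkt)
{
	return pkt->tc_redirected;
}
```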

Both fields are set only on cloned skbs in tc actions, so original
packet sources do not have to clear the bit when reusing packets
(notably, pktgen and octeon).

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-08 20:58:52 -05:00
Willem de Bruijn
a5135bcfba net-tc: convert tc_verd to integer bitfields
Extract the remaining two fields from tc_verd and remove the __u16
completely. TC_AT and TC_FROM are converted to equivalent two-bit
integer fields tc_at and tc_from. Where possible, use existing
helper skb_at_tc_ingress when reading tc_at. Introduce helper
skb_reset_tc to clear fields.

Not documenting tc_from and tc_at, because they will be replaced
with single bit fields in follow-on patches.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-08 20:58:52 -05:00
Willem de Bruijn
e7246e122a net-tc: extract skip classify bit from tc_verd
Packets sent by the IFB device skip subsequent tc classification.
A single bit governs this state. Move it out of tc_verd in
anticipation of removing that __u16 completely.

The new bitfield tc_skip_classify temporarily uses one bit of a
hole, until tc_verd is removed completely in a follow-up patch.

Remove the bit hole comment. It could be 2, 3, 4 or 5 bits long.
With that many options, little value in documenting it.

Introduce a helper function to deduplicate the logic in the two
sites that check this bit.

The field tc_skip_classify is set only in IFB on skbs cloned in
act_mirred, so original packet sources do not have to clear the
bit when reusing packets (notably, pktgen and octeon).

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-08 20:58:52 -05:00
Willem de Bruijn
d6264071ce net-tc: make MAX_RECLASSIFY_LOOP local
This field is no longer kept in tc_verd. Remove it from the global
definition of that struct.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-08 20:58:52 -05:00
Willem de Bruijn
aec745e2c5 net-tc: remove unused tc_verd fields
Remove the last reference to tc_verd's munge and redirect ttl bits.
These fields are no longer used.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-08 20:58:52 -05:00
Eli Cohen
2f5ff26478 mlx5: Fix naming convention with respect to UARs
This establishes a solid naming convention for UARs. A UAR (User Access
Region) can have a size identical to a system page or can be fixed at 4KB,
depending on a value queried from firmware. Each UAR always has 4 blue
flame registers, which are used to post doorbells to send queues. In
addition, a UAR has a section used for posting doorbells to CQs or EQs. In
this patch we change names to reflect these conventions.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-01-08 11:21:26 +02:00
Nikolay Aleksandrov
1708ebc963 ipmr, ip6mr: add RTNH_F_UNRESOLVED flag to unresolved cache entries
While working with ipmr, we noticed that it is impossible to determine
if an entry is actually unresolved or its IIF interface has disappeared
(e.g. virtual interface got deleted). These entries look almost
identical to user-space when dumping or receiving notifications. So in
order to recognize them add a new RTNH_F_UNRESOLVED flag which is set when
sending an unresolved cache entry to user-space.

Suggested-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-03 10:04:31 -05:00
Santosh Shilimkar
3289025aed RDS: add receive message trace used by application
Add a socket option to tap receive path latency at various stages,
in nanoseconds. It can be enabled on selected sockets using the
SO_RDS_MSG_RXPATH_LATENCY socket option. RDS will return
the data to the application with RDS_CMSG_RXPATH_LATENCY in a defined
format. Room is left to add more trace points in the future
without changing the interface.

Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
2017-01-02 14:02:59 -08:00
Vincent Pelletier
96a420d2d3 usb: gadget: f_fs: Document eventfd effect on descriptor format.
When the FUNCTIONFS_EVENTFD flag is set, __ffs_data_got_descs reads a 32-bit,
little-endian value right after the fixed structure header, and passes it
to eventfd_ctx_fdget. Document this.
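
For illustration, encoding and decoding a 32-bit little-endian value at a
given offset in a descriptor blob can be done like this. The helper names are
hypothetical, and the offset is left as a parameter because the size of the
fixed header is not stated here.

```c
#include <stddef.h>
#include <stdint.h>

/* Write value at buf[off..off+3], least significant byte first. */
static void put_le32(uint8_t *buf, size_t off, uint32_t value)
{
	buf[off]     = (uint8_t)(value & 0xff);
	buf[off + 1] = (uint8_t)((value >> 8) & 0xff);
	buf[off + 2] = (uint8_t)((value >> 16) & 0xff);
	buf[off + 3] = (uint8_t)((value >> 24) & 0xff);
}

/* Read a 32-bit little-endian value from buf[off..off+3]. */
static uint32_t read_le32(const uint8_t *buf, size_t off)
{
	return (uint32_t)buf[off] |
	       ((uint32_t)buf[off + 1] << 8) |
	       ((uint32_t)buf[off + 2] << 16) |
	       ((uint32_t)buf[off + 3] << 24);
}
```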

Also, rephrase a comment to be affirmative about the role of string
descriptor at index 0. Ref: USB 2.0 spec paragraph "9.6.7 String", and
also checked to still be current in USB 3.0 spec paragraph "9.6.9 String".

Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2017-01-02 10:55:28 +02:00
Linus Torvalds
eb254f323b Merge branch 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cache allocation interface from Thomas Gleixner:
 "This provides support for Intel's Cache Allocation Technology, a cache
  partitioning mechanism.

  The interface is odd, but the hardware interface of that CAT stuff is
  odd as well.

  We tried hard to come up with an abstraction, but that only allowed
  rather simple partitioning, with no way of sharing or of dealing with
  the per-package nature of this mechanism.

  In the end we decided to expose the allocation bitmaps directly so all
  combinations of the hardware can be utilized.

  There are two ways of associating a cache partition:

   - Task

     A task can be added to a resource group. It uses the cache
     partition associated to the group.

   - CPU

     All tasks which are not members of a resource group use the group
     to which the CPU they are running on is associated.

     That allows for simple CPU based partitioning schemes.

  The main expected users are:

   - Virtualization, so a VM can only trash the associated part of
     the cache without disturbing others

   - Real-time systems, to separate RT and general workloads.

   - Latency sensitive enterprise workloads

   - In theory this also can be used to protect against cache side
     channel attacks"
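
Because the allocation bitmaps are exposed directly, whoever writes them
has to respect the hardware's constraints. The following is a minimal
sketch, assuming the CAT rule that the set bits of a capacity bitmask
must be contiguous. The helper names cbm_is_contiguous() and
cbm_overlap() are hypothetical and not part of the resctrl interface.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True when the mask is non-empty and its set bits form one contiguous
 * run. Adding the lowest set bit to the mask carries through a contiguous
 * run and clears all of it; any bit that survives means there was a gap.
 */
static bool cbm_is_contiguous(uint32_t cbm)
{
	uint32_t lowest;

	if (cbm == 0)
		return false;
	lowest = cbm & (~cbm + 1u);	/* isolate the lowest set bit */
	return ((cbm + lowest) & cbm) == 0;
}

/* Cache ways shared by two groups; zero means fully isolated partitions. */
static uint32_t cbm_overlap(uint32_t a, uint32_t b)
{
	return a & b;
}
```

For example, 0x0f0 and 0x00f describe two isolated partitions, while
0x0ff and 0x00f share four ways, which is the kind of sharing a simpler
abstraction could not express.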

[ Intel RDT is "Resource Director Technology". The interface really is
  rather odd and very specific, which delayed this pull request while I
  was thinking about it. The pull request itself came in early during
  the merge window, I just delayed it until things had calmed down and I
  had more time.

  But people tell me they'll use this, and the good news is that it is
  _so_ specific that it's rather independent of anything else, and no
  user is going to depend on the interface since it's pretty rare. So if
  push comes to shove, we can just remove the interface and nothing will
  break ]

* 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
  x86/intel_rdt: Implement show_options() for resctrlfs
  x86/intel_rdt: Call intel_rdt_sched_in() with preemption disabled
  x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount
  x86/intel_rdt: Fix setting of closid when adding CPUs to a group
  x86/intel_rdt: Update percpu closid immeditately on CPUs affected by changee
  x86/intel_rdt: Reset per cpu closids on unmount
  x86/intel_rdt: Select KERNFS when enabling INTEL_RDT_A
  x86/intel_rdt: Prevent deadlock against hotplug lock
  x86/intel_rdt: Protect info directory from removal
  x86/intel_rdt: Add info files to Documentation
  x86/intel_rdt: Export the minimum number of set mask bits in sysfs
  x86/intel_rdt: Propagate error in rdt_mount() properly
  x86/intel_rdt: Add a missing #include
  MAINTAINERS: Add maintainer for Intel RDT resource allocation
  x86/intel_rdt: Add scheduler hook
  x86/intel_rdt: Add schemata file
  x86/intel_rdt: Add tasks files
  x86/intel_rdt: Add cpus file
  x86/intel_rdt: Add mkdir to resctrl file system
  x86/intel_rdt: Add "info" files to resctrl file system
  ...
2016-12-22 09:25:45 -08:00
Linus Torvalds
bd9999cd6a media updates for v4.10-rc1

Merge tag 'media/v4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media updates from Mauro Carvalho Chehab:

 - new Mediatek drivers: mtk-mdp and mtk-vcodec

 - some additions to the media documentation

 - the CEC core and drivers were promoted from staging to mainstream

 - some cleanups in the DVB core

 - the LIRC serial driver got promoted from staging to mainstream

 - added a driver for Renesas R-Car FDP1

 - add DVBv5 statistics support to mn88473 driver

 - several fixes related to printk continuation lines

 - add support for HSV encoding formats

 - lots of other cleanups, fixups and driver improvements.

* tag 'media/v4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (496 commits)
  [media] v4l: tvp5150: Add missing break in set control handler
  [media] v4l: tvp5150: Don't inline the tvp5150_selmux() function
  [media] v4l: tvp5150: Compile tvp5150_link_setup out if !CONFIG_MEDIA_CONTROLLER
  [media] em28xx: don't store usb_device at struct em28xx
  [media] em28xx: use usb_interface for dev_foo() calls
  [media] em28xx: don't change the device's name
  [media] mn88472: fix chip id check on probe
  [media] mn88473: fix chip id check on probe
  [media] lirc: fix error paths in lirc_cdev_add()
  [media] s5p-mfc: Add support for MFC v8 available in Exynos 5433 SoCs
  [media] s5p-mfc: Rework clock handling
  [media] s5p-mfc: Don't keep clock prepared all the time
  [media] s5p-mfc: Kill all IS_ERR_OR_NULL in clocks management code
  [media] s5p-mfc: Remove dead conditional code
  [media] s5p-mfc: Ensure that clock is disabled before turning power off
  [media] s5p-mfc: Remove special clock rate management
  [media] s5p-mfc: Use printk_ratelimited for reporting ioctl errors
  [media] s5p-mfc: Set DMA_ATTR_ALLOC_SINGLE_PAGES
  [media] vivid: Set color_enc on HSV formats
  [media] v4l2-tpg: Init hv_enc field with a valid value
  ...
2016-12-16 09:39:16 -08:00
Arend Van Spriel
e77a8be9a0 nl80211: better describe field in struct nl80211_bss_select_rssi_adjust
The two fields in struct nl80211_bss_select_rssi_adjust did not state
their type or unit. Adding documentation.

Reported-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-12-16 13:32:49 +01:00
Linus Torvalds
ed3c5a0be3 virtio, vhost: new device, fixes, speedups
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio updates from Michael Tsirkin:
 "virtio, vhost: new device, fixes, speedups

  This includes the new virtio crypto device, and fixes all over the
  place. In particular enabling endian-ness checks for sparse builds
  found some bugs which this fixes. And it appears that everyone is in
  agreement that disabling endian-ness sparse checks shouldn't be
  necessary any longer.

  So this enables them for everyone, and drops the __CHECK_ENDIAN__ and
  __bitwise__ APIs.

  IRQ handling in virtio has been refactored somewhat, the larger switch
  to IRQ_SHARED will have to wait as it proved too aggressive"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (34 commits)
  Makefile: drop -D__CHECK_ENDIAN__ from cflags
  fs/logfs: drop __CHECK_ENDIAN__
  Documentation/sparse: drop __CHECK_ENDIAN__
  linux: drop __bitwise__ everywhere
  checkpatch: replace __bitwise__ with __bitwise
  Documentation/sparse: drop __bitwise__
  tools: enable endian checks for all sparse builds
  linux/types.h: enable endian checks for all sparse builds
  virtio_mmio: Set dev.release() to avoid warning
  vhost: remove unused feature bit
  virtio_ring: fix description of virtqueue_get_buf
  vhost/scsi: Remove unused but set variable
  tools/virtio: use {READ,WRITE}_ONCE() in uaccess.h
  vringh: kill off ACCESS_ONCE()
  tools/virtio: fix READ_ONCE()
  crypto: add virtio-crypto driver
  vhost: cache used event for better performance
  vsock: lookup and setup guest_cid inside vhost_vsock_lock
  virtio_pci: split vp_try_to_find_vqs into INTx and MSI-X variants
  virtio_pci: merge vp_free_vectors into vp_del_vqs
  ...
2016-12-15 18:13:41 -08:00
Michael S. Tsirkin
9efeccacd3 linux: drop __bitwise__ everywhere
__bitwise__ used to mean "yes, please enable sparse checks
unconditionally", but now that we have dropped __CHECK_ENDIAN__,
__bitwise is exactly the same.
There aren't many users, so replace it with __bitwise everywhere.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Stefan Schmidt <stefan@osg.samsung.com>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Akced-by: Lee Duncan <lduncan@suse.com>
2016-12-16 00:13:41 +02:00
Michael S. Tsirkin
05de97003c linux/types.h: enable endian checks for all sparse builds
By now, linux is mostly endian-clean. Enabling endian-ness
checks for everyone produces about 200 new sparse warnings for me -
less than 10% over the 2000 sparse warnings already there.

Not a big deal, OTOH enabling this helps people notice
they are introducing new bugs.

So let's just drop __CHECK_ENDIAN__. Follow-up patches
can drop distinction between __bitwise and __bitwise__.
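
For readers unfamiliar with the annotation, here is a minimal sketch of
what the endian checks look like from the code's side. The names
SK_BITWISE, SK_FORCE, sk_le32, sk_cpu_to_le32() and sk_le32_to_cpu() are
hypothetical stand-ins for the kernel's __bitwise, __force, __le32 and
conversion helpers. Under sparse (which defines __CHECKER__) the type
becomes distinct from plain integers; under a normal compiler the
annotations expand to nothing.

```c
#include <stdint.h>
#include <string.h>

#ifdef __CHECKER__
#define SK_BITWISE __attribute__((bitwise))
#define SK_FORCE   __attribute__((force))
#else
#define SK_BITWISE
#define SK_FORCE
#endif

/* A little-endian 32-bit value that sparse refuses to mix with uint32_t. */
typedef uint32_t SK_BITWISE sk_le32;

/* Convert a CPU-order value to little-endian storage, on any host. */
static sk_le32 sk_cpu_to_le32(uint32_t v)
{
	unsigned char b[4] = {
		(unsigned char)(v & 0xff),
		(unsigned char)((v >> 8) & 0xff),
		(unsigned char)((v >> 16) & 0xff),
		(unsigned char)((v >> 24) & 0xff),
	};
	uint32_t raw;

	memcpy(&raw, b, sizeof(raw));
	return (SK_FORCE sk_le32)raw;	/* explicit, checker-visible cast */
}

/* Convert little-endian storage back to a CPU-order value. */
static uint32_t sk_le32_to_cpu(sk_le32 v)
{
	uint32_t raw = (SK_FORCE uint32_t)v;
	unsigned char b[4];

	memcpy(b, &raw, sizeof(b));
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}
```

With the checks enabled for every sparse build, assigning an sk_le32
directly to a uint32_t without the forced cast would be reported, which
is how new endianness bugs get noticed.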

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-12-16 00:13:39 +02:00
Jason Wang
8d390464bf vhost: remove unused feature bit
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-12-16 00:13:38 +02:00
Gonglei
dbaf0624ff crypto: add virtio-crypto driver
This patch introduces the virtio-crypto driver for the Linux kernel.

The virtio crypto device is a virtual cryptography device
as well as a kind of virtual hardware accelerator for
virtual machines. The encryption and decryption requests
are placed in the data queue and are ultimately handled by
the backend crypto accelerators. The second queue is the
control queue, used to create or destroy sessions for
symmetric algorithms; it will control some advanced features
in the future. The virtio crypto device provides the following
crypto services: CIPHER, MAC, HASH, and AEAD.

For more information about virtio-crypto device, please see:
  http://qemu-project.org/Features/VirtioCrypto

CC: Michael S. Tsirkin <mst@redhat.com>
CC: Cornelia Huck <cornelia.huck@de.ibm.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: Halil Pasic <pasic@linux.vnet.ibm.com>
CC: David S. Miller <davem@davemloft.net>
CC: Zeng Xin <xin.zeng@intel.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-12-16 00:13:32 +02:00
Linus Torvalds
0ab7b12c49 pci-v4.10-changes

Merge tag 'pci-v4.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "PCI changes:

   - add support for PCI on ARM64 boxes with ACPI. We already had this
     for theoretical spec-compliant hardware; now we're adding quirks
     for the actual hardware (Cavium, HiSilicon, Qualcomm, X-Gene)

   - add runtime PM support for hotplug ports

   - enable runtime suspend for Intel UHCI that uses platform-specific
     wakeup signaling

   - add yet another host bridge registration interface. We hope this is
     extensible enough to subsume the others

   - expose device revision in sysfs for DRM

   - to avoid device conflicts, make sure any VF BAR updates are done
     before enabling the VF

   - avoid unnecessary link retrains for ASPM

   - allow INTx masking on Mellanox devices that support it

   - allow access to non-standard VPD for Chelsio devices

   - update Broadcom iProc support for PAXB v2, PAXC v2, inbound DMA,
     etc

   - update Rockchip support for max-link-speed

   - add NVIDIA Tegra210 support

   - add Layerscape LS1046a support

   - update R-Car compatibility strings

   - add Qualcomm MSM8996 support

   - remove some uninformative bootup messages"

* tag 'pci-v4.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (115 commits)
  PCI: Enable access to non-standard VPD for Chelsio devices (cxgb3)
  PCI: Expand "VPD access disabled" quirk message
  PCI: pciehp: Remove loading message
  PCI: hotplug: Remove hotplug core message
  PCI: Remove service driver load/unload messages
  PCI/AER: Log AER IRQ when claiming Root Port
  PCI/AER: Log errors with PCI device, not PCIe service device
  PCI/AER: Remove unused version macros
  PCI/PME: Log PME IRQ when claiming Root Port
  PCI/PME: Drop unused support for PMEs from Root Complex Event Collectors
  PCI: Move config space size macros to pci_regs.h
  x86/platform/intel-mid: Constify mid_pci_platform_pm
  PCI/ASPM: Don't retrain link if ASPM not possible
  PCI: iproc: Skip check for legacy IRQ on PAXC buses
  PCI: pciehp: Leave power indicator on when enabling already-enabled slot
  PCI: pciehp: Prioritize data-link event over presence detect
  PCI: rcar: Add gen3 fallback compatibility string for pcie-rcar
  PCI: rcar: Use gen2 fallback compatibility last
  PCI: rcar-gen2: Use gen2 fallback compatibility last
  PCI: rockchip: Move the deassert of pm/aclk/pclk after phy_init()
  ..
2016-12-15 12:46:48 -08:00
Linus Torvalds
4d5b57e05a Updates for 4.10 kernel merge window
- Shared mlx5 updates with net stack (will drop out on merge if Dave's
   tree has already been merged)
 - Driver updates: cxgb4, hfi1, hns-roce, i40iw, mlx4, mlx5, qedr, rxe
 - Debug cleanups
 - New connection rejection helpers
 - SRP updates
 - Various misc fixes
 - New paravirt driver from vmware

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma updates from Doug Ledford:
 "This is the complete update for the rdma stack for this release cycle.

  Most of it is typical driver and core updates, but there is the
  entirely new VMWare pvrdma driver. You may have noticed that there
  were changes in DaveM's pull request to the bnxt Ethernet driver to
  support a RoCE RDMA driver. The bnxt_re driver was tentatively set to
  be pulled in this release cycle, but it simply wasn't ready in time
  and was dropped (a few review comments still to address, and some
  multi-arch build issues like prefetch() not working across all
  arches).

  Summary:

   - shared mlx5 updates with net stack (will drop out on merge if
     Dave's tree has already been merged)

   - driver updates: cxgb4, hfi1, hns-roce, i40iw, mlx4, mlx5, qedr, rxe

   - debug cleanups

   - new connection rejection helpers

   - SRP updates

   - various misc fixes

   - new paravirt driver from vmware"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (210 commits)
  IB: Add vmw_pvrdma driver
  IB/mlx4: fix improper return value
  IB/ocrdma: fix bad initialization
  infiniband: nes: return value of skb_linearize should be handled
  MAINTAINERS: Update Intel RDMA RNIC driver maintainers
  MAINTAINERS: Remove Mitesh Ahuja from emulex maintainers
  IB/core: fix unmap_sg argument
  qede: fix general protection fault may occur on probe
  IB/mthca: Replace pci_pool_alloc by pci_pool_zalloc
  mlx5, calc_sq_size(): Make a debug message more informative
  mlx5: Remove a set-but-not-used variable
  mlx5: Use { } instead of { 0 } to init struct
  IB/srp: Make writing the add_target sysfs attr interruptible
  IB/srp: Make mapping failures easier to debug
  IB/srp: Make login failures easier to debug
  IB/srp: Introduce a local variable in srp_add_one()
  IB/srp: Fix CONFIG_DYNAMIC_DEBUG=n build
  IB/multicast: Check ib_find_pkey() return value
  IPoIB: Avoid reading an uninitialized member variable
  IB/mad: Fix an array index check
  ...
2016-12-15 12:03:32 -08:00
Mauro Carvalho Chehab
65390ea01c Merge branch 'patchwork' into v4l_for_linus
* patchwork: (496 commits)
  [media] v4l: tvp5150: Add missing break in set control handler
  [media] v4l: tvp5150: Don't inline the tvp5150_selmux() function
  [media] v4l: tvp5150: Compile tvp5150_link_setup out if !CONFIG_MEDIA_CONTROLLER
  [media] em28xx: don't store usb_device at struct em28xx
  [media] em28xx: use usb_interface for dev_foo() calls
  [media] em28xx: don't change the device's name
  [media] mn88472: fix chip id check on probe
  [media] mn88473: fix chip id check on probe
  [media] lirc: fix error paths in lirc_cdev_add()
  [media] s5p-mfc: Add support for MFC v8 available in Exynos 5433 SoCs
  [media] s5p-mfc: Rework clock handling
  [media] s5p-mfc: Don't keep clock prepared all the time
  [media] s5p-mfc: Kill all IS_ERR_OR_NULL in clocks management code
  [media] s5p-mfc: Remove dead conditional code
  [media] s5p-mfc: Ensure that clock is disabled before turning power off
  [media] s5p-mfc: Remove special clock rate management
  [media] s5p-mfc: Use printk_ratelimited for reporting ioctl errors
  [media] s5p-mfc: Set DMA_ATTR_ALLOC_SINGLE_PAGES
  [media] vivid: Set color_enc on HSV formats
  [media] v4l2-tpg: Init hv_enc field with a valid value
  ...
2016-12-15 08:38:35 -02:00
Linus Torvalds
dcdaa2f948 Merge branch 'stable-4.10' of git://git.infradead.org/users/pcmoore/audit
Pull audit updates from Paul Moore:
 "After the small number of patches for v4.9, we've got a much bigger
  pile for v4.10.

  The bulk of these patches involve a rework of the audit backlog queue
  to enable us to move the netlink multicasting out of the task/thread
  that generates the audit record and into the kernel thread that emits
  the record (just like we do for the audit unicast to auditd).

  While we were playing with the backlog queue(s) we fixed a number of
  other little problems with the code, and from all the testing so far
  things look to be in much better shape now. Doing this also allowed us
  to re-enable disabling IRQs for some netns operations ("netns: avoid
  disabling irq for netns id").

  The remaining patches fix some small problems that are well documented
  in the commit descriptions, as well as adding session ID filtering
  support"

* 'stable-4.10' of git://git.infradead.org/users/pcmoore/audit:
  audit: use proper refcount locking on audit_sock
  netns: avoid disabling irq for netns id
  audit: don't ever sleep on a command record/message
  audit: handle a clean auditd shutdown with grace
  audit: wake up kauditd_thread after auditd registers
  audit: rework audit_log_start()
  audit: rework the audit queue handling
  audit: rename the queues and kauditd related functions
  audit: queue netlink multicast sends just like we do for unicast sends
  audit: fixup audit_init()
  audit: move kaudit thread start from auditd registration to kaudit init (#2)
  audit: add support for session ID user filter
  audit: fix formatting of AUDIT_CONFIG_CHANGE events
  audit: skip sessionid sentinel value when auto-incrementing
  audit: tame initialization warning len_abuf in audit_log_execve_info
  audit: less stack usage for /proc/*/loginuid
2016-12-14 14:06:40 -08:00
Linus Torvalds
683b96f4d1 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
 "Generally pretty quiet for this release. Highlights:

  Yama:
   - allow ptrace access for original parent after re-parenting

  TPM:
   - add documentation
   - many bugfixes & cleanups
   - define a generic open() method for ascii & bios measurements

  Integrity:
   - Harden against malformed xattrs

  SELinux:
   - bugfixes & cleanups

  Smack:
   - Remove unnecessary smack_known_invalid label
   - Do not apply star label in smack_setprocattr hook
   - parse mnt opts after privileges check (fixes unpriv DoS vuln)"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (56 commits)
  Yama: allow access for the current ptrace parent
  tpm: adjust return value of tpm_read_log
  tpm: vtpm_proxy: conditionally call tpm_chip_unregister
  tpm: Fix handling of missing event log
  tpm: Check the bios_dir entry for NULL before accessing it
  tpm: return -ENODEV if np is not set
  tpm: cleanup of printk error messages
  tpm: replace of_find_node_by_name() with dev of_node property
  tpm: redefine read_log() to handle ACPI/OF at runtime
  tpm: fix the missing .owner in tpm_bios_measurements_ops
  tpm: have event log use the tpm_chip
  tpm: drop tpm1_chip_register(/unregister)
  tpm: replace dynamically allocated bios_dir with a static array
  tpm: replace symbolic permission with octal for securityfs files
  char: tpm: fix kerneldoc tpm2_unseal_trusted name typo
  tpm_tis: Allow tpm_tis to be bound using DT
  tpm, tpm_vtpm_proxy: add kdoc comments for VTPM_PROXY_IOC_NEW_DEV
  tpm: Only call pm_runtime_get_sync if device has a parent
  tpm: define a generic open() method for ascii & bios measurements
  Documentation: tpm: add the Physical TPM device tree binding documentation
  ...
2016-12-14 13:57:44 -08:00
Linus Torvalds
0f1d6dfe03 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "Here is the crypto update for 4.10:

  API:
   - add skcipher walk interface
   - add asynchronous compression (acomp) interface
   - fix algif_aead AIO handling of zero buffer

  Algorithms:
   - fix unaligned access in poly1305
   - fix DRBG output to large buffers

  Drivers:
   - add support for iMX6UL to caam
   - fix givenc descriptors (used by IPsec) in caam
   - accelerated SHA256/SHA512 for ARM64 from OpenSSL
   - add SSE CRCT10DIF and CRC32 to ARM/ARM64
   - add AEAD support to Chelsio chcr
   - add Armada 8K support to omap-rng"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (148 commits)
  crypto: testmgr - fix overlap in chunked tests again
  crypto: arm/crc32 - accelerated support based on x86 SSE implementation
  crypto: arm64/crc32 - accelerated support based on x86 SSE implementation
  crypto: arm/crct10dif - port x86 SSE implementation to ARM
  crypto: arm64/crct10dif - port x86 SSE implementation to arm64
  crypto: testmgr - add/enhance test cases for CRC-T10DIF
  crypto: testmgr - avoid overlap in chunked tests
  crypto: chcr - checking for IS_ERR() instead of NULL
  crypto: caam - check caam_emi_slow instead of re-lookup platform
  crypto: algif_aead - fix AIO handling of zero buffer
  crypto: aes-ce - Make aes_simd_algs static
  crypto: algif_skcipher - set error code when kcalloc fails
  crypto: caam - make aamalg_desc a proper module
  crypto: caam - pass key buffers with typesafe pointers
  crypto: arm64/aes-ce-ccm - Fix AEAD decryption length
  MAINTAINERS: add crypto headers to crypto entry
  crypt: doc - remove misleading mention of async API
  crypto: doc - fix header file name
  crypto: api - fix comment typo
  crypto: skcipher - Add separate walker for AEAD decryption
  ..
2016-12-14 13:31:29 -08:00
Doug Ledford
6f94ba2079 Merge branch 'vmw_pvrdma' into merge-test
2016-12-14 14:56:21 -05:00
Adit Ranadive
29c8d9eba5 IB: Add vmw_pvrdma driver
This patch series adds a driver for a paravirtual RDMA device. The
device is developed for VMware's Virtual Machines and allows existing RDMA
applications to continue to use existing Verbs API when deployed in VMs
on ESXi. We recently did a presentation in the OFA Workshop [1] regarding
this device.

Description and RDMA Support
============================
The virtual device is exposed as a dual function PCIe device. One part
is a virtual network device (VMXNet3) which provides networking properties
like MAC, IP addresses to the RDMA part of the device. The networking
properties are used to register GIDs required by RDMA applications to
communicate.

These patches add support and all the required infrastructure for
letting applications use such a device. We support the mandatory Verbs API as
well as the base memory management extensions (Local Inv, Send with Inv and
Fast Register Work Requests). We currently support both Reliable Connected
and Unreliable Datagram QPs but do not support Shared Receive Queues
(SRQs).

Also, we support the following types of Work Requests:
 o Send/Receive (with or without Immediate Data)
 o RDMA Write (with or without Immediate Data)
 o RDMA Read
 o Local Invalidate
 o Send with Invalidate
 o Fast Register Work Requests

This version only adds support for version 1 of RoCE. We will add RoCEv2
support in a future patch. We do support registration of both MAC-based
and IP-based GIDs. I have also created a git tree for our user-level driver
[2].

Testing
=======
We have tested this internally for various types of Guest OS - Red Hat,
CentOS, Ubuntu 12.04/14.04/16.04, Oracle Enterprise Linux, SLES 12 -
using backported versions of this driver. The tests included several
runs of the performance tests (included with OFED), the Intel MPI PingPong
benchmark on OpenMPI, and krping for FRWRs. Mellanox has been kind enough
to test the backported version of the driver internally on their hardware
using a VMware-provided ESX build. I have also applied and tested this
with Doug's k.o/for-4.9 branch (commit 5603910b). Note that this patch
series should be applied all together. I split out the commits so that
it may be easier to review.

PVRDMA Resources
================
[1] OFA Workshop Presentation -
https://openfabrics.org/images/eventpresos/2016presentations/102parardma.pdf

[2] Libpvrdma User-level library -
http://git.openfabrics.org/?p=~aditr/libpvrdma.git;a=summary

Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Reviewed-by: George Zhang <georgezhang@vmware.com>
Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
Reviewed-by: Bryan Tan <bryantan@vmware.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Adit Ranadive <aditr@vmware.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14 14:55:10 -05:00