This is the simplest possible policy that still does something of note.
When a pte_numa is faulted, it is moved immediately. Any replacement
policy must at least do better than this and in all likelihood this
policy regresses normal workloads.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
The use of MPOL_NOOP and MPOL_MF_LAZY to allow an application to
explicitly request lazy migration is a good idea but the actual
API has not been well reviewed and once released we have to support it.
For now this patch prevents an application using the services. This
will need to be revisited.
Signed-off-by: Mel Gorman <mgorman@suse.de>
NOTE: Once again there is a lot of patch stealing and the end result
is sufficiently different that I had to drop the signed-offs.
Will re-add if the original authors are ok with that.
This patch adds another mbind() flag to request "lazy migration". The
flag, MPOL_MF_LAZY, modifies MPOL_MF_MOVE* such that the selected
pages are marked PROT_NONE. The pages will be migrated in the fault
path on "first touch", if the policy dictates at that time.
"Lazy Migration" will allow testing of migrate-on-fault via mbind().
It also allows applications to request that only subsequently touched
pages be migrated to obey the new policy, instead of all pages in the range.
This can be useful for multi-threaded applications working on a
large shared data area that is initialized by an initial thread,
leaving all pages on one [or a few, if overflowed] nodes.
After being marked PROT_NONE, the pages in regions assigned to the worker
threads will be automatically migrated local to the threads on first touch.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
This patch provides a new function to test whether a page resides
on a node that is appropriate for the mempolicy for the vma and
address where the page is supposed to be mapped. This involves
looking up the node where the page belongs. So, the function
returns that node so that it may be used to allocate the page
without consulting the policy again.
A subsequent patch will call this function from the fault path.
Because of this, I don't want to go ahead and allocate the page, e.g.,
via alloc_page_vma() only to have to free it if it has the correct
policy. So, I just mimic the alloc_page_vma() node computation
logic--sort of.
Note: we could use this function to implement a MPOL_MF_STRICT
behavior when migrating pages to match mbind() mempolicy--e.g.,
to ensure that pages in an interleaved range are reinterleaved
rather than left where they are when they reside on any node in
the interleave nodemask.
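The node computation described above can be sketched as follows. This is a toy model under stated assumptions, not the kernel implementation: the toy_* names, the 64-bit nodemask, and the choice of "first node in the mask" for a misplaced page under a bind policy are all made up for illustration. It returns -1 when the page is already on an acceptable node, otherwise the node the page should be moved to.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified policy modes; not the kernel's definitions. */
enum toy_mode { TOY_DEFAULT, TOY_PREFERRED, TOY_BIND, TOY_INTERLEAVE };

struct toy_policy {
	enum toy_mode mode;
	int preferred_node;	/* used by TOY_PREFERRED */
	uint64_t nodemask;	/* bit n set => node n is allowed */
};

/* Number of nodes set in the mask. */
static int toy_mask_weight(uint64_t mask)
{
	int w = 0;

	for (; mask; mask &= mask - 1)
		w++;
	return w;
}

/* Return the n-th set node in the mask (n counted from 0), or -1. */
static int toy_nth_node(uint64_t mask, int n)
{
	for (int node = 0; node < 64; node++) {
		if (!(mask & (UINT64_C(1) << node)))
			continue;
		if (n-- == 0)
			return node;
	}
	return -1;
}

/* -1: page is fine where it is; otherwise the node it should live on. */
static int toy_misplaced(const struct toy_policy *pol, int page_node,
			 int local_node, unsigned long pgoff)
{
	int target;

	assert(page_node >= 0 && page_node < 64);

	switch (pol->mode) {
	case TOY_PREFERRED:
		target = pol->preferred_node;
		break;
	case TOY_BIND:
		/* Any node in the mask is acceptable for a bound range. */
		if (pol->nodemask & (UINT64_C(1) << page_node))
			return -1;
		target = toy_nth_node(pol->nodemask, 0);
		break;
	case TOY_INTERLEAVE:
		/* The page offset selects one specific node in the mask. */
		assert(pol->nodemask != 0);
		target = toy_nth_node(pol->nodemask,
				      (int)(pgoff % (unsigned long)toy_mask_weight(pol->nodemask)));
		break;
	default:
		target = local_node;
		break;
	}
	return target == page_node ? -1 : target;
}
```

Because the target node is returned rather than a plain yes/no, a caller could allocate the replacement page on that node directly, which is the point made in the text.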
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
[ Added MPOL_F_LAZY to trigger migrate-on-fault;
simplified code now that we don't have to bother
with special crap for interleaved ]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
This patch augments the MPOL_MF_LAZY feature by adding a "NOOP" policy
to mbind(). When the NOOP policy is used with the 'MOVE and 'LAZY
flags, mbind() will map the pages PROT_NONE so that they will be
migrated on the next touch.
This allows an application to prepare for a new phase of operation
where different regions of shared storage will be assigned to
worker threads, w/o changing policy. Note that we could just use
"default" policy in this case. However, this also allows an
application to request that pages be migrated, only if necessary,
to follow any arbitrary policy that might currently apply to a
range of pages, without knowing the policy, or without specifying
multiple mbind()s for ranges with different policies.
[ Bug in early version of mpol_parse_str() reported by Fengguang Wu. ]
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Make MPOL_LOCAL a real and exposed policy such that applications that
relied on the previous default behaviour can explicitly request it.
Requested-by: Christoph Lameter <cl@linux.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
* for_3.8-rc1: (243 commits)
[media] omap3isp: Replace cpu_is_omap3630() with ISP revision check
[media] omap3isp: Prepare/unprepare clocks before/after enable/disable
[media] omap3isp: preview: Add support for 8-bit formats at the sink pad
[media] omap3isp: Replace printk with dev_*
[media] omap3isp: Find source pad from external entity
[media] omap3isp: Configure CSI-2 phy based on platform data
[media] omap3isp: Add PHY routing configuration
[media] omap3isp: Add CSI configuration registers from control block to ISP resources
[media] omap3isp: Remove unneeded module memory address definitions
[media] omap3isp: Use monotonic timestamps for statistics buffers
[media] uvcvideo: Fix control value clamping for unsigned integer controls
[media] uvcvideo: Mark first output terminal as default video node
[media] uvcvideo: Add VIDIOC_[GS]_PRIORITY support
[media] uvcvideo: Return -ENOTTY for unsupported ioctls
[media] uvcvideo: Set device_caps in VIDIOC_QUERYCAP
[media] uvcvideo: Don't fail when an unsupported format is requested
[media] uvcvideo: Return -EACCES when trying to access a read/write-only control
[media] uvcvideo: Set error_idx properly for extended controls API failures
[media] rtl28xxu: add NOXON DAB/DAB+ USB dongle rev 2
[media] fc2580: write some registers conditionally
...
This adds the following major in-memory structures in f2fs.
- f2fs_sb_info:
contains f2fs-specific information, two special inode pointers for node and
meta address spaces, and orphan inode management.
- f2fs_inode_info:
contains vfs_inode and other fs-specific information.
- f2fs_nm_info:
contains node manager information such as NAT entry cache, free nid list,
and NAT page management.
- f2fs_node_info:
represents a node as node id, inode number, block address, and its version.
- f2fs_sm_info:
contains segment manager information such as SIT entry cache, free segment
map, current active logs, dirty segment management, and segment utilization.
The specific structures are sit_info, free_segmap_info, dirty_seglist_info,
curseg_info.
In addition, add F2FS_SUPER_MAGIC in magic.h.
Signed-off-by: Chul Lee <chur.lee@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Alex writes:
Pretty minor -next pull request. We have some additional new bits waiting
internally for release. Hopefully Monday we can get at least some of
them out. The others will probably take a few more weeks.
Highlights of the current request:
- ELD registers for passing audio information to the sound hardware
- Handle GPUVM page faults more gracefully
- Misc fixes
Merge radeon test
* 'drm-next-3.8' of git://people.freedesktop.org/~agd5f/linux: (483 commits)
drm/radeon: bump driver version for new info ioctl requests
drm/radeon: fix eDP clk and lane setup for scaled modes
drm/radeon: add new INFO ioctl requests
drm/radeon/dce32+: use fractional fb dividers for high clocks
drm/radeon: use cached memory when evicting for vram on non agp
drm/radeon: add a CS flag END_OF_FRAME
drm/radeon: stop page faults from hanging the system (v2)
drm/radeon/dce4/5: add registers for ELD handling
drm/radeon/dce3.2: add registers for ELD handling
radeon: fix pll/ctrc mapping on dce2 and dce3 hardware
Linux 3.7-rc7
powerpc/eeh: Do not invalidate PE properly
Revert "drm/i915: enable rc6 on ilk again"
ALSA: hda - Fix build without CONFIG_PM
of/address: sparc: Declare of_iomap as an extern function for sparc again
PM / QoS: fix wrong error-checking condition
bnx2x: remove redundant warning log
vxlan: fix command usage in its doc
8139cp: revert "set ring address before enabling receiver"
MPI: Fix compilation on MIPS with GCC 4.4 and newer
...
Conflicts:
drivers/gpu/drm/exynos/exynos_drm_encoder.c
drivers/gpu/drm/exynos/exynos_drm_fbdev.c
drivers/gpu/drm/nouveau/core/engine/disp/nv50.c
This patch adds multiqueue (VIRTIO_NET_F_MQ) support to the virtio_net
driver. A VIRTIO_NET_F_MQ capable device allows the driver to transmit and
receive packets through multiple queue pairs and does the packet steering
to get better performance. By default, only one queue pair is used; the next
patch lets the user change the number of queue pairs with ethtool.
When multiple queue pairs are used and the number of queue pairs is equal to
the number of vcpus, the driver does the following optimizations to implement
per-cpu virt queue pairs:
- select the txq based on the smp processor id.
- set the smp affinity hint to the cpu that owns the queue pairs.
This could be used with the flow steering support of the device to guarantee
that the packets of a single flow are handled by the same cpu.
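The txq selection rule can be sketched in a few lines. This is a minimal illustration, not the driver code: toy_select_txq() is a hypothetical name, and wrapping around when there are fewer queue pairs than CPUs is an assumption made here so the function is total.

```c
#include <assert.h>

/*
 * Pick a transmit queue for the CPU doing the send. With one queue pair
 * per vcpu this is the identity mapping, so each CPU owns its own queue.
 */
static unsigned int toy_select_txq(unsigned int cpu, unsigned int nr_queue_pairs)
{
	assert(nr_queue_pairs > 0);
	return cpu % nr_queue_pairs;
}
```

With four queue pairs on a four-vcpu guest, CPU 2 always transmits on queue 2, which is what makes the affinity hint for that queue pair meaningful.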
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add requests to get the number of shader engines (SE) and
the number of SH per SE. These are needed for geometry
and tessellation shaders in the 3D driver as well as setting
up PA_SC_RASTER_CONFIG on SI asics.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
No version bump is required because setting the flag on older DRM has
no effect.
This only reserves the bit and doesn't use it. I assume we will use it
for buffer eviction heuristics.
Signed-off-by: Marek Olšák <maraeo@gmail.com>
V5: fix two bugs pointed out by Thomas
remove seq check for now, mark it as TODO
V4: remove some useless #include
some coding style fix
V3: drop debugging printk's
update selinux perm table as well
V2: drop patch 1/2, export ifindex directly
Redesign netlink attributes
Improve netlink seq check
Handle IPv6 addr as well
This patch exports bridge multicast database via netlink
message type RTM_GETMDB. Similar to fdb, but currently bridge-specific.
We may need to support modifying the multicast database too (RTM_{ADD,DEL}MDB).
(Thanks to Thomas for patient reviews)
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
* pci/bjorn-pcie-cap:
ath9k: Use standard #defines for PCIe Capability ASPM fields
iwlwifi: Use standard #defines for PCIe Capability ASPM fields
iwlwifi: collapse wrapper for pcie_capability_read_word()
iwlegacy: Use standard #defines for PCIe Capability ASPM fields
iwlegacy: collapse wrapper for pcie_capability_read_word()
cxgb3: Use standard #defines for PCIe Capability ASPM fields
PCI: Add standard PCIe Capability Link ASPM field names
PCI/portdrv: Use PCI Express Capability accessors
PCI: Use standard PCIe Capability Link register field names
PCI: Add and use standard PCI-X Capability register names
Add standard #defines for ASPM fields in PCI Express Link Capability and
Link Control registers.
Previously we used PCIE_LINK_STATE_L0S and PCIE_LINK_STATE_L1 directly, but
these are defined for the Linux ASPM interfaces, e.g.,
pci_disable_link_state(), and only coincidentally match the actual register
bits. PCIE_LINK_STATE_CLKPM, also part of that interface, does not match
the register bit.
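The mismatch can be made concrete with a small sketch. The constant values below are reproduced from memory of the PCI register and ASPM interface headers and should be treated as assumptions; toy_state_to_lnkctl() is a hypothetical helper, not an existing kernel function.

```c
#include <assert.h>
#include <stdint.h>

/* Link Control register bits (assumed values, see lead-in). */
#define PCI_EXP_LNKCTL_ASPM_L0S		0x0001	/* L0s Enable */
#define PCI_EXP_LNKCTL_ASPM_L1		0x0002	/* L1 Enable */
#define PCI_EXP_LNKCTL_CLKREQ_EN	0x0100	/* Enable clock PM */

/* Linux ASPM interface flags (assumed values, see lead-in). */
#define PCIE_LINK_STATE_L0S	1
#define PCIE_LINK_STATE_L1	2
#define PCIE_LINK_STATE_CLKPM	4

/* Translate interface flags into register bits instead of reusing them raw. */
static uint16_t toy_state_to_lnkctl(unsigned int state)
{
	uint16_t bits = 0;

	assert((state & ~7u) == 0);
	if (state & PCIE_LINK_STATE_L0S)
		bits |= PCI_EXP_LNKCTL_ASPM_L0S;
	if (state & PCIE_LINK_STATE_L1)
		bits |= PCI_EXP_LNKCTL_ASPM_L1;
	if (state & PCIE_LINK_STATE_CLKPM)
		bits |= PCI_EXP_LNKCTL_CLKREQ_EN;
	return bits;
}
```

For L0s and L1 the translation happens to be the identity, which is why using the interface flags directly appeared to work; for CLKPM the flag (4) and the register bit (0x0100) differ.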
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Since GCC 4.4, there have been __builtin_bswap32() and __builtin_bswap64()
intrinsics. A __builtin_bswap16() came a little later (4.6 for PowerPC,
4.8 for other platforms).
By using these instead of the inline assembler that most architectures
have in their __arch_swabXX() macros, we let the compiler see what's
actually happening. The resulting code should be at least as good, and
much *better* in the cases where it can be combined with a nearby load
or store, using a load-and-byteswap or store-and-byteswap instruction
(e.g. lwbrx/stwbrx on PowerPC, movbe on Atom).
When GCC is sufficiently recent *and* the architecture opts in to using
the intrinsics by setting CONFIG_ARCH_USE_BUILTIN_BSWAP, they will be
used in preference to the __arch_swabXX() macros. An architecture which
does not set ARCH_USE_BUILTIN_BSWAP will continue to use its own
hand-crafted macros.
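A minimal user-space sketch of the selection logic follows. The toy_* names and the TOY_NO_BUILTIN_BSWAP switch are hypothetical and stand in for the kernel's macros and Kconfig symbol; only __builtin_bswap32() and __builtin_bswap64() are real compiler intrinsics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Open-coded fallback, standing in for a hand-crafted swab macro. */
static uint32_t swab32_generic(uint32_t x)
{
	return ((x & 0x000000ffu) << 24) |
	       ((x & 0x0000ff00u) <<  8) |
	       ((x & 0x00ff0000u) >>  8) |
	       ((x & 0xff000000u) >> 24);
}

/* 64-bit fallback built from two 32-bit swaps with the halves exchanged. */
static uint64_t swab64_generic(uint64_t x)
{
	return ((uint64_t)swab32_generic((uint32_t)x) << 32) |
	       swab32_generic((uint32_t)(x >> 32));
}

/* Use the intrinsics when the compiler has them and the user opts in. */
#if defined(__GNUC__) && !defined(TOY_NO_BUILTIN_BSWAP)
#define toy_swab32(x) __builtin_bswap32(x)
#define toy_swab64(x) __builtin_bswap64(x)
#else
#define toy_swab32(x) swab32_generic(x)
#define toy_swab64(x) swab64_generic(x)
#endif

/*
 * Load followed by a swap: with the intrinsic, the compiler is free to
 * fold this into a single load-and-byteswap instruction where one exists.
 */
static uint32_t toy_load_swapped(const uint32_t *p)
{
	assert(p != NULL);
	return toy_swab32(*p);
}
```

Both paths must produce the same result; only the code the compiler can generate around them differs.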
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
A new ioctl, KVM_PPC_GET_HTAB_FD, returns a file descriptor. Reads on
this fd return the contents of the HPT (hashed page table), writes
create and/or remove entries in the HPT. There is a new capability,
KVM_CAP_PPC_HTAB_FD, to indicate the presence of the ioctl. The ioctl
takes an argument structure with the index of the first HPT entry to
read out and a set of flags. The flags indicate whether the user is
intending to read or write the HPT, and whether to return all entries
or only the "bolted" entries (those with the bolted bit, 0x10, set in
the first doubleword).
This is intended for use in implementing qemu's savevm/loadvm and for
live migration. Therefore, on reads, the first pass returns information
about all HPTEs (or all bolted HPTEs). When the first pass reaches the
end of the HPT, it returns from the read. Subsequent reads only return
information about HPTEs that have changed since they were last read.
A read that finds no changed HPTEs in the HPT following where the last
read finished will return 0 bytes.
The format of the data provides a simple run-length compression of the
invalid entries. Each block of data starts with a header that indicates
the index (position in the HPT, which is just an array), the number of
valid entries starting at that index (may be zero), and the number of
invalid entries following those valid entries. The valid entries, 16
bytes each, follow the header. The invalid entries are not explicitly
represented.
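The stream format can be walked with a loop like the one below. This is a sketch under assumptions: struct toy_htab_header, its field names and its field widths are made up for illustration, so the KVM API documentation should be consulted for the real layout. Only the overall shape (header, then n_valid 16-byte entries, with invalid entries implied) comes from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header layout; names and widths are assumptions. */
struct toy_htab_header {
	uint32_t index;		/* position of the first entry in the HPT */
	uint16_t n_valid;	/* valid entries that follow the header */
	uint16_t n_invalid;	/* invalid entries implied after them */
};

#define TOY_HPTE_SIZE 16	/* bytes per valid entry, from the text */

/*
 * Walk a buffer obtained from read() and total the valid and invalid
 * entries it describes. Returns 0 on success, -1 if it is truncated.
 */
static int toy_htab_scan(const uint8_t *buf, size_t len,
			 uint64_t *valid, uint64_t *invalid)
{
	size_t pos = 0;

	assert(buf != NULL && valid != NULL && invalid != NULL);
	*valid = 0;
	*invalid = 0;

	while (pos < len) {
		struct toy_htab_header hdr;
		size_t payload;

		if (len - pos < sizeof(hdr))
			return -1;
		memcpy(&hdr, buf + pos, sizeof(hdr));
		pos += sizeof(hdr);

		/* Only valid entries occupy space in the stream. */
		payload = (size_t)hdr.n_valid * TOY_HPTE_SIZE;
		if (len - pos < payload)
			return -1;
		pos += payload;

		*valid += hdr.n_valid;
		*invalid += hdr.n_invalid;
	}
	return 0;
}
```

A block with n_valid equal to zero is legal and simply records a run of invalid entries, which is where the run-length compression comes from.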
Signed-off-by: Paul Mackerras <paulus@samba.org>
[agraf: fix documentation]
Signed-off-by: Alexander Graf <agraf@suse.de>
V3: make it a flag
V2: make the toggle per-port
Fast leave allows the bridge to immediately stop the multicast
traffic on a port that receives an IGMP Leave when IGMP snooping is
enabled; no timeouts are observed.
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Add and use #defines for PCI-X Capability registers and fields.
Note that the PCI-X Capability has a different layout for
type 0 (endpoint) and type 1 (bridge) devices.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
An mfc entry can be static or not (added via the mroute_sk socket). The patch
reports the MFC_STATIC flag in rtm_protocol by setting rtm_protocol to
RTPROT_STATIC or RTPROT_MROUTED.
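The mapping amounts to a single conditional. In the sketch below, struct toy_mfc_entry and toy_mfc_protocol() are hypothetical, and the numeric values of the three constants are reproduced from memory of the uapi headers, so treat them as assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define MFC_STATIC	1	/* assumed value */
#define RTPROT_STATIC	4	/* assumed value */
#define RTPROT_MROUTED	17	/* assumed value */

/* Hypothetical stand-in for a multicast forwarding cache entry. */
struct toy_mfc_entry {
	unsigned int mfc_flags;
};

/* Value to report in rtm_protocol for this cache entry. */
static unsigned char toy_mfc_protocol(const struct toy_mfc_entry *c)
{
	assert(c != NULL);
	return (c->mfc_flags & MFC_STATIC) ? RTPROT_STATIC : RTPROT_MROUTED;
}
```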
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These statistics can be checked only via /proc/net/ip_mr_cache or
SIOCGETSGCNT[_IN6] and thus only for the table RT_TABLE_DEFAULT.
Advertising them via rtnetlink makes it possible to get statistics for all
cache entries, whatever the table is.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch advertises the MC_FORWARDING status for IPv4 and IPv6.
This field is read-only; only the multicast engine in the kernel updates it.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
* Remove limitation in the maximum number of supported sets in ipset.
Now ipset automagically increments the number of slots in the array
of sets by 64 new spare slots, from Jozsef Kadlecsik.
* Partially remove the generic queue infrastructure now that ip_queue
is gone. Its only client is nfnetlink_queue now, from Florian
Westphal.
* Add missing attribute policy checks in ctnetlink, from Florian
Westphal.
* Automagically kill conntrack entries that use the wrong output
interface for the masquerading case in case of routing changes,
from Jozsef Kadlecsik.
* Two patches to improve ct object traceability. Now ct objects are
always placed in any of the existing lists. This allows us to dump
the content of unconfirmed and dying conntracks via ctnetlink as
a way to provide more instrumentation in case you suspect leaks,
from myself.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a userptr feature for the G2D module.
Here, userptr means a user space address allocated by malloc().
The purpose of this feature is to make G2D's dma able
to access the user space region.
To use this feature, the user should set the G2D_BUF_USERPTR flag in
the offset variable of struct drm_exynos_g2d_cmd, fill
struct drm_exynos_g2d_userptr with the user space address
and its size, and then set a pointer to the
drm_exynos_g2d_userptr object in the data variable of struct
drm_exynos_g2d_cmd. The last bit of the offset variable is used
to check whether the cmdlist's buffer type is userptr or not.
If it is userptr, the g2d driver gets the user space address and size
and then gets the pages through get_user_pages().
(Any other case is treated as a gem handle.)
Below is sample code:
static void set_cmd(struct drm_exynos_g2d_cmd *cmd,
		    unsigned long offset, unsigned long data)
{
	cmd->offset = offset;
	cmd->data = data;
}
static int solid_fill_test(int x, int y, unsigned long userptr)
{
	struct drm_exynos_g2d_cmd cmd_gem[5];
	struct drm_exynos_g2d_userptr g2d_userptr;
	unsigned int gem_nr = 0;
	...
	/* describe the malloc()ed region: 4 bytes per pixel */
	g2d_userptr.userptr = userptr;
	g2d_userptr.size = x * y * 4;
	/* flag the destination buffer as userptr rather than a gem handle */
	set_cmd(&cmd_gem[gem_nr++], DST_BASE_ADDR_REG |
				G2D_BUF_USERPTR,
			(unsigned long)&g2d_userptr);
	...
}
int main(int argc, char **argv)
{
	unsigned long addr;
	...
	addr = (unsigned long)malloc(x * y * 4);
	...
	solid_fill_test(x, y, addr);
	...
}
Next, the pages are mapped through the iommu table and the device
address is set in the cmdlist so that G2D's dma can access them.
As you may know, the pages from get_user_pages() are pinned.
In other words, they cannot be migrated or swapped out,
so the dma access is safe.
However, the userptr feature has a performance overhead, so
this patch also adds a memory pool for it.
Assume that the user sends a cmdlist filled with a userptr
and size to the g2d driver every time; without the pool, the
get_user_pages() function would be called every time.
The memory pool has a maximum size of 64MB, and every userptr that
the user has sent is held in the memory pool.
This means that if the userptr from the user is the same as one
in the memory pool, the device address of the userptr in the memory
pool is set in the cmdlist.
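The pool lookup can be pictured with the following user-space sketch. Everything named toy_* is hypothetical, including the fixed slot array and the fake mapping function; only the idea of reusing a device address for a previously seen userptr and the 64MB cap come from the description.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_POOL_MAX	(64UL << 20)	/* 64MB cap, from the text */
#define TOY_POOL_SLOTS	32		/* hypothetical fixed capacity */

struct toy_userptr {
	unsigned long userptr;
	unsigned long size;
	unsigned long dma_addr;
};

struct toy_pool {
	struct toy_userptr slot[TOY_POOL_SLOTS];
	size_t count;
	unsigned long bytes;		/* total size currently pooled */
	unsigned long pin_calls;	/* how often the slow path ran */
};

/* Stand-in for pinning pages and mapping them; returns a fake address. */
static unsigned long toy_pin_and_map(struct toy_pool *pool, unsigned long userptr)
{
	pool->pin_calls++;
	return userptr + 0x1000;
}

/* Return a device address for the region, reusing a pooled one if possible. */
static unsigned long toy_get_dma_addr(struct toy_pool *pool,
				      unsigned long userptr, unsigned long size)
{
	unsigned long dma;

	assert(pool != NULL && size > 0);

	for (size_t i = 0; i < pool->count; i++) {
		if (pool->slot[i].userptr == userptr &&
		    pool->slot[i].size == size)
			return pool->slot[i].dma_addr;	/* pool hit */
	}

	dma = toy_pin_and_map(pool, userptr);

	/* Remember the mapping only while the pool stays within its cap. */
	if (pool->count < TOY_POOL_SLOTS &&
	    pool->bytes <= TOY_POOL_MAX && size <= TOY_POOL_MAX - pool->bytes) {
		pool->slot[pool->count].userptr = userptr;
		pool->slot[pool->count].size = size;
		pool->slot[pool->count].dma_addr = dma;
		pool->count++;
		pool->bytes += size;
	}
	return dma;
}
```

A second request for the same userptr and size returns the cached address without running the expensive path again, which is the saving the pool is meant to provide.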
Finally, the pages from get_user_pages() will be freed once the
user calls free() and the dma access is completed. Actually,
get_user_pages() takes 2 reference counts if the user process
has never accessed the user region allocated by malloc(). Then, if
the user calls free(), the page reference count becomes 1, and it
becomes 0 with the put_page() call. The reverse holds as well.
This is how the backing pages are used by dma and then freed.
This patch is based on "drm/exynos: add iommu support for g2d",
https://patchwork.kernel.org/patch/1629481/
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Historically tun supported two modes of operation:
- in default mode, a small number of packets would get queued
at the device, the rest would be queued in qdisc
- in one queue mode, all packets would get queued at the device
This might have made sense up to a point where we made the
queue depth for both modes the same and set it to
a huge value (500) so unless the consumer
is stuck the chance of losing packets is small.
Thus in practice both modes behave the same, but the
default mode has some problems:
- if packets are never consumed, fragments are never orphaned
which causes a DoS for senders using zero copy transmit
- overrun errors are hard to diagnose: fifo error is incremented
only once so you can not distinguish between
userspace that is stuck and a transient failure,
tcpdump on the device does not show any traffic
Userspace solves this simply by enabling IFF_ONE_QUEUE
but there seems to be little point in not doing the
right thing for everyone, by default.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a new operation to dump the content of the dying and
unconfirmed lists.
Under some situations, the global conntrack counter can be inconsistent
with the number of entries that we can dump from the conntrack table.
The way to resolve this is to allow dumping the content of the unconfirmed
and dying lists; so far it was not possible to look at their content.
This provides some extra instrumentation to resolve problematic situations
in which anyone suspects memory leaks.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pull perf fixes from Ingo Molnar:
"This is mostly about unbreaking architectures that took the UAPI
changes in the v3.7 cycle, plus misc fixes."
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf kvm: Fix building perf kvm on non x86 arches
perf kvm: Rename perf_kvm to perf_kvm_stat
perf: Make perf build for x86 with UAPI disintegration applied
perf powerpc: Use uapi/unistd.h to fix build error
tools: Pass the target in descend
tools: Honour the O= flag when tool build called from a higher Makefile
tools: Define a Makefile function to do subdir processing
x86: Export asm/{svm.h,vmx.h,perf_regs.h}
perf tools: Fix strbuf_addf() when the buffer needs to grow
perf header: Fix numa topology printing
perf, powerpc: Fix hw breakpoints returning -ENOSPC
* linus/master: (1428 commits)
futex: avoid wake_futex() for a PI futex_q
watchdog: using u64 in get_sample_period()
writeback: put unused inodes to LRU after writeback completion
mm: vmscan: check for fatal signals iff the process was throttled
Revert "mm: remove __GFP_NO_KSWAPD"
proc: check vma->vm_file before dereferencing
UAPI: strip the _UAPI prefix from header guards during header installation
include/linux/bug.h: fix sparse warning related to BUILD_BUG_ON_INVALID
Linux 3.7-rc7
powerpc/eeh: Do not invalidate PE properly
ALSA: hda - Fix build without CONFIG_PM
of/address: sparc: Declare of_iomap as an extern function for sparc again
PM / QoS: fix wrong error-checking condition
bnx2x: remove redundant warning log
vxlan: fix command usage in its doc
8139cp: revert "set ring address before enabling receiver"
MPI: Fix compilation on MIPS with GCC 4.4 and newer
MIPS: Fix crash that occurs when function tracing is enabled
MIPS: Merge overlapping bootmem ranges
jbd: Fix lock ordering bug in journal_unmap_buffer()
...
If a driver supports P2P GO powersave, allow it to
set the new feature flags for it and allow userspace
to configure the parameters for it. This can be done
at GO startup and later changed with SET_BSS.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Add support for reporting and calculating VHT MCSes.
Note that I'm not completely sure that the bitrate
calculations are correct, nor that they can't be
simplified.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Change nl80211 to support specifying a VHT (or HT) channel
using the control channel frequency (as before) and
new attributes for the channel width and first and
second center frequency. The old channel type is of
course still supported for HT.
Also change the cfg80211 channel definition struct
to support these by adding the relevant fields to
it (and removing the _type field.)
This also adds new helper functions:
- cfg80211_chandef_create to create a channel def
struct given the control channel and channel type,
- cfg80211_chandef_identical to check if two channel
definitions are identical
- cfg80211_chandef_compatible to check if the given
channel definitions are compatible, and return the
wider of the two
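As a rough illustration of the simplest helper, an identity check can compare every field of the channel definition. The struct and enum below are hypothetical stand-ins with made-up names; they only mirror the fields named in the text (control channel, width, two center frequencies) and are not the cfg80211 definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical channel widths for this sketch. */
enum toy_chan_width {
	TOY_WIDTH_20_NOHT,
	TOY_WIDTH_20,
	TOY_WIDTH_40,
	TOY_WIDTH_80,
	TOY_WIDTH_80P80,
	TOY_WIDTH_160,
};

/* Hypothetical channel definition: control channel plus width and centers. */
struct toy_chandef {
	uint32_t control_freq;	/* MHz */
	enum toy_chan_width width;
	uint32_t center_freq1;	/* MHz */
	uint32_t center_freq2;	/* MHz, 0 unless 80+80 */
};

/* Two definitions are identical only if every field matches. */
static bool toy_chandef_identical(const struct toy_chandef *a,
				  const struct toy_chandef *b)
{
	assert(a != NULL && b != NULL);
	return a->control_freq == b->control_freq &&
	       a->width == b->width &&
	       a->center_freq1 == b->center_freq1 &&
	       a->center_freq2 == b->center_freq2;
}
```

Two definitions that share a control channel but differ in width are not identical, although they may still be compatible; that wider-of-the-two calculation is deliberately left out of this sketch.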
This isn't entirely complete, but that doesn't matter
until we have a driver using it. In particular, it's
missing
- regulatory checks on the usable bandwidth (if that
even makes sense)
- regulatory TX power (database can't deal with it)
- a proper channel compatibility calculation for the
new channel types
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
As mwifiex (and mac80211 in the software case) are the
only drivers actually implementing remain-on-channel
with channel type, userspace can't be relying on it.
This is the case, as it's used only for P2P operations
right now.
Rather than adding a flag to tell userspace whether or
not it can actually rely on it, simplify all the code
by removing the ability to use different channel types.
Leave only the validation of the attribute, so that if
we extend it again later (with the needed capability
flag), it can't break userspace sending invalid data.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This patch adds an extension to the V4L2 API. A new ioctl, VIDIOC_EXPBUF, is added. The
ioctl is used to export an mmap buffer as a DMABUF file descriptor.
Signed-off-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Tested-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Adds DMABUF memory type to v4l framework. Also adds the related file
descriptor in v4l2_plane and v4l2_buffer.
[original work in the PoC for buffer sharing]
Signed-off-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
Signed-off-by: Sumit Semwal <sumit.semwal@ti.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Tested-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Add register definitions used in several Exar PCI/PCIe UARTs
Signed-off-by: Matt Schulte <matts@commtech-fastcom.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add support for new devices: Exar's XR17V35x family of multi-port PCIe UARTs.
Signed-off-by: Matt Schulte <matts@commtech-fastcom.com>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch adds support for managing 6RD tunnels via netlink.
Note that netdev_state_change() is now called when 6RD parameters are updated.
6RD parameters are updated only if there is at least one 6RD attribute.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch provides extensions to VXLAN for supporting Distributed
Overlay Virtual Ethernet (DOVE) networks. The patch includes:
+ a dove flag per VXLAN device to enable DOVE extensions
+ ARP reduction, whereby a bridge-connected VXLAN tunnel endpoint
answers ARP requests from the local bridge on behalf of
remote DOVE clients
+ route short-circuiting (aka L3 switching). Known destination IP
addresses use the corresponding destination MAC address for
switching rather than going to a (possibly remote) router first.
+ netlink notification messages for forwarding table and L3 switching
misses
Changes since v2
- combined bools into "u32 flags"
- replaced loop with !is_zero_ether_addr()
Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jumps in the vblank and page flip event timestamps cause trouble for
clients, so we should avoid them. The timestamp we currently get with
gettimeofday can jump, so use monotonic timestamps instead.
For backward compatibility, use a module flag to revert to using
gettimeofday timestamps. Also add a DRM_CAP_TIMESTAMP_MONOTONIC flag
that is simply a read-only version of the module flag, so that clients
can query this without depending on sysfs.
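From the client side, the practical consequence is that event timestamps should be compared against CLOCK_MONOTONIC rather than gettimeofday(). The sketch below assumes the client has already confirmed that monotonic timestamps are in use (the capability query itself is not shown), and the toy_* helpers are hypothetical names; clock_gettime() is the standard POSIX call.

```c
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime() */
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Current CLOCK_MONOTONIC time in microseconds. */
static int64_t toy_monotonic_usec(void)
{
	struct timespec ts;
	int ret = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(ret == 0);
	(void)ret;
	return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/*
 * Age of an event stamped as (tv_sec, tv_usec) in the monotonic domain.
 * Mixing this with wall-clock time would reintroduce the jumps.
 */
static int64_t toy_event_age_usec(uint32_t tv_sec, uint32_t tv_usec)
{
	int64_t stamp = (int64_t)tv_sec * 1000000 + tv_usec;

	return toy_monotonic_usec() - stamp;
}
```

Because the monotonic clock never goes backwards, the computed age of a past event is never negative, which is not guaranteed when wall-clock time is adjusted.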
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Daniel writes:
Highlights of this -next round:
- ivb fdi B/C fixes
- hsw sprite/plane offset fixes from Damien
- unified dp/hdmi encoder for hsw, finally external dp support on hsw
(Paulo)
- kill-agp and some other prep work in the gtt code from Ben
- some fb handling fixes from Ville
- massive pile of patches to align hsw VGA with the spec and make it
actually work (Paulo)
- pile of workarounds from Jesse, mostly for vlv, but also some other
related platforms
- start of a dev_priv reorg, that thing grew out of bounds and chaotic
- small bits&pieces all over the place, down to better error handling for
load-detect on gen2 (Chris, Jani, Mika, Zhenyu, ...)
On top of the previous pile (just copypasta):
- tons of hsw dp prep patches from Paulo
- round scheduled work items and timers to nearest second (Chris)
- some hw workarounds (Jesse&Damien)
- vlv dp support and related fixups (Vijay et al.)
- basic haswell dp support, not yet wired up for external ports (Paulo)
- edp support (Paulo)
- tons of refactorings to prepare for the above (Paulo)
- panel rework, unifying code between lvds and edp panels (Jani)
- panel fitter scaling modes (Jani + Yuly Novikov)
- panel power improvements, should now work without the BIOS setting it up
- extracting some dp helpers from radeon/i915 and move them to
drm_dp_helper.c
- random pile of workarounds (Damien, Ben, ...)
- some cleanups for the register restore code for suspend/resume
- secure batchbuffer support, should enable tear-free blits on gen6+
(Chris)
- random smaller fixlets and cleanups.
* 'for-airlied' of git://people.freedesktop.org/~danvet/drm-intel: (231 commits)
drm/i915: Restore physical HWS_PGA after resume
drm/i915: Report amount of usable graphics memory in MiB
drm/i915/i2c: Track users of GMBUS force-bit
drm/i915: Allocate the proper size for contexts.
drm/i915: Update load-detect failure paths for modeset-rework
drm/i915: Clear unused fields of mode for framebuffer creation
drm/i915: Always calculate 8xx WM values based on a 32-bpp framebuffer
drm/i915: Fix sparse warnings in from AGP kill code
drm/i915: Missed lock change with rps lock
drm/i915: Move the remaining gtt code
drm/i915: flush system agent TLBs on SNB
drm/i915: Kill off now unused gen6+ AGP code
drm/i915: Calculate correct stolen size for GEN7+
drm/i915: Stop using AGP layer for GEN6+
drm/i915: drop the double-OP_STOREDW usage in blt_ring_flush
drm/i915: don't rewrite the GTT on resume v4
drm/i915: protect RPS/RC6 related accesses (including PCU) with a new mutex
drm/i915: put ring frequency and turbo setup into a work queue v5
drm/i915: don't block resume on fb console resume v2
drm/i915: extract l3_parity substruct from dev_priv
...
Make perf build for x86 once the UAPI disintegration patches for that arch
have been applied by adding the appropriate -I flags - in the right order -
and then converting some #includes that use ../.. notation to find main kernel
header files to use <asm/foo.h> and <linux/foo.h> instead.
Note that -Iarch/foo/include/uapi is present _before_ -Iarch/foo/include.
This makes sure we get the userspace version of the pt_regs struct. Ideally,
we wouldn't have the latter -I flag at all, but unfortunately we want
asm/svm.h and asm/vmx.h in builtin-kvm.c and these aren't part of the UAPI -
at least not for x86. I wonder if the bits outside of the __KERNEL__ guards
*should* be transferred there.
I note also that perf seems to do its dependency handling manually by listing
all the header files it might want to use in LIB_H in the Makefile. Can this
be changed to use -MD?
Note that to make this work, we need to export and UAPI disintegrate
linux/hw_breakpoint.h, which I think should've been exported previously so that
perf can access the bits. We have to do this in the same patch to maintain
bisectability.
Signed-off-by: David Howells <dhowells@redhat.com>
The NL80211_CMD_TDLS_OPER command was previously used only for userspace
request for the kernel code to perform TDLS operations. However, there
are also cases where the driver may need to request operations from
userspace, e.g., when using security on the AP path. Add a new cfg80211
function for generating a TDLS operation event for drivers to request a
new link to be set up (NL80211_TDLS_SETUP) or an existing link to be
torn down (NL80211_TDLS_TEARDOWN). Drivers can optionally use these
events, e.g., based on noticing data traffic being sent to a peer
station that is seen with good signal strength.
Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This is mostly a revert of 01dc52ebdf ("oom: remove deprecated oom_adj")
from Davidlohr Bueso.
It reintroduces /proc/pid/oom_adj for backwards compatibility with earlier
kernels. It simply scales the value linearly when /proc/pid/oom_score_adj
is written.
The major difference is that its scheduled removal is no longer included
in Documentation/feature-removal-schedule.txt. We do warn users with a
single printk, though, to suggest the more powerful and supported
/proc/pid/oom_score_adj interface.
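The linear scaling can be sketched as below. This is only an illustration,
not the patch itself: the function name is hypothetical, and the constants
(OOM_DISABLE = -17, OOM_ADJUST_MAX = 15, OOM_SCORE_ADJ_MAX = 1000) and the
special case that maps the maximum oom_adj to the maximum oom_score_adj are
assumptions taken from the kernel headers of that era.

```c
/* Assumed values, mirroring include/linux/oom.h of that era. */
#define OOM_DISABLE        (-17)
#define OOM_ADJUST_MAX     15
#define OOM_SCORE_ADJ_MAX  1000

/*
 * Hypothetical helper: convert a legacy oom_adj value (-17..15) to the
 * oom_score_adj scale (-1000..1000) by linear scaling.
 */
int oom_adj_to_score_adj(int oom_adj)
{
	/* Assumed special case so that the maximum maps to the maximum. */
	if (oom_adj == OOM_ADJUST_MAX)
		return OOM_SCORE_ADJ_MAX;

	/* Integer division truncates toward zero, so precision is lost. */
	return (oom_adj * OOM_SCORE_ADJ_MAX) / -OOM_DISABLE;
}
```

Because of the integer division, reading the value back on the oom_adj
scale is not guaranteed to return exactly what was written.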
Reported-by: Artem S. Tashkinov <t.artem@lycos.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Conflicts:
net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
Minor conflict due to some IS_ENABLED conversions done
in net-next.
Signed-off-by: David S. Miller <davem@davemloft.net>
The kernel uses some default metric when routes are managed. For example, a
static route added with a metric set to 0 is inserted in the kernel with
metric 1024 (IP6_RT_PRIO_USER).
It is useful for routing daemons to know these values, to be able to set routes
without interfering with what the kernel does.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some devices provide the actual timestamp (hid_dg_scan_time in win8 ones)
computed by the hardware itself. This value is global to the frame and is
not specific to the multitouch protocol.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
Reviewed-by: Henrik Rydberg <rydberg@euromail.se>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
John W. Linville says:
====================
Included is a Bluetooth pull -- Gustavo says:
"These are the Bluetooth bits for inclusion in 3.8, there is basically one big
thing here which is the High Speed patches from Andrei, he did a lot of work on
A2MP and management of AMP devices. The rest are mostly clean up and bug
fixes."
Also included is an NFC pull -- Samuel says:
"With this one we have:
- pn544 p2p support.
- pn544 physical and HCI layers separation. We are getting the pn544 driver
ready to support non i2c physical layers.
- LLCP SNL (Service Name Lookup). This is the NFC p2p service discovery
protocol.
- LLCP datagram sockets (connection less) support.
- IDR library usage for NFC devices indexes assignment.
- NFC netlink extension for setting and getting LLCP link characteristics.
- Various code style fixes and cleanups spread over the pn533, LLCP, HCI and
pn544 code."
There are a couple of mac80211 pulls as well -- Johannes says:
"Please pull my mac80211-next tree to get the first round of new features
for 3.8. We have:
* finally, the mac80211 multi-channel work
* scan improvements:
- bg scan
- scan flush
- forced AP scan
* cfg80211 tracing
* a bit of new code to allow implementing SAE (secure authentication of
equals) in managed mode
Along with a few random improvements, features and fixes."
and...
"Please pull from mac80211-next (per below pull request) to get a few
updates. Most important is probably the fix for the WDS regression that
my previous pull request introduced. Other than that, I have some
tracing code, two mesh updates and a change to allow drivers to
calculate the AES CMAC subkeys without having to implement the GF_mulx
operation themselves."
On top of that are the usual updates to iwlwifi, ath9k, rt2x00,
brcmfmac, mwifiex, and a few others here and there. Of note is the
addition of the ar5523 driver, ported from an original FreeBSD driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This parameter was missing in the dump.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPv6 tunnels can have three modes: 4in6, 6in6 and xin6.
This information was missing in the netlink message.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is Linux bridge implementation of root port guard.
If BPDU is received from a leaf (edge) port, it should not
be elected as root port.
Why would you want to do this?
If using STP on a bridge and the downstream bridges are not fully
trusted, this prevents a hostile guest from rerouting traffic.
Why not just use netfilter?
Netfilter does not track or follow spanning tree decisions.
It would be difficult and error prone to try and mirror STP
resolution in netfilter module.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is Linux bridge implementation of STP protection
(Cisco BPDU guard/Juniper BPDU block). BPDU block disables
the bridge port if a STP BPDU packet is received.
Why would you want to do this?
If running Spanning Tree on bridge, hostile devices on the network
may send BPDU and cause network failure. Enabling bpdu block
will detect and stop this.
How to recover the port?
The port will be restarted if link is brought down, or
removed and reattached. For example:
# ip li set dev eth0 down; ip li set dev eth0 up
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Expose bridge port parameter over netlink. By switching to a nested
message, this can be used for other bridge parameters.
This changes IFLA_PROTINFO attribute from one byte to a full nested
set of attributes. This is safe for application interface because the
old message used IFLA_PROTINFO and new one uses
IFLA_PROTINFO | NLA_F_NESTED.
The code adapts to old format requests, and therefore stays
compatible with user mode RSTP daemon. Since the type field
for nested and unnested attributes are different, and the old
code in libnetlink doesn't do the mask, it is also safe to use
with old versions of bridge monitor command.
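Why the two formats cannot be confused can be sketched as follows. The
helper names are hypothetical; the numeric values (IFLA_PROTINFO = 12,
NLA_F_NESTED = 1 << 15, NLA_F_NET_BYTEORDER = 1 << 14) are assumptions
mirroring the uapi headers and are redefined locally under different names
so the sketch stands alone.

```c
#include <stdint.h>

/* Local copies of the assumed uapi values. */
#define SKETCH_IFLA_PROTINFO        12u
#define SKETCH_NLA_F_NESTED         (1u << 15)
#define SKETCH_NLA_F_NET_BYTEORDER  (1u << 14)
#define SKETCH_NLA_TYPE_MASK \
	((uint16_t)~(SKETCH_NLA_F_NESTED | SKETCH_NLA_F_NET_BYTEORDER))

/* An old parser that compares the raw type field without masking. */
int old_parser_matches(uint16_t nla_type)
{
	return nla_type == SKETCH_IFLA_PROTINFO;
}

/* A new parser that masks the flag bits and checks for nesting. */
int is_nested_protinfo(uint16_t nla_type)
{
	return (nla_type & SKETCH_NLA_TYPE_MASK) == SKETCH_IFLA_PROTINFO &&
	       (nla_type & SKETCH_NLA_F_NESTED) != 0;
}
```

An unmasked comparison never matches the nested attribute, so an old
reader simply skips it instead of misparsing it as a single byte.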
Note: although mode is only a boolean, treating it as a
full byte since in the future someone will probably want to add more
values (like macvlan has).
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces a new knob ndisc_notify. If enabled, the kernel
will transmit an unsolicited neighbour advertisement on link-layer address
change to update the neighbour tables of the corresponding hosts more quickly.
This is the equivalent of arp_notify in the IPv4 world.
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
wpa_supplicant will do OBSS scan for drivers that implement
auth/assoc API. Drivers that implement nl80211 connect API
(rather than auth/assoc) may need wpa_supplicant to do this
as well.
Add a new feature flag to inform it (wpa_s) that a driver
needs wpa_supplicant to do OBSS scans.
Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Conflicts:
drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
Minor conflict between the BCM_CNIC define removal in net-next
and a bug fix added to net. Based upon a conflict resolution
patch posted by Stephen Rothwell.
Signed-off-by: David S. Miller <davem@davemloft.net>
It is useful for daemons that monitor link events to have the full parameters of
these interfaces when a rtnl message is sent.
It also allows dumping them via rtnetlink.
It is based on what is done for GRE tunnels.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is useful for daemons that monitor link events to have the full parameters of
these interfaces when a rtnl message is sent.
It also allows dumping them via rtnetlink.
It is based on what is done for GRE tunnels.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the wanxl firmware to include missing constants such as PARITY_NONE. It
should be #including the linux/hdlc/ioctl.h header.
To make this work, we also have to guard parts of ioctl.h with !__ASSEMBLY__.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>
Acked-by: Krzysztof Halasa <khc@pm.waw.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
According to the PCIe 3.0 spec, PCI_EXP_LNKCAP2_SLS_2_5GB is
1st bit of PCI_EXP_LNKCAP2 register, not 0th bit. So, the bit
definition of supported link speed vector should be fixed.
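A minimal sketch of decoding the corrected vector, assuming the layout
described above (bit 0 of the register unused, 2.5GT/s in bit 1) and that
5.0GT/s and 8.0GT/s follow in bits 2 and 3. The macro and function names
are hypothetical and are not the kernel's definitions.

```c
#include <stdint.h>

/* Hypothetical local names for the Supported Link Speeds Vector bits. */
#define SKETCH_SLS_2_5GT  0x02u	/* bit 1 of the register */
#define SKETCH_SLS_5_0GT  0x04u	/* bit 2 of the register */
#define SKETCH_SLS_8_0GT  0x08u	/* bit 3 of the register */

/*
 * Return the highest supported link speed in units of 0.1 GT/s,
 * or 0 if none of the known speed bits is set.
 */
unsigned int max_supported_speed_x10(uint32_t lnkcap2)
{
	if (lnkcap2 & SKETCH_SLS_8_0GT)
		return 80;
	if (lnkcap2 & SKETCH_SLS_5_0GT)
		return 50;
	if (lnkcap2 & SKETCH_SLS_2_5GT)
		return 25;
	return 0;
}
```

With the old off-by-one definitions, a register value of 0x02 would have
been read as 5.0GT/s instead of 2.5GT/s.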
[bhelgaas: change "Current" to "Supported"]
Signed-off-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Revert commit 03a7beb55b ("epoll: support for disabling items, and a
self-test app") pending resolution of the issues identified by Michael
Kerrisk, copied below.
We'll revisit this for 3.8.
: I've taken a look at this patch as it currently stands in 3.7-rc1, and
: done a bit of testing. (By the way, the test program
: tools/testing/selftests/epoll/test_epoll.c does not compile...)
:
: There are one or two places where the behavior seems a little strange,
: so I have a question or two at the end of this mail. But other than
: that, I want to check my understanding so that the interface can be
: correctly documented.
:
: Just to go through my understanding, the problem is the following
: scenario in a multithreaded application:
:
: 1. Multiple threads are performing epoll_wait() operations,
: and maintaining a user-space cache that contains information
: corresponding to each file descriptor being monitored by
: epoll_wait().
:
: 2. At some point, a thread wants to delete (EPOLL_CTL_DEL)
: a file descriptor from the epoll interest list, and
: delete the corresponding record from the user-space cache.
:
: 3. The problem with (2) is that some other thread may have
: previously done an epoll_wait() that retrieved information
: about the fd in question, and may be in the middle of using
: information in the cache that relates to that fd. Thus,
: there is a potential race.
:
: 4. The race can't be solved purely in user space, because doing
: so would require applying a mutex across the epoll_wait()
: call, which would of course blow thread concurrency.
:
: Right?
:
: Your solution is the EPOLL_CTL_DISABLE operation. I want to
: confirm my understanding about how to use this flag, since
: the description that has accompanied the patches so far
: has been a bit sparse.
:
: 0. In the scenario you're concerned about, deleting a file
: descriptor means (safely) doing the following:
: (a) Deleting the file descriptor from the epoll interest list
: using EPOLL_CTL_DEL
: (b) Deleting the corresponding record in the user-space cache
:
: 1. It's only meaningful to use this EPOLL_CTL_DISABLE in
: conjunction with EPOLLONESHOT.
:
: 2. Using EPOLL_CTL_DISABLE without using EPOLLONESHOT in
: conjunction is a logical error.
:
: 3. The correct way to code multithreaded applications using
: EPOLL_CTL_DISABLE and EPOLLONESHOT is as follows:
:
: a. All EPOLL_CTL_ADD and EPOLL_CTL_MOD operations should
: specify EPOLLONESHOT.
:
: b. When a thread wants to delete a file descriptor, it
: should do the following:
:
: [1] Call epoll_ctl(EPOLL_CTL_DISABLE)
: [2] If the return status from epoll_ctl(EPOLL_CTL_DISABLE)
: was zero, then the file descriptor can be safely
: deleted by the thread that made this call.
: [3] If the epoll_ctl(EPOLL_CTL_DISABLE) fails with EBUSY,
: then the descriptor is in use. In this case, the calling
: thread should set a flag in the user-space cache to
: indicate that the thread that is using the descriptor
: should perform the deletion operation.
:
: Is all of the above correct?
:
: The implementation depends on checking on whether
: (events & ~EP_PRIVATE_BITS) == 0
: This relies on the fact that EPOLL_CTL_ADD and EPOLL_CTL_MOD always
: set EPOLLHUP and EPOLLERR in the 'events' mask, and EPOLLONESHOT
: causes those flags (as well as all others in ~EP_PRIVATE_BITS) to be
: cleared.
:
: A corollary to the previous paragraph is that using EPOLL_CTL_DISABLE
: is only useful in conjunction with EPOLLONESHOT. However, as things
: stand, one can use EPOLL_CTL_DISABLE on a file descriptor that does
: not have EPOLLONESHOT set in 'events'. This results in the following
: (slightly surprising) behavior:
:
: (a) The first call to epoll_ctl(EPOLL_CTL_DISABLE) returns 0
: (the indicator that the file descriptor can be safely deleted).
: (b) The next call to epoll_ctl(EPOLL_CTL_DISABLE) fails with EBUSY.
:
: This doesn't seem particularly useful, and in fact is probably an
: indication that the user made a logic error: they should only be using
: epoll_ctl(EPOLL_CTL_DISABLE) on a file descriptor for which
: EPOLLONESHOT was set in 'events'. If that is correct, then would it
: not make sense to return an error to user space for this case?
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: "Paton J. Lewis" <palewis@adobe.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The tx data offset of packet mmap tx ring used to be :
(TPACKET2_HDRLEN - sizeof(struct sockaddr_ll))
The problem is that, with SOCK_RAW socket, the payload (14 bytes after
the beginning of the user data) is misaligned.
This patch lets the user specify an offset for its tx data if
desired.
Set sock option PACKET_TX_HAS_OFF to 1, then specify in each frame of
your tx ring tp_net for SOCK_DGRAM, or tp_mac for SOCK_RAW.
Signed-off-by: Paul Chavent <paul.chavent@onera.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
This command triggers a new callback: set_mcast_rate(). It enables
the user to change the rate used to send multicast frames for vif
configured as IBSS or MESH_POINT.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Add Ethertype 0x4305 (not an officially registered id).
This Ethertype is used by every frame generated by B.A.T.M.A.N.-Advanced. Its
definition is currently batman-adv local only and since it is not officially
registered it is better to make its definition kernel-wide so that we avoid
collisions given by future unofficial uses of the same Ethertype.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
userspace can query the original ipv4 destination address of a REDIRECTed
connection via
getsockopt(m_sock, SOL_IP, SO_ORIGINAL_DST, &m_server_addr, &addrsize)
but for ipv6 no such option existed.
This adds getsockopt(..., IPPROTO_IPV6, IP6T_SO_ORIGINAL_DST, ...).
Without this, userspace needs to parse /proc or use ctnetlink, which
appears to be overkill.
This uses option number 80 for IP6T_SO_ORIGINAL_DST, which is spare,
to use the same number we use in the IPv4 socket option SO_ORIGINAL_DST.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds an ioctl for PTP Hardware Clock (PHC) devices that allows
user space to measure the time offset between the PHC and the system
clock. Rather than hard coding any kind of estimation algorithm into the
kernel, this patch takes the more flexible approach of just delivering
an array of raw clock readings. In that way, the user space clock servo
may be adapted to new and different hardware clocks.
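One way user space might turn such an array into an offset estimate is
sketched below. It is not part of this patch. It assumes the readings are
interleaved as system, PHC, system, PHC, ..., system (2 * n + 1 values,
here already converted to nanoseconds), and it uses a simple
"shortest interval wins" rule; both the layout and the helper name are
assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical user-space helper. readings[] holds 2 * n_samples + 1
 * timestamps in nanoseconds: sys, phc, sys, phc, ..., sys.
 * Returns the estimated (system - PHC) offset taken from the sample
 * whose two system readings are closest together.
 */
int64_t estimate_offset_ns(const int64_t *readings, size_t n_samples)
{
	int64_t best_interval = INT64_MAX;
	int64_t best_offset = 0;

	for (size_t i = 0; i < n_samples; i++) {
		int64_t t1 = readings[2 * i];		/* system, before */
		int64_t tp = readings[2 * i + 1];	/* PHC */
		int64_t t2 = readings[2 * i + 2];	/* system, after */
		int64_t interval = t2 - t1;

		if (interval < best_interval) {
			best_interval = interval;
			/* Midpoint of the system readings minus the PHC. */
			best_offset = t1 + interval / 2 - tp;
		}
	}
	return best_offset;
}
```

A different servo could instead average all samples or weight them by
interval, which is the flexibility the raw array is meant to preserve.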
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SO_ATTACH_FILTER option is set-only. I propose adding the ability
to get the filter back by using SO_ATTACH_FILTER in getsockopt. To be
easier on the eyes, a SO_GET_FILTER alias for it is declared. This
ability is required by the checkpoint-restore project to be able to
save the full state of a socket.
There are two issues with getting filter back.
First, kernel modifies the sock_filter->code on filter load, thus in
order to return the filter element back to user we have to decode it
into user-visible constants. Fortunately the modification in question
is interconvertible.
Second, the BPF_S_ALU_DIV_K code modifies the command argument k to
speed up the run-time division by doing kernel_k = reciprocal(user_k).
Bad news is that different user_k may result in same kernel_k, so we
can't get the original user_k back. Good news is that we don't have
to do it. What we need to do is calculate a user2_k such that
reciprocal(user2_k) == reciprocal(user_k) == kernel_k
i.e. if it is loaded again, the recompiled value will be exactly
the same as it was. That said, the user2_k can be calculated like this
user2_k = reciprocal(kernel_k)
with an exception, that if kernel_k == 0, then user2_k == 1.
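The round trip can be sketched as below, assuming reciprocal() computes
ceil(2^32 / k) truncated to 32 bits, as the kernel's reciprocal_value()
helper did at the time. The function names here are local to the sketch,
and k == 0 is assumed to have been rejected when the filter was loaded.

```c
#include <stdint.h>

/* Assumed behaviour of reciprocal(): ceil(2^32 / k), truncated to u32. */
uint32_t sketch_reciprocal(uint32_t k)
{
	uint64_t val = (1ULL << 32) + (k - 1);

	return (uint32_t)(val / k);
}

/*
 * Recover a user2_k that compiles to the same kernel_k again.
 * kernel_k == 0 only arises from user_k == 1, where 2^32 is
 * truncated to 0, hence the special case.
 */
uint32_t sketch_recover_user_k(uint32_t kernel_k)
{
	if (kernel_k == 0)
		return 1;
	return sketch_reciprocal(kernel_k);
}
```

For example, user_k values 100000 and 100001 both compile to the same
kernel_k under this assumption, and the recovered user2_k is 100000,
which compiles to that same kernel_k once more.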
The optlen argument is treated like this -- when zero, kernel returns
the amount of sock_fprog elements in filter, otherwise it should be
large enough for the sock_fprog array.
changes since v1:
* Declared SO_GET_FILTER in all arch headers
* Added decode of vlan-tag codes
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sometimes userspace may need to activate/deactivate a queue; this can be done by
detaching and attaching a file from the tuntap device.
This patch introduces a new ioctl, TUNSETQUEUE, which can be used to do
this. The flag IFF_ATTACH_QUEUE was introduced to do the attaching, while
IFF_DETACH_QUEUE was introduced to do the detaching.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add flags to be used when creating a multiqueue tuntap device.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
BPF filters lack the ability to access skb->vlan_tci.
This patch adds two new ancillary accessors :
SKF_AD_VLAN_TAG (44) mapped to vlan_tx_tag_get(skb)
SKF_AD_VLAN_TAG_PRESENT (48) mapped to vlan_tx_tag_present(skb)
This allows libpcap/tcpdump to use a kernel filter instead of
having to fallback to accept all packets, then filter them in
user space.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Suggested-by: Ani Sinha <ani@aristanetworks.com>
Suggested-by: Daniel Borkmann <danborkmann@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hardware switches may support enabling and disabling the
loopback switch which puts the device in a VEPA mode defined
in the IEEE 802.1Qbg specification. In this mode frames are
not switched in the hardware but sent directly to the switch.
SR-IOV capable NICs will likely support this mode; I am
aware of at least two such devices. Also I am told (but don't
have any of this hardware available) that there are devices
that only support VEPA modes. In these cases it is important
at a minimum to be able to query these attributes.
This patch adds an additional IFLA_BRIDGE_MODE attribute that can be
set and dumped via the PF_BRIDGE:{SET|GET}LINK operations. Also
anticipating bridge attributes that may be common for both embedded
bridges and software bridges this adds a flags attribute
IFLA_BRIDGE_FLAGS currently used to determine if the command or event
is being generated to/from an embedded bridge or software bridge.
Finally, the event generation is pulled out of the bridge module and
into rtnetlink proper.
For example using the macvlan driver in VEPA mode on top of
an embedded switch requires putting the embedded switch into
a VEPA mode to get the expected results.
-------- --------
| VEPA | | VEPA | <-- macvlan vepa edge relays
-------- --------
| |
| |
------------------
| VEPA | <-- embedded switch in NIC
------------------
|
|
-------------------
| external switch | <-- shiny new physical
------------------- switch with VEPA support
A packet sent from the macvlan VEPA at the top could be
looped back on the embedded switch and never seen by the
external switch. So in order for this to work the embedded
switch needs to be set in the VEPA state via the above
described commands.
By making these attributes nested in IFLA_AF_SPEC we allow
future extensions to be made as needed.
CC: Lennert Buytenhek <buytenh@wantstofly.org>
CC: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'md-3.7-fixes' of git://neil.brown.name/md
Pull md fixes from NeilBrown:
"Some fixes for md in 3.7
- one recently introduced crash for dm-raid10 with discard
- one bug in new functionality that has been around for a few
releases.
- minor bug in md's 'faulty' personality
and UAPI disintegration for md."
* tag 'md-3.7-fixes' of git://neil.brown.name/md:
MD RAID10: Fix oops when creating RAID10 arrays via dm-raid.c
md/raid1: Fix assembling of arrays containing Replacements.
md faulty: use disk_stack_limits()
UAPI: (Scripted) Disintegrate include/linux/raid
Driver for non-standard on-chip UART, instantiated in the ARC (Synopsys)
FPGA Boards such as ARCAngel4/ML50x
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Reviewed-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Using pstore's superblock magic number is no doubt going to cause
problems in the future. Give efivarfs its own magic number.
Acked-by: Jeremy Kerr <jeremy.kerr@canonical.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
The TX power setting is currently per wiphy (hardware
device) but with multi-channel capabilities that doesn't
make much sense any more.
Allow drivers (and mac80211) to advertise support for
per-interface TX power configuration. When the TX power
is configured for the wiphy, the wdev will be NULL and
the driver can still handle that, but when a wdev is
given the TX power can be set only for that wdev now.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Merge reason: development work has dependency on kvm patches merged
upstream.
Conflicts:
arch/powerpc/include/asm/Kbuild
arch/powerpc/include/asm/kvm_para.h
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
NFC_CMD_LLC_GET_PARAMS: request LTO, RW, and MIUX parameters for a device
NFC_CMD_LLC_SET_PARAMS: set one or more of LTO, RW, and MIUX parameters for
a device. LTO must be set before the link is up, otherwise -EINPROGRESS is
returned. RW and MIUX can be set at any time and will be passed in subsequent
CONNECT and CC messages. If one of the passed parameters is wrong none is
set and -EINVAL is returned.
Signed-off-by: Thierry Escande <thierry.escande@linux.intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
A new type is added to allow userland to monitor protocol configuration, like
IPv4 or IPv6.
For example, it can be used to monitor the forwarding status of an
interface of the system.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Judging from what drivers do and from my experience, the timeperframe
fraction is set in seconds - look e.g. here
static int bttv_g_parm(struct file *file, void *f,
struct v4l2_streamparm *parm)
{
struct bttv_fh *fh = f;
struct bttv *btv = fh->btv;
v4l2_video_std_frame_period(bttv_tvnorms[btv->tvnorm].v4l2_id,
&parm->parm.capture.timeperframe);
...
void v4l2_video_std_frame_period(int id, struct v4l2_fract *frameperiod)
{
if (id & V4L2_STD_525_60) {
frameperiod->numerator = 1001;
frameperiod->denominator = 30000;
} else {
frameperiod->numerator = 1;
frameperiod->denominator = 25;
}
and also v4l2-ctl in userspace decodes this as seconds:
if (doioctl(fd, VIDIOC_G_PARM, &parm, "VIDIOC_G_PARM") == 0) {
const struct v4l2_fract &tf = parm.parm.capture.timeperframe;
...
printf("\tFrames per second: %.3f (%d/%d)\n",
(1.0 * tf.denominator) / tf.numerator,
tf.denominator, tf.numerator);
The typo was there from day 1 - added in 2002 in e028b61b ([PATCH]
add v4l2 api)(*)
(*) found in history tree
git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Commit 4eeaaeaea (ALSA: core: add hooks for audio timestamps) added the
new audio_tstamp field to struct snd_pcm_status. However, struct
timespec requires 64-bit alignment, so the 64-bit compiler would insert
32 bits of padding before this field, which broke SNDRV_PCM_IOCTL_STATUS
with error messages like this:
kernel: unknown ioctl = 0x80984120
To solve this, insert the padding explicitly so that it can be taken
into account when calculating the ABI structure size.
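The effect on the ioctl number can be sketched as follows. The decoding
helpers assume the common Linux ioctl layout (8-bit number, 8-bit type,
14-bit size, 2-bit direction) used on x86 and ARM; some architectures use
a different split. The structures are simplified stand-ins with
hypothetical names, not the real struct snd_pcm_status.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode fields of an ioctl number, assuming the x86/ARM layout. */
unsigned int sketch_ioctl_nr(uint32_t cmd)   { return cmd & 0xffu; }
unsigned int sketch_ioctl_type(uint32_t cmd) { return (cmd >> 8) & 0xffu; }
unsigned int sketch_ioctl_size(uint32_t cmd) { return (cmd >> 16) & 0x3fffu; }

/* Stand-in for a timestamp made of two 64-bit fields. */
struct sketch_ts {
	int64_t sec;
	int64_t nsec;
};

/*
 * With the padding spelled out, the timestamp sits at offset 8 and the
 * structure is 24 bytes regardless of whether the ABI aligns 64-bit
 * fields to 4 or 8 bytes. Without the 'reserved' field, the offset
 * and size would depend on the compiler's implicit padding.
 */
struct sketch_status_padded {
	uint32_t counter;
	uint32_t reserved;	/* explicit padding */
	struct sketch_ts tstamp;
};
```

Under that layout, the rejected value 0x80984120 decodes to type 'A',
number 0x20 and a structure size of 152 bytes. The size is part of the
ioctl number, so a size that differs from what the other side was built
with produces a number it does not recognise.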
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Pull networking fixes from David Miller:
"This is what we usually expect at this stage of the game, lots of
little things, mostly in drivers. With the occasional 'oops didn't
mean to do that' kind of regressions in the core code."
1) Uninitialized data in __ip_vs_get_timeouts(), from Arnd Bergmann
2) Reject invalid ACK sequences in Fast Open sockets, from Jerry Chu.
3) Lost error code on return from _rtl_usb_receive(), from Christian
Lamparter.
4) Fix reset resume on USB rt2x00, from Stanislaw Gruszka.
5) Release resources on error in pch_gbe driver, from Veaceslav Falico.
6) Default hop limit not set correctly in ip6_template_metrics[], fix
from Li RongQing.
7) Gianfar PTP code requests wrong kind of resource during probe, fix
from Wei Yang.
8) Fix VHOST net driver on big-endian, from Michael S Tsirkin.
9) Mellanox driver bug fixes from Jack Morgenstein, Or Gerlitz, Moni
Shoua, Dotan Barak, and Uri Habusha.
10) usbnet leaks memory on TX path, fix from Hemant Kumar.
11) Use socket state test, rather than presence of FIN bit packet, to
determine FIONREAD/SIOCINQ value. Fix from Eric Dumazet.
12) Fix cxgb4 build failure, from Vipul Pandya.
13) Provide a SYN_DATA_ACKED state to complement SYN_FASTOPEN in socket
info dumps. From Yuchung Cheng.
14) Fix leak of security path in kfree_skb_partial(). Fix from Eric
Dumazet.
15) Handle RX FIFO overflows more resiliently in pch_gbe driver, from
Veaceslav Falico.
16) Fix MAINTAINERS file pattern for networking drivers, from Jean
Delvare.
17) Add iPhone5 IDs to IPHETH driver, from Jay Purohit.
18) VLAN device type change restriction is too strict, and should not
trigger for the automatically generated vlan0 device. Fix from Jiri
Pirko.
19) Make PMTU/redirect flushing work properly again in ipv4, from
Steffen Klassert.
20) Fix memory corruptions by using kfree_rcu() in netlink_release().
From Eric Dumazet.
21) More qmi_wwan device IDs, from Bjørn Mork.
22) Fix unintentional change of SNAT/DNAT hooks in generic NAT
infrastructure, from Elison Niven.
23) Fix 3.6.x regression in xt_TEE netfilter module, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (57 commits)
tilegx: fix some issues in the SW TSO support
qmi_wwan/cdc_ether: move Novatel 551 and E362 to qmi_wwan
net: usb: Fix memory leak on Tx data path
net/mlx4_core: Unmap UAR also in the case of error flow
net/mlx4_en: Don't use vlan tag value as an indication for vlan presence
net/mlx4_en: Fix double-release-range in tx-rings
bas_gigaset: fix pre_reset handling
vhost: fix mergeable bufs on BE hosts
gianfar_ptp: use iomem, not ioports resource tree in probe
ipv6: Set default hoplimit as zero.
NET_VENDOR_TI: make available for am33xx as well
pch_gbe: fix error handling in pch_gbe_up()
b43: Fix oops on unload when firmware not found
mwifiex: clean up scan state on error
mwifiex: return -EBUSY if specific scan request cannot be honored
brcmfmac: fix potential NULL dereference
Revert "ath9k_hw: Updated AR9003 tx gain table for 5GHz"
ath9k_htc: Add PID/VID for a Ubiquiti WiFiStation
rt2x00: usb: fix reset resume
rtlwifi: pass rx setup error code to caller
...
This patch defines new ioctl codes TIOCGPKT, TIOCGPTLCK,
TIOCGEXCL for fetching pty's packet mode and locking state,
and exclusive mode of tty.
[ No real handlers for the codes though, this will be
addressed in another patch for easier review and
bisectability ]
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Alan Cox <alan@lxorguk.ukuu.org.uk>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Pavel Emelyanov <xemul@parallels.com>
CC: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Make it simple -- just put new nlattr with just sk->sk_shutdown bits.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ALSA did not provide any direct means to infer the audio time for A/V
sync and system/audio time correlations (e.g. PulseAudio).
Applications had to track the number of samples read/written and
add/subtract the number of samples queued in the ring buffer. This
accounting led to small errors, typically several samples, due to the
two-step process. Computing the audio time in the kernel is more
direct, as all the information is available in the same routines.
Also add new .audio_wallclock routine to enable fine-grain synchronization
between monotonic system time and audio hardware time.
Using the wallclock, if supported in hardware, allows for a
much better sub-microsecond precision and a common drift tracking for
all devices sharing the same wall clock (master clock).
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>