Commit graph

3535 commits

Author SHA1 Message Date
Jingbo Xu
d60b87600d erofs: update print symbols for various flags in trace
As new flags introduced, the corresponding print symbols for trace are
not added accordingly.  Add these missing print symbols for these flags.

Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20230209024825.17335-1-jefflexu@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2023-02-15 08:11:26 +08:00
Gao Xiang
b780d3fc61 erofs: simplify iloc()
Actually we could pass in inodes directly to clean up all callers.
Also rename iloc() as erofs_iloc().

Link: https://lore.kernel.org/r/20230114150823.432069-1-xiang@kernel.org
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2023-02-15 08:11:24 +08:00
Boris Burkov
52bb7a2166 btrfs: introduce size class to block group allocator
The aim of this patch is to reduce the fragmentation of block groups
under certain unhappy workloads. It is particularly effective when the
size of extents correlates with their lifetime, which is something we
have observed causing fragmentation in the fleet at Meta.

This patch categorizes extents into size classes:

- x < 128KiB: "small"
- 128KiB < x < 8MiB: "medium"
- x > 8MiB: "large"

and as much as possible reduces allocations of extents into block groups
that don't match the size class. This takes advantage of any (possible)
correlation between size and lifetime and also leaves behind predictable
re-usable gaps when extents are freed; small writes don't gum up bigger
holes.

Size classes are implemented in the following way:

- Mark each new block group with a size class of the first allocation
  that goes into it.

- Add two new passes to ffe: "unset size class" and "wrong size class".
  First, try only matching block groups, then try unset ones, then allow
  allocation of new ones, and finally allow mismatched block groups.

- Filtering is done just by skipping inappropriate ones, there is no
  special size class indexing.

Other solutions I considered were:

- A best fit allocator with an rb-tree. This worked well, as small
  writes didn't leak big holes from large freed extents, but led to
  regressions in ffe and write performance due to lock contention on
  the rb-tree with every allocation possibly updating it in parallel.
  Perhaps something clever could be done to do the updates in the
  background while being "right enough".

- A fixed size "working set". This prevents freeing an extent
  drastically changing where writes currently land, and seems like a
  good option too. Doesn't take advantage of size in any way.

- The same size class idea, but implemented with xarray marks. This
  turned out to be slower than looping the linked list and skipping
  wrong block groups, and is also less flexible since we must have only
  3 size classes (max #marks). With the current approach we can have as
  many as we like.

Performance testing was done via: https://github.com/josefbacik/fsperf
Of particular relevance are the new fragmentation specific tests.

A brief summary of the testing results:

- Neutral results on existing tests. There are some minor regressions
  and improvements here and there, but nothing that truly stands out as
  notable.
- Improvement on new tests where size class and extent lifetime are
  correlated. Fragmentation in these cases is completely eliminated
  and write performance is generally a little better. There is also
  significant improvement where extent sizes are just a bit larger than
  the size class boundaries.
- Regression on one new tests: where the allocations are sized
  intentionally a hair under the borders of the size classes. Results
  are neutral on the test that intentionally attacks this new scheme by
  mixing extent size and lifetime.

The full dump of the performance results can be found here:
https://bur.io/fsperf/size-class-2022-11-15.txt
(there are ANSI escape codes, so best to curl and view in terminal)

Here is a snippet from the full results for a new test which mixes
buffered writes appending to a long lived set of files and large short
lived fallocates:

bufferedappendvsfallocate results
         metric             baseline       current        stdev            diff
======================================================================================
avg_commit_ms                    31.13         29.20          2.67     -6.22%
bg_count                            14         15.60             0     11.43%
commits                          11.10         12.20          0.32      9.91%
elapsed                          27.30         26.40          2.98     -3.30%
end_state_mount_ns         11122551.90   10635118.90     851143.04     -4.38%
end_state_umount_ns           1.36e+09      1.35e+09   12248056.65     -1.07%
find_free_extent_calls       116244.30     114354.30        964.56     -1.63%
find_free_extent_ns_max      599507.20    1047168.20     103337.08     74.67%
find_free_extent_ns_mean       3607.19       3672.11        101.20      1.80%
find_free_extent_ns_min            500           512          6.67      2.40%
find_free_extent_ns_p50           2848          2876         37.65      0.98%
find_free_extent_ns_p95           4916          5000         75.45      1.71%
find_free_extent_ns_p99       20734.49      20920.48       1670.93      0.90%
frag_pct_max                     61.67             0          8.05   -100.00%
frag_pct_mean                    43.59             0          6.10   -100.00%
frag_pct_min                     25.91             0         16.60   -100.00%
frag_pct_p50                     42.53             0          7.25   -100.00%
frag_pct_p95                     61.67             0          8.05   -100.00%
frag_pct_p99                     61.67             0          8.05   -100.00%
fragmented_bg_count               6.10             0          1.45   -100.00%
max_commit_ms                    49.80            46          5.37     -7.63%
sys_cpu                           2.59          2.62          0.29      1.39%
write_bw_bytes                1.62e+08      1.68e+08   17975843.50      3.23%
write_clat_ns_mean            57426.39      54475.95       2292.72     -5.14%
write_clat_ns_p50             46950.40      42905.60       2101.35     -8.62%
write_clat_ns_p99            148070.40     143769.60       2115.17     -2.90%
write_io_kbytes                4194304       4194304             0      0.00%
write_iops                     2476.15       2556.10        274.29      3.23%
write_lat_ns_max            2101667.60    2251129.50     370556.59      7.11%
write_lat_ns_mean             59374.91      55682.00       2523.09     -6.22%
write_lat_ns_min              17353.10         16250       1646.08     -6.36%

There are some mixed improvements/regressions in most metrics along with
an elimination of fragmentation in this workload.

On the balance, the drastic 1->0 improvement in the happy cases seems
worth the mix of regressions and improvements we do observe.

Some considerations for future work:

- Experimenting with more size classes
- More hinting/search ordering work to approximate a best-fit allocator

Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-13 17:50:34 +01:00
Boris Burkov
854c2f365d btrfs: add more find_free_extent tracepoints
find_free_extent is a complicated function. It consists (at least) of:

- a hint that jumps into the middle of a for loop macro
- a middle loop trying every raid level
- an outer loop ascending through ffe loop levels
- complicated logic for skipping some of those ffe loop levels
- multiple underlying in-bg allocators (zoned, cluster, no cluster)

Which is all to say that more tracing is helpful for debugging its
behavior. Add two new tracepoints: at the entrance to the block_groups
loop (hit for every raid level and every ffe_ctl loop) and at the point
we seriously consider a block_group for allocation. This way we can see
the whole path through the algorithm, including hints, multiple loops,
etc.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-13 17:50:34 +01:00
Boris Burkov
cfc2de0fce btrfs: pass find_free_extent_ctl to allocator tracepoints
The allocator tracepoints currently have a pile of values from ffe_ctl.
In modifying the allocator and adding more tracepoints, I found myself
adding to the already long argument list of the tracepoints. It makes it
a lot simpler to just send in the ffe_ctl itself.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-02-13 17:50:34 +01:00
Yafang Shao
b6c7abd1c2 tracing: Fix TASK_COMM_LEN in trace event format file
After commit 3087c61ed2 ("tools/testing/selftests/bpf: replace open-coded 16 with TASK_COMM_LEN"),
the content of the format file under
/sys/kernel/tracing/events/task/task_newtask was changed from
  field:char comm[16];    offset:12;    size:16;    signed:0;
to
  field:char comm[TASK_COMM_LEN];    offset:12;    size:16;    signed:0;

John reported that this change breaks older versions of perfetto.
Then Mathieu pointed out that this behavioral change was caused by the
use of __stringify(_len), which happens to work on macros, but not on enum
labels. And he also gave the suggestion on how to fix it:
  :One possible solution to make this more robust would be to extend
  :struct trace_event_fields with one more field that indicates the length
  :of an array as an actual integer, without storing it in its stringified
  :form in the type, and do the formatting in f_show where it belongs.

The result as follows after this change,
$ cat /sys/kernel/tracing/events/task/task_newtask/format
        field:char comm[16];    offset:12;      size:16;        signed:0;

Link: https://lore.kernel.org/lkml/Y+QaZtz55LIirsUO@google.com/
Link: https://lore.kernel.org/linux-trace-kernel/20230210155921.4610-1-laoar.shao@gmail.com/
Link: https://lore.kernel.org/linux-trace-kernel/20230212151303.12353-1-laoar.shao@gmail.com

Cc: stable@vger.kernel.org
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Kajetan Puchalski <kajetan.puchalski@arm.com>
CC: Qais Yousef <qyousef@layalina.io>
Fixes: 3087c61ed2 ("tools/testing/selftests/bpf: replace open-coded 16 with TASK_COMM_LEN")
Reported-by: John Stultz <jstultz@google.com>
Debugged-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-02-12 10:23:39 -05:00
David Howells
f789bff2de rxrpc: Trace ack.rwind
Log ack.rwind in the rxrpc_tx_ack tracepoint.  This value is useful to see
as it represents flow-control information to the peer.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2023-02-07 23:11:21 +00:00
Dan Williams
5485eb9559 Merge branch 'for-6.3/cxl' into cxl/next
Merge the general CXL updates with fixes targeting v6.2-rc for v6.3.
Resolve a conflict with the fix and move of cxl_report_and_clear() from
pci.c to core/pci.c.
2023-02-07 11:12:24 -08:00
Yangtao Li
d9bac032ac f2fs: use iostat_lat_type directly as a parameter in the iostat_update_and_unbind_ctx()
Convert to use iostat_lat_type as parameter instead of raw number.
BTW, move NUM_PREALLOC_IOSTAT_CTXS to the header file, adjust
iostat_lat[{0,1,2}] to iostat_lat[{READ_IO,WRITE_SYNC_IO,WRITE_ASYNC_IO}]
in tracepoint function, and rename iotype to page_type to match the definition.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-02-07 10:39:28 -08:00
Linyu Yuan
a9c4bdd505 tracing: Acquire buffer from temparary trace sequence
there is one dwc3 trace event declare as below,
DECLARE_EVENT_CLASS(dwc3_log_event,
	TP_PROTO(u32 event, struct dwc3 *dwc),
	TP_ARGS(event, dwc),
	TP_STRUCT__entry(
		__field(u32, event)
		__field(u32, ep0state)
		__dynamic_array(char, str, DWC3_MSG_MAX)
	),
	TP_fast_assign(
		__entry->event = event;
		__entry->ep0state = dwc->ep0state;
	),
	TP_printk("event (%08x): %s", __entry->event,
			dwc3_decode_event(__get_str(str), DWC3_MSG_MAX,
				__entry->event, __entry->ep0state))
);
the problem is when trace function called, it will allocate up to
DWC3_MSG_MAX bytes from trace event buffer, but never fill the buffer
during fast assignment, it only fill the buffer when output function are
called, so this means if output function are not called, the buffer will
never used.

add __get_buf(len) which acquiree buffer from iter->tmp_seq when trace
output function called, it allow user write string to acquired buffer.

the mentioned dwc3 trace event will changed as below,
DECLARE_EVENT_CLASS(dwc3_log_event,
	TP_PROTO(u32 event, struct dwc3 *dwc),
	TP_ARGS(event, dwc),
	TP_STRUCT__entry(
		__field(u32, event)
		__field(u32, ep0state)
	),
	TP_fast_assign(
		__entry->event = event;
		__entry->ep0state = dwc->ep0state;
	),
	TP_printk("event (%08x): %s", __entry->event,
		dwc3_decode_event(__get_buf(DWC3_MSG_MAX), DWC3_MSG_MAX,
				__entry->event, __entry->ep0state))
);.

Link: https://lore.kernel.org/linux-trace-kernel/1675065249-23368-1-git-send-email-quic_linyyuan@quicinc.com

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Linyu Yuan <quic_linyyuan@quicinc.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-02-07 12:42:54 -05:00
Petr Machata
d47230a348 net: bridge: Add a tracepoint for MDB overflows
The following patch will add two more maximum MDB allowances to the global
one, mcast_hash_max, that exists today. In all these cases, attempts to add
MDB entries above the configured maximums through netlink, fail noisily and
obviously. Such visibility is missing when adding entries through the
control plane traffic, by IGMP or MLD packets.

To improve visibility in those cases, add a trace point that reports the
violation, including the relevant netdevice (be it a slave or the bridge
itself), and the MDB entry parameters:

	# perf record -e bridge:br_mdb_full &
	# [...]
	# perf script | cut -d: -f4-
	 dev v2 af 2 src ::ffff:0.0.0.0 grp ::ffff:239.1.1.112/00:00:00:00:00:00 vid 0
	 dev v2 af 10 src :: grp ff0e::112/00:00:00:00:00:00 vid 0
	 dev v2 af 2 src ::ffff:0.0.0.0 grp ::ffff:239.1.1.112/00:00:00:00:00:00 vid 10
	 dev v2 af 10 src 2001:db8:1::1 grp ff0e::1/00:00:00:00:00:00 vid 10
	 dev v2 af 2 src ::ffff:192.0.2.1 grp ::ffff:239.1.1.1/00:00:00:00:00:00 vid 10

CC: Steven Rostedt <rostedt@goodmis.org>
CC: linux-trace-kernel@vger.kernel.org
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-06 08:48:25 +00:00
NeilBrown
2973d8229b mm: discard __GFP_ATOMIC
__GFP_ATOMIC serves little purpose.  Its main effect is to set
ALLOC_HARDER which adds a few little boosts to increase the chance of an
allocation succeeding, one of which is to lower the water-mark at which it
will succeed.

It is *always* paired with __GFP_HIGH which sets ALLOC_HIGH which also
adjusts this watermark.  It is probable that other users of __GFP_HIGH
should benefit from the other little bonuses that __GFP_ATOMIC gets.

__GFP_ATOMIC also gives a warning if used with __GFP_DIRECT_RECLAIM.
There is little point to this.  We already get a might_sleep() warning if
__GFP_DIRECT_RECLAIM is set.

__GFP_ATOMIC allows the "watermark_boost" to be side-stepped.  It is
probable that testing ALLOC_HARDER is a better fit here.

__GFP_ATOMIC is used by tegra-smmu.c to check if the allocation might
sleep.  This should test __GFP_DIRECT_RECLAIM instead.

This patch:
 - removes __GFP_ATOMIC
 - allows __GFP_HIGH allocations to ignore watermark boosting as well
   as GFP_ATOMIC requests.
 - makes other adjustments as suggested by the above.

The net result is not change to GFP_ATOMIC allocations.  Other
allocations that use __GFP_HIGH will benefit from a few different extra
privileges.  This affects:
  xen, dm, md, ntfs3
  the vermillion frame buffer
  hibernation
  ksm
  swap
all of which likely produce more benefit than cost if these selected
allocation are more likely to succeed quickly.

[mgorman: Minor adjustments to rework on top of a series]
Link: https://lkml.kernel.org/r/163712397076.13692.4727608274002939094@noble.neil.brown.name
Link: https://lkml.kernel.org/r/20230113111217.14134-7-mgorman@techsingularity.net
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-02 22:33:13 -08:00
David Howells
83836eb4df rxrpc: Change rx_packet tracepoint to display securityIndex not type twice
Change the rx_packet tracepoint to display the securityIndex from the
packet header instead of displaying the type in numeric form.  There's no
need for the latter, as the display of the type in symbolic form will fall
back automatically to displaying the hex value if no symbol is available.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2023-01-31 16:38:35 +00:00
David Howells
f21e93485b rxrpc: Simplify ACK handling
Now that general ACK transmission is done from the same thread as incoming
DATA packet wrangling, there's no possibility that the SACK table will be
being updated by the latter whilst the former is trying to copy it to an
ACK.

This means that we can safely rotate the SACK table whilst updating it
without having to take a lock, rather than keeping all the bits inside it
in fixed place and copying and then rotating it in the transmitter.

Therefore, simplify SACK handing by keeping track of starting point in the
ring and rotate slots down as we consume them.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2023-01-31 16:38:35 +00:00
David Howells
5bbf953382 rxrpc: De-atomic call->ackr_window and call->ackr_nr_unacked
call->ackr_window doesn't need to be atomic as ACK generation and ACK
transmission are now done in the same thread, so drop the atomic64 handling
and split it into two separate members.

Similarly, call->ackr_nr_unacked doesn't need to be atomic now either.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2023-01-31 16:38:26 +00:00
David Howells
84e28aa513 rxrpc: Generate extra pings for RTT during heavy-receive call
When doing a call that has a single transmitted data packet and a massive
amount of received data packets, we only ping for one RTT sample, which
means we don't get a good reading on it.

Fix this by converting occasional IDLE ACKs into PING ACKs to elicit a
response.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2023-01-31 16:38:10 +00:00
David Howells
828bebc80a rxrpc: Shrink the tabulation in the rxrpc trace header a bit
Shrink the tabulation in the rxrpc trace header a bit to allow for fields
with long type names that have been removed.

Signed-off-by: David Howells <dhowells@redhat.com>
2023-01-31 16:37:44 +00:00
David Howells
371e68ba03 rxrpc: Remove whitespace before ')' in trace header
Work around checkpatch warnings in the rxrpc trace header by removing
whitespace before ')' on lines defining the trace record struct.

Signed-off-by: David Howells <dhowells@redhat.com>
2023-01-31 16:36:15 +00:00
Ingo Molnar
57a30218fa Linux 6.2-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmPW7E8eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGf7MIAI0JnHN9WvtEukSZ
 E6j6+cEGWxsvD6q0g3GPolaKOCw7hlv0pWcFJFcUAt0jebspMdxV2oUGJ8RYW7Lg
 nCcHvEVswGKLAQtQSWw52qotW6fUfMPsNYYB5l31sm1sKH4Cgss0W7l2HxO/1LvG
 TSeNHX53vNAZ8pVnFYEWCSXC9bzrmU/VALF2EV00cdICmfvjlgkELGXoLKJJWzUp
 s63fBHYGGURSgwIWOKStoO6HNo0j/F/wcSMx8leY8qDUtVKHj4v24EvSgxUSDBER
 ch3LiSQ6qf4sw/z7pqruKFthKOrlNmcc0phjiES0xwwGiNhLv0z3rAhc4OM2cgYh
 SDc/Y/c=
 =zpaD
 -----END PGP SIGNATURE-----

Merge tag 'v6.2-rc6' into sched/core, to pick up fixes

Pick up fixes before merging another batch of cpuidle updates.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-01-31 15:01:20 +01:00
Daniel Vetter
aebd8f0c6f Linux 6.2-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmPW7E8eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGf7MIAI0JnHN9WvtEukSZ
 E6j6+cEGWxsvD6q0g3GPolaKOCw7hlv0pWcFJFcUAt0jebspMdxV2oUGJ8RYW7Lg
 nCcHvEVswGKLAQtQSWw52qotW6fUfMPsNYYB5l31sm1sKH4Cgss0W7l2HxO/1LvG
 TSeNHX53vNAZ8pVnFYEWCSXC9bzrmU/VALF2EV00cdICmfvjlgkELGXoLKJJWzUp
 s63fBHYGGURSgwIWOKStoO6HNo0j/F/wcSMx8leY8qDUtVKHj4v24EvSgxUSDBER
 ch3LiSQ6qf4sw/z7pqruKFthKOrlNmcc0phjiES0xwwGiNhLv0z3rAhc4OM2cgYh
 SDc/Y/c=
 =zpaD
 -----END PGP SIGNATURE-----

Merge v6.2-rc6 into drm-next

Due to holidays we started -next with more -fixes in-flight than
usual, and people have been asking where they are. Backmerge to get
things better in sync.

Conflicts:
- Tiny conflict in drm_fbdev_generic.c between variable rename and
  missing error handling that got added.
- Conflict in drm_fb_helper.c between the added call to vgaswitcheroo
  in drm_fb_helper_single_fb_probe and a refactor patch that extracted
  lots of helpers and incidentally removed the dev local variable.
  Readd it to make things compile.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2023-01-31 12:23:23 +01:00
Chao Yu
2f3a9ae990 f2fs: introduce trace_f2fs_replace_atomic_write_block
Commit 3db1de0e58 ("f2fs: change the current atomic write way")
removed old tracepoints, but it missed to add new one, this patch
fixes to introduce trace_f2fs_replace_atomic_write_block to trace
atomic_write commit flow.

Fixes: 3db1de0e58 ("f2fs: change the current atomic write way")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-30 14:46:20 -08:00
David Howells
8395406b34 rxrpc: Fix trace string
Fix a trace string to indicate that it's discarding the local endpoint for
a preallocated peer, not a preallocated connection.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2023-01-30 14:13:29 +00:00
Ohad Sharabi
d5077a5500 habanalabs: define events to trace PCI LBW access
There are cases where it may be useful to dump the whole LBW configs.
Yet, doing so while spamming the kernel log will probably shade other
important messages since the LBW access is done in sheer volume.
To answer this we add trace events for those too.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-01-26 11:52:11 +02:00
Ohad Sharabi
811c74baed habanalabs: define traces for COMMS protocol
As the COMMS protocol is being used more widely in our driver,
an available debug tool for the handshake will be handy.

This commit defines tracepoints to various key points of the COMMS
protocol.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2023-01-26 10:56:21 +02:00
Steven Rostedt (Google)
dc513fd532 bpf/tracing: Use stage6 of tracing to not duplicate macros
The bpf events are created by the same macro magic as tracefs trace
events are. But to hook into bpf, it has its own code. It duplicates many
of the same macros as the tracefs macros and this is an issue because it
misses bug fixes as well as any new enhancements that come with the other
trace macros.

As the trace macros have been put into their own staging files, have bpf
take advantage of this and use the tracefs stage 6 macros that the "fast
ssign" portion of the trace event macro uses.

Link: https://lkml.kernel.org/r/20230124202515.873075730@goodmis.org
Link: https://lore.kernel.org/lkml/1671181385-5719-1-git-send-email-quic_linyyuan@quicinc.com/

Cc: bpf@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Reported-by: Linyu Yuan <quic_linyyuan@quicinc.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-01-25 10:31:24 -05:00
Steven Rostedt (Google)
92a22cea4c perf/tracing: Use stage6 of tracing to not duplicate macros
The perf events are created by the same macro magic as tracefs trace
events are. But to hook into perf, it has its own code. It duplicates many
of the same macros as the tracefs macros and this is an issue because it
misses bug fixes as well as any new enhancements that come with the other
trace macros.

As the trace macros have been put into their own staging files, have perf
take advantage of this and use the tracefs stage 6 macros that the "fast
assign" portion of the trace event macro uses.

Link: https://lkml.kernel.org/r/20230124202515.716458410@goodmis.org
Link: https://lore.kernel.org/lkml/1671181385-5719-1-git-send-email-quic_linyyuan@quicinc.com/

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reported-by: Linyu Yuan <quic_linyyuan@quicinc.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-01-25 10:31:24 -05:00
Arnd Bergmann
f938b29d27 Arm SCMI updates for v6.3
The main addition is a unified userspace interface for SCMI irrespective
 of the underlying transport and along with some changed to refactor the
 SCMI stack probing sequence.
 
 1. SCMI unified userspace interface
 
    This is to have a unified way of testing an SCMI platform firmware
    implementation for compliance, fuzzing etc., from the perspective of
    the non-secure OSPM irrespective of the underlying transport supporting
    SCMI. It is just for testing/development and not a feature intended fo
    use in production.
 
    Currently an SCMI Compliance Suite[1] can only work by injecting SCMI
    messages using the mailbox test driver only which makes it transport
    specific and can't be used with any other transport like virtio,
    smc/hvc, optee, etc. Also the shared memory can be transport specific
    and it is better to even abstract/hide those details while providing
    the userspace access. So in order to scale with any transport, we need
    a unified interface for the same.
 
    In order to achieve that, SCMI "raw mode support" is being added through
    debugfs which is more configurable as well. A userspace application
    can inject bare SCMI binary messages into the SCMI core stack; such
    messages will be routed by the SCMI regular kernel stack to the backend
    platform firmware using the configured transport transparently. This
    eliminates the to know about the specific underlying transport
    internals that will be taken care of by the SCMI core stack itself.
    Further no additional changes needed in the device tree like in the
    mailbox-test driver.
 
 [1] https://gitlab.arm.com/tests/scmi-tests
 
 2. Refactoring of the SCMI stack probing sequence
 
    On some platforms, SCMI transport can be provide by OPTEE/TEE which
    introduces certain dependency in the probe ordering. In order to address
    the same, the SCMI bus is split into its own module which continues to
    be initialized at subsys_initcall, while the SCMI core stack, including
    its various transport backends (like optee, mailbox, virtio, smc), is
    now moved into a separate module at module_init level.
 
    This allows the other possibly dependent subsystems to register and/or
    access SCMI bus well before the core SCMI stack and its dependent
    transport backends.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEunHlEgbzHrJD3ZPhAEG6vDF+4pgFAmPKvdkACgkQAEG6vDF+
 4pgB0g//eU5S0aTgt8XlwDmdjeu+mNrj68QHKINq9CS7PmBs37So0IdLJ+CpqJlo
 VSmk2kI5oLWz/u3N92QQY9RXM4hvO95kiPKuyO8NsoPWrfjBZH3rKcgEpRquZjrt
 TdBUPd2aqoKhFqkUzxs5lNEZOV/R6mm0q+i9dD4RIRKP9Tjrlm3jYDSMFnW3/QMJ
 OR3Ub0e/4Lj3QyNUxrUqwpdjTiAqXimCW7LWZ2fwY5kPxcL4wedAfQS3zGa5m8Wk
 htqRTXmYtSVKAZ/oFUPDOHuZUNqn0ZdNI7guEPgzo90+pJs0yQUQf/wtc+X9quXZ
 /FUGaVSTzlvcl1MPJPTPQ9d7dJH8lR0+nxzovkBSoMX/tNByuVtBUpNiclu8seob
 CqbywRtASkd0g6dKHHEIylwj0FpRSYBLJEcE6jXhxfvXt+sCDZbDSUpWaGGZnNqO
 oj8FhEmRk/t/d+ZEkn6MlRgy5uiJSv4GstNQ5V/ZSz3vhp1u2Sl7y8xAcuVsqXyH
 Dok6iM9GoSdskCdSICk5iA2ESC5s+1IiDd2PnSwWz+yj9HbmWKwU0nKgRRWvDUZh
 jJYWAvcwuh3SQ/sN/FNTVsxQt/x5V3L/K7oe2o983l0Lq8k9WB0mEBAyRidxAlVj
 TpfXbe9CDqJtoF85Bslb7eS++iGADkVvmWAqjqUQddFQR++Dj2w=
 =yLSu
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmPQQ0QACgkQmmx57+YA
 GNlYsRAAr/qGRRSiOtTYZi0yPmSuGCYb4xORbJHS00gD3aRzrehtsf//av4WyeyM
 cGpeR/2OIa66/M7wIjKM7lHXBdzFxLgwAM9DkexVwY1i7w6ExDZFClX7UhlgmwiM
 LbsApqHYW/DhAnOBL7SXPQkDIcTifURvOZCOJ0NJg5HOn+DeRP5UPEE5Xbqt0+yc
 JUnKC2eqN15G/BWJCWtmfZVXwMKtWB8ScU2HJeoEjjH7d31XvWvw5z0upyhlEQvb
 lqhFgKftEY+boEfbz86AoZoBY8EWFdqB8RvcNOmReU8hyequjnrK/ykQq9EAAMmo
 LbxNkW72zN727ji3O7onX8zWjqYLKjV3FC/cuNah//USQa8nPvq4atXf+D7my0fC
 Rm9ZF3ztHQjTgMVlmm/uAU3c4BxEcwHMnppEoYvxa23ErSt+hp2aJvfDfseMAYFW
 2iie4KAR03WeHnspYioa88PHj83SeT9AsHxf+PCdXkXgzl74HDVrt/8gpLQMmu4Q
 BTkwTF3QvEry5c6uF7yI/oOuQq06Y03miN5GEQKtqTIIpn3g/tJs0GJot0nUPZ0f
 xJJ+mlOcoc76zwaDARcGJO0yzX9qJt4I+/HS39BLvMt2EfbZ3IXlSsQu7esd5p2V
 6xokmygD5rZY92CMd3SPpJ4xRBHUN6PIFb1GUyJmg9xoghDwjXs=
 =jvm3
 -----END PGP SIGNATURE-----

Merge tag 'scmi-updates-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into soc/drivers

Arm SCMI updates for v6.3

The main addition is a unified userspace interface for SCMI irrespective
of the underlying transport and along with some changed to refactor the
SCMI stack probing sequence.

1. SCMI unified userspace interface

   This is to have a unified way of testing an SCMI platform firmware
   implementation for compliance, fuzzing etc., from the perspective of
   the non-secure OSPM irrespective of the underlying transport supporting
   SCMI. It is just for testing/development and not a feature intended fo
   use in production.

   Currently an SCMI Compliance Suite[1] can only work by injecting SCMI
   messages using the mailbox test driver only which makes it transport
   specific and can't be used with any other transport like virtio,
   smc/hvc, optee, etc. Also the shared memory can be transport specific
   and it is better to even abstract/hide those details while providing
   the userspace access. So in order to scale with any transport, we need
   a unified interface for the same.

   In order to achieve that, SCMI "raw mode support" is being added through
   debugfs which is more configurable as well. A userspace application
   can inject bare SCMI binary messages into the SCMI core stack; such
   messages will be routed by the SCMI regular kernel stack to the backend
   platform firmware using the configured transport transparently. This
   eliminates the to know about the specific underlying transport
   internals that will be taken care of by the SCMI core stack itself.
   Further no additional changes needed in the device tree like in the
   mailbox-test driver.

[1] https://gitlab.arm.com/tests/scmi-tests

2. Refactoring of the SCMI stack probing sequence

   On some platforms, SCMI transport can be provide by OPTEE/TEE which
   introduces certain dependency in the probe ordering. In order to address
   the same, the SCMI bus is split into its own module which continues to
   be initialized at subsys_initcall, while the SCMI core stack, including
   its various transport backends (like optee, mailbox, virtio, smc), is
   now moved into a separate module at module_init level.

   This allows the other possibly dependent subsystems to register and/or
   access SCMI bus well before the core SCMI stack and its dependent
   transport backends.

* tag 'scmi-updates-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: (31 commits)
  firmware: arm_scmi: Clarify raw per-channel ABI documentation
  firmware: arm_scmi: Add per-channel raw injection support
  firmware: arm_scmi: Add the raw mode co-existence support
  firmware: arm_scmi: Call raw mode hooks from the core stack
  firmware: arm_scmi: Reject SCMI drivers when configured in raw mode
  firmware: arm_scmi: Add debugfs ABI documentation for raw mode
  firmware: arm_scmi: Add core raw transmission support
  firmware: arm_scmi: Add debugfs ABI documentation for common entries
  firmware: arm_scmi: Populate a common SCMI debugfs root
  debugfs: Export debugfs_create_str symbol
  include: trace: Add platform and channel instance references
  firmware: arm_scmi: Add internal platform/channel identifiers
  firmware: arm_scmi: Move errors defs and code to common.h
  firmware: arm_scmi: Add xfer helpers to provide raw access
  firmware: arm_scmi: Add flags field to xfer
  firmware: arm_scmi: Refactor scmi_wait_for_message_response
  firmware: arm_scmi: Refactor polling helpers
  firmware: arm_scmi: Refactor xfer in-flight registration routines
  firmware: arm_scmi: Split bus and driver into distinct modules
  firmware: arm_scmi: Introduce a new lifecycle for protocol devices
  ...

Link: https://lore.kernel.org/r/20230120162152.1438456-1-sudeep.holla@arm.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-01-24 21:44:52 +01:00
Peilin Ye
40e0b09081 net/sock: Introduce trace_sk_data_ready()
As suggested by Cong, introduce a tracepoint for all ->sk_data_ready()
callback implementations.  For example:

<...>
  iperf-609  [002] .....  70.660425: sk_data_ready: family=2 protocol=6 func=sock_def_readable
  iperf-609  [002] .....  70.660436: sk_data_ready: family=2 protocol=6 func=sock_def_readable
<...>

Suggested-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-23 11:26:50 +00:00
Cristian Marussi
8b2bd71119 include: trace: Add platform and channel instance references
Add the channel and platform instance indentifier to SCMI message dump
traces in order to easily associate message flows to specific transport
channels.

Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20230118121426.492864-9-cristian.marussi@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2023-01-20 11:40:57 +00:00
Wenchao Hao
cb6c33d4dc cma: tracing: print alloc result in trace_cma_alloc_finish
The result of the allocation attempt is not printed in
trace_cma_alloc_finish, but it's important to do it so we can set filters
to catch specific errors on allocation or to trigger some operations on
specific errors.

We have printed the result in log, but the log is conditional and could
not be filtered by tracing events.

It introduces little overhead to print this result.  The result of
allocation is named `errorno' in the trace.

Link: https://lkml.kernel.org/r/20221208142130.1501195-1-haowenchao@huawei.com
Signed-off-by: Wenchao Hao <haowenchao@huawei.com>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-18 17:12:41 -08:00
Lu Baolu
8f9930fa01 iommu: Remove detach_dev callback
The detach_dev callback of domain ops is not called in the IOMMU core.
Remove this callback to avoid dead code. The trace event for detaching
domain from device is removed accordingly.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20230110025408.667767-6-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-01-13 16:39:18 +01:00
Yunhui Cui
6e6eda44b9 sock: add tracepoint for send recv length
Add 2 tracepoints to monitor the tcp/udp traffic
of per process and per cgroup.

Regarding monitoring the tcp/udp traffic of each process, there are two
existing solutions, the first one is https://www.atoptool.nl/netatop.php.
The second is via kprobe/kretprobe.

Netatop solution is implemented by registering the hook function at the
hook point provided by the netfilter framework.

These hook functions may be in the soft interrupt context and cannot
directly obtain the pid. Some data structures are added to bind packets
and processes. For example, struct taskinfobucket, struct taskinfo ...

Every time the process sends and receives packets it needs multiple
hashmaps,resulting in low performance and it has the problem fo inaccurate
tcp/udp traffic statistics(for example: multiple threads share sockets).

We can obtain the information with kretprobe, but as we know, kprobe gets
the result by trappig in an exception, which loses performance compared
to tracepoint.

We compared the performance of tracepoints with the above two methods, and
the results are as follows:

ab -n 1000000 -c 1000 -r http://127.0.0.1/index.html
without trace:
Time per request: 39.660 [ms] (mean)
Time per request: 0.040 [ms] (mean, across all concurrent requests)

netatop:
Time per request: 50.717 [ms] (mean)
Time per request: 0.051 [ms] (mean, across all concurrent requests)

kr:
Time per request: 43.168 [ms] (mean)
Time per request: 0.043 [ms] (mean, across all concurrent requests)

tracepoint:
Time per request: 41.004 [ms] (mean)
Time per request: 0.041 [ms] (mean, across all concurrent requests

It can be seen that tracepoint has better performance.

Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Signed-off-by: Xiongchun Duan <duanxiongchun@bytedance.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-13 10:25:10 +00:00
Yangtao Li
7a2b15cfa8 f2fs: support accounting iostat count and avg_bytes
Previously, we supported to account iostat io_bytes,
in this patch, it adds to account iostat count and avg_bytes:

time:           1671648667
                        io_bytes         count            avg_bytes
[WRITE]
app buffered data:      31               2                15

Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-11 11:15:19 -08:00
Christoph Hellwig
cd8fc5226b f2fs: remove the create argument to f2fs_map_blocks
The create argument is always identicaly to map->m_may_create, so use
that consistently.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-06 15:13:29 -08:00
David Howells
9d35d880e0 rxrpc: Move client call connection to the I/O thread
Move the connection setup of client calls to the I/O thread so that a whole
load of locking and barrierage can be eliminated.  This necessitates the
app thread waiting for connection to complete before it can begin
encrypting data.

This also completes the fix for a race that exists between call connection
and call disconnection whereby the data transmission code adds the call to
the peer error distribution list after the call has been disconnected (say
by the rxrpc socket getting closed).

The fix is to complete the process of moving call connection, data
transmission and call disconnection into the I/O thread and thus forcibly
serialising them.

Note that the issue may predate the overhaul to an I/O thread model that
were included in the merge window for v6.2, but the timing is very much
changed by the change given below.

Fixes: cf37b59875 ("rxrpc: Move DATA transmission into call processor work item")
Reported-by: syzbot+c22650d2844392afdcfd@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2023-01-06 09:43:33 +00:00
David Howells
1bab27af6b rxrpc: Set up a connection bundle from a call, not rxrpc_conn_parameters
Use the information now stored in struct rxrpc_call to configure the
connection bundle and thence the connection, rather than using the
rxrpc_conn_parameters struct.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2023-01-06 09:43:32 +00:00
David Howells
2953d3b8d8 rxrpc: Offload the completion of service conn security to the I/O thread
Offload the completion of the challenge/response cycle on a service
connection to the I/O thread.  After the RESPONSE packet has been
successfully decrypted and verified by the work queue, offloading the
changing of the call states to the I/O thread makes iteration over the
conn's channel list simpler.

Do this by marking the RESPONSE skbuff and putting it onto the receive
queue for the I/O thread to collect.  We put it on the front of the queue
as we've already received the packet for it.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2023-01-06 09:43:32 +00:00
David Howells
57af281e53 rxrpc: Tidy up abort generation infrastructure
Tidy up the abort generation infrastructure in the following ways:

 (1) Create an enum and string mapping table to list the reasons an abort
     might be generated in tracing.

 (2) Replace the 3-char string with the values from (1) in the places that
     use that to log the abort source.  This gets rid of a memcpy() in the
     tracepoint.

 (3) Subsume the rxrpc_rx_eproto tracepoint with the rxrpc_abort tracepoint
     and use values from (1) to indicate the trace reason.

 (4) Always make a call to an abort function at the point of the abort
     rather than stashing the values into variables and using goto to get
     to a place where it reported.  The C optimiser will collapse the calls
     together as appropriate.  The abort functions return a value that can
     be returned directly if appropriate.

Note that this extends into afs also at the points where that generates an
abort.  To aid with this, the afs sources need to #define
RXRPC_TRACE_ONLY_DEFINE_ENUMS before including the rxrpc tracing header
because they don't have access to the rxrpc internal structures that some
of the tracepoints make use of.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2023-01-06 09:43:32 +00:00
David Howells
a00ce28b17 rxrpc: Clean up connection abort
Clean up connection abort, using the connection state_lock to gate access
to change that state, and use an rxrpc_call_completion value to indicate
the difference between local and remote aborts as these can be pasted
directly into the call state.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2023-01-06 09:43:32 +00:00
David Howells
f2cce89a07 rxrpc: Implement a mechanism to send an event notification to a connection
Provide a means by which an event notification can be sent to a connection
through such that the I/O thread can pick it up and handle it rather than
doing it in a separate workqueue.

This is then used to move the deferred final ACK of a call into the I/O
thread rather than a separate work queue as part of the drive to do all
transmission from the I/O thread.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2023-01-06 09:43:31 +00:00
David Howells
03fc55adf8 rxrpc: Only disconnect calls in the I/O thread
Only perform call disconnection in the I/O thread to reduce the locking
requirement.

This is the first part of a fix for a race that exists between call
connection and call disconnection whereby the data transmission code adds
the call to the peer error distribution list after the call has been
disconnected (say by the rxrpc socket getting closed).

The fix is to complete the process of moving call connection, data
transmission and call disconnection into the I/O thread and thus forcibly
serialising them.

Note that the issue may predate the overhaul to an I/O thread model that
were included in the merge window for v6.2, but the timing is very much
changed by the change given below.

Fixes: cf37b59875 ("rxrpc: Move DATA transmission into call processor work item")
Reported-by: syzbot+c22650d2844392afdcfd@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2023-01-06 09:43:31 +00:00
David Howells
a343b174b4 rxrpc: Only set/transmit aborts in the I/O thread
Only set the abort call completion state in the I/O thread and only
transmit ABORT packets from there.  rxrpc_abort_call() can then be made to
actually send the packet.

Further, ABORT packets should only be sent if the call has been exposed to
the network (ie. at least one attempted DATA transmission has occurred for
it).

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2023-01-06 09:43:31 +00:00
David Howells
5040011d07 rxrpc: Make the local endpoint hold a ref on a connected call
Make the local endpoint and it's I/O thread hold a reference on a connected
call until that call is disconnected.  Without this, we're reliant on
either the AF_RXRPC socket to hold a ref (which is dropped when the call is
released) or a queued work item to hold a ref (the work item is being
replaced with the I/O thread).

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2023-01-06 09:43:31 +00:00
Linus Torvalds
50011c32f4 Including fixes from bpf, wifi, and netfilter.
Current release - regressions:
 
  - bpf: fix nullness propagation for reg to reg comparisons,
    avoid null-deref
 
  - inet: control sockets should not use current thread task_frag
 
  - bpf: always use maximal size for copy_array()
 
  - eth: bnxt_en: don't link netdev to a devlink port for VFs
 
 Current release - new code bugs:
 
  - rxrpc: fix a couple of potential use-after-frees
 
  - netfilter: conntrack: fix IPv6 exthdr error check
 
  - wifi: iwlwifi: fw: skip PPAG for JF, avoid FW crashes
 
  - eth: dsa: qca8k: various fixes for the in-band register access
 
  - eth: nfp: fix schedule in atomic context when sync mc address
 
  - eth: renesas: rswitch: fix getting mac address from device tree
 
  - mobile: ipa: use proper endpoint mask for suspend
 
 Previous releases - regressions:
 
  - tcp: add TIME_WAIT sockets in bhash2, fix regression caught
    by Jiri / python tests
 
  - net: tc: don't intepret cls results when asked to drop, fix
    oob-access
 
  - vrf: determine the dst using the original ifindex for multicast
 
  - eth: bnxt_en:
    - fix XDP RX path if BPF adjusted packet length
    - fix HDS (header placement) and jumbo thresholds for RX packets
 
  - eth: ice: xsk: do not use xdp_return_frame() on tx_buf->raw_buf,
    avoid memory corruptions
 
 Previous releases - always broken:
 
  - ulp: prevent ULP without clone op from entering the LISTEN status
 
  - veth: fix race with AF_XDP exposing old or uninitialized descriptors
 
  - bpf:
    - pull before calling skb_postpull_rcsum() (fix checksum support
      and avoid a WARN())
    - fix panic due to wrong pageattr of im->image (when livepatch
      and kretfunc coexist)
    - keep a reference to the mm, in case the task is dead
 
  - mptcp: fix deadlock in fastopen error path
 
  - netfilter:
    - nf_tables: perform type checking for existing sets
    - nf_tables: honor set timeout and garbage collection updates
    - ipset: fix hash:net,port,net hang with /0 subnet
    - ipset: avoid hung task warning when adding/deleting entries
 
  - selftests: net:
    - fix cmsg_so_mark.sh test hang on non-x86 systems
    - fix the arp_ndisc_evict_nocarrier test for IPv6
 
  - usb: rndis_host: secure rndis_query check against int overflow
 
  - eth: r8169: fix dmar pte write access during suspend/resume with WOL
 
  - eth: lan966x: fix configuration of the PCS
 
  - eth: sparx5: fix reading of the MAC address
 
  - eth: qed: allow sleep in qed_mcp_trace_dump()
 
  - eth: hns3:
    - fix interrupts re-initialization after VF FLR
    - fix handling of promisc when MAC addr table gets full
    - refine the handling for VF heartbeat
 
  - eth: mlx5:
    - properly handle ingress QinQ-tagged packets on VST
    - fix io_eq_size and event_eq_size params validation on big endian
    - fix RoCE setting at HCA level if not supported at all
    - don't turn CQE compression on by default for IPoIB
 
  - eth: ena:
    - fix toeplitz initial hash key value
    - account for the number of XDP-processed bytes in interface stats
    - fix rx_copybreak value update
 
 Misc:
 
  - ethtool: harden phy stat handling against buggy drivers
 
  - docs: netdev: convert maintainer's doc from FAQ to a normal document
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmO3MLcACgkQMUZtbf5S
 IrsEQBAAijPrpxsGMfX+VMqZ8RPKA3Qg8XF3ji2fSp4c0kiKv6lYI7PzPTR3u/fj
 CAlhQMHv7z53uM6Zd7FdUVl23paaEycu8YnlwSubg9z+wSeh/RQ6iq94mSk1PV+K
 LLVR/yop2N35Yp/oc5KZMb9fMLkxRG9Ci73QUVVYgvIrSd4Zdm13FjfVjL2C1MZH
 Yp003wigMs9IkIHOpHjNqwn/5s//0yXsb1PgKxCsaMdMQsG0yC+7eyDmxshCqsji
 xQm15mkGMjvWEYJaa4Tj4L3JW6lWbQzCu9nqPUX16KpmrnScr8S8Is+aifFZIBeW
 GZeDYgvjSxNWodeOrJnD3X+fnbrR9+qfx7T9y7XighfytAz5DNm1LwVOvZKDgPFA
 s+LlxOhzkDNEqbIsusK/LW+04EFc5gJyTI2iR6s4SSqmH3c3coJZQJeyRFWDZy/x
 1oqzcCcq8SwGUTJ9g6HAmDQoVkhDWDT/ZcRKhpWG0nJub972lB2iwM7LrAu+HoHI
 r8hyCkHpOi5S3WZKI9gPiGD+yOlpVAuG2wHg2IpjhKQvtd9DFUChGDhFeoB2rqJf
 9uI3RJBBYTDkeNu3kpfy5uMh2XhvbIZntK5kwpJ4VettZWFMaOAzn7KNqk8iT4gJ
 ASMrUrX59X0TAN0MgpJJm7uGtKbKZOu4lHNm74TUxH7V7bYn7dk=
 =TlcN
 -----END PGP SIGNATURE-----

Merge tag 'net-6.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf, wifi, and netfilter.

  Current release - regressions:

   - bpf: fix nullness propagation for reg to reg comparisons, avoid
     null-deref

   - inet: control sockets should not use current thread task_frag

   - bpf: always use maximal size for copy_array()

   - eth: bnxt_en: don't link netdev to a devlink port for VFs

  Current release - new code bugs:

   - rxrpc: fix a couple of potential use-after-frees

   - netfilter: conntrack: fix IPv6 exthdr error check

   - wifi: iwlwifi: fw: skip PPAG for JF, avoid FW crashes

   - eth: dsa: qca8k: various fixes for the in-band register access

   - eth: nfp: fix schedule in atomic context when sync mc address

   - eth: renesas: rswitch: fix getting mac address from device tree

   - mobile: ipa: use proper endpoint mask for suspend

  Previous releases - regressions:

   - tcp: add TIME_WAIT sockets in bhash2, fix regression caught by
     Jiri / python tests

   - net: tc: don't intepret cls results when asked to drop, fix
     oob-access

   - vrf: determine the dst using the original ifindex for multicast

   - eth: bnxt_en:
      - fix XDP RX path if BPF adjusted packet length
      - fix HDS (header placement) and jumbo thresholds for RX packets

   - eth: ice: xsk: do not use xdp_return_frame() on tx_buf->raw_buf,
     avoid memory corruptions

  Previous releases - always broken:

   - ulp: prevent ULP without clone op from entering the LISTEN status

   - veth: fix race with AF_XDP exposing old or uninitialized
     descriptors

   - bpf:
      - pull before calling skb_postpull_rcsum() (fix checksum support
        and avoid a WARN())
      - fix panic due to wrong pageattr of im->image (when livepatch and
        kretfunc coexist)
      - keep a reference to the mm, in case the task is dead

   - mptcp: fix deadlock in fastopen error path

   - netfilter:
      - nf_tables: perform type checking for existing sets
      - nf_tables: honor set timeout and garbage collection updates
      - ipset: fix hash:net,port,net hang with /0 subnet
      - ipset: avoid hung task warning when adding/deleting entries

   - selftests: net:
      - fix cmsg_so_mark.sh test hang on non-x86 systems
      - fix the arp_ndisc_evict_nocarrier test for IPv6

   - usb: rndis_host: secure rndis_query check against int overflow

   - eth: r8169: fix dmar pte write access during suspend/resume with
     WOL

   - eth: lan966x: fix configuration of the PCS

   - eth: sparx5: fix reading of the MAC address

   - eth: qed: allow sleep in qed_mcp_trace_dump()

   - eth: hns3:
      - fix interrupts re-initialization after VF FLR
      - fix handling of promisc when MAC addr table gets full
      - refine the handling for VF heartbeat

   - eth: mlx5:
      - properly handle ingress QinQ-tagged packets on VST
      - fix io_eq_size and event_eq_size params validation on big endian
      - fix RoCE setting at HCA level if not supported at all
      - don't turn CQE compression on by default for IPoIB

   - eth: ena:
      - fix toeplitz initial hash key value
      - account for the number of XDP-processed bytes in interface stats
      - fix rx_copybreak value update

  Misc:

   - ethtool: harden phy stat handling against buggy drivers

   - docs: netdev: convert maintainer's doc from FAQ to a normal
     document"

* tag 'net-6.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (112 commits)
  caif: fix memory leak in cfctrl_linkup_request()
  inet: control sockets should not use current thread task_frag
  net/ulp: prevent ULP without clone op from entering the LISTEN status
  qed: allow sleep in qed_mcp_trace_dump()
  MAINTAINERS: Update maintainers for ptp_vmw driver
  usb: rndis_host: Secure rndis_query check against int overflow
  net: dpaa: Fix dtsec check for PCS availability
  octeontx2-pf: Fix lmtst ID used in aura free
  drivers/net/bonding/bond_3ad: return when there's no aggregator
  netfilter: ipset: Rework long task execution when adding/deleting entries
  netfilter: ipset: fix hash:net,port,net hang with /0 subnet
  net: sparx5: Fix reading of the MAC address
  vxlan: Fix memory leaks in error path
  net: sched: htb: fix htb_classify() kernel-doc
  net: sched: cbq: dont intepret cls results when asked to drop
  net: sched: atm: dont intepret cls results when asked to drop
  dt-bindings: net: marvell,orion-mdio: Fix examples
  dt-bindings: net: sun8i-emac: Add phy-supply property
  net: ipa: use proper endpoint mask for suspend
  selftests: net: return non-zero for failures reported in arp_ndisc_evict_nocarrier
  ...
2023-01-05 12:40:50 -08:00
Dan Williams
4a20bc3e20 cxl/pci: Move tracepoint definitions to drivers/cxl/core/
CXL is using tracepoints for reporting RAS capability register payloads
for AER events, and has plans to use tracepoints for the output payload
of Get Poison List and Get Event Records commands. For organization
purposes it would be nice to keep those all under a single + local CXL
trace system. This also organization also potentially helps in the
future when CXL drivers expand beyond generic memory expanders, however
that would also entail a move away from the expander-specific
cxl_dev_state context, save that for later.

Note that the powerpc-specific drivers/misc/cxl/ also defines a 'cxl'
trace system, however, it is unlikely that a single platform will ever
load both drivers simultaneously.

Cc: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167051869176.436579.9728373544811641087.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-04 17:11:11 -08:00
Linus Torvalds
69b41ac87e for-6.2-rc2-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmOyzdUACgkQxWXV+ddt
 WDt4qhAAqZZ7Tldx3kVKN6ExBfcDoimeQPPZmmMnL7A7POQyATtyBHCcu9ymj6Z6
 tuUqYcj7h4ydeHjL0AvaskpV1ALkfopkOA9KWAE2m1lyu4qclF6tSEJl7AKyCft7
 g4UyBpCFcnml/by0JeErHMJoxUz/AADYfW/wbyM/XvH2IiODJWf4mMWzJaL+t+GP
 rkJe9OgtmKEVZ2h5Gvdfnw4CrYm/Ds7CfG0UntpwIHvQBLHcms+OvFDSxRKZHxGs
 kt4u/b589AgL+8xNQrpfWfUQf9Zev2c+ekatU3ibi+c67XRtv45kHwsJvqaX+gmV
 +AaBI0GrQDdHXPNU22nmXeIi7tb3JnI/Vy6GHNkopIzdWkIiEtRu8hkVARhRxle7
 Z1WEAWgzPj2QerwmWrgk2TedxF1KD5J0jEJlNaNN7Dh3T8Fu5YjediQVf6mbKhkM
 yFUd0OBAlGNhEqq42ObH6TUYsqbzGk58EYaHGzBDa6QbA/yEfHaFwSqRstg/X3gv
 7WxImSq67KN0SkZZDMszZxzfEehXK9nmxoIfgo0/WGaYMSCxzBs6Xh17SJl9bhiE
 7Cee5dfiHamrYZF6oGpolP/FoZx68yPJXRmfEUQARTrMvF7cE62hjLLUjU7OgW9m
 GeLoFDq9bAh3OC4aEPdqyyu3Bh2yOfMPwpCO1wMk9I/tsIvR8mY=
 =+EpE
 -----END PGP SIGNATURE-----

Merge tag 'for-6.2-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
 "First batch of regression and regular fixes:

   - regressions:
       - fix error handling after conversion to qstr for paths
       - fix raid56/scrub recovery caused by uninitialized variable
         after conversion to error bitmaps
       - restore qgroup backref lookup behaviour after recent
         refactoring
       - fix leak of device lists at module exit time

   - fix resolving backrefs for inline extent followed by prealloc

   - reset defrag ioctl buffer on memory allocation error"

* tag 'for-6.2-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: fix fscrypt name leak after failure to join log transaction
  btrfs: scrub: fix uninitialized return value in recover_scrub_rbio
  btrfs: fix resolving backrefs for inline extent followed by prealloc
  btrfs: fix trace event name typo for FLUSH_DELAYED_REFS
  btrfs: restore BTRFS_SEQ_LAST when looking up qgroup backref lookup
  btrfs: fix leak of fs devices after removing btrfs module
  btrfs: fix an error handling path in btrfs_defrag_leaves()
  btrfs: fix an error handling path in btrfs_rename()
2023-01-02 11:06:18 -08:00
David Howells
0e50d99990 rxrpc: Fix a couple of potential use-after-frees
At the end of rxrpc_recvmsg(), if a call is found, the call is put and then
a trace line is emitted referencing that call in a couple of places - but
the call may have been deallocated by the time those traces happen.

Fix this by stashing the call debug_id in a variable and passing that to
the tracepoint rather than the call pointer.

Fixes: 849979051c ("rxrpc: Add a tracepoint to follow what recvmsg does")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-28 09:59:23 +00:00
Mathieu Desnoyers
14a8644d4f tracing/rseq: Add mm_cid field to rseq_update
Add the mm_cid field to the rseq_update event, allowing tracers to
follow which mm_cid is observed by user-space, and whether negative
mm_cid values are visible in case of internal scheduler implementation
issues.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221122203932.231377-22-mathieu.desnoyers@efficios.com
2022-12-27 12:52:15 +01:00
Mathieu Desnoyers
cbae6bac29 rseq: Extend struct rseq with numa node id
Adding the NUMA node id to struct rseq is a straightforward thing to do,
and a good way to figure out if anything in the user-space ecosystem
prevents extending struct rseq.

This NUMA node id field allows memory allocators such as tcmalloc to
take advantage of fast access to the current NUMA node id to perform
NUMA-aware memory allocation.

It can also be useful for implementing fast-paths for NUMA-aware
user-space mutexes.

It also allows implementing getcpu(2) purely in user-space.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221122203932.231377-5-mathieu.desnoyers@efficios.com
2022-12-27 12:52:10 +01:00
Linus Torvalds
7a693ea78e pwm: Changes for v6.2-rc1
Various changes across the board, mostly improvements and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEEiOrDCAFJzPfAjcif3SOs138+s6EFAmOi1qoZHHRoaWVycnku
 cmVkaW5nQGdtYWlsLmNvbQAKCRDdI6zXfz6zoUNIEACLRuV3datmi1xBMt5ZVdLM
 YtSNYjPENbiMbcRHWV7MeOjLFeZN6LfhuV7phwagU3n53vMjR8SNogVf6X9HM7mA
 aRf98WcoVar+zikUoWkQE4m+F3/yAIm8ab2H62XVtXe+R+DdJHBcapxLIrqt1FvK
 XyUtcdwhr6VoY41MVN9RneXpAacPvX4fFuxa63xvlvhVGdgkENzqL02zBadQNgrg
 6xsJGig0Irl4LiX9XjFB3PPEvSFeodszqubdqCuGHNXz9nymmTo0uVxrAWPhYHOv
 1JhQQwRBDcFJqTrJcTGtREH1pmZOOneo/DYW5hNLxQpBCdD0aUD6GBhn81/zVLcj
 MBXpEWEesSV4Ng/fxu7EH/k0Db3l+SpNtotUlKVJv9/n3Ni1Xhkj9hgViWg+nN1w
 RfgOvWdI6xqKgsNUnR7w3JaTqtMsTw0YZpgMvfqlulkaxQ9Mj1tzfoFSQd06uteV
 bmslEGzl19EJPvWd0ttwrN6A1RHcxWl0ZbuAP5OnNscRQPl9vf4OMSpzkD4uBguu
 BuJy8r6UohEVrN+z4WB2mIEjkskFHTMLP4p/x85L97KjIzPl4Xy32mxFrt0SFJFr
 lMKXDeEhwba5zwWHgKbNQA2EK3FNCGAeYHWSXqxD2XpyNgiAoTQdzI30CPWqQwpB
 BPvwU8jT1CRFweG1J1fd8A==
 =IM51
 -----END PGP SIGNATURE-----

Merge tag 'pwm/for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm

Pull pwm updates from Thierry Reding:
 "Various changes across the board, mostly improvements and cleanups"

* tag 'pwm/for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (42 commits)
  pwm: pca9685: Convert to i2c's .probe_new()
  pwm: sun4i: Propagate errors in .get_state() to the caller
  pwm: Handle .get_state() failures
  pwm: sprd: Propagate errors in .get_state() to the caller
  pwm: rockchip: Propagate errors in .get_state() to the caller
  pwm: mtk-disp: Propagate errors in .get_state() to the caller
  pwm: imx27: Propagate errors in .get_state() to the caller
  pwm: cros-ec: Propagate errors in .get_state() to the caller
  pwm: crc: Propagate errors in .get_state() to the caller
  leds: qcom-lpg: Propagate errors in .get_state() to the caller
  drm/bridge: ti-sn65dsi86: Propagate errors in .get_state() to the caller
  pwm/tracing: Also record trace events for failed API calls
  pwm: Make .get_state() callback return an error code
  pwm: pxa: Enable for MMP platform
  pwm: pxa: Add reference manual link and limitations
  pwm: pxa: Use abrupt shutdown mode
  pwm: pxa: Remove clk enable/disable from pxa_pwm_config
  pwm: pxa: Set duty cycle to 0 when disabling PWM
  pwm: pxa: Remove pxa_pwm_enable/disable
  pwm: mediatek: Add support for MT7986
  ...
2022-12-21 09:41:28 -08:00
Linus Torvalds
609d3bc623 Including fixes from bpf, netfilter and can.
Current release - regressions:
 
  - bpf: synchronize dispatcher update with bpf_dispatcher_xdp_func
 
  - rxrpc:
   - fix security setting propagation
   - fix null-deref in rxrpc_unuse_local()
   - fix switched parameters in peer tracing
 
 Current release - new code bugs:
 
  - rxrpc:
    - fix I/O thread startup getting skipped
    - fix locking issues in rxrpc_put_peer_locked()
    - fix I/O thread stop
    - fix uninitialised variable in rxperf server
    - fix the return value of rxrpc_new_incoming_call()
 
  - microchip: vcap: fix initialization of value and mask
 
  - nfp: fix unaligned io read of capabilities word
 
 Previous releases - regressions:
 
  - stop in-kernel socket users from corrupting socket's task_frag
 
  - stream: purge sk_error_queue in sk_stream_kill_queues()
 
  - openvswitch: fix flow lookup to use unmasked key
 
  - dsa: mv88e6xxx: avoid reg_lock deadlock in mv88e6xxx_setup_port()
 
  - devlink:
    - hold region lock when flushing snapshots
    - protect devlink dump by the instance lock
 
 Previous releases - always broken:
 
  - bpf:
    - prevent leak of lsm program after failed attach
    - resolve fext program type when checking map compatibility
 
  - skbuff: account for tail adjustment during pull operations
 
  - macsec: fix net device access prior to holding a lock
 
  - bonding: switch back when high prio link up
 
  - netfilter: flowtable: really fix NAT IPv6 offload
 
  - enetc: avoid buffer leaks on xdp_do_redirect() failure
 
  - unix: fix race in SOCK_SEQPACKET's unix_dgram_sendmsg()
 
  - dsa: microchip: remove IRQF_TRIGGER_FALLING in request_threaded_irq
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmOiGa4ACgkQMUZtbf5S
 IrvetBAAg/AjgG51gboLsuGjgRSwAi5T6ijgVR+pW+kMuoOdaamOF+h/zC1ox/H9
 QrWvTBipy+EqSD8bM4Xz0FNgidch8X4iWYhKGZuBht/4NP5FOzPUG2mNlUy5ANGq
 QZcCw6CUsir8HTb+IJpFEIq0JMwzKCm3WyAkYjEj4iuft0Y93cAgjkMVwoX0RERO
 o/pslC5dsozCLJxEglpw1aJq7aoroNuRSGSXl95nv8fU3UxmUXajnA3HNscXImdV
 6uqSIuyPIaGocpCBPRKUQd0sctkTY4cm8wmxxMCDVsBRVusoaq5eg1VRvxJm9Rxj
 gvDvHvfhnEuSigFF5A+paBp4c+i3C8g/UTBJTtptdAC+Y2tt4UT3Q5aaazYUOAqd
 W4TSJ3bk5zhkhpRF9clb0fNQaM1HOT4rkDEEGTfVN62dtHfPKpNwYufQKaYHdVj1
 RJ3ooH6c7TMVaRs6ZgEWNYToKZj94SIfPhfEhuqWXdNMDBkUMp2BXFFOp9fZDWju
 PsMQrRD7n6+XXpNvScYtnJDORqfIL9yHGZE9kxZA5QSDl9cnPA3SUbNruQPlXHrl
 w0yQlYuG3gcciua4dXaLfz1iN4rPdenuYhVBHhztEwDKl+b61CVQYlOHGkXPVURp
 oft74qCCFbva+Hf/7jENQotjT1tLfxAGdUARuFeDBueJgDRAPsw=
 =goV5
 -----END PGP SIGNATURE-----

Merge tag 'net-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf, netfilter and can.

  Current release - regressions:

   - bpf: synchronize dispatcher update with bpf_dispatcher_xdp_func

   - rxrpc:
      - fix security setting propagation
      - fix null-deref in rxrpc_unuse_local()
      - fix switched parameters in peer tracing

  Current release - new code bugs:

   - rxrpc:
      - fix I/O thread startup getting skipped
      - fix locking issues in rxrpc_put_peer_locked()
      - fix I/O thread stop
      - fix uninitialised variable in rxperf server
      - fix the return value of rxrpc_new_incoming_call()

   - microchip: vcap: fix initialization of value and mask

   - nfp: fix unaligned io read of capabilities word

  Previous releases - regressions:

   - stop in-kernel socket users from corrupting socket's task_frag

   - stream: purge sk_error_queue in sk_stream_kill_queues()

   - openvswitch: fix flow lookup to use unmasked key

   - dsa: mv88e6xxx: avoid reg_lock deadlock in mv88e6xxx_setup_port()

   - devlink:
      - hold region lock when flushing snapshots
      - protect devlink dump by the instance lock

  Previous releases - always broken:

   - bpf:
      - prevent leak of lsm program after failed attach
      - resolve fext program type when checking map compatibility

   - skbuff: account for tail adjustment during pull operations

   - macsec: fix net device access prior to holding a lock

   - bonding: switch back when high prio link up

   - netfilter: flowtable: really fix NAT IPv6 offload

   - enetc: avoid buffer leaks on xdp_do_redirect() failure

   - unix: fix race in SOCK_SEQPACKET's unix_dgram_sendmsg()

   - dsa: microchip: remove IRQF_TRIGGER_FALLING in
     request_threaded_irq"

* tag 'net-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (64 commits)
  net: fec: check the return value of build_skb()
  net: simplify sk_page_frag
  Treewide: Stop corrupting socket's task_frag
  net: Introduce sk_use_task_frag in struct sock.
  mctp: Remove device type check at unregister
  net: dsa: microchip: remove IRQF_TRIGGER_FALLING in request_threaded_irq
  can: kvaser_usb: hydra: help gcc-13 to figure out cmd_len
  can: flexcan: avoid unbalanced pm_runtime_enable warning
  Documentation: devlink: add missing toc entry for etas_es58x devlink doc
  mctp: serial: Fix starting value for frame check sequence
  nfp: fix unaligned io read of capabilities word
  net: stream: purge sk_error_queue in sk_stream_kill_queues()
  myri10ge: Fix an error handling path in myri10ge_probe()
  net: microchip: vcap: Fix initialization of value and mask
  rxrpc: Fix the return value of rxrpc_new_incoming_call()
  rxrpc: rxperf: Fix uninitialised variable
  rxrpc: Fix I/O thread stop
  rxrpc: Fix switched parameters in peer tracing
  rxrpc: Fix locking issues in rxrpc_put_peer_locked()
  rxrpc: Fix I/O thread startup getting skipped
  ...
2022-12-21 08:41:32 -08:00
Linus Torvalds
70b07bec95 asm-generic bits for 6.2
There are only three fairly simple patches. The #include
 change to linux/swab.h addresses a userspace build issue,
 and the change to the mmio tracing logic helps provide
 more useful traces.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmOgtJUACgkQmmx57+YA
 GNln8Q//dvQ2FRIWBXKh4r6CxtiCx2aktGmnP1YAuaIVuzjGSn/8EQZAoTYN5jKY
 Io8rFt1/FfOMtu3E32JtGpgfDAP/8sfz3Lao9bzJR/Fjv059qL5QCoI3qbEFTNz9
 vzUqiddFZGppn76qsXSA6aItHVDS4Y97XiYRSwSMlpIz+9a84rYxCo04bNR4ut4t
 PR5+lvlTDfGfmj+SebrCt/IEi/FF9ckEYCLJHfaSPcQcujLDZDKPcT2RbubgwHgB
 OfE5Rx25xJxR4BU5MFe74sKn5Qi5HOfr1GrsjL3RbMNiYuHgbwLcZkMXvbZukdHz
 50Gt8UXMAxvZYKz92kyQLYuiKEtFSrQ8JccgqVUWL/lRLDoUkTg4hz4tmGUZE6KP
 ElxdgIBem9yrFX0oCaPNkY5d3MRU2i19FvBfKWKC54NbcmBjpHxxSg+WW/P7Jw+N
 uegj7qcEh7RcQU4w97OW4nS+eZmnXb4O4qXZeFwhXHS/snH7p3iBApyoPlyb+KOs
 np5MWRNaGFfi8BWWeVTX78U2VW8Ql8nnlRIlk/Wwm8AkVaNFQDnffKPi87paZd9o
 Kl+a9broMf4v0Oq5JTxqPMzmn9zUV0rHa1VanRBnNKqTOWalmNcsfsg1Ih9PhAAT
 p3u2CN0cBI7QmrcymJHrCuv0eNJRjsYa5FB4xmhJcJkD2qjsqXI=
 =05F5
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic updates from Arnd Bergmann:
 "There are only three fairly simple patches.

  The #include change to linux/swab.h addresses a userspace build issue,
  and the change to the mmio tracing logic helps provide more useful
  traces"

* tag 'asm-generic-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  uapi: Add missing _UAPI prefix to <asm-generic/types.h> include guard
  asm-generic/io: Add _RET_IP_ to MMIO trace for more accurate debug info
  include/uapi/linux/swab: Fix potentially missing __always_inline
2022-12-20 08:32:11 -06:00
David Howells
c838f1a73d rxrpc: Fix switched parameters in peer tracing
Fix the switched parameters on rxrpc_alloc_peer() and rxrpc_get_peer().
The ref argument and the why argument got mixed.

Fixes: 47c810a798 ("rxrpc: trace: Don't use __builtin_return_address for rxrpc_peer tracing")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-19 09:51:31 +00:00
Linus Torvalds
fe36bb8736 Tracing updates for 6.2:
- Add options to the osnoise tracer
   o panic_on_stop option that panics the kernel if osnoise is greater than some
     user defined threshold.
   o preempt option, to test noise while preemption is disabled
   o irq option, to test noise when interrupts are disabled
 
 - Add .percent and .graph suffix to histograms to give different outputs
 
 - Add nohitcount to disable showing hitcount in histogram output
 
 - Add new __cpumask() to trace event fields to annotate that a unsigned long
   array is a cpumask to user space and should be treated as one.
 
 - Add trace_trigger kernel command line parameter to enable trace event
   triggers at boot up. Useful to trace stack traces, disable tracing and take
   snapshots.
 
 - Fix x86/kmmio mmio tracer to work with the updates to lockdep
 
 - Unify the panic and die notifiers
 
 - Add back ftrace_expect reference that is used to extract more information in
   the ftrace_bug() code.
 
 - Have trigger filter parsing errors show up in the tracing error log.
 
 - Updated MAINTAINERS file to add kernel tracing  mailing list and patchwork
   info
 
 - Use IDA to keep track of event type numbers.
 
 - And minor fixes and clean ups
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCY5vIcxQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qlO7AQCmtZbriadAR6N7Llj092YXmYfzrxyi
 1WS35vhpZsBJ8gEA8j68l+LrgNt51N2gXlTXEHgXzdBgL/TKAPSX4D99GQY=
 =z1pe
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing updates from Steven Rostedt:

 - Add options to the osnoise tracer:
      - 'panic_on_stop' option that panics the kernel if osnoise is
        greater than some user defined threshold.
      - 'preempt' option, to test noise while preemption is disabled
      - 'irq' option, to test noise when interrupts are disabled

 - Add .percent and .graph suffix to histograms to give different
   outputs

 - Add nohitcount to disable showing hitcount in histogram output

 - Add new __cpumask() to trace event fields to annotate that a unsigned
   long array is a cpumask to user space and should be treated as one.

 - Add trace_trigger kernel command line parameter to enable trace event
   triggers at boot up. Useful to trace stack traces, disable tracing
   and take snapshots.

 - Fix x86/kmmio mmio tracer to work with the updates to lockdep

 - Unify the panic and die notifiers

 - Add back ftrace_expect reference that is used to extract more
   information in the ftrace_bug() code.

 - Have trigger filter parsing errors show up in the tracing error log.

 - Updated MAINTAINERS file to add kernel tracing mailing list and
   patchwork info

 - Use IDA to keep track of event type numbers.

 - And minor fixes and clean ups

* tag 'trace-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (44 commits)
  tracing: Fix cpumask() example typo
  tracing: Improve panic/die notifiers
  ftrace: Prevent RCU stall on PREEMPT_VOLUNTARY kernels
  tracing: Do not synchronize freeing of trigger filter on boot up
  tracing: Remove pointer (asterisk) and brackets from cpumask_t field
  tracing: Have trigger filter parsing errors show up in error_log
  x86/mm/kmmio: Remove redundant preempt_disable()
  tracing: Fix infinite loop in tracing_read_pipe on overflowed print_trace_line
  Documentation/osnoise: Add osnoise/options documentation
  tracing/osnoise: Add preempt and/or irq disabled options
  tracing/osnoise: Add PANIC_ON_STOP option
  Documentation/osnoise: Escape underscore of NO_ prefix
  tracing: Fix some checker warnings
  tracing/osnoise: Make osnoise_options static
  tracing: remove unnecessary trace_trigger ifdef
  ring-buffer: Handle resize in early boot up
  tracing/hist: Fix issue of losting command info in error_log
  tracing: Fix issue of missing one synthetic field
  tracing/hist: Fix out-of-bound write on 'action_data.var_ref_idx'
  tracing/hist: Fix wrong return value in parse_action_params()
  ...
2022-12-15 18:01:16 -08:00
Linus Torvalds
8fa590bf34 ARM64:
* Enable the per-vcpu dirty-ring tracking mechanism, together with an
   option to keep the good old dirty log around for pages that are
   dirtied by something other than a vcpu.
 
 * Switch to the relaxed parallel fault handling, using RCU to delay
   page table reclaim and giving better performance under load.
 
 * Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping option,
   which multi-process VMMs such as crosvm rely on (see merge commit 382b5b87a9:
   "Fix a number of issues with MTE, such as races on the tags being
   initialised vs the PG_mte_tagged flag as well as the lack of support
   for VM_SHARED when KVM is involved.  Patches from Catalin Marinas and
   Peter Collingbourne").
 
 * Merge the pKVM shadow vcpu state tracking that allows the hypervisor
   to have its own view of a vcpu, keeping that state private.
 
 * Add support for the PMUv3p5 architecture revision, bringing support
   for 64bit counters on systems that support it, and fix the
   no-quite-compliant CHAIN-ed counter support for the machines that
   actually exist out there.
 
 * Fix a handful of minor issues around 52bit VA/PA support (64kB pages
   only) as a prefix of the oncoming support for 4kB and 16kB pages.
 
 * Pick a small set of documentation and spelling fixes, because no
   good merge window would be complete without those.
 
 s390:
 
 * Second batch of the lazy destroy patches
 
 * First batch of KVM changes for kernel virtual != physical address support
 
 * Removal of a unused function
 
 x86:
 
 * Allow compiling out SMM support
 
 * Cleanup and documentation of SMM state save area format
 
 * Preserve interrupt shadow in SMM state save area
 
 * Respond to generic signals during slow page faults
 
 * Fixes and optimizations for the non-executable huge page errata fix.
 
 * Reprogram all performance counters on PMU filter change
 
 * Cleanups to Hyper-V emulation and tests
 
 * Process Hyper-V TLB flushes from a nested guest (i.e. from a L2 guest
   running on top of a L1 Hyper-V hypervisor)
 
 * Advertise several new Intel features
 
 * x86 Xen-for-KVM:
 
 ** Allow the Xen runstate information to cross a page boundary
 
 ** Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured
 
 ** Add support for 32-bit guests in SCHEDOP_poll
 
 * Notable x86 fixes and cleanups:
 
 ** One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0).
 
 ** Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few
    years back when eliminating unnecessary barriers when switching between
    vmcs01 and vmcs02.
 
 ** Clean up vmread_error_trampoline() to make it more obvious that params
    must be passed on the stack, even for x86-64.
 
 ** Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective
    of the current guest CPUID.
 
 ** Fudge around a race with TSC refinement that results in KVM incorrectly
    thinking a guest needs TSC scaling when running on a CPU with a
    constant TSC, but no hardware-enumerated TSC frequency.
 
 ** Advertise (on AMD) that the SMM_CTL MSR is not supported
 
 ** Remove unnecessary exports
 
 Generic:
 
 * Support for responding to signals during page faults; introduces
   new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks
 
 Selftests:
 
 * Fix an inverted check in the access tracking perf test, and restore
   support for asserting that there aren't too many idle pages when
   running on bare metal.
 
 * Fix build errors that occur in certain setups (unsure exactly what is
   unique about the problematic setup) due to glibc overriding
   static_assert() to a variant that requires a custom message.
 
 * Introduce actual atomics for clear/set_bit() in selftests
 
 * Add support for pinning vCPUs in dirty_log_perf_test.
 
 * Rename the so called "perf_util" framework to "memstress".
 
 * Add a lightweight psuedo RNG for guest use, and use it to randomize
   the access pattern and write vs. read percentage in the memstress tests.
 
 * Add a common ucall implementation; code dedup and pre-work for running
   SEV (and beyond) guests in selftests.
 
 * Provide a common constructor and arch hook, which will eventually be
   used by x86 to automatically select the right hypercall (AMD vs. Intel).
 
 * A bunch of added/enabled/fixed selftests for ARM64, covering memslots,
   breakpoints, stage-2 faults and access tracking.
 
 * x86-specific selftest changes:
 
 ** Clean up x86's page table management.
 
 ** Clean up and enhance the "smaller maxphyaddr" test, and add a related
    test to cover generic emulation failure.
 
 ** Clean up the nEPT support checks.
 
 ** Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values.
 
 ** Fix an ordering issue in the AMX test introduced by recent conversions
    to use kvm_cpu_has(), and harden the code to guard against similar bugs
    in the future.  Anything that tiggers caching of KVM's supported CPUID,
    kvm_cpu_has() in this case, effectively hides opt-in XSAVE features if
    the caching occurs before the test opts in via prctl().
 
 Documentation:
 
 * Remove deleted ioctls from documentation
 
 * Clean up the docs for the x86 MSR filter.
 
 * Various fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmOaFrcUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPemQgAq49excg2Cc+EsHnZw3vu/QWdA0Rt
 KhL3OgKxuHNjCbD2O9n2t5di7eJOTQ7F7T0eDm3xPTr4FS8LQ2327/mQePU/H2CF
 mWOpq9RBWLzFsSTeVA2Mz9TUTkYSnDHYuRsBvHyw/n9cL76BWVzjImldFtjYjjex
 yAwl8c5itKH6bc7KO+5ydswbvBzODkeYKUSBNdbn6m0JGQST7XppNwIAJvpiHsii
 Qgpk0e4Xx9q4PXG/r5DedI6BlufBsLhv0aE9SHPzyKH3JbbUFhJYI8ZD5OhBQuYW
 MwxK2KlM5Jm5ud2NZDDlsMmmvd1lnYCFDyqNozaKEWC1Y5rq1AbMa51fXA==
 =QAYX
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM64:

   - Enable the per-vcpu dirty-ring tracking mechanism, together with an
     option to keep the good old dirty log around for pages that are
     dirtied by something other than a vcpu.

   - Switch to the relaxed parallel fault handling, using RCU to delay
     page table reclaim and giving better performance under load.

   - Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping
     option, which multi-process VMMs such as crosvm rely on (see merge
     commit 382b5b87a9: "Fix a number of issues with MTE, such as
     races on the tags being initialised vs the PG_mte_tagged flag as
     well as the lack of support for VM_SHARED when KVM is involved.
     Patches from Catalin Marinas and Peter Collingbourne").

   - Merge the pKVM shadow vcpu state tracking that allows the
     hypervisor to have its own view of a vcpu, keeping that state
     private.

   - Add support for the PMUv3p5 architecture revision, bringing support
     for 64bit counters on systems that support it, and fix the
     no-quite-compliant CHAIN-ed counter support for the machines that
     actually exist out there.

   - Fix a handful of minor issues around 52bit VA/PA support (64kB
     pages only) as a prefix of the oncoming support for 4kB and 16kB
     pages.

   - Pick a small set of documentation and spelling fixes, because no
     good merge window would be complete without those.

  s390:

   - Second batch of the lazy destroy patches

   - First batch of KVM changes for kernel virtual != physical address
     support

   - Removal of a unused function

  x86:

   - Allow compiling out SMM support

   - Cleanup and documentation of SMM state save area format

   - Preserve interrupt shadow in SMM state save area

   - Respond to generic signals during slow page faults

   - Fixes and optimizations for the non-executable huge page errata
     fix.

   - Reprogram all performance counters on PMU filter change

   - Cleanups to Hyper-V emulation and tests

   - Process Hyper-V TLB flushes from a nested guest (i.e. from a L2
     guest running on top of a L1 Hyper-V hypervisor)

   - Advertise several new Intel features

   - x86 Xen-for-KVM:

      - Allow the Xen runstate information to cross a page boundary

      - Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured

      - Add support for 32-bit guests in SCHEDOP_poll

   - Notable x86 fixes and cleanups:

      - One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0).

      - Reinstate IBPB on emulated VM-Exit that was incorrectly dropped
        a few years back when eliminating unnecessary barriers when
        switching between vmcs01 and vmcs02.

      - Clean up vmread_error_trampoline() to make it more obvious that
        params must be passed on the stack, even for x86-64.

      - Let userspace set all supported bits in MSR_IA32_FEAT_CTL
        irrespective of the current guest CPUID.

      - Fudge around a race with TSC refinement that results in KVM
        incorrectly thinking a guest needs TSC scaling when running on a
        CPU with a constant TSC, but no hardware-enumerated TSC
        frequency.

      - Advertise (on AMD) that the SMM_CTL MSR is not supported

      - Remove unnecessary exports

  Generic:

   - Support for responding to signals during page faults; introduces
     new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks

  Selftests:

   - Fix an inverted check in the access tracking perf test, and restore
     support for asserting that there aren't too many idle pages when
     running on bare metal.

   - Fix build errors that occur in certain setups (unsure exactly what
     is unique about the problematic setup) due to glibc overriding
     static_assert() to a variant that requires a custom message.

   - Introduce actual atomics for clear/set_bit() in selftests

   - Add support for pinning vCPUs in dirty_log_perf_test.

   - Rename the so called "perf_util" framework to "memstress".

   - Add a lightweight psuedo RNG for guest use, and use it to randomize
     the access pattern and write vs. read percentage in the memstress
     tests.

   - Add a common ucall implementation; code dedup and pre-work for
     running SEV (and beyond) guests in selftests.

   - Provide a common constructor and arch hook, which will eventually
     be used by x86 to automatically select the right hypercall (AMD vs.
     Intel).

   - A bunch of added/enabled/fixed selftests for ARM64, covering
     memslots, breakpoints, stage-2 faults and access tracking.

   - x86-specific selftest changes:

      - Clean up x86's page table management.

      - Clean up and enhance the "smaller maxphyaddr" test, and add a
        related test to cover generic emulation failure.

      - Clean up the nEPT support checks.

      - Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values.

      - Fix an ordering issue in the AMX test introduced by recent
        conversions to use kvm_cpu_has(), and harden the code to guard
        against similar bugs in the future. Anything that tiggers
        caching of KVM's supported CPUID, kvm_cpu_has() in this case,
        effectively hides opt-in XSAVE features if the caching occurs
        before the test opts in via prctl().

  Documentation:

   - Remove deleted ioctls from documentation

   - Clean up the docs for the x86 MSR filter.

   - Various fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (361 commits)
  KVM: x86: Add proper ReST tables for userspace MSR exits/flags
  KVM: selftests: Allocate ucall pool from MEM_REGION_DATA
  KVM: arm64: selftests: Align VA space allocator with TTBR0
  KVM: arm64: Fix benign bug with incorrect use of VA_BITS
  KVM: arm64: PMU: Fix period computation for 64bit counters with 32bit overflow
  KVM: x86: Advertise that the SMM_CTL MSR is not supported
  KVM: x86: remove unnecessary exports
  KVM: selftests: Fix spelling mistake "probabalistic" -> "probabilistic"
  tools: KVM: selftests: Convert clear/set_bit() to actual atomics
  tools: Drop "atomic_" prefix from atomic test_and_set_bit()
  tools: Drop conflicting non-atomic test_and_{clear,set}_bit() helpers
  KVM: selftests: Use non-atomic clear/set bit helpers in KVM tests
  perf tools: Use dedicated non-atomic clear/set bit helpers
  tools: Take @bit as an "unsigned long" in {clear,set}_bit() helpers
  KVM: arm64: selftests: Enable single-step without a "full" ucall()
  KVM: x86: fix APICv/x2AVIC disabled when vm reboot by itself
  KVM: Remove stale comment about KVM_REQ_UNHALT
  KVM: Add missing arch for KVM_CREATE_DEVICE and KVM_{SET,GET}_DEVICE_ATTR
  KVM: Reference to kvm_userspace_memory_region in doc and comments
  KVM: Delete all references to removed KVM_SET_MEMORY_ALIAS ioctl
  ...
2022-12-15 11:12:21 -08:00
Naohiro Aota
0a3212de8a btrfs: fix trace event name typo for FLUSH_DELAYED_REFS
Fix a typo of printing FLUSH_DELAYED_REFS event in flush_space() as
FLUSH_ELAYED_REFS.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-12-15 19:17:00 +01:00
Linus Torvalds
041fae9c10 f2fs-for-6.2-rc1
In this round, we've added two features: 1) F2FS_IOC_START_ATOMIC_REPLACE and
 2) per-block age-based extent cache. 1) is a variant of the previous atomic
 write feature which guarantees a per-file atomicity. It would be more efficient
 than AtomicFile implementation in Android framework. 2) implements another type
 of extent cache in memory which keeps the per-block age in a file, so that block
 allocator could split the hot and cold data blocks more accurately.
 
 Enhancement:
  - introduce F2FS_IOC_START_ATOMIC_REPLACE
  - refactor extent_cache to add a new per-block-age-based extent cache support
  - introduce discard_urgent_util, gc_mode, max_ordered_discard sysfs knobs
  - add proc entry to show discard_plist info
  - optimize iteration over sparse directories
  - add barrier mount option
 
 Bug fix
  - avoid victim selection from previous victim section
  - fix to enable compress for newly created file if extension matches
  - set zstd compress level correctly
  - initialize locks early in f2fs_fill_super() to fix bugs reported by syzbot
  - correct i_size change for atomic writes
  - allow to read node block after shutdown
  - allow to set compression for inlined file
  - fix gc mode when gc_urgent_high_remaining is 1
  - should put a page when checking the summary info
 
 Minor fixes and various clean-ups in GC, discard, debugfs, sysfs, and doc.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAmOaTNUACgkQQBSofoJI
 UNIQnw//V7Q8DUHw5YNj04jutwXH2DNMLAmn/NJh5S6dIzy/LiywlSzVg53/0/FP
 4K577urUkIhgilRO+yncUMSnSQk7BluQvGSx4ja2AV+dpDomjxM3GwIacGzSvr7D
 VfVf8Vig10UEFrrtEEKtv1VFlYHAmo8lLpubzrZHV8aZFLHHYO2fakQhPu8BYsaz
 eGCJwxjvTZcQUPkaeG9tWto3ChI3F6PzreiQ5TztHhLWSEgw/o0qijpsc+2SthaV
 my7uGjeBY8EGPeSYbeCxRtdx8g8Qu11K3ISuDj8zBybmjG3IWOGt1CVcrY6tZbal
 aL70CMtHkMqMn03VqbpCTqBtdWNMrrw5sYSL3qXIUdXlX/2yJBh9fLAeNxKNs5Nu
 6veSb2WgYMHqIsClkAAcP0xJ8g6kodGoG60wVr4ek0Vdt4osaQqwq+bnffpwwxtQ
 F+7aRuinv+rdrHJ4CuFXAmHPKh2lBe2lTTWZEKg2RptTxZ5DhD2Qn6x1khPD2GFA
 mG2Aeiq6PVxxEeIO+w/VBCuAgpGTFV2N/ZIF8VfjFNdWiN5OGLWQNHC2KGj2G2uV
 +fA+B91txQWtjY9h72YJb2+aGIixcnLY24ni4mDgDItqtpCB4PW56W8cbnbv9Pl+
 aXAWdADqJdDyllHoVB/JQ24gr2fATJGRIDeYDnw+vPP4f5ZT5vg=
 =f00t
 -----END PGP SIGNATURE-----

Merge tag 'f2fs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "In this round, we've added two features: F2FS_IOC_START_ATOMIC_REPLACE
  and a per-block age-based extent cache.

  F2FS_IOC_START_ATOMIC_REPLACE is a variant of the previous atomic
  write feature which guarantees a per-file atomicity. It would be more
  efficient than AtomicFile implementation in Android framework.

  The per-block age-based extent cache implements another type of extent
  cache in memory which keeps the per-block age in a file, so that block
  allocator could split the hot and cold data blocks more accurately.

  Enhancements:
   - introduce F2FS_IOC_START_ATOMIC_REPLACE
   - refactor extent_cache to add a new per-block-age-based extent cache support
   - introduce discard_urgent_util, gc_mode, max_ordered_discard sysfs knobs
   - add proc entry to show discard_plist info
   - optimize iteration over sparse directories
   - add barrier mount option

  Bug fixes:
   - avoid victim selection from previous victim section
   - fix to enable compress for newly created file if extension matches
   - set zstd compress level correctly
   - initialize locks early in f2fs_fill_super() to fix bugs reported by syzbot
   - correct i_size change for atomic writes
   - allow to read node block after shutdown
   - allow to set compression for inlined file
   - fix gc mode when gc_urgent_high_remaining is 1
   - should put a page when checking the summary info

  Minor fixes and various clean-ups in GC, discard, debugfs, sysfs, and
  doc"

* tag 'f2fs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (63 commits)
  f2fs: reset wait_ms to default if any of the victims have been selected
  f2fs: fix some format WARNING in debug.c and sysfs.c
  f2fs: don't call f2fs_issue_discard_timeout() when discard_cmd_cnt is 0 in f2fs_put_super()
  f2fs: fix iostat parameter for discard
  f2fs: Fix spelling mistake in label: free_bio_enrty_cache -> free_bio_entry_cache
  f2fs: add block_age-based extent cache
  f2fs: allocate the extent_cache by default
  f2fs: refactor extent_cache to support for read and more
  f2fs: remove unnecessary __init_extent_tree
  f2fs: move internal functions into extent_cache.c
  f2fs: specify extent cache for read explicitly
  f2fs: introduce f2fs_is_readonly() for readability
  f2fs: remove F2FS_SET_FEATURE() and F2FS_CLEAR_FEATURE() macro
  f2fs: do some cleanup for f2fs module init
  MAINTAINERS: Add f2fs bug tracker link
  f2fs: remove the unused flush argument to change_curseg
  f2fs: open code allocate_segment_by_default
  f2fs: remove struct segment_allocation default_salloc_ops
  f2fs: introduce discard_urgent_util sysfs node
  f2fs: define MIN_DISCARD_GRANULARITY macro
  ...
2022-12-14 15:27:57 -08:00
Linus Torvalds
ab425febda v6.2 merge window pull request
Usual size of updates, a new driver a most of the bulk focusing on rxe:
 
 - Usual typos, style, and language updates
 
 - Driver updates for mlx5, irdma, siw, rts, srp, hfi1, hns, erdma, mlx4, srp
 
 - Lots of RXE updates
   * Improve reply error handling for bad MR operations
   * Code tidying
   * Debug printing uses common loggers
   * Remove half implemented RD related stuff
   * Support IBA's recently defined Atomic Write and Flush operations
 
 - erdma support for atomic operations
 
 - New driver "mana" for Ethernet HW available in Azure VMs. This driver
   only supports DPDK
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCY5eIggAKCRCFwuHvBreF
 YeX7AP9+l5Y9J48OmK7y/YgADNo9g05agXp3E8EuUDmBU+PREgEAigdWaJVf2oea
 IctVja0ApLW5W+wsFt8Qh+V4PMiYTAM=
 =Q5V+
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "Usual size of updates, a new driver, and most of the bulk focusing on
  rxe:

   - Usual typos, style, and language updates

   - Driver updates for mlx5, irdma, siw, rts, srp, hfi1, hns, erdma,
     mlx4, srp

   - Lots of RXE updates:
      * Improve reply error handling for bad MR operations
      * Code tidying
      * Debug printing uses common loggers
      * Remove half implemented RD related stuff
      * Support IBA's recently defined Atomic Write and Flush operations

   - erdma support for atomic operations

   - New driver 'mana' for Ethernet HW available in Azure VMs. This
     driver only supports DPDK"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (122 commits)
  IB/IPoIB: Fix queue count inconsistency for PKEY child interfaces
  RDMA: Add missed netdev_put() for the netdevice_tracker
  RDMA/rxe: Enable RDMA FLUSH capability for rxe device
  RDMA/cm: Make QP FLUSHABLE for supported device
  RDMA/rxe: Implement flush completion
  RDMA/rxe: Implement flush execution in responder side
  RDMA/rxe: Implement RC RDMA FLUSH service in requester side
  RDMA/rxe: Extend rxe packet format to support flush
  RDMA/rxe: Allow registering persistent flag for pmem MR only
  RDMA/rxe: Extend rxe user ABI to support flush
  RDMA: Extend RDMA kernel verbs ABI to support flush
  RDMA: Extend RDMA user ABI to support flush
  RDMA/rxe: Fix incorrect responder length checking
  RDMA/rxe: Fix oops with zero length reads
  RDMA/mlx5: Remove not-used IB_FLOW_SPEC_IB define
  RDMA/hns: Fix XRC caps on HIP08
  RDMA/hns: Fix error code of CMD
  RDMA/hns: Fix page size cap from firmware
  RDMA/hns: Fix PBL page MTR find
  RDMA/hns: Fix AH attr queried by query_qp
  ...
2022-12-14 09:27:13 -08:00
Linus Torvalds
e2ca6ba6ba MM patches for 6.2-rc1.
- More userfaultfs work from Peter Xu.
 
 - Several convert-to-folios series from Sidhartha Kumar and Huang Ying.
 
 - Some filemap cleanups from Vishal Moola.
 
 - David Hildenbrand added the ability to selftest anon memory COW handling.
 
 - Some cpuset simplifications from Liu Shixin.
 
 - Addition of vmalloc tracing support by Uladzislau Rezki.
 
 - Some pagecache folioifications and simplifications from Matthew Wilcox.
 
 - A pagemap cleanup from Kefeng Wang: we have VM_ACCESS_FLAGS, so use it.
 
 - Miguel Ojeda contributed some cleanups for our use of the
   __no_sanitize_thread__ gcc keyword.  This series shold have been in the
   non-MM tree, my bad.
 
 - Naoya Horiguchi improved the interaction between memory poisoning and
   memory section removal for huge pages.
 
 - DAMON cleanups and tuneups from SeongJae Park
 
 - Tony Luck fixed the handling of COW faults against poisoned pages.
 
 - Peter Xu utilized the PTE marker code for handling swapin errors.
 
 - Hugh Dickins reworked compound page mapcount handling, simplifying it
   and making it more efficient.
 
 - Removal of the autonuma savedwrite infrastructure from Nadav Amit and
   David Hildenbrand.
 
 - zram support for multiple compression streams from Sergey Senozhatsky.
 
 - David Hildenbrand reworked the GUP code's R/O long-term pinning so
   that drivers no longer need to use the FOLL_FORCE workaround which
   didn't work very well anyway.
 
 - Mel Gorman altered the page allocator so that local IRQs can remnain
   enabled during per-cpu page allocations.
 
 - Vishal Moola removed the try_to_release_page() wrapper.
 
 - Stefan Roesch added some per-BDI sysfs tunables which are used to
   prevent network block devices from dirtying excessive amounts of
   pagecache.
 
 - David Hildenbrand did some cleanup and repair work on KSM COW
   breaking.
 
 - Nhat Pham and Johannes Weiner have implemented writeback in zswap's
   zsmalloc backend.
 
 - Brian Foster has fixed a longstanding corner-case oddity in
   file[map]_write_and_wait_range().
 
 - sparse-vmemmap changes for MIPS, LoongArch and NIOS2 from Feiyang
   Chen.
 
 - Shiyang Ruan has done some work on fsdax, to make its reflink mode
   work better under xfstests.  Better, but still not perfect.
 
 - Christoph Hellwig has removed the .writepage() method from several
   filesystems.  They only need .writepages().
 
 - Yosry Ahmed wrote a series which fixes the memcg reclaim target
   beancounting.
 
 - David Hildenbrand has fixed some of our MM selftests for 32-bit
   machines.
 
 - Many singleton patches, as usual.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY5j6ZwAKCRDdBJ7gKXxA
 jkDYAP9qNeVqp9iuHjZNTqzMXkfmJPsw2kmy2P+VdzYVuQRcJgEAgoV9d7oMq4ml
 CodAgiA51qwzId3GRytIo/tfWZSezgA=
 =d19R
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - More userfaultfs work from Peter Xu

 - Several convert-to-folios series from Sidhartha Kumar and Huang Ying

 - Some filemap cleanups from Vishal Moola

 - David Hildenbrand added the ability to selftest anon memory COW
   handling

 - Some cpuset simplifications from Liu Shixin

 - Addition of vmalloc tracing support by Uladzislau Rezki

 - Some pagecache folioifications and simplifications from Matthew
   Wilcox

 - A pagemap cleanup from Kefeng Wang: we have VM_ACCESS_FLAGS, so use
   it

 - Miguel Ojeda contributed some cleanups for our use of the
   __no_sanitize_thread__ gcc keyword.

   This series should have been in the non-MM tree, my bad

 - Naoya Horiguchi improved the interaction between memory poisoning and
   memory section removal for huge pages

 - DAMON cleanups and tuneups from SeongJae Park

 - Tony Luck fixed the handling of COW faults against poisoned pages

 - Peter Xu utilized the PTE marker code for handling swapin errors

 - Hugh Dickins reworked compound page mapcount handling, simplifying it
   and making it more efficient

 - Removal of the autonuma savedwrite infrastructure from Nadav Amit and
   David Hildenbrand

 - zram support for multiple compression streams from Sergey Senozhatsky

 - David Hildenbrand reworked the GUP code's R/O long-term pinning so
   that drivers no longer need to use the FOLL_FORCE workaround which
   didn't work very well anyway

 - Mel Gorman altered the page allocator so that local IRQs can remnain
   enabled during per-cpu page allocations

 - Vishal Moola removed the try_to_release_page() wrapper

 - Stefan Roesch added some per-BDI sysfs tunables which are used to
   prevent network block devices from dirtying excessive amounts of
   pagecache

 - David Hildenbrand did some cleanup and repair work on KSM COW
   breaking

 - Nhat Pham and Johannes Weiner have implemented writeback in zswap's
   zsmalloc backend

 - Brian Foster has fixed a longstanding corner-case oddity in
   file[map]_write_and_wait_range()

 - sparse-vmemmap changes for MIPS, LoongArch and NIOS2 from Feiyang
   Chen

 - Shiyang Ruan has done some work on fsdax, to make its reflink mode
   work better under xfstests. Better, but still not perfect

 - Christoph Hellwig has removed the .writepage() method from several
   filesystems. They only need .writepages()

 - Yosry Ahmed wrote a series which fixes the memcg reclaim target
   beancounting

 - David Hildenbrand has fixed some of our MM selftests for 32-bit
   machines

 - Many singleton patches, as usual

* tag 'mm-stable-2022-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (313 commits)
  mm/hugetlb: set head flag before setting compound_order in __prep_compound_gigantic_folio
  mm: mmu_gather: allow more than one batch of delayed rmaps
  mm: fix typo in struct pglist_data code comment
  kmsan: fix memcpy tests
  mm: add cond_resched() in swapin_walk_pmd_entry()
  mm: do not show fs mm pc for VM_LOCKONFAULT pages
  selftests/vm: ksm_functional_tests: fixes for 32bit
  selftests/vm: cow: fix compile warning on 32bit
  selftests/vm: madv_populate: fix missing MADV_POPULATE_(READ|WRITE) definitions
  mm/gup_test: fix PIN_LONGTERM_TEST_READ with highmem
  mm,thp,rmap: fix races between updates of subpages_mapcount
  mm: memcg: fix swapcached stat accounting
  mm: add nodes= arg to memory.reclaim
  mm: disable top-tier fallback to reclaim on proactive reclaim
  selftests: cgroup: make sure reclaim target memcg is unprotected
  selftests: cgroup: refactor proactive reclaim code to reclaim_until()
  mm: memcg: fix stale protection of reclaim target memcg
  mm/mmap: properly unaccount memory on mas_preallocate() failure
  omfs: remove ->writepage
  jfs: remove ->writepage
  ...
2022-12-13 19:29:45 -08:00
Linus Torvalds
7e68dd7d07 Networking changes for 6.2.
Core
 ----
  - Allow live renaming when an interface is up
 
  - Add retpoline wrappers for tc, improving considerably the
    performances of complex queue discipline configurations.
 
  - Add inet drop monitor support.
 
  - A few GRO performance improvements.
 
  - Add infrastructure for atomic dev stats, addressing long standing
    data races.
 
  - De-duplicate common code between OVS and conntrack offloading
    infrastructure.
 
  - A bunch of UBSAN_BOUNDS/FORTIFY_SOURCE improvements.
 
  - Netfilter: introduce packet parser for tunneled packets
 
  - Replace IPVS timer-based estimators with kthreads to scale up
    the workload with the number of available CPUs.
 
  - Add the helper support for connection-tracking OVS offload.
 
 BPF
 ---
  - Support for user defined BPF objects: the use case is to allocate
    own objects, build own object hierarchies and use the building
    blocks to build own data structures flexibly, for example, linked
    lists in BPF.
 
  - Make cgroup local storage available to non-cgroup attached BPF
    programs.
 
  - Avoid unnecessary deadlock detection and failures wrt BPF task
    storage helpers.
 
  - A relevant bunch of BPF verifier fixes and improvements.
 
  - Veristat tool improvements to support custom filtering, sorting,
    and replay of results.
 
  - Add LLVM disassembler as default library for dumping JITed code.
 
  - Lots of new BPF documentation for various BPF maps.
 
  - Add bpf_rcu_read_{,un}lock() support for sleepable programs.
 
  - Add RCU grace period chaining to BPF to wait for the completion
    of access from both sleepable and non-sleepable BPF programs.
 
  - Add support storing struct task_struct objects as kptrs in maps.
 
  - Improve helper UAPI by explicitly defining BPF_FUNC_xxx integer
    values.
 
  - Add libbpf *_opts API-variants for bpf_*_get_fd_by_id() functions.
 
 Protocols
 ---------
  - TCP: implement Protective Load Balancing across switch links.
 
  - TCP: allow dynamically disabling TCP-MD5 static key, reverting
    back to fast[er]-path.
 
  - UDP: Introduce optional per-netns hash lookup table.
 
  - IPv6: simplify and cleanup sockets disposal.
 
  - Netlink: support different type policies for each generic
    netlink operation.
 
  - MPTCP: add MSG_FASTOPEN and FastOpen listener side support.
 
  - MPTCP: add netlink notification support for listener sockets
    events.
 
  - SCTP: add VRF support, allowing sctp sockets binding to VRF
    devices.
 
  - Add bridging MAC Authentication Bypass (MAB) support.
 
  - Extensions for Ethernet VPN bridging implementation to better
    support multicast scenarios.
 
  - More work for Wi-Fi 7 support, comprising conversion of all
    the existing drivers to internal TX queue usage.
 
  - IPSec: introduce a new offload type (packet offload) allowing
    complete header processing and crypto offloading.
 
  - IPSec: extended ack support for more descriptive XFRM error
    reporting.
 
  - RXRPC: increase SACK table size and move processing into a
    per-local endpoint kernel thread, reducing considerably the
    required locking.
 
  - IEEE 802154: synchronous send frame and extended filtering
    support, initial support for scanning available 15.4 networks.
 
  - Tun: bump the link speed from 10Mbps to 10Gbps.
 
  - Tun/VirtioNet: implement UDP segmentation offload support.
 
 Driver API
 ----------
 
  - PHY/SFP: improve power level switching between standard
    level 1 and the higher power levels.
 
  - New API for netdev <-> devlink_port linkage.
 
  - PTP: convert existing drivers to new frequency adjustment
    implementation.
 
  - DSA: add support for rx offloading.
 
  - Autoload DSA tagging driver when dynamically changing protocol.
 
  - Add new PCP and APPTRUST attributes to Data Center Bridging.
 
  - Add configuration support for 800Gbps link speed.
 
  - Add devlink port function attribute to enable/disable RoCE and
    migratable.
 
  - Extend devlink-rate to support strict prioriry and weighted fair
    queuing.
 
  - Add devlink support to directly reading from region memory.
 
  - New device tree helper to fetch MAC address from nvmem.
 
  - New big TCP helper to simplify temporary header stripping.
 
 New hardware / drivers
 ----------------------
 
  - Ethernet:
    - Marvel Octeon CNF95N and CN10KB Ethernet Switches.
    - Marvel Prestera AC5X Ethernet Switch.
    - WangXun 10 Gigabit NIC.
    - Motorcomm yt8521 Gigabit Ethernet.
    - Microchip ksz9563 Gigabit Ethernet Switch.
    - Microsoft Azure Network Adapter.
    - Linux Automation 10Base-T1L adapter.
 
  - PHY:
    - Aquantia AQR112 and AQR412.
    - Motorcomm YT8531S.
 
  - PTP:
    - Orolia ART-CARD.
 
  - WiFi:
    - MediaTek Wi-Fi 7 (802.11be) devices.
    - RealTek rtw8821cu, rtw8822bu, rtw8822cu and rtw8723du USB
      devices.
 
  - Bluetooth:
    - Broadcom BCM4377/4378/4387 Bluetooth chipsets.
    - Realtek RTL8852BE and RTL8723DS.
    - Cypress.CYW4373A0 WiFi + Bluetooth combo device.
 
 Drivers
 -------
  - CAN:
    - gs_usb: bus error reporting support.
    - kvaser_usb: listen only and bus error reporting support.
 
  - Ethernet NICs:
    - Intel (100G):
      - extend action skbedit to RX queue mapping.
      - implement devlink-rate support.
      - support direct read from memory.
    - nVidia/Mellanox (mlx5):
      - SW steering improvements, increasing rules update rate.
      - Support for enhanced events compression.
      - extend H/W offload packet manipulation capabilities.
      - implement IPSec packet offload mode.
    - nVidia/Mellanox (mlx4):
      - better big TCP support.
    - Netronome Ethernet NICs (nfp):
      - IPsec offload support.
      - add support for multicast filter.
    - Broadcom:
      - RSS and PTP support improvements.
    - AMD/SolarFlare:
      - netlink extened ack improvements.
      - add basic flower matches to offload, and related stats.
    - Virtual NICs:
      - ibmvnic: introduce affinity hint support.
    - small / embedded:
      - FreeScale fec: add initial XDP support.
      - Marvel mv643xx_eth: support MII/GMII/RGMII modes for Kirkwood.
      - TI am65-cpsw: add suspend/resume support.
      - Mediatek MT7986: add RX wireless wthernet dispatch support.
      - Realtek 8169: enable GRO software interrupt coalescing per
        default.
 
  - Ethernet high-speed switches:
    - Microchip (sparx5):
      - add support for Sparx5 TC/flower H/W offload via VCAP.
    - Mellanox mlxsw:
      - add 802.1X and MAC Authentication Bypass offload support.
      - add ip6gre support.
 
  - Embedded Ethernet switches:
    - Mediatek (mtk_eth_soc):
      - improve PCS implementation, add DSA untag support.
      - enable flow offload support.
    - Renesas:
      - add rswitch R-Car Gen4 gPTP support.
    - Microchip (lan966x):
      - add full XDP support.
      - add TC H/W offload via VCAP.
      - enable PTP on bridge interfaces.
    - Microchip (ksz8):
      - add MTU support for KSZ8 series.
 
  - Qualcomm 802.11ax WiFi (ath11k):
    - support configuring channel dwell time during scan.
 
  - MediaTek WiFi (mt76):
    - enable Wireless Ethernet Dispatch (WED) offload support.
    - add ack signal support.
    - enable coredump support.
    - remain_on_channel support.
 
  - Intel WiFi (iwlwifi):
    - enable Wi-Fi 7 Extremely High Throughput (EHT) PHY capabilities.
    - 320 MHz channels support.
 
  - RealTek WiFi (rtw89):
    - new dynamic header firmware format support.
    - wake-over-WLAN support.
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmOYXUcSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOk8zQP/R7BZtbJMTPiWkRnSoKHnAyupDVwrz5U
 ktukLkwPsCyJuEbAjgxrxf4EEEQ9uq2FFlxNSYuKiiQMqIpFxV6KED7LCUygn4Tc
 kxtkp0Q+5XiqisWlQmtfExf2OjuuPqcjV9tWCDBI6GebKUbfNwY/eI44RcMu4BSv
 DzIlW5GkX/kZAPqnnuqaLsN3FudDTJHGEAD7NbA++7wJ076RWYSLXlFv0Z+SCSPS
 H8/PEG0/ZK/65rIWMAFRClJ9BNIDwGVgp0GrsIvs1gqbRUOlA1hl1rDM21TqtNFf
 5QPQT7sIfTcCE/nerxKJD5JE3JyP+XRlRn96PaRw3rt4MgI6I/EOj/HOKQ5tMCNc
 oPiqb7N70+hkLZyr42qX+vN9eDPjp2koEQm7EO2Zs+/534/zWDs24Zfk/Aa1ps0I
 Fa82oGjAgkBhGe/FZ6i5cYoLcyxqRqZV1Ws9XQMl72qRC7/BwvNbIW6beLpCRyeM
 yYIU+0e9dEm+wHQEdh2niJuVtR63hy8tvmPx56lyh+6u0+pondkwbfSiC5aD3kAC
 ikKsN5DyEsdXyiBAlytCEBxnaOjQy4RAz+3YXSiS0eBNacXp03UUrNGx4Pzpu/D0
 QLFJhBnMFFCgy5to8/DvKnrTPgZdSURwqbIUcZdvU21f1HLR8tUTpaQnYffc/Whm
 V8gnt1EL+0cc
 =CbJC
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Paolo Abeni:
 "Core:

   - Allow live renaming when an interface is up

   - Add retpoline wrappers for tc, improving considerably the
     performances of complex queue discipline configurations

   - Add inet drop monitor support

   - A few GRO performance improvements

   - Add infrastructure for atomic dev stats, addressing long standing
     data races

   - De-duplicate common code between OVS and conntrack offloading
     infrastructure

   - A bunch of UBSAN_BOUNDS/FORTIFY_SOURCE improvements

   - Netfilter: introduce packet parser for tunneled packets

   - Replace IPVS timer-based estimators with kthreads to scale up the
     workload with the number of available CPUs

   - Add the helper support for connection-tracking OVS offload

  BPF:

   - Support for user defined BPF objects: the use case is to allocate
     own objects, build own object hierarchies and use the building
     blocks to build own data structures flexibly, for example, linked
     lists in BPF

   - Make cgroup local storage available to non-cgroup attached BPF
     programs

   - Avoid unnecessary deadlock detection and failures wrt BPF task
     storage helpers

   - A relevant bunch of BPF verifier fixes and improvements

   - Veristat tool improvements to support custom filtering, sorting,
     and replay of results

   - Add LLVM disassembler as default library for dumping JITed code

   - Lots of new BPF documentation for various BPF maps

   - Add bpf_rcu_read_{,un}lock() support for sleepable programs

   - Add RCU grace period chaining to BPF to wait for the completion of
     access from both sleepable and non-sleepable BPF programs

   - Add support storing struct task_struct objects as kptrs in maps

   - Improve helper UAPI by explicitly defining BPF_FUNC_xxx integer
     values

   - Add libbpf *_opts API-variants for bpf_*_get_fd_by_id() functions

  Protocols:

   - TCP: implement Protective Load Balancing across switch links

   - TCP: allow dynamically disabling TCP-MD5 static key, reverting back
     to fast[er]-path

   - UDP: Introduce optional per-netns hash lookup table

   - IPv6: simplify and cleanup sockets disposal

   - Netlink: support different type policies for each generic netlink
     operation

   - MPTCP: add MSG_FASTOPEN and FastOpen listener side support

   - MPTCP: add netlink notification support for listener sockets events

   - SCTP: add VRF support, allowing sctp sockets binding to VRF devices

   - Add bridging MAC Authentication Bypass (MAB) support

   - Extensions for Ethernet VPN bridging implementation to better
     support multicast scenarios

   - More work for Wi-Fi 7 support, comprising conversion of all the
     existing drivers to internal TX queue usage

   - IPSec: introduce a new offload type (packet offload) allowing
     complete header processing and crypto offloading

   - IPSec: extended ack support for more descriptive XFRM error
     reporting

   - RXRPC: increase SACK table size and move processing into a
     per-local endpoint kernel thread, reducing considerably the
     required locking

   - IEEE 802154: synchronous send frame and extended filtering support,
     initial support for scanning available 15.4 networks

   - Tun: bump the link speed from 10Mbps to 10Gbps

   - Tun/VirtioNet: implement UDP segmentation offload support

  Driver API:

   - PHY/SFP: improve power level switching between standard level 1 and
     the higher power levels

   - New API for netdev <-> devlink_port linkage

   - PTP: convert existing drivers to new frequency adjustment
     implementation

   - DSA: add support for rx offloading

   - Autoload DSA tagging driver when dynamically changing protocol

   - Add new PCP and APPTRUST attributes to Data Center Bridging

   - Add configuration support for 800Gbps link speed

   - Add devlink port function attribute to enable/disable RoCE and
     migratable

   - Extend devlink-rate to support strict prioriry and weighted fair
     queuing

   - Add devlink support to directly reading from region memory

   - New device tree helper to fetch MAC address from nvmem

   - New big TCP helper to simplify temporary header stripping

  New hardware / drivers:

   - Ethernet:
      - Marvel Octeon CNF95N and CN10KB Ethernet Switches
      - Marvel Prestera AC5X Ethernet Switch
      - WangXun 10 Gigabit NIC
      - Motorcomm yt8521 Gigabit Ethernet
      - Microchip ksz9563 Gigabit Ethernet Switch
      - Microsoft Azure Network Adapter
      - Linux Automation 10Base-T1L adapter

   - PHY:
      - Aquantia AQR112 and AQR412
      - Motorcomm YT8531S

   - PTP:
      - Orolia ART-CARD

   - WiFi:
      - MediaTek Wi-Fi 7 (802.11be) devices
      - RealTek rtw8821cu, rtw8822bu, rtw8822cu and rtw8723du USB
        devices

   - Bluetooth:
      - Broadcom BCM4377/4378/4387 Bluetooth chipsets
      - Realtek RTL8852BE and RTL8723DS
      - Cypress.CYW4373A0 WiFi + Bluetooth combo device

  Drivers:

   - CAN:
      - gs_usb: bus error reporting support
      - kvaser_usb: listen only and bus error reporting support

   - Ethernet NICs:
      - Intel (100G):
         - extend action skbedit to RX queue mapping
         - implement devlink-rate support
         - support direct read from memory
      - nVidia/Mellanox (mlx5):
         - SW steering improvements, increasing rules update rate
         - Support for enhanced events compression
         - extend H/W offload packet manipulation capabilities
         - implement IPSec packet offload mode
      - nVidia/Mellanox (mlx4):
         - better big TCP support
      - Netronome Ethernet NICs (nfp):
         - IPsec offload support
         - add support for multicast filter
      - Broadcom:
         - RSS and PTP support improvements
      - AMD/SolarFlare:
         - netlink extened ack improvements
         - add basic flower matches to offload, and related stats
      - Virtual NICs:
         - ibmvnic: introduce affinity hint support
      - small / embedded:
         - FreeScale fec: add initial XDP support
         - Marvel mv643xx_eth: support MII/GMII/RGMII modes for Kirkwood
         - TI am65-cpsw: add suspend/resume support
         - Mediatek MT7986: add RX wireless wthernet dispatch support
         - Realtek 8169: enable GRO software interrupt coalescing per
           default

   - Ethernet high-speed switches:
      - Microchip (sparx5):
         - add support for Sparx5 TC/flower H/W offload via VCAP
      - Mellanox mlxsw:
         - add 802.1X and MAC Authentication Bypass offload support
         - add ip6gre support

   - Embedded Ethernet switches:
      - Mediatek (mtk_eth_soc):
         - improve PCS implementation, add DSA untag support
         - enable flow offload support
      - Renesas:
         - add rswitch R-Car Gen4 gPTP support
      - Microchip (lan966x):
         - add full XDP support
         - add TC H/W offload via VCAP
         - enable PTP on bridge interfaces
      - Microchip (ksz8):
         - add MTU support for KSZ8 series

   - Qualcomm 802.11ax WiFi (ath11k):
      - support configuring channel dwell time during scan

   - MediaTek WiFi (mt76):
      - enable Wireless Ethernet Dispatch (WED) offload support
      - add ack signal support
      - enable coredump support
      - remain_on_channel support

   - Intel WiFi (iwlwifi):
      - enable Wi-Fi 7 Extremely High Throughput (EHT) PHY capabilities
      - 320 MHz channels support

   - RealTek WiFi (rtw89):
      - new dynamic header firmware format support
      - wake-over-WLAN support"

* tag 'net-next-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2002 commits)
  ipvs: fix type warning in do_div() on 32 bit
  net: lan966x: Remove a useless test in lan966x_ptp_add_trap()
  net: ipa: add IPA v4.7 support
  dt-bindings: net: qcom,ipa: Add SM6350 compatible
  bnxt: Use generic HBH removal helper in tx path
  IPv6/GRO: generic helper to remove temporary HBH/jumbo header in driver
  selftests: forwarding: Add bridge MDB test
  selftests: forwarding: Rename bridge_mdb test
  bridge: mcast: Support replacement of MDB port group entries
  bridge: mcast: Allow user space to specify MDB entry routing protocol
  bridge: mcast: Allow user space to add (*, G) with a source list and filter mode
  bridge: mcast: Add support for (*, G) with a source list and filter mode
  bridge: mcast: Avoid arming group timer when (S, G) corresponds to a source
  bridge: mcast: Add a flag for user installed source entries
  bridge: mcast: Expose __br_multicast_del_group_src()
  bridge: mcast: Expose br_multicast_new_group_src()
  bridge: mcast: Add a centralized error path
  bridge: mcast: Place netlink policy before validation functions
  bridge: mcast: Split (*, G) and (S, G) addition into different functions
  bridge: mcast: Do not derive entry type from its filter mode
  ...
2022-12-13 15:47:48 -08:00
Linus Torvalds
0015edd6f6 A pile of clk driver updates with a small tracepoint patch to the clk core this
time around. The core framework is effectively unchanged, with the majority of
 the diff going to the Qualcomm clk driver directory because they added two 3k
 line files that are almost all clk data (Abel Vesa from Linaro tried to shrink
 the number of lines down, but it doesn't seem to be possible without
 sacrificing readability). The second big driver this time around is the
 Rockchip rk3588 clk and reset unit, at _only_ 2.5k lines.
 
 Ignoring the big clk drivers from the familiar SoC vendors, there's just a
 bunch of little clk driver updates and fixes throughout here. It's the usual
 set of clk data fixups to describe proper parents, or add frequencies to
 frequency tables, or plug memory leaks when function calls fail. Also, some
 drivers are converted to use modern clk_hw APIs, which is always nice to see.
 And data is deduplicated, leading to a smaller kernel Image. Overall this batch
 has a larger collection of cleanups than it typically does. Maybe that means
 there are less new SoCs right now that need supporting, and the focus has
 shifted to quality and reliability. I can dream.
 
 New Drivers:
  - Frequency hopping controller hardware on MediaTek MT8186
  - Global clock controller for Qualcomm SM8550
  - Display clock controller for Qualcomm SC8280XP
  - RPMh clock controller for Qualcomm QDU1000 and QRU1000 SoCs
  - CPU PLL on MStar/SigmaStar SoCs
  - Support for the clock and reset unit of the Rockchip rk3588
 
 Updates:
  - Tracepoints for clk_rate_request structures
  - Debugfs support for fractional divider clk
  - Make MxL's CGU driver secure compatible
  - Ingenic JZ4755 SoC clk support
  - Support audio clks on X1000 SoCs
  - Remove flags from univ/main/syspll child fixed factor clocks across
    MediaTek platforms
  - Fix clock dependency for ADC on MediaTek MT7986
  - Fix parent for FlexSPI clock for i.MX93
  - Add USB suspend clock on i.MX8MP
  - Unmap anatop base on error for i.MX93 driver
  - Change enet clock parent to wakeup_axi_root for i.MX93
  - Drop LPIT1, LPIT2, TPM1 and TPM3 clocks for i.MX93
  - Mark HSIO bus clock and SYS_CNT clock as critical on i.MX93
  - Add 320MHz and 640MHz entries to PLL146x
  - Add audio shared gate and SAI clocks for i.MX8MP
  - Fix a possible memory leak in the error path of rockchip PLL creation
  - Fix header guard for V3S clocks
  - Add IR module clock for f1c100s
  - Correct the parent clocks for the (High Speed) Serial Communication
    Interfaces with FIFO ((H)SCIF) modules and the mixed-up Ethernet
    Switch clocks on Renesas R-Car S4-8
  - Add timer (TMU, CMT) and Cortex-A76 CPU core (Z0) clocks on Renesas
    R-Car V4H
  - Two PLL driver fixups for the Amlogic clk driver
  - Round SD clock rate to improve parent clock selection
  - Add Ethernet Switch and internal SASYNCPER clocks on Renesas R-Car
    S4-8
  - Add DMA (SYS-DMAC), SPI (MSIOF), external interrupt (INTC-EX) serial
    (SCIF), PWM (PWM and TPU), SDHI, and HyperFLASH/QSPI (RPC-IF) clocks
    on Renesas R-Car V4H
  - Add Multi-Function Timer Pulse Unit (MTU3a) clock and reset on
    Renesas RZ/G2L
  - Fix endless loop on Renesas RZ/N1
  - Correct the parent clocks for the High Speed Serial Communication
    Interfaces with FIFO (HSCIF) modules on the Renesas R-Car V4H SoC
    Note: HSCIF0 is used for the serial console on the White-Hawk
    development board
  - Various clk DT binding improvements and conversions to YAML
  - Qualcomm SM8150/SM8250 display clock controller cleaned up
  - Some missing clocks for Qualcomm SM8350 added
  - Qualcomm MSM8974 Global and Multimedia clock controllers transitioned
    to parent_data and parent_hws
  - Use parent_data and add network resets for Qualcomm IPQ8074
  - Qualcomm Krait clock controller modernized
  - Fix pm_runtime usage in Qualcomm SC7180 and SC7280 LPASS clock
    controllers
  - Enable retention mode on Qualcomm SM8250 USB GDSCs
  - Cleanup Qualcomm RPM and RPMh clock drivers to avoid duplicating
    clocks which definition could be shared between platforms
  - Various NULL pointer checks added for allocations
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAmOXq7wRHHNib3lkQGtl
 cm5lbC5vcmcACgkQrQKIl8bklSU2sg/+JIguM/vYw92d3hGePFKaz5lmFXSjzRXp
 HMbbnuclAzc/C7jKGwypP2GMdVxOPvzxG4cW9Q25cTw4SuELg2nIBn9UvRteCEDA
 uGf8h0Xw/sJfyRhZbAlnbLxtn3qntQL8F2VbPJ+umDYnghD0Mq0WBMeHEoeXGXpb
 PVdEYsgpHo3EbgCL8rjErw9XDHBTGRgNXPounpKjD3Kwmj+CXWgopsma7Hzf2G/6
 VxBbcxDZA6OaEzJAKGVeIHBYLwY0aGPP2ouC2RQDBzSb7n6PjqDkOCdP6w1ab9Nl
 XehAup5p5Zgd314YgQlE9BoXwhXanZyVT88D6WbfN+qjksDm9n+W+5O9suN2eLrt
 h+YgmFdUAESUAJTbIyF6tiLUEIDKjKrJyU+HZX0peOhGIYbw3fMUACR+JrCbmCCZ
 rTTOWh92q7v39to+QIFsKwtVLl9IlRTCaA3tbhv/FH2gplJlOhvPgulAfV+JRtTZ
 1YND5adsFNsc69ZK8TTT2NzXUnU0XhocNNL1SegYXZpfHoNmg5CUQiPYMMASCJcI
 V1+qznLUeUUonkhexFTMrJHGL4e4ITzESi7IOTVcJ6Wco+gXOrOMHfONbahEsCYn
 UQIPC9tw9qwV6D3Sf9C8zFtBP26w7+UuJ8ZFpmhpf+fevF5i2TsG6x7Y31mlxzww
 OZ+r5dsauc4=
 =6vbl
 -----END PGP SIGNATURE-----

Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk driver updates from Stephen Boyd:
 "A pile of clk driver updates with a small tracepoint patch to the clk
  core this time around.

  The core framework is effectively unchanged, with the majority of the
  diff going to the Qualcomm clk driver directory because they added two
  3k line files that are almost all clk data (Abel Vesa from Linaro
  tried to shrink the number of lines down, but it doesn't seem to be
  possible without sacrificing readability).

  The second big driver this time around is the Rockchip rk3588 clk and
  reset unit, at _only_ 2.5k lines.

  Ignoring the big clk drivers from the familiar SoC vendors, there's
  just a bunch of little clk driver updates and fixes throughout here.

  It's the usual set of clk data fixups to describe proper parents, or
  add frequencies to frequency tables, or plug memory leaks when
  function calls fail. Also, some drivers are converted to use modern
  clk_hw APIs, which is always nice to see. And data is deduplicated,
  leading to a smaller kernel Image.

  Overall this batch has a larger collection of cleanups than it
  typically does. Maybe that means there are less new SoCs right now
  that need supporting, and the focus has shifted to quality and
  reliability. I can dream.

  New Drivers:
   - Frequency hopping controller hardware on MediaTek MT8186
   - Global clock controller for Qualcomm SM8550
   - Display clock controller for Qualcomm SC8280XP
   - RPMh clock controller for Qualcomm QDU1000 and QRU1000 SoCs
   - CPU PLL on MStar/SigmaStar SoCs
   - Support for the clock and reset unit of the Rockchip rk3588

  Updates:
   - Tracepoints for clk_rate_request structures
   - Debugfs support for fractional divider clk
   - Make MxL's CGU driver secure compatible
   - Ingenic JZ4755 SoC clk support
   - Support audio clks on X1000 SoCs
   - Remove flags from univ/main/syspll child fixed factor clocks across
     MediaTek platforms
   - Fix clock dependency for ADC on MediaTek MT7986
   - Fix parent for FlexSPI clock for i.MX93
   - Add USB suspend clock on i.MX8MP
   - Unmap anatop base on error for i.MX93 driver
   - Change enet clock parent to wakeup_axi_root for i.MX93
   - Drop LPIT1, LPIT2, TPM1 and TPM3 clocks for i.MX93
   - Mark HSIO bus clock and SYS_CNT clock as critical on i.MX93
   - Add 320MHz and 640MHz entries to PLL146x
   - Add audio shared gate and SAI clocks for i.MX8MP
   - Fix a possible memory leak in the error path of rockchip PLL
     creation
   - Fix header guard for V3S clocks
   - Add IR module clock for f1c100s
   - Correct the parent clocks for the (High Speed) Serial Communication
     Interfaces with FIFO ((H)SCIF) modules and the mixed-up Ethernet
     Switch clocks on Renesas R-Car S4-8
   - Add timer (TMU, CMT) and Cortex-A76 CPU core (Z0) clocks on Renesas
     R-Car V4H
   - Two PLL driver fixups for the Amlogic clk driver
   - Round SD clock rate to improve parent clock selection
   - Add Ethernet Switch and internal SASYNCPER clocks on Renesas R-Car
     S4-8
   - Add DMA (SYS-DMAC), SPI (MSIOF), external interrupt (INTC-EX)
     serial (SCIF), PWM (PWM and TPU), SDHI, and HyperFLASH/QSPI
     (RPC-IF) clocks on Renesas R-Car V4H
   - Add Multi-Function Timer Pulse Unit (MTU3a) clock and reset on
     Renesas RZ/G2L
   - Fix endless loop on Renesas RZ/N1
   - Correct the parent clocks for the High Speed Serial Communication
     Interfaces with FIFO (HSCIF) modules on the Renesas R-Car V4H SoC
     Note: HSCIF0 is used for the serial console on the White-Hawk
     development board
   - Various clk DT binding improvements and conversions to YAML
   - Qualcomm SM8150/SM8250 display clock controller cleaned up
   - Some missing clocks for Qualcomm SM8350 added
   - Qualcomm MSM8974 Global and Multimedia clock controllers
     transitioned to parent_data and parent_hws
   - Use parent_data and add network resets for Qualcomm IPQ8074
   - Qualcomm Krait clock controller modernized
   - Fix pm_runtime usage in Qualcomm SC7180 and SC7280 LPASS clock
     controllers
   - Enable retention mode on Qualcomm SM8250 USB GDSCs
   - Cleanup Qualcomm RPM and RPMh clock drivers to avoid duplicating
     clocks which definition could be shared between platforms
   - Various NULL pointer checks added for allocations"

* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (188 commits)
  clk: nomadik: correct struct name kernel-doc warning
  clk: lmk04832: fix kernel-doc warnings
  clk: lmk04832: drop superfluous #include
  clk: lmk04832: drop unnecessary semicolons
  clk: lmk04832: declare variables as const when possible
  clk: socfpga: Fix memory leak in socfpga_gate_init()
  clk: microchip: enable the MPFS clk driver by default if SOC_MICROCHIP_POLARFIRE
  clk: st: Fix memory leak in st_of_quadfs_setup()
  clk: samsung: Fix memory leak in _samsung_clk_register_pll()
  clk: Add trace events for rate requests
  clk: Store clk_core for clk_rate_request
  clk: qcom: rpmh: add support for SM6350 rpmh IPA clock
  clk: qcom: mmcc-msm8974: use parent_hws/_data instead of parent_names
  clk: qcom: mmcc-msm8974: move clock parent tables down
  clk: qcom: mmcc-msm8974: use ARRAY_SIZE instead of specifying num_parents
  clk: qcom: gcc-msm8974: use parent_hws/_data instead of parent_names
  clk: qcom: gcc-msm8974: move clock parent tables down
  clk: qcom: gcc-msm8974: use ARRAY_SIZE instead of specifying num_parents
  dt-bindings: clocks: qcom,mmcc: define clocks/clock-names for MSM8974
  dt-bindings: clock: split qcom,gcc-msm8974,-msm8226 to the separate file
  ...
2022-12-13 13:46:07 -08:00
Steven Rostedt (Google)
fab89a09c8 tracing: Remove pointer (asterisk) and brackets from cpumask_t field
To differentiate between long arrays and cpumasks, the __cpumask() field
was created. Part of the TRACE_EVENT() macros test if the type is signed
or not by using the is_signed_type() macro. The __cpumask() field used the
__dynamic_array() helper but because cpumask_t is a structure, it could
not be used in the is_signed_type() macro as that would fail to build, so
instead it passed in the pointer to cpumask_t.

Unfortunately, that creates in the format file:

  field:__data_loc cpumask_t *[] mask;    offset:36;      size:4; signed:0;

Which looks like an array of pointers to cpumask_t and not a cpumask_t
type, which is misleading to user space parsers.

Douglas Raillard pointed out that the "[]" are also misleading, as
cpumask_t is not an array.

Since cpumask() hasn't been created yet, and the parsers currently fail on
it (but will still produce the raw output), make it be:

  field:__data_loc cpumask_t mask;    offset:36;      size:4; signed:0;

Which is the correct type of the field.

Then the parsers can be updated to handle this.

Link: https://lore.kernel.org/lkml/6dda5e1d-9416-b55e-88f3-31d148bc925f@arm.com/
Link: https://lore.kernel.org/linux-trace-kernel/20221212193814.0e3f1e43@gandalf.local.home

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Fixes: 8230f27b1c ("tracing: Add __cpumask to denote a trace event field that is a cpumask_t")
Reported-by: Douglas Raillard <douglas.raillard@arm.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-12-13 15:46:35 -05:00
Linus Torvalds
ce8a79d560 for-6.2/block-2022-12-08
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmOScsgQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpi5ID/9pLXFYOq1+uDjU0KO/MdjMjK8Ukr34lCnk
 WkajRLheE8JBKOFDE54XJk56sQSZHX9bTWqziar0h1fioh7FlQR/tVvzsERCm2M9
 2y9THJNJygC68wgybStyiKlshFjl7TD7Kv5N9Y3xP3mkQygT+D6o8fXZk5xQbYyH
 YdFSoq4rJVHxRL03yzQiReGGIYdOUEQQh8l1FiLwLlKa3lXAey1KuxWIzksVN0KK
 aZB4QhiBpOiPgDHUVisq2XtyQjpZ2byoCImPzgrcqk9Jo4esvm/e6esrg4xlsvII
 LKFFkTmbVqjUZtFjqakFHmfuzVor4nU5f+xb90ZHExuuODYckkxWp5rWhf9QwqqI
 0ik6WYgI1/5vnHnX8f2DYzOFQf9qa/rLgg0CshyUODlD6RfHa9vntqYvlIFkmOBd
 Q7KblIoK8YTzUS1M+v7X8JQ7gDR2KwygH37Da2KJS+vgvfIb8kJGr1ZORuhJuJJ7
 Bl69gaNkHTHrqufp7UI64YXfueeuNu2J9z3zwzGoxeaFaofF/phDn0/2gCQE1fQI
 XBhsMw+ETqI6B2SPHMnzYDu2DM1S8ZTOYQlaD4G3uqgWnAM1tG707395uAy5yu4n
 D5azU1fVG4UocoNIyPujpaoSRs2zWZycEFEeUQkhyDDww/j4hlHi6H33eOnk0zsr
 wxzFGfvHfw==
 =k/vv
 -----END PGP SIGNATURE-----

Merge tag 'for-6.2/block-2022-12-08' of git://git.kernel.dk/linux

Pull block updates from Jens Axboe:

 - NVMe pull requests via Christoph:
      - Support some passthrough commands without CAP_SYS_ADMIN (Kanchan
        Joshi)
      - Refactor PCIe probing and reset (Christoph Hellwig)
      - Various fabrics authentication fixes and improvements (Sagi
        Grimberg)
      - Avoid fallback to sequential scan due to transient issues (Uday
        Shankar)
      - Implement support for the DEAC bit in Write Zeroes (Christoph
        Hellwig)
      - Allow overriding the IEEE OUI and firmware revision in configfs
        for nvmet (Aleksandr Miloserdov)
      - Force reconnect when number of queue changes in nvmet (Daniel
        Wagner)
      - Minor fixes and improvements (Uros Bizjak, Joel Granados, Sagi
        Grimberg, Christoph Hellwig, Christophe JAILLET)
      - Fix and cleanup nvme-fc req allocation (Chaitanya Kulkarni)
      - Use the common tagset helpers in nvme-pci driver (Christoph
        Hellwig)
      - Cleanup the nvme-pci removal path (Christoph Hellwig)
      - Use kstrtobool() instead of strtobool (Christophe JAILLET)
      - Allow unprivileged passthrough of Identify Controller (Joel
        Granados)
      - Support io stats on the mpath device (Sagi Grimberg)
      - Minor nvmet cleanup (Sagi Grimberg)

 - MD pull requests via Song:
      - Code cleanups (Christoph)
      - Various fixes

 - Floppy pull request from Denis:
      - Fix a memory leak in the init error path (Yuan)

 - Series fixing some batch wakeup issues with sbitmap (Gabriel)

 - Removal of the pktcdvd driver that was deprecated more than 5 years
   ago, and subsequent removal of the devnode callback in struct
   block_device_operations as no users are now left (Greg)

 - Fix for partition read on an exclusively opened bdev (Jan)

 - Series of elevator API cleanups (Jinlong, Christoph)

 - Series of fixes and cleanups for blk-iocost (Kemeng)

 - Series of fixes and cleanups for blk-throttle (Kemeng)

 - Series adding concurrent support for sync queues in BFQ (Yu)

 - Series bringing drbd a bit closer to the out-of-tree maintained
   version (Christian, Joel, Lars, Philipp)

 - Misc drbd fixes (Wang)

 - blk-wbt fixes and tweaks for enable/disable (Yu)

 - Fixes for mq-deadline for zoned devices (Damien)

 - Add support for read-only and offline zones for null_blk
   (Shin'ichiro)

 - Series fixing the delayed holder tracking, as used by DM (Yu,
   Christoph)

 - Series enabling bio alloc caching for IRQ based IO (Pavel)

 - Series enabling userspace peer-to-peer DMA (Logan)

 - BFQ waker fixes (Khazhismel)

 - Series fixing elevator refcount issues (Christoph, Jinlong)

 - Series cleaning up references around queue destruction (Christoph)

 - Series doing quiesce by tagset, enabling cleanups in drivers
   (Christoph, Chao)

 - Series untangling the queue kobject and queue references (Christoph)

 - Misc fixes and cleanups (Bart, David, Dawei, Jinlong, Kemeng, Ye,
   Yang, Waiman, Shin'ichiro, Randy, Pankaj, Christoph)

* tag 'for-6.2/block-2022-12-08' of git://git.kernel.dk/linux: (247 commits)
  blktrace: Fix output non-blktrace event when blk_classic option enabled
  block: sed-opal: Don't include <linux/kernel.h>
  sed-opal: allow using IOC_OPAL_SAVE for locking too
  blk-cgroup: Fix typo in comment
  block: remove bio_set_op_attrs
  nvmet: don't open-code NVME_NS_ATTR_RO enumeration
  nvme-pci: use the tagset alloc/free helpers
  nvme: add the Apple shared tag workaround to nvme_alloc_io_tag_set
  nvme: only set reserved_tags in nvme_alloc_io_tag_set for fabrics controllers
  nvme: consolidate setting the tagset flags
  nvme: pass nr_maps explicitly to nvme_alloc_io_tag_set
  block: bio_copy_data_iter
  nvme-pci: split out a nvme_pci_ctrl_is_dead helper
  nvme-pci: return early on ctrl state mismatch in nvme_reset_work
  nvme-pci: rename nvme_disable_io_queues
  nvme-pci: cleanup nvme_suspend_queue
  nvme-pci: remove nvme_pci_disable
  nvme-pci: remove nvme_disable_admin_queue
  nvme: merge nvme_shutdown_ctrl into nvme_disable_ctrl
  nvme: use nvme_wait_ready in nvme_shutdown_ctrl
  ...
2022-12-13 10:43:59 -08:00
Linus Torvalds
764822972d NFSD 6.2 Release Notes
This release introduces support for the CB_RECALL_ANY operation.
 NFSD can send this operation to request that clients return any
 delegations they choose. The server uses this operation to handle
 low memory scenarios or indicate to a client when that client has
 reached the maximum number of delegations the server supports.
 
 The NFSv4.2 READ_PLUS operation has been simplified temporarily
 whilst support for sparse files in local filesystems and the VFS is
 improved.
 
 Two major data structure fixes appear in this release:
 
 * The nfs4_file hash table is replaced with a resizable hash table
   to reduce the latency of NFSv4 OPEN operations.
 
 * Reference counting in the NFSD filecache has been hardened against
   races.
 
 In furtherance of removing support for NFSv2 in a subsequent kernel
 release, a new Kconfig option enables server-side support for NFSv2
 to be left out of a kernel build.
 
 MAINTAINERS has been updated to indicate that changes to fs/exportfs
 should go through the NFSD tree.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmOXPE4ACgkQM2qzM29m
 f5faaRAAh7YT5V61afPbfgBybO5AbDzztpZSNjNjLZs78piSnFp6hP75yNtTviwQ
 1o7St13/NkCmDaIdGUpr02U01zbM1BDOq2wGckImOJLNSgb7xHV5r4PqkRiFkh0t
 QYSnwG+wp8fDUJeCL/nAOAu9I9EQUqHzWchxiU/h8ln2hN3rXUlIRSeo17Wy7zkD
 cNIcoAjTi9fzY3dE6H4r+lZTdNCYH+AdzChmKrHdRZQwq0Xs3FWv4gAMTLbDuD4P
 B6NDHz0Umn6XnFsJGptwozkwaWeMQw4GyJj/3iUiO8JF209SaoYXMPjJAyG6tYYa
 fUrgv4UXGeXjigDbLBA5IYxfhX7GXjMQSaj3edhzyrl8P74q4/Cq/8fDUnAZ841m
 E+TGSCPIQD0QuIjdXxLv9KLY8JNThSfcAt6jr5GBXhPZQr8xpS0BqK/Onr68fgZC
 Lpull5xN68L4A1B7cf2GNPuMyvkBKxwSGXOehldh/BkvpVMjFwqd4/q5xWC+6CcQ
 tbOkjTbbSS71nzJwZip0NphaYCa3qQPzKT4SZzn/I4I9W5otbwYBx734Bw46gTDE
 ZPUXTuJ00VPgX07wbLRahg521Fwzr+8sk1WnVYq82PoaMh1l9FjzLNGouQWBdo3E
 UzIo/KUfQKmoZce6O723L6OI4ffdK5oMtfaTpe+SiUPpV1lUAcA=
 =jNlu
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd updates from Chuck Lever:
 "This release introduces support for the CB_RECALL_ANY operation. NFSD
  can send this operation to request that clients return any delegations
  they choose. The server uses this operation to handle low memory
  scenarios or indicate to a client when that client has reached the
  maximum number of delegations the server supports.

  The NFSv4.2 READ_PLUS operation has been simplified temporarily whilst
  support for sparse files in local filesystems and the VFS is improved.

  Two major data structure fixes appear in this release:

   - The nfs4_file hash table is replaced with a resizable hash table to
     reduce the latency of NFSv4 OPEN operations.

   - Reference counting in the NFSD filecache has been hardened against
     races.

  In furtherance of removing support for NFSv2 in a subsequent kernel
  release, a new Kconfig option enables server-side support for NFSv2 to
  be left out of a kernel build.

  MAINTAINERS has been updated to indicate that changes to fs/exportfs
  should go through the NFSD tree"

* tag 'nfsd-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (49 commits)
  NFSD: Avoid clashing function prototypes
  SUNRPC: Fix crasher in unwrap_integ_data()
  SUNRPC: Make the svc_authenticate tracepoint conditional
  NFSD: Use only RQ_DROPME to signal the need to drop a reply
  SUNRPC: Clean up xdr_write_pages()
  SUNRPC: Don't leak netobj memory when gss_read_proxy_verf() fails
  NFSD: add CB_RECALL_ANY tracepoints
  NFSD: add delegation reaper to react to low memory condition
  NFSD: add support for sending CB_RECALL_ANY
  NFSD: refactoring courtesy_client_reaper to a generic low memory shrinker
  trace: Relocate event helper files
  NFSD: pass range end to vfs_fsync_range() instead of count
  lockd: fix file selection in nlmsvc_cancel_blocked
  lockd: ensure we use the correct file descriptor when unlocking
  lockd: set missing fl_flags field when retrieving args
  NFSD: Use struct_size() helper in alloc_session()
  nfsd: return error if nfs4_setacl fails
  lockd: set other missing fields when unlocking files
  NFSD: Add an nfsd_file_fsync tracepoint
  sunrpc: svc: Remove an unused static function svc_ungetu32()
  ...
2022-12-12 20:54:39 -08:00
Linus Torvalds
149c51f876 for-6.2-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmOSLtIACgkQxWXV+ddt
 WDvpQA//dQ3Wosz5puFNiZvoSUn/BnYJueZHjwF0bWY8OYINkF1PvDenu/WotyFz
 Ozf4Yl4Afxncz+FjDnOtlpr6KsSU5NqdGM3NrY0eNsxd2t1KrTsN0LgkA4m24p8b
 YsYp7pygbMm7c+h0X4uFpebY4lABkEPCBXnI//ktsls0xG5sOvGfZA3rdUP0bou2
 JTn6hk+s0cLTNoTiOCGNHRJbeTzHLR0viZj/E4LCJfCeJvAmOLZamUjqe9sBNYAg
 YtsrZTpUIL3JgmRi5B6jG4fHSXOnE14mKmRIR3xPME6J6eoYyNOeuSh1oNmJEuoE
 B7nD5We+x5+isjXNw/V5CQrs7FF09UbdpbNb9NF5CYQWv40OCeefuai1opGtBUxX
 dvbfmf1blYpWW/wfFOKQwMOsl8kZIZYx68FW2OBUNglB6yRpX/3QgFSGb8kPCr83
 DW2ttqwkpSNPMKk92I/owIc4BRvZ+LMR/PimEHB/Sa2apZA2/L+7RGwoaaei1QNX
 1tJxHWeJFLDZ+YRxjO1eKqhWdGQPn1kkq8LoXLi3tGaNF4kYQfhWOSM3WRowvx1q
 f99XRgA8JQnqZS83zqRIspWlpFK0CFdvzG1Zlqx+eoxERfeaMNA2fHxv1YCyFV4+
 TiXgsnCo+PIBwlvL/HjUWZgYE9+AD+NN5vyoE2UDYff4AgBFTE8=
 =Nqg9
 -----END PGP SIGNATURE-----

Merge tag 'for-6.2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
 "This round there are a lot of cleanups and moved code so the diffstat
  looks huge, otherwise there are some nice performance improvements and
  an update to raid56 reliability.

  User visible features:

   - raid56 reliability vs performance trade off:
      - fix destructive RMW for raid5 data (raid6 still needs work): do
        full checksum verification for all data during RMW cycle, this
        should prevent rewriting potentially corrupted data without
        notice
      - stripes are cached in memory which should reduce the performance
        impact but still can hurt some workloads
      - checksums are verified after repair again
      - this is the last option without introducing additional features
        (write intent bitmap, journal, another tree), the extra checksum
        read/verification was supposed to be avoided by the original
        implementation exactly for performance reasons but that caused
        all the reliability problems

   - discard=async by default for devices that support it

   - implement emergency flush reserve to avoid almost all unnecessary
     transaction aborts due to ENOSPC in cases where there are too many
     delayed refs or delayed allocation

   - skip block group synchronization if there's no change in used
     bytes, can reduce transaction commit count for some workloads

  Performance improvements:

   - fiemap and lseek:
      - overall speedup due to skipping unnecessary or duplicate
        searches (-40% run time)
      - cache some data structures and sharedness of extents (-30% run
        time)

   - send:
      - faster backref resolution when finding clones
      - cached leaf to root mapping for faster backref walking
      - improved clone/sharing detection
      - overall run time improvements (-70%)

  Core:

   - module initialization converted to a table of function pointers run
     in a sequence

   - preparation for fscrypt, extend passing file names across calls,
     dir item can store encryption status

   - raid56 updates:
      - more accurate error tracking of sectors within stripe
      - simplify recovery path and remove dedicated endio worker kthread
      - simplify scrub call paths
      - refactoring to support the extra data checksum verification
        during RMW cycle

   - tree block parentness checks consolidated and done at metadata read
     time

   - improved error handling

   - cleanups:
      - move a lot of code for better synchronization between kernel and
        user space sources, split big files
      - enum cleanups
      - GFP flag cleanups
      - header file cleanups, prototypes, dependencies
      - redundant parameter cleanups
      - inline extent handling simplifications
      - inode parameter conversion
      - data structure cleanups, reductions, renames, merges"

* tag 'for-6.2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (249 commits)
  btrfs: print transaction aborted messages with an error level
  btrfs: sync some cleanups from progs into uapi/btrfs.h
  btrfs: do not BUG_ON() on ENOMEM when dropping extent items for a range
  btrfs: fix extent map use-after-free when handling missing device in read_one_chunk
  btrfs: remove outdated logic from overwrite_item() and add assertion
  btrfs: unify overwrite_item() and do_overwrite_item()
  btrfs: replace strncpy() with strscpy()
  btrfs: fix uninitialized variable in find_first_clear_extent_bit
  btrfs: fix uninitialized parent in insert_state
  btrfs: add might_sleep() annotations
  btrfs: add stack helpers for a few btrfs items
  btrfs: add nr_global_roots to the super block definition
  btrfs: remove BTRFS_LEAF_DATA_OFFSET
  btrfs: add helpers for manipulating leaf items and data
  btrfs: add eb to btrfs_node_key_ptr_offset
  btrfs: pass the extent buffer for the btrfs_item_nr helpers
  btrfs: move the csum helpers into ctree.h
  btrfs: move eb offset helpers into extent_io.h
  btrfs: move file_extent_item helpers into file-item.h
  btrfs: move leaf_data_end into ctree.c
  ...
2022-12-12 20:47:51 -08:00
Linus Torvalds
97971df811 dlm for 6.2
These patches include the usual cleanups and minor fixes, the removal of
 code that is no longer needed due to recent improvements, and
 improvements to processing large volumes of messages during heavy
 locking activity.
 
 - Misc code cleanup.
 
 - Fix a couple socket handling bugs: a double release on an error path
   and a data-ready race in an accept loop.
 
 - Remove code for resending dir-remove messages. This code is no longer
   needed since the midcomms layer now ensures the messages are resent if
   needed.
 
 - Add tracepoints for dlm messages.
 
 - Improve callback queueing by replacing the fixed array with a list.
 
 - Simplify the handling of a remove message followed by a lookup
   message by sending both without releasing a spinlock in between.
 
 - Improve the concurrency of sending and receiving messages by holding
   locks for a shorter time, and changing how workqueues are used.
 
 - Remove old code for shutting down sockets, which is no longer needed
   with the reliable connection handling that was recently added.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJjl2TfAAoJEDgbc8f8gGmqvj8P/ibVRzW00DeKJJU8Gz5qlcQo
 YziuUxR2xYffDwif/VB2FFW1dE9lY9bTTYHz6ctiIKip51XsVTVj9oaXEebSt5Xu
 /jEMJXUnMCUPcXz6T5FoaXAt+oMKqo7QhitCD4kqdM8AUZQLXKu/YM/FC1EyIMgl
 zGmHch99ojGPQmmPrFdJq8JQSF75+r28RfH9qmy7dzctX0Iok12J8Gsgd1J9Hwti
 PITmNBs0YauT7AdYwR3UhuGC9g73RISJ4QTjL+wpj7+P2F6pNYb8XXUXsEZ2wZYH
 sgo325HbIXZ8AbxIw0Tbx2NNmIQaTELZRCOSxgEjcN7AMhn1xkXYPE3mfMCU1bDZ
 TtfN86HvWYFRgNotGpDUrNnd9XJoyeHtCWBDikEepYRWEiyYkzipj5mFGRzINlvL
 Orou1tKBchvtDgQzGg6XwNkxnxR6l/xfhg1AYizYXCgOCxfEzJgIqCUttzMCPkec
 HXi/y17IDIvNvIhtmZ0EYh/6W40UYlnncREszerg5IR3ssuDmF5aA6MHdiSt0dn/
 PkjZ5DRbZaEbNl37mjShJHxztP1QKW85c3isgEBe4jvncu8VY3WDCbQRw5gjZvC4
 LnQF4hcGLm0y51fU5DXv+CBlQGRWaqYan8bWaIdVoA2UgDbM0x8eWC8IUUYBO5ak
 rCBOpWJudzZFH6ynyfwD
 =TJvP
 -----END PGP SIGNATURE-----

Merge tag 'dlm-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm

Pull dlm updates from David Teigland:
 "These patches include the usual cleanups and minor fixes, the removal
  of code that is no longer needed due to recent improvements, and
  improvements to processing large volumes of messages during heavy
  locking activity.

  Summary:

   - Misc code cleanup

   - Fix a couple of socket handling bugs: a double release on an error
     path and a data-ready race in an accept loop

   - Remove code for resending dir-remove messages. This code is no
     longer needed since the midcomms layer now ensures the messages are
     resent if needed

   - Add tracepoints for dlm messages

   - Improve callback queueing by replacing the fixed array with a list

   - Simplify the handling of a remove message followed by a lookup
     message by sending both without releasing a spinlock in between

   - Improve the concurrency of sending and receiving messages by
     holding locks for a shorter time, and changing how workqueues are
     used

   - Remove old code for shutting down sockets, which is no longer
     needed with the reliable connection handling that was recently
     added"

* tag 'dlm-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: (37 commits)
  fs: dlm: fix building without lockdep
  fs: dlm: parallelize lowcomms socket handling
  fs: dlm: don't init error value
  fs: dlm: use saved sk_error_report()
  fs: dlm: use sock2con without checking null
  fs: dlm: remove dlm_node_addrs lookup list
  fs: dlm: don't put dlm_local_addrs on heap
  fs: dlm: cleanup listen sock handling
  fs: dlm: remove socket shutdown handling
  fs: dlm: use listen sock as dlm running indicator
  fs: dlm: use list_first_entry_or_null
  fs: dlm: remove twice INIT_WORK
  fs: dlm: add midcomms init/start functions
  fs: dlm: add dst nodeid for msg tracing
  fs: dlm: rename seq to h_seq for msg tracing
  fs: dlm: rename DLM_IFL_NEED_SCHED to DLM_IFL_CB_PENDING
  fs: dlm: ast do WARN_ON_ONCE() on hotpath
  fs: dlm: drop lkb ref in bug case
  fs: dlm: avoid false-positive checker warning
  fs: dlm: use WARN_ON_ONCE() instead of WARN_ON()
  ...
2022-12-12 20:41:50 -08:00
Linus Torvalds
4a6bff1187 Changes since the last update:
- Enable large folios for iomap/fscache mode;
 
  - Avoid sysfs warning due to mounting twice with the same fsid and
    domain_id in fscache mode;
 
  - Refine fscache interface among erofs, fscache, and cachefiles;
 
  - Use kmap_local_page() only for metabuf;
 
  - Fixes around crafted images found by syzbot;
 
  - Minor cleanups and documentation updates.
 -----BEGIN PGP SIGNATURE-----
 
 iIcEABYIAC8WIQThPAmQN9sSA0DVxtI5NzHcH7XmBAUCY5S3khEceGlhbmdAa2Vy
 bmVsLm9yZwAKCRA5NzHcH7XmBLr3AQDA5xpztSsxfe0Gp+bwf12ySuntimJxXmAj
 83EHCfSC+AEAu4fcWkIF38MBBVJvFVjFaXCZKmFossbI5Rp8TuqPpgk=
 =HDsJ
 -----END PGP SIGNATURE-----

Merge tag 'erofs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull erofs updates from Gao Xiang:
 "In this cycle, large folios are now enabled in the iomap/fscache mode
  for uncompressed files first. In order to do that, we've also cleaned
  up better interfaces between erofs and fscache, which are acked by
  fscache/netfs folks and included in this pull request.

  Other than that, there are random fixes around erofs over fscache and
  crafted images by syzbot, minor cleanups and documentation updates.

  Summary:

   - Enable large folios for iomap/fscache mode

   - Avoid sysfs warning due to mounting twice with the same fsid and
     domain_id in fscache mode

   - Refine fscache interface among erofs, fscache, and cachefiles

   - Use kmap_local_page() only for metabuf

   - Fixes around crafted images found by syzbot

   - Minor cleanups and documentation updates"

* tag 'erofs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: validate the extent length for uncompressed pclusters
  erofs: fix missing unmap if z_erofs_get_extent_compressedlen() fails
  erofs: Fix pcluster memleak when its block address is zero
  erofs: use kmap_local_page() only for erofs_bread()
  erofs: enable large folios for fscache mode
  erofs: support large folios for fscache mode
  erofs: switch to prepare_ondemand_read() in fscache mode
  fscache,cachefiles: add prepare_ondemand_read() callback
  erofs: clean up cached I/O strategies
  erofs: update documentation
  erofs: check the uniqueness of fsid in shared domain in advance
  erofs: enable large folios for iomap mode
2022-12-12 20:14:04 -08:00
Linus Torvalds
deb9acc122 A large number of cleanups and bug fixes, with many of the bug fixes
found by Syzbot and fuzzing.  (Many of the bug fixes involve less-used
 ext4 features such as fast_commit, inline_data and bigalloc.)
 
 In addition, remove the writepage function for ext4, since the
 medium-term plan is to remove ->writepage() entirely.  (The VM doesn't
 need or want writepage() for writeback, since it is fine with
 ->writepages() so long as ->migrate_folio() is implemented.)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAmOWqrMACgkQ8vlZVpUN
 gaMvmgf+P2C6vzjn13ZdF+GwFTi4fx4TJ5BZT78LQqvTZqhkfk4k1q2SFfHI7nXT
 ZWdu1KUQ0SYLo64oaSU9W+2B2pmGi/KgUlrwNhy8DFeGStogPuDVfmGWB63p1UQL
 ld42mE9q7bjY6nCZSKYXPp2jfSwsHuliHBJ4UfzVNAIwjiUEJ7pGeIrMFdLAEkVm
 TVNzvlUZaHUnVxhpsP6hs+5WNhHQ2IhWz4rwX01ussNgHTijYac4iaL05wpTvF5e
 6NtvfmpOEMAbYrmIkJX4RVss4JNsHNOC0E8fjEHlgXJxBiAI6w8GxTxrS52Y4ELH
 nHXl/pc0L+I8+yh9B9+s0LBaSuPuTg==
 =lezv
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o:
 "A large number of cleanups and bug fixes, with many of the bug fixes
  found by Syzbot and fuzzing. (Many of the bug fixes involve less-used
  ext4 features such as fast_commit, inline_data and bigalloc)

  In addition, remove the writepage function for ext4, since the
  medium-term plan is to remove ->writepage() entirely. (The VM doesn't
  need or want writepage() for writeback, since it is fine with
  ->writepages() so long as ->migrate_folio() is implemented)"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (58 commits)
  ext4: fix reserved cluster accounting in __es_remove_extent()
  ext4: fix inode leak in ext4_xattr_inode_create() on an error path
  ext4: allocate extended attribute value in vmalloc area
  ext4: avoid unaccounted block allocation when expanding inode
  ext4: initialize quota before expanding inode in setproject ioctl
  ext4: stop providing .writepage hook
  mm: export buffer_migrate_folio_norefs()
  ext4: switch to using write_cache_pages() for data=journal writeout
  jbd2: switch jbd2_submit_inode_data() to use fs-provided hook for data writeout
  ext4: switch to using ext4_do_writepages() for ordered data writeout
  ext4: move percpu_rwsem protection into ext4_writepages()
  ext4: provide ext4_do_writepages()
  ext4: add support for writepages calls that cannot map blocks
  ext4: drop pointless IO submission from ext4_bio_write_page()
  ext4: remove nr_submitted from ext4_bio_write_page()
  ext4: move keep_towrite handling to ext4_bio_write_page()
  ext4: handle redirtying in ext4_bio_write_page()
  ext4: fix kernel BUG in 'ext4_write_inline_data_end()'
  ext4: make ext4_mb_initialize_context return void
  ext4: fix deadlock due to mbcache entry corruption
  ...
2022-12-12 19:56:37 -08:00
Jaegeuk Kim
71644dff48 f2fs: add block_age-based extent cache
This patch introduces a runtime hot/cold data separation method
for f2fs, in order to improve the accuracy for data temperature
classification, reduce the garbage collection overhead after
long-term data updates.

Enhanced hot/cold data separation can record data block update
frequency as "age" of the extent per inode, and take use of the age
info to indicate better temperature type for data block allocation:
 - It records total data blocks allocated since mount;
 - When file extent has been updated, it calculate the count of data
blocks allocated since last update as the age of the extent;
 - Before the data block allocated, it searches for the age info and
chooses the suitable segment for allocation.

Test and result:
 - Prepare: create about 30000 files
  * 3% for cold files (with cold file extension like .apk, from 3M to 10M)
  * 50% for warm files (with random file extension like .FcDxq, from 1K
to 4M)
  * 47% for hot files (with hot file extension like .db, from 1K to 256K)
 - create(5%)/random update(90%)/delete(5%) the files
  * total write amount is about 70G
  * fsync will be called for .db files, and buffered write will be used
for other files

The storage of test device is large enough(128G) so that it will not
switch to SSR mode during the test.

Benefit: dirty segment count increment reduce about 14%
 - before: Dirty +21110
 - after:  Dirty +18286

Signed-off-by: qixiaoyu1 <qixiaoyu1@xiaomi.com>
Signed-off-by: xiongping1 <xiongping1@xiaomi.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-12-12 14:53:56 -08:00
Jaegeuk Kim
e7547daccd f2fs: refactor extent_cache to support for read and more
This patch prepares extent_cache to be ready for addition.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-12-12 14:53:56 -08:00
Linus Torvalds
c1f0fcd85d cxl for 6.2
- Add the cpu_cache_invalidate_memregion() API for cache flushing in
   response to physical memory reconfiguration, or memory-side data
   invalidation from operations like secure erase or memory-device unlock.
 
 - Add a facility for the kernel to warn about collisions between kernel
   and userspace access to PCI configuration registers
 
 - Add support for Restricted CXL Host (RCH) topologies (formerly CXL 1.1)
 
 - Add handling and reporting of CXL errors reported via the PCIe AER
   mechanism
 
 - Add support for CXL Persistent Memory Security commands
 
 - Add support for the "XOR" algorithm for CXL host bridge interleave
 
 - Rework / simplify CXL to NVDIMM interactions
 
 - Miscellaneous cleanups and fixes
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCY5UpyAAKCRDfioYZHlFs
 Z0ttAP4uxCjIibKsFVyexpSgI4vaZqQ9yt9NesmPwonc0XookwD+PlwP6Xc0d0Ox
 t0gJ6+pwdh11NRzhcNE1pAaPcJZU4gs=
 =HAQk
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull cxl updates from Dan Williams:
 "Compute Express Link (CXL) updates for 6.2.

  While it may seem backwards, the CXL update this time around includes
  some focus on CXL 1.x enabling where the work to date had been with
  CXL 2.0 (VH topologies) in mind.

  First generation CXL can mostly be supported via BIOS, similar to DDR,
  however it became clear there are use cases for OS native CXL error
  handling and some CXL 3.0 endpoint features can be deployed on CXL 1.x
  hosts (Restricted CXL Host (RCH) topologies). So, this update brings
  RCH topologies into the Linux CXL device model.

  In support of the ongoing CXL 2.0+ enabling two new core kernel
  facilities are added.

  One is the ability for the kernel to flag collisions between userspace
  access to PCI configuration registers and kernel accesses. This is
  brought on by the PCIe Data-Object-Exchange (DOE) facility, a hardware
  mailbox over config-cycles.

  The other is a cpu_cache_invalidate_memregion() API that maps to
  wbinvd_on_all_cpus() on x86. To prevent abuse it is disabled in guest
  VMs and architectures that do not support it yet. The CXL paths that
  need it, dynamic memory region creation and security commands (erase /
  unlock), are disabled when it is not present.

  As for the CXL 2.0+ this cycle the subsystem gains support Persistent
  Memory Security commands, error handling in response to PCIe AER
  notifications, and support for the "XOR" host bridge interleave
  algorithm.

  Summary:

   - Add the cpu_cache_invalidate_memregion() API for cache flushing in
     response to physical memory reconfiguration, or memory-side data
     invalidation from operations like secure erase or memory-device
     unlock.

   - Add a facility for the kernel to warn about collisions between
     kernel and userspace access to PCI configuration registers

   - Add support for Restricted CXL Host (RCH) topologies (formerly CXL
     1.1)

   - Add handling and reporting of CXL errors reported via the PCIe AER
     mechanism

   - Add support for CXL Persistent Memory Security commands

   - Add support for the "XOR" algorithm for CXL host bridge interleave

   - Rework / simplify CXL to NVDIMM interactions

   - Miscellaneous cleanups and fixes"

* tag 'cxl-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (71 commits)
  cxl/region: Fix memdev reuse check
  cxl/pci: Remove endian confusion
  cxl/pci: Add some type-safety to the AER trace points
  cxl/security: Drop security command ioctl uapi
  cxl/mbox: Add variable output size validation for internal commands
  cxl/mbox: Enable cxl_mbox_send_cmd() users to validate output size
  cxl/security: Fix Get Security State output payload endian handling
  cxl: update names for interleave ways conversion macros
  cxl: update names for interleave granularity conversion macros
  cxl/acpi: Warn about an invalid CHBCR in an existing CHBS entry
  tools/testing/cxl: Require cache invalidation bypass
  cxl/acpi: Fail decoder add if CXIMS for HBIG is missing
  cxl/region: Fix spelling mistake "memergion" -> "memregion"
  cxl/regs: Fix sparse warning
  cxl/acpi: Set ACPI's CXL _OSC to indicate RCD mode support
  tools/testing/cxl: Add an RCH topology
  cxl/port: Add RCD endpoint port enumeration
  cxl/mem: Move devm_cxl_add_endpoint() from cxl_core to cxl_mem
  tools/testing/cxl: Add XOR Math support to cxl_test
  cxl/acpi: Support CXL XOR Interleave Math (CXIMS)
  ...
2022-12-12 13:55:31 -08:00
Stephen Boyd
0e2c9884cb Merge branches 'clk-mediatek', 'clk-trace', 'clk-qcom' and 'clk-microchip' into clk-next
- Tracepoints for clk_rate_request structures

* clk-mediatek:
  clk: mediatek: fix dependency of MT7986 ADC clocks
  clk: mediatek: Change PLL register API for MT8186
  clk: mediatek: Add new clock driver to handle FHCTL hardware
  dt-bindings: clock: mediatek: Add new bindings of MediaTek frequency hopping
  clk: mediatek: Export PLL operations symbols
  clk: mediatek: mt8186-topckgen: Add GPU clock mux notifier
  clk: mediatek: mt8186-mfg: Propagate rate changes to parent
  clk: mediatek: mt8195-topckgen: Drop flags for main/univpll fixed factors
  clk: mediatek: mt8192: Drop flags for main/univpll fixed factors
  clk: mediatek: mt6795-topckgen: Drop flags for main/sys/univpll fixed factors
  clk: mediatek: mt8173: Drop flags for main/sys/univpll fixed factors
  clk: mediatek: mt8183: Drop flags for sys/univpll fixed factors
  clk: mediatek: mt8183: Compress top_divs array entries
  clk: mediatek: mt8186-topckgen: Drop flags for main/univpll fixed factors
  clk: mediatek: clk-mtk: Allow specifying flags on mtk_fixed_factor clocks

* clk-trace:
  clk: Add trace events for rate requests
  clk: Store clk_core for clk_rate_request

* clk-qcom: (69 commits)
  clk: qcom: rpmh: add support for SM6350 rpmh IPA clock
  clk: qcom: mmcc-msm8974: use parent_hws/_data instead of parent_names
  clk: qcom: mmcc-msm8974: move clock parent tables down
  clk: qcom: mmcc-msm8974: use ARRAY_SIZE instead of specifying num_parents
  clk: qcom: gcc-msm8974: use parent_hws/_data instead of parent_names
  clk: qcom: gcc-msm8974: move clock parent tables down
  clk: qcom: gcc-msm8974: use ARRAY_SIZE instead of specifying num_parents
  dt-bindings: clocks: qcom,mmcc: define clocks/clock-names for MSM8974
  dt-bindings: clock: split qcom,gcc-msm8974,-msm8226 to the separate file
  clk: qcom: gcc-ipq4019: switch to devm_clk_notifier_register
  clk: qcom: rpmh: remove usage of platform name
  clk: qcom: rpmh: rename VRM clock data
  clk: qcom: rpmh: rename ARC clock data
  clk: qcom: rpmh: support separate symbol name for the RPMH clocks
  clk: qcom: rpmh: remove platform names from BCM clocks
  clk: qcom: rpmh: drop all _ao names
  clk: qcom: rpmh: reuse common duplicate clocks
  clk: qcom: rpmh: group clock definitions together
  clk: qcom: rpm: drop the platform from clock definitions
  clk: qcom: rpm: drop the _clk suffix completely
  ...

* clk-microchip:
  clk: microchip: enable the MPFS clk driver by default if SOC_MICROCHIP_POLARFIRE
  clk: microchip: check for null return of devm_kzalloc()
2022-12-12 11:13:28 -08:00
Gautam Menghani
4c9473e87e mm/khugepaged: add tracepoint to collapse_file()
"mm_khugepaged_collapse_file" for capturing is_shmem.
Currently, is_shmem is not being captured. Capturing is_shmem is useful
as it can indicate if tmpfs is being used as a backing store instead of
persistent storage. Add the tracepoint in collapse_file() named
"mm_khugepaged_collapse_file" for capturing is_shmem.

[gautammenghani201@gmail.com: swap is_shmem and addr to save space, per Steven Rostedt]
  Link: https://lkml.kernel.org/r/20221202201807.182829-1-gautammenghani201@gmail.com
Link: https://lkml.kernel.org/r/20221026052218.148234-1-gautammenghani201@gmail.com
Signed-off-by: Gautam Menghani <gautammenghani201@gmail.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>	[tracing]
Cc: David Hildenbrand <david@redhat.com>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zach O'Keefe <zokeefe@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-11 18:12:09 -08:00
Chuck Lever
c65d9df0c4 SUNRPC: Make the svc_authenticate tracepoint conditional
Clean up: Simplify the tracepoint's only call site.

Also, I noticed that when svc_authenticate() returns SVC_COMPLETE,
it leaves rq_auth_stat set to an error value. That doesn't need to
be recorded in the trace log.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-12-10 11:01:13 -05:00
Dai Ngo
638593be55 NFSD: add CB_RECALL_ANY tracepoints
Add tracepoints to trace start and end of CB_RECALL_ANY operation.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
[ cel: added show_rca_mask() macro ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-12-10 11:01:12 -05:00
Chuck Lever
247c01ff5f trace: Relocate event helper files
Steven Rostedt says:
> The include/trace/events/ directory should only hold files that
> are to create events, not headers that hold helper functions.
>
> Can you please move them out of include/trace/events/ as that
> directory is "special" in the creation of events.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-12-10 11:01:12 -05:00
Jason Gunthorpe
d69e8c63fc Linux 6.1-rc8
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmONI6weHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG9xgH/jqXGuMoO1ikfmGb
 7oY0W/f69G9V/e0DxFLvnIjhFgCUzdnNsmD4jQJA4x6QsxwLWuvpI282Ez+bHV5T
 U4RPsxJZIIMsXE2lKM9BRgeLzDdCt0aK4Pj+3x2x7NZC5cWFSQ8PyQJkCwg+0PQo
 u8Ly+GO8c4RUMf4/rrAZQq16qZUqGDaGm1EJhtSoa+KiR81LmUUmbDIK9Mr53rmQ
 wou+95XhibwMWr17WgXA28bTgYqn9UGr67V3qvTH2LC7GW8BCoKvn+3wh6TVhlWj
 dsWplXgcOP0/OHvSC5Sb1Uibk5Gx3DlIzYa6OfNZQuZ5xmQqm9kXjW8lmYpWFHy/
 38/5HWc=
 =EuoA
 -----END PGP SIGNATURE-----

Merge tag 'v6.1-rc8' into rdma.git for-next

For dependencies in following patches

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09 15:52:17 -04:00
Eric Biggers
0fbcb5251f ext4: disable fast-commit of encrypted dir operations
fast-commit of create, link, and unlink operations in encrypted
directories is completely broken because the unencrypted filenames are
being written to the fast-commit journal instead of the encrypted
filenames.  These operations can't be replayed, as encryption keys
aren't present at journal replay time.  It is also an information leak.

Until if/when we can get this working properly, make encrypted directory
operations ineligible for fast-commit.

Note that fast-commit operations on encrypted regular files continue to
be allowed, as they seem to work.

Fixes: aa75f4d3da ("ext4: main fast-commit commit path")
Cc: <stable@vger.kernel.org> # v5.10+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20221106224841.279231-2-ebiggers@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-12-08 21:49:24 -05:00
Bixuan Cui
d87a7b4c77 jbd2: use the correct print format
The print format error was found when using ftrace event:
    <...>-1406 [000] .... 23599442.895823: jbd2_end_commit: dev 252,8 transaction -1866216965 sync 0 head -1866217368
    <...>-1406 [000] .... 23599442.896299: jbd2_start_commit: dev 252,8 transaction -1866216964 sync 0

Use the correct print format for transaction, head and tid.

Fixes: 879c5e6b7c ('jbd2: convert instrumentation from markers to tracepoints')
Signed-off-by: Bixuan Cui <cuibixuan@linux.alibaba.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Link: https://lore.kernel.org/r/1665488024-95172-1-git-send-email-cuibixuan@linux.alibaba.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2022-12-08 21:49:12 -05:00
Jakub Kicinski
837e8ac871 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08 18:19:59 -08:00
Maxime Ripard
49e62e0d96 clk: Add trace events for rate requests
It is currently fairly difficult to follow what clk_rate_request are
issued, and how they have been modified once done.

Indeed, there's multiple paths that can be taken, some functions are
recursive and will just forward the request to its parent, etc.

Adding a lot of debug prints is just not very convenient, so let's add
trace events for the clock requests, one before they are submitted and
one after they are returned.

That way we can simply toggle the tracing on without modifying the
kernel code and without affecting performances or the kernel logs too
much.

Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20221018-clk-rate-request-tracing-v2-2-5170b363c413@cerno.tech
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-12-07 13:54:09 -08:00
Dave Wysochanski
b5b52de321 fscache: Fix oops due to race with cookie_lru and use_cookie
If a cookie expires from the LRU and the LRU_DISCARD flag is set, but
the state machine has not run yet, it's possible another thread can call
fscache_use_cookie and begin to use it.

When the cookie_worker finally runs, it will see the LRU_DISCARD flag
set, transition the cookie->state to LRU_DISCARDING, which will then
withdraw the cookie.  Once the cookie is withdrawn the object is removed
the below oops will occur because the object associated with the cookie
is now NULL.

Fix the oops by clearing the LRU_DISCARD bit if another thread uses the
cookie before the cookie_worker runs.

  BUG: kernel NULL pointer dereference, address: 0000000000000008
  ...
  CPU: 31 PID: 44773 Comm: kworker/u130:1 Tainted: G     E    6.0.0-5.dneg.x86_64 #1
  Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
  Workqueue: events_unbound netfs_rreq_write_to_cache_work [netfs]
  RIP: 0010:cachefiles_prepare_write+0x28/0x90 [cachefiles]
  ...
  Call Trace:
    netfs_rreq_write_to_cache_work+0x11c/0x320 [netfs]
    process_one_work+0x217/0x3e0
    worker_thread+0x4a/0x3b0
    kthread+0xd6/0x100

Fixes: 12bb21a29c ("fscache: Implement cookie user counting and resource pinning")
Reported-by: Daire Byrne <daire.byrne@gmail.com>
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Daire Byrne <daire@dneg.com>
Link: https://lore.kernel.org/r/20221117115023.1350181-1-dwysocha@redhat.com/ # v1
Link: https://lore.kernel.org/r/20221117142915.1366990-1-dwysocha@redhat.com/ # v2
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-12-07 11:49:18 -08:00
Jingbo Xu
8669247524 fscache,cachefiles: add prepare_ondemand_read() callback
Add prepare_ondemand_read() callback dedicated for the on-demand read
scenario, so that callers from this scenario can be decoupled from
netfs_io_subrequest.

The original cachefiles_prepare_read() is now refactored to a generic
routine accepting a parameter list instead of netfs_io_subrequest.
There's no logic change, except that the debug id of subrequest and
request is removed from trace_cachefiles_prep_read().

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Acked-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20221124034212.81892-2-jefflexu@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2022-12-07 10:56:29 +08:00
Dan Williams
372ab3bc37 cxl/pci: Add some type-safety to the AER trace points
The first argument to the CXL AER trace points is the source device.
Pass a 'const struct device *' rather than a 'const char *' for more
type precision / safety.

Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/167030091477.4045167.15174636482098463885.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-06 14:37:52 -08:00
Uwe Kleine-König
3dae106f4c pwm/tracing: Also record trace events for failed API calls
Record and report an error code for the events. This allows to report
about failed calls without ambiguity and so gives a more complete
picture.

Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20221130152148.2769768-3-u.kleine-koenig@pengutronix.de
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
2022-12-06 12:46:23 +01:00
David Sterba
0988fc7bda btrfs: switch extent_io_tree::private_data to btrfs_inode and rename
The extent_io_tree::private_data was meant to be a preparatory work for
the metadata inode rework but that never materialized. Now it's used
only for an inode so it's better to change the appropriate type and
rename it.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-12-05 18:00:54 +01:00
Dave Jiang
2f6e9c3051 cxl/pci: add tracepoint events for CXL RAS
Add tracepoint events for recording the CXL uncorrectable and correctable
errors. For uncorrectable errors, there is additional data of 512B from
the header log register (CXL spec rev3 8.2.4.16.7). The trace event will
intake a dynamic array that will dump the entire Header Log data. If
multiple errors are set in the status register, then the
'first error' field (CXL spec rev3 v8.2.4.16.6) is read from the Error
Capabilities and Control Register in order to determine the error.

This implementation does not include CXL IDE Error details.

Cc: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/166974413388.1608150.5875712482260436188.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-03 13:40:17 -08:00
changfengnan
5f3e240321 ext4: split ext4_journal_start trace for debug
we might want to know why jbd2 thread using high io for detail,
split ext4_journal_start trace to ext4_journal_start_sb and
ext4_journal_start_inode, show ino and handle type when possible.

Signed-off-by: changfengnan <changfengnan@bytedance.com>
Link: https://lore.kernel.org/r/20221008120518.74870-1-changfengnan@bytedance.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-12-01 10:46:54 -05:00
Kemeng Shi
63c9eac4b6 blk-iocost: Trace vtime_base_rate instead of vtime_rate
Since commit ac33e91e2d ("blk-iocost: implement vtime loss
compensation") rename original vtime_rate to vtime_base_rate
and current vtime_rate is original vtime_rate with compensation.
The current rate showed in tracepoint is mixed with vtime_rate
and vtime_base_rate:
1) In function ioc_adjust_base_vrate, the first trace_iocost_ioc_vrate_adj
shows vtime_rate, the second trace_iocost_ioc_vrate_adj shows
vtime_base_rate.
2) In function iocg_activate shows vtime_rate by calling
TRACE_IOCG_PATH(iocg_activate...
3) In function ioc_check_iocgs shows vtime_rate by calling
TRACE_IOCG_PATH(iocg_idle...

Trace vtime_base_rate instead of vtime_rate as:
1) Before commit ac33e91e2d ("blk-iocost: implement vtime loss
compensation"), the traced rate is without compensation, so still
show rate without compensation.
2) The vtime_base_rate is more stable while vtime_rate heavily depends on
excess budeget on current period which may change abruptly in next period.

Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20221018121932.10792-4-shikemeng@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-01 07:44:12 -07:00
David Howells
b0346843b1 rxrpc: Transmit ACKs at the point of generation
For ACKs generated inside the I/O thread, transmit the ACK at the point of
generation.  Where the ACK is generated outside of the I/O thread, it's
offloaded to the I/O thread to transmit it.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-12-01 13:36:43 +00:00
David Howells
32cf8edb07 rxrpc: Trace/count transmission underflows and cwnd resets
Add a tracepoint to log when a cwnd reset occurs due to lack of
transmission on a call.

Add stat counters to count transmission underflows (ie. when we have tx
window space, but sendmsg doesn't manage to keep up), cwnd resets and
transmission failures.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-12-01 13:36:42 +00:00
David Howells
5e6ef4f101 rxrpc: Make the I/O thread take over the call and local processor work
Move the functions from the call->processor and local->processor work items
into the domain of the I/O thread.

The call event processor, now called from the I/O thread, then takes over
the job of cranking the call state machine, processing incoming packets and
transmitting DATA, ACK and ABORT packets.  In a future patch,
rxrpc_send_ACK() will transmit the ACK on the spot rather than queuing it
for later transmission.

The call event processor becomes purely received-skb driven.  It only
transmits things in response to events.  We use "pokes" to queue a dummy
skb to make it do things like start/resume transmitting data.  Timer expiry
also results in pokes.

The connection event processor, becomes similar, though crypto events, such
as dealing with CHALLENGE and RESPONSE packets is offloaded to a work item
to avoid doing crypto in the I/O thread.

The local event processor is removed and VERSION response packets are
generated directly from the packet parser.  Similarly, ABORTs generated in
response to protocol errors will be transmitted immediately rather than
being pushed onto a queue for later transmission.

Changes:
========
ver #2)
 - Fix a couple of introduced lock context imbalances.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-12-01 13:36:42 +00:00
David Howells
2d1faf7a0c rxrpc: Simplify skbuff accounting in receive path
A received skbuff needs a ref when it gets put on a call data queue or conn
packet queue, and rxrpc_input_packet() and co. jump through a lot of hoops
to avoid double-dropping the skbuff ref so that we can avoid getting a ref
when we queue the packet.

Change this so that the skbuff ref is unconditionally dropped by the caller
of rxrpc_input_packet().  An additional ref is then taken on the packet if
it is pushed onto a queue.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-12-01 13:36:41 +00:00
David Howells
cf37b59875 rxrpc: Move DATA transmission into call processor work item
Move DATA transmission into the call processor work item.  In a future
patch, this will be called from the I/O thread rather than being itsown
work item.

This will allow DATA transmission to be driven directly by incoming ACKs,
pokes and timers as those are processed.

The Tx queue is also split: The queue of packets prepared by sendmsg is now
places in call->tx_sendmsg and the packet dispatcher decants the packets
into call->tx_buffer as space becomes available in the transmission
window.  This allows sendmsg to run ahead of the available space to try and
prevent an underflow in transmission.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-12-01 13:36:41 +00:00
David Howells
f3441d4125 rxrpc: Copy client call parameters into rxrpc_call earlier
Copy client call parameters into rxrpc_call earlier so that that can be
used to convey them to the connection code - which can then be offloaded to
the I/O thread.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-12-01 13:36:41 +00:00
David Howells
15f661dc95 rxrpc: Implement a mechanism to send an event notification to a call
Provide a means by which an event notification can be sent to a call such
that the I/O thread can process it rather than it being done in a separate
workqueue.  This will allow a lot of locking to be removed.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-12-01 13:36:41 +00:00
David Howells
3cec055c56 rxrpc: Don't hold a ref for connection workqueue
Currently, rxrpc gives the connection's work item a ref on the connection
when it queues it - and this is called from the timer expiration function.
The problem comes when queue_work() fails (ie. the work item is already
queued): the timer routine must put the ref - but this may cause the
cleanup code to run.

This has the unfortunate effect that the cleanup code may then be run in
softirq context - which means that any spinlocks it might need to touch
have to be guarded to disable softirqs (ie. they need a "_bh" suffix).

 (1) Don't give a ref to the work item.

 (2) Simplify handling of service connections by adding a separate active
     count so that the refcount isn't also used for this.

 (3) Connection destruction for both client and service connections can
     then be cleaned up by putting rxrpc_put_connection() out of line and
     making a tidy progression through the destruction code (offloaded to a
     workqueue if put from softirq or processor function context).  The RCU
     part of the cleanup then only deals with the freeing at the end.

 (4) Make rxrpc_queue_conn() return immediately if it sees the active count
     is -1 rather then queuing the connection.

 (5) Make sure that the cleanup routine waits for the work item to
     complete.

 (6) Stash the rxrpc_net pointer in the conn struct so that the rcu free
     routine can use it, even if the local endpoint has been freed.

Unfortunately, neither the timer nor the work item can simply get around
the problem by just using refcount_inc_not_zero() as the waits would still
have to be done, and there would still be the possibility of having to put
the ref in the expiration function.

Note the connection work item is mostly going to go away with the main
event work being transferred to the I/O thread, so the wait in (6) will
become obsolete.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-12-01 13:36:40 +00:00
David Howells
3feda9d69c rxrpc: Don't hold a ref for call timer or workqueue
Currently, rxrpc gives the call timer a ref on the call when it starts it
and this is passed along to the workqueue by the timer expiration function.
The problem comes when queue_work() fails (ie. the work item is already
queued): the timer routine must put the ref - but this may cause the
cleanup code to run.

This has the unfortunate effect that the cleanup code may then be run in
softirq context - which means that any spinlocks it might need to touch
have to be guarded to disable softirqs (ie. they need a "_bh" suffix).

Fix this by:

 (1) Don't give a ref to the timer.

 (2) Making the expiration function not do anything if the refcount is 0.
     Note that this is more of an optimisation.

 (3) Make sure that the cleanup routine waits for timer to complete.

However, this has a consequence that timer cannot give a ref to the work
item.  Therefore the following fixes are also necessary:

 (4) Don't give a ref to the work item.

 (5) Make the work item return asap if it sees the ref count is 0.

 (6) Make sure that the cleanup routine waits for the work item to
     complete.

Unfortunately, neither the timer nor the work item can simply get around
the problem by just using refcount_inc_not_zero() as the waits would still
have to be done, and there would still be the possibility of having to put
the ref in the expiration function.

Note the call work item is going to go away with the work being transferred
to the I/O thread, so the wait in (6) will become obsolete.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-12-01 13:36:39 +00:00
David Howells
9a36a6bc22 rxrpc: trace: Don't use __builtin_return_address for sk_buff tracing
In rxrpc tracing, use enums to generate lists of points of interest rather
than __builtin_return_address() for the sk_buff tracepoint.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-12-01 13:36:39 +00:00
David Howells
fa3492abb6 rxrpc: Trace rxrpc_bundle refcount
Add a tracepoint for the rxrpc_bundle refcounting.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-12-01 13:36:39 +00:00
David Howells
cb0fc0c972 rxrpc: trace: Don't use __builtin_return_address for rxrpc_call tracing
In rxrpc tracing, use enums to generate lists of points of interest rather
than __builtin_return_address() for the rxrpc_call tracepoint

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-12-01 13:36:39 +00:00
David Howells
7fa25105b2 rxrpc: trace: Don't use __builtin_return_address for rxrpc_conn tracing
In rxrpc tracing, use enums to generate lists of points of interest rather
than __builtin_return_address() for the rxrpc_conn tracepoint

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-12-01 13:36:39 +00:00
David Howells
47c810a798 rxrpc: trace: Don't use __builtin_return_address for rxrpc_peer tracing
In rxrpc tracing, use enums to generate lists of points of interest rather
than __builtin_return_address() for the rxrpc_peer tracepoint

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-12-01 13:36:38 +00:00
David Howells
0fde882fc9 rxrpc: trace: Don't use __builtin_return_address for rxrpc_local tracing
In rxrpc tracing, use enums to generate lists of points of interest rather
than __builtin_return_address() for the rxrpc_local tracepoint

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-12-01 13:36:38 +00:00
David Howells
2ebdb26e6a rxrpc: Remove the [k_]proto() debugging macros
Remove the kproto() and _proto() debugging macros in preference to using
tracepoints for this.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-12-01 13:36:38 +00:00
Shakeel Butt
f1a7941243 mm: convert mm's rss stats into percpu_counter
Currently mm_struct maintains rss_stats which are updated on page fault
and the unmapping codepaths.  For page fault codepath the updates are
cached per thread with the batch of TASK_RSS_EVENTS_THRESH which is 64. 
The reason for caching is performance for multithreaded applications
otherwise the rss_stats updates may become hotspot for such applications.

However this optimization comes with the cost of error margin in the rss
stats.  The rss_stats for applications with large number of threads can be
very skewed.  At worst the error margin is (nr_threads * 64) and we have a
lot of applications with 100s of threads, so the error margin can be very
high.  Internally we had to reduce TASK_RSS_EVENTS_THRESH to 32.

Recently we started seeing the unbounded errors for rss_stats for specific
applications which use TCP rx0cp.  It seems like vm_insert_pages()
codepath does not sync rss_stats at all.

This patch converts the rss_stats into percpu_counter to convert the error
margin from (nr_threads * 64) to approximately (nr_cpus ^ 2).  However
this conversion enable us to get the accurate stats for situations where
accuracy is more important than the cpu cost.

This patch does not make such tradeoffs - we can just use
percpu_counter_add_local() for the updates and percpu_counter_sum() (or
percpu_counter_sync() + percpu_counter_read) for the readers.  At the
moment the readers are either procfs interface, oom_killer and memory
reclaim which I think are not performance critical and should be ok with
slow read.  However I think we can make that change in a separate patch.

Link: https://lkml.kernel.org/r/20221024052841.3291983-1-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30 15:58:40 -08:00
Andrew Morton
a38358c934 Merge branch 'mm-hotfixes-stable' into mm-stable 2022-11-30 14:58:42 -08:00
Jakub Kicinski
f2bb566f5c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
tools/lib/bpf/ringbuf.c
  927cbb478a ("libbpf: Handle size overflow for ringbuf mmap")
  b486d19a0a ("libbpf: checkpatch: Fixed code alignments in ringbuf.c")
https://lore.kernel.org/all/20221121122707.44d1446a@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-29 13:04:52 -08:00
Peter Collingbourne
ef6458b1b6 mm: Add PG_arch_3 page flag
As with PG_arch_2, this flag is only allowed on 64-bit architectures due
to the shortage of bits available. It will be used by the arm64 MTE code
in subsequent patches.

Signed-off-by: Peter Collingbourne <pcc@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Steven Price <steven.price@arm.com>
[catalin.marinas@arm.com: added flag preserving in __split_huge_page_tail()]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221104011041.290951-5-pcc@google.com
2022-11-29 09:26:07 +00:00
Catalin Marinas
b0284cd29a mm: Do not enable PG_arch_2 for all 64-bit architectures
Commit 4beba9486a ("mm: Add PG_arch_2 page flag") introduced a new
page flag for all 64-bit architectures. However, even if an architecture
is 64-bit, it may still have limited spare bits in the 'flags' member of
'struct page'. This may happen if an architecture enables SPARSEMEM
without SPARSEMEM_VMEMMAP as is the case with the newly added loongarch.
This architecture port needs 19 more bits for the sparsemem section
information and, while it is currently fine with PG_arch_2, adding any
more PG_arch_* flags will trigger build-time warnings.

Add a new CONFIG_ARCH_USES_PG_ARCH_X option which can be selected by
architectures that need more PG_arch_* flags beyond PG_arch_1. Select it
on arm64.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[pcc@google.com: fix build with CONFIG_ARM64_MTE disabled]
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Steven Price <steven.price@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221104011041.290951-2-pcc@google.com
2022-11-29 09:26:06 +00:00
Stanislav Fomichev
14e5f71e31 net: use %pS for kfree_skb tracing event location
For the cases where 'reason' doesn't give any clue, it's still
nice to be able to track the kfree_skb caller location. %p doesn't
help much so let's use %pS which prints the symbol+offset.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20221123040947.1015721-1-sdf@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-24 15:27:49 +01:00
Steven Rostedt (Google)
8230f27b1c tracing: Add __cpumask to denote a trace event field that is a cpumask_t
The trace events have a __bitmask field that can be used for anything
that requires bitmasks. Although currently it is only used for CPU
masks, it could be used in the future for any type of bitmasks.

There is some user space tooling that wants to know if a field is a CPU
mask and not just some random unsigned long bitmask. Introduce
"__cpumask()" helper functions that work the same as the current
__bitmask() helpers but displays in the format file:

  field:__data_loc cpumask_t *[] mask;    offset:36;      size:4; signed:0;

Instead of:

  field:__data_loc unsigned long[] mask;  offset:32;      size:4; signed:0;

The main difference is the type. Instead of "unsigned long" it is
"cpumask_t *". Note, this type field needs to be a real type in the
__dynamic_array() logic that both __cpumask and__bitmask use, but the
comparison field requires it to be a scalar type whereas cpumask_t is a
structure (non-scalar). But everything works when making it a pointer.

Valentin added changes to remove the need of passing in "nr_bits" and the
__cpumask will always use nr_cpumask_bits as its size.

Link: https://lkml.kernel.org/r/20221014080456.1d32b989@rorschach.local.home

Requested-by: Valentin Schneider <vschneid@redhat.com>
Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-11-23 19:08:30 -05:00
Gautam Menghani
045634ff1e mm/khugepaged: refactor mm_khugepaged_scan_file tracepoint to remove filename from function call
Refactor the mm_khugepaged_scan_file tracepoint to move filename
dereference to the tracepoint definition, to maintain consistency with
other tracepoints[1].

[1]:lore.kernel.org/lkml/20221024111621.3ba17e2c@gandalf.local.home/

Link: https://lkml.kernel.org/r/20221026044524.54793-1-gautammenghani201@gmail.com
Fixes: d41fd2016e ("mm/khugepaged: add tracepoint to hpage_collapse_scan_file()")
Signed-off-by: Gautam Menghani <gautammenghani201@gmail.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Zach O'Keefe <zokeefe@google.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-22 18:50:41 -08:00
Sai Prakash Ranjan
5e5ff73c2e asm-generic/io: Add _RET_IP_ to MMIO trace for more accurate debug info
Due to compiler optimizations like inlining, there are cases where
MMIO traces using _THIS_IP_ for caller information might not be
sufficient to provide accurate debug traces.

1) With optimizations (Seen with GCC):

In this case, _THIS_IP_ works fine and prints the caller information
since it will be inlined into the caller and we get the debug traces
on who made the MMIO access, for ex:

rwmmio_read: qcom_smmu_tlb_sync+0xe0/0x1b0 width=32 addr=0xffff8000087447f4
rwmmio_post_read: qcom_smmu_tlb_sync+0xe0/0x1b0 width=32 val=0x0 addr=0xffff8000087447f4

2) Without optimizations (Seen with Clang):

_THIS_IP_ will not be sufficient in this case as it will print only
the MMIO accessors itself which is of not much use since it is not
inlined as below for example:

rwmmio_read: readl+0x4/0x80 width=32 addr=0xffff8000087447f4
rwmmio_post_read: readl+0x48/0x80 width=32 val=0x4 addr=0xffff8000087447f4

So in order to handle this second case as well irrespective of the compiler
optimizations, add _RET_IP_ to MMIO trace to make it provide more accurate
debug information in all these scenarios.

Before:

rwmmio_read: readl+0x4/0x80 width=32 addr=0xffff8000087447f4
rwmmio_post_read: readl+0x48/0x80 width=32 val=0x4 addr=0xffff8000087447f4

After:

rwmmio_read: qcom_smmu_tlb_sync+0xe0/0x1b0 -> readl+0x4/0x80 width=32 addr=0xffff8000087447f4
rwmmio_post_read: qcom_smmu_tlb_sync+0xe0/0x1b0 -> readl+0x4/0x80 width=32 val=0x0 addr=0xffff8000087447f4

Fixes: 210031971c ("asm-generic/io: Add logging support for MMIO accessors")
Signed-off-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-11-21 22:02:10 +01:00
Alexander Aring
17827754e5 fs: dlm: add dst nodeid for msg tracing
In DLM when we send a dlm message it is easy to add the lock resource
name, but additional lookup is required when to trace the receive
message side. The idea here is to move the lookup work to the user by
using a lookup to find the right send message with recv message. As note
DLM can't drop any message which is guaranteed by a special session
layer.

For doing the lookup a 3 tupel is required as an unique identification
which is dst nodeid, src nodeid and sequence number. This patch adds the
destination nodeid to the dlm message trace points. The source nodeid is
given by the h_nodeid field inside the header.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2022-11-21 09:45:49 -06:00
Alexander Aring
81889255c2 fs: dlm: rename seq to h_seq for msg tracing
This patch renames seq to h_seq as it is named in the dlm header
structure.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2022-11-21 09:45:49 -06:00
Leon Romanovsky
1ec5617432 Merge branch 'mana-shared-6.2' of https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Long Li says:

====================
Introduce Microsoft Azure Network Adapter (MANA) RDMA driver [netdev prep]

The first 11 patches which modify the MANA Ethernet driver to support
RDMA driver.

* 'mana-shared-6.2' of https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  net: mana: Define data structures for protection domain and memory registration
  net: mana: Define data structures for allocating doorbell page from GDMA
  net: mana: Define and process GDMA response code GDMA_STATUS_MORE_ENTRIES
  net: mana: Define max values for SGL entries
  net: mana: Move header files to a common location
  net: mana: Record port number in netdev
  net: mana: Export Work Queue functions for use by RDMA driver
  net: mana: Set the DMA device max segment size
  net: mana: Handle vport sharing between devices
  net: mana: Record the physical address for doorbell page region
  net: mana: Add support for auxiliary device
====================

Link: https://lore.kernel.org/all/1667502990-2559-1-git-send-email-longli@linuxonhyperv.com/
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-11 11:35:12 +02:00
Leonid Ravich
5c20311d76 IB/mad: Don't call to function that might sleep while in atomic context
Tracepoints are not allowed to sleep, as such the following splat is
generated due to call to ib_query_pkey() in atomic context.

WARNING: CPU: 0 PID: 1888000 at kernel/trace/ring_buffer.c:2492 rb_commit+0xc1/0x220
CPU: 0 PID: 1888000 Comm: kworker/u9:0 Kdump: loaded Tainted: G           OE    --------- -  - 4.18.0-305.3.1.el8.x86_64 #1
 Hardware name: Red Hat KVM, BIOS 1.13.0-2.module_el8.3.0+555+a55c8938 04/01/2014
 Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core]
 RIP: 0010:rb_commit+0xc1/0x220
 RSP: 0000:ffffa8ac80f9bca0 EFLAGS: 00010202
 RAX: ffff8951c7c01300 RBX: ffff8951c7c14a00 RCX: 0000000000000246
 RDX: ffff8951c707c000 RSI: ffff8951c707c57c RDI: ffff8951c7c14a00
 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
 R10: ffff8951c7c01300 R11: 0000000000000001 R12: 0000000000000246
 R13: 0000000000000000 R14: ffffffff964c70c0 R15: 0000000000000000
 FS:  0000000000000000(0000) GS:ffff8951fbc00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f20e8f39010 CR3: 000000002ca10005 CR4: 0000000000170ef0
 Call Trace:
  ring_buffer_unlock_commit+0x1d/0xa0
  trace_buffer_unlock_commit_regs+0x3b/0x1b0
  trace_event_buffer_commit+0x67/0x1d0
  trace_event_raw_event_ib_mad_recv_done_handler+0x11c/0x160 [ib_core]
  ib_mad_recv_done+0x48b/0xc10 [ib_core]
  ? trace_event_raw_event_cq_poll+0x6f/0xb0 [ib_core]
  __ib_process_cq+0x91/0x1c0 [ib_core]
  ib_cq_poll_work+0x26/0x80 [ib_core]
  process_one_work+0x1a7/0x360
  ? create_worker+0x1a0/0x1a0
  worker_thread+0x30/0x390
  ? create_worker+0x1a0/0x1a0
  kthread+0x116/0x130
  ? kthread_flush_work_fn+0x10/0x10
  ret_from_fork+0x35/0x40
 ---[ end trace 78ba8509d3830a16 ]---

Fixes: 821bf1de45 ("IB/MAD: Add recv path trace point")
Signed-off-by: Leonid Ravich <lravich@gmail.com>
Link: https://lore.kernel.org/r/Y2t5feomyznrVj7V@leonid-Inspiron-3421
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-10 10:57:15 +02:00
Uladzislau Rezki (Sony)
fabc27f764 mm: vmalloc: add free_vmap_area_noflush trace event
This event is used in order to validate/debug a start address of freed VA,
number of currently outstanding and maximum allowed areas.

Link: https://lkml.kernel.org/r/20221018181053.434508-4-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-08 17:37:17 -08:00
Uladzislau Rezki (Sony)
b3a5a7b099 mm: vmalloc: add purge_vmap_area_lazy trace event
It is for debug purposes to track number of freed vmap areas including a
range it occurs on.

Link: https://lkml.kernel.org/r/20221018181053.434508-3-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-08 17:37:17 -08:00
Uladzislau Rezki (Sony)
3c0c9bc9c9 mm: vmalloc: add alloc_vmap_area trace event
Patch series "Add basic trace events for vmap/vmalloc (v2)", v2.

This small series add some basic trace events for the vmap/vmalloc code. 
Since currently we lack any, sometimes it is hard to start debuging vmap
code if an issue is reported or occured.

For example https://lore.kernel.org/linux-mm/Y0p8BZIiDXLQbde%2F@pc636/T/

The final patch adds two reviewers for vmalloc code.


This patch (of 7):

It is for debug purposes and for validation of passed parameters.

Link: https://lkml.kernel.org/r/20221018181053.434508-1-urezki@gmail.com
Link: https://lkml.kernel.org/r/20221018181053.434508-2-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-08 17:37:17 -08:00
Alexander Aring
e01c4b7bd4 fd: dlm: trace send/recv of dlm message and rcom
This patch adds tracepoints for send and recv cases of dlm messages and
dlm rcom messages. In case of send and dlm message we add the dlm rsb
resource name this dlm messages belongs to. This has the advantage to
follow dlm messages on a per lock basis. In case of recv message the
resource name can be extracted by follow the send message sequence
number.

The dlm message DLM_MSG_PURGE doesn't belong to a lock request and will
not set the resource name in a dlm_message trace. The same for all rcom
messages.

There is additional handling required for this debugging functionality
which is tried to be small as possible. Also the midcomms layer gets
aware of lock resource names, for now this is required to make a
connection between sequence number and lock resource names. It is for
debugging purpose only.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2022-11-08 12:59:41 -06:00
David Howells
1fc4fa2ac9 rxrpc: Fix congestion management
rxrpc has a problem in its congestion management in that it saves the
congestion window size (cwnd) from one call to another, but if this is 0 at
the time is saved, then the next call may not actually manage to ever
transmit anything.

To this end:

 (1) Don't save cwnd between calls, but rather reset back down to the
     initial cwnd and re-enter slow-start if data transmission is idle for
     more than an RTT.

 (2) Preserve ssthresh instead, as that is a handy estimate of pipe
     capacity.  Knowing roughly when to stop slow start and enter
     congestion avoidance can reduce the tendency to overshoot and drop
     larger amounts of packets when probing.

In future, cwind growth also needs to be constrained when the window isn't
being filled due to being application limited.

Reported-by: Simon Wilkinson <sxw@auristor.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-11-08 16:42:28 +00:00
David Howells
d57a3a1516 rxrpc: Save last ACK's SACK table rather than marking txbufs
Improve the tracking of which packets need to be transmitted by saving the
last ACK packet that we receive that has a populated soft-ACK table rather
than marking packets.  Then we can step through the soft-ACK table and look
at the packets we've transmitted beyond that to determine which packets we
might want to retransmit.

We also look at the highest serial number that has been acked to try and
guess which packets we've transmitted the peer is likely to have seen.  If
necessary, we send a ping to retrieve that number.

One downside that might be a problem is that we can't then compare the
previous acked/unacked state so easily in rxrpc_input_soft_acks() - which
is a potential problem for the slow-start algorithm.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-11-08 16:42:28 +00:00
David Howells
a4ea4c4776 rxrpc: Don't use a ring buffer for call Tx queue
Change the way the Tx queueing works to make the following ends easier to
achieve:

 (1) The filling of packets, the encryption of packets and the transmission
     of packets can be handled in parallel by separate threads, rather than
     rxrpc_sendmsg() allocating, filling, encrypting and transmitting each
     packet before moving onto the next one.

 (2) Get rid of the fixed-size ring which sets a hard limit on the number
     of packets that can be retained in the ring.  This allows the number
     of packets to increase without having to allocate a very large ring or
     having variable-sized rings.

     [Note: the downside of this is that it's then less efficient to locate
     a packet for retransmission as we then have to step through a list and
     examine each buffer in the list.]

 (3) Allow the filler/encrypter to run ahead of the transmission window.

 (4) Make it easier to do zero copy UDP from the packet buffers.

 (5) Make it easier to do zero copy from userspace to the packet buffers -
     and thence to UDP (only if for unauthenticated connections).

To that end, the following changes are made:

 (1) Use the new rxrpc_txbuf struct instead of sk_buff for keeping packets
     to be transmitted in.  This allows them to be placed on multiple
     queues simultaneously.  An sk_buff isn't really necessary as it's
     never passed on to lower-level networking code.

 (2) Keep the transmissable packets in a linked list on the call struct
     rather than in a ring.  As a consequence, the annotation buffer isn't
     used either; rather a flag is set on the packet to indicate ackedness.

 (3) Use the RXRPC_CALL_TX_LAST flag to indicate that the last packet to be
     transmitted has been queued.  Add RXRPC_CALL_TX_ALL_ACKED to indicate
     that all packets up to and including the last got hard acked.

 (4) Wire headers are now stored in the txbuf rather than being concocted
     on the stack and they're stored immediately before the data, thereby
     allowing zerocopy of a single span.

 (5) Don't bother with instant-resend on transmission failure; rather,
     leave it for a timer or an ACK packet to trigger.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-11-08 16:42:28 +00:00
David Howells
5d7edbc923 rxrpc: Get rid of the Rx ring
Get rid of the Rx ring and replace it with a pair of queues instead.  One
queue gets the packets that are in-sequence and are ready for processing by
recvmsg(); the other queue gets the out-of-sequence packets for addition to
the first queue as the holes get filled.

The annotation ring is removed and replaced with a SACK table.  The SACK
table has the bits set that correspond exactly to the sequence number of
the packet being acked.  The SACK ring is copied when an ACK packet is
being assembled and rotated so that the first ACK is in byte 0.

Flow control handling is altered so that packets that are moved to the
in-sequence queue are hard-ACK'd even before they're consumed - and then
the Rx window size in the ACK packet (rsize) is shrunk down to compensate
(even going to 0 if the window is full).

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-11-08 16:42:28 +00:00
David Howells
d4d02d8bb5 rxrpc: Clone received jumbo subpackets and queue separately
Split up received jumbo packets into separate skbuffs by cloning the
original skbuff for each subpacket and setting the offset and length of the
data in that subpacket in the skbuff's private data.  The subpackets are
then placed on the recvmsg queue separately.  The security class then gets
to revise the offset and length to remove its metadata.

If we fail to clone a packet, we just drop it and let the peer resend it.
The original packet gets used for the final subpacket.

This should make it easier to handle parallel decryption of the subpackets.
It also simplifies the handling of lost or misordered packets in the
queuing/buffering loop as the possibility of overlapping jumbo packets no
longer needs to be considered.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-11-08 16:42:28 +00:00
David Howells
faf92e8d53 rxrpc: Split the rxrpc_recvmsg tracepoint
Split the rxrpc_recvmsg tracepoint so that the tracepoints that are about
data packet processing (and which have extra pieces of information) are
separate from the tracepoint that shows the general flow of recvmsg().

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-11-08 16:42:28 +00:00
David Howells
530403d9ba rxrpc: Clean up ACK handling
Clean up the rxrpc_propose_ACK() function.  If deferred PING ACK proposal
is split out, it's only really needed for deferred DELAY ACKs.  All other
ACKs, bar terminal IDLE ACK are sent immediately.  The deferred IDLE ACK
submission can be handled by conversion of a DELAY ACK into an IDLE ACK if
there's nothing to be SACK'd.

Also, because there's a delay between an ACK being generated and being
transmitted, it's possible that other ACKs of the same type will be
generated during that interval.  Apart from the ACK time and the serial
number responded to, most of the ACK body, including window and SACK
parameters, are not filled out till the point of transmission - so we can
avoid generating a new ACK if there's one pending that will cover the SACK
data we need to convey.

Therefore, don't propose a new DELAY or IDLE ACK for a call if there's one
already pending.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-11-08 16:42:28 +00:00
David Howells
72f0c6fb05 rxrpc: Allocate ACK records at proposal and queue for transmission
Allocate rxrpc_txbuf records for ACKs and put onto a queue for the
transmitter thread to dispatch.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-11-08 16:42:28 +00:00
David Howells
02a1935640 rxrpc: Define rxrpc_txbuf struct to carry data to be transmitted
Define a struct, rxrpc_txbuf, to carry data to be transmitted instead of a
socket buffer so that it can be placed onto multiple queues at once.  This
also allows the data buffer to be in the same allocation as the internal
data.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-11-08 16:42:28 +00:00
David Howells
27f699ccb8 rxrpc: Remove the flags from the rxrpc_skb tracepoint
Remove the flags from the rxrpc_skb tracepoint as we're no longer going to
be using this for the transmission buffers and so marking which are
transmission buffers isn't going to be necessary.

Note that this also remove the rxrpc skb flag that indicates if this is a
transmission buffer and so the count is not updated for the moment.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-11-08 16:42:28 +00:00
David Howells
f7fa52421f rxrpc: Record stats for why the REQUEST-ACK flag is being set
Record stats for why the REQUEST-ACK flag is being set.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-11-08 16:42:15 +00:00
David Howells
334dfbfc5a rxrpc: Split call timer-expiration from call timer-set tracepoint
Split the tracepoint for call timer-set to separate out the call
timer-expiration event

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-11-08 16:42:15 +00:00
David Howells
4d843be56b rxrpc: Trace setting of the request-ack flag
Add a tracepoint to log why the request-ack flag is set on an outgoing DATA
packet, allowing debugging as to why.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2022-11-08 16:42:15 +00:00
Mukesh Ojha
195623f2d8 f2fs: fix the msg data type
Data type of msg in f2fs_write_checkpoint trace should
be const char * instead of char *.

Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-11-01 17:56:03 -07:00
Mukesh Ojha
0db18eec0d f2fs: fix the assign logic of iocb
commit 18ae8d12991b ("f2fs: show more DIO information in tracepoint")
introduces iocb field in 'f2fs_direct_IO_enter' trace event
And it only assigns the pointer and later it accesses its field
in trace print log.

Unable to handle kernel paging request at virtual address ffffffc04cef3d30
Mem abort info:
ESR = 0x96000007
EC = 0x25: DABT (current EL), IL = 32 bits

 pc : trace_raw_output_f2fs_direct_IO_enter+0x54/0xa4
 lr : trace_raw_output_f2fs_direct_IO_enter+0x2c/0xa4
 sp : ffffffc0443cbbd0
 x29: ffffffc0443cbbf0 x28: ffffff8935b120d0 x27: ffffff8935b12108
 x26: ffffff8935b120f0 x25: ffffff8935b12100 x24: ffffff8935b110c0
 x23: ffffff8935b10000 x22: ffffff88859a936c x21: ffffff88859a936c
 x20: ffffff8935b110c0 x19: ffffff8935b10000 x18: ffffffc03b195060
 x17: ffffff8935b11e76 x16: 00000000000000cc x15: ffffffef855c4f2c
 x14: 0000000000000001 x13: 000000000000004e x12: ffff0000ffffff00
 x11: ffffffef86c350d0 x10: 00000000000010c0 x9 : 000000000fe0002c
 x8 : ffffffc04cef3d28 x7 : 7f7f7f7f7f7f7f7f x6 : 0000000002000000
 x5 : ffffff8935b11e9a x4 : 0000000000006250 x3 : ffff0a00ffffff04
 x2 : 0000000000000002 x1 : ffffffef86a0a31f x0 : ffffff8935b10000
 Call trace:
  trace_raw_output_f2fs_direct_IO_enter+0x54/0xa4
  print_trace_fmt+0x9c/0x138
  print_trace_line+0x154/0x254
  tracing_read_pipe+0x21c/0x380
  vfs_read+0x108/0x3ac
  ksys_read+0x7c/0xec
  __arm64_sys_read+0x20/0x30
  invoke_syscall+0x60/0x150
  el0_svc_common.llvm.1237943816091755067+0xb8/0xf8
  do_el0_svc+0x28/0xa0

Fix it by copying the required variables for printing and while at
it fix the similar issue at some other places in the same file.

Fixes: bd984c0309 ("f2fs: show more DIO information in tracepoint")
Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-11-01 17:56:03 -07:00
Linus Torvalds
4f1e0c18bc linux-watchdog 6.1-rc2 tag
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iEYEABECAAYFAmNQOiwACgkQ+iyteGJfRsopNgCgw0BPrAnTfXQxiPPJiej/vUIu
 rWMAoLs5azbM52vw54b5onmIngSYacNL
 =1XLP
 -----END PGP SIGNATURE-----

Merge tag 'linux-watchdog-6.1-rc2' of git://www.linux-watchdog.org/linux-watchdog

Pull watchdog updates from Wim Van Sebroeck:

 - Add tracing events for the most common watchdog events

* tag 'linux-watchdog-6.1-rc2' of git://www.linux-watchdog.org/linux-watchdog:
  watchdog: Add tracing events for the most usual watchdog events
2022-10-21 12:25:39 -07:00
Uwe Kleine-König
e25b091bed watchdog: Add tracing events for the most usual watchdog events
To simplify debugging which process touches a watchdog and when, add
tracing events for .start(), .set_timeout(), .ping() and .stop().

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20221008174602.3972859-1-u.kleine-koenig@pengutronix.de
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
2022-10-12 09:47:02 +02:00
Linus Torvalds
5d170fe435 f2fs-for-6.1-rc1
This round looks fairly small comparing to the previous updates which includes
 mostly minor bug fixes. Nevertheless, as we've still interested in improving
 the stability, Chao added some debugging methods to diagnoze subtle runtime
 inconsistency problem.
 
 Enhancement
  - store all the corruption or failure reasons in superblock
  - detect meta inode, summary info, and block address inconsistency
  - increase the limit for reserve_root for low-end devices
  - add the number of compressed IO in iostat
 
 Bug fix
  - DIO write fix for zoned devices
  - do out-of-place writes for cold files
  - fix some stat updates (FS_CP_DATA_IO, dirty page count)
  - fix race condition on setting FI_NO_EXTENT flag
  - fix data races when freezing super
  - fix wrong continue condition check in GC
  - do not allow ATGC for LFS mode
 
 In addition, there're some code enhancement and clean-ups as usual.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAmNEVIkACgkQQBSofoJI
 UNL/Qg//eu7k196yIKflDZmp5aJbb5ybpFmh7XkPiqAV17ns+R2uLGq68BvTs+Tg
 rqCjB7j2kkBh1kN32R7aGcx6tcbHjWc94pi59YTGQ6+pwkop3KJxFHSwAaUw6y34
 8NZwmsnrm9rv0A0QPhQPK19yWmG/2smUE9b/u7M3+20I1WANaxdS/vOKbZz/amOu
 f/BvsIIGS7Zzm9OpBCvGmq9Qpd83jlH6PuYGTC/OVbCrUiAJEmwN8wGsKP/9qB/5
 KxVpdlh3vxulS6ixNbMu2qw9GBAQpAOz50+eDL5ZtGvGIQNHZRpGlfpJoW1lz0EO
 4fJtpf5OMGqUbNaPCTG4qQGYAtKWA9YnFeWSS7RViQ6MryRXZMK8ka5eIe5Qblcf
 AXD/eU2gKzOu0fuvdBRCt/wTSb4gY8sMNhe4psDsZxfhaYIpX8Ee/XVa4d+Z4frg
 irN9gid1k3laMTx9dwJL8m7gIFvy3pak6l3B0bA69fAXd3faI40enuyfubFxnDet
 OuRNxj8j3J5C140ag5KOuBCRub2/aPaj9YSQqUstf64d8FzN/Ypn5iVPTs2DP/3D
 bcAFBwCS2+MCsk9+ra0WldZ5awdd6CRHDkvaYeDEuLCaLHUCo6CXe3aIyWawJBvJ
 RnghKNv82RIV+rQlI1/sg8lseoDnEZTp5iwDGw/qZ+ZUyn05apM=
 =aZ9y
 -----END PGP SIGNATURE-----

Merge tag 'f2fs-for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "This round looks fairly small comparing to the previous updates and
  includes mostly minor bug fixes. Nevertheless, as we've still
  interested in improving the stability, Chao added some debugging
  methods to diagnoze subtle runtime inconsistency problem.

  Enhancements:
   - store all the corruption or failure reasons in superblock
   - detect meta inode, summary info, and block address inconsistency
   - increase the limit for reserve_root for low-end devices
   - add the number of compressed IO in iostat

  Bug fixes:
   - DIO write fix for zoned devices
   - do out-of-place writes for cold files
   - fix some stat updates (FS_CP_DATA_IO, dirty page count)
   - fix race condition on setting FI_NO_EXTENT flag
   - fix data races when freezing super
   - fix wrong continue condition check in GC
   - do not allow ATGC for LFS mode

  In addition, there're some code enhancement and clean-ups as usual"

* tag 'f2fs-for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (32 commits)
  f2fs: change to use atomic_t type form sbi.atomic_files
  f2fs: account swapfile inodes
  f2fs: allow direct read for zoned device
  f2fs: support recording errors into superblock
  f2fs: support recording stop_checkpoint reason into super_block
  f2fs: remove the unnecessary check in f2fs_xattr_fiemap
  f2fs: introduce cp_status sysfs entry
  f2fs: fix to detect corrupted meta ino
  f2fs: fix to account FS_CP_DATA_IO correctly
  f2fs: code clean and fix a type error
  f2fs: add "c_len" into trace_f2fs_update_extent_tree_range for compressed file
  f2fs: fix to do sanity check on summary info
  f2fs: port to vfs{g,u}id_t and associated helpers
  f2fs: fix to do sanity check on destination blkaddr during recovery
  f2fs: let FI_OPU_WRITE override FADVISE_COLD_BIT
  f2fs: fix race condition on setting FI_NO_EXTENT flag
  f2fs: remove redundant check in f2fs_sanity_check_cluster
  f2fs: add static init_idisk_time function to reduce the code
  f2fs: fix typo
  f2fs: fix wrong dirty page count when race between mmap and fallocate.
  ...
2022-10-10 20:28:41 -07:00
Linus Torvalds
27bc50fc90 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
linux-next for a couple of months without, to my knowledge, any negative
   reports (or any positive ones, come to that).
 
 - Also the Maple Tree from Liam R.  Howlett.  An overlapping range-based
   tree for vmas.  It it apparently slight more efficient in its own right,
   but is mainly targeted at enabling work to reduce mmap_lock contention.
 
   Liam has identified a number of other tree users in the kernel which
   could be beneficially onverted to mapletrees.
 
   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   (https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com).
   This has yet to be addressed due to Liam's unfortunately timed
   vacation.  He is now back and we'll get this fixed up.
 
 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer.  It uses
   clang-generated instrumentation to detect used-unintialized bugs down to
   the single bit level.
 
   KMSAN keeps finding bugs.  New ones, as well as the legacy ones.
 
 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.
 
 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support
   file/shmem-backed pages.
 
 - userfaultfd updates from Axel Rasmussen
 
 - zsmalloc cleanups from Alexey Romanov
 
 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure
 
 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.
 
 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.
 
 - memcg cleanups from Kairui Song.
 
 - memcg fixes and cleanups from Johannes Weiner.
 
 - Vishal Moola provides more folio conversions
 
 - Zhang Yi removed ll_rw_block() :(
 
 - migration enhancements from Peter Xu
 
 - migration error-path bugfixes from Huang Ying
 
 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths.  For optimizations by PMEM drivers, DRM
   drivers, etc.
 
 - vma merging improvements from Jakub Matěn.
 
 - NUMA hinting cleanups from David Hildenbrand.
 
 - xu xin added aditional userspace visibility into KSM merging activity.
 
 - THP & KSM code consolidation from Qi Zheng.
 
 - more folio work from Matthew Wilcox.
 
 - KASAN updates from Andrey Konovalov.
 
 - DAMON cleanups from Kaixu Xia.
 
 - DAMON work from SeongJae Park: fixes, cleanups.
 
 - hugetlb sysfs cleanups from Muchun Song.
 
 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0HaPgAKCRDdBJ7gKXxA
 joPjAQDZ5LlRCMWZ1oxLP2NOTp6nm63q9PWcGnmY50FjD/dNlwEAnx7OejCLWGWf
 bbTuk6U2+TKgJa4X7+pbbejeoqnt5QU=
 =xfWx
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
   linux-next for a couple of months without, to my knowledge, any
   negative reports (or any positive ones, come to that).

 - Also the Maple Tree from Liam Howlett. An overlapping range-based
   tree for vmas. It it apparently slightly more efficient in its own
   right, but is mainly targeted at enabling work to reduce mmap_lock
   contention.

   Liam has identified a number of other tree users in the kernel which
   could be beneficially onverted to mapletrees.

   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   at [1]. This has yet to be addressed due to Liam's unfortunately
   timed vacation. He is now back and we'll get this fixed up.

 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
   clang-generated instrumentation to detect used-unintialized bugs down
   to the single bit level.

   KMSAN keeps finding bugs. New ones, as well as the legacy ones.

 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.

 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to
   support file/shmem-backed pages.

 - userfaultfd updates from Axel Rasmussen

 - zsmalloc cleanups from Alexey Romanov

 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and
   memory-failure

 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.

 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.

 - memcg cleanups from Kairui Song.

 - memcg fixes and cleanups from Johannes Weiner.

 - Vishal Moola provides more folio conversions

 - Zhang Yi removed ll_rw_block() :(

 - migration enhancements from Peter Xu

 - migration error-path bugfixes from Huang Ying

 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths. For optimizations by PMEM drivers, DRM
   drivers, etc.

 - vma merging improvements from Jakub Matěn.

 - NUMA hinting cleanups from David Hildenbrand.

 - xu xin added aditional userspace visibility into KSM merging
   activity.

 - THP & KSM code consolidation from Qi Zheng.

 - more folio work from Matthew Wilcox.

 - KASAN updates from Andrey Konovalov.

 - DAMON cleanups from Kaixu Xia.

 - DAMON work from SeongJae Park: fixes, cleanups.

 - hugetlb sysfs cleanups from Muchun Song.

 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.

Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]

* tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
  hugetlb: allocate vma lock for all sharable vmas
  hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer
  hugetlb: fix vma lock handling during split vma and range unmapping
  mglru: mm/vmscan.c: fix imprecise comments
  mm/mglru: don't sync disk for each aging cycle
  mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
  mm: memcontrol: use do_memsw_account() in a few more places
  mm: memcontrol: deprecate swapaccounting=0 mode
  mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
  mm/secretmem: remove reduntant return value
  mm/hugetlb: add available_huge_pages() func
  mm: remove unused inline functions from include/linux/mm_inline.h
  selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
  selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
  selftests/vm: add thp collapse shmem testing
  selftests/vm: add thp collapse file and tmpfs testing
  selftests/vm: modularize thp collapse memory operations
  selftests/vm: dedup THP helpers
  mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
  mm/madvise: add file and shmem support to MADV_COLLAPSE
  ...
2022-10-10 17:53:04 -07:00
Linus Torvalds
52abb27abf slab fixes for 6.1-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEjUuTAak14xi+SF7M4CHKc/GJqRAFAmM6/BMACgkQ4CHKc/GJ
 qRBqBAgAh+5JdVkYBxW4MvGEolRw0RDIBNwEwmyJI7WeAegL8FaGI3jmA5Kcww4c
 yA+lL/jcS9zQ/qwwHHoCqZoCLDFa43oiDMjSW4MI6oZpV+T6lx5uaH5kXBKsmxy5
 2dONP7kYG/eFfBGB6F9qQOLJnCz0CXeY7+O99D1Nldx0yKKUVCK0krb018p5oI6a
 RTVRASSVuEGkxvJGo4BbIR1H40s1BKTyRO9eZCKEHSanYM5SVXdBy9GTh5VQWTPk
 WLwvXmd0DehZzlPrgg3PMVPBTNGO/yplWibugWyzUqGcPIhQPk6Z76aWE4vojI2q
 f0w+86BYR2U7SBV2ZaNrGrxk/PZJyg==
 =aDgU
 -----END PGP SIGNATURE-----

Merge tag 'slab-for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab fixes from Vlastimil Babka:

 - The "common kmalloc v4" series [1] by Hyeonggon Yoo.

   While the plan after LPC is to try again if it's possible to get rid
   of SLOB and SLAB (and if any critical aspect of those is not possible
   to achieve with SLUB today, modify it accordingly), it will take a
   while even in case there are no objections.

   Meanwhile this is a nice cleanup and some parts (e.g. to the
   tracepoints) will be useful even if we end up with a single slab
   implementation in the future:

      - Improves the mm/slab_common.c wrappers to allow deleting
        duplicated code between SLAB and SLUB.

      - Large kmalloc() allocations in SLAB are passed to page allocator
        like in SLUB, reducing number of kmalloc caches.

      - Removes the {kmem_cache_alloc,kmalloc}_node variants of
        tracepoints, node id parameter added to non-_node variants.

 - Addition of kmalloc_size_roundup()

   The first two patches from a series by Kees Cook [2] that introduce
   kmalloc_size_roundup(). This will allow merging of per-subsystem
   patches using the new function and ultimately stop (ab)using ksize()
   in a way that causes ongoing trouble for debugging functionality and
   static checkers.

 - Wasted kmalloc() memory tracking in debugfs alloc_traces

   A patch from Feng Tang that enhances the existing debugfs
   alloc_traces file for kmalloc caches with information about how much
   space is wasted by allocations that needs less space than the
   particular kmalloc cache provides.

 - My series [3] to fix validation races for caches with enabled
   debugging:

      - By decoupling the debug cache operation more from non-debug
        fastpaths, extra locking simplifications were possible and thus
        done afterwards.

      - Additional cleanup of PREEMPT_RT specific code on top, by Thomas
        Gleixner.

      - A late fix for slab page leaks caused by the series, by Feng
        Tang.

 - Smaller fixes and cleanups:

      - Unneeded variable removals, by ye xingchen

      - A cleanup removing a BUG_ON() in create_unique_id(), by Chao Yu

Link: https://lore.kernel.org/all/20220817101826.236819-1-42.hyeyoo@gmail.com/ [1]
Link: https://lore.kernel.org/all/20220923202822.2667581-1-keescook@chromium.org/ [2]
Link: https://lore.kernel.org/all/20220823170400.26546-1-vbabka@suse.cz/ [3]

* tag 'slab-for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: (30 commits)
  mm/slub: fix a slab missed to be freed problem
  slab: Introduce kmalloc_size_roundup()
  slab: Remove __malloc attribute from realloc functions
  mm/slub: clean up create_unique_id()
  mm/slub: enable debugging memory wasting of kmalloc
  slub: Make PREEMPT_RT support less convoluted
  mm/slub: simplify __cmpxchg_double_slab() and slab_[un]lock()
  mm/slub: convert object_map_lock to non-raw spinlock
  mm/slub: remove slab_lock() usage for debug operations
  mm/slub: restrict sysfs validation to debug caches and make it safe
  mm/sl[au]b: check if large object is valid in __ksize()
  mm/slab_common: move declaration of __ksize() to mm/slab.h
  mm/slab_common: drop kmem_alloc & avoid dereferencing fields when not using
  mm/slab_common: unify NUMA and UMA version of tracepoints
  mm/sl[au]b: cleanup kmem_cache_alloc[_node]_trace()
  mm/sl[au]b: generalize kmalloc subsystem
  mm/slub: move free_debug_processing() further
  mm/sl[au]b: introduce common alloc/free functions without tracepoint
  mm/slab: kmalloc: pass requests larger than order-1 page to page allocator
  mm/slab_common: cleanup kmalloc_large()
  ...
2022-10-10 10:21:22 -07:00
Linus Torvalds
a09476668e Char/Misc and other driver changes for 6.1-rc1
Here is the large set of char/misc and other small driver subsystem
 changes for 6.1-rc1.  Loads of different things in here:
   - IIO driver updates, additions, and changes.  Probably the largest
     part of the diffstat
   - habanalabs driver update with support for new hardware and features,
     the second largest part of the diff.
   - fpga subsystem driver updates and additions
   - mhi subsystem updates
   - Coresight driver updates
   - gnss subsystem updates
   - extcon driver updates
   - icc subsystem updates
   - fsi subsystem updates
   - nvmem subsystem and driver updates
   - misc driver updates
   - speakup driver additions for new features
   - lots of tiny driver updates and cleanups
 
 All of these have been in the linux-next tree for a while with no
 reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY0GQmA8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylyVQCeNJjZ3hy+Wz8WkPSY+NkehuIhyCIAnjXMOJP8
 5G/JQ+rpcclr7VOXlS66
 =zVkU
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc and other driver updates from Greg KH:
 "Here is the large set of char/misc and other small driver subsystem
  changes for 6.1-rc1. Loads of different things in here:

   - IIO driver updates, additions, and changes. Probably the largest
     part of the diffstat

   - habanalabs driver update with support for new hardware and
     features, the second largest part of the diff.

   - fpga subsystem driver updates and additions

   - mhi subsystem updates

   - Coresight driver updates

   - gnss subsystem updates

   - extcon driver updates

   - icc subsystem updates

   - fsi subsystem updates

   - nvmem subsystem and driver updates

   - misc driver updates

   - speakup driver additions for new features

   - lots of tiny driver updates and cleanups

  All of these have been in the linux-next tree for a while with no
  reported issues"

* tag 'char-misc-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (411 commits)
  w1: Split memcpy() of struct cn_msg flexible array
  spmi: pmic-arb: increase SPMI transaction timeout delay
  spmi: pmic-arb: block access for invalid PMIC arbiter v5 SPMI writes
  spmi: pmic-arb: correct duplicate APID to PPID mapping logic
  spmi: pmic-arb: add support to dispatch interrupt based on IRQ status
  spmi: pmic-arb: check apid against limits before calling irq handler
  spmi: pmic-arb: do not ack and clear peripheral interrupts in cleanup_irq
  spmi: pmic-arb: handle spurious interrupt
  spmi: pmic-arb: add a print in cleanup_irq
  drivers: spmi: Directly use ida_alloc()/free()
  MAINTAINERS: add TI ECAP driver info
  counter: ti-ecap-capture: capture driver support for ECAP
  Documentation: ABI: sysfs-bus-counter: add frequency & num_overflows items
  dt-bindings: counter: add ti,am62-ecap-capture.yaml
  counter: Introduce the COUNTER_COMP_ARRAY component type
  counter: Consolidate Counter extension sysfs attribute creation
  counter: Introduce the Count capture component
  counter: 104-quad-8: Add Signal polarity component
  counter: Introduce the Signal polarity component
  counter: interrupt-cnt: Implement watch_validate callback
  ...
2022-10-08 08:56:37 -07:00
Linus Torvalds
0a78a376ef for-6.1/io_uring-2022-10-03
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmM67S0QHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgppnPEACkBzilBLKwT9MWdUAITwyrMXsAa1R9gsR9
 Tb3Xs+mNO2meuycLAUh4LIbb28NNr7/S5rwWet5NRZ71hgv4Q/WA/0EemAGGXYqd
 +3MEBAWU3FBFkC/cJXCnT8F5yCXYRkT5n/hzCSNEpNKjQ5JnAhHDlWAjgzZRuD/A
 A+YJjoBVJJuI1wY4I5XCpeQXEmg/Wc1MgXfyHgWVtGKnYrrxibiCnBZnqbAMZNvD
 hGn1Vl02ooamGTFm/nW/OAt71DtqsjWUCVOHKmlZ+zBUjbUj6FMXmPVV7vCV9o2w
 PT4Dx3CTc2iXwa8KfEFNPvXBzy0Qfu8edweP/MvZHWHVZREpEAh4cG6GhwW8whD+
 5mPisqmRjZKe0BBS4k/wKN1RXEypSQoTU4EdljfbQPU/usn35lmjMmEXXgs3IhqM
 fcTdO5ZUOp+CGyzI0Bc7UtS8vilJbX9ynN8G80MUUAZzuQg39MH7lNQYSJSSvJfU
 OlvzmL3lhRLYM1s/KKiZzdDBoMvC7R4oHmzCveOjQTMIHf6WNyqKFlrWScq2wzpN
 oRxqt0xiVQ3PFMmFj6N08f145qtbASuF3sKv7dbU3QXTsCAos3wdTdX+PejYApEZ
 W3dr0TDjNBicLNVPiSj132p0ZRtdZvLGuGVkBD4GPQeH2NwswxMHQAfz8e2lqmA4
 9bWG6BM7Yw==
 =m9kX
 -----END PGP SIGNATURE-----

Merge tag 'for-6.1/io_uring-2022-10-03' of git://git.kernel.dk/linux

Pull io_uring updates from Jens Axboe:

 - Add supported for more directly managed task_work running.

   This is beneficial for real world applications that end up issuing
   lots of system calls as part of handling work. Normal task_work will
   always execute as we transition in and out of the kernel, even for
   "unrelated" system calls. It's more efficient to defer the handling
   of io_uring's deferred work until the application wants it to be run,
   generally in batches.

   As part of ongoing work to write an io_uring network backend for
   Thrift, this has been shown to greatly improve performance. (Dylan)

 - Add IOPOLL support for passthrough (Kanchan)

 - Improvements and fixes to the send zero-copy support (Pavel)

 - Partial IO handling fixes (Pavel)

 - CQE ordering fixes around CQ ring overflow (Pavel)

 - Support sendto() for non-zc as well (Pavel)

 - Support sendmsg for zerocopy (Pavel)

 - Networking iov_iter fix (Stefan)

 - Misc fixes and cleanups (Pavel, me)

* tag 'for-6.1/io_uring-2022-10-03' of git://git.kernel.dk/linux: (56 commits)
  io_uring/net: fix notif cqe reordering
  io_uring/net: don't update msg_name if not provided
  io_uring: don't gate task_work run on TIF_NOTIFY_SIGNAL
  io_uring/rw: defer fsnotify calls to task context
  io_uring/net: fix fast_iov assignment in io_setup_async_msg()
  io_uring/net: fix non-zc send with address
  io_uring/net: don't skip notifs for failed requests
  io_uring/rw: don't lose short results on io_setup_async_rw()
  io_uring/rw: fix unexpected link breakage
  io_uring/net: fix cleanup double free free_iov init
  io_uring: fix CQE reordering
  io_uring/net: fix UAF in io_sendrecv_fail()
  selftest/net: adjust io_uring sendzc notif handling
  io_uring: ensure local task_work marks task as running
  io_uring/net: zerocopy sendmsg
  io_uring/net: combine fail handlers
  io_uring/net: rename io_sendzc()
  io_uring/net: support non-zerocopy sendto
  io_uring/net: refactor io_setup_async_addr
  io_uring/net: don't lose partial send_zc on fail
  ...
2022-10-07 08:52:43 -07:00
Linus Torvalds
76e4503534 for-6.1-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmM6zNkACgkQxWXV+ddt
 WDsNMg/+LTuwf6Js+mAl1AgtSpLOl2gLfNBJAUXhzwPbc3nF9bwONE/EUYEXTo5h
 kTf1cQRj0NCIZ7iHDwXuWNm77diNl+SChEDIoc7k0d6P7Qmmn2AWbTLM4dleyg5S
 6jxPpOMbegycQfL9tSJNaiT9zlZxj9Z+0yPibR99otrgtuv6zuvRxcdh34rEFIyf
 xoabO3/18lAKHzYzAZxNXMpbUSBmqLPVoZEOcfBAXvcuIJkzKRP6Y9gwlYs+kn+D
 J8BPa3LoSNxXrpCvWzlu7vO3gwNp7H7pQQqZKjjEcOZ+dj2UYQeTyJvl1vdzaNyk
 EoFYlkaKkYi7RaonuHjNaTeD/igJf8Eo6DTiXzACECssbKutlvNG4HXuFApsWy7M
 T7KZ5jTAQ98ZMYjgZ27UbEpFZd8lYHzV952Njjo9zbRVbqwaPEZTTdkjpz+3X6t4
 Z0A951ixOYKiOVdu3Uj1fHaBv0n/p0wrXIGt3ZIdjufM9TctV3oJwOZOiM2H0ccb
 XJVwsQG92+ja9XLZrw8H62PCKBYo3LL52r9b9NVodY9aTsQWTfiV5OP84RRlncCp
 hzPkHmO1YIyVcLoijagiO7cW21pQbKfqsRX/P1F7DXyjosHppmDS7IHDWA7Adf3W
 QA6eBnoWqVwBh7P+IyxJuRG0CrnxkPZeAZIhohDwk5Mt4NGATkA=
 =NlUz
 -----END PGP SIGNATURE-----

Merge tag 'for-6.1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
 "There's a bunch of performance improvements, most notably the FIEMAP
  speedup, the new block group tree to speed up mount on large
  filesystems, more io_uring integration, some sysfs exports and the
  usual fixes and core updates.

  Summary:

  Performance:

   - outstanding FIEMAP speed improvement
      - algorithmic change how extents are enumerated leads to orders of
        magnitude speed boost (uncached and cached)
      - extent sharing check speedup (2.2x uncached, 3x cached)
      - add more cancellation points, allowing to interrupt seeking in
        files with large number of extents
      - more efficient hole and data seeking (4x uncached, 1.3x cached)
      - sample results:
	    256M, 32K extents:   4s ->  29ms  (~150x)
	    512M, 64K extents:  30s ->  59ms  (~550x)
	    1G,  128K extents: 225s -> 120ms (~1800x)

   - improved inode logging, especially for directories (on dbench
     workload throughput +25%, max latency -21%)

   - improved buffered IO, remove redundant extent state tracking,
     lowering memory consumption and avoiding rb tree traversal

   - add sysfs tunable to let qgroup temporarily skip exact accounting
     when deleting snapshot, leading to a speedup but requiring a rescan
     after that, will be used by snapper

   - support io_uring and buffered writes, until now it was just for
     direct IO, with the no-wait semantics implemented in the buffered
     write path it now works and leads to speed improvement in IOPS
     (2x), throughput (2.2x), latency (depends, 2x to 150x)

   - small performance improvements when dropping and searching for
     extent maps as well as when flushing delalloc in COW mode
     (throughput +5MB/s)

  User visible changes:

   - new incompatible feature block-group-tree adding a dedicated tree
     for tracking block groups, this allows a much faster load during
     mount and avoids seeking unlike when it's scattered in the extent
     tree items
      - this reduces mount time for many-terabyte sized filesystems
      - conversion tool will be provided so existing filesystem can also
        be updated in place
      - to reduce test matrix and feature combinations requires no-holes
        and free-space-tree (mkfs defaults since 5.15)

   - improved reporting of super block corruption detected by scrub

   - scrub also tries to repair super block and does not wait until next
     commit

   - discard stats and tunables are exported in sysfs
     (/sys/fs/btrfs/FSID/discard)

   - qgroup status is exported in sysfs
     (/sys/sys/fs/btrfs/FSID/qgroups/)

   - verify that super block was not modified when thawing filesystem

  Fixes:

   - FIEMAP fixes
      - fix extent sharing status, does not depend on the cached status
        where merged
      - flush delalloc so compressed extents are reported correctly

   - fix alignment of VMA for memory mapped files on THP

   - send: fix failures when processing inodes with no links (orphan
     files and directories)

   - fix race between quota enable and quota rescan ioctl

   - handle more corner cases for read-only compat feature verification

   - fix missed extent on fsync after dropping extent maps

  Core:

   - lockdep annotations to validate various transactions states and
     state transitions

   - preliminary support for fs-verity in send

   - more effective memory use in scrub for subpage where sector is
     smaller than page

   - block group caching progress logic has been removed, load is now
     synchronous

   - simplify end IO callbacks and bio handling, use chained bios
     instead of own tracking

   - add no-wait semantics to several functions (tree search, nocow,
     flushing, buffered write

   - cleanups and refactoring

  MM changes:

   - export balance_dirty_pages_ratelimited_flags"

* tag 'for-6.1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (177 commits)
  btrfs: set generation before calling btrfs_clean_tree_block in btrfs_init_new_buffer
  btrfs: drop extent map range more efficiently
  btrfs: avoid pointless extent map tree search when flushing delalloc
  btrfs: remove unnecessary next extent map search
  btrfs: remove unnecessary NULL pointer checks when searching extent maps
  btrfs: assert tree is locked when clearing extent map from logging
  btrfs: remove unnecessary extent map initializations
  btrfs: remove the refcount warning/check at free_extent_map()
  btrfs: add helper to replace extent map range with a new extent map
  btrfs: move open coded extent map tree deletion out of inode eviction
  btrfs: use cond_resched_rwlock_write() during inode eviction
  btrfs: use extent_map_end() at btrfs_drop_extent_map_range()
  btrfs: move btrfs_drop_extent_cache() to extent_map.c
  btrfs: fix missed extent on fsync after dropping extent maps
  btrfs: remove stale prototype of btrfs_write_inode
  btrfs: enable nowait async buffered writes
  btrfs: assert nowait mode is not used for some btree search functions
  btrfs: make btrfs_buffered_write nowait compatible
  btrfs: plumb NOWAIT through the write path
  btrfs: make lock_and_cleanup_extent_if_need nowait compatible
  ...
2022-10-06 17:36:48 -07:00
Linus Torvalds
833477fce7 sound updates for 6.1-rc1
Majority of changes at this PR are ASoC drivers (SOF, Intel, AMD,
 Mediatek, Qualcomm, TI, Apple Silicon, etc), while we see a few
 small fixes in ALSA / ASoC core side, too.
 
 Here are highlights:
 
 Core:
 - A new string helper parse_int_array_user() and cleanups with it
 - Continued cleanup of memory allocation helpers
 - PCM core optimization and hardening
 - Continued ASoC core code cleanups
 
 ASoC:
 - Improvements to the SOF IPC4 code, especially around trace
 - Support for AMD Rembrant DSPs, AMD Pink Sardine ACP 6.2, Apple
   Silicon systems, Everest ES8326, Intel Sky Lake and Kaby Lake,
   Mediatek MT8186 support, NXP i.MX8ULP DSPs, Qualcomm SC8280XP,
   SM8250 and SM8450 and Texas Instruments SRC4392
 
 HD- and USB-audio:
 - Cleanups for unification of hda-ext bus
 - HD-audio HDMI codec driver cleanups
 - Continued endpoint management fixes for USB-audio
 - New quirks as usual
 -----BEGIN PGP SIGNATURE-----
 
 iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAmM9dF0OHHRpd2FpQHN1
 c2UuZGUACgkQLtJE4w1nLE+ImA//bkD6zgXRwq05zl0UuoNqv1CsI3OeQ6YgIorc
 Ca4ebCclS+uQiZo5Yw+qOSvxrEh25y0EG7bB5mKGW8bFFeThaXL1ZF+iVEabWi6G
 bMXMtYPQb2fyHlS0Jv9axtCptd8YZVCVgft1CNflvC1cp7qt1FxkCzfEKFuBpNUI
 OlU1ErWfY/u+iuxnXF+vUFjZQaN2BNztPLKjOMMv1eAE5MDfPMMP6GH7hvnEeNcZ
 zaAfxsJnqHrJrx7o1k1rSEpAeQjHuFJbT9eDV1F7cI2ZH78x8/DrZoxre/BOptX5
 +LYopxoVvldukwQQserXZS3g7R0Exbzp43vjmJA1lx/tEQCz4lrDZXXPW2kO7eWR
 +v/sVHLrBFDom4Py6NNjytH/aPoC5YvZsMzu9Go8jaiJhKHKfIyyEy8CGfYOSuQv
 E/zIHJNXy7rMVNl+o4BCljlDoYIZl9YhJ/BjcEL67nqJqZmTVzgeQ9BXuEWoL0IS
 JyuRguBUnvYoFZ9tfYsFeWosSJSqW3ewDMYHV+cRAp3+sMmM4LixNgj1K/s72j3E
 yyzEwwfUgnsy3g6L++OOwTay8fztMub7pFH8d0CGJdNVcdfuJB0yIQxaAyEYFjTP
 XWDaz20g9ctAolj2WzauHPqsQX9aY2MH19oNX331xVNCcOK6tV10AYDSt3Vpqcey
 oH7YASw=
 =EWRA
 -----END PGP SIGNATURE-----

Merge tag 'sound-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound updates from Takashi Iwai:
 "The majority of changes are ASoC drivers (SOF, Intel, AMD, Mediatek,
  Qualcomm, TI, Apple Silicon, etc), while we see a few small fixes in
  ALSA / ASoC core side, too.

  Here are highlights:

  Core:
   - A new string helper parse_int_array_user() and cleanups with it
   - Continued cleanup of memory allocation helpers
   - PCM core optimization and hardening
   - Continued ASoC core code cleanups

  ASoC:
   - Improvements to the SOF IPC4 code, especially around trace
   - Support for AMD Rembrant DSPs, AMD Pink Sardine ACP 6.2, Apple
     Silicon systems, Everest ES8326, Intel Sky Lake and Kaby Lake,
     Mediatek MT8186 support, NXP i.MX8ULP DSPs, Qualcomm SC8280XP,
     SM8250 and SM8450 and Texas Instruments SRC4392

  HD- and USB-audio:
   - Cleanups for unification of hda-ext bus
   - HD-audio HDMI codec driver cleanups
   - Continued endpoint management fixes for USB-audio
   - New quirks as usual"

* tag 'sound-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (422 commits)
  ALSA: hda: Fix position reporting on Poulsbo
  ALSA: hda/hdmi: Don't skip notification handling during PM operation
  ASoC: rockchip: i2s: use regmap_read_poll_timeout_atomic to poll I2S_CLR
  ASoC: dt-bindings: Document audio OF graph dai-tdm-slot-num dai-tdm-slot-width props
  ASoC: qcom: fix unmet direct dependencies for SND_SOC_QDSP6
  ALSA: usb-audio: Fix potential memory leaks
  ALSA: usb-audio: Fix NULL dererence at error path
  ASoC: mediatek: mt8192-mt6359: Set the driver name for the card
  ALSA: hda/realtek: More robust component matching for CS35L41
  ASoC: Intel: sof_rt5682: remove SOF_RT1015_SPEAKER_AMP_100FS flag
  ASoC: nau8825: Add TDM support
  ASoC: core: clarify the driver name initialization
  ASoC: mt6660: Fix PM disable depth imbalance in mt6660_i2c_probe
  ASoC: wm5102: Fix PM disable depth imbalance in wm5102_probe
  ASoC: wm5110: Fix PM disable depth imbalance in wm5110_probe
  ASoC: wm8997: Fix PM disable depth imbalance in wm8997_probe
  ASoC: wcd-mbhc-v2: Revert "ASoC: wcd-mbhc-v2: use pm_runtime_resume_and_get()"
  ASoC: mediatek: mt8186: Fix spelling mistake "slect" -> "select"
  ALSA: hda/realtek: Add quirk for HP Zbook Firefly 14 G9 model
  ALSA: asihpi - Remove unused struct hpi_subsys_response
  ...
2022-10-05 12:02:07 -07:00
Zhang Qilong
a834aa3ec9 f2fs: add "c_len" into trace_f2fs_update_extent_tree_range for compressed file
The trace_f2fs_update_extent_tree_range could not record compressed
block length in the cluster of compress file and we just add it.

Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-10-04 13:31:43 -07:00
Linus Torvalds
f4309528f3 dlm for 6.1
This set of commits includes:
 . Fix a couple races found with a new torture test.
 . Improve errors when api functions are used incorrectly.
 . Improve tracing for lock requests from user space.
 . Fix use after free in recently added tracing code.
 . Small internal code cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJjOyfeAAoJEDgbc8f8gGmqHF4QALKGo+95JGzfXN37dNL2ve8L
 DAKxESYIwaTEWuKxmD4AGogClEl55UoC8kxMB3dHwLZEd4U0v5ZDULR6NUYXMpos
 6miaoF+pJfBnpNRqpCieWRW5dYXD4TwSdquv5rUSmUBrdOSy34s/nORWB4kL443K
 hFPcbo5Mv1L0W70/+gdj1uBlBsenZxnXu6aEmrckONqwj9Q2SBjJTik9WuNwh+FF
 tEcmUt8kDanGkbwtMCxnbT3HDOdfQyW+qq4IJ6MOYHlW9Cqbp9QUvAIho4DEpr7f
 eGurQ/urSD3dltzuYQcZ81zGhaGxzaRt5d2AEHRrGugQ2ZvnsG74oSAmEINZTSw4
 RV2EXyJ4hXcXK/yJXo3fGzFm2/5JFvYhnvddo6wts3vQZHwefExIRCHVz2cJL9eS
 gFpfFu4uB8z7w7l9s9LJKv7cTriaDd1WHuIWZGonz3wlFSUOn7IxunDxM3Hc5YO3
 okawhr6sWe03fFcKsw1WeWymfDUwmk/7OV15OSDanItAwX5vkBYDBvAcA/cwm8cj
 P0Vb3c1/Sf1IjjHGGA13vHpD1JXJ7FHafg6jyWmjJNqaS+wtShvs2As9MqbtSWMb
 o2OcYTEEzME4mMIXZzVlKP7hhkLMaVR5PwGmbPovlyAkEUX0soH7nefyLMAqP3JG
 7VZYV46VCL7wm3yjrKYw
 =sL1G
 -----END PGP SIGNATURE-----

Merge tag 'dlm-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm

Pull dlm updates from David Teigland:

 - Fix a couple races found with a new torture test

 - Improve errors when api functions are used incorrectly

 - Improve tracing for lock requests from user space

 - Fix use after free in recently added tracing cod.

 - Small internal code cleanups

* tag 'dlm-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
  fs: dlm: fix possible use after free if tracing
  fs: dlm: const void resource name parameter
  fs: dlm: LSFL_CB_DELAY only for kernel lockspaces
  fs: dlm: remove DLM_LSFL_FS from uapi
  fs: dlm: trace user space callbacks
  fs: dlm: change ls_clear_proc_locks to spinlock
  fs: dlm: remove dlm_del_ast prototype
  fs: dlm: handle rcom in else if branch
  fs: dlm: allow lockspaces have zero lvblen
  fs: dlm: fix invalid derefence of sb_lvbptr
  fs: dlm: handle -EINVAL as log_error()
  fs: dlm: use __func__ for function name
  fs: dlm: handle -EBUSY first in unlock validation
  fs: dlm: handle -EBUSY first in lock arg validation
  fs: dlm: fix race between test_bit() and queue_work()
  fs: dlm: fix race in lowcomms
2022-10-03 20:11:59 -07:00
Linus Torvalds
3497640a80 Changes since last update:
- Introduce fscache-based domain to share blobs between images;
 
  - Support recording fragments in a special packed inode;
 
  - Support partial-referenced pclusters for global compressed data
    deduplication;
 
  - Fix an order >= MAX_ORDER warning due to crafted negative i_size;
 
  - Several cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iIcEABYIAC8WIQThPAmQN9sSA0DVxtI5NzHcH7XmBAUCYzq3FxEceGlhbmdAa2Vy
 bmVsLm9yZwAKCRA5NzHcH7XmBJbRAQDab/0DJu7iDktzupazfCibkg8vWzakXIi+
 KE0y5O8VaQEAwn9bdPU4cp+raowoMt3z8eGsj4H9ZO9NM8NfPUX0uQQ=
 =TNVH
 -----END PGP SIGNATURE-----

Merge tag 'erofs-for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull erofs updates from Gao Xiang:
 "In this cycle, for container use cases, fscache-based shared domain is
  introduced [1] so that data blobs in the same domain will be storage
  deduplicated and it will also be used for page cache sharing later.

  Also, a special packed inode is now introduced to record inode
  fragments which keep the tail part of files by Yue Hu [2]. You can
  keep arbitary length or (at will) the whole file as a fragment and
  then fragments can be optionally compressed in the packed inode
  together and even deduplicated for smaller image sizes.

  In addition to that, global compressed data deduplication by sharing
  partial-referenced pclusters is also supported in this cycle.

  Summary:

   - Introduce fscache-based domain to share blobs between images

   - Support recording fragments in a special packed inode

   - Support partial-referenced pclusters for global compressed data
     deduplication

   - Fix an order >= MAX_ORDER warning due to crafted negative i_size

   - Several cleanups"

Link: https://lore.kernel.org/r/20220916085940.89392-1-zhujia.zj@bytedance.com [1]
Link: https://lore.kernel.org/r/cover.1663065968.git.huyue2@coolpad.com [2]

* tag 'erofs-for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: clean up erofs_iget()
  erofs: clean up unnecessary code and comments
  erofs: fold in z_erofs_reload_indexes()
  erofs: introduce partial-referenced pclusters
  erofs: support on-disk compressed fragments data
  erofs: support interlaced uncompressed data for compressed files
  erofs: clean up .read_folio() and .readahead() in fscache mode
  erofs: introduce 'domain_id' mount option
  erofs: Support sharing cookies in the same domain
  erofs: introduce a pseudo mnt to manage shared cookies
  erofs: introduce fscache-based domain
  erofs: code clean up for fscache
  erofs: use kill_anon_super() to kill super in fscache mode
  erofs: fix order >= MAX_ORDER warning due to crafted negative i_size
2022-10-03 20:01:40 -07:00
Zach O'Keefe
d41fd2016e mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
Add huge_memory:trace_mm_khugepaged_scan_file tracepoint to
hpage_collapse_scan_file() analogously to hpage_collapse_scan_pmd().

While this change is targeted at debugging MADV_COLLAPSE pathway, the
"mm_khugepaged" prefix is retained for symmetry with
huge_memory:trace_mm_khugepaged_scan_pmd, which retains it's legacy name
to prevent changing kernel ABI as much as possible.

Link: https://lkml.kernel.org/r/20220907144521.3115321-5-zokeefe@google.com
Link: https://lkml.kernel.org/r/20220922224046.1143204-5-zokeefe@google.com
Signed-off-by: Zach O'Keefe <zokeefe@google.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Chris Kennelly <ckennelly@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Houghton <jthoughton@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03 14:03:33 -07:00
Zach O'Keefe
34488399fa mm/madvise: add file and shmem support to MADV_COLLAPSE
Add support for MADV_COLLAPSE to collapse shmem-backed and file-backed
memory into THPs (requires CONFIG_READ_ONLY_THP_FOR_FS=y).

On success, the backing memory will be a hugepage.  For the memory range
and process provided, the page tables will synchronously have a huge pmd
installed, mapping the THP.  Other mappings of the file extent mapped by
the memory range may be added to a set of entries that khugepaged will
later process and attempt update their page tables to map the THP by a
pmd.

This functionality unlocks two important uses:

(1)	Immediately back executable text by THPs.  Current support provided
	by CONFIG_READ_ONLY_THP_FOR_FS may take a long time on a large
	system which might impair services from serving at their full rated
	load after (re)starting.  Tricks like mremap(2)'ing text onto
	anonymous memory to immediately realize iTLB performance prevents
	page sharing and demand paging, both of which increase steady state
	memory footprint.  Now, we can have the best of both worlds: Peak
	upfront performance and lower RAM footprints.

(2)	userfaultfd-based live migration of virtual machines satisfy UFFD
	faults by fetching native-sized pages over the network (to avoid
	latency of transferring an entire hugepage).  However, after guest
	memory has been fully copied to the new host, MADV_COLLAPSE can
	be used to immediately increase guest performance.

Since khugepaged is single threaded, this change now introduces
possibility of collapse contexts racing in file collapse path.  There a
important few places to consider:

(1)	hpage_collapse_scan_file(), when we xas_pause() and drop RCU.
	We could have the memory collapsed out from under us, but
	the next xas_for_each() iteration will correctly pick up the
	hugepage.  The hugepage might not be up to date (insofar as
	copying of small page contents might not have completed - the
	page still may be locked), but regardless what small page index
	we were iterating over, we'll find the hugepage and identify it
	as a suitably aligned compound page of order HPAGE_PMD_ORDER.

	In khugepaged path, we locklessly check the value of the pmd,
	and only add it to deferred collapse array if we find pmd
	mapping pte table. This is fine, since other values that could
	have raced in right afterwards denote failure, or that the
	memory was successfully collapsed, so we don't need further
	processing.

	In madvise path, we'll take mmap_lock() in write to serialize
	against page table updates and will know what to do based on the
	true value of the pmd: recheck all ptes if we point to a pte table,
	directly install the pmd, if the pmd has been cleared, but
	memory not yet faulted, or nothing at all if we find a huge pmd.

	It's worth putting emphasis here on how we treat the none pmd
	here.  If khugepaged has processed this mm's page tables
	already, it will have left the pmd cleared (ready for refault by
	the process).  Depending on the VMA flags and sysfs settings,
	amount of RAM on the machine, and the current load, could be a
	relatively common occurrence - and as such is one we'd like to
	handle successfully in MADV_COLLAPSE.  When we see the none pmd
	in collapse_pte_mapped_thp(), we've locked mmap_lock in write
	and checked (a) huepaged_vma_check() to see if the backing
	memory is appropriate still, along with VMA sizing and
	appropriate hugepage alignment within the file, and (b) we've
	found a hugepage head of order HPAGE_PMD_ORDER at the offset
	in the file mapped by our hugepage-aligned virtual address.
	Even though the common-case is likely race with khugepaged,
	given these checks (regardless how we got here - we could be
	operating on a completely different file than originally checked
	in hpage_collapse_scan_file() for all we know) it should be safe
	to directly make the pmd a huge pmd pointing to this hugepage.

(2)	collapse_file() is mostly serialized on the same file extent by
	lock sequence:

		|	lock hupepage
		|		lock mapping->i_pages
		|			lock 1st page
		|		unlock mapping->i_pages
		|				<page checks>
		|		lock mapping->i_pages
		|				page_ref_freeze(3)
		|				xas_store(hugepage)
		|		unlock mapping->i_pages
		|				page_ref_unfreeze(1)
		|			unlock 1st page
		V	unlock hugepage

	Once a context (who already has their fresh hugepage locked)
	locks mapping->i_pages exclusively, it will hold said lock
	until it locks the first page, and it will hold that lock until
	the after the hugepage has been added to the page cache (and
	will unlock the hugepage after page table update, though that
	isn't important here).

	A racing context that loses the race for mapping->i_pages will
	then lose the race to locking the first page.  Here - depending
	on how far the other racing context has gotten - we might find
	the new hugepage (in which case we'll exit cleanly when we
	check PageTransCompound()), or we'll find the "old" 1st small
	page (in which we'll exit cleanly when we discover unexpected
	refcount of 2 after isolate_lru_page()).  This is assuming we
	are able to successfully lock the page we find - in shmem path,
	we could just fail the trylock and exit cleanly anyways.

	Failure path in collapse_file() is similar: once we hold lock
	on 1st small page, we are serialized against other collapse
	contexts.  Before the 1st small page is unlocked, we add it
	back to the pagecache and unfreeze the refcount appropriately.
	Contexts who lost the race to the 1st small page will then find
	the same 1st small page with the correct refcount and will be
	able to proceed.

[zokeefe@google.com: don't check pmd value twice in collapse_pte_mapped_thp()]
  Link: https://lkml.kernel.org/r/20220927033854.477018-1-zokeefe@google.com
[shy828301@gmail.com: Delete hugepage_vma_revalidate_anon(), remove
	check for multi-add in khugepaged_add_pte_mapped_thp()]
  Link: https://lore.kernel.org/linux-mm/CAHbLzkrtpM=ic7cYAHcqkubah5VTR8N5=k5RT8MTvv5rN1Y91w@mail.gmail.com/
Link: https://lkml.kernel.org/r/20220907144521.3115321-4-zokeefe@google.com
Link: https://lkml.kernel.org/r/20220922224046.1143204-4-zokeefe@google.com
Signed-off-by: Zach O'Keefe <zokeefe@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Chris Kennelly <ckennelly@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Houghton <jthoughton@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03 14:03:33 -07:00
Zach O'Keefe
58ac9a8993 mm/khugepaged: attempt to map file/shmem-backed pte-mapped THPs by pmds
The main benefit of THPs are that they can be mapped at the pmd level,
increasing the likelihood of TLB hit and spending less cycles in page
table walks.  pte-mapped hugepages - that is - hugepage-aligned compound
pages of order HPAGE_PMD_ORDER mapped by ptes - although being contiguous
in physical memory, don't have this advantage.  In fact, one could argue
they are detrimental to system performance overall since they occupy a
precious hugepage-aligned/sized region of physical memory that could
otherwise be used more effectively.  Additionally, pte-mapped hugepages
can be the cheapest memory to collapse for khugepaged since no new
hugepage allocation or copying of memory contents is necessary - we only
need to update the mapping page tables.

In the anonymous collapse path, we are able to collapse pte-mapped
hugepages (albeit, perhaps suboptimally), but the file/shmem path makes no
effort when compound pages (of any order) are encountered.

Identify pte-mapped hugepages in the file/shmem collapse path.  The
final step of which makes a racy check of the value of the pmd to
ensure it maps a pte table.  This should be fine, since races that
result in false-positive (i.e.  attempt collapse even though we
shouldn't) will fail later in collapse_pte_mapped_thp() once we
actually lock mmap_lock and reinspect the pmd value.  Races that result
in false-negatives (i.e.  where we decide to not attempt collapse, but
should have) shouldn't be an issue, since in the worst case, we do
nothing - which is what we've done up to this point.  We make a similar
check in retract_page_tables().  If we do think we've found a
pte-mapped hugepgae in khugepaged context, attempt to update page
tables mapping this hugepage.

Note that these collapses still count towards the
/sys/kernel/mm/transparent_hugepage/khugepaged/pages_collapsed counter,
and if the pte-mapped hugepage was also mapped into multiple process'
address spaces, could be incremented for each page table update.  Since we
increment the counter when a pte-mapped hugepage is successfully added to
the list of to-collapse pte-mapped THPs, it's possible that we never
actually update the page table either.  This is different from how
file/shmem pages_collapsed accounting works today where only a successful
page cache update is counted (it's also possible here that no page tables
are actually changed).  Though it incurs some slop, this is preferred to
either not accounting for the event at all, or plumbing through data in
struct mm_slot on whether to account for the collapse or not.

Also note that work still needs to be done to support arbitrary compound
pages, and that this should all be converted to using folios.

[shy828301@gmail.com: Spelling mistake, update comment, and add Documentation]
  Link: https://lore.kernel.org/linux-mm/CAHbLzkpHwZxFzjfX9nxVoRhzup8WMjMfyL6Xiq8mZ9M-N3ombw@mail.gmail.com/
Link: https://lkml.kernel.org/r/20220907144521.3115321-3-zokeefe@google.com
Link: https://lkml.kernel.org/r/20220922224046.1143204-3-zokeefe@google.com
Signed-off-by: Zach O'Keefe <zokeefe@google.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Chris Kennelly <ckennelly@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Houghton <jthoughton@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03 14:03:33 -07:00
Gao Xiang
312fe643ad erofs: clean up erofs_iget()
isdir indicated REQ_META|REQ_PRIO which no longer works now.
Get rid of isdir entirely.

Link: https://lore.kernel.org/r/20220927063607.54832-2-hsiangkao@linux.alibaba.com
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2022-09-27 17:27:45 +08:00
Liam R. Howlett
d4af56c5c7 mm: start tracking VMAs with maple tree
Start tracking the VMAs with the new maple tree structure in parallel with
the rb_tree.  Add debug and trace events for maple tree operations and
duplicate the rb_tree that is created on forks into the maple tree.

The maple tree is added to the mm_struct including the mm_init struct,
added support in required mm/mmap functions, added tracking in kernel/fork
for process forking, and used to find the unmapped_area and checked
against what the rbtree finds.

This also moves the mmap_lock() in exit_mmap() since the oom reaper call
does walk the VMAs.  Otherwise lockdep will be unhappy if oom happens.

When splitting a vma fails due to allocations of the maple tree nodes,
the error path in __split_vma() calls new->vm_ops->close(new).  The page
accounting for hugetlb is actually in the close() operation,  so it
accounts for the removal of 1/2 of the VMA which was not adjusted.  This
results in a negative exit value.  To avoid the negative charge, set
vm_start = vm_end and vm_pgoff = 0.

There is also a potential accounting issue in special mappings from
insert_vm_struct() failing to allocate, so reverse the charge there in
the failure scenario.

Link: https://lkml.kernel.org/r/20220906194824.2110408-9-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Yu Zhao <yuzhao@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26 19:46:14 -07:00
Liam R. Howlett
54a611b605 Maple Tree: add new data structure
Patch series "Introducing the Maple Tree"

The maple tree is an RCU-safe range based B-tree designed to use modern
processor cache efficiently.  There are a number of places in the kernel
that a non-overlapping range-based tree would be beneficial, especially
one with a simple interface.  If you use an rbtree with other data
structures to improve performance or an interval tree to track
non-overlapping ranges, then this is for you.

The tree has a branching factor of 10 for non-leaf nodes and 16 for leaf
nodes.  With the increased branching factor, it is significantly shorter
than the rbtree so it has fewer cache misses.  The removal of the linked
list between subsequent entries also reduces the cache misses and the need
to pull in the previous and next VMA during many tree alterations.

The first user that is covered in this patch set is the vm_area_struct,
where three data structures are replaced by the maple tree: the augmented
rbtree, the vma cache, and the linked list of VMAs in the mm_struct.  The
long term goal is to reduce or remove the mmap_lock contention.

The plan is to get to the point where we use the maple tree in RCU mode.
Readers will not block for writers.  A single write operation will be
allowed at a time.  A reader re-walks if stale data is encountered.  VMAs
would be RCU enabled and this mode would be entered once multiple tasks
are using the mm_struct.

Davidlor said

: Yes I like the maple tree, and at this stage I don't think we can ask for
: more from this series wrt the MM - albeit there seems to still be some
: folks reporting breakage.  Fundamentally I see Liam's work to (re)move
: complexity out of the MM (not to say that the actual maple tree is not
: complex) by consolidating the three complimentary data structures very
: much worth it considering performance does not take a hit.  This was very
: much a turn off with the range locking approach, which worst case scenario
: incurred in prohibitive overhead.  Also as Liam and Matthew have
: mentioned, RCU opens up a lot of nice performance opportunities, and in
: addition academia[1] has shown outstanding scalability of address spaces
: with the foundation of replacing the locked rbtree with RCU aware trees.

A similar work has been discovered in the academic press

	https://pdos.csail.mit.edu/papers/rcuvm:asplos12.pdf

Sheer coincidence.  We designed our tree with the intention of solving the
hardest problem first.  Upon settling on a b-tree variant and a rough
outline, we researched ranged based b-trees and RCU b-trees and did find
that article.  So it was nice to find reassurances that we were on the
right path, but our design choice of using ranges made that paper unusable
for us.

This patch (of 70):

The maple tree is an RCU-safe range based B-tree designed to use modern
processor cache efficiently.  There are a number of places in the kernel
that a non-overlapping range-based tree would be beneficial, especially
one with a simple interface.  If you use an rbtree with other data
structures to improve performance or an interval tree to track
non-overlapping ranges, then this is for you.

The tree has a branching factor of 10 for non-leaf nodes and 16 for leaf
nodes.  With the increased branching factor, it is significantly shorter
than the rbtree so it has fewer cache misses.  The removal of the linked
list between subsequent entries also reduces the cache misses and the need
to pull in the previous and next VMA during many tree alterations.

The first user that is covered in this patch set is the vm_area_struct,
where three data structures are replaced by the maple tree: the augmented
rbtree, the vma cache, and the linked list of VMAs in the mm_struct.  The
long term goal is to reduce or remove the mmap_lock contention.

The plan is to get to the point where we use the maple tree in RCU mode.
Readers will not block for writers.  A single write operation will be
allowed at a time.  A reader re-walks if stale data is encountered.  VMAs
would be RCU enabled and this mode would be entered once multiple tasks
are using the mm_struct.

There is additional BUG_ON() calls added within the tree, most of which
are in debug code.  These will be replaced with a WARN_ON() call in the
future.  There is also additional BUG_ON() calls within the code which
will also be reduced in number at a later date.  These exist to catch
things such as out-of-range accesses which would crash anyways.

Link: https://lkml.kernel.org/r/20220906194824.2110408-1-Liam.Howlett@oracle.com
Link: https://lkml.kernel.org/r/20220906194824.2110408-2-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: David Howells <dhowells@redhat.com>
Tested-by: Sven Schnelle <svens@linux.ibm.com>
Tested-by: Yu Zhao <yuzhao@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: David Hildenbrand <david@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-26 19:46:13 -07:00
Christoph Hellwig
bd86a532b2 btrfs: stop tracking failed reads in the I/O tree
There is a separate I/O failure tree to track the fail reads, so remove
the extra EXTENT_DAMAGED bit in the I/O tree as it's set but never used.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-09-26 12:28:05 +02:00
Josef Bacik
87c11705cc btrfs: convert the io_failure_tree to a plain rb_tree
We still have this oddity of stashing the io_failure_record in the
extent state for the io_failure_tree, which is leftover from when we
used to stuff private pointers in extent_io_trees.

However this doesn't make a lot of sense for the io failure records, we
can simply use a normal rb_tree for this.  This will allow us to further
simplify the extent_io_tree code by removing the io_failure_rec pointer
from the extent state.

Convert the io_failure_tree to an rb tree + spinlock in the inode, and
then use our rb tree simple helpers to insert and find failed records.
This greatly cleans up this code and makes it easier to separate out the
extent_io_tree code.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-09-26 12:28:02 +02:00
Linus Torvalds
c69cf88cda ARM: SoC fixes for 6.0-rc6
Another set of fixes for fixes for the soc tree:
 
  - A fix for the interrupt number on at91/lan966 ethernet PHYs
 
  - A second round of fixes for NXP i.MX series, including a couple
    of build issues, and board specific DT corrections on
    TQMa8MPQL, imx8mp-venice-gw74xx and imx8mm-verdin for reliability
    and partially broken functionality.
 
  - Several fixes for Rockchip SoCs, addressing a USB issue on BPI-R2-Pro,
    wakeup on Gru-Bob and reliability of high-speed SD cards, among
    other minor issues.
 
  - A fix for a long-running naming mistake that prevented the moxart mmc
    driver from working at all.
 
  - Multiple Arm SCMI firmware fixes for hardening some corner cases.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmMsf8kACgkQmmx57+YA
 GNklew//T+pAuVwhR8OMp3DolbM/CwezgZgEXkuqDS0GvLkuoR71N7y1wEO77CDI
 9/luYQiFnMI8ooBMXLG545EJCZNommtDKWfSMjJnYeVQit3nupJSYaOLkzD949hg
 fg2BhA3mIKJY53m5SHRfZJOr+Q5E1DEmREX7m9e3nXTDY7izWpE2HtlKt26lKTq4
 w4sbchmrC4YRLqkBbSGLczClCakF0/L3QhGUIfBlTdLmhye0PJiQzfhVTKgdb7Jr
 l0T8vt5vg+5f5ib3PrnPQCaA3Azgu0QvImwKr7/vU/Sn6/e/xwV/hcuqQBZPFbbl
 RmSkHb3mBLXogk/EjLiw8y59D22SIbdtE+/tD+FRP+q0gjgPKobRZiqLFijvIWSB
 TtaTsKhotFKFs+pDysF0C/IfpK9MaYcX71WdqfvwlPiGGK7xCt3W+AKzgUmRVfew
 dVMeyBlVL9T3003MpLkiaIoDp8JfJsD3051CCH5tdOtF53PeKsgTUEXtnQezBof2
 80KgGXg2QGbwx+vYPGJqgQKzG7teq06G4BERK/yeFCrOsxrRXzH/icDA3F5xKY5f
 IqQiTqvZeCQvvr8G1iZb6YkhflQHaNktsRCajxERTgPfRzuQFHwF96C/+weGcZBp
 edBtweGCJ7AvV8vmvmvCdMDg9BDfgHOOwiNOKqmVvsIO01Ei8Oc=
 =fI2K
 -----END PGP SIGNATURE-----

Merge tag 'soc-fixes-6.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
 "Another set of fixes for fixes for the soc tree:

   - A fix for the interrupt number on at91/lan966 ethernet PHYs

   - A second round of fixes for NXP i.MX series, including a couple of
     build issues, and board specific DT corrections on TQMa8MPQL,
     imx8mp-venice-gw74xx and imx8mm-verdin for reliability and
     partially broken functionality

   - Several fixes for Rockchip SoCs, addressing a USB issue on
     BPI-R2-Pro, wakeup on Gru-Bob and reliability of high-speed SD
     cards, among other minor issues

   - A fix for a long-running naming mistake that prevented the moxart
     mmc driver from working at all

   - Multiple Arm SCMI firmware fixes for hardening some corner cases"

* tag 'soc-fixes-6.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (30 commits)
  arm64: dts: imx8mp-venice-gw74xx: fix port/phy validation
  ARM: dts: lan966x: Fix the interrupt number for internal PHYs
  arm64: dts: imx8mp-venice-gw74xx: fix ksz9477 cpu port
  arm64: dts: imx8mp-venice-gw74xx: fix CAN STBY polarity
  dt-bindings: memory-controllers: fsl,imx8m-ddrc: drop Leonard Crestez
  arm64: dts: tqma8mqml: Include phy-imx8-pcie.h header
  arm64: defconfig: enable ARCH_NXP
  arm64: dts: imx8mp-tqma8mpql-mba8mpxl: add missing pinctrl for RTC alarm
  ARM: dts: fix Moxa SDIO 'compatible', remove 'sdhci' misnomer
  arm64: dts: imx8mm-verdin: extend pmic voltages
  arm64: dts: rockchip: Remove 'enable-active-low' from rk3566-quartz64-a
  arm64: dts: rockchip: Remove 'enable-active-low' from rk3399-puma
  arm64: dts: rockchip: fix property for usb2 phy supply on rk3568-evb1-v10
  arm64: dts: rockchip: fix property for usb2 phy supply on rock-3a
  arm64: dts: imx8ulp: add #reset-cells for pcc
  arm64: dts: tqma8mpxl-ba8mpxl: Fix button GPIOs
  arm64: dts: imx8mn: remove GPU power domain reset
  arm64: dts: rockchip: Set RK3399-Gru PCLK_EDP to 24 MHz
  arm64: dts: imx8mm: Reverse CPLD_Dn GPIO label mapping on MX8Menlo
  arm64: dts: rockchip: fix upper usb port on BPI-R2-Pro
  ...
2022-09-22 11:10:11 -07:00
Dylan Yudaken
f75d5036d0 io_uring: trace local task work run
Add tracing for io_run_local_task_work

Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220830125013.570060-8-dylany@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-21 10:30:42 -06:00
Noah Klayman
794cd3bd69
ASoC: SOF: replace ipc4-loader dev_vdbg with tracepoints
This patch replaces dev_vdbg with tracepoints in new ipc4-loader code.

Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Signed-off-by: Noah Klayman <noah.klayman@intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20220919122108.43764-8-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-09-19 15:44:08 +01:00
Noah Klayman
bcd2cc350d
ASoC: SOF: replace dev_vdbg with tracepoints
This patch removes unneeded dev_vdbg calls and replaces remaining ones
with tracepoints to reduce overhead and enable use of trace collection
and analysis tools.

Signed-off-by: Noah Klayman <noah.klayman@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Link: https://lore.kernel.org/r/20220919122108.43764-7-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-09-19 15:44:07 +01:00
Bard Liao
d272b65704
ASoC: SOF: Intel: replace dev_vdbg with tracepoints
This patch replaces all dev_vdbg calls with tracepoints to reduce
overhead and enable use of trace collection and analysis tools.

Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Signed-off-by: Noah Klayman <noah.klayman@intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20220919122108.43764-6-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-09-19 15:44:06 +01:00
Noah Klayman
baedc6300b
ASoC: SOF: Intel: add HDA interrupt source tracing
The Intel HDaudio controller relies on a single interrupt line which
wire-ORs multiple interrupt sources, such as stream, IPC, SoundWire and
wakes. This patch adds the ability to trace each event occurrence.

Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Signed-off-by: Noah Klayman <noah.klayman@intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20220919122108.43764-3-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-09-19 15:44:04 +01:00
Bard Liao
fa6e73d691
ASoC: SOF: add widget setup/free tracing
Enables tracking of use_count during widget setup and free routines.
Useful for debugging unbalanced use_counts during suspend/resume.

Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com>
Signed-off-by: Noah Klayman <noah.klayman@intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20220919122108.43764-2-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2022-09-19 15:44:03 +01:00
Ohad Sharabi
0263256791 habanalabs: trace DMA allocations
This patch add tracepoints in the code for DMA allocation.
The main purpose is to be able to cross data with the map operations and
determine whether memory violation occurred, for example free DMA
allocation before unmapping it from device memory.

To achieve this the DMA alloc/free code flows were refactored so that a
single DMA tracepoint will catch many flows.

To get better understanding of what happened in the DMA allocations
the real allocating function is added to the trace as well.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-18 13:29:53 +03:00
Ohad Sharabi
191a4443c3 habanalabs: define trace events
This patch adds trace events for habanalabs driver to gain all the
benefits such an infrastructure can supply.

The following events were added:
- MMU map/unmap: to be able to track driver's memory allocations
- DMA alloc/free: to track our DMA allocation

the above trace points in conjunction will help us map the device memory
usage as well as to be able to track memory violations.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Acked-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
2022-09-18 13:29:53 +03:00
Zach O'Keefe
5072280442 mm/khugepaged: record SCAN_PMD_MAPPED when scan_pmd() finds hugepage
When scanning an anon pmd to see if it's eligible for collapse, return
SCAN_PMD_MAPPED if the pmd already maps a hugepage.  Note that
SCAN_PMD_MAPPED is different from SCAN_PAGE_COMPOUND used in the
file-collapse path, since the latter might identify pte-mapped compound
pages.  This is required by MADV_COLLAPSE which necessarily needs to know
what hugepage-aligned/sized regions are already pmd-mapped.

In order to determine if a pmd already maps a hugepage, refactor
mm_find_pmd():

Return mm_find_pmd() to it's pre-commit f72e7dcdd2 ("mm: let mm_find_pmd
fix buggy race with THP fault") behavior.  ksm was the only caller that
explicitly wanted a pte-mapping pmd, so open code the pte-mapping logic
there (pmd_present() and pmd_trans_huge() checks).

Undo revert change in commit f72e7dcdd2 ("mm: let mm_find_pmd fix buggy
race with THP fault") that open-coded split_huge_pmd_address() pmd lookup
and use mm_find_pmd() instead.

Link: https://lkml.kernel.org/r/20220706235936.2197195-9-zokeefe@google.com
Signed-off-by: Zach O'Keefe <zokeefe@google.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Chris Kennelly <ckennelly@google.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rongwei Wang <rongwei.wang@linux.alibaba.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: "Souptick Joarder (HPE)" <jrdr.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-09-11 20:25:46 -07:00
Menglong Dong
9cb252c4c1 net: skb: export skb drop reaons to user by TRACE_DEFINE_ENUM
As Eric reported, the 'reason' field is not presented when trace the
kfree_skb event by perf:

$ perf record -e skb:kfree_skb -a sleep 10
$ perf script
  ip_defrag 14605 [021]   221.614303:   skb:kfree_skb:
  skbaddr=0xffff9d2851242700 protocol=34525 location=0xffffffffa39346b1
  reason:

The cause seems to be passing kernel address directly to TP_printk(),
which is not right. As the enum 'skb_drop_reason' is not exported to
user space through TRACE_DEFINE_ENUM(), perf can't get the drop reason
string from the 'reason' field, which is a number.

Therefore, we introduce the macro DEFINE_DROP_REASON(), which is used
to define the trace enum by TRACE_DEFINE_ENUM(). With the help of
DEFINE_DROP_REASON(), now we can remove the auto-generate that we
introduced in the commit ec43908dd5
("net: skb: use auto-generation to convert skb drop reason to string"),
and define the string array 'drop_reasons'.

Hmmmm...now we come back to the situation that have to maintain drop
reasons in both enum skb_drop_reason and DEFINE_DROP_REASON. But they
are both in dropreason.h, which makes it easier.

After this commit, now the format of kfree_skb is like this:

$ cat /tracing/events/skb/kfree_skb/format
name: kfree_skb
ID: 1524
format:
        field:unsigned short common_type;       offset:0;       size:2; signed:0;
        field:unsigned char common_flags;       offset:2;       size:1; signed:0;
        field:unsigned char common_preempt_count;       offset:3;       size:1; signed:0;
        field:int common_pid;   offset:4;       size:4; signed:1;

        field:void * skbaddr;   offset:8;       size:8; signed:0;
        field:void * location;  offset:16;      size:8; signed:0;
        field:unsigned short protocol;  offset:24;      size:2; signed:0;
        field:enum skb_drop_reason reason;      offset:28;      size:4; signed:0;

print fmt: "skbaddr=%p protocol=%u location=%p reason: %s", REC->skbaddr, REC->protocol, REC->location, __print_symbolic(REC->reason, { 1, "NOT_SPECIFIED" }, { 2, "NO_SOCKET" } ......

Fixes: ec43908dd5 ("net: skb: use auto-generation to convert skb drop reason to string")
Link: https://lore.kernel.org/netdev/CANn89i+bx0ybvE55iMYf5GJM48WwV1HNpdm9Q6t-HaEstqpCSA@mail.gmail.com/
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-07 15:28:08 +01:00
Hyeonggon Yoo
2c1d697fb8 mm/slab_common: drop kmem_alloc & avoid dereferencing fields when not using
Drop kmem_alloc event class, and define kmalloc and kmem_cache_alloc
using TRACE_EVENT() macro.

And then this patch does:
   - Do not pass pointer to struct kmem_cache to trace_kmalloc.
     gfp flag is enough to know if it's accounted or not.
   - Avoid dereferencing s->object_size and s->size when not using kmem_cache_alloc event.
   - Avoid dereferencing s->name in when not using kmem_cache_free event.
   - Adjust s->size to SLOB_UNITS(s->size) * SLOB_UNIT in SLOB

Cc: Vasily Averin <vasily.averin@linux.dev>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2022-09-01 11:44:26 +02:00
Hyeonggon Yoo
11e9734bcb mm/slab_common: unify NUMA and UMA version of tracepoints
Drop kmem_alloc event class, rename kmem_alloc_node to kmem_alloc, and
remove _node postfix for NUMA version of tracepoints.

This will break some tools that depend on {kmem_cache_alloc,kmalloc}_node,
but at this point maintaining both kmem_alloc and kmem_alloc_node
event classes does not makes sense at all.

Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2022-09-01 10:40:27 +02:00
Chao Yu
34a2352560 f2fs: iostat: support accounting compressed IO
Previously, we supported to account FS_CDATA_READ_IO type IO only,
in this patch, it adds to account more type IO for compressed file:
- APP_BUFFERED_CDATA_IO
- APP_MAPPED_CDATA_IO
- FS_CDATA_IO
- APP_BUFFERED_CDATA_READ_IO
- APP_MAPPED_CDATA_READ_IO

Signed-off-by: Chao Yu <chao.yu@oppo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-08-29 21:15:51 -07:00
Alexander Aring
56171e0db2 fs: dlm: const void resource name parameter
The resource name parameter should never be changed by DLM so we declare
it as const. At some point it is handled as a char pointer, a resource
name can be a non printable ascii string as well. This patch change it
to handle it as void pointer as it is offered by DLM API.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2022-08-23 15:02:47 -05:00
Alexander Aring
7a3de7324c fs: dlm: trace user space callbacks
This patch adds trace callbacks for user locks. Unfortenately user locks
are handled in a different way than kernel locks in some cases. User
locks never call the dlm_lock()/dlm_unlock() kernel API and use the next
step internal API of dlm. Adding those traces from user API callers
should make it possible for dlm trace system to see lock handling for
user locks as well.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2022-08-23 14:54:04 -05:00
Cristian Marussi
40d30cf680 firmware: arm_scmi: Harmonize SCMI tracing message format
After the recently added new scmi_msg_dump traces, the general format of
the various other SCMI traces are not consistent.

As an example the full traces of a simple PERF_LEVEL_SET:

 | cpufreq-set-276  scmi_xfer_begin: transfer_id=145 msg_id=7 protocol_id=19 seq=145 poll=0
 | cpufreq-set-276  scmi_msg_dump: pt=13 t=CMND msg_id=07 seq=0091 s=0 pyld=000000008066ab13
 | cpufreq-set-276  scmi_xfer_response_wait: transfer_id=145 msg_id=7 protocol_id=19 seq=145 tmo_ms=5000 poll=0
 |      <idle>-0    scmi_msg_dump: pt=13 t=RESP msg_id=07 seq=0091 s=0 pyld=
 |      <idle>-0    scmi_rx_done: transfer_id=145 msg_id=7 protocol_id=19 seq=145 msg_type=0
 | cpufreq-set-276  scmi_xfer_end: transfer_id=145 msg_id=7 protocol_id=19 seq=145 status=0

... where the same information is being reported using different names
(protocol_id= vs pt=) and even worst different bases, which is hard to
read and to parse.

So let us unify them, using the same naming and ordering of the fields
(wherever possible) and moving all the protocol related fields to base-16
while keeping in base-10 timeouts, res_id and values, so that the new
traces would be like:

 | cpufreq-set-274  scmi_xfer_begin: pt=13 msg_id=07 seq=0092 transfer_id=92 poll=0
 | cpufreq-set-274  scmi_msg_dump: pt=13 t=CMND msg_id=07 seq=0092 s=0 pyld=000000008066ab13
 | cpufreq-set-274  scmi_xfer_response_wait: pt=13 msg_id=07 seq=0092 transfer_id=92 tmo_ms=5000 poll=0
 |         cat-256  scmi_msg_dump: pt=13 t=RESP msg_id=07 seq=0092 s=0 pyld=
 |         cat-256  scmi_rx_done: pt=13 msg_id=07 seq=0092 transfer_id=92 msg_type=0
 | cpufreq-set-274  scmi_xfer_end: pt=13 msg_id=07 seq=0092 transfer_id=92 s=0

Link: https://lore.kernel.org/r/20220818132309.584042-2-cristian.marussi@arm.com
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2022-08-23 12:21:37 +01:00
Linus Torvalds
e18a90427c * Xen timer fixes
* Documentation formatting fixes
 
 * Make rseq selftest compatible with glibc-2.35
 
 * Fix handling of illegal LEA reg, reg
 
 * Cleanup creation of debugfs entries
 
 * Fix steal time cache handling bug
 
 * Fixes for MMIO caching
 
 * Optimize computation of number of LBRs
 
 * Fix uninitialized field in guest_maxphyaddr < host_maxphyaddr path
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmL0qwIUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroML1gf/SK6by+Gi0r7WSkrDjU94PKZ8D6Y3
 fErMhratccc9IfL3p90IjCVhEngfdQf5UVHExA5TswgHHAJTpECzuHya9TweQZc5
 2rrTvufup0MNALfzkSijrcI80CBvrJc6JyOCkv0BLp7yqXUrnrm0OOMV2XniS7y0
 YNn2ZCy44tLqkNiQrLhJQg3EsXu9l7okGpHSVO6iZwC7KKHvYkbscVFa/AOlaAwK
 WOZBB+1Ee+/pWhxsngM1GwwM3ZNU/jXOSVjew5plnrD4U7NYXIDATszbZAuNyxqV
 5gi+wvTF1x9dC6Tgd3qF7ouAqtT51BdRYaI9aYHOYgvzqdNFHWJu3XauDQ==
 =vI6Q
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more kvm updates from Paolo Bonzini:

 - Xen timer fixes

 - Documentation formatting fixes

 - Make rseq selftest compatible with glibc-2.35

 - Fix handling of illegal LEA reg, reg

 - Cleanup creation of debugfs entries

 - Fix steal time cache handling bug

 - Fixes for MMIO caching

 - Optimize computation of number of LBRs

 - Fix uninitialized field in guest_maxphyaddr < host_maxphyaddr path

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (26 commits)
  KVM: x86/MMU: properly format KVM_CAP_VM_DISABLE_NX_HUGE_PAGES capability table
  Documentation: KVM: extend KVM_CAP_VM_DISABLE_NX_HUGE_PAGES heading underline
  KVM: VMX: Adjust number of LBR records for PERF_CAPABILITIES at refresh
  KVM: VMX: Use proper type-safe functions for vCPU => LBRs helpers
  KVM: x86: Refresh PMU after writes to MSR_IA32_PERF_CAPABILITIES
  KVM: selftests: Test all possible "invalid" PERF_CAPABILITIES.LBR_FMT vals
  KVM: selftests: Use getcpu() instead of sched_getcpu() in rseq_test
  KVM: selftests: Make rseq compatible with glibc-2.35
  KVM: Actually create debugfs in kvm_create_vm()
  KVM: Pass the name of the VM fd to kvm_create_vm_debugfs()
  KVM: Get an fd before creating the VM
  KVM: Shove vcpu stats_id init into kvm_vcpu_init()
  KVM: Shove vm stats_id init into kvm_create_vm()
  KVM: x86/mmu: Add sanity check that MMIO SPTE mask doesn't overlap gen
  KVM: x86/mmu: rename trace function name for asynchronous page fault
  KVM: x86/xen: Stop Xen timer before changing IRQ
  KVM: x86/xen: Initialize Xen timer only once
  KVM: SVM: Disable SEV-ES support if MMIO caching is disable
  KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change
  KVM: x86: Tag kvm_mmu_x86_module_init() with __init
  ...
2022-08-11 12:10:08 -07:00
Linus Torvalds
aeb6e6ac18 NFS client updates for Linux 5.20
Highlights include:
 
 Stable fixes:
 - pNFS/flexfiles: Fix infinite looping when the RDMA connection errors out
 
 Bugfixes:
 - NFS: fix port value parsing
 - SUNRPC: Reinitialise the backchannel request buffers before reuse
 - SUNRPC: fix expiry of auth creds
 - NFSv4: Fix races in the legacy idmapper upcall
 - NFS: O_DIRECT fixes from Jeff Layton
 - NFSv4.1: Fix OP_SEQUENCE error handling
 - SUNRPC: Fix an RPC/RDMA performance regression
 - NFS: Fix case insensitive renames
 - NFSv4/pnfs: Fix a use-after-free bug in open
 - NFSv4.1: RECLAIM_COMPLETE must handle EACCES
 
 Features:
 - NFSv4.1: session trunking enhancements
 - NFSv4.2: READ_PLUS performance optimisations
 - NFS: relax the rules for rsize/wsize mount options
 - NFS: don't unhash dentry during unlink/rename
 - SUNRPC: Fail faster on bad verifier
 - NFS/SUNRPC: Various tracing improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESQctxSBg8JpV8KqEZwvnipYKAPIFAmLzy24ACgkQZwvnipYK
 APJJvhAAnVmv3jeOLjwDm+3X9xBevTq8PXAikWVeJgbSKCdjfqXO1J+XF49MpxXl
 N8PQZMqnwWkF3WvhvYobblwGSl6gJ3RnhVuTgdv4jiPl0ZWyS3XngJY0dCTnNgdL
 5O9AjoOMtnGVZwN5j8agymA8f2TUcel5mED6sAk10t2zDZY7VxuQqjp0m696jYjF
 0PKBmxoC4+6tXtYJS7d2PGRCTjfEUx2BLnnGuLOKsB7X8f63XmxJiu8/AIiY7spr
 M/M+BjAF3ok86dT1LjGlkvSNp23H70Wmsv98udfvxIWBm1l4972oCR/CS+kh3/mU
 dYM+oQ3JxxgKEN7Fdak+zU/+qma9q5z2rPFpSIT1fMEuaKN/7H2cbiHPi5RnEBLa
 AHWilX/lWBIMnZJZd9g3yYcGe6E/pkT6TqW5JY+2510koyfNER4IismAWMx2iYKU
 D0WSZOkmEBS/OYZxpTnqGwvS4L1szo9DN3c+yG2KXLifnmVPpjXZO25wahqSuUo3
 V6eYUCXRJmVg+IuXGsMNdrjxGYxD12xChoYzx5RlXls2lwHGeZr+iG3aL3+XayHa
 I1Kji3500UmfEOUUUr4UiQ428dOdL3QqNzVzdymN8Vh4d7v64LUL0GSseY+10Xrs
 xcbR6l/hwjBIo+I1Bi2mmv3W10tKErFy9eBIKzql3D6VHg7ESOo=
 =U00h
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-5.20-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
 "Highlights include:

  Stable fixes:
   - pNFS/flexfiles: Fix infinite looping when the RDMA connection
     errors out

  Bugfixes:
   - NFS: fix port value parsing
   - SUNRPC: Reinitialise the backchannel request buffers before reuse
   - SUNRPC: fix expiry of auth creds
   - NFSv4: Fix races in the legacy idmapper upcall
   - NFS: O_DIRECT fixes from Jeff Layton
   - NFSv4.1: Fix OP_SEQUENCE error handling
   - SUNRPC: Fix an RPC/RDMA performance regression
   - NFS: Fix case insensitive renames
   - NFSv4/pnfs: Fix a use-after-free bug in open
   - NFSv4.1: RECLAIM_COMPLETE must handle EACCES

  Features:
   - NFSv4.1: session trunking enhancements
   - NFSv4.2: READ_PLUS performance optimisations
   - NFS: relax the rules for rsize/wsize mount options
   - NFS: don't unhash dentry during unlink/rename
   - SUNRPC: Fail faster on bad verifier
   - NFS/SUNRPC: Various tracing improvements"

* tag 'nfs-for-5.20-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (46 commits)
  NFS: Improve readpage/writepage tracing
  NFS: Improve O_DIRECT tracing
  NFS: Improve write error tracing
  NFS: don't unhash dentry during unlink/rename
  NFSv4/pnfs: Fix a use-after-free bug in open
  NFS: nfs_async_write_reschedule_io must not recurse into the writeback code
  SUNRPC: Don't reuse bvec on retransmission of the request
  SUNRPC: Reinitialise the backchannel request buffers before reuse
  NFSv4.1: RECLAIM_COMPLETE must handle EACCES
  NFSv4.1 probe offline transports for trunking on session creation
  SUNRPC create a function that probes only offline transports
  SUNRPC export xprt_iter_rewind function
  SUNRPC restructure rpc_clnt_setup_test_and_add_xprt
  NFSv4.1 remove xprt from xprt_switch if session trunking test fails
  SUNRPC create an rpc function that allows xprt removal from rpc_clnt
  SUNRPC enable back offline transports in trunking discovery
  SUNRPC create an iterator to list only OFFLINE xprts
  NFSv4.1 offline trunkable transports on DESTROY_SESSION
  SUNRPC add function to offline remove trunkable transports
  SUNRPC expose functions for offline remote xprt functionality
  ...
2022-08-10 14:04:32 -07:00
Mingwei Zhang
1685c0f325 KVM: x86/mmu: rename trace function name for asynchronous page fault
Rename the tracepoint function from trace_kvm_async_pf_doublefault() to
trace_kvm_async_pf_repeated_fault() to make it clear, since double fault
has nothing to do with this trace function.

Asynchronous Page Fault (APF) is an artifact generated by KVM when it
cannot find a physical page to satisfy an EPT violation. KVM uses APF to
tell the guest OS to do something else such as scheduling other guest
processes to make forward progress. However, when another guest process
also touches a previously APFed page, KVM halts the vCPU instead of
generating a repeated APF to avoid wasting cycles.

Double fault (#DF) clearly has a different meaning and a different
consequence when triggered. #DF requires two nested contributory exceptions
instead of two page faults faulting at the same address. A prevous bug on
APF indicates that it may trigger a double fault in the guest [1] and
clearly this trace function has nothing to do with it. So rename this
function should be a valid choice.

No functional change intended.

[1] https://www.spinics.net/lists/kvm/msg214957.html

Signed-off-by: Mingwei Zhang <mizhang@google.com>
Message-Id: <20220807052141.69186-1-mizhang@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-08-10 15:08:26 -04:00
Linus Torvalds
e394ff83bb NFSD 6.0 Release Notes
Work on "courteous server", which was introduced in 5.19, continues
 apace. This release introduces a more flexible limit on the number
 of NFSv4 clients that NFSD allows, now that NFSv4 clients can remain
 in courtesy state long after the lease expiration timeout. The
 client limit is adjusted based on the physical memory size of the
 server.
 
 The NFSD filecache is a cache of files held open by NFSv4 clients or
 recently touched by NFSv2 or NFSv3 clients. This cache had some
 significant scalability constraints that have been relieved in this
 release. Thanks to all who contributed to this work.
 
 A data corruption bug found during the most recent NFS bake-a-thon
 that involves NFSv3 and NFSv4 clients writing the same file has been
 addressed in this release.
 
 This release includes several improvements in CPU scalability for
 NFSv4 operations. In addition, Neil Brown provided patches that
 simplify locking during file lookup, creation, rename, and removal
 that enables subsequent work on making these operations more
 scalable. We expect to see that work materialize in the next
 release.
 
 There are also numerous single-patch fixes, clean-ups, and the
 usual improvements in observability.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmLujF0ACgkQM2qzM29m
 f5c14w/+Lsoryo5vdTXMAZBXBNvVdXQmHqLIEotJEVA3sECr+Kad2bF8rFaCWVzS
 Gf9KhTetmcDO9O73I5I/UtJ2qFT9B4I6baSGpOzIkjsM/aKeEEQpbpdzPYhKrCEv
 bQu3P54js7snH4YV8s0I39nBFOWdnahYXaw7peqE/2GOHxaR2mz88bkrQ+OybCxz
 KETqUxA6bKzkOT61S0nHcnQKd8HQzhocMDtrxtANHGsMM167ngI1dw4tUQAtfAUI
 s9R+GS6qwiKgwGz1oqhTR6LA/h4DROxPnc7AieuD9FvuAnR3kXw61bGMN5Biwv2T
 JZUTBbQvWhNasSV+7qOY9nBu+sHVC6Q7OZ5C9F/KjMyqCioDX0DnbxX9uKP20CDd
 EAAMS8n4Tdgd4KRBWdkLXPzizWYAjZQmFIJtcZne1JzGZ4IWRnikgM5qD6n1VviZ
 kcPRm5EN3DRHA+Hte4jG0EHIrE/7g5gnf+zr9dWl3uNhZtfTmumCfU16YYmKG8pP
 QN4kXBR2w7dAvp8nRaOsY6bBFLDAk/jHbpY8Q4xoUO4tsojfWayCTGVFOrecOjxv
 uSn0LhiidC5pLlkcPgwemhysVywDzr+gGXBRJXeUOHfdd05Q2gbFK8OpqDSvJ3dZ
 aC/RxFvHc8jaktUcuIjkE6Rsz6AVaAH3EZj84oMZ4hZhyGbEreg=
 =PEJ3
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd updates from Chuck Lever:
 "Work on 'courteous server', which was introduced in 5.19, continues
  apace. This release introduces a more flexible limit on the number of
  NFSv4 clients that NFSD allows, now that NFSv4 clients can remain in
  courtesy state long after the lease expiration timeout. The client
  limit is adjusted based on the physical memory size of the server.

  The NFSD filecache is a cache of files held open by NFSv4 clients or
  recently touched by NFSv2 or NFSv3 clients. This cache had some
  significant scalability constraints that have been relieved in this
  release. Thanks to all who contributed to this work.

  A data corruption bug found during the most recent NFS bake-a-thon
  that involves NFSv3 and NFSv4 clients writing the same file has been
  addressed in this release.

  This release includes several improvements in CPU scalability for
  NFSv4 operations. In addition, Neil Brown provided patches that
  simplify locking during file lookup, creation, rename, and removal
  that enables subsequent work on making these operations more scalable.
  We expect to see that work materialize in the next release.

  There are also numerous single-patch fixes, clean-ups, and the usual
  improvements in observability"

* tag 'nfsd-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (78 commits)
  lockd: detect and reject lock arguments that overflow
  NFSD: discard fh_locked flag and fh_lock/fh_unlock
  NFSD: use (un)lock_inode instead of fh_(un)lock for file operations
  NFSD: use explicit lock/unlock for directory ops
  NFSD: reduce locking in nfsd_lookup()
  NFSD: only call fh_unlock() once in nfsd_link()
  NFSD: always drop directory lock in nfsd_unlink()
  NFSD: change nfsd_create()/nfsd_symlink() to unlock directory before returning.
  NFSD: add posix ACLs to struct nfsd_attrs
  NFSD: add security label to struct nfsd_attrs
  NFSD: set attributes when creating symlinks
  NFSD: introduce struct nfsd_attrs
  NFSD: verify the opened dentry after setting a delegation
  NFSD: drop fh argument from alloc_init_deleg
  NFSD: Move copy offload callback arguments into a separate structure
  NFSD: Add nfsd4_send_cb_offload()
  NFSD: Remove kmalloc from nfsd4_do_async_copy()
  NFSD: Refactor nfsd4_do_copy()
  NFSD: Refactor nfsd4_cleanup_inter_ssc() (2/2)
  NFSD: Refactor nfsd4_cleanup_inter_ssc() (1/2)
  ...
2022-08-09 14:56:49 -07:00
Linus Torvalds
15205c2829 fscache fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAmLyXZ4ACgkQ+7dXa6fL
 C2tJCxAAhH4ufVMk1r8H/lflCulaBw917zgoOO/daO9FjKDtL6hReTN6OgJLLULw
 a1ZMsRxl1NJy5duGYRswUR6gS8y4LzTdILvqfvP+9YSoZzou+uTHHuKFJheA6x++
 /cRQFBQmx8JTBrEQXSHULa3FfKsU0UCwxCzy7q4o5zJogK6yiLhApxORIB5ZYi7W
 k/3iKA0isqKsSEwmRt3D9ekypb3E/8QGaQV7Bng6/1wldFFmL1g5w/ubY8TCsefV
 gnNUA3Ops3LsYj9a0vQGJ6lXKIol2nFClvmFM1qvb09u4PaVa2tofXd5FQuy/iC/
 z2+ULUtBi7gs+DjsxPBf1cnbeA3BzNu5BfdQFd3QJksKFHcIMGsKnTQehPuVPhdn
 p0IGZrf3/sncmXWVZ322w+R0TLiwB0CX9fh4WS5aYw9WHtRg+FGeFCoVLSV774M0
 OiRxefg6A4sBMP2XwHICRXWPtqnBj2PYhQ6bj6pCHJg4sGfrYnf/q7vWFI1rJ9JW
 258lhyWaaO0SqPMtdYamBQzPQAAsuG72Mrf5k3pFLWIuPPb9Ho3B7fwbtjcMQcbv
 q+apsiiPk/aqbjbI8YclOD+XgRQMUaRO2s2+JCAQvoFoWPRLYE6ZaTDdesN0nHJB
 PgTFACx/dPtTCpQxIUDldDsj8wk4iPmPZiQ31gVzKr53v2Jvkhc=
 =LWMd
 -----END PGP SIGNATURE-----

Merge tag 'fscache-fixes-20220809' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull fscache updates from David Howells:

 - Fix a cookie access ref leak if a cookie is invalidated a second time
   before the first invalidation is actually processed.

 - Add a tracepoint to log cookie lookup failure

* tag 'fscache-fixes-20220809' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  fscache: add tracepoint when failing cookie
  fscache: don't leak cookie access refs if invalidation is in progress or failed
2022-08-09 10:11:56 -07:00
Linus Torvalds
4b22e20741 AFS fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAmLpXWwACgkQ+7dXa6fL
 C2tpSg//WX4+1XvsehIZxykJYmQ7XjVjrg9uGIEaslUry2nqEECDeK7LppKLuCoD
 1xMBc5GRn4h5YbBCGbMWnr+r6eezcXdwr5Fv+On4fD3l5AWTk5xTaR8v1So3G9bK
 nAkWaoSrFapuifWgBkDGy9NhLW/k6d24BAKCqGhOlPcVzi4B95C4gQ5sIlMlH9xw
 jUkHB+HSLZj/yBV47sWQUs0uy5fCiOgm3qgWYl2cTHD4qJ052i0ELO30uAv9Oykq
 YjgOUryZlCP/vLSWzOeK8bgINjCLetd+2jGT96bV15BouQ4uoHSrgt7ACtp1I3NP
 WAPGdzlzXZqaYSrqX2sPP/Tmq0/SipoKCRqK5e2yetmqs7zznKZe0ATwFhjJYp7L
 IOULEaOAyrAeT9KQEwqtANXXoUxyEUqz2GKiybXWC8oUFJ0RAKpSf4LRi9cpoaRz
 VDmi0F4/ZA+nMDOPPdbSZTtDCu+68Lz2l6Z7ONwIZ1qzyXJB4BC4wb+9rx7UftA9
 QuX9TqGUtjlWiPcM0qvdFoPOIA3xOXbZVtlvqygFxzDDXGlRVZdUa3uqMjELKzK1
 v1yRZIS00zqWM4lWFyaQNCZHwFhlnixRzkAmvQcioAb/NJL3eA4n3FbL+RBTSrWP
 XEtgpI32ROobA3S+69LomYH4AisMgP9XDP04mOQqQfIxg9GFdEI=
 =fXqy
 -----END PGP SIGNATURE-----

Merge tag 'afs-fixes-20220802' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull AFS fixes from David Howells:
 "Fix AFS refcount handling.

  The first patch converts afs to use refcount_t for its refcounts and
  the second patch fixes afs_put_call() and afs_put_server() to save the
  values they're going to log in the tracepoint before decrementing the
  refcount"

* tag 'afs-fixes-20220802' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  afs: Fix access after dec in put functions
  afs: Use refcount_t rather than atomic_t
2022-08-09 10:08:08 -07:00
Jeff Layton
1a1e3aca9d fscache: add tracepoint when failing cookie
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
2022-08-09 14:13:59 +01:00
Linus Torvalds
507f811f20 More power management updates for 5.20-rc1
- Fix return error code in mtk_cpu_dvfs_info_init (Yang Yingliang).
 
  - Minor cleanups and support for new boards for Qcom cpufreq drivers
    (Bryan O'Donoghue, Konrad Dybcio, Pierre Gondois, and Yicong Yang).
 
  - Fix sparse warnings for Tegra cpufreq driver (Viresh Kumar).
 
  - Make dev_pm_opp_set_regulators() accept NULL terminated list (Viresh
    Kumar).
 
  - Add dev_pm_opp_set_config() and friends and migrate other users and
    helpers to using them (Viresh Kumar).
 
  - Add support for multiple clocks for a device (Viresh Kumar and
    Krzysztof Kozlowski).
 
  - Configure resources before adding OPP table for Venus (Stanimir
    Varbanov).
 
  - Keep reference count up for opp->np and opp_table->np while they are
    still in use (Liang He).
 
  - Minor OPP cleanups (Viresh Kumar and Yang Li).
 
  - Add a trace event for cpuidle to track missed (too deep or too
    shallow) wakeups (Kajetan Puchalski).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmLxUA0SHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxypYQAK/sYS76XzKRjVsPmC082FVlA9Helhsa
 Op50DSnhfzYAWrtRZM5VPsV2CgQkmc5KCmZJSd1ZKIFcOpjlJT/rvaVaSH7Ltcn5
 52GOus6KXKCL3FegQLy3bLcmKkEJIXb3uhWE2VlSuj2cxx6KE2g4bUwPE0pRr++Y
 RkfaT6hcUzxxOAKw1cQhdXgBoXKL/ZeypmpZ95joYuas/mozKskM5SQFX455JCQ9
 t4vaRzrsHzxi5ELiML75TYMY97sF367wSs+4jZSgPBllbJcRXEMg+JkTccKRYrsZ
 k/kDvP5xVFzKT/dYpNpW3u/pl94+xZuh5WLF9/AqwC/qs7kLPJJ0/8mfTTd63DjZ
 3KrkimiQ3d2XMAL4L6FoK+T8v6MwzmlN0elmHHdtmu9mY+v01CwAzjpxdvaFoELK
 V6BCRRX8KNwYsrAJ4EpDK9TvPYJf8yT3jvGDcjPZY9RYlebje0Q825XOcxea4Dfe
 oFxiEWgfK9gzOBvaa24oifKDy2RVy6FvR43qQeiPG4AWAFjr4qP9cDO4q5OL/BuE
 sXpsGY5NE/e8JH9hkgmUK1ms50zk4UMbRC5ZoZuHWyiaFlJdMRF3cUGHe3ylPrxb
 XOFZz8Zl4WeAqBjGGHuiMedwEbmQH2RhdAMCQO1nxoq3UXy6E2/ojI1G1uQ9IEm0
 5FFouJ+bEnqO
 =LBb0
 -----END PGP SIGNATURE-----

Merge tag 'pm-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more power management updates from Rafael Wysocki:
 "These are ARM cpufreq updates and operating performance points (OPP)
  updates plus one cpuidle update adding a new trace point.

  Specifics:

   - Fix return error code in mtk_cpu_dvfs_info_init (Yang Yingliang).

   - Minor cleanups and support for new boards for Qcom cpufreq drivers
     (Bryan O'Donoghue, Konrad Dybcio, Pierre Gondois, and Yicong Yang).

   - Fix sparse warnings for Tegra cpufreq driver (Viresh Kumar).

   - Make dev_pm_opp_set_regulators() accept NULL terminated list
     (Viresh Kumar).

   - Add dev_pm_opp_set_config() and friends and migrate other users and
     helpers to using them (Viresh Kumar).

   - Add support for multiple clocks for a device (Viresh Kumar and
     Krzysztof Kozlowski).

   - Configure resources before adding OPP table for Venus (Stanimir
     Varbanov).

   - Keep reference count up for opp->np and opp_table->np while they
     are still in use (Liang He).

   - Minor OPP cleanups (Viresh Kumar and Yang Li).

   - Add a trace event for cpuidle to track missed (too deep or too
     shallow) wakeups (Kajetan Puchalski)"

* tag 'pm-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (55 commits)
  cpuidle: Add cpu_idle_miss trace event
  venus: pm_helpers: Fix warning in OPP during probe
  OPP: Don't drop opp->np reference while it is still in use
  OPP: Don't drop opp_table->np reference while it is still in use
  cpufreq: tegra194: Staticize struct tegra_cpufreq_soc instances
  dt-bindings: cpufreq: cpufreq-qcom-hw: Add SM6375 compatible
  dt-bindings: opp: Add msm8939 to the compatible list
  dt-bindings: opp: Add missing compat devices
  dt-bindings: opp: opp-v2-kryo-cpu: Fix example binding checks
  cpufreq: Change order of online() CB and policy->cpus modification
  cpufreq: qcom-hw: Remove deprecated irq_set_affinity_hint() call
  cpufreq: qcom-hw: Disable LMH irq when disabling policy
  cpufreq: qcom-hw: Reset cancel_throttle when policy is re-enabled
  cpufreq: qcom-cpufreq-hw: use HZ_PER_KHZ macro in units.h
  cpufreq: mediatek: fix error return code in mtk_cpu_dvfs_info_init()
  OPP: Remove dev{m}_pm_opp_of_add_table_noclk()
  PM / devfreq: tegra30: Register config_clks helper
  OPP: Allow config_clks helper for single clk case
  OPP: Provide a simple implementation to configure multiple clocks
  OPP: Assert clk_count == 1 for single clk helpers
  ...
2022-08-08 14:29:00 -07:00
Linus Torvalds
ea0c39260d 9p-for-5.20
- a couple of fixes
 - add a tracepoint for fid refcounting
 - some cleanup/followup on fid lookup
 - some cleanup around req refcounting
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE/IPbcYBuWt0zoYhOq06b7GqY5nAFAmLuKqkACgkQq06b7GqY
 5nA+RBAAvuA6AjKQSvsNxHsqSFMahwoE3cCPlF/QlmAnnZa1DzmRI3kKTAuuDKkE
 Q4c7DjxzDJCWRTlkpNUCGZycHqpu1QQbfoTo43sIqob7W8xlajeemc5Fxqtw5sPM
 m+0SzN7vvNJpCr+6pxMXwwGHSZmZOvpFjwj7cUjhpF/V1WO8bNxfCGzQdF0hX1Vn
 2HoBbFUOmacL9Z/pF3O/ZG9LwCFuRQsH0EmbFBlJy1WdDtlVHTXlzDaGQ7EGaY4D
 17UR6iITsj9ozacLhvk094PLIc3/RHDGLrm3C4Ka3zmUI7BsiYWPDmai3Pu/DNqn
 JJ5sZkdrVowxyBbGxw8GpZ4YDJtGsU5XglFPdkw+ZazxhZLNEIstPXg0HTZybrOe
 GE+WskWB0qS+RpX0tnYEcX6qOHWm3/63Yq5NG6A9tLQUSFku02jS/bCQSLBrmGWW
 Js24IWvFSTvl6XytHeldYhJP618pNUxXRSqgYv96vT/LI3mUrIMN+IVBNPujO6p2
 jIYXNoaqLoY3efXKW/WQmp7C/52ZP4ly4fOiz7qtHTQsCcIk8Xo6zwHtm/FkNEqc
 sMZdqgLrxKPNBAlT8iEtt//wU2fB7mFt988p0pc+5lAK5t0h67KZJV6vDwTAMObX
 wV6Ht+QhOHtJwO779fk8FhZPNRPYZYMutIyFNlRx+4gCpBuqJPc=
 =941k
 -----END PGP SIGNATURE-----

Merge tag '9p-for-5.20' of https://github.com/martinetd/linux

Pull 9p updates from Dominique Martinet:

 - a couple of fixes

 - add a tracepoint for fid refcounting

 - some cleanup/followup on fid lookup

 - some cleanup around req refcounting

* tag '9p-for-5.20' of https://github.com/martinetd/linux:
  net/9p: Initialize the iounit field during fid creation
  net: 9p: fix refcount leak in p9_read_work() error handling
  9p: roll p9_tag_remove into p9_req_put
  9p: Add client parameter to p9_req_put()
  9p: Drop kref usage
  9p: Fix some kernel-doc comments
  9p fid refcount: cleanup p9_fid_put calls
  9p fid refcount: add a 9p_fid_ref tracepoint
  9p fid refcount: add p9_fid_get/put wrappers
  9p: Fix minor typo in code comment
  9p: Remove unnecessary variable for old fids while walking from d_parent
  9p: Make the path walk logic more clear about when cloning is required
  9p: Track the root fid with its own variable during lookups
2022-08-06 14:48:54 -07:00
Linus Torvalds
1d239c1eb8 IOMMU Updates for Linux v5.20/v6.0:
Including:
 
 	- Most intrusive patch is small and changes the default
 	  allocation policy for DMA addresses. Before the change the
 	  allocator tried its best to find an address in the first 4GB.
 	  But that lead to performance problems when that space gets
 	  exhaused, and since most devices are capable of 64-bit DMA
 	  these days, we changed it to search in the full DMA-mask
 	  range from the beginning.  This change has the potential to
 	  uncover bugs elsewhere, in the kernel or the hardware. There
 	  is a Kconfig option and a command line option to restore the
 	  old behavior, but none of them is enabled by default.
 
 	- Add Robin Murphy as reviewer of IOMMU code and maintainer for
 	  the dma-iommu and iova code
 
 	- Chaning IOVA magazine size from 1032 to 1024 bytes to save
 	  memory
 
 	- Some core code cleanups and dead-code removal
 
 	- Support for ACPI IORT RMR node
 
 	- Support for multiple PCI domains in the AMD-Vi driver
 
 	- ARM SMMU changes from Will Deacon:
 
 	  - Add even more Qualcomm device-tree compatible strings
 
 	  - Support dumping of IMP DEF Qualcomm registers on TLB sync
 	    timeout
 
 	  - Fix reference count leak on device tree node in Qualcomm
 	    driver
 
 	- Intel VT-d driver updates from Lu Baolu:
 
 	  - Make intel-iommu.h private
 
 	  - Optimize the use of two locks
 
 	  - Extend the driver to support large-scale platforms
 
 	  - Cleanup some dead code
 
 	- MediaTek IOMMU refactoring and support for TTBR up to 35bit
 
 	- Basic support for Exynos SysMMU v7
 
 	- VirtIO IOMMU driver gets a map/unmap_pages() implementation
 
 	- Other smaller cleanups and fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmLs3DIACgkQK/BELZcB
 GuMizhAAguAnLLOkOLlR9/MhrTZfNXCUX+bfrEIevjFXMw4iPNfCCr4ydQ7EdVK6
 ZA/3Z89huYl0d0x/FELolnQi+HOeqYrfTDe4rB7TgNgwZnWa+fdHcyYkgBGyfPaV
 ilgjNcx8o//9o4NasyB6kU395jVmFxb735gMTTb+tcO9fr+/qIB6hxrHuCklxrNr
 C7wK6kkoDPi5n0QuXCSjXEx2Hk245pAWKPLwqxsUYzHGlLfl7ULOxw65BUBGvn/H
 uCsTfJFu7u+ErwQYf0qPuOwRBnRdsx9g5EAnfab8p074SoKWvbNnftIxgIRp8ZEM
 YgCbhYa1GOFI4r+XzqRzEbc0/vPSttims4Jqz0KxYs7pr5EoVifrWLJFjJdCdc2h
 Tio1gTvOq8HbH63kwYNKJhg4iSC6zVd37ihEhvfFO6LcgFl4iCfd2o9zK7oY40J4
 XoOxofVnJ2e3tzdhZ/n5quCXiudHixm6WuVa7QYKscF7Ud0tY1wWKuibdlMQTeNM
 68MvtlteKcfs1BrWzZyrFMrFeAfIY8LI82y6jdJuoNMU5LE9+5yelXBdJhnVygZ+
 Jglv1TIt6W/z1H5JgXtNVZ1wWgBm7rurOqNyfN8XCd8eP1z321CLfX8ujkhKrIWP
 ApG15cwvpnh1JX630+UFiEikTGU0fb2orMdPwYmwuu8DAsoLVHE=
 =hI2K
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v5.20-or-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:

 - The most intrusive patch is small and changes the default allocation
   policy for DMA addresses.

   Before the change the allocator tried its best to find an address in
   the first 4GB. But that lead to performance problems when that space
   gets exhaused, and since most devices are capable of 64-bit DMA these
   days, we changed it to search in the full DMA-mask range from the
   beginning.

   This change has the potential to uncover bugs elsewhere, in the
   kernel or the hardware. There is a Kconfig option and a command line
   option to restore the old behavior, but none of them is enabled by
   default.

 - Add Robin Murphy as reviewer of IOMMU code and maintainer for the
   dma-iommu and iova code

 - Chaning IOVA magazine size from 1032 to 1024 bytes to save memory

 - Some core code cleanups and dead-code removal

 - Support for ACPI IORT RMR node

 - Support for multiple PCI domains in the AMD-Vi driver

 - ARM SMMU changes from Will Deacon:
      - Add even more Qualcomm device-tree compatible strings
      - Support dumping of IMP DEF Qualcomm registers on TLB sync
        timeout
      - Fix reference count leak on device tree node in Qualcomm driver

 - Intel VT-d driver updates from Lu Baolu:
      - Make intel-iommu.h private
      - Optimize the use of two locks
      - Extend the driver to support large-scale platforms
      - Cleanup some dead code

 - MediaTek IOMMU refactoring and support for TTBR up to 35bit

 - Basic support for Exynos SysMMU v7

 - VirtIO IOMMU driver gets a map/unmap_pages() implementation

 - Other smaller cleanups and fixes

* tag 'iommu-updates-v5.20-or-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (116 commits)
  iommu/amd: Fix compile warning in init code
  iommu/amd: Add support for AVIC when SNP is enabled
  iommu/amd: Simplify and Consolidate Virtual APIC (AVIC) Enablement
  ACPI/IORT: Fix build error implicit-function-declaration
  drivers: iommu: fix clang -wformat warning
  iommu/arm-smmu: qcom_iommu: Add of_node_put() when breaking out of loop
  iommu/arm-smmu-qcom: Add SM6375 SMMU compatible
  dt-bindings: arm-smmu: Add compatible for Qualcomm SM6375
  MAINTAINERS: Add Robin Murphy as IOMMU SUBSYTEM reviewer
  iommu/amd: Do not support IOMMUv2 APIs when SNP is enabled
  iommu/amd: Do not support IOMMU_DOMAIN_IDENTITY after SNP is enabled
  iommu/amd: Set translation valid bit only when IO page tables are in use
  iommu/amd: Introduce function to check and enable SNP
  iommu/amd: Globally detect SNP support
  iommu/amd: Process all IVHDs before enabling IOMMU features
  iommu/amd: Introduce global variable for storing common EFR and EFR2
  iommu/amd: Introduce Support for Extended Feature 2 Register
  iommu/amd: Change macro for IOMMU control register bit shift to decimal value
  iommu/exynos: Enable default VM instance on SysMMU v7
  iommu/exynos: Add SysMMU v7 register set
  ...
2022-08-06 10:42:38 -07:00
Linus Torvalds
3bd6e5854b asm-generic: updates for 6.0
There are three independent sets of changes:
 
  - Sai Prakash Ranjan adds tracing support to the asm-generic
    version of the MMIO accessors, which is intended to help
    understand problems with device drivers and has been part
    of Qualcomm's vendor kernels for many years.
 
  - A patch from Sebastian Siewior to rework the handling of
    IRQ stacks in softirqs across architectures, which is
    needed for enabling PREEMPT_RT.
 
  - The last patch to remove the CONFIG_VIRT_TO_BUS option and
    some of the code behind that, after the last users of this
    old interface made it in through the netdev, scsi, media and
    staging trees.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmLqPPEACgkQmmx57+YA
 GNlUbQ/+NpIsiA0JUrCGtySt8KrLHdA2dH9lJOR5/iuxfphscPFfWtpcPvcXQWmt
 a8u7wyI8SHW1ku4U0Y5sO0dBSldDnoIqJ5t4X5d7YNU9yVtEtucqQhZf+GkrPlVD
 1HkRu05B7y0k2BMn7BLhSvkpafs3f1lNGXjs8oFBdOF1/zwp/GjcrfCK7KFzqjwU
 dYrX0SOFlKFd4BZC75VfK+XcKg4LtwIOmJraRRl7alz2Q5Oop2hgjgZxXDPf//vn
 SPOhXJN/97i1FUpY2TkfHVH1NxbPfjCV4pUnjmLG0Y4NSy9UQ/ZcXHcywIdeuhfa
 0LySOIsAqBeccpYYYdg2ubiMDZOXkBfANu/sB9o/EhoHfB4svrbPRDhBIQZMFXJr
 MJYu+IYce2rvydA/nydo4q++pxR8v1ES1ZIo8bDux+q1CI/zbpQV+f98kPVRA0M7
 ajc+5GTIqNIsvHzzadq7eYxcj5Bi8Li2JA9sVkAQ+6iq1TVyeYayMc9eYwONlmqw
 MD+PFYc651pKtXZCfkLXPIKSwS0uPqBndAibuVhpZ0hxWaCBBdKvY9mrWcPxt0kA
 tMR8lrosbbrV2K48BFdWTOHvCs2FhHQxPGVPZ/iWuxTA0hHZ9tUlaEkSX+VM57IU
 KCYQLdWzT8J9vrgqSbgYKlb6pSPz6FIjTfut6NZMmshIbavHV/Q=
 =aTR0
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic updates from Arnd Bergmann:
 "There are three independent sets of changes:

   - Sai Prakash Ranjan adds tracing support to the asm-generic version
     of the MMIO accessors, which is intended to help understand
     problems with device drivers and has been part of Qualcomm's vendor
     kernels for many years

   - A patch from Sebastian Siewior to rework the handling of IRQ stacks
     in softirqs across architectures, which is needed for enabling
     PREEMPT_RT

   - The last patch to remove the CONFIG_VIRT_TO_BUS option and some of
     the code behind that, after the last users of this old interface
     made it in through the netdev, scsi, media and staging trees"

* tag 'asm-generic-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  uapi: asm-generic: fcntl: Fix typo 'the the' in comment
  arch/*/: remove CONFIG_VIRT_TO_BUS
  soc: qcom: geni: Disable MMIO tracing for GENI SE
  serial: qcom_geni_serial: Disable MMIO tracing for geni serial
  asm-generic/io: Add logging support for MMIO accessors
  KVM: arm64: Add a flag to disable MMIO trace for nVHE KVM
  lib: Add register read/write tracing support
  drm/meson: Fix overflow implicit truncation warnings
  irqchip/tegra: Fix overflow implicit truncation warnings
  coresight: etm4x: Use asm-generic IO memory barriers
  arm64: io: Use asm-generic high level MMIO accessors
  arch/*: Disable softirq stacks on PREEMPT_RT.
2022-08-05 10:07:23 -07:00
Linus Torvalds
965a9d75e3 Tracing updates for 5.20 / 6.0
- Runtime verification infrastructure
   This is the biggest change for this pull request. It introduces the
   runtime verification that is necessary for running Linux on safety
   critical systems. It allows for deterministic automata models to be
   inserted into the kernel that will attach to tracepoints, where the
   information on these tracepoints will move the model from state to state.
   If a state is encountered that does not belong to the model, it will then
   activate a given reactor, that could just inform the user or even panic
   the kernel (for which safety critical systems will detect and can recover
   from).
 
 - Two monitor models are also added: Wakeup In Preemptive (WIP - not to be
   confused with "work in progress"), and Wakeup While Not Running (WWNR).
 
 - Added __vstring() helper to the TRACE_EVENT() macro to replace several
   vsnprintf() usages that were all doing it wrong.
 
 - eprobes now can have their event autogenerated when the event name is left
   off.
 
 - The rest is various cleanups and fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYu0yzRQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qj4HAP4tQtV55rjj4DQ5XIXmtI3/64PmyRSJ
 +y4DEXi1UvEUCQD/QAuQfWoT/7gh35ltkfeS4t3ockzy14rrkP5drZigiQA=
 =kEtM
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:

 - Runtime verification infrastructure

   This is the biggest change here. It introduces the runtime
   verification that is necessary for running Linux on safety critical
   systems.

   It allows for deterministic automata models to be inserted into the
   kernel that will attach to tracepoints, where the information on
   these tracepoints will move the model from state to state.

   If a state is encountered that does not belong to the model, it will
   then activate a given reactor, that could just inform the user or
   even panic the kernel (for which safety critical systems will detect
   and can recover from).

 - Two monitor models are also added: Wakeup In Preemptive (WIP - not to
   be confused with "work in progress"), and Wakeup While Not Running
   (WWNR).

 - Added __vstring() helper to the TRACE_EVENT() macro to replace
   several vsnprintf() usages that were all doing it wrong.

 - eprobes now can have their event autogenerated when the event name is
   left off.

 - The rest is various cleanups and fixes.

* tag 'trace-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (50 commits)
  rv: Unlock on error path in rv_unregister_reactor()
  tracing: Use alignof__(struct {type b;}) instead of offsetof()
  tracing/eprobe: Show syntax error logs in error_log file
  scripts/tracing: Fix typo 'the the' in comment
  tracepoints: It is CONFIG_TRACEPOINTS not CONFIG_TRACEPOINT
  tracing: Use free_trace_buffer() in allocate_trace_buffers()
  tracing: Use a struct alignof to determine trace event field alignment
  rv/reactor: Add the panic reactor
  rv/reactor: Add the printk reactor
  rv/monitor: Add the wwnr monitor
  rv/monitor: Add the wip monitor
  rv/monitor: Add the wip monitor skeleton created by dot2k
  Documentation/rv: Add deterministic automata instrumentation documentation
  Documentation/rv: Add deterministic automata monitor synthesis documentation
  tools/rv: Add dot2k
  Documentation/rv: Add deterministic automaton documentation
  tools/rv: Add dot2c
  Documentation/rv: Add a basic documentation
  rv/include: Add instrumentation helper functions
  rv/include: Add deterministic automata monitor definition via C macros
  ...
2022-08-05 09:41:12 -07:00
Linus Torvalds
746fc76b82 SCSI misc on 20220804
Updates to the usual drivers (ufs, qla2xx, target, lpfc, smartpqi,
 mpi3mr).  The main driver change that might cause issues on down the
 road is the conversion of some of our oldest surviving drivers to the
 DMA API (should only affect m68k).  The only major core change is the
 rework of async resume; the rest are either completely trivial or for
 updating deprecated APIs.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYuvakyYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishfvOAP4m0N6b
 e3JwoBtB1c0JMKv6G4gka8suEG8p5f4khDu8wwD+LfGUCzG49Y5Ts7rByXfEiGgO
 krSdwsAZiV6yKg/HuPw=
 =Ak9L
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "Updates to the usual drivers (ufs, qla2xx, target, lpfc, smartpqi,
  mpi3mr).

  The main driver change that might cause issues on down the road is the
  conversion of some of our oldest surviving drivers to the DMA API
  (should only affect m68k).

  The only major core change is the rework of async resume; the rest are
  either completely trivial or for updating deprecated APIs"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (195 commits)
  scsi: target: Remove XDWRITEREAD emulated support
  scsi: megaraid: Remove the static variable initialisation
  scsi: ch: Do not initialise statics to 0
  scsi: ufs: core: Fix spelling mistake "Cannnot" -> "Cannot"
  scsi: target: iscsi: Do not require target authentication
  scsi: target: iscsi: Allow AuthMethod=None
  scsi: target: iscsi: Support base64 in CHAP
  scsi: target: iscsi: Add support for extended CDB AHS
  scsi: ufs: dt-bindings: Add SC8280XP binding
  scsi: target: iscsi: Fix clang -Wformat warnings
  scsi: ufs: core: Read device property for ref clock
  scsi: libsas: Resume SAS host for phy reset or enable via sysfs
  scsi: hisi_sas: Modify v3 HW SATA completion error processing
  scsi: hisi_sas: Relocate DMA unmap of SMP task
  scsi: hisi_sas: Remove unnecessary variable to hold DMA map elements
  scsi: hisi_sas: Call hisi_sas_slave_configure() from slave_configure_v3_hw()
  scsi: mpi3mr: Delete a stray tab
  scsi: mpi3mr: Unlock on error path
  scsi: mpi3mr: Reduce VD queue depth on detecting throttling
  scsi: mpi3mr: Resource Based Metering
  ...
2022-08-04 19:47:37 -07:00
Linus Torvalds
228dfe98a3 Char / Misc driver changes for 6.0-rc1
Here is the large set of char and misc and other driver subsystem
 changes for 6.0-rc1.
 
 Highlights include:
 	- large set of IIO driver updates, additions, and cleanups
 	- new habanalabs device support added (loads of register maps
 	  much like GPUs have)
 	- soundwire driver updates
 	- phy driver updates
 	- slimbus driver updates
 	- tiny virt driver fixes and updates
 	- misc driver fixes and updates
 	- interconnect driver updates
 	- hwtracing driver updates
 	- fpga driver updates
 	- extcon driver updates
 	- firmware driver updates
 	- counter driver update
 	- mhi driver fixes and updates
 	- binder driver fixes and updates
 	- speakup driver fixes
 
 Full details are in the long shortlog contents.
 
 All of these have been in linux-next for a while without any reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYup9QQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylBKQCfaSuzl9ZP9dTvAw2FPp14oRqXnpoAnicvWAoq
 1vU9Vtq2c73uBVLdZm4m
 =AwP3
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char / misc driver updates from Greg KH:
 "Here is the large set of char and misc and other driver subsystem
  changes for 6.0-rc1.

  Highlights include:

   - large set of IIO driver updates, additions, and cleanups

   - new habanalabs device support added (loads of register maps much
     like GPUs have)

   - soundwire driver updates

   - phy driver updates

   - slimbus driver updates

   - tiny virt driver fixes and updates

   - misc driver fixes and updates

   - interconnect driver updates

   - hwtracing driver updates

   - fpga driver updates

   - extcon driver updates

   - firmware driver updates

   - counter driver update

   - mhi driver fixes and updates

   - binder driver fixes and updates

   - speakup driver fixes

  All of these have been in linux-next for a while without any reported
  problems"

* tag 'char-misc-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (634 commits)
  drivers: lkdtm: fix clang -Wformat warning
  char: remove VR41XX related char driver
  misc: Mark MICROCODE_MINOR unused
  spmi: trace: fix stack-out-of-bound access in SPMI tracing functions
  dt-bindings: iio: adc: Add compatible for MT8188
  iio: light: isl29028: Fix the warning in isl29028_remove()
  iio: accel: sca3300: Extend the trigger buffer from 16 to 32 bytes
  iio: fix iio_format_avail_range() printing for none IIO_VAL_INT
  iio: adc: max1027: unlock on error path in max1027_read_single_value()
  iio: proximity: sx9324: add empty line in front of bullet list
  iio: magnetometer: hmc5843: Remove duplicate 'the'
  iio: magn: yas530: Use DEFINE_RUNTIME_DEV_PM_OPS() and pm_ptr() macros
  iio: magnetometer: ak8974: Use DEFINE_RUNTIME_DEV_PM_OPS() and pm_ptr() macros
  iio: light: veml6030: Use DEFINE_RUNTIME_DEV_PM_OPS() and pm_ptr() macros
  iio: light: vcnl4035: Use DEFINE_RUNTIME_DEV_PM_OPS() and pm_ptr() macros
  iio: light: vcnl4000: Use DEFINE_RUNTIME_DEV_PM_OPS() and pm_ptr() macros
  iio: light: tsl2591: Use DEFINE_RUNTIME_DEV_PM_OPS() and pm_ptr()
  iio: light: tsl2583: Use DEFINE_RUNTIME_DEV_PM_OPS and pm_ptr()
  iio: light: isl29028: Use DEFINE_RUNTIME_DEV_PM_OPS() and pm_ptr()
  iio: light: gp2ap002: Switch to DEFINE_RUNTIME_DEV_PM_OPS and pm_ptr()
  ...
2022-08-04 11:05:48 -07:00
Linus Torvalds
f86d1fbbe7 Networking changes for 6.0.
Core
 ----
 
  - Refactor the forward memory allocation to better cope with memory
    pressure with many open sockets, moving from a per socket cache to
    a per-CPU one
 
  - Replace rwlocks with RCU for better fairness in ping, raw sockets
    and IP multicast router.
 
  - Network-side support for IO uring zero-copy send.
 
  - A few skb drop reason improvements, including codegen the source file
    with string mapping instead of using macro magic.
 
  - Rename reference tracking helpers to a more consistent
    netdev_* schema.
 
  - Adapt u64_stats_t type to address load/store tearing issues.
 
  - Refine debug helper usage to reduce the log noise caused by bots.
 
 BPF
 ---
  - Improve socket map performance, avoiding skb cloning on read
    operation.
 
  - Add support for 64 bits enum, to match types exposed by kernel.
 
  - Introduce support for sleepable uprobes program.
 
  - Introduce support for enum textual representation in libbpf.
 
  - New helpers to implement synproxy with eBPF/XDP.
 
  - Improve loop performances, inlining indirect calls when
    possible.
 
  - Removed all the deprecated libbpf APIs.
 
  - Implement new eBPF-based LSM flavor.
 
  - Add type match support, which allow accurate queries to the
    eBPF used types.
 
  - A few TCP congetsion control framework usability improvements.
 
  - Add new infrastructure to manipulate CT entries via eBPF programs.
 
  - Allow for livepatch (KLP) and BPF trampolines to attach to the same
    kernel function.
 
 Protocols
 ---------
 
  - Introduce per network namespace lookup tables for unix sockets,
    increasing scalability and reducing contention.
 
  - Preparation work for Wi-Fi 7 Multi-Link Operation (MLO) support.
 
  - Add support to forciby close TIME_WAIT TCP sockets via user-space
    tools.
 
  - Significant performance improvement for the TLS 1.3 receive path,
    both for zero-copy and not-zero-copy.
 
  - Support for changing the initial MTPCP subflow priority/backup
    status
 
  - Introduce virtually contingus buffers for sockets over RDMA,
    to cope better with memory pressure.
 
  - Extend CAN ethtool support with timestamping capabilities
 
  - Refactor CAN build infrastructure to allow building only the needed
    features.
 
 Driver API
 ----------
 
  - Remove devlink mutex to allow parallel commands on multiple links.
 
  - Add support for pause stats in distributed switch.
 
  - Implement devlink helpers to query and flash line cards.
 
  - New helper for phy mode to register conversion.
 
 New hardware / drivers
 ----------------------
 
  - Ethernet DSA driver for the rockchip mt7531 on BPI-R2 Pro.
 
  - Ethernet DSA driver for the Renesas RZ/N1 A5PSW switch.
 
  - Ethernet DSA driver for the Microchip LAN937x switch.
 
  - Ethernet PHY driver for the Aquantia AQR113C EPHY.
 
  - CAN driver for the OBD-II ELM327 interface.
 
  - CAN driver for RZ/N1 SJA1000 CAN controller.
 
  - Bluetooth: Infineon CYW55572 Wi-Fi plus Bluetooth combo device.
 
 Drivers
 -------
 
  - Intel Ethernet NICs:
    - i40e: add support for vlan pruning
    - i40e: add support for XDP framented packets
    - ice: improved vlan offload support
    - ice: add support for PPPoE offload
 
  - Mellanox Ethernet (mlx5)
    - refactor packet steering offload for performance and scalability
    - extend support for TC offload
    - refactor devlink code to clean-up the locking schema
    - support stacked vlans for bridge offloads
    - use TLS objects pool to improve connection rate
 
  - Netronome Ethernet NICs (nfp):
    - extend support for IPv6 fields mangling offload
    - add support for vepa mode in HW bridge
    - better support for virtio data path acceleration (VDPA)
    - enable TSO by default
 
  - Microsoft vNIC driver (mana)
    - add support for XDP redirect
 
  - Others Ethernet drivers:
    - bonding: add per-port priority support
    - microchip lan743x: extend phy support
    - Fungible funeth: support UDP segmentation offload and XDP xmit
    - Solarflare EF100: add support for virtual function representors
    - MediaTek SoC: add XDP support
 
  - Mellanox Ethernet/IB switch (mlxsw):
    - dropped support for unreleased H/W (XM router).
    - improved stats accuracy
    - unified bridge model coversion improving scalability
      (parts 1-6)
    - support for PTP in Spectrum-2 asics
 
  - Broadcom PHYs
    - add PTP support for BCM54210E
    - add support for the BCM53128 internal PHY
 
  - Marvell Ethernet switches (prestera):
    - implement support for multicast forwarding offload
 
  - Embedded Ethernet switches:
    - refactor OcteonTx MAC filter for better scalability
    - improve TC H/W offload for the Felix driver
    - refactor the Microchip ksz8 and ksz9477 drivers to share
      the probe code (parts 1, 2), add support for phylink
      mac configuration
 
  - Other WiFi:
    - Microchip wilc1000: diable WEP support and enable WPA3
    - Atheros ath10k: encapsulation offload support
 
 Old code removal:
 
  - Neterion vxge ethernet driver: this is untouched since more than
    10 years.
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmLqN+oSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkB9kQAI9VqW0c3SfiTJnkVBEIovZ6Tnh5stD2
 UYFkh1BdchLsYxi7W4XMpVPSzRztiTP87mIx5c/KvIzj+QNeWL1XWRJSPdI9HhTD
 pTAA/tM2OG7bqrbyQiKDNfpQdNl7+kk1RwnYd+f9RFl1QVuIJaYhmjVwrsN5xF/+
 jUsotpROarM2dGFWiFwJbKhP2zMDT+6qEEahM8pEPggKhv8wRLYjany2cZVEe4e0
 WGUpbINAS8gEKm0Ob922WaDfDrcK/N1Z0jNz/kMaENkK18Vvc7F6bCO0DzAawKX9
 QZMMwm6mHp3EThflJAMAzCGIYiIcwLhykgdyj8rrjPhFrWbMD2Sdsbo21HOXU/8j
 u4aAhVl+d+h7emmbgBoJ8sycVJ7BQlXz7lX20sTgADv9xI4/dPhQ17CMRuwX6fXX
 JSrn6P6e1LTV5CEg6vrlSPnKPY6uhFn/cPw47FxCjRwJ9phVnp+8uZWQmf9Pz3yf
 Ok/tcj+juFbsmuOshHy2cbRkuNZNS0oRWlSTBo5795ZwOLSakMonR3L+ev2aOvzz
 DVrFp2Y/iIVwMSFdCbouYdYnhArPRhOAtCmZc2afY8aBN7aaMgrdTy3+mzUoHy3I
 FG3K+VuKpfi0vY4zn6ZoLZDIpyXIoJJ93RcSGltD32t3Dp1RaQMVEI4s45k05PVm
 1nYpXKHA8qML
 =hxEG
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking changes from Paolo Abeni:
 "Core:

   - Refactor the forward memory allocation to better cope with memory
     pressure with many open sockets, moving from a per socket cache to
     a per-CPU one

   - Replace rwlocks with RCU for better fairness in ping, raw sockets
     and IP multicast router.

   - Network-side support for IO uring zero-copy send.

   - A few skb drop reason improvements, including codegen the source
     file with string mapping instead of using macro magic.

   - Rename reference tracking helpers to a more consistent netdev_*
     schema.

   - Adapt u64_stats_t type to address load/store tearing issues.

   - Refine debug helper usage to reduce the log noise caused by bots.

  BPF:

   - Improve socket map performance, avoiding skb cloning on read
     operation.

   - Add support for 64 bits enum, to match types exposed by kernel.

   - Introduce support for sleepable uprobes program.

   - Introduce support for enum textual representation in libbpf.

   - New helpers to implement synproxy with eBPF/XDP.

   - Improve loop performances, inlining indirect calls when possible.

   - Removed all the deprecated libbpf APIs.

   - Implement new eBPF-based LSM flavor.

   - Add type match support, which allow accurate queries to the eBPF
     used types.

   - A few TCP congetsion control framework usability improvements.

   - Add new infrastructure to manipulate CT entries via eBPF programs.

   - Allow for livepatch (KLP) and BPF trampolines to attach to the same
     kernel function.

  Protocols:

   - Introduce per network namespace lookup tables for unix sockets,
     increasing scalability and reducing contention.

   - Preparation work for Wi-Fi 7 Multi-Link Operation (MLO) support.

   - Add support to forciby close TIME_WAIT TCP sockets via user-space
     tools.

   - Significant performance improvement for the TLS 1.3 receive path,
     both for zero-copy and not-zero-copy.

   - Support for changing the initial MTPCP subflow priority/backup
     status

   - Introduce virtually contingus buffers for sockets over RDMA, to
     cope better with memory pressure.

   - Extend CAN ethtool support with timestamping capabilities

   - Refactor CAN build infrastructure to allow building only the needed
     features.

  Driver API:

   - Remove devlink mutex to allow parallel commands on multiple links.

   - Add support for pause stats in distributed switch.

   - Implement devlink helpers to query and flash line cards.

   - New helper for phy mode to register conversion.

  New hardware / drivers:

   - Ethernet DSA driver for the rockchip mt7531 on BPI-R2 Pro.

   - Ethernet DSA driver for the Renesas RZ/N1 A5PSW switch.

   - Ethernet DSA driver for the Microchip LAN937x switch.

   - Ethernet PHY driver for the Aquantia AQR113C EPHY.

   - CAN driver for the OBD-II ELM327 interface.

   - CAN driver for RZ/N1 SJA1000 CAN controller.

   - Bluetooth: Infineon CYW55572 Wi-Fi plus Bluetooth combo device.

  Drivers:

   - Intel Ethernet NICs:
      - i40e: add support for vlan pruning
      - i40e: add support for XDP framented packets
      - ice: improved vlan offload support
      - ice: add support for PPPoE offload

   - Mellanox Ethernet (mlx5)
      - refactor packet steering offload for performance and scalability
      - extend support for TC offload
      - refactor devlink code to clean-up the locking schema
      - support stacked vlans for bridge offloads
      - use TLS objects pool to improve connection rate

   - Netronome Ethernet NICs (nfp):
      - extend support for IPv6 fields mangling offload
      - add support for vepa mode in HW bridge
      - better support for virtio data path acceleration (VDPA)
      - enable TSO by default

   - Microsoft vNIC driver (mana)
      - add support for XDP redirect

   - Others Ethernet drivers:
      - bonding: add per-port priority support
      - microchip lan743x: extend phy support
      - Fungible funeth: support UDP segmentation offload and XDP xmit
      - Solarflare EF100: add support for virtual function representors
      - MediaTek SoC: add XDP support

   - Mellanox Ethernet/IB switch (mlxsw):
      - dropped support for unreleased H/W (XM router).
      - improved stats accuracy
      - unified bridge model coversion improving scalability (parts 1-6)
      - support for PTP in Spectrum-2 asics

   - Broadcom PHYs
      - add PTP support for BCM54210E
      - add support for the BCM53128 internal PHY

   - Marvell Ethernet switches (prestera):
      - implement support for multicast forwarding offload

   - Embedded Ethernet switches:
      - refactor OcteonTx MAC filter for better scalability
      - improve TC H/W offload for the Felix driver
      - refactor the Microchip ksz8 and ksz9477 drivers to share the
        probe code (parts 1, 2), add support for phylink mac
        configuration

   - Other WiFi:
      - Microchip wilc1000: diable WEP support and enable WPA3
      - Atheros ath10k: encapsulation offload support

  Old code removal:

   - Neterion vxge ethernet driver: this is untouched since more than 10 years"

* tag 'net-next-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1890 commits)
  doc: sfp-phylink: Fix a broken reference
  wireguard: selftests: support UML
  wireguard: allowedips: don't corrupt stack when detecting overflow
  wireguard: selftests: update config fragments
  wireguard: ratelimiter: use hrtimer in selftest
  net/mlx5e: xsk: Discard unaligned XSK frames on striding RQ
  net: usb: ax88179_178a: Bind only to vendor-specific interface
  selftests: net: fix IOAM test skip return code
  net: usb: make USB_RTL8153_ECM non user configurable
  net: marvell: prestera: remove reduntant code
  octeontx2-pf: Reduce minimum mtu size to 60
  net: devlink: Fix missing mutex_unlock() call
  net/tls: Remove redundant workqueue flush before destroy
  net: txgbe: Fix an error handling path in txgbe_probe()
  net: dsa: Fix spelling mistakes and cleanup code
  Documentation: devlink: add add devlink-selftests to the table of contents
  dccp: put dccp_qpolicy_full() and dccp_qpolicy_push() in the same lock
  net: ionic: fix error check for vlan flags in ionic_set_nic_features()
  net: ice: fix error NETIF_F_HW_VLAN_CTAG_FILTER check in ice_vsi_sync_fltr()
  nfp: flower: add support for tunnel offload without key ID
  ...
2022-08-03 16:29:08 -07:00
Linus Torvalds
353767e4aa for-5.20-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmLnyNUACgkQxWXV+ddt
 WDt9vA/9HcF+v5EkknyW07tatTap/Hm/ZB86Z5OZi6ikwIEcHsWhp3rUICejm88e
 GecDPIluDtCtyD6x4stuqkwOm22aDP5q2T9H6+gyw92ozyb436OV1Z8IrmftzXKY
 EpZO70PHZT+E6E/WYvyoTmmoCrjib7YlqCWZZhSLUFpsqqlOInmHEH49PW6KvM4r
 acUZ/RxHurKdmI3kNY6ECbAQl6CASvtTdYcVCx8fT2zN0azoLIQxpYa7n/9ca1R6
 8WnYilCbLbNGtcUXvO2M3tMZ4/5kvxrwQsUn93ccCJYuiN0ASiDXbLZ2g4LZ+n56
 JGu+y5v5oBwjpVf+46cuvnENP5BQ61594WPseiVjrqODWnPjN28XkcVC0XmPsiiZ
 lszeHO2cuIrIFoCah8ELMl8usu8+qxfXmPxIXtPu9rEyKsDtOjxVYc8SMXqLp0qQ
 qYtBoFm0JcZHqtZRpB+dhQ37/xXtH4ljUi/mI6x8iALVujeR273URs7yO9zgIdeW
 uZoFtbwpHFLUk+TL7Ku82/zOXp3fCwtDpNmlYbxeMbea/be3ShjncM4+mYzvHYri
 dYON2LFrq+mnRDqtIXTCaAYwX7zU8Y18Ev9QwlNll8dKlKwS89+jpqLoa+eVYy3c
 /HitHFza70KxmOj4dvDVZlzDpPvl7kW1UBkmskg4u3jnNWzedkM=
 =sS1q
 -----END PGP SIGNATURE-----

Merge tag 'for-5.20-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
 "This brings some long awaited changes, the send protocol bump,
  otherwise lots of small improvements and fixes. The main core part is
  reworking bio handling, cleaning up the submission and endio and
  improving error handling.

  There are some changes outside of btrfs adding helpers or updating
  API, listed at the end of the changelog.

  Features:

   - sysfs:
      - export chunk size, in debug mode add tunable for setting its size
      - show zoned among features (was only in debug mode)
      - show commit stats (number, last/max/total duration)

   - send protocol updated to 2
      - new commands:
         - ability write larger data chunks than 64K
         - send raw compressed extents (uses the encoded data ioctls),
           ie. no decompression on send side, no compression needed on
           receive side if supported
         - send 'otime' (inode creation time) among other timestamps
         - send file attributes (a.k.a file flags and xflags)
      - this is first version bump, backward compatibility on send and
        receive side is provided
      - there are still some known and wanted commands that will be
        implemented in the near future, another version bump will be
        needed, however we want to minimize that to avoid causing
        usability issues

   - print checksum type and implementation at mount time

   - don't print some messages at mount (mentioned as people asked about
     it), we want to print messages namely for new features so let's
     make some space for that
      - big metadata - this has been supported for a long time and is
        not a feature that's worth mentioning
      - skinny metadata - same reason, set by default by mkfs

  Performance improvements:

   - reduced amount of reserved metadata for delayed items
      - when inserted items can be batched into one leaf
      - when deleting batched directory index items
      - when deleting delayed items used for deletion
      - overall improved count of files/sec, decreased subvolume lock
        contention

   - metadata item access bounds checker micro-optimized, with a few
     percent of improved runtime for metadata-heavy operations

   - increase direct io limit for read to 256 sectors, improved
     throughput by 3x on sample workload

  Notable fixes:

   - raid56
      - reduce parity writes, skip sectors of stripe when there are no
        data updates
      - restore reading from on-disk data instead of using stripe cache,
        this reduces chances to damage correct data due to RMW cycle

   - refuse to replay log with unknown incompat read-only feature bit
     set

   - zoned
      - fix page locking when COW fails in the middle of allocation
      - improved tracking of active zones, ZNS drives may limit the
        number and there are ENOSPC errors due to that limit and not
        actual lack of space
      - adjust maximum extent size for zone append so it does not cause
        late ENOSPC due to underreservation

   - mirror reading error messages show the mirror number

   - don't fallback to buffered IO for NOWAIT direct IO writes, we don't
     have the NOWAIT semantics for buffered io yet

   - send, fix sending link commands for existing file paths when there
     are deleted and created hardlinks for same files

   - repair all mirrors for profiles with more than 1 copy (raid1c34)

   - fix repair of compressed extents, unify where error detection and
     repair happen

  Core changes:

   - bio completion cleanups
      - don't double defer compression bios
      - simplify endio workqueues
      - add more data to btrfs_bio to avoid allocation for read requests
      - rework bio error handling so it's same what block layer does,
        the submission works and errors are consumed in endio
      - when asynchronous bio offload fails fall back to synchronous
        checksum calculation to avoid errors under writeback or memory
        pressure

   - new trace points
      - raid56 events
      - ordered extent operations

   - super block log_root_transid deprecated (never used)

   - mixed_backref and big_metadata sysfs feature files removed, they've
     been default for sufficiently long time, there are no known users
     and mixed_backref could be confused with mixed_groups

  Non-btrfs changes, API updates:

   - minor highmem API update to cover const arguments

   - switch all kmap/kmap_atomic to kmap_local

   - remove redundant flush_dcache_page()

   - address_space_operations::writepage callback removed

   - add bdev_max_segments() helper"

* tag 'for-5.20-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (163 commits)
  btrfs: don't call btrfs_page_set_checked in finish_compressed_bio_read
  btrfs: fix repair of compressed extents
  btrfs: remove the start argument to check_data_csum and export
  btrfs: pass a btrfs_bio to btrfs_repair_one_sector
  btrfs: simplify the pending I/O counting in struct compressed_bio
  btrfs: repair all known bad mirrors
  btrfs: merge btrfs_dev_stat_print_on_error with its only caller
  btrfs: join running log transaction when logging new name
  btrfs: simplify error handling in btrfs_lookup_dentry
  btrfs: send: always use the rbtree based inode ref management infrastructure
  btrfs: send: fix sending link commands for existing file paths
  btrfs: send: introduce recorded_ref_alloc and recorded_ref_free
  btrfs: zoned: wait until zone is finished when allocation didn't progress
  btrfs: zoned: write out partially allocated region
  btrfs: zoned: activate necessary block group
  btrfs: zoned: activate metadata block group on flush_space
  btrfs: zoned: disable metadata overcommit for zoned
  btrfs: zoned: introduce space_info->active_total_bytes
  btrfs: zoned: finish least available block group on data bg allocation
  btrfs: let can_allocate_chunk return error
  ...
2022-08-03 14:54:52 -07:00
Kajetan Puchalski
6ab4b19900 cpuidle: Add cpu_idle_miss trace event
Add a trace event for cpuidle to track missed (too deep or too shallow)
wakeups.

After each wakeup, CPUIdle already computes whether the entered state was
optimal, above or below the desired one and updates the relevant
counters. This patch makes it possible to trace those events in addition
to just reading the counters.

The patterns of types and percentages of misses across different
workloads appear to be very consistent. This makes the trace event very
useful for comparing the relative correctness of different CPUIdle
governors for different types of workloads, or for finding the
optimal governor for a given device.

Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-08-03 17:50:58 +02:00
Linus Torvalds
c013d0af81 for-5.20/block-2022-07-29
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmLko3gQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpmQaD/90NKFj4v8I456TUQyg1jimXEsL+e84E6o2
 ALWVb6JzQvlPVQXNLnK5YKIunMWOTtTMz0nyB8sVRwVJVJO0P5d7QopAkZM8fkyU
 MK5OCzoryENw4DTc2wJS4in6cSbGylIuN74wMzlf7+M67JTImfoZQhbTMcjwzZfn
 b3OlL6sID7zMXwGcuOJPZyUJICCpDhzdSF9JXqKma5PQuG2SBmQyvFxJAcsoFBPc
 YetnoRIOIN6yBvsIZaPaYq7XI9MIvF0e67EQtyCEHj4tHpyVnyDWkeObVFULsISU
 gGEKbkYPvNUzRAU5Q1NBBHh1tTfkf/MaUxTuZwoEwZ/s04IGBGMmrZGyfvdfzYo6
 M7NwSEg/TrUSNfTwn65mQi7uOXu1pGkJrqz84Flm8u9Qid9Vd7LExLG5p/ggnWdH
 5th93MDEmtEg29e9DXpEAuS5d0t3TtSvosflaKpyfNNfr+P0rWCN6GM/uW62VUTK
 ls69SQh/AQJRbg64jU4xper6WhaYtSXK7TKEnxJycoEn9gYNyCcdot2uekth0xRH
 ChHGmRlteiqe/y4uFWn/2dcxWjoleiHbFjTaiRL75WVl8wIDEjw02LGuoZ61Ss9H
 WOV+MT7KqNjBGe6lreUY+O/PO02dzmoR6heJXN19p8zr/pBuLCTGX7UpO7rzgaBR
 4N1HEozvIw==
 =celk
 -----END PGP SIGNATURE-----

Merge tag 'for-5.20/block-2022-07-29' of git://git.kernel.dk/linux-block

Pull block updates from Jens Axboe:

 - Improve the type checking of request flags (Bart)

 - Ensure queue mapping for a single queues always picks the right queue
   (Bart)

 - Sanitize the io priority handling (Jan)

 - rq-qos race fix (Jinke)

 - Reserved tags handling improvements (John)

 - Separate memory alignment from file/disk offset aligment for O_DIRECT
   (Keith)

 - Add new ublk driver, userspace block driver using io_uring for
   communication with the userspace backend (Ming)

 - Use try_cmpxchg() to cleanup the code in various spots (Uros)

 - Finally remove bdevname() (Christoph)

 - Clean up the zoned device handling (Christoph)

 - Clean up independent access range support (Christoph)

 - Clean up and improve block sysfs handling (Christoph)

 - Clean up and improve teardown of block devices.

   This turns the usual two step process into something that is simpler
   to implement and handle in block drivers (Christoph)

 - Clean up chunk size handling (Christoph)

 - Misc cleanups and fixes (Bart, Bo, Dan, GuoYong, Jason, Keith, Liu,
   Ming, Sebastian, Yang, Ying)

* tag 'for-5.20/block-2022-07-29' of git://git.kernel.dk/linux-block: (178 commits)
  ublk_drv: fix double shift bug
  ublk_drv: make sure that correct flags(features) returned to userspace
  ublk_drv: fix error handling of ublk_add_dev
  ublk_drv: fix lockdep warning
  block: remove __blk_get_queue
  block: call blk_mq_exit_queue from disk_release for never added disks
  blk-mq: fix error handling in __blk_mq_alloc_disk
  ublk: defer disk allocation
  ublk: rewrite ublk_ctrl_get_queue_affinity to not rely on hctx->cpumask
  ublk: fold __ublk_create_dev into ublk_ctrl_add_dev
  ublk: cleanup ublk_ctrl_uring_cmd
  ublk: simplify ublk_ch_open and ublk_ch_release
  ublk: remove the empty open and release block device operations
  ublk: remove UBLK_IO_F_PREFLUSH
  ublk: add a MAINTAINERS entry
  block: don't allow the same type rq_qos add more than once
  mmc: fix disk/queue leak in case of adding disk failure
  ublk_drv: fix an IS_ERR() vs NULL check
  ublk: remove UBLK_IO_F_INTEGRITY
  ublk_drv: remove unneeded semicolon
  ...
2022-08-02 13:46:35 -07:00
Linus Torvalds
98e2474640 for-5.20/io_uring-buffered-writes-2022-07-29
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmLkm7UQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpldTEADTg/96R+eq78UZBNZmdifY9/qwQD+kzNiK
 ACDoYZFSbWUjMeOWqRxbYr6mXBKHnGHyTGlraTTpLDzhpB1xwoWfgOK9uOYXW/Ik
 eWfgTujPW/8v/l/z86khE+GH9b/maGCRqNZgS6uLVLzhxG6oCkoYTyOh1iHaF1VM
 Rma4nbJ8GSEDtiXNDl0Bznnyks/pzwoz/9slwzZ7PxtFwZsBxKuxgMUR5HIXdRp7
 5iUoFJhZrGWyi/dbQZUsK/9VYVVnKkcBCz2pb4GEmC+3dS/vlPEoeWUpPHInNyd1
 9NB9v8c+KFmQaWnCxuxcdHvCfmRRQrX8Pr8/OBNZKO6McYrKWKA+lurp4EGClE3m
 cZdK+P/9FS/Eeua8hum9UnbPAqsJPqLTbpbrySeBdd4iFA6u7rRqDX2+nz3PNe9U
 1b7V1bWBIEY/Rsw/PKo59oIeV0auD8v9OCHJ0lF2pv6dRln2/W0y1Qfd1DI18xFG
 +9bBnQzhF7R0O8UP5ApVayQCYrd906YsSVUOqAiLmUs/BoOgRq6g/0BqSOVVKE2u
 5iq8zTsVMkxY0ZpExwZST/700JwkPIV4SVPEYRC6QssFTcylvlisIek6XYSS9HX4
 Z6gzMwJW1H47bEfG4JolTI8uBjp0hQLCPX0O0XFLVnbHQwN0kjIBmv3axAwJO2NV
 qrrHXjf09w==
 =hV7G
 -----END PGP SIGNATURE-----

Merge tag 'for-5.20/io_uring-buffered-writes-2022-07-29' of git://git.kernel.dk/linux-block

Pull io_uring buffered writes support from Jens Axboe:
 "This contains support for buffered writes, specifically for XFS. btrfs
  is in progress, will be coming in the next release.

  io_uring does support buffered writes on any file type, but since the
  buffered write path just always -EAGAIN (or -EOPNOTSUPP) any attempt
  to do so if IOCB_NOWAIT is set, any buffered write will effectively be
  handled by io-wq offload. This isn't very efficient, and we even have
  specific code in io-wq to serialize buffered writes to the same inode
  to avoid further inefficiencies with thread offload.

  This is particularly sad since most buffered writes don't block, they
  simply copy data to a page and dirty it. With this pull request, we
  can handle buffered writes a lot more effiently.

  If balance_dirty_pages() needs to block, we back off on writes as
  indicated.

  This improves buffered write support by 2-3x.

  Jan Kara helped with the mm bits for this, and Stefan handled the
  fs/iomap/xfs/io_uring parts of it"

* tag 'for-5.20/io_uring-buffered-writes-2022-07-29' of git://git.kernel.dk/linux-block:
  mm: honor FGP_NOWAIT for page cache page allocation
  xfs: Add async buffered write support
  xfs: Specify lockmode when calling xfs_ilock_for_iomap()
  io_uring: Add tracepoint for short writes
  io_uring: fix issue with io_write() not always undoing sb_start_write()
  io_uring: Add support for async buffered writes
  fs: Add async write file modification handling.
  fs: Split off inode_needs_update_time and __file_update_time
  fs: add __remove_file_privs() with flags parameter
  fs: add a FMODE_BUF_WASYNC flags for f_mode
  iomap: Return -EAGAIN from iomap_write_iter()
  iomap: Add async buffered write support
  iomap: Add flags parameter to iomap_page_create()
  mm: Add balance_dirty_pages_ratelimited_flags() function
  mm: Move updates of dirty_exceeded into one place
  mm: Move starting of background writeback into the main balancing loop
2022-08-02 13:27:23 -07:00
Linus Torvalds
b349b1181d for-5.20/io_uring-2022-07-29
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmLkm5gQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpmKMD/4l3QIrLbjYIxlfrzQcHbmYuUkbQtj3SbZg
 6ejbnGVhCs1P9DdXH8MgE2BxgpiXQE0CqOK7vbSoo5ep2n2UTLI2DIxAl74SMIo7
 0wmJXtUJySuViKr3NYVHqlN180MkQYddBz0nGElhkQBPBCMhW8CrtPCeURr/YyHp
 2RxSYBXiUx2gRyig+klnp6oPEqelcBZJUyNHdA9yVrgl/RhB/t2rKj7D++8ukQM3
 Zuyh8WIkTeTfUz9hdGG7fuCEdZN4DlO2CCEc7uy0cKi6VRCKH4hYUCqClJ+/cfd2
 43dUI2O7B6D1t/ObFh8AGIDXBDqVA6ePQohQU6gooRkfQiBPKkc9d0ts4yIhRqca
 AjkzNM+0Eve3A01loJ8J84w8oZnvNpYEv5n8/sZVLWcyU3UIs0I88nC2OBiFtoRq
 d77CtFLwOTo+r3STtAhnZOqez90rhS6BqKtqlUP346PCuFItl6/MbGtwdTbLYEFj
 CVNIb2pERWSr2NxGv4lFyXaX/cRwruxojWH7yc3rRYjr4Ykevd1pe/fMGNiMAnKw
 5em/3QU3qq0ZVcXLMihksKeHHFIQwGDRMuyuv/fktV10+yYXQ0t16WzkJT3aR8Xo
 cqs0r8+6Jnj3uYcOMzj/FoLcpEPr21hnwAtzLto1mG1Wh4JRn/D7Nx5zqxPLxcW+
 NiU6VihPOw==
 =gxeV
 -----END PGP SIGNATURE-----

Merge tag 'for-5.20/io_uring-2022-07-29' of git://git.kernel.dk/linux-block

Pull io_uring updates from Jens Axboe:

 - As per (valid) complaint in the last merge window, fs/io_uring.c has
   grown quite large these days. io_uring isn't really tied to fs
   either, as it supports a wide variety of functionality outside of
   that.

   Move the code to io_uring/ and split it into files that either
   implement a specific request type, and split some code into helpers
   as well. The code is organized a lot better like this, and io_uring.c
   is now < 4K LOC (me).

 - Deprecate the epoll_ctl opcode. It'll still work, just trigger a
   warning once if used. If we don't get any complaints on this, and I
   don't expect any, then we can fully remove it in a future release
   (me).

 - Improve the cancel hash locking (Hao)

 - kbuf cleanups (Hao)

 - Efficiency improvements to the task_work handling (Dylan, Pavel)

 - Provided buffer improvements (Dylan)

 - Add support for recv/recvmsg multishot support. This is similar to
   the accept (or poll) support for have for multishot, where a single
   SQE can trigger everytime data is received. For applications that
   expect to do more than a few receives on an instantiated socket, this
   greatly improves efficiency (Dylan).

 - Efficiency improvements for poll handling (Pavel)

 - Poll cancelation improvements (Pavel)

 - Allow specifiying a range for direct descriptor allocations (Pavel)

 - Cleanup the cqe32 handling (Pavel)

 - Move io_uring types to greatly cleanup the tracing (Pavel)

 - Tons of great code cleanups and improvements (Pavel)

 - Add a way to do sync cancelations rather than through the sqe -> cqe
   interface, as that's a lot easier to use for some use cases (me).

 - Add support to IORING_OP_MSG_RING for sending direct descriptors to a
   different ring. This avoids the usually problematic SCM case, as we
   disallow those. (me)

 - Make the per-command alloc cache we use for apoll generic, place
   limits on it, and use it for netmsg as well (me).

 - Various cleanups (me, Michal, Gustavo, Uros)

* tag 'for-5.20/io_uring-2022-07-29' of git://git.kernel.dk/linux-block: (172 commits)
  io_uring: ensure REQ_F_ISREG is set async offload
  net: fix compat pointer in get_compat_msghdr()
  io_uring: Don't require reinitable percpu_ref
  io_uring: fix types in io_recvmsg_multishot_overflow
  io_uring: Use atomic_long_try_cmpxchg in __io_account_mem
  io_uring: support multishot in recvmsg
  net: copy from user before calling __get_compat_msghdr
  net: copy from user before calling __copy_msghdr
  io_uring: support 0 length iov in buffer select in compat
  io_uring: fix multishot ending when not polled
  io_uring: add netmsg cache
  io_uring: impose max limit on apoll cache
  io_uring: add abstraction around apoll cache
  io_uring: move apoll cache to poll.c
  io_uring: consolidate hash_locked io-wq handling
  io_uring: clear REQ_F_HASH_LOCKED on hash removal
  io_uring: don't race double poll setting REQ_F_ASYNC_DATA
  io_uring: don't miss setting REQ_F_DOUBLE_POLL
  io_uring: disable multishot recvmsg
  io_uring: only trace one of complete or overflow
  ...
2022-08-02 13:20:44 -07:00
Steven Rostedt (Google)
09794a5a6c tracing: Use alignof__(struct {type b;}) instead of offsetof()
Simplify:

  #define ALIGN_STRUCTFIELD(type) ((int)(offsetof(struct {char a; type b;}, b)))

with

  #define  ALIGN_STRUCTFIELD(type) __alignof__(struct {type b;})

Which works just the same.

Link: https://lore.kernel.org/all/a7d202457150472588df0bd3b7334b3f@AcuMS.aculab.com/
Link: https://lkml.kernel.org/r/20220802154412.513c50e3@gandalf.local.home

Suggested-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-08-02 16:01:12 -04:00
Linus Torvalds
c1dbe9a1c8 Thermal control updates for 5.20-rc1
- Consolidate the thermal core code by beginning to move the thermal
    trip structure from the thermal OF code as a generic structure to be
    used by the different sensors when registering a thermal zone
    (Daniel Lezcano).
 
  - Make per cpufreq / devfreq cooling device ops instead of using a
    global variable, fix comments and rework the trace information
    (Lukasz Luba).
 
  - Add the include/dt-bindings/thermal.h under the area covered by the
    thermal maintainer in the MAINTAINERS file (Lukas Bulwahn).
 
  - Improve the error output by giving the sensor identification when a
    thermal zone failed to initialize, the DT bindings by changing the
    positive logic and adding the r8a779f0 support on the rcar3 (Wolfram
    Sang).
 
  - Convert the QCom tsens DT binding to the dtsformat format (Krzysztof
    Kozlowski).
 
  - Remove the pointless get_trend() function in the QCom, Ux500 and
    tegra thermal drivers, along with the unused DROP_FULL and
    RAISE_FULL trends definitions. Simplify the code by using clamp()
    macros (Daniel Lezcano).
 
  - Fix ref_table memory leak at probe time on the k3_j72xx bandgap
    (Bryan Brattlof).
 
  - Fix array underflow in prep_lookup_table (Dan Carpenter).
 
  - Add static annotation to the k3_j72xx_bandgap_j7* data structure
    (Jin Xiaoyun).
 
  - Fix typos in comments detected on sun8i by Coccinelle (Julia
    Lawall).
 
  - Fix typos in comments on rzg2l (Biju Das).
 
  - Remove as unnecessary call to dev_err() as the error is already
    printed by the failing function on u8500 (Yang Li).
 
  - Register the thermal zones as hwmon sensors for the Qcom thermal
    sensors (Dmitry Baryshkov).
 
  - Fix 'tmon' tool compilation issue by adding phtread.h include
    (Markus Mayer).
 
  - Fix typo in the comments for the 'tmon' tool (Slark Xiao).
 
  - Make the thermal core use ida_alloc()/free() directly instead of
    ida_simple_get()/ida_simple_remove() that have been deprecated
    (keliu).
 
  - Drop ACPI_FADT_LOW_POWER_S0 check from the Intel PCH thermal control
    driver (Rafael Wysocki).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmLoK5ASHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxa0cQAJsl3wDxkDbvfEENZ1VSdfeH3qXbUSSE
 EEo0j4X85JE1F1NwT8R2tb4D/YMJDT3p6I55twrVLvxNUdTnx7ybRfXem24uXkK5
 xOybfsuYsSWXxaEfI4260GBzY6ijTR7uWYyDLPN3vvbW3FdMj+nni0D9uTySw7UL
 ecIe1ISn3nxbbp0FxYh+n88+718HWKo07BaTE4TyKeUgQHw+v7HHtCZq7Rdoogm8
 cp6tTkJ8ymrHoEvAWBIcO58zCx7LkSFeU69oMm4CUzVjxWdFfREb079F5cZ92GXr
 ex70r/gKfFAd5GAAdL0WjeS4RwHKta49WKqAMA7w41nIgDj0IA2gJRowfJvKDkF+
 JgcQ7OrJ5eo5jCr4pbycgQ9Lh23zBQe/3LH+yV71KlKiLf6/Tl5rhELfBNbZmraZ
 HOvD5dAxBLySmANN2VX7DJgtbTcinneL9BDVo6dBTdYaWC4jQxXYm73n66nkZdS7
 BDJ0N2P0uZ7NGLawXwrrsMi8xbIApMw4W/o8SN9R4FF1LqIroDg60kLJ9zO+6IhI
 xF8ZtcMdyPVa71fSZNwD0+mz2sF6XnTucf88CjxzVdAxbvNVPQEvKufThWTreyuU
 pjBPtf1YFOFz9CusBYAplOIu96RqUgL1t1aqqwsCqXoUu4Lgh/pyksIDeam1l0EP
 Q5WBUB9bK8q8
 =wj9M
 -----END PGP SIGNATURE-----

Merge tag 'thermal-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control updates from Rafael Wysocki:
 "These start a rework of the handling of trip points in the thermal
  core, improve the cpufreq/devfreq cooling device handling, update some
  thermal control drivers and the tmon utility and clean up code.

  Specifics:

   - Consolidate the thermal core code by beginning to move the thermal
     trip structure from the thermal OF code as a generic structure to
     be used by the different sensors when registering a thermal zone
     (Daniel Lezcano).

   - Make per cpufreq / devfreq cooling device ops instead of using a
     global variable, fix comments and rework the trace information
     (Lukasz Luba).

   - Add the include/dt-bindings/thermal.h under the area covered by the
     thermal maintainer in the MAINTAINERS file (Lukas Bulwahn).

   - Improve the error output by giving the sensor identification when a
     thermal zone failed to initialize, the DT bindings by changing the
     positive logic and adding the r8a779f0 support on the rcar3
     (Wolfram Sang).

   - Convert the QCom tsens DT binding to the dtsformat format
     (Krzysztof Kozlowski).

   - Remove the pointless get_trend() function in the QCom, Ux500 and
     tegra thermal drivers, along with the unused DROP_FULL and
     RAISE_FULL trends definitions. Simplify the code by using clamp()
     macros (Daniel Lezcano).

   - Fix ref_table memory leak at probe time on the k3_j72xx bandgap
     (Bryan Brattlof).

   - Fix array underflow in prep_lookup_table (Dan Carpenter).

   - Add static annotation to the k3_j72xx_bandgap_j7* data structure
     (Jin Xiaoyun).

   - Fix typos in comments detected on sun8i by Coccinelle (Julia
     Lawall).

   - Fix typos in comments on rzg2l (Biju Das).

   - Remove as unnecessary call to dev_err() as the error is already
     printed by the failing function on u8500 (Yang Li).

   - Register the thermal zones as hwmon sensors for the Qcom thermal
     sensors (Dmitry Baryshkov).

   - Fix 'tmon' tool compilation issue by adding phtread.h include
     (Markus Mayer).

   - Fix typo in the comments for the 'tmon' tool (Slark Xiao).

   - Make the thermal core use ida_alloc()/free() directly instead of
     ida_simple_get()/ida_simple_remove() that have been deprecated
     (keliu).

   - Drop ACPI_FADT_LOW_POWER_S0 check from the Intel PCH thermal
     control driver (Rafael Wysocki)"

* tag 'thermal-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (39 commits)
  thermal/of: Initialize trip points separately
  thermal/of: Use thermal trips stored in the thermal zone
  thermal/core: Add thermal_trip in thermal_zone
  thermal/core: Rename 'trips' to 'num_trips'
  thermal/core: Move thermal_set_delay_jiffies to static
  thermal/core: Remove unneeded EXPORT_SYMBOLS
  thermal/of: Move thermal_trip structure to thermal.h
  thermal/of: Remove the device node pointer for thermal_trip
  thermal/of: Replace device node match with device node search
  thermal/core: Remove duplicate information when an error occurs
  thermal/core: Avoid calling ->get_trip_temp() unnecessarily
  thermal/tools/tmon: Fix typo 'the the' in comment
  thermal/tools/tmon: Include pthread and time headers in tmon.h
  thermal/ti-soc-thermal: Fix comment typo
  thermal/drivers/qcom/spmi-adc-tm5: Register thermal zones as hwmon sensors
  thermal/drivers/qcom/temp-alarm: Register thermal zones as hwmon sensors
  thermal/drivers/u8500: Remove unnecessary print function dev_err()
  thermal/drivers/rzg2l: Fix comments
  thermal/drivers/sun8i: Fix typo in comment
  thermal/drivers/k3_j72xx_bandgap: Make k3_j72xx_bandgap_j721e_data and k3_j72xx_bandgap_j7200_data static
  ...
2022-08-02 11:27:53 -07:00
Linus Torvalds
a771ea6413 Power management updates for 5.20-rc1
- Make cpufreq_show_cpus() more straightforward (Viresh Kumar).
 
  - Drop unnecessary CPU hotplug locking from store() used by cpufreq
    sysfs attributes (Viresh Kumar).
 
  - Make the ACPI cpufreq driver support the boost control interface on
    Zhaoxin/Centaur processors (Tony W Wang-oc).
 
  - Print a warning message on attempts to free an active cpufreq policy
    which should never happen (Viresh Kumar).
 
  - Fix grammar in the Kconfig help text for the loongson2 cpufreq
    driver (Randy Dunlap).
 
  - Use cpumask_var_t for an on-stack CPU mask in the ondemand cpufreq
    governor (Zhao Liu).
 
  - Add trace points for guest_halt_poll_ns grow/shrink to the haltpoll
    cpuidle driver (Eiichi Tsukata).
 
  - Modify intel_idle to treat C1 and C1E as independent idle states on
    Sapphire Rapids (Artem Bityutskiy).
 
  - Extend support for wakeirq to callback wrappers used during system
    suspend and resume (Ulf Hansson).
 
  - Defer waiting for device probe before loading a hibernation image
    till the first actual device access to avoid possible deadlocks
    reported by syzbot (Tetsuo Handa).
 
  - Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP (Bjorn
    Helgaas).
 
  - Add Raptor Lake-P to the list of processors supported by the Intel
    RAPL driver (George D Sworo).
 
  - Add Alder Lake-N and Raptor Lake-P to the list of processors for
    which Power Limit4 is supported in the Intel RAPL driver (Sumeet
    Pawnikar).
 
  - Make pm_genpd_remove() check genpd_debugfs_dir against NULL before
    attempting to remove it (Hsin-Yi Wang).
 
  - Change the Energy Model code to represent power in micro-Watts and
    adjust its users accordingly (Lukasz Luba).
 
  - Add new devfreq driver for Mediatek CCI (Cache Coherent
    Interconnect) (Johnson Wang).
 
  - Convert the Samsung Exynos SoC Bus bindings to DT schema of
    exynos-bus.c (Krzysztof Kozlowski).
 
  - Address kernel-doc warnings by adding the description for unused
    fucntion parameters in devfreq core (Mauro Carvalho Chehab).
 
  - Use NULL to pass a null pointer rather than zero according to the
    function propotype in imx-bus.c (Colin Ian King).
 
  - Print error message instead of error interger value in
    tegra30-devfreq.c (Dmitry Osipenko).
 
  - Add checks to prevent setting negative frequency QoS limits for
    CPUs (Shivnandan Kumar).
 
  - Update the pm-graph suite of utilities to the latest revision 5.9
    including multiple improvements (Todd Brandt).
 
  - Drop pme_interrupt reference from the PCI power management
    documentation (Mario Limonciello).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmLoKy8SHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx3+oQAJNVU+W14EaRPWXQRMuwBC5zk3hb6T9q
 JqmMd8coEd+9/4ABAeRAWso1B26rUzB6JyBvw3lGH9OXInpYmvnJEhEPrTpK2h0D
 U9HxEARuGJolrDm0X9NAkn7tKKMC9GnvPS9z2s7s+N97VFFWC/QiU+PHB0SypGNb
 JxRfbVJZQCuxmNG9UeK+xeHFQ9lM2Z9ZdTxR71G0n7nQPPR+sUvnFufFby3Aogf3
 XnBYfia+YNqkUlefxxwB5a0cFwPXOUGsQkIf4d64gZnq1TgZ+71kht1GEF08PDFl
 wV8v1rOWuXEae8dozuf5xszp/eVyAqzgB+IShT9APREOO3Wg6I16XdBm8R1TGwCK
 JTdZqnm6HVKBNqchEwYViJILX69rrNUT+AwHBWhtKKDNh3qeTuwi/JGTeDVN++en
 xf3TNKx3LV31Nq6nWJFzDGLehfZMnAPkhfYohUBI7FNyblpk4mJRVcZ0bYI7UNnS
 als77uoipvb5KdFCtdhxYBHd/y867NvXKa1qsAuDxusAsfJHf4SnlMdbgOepBH2y
 jJg06CGrMDU3TZ8BL+WpqUYk4irQnAMs/159Txh7A6/dOnTjE7S9NHrENCwmt2og
 FrHSLH1eLX6Sa4RSibiGHPC7mNULP2/TOtryf3zFdlIVcjm3NEU3bnfzx7nlJn05
 8t6ObMxgMhWT
 =XeLV
 -----END PGP SIGNATURE-----

Merge tag 'pm-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "These are mostly minor improvements all over including new CPU IDs for
  the Intel RAPL driver, an Energy Model rework to use micro-Watt as the
  power unit, cpufreq fixes and cleanus, cpuidle updates, devfreq
  updates, documentation cleanups and a new version of the pm-graph
  suite of utilities.

  Specifics:

   - Make cpufreq_show_cpus() more straightforward (Viresh Kumar).

   - Drop unnecessary CPU hotplug locking from store() used by cpufreq
     sysfs attributes (Viresh Kumar).

   - Make the ACPI cpufreq driver support the boost control interface on
     Zhaoxin/Centaur processors (Tony W Wang-oc).

   - Print a warning message on attempts to free an active cpufreq
     policy which should never happen (Viresh Kumar).

   - Fix grammar in the Kconfig help text for the loongson2 cpufreq
     driver (Randy Dunlap).

   - Use cpumask_var_t for an on-stack CPU mask in the ondemand cpufreq
     governor (Zhao Liu).

   - Add trace points for guest_halt_poll_ns grow/shrink to the haltpoll
     cpuidle driver (Eiichi Tsukata).

   - Modify intel_idle to treat C1 and C1E as independent idle states on
     Sapphire Rapids (Artem Bityutskiy).

   - Extend support for wakeirq to callback wrappers used during system
     suspend and resume (Ulf Hansson).

   - Defer waiting for device probe before loading a hibernation image
     till the first actual device access to avoid possible deadlocks
     reported by syzbot (Tetsuo Handa).

   - Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP (Bjorn
     Helgaas).

   - Add Raptor Lake-P to the list of processors supported by the Intel
     RAPL driver (George D Sworo).

   - Add Alder Lake-N and Raptor Lake-P to the list of processors for
     which Power Limit4 is supported in the Intel RAPL driver (Sumeet
     Pawnikar).

   - Make pm_genpd_remove() check genpd_debugfs_dir against NULL before
     attempting to remove it (Hsin-Yi Wang).

   - Change the Energy Model code to represent power in micro-Watts and
     adjust its users accordingly (Lukasz Luba).

   - Add new devfreq driver for Mediatek CCI (Cache Coherent
     Interconnect) (Johnson Wang).

   - Convert the Samsung Exynos SoC Bus bindings to DT schema of
     exynos-bus.c (Krzysztof Kozlowski).

   - Address kernel-doc warnings by adding the description for unused
     function parameters in devfreq core (Mauro Carvalho Chehab).

   - Use NULL to pass a null pointer rather than zero according to the
     function propotype in imx-bus.c (Colin Ian King).

   - Print error message instead of error interger value in
     tegra30-devfreq.c (Dmitry Osipenko).

   - Add checks to prevent setting negative frequency QoS limits for
     CPUs (Shivnandan Kumar).

   - Update the pm-graph suite of utilities to the latest revision 5.9
     including multiple improvements (Todd Brandt).

   - Drop pme_interrupt reference from the PCI power management
     documentation (Mario Limonciello)"

* tag 'pm-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (27 commits)
  powercap: RAPL: Add Power Limit4 support for Alder Lake-N and Raptor Lake-P
  PM: QoS: Add check to make sure CPU freq is non-negative
  PM: hibernate: defer device probing when resuming from hibernation
  intel_idle: make SPR C1 and C1E be independent
  cpufreq: ondemand: Use cpumask_var_t for on-stack cpu mask
  cpufreq: loongson2: fix Kconfig "its" grammar
  pm-graph v5.9
  cpufreq: Warn users while freeing active policy
  cpufreq: scmi: Support the power scale in micro-Watts in SCMI v3.1
  firmware: arm_scmi: Get detailed power scale from perf
  Documentation: EM: Switch to micro-Watts scale
  PM: EM: convert power field to micro-Watts precision and align drivers
  PM / devfreq: tegra30: Add error message for devm_devfreq_add_device()
  PM / devfreq: imx-bus: use NULL to pass a null pointer rather than zero
  PM / devfreq: shut up kernel-doc warnings
  dt-bindings: interconnect: samsung,exynos-bus: convert to dtschema
  PM / devfreq: mediatek: Introduce MediaTek CCI devfreq driver
  dt-bindings: interconnect: Add MediaTek CCI dt-bindings
  PM: domains: Ensure genpd_debugfs_dir exists before remove
  PM: runtime: Extend support for wakeirq for force_suspend|resume
  ...
2022-08-02 11:17:00 -07:00
David Howells
2757a4dc18 afs: Fix access after dec in put functions
Reference-putting functions should not access the object being put after
decrementing the refcount unless they reduce the refcount to zero.

Fix a couple of instances of this in afs by copying the information to be
logged by tracepoint to local variables before doing the decrement.

[Fixed a bit in afs_put_server() that I'd missed but Marc caught]

Fixes: 341f741f04 ("afs: Refcount the afs_call struct")
Fixes: 4521819369 ("afs: Trace afs_server usage")
Fixes: 977e5f8ed0 ("afs: Split the usage count on struct afs_server")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/165911278430.3745403.16526310736054780645.stgit@warthog.procyon.org.uk/ # v1
2022-08-02 18:21:29 +01:00
David Howells
c56f9ec8b2 afs: Use refcount_t rather than atomic_t
Use refcount_t rather than atomic_t in afs to make use of the count
checking facilities provided.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/165911277768.3745403.423349776836296452.stgit@warthog.procyon.org.uk/ # v1
2022-08-02 18:10:11 +01:00
Linus Torvalds
47b62edcd4 ARM: SoC drivers for 6.0
The SoC driver updates contain changes to improve support for
 additional SoC variants, as well as cleanups an minor bugfixes
 in a number of existing drivers.
 
 Notable updates this time include:
 
  - Support for Qualcomm MSM8909 (Snapdragon 210) in various drivers
 
  - Updates for interconnect drivers on Qualcomm Snapdragon
 
  - A new driver support for NMI interrupts on Fujitsu A64fx
 
  - A rework of Broadcom BCMBCA Kconfig dependencies
 
  - Improved support for BCM2711 (Raspberry Pi 4) power management
    to allow the use of the V3D GPU
 
  - Cleanups to the NXP guts driver
 
  - Arm SCMI firmware driver updates to add tracing support, and
    use the firmware interfaces for system power control and for
    power capping.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmLo+0UACgkQmmx57+YA
 GNkkFw//eyeMxsJ/pqp0HeXTX7s2p0+39IQiak0hjFNe3XN12PIMRAHHtutKlt7F
 K0fKksokY9p+o1M86/5D4l7v7S6DcHQk2MdUD5AeMu/If7cfL0TI2mdIAVnoad4o
 p54ABR0q2Tr/Dr/2GK8kZPTnXkPPLd951mgCG/jwrlPgG3KjTgjhEWg86+F2s/B/
 P8ryYakCYfsxPJGnZqyw63JuVet9Tnv87jySxabukNfSRR8RbJ+OVKXxaBBmvYkA
 +UuDEkcuPtlrEyUoNe+WtM07BdxP6tl/jRwZ4EenQtFDSLCQnapgHK3bVRbLs/WL
 naKJZgI7OOwU8fjRi90/zYoPBW6UQ9OoxcmshNwgFM5yBLJtVgGM+Fp8XNfPIvm0
 BYvE7jf8cEtFDAOYWuB8jCdvBen8qzt5eeUpV+uOjHDxiWwfif15yUDxCE3GB7Ov
 vSPRpuTec/6Ry5hIbqcsrTcZRihnJbAJqDOHdlSVX3M+ohXcf5OMLd+IbD+oHgpo
 vSKvitkDnCKvdR6Uw1GSNAgeVm1ItMBlcL/8VsurOEUXR301pFNGdHEGuuxDu1Mm
 rEzk37ajYmiol5uBYIU8mdYrlK2+52sWd5/22zIwhMIEgQbuPbouYWPfUSP9bb+U
 9NlvjGVxK5U73UqcU/56nn7W9uRt0ArzSzON53wnBW3WjkcgMLk=
 =0dZI
 -----END PGP SIGNATURE-----

Merge tag 'arm-drivers-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC drivers from Arnd Bergmann:
 "The SoC driver updates contain changes to improve support for
  additional SoC variants, as well as cleanups an minor bugfixes
  in a number of existing drivers.

  Notable updates this time include:

   - Support for Qualcomm MSM8909 (Snapdragon 210) in various drivers

   - Updates for interconnect drivers on Qualcomm Snapdragon

   - A new driver support for NMI interrupts on Fujitsu A64fx

   - A rework of Broadcom BCMBCA Kconfig dependencies

   - Improved support for BCM2711 (Raspberry Pi 4) power management to
     allow the use of the V3D GPU

   - Cleanups to the NXP guts driver

   - Arm SCMI firmware driver updates to add tracing support, and use
     the firmware interfaces for system power control and for power
     capping"

* tag 'arm-drivers-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (125 commits)
  soc: a64fx-diag: disable modular build
  dt-bindings: soc: qcom: qcom,smd-rpm: add power-controller
  dt-bindings: soc: qcom: aoss: document qcom,sm8450-aoss-qmp
  dt-bindings: soc: qcom,rpmh-rsc: simplify qcom,tcs-config
  ARM: mach-qcom: Add support for MSM8909
  dt-bindings: arm: cpus: Document "qcom,msm8909-smp" enable-method
  soc: qcom: spm: Add CPU data for MSM8909
  dt-bindings: soc: qcom: spm: Add MSM8909 CPU compatible
  soc: qcom: rpmpd: Add compatible for MSM8909
  dt-bindings: power: qcom-rpmpd: Add MSM8909 power domains
  soc: qcom: smd-rpm: Add compatible for MSM8909
  dt-bindings: soc: qcom: smd-rpm: Add MSM8909
  soc: qcom: icc-bwmon: Remove unnecessary print function dev_err()
  soc: fujitsu: Add A64FX diagnostic interrupt driver
  soc: qcom: socinfo: Fix the id of SA8540P SoC
  soc: qcom: Make QCOM_RPMPD depend on PM
  tty: serial: bcm63xx: bcmbca: Replace ARCH_BCM_63XX with ARCH_BCMBCA
  spi: bcm63xx-hsspi: bcmbca: Replace ARCH_BCM_63XX with ARCH_BCMBCA
  clk: bcm: bcmbca: Replace ARCH_BCM_63XX with ARCH_BCMBCA
  hwrng: bcm2835: bcmbca: Replace ARCH_BCM_63XX with ARCH_BCMBCA
  ...
2022-08-02 08:10:10 -07:00