Commit Graph

1157612 Commits

Author SHA1 Message Date
Willem de Bruijn 14ade6ba41 net: msg_zerocopy: elide page accounting if RLIM_INFINITY
MSG_ZEROCOPY ensures that pinned user pages do not exceed the limit.
If no limit is set, skip this accounting as otherwise expensive
atomic_long operations are called for no reason.

This accounting is already skipped for privileged (CAP_IPC_LOCK)
users. Rely on the same mechanism: if no mmp->user is set,
mm_unaccount_pinned_pages does not decrement either.

Tested by running tools/testing/selftests/net/msg_zerocopy.sh with
an unprivileged user for the TXMODE binary:

    ip netns exec "${NS1}" sudo -u "{$USER}" "${BIN}" "-${IP}" ...

Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230214155740.3448763-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-15 21:26:36 -08:00
Oleksij Rempel c24a34f5a3 net: phy: c45: genphy_c45_an_config_aneg(): fix uninitialized symbol error
Fix warning:
drivers/net/phy/phy-c45.c:712 genphy_c45_write_eee_adv() error: uninitialized symbol 'changed'

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Link: https://lore.kernel.org/r/202302150232.q6idsV8s-lkp@intel.com/
Fixes: 022c3f87f8 ("net: phy: add genphy_c45_ethtool_get/set_eee() support")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230215050453.2251360-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-15 21:24:37 -08:00
Dan Carpenter 9753613f73 net: phy: motorcomm: uninitialized variables in yt8531_link_change_notify()
These booleans are never set to false, but are just used without being
initialized.

Fixes: 4ac94f728a ("net: phy: Add driver for Motorcomm yt8531 gigabit ethernet phy")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Frank Sae <Frank.Sae@motor-comm.com>
Link: https://lore.kernel.org/r/Y+xd2yJet2ImHLoQ@kili
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-15 21:23:33 -08:00
Corinna Vinschen 5d54cb1767 igb: conditionalize I2C bit banging on external thermal sensor support
Commit a97f8783a9 ("igb: unbreak I2C bit-banging on i350") introduced
code to change I2C settings to bit banging unconditionally.

However, this patch introduced a regression:  On an Intel S2600CWR
Server Board with three NICs:

- 1x dual-port copper
  Intel I350 Gigabit Network Connection [8086:1521] (rev 01)
  fw 1.63, 0x80000dda

- 2x quad-port SFP+ with copper SFP Avago ABCU-5700RZ
  Intel I350 Gigabit Fiber Network Connection [8086:1522] (rev 01)
  fw 1.52.0

the SFP NICs no longer get link at all.  Reverting commit a97f8783a9
or switching to the Intel out-of-tree driver both fix the problem.

Per the igb out-of-tree driver, I2C bit banging on i350 depends on
support for an external thermal sensor (ETS).  However, commit
a97f8783a9 added bit banging unconditionally.  Additionally, the
out-of-tree driver always calls init_thermal_sensor_thresh on probe,
while our driver only calls init_thermal_sensor_thresh only in
igb_reset(), and only if an ETS is present, ignoring the internal
thermal sensor.  The affected SFPs don't provide an ETS.  Per Intel,
the behaviour is a result of i350 firmware requirements.

This patch fixes the problem by aligning the behaviour to the
out-of-tree driver:

- split igb_init_i2c() into two functions:
  - igb_init_i2c() only performs the basic I2C initialization.
  - igb_set_i2c_bb() makes sure that E1000_CTRL_I2C_ENA is set
    and enables bit-banging.

- igb_probe() only calls igb_set_i2c_bb() if an ETS is present.

- igb_probe() calls init_thermal_sensor_thresh() unconditionally.

- igb_reset() aligns its behaviour to igb_probe(), i. e., call
  igb_set_i2c_bb() if an ETS is present and call
  init_thermal_sensor_thresh() unconditionally.

Fixes: a97f8783a9 ("igb: unbreak I2C bit-banging on i350")
Tested-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Co-developed-by: Jamie Bainbridge <jbainbri@redhat.com>
Signed-off-by: Jamie Bainbridge <jbainbri@redhat.com>
Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20230214185549.1306522-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-15 21:20:40 -08:00
Leo Li 2a00299e74 drm/amd/display: Fail atomic_check early on normalize_zpos error
[Why]

drm_atomic_normalize_zpos() can return an error code when there's
modeset lock contention. This was being ignored.

[How]

Bail out of atomic check if normalize_zpos() returns an error.

Fixes: b261509952 ("drm/amd/display: Fix double cursor on non-video RGB MPO")
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Reviewed-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2023-02-15 22:46:42 -05:00
Jack Xiao 8f32378986 drm/amd/amdgpu: fix warning during suspend
Freeing memory was warned during suspend.
Move the self test out of suspend.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=2151825
Cc: jfalempe@redhat.com
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-and-tested-by: Evan Quan <evan.quan@amd.com>
Tested-by: Jocelyn Falempe <jfalempe@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
2023-02-15 22:46:04 -05:00
Jakub Kicinski 0f19f514de mlx5-updates-2023-02-10
1) From Roi and Mark: MultiPort eswitch support
 
 MultiPort E-Switch builds on newer hardware's capabilities and introduces
 a mode where a single E-Switch is used and all the vports and physical
 ports on the NIC are connected to it.
 
 The new mode will allow in the future a decrease in the memory used by the
 driver and advanced features that aren't possible today.
 
 This represents a big change in the current E-Switch implantation in mlx5.
 Currently, by default, each E-Switch manager manages its E-Switch.
 Steering rules in each E-Switch can only forward traffic to the native
 physical port associated with that E-Switch. While there are ways to target
 non-native physical ports, for example using a bond or via special TC
 rules. None of the ways allows a user to configure the driver
 to operate by default in such a mode nor can the driver decide
 to move to this mode by default as it's user configuration-driven right now.
 
 While MultiPort E-Switch single FDB mode is the preferred mode, older
 generations of ConnectX hardware couldn't support this mode so it was never
 implemented. Now that there is capable hardware present, start the
 transition to having this mode by default.
 
 Introduce a devlink parameter to control MultiPort Eswitch single FDB mode.
 This will allow users to select this mode on their system right now
 and in the future will allow the driver to move to this mode by default.
 
 2) From Jiri: Improvements and fixes for mlx5 netdev's devlink logic
  2.1) Cleanups related to mlx5's devlink port logic
  2.2) Move devlink port registration to be done before netdev alloc
  2.3) Create auxdev devlink instance in the same ns as parent devlink
  2.4) Suspend auxiliary devices only in case of PCI device suspend
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmPsBlsACgkQSD+KveBX
 +j6izAgAtvzimjzOqI8J0FKPW3piRH3VtNWUrlGAf4HHPI0lZMbEV5GIaSI15JeH
 wrIpNXKzaA+6+NZ7TzTwqVO+zyKCQVpRw7xPANIJ2A5dId36jU2AgW06zgzFuYWa
 mXKG8rAUls6E5a5DHTaY/SueWMO9KMOfLzHFXHNsLAKTAw+FoDdlGso5vgvsy44g
 2/N/kDBdzC5n8og934sM7JymthnTVyuGe/Au2w1UUR1MMZ46OQgmhStDwqS5MrzI
 pIi+tWFG9YFM5/J8d13y7X/xMCq26Aw6SZSUhsRfTk0goy0uKaU6yYm8vSbO0mci
 +e0c8bpWL/MzAhMzWj8CwBPeO6knOw==
 =tYc2
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2023-02-10' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2023-02-10

1) From Roi and Mark: MultiPort eswitch support

MultiPort E-Switch builds on newer hardware's capabilities and introduces
a mode where a single E-Switch is used and all the vports and physical
ports on the NIC are connected to it.

The new mode will allow in the future a decrease in the memory used by the
driver and advanced features that aren't possible today.

This represents a big change in the current E-Switch implantation in mlx5.
Currently, by default, each E-Switch manager manages its E-Switch.
Steering rules in each E-Switch can only forward traffic to the native
physical port associated with that E-Switch. While there are ways to target
non-native physical ports, for example using a bond or via special TC
rules. None of the ways allows a user to configure the driver
to operate by default in such a mode nor can the driver decide
to move to this mode by default as it's user configuration-driven right now.

While MultiPort E-Switch single FDB mode is the preferred mode, older
generations of ConnectX hardware couldn't support this mode so it was never
implemented. Now that there is capable hardware present, start the
transition to having this mode by default.

Introduce a devlink parameter to control MultiPort Eswitch single FDB mode.
This will allow users to select this mode on their system right now
and in the future will allow the driver to move to this mode by default.

2) From Jiri: Improvements and fixes for mlx5 netdev's devlink logic
 2.1) Cleanups related to mlx5's devlink port logic
 2.2) Move devlink port registration to be done before netdev alloc
 2.3) Create auxdev devlink instance in the same ns as parent devlink
 2.4) Suspend auxiliary devices only in case of PCI device suspend

* tag 'mlx5-updates-2023-02-10' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Suspend auxiliary devices only in case of PCI device suspend
  net/mlx5: Remove "recovery" arg from mlx5_load_one() function
  net/mlx5e: Create auxdev devlink instance in the same ns as parent devlink
  net/mlx5e: Move devlink port registration to be done before netdev alloc
  net/mlx5e: Move dl_port to struct mlx5e_dev
  net/mlx5e: Replace usage of mlx5e_devlink_get_dl_port() by netdev->devlink_port
  net/mlx5e: Pass mdev to mlx5e_devlink_port_register()
  net/mlx5: Remove outdated comment
  net/mlx5e: TC, Remove redundant parse_attr argument
  net/mlx5e: Use a simpler comparison for uplink rep
  net/mlx5: Lag, Add single RDMA device in multiport mode
  net/mlx5: Lag, set different uplink vport metadata in multiport eswitch mode
  net/mlx5: E-Switch, rename bond update function to be reused
  net/mlx5e: TC, Add peer flow in mpesw mode
  net/mlx5: Lag, Control MultiPort E-Switch single FDB mode
====================

Link: https://lore.kernel.org/r/20230214221239.159033-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-15 19:24:52 -08:00
Jakub Kicinski dee4bf7167 Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2023-02-14 (ixgbe, i40e)

This series contains updates to ixgbe and i40e drivers.

Jason Xing corrects comparison of frame sizes for setting MTU with XDP on
ixgbe and adjusts frame size to account for a second VLAN header on ixgbe
and i40e.

* '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ixgbe: add double of VLAN header when computing the max MTU
  i40e: add double of VLAN header when computing the max MTU
  ixgbe: allow to increase MTU to 3K with XDP enabled
====================

Link: https://lore.kernel.org/r/20230214185146.1305819-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-15 19:20:58 -08:00
Jakub Kicinski 388a9c907a Merge branch 'devlink-cleanups-and-move-devlink-health-functionality-to-separate-file'
Moshe Shemesh says:

====================
devlink: cleanups and move devlink health functionality to separate file

This patchset moves devlink health callbacks, helpers and related code
from leftover.c to new file health.c. About 1.3K LoC are moved by this
patchset, covering all devlink health functionality.

In addition this patchset includes a couple of small cleanups in devlink
health code and documentation update.
====================

Link: https://lore.kernel.org/r/1676392686-405892-1-git-send-email-moshe@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-15 19:15:46 -08:00
Moshe Shemesh d0ab772c1f devlink: Fix TP_STRUCT_entry in trace of devlink health report
Fix a bug in trace point definition for devlink health report, as
TP_STRUCT_entry of reporter_name should get reporter_name and not msg.

Note no fixes tag as this is a harmless bug as both reporter_name and
msg are strings and TP_fast_assign for this entry is correct.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-15 19:15:44 -08:00
Moshe Shemesh c745cfb27a devlink: Update devlink health documentation
Update devlink-health.rst file:
- Add devlink formatted message (fmsg) API documentation.
- Add auto-dump as a condition to do dump once error reported.
- Expand OOB to clarify this acronym.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-15 19:15:44 -08:00
Moshe Shemesh 12af29e779 devlink: Move health common function to health file
Now that all devlink health callbacks and related code are in file
health.c move common health functions and devlink_health_reporter struct
to be local in health.c file.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-15 19:15:44 -08:00
Moshe Shemesh c9311ee13f devlink: Move devlink health test to health file
Move devlink health report test callback from leftover.c to health.c. No
functional change in this patch.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-15 19:15:44 -08:00
Moshe Shemesh 7004c6c457 devlink: Move devlink health dump to health file
Move devlink health report dump callbacks and related code from
leftover.c to health.c. No functional change in this patch.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-15 19:15:44 -08:00
Moshe Shemesh a929df7fd9 devlink: Move devlink fmsg and health diagnose to health file
Devlink fmsg (formatted message) is used by devlink health diagnose,
dump and drivers which support these devlink health callbacks.
Therefore, move devlink fmsg helpers and related code to file health.c.
Move devlink health diagnose to file health.c. No functional change in
this patch.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-15 19:15:44 -08:00
Moshe Shemesh 55b9b24968 devlink: Move devlink health report and recover to health file
Move devlink health report helper and recover callback and related code
from leftover.c to health.c. No functional change in this patch.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-15 19:15:44 -08:00
Moshe Shemesh db6b5f3ec4 devlink: Move devlink health get and set code to health file
Move devlink health get and set callbacks and related code from
leftover.c to health.c. No functional change in this patch.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-15 19:15:44 -08:00
Moshe Shemesh bfd4e6a5db devlink: health: Fix nla_nest_end in error flow
devlink_nl_health_reporter_fill() error flow calls nla_nest_end(). Fix
it to call nla_nest_cancel() instead.

Note the bug is harmless as genlmsg_cancel() cancel the entire message,
so no fixes tag added.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-15 19:15:44 -08:00
Moshe Shemesh b4740e3a81 devlink: Split out health reporter create code
Move devlink health reporter create/destroy and related dev code to new
file health.c. This file shall include all callbacks and functionality
that are related to devlink health.

In addition, fix kdoc indentation and make reporter create/destroy kdoc
more clear. No functional change in this patch.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-15 19:15:44 -08:00
Lorenzo Bianconi b6a4103c35 ice: update xdp_features with xdp multi-buff
Now ice driver supports xdp multi-buffer so add it to xdp_features.
Check vsi type before setting xdp_features flag.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/8a4781511ab6e3cd280e944eef69158954f1a15f.1676385351.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-15 19:10:32 -08:00
Lorenzo Bianconi 9dd6e53ef6 i40e: check vsi type before setting xdp_features flag
Set xdp_features flag just for I40E_VSI_MAIN vsi type since XDP is
supported just in this configuration.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/f2b537f86b34fc176fbc6b3d249b46a20a87a2f3.1676405131.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-15 19:07:54 -08:00
Alexander Lobakin 6c20822fad bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMES
&xdp_buff and &xdp_frame are bound in a way that

xdp_buff->data_hard_start == xdp_frame

It's always the case and e.g. xdp_convert_buff_to_frame() relies on
this.
IOW, the following:

	for (u32 i = 0; i < 0xdead; i++) {
		xdpf = xdp_convert_buff_to_frame(&xdp);
		xdp_convert_frame_to_buff(xdpf, &xdp);
	}

shouldn't ever modify @xdpf's contents or the pointer itself.
However, "live packet" code wrongly treats &xdp_frame as part of its
context placed *before* the data_hard_start. With such flow,
data_hard_start is sizeof(*xdpf) off to the right and no longer points
to the XDP frame.

Instead of replacing `sizeof(ctx)` with `offsetof(ctx, xdpf)` in several
places and praying that there are no more miscalcs left somewhere in the
code, unionize ::frm with ::data in a flex array, so that both starts
pointing to the actual data_hard_start and the XDP frame actually starts
being a part of it, i.e. a part of the headroom, not the context.
A nice side effect is that the maximum frame size for this mode gets
increased by 40 bytes, as xdp_buff::frame_sz includes everything from
data_hard_start (-> includes xdpf already) to the end of XDP/skb shared
info.
Also update %MAX_PKT_SIZE accordingly in the selftests code. Leave it
hardcoded for 64 bit && 4k pages, it can be made more flexible later on.

Minor: align `&head->data` with how `head->frm` is assigned for
consistency.
Minor #2: rename 'frm' to 'frame' in &xdp_page_head while at it for
clarity.

(was found while testing XDP traffic generator on ice, which calls
 xdp_convert_frame_to_buff() for each XDP frame)

Fixes: b530e9e106 ("bpf: Add "live packet" mode for XDP in BPF_PROG_RUN")
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20230215185440.4126672-1-aleksander.lobakin@intel.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-02-15 17:39:36 -08:00
Andrii Nakryiko d964f09af4 Merge branch 'New benchmark for hashmap lookups'
Anton Protopopov says:

====================

Add a new benchmark for hashmap lookups and fix several typos.

In commit 3 I've patched the bench utility so that now command line options
can be reused by different benchmarks.

The benchmark itself is added in the last commit 7. I was using this benchmark
to test map lookup productivity when using a different hash function [1]. When
run with --quiet, the results can be easily plotted [2].  The results provided
by the benchmark look reasonable and match the results of my different
benchmarks (requiring to patch kernel to get actual statistics on map lookups).

Links:
  [1] https://fosdem.org/2023/schedule/event/bpf_hashing/
  [2] https://github.com/aspsk/bpf-bench/tree/master/hashmap-bench

Changes,
v1->v2:
- percpu_times_index[] is of wrong size (Martin)
- use base 0 for strtol (Andrii)
- just use -q without argument (Andrii)
- use less hacks when parsing arguments (Andrii)
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2023-02-15 16:29:32 -08:00
Anton Protopopov f371f2dc53 selftest/bpf/benchs: Add benchmark for hashmap lookups
Add a new benchmark which measures hashmap lookup operations speed.  A user can
control the following parameters of the benchmark:

    * key_size (max 1024): the key size to use
    * max_entries: the hashmap max entries
    * nr_entries: the number of entries to insert/lookup
    * nr_loops: the number of loops for the benchmark
    * map_flags The hashmap flags passed to BPF_MAP_CREATE

The BPF program performing the benchmarks calls two nested bpf_loop:

    bpf_loop(nr_loops/nr_entries)
            bpf_loop(nr_entries)
                     bpf_map_lookup()

So the nr_loops determines the number of actual map lookups. All lookups are
successful.

Example (the output is generated on a AMD Ryzen 9 3950X machine):

    for nr_entries in `seq 4096 4096 65536`; do echo -n "$((nr_entries*100/65536))% full: "; sudo ./bench -d2 -a bpf-hashmap-lookup --key_size=4 --nr_entries=$nr_entries --max_entries=65536 --nr_loops=1000000 --map_flags=0x40 | grep cpu; done
    6% full: cpu01: lookup 50.739M ± 0.018M events/sec (approximated from 32 samples of ~19ms)
    12% full: cpu01: lookup 47.751M ± 0.015M events/sec (approximated from 32 samples of ~20ms)
    18% full: cpu01: lookup 45.153M ± 0.013M events/sec (approximated from 32 samples of ~22ms)
    25% full: cpu01: lookup 43.826M ± 0.014M events/sec (approximated from 32 samples of ~22ms)
    31% full: cpu01: lookup 41.971M ± 0.012M events/sec (approximated from 32 samples of ~23ms)
    37% full: cpu01: lookup 41.034M ± 0.015M events/sec (approximated from 32 samples of ~24ms)
    43% full: cpu01: lookup 39.946M ± 0.012M events/sec (approximated from 32 samples of ~25ms)
    50% full: cpu01: lookup 38.256M ± 0.014M events/sec (approximated from 32 samples of ~26ms)
    56% full: cpu01: lookup 36.580M ± 0.018M events/sec (approximated from 32 samples of ~27ms)
    62% full: cpu01: lookup 36.252M ± 0.012M events/sec (approximated from 32 samples of ~27ms)
    68% full: cpu01: lookup 35.200M ± 0.012M events/sec (approximated from 32 samples of ~28ms)
    75% full: cpu01: lookup 34.061M ± 0.009M events/sec (approximated from 32 samples of ~29ms)
    81% full: cpu01: lookup 34.374M ± 0.010M events/sec (approximated from 32 samples of ~29ms)
    87% full: cpu01: lookup 33.244M ± 0.011M events/sec (approximated from 32 samples of ~30ms)
    93% full: cpu01: lookup 32.182M ± 0.013M events/sec (approximated from 32 samples of ~31ms)
    100% full: cpu01: lookup 31.497M ± 0.016M events/sec (approximated from 32 samples of ~31ms)

Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230213091519.1202813-8-aspsk@isovalent.com
2023-02-15 16:29:31 -08:00
Anton Protopopov a237dda05e selftest/bpf/benchs: Print less if the quiet option is set
The bench utility will print

    Setting up benchmark '<bench-name>'...
    Benchmark '<bench-name>' started.

on startup to stdout. Suppress this output if --quiet option if given. This
makes it simpler to parse benchmark output by a script.

Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230213091519.1202813-7-aspsk@isovalent.com
2023-02-15 16:29:31 -08:00
Anton Protopopov 90c22503cd selftest/bpf/benchs: Make quiet option common
The "local-storage-tasks-trace" benchmark has a `--quiet` option. Move it to
the list of common options, so that the main code and other benchmarks can use
(new) env.quiet variable. Patch the run_bench_local_storage_rcu_tasks_trace.sh
helper script accordingly.

Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230213091519.1202813-6-aspsk@isovalent.com
2023-02-15 16:29:31 -08:00
Anton Protopopov 9644546260 selftest/bpf/benchs: Remove an unused header
The benchs/bench_bpf_hashmap_full_update.c doesn't set a custom argp,
so it shouldn't include the <argp.h> header.

Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230213091519.1202813-5-aspsk@isovalent.com
2023-02-15 16:29:31 -08:00
Anton Protopopov 22ff7aeaa9 selftest/bpf/benchs: Enhance argp parsing
To parse command line the bench utility uses the argp_parse() function. This
function takes as an argument a parent 'struct argp' structure which defines
common command line options and an array of children 'struct argp' structures
which defines additional command line options for particular benchmarks. This
implementation doesn't allow benchmarks to share option names, e.g., if two
benchmarks want to use, say, the --option option, then only one of them will
succeed (the first one encountered in the array).  This will be convenient if
same option names could be used in different benchmarks (with the same
semantics, e.g., --nr_loops=N).

Fix this by calling the argp_parse() function twice. The first call is the same
as it was before, with all children argps, and helps to find the benchmark name
and to print a combined help message if anything is wrong.  Given the name, we
can call the argp_parse the second time, but now the children array points only
to a correct benchmark thus always calling the correct parsers. (If there's no
a specific list of arguments, then only one call to argp_parse will be done.)

Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230213091519.1202813-4-aspsk@isovalent.com
2023-02-15 16:29:31 -08:00
Anton Protopopov 2f1c59637f selftest/bpf/benchs: Make a function static in bpf_hashmap_full_update
The hashmap_report_final callback function defined in the
benchs/bench_bpf_hashmap_full_update.c file should be static.

Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230213091519.1202813-3-aspsk@isovalent.com
2023-02-15 16:29:31 -08:00
Anton Protopopov 4db98ab445 selftest/bpf/benchs: Fix a typo in bpf_hashmap_full_update
To call the bpf_hashmap_full_update benchmark, one should say:

    bench bpf-hashmap-ful-update

The patch adds a missing 'l' to the benchmark name.

Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230213091519.1202813-2-aspsk@isovalent.com
2023-02-15 16:29:31 -08:00
Alexei Starovoitov 3538a0fbbd Merge branch 'Use __GFP_ZERO in bpf memory allocator'
Hou Tao says:

====================

From: Hou Tao <houtao1@huawei.com>

Hi,

The patchset tries to fix the hard-up problem found when checking how htab
handles element reuse in bpf memory allocator. The immediate reuse of
freed elements will reinitialize special fields (e.g., bpf_spin_lock) in
htab map value and it may corrupt lookup procedure with BFP_F_LOCK flag
which acquires bpf-spin-lock during value copying, and lead to hard-lock
as shown in patch #2. Patch #1 fixes it by using __GFP_ZERO when allocating
the object from slab and the behavior is similar with the preallocated
hash-table case. Please see individual patches for more details. And comments
are always welcome.

Regards,

Change Log:
v1:
  * Use __GFP_ZERO instead of ctor to avoid retpoline overhead (from Alexei)
  * Add comments for check_and_init_map_value() (from Alexei)
  * split __GFP_ZERO patches out of the original patchset to unblock
    the development work of others.

RFC: https://lore.kernel.org/bpf/20221230041151.1231169-1-houtao@huaweicloud.com
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-15 15:40:06 -08:00
Hou Tao f88da2d46c selftests/bpf: Add test case for element reuse in htab map
The reinitialization of spin-lock in map value after immediate reuse may
corrupt lookup with BPF_F_LOCK flag and result in hard lock-up, so add
one test case to demonstrate the problem.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20230215082132.3856544-3-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-15 15:40:06 -08:00
Hou Tao 997849c4b9 bpf: Zeroing allocated object from slab in bpf memory allocator
Currently the freed element in bpf memory allocator may be immediately
reused, for htab map the reuse will reinitialize special fields in map
value (e.g., bpf_spin_lock), but lookup procedure may still access
these special fields, and it may lead to hard-lockup as shown below:

 NMI backtrace for cpu 16
 CPU: 16 PID: 2574 Comm: htab.bin Tainted: G             L     6.1.0+ #1
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
 RIP: 0010:queued_spin_lock_slowpath+0x283/0x2c0
 ......
 Call Trace:
  <TASK>
  copy_map_value_locked+0xb7/0x170
  bpf_map_copy_value+0x113/0x3c0
  __sys_bpf+0x1c67/0x2780
  __x64_sys_bpf+0x1c/0x20
  do_syscall_64+0x30/0x60
  entry_SYSCALL_64_after_hwframe+0x46/0xb0
 ......
  </TASK>

For htab map, just like the preallocated case, these is no need to
initialize these special fields in map value again once these fields
have been initialized. For preallocated htab map, these fields are
initialized through __GFP_ZERO in bpf_map_area_alloc(), so do the
similar thing for non-preallocated htab in bpf memory allocator. And
there is no need to use __GFP_ZERO for per-cpu bpf memory allocator,
because __alloc_percpu_gfp() does it implicitly.

Fixes: 0fd7c5d433 ("bpf: Optimize call_rcu in non-preallocated hash map.")
Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20230215082132.3856544-2-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-15 15:40:06 -08:00
Linus Torvalds 033c40a89f Regression fix
- getattr mediation of old policy
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE7cSDD705q2rFEEf7BS82cBjVw9gFAmPtOMIACgkQBS82cBjV
 w9grtg//QDrI+6kQ8mbiluRVbnZim3lRz6u8w9FUh9421H82FmoWlvkuwIe8HfNi
 3VuhdfsoEUY2Jaq2xUtarAQzzLKJUIGyeqXUMck0Ri8ySEtS21ZrWQ1EOOzT6lCZ
 NfFD5D/CoavHa4mROSos3j/F0+xjqlp+Wmy0erH4qKPd8S1pmleT9lVxboWx5i7o
 LyMn1b36KNd3abYKpJPe/r2BBmi/qw+QqMJBu/KIuD5y5ElRAfz/bSogm4Q9Ap4S
 1Tcn1dILdLOKAo1jc5TRLM9CkyPQ9XwLnodhVh8cQrLG3HdDnEdw1R2k8x7QDxvy
 YIpq7bI5nHsMh+1Nzbf9F8lTsovn4OGcALo17LgrosR5oeza8lvrxxrgW7OR7D0m
 uURhBjnySyEYJ7/qa8XeqcY7d1p4TnA3mGsItlmacgL5o3A/01uu2P/NzukceVv0
 iAjaMVmSHECZ7B8PUYcSiXs0rFAjeMZGquT7w+UQiXgLeOl2FR0bjqv6VL46vhHq
 6B1tO2UJXqtm7oGYjTqYfJ2HIRT9MYUF5fUGzUPPDohk2vgrsDARpF9ZTPXs2O1K
 xJHHC5eRo9pteoNl3T30U8lBudonYDeWKEIPyGCGlwbm51SeHeo/VmAny+X0Osk7
 168bq9KnFmrKdofNP6NOhiMmKlytUISZ3CVbC0NoLC/jj85RS+o=
 =Ei2j
 -----END PGP SIGNATURE-----

Merge tag 'apparmor-v6.2-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor

Pull apparmor fix from John Johansen:
 "Regression fix for getattr mediation of old policy"

* tag 'apparmor-v6.2-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
  apparmor: Fix regression in compat permissions for getattr
2023-02-15 14:53:08 -08:00
Jens Axboe 9a28b92cc2 nvme fixes for Linux 6.2
- always return an ERR_PTR from nvme_pci_alloc_dev (Irvin Cote)
  - add bogus ID quirk for ADATA SX6000PNP (Daniel Wagner)
  - set the DMA mask earlier (Christoph Hellwig)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmPtILgLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYM0gA//d2RFIIdhmFFu5iBp7S05D5CjLyQxP9jdVFQIJTjO
 UTx6YIb18v9VEgxhWA8EwDQuLcW9Uj7s59Wxkt+Y6s/8PAVQxyzKyEyxTXJWG9kh
 qQeTyymqOBHOScu3aWSIa7TpOlg8/Pxkn0MgmmYm3Wx+NA7+7xzxeVxs8hcoN4Hc
 2mTPmfJY+DiOopyKV/awiwAG1g39gziPODh2VT1kJWn1Q7OxRMbZ9CN047b1yII8
 54hYI0z8MF2T83SdVS+MADcqtWCNKrEVVrEVJS/kQ5fNKV4ojFOTf5hMR6xmo+bg
 bXyxnu0HSM/Ij76Umw7xqlcF5L+Rn/MnJNebcJ+iNfsi6g+AaVwekbMBC4NVWTtp
 iGI4uxH9EbWS654CikTdXPxtbQuODpYL1x/16vZqBQvmv5WJH07uVsrwG3Zoo9R+
 krKNMiI7PkTUbH6IACmk/RNPfuxK0qYpTDZcvKIj6C0PBNd/0RpVi8mjeQzgFlHd
 w+AZ7iu0/8HOTNpoI8Qi0NxlHBOAHFqILrNRf78YN1I8Hisa4JYILrhU1UL9wnDg
 aCLUS56SGCFAjgVlYlakFq2iR8htbFh1fPhWz3Q6BV8k84LjaaHoZvVV3A0DYhdj
 4+rOvI7k5r4sKIIp7jUe2xXwEyacOziY/KrI8kzBIxeio8+CykPNlbDTgIJHW79y
 U5M=
 =xRhD
 -----END PGP SIGNATURE-----

Merge tag 'nvme-6.2-2023-02-15' of git://git.infradead.org/nvme into block-6.2

Pull NVMe fixes from Christoph:

"nvme fixes for Linux 6.2

 - always return an ERR_PTR from nvme_pci_alloc_dev (Irvin Cote)
 - add bogus ID quirk for ADATA SX6000PNP (Daniel Wagner)
 - set the DMA mask earlier (Christoph Hellwig)"

* tag 'nvme-6.2-2023-02-15' of git://git.infradead.org/nvme:
  nvme-pci: always return an ERR_PTR from nvme_pci_alloc_dev
  nvme-pci: set the DMA mask earlier
  nvme-pci: add bogus ID quirk for ADATA SX6000PNP
2023-02-15 13:47:27 -07:00
Linus Torvalds 3402351a5a nfsd-6.2 fixes:
- Fix a teardown bug in the new nfs4_file hashtable
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmPqS30ACgkQM2qzM29m
 f5fV2BAAgkmq0MF6nTjjVcFYSUyJvPO4jCcT5ZPqJcv2CHCWp/EkhOrBMHtli3y2
 zUspjZfHpuV5oi2j7J1/PnlLBbov7oXjg9vZ2xGazo1Zkr5wQHxwyuFXdVzte2Qk
 fGcCEgLA3aPiluafzWHMYeoZUqVSQCT41UaUW4iauc/byRZmNs0nwVsYlfhWzg1U
 dsQsK03S357mmqhevxIbBVvTRNakxJwqALBVZcEgfjulESwgQ6gIJpo8srrAkZ4v
 2k5bPVVTRhyDCv+8fOe7UqIyW7HwT5WDvMJg7C883wcP8xk7NHqg54t7NVoPNv5R
 oZLkwUBr5G75OThCmrm8zDMt6zqT76Wyc29fBQ0hjyNml0HOiB131yYv7zGW6qqD
 2e+jQpAKDkc6IIy8IzqxgHan9lMvtYfz7uIQjGGXb+vPTF5S7BIdWJjSEvgo/7a7
 RNlou48UAF4NeZHcDESXFArcQ0MXd9rBnawxRVDqMzQ3owMyn3Nt8FRx7K/dNGKl
 6qnIR+H7G65KQgQCxiqrGvvO3NssCBurv/9BRNRmVshVzI0pdn/seZff4TngvDfI
 GWvEE7nMFE1jDBrN5o5ecLUVI+zTDuNY3AAnaH7afiRCBQauq+AziCfvXoLtfGw2
 +FktTeZRgZQzMFA62H3eEDvy5dKDp/a0ivP1QglIid6oBRH/muw=
 =pQWv
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-6.2-6' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd fix from Chuck Lever:

 - Fix a teardown bug in the new nfs4_file hashtable

* tag 'nfsd-6.2-6' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  nfsd: don't destroy global nfs4_file table in per-net shutdown
2023-02-15 11:48:56 -08:00
Alexei Starovoitov b2d9002ee9 Merge branch 'Improvements for BPF_ST tracking by verifier '
Eduard Zingerman says:

====================

This patch-set is a part of preparation work for -mcpu=v4 option for
BPF C compiler (discussed in [1]). Among other things -mcpu=v4 should
enable generation of BPF_ST instruction by the compiler.

- Patches #1,2 adjust verifier to track values of constants written to
  stack using BPF_ST. Currently these are tracked imprecisely, unlike
  the writes using BPF_STX, e.g.:

    fp[-8] = 42;   currently verifier assumes that fp[-8]=mmmmmmmm
                   after such instruction, where m stands for "misc",
                   just a note that something is written at fp[-8].

    r1 = 42;       verifier tracks r1=42 after this instruction.
    fp[-8] = r1;   verifier tracks fp[-8]=42 after this instruction.

  This patch makes both cases equivalent.

- Patches #3,4 adjust verifier.c:check_stack_write_fixed_off() to
  preserve STACK_ZERO marks when BPF_ST writes zero. Currently these
  are replaced by STACK_MISC, unlike zero writes using BPF_STX, e.g.:

    ... stack range [X,Y] is marked as STACK_ZERO ...
    r0 = ... variable offset pointer to stack with range [X,Y] ...

    fp[r0] = 0;    currently verifier marks range [X,Y] as
                   STACK_MISC for such instructions.

    r1 = 0;
    fp[r0] = r1;   verifier keeps STACK_ZERO marks for range [X,Y].

  This patch makes both cases equivalent.

Motivating example for patch #1 could be found at [3].

Previous version of the patch-set is here [2], the changes are:
- Explicit initialization of fake register parent link is removed from
  verifier.c:check_stack_write_fixed_off() as parent links are now
  correctly handled by verifier.c:save_register_state().
- Original patch #1 is split in patches #1 & #3.
- Missing test case added for patch #3
  verifier.c:check_stack_write_fixed_off() adjustment.
- Test cases are updated to use .prog_type = BPF_PROG_TYPE_SK_LOOKUP,
  which requires return value to be in the range [0,1] (original test
  cases assumed that such range is always required, which is not true).
- Original patch #3 with changes allowing BPF_ST writes to context is
  withheld for now, w/o compiler support for BPF_ST it requires some
  creative testing.
- Original patch #5 is removed from the patch-set. This patch
  contained adjustments to expected verifier error messages in some
  tests, necessary when C compiler generates BPF_ST instruction
  instead of BPF_STX (changes to expected instruction indices). These
  changes are not necessary yet.

[1] https://lore.kernel.org/bpf/01515302-c37d-2ee5-c950-2f556a4caad0@meta.com/
[2] https://lore.kernel.org/bpf/20221231163122.1360813-1-eddyz87@gmail.com/
[3] https://lore.kernel.org/bpf/f1e4282bf00aa21a72fc5906f8c3be1ae6c94a5e.camel@gmail.com/
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-15 11:48:48 -08:00
Eduard Zingerman 2a33c5a25e selftests/bpf: check if BPF_ST with variable offset preserves STACK_ZERO
A test case to verify that variable offset BPF_ST instruction
preserves STACK_ZERO marks when writes zeros, e.g. in the following
situation:

  *(u64*)(r10 - 8) = 0   ; STACK_ZERO marks for fp[-8]
  r0 = random(-7, -1)    ; some random number in range of [-7, -1]
  r0 += r10              ; r0 is now variable offset pointer to stack
  *(u8*)(r0) = 0         ; BPF_ST writing zero, STACK_ZERO mark for
                         ; fp[-8] should be preserved.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230214232030.1502829-5-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-15 11:48:48 -08:00
Eduard Zingerman 31ff213512 bpf: BPF_ST with variable offset should preserve STACK_ZERO marks
BPF_STX instruction preserves STACK_ZERO marks for variable offset
writes in situations like below:

  *(u64*)(r10 - 8) = 0   ; STACK_ZERO marks for fp[-8]
  r0 = random(-7, -1)    ; some random number in range of [-7, -1]
  r0 += r10              ; r0 is now a variable offset pointer to stack
  r1 = 0
  *(u8*)(r0) = r1        ; BPF_STX writing zero, STACK_ZERO mark for
                         ; fp[-8] is preserved

This commit updates verifier.c:check_stack_write_var_off() to process
BPF_ST in a similar manner, e.g. the following example:

  *(u64*)(r10 - 8) = 0   ; STACK_ZERO marks for fp[-8]
  r0 = random(-7, -1)    ; some random number in range of [-7, -1]
  r0 += r10              ; r0 is now variable offset pointer to stack
  *(u8*)(r0) = 0         ; BPF_ST writing zero, STACK_ZERO mark for
                         ; fp[-8] is preserved

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230214232030.1502829-4-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-15 11:48:47 -08:00
Eduard Zingerman 1a24af65bb selftests/bpf: check if verifier tracks constants spilled by BPF_ST_MEM
Check that verifier tracks the value of 'imm' spilled to stack by
BPF_ST_MEM instruction. Cover the following cases:
- write of non-zero constant to stack;
- write of a zero constant to stack.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230214232030.1502829-3-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-15 11:48:47 -08:00
Eduard Zingerman ecdf985d76 bpf: track immediate values written to stack by BPF_ST instruction
For aligned stack writes using BPF_ST instruction track stored values
in a same way BPF_STX is handled, e.g. make sure that the following
commands produce similar verifier knowledge:

  fp[-8] = 42;             r1 = 42;
                       fp[-8] = r1;

This covers two cases:
 - non-null values written to stack are stored as spill of fake
   registers;
 - null values written to stack are stored as STACK_ZERO marks.

Previously both cases above used STACK_MISC marks instead.

Some verifier test cases relied on the old logic to obtain STACK_MISC
marks for some stack values. These test cases are updated in the same
commit to avoid failures during bisect.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230214232030.1502829-2-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-15 11:48:47 -08:00
Linus Torvalds ca5ca22775 tracing: Make trace_define_field_ext() static
Just after the fix to TASK_COMM_LEN not converted to its value in
 trace_events was pulled, the kernel test robot reported that the helper
 function trace_define_field_ext() added to that change was only used in
 the file it was defined in but was not declared static. Make it a local
 function.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCY+pQvhQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qpWMAP958Izvo22zPjlvqypLrC4wkwOrU6BG
 ITApOESLGS6YMAEA3X1qVpjgXClFmRv6j+J7S6LdhUzhkOm9Sxg5Vejxzgo=
 =4fmj
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.2-rc7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixlet from Steven Rostedt:
 "Make trace_define_field_ext() static.

  Just after the fix to TASK_COMM_LEN not converted to its value in
  trace_events was pulled, the kernel test robot reported that the
  helper function trace_define_field_ext() added to that change was only
  used in the file it was defined in but was not declared static.

  Make it a local function"

* tag 'trace-v6.2-rc7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Make trace_define_field_ext() static
2023-02-15 11:31:34 -08:00
John Johansen cbb13e12a5 apparmor: Fix regression in compat permissions for getattr
This fixes a regression in mediation of getattr when old policy built
under an older ABI is loaded and mapped to internal permissions.

The regression does not occur for all getattr permission requests,
only appearing if state zero is the final state in the permission
lookup.  This is because despite the first state (index 0) being
guaranteed to not have permissions in both newer and older permission
formats, it may have to carry permissions that were not mediated as
part of an older policy. These backward compat permissions are
mapped here to avoid special casing the mediation code paths.

Since the mapping code already takes into account backwards compat
permission from older formats it can be applied to state 0 to fix
the regression.

Fixes: 408d53e923 ("apparmor: compute file permissions on profile load")
Reported-by: Philip Meulengracht <the_meulengracht@hotmail.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2023-02-15 11:24:38 -08:00
Johannes Berg 3caf31e7b1 wifi: mac80211: add documentation for amsdu_mesh_control
This documentation wasn't added in the original patch,
add it now.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 6e4c0d0460 ("wifi: mac80211: add a workaround for receiving non-standard mesh A-MSDU")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-02-15 18:31:16 +01:00
Lorenzo Bianconi 4048a6a738 wifi: cfg80211: remove gfp parameter from cfg80211_obss_color_collision_notify description
Get rid of gfp parameter from cfg80211_obss_color_collision_notify
routine description.

Fixes: 935ef47b16 ("wifi: cfg80211: get rid of gfp in cfg80211_bss_color_notify")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/2da652e2cd5c7903191091ae9757718f1be802a1.1676453359.git.lorenzo@kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-02-15 18:28:52 +01:00
Johannes Berg ab5f171e36 wifi: mac80211: always initialize link_sta with sta
When we have multiple interfaces receiving the same frame,
such as a multicast frame, one interface might have a sta
and the other not. In this case, link_sta would be set but
not cleared again.

Always set link_sta, so we keep an invariant that link_sta
and sta are either both set or both not set.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-02-15 18:27:35 +01:00
Johannes Berg 0d846bdc11 wifi: mac80211: pass 'sta' to ieee80211_rx_data_set_sta()
There's at least one case in ieee80211_rx_for_interface()
where we might pass &((struct sta_info *)NULL)->sta to it
only to then do container_of(), and then checking the
result for NULL, but checking the result of container_of()
for NULL looks really odd.

Fix this by just passing the struct sta_info * instead.

Fixes: e66b7920aa ("wifi: mac80211: fix initialization of rx->link and rx->link_sta")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-02-15 18:27:25 +01:00
Marc Bornand c38c701851 wifi: cfg80211: Set SSID if it is not already set
When a connection was established without going through
NL80211_CMD_CONNECT, the ssid was never set in the wireless_dev struct.
Now we set it in __cfg80211_connect_result() when it is not already set.

When using a userspace configuration that does not call
cfg80211_connect() (can be checked with breakpoints in the kernel),
this patch should allow `networkctl status device_name` to output the
SSID instead of null.

Cc: stable@vger.kernel.org
Reported-by: Yohan Prod'homme <kernel@zoddo.fr>
Fixes: 7b0a0e3c3a (wifi: cfg80211: do some rework towards MLO link APIs)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216711
Signed-off-by: Marc Bornand <dev.mbornand@systemb.ch>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-02-15 18:26:58 +01:00
Alexei Starovoitov 62d101d5f4 selftests/bpf: Fix map_kptr test.
The compiler is optimizing out majority of unref_ptr read/writes, so the test
wasn't testing much. For example, one could delete '__kptr' tag from
'struct prog_test_ref_kfunc __kptr *unref_ptr;' and the test would still "pass".

Convert it to volatile stores. Confirmed by comparing bpf asm before/after.

Fixes: 2cbc469a6f ("selftests/bpf: Add C tests for kptr")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230214235051.22938-1-alexei.starovoitov@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-02-15 09:24:59 -08:00
Björn Töpel 5e53e5c7ed selftests/bpf: Cross-compile bpftool
When the BPF selftests are cross-compiled, only the a host version of
bpftool is built. This version of bpftool is used on the host-side to
generate various intermediates, e.g., skeletons.

The test runners are also using bpftool, so the Makefile will symlink
bpftool from the selftest/bpf root, where the test runners will look
the tool:

  | $(Q)ln -sf $(if $2,..,.)/tools/build/bpftool/bootstrap/bpftool \
  |    $(OUTPUT)/$(if $2,$2/)bpftool

There are two problems for cross-compilation builds:

 1. There is no native (cross-compilation target) of bpftool
 2. The bootstrap/bpftool is never cross-compiled (by design)

Make sure that a native/cross-compiled version of bpftool is built,
and if CROSS_COMPILE is set, symlink the native/non-bootstrap version.

Acked-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20230214161253.183458-1-bjorn@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-15 08:50:20 -08:00