linux-stable/arch
Uros Bizjak 636d6a8b85 locking/atomic/x86: Introduce arch_sync_try_cmpxchg()
Introduce the arch_sync_try_cmpxchg() macro to improve code that uses
the sync_try_cmpxchg() locking primitive. The new definitions use the
existing __raw_try_cmpxchg() macros, but with their own "lock; " prefix.

The new macros improve the assembly of the cmpxchg loop in
evtchn_fifo_unmask() (drivers/xen/events/events_fifo.c) from:

 57a:	85 c0                	test   %eax,%eax
 57c:	78 52                	js     5d0 <...>
 57e:	89 c1                	mov    %eax,%ecx
 580:	25 ff ff ff af       	and    $0xafffffff,%eax
 585:	c7 04 24 00 00 00 00 	movl   $0x0,(%rsp)
 58c:	81 e1 ff ff ff ef    	and    $0xefffffff,%ecx
 592:	89 4c 24 04          	mov    %ecx,0x4(%rsp)
 596:	89 44 24 08          	mov    %eax,0x8(%rsp)
 59a:	8b 74 24 08          	mov    0x8(%rsp),%esi
 59e:	8b 44 24 04          	mov    0x4(%rsp),%eax
 5a2:	f0 0f b1 32          	lock cmpxchg %esi,(%rdx)
 5a6:	89 04 24             	mov    %eax,(%rsp)
 5a9:	8b 04 24             	mov    (%rsp),%eax
 5ac:	39 c1                	cmp    %eax,%ecx
 5ae:	74 07                	je     5b7 <...>
 5b0:	a9 00 00 00 40       	test   $0x40000000,%eax
 5b5:	75 c3                	jne    57a <...>
 <...>

to:

 578:	a9 00 00 00 40       	test   $0x40000000,%eax
 57d:	74 2b                	je     5aa <...>
 57f:	85 c0                	test   %eax,%eax
 581:	78 40                	js     5c3 <...>
 583:	89 c1                	mov    %eax,%ecx
 585:	25 ff ff ff af       	and    $0xafffffff,%eax
 58a:	81 e1 ff ff ff ef    	and    $0xefffffff,%ecx
 590:	89 4c 24 04          	mov    %ecx,0x4(%rsp)
 594:	89 44 24 08          	mov    %eax,0x8(%rsp)
 598:	8b 4c 24 08          	mov    0x8(%rsp),%ecx
 59c:	8b 44 24 04          	mov    0x4(%rsp),%eax
 5a0:	f0 0f b1 0a          	lock cmpxchg %ecx,(%rdx)
 5a4:	89 44 24 04          	mov    %eax,0x4(%rsp)
 5a8:	75 30                	jne    5da <...>
 <...>
 5da:	8b 44 24 04          	mov    0x4(%rsp),%eax
 5de:	eb 98                	jmp    578 <...>

The new code removes the move instructions at 585:, 5a6: and 5a9:
and the compare at 5ac:. Additionally, the compiler assumes that
cmpxchg success is more probable and optimizes code flow accordingly.
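
The difference between the two loop shapes can be illustrated with a
userspace sketch. This is not the kernel code: it assumes C11
<stdatomic.h> in place of the kernel's sync_cmpxchg() and
sync_try_cmpxchg(), and the helper names, function names and bit names
below are hypothetical. Only the bit positions are taken from the
constants visible in the disassembly above (sign bit, 0x40000000, and
the bit cleared by 0xefffffff).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; positions follow the masks in the disassembly. */
#define PENDING_BIT (UINT32_C(1) << 31)  /* checked by test/js          */
#define MASKED_BIT  (UINT32_C(1) << 30)  /* checked by test $0x40000000 */
#define BUSY_BIT    (UINT32_C(1) << 28)  /* cleared by and $0xefffffff  */

/* cmpxchg-style helper (assumption): returns the value found in memory. */
static uint32_t cmpxchg_u32(_Atomic uint32_t *p, uint32_t old, uint32_t new_val)
{
    assert(p != NULL);
    atomic_compare_exchange_strong(p, &old, new_val);
    return old; /* unchanged on success, current value on failure */
}

/* try_cmpxchg-style helper (assumption): returns success, updates *old on failure. */
static bool try_cmpxchg_u32(_Atomic uint32_t *p, uint32_t *old, uint32_t new_val)
{
    assert(p != NULL && old != NULL);
    return atomic_compare_exchange_strong(p, old, new_val);
}

/* Old loop shape: the caller compares the returned value with 'old' itself. */
static bool clear_masked_cmpxchg(_Atomic uint32_t *word)
{
    uint32_t w = atomic_load(word);
    uint32_t old, new_val;

    do {
        if (!(w & MASKED_BIT))
            return true;
        if (w & PENDING_BIT)
            return false;
        old = w & ~BUSY_BIT;
        new_val = old & ~MASKED_BIT;
        w = cmpxchg_u32(word, old, new_val);
    } while (w != old); /* the extra compare the commit message refers to */

    return true;
}

/* New loop shape: success comes straight from the primitive's result. */
static bool clear_masked_try_cmpxchg(_Atomic uint32_t *word)
{
    uint32_t w = atomic_load(word);
    uint32_t old, new_val;

    for (;;) {
        if (!(w & MASKED_BIT))
            return true;
        if (w & PENDING_BIT)
            return false;
        old = w & ~BUSY_BIT;
        new_val = old & ~MASKED_BIT;
        if (try_cmpxchg_u32(word, &old, new_val))
            return true;
        w = old; /* a failed exchange stored the current value in 'old' */
    }
}
```

In the second form the success flag comes directly from the primitive,
so the caller needs no separate comparison of the returned value. On
x86 this corresponds to branching on the flags set by cmpxchg, as in
the jne at 5a8: above.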

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
2023-10-09 18:14:25 +02:00
..
alpha locking/local, arch: Rewrite local_add_unless() as a static inline function 2023-10-04 11:38:11 +02:00
arc ARC updates for v6.6 2023-09-04 15:38:24 -07:00
arm Linux 6.6-rc5 2023-10-09 18:09:23 +02:00
arm64 Linux 6.6-rc5 2023-10-09 18:09:23 +02:00
csky arch/csky 2nd patches for 6.6 2023-09-01 08:02:45 -07:00
hexagon Add x86 shadow stack support 2023-08-31 12:20:12 -07:00
ia64 Linux 6.6-rc5 2023-10-09 18:09:23 +02:00
loongarch Linux 6.6-rc5 2023-10-09 18:09:23 +02:00
m68k futex: Add sys_futex_requeue() 2023-09-21 19:22:10 +02:00
microblaze futex: Add sys_futex_requeue() 2023-09-21 19:22:10 +02:00
mips Linux 6.6-rc5 2023-10-09 18:09:23 +02:00
nios2 Add x86 shadow stack support 2023-08-31 12:20:12 -07:00
openrisc OpenRISC updates for 6.6 2023-09-05 10:09:31 -07:00
parisc Linux 6.6-rc5 2023-10-09 18:09:23 +02:00
powerpc Linux 6.6-rc5 2023-10-09 18:09:23 +02:00
riscv Fourteen hotfixes, eleven of which are cc:stable. The remainder pertain 2023-10-01 13:33:25 -07:00
s390 Linux 6.6-rc5 2023-10-09 18:09:23 +02:00
sh Linux 6.6-rc5 2023-10-09 18:09:23 +02:00
sparc Linux 6.6-rc5 2023-10-09 18:09:23 +02:00
um This pull request contains the following changes for UML: 2023-09-04 11:32:21 -07:00
x86 locking/atomic/x86: Introduce arch_sync_try_cmpxchg() 2023-10-09 18:14:25 +02:00
xtensa Linux 6.6-rc5 2023-10-09 18:09:23 +02:00
.gitignore
Kconfig Add x86 shadow stack support 2023-08-31 12:20:12 -07:00