x86/asm: Use CC_SET/CC_OUT in percpu_cmpxchg8b_double() to micro-optimize code generation

Use CC_SET(z)/CC_OUT(z) instead of explicit SETZ instruction.

Using these two defines, the compiler that supports generation of
condition code outputs from inline assembly flags generates e.g.:

  cmpxchg8b %fs:(%esi)
  jne    172255 <__kmalloc+0x65>

instead of:

  cmpxchg8b %fs:(%esi)
  sete   %al
  test   %al,%al
  je     172255 <__kmalloc+0x65>

Note that older compilers now generate:

  cmpxchg8b %fs:(%esi)
  sete   %cl
  test   %cl,%cl
  je     173a85 <__kmalloc+0x65>

since we have to mark that cmpxchg8b instruction outputs to %eax
register and this way clobbers the value in the register.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/20180605163910.13015-1-ubizjak@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This commit is contained in:
Uros Bizjak 2018-06-05 18:39:09 +02:00 committed by Ingo Molnar
parent 1abd8a8f39
commit 1966c5e5bd

View file

@ -450,9 +450,10 @@ do { \
bool __ret; \
typeof(pcp1) __o1 = (o1), __n1 = (n1); \
typeof(pcp2) __o2 = (o2), __n2 = (n2); \
asm volatile("cmpxchg8b "__percpu_arg(1)"\n\tsetz %0\n\t" \
: "=a" (__ret), "+m" (pcp1), "+m" (pcp2), "+d" (__o2) \
: "b" (__n1), "c" (__n2), "a" (__o1)); \
asm volatile("cmpxchg8b "__percpu_arg(1) \
CC_SET(z) \
: CC_OUT(z) (__ret), "+m" (pcp1), "+m" (pcp2), "+a" (__o1), "+d" (__o2) \
: "b" (__n1), "c" (__n2)); \
__ret; \
})