linux-stable/arch/microblaze/lib/memset.c
Michal Simek 95fee37be4 microblaze: Do loop unrolling for optimized memset implementation
Align implementation with memcpy and memmove where also remaining bytes are
copied via final switch case instead of using simple implementations which
loop. But this alignment has much stronger reason and definitely aligning
implementation is not the key point here. It is just good to have in mind
that the same technique is used already there.

In GCC 10, now -ftree-loop-distribute-patterns optimization is on at O2.
This optimization causes GCC to convert the while loop in memset.c into a
call to memset.
So this optimization is transforming a loop in a memset/memcpy into a call
to the function itself. This makes the memset implementation as recursive.
"-freestanding" option will disable the built-in library function but it
has been added in generic library implementation.

In default microblaze kernel defconfig we have CONFIG_OPT_LIB_FUNCTION
enabled so it will always pick optimized version of memset which is target
specific so we are replacing the while() loop with switch case to avoid
recursive memset call.

Issue with freestanding was already discussed in connection to commit
33d0f96ffd ("lib/string.c: Use freestanding environment") and also this
is topic in glibc and gcc.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56888
http://patchwork.ozlabs.org/project/glibc/patch/20191121021040.14554-1-sandra@codesourcery.com/

Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Mahesh Bodapati <mbodapat@xilinx.com>
Link: https://lore.kernel.org/r/10a432e269a6d3349cf458e4f5792522779cba0d.1645797329.git.michal.simek@xilinx.com
2022-04-21 10:54:20 +02:00

94 lines
2.2 KiB
C

/*
* Copyright (C) 2008-2009 Michal Simek <monstr@monstr.eu>
* Copyright (C) 2008-2009 PetaLogix
* Copyright (C) 2007 John Williams
*
* Reasonably optimised generic C-code for memset on Microblaze
* This is generic C code to do efficient, alignment-aware memcpy.
*
* It is based on demo code originally Copyright 2001 by Intel Corp, taken from
* http://www.embedded.com/showArticle.jhtml?articleID=19205567
*
* Attempts were made, unsuccessfully, to contact the original
* author of this code (Michael Morrow, Intel). Below is the original
* copyright notice.
*
* This software has been developed by Intel Corporation.
* Intel specifically disclaims all warranties, express or
* implied, and all liability, including consequential and
* other indirect damages, for the use of this program, including
* liability for infringement of any proprietary rights,
* and including the warranties of merchantability and fitness
* for a particular purpose. Intel does not assume any
* responsibility for and errors which may appear in this program
* not any responsibility to update it.
*/
#include <linux/export.h>
#include <linux/types.h>
#include <linux/stddef.h>
#include <linux/compiler.h>
#include <linux/string.h>
#ifdef CONFIG_OPT_LIB_FUNCTION
void *memset(void *v_src, int c, __kernel_size_t n)
{
char *src = v_src;
uint32_t *i_src;
uint32_t w32 = 0;
/* Truncate c to 8 bits */
c = (c & 0xFF);
if (unlikely(c)) {
/* Make a repeating word out of it */
w32 = c;
w32 |= w32 << 8;
w32 |= w32 << 16;
}
if (likely(n >= 4)) {
/* Align the destination to a word boundary */
/* This is done in an endian independent manner */
switch ((unsigned) src & 3) {
case 1:
*src++ = c;
--n;
fallthrough;
case 2:
*src++ = c;
--n;
fallthrough;
case 3:
*src++ = c;
--n;
}
i_src = (void *)src;
/* Do as many full-word copies as we can */
for (; n >= 4; n -= 4)
*i_src++ = w32;
src = (void *)i_src;
}
/* Simple, byte oriented memset or the rest of count. */
switch (n) {
case 3:
*src++ = c;
fallthrough;
case 2:
*src++ = c;
fallthrough;
case 1:
*src++ = c;
break;
default:
break;
}
return v_src;
}
EXPORT_SYMBOL(memset);
#endif /* CONFIG_OPT_LIB_FUNCTION */