grub/grub-core/commands/regexp.c
Peter Jones d5a32255de misc: Make grub_strtol() "end" pointers have safer const qualifiers
Currently the string functions grub_strtol(), grub_strtoul(), and
grub_strtoull() don't declare the "end" pointer in such a way as to
require the pointer itself or the character array to be immutable to the
implementation, nor does the C standard do so in its similar functions,
though it does require us not to change any of it.

The typical declarations of these functions follow this pattern:

long
strtol(const char * restrict nptr, char ** restrict endptr, int base);

Much of the reason for this is historic, and a discussion of that
follows below, after the explanation of this change.  (GRUB currently
does not include the "restrict" qualifiers, and we name the arguments a
bit differently.)

The implementation is semantically required to treat the character array
as immutable, but such accidental modifications aren't stopped by the
compiler, and the semantics for both the callers and the implementation
of these functions are sometimes also helped by adding that requirement.

This patch changes these declarations to follow this pattern instead:

long
strtol(const char * restrict nptr,
       const char ** const restrict endptr,
       int base);

This means that if any modification to these functions accidentally
introduces either an errant modification to the underlying character
array, or an accidental assignment to endptr rather than *endptr, the
compiler should generate an error.  (The two uses of "restrict" in this
case basically mean strtol() isn't allowed to modify the character array
by going through *endptr, and endptr isn't allowed to point inside the
array.)

It also means the typical use case changes to:

  char *s = ...;
  const char *end;
  long l;

  l = strtol(s, &end, 10);

Or even:

  const char *p = str;
  while (p && *p) {
	  long l = strtol(p, &p, 10);
	  ...
  }

This fixes 26 places where we discard our attempts at treating the data
safely by doing:

  const char *p = str;
  long l;

  l = strtol(p, (char **)&ptr, 10);

It also adds 5 places where we do:

  char *p = str;
  while (p && *p) {
	  long l = strtol(p, (const char ** const)&p, 10);
	  ...
	  /* more calls that need p not to be pointer-to-const */
  }

While moderately distasteful, this is a better problem to have.

With one minor exception, I have tested that all of this compiles
without relevant warnings or errors, and that /much/ of it behaves
correctly, with gcc 9 using 'gcc -W -Wall -Wextra'.  The one exception
is the changes in grub-core/osdep/aros/hostdisk.c , which I have no idea
how to build.

Because the C standard defined type-qualifiers in a way that can be
confusing, in the past there's been a slow but fairly regular stream of
churn within our patches, which add and remove the const qualifier in many
of the users of these functions.  This change should help avoid that in
the future, and in order to help ensure this, I've added an explanation
in misc.h so that when someone does get a compiler warning about a type
error, they have the fix at hand.

The reason we don't have "const" in these calls in the standard is
purely anachronistic: C78 (de facto) did not have type qualifiers in the
syntax, and the "const" type qualifier was added for C89 (I think; it
may have been later).  strtol() appears to date from 4.3BSD in 1986,
which means it could not be added to those functions in the standard
without breaking compatibility, which is usually avoided.

The syntax chosen for type qualifiers is what has led to the churn
regarding usage of const, and is especially confusing on string
functions due to the lack of a string type.  Quoting from C99, the
syntax is:

 declarator:
  pointer[opt] direct-declarator
 direct-declarator:
  identifier
  ( declarator )
  direct-declarator [ type-qualifier-list[opt] assignment-expression[opt] ]
  ...
  direct-declarator [ type-qualifier-list[opt] * ]
  ...
 pointer:
  * type-qualifier-list[opt]
  * type-qualifier-list[opt] pointer
 type-qualifier-list:
  type-qualifier
  type-qualifier-list type-qualifier
 ...
 type-qualifier:
  const
  restrict
  volatile

So the examples go like:

const char foo;			// immutable object
const char *foo;		// mutable pointer to object
char * const foo;		// immutable pointer to mutable object
const char * const foo;		// immutable pointer to immutable object
const char const * const foo; 	// XXX extra const keyword in the middle
const char * const * const foo; // immutable pointer to immutable
				//   pointer to immutable object
const char ** const foo;	// immutable pointer to mutable pointer
				//   to immutable object

Making const left-associative for * and right-associative for everything
else may not have been the best choice ever, but here we are, and the
inevitable result is people using trying to use const (as they should!),
putting it at the wrong place, fighting with the compiler for a bit, and
then either removing it or typecasting something in a bad way.  I won't
go into describing restrict, but its syntax has exactly the same issue
as with const.

Anyway, the last example above actually represents the *behavior* that's
required of strtol()-like functions, so that's our choice for the "end"
pointer.

Signed-off-by: Peter Jones <pjones@redhat.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
2020-02-28 12:41:29 +01:00

168 lines
4.2 KiB
C
Raw Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

/* regexp.c -- The regexp command. */
/*
* GRUB -- GRand Unified Bootloader
* Copyright (C) 2005,2007 Free Software Foundation, Inc.
*
* GRUB is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* GRUB is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with GRUB. If not, see <http://www.gnu.org/licenses/>.
*/
#include <grub/dl.h>
#include <grub/misc.h>
#include <grub/mm.h>
#include <grub/err.h>
#include <grub/env.h>
#include <grub/extcmd.h>
#include <grub/i18n.h>
#include <grub/script_sh.h>
#include <regex.h>
GRUB_MOD_LICENSE ("GPLv3+");
static const struct grub_arg_option options[] =
{
{ "set", 's', GRUB_ARG_OPTION_REPEATABLE,
/* TRANSLATORS: in regexp you can mark some
groups with parentheses. These groups are
then numbered and you can save some of
them in variables. In other programs
those components aree often referenced with
back slash, e.g. \1. Compare
sed -e 's,\([a-z][a-z]*\),lowercase=\1,g'
The whole matching component is saved in VARNAME, not its number.
*/
N_("Store matched component NUMBER in VARNAME."),
N_("[NUMBER:]VARNAME"), ARG_TYPE_STRING },
{ 0, 0, 0, 0, 0, 0 }
};
static grub_err_t
setvar (char *str, char *v, regmatch_t *m)
{
char ch;
grub_err_t err;
ch = str[m->rm_eo];
str[m->rm_eo] = '\0';
err = grub_env_set (v, str + m->rm_so);
str[m->rm_eo] = ch;
return err;
}
static grub_err_t
set_matches (char **varnames, char *str, grub_size_t nmatches,
regmatch_t *matches)
{
int i;
char *p;
const char * q;
grub_err_t err;
unsigned long j;
for (i = 0; varnames && varnames[i]; i++)
{
err = GRUB_ERR_NONE;
p = grub_strchr (varnames[i], ':');
if (! p)
{
/* varname w/o index defaults to 1 */
if (nmatches < 2 || matches[1].rm_so == -1)
grub_env_unset (varnames[i]);
else
err = setvar (str, varnames[i], &matches[1]);
}
else
{
j = grub_strtoul (varnames[i], &q, 10);
if (q != p)
return grub_error (GRUB_ERR_BAD_ARGUMENT,
"invalid variable name format %s", varnames[i]);
if (nmatches <= j || matches[j].rm_so == -1)
grub_env_unset (p + 1);
else
err = setvar (str, p + 1, &matches[j]);
}
if (err != GRUB_ERR_NONE)
return err;
}
return GRUB_ERR_NONE;
}
static grub_err_t
grub_cmd_regexp (grub_extcmd_context_t ctxt, int argc, char **args)
{
regex_t regex;
int ret;
grub_size_t s;
char *comperr;
grub_err_t err;
regmatch_t *matches = 0;
if (argc != 2)
return grub_error (GRUB_ERR_BAD_ARGUMENT, N_("two arguments expected"));
ret = regcomp (&regex, args[0], REG_EXTENDED);
if (ret)
goto fail;
matches = grub_zalloc (sizeof (*matches) * (regex.re_nsub + 1));
if (! matches)
goto fail;
ret = regexec (&regex, args[1], regex.re_nsub + 1, matches, 0);
if (!ret)
{
err = set_matches (ctxt->state[0].args, args[1],
regex.re_nsub + 1, matches);
regfree (&regex);
grub_free (matches);
return err;
}
fail:
grub_free (matches);
s = regerror (ret, &regex, 0, 0);
comperr = grub_malloc (s);
if (!comperr)
{
regfree (&regex);
return grub_errno;
}
regerror (ret, &regex, comperr, s);
err = grub_error (GRUB_ERR_TEST_FAILURE, "%s", comperr);
regfree (&regex);
grub_free (comperr);
return err;
}
static grub_extcmd_t cmd;
GRUB_MOD_INIT(regexp)
{
cmd = grub_register_extcmd ("regexp", grub_cmd_regexp, 0,
/* TRANSLATORS: This are two arguments. So it's
two separate units to translate and pay
attention not to reverse them. */
N_("REGEXP STRING"),
N_("Test if REGEXP matches STRING."), options);
/* Setup GRUB script wildcard translator. */
grub_wildcard_translator = &grub_filename_translator;
}
GRUB_MOD_FINI(regexp)
{
grub_unregister_extcmd (cmd);
grub_wildcard_translator = 0;
}