When configuring a LUKS disk, we copy over the UUID from the LUKS header
into the new grub_cryptodisk_t structure via grub_memcpy(). As size
we mistakenly use the size of the grub_cryptodisk_t UUID field, which
is guaranteed to be strictly bigger than the LUKS UUID field we're
copying. As a result, the copy always goes out-of-bounds and copies some
garbage from other surrounding fields. During runtime, this isn't
noticed due to the fact that we always NUL-terminate the UUID and thus
never hit the trailing garbage.
Fix the issue by using the size of the local stripped UUID field.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
It appears to be possible to make a (possibly invalid) lvm PV with
a metadata size field that overflows our type when adding it to the
address we've allocated. Even if it doesn't, it may be possible to do so
with the math using the outcome of that as an operand. Check them both.
Signed-off-by: Peter Jones <pjones@redhat.com>
Signed-off-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
This attempts to fix the places where we do the following where
arithmetic_expr may include unvalidated data:
X = grub_malloc(arithmetic_expr);
It accomplishes this by doing the arithmetic ahead of time using grub_add(),
grub_sub(), grub_mul() and testing for overflow before proceeding.
Among other issues, this fixes:
- allocation of integer overflow in grub_video_bitmap_create()
reported by Chris Coulson,
- allocation of integer overflow in grub_png_decode_image_header()
reported by Chris Coulson,
- allocation of integer overflow in grub_squash_read_symlink()
reported by Chris Coulson,
- allocation of integer overflow in grub_ext2_read_symlink()
reported by Chris Coulson,
- allocation of integer overflow in read_section_as_string()
reported by Chris Coulson.
Fixes: CVE-2020-14309, CVE-2020-14310, CVE-2020-14311
Signed-off-by: Peter Jones <pjones@redhat.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
This modifies most of the places we do some form of:
X = malloc(Y * Z);
to use calloc(Y, Z) instead.
Among other issues, this fixes:
- allocation of integer overflow in grub_png_decode_image_header()
reported by Chris Coulson,
- allocation of integer overflow in luks_recover_key()
reported by Chris Coulson,
- allocation of integer overflow in grub_lvm_detect()
reported by Chris Coulson.
Fixes: CVE-2020-14308
Signed-off-by: Peter Jones <pjones@redhat.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
When decrypting a given keyslot, all error cases except for one set up
an error and return the error code. The only exception is when we try to
read the area key: instead of setting up an error message, we directly
print it via grub_dprintf().
Convert the outlier to use grub_error() to allow more uniform handling
of errors.
Signed-off-by: Patrick Steinhardt <ps@kps.im>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
We bumped into the build error while testing gcc-10 pre-release.
../../grub-core/disk/mdraid1x_linux.c: In function 'grub_mdraid_detect':
../../grub-core/disk/mdraid1x_linux.c:181:15: error: array subscript <unknown> is outside array bounds of 'grub_uint16_t[0]' {aka 'short unsigned int[0]'} [-Werror=array-bounds]
181 | (char *) &sb.dev_roles[grub_le_to_cpu32 (sb.dev_number)]
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../grub-core/disk/mdraid1x_linux.c:98:17: note: while referencing 'dev_roles'
98 | grub_uint16_t dev_roles[0]; /* Role in array, or 0xffff for a spare, or 0xfffe for faulty. */
| ^~~~~~~~~
../../grub-core/disk/mdraid1x_linux.c:127:33: note: defined here 'sb'
127 | struct grub_raid_super_1x sb;
| ^~
cc1: all warnings being treated as errors
Apparently gcc issues the warning when trying to access sb.dev_roles
array's member, since it is a zero length array as the last element of
struct grub_raid_super_1x that is allocated sparsely without extra
chunks for the trailing bits, so the warning looks legitimate in this
regard.
As the whole thing here is doing offset computation, it is undue to use
syntax that would imply array member access then take address from it
later. Instead we could accomplish the same thing through basic array
pointer arithmetic to pacify the warning.
Signed-off-by: Michael Chang <mchang@suse.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
The LVM cache logical volume is the logical volume consisting of the original
and the cache pool logical volume. The original is usually on a larger and
slower storage device while the cache pool is on a smaller and faster one. The
performance of the original volume can be improved by storing the frequently
used data on the cache pool to utilize the greater performance of faster
device.
The default cache mode "writethrough" ensures that any data written will be
stored both in the cache and on the origin LV, therefore grub can be straight
to read the original lv as no data loss is guarenteed.
The second cache mode is "writeback", which delays writing from the cache pool
back to the origin LV to have increased performance. The drawback is potential
data loss if losing the associated cache device.
During the boot time grub reads the LVM offline i.e. LVM volumes are not
activated and mounted, hence it should be fine to read directly from original
lv since all cached data should have been flushed back in the process of taking
it offline.
It is also not much helpful to the situation by adding fsync calls to the
install code. The fsync did not force to write back dirty cache to the original
device and rather it would update associated cache metadata to complete the
write transaction with the cache device. IOW the writes to cached blocks still
go only to the cache device.
To write back dirty cache, as LVM cache did not support dirty cache flush per
block range, there'no way to do it for file. On the other hand the "cleaner"
policy is implemented and can be used to write back "all" dirty blocks in a
cache, which effectively drain all dirty cache gradually to attain and last in
the "clean" state, which can be useful for shrinking or decommissioning a
cache. The result and effect is not what we are looking for here.
In conclusion, as it seems no way to enforce file writes to the original
device, grub may suffer from power failure as it cannot assemble the cache
device and read the dirty data from it. However since the case is only
applicable to writeback mode which is sensitive to data lost in nature, I'd
still like to propose my (relatively simple) patch and treat reading dirty
cache as improvement.
Signed-off-by: Michael Chang <mchang@suse.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Currently the string functions grub_strtol(), grub_strtoul(), and
grub_strtoull() don't declare the "end" pointer in such a way as to
require the pointer itself or the character array to be immutable to the
implementation, nor does the C standard do so in its similar functions,
though it does require us not to change any of it.
The typical declarations of these functions follow this pattern:
long
strtol(const char * restrict nptr, char ** restrict endptr, int base);
Much of the reason for this is historic, and a discussion of that
follows below, after the explanation of this change. (GRUB currently
does not include the "restrict" qualifiers, and we name the arguments a
bit differently.)
The implementation is semantically required to treat the character array
as immutable, but such accidental modifications aren't stopped by the
compiler, and the semantics for both the callers and the implementation
of these functions are sometimes also helped by adding that requirement.
This patch changes these declarations to follow this pattern instead:
long
strtol(const char * restrict nptr,
const char ** const restrict endptr,
int base);
This means that if any modification to these functions accidentally
introduces either an errant modification to the underlying character
array, or an accidental assignment to endptr rather than *endptr, the
compiler should generate an error. (The two uses of "restrict" in this
case basically mean strtol() isn't allowed to modify the character array
by going through *endptr, and endptr isn't allowed to point inside the
array.)
It also means the typical use case changes to:
char *s = ...;
const char *end;
long l;
l = strtol(s, &end, 10);
Or even:
const char *p = str;
while (p && *p) {
long l = strtol(p, &p, 10);
...
}
This fixes 26 places where we discard our attempts at treating the data
safely by doing:
const char *p = str;
long l;
l = strtol(p, (char **)&ptr, 10);
It also adds 5 places where we do:
char *p = str;
while (p && *p) {
long l = strtol(p, (const char ** const)&p, 10);
...
/* more calls that need p not to be pointer-to-const */
}
While moderately distasteful, this is a better problem to have.
With one minor exception, I have tested that all of this compiles
without relevant warnings or errors, and that /much/ of it behaves
correctly, with gcc 9 using 'gcc -W -Wall -Wextra'. The one exception
is the changes in grub-core/osdep/aros/hostdisk.c , which I have no idea
how to build.
Because the C standard defined type-qualifiers in a way that can be
confusing, in the past there's been a slow but fairly regular stream of
churn within our patches, which add and remove the const qualifier in many
of the users of these functions. This change should help avoid that in
the future, and in order to help ensure this, I've added an explanation
in misc.h so that when someone does get a compiler warning about a type
error, they have the fix at hand.
The reason we don't have "const" in these calls in the standard is
purely anachronistic: C78 (de facto) did not have type qualifiers in the
syntax, and the "const" type qualifier was added for C89 (I think; it
may have been later). strtol() appears to date from 4.3BSD in 1986,
which means it could not be added to those functions in the standard
without breaking compatibility, which is usually avoided.
The syntax chosen for type qualifiers is what has led to the churn
regarding usage of const, and is especially confusing on string
functions due to the lack of a string type. Quoting from C99, the
syntax is:
declarator:
pointer[opt] direct-declarator
direct-declarator:
identifier
( declarator )
direct-declarator [ type-qualifier-list[opt] assignment-expression[opt] ]
...
direct-declarator [ type-qualifier-list[opt] * ]
...
pointer:
* type-qualifier-list[opt]
* type-qualifier-list[opt] pointer
type-qualifier-list:
type-qualifier
type-qualifier-list type-qualifier
...
type-qualifier:
const
restrict
volatile
So the examples go like:
const char foo; // immutable object
const char *foo; // mutable pointer to object
char * const foo; // immutable pointer to mutable object
const char * const foo; // immutable pointer to immutable object
const char const * const foo; // XXX extra const keyword in the middle
const char * const * const foo; // immutable pointer to immutable
// pointer to immutable object
const char ** const foo; // immutable pointer to mutable pointer
// to immutable object
Making const left-associative for * and right-associative for everything
else may not have been the best choice ever, but here we are, and the
inevitable result is people using trying to use const (as they should!),
putting it at the wrong place, fighting with the compiler for a bit, and
then either removing it or typecasting something in a bad way. I won't
go into describing restrict, but its syntax has exactly the same issue
as with const.
Anyway, the last example above actually represents the *behavior* that's
required of strtol()-like functions, so that's our choice for the "end"
pointer.
Signed-off-by: Peter Jones <pjones@redhat.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
The debug message printed when decryption with a keyslot fails is
missing its trailing newline. Add it to avoid mangling it with
subsequent output.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
With cryptsetup 2.0, a new version of LUKS was introduced that breaks
compatibility with the previous version due to various reasons. GRUB
currently lacks any support for LUKS2, making it impossible to decrypt
disks encrypted with that version. This commit implements support for
this new format.
Note that LUKS1 and LUKS2 are quite different data formats. While they
do share the same disk signature in the first few bytes, representation
of encryption parameters is completely different between both versions.
While the former version one relied on a single binary header, only,
LUKS2 uses the binary header only in order to locate the actual metadata
which is encoded in JSON. Furthermore, the new data format is a lot more
complex to allow for more flexible setups, like e.g. having multiple
encrypted segments and other features that weren't previously possible.
Because of this, it was decided that it doesn't make sense to keep both
LUKS1 and LUKS2 support in the same module and instead to implement it
in two different modules luks and luks2.
The proposed support for LUKS2 is able to make use of the metadata to
decrypt such disks. Note though that in the current version, only the
PBKDF2 key derival function is supported. This can mostly attributed to
the fact that the libgcrypt library currently has no support for either
Argon2i or Argon2id, which are the remaining KDFs supported by LUKS2. It
wouldn't have been much of a problem to bundle those algorithms with
GRUB itself, but it was decided against that in order to keep down the
number of patches required for initial LUKS2 support. Adding it in the
future would be trivial, given that the code structure is already in
place.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
The luks module contains quite a lot of logic to parse cipher and
cipher-mode strings like aes-xts-plain64 into constants to apply them
to the grub_cryptodisk_t structure. This code will be required by the
upcoming luks2 module, as well, which is why this commit moves it into
its own function grub_cryptodisk_setcipher in the cryptodisk module.
While the strings are probably rather specific to the LUKS modules, it
certainly does make sense that the cryptodisk module houses code to set
up its own internal ciphers instead of hosting that code in the luks
module.
Except for necessary adjustments around error handling, this commit does
an exact move of the cipher configuration logic from luks.c to
cryptodisk.c. Any behavior changes are unintentional.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
While the AFSplitter code is currently used only by the luks module,
upcoming support for luks2 will add a second module that depends on it.
To avoid any linker errors when adding the code to both modules because
of duplicated symbols, this commit moves it into its own standalone
module afsplitter as a preparatory step.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Function grub_efi_find_last_device_path() may return NULL when called
from grub_efidisk_get_device_name().
Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Function grub_efi_find_last_device() path may return NULL when called
from is_child().
Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Function grub_efi_find_last_device_path() may return constant NULL when
called from find_parent_device().
Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Add a new disk driver called obdisk for IEEE1275 platforms. Currently
the only platform using this disk driver is SPARC, however other IEEE1275
platforms could start using it if they so choose. While the functionality
within the current IEEE1275 ofdisk driver may be suitable for PPC and x86, it
presented too many problems on SPARC hardware.
Within the old ofdisk, there is not a way to determine the true canonical
name for the disk. Within Open Boot, the same disk can have multiple names
but all reference the same disk. For example the same disk can be referenced
by its SAS WWN, using this form:
/pci@302/pci@2/pci@0/pci@17/LSI,sas@0/disk@w5000cca02f037d6d,0
It can also be referenced by its PHY identifier using this form:
/pci@302/pci@2/pci@0/pci@17/LSI,sas@0/disk@p0
It can also be referenced by its Target identifier using this form:
/pci@302/pci@2/pci@0/pci@17/LSI,sas@0/disk@0
Also, when the LUN=0, it is legal to omit the ,0 from the device name. So with
the disk above, before taking into account the device aliases, there are 6 ways
to reference the same disk.
Then it is possible to have 0 .. n device aliases all representing the same disk.
Within this new driver the true canonical name is determined using the the
IEEE1275 encode-unit and decode-unit commands when address_cells == 4. This
will determine the true single canonical name for the device so multiple ihandles
are not opened for the same device. This is what frequently happens with the old
ofdisk driver. With some devices when they are opened multiple times it causes
the entire system to hang.
Another problem solved with this driver is devices that do not have a device
alias can be booted and used within GRUB. Within the old ofdisk, this was not
possible, unless it was the original boot device. All devices behind a SAS
or SCSI parent can be found. Within the old ofdisk, finding these disks
relied on there being an alias defined. The alias requirement is not
necessary with this new driver. It can also find devices behind a parent
after they have been hot-plugged. This is something that is not possible
with the old ofdisk driver.
The old ofdisk driver also incorrectly assumes that the device pointing to by a
device alias is in its true canonical form. This assumption is never made with
this new driver.
Another issue solved with this driver is that it properly caches the ihandle
for all open devices. The old ofdisk tries to do this by caching the last
opened ihandle. However this does not work properly because the layer above
does not use a consistent device name for the same disk when calling into the
driver. This is because the upper layer uses the bootpath value returned within
/chosen, other times it uses the device alias, and other times it uses the
value within grub.cfg. It does not have a way to figure out that these devices
are the same disk. This is not a problem with this new driver.
Due to the way GRUB repeatedly opens and closes the same disk. Caching the
ihandle is important on SPARC. Without caching, some SAS devices can take
15 - 20 minutes to get to the GRUB menu. This ihandle caching is not possible
without correctly having the canonical disk name.
When available, this driver also tries to use the deblocker #blocks and
a way of determining the disk size.
Finally and probably most importantly, this new driver is also capable of
seeing all partitions on a GPT disk. With the old driver, the GPT
partition table can not be read and only the first partition on the disk
can be seen.
Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
uboot_disk_write() is currently lacking the write support
to storage devices because, historically, those devices did not
implement block_write() in U-Boot.
The solution has been tested using a patched U-Boot loading
and booting GRUB in a QEMU vexpress-a9 environment.
The disk write operations were triggered with GRUB's save_env
command.
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@gmail.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Let's provide file type info to the I/O layer. This way verifiers
framework and its users will be able to differentiate files and verify
only required ones.
This is preparatory patch.
Signed-off-by: Vladimir Serbinenko <phcoder@gmail.com>
Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Ross Philipson <ross.philipson@oracle.com>
The original code which handles the recovery of a RAID 6 disks array
assumes that all reads are multiple of 1 << GRUB_DISK_SECTOR_BITS and it
assumes that all the I/O is done via the struct grub_diskfilter_segment.
This is not true for the btrfs code. In order to reuse the native
grub_raid6_recover() code, it is modified to not call
grub_diskfilter_read_node() directly, but to call an handler passed
as an argument.
Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
This is a cryptographically signed message in MIME format.
Date: Thu, 9 Aug 2018 07:27:35 +0200
Currently, the GRUB payload for coreboot does not detect the Western
Digital hard disk WDC WD20EARS-60M AB51 connected to the ASRock E350M1,
as that takes over ten seconds to spin up.
```
disk/ahci.c:533: port 0, err: 0
disk/ahci.c:539: port 0, err: 0
disk/ahci.c:543: port 0, err: 0
disk/ahci.c:549: port 0, offset: 120, tfd:80, CMD: 6016
disk/ahci.c:552: port 0, err: 0
disk/ahci.c:563: port 0, offset: 120, tfd:80, CMD: 6016
disk/ahci.c:566: port: 0, err: 0
disk/ahci.c:593: port 0 is busy
disk/ahci.c:621: cleaning up failed devs
```
GRUB detects the drive, when either unloading the module *ahci*, and
then loading it again, or when doing a warm reset.
As the ten second time-out is too short, increase it to 32 seconds,
used by SeaBIOS. which detects the drive successfully.
The AHCI driver in libpayload uses 30 seconds, and that time-out was
added in commit 354066e1 (libpayload: ahci: Increase timeout for
signature reading) with the description below.
> We can't read the drives signature before it's ready, i.e. spun up.
> So set the timeout to the standard 30s. Also put a notice on the
> console, so the user knows why the signature reading failed.
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Rename grub_gpt_part_type to grub_gpt_part_guid and update grub_gpt_partentry
to use this type for both the partition type GUID string and the partition GUID
string entries. This change ensures that the two GUID fields are handled more
consistently and helps to simplify the changes needed to add Linux partition
GUID support.
Signed-off-by: Nicholas Vinson <nvinson234@gmail.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Check the error bits in the interrupt status register. According to the
AHCI 1.2 spec, "Interrupt sources that are disabled (‘0’) are still
reflected in the status registers.", so this should work even though
grub uses polling
This fixes the following problem on a Fujitsu E744 laptop:
Sometimes there is a very long delay (up to several minutes) when
booting from hard disk. It seems accessing the DVD drive (which has no
disk inserted) sometimes fails with some errors, which leads to each
access being stalled until the 20s timeout triggers. This seems to
happen when grub is trying to read filesystem/partition data.
The problem is that the command_issue bit that is checked in the loop is
only reset if the "HBA receives a FIS which clears the BSY, DRQ, and ERR
bits for the command", but the ERR bit is never cleared. Therefore
command_issue is never reset and grub waits for the timeout.
The relevant bit in our case is the Task File Error Status (TFES), which
is equivalent to the ERR bit 0 in tfd. But this patch also checks
the other error bits except for the "Interface non-fatal error status"
bit.
Signed-off-by: Stefan Fritsch <fritsch@genua.de>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
In util/getroot and efidisk slightly modify exitsing comment to mostly
retain it but still make GCC7 compliant with respect to fall through
annotation.
In grub-core/lib/xzembed/xz_dec_lzma2.c it adds same comments as
upstream.
In grub-core/tests/setjmp_tets.c declare functions as "noreturn" to
suppress GCC7 warning.
In grub-core/gnulib/regexec.c use new __attribute__, because existing
annotation is not recognized by GCC7 parser (which requires that comment
immediately precedes case statement).
Otherwise add FALLTHROUGH comment.
Closes: 50598
iPXE adds Simple File System Protocol to loaded image handle, as side
effect it also adds Block IO protocol (according to comments, to work
around some bugs in EDK2). GRUB assumes that every device with Block IO
is disk and skips network initialization entirely. But iPXE Block IO
implementation is just a stub which always fails for every operation
so cannot be used. Attempt to detect and skip such devices.
We are using media ID which iPXE sets to "iPXE" and block IO size in
hope that no real device would announce 1B block ...
Closes: 50518
Returned from the OpenProtocol operation, the grub_efi_block_io_media
structure contains the io_align field, specifying the minimum alignment
required for buffers used in any data transfers with the device.
Make grub_efidisk_readwrite() allocate a temporary buffer, aligned to
this boundary, if the buffer passed to it does not already meet the
requirements.
Also sanity check the io_align field in grub_efidisk_open() for
power-of-two-ness and bail if invalid.
Map EFI_NO_MEDIA to GRUB_ERR_OUT_OF_RANGE that is ignored by diskfilter. This
actually matches pretty close (we obviously attempt to read outside of media)
and avoids adding more error codes.
This affects only internally initiated scans. If read/write from removable is
explicitly requested, we still return an error and text explanation is more
clear for user than generic error.
Reported and tested by Andreas Loew <Andreas.Loew@gmx.net>
It is not possible to configure encrypted containers on multiple partitions of
the same disk; after the first one all subsequent fail with
disk/cryptodisk.c:978: already mounted as crypto0
Store partition offset in cryptomount descriptor to distinguish between them.
Currently, some messages cannot be mapped to the port they belong to as
the port number is missing from the output. So add `port: n` to the
debug messages.
Run the command below
$ git grep -l schedulded | xargs sed -i 's/schedulded/scheduled/g'
and revert the change in `ChangeLog-2015`.
Including "miscellaneous" spelling fix noted by richardvoigt@gmail.com