Reproduce and Verify Filesystems

Vincent Batts  @vbatts

$> finger $(whoami)
Login: vbatts                           Name: Vincent Batts
Directory: /home/vbatts                 Shell: /bin/bash
Such mail.
Plan:
OHMAN
$> id -Gn
devel opencontainers docker appc redhat golang slackware

Agenda

  • Packaging
  • Content Addressability
  • Compression!
  • Reproducible Archives
  • Verify at rest filesystems

Packaging

tar archives

Slackware packages (tar(1) archives)

Debian *.deb (ar(1) archive of tar(1) archives)

Red Hat *.rpm (custom key/value binary and cpio(1))

Java *.jar and *.war (zip(1) archive)

Ruby *.gem (tar(1) archive of tar(1) archives)

Container Images (tar(1) archives)

Content Addressability

Opaque Object storage

changed object = new object

cryptographic assurance
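
The three points above can be illustrated with a minimal sketch of a content-addressed store. The `ContentStore` class and its `put`/`get` methods are hypothetical names made up for this illustration, and SHA-256 is an assumed choice of digest; real object stores differ in many details.

```python
import hashlib


class ContentStore:
    """Toy in-memory object store keyed by the SHA-256 of each object's bytes.

    Hypothetical illustration only; not the API of any real storage system.
    """

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content, never chosen by the caller.
        key = hashlib.sha256(data).hexdigest()
        self._objects[key] = data
        return key

    def get(self, key: str) -> bytes:
        data = self._objects[key]
        # Re-hash on read: the address doubles as the integrity check.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("object does not match its address")
        return data


store = ContentStore()
a = store.put(b"hello\n")
b = store.put(b"hello\n")   # same bytes, same address
c = store.put(b"hello!\n")  # any change yields a new object
print(a == b, a == c)
```

Because the key is computed from the bytes, the store never needs to interpret what an object contains, and a reader can check any object against its own address.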

compression!

inflate/deflate (RFC1951)

same objects, but variation in compression

Gzip (RFC1952)

`gzip` vs Golang `compress/gzip` vs Zlib

ideally compress for transfer and storage, but not for identity
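
A rough sketch of why compressed bytes make a poor identity: the same payload compressed at different levels decompresses to identical content, while the compressed blobs themselves differ. This assumes Python 3.8 or newer for the `mtime` argument of `gzip.compress`; the payload and the variable names are invented for the illustration.

```python
import gzip
import hashlib


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Compressible sample data, invented for this sketch.
payload = b"".join(b"line %d of some compressible text\n" % i for i in range(2000))

# mtime=0 plays the role of `gzip -n`: no timestamp is stored in the header.
blobs = {
    level: gzip.compress(payload, compresslevel=level, mtime=0)
    for level in (1, 6, 9)
}

# Identity of the compressed bytes versus identity of the content.
compressed_ids = {level: sha256(blob) for level, blob in blobs.items()}
content_ids = {level: sha256(gzip.decompress(blob)) for level, blob in blobs.items()}

for level in blobs:
    print(level, len(blobs[level]), compressed_ids[level][:12], content_ids[level][:12])
```

Hashing the uncompressed content keeps the identifier stable even if the compressor, its version, or its settings change later.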

compression!

#!/bin/sh
# Create 2 MiB of random data, compress it four ways, and record the checksums.
dd if=/dev/urandom of=rando.img bs=1M count=2
cat rando.img | gzip -n > rando.img.gz
cat rando.img | gzip -n -9 > rando.img.9.gz
cat rando.img | xz > rando.img.xz
cat rando.img | xz -9 > rando.img.9.xz
sha1sum rando.img* > SHA1

# Compress again with the same tools and options, then verify against SHA1.
cat rando.img | gzip -n > rando.img.gz
cat rando.img | gzip -n -9 > rando.img.9.gz
cat rando.img | xz > rando.img.xz
cat rando.img | xz -9 > rando.img.9.xz
sha1sum -c ./SHA1

compression!

#!/usr/bin/env ruby

require 'zlib'
include Zlib

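# Compress the file named on the command line to <name>.gz.
# HUFFMAN_ONLY is a non-default strategy, so the output generally differs
# from what gzip(1) writes for the same input.
path = ARGV.first
GzipWriter.open(path + '.gz', DEFAULT_COMPRESSION, HUFFMAN_ONLY) do |gz|
  gz.write(File.binread(path))
end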

compression!

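package main

import (
        "compress/gzip"
        "io"
        "os"
)

// Compress the file named by the first argument to <name>.gz.
func main() {
        input, err := os.Open(os.Args[1])
        if err != nil {
                println(err.Error())
                os.Exit(1)
        }
        defer input.Close()
        output, err := os.Create(os.Args[1] + ".gz")
        if err != nil {
                println(err.Error())
                os.Exit(1)
        }
        defer output.Close()
        gz := gzip.NewWriter(output)
        if _, err := io.Copy(gz, input); err != nil {
                println(err.Error())
                os.Exit(1)
        }
        // Close flushes buffered data and writes the gzip trailer;
        // without it the output file is truncated.
        if err := gz.Close(); err != nil {
                println(err.Error())
                os.Exit(1)
        }
}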

reproducible archive

reproducible-builds.org

processed checksum of tar archive (see deprecated Docker TarSum)

keep around the original *.tar?

re-assemble the original *.tar
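
To make the idea of a "processed checksum" concrete, here is a minimal sketch that hashes selected header fields and file payloads rather than the raw archive bytes. It is not the Docker TarSum algorithm; the field selection, the `processed_checksum` and `build_tar` names, and the sample files are all assumptions made for illustration.

```python
import hashlib
import io
import tarfile


def build_tar(files, fmt):
    """Build an in-memory tar from (name, bytes) pairs in the given format."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tf:
        for name, data in files:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mode = 0o644
            tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def processed_checksum(tar_bytes):
    """Hash chosen header fields and payloads, ignoring the on-disk encoding."""
    digest = hashlib.sha256()
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as tf:
        for m in sorted(tf.getmembers(), key=lambda m: m.name):
            fields = (m.name, m.mode, m.uid, m.gid, m.size,
                      int(m.mtime), m.type, m.linkname)
            digest.update(repr(fields).encode())
            if m.isfile():
                digest.update(hashlib.sha256(tf.extractfile(m).read()).digest())
    return digest.hexdigest()


files = [("a.sh", b"echo a\n"), ("b.sh", b"echo b\n")]
gnu = build_tar(files, tarfile.GNU_FORMAT)
pax = build_tar(files, tarfile.PAX_FORMAT)

# Raw digests differ because the header encodings differ.
print(hashlib.sha256(gnu).hexdigest() == hashlib.sha256(pax).hexdigest())
# The processed checksum only sees the fields chosen above.
print(processed_checksum(gnu) == processed_checksum(pax))
```

The trade-off is that such a checksum identifies the logical content only; it cannot tell you whether you hold the exact bytes that were originally published, which is what motivates re-assembling the original archive instead.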

reproducible archive

go install github.com/vbatts/tar-split/cmd/tar-split@latest

tar cf demo.tar *.sh
sha1sum demo.tar | tee SHA1

tar-split disasm --no-stdout ./demo.tar
ls -lh tar-data.json.gz

rm -f demo.tar
tar-split asm --output demo.tar --path .
sha1sum -c ./SHA1
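
The general idea behind the demo, keeping the non-payload bytes of the archive (headers, padding) separately from the file payloads so the original can be rebuilt byte for byte, can be sketched as follows. This is a simplified illustration and not tar-split's actual format or code; `disassemble`, `assemble`, and the sample files are hypothetical, and the sketch assumes unique member names and no sparse files.

```python
import hashlib
import io
import tarfile


def disassemble(tar_bytes):
    """Split an archive into raw (non-payload) segments and file payloads."""
    segments = []   # ("raw", bytes) or ("file", member name)
    payloads = {}   # member name -> payload bytes (stands in for files on disk)
    pos = 0
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as tf:
        for m in tf.getmembers():
            if not m.isfile() or m.size == 0:
                continue  # no payload; its header stays in a raw segment
            start, end = m.offset_data, m.offset_data + m.size
            segments.append(("raw", tar_bytes[pos:start]))
            segments.append(("file", m.name))
            payloads[m.name] = tar_bytes[start:end]
            pos = end
    # Trailing padding and the end-of-archive blocks.
    segments.append(("raw", tar_bytes[pos:]))
    return segments, payloads


def assemble(segments, payloads):
    """Interleave raw segments and payloads back into the original bytes."""
    return b"".join(
        value if kind == "raw" else payloads[value] for kind, value in segments
    )


buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tf:
    for name, data in [("one.sh", b"echo one\n"), ("two.sh", b"echo two\n")]:
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tf.addfile(info, io.BytesIO(data))
original = buf.getvalue()

segments, payloads = disassemble(original)
rebuilt = assemble(segments, payloads)
print(hashlib.sha1(rebuilt).hexdigest() == hashlib.sha1(original).hexdigest())
```

Since the payloads can come from the extracted files, only the small raw segments need to be stored alongside the extracted tree in order to reproduce the archive and its checksum.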

Verify at rest Filesystems

Regardless of transport (*.tar archive, rsync, BitTorrent, IPFS, etc.), verify the resulting filesystem

`rpm -qV <package>` functionality

Future hopes could be IMA/EVM

Passive validation of directory hierarchies

BSD mtree(8)
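
As a rough sketch of what passive validation of a directory hierarchy involves, the following records a few attributes per file and later reports which ones changed. It does not read or write the mtree format; the `manifest` and `check` functions, the keyword names, and the demo file are assumptions for illustration, and only regular files are considered.

```python
import hashlib
import os
import tempfile


def manifest(root):
    """Record size, mode, mtime and SHA-256 for every regular file under root."""
    entries = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                continue  # symlinks are skipped in this sketch
            st = os.lstat(full)
            with open(full, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            entries[os.path.relpath(full, root)] = {
                "size": st.st_size,
                "mode": oct(st.st_mode & 0o7777),
                "mtime": st.st_mtime_ns,
                "sha256digest": digest,
            }
    return entries


def check(root, expected):
    """Return (path, problem) pairs; an empty list means the tree matches."""
    actual = manifest(root)
    problems = []
    for path in sorted(set(expected) | set(actual)):
        if path not in actual:
            problems.append((path, "missing"))
        elif path not in expected:
            problems.append((path, "extra"))
        else:
            changed = sorted(
                key for key in expected[path] if expected[path][key] != actual[path][key]
            )
            if changed:
                problems.append((path, ",".join(changed)))
    return problems


root = tempfile.mkdtemp()
path = os.path.join(root, "demo.sh")
with open(path, "w") as f:
    f.write("echo demo\n")
os.utime(path, ns=(10**9, 10**9))

expected = manifest(root)
clean = check(root, expected)
os.utime(path, ns=(2 * 10**9, 2 * 10**9))  # same effect as `touch` on the file
after_touch = check(root, expected)
print(clean, after_touch)
```

The check never modifies the tree; it only reports differences, leaving the decision about what to do with them to the caller.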

Verify at rest Filesystems

#!/usr/bin/env python

import libarchive

with libarchive.file_writer('../demo.mtree', 'mtree') as a:
    a.add_files('./')

Requires the libarchive and python-libarchive-c packages.

NOTICE: libarchive uses an older mtree format

Verify at rest Filesystems

mtree -c -p ./ -K sha256digest | tee /tmp/demo.mtree

mtree -f /tmp/demo.mtree -p ./
echo $?

read

touch $0 # SCANDALOUS
mtree -f /tmp/demo.mtree -p ./

Verify at rest Filesystems

go install github.com/vbatts/go-mtree/cmd/gomtree@latest
gomtree -c -p ./ -K sha256digest | tee /tmp/demo.mtree

gomtree -f /tmp/demo.mtree -p ./
echo $?

read

touch $0 # SCANDALOUS
gomtree -f /tmp/demo.mtree -p ./

Directory Path

Verify at rest Filesystems

tar cf /tmp/demo.tar .
gomtree -c -T /tmp/demo.tar -K sha256digest | tee /tmp/demo.mtree

gomtree -f /tmp/demo.mtree -T /tmp/demo.tar
echo $?

read

gomtree -f /tmp/demo.mtree -p ./
echo $?


touch $0 # SCANDALOUS
gomtree -f /tmp/demo.mtree -p ./

Tar Archive Support
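
The point of the tar demo is that one manifest can describe both an archive and a directory on disk. A minimal sketch of that comparison, under the assumption that only size and SHA-256 are compared, might look like this. It is not how gomtree is implemented; `manifest_from_dir`, `manifest_from_tar`, and the sample tree are hypothetical.

```python
import hashlib
import os
import tarfile
import tempfile


def manifest_from_dir(root):
    """Map relative path -> (size, sha256) for regular files under root."""
    entries = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as f:
                data = f.read()
            entries[os.path.relpath(full, root)] = (
                len(data), hashlib.sha256(data).hexdigest())
    return entries


def manifest_from_tar(tar_path):
    """Build the same mapping from the members of a tar archive."""
    entries = {}
    with tarfile.open(tar_path) as tf:
        for m in tf.getmembers():
            if m.isfile():
                data = tf.extractfile(m).read()
                # normpath turns "./bin/demo.sh" into "bin/demo.sh".
                entries[os.path.normpath(m.name)] = (
                    m.size, hashlib.sha256(data).hexdigest())
    return entries


workdir = tempfile.mkdtemp()
tree = os.path.join(workdir, "tree")
os.makedirs(os.path.join(tree, "bin"))
with open(os.path.join(tree, "bin", "demo.sh"), "wb") as f:
    f.write(b"echo demo\n")
with open(os.path.join(tree, "README"), "wb") as f:
    f.write(b"read me\n")

# Roughly what `tar cf demo.tar .` does from inside the tree.
tar_path = os.path.join(workdir, "demo.tar")
with tarfile.open(tar_path, "w") as tf:
    tf.add(tree, arcname=".")

print(manifest_from_tar(tar_path) == manifest_from_dir(tree))
```

Timestamps and ownership are left out here on purpose, since those are the attributes most likely to differ between an archive and its extracted copy.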

Call to Action

If you need to store archives both whole and extracted, check out github.com/vbatts/tar-split

If you need to verify, or restore, a filesystem regardless of how it was distributed, check out github.com/vbatts/go-mtree or other mtree projects

Thank You!

VINCENT BATTS

@VBATTS | VBATTS@REDHAT.COM