## nsenter

The `nsenter` package registers a special init constructor that is called before
the Go runtime has a chance to boot. This gives us the ability to `setns` into
existing namespaces and avoids the issues that the Go runtime has with multiple
threads. The constructor is called whenever this package is imported into your
Go application.

The `nsenter` package uses [cgo](https://golang.org/cmd/cgo/) via `import "C"`.
In cgo, if the import of "C" is immediately preceded by a comment, that comment,
called the preamble, is used as a header when compiling the C parts of the package.
So whenever a binary that imports package `nsenter` starts, the C function
`nsexec()` is called. Package `nsenter` is currently imported only in
`main_unix.go`, so that C code runs every time before we call `cmd.Start` on Linux.

Because `nsexec()` must run before the Go runtime in order to use Linux kernel
namespaces, you must `import` this library into a package if you plan to use
`libcontainer` directly. Otherwise Go will not execute the `nsexec()`
constructor, which means that the re-exec will not cause the namespaces to be
joined. You can import it like this:

```go
import _ "github.com/opencontainers/runc/libcontainer/nsenter"
```

`nsexec()` first gets the file descriptor number of the init pipe from the
environment variable `_LIBCONTAINER_INITPIPE` (the pipe was opened by the
parent and kept open across the fork-exec of the `nsexec()` init process).
The init pipe is used to read bootstrap data (namespace paths, clone flags,
uid and gid mappings, and the console path) from the parent process.
`nsexec()` then calls `setns(2)` to join the namespaces provided in the
bootstrap data (if available), `clone(2)`s a child process with the provided
clone flags, updates the user and group ID mappings, does some further
miscellaneous setup steps, and then sends the PID of the child process to the
parent of the `nsexec()` "caller". Finally, the parent `nsexec()` exits and
the child `nsexec()` process returns to allow the Go runtime to take over.

NOTE: We do both `setns(2)` and `clone(2)` even if we don't have any
`CLONE_NEW*` clone flags because we must fork a new process in order to
enter the PID namespace.
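
The first step described above, locating the init pipe through
`_LIBCONTAINER_INITPIPE`, can be sketched in Go for illustration. In `nsenter`
itself this happens in C inside `nsexec()`; the helper `parseInitPipeFD` and
its error handling are assumptions made for this example, not part of the
package.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// initPipeEnv is the environment variable named in the text above.
const initPipeEnv = "_LIBCONTAINER_INITPIPE"

// parseInitPipeFD is a hypothetical helper that converts the value of the
// environment variable into a file descriptor number.
func parseInitPipeFD(value string) (int, error) {
	if value == "" {
		return -1, fmt.Errorf("%s is not set", initPipeEnv)
	}
	fd, err := strconv.Atoi(value)
	if err != nil {
		return -1, fmt.Errorf("invalid %s value %q: %w", initPipeEnv, value, err)
	}
	if fd < 0 {
		return -1, fmt.Errorf("invalid %s value %q: negative descriptor", initPipeEnv, value)
	}
	return fd, nil
}

func main() {
	fd, err := parseInitPipeFD(os.Getenv(initPipeEnv))
	if err != nil {
		fmt.Println("no init pipe:", err)
		return
	}
	// Wrap the inherited descriptor so that bootstrap data could be read from it.
	pipe := os.NewFile(uintptr(fd), "init-pipe")
	defer pipe.Close()
	fmt.Println("init pipe fd:", fd)
}
```

A parent process would typically keep such a descriptor open in the child by
passing the pipe through `ExtraFiles` on `exec.Cmd` and setting the environment
variable to the resulting descriptor number.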