Build Artifacts

A primary output of the build system is a set of archive slices of key components. In typical CI flows, these are archived into tarballs and made available to successor jobs as artifacts. Successor jobs can use them for a variety of things:

  • Bootstrapping multi-stage builds with pre-built components.
  • Driving successor platform/channel specific packaging flows (e.g. generating debs, rpms, wheels, etc.).
  • Direct inputs to standalone build jobs.
  • Distribution of flat archives to users who benefit from this.

Generally, each artifact is an extract of the top-level build tree, containing a subset of leaf project stage/ directories. Simply extracting the artifacts over a build directory is therefore sufficient in successor jobs to make that slice of projects "pre-built". In this way, a monolithic build can be broken down and parallelized at sub-project granularity if desired.

Artifact Layout

After each sub-project build stage, corresponding artifact sub-directories will be populated in the build/artifacts directory. As a visual aid, consider the directory listing of the base, sysdeps, and rand artifacts:

$ ls -1d artifacts/{base_*,sysdeps_*,rand_*}
artifacts/base_dbg_generic
artifacts/base_dev_generic
artifacts/base_doc_generic
artifacts/base_lib_generic
artifacts/base_run_generic
artifacts/base_test_generic
artifacts/rand_dbg_gfx1100
artifacts/rand_dev_gfx1100
artifacts/rand_doc_gfx1100
artifacts/rand_lib_gfx1100
artifacts/rand_run_gfx1100
artifacts/sysdeps_dev_generic
artifacts/sysdeps_doc_generic
artifacts/sysdeps_lib_generic
artifacts/sysdeps_run_generic

Here we see directories following the artifact naming convention of {name}_{component}_{target}, where each field is defined as:

  • name: Project-wide unique name of the artifact.
  • component: The component sub-division of the artifact, defining the role of files contained (see below).
  • target: Either "generic", indicating that the artifact is not confined to any specific GPU target, or a target family as defined in therock_amdgpu_targets.cmake. Note that each build tree is only relevant to a single host target, so the host platform is not encoded in the artifact name.
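To make the naming convention concrete, here is a minimal sketch of how a directory name could be split back into its three fields. It assumes that the component and target fields never contain underscores (the name field may); the helper names are hypothetical and not part of the build tooling.

```python
from typing import NamedTuple


class ArtifactDirName(NamedTuple):
    name: str
    component: str
    target: str


def parse_artifact_dir_name(dir_name: str) -> ArtifactDirName:
    """Split '{name}_{component}_{target}' into its fields.

    Assumption: component and target contain no underscores, so splitting
    from the right keeps any underscores that belong to the name.
    """
    parts = dir_name.rsplit("_", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not an artifact directory name: {dir_name!r}")
    return ArtifactDirName(*parts)


def is_target_neutral(parsed: ArtifactDirName) -> bool:
    # "generic" marks artifacts that are not tied to a GPU target.
    return parsed.target == "generic"


if __name__ == "__main__":
    for d in ("base_dbg_generic", "rand_lib_gfx1100"):
        print(parse_artifact_dir_name(d))
```

Splitting from the right is a deliberate choice here: it keeps the sketch working if an artifact name itself ever contains an underscore.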

If everything is set up correctly in the build, all relevant files in a sub-project's stage/ directory will be included in one of its constituent artifact components, differentiated by the role it plays. Note that there can be many sub-projects that go into making a single artifact (i.e. sub-projects that are logically grouped together are grouped into the same artifact).

As a convenience, all artifacts defined in the project are flattened by default into the build/dist/rocm directory by the therock-dist-rocm target. Since this includes everything, this directory should be considered a "complete" ROCm SDK.

Artifact Contents

Consider an (abbreviated) tree of the sysdeps_lib_generic artifact at the time of writing:

$ tree artifacts/sysdeps_lib_generic
artifacts/sysdeps_lib_generic
├── artifact_manifest.txt
└── third-party
    └── sysdeps
        └── linux
            ├── bzip2
            │   └── build
            │       └── stage
            │           └── lib
            │               └── rocm_sysdeps
            │                   └── lib
            │                       ├── libbz2.so -> librocm_sysdeps_bz2.so
            │                       └── librocm_sysdeps_bz2.so
            ├── elfutils
            │   └── build
            │       └── stage
            │           └── lib
            │               └── rocm_sysdeps
            │                   └── lib
            │                       ├── libasm.so -> librocm_sysdeps_asm.so.1
            │                       ├── libdw.so -> librocm_sysdeps_dw.so.1
            │                       ├── libelf.so -> librocm_sysdeps_elf.so.1
            │                       ├── librocm_sysdeps_asm.so.1
            │                       ├── librocm_sysdeps_dw.so.1
            │                       └── librocm_sysdeps_elf.so.1
...
            └── zstd
                └── build
                    └── stage
                        └── lib
                            └── rocm_sysdeps
                                └── lib
                                    ├── librocm_sysdeps_zstd.so.1
                                    └── libzstd.so -> librocm_sysdeps_zstd.so.1

And the artifact_manifest.txt:

third-party/sysdeps/linux/bzip2/build/stage
third-party/sysdeps/linux/elfutils/build/stage
third-party/sysdeps/linux/libdrm/build/stage
third-party/sysdeps/linux/numactl/build/stage
third-party/sysdeps/linux/sqlite3/build/stage
third-party/sysdeps/linux/zlib/build/stage
third-party/sysdeps/linux/zstd/build/stage

If you compare this to the overall build/ tree, you will find that the artifact directory has an identical layout, consisting of a selected slice of files underneath every stage/ directory in the sub-projects that make up the sysdeps artifact.

The artifact_manifest.txt file contains the relative paths of each stage directory. If each of these directories were flattened, they would produce a single unified install tree of all files needed to depend on the sub-project's shared libraries (note: this flattening can be done with the build_tools/fileset_tool.py artifact-flatten command).
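The flattening idea can be illustrated with a short sketch. This is not the implementation in fileset_tool.py; it is a simplified stand-in that assumes the manifest holds one relative stage directory per line and that stage directories do not contain conflicting paths. The function name flatten_artifact is hypothetical.

```python
import shutil
from pathlib import Path

MANIFEST_NAME = "artifact_manifest.txt"


def flatten_artifact(artifact_dir: Path, dest_dir: Path) -> None:
    """Merge every stage directory listed in the manifest into dest_dir."""
    manifest = artifact_dir / MANIFEST_NAME
    dest_dir.mkdir(parents=True, exist_ok=True)
    for line in manifest.read_text().splitlines():
        rel = line.strip()
        if not rel:
            continue
        stage_dir = artifact_dir / rel
        if not stage_dir.is_dir():
            # Assumption: a listed stage dir may contribute no files to
            # this particular component, in which case it is skipped.
            continue
        # Copy the contents of stage/ directly under dest_dir, keeping
        # symlinks (e.g. libbz2.so -> librocm_sysdeps_bz2.so) as links.
        shutil.copytree(stage_dir, dest_dir, symlinks=True, dirs_exist_ok=True)
```

Calling flatten_artifact(Path("artifacts/sysdeps_lib_generic"), Path("out")) under these assumptions would leave files such as out/lib/rocm_sysdeps/lib/librocm_sysdeps_bz2.so in a single install-style tree.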

Preserving the build-tree and project structure in this way allows us to use artifacts to bootstrap a build when we want to replace some parts of the source build with sub-projects that were built separately (e.g. in a multi-stage build in CI, or as part of a development workflow by downloading dependency artifacts from a CI server). In such a situation, all bootstrap artifacts simply need to be copied into the build/ tree, with a special marker file added for each stage directory that instructs the build system not to configure/build/install the sub-project but to use the stage/ directories as-is.

Artifact directories are populated as part of all but can be built manually via targets like therock-artifact-{name}. All artifacts can be built with therock-artifacts.

Artifact Archives

The build system also generates a therock-archive-{name} target for each artifact. This creates a .tar.xz file of the artifact directory along with its sha256sum. All archives can be built with therock-archives. These archives are built and streamed to the CI cloud storage server for subsequent phases and packaging workflows.

Building the archives is done via the build_tools/fileset_tool.py artifact-archive command. This command always ensures that the artifact_manifest.txt is written to the tar file first, as this is a precondition that the artifact-flatten command requires in order to process them.
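The "manifest first" ordering is easy to get wrong when archiving a directory naively, so a small sketch may help. It uses only the Python standard library and is not the fileset_tool.py implementation; archive_artifact and sha256_of_file are hypothetical helper names.

```python
import hashlib
import tarfile
from pathlib import Path

MANIFEST_NAME = "artifact_manifest.txt"


def archive_artifact(artifact_dir: Path, archive_path: Path) -> None:
    """Write a .tar.xz of artifact_dir with the manifest as first member."""
    with tarfile.open(archive_path, "w:xz") as tar:
        # The manifest goes in first so a streaming reader sees it before
        # any of the stage directory contents.
        tar.add(artifact_dir / MANIFEST_NAME, arcname=MANIFEST_NAME)
        for path in sorted(artifact_dir.rglob("*")):
            rel = path.relative_to(artifact_dir).as_posix()
            if rel == MANIFEST_NAME:
                continue
            # recursive=False: each entry is added exactly once.
            tar.add(path, arcname=rel, recursive=False)


def sha256_of_file(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Writing the manifest first means a consumer that reads the tar stream sequentially can learn the stage directory layout before it encounters any payload file.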

Archives are not built by default as part of all and must be explicitly requested. CI systems typically include a therock-archives target in their build to ensure this.

Building Artifacts

Artifacts are constructed by adding a therock_provide_artifact() command to a CMake file. Working forward on our sysdeps example, here is the directive to create its artifact:

therock_provide_artifact(sysdeps
  TARGET_NEUTRAL
  DESCRIPTOR artifact.toml
  COMPONENTS
    lib
    run
    dev
    doc
  SUBPROJECT_DEPS
    therock-bzip2
    therock-elfutils
    therock-libdrm
    therock-numactl
    therock-sqlite3
    therock-zlib
    therock-zstd
)

This says several important things:

  • The artifact is target neutral (i.e. it will end with _generic vs a GPU target family).
  • It consists of components dev, doc, lib, run.
  • It is defined by an artifact.toml file in the current directory.
  • It is assembled from the given subprojects' stage/ directories.
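As a rough illustration of how such a declaration relates to the directories listed earlier, the sketch below expands a name, a component list, and a target setting into directory names. The function and its parameters are hypothetical, and the assumption that a target-specific artifact yields one directory set per target family is inferred from the rand_*_gfx1100 listing rather than stated by the build system.

```python
from typing import Iterable, List


def artifact_dir_names(
    name: str,
    components: Iterable[str],
    target_neutral: bool,
    target_families: Iterable[str] = (),
) -> List[str]:
    """Expand a declaration into '{name}_{component}_{target}' names."""
    # TARGET_NEUTRAL artifacts use the single pseudo-target "generic".
    targets = ["generic"] if target_neutral else list(target_families)
    return [f"{name}_{c}_{t}" for t in targets for c in components]


if __name__ == "__main__":
    for d in artifact_dir_names("sysdeps", ["lib", "run", "dev", "doc"], True):
        print(d)
```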

Artifact Descriptors

The artifact descriptor uses a pattern-based language to define which files are included in each component. Because each named component has a default set of patterns, often no further configuration is needed beyond declaring the build-directory-relative locations from which to draw files.

Abbreviated example:

# bzip2
[components.lib."third-party/sysdeps/linux/bzip2/build/stage"]
[components.dev."third-party/sysdeps/linux/bzip2/build/stage"]

# elfutils
[components.lib."third-party/sysdeps/linux/elfutils/build/stage"]
[components.dev."third-party/sysdeps/linux/elfutils/build/stage"]

# libdrm
[components.lib."third-party/sysdeps/linux/libdrm/build/stage"]
[components.dev."third-party/sysdeps/linux/libdrm/build/stage"]
include = [
  "**/share/**",
]

Each component-dir map supports the following attributes (see fileset_tool.py artifact documentation, which is the code that physically transforms the descriptor into artifact directories):

  • default_patterns: Boolean (default true) whether to use default include/exclude patterns for the given component name.
  • include: String or list of string path patterns of files that should be included unless they also match an exclude pattern. If default_patterns is true, these will be added to the default patterns.
  • exclude: String or list of string path patterns of files that should be excluded. If default_patterns is true, these will be added to the default patterns.
  • force_include: String or list of string path patterns of files that will always be included, regardless of whether they match an exclude pattern.
  • optional: Boolean (default false) that if true will not cause an error if the listed stage directory does not exist. Use for optional sub-projects.

All path patterns follow a subset of the ant path pattern language, which has been implemented by various systems over the years. In brief, * matches any number of characters within a path component, and ** matches any number of path components (including zero).
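The two wildcard rules can be approximated by translating a pattern into a regular expression. The sketch below covers only * and ** as described above; it is an illustration, not the matcher used by fileset_tool.py, and the function names are hypothetical.

```python
import re


def ant_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a small subset of ant-style path patterns to a regex."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            # Zero or more whole leading path components.
            out.append("(?:.*/)?")
            i += 3
        elif pattern.startswith("**", i):
            # Anything, including path separators.
            out.append(".*")
            i += 2
        elif pattern[i] == "*":
            # Any characters within a single path component.
            out.append("[^/]*")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def path_matches(pattern: str, relpath: str) -> bool:
    """True if the whole relative path matches the pattern."""
    return ant_pattern_to_regex(pattern).fullmatch(relpath) is not None
```

For example, under this sketch "**/*.so" matches both "libfoo.so" and "lib/rocm_sysdeps/lib/libfoo.so", while "bin/rocblas-test*" matches "bin/rocblas-test" but not "bin/sub/rocblas-test".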

Component Types

While artifacts can be defined with any component type mnemonic, the following are standardized across the build and have default patterns that match the majority of situations:

  • lib: Files needed in order to depend on the artifact's contents as a library at runtime. This typically includes shared libraries, DLLs, dylibs, etc. It also includes any file-level dependencies that the shared libraries require in order to function (e.g. for HIP, this can include headers, compiler resources, etc.).
  • run: Files needed in order to use the artifact's contents as a tool. This includes CLI tools (not required at build time), etc.
  • dbg: Platform-specific debug-symbol files. These are typically produced in a platform specific way by the build system and bundled into one component.
  • dev: Files needed in order to depend on the artifact's contents at build time. This typically includes static libraries, CMake package config files, pkgconfig files, modulefiles, and any tools needed at build time. Notably it does not include shared libraries but does include import libraries (Windows). It is expected that the dev component is combined with the lib component to produce a fully functional development tree.
  • doc: Documentation files (typically under share/doc/).
  • test: Additional files needed in order to run tests, build test projects, etc. This typically includes test binaries, data file dependencies, and standalone test project trees.

Component Extends Chain

Components are processed in a defined order via an extends chain. Each component extends its predecessor, meaning it will not include files already claimed by an earlier component:

lib → run → dbg → dev → doc → test

This ordering ensures that components are disjoint — each file in a stage directory appears in exactly one component. The mechanism works through transitive_relpaths: when a component is processed, it inherits the set of file paths already claimed by all components it extends (directly or transitively) and skips those files.
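The claiming behaviour can be sketched independently of pattern matching. In the snippet below, each component is given the set of paths its patterns would match, and the chain removes anything an earlier component already took. The function name and data shapes are hypothetical; the real logic in the build tools tracks this through transitive_relpaths.

```python
from typing import Dict, Set

# Order taken from the extends chain described above.
EXTENDS_ORDER = ["lib", "run", "dbg", "dev", "doc", "test"]


def partition_components(candidates: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Assign each path to the first component in the chain that matches it.

    candidates maps a component name to the paths its patterns match.
    """
    claimed: Set[str] = set()
    result: Dict[str, Set[str]] = {}
    for name in EXTENDS_ORDER:
        if name not in candidates:
            continue
        # Skip paths already claimed by any predecessor in the chain.
        own = candidates[name] - claimed
        result[name] = own
        claimed |= own
    return result
```

Because every component subtracts the running claimed set, the resulting sets are pairwise disjoint regardless of how much the candidate sets overlap.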

Default Patterns

Each component has default include patterns that determine what it matches (defined in artifact_builder.py):

| Component | Default includes | Notes |
|-----------|------------------|-------|
| lib | **/*.so, **/*.so.*, **/*.dll, **/*.dylib, **/*.dylib.* | Shared libraries |
| run | (none) | Catch-all if no includes specified; descriptors should use explicit includes (see warning below) |
| dbg | .build-id/**/*.debug | Debug symbol files |
| dev | **/*.a, **/*.lib, **/cmake/**, **/include/**, **/share/modulefiles/**, **/pkgconfig/** | Build-time dependencies |
| doc | **/share/doc/** | Documentation |
| test | (none) | Only matches files not claimed by earlier components |

When a descriptor specifies include patterns for a component, those patterns are added to the defaults (not replacing them). To override defaults, set default_patterns = false. Use exclude patterns to carve out files that would otherwise match.
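The way descriptor includes combine with the defaults can be sketched as follows. The default include lists are copied from the table above; the function name effective_includes and the dictionary-based entry are hypothetical, and default exclude patterns (which the table does not list) are left out.

```python
from typing import Any, Dict, List

# Default include patterns per component, as listed in the table above.
DEFAULT_INCLUDES: Dict[str, List[str]] = {
    "lib": ["**/*.so", "**/*.so.*", "**/*.dll", "**/*.dylib", "**/*.dylib.*"],
    "run": [],
    "dbg": [".build-id/**/*.debug"],
    "dev": [
        "**/*.a", "**/*.lib", "**/cmake/**", "**/include/**",
        "**/share/modulefiles/**", "**/pkgconfig/**",
    ],
    "doc": ["**/share/doc/**"],
    "test": [],
}


def _as_list(value: Any) -> List[str]:
    # Descriptor attributes may be a single string or a list of strings.
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def effective_includes(component: str, entry: Dict[str, Any]) -> List[str]:
    """Combine default and descriptor-provided include patterns."""
    use_defaults = entry.get("default_patterns", True)
    defaults = DEFAULT_INCLUDES.get(component, []) if use_defaults else []
    return list(defaults) + _as_list(entry.get("include"))
```

With this sketch, the libdrm dev entry shown earlier (include = ["**/share/**"]) yields the six dev defaults plus "**/share/**", whereas a bare run entry yields an empty list, which is what makes it behave as a catch-all.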

Warning

Because run has no default include patterns, a bare entry like [components.run."some/stage"] acts as a catch-all that claims ALL files not matched by lib — including headers, cmake configs, and test binaries that should go to dev or test. Always use explicit include patterns on run to select specific runtime tools (e.g. include = ["bin/mytool"]), or omit run entirely for stage dirs where dev/test content is expected.

Routing Files to the Right Component

Since run is a catch-all that claims files before test, care is needed when a stage directory has both run and test content. There are two approaches:

Approach 1: Don't assign the stage dir to run. If a subproject only produces libraries and test binaries (no runtime tools), simply omit run from its component entries. The auto-created run component exists at the artifact level but only scans stage dirs explicitly assigned to it.

This is the most common pattern in blas, where each subproject lists only the components it needs:

[components.lib."math-libs/BLAS/rocBLAS/stage"]
include = ["bin/rocblas/library/**", "lib/rocblas/library/**"]

# NOTE: 'run' is intentionally omitted — rocBLAS has no runtime tools, only test
# binaries. If 'run' were listed here, its catch-all behavior would claim the
# test binaries before 'test' could.
# [components.run."math-libs/BLAS/rocBLAS/stage"]

[components.dbg."math-libs/BLAS/rocBLAS/stage"]
[components.dev."math-libs/BLAS/rocBLAS/stage"]
[components.doc."math-libs/BLAS/rocBLAS/stage"]

# rocBLAS test binaries and data files. Explicit includes limit the test
# archive to known test content — without them, test would catch-all remaining
# files not claimed by lib/dev/doc.
[components.test."math-libs/BLAS/rocBLAS/stage"]
include = ["bin/rocblas-bench*", "bin/rocblas-test*", ...]

Compare with rocSPARSE, which does have runtime tools — it uses an explicit run with specific includes (not a catch-all):

[components.run."math-libs/BLAS/rocSPARSE/stage"]
include = ["bin/rocsparse_mtx2csr", "lib/rocsparse/rocsparseio-convert"]
[components.test."math-libs/BLAS/rocSPARSE/stage"]
include = ["bin/rocsparse-bench*", "bin/rocsparse-test*", ...]

Approach 2: Exclude test content from run. When a stage dir must be assigned to both run and test (e.g. the subproject produces both runtime tools and test binaries from the same stage dir), add an exclude on run for the test content:

# MIOpen — run needs to catch runtime files, but miopen_gtest should go to test
[components.run."ml-libs/MIOpen/stage"]
exclude = ["bin/miopen_gtest*"]

[components.test."ml-libs/MIOpen/stage"]
include = ["bin/miopen_gtest*"]

Without the exclude on run, the test binary would be claimed by run (catch-all) and test would skip it.
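The effect of the exclude can be reproduced with a toy model. The sketch below uses Python's fnmatch for matching, which is looser than the ant-style patterns (its * also crosses directory separators), so it only illustrates the claiming order. The file name bin/some_runtime_tool and the helper names are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Set, Tuple


def select_files(
    files: Iterable[str],
    include: List[str],
    exclude: List[str],
    claimed: Set[str],
) -> Set[str]:
    """Pick unclaimed files; an empty include list acts as a catch-all."""
    selected = set()
    for f in files:
        if f in claimed:
            continue
        if include and not any(fnmatchcase(f, p) for p in include):
            continue
        if any(fnmatchcase(f, p) for p in exclude):
            continue
        selected.add(f)
    return selected


def route(files: Set[str], run_exclude: List[str]) -> Tuple[Set[str], Set[str]]:
    """Process run before test, mirroring their relative order in the chain."""
    run = select_files(files, [], run_exclude, claimed=set())
    test = select_files(files, ["bin/miopen_gtest*"], [], claimed=run)
    return run, test


if __name__ == "__main__":
    stage = {"bin/some_runtime_tool", "bin/miopen_gtest"}
    print(route(stage, run_exclude=[]))
    print(route(stage, run_exclude=["bin/miopen_gtest*"]))
```

In the first call run claims both files and test ends up empty; in the second, the exclude lets the gtest binary fall through to test.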

For artifacts that are entirely test content (e.g. core-rocrtst), exclude all test paths from run so they fall through to test:

[components.run."core/rocrtst/stage"]
exclude = ["bin/**"]

[components.test."core/rocrtst/stage"]
# Gets everything in bin/ that run excluded

Current Artifact Inventory

For a fully up-to-date list, grep for therock_provide_artifact. This list is maintained on a best-effort basis.

Sub-projects are being continuously added to TheRock. This section aims to provide a bit more commentary as to how they are organized.
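For those who prefer a script over grep, the search for therock_provide_artifact can be approximated as below. This is a rough sketch: it assumes artifact declarations live in CMakeLists.txt or *.cmake files and that the artifact name is the first argument, it does not skip commented-out calls, and find_artifact_names is a hypothetical helper.

```python
import re
from pathlib import Path
from typing import List

# Matches the call and captures its first argument (the artifact name).
_PROVIDE_RE = re.compile(r"therock_provide_artifact\(\s*([A-Za-z0-9_.+-]+)")


def find_artifact_names(repo_root: Path) -> List[str]:
    """Return sorted artifact names declared in CMake files under repo_root."""
    names = set()
    for path in repo_root.rglob("*"):
        if path.name != "CMakeLists.txt" and path.suffix != ".cmake":
            continue
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        names.update(_PROVIDE_RE.findall(text))
    return sorted(names)


if __name__ == "__main__":
    for name in find_artifact_names(Path(".")):
        print(name)
```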

Common Artifacts

These artifacts are built if any project features requiring them are enabled:

  • host-blas: An appropriate host BLAS/LAPACK library.
  • host-suite-sparse: SuiteSparse library.
  • sysdeps: Shared libraries that are built privately within the project instead of relying on system dependencies. Includes things like libdrm, compression libraries, etc. All of these use project-local SONAMEs and symbol versioning that isolate them from system-provided libraries.

Compiler Artifacts

  • amd-llvm: The AMD LLVM compiler, configured to target the current host and all AMDGPU targets.
  • hipify: Hipify tools, built on top of amd-llvm.

Core Artifacts

  • base: Base ROCm tools and structural components. ROCm sub-projects that do not depend on anything outside of this set are included here so that everything can depend on them.
  • core-amdsmi: AMD System Management Interface (amdsmi) library and tools for GPU and driver management, packaged as a standalone core artifact due to distinct product and usage semantics.
  • core-runtime: Low level runtime components used for interfacing with kernel drivers.
  • core-hip: HIP runtime, compiler interface, and build tools.

Profiler Artifacts

  • rocprofiler-sdk: The rocprofv3 tools and libraries (excluding rocprofiler-register, which is in base).
  • rocprofiler-systems: The ROCm Systems Profiler tools and libraries.

Kernel Libraries

  • blas: All basic linear algebra libraries (BLAS, SOLVER, SPARSE).
  • fft: Fast Fourier transform libraries.
  • prim: C++ template based primitives libraries (rocPRIM, hipCUB, rocThrust, etc).
  • rand: Random number generator libraries.
  • rccl: Collective communication libraries.
  • MIOpen: MIOpen kernel-select/fusion library.
  • rocdecode: Video decode library (Linux only).
  • rocjpeg: JPEG decode library (Linux only).

Note

After adding a new artifact via therock_provide_artifact(), you may need to update install_rocm_from_artifacts.py to allow CI workflows and users to selectively install it.
See the Adding Support for New Components guide for step-by-step instructions.