Complete Guide: Installing ROCm 7.0.2 on Rocky Linux 10 for AMD Radeon RX 9070 XT and Deploying vLLM with Podman

Introduction

This guide provides a step-by-step process for installing AMD's ROCm 7.0.2 platform on a fresh Rocky Linux 10 system. It is specifically tailored for use with consumer-grade AMD GPUs, such as the Radeon RX 9070 XT, for machine learning tasks. The tutorial also covers deploying the vLLM inference server inside a Podman container to keep the host system clean.

Prerequisites:

  • A system with a supported AMD GPU (e.g., Radeon RX 9070 XT).
  • A fresh installation of Rocky Linux 10.
  • User account with sudo privileges.

Phase 1: System Preparation and ROCm Installation

Step 1: Enable Essential Repositories (CRB and EPEL)

The CodeReady Linux Builder (CRB) and EPEL repositories are crucial as they contain essential build tools and dependencies, such as python3-wheel, which are not available in the default Rocky Linux repositories.

Run the following commands to enable these repositories:

```bash
sudo dnf install -y dnf-plugins-core
sudo dnf config-manager --set-enabled crb
sudo dnf install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-10.noarch.rpm
sudo dnf clean all
```
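As an optional sanity check, you can confirm that both repositories are now active. This is a minimal sketch; the exact repository IDs can vary slightly between releases:

```bash
# List enabled repositories and filter for CRB and EPEL
dnf repolist --enabled | grep -Ei 'crb|epel'
```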

Step 2: Install the AMDGPU Installer and ROCm

Use AMD's official installer to set up the ROCm repositories and install the complete ROCm stack.

```bash
sudo dnf install -y https://repo.radeon.com/amdgpu-install/7.0.2/el/10/amdgpu-install-7.0.2.70002-1.el10.noarch.rpm
sudo dnf clean all
sudo dnf install -y rocm
```
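A quick check that the meta-package landed and which version was installed (the second command assumes the default /opt/rocm prefix, where ROCm typically records its version string):

```bash
# Confirm the rocm meta-package is installed
dnf list installed rocm

# Print the ROCm version string (assumes the default /opt/rocm prefix)
cat /opt/rocm/.info/version
```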

Step 3: Configure User Permissions

To allow your user account to access the GPU, add yourself to the render and video groups. A reboot is required for these changes to take full effect.

```bash
sudo usermod -a -G render,video $USER
sudo systemctl reboot
```
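After logging back in, you can optionally confirm that the new group membership took effect before continuing:

```bash
# The output should include both "render" and "video"
id -nG
```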

Step 4: Verify the Installation

After rebooting, verify both that the AMDGPU kernel driver is loaded and that your discrete GPU is detected by ROCm.

```bash
# Check if the kernel module is loaded
lsmod | grep amdgpu

# Verify ROCm recognizes your GPU
/opt/rocm/bin/rocminfo
/opt/rocm/bin/rocm-smi
```

A successful verification shows the Radeon RX 9070 XT (its marketing name) with the gfx1201 architecture listed as an HSA agent in the rocminfo output.
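Since rocminfo prints a lot of detail, the filter below is a convenient sketch for narrowing the output to just those fields; the exact field labels can shift between ROCm releases:

```bash
# Show only the agent names and gfx targets from the verbose rocminfo output
/opt/rocm/bin/rocminfo | grep -E 'Marketing Name|Name:|gfx'
```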

Phase 2: Containerized Deployment with Podman

Step 5: Configure Podman and Run vLLM

Using Podman allows for an isolated environment to run your ML models. The following command uses the official ROCm-enabled vLLM image.

Key Podman flags explained:

  • --device /dev/kfd --device /dev/dri: Passes the GPU devices into the container.
  • --group-add=video: Grants the container the group permissions needed to access those devices.
  • --ipc=host: Critical for optimal performance with PyTorch, which relies heavily on shared memory.
  • --security-opt label=disable: Simplifies SELinux context issues (suitable for testing environments).

Execute the following command to start the vLLM server:

```bash
podman run -it --rm \
  --device /dev/kfd --device /dev/dri \
  --group-add=video \
  --ipc=host \
  --security-opt label=disable \
  -p 8000:8000 \
  -v ~/.cache:/root/.cache \
  -e "VLLM_ALLOW_LONG_MAX_MODEL_LEN=1" \
  -e "VLLM_USE_MODELSCOPE=True" \
  rocm/vllm:rocm7.0.0_vllm_0.10.2_20251006 \
  bash -c "pip install 'modelscope>=1.18.1' && pip install --upgrade transformers && vllm serve 'Qwen/Qwen3-VL-8B-Instruct' --served-model-name Qwen3-VL-8B-Instruct --tensor-parallel-size 1 --max-model-len 8192 --gpu-memory-utilization 0.85"
```

This command will:

  • Pull the rocm/vllm image (if not already present).
  • Install the required modelscope library and upgrade transformers to a version that supports the Qwen model architecture.
  • Start the vLLM server, making the Qwen/Qwen3-VL-8B-Instruct model available via an OpenAI-compatible API on port 8000.
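Once the server reports that it is ready, you can exercise the OpenAI-compatible API. The request below is a minimal sketch; the model name must match the --served-model-name value used above:

```bash
# Send a simple chat completion request to the local vLLM server
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Qwen3-VL-8B-Instruct",
        "messages": [{"role": "user", "content": "Describe ROCm in one sentence."}],
        "max_tokens": 64
      }'
```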

Alternative: Enhanced Security Configuration

For a more secure container setup that avoids --ipc=host, you can configure SELinux and use private namespaces:

```bash
# Configure SELinux to allow containers to access devices
sudo setsebool -P container_use_devices 1

# Run the container with private namespaces and an explicit shared-memory size
podman run -it --rm \
  --device /dev/kfd --device /dev/dri \
  --group-add=video \
  --shm-size=2g \
  -p 8000:8000 \
  -v ~/.cache:/root/.cache \
  -e "VLLM_USE_MODELSCOPE=True" \
  rocm/vllm:rocm7.0.0_vllm_0.10.2_20251006 \
  bash -c "pip install 'modelscope>=1.18.1' && vllm serve 'Qwen/Qwen3-VL-8B-Instruct' --served-model-name Qwen3-VL-8B-Instruct --tensor-parallel-size 1"
```
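If the container still cannot see /dev/kfd or /dev/dri, confirm that the SELinux boolean actually took effect:

```bash
# Should print: container_use_devices --> on
getsebool container_use_devices
```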

Conclusion

You have successfully installed ROCm on Rocky Linux 10 and deployed an LLM inference server inside a container. The key to a smooth installation is enabling the CRB and EPEL repositories early so that build-time dependencies such as python3-wheel resolve cleanly. This setup provides a clean, isolated environment for ML development and inference on AMD GPUs.

You can now interact with the API server at http://localhost:8000. To test it, query the /v1/models endpoint or use the OpenAPI documentation at http://localhost:8000/docs.
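For example, a quick check of the /v1/models endpoint should return the served model name configured earlier:

```bash
# List the models exposed by the server
curl http://localhost:8000/v1/models
```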