Complete Guide: Installing ROCm 7.0.2 on Rocky Linux 10 for AMD Radeon RX 9070 XT and Deploying vLLM with Podman
Introduction
This guide provides a step-by-step process for installing AMD's ROCm 7.0.2 platform on a fresh Rocky Linux 10 system. It is specifically tailored for use with consumer-grade AMD GPUs, such as the Radeon RX 9070 XT, for machine learning tasks. The tutorial also covers deploying the vLLM inference server inside a Podman container to keep the host system clean.
Prerequisites:
- A system with a supported AMD GPU (e.g., Radeon RX 9070 XT).
- A fresh installation of Rocky Linux 10.
- A user account with sudo privileges.
Phase 1: System Preparation and ROCm Installation
Step 1: Enable Essential Repositories (CRB and EPEL)
The CodeReady Linux Builder (CRB) and Extra Packages for Enterprise Linux (EPEL) repositories are required because they provide build tools and dependencies, such as python3-wheel, that are not available in the default Rocky Linux repositories.
Run the following commands to enable these repositories:
```bash
sudo dnf install -y dnf-plugins-core
sudo dnf config-manager --set-enabled crb
sudo dnf install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-10.noarch.rpm
sudo dnf clean all
```
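If you want to confirm that both repositories are active before continuing, you can list the enabled repositories. This is an optional sanity check; the grep pattern assumes the usual repo IDs (crb, epel), so adjust it if your mirrors name them differently.
```bash
# List enabled repositories and filter for CRB and EPEL
dnf repolist --enabled | grep -Ei 'crb|epel'
```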
Step 2: Install the AMDGPU Installer and ROCm
Use AMD's official installer to set up the ROCm repositories and install the complete ROCm stack.
```bash
sudo dnf install -y https://repo.radeon.com/amdgpu-install/7.0.2/el/10/amdgpu-install-7.0.2.70002-1.el10.noarch.rpm
sudo dnf clean all
sudo dnf install -y rocm
```
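Before configuring permissions, a quick look at what landed on disk can catch a failed install early. This is only a rough check, not a complete inventory; the rocm* package-name pattern and the /opt/rocm path match AMD's default packaging.
```bash
# Show a sample of the installed ROCm packages and the toolchain directory
dnf list installed 'rocm*' | head -n 20
ls /opt/rocm/bin
```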
Step 3: Configure User Permissions
To allow your user account to access the GPU, add yourself to the render and video groups. A reboot is required for these changes to take full effect.
```bash
sudo usermod -a -G render,video $USER
sudo systemctl reboot
```
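After logging back in, you can confirm that your session actually picked up the new group memberships. A minimal sketch of that check:
```bash
# Both render and video should appear in the output
id -nG | grep -E 'render|video'
```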
Step 4: Verify the Installation
After rebooting, verify that both the AMDGPU driver is loaded and your discrete GPU is detected by ROCm.
```bash
# Check that the amdgpu kernel module is loaded
lsmod | grep amdgpu

# Verify that ROCm recognizes your GPU
/opt/rocm/bin/rocminfo
/opt/rocm/bin/rocm-smi
```
A successful verification shows your Radeon RX 9070 XT listed as an HSA agent in the rocminfo output, with the card's marketing name and gfx1201 as the agent's architecture name.
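Because rocminfo prints a long report, filtering it down to the agent and marketing names makes the check quicker. The field labels in the pattern below match typical rocminfo output; if nothing matches, inspect the full report instead.
```bash
# Filter the rocminfo report down to the agent and marketing names
/opt/rocm/bin/rocminfo | grep -E 'Marketing Name|Name:.*gfx'
```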
Phase 2: Containerized Deployment with Podman
Step 5: Configure Podman and Run vLLM
Podman provides an isolated environment for running ML models without touching the host's Python setup. The following command uses the official ROCm-enabled vLLM image.
Key Podman flags explained:
- --device /dev/kfd --device /dev/dri: Passes the GPU device nodes (the ROCm compute interface and the Direct Rendering Infrastructure) into the container.
- --group-add=video: Adds the video group inside the container so the container user can access those device nodes.
- --ipc=host: Shares the host IPC namespace; PyTorch relies on shared memory, so this avoids shared-memory errors and performance problems.
- --security-opt label=disable: Disables SELinux labeling for the container, sidestepping SELinux context issues (acceptable for testing environments).
Execute the command to start the vLLM server:
```bash
podman run -it --rm \
  --device /dev/kfd --device /dev/dri \
  --group-add=video \
  --ipc=host \
  --security-opt label=disable \
  -p 8000:8000 \
  -v ~/.cache:/root/.cache \
  -e "VLLM_ALLOW_LONG_MAX_MODEL_LEN=1" \
  -e "VLLM_USE_MODELSCOPE=True" \
  rocm/vllm:rocm7.0.0_vllm_0.10.2_20251006 \
  bash -c "pip install 'modelscope>=1.18.1' && pip install --upgrade transformers && vllm serve 'Qwen/Qwen3-VL-8B-Instruct' --served-model-name Qwen3-VL-8B-Instruct --tensor-parallel-size 1 --max-model-len 8192 --gpu-memory-utilization 0.85"
```
This command will:
- Pull the rocm/vllm image if it is not already present.
- Install the required modelscope library and upgrade transformers to a version that supports the Qwen3-VL model architecture.
- Start the vLLM server, making the Qwen/Qwen3-VL-8B-Instruct model available via an OpenAI-compatible API on port 8000.
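Once the server logs show it is listening, you can confirm the model registered from the host. This assumes the port mapping above and that weight download and loading have finished, which can take several minutes on the first run.
```bash
# The served model name should appear in the JSON response
curl http://localhost:8000/v1/models
```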
Alternative: Enhanced Security Configuration
For a more secure container setup that avoids --ipc=host, you can configure SELinux and use private namespaces:
```bash
# Configure SELinux to allow containers to access devices
sudo setsebool -P container_use_devices 1

# Run the container with private namespaces
podman run -it --rm \
  --device /dev/kfd --device /dev/dri \
  --group-add=video \
  --shm-size=2g \
  -p 8000:8000 \
  -v ~/.cache:/root/.cache \
  -e "VLLM_USE_MODELSCOPE=True" \
  rocm/vllm:rocm7.0.0_vllm_0.10.2_20251006 \
  bash -c "pip install 'modelscope>=1.18.1' && vllm serve 'Qwen/Qwen3-VL-8B-Instruct' --served-model-name Qwen3-VL-8B-Instruct --tensor-parallel-size 1"
```
Conclusion
You have successfully installed ROCm on Rocky Linux 10 and deployed a powerful LLM inference server inside a container. The key to a smooth installation was enabling the CRB repository to resolve critical dependencies. This setup provides a clean, isolated environment for ML development and inference on AMD GPUs.
You can now interact with the API server at http://localhost:8000. To test it, query the /v1/models endpoint or open the interactive API documentation at http://localhost:8000/docs.
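As a minimal end-to-end test, here is a sketch of a chat completion request against the OpenAI-compatible endpoint. The prompt and max_tokens value are arbitrary, and the model name must match the --served-model-name passed to vllm serve.
```bash
# Send a simple chat completion request to the vLLM server
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Qwen3-VL-8B-Instruct",
        "messages": [{"role": "user", "content": "Describe ROCm in one sentence."}],
        "max_tokens": 64
      }'
```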