An Introduction to GPU Virtualization Technology: MIG
Multi-Instance GPU (MIG) lets you divide a powerful data-center GPU into smaller partitions, each running its own workload, so a single card can serve several tasks simultaneously. This article provides a brief introduction to MIG and offers installation and usage examples.
What is MIG?
NVIDIA Multi-Instance GPU (MIG) is a GPU virtualization technology that allows a physical GPU to be divided into multiple independent GPU instances, each of which can be assigned to a different virtual machine, container, or user. This helps utilize GPU resources efficiently and improves GPU sharing and multi-tenancy support.
MIG technology typically requires hardware and software support, including NVIDIA GPUs that support MIG and the corresponding drivers. This makes MIG technology a powerful tool for managing GPU resources in data centers and cloud computing environments. It helps improve GPU utilization, reduce costs, and better meet the needs of different applications and users.
How MIG Works
MIG works by virtually partitioning a single physical GPU into smaller independent instances. This technology involves GPU virtualization, where GPU resources, including CUDA cores and memory, are allocated to different instances. These instances are isolated from each other, ensuring that tasks running on one instance do not interfere with other instances.
MIG supports dynamic allocation of GPU resources, allowing the size of instances to be dynamically adjusted based on workload demands. This dynamic allocation helps to efficiently utilize resources. Multiple applications or users can run concurrently on the same GPU, each with its dedicated instance. The entire process is managed through software, providing administrators with control over instance configurations and resource allocations. This approach enhances flexibility, scalability, and resource efficiency in handling different workloads on a single GPU.
Key Features of MIG Technology
- Resource Partitioning: MIG allows a physical GPU to be divided into multiple GPU instances, each with its own GPU cores, GPU memory, cache slices, and other resources. This enables better control and allocation of GPU resources.
- Multi-Tenancy Support: MIG technology can be used for GPU virtualization, allowing different users or applications to share the same physical GPU without interfering with each other.
- Dynamic Resource Adjustment: Administrators can dynamically reconfigure the resources of MIG instances based on workload demands, resulting in better resource utilization and performance.
- Fault Isolation: MIG technology supports isolation of GPU instances, meaning issues in one GPU instance do not affect other instances, thereby improving system fault tolerance.
- Deployment Flexibility: MIG technology can be used in various scenarios such as cloud computing, virtualized environments, containerized applications, providing flexibility for different deployment requirements.
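Because MIG partitions a GPU into fixed-size profiles rather than arbitrary fractions, a useful mental model is a slice budget. The sketch below uses the published instance profiles of the A100 40GB (seven compute slices in total) to check whether a requested mix of instances fits on one GPU. The numbers here are illustrative; consult `nvidia-smi mig -lgip` on your own hardware, and note that real placement also depends on memory-slice geometry, which the driver enforces.

```python
# Compute-slice and memory cost of each A100 40GB MIG profile
# (illustrative values; verify with `nvidia-smi mig -lgip` on real hardware).
PROFILES = {
    "1g.5gb":  {"compute_slices": 1, "memory_gb": 5},
    "2g.10gb": {"compute_slices": 2, "memory_gb": 10},
    "3g.20gb": {"compute_slices": 3, "memory_gb": 20},
    "4g.20gb": {"compute_slices": 4, "memory_gb": 20},
    "7g.40gb": {"compute_slices": 7, "memory_gb": 40},
}

TOTAL_COMPUTE_SLICES = 7   # an A100 exposes seven compute slices
TOTAL_MEMORY_GB = 40

def fits(requested: list[str]) -> bool:
    """Return True if the requested mix of profiles fits on one GPU."""
    compute = sum(PROFILES[p]["compute_slices"] for p in requested)
    memory = sum(PROFILES[p]["memory_gb"] for p in requested)
    return compute <= TOTAL_COMPUTE_SLICES and memory <= TOTAL_MEMORY_GB

print(fits(["2g.10gb"] * 3))         # three 2g.10gb instances: True
print(fits(["3g.20gb", "4g.20gb"]))  # 3+4 slices, 40 GB total: True
print(fits(["7g.40gb", "1g.5gb"]))   # 8 slices needed: False
```

This simplification ignores placement constraints, but it captures why you cannot, say, add a small instance next to a 7g.40gb one.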
Conditions for MIG
Not all graphics cards support MIG. It is available only on NVIDIA data-center GPUs based on the Ampere architecture or later, such as the A100, A30, and H100. Consumer cards are not supported: the RTX 4090, despite its 24GB of VRAM, cannot use MIG.
On the software side, MIG also requires a sufficiently recent NVIDIA driver (the R450 series or later for the A100; newer GPUs need correspondingly newer drivers).
Once both the hardware and driver requirements are met, MIG can be used.
MIG Configuration and Usage
The nvidia-smi tool ships with the NVIDIA driver. On Ubuntu, it is provided by the versioned nvidia-utils package matching your driver, for example:
sudo apt-get install nvidia-utils-535
The next step is to verify the Nvidia driver by running:
nvidia-smi
If the command prints a table of your GPUs without errors, the installation is complete. The next step is to enable MIG mode on a GPU:
sudo nvidia-smi -i <GPU_ID> -mig 1
Note that enabling MIG mode may require stopping all processes using the GPU and performing a GPU reset (or a reboot) before it takes effect. To disable MIG mode later, use -mig 0.
The GPU ID is included in the nvidia-smi result.
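For scripting, nvidia-smi can also emit machine-readable output via --query-gpu (the index and mig.mode.current fields are part of the standard query set). A small, hypothetical helper for parsing that output, tested here against an illustrative sample rather than a live GPU:

```python
import csv
import io

# Sample output of:
#   nvidia-smi --query-gpu=index,name,mig.mode.current --format=csv,noheader
# (illustrative; real output depends on your GPUs)
SAMPLE = """\
0, NVIDIA A100-SXM4-40GB, Enabled
1, NVIDIA A100-SXM4-40GB, Disabled
"""

def parse_mig_modes(output: str) -> dict[int, bool]:
    """Map GPU index -> whether MIG mode is currently enabled."""
    modes = {}
    for row in csv.reader(io.StringIO(output)):
        if not row:
            continue
        index, _name, mode = (field.strip() for field in row)
        modes[int(index)] = (mode == "Enabled")
    return modes

print(parse_mig_modes(SAMPLE))  # {0: True, 1: False}
```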
To list the GPU instance profiles the card supports (you will need the profile IDs or names in the next step):
nvidia-smi mig -lgip
If the command prints the available profiles, MIG mode is active and you can proceed to create GPU instances.
Next, we partition the single physical GPU into multiple independent GPU instances so that workloads can be distributed across them.
sudo nvidia-smi mig -i <GPU_ID> -cgi <PROFILE_LIST> -C
-i <GPU_ID>: Specify the GPU device to use. Replace <GPU_ID> with the actual ID of the GPU you want to configure.
-cgi <PROFILE_LIST>: Create GPU instances. Replace <PROFILE_LIST> with a comma-separated list of profile IDs or names as reported by nvidia-smi mig -lgip. Each instance gets its own fixed share of resources, including memory and computational capability.
-C: Also create the default compute instance inside each new GPU instance, so the instances are immediately usable.
For example, the following command creates three 2g.10gb instances on GPU ID 0:
sudo nvidia-smi mig -i 0 -cgi 2g.10gb,2g.10gb,2g.10gb -C
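When scripting instance creation, the -cgi argument is simply a comma-separated profile list. A small helper (hypothetical, shown only to illustrate how the command is assembled) might look like this:

```python
def build_create_cmd(gpu_id: int, profiles: list[str]) -> str:
    """Build the nvidia-smi command that creates the given GPU instances.

    `-cgi` takes a comma-separated list of profile names or IDs, and
    `-C` also creates a default compute instance inside each of them.
    """
    return f"sudo nvidia-smi mig -i {gpu_id} -cgi {','.join(profiles)} -C"

print(build_create_cmd(0, ["2g.10gb"] * 3))
# sudo nvidia-smi mig -i 0 -cgi 2g.10gb,2g.10gb,2g.10gb -C
```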
MIG instances come in fixed profile sizes; it is not possible to assign an arbitrary percentage of the GPU to an instance. To change the resource allocation, destroy the existing instances (they must be idle) and create new ones with different profiles:
sudo nvidia-smi mig -i <GPU_ID> -dci
sudo nvidia-smi mig -i <GPU_ID> -dgi
sudo nvidia-smi mig -i <GPU_ID> -cgi <NEW_PROFILE_LIST> -C
-dci: Destroy the compute instances on the specified GPU.
-dgi: Destroy the (now empty) GPU instances.
For example, to reconfigure GPU ID 0 from three 2g.10gb instances into one 3g.20gb and one 4g.20gb instance:
sudo nvidia-smi mig -i 0 -dci
sudo nvidia-smi mig -i 0 -dgi
sudo nvidia-smi mig -i 0 -cgi 3g.20gb,4g.20gb -C
Docker and MIG
In most cases, we use Docker as the runtime environment, so let's introduce Docker and MIG configuration here.
Install the NVIDIA Container Toolkit, which is the first step to using the GPU in Docker. We won't go into detail here, so let's directly install it with the following command:
sudo apt-get install -y nvidia-container-toolkit
Configure the Docker daemon to use NVIDIA: Edit the Docker daemon configuration file (/etc/docker/daemon.json) and add the following lines:
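A typical daemon.json that registers the NVIDIA runtime looks like the following (paths may differ by distribution; `nvidia-ctk runtime configure --runtime=docker` can generate this for you):

```json
{
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  },
  "default-runtime": "nvidia"
}
```

Setting "default-runtime" is optional; without it, pass --runtime=nvidia (or use --gpus) per container.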
The code above is just an example, so modify it according to your actual situation. This article doesn't focus on how to use the GPU in Docker, so it's provided as a simple example.
After configuring, restart the Docker daemon:
sudo systemctl restart docker
To verify GPU availability and get GPU information, run the following command:
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
Now let's move on to the main task: running containers on MIG devices.
A container is attached to specific MIG instances rather than to the whole GPU. A MIG device can be referenced either as <GPU_index>:<MIG_index> or by its UUID (listed by nvidia-smi -L). For example, to give a container the first MIG instance on GPU 0:
docker run --rm --gpus device=0:0 my_container
or, equivalently, by UUID:
docker run --rm --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=MIG-<UUID> my_container
Adjust the device list to expose as many MIG instances as the container needs; my_container here stands in for your own image.
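The MIG device UUIDs come from nvidia-smi -L, whose output lists MIG devices indented under their parent GPU. A small parser for that output (the sample below is illustrative, not from a real machine) might look like:

```python
import re

# Illustrative `nvidia-smi -L` output on a MIG-enabled GPU.
SAMPLE = """\
GPU 0: NVIDIA A100-SXM4-40GB (UUID: GPU-5c89852c-d268-4c7e-9e3c-c59ddf6dbc28)
  MIG 2g.10gb     Device  0: (UUID: MIG-8ba5bbf9-9b58-5561-b4a2-a8d8a0195fb6)
  MIG 2g.10gb     Device  1: (UUID: MIG-2f9f8b09-5c91-5e64-8a1f-1e8f1c1c7a33)
"""

def mig_uuids(output: str) -> list[str]:
    """Extract the UUID of every MIG device from `nvidia-smi -L` output."""
    return re.findall(r"\(UUID:\s*(MIG-[0-9a-f-]+)\)", output)

for uuid in mig_uuids(SAMPLE):
    print(uuid)
```

Each extracted UUID can then be passed to a container via NVIDIA_VISIBLE_DEVICES to pin it to that instance.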
MIG allows a single GPU to be divided into smaller instances, providing an economical, efficient, and scalable solution for handling various workloads simultaneously. The underlying features of MIG, including resource isolation and dynamic allocation, enhance flexibility, scalability, and overall efficiency in GPU utilization.
Real-world applications spanning data centers, scientific research, and AI development highlight the impact of MIG in optimizing GPU resources and accelerating computational tasks. MIG is a powerful technology, but its widespread adoption is hindered by the current high prices of graphics cards.