
Enable DT topology-driven automatic CMA region allocation#1050

Open
bisingha-xilinx wants to merge 5 commits into amd:main from bisingha-xilinx:main-special


bisingha-xilinx (Contributor) commented Feb 4, 2026

Implement automatic CMA memory region selection based on the device tree AIE topology and start_col. Users now only need to specify start_col when creating a hardware context; the driver selects the appropriate CMA region from the DT topology mapping. This eliminates manual mem_index management and lets multi-partition AIE workloads run across dedicated memory regions.

Tested:

  1. Tested the special use case in which the 36 AIE columns are divided into 3 parts: columns 0-23 connected only to CMA region 0, columns 24-31 only to CMA region 1, and columns 32-35 only to CMA region 2. Multiple combinations of these partitions were exercised.
  2. Ran the existing XRT hw unit tests for ve2 on the existing platform. All tests pass.

Expected information in the DTB (e.g.):

```dts
	my_drm_node {
		compatible = "xlnx,amdxdna";
		memory-region = <&cma_reserved_0>, <&cma_reserved_1>, <&cma_reserved_2>;
	};

	aie_memory_topology {
		compatible = "xlnx,aie-memory-topology";
		#address-cells = <0x01>;
		#size-cells = <0x00>;

		region@0 {
			reg = <0x00>;
			columns = <0x00 0x17>;  /* AIE columns 0-23 */
			memory-region = <&cma_reserved_0>;
		};

		region@1 {
			reg = <0x01>;
			columns = <0x18 0x1F>;  /* AIE columns 24-31 */
			memory-region = <&cma_reserved_1>;
		};

		region@2 {
			reg = <0x02>;
			columns = <0x20 0x23>;  /* AIE columns 32-35 */
			memory-region = <&cma_reserved_2>;
		};
	};
```
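
To make the intended mapping concrete, here is a minimal userspace sketch of how a start_col could be resolved to a region index from a table shaped like the DT example. It is not the driver code: `struct aie_mem_region`, `example_topology`, and `lookup_mem_index()` are hypothetical names, and it assumes the region index equals the `reg` value of the matching `region@N` node.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of one region@N entry; not the driver's struct. */
struct aie_mem_region {
	uint32_t start_col;  /* first AIE column, inclusive */
	uint32_t end_col;    /* last AIE column, inclusive */
	uint32_t mem_index;  /* assumed to equal the node's reg value */
};

/* Mirrors the example DT: columns 0-23, 24-31, 32-35. */
static const struct aie_mem_region example_topology[] = {
	{ 0x00, 0x17, 0 },
	{ 0x18, 0x1F, 1 },
	{ 0x20, 0x23, 2 },
};

/*
 * Return the mem_index of the region whose column range contains start_col,
 * or -1 when no region covers it (a caller could then use a default region).
 */
static int lookup_mem_index(const struct aie_mem_region *regions,
			    size_t num_regions, uint32_t start_col)
{
	size_t i;

	assert(regions != NULL || num_regions == 0);

	for (i = 0; i < num_regions; i++) {
		if (start_col >= regions[i].start_col &&
		    start_col <= regions[i].end_col)
			return (int)regions[i].mem_index;
	}
	return -1;
}
```

With this table, a context starting at column 26 would resolve to region 1, and a start column of 36 would fall outside every range.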

Copilot AI review requested due to automatic review settings February 4, 2026 16:11
bisingha-xilinx changed the title from "Enable DT topology-driven automatic CMA region allocation for AIE" to "Enable DT topology-driven automatic CMA region allocation" on Feb 4, 2026

Copilot AI left a comment

Pull request overview

This PR implements automatic CMA (Contiguous Memory Allocator) region selection for AIE (AI Engine) hardware contexts based on device tree topology. The driver now automatically maps AIE column ranges to dedicated CMA memory regions, eliminating the need for manual memory index management by users. Users only need to specify start_col when creating hardware contexts, and the driver handles the appropriate CMA region selection based on the parsed device tree topology.

Changes:

  • Added device tree parsing to extract AIE memory topology mapping (column ranges to CMA regions)
  • Implemented automatic memory index selection based on allocated start column from XRS
  • Modified BO allocation to inject the auto-selected memory index into buffer flags
  • Updated HSA queue allocation/deallocation to use the correct CMA device based on memory index

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/shim_ve2/xdna_hwctx.h | Added m_mem_index member and query_mem_index() method declaration |
| src/shim_ve2/xdna_hwctx.cpp | Implemented query_mem_index() to retrieve auto-selected memory index from driver and inject it into BO flags |
| src/include/uapi/drm_local/amdxdna_accel.h | Added DRM_AMDXDNA_HWCTX_MEM_INDEX parameter for querying hardware context memory index |
| src/driver/amdxdna/ve2_of.h | Added memory topology structures and function declarations |
| src/driver/amdxdna/ve2_of.c | Implemented device tree parsing and automatic memory index selection logic |
| src/driver/amdxdna/ve2_hwctx.c | Reordered initialization to request XRS before HSA queue creation and use correct CMA device |
| src/driver/amdxdna/ve2_debug.c | Added ioctl handler to return memory index to userspace |
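
The reordering described for ve2_hwctx.c matters because the memory index depends on the start column that XRS actually grants. The following is a rough userspace sketch of that ordering with goto-style unwinding, under stated assumptions: every `demo_*` name is hypothetical, the column mapping is hard-coded to the example topology, and the real driver functions may differ in signature and behavior.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the hardware context; not the driver's struct. */
struct demo_hwctx {
	unsigned int start_col;  /* filled in by the resource request */
	unsigned int mem_index;  /* chosen only after start_col is known */
	bool xrs_held;
	bool queue_created;
	bool fail_queue;         /* test hook: force queue creation to fail */
};

static int demo_xrs_request(struct demo_hwctx *ctx, unsigned int col)
{
	ctx->start_col = col;    /* pretend XRS granted the requested column */
	ctx->xrs_held = true;
	return 0;
}

static void demo_xrs_release(struct demo_hwctx *ctx)
{
	ctx->xrs_held = false;
}

/* Toy mapping matching the example topology: 0-23, 24-31, 32-35. */
static unsigned int demo_select_mem_index(unsigned int start_col)
{
	if (start_col < 24)
		return 0;
	if (start_col < 32)
		return 1;
	return 2;
}

static int demo_create_queue(struct demo_hwctx *ctx)
{
	if (ctx->fail_queue)
		return -ENOMEM;
	ctx->queue_created = true;  /* would allocate from region mem_index */
	return 0;
}

/* Step 1: request columns. Step 2: pick region. Step 3: create the queue. */
static int demo_hwctx_init(struct demo_hwctx *ctx, unsigned int col)
{
	int ret;

	assert(ctx != NULL);

	ret = demo_xrs_request(ctx, col);
	if (ret)
		return ret;

	ctx->mem_index = demo_select_mem_index(ctx->start_col);

	ret = demo_create_queue(ctx);
	if (ret)
		goto release_xrs;

	return 0;

release_xrs:
	demo_xrs_release(ctx);
	return ret;
}
```

If queue creation were attempted first, the region would have to be guessed before the start column is known, which is the situation the reordering avoids.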

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.


Copilot AI review requested due to automatic review settings February 5, 2026 09:18

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.



// This ensures BOs are allocated from the correct CMA region
xcl_bo_flags xflags{flags};
if (m_mem_index < MAX_MEM_REGIONS)
  xflags.bank = m_mem_index & 0xFF; // Lower 8 bits
Review comment (Contributor):

if (xflags.use > 0) { // this is internal
  xflags.bank = m_mem_index & 0xFF;
} else {
  if (xflags.bank & m_mem_index) {
    // bank passed by user is valid
  } else {
    throw;
  }
}
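
One way to read the reviewer's suggestion is "override the bank for internal allocations, validate it for user allocations". The sketch below models that in plain C, under stated assumptions: `struct demo_bo_flags`, `DEMO_MAX_MEM_REGIONS`, and `demo_resolve_bank()` are hypothetical, the real xcl_bo_flags layout may differ, and the user-supplied bank is checked with an equality test rather than the bitwise AND written in the comment, since the intended validity rule is not spelled out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_MEM_REGIONS 16u  /* hypothetical limit, not the driver's value */

/* Hypothetical, simplified flag set; the real xcl_bo_flags layout may differ. */
struct demo_bo_flags {
	uint16_t bank;  /* memory bank / CMA region requested for the BO */
	uint8_t  use;   /* assumed: non-zero marks an internal allocation */
};

/*
 * Internal BOs get the context's region; user BOs must already name it.
 * Returns 0 on success or -EINVAL when a user-supplied bank does not match.
 */
static int demo_resolve_bank(struct demo_bo_flags *f, uint32_t mem_index)
{
	assert(f != NULL);

	if (mem_index >= DEMO_MAX_MEM_REGIONS)
		return 0;  /* no auto-selected region: leave flags untouched */

	if (f->use > 0) {
		f->bank = (uint16_t)(mem_index & 0xFF);
		return 0;
	}

	if ((uint32_t)f->bank != (mem_index & 0xFF))
		return -EINVAL;

	return 0;
}
```

Returning an error code instead of throwing keeps the sketch in C; in the C++ shim the mismatch branch would presumably raise an exception with a descriptive message.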

Copilot AI review requested due to automatic review settings February 6, 2026 09:09

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.



int num_phandles;
int ret;
u32 i;
u32 k;

Copilot AI Feb 6, 2026

The variable name 'k' is ambiguous. Consider renaming it to 'phandle_idx' or 'mem_region_idx' to clarify that it indexes memory-region phandles.

Comment on lines 1262 to 1260
+ ret = ve2_xrs_request(xdna, hwctx);
  if (ret) {
- 	XDNA_ERR(xdna, "Failed to create host queue, ret=%d", ret);
- 	goto free_priv;
+ 	XDNA_ERR(xdna, "XRS resource request failed, ret=%d", ret);
+ 	goto cleanup_priv;
  }

- ret = ve2_xrs_request(xdna, hwctx);
+ /* Auto-select mem_index based on ACTUAL allocated start_col from XRS */
+ ve2_auto_select_mem_index(xdna, hwctx);

Copilot AI Feb 6, 2026

The automatic mem_index selection logic introduced by ve2_auto_select_mem_index() lacks test coverage. Consider adding tests to verify correct mem_index assignment for various start_col values and topology configurations.
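
As a starting point for such coverage, a table-driven check could look like the sketch below. It does not call the driver: `demo_auto_select()` is a hypothetical stand-in for the selection logic, and it assumes that an unmatched start_col or a missing topology falls back to region 0, which the driver may or may not do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical column range -> region entry; not the driver's struct. */
struct demo_region {
	uint32_t first_col;  /* inclusive */
	uint32_t last_col;   /* inclusive */
	uint32_t mem_index;
};

/* Selector under test: first matching range wins, else assumed default 0. */
static uint32_t demo_auto_select(const struct demo_region *r, size_t n,
				 uint32_t start_col)
{
	size_t i;

	assert(r != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (start_col >= r[i].first_col && start_col <= r[i].last_col)
			return r[i].mem_index;
	}
	return 0;
}

struct demo_case {
	const struct demo_region *topo;
	size_t num_regions;
	uint32_t start_col;
	uint32_t expected;
};

/* Run every case and return the number of mismatches (0 means all pass). */
static int demo_run_cases(void)
{
	static const struct demo_region three[] = {
		{ 0, 23, 0 }, { 24, 31, 1 }, { 32, 35, 2 },
	};
	static const struct demo_region single[] = { { 0, 35, 0 } };
	const struct demo_case cases[] = {
		{ three, 3, 0, 0 },   /* first column of region 0 */
		{ three, 3, 23, 0 },  /* last column of region 0 */
		{ three, 3, 24, 1 },  /* boundary into region 1 */
		{ three, 3, 35, 2 },  /* last column overall */
		{ three, 3, 36, 0 },  /* out of range: assumed default */
		{ single, 1, 30, 0 }, /* single-region platform */
		{ NULL, 0, 12, 0 },   /* no topology node at all */
	};
	size_t i;
	int failures = 0;

	for (i = 0; i < sizeof(cases) / sizeof(cases[0]); i++) {
		const struct demo_case *c = &cases[i];

		if (demo_auto_select(c->topo, c->num_regions, c->start_col) !=
		    c->expected)
			failures++;
	}
	return failures;
}
```

The boundary columns (23/24 and 31/32) and the no-topology case are the rows most likely to catch off-by-one or fallback regressions.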

if (!topo_np) {
	XDNA_DBG(xdna, "No aie_mem_topology node found, using default CMA");
	xdna_hdl->mem_topology.num_regions = 0;
	return -ENOENT;

Copilot AI Feb 6, 2026

This returns -ENOENT when the topology node is not found, but execution then continues as if this were a success case. Consider returning 0, or a distinct code, to separate "optional feature unavailable" from actual errors.

Suggested change
- return -ENOENT;
+ return 0;
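
The other option mentioned in the comment, keeping a distinct code and letting the caller decide, might look like the following sketch. Everything here is hypothetical (`demo_parse_topology()`, `demo_probe()`, and the boolean inputs that stand in for the DT lookup); it only illustrates treating -ENOENT as "optional node absent" while still propagating real parse errors.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical parsed topology; only the region count matters here. */
struct demo_topology {
	unsigned int num_regions;
};

/* Stand-in parser: the two flags model the outcome of the DT lookup. */
static int demo_parse_topology(struct demo_topology *t, bool node_present,
			       bool node_valid)
{
	assert(t != NULL);

	t->num_regions = 0;
	if (!node_present)
		return -ENOENT;  /* optional node absent */
	if (!node_valid)
		return -EINVAL;  /* node present but malformed */

	t->num_regions = 3;      /* e.g. the three regions in the example DT */
	return 0;
}

/* Caller: absence means "use the default CMA", anything else is an error. */
static int demo_probe(struct demo_topology *t, bool node_present,
		      bool node_valid)
{
	int ret = demo_parse_topology(t, node_present, node_valid);

	if (ret == -ENOENT)
		return 0;
	return ret;
}
```

Either approach works as long as the contract is explicit: the parser returns 0 directly as suggested, or the caller filters -ENOENT as shown here.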

Bikash Singha added 5 commits February 6, 2026 22:39
Signed-off-by: Bikash Singha <bisingha@xcobisingha50x.amd.com>
Signed-off-by: Bikash Singha <bisingha@xcobisingha50x.amd.com>
Signed-off-by: Bikash Singha <bisingha@xcobisingha50x.amd.com>
Signed-off-by: Bikash Singha <bisingha@xcobisingha50x.amd.com>
Signed-off-by: Bikash Singha <bisingha@xcobisingha50x.amd.com>
3 participants