-
Notifications
You must be signed in to change notification settings - Fork 943
Fix OpStore of BDA to align to 8 #4123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
spencer-lunarg
wants to merge
1
commit into
KhronosGroup:main
Choose a base branch
from
spencer-lunarg:spencer-lunarg-one-day-we-will-have-bda-working
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+102
−8
Open
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,62 @@ | ||
| spv.bufferhandle26.frag | ||
| // Module Version 10000 | ||
| // Generated by (magic number): 8000b | ||
| // Id's are bound by 27 | ||
|
|
||
| Capability Shader | ||
| Capability PhysicalStorageBufferAddressesEXT | ||
| Extension "SPV_KHR_physical_storage_buffer" | ||
| 1: ExtInstImport "GLSL.std.450" | ||
| MemoryModel PhysicalStorageBuffer64EXT GLSL450 | ||
| EntryPoint Fragment 4 "main" | ||
| ExecutionMode 4 OriginUpperLeft | ||
| Source GLSL 460 | ||
| SourceExtension "GL_EXT_buffer_reference" | ||
| SourceExtension "GL_EXT_scalar_block_layout" | ||
| Name 4 "main" | ||
| Name 7 "constants" | ||
| MemberName 7(constants) 0 "u_param" | ||
| Name 9 "ParametersBuffer" | ||
| MemberName 9(ParametersBuffer) 0 "src" | ||
| MemberName 9(ParametersBuffer) 1 "dst" | ||
| Name 12 "DataBuffer" | ||
| MemberName 12(DataBuffer) 0 "data" | ||
| Name 14 "" | ||
| Decorate 7(constants) Block | ||
| MemberDecorate 7(constants) 0 Offset 0 | ||
| Decorate 9(ParametersBuffer) Block | ||
| MemberDecorate 9(ParametersBuffer) 0 Offset 0 | ||
| MemberDecorate 9(ParametersBuffer) 1 Offset 8 | ||
| Decorate 12(DataBuffer) Block | ||
| MemberDecorate 12(DataBuffer) 0 NonWritable | ||
| MemberDecorate 12(DataBuffer) 0 Offset 0 | ||
| 2: TypeVoid | ||
| 3: TypeFunction 2 | ||
| TypeForwardPointer 6 PhysicalStorageBufferEXT | ||
| 7(constants): TypeStruct 6 | ||
| TypeForwardPointer 8 PhysicalStorageBufferEXT | ||
| 9(ParametersBuffer): TypeStruct 8 8 | ||
| 10: TypeFloat 32 | ||
| 11: TypeVector 10(float) 4 | ||
| 12(DataBuffer): TypeStruct 11(fvec4) | ||
| 8: TypePointer PhysicalStorageBufferEXT 12(DataBuffer) | ||
| 6: TypePointer PhysicalStorageBufferEXT 9(ParametersBuffer) | ||
| 13: TypePointer PushConstant 7(constants) | ||
| 14: 13(ptr) Variable PushConstant | ||
| 15: TypeInt 32 1 | ||
| 16: 15(int) Constant 0 | ||
| 17: TypePointer PushConstant 6(ptr) | ||
| 20: 15(int) Constant 1 | ||
| 23: TypePointer PhysicalStorageBufferEXT 8(ptr) | ||
| 4(main): 2 Function None 3 | ||
| 5: Label | ||
| 18: 17(ptr) AccessChain 14 16 | ||
| 19: 6(ptr) Load 18 | ||
| 21: 17(ptr) AccessChain 14 16 | ||
| 22: 6(ptr) Load 21 | ||
| 24: 23(ptr) AccessChain 22 16 | ||
| 25: 8(ptr) Load 24 Aligned 8 | ||
| 26: 23(ptr) AccessChain 19 20 | ||
| Store 26 25 Aligned 8 | ||
| Return | ||
| FunctionEnd |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| #version 460 core | ||
|
|
||
| #extension GL_EXT_buffer_reference: require | ||
| #extension GL_EXT_scalar_block_layout: require | ||
|
|
||
| layout (buffer_reference, buffer_reference_align = 4, scalar) readonly buffer DataBuffer { | ||
| vec4 data; | ||
| }; | ||
|
|
||
| layout (buffer_reference, buffer_reference_align = 8, scalar) buffer ParametersBuffer { | ||
| DataBuffer src; | ||
| DataBuffer dst; | ||
| }; | ||
|
|
||
| layout(push_constant) uniform constants { | ||
| ParametersBuffer u_param; | ||
| }; | ||
|
|
||
| void main() { | ||
| u_param.dst = u_param.src; | ||
| } |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is
which is storing a BDA pointer, but it was defaulting back to 16 before, so this is correct
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should still be 16 I believe. The default value for
buffer_reference_alignwhen unspecified is 16 and the first member of the implicit buffer struct can be proved to have the same alignment as the struct itself.Other members may have smaller alignments depending on their offset in the struct (specified by the selected layout rule) though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like the correct formula to compute the highest guaranteed alignment for each member of a struct is
gcd(struct_alignment, member_offset_in_the_struct)wheregcdis the greatest common divisor.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alignments have to be powers of two so you don't need
gcd, you can just usemax.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean
max(struct_alignment, member_offset_in_the_struct)?This does not work:
Correct:
gcd(16, 0) = 16
gcd(16, 8) = 8
gcd(16, 16) = 16
gcd(16, 24) = 8
Incorrect:
max(16, 0) = 16
max(16, 8) = 16
max(16, 16) = 16
max(16, 24) = 24
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, looked back into this, the issue is actually not the logic I added, it is doing things correctly IMO
The "real" issue here is this GLSL is generating code to copy the pointer, not the struct, but only because this test is
--target-env vulkan1.1If you set it to
--target-env vulkan1.2or higher, it will generate the correct SPIR-V such that the alignment is 8 as you would thinkhttps://godbolt.org/z/d7n86soYd
So I agree there is an issue, but only for 1.1 or lower GLSL code path and its because it produces code that is copying a pointer, which should be aligned 8... trying to say "this PR is doing the logic correctly and there is nothing in the scope here to fix, but can raise a new issue to track the issue"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pointer being 8 bytes, it indeed requires an alignment of at least 8 bytes.
However, OpLoad instructions should always use the largest alignment possible that can be compile-time-proved, for performance reasons (see Nvidia's Advanced API Performance: Descriptors blog).
From my understanding, this is the whole point of the existence of
buffer_reference_aligninstead of auto-computing the alignment from the largest member.In this example,
MyBufferdoes not setbuffer_reference_align, which means the default value of 16 is used, as explained byGL_EXT_buffer_reference:Since the first member,
s6, is at offset 0, we can prove that it has the same guaranteed alignment as the implicit struct (16) and thus should have loads/stores aligned to 16 bytes.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree 100% with you that it should be 16
I disagree there is anything to change in this PR, this PR is exposing a bug that is there already, which is
--target-env vulkan1.1is not producing correct BDA spirv, regardless of theAlignedvalueThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will open another issue about
buffer_reference_alignbeing ignored for loads/stores.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maxime-modulopi thanks, clearly there are still more things to get right, but happy to get it fixed little by little and issues with small code to reproduce is always helpful!