[datapath] Signed Partial Product Generation #9592

Merged: cowardsa merged 7 commits into llvm:main from cowardsa:coward/signed_mult on Feb 5, 2026

cowardsa (Contributor) commented Feb 2, 2026

We can greatly reduce the gate count when lowering signed multipliers (sext(a) * sext(b)) by simplifying the partial product generation.

Given a p-bit input a, and a q-bit input b, if we extend both to p+q bits, then we can simplify using the following:
sext(a) * sext(b) == a[p-2:0]*b[q-2:0] - 2^(p-1)*a[p-1]*b[q-2:0] - 2^(q-1)*a[p-2:0]*b[q-1] + 2^(p+q-2)*a[p-1]*b[q-1]
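
The decomposition can be checked exhaustively for small widths. The following is a minimal sketch using only the C++ standard library, not code from this PR; the helper names (signExtend, decomposedProduct, identityHolds) are hypothetical, and it assumes p >= 2, q >= 2 and p + q < 64 so that everything fits in a uint64_t.

```cpp
#include <cassert>
#include <cstdint>

// Interpret the low `width` bits of `bits` as a two's-complement value.
std::int64_t signExtend(std::uint64_t bits, unsigned width) {
  assert(width >= 1 && width < 64);
  const std::uint64_t sign = std::uint64_t{1} << (width - 1);
  bits &= (sign << 1) - 1;
  // (bits ^ sign) - sign maps [0, 2^width) onto [-2^(width-1), 2^(width-1)).
  return static_cast<std::int64_t>(bits ^ sign) -
         static_cast<std::int64_t>(sign);
}

// Right-hand side of the decomposition, evaluated modulo 2^(p+q).
// Unsigned arithmetic wraps, so the subtractions are safe.
std::uint64_t decomposedProduct(std::uint64_t a, std::uint64_t b, unsigned p,
                                unsigned q) {
  assert(p >= 2 && q >= 2 && p + q < 64);
  const std::uint64_t aLow = a & ((std::uint64_t{1} << (p - 1)) - 1); // a[p-2:0]
  const std::uint64_t bLow = b & ((std::uint64_t{1} << (q - 1)) - 1); // b[q-2:0]
  const std::uint64_t aSign = (a >> (p - 1)) & 1;                     // a[p-1]
  const std::uint64_t bSign = (b >> (q - 1)) & 1;                     // b[q-1]
  const std::uint64_t mask = (std::uint64_t{1} << (p + q)) - 1;
  const std::uint64_t result = aLow * bLow
                               - ((aSign * bLow) << (p - 1))
                               - ((bSign * aLow) << (q - 1))
                               + ((aSign & bSign) << (p + q - 2));
  return result & mask;
}

// Compare against the sign-extended product for every p-bit a and q-bit b.
bool identityHolds(unsigned p, unsigned q) {
  const std::uint64_t mask = (std::uint64_t{1} << (p + q)) - 1;
  for (std::uint64_t a = 0; a < (std::uint64_t{1} << p); ++a) {
    for (std::uint64_t b = 0; b < (std::uint64_t{1} << q); ++b) {
      const std::int64_t expected = signExtend(a, p) * signExtend(b, q);
      if ((static_cast<std::uint64_t>(expected) & mask) !=
          decomposedProduct(a, b, p, q))
        return false;
    }
  }
  return true;
}
```

For example, identityHolds(4, 4) and identityHolds(3, 5) both sweep the full input space, covering an equal-width and a mixed-width case.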

We can then use the two's-complement identity -x = ~x + 1 and constant-fold the resulting terms, which reduces the number of variable bits going into the compressor.
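
To illustrate the negation step, here is a rough sketch (again standard C++ only, not the PR's implementation) in which both subtracted rows are replaced by their bitwise inverse plus one. Only the bits of each row that depend on the inputs stay variable; the all-ones bits outside that field and the two "+1" terms collapse into a single constant. All names (toSigned, foldedConstant, productViaInvertedRows, invertedRowsMatch) are hypothetical, and p >= 2, q >= 2, p + q < 64 is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Reference: interpret the low `width` bits as a two's-complement value.
std::int64_t toSigned(std::uint64_t bits, unsigned width) {
  assert(width >= 1 && width < 64);
  const std::uint64_t sign = std::uint64_t{1} << (width - 1);
  bits &= (sign << 1) - 1;
  return static_cast<std::int64_t>(bits ^ sign) -
         static_cast<std::int64_t>(sign);
}

// Constant left over after rewriting both subtractions as ~x + 1: the
// all-ones bits outside each row's variable field, plus the two "+1"s.
std::uint64_t foldedConstant(unsigned p, unsigned q) {
  assert(p >= 2 && q >= 2 && p + q < 64);
  const std::uint64_t mask = (std::uint64_t{1} << (p + q)) - 1;
  const std::uint64_t lhsField = ((std::uint64_t{1} << (q - 1)) - 1) << (p - 1);
  const std::uint64_t rhsField = ((std::uint64_t{1} << (p - 1)) - 1) << (q - 1);
  return ((mask ^ lhsField) + (mask ^ rhsField) + 2) & mask;
}

// Signed product modulo 2^(p+q), built from additions only.
std::uint64_t productViaInvertedRows(std::uint64_t a, std::uint64_t b,
                                     unsigned p, unsigned q) {
  assert(p >= 2 && q >= 2 && p + q < 64);
  const std::uint64_t mask = (std::uint64_t{1} << (p + q)) - 1;
  const std::uint64_t aLowMask = (std::uint64_t{1} << (p - 1)) - 1;
  const std::uint64_t bLowMask = (std::uint64_t{1} << (q - 1)) - 1;
  const std::uint64_t aLow = a & aLowMask;        // a[p-2:0]
  const std::uint64_t bLow = b & bLowMask;        // b[q-2:0]
  const std::uint64_t aSign = (a >> (p - 1)) & 1; // a[p-1]
  const std::uint64_t bSign = (b >> (q - 1)) & 1; // b[q-1]
  // Variable bits of the inverted rows: one NAND per bit of the low part.
  const std::uint64_t lhsRow = (~(aSign * bLow) & bLowMask) << (p - 1);
  const std::uint64_t rhsRow = (~(bSign * aLow) & aLowMask) << (q - 1);
  const std::uint64_t signTerm = (aSign & bSign) << (p + q - 2);
  return (aLow * bLow + lhsRow + rhsRow + signTerm + foldedConstant(p, q)) &
         mask;
}

// Exhaustive comparison against the sign-extended product.
bool invertedRowsMatch(unsigned p, unsigned q) {
  const std::uint64_t mask = (std::uint64_t{1} << (p + q)) - 1;
  for (std::uint64_t a = 0; a < (std::uint64_t{1} << p); ++a) {
    for (std::uint64_t b = 0; b < (std::uint64_t{1} << q); ++b) {
      const std::int64_t expected = toSigned(a, p) * toSigned(b, q);
      if ((static_cast<std::uint64_t>(expected) & mask) !=
          productViaInvertedRows(a, b, p, q))
        return false;
    }
  }
  return true;
}
```

In this sketch the folded constant works out to 2^(p+q-1) + 2^(p-1) + 2^(q-1) modulo 2^(p+q), so it depends on both widths individually rather than being twice a single-side correction.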

Currently implemented only for equal-width multipliers; support for operands of different base widths still needs to be added.

The signed multiplier optimization is tested via LEC tests, within the integration tests - the code is overly verbose for a FileCheck test IMHO.

uenoku (Member) commented Feb 4, 2026

This is super cool!! Will try reviewing details tomorrow :)

uenoku (Member) left a comment

Super cool!! LGTM! I believe this is the so-called Baugh-Wooley algorithm? If so, could you mention it in the code? (It took me some time to arrive at that name :) I agree it's too hard to write a test for this; I'll try to think of some way to test this kind of code :)

// Note constant correction will depend on lhs and rhs widths - so general
// case is not twice the correction for one side.
auto ones = APInt::getAllOnes(inputWidth);
auto lowerLhs = APInt(inputWidth, (1 << lhsBaseWidth));
Member

1 << lhsBaseWidth is computed as an int (32 bits on typical platforms), so it breaks once lhsBaseWidth reaches 31 or more. Would be good to use getOneBitSet().

Suggested change
- auto lowerLhs = APInt(inputWidth, (1 << lhsBaseWidth));
+ auto lowerLhs = APInt::getOneBitSet(inputWidth, lhsBaseWidth);
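
The underlying pitfall is independent of LLVM: the literal 1 has type int, so the shift is carried out in int no matter what the result is later stored in. The sketch below shows this with standard C++ only; oneBitSet64 is a hypothetical helper that widens the operand first and is limited to bit positions below 64, whereas APInt::getOneBitSet covers arbitrary widths.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// `1 << n` has type int, whatever type it is later assigned to.
static_assert(std::is_same_v<decltype(1 << 0), int>,
              "shifting an int literal yields an int");

// Widening the left operand before shifting keeps every bit position
// below 64 representable.
std::uint64_t oneBitSet64(unsigned bitNo) {
  assert(bitNo < 64);
  return std::uint64_t{1} << bitNo;
}
```

For instance, oneBitSet64(40) yields 2^40, a value that an int-typed shift cannot represent on platforms with a 32-bit int.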

Contributor Author

Neat - did not know this existed :)

cowardsa (Contributor, Author) commented Feb 5, 2026

Thanks for the review @uenoku - have expanded the documentation (with pointers to the original Baugh-Wooley) and addressed other comments!

cowardsa merged commit ee01872 into llvm:main on Feb 5, 2026
7 checks passed
uenoku (Member) commented Feb 5, 2026

Thank you for adding comments! This is awesome!

Arya-Golkari pushed a commit to Arya-Golkari/circt that referenced this pull request Feb 7, 2026
* Signed multiplier optimizations + tests

* Update comments

* Simplify

* Update comments

* final updates

* Address warnings

* Address comments