Open
Conversation
Member
|
Cool, thank you! We have implemented a similar fix in MMseqs2's version of the same code, but haven’t backported it: Could you repeat the benchmark with a stable sort? |
Author
|
Thank you for the reply. I changed the
Because of the large memory, the computational complexity is probably Nlog(N). |
Member
|
@fuji8 This looks great! Thank you for the PR. Would it be possible to avoid the lambda expression in the sort? |
Author
|
I apologize for the delay in response. I rewrote the code to be almost equivalent without using the lambda expression.
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
I profiled the T1050 with the parameters run by AlphaFold. (Only -cpu is changed). I used Score-P as the profiling tool and got the following results.
From this image, we can see that
QSortIntinmergeHitsToQueryis taking a long time. With fast enough storage, my hhblits for this condition is about 2100sec, andQSortIntaccounts for about 40%.Instead of this
QsortInt, usestd::sort.I ran hhblits installed by conda and using
std::sortunder the same conditions as before.In order to avoid I/O effects, I analyze the difference in execution time between the logs that contain this change, instead of the overall execution time. (From
hh-suite/src/hhblits.cpp
Lines 1028 to 1030 in ac76598
hh-suite/src/hhhmm.cpp
Line 2337 in b100bb0
std::sortThis reduced the execution time. I also ran it using the parallelization policy, but the results were not significantly different from
std::sort.This change is due to the different stability of sort, so the execution results may not truly match.