make GPUArray .imag, .real and .conj() preserve contiguity #151
grlee77 wants to merge 2 commits into inducer:master
|
Thanks for your contribution! |
|
Dies with Py2: https://gitlab.tiker.net/inducer/pycuda/-/jobs/28284 |
|
Closing in favor of gitlab MR. Could you please push your fixes there? I added an account for you there. |
|
Okay, will continue over there. I am not sure how to log in? I did not previously have a gitlab.com account, but just made one (although perhaps that is separate from gitlab.tiker.net?). The python 2 failure is because the proposed reshape function signature is not valid in Python 2.x. I am not sure what would be the best workaround. I am not sure that there is a great one that also preserves backward compatibility. |
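One pattern that is valid on both Python 2 and Python 3 is to accept `**kwargs` and pull `order` out by hand. This is a minimal sketch only, not necessarily what the merge request ended up using. `ArrayWrapper` is a hypothetical toy class built on NumPy, standing in for an array type with a `reshape` method; it is not PyCUDA code.

```python
import numpy as np


class ArrayWrapper(object):
    """Hypothetical toy class; stands in for an array type with a reshape method."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def reshape(self, *shape, **kwargs):
        # Python 2 rejects `def reshape(self, *shape, order="C")`, so accept
        # **kwargs and extract `order` manually.
        order = kwargs.pop("order", "C")
        if kwargs:
            raise TypeError("unexpected keyword arguments: %s" % sorted(kwargs))
        # Accept both reshape(2, 3) and reshape((2, 3)), as existing callers do.
        if len(shape) == 1 and isinstance(shape[0], (tuple, list)):
            shape = tuple(shape[0])
        return ArrayWrapper(self.data.reshape(shape, order=order))


w = ArrayWrapper(np.arange(6))
c = w.reshape(2, 3)                 # existing call style, default C order
f = w.reshape((2, 3), order="F")    # new keyword, Fortran order
```

Existing positional calls keep working because `order` defaults to `"C"` when it is not passed.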
|
If needed, the gitlab account I created is also under username grlee77 |
|
It is separate. Gmail has a habit of chucking those emails into spam--could you check? Regarding the interface--you can always manually figure out what happened from |
If a `GPUArray`, `g`, is Fortran contiguous, `g.real`, `g.imag` and `g.conj()` all currently return C-ordered arrays. The same operations in numpy preserve Fortran ordering.

With this PR, those routines return F-ordered arrays if the source array is F-ordered. For C-ordered or non-contiguous arrays, the behaviour will be unchanged.
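The NumPy behaviour referred to above can be checked with a short sketch. One caveat: in NumPy, `.real` and `.imag` on a complex array are strided views rather than copies, so they are not flagged as contiguous; what is preserved is the stride ordering. The helper `stride_order` is a hypothetical name introduced only for this illustration.

```python
import numpy as np


def stride_order(arr):
    """Return 'F' if strides grow with the axis index, 'C' if they shrink."""
    strides = list(arr.strides)
    if strides == sorted(strides):
        return "F"
    if strides == sorted(strides, reverse=True):
        return "C"
    return "mixed"


a_f = np.asfortranarray(np.ones((3, 4), dtype=np.complex64))  # F-ordered source
a_c = np.ascontiguousarray(a_f)                               # C-ordered copy

# conj() allocates a new array and keeps the source layout.
print(a_f.conj().flags.f_contiguous, a_c.conj().flags.c_contiguous)

# .real / .imag are views onto the complex buffer: same axis ordering,
# but with gaps between elements, so the contiguity flags are False.
print(stride_order(a_f.real), stride_order(a_f.imag), stride_order(a_c.real))
print(a_f.real.flags.f_contiguous)
```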
This PR has some overlap with the existing PR #15, but is more conservative in approach. (I had also considered putting the code setting 'C' vs. 'F' order into `_new_like_me` itself, but wasn't sure if that might cause problems elsewhere).

The specific use case I have is in calling a function that only supports Fortran-ordered float32 data and I want to run it on both the real and imaginary components of the input and then recombine them. I already have a Fortran-ordered gpu array, `g`, and would like to be able to call it as `out = kernel_func(g.real, ...) + 1j*kernel_func(g.imag, ...)`. I realize this involves some copying of device arrays behind the scenes, but that overhead is relatively small in my application. Currently this approach fails because the real and imaginary components returned do not respect the ordering of the source array. After this PR, it works as expected.

The second commit here is independent of the above and just adds an `order` kwarg to `reshape`. I think this is preferable to the approach taken in #15 where the line at 5ea4c97#diff-c6f20a28105688a11d8983cd1fce702cR642 seems to assume a 2D shape.
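The split-and-recombine use case described above can be sketched on the host with NumPy as a stand-in for the GPU arrays. This is an illustration under stated assumptions: `kernel_func` here is a hypothetical placeholder that merely enforces the "Fortran-ordered float32 only" requirement, and because NumPy's `.real`/`.imag` are views, the explicit `np.asfortranarray` copies play the role of the copies a `GPUArray` would make.

```python
import numpy as np


def kernel_func(x):
    """Hypothetical stand-in for a routine that only accepts F-ordered float32."""
    if x.dtype != np.float32 or not x.flags.f_contiguous:
        raise ValueError("expected Fortran-contiguous float32 input")
    return 2.0 * x  # placeholder computation


# F-ordered complex64 input, analogous to the gpu array `g`.
g = np.asfortranarray((np.arange(12).reshape(3, 4) * (1 + 2j)).astype(np.complex64))

# Copy each component into its own F-contiguous float32 buffer.
g_real = np.asfortranarray(g.real)
g_imag = np.asfortranarray(g.imag)

# Run the real-valued routine on each component and recombine.
out = kernel_func(g_real) + 1j * kernel_func(g_imag)
```

If the components came back C-ordered instead, the layout check inside `kernel_func` would raise, which mirrors the failure mode described in the PR text.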