Support for CUDA (New Subpackage)#156
Support for CUDA (New Subpackage)#156ChantalJuntao wants to merge 1 commit intoJuliaImages:masterfrom
Conversation
Codecov Report
@@ Coverage Diff @@
## master #156 +/- ##
==========================================
- Coverage 90.00% 89.81% -0.19%
==========================================
Files 8 8
Lines 220 216 -4
==========================================
- Hits 198 194 -4
Misses 22 22
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
I believe it's better to leave this to @timholy (I guess you're in the same lab?) I'll just leave some general comments here.
On the code design. Because it's very likely that we wouldn't have CUDA support for all methods for a long period (will we?), I think a better way to design
the code is to opt-in the CUDA support when we know the interpolation method has CUDA support and otherwise fallback to the CPU. We can have a private cuwarp! function and then call cuwarp! from warp! when certain condition meets (i.e., when we do support CUDA).
On performance, I'd be very interested if this matches or beats the NNLibCUDA performance https://github.com/FluxML/NNlibCUDA.jl/blob/da73f077371a5f85660f6e7620099d74ddd11b6e/src/upsample.jl#L152-L194
| img_inds = map(out->out.I, CartesianIndices(axes(out.parent))) | ||
| tform_offset = offset_calc(out, in_offsets, tform) | ||
|
|
||
| tformindex = CuArray(tform_offset.(SVector.(img_inds))) |
There was a problem hiding this comment.
This calculates all image indexes in the single thread CPU and converts them to GPU. I believe this is unnecessarily inefficient -- I think we need to figure out how to directly generate the target indexes on GPU. Maybe manually write a small CUDA kernel for this.
|
FWIW we agree this needs additional work but is not high priority, as it turns out that warping is not a real bottleneck in the analysis we were planning to optimize. To be continued later. In the meantime, let's leave this open in case it inspires someone else. |
While working on improving the speed of my image registration analysis, I've noticed that ImageTransformations doesn't currently support compatibility for GPU computing using CuArrays. Therefore, I've made a first attempt at a GPU version of ImageTransformations.warp that supports CuArrays that can be wrapped in Offset Arrays and defaults to linear b-spline interpolation.
I'm still working on trying to get this to use the actual Interpolations package to support more functionality, and my test suite is still quite incomplete. I'm also not sure if this will work on grayscale or colored images yet. Any help or feedback would be extremely appreciated.