Closed
Labels: bug (Something isn't working)
Description
nvMolKit built at tag v0.2.0 with CUDA 13, CCCL 3.0, and RDKit 2025.9.1 (pip-installed).
Test command
python3.12 -m pytest --pyargs /app/nvMolKit/nvmolkit/tests
When running with -k "not async", many of the tests below pass.
Example error
/app/nvMolKit/nvmolkit/tests/test_types.py:35:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
device = None

    def synchronize(device: "Device" = None) -> None:
        r"""Wait for all kernels in all streams on a CUDA device to complete.

        Args:
            device (torch.device or int, optional): device for which to synchronize.
                It uses the current device, given by :func:`~torch.cuda.current_device`,
                if :attr:`device` is ``None`` (default).
        """
        _lazy_init()
        with torch.cuda.device(device):
>           return torch._C._cuda_synchronize()
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E       torch.AcceleratorError: CUDA error: an illegal memory access was encountered
E       Search for `cudaErrorIllegalAddress' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
E       CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
E       For debugging consider passing CUDA_LAUNCH_BLOCKING=1
E       Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
/opt/python/cp312-cp312/lib/python3.12/site-packages/torch/cuda/__init__.py:1083: AcceleratorError
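Because the error text warns that the stack trace may point at the wrong call, one way to narrow this down is to rerun the test command above with CUDA_LAUNCH_BLOCKING=1 set, as the message suggests. The following is only a sketch: the helper names (build_pytest_command, blocking_env, run_blocking) are hypothetical, and it uses the current interpreter rather than python3.12 explicitly.

```python
import os
import subprocess
import sys

# Test path copied from the report; adjust it for your own checkout.
TEST_PATH = "/app/nvMolKit/nvmolkit/tests"


def build_pytest_command(test_path, keyword=None):
    """Return the pytest argv, optionally with a -k keyword filter."""
    cmd = [sys.executable, "-m", "pytest", "--pyargs", test_path]
    if keyword:
        cmd += ["-k", keyword]
    return cmd


def blocking_env(base_env=None):
    """Copy the environment and request synchronous CUDA kernel launches."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_blocking(test_path=TEST_PATH, keyword=None):
    """Run pytest with CUDA_LAUNCH_BLOCKING=1 and return its exit code."""
    result = subprocess.run(build_pytest_command(test_path, keyword), env=blocking_env())
    return result.returncode
```

Calling run_blocking() reruns the full suite; run_blocking(keyword="not async") reproduces the filtered run mentioned above. With blocking launches, the reported failure should be closer to the kernel that actually faulted, at the cost of slower execution.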
All tests that failed:
FAILED ../app/nvMolKit/nvmolkit/tests/test_mmff_optimization.py::test_mmff_optimization_batch_vs_rdkit[1-0-gpu_ids1] - AssertionError: Molecule 0, Conformer 0: energy mismatch: RDKit=26.874311, nvMolKit=125669.641792, abs_diff=125642.767481, rel_error=4675.199514
FAILED ../app/nvMolKit/nvmolkit/tests/test_mmff_optimization.py::test_mmff_optimization_batch_vs_rdkit[1-2-gpu_ids1] - RuntimeError: Encountered CUDA error 101: invalid device ordinal
FAILED ../app/nvMolKit/nvmolkit/tests/test_mmff_optimization.py::test_mmff_optimization_batch_vs_rdkit[1-5-gpu_ids1] - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_mmff_optimization.py::test_mmff_optimization_batch_vs_rdkit[3-0-gpu_ids1] - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_mmff_optimization.py::test_mmff_optimization_batch_vs_rdkit[3-2-gpu_ids1] - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_mmff_optimization.py::test_mmff_optimization_batch_vs_rdkit[3-5-gpu_ids1] - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_mmff_optimization.py::test_mmff_optimization_allows_large_molecule_interleaved - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_mmff_optimization.py::test_error_case_throws_properly - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_cross_similarity_fp_mismatch[tanimoto] - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_cross_similarity_fp_mismatch[cosine] - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_nvmolkit_cross_tanimoto_similarity_from_nvmolkit_fp - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_nxm_cross_tanimoto_similarity_from_nvmolkit_fp[nxmdims0] - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_nxm_cross_tanimoto_similarity_from_nvmolkit_fp[nxmdims1] - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_nxm_cross_tanimoto_similarity_from_nvmolkit_fp[nxmdims2] - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_nxm_cross_tanimoto_similarity_from_nvmolkit_fp[nxmdims3] - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_nxm_cross_tanimoto_similarity_from_packing[nxmdims0] - torch.AcceleratorError: CUDA error: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_nxm_cross_tanimoto_similarity_from_packing[nxmdims1] - torch.AcceleratorError: CUDA error: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_nxm_cross_tanimoto_similarity_from_packing[nxmdims2] - torch.AcceleratorError: CUDA error: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_nvmolkit_cross_cosine_similarity_from_nvmolkit_fp - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_nxm_cross_cosine_similarity_from_nvmolkit_fp[nxmdims0] - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_nxm_cross_cosine_similarity_from_nvmolkit_fp[nxmdims1] - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_nxm_cross_cosine_similarity_from_nvmolkit_fp[nxmdims2] - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_nxm_cross_cosine_similarity_from_nvmolkit_fp[nxmdims3] - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_nxm_cross_cosine_similarity_from_nvmolkit_fp[nxmdims4] - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_nxm_cross_cosine_similarity_from_nvmolkit_fp[nxmdims5] - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_memory_constrained_tanimoto_self - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_memory_constrained_tanimoto_cross[nxmdims0] - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_memory_constrained_tanimoto_cross[nxmdims1] - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_memory_constrained_tanimoto_cross[nxmdims2] - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_memory_constrained_cosine_self - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_memory_constrained_cosine_cross[nxmdims0] - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_memory_constrained_cosine_cross[nxmdims1] - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_memory_constrained_cosine_cross[nxmdims2] - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_memory_constrained_segmented_path_large_cross[tanimoto] - torch.AcceleratorError: CUDA error: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_memory_constrained_segmented_path_large_cross[cosine] - torch.AcceleratorError: CUDA error: an illegal memory access was encountered
FAILED ../app/nvMolKit/nvmolkit/tests/test_types.py::test_async_gpu_result_release_frees_memory - torch.AcceleratorError: CUDA error: an illegal memory access was encountered
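To see at a glance how the failures above split by test file and exception type, the summary lines can be tallied with a small script. This is a rough sketch, not part of nvMolKit: the regular expression and the tally_failures helper are assumptions based on the line format shown in this report.

```python
import re
from collections import Counter

# Matches pytest short-summary lines of the form
#   FAILED path/to/test_file.py::test_name[params] - ErrorType: message
SUMMARY_RE = re.compile(
    r"^FAILED\s+(?P<path>\S+?)::(?P<test>.+?) - (?P<error>[\w.]+): (?P<msg>.*)$"
)


def tally_failures(lines):
    """Count failures per (test file name, exception type)."""
    counts = Counter()
    for line in lines:
        match = SUMMARY_RE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a FAILED summary line
        test_file = match.group("path").rsplit("/", 1)[-1]
        counts[(test_file, match.group("error"))] += 1
    return counts


# Three lines copied from the list above, used as sample input.
sample = [
    "FAILED ../app/nvMolKit/nvmolkit/tests/test_mmff_optimization.py::test_mmff_optimization_batch_vs_rdkit[1-2-gpu_ids1] - RuntimeError: Encountered CUDA error 101: invalid device ordinal",
    "FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_cross_similarity_fp_mismatch[tanimoto] - RuntimeError: Encountered CUDA error 700: an illegal memory access was encountered",
    "FAILED ../app/nvMolKit/nvmolkit/tests/test_similarity.py::test_nxm_cross_tanimoto_similarity_from_packing[nxmdims0] - torch.AcceleratorError: CUDA error: an illegal memory access was encountered",
]

for (test_file, error), count in sorted(tally_failures(sample).items()):
    print(f"{test_file}: {error} x{count}")
```

Feeding the full pytest summary into tally_failures gives the same breakdown for every failing test, which may help separate the single energy mismatch and the invalid device ordinal from the illegal memory access errors.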
nvidia-smi:
$ nvidia-smi
Tue Oct 21 03:25:06 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.95.02              Driver Version: 581.42         CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA RTX A4500 Laptop GPU    On  |   00000000:01:00.0 Off |                  Off |
| N/A   50C    P0             33W /   91W |       0MiB /  16384MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |