-
Notifications
You must be signed in to change notification settings - Fork 30
Improve mmap #281
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve mmap #281
Conversation
| return _temp_dir, temp_dir_name | ||
|
|
||
|
|
||
| _TEMP_DIR, TEMP_DIR_NAME = make_temp_dir() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the culprit
| """Initialize the temporary directory for the temp mmap arrays if not already done.""" | ||
| global _TEMP_DIR, TEMP_DIR_NAME | ||
|
|
||
| if _TEMP_DIR is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the fix
| import alphatims.utils | ||
|
|
||
|
|
||
| _TEMP_DIR, TEMP_DIR_NAME = None, None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The initialization function lacks a proper return value when _TEMP_DIR already exists. Always return TEMP_DIR_NAME for consistency regardless of whether we're initializing a new directory or using an existing one.
_TEMP_DIR, TEMP_DIR_NAME = None, None
ARRAYS = {}
CLOSED = False
ALLOW_NDARRAY_SUBCLASS = False
def _init_temp_dir(prefix: str = "temp_mmap_"):
"Initialize the temporary directory for the temp mmap arrays if not already done."
global _TEMP_DIR, TEMP_DIR_NAME
if _TEMP_DIR is None:
_TEMP_DIR, TEMP_DIR_NAME = make_temp_dir(prefix)
return TEMP_DIR_NAME
return TEMP_DIR_NAME| temp_dir_name = _temp_dir.name | ||
|
|
||
| logging.info( | ||
| f"Memory-mapped arrays are written to temporary directory {temp_dir_name}. " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logging statement is using TEMP_DIR_NAME which hasn't been assigned yet at this point. It should use temp_dir_name instead, which is the local variable containing the path to the newly created temporary directory.
logging.info(
f"Memory-mapped arrays are written to temporary directory {temp_dir_name}. "
"Cleanup of this folder is OS dependent and might need to be triggered manually! "
f"Current space: {shutil.disk_usage(temp_dir_name)[-1]:,}"
)| np.ndarray | ||
| A writable temporary mmapped array. | ||
| """ | ||
| temp_dir_name = _init_temp_dir() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function should store the return value of _init_temp_dir() to ensure it's using the correct directory path.
temp_dir_name = _init_temp_dir()| _memmap.close() | ||
| del _memmap | ||
| del _TEMP_DIR | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docstring is incorrectly formatted. The " should be placed around the entire docstring including the "WARNING" section to properly document the function.
def reset() -> str:
"Reset the temporary folder containing temp mmapped arrays and create a new temporary folder."
WARNING: All existing temp mmapp arrays will become unusable!
Returns
-------
str
The name of the new temporary folder.
"
global _TEMP_DIR
global TEMP_DIR_NAME
global ARRAYS
_reset()
_TEMP_DIR, TEMP_DIR_NAME = make_temp_dir()
ARRAYS = {}
return TEMP_DIR_NAME| - uses: actions/checkout@v2 | ||
| - uses: actions/checkout@v4 | ||
| - uses: conda-incubator/setup-miniconda@v3 | ||
| with: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a duplicate miniconda-version line in the GitHub workflow. The redundant line should be removed to avoid confusion and potential conflicts.
- uses: conda-incubator/setup-miniconda@v3
with:
auto-update-conda: true
miniconda-version: "latest"
python-version: ${{ matrix.python-version }}
README.md
Outdated
| ## How to contribute | ||
|
|
||
| If you like AlphaTims you can give us a [star](stargazers) to boost our visibility! All direct contributions are also welcome. Feel free to post a new [issue](https://github.com/MannLabs/alphabase/issues) or clone the repository and create a [pull request](https://github.com/MannLabs/alphabase/pulls) with a new branch. For an even more interactive participation, check out the [discussions](https://github.com/MannLabs/alphabase/discussions). | ||
| If you like AlphaTims you can give us a [star](stargazers) to boost our visibility! All direct contributions are also welcome. Feel free to post a new [issue](https://github.com/MannLabs/alphatims/issues) or clone the repository and create a [pull request](https://github.com/MannLabs/alphabase/pulls) with a new branch. For an even more interactive participation, check out the [discussions](https://github.com/MannLabs/alphabase/discussions). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The URL for pull requests is incorrect. It currently points to alphabase/pulls instead of alphatims/pulls. This should be corrected to maintain consistency with the issues link which was properly updated.
If you like AlphaTims you can give us a [star](stargazers) to boost our visibility! All direct contributions are also welcome. Feel free to post a new [issue](https://github.com/MannLabs/alphatims/issues) or clone the repository and create a [pull request](https://github.com/MannLabs/alphatims/pulls) with a new branch. For an even more interactive participation, check out the [discussions](https://github.com/MannLabs/alphatims/discussions).|
Number of tokens: input_tokens=15717 output_tokens=1269 MAX_NUM_OUTPUT_TOKENS=4096 |
sander-willems-bruker
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Been over it quickly. Have not tested it or scrutinized the code, but looks good and if it still works as expected with a quick local test I am happy to give a (mostly rubber stamp) approval
README.md
Outdated
| ## How to contribute | ||
|
|
||
| If you like AlphaTims you can give us a [star](stargazers) to boost our visibility! All direct contributions are also welcome. Feel free to post a new [issue](https://github.com/MannLabs/alphabase/issues) or clone the repository and create a [pull request](https://github.com/MannLabs/alphabase/pulls) with a new branch. For an even more interactive participation, check out the [discussions](https://github.com/MannLabs/alphabase/discussions). | ||
| If you like AlphaTims you can give us a [star](stargazers) to boost our visibility! All direct contributions are also welcome. Feel free to post a new [issue](https://github.com/MannLabs/alphatims/issues) or clone the repository and create a [pull request](https://github.com/MannLabs/alphatims/pulls) with a new branch. For an even more interactive participation, check out the [discussions](https://github.com/MannLabs/alphabase/discussions). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
discussions also still point to alphabase
| logging.info(f"cpu frequency - {psutil.cpu_freq().current:.2f} Mhz") | ||
| except AttributeError: | ||
| logging.info("Unable to log cpu frequency") | ||
| except Exception as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why use generic errors than specific?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to avoid any interference of this non-critical logic (logging cpu frequency) with the remaning code execution.
More generally:
While narrow exception clauses have their merits in many cases, I feel that broad exception handling should be frowned upon less (it's even flagged by ruff!)
One reason being the above one: I want to continue regardless of the error.
Another one is that narrow exception handling couples the code quite tightly to an API: if someone chooses to raise some CustomError instead of a ValueError in a new version of their package, the narrow exception handling breaks. And I guess exceptions are an often overlooked part of an API, so spurious changes are more likely there.
yup, gave it a quick test and still worked.. also, all automated tests pass |
Remove logic on module import:
This caused slowing down unit tests of AlphaDIA as well as many log messages.
solves #279