Merged
2 changes: 1 addition & 1 deletion .github/workflows/lint.yml
Original file line number Diff line number Diff line change
@@ -25,7 +25,7 @@ jobs:
- name: black
uses: psf/black@stable
with:
options: "--check --verbose"
options: "--check --verbose --diff --color"
src: "./dtaianomaly"

# Install dtaianomaly (not required for black)
2 changes: 1 addition & 1 deletion .github/workflows/unit-tests.yml
@@ -31,7 +31,7 @@ jobs:
- name: Install dtaianomaly
run: |
python -m pip install --upgrade pip
pip install .[tests]
pip install .[tests,tqdm]
pip list

- name: Test with pytest
16 changes: 16 additions & 0 deletions docs/additional_information/changelog.rst
@@ -10,9 +10,22 @@ Added
^^^^^
- Implemented ``KShapeAnomalyDetector`` anomaly detector.
- Added arXiv citation to the documentation.
- Added support for TOML configuration files in the ``Workflow``.
- Added option to fit semi-supervised methods on test data in ``Workflow``.
- Added option to show a progress bar when running a ``Workflow``.
- Added optional feature names and time steps to ``DataSet``.
- Added option for relative bounds when automatically computing the window size.
- Added option to pass kwargs to the ``Workflow``.


Changed
^^^^^^^
- ``BestThresholdMetric`` now accepts an optional list of thresholds to use.
- ``BestThresholdMetric`` stores all used thresholds and their respective scores.
- ``BaseDetector`` now checks the input variables by default, so implemented
  detectors no longer need to do this themselves.
- Removed the unused ``Evaluation.run()`` method.


Fixed
^^^^^
@@ -21,6 +34,9 @@ Fixed
'pyximport' within tslearn was not found, even though it is not necessary for our
codebase. Therefore, we added this dependency to the mock imports, which fixed
the issue.
- Ensured that ``interpret_additional_information()`` dynamically checks the possible
parameters of a ``Workflow``.
- Parameter ``y`` in ``visualizations.plot_with_zoom()`` is now optional.

[0.3.0] - 2025-01-31
--------------------
14 changes: 9 additions & 5 deletions docs/api/data.rst
@@ -9,20 +9,24 @@ Data module
.. autoclass:: dtaianomaly.data.DataSet
:members:

Synthetic data
--------------
Demonstration time series
-------------------------

.. autofunction:: dtaianomaly.data.demonstration_time_series

.. autoclass:: dtaianomaly.data.DemonstrationTimeSeriesLoader

.. image:: /../notebooks/Demonstration-time-series.svg
:align: center
:width: 100%

.. autofunction:: dtaianomaly.data.make_sine_wave


Loading data
------------

.. autoclass:: dtaianomaly.data.UCRLoader
.. autoclass:: dtaianomaly.data.PathDataLoader
:members:

.. autofunction:: dtaianomaly.data.from_directory

.. autoclass:: dtaianomaly.data.UCRLoader
1 change: 1 addition & 0 deletions docs/api/evaluation.rst
@@ -18,3 +18,4 @@ Evaluation module
.. autoclass:: dtaianomaly.evaluation.AreaUnderPR
.. autoclass:: dtaianomaly.evaluation.AreaUnderROC
.. autoclass:: dtaianomaly.evaluation.BestThresholdMetric
:members: _compute
4 changes: 2 additions & 2 deletions docs/api/visualization.rst
@@ -85,7 +85,7 @@ Visualization module
>>> from dtaianomaly.visualization import plot_time_series_anomalies
>>> from dtaianomaly.anomaly_detection import IsolationForest
>>> from dtaianomaly.thresholding import FixedCutoff
>>> X, _ = demonstration_time_series()
>>> X, y = demonstration_time_series()
>>> y_pred = IsolationForest(window_size=100).fit(X).predict_proba(X)
>>> y_pred_binary = FixedCutoff(cutoff=0.9).threshold(y_pred)
>>> fig = plot_time_series_anomalies(X, y, y_pred_binary, figsize=(10, 3))
@@ -100,5 +100,5 @@ Visualization module
>>> from dtaianomaly.data import demonstration_time_series
>>> from dtaianomaly.visualization import plot_with_zoom
>>> X, y = demonstration_time_series()
>>> fig = plot_with_zoom(X, y, start_zoom=700, end_zoom=1200, figsize=(10, 3))
>>> fig = plot_with_zoom(X, y=y, start_zoom=700, end_zoom=1200, figsize=(10, 3))
>>> fig.suptitle("Example of 'plot_with_zoom'") # doctest: +SKIP
101 changes: 65 additions & 36 deletions docs/getting_started/examples/anomaly_detection.rst
@@ -1,5 +1,20 @@
:orphan:

.. testsetup::

X, y = None, None
X_, y_ = None, None
preprocessor = None
detector = None
y_pred = None
pipeline = None
thresholding = None
y_pred_binary = None
precision = None
recall = None
f_1 = None
auc_roc, auc_pr = None, None

Anomaly detection
=================

@@ -26,12 +41,12 @@ demonstration time series. This time series can easily be loaded using the
using the :py:func:`~dtaianomaly.visualization.plot_time_series_colored_by_score`
method.

.. code-block:: python
.. doctest::

from dtaianomaly.data import demonstration_time_series
from dtaianomaly.visualization import plot_time_series_colored_by_score
X, y = demonstration_time_series()
plot_time_series_colored_by_score(X, y, figsize=(10, 2))
>>> from dtaianomaly.data import demonstration_time_series
>>> from dtaianomaly.visualization import plot_time_series_colored_by_score
>>> X, y = demonstration_time_series()
>>> plot_time_series_colored_by_score(X, y, figsize=(10, 2)) # doctest: +SKIP

.. image:: /../notebooks/Demonstration-time-series.svg
:align: center
@@ -44,19 +59,19 @@ Before detecting anomalies, we can preprocess the time series. In this case,
we apply :py:class:`~dtaianomaly.preprocessing.MovingAverage` to remove some
of the noise from the time series.

.. code-block:: python
.. doctest::

from dtaianomaly.preprocessing import MovingAverage
preprocessor = MovingAverage(window_size=10)
>>> from dtaianomaly.preprocessing import MovingAverage
>>> preprocessor = MovingAverage(window_size=10)
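
For intuition, a trailing moving average can be sketched in a few lines of plain
Python (a hedged sketch only; the exact windowing and edge handling of
``MovingAverage`` may differ):

```python
def moving_average(x, window_size):
    # Trailing moving average; the first few points use a shorter window
    # so the output keeps the same length as the input.
    out = []
    for i in range(len(x)):
        lo = max(0, i - window_size + 1)
        out.append(sum(x[lo:i + 1]) / (i + 1 - lo))
    return out

smoothed = moving_average([1.0, 2.0, 3.0, 4.0], window_size=2)  # [1.0, 1.5, 2.5, 3.5]
```

Averaging each point with its recent neighbours suppresses high-frequency noise
while keeping the overall shape of the series.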

In general, `any anomaly detector <https://dtaianomaly.readthedocs.io/en/stable/api/anomaly_detection.html>`_
in ``dtaianomaly`` can be used to detect anomalies in this time series. Here, we use the
:py:class:`~dtaianomaly.anomaly_detection.MatrixProfileDetector`

.. code-block:: python
.. doctest::

from dtaianomaly.anomaly_detection import MatrixProfileDetector
detector = MatrixProfileDetector(window_size=100)
>>> from dtaianomaly.anomaly_detection import MatrixProfileDetector
>>> detector = MatrixProfileDetector(window_size=100)
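
The matrix-profile idea itself can be sketched without any libraries: each
window's score is the distance to its nearest non-overlapping match, so windows
with no close match anywhere (discords) are anomaly candidates. This naive
O(n²) version is for intuition only; real matrix-profile detectors rely on much
faster algorithms.

```python
import math

def naive_matrix_profile(x, m):
    # Distance from each length-m window to its nearest match, skipping
    # overlapping (trivial) matches; large values flag discords.
    n = len(x) - m + 1
    windows = [x[i:i + m] for i in range(n)]
    profile = []
    for i in range(n):
        best = math.inf
        for j in range(n):
            if abs(i - j) < m:  # exclusion zone around the window itself
                continue
            best = min(best, math.dist(windows[i], windows[j]))
        profile.append(best)
    return profile

x = [0, 1, 0, 1, 0, 1, 5, 1, 0, 1, 0, 1]
profile = naive_matrix_profile(x, m=3)
# The windows covering the spike at index 6 get the largest distances.
```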


Now that the components have been initialized, we can preprocess the time series and
@@ -66,10 +81,10 @@ does not process the ground truth, other preprocessors may change the ground truth.
For example, :py:class:`~dtaianomaly.preprocessing.SamplingRateUnderSampler` samples both
the time series ``X`` and labels ``y``.

.. code-block:: python
.. doctest::

X_, y_ = preprocessor.fit_transform(X)
y_pred = detector.fit(X_).predict_proba(X_)
>>> X_, y_ = preprocessor.fit_transform(X)
>>> y_pred = detector.fit(X_).predict_proba(X_)

Now we can plot the data along with the anomaly scores, and see that the predictions
nicely align with the anomaly!
@@ -89,14 +104,14 @@ will automatically process the data before detecting anomalies. Note that it is
possible to pass a list of preprocessors to apply multiple preprocessing steps before
detecting anomalies.

.. code-block:: python
.. doctest::

from dtaianomaly.pipeline import Pipeline
pipeline = Pipeline(
preprocessor=preprocessor,
detector=detector
)
y_pred = pipeline.fit(X).predict_proba(X)
>>> from dtaianomaly.pipeline import Pipeline
>>> pipeline = Pipeline(
... preprocessor=preprocessor,
... detector=detector
... )
>>> y_pred = pipeline.fit(X).predict_proba(X)
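
The chaining a pipeline performs can be illustrated with a hypothetical,
minimal sketch (the ``Identity`` and ``MeanDeviationDetector`` classes below
are toy stand-ins, not part of ``dtaianomaly``, and the real ``Pipeline`` also
validates inputs and handles the labels):

```python
class Identity:
    # Toy preprocessor: returns the data unchanged.
    def fit_transform(self, X):
        return X, None

class MeanDeviationDetector:
    # Toy detector: anomaly score = absolute deviation from the training mean.
    def fit(self, X):
        self.mean = sum(X) / len(X)
        return self

    def predict_proba(self, X):
        return [abs(x - self.mean) for x in X]

class MiniPipeline:
    # Preprocess first, then fit/score the detector on the transformed data.
    def __init__(self, preprocessor, detector):
        self.preprocessor = preprocessor
        self.detector = detector

    def fit(self, X):
        X_, _ = self.preprocessor.fit_transform(X)
        self.detector.fit(X_)
        return self

    def predict_proba(self, X):
        X_, _ = self.preprocessor.fit_transform(X)
        return self.detector.predict_proba(X_)

data = [1.0, 1.0, 1.0, 9.0, 1.0]
scores = MiniPipeline(Identity(), MeanDeviationDetector()).fit(data).predict_proba(data)
```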

Quantitative evaluation
-----------------------
@@ -111,14 +126,14 @@ to 0 ("normal"). At this threshold, we see that all anomalous observations are detected
(recall=1.0), at the cost of some false positives near the borders of the ground truth
anomaly (precision<1).

.. code-block:: python
.. doctest::

from dtaianomaly.thresholding import FixedCutoff
from dtaianomaly.evaluation import Precision, Recall
thresholding = FixedCutoff(0.85)
y_pred_binary = thresholding.threshold(y_pred)
precision = Precision().compute(y, y_pred_binary)
recall = Recall().compute(y, y_pred_binary)
>>> from dtaianomaly.thresholding import FixedCutoff
>>> from dtaianomaly.evaluation import Precision, Recall
>>> thresholding = FixedCutoff(0.85)
>>> y_pred_binary = thresholding.threshold(y_pred)
>>> precision = Precision().compute(y, y_pred_binary)
>>> recall = Recall().compute(y, y_pred_binary)
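
What these steps compute can be reproduced in plain Python (a sketch of the
concepts, not ``dtaianomaly``'s implementation; whether the real
``FixedCutoff`` uses ``>=`` or ``>`` is not checked here):

```python
def fixed_cutoff(scores, cutoff):
    # Scores at or above the cutoff become anomalies (1), the rest normal (0).
    return [int(s >= cutoff) for s in scores]

def precision(y_true, y_pred):
    # Fraction of predicted anomalies that are真 -- true anomalies.
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    return tp / (tp + fp)

def recall(y_true, y_pred):
    # Fraction of true anomalies that were predicted as anomalies.
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    return tp / (tp + fn)

y_true = [0, 1, 1, 1]
y_pred = fixed_cutoff([0.2, 0.9, 0.95, 0.3], cutoff=0.85)  # [0, 1, 1, 0]
```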


As an alternative to manually applying a threshold to convert the continuous scores to
@@ -127,21 +142,35 @@ which will automatically apply a specified thresholding strategy before using a
evaluation metric. Below, we use the same thresholding as above, but compute the
:py:class:`~dtaianomaly.evaluation.FBeta` score with :math:`\\beta = 1`.

.. code-block:: python
.. doctest::

from dtaianomaly.evaluation import ThresholdMetric, FBeta
f_1 = ThresholdMetric(thresholding, FBeta(1.0)).compute(y, y_pred)
>>> from dtaianomaly.evaluation import ThresholdMetric, FBeta
>>> f_1 = ThresholdMetric(thresholding, FBeta(1.0)).compute(y, y_pred)

Lastly, we also compute the :py:class:`~dtaianomaly.evaluation.AreaUnderROC` and
:py:class:`~dtaianomaly.evaluation.AreaUnderPR`. Because these metrics create a
curve for all possible thresholds, we can simply pass the predicted, continuous
anomaly scores, as shown below.

.. code-block:: python

from dtaianomaly.evaluation import AreaUnderROC, AreaUnderPR
auc_roc = AreaUnderROC().compute(y, y_pred)
auc_pr = AreaUnderPR().compute(y, y_pred)
.. doctest::

>>> from dtaianomaly.evaluation import AreaUnderROC, AreaUnderPR
>>> auc_roc = AreaUnderROC().compute(y, y_pred)
>>> auc_pr = AreaUnderPR().compute(y, y_pred)
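
The "curve over all possible thresholds" view of AUC-ROC has an equivalent rank
formulation: it is the probability that a randomly chosen anomaly receives a
higher score than a randomly chosen normal point. A minimal, library-free
sketch (for intuition, not ``dtaianomaly``'s implementation):

```python
def auc_roc(y_true, y_scores):
    # Rank formulation of AUC-ROC: fraction of (anomaly, normal) pairs in
    # which the anomaly is scored higher; ties count for half.
    pos = [s for s, t in zip(y_scores, y_true) if t == 1]
    neg = [s for s, t in zip(y_scores, y_true) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

auc = auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

A perfect ranking, in which every anomaly outscores every normal point, yields
an AUC-ROC of 1.0; random scores yield about 0.5.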

.. doctest::
:hide:

>>> round(precision, 2)
0.64
>>> round(recall, 2)
1.0
>>> round(f_1, 2)
0.78
>>> round(auc_roc, 2)
0.99
>>> round(auc_pr, 2)
0.68

The table below shows the computed performance metrics for this example.
