"""Decorator to Perf with any write :py:class:`~sklearn.metrics` documentation
33
+
"""
34
+
35
+
func.__doc__=f""":py:class:`~CompStats.interface.Perf` with :py:func:`~sklearn.metrics.{func.__name__}` as :py:attr:`score_func.` The parameters not described can be found in :py:func:`~sklearn.metrics.{func.__name__}`.
36
+
37
+
:param y_true: True measurement or could be a pandas.DataFrame where column label 'y' corresponds to the true measurement.
38
+
:type y_true: numpy.ndarray or pandas.DataFrame
39
+
:param y_pred: Predictions, the algorithms will be identified with alg-k where k=1 is the first argument included in :py:attr:`y_pred.`
40
+
:type y_pred: numpy.ndarray
41
+
:param kwargs: Predictions, the algorithms will be identified using the keyword
42
+
:type kwargs: numpy.ndarray
43
+
:param num_samples: Number of bootstrap samples, default=500.
44
+
:type num_samples: int
45
+
:param n_jobs: Number of jobs to compute the statistic, default=-1 corresponding to use all threads.
46
+
:type n_jobs: int
47
+
:param use_tqdm: Whether to use tqdm.tqdm to visualize the progress, default=True
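As a minimal sketch of how these parameters fit together, the following hypothetical call uses the wrapped `CompStats.metrics.f1_score` shown later in this document; the keyword names `svm` and `forest`, and the mixing of prediction keywords with the metric's own keyword arguments (`average`), are illustrative assumptions based on the parameter descriptions above, not the library's verbatim API.

>>> from sklearn.datasets import load_digits
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.svm import LinearSVC
>>> from sklearn.ensemble import RandomForestClassifier
>>> from CompStats.metrics import f1_score
>>> X_train, X_val, y_train, y_val = train_test_split(*load_digits(return_X_y=True), test_size=0.3)
>>> svm_pred = LinearSVC().fit(X_train, y_train).predict(X_val)
>>> forest_pred = RandomForestClassifier().fit(X_train, y_train).predict(X_val)
>>> # Predictions passed by keyword are identified by that keyword;
>>> # num_samples and n_jobs set the bootstrap behavior documented above.
>>> perf = f1_score(y_val, svm=svm_pred, forest=forest_pred, average='macro', num_samples=500, n_jobs=-1)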
Collaborative competitions have gained popularity in the scientific and technological fields. These competitions involve defining tasks, selecting evaluation scores, and devising result verification methods. In the standard scenario, participants receive a training set and are expected to provide a solution for a held-out dataset kept by the organizers. An essential challenge for organizers arises when comparing the algorithms' performance, assessing multiple participants, and ranking them. Statistical tools are often used for this purpose; however, traditional statistical methods frequently fail to capture decisive differences between systems' performance. CompStats implements an evaluation methodology for statistically analyzing and comparing competition results. CompStats offers several advantages, including off-the-shelf comparisons with correction mechanisms and the inclusion of confidence intervals.
To illustrate the use of `CompStats`, the following snippets show an example. The instructions load the necessary libraries, including the one that provides the problem (e.g., digits), three different classifiers, and, in the last line, the score used to measure performance and compare the algorithms.
>>> from sklearn.svm import LinearSVC
>>> from sklearn.naive_bayes import GaussianNB
>>> from sklearn.ensemble import RandomForestClassifier
>>> from sklearn.datasets import load_digits
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.base import clone
>>> from CompStats.metrics import f1_score
The first step is to load the digits problem and split the dataset into training and validation sets. The second step is to estimate the parameters of a linear Support Vector Machine and predict the validation set's classes. The predictions are stored in the variable `hy`.
>>> X, y = load_digits(return_X_y=True)
>>> _ = train_test_split(X, y, test_size=0.3)
>>> X_train, X_val, y_train, y_val = _
>>> m = LinearSVC().fit(X_train, y_train)
>>> hy = m.predict(X_val)
Once the predictions are available, it is time to measure the algorithm's performance, as seen in the following code. It is essential to note that the API of `sklearn.metrics` is followed; the difference is that the function returns an instance whose methods can be used to estimate several performance statistics and compare algorithms.
>>> score = f1_score(y_val, hy, average='macro')
>>> score
<Perf>
Prediction statistics with standard error
alg-1 = 0.936 (0.010)
The previous code shows the macro-f1 score and, in parentheses, its standard error. The actual performance value can be obtained with the `statistic` method.
>>> score.statistic()
{'alg-1': 0.9355476018466147}
Continuing with the example, let us assume that one wants to test another classifier on the same problem, in this case, a random forest, as can be seen in the following two lines. The second line predicts the validation set and adds the predictions to the analysis.
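Those two lines are not included in this excerpt; the following is a minimal sketch under the assumption that the returned `Perf` instance accepts additional predictions as keyword arguments (as the `kwargs` parameter documented above suggests), with `forest` being an illustrative keyword name.

>>> # Hypothetical: label the new predictions 'forest' by passing them as a keyword.
>>> ens = RandomForestClassifier().fit(X_train, y_train)
>>> score(forest=ens.predict(X_val))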