Skip to content
This repository was archived by the owner on Apr 10, 2025. It is now read-only.

first pass at exploring infra flakes#448

Open
antter wants to merge 1 commit intoaicoe-aiops:masterfrom
antter:infra_flakes
Open

first pass at exploring infra flakes#448
antter wants to merge 1 commit intoaicoe-aiops:masterfrom
antter:infra_flakes

Conversation

@antter
Copy link
Contributor

@antter antter commented Nov 17, 2021

Related Issues and Dependencies

#447

This introduces a breaking change

  • Yes
  • [x ] No

This Pull Request implements

Infra flakes are likely not detectable in an honest matter with methods by just looking at grid data; more analysis will be needed with different data, e.g. the logs of the outputs. This is a notebook that works to detect anomalous columns in the testgrid data just using very simple probability calculations.

@sesheta sesheta added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 17, 2021
@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@sesheta
Copy link
Contributor

sesheta commented Nov 17, 2021

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign oindrillac after the PR has been reviewed.
You can assign the PR to them by writing /assign @oindrillac in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sesheta sesheta added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Nov 17, 2021
@antter antter force-pushed the infra_flakes branch 2 times, most recently from d99c734 to 2a16d7e Compare November 22, 2021 18:45
@antter antter changed the title [WIP] first pass at exploring infra flakes first pass at exploring infra flakes Nov 24, 2021
@sesheta sesheta removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 24, 2021
@Shreyanand Shreyanand self-requested a review December 9, 2021 14:00
@@ -0,0 +1,973 @@
{
Copy link
Member

@Shreyanand Shreyanand Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we sure that the x-axis here are in days? If iirc they were successive builds. I could be wrong here, we should double check.


Reply via ReviewNB

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No you're right I just copy pasted this graph from some other place

@@ -0,0 +1,973 @@
{
Copy link
Member

@Shreyanand Shreyanand Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we have a count of the NA values? If it's 1% it wouldn't be a problem, if it's 99% we'd have to think this through.


Reply via ReviewNB

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the NA value counting go before the array is fillna'ed with zeros?

@@ -0,0 +1,973 @@
{
Copy link
Member

@Shreyanand Shreyanand Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change the last line to better explain the example. "So if a column has 2 tests that failed, and the first had a row score of 2/5...."


Reply via ReviewNB

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please describe the intuition behind the approach in a bit more detail? E.g. what does sum of entropies signify? why are we looking at entropies of row scores for failed tests?

@@ -0,0 +1,973 @@
{
Copy link
Member

@Shreyanand Shreyanand Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why just 50?


Reply via ReviewNB

@@ -0,0 +1,973 @@
{
Copy link
Member

@Shreyanand Shreyanand Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this output mean?


Reply via ReviewNB

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If these outputs are meant to function as a "progress bar", I'd suggest just using tqdm instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My tqdm has never worked on the MOC jupyterhub ):

@@ -0,0 +1,973 @@
{
Copy link
Member

@Shreyanand Shreyanand Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are you bringing the score into the 0-1 range? What does 0 mean and what does 1 mean?


Reply via ReviewNB

@@ -0,0 +1,973 @@
{
Copy link
Member

@Shreyanand Shreyanand Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we want to know the answer to this question?


Reply via ReviewNB

@@ -0,0 +1,973 @@
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you print some examples of jobs that have a high unexpected failure rate according to this method? And some that are expected failures?


Reply via ReviewNB

Copy link
Member

@Shreyanand Shreyanand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quite a creative method @antter 👏
Left some comments on the notebook.

@@ -0,0 +1,973 @@
{
Copy link
Member

@chauhankaranraj chauhankaranraj Dec 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nit, adding a legend to this chart might help read it better :)


Reply via ReviewNB

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants