Confidence implementation is not correct #1

@ferranmunoz

Description

@ferranmunoz

Do you agree with the following comment from the blog?
The otherItemsFrequency can be obtained from fList for the rules with a single item in the precondition.

But it seems to me that your confidence implementation is still not correct.
In your code:
double confidence = (double)occurrence / firstFrequencyItem;

Let's look at the formula: conf(X => Y) = support(X&Y) / support(X)
In the code, occurrence is the value that corresponds to the X&Y set, but firstFrequencyItem is the value that corresponds to Y. So you calculate support(X&Y) / support(Y). As you can see, that is not the confidence of the rule X => Y.

The implementation should look like:
double confidence = (double)occurrence / otherItemsFrequency;
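
To make the difference concrete, here is a minimal sketch with made-up counts. The class name, the confidence helper and all the numbers are hypothetical and are not taken from the project. The sketch only assumes, as described above, that occurrence counts the transactions containing X&Y, otherItemsFrequency counts those containing X, and firstFrequencyItem counts those containing Y.

```java
// Hypothetical sketch, not code from the project: compares the two divisions
// discussed above on invented counts.
class ConfidenceSketch {

    // conf(X => Y) = support(X&Y) / support(X)
    public static double confidence(long occurrence, long otherItemsFrequency) {
        if (otherItemsFrequency <= 0) {
            throw new IllegalArgumentException("support(X) must be positive");
        }
        // Cast before dividing so the result is not truncated to an integer.
        return (double) occurrence / otherItemsFrequency;
    }

    public static void main(String[] args) {
        long occurrence = 2;          // assumed: transactions containing both X and Y
        long otherItemsFrequency = 4; // assumed: transactions containing X
        long firstFrequencyItem = 8;  // assumed: transactions containing Y

        // Dividing by support(X) gives the confidence of X => Y.
        System.out.println(confidence(occurrence, otherItemsFrequency)); // prints 0.5

        // Dividing by support(Y) gives a different value,
        // which is the confidence of the reversed rule Y => X.
        System.out.println((double) occurrence / firstFrequencyItem);    // prints 0.25
    }
}
```

With these counts the two divisions give 0.5 and 0.25, so the choice of denominator changes the reported confidence.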

But how can this otherItemsFrequency be found in the Mahout output files?
In my experience, such frequencies can be found in the frequentpatterns output, but not for all rules.
