What are the relevant data of the error logs? #4
There are two radical approaches for clustering:
Of course the solution is neither completely black nor completely white... For instance, I have come across some patterns:
and I implemented some ad-hoc regexes.
Maybe we can have predefined regexes for labelling failures, and fully automated techniques for the rest. Final remark: a failure may belong to more than one cluster; see this failure: #1 (comment)
I extracted the four pieces of information above and then I clustered using brute force.

Interesting ideas, go ahead!

I used the data frame created in issue #5 to cluster the errors and I got 32 clusters.
I'm able to cover all error logs after using k-means to discover the top terms for clustering.
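
For anyone who wants to try the k-means idea without extra dependencies, here is a minimal sketch. It is not the code used for the results above: the tokenizer, the plain term counts, the deterministic initialization (first `k` lines as centroids) and the sample log lines are all assumptions made for illustration. In practice a library implementation with TF-IDF weighting would likely be a better fit.

```python
import re
from collections import Counter


def tokenize(line):
    # Assumption: keep lowercase alphabetic tokens of 3+ characters.
    return re.findall(r"[a-z_]{3,}", line.lower())


def vectorize(lines):
    # Build a sorted vocabulary and one term-count vector per log line.
    vocab = sorted({tok for line in lines for tok in tokenize(line)})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for line in lines:
        vec = [0.0] * len(vocab)
        for tok, count in Counter(tokenize(line)).items():
            vec[index[tok]] = float(count)
        vectors.append(vec)
    return vocab, vectors


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    # Deterministic init (assumption): the first k vectors are the centroids.
    centroids = [list(v) for v in vectors[:k]]
    assign = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        assign = [
            min(range(k), key=lambda c: sq_dist(vec, centroids[c]))
            for vec in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centroids


def top_terms(vocab, centroid, n=2):
    # Highest-weight terms of a centroid; ties are broken alphabetically.
    ranked = sorted(range(len(vocab)), key=lambda i: (-centroid[i], vocab[i]))
    return [vocab[i] for i in ranked[:n]]


# Hypothetical log lines, not taken from the real data set.
logs = [
    "foo.c:10:5: error: expected ';' before 'return'",
    "bar.o: undefined reference to `baz'",
    "main.c:3:1: error: expected declaration",
    "lib.o: undefined reference to `qux'",
]
vocab, vectors = vectorize(logs)
assign, centroids = kmeans(vectors, k=2)
for c, centroid in enumerate(centroids):
    print(c, top_terms(vocab, centroid))
```

The top terms of each centroid can then be read as candidate cluster labels, or turned into regexes for a labelling pass.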
Some relevant information that should be considered for clustering:
'make.: *.+'
'. 1:.+'
'. error:.+'
'. undefined reference.+'
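
As a rough illustration, the patterns listed above could be applied as a first labelling pass, with anything left unmatched handed over to an automated clustering step. This is only a sketch: the patterns are copied verbatim from the list, while the function names and the sample log lines are hypothetical. A line may match several patterns, so it can end up in more than one cluster.

```python
import re

# Patterns copied verbatim from the list above.
PATTERNS = [
    r"make.: *.+",
    r". 1:.+",
    r". error:.+",
    r". undefined reference.+",
]
COMPILED = [(pattern, re.compile(pattern)) for pattern in PATTERNS]


def label_failure(log_line):
    """Return every pattern that matches; an empty list means 'unlabelled'."""
    return [pattern for pattern, rx in COMPILED if rx.search(log_line)]


def split_by_label(log_lines):
    """Group lines by matching pattern and collect the unmatched ones."""
    labelled, unlabelled = {}, []
    for line in log_lines:
        labels = label_failure(line)
        if not labels:
            unlabelled.append(line)  # left for the automated techniques
        for label in labels:
            labelled.setdefault(label, []).append(line)
    return labelled, unlabelled


# Hypothetical log lines, not taken from the real data set.
sample = [
    "foo.c:10:5: error: expected ';' before 'return'",
    "ld: bar.o: undefined reference to `baz'",
    "main.c:7: error: undefined reference to `foo'",
    "Build timed out after 50 minutes",
]
labelled, unlabelled = split_by_label(sample)
for line in sample:
    print(label_failure(line), "<-", line)
```

The third sample line matches two patterns, which is the "more than one cluster per failure" case mentioned in the comments.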