DNS and NTP columns are the same because the data in the DB is the same for both risks! #100

Open
aaronkaplan opened this issue Mar 20, 2017 · 4 comments

@aaronkaplan

prod=> select sum(count) from count_by_country where date > '2017/1/1' and risk=2 and country='US' limit 10;
   sum
---------
 2585757
(1 row)

prod=> select sum(count) from count_by_country where date > '2017/1/1' and risk=1 and country='US' limit 10;
   sum
---------
 2585757
(1 row)

@zelima
Contributor

zelima commented Mar 21, 2017

@aaronkaplan Small note: we no longer use the count_by_country table; we use agg_risk_country_week instead. The two are identical, though.

This is not an aggregation issue: the scanned data for the last 3 weeks is identical for DNS and NTP. The only difference between the two files is the risk ID, so the aggregation results are naturally identical as well.

To check the difference for the latest week:

aws --profile cg s3 cp  s3://private-bits-cybergreen-net/dev/clean/dns-scan/dns-scan.2017-W03.csv.gz .
aws --profile cg s3 cp  s3://private-bits-cybergreen-net/dev/clean/ntp-scan/ntp-scan.20170120.csv.gz .
gunzip ntp-scan.20170120.csv.gz
gunzip dns-scan.2017-W03.csv.gz

# strip the first three fields (timestamp, IP, ASN) for comparison
sed "s/\([^,]*,\)\{3\}//" ntp-scan.20170120.csv > ntp.stripped.csv
sed "s/\([^,]*,\)\{3\}//" dns-scan.2017-W03.csv > dns.stripped.csv
diff ntp.stripped.csv dns.stripped.csv -c | less

No output: the files are identical!
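
For repeating this check on other weeks, the same comparison can be wrapped in a small script. This is only a rough sketch: the function names strip_leading_fields and compare_scans are made up for illustration, and it assumes the first three comma-separated fields are the timestamp, IP and ASN, as the sed commands above imply. It uses cut instead of sed, which should give the same result on rows with at least four fields, and cmp -s so the outcome is a single word rather than an empty diff.

```shell
#!/bin/sh
# Sketch only: assumes the first three comma-separated fields are the
# timestamp, IP and ASN, and that no field contains a quoted comma.

# Print a file with its first three fields removed.
strip_leading_fields() {
    cut -d, -f4- "$1"
}

# Print "identical" if two scan files match after stripping, else "different".
compare_scans() {
    left=$(mktemp)
    right=$(mktemp)
    strip_leading_fields "$1" > "$left"
    strip_leading_fields "$2" > "$right"
    if cmp -s "$left" "$right"; then
        result="identical"
    else
        result="different"
    fi
    rm -f "$left" "$right"
    echo "$result"
}
```

With the two files from above already downloaded and unzipped, it could be run as: compare_scans ntp-scan.20170120.csv dns-scan.2017-W03.csv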

cc @chorsley and @kxyne.

@aaronkaplan
Author

aaronkaplan commented Mar 21, 2017 via email

@chorsley
Contributor

Acknowledged, Cosive will investigate.

@zelima zelima assigned chorsley and kxyne and unassigned rufuspollock and zelima Mar 21, 2017
@kxyne

kxyne commented Mar 22, 2017

I believe this was caused by a manual file-handling issue with the unprocessed files. I will rectify it and backprocess the last weeks' files.

