feat(security): implement URL lexical analysis and anti-bot fingerprinting #81
Open
RutujaSant wants to merge 1 commit into Hardhat-Enterprises:dev from
- feat(utils): add URL lexical entropy and path analysis engine
  Implemented Shannon entropy calculation and path depth/extension analysis for URL risk detection.
- feat(services): integrate URL analysis into detection service
  Added analyzeMessageUrls to consolidate risk scores from lexical URL features.
- feat(middleware): add infrastructure-level client fingerprinting
  Implemented header-based fingerprinting and sliding-window malicious scan tracking for anti-bot defense.
- feat(scan): enhance scan API with URL analysis and anti-bot protection
  Integrated URL risk reporting and fingerprint tracking/shadow-banning into the scan controller.
Scope & Summary
This PR introduces two critical security enhancements to the
smishing detection pipeline: a lexical analysis engine for
URLs and an infrastructure-level fingerprinting system for bot
defense.
URL Lexical Entropy & Path Analysis:
identify algorithmically generated strings (DGA).
and flags "unnatural" extensions (e.g., .php, .exe).
Top-Level Domains (e.g., .top, .link, .xyz).
riskScore returned in the /api/scan response.
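The lexical checks listed above could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the names `shannonEntropy` and `analyzeUrl`, the entropy threshold, the depth limit, and the score weights are all assumptions chosen for the example, and entropy is computed over hostname plus path, which may differ from what the PR measures.

```typescript
// Illustrative lists taken from the examples in the description above.
const SUSPICIOUS_TLDS = new Set(["top", "link", "xyz"]);
const SUSPICIOUS_EXTENSIONS = new Set(["php", "exe"]);

// Assumed thresholds (hypothetical, not taken from the PR).
const ENTROPY_THRESHOLD = 3.5; // bits per character
const MAX_NORMAL_PATH_DEPTH = 3;

/** Shannon entropy in bits per character: H = -sum(p_i * log2(p_i)). */
function shannonEntropy(text: string): number {
  if (text.length === 0) return 0;
  const counts = new Map<string, number>();
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    counts.set(ch, (counts.get(ch) ?? 0) + 1);
  }
  let entropy = 0;
  counts.forEach((count) => {
    const p = count / text.length;
    entropy -= p * Math.log2(p);
  });
  return entropy;
}

interface UrlAnalysis {
  entropy: number;
  pathDepth: number;
  extension: string | null;
  tld: string;
  riskScore: number; // 0 (benign) to 1 (most suspicious)
}

/** Extract lexical features from a URL and combine them into a risk score. */
function analyzeUrl(rawUrl: string): UrlAnalysis {
  // SMS links often omit the scheme, so add one before parsing.
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(rawUrl);
  const url = new URL(hasScheme ? rawUrl : `http://${rawUrl}`);

  const segments = url.pathname.split("/").filter((s) => s.length > 0);
  const lastSegment = segments[segments.length - 1] ?? "";
  const dot = lastSegment.lastIndexOf(".");
  const extension = dot > 0 ? lastSegment.slice(dot + 1).toLowerCase() : null;
  const tld = url.hostname.split(".").pop() ?? "";

  // Assumption: entropy is measured over hostname + path.
  const entropy = shannonEntropy(url.hostname + url.pathname);

  // Additive weights are illustrative only.
  let riskScore = 0;
  if (entropy > ENTROPY_THRESHOLD) riskScore += 0.4;
  if (segments.length > MAX_NORMAL_PATH_DEPTH) riskScore += 0.2;
  if (extension !== null && SUSPICIOUS_EXTENSIONS.has(extension)) riskScore += 0.2;
  if (SUSPICIOUS_TLDS.has(tld)) riskScore += 0.2;

  return {
    entropy,
    pathDepth: segments.length,
    extension,
    tld,
    riskScore: Math.min(1, riskScore),
  };
}
```

Under these assumptions, `analyzeUrl("bit.ly/x72K9aLpq1")` exceeds the entropy threshold because every character in the host and path is distinct. Note that the entropy of a short string is bounded by the log of its length, so a real threshold would need tuning against benign URLs.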
Anti-Bot Fingerprinting:
stable HTTP headers (User-Agent, Accept-Language,
Sec-Ch-Ua, Accept-Encoding).
(smishing detections) per fingerprint within a
15-minute window.
Requests status once a client exceeds 10 malicious
scans, preventing attackers from probing the API.
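The fingerprinting and sliding-window behaviour described above might look something like the sketch below. It is a minimal illustration under stated assumptions, not the middleware in this PR: `fingerprintOf` and `MaliciousScanTracker` are hypothetical names, the store is in-memory, and a simple FNV-1a hash stands in for whatever hash the PR uses (a production version would more likely use a cryptographic hash such as SHA-256).

```typescript
// Header names are lowercase, as Node.js delivers them on incoming requests.
type HeaderMap = Record<string, string | undefined>;

const FINGERPRINT_HEADERS = ["user-agent", "accept-language", "sec-ch-ua", "accept-encoding"];
const WINDOW_MS = 15 * 60 * 1000; // 15-minute window from the description
const MAX_MALICIOUS_SCANS = 10; // limit from the description

/** FNV-1a (32-bit) over UTF-16 code units; a non-cryptographic stand-in hash. */
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, "0");
}

/** Derive a stable client identifier from the selected request headers. */
function fingerprintOf(headers: HeaderMap): string {
  const parts = FINGERPRINT_HEADERS.map((name) => headers[name] ?? "");
  return fnv1a(parts.join("|"));
}

/** Tracks malicious scans per fingerprint inside a sliding time window. */
class MaliciousScanTracker {
  private hits = new Map<string, number[]>();
  private windowMs: number;
  private limit: number;

  constructor(windowMs: number = WINDOW_MS, limit: number = MAX_MALICIOUS_SCANS) {
    this.windowMs = windowMs;
    this.limit = limit;
  }

  /** Record one malicious detection at time `now` (milliseconds). */
  recordMalicious(fingerprint: string, now: number = Date.now()): void {
    const recent = this.recentHits(fingerprint, now);
    recent.push(now);
    this.hits.set(fingerprint, recent);
  }

  /** True once the client has exceeded `limit` malicious scans in the window. */
  isBlocked(fingerprint: string, now: number = Date.now()): boolean {
    return this.recentHits(fingerprint, now).length > this.limit;
  }

  /** Timestamps for this fingerprint that still fall inside the window. */
  private recentHits(fingerprint: string, now: number): number[] {
    const cutoff = now - this.windowMs;
    return (this.hits.get(fingerprint) ?? []).filter((t) => t > cutoff);
  }
}
```

In a scan controller, a handler would call `recordMalicious` whenever a message is classified as smishing and check `isBlocked` before processing a request. Header-based fingerprints are easy for a determined client to vary, so this slows down casual probing rather than preventing it outright.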
Deliverables
identification and rate-limiting logic.
analysis.
response and tracking.
protection.
Testing & Verification
bit.ly/x72K9aLpq1 (High Entropy) is flagged.
increase the riskScore.
consistent fingerprint.
successfully after 10 simulated malicious scans from the
same fingerprint.