Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 22 additions & 0 deletions security_300.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
# Security Enhancement for Issue #300
import re
from typing import Tuple

def validate_email(email: str) -> Tuple[bool, str]:
if not email or '@' not in email:
return False, "Invalid email"
return True, "OK"
Comment on lines +5 to +8
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Strengthen validate_email to avoid false positives.

Current validation accepts malformed values (for example, a@), which weakens the stated input-validation goal.

🔧 Proposed hardening
+EMAIL_RE = re.compile(
+    r"^[A-Za-z0-9.!#$%&'*+/=?^_`{|}~-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+$"
+)
+
 def validate_email(email: str) -> Tuple[bool, str]:
-    if not email or '@' not in email:
+    if not email or not EMAIL_RE.fullmatch(email.strip()):
         return False, "Invalid email"
     return True, "OK"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@security_300.py` around lines 5 - 8, The validate_email function currently
returns True for inputs like "a@" because it only checks for presence of '@';
update validate_email to verify both local and domain parts are non-empty,
contain no whitespace, and that the domain includes at least one dot and valid
characters (or use a standard email parser/regex). Specifically, in
validate_email(email: str) split on '@' (ensure exactly one '@'), check
local_part and domain_part are non-empty and do not contain spaces, and ensure
domain_part contains a dot and valid labels (no leading/trailing hyphens);
return False with "Invalid email" for failures and True,"OK" only when all
checks pass.


def sanitize_input(input_str: str, max_len: int = 1000) -> str:
if not input_str:
return ""
if len(input_str) > max_len:
input_str = input_str[:max_len]
Comment on lines +10 to +14
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, locate and read the security_300.py file
find . -type f -name "security_300.py" 2>/dev/null

Repository: rogermt/forgesyte

Length of output: 78


🏁 Script executed:

# If found, read the entire file to see the context
if [ -f "security_300.py" ]; then
  cat -n "security_300.py"
elif find . -type f -name "*security*.py" | head -5; then
  echo "Searching for files with 'security' in the name..."
  find . -type f -name "*security*.py" -exec wc -l {} \;
fi

Repository: rogermt/forgesyte

Length of output: 888


🏁 Script executed:

# Search for the sanitize_input function in the codebase
rg "def sanitize_input" -A 15 -B 2

Repository: rogermt/forgesyte

Length of output: 772


🏁 Script executed:

# Search for usages of sanitize_input to see how it's called
rg "sanitize_input\s*\(" --type py -B 2 -A 2

Repository: rogermt/forgesyte

Length of output: 267


Add validation to reject non-positive max_len values.

Negative or zero max_len produces unintended slicing behaviour: max_len <= 0 results in either an empty string or unpredictable truncation (e.g., input_str[:-1] removes the last character). This violates the function's contract as a length limiter. Additionally, add a docstring documenting the valid parameter range and function purpose.

🛡️ Proposed guard
 def sanitize_input(input_str: str, max_len: int = 1000) -> str:
+    if max_len <= 0:
+        raise ValueError("max_len must be a positive integer")
     if not input_str:
         return ""
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def sanitize_input(input_str: str, max_len: int = 1000) -> str:
if not input_str:
return ""
if len(input_str) > max_len:
input_str = input_str[:max_len]
def sanitize_input(input_str: str, max_len: int = 1000) -> str:
if max_len <= 0:
raise ValueError("max_len must be a positive integer")
if not input_str:
return ""
if len(input_str) > max_len:
input_str = input_str[:max_len]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@security_300.py` around lines 10 - 14, The sanitize_input function currently
allows non-positive max_len which causes unintended slicing; add input
validation in sanitize_input to raise a ValueError if max_len <= 0, document the
function with a docstring that states its purpose (truncate input_str to at most
max_len characters), the valid parameter range (max_len must be a positive
integer) and return behavior (returns empty string for falsy input_str), and
ensure the existing truncation logic uses input_str[:max_len] only after
validation; reference the sanitize_input function and the max_len parameter when
making these changes.

patterns = [r"<script.*?>.*?</script>", r"javascript:"]
for p in patterns:
input_str = re.sub(p, "", input_str, flags=re.IGNORECASE)
return input_str.strip()
Comment on lines +15 to +18
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

sanitize_input can be bypassed by multiline <script> payloads.

The current regex does not handle newline-spanning script tags, so malicious content can survive sanitisation.

🛡️ Proposed fix
-    patterns = [r"<script.*?>.*?</script>", r"javascript:"]
-    for p in patterns:
-        input_str = re.sub(p, "", input_str, flags=re.IGNORECASE)
+    patterns = [r"<script\b[^>]*>.*?</script>", r"javascript\s*:"]
+    for pattern in patterns:
+        input_str = re.sub(
+            pattern, "", input_str, flags=re.IGNORECASE | re.DOTALL
+        )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@security_300.py` around lines 15 - 18, sanitize_input's
"<script.*?>.*?</script>" replacement is vulnerable because the regex doesn't
match newlines; update the function to compile the script pattern with
re.IGNORECASE|re.DOTALL (or use the DOTALL flag in re.sub) so multiline script
blocks are removed, and apply the compiled pattern (e.g., the symbol/name
sanitize_input and the pattern variable/patterns list) when calling re.sub; keep
the "javascript:" pattern but ensure it's applied case-insensitively as well by
using the same flags or compiling it with re.IGNORECASE.


# Tests
assert validate_email("[email protected]")[0] == True
print("Security tests passed!")
Comment on lines +20 to +22
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Remove import-time test execution and replace print logging.

Running assertions and printing at import time causes side effects in production code and mixes test behaviour into the module.

✅ Minimal fix
-# Tests
-assert validate_email("[email protected]")[0] == True
-print("Security tests passed!")

As per coding guidelines, "Use logging module instead of print() for logging in Python".

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Tests
assert validate_email("[email protected]")[0] == True
print("Security tests passed!")
🧰 Tools
🪛 Ruff (0.15.5)

[error] 21-21: Avoid equality comparisons to True; use validate_email("[email protected]")[0]: for truth checks

Replace with validate_email("[email protected]")[0]

(E712)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@security_300.py` around lines 20 - 22, Remove the import-time assertion and
print call so the module has no side-effects: move the test assertion using
validate_email("[email protected]") into a proper test harness or wrap it under
an if __name__ == "__main__": block; replace the print("Security tests passed!")
with a logging.info(...) call using the logging module (configure logger at
module level if needed) so runtime logging follows guidelines.