Skip to content

Conversation

@tamilarasan-n-dev
Copy link

Fix MSSQL JSON Parsing Issues in UNION Injection Technique

Problem Description

When using UNION-based SQL injection against Microsoft SQL Server with JSON aggregation mode (FOR JSON AUTO), sqlmap encountered multiple critical issues that prevented successful data extraction:

Issue 1: Truncated JSON Responses

Large JSON responses were being truncated due to HTTP response size limits (MAX_CONNECTION_TOTAL_SIZE), resulting in incomplete JSON that could not be parsed. This caused errors like:

JSON decode error: Extra data: line 1 column 635578 (char 635577)
JSON decode error: Expecting ',' delimiter: line 1 column 114121 (char 114120)

Issue 2: HTML Entity Encoding

JSON responses containing HTML entities (e.g., " instead of ") were failing to parse, as the entities were not being decoded before JSON parsing.

Issue 3: Inefficient Field Extraction

The original implementation used regex to extract field names from only the first JSON object, which was:

  • Fragile and prone to errors with nested objects
  • Limited to extracting from incomplete JSON fragments
  • Unable to handle varying field orders across rows

Solution

This PR implements a robust solution with the following improvements:

1. Automatic JSON Repair for Truncated Responses

When a JSONDecodeError occurs due to truncation:

  • Detects if the response is a JSON array
  • Finds the last complete object before the truncation point
  • Properly closes the JSON array with ]
  • Successfully parses all complete objects
  • Logs a warning message indicating partial data recovery
if output_decoded.strip().startswith('['):
    last_complete = output_decoded.rfind('},', 0, e.pos)
    if last_complete > 0:
        repaired = output_decoded[:last_complete+1] + ']'
        json_data = json.loads(repaired)
        logger.warning("parsed %d rows from truncated JSON response" % len(json_data))

2. HTML Entity Decoding

Added automatic HTML entity decoding using Python's html.unescape() before JSON parsing:

import html
output_decoded = html.unescape(output)

3. Improved Field Extraction

Replaced regex-based field extraction with direct dictionary key access:

# Old approach (fragile):
fields = re.findall(r'"([^"]+)":', extractRegexResult(r"{(?P<result>[^}]+)}", output))

# New approach (robust):
if json_data and isinstance(json_data[0], dict):
    fields = list(json_data[0].keys())

4. Better Error Handling

  • Handles both single objects and arrays uniformly
  • Uses .get() method for safer field access
  • Gracefully degrades when fields are missing in some rows

Testing

Tested against Microsoft SQL Server with:

  • Large result sets (600KB+ JSON responses)
  • Tables with 20+ columns
  • Data containing special characters and HTML entities
  • Both complete and truncated responses

Before Fix

[ERROR] JSON decode error: Extra data: line 1 column 635578
[WARNING] something went wrong with full UNION technique

After Fix

[WARNING] parsed 450 rows from truncated JSON response
[INFO] retrieved: [data successfully extracted]

Impact

  • ✅ Enables successful data extraction from MSSQL databases with large result sets
  • ✅ Improves reliability of UNION-based injection for MSSQL
  • ✅ Handles edge cases with HTML-encoded responses
  • ✅ Provides better user feedback with warning messages
  • ✅ No breaking changes to existing functionality

Files Changed

  • lib/techniques/union/use.py - Enhanced MSSQL JSON parsing in _oneShotUnionUse() function

Backward Compatibility

This change is fully backward compatible. The improvements only affect the MSSQL JSON aggregation code path and gracefully fall back to the original behavior if the repair attempt fails.

@stamparm
Copy link
Member

stamparm commented Dec 22, 2025

there is no "partial retrieval" in sqlmap workflow. you can't just push "partial data" to the user

@stamparm
Copy link
Member

also, there is no "import html" in standard python2, and sqlmap is python2 compatbile

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants