Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved table encoding code to handle sparse arrays. #28

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
90 changes: 51 additions & 39 deletions json.lua
Original file line number Diff line number Diff line change
Expand Up @@ -57,44 +57,56 @@ end


local function encode_table(val, stack)
local res = {}
stack = stack or {}

-- Circular reference?
if stack[val] then error("circular reference") end

stack[val] = true

if rawget(val, 1) ~= nil or next(val) == nil then
-- Treat as array -- check keys are valid and it is not sparse
local n = 0
for k in pairs(val) do
if type(k) ~= "number" then
error("invalid table: mixed or invalid key types")
end
n = n + 1
end
if n ~= #val then
error("invalid table: sparse array")
end
-- Encode
for i, v in ipairs(val) do
table.insert(res, encode(v, stack))
end
stack[val] = nil
return "[" .. table.concat(res, ",") .. "]"

else
-- Treat as an object
for k, v in pairs(val) do
if type(k) ~= "string" then
error("invalid table: mixed or invalid key types")
end
table.insert(res, encode(k, stack) .. ":" .. encode(v, stack))
end
stack[val] = nil
return "{" .. table.concat(res, ",") .. "}"
end
local res = {}
stack = stack or {}

-- Circular reference?
if stack[val] then error("circular reference") end

stack[val] = true
-- Check whether to treat as a array or object
local array = true
local length = 0
local nLen = 0

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered the worst-case this might create: What if I serialize {[1e9] = true}?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed all these. Revisiting after a long time. Thanks for catching that. Had not considered it at all. Been using it, did not have issues. Fixed it in the next commit by preventing sparse arrays greater than 100 (arbitrarility) length getting through. Seems like other json modules also handle sparse arrays in a undefined way. 100 should be good for many cases I suppose.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's a tricky issue, and behavior varies wildly. I believe ideally it should be left up to the user. Perhaps use something like a "density" factor as a threshold?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a variable called SPARSELIMIT in the module that the user can change after requiring the module to change it. Default set it to 100.

local count = 0
for k,v in pairs(val) do
if (type(k) ~= "number" or k<=0) and not (k == "n" and type(v) == "number") then

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about fractional keys, like 4.2? I think you also need a check like k % 1 == 0 or math.floor(k) == k to check that k is (not) an integer.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, added k%1 ~= 0 in the condition.

array = nil
break -- Treat as object
else
if k > length then
length = k
end
if k == "n" and type(v) == "number" then
nLen = v
end
count = count + 1
end
end

if nLen > length then
length = nLen
end
if array and not (length > 100 and count ~= length) then -- Check Array detected but sparse > 100 length then treat as object
-- Encode
for i=1,length do
table.insert(res, encode(val[i], stack))
end
stack[val] = nil
return "[" .. table.concat(res, ",") .. "]"
else
-- Treat as an object
for k, v in pairs(val) do
--[[
if type(k) ~= "string" then
error("invalid table: mixed or invalid key types")
end
]]
table.insert(res, encode(k, stack) .. ":" .. encode(v, stack))
end
stack[val] = nil
return "{" .. table.concat(res, ",") .. "}"
end
end


Expand All @@ -108,7 +120,7 @@ local function encode_number(val)
if val ~= val or val <= -math.huge or val >= math.huge then
error("unexpected number value '" .. tostring(val) .. "'")
end
return string.format("%.14g", val)
return tostring(val)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unreliable.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't understand. Perhaps a more useful/productive comment? Thanks.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, that was indeed too short to be useful. The documentation of tostring says:

Receives an argument of any type and converts it to a string in a reasonable format. For complete control of how numbers are converted, use string.format.

It doesn't guarantee anything. It could output the number with arbitrarily poor precision, or it could use an arbitrary format (which need not be valid JSON). (In practice, it does indeed behave quite reasonable, but you shouldn't rely on this; it should also be noted that it is not "lossless" in practice, it rounds (albeit neither is the current method - 14 digits aren't enough).)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for explaining it. I see. I remember I had changed it to make a certain case work properly when I was using it in the oauth2 module. Unfortunately I did not document why I changed it. I think I switched it back in the repository for now.

end


Expand Down