Enhancement: use CFStringGetCharacters over iteration and CFStringGetCharacterAtIndex #5

Open
jakewilliami opened this issue Aug 29, 2022 · 5 comments
Labels
enhancement New feature or request macOS

Comments

@jakewilliami
Owner

jakewilliami commented Aug 29, 2022

Currently we iterate over the CFString and call CFStringGetCharacterAtIndex once per character to construct the String. This costs one foreign call per character, so we should use CFStringGetCharacters to copy the characters in a single call instead.

Some example code which I haven't quite got working yet:

mutable struct CFRange
    length::Int
    location::Int
end
function _cfstring_get_characters(cfstr::Cstring, range::CFRange)
    charbuf = Vector{UInt8}(undef, range.length)
    ccall(:CFStringGetCharacters, Ptr{UInt32}, (Cstring, Ptr{Cvoid}, Ptr{Cvoid}), cfstr, pointer_from_objref(range), charbuf)
    return charbuf
end
function _string_from_cfstring(cfstr::Cstring, encoding::Unsigned = K_CFSTRING_ENCODING_MACROMAN)
    strlen = _cfstring_get_length(cfstr)
    maxsz = _cfstring_get_maximum_size_for_encoding(strlen, encoding)
    return _cfstring_get_characters(cfstr, CFRange(maxsz, 0))
end

Then we will simply be able to write

function _string_from_cf_string(cfstr::Cstring)
    charbuf = _string_from_cfstring(cfstr)
    return String(charbuf)
end
@jakewilliami
Owner Author

NOTE: when I work on this optimisation, it would be nice to benchmark the results and see how much more efficient it is.

jakewilliami added a commit that referenced this issue Aug 29, 2022
Created issue #5 for this optimisation in the future.  For the moment,
it is working fine.
@jakewilliami jakewilliami changed the title from "Enhancement: use CFStringGetCharacters over iteration and CFStringGetCharacterAtIndex for macOS" to "Enhancement: use CFStringGetCharacters over iteration and CFStringGetCharacterAtIndex" Aug 29, 2022
@jakewilliami jakewilliami added macOS enhancement New feature or request labels Aug 29, 2022
@jakewilliami
Owner Author

From some initial tests, you can see how much overhead the current approach might add:

julia> using HiddenFiles, BenchmarkTools

julia> ishidden("/System/Applications/Utilities/Terminal.app/Contents/MacOS/Terminal")
true

julia> @btime ishidden("/System/Applications/Utilities/Terminal.app/Contents/MacOS/Terminal");
  2.010 ms (114 allocations: 7.26 KiB)

julia> @btime ishidden("/System/Applications/Utilities/Terminal.app/Contents/");
  807.882 μs (62 allocations: 4.38 KiB)

@jakewilliami
Owner Author

Might be able to use CFRangeMake

@jakewilliami
Owner Author

Still not working

julia> using HiddenFiles

julia> mutable struct CFRange
           length::Int
           location::Int
       end

julia> cfstr = HiddenFiles._cfstring_create_with_cstring("Project.toml")
Cstring(0x000060000031c6a0)

julia> r = CFRange(10, 1)
CFRange(10, 1)

julia> charbuf = Vector{UInt16}(undef, 12);

julia> ccall(:CFStringGetCharacters, Cstring, (Cstring, Ptr{CFRange}, Ptr{Cvoid}), cfstr, pointer_from_objref(r), charbuf)

signal (11): Segmentation fault: 11
in expression starting at REPL[6]:1
__CFStrConvertBytesToUnicode at /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (unknown line

@jakewilliami
Owner Author

Tried with Ref but still not working

julia> using HiddenFiles

julia> mutable struct CFRange
                  length::Int
                  location::Int
              end

julia> cfstr = HiddenFiles._cfstring_create_with_cstring("Project.toml")
Cstring(0x0000600000d2cae0)

julia> r = CFRange(10, 1)
CFRange(10, 1)

julia> charbuf = Vector{UInt16}(undef, 12);

julia> ccall(:CFStringGetCharacters, Cstring, (Cstring, Ref{CFRange}, Ptr{Cvoid}), cfstr, Ref{CFRange}(r), charbuf)
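
For reference, here is a minimal sketch of how the call might be made to work. It rests on three points from the C declaration as I understand it: `CFStringGetCharacters` returns `void`, it takes the `CFRange` by value rather than by pointer, and the `CFRange` fields are ordered `location` then `length`. The attempts above use a mutable struct passed through a pointer or `Ref`, with the fields in the opposite order, which is likely why they crash. The helper name `cfstring_to_string`, the constants `CF` and `K_UTF8`, the framework path, and the use of `Ptr{Cvoid}` for the CFString handle (HiddenFiles uses `Cstring`) are all assumptions made for illustration, and none of this has been tested against HiddenFiles itself.

```julia
# Sketch only: the CoreFoundation calls are macOS-specific.
# Assumed framework path; adjust if your setup resolves it differently.
const CF = "/System/Library/Frameworks/CoreFoundation.framework/CoreFoundation"
const K_UTF8 = UInt32(0x08000100)  # kCFStringEncodingUTF8

# Immutable isbits struct, so ccall can pass it by value.
# Field order follows the C struct: location first, then length.
struct CFRange
    location::Int  # CFIndex
    length::Int    # CFIndex
end

# Hypothetical helper: copy all UTF-16 code units in one call, then
# convert them to a Julia (UTF-8) String.
function cfstring_to_string(cfstr::Ptr{Cvoid})
    len = ccall((:CFStringGetLength, CF), Int, (Ptr{Cvoid},), cfstr)
    buf = Vector{UInt16}(undef, len)  # UniChar is a 16-bit unit
    ccall((:CFStringGetCharacters, CF), Cvoid,
          (Ptr{Cvoid}, CFRange, Ptr{UInt16}),
          cfstr, CFRange(0, len), buf)
    return transcode(String, buf)
end

if Sys.isapple()
    cfstr = ccall((:CFStringCreateWithCString, CF), Ptr{Cvoid},
                  (Ptr{Cvoid}, Cstring, UInt32), C_NULL, "Project.toml", K_UTF8)
    println(cfstring_to_string(cfstr))
    ccall((:CFRelease, CF), Cvoid, (Ptr{Cvoid},), cfstr)  # release what we created
end
```

The buffer is sized from `CFStringGetLength`, which counts UTF-16 code units, so no maximum-size-for-encoding lookup should be needed for this call.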

jakewilliami added a commit that referenced this issue Sep 23, 2022
I changed it to UInt16 to account for certain characters in the
previous commit, but I am unsure whether characters that need more
than 8 bits are allowed in paths.  I should think about this further;
perhaps it is something to consider while I am working on #5
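
On the question of characters wider than 8 bits: file names on macOS can contain non-ASCII characters, so a 16-bit buffer seems the safer choice. The sketch below uses only Base Julia and a few made-up file names to show how the UTF-8 byte count and the UTF-16 code unit count diverge, and that `transcode` round-trips them. The helper name `utf16_units` is hypothetical.

```julia
# Hypothetical helper: the UTF-16 code units a UniChar buffer would hold.
utf16_units(name::AbstractString) = transcode(UInt16, name)

# Made-up file names: plain ASCII, characters above 0xff, and an emoji
# that needs a surrogate pair (two UTF-16 code units).
for name in ["Project.toml", "łódź.txt", "📁.txt"]
    units = utf16_units(name)
    println(name, ": ",
            ncodeunits(name), " UTF-8 bytes, ",
            length(units), " UTF-16 units, ",
            "any unit above 0xff: ", any(>(0xff), units), ", ",
            "round trip ok: ", transcode(String, units) == name)
end
```

A `Vector{UInt8}` would truncate any code unit above 0xff, whereas converting the `UInt16` buffer with `transcode(String, buf)` keeps such names intact.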