title: Ruby 3.3 changes
prev: 3.4
next: 3.2
description: Ruby 3.3 full and annotated changelog

Ruby 3.3

  • Released at: Dec 25, 2023 (NEWS.md file)
  • Status (as of Jan 05, 2025): 3.3.6 is current stable
  • This document first published: Dec 25, 2023
  • Last change to this document: Jan 05, 2025

🇺🇦 🇺🇦 Before you start reading the changelog: A full-scale Russian invasion into my home country continues, for the second year in a row. The only reason I am alive and able to work on the changelog is the Armed Forces of Ukraine, and international support with weaponry, funds and information. I got into the Army in March, and spent the summer on the frontlines. Now I have moved to another army position which frees some time to work for the Ruby community. You can read my recent text (that was supposed to be a RubyConf talk) as an appeal to the community. Please spread information, lobby our cause and donate.🇺🇦 🇺🇦

Note: As already explained in the Introduction, this site is dedicated to changes in the language, not the implementation, so the list below omits many important internal changes related to performance optimizations, the parser, and JIT that happened in 3.3 (which is, on the other hand, somewhat lighter on "small quality of life improvement" changes). Those changes are left out not because they are unimportant, but because this site's goals are different. See the official release notes that cover those significant internal changes.

Highlights

Language changes

Standalone it in blocks will become an anonymous argument in Ruby 3.4

In Ruby 3.3, using it this way only produces a warning, to prepare for the change.

  • Reason: Numeric designations for anonymous block arguments (_1, _2, and so on) were considered ugly by many people, so after years of discussion, the it keyword is to be introduced in the next Ruby version; for now, Ruby just warns in places where it would be considered an anonymous block argument.
  • Discussion: Feature #18980
  • Code: In the code below, where Ruby 3.3 currently produces a warning, Ruby 3.4 would treat it as an anonymous block argument; where Ruby 3.3 doesn't produce a warning, Ruby 3.4 would treat it as a local variable name or a method call (and would look for such names available in the scope).
    # The cases that are warned:
    # -------------------------
    # warning: `it` calls without arguments will refer to the first block param in Ruby 3.4; use it() or self.it
    
    (1..3).map { it }      # inside a block without explicit parameters
    (1..3).map { it; _1 }  # ...even if numbered parameters are used, too
    def it; end
    (1..3).map { it }      # even if a method with name `it` exists in the scope
    
    # The cases that are not warned:
    # -----------------------------
    
    it                        # not inside a block
    (1..3).map { |x| it }     # inside a block with named parameters
    (1..3).map { || it }      # ...even if they are empty
    (1..3).map { it() }       # with parentheses
    (1..3).map { it {} }      # with a block attached
    (1..3).map { it = 5; it } # if a local variable with the same name is created in the block
    it = 5
    (1..3).map { it }         # if a local variable with the same name is in the scope
  • Notes: The new feature isn't expected to conflict with RSpec's it, as calling that without any block attached, or at least a description for the future example, is useless.
  • Follow-up: 3.4: it was successfully introduced.

Anonymous parameters forwarding inside blocks is disallowed

Forwarding anonymous parameters inside a block that also declares anonymous parameters now raises a SyntaxError.

  • Reason: Blocks didn't support anonymous parameters forwarding, yet they supported anonymous parameters declaration, which was confusing: code that looked like a block forwarding its own parameters actually forwarded the parameters of the method containing the block.
  • Discussion: Feature #19370
  • Code:
    def m(*)
      # ..some other code using anonymous params...
    
      [1, 2, 3].each { |*| p(*) }
    end
    m('test')
    
    # Ruby 3.2:
    #  The block above looks like it would forward its arguments to p
    #  (so it would print 1, 2, 3); but actually anonymous params of the _method_
    #  are forwarded, so it actually prints:
    #   "test"
    #   "test"
    #   "test"
    
    # Ruby 3.3:
    #   anonymous rest parameter is also used within block (SyntaxError)
    # (raised during parsing the file)
    
    # No error is raised if there's no perceived conflict of anonymous
    # params:
    def m(*)
      # ..some other code using anonymous params...
    
      [1, 2, 3].each { |i| p(*) } # no question what `*` refers to
    end
    m('test')
    # Ruby 3.3:
    #   "test"
    #   "test"
    #   "test"
  • Notes:
    • There is a question whether disallowing block parameters forwarding is the best way to solve the confusion; an alternative solution would be to support forwarding inside the block properly. I hope the discussion will continue during 3.4 development.
    • In the 3.3.0 release, the prohibition was accidentally too greedy, affecting lambdas with unambiguous forwarding, see Bug #20090:
    def b(*)
      -> { c(*) } # Unambiguous, yet raises:
                  #  anonymous rest parameter is also used within block (SyntaxError)
    end
    This is already fixed to be released in the next minor version.
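One way to sidestep the ambiguity altogether is to name the method's parameters whenever they have to be forwarded from inside a block. The sketch below is only an illustration: `format_line` and `tag_each` are hypothetical names, not part of Ruby or of the discussed ticket.

```ruby
# Hypothetical helper that receives the forwarded arguments
def format_line(item, *labels)
  [*labels, item].join(":")
end

# Naming the rest parameter (`*labels`) leaves no doubt about what is
# forwarded from inside the block, whatever parameters the block declares
def tag_each(items, *labels)
  items.map { |item, *| format_line(item, *labels) }
end

tagged = tag_each([1, 2, 3], "test", "v1")
# tagged is ["test:v1:1", "test:v1:2", "test:v1:3"]
```

Here the block declares its own anonymous rest parameter, but nothing anonymous is forwarded, so there is no conflict to report.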

Core classes and modules

Kernel#lambda raises when passed Proc instance

  • Reason: lambda's goal is to create a lambda from the provided literal block; in Ruby, it is impossible to change the "lambdiness" of a block once it is created. But lambda(&proc_instance) never notified users of that, which was confusing.
  • Discussion: Feature #19777
  • Documentation: Kernel#lambda (no specific details are provided, though)
  • Code:
    # Intended usage:
    l = lambda { |a, b| a + b }
    l.lambda? #=> true
    l.parameters #=> [[:req, :a], [:req, :b]]
    
    # Unintended usage:
    p = proc { |a, b| a + b }
    
    # In Ruby 3.2 and below, it worked, but the produced value wasn't lambda:
    l = lambda(&p)
    l.parameters #=> [[:opt, :a], [:opt, :b]]
    l.lambda? #=> false
    l.object_id == p.object_id #=> true, it is just the same proc
    
    # Ruby 3.3:
    l = lambda(&p)
    # in `lambda': the lambda method requires a literal block (ArgumentError)
    
    # Despite the message about a "literal block," the method
    # works (though has no meaningful effect) with lambda-like Proc objects
    other_lambda = lambda { |a, b| a + b }
    lambda(&other_lambda) #=> works
    lambda(&:to_s) #=> works
    lambda(&method(:puts)) #=> works
  • Notes: The discussion once started from the proposal to make lambda change the "lambdiness" of a passed block, but that raises multiple issues (changing the block semantics mid-program is just one of them). In general, lambda as a method is considered legacy, inferior to the -> { } lambda literal syntax, exactly due to problems like this: it looks like a regular method that receives a block, and therefore should be able to accept any block, but in fact it is a "special" method. So in 3.0, there was a warning about lambda(&proc_instance), and since 3.3, the warning finally turned into an error.
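When code receives a block of unknown origin and wants lambda-like strictness, wrapping it in a lambda literal is one possible replacement for `lambda(&block)`. The following is a minimal sketch under assumptions: `strict` is a hypothetical helper, and it only checks the argument count for blocks with a fixed (non-negative) arity.

```ruby
# Hypothetical helper: returns a callable with lambda semantics
def strict(&block)
  return block if block.lambda? # already a lambda, nothing to do

  arity = block.arity
  ->(*args) do
    # Mimic the lambda's strict argument check for fixed-arity blocks
    if arity >= 0 && args.size != arity
      raise ArgumentError, "wrong number of arguments (given #{args.size}, expected #{arity})"
    end
    block.call(*args)
  end
end

add = strict { |a, b| a + b }
add.lambda?    # true: the wrapper is a lambda literal
add.call(1, 2) # 3
# add.call(1) raises ArgumentError instead of silently passing nil as `b`
```

The original block is left untouched; only the wrapper has lambda semantics, which matches the point above that the "lambdiness" of an existing block can't be changed.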

Proc#dup and #clone call #initialize_dup and #initialize_copy

  • Reason: A fix for a small inconsistency created in 3.2: since that version, #dup and #clone on an instance of a class inherited from Proc rightfully produce an instance of the inherited class. But although the docs for Object's #dup and #clone methods claim that the corresponding copying constructors are called on object cloning/duplication, that was not true for Proc.
  • Discussion: Feature #19362
  • Documentation: — (Adheres to the behavior described for Object#dup and #clone)
  • Code:
    # The examples would work the same way with
    # #dup/#initialize_dup and #clone/#initialize_copy
    
    class TaggedProc < Proc
      attr_reader :tag
    
      def initialize(tag)
        super()
        @tag = tag
      end
    
      def initialize_dup(other)
        @tag = other.tag
        super
      end
    end
    
    proc = TaggedProc.new('admin') { }
    
    proc.tag #=> 'admin'
    proc.dup.tag
    # Ruby 3.1:
    #   undefined method `tag' for #<Proc:0x0...> -- #dup didn't preserve the class
    # Ruby 3.2:
    #   => nil -- the class is preserved, yet the duplication didn't go through #initialize_dup
    # Ruby 3.3:
    #   => "admin"
  • Notes: Inheriting from core classes is an advanced technique, and most of the time there are simpler ways to achieve the same goals (like wrapper objects containing a Proc and some additional info).
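The wrapper-object alternative mentioned in the notes could look roughly like the sketch below. `TaggedCallable` is a hypothetical class invented for the illustration; it behaves the same way on Ruby versions before and after this change, because copying a plain object copies its instance variables.

```ruby
# Hypothetical wrapper: keeps a Proc and its tag side by side
# instead of inheriting from Proc
class TaggedCallable
  attr_reader :tag

  def initialize(tag, &block)
    @tag = tag
    @block = block
  end

  def call(*args)
    @block.call(*args)
  end

  # Allows passing the wrapper wherever a block is expected: map(&wrapper)
  def to_proc
    @block
  end
end

handler = TaggedCallable.new('admin') { |name| "Hello, #{name}" }
handler.dup.tag         # "admin" -- plain objects copy their instance variables
handler.call("Alice")   # "Hello, Alice"
%w[Bob].map(&handler)   # ["Hello, Bob"]
```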

Module#set_temporary_name

Allows assigning a string to be rendered as the class/module's #name, without assigning the class/module to a constant.

  • Reason: The feature is useful for providing a reasonable representation for dynamically generated classes without assigning them to constants (which pollutes the global namespace and might conflict with existing constants) or redefining Class#name (which might break other code and is not always respected in the output).
  • Discussion: Feature #19521
  • Documentation: Module#set_temporary_name
  • Code:
    dynamic_class = Class.new do
      def foo; end
    end
    
    dynamic_class.name #=> nil
    
    # For dynamic classes, representation of related values is frequently unreadable:
    dynamic_class #=> #<Class:0x0...>
    instance = dynamic_class.new #=> #<#<Class:0x0...>:0x0...>
    instance.method(:foo) #=> #<Method: #<Class:0x0...>#foo() ...>
    
    dynamic_class::Nested = Module.new
    dynamic_class::Nested #=> #<Class:0x0...>::Nested
    
    # After assigning the temporary name, representation becomes more convenient:
    dynamic_class.set_temporary_name("MyDSLClass(with description)")
    
    dynamic_class #=> MyDSLClass(with description)
    instance #=> #<MyDSLClass(with description):0x0...>
    instance.method(:foo) #=> #<Method: MyDSLClass(with description)#foo() ...>
    
    # Note that module constant names are assigned at the moment of their creation,
    # and don't change when the temporary name is assigned:
    dynamic_class::OtherNested = Module.new
    
    dynamic_class::Nested #=> #<Class:0x0...>::Nested
    dynamic_class::OtherNested #=> MyDSLClass(with description)::OtherNested
    
    # Assigning names that correspond to constant name rules is prohibited:
    dynamic_class.set_temporary_name("MyClass")
    # `set_temporary_name': the temporary name must not be a constant path to avoid confusion (ArgumentError)
    dynamic_class.set_temporary_name("MyClass::NestedName")
    # `set_temporary_name': the temporary name must not be a constant path to avoid confusion (ArgumentError)
    
    # When the module with a temporary name is put into a constant,
    # it receives a permanent name, which can't be changed anymore
    C = dynamic_class
    
    # It affects all associated values (including modules)
    
    dynamic_class #=> C
    instance #=> #<C:0x0...>
    instance.method(:foo) #=> #<Method: C#foo() ...>
    dynamic_class::Nested #=> C::Nested
    dynamic_class::OtherNested #=> C::OtherNested
    
    dynamic_class.set_temporary_name("Can I have it back?")
    # `set_temporary_name': can't change permanent name (RuntimeError)
    
    # `nil` can be used to cleanup a temporary name:
    other_class = Class.new
    other_class.set_temporary_name("another one")
    other_class #=> another one
    other_class.set_temporary_name(nil)
    other_class #=> #<Class:0x0...>
  • Notes: Any phrase that is used as a temporary name would be used verbatim; this might create very confusing #inspect results and error messages, so it is advised to use strings that somehow imply that the name belongs to a module. Imagine we wrap RSpec-style examples into classes with temporary names, and then there is a typo in the body of such an example:
    it "works as a calculator" do
      expec(2+2).to eq 4
    end
    # If we assign just the example description as a temp.name, the
    # error would look like this:
    #
    #   undefined method `expec' for an instance of works as a calculator
    #                                               ^^^^^^^^^^^^^^^^^^^^^
    #
    # ...which is confusing. So it is probably better to construct a
    # module-like temporary name, to have:
    #
    #   undefined method `expec' for an instance of MyFramework::Example("works as a calculator")
    #                                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
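A framework following that advice might build the module-like name in one place. This is a sketch under assumptions: `build_example_class` and the `MyFramework::Example(...)` naming scheme are hypothetical, and the `respond_to?` guard is there only so the snippet also runs on Ruby versions without `Module#set_temporary_name`.

```ruby
# Hypothetical factory for dynamically generated example classes
def build_example_class(description)
  klass = Class.new
  # Module#set_temporary_name is available since Ruby 3.3
  if klass.respond_to?(:set_temporary_name)
    # The parentheses keep the string from being a valid constant path,
    # so it is accepted as a temporary name
    klass.set_temporary_name("MyFramework::Example(#{description.inspect})")
  end
  klass
end

example_class = build_example_class("works as a calculator")
example_class.name
# On Ruby 3.3+: MyFramework::Example("works as a calculator")
```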

Refinement#refined_class is renamed to Refinement#target

Just a renaming of the unfortunately named new method that emerged in Ruby 3.2.
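A short illustration of the renamed method, assuming Ruby 3.2 or later (where `Module#refinements` is available). The `Shouting` module is hypothetical, and the fallback branch exists only so the same snippet runs on 3.2, where the method still had its old name.

```ruby
module Shouting
  refine String do
    def shout
      upcase + "!"
    end
  end
end

refinement = Shouting.refinements.first

# Prefer the new name; fall back to the 3.2 one when it is missing
refined =
  if refinement.respond_to?(:target)
    refinement.target
  else
    refinement.refined_class
  end
# refined is String
```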

Strings and regexps

String#bytesplice: new arguments to select a portion of the replacement string

The low-level string manipulation method now allows providing the coordinates of the part of the replacement string to be used.

  • Reason: The new "byte-oriented" methods were introduced in Ruby 3.2 to support low-level programming like text editors or network protocol implementations. In those use cases, copying a small part of one string into the middle of another is a frequent need, and producing intermediate strings (by first slicing the necessary part) is costly.
  • Discussion: Feature #19314
  • Documentation: String#bytesplice
  • Code:
    # Base usage
    buf1 = "Слава Україні!"
    #             ^^^^^^^ - bytes 11-24
    buf2 = "Шана Героям"
    #            ^^^^^^ - bytes 9-20
    
    buf1.bytesplice(11..24, buf2, 9..20)
    #=> "Слава Героям!"
    buf1
    #=> "Слава Героям!" -- The receiver is modified
    
    # Or, alternatively, with (start, length) pairs
    buf1 = "Слава Україні!"
    buf1.bytesplice(11, 14, buf2, 9, 12)
    #=> "Слава Героям!"
    
    # Two forms can't be mixed:
    buf1 = "Слава Україні!"
    buf1.bytesplice(11..24, buf2, 9, 12)
    # `bytesplice': wrong number of arguments (given 4, expected 2, 3, or 5) (ArgumentError)
    
    # Index can't be in the middle of the Unicode character:
    buf1.bytesplice(11..23, buf2, 9..20)
    #                    ^
    # `bytesplice': offset 24 does not land on character boundary (IndexError)
    buf1.bytesplice(11..24, buf2, 9..19)
    #                                 ^
    # `bytesplice': offset 20 does not land on character boundary (IndexError)
    
    # Semi-open ranges work:
    buf1 = "Слава Україні!"
    buf1.bytesplice(11..24, buf2, 9..)
    #=> "Слава Героям!"
    
    buf1 = "Слава Україні!"
    buf1.bytesplice(11..24, buf2, ...8)
    #=> "Слава Шана!"
    
    # Empty ranges lead to inserting empty strings:
    buf1 = "Слава Україні!"
    buf1.bytesplice(11..24, buf2, 9...8)
    #=> "Слава !"

MatchData#named_captures: symbolize_names: argument

  • Discussion: Feature #19591
  • Documentation: MatchData#named_captures
  • Code:
    m = "2023-12-25".match(/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/)
    m.named_captures
    #=> {"year"=>"2023", "month"=>"12", "day"=>"25"}
    m.named_captures(symbolize_names: true)
    #=> {:year=>"2023", :month=>"12", :day=>"25"}
  • Notes: While symbolize_names: might look somewhat strange (usually we talk about hash keys), it is done for consistency with Ruby standard library's JSON.parse signature, which inherited the terminology from the JSON specification.

Time.new with string argument became stricter

The method now requires a fully specified date-time string.

  • Discussion: Bug #19293
  • Documentation: Time.new
  • Code:
    Time.new('2023-12-20')
    # Ruby 3.2: #=> 2023-12-20 00:00:00 +0200
    # Ruby 3.3: in `initialize': no time information (ArgumentError)
    
    Time.new('2023-12')
    # Ruby 3.2: #=> 2023-12-01 00:00:00 +0200
    # Ruby 3.3: in `initialize': no time information (ArgumentError)
    
    # A single year still works:
    Time.new('2023')
    #=> 2023-01-01 00:00:00 +0200
    
    # ...because it is documented behavior of Time.new to accept
    # strings that are numeric and treat them as numbers:
    Time.new('2023', '12', '20')
    #=> 2023-12-20 00:00:00 +0200
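For code that used to pass date-only strings to `Time.new`, a few standard-library alternatives exist. The sketch below is not taken from the discussed ticket, just a set of options one might consider; which one fits depends on whether a `Time` or a `Date` is actually needed.

```ruby
require "time" # provides Time.strptime
require "date"

# An explicit format: no guessing about which parts are missing
midnight = Time.strptime("2023-12-20", "%Y-%m-%d")
# midnight is 2023-12-20 00:00:00 in the local time zone

# When only the calendar day matters, Date is a better fit
day = Date.iso8601("2023-12-20")

# Or give Time.new (Ruby 3.2+) the full date-time string it expects
full = Time.new("2023-12-20 00:00:00")
```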

Array#pack and String#unpack: raise ArgumentError for unknown directives

  • Discussion: Bug #19150
  • Documentation: doc/packed_data.rdoc
  • Code:
    [1, 2, 3].pack('r*')
    # Ruby 3.1:
    #   => "", no warning
    # Ruby 3.2:
    #   => "", warning: unknown pack directive 'r' in 'r*'
    # Ruby 3.3:
    #   in `pack': unknown pack directive 'r' in 'r*' (ArgumentError)
    
    "\x01\x02\x03".unpack("r*")
    # Ruby 3.1:
    #   => [], no warning
    # Ruby 3.2:
    #   => [], warning: unknown unpack directive 'r' in 'r*'
    # Ruby 3.3:
    #   in `unpack': unknown unpack directive 'r' in 'r*' (ArgumentError)

Enumerables and collections

Set#merge accepts multiple arguments

  • Documentation: Set#merge
  • Code:
    Set[1, 2, 3].merge(Set[3, 4, 5], Set[:a, :b, :c])
    #=> #<Set: {1, 2, 3, 4, 5, :a, :b, :c}>
  • Notes: The method's signature (seen in docs) has a rare clause **nil. It means "don't accept something that looks like keyword arguments." As #merge accepts any list of enumerables, this protects from accidentally passing a hash believing it would be keyword arguments with some meaning:
    Set[1, 2, 3].merge(Set[3, 4, 5], reorder: false)
    #                                ^^^^^^^^^^^^^^
    # Without **nil, this would be treated implicitly as Hash, while looking like keyword arguments
    # But actually, it produces
    #    no keywords accepted (ArgumentError)
    
    # When you do mean to merge data from hash, use braces to make it explicit
    # (its #each would be used to produce set items):
    Set[1, 2, 3].merge(Set[3, 4, 5], {some: 'data'})
    #=> #<Set: {1, 2, 3, 4, 5, [:some, "data"]}>

ObjectSpace::WeakKeyMap

A new "weak map" concept implementation. Unlike ObjectSpace::WeakMap, it compares keys by equality (WeakMap compares by identity), and only references to keys are weak (garbage-collectible).

  • Reason: The idea of a new class grew out of increased usage of ObjectSpace::WeakMap (which was once considered internal). In many other languages, the concept of a "weak map" implies that only key references are weak: this allows using it as a generic "holder of some additional information related to a set of objects while they are alive," or just a weak set of objects (using them as keys and true as values): caches, deduplication sets, etc.
  • Discussion: Feature #18498
  • Documentation: ObjectSpace::WeakKeyMap
  • Code:
    map = ObjectSpace::WeakKeyMap.new
    
    key = "foo"
    map[key] = true
    map["foo"] #=> true -- compares by equality, even if two strings are different objects
    
    # "Just return the equal key" API, always returns the key's object
    map.getkey("foo") #=> "foo"
    map.getkey("foo").object_id == key.object_id #=> true
    
    key = nil
    GC.start
    
    map["foo"] #=> nil -- the key was garbage-collected, so the pair was removed
    
    # One of the possible usages: a lightweight uniqueness cache for
    # many small objects:
    class Money < Data.define(:amount, :currency)
      def self.new(...)
        value = super(...)
        @cache ||= ObjectSpace::WeakKeyMap.new
        if (existing = @cache.getkey(value))
          existing
        else
          @cache[value] = true
          value # return the new object itself (the assignment alone would evaluate to `true`)
        end
      end
    end
    
    m1 = Money.new(10, 'USD')
    m2 = Money.new(10, 'USD')
    m1.object_id #=> 60
    m2.object_id #=> 60
    # Same values, it is the same object, so there wouldn't be a huge memory
    # penalty when thousands of similar values are created.
    
    # No references to "10 USD" object left
    m1 = nil
    m2 = nil
    GC.start
    m3 = Money.new(10, 'USD')
    m3.object_id #=> 80
    # The unused values got garbage-collected, so the cache wouldn't just grow forever
  • Notes: The class interface is significantly leaner than WeakMap's, and doesn't provide any kind of iteration methods (which are very hard to implement and use correctly with weakly-referenced objects), so the new class is more like a black box with associations than a collection.

ObjectSpace::WeakMap#delete

  • Reason: WeakMap is frequently used to have a loose list of objects that will need some processing at some point of program execution if they are still alive/used (that's why WeakMap and not Array/Hash is chosen in those cases). But it is possible that the code author wants to process objects conditionally, and to remove those which don't need processing anymore, even if they are still alive. WeakMap quacks like a kind of simple Hash, yet previously provided no way to delete keys.
  • Discussion: Feature #19561
  • Documentation: ObjectSpace::WeakMap#delete
  • Code:
    files_to_close = ObjectSpace::WeakMap.new
    file1 = File.new('README.md')
    file2 = File.new('NEWS.md')
    
    files_to_close[file1] = true
    files_to_close[file2] = true
    
    files_to_close.delete(file1) #=> true
    
    # Attempt to delete non-existing key:
    files_to_close.delete(file1) #=> nil
    # An optional block can be provided in case the key doesn't exist:
    files_to_close.delete(file1) { puts "Already removed"; 0 }
    # Prints "Already removed", returns `0`
    
    # The block wouldn't be called if the deletion was effectful:
    files_to_close.delete(file2) { puts "Already removed"; 0 }
    # Prints nothing, returns true

Thread::Queue#freeze and SizedQueue#freeze raise TypeError

  • Reason: The discussion was started with a bug report about Queue not respecting #freeze in any way (#push and #pop were still working after #freeze call). It was then decided that allowing to freeze a queue like any other collection (leaving it immutable) would have questionable semantics. As Queue is meant to be an inter-thread communication utility, freezing a queue while some thread waits for it would either leave this thread hanging, or would require #freeze's functionality to extend for communication with dependent threads. Neither is a good option, so the behavior of the method was changed to communicate that queue freezing doesn't make sense.
  • Discussion: Bug #17146
  • Documentation: Thread::Queue#freeze and Thread::SizedQueue#freeze
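A short sketch of the new behavior and of `Thread::Queue#close`, which is the existing way to tell consumers that no more items will arrive. The `rescue` keeps the snippet runnable on earlier versions too, where `#freeze` didn't raise.

```ruby
queue = Thread::Queue.new

begin
  queue.freeze
rescue TypeError => e
  # Ruby 3.3+: freezing a queue is rejected outright
  freeze_error = e
end

# To signal "no more items", close the queue instead of freezing it
consumer = Thread.new do
  items = []
  # #pop returns nil once the queue is closed and empty
  while (item = queue.pop)
    items << item
  end
  items
end

queue << 1 << 2
queue.close
consumer.value # [1, 2]
```

Unlike freezing, closing has well-defined semantics for waiting threads: they are woken up and receive `nil`.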

Range

#reverse_each

Specialized Range#reverse_each method is implemented.

  • Reason: Previously, Range didn't have a specialized #reverse_each method, so calling it invoked the generic Enumerable#reverse_each. The latter works by converting the object to an array, and then enumerating this array. In the case of a Range this can be inefficient (producing large arrays) or impossible (when only the upper bound of the range is defined). It also went into an infinite loop with endless ranges, trying to enumerate the whole range to convert it into an array, while the range can say beforehand that this would be impossible.
  • Discussion: Feature #18515
  • Documentation: Range#reverse_each
  • Code:
    # Efficient implementation for integers:
    
    (1..2**100).reverse_each.take(3)
    # Ruby 3.2: hangs on my machine, trying to produce an array
    # Ruby 3.3: #=> [1267650600228229401496703205376, 1267650600228229401496703205375, 1267650600228229401496703205374]
    #  (returns immediately)
    
    (..5).reverse_each.take(3)
    # Ruby 3.2: can't iterate from NilClass (TypeError)
    # Ruby 3.3: #=> [5, 4, 3]
    
    # Explicit error for endless ranges:
    
    (1...).reverse_each
    # Ruby 3.2: hangs forever, trying to produce an array
    # Ruby 3.3: `reverse_each': can't iterate from NilClass (TypeError)
    
    # The latter change affects any type of range beginning:
    ('a'...).reverse_each
    # Ruby 3.2: hangs forever, trying to produce an array
    # Ruby 3.3: `reverse_each': can't iterate from NilClass (TypeError)
  • Notes: Other than raising TypeError for endless ranges (which works with any type of range beginning), the specialized behavior is only implemented for Integer. A possible generalization using the object's #pred method (the opposite of #succ, which the range uses to iterate forward) was discussed, but the scope of this change would be bigger, as currently only Integer implements such a method. It is possible that the adjustments will be made in future versions.
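Since the specialized implementation covers only Integer, walking backwards from a non-integer upper bound needs another tool. One possible approach, shown here as a sketch with dates (the variable names are illustrative), is an explicit lazy sequence built with `Enumerator.produce`:

```ruby
require "date"

last_day = Date.new(2023, 12, 25)

# An infinite sequence going backwards from the given date;
# #take stops it after the requested number of items
recent = Enumerator.produce(last_day, &:prev_day).take(3)
# recent holds Dec 25, Dec 24 and Dec 23 of 2023, in that order
```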

#overlap?

Checks for overlapping of two ranges.

  • Discussion: Feature #19839
  • Documentation: Range#overlap?
  • Code:
    (1..3).overlap?(2..5) #=> true
    (1..3).overlap?(4..5) #=> false
    (..3).overlap?(3..)   #=> true
    
    (1...3).overlap?(3..5)
    #=> false, the first range doesn't include 3
    (1..3).overlap?(3...3)
    #=> false, the second range is empty (note it has an exclusive end)
    
    (1..3).overlap?('a'..'c')
    #=> false, ranges are incompatible (but not an exception)
    (1..3).overlap?(1)
    # `overlap?': wrong argument type Integer (expected Range) (TypeError)
  • Notes: As the documentation points out, the technically empty (...-Float::INFINITY) range (nothing can be lower than -Float::INFINITY, and it is not included) is still considered overlapping with itself by this method:
    (...-Float::INFINITY).overlap?(...-Float::INFINITY) #=> true
    # Same with other "nothing could be smaller" ranges:
    (..."").overlap?(..."") #=> true
    (Though, with Ruby's dynamic nature, one technically can define an object that will report itself to be smaller than an empty string, and therefore belong to a range... Making it non-empty.)

Filesystem and IO

Dir.for_fd and Dir.fchdir

Two methods to accept an integer file descriptor as an argument: for_fd creates a Dir object from it; fchdir changes the current directory to one specified by a descriptor.

  • Reason: The new methods allow using UNIX file descriptors when they are returned from C-level code or obtained from the OS.
  • Discussion: Feature #19347
  • Documentation: Dir.for_fd, Dir.fchdir
  • Code:
    fileno = Dir.new('doc/').fileno
    # In reality, this #fileno might come from other library
    
    dir = Dir.for_fd(fileno)
    #=> #<Dir:0x00007f8831b810a8> -- no readable path representation
    dir.path #=> nil
    dir.to_a
    #=> ["forwardable.rd.ja", "packed_data.rdoc", "marshal.rdoc", "format_specifications.rdoc", ....
    # It was performed in the Ruby's core folder, and lists the doc/ contents
    
    # Attempt to use a bogus fileno will result in error:
    Dir.for_fd(0)
    # `for_fd': Not a directory - fdopendir (Errno::ENOTDIR)
    
    # Same with fileno that doesn't designate a directory:
    Dir.for_fd(Dir.new('README.md').fileno)
    # in `initialize': Not a directory @ dir_initialize - README.md (Errno::ENOTDIR)
    
    # Same logic works for .fchdir
    Dir.fchdir(fileno) #=> 0
    Dir.pwd
    #=> "/home/zverok/projects/ruby/doc" -- the current path has changed successfully
    
    # A block form of fchdir is available, like for a regular .chdir:
    Dir.fchdir(Dir.new('NEWS').fileno) do |*args|
      p args #=> [] -- no arguments are passed into the block
      p Dir.pwd #=> "/home/zverok/projects/ruby/doc/NEWS"
      'return value'
    end #=> "return value"
    Dir.pwd #=> "/home/zverok/projects/ruby/doc" -- back to the path before the block
  • Notes:
    • The functionality is only supported on POSIX platforms;
    • The initial ticket only proposed to find a way to change the current directory to one specified by a descriptor (i.e., what eventually became .fchdir), but during the discussion a need was discovered for a generic instantiation of a Dir instance from the descriptor (what became .for_fd), as well as a generic way to change the current directory to one specified by a Dir instance (#chdir, which is not related to descriptors but is generically useful).

Dir#chdir

An instance method version of Dir.chdir: changes the current working directory to one specified by the Dir instance.

  • Discussion: Feature #19347
  • Documentation: Dir#chdir
  • Code:
    Dir.pwd #=> "/home/zverok/projects/ruby"
    dir = Dir.new('doc')
    dir.chdir #=> nil
    Dir.pwd #=> "/home/zverok/projects/ruby/doc"
    
    # The block form works, too:
    Dir.new('NEWS').chdir do |*args|
      p args #=> [] -- no arguments are passed into the block
      Dir.pwd #=> "/home/zverok/projects/ruby/doc/NEWS"
      'return value'
    end #=> "return value"
    Dir.pwd #=> "/home/zverok/projects/ruby/doc"

Deprecate subprocess creation with methods dedicated to files

  • Reason: Methods dedicated to opening/reading a file by name historically supported a special syntax of the argument: if it started with the pipe character |, a subprocess was created and could be used to communicate with an external command. The functionality is still explained in Ruby 3.2 docs. It created a security vulnerability, though: even when the program's author didn't rely on that behavior, a malicious string could be passed by an attacker instead of an innocent filename.
  • Discussion: Feature #19630
  • Affected methods:
  • Code:
    IO.read('| ls')
    #=> contents of the current folder
    
    Warning[:deprecated] = true # Or pass -w command-line option
    IO.read('| ls')
    # warning: Calling Kernel#open with a leading '|' is deprecated and will be removed in Ruby 4.0; use IO.popen instead
    #=> contents of the current folder
  • Notes:
    • The documentation for the corresponding methods was adjusted accordingly. Compare the documentation for Kernel#open from 3.2 (explains and showcases the | trick) and 3.3 (just mentions that there is a vulnerability to command injection attack).
    • As advised by the warning, IO.popen is the specialized method to use when communicating with an external process is the desired functionality:
      IO.popen('ls', &:read)
      #=> contents of the current folder
    • As the impact of the change might be big, note that the target version for removal is set to 4.0. To the best of my knowledge, there is no set date for the major version yet.
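In practice, the two intents can be kept strictly apart: `File.read` treats its argument only as a path, while `IO.popen` is used when a command really has to be run. The sketch below assumes nothing beyond the standard library; the temporary file and the tiny Ruby one-liner it runs are just stand-ins for real data and a real command.

```ruby
require "rbconfig"
require "tempfile"

# Reading: File.read never interprets a leading "|" as a command
contents = Tempfile.create("note") do |file|
  file.write("plain file")
  file.flush
  File.read(file.path)
end
# contents is "plain file"

# Running a command: the array form of IO.popen bypasses the shell,
# so the arguments are passed to the program as they are
output = IO.popen([RbConfig.ruby, "-e", "print 6 * 7"], &:read)
# output is "42"
```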

NoMethodError: change of rendering logic

NoMethodError doesn't use target object's #inspect in its message, and renders "instance of ClassName" instead.

  • Reason: While the #inspect of the object which failed to respond might be convenient in the error's output, it also might be extremely inefficient and confusing when the object is large and doesn't have #inspect redefined to something sensible. It is impossible to require all user objects to redefine #inspect, and even if it is redefined, it might be short yet inefficient; so the lesser of evils was chosen and exception's message became more efficient even if less informative.
  • Documentation: NoMethodError
  • Code:
    "hello".to_ary
    # Ruby 3.2: undefined method `to_ary' for "hello":String (NoMethodError)
    # Ruby 3.3: undefined method `to_ary' for an instance of String (NoMethodError)
    
    # But also, for some complicated data structure:
    ([{name: 'User 1', role: 'admin'}] * 100).to_josn # typo
    # Ruby 3.2: undefined method `to_josn' for [{:name=>"User 1", :role=>"admin"}, {:name=>"User 1", :role=>"admin"}, ...
    # ....10 lines of console output....
    # ..., {:name=>"User 1", :role=>"admin"}]:Array (NoMethodError)
    #
    # Ruby 3.3: undefined method `to_josn' for an instance of Array (NoMethodError)
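When the details of the failing object are still needed (say, for logging), they remain available from the exception itself through `NameError#receiver` and `#name`, which NoMethodError inherits. A small sketch, with variable names chosen for illustration:

```ruby
begin
  [{name: 'User 1', role: 'admin'}].to_josn # typo
rescue NoMethodError => e
  missing_method = e.name         # :to_josn
  receiver_class = e.receiver.class # Array
  receiver_size = e.receiver.size   # 1
end
```

This way the code handling the error decides how much of the object to render, instead of the message always embedding the full #inspect output.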

Fiber#kill

Terminates the Fiber by sending an exception inside it.

  • Reason: The method is intended for fibers that represent processes that need to be told explicitly to finalize themselves (invoking any ensure operations and cleanups that are necessary). If such a fiber is just abandoned and collected by the GC, its ensure wouldn't be invoked, and therefore the resources wouldn't be cleaned up; so there was a need for a way to do this explicitly.
  • Discussion: Bug #595
  • Documentation: Fiber#kill
  • Code:
    f = Fiber.new do
      (1..).each { Fiber.yield _1 }
    ensure
      puts "Closing myself"
    end
    #=> #<Fiber:0x0... (created)>
    
    f.resume #=> 1
    f.resume #=> 2
    f #=> #<Fiber:0x0... (suspended)>
    f.kill
    # Prints: "Closing myself"
    f #=> #<Fiber:0x0... (terminated)>
    f.resume
    # `resume': attempt to resume a terminated fiber (FiberError)
    
    # Semi-realistic usage example:
    
    reader = Fiber.new do
      conn = SomeConnection.open(**params)
      while conn.open?
        Fiber.yield conn.read
      end
    ensure
      conn&.close # conn is nil if SomeConnection.open itself failed
    end
    
    headers = reader.resume # reads something from the connection
    body_line1 = reader.resume # reads some more
    # Now, if we want to explicitly stop reading and be sure that the connection
    # is closed, we might do this:
    reader.kill # invokes #ensure
  • Notes:
    • The exception sent to the Fiber is uncatchable (so no rescue Exception will notice it), so it can't be said to have any class; the only consequence of it being raised through the exception mechanism is that ensure blocks are invoked;
    • The fiber that invoked the killed one with resume or transfer receives nil from that call;
      f1 = Fiber.new {
        # Instead of yielding something back, the fiber kills itself
        Fiber.current.kill
      }
      
      f2 = Fiber.new {
        result = f1.transfer
        p(result:)
      }
      
      f2.resume
      # prints: {:result=>nil}
    • Only fibers belonging to the same thread can be killed.
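
The first note can be checked with a small experiment. This is only a sketch: kill_with_rescue is a hypothetical helper name, and the code needs Ruby 3.3+ because of Fiber#kill. The fiber wraps its Fiber.yield in a deliberately broad rescue Exception, and the log shows which branches actually ran after the kill.

```ruby
# Hypothetical helper: kills a suspended fiber and reports what ran inside it.
def kill_with_rescue
  log = []
  fiber = Fiber.new do
    begin
      Fiber.yield :started
      log << :resumed_normally
    rescue Exception => e # deliberately broad, to show it is NOT triggered
      log << [:rescued, e.class]
    ensure
      log << :ensure_ran
    end
  end

  fiber.resume # runs the fiber up to Fiber.yield
  fiber.kill   # Ruby 3.3+
  { log: log, alive: fiber.alive? }
end

result = kill_with_rescue
result[:log]   #=> [:ensure_ran]
result[:alive] #=> false
```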

Internals

New Warning category: :performance

A new warning category was introduced for code that is correct but is known to produce performance problems. One such new warning was added for objects with too many "shape" variations.

  • Discussion: Feature #19538
  • Documentation: Warning#[category]
  • Code: Here is an example of the new warning in play:
    class C
      def initialize(i)
        instance_variable_set("@var_#{i}", i**2)
      end
    end
    
    Warning[:performance] = true # or pass `-W:performance` command-line argument
    
    (1..10).map { C.new(_1) }
    # warning: Maximum shapes variations (8) reached by C, instance variables accesses will be slower.
    The example is artificial, but it shows the principle: when instances of the same class end up with more than 8 different lists of instance variables (shapes), we might have a performance problem. This means, for example, that a frequently used class with many methods following the memoization idiom (@var ||= value on the first access) would create the same problem, unless all of the variables are initialized in initialize, giving all instances the same shape:
    class C
      # 9 different getters that create an instance variable
      # on the first access.
      def var1 = @var1 ||= rand
      def var2 = @var2 ||= rand
      def var3 = @var3 ||= rand
      def var4 = @var4 ||= rand
      def var5 = @var5 ||= rand
      def var6 = @var6 ||= rand
      def var7 = @var7 ||= rand
      def var8 = @var8 ||= rand
      def var9 = @var9 ||= rand
    end
    
    Warning[:performance] = true
    # Invoking different getters on different instances of the same class makes
    # them have different set of instance variables.
    (1..9).map { C.new.send("var#{_1}") }
    # warning: Maximum shapes variations (8) reached by C, instance variables accesses will be slower.
    
    # But if we add this to initialize:
    class C
      def initialize
        @var1, @var2, @var3, @var4, @var5, @var6, @var7, @var8, @var9 = nil
      end
    end
    
    (1..9).map { C.new.send("var#{_1}") }
    # no warning. All objects have the same list of instance vars = the same shape
  • Notes:
    • The warning category should be turned on explicitly by providing -W:performance CLI option or Warning[:performance] = true from the program.
  • Additional reading: Performance impact of the memoization idiom on modern Ruby by Ruby core team member Jean Boussier.
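
The shape itself is an internal detail, but a rough observable proxy for it is the list returned by #instance_variables. The sketch below (LazyStats, EagerStats and ivar_lists are hypothetical names, and the two-variable classes are far below the warning's threshold) only illustrates how memoizing getters make instances of one class diverge, and how eager initialization keeps them aligned.

```ruby
# Memoization only: each instance gets whatever variables were touched first.
class LazyStats
  def total = @total ||= 42
  def average = @average ||= 4.2
end

# Same getters, but every variable is defined in initialize, in a fixed order.
class EagerStats
  def initialize
    @total = nil
    @average = nil
  end

  def total = @total ||= 42
  def average = @average ||= 4.2
end

# Hypothetical helper: touches a different getter on two fresh instances
# and returns their instance variable lists.
def ivar_lists(klass)
  a = klass.new.tap(&:total)
  b = klass.new.tap(&:average)
  [a.instance_variables, b.instance_variables]
end

ivar_lists(LazyStats)  #=> [[:@total], [:@average]]
ivar_lists(EagerStats) #=> [[:@total, :@average], [:@total, :@average]]
```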

Process.warmup

A method to call when a long-running application has finished loading, before its regular work starts.

  • Discussion: Feature #18885
  • Documentation: Process.warmup
  • Notes: It is hard to explain or showcase the method here better than the discussion linked above and the method's docs already do.
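
Still, the call site itself is simple enough to sketch. The finish_boot helper and its eager_load parameter are hypothetical (a real application would put its own loading code there); the respond_to? guard is only there so that the same boot script keeps working on Ruby versions older than 3.3.

```ruby
# Hypothetical boot hook: load everything first, then tell the VM
# that the boot sequence is over.
def finish_boot(eager_load: -> {})
  eager_load.call # application-specific loading, assumed to be provided by the caller

  return :warmup_unavailable unless Process.respond_to?(:warmup)

  Process.warmup # Ruby 3.3+
  :warmed_up
end

finish_boot(eager_load: -> { require "json" })
```

In a forking server, the natural place for such a call would be in the parent process before the workers are forked, but that is a deployment decision rather than something the method enforces.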

Process::Status#& and #>> are deprecated

  • Reason: These methods treated Process::Status as a very thin wrapper around the integer value of the process's return status, which is unreasonable when Ruby has to be supported in more varied environments.
  • Discussion: Bug #19868
  • Documentation: Process::Status#&, #>>
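
Code that used bit arithmetic on the status can usually switch to the dedicated getters and predicates of Process::Status. A minimal sketch, assuming a hypothetical describe_exit helper that spawns a child Ruby process exiting with the given code:

```ruby
require "rbconfig"

# Hypothetical helper: runs a child Ruby that exits with `code`
# and describes the result without touching the raw integer.
def describe_exit(code)
  system(RbConfig.ruby, "-e", "exit #{Integer(code)}")
  status = $? # Process::Status of the child that just finished

  {
    exitstatus: status.exitstatus, # instead of `status >> 8`
    success: status.success?,
    signaled: status.signaled?     # instead of checking `status & 0x7f`
  }
end

result = describe_exit(3)
result[:exitstatus] #=> 3
result[:success]    #=> false
result[:signaled]   #=> false
```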

TracePoint supports :rescue event

Allows tracing the moments when an exception is rescued in the code of interest.

  • Discussion: Feature #19572
  • Documentation: TracePoint#Events
  • Code:
    TracePoint.trace(:rescue) do |tp|
      puts "Exception rescued: #{tp.raised_exception.inspect} at #{tp.path}:#{tp.lineno}"
    end
    
    begin
      raise "foo"
    rescue => e
    end
    # Prints: "Exception rescued: #<RuntimeError: foo> at example.rb:7"
  • Notes: The event-specific attribute for the event is the same as for :raise: #raised_exception.
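
For tests or debugging sessions, it might be more convenient to enable the trace only around one block of code. A sketch of that, with a hypothetical rescued_exceptions helper (Ruby 3.3+ is needed for the :rescue event):

```ruby
# Hypothetical helper: returns the classes of all exceptions that were
# rescued while the given block was running.
def rescued_exceptions
  seen = []
  trace = TracePoint.new(:rescue) { |tp| seen << tp.raised_exception.class }
  trace.enable { yield } # the trace is active only for the duration of the block
  seen
end

rescued_exceptions do
  begin
    Integer("not a number")
  rescue ArgumentError => e
    e.message
  end
end
#=> [ArgumentError]
```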

Standard library

Since the Ruby 3.1 release, most of the standard library has been extracted to either default or bundled gems; their development happens in separate repositories, and changelogs are either maintained there, or absent altogether. Either way, their changes aren't mentioned in the combined Ruby changelog, and I'll not be trying to follow all of them.

The stdgems.org project has a nice explanation of the default and bundled gems concepts, as well as a list of currently gemified libraries and links to their docs.

"For the rest of us" this means that the libraries' development has been extracted into separate GitHub repositories, and they are just packaged with the main Ruby before release. It means you can submit an issue or PR to any of them independently, without going through the tougher development process of core Ruby.

A few changes to mention, though:

  • BasicSocket#recv and BasicSocket#recv_nonblock return nil instead of an empty string on closed connections. BasicSocket#recvmsg and BasicSocket#recvmsg_nonblock return nil instead of an empty packet on closed connections. Discussion: Bug #19012
  • Name resolution such as Socket.getaddrinfo, Socket.getnameinfo, Addrinfo.getaddrinfo can now be interrupted. Discussion: Feature #19965
  • Random::Formatter#alphanumeric: chars keyword argument. Feature #18183:
    require 'random/formatter'
    # The default behavior: uses English alphabet + numbers
    Random.alphanumeric
    #=> "fhCshEkcGfCTO6Ny"
    
    # With the argument provided:
    Random.alphanumeric(chars: ['a', 'b', 'c'])
    #=> "cbacacbababccccc"
    
    # Note that the argument should be an array.
    # So if you have a string of characters, you can do:
    Random.alphanumeric(chars: 'abc'.chars)
    #=> "abbaccaacbacbccc"
    
    # Any object is acceptable as an array element; the method
    # would just use their `#to_s`; arrays would be flattened:
    Random.alphanumeric(chars: [1, true, [2], Object.new])
    #=> "111true11211true2true#<Object:0x00007fe804e79f48>221"
    
    # An empty array just hangs forever:
    Random.alphanumeric(chars: []) # never returns
  • There were many amazing changes in Ruby's console IRB. See the article by IRB maintainer Stan Lo: Unveiling the big leap in Ruby 3.3's IRB.
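
The BasicSocket#recv change mentioned above affects read loops that compared the result with an empty string. A sketch of a loop that works both before and after the change; read_until_closed is a hypothetical helper, and the chunk size is arbitrary:

```ruby
require "socket"

# Hypothetical helper: reads from a stream socket until the peer closes it.
def read_until_closed(sock, chunk_size = 1024)
  chunks = []
  loop do
    data = sock.recv(chunk_size)
    # nil on Ruby 3.3+, empty string on earlier versions
    break if data.nil? || data.empty?
    chunks << data
  end
  chunks.join
end

writer, reader = Socket.pair(:UNIX, :STREAM)
writer.send("hello", 0)
writer.close
read_until_closed(reader) #=> "hello"
reader.close
```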

Version updates

Default gems

Bundled gems

Standard library content changes

New libraries

  • prism (nee YARP) is added. It is a new Ruby code parser, developed by Kevin Newton, which intends to become the Ruby parser, shared by all implementations (not only CRuby/MRI, but also TruffleRuby, JRuby, and others) and tools that need to parse Ruby code (like Sorbet or Rubocop). It doesn't replace CRuby's Ruby parser, at least for now, but can be used to parse Ruby quickly and produce a robust, easy-to-use AST.
    • Documentation: Prism (it isn't very well rendered in the standard library docs, so the official site is recommended);
    • Note: You can run Ruby with Prism as its main parser with --parser=prism, but it is only for experimentation and debugging for now.
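
As a small taste of the library, here is a sketch of a syntax check built on Prism.parse. The syntax_errors helper is a hypothetical name; the LoadError guard is an assumption for environments where prism is not available (Ruby older than 3.3 without the gem installed), in which case the helper just returns nil.

```ruby
begin
  require "prism"
rescue LoadError
  # prism ships with Ruby 3.3+; elsewhere it would have to be installed as a gem
end

# Hypothetical helper: returns the list of syntax error messages for the source,
# or nil if Prism is not available.
def syntax_errors(source)
  return nil unless defined?(Prism)

  Prism.parse(source).errors.map(&:message)
end

syntax_errors("1 + 2")       # empty list: the code is valid
syntax_errors("def broken(") # non-empty list of messages
```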

Removals

  • readline extension is removed. It was a standard library written in C to wrap GNU Readline, used to implement interactive consoles like IRB. Ruby has included a pure-Ruby replacement called reline since 2.7, and now require 'readline' will just require it and make an alias Readline = Reline. However, if the readline-ext gem is installed explicitly, require 'readline' would require it instead. Discussion: Feature #19616.
    require 'readline'
    
    Readline
    # Ruby 3.2:
    #   => Readline -- a separate library/constant
    # Ruby 3.3:
    #   => Reline -- Readline is just an alias
    
    Readline.method(:readline)
    # Ruby 3.2:
    #   => #<Method: Readline.readline(*)>  -- a C-defined method with no location/signature extracted
    # Ruby 3.3:
    #   => #<Method: Reline.readline(*args, &block) <...>/lib/ruby/3.3.0+0/forwardable.rb:231>

Default gems that became bundled

This means that if your dependencies are managed by Bundler and your code depends on racc, it should be added to the Gemfile.

Gems that are warned to become bundled in the next version

Starting with the next version of Ruby, these gems won't be available in a Bundler-managed environment unless explicitly added to the Gemfile. For now, requiring them in such an environment produces a warning. Discussion: Feature #19351 (initial proposal to promote many gems, which then was deemed problematic), Feature #19776 (the warning proposal)