Data#to_h and Data#deconstruct return values in wrong order when module overrides initialize in a subclassed Data class #9241

@myronmarston

Summary

On JRuby, when a module included in a Data.define block overrides initialize and calls super(**reordered_hash), and the resulting Data class is subclassed (even by an empty subclass), to_h and deconstruct assign values to members by the hash's iteration position rather than by key name. Accessor methods still return the correct values, so accessors and to_h/deconstruct disagree on the same object.

Data#with uses to_h internally, so it propagates the corruption — after with, even accessor methods return wrong values.

Struct is not affected.

Environment

  • JRuby 10.0.2.0 (3.4.2)
  • OpenJDK 24.0.2+12-FR, arm64-darwin
  • Also verified correct behavior on MRI 3.4.5 and MRI 4.0.0

Reproduction

module ReversingInit
  def initialize(**kwargs)
    super(**kwargs.to_a.reverse.to_h)
  end
end

Base = Data.define(:a, :b, :c) { include ReversingInit }
class Sub < Base; end  # empty subclass

obj = Sub.new(a: 1, b: 2, c: 3)

obj.a     #=> 1  (correct)
obj.b     #=> 2  (correct)
obj.c     #=> 3  (correct)
obj.to_h  #=> {a: 3, b: 2, c: 1}  (WRONG — expected {a: 1, b: 2, c: 3})

obj.deconstruct  #=> [3, 2, 1]  (WRONG — expected [1, 2, 3])

obj2 = obj.with(b: 20)
obj2.a  #=> 3   (WRONG — expected 1; corruption propagated via to_h)
obj2.b  #=> 20  (correct — was explicitly set)
obj2.c  #=> 1   (WRONG — expected 3)

Expected behavior

to_h, deconstruct, and accessor methods should all agree. Keyword arguments passed to Data#initialize should be matched by name regardless of hash iteration order — the same as MRI behavior.
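
The agreement described above can be checked mechanically. The following is a minimal sketch, not part of the original report: the helper name inconsistent_views and the Pair class are hypothetical, and the deconstruct check is guarded with respond_to? in case the running implementation does not define it.

```ruby
# Hypothetical helper (not from the report): lists the views of a Data
# instance that disagree with its accessor methods.
def inconsistent_views(obj)
  members = obj.class.members
  by_accessor = members.to_h { |m| [m, obj.public_send(m)] }

  views = { to_h: obj.to_h }
  # Guarded: only compared when the implementation defines deconstruct.
  views[:deconstruct] = members.zip(obj.deconstruct).to_h if obj.respond_to?(:deconstruct)

  views.reject { |_name, view| view == by_accessor }.keys
end

Pair = Data.define(:x, :y)  # hypothetical example class
p inconsistent_views(Pair.new(x: 1, y: 2))
```

An empty array means every view agrees with the accessors. Under the behavior reported in this issue, calling the helper on the Sub instance from the reproduction would be expected to list :to_h and :deconstruct.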

Conditions

All three must hold to trigger the bug:

  1. A module overrides initialize and calls super(**hash) where the hash has a different key iteration order than Data.define's member order.
  2. The module is included in the Data.define block (in the base Data class).
  3. The Data class is subclassed (even a completely empty subclass).

Changing any single condition avoids the bug:

Scenario                                              Module in base?  Subclassed?  Reordered?  Result
Module in base, no subclass                           yes              no           yes         PASS
Subclass overrides initialize directly (no module)    no               yes          yes         PASS
Module included in subclass (not base)                no (in sub)      yes          yes         PASS
Module in base + subclass, kwargs in original order   yes              yes          no          PASS
Module in base + subclass + reordered kwargs          yes              yes          yes         FAIL
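
The fourth row (kwargs forwarded in their original order) could be exercised with a control along these lines. This is a sketch under assumptions: the names PassthroughInit, C4Base, and C4 are hypothetical and chosen only for illustration.

```ruby
# Hypothetical control: same shape as the failing case, but the module
# forwards kwargs without changing their order.
module PassthroughInit
  def initialize(**kwargs)
    super(**kwargs)
  end
end

C4Base = Data.define(:a, :b, :c) { include PassthroughInit }
class C4 < C4Base; end  # empty subclass, as in the failing case

c4 = C4.new(a: 1, b: 2, c: 3)
puts "c4.a    = #{c4.a.inspect}"
puts "c4.to_h = #{c4.to_h.inspect}"
```

If this control passes while the reordered variant fails, the difference is the key order reaching Data#initialize rather than the module or the subclass alone.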

Full self-contained test script

#!/usr/bin/env ruby
# frozen_string_literal: true

puts "#{RUBY_ENGINE} #{RUBY_ENGINE_VERSION} (#{RUBY_VERSION})"
puts

$failures = []

def assert_equal(expected, actual, description)
  if expected == actual
    puts "  PASS: #{description}"
  else
    puts "  FAIL: #{description}"
    puts "    expected: #{expected.inspect}"
    puts "    actual:   #{actual.inspect}"
    $failures << description
  end
end

module ReversingInit
  def initialize(**kwargs)
    super(**kwargs.to_a.reverse.to_h)
  end
end

# ─── Controls: all PASS on JRuby ─────────────────────────────────────────────

puts "Control 1: Module in Data.define, NO subclass"
C1 = Data.define(:a, :b, :c) { include ReversingInit }
c1 = C1.new(a: 1, b: 2, c: 3)
assert_equal(1, c1.a, "c1.a")
assert_equal({a: 1, b: 2, c: 3}, c1.to_h, "c1.to_h")
puts

puts "Control 2: No module, subclass overrides initialize directly"
C2Base = Data.define(:a, :b, :c)
class C2 < C2Base
  def initialize(**kwargs)
    super(**kwargs.to_a.reverse.to_h)
  end
end
c2 = C2.new(a: 1, b: 2, c: 3)
assert_equal(1, c2.a, "c2.a")
assert_equal({a: 1, b: 2, c: 3}, c2.to_h, "c2.to_h")
puts

puts "Control 3: Module in subclass (not base)"
C3Base = Data.define(:a, :b, :c)
class C3 < C3Base
  include ReversingInit
end
c3 = C3.new(a: 1, b: 2, c: 3)
assert_equal(1, c3.a, "c3.a")
assert_equal({a: 1, b: 2, c: 3}, c3.to_h, "c3.to_h")
puts

# ─── Bug: Module in base + subclass ──────────────────────────────────────────

puts "BUG: Module in Data.define base + empty subclass"
BugBase = Data.define(:a, :b, :c) { include ReversingInit }
class Bug < BugBase; end

bug = Bug.new(a: 1, b: 2, c: 3)

puts "  Accessors (correct on JRuby):"
assert_equal(1, bug.a, "bug.a")
assert_equal(2, bug.b, "bug.b")
assert_equal(3, bug.c, "bug.c")

puts "  to_h (BROKEN on JRuby):"
assert_equal({a: 1, b: 2, c: 3}, bug.to_h, "bug.to_h")

puts "  deconstruct (BROKEN on JRuby):"
assert_equal([1, 2, 3], bug.deconstruct, "bug.deconstruct")

puts "  with (propagates corruption via to_h):"
bug2 = bug.with(b: 20)
assert_equal(1, bug2.a, "bug.with.a")
assert_equal(20, bug2.b, "bug.with.b")
assert_equal(3, bug2.c, "bug.with.c")
puts

# ─── Summary ─────────────────────────────────────────────────────────────────
puts "=" * 60
if $failures.empty?
  puts "All assertions passed."
else
  puts "#{$failures.size} failure(s):"
  $failures.each { |f| puts "  - #{f}" }
  exit 1
end

Analysis

It appears that JRuby's Data#initialize has two storage mechanisms:

  • Accessor methods use name-based storage (correct).
  • to_h / deconstruct use positional slots that are filled in hash iteration order rather than matched by parameter name.

When the Data class that includes the module is instantiated directly (no subclass), or when initialize is overridden directly in the subclass (no module), the positional slots are filled correctly. The bug only manifests when the super call originates from a module included in the base Data class and the actual instance is of a subclass.
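
To make the hypothesis concrete, here is a toy model in plain Ruby. It is not JRuby's implementation and does not model the subclass condition. ToyRecord and its internals are hypothetical, and it only shows how filling positional slots in iteration order would produce the symptoms reported above, including propagation through with.

```ruby
# Toy model only: NOT JRuby's code. It mimics the suspected split between
# name-based accessor storage and positionally filled slots.
class ToyRecord
  MEMBERS = %i[a b c].freeze

  def initialize(**kwargs)
    @by_name = kwargs          # accessors look values up by key
    @slots   = kwargs.values   # suspected bug: slots filled in iteration order
  end

  MEMBERS.each { |m| define_method(m) { @by_name.fetch(m) } }

  def to_h
    MEMBERS.zip(@slots).to_h   # pairs member i with slot i
  end

  def with(**changes)
    # Rebuilds from to_h, so any mismatch in to_h reaches the new object.
    self.class.new(**to_h.merge(changes))
  end
end

rec = ToyRecord.new(c: 3, b: 2, a: 1)  # reversed key order, as ReversingInit produces

rec.a               #=> 1
rec.to_h            #=> {a: 3, b: 2, c: 1}
rec.with(b: 20).a   #=> 3
```

The toy reproduces the values from the reproduction section, which is consistent with the hypothesis but does not confirm it.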

Struct with keyword_init: true is not affected under the same conditions.
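
The report does not include the Struct comparison itself. A minimal sketch of what an equivalent check presumably looks like, with StructBase and StructSub as hypothetical names:

```ruby
module ReversingInit
  def initialize(**kwargs)
    super(**kwargs.to_a.reverse.to_h)
  end
end

# Same shape as the failing Data case: module in the base, empty subclass.
StructBase = Struct.new(:a, :b, :c, keyword_init: true) { include ReversingInit }
class StructSub < StructBase; end

s = StructSub.new(a: 1, b: 2, c: 3)
puts "to_h:        #{s.to_h.inspect}"
puts "deconstruct: #{s.deconstruct.inspect}"
```

With name-based matching, to_h and deconstruct should follow member order (a, b, c) regardless of the order in which the keys reached Struct#initialize.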
