It was first pointed out that a call to get a method generic return type was throwing an exception.
During the investigation it was found that the exception was thrown while gathering the method's data.
And the cause of the exception was that the signature returned by Class#getGenericSignature0
was java/util/concurrent/CompletableFuture
while the correct value is <T:Ljava/lang/Object;>Ljava/lang/Object;Ljava/util/concurrent/Future<TT;>;
.
Given that Class#getGenericSignature0
is a native method, the next step was to inspect the generated bytecode.
A simplified version of the instrumentation was created. This version did not change anything in the code, it only adds a couple of annotations to the class. Even then the problem persisted. And looking at the bytecode, both instrumented and original contain a line with the proper signature:
Signature: #xyz // <T:Ljava/lang/Object;>Ljava/lang/Object;Ljava/util/concurrent/Future<TT;>;Ljava/util/concurrent/CompletionStage<TT;>;
.
Bytecodes:
Further comparing the bytecodes, the only differences are the order of the items in the constant pool (which cause the references throughout the class to change), the order of the methods and the order of the tables in each method (LineNumberTable, LocalVariableTable, LocalVariableTypeTable).
NOTE as mentioned before, the instrumented CompletableFuture bytecode was generated with a trimmed down version of the New Relic Java agent. The bytecode that will be generated by this application will have more differences from the original since a full New Relic Java agent is used.
Since the bytecode did not provide any clue, the next step was to debug the JVM itself.
ClassFileParser::apply_parsed_class_attributes
is called twice for CompletableFuture
. Once before instrumentation and once after. And in both cases, the _generic_signature_index
is correct relative to the bytecode being read.
Following the execution to VM_RedefineClasses::merge_cp_and_rewrite
:
1794 jvmtiError VM_RedefineClasses::merge_cp_and_rewrite(
1795 InstanceKlass* the_class, InstanceKlass* scratch_class,
1796 TRAPS) {
...
... // point A
1916 if (!rewrite_cp_refs(scratch_class)) {
1917 return JVMTI_ERROR_INTERNAL;
1918 }
... // point B
...
1924 set_new_constant_pool(loader_data, scratch_class, merge_cp, merge_cp_length,
1925 CHECK_(JVMTI_ERROR_OUT_OF_MEMORY));
... // point C
...
1931 cp_cleaner.add_scratch_cp(scratch_cp());
...
At point A, scratch_class->generic_signature()
is correct and generic_signature_index
is 4.
At point B, scratch_class->generic_signature()
is wrong, its value changes with each run, and generic_signature_index
is 1019.
At point C, scratch_class->generic_signature()
is the value observed at runtime and generic_signature_index
is 4. Also scratch_class->constants()->symbol_at(1019)
has the correct generic signature.
Note that 1019 is the index for the generic signature in the original bytecode and 4 is the index for the generic signature in the instrumented bytecode. (This is likely not relevant as explained later.)
Inside VM_RedefineClasses::rewrite_cp_refs
the explanation to what happens between points A and B.
1936 bool VM_RedefineClasses::rewrite_cp_refs(InstanceKlass* scratch_class) {
...
2024 // rewrite class generic signature index:
2025 u2 generic_signature_index = scratch_class->generic_signature_index(); // this is set to 4
2026 if (generic_signature_index != 0) {
2027 u2 new_generic_signature_index = find_new_index(generic_signature_index); // this is set to 1019
2028 if (new_generic_signature_index != 0) {
2029 scratch_class->set_generic_signature_index(new_generic_signature_index);
2030 }
2031 }
...
It looks like VM_RedefineClasses::rewrite_cp_refs
writes the generic_signature_index
into scratch_cp
(the cp inside scratch_class
) to be correct index of the generic signature for the constant pool that will be put into scratch_class
.
Then merge_cp
is passed into VM_RedefineClasses::set_new_constant_pool
as scratch_cp
. A new constant pool is created (smaller_cp
), the constants from the "new" scratch_cp
are put in it, along with its generic_signature_index
which is not the value updated in VM_RedefineClasses::rewrite_cp_refs
.
Looks like there is some confusion with the names here and the generic_signature_index
that was updated in VM_RedefineClasses::rewrite_cp_refs
gets lost.
Inside VM_RedefineClasses::merge_cp_and_rewrite
, the scratch_cp
(inside scratch_class), has its fields updated to reference the items in the final constant pool during the call to VM_RedefineClasses::rewrite_cp_refs
.
Just after that, there is a call to VM_RedefineClasses::set_new_constant_pool
in which merge_cp
is passed as a parameter named scratch_cp
. Inside this method, the final constant pool is created (smaller_cp
) and there is a call to populate its fields.
smaller_cp->copy_fields(scratch_cp);
but remember that inside VM_RedefineClasses::set_new_constant_pool
, scratch_cp
is merge_cp
from the calling method and does not have the updated fields. So changing this line to
smaller_cp->copy_fields(scratch_class->constants());
copies the right fields to smaller_cp
.
This has not been thoroughly tested, but does fix the problem we are seeing. It also does not make the code any less confusing. Renaming scratch_cp
inside VM_RedefineClasses::set_new_constant_pool
might make things a little clearer. Another possible fix is to update the fields in merge_cp
instead of scratch_cp
(in VM_RedefineClasses::rewrite_cp_refs
).
I don't have a holistic view of everything going on, so this may be completely off.