Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(agent): Add aws-sdk-php SQS instrumentation #1003

Open
wants to merge 24 commits into
base: feat/message_segment
Choose a base branch
from

Conversation

zsistla
Copy link
Contributor

@zsistla zsistla commented Dec 30, 2024

Uses message segments to create SQS instrumentation.

Added unit tests and multiverse tests.

SQS commands sendMessage, sendMessageBatch, receiveMessage are instrumented
to create message segment spans.
Added unit tests.
* Added helper func to get command arg value for reuse with future AWS instrumentations.
* Added unit tests for helper functions.
Separated cloud attributes from message segment.
They can be applied to multiple segment types and don't belong in typed attributes.
It will make it easier to add to other segments in the future.

Added a function to add the cloud attributes.

Reworked unit tests with new functionality.
SQS commands sendMessage, sendMessageBatch, receiveMessage are instrumented
to create message segment spans.
Added unit tests.
* Added helper func to get command arg value for reuse with future AWS instrumentations.
* Added unit tests for helper functions.
Separated cloud attributes from message segment.
They can be applied to multiple segment types and don't belong in typed attributes.
It will make it easier to add to other segments in the future.

Added a function to add the cloud attributes.

Reworked unit tests with new functionality.
@zsistla zsistla force-pushed the feat/sqs_instrumentation branch from b342fc5 to 352f85f Compare December 30, 2024 14:11
@newrelic-php-agent-bot
Copy link

newrelic-php-agent-bot commented Dec 30, 2024

Test Suite Status Result
Multiverse 0/8 passing
SOAK 61/72 passing

@codecov-commenter
Copy link

codecov-commenter commented Dec 30, 2024

Codecov Report

Attention: Patch coverage is 44.18605% with 72 lines in your changes missing coverage. Please review.

Please upload report for BASE (feat/message_segment@925b94c). Learn more about missing BASE report.

Files with missing lines Patch % Lines
agent/lib_aws_sdk_php.c 43.08% 70 Missing ⚠️
axiom/nr_segment_message.c 50.00% 2 Missing ⚠️
Additional details and impacted files
@@                   Coverage Diff                   @@
##             feat/message_segment    #1003   +/-   ##
=======================================================
  Coverage                        ?   78.01%           
=======================================================
  Files                           ?      197           
  Lines                           ?    27421           
  Branches                        ?        0           
=======================================================
  Hits                            ?    21392           
  Misses                          ?     6029           
  Partials                        ?        0           
Flag Coverage Δ
agent-for-php-7.2 78.16% <66.66%> (?)
agent-for-php-7.3 78.18% <66.66%> (?)
agent-for-php-7.4 77.89% <66.66%> (?)
agent-for-php-8.0 77.26% <66.66%> (?)
agent-for-php-8.1 77.75% <44.18%> (?)
agent-for-php-8.2 77.35% <44.18%> (?)
agent-for-php-8.3 77.35% <44.18%> (?)
agent-for-php-8.4 77.37% <44.18%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

agent/lib_aws_sdk_php.h Outdated Show resolved Hide resolved
agent/lib_aws_sdk_php.h Outdated Show resolved Hide resolved
agent/lib_aws_sdk_php.h Outdated Show resolved Hide resolved
* Start a new segment to contain the message segment.
* The new segment is started in a new "before" wrapper.
* If we created a message segment, the segment is ended; otherwise we need to end it before exiting the wrapper.
* Give the calling segment a more meaningful name so we see something like `Aws\\Sqs\\SqsClient::sendMessage` instead of seeing `Aws/AwsClient::__call`
agent/lib_aws_sdk_php.c Outdated Show resolved Hide resolved
@zsistla zsistla requested review from ZNeumann and mfulb January 9, 2025 13:54

/*
* Purpose : The second argument to the Aws/AwsClient::__call function should be
* an array containing an array of argument name:value pairs. Given an argument
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it an array containing an array or is it just an array containing arguments to the command stored as name_of_an_argument => value?

Copy link
Contributor Author

@zsistla zsistla Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an array containing an array of name:value pairs that represent the arguments passed to the original called command_name function.

agent/lib_aws_sdk_php.h Outdated Show resolved Hide resolved
agent/lib_aws_sdk_php.c Outdated Show resolved Hide resolved
}

nr_free(command_name_string);
nr_free(real_class_and_command);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this cleanup for real_class_and_command happen in the if block above since it would make it more obvious when reading the code that this allocation is released?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I refactored before reading this comment, but I think the refactor addressed this concern.
2a74246

Research showed comparing strlen first significantly reduced lookup times, especially in non-matching cases.
Refactor to use the already computed strlen in the zval/zend_string.
This involved removing the helper function which took the zval/zend_string and strduped it onto a char*.
Since we weren't helping by converting to string anymore, keeping the helper function would have meant having to do the less optimal options of the calling function remember to nr_php_arg_release an arg it didn't directly get, copy the zval, add a ref to the zval that would need to be freed.
Simpler just to extract the functionality back to the main function.  One positive of helper functions is they are easier to test, but shouldn't be at the detriment of code readability or optimization. nr_php_arg_get was already directly giving us the zval/zend_string availability that we wanted.
* Find the pattern of the AWS queueurl that should immediately precede the
* region.
*/
if (0 != strncmp(queueurl_pointer, "https://sqs.", 12)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on our conversation - replace hard coded numbers with a #define and sizeof().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see: 09f10e8

* nr_lib_aws_sdk_php_sqs_parse_queueurl checks for NULL so safe pass
* command_arg_value directly in.
*/
nr_lib_aws_sdk_php_sqs_parse_queueurl(command_arg_value, &message_params,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider an optimization where a copy of command_arg_value is passed in and then the parsing function can modify this buffer in place and return pointers to the fields within this buffer to avoid strdup'ing the 3 fields in the parsing function. Also removes the need to free these later. Just need to document the parse function REQUIRES a copy that can be modified.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see: f48df3f

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hold off on looking at this, it needs additional rework.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests needed to be updated for this refactor to pass modifiable strings not literals.
0b194c2

@@ -1260,7 +1260,10 @@ static inline void nr_php_execute_segment_add_metric(
if (create_metric) {
nr_segment_add_metric(segment, buf, true);
}
nr_segment_set_name(segment, buf);
if (!segment->name) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this related to SQS instrumentation? This is a code path run for all instrumentation so I'm trying to understand why this general a change is required for SQS and if this is not something we've handled differently before.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not required for just SQS but for all AWS instrumentation.
Because we instrument the Aws/AwsClient::__call and decode the calling command, we are able to give a more meaningful name here.

So, instead of naming the segment Aws//AwsClient::__call we can name it Aws\\Sqs\\SqsClient::sendMessage.

However, when the segment was getting set in php_execute, it wasn't checking if a more meaningful name had already been assigned. This just checks if a name has already been applied to the segment, and if so don't overwrite it; otherwise, continue as normal.

Copy link
Contributor

@mfulb mfulb Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this means up till this PR the segment would never have been assigned a name before this point by any existing instrumentation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct.

tlib_pass_if_not_null("Expression should evaluate.", expr);
nr_php_zval_free(&expr);
nr_php_zval_free(&array_arg);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the QueueUrl key has a value which is not the correct type, like an array?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added here: b63f74f

tlib_pass_if_str_equal("destination_name should match.",
message_params.destination_name, destination_name);

nr_free(cloud_attrs.cloud_region);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the parse function is changed to not strdup the values this function will need to be modified too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the parse_queueurl refactor, tests need to be updated with modifiable strings and to not free the previously strduped strings.
0b194c2

#define INVALID_QUEUE_URL_7 "https://sqs.us-east-2.amazonaws.com"
#define INVALID_QUEUE_URL_8 "https://sqs.us-east-2.random.com/123456789012/SQS_QUEUE_NAME"
// clang-format on

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add tests where the 2nd and 3rd args are NULL instead of pointing to the message and cloud structs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, the third argument queue_name would not exist.

In this case, we wouldn't decode the 2nd argument (account id).

If one of the three is NULL, all will be NULL; otherwise, the queueUrl is malformed and cannot be trusted.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean something like

nr_lib_aws_sdk_php_sqs_parse_queueurl(NULL, NULL, NULL);

Comment on lines +110 to +131
if (command_name_len == AWS_SQS_SEND_MESSAGE_BATCH_COMMAND_LEN
&& 0
== nr_strncmp(AWS_SQS_SEND_MESSAGE_BATCH_COMMAND,
command_name_string,
AWS_SQS_SEND_MESSAGE_BATCH_COMMAND_LEN)) {
message_params.message_action = NR_SPANKIND_PRODUCER;
} else if (command_name_len == AWS_SQS_SEND_MESSAGE_COMMAND_LEN
&& 0
== nr_strncmp(AWS_SQS_SEND_MESSAGE_COMMAND,
command_name_string,
AWS_SQS_SEND_MESSAGE_COMMAND_LEN)) {
message_params.message_action = NR_SPANKIND_PRODUCER;
} else if (command_name_len == AWS_SQS_RECEIVE_MESSAGE_COMMAND_LEN
&& 0
== nr_strncmp(AWS_SQS_RECEIVE_MESSAGE_COMMAND,
command_name_string,
AWS_SQS_RECEIVE_MESSAGE_COMMAND_LEN)) {
message_params.message_action = NR_SPANKIND_CONSUMER;
} else {
/* Nothing to do here so exit. */
return;
}
Copy link
Member

@lavarou lavarou Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code has so much indirection and requires jumping back and forth to check if the macros have correct values and what those values actually are. Wouldn't this be easier to read and maintain:

Suggested change
if (command_name_len == AWS_SQS_SEND_MESSAGE_BATCH_COMMAND_LEN
&& 0
== nr_strncmp(AWS_SQS_SEND_MESSAGE_BATCH_COMMAND,
command_name_string,
AWS_SQS_SEND_MESSAGE_BATCH_COMMAND_LEN)) {
message_params.message_action = NR_SPANKIND_PRODUCER;
} else if (command_name_len == AWS_SQS_SEND_MESSAGE_COMMAND_LEN
&& 0
== nr_strncmp(AWS_SQS_SEND_MESSAGE_COMMAND,
command_name_string,
AWS_SQS_SEND_MESSAGE_COMMAND_LEN)) {
message_params.message_action = NR_SPANKIND_PRODUCER;
} else if (command_name_len == AWS_SQS_RECEIVE_MESSAGE_COMMAND_LEN
&& 0
== nr_strncmp(AWS_SQS_RECEIVE_MESSAGE_COMMAND,
command_name_string,
AWS_SQS_RECEIVE_MESSAGE_COMMAND_LEN)) {
message_params.message_action = NR_SPANKIND_CONSUMER;
} else {
/* Nothing to do here so exit. */
return;
}
#define COMMAND_IS(CMD) \
(command_name_len == (sizeof(CMD) - 1) && nr_streq(CMD, command_name_string))
/* Determine if we instrument this command. */
if (COMMAND_IS("sendBatch")) {
message_params.message_action = NR_SPANKIND_PRODUCER;
} else if (COMMAND_IS("sendMessage")) {
message_params.message_action = NR_SPANKIND_PRODUCER;
} else if (COMMAND_IS("receiveMessage")) {
message_params.message_action = NR_SPANKIND_CONSUMER;
} else {
/* Nothing to do here so exit. */
return;
}
#undef IS_COMMAND

Suggested change avoids the redirection, and also avoids defining and maintaining the macros for each command and its length.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This suggestion is something to consider but is not blocking approval.

Copy link
Contributor Author

@zsistla zsistla Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, the macro makes the code more readable, but it means we've got to compute the size of a known string every single time we go through the handle_sqs code.

The defines were both to avoid hardcoding values as well as to avoid having to compute sizeof/strlen for those three values every time we go through the code.

For the case where we match the last string, the numbers for computing the size everytime vs having it already computed is:
use macro duration =979650
precomputed sizes duration =957323

For the case where no strings match:
macro=62778
precomputed=56180


We could combine approaches to improve readability while maintaining the optimization.

The macro could be something similar to:


   #define COMMAND_IS(CMD) \
  (command_name_len == AWS_ ## CMD ## _COMMAND ## _LEN && (nr_streq(AWS_ ## CMD ## _COMMAND, command_name_string)))

then the code could look like:

  /* Determine if we instrument this command. */
  if (COMMAND_IS(SQS_SEND_MESSAGE_BATCH)) {
    message_params.message_action = NR_SPANKIND_PRODUCER;
  } else if (COMMAND_IS(SQS_SEND_MESSAGE)) {
    message_params.message_action = NR_SPANKIND_PRODUCER;
  } else if (COMMAND_IS(SQS_RECEIVE_MESSAGE)) {
    message_params.message_action = NR_SPANKIND_CONSUMER;
  } else {
    /* Nothing to do here so exit. */
    return;
  }
#undef IS_COMMAND

The above improves readability but as seen below this still keeps the performance improvements
macro comparison when len/string matches:
macro that computes lengths each time 924522
macro that uses precomputed lengths 866615

macro comparison when len/string matches:
macro that computes lengths each time 68946
macro that uses precomputed lengths 65715

Comment on lines +112 to +114
== nr_strncmp(AWS_SQS_SEND_MESSAGE_BATCH_COMMAND,
command_name_string,
AWS_SQS_SEND_MESSAGE_BATCH_COMMAND_LEN)) {
Copy link
Member

@lavarou lavarou Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the advantage of using nr_strncmp over nr_strncmp or nr_streq? nr_streq makes the code most readable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strncmp was needed since the defined strings had an extra char on the end that we could account for if we used the precomputed length instead of the length computed in nr_streq.
This could be made more readable with a macro.

Comment on lines +370 to +415
if (NULL != klass) {
/* Get the arg command_name. */
command_name = nr_php_arg_get(1, NR_EXECUTE_ORIG_ARGS);

if (nr_php_is_zval_non_empty_string(command_name)) {
command_name_string = Z_STRVAL_P(command_name);
klass_len = nr_php_class_entry_name_length(class_entry);

if (klass_len == AWS_SDK_PHP_SQSCLIENT_CLASS_LEN
&& nr_striendswith(klass, klass_len,
AWS_SDK_PHP_SQSCLIENT_CLASS_SHORT,
AWS_SDK_PHP_SQSCLIENT_CLASS_SHORT_LEN)) {
nr_lib_aws_sdk_php_sqs_handle(auto_segment, command_name_string,
Z_STRLEN_P(command_name),
NR_EXECUTE_ORIG_ARGS);
}

if (NR_SEGMENT_CUSTOM == auto_segment->type) {
/*
* We need to end the segment that we started in the 'before' wrapper if
* it wasn't handled and ended by the handling function. Handling
* function would have changed the segment type from from default
* (`NR_SEGMENT_CUSTOM`) if it ended it.
*/
nr_segment_discard(&auto_segment);
}

/*
* Since we have klass and command_name, we can give the calling segment
* a more meaningful name than Aws/AwsClient::__call We can decode it to
* Aws/CALLING_CLASS_NAME/CALLING_CLASS_CLIENT::CALLING_CLASS_COMMAND
*
* EX: Aws\\Sqs\\SqsClient::sendMessage
*/

segment = nr_txn_get_current_segment(NRPRG(txn), NULL);
if (NULL != segment) {
real_class_and_command
= nr_formatf("Custom/%s::%s", klass, command_name_string);
nr_segment_set_name(segment, real_class_and_command);
nr_free(real_class_and_command);
}
}
/* Release the command_name. */
nr_php_arg_release(&command_name);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

newrelic-php-agent's code base doesn't seem to have a standard around structuring the code. Golang has the following standard defined in Effective Go:

[...] a common situation where code must guard against a sequence of error conditions. The code reads well if the successful flow of control runs down the page, eliminating error cases as they arise. Since error cases tend to end in return statements, the resulting code needs no else statements.

The readability of the above could could definitely benefit from applying Effective Go standards. Consider:

  if (NULL == klass) {
    return;
  }
  ...
  if (!nr_php_is_zval_non_empty_string(command_name)) {
    nr_php_arg_release(&command_name);
    return;
  }
  

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This suggestion is something to consider but is not blocking approval.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nr_lib_aws_sdk_php_sqs_parse_queueurl follows the Effective Go standards and is easy to read.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, but it's not a simple return that can occur since it needs a return value.
I can however, make a goto end so that we'll skip to the bottom then all the logic here can execute for the actual return value.

Add a test case for get_command_arg_value  when queueUrl exists but is not a string.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants