Skip to content

Conversation

@ketkarameya
Copy link
Contributor

@ketkarameya ketkarameya commented Jul 5, 2023

Implement the logic for matching code with regex.

Add tests for :

  • query as regex
  • filter as regex
  • propagate when hole is captured by regex
  • propagate built in rules when query is regex

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]
[ghstack-poisoned]
ketkarameya added a commit that referenced this pull request Jul 6, 2023
ghstack-source-id: 828e81f
Pull Request resolved: #529
[ghstack-poisoned]
ketkarameya added a commit that referenced this pull request Jul 6, 2023
ghstack-source-id: aaad598
Pull Request resolved: #529
[ghstack-poisoned]
ketkarameya added a commit that referenced this pull request Jul 6, 2023
ghstack-source-id: 3f22a0a
Pull Request resolved: #529
[ghstack-poisoned]
ketkarameya added a commit that referenced this pull request Jul 6, 2023
ghstack-source-id: 1f8b03f
Pull Request resolved: #529
[ghstack-poisoned]
ketkarameya added a commit that referenced this pull request Jul 6, 2023
ghstack-source-id: 6d0afcb
Pull Request resolved: #529
Implement the logic for matching code with regex. 

Add tests for : 
* query as regex 
* filter as regex 
* propagate when hole is captured by regex 
* propagate built in rules when query is regex 

-------




[ghstack-poisoned]
ketkarameya added a commit that referenced this pull request Jul 7, 2023
ghstack-source-id: ff60d40
Pull Request resolved: #529
fn validate(&self) -> Result<(), String> {
if self.pattern().starts_with("rgx ") {
panic!("Regex not supported")
let mut _val = &self.pattern()[4..];
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let mut _val = &self.pattern()[4..];
let mut _val = &self.extract_regex();


# The below three rules do a dummy type migration from List<Integer> to NewList

# Updates the import statement from `java.util.List` to `com.uber.NEwList`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Updates the import statement from `java.util.List` to `com.uber.NEwList`
# Updates the import statement from `java.util.List` to `com.uber.NewList`

replace_node = "m_name"
replace = "addToNewList"
holes = ["name"]
is_seed_rule = false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every replace_node in this test suite is a constant, even though the surrounding context is matched by more complex regexps. For generality, we might want a regex that matches a more general pattern on the replaced node. How about converting OurMapOfX to HashMap<X> for various Xs (i.e. the code could have OurMapOfInteger, OurMapOfString, and OurMapOfOurMapOfLong)

Or is something like that not meant to be supported?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats a brilliant idea ! We can absolutely do that!

Done ✅


// Will not get updated
List<String> b = getListStr();
Integer item = getItemStr();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Integer item = getItemStr();
String itemStr = getItemStr();

Or something like that. I know this code doesn't get compiled, but it still should make sense types-wise :)

/// * `replace_node` - node to replace
///
/// # Returns
/// The range of the match in the source code and the corresponding mapping from tags to code snippets.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this correct? Return seems to just be a vector of matches. Where is this mapping encoded?

pub(crate) fn get_all_matches_for_regex(
node: &Node, source_code: String, regex: &Regex, recursive: bool, replace_node: Option<String>,
) -> Vec<Match> {
// let code_snippet = node.utf8_text(source_code.as_bytes()).unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete commented out line

all_matches
}

// Creates an hashmap from the capture group(name) to the corresponding code snippet.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Creates an hashmap from the capture group(name) to the corresponding code snippet.
// Creates a hashmap from the capture group (name) to the corresponding code snippet.

let range_matches_inside_node = node.start_byte() <= captures.get(0).unwrap().start()
&& node.end_byte() >= captures.get(0).unwrap().end();
if (recursive && range_matches_inside_node) || range_matches_node {
let group_by_tag = if let Some(ref rn) = replace_node {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Group by tag is either the matched code string (plus location info) corresponding to the match group for the replace node if present or a the matched code string matching the full regex match if not, correct? Why is the name group_by_tag?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! your understanding is correct. It represents the match corresponding to the replace node (if present) or the entire match.
renamed to - replace_node_match

range: Range::from_regex_match(mtch, source_code),
matches,
associated_comma: None,
associated_comments: Vec::new(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we populate associated comments later and does that work for regex matches?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes.
But for testing purposes, I added a scenario where we expect the associated comment to be cleaned up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not blocking, but still curious about this.

fn validate(&self) -> Result<(), String> {
if self.pattern().starts_with("rgx ") {
panic!("Regex not supported")
let mut _val = &self.pattern()[4..];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have this logic here and in extract_regex() both? This probably should not be duplicated in case we ever change the identifier, and also should be something like len(REGEX_QUERY_PREFIX) and that prefix used elsewhere to match "rgx "

Implement the logic for matching code with regex. 

Add tests for : 
* query as regex 
* filter as regex 
* propagate when hole is captured by regex 
* propagate built in rules when query is regex 

-------




[ghstack-poisoned]
ketkarameya added a commit that referenced this pull request Jul 7, 2023
ghstack-source-id: f18d07f
Pull Request resolved: #529
Implement the logic for matching code with regex. 

Add tests for : 
* query as regex 
* filter as regex 
* propagate when hole is captured by regex 
* propagate built in rules when query is regex 

-------




[ghstack-poisoned]
ketkarameya added a commit that referenced this pull request Jul 9, 2023
ghstack-source-id: d5311a1
Pull Request resolved: #529
Implement the logic for matching code with regex. 

Add tests for : 
* query as regex 
* filter as regex 
* propagate when hole is captured by regex 
* propagate built in rules when query is regex 

-------




[ghstack-poisoned]
ketkarameya added a commit that referenced this pull request Jul 10, 2023
ghstack-source-id: 1c70c5c
Pull Request resolved: #529
Implement the logic for matching code with regex. 

Add tests for : 
* query as regex 
* filter as regex 
* propagate when hole is captured by regex 
* propagate built in rules when query is regex 

-------




[ghstack-poisoned]
ketkarameya added a commit that referenced this pull request Jul 10, 2023
ghstack-source-id: 257afeb
Pull Request resolved: #529
@ketkarameya ketkarameya mentioned this pull request Jul 10, 2023
Copy link
Contributor

@lazaroclapp lazaroclapp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bunch of documentation/comment nits, but LGTM once those are addressed!

# The below three rules do a dummy type migration from OurListOfInteger to List<Integer>

# Updates the import statement from `java.util.List` to `com.uber.NEwList`
# Updates the import statement from `java.util.List` to `com.uber.NewList`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Updates the import statement from `java.util.List` to `com.uber.NewList`
# Updates the import statement from `com.uber.OurListOfInteger` to `java.util.List`

is_seed_rule = false


# The below three rules do a dummy type migration like - from OurMapOfStringInteger to HashMap<String, Integer>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# The below three rules do a dummy type migration like - from OurMapOfStringInteger to HashMap<String, Integer>
# The below three rules do a dummy type migration from OurMapOf{T1}{T2} to HashMap<T1, T2>. For example, from OurMapOfStringInteger to HashMap<String, Integer>. This is to exercise non-constant regex matches for replace_node.

replace_node = "n"
replace = ""

# Adds Import to java.util.hashmap if absent
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Adds Import to java.util.hashmap if absent
# Adds Import to java.util.HashMap if absent

range: Range::from_regex_match(mtch, source_code),
matches,
associated_comma: None,
associated_comments: Vec::new(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not blocking, but still curious about this.

Copy link
Contributor Author

@ketkarameya ketkarameya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aah forgot to push my comments

range: Range::from_regex_match(mtch, source_code),
matches,
associated_comma: None,
associated_comments: Vec::new(),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes.
But for testing purposes, I added a scenario where we expect the associated comment to be cleaned up.

let range_matches_inside_node = node.start_byte() <= captures.get(0).unwrap().start()
&& node.end_byte() >= captures.get(0).unwrap().end();
if (recursive && range_matches_inside_node) || range_matches_node {
let group_by_tag = if let Some(ref rn) = replace_node {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! your understanding is correct. It represents the match corresponding to the replace node (if present) or the entire match.
renamed to - replace_node_match

replace_node = "m_name"
replace = "addToNewList"
holes = ["name"]
is_seed_rule = false
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats a brilliant idea ! We can absolutely do that!

Done ✅

Implement the logic for matching code with regex. 

Add tests for : 
* query as regex 
* filter as regex 
* propagate when hole is captured by regex 
* propagate built in rules when query is regex 

-------




[ghstack-poisoned]
ketkarameya added a commit that referenced this pull request Jul 11, 2023
ghstack-source-id: 78529c1
Pull Request resolved: #529
@ketkarameya
Copy link
Contributor Author

Addressed comments 3ac21b0 and rebased

@ketkarameya ketkarameya merged commit 3ac21b0 into gh/ketkarameya/26/base Jul 11, 2023
ketkarameya added a commit that referenced this pull request Jul 11, 2023
ghstack-source-id: 78529c1
Pull Request resolved: #529
@ketkarameya ketkarameya deleted the gh/ketkarameya/26/head branch July 11, 2023 02:31
collabrpay added a commit to collabrpay/solid-tribble that referenced this pull request Aug 11, 2024
ghstack-source-id: 78529c1
Pull Request resolved: uber/piranha#529
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants