Add regex stuff #529

ketkarameya · 2023-07-05T20:00:54Z

Implement the logic for matching code with regex.

Add tests for:

query as regex
filter as regex
propagate when hole is captured by regex
propagate built in rules when query is regex

Stack from ghstack (oldest at bottom):

-> Add regex stuff #529

ketkarameya · 2023-07-07T16:21:27Z

src/models/capture_group_patterns.rs

  fn validate(&self) -> Result<(), String> {
    if self.pattern().starts_with("rgx ") {
-      panic!("Regex not supported")
+      let mut _val = &self.pattern()[4..];


Suggested change

-      let mut _val = &self.pattern()[4..];
+      let mut _val = &self.extract_regex();

lazaroclapp · 2023-07-07T16:13:54Z

test-resources/java/regex_based_matcher/configurations/rules.toml

+
+# The below three rules do a dummy type migration from List<Integer> to NewList 
+
+# Updates the import statement from `java.util.List` to `com.uber.NEwList`


Suggested change

-# Updates the import statement from `java.util.List` to `com.uber.NEwList`
+# Updates the import statement from `java.util.List` to `com.uber.NewList`

lazaroclapp · 2023-07-07T16:17:43Z

test-resources/java/regex_based_matcher/configurations/rules.toml

+replace_node = "m_name"
+replace = "addToNewList"
+holes = ["name"]
+is_seed_rule = false


Every replace_node in this test suite is a constant, even though the surrounding context is matched by more complex regexes. For generality, we might want a regex that matches a more general pattern on the replaced node. How about converting OurMapOfX to HashMap<X> for various Xs (e.g., the code could have OurMapOfInteger, OurMapOfString, and OurMapOfOurMapOfLong)?

Or is something like that not meant to be supported?

That's a brilliant idea! We can absolutely do that!

Done ✅

lazaroclapp · 2023-07-07T16:21:18Z

test-resources/java/regex_based_matcher/input/Sample.java

+
+        // Will not get updated
+        List<String> b = getListStr();
+        Integer item = getItemStr();


Suggested change

-        Integer item = getItemStr();
+        String itemStr = getItemStr();

Or something like that. I know this code doesn't get compiled, but it still should make sense types-wise :)

lazaroclapp · 2023-07-07T16:24:06Z

src/utilities/regex_utilities.rs

+/// * `replace_node` - node to replace
+///
+/// # Returns
+/// The range of the match in the source code and the corresponding mapping from tags to code snippets.


Is this correct? Return seems to just be a vector of matches. Where is this mapping encoded?
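For context on the question above, here is a minimal sketch of one way the mapping could travel with each match, assuming each `Match` carries its own map in a `matches` field (that field name appears in `src/models/matches.rs` in this PR). `NamedCapture` and `build_match` are hypothetical names used only for illustration, and the sketch uses plain byte offsets rather than the `regex` crate.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a named regex capture:
/// the group name plus its byte offsets in the source.
struct NamedCapture<'a> {
    name: &'a str,
    start: usize,
    end: usize,
}

/// Simplified match record (not the real struct): the matched byte range
/// plus a map from capture-group name (tag) to the captured code snippet.
#[derive(Debug)]
struct Match {
    range: (usize, usize),
    matches: HashMap<String, String>,
}

/// Builds one `Match` by slicing the source text for every named capture.
fn build_match(source: &str, whole: (usize, usize), captures: &[NamedCapture<'_>]) -> Match {
    let matches = captures
        .iter()
        .map(|c| (c.name.to_string(), source[c.start..c.end].to_string()))
        .collect();
    Match { range: whole, matches }
}
```

Under this reading, a `Vec<Match>` return value would still encode the tag-to-snippet mapping, one map per element.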

lazaroclapp · 2023-07-07T16:24:28Z

src/utilities/regex_utilities.rs

+pub(crate) fn get_all_matches_for_regex(
+  node: &Node, source_code: String, regex: &Regex, recursive: bool, replace_node: Option<String>,
+) -> Vec<Match> {
+  // let code_snippet = node.utf8_text(source_code.as_bytes()).unwrap();


Delete commented out line

lazaroclapp · 2023-07-07T16:27:12Z

src/utilities/regex_utilities.rs

+  all_matches
+}
+
+// Creates an hashmap from the capture group(name) to the corresponding code snippet.


Suggested change

-// Creates an hashmap from the capture group(name) to the corresponding code snippet.
+// Creates a hashmap from the capture group (name) to the corresponding code snippet.

lazaroclapp · 2023-07-07T16:31:06Z

src/utilities/regex_utilities.rs

+    let range_matches_inside_node = node.start_byte() <= captures.get(0).unwrap().start()
+      && node.end_byte() >= captures.get(0).unwrap().end();
+    if (recursive && range_matches_inside_node) || range_matches_node {
+      let group_by_tag = if let Some(ref rn) = replace_node {


group_by_tag is either the matched code string (plus location info) for the match group corresponding to the replace node, if present, or the matched code string for the full regex match if not, correct? Why is it named group_by_tag?

Yes! Your understanding is correct. It represents the match corresponding to the replace node (if present) or the entire match.
Renamed to replace_node_match.
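The acceptance condition in the snippet under discussion can be sketched in isolation. This is a simplified illustration, not the PR's code: `Span` and `accepts` are hypothetical names, and it assumes `range_matches_node` means the match covers exactly the node's byte range (that definition is not shown in the quoted diff).

```rust
/// Byte span as `start..end` offsets into the source (end exclusive).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Span {
    start: usize,
    end: usize,
}

/// Decides whether a regex match should be reported for a node.
/// An exact span match is always accepted; a match merely contained in
/// the node is accepted only when `recursive` is set.
fn accepts(node: Span, mtch: Span, recursive: bool) -> bool {
    // Assumption: "matches node" means identical start and end offsets.
    let range_matches_node = node.start == mtch.start && node.end == mtch.end;
    // Mirrors the two comparisons quoted from the diff.
    let range_matches_inside_node = node.start <= mtch.start && node.end >= mtch.end;
    (recursive && range_matches_inside_node) || range_matches_node
}
```

With `recursive` unset, only matches that span the whole node pass; with it set, any match fully contained in the node passes, and a match that runs past either end of the node is rejected in both modes.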

lazaroclapp · 2023-07-07T16:32:57Z

src/models/matches.rs

+      range: Range::from_regex_match(mtch, source_code),
+      matches,
+      associated_comma: None,
+      associated_comments: Vec::new(),


Do we populate associated comments later and does that work for regex matches?

Yes.
But for testing purposes, I added a scenario where we expect the associated comment to be cleaned up.

Not blocking, but still curious about this.

lazaroclapp · 2023-07-07T16:41:59Z

src/models/capture_group_patterns.rs

  fn validate(&self) -> Result<(), String> {
    if self.pattern().starts_with("rgx ") {
-      panic!("Regex not supported")
+      let mut _val = &self.pattern()[4..];


Why do we have this logic both here and in extract_regex()? It probably should not be duplicated, in case we ever change the identifier. The hard-coded 4 should also be something like len(REGEX_QUERY_PREFIX), with that prefix constant used everywhere else we match "rgx ".
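A minimal sketch of what this suggestion could look like, assuming a single shared constant and a prefix-stripping helper. `REGEX_QUERY_PREFIX` is the name floated in the comment; the `CGPattern` wrapper below is a simplified stand-in, and the empty-regex check is an illustrative assumption rather than behavior taken from the PR.

```rust
/// Single source of truth for the regex marker (name from the review comment).
const REGEX_QUERY_PREFIX: &str = "rgx ";

/// Simplified stand-in for the pattern type; the real CGPattern may differ.
struct CGPattern(String);

impl CGPattern {
    fn pattern(&self) -> &str {
        &self.0
    }

    /// Returns the regex source if the pattern carries the regex prefix,
    /// or `None` for any other kind of pattern.
    fn extract_regex(&self) -> Option<&str> {
        self.pattern().strip_prefix(REGEX_QUERY_PREFIX)
    }

    /// Validation reuses `extract_regex` instead of slicing with a magic `4`.
    /// Rejecting an empty regex is an assumption made for this sketch.
    fn validate(&self) -> Result<(), String> {
        match self.extract_regex() {
            Some(rgx) if rgx.trim().is_empty() => Err("empty regex".to_string()),
            _ => Ok(()),
        }
    }
}
```

Because `str::strip_prefix` both tests for and removes the prefix, changing the marker later would only require touching the constant.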

lazaroclapp

A bunch of documentation/comment nits, but LGTM once those are addressed!

lazaroclapp · 2023-07-10T18:42:06Z

test-resources/java/regex_based_matcher/configurations/rules.toml

+# The below three rules do a dummy type migration from OurListOfInteger to List<Integer> 

-# Updates the import statement from `java.util.List` to `com.uber.NEwList`
+# Updates the import statement from `java.util.List` to `com.uber.NewList`


Suggested change

-# Updates the import statement from `java.util.List` to `com.uber.NewList`
+# Updates the import statement from `com.uber.OurListOfInteger` to `java.util.List`

lazaroclapp · 2023-07-10T18:44:35Z

test-resources/java/regex_based_matcher/configurations/rules.toml

+is_seed_rule = false
+
+
+# The below three rules do a dummy type migration  like - from OurMapOfStringInteger to HashMap<String, Integer>


Suggested change

-# The below three rules do a dummy type migration like - from OurMapOfStringInteger to HashMap<String, Integer>
+# The below three rules do a dummy type migration from OurMapOf{T1}{T2} to HashMap<T1, T2>. For example, from OurMapOfStringInteger to HashMap<String, Integer>. This is to exercise non-constant regex matches for replace_node.

lazaroclapp · 2023-07-10T18:45:58Z

test-resources/java/regex_based_matcher/configurations/rules.toml

+replace_node = "n"
+replace = ""
+
+# Adds Import to java.util.hashmap if absent


Suggested change

-# Adds Import to java.util.hashmap if absent
+# Adds Import to java.util.HashMap if absent

lazaroclapp · 2023-07-10T18:54:30Z

src/models/matches.rs

+      range: Range::from_regex_match(mtch, source_code),
+      matches,
+      associated_comma: None,
+      associated_comments: Vec::new(),


Not blocking, but still curious about this.

ketkarameya

aah forgot to push my comments

ketkarameya · 2023-07-09T22:59:25Z

src/models/matches.rs

+      range: Range::from_regex_match(mtch, source_code),
+      matches,
+      associated_comma: None,
+      associated_comments: Vec::new(),


Yes.
But for testing purposes, I added a scenario where we expect the associated comment to be cleaned up.

ketkarameya · 2023-07-09T23:07:27Z

src/utilities/regex_utilities.rs

+    let range_matches_inside_node = node.start_byte() <= captures.get(0).unwrap().start()
+      && node.end_byte() >= captures.get(0).unwrap().end();
+    if (recursive && range_matches_inside_node) || range_matches_node {
+      let group_by_tag = if let Some(ref rn) = replace_node {


Yes! Your understanding is correct. It represents the match corresponding to the replace node (if present) or the entire match.
Renamed to replace_node_match.

ketkarameya · 2023-07-09T23:14:32Z

test-resources/java/regex_based_matcher/configurations/rules.toml

+replace_node = "m_name"
+replace = "addToNewList"
+holes = ["name"]
+is_seed_rule = false


That's a brilliant idea! We can absolutely do that!

Done ✅

ketkarameya · 2023-07-11T00:50:16Z

Addressed comments 3ac21b0 and rebased

Add regex stuff

4a413d0

[ghstack-poisoned]

This was referenced Jul 5, 2023

Refactor: Rename and Move TSQuery to models::capture_group_pattern::CGPattern #526

Merged

Introduce CompiledCGPatterns capturing the TS-Query and regex (placeholder) #527

Merged

ketkarameya added a commit that referenced this pull request Jul 5, 2023

Add regex stuff

b14c65a

ghstack-source-id: f3f20b5 Pull Request resolved: #529

Update on "Add regex stuff"

f32bb95

[ghstack-poisoned]

ketkarameya added a commit that referenced this pull request Jul 6, 2023

Add support for regex based CGPatterns

06c2772

ghstack-source-id: 828e81f Pull Request resolved: #529

Update on "Add regex stuff"

80adac8

[ghstack-poisoned]

ketkarameya added a commit that referenced this pull request Jul 6, 2023

Add support for regex based CGPatterns

e4c75d0

ghstack-source-id: aaad598 Pull Request resolved: #529

Update on "Add regex stuff"

786dee6

[ghstack-poisoned]

ketkarameya added a commit that referenced this pull request Jul 6, 2023

Add support for regex based CGPatterns

90b2619

ghstack-source-id: 3f22a0a Pull Request resolved: #529

Update on "Add regex stuff"

2b0b086

[ghstack-poisoned]

ketkarameya added a commit that referenced this pull request Jul 6, 2023

Add support for regex based CGPatterns

05a2f29

ghstack-source-id: 1f8b03f Pull Request resolved: #529

Update on "Add regex stuff"

6857c6e

[ghstack-poisoned]

ketkarameya added a commit that referenced this pull request Jul 6, 2023

Add support for regex based CGPatterns

f69678c

ghstack-source-id: 6d0afcb Pull Request resolved: #529

Update on "Add regex stuff"

99fd457

ketkarameya added a commit that referenced this pull request Jul 7, 2023

Add support for regex based CGPatterns

39073a6

ghstack-source-id: ff60d40 Pull Request resolved: #529

ketkarameya commented Jul 7, 2023

lazaroclapp reviewed Jul 7, 2023

Update on "Add regex stuff"

8ed0493

ketkarameya added a commit that referenced this pull request Jul 7, 2023

Add support for regex based CGPatterns

d64478a

ghstack-source-id: f18d07f Pull Request resolved: #529

Update on "Add regex stuff"

c55bca1

ketkarameya added a commit that referenced this pull request Jul 9, 2023

Add support for regex based CGPatterns

9ac257c

ghstack-source-id: d5311a1 Pull Request resolved: #529

Update on "Add regex stuff"

99d2218

ketkarameya added a commit that referenced this pull request Jul 10, 2023

Add support for regex based CGPatterns

976f5b4

ghstack-source-id: 1c70c5c Pull Request resolved: #529

Update on "Add regex stuff"

1454962

ketkarameya added a commit that referenced this pull request Jul 10, 2023

Add support for regex based CGPatterns

73aa81a

ghstack-source-id: 257afeb Pull Request resolved: #529

ketkarameya mentioned this pull request Jul 10, 2023

WIP code #536

Open

lazaroclapp approved these changes Jul 10, 2023

ketkarameya commented Jul 10, 2023

Update on "Add regex stuff"

3ac21b0

ketkarameya added a commit that referenced this pull request Jul 11, 2023

Add support for regex based CGPatterns

eaf000c

ghstack-source-id: 78529c1 Pull Request resolved: #529

ketkarameya merged commit 3ac21b0 into gh/ketkarameya/26/base Jul 11, 2023

ketkarameya added a commit that referenced this pull request Jul 11, 2023

Add support for regex based CGPatterns

f4b48d2

ghstack-source-id: 78529c1 Pull Request resolved: #529

ketkarameya deleted the gh/ketkarameya/26/head branch July 11, 2023 02:31

collabrpay added a commit to collabrpay/solid-tribble that referenced this pull request Aug 11, 2024

Add support for regex based CGPatterns

97ea973

ghstack-source-id: 78529c1 Pull Request resolved: uber/piranha#529
