
Commit eae47c9

Extracting 'words' in the commonly-understood sense of the word
A function I use a lot in my projects.
1 parent c58defb commit eae47c9

1 file changed, 13 additions and 1 deletion

lib/core/facets/string/words.rb

@@ -7,6 +7,18 @@ class String
   def words
     self.split(/\s+/)
   end
-
+
+  # Returns an array of words in the commonly-understood sense (not including punctuation).
+  # This takes into account international punctuation characters as well as English ones.
+  #
+  #   'Slowly, grudgingly he said: "This has to stop."'.words
+  #   => ["Slowly", "grudgingly", "he", "said", "This", "has", "to", "stop"]
+  def words_without_punctuation
+    s = self.dup
+    s.gsub!(/[.?¿¡…!,::;"。?!、‘“”〈〉《》,\/\[\]]/, ' ')
+    s.gsub!('- ', ' ')
+    s.squeeze!(" ")
+    s.strip.split(" ")
+  end
 end
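
To see how the two methods differ in practice, here is a minimal standalone sketch that copies both method bodies from the diff and runs them on the sentence from the doc comment. The second sample string is an illustrative assumption, not part of the commit. Note that the output shown in the doc comment corresponds to `words_without_punctuation`; plain `words` only splits on whitespace and keeps punctuation attached.

```ruby
# Standalone sketch: both methods are copied from the diff so the script
# runs without the rest of the library.
class String
  # Splits on runs of whitespace only; punctuation stays attached.
  def words
    self.split(/\s+/)
  end

  # Replaces listed punctuation with spaces, drops a hyphen that is
  # followed by a space, collapses repeated spaces, then splits.
  def words_without_punctuation
    s = self.dup
    s.gsub!(/[.?¿¡…!,::;"。?!、‘“”〈〉《》,\/\[\]]/, ' ')
    s.gsub!('- ', ' ')
    s.squeeze!(" ")
    s.strip.split(" ")
  end
end

sentence = 'Slowly, grudgingly he said: "This has to stop."'

p sentence.words
# => ["Slowly,", "grudgingly", "he", "said:", "\"This", "has", "to", "stop.\""]

p sentence.words_without_punctuation
# => ["Slowly", "grudgingly", "he", "said", "This", "has", "to", "stop"]

# Hypothetical extra sample: a hyphen inside a word is kept, while a
# free-standing "- " is removed.
p "A well-known fact - really!".words_without_punctuation
# => ["A", "well-known", "fact", "really"]
```

Because only `'- '` (hyphen followed by a space) is stripped, hyphenated compounds survive as single words, and the apostrophe is not in the character class, so contractions such as "don't" are left intact.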
