From 35a79a60274606fa77859cf4d44eccde41b43380 Mon Sep 17 00:00:00 2001 From: TiredSounds Date: Sat, 20 Feb 2016 00:21:08 +0000 Subject: [PATCH 01/11] Add feature outline to TODO --- TODO | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/TODO b/TODO index 014c8df..8f83aee 100644 --- a/TODO +++ b/TODO @@ -1,2 +1,23 @@ - Add :set command / other method to configure from within wikicurses - Bash completion (currently only has zsh completion) + +### TS ### + +feature_images + += Description = + +A command which launches a separate window with the images from the current article. + += Requirements = + +- feh for image display. It is minimal and has slideshow and URL loading features. + += Implementation = + +1. Research mediawiki API and find a way to get the image URLs. +2. Create new command and make sure it works with tab completion. +3. Write a function to fetch the image URLs +4. Write a function to launch feh with the URLs as arguments. +5. Link command in with said functions. +6. Implement error handling in case of zero URLs or feh not present. From a96cf37b52812790cf15b794c95d2aeb7a702c8c Mon Sep 17 00:00:00 2001 From: TiredSounds Date: Sat, 20 Feb 2016 01:02:45 +0000 Subject: [PATCH 02/11] Update TODO --- TODO | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/TODO b/TODO index 8f83aee..ac2b8f2 100644 --- a/TODO +++ b/TODO @@ -12,12 +12,13 @@ A command which launches a separate window with the images from the current arti = Requirements = - feh for image display. It is minimal and has slideshow and URL loading features. +Update: No longer required as an alternative can be specified in the config. feh is the default. = Implementation = 1. Research mediawiki API and find a way to get the image URLs. -2. Create new command and make sure it works with tab completion. +2. Create new command and make sure it works with completion. <- Done 3. Write a function to fetch the image URLs 4. 
Write a function to launch feh with the URLs as arguments. -5. Link command in with said functions. -6. Implement error handling in case of zero URLs or feh not present. +5. Implement error handling in case of zero URLs or feh not present. +6. Link command in with said functions. From 9c5da2a8fdb1ea682b1b5802383eb1563faecf0a Mon Sep 17 00:00:00 2001 From: TiredSounds Date: Sat, 20 Feb 2016 01:24:06 +0000 Subject: [PATCH 03/11] Update TODO --- TODO | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/TODO b/TODO index ac2b8f2..c05cf15 100644 --- a/TODO +++ b/TODO @@ -16,9 +16,11 @@ Update: No longer required as an alternative can be specified in the config. feh = Implementation = -1. Research mediawiki API and find a way to get the image URLs. +1. Research mediawiki API and find a way to get the image URLs. <- Done 2. Create new command and make sure it works with completion. <- Done -3. Write a function to fetch the image URLs +3. Write function(s) to fetch the image URLs. This is done by searching + the page source code (regex) and matching with those found through the API. + This is done to filter out icons and such. 4. Write a function to launch feh with the URLs as arguments. -5. Implement error handling in case of zero URLs or feh not present. +5. Implement error handling in case of zero URLs or program not present. 6. Link command in with said functions. From 589275470b7cbe822c1347b631572d36fe85ed16 Mon Sep 17 00:00:00 2001 From: TiredSounds Date: Sun, 21 Feb 2016 03:16:56 +0000 Subject: [PATCH 04/11] Add command to show images from current page. Summary of changes: - Added three functions which implement the image display: `showImages`, `fetchImageTargets`, `fetchImageUrls`. - Added :image command which calls `showImages`. - Modified _query to accept a custom URL kwarg. (necessary for `fetchImageTargets`.) Description: The new :image command calls a program with the fetched images as a list of space separated URLs. 
By default this is `feh` as it is a lightweight and nifty image viewer. As of this commit, this feature only works on Wikipedia. I may try to extend that, though it may prove difficult as the API is rather lacking in this department, and different wikis seem to use images differently. I'll need to do some more research. The program and arguments (flags etc) can be specified in the config file. For example: [images] program = ristretto arguments = foo, bar, baz This will lead to the following being called upon execution: subprocess.Popen(['ristretto', 'foo', 'bar', 'baz', 'url1', 'url2',...]) TODO: More testing. Try to improve coverage so that we get more relevant images. Implement some kind of error reporting as there is literally zero as of this commit. --- TODO | 18 ++++++++--- wikicurses/main.py | 77 +++++++++++++++++++++++++++++++++++++++++++++- wikicurses/wiki.py | 7 ++++- 3 files changed, 96 insertions(+), 6 deletions(-) diff --git a/TODO b/TODO index c05cf15..a17c2d3 100644 --- a/TODO +++ b/TODO @@ -20,7 +20,17 @@ Update: No longer required as an alternative can be specified in the config. feh 2. Create new command and make sure it works with completion. <- Done 3. Write function(s) to fetch the image URLs. This is done by searching the page source code (regex) and matching with those found through the API. - This is done to filter out icons and such. -4. Write a function to launch feh with the URLs as arguments. -5. Implement error handling in case of zero URLs or program not present. -6. Link command in with said functions. + This is done to filter out icons and such. <- Done +4. Write a function to launch feh with the URLs as arguments. <- Done! +5. Implement error handling in case of zero URLs or program not present. <- Done +6. Link command in with said functions. <- Done + +while True: + bugs = test(mode='extensively') + bugs.fix() + +Note: + +re.findall(r'\b(?:File|Image):[^]|\n\r]+', text) + +That is the regex we will use. 
diff --git a/wikicurses/main.py b/wikicurses/main.py index bd7797c..8b0c39c 100644 --- a/wikicurses/main.py +++ b/wikicurses/main.py @@ -1,5 +1,6 @@ import os import re +import json import argparse import tempfile import subprocess @@ -35,6 +36,78 @@ def tabComplete(text, matches): return match +def showImages(): + """ + Launch program to display the images for the current article. + + The program is specified in the config, along with it's args. + Otherwise, the default is `feh`. + """ + targets = fetchImageTargets() + + if not targets: + return + + urls = fetchImageUrls() + filtered = [url for t in targets for url in urls if t in url] + + try: + command = [settings.conf.get('images', 'program')] + except (settings.configparser.NoOptionError, + settings.configparser.NoSectionError): + command = ['feh'] + + try: + args = settings.conf.get('images', 'arguments').split(', ') + except (settings.configparser.NoOptionError, + settings.configparser.NoSectionError): + if command == ['feh']: + args = ['-q', '--zoom', 'fill', '--image-bg', 'white', '-g', '300x300'] + else: + args = [] + + command.extend(args) + command.extend(filtered) + + try: + with open(os.devnull, 'w') as fp: + subprocess.Popen(command, stdout=fp, stderr=fp) + except FileNotFoundError: + return + + +def fetchImageTargets(): + """ + Get filenames of relevant images from the current article. + + These are used to filter out non-relevant images from the API result. 
+ """ + url = re.sub(r'api.php', 'index.php', wiki.siteurl) + page_title = re.sub(r' ', '_', page.title) + raw = wiki._query(customurl=url, action="raw", title=page_title) + + targets = re.findall(r'\b(?:File|Image):[^]|\n\r]+', raw) # relevant ones + targets = [re.sub(r'(?:File|Image):', '', i) for i in targets] + targets = [re.sub(r' ', '_', i) for i in targets] + + return targets + + +def fetchImageUrls(): + """Use API to fetch all image urls on current article.""" + page_title = re.sub(r' ', '_', page.title) + result = wiki._query(action="query", titles=page_title, generator="images", + prop="imageinfo", iiprop="url", format="json") + + json_result = json.loads(result) + urls = [] + + for v in json_result['query']['pages'].values(): + urls.append(v['imageinfo'][0]['url']) + + return urls + + class SearchBox(urwid.Edit): title = "Search" @@ -495,7 +568,7 @@ def cancel(button): 'extlinks': Extlinks, 'langs': Langs} cmds = tuple(overlaymap) + ('quit', 'bmark', 'open', 'edit', 'clearcache', - 'help', 'back', 'forward', 'random') + 'help', 'back', 'forward', 'random', 'images') def processCmd(cmd, *args): global current @@ -528,6 +601,8 @@ def processCmd(cmd, *args): openPage(history[current], browsinghistory=True) elif cmd == 'random': openPage(wiki.random()) + elif cmd == 'images': + showImages() elif cmd: ex.notify(cmd + ': Unknown Command') diff --git a/wikicurses/wiki.py b/wikicurses/wiki.py index 4b8b8d1..1113054 100644 --- a/wikicurses/wiki.py +++ b/wikicurses/wiki.py @@ -87,7 +87,12 @@ def mainpage(self): def _query(self, post=False, **kwargs): params = {k: v for k, v in kwargs.items() if v is not False} data = urllib.parse.urlencode(params) - url = self.siteurl + + try: + url = params['customurl'] + except KeyError: + url = self.siteurl + if post: data = data.encode() if not post: From 60f7f710ddadc86b8bb379bab4757970335d38b1 Mon Sep 17 00:00:00 2001 From: TiredSounds Date: Mon, 22 Feb 2016 22:29:58 +0000 Subject: [PATCH 05/11] Hard limit :image command to 
wikipedia. Add notifications for errors. --- wikicurses/main.py | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/wikicurses/main.py b/wikicurses/main.py index 8b0c39c..5932b75 100644 --- a/wikicurses/main.py +++ b/wikicurses/main.py @@ -43,9 +43,14 @@ def showImages(): The program is specified in the config, along with it's args. Otherwise, the default is `feh`. """ + if 'wikipedia.org' not in wiki.siteurl: + ex.notify('Cancelled: Only works on Wikipedia') + return + targets = fetchImageTargets() if not targets: + ex.notify('No relevant images found') return urls = fetchImageUrls() @@ -62,7 +67,7 @@ def showImages(): except (settings.configparser.NoOptionError, settings.configparser.NoSectionError): if command == ['feh']: - args = ['-q', '--zoom', 'fill', '--image-bg', 'white', '-g', '300x300'] + args = ['-q', '--zoom', 'fill', '--image-bg', 'white', '-g', '400x400'] else: args = [] @@ -73,6 +78,7 @@ def showImages(): with open(os.devnull, 'w') as fp: subprocess.Popen(command, stdout=fp, stderr=fp) except FileNotFoundError: + ex.notify('Program not found') return From e1a2397f1a7be763d1fed96fe07c794aba63174b Mon Sep 17 00:00:00 2001 From: TiredSounds Date: Wed, 24 Feb 2016 22:07:45 +0000 Subject: [PATCH 06/11] Improve image coverage, tweak default feh arguments. `findImageTargets` now looks for another common notation used by Wikipedia for images, which generally takes the form: """image = foobar.ext image1 = barfoo.ext image2=barbaz.ext""" The default arguments given to `feh` have changed: `--scale-down` has replaced `--zoom fill`. 
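To illustrate the two notations described above, here is a minimal standalone sketch that applies the two regular expressions from this patch to a small piece of wikitext. The function name `find_image_targets` and the sample text are hypothetical and only serve the illustration; the real code lives in `fetchImageTargets` and works on the raw page source fetched from the wiki.

```python
import re


def find_image_targets(raw):
    """Collect image filenames from raw wikitext (illustrative helper)."""
    targets = []

    # Link notation, e.g. `[[File:foobar.png|thumb]]` or `[[Image:foobar.png]]`
    for match in re.finditer(r'\b(?:File:|Image:)([^]|\n\r]+)', raw):
        targets.append(match.group(1))

    # Parameter notation, e.g. `image = foobar.png` or `image2=foobar.png`
    for match in re.finditer(
            r'(image\d?[\t ]*?=[\t ]*)(.+?\..+?)((?=[^A-Za-z0-9\.])|(?:$))',
            raw):
        targets.append(match.group(2))

    return targets


# Made-up sample wikitext, not taken from a real article
sample = (
    "{{Infobox\n"
    "| image = Red Fox.jpg\n"
    "| image2=Fox_den.png\n"
    "}}\n"
    "[[File:Fox map.svg|thumb|Range]]\n"
)

print(find_image_targets(sample))
```

Link-style matches come first in the result because that loop runs first, which is the same order the patch uses.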
--- wikicurses/main.py | 26 ++++++++++++++++++++++---- 1 file changed, 22 insertions(+), 4 deletions(-) diff --git a/wikicurses/main.py b/wikicurses/main.py index 5932b75..6ee265c 100644 --- a/wikicurses/main.py +++ b/wikicurses/main.py @@ -54,7 +54,13 @@ def showImages(): return urls = fetchImageUrls() - filtered = [url for t in targets for url in urls if t in url] + filtered = [] + + for target in targets: + for index, url in enumerate(urls): + if target in url: + filtered.append(url) + del urls[index] # prevent more than one occurrence of image try: command = [settings.conf.get('images', 'program')] @@ -67,7 +73,7 @@ def showImages(): except (settings.configparser.NoOptionError, settings.configparser.NoSectionError): if command == ['feh']: - args = ['-q', '--zoom', 'fill', '--image-bg', 'white', '-g', '400x400'] + args = ['-q', '--scale-down', '--image-bg', 'white', '-g', '400x400'] else: args = [] @@ -92,8 +98,20 @@ def fetchImageTargets(): page_title = re.sub(r' ', '_', page.title) raw = wiki._query(customurl=url, action="raw", title=page_title) - targets = re.findall(r'\b(?:File|Image):[^]|\n\r]+', raw) # relevant ones - targets = [re.sub(r'(?:File|Image):', '', i) for i in targets] + ### Finding targets... + targets = [] + + ## of the form `[[Image:foobar.png]]` + for match in re.finditer(r'\b(?:File:|Image:)([^]|\n\r]+)', raw): + targets.append(match.group(1)) + + ## of the form `image1 = foobar.png`, `image2=foobar.png`, + ## and `image = bar baz.png` etc... + for match in re.finditer( + r'(image\d?[\t ]*?=[\t ]*)(.+?\..+?)((?=[^A-Za-z0-9\.])|(?:$))', + raw): + targets.append(match.group(2)) + targets = [re.sub(r' ', '_', i) for i in targets] return targets From 2eb96060c48cf065ce46074bba48ad64e8360990 Mon Sep 17 00:00:00 2001 From: TiredSounds Date: Thu, 25 Feb 2016 23:13:29 +0000 Subject: [PATCH 07/11] Add memoization to image command. `page.title` now passed as an arg to `fetchImageTargets` and `fetchImageUrls` to make memoization possible. 
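A short sketch of why the caller has to copy the memoized result before deleting elements from it: `lru_cache` hands back the same list object on every cache hit, so mutating it in place would change what later calls see. The function `fetch_image_urls` below is a hypothetical stand-in that returns fixed example URLs instead of querying the wiki.

```python
from functools import lru_cache

calls = []  # records how often the "expensive" function body actually runs


@lru_cache()
def fetch_image_urls(page_title):
    # Stand-in for the network request; the URLs are made up.
    calls.append(page_title)
    return ['https://example.org/a.png', 'https://example.org/b.png']


# Work on a copy so the cached list stays intact.
urls = list(fetch_image_urls('Fox'))
del urls[0]

# The second call is served from the cache and is still complete.
cached = fetch_image_urls('Fox')
print(len(urls), len(cached), calls)
```

An alternative design would be to return a tuple from the cached function, which would make accidental mutation impossible; the copy-at-call-site approach is the one this patch takes.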
--- wikicurses/main.py | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/wikicurses/main.py b/wikicurses/main.py index 6ee265c..40202f6 100644 --- a/wikicurses/main.py +++ b/wikicurses/main.py @@ -5,6 +5,7 @@ import tempfile import subprocess import urllib.parse +from functools import lru_cache import urwid @@ -47,13 +48,16 @@ def showImages(): ex.notify('Cancelled: Only works on Wikipedia') return - targets = fetchImageTargets() + targets = fetchImageTargets(page.title) if not targets: ex.notify('No relevant images found') return - urls = fetchImageUrls() + ## We make a copy of the returned list here, + ## because it is memoized with `lru_cache` and we will be deleting + ## elements from it non-permanently. + urls = list(fetchImageUrls(page.title)) filtered = [] for target in targets: @@ -88,14 +92,15 @@ def showImages(): return -def fetchImageTargets(): +@lru_cache() +def fetchImageTargets(page_title): """ Get filenames of relevant images from the current article. These are used to filter out non-relevant images from the API result. """ url = re.sub(r'api.php', 'index.php', wiki.siteurl) - page_title = re.sub(r' ', '_', page.title) + page_title = re.sub(r' ', '_', page_title) raw = wiki._query(customurl=url, action="raw", title=page_title) ### Finding targets... 
@@ -117,9 +122,10 @@ def fetchImageTargets(): return targets -def fetchImageUrls(): +@lru_cache() +def fetchImageUrls(page_title): """Use API to fetch all image urls on current article.""" - page_title = re.sub(r' ', '_', page.title) + page_title = re.sub(r' ', '_', page_title) result = wiki._query(action="query", titles=page_title, generator="images", prop="imageinfo", iiprop="url", format="json") From d938844127e1d41a5924cb69c18e9111ac1a15e9 Mon Sep 17 00:00:00 2001 From: TiredSounds Date: Sun, 28 Feb 2016 09:04:05 +0000 Subject: [PATCH 08/11] Add image command/configuration to man pages --- wikicurses.1 | 4 ++++ wikicurses.conf.5 | 11 +++++++++++ 2 files changed, 15 insertions(+) diff --git a/wikicurses.1 b/wikicurses.1 index cc8a72a..612294d 100644 --- a/wikicurses.1 +++ b/wikicurses.1 @@ -35,6 +35,10 @@ G or end .RS 4 Scroll to bottom .RE +.PP +:images +.RS 4 +Launch a program (default: feh) with images from the article .SS Pager .PP c diff --git a/wikicurses.conf.5 b/wikicurses.conf.5 index b939b6d..049d214 100644 --- a/wikicurses.conf.5 +++ b/wikicurses.conf.5 @@ -49,6 +49,17 @@ Hide the References section at the bottom of the page and strip citations from t .RE .SS keymap This section configures the keyboard bindings of wikicurses, in the format "key=command". Command can be any ex command supported by wikicurses. +.SS images +.PP +program +.RS 4 +The name of the command to use to show images. It should be able to take space separated URLs as arguments. Defaults to feh if unspecified. +.RE +.PP +arguments +.RS 4 +The list of arguments given to the image display program. The URLs are appended to this. This is a comma and space separated list. Example: `--flag, -abc, -d, --option-with-arg, arg`. +.RE .SS Other Sections Other sections are treated as wiki entries. The url is the url for api.php on the wiki. The username and password are required for editing. 
.PP From 831e83b12234d0bb84d14a261d73e55a0338e246 Mon Sep 17 00:00:00 2001 From: TiredSounds Date: Sun, 28 Feb 2016 09:22:46 +0000 Subject: [PATCH 09/11] Make args for image command a space separated list. As opposed to a comma and space separated list. The change is reflected in the man page. --- wikicurses.conf.5 | 2 +- wikicurses/main.py | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/wikicurses.conf.5 b/wikicurses.conf.5 index 049d214..4d8de68 100644 --- a/wikicurses.conf.5 +++ b/wikicurses.conf.5 @@ -58,7 +58,7 @@ The name of the command to use to show images. It should be able to take space s .PP arguments .RS 4 -The list of arguments given to the image display program. The URLs are appended to this. This is a comma and space separated list. Example: `--flag, -abc, -d, --option-with-arg, arg`. +The list of arguments given to the image display program. The URLs are appended to this. This is a space separated list. Example: `--flag -abc -d --option-with-arg arg`. .RE .SS Other Sections Other sections are treated as wiki entries. The url is the url for api.php on the wiki. The username and password are required for editing. diff --git a/wikicurses/main.py b/wikicurses/main.py index 40202f6..2bcb022 100644 --- a/wikicurses/main.py +++ b/wikicurses/main.py @@ -73,7 +73,7 @@ def showImages(): command = ['feh'] try: - args = settings.conf.get('images', 'arguments').split(', ') + args = settings.conf.get('images', 'arguments').split(' ') except (settings.configparser.NoOptionError, settings.configparser.NoSectionError): if command == ['feh']: From d64e14ae2505461b88c2bec5ee6c112eec607801 Mon Sep 17 00:00:00 2001 From: TiredSounds Date: Mon, 29 Feb 2016 03:01:28 +0000 Subject: [PATCH 10/11] Change method of filtering image urls. The results of `fetchImageTargets` are now matched with the titles of the images from the current page, and the corresponding URL is used. This is as opposed to matching the targets with the URLs directly. 
This has increased coverage on certain pages. Supporting changes: - `fetchImageUrls` name changed to `fetchImageInfo`. - `fetchImageInfo` returns a list of 2-tuples with (title, url). --- wikicurses/main.py | 26 +++++++++++++++----------- 1 file changed, 15 insertions(+), 11 deletions(-) diff --git a/wikicurses/main.py b/wikicurses/main.py index 2bcb022..5e5e4be 100644 --- a/wikicurses/main.py +++ b/wikicurses/main.py @@ -57,14 +57,16 @@ def showImages(): ## We make a copy of the returned list here, ## because it is memoized with `lru_cache` and we will be deleting ## elements from it non-permanently. - urls = list(fetchImageUrls(page.title)) + image_info = list(fetchImageInfo(page.title)) filtered = [] for target in targets: - for index, url in enumerate(urls): - if target in url: + for index, i in enumerate(image_info): + title, url = i + if target in title: filtered.append(url) - del urls[index] # prevent more than one occurrence of image + del image_info[index] # prevent more than one occurrence of image + break try: command = [settings.conf.get('images', 'program')] @@ -117,25 +119,27 @@ def fetchImageTargets(page_title): raw): targets.append(match.group(2)) - targets = [re.sub(r' ', '_', i) for i in targets] - return targets @lru_cache() -def fetchImageUrls(page_title): - """Use API to fetch all image urls on current article.""" +def fetchImageInfo(page_title): + """ + Use API to fetch all image titles and urls on current article. + + Returns a list of `(title, url)` tuples. 
+ """ page_title = re.sub(r' ', '_', page_title) result = wiki._query(action="query", titles=page_title, generator="images", prop="imageinfo", iiprop="url", format="json") json_result = json.loads(result) - urls = [] + info = [] for v in json_result['query']['pages'].values(): - urls.append(v['imageinfo'][0]['url']) + info.append((v['title'], v['imageinfo'][0]['url'])) - return urls + return info class SearchBox(urwid.Edit): From c4d4d97ce8cdec53c33555f792a14f40c44a58af Mon Sep 17 00:00:00 2001 From: TiredSounds Date: Mon, 29 Feb 2016 03:18:30 +0000 Subject: [PATCH 11/11] Cleanup TODO --- TODO | 34 ---------------------------------- 1 file changed, 34 deletions(-) diff --git a/TODO b/TODO index a17c2d3..014c8df 100644 --- a/TODO +++ b/TODO @@ -1,36 +1,2 @@ - Add :set command / other method to configure from within wikicurses - Bash completion (currently only has zsh completion) - -### TS ### - -feature_images - -= Description = - -A command which launches a separate window with the images from the current article. - -= Requirements = - -- feh for image display. It is minimal and has slideshow and URL loading features. -Update: No longer required as an alternative can be specified in the config. feh is the default. - -= Implementation = - -1. Research mediawiki API and find a way to get the image URLs. <- Done -2. Create new command and make sure it works with completion. <- Done -3. Write function(s) to fetch the image URLs. This is done by searching - the page source code (regex) and matching with those found through the API. - This is done to filter out icons and such. <- Done -4. Write a function to launch feh with the URLs as arguments. <- Done! -5. Implement error handling in case of zero URLs or program not present. <- Done -6. Link command in with said functions. <- Done - -while True: - bugs = test(mode='extensively') - bugs.fix() - -Note: - -re.findall(r'\b(?:File|Image):[^]|\n\r]+', text) - -That is the regex we will use.