[wip] A command which displays images from the article #5

Open: wants to merge 11 commits into base: master
4 changes: 4 additions & 0 deletions wikicurses.1
@@ -35,6 +35,10 @@ G or end
.RS 4
Scroll to bottom
.RE
.PP
:images
.RS 4
Launch a program (default: feh) with images from the article
.SS Pager
.PP
c
11 changes: 11 additions & 0 deletions wikicurses.conf.5
@@ -49,6 +49,17 @@ Hide the References section at the bottom of the page and strip citations from t
.RE
.SS keymap
This section configures the keyboard bindings of wikicurses, in the format "key=command". Command can be any ex command supported by wikicurses.
.SS images
.PP
program
.RS 4
The name of the command to use to show images. It should be able to take space separated URLs as arguments. Defaults to feh if unspecified.
.RE
.PP
arguments
.RS 4
The list of arguments given to the image display program. The URLs are appended to this. This is a space separated list. Example: `--flag -abc -d --option-with-arg arg`.
.RE
.SS Other Sections
Other sections are treated as wiki entries. The url is the url for api.php on the wiki. The username and password are required for editing.
.PP
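
For illustration, a minimal sketch of how an `[images]` section like the one documented in this hunk could be turned into a command line. Only the `program` and `arguments` keys and the `feh` default come from the man page text; the sample config, the `build_command` helper and the URL are assumptions made up for this example.

```python
import configparser

# Hypothetical config contents; only the key names come from the man page.
SAMPLE_CONF = """
[images]
program = feh
arguments = --flag -abc -d --option-with-arg arg
"""


def build_command(conf, urls):
    """Return the argv list: program, then its arguments, then the URLs."""
    program = conf.get('images', 'program', fallback='feh')
    arguments = conf.get('images', 'arguments', fallback='')
    # str.split() with no argument also copes with repeated spaces.
    return [program] + arguments.split() + list(urls)


conf = configparser.ConfigParser()
conf.read_string(SAMPLE_CONF)
command = build_command(conf, ['https://example.org/a.png'])
```

Using `fallback=` avoids catching `NoSectionError` and `NoOptionError` by hand, and `split()` without an argument does not produce empty strings when the value contains doubled spaces.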
111 changes: 110 additions & 1 deletion wikicurses/main.py
@@ -1,9 +1,11 @@
import os
import re
import json
import argparse
import tempfile
import subprocess
import urllib.parse
from functools import lru_cache

import urwid

@@ -35,6 +37,111 @@ def tabComplete(text, matches):
    return match


def showImages():
    """
    Launch program to display the images for the current article.

    The program is specified in the config, along with its args.
    Otherwise, the default is `feh`.
    """
    if 'wikipedia.org' not in wiki.siteurl:
        ex.notify('Cancelled: Only works on Wikipedia')
        return

Owner:
Why? If I remove this code, it works fine on at least some other wikis.

Contributor Author:
I wasn't sure how standardised any of this was, to be honest. When I tried it on a couple of others, they raised errors (I believe to do with a different JSON structure). I didn't want to risk crashing the user's session. However, if you would prefer, I'll remove that limit and put the call to the showImages function in a try statement?

Owner:
Yes, using a try statement should be fine if there is not a better method.

What wikis are not working with it?
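
A rough sketch of the try-statement approach discussed here, assuming the failure is a missing key or index in an unexpected JSON structure. `notify` stands in for `ex.notify`; `safe_call` and `broken_fetch` are hypothetical names used only for this example.

```python
def safe_call(func, notify, *args):
    """Run func(*args); report a failure through notify instead of crashing."""
    try:
        return func(*args)
    except (KeyError, IndexError, ValueError) as err:
        # An unexpected JSON layout typically surfaces as one of these;
        # json.JSONDecodeError is a subclass of ValueError.
        notify('Could not fetch images: {!r}'.format(err))
        return None


def broken_fetch(title):
    """Stand-in for a fetch that hits a reply without a 'query' key."""
    response = {'error': {'code': 'badtitle'}}
    return response['query']['pages']


messages = []
result = safe_call(broken_fetch, messages.append, 'Example')
```

Catching a narrow set of exceptions keeps genuine programming errors visible while still protecting the session from malformed replies.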

Contributor Author:
I think I tried it on vimwiki (it does have some images). I can't recall the others. I've tested it on a few Wikia sites just now, though, and it seems I was too hasty in limiting it, as they work just fine.

Commenter:
AFAIK vimwiki does not run on MediaWiki. It may also not work on some older (and future) versions of MediaWiki. To increase compatibility, I think you should work with the API endpoint instead of index.php - it provides an error reporting mechanism that you could use to produce meaningful error messages. It is also much more extensively documented and obviously has more features.
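
To make the error-reporting point concrete, here is a small sketch that assumes the JSON reply from api.php carries a top-level `error` object with `code` and `info` fields when a request fails. `check_api_response` and the sample reply are illustrative and not part of this PR.

```python
import json


def check_api_response(text):
    """Parse an api.php JSON reply; raise RuntimeError if it reports an error."""
    reply = json.loads(text)
    if 'error' in reply:
        err = reply['error']
        raise RuntimeError('{}: {}'.format(err.get('code'), err.get('info')))
    return reply


# Illustrative reply, shaped like the error object described above.
SAMPLE_ERROR = '{"error": {"code": "badvalue", "info": "Unrecognized value"}}'

try:
    check_api_response(SAMPLE_ERROR)
    message = None
except RuntimeError as exc:
    message = str(exc)
```

The resulting message could be passed to `ex.notify` so the user sees why the request failed.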

Contributor Author:
http://vim.wikia.com/wiki/Vim_Tips_Wiki this one? It's on Wikia which uses MediaWiki, I thought.

Anyway, I'll have a go at changing to the API.


    targets = fetchImageTargets(page.title)

    if not targets:
        ex.notify('No relevant images found')
        return

    ## We make a copy of the returned list here,
    ## because it is memoized with `lru_cache` and we will be deleting
    ## elements from it non-permanently.
    image_info = list(fetchImageInfo(page.title))
    filtered = []

    for target in targets:
        for index, i in enumerate(image_info):
            title, url = i
            if target in title:
                filtered.append(url)
                del image_info[index] # prevent more than one occurrence of image
                break

    try:
        command = [settings.conf.get('images', 'program')]
    except (settings.configparser.NoOptionError,
            settings.configparser.NoSectionError):
        command = ['feh']

    try:
        args = settings.conf.get('images', 'arguments').split(' ')
    except (settings.configparser.NoOptionError,
            settings.configparser.NoSectionError):
        if command == ['feh']:
            args = ['-q', '--scale-down', '--image-bg', 'white', '-g', '400x400']
        else:
            args = []

    command.extend(args)
    command.extend(filtered)

    try:
        with open(os.devnull, 'w') as fp:
            subprocess.Popen(command, stdout=fp, stderr=fp)
    except FileNotFoundError:
        ex.notify('Program not found')
        return


@lru_cache()
def fetchImageTargets(page_title):
    """
    Get filenames of relevant images from the current article.

    These are used to filter out non-relevant images from the API result.
    """
    url = re.sub(r'api.php', 'index.php', wiki.siteurl)

Owner:
wiki.siteurl is the api url, not the url of the homepage.

Edit: Oops, somehow I misread the substitution as the reverse. But anyway, to support all wikis, the site url should be retrieved through the api instead of doing it this way.

Edit2: Actually, that might be a safe assumption. I'm not sure.

Contributor Author:
The index.php which is used to query for the raw data is (to my knowledge) always located in the same directory as api.php, but perhaps I'm mistaken?

Owner:
I don't know if index.php is always at that path. It probably usually is, at least. I'm not really sure.
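
One way to avoid guessing the path, sketched under the assumption that the wiki answers `action=query&meta=siteinfo&siprop=general` with `server` and `script` fields in its `general` block. `index_url` and the sample values are hypothetical and only illustrate the idea.

```python
import urllib.parse


def index_url(api_url, general):
    """Build the index.php URL from the 'general' block of a siteinfo reply."""
    # 'server' may be protocol-relative ("//host"); urljoin borrows the
    # scheme from api_url in that case.
    return urllib.parse.urljoin(api_url, general['server'] + general['script'])


# Illustrative values; a real reply would come from
# api.php?action=query&meta=siteinfo&siprop=general&format=json
SAMPLE_GENERAL = {'server': '//en.wikipedia.org', 'script': '/w/index.php'}
url = index_url('https://en.wikipedia.org/w/api.php', SAMPLE_GENERAL)
```

Since the value does not change during a session, the lookup would only need to happen once per wiki.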

    page_title = re.sub(r' ', '_', page_title)
    raw = wiki._query(customurl=url, action="raw", title=page_title)

    ### Finding targets...
    targets = []

    ## of the form `[[Image:foobar.png]]`
    for match in re.finditer(r'\b(?:File:|Image:)([^]|\n\r]+)', raw):

Commenter:
Namespaces are case-insensitive and can have localized aliases.

Contributor Author:
I'm not sure what specifically you're referring to, could you elaborate?

Commenter:
The regex matches only [[File:foobar]], but not [[file:foobar]]. Similar for the Image alias. Non-English wikis usually define other aliases, e.g. Bild in German. See this query:
https://de.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=namespacealiases
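
A sketch of how such a siteinfo reply could feed the regex, assuming the legacy JSON layout in which each namespace name sits under the `"*"` key and the File namespace has id 6. `file_link_pattern` and the sample data are illustrative, not taken from a real reply.

```python
import re


def file_link_pattern(namespaces, aliases, file_ns=6):
    """Compile a case-insensitive regex for [[File:...]]-style links."""
    entry = namespaces[str(file_ns)]
    names = {entry['*']}
    if 'canonical' in entry:
        names.add(entry['canonical'])
    # Add every alias registered for the same namespace id.
    names.update(a['*'] for a in aliases if a['id'] == file_ns)
    prefix = '|'.join(re.escape(n) for n in sorted(names))
    return re.compile(r'\[\[\s*(?:%s)\s*:\s*([^]|\n\r]+)' % prefix,
                      re.IGNORECASE)


# Illustrative data shaped like siprop=namespaces|namespacealiases output.
SAMPLE_NAMESPACES = {'6': {'id': 6, '*': 'Datei', 'canonical': 'File'}}
SAMPLE_ALIASES = [{'id': 6, '*': 'Bild'}, {'id': 6, '*': 'Image'}]

pattern = file_link_pattern(SAMPLE_NAMESPACES, SAMPLE_ALIASES)
found = pattern.findall('[[bild:Foo.png|thumb]] and [[File:Bar baz.jpg]]')
```

`re.IGNORECASE` covers the case issue, and the alternation built from the reply covers localized aliases without hard-coding them.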

Contributor Author:
Ah! I see. I'm still new to MediaWiki in general. Hm, well the case issue is easy, but I don't want to resolve the local aliases issue with another query. I'll have to think about a relatively foolproof solution to that.

Owner:
The additional query is probably unavoidable, but it only needs to be issued once for the wiki. So make it a method of Wiki with @lru_cache.
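
Roughly what that suggestion could look like. The `Wiki` class here is a cut-down stand-in for the project's class, and `file_namespace_names` and `fake_fetch` are hypothetical names; the point is only that the request runs once per instance.

```python
from functools import lru_cache


class Wiki:
    """Cut-down stand-in for the project's Wiki class."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable that performs the actual request

    @lru_cache(maxsize=None)
    def file_namespace_names(self):
        """Return File namespace names; cached per instance after first call."""
        return tuple(self._fetch())


calls = []


def fake_fetch():
    """Pretend network request that records how often it is called."""
    calls.append(1)
    return ['File', 'Image', 'Bild']


wiki = Wiki(fake_fetch)
first = wiki.file_namespace_names()
second = wiki.file_namespace_names()
```

One caveat with this pattern: `lru_cache` on a method keys the cache on `self`, so the cache keeps the instance alive for as long as the entry exists.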

        targets.append(match.group(1))

    ## of the form `image1 = foobar.png`, `image2=foobar.png`,
    ## and `image = bar baz.png` etc...
    for match in re.finditer(
            r'(image\d?[\t ]*?=[\t ]*)(.+?\..+?)((?=[^A-Za-z0-9\.])|(?:$))',
            raw):
        targets.append(match.group(2))

    return targets


@lru_cache()
def fetchImageInfo(page_title):
    """
    Use API to fetch all image titles and urls on current article.

    Returns a list of `(title, url)` tuples.
    """
    page_title = re.sub(r' ', '_', page_title)
    result = wiki._query(action="query", titles=page_title, generator="images",
                         prop="imageinfo", iiprop="url", format="json")

    json_result = json.loads(result)
    info = []

    for v in json_result['query']['pages'].values():
        info.append((v['title'], v['imageinfo'][0]['url']))

    return info


class SearchBox(urwid.Edit):
    title = "Search"

@@ -495,7 +602,7 @@ def cancel(button):
               'extlinks': Extlinks,
               'langs': Langs}
 cmds = tuple(overlaymap) + ('quit', 'bmark', 'open', 'edit', 'clearcache',
-                            'help', 'back', 'forward', 'random')
+                            'help', 'back', 'forward', 'random', 'images')
 
 def processCmd(cmd, *args):
     global current
@@ -528,6 +635,8 @@ def processCmd(cmd, *args):
         openPage(history[current], browsinghistory=True)
     elif cmd == 'random':
         openPage(wiki.random())
+    elif cmd == 'images':
+        showImages()
     elif cmd:
         ex.notify(cmd + ': Unknown Command')

7 changes: 6 additions & 1 deletion wikicurses/wiki.py
@@ -87,7 +87,12 @@ def mainpage(self):
     def _query(self, post=False, **kwargs):
         params = {k: v for k, v in kwargs.items() if v is not False}
         data = urllib.parse.urlencode(params)
-        url = self.siteurl
+
+        try:
+            url = params['customurl']
+        except KeyError:
+            url = self.siteurl
+
         if post:
             data = data.encode()
         if not post:
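
A note on this hunk: because `data` is built from `params` before the URL is chosen, `customurl` appears to be encoded into the request data as well. A minimal sketch of one alternative, assuming the intent is for `customurl` to select the URL only. `build_request` is a hypothetical free function, not the project's `_query`, and the URLs are placeholders.

```python
import urllib.parse


def build_request(siteurl, **kwargs):
    """Return (url, data); 'customurl' picks the URL and is not encoded."""
    params = {k: v for k, v in kwargs.items() if v is not False}
    # pop() removes the key so it never reaches urlencode().
    url = params.pop('customurl', siteurl)
    return url, urllib.parse.urlencode(params)


url, data = build_request('https://example.org/w/api.php',
                          customurl='https://example.org/w/index.php',
                          action='raw', title='Foo')
```

`dict.pop` with a default also replaces the `try`/`except KeyError` block with a single line.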