Windows support for decompressing dumps #13

Open
@he7d3r

When running this script

#!/usr/bin/env python
# -*- coding: UTF-8 -*-
# Example:
# python TEST.py ocwikibooks-20140928-pages-meta-history.xml.bz2
from mw import xml_dump
import sys

def rev_info(dump, path):
    for page in dump:
        yield page.title

def run(dump):
    for title in xml_dump.map(dump, rev_info):
        print(title)
    print('Done.')

if __name__ == "__main__":
    run([sys.argv[1]])

on Windows, the result is the following:

C:\Users\Diego\Desktop\TEST> python aaa.py ocwikibooks-20140928-pages-articles.xml.bz2
Traceback (most recent call last):
  File "c:\Program Files\Python32\lib\pickle.py", line 683, in save_global
    klass = getattr(mod, name)
AttributeError: 'module' object has no attribute 'dec'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "aaa.py", line 18, in <module>
    run([sys.argv[1]])
  File "aaa.py", line 13, in run
    for title in xml_dump.map(dump, rev_info):
  File "c:\Program Files\Python32\lib\site-packages\mediawiki_utilities-0.4.1-py3.2.egg\mw\xml_dump\map.py", line 72, in map
    processor.start()
  File "c:\Program Files\Python32\lib\multiprocessing\process.py", line 132, in start
    self._popen = Popen(self)
  File "c:\Program Files\Python32\lib\multiprocessing\forking.py", line 266, in __init__
    dump(process_obj, to_child, HIGHEST_PROTOCOL)
  File "c:\Program Files\Python32\lib\multiprocessing\forking.py", line 188, in dump
    ForkingPickler(file, protocol).dump(obj)
  File "c:\Program Files\Python32\lib\pickle.py", line 237, in dump
    self.save(obj)
  File "c:\Program Files\Python32\lib\pickle.py", line 344, in save
    self.save_reduce(obj=obj, *rv)
  File "c:\Program Files\Python32\lib\pickle.py", line 432, in save_reduce
    save(state)
  File "c:\Program Files\Python32\lib\pickle.py", line 299, in save
    f(self, obj) # Call unbound method with explicit self
  File "c:\Program Files\Python32\lib\pickle.py", line 627, in save_dict
    self._batch_setitems(obj.items())
  File "c:\Program Files\Python32\lib\pickle.py", line 660, in _batch_setitems
    save(v)
  File "c:\Program Files\Python32\lib\pickle.py", line 299, in save
    f(self, obj) # Call unbound method with explicit self
  File "c:\Program Files\Python32\lib\pickle.py", line 687, in save_global
    (obj, module, name))
_pickle.PicklingError: Can't pickle <function dec at 0x0000000002812F48>: it's not found as mw.xml_dump.map.dec

C:\Users\Diego\Desktop\TEST>Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "c:\Program Files\Python32\lib\multiprocessing\forking.py", line 369, in main
    self = load(from_parent)
EOFError
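
A minimal sketch of what the traceback seems to point to, without `mw` involved. On Windows, `multiprocessing` pickles the process object to hand it to the child (the `dump(process_obj, ...)` frame above), and pickle stores a function only as a module plus a name. The sketch assumes `dec` is a function defined inside another function in `mw/xml_dump/map.py`; that source is not shown here, so this is a guess. The names `make_nested`, `top_level_dec` and `can_pickle` are hypothetical and not part of `mw`.

```python
import pickle


def make_nested():
    # Hypothetical stand-in for a helper that is defined inside another
    # function, so it is not reachable as an attribute of its module.
    def dec(path):
        return path
    return dec


def top_level_dec(path):
    # Defined at module level, so pickle can refer to it as module + name.
    return path


def can_pickle(obj):
    """Return True if pickle.dumps accepts obj, False if it refuses."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # The exact exception type for unpicklable functions varies
        # between Python versions, so several are caught here.
        return False


if __name__ == "__main__":
    print("nested function picklable:", can_pickle(make_nested()))
    print("module-level function picklable:", can_pickle(top_level_dec))
```

If that assumption holds, moving the helper to module level (or otherwise avoiding passing a nested function into the child process) might be enough for the Windows case, though I have not tested this against `mw` itself.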
