External link not parsed #13
Comments
I think there is currently a lack of parsing rule for images ( |
Would it be an idea to have an option for including the specific namespaces? These are language-dependent. |
Well, that's a good point! I think I will add a new configuration option that lets you specify such namespace names. By the way, there is a related discussion in earwig/mwparserfromhell#136. |
I read the discussion. It is indeed the same issue. I think it would make sense to have a static language class for these situations. I can provide the Dutch version based on my findings across all Wikipedia articles. |
The updated ETA is before end of next week 😂 |
Published. See the following snippet for an example of how to customize namespace prefixes used as MwParserFromScratch/UnitTestProject1/BasicParsingTests.cs Lines 158 to 162 in f0dac82
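Additionally, you may use WikiClientLibrary to look up the namespace on the target site:
using WikiClientLibrary;
using WikiClientLibrary.Client;
using WikiClientLibrary.Sites;
// Resolve the MediaWiki API endpoint of the Dutch Wikipedia.
var client = new WikiClient();
var endpointUrl = await WikiSite.SearchApiEndpointAsync(client, "nl.wikipedia.org");
// Load the site information, which includes the localized namespaces.
var site = new WikiSite(client, endpointUrl);
await site.Initialization;
// Namespace information for the File namespace on this site.
site.Namespaces[BuiltInNamespaces.File] |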
This is perfect. I will complete this for the Dutch (NL) Wikipedia as I find the namespaces. It will take some time, though :) |
Example:
[[Bestand:Bundesarchiv Bild 146III-373, Modell der Neugestaltung Berlins ("Germania").jpg|miniatuur|260px|right| Schaalmodel van de [[Welthauptstadt Germania]], 1939]]
This is a link on this particular page:
https://nl.wikipedia.org/wiki/Albert_Speer
With the code:
var ast = LoadAndParse(fileName.Trim(' ', '\t', '"'));
var text = ast.ToPlainText(NodePlainTextOptions.RemoveRefTags);
I would expect the text to read:
Schaalmodel van de Welthauptstadt Germania, 1939
I have been trying to get this sorted, but I am kind of lost in the code...
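To make the expected behavior concrete, here is a minimal, self-contained sketch of what I mean. It does not use MwParserFromScratch and is not how the library works internally; `ImageCaptionSketch`, `FileNamespaces`, and `CaptionOf` are hypothetical names, and the prefix list is a hard-coded assumption (on a real site it would come from the site's namespace information). It treats a link as an image link when its prefix is in the list, takes the last top-level `|` segment as the caption, and replaces nested links with their visible text.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ImageCaptionSketch
{
    // Assumption: hard-coded, language-dependent prefixes for the File namespace.
    public static readonly HashSet<string> FileNamespaces =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "File", "Image", "Bestand", "Afbeelding" };

    // Splits on '|' characters that are not inside a nested [[...]] link.
    private static List<string> SplitTopLevel(string s)
    {
        var parts = new List<string>();
        int depth = 0, start = 0;
        for (int i = 0; i < s.Length; i++)
        {
            if (i + 1 < s.Length && s[i] == '[' && s[i + 1] == '[') { depth++; i++; }
            else if (i + 1 < s.Length && s[i] == ']' && s[i + 1] == ']') { depth--; i++; }
            else if (s[i] == '|' && depth == 0)
            {
                parts.Add(s.Substring(start, i - start));
                start = i + 1;
            }
        }
        parts.Add(s.Substring(start));
        return parts;
    }

    // Returns the caption as plain text, or null if the link is not an image link.
    public static string CaptionOf(string link)
    {
        if (!link.StartsWith("[[", StringComparison.Ordinal) ||
            !link.EndsWith("]]", StringComparison.Ordinal)) return null;

        var parts = SplitTopLevel(link.Substring(2, link.Length - 4));
        int colon = parts[0].IndexOf(':');
        if (colon < 0 || !FileNamespaces.Contains(parts[0].Substring(0, colon).Trim())) return null;
        if (parts.Count < 2) return "";

        // Replace nested [[target|label]] or [[target]] with the visible text.
        string caption = parts[parts.Count - 1];
        return Regex.Replace(caption, @"\[\[(?:[^\]|]*\|)?([^\]]*)\]\]", "$1").Trim();
    }
}
```

With the `[[Bestand:...]]` link from the example above, `CaptionOf` returns `Schaalmodel van de Welthauptstadt Germania, 1939`. For an ordinary link such as `[[Welthauptstadt Germania]]` it returns null, because the link has no file namespace prefix.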