Improper special character escaping in RSS Feed #297

Open
marb08 opened this issue Jan 3, 2025 · 1 comment
Labels
work completed Work for this has been completed but it may not yet be merged / released
Milestone

Comments

@marb08

marb08 commented Jan 3, 2025

Description

Glance does not properly decode special characters in RSS feed content.
For example, when parsing information from this URL:

  • Example: Supercoppa e non solo, nel super gennaio di Conceiçao c'è di tutto. Le tre vie per ripartire
    • Expected: Supercoppa e non solo, nel super gennaio di Conceição c'è di tutto. Le tre vie per ripartire
    • Observed: Special characters remain as escaped entities (&#x...;).

I investigated and found that the RSS feed from the above URL does not appear to wrap its <description> fields in <![CDATA[...]]>. The parser used by Glance may depend on that wrapping and fail to handle its absence.

Steps to Reproduce

  1. Fetch the RSS feed from https://www.gazzetta.it/dynamic-feed/rss/section/Calcio.xml.
  2. Parse the feed and examine the <description> fields for escaped special characters.
  3. Observe that special characters such as ç and é remain in their escaped form (&#x...;) instead of being decoded into readable text.
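
The steps above can be approximated offline with a small sketch. The sample feed below is hypothetical and not copied from the real feed: it assumes the ampersand of each character reference is itself escaped (`&amp;#xE7;`), which is one way a standard XML parser could end up returning `&#x...;` as literal text. The helper name `descriptions` is also just for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample (an assumption, not the real gazzetta.it payload):
# the "&" of each character reference is escaped, so after XML parsing
# the text still contains the literal sequence "&#xE7;".
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Sample</title>
    <item>
      <title>Sample item</title>
      <description>Concei&amp;#xE7;&amp;#xE3;o c'&amp;#xE8; di tutto</description>
    </item>
  </channel>
</rss>"""


def descriptions(feed_xml: str) -> list[str]:
    """Return the text of every <item><description> in the feed."""
    root = ET.fromstring(feed_xml)
    return [item.findtext("description", default="") for item in root.iter("item")]


parsed = descriptions(SAMPLE_FEED)
print(parsed[0])  # the character references are still escaped here
```

If the real feed behaves differently (for example, if the references sit inside a CDATA block), the same helper can be pointed at a saved copy of the feed to check.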

Expected Behavior

  • Special characters should be handled properly, with escaped entities correctly decoded into readable text.
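
A minimal sketch of the expected decoding, using Python's standard `html.unescape`. This only illustrates the behavior being asked for; it is not how Glance implements or fixes it, and the function name `decode_entities` is hypothetical.

```python
import html


def decode_entities(text: str) -> str:
    """Decode numeric and named character references left in feed text."""
    return html.unescape(text)


raw = "Concei&#xE7;&#xE3;o c'&#xE8; di tutto"
decoded = decode_entities(raw)
print(decoded)  # Conceição c'è di tutto
```

Text that is already decoded passes through unchanged. One caveat: references such as `&lt;` are also decoded, so the result should still be escaped again before it is placed into HTML.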

Possible Cause

  1. The feed is missing CDATA blocks around the <description> fields, which are needed to encapsulate unencoded text.
  2. The parser might not handle this deviation, assuming the feed strictly adheres to standard formatting.
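
To help narrow down this hypothesis, the sketch below compares how a standard XML parser (Python's `xml.etree.ElementTree`, used here only as a stand-in for whatever parser Glance uses) treats the same character references with and without a CDATA wrapper. The helper `description_text` and the sample strings are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def description_text(inner: str) -> str:
    """Parse a one-element document and return the text of <description>."""
    return ET.fromstring(f"<description>{inner}</description>").text or ""


# Without CDATA, the XML parser resolves character references itself.
plain = description_text("Concei&#xE7;&#xE3;o")

# Inside CDATA, everything is literal text, so the references survive parsing.
cdata = description_text("<![CDATA[Concei&#xE7;&#xE3;o]]>")

print(plain)
print(cdata)
```

Under this parser, references only remain escaped when they are inside CDATA or when their ampersand is escaped, so inspecting the raw bytes of the feed's <description> fields would show which case applies.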
@marb08 marb08 changed the title Improper Special Character Escaping in RSS Feed Improper special character escaping in RSS Feed Jan 3, 2025
@svilenmarkov
Member

Thanks for reporting this! Should be a pretty simple fix, I'll get it sorted out.

@svilenmarkov svilenmarkov added the work completed Work for this has been completed but it may not yet be merged / released label Jan 4, 2025
@svilenmarkov svilenmarkov added this to the v0.7.0 milestone Jan 4, 2025