Parser #7

Closed
wants to merge 13 commits into from
62 changes: 36 additions & 26 deletions .github/workflows/publish.yml
@@ -1,30 +1,40 @@
name: Upload Python Package to PyPI when a Release is Created
name: Publish to PyPI

on:
release:
types: [created]
release:
types: [published]

jobs:
pypi-publish:
name: Publish release to PyPI
runs-on: ubuntu-latest
environment:
name: pypi
url: https://pypi.org/p/Better-MD
permissions:
id-token: write
steps:
- uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.x"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install setuptools wheel
- name: Build package
run: |
python setup.py sdist bdist_wheel
- name: Publish package distributions to PyPI
uses: pypa/gh-action-pypi-publish@release/v1
publish:
runs-on: ubuntu-latest
permissions:
id-token: write

steps:
- uses: actions/checkout@v4

- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: '3.x'

- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install build twine

- name: Build package
run: python -m build

- name: Publish to PyPI
uses: pypa/[email protected]
with:
user: __token__
password: ${{ secrets.PYPI_API_TOKEN }}

- name: Publish to Test PyPI
uses: pypa/[email protected]
with:
user: __token__
password: ${{ secrets.PYPI_API_TOKEN }}
repository_url: https://test.pypi.org/legacy/
Comment on lines +35 to +40

🛠️ Refactor suggestion

Use separate tokens for PyPI and Test PyPI.

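PyPI and Test PyPI are separate services with separate accounts and API tokens, so a token created on pypi.org will not authenticate against test.pypi.org. Reusing one secret for both steps also gives the test upload access to the production credential. Create a dedicated Test PyPI token and store it in its own secret.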

- name: Publish to Test PyPI
  uses: pypa/[email protected]
  with:
    user: __token__
-   password: ${{ secrets.PYPI_API_TOKEN }}
+   password: ${{ secrets.TEST_PYPI_API_TOKEN }}
    repository_url: https://test.pypi.org/legacy/

46 changes: 45 additions & 1 deletion BetterMD/__init__.py
@@ -1,4 +1,48 @@
import logging
from .elements import *
from .html import CustomHTML
from .markdown import CustomMarkdown
from .rst import CustomRst
from .rst import CustomRst
from .parse import HTMLParser, MDParser, Collection

class HTML:
@staticmethod
def from_string(html:'str'):
return Symbol.from_html(html)

@staticmethod
def from_file(file):
return Symbol.from_html(file)

@staticmethod
def from_url(url):
import requests as r
text = r.get(url).text

if text.startswith("<!DOCTYPE html>"):
text = text[15:]

return Symbol.from_html(text)
Comment on lines +19 to +25

🛠️ Refactor suggestion

Improve error handling in URL requests

The from_url method doesn't handle potential network errors when fetching content. Add try-except blocks to gracefully handle connection errors, timeouts, and other HTTP issues.

@staticmethod
def from_url(url):
    import requests as r
-    text = r.get(url).text
+    try:
+        response = r.get(url, timeout=10)
+        response.raise_for_status()  # Raise an exception for HTTP errors
+        text = response.text
+    except Exception as e:
+        raise ValueError(f"Failed to fetch URL content: {e}")

    if text.startswith("<!DOCTYPE html>"):
        text = text[15:]

    return Symbol.from_html(text)
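
A possible refinement of the suggestion above, offered as a minimal sketch rather than the project's implementation: catch only transport-level errors instead of a bare Exception, and chain the original exception with `from` so the root cause stays in the traceback. The names `fetch_text` and `FetchError` are hypothetical and not part of BetterMD. The sketch uses the standard library's urllib in place of requests so that it runs without third-party packages, and it accepts an injectable `opener` so the failure path can be exercised without network access.

```python
import urllib.error
import urllib.request


class FetchError(ValueError):
    """Hypothetical error type: raised when a URL cannot be fetched."""


def fetch_text(url, timeout=10, opener=urllib.request.urlopen):
    """Return the decoded body of `url`, or raise FetchError.

    `opener` defaults to urllib.request.urlopen and exists only so that
    callers (and tests) can substitute a fake.
    """
    try:
        with opener(url, timeout=timeout) as response:
            # Fall back to UTF-8 when the server does not declare a charset.
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")
    except OSError as exc:
        # URLError, its subclass HTTPError (4xx/5xx responses), and socket
        # timeouts all derive from OSError, so they are handled here.
        raise FetchError(f"Failed to fetch {url!r}: {exc}") from exc
```

Because `FetchError` subclasses ValueError, callers written against the reviewer's suggested `ValueError` would keep working under this sketch.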


class MD:
@staticmethod
def from_string(md:'str'):
return Symbol.from_md(md)

@staticmethod
def from_file(file):
return Symbol.from_md(file)

@staticmethod
def from_url(url):
import requests as r
text = r.get(url).text
return Symbol.from_md(text)
Comment on lines +37 to +40

🛠️ Refactor suggestion

Add consistent error handling to MD.from_url

For consistency with the suggested improvements to HTML.from_url, implement the same error handling here:

@staticmethod
def from_url(url):
    import requests as r
-    text = r.get(url).text
-    return Symbol.from_md(text)
+    try:
+        response = r.get(url, timeout=10)
+        response.raise_for_status()  # Raise an exception for HTTP errors
+        text = response.text
+        return Symbol.from_md(text)
+    except Exception as e:
+        raise ValueError(f"Failed to fetch URL content: {e}")
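
One detail worth noting about the suggestion above: the call to Symbol.from_md sits inside the try block, so an exception raised by the parser would also be reported as "Failed to fetch URL content". A minimal sketch of one way to keep the two failure modes apart uses try/except/else. The function `load_from_url` and its `fetch` and `parse` parameters are hypothetical stand-ins (for an HTTP GET that returns text and for something like Symbol.from_md); they are not BetterMD APIs.

```python
def load_from_url(url, fetch, parse):
    """Fetch `url` with `fetch`, then parse the resulting text with `parse`.

    Both callables are hypothetical stand-ins used for illustration.
    """
    try:
        text = fetch(url)
    except OSError as exc:
        # Only transport-level failures are reported as fetch errors.
        # (requests.RequestException derives from IOError, an alias of
        # OSError, so it would be caught by this clause as well.)
        raise ValueError(f"Failed to fetch URL content: {exc}") from exc
    else:
        # Runs only when the fetch succeeded. Parser errors propagate
        # unchanged instead of being relabelled as fetch failures.
        return parse(text)
```

With this shape, a malformed document surfaces as whatever exception the parser raises, while a network problem surfaces as ValueError.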


def enable_debug_mode():
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("BetterMD")

return logger

__all__ = ["HTML", "MD", "Symbol", "Collection", "HTMLParser", "MDParser", "CustomHTML", "CustomMarkdown", "CustomRst", "enable_debug_mode"]
10 changes: 10 additions & 0 deletions BetterMD/__main__.py
@@ -0,0 +1,10 @@
import logging


def setup_logger():
LEVEL = logging.INFO
logging.basicConfig(level=LEVEL)
logger = logging.getLogger("BetterMD")
return logger

setup_logger()
131 changes: 121 additions & 10 deletions BetterMD/elements/__init__.py
@@ -1,16 +1,127 @@
from .symbol import Symbol
from .comment import Comment
from .svg import *

from .text_formatting import Strong, Em, B

from .a import A
from .abbr import Abbr
from .acronym import Acronym
from .address import Address
from .area import Area
from .article import Article
from .aside import Aside
from .audio import Audio

from .base import Base
from .bd import BDI, BDO
from .big import Big
from .blockquote import Blockquote
from .body import Body
from .br import Br
from .button import Button

from .canvas import Canvas
from .caption import Caption
from .center import Center
from .cite import Cite
from .code import Code
from .col import Col, Colgroup

from .d import DD, DFN, DL, DT
from .data import Data
from .datalist import DataList
from .del_ import Del # Using _ to avoid conflict with del keyword
from .details import Details
from .dialog import Dialog
from .dir import Dir
from .div import Div

from .embed import Embed

from .fencedframe import FencedFrame
from .fieldset import Fieldset
from .figure import FigCaption, Figure
from .font import Font
from .footer import Footer
from .form import Form
from .frame import Frame
from .frameset import Frameset

from .h import H1,H2,H3,H4,H5,H6
from .head import Head
from .header import Header
from .hgroup import HGroup
from .hr import Hr
from .html import HTML

from .i import I
from .iframe import Iframe
from .img import Img
from .input import Input
from .ins import Ins

from .kbd import Kbd

from .label import Label
from .legend import Legend
from .li import OL, UL, LI
from .text import Text
from .div import Div
from .link import Link

from .main import Main
from .map import Map
from .mark import Mark
from .marquee import Marquee
from .menu import Menu
from .meta import Meta
from .meter import Meter

from .nav import Nav
from .no import NoFrames, NoScript, NoBr, NoEmbed

from .object import Object
from .output import Output

from .p import P
from .param import Param
from .picture import Picture
from .plaintext import Plaintext
from .progress import Progress

from .q import Q

from .ruby import RB, RP, RT, RTC

from .s import S
from .samp import Samp
from .script import Script
from .search import Search
from .section import Section
from .select import Select
from .slot import Slot
from .small import Small
from .source import Source
from .span import Span
from .img import Img
from .text_formatting import Strong, Em, Code
from .br import Br
from .blockquote import Blockquote
from .hr import Hr
from .table import Table, Tr, Td, Th
from .input import Input
from .code import Code
from .strike import Strike
from .style import Style
from .sub import Sub
from .summary import Summary
from .sup import Sup

from .table import Table, Tr, Td, Th, THead, TBody, TFoot
from .template import Template
from .text import Text
from .textarea import Textarea
from .time import Time
from .title import Title
from .track import Track
from .tt import TT

from .u import U

from .var import Var
from .video import Video

from .wbr import WBR

from .xmp import XMP
24 changes: 14 additions & 10 deletions BetterMD/elements/a.py
@@ -1,23 +1,27 @@
from BetterMD.rst.custom_rst import CustomRst
from .symbol import Symbol
from ..rst import CustomRst
from ..markdown import CustomMarkdown
from ..html import CustomHTML
import typing as t

class MD(CustomMarkdown['A']):
class MD(CustomMarkdown):
def to_md(self, inner, symbol, parent):
return f"[{' '.join([e.to_md() for e in inner])}]({symbol.get_prop('href')})"

class HTML(CustomHTML['A']):
def to_html(self, inner, symbol, parent):
return f"<a href={symbol.get_prop('href')}>{' '.join([e.to_html() for e in inner])}</a>"

class RST(CustomRst['A']):
def to_rst(self, inner, symbol, parent):
return f"`{' '.join([e.to_rst() for e in inner])} <{symbol.get_prop('href')}>`_"

class A(Symbol):
prop_list = ["href"]

refs = {}

🛠️ Refactor suggestion

Use a class variable for shared references with caution.

The refs = {} dictionary here is shared among all instances of class A. Any modifications to refs on one instance will affect other instances. Consider changing the design if different instances of A should have separate dictionaries.

-class A(Symbol):
-    ...
-    refs = {}
+class A(Symbol):
+    ...
+    # If you need a unique dictionary on each instance, move refs to __init__
+    def __init__(self, *args, **kwargs):
+        super().__init__(*args, **kwargs)
+        self.refs = {}
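
The difference between the two designs can be seen in a small standalone sketch. The classes `SharedRefs` and `PerInstanceRefs` are hypothetical and only mirror the shape of A; they are not the PR's code.

```python
class SharedRefs:
    # Class attribute: a single dict shared by the class and every instance.
    refs = {}

    @classmethod
    def get_ref(cls, name):
        # A classmethod can read the shared dict without any instance.
        return cls.refs[name]


class PerInstanceRefs:
    def __init__(self):
        # Instance attribute: every object gets its own, separate dict.
        self.refs = {}


a, b = SharedRefs(), SharedRefs()
a.refs["home"] = "https://example.com"
print(b.refs)                      # {'home': 'https://example.com'}
print(SharedRefs.get_ref("home"))  # https://example.com

c, d = PerInstanceRefs(), PerInstanceRefs()
c.refs["home"] = "https://example.com"
print(d.refs)                      # {}
```

Since A.get_ref in this diff is a classmethod that reads cls.refs, the shared dictionary may be intentional (for example, as a registry of named link references). Moving refs into __init__ as suggested would leave that classmethod with nothing to read, so the two changes would need to be made together.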

md = MD()
html = HTML()
rst = RST()
html = "a"
rst = RST()

@classmethod
def get_ref(cls, name):
return cls.refs[name]

@classmethod
def email(cls, email):
return cls(href=f"mailto:{email}")
8 changes: 8 additions & 0 deletions BetterMD/elements/abbr.py
@@ -0,0 +1,8 @@
from .symbol import Symbol

class Abbr(Symbol):
prop_list = ["title"]

md = ""
html = "abbr"
rst = ""
8 changes: 8 additions & 0 deletions BetterMD/elements/acronym.py
@@ -0,0 +1,8 @@
from .symbol import Symbol

class Acronym(Symbol):
prop_list = ["title"]

md = ""
html = "acronym"
rst = ""
6 changes: 6 additions & 0 deletions BetterMD/elements/address.py
@@ -0,0 +1,6 @@
from .symbol import Symbol

class Address(Symbol):
md = ""
html = "address"
rst = ""
8 changes: 8 additions & 0 deletions BetterMD/elements/area.py
@@ -0,0 +1,8 @@
from .symbol import Symbol

class Area(Symbol):
prop_list = ["alt", "coords", "download", "href", "ping", "referrerpolicy", "rel", "shape", "target"]

md = ""
html = "area"
rst = ""
6 changes: 6 additions & 0 deletions BetterMD/elements/article.py
@@ -0,0 +1,6 @@
from .symbol import Symbol

class Article(Symbol):
md = ""
html = "article"
rst = ""
6 changes: 6 additions & 0 deletions BetterMD/elements/aside.py
@@ -0,0 +1,6 @@
from .symbol import Symbol

class Aside(Symbol):
md = ""
html = "aside"
rst = ""
Comment on lines +1 to +6

⚠️ Potential issue

Missing property list for Aside element

The Aside class has no prop_list attribute. Several other Symbol-derived classes in this PR declare one (for example Abbr, Area, Audio, and Base), although Address and Article omit it as well, so the pattern is not applied uniformly.

The HTML <aside> element has no element-specific attributes, only the global ones. For consistency, either:

  1. Add an empty prop_list = [], or
  2. Add any properties that should be supported.

from .symbol import Symbol

class Aside(Symbol):
+    prop_list = []
    md = ""
    html = "aside"
    rst = ""

Also, consider documenting why the md and rst attributes are empty strings.
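
An alternative to repeating prop_list = [] in every element is to declare the default once on the base class. The sketch below is illustrative only: `BaseElement` is a hypothetical, heavily simplified stand-in for Symbol, and the property validation in `__init__` is an assumption made for the example, not a description of how BetterMD uses prop_list.

```python
class BaseElement:
    """Hypothetical, simplified stand-in for BetterMD's Symbol."""

    # Default declared once. A tuple is used so the shared default
    # cannot be mutated by accident through one subclass or instance.
    prop_list = ()
    html = ""

    def __init__(self, **props):
        # Assumed behaviour for illustration: reject undeclared properties.
        unknown = sorted(set(props) - set(self.prop_list))
        if unknown:
            raise TypeError(f"{type(self).__name__} does not accept {unknown}")
        self.props = props


class Aside(BaseElement):
    # No prop_list here: the empty default from BaseElement applies.
    html = "aside"


class Abbr(BaseElement):
    prop_list = ("title",)
    html = "abbr"
```

Under this arrangement, elements with no specific attributes (Aside, Address, Article) need no extra line, and only elements with their own attributes override the default.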


8 changes: 8 additions & 0 deletions BetterMD/elements/audio.py
@@ -0,0 +1,8 @@
from .symbol import Symbol

class Audio(Symbol):
prop_list = ["autoplay", "controls", "crossorigin", "disableremoteplayback", "loop", "muted", "preload", "src"]

md = ""
html = "audio"
rst = ""
8 changes: 8 additions & 0 deletions BetterMD/elements/base.py
@@ -0,0 +1,8 @@
from .symbol import Symbol

class Base(Symbol):
prop_list = ["href", "target"]

md = ""
html = "base"
rst = ""