Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MDX img parsing #1596

Closed
Its-Just-Nans opened this issue Dec 23, 2024 · 2 comments
Closed

MDX img parsing #1596

Its-Just-Nans opened this issue Dec 23, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@Its-Just-Nans
Copy link
Contributor

MDX are not parsed correctly

Reproduce

test.mdx

# A Test

Some link in text [here](https://example.com/foo.png)

<img src="https://example.com/bar.png" />

<img src={"https://example.com/boo.png"} />

<img src='https://example.com/buu.png' />

<img src={'https://example.com/bii.png'} />

<img src={`https://example.com/${variable}`} />

import Img from "./img.png";

<img src={Img} />

Run

cargo run test.mdx -v

Result

[EXCLUDED] https://example.com/foo.png
[EXCLUDED] https://example.com/bar.png
[EXCLUDED] https://example.com/buu.png
[EXCLUDED] https://example.com/bii.png
[EXCLUDED] https://example.com/boo.png

Issues found in 1 input. Find details below.

[test.mdx]:
   [ERROR] file:///home/n4n5/tmp/tmp.mqqx3sEeAi/lychee/%7BImg%7D | Cannot find file

🔍 6 Total (in 0s) ✅ 0 OK 🚫 1 Error 👻 5 Excluded

Problem

I think variable as src should be ignored, in this case src={Img} should be ignored

@mre mre added the bug Something isn't working label Jan 12, 2025
@mre
Copy link
Member

mre commented Feb 5, 2025

This is probably quite hard to do.
Here are some cases that make it hard:

# Complex MDX Image Parsing Test Cases

// Basic cases
<img src="https://example.com/basic.png" />
<img src={"https://example.com/string-literal.png"} />
<img src={'https://example.com/single-quote.png'} />

// Markdown style
![Markdown style image](https://example.com/markdown.png)

// Import statement
import LocalImage from "./local-image.png";
<img src={LocalImage} />

// Nested template literals
<img src={`https://${domain}/images/${`${subpath}/${filename}`}`} />

// Dynamic import with require
<img src={require(`@/assets/${dynamic.path}`).default} />

// Conditional rendering
<img src={isProduction ? 
  "https://prod.example.com/image.png" : 
  "https://dev.example.com/image.png"
} />

// String concatenation with environment variables
<img src={'https://' + process.env.CDN_URL + '/images/' + props.image.path} />

// Variable interpolation
<img src={`https://example.com/${variable}`} />

// Multiple variables in template literal
<img src={`https://${domain}.${tld}/images/${filename}`} />

// Object property access
<img src={images.hero.url} />

// Function calls
<img src={getImageUrl('hero')} />

// Array access
<img src={imageUrls[index]} />

// Complex expression
<img src={`${CDN_URL}/${getPath(userId)}/${image.hash}?w=${width}&h=${height}`} />

// Mixed quotes and template literals
<img src={`${baseUrl}'s-image-${version}.png`} />

// Nested object destructuring
<img src={data?.user?.profile?.avatar?.url} />

// Dynamic path segments
<img src={new URL(`images/${filename}`, process.env.BASE_URL).toString()} />

// Import with alias
import { default as HeroImage } from './hero.png';
<img src={HeroImage} />

// Comments within template literals
<img src={`https://example.com/${
  // This is a comment
  process.env.ENVIRONMENT
}/image.png`} />

I think we could cover 80% of that with a snippet like that (pseudocode):

// Pseudocode for the simplified fix
fn should_exclude_url(url: &str) -> bool {
    // Skip any URL containing template literal syntax
    if url.contains('`') {
        return true;
    }
    
    // Skip any URL wrapped in curly braces or containing JS expressions
    if url.contains('{') || url.contains('}') {
        return true;
    }

    false
}

Although I think that's another rabbit hole that we'd rather not get into? 🤔

Unless someone is willing to write an MDX parser, I'd say that's pretty much out of the scope right now.
You can exclude those false-positives with --exclude, which supports regex.

@Its-Just-Nans
Copy link
Contributor Author

understandable

thanks

@Its-Just-Nans Its-Just-Nans closed this as not planned Won't fix, can't repro, duplicate, stale Feb 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants