-
-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add invidious companion support #4985
base: master
Are you sure you want to change the base?
Conversation
7efa8f7
to
194fb72
Compare
f6d8ddc
to
a63fca8
Compare
src/invidious/routes/api/manifest.cr
Outdated
if local && CONFIG.invidious_companion | ||
return env.redirect "#{video.invidious_companion["baseUrl"].as_s}#{env.request.path}?#{env.request.query}" | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the get_video()
call above already talks to the companion and gets playable URLs, why still redirect to the companion manifest handler? Why not let invidious generate the manifest?
PS: I realized that there is an important oversight here: what if the video
object (e.g returned from cache) doesn't have a companion field, or the companion base URL is not present anymore in the config?
PPS: Why handling that case at all when the URL is already replaced in components/player.ecr
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are multiple reasons:
- One may configure a proxy in invidious_companion. And invidious companion is going to handle multiple proxies soon, so this can't be handled inside Invidious: handle the ability to use multiple proxies invidious-companion#5
- DASH manifest generation from youtube.js include way more things than invidious. which includes multi-language audio support, see here for more: https://github.com/LuanRT/YouTube.js/blob/4e9c2a585bf84751dd4e3964f70fba284c8b8e38/src/utils/DashManifest.tsx#L68-L267
- If an external program relies on the dash api, it makes more sense to redirect this request to Invidious companion because it avoids making unnecessary process in invidious as Invidious companion is the one requesting the video streams from YouTube.
- One may configure multiple invidious companion, we need to make sure that the video stream used is from the same public IP address that generated the video stream.
In the design plan, I said that everything related to the video retrieval should be handled now in invidious companion. It doesn't make a lot of sense to handle all the logic explained above in invidious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One may configure a proxy in invidious_companion. And invidious companion is going to handle multiple proxies soon, so this can't be handled inside Invidious
Unless this is done transparently to the end user (e.g with NGINX rules), the CSP header will need to reflect the use of these external proxies, and so Invidious will need to take care of it.
DASH manifest generation from youtube.js include way more things than invidious. which includes multi-language audio support
But is this compatible with videoJS? Our DASH manifest is missing multiple formats because of that.
If an external program relies on the dash api, it makes more sense to redirect this request to Invidious companion because it avoids making unnecessary process in invidious as Invidious companion is the one requesting the video streams from YouTube.
Invidious already requests and parses the whole /player response returned by the companion (if it wasn't already in cache) so I don't think this will represent much more processing.
One may configure multiple invidious companion, we need to make sure that the video stream used is from the same public IP address that generated the video stream.
For an API user, all URLs should already have that companion as the host, and for a web user, all the relevant URLs should already be handled in components/player.ecr
. Am I right?
If so, the only way to access invidiou's own /latest_version
and manifest endpoints would be willingly by a malicious or careless API user, so broken streams are a non-problem here imo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First of all, I would like to say that I have been in your thinking situation. I had long thoughts about the advantages and the disadvantages of each implementation. This current implementation outweighs the one where Invidious handle almost all the logic and Invidious companion is just "dumb proxy".
Also, invidious companion is not entirely new, it's actually based on the ideas of invidious-stripped-down. A project that I have used for 2 years on yewtu.be in order to replace some functions of Invidious. For really improving the performance and user experience of my instance.
Unless this is done transparently to the end user (e.g with NGINX rules), the CSP header will need to reflect the use of these external proxies, and so Invidious will need to take care of it.
Already does in https://github.com/iv-org/invidious/pull/4985/files#diff-a86ced8aa99e3403588538decc008851d3d33dfadbb00d392d5ec8b4696f7df4
But is this compatible with videoJS? Our DASH manifest is missing multiple formats because of that.
Yes. I have used it for 2 years on yewtu.be. I'm stripping out the codec that do not work: https://github.com/iv-org/invidious-companion/blob/master/src/routes/invidious_routes/dashManifest.ts#L41-L43
If an external program relies on the dash api, it makes more sense to redirect this request to Invidious companion because it avoids making unnecessary process in invidious as Invidious companion is the one requesting the video streams from YouTube.
Invidious already requests and parses the whole /player response returned by the companion (if it wasn't already in cache) so I don't think this will represent much more processing.
The benefit of handling latest_version and /api/manifest/dash/id/ is that you can more easily handle all the edge cases in invidious companion itself, no need to add more logic in Invidious. Especially in a time when YouTube is evolving rapidly, we want to keep the pace thanks to youtube.js.
For an API user, all URLs should already have that companion as the host, and for a web user, all the relevant URLs should already be handled in
components/player.ecr
. Am I right?
If so, the only way to access invidiou's own/latest_version
and manifest endpoints would be willingly by a malicious or careless API user, so broken streams are a non-problem here imo.
My number one priority is users using Invidious directly. We can think about API users later. Our main user base uses our frontend.
It doesn't mean that developers using the API are forever forgotten. I have ideas but for the moment, if invidious companion is configured then the video streams given through the API may not always work, especially when "proxying" through the same public IP address is needed.
I'm sorry, but a day is only 24 hours and at the moment I'm a single developer trying to improve the overall usability of Invidious through youtube.js. I can't deal with all the cases at the same time. Happy to receive more help for contribution!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already does in https://github.com/iv-org/invidious/pull/4985/files#diff-a86ced8aa99e3403588538decc008851d3d33dfadbb00d392d5ec8b4696f7df4
The problem is that it add the companion's domains to the CSP, but not the companion's proxies themselves.
The benefit of handling latest_version and /api/manifest/dash/id/ is that you can more easily handle all the edge cases in invidious companion itself [...]
[...]
My number one priority is users using Invidious directly. We can think about API users later. Our main user base uses our frontend.
My question was about the necessity of adding this redirect logic on these endpoints, where they should never be called in the first place, as none of the URLs in the /watch
page should lead to invidious in the first place!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already does in #4985 (files)
The problem is that it add the companion's domains to the CSP, but not the companion's proxies themselves.
This adds in the CSP the baseUrl: https://github.com/iv-org/invidious-companion/blob/master/config/default.toml#L5 which is given by invidious companion in the JSON reply. This domain is the same domain that which videojs will use for either latest_version or dash manifest api requests or videoplayback request.
We are not adding the URL passing in here: https://github.com/iv-org/invidious/pull/4985/files#diff-b68cf7cb2cb2dbd2275444b03cdc238962e05b5ccff0b89ba34b1f9126f39187R71
invidious_companion:
- http://127.0.0.1:8282
My question was about the necessity of adding this redirect logic on these endpoints, where they should never be called in the first place, as none of the URLs in the
/watch
page should lead to invidious in the first place!
You always have stuff that use these endpoints directly without going through the frontend. It's to make sure that we are using invidious companion all the time.
- download function/button in invidious
- bots
- apps that use latest_version or dash manifest api instead of the official api of invidious
It's to make sure that we avoid sending wrong requests to youtube servers and the IP get banned for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This adds in the CSP the baseUrl [...] which is given by invidious companion in the JSON reply
I got that ^^ But if the companion has say proxy1.example.com
and proxy2.example.com
set, those won't be added to the CSP. Unless this is completely transparent to invidious (= the companion forwards the request to the proxies itself, and invidious only ever connects to http://127.0.0.1:8282
)
It's to make sure that we avoid sending wrong requests to youtube servers and the IP get banned for that.
Fair enough!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have rewritten some part. Let me explain.
If you ENABLE invidious_companion then by default in Invidious:
- /api/manifest/dash/id/ is blocked
- /latest_version is blocked
This makes sense, as those endpoints are not used in Invidious if Invidious companion is enabled. This is to avoid any tries to load some video stream through the incorrect IP and making the requests suspicious to YouTube servers.
But for the "downloads" feature, all requests will now be straight sent to latest_version endpoint of invidious companion.
I know this does not yet download the video file directly but the user can do it by clicking on the download feature in their browser and this works. It's being tracked in iv-org/invidious-companion#13
In my opinion this is not a dealbreaker as the Invidious download function is somewhat not useful anymore since the removal of hd720. Maybe in the future we could look into adding ffmpeg combining video and audio on the fly, but right now I'm prioritizing giving back more stability to Invidious.
Feedback from @Fijxu: |
src/invidious/routes/api/manifest.cr
Outdated
if local && CONFIG.invidious_companion | ||
return env.redirect "#{video.invidious_companion["baseUrl"].as_s}#{env.request.path}?#{env.request.query}" | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already does in https://github.com/iv-org/invidious/pull/4985/files#diff-a86ced8aa99e3403588538decc008851d3d33dfadbb00d392d5ec8b4696f7df4
The problem is that it add the companion's domains to the CSP, but not the companion's proxies themselves.
The benefit of handling latest_version and /api/manifest/dash/id/ is that you can more easily handle all the edge cases in invidious companion itself [...]
[...]
My number one priority is users using Invidious directly. We can think about API users later. Our main user base uses our frontend.
My question was about the necessity of adding this redirect logic on these endpoints, where they should never be called in the first place, as none of the URLs in the /watch
page should lead to invidious in the first place!
2683b24
to
37df2b4
Compare
Co-authored-by: syeopite <[email protected]>
Co-authored-by: syeopite <[email protected]>
97ae26c
to
1aa154b
Compare
I have just added a new commit for giving invidious companion the ability to verify that the request originated from an invidious watch page. This allows to combat against bots that will abuse the latest_version endpoint. This verification is not enabled by default in Invidious companion. I made it on purpose to not include the verification ID for the internal latest_version redirect. Mainly because this would defeat the purpose of combatting bots since the ID would be given by Invidious. |
bce789b
to
1de2054
Compare
I just marked this PR as ready as I think the code is now production ready. @SamantazFox @syeopite could you please take a look again at the code? Thanks. |
src/invidious/config.cr
Outdated
@@ -222,6 +238,24 @@ class Config | |||
end | |||
{% end %} | |||
|
|||
if !config.invidious_companion.empty? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Codestlye nitpick
What do you think about replacing !empty?
with present?
instead or unless {...}.empty?
I've seen a couple discussions in the Crystal community regarding if !empty?
being harder to process cognitively due to an essentially double negation.
Related discussions:
https://forum.crystal-lang.org/t/collections-any-vs-empty/5303
crystal-lang/crystal#13847
crystal-lang/shards#577 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's perfect for me. That was already odd for me to do !empty?
, happy there is an alternative.
if companion_base_url = video.invidious_companion.try &.["baseUrl"].as_s | ||
env.response.headers["Content-Security-Policy"] = | ||
env.response.headers["Content-Security-Policy"] | ||
.gsub("media-src", "media-src #{companion_base_url}") | ||
.gsub("connect-src", "connect-src #{companion_base_url}") | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move these to the before_all handler instead maybe under a:
if {"/embed", "/watch"}.any? { |r| env.request.resource.starts_with? r }
env.response.headers["Content-Security-Policy"] =
env.response.headers["Content-Security-Policy"]
.gsub("media-src", "media-src #{companion_base_url}")
.gsub("connect-src", "connect-src #{companion_base_url}")
end
src/invidious/videos/parser.cr
Outdated
client_config.client_type = YoutubeAPI::ClientType::AndroidTestSuite | ||
new_player_response = try_fetch_streaming_data(video_id, client_config) | ||
end | ||
if !CONFIG.invidious_companion.empty? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if !CONFIG.invidious_companion.empty? | |
if CONFIG.invidious_companion.empty? |
The Invidious stream data workarounds should run when invidious companion is not set
|
||
begin | ||
invidious_companion = CONFIG.invidious_companion.sample | ||
response = make_client(invidious_companion.private_url, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since Invidious is expected to constantly make requests to the Invidious companion shouldn't it use a connection pool?
It shouldn't be too difficult to modify the connection pool's factory method to randomly select a companion URL for each client it creates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you suggest how to do that? I'm not very familiar with this stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't tested it but something like this should do the trick:
Patch
diff --git a/src/invidious.cr b/src/invidious.cr
index b422dcbb..c0c78a79 100644
--- a/src/invidious.cr
+++ b/src/invidious.cr
@@ -97,6 +97,10 @@ YT_POOL = YoutubeConnectionPool.new(YT_URL, capacity: CONFIG.pool_size)
GGPHT_POOL = YoutubeConnectionPool.new(URI.parse("https://yt3.ggpht.com"), capacity: CONFIG.pool_size)
+COMPANION_POOL = CompanionConnectionPool.new(
+ capacity: CONFIG.pool_size
+)
+
# CLI
Kemal.config.extra_options do |parser|
parser.banner = "Usage: invidious [arguments]"
diff --git a/src/invidious/yt_backend/connection_pool.cr b/src/invidious/yt_backend/connection_pool.cr
index c4a73aa7..6f1ef9bd 100644
--- a/src/invidious/yt_backend/connection_pool.cr
+++ b/src/invidious/yt_backend/connection_pool.cr
@@ -46,6 +46,45 @@ struct YoutubeConnectionPool
end
end
+struct CompanionConnectionPool
+ property pool : DB::Pool(HTTP::Client)
+
+ def initialize(capacity = 5, timeout = 5.0)
+ options = DB::Pool::Options.new(
+ initial_pool_size: 0,
+ max_pool_size: capacity,
+ max_idle_pool_size: capacity,
+ checkout_timeout: timeout
+ )
+
+ @pool = DB::Pool(HTTP::Client).new(options) do
+ companion = CONFIG.invidious_companion.sample
+ next make_client(companion.private_url, force_resolve: true)
+ end
+ end
+
+ def client(&)
+ conn = pool.checkout
+ # Proxy needs to be reinstated every time we get a client from the pool
+ conn.proxy = make_configured_http_proxy_client() if CONFIG.http_proxy
+
+ begin
+ response = yield conn
+ rescue ex
+ conn.close
+
+ companion = CONFIG.invidious_companion.sample
+ conn = make_client(companion.private_url, force_resolve: true)
+
+ response = yield conn
+ ensure
+ pool.release(conn)
+ end
+
+ response
+ end
+end
+
def add_yt_headers(request)
request.headers.delete("User-Agent") if request.headers["User-Agent"] == "Crystal"
request.headers["User-Agent"] ||= "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36"
diff --git a/src/invidious/yt_backend/youtube_api.cr b/src/invidious/yt_backend/youtube_api.cr
index 74f65449..9bc6fe05 100644
--- a/src/invidious/yt_backend/youtube_api.cr
+++ b/src/invidious/yt_backend/youtube_api.cr
@@ -695,9 +695,7 @@ module YoutubeAPI
# Send the POST request
begin
- invidious_companion = CONFIG.invidious_companion.sample
- response = make_client(invidious_companion.private_url,
- &.post(endpoint, headers: headers, body: data.to_json))
+ response = COMPANION_POOL.client &.post(endpoint, headers: headers, body: data.to_json)
body = response.body
if (response.status_code != 200)
raise Exception.new(
Keep in mind that the connection pool will have the behavior described here #4326 (comment)
## If you are using a reverse proxy then you will probably need to | ||
## configure the public_url to be the same as the domain used for Invidious. | ||
## Also apply when used from an external IP address (without a domain). | ||
## Examples: https://MYINVIDIOUSDOMAIN or http://192.168.1.100:8282 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, I'm really excited we can deprecate inv-sig-helper
and that it fixes many issues!
Maybe you could mention which paths have to go through the Companion if Invidious and Companion domains are the same.
AFAIK these are:
/api/manifest/dash/id/*
/latest_version
In my K8S / helm template:
...
routes:
- kind: Rule
match: Host(`{{ .Values.env.domain }}`) && PathPrefix(`/`)
services:
- name: {{ .Values.name }}-app
port: {{ .Values.network.app_port_http }}
- kind: Rule
match: Host(`{{ .Values.env.domain }}`) && (PathPrefix(`/api/manifest/dash/id`) || PathPrefix(`/latest_version`))
services:
- name: {{ .Values.name }}-app
port: {{ .Values.network.companion_port_http }}
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's in the temporary doc: https://docs.invidious.io/companion-installation/
Description
Invidious companion is the new tool created for the retrieval of the YouTube streams: https://github.com/iv-org/invidious-companion
Invidious will not handle the videos streams retrieval anymore, as it has become a burner for the Invidious team to adapt. Instead, invidious companion will be based on https://github.com/LuanRT/YouTube.js which is the most up to date when it comes to video streams retrieval.
This allows us to spend more time on actually improving the Invidious frontend.
What does this PR do?
Invidious send the /player request to Invidious companion in HTTP(S). Invidious companion does all of its magic then Invidious does the usual parsing of the player endpoint.
Invidious also delegate the work of
latest_version
,/api/manifest/dash/id
and/videoplayback
to Invidious companion.What does invidious companion do
Incompatibilities
Not supported yet - will be work in progress after this PR is merged
Future potential work
How to try?
Fixes
Related to