
Rate-limit anonymous accesses to expensive pages when the server is overloaded #34167


Draft: wants to merge 1 commit into base branch main.
1 change: 1 addition & 0 deletions custom/conf/app.example.ini
@@ -780,6 +780,7 @@ LEVEL = Info
;; for example: block anonymous AI crawlers from accessing repo code pages.
;; The "expensive" mode is experimental and subject to change.
;REQUIRE_SIGNIN_VIEW = false
;OVERLOAD_INFLIGHT_ANONYMOUS_REQUESTS =
@silverwind (Member) commented on Apr 11, 2025:
Don't think we need to document debug-only options, at least not like that. Maybe we should add a [debug] ini section?

;;
;; Mail notification
;ENABLE_NOTIFY_MAIL = false
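Taken together with the parsing change in modules/setting/service.go below, the option would accept three shapes. A sketch of how an admin might set it (values inferred from this diff; the "overload" semantics are what this PR adds):

```ini
; strict: anonymous users may not view anything
;REQUIRE_SIGNIN_VIEW = true
; block anonymous access to expensive pages outright
;REQUIRE_SIGNIN_VIEW = expensive
; rate-limit anonymous access to expensive pages when overloaded (this PR)
;REQUIRE_SIGNIN_VIEW = overload
; max concurrent anonymous requests to expensive pages; defaults to 4 * NumCPU
;OVERLOAD_INFLIGHT_ANONYMOUS_REQUESTS = 32
```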
7 changes: 6 additions & 1 deletion modules/setting/service.go
@@ -5,6 +5,7 @@ package setting

import (
"regexp"
"runtime"
"strings"
"time"

@@ -45,6 +46,8 @@ var Service = struct {
ShowMilestonesDashboardPage bool
RequireSignInViewStrict bool
BlockAnonymousAccessExpensive bool
BlockAnonymousAccessOverload bool
OverloadInflightAnonymousRequests int
EnableNotifyMail bool
EnableBasicAuth bool
EnablePasskeyAuth bool
@@ -164,10 +167,12 @@ func loadServiceFrom(rootCfg ConfigProvider) {
// boolean values are considered as "strict"
var err error
Service.RequireSignInViewStrict, err = sec.Key("REQUIRE_SIGNIN_VIEW").Bool()
Service.OverloadInflightAnonymousRequests = sec.Key("OVERLOAD_INFLIGHT_ANONYMOUS_REQUESTS").MustInt(4 * runtime.NumCPU())
if s := sec.Key("REQUIRE_SIGNIN_VIEW").String(); err != nil && s != "" {
// non-boolean value only supports "expensive" at the moment
Service.BlockAnonymousAccessExpensive = s == "expensive"
- if !Service.BlockAnonymousAccessExpensive {
Service.BlockAnonymousAccessOverload = s == "overload"
if !Service.BlockAnonymousAccessExpensive && !Service.BlockAnonymousAccessOverload {
log.Fatal("Invalid config option: REQUIRE_SIGNIN_VIEW = %s", s)
}
}
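The new option's default scales with the host's CPU count (`4 * runtime.NumCPU()`). A minimal sketch of the fallback behavior; `mustInt` here is a stand-in mimicking the ini library's `MustInt`, not Gitea's actual code:

```go
package main

import (
	"fmt"
	"runtime"
	"strconv"
)

// mustInt mimics the ini key's MustInt: parse the raw value,
// and fall back to the default when the value is empty or invalid.
func mustInt(raw string, def int) int {
	if n, err := strconv.Atoi(raw); err == nil {
		return n
	}
	return def
}

func main() {
	def := 4 * runtime.NumCPU() // the PR's default
	fmt.Println(mustInt("", def) == def) // unset key falls back to the default: true
	fmt.Println(mustInt("8", def))       // an explicit value wins: 8
}
```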
74 changes: 71 additions & 3 deletions routers/common/blockexpensive.go
@@ -6,27 +6,94 @@ package common
import (
"net/http"
"strings"
"sync/atomic"
"time"

user_model "code.gitea.io/gitea/models/user"
"code.gitea.io/gitea/modules/reqctx"
"code.gitea.io/gitea/modules/setting"
"code.gitea.io/gitea/modules/templates"
"code.gitea.io/gitea/modules/util"
"code.gitea.io/gitea/modules/web/middleware"
"code.gitea.io/gitea/services/context"

"github.com/go-chi/chi/v5"
lru "github.com/hashicorp/golang-lru/v2"
)

const tplStatus503RateLimit templates.TplName = "status/503_ratelimit"

type RateLimitToken struct {
RetryAfter time.Time
}

func BlockExpensive() func(next http.Handler) http.Handler {
- if !setting.Service.BlockAnonymousAccessExpensive {
if !setting.Service.BlockAnonymousAccessExpensive && !setting.Service.BlockAnonymousAccessOverload {
return nil
}

tokenCache, _ := lru.New[string, RateLimitToken](10000)

deferAnonymousRateLimitAccess := func(w http.ResponseWriter, req *http.Request) bool {
// * For a crawler: if it sees a 503 error, it will retry later (crawlers maintain their own queues), so there is still a chance for it to read all pages
// * For a real anonymous user: allocate a token and let the browser wait a while via JS (the request is queued by the browser)

const tokenCookieName = "gitea_arlt" // gitea anonymous rate limit token
cookieToken, _ := req.Cookie(tokenCookieName)
if cookieToken != nil && cookieToken.Value != "" {
token, exist := tokenCache.Get(cookieToken.Value)
A reviewer (Contributor) commented:
issue: this doesn't work when running multiple Gitea processes, perhaps to load balance or enable graceful restarts. Since this code is meant for instances that serve substantial public traffic, they are more likely to be running a multi process setup.

suggestion: leverage a signed token instead, perhaps a JWT. That avoids the need for an external store for these tokens. However, it also prevents easy revocation, so the token should likely carry an expiration as well. We should additionally embed the request's IP address in the JWT and check that it matches the current request's IP, because large, malicious crawlers often distribute workloads over many IPs to get around IP-based rate limits.

@wxiaoguang (author) replied on Apr 14, 2025:

There are far more problems with running Gitea in a cluster; see my comments in other issues (#13791).

For this problem: a cluster deployment would need Redis anyway.

if exist {
if time.Now().After(token.RetryAfter) {
// the wait has elapsed: consume the token and let the request through
tokenCache.Remove(cookieToken.Value)
return false
}
// RetryAfter has not been reached yet: remove the old token and fall through to allocate a new one
// TODO: in the future, we could do better to allow more accesses for the same token, or extend the expiration time if the access seems well-behaved
tokenCache.Remove(cookieToken.Value)
}
}

// TODO: merge the code with RenderPanicErrorPage
tmplCtx := context.TemplateContext{}
tmplCtx["Locale"] = middleware.Locale(w, req)
ctxData := middleware.GetContextData(req.Context())

tokenKey, _ := util.CryptoRandomString(32)
retryAfterDuration := 1 * time.Second
A reviewer (Contributor) commented:
issue: since this causes the client to retry, there is a possibility that a lot of clients are told to retry after a given time, and those retries all happen at approximately the same time, causing an outage.

suggestion: add some random jitter to this value so that clients don't all retry at the same time.

@wxiaoguang (author) replied:

No need: the clients' request times are already random, so their retries are naturally spread out.

token := RateLimitToken{RetryAfter: time.Now().Add(retryAfterDuration)}
tokenCache.Add(tokenKey, token)
ctxData["RateLimitTokenKey"] = tokenKey
ctxData["RateLimitCookieName"] = tokenCookieName
ctxData["RateLimitRetryAfterMs"] = retryAfterDuration.Milliseconds() + 100
_ = templates.HTMLRenderer().HTML(w, http.StatusServiceUnavailable, tplStatus503RateLimit, ctxData, tmplCtx)
return true
}

inflightRequestNum := atomic.Int32{}
return func(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
ret := determineRequestPriority(reqctx.FromContext(req.Context()))
if !ret.SignedIn {
- if ret.Expensive || ret.LongPolling {
- http.Redirect(w, req, setting.AppSubURL+"/user/login", http.StatusSeeOther)
if ret.LongPolling {
http.Error(w, "Long polling is not allowed for anonymous users", http.StatusForbidden)
return
}
if ret.Expensive {
inflightNum := inflightRequestNum.Add(1)
defer inflightRequestNum.Add(-1)

if setting.Service.BlockAnonymousAccessExpensive {
// strictly block the anonymous accesses to expensive pages, to save CPU
http.Redirect(w, req, setting.AppSubURL+"/user/login", http.StatusSeeOther)
return
} else if int(inflightNum) > setting.Service.OverloadInflightAnonymousRequests {
// be friendly to anonymous access (crawler, real anonymous user) to expensive pages, but limit the inflight requests
if deferAnonymousRateLimitAccess(w, req) {
return
}
}
}
}
next.ServeHTTP(w, req)
})
@@ -44,6 +111,7 @@ func isRoutePathExpensive(routePattern string) bool {
"/{username}/{reponame}/blame/",
"/{username}/{reponame}/commit/",
"/{username}/{reponame}/commits/",
"/{username}/{reponame}/compare/",
"/{username}/{reponame}/graph",
"/{username}/{reponame}/media/",
"/{username}/{reponame}/raw/",
15 changes: 15 additions & 0 deletions templates/status/503_ratelimit.tmpl
@@ -0,0 +1,15 @@
{{template "base/head" .}}
<div role="main" aria-label="{{.Title}}" class="page-content">
{{if .IsRepo}}{{template "repo/header" .}}{{end}}
<div class="ui container">
{{/*TODO: this page could be improved*/}}
Server is busy, loading .... or <a href="{{AppSubUrl}}/user/login">click here to sign in</a>.
A reviewer (Member) commented:

This string should be translated (i18n).

<script>
setTimeout(() => {
document.cookie = "{{.RateLimitCookieName}}={{.RateLimitTokenKey}}; path=/";
A reviewer (Contributor) commented:

issue: the path on this cookie can cause problems when setting.AppSubURL is not a default value

@wxiaoguang (author) replied:

Yes, some details still need to be improved, including the i18n and cookie management.
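As the cookie-path comment above notes, the cookie should be scoped to `setting.AppSubURL` rather than hardcoded to `/`. A server-side sketch of that fix (the `rateLimitCookie` helper is hypothetical, not part of this PR):

```go
package main

import (
	"fmt"
	"net/http"
)

// rateLimitCookie scopes the token cookie to the instance's sub-path,
// so an install served under e.g. "/gitea" sets the cookie correctly.
func rateLimitCookie(name, value, appSubURL string) *http.Cookie {
	path := appSubURL
	if path == "" {
		path = "/" // default installation at the domain root
	}
	return &http.Cookie{Name: name, Value: value, Path: path}
}

func main() {
	fmt.Println(rateLimitCookie("gitea_arlt", "tok", "/gitea").String())
}
```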

window.location.reload();
A reviewer (Contributor) commented:

issue: in the event of an overload where the clients execute JavaScript, this can still cause an outage. That could result from, say, a repo landing on the front page of a news aggregator and drawing a lot of human users, or from one of the many crawler frameworks that use headless browsers to automate crawling.

suggestion: move this logic into the QoS middleware instead, so it only activates when a low-priority request is rejected. On retry, promote the request's priority to default: that gives JavaScript-executing clients a better chance of getting through when non-JavaScript clients are crawling the site, while still providing a good experience for logged-in users even when a JavaScript-executing crawler has all of its requests classified as default.

suggestion(non-blocking): to further prioritize humans, we could leverage the existing captcha support (if the user has configured it) to further segment out bots. However, this might be quite a large addition.

@wxiaoguang (author) replied:

Yes, I also designed that approach, but I didn't integrate it with the QoS middleware since that code is your authorship.

}, {{.RateLimitRetryAfterMs}});
</script>
</div>
</div>
{{template "base/footer" .}}