bug: enabling health checks for the upstream and adjusting the weight while there is traffic causes error logs to be generated and may result in a 500 error #11897

@Lewisyixin


Current Behavior

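When a node's weight is modified while the upstream is receiving traffic, APISIX logs "failed to get health check target status ... err: target not found" for each node, followed by the warning "all upstream nodes is unhealthy, use default".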

Expected Behavior

No error logs are generated when node weights are changed.

Error Logs

2025/01/07 18:04:23 [error] 6714#6714: *276732 [lua] balancer.lua:83: fetch_health_nodes(): failed to get health check target status, addr: 10.58.94.168:80, host: nil, err: target not found, client: 127.0.0.1, server: _, request: "GET /l HTTP/1.1", host: "l.com"
2025/01/07 18:04:23 [error] 6714#6714: *276732 [lua] balancer.lua:83: fetch_health_nodes(): failed to get health check target status, addr: 10.58.32.145:80, host: nil, err: target not found, client: 127.0.0.1, server: _, request: "GET /l HTTP/1.1", host: "l.com"
2025/01/07 18:04:23 [warn] 6714#6714: *276732 [lua] balancer.lua:89: fetch_health_nodes(): all upstream nodes is unhealthy, use default, client: 127.0.0.1, server: _, request: "GET /l HTTP/1.1", host: "l.com"
2025/01/07 18:04:24 [error] 6716#6716: *276683 [lua] balancer.lua:83: fetch_health_nodes(): failed to get health check target status, addr: 10.58.32.145:80, host: nil, err: target not found, client: 127.0.0.1, server: _, request: "GET /l HTTP/1.1", host: "l.com"
2025/01/07 18:04:24 [error] 6716#6716: *276683 [lua] balancer.lua:83: fetch_health_nodes(): failed to get health check target status, addr: 10.58.94.168:80, host: nil, err: target not found, client: 127.0.0.1, server: _, request: "GET /l HTTP/1.1", host: "l.com"
2025/01/07 18:04:24 [warn] 6716#6716: *276683 [lua] balancer.lua:89: fetch_health_nodes(): all upstream nodes is unhealthy, use default, client: 127.0.0.1, server: _, request: "GET /l HTTP/1.1", host: "l.com"
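
To see how often each node is affected, the "target not found" errors can be tallied per address. This is only a rough sketch: the helper name `count_missing_targets` is hypothetical, and the log path in the usage comment is an assumption about where the error log lives.

```shell
#!/usr/bin/env bash
# Count "target not found" health check errors per upstream address.
# Reads log lines on stdin and prints "<count> <addr>", sorted by address.
# count_missing_targets is a hypothetical helper name, not part of APISIX.
count_missing_targets() {
  awk '/failed to get health check target status/ && /err: target not found/ {
         # Pull the value that follows "addr: " up to the next comma.
         if (match($0, /addr: [^,]+/)) {
           addr = substr($0, RSTART + 6, RLENGTH - 6)
           count[addr]++
         }
       }
       END { for (a in count) print count[a], a }' | sort -k2
}

# Usage (the log path is an assumption; adjust to your installation):
#   count_missing_targets < logs/error.log
```

A steadily growing count for every node while weights are being changed would match the behavior reported here.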

Steps to Reproduce

  1. Create a route and an upstream with two or more nodes. The upstream must have health checks enabled.
  2. Run a wrk command to send requests to the server continuously.
  3. Change a node's weight continuously (for example, by repeatedly running the following command):
    curl -XPATCH -s -H "x-api-key: $token" http://192.168.20.226:9180/apisix/admin/upstreams/547982406942985081 -d '{"nodes":{"10.58.32.145:80": 20}}'
  4. The error logs shown above appear.
  5. In extreme cases, the client receives a 500 Internal Server Error (this cannot be reproduced reliably).
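
Steps 2 and 3 can be combined into one script, roughly as sketched below. The Admin API URL, upstream ID, node address, and `$token` come from the command above. Everything else is an assumption for illustration: the wrk flags, the proxy address `127.0.0.1:9080`, the alternating weights 10 and 20, the iteration count, and the helper names. Nothing touches the network unless `RUN_REPRO=1` is set.

```shell
#!/usr/bin/env bash
# Sketch of steps 2-3: generate traffic while repeatedly changing a node weight.
ADMIN_URL="http://192.168.20.226:9180/apisix/admin/upstreams/547982406942985081"
NODE="10.58.32.145:80"

# Build the PATCH body for one node/weight pair.
build_payload() {
  printf '{"nodes":{"%s": %s}}' "$1" "$2"
}

# Alternate between two weights so that every PATCH really changes the upstream
# (the values 10 and 20 are arbitrary).
next_weight() {
  if [ "$1" -eq 20 ]; then echo 10; else echo 20; fi
}

# Send a bounded number of PATCH requests, flipping the weight each time.
patch_loop() {
  local weight=20
  local i
  for i in $(seq 1 600); do
    curl -XPATCH -s -H "x-api-key: $token" "$ADMIN_URL" \
      -d "$(build_payload "$NODE" "$weight")" > /dev/null
    weight=$(next_weight "$weight")
    sleep 0.1
  done
}

# Only run when explicitly requested.
if [ "${RUN_REPRO:-0}" = "1" ]; then
  # Assumed load: the proxy listens on 127.0.0.1:9080 and the route matches host l.com, path /l.
  wrk -t2 -c20 -d60s -H "Host: l.com" http://127.0.0.1:9080/l &
  patch_loop
  wait
fi
```

Run it with `RUN_REPRO=1 token=<admin key> bash repro.sh` (the file name is arbitrary) and watch the error log while it runs.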

Environment

  • APISIX version (run apisix version): 3.5
  • Operating system (run uname -a): 3.10.0-514.26.2.el7.x86_64
  • OpenResty / Nginx version (run openresty -V or nginx -V): openresty/1.21.4.2
  • etcd version, if relevant (run curl http://127.0.0.1:9090/v1/server_info): 3.5
  • APISIX Dashboard version, if relevant:
  • Plugin runner version, for issues related to plugin runners:
  • LuaRocks version, for installation issues (run luarocks --version): 2.3.0
