Reuse the same ID for both auth-less and auth-ful INVITEs #488
base: main
Conversation
inviteOKRetryAttemptsNoACK = 2
inviteOkAckLateTimeout     = inviteOkRetryIntervalMax

inviteCredentialValidity = 60 * time.Minute // Allow reuse of credentials for 1h
Not sure this is smart; we might want to extend this to max_call_duration or something.
Also, keep in mind that this is per Call-ID for now, so new calls would still need to re-auth.
tr := callTransportFromReq(req)
legTr := legTransportFromReq(req)
log := s.log.WithValues(
	"callID", callID,
The one log line that will not have callID now is "Bad request", when validation fails.
if err != nil {
	return nil, fmt.Errorf("invalid challenge %q: %w", challengeStr, err)
}
toHeader := resp.To()
Now that we're doing the right validation on the inbound side, the outbound-side E2E test caught this error!
But this also means our clients might run into the same issue, in case some of them are not spec-compliant.
pkg/sip/inbound.go (Outdated)
s.inProgressInvites[key] = is

go func() {
	time.Sleep(inviteCredentialValidity)
Better to avoid spawning goroutines that wait for the whole hour without a clear cancellation signal.
Usually you'd have one goroutine periodically cleaning the expired cache. time.AfterFunc could work too, but again, there's no real need to create thousands of timers all waiting for an hour.
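For illustration, a minimal sketch of the single-janitor approach, not the PR's implementation; the map and mutex names come from the diff, while the Server receiver and expiresAt field are assumptions:

// Hypothetical sketch: one goroutine sweeps expired invite state,
// instead of one sleeping goroutine per INVITE.
func (s *Server) sweepExpiredInvites(ctx context.Context) {
	ticker := time.NewTicker(time.Minute)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return // clear cancellation signal
		case now := <-ticker.C:
			s.imu.Lock()
			for key, is := range s.inProgressInvites {
				if now.After(is.expiresAt) { // expiresAt is assumed
					delete(s.inProgressInvites, key)
				}
			}
			s.imu.Unlock()
		}
	}
}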
pkg/sip/inbound.go (Outdated)
log: log,
s:   s,
id:  id,
id:  "unassigned",
Would be helpful if it logs/panics if something tries to read this unassigned ID.
I would rather not add shims or verification for this. Unless it's real clean, I'd rather leave it blank. Would that be better in your opinion?
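For context, the kind of guard discussed above could look roughly like this; a hypothetical sketch, with the inboundCall type and id field assumed from pkg/sip/inbound.go:

// Hypothetical guard: fail loudly if the placeholder ID is ever read.
func (c *inboundCall) ID() string {
	if c.id == "unassigned" {
		panic("inboundCall: ID read before assignment")
	}
	return c.id
}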
	req.To().Params.Add("tag", toTag)
}
inviteProgress := s.getInvite(sipCallID, toTag, fromTag)
callID := inviteProgress.lkCallID
No locking here. I assume our re-invite handling will catch a duplicate invite if it arrives at the same time?
That's right - locking is done in getInvite(), but we're using inviteProgress.lkCallID outside of it.
If two separate INVITEs come along and they're retransmissions (same Via branch), sipgo swallows them. If they're not retransmissions, they are separate invites and should be processed independently, which, due to the early to-tag generation here, they would be (different toTag = different map key).
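For reference, the map key in question looks roughly like this, pieced together from the hunks quoted in this PR:

// Sketch of the key: a different toTag (or fromTag) yields a different
// map entry, so non-retransmitted INVITEs are processed independently.
type dialogKey struct {
	sipCallID string
	toTag     string
	fromTag   string
}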
Force-pushed from 0658b5d to de0faf0
	toTag:   toTag,
	fromTag: fromTag,
}
s.imu.Lock()
A common practice to avoid lock contention is to use an RWMutex and do a two-stage lock:

s.imu.RLock()
is, ok := s.inProgressInvites[key]
s.imu.RUnlock()
if ok {
	return is
}
s.imu.Lock()
defer s.imu.Unlock()
// Re-check under the write lock: another goroutine may have created the state by now.
is, ok = s.inProgressInvites[key]
if ok {
	return is
}
// ... the rest ...

This allows multiple readers to get the invite state without blocking each other. Also notice that we redo the check after taking the write lock, since another routine might have created the state earlier.
imu                  sync.Mutex
inProgressInvites    map[dialogKey]*inProgressInvite
inviteTimeoutQueue   utils.TimeoutQueue[*dialogKey]
isCleanupTaskRunning atomic.Bool
Seems unused.
inProgressInvites  []*inProgressInvite
imu                sync.Mutex
inProgressInvites  map[dialogKey]*inProgressInvite
inviteTimeoutQueue utils.TimeoutQueue[*dialogKey]
Suggested change:
- inviteTimeoutQueue utils.TimeoutQueue[*dialogKey]
+ inviteTimeoutQueue utils.TimeoutQueue[dialogKey]
We probably don't need a pointer here.
s.inProgressInvites = append(s.inProgressInvites, is)
is = &inProgressInvite{sipCallID: sipCallID}
s.inProgressInvites[key] = is
s.inviteTimeoutQueue.Reset(&utils.TimeoutQueueItem[*dialogKey]{Value: &key})
Looks like we do not reset the cache expiry time if we have a cache hit. Is it intentional?
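If the expiry is meant to be a sliding window, a sketch of re-arming it on hits as well, assuming Reset re-arms the item's deadline:

is, ok := s.inProgressInvites[key]
if !ok {
	is = &inProgressInvite{sipCallID: sipCallID}
	s.inProgressInvites[key] = is
}
// Re-arm the expiry on hit and miss alike, so active dialogs stay cached.
s.inviteTimeoutQueue.Reset(&utils.TimeoutQueueItem[*dialogKey]{Value: &key})
return is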
inProgressInvites  []*inProgressInvite
imu                sync.Mutex
inProgressInvites  map[dialogKey]*inProgressInvite
inviteTimeoutQueue utils.TimeoutQueue[*dialogKey]
There's a convention in Go called a "mutex hat", which implies inviteTimeoutQueue is protected by imu.
But the queue implementation already has a mutex internally, so it might be worth moving it out of the imu group.
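A sketch of the suggested grouping (struct name assumed):

type Server struct {
	// imu is the "mutex hat": it guards the fields grouped under it.
	imu               sync.Mutex
	inProgressInvites map[dialogKey]*inProgressInvite

	// Synchronizes internally; kept outside the imu group on purpose.
	inviteTimeoutQueue utils.TimeoutQueue[dialogKey]
}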
Oh, this is neat, thank you for pointing it out!
See this thread for more details. Main point:
When we authenticate for the purposes of a SIP session (when credentials are not being cached), the first, auth-less INVITE is rejected with a 407 challenge and the client retries with credentials. We currently count the 407 response as a failed call, and we generate unique SCL_* ids for both INVITEs.
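Pieced together from the hunks above, the intended flow looks roughly like this (the ID generator is a placeholder, not the actual helper):

// First, auth-less INVITE: create (or fetch) the per-dialog state and
// assign a single SCL_* ID before answering with the 407 challenge.
inviteProgress := s.getInvite(sipCallID, toTag, fromTag)
if inviteProgress.lkCallID == "" {
	inviteProgress.lkCallID = newID("SCL_") // placeholder generator
}
callID := inviteProgress.lkCallID
// The authenticated retry carries the same dialog key, so getInvite
// returns the same state and the call keeps one SCL_* ID instead of
// the 407 round being counted as a separate failed call.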