Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

util/update_online: fix login times increased only per 2 days for online users 修正線上使用者的登入次數兩天僅增加一次的問題 #105

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

IepIweidieng
Copy link
Contributor

@IepIweidieng IepIweidieng commented Jun 16, 2021

  • fix problematic non-numlogindays++ condition (abbr. EX-COND below), which excluded users who have been online for exactly 86400 seconds.
    修正有問題的 numlogindays++ 排除條件式(以下簡稱 EX-COND)。它厡先誤排除了上線期間正好 86400 秒的使用者。

The problematic form of EX-COND urec.lastlogin >= base was introduced in 05bb3eb ("Change update_online to do same way as pwcu does.", 2014-03-26 UTC+8), which excluded all online users logged in at and after base (formerly 09:30).
05bb3eb 引入了 EX-COND 的有問題的形式 urec.lastlogin >= base,會排除在 base 所表示的時間點之上與之後登入的使用者。base 原爲 09:30。

f1328b3 ("Fix calculation error in update_online.", 2014-12-23 UTC+8) & d38b21d ("Fix update_online again, to avoid timezone problems.", 12-24) redefined base as time(0) - DAY_SECONDS.
Thus, EX-COND had become equivalent to urec.lastlogin + DAY_SECONDS >= now (c.f., urec.lastlogin + DAY_SECONDS > now) in 6871fbb ("Update using ulist."), 2014-03-25 UTC+8), i.e., all users whose online duration <= 1 day were excluded, which had made online users' login count sometimes increased only per 2 days.
f1328b3 & d38b21dbase 重新定義爲 time(0) - DAY_SECONDS,使 EX-COND 變得等價於 urec.lastlogin + DAY_SECONDS >= now(可比較 6871fbb 中的 urec.lastlogin + DAY_SECONDS > now),即排除所有上線期間 <= 1 天的使用者,造成使用者的登入次數有時每 2 天才增加 1 次。

  • make now the begin of the current minute
    now 設爲目前的分鐘的起點
  • make base calculated using now instead of calling time() independently
    now 計算 base 而不分別呼叫 time()

There have been efforts to deal with the non-exactness of EX-COND, e.g., 05702e5 ("update_online: prevent too verbose messages and allow some tolerance in duration.", 2014-03-25 UTC+8), which introduced a 1-hour tolerance but had soon been cancelled.
對於 EX-COND 不精準的問題,先前已有部份嘗試,例如 05702e5 引入了 1 小時的誤差容忍値,不過不久就被取消。

Now it is known that Cron can delay due to a high system load, so 05bb3eb had once allowed users to earn a login count of 2 per day by logging in at 00:00 and before update_online ran.
現在已知 Cron 可能會因爲系統負荷重而延後執行,由此可知 05bb3eb 曾讓使用者能在每天 00:00 且 update_online 執行前登入以在 1 天內取得 2 次登入。

Also, if there were significant delay between time() & fastcheck() (assumed to be due to attach_SHM() & context switches), base - now could have become < 1 day, which had made EX-COND have random tolerance sometimes and unfair.
此外,此前若 time()fastcheck() 的執行之間有明顯延誤(可能由 attach_SHM() 與 context switches 所造成),base - now 可 < 1 天,使 EX-COND 有時具有隨機的誤差容忍値而不公平。

@IepIweidieng IepIweidieng changed the title Fix undercounted login times (numlogindays) Fix undercounted login times (numlogindays) 修正登入日數 (numlogindays) 少算的問題 Nov 29, 2022
Copy link
Contributor

@hungte hungte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Increasing numlogindays at logout is a not a good idea because:

When the system needs to kick out everyone (for restart), that will introduce more dirty logs we have to flush, and that's a huge risk. In fact I think we should prevent all logout time pwcu update in the future.
Aborted sessions are sometimes more than expected.
"Login" is an active behavior definitely driven by the user but "logout" is not. This usually also leads to arguments.
As a result, the algorithm was implemented in the way you see today - number of "login" plus mitigation for people with persistent connections, but not really counting this number by "86400 seconds". That is why it is called "login days" instead of "online days".

Yes I understand this may be a little frustrating for people using persistent connections (I myself also used to keep the single connection for the whole week), but considering all the pros and cons, that's what we have for today. Yes there will be corner cases like your example, but if we look at the days before the algorithm was introduced, people in your examples will always only get +1.

Conclusion: we can still try to improve the algorithm (patches welcome), but never do "increase numlogindays at logout/exit".

mbbsd/passwd.c Outdated Show resolved Hide resolved
@IepIweidieng IepIweidieng force-pushed the fix-numlogindays-undercounted branch 3 times, most recently from cdd0aca to 68b798b Compare June 8, 2023 16:57
@IepIweidieng IepIweidieng requested a review from hungte June 8, 2023 17:20
@hungte
Copy link
Contributor

hungte commented Jun 9, 2023

suggest update_online to be run on the following timepoints every day:
00:00, 06:00, 12:00, & 18:00 (i.e., one run per 6 hours 也就是每隔 6 小時執行一次)

Yes increasing the frequency of update_online may reduce the issue, but running password check-like programs at rush hours (e.g, 0:00, 12:00, 18:00) is super dangerous on PTT since it will largely increase the risk of hitting passwd sync issues. That's why we previously tried to only do it on 4:00. Also there will still be people complaining about wrong calculation.

@IepIweidieng IepIweidieng changed the title Fix undercounted login times (numlogindays) 修正登入日數 (numlogindays) 少算的問題 util/update_online: fix login times increasing only per 2 days for online users 修正線上使用者的登入次數兩天僅增加一次的問題 Jun 9, 2023
@IepIweidieng IepIweidieng deleted the fix-numlogindays-undercounted branch December 16, 2023 11:10
@IepIweidieng IepIweidieng restored the fix-numlogindays-undercounted branch December 16, 2023 11:10
@IepIweidieng IepIweidieng reopened this Dec 16, 2023
@IepIweidieng IepIweidieng force-pushed the fix-numlogindays-undercounted branch from 68b798b to c4c2de0 Compare December 16, 2023 12:06
@IepIweidieng IepIweidieng changed the title util/update_online: fix login times increasing only per 2 days for online users 修正線上使用者的登入次數兩天僅增加一次的問題 util/update_online: fix login times increased only per 2 days for online users 修正線上使用者的登入次數兩天僅增加一次的問題 Dec 16, 2023
@IepIweidieng IepIweidieng force-pushed the fix-numlogindays-undercounted branch 3 times, most recently from f3e76ff to f8fb5d9 Compare December 16, 2023 12:16
@IepIweidieng
Copy link
Contributor Author

The scope of this pull request has been reduced to address only the problem that the login times increased only 1 per 2 days.
本拉取請求的影響範疇已限縮至僅針對登入次數每 2 天只增加 1 次的問題。

Discussions and potential solutions regarding other issues of the login time algorithm should now go to #104 or another pull request instead.
今後與登入次數演算法的其它方面相關的討論與潛在的解決方案應於 #104 或另外的拉取請求提出。

@IepIweidieng IepIweidieng force-pushed the fix-numlogindays-undercounted branch from f8fb5d9 to f239f85 Compare December 16, 2023 13:05
…ine users

* fix problematic non-`numlogindays++` condition (abbr. EX-COND below),
  which excluded users who have been online for exactly 86400 seconds.

The problematic form of EX-COND `urec.lastlogin >= base`
was introduced in 05bb3eb ("Change update_online to do same way as pwcu does.", 2014-03-26 UTC+8),
which excluded all online users logged in at and after `base` (formerly 09:30).

f1328b3 ("Fix calculation error in update_online.", 2014-12-23 UTC+8)
& d38b21d ("Fix update_online again, to avoid timezone problems.", 12-24)
redefined `base` as `time(0) - DAY_SECONDS`.
Thus, EX-COND had become equivalent to `urec.lastlogin + DAY_SECONDS >= now`
(c.f., `urec.lastlogin + DAY_SECONDS > now`) in 6871fbb ("Update using ulist."), 2014-03-25 UTC+8),
i.e., all users whose online duration <= 1 day were excluded,
which had made online users' login count sometimes increased only per 2 days.

* make `now` the begin of the current minute
* make `base` calculated using `now` instead of calling time() independently

There have been efforts to deal with the non-exactness of EX-COND,
e.g., 05702e5 ("update_online: prevent too verbose messages and allow some tolerance in duration.", 2014-03-25 UTC+8),
which introduced a 1-hour tolerance but had soon been cancelled.

Now it is known that Cron can delay due to a high system load,
so 05bb3eb had once allowed users to earn a login count of 2 per day
by logging in at 00:00 and before update_online ran.

Also, if there were significant delay between time() & fastcheck()
(assumed to be due to attach_SHM() & context switches),
`base - now` could have become < 1 day,
which had made EX-COND have random tolerance sometimes and unfair.
@IepIweidieng IepIweidieng force-pushed the fix-numlogindays-undercounted branch from f239f85 to c15a33d Compare December 17, 2023 11:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants