fix(zhihu): fix identity detection, comment, answer, and search#1207
fix(zhihu): fix identity detection, comment, answer, and search#1207Benjamin-eecs wants to merge 2 commits intojackwener:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Fixes the Zhihu adapter’s write/search functionality after recent Zhihu DOM/API changes, primarily by improving logged-in identity detection and switching comment/answer writes from UI automation to direct API calls.
Changes:
- Add a new fallback for resolving the current user slug from the header avatar
img.alt. - Replace UI-based comment/answer flows with cookie-authenticated API
POSTrequests. - Update search result filtering to handle mixed result types and fetch more items to compensate.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| clis/zhihu/write-shared.js | Adds header-avatar-alt-based identity detection fallback. |
| clis/zhihu/write-shared.test.js | Updates identity-resolution test scaffolding for the new selector. |
| clis/zhihu/comment.js | Switches comment creation to a cookie + API POST flow. |
| clis/zhihu/comment.test.js | Updates tests to validate API-based comment behavior and errors. |
| clis/zhihu/answer.js | Switches answer creation to a cookie + API POST flow. |
| clis/zhihu/answer.test.js | Updates tests to validate API-based answer behavior and errors. |
| clis/zhihu/search.js | Adjusts search filtering to accept real content types and increases fetch size. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const apiResult = await page.evaluate(`(async () => { | ||
| var targetKind = ${JSON.stringify(target.kind)}; | ||
| var targetId = ${JSON.stringify(target.id)}; | ||
| var content = ${JSON.stringify(payload)}; | ||
| var resourceType = targetKind === 'answer' ? 'answers' : 'articles'; | ||
| var url = 'https://www.zhihu.com/api/v4/' + resourceType + '/' + targetId + '/comments'; |
There was a problem hiding this comment.
For article targets, parseTarget() navigates to https://zhuanlan.zhihu.com/p/..., but the API call posts to https://www.zhihu.com/api/v4/articles/.... That’s a cross-origin fetch from the page context (zhuanlan → www) and can be blocked by CORS, causing the command to fail even with valid cookies. To avoid this, consider running the POST from a www.zhihu.com origin (e.g., navigate to https://www.zhihu.com before the API call and use a same-origin/relative /api/v4/... URL), or otherwise ensure the request is made in a way that isn’t subject to browser CORS restrictions.
| const apiResult = await page.evaluate(`(async () => { | |
| var targetKind = ${JSON.stringify(target.kind)}; | |
| var targetId = ${JSON.stringify(target.id)}; | |
| var content = ${JSON.stringify(payload)}; | |
| var resourceType = targetKind === 'answer' ? 'answers' : 'articles'; | |
| var url = 'https://www.zhihu.com/api/v4/' + resourceType + '/' + targetId + '/comments'; | |
| if (target.kind === 'article') { | |
| await page.goto('https://www.zhihu.com'); | |
| await page.wait(3); | |
| } | |
| const apiResult = await page.evaluate(`(async () => { | |
| var targetKind = ${JSON.stringify(target.kind)}; | |
| var targetId = ${JSON.stringify(target.id)}; | |
| var content = ${JSON.stringify(payload)}; | |
| var resourceType = targetKind === 'answer' ? 'answers' : 'articles'; | |
| var url = '/api/v4/' + resourceType + '/' + targetId + '/comments'; |
| var data = await resp.json(); | ||
| if (!resp.ok) return { ok: false, status: resp.status, message: data.error ? data.error.message : 'unknown error' }; | ||
| return { ok: true, id: data.id, url: data.url }; |
There was a problem hiding this comment.
Similar to answer.js, the evaluated API call does await resp.json() even when resp.ok is false. If Zhihu returns a non-JSON error body (HTML on robot check / edge CDN errors), this will throw inside page.evaluate and bypass your { ok:false, ... } handling. Consider parsing defensively (e.g., resp.text() + try { JSON.parse }) and returning a stable error shape.
| var data = await resp.json(); | |
| if (!resp.ok) return { ok: false, status: resp.status, message: data.error ? data.error.message : 'unknown error' }; | |
| return { ok: true, id: data.id, url: data.url }; | |
| var body = await resp.text(); | |
| var data = null; | |
| if (body) { | |
| try { | |
| data = JSON.parse(body); | |
| } | |
| catch (e) { | |
| data = null; | |
| } | |
| } | |
| if (!resp.ok) return { | |
| ok: false, | |
| status: resp.status, | |
| message: data && data.error && data.error.message | |
| ? data.error.message | |
| : data && data.message | |
| ? data.message | |
| : 'unknown error', | |
| }; | |
| return { ok: true, id: data && data.id, url: data && data.url }; |
| if (!apiResult?.ok) { | ||
| throw new CliError('COMMAND_EXEC', apiResult?.message || 'Failed to create comment'); | ||
| } |
There was a problem hiding this comment.
As with zhihu/question, consider mapping API 401/403 responses to AuthRequiredError (clis/zhihu/question.js:39-44) instead of always throwing COMMAND_EXEC. This will make expired/invalid sessions much easier for users to diagnose and fix.
| var avatarImgs = document.querySelectorAll('header img[alt*="\\u4e3b\\u9875"]'); | ||
| for (var ai = 0; ai < avatarImgs.length; ai++) { | ||
| var altMatch = (avatarImgs[ai].alt || '').match(/\\u70b9\\u51fb\\u6253\\u5f00(.+?)\\u7684\\u4e3b\\u9875/); | ||
| if (altMatch) return { slug: altMatch[1] }; | ||
| } |
There was a problem hiding this comment.
The new identity fallback that extracts the slug from the header avatar img.alt is a key behavior change but isn’t covered by tests. Consider adding a unit test in write-shared.test.js that stubs header img[alt*="主页"] with an alt like 点击打开<slug>的主页 and asserts resolveCurrentUserIdentity() returns that slug.
| var data = await resp.json(); | ||
| if (!resp.ok) return { ok: false, status: resp.status, message: data.error ? data.error.message : 'unknown error' }; |
There was a problem hiding this comment.
The in-page API call unconditionally does await resp.json(). If Zhihu returns non-JSON (e.g., HTML on a 403/5xx/robot check), this will throw inside page.evaluate and surface as an unhandled browser-eval error rather than a structured CLI error. Consider reading resp.text() and JSON-parsing defensively (or try/catch around resp.json()), returning a structured { ok:false, status, message } even when the body is not JSON.
| var data = await resp.json(); | |
| if (!resp.ok) return { ok: false, status: resp.status, message: data.error ? data.error.message : 'unknown error' }; | |
| var bodyText = await resp.text(); | |
| var data = null; | |
| try { | |
| data = bodyText ? JSON.parse(bodyText) : null; | |
| } | |
| catch (e) { | |
| data = null; | |
| } | |
| if (!resp.ok) { | |
| var errorMessage = data && data.error && data.error.message | |
| ? data.error.message | |
| : (bodyText ? bodyText.slice(0, 200) : ('Request failed with status ' + resp.status)); | |
| return { ok: false, status: resp.status, message: errorMessage }; | |
| } | |
| if (!data || typeof data !== 'object' || data.id == null) { | |
| return { ok: false, status: resp.status, message: bodyText ? bodyText.slice(0, 200) : 'Invalid API response' }; | |
| } |
| if (!apiResult?.ok) { | ||
| throw new CliError('COMMAND_EXEC', apiResult?.message || 'Failed to create answer'); | ||
| } |
There was a problem hiding this comment.
On API failure, this currently throws CliError('COMMAND_EXEC', ...) for all statuses. In the same adapter, zhihu/question maps 401/403 to AuthRequiredError (see clis/zhihu/question.js:39-44). It’d be more consistent/helpful to surface 401/403 from the answer-create API as AuthRequiredError (or another auth-specific error) so users get the correct remediation message when their session expires.
482152f to
b7b6ec0
Compare
b7b6ec0 to
1e99756
Compare
Identity detection: Zhihu removed __INITIAL_STATE__ and moved the
user avatar from a profile link into a button. Added fallback that
extracts the user slug from the header avatar alt text.
Comment and answer: Zhihu moved the comment editor into a Modal
and changed the submit button behavior, breaking the UI-based
write flow. Replaced with direct API calls (POST /api/v4/answers/
{id}/comments and POST /api/v4/questions/{id}/answers) which are
reliable and much simpler.
Search: Zhihu's search API now returns mixed result types (ads,
education, hot_timing) alongside search_result. Updated the filter
to select by object.type (answer/article/question) and increased
fetch size to compensate for non-content results.
Fixes jackwener#1198
Same DOM breakage as comment/answer. Replaced UI-based click flows with direct Zhihu API calls for all write commands.
1e99756 to
c941ff2
Compare
Description
Zhihu changed its page structure in several ways that broke all write commands and search. This PR fixes them.
cc @Astro-Han (original author of the zhihu write commands in #868)
Identity detection (#1198): Zhihu removed
__INITIAL_STATE__and moved the user avatar from<a href="/people/...">into a<button>. Added a fallback inwrite-shared.jsthat extracts the slug from the header avatar alt text.Write commands (comment, answer, like, follow, favorite): The previous UI-based flow relied on finding contenteditable editors, inserting text via
execCommand, and clicking submit buttons. Three independent Zhihu DOM changes broke this approach simultaneously:Modal-contentportal outside the answer DOM block, soscope.querySelectorAll('[contenteditable]')finds nothingdocument.execCommand('insertText'). the DOM updates but the internal ContentState stays empty, so the submit button stays disabledI tried fixing each layer individually (Modal scope fallback, container walk-up for submit button, CDP Input.insertText for Draft.js, CDP Input.dispatchMouseEvent for clicks) but the submit button still does not fire any network request on click. Zhihu appears to check
event.isTrustedor use an internal event bus that rejects programmatic interaction.The working alternative is Zhihu's REST API, which accepts the same cookies the browser session already has. All five write commands now call the API directly:
POST /api/v4/answers/{id}/commentsPOST /api/v4/questions/{id}/answersPOST /api/v4/answers/{id}/votersPOST /api/v4/questions/{id}/followersor/api/v4/members/{slug}/followersPOST /api/v4/favlists/{id}/itemsIdentity verification (
resolveCurrentUserIdentity) and the--executesafety gate are preserved. The API responses include created item IDs for verification.Search: The search API now returns mixed types (ads, education, hot_timing) alongside actual results. Updated the filter to select by
object.typeand increased the fetch size to compensate.Strategy changed from
UItoCOOKIEfor the five write commands since they no longer manipulate the DOM.Fixes #1198, fixes #1210
Type of Change
Checklist
Documentation (if adding/modifying an adapter)
N/A
Screenshots / Output
All 9 commands tested on macOS with v1.7.8:
hotsearchquestioncomment --executeanswer --executelike --executefollow --executefavorite --executedownload