Handling the case where get_long_weibo still fails after 5 retries #465

Open
MiSanl opened this issue Nov 2, 2024 · 1 comment

Comments

MiSanl commented Nov 2, 2024

    def get_long_weibo(self, id):
        """Fetch the full text of a long Weibo post."""
        for i in range(5):
            url = "https://m.weibo.cn/detail/%s" % id
            logger.info(f"""URL: {url} """)
            html = requests.get(url, headers=self.headers, verify=False).text
            # Cut the page down to the embedded JSON fragment that starts at "status".
            html = html[html.find('"status":') :]
            html = html[: html.rfind('"call"')]
            html = html[: html.rfind(",")]
            html = "{" + html + "}"
            js = json.loads(html, strict=False)
            weibo_info = js.get("status")
            if weibo_info:
                weibo = self.parse_weibo(weibo_info)
                return weibo
            # No "status" in the response: wait 6-10 seconds, then retry.
            sleep(random.randint(6, 10))

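As the code above shows, when the post still cannot be fetched after 5 retries, the failure is not recorded anywhere. Shouldn't it be written to the log so it can be checked later? Or is there handling logic that I have missed? Or is this simply not a core function? Judging by my log output, though, many URLs do appear 5 times in a row, especially at the start of each page.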

Owner

dataabc commented Nov 2, 2024

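The current logic is: if all 5 attempts fail, the incomplete Weibo content is used; otherwise the complete content is used. You are right that failures are not recorded. You can also handle this in the method yourself, i.e. on failure, write the id to a dedicated file.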
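
The suggestion above could be sketched roughly as follows. This is only an illustration, not the project's code: `fetch_with_retries`, the `fetch` callable, `failed_ids_path`, and the `delay` parameter are hypothetical names introduced here, and `fetch` stands in for one download-and-parse attempt that returns a falsy value when no `"status"` is found.

```python
import logging
import time

logger = logging.getLogger(__name__)


def fetch_with_retries(fetch, weibo_id, failed_ids_path, retries=5, delay=0.0):
    """Try fetch(weibo_id) up to `retries` times.

    Returns the first truthy result. If every attempt comes back empty,
    logs the failure, appends the id to `failed_ids_path` (hypothetical
    file for later checking), and returns None.
    """
    for attempt in range(1, retries + 1):
        result = fetch(weibo_id)
        if result:
            return result
        logger.warning("attempt %d/%d for id %s returned nothing",
                       attempt, retries, weibo_id)
        if attempt < retries:
            # The original method sleeps 6-10 seconds here; `delay` is an
            # assumed knob so the sketch can run without waiting.
            time.sleep(delay)

    # All attempts failed: record it instead of failing silently.
    logger.error("giving up on id %s after %d attempts", weibo_id, retries)
    with open(failed_ids_path, "a", encoding="utf-8") as f:
        f.write(f"{weibo_id}\n")
    return None
```

A caller could then keep the existing fallback, for example `weibo = fetch_with_retries(...) or short_weibo`, so the incomplete content is still used when the long version cannot be fetched, while the failed ids remain available in the file for a later re-run.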
