Handling get_long_weibo still failing after 5 retries #465

MiSanl · 2024-11-02T14:23:03Z

    def get_long_weibo(self, id):
        """Fetch the full text of a long weibo."""
        for i in range(5):
            url = "https://m.weibo.cn/detail/%s" % id
            logger.info(f"""URL: {url} """)
            html = requests.get(url, headers=self.headers, verify=False).text
            html = html[html.find('"status":') :]
            html = html[: html.rfind('"call"')]
            html = html[: html.rfind(",")]
            html = "{" + html + "}"
            js = json.loads(html, strict=False)
            weibo_info = js.get("status")
            if weibo_info:
                weibo = self.parse_weibo(weibo_info)
                return weibo
            sleep(random.randint(6, 10))

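As the code above shows, when all 5 retries fail to fetch the weibo, no error is recorded. Shouldn't the failure be written to the log so it can be checked later? Or is there handling logic somewhere that I missed? Or is this simply not a core function? Judging from my log output, though, there really are many URLs that get requested 5 times in a row, especially at the start of each page.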

dataabc · 2024-11-02T15:08:19Z

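The current logic is: if all 5 attempts fail, the incomplete weibo content is used; otherwise the full content is used. Failures are indeed not recorded. You can also handle this in the method yourself, i.e. if it fails, write the id to a dedicated file.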
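
One way the suggestion above could look is sketched below. This is only an illustration, not code from the project: `fetch_with_retries`, `record_failed_id`, and the file name `failed_long_weibo_ids.txt` are hypothetical names, and `fetch_once` stands in for a single download-and-parse attempt such as the body of the loop in `get_long_weibo`. The sketch also treats an exception as a failed attempt, which is a choice made here and not something the original method does.

```python
import logging
import random
from time import sleep

logger = logging.getLogger(__name__)

# Hypothetical file name, chosen for this sketch only.
FAILED_IDS_FILE = "failed_long_weibo_ids.txt"


def record_failed_id(weibo_id, path=FAILED_IDS_FILE):
    """Append one weibo id per line so failures can be reviewed later."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(f"{weibo_id}\n")


def default_wait():
    """Same pause as the original loop: 6 to 10 seconds."""
    sleep(random.randint(6, 10))


def fetch_with_retries(fetch_once, weibo_id, retries=5,
                       wait=default_wait, path=FAILED_IDS_FILE):
    """Call fetch_once(weibo_id) up to `retries` times.

    Returns the first truthy result. If every attempt fails, logs an
    error, records the id in `path`, and returns None so the caller
    can fall back to the incomplete content.
    """
    for attempt in range(1, retries + 1):
        try:
            result = fetch_once(weibo_id)
        except Exception as exc:  # e.g. a network or JSON parsing error
            logger.warning("attempt %d/%d for weibo %s raised: %s",
                           attempt, retries, weibo_id, exc)
            result = None
        if result:
            return result
        if attempt < retries:
            wait()  # no need to pause after the final attempt
    logger.error("long weibo %s not fetched after %d attempts",
                 weibo_id, retries)
    record_failed_id(weibo_id, path)
    return None
```

Returning `None` after the last attempt keeps the existing fallback behaviour described above, while the error log entry and the id file make the failed ids easy to find and re-check afterwards.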
