Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing 280 changed files with 11,639 additions and 13,152 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1 @@ | ||
2024-04-04 03:07:03 | ||
2024-04-04 09:09:34 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
{ | ||
"title": "Estimating Treatment Effects using Multiple Surrogates: The Role of the Surrogate Score and the Surrogate Index", | ||
"abstract": "arXiv:1603.09326v4 Announce Type: replace-cross Abstract: Estimating the long-term effects of treatments is of interest in many fields. A common challenge in estimating such treatment effects is that long-term outcomes are unobserved in the time frame needed to make policy decisions. One approach to overcome this missing data problem is to analyze treatments effects on an intermediate outcome, often called a statistical surrogate, if it satisfies the condition that treatment and outcome are independent conditional on the statistical surrogate. The validity of the surrogacy condition is often controversial. Here we exploit that fact that in modern datasets, researchers often observe a large number, possibly hundreds or thousands, of intermediate outcomes, thought to lie on or close to the causal chain between the treatment and the long-term outcome of interest. Even if none of the individual proxies satisfies the statistical surrogacy criterion by itself, using multiple proxies can be ", | ||
"link": "https://arxiv.org/abs/1603.09326", | ||
"context": "Title: Estimating Treatment Effects using Multiple Surrogates: The Role of the Surrogate Score and the Surrogate Index\nAbstract: arXiv:1603.09326v4 Announce Type: replace-cross Abstract: Estimating the long-term effects of treatments is of interest in many fields. A common challenge in estimating such treatment effects is that long-term outcomes are unobserved in the time frame needed to make policy decisions. One approach to overcome this missing data problem is to analyze treatments effects on an intermediate outcome, often called a statistical surrogate, if it satisfies the condition that treatment and outcome are independent conditional on the statistical surrogate. The validity of the surrogacy condition is often controversial. Here we exploit that fact that in modern datasets, researchers often observe a large number, possibly hundreds or thousands, of intermediate outcomes, thought to lie on or close to the causal chain between the treatment and the long-term outcome of interest. Even if none of the individual proxies satisfies the statistical surrogacy criterion by itself, using multiple proxies can be ", | ||
"path": "papers/16/03/1603.09326.json", | ||
"total_tokens": 835, | ||
"translated_title": "利用多个替代指标估计治疗效果:替代分数和替代指数的作用", | ||
"translated_abstract": "估计治疗效果长期作用是许多领域感兴趣的问题。 估计此类治疗效果的一个常见挑战在于长期结果在需要做出政策决策的时间范围内是未观察到的。 克服这种缺失数据问题的一种方法是分析治疗效果对中间结果的影响,通常称为统计替代指标,如果满足条件:在统计替代指标的条件下,治疗和结果是独立的。 替代条件的有效性经常是有争议的。 在现代数据集中,研究人员通常观察到大量中间结果,可能是数百个或数千个,被认为位于治疗和长期感兴趣的结果之间的因果链上或附近。 即使没有个别代理满足统计替代条件,使用多个代理也可以。", | ||
"tldr": "利用现代数据集中大量中间结果的事实,即使没有单个替代指标满足统计替代条件,使用多个替代指标也可能是有效的。", | ||
"en_tdlr": "Leveraging the presence of a large number of intermediate outcomes in modern datasets, using multiple surrogates can be effective even if none of them individually meets the statistical surrogate criterion." | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
{ | ||
"title": "Model-Based Reinforcement Learning for Atari", | ||
"abstract": "arXiv:1903.00374v5 Announce Type: replace Abstract: Model-free reinforcement learning (RL) can be used to learn effective policies for complex tasks, such as Atari games, even from image observations. However, this typically requires very large amounts of interaction -- substantially more, in fact, than a human would need to learn the same games. How can people learn so quickly? Part of the answer may be that people can learn how the game works and predict which actions will lead to desirable outcomes. In this paper, we explore how video prediction models can similarly enable agents to solve Atari games with fewer interactions than model-free methods. We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting. Our experiments evaluate SimPLe on a range of Atari games in low data regime of 100k", | ||
"link": "https://arxiv.org/abs/1903.00374", | ||
"context": "Title: Model-Based Reinforcement Learning for Atari\nAbstract: arXiv:1903.00374v5 Announce Type: replace Abstract: Model-free reinforcement learning (RL) can be used to learn effective policies for complex tasks, such as Atari games, even from image observations. However, this typically requires very large amounts of interaction -- substantially more, in fact, than a human would need to learn the same games. How can people learn so quickly? Part of the answer may be that people can learn how the game works and predict which actions will lead to desirable outcomes. In this paper, we explore how video prediction models can similarly enable agents to solve Atari games with fewer interactions than model-free methods. We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting. Our experiments evaluate SimPLe on a range of Atari games in low data regime of 100k", | ||
"path": "papers/19/03/1903.00374.json", | ||
"total_tokens": 884, | ||
"translated_title": "基于模型的强化学习在Atari中的应用", | ||
"translated_abstract": "无模型的强化学习(RL)可以用于从图像观察中学习有效的策略,例如Atari游戏,但通常需要非常大量的交互——实际上,远远超过人类学习相同游戏所需的数量。人们是如何如此快速学习的?答案的一部分可能是人们可以学习游戏运行的方式,并预测哪些动作会产生期望的结果。本文探讨了视频预测模型如何使代理能够在比无模型方法交互更少的情况下解决Atari游戏。我们描述了Simulated Policy Learning(SimPLe),这是一个基于视频预测模型的完整的基于模型的深度RL算法,并对几种模型体系结构进行了比较,包括一个在我们的情境中取得最佳结果的新颖结构。我们的实验评估了SimPLe在100k低数据条件下的一系列Atari游戏中的表现。", | ||
"tldr": "本研究探索了如何利用视频预测模型实现基于模型的深度RL算法SimPLe,在Atari游戏中比无模型方法更有效地解决问题,并通过实验验证了新颖模型体系结构在这一背景下取得最佳结果。", | ||
"en_tdlr": "This study explores how to use video prediction models to implement the model-based deep RL algorithm SimPLe, which solves Atari games more effectively than model-free methods, and experimentally validates the novel model architecture achieving the best results in this context." | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
{ | ||
"title": "Global Momentum Compression for Sparse Communication in Distributed Learning", | ||
"abstract": "arXiv:1905.12948v3 Announce Type: replace-cross Abstract: With the rapid growth of data, distributed momentum stochastic gradient descent~(DMSGD) has been widely used in distributed learning, especially for training large-scale deep models. Due to the latency and limited bandwidth of the network, communication has become the bottleneck of distributed learning. Communication compression with sparsified gradient, abbreviated as \\emph{sparse communication}, has been widely employed to reduce communication cost. All existing works about sparse communication in DMSGD employ local momentum, in which the momentum only accumulates stochastic gradients computed by each worker locally. In this paper, we propose a novel method, called \\emph{\\underline{g}}lobal \\emph{\\underline{m}}omentum \\emph{\\underline{c}}ompression~(GMC), for sparse communication. Different from existing works that utilize local momentum, GMC utilizes global momentum. Furthermore, to enhance the convergence performance when u", | ||
"link": "https://arxiv.org/abs/1905.12948", | ||
"context": "Title: Global Momentum Compression for Sparse Communication in Distributed Learning\nAbstract: arXiv:1905.12948v3 Announce Type: replace-cross Abstract: With the rapid growth of data, distributed momentum stochastic gradient descent~(DMSGD) has been widely used in distributed learning, especially for training large-scale deep models. Due to the latency and limited bandwidth of the network, communication has become the bottleneck of distributed learning. Communication compression with sparsified gradient, abbreviated as \\emph{sparse communication}, has been widely employed to reduce communication cost. All existing works about sparse communication in DMSGD employ local momentum, in which the momentum only accumulates stochastic gradients computed by each worker locally. In this paper, we propose a novel method, called \\emph{\\underline{g}}lobal \\emph{\\underline{m}}omentum \\emph{\\underline{c}}ompression~(GMC), for sparse communication. Different from existing works that utilize local momentum, GMC utilizes global momentum. Furthermore, to enhance the convergence performance when u", | ||
"path": "papers/19/05/1905.12948.json", | ||
"total_tokens": 837, | ||
"translated_title": "全局动量压缩用于分布式学习中的稀疏通信", | ||
"translated_abstract": "随着数据的快速增长,分布式动量随机梯度下降(DMSGD)在分布式学习中得到了广泛应用,特别是用于训练大规模深度模型。由于网络的延迟和带宽有限,通信成为分布式学习的瓶颈。使用稀疏梯度进行通信压缩,简称为“稀疏通信”,已被广泛应用以降低通信成本。所有关于DMSGD中稀疏通信的现有工作都使用本地动量,其中动量仅累积每个工作者在本地计算的随机梯度。在本文中,我们提出了一种新方法,称为\\emph{全局动量压缩}(GMC),用于稀疏通信。不同于现有工作中使用的局部动量,GMC使用全局动量。", | ||
"tldr": "本文提出了一种全局动量压缩(GMC)方法,用于稀疏通信,与现有的局部动量方法不同,GMC利用全局动量来提高分布式学习性能。", | ||
"en_tdlr": "This paper introduces a novel method called Global Momentum Compression (GMC) for sparse communication in distributed learning, which improves performance by utilizing global momentum instead of local momentum used in existing works." | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
{ | ||
"title": "Simulation-based reinforcement learning for real-world autonomous driving", | ||
"abstract": "arXiv:1911.12905v4 Announce Type: replace-cross Abstract: We use reinforcement learning in simulation to obtain a driving system controlling a full-size real-world vehicle. The driving policy takes RGB images from a single camera and their semantic segmentation as input. We use mostly synthetic data, with labelled real-world data appearing only in the training of the segmentation network. Using reinforcement learning in simulation and synthetic data is motivated by lowering costs and engineering effort. In real-world experiments we confirm that we achieved successful sim-to-real policy transfer. Based on the extensive evaluation, we analyze how design decisions about perception, control, and training impact the real-world performance.", | ||
"link": "https://arxiv.org/abs/1911.12905", | ||
"context": "Title: Simulation-based reinforcement learning for real-world autonomous driving\nAbstract: arXiv:1911.12905v4 Announce Type: replace-cross Abstract: We use reinforcement learning in simulation to obtain a driving system controlling a full-size real-world vehicle. The driving policy takes RGB images from a single camera and their semantic segmentation as input. We use mostly synthetic data, with labelled real-world data appearing only in the training of the segmentation network. Using reinforcement learning in simulation and synthetic data is motivated by lowering costs and engineering effort. In real-world experiments we confirm that we achieved successful sim-to-real policy transfer. Based on the extensive evaluation, we analyze how design decisions about perception, control, and training impact the real-world performance.", | ||
"path": "papers/19/11/1911.12905.json", | ||
"total_tokens": 700, | ||
"translated_title": "模拟强化学习用于现实世界自动驾驶", | ||
"translated_abstract": "我们利用模拟强化学习来获得控制全尺寸真实世界车辆的驾驶系统。驾驶策略以来自单个摄像头的RGB图像及其语义分割作为输入。我们主要使用合成数据,只有在分割网络的训练中才出现标记的真实世界数据。在真实世界实验中,我们确认实现了成功的模拟到真实策略转移。基于广泛的评估,我们分析了关于感知、控制和训练的设计决策如何影响真实世界性能。", | ||
"tldr": "该论文利用模拟强化学习和合成数据来实现对真实世界自动驾驶系统的控制,成功实现了模拟到真实策略转移,并分析了设计决策对真实世界性能的影响。", | ||
"en_tdlr": "This paper utilizes simulation-based reinforcement learning and synthetic data to control real-world autonomous driving systems, successfully achieving sim-to-real policy transfer and analyzing the impact of design decisions on real-world performance." | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
{ | ||
"title": "On robust fundamental theorems of asset pricing in discrete time", | ||
"abstract": "arXiv:2007.02553v4 Announce Type: replace Abstract: This paper is devoted to a study of robust fundamental theorems of asset pricing in discrete time and finite horizon settings. Uncertainty is modelled by a (possibly uncountable) family of price processes on the same probability space. Our technical assumption is the continuity of the price processes with respect to uncertain parameters. In this setting, we introduce a new topological framework which allows us to use the classical arguments in arbitrage pricing theory involving $L^p$ spaces, the Hahn-Banach separation theorem and other tools from functional analysis. The first result is the equivalence of a ``no robust arbitrage\" condition and the existence of a new ``robust pricing system\". The second result shows superhedging dualities and the existence of superhedging strategies without restrictive conditions on payoff functions, unlike other related studies. The third result discusses completeness in the present robust setting. W", | ||
"link": "https://arxiv.org/abs/2007.02553", | ||
"context": "Title: On robust fundamental theorems of asset pricing in discrete time\nAbstract: arXiv:2007.02553v4 Announce Type: replace Abstract: This paper is devoted to a study of robust fundamental theorems of asset pricing in discrete time and finite horizon settings. Uncertainty is modelled by a (possibly uncountable) family of price processes on the same probability space. Our technical assumption is the continuity of the price processes with respect to uncertain parameters. In this setting, we introduce a new topological framework which allows us to use the classical arguments in arbitrage pricing theory involving $L^p$ spaces, the Hahn-Banach separation theorem and other tools from functional analysis. The first result is the equivalence of a ``no robust arbitrage\" condition and the existence of a new ``robust pricing system\". The second result shows superhedging dualities and the existence of superhedging strategies without restrictive conditions on payoff functions, unlike other related studies. The third result discusses completeness in the present robust setting. W", | ||
"path": "papers/20/07/2007.02553.json", | ||
"total_tokens": 902, | ||
"translated_title": "论离散时间中资产定价的鲁棒基本定理", | ||
"translated_abstract": "本文致力于研究离散时间和有限时间跨度下资产定价的鲁棒基本定理。不确定性通过同一概率空间上的(可能是不可数的)一族价格过程建模。我们的技术假设是价格过程对于不确定参数是连续的。在这一环境中,我们引入了一个新的拓扑框架,使我们能够使用涉及 $L^p$ 空间、Hahn-Banach 分离定理以及其他来自泛函分析的工具的经典套利定价理论论证。第一个结果是“无鲁棒套利”条件与存在新的“鲁棒定价体系”之间的等价性。第二个结果展示了双向套期保值和在支付函数上不存在限制性条件的套期保值策略的存在,这与其他相关研究不同。第三个结果讨论了当前鲁棒设置中的完备性。", | ||
"tldr": "本文研究了离散时间下资产定价的鲁棒基本定理,引入了新的拓扑框架,创新性地证明了“无鲁棒套利”条件与“鲁棒定价体系”的等价性,展示了双向套期保值以及完备性。", | ||
"en_tdlr": "This paper investigates the robust fundamental theorems of asset pricing in discrete time, introduces a new topological framework, innovatively proves the equivalence between the \"no robust arbitrage\" condition and the \"robust pricing system\", demonstrates superhedging dualities and completeness in the present robust setting." | ||
} |
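The five records added above share one shape: ten keys, with `path` apparently following `papers/YY/MM/<arXiv id>.json` derived from `link`. As a minimal sketch for reviewing such files, assuming that pattern holds for every record (the commit does not state it), the check below flags missing keys and mismatched paths. The names `REQUIRED_KEYS`, `expected_path`, and `check_record` are hypothetical helpers, not part of the repository.

```python
# Hypothetical review helper; the key set and path pattern are inferred
# from the records shown in this diff, not from any documented schema.
REQUIRED_KEYS = {
    "title", "abstract", "link", "context", "path",
    "total_tokens", "translated_title", "translated_abstract",
    "tldr", "en_tdlr",
}


def expected_path(link: str) -> str:
    """Derive the assumed repository path from an arXiv abs link.

    Only handles new-style identifiers such as 1603.09326.
    """
    arxiv_id = link.rstrip("/").rsplit("/", 1)[-1]  # e.g. "1603.09326"
    yymm = arxiv_id.split(".", 1)[0]                # e.g. "1603"
    return f"papers/{yymm[:2]}/{yymm[2:]}/{arxiv_id}.json"


def check_record(record: dict) -> list:
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    missing = REQUIRED_KEYS - record.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if "link" in record and "path" in record:
        want = expected_path(record["link"])
        if record["path"] != want:
            problems.append(f"path {record['path']!r} does not match {want!r}")
    return problems
```

A record loaded with `json.load` can be passed straight to `check_record`; an empty list means the record matches the assumed shape.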