Commit 00e496e

Update datasets chapter
1 parent ab7c18c commit 00e496e

61 files changed

+1051
-659
lines changed

01 Cloud Platform/99 API Reference/08 Downloading Data/00.json

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -2,7 +2,7 @@
22
"type" : "landing",
33
"heading" : "API Reference",
44
"subHeading" : "",
5-
"content" : "<p>The QuantConnect REST API lets you generate links to download data from our cloud servers through URL endpoints. To use the CLI to download data, see <a href='/docs/v2/lean-cli/datasets/downloading-quantconnect-data'>Downloading QuantConnect Data</a>.</p>",
5+
"content" : "<p>The QuantConnect REST API lets you generate links to download data from our cloud servers through URL endpoints. To use the CLI to download data, see <a href='/docs/v2/lean-cli/datasets/downloading-data'>Downloading Data</a>.</p>",
66
"alsoLinks" : [
77
],
88
"featureShortDescription": {

05 Lean CLI/00.html

Lines changed: 1 addition & 1 deletion
@@ -85,7 +85,7 @@ <h4>Initialization</h4>
8585
</div>
8686
</div>
8787

88-
<div class="internal-link" onclick="window.location.href = '/docs/v2/lean-cli/datasets/local-data'">
88+
<div class="internal-link" onclick="window.location.href = '/docs/v2/lean-cli/datasets/format-and-storage'">
8989
<div class="internal-link-icon">
9090
<svg xmlns="http://www.w3.org/2000/svg" width="27" height="30" viewBox="0 0 27 30"><g transform="translate(-2 -1)"><path d="M12.5-1C19.209-1,26,.786,26,4.2S19.209,9.4,12.5,9.4-1,7.614-1,4.2,5.791-1,12.5-1Zm0,8.4a27.855,27.855,0,0,0,8.52-1.178C23.372,5.432,24,4.579,24,4.2s-.628-1.232-2.98-2.022A27.855,27.855,0,0,0,12.5,1,27.855,27.855,0,0,0,3.98,2.178C1.628,2.968,1,3.821,1,4.2s.628,1.232,2.98,2.022A27.855,27.855,0,0,0,12.5,7.4Z" transform="translate(3 2)" fill="#8f9ca3"/><path d="M15.5,17.2a29.9,29.9,0,0,1-9.173-1.281C2.751,14.716,2,13.124,2,12a1,1,0,0,1,2,0c0,.511.917,1.335,2.965,2.024A27.888,27.888,0,0,0,15.5,15.2a27.888,27.888,0,0,0,8.535-1.176C26.083,13.335,27,12.511,27,12a1,1,0,0,1,2,0c0,1.124-.751,2.716-4.327,3.919A29.9,29.9,0,0,1,15.5,17.2Z" transform="translate(0 4)" fill="#8f9ca3"/><path d="M15.5,29.8a29.9,29.9,0,0,1-9.173-1.281C2.751,27.316,2,25.724,2,24.6V5A1,1,0,0,1,4,5V24.6c0,.511.917,1.335,2.965,2.024A27.888,27.888,0,0,0,15.5,27.8a27.888,27.888,0,0,0,8.535-1.176C26.083,25.935,27,25.111,27,24.6V5a1,1,0,0,1,2,0V24.6c0,1.124-.751,2.716-4.327,3.919A29.9,29.9,0,0,1,15.5,29.8Z" transform="translate(0 1.2)" fill="#8f9ca3"/></g></svg>
9191
</div>

05 Lean CLI/01 Key Concepts/01 Getting Started/04 Basic Usage.html

Lines changed: 1 addition & 1 deletion
@@ -42,7 +42,7 @@ <h4>Pull Projects From the Cloud</h4>
4242
<h4>Source Data</h4>
4343
<p>
4444
Run <code>lean data generate --start 20180101 --symbol-count 100</code> to generate realistic fake market data to test with.
45-
You can also choose to <a href="/docs/v2/lean-cli/datasets/downloading-quantconnect-data">download data</a> from QuantConnect Datasets, or to convert your own data into LEAN-compatible data.
45+
You can also choose to <a href="/docs/v2/lean-cli/datasets/downloading-data">download data</a> from QuantConnect Datasets or convert your own data into LEAN-compatible data.
4646
</p>
4747

4848
<div class="cli section-example-container">
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
<p>LEAN developed an open format based on open protocols (CSV, ZIP). The data format is simply compressed on disk.</p>
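
As a rough illustration of the "CSV inside ZIP" idea in the added line above, the following Python sketch writes a small zipped CSV to memory and reads it back with the standard library. The member name `spy.csv` and the column layout are assumptions made up for this example; they are not taken from this commit and may not match LEAN's actual schema.

```python
import csv
import io
import zipfile


def read_zipped_csv(zip_bytes: bytes, member: str) -> list[list[str]]:
    """Return the rows of one CSV member stored inside a ZIP archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        with archive.open(member) as raw:
            # Wrap the binary stream so csv.reader receives text.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.reader(text))


def make_example_zip() -> bytes:
    """Build a tiny in-memory ZIP holding one CSV file (hypothetical layout)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Hypothetical row: timestamp, open, high, low, close, volume.
        archive.writestr("spy.csv", "20230103 00:00,100,101,99,100,1000\n")
    return buffer.getvalue()


rows = read_zipped_csv(make_example_zip(), "spy.csv")
```

Because only `zipfile` and `csv` are involved, the same approach should work for inspecting any archive of this kind without LEAN-specific tooling.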

05 Lean CLI/05 Datasets/01 Local Data/05 Other Data Sources.html renamed to 05 Lean CLI/05 Datasets/01 Format and Storage/05 Other Data Sources.html

Lines changed: 1 addition & 1 deletion
@@ -4,6 +4,6 @@
44
</p>
55

66
<p>
7-
For development purposes, it is also possible to <a href="/docs/v2/lean-cli/datasets/generating-random-data">generate data</a> using the CLI.
7+
For development purposes, it is also possible to <a href="/docs/v2/lean-cli/datasets/generating-data">generate data</a> using the CLI.
88
This generator uses a <a href="https://en.wikipedia.org/wiki/Brownian_model_of_financial_markets" target="_blank">Brownian motion model</a> to generate realistic market data, which might be helpful when you're testing strategies locally but don't have access to real market data.
99
</p>
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
1+
{
2+
"type": "metadata",
3+
"values": {
4+
"description": "LEAN developed an open format based on open protocols (CSV, ZIP). The data format is simply compressed on disk.",
5+
"keywords": "csv files, zip compression, algo trading data structure, lean data formats",
6+
"og:description": "LEAN developed an open format based on open protocols (CSV, ZIP). The data format is simply compressed on disk.",
7+
"og:title": "Format and Storage - Documentation QuantConnect.com",
8+
"og:type": "website",
9+
"og:site_name": "Format and Storage - QuantConnect.com",
10+
"og:image": "https://cdn.quantconnect.com/docs/i/lean-cli/datasets/format-and-storage.png"
11+
}
12+
}

05 Lean CLI/05 Datasets/01 Local Data/01 Introduction.html

Lines changed: 0 additions & 5 deletions
This file was deleted.

05 Lean CLI/05 Datasets/01 Local Data/04 QuantConnect Datasets.html

Lines changed: 0 additions & 8 deletions
This file was deleted.

05 Lean CLI/05 Datasets/01 Local Data/metadata.json

Lines changed: 0 additions & 12 deletions
This file was deleted.
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
1+
{
2+
"type" : "landing",
3+
"heading" : "Downloading Data",
4+
"subHeading" : "",
5+
"content" : "<p>Running the LEAN engine locally with the CLI requires you to have your own local data. The recommended way to source local data is by purchasing and downloading it from our Dataset Market. This way, you can use the same data that is available to your algorithm when you run it in the cloud. There are two methods of downloading data. You can download smaller discrete datasets at a low cost or download complete collections in bulk to avoid selection bias.</p>",
6+
"alsoLinks" : [
7+
{ "text" : "Datasets" , "href" : "/docs/v2/writing-algorithms/datasets/overview" }
8+
],
9+
"featureShortDescription": {
10+
"01" : "Low cost option for individual tickers",
11+
"02" : "All tickers to avoid selection bias"
12+
}
13+
}
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
<p>Downloading data by ticker is a low-cost way to acquire local trading data. This technique is for smaller, discrete downloads instead of downloading an entire dataset. You can download our ticker data through the CLI or LEAN in exchange for some <a href='https://www.quantconnect.com/docs/v2/cloud-platform/organizations/credit'>QuantConnect Credit</a> (QCC). Before the CLI or LEAN downloads new files, it checks whether your local machine already stores them.</p>
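
The paragraph above says that files already stored locally are checked before anything new is downloaded. A minimal Python sketch of that kind of check is shown here. The function name `files_still_needed` and the `aapl.zip` path are hypothetical, and this is not the CLI's actual implementation; only the `equity/usa/daily/spy.zip` path appears in this commit.

```python
import tempfile
from pathlib import Path


def files_still_needed(data_dir: Path, relative_paths: list[str]) -> list[str]:
    """Return the requested files that are not yet stored under data_dir."""
    return [p for p in relative_paths if not (data_dir / p).is_file()]


with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    # Pretend one file was downloaded earlier.
    stored = data_dir / "equity/usa/daily/spy.zip"
    stored.parent.mkdir(parents=True)
    stored.touch()

    needed = files_still_needed(
        data_dir,
        ["equity/usa/daily/spy.zip", "equity/usa/daily/aapl.zip"],
    )
```

Under these assumptions, only the file that is missing on disk remains in `needed`, so a repeated request would not pay for the same file twice.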
Lines changed: 183 additions & 0 deletions
@@ -0,0 +1,183 @@
1+
<p>
2+
Follow these steps to purchase and download data from the Dataset Market using the CLI:
3+
</p>
4+
5+
<ol>
6+
<li><a href="/docs/v2/lean-cli/initialization/authenticating-accounts#02-Log-In">Log in</a> to the CLI if you haven't done so already.</li>
7+
<li>Open a terminal in one of your <a href='/docs/v2/lean-cli/initialization/workspace'>workspace directories</a>.</li>
8+
<li>Run <code>lean data download</code> to start an interactive downloading wizard.</li>
9+
<li>Go through the interactive wizard to configure the data you want to download.</li>
10+
<li>After configuring the data you want to download, enter <span class='key-combinations'>N</span> when asked whether you want to download more data.
11+
<div class="cli section-example-container">
12+
<pre>$ lean data download
13+
Selected data:
14+
...
15+
Total price: 10 QCC
16+
Organization balance: 20,096,264 QCC
17+
Do you want to download more data? [y/N]: N</pre>
18+
</div>
19+
</li>
20+
<li>In the browser window that opens, read the <span class='document-title'>CLI API Access and Data Agreement</span> and click <span class='button-name'>I Accept The Data Agreement</span>.</li>
21+
<li>Go back to the terminal and confirm the purchase to start downloading.
22+
<div class="cli section-example-container">
23+
<pre>$ lean data download
24+
You will be charged 10 QCC from your organization's QCC balance
25+
After downloading all files your organization will have 20,096,254 QCC left
26+
Continue? [y/N]: y
27+
[1/1] Downloading equity/usa/daily/spy.zip (10 QCC)</pre>
28+
</div>
29+
</li>
30+
</ol>
31+
32+
<p>Many of the datasets depend on the <a href='https://www.quantconnect.com/datasets/quantconnect-us-equity-security-master'>US Equity Security Master</a> dataset because it contains information on splits, dividends, and symbol changes. To check if a dataset depends on the US Equity Security Master, see its listing in the <a href='https://www.quantconnect.com/datasets/'>Dataset Market</a>. If you try to download a dataset through the CLI that depends on the US Equity Security Master and you don't have an active subscription to it, you'll get an error.</p>
33+
34+
<p>Follow these steps to download US Equity data from the Dataset Market:</p>
35+
36+
<ol>
37+
<li>Open a terminal in one of your <a href='/docs/v2/lean-cli/initialization/workspace'>workspace directories</a>.</li>
38+
<li>Run <code>lean data download</code> to start an interactive downloading wizard and then enter the dataset category number.
39+
<div class="cli section-example-container">
40+
<pre>$ lean data download
41+
Select a category:
42+
1) Commerce Data (3 datasets)
43+
2) Consumer Data (2 datasets)
44+
3) Energy Data (2 datasets)
45+
4) Environmental Data (1 dataset)
46+
5) Financial Market Data (30 datasets)
47+
6) Industry Data (1 dataset)
48+
7) Legal Data (2 datasets)
49+
8) News and Events (4 datasets)
50+
9) Political Data (2 datasets)
51+
10) Transport and Logistics Data (1 dataset)
52+
11) Web Data (9 datasets)
53+
Enter an option: 5</pre>
54+
</div>
55+
</li>
56+
<li>Enter the dataset number.
57+
<div class="cli section-example-container">
58+
<pre>$ lean data download
59+
Select a dataset:
60+
1) Binance Crypto Future Margin Rate Data
61+
2) Binance Crypto Future Price Data
62+
3) Binance Crypto Price Data
63+
4) Binance US Crypto Price Data
64+
5) Bitcoin Metadata
65+
6) Bitfinex Crypto Price Data
66+
7) Brain Language Metrics on Company Filings
67+
8) Brain ML Stock Ranking
68+
9) CFD Data
69+
10) Coinbase Crypto Price Data
70+
11) Composite Factor Bundle
71+
12) Cross Asset Model
72+
13) FOREX Data
73+
14) Insider Trading
74+
15) Kraken Crypto Price Data
75+
16) NFT Sales
76+
17) Tactical
77+
18) Template (Do not Edit)
78+
19) US Coarse Universe
79+
20) US Congress Trading
80+
21) US ETF Constituents
81+
22) US Equities
82+
23) US Equity Options
83+
24) US Equity Security Master
84+
25) US Federal Reserve (FRED)
85+
26) US Futures Security Master
86+
27) US SEC Filings
87+
28) US Treasury Yield Curve
88+
29) VIX Central Contango
89+
30) VIX Daily Price
90+
Enter an option: 22</pre>
91+
</div>
92+
93+
<p>If you don't have an active subscription to the US Equity Security Master, you'll get the following error message:</p>
94+
<div class='error-messages'>Your organization needs to have an active Security Master subscription to download data from the 'US Equities' dataset<br>You can add the subscription at https://www.quantconnect.com/datasets/quantconnect-security-master/pricing</div>
95+
</li>
96+
97+
<li>Enter the data type.
98+
<div class="cli section-example-container">
99+
<pre>$ lean data download
100+
Data type:
101+
1) Trade
102+
2) Quote
103+
3) Bulk
104+
Enter an option: 1</pre>
105+
</div>
106+
</li>
107+
108+
<li>Enter the ticker(s).
109+
<div class="cli section-example-container">
110+
<pre>$ lean data download
111+
Ticker(s) (comma-separated): SPY</pre>
112+
</div>
113+
</li>
114+
115+
<li>Enter the resolution.
116+
<div class="cli section-example-container">
117+
<pre>$ lean data download
118+
Resolution:
119+
1) Tick
120+
2) Second
121+
3) Minute
122+
4) Hour
123+
5) Daily
124+
Enter an option: 3</pre>
125+
</div>
126+
</li>
127+
128+
<li>If you selected tick, second, or minute resolution in the previous step, enter the start and end date.
129+
<div class="cli section-example-container">
130+
<pre>$ lean data download
131+
Start date (yyyyMMdd): 20230101
132+
End date (yyyyMMdd): 20230105</pre>
133+
</div>
134+
</li>
135+
136+
137+
<li>Review your selected data and enter whether you would like to download more data.
138+
<div class="cli section-example-container">
139+
<pre>$ lean data download
140+
Selected data:
141+
┌─────────────┬──────────┬────────────────────────┬────────────┬────────┐
142+
│ Dataset │ Vendor │ Details │ File count │ Price │
143+
├─────────────┼──────────┼────────────────────────┼────────────┼────────┤
144+
│ US Equities │ AlgoSeek │ Data type: Trade │ 3 │ 15 QCC │
145+
│ │ │ Ticker: SPY │ │ │
146+
│ │ │ Resolution: Minute │ │ │
147+
│ │ │ Start date: 2023-01-01 │ │ │
148+
│ │ │ End date: 2023-01-05 │ │ │
149+
└─────────────┴──────────┴────────────────────────┴────────────┴────────┘
150+
Total price: 15 QCC
151+
Organization balance: 100 QCC
152+
Do you want to download more data? [y/N]: n</pre>
153+
</div>
154+
</li>
155+
156+
157+
<li>If you haven't already agreed to the <span class='document-title'>CLI API Access and Data Agreement</span>, in the browser window that opens, read the document and click <span class='button-name'>I Accept The Data Agreement</span>.</li>
158+
159+
<li>Confirm your data purchase.
160+
<div class="cli section-example-container">
161+
<pre>$ lean data download
162+
Data Terms of Use has been signed previously.
163+
Find full agreement at: https://www.quantconnect.com/terms/data/?organization=&lt;organizationId&gt;
164+
==========================================================================
165+
CLI API Access Agreement: On 2022-04-12 22:34:26 You Agreed:
166+
- Display or distribution of data obtained through CLI API Access is not permitted.
167+
- Data and Third Party Data obtained via CLI API Access can only be used for individual or internal employee's use.
168+
- Data is provided in LEAN format can not be manipulated for transmission or use in other applications.
169+
- QuantConnect is not liable for the quality of data received and is not responsible for trading losses.
170+
==========================================================================
171+
172+
You will be charged 15 QCC from your organization's QCC balance
173+
After downloading all files your organization will have 85 QCC left
174+
Continue? [y/N]: y</pre>
175+
</div>
176+
</li>
177+
178+
<p>After you confirm your data purchase, the CLI downloads the data and prints its progress.</p>
179+
<div class="cli section-example-container">
180+
<pre>$ lean data download
181+
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% (3/3)</pre>
182+
</div>
183+
</ol>
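
The wizard steps above take dates in yyyyMMdd form and report the balance left after the purchase. The following Python sketch mirrors that arithmetic with the values from the example session (100 QCC balance, 15 QCC price). The function names are hypothetical, and refusing a purchase that exceeds the balance is an assumption rather than documented CLI behavior.

```python
from datetime import date, datetime


def parse_wizard_date(text: str) -> date:
    """Parse a date typed in the wizard's yyyyMMdd format, e.g. 20230101."""
    return datetime.strptime(text, "%Y%m%d").date()


def balance_after_purchase(balance_qcc: int, price_qcc: int) -> int:
    """Return the remaining QCC balance after a purchase.

    Raising on an insufficient balance is an assumption for this sketch.
    """
    if price_qcc > balance_qcc:
        raise ValueError("insufficient QCC balance")
    return balance_qcc - price_qcc


# Values taken from the example session above.
start = parse_wizard_date("20230101")
end = parse_wizard_date("20230105")
remaining = balance_after_purchase(balance_qcc=100, price_qcc=15)
```

A check like `start <= end` before starting the wizard may catch a swapped date range early, before any QCC is spent.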
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
<p>To view all of the supported datasets, see the <a href='https://www.quantconnect.com/datasets'>Dataset Market</a>.</p>
