Commit 43d77af

jonesbp authored and jbolda committed
Limit concurrency of remote file requests (#50)
* Limit concurrency of remote file requests

  Use bluebird’s Promise.map() with concurrency option to limit HTTP requests when creating remote nodes for file attachments in order to avoid being rate-limited by Airtable’s HTTP servers.
* Require bluebird for map and allow concurrency to be set as option
* Document concurrency option for Attachment fileNode downloads
* Update README.md

  Tighten language in documentation
* Tighten language in comment on concurrency default
1 parent 2672c18 commit 43d77af

3 files changed: +22 -15 lines changed

README.md

Lines changed: 2 additions & 0 deletions
@@ -108,6 +108,8 @@ For an example of a markdown-and-airtable-driven site using `gatsby-transformer-
 
 If you are using the `Attachment` type field in Airtable, you may specify a column name with `fileNode` and the plugin will bring in these files. Using this method, it will create "nodes" for each of the files and expose this to all of the transformer plugins. A good use case for this would be attaching images in Airtable, and being able to make these available for use with the `sharp` plugins and `gatsby-image`. Specifying a `fileNode` does require a peer dependency of `gatsby-source-filesystem` otherwise it will fall back as a non-mapped field. The locally available files and any ecosystem connections will be available on the node as `localFiles`.
 
+When using the Attachment type field, this plugin governs requests to download the associated files from Airtable to 5 concurrent requests to prevent excessive requests on Airtable's servers - which can result in refused / hanging connections. You can adjust this limit with the concurrency option in your gatsby-config.js file. Set the option with an integer value for your desired limit on attempted concurrent requests. A value of 0 will allow requests to be made without any limit.
+
 ### The power of views
 
 Within Airtable, every table can have one or more named Views. These Views are a convenient way to pre-filter and sort your data before querying it in Gatsby. If you do not specify a view in your table object, raw data will be returned in no particular order.
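
As a rough illustration of the README change above, the new option might be set in `gatsby-config.js` along these lines. This is a sketch under assumptions: the plugin name `gatsby-source-airtable`, the environment variable name, the placeholder base and table values, and the `Attachments` column name are not taken from this commit; only `apiKey`, `tables`, `concurrency`, `baseId`, `tableName`, `mapping`, and `fileNode` appear in the diff.

```javascript
// gatsby-config.js (sketch; plugin name and placeholder values are assumptions)
const airtableOptions = {
  apiKey: process.env.AIRTABLE_API_KEY, // hypothetical environment variable name
  concurrency: 2, // allow at most 2 attachment downloads at once; 0 removes the limit
  tables: [
    {
      baseId: "YOUR_BASE_ID", // placeholder
      tableName: "YOUR_TABLE_NAME", // placeholder
      mapping: { Attachments: "fileNode" }, // hypothetical Attachment column name
    },
  ],
};

module.exports = {
  plugins: [{ resolve: "gatsby-source-airtable", options: airtableOptions }],
};
```

Leaving `concurrency` out entirely falls back to the default of 5 that this commit sets in `gatsby-node.js`.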

gatsby-node.js

Lines changed: 18 additions & 14 deletions
@@ -1,10 +1,11 @@
 const Airtable = require("airtable");
 const crypto = require(`crypto`);
 const { createRemoteFileNode } = require(`gatsby-source-filesystem`);
+const { map } = require('bluebird');
 
 exports.sourceNodes = async (
   { actions, createNodeId, store, cache },
-  { apiKey, tables }
+  { apiKey, tables, concurrency }
 ) => {
   // tables contain baseId, tableName, tableView, queryName, mapping, tableLinks
   const { createNode, setPluginStatus } = actions;
@@ -31,6 +32,14 @@ exports.sourceNodes = async (
     return;
   }
 
+  if (concurrency === undefined) {
+    // Airtable hasn't documented what the rate limit against their attachment servers is.
+    // They do document that API calls are limited to 5 requests/sec, so the default limit of 5 concurrent
+    // requests for remote files has been selected in that spirit. A higher value can be set as a plugin
+    // option in gatsby-config.js
+    concurrency = 5;
+  }
+
   console.time(`\nfetch all Airtable rows from ${tables.length} tables`);
 
   let queue = [];
@@ -120,7 +129,10 @@ exports.sourceNodes = async (
     }
   });
 
-  let childNodes = allRows.map(async row => {
+  // Use the map function for arrays of promises imported from Bluebird.
+  // Using the concurrency option protects against being blocked from Airtable's
+  // file attachment servers for large numbers of requests.
+  return map(allRows, async row => {
     // don't love mutating the row here, but
     // not ready to refactor yet to clean this up
     // (happy to take a PR!)
@@ -153,19 +165,11 @@ exports.sourceNodes = async (
     };
 
     createNode(node);
-    return processedData.childNodes;
-  });
 
-  let flattenedChildNodes = await Promise.all(childNodes).then(nodes =>
-    nodes.reduce(
-      (accumulator, currentValue) => accumulator.concat(currentValue),
-      []
-    )
-  );
-
-  return Promise.all(flattenedChildNodes).then(nodes => {
-    nodes.forEach(node => createNode(node));
-  });
+    await Promise.all(processedData.childNodes).then(nodes => {
+      nodes.forEach(node => createNode(node));
+    });
+  }, { concurrency: concurrency });
 };
 
 const processData = async (
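
To make the effect of the `concurrency` option in the `gatsby-node.js` change above more concrete, here is a minimal, dependency-free sketch of a concurrency-limited map. The helper `mapWithLimit`, the `fakeDownload` function, and `demo` are hypothetical names introduced for illustration; this is not Bluebird's implementation and not code from the commit, only one way the general idea could be expressed.

```javascript
// Hypothetical helper: runs `mapper` over `items` with at most `limit`
// calls in flight at any moment. Illustration only, not Bluebird's code.
async function mapWithLimit(items, mapper, limit) {
  const results = new Array(items.length);
  let next = 0;
  // Assumption: a limit of 0 (or less) means "no limit", mirroring the README text.
  const workerCount = limit > 0 ? Math.min(limit, items.length) : items.length;

  async function worker() {
    // Each worker repeatedly claims the next unprocessed index.
    while (next < items.length) {
      const index = next++; // claimed synchronously, so no two workers share an index
      results[index] = await mapper(items[index], index);
    }
  }

  await Promise.all(Array.from({ length: workerCount }, worker));
  return results;
}

// Usage sketch: simulate slow downloads and record the peak number in flight.
async function demo() {
  let inFlight = 0;
  let peak = 0;
  const fakeDownload = async (n) => {
    inFlight++;
    peak = Math.max(peak, inFlight);
    await new Promise((resolve) => setTimeout(resolve, 5)); // stand-in for an HTTP request
    inFlight--;
    return n * 2;
  };
  const doubled = await mapWithLimit([1, 2, 3, 4, 5, 6, 7], fakeDownload, 5);
  return { doubled, peak };
}
```

The worker-pool shape keeps results in input order while never starting more than `limit` mapper calls at once, which is the property the commit relies on to avoid flooding Airtable's attachment servers.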

package.json

Lines changed: 2 additions & 1 deletion
@@ -19,7 +19,8 @@
     "airtable"
   ],
   "dependencies": {
-    "airtable": "^0.5.6"
+    "airtable": "^0.5.6",
+    "bluebird": "^3.5.4"
   },
   "peerDependencies": {
     "gatsby-source-filesystem": ">=2.0.0-rc.0"
