From 10bc4b6e9aa46c54d2485aca523b62ebcc3a0ebc Mon Sep 17 00:00:00 2001
From: Bruno Oliveira <nicoddemus@gmail.com>
Date: Tue, 8 Dec 2015 20:08:46 -0200
Subject: [PATCH 1/5] First version of architecture overview doc

---
 OVERVIEW.md | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)
 create mode 100644 OVERVIEW.md

diff --git a/OVERVIEW.md b/OVERVIEW.md
new file mode 100644
index 00000000..552fb332
--- /dev/null
+++ b/OVERVIEW.md
@@ -0,0 +1,66 @@
+# Overview #
+
+Here it is described a brief overview of xdist's internal architecture.
+
+
+`xdist` works by spawning one or more **worker nodes**, which are controlled
+by the **master node**. Each **worker node** is responsible for performing 
+a full test collection and afterwards running tests as dictated by the **master node**.
+
+The execution flow is:
+
+1. **master node** spawns one or more **worker nodes** at the begginning of
+   the test session. The communication between **master** and **worker** nodes makes use of 
+   [execnet](http://codespeak.net/execnet/) and its [gateways](http://codespeak.net/execnet/basics.html#gateways-bootstrapping-python-interpreters).
+   The actual interpreters executing the code for the **worker nodes** might
+   be remote or local. 
+  
+1. Each **worker node** itself is a mini pytest runner. **workers** at this
+   point perform a full test collection, sending back the collected 
+   test-ids back to the **master node** which does not
+   perform any collection itself.
+     
+1. The **master node** receives the result of the collection from all nodes.
+   At this point the **master node** performs some sanity check to ensure that
+   all **worker nodes** collected the same tests (including order), bailing out otherwise.
+   If all is well, it converts the list of test-ids into a list of simple
+   indexes, where each index corresponds to the position of that test in the
+   original collection list. This works because all nodes have the same 
+   collection list, and saves bandwidth because the **master** can now tell
+   one of the workers to just *execute test index 3* index of passing the
+   full test id.
+   
+1. If **dist-mode** is **each**: the **master node** just sends the full list
+   of test indexes to each node at this moment.
+   
+1. If **dist-mode** is **load**: the **master node** takes around 25% of the
+   tests and sends them one by one to each **worker node** in a round robin
+   fashion. The rest of the tests will be distributed later as **worker nodes**
+   finish tests (see below).
+   
+1. **worker nodes** re-implement `pytest_runtestloop`: pytest's default implementation
+   basically loops over all collected items in the `session` object and executes
+   the `pytest_runtest_protocol` for each test item, but in xdist **workers** sit idly 
+   waiting for **master node** to send tests for execution. As tests are
+   received by **workers**, `pytest_runtest_protocol` is executed for each test. 
+   Here it worth noting an implementation detail: at least one
+   test is kept always in **worker nodes** must they comply with 
+   `pytest_runtest_protocol` in that it needs to know which will be the 
+   `nextitem` in the hook call: either a new test in case the **master node** sends
+   a new test, or `None` if the **worker** receives a "shutdown" request.
+   
+1. As tests are started and completed at the **workers**, the results are sent
+   back to the **master node**, which then just forwards the results to 
+   the appropriate pytest hooks: `pytest_runtest_logstart` and 
+   `pytest_runtest_logreport`. This way other plugins (for example `junitxml`)
+   can work normally. The **master node** (when in dist-mode **load**) 
+   decides to send more tests to a node when a test completes, using
+   some heuristics such as test durations and how many tests each **worker node**
+   still has to run.
+   
+1. When the **master node** has no more pending tests it will
+   send a "shutdown" signal to all **workers**, which will then run their 
+   remaining tests to completion and shut down. At this point the 
+   **master node** will sit waiting for **workers** to shut down, still
+   processing events such as `pytest_runtest_logreport`.
+ 
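
Note on steps 3 to 5 of the overview added by this patch: the sanity check, the switch to indexes, and the initial **load** distribution can be sketched roughly as below. This is a minimal illustration, not xdist's actual scheduler. The function names, the `gw0`-style worker names, and the "at least one test per worker" floor are assumptions made for the example; the overview only says "around 25%".

```python
from itertools import cycle


def check_collections(collections):
    """Return the shared test-id list, or raise if any worker disagrees.

    `collections` maps a worker name to the ordered list of test ids it
    collected (hypothetical structure, for illustration only).
    """
    reference = next(iter(collections.values()))
    for worker, test_ids in collections.items():
        if test_ids != reference:  # order matters, so compare lists directly
            raise RuntimeError(f"{worker} collected different tests")
    return reference


def initial_load_batches(num_tests, workers, fraction=0.25):
    """Hand out roughly `fraction` of the test indexes in round-robin order.

    Returns (assigned, pending): the indexes given to each worker, and the
    indexes kept back for distribution as workers finish tests.
    """
    indexes = list(range(num_tests))
    # Assumption: give every worker at least one test when possible.
    initial = min(num_tests, max(len(workers), int(num_tests * fraction)))
    assigned = {worker: [] for worker in workers}
    for worker, index in zip(cycle(workers), indexes[:initial]):
        assigned[worker].append(index)
    return assigned, indexes[initial:]
```

With 16 collected tests and two workers, `initial_load_batches(16, ["gw0", "gw1"])` assigns indexes 0 and 2 to `gw0`, 1 and 3 to `gw1`, and keeps indexes 4 to 15 pending. Only integers cross the wire, which is the bandwidth saving step 3 describes.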

From b29ff737811a93f62b3d1e3f7bdf0af5818d47b5 Mon Sep 17 00:00:00 2001
From: Bruno Oliveira <nicoddemus@gmail.com>
Date: Wed, 9 Dec 2015 19:00:20 -0200
Subject: [PATCH 2/5] Apply small review requests

* Fixed typo
* Removed superfluous introduction
---
 OVERVIEW.md | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/OVERVIEW.md b/OVERVIEW.md
index 552fb332..25626959 100644
--- a/OVERVIEW.md
+++ b/OVERVIEW.md
@@ -1,15 +1,12 @@
 # Overview #
 
-Here it is described a brief overview of xdist's internal architecture.
-
-
 `xdist` works by spawning one or more **worker nodes**, which are controlled
 by the **master node**. Each **worker node** is responsible for performing 
 a full test collection and afterwards running tests as dictated by the **master node**.
 
 The execution flow is:
 
-1. **master node** spawns one or more **worker nodes** at the begginning of
+1. **master node** spawns one or more **worker nodes** at the beginning of
    the test session. The communication between **master** and **worker** nodes makes use of 
    [execnet](http://codespeak.net/execnet/) and its [gateways](http://codespeak.net/execnet/basics.html#gateways-bootstrapping-python-interpreters).
    The actual interpreters executing the code for the **worker nodes** might

From c62effdd793133afa5b12109677701ae60f7928f Mon Sep 17 00:00:00 2001
From: Bruno Oliveira <nicoddemus@gmail.com>
Date: Thu, 10 Dec 2015 19:55:31 -0200
Subject: [PATCH 3/5] Reword reason why workers must keep a single test on
 queue always

---
 OVERVIEW.md | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/OVERVIEW.md b/OVERVIEW.md
index 25626959..62cb4d1c 100644
--- a/OVERVIEW.md
+++ b/OVERVIEW.md
@@ -40,11 +40,12 @@ The execution flow is:
    the `pytest_runtest_protocol` for each test item, but in xdist **workers** sit idly 
    waiting for **master node** to send tests for execution. As tests are
    received by **workers**, `pytest_runtest_protocol` is executed for each test. 
-   Here it worth noting an implementation detail: at least one
-   test is kept always in **worker nodes** must they comply with 
-   `pytest_runtest_protocol` in that it needs to know which will be the 
-   `nextitem` in the hook call: either a new test in case the **master node** sends
-   a new test, or `None` if the **worker** receives a "shutdown" request.
+   Here it worth noting an implementation detail: **workers** always must keep at 
+   least one test item on their queue due to how the `pytest_runtest_protocol(item, nextitem)` 
+   hook is defined: to pass `nextitem` to the hook, the worker must wait for more
+   instructions from the master before executing that remaining test. If it receives more tests,
+   then it can safely call `pytest_runtest_protocol` because it knows what the `nextitem` parameter will be. 
+   If it receives a "shutdown" signal, then it can execute the hook passing `nextitem` as `None`. 
    
 1. As tests are started and completed at the **workers**, the results are sent
    back to the **master node**, which then just forwards the results to 
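
Note on the paragraph reworded by this patch: the "keep one test on the queue" rule can be sketched as a loop that never runs an item until it knows what comes next. This is a simplified illustration, not the real worker code. `SHUTDOWN`, `worker_loop`, and `run_protocol` are hypothetical names, and a plain `queue.Queue` stands in for the execnet channel.

```python
from queue import Queue

# Hypothetical sentinel standing in for the master's "shutdown" signal.
SHUTDOWN = object()


def worker_loop(channel, run_protocol):
    """Run items from `channel`, holding one back until `nextitem` is known."""
    current = channel.get()
    while current is not SHUTDOWN:
        # Block until the master sends either another test or a shutdown.
        upcoming = channel.get()
        nextitem = None if upcoming is SHUTDOWN else upcoming
        # Stand-in for calling pytest_runtest_protocol(item, nextitem).
        run_protocol(current, nextitem)
        current = upcoming
```

Feeding the queue `"a"`, `"b"`, `"c"` and then `SHUTDOWN` calls `run_protocol` with `("a", "b")`, `("b", "c")` and finally `("c", None)`: the last test only runs once the shutdown signal arrives.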

From 1716767a1c49dc12e1c42cfdc49ecf5d04aca3b0 Mon Sep 17 00:00:00 2001
From: Bruno Oliveira <nicoddemus@gmail.com>
Date: Thu, 10 Dec 2015 20:00:43 -0200
Subject: [PATCH 4/5] Drop "node" from "workers" and "master"

---
 OVERVIEW.md | 42 +++++++++++++++++++++---------------------
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/OVERVIEW.md b/OVERVIEW.md
index 62cb4d1c..15fc1f68 100644
--- a/OVERVIEW.md
+++ b/OVERVIEW.md
@@ -1,25 +1,25 @@
 # Overview #
 
-`xdist` works by spawning one or more **worker nodes**, which are controlled
-by the **master node**. Each **worker node** is responsible for performing 
-a full test collection and afterwards running tests as dictated by the **master node**.
+`xdist` works by spawning one or more **workers**, which are controlled
+by the **master**. Each **worker** is responsible for performing 
+a full test collection and afterwards running tests as dictated by the **master**.
 
 The execution flow is:
 
-1. **master node** spawns one or more **worker nodes** at the beginning of
+1. **master** spawns one or more **workers** at the beginning of
    the test session. The communication between **master** and **worker** nodes makes use of 
    [execnet](http://codespeak.net/execnet/) and its [gateways](http://codespeak.net/execnet/basics.html#gateways-bootstrapping-python-interpreters).
-   The actual interpreters executing the code for the **worker nodes** might
+   The actual interpreters executing the code for the **workers** might
    be remote or local. 
   
-1. Each **worker node** itself is a mini pytest runner. **workers** at this
+1. Each **worker** itself is a mini pytest runner. **workers** at this
    point perform a full test collection, sending back the collected 
-   test-ids back to the **master node** which does not
+   test-ids back to the **master** which does not
    perform any collection itself.
      
-1. The **master node** receives the result of the collection from all nodes.
-   At this point the **master node** performs some sanity check to ensure that
-   all **worker nodes** collected the same tests (including order), bailing out otherwise.
+1. The **master** receives the result of the collection from all nodes.
+   At this point the **master** performs some sanity check to ensure that
+   all **workers** collected the same tests (including order), bailing out otherwise.
    If all is well, it converts the list of test-ids into a list of simple
    indexes, where each index corresponds to the position of that test in the
    original collection list. This works because all nodes have the same 
@@ -27,18 +27,18 @@ The execution flow is:
    one of the workers to just *execute test index 3* index of passing the
    full test id.
    
-1. If **dist-mode** is **each**: the **master node** just sends the full list
+1. If **dist-mode** is **each**: the **master** just sends the full list
    of test indexes to each node at this moment.
    
-1. If **dist-mode** is **load**: the **master node** takes around 25% of the
-   tests and sends them one by one to each **worker node** in a round robin
-   fashion. The rest of the tests will be distributed later as **worker nodes**
+1. If **dist-mode** is **load**: the **master** takes around 25% of the
+   tests and sends them one by one to each **worker** in a round robin
+   fashion. The rest of the tests will be distributed later as **workers**
    finish tests (see below).
    
-1. **worker nodes** re-implement `pytest_runtestloop`: pytest's default implementation
+1. **workers** re-implement `pytest_runtestloop`: pytest's default implementation
    basically loops over all collected items in the `session` object and executes
    the `pytest_runtest_protocol` for each test item, but in xdist **workers** sit idly 
-   waiting for **master node** to send tests for execution. As tests are
+   waiting for **master** to send tests for execution. As tests are
    received by **workers**, `pytest_runtest_protocol` is executed for each test. 
    Here it worth noting an implementation detail: **workers** always must keep at 
    least one test item on their queue due to how the `pytest_runtest_protocol(item, nextitem)` 
@@ -48,17 +48,17 @@ The execution flow is:
    If it receives a "shutdown" signal, then it can execute the hook passing `nextitem` as `None`. 
    
 1. As tests are started and completed at the **workers**, the results are sent
-   back to the **master node**, which then just forwards the results to 
+   back to the **master**, which then just forwards the results to 
    the appropriate pytest hooks: `pytest_runtest_logstart` and 
    `pytest_runtest_logreport`. This way other plugins (for example `junitxml`)
-   can work normally. The **master node** (when in dist-mode **load**) 
+   can work normally. The **master** (when in dist-mode **load**) 
    decides to send more tests to a node when a test completes, using
-   some heuristics such as test durations and how many tests each **worker node**
+   some heuristics such as test durations and how many tests each **worker**
    still has to run.
    
-1. When the **master node** has no more pending tests it will
+1. When the **master** has no more pending tests it will
    send a "shutdown" signal to all **workers**, which will then run their 
    remaining tests to completion and shut down. At this point the 
-   **master node** will sit waiting for **workers** to shut down, still
+   **master** will sit waiting for **workers** to shut down, still
    processing events such as `pytest_runtest_logreport`.
  

From 0a5bdfcda52b04cd4a8fa9cbaa3f685b7832124d Mon Sep 17 00:00:00 2001
From: Bruno Oliveira <nicoddemus@gmail.com>
Date: Thu, 10 Dec 2015 20:01:06 -0200
Subject: [PATCH 5/5] Add a FAQ section

---
 OVERVIEW.md | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/OVERVIEW.md b/OVERVIEW.md
index 15fc1f68..25bb76f4 100644
--- a/OVERVIEW.md
+++ b/OVERVIEW.md
@@ -62,3 +62,15 @@ The execution flow is:
    **master** will sit waiting for **workers** to shut down, still
    processing events such as `pytest_runtest_logreport`.
  
+## FAQ ##
+
+> Why does each worker do its own collection, as opposed to having 
+the master collect once and distribute from that collection to the workers?
+
+If collection were performed by the master, it would have to
+serialize the collected items to send them over the wire, as workers live in another process.
+The problem is that test items are not easy (perhaps impossible) to serialize, as they contain references to
+the test functions, fixture managers, config objects, etc. Even if one managed to serialize them,
+it would be very hard to get right and easy to break with any small change in pytest.
+  
+
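
Note on the FAQ answer above: the serialization problem can be illustrated with a toy object, assuming `pickle` as the serializer. `FakeItem` and `make_item` are hypothetical stand-ins, not pytest classes. Real test items hold far more references than this, so the sketch only shows the general kind of failure.

```python
import pickle


class FakeItem:
    """Toy stand-in for a collected test item (not a real pytest class)."""

    def __init__(self, nodeid, func):
        self.nodeid = nodeid
        self.func = func  # direct reference to the test function


def make_item():
    def test_local():  # nested function: pickle cannot look it up by name
        assert True

    return FakeItem("test_mod.py::test_local", test_local)


item = make_item()
try:
    pickle.dumps(item)
    item_is_picklable = True
except (pickle.PicklingError, AttributeError, TypeError):
    # The exact exception type varies between Python versions.
    item_is_picklable = False

# The plain test id, by contrast, serializes without trouble.
restored_id = pickle.loads(pickle.dumps(item.nodeid))
```

Sending ids (or indexes into a list every worker already has) avoids the problem entirely, which is why each worker performs its own collection.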