# Update markdown formatting (mostly for commands) #16


Benchmarking different Python HTTP clients, using a small, high-performance sample API.

## Why?

[To answer a StackOverflow question: how much does performance differ, for python HTTP clients?](http://stackoverflow.com/a/32899936/95122)

This provides a benchmarking tool for Python HTTP client libraries and a simple but high-performance API (HTTP and HTTPS) to benchmark against.

Oh, and of course, results in a controlled environment!

## How do I run the test server?
In one terminal:
```bash
docker run --rm -i -t -w /tmp -p 4443:443 -p 8080:80 svanoort/client-flask-demo:2.0
```
Or:
```bash
./build-docker.sh && ./run-docker.sh
```

You may need to prefix commands with `sudo`.

The test server will expose several API endpoints, with HTTP responses on port 8080 and HTTPS (self-signed cert) on port 4443 (a quick sanity check follows the list):
* `/ping` - returns "pong"
* `/bigger` - returns a fixed ~585B JSON response
* `/length/<integer>` - returns a fixed-length random binary response of `<integer>` bytes. The response is generated and cached on the first request, then served from cache.
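
Once the server is running, something like the following `curl` session should confirm the endpoints behave as described (assuming the server is on localhost; `-k` skips verification of the self-signed certificate):

```bash
curl http://localhost:8080/ping                    # -> "pong"
curl http://localhost:8080/bigger                  # -> ~585B JSON response
curl -s http://localhost:8080/length/1024 | wc -c  # -> 1024 (cached after the first call)
curl -k https://localhost:4443/ping                # same API over HTTPS
```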

## How do I run the benchmark client?
It's a command-line utility and has help via `python benchmark.py -h`:

```bash
python benchmark.py http://host:port/
```
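
For example, to point it at the local test server from the previous section (this combines the URL argument with the `--cycles` flag used below; the combination is an assumption, so check `-h` for the authoritative option list):

```bash
# 10000 request cycles against the local HTTP endpoint
python benchmark.py --cycles 10000 http://localhost:8080/
```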

## How
On one server or system:

1. In one terminal, build & run the docker image to start a test server: `sh build-docker.sh && sh run-docker.sh`
2. In another terminal, run: `python benchmark.py --cycles 10000`
3. Tests take a few minutes to finish; afterwards you can close the process in terminal 1.

## Short Summary of Results

PyCurl is much faster than Requests (or other HTTP client libraries), generally completing smaller requests 2-3x as fast and requiring 3-10x less CPU time. The difference is most visible in connection creation for small requests; given large enough requests (above 100 kB/s), Requests can still saturate a fast connection quite easily even without concurrent HTTP requests (assuming all post-request processing is done separately) on a fast CPU. (A sketch of how such per-request CPU times can be measured follows the numbers below.)

On this system:
* requests takes about **1078** CPU-microseconds to *open a new connection* and issue a request (no connection reuse), or ~552 microseconds to open
* CPU-limited peak throughput with PyCurl is *roughly* 50%-100% better than requests with very large request sizes, assuming an extremely fast network connection relative to CPU (in this case several GBit)
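
The following is a minimal sketch of how per-request CPU time can be measured; it is **not** the repo's `benchmark.py`, and the endpoint, port, and cycle count are assumptions based on the test server described above:

```python
# Minimal sketch: CPU time per request for requests vs. pycurl,
# both with connection reuse, against the local test server.
import io
import time

import pycurl
import requests

URL = "http://localhost:8080/ping"  # test server endpoint from this README
CYCLES = 1000

# requests with connection reuse: a single Session keeps the socket alive
session = requests.Session()
start = time.process_time()
for _ in range(CYCLES):
    session.get(URL)
requests_cpu = time.process_time() - start

# pycurl with connection reuse: one Curl handle, perform() called repeatedly
buf = io.BytesIO()
curl = pycurl.Curl()
curl.setopt(pycurl.URL, URL)
curl.setopt(pycurl.WRITEFUNCTION, buf.write)  # collect (and ignore) the body
start = time.process_time()
for _ in range(CYCLES):
    curl.perform()
pycurl_cpu = time.process_time() - start
curl.close()

print(f"requests: {requests_cpu / CYCLES * 1e6:.0f} CPU-us per request")
print(f"pycurl:   {pycurl_cpu / CYCLES * 1e6:.0f} CPU-us per request")
```

The point being illustrated is connection reuse: one `requests.Session` versus one long-lived `pycurl.Curl` handle, so neither client pays connection-setup costs inside the timed loop.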

## Benchmark Setup

* Testing with two c4.large instances in AWS, one running a client, and the other running the server.
* All tests issued 10000 sequential requests to minimize the impact of noise
Specs:
* 3.75 GB of RAM
* 8 GB GP2 EBS volume

## Graphs

![AWS-to-AWS RPS For HTTP](https://cdn.rawgit.com/svanoort/python-client-benchmarks/master/aws-to-aws-http-rps.svg)

![AWS-to-AWS Throughput For Both HTTP and HTTPS](https://cdn.rawgit.com/svanoort/python-client-benchmarks/master/aws-to-aws-both-throughput.svg)


## Detailed Results

Loopback Test, running on 1 c4.large and making 1 kB requests against a local server

Detailed **CPU TIME** Loopback Test, running on 1 c4.large with different request sizes

Also available in the [new-aws-results folder](new-aws-results).

## Miscellanea
* `build-docker.sh` was used to generate the docker image
* `./run-docker.sh` will launch the container

## Caveats

* Only *one* system type and *one* operating system were tested
* I am only trying GET requests to a sample API
* I've taken pains to generate data in as scientific and clean a manner as I can reasonably manage (reporting stats over 10k requests), but I do not collect per-execution data, so there are no error bars
* HTTPS performance should be taken with an extra grain of salt and only used for rough comparison:
+ Many different options and configurable settings exist for HTTPS, all with varying performance
+ There are also multiple TLS/SSL implementations available for libcurl; for sanity's sake I am only testing the default one (a quick way to check which backend you have is shown below)
+ For similar reasons I'm not testing pyopenssl use in requests (in addition to base requests settings)
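
If you want to see which TLS backend your libcurl build actually uses, pycurl's version string reports it:

```bash
# Prints pycurl/libcurl versions plus the TLS library they were built against
python -c "import pycurl; print(pycurl.version)"
```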