|
| 1 | +# Proxy Server |
| 2 | +Proxy is a simple concurrent HTTP proxy server that caches web objects. |
| 3 | + |
| 4 | +It's an application for learning about *Network Programming* and *Concurrent Programming*. |
| 5 | + |
| 6 | + |
| 7 | +## Table of contents |
| 8 | + |
| 9 | +| Table of contents | |
| 10 | +|:----------------------------------------------------------------------------------------------:| |
| 11 | +| [Overview](#Overview) | |
| 12 | +| [How does proxy work?](#How-does-proxy-work) | |
| 13 | +| [How does proxy deal with concurrent requests?](#How-does-proxy-deal-with-concurrent-requests) | |
| 14 | +| [How does proxy synchronize accesses to cache?](#How-does-proxy-synchronize-accesses-to-cache) | |
| 15 | +| [Requirements](#Requirements) | |
| 16 | +| [How to test proxy?](#How-to-test-proxy) | |
| 17 | +| [My thoughts about the project](#Thoughts) | |
| 18 | + |
| 19 | + |
| 20 | + |
| 21 | + |
| 22 | + |
| 23 | + |
| 24 | + |
| 25 | + |
| 26 | +## Overview |
| 27 | +A web proxy is a program that acts as a middleman between a Web browser and an end server. |
| 28 | +- Instead of contacting the end server directly to get a Web page, the browser contacts the proxy, which forwards the request on to the end server. When the end server replies to the proxy, the proxy sends the reply on to the browser. |
| 29 | +- It can deal with multiple concurrent connections using multi-threading. |
| 30 | +- It can also cache web objects: it stores local copies of objects fetched from servers, then answers future requests for them from its cache rather than contacting the remote servers again. |
| 31 | + |
| 32 | + |
| 33 | +## How does proxy work? |
| 34 | + |
| 35 | +It has the following detailed behavior: |
| 36 | +- Set up the proxy to accept incoming connections: |
| 37 | + - The proxy creates a listening descriptor that is ready to receive connection requests on port ***port*** by calling the ***Open_listenfd()*** function defined in the **socket.c** file. |
| 38 | + |
| 39 | +- Set up the proxy to deal with multiple concurrent connections using the **prethreading** technique: |
| 40 | + - After initializing the buffer ***sbuf*** declared in the **sbuf.h** file, the main thread creates the set of worker threads by calling ***Pthread_create()***, defined in the **thread.c** file. |
| 41 | + - The main thread then enters an infinite loop, waiting for connection requests using the ***Accept()*** function defined in the **socket.c** file. |
| 42 | + - It then inserts the resulting connected descriptors into ***sbuf***. |
| 43 | + - Each worker thread waits until it is able to remove a connected descriptor from the buffer and then calls the ***serve_client()*** function to serve the client. |
| 44 | +- ***serve_client()*** routine: |
| 45 | + - Read and parse the **HTTP** request sent from the client by calling the ***read_HTTP_request()*** function defined in the **serve.c** file. |
| 46 | + - Using the ***hash()*** function defined in the **cache.c** file, generate ***HTTP_request_hash***, which is used to check whether the cache contains the requested web object. |
| 47 | + - If the object is cached, read it out of the cache by calling the ***service_from_cache()*** function defined in the **service.c** file, rather than communicating with the server again. |
| 48 | + - Otherwise, using the ***service_from_server()*** function, try to connect to the server, send it the ***parsed_request***, read its response, write it back to the client, and save it in an internal buffer for possible caching. |
| 49 | + - If ***object_size*** is less than ***MAX_OBJECT_SIZE***, write the object to a suitable cache line. |
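
The cache steps of ***serve_client()*** described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's code: `hash_request()` uses the well-known djb2 string hash as a stand-in for the real ***hash()***, the cache is assumed to be direct-mapped (the hash picks exactly one line), and `NUM_CACHE_LINES`, the value of `MAX_OBJECT_SIZE`, `cache_lookup()` and `cache_store()` are hypothetical. Synchronization is left out here on purpose.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NUM_CACHE_LINES 16       /* assumed value, not taken from the project */
#define MAX_OBJECT_SIZE 102400   /* assumed value, not taken from the project */

/* One cache line: a cached response tagged with the hash of its request. */
typedef struct {
    int valid;
    unsigned long request_hash;
    size_t object_size;
    char object[MAX_OBJECT_SIZE];
} cache_line_t;

static cache_line_t cache[NUM_CACHE_LINES];

/* djb2 string hash; a stand-in for the project's hash() in cache.c. */
unsigned long hash_request(const char *request)
{
    unsigned long h = 5381;
    for (; *request != '\0'; request++)
        h = h * 33 + (unsigned char)*request;
    return h;
}

/* Return the line that caches this request, or NULL on a cache miss. */
const cache_line_t *cache_lookup(unsigned long request_hash)
{
    const cache_line_t *line = &cache[request_hash % NUM_CACHE_LINES];
    if (line->valid && line->request_hash == request_hash)
        return line;
    return NULL;
}

/* Store an object; returns 1 if it was cached, 0 if it was too large. */
int cache_store(unsigned long request_hash, const char *object, size_t object_size)
{
    assert(object != NULL);
    if (object_size >= MAX_OBJECT_SIZE)   /* only objects smaller than the limit */
        return 0;
    cache_line_t *line = &cache[request_hash % NUM_CACHE_LINES];
    memcpy(line->object, object, object_size);
    line->object_size = object_size;
    line->request_hash = request_hash;
    line->valid = 1;
    return 1;
}
```

In this sketch a hit is decided by comparing hashes only, so two different requests with the same hash would be treated as the same object; storing the full request next to the hash would remove that ambiguity.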
| 50 | + |
| 51 | + |
| 52 | +## How does proxy deal with concurrent requests? |
| 53 | +- Creating a new thread for each new client incurs a non-trivial cost. |
| 54 | +- A proxy based on *prethreading* tries to reduce this overhead by using the **producer-consumer** model shown in the figure. |
| 55 | +- The proxy consists of a ***main thread*** and a set of ***worker threads***. The main thread repeatedly accepts connection requests from clients and places the resulting connected descriptors in a bounded buffer. |
| 56 | +- Each worker thread repeatedly removes a descriptor from the buffer, services the client, and then waits for the next descriptor. |
| 57 | + |
| 58 | +<p align="center"><img src="https://i.ibb.co/jfNwR5n/producer-consumer-model.png"></p> |
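
The producer-consumer model above can be sketched as a small bounded buffer. This is a minimal illustration, not the project's ***sbuf***: it uses a pthread mutex with two condition variables (the real buffer may well use semaphores instead), and `bounded_buf_t`, `buf_insert()`, `buf_remove()`, `run_demo()`, `SBUF_SLOTS` and `NUM_WORKERS` are hypothetical names. Plain integers stand in for connected descriptors, and a negative sentinel stops the workers so the demo can finish; the real worker threads loop forever.

```c
#include <assert.h>
#include <pthread.h>

#define SBUF_SLOTS  4   /* assumed buffer capacity */
#define NUM_WORKERS 2   /* assumed number of worker threads */

typedef struct {
    int fds[SBUF_SLOTS];              /* circular array of descriptors */
    int front, count;                 /* index of oldest item, items stored */
    pthread_mutex_t mutex;
    pthread_cond_t not_empty, not_full;
} bounded_buf_t;

void buf_init(bounded_buf_t *b)
{
    b->front = b->count = 0;
    pthread_mutex_init(&b->mutex, NULL);
    pthread_cond_init(&b->not_empty, NULL);
    pthread_cond_init(&b->not_full, NULL);
}

/* Producer side: block while the buffer is full, then append fd. */
void buf_insert(bounded_buf_t *b, int fd)
{
    pthread_mutex_lock(&b->mutex);
    while (b->count == SBUF_SLOTS)
        pthread_cond_wait(&b->not_full, &b->mutex);
    b->fds[(b->front + b->count) % SBUF_SLOTS] = fd;
    b->count++;
    pthread_cond_signal(&b->not_empty);
    pthread_mutex_unlock(&b->mutex);
}

/* Consumer side: block while the buffer is empty, then take the oldest fd. */
int buf_remove(bounded_buf_t *b)
{
    pthread_mutex_lock(&b->mutex);
    while (b->count == 0)
        pthread_cond_wait(&b->not_empty, &b->mutex);
    int fd = b->fds[b->front];
    b->front = (b->front + 1) % SBUF_SLOTS;
    b->count--;
    pthread_cond_signal(&b->not_full);
    pthread_mutex_unlock(&b->mutex);
    return fd;
}

typedef struct { bounded_buf_t *buf; long served; } worker_arg_t;

static void *worker(void *arg)
{
    worker_arg_t *w = arg;
    for (;;) {
        int fd = buf_remove(w->buf);
        if (fd < 0)          /* sentinel used only by this demo */
            break;
        w->served += fd;     /* stand-in for serve_client(fd) */
    }
    return NULL;
}

/* Main-thread role: start the workers, feed them, return the sum they served. */
long run_demo(int n_connections)
{
    bounded_buf_t buf;
    worker_arg_t args[NUM_WORKERS];
    pthread_t tids[NUM_WORKERS];
    long total = 0;

    assert(n_connections >= 0);
    buf_init(&buf);
    for (int i = 0; i < NUM_WORKERS; i++) {
        args[i].buf = &buf;
        args[i].served = 0;
        pthread_create(&tids[i], NULL, worker, &args[i]);
    }
    for (int fd = 1; fd <= n_connections; fd++)
        buf_insert(&buf, fd);
    for (int i = 0; i < NUM_WORKERS; i++)
        buf_insert(&buf, -1);            /* one sentinel per worker */
    for (int i = 0; i < NUM_WORKERS; i++) {
        pthread_join(tids[i], NULL);
        total += args[i].served;
    }
    return total;
}
```

Because the threads are created once up front, the per-client cost is reduced to one insert and one remove on the buffer. Compile with `-pthread` when trying the sketch.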
| 59 | + |
| 60 | + |
| 61 | +## How does proxy synchronize accesses to cache? |
| 62 | +- Accesses to the cache must be thread-safe and free of race conditions, which gives us these requirements: |
| 63 | + 1. Multiple threads must be able to simultaneously read from the cache. |
| 64 | + 2. No thread can write to a specific object while another thread is reading it. |
| 65 | + 3. Only one thread may write to an object at a time, but that restriction must not apply to readers. |
| 66 | + |
| 67 | +- As such, protecting accesses to the cache with *one large exclusive lock* was not an acceptable solution. ***We partitioned the cache into lines***. Each line is associated with a ***read_cnt*** that counts the current readers of that cache line, and a ***mutex_writing_cache_line*** semaphore that ***locks only the cache line associated with it*** instead of locking the whole cache. |
| 68 | + |
| 69 | +- A **writer thread** locks the write mutex of a cache line each time it writes to that line, and unlocks it when it finishes writing. This guarantees that there is at most one writer in that cache line at any point in time. |
| 70 | + |
| 71 | +- On the other hand, only the first **reader thread** to read a cache line locks the write mutex for that cache line, and only the last reader to finish reading unlocks it. The write mutex is ignored by readers who enter and leave while other readers are present. |
| 72 | +- This means that as long as a single reader holds the write mutex for a particular cache line, an unbounded number of readers can read this cache line at the same time unimpeded. |
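
The per-line locking described above can be sketched with POSIX unnamed semaphores. This is a hedged illustration, not the project's code: ***read_cnt*** and ***mutex_writing_cache_line*** come from the description, while `rw_line_t`, `mutex_read_cnt` (a second semaphore assumed here to protect the counter), the enter/exit helpers, and `run_writers()` are hypothetical. A plain counter stands in for the cached object, and error handling such as retrying `sem_wait()` after `EINTR` is omitted.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

#define NUM_WRITERS 4       /* demo values, chosen arbitrarily */
#define WRITES_EACH 10000

typedef struct {
    int read_cnt;                    /* readers currently in this line */
    sem_t mutex_read_cnt;            /* protects read_cnt (assumed name) */
    sem_t mutex_writing_cache_line;  /* held by a writer or by the reader group */
    long value;                      /* stand-in for the cached object */
} rw_line_t;

void line_init(rw_line_t *l)
{
    l->read_cnt = 0;
    l->value = 0;
    sem_init(&l->mutex_read_cnt, 0, 1);
    sem_init(&l->mutex_writing_cache_line, 0, 1);
}

void reader_enter(rw_line_t *l)
{
    sem_wait(&l->mutex_read_cnt);
    if (++l->read_cnt == 1)                      /* first reader in */
        sem_wait(&l->mutex_writing_cache_line);  /* lock out writers */
    sem_post(&l->mutex_read_cnt);
}

void reader_exit(rw_line_t *l)
{
    sem_wait(&l->mutex_read_cnt);
    assert(l->read_cnt > 0);
    if (--l->read_cnt == 0)                      /* last reader out */
        sem_post(&l->mutex_writing_cache_line);  /* let writers in again */
    sem_post(&l->mutex_read_cnt);
}

void writer_enter(rw_line_t *l) { sem_wait(&l->mutex_writing_cache_line); }
void writer_exit(rw_line_t *l)  { sem_post(&l->mutex_writing_cache_line); }

static void *writer(void *arg)
{
    rw_line_t *l = arg;
    for (int i = 0; i < WRITES_EACH; i++) {
        writer_enter(l);
        l->value++;          /* safe: at most one writer holds the line */
        writer_exit(l);
    }
    return NULL;
}

/* Run several writers against one line and return the final value. */
long run_writers(rw_line_t *l)
{
    pthread_t tids[NUM_WRITERS];
    for (int i = 0; i < NUM_WRITERS; i++)
        pthread_create(&tids[i], NULL, writer, l);
    for (int i = 0; i < NUM_WRITERS; i++)
        pthread_join(tids[i], NULL);
    return l->value;
}
```

A semaphore is used for the write lock because the first reader acquires it and a different reader may release it, which a pthread mutex does not allow. Note that this scheme favors readers: a steady stream of readers on one line can keep a writer waiting indefinitely.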
| 73 | + |
| 74 | + |
| 75 | + |
| 76 | +## Requirements |
| 77 | +- Git |
| 78 | +- GCC |
| 79 | +- make |
| 80 | + |
| 81 | +## How to test proxy? |
| 82 | +1) **Compile source code and run proxy** |
| 83 | + |
| 84 | + Run the following commands in a terminal: |
| 85 | + ```console |
| 86 | + git clone https://github.com/ahmed-salah-ucf/proxy-server.git |
| 87 | + ``` |
| 88 | + ```console |
| 89 | + make |
| 90 | + ``` |
| 91 | + ```console |
| 92 | + ./proxy <port> |
| 93 | + ``` |
| 94 | + |
| 95 | +2) **Send HTTP requests using** *telnet* **or** *curl* **tools to test the proxy** |
| 96 | + |
| 97 | + **telnet:** |
| 98 | + ```console |
| 99 | + telnet localhost <port> |
| 100 | + ``` |
| 101 | + **curl:** |
| 102 | + ```console |
| 103 | + curl --proxy http://localhost:<port> www.example.com |
| 104 | + ``` |
| 105 | + |
| 106 | +## Thoughts |
| 107 | +- The project helped me become more familiar with the concepts of **network programming** and **concurrent programming**. |
| 108 | +- I learned about **HTTP operation** and how to use **sockets** to write programs that communicate over network connections. |
| 109 | +- It also introduced me to dealing with **concurrency** and to writing **thread-safe** routines. |