Commit 23a12bf

init

0 parents, commit 23a12bf

26 files changed, +1800 -0 lines changed

Makefile

+18
@@ -0,0 +1,18 @@
# Makefile for Proxy Lab


CC = gcc
CFLAGS = -g -Werror
LDFLAGS = -lpthread

proxy: proxy.c cache.c ./proxy_helpers/error_handling.c service.c robust_io.c ./thread_helpers/sbuf.c ./thread_helpers/semaphore.c socket.c thread.c ./proxy_helpers/url_parser.c ./proxy_helpers/wrappers.c cache.h constants.h ./proxy_helpers/error_handling.h service.h robust_io.h ./thread_helpers/sbuf.h ./thread_helpers/semaphore.h socket.h thread.h ./proxy_helpers/url_parser.h ./proxy_helpers/wrappers.h
	$(CC) $(CFLAGS) -o proxy service.c cache.c ./proxy_helpers/error_handling.c robust_io.c ./thread_helpers/sbuf.c ./thread_helpers/semaphore.c socket.c thread.c ./proxy_helpers/url_parser.c ./proxy_helpers/wrappers.c proxy.c $(LDFLAGS)


# Creates a tarball in ../proxylab-handin.tar that you can then
# hand in. DO NOT MODIFY THIS!


clean:
	rm -f *~ *.o proxy

README.md

+109
@@ -0,0 +1,109 @@
# Proxy Server
Proxy is a simple concurrent HTTP proxy server that caches web objects.

It's an application for learning about *Network Programming* and *Concurrent Programming*.


## Table of contents

|                                       Table of contents                                        |
|:----------------------------------------------------------------------------------------------:|
| [Overview](#overview)                                                                          |
| [How does proxy work?](#how-does-proxy-work)                                                   |
| [How does proxy deal with concurrent requests?](#how-does-proxy-deal-with-concurrent-requests) |
| [How does proxy synchronize accesses to cache?](#how-does-proxy-synchronize-accesses-to-cache) |
| [Requirements](#requirements)                                                                  |
| [How to test proxy?](#how-to-test-proxy)                                                       |
| [My thoughts about the project](#thoughts)                                                     |


## Overview
A web proxy is a program that acts as a middleman between a Web browser and an end server.
- Instead of contacting the end server directly to get a Web page, the browser contacts the proxy, which forwards the request on to the end server. When the end server replies to the proxy, the proxy sends the reply on to the browser.
- It can deal with multiple concurrent connections using multi-threading.
- It can also cache web objects by storing local copies of objects from servers, then responding to future requests by reading them out of its cache rather than by communicating again with remote servers.

## How does proxy work?

It has the following detailed behavior:
- Set up the proxy to accept incoming connections:
  - Proxy creates a listening descriptor that is ready to receive connection requests on port ***port*** by calling the ***Open_listenfd()*** function defined in the **socket.c** file.

- Set up the proxy to deal with multiple concurrent connections using the **prethreading** technique:
  - After initializing the buffer ***sbuf*** declared in the **sbuf.h** file, the main thread creates the set of worker threads by calling ***Pthread_create()*** defined in the **thread.c** file.
  - The main thread then enters an infinite loop, waiting for connection requests using the ***Accept()*** function defined in the **socket.c** file.
  - It then inserts the resulting connected descriptors in ***sbuf***.
  - Each worker thread waits until it is able to remove a connected descriptor from the buffer and then calls the ***serve_client()*** function to serve the client.
- ***serve_client()*** routine:
  - Read and parse the **HTTP** request sent from the client by calling the ***read_HTTP_request()*** function defined in the **service.c** file.
  - Using the ***hash()*** function defined in the **cache.c** file, generate the ***HTTP_request_hash*** that is used to check whether the cache contains the requested web object.
  - If the object is cached, read it out of the cache rather than communicating again with the server, by calling the ***service_from_cache()*** function defined in the **service.c** file.
  - Otherwise, using the ***service_from_server()*** function, try to connect to the server, send it the ***parsed_request***, read its response, write it back to the client, and save it in an internal buffer for possible caching.
  - If ***object_size*** is less than ***MAX_OBJECT_SIZE***, write the object to a suitable cache line.

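The hit-or-miss decision inside ***serve_client()*** can be sketched as follows. This is a minimal, single-threaded model and not the project's code: `toy_line_t`, `toy_lookup()`, `toy_store()`, `TOY_LINES`, and `TOY_MAX_OBJECT` are hypothetical names used only for illustration. The hashing scheme (hash * 33 + c, seeded with 5381) follows `hash_func()` in **cache.c**.

```c
#include <stddef.h>
#include <string.h>

#define TOY_LINES 4          /* hypothetical: number of cache lines */
#define TOY_MAX_OBJECT 64    /* hypothetical: size limit for cacheable objects */

typedef struct {
    char content[TOY_MAX_OBJECT];
    size_t length;
    size_t hash;
    int valid;
} toy_line_t;

static toy_line_t toy_cache[TOY_LINES];

/* Same scheme as hash_func() in cache.c: hash * 33 + c, seeded with 5381. */
static size_t toy_hash(const char *s)
{
    size_t h = 5381;
    while (*s)
        h = ((h << 5) + h) + (unsigned char)*s++;
    return h;
}

/* Return the index of the line holding this request, or -1 on a miss. */
static int toy_lookup(const char *request)
{
    size_t h = toy_hash(request);
    for (int i = 0; i < TOY_LINES; ++i)
        if (toy_cache[i].valid && toy_cache[i].hash == h)
            return i;
    return -1;
}

/* Store an object if it is small enough; return 1 if cached, 0 otherwise. */
static int toy_store(const char *request, const char *obj, size_t len)
{
    if (len >= TOY_MAX_OBJECT)
        return 0;                   /* too large: forwarded but never cached */
    for (int i = 0; i < TOY_LINES; ++i) {
        if (!toy_cache[i].valid) {
            memcpy(toy_cache[i].content, obj, len);
            toy_cache[i].length = len;
            toy_cache[i].hash = toy_hash(request);
            toy_cache[i].valid = 1;
            return 1;
        }
    }
    return 0;                       /* full: a real cache would evict a line here */
}
```

A worker would call `toy_lookup()` first, answer from `toy_cache[i].content` on a hit, and call `toy_store()` only after forwarding the server's response on a miss.
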
## How does proxy deal with concurrent requests?
- Creating a new thread for each new client incurs a non-trivial cost.
- A proxy based on *prethreading* tries to reduce this overhead by using the **producer-consumer** model shown in the figure.
- The proxy consists of a ***main thread*** and a set of ***worker threads***. The main thread repeatedly accepts connection requests from clients and places the resulting connected descriptors in a bounded buffer.
- Each worker thread repeatedly removes a descriptor from the buffer, services the client, and then waits for the next descriptor.

<p align="center"><img src="https://i.ibb.co/jfNwR5n/producer-consumer-model.png"></p>

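A bounded buffer of this kind can be sketched with three POSIX semaphores. The code below is an illustration under assumptions, not the contents of **sbuf.c** (which is not shown here): the names `bbuf_t`, `bbuf_init()`, `bbuf_insert()`, `bbuf_remove()`, and `bbuf_free()` are hypothetical, it assumes a platform with unnamed semaphores (for example Linux), and it ignores `sem_wait()` being interrupted by a signal.

```c
#include <semaphore.h>
#include <stdlib.h>

/* Hypothetical bounded buffer of connected descriptors. */
typedef struct {
    int *buf;      /* circular array of descriptors */
    int n;         /* capacity */
    int front;     /* buf[(front + 1) % n] is the oldest item */
    int rear;      /* buf[rear] is the newest item */
    sem_t mutex;   /* protects buf, front and rear */
    sem_t slots;   /* counts free slots */
    sem_t items;   /* counts stored items */
} bbuf_t;

static int bbuf_init(bbuf_t *sp, int n)
{
    sp->buf = calloc((size_t)n, sizeof(int));
    if (sp->buf == NULL)
        return -1;
    sp->n = n;
    sp->front = sp->rear = 0;
    sem_init(&sp->mutex, 0, 1);
    sem_init(&sp->slots, 0, (unsigned)n);
    sem_init(&sp->items, 0, 0);
    return 0;
}

/* Producer (main thread): blocks while the buffer is full. */
static void bbuf_insert(bbuf_t *sp, int connfd)
{
    sem_wait(&sp->slots);
    sem_wait(&sp->mutex);
    sp->rear = (sp->rear + 1) % sp->n;
    sp->buf[sp->rear] = connfd;
    sem_post(&sp->mutex);
    sem_post(&sp->items);
}

/* Consumer (worker thread): blocks while the buffer is empty. */
static int bbuf_remove(bbuf_t *sp)
{
    sem_wait(&sp->items);
    sem_wait(&sp->mutex);
    sp->front = (sp->front + 1) % sp->n;
    int connfd = sp->buf[sp->front];
    sem_post(&sp->mutex);
    sem_post(&sp->slots);
    return connfd;
}

static void bbuf_free(bbuf_t *sp)
{
    free(sp->buf);
    sem_destroy(&sp->mutex);
    sem_destroy(&sp->slots);
    sem_destroy(&sp->items);
}
```

The `slots` and `items` counters are what let the main thread sleep when every slot is taken and let idle workers sleep when there is nothing to serve, so neither side has to poll.
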
## How does proxy synchronize accesses to cache?
- Accesses to the cache must be thread-safe and free of race conditions, which leads to these requirements:
  1. Multiple threads must be able to simultaneously read from the cache.
  2. No thread can write to a specific object while another thread is reading it.
  3. Only one thread should be permitted to write to an object at a time, but that restriction must not apply to readers.

- As such, protecting accesses to the cache with *one large exclusive lock* was not an acceptable solution. ***We partitioned the cache into lines***, each line associated with a ***read_cnt*** that counts the current readers of that cache line, and a ***mutex_writing_cache_line*** semaphore that ***locks only the cache line associated with it*** instead of locking the whole cache.

- A **writer thread** locks the write mutex each time it writes to the cache line associated with it, and unlocks it each time it finishes writing. This guarantees that there is at most one writer in that cache line at any point in time.

- On the other hand, only the first **reader thread** to read a cache line locks the write mutex for that cache line, and only the last reader to finish reading unlocks it. The write mutex is ignored by readers who enter and leave while other readers are present.
- This means that as long as a single reader holds the write mutex for a particular cache line, an unbounded number of readers can read this cache line at the same time unimpeded.

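The first-reader and last-reader rule can be sketched for a single cache line as below. Only `read_cnt` and the idea of a per-line write mutex come from the description above. The type `line_lock_t`, the helper names, and the extra `cnt_mutex` that protects the counter are assumptions made for this sketch, which also assumes POSIX unnamed semaphores are available.

```c
#include <semaphore.h>

/* Hypothetical lock state for one cache line. */
typedef struct {
    int read_cnt;        /* number of readers currently inside */
    sem_t cnt_mutex;     /* protects read_cnt (assumed, not named in the text) */
    sem_t write_mutex;   /* held by one writer, or by the group of readers */
} line_lock_t;

static void line_lock_init(line_lock_t *l)
{
    l->read_cnt = 0;
    sem_init(&l->cnt_mutex, 0, 1);
    sem_init(&l->write_mutex, 0, 1);
}

static void reader_enter(line_lock_t *l)
{
    sem_wait(&l->cnt_mutex);
    if (++l->read_cnt == 1)
        sem_wait(&l->write_mutex);   /* first reader locks writers out */
    sem_post(&l->cnt_mutex);
}

static void reader_exit(line_lock_t *l)
{
    sem_wait(&l->cnt_mutex);
    if (--l->read_cnt == 0)
        sem_post(&l->write_mutex);   /* last reader lets writers back in */
    sem_post(&l->cnt_mutex);
}

static void writer_enter(line_lock_t *l)
{
    sem_wait(&l->write_mutex);       /* waits for all readers and any writer */
}

static void writer_exit(line_lock_t *l)
{
    sem_post(&l->write_mutex);
}
```

This scheme favors readers: a steady stream of readers on one line can keep `write_mutex` held indefinitely, so a writer waiting for that line may starve.
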
## Requirements
- Git
- GCC
- make

## How to test proxy?
1) **Compile source code and run proxy**

Run the following commands in a terminal:
```console
git clone https://github.com/ahmed-salah-ucf/proxy-server.git
```
```console
cd proxy-server
make
```
```console
./proxy <port>
```

2) **Send HTTP requests using** *telnet* **or** *curl* **tools to test the proxy**

**telnet:**
```console
telnet localhost <port>
```
**curl:**
```console
curl --proxy http://localhost:<port> www.example.com
```

## Thoughts
- The project helped me become more familiar with the concepts of **network programming** and **concurrent programming**.
- I learned about **HTTP operation** and how to use **sockets** to write programs that communicate over network connections.
- It introduced me to dealing with **concurrency** and to writing **thread-safe** routines.

cache.c

+105
@@ -0,0 +1,105 @@
#include <limits.h>
#include <string.h>
#include "cache.h"

unsigned long cur_time = 0;     /* Used to update timestamp */

/*
 * cache_init - initialize proxy's cache by allocating
 *      memory for it and initializing it to zeros.
 */
void cache_init(cache_line_t **cache_p) {
    int i;
    cache_line_t *p;

    *cache_p = Calloc(CACHE_LINE_NUM, sizeof(cache_line_t));
    for (p = *cache_p, i = 0; i < CACHE_LINE_NUM; ++i, ++p) {
        p->length = 0;
        p->timestamp = 0;
        p->hash = 0;
        p->valid_bit = 0;
    }
}

/*
 * is_cached - Check if a web object target is cached. It searches cache by
 *      checking valid bit and hash value. Return the index of cache
 *      line if there is a match. Otherwise, return -1.
 *      If there is a cache hit, the timestamp in the corresponding
 *      cache line would be updated.
 */
int is_cached(cache_line_t *cache, size_t target) {
    int i;
    cache_line_t *p;
    for (p = cache, i = 0; i < CACHE_LINE_NUM; ++i, ++p) {
        if (p->valid_bit == 1 && p->hash == target) {
            P(&mutex_writing_cache_line[i]);
            p->timestamp = cur_time;
            V(&mutex_writing_cache_line[i]);
            return i;
        }
    }
    return -1;
}

/*
 * get_write_idx - Helper function that searches the cache for an empty cache
 *      line. If it finds one, it returns its index. Otherwise, it
 *      returns the index of the least recently used line.
 */
static int get_write_idx(cache_line_t *cache) {
    int i;
    cache_line_t *p;
    unsigned long long lru_time = ULLONG_MAX;   /* Same type as timestamp */
    int lru_idx = 0;    /* Defined fallback if no smaller timestamp is found */

    for (p = cache, i = 0; i < CACHE_LINE_NUM; ++i, ++p) {
        if (p->valid_bit == 0) {
            return i;
        } else {
            if (p->timestamp < lru_time) {
                lru_time = p->timestamp;
                lru_idx = i;
            }
        }
    }
    return lru_idx;
}

/*
 * write_cache - Search for write index in the cache and write cache_buf in the
 *      cache line at this index. It uses a mutex semaphore to lock
 *      writing at this cache line from other threads.
 */
void write_cache(cache_line_t *cache, char *cache_buf, int object_size, size_t hash)
{
    int write_idx;

    /* Never copy more than a cache line can hold */
    if (object_size < 0 || object_size > MAX_OBJECT_SIZE)
        return;

    write_idx = get_write_idx(cache);

    P(&mutex_writing_cache_line[write_idx]);
    memcpy((cache + write_idx)->content, cache_buf, object_size);
    (cache + write_idx)->length = object_size;
    (cache + write_idx)->timestamp = cur_time;
    (cache + write_idx)->hash = hash;
    (cache + write_idx)->valid_bit = 1;
    V(&mutex_writing_cache_line[write_idx]);
}

/*
 * hash_func - Helper function generates hash value from string
 */
static size_t hash_func(char *str) {
    unsigned long res = 5381;
    int c;

    while ((c = *(str++)))
        res = ((res << 5) + res) + c;   /* hash * 33 + c */

    return res;
}

/*
 * hash - Generate hash value for HTTP request
 */
size_t hash(char *HTTP_request)
{
    return hash_func(HTTP_request);
}

cache.h

+29
@@ -0,0 +1,29 @@
#ifndef CACHE_H
#define CACHE_H
#include <stdio.h>
#include <stdlib.h>
#include "constants.h"
#include "proxy_helpers/wrappers.h"
#include "thread_helpers/semaphore.h"

/* Global variables */
extern unsigned long cur_time;      /* Used to update timestamp */

typedef struct cache_line {
    char content[MAX_OBJECT_SIZE];  /* Stored web object */
    int length;                     /* Actual length of web object */
    unsigned long long timestamp;   /* To perform LRU */
    size_t hash;                    /* Hash value of the request header */
    int valid_bit;                  /* Valid bit */
} cache_line_t;

/* get_write_idx() and hash_func() are static helpers private to cache.c,
 * so they are not declared in this header. */

/* Function prototypes */
void cache_init(cache_line_t **cache_p);
void write_cache(cache_line_t *cache, char *cache_buf, int object_size, size_t hash);
int is_cached(cache_line_t *cache, size_t target);
size_t hash(char *HTTP_request);
#endif

constants.h

+14
@@ -0,0 +1,14 @@
#ifndef CONSTANTS_H
#define CONSTANTS_H
/* Misc constants */
#define MAXLINE 8192        /* Max text line length */
#define MAXBUF 1048576      /* Max I/O buffer size */

/* Max cache and object sizes */
#define MAX_CACHE_SIZE 1049000
#define MAX_OBJECT_SIZE 102400
#define CACHE_LINE_NUM (MAX_CACHE_SIZE / MAX_OBJECT_SIZE)

#define LISTENQ 1024        /* Second argument to listen() */

#endif
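As a quick check of the arithmetic behind `CACHE_LINE_NUM`: the division is an integer division, so 1049000 / 102400 truncates to 10 with a remainder of 25000 bytes. The sketch below repeats the three macros from constants.h so that it compiles on its own; `cache_payload_bytes()` is a hypothetical helper added only for illustration.

```c
/* Values copied from constants.h. */
#define MAX_CACHE_SIZE 1049000
#define MAX_OBJECT_SIZE 102400
#define CACHE_LINE_NUM (MAX_CACHE_SIZE / MAX_OBJECT_SIZE)

/* Integer division truncates: 1049000 / 102400 = 10, remainder 25000. */
_Static_assert(CACHE_LINE_NUM == 10, "expected 10 cache lines");

/* Hypothetical helper: bytes of object storage provided by all cache lines. */
static unsigned long cache_payload_bytes(void)
{
    return (unsigned long)CACHE_LINE_NUM * MAX_OBJECT_SIZE;
}
```

The cache therefore holds at most 10 objects, and the `content` arrays together occupy 1024000 bytes regardless of how small the cached objects are.
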

producer-consumer-model.png

70.5 KB

proxy

73.2 KB
Binary file not shown.

proxy.c

+50
@@ -0,0 +1,50 @@
#include "service.h"


int main(int argc, char *argv[])
{
    int listenfd, connfd;
    char hostname[MAXLINE], port[MAXLINE];
    socklen_t clientlen;
    struct sockaddr_storage clientaddr;
    pthread_t tid;
    cache_line_t *cache = NULL;

    /* Check command line args */
    if (argc != 2) {
        fprintf(stderr, "usage: %s <port>\n", argv[0]);
        exit(1);
    }

    /* Ignore SIGPIPE signal */
    Signal(SIGPIPE, SIG_IGN);

    listenfd = Open_listenfd(argv[1]);
    /* Print the requested port, not the descriptor number */
    printf("Proxy listening on port: %s\n", argv[1]);

    /* Initialize cache */
    cache_init(&cache);

    /* Initialize semaphores */
    semaphore_init();

    sbuf_init(&sbuf, SBUFSIZE);
    for (int i = 0; i < NTHREADS; ++i)  /* Create worker threads */
        Pthread_create(&tid, NULL, thread, cache);

    while (1) {
        clientlen = sizeof(clientaddr);
        connfd = Accept(listenfd, (SA *)&clientaddr, &clientlen);
        Getnameinfo((SA *) &clientaddr, clientlen,
                    hostname, MAXLINE, port, MAXLINE, 0);
        printf("Accepted connection from (%s, %s)\n", hostname, port);
        ++cur_time;     /* Update time */
        sbuf_insert(&sbuf, connfd);
    }
    /* Not reached: the accept loop above never terminates */
    Free(cache);
    Close(listenfd);
    exit(0);
}

proxy_helpers/error_handling.c

+39
@@ -0,0 +1,39 @@
#include "error_handling.h"



/**************************
 * Error-handling functions
 **************************/
/*
 * unix_error - Report error message for unix-functions errors.
 */
void unix_error(char *msg) /* Unix-style error */
{
    if (errno)
        fprintf(stderr, "%s: %s\n", msg, strerror(errno));
}

/*
 * posix_error - Report error message for posix-functions errors.
 */
void posix_error(int code, char *msg) /* Posix-style error */
{
    fprintf(stderr, "%s: %s\n", msg, strerror(code));
}

/*
 * gai_error - Report error message for getaddrinfo function errors.
 */
void gai_error(int code, char *msg) /* Getaddrinfo-style error */
{
    fprintf(stderr, "%s: %s\n", msg, gai_strerror(code));
}

/*
 * app_error - Report error messages.
 */
void app_error(char *msg) /* Application error */
{
    fprintf(stderr, "%s\n", msg);
}
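Note that these functions only report an error and then return, so the caller decides whether to continue. A wrapper built on that behavior might look like the sketch below. It is an assumption about usage, not the contents of wrappers.c (which is not shown in this part of the diff): `checked_calloc()` and `report_unix_error()` are hypothetical names, and `report_unix_error()` simply repeats the logic of `unix_error()` so the example compiles on its own.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Same behavior as unix_error() in error_handling.c: report, do not exit. */
static void report_unix_error(const char *msg)
{
    if (errno)
        fprintf(stderr, "%s: %s\n", msg, strerror(errno));
}

/* Hypothetical wrapper: reports a failure and returns NULL,
 * so the caller must still check the result. */
static void *checked_calloc(size_t nmemb, size_t size)
{
    errno = 0;                      /* avoid reporting a stale error code */
    void *p = calloc(nmemb, size);
    if (p == NULL)
        report_unix_error("calloc error");
    return p;
}
```

Because nothing here terminates the process, a server thread can drop the one request that failed and keep serving others, at the cost of every call site having to handle a `NULL` return.
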

proxy_helpers/error_handling.h

+15
@@ -0,0 +1,15 @@
#ifndef ERROR_HANDLING_H
#define ERROR_HANDLING_H
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <netdb.h>
#include "../constants.h"

/* error-handling functions prototypes */
void unix_error(char *msg);
void posix_error(int code, char *msg);
void gai_error(int code, char *msg);
void app_error(char *msg);

#endif
