Proxy Servers and Caching: Enhance Network Efficiency
Proxy Servers: Centralizing Internet Traffic
A proxy server centralizes traffic between the internet and a local network, so the computers on the local network do not each need a direct internet connection. It can also be used to block unauthorized access from the internet to the local network.
How Does It Work?
The proxy translates the source and destination addresses of the traffic that passes through it. When a computer on the local network makes a web request, the proxy intercepts and processes it. This hides the real IP address of the computer making the request, because the outgoing request carries the IP address of the proxy instead.
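The address substitution described above can be sketched as follows. This is only an illustration of the idea, not how any particular proxy is implemented; the `Request` type, the `forward_via_proxy` function, and both IP addresses are hypothetical.

```python
from dataclasses import dataclass, replace

# Hypothetical address of the proxy's internet-facing interface.
PROXY_IP = "203.0.113.10"


@dataclass(frozen=True)
class Request:
    source_ip: str  # the address the destination server will see
    url: str


def forward_via_proxy(request: Request) -> Request:
    """Return the request as it leaves the proxy: same URL, proxy's address."""
    return replace(request, source_ip=PROXY_IP)


client_request = Request(source_ip="192.168.1.25", url="http://example.com/")
outgoing = forward_via_proxy(client_request)
print(outgoing.source_ip)  # 203.0.113.10
```

The destination server only ever sees `PROXY_IP`; the private address of the client stays inside the local network.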
Proxy Cache Servers: Speed and Security
A proxy cache server is a machine located between the user and another network, normally the internet. It acts as a separation between the two networks, serving as a cache area to speed up access to web pages or to restrict access to content.
Understanding Cache
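A cache is a fast storage area that holds a copy of data that will likely be accessed more than once. In a computer it is typically an area of RAM, which is much faster to access than a peripheral device like a hard disk. A proxy cache applies the same idea to the network: a copy kept on the proxy (in RAM or on its disk) can be delivered faster than a copy fetched again from the original server.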
Functions of Proxy Cache Servers
- Allow web access to private hosts (with a private IP address) that are not directly connected to the internet.
- Control web access by applying rules or policies (e.g., based on the machine, the requested page, or the day and/or time of the request).
- Log web traffic from the LAN to the outside.
- Scan web content that is accessed and downloaded for possible attacks by viruses, worms, Trojan horses, etc.
- Control local network security against potential attacks, intrusions, etc.
- Function as a cache of web pages. If another user wants to access a page that the proxy server has saved, it does not have to access the internet again because the server can send it directly to the user.
Advantages of Using a Proxy Cache Server
- Faster navigation: If the requested web page is in the server cache, it is served immediately without needing to access the original server, saving time.
- More efficient use of the internet connection: If the requested page is cached on the server, only the local network is used, saving bandwidth.
- Firewall: The proxy cache server communicates with the exterior and can function as a firewall, increasing user safety.
- Filtering services: You can make available only those services (such as HTTP or FTP) for which the proxy cache server is configured.
How It’s Used
The web browser (client) requests an HTML page (for example) from a web server or requests a file from an FTP server. As the web browser is configured to access the internet through the proxy, the request is actually made to the proxy cache server.
The proxy cache server receives the request and searches its cache (stored on the proxy server’s hard disk) to see if the requested page is there.
If the page is already in the cache, the proxy sends it to the web browser directly. If this is the first time the page has been requested, it is not yet stored: the proxy server forwards the request to the web server, which returns the requested page, and the proxy caches it and sends it to the web browser.
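The request flow above can be sketched as a small lookup-then-fetch routine. This is a toy model under simplifying assumptions: the `ProxyCache` class and `fake_origin` function are hypothetical, the cache is a plain in-memory dictionary, and real proxy caches also handle expiry and cache-control rules, which are ignored here.

```python
from typing import Callable, Dict


class ProxyCache:
    """Toy proxy cache: serve stored pages, fetch and store the rest."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self._store:
            # Cache hit: answer from the local copy, no internet access.
            self.hits += 1
            return self._store[url]
        # Cache miss: ask the origin server, keep a copy, then answer.
        self.misses += 1
        body = self._fetch(url)
        self._store[url] = body
        return body


def fake_origin(url: str) -> bytes:
    # Stand-in for a real web server so the example runs offline.
    return f"<html>content of {url}</html>".encode()


cache = ProxyCache(fake_origin)
first = cache.get("http://example.com/")   # first request: miss
second = cache.get("http://example.com/")  # repeat request: hit
print(cache.hits, cache.misses)  # 1 1
```

The second request never reaches `fake_origin`, which is where the time and bandwidth savings come from.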
Squid: A Popular Proxy Cache Server
Installation
sudo apt-get install squid
Important Directories
- /usr/sbin/: Contains executable files.
- /var/run/squid.pid: Contains the PID (Process Identifier) of the Squid process.
- /var/log/squid/: Stores Squid’s log files.
- /var/spool/squid/: Contains the cache itself.
- /etc/squid/squid.conf: Squid configuration file.
- /usr/lib/squid/: Accessories, authentication, etc.
- /usr/share/doc/squid/: Squid documentation.
Squid Configuration
Squid’s configuration is found in /etc/squid/squid.conf.
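- cache_effective_user squid / cache_effective_group squid: Set the user and group that Squid runs as.
- http_port 8080: The port on which Squid listens for client requests.
- cache_mem 8 MB: Sets the amount of RAM dedicated to storing cache blocks.
- cache_dir ufs /var/spool/squid 500 16 256: Sets the location and size of the hard disk cache. Here the cache lives in /var/spool/squid, uses the ufs storage format, may grow to 500 MB, and is spread over 16 first-level directories with 256 second-level directories each.
- visible_hostname: The name by which the cache is announced.
Using ACL (Access Control Lists)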
The ACL parameter is used to:
- Protect the proxy cache server from external connections, preventing unknown clients from saturating the connection.
- Protect clients from accessing dangerous ports, acting as a firewall against potential attacks.
- Establish a hierarchy of caches.
- Establish whether the network works as a whole or as individual machines.
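To make the first two purposes more concrete, here is a rough sketch of the kind of check an access rule performs: is the client inside a permitted network, and does the requested URL match a blocked pattern? It does not reproduce Squid's own matching logic. The network range, the blocked words, and the `is_allowed` function are hypothetical.

```python
import ipaddress
import re

# Hypothetical rule set: one permitted client network, one blocked URL pattern.
ALLOWED_NETWORK = ipaddress.ip_network("192.168.1.0/24")
BLOCKED_URL = re.compile(r"games|casino", re.IGNORECASE)  # case-insensitive


def is_allowed(client_ip: str, url: str) -> bool:
    """Allow only known clients, and only for URLs that are not blocked."""
    if ipaddress.ip_address(client_ip) not in ALLOWED_NETWORK:
        return False  # unknown client: refuse the connection
    if BLOCKED_URL.search(url):
        return False  # URL matches a blocked word
    return True


print(is_allowed("192.168.1.7", "http://example.com/news"))   # True
print(is_allowed("10.0.0.5", "http://example.com/news"))      # False
print(is_allowed("192.168.1.7", "http://example.com/Games"))  # False
```

The address check plays a role similar to a source-based rule, and the pattern check is similar to a case-insensitive URL rule.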
Examples
acl [list_name] src [list_components]
acl [list_name] time time_frame
acl [list_name] srcdomain/dstdomain domain
acl [list_name] url_regex pattern
acl [list_name] maxconn limit
- list_name: The name assigned to the ACL.
- src: Refers to the origin, i.e., the IP address of a client.
- [list_components]: Can indicate network IP addresses with a network mask or files whose contents are IP addresses.
- time: Allows or denies connections within a time slot, where time_frame follows specific construction rules.
- srcdomain/dstdomain: Sets permissions on origin and destination web domains.
- url_regex: Defines ACLs that identify websites based on whether the URL contains certain characters or words, satisfying a regular expression or pattern. Use -i to ignore case (ignorecase).
- maxconn: Sets a maximum number of connections per IP.
Log Files
Squid generates the following log files:
cache_access_log /squid/cache/logs/access.log
cache_store_log /squid/cache/logs/store.log
cache_log /squid/cache/logs/cache.log
- access.log: Stores requests made to the proxy. Analyzing its contents can provide statistics.
- store.log: Stores information on cache management, i.e., the entry and exit of objects.
- cache.log: Stores general operating information about the cache, errors, etc.
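As an example of the statistics that access.log can yield, the sketch below counts requests per result code (for instance hits versus misses). It assumes the log uses Squid's default native format, in which the fourth whitespace-separated field is the result code and HTTP status joined by a slash. The sample lines and the `summarize` function are made up for illustration.

```python
from collections import Counter

# Hypothetical sample lines in Squid's native access.log layout:
# time elapsed client code/status bytes method URL user hierarchy type
SAMPLE_LOG = """\
1700000000.123    45 192.168.1.7 TCP_MISS/200 5120 GET http://example.com/ - HIER_DIRECT/93.184.216.34 text/html
1700000001.456     2 192.168.1.8 TCP_HIT/200 5120 GET http://example.com/ - HIER_NONE/- text/html
1700000002.789    60 192.168.1.7 TCP_MISS/404 310 GET http://example.com/missing - HIER_DIRECT/93.184.216.34 text/html
"""


def summarize(log_text: str) -> Counter:
    """Count log entries per result code (the part before the slash)."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or truncated lines
        result_code = fields[3].split("/")[0]
        counts[result_code] += 1
    return counts


stats = summarize(SAMPLE_LOG)
print(stats["TCP_HIT"], stats["TCP_MISS"])  # 1 2
```

On a real server the same function could be fed the contents of the access.log file; if a custom log format is configured, the field position would need to be adjusted.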
To create the cache directories, start Squid, and check that it is running:
/usr/sbin/squid -z
/etc/init.d/squid start
ps aux | grep squid
Transparent Proxy
A transparent proxy is packet-filtering software for inbound and outbound traffic, located between a LAN and the internet. Computers on the local network are not aware of its existence. Internally, it redirects requests from the local network by changing the destination address of the connection, using NAT (Network Address Translation).
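The redirection step can be pictured as a rewrite of the destination address of outbound web connections. The sketch below only models the decision, not real packet handling, which happens in the operating system's NAT layer. The proxy address, the choice of port 80 as "web traffic", and the `redirect` function are assumptions made for illustration.

```python
from typing import Tuple

# Hypothetical proxy address on the LAN, listening on port 8080.
PROXY_ADDRESS = ("192.168.1.1", 8080)
HTTP_PORT = 80


def redirect(destination: Tuple[str, int]) -> Tuple[str, int]:
    """Send outbound web traffic to the proxy; leave other traffic alone."""
    _, port = destination
    if port == HTTP_PORT:
        return PROXY_ADDRESS
    return destination


print(redirect(("93.184.216.34", 80)))  # ('192.168.1.1', 8080)
print(redirect(("93.184.216.34", 22)))  # ('93.184.216.34', 22)
```

Because the rewrite happens on the gateway, the client keeps addressing the original server and needs no proxy settings of its own.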
Advantages
- Forces users to use the proxy without them being aware of it.
- Eliminates the need to configure each web browser individually.