(1). Overview of Varnish
Varnish is a high-performance, open-source HTTP accelerator that can effectively reduce the load on the web server and increase access speed. According to the official description, Varnish is a caching HTTP reverse proxy.
Poul-Henning Kamp, the author of Varnish, is one of FreeBSD's core developers. He argues that computers today are far more complex than they were in 1975. Back then there were only two storage media: memory and hard disk. A modern system, however, has L1, L2 and even L3 caches inside the CPU in addition to main memory, and the hard disk has its own cache as well. Squid's approach of handling object replacement entirely inside its own cache cannot know about or optimize for any of this, but the operating system can; Varnish's design therefore leaves that work to the operating system.
When Varnish is deployed, the handling of web requests changes somewhat. The client's request is first accepted by Varnish, which analyzes it and forwards it to the back-end web server. The back-end web server processes the request as usual and returns the result to Varnish.
Varnish's capabilities do not stop there. Its core function is to cache the results returned by the back-end web server: if a subsequent identical request arrives, Varnish does not forward it to the web server but returns the cached result instead. This effectively reduces the load on the web server, improves response time, and allows more requests per second to be served. Another major reason Varnish is fast is that its cache lives entirely in memory, which is much faster than disk. Optimizations like these make Varnish faster than you might expect. However, since memory is limited in real systems, you need to configure a size limit for the cache and avoid caching duplicate content.
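As a minimal sketch of how such a limit can be set, the -s option of varnishd selects the storage backend and its size at startup (the VCL path and the 256 MB figure below are placeholders, not recommendations):

varnishd -a :80 -f /path/to/default.vcl -s malloc,256m    //cap the in-memory (malloc) cache at 256 MB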
The order in which a request is processed: accept the request -> analyze the request (URL and headers) -> calculate the hash -> look up the cache -> check freshness -> fetch from the origin (back end) if necessary -> store in the cache -> build the response message -> respond and log.
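As a rough illustration of how these stages map onto Varnish's configuration, here is a minimal VCL 4.0 sketch (the backend address and the 120-second TTL are placeholder values); each subroutine corresponds to one of the stages above:

vcl 4.0;

backend default {
    .host = "127.0.0.1";      //origin ("access source") used on a cache miss; placeholder address
    .port = "8080";
}

sub vcl_recv {
    //"accept the request / analyze the request": inspect URL and headers here
    return (hash);            //continue to hash calculation and cache lookup
}

sub vcl_hash {
    hash_data(req.url);       //"hash calculation": the URL becomes part of the cache key
    if (req.http.host) {
        hash_data(req.http.host);
    }
    return (lookup);          //"find the cache"
}

sub vcl_backend_response {
    //"freshness detection / cache": decide how long a fetched object stays fresh
    set beresp.ttl = 120s;
    return (deliver);
}

sub vcl_deliver {
    //"create response message / respond": last chance to adjust response headers
    return (deliver);
}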
Varnish listens on port 6081 by default and runs as two processes: a management (master) process and a child (cacher) process. Official site: https://www.varnish-cache.org/.
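As a hedged illustration of the management side: if varnishd is started with a -T management address and a shared secret file (the ports and the /etc/varnish/secret path below follow the usual packaged defaults and are assumptions), varnishadm can connect to that port to control and inspect the child process:

varnishd -a :6081 -T localhost:6082 -S /etc/varnish/secret -f /etc/varnish/default.vcl
varnishadm -T localhost:6082 -S /etc/varnish/secret status    //ask the management process about the child's state
Child in state running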
(2). Comparison of Varnish features with Squid
Varnish features:
The cache is kept in memory, so the data disappears after a restart.
Good I/O performance through the use of virtual memory.
Supports setting a precise cache time in the range of 0-60 seconds (see the TTL example after this list).
VCL (Varnish Configuration Language, Varnish's own domain-specific language) makes configuration management flexible.
On a 32-bit machine the cache file size is limited to 2 GB.
Has powerful management functions such as top, stat, admin, list, etc.
The state machine is cleverly designed and clearly structured.
Uses a binary heap to manage cached objects so that expired objects can be actively removed.
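As a quick, illustrative sketch of the precise cache-time feature, a per-object TTL can be set in vcl_backend_response in the VCL file (the 30-second value and the URL pattern are examples only):

sub vcl_backend_response {
    if (bereq.url ~ "^/news/") {     //example path, adjust to your site
        set beresp.ttl = 30s;        //keep matching objects in the cache for 30 seconds
    }
}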
Varnish vs. Squid:
Similarities:
Both are reverse proxy servers;
Both are open-source software.
Varnish has advantages over Squid:
Varnish is more stable. Under the same workload, Squid servers fail more often than Varnish and need to be restarted frequently.
Varnish serves content faster. Varnish uses "Virtual Page Cache" technology and reads all cached data directly from memory, while Squid reads from the hard disk, so Varnish has faster access speed.
Varnish supports more concurrent connections, because it releases TCP connections faster than Squid and can therefore sustain more TCP connections under high concurrency.
Through its management port, Varnish can clean up parts of the cache in batches using regular expressions, which Squid cannot do (see the ban example after this list);
Squid runs as a single process on a single CPU core, whereas Varnish forks multiple processes to handle requests and can therefore reasonably use all cores to serve them.
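As an illustrative example of that regular-expression cleanup (the URL pattern is a placeholder), Varnish's ban mechanism can be driven through varnishadm:

varnishadm ban req.url '~' '^/news/'    //invalidate every cached object whose URL starts with /news/
varnishadm ban.list                     //show the currently active bans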
Disadvantages of Varnish over Squid:
Under high concurrency, Varnish's CPU, I/O and memory overhead is higher than Squid's.
Once the Varnish process hangs, crashes or is restarted, the cached data is completely released from memory and all requests are sent to the back-end servers, which puts heavy pressure on them under high concurrency.
When Varnish sits behind HA/F5 load balancing, requests for the same URL may reach a different Varnish server each time. Each of those servers is then penetrated through to the back end, and the same request ends up cached on multiple servers, which wastes Varnish's cache resources and degrades performance.
(3). Install Varnish
1) Installation environment
youxi1  192.168.1.6  Varnish installed from source
youxi2  192.168.1.7  Varnish installed via yum; Web back end
youxi3  192.168.1.8  Web back end
2) Installation
Install Varnish 6.2.0 from source on youxi1 (recommended)
//Install dependent packages
[root@youxi1 ~]# yum -y install make autoconf automake libedit-devel libtool ncurses-devel pcre-devel pkgconfig python3-docutils python3-sphinx graphviz
[root@youxi1 ~]# tar xf varnish-6.2.0.tgz -C /usr/local/src/
[root@youxi1 ~]# cd /usr/local/src/varnish-6.2.0/
[root@youxi1 varnish-6.2.0]# ./configure --prefix=/usr/local/varnish
[root@youxi1 varnish-6.2.0]# make && make install
[root@youxi1 varnish-6.2.0]# echo $?
0
[root@youxi1 varnish-6.2.0]# cd /usr/local/varnish/
[root@youxi1 varnish]# mkdir etc
[root@youxi1 varnish]# cp share/doc/varnish/example.vcl etc/default.vcl    //Generate the VCL configuration file
Install Varnish via yum on youxi2
[root@youxi2 ~]# vim /etc/yum.repos.d/varnishcache_varnish62.repo
[varnishcache_varnish62]
name=varnishcache_varnish62
baseurl=https://packagecloud.io/varnishcache/varnish62/el/7/$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://packagecloud.io/varnishcache/varnish62/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
metadata_expire=300

[varnishcache_varnish62-source]
name=varnishcache_varnish62-source
baseurl=https://packagecloud.io/varnishcache/varnish62/el/7/SRPMS
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://packagecloud.io/varnishcache/varnish62/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
metadata_expire=300

[root@youxi2 ~]# yum clean all && yum list    //Clear the yum cache and rebuild it
[root@youxi2 ~]# yum -y install varnish
3) Configure Varnish on youxi1 to cache the site on youxi2
Modify the VCL configuration file on youxi1
[root@youxi1 ~]# vim /usr/local/varnish/etc/default.vcl
backend default {                    //Lines 16-19
    .host = "192.168.1.7";           //Change to the IP address of the Web back-end site
    .port = "80";                    //Change to the port of the Web back-end site
}
sub vcl_deliver {                    //Starting at line 35, mark cache hits
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT cache";
    } else {
        set resp.http.X-Cache = "Miss cache";
    }
}
Configuring environment variables
[root@youxi1 ~]# vim /etc/profile.d/varnish.sh
export PATH=/usr/local/varnish/bin:/usr/local/varnish/sbin:$PATH
[root@youxi1 ~]# . /etc/profile.d/varnish.sh    //Load the environment variables
Start Varnish
[root@youxi1 ~]# varnishd -a 192.168.1.6:80,HTTP -f /usr/local/varnish/etc/default.vcl
Debug: Version: varnish-6.2.0 revision b14a3d38dbe918ad50d3838b11aa596f42179b54
Debug: Platform: Linux,3.10.0-957.el7.x86_64,x86_64,-jnone,-sdefault,-sdefault,-hcritbit
Debug: Child (18374) Started
[root@youxi1 ~]# ps aux | grep varnishd
root      18364  0.0  0.0   22188  1532 ?      SLs  22:59   0:00 varnishd -a 192.168.1.6:80,HTTP -f /usr/local/varnish/etc/default.vcl
root      18374  1.8  4.4 1029912 89468 ?      SLl  22:59   0:00 varnishd -a 192.168.1.6:80,HTTP -f /usr/local/varnish/etc/default.vcl
root      18593  0.0  0.0  112724   992 pts/0  S+   23:00   0:00 grep --color=auto varnishd
[root@youxi1 ~]# firewall-cmd --permanent --zone=public --add-port=80/tcp && firewall-cmd --reload
success
success
Build a test Web backend on youxi2
[root@youxi2 ~]# yum -y install httpd
[root@youxi2 ~]# echo youxi2 > /var/www/html/index.html
[root@youxi2 ~]# systemctl start httpd
[root@youxi2 ~]# firewall-cmd --permanent --zone=public --add-port=80/tcp && firewall-cmd --reload
success
success
* Final Test
Then use the curl command to run a cache-hit test. The -I option fetches only the HTTP response headers, not the page content.
[root@youxi1 ~]# curl -I 192.168.1.7    //Direct access to youxi2
HTTP/1.1 200 OK
Date: Sun, 04 Aug 2019 15:14:16 GMT
Server: Apache/2.4.6 (CentOS)
Last-Modified: Sun, 04 Aug 2019 14:56:47 GMT
ETag: "7-58f4bccfca680"
Accept-Ranges: bytes
Content-Length: 7
Content-Type: text/html; charset=UTF-8

[root@youxi1 ~]# curl -I 192.168.1.6    //First visit to youxi1
HTTP/1.1 200 OK
Date: Sun, 04 Aug 2019 15:14:19 GMT
Server: Apache/2.4.6 (CentOS)
Last-Modified: Sun, 04 Aug 2019 14:56:47 GMT
ETag: "7-58f4bccfca680"
Content-Length: 7
Content-Type: text/html; charset=UTF-8
X-Varnish: 12
Age: 0
Via: 1.1 varnish (Varnish/6.2)
X-Cache: Miss cache    //Miss this time
Accept-Ranges: bytes
Connection: keep-alive

[root@youxi1 ~]# curl -I 192.168.1.6    //Second visit to youxi1
HTTP/1.1 200 OK
Date: Sun, 04 Aug 2019 15:16:39 GMT
Server: Apache/2.4.6 (CentOS)
Last-Modified: Sun, 04 Aug 2019 14:56:47 GMT
ETag: "7-58f4bccfca680"
Content-Length: 7
Content-Type: text/html; charset=UTF-8
X-Varnish: 15 32773
Age: 2
Via: 1.1 varnish (Varnish/6.2)
X-Cache: HIT cache    //Cache hit this time
Accept-Ranges: bytes
Connection: keep-alive
The cache time here is short; you can try enabling httpd's persistent connections (set KeepAlive On in its configuration file and restart), for example as in the snippet below.
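A minimal sketch on youxi2, assuming the standard CentOS httpd layout (the two extra values are optional examples):

[root@youxi2 ~]# vim /etc/httpd/conf/httpd.conf
# Enable persistent (keep-alive) connections
KeepAlive On
# Optional examples: cap requests per connection and the idle timeout
MaxKeepAliveRequests 100
KeepAliveTimeout 5
[root@youxi2 ~]# systemctl restart httpd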
4) Configure Varnish on youxi1 to cache multiple websites (youxi2, youxi3)
Modify the VCL configuration file on youxi1
[root@youxi1 ~]# vim /usr/local/varnish/etc/default.vcl
backend youxi2 {                     //Rename "default" to the host name
    .host = "192.168.1.7";
    .port = "80";
}
backend youxi3 {                     //Add a second backend
    .host = "192.168.1.8";
    .port = "80";
}
sub vcl_recv {                       //Add inside vcl_recv
    if (req.http.host ~ "^(www.)?you.cn") {      //Regular expression match
        set req.http.host = "www.you.cn";
        set req.backend_hint = youxi2;           //Point to the youxi2 backend
    } elsif (req.http.host ~ "^bbs.you.cn") {    //Regular expression match
        set req.backend_hint = youxi3;           //Point to the youxi3 backend
    }
}
To restart Varnish we use the killall command, which requires installing the psmisc package:
[root@youxi1 ~]# yum -y install psmisc
[root@youxi1 ~]# killall varnishd
[root@youxi1 ~]# varnishd -a 192.168.1.6:80,HTTP -f /usr/local/varnish/etc/default.vcl
Debug: Version: varnish-6.2.0 revision b14a3d38dbe918ad50d3838b11aa596f42179b54
Debug: Platform: Linux,3.10.0-957.el7.x86_64,x86_64,-jnone,-sdefault,-sdefault,-hcritbit
Debug: Child (19017) Started
Build a test Web backend on youxi3
[root@youxi3 ~]# yum -y install httpd
[root@youxi3 ~]# echo youxi3 > /var/www/html/index.html
[root@youxi3 ~]# systemctl start httpd
[root@youxi3 ~]# firewall-cmd --permanent --zone=public --add-port=80/tcp && firewall-cmd --reload
success
success
Edit the /etc/hosts file on youxi1
[root@youxi1 ~]# vim /etc/hosts
192.168.1.6 www.you.cn
192.168.1.6 bbs.you.cn
- Testing
[root@youxi1 ~]# curl www.you.cn       //First visit, you can see it points to youxi2
youxi2
[root@youxi1 ~]# curl -I www.you.cn    //Second visit, headers only, cache hit
HTTP/1.1 200 OK
Date: Sun, 04 Aug 2019 16:09:19 GMT
Server: Apache/2.4.6 (CentOS)
Last-Modified: Sun, 04 Aug 2019 14:56:47 GMT
ETag: "7-58f4bccfca680"
Content-Length: 7
Content-Type: text/html; charset=UTF-8
X-Varnish: 5 32772
Age: 12
Via: 1.1 varnish (Varnish/6.2)
X-Cache: HIT cache    //Cache hit
Accept-Ranges: bytes
Connection: keep-alive

[root@youxi1 ~]# curl bbs.you.cn       //First visit, you can see it points to youxi3
youxi3
[root@youxi1 ~]# curl -I bbs.you.cn    //Second visit, headers only, cache hit
HTTP/1.1 200 OK
Date: Sun, 04 Aug 2019 16:09:49 GMT
Server: Apache/2.4.6 (CentOS)
Last-Modified: Sun, 04 Aug 2019 16:07:43 GMT
ETag: "7-58f4ccaa0e583"
Content-Length: 7
Content-Type: text/html; charset=UTF-8
X-Varnish: 32774 8
Age: 6
Via: 1.1 varnish (Varnish/6.2)
X-Cache: HIT cache
Accept-Ranges: bytes
Connection: keep-alive
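If you want to confirm which backend actually handled a request, one option (a hedged illustration; run it in another terminal while repeating the curl requests) is to filter Varnish's shared-memory log by Host header with varnishlog:

[root@youxi1 ~]# varnishlog -g request -q 'ReqHeader:Host eq "bbs.you.cn"'    //show full request records whose Host header is bbs.you.cn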
(4). Extension
1) Why use caching:
Data that has been accessed is likely to be accessed again, and hot data is accessed many times.
After a piece of data has been accessed once, clients near the original user often access it again shortly afterwards.
2) Since the cache must be read at high speed, the best approach is to keep it entirely in memory.
Common in-memory stores: memcached, Redis, HANA.
For whole pages, however, it is unrealistic to keep everything in memory, so the cache is stored in a combination of memory and a cache disk.
Key-value layout: keys in memory, values on disk (see the storage sketch below).
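A hedged sketch of this memory-plus-disk idea in Varnish itself: the -s option also accepts a file storage backend, which keeps cached objects in a disk-backed file while the index stays in memory, roughly matching the key-in-memory, value-on-disk split (the path and the 10 GB size are placeholders):

varnishd -a :80 -f /usr/local/varnish/etc/default.vcl -s file,/data/varnish_storage.bin,10G    //disk-backed cache file of 10 GB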
3) The form of the data: key-value.
Key: the result of hashing the access path, URL, or other specific features; stored in memory.
Value: the pages that users actually receive; usually stored on high-speed hard drives.
4) Anything related to caching comes down to two components: memory and high-speed hard disks.
5) Common terms:
Hit: the requested data can be served from the cache. For a web site, the cache server acts as the front-end server.
Hit ratio: hits / (hits + misses) (see the varnishstat example after this list).
Hot data: frequently accessed data.
Cache space: memory cache space and disk cache space.
Clean-up: periodic clean-up, LRU (least recently used: the data unused for the longest time is deleted first), and periodic purging/updating.
Cached objects: user information, cookies, transaction information, pages in memory; all of these can be understood as objects.
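To check the hit ratio on a running instance, varnishstat exposes hit and miss counters (the -1 flag prints the statistics once and exits; the field names below are as in Varnish 6.x):

varnishstat -1 -f MAIN.cache_hit -f MAIN.cache_miss    //print only the hit and miss counters

If, for example, the counters showed 900 hits and 100 misses, the hit ratio would be 900 / (900 + 100) = 90%.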
Reference resources: https://www.oschina.net/translate/speed-your-web-site-varnish?print