Entry Processes and .htaccess
Overview
This article describes our experience of dealing with a denial of service (DOS) style problem by modifying the .htaccess file commonly available in web hosting environments. It also discusses (or makes guesses at) how to reduce the mysterious "Entry Processes" count shown in the cPanel sidebar when managing your web site.
Problem History
- We had provided simple CGI and PHP service scripts at our http://www.networksecuritytoolkit.org/ site.
- The scripts simply reported a short string (like: "66.198.240.15") indicating the IPv4 address of the client making the request.
- This simple service became very popular.
- Even though the scripts were extremely short, our original web hosting provider became upset with us due to the number of connections being made.
- We tried to appease our web hosting provider by disabling the service (removing the scripts and triggering 404 errors to all of our clients).
- While this prevented the service from running, it did not stop the incoming connections, and the server still had to answer each one with a 404 (our web hosting provider was still not happy with us).
- We then modified our top-level .htaccess file to reduce the load of handling 404 responses (this still didn't make our web hosting provider happy).
- At this point we were at a loss as to what to do, as the continual pounding on the web server from outside requests for our simple IPv4 service would not stop.
The bottom line is that by providing a useful and simple service that many people had come to rely on, we had essentially created a denial of service (DOS) upon ourselves: many machines throughout the Internet run unmanaged cron jobs that hit our service to request an IPv4 address. We were powerless to stop the external requests (short of moving to a new domain).
We tried moving our site to a new web hosting service, as our current provider could offer us no help in resolving the issue (short of upgrading our site to a dedicated server).
Things were not much better at our new site.
- The new web hosting provider used the word "Unlimited" in much of its advertising (this is common).
- However, our web site performance was terrible (pages would sometimes not load at all or only partially load).
- Using the cPanel interface at the new web hosting environment, we discovered that we were maxing out our "Entry Processes" count (we had a limit of 35).
- Through a lot of trial and error and guessing, we determined that we were hitting this "Entry Processes" limit due to 404 errors from incoming requests for PHP and CGI scripts that did not exist on our system.
- At this point we were quite frustrated and sent in a help request to our new web hosting provider. Basically, we wanted to know how to prevent the "Entry Processes" limit from maxing out when clients sent bad requests to our site. It seemed quite unfair to us that our site could be rendered unusable simply by having requests for documents not available at our site.
[Screenshot: what our cPanel looked like at this point]
The .htaccess File
While waiting for our web host provider to offer a suggestion, we started experimenting with the ~/public_html/.htaccess file that the web host provider enables in accounts. This file can be used to "tweak" how Apache handles requests from the outside world.
We decided to first try minimizing the 404 load on the server by telling Apache to just echo a message (instead of looking for a document to send back) with the following:
ErrorDocument 404 "<H1>Page not found</H1>"
While this worked and reduced the load a bit, it did not help with the "Entry Processes" count that was making our web site unusable.
We next tried adding RedirectMatch directives to our .htaccess file:
ErrorDocument 404 "<H1>Page not found</H1>"
RedirectMatch temp /ip.php$ /no-ip.txt
RedirectMatch temp /ip.cgi$ /no-ip.txt
The above rules instructed Apache to send back a simple text file (http://www.networksecuritytoolkit.org/no-ip.txt) anytime a request was made for http://www.networksecuritytoolkit.org/nst/tools/ip.php or http://www.networksecuritytoolkit.org/nst/cgi-bin/ip.cgi (the two URLs that the outside world was pounding).
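To make the matching behavior concrete, here is a small Python sketch of how patterns such as /ip.php$ select request paths. It assumes that Python's re.search behaves like Apache's regular expression matching for patterns this simple; the helper name redirect_target is ours and exists only for illustration.

```python
import re

# The two patterns and the target taken from the .htaccess rules above.
RULES = [
    (re.compile(r"/ip.php$"), "/no-ip.txt"),
    (re.compile(r"/ip.cgi$"), "/no-ip.txt"),
]


def redirect_target(path):
    """Return the redirect target for a request path, or None if no rule matches.

    Hypothetical helper: it mimics RedirectMatch by searching for the
    pattern anywhere in the path (the pattern is only anchored at the end).
    """
    for pattern, target in RULES:
        if pattern.search(path):
            return target
    return None


if __name__ == "__main__":
    for path in ("/nst/tools/ip.php", "/nst/cgi-bin/ip.cgi",
                 "/index.html", "/nst/tools/ip.php5"):
        print(path, "->", redirect_target(path))
```

Because the patterns are anchored only at the end, they match the scripts in any directory. Note that the unescaped dot matches any character, so a stricter pattern would be written as /ip\.php$.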
Lo and behold, that fixed the problem! Immediately our "Entry Processes" count dropped to 0 and our site was usable again.
Entry Processes Theory
After having found a solution, it was still a bit of a mystery as to why removing the scripts from the system did not fix the problem, but adding the RedirectMatch lines to the .htaccess file did. After all, we were now returning documents instead of just echoing back "Page not found" messages. Here is the theory:
- An "Entry Process" count is triggered whenever an outside client requests a PHP or CGI document from the web server.
- This "Entry Process" count is triggered regardless of whether or not the document actually exists on the server (simply removing the PHP or CGI script will not fix the issue).
- An "Entry Process" count has a life span longer than it takes to run the process (Apache must be holding these counts open for a period of time after each request), so they will accumulate much faster than you might expect. For example, our PHP script would run in less than 0.05 seconds on the web host which would imply that you should be able to have about 700 requests a second before reaching the 35 "Entry Process" count. We were not seeing anything close to that request rate. This implies that "Entry Process" counts live much longer than the process itself.
- Adding the RedirectMatch statements to the .htaccess file prevents Apache from invoking PHP or CGI (prevents the spawning of any external process). This is the key to how we cleaned up our "Entry Processes" issue.
- Internally, Apache must not care whether or not the actual PHP or CGI files exist. If a PHP or CGI request comes in, the Apache module must be spawning off a process to run it and that process must be discovering the missing document.
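The arithmetic behind the third point can be checked with a short Python sketch. It assumes that each request occupies one "Entry Process" slot for exactly as long as the request is being handled, which is the very assumption our observations contradicted. The function names are ours, and the observed rate used in the second call is a made-up figure for illustration only (we did not record one).

```python
ENTRY_PROCESS_LIMIT = 35   # limit shown in our cPanel
SCRIPT_RUNTIME_S = 0.05    # approximate run time of the PHP script, in seconds


def max_request_rate(limit, hold_time_s):
    """Requests per second needed to keep `limit` slots busy
    when each request holds a slot for hold_time_s seconds."""
    return limit / hold_time_s


def implied_hold_time(limit, observed_rate):
    """Seconds each slot must be held if `limit` slots are saturated
    at observed_rate requests per second."""
    return limit / observed_rate


if __name__ == "__main__":
    # Rate needed to hit the limit if a slot lived only as long as the script.
    print(max_request_rate(ENTRY_PROCESS_LIMIT, SCRIPT_RUNTIME_S))
    # Hypothetical observed rate of 10 requests per second (illustration only).
    print(implied_hold_time(ENTRY_PROCESS_LIMIT, 10))
```

The second function turns the reasoning around: if the limit is reached at a much lower request rate, each slot must be held for correspondingly longer than the script's run time.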
Unfortunately, this leaves us with the realization that anyone can trigger effective DOS attacks on shared web hosting environments. This is as simple as making requests for PHP or CGI scripts that don't exist at the site (at least if Apache is configured to support PHP/CGI). This is quite unfortunate and we don't know of a good solution to this problem.
SHTML Replacement
Since the general theory was that "Entry Process" counts were only being triggered by PHP and CGI scripts, when Apache had to deal with child processes, we started to wonder whether the same would hold true for SHTML requests. We were curious as to whether Apache would spawn another process when handling SHTML documents. To test this, we created our IPv4 echo service as a simple one-line SHTML file containing:
<!--#echo var="REMOTE_ADDR" -->
We then verified that the URL http://www.networksecuritytoolkit.org/nst/tools/ip.shtml was functioning as expected and returned the expected results.
We then opened several terminal windows and hit the URL with the following command:
taco:~ pkb$ while true; do curl 'http://www.networksecuritytoolkit.org/nst/tools/ip.shtml'; sleep 1; done
72.43.94.73
72.43.94.73
...
We were pleased to find that our "Entry Processes" remained at 0 and our cPanel loads looked good.
Armed with this knowledge, we then refactored our .htaccess file to redirect the old PHP and CGI services to this new SHTML rewrite:
ErrorDocument 404 "<H1>Page not found</H1>"
RedirectMatch 301 /ip.php$ /nst/tools/ip.shtml
RedirectMatch 301 /ip.cgi$ /nst/tools/ip.shtml
We now have a configuration that automagically switches all of the PHP and CGI client requests to our new SHTML implementation. So, in theory, all of those unmanaged cron jobs out there in the wild should start working again (at least if they are sophisticated enough to follow the redirect).
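As a rough illustration of what "following the redirect" means for a client, the Python sketch below runs a tiny stand-in server on the local machine and fetches the old URL through it. Everything here is an assumption made for the demo: IpHandler only imitates the redirect rules and the SHTML echo page (it is not Apache), and fetch_ip and demo are names we made up.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class IpHandler(BaseHTTPRequestHandler):
    """Stand-in server: old URLs redirect, the new URL echoes the client IP."""

    def do_GET(self):
        if self.path.endswith(("/ip.php", "/ip.cgi")):
            # Imitates: RedirectMatch 301 /ip.php$ /nst/tools/ip.shtml
            self.send_response(301)
            self.send_header("Location", "/nst/tools/ip.shtml")
            self.send_header("Content-Length", "0")
            self.end_headers()
        elif self.path == "/nst/tools/ip.shtml":
            body = self.client_address[0].encode("ascii")
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_ip(url):
    """Fetch url and return (final_url, body); urllib follows the 301 itself."""
    # An empty ProxyHandler keeps proxy environment variables out of the demo.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=5) as response:
        return response.geturl(), response.read().decode("ascii").strip()


def demo():
    server = HTTPServer(("127.0.0.1", 0), IpHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return fetch_ip(f"http://127.0.0.1:{port}/nst/tools/ip.php")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

A client built on a library like this keeps working without changes. By contrast, curl only follows redirects when given the -L option, so a cron job calling plain curl would receive the 301 response rather than the address.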
The bottom line is that if you have simple requirements that can be handled without causing Apache to manage external processes, you can improve the performance of your website.