QUOTE(Dima @ Jul 17 2007, 10:22 PM)
I can tell you for sure that we didn't have downtimes for 4 hours. That can prove any customer on the server 65.98.8.190. I don't know what kind of script you are using to check the downtimes, but we never had such huge downtime problems.
Hi Dima,
Thanks for your reply.
I did some more detailed research and the downtime appears to have been
30 minutes of minimal access, followed by no access for 3 hours and 24 minutes.
Here is my supporting info confirming that the site was down for this time.
Results from uptime monitor service 1: 4 hours
Results from uptime monitor service 2: over 4 hours
Hostony webalizer stats for June show a clear dip for the 17th
Hostony access log file shows uncharacteristically low access for 30 minutes, followed by
no access for 3 hours 24 minutes.
Here are the details:
I use two free services that do site up-time monitoring.
(1)
The first is http://uptime.weberdev.com/ which checks my site every 10 minutes.
When it can't reach the site, it sends me an email. It continues to try every 10 minutes,
and sends another email when the site can be accessed again.
I have it set to fetch the home page at www.fpga-faq.org
This is the email that indicated the site was unavailable (10:21 PDT):
>To: philip@fliptronics.com
>Subject: WeberDev's Web Site Uptime Monitor Service Alert
>From: uptime@weberdev.com
>Date: Sun, 17 Jun 2007 20:21:17 +0300 (IDT)
>
>This is an Alert from WeberDev's Web Site Uptime Monitor Service.
>
>Your site seems to be down.
>Monitored Site : http://www.fpga-faq.org/
>reported Error : Script timed out
And this is the email indicating that the site was back up (14:30 PDT):
>To: philip@fliptronics.com
>Subject: WeberDev Uptime Alert - Site is UP
>From: uptime@weberdev.com
>Date: Mon, 18 Jun 2007 00:30:50 +0300 (IDT)
>
>Monitored Site : http://www.fpga-faq.org/
>Status : Site is Up
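For anyone who wants to check the times, here is a minimal Python sketch that takes the two Date headers from the WeberDev emails above, converts them to PDT, and computes the time between the "down" and "up" alerts. The fixed UTC-7 offset for PDT is my assumption, and I dropped the trailing "(IDT)" comment from the header strings to keep the parsing simple.

```python
from datetime import timedelta, timezone
from email.utils import parsedate_to_datetime

# Date headers copied from the two WeberDev emails (trailing "(IDT)" comment dropped).
down = parsedate_to_datetime("Sun, 17 Jun 2007 20:21:17 +0300")
up = parsedate_to_datetime("Mon, 18 Jun 2007 00:30:50 +0300")

# Assumption: PDT is a fixed UTC-7 offset (valid for June 2007).
PDT = timezone(timedelta(hours=-7), "PDT")

down_pdt = down.astimezone(PDT).strftime("%H:%M")
up_pdt = up.astimezone(PDT).strftime("%H:%M")
outage = up - down

print(down_pdt, up_pdt)  # 10:21 14:30
print(outage)            # 4:09:33
```

Since the monitor only polls every 10 minutes, the real start and end can each be off by up to 10 minutes.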
(2)
The second service I use is http://www.siteuptime.com which checks my site every 30 minutes.
It sends an email when the site cannot be accessed, and again when the site can be accessed.
In the site 'ok' message, it includes the number of consecutive 30-minute attempts that were
unsuccessful, thus confirming the duration. Here are the two emails:
>To: philip@fliptronics.com
>Subject: SiteUptime Issue for FPGA-FAQ.ORG
>From: SiteUptime <system@siteuptime.com>
>Date: Sun, 17 Jun 2007 11:06:43 -0700
>
>Dear Philip Freidin,
>
>This is an automated message from SiteUptime.
>
>Alert Type: Site Not Available
>Result: Failed
>Time: June 17, 2007 11:04:52 PST
>HostName/URL: www.fpga-faq.org/
>Monitor Name: fpga-faq.org
>Service: http
And here is the second email.
>To: philip@fliptronics.com
>Subject: Uptime OK for FPGA-FAQ.ORG
>From: SiteUptime <system@siteuptime.com>
>Date: Sun, 17 Jun 2007 14:35:05 -0700
>
>Dear Philip Freidin,
>
>This is an automated message from SiteUptime.
>
>The system confirmed 8 failed checks at 30 minute intervals starting at June 17, 2007 10:36:42 PST
>
>Alert Type: Site is Available
>Result: Ok
>Time: June 17, 2007 14:35:04 PST
>HostName/URL: www.fpga-faq.org/
>Monitor Name: fpga-faq.org
>Service: http
This confirms the 4-hour outage I described in my first posting.
During this time I also tried accessing my site and was unsuccessful.
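As a cross-check on the SiteUptime numbers, here is a small Python sketch that compares the duration implied by the failed-check count with the two timestamps in the 'ok' email. Both timestamps carry the same timezone label in the email, so I treat them as naive local times; only the difference matters here.

```python
from datetime import datetime, timedelta

FMT = "%B %d, %Y %H:%M:%S"

# Values copied from the SiteUptime "Uptime OK" email.
first_failure = datetime.strptime("June 17, 2007 10:36:42", FMT)
recovered = datetime.strptime("June 17, 2007 14:35:04", FMT)
failed_checks = 8
check_interval = timedelta(minutes=30)

by_count = failed_checks * check_interval   # duration implied by the check count
by_timestamps = recovered - first_failure   # duration between the two timestamps

print(by_count)       # 4:00:00
print(by_timestamps)  # 3:58:22
```

The two figures agree to within one polling interval, which is as close as a 30-minute monitor can get.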
(3)
The webalizer stats for the month of June also show a noticeable dip on the 17th:
http://65.98.8.190:2082/tmp/fpgafaq/webali...6.html#DAYSTATS
(4)
The raw access log file for my site for this period (fpga-faq.org-Jun-2007.gz) also confirms this
outage, at line 293686, where you can see a gap of about 3 hours 24 minutes between accesses:
74.124.192.222 - - [17/Jun/2007:17:53:18 +0000] "GET /archives/110625.html HTTP/1.0" 200 11242
74.6.19.30 - - [17/Jun/2007:21:17:11 +0000] "GET /archives/48625.html HTTP/1.0" 200 70431
With about 500K accesses for the month of June, that averages out to about 11.5 accesses per minute. So no
accesses for 3.4 hours has got to be a server problem :-)
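Here is a minimal Python sketch of that arithmetic, using the two log lines quoted above and my rough figure of 500K accesses for June. The helper name log_time is just something I made up for this example.

```python
import re
from datetime import datetime

# The two consecutive access-log lines on either side of the gap.
LOG_LINES = [
    '74.124.192.222 - - [17/Jun/2007:17:53:18 +0000] "GET /archives/110625.html HTTP/1.0" 200 11242',
    '74.6.19.30 - - [17/Jun/2007:21:17:11 +0000] "GET /archives/48625.html HTTP/1.0" 200 70431',
]

def log_time(line):
    """Extract the bracketed timestamp from one access-log line."""
    stamp = re.search(r"\[([^\]]+)\]", line).group(1)
    return datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")

gap = log_time(LOG_LINES[1]) - log_time(LOG_LINES[0])

MONTHLY_HITS = 500_000            # approximate total for June, per the post
MINUTES_IN_JUNE = 30 * 24 * 60    # 43,200 minutes
avg_per_minute = MONTHLY_HITS / MINUTES_IN_JUNE
expected_in_gap = avg_per_minute * gap.total_seconds() / 60

print(gap)                       # 3:23:53
print(f"{avg_per_minute:.2f}")   # 11.57
print(round(expected_in_gap))    # 2360
```

So at the average rate I would have expected roughly 2,360 accesses during the gap, and the log shows none.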
The log file also shows a much lower access rate starting at 17:20 (log times are UTC, so 10:20 PDT)
at line 293672, so the system was having problems about half an hour before it stopped at 17:53.
This would explain why the monitoring services both reported 4 hours.
Thanks for any help you can give,
Philip