Hi, I've been with JAG for about a month and have noticed this issue since the beginning:
While I was working over SSH, I'd suddenly get disconnected. After more extensive testing and monitoring over time, I have the following results:
Every 4 hours, the VPSes on the machine lose connectivity for a few seconds (~15-20). The host machine itself isn't affected during this short outage. I've confirmed this by continuously monitoring ICMP replies (ping) on 2 of the VPS nodes running on kvothe and on the host itself: the 2 VPSes get no reply during this time, while the machine's IP does. Sample screenshot of 1 of the occurrences:
130 is the server machine. 131 is my VPS. 186 is a friend's VPS I'm also managing (both on kvothe). (Was my VPS the first one created on this server?)
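For anyone who wants to run the same kind of check on their own node, here is a minimal sketch of the monitoring described above. It is not the exact tool I used. The function names (`ping_once`, `find_outages`, `monitor`) are made up for illustration, the `ping -c 1 -W 1` flags assume a Linux iputils `ping`, and you pass in your own host IP and VPS IPs.

```python
import subprocess
import time
from datetime import datetime


def ping_once(host, timeout_s=1):
    """Return True if a single ICMP echo gets a reply.

    Assumes a Linux iputils ping (-c count, -W timeout in seconds).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def find_outages(samples):
    """Collapse (timestamp, ok) samples into (first_fail, last_fail) spans."""
    outages = []
    start = last_fail = None
    for ts, ok in samples:
        if not ok:
            if start is None:
                start = ts
            last_fail = ts
        elif start is not None:
            outages.append((start, last_fail))
            start = None
    if start is not None:
        # The log ended while the host was still unreachable.
        outages.append((start, last_fail))
    return outages


def monitor(hosts, interval_s=1.0):
    """Ping each host in turn and print a timestamped line on every state change."""
    state = {h: True for h in hosts}
    while True:
        for h in hosts:
            ok = ping_once(h)
            if ok != state[h]:
                stamp = datetime.now().isoformat(timespec="seconds")
                print(f"{stamp} {h} {'UP' if ok else 'DOWN'}")
                state[h] = ok
        time.sleep(interval_s)
```

Calling `monitor([...])` with the host IP and the VPS IPs in the same list makes it easy to see whether the host stays UP while the VPSes go DOWN. Hosts are pinged one after another, so a timed-out ping stretches the loop a little; that is fine for spotting a 15-20 second gap.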
While a brief outage may not seem like a big issue, I consider it one. For a hosting provider, it causes several problems: cron jobs may fail to run if they happen to trigger at that moment, clients' FTP sessions and file transfers get cut off, SSH sessions get disconnected, clients submitting form data during the outage may lose it, MySQL imports through phpMyAdmin fail midway, you name it.
Clearly it's not an isolated issue with my VPS, since another VPS on the same machine has the same problem. It's also not an issue with our connection to the datacenter, or the machine's own IP would time out too. There's also a clear pattern. The last 4 outages I recorded today, in CET:
- 07.45
- 11.46
- 15.47
- 19.48
See the pattern? Every 4 hours, with each outage landing a minute later than the previous one. So something's going on here. Maybe some backup or maintenance service on the machine is scheduled to run every 4 hours on all the VPSes and has to disconnect them?
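To make the arithmetic explicit, here is a small sketch that computes the gaps between the four times listed above and extrapolates the next one. The helper names (`to_minutes`, `intervals`, `predict_next`) are just for illustration, and the prediction assumes the gap stays constant, which is my guess from four data points, not a confirmed fact.

```python
# Outage times recorded above, in CET, written as HH.MM.
observed = ["07.45", "11.46", "15.47", "19.48"]


def to_minutes(hhmm):
    """Convert an 'HH.MM' string to minutes since midnight."""
    hours, minutes = hhmm.split(".")
    return int(hours) * 60 + int(minutes)


def intervals(times):
    """Return the gap in minutes between each pair of consecutive times."""
    mins = [to_minutes(t) for t in times]
    return [later - earlier for earlier, later in zip(mins, mins[1:])]


def predict_next(times):
    """Extrapolate the next time, assuming the gap between outages is constant."""
    gaps = intervals(times)
    if len(set(gaps)) != 1:
        raise ValueError(f"gaps are not constant: {gaps}")
    nxt = (to_minutes(times[-1]) + gaps[0]) % (24 * 60)
    return f"{nxt // 60:02d}.{nxt % 60:02d}"


print(intervals(observed))     # [241, 241, 241]
print(predict_next(observed))  # 23.49
```

So the gap is 241 minutes each time, not 240. A job on a fixed clock schedule would normally fire at the same minute every time, so the steady one-minute drift is at least consistent with something that waits 4 hours after each run finishes. That part is speculation on my side.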
Support has been unable to trace the issue, or even acknowledge it. The ticket is now quite long, dating back to April 1st. We've gone through a NIC change (which actually fixed a longer outage issue). They also tried turning the firewall/lfd off, arguing that the outage went away afterwards (obviously a coincidence, since the outage is so brief, and the outages continued even with the firewall off). They also mentioned occasional high resource usage (not at the time of the outages, and unrelated anyway, since my VPS isn't the only one affected).
In the end, the only solution they came up with is moving my VPS to another machine, which of course comes with a change of IPs. That's quite an inconvenience, but I'd go through with it if the problem were really isolated to this machine.
So my question to other jpc users: have you noticed this issue on other servers? Maybe it's a required part of the jpc infrastructure, I don't know. If so, I'd rather know beforehand and save myself the headache of a pointless IP change, which is a serious matter for a hosting provider. In that case, I'd just live with it for now until we get a satisfactory answer as to why this is happening and whether there is a solution.

