[S] Site shutdowns

Post about what you like and dislike on AAO and suggest new features and improvements.

Moderator: EN - Forum Moderators

[S] Site shutdowns 

Post by DWaM » Wed Apr 18, 2018 6:15 pm

(Moved from the Help and Support section)

I'm not sure how productive this will be, but at this point, I feel like this is something that should at least be addressed.

Initially, after the host change, we understood that the new host wasn't quite what the previous one was -- the shutdowns were noticeable at first, but never terribly long and never terribly frequent. However, we've now reached a point where the site appears to be going down several times within a ridiculously short timespan. Now, it's obvious that none of this is Unas' fault. The issue is evidently with the host. But it feels like something should be said on the matter publicly so that trial makers are aware of what's going on. I fear that without communication about this, people who aren't used to this new state of things will just bail on the site if they find hours of work lost because the site happened to go down at the wrong moment.

Basically, it would be nice if some of these things were addressed:

  • Could we get information on why the site is potentially going down on this new host? Is it an issue that can be fixed?
  • Is there some sort of frequency or regularity to the times the site goes down so that we can at least prepare ourselves for it, and trial makers can properly know not to make any major progress in that time?
  • If not, is there some way we can communicate to people just now finding the site that the site going down is a potential issue?
  • Is it possible to make someone else, in addition to Unas, be able to bring the site back online, so that Unas doesn't have to be the one pinged every time this happens? Especially given the frequency, it's safe to say Unas will often find himself busy doing... well, more important things.
  • Are there any plans for perhaps changing the host and eliminating this issue altogether?

I admit some of these are probably a bit tough to answer, and for some I can already sort of guess the answer, but I feel like it's important to at least talk about it publicly and address the issue head-on at this point.

Especially given the fact that we may be looking at an influx of cases this summer due to a game jam being (potentially) hosted on the AA/Court Records discord. If, by that time, the site ends up being perceived as unreliable due to shutdowns, people might end up not using AAO at all, which would be a damn shame.

Thank you in advance.
DWaM
 
Posts: 1692
Joined: Fri Jun 01, 2012 9:23 am
Location: The Kingdom of Ellipses
Gender: Male
Spoken languages: English, Croatian/Serbian, some German

Re: [S] Site shutdowns 

Post by Enthalpy » Thu Apr 19, 2018 12:45 am

DWaM wrote: It's important to at least talk about it publicly and address the issue head-on at this point.

Especially given the fact that we may be looking at an influx of cases this summer due to a game jam being (potentially) hosted on the AA/Court Records discord. If, by that time, the site ends up being perceived as unreliable due to shutdowns, people might end up not using AAO at all, which would be a damn shame.


Absolutely agreed. It's crossed my mind a time or two that it will be really bad if AAO doesn't recover from one of these outages, or if something worse than normal happens.

I don't have authority over anything server- or hosting-related, so I can only relate what Unas has already told me.

DWaM wrote: Could we get information on why the site is potentially going down on this new host? Is it an issue that can be fixed?

We'd like information ourselves. Here's what we know:
  • The webhosting platform, Scaleway, has not been helpful in identifying the problem.
  • Unas has checked the physical machine that does the hosting and hasn't found a problem.
  • The outage in November gave a different error message than the other ones and likely has a different cause. This is the only shutdown that gave server logs. mysqld ran out of memory, then the OS did, and in a last-ditch effort to prevent a server crash, another program killed mysqld. Shortly before this happened, there were a large number of something called "InnoDB semaphores" and apache2 threads. Unas has logs for this and shared some with me, but I can't read them well enough to know if sharing them is safe to do.
  • The latest outage had a different cause and was due to a problem with Scaleway that affected multiple websites.
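
For readers unfamiliar with this failure mode: when a Linux machine runs out of memory, the kernel's OOM killer typically picks a process to kill and logs a line naming it. Below is a minimal sketch of scanning a kernel log for such lines. The sample lines are hypothetical (they are not taken from Unas's logs), and the exact message wording varies between kernel versions, so the pattern is an assumption.

```python
import re

# Assumed pattern: many kernels log "Killed process <pid> (<name>) ...".
# The wording differs between kernel versions, so adjust as needed.
OOM_KILL = re.compile(r"Killed process (\d+) \(([^)]+)\)")

def find_oom_kills(lines):
    """Return (pid, process_name) for every OOM-kill line found."""
    kills = []
    for line in lines:
        match = OOM_KILL.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills

# Hypothetical sample lines, not taken from the real server.
SAMPLE_LOG = [
    "kernel: Out of memory: Kill process 1234 (mysqld) score 512 or sacrifice child",
    "kernel: Killed process 1234 (mysqld) total-vm:2048000kB, anon-rss:900000kB",
    "kernel: eth0: link up",
]

print(find_oom_kills(SAMPLE_LOG))
```

Only the "Killed process" line is matched, so the sample reports a single kill of mysqld. A scan like this only helps when the machine stays alive long enough to write its logs.
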
DWaM wrote: Is there some sort of frequency or regularity to the times the site goes down so that we can at least prepare ourselves for it, and trial makers can properly know not to make any major progress in that time?

No. We still don't know why it happens, let alone when.
DWaM wrote: If not, is there some way we can communicate to people just now finding the site that the site going down is a potential issue?

Yes. I can put up an Announcement explaining things easily enough, though I'd prefer to wait to give Unas a chance to chime in, since he knows more about the outages than I do. (I've sent him an e-mail about this topic and will give him a week before I put up an announcement.)
DWaM wrote: Is it possible to make someone else, in addition to Unas, be able to bring the site back online, so that Unas doesn't have to be the one pinged every time this happens? Especially given the frequency, it's safe to say Unas will often find himself busy doing... well, more important things.

I don't know if this is technically possible, but I'd be willing to take on this role.
DWaM wrote: Are there any plans for perhaps changing the host and eliminating this issue altogether?

None that I know of, but Unas may have some.

Let me know if I can do anything else.
Enthalpy
Community Manager
 
Posts: 4368
Joined: Wed Jan 04, 2012 4:40 am
Gender: Male
Spoken languages: English, limited Spanish

Re: [S] Site shutdowns 

Post by Unas » Sun Apr 22, 2018 5:13 pm

Enth summed it up quite well.

Basically, the server tends to go into a state called "kernel panic" and not respond to anything. When this happens, unfortunately, the crash is so severe that there is no log written, so there is no proper way to investigate the cause after forcing a reboot.
Once, however, we were lucky, and the server only "half-crashed" (i.e. the database was killed, but the kernel and Apache server were still up), which allowed me to get some meaningful logs highlighting RAM issues. Not quite sure whether it was the same issue, but I tend to think that it was.
Back then, I tried to tweak the server's performance limits so it would not use up all the memory, but apparently without much effect, judging from continued occurrences of the problem - and unfortunately I didn't take the time to look at it again until today...
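
As a rough illustration of why such limits matter: the worst-case memory footprint is approximately the database's buffers plus the per-process memory of every web server worker allowed to run at once. The sketch below computes the largest worker count that still fits in RAM. All figures are hypothetical and are not AAO's actual settings; the post does not say which settings were changed, though Apache's `MaxRequestWorkers` and MySQL's `innodb_buffer_pool_size` are the usual candidates for this kind of cap.

```python
def max_safe_workers(total_ram_mb, reserved_os_mb, db_buffers_mb, per_worker_mb):
    """Largest worker count whose worst-case memory use still fits in RAM."""
    available = total_ram_mb - reserved_os_mb - db_buffers_mb
    if available <= 0:
        return 0
    return available // per_worker_mb

# Hypothetical figures for a small 2 GB server.
workers = max_safe_workers(
    total_ram_mb=2048,
    reserved_os_mb=256,
    db_buffers_mb=512,
    per_worker_mb=40,
)
print(workers)
```

A budget like this is only as good as the per-worker estimate, which can grow under heavy requests, so it is a starting point for tuning and not a guarantee.
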

As for last week's issue, it was basically the same, except that at the same time there was also a wider issue on Scaleway's network that was preventing reboot, so I had to wait until they fixed it before I was able to trigger the reboot that brought the site back online...

As far as what can be done about this, well...
  • I just added additional restrictions to the server's performance settings. We'll see if it behaves better from now on.
  • Years ago, I developed support for serving the AAO static files from a separate server, to decrease the load on the main one - but I never took the time to actually set it up.
    If the issue still occurs, I guess I could set up an additional server and use it for that. Thankfully, these servers are cheap enough, it wouldn't ruin me to set up a second one, even though I'd rather avoid it if not necessary...
  • Unfortunately, as far as I know, I can't give anyone else access to reboot the machine without giving them my personal credentials to the host's admin console.
    Given these credentials are also linked to my payment information, I obviously won't give them to anyone.

When is the competition you're talking about supposed to take place?
Unas
Admin / Site programmer
 
Posts: 8787
Joined: Tue Jul 10, 2007 4:43 pm
Gender: Male
Spoken languages: Français, English, Español

Re: [S] Site shutdowns 

Post by Exedeb » Wed May 23, 2018 12:52 am

The Game Jam is going to run from July 1 to August 11.

For more info, see this webpage.
Exedeb
 
Posts: 108
Joined: Mon Jun 29, 2009 8:33 pm
Gender: Male
Spoken languages: Italian, English (meh)

Re: [S] Site shutdowns 

Post by Unas » Fri May 25, 2018 9:41 pm

Thanks. If we experience new site shutdowns by then, I'll set up an additional server.
Unas
Admin / Site programmer
 
Posts: 8787
Joined: Tue Jul 10, 2007 4:43 pm
Gender: Male
Spoken languages: Français, English, Español

Re: [S] Site shutdowns 

Post by energizerspark » Fri Jun 08, 2018 4:09 pm

I'm assuming I'm not the only person that couldn't connect yesterday?
energizerspark
 
Posts: 4113
Joined: Thu Jan 21, 2010 5:41 pm
Location: the Whole Sort of General Mish Mash
Gender: Male
Spoken languages: English

Re: [S] Site shutdowns 

Post by Southern Corn » Fri Jun 08, 2018 5:13 pm

Nope. Wasn't able to for most of the day either.
Southern Corn
 
Posts: 35
Joined: Sat May 19, 2018 6:05 pm
Gender: Male
Spoken languages: English, Bad Jokes

Re: [S] Site shutdowns 

Post by Gosicrystal » Fri Jun 08, 2018 10:43 pm

Me too.
Gosicrystal
 
Posts: 19
Joined: Mon Apr 24, 2017 7:54 pm
Gender: Male
Spoken languages: Español, English

Re: [S] Site shutdowns 

Post by Enthalpy » Sun Jun 10, 2018 12:25 am

It was, as you surmised, another site shutdown. I wouldn't be surprised to hear word from Unas on this.
Enthalpy
Community Manager
 
Posts: 4368
Joined: Wed Jan 04, 2012 4:40 am
Gender: Male
Spoken languages: English, limited Spanish

Re: [S] Site shutdowns 

Post by Unas » Fri Jun 22, 2018 12:36 am

Hi there,

Sorry for taking care of that a bit late... Anyway, as promised, since another shutdown occurred in June, I've decided to set up an additional server to serve all static files of AAO (this includes trial pictures, sounds and music from the default AAO asset-base).

I've just finalised the setup to use this new server: you may notice that, from now on, all these files will be served from http://asuras.aaonline.fr/ (Don't try to access this URL directly though - there is nothing to see.)

In theory, this should:
  • Greatly reduce the number of queries to AAO's main Apache server. This should hopefully allow webpages on the site to load faster and, more importantly, avoid future kernel panics (since I suspect those were caused by intense Apache traffic - even though my settings should prevent this).
  • Load all these static files through a much lighter and faster stack: the new server uses nginx instead of Apache, which should be a bit faster for serving static content. So hopefully trials should load a bit faster as well.
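
The split can be pictured as a small URL-routing rule: static assets go to the new host, and everything else stays on the main server. The sketch below is an illustration only. The list of file extensions, the example paths and the `main.example` host are assumptions, and nothing here describes how AAO actually builds these URLs.

```python
from urllib.parse import urljoin

STATIC_HOST = "http://asuras.aaonline.fr/"  # host named in the post
# Assumed list of static file types; the real list is not given.
STATIC_EXTENSIONS = (".png", ".gif", ".jpg", ".mp3", ".ogg", ".css", ".js")

def resolve_url(main_host, path):
    """Send static assets to the static host, everything else to the main host."""
    base = STATIC_HOST if path.lower().endswith(STATIC_EXTENSIONS) else main_host
    return urljoin(base, path.lstrip("/"))

# Hypothetical paths and main host, for illustration only.
print(resolve_url("http://main.example/", "/sounds/objection.mp3"))
print(resolve_url("http://main.example/", "/player.php"))
```

With a rule like this, the main Apache server only ever sees requests that need PHP, which is the load reduction the first bullet describes.
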

It may require some fine tuning though, so don't hesitate to let me know your impressions - if things are faster, slower, buggy, etc.
Unas
Admin / Site programmer
 
Posts: 8787
Joined: Tue Jul 10, 2007 4:43 pm
Gender: Male
Spoken languages: Français, English, Español

Re: [S] Site shutdowns 

Post by kwando1313 » Fri Jun 22, 2018 1:14 am

Is there a reason we don't use nginx for everything? Since (afaik) that's the standard thing used for website servicing nowadays...




kwando1313
 
Posts: 7684
Joined: Tue Jul 22, 2008 6:33 pm
Location: Uminari City
Gender: Male
Spoken languages: English, Français (un peu), Ancient Belkan

Re: [S] Site shutdowns 

Post by Unas » Fri Jun 22, 2018 11:01 am

The reason is PHP - in which all the server-side code of AAO is written.

Apache has a rather deeply integrated PHP module which has no real equivalent in nginx.
It's possible to execute PHP on nginx (through a system called php-fpm), but in my experience it always involved a significant overhead in processing time. I experimented with it a few years ago, and it was adding around 0.3 to 0.5s to every single PHP request.
In fact, I actually use it on the new server (I have a few admin and monitoring tools based on PHP as well), and have the same experience again - the tool feels significantly slower than the same running on Apache on the main server.

This may not be true with cutting edge versions of php or nginx, but for now I'm staying on old stable releases.
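
A per-request overhead like the one described above can be checked with a small timing harness. The sketch below times any callable and reports the median. `fake_request` is a hypothetical stand-in that just sleeps; in practice it would be replaced by a real HTTP request for the same PHP page on each server.

```python
import statistics
import time

def median_latency(request, runs=20):
    """Call `request` repeatedly and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        request()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

def fake_request():
    # Hypothetical stand-in for fetching a PHP page; sleeps about 10 ms.
    time.sleep(0.01)

print(f"median: {median_latency(fake_request, runs=5):.3f}s")
```

Using the median instead of the mean keeps one slow outlier (a cold cache, a network hiccup) from skewing the comparison.
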
Unas
Admin / Site programmer
 
Posts: 8787
Joined: Tue Jul 10, 2007 4:43 pm
Gender: Male
Spoken languages: Français, English, Español

Re: [S] Site shutdowns 

Post by Super legenda » Wed Jul 11, 2018 11:24 pm

It happened again.
Super legenda
 
Posts: 452
Joined: Mon Sep 11, 2017 8:10 pm
Gender: Male
Spoken languages: Español

Re: [S] Site shutdowns 

Post by Southern Corn » Thu Jul 12, 2018 4:05 am

It happened for a whole day too.
Southern Corn
 
Posts: 35
Joined: Sat May 19, 2018 6:05 pm
Gender: Male
Spoken languages: English, Bad Jokes

Re: [S] Site shutdowns 

Post by drvonkitty » Thu Jul 12, 2018 6:49 am

A couple questions:

One, is the cost of the server a problem? I don't know about anyone else, but I wouldn't mind chipping in a couple bucks a month to help with server running costs, if that'd help lighten the load.

And two, how severe are these crashes? Could trial data be at risk in a worst-case scenario crash?
drvonkitty
 
Posts: 406
Joined: Sat Apr 14, 2012 12:25 am
Gender: Male
Spoken languages: English
