1. Daniel

    Daniel I'm sure you've heard the rumors Founder Staff

    Joined:
    May 14, 2006
    Messages:
    2,781
    Likes Received:
    609
    Location:
    Phoenix, AZ

    Notice Regarding Recurrent Forum Downtime

    Discussion in 'Announcements' started by Daniel, Jul 7, 2018.

    Over the past seven months, WritingForums.org has experienced bouts of forum downtime.

    I’m writing this notice to keep you apprised of this issue and what we are doing to both maximize uptime and discover the root of the problem.

    Before I go further, I want to let you know that this issue will be solved, and it’s my number one priority. I’m sure some of you are concerned with these server issues on WF, both publicly and privately, but I’m here to promise diligence in the quest to solve it and to promise my commitment to the longevity of the site. We're not going anywhere. I just ask for your patience and understanding as we investigate this issue.


    Frequency & Duration

    The frequency and duration of this have been variable: sometimes no issues for a month, other times problems three times in a week. I’ll get into the cause of this below, but typically once I’m aware of the issue I can bring us back online quickly. Downtime lasts several hours, or in rare cases a day or two, only when it occurs while I’m at my day job or when I’m otherwise unable to fix the problem.


    Known Sequence of the Problem

    Essentially what happens is that our MySQL Server - the part of our server that runs and manages all databases - is kicked offline.

    During this process, an error occurs in our sessions table. The sessions table is not particularly important, as it only logs the recent visits of users and guests. Because of its function, it is likely the most active database table, and the most active table is the one most likely to have an error or corruption.

    To our knowledge, no data is at risk beyond the record of which users and guests visit the site, aside from what may be posted or edited at the exact time of the crash. As of now this is a problem of inconvenience, not data loss.

    Once the MySQL Server is restarted, the sessions table in the database must go through a manually initiated repair process. This is the error you probably see if you can see the site's design but not forum content. Once this table is repaired, the site is back online.
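
    For readers curious what a manual repair pass can look like, here is a minimal sketch, not the exact procedure used on WF. The table name `xf_session` is an assumption (the notice only says "the sessions table"), and `repair_statements` and `needs_repair` are hypothetical helpers. `CHECK TABLE` and `REPAIR TABLE` are standard MySQL statements, though `REPAIR TABLE` only applies to certain storage engines such as MyISAM.

```python
import re

# Assumption: the real table name is not given in the notice.
SESSIONS_TABLE = "xf_session"


def repair_statements(table):
    """Return the SQL for a check-then-repair pass on a single table."""
    # Only allow plain identifiers so the name can be quoted safely.
    if not re.fullmatch(r"[A-Za-z0-9_]+", table):
        raise ValueError(f"unexpected table name: {table!r}")
    return [f"CHECK TABLE `{table}`;", f"REPAIR TABLE `{table}`;"]


def needs_repair(check_rows):
    """Decide from CHECK TABLE result rows whether a repair is needed.

    Each row is a dict with the columns MySQL returns for CHECK TABLE:
    Table, Op, Msg_type, Msg_text. Treating any 'error' row as a reason
    to repair is a simplification made for this sketch.
    """
    return any(row["Msg_type"] == "error" for row in check_rows)
```

    The statements would be run by hand in a MySQL client after the server is back up, with the repair step run only if the check reports an error.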


    What We Are Doing

    I’ve done research over the past few months reviewing similar errors and opening multiple tickets with our web host.

    I’m currently in ongoing communication with our web host and Xenforo. Xenforo believes it's a server issue, not anything related to their software. Right now we need to capture and review data on what's happening right before the crash, and after the crash before we restart the server.

    Our investigation has led to these possible causes, though we have yet to pinpoint the root of the problem:

    • Problems with Memory Allocation (with disk space or available memory)
    • Problems with / Damage to the Server Hardware
    • Misconfigured Server / Database Settings
    • Problems with Resource Use

    Latest Discoveries and Plan by Host

    During our last support ticket two weeks ago, a technician working for our hosting company made some progress. He found some clues in the logs. However, he said that to move forward we need to observe the issue with MySQL and investigate before our MySQL Server is restarted. This should give more information on the cause of the problem. We also need to investigate to get an idea of what queries are running when the issue starts.
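
    As an illustration of "what queries are running", MySQL's `SHOW FULL PROCESSLIST` statement lists every open connection with its command, state, running time, and query text. The sketch below is only one way such a snapshot might be summarized before a restart; `summarize_processlist` is a hypothetical helper, and it assumes the rows have already been fetched as dicts keyed by the statement's column names.

```python
from collections import Counter


def summarize_processlist(rows, top=3):
    """Summarize rows shaped like SHOW FULL PROCESSLIST output.

    Each row is a dict with the keys Id, User, Host, db, Command,
    Time, State and Info (the columns MySQL returns).
    """
    # Idle connections show Command == "Sleep"; the rest are doing work.
    active = [r for r in rows if r["Command"] != "Sleep"]
    by_state = Counter(r["State"] or "(none)" for r in active)
    # Longest-running statements first.
    longest = sorted(active, key=lambda r: r["Time"], reverse=True)[:top]
    return {
        "total_connections": len(rows),
        "active": len(active),
        "by_state": dict(by_state),
        "longest_running": [(r["Id"], r["Time"], r["Info"]) for r in longest],
    }
```

    Saving a summary like this, together with the raw rows, is the kind of evidence that is lost once the server restarts.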

    I've re-opened the support ticket as these issues seem to be happening more frequently. The next two or three times this happens we may need to stay offline a little longer than usual to allow them to investigate.

    Right now I’m clarifying what they need from me. The plan is to capture the appropriate logs once the site crashes next time, prior to bringing us back online.

    If I’m unable to make progress with Xenforo and our hosting provider, we will look into hiring a professional server/database administrator.

    If all else fails, we will update our software/reduce dependencies and move to a new web host. If it's a server configuration or server hardware issue, a new hosting solution may solve the problem.

    I will do my best to keep you apprised. Expect this to reoccur for another week or two until we figure out why it's happening. If you have server admin experience and want to take a shot at investigating, let me know. It’s not a data loss issue. I’m committed to both solving this problem and to the longevity of WritingForums.org.

    My hope is that we can solve this over the next two weeks and I can return to XF2/WF5 development.

    Questions and comments welcome.

    Daniel
     
  2. mashers

    mashers Contributor Contributor Community Volunteer

    Joined:
    Jun 6, 2016
    Messages:
    2,369
    Likes Received:
    3,069
    Thank you @Daniel. I for one really appreciate all the hard work that goes into keeping the site running.
     
  3. jannert

    jannert Member Supporter Contributor

    Joined:
    Mar 7, 2013
    Messages:
    11,229
    Likes Received:
    12,001
    Location:
    Scotland
    Yes, absolutely. Without the mods keeping an eye on the ebb and flow of interactions here, we wouldn't have such a good forum. But without Daniel's efforts, we wouldn't have a forum at all. Thank you, Daniel, for keeping it going.
     
  4. Laurin Kelly

    Laurin Kelly Contributor Contributor

    Joined:
    Jun 5, 2016
    Messages:
    2,081
    Likes Received:
    3,075
    Thank you Daniel! We appreciate the update.

    I unfortunately am only just beginning my SQL studies, but hopefully another more experienced member might be able to lend a hand.
     
    Shenanigator likes this.
  5. big soft moose

    big soft moose All killer, no filler. Contributor Community Volunteer

    Joined:
    Aug 1, 2016
    Messages:
    9,972
    Likes Received:
    10,861
    Location:
    East devon/somerset border
    It's obvious that the real problem is that someone has forgotten to feed the hamster (whose wheel powers the server). The only answer is a team of coypu to provide extra power when it's needed.
     
  6. Daniel

    Daniel I'm sure you've heard the rumors Founder Staff

    Joined:
    May 14, 2006
    Messages:
    2,781
    Likes Received:
    609
    Location:
    Phoenix, AZ
    An update:

    About a week ago I was working on the site and managed to see the database/MySQL server crash happening in real time. I managed to capture the logs our web host needed to investigate this issue further. The relevant logs apparently get wiped when the MySQL server restarts, so they needed to capture the logs prior to the server restarting. Check.

    Using these logs, our hosting provider pinpointed certain configuration settings they believe are responsible. These MySQL connection settings allowed caching of database queries with delayed inserting, which essentially means the forum delays or reuses common, identical queries to reduce the load on the database. This functionality is normally a good thing that speeds up a website, but in the case of our Xenforo setup, it caused our database to lock up due to too many attempted connections. We've disabled this type of caching and made other adjustments likely to prevent the issue from recurring.
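
    For anyone who wants to picture the "too many attempted connections" part, here is a rough sketch of a headroom check. It is not what our host ran. `connection_headroom` is a hypothetical helper and the 0.8 warning threshold is an arbitrary assumption. The two inputs correspond to real MySQL values: `Threads_connected` from `SHOW GLOBAL STATUS` and `max_connections` from `SHOW VARIABLES`.

```python
def connection_headroom(threads_connected, max_connections, warn_ratio=0.8):
    """Report how close the server is to its connection limit.

    threads_connected: current open connections (Threads_connected).
    max_connections:   configured ceiling (max_connections).
    warn_ratio:        assumed threshold for flagging trouble.
    """
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    used = threads_connected / max_connections
    return {
        "used_ratio": used,
        "remaining": max_connections - threads_connected,
        "warn": used >= warn_ratio,
    }
```

    When connections pile up behind a locked table, the remaining headroom drops toward zero and new page loads start failing, which matches the lock-up described above.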

    While our server provider seems to think these issues are solved, I'm unsure, but cautiously optimistic. We haven't had a reoccurrence in 10 days or so, which is promising. There's a good probability this issue is resolved. That said, I am continuing to monitor our server, and this issue or a related one could reoccur. At some point we may need a specialized system administrator to optimize our server's software, memory allocation, and database configuration.

    TL/DR: Our hosting provider used our server logs to pinpoint what they think the issue was, made some configuration changes, and believes they have solved the heart of the issue.
     
    izzybot, Shenanigator and Zerotonin like this.
  7. Homer Potvin

    Homer Potvin Digging out my Balzac Contributor

    Joined:
    Jan 8, 2017
    Messages:
    4,371
    Likes Received:
    6,979
    Location:
    Rhode Island
    I don't know what any of that means, but... excelsior!
     
    Shenanigator likes this.
  8. mashers

    mashers Contributor Contributor Community Volunteer

    Joined:
    Jun 6, 2016
    Messages:
    2,369
    Likes Received:
    3,069
    Thank you @Daniel :)
     
  9. badgerjelly

    badgerjelly Contributor Contributor

    Joined:
    Aug 10, 2013
    Messages:
    832
    Likes Received:
    402
    Location:
    Earth
    Thanks. I was wondering if it was the first signs of an ownership issue or simply that the site was transitioning to a new format or even closing down!

    Good to hear it’s serious. I think many find this forum to be a useful resource as well as reasonably open to all without any enforcement of critique (which personally I find to be counterproductive if enforced!)
     
  10. mashers

    mashers Contributor Contributor Community Volunteer

    Joined:
    Jun 6, 2016
    Messages:
    2,369
    Likes Received:
    3,069
    I was a bit worried about that too. I’ve come to really love this place and it would leave a big hole in my life if it went away.
     
  11. Some Guy

    Some Guy People-thing Supporter

    Joined:
    May 2, 2018
    Messages:
    2,526
    Likes Received:
    3,211
    Location:
    Boneless chicken ranch
    I'm assuming you're okay with the hardware and resources on your local end? It doesn't seem like it's a problem to throw money at, unless you change hosts? Let me know. I didn't choose to be a Supporter to sit on the sideline and watch. What are the most reliable options/costs?
     
