Lessons Have Been Learned
April 5th, 2012
After every cock-up, politicians appear on our TVs to hang their heads and admit that “Lessons Have Been Learned.”
Well, now it’s my turn. As many of you will be aware, the fotoLibra website suffered a calamitous collapse last week, and as it fell it brought the Heritage Ebooks site down with it, as well as all our back office tools: admin, banking, invoicing, Datacash, payments, mailing systems and more.
The good news is that the only thing we actually lost was time. No images were harmed in the making of this booboo, no data was lost and no accounts were compromised.
I’m delighted to tell you that fotoLibra is back up and running after our calamitous crash. Everything is back to normal.
You can upload images again!
If you use fotoLibra DND, please quit the application and restart it before attempting to upload.
Two questions: how do we stop this happening again, and what are we going to do about it?
Well, Lessons Have Been Learned. We are studying a cloud computing model to run in tandem with our physical array of servers and RAID 5 disks which live in a server farm in Manchester. If one system goes down, the other has to be there for it. That’s redundancy.
Redundancy (which has a different meaning in the computing world to what it used to have in my chosen career path) must be at the forefront of our plans. When a system fails, another system must step seamlessly into its place.
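For the technically minded, here is a minimal sketch of that idea in Python: try each system in priority order and hand the work to the first one that passes a health check. The system names and the `is_healthy` check are purely hypothetical illustrations; nothing here describes how fotoLibra actually implements failover.

```python
from typing import Callable, Optional, Sequence


def pick_active(systems: Sequence[str],
                is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first system, in priority order, that passes its health check."""
    for name in systems:
        if is_healthy(name):
            return name
    return None  # every system is down: time for the contingency plan


# Hypothetical names: the physical servers come first, a cloud standby second.
status = {"manchester-farm": False, "cloud-standby": True}
active = pick_active(["manchester-farm", "cloud-standby"],
                     lambda name: status[name])
print(active)  # prints "cloud-standby"
```

In a real setup the health check would probe the live service rather than read a dictionary, and the switch would typically be made by a load balancer or DNS rather than by application code.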
What are we going to do about it? Firstly of course we must apologise to all our users, buyers, sellers and browsers. We let you down, and we are very sorry. I am personally desolated — the fotoLibra website has been live since March 2004 and in that time it’s never been down for longer than ten minutes, and then only for service upgrades. I was rather proud of that; but then pride comes before a fall.
Enough breast-beating. Let’s look to the future. Even assuming we have a more robust system, we still need a contingency plan. As for the images, which were unharmed in this little unpleasantness: in addition to our existing RAID 5 storage and a possible future cloud back-up, I am planning to physically secrete caches of hard drives full of images in various undisclosed locations in Snowdonia. Just in case.
One of the worrying things about last week’s crash is that it took our mailing system down with it, so we were unable to tell everyone.
There needs to be a line of communication with fotoLibra users set up outside our in-house systems. And it appears some kind Americans have already thought of this, and have created things called LinkedIn, Twitter and Facebook. In exchange they want our souls for all eternity, but that’s just the price we have to pay.
fotoLibra has opened a Group on LinkedIn, which will be my preferred way of reaching you. It’s a professional networking group, and I promise I will link with you if you ask me.
There is also a fotoLibra Facebook site, which will be run by our redoubtable web editor Jacqui Norman. She will link with you, but I won’t, as I have reserved my Facebook visits for keeping an eye on my extended family.
Finally, there is Twitter. Now I am not a chatty man, so this will be difficult for me, but I will try and post something every day. The content will most likely be taken from my commonplace book, so it will largely consist of wise thoughts, pithy sayings and the world according to my friend Dede. I hope that sometimes you will find it fun and amusing. From time to time there will be something of interest to fotoLibra users. Please follow me @fotoLibrarian.
This way, if there ever is another problem, we’ll be able to let everyone know — and you will know where to check if you think you are having problems with the fotoLibra site.
So join the new fotoLibra Group on LinkedIn
and join and ‘Like’ the new fotoLibra Group on Facebook
and follow my Twitter feed.
Please sign up to join these groups — if you can also put up with my disconnected ramblings, of course.
And please stick with us. We’ll be even better as a result of this crisis.
I think the lesson here, as far as data is concerned, is:
“If you haven’t got it in three places, you haven’t got it at all.”
Thanks Gwyn; it must have been a terrible time. Glad it’s all sorted.
I have signed up on LinkedIn, Twitter (which was anathema to me also) and Facebook (which I also use for my family, who all now live in Australia).
All best wishes to all of you,
John
A good record for keeping the site up (until recently, that is). I take it these are Linux servers?
I’m not convinced of the security of cloud computing.
Yes, Linux. We still haven’t isolated the root cause of the problem.
Gwyn, I’m afraid we all take crashes like this very personally. I too had a website crash which left it down for over a week, but it’s like everything in life: nobody thanks you for the time and effort you put in whilst it’s up and running, they only have a dig when it’s not working. I have learnt some valuable lessons over the past couple of months and, like you, believe cloud computing is the way forward. Keep up the good work.
I run two websites, Mumfordbooks and Landscape-guides. Both went down this week, through no fault on our side. Something at Google had changed. On the Google search engine there was a WARNING NOTICE: “This site may harm your computer”. You want to update the status of the domain; the keygen must have been put with a bit of code offering it for download on the site, and Google will crawl the site with a spider and also find it. The bug is now caught and our Google spider is well fed, thanks to my technician.
We must be doing something right, as we are still on page one of Google’s wonderful search engine.
I think you nailed it… Redundancy… That is what we rely on as backup systems in the commercial airliners that we fly.
Without this redundant design system, we would have more aircraft falling out of the skies worldwide.
With that said, we as residents of planet Earth should place all of our electronics, computers, hard drives and servers in electromagnetically shielded enclosures to help ward off the EMF (electromagnetic fields) coming into our atmosphere from space. We know governments worldwide have already begun doing this, as they were advised by NASA and other space agencies that from now until the end of 2013 we will experience massive radiation from solar flares from the sun; this is already beginning to wreak havoc on many electronic systems worldwide.
As for Facebook and Twitter and LinkedIn, count me out… these sites are designed by the CIA, and I am not going to sell my soul to anyone except JESU CHRIST our LORD! Period!
Please if you do NOT BELIEVE this, go to contendingfortruth.com and type facebook in the search engine and read the PDF as well as listen to the audio, this is a “1984” George Orwell, BIG BROTHER nightmare….which is now a reality!
May GOD continue to BLESS HIS people that are not asleep and will open their eyes to the truth!
Well I’m not sure about the CIA’s involvement but I’m pretty sure Facebook wasn’t set up with altruism at the forefront.
Well, when I kept clicking on my favourites link to your site and it wouldn’t open, I thought something was indeed strange… a quick e-mail to Jacqui and my suspicions were confirmed… back up your data three times at least, I have always been told… I dread to think what tons of data you will have stored on fotoLibra, and it must have caused you a real headache while we sat back and waited for the site to re-open.
It takes a brave man to say you got it wrong… but the positives are you bounced back and the site has had a glitch… glad to see it’s resolved, and thanks for the info.
Erik
Indeed. Our data is backed up 5 times over which is why we never lost anything except the service. We had the images, we had the site, we had the plug-ins, but they all had to be restored and rebuilt on new servers and new hard drives, just in case. Rather more expensive than we had budgeted for.
The code for the fotoLibra site itself is backed up in the cloud with AWS, but the images are all on physical disks. And as the question is never Will the disk fail? but When will the disk fail?, we have to make sure that when one goes, another can be plugged in to replace it within minutes. Luckily our server farm can do this for us within 30 minutes.
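For readers who like to check their own backups, here is a minimal Python sketch of the "three places" rule mentioned above: count how many backup copies of a file still match the original's SHA-256 checksum. The file names and the `count_good_copies` helper are hypothetical illustrations, not a description of fotoLibra's backup tooling.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Iterable


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def count_good_copies(original: Path, copies: Iterable[Path]) -> int:
    """Count the copies that exist and are byte-identical to the original."""
    expected = sha256_of(original)
    return sum(1 for c in copies if c.is_file() and sha256_of(c) == expected)


def demo() -> int:
    """Build one good copy, one corrupted copy and one missing copy."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        original = root / "image.jpg"
        original.write_bytes(b"pretend image data")
        good = root / "disk_a.jpg"
        good.write_bytes(b"pretend image data")
        corrupted = root / "disk_b.jpg"
        corrupted.write_bytes(b"bit rot")
        missing = root / "disk_c.jpg"  # never written
        return count_good_copies(original, [good, corrupted, missing])


print(demo())  # prints 1: only disk_a.jpg is a usable copy
```

This sketch reads each file into memory in one go, which is fine for a demonstration; for large image files, hashing in chunks would be the gentler choice.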
Hi Gwyn and the team,
Good to see you back with a determined and more positive outlook….I felt assured with full faith that all would be well in a short space of time and effort…
Keep up the good work and hope you all have a relaxed and Happy Easter..!!!
“Don’t forget the backups on Easter eggs..!!”
Henry Watts
Lesson I learnt from a similar experience was to sack Melbourne, absolute clowns. (IMHO)
To be fair, it was nothing to do with Melbourne, who have been quite good with us. One of our HP ProLiant servers decided to fry its circuits for breakfast. The charred hulk is still in our cabinet.