Saturday, 14 January 2012

It's broken...


Last week, I made a comment that “the consultants were still working on config changes”, but didn’t go into any details. I can go back over that now…

As I said, the consultants were still making some config changes, and doing so directly in the Production system. Normally, all changes should be made in the Development system, checked, then transported to the Test system and checked again. Only when it is proven that the changes won’t cause any functional issues do they get transported to the Production system.

Now there are certain changes that have to be made directly in a system as they cannot be transported. However, as far as I am aware, SAP best practice is still to make the config change in the Development system first and then, once it has been tested, to make the same change directly in the Test system. It then gets tested again before the change is made directly in the Production system.
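For anyone who prefers to see the rule written down, here is a minimal sketch of that Development, Test, Production sequence. It is only an illustration: the `Change` class, the `can_apply` function and the `LANDSCAPE` list are names I have made up, and none of this is SAP's own transport tooling.

```python
# Hypothetical model of the Development -> Test -> Production path.
# These names are illustrative assumptions, not part of any SAP API.
LANDSCAPE = ["Development", "Test", "Production"]


class Change:
    def __init__(self, description):
        self.description = description
        self.verified_in = set()  # systems where the change has been checked

    def mark_verified(self, system):
        self.verified_in.add(system)


def can_apply(change, target):
    """A change may enter a system only once every earlier system has verified it."""
    earlier = LANDSCAPE[:LANDSCAPE.index(target)]
    return all(system in change.verified_in for system in earlier)


change = Change("example config change")
print(can_apply(change, "Production"))  # blocked: nothing has been checked yet
change.mark_verified("Development")
change.mark_verified("Test")
print(can_apply(change, "Production"))  # allowed: Dev and Test have both signed off
```

The same check covers both transported changes and the ones that have to be keyed in by hand in each system: either way, nothing reaches Production until it has been verified in the systems before it.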

The reason for this is simple – if you have a live system, any change can have undesirable effects. If your business is reliant upon a system, you want that system to work all the time – you do not want it going wrong, or working incorrectly. If you follow the correct procedure, you should be able to ensure that any change made will not turn around and bite you in the ass.

But of course there are those people who choose not to follow procedures, either because they don’t know any better or because they think that they do know better. (Dare I suggest that some simply do not care?) In our case, I don’t believe that it is malicious, but I feel that the people concerned have never had to work to the appropriate disciplines.

On Monday morning, people found that they were not able to post anything in the production system. As I arrived at work, it became clear that we had a crisis on our hands. The CEO and FD were already there, discussing the problems and trying to get answers on what had happened and why things were not working.

After investigation, it became apparent that one of the consultants had really messed up – the config changes added to the system were preventing any new data from being added and quite a lot of the older data from being processed. I got everyone out of the system in case a restart of the server might fix the issue, but that was no good. After many hours of discussion with the people concerned, it became obvious that the only way to fix the issue was to reverse all of the changes.

That work has been going on all week. In that time, no-one has been able to do very much in SAP – there were a few jobs, some reports could be run, but not very much. Fortunately, the factory was able to continue working for several days as we had some info left over from the previous week, but that dried up after about 3 days. At one stage, the directors were even considering closing down production and sending people home, and were putting plans in place to do so.

However, by midday yesterday, all of the config changes that caused the problem had been undone, and people were getting down to running the stuff that they had not been able to do. A couple of the staff from Sales will be working on Saturday and they hope to catch up with what they have missed – Finance have managed to organise 2 billing payment runs, and they are staying on for a couple of hours to get the invoices in the mail.

Meanwhile, the Production manager has started to run one of their scheduling jobs, and the Production Supervisor will try to run another later tonight. I doubt that we will be back to normal by Monday, but we should be more or less OK by Tuesday lunch time – evening at the latest.

So of course, everyone is pretty angry about this – but so far, no word of apology from the SI. The CEO has told them in no uncertain terms that any attempt to bill us for the work will be rejected. I think that we ought to get some sort of compensation, but based upon their previous mess-ups, I think that is highly unlikely to happen.

So we have managed to escape a major disaster – yes, things could have gotten a lot worse, and other people have suffered far worse things than we have. But I do feel that we should not have been exposed in that way to begin with, and I do get frustrated that we are suffering because of other people’s behaviour with apparently no recourse to compensation.

2 comments:

  1. Just wondering here, after SO many problems with consultants (1) why doesn't the customer just fire the SI, or at a minimum (2) why not take away their access rights to the production system?

  2. Oh my god, I can't believe what I'm reading!
