More details on Chase’s website crash emerge

Judging from this article, much of the information seems to come from tips from a Chase insider.

There was a subsequent outage on Wednesday, apparently due to the huge number of access retries after the initial restoration of service.
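
To get a feel for why a retry pile-up can knock over a freshly restored site, here is a rough back-of-envelope sketch in Python. Every number in it is invented purely for illustration; none of it reflects Chase's actual traffic.

```python
# Tiny back-of-envelope model of the "retry pile-up" after service is restored:
# users who were locked out during the outage come back and retry on top of
# the normal login traffic, all at roughly the same time.

normal_hourly_logins = 100_000   # hypothetical steady-state login volume
outage_hours = 24                # roughly how long users were locked out
retry_fraction = 0.6             # share of locked-out users who retry promptly

backlog = normal_hourly_logins * outage_hours * retry_fraction
first_hour_after_restore = normal_hourly_logins + backlog

print(f"First hour after restore: {first_hour_after_restore:,.0f} logins "
      f"({first_hour_after_restore / normal_hourly_logins:.1f}x normal)")
```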

There was also a definite contributing cause from operator error; perhaps that error would not have happened if people weren't so exhausted from dealing with the database outage.

Simply put, Chase's website could not handle the traffic when a larger-than-normal percentage of Chase customers tried to log on, many people were unable to log on for much of Wednesday, and an error by Chase technical personnel exacerbated the problem.

Monash said JP Morgan Chase runs its user profile Oracle database on a cluster of eight Solaris T4520 servers, each with 64GB of RAM, with the data held on EMC storage. El Reg is told that Oracle support staff pointed the finger of blame at an EMC SAN controller but that was given the all-clear on Monday night.

Monash subsequently posted that the outage was caused by corruption in an Oracle database which stored user profiles. Four files in the database were awry and this corruption was replicated in the hot backup.

Recovery was accomplished by restoring the database from a Saturday night backup and then reapplying 874,000 transactions on Tuesday.

For the non-technical folks in the audience: a piece of storage hardware failed and caused corruption in both the live database and its real-time backup. Most databases of this type have many layers of backup, and that was the case here. In addition to periodic backups, a typical system keeps a "journal" of every change that is applied, so that in a worst-case scenario like this the older backup can be restored and the journaled changes re-applied to bring it up to date. But that can take some time.
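
For the curious, here is a minimal sketch in Python of that journal-replay idea: restore the last known-good backup, then re-apply every journaled change recorded after it. The names here (Transaction, restore_backup, replay_journal) are illustrative only and have nothing to do with Oracle's or Chase's actual tooling.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical in-memory "database": account id -> balance.
Database = Dict[str, int]

@dataclass
class Transaction:
    """One journaled change, recorded with a timestamp so it can be replayed in order."""
    timestamp: float
    account: str
    delta: int

def restore_backup(snapshot: Database) -> Database:
    """Start recovery from the last known-good backup (e.g. Saturday night's)."""
    return dict(snapshot)

def replay_journal(db: Database, journal: List[Transaction], since: float) -> Database:
    """Re-apply every journaled transaction that happened after the backup was taken."""
    for tx in sorted(journal, key=lambda t: t.timestamp):
        if tx.timestamp > since:
            db[tx.account] = db.get(tx.account, 0) + tx.delta
    return db

if __name__ == "__main__":
    saturday_backup = {"alice": 100, "bob": 250}
    backup_time = 1.0
    # The journal holds everything written since the backup -- in Chase's case,
    # reportedly some 874,000 transactions had to be re-applied this way.
    journal = [
        Transaction(0.5, "alice", 40),   # before the backup; already in the snapshot
        Transaction(2.0, "alice", -30),  # after the backup; must be replayed
        Transaction(3.0, "bob", 75),
    ]
    db = restore_backup(saturday_backup)
    db = replay_journal(db, journal, since=backup_time)
    print(db)  # {'alice': 70, 'bob': 325}
```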

It seems likely that at some point during the outage Chase knew what was going on and that they would eventually be able to fully restore the service with no data lost, which makes it even more perplexing that they didn't release any statements to this effect.

This Oracle database stored user profiles, which are more than just authentication data. Applications that went down include, but may not be limited to:

  • The main JPMorgan Chase portal.
  • JPMorgan Chase’s ability to use the ACH (Automated Clearing House).
  • Loan applications.
  • Private client trading portfolio access.

So, clearly more than just account access and bill-pay were affected. ACH transactions include things like paying for an eBay auction with PayPal, which draws from your checking account. Chase should have released better information telling people what was and was not affected while the outage was occurring.
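
To make it concrete why one corrupted profile store can take so many applications down at once, here is a hypothetical sketch of what a user profile record might hold beyond a username and password. The fields are pure guesses, not Chase's actual schema.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class UserProfile:
    """Hypothetical user-profile record: far more than a username and password hash."""
    username: str
    password_hash: str                                                  # authentication
    linked_checking_accounts: List[str] = field(default_factory=list)  # bill-pay / ACH
    billpay_payees: List[str] = field(default_factory=list)            # scheduled payments
    open_loan_applications: List[str] = field(default_factory=list)    # loan apps in flight
    brokerage_account_ids: List[str] = field(default_factory=list)     # private client trading

# If the table behind records like this is corrupted, every application that reads
# it -- the portal, ACH transfers, loan applications, trading access -- fails together.
profile = UserProfile(
    username="jdoe",
    password_hash="not-a-real-hash",
    linked_checking_accounts=["checking-001"],
    billpay_payees=["Electric Co."],
)
print(profile)
```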

2 Comments

  • By storm10, September 21, 2010 @ 9:40 am

    Chase totally lost the payments I scheduled on Monday, September 13, 2010, and claimed that I must not have scheduled them since ALL THE OTHER SCHEDULED ON-LINE PAYMENTS HAD BEEN POSTED, in essence, calling me a liar. I'm curious, am I the ONLY one whose payments just vanished or did someone else who used the on-line bill pay around 7 pm September 13, 2010, lose their payment info as well? I find it impossible to believe that no one else was on-line at that time and that they were able to re-post all transactions except mine.

  • By Josh in Nashville, TN, December 30, 2010 @ 3:28 pm

    I would imagine this site would have to have some insane traffic to crash it. Aren't they one of the largest 5 banks in the US? Love the title of your site BTW, pretty comical.
