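Starting around 13:00 on August 23, 2019, a system failure occurred in the Amazon AWS Tokyo region, with effects such as EC2 instances becoming unreachable. This post collects the related information.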
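AWS outage status
Outage duration (EC2) | About 6 hours: August 23, 2019, from around 12:36 to around 18:30 JST (when most resources had recovered)
Outage duration (RDS) | About 9.5 hours: August 23, 2019, from around 12:36 to around 22:05 JST
Cause (EC2) | Some EC2 servers shut down due to overheating, after a control system failure caused the cooling systems to fail
Scope of impact | Some EC2, EBS, and RDS resources in a single AZ of the Tokyo region (AP-NORTHEAST-1)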
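- The affected region is Tokyo. The outage occurred in one of the four data center clusters in the greater Tokyo area.
- AWS is estimated to have several hundred thousand customers in Japan.*1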
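Services that reported outages
The services below are those that piyokango confirmed had announced an impact (including unofficial reports).
The list collects reports made around the same time; it is not known whether all of them were caused by the AWS outage.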
Payment services
Service reporting an outage | Cause / symptoms
PayPay | Service intermittently unavailable
FamiPay | AWS outage
BillingSystem(PayB) | Network failure at the cloud provider
Internal systems
Service reporting an outage | Cause / symptoms
Nippon Express | Email system failure
Amazon status page
The latest status is published on the following status page.
status.aws.amazon.com
Amazon Elastic Compute Cloud (Tokyo)
2019/08/23 13:18 | We are investigating connectivity issues affecting some instances in a single Availability Zone in the AP-NORTHEAST-1 Region.
2019/08/23 13:47 | We can confirm that some instances are impaired and some EBS volumes are experiencing degraded performance within a single Availability Zone in the AP-NORTHEAST-1 Region. Some EC2 APIs are also experiencing increased error rates and latencies. We are working to resolve the issue.
2019/08/23 14:27 | We have identified the root cause and are working toward recovery for the instance impairments and degraded EBS volume performance within a single Availability Zone in the AP-NORTHEAST-1 Region.
2019/08/23 15:40 | We are starting to see recovery for instance impairments and degraded EBS volume performance within a single Availability Zone in the AP-NORTHEAST-1 Region. We continue to work towards recovery for all affected instances and EBS volumes.
2019/08/23 17:54 | Recovery is in progress for instance impairments and degraded EBS volume performance within a single Availability Zone in the AP-NORTHEAST-1 Region. We continue to work towards recovery for all affected instances and EBS volumes.
2019/08/23 18:39 | The majority of impaired EC2 instances and EBS volumes experiencing degraded performance have now recovered. We continue to work on recovery for the remaining EC2 instances and EBS volumes that are affected by this issue. This issue affects EC2 instances and EBS volumes in a single Availability Zone in the AP-NORTHEAST-1 Region.
2019/08/23 20:18 | Beginning at 8:36 PM PDT a small percentage of EC2 servers in a single Availability Zone in the AP-NORTHEAST-1 Region shutdown due to overheating. This resulted in impaired EC2 instances and degraded EBS volume performance for resources in the affected area of the Availability Zone. The overheating was caused by a control system failure that caused multiple, redundant cooling systems to fail in parts of the affected Availability Zone. The chillers were restored at 11:21 PM PDT and temperatures in the affected areas began to return to normal. As temperatures returned to normal, power was restored to the affected instances. By 2:30 AM PDT, the vast majority of instances and volumes had recovered. We have been working to recover the remaining instances and volumes. A small number of remaining instances and volumes are hosted on hardware which was adversely affected by the loss of power. We continue to work to recover all affected instances and volumes. For immediate recovery, we recommend replacing any remaining affected instances or volumes if possible. Some of the affected instances may require action from customers and we will be reaching out to those customers with next steps.
Amazon Relational Database Service (Tokyo)
2019/08/23 13:22 | We are investigating connectivity issues affecting some instances in a single Availability Zone in the AP-NORTHEAST-1 Region.
2019/08/23 14:25 | We have identified the root cause of instance connectivity issues within a single Availability Zone in the AP-NORTHEAST-1 Region and are working toward recovery.
2019/08/23 15:01 | We are starting to see recovery for instance connectivity issues within a single Availability Zone in the AP-NORTHEAST-1 Region. We continue to work towards recovery for all affected instances.
2019/08/23 17:16 | We continue to see recovery for instance connectivity issues within a single Availability Zone in the AP-NORTHEAST-1 Region and are working towards recovery for all affected instances.
2019/08/23 20:46 | The majority of instance connectivity issues have now recovered. We continue to work on recovery for the remaining instance connectivity issues within a single Availability Zone in the AP-NORTHEAST-1 Region.
2019/08/23 22:19 | Between August 22 8:36 PM and August 23 6:05 AM PDT, some RDS instances experienced connectivity issues within a single Availability Zone in the AP-NORTHEAST-1 Region. The issue has been resolved and the service is operating normally.
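The AWS summaries give times in PDT, while the outage durations at the top of this post are in Japan time. The following is a minimal sketch of that conversion and of the duration arithmetic, not anything published by AWS. The variable and function names are made up for illustration. The dates for the 11:21 PM and 2:30 AM times are an assumption inferred from the timeline, since the EC2 message gives only clock times; the RDS message states its dates explicitly.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC offsets: PDT is UTC-7 and JST is UTC+9 (Japan has no daylight saving time).
PDT = timezone(timedelta(hours=-7), "PDT")
JST = timezone(timedelta(hours=9), "JST")


def pdt_to_jst(month, day, hour, minute):
    """Convert a 2019 wall-clock time in PDT to the same instant in JST."""
    return datetime(2019, month, day, hour, minute, tzinfo=PDT).astimezone(JST)


# Times quoted in the AWS status messages.
outage_start = pdt_to_jst(8, 22, 20, 36)         # 8:36 PM PDT, August 22
chillers_restored = pdt_to_jst(8, 22, 23, 21)    # 11:21 PM PDT (date assumed)
ec2_mostly_recovered = pdt_to_jst(8, 23, 2, 30)  # 2:30 AM PDT (date assumed)
rds_resolved = pdt_to_jst(8, 23, 6, 5)           # 6:05 AM PDT, August 23

# Durations measured from the start of the outage.
ec2_duration = ec2_mostly_recovered - outage_start
rds_duration = rds_resolved - outage_start

for label, moment in [
    ("Outage start", outage_start),
    ("Chillers restored", chillers_restored),
    ("EC2 mostly recovered", ec2_mostly_recovered),
    ("RDS resolved", rds_resolved),
]:
    print(f"{label}: {moment:%Y-%m-%d %H:%M %Z}")

print("EC2 duration:", ec2_duration)
print("RDS duration:", rds_duration)
```

Under these assumptions the outage runs from 12:36 JST on August 23, with EC2 mostly recovered by 18:30 (5 hours 54 minutes) and RDS resolved by 22:05 (9 hours 29 minutes), which matches the "about 6 hours" and "about 9.5 hours" figures given earlier.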