The Capital One - AWS incident highlights the roles and responsibilities of cloud customers, providers

By Kurt Marko, August 8, 2019
Summary:
After the Capital One data breach, media finger pointing obscured the real issues. As Kurt explains, this is about companies getting a better handle on the training and know-how they need to manage cloud services securely.


The latest headline-making data security incident presents a teachable moment for enterprises, but contrary to the knee-jerk reaction in some initial news reports, the lesson isn't that organizations are foolish to use cloud infrastructure.

Nor is it about how Amazon is just the latest example of an irresponsible tech giant putting profit above security, jeopardizing everyone's privacy in the process.

Rather, the lesson is about responsibilities, technical comprehension and training. There's a reason high school wood- and metal-working shops are staffed by skilled teachers: if you turn kids with no training loose on power tools, someone will eventually cut off their arm. The same thing is true, metaphorically, with cloud services.

Cloud infrastructure is some of the most sophisticated, versatile and yes, complicated technology yet invented for building applications, storing and analyzing data and automating processes. Used correctly, it can be both secure and amazingly powerful, allowing organizations to do things that would have been previously impossible or prohibitively expensive. However, make one mistake, and disaster can strike as more technically savvy hackers exploit misconfigurations and vulnerabilities to steal data and disrupt systems.

Capital One shows that even early cloud adopters are climbing a learning curve

Everyone in IT is by now familiar with the Capital One incident in which an ex-employee of what a federal prosecutor calls "Cloud Computing Company" - and everyone else knows as AWS based on a resume cited in the federal district court complaint - used her detailed knowledge of AWS S3 and other services to find a hole in Capital One's firewall configuration that allowed access to its AWS compute (EC2) and storage (S3) resources. Once inside Capital One's AWS infrastructure, the accused, Paige Thompson, used a security weakness in the design of an AWS API-handling service to access S3 storage buckets containing a variety of sensitive customer data. Capital One's terse statement on the incident summarizes the damage (emphasis added):

Based on our analysis to date, this event affected approximately 100 million individuals in the United States and approximately 6 million in Canada. Importantly, no credit card account numbers or log-in credentials were compromised and less than one percent of Social Security numbers were compromised. Based on our analysis to date, we believe it is unlikely that the information was used for fraud or disseminated by this individual.

The data came from individual and small business credit applications from 2005 onward and includes the following items for each:

  • Name, address, zip/postal code
  • Phone number, email address
  • Date of birth
  • Self-reported income

In other words, almost everything someone would need short of the SSN to complete a bogus credit application in someone else's name. Indeed, the federal complaint says that the data did include "approximately 120,000 Social Security Numbers and approximately 77,000 bank account numbers."

Anatomy of the attack shows mistakes on both sides

The Capital One breach is more interesting than most such incidents because it happened to one of the most cloud-savvy enterprises, one which was an early and vocal advocate for AWS services. Capital One executives have appeared at several AWS re:Invent keynotes and Amazon has a series of glowing case studies on its website highlighting Capital One's experiences.

Understanding the elements of the Capital One attack is critical to seeing the nuances of shared responsibility for making it possible, if not for actually carrying it out. The criminal complaint describes the stages and results of the attack, but a more technical discussion illustrates the subtleties. I'll summarize the details using two great blog posts, one from Securiosis and one from Josh Stella, CTO of Fugue. Both are cloud security consultancies with expertise on the subject, and I have validated their accounts by studying the relevant AWS documentation.

The attack began when Thompson, who, remember, was no longer working at AWS, probed for EC2 servers with public IP addresses (i.e., accessible from the Internet) and found one with a misconfigured firewall that probably (the FBI statement in the complaint isn't explicit) left a port open that could be used to attack an application or OS vulnerability that provided access to the server. From there, the attacker:

  • Pulled the identity and access management (IAM) Role credentials for the server, which as Securiosis describes it are "ephemeral credentials that allow AWS API calls."
  • Most likely (since we're speculating here) used API calls to query IAM for roles available on the server that provided read access to the S3 storage service.
  • Used those credentials to list the available S3 buckets and copy data to another bucket she controlled.
  • Later copied the data to a local drive and/or another cloud repository.
  • Discussed the hack on social media and GitHub, which eventually led someone to tip off Capital One to the breach; the company then notified the FBI.
  • With the basic parameters in hand, including a file that executed AWS CLI instructions, Capital One used various AWS logging services like CloudTrail to reconstruct the attack's methods and identify the more than 700 S3 buckets Thompson accessed.
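
The first step in that chain, a firewall rule that exposed more than intended, is the kind of mistake a simple audit can catch. The following is a minimal sketch, not a reconstruction of Capital One's setup (the complaint doesn't spell that out). It assumes rule data shaped like the `IpPermissions` entries the EC2 API returns for a security group; the sample group, its ID, the port numbers, the allowed-port set and the function names are all hypothetical.

```python
import ipaddress

# Ports the operator deliberately exposes to the Internet (an assumption for this sketch).
INTENDED_PUBLIC_PORTS = {443}


def world_open_ports(security_group):
    """Return sorted (from_port, to_port) ranges reachable from any address."""
    findings = set()
    for rule in security_group.get("IpPermissions", []):
        # Only TCP, UDP and "all protocols" rules carry port ranges; skip ICMP.
        if rule.get("IpProtocol") not in ("tcp", "udp", "-1"):
            continue
        cidrs = [r["CidrIp"] for r in rule.get("IpRanges", [])]
        cidrs += [r["CidrIpv6"] for r in rule.get("Ipv6Ranges", [])]
        # A prefix length of 0 (0.0.0.0/0 or ::/0) matches every address.
        if not any(ipaddress.ip_network(c).prefixlen == 0 for c in cidrs):
            continue
        # Protocol "-1" means all ports, and the port fields are absent.
        findings.add((rule.get("FromPort", 0), rule.get("ToPort", 65535)))
    return sorted(findings)


def unexpected_exposure(security_group, allowed=INTENDED_PUBLIC_PORTS):
    """Return world-open port ranges that include a port outside `allowed`."""
    return [
        (low, high)
        for low, high in world_open_ports(security_group)
        if not all(port in allowed for port in range(low, high + 1))
    ]


# Hypothetical security group used only to exercise the functions above.
SAMPLE_GROUP = {
    "GroupId": "sg-hypothetical",
    "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        {"IpProtocol": "tcp", "FromPort": 8080, "ToPort": 8080,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "10.0.0.0/8"}]},
    ],
}

if __name__ == "__main__":
    print(unexpected_exposure(SAMPLE_GROUP))  # [(8080, 8080)]
```

A check like this only finds ports that are open to everyone; it says nothing about whether the application listening on an intentionally public port is itself vulnerable.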

Although some of the data was likely encrypted, hence the low number of SSNs exposed, the other personally identifiable information (PII) was not.

Assigning blame

Capital One is obviously at fault for misconfiguring AWS firewall rules for a public-facing server such that a knowledgeable attacker could gain access to at least one of its EC2 servers. However, AWS is not blameless, given known weaknesses in its API-handling feature, the Metadata service, which allows any process on an EC2 instance to query an API running on a link-local address for data about the instance itself. Evan Johnson, a security engineer at Cloudflare, provides an excellent technical analysis of the vulnerabilities. According to Johnson's blog post:

Every indication is that the attacker exploited a type of vulnerability known as Server Side Request Forgery (SSRF) in order to perform the attack. SSRF has become the most serious vulnerability facing organizations that use public clouds. SSRF is not an unknown vulnerability, but it doesn't receive enough attention. … Server Side Request Forgery is an attack where a server can be tricked into connecting to a server it did not intend.

In this case, the SSRF exploited the Metadata service, which provides temporary credentials used to make API calls to other AWS services. The advantage of AWS's Metadata service design is that it doesn't require issuing actual IAM keys to any programmer that needs to use AWS APIs. However, as Johnson details, it opens security holes. As he writes:

Accessing the credentials is easy! It's extremely easy to access the IAM Role temporary credentials that are provided by the metadata service. This is an example of how easy it is [to] access the metadata service. There is no authentication and no authorization to access the service. … All processes can talk to the metadata service and access temporary credentials with any out of the box operating system.


Source: Netsparker; What is the Server Side Request Forgery Vulnerability & How to Prevent It?
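
One customer-side mitigation for SSRF is to vet any URL an application is asked to fetch before it makes the request. The sketch below is one way to do that under stated assumptions, not a description of what Capital One or AWS runs. The metadata address `169.254.169.254` and its credentials path are the well-known link-local endpoint on EC2; the function name, the injectable `resolve` parameter and the choice to reject all private ranges are my own illustrative choices.

```python
import ipaddress
import socket
from urllib.parse import urlparse

# The link-local endpoint where EC2 serves IAM role credentials to local processes.
METADATA_CREDENTIALS_URL = (
    "http://169.254.169.254/latest/meta-data/iam/security-credentials/"
)


def is_safe_destination(url, resolve=socket.getaddrinfo):
    """Return True only if every address the URL's host resolves to is public.

    `resolve` is injectable so the check can be exercised without DNS.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        infos = resolve(parsed.hostname, parsed.port or 80)
    except OSError:
        # Unresolvable hosts are treated as unsafe.
        return False
    for info in infos:
        # Each result is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the address string.
        address = ipaddress.ip_address(info[4][0])
        if address.is_link_local or address.is_loopback or address.is_private:
            return False
    return bool(infos)


if __name__ == "__main__":
    # A numeric host resolves locally, so no network access is needed here.
    print(is_safe_destination(METADATA_CREDENTIALS_URL))  # False
```

A check of this kind is not sufficient on its own: the host could resolve to a different address when the real request is made, so a production guard would also need to pin the connection to the address it vetted.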

Indeed, a Swiss security engineer described such an attack two years ago in a blog post portentously titled, Abusing the AWS metadata service using SSRF vulnerabilities. Johnson proposes some changes AWS could make to bolster the Metadata service's security, notably requiring two-factor authentication for access and requiring that temporary credentials issued by the service be used only within the customer's VPC network. Given the severity of the Capital One incident, customers should demand that AWS research and implement security improvements that thwart similar attacks in the future.

My take - understand the delineation of responsibilities when using cloud services

The Capital One breach demonstrates the critical need for enterprise cloud users to understand their role in securing cloud resources and the applications and data they host. AWS publishes a shared responsibility model that explicitly identifies the abstraction layers that are its responsibility to secure and those that are the customer's. The Capital One incident is the result of mistakes (arguably five, since Capital One could have encrypted all of the PII, not just the SSNs) in at least four areas as I've annotated below.


Source: AWS Documentation; Shared Responsibility Model.
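
Identity and access management sits on the customer's side of that model, and the attack described earlier relied on a server role that could read a large number of S3 buckets. As a rough illustration of how a customer might audit for that, here is a sketch that flags IAM policy statements granting wildcard S3 actions on all resources. It assumes the standard JSON policy document layout (`Statement`, `Effect`, `Action`, `Resource`); the sample policy, its `Sid` values, the bucket name and the function names are hypothetical, and the actual policy attached to Capital One's role is not public in the sources cited here.

```python
def _as_list(value):
    """IAM allows a single string or a list for Action and Resource."""
    return value if isinstance(value, list) else [value]


def broad_s3_statements(policy):
    """Return the Sid of each Allow statement with wildcard S3 access to everything."""
    findings = []
    for statement in _as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        actions = [a.lower() for a in _as_list(statement.get("Action", []))]
        resources = _as_list(statement.get("Resource", []))
        # "*" covers every service; "s3:*" or "s3:Get*" are wildcard S3 grants.
        wildcard_action = any(
            a == "*" or (a.startswith("s3:") and "*" in a) for a in actions
        )
        # "*" or the bare S3 ARN wildcard matches every bucket.
        every_bucket = any(r in ("*", "arn:aws:s3:::*") for r in resources)
        if wildcard_action and every_bucket:
            findings.append(statement.get("Sid", "<no Sid>"))
    return findings


# Hypothetical policy document used only to exercise the check above.
SAMPLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "TooBroad", "Effect": "Allow",
         "Action": "s3:*", "Resource": "*"},
        {"Sid": "Scoped", "Effect": "Allow",
         "Action": ["s3:GetObject"],
         "Resource": "arn:aws:s3:::example-app-logs/*"},
    ],
}

if __name__ == "__main__":
    print(broad_s3_statements(SAMPLE_POLICY))  # ['TooBroad']
```

The point of such a check is the blast radius: a role scoped to the one bucket an application needs limits what stolen temporary credentials can reach.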

Although AWS was the target of this attack, a similar demarcation of security responsibilities holds for any cloud service, both IaaS like AWS and SaaS like Salesforce.com. Indeed, as AWS has stated in response to this incident (see, for example, this incident summary by security researcher Brian Krebs), it and the other cloud services provide ample and continuously improving tools to make the job of protecting systems and data easier and more effective.

Nonetheless, AWS is not off the hook in this case because of the abstruse design of many of its services and APIs, which can allow leveraging restricted access to one system into escalated security roles and subsequent access to other resources. AWS's culpability is notably evident in this case since just such an attack was outlined by security researchers years before.

Enterprises should learn the right lessons from Capital One's misfortune: don't abandon the cloud in a spasm of FUD, but redouble efforts to understand its nuances and use cloud services more safely and effectively.