Social Media and the Blockchain

Social media and blockchain are two technologies that commonly make headlines. But how do these two technologies work together?

Why Put Social Media on the Blockchain?

Social media and the blockchain are not an unknown pairing. Several different blockchain-based social media startups have been around for years now. Social media startups are attracted to blockchain technology for a variety of different reasons. Some of the main advantages of blockchain for social media include:

  • Decentralization: The current major social media companies are relative monopolies in their spaces and have the ability to exert a great deal of control over the types of content posted on their platforms and disseminated to their users. Blockchain, which is designed to be a fully decentralized system, lacks the centralization of traditional social media. This decentralization makes blockchain-based social media platforms more resistant to censorship.

  • Transparency: A common criticism of social media platforms is their lack of transparency. Platforms like Facebook have been the defendants in a number of different lawsuits regarding what they do with their users’ data, and many of these platforms have been accused of being arbitrary in the application of their rules regarding “acceptable” content. A blockchain is designed to be fully transparent, meaning that any rules regarding what constitutes “acceptable” content are transparent, consistent, and fairly applied.

  • Incentives and Compensation: Traditional social media platforms are designed to monetize their users. These platforms collect their users’ data and sell information about them to companies selling targeted advertising. Blockchain platforms, on the other hand, have built-in methods for paying for use of their infrastructure (block rewards and transaction fees) and for rewarding creators or paying subscriptions (cryptocurrency). This eliminates the need to monetize users to make a profit.

  • Anonymity: Many blockchain platforms are designed to provide a level of anonymity to their users. Instead of using real identities, blockchain accounts are identified by public keys and addresses. This anonymity can be an asset in some spaces on social media where publishing unpopular opinions or facts could endanger a journalist or writer.

  • Immutability: The blockchain is designed to create a distributed and immutable digital ledger, making it difficult to modify or remove information without detection. This can be an asset in some contexts where an authoritative record would be valuable.
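The immutability point can be illustrated with a minimal hash-chain sketch. It is not how any particular blockchain is implemented: the block layout and the function names (block_hash, build_chain, is_valid) are assumptions made for this example, and real platforms add consensus, signatures, and much more. The idea it shows is only that each block commits to the one before it, so editing an old post is detectable.

```python
import hashlib
import json


def block_hash(index, data, prev_hash):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(posts):
    """Build a list of blocks, each linked to the one before it."""
    chain = []
    prev_hash = "0" * 64  # placeholder value for the first block
    for index, data in enumerate(posts):
        digest = block_hash(index, data, prev_hash)
        chain.append(
            {"index": index, "data": data, "prev_hash": prev_hash, "hash": digest}
        )
        prev_hash = digest
    return chain


def is_valid(chain):
    """Recompute every hash and check that each link still matches."""
    prev_hash = "0" * 64
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        if block_hash(block["index"], block["data"], prev_hash) != block["hash"]:
            return False
        prev_hash = block["hash"]
    return True


chain = build_chain(["first post", "second post", "third post"])
print(is_valid(chain))  # True

# Copy the chain and quietly edit an old post: validation now fails.
tampered = [dict(block) for block in chain]
tampered[1]["data"] = "edited post"
print(is_valid(tampered))  # False
```

The same property is what makes removing leaked or abusive content hard: deleting or changing a block breaks every link that follows it.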

While all of these are potential advantages of blockchain technology for social media, the picture isn’t all positive. Many of these same factors can also be detrimental to a social media platform. For example, the anonymity of blockchain (not essential but common) can encourage trolling and abusive behavior online. Similarly, the blockchain’s immutability could make it difficult to remediate leaks of sensitive information on social media. Social media is an interesting potential application for blockchain technology, but the two aren’t a perfect fit.

The Reality of Social Media and the Blockchain

Blockchain is a relatively young technology. While the original blockchains are over a decade old, it took several years before Bitcoin and its descendants achieved widespread visibility. While many companies - both large and small - are actively investigating the potential of blockchain solutions, many projects are still in their early stages.

The same can be said of blockchain-based social media projects. Several such projects are in existence, but many have been around for only a few years. The history of social media as a whole demonstrates that it can be difficult to predict if and when certain social media sites will take off and succeed.

Some social media sites - like MySpace - were relatively early successes, while many of the major players today were ridiculed in their youth by the major players of the day. At some point, it seems very likely that a social media platform based on blockchain technology will take off. However, it is impossible to tell whether it will be one of the platforms that exist today.

What is File Encryption and Why it's Important for your Business

Data needs to be protected against exposure and unauthorized access at all times. Protocols like TLS protect data in transit, but don’t do anything for data being stored on a machine. File encryption protects this data by encrypting all files before storing them on a computer’s hard drive or on removable media. With strong encryption, it is computationally infeasible for anyone to read the data without access to the appropriate encryption key.

A file encryption solution will also implement a key management solution, which is critical to the security of the system. On the one hand, users need to be able to access these keys so that they can decrypt data and use it for legitimate purposes. On the other, attackers need to be blocked from accessing these keys, which would allow them to decrypt the files and read the data that they contain. A file encryption solution must be designed so that encryption keys are securely stored and only accessible by legitimate users.

Why Use File Encryption?

File encryption is designed to protect data at rest. The use of encryption enables an organization to protect itself against a range of potential attacks and decrease its cybersecurity risk.

Compromised Accounts

User and application accounts are commonly compromised as part of cyberattacks. A cybercriminal may use phishing, credential stuffing, or other means to identify login credentials for a user account. Alternatively, exploitation of an application vulnerability may give an attacker access to an enterprise system with the same privileges as the compromised application. In these cases, organizations’ data security largely boils down to permissions management. If the compromised account has access to a particular file, so does the attacker. In the case of a compromise of a root or administrator account, this includes almost every file on the compromised system.

The use of file encryption can help to provide another line of defense against this type of attack. If a file is encrypted, the attacker needs access to the decryption key as well as the file itself. If encryption keys are well-managed, access is restricted to those who actually need them for their jobs, which is not necessarily the same group as has administrator-level permissions on a system. This provides an additional level of defense against data leaks and decreases an organization’s cyber risk.

Cloud Storage

Companies are increasingly moving sensitive data and vital applications to the cloud. While cloud-based deployments have a number of advantages over traditional on-premises data centers, they also create security concerns.

Cloud security can be very different from traditional cybersecurity, and the accessibility of the cloud from the public Internet makes the stakes of poor security even higher. As a result, the number of data breaches involving cloud storage has grown steadily with the increase in cloud adoption.

One of the most common mistakes that organizations make regarding their cloud data is failing to encrypt it. This makes the organization’s data security only as strong as the weakest link in the organization’s cloud security.

Leveraging file encryption in the cloud makes cloud data breaches much harder to perform. Even if an attacker can gain access to an organization’s cloud-based data storage, they also need access to the associated decryption keys to derive any value from the data. A file encryption solution with secure key management poses a much more challenging target.

Lost/Stolen Devices

Employees are increasingly using mobile devices for work. This trend has become more common in recent years, and the COVID-19 pandemic created an explosion in telework and the use of personal and mobile devices.

With the increased convenience of these mobile devices comes higher cybersecurity risk. A smartphone, tablet, or laptop is relatively easy to lose or have stolen in a public place. If this occurs, the thief may be able to read sensitive company data off of the device by scanning its hard drive.

File encryption protects against the threat of lost or stolen mobile devices. Each file on the machine is encrypted, and the encryption keys are themselves protected by the user’s password. If an attacker doesn’t have access to this password, then they can’t read any useful data off of the stolen device.
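One common way to tie encryption keys to a password is to derive a key from the password with a deliberately slow function. The sketch below uses PBKDF2 from Python's standard library; the iteration count, salt size, and the derive_key name are assumptions chosen for illustration, not the settings of any particular product. It shows only the derivation step, not the file encryption itself.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor for this sketch


def derive_key(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte key from a password using PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS, dklen=32
    )


salt = os.urandom(16)  # stored next to the encrypted data; it is not secret
key = derive_key("correct horse battery staple", salt)

# The same password and salt always reproduce the same key...
same_key = derive_key("correct horse battery staple", salt)
# ...while a wrong guess yields an unrelated key that cannot unlock the files.
wrong_key = derive_key("password123", salt)

print(hmac.compare_digest(key, same_key))   # True
print(hmac.compare_digest(key, wrong_key))  # False
```

Because the key never needs to be written to disk in usable form, a thief holding only the device has to guess the password, and the slow derivation makes each guess expensive.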

Regulatory Compliance

In recent years, the regulatory compliance landscape has grown increasingly complex. In the past, organizations largely had to comply with industry-specific regulations like HIPAA and PCI DSS. In the wake of the passage of the EU’s General Data Protection Regulation (GDPR), many governments have passed their own data privacy laws as well, such as the California Consumer Privacy Act (CCPA).

While these laws vary in the details, they have a common focus on protecting consumer data. One of the common requirements is that organizations protect their customers’ data and restrict access based upon need-to-know.

File encryption enables an organization to meet both of these requirements. Encrypting files and restricting access to decryption keys based upon role requirements helps ensure that no one can gain unauthorized access to sensitive data.

What to Look For in a File Encryption Solution

File encryption is a valuable tool for data security. However, implemented improperly, it can negatively impact employee productivity or lull an organization into a false sense of security. Some vital features to look for in a file encryption solution include:

  • Secure Encryption: A file encryption solution is only as secure as the encryption algorithm that it uses. For example, GhostVolt uses AES in GCM mode.

  • Granular Control: Some encryption solutions use a single key to encrypt all data, but this forces an “all or nothing” approach to access management. A file encryption solution should support a variety of different keys, enabling access to files to be granted or denied on a per-user or per-application basis.

  • Usable Key Management: File encryption is designed to protect data from unauthorized access; however, it is necessary for legitimate users to be able to access their data in order to carry out their business. For this reason, the file encryption solution’s key management system should be secure, easy to use (enabling granular key management) and highly accessible.

  • Easy Sharing: Employees within an organization need to be able to share internal documents and other files. A file encryption solution needs to make it easy for these users to add or revoke other users’ access to their documents.
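To make the granular control and sharing features above more concrete, here is a minimal sketch of a key store that keeps one random key per file and a list of users allowed to fetch it. The FileKeyStore class and its methods are hypothetical names invented for this example; it models only the bookkeeping, not the encryption or secure storage that a real product would need.

```python
import secrets


class FileKeyStore:
    """Hypothetical store: one random key per file, with per-user grants."""

    def __init__(self):
        self._keys = {}    # file name -> 32-byte key
        self._grants = {}  # file name -> set of user names

    def add_file(self, filename, owner):
        """Create a fresh key for a file and give the owner access."""
        self._keys[filename] = secrets.token_bytes(32)
        self._grants[filename] = {owner}

    def grant(self, filename, user):
        self._grants[filename].add(user)

    def revoke(self, filename, user):
        self._grants[filename].discard(user)

    def get_key(self, filename, user):
        """Return the file's key only if the user has been granted access."""
        if user not in self._grants.get(filename, set()):
            raise PermissionError(f"{user} may not access {filename}")
        return self._keys[filename]


store = FileKeyStore()
store.add_file("budget.xlsx", owner="alice")
store.grant("budget.xlsx", "bob")
bob_key = store.get_key("budget.xlsx", "bob")  # succeeds while access is granted

store.revoke("budget.xlsx", "bob")
try:
    store.get_key("budget.xlsx", "bob")
except PermissionError as error:
    print(error)
```

With one key per file, sharing a single document does not expose anything else. Note that in a real system a revoked user may have kept a copy of the key, so revocation usually also means re-encrypting the file under a new key.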

Preparing for the Cyber Risks of Tax Season

With the beginning of 2021 comes the start of the 2020 tax season as well. While some people may wait until the middle of April to prepare their tax returns, cybercriminals are already gearing up to start targeting this sensitive data. Securing your tax data requires an understanding of the potential threats and best practices for protecting it.

The Value of Tax Return Data

Tax returns provide a wealth of valuable information for a cybercriminal. Obviously, access to a tax return from a previous year makes it possible to perform tax fraud for the current year. However, tax returns also contain a wealth of other data that can be used for fraudulent purposes, such as:

  • Personal Data: A tax return contains your full name, address, social security number (SSN), and similar personal data. This information is enough to perform tax fraud or to open up a credit card or bank account.

  • Spouse and Dependents: As part of filing joint taxes or claiming dependents, it is necessary to provide similar data for them as well. This potentially exposes them to the same tax and financial fraud. Additionally, this information can be used for spear phishing attacks or to try to guess your login passwords for online accounts (which hopefully aren’t based on kids’ names, birthdays, etc.).

  • Financial Data: A tax return provides an in-depth look at your current financial status. This information could be used to inform ransomware attacks (how much are you able to pay in ransom?) or to help identify the banks that you use so that “you” can call after “forgetting” your password. Many of the answers to your security questions are also included in the return.

  • Charitable Donations: Your tax returns may include information regarding charitable donations that you have made this year. This information may be used to impersonate you when communicating with and scamming a charity.

The information contained within a tax return is highly sensitive and can be misused in a number of different ways. It is vital to ensure that this data is properly protected against unauthorized access and exposure to cybercriminals.

How Cybercriminals Get Your Tax Returns

The data contained within a tax return is extremely valuable to a cybercriminal. For this reason, they are willing to put in the effort to steal it using a variety of different techniques:

  • Phishing Emails: Phishing is the most common type of cyberattack, and it works well for this type of scam. If a phisher impersonating the IRS, your bank, or a similar institution can convince you to hand over login credentials or a copy of your return, then the attacker has everything that they need.

  • Vishing: Voice phishing or “vishing” is phishing over the phone. The same techniques apply here as well (and over social media, in person, etc.). The attacker will pretend to be someone in authority and try to talk you into providing sensitive data.

  • Malware: Some forms of malware are specifically designed to search for and steal financial data from a computer. Such malware can look for copies of tax returns on your computer and send the entire return or extracted high-value data to the attacker.

Be wary of anyone who contacts you and tries to get you to provide information about your tax return over the phone. Always verify the authenticity of a request by contacting the alleged requestor through official channels such as the email or phone number listed on an official site (like irs.gov).

Protecting Your Personal Data

Tax season is one of the times of highest activity for cybercriminals and scammers. To keep your data safe, take the following simple steps:

  • Watch Out for Scammers: Social engineering (including phishing, vishing, and more) is a common way to steal tax return data. Always verify requests before providing any information.

  • Use an Antivirus (AV): A malware infection can allow a cybercriminal to steal your personal data or cause other damage to your computer. Keep your AV up to date and run it regularly to help detect and remove malware from your computer.

  • Use a File Encryption Solution: Files stored unencrypted on your computer can be stolen by malware and other means. A file encryption solution like GhostVolt ensures that an attacker can’t use the data in your tax return even if they steal the file.

Why Do I Need File Encryption?

Many cybersecurity solutions are geared toward businesses. Most individuals don’t need to make the same investment in cyber and data security as companies do because they have less personal data and it is often less targeted by cybercriminals. However, that isn’t to say that the average person shouldn’t take any steps to protect their personal data. One of the fundamental cybersecurity protections that every person should have is file encryption.

What is File Encryption?

When using the Internet, most people know to use encryption of data in transit. This is the big difference between HTTP and HTTPS traffic (i.e. whether or not the lock icon shows up in the address bar). Using HTTPS helps to ensure that you’re connected to the right website and that no one can eavesdrop on your connection, which may include personal data like credit card information, your Netflix queue, etc.

However, your personal data isn’t only at risk when it’s traveling over the Internet. Despite your best efforts, there is a chance that your computer will be infected with malware that your antivirus doesn’t catch. If this is the case, the malware may start looking for sensitive data on your computer and send it to the cybercriminal running the malware.

This is where file encryption comes in. Instead of just encrypting data in transit, file encryption ensures that data is stored encrypted on your computer. This means that an attacker or malware with access to your computer can’t read your sensitive data unless they also know your password.

Why Do I Need File Encryption?

The most common argument against implementing good cybersecurity practices is “I don’t have any data worth stealing”. However, this statement is incorrect, and cybercriminals commonly target individuals to steal personal data.

When thinking about your personal data, you might focus on credit card and banking information, which is primarily entered into the browser and not stored on the machine. However, a great deal of personal data can be extracted from files that you may store on your computer without thinking twice about them. Some examples of these files include:

  • Tax Return Documents: Most tax preparation software provides an option to store a copy of the return on your computer. A full tax return provides an attacker with all of the information that they require to perform identity theft. Similarly, W2s, 1099s, and other common forms can contain sensitive data.

  • Family Photos: By default, many cameras and smartphones will embed location information in photos, which is why your computer can tell where and when the photo was taken. A picture of a backyard barbecue can reveal a home address, and a birthday photo can reveal someone’s name and date of birth.

  • Application Forms: Applications for a loan, rental, etc. often contain sensitive information like a social security number (SSN). This information can be used in identity theft scams.

  • Travel Plans: When booking a vacation, you may store a copy of the booking information on a computer. These confirmations can include financial information and details of your travel plans, providing a would-be burglar with a list of dates when a house will be empty.

Common, everyday files can reveal a great deal of personal information. Protecting these files with encryption when storing them on a computer can help to prevent this data from getting into the hands of cybercriminals.

Simple File Encryption with GhostVolt

Implementing file encryption does not need to be complex or expensive. GhostVolt offers a simple, user-friendly solution that enables users to securely store files on their computers and share them with friends and family.

How Your Passwords End Up for Sale on the Dark Web

Credential stuffing and similar attacks are made possible by cybercriminals with access to large lists of common and breached passwords. But how do these cybercriminals get these lists of passwords to try? Two of the main methods are via phishing attacks and data breaches.

Phishing Attacks

Phishing attacks are the easiest way for cybercriminals to gain access to account credentials. If a phishing page tricks you into entering your username and password, then these credentials are sent straight to the attacker. They can then add them to a list of credentials for sale on the Dark Web or keep them for their own personal use.

Inside a Data Breach

Data breaches have become a fact of daily life. In the US alone, 1,473 data breaches were reported in 2019. This averages to 4 breaches and over 450,000 breached records per day.

In many of these breaches, passwords are listed as part of the exposed data. However, most organizations don’t store their users’ passwords in plaintext. Doing so would make it far too easy for an attacker or a malicious insider to steal the master password file and use it to gain illegitimate access to user accounts.

Instead, when data breaches talk about passwords being breached, they are actually talking about leaked password hashes. Hash functions are cryptographic operations with a few useful properties:

  • One-Way Function: It is computationally infeasible to calculate the input to a hash function from the corresponding output.

  • Collision-Resistance: It is very difficult to find two inputs that produce the same output.

  • Deterministic: Hashing the same input always produces the same output.

These properties make hashes an ideal way to store password data. Instead of storing a list of passwords (which can be abused), a list of hashes is stored instead. When verifying users, if the hash of the password that they provide matches the one on file, then they are likely to have provided the correct password. However, if the password hashes are leaked, then an attacker still doesn’t have the original password.
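The verification step described above can be sketched in a few lines. This is a simplified illustration: the stored dictionary, the example password, and the function names are assumptions made for the example, and it uses a single unsalted SHA-256 hash to keep the idea visible. Production services typically add a per-user salt and use a deliberately slow password hashing algorithm.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Return the SHA-256 digest of a password as a hex string."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Deterministic: hashing the same input twice gives the same output.
print(hash_password("hunter2") == hash_password("hunter2"))  # True
# A one-character change produces a completely different digest.
print(hash_password("hunter2") == hash_password("hunter3"))  # False

# The service keeps only the hashes on file, never the passwords themselves.
stored = {"alice": hash_password("tr0ub4dor&3")}


def verify(user: str, attempt: str) -> bool:
    """Check a login attempt by comparing hashes, not passwords."""
    expected = stored.get(user)
    if expected is None:
        return False
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, hash_password(attempt))


print(verify("alice", "tr0ub4dor&3"))  # True
print(verify("alice", "letmein"))      # False
```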

From Password Hashes to Breached Passwords

If password hashes are so secure, then how do data breaches lead to passwords being offered for sale on the Dark Web? This mainly happens when someone has made a mistake.

The less common reason is that the breached service is not storing passwords securely. If passwords are not appropriately hashed or are hashed with a broken algorithm (like MD5 or SHA-1), then an attacker may be able to break the hash and determine the user’s credentials.

More commonly, the fault is on the side of the user. An attacker can take a “guess and check” approach to breaking passwords by hashing potential passwords and comparing them to breached password hashes. If the hashes match, then the attacker has found the right password.

This is what makes the use of weak and reused passwords so dangerous. If a password is easily guessable (based off of a word, pattern, etc.), then password cracking tools will easily be able to identify it in a list of breached password hashes. Once this occurs, the attacker is able to sell the breached password rather than just the hash.
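The “guess and check” approach is simple enough to show in a short sketch, which is exactly why weak passwords fall so quickly. The breached hashes, the tiny wordlist, and the crack function below are all made up for illustration, and the hashes are unsalted SHA-256 for brevity; real cracking tools work the same way in principle but try billions of candidates.

```python
import hashlib


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Pretend these hashes came from a breach. The attacker sees only the digests.
breached_hashes = {
    "user1": sha256_hex("password"),
    "user2": sha256_hex("qwerty123"),
    "user3": sha256_hex("kV9#tz!q2LmXw8&r"),
}

# A (very short) list of commonly used passwords to try.
wordlist = ["123456", "password", "letmein", "qwerty123"]


def crack(breached, candidates):
    """Hash every candidate once, then look for matching breached hashes."""
    lookup = {sha256_hex(word): word for word in candidates}
    return {
        user: lookup[digest]
        for user, digest in breached.items()
        if digest in lookup
    }


recovered = crack(breached_hashes, wordlist)
print(recovered)  # {'user1': 'password', 'user2': 'qwerty123'}
```

The two guessable passwords are recovered immediately, while the long random one is not, because it never appears in any list of candidates.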

How The Good Guys Are Fighting Back

Most people don’t know when they’ve been the victim of a data breach. For this reason, many online services have incorporated breach notifications. This means that, if you try to log into an account that has been breached or try to use a breached password, the service will notify you and help you reset the password. In most cases, these breach notifications are driven by HaveIBeenPwned, a site that collates this data and allows people to determine if their data has been breached.

These services work because they use the same tools and techniques as malicious attackers. Cybersecurity researchers search the Dark Web for lists of credentials that have been publicly posted or offered for sale. Any password hashes that they collect can then be subjected to password cracking to determine whether or not they are likely to be broken by attackers. This is where the lists of the “most commonly used passwords” come from.

Applications like GhostVolt also use HaveIBeenPwned to help protect people from reusing breached passwords. When setting a password on GhostVolt, the password is automatically checked against HaveIBeenPwned’s list of breached passwords. If a match is found, GhostVolt will not allow the use of this weak password. While this may seem inconvenient, it is essential to protecting the security of your GhostVolt account.
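A check like this can be sketched against HaveIBeenPwned's public Pwned Passwords range API, which accepts the first five characters of a password's SHA-1 hash and returns the matching hash suffixes with their breach counts, so the password itself never leaves your machine. The code below is an illustration only and says nothing about how GhostVolt implements its check; the function names and the User-Agent string are assumptions made for the example.

```python
import hashlib
import urllib.request

RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_hash(password: str):
    """Return the (5-char prefix, 35-char suffix) of the password's SHA-1 hash."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_response(suffix: str, body: str) -> int:
    """Find our suffix in a response made of 'SUFFIX:COUNT' lines."""
    for line in body.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count.strip())
    return 0


def breach_count(password: str) -> int:
    """Ask the range API how often this password appears in known breaches."""
    prefix, suffix = split_hash(password)
    request = urllib.request.Request(
        RANGE_URL + prefix,
        headers={"User-Agent": "breach-check-sketch"},  # assumed client name
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        body = response.read().decode("utf-8")
    return count_in_response(suffix, body)


# Only the prefix is ever sent over the network.
prefix, suffix = split_hash("password")
print(prefix)  # 5BAA6
```

Calling breach_count("password") requires network access. A result above zero means the password has appeared in a known breach and should be rejected.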

Protecting Your Online Accounts

If you are notified about a breach of one of your online accounts, it is vital to change the affected password immediately. However, this may be too little too late since a breached account means that attackers may already have access to your account.

Being proactive about password security is a much better idea. Instead of waiting for passwords to be breached, replace any weak and reused passwords with unique, strong ones and store them in a password manager. This dramatically decreases the risk of cybercriminals compromising your account if an online service is breached and password hashes are leaked.

Encryption and Its Role in Election Security

The security of the United States’ electoral process has come into question from a number of different angles. On the one hand, the increase in mail-in voting due to COVID-19 has caused some politicians to call the validity of the election into question. This is largely based upon unsupported allegations of widespread fraudulent use of absentee ballots.

On the other hand, the electronic voting machines that are in widespread usage and are the predominant method by which US voters cast their votes have a number of known security issues. Assessments of the security of over 100 voting machines at the 2019 DEFCON conference found that all of them contained exploitable vulnerabilities.

Applying Cryptography to Election Security Threats

The security of the US election infrastructure is a vital concern, and cryptographic algorithms have a number of potential applications for voting machine security. Confidentiality, integrity, and authenticity are three primary goals of cryptographic algorithms, and all of these apply to election infrastructure.

Confidentiality: Revealed Votes

One of the key components of a fair election process is the ability for voters to cast their ballots in secret. If the individual votes within an election are not confidential, then the potential exists for coercion or blackmail to impact the results of the election.

Ensuring data confidentiality is the primary goal of an encryption algorithm. Homomorphic encryption algorithms enable algebraic operations to be performed on encrypted data without revealing the data itself. This can be applied to election infrastructure by enabling ballots to be encrypted and tallied without revealing their actual contents.
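As a rough illustration of tallying encrypted ballots, the sketch below implements a toy version of the Paillier cryptosystem, one well-known additively homomorphic scheme. The choice of Paillier, the tiny fixed primes, and the function names are assumptions made for this example. The parameters are far too small to be secure, and a real election system would also need proofs that each ballot is valid, which are omitted here.

```python
import math
import secrets

# Toy parameters: small fixed primes so the example runs instantly. NOT secure.
P, Q = 1009, 1013
N = P * Q
N_SQ = N * N
G = N + 1
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)  # modular inverse of LAM mod N (Python 3.8+)


def encrypt(m: int) -> int:
    """Encrypt a small non-negative integer m < N under the public key (N, G)."""
    while True:
        r = secrets.randbelow(N - 1) + 1  # random value in 1..N-1
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(c: int) -> int:
    """Decrypt a ciphertext with the private values LAM and MU."""
    x = pow(c, LAM, N_SQ)
    return ((x - 1) // N * MU) % N


def add_encrypted(c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the plaintexts hidden inside them."""
    return (c1 * c2) % N_SQ


# Each voter encrypts 1 (yes) or 0 (no). Nobody tallying sees individual votes.
votes = [1, 0, 1, 1, 0]
encrypted_votes = [encrypt(v) for v in votes]

encrypted_tally = encrypt(0)
for c in encrypted_votes:
    encrypted_tally = add_encrypted(encrypted_tally, c)

# Only the final total is ever decrypted.
print(decrypt(encrypted_tally))  # 3
```

Because encryption is randomized, two voters casting the same vote produce different ciphertexts, so ballots cannot be matched to one another by comparing them.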

Integrity: Voting Machine Glitches

Voting machines are computers, and computers have the potential for software bugs or glitches that impact their operation. This is not merely a theoretical threat to the integrity of the electronic voting systems in use today.

In a 2019 Pennsylvania election, a voter reported that a touchscreen voting system was acting oddly and not properly recording her votes. Analysis of the paper trail associated with this machine determined that a candidate for whom the machine recorded a total of 15 votes actually won the election by over 1,000. Despite this potential issue with the integrity of electronic storage, 12% of voters use electronic voting machines with no paper backup.

Cryptographic tools such as hash algorithms, digital signatures, and message authentication codes (MACs) are designed to protect the integrity of data by alerting if any modifications have been performed. The use of one of these solutions (potentially stored separately from votes or provided on a receipt to voters) could help to detect issues that affect the accuracy of the vote.
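As a minimal sketch of the integrity idea, the example below attaches a message authentication code (HMAC-SHA256) to a stored vote record and checks it later. The record format, the key handling, and the function names are assumptions made for illustration; a real system would need to protect the key and decide carefully where the tags are stored.

```python
import hashlib
import hmac
import secrets


def make_tag(key: bytes, record: bytes) -> str:
    """Compute an HMAC-SHA256 tag over a record."""
    return hmac.new(key, record, hashlib.sha256).hexdigest()


def is_unmodified(key: bytes, record: bytes, expected_tag: str) -> bool:
    """Return True only if the record still matches its original tag."""
    return hmac.compare_digest(make_tag(key, record), expected_tag)


key = secrets.token_bytes(32)  # secret key held by the election authority

record = b"machine=17;ballot=000123;choice=candidate_a"
tag = make_tag(key, record)  # stored separately from the record itself

print(is_unmodified(key, record, tag))  # True

# A glitch or an attacker changes the recorded choice: the tag no longer matches.
altered = b"machine=17;ballot=000123;choice=candidate_b"
print(is_unmodified(key, altered, tag))  # False
```

The tag does not say what went wrong or restore the original vote. It only signals that the stored record can no longer be trusted, which is why a paper trail remains valuable.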

Authenticity: Modified Votes

The most significant threat to the integrity of the US electoral process is the potential for votes to be modified in a way that impacts the result of the election. The potential for this to occur has been demonstrated in numerous ways, including a study that found that 94% of voters didn’t notice that their votes had been changed between a touchscreen system and the associated paper record.

Authenticating the identity of a message sender is one of the primary goals of digital signature algorithms. Voters issued identification cards with digital certificates stored on them could generate a digital signature as part of the voting process. If their ballot or vote was modified in any way, the signature would not be valid, making the tampering easily detectable.
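The sign-then-verify flow can be sketched with “textbook” RSA. This is a teaching toy under loud assumptions: the key is built from two small, publicly known primes, there is no padding, and the ballot format and function names are invented for the example. A real deployment would use a vetted cryptographic library with a standard signature scheme and keep the private key on the voter's card.

```python
import hashlib

# Toy key built from two publicly known Mersenne primes. NOT secure.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (Python 3.8+)


def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sign with the private exponent D (held only by the voter)."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone with the public values (N, E) can check the signature."""
    return pow(signature, E, N) == digest_as_int(message)


ballot = b"ballot=000123;choice=candidate_a"
signature = sign(ballot)
print(verify(ballot, signature))  # True

# If the ballot is changed after signing, the signature no longer verifies.
modified = b"ballot=000123;choice=candidate_b"
print(verify(modified, signature))  # False
```

Note the tension with the confidentiality goal discussed earlier: a signature that identifies the voter must be handled so that it does not reveal how that person voted.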

Securing US Elections with Cryptography

That the US elections infrastructure has security issues is unquestionable. Numerous assessments have demonstrated that voting machines violate fundamental cybersecurity best practices. However, US law limits security researchers’ ability to perform testing of voting machines, and some voting machine manufacturers are pushing to make tests that they have not authorized (and where they cannot control the distribution of the results) illegal.

Despite these issues, options exist for improving the security of US elections. Maybe one day, the US will use modern encryption solutions to bolster election security.

The Problem with Cloud Storage for Secure File Sharing

In the modern business, the ability to quickly and easily share documents and other files throughout the business is essential. Everyone works in teams, and sharing documents via email or shared file servers is inefficient.

Cloud-based document sharing services, like Dropbox and Google Drive, offer a tempting alternative to traditional methods of document sharing. Tools like Google Docs enable an entire team to edit a document in parallel and track the complete revision history of the document, making it easy to attribute and revert edits.

However, these cloud-based document sharing services also have their downsides. Employees using these services have to make a choice between efficiency and security, and many choose efficiency. As a result, a number of organizations have suffered data breaches caused by employee negligence in configuring and securing cloud-based services.

Security Challenges of Cloud-Based File Sharing

Cloud Service Providers (CSPs) provide built-in security configuration settings for their environments. While the details vary from CSP to CSP, many of them operate on a simple private/public access model.

A private resource, as the name suggests, is private. In order to access the cloud-based resource, an employee needs to be explicitly invited to access the resource. On Google Drive, for example, this invitation comes in the form of the document owner (or other administrator) sending a sharing link to the person’s email address.

While this system is effective at securing access to the cloud-based resource, it also creates significant overhead for the document administrators. They must explicitly invite every user of the cloud-based resource and manually revoke permissions if access is abused. While Google Docs keeps an edit history, making it possible to detect such abuse, the document administrator would have to manually review this for anything suspicious.

The overhead associated with properly securing cloud-based resources drives many users to go to the opposite extreme. By marking the cloud-based resource as public, the employee can share access to the document simply by sharing the URL of the document with the desired recipient.

The primary benefit of this system is that anyone with access to the link has access to the resource, making it easy to invite new users. The primary downside of this system is that anyone with access to the link has access to the resource, making it easy for unauthorized users to discover and access the document.

Many people incorrectly believe that it is difficult to find the URL of a cloud-based resource if you are not an authorized user of the resource. Even ignoring the possible cases where an authorized user forwards the link to an unauthorized user, cloud URLs are relatively easy to discover. Hacking tools exist specifically for scanning the space of possible cloud URLs (they have a set scheme), checking if a given URL is valid, and checking if it is public. In fact, most known cloud data breaches were discovered in this way. An ethical hacker using these tools identified an unsecured cloud resource and notified the owner.

Beyond the access control issues associated with cloud resources set to “public,” there are also attribution issues. For example, Google Drive maintains a complete edit history for a document, making it possible to determine if a user has made unauthorized edits. However, knowing that “Anonymous Panda” was the one at fault doesn’t help much. Additionally, Google Drive doesn’t track access by anonymous users, so only those trying to modify the data (instead of just stealing it) would be detected.

Secure Document Sharing with GhostVolt

Cloud-based document storage, like Google Drive, has made significant strides toward making it possible to efficiently and effectively share documents within a team. However, these systems also have a ways to go.

Fortunately, more effective solutions for secure document sharing are available. GhostVolt Business takes the basic services that Google Drive (and similar services) provide and takes them a lot further.

All files are encrypted by default with AES-256, whether on a user’s personal machine or in the cloud, which ensures the security of business data. Access to these documents can be managed by defining specific user roles and assigning file permissions based on these roles. This makes it easy to map a user’s access to documents within the organization to their job responsibilities.

However, where GhostVolt really stands out is in the visibility that it provides regarding shared documents. All user activity is logged, and GhostVolt has built-in reporting functionality to summarize raw data into readable reports. This, combined with GhostVolt’s strong access controls, makes it easy to maintain and demonstrate compliance with a wide (and growing) range of data protection regulations, such as the EU’s General Data Protection Regulation (GDPR), the Payment Card Industry Data Security Standard (PCI DSS), and HIPAA, SOX, CCPA, and more in the US.

The Importance of Secure Document Sharing

As more regulations like GDPR and CCPA come into effect, organizations are required to strongly protect the data in their possession and to be able to demonstrate that these security controls are in place. On the other hand, the ability to quickly share data throughout the organization is essential to enabling the organization to operate efficiently.

A secure document sharing solution, with built-in encryption and strong role-based access control, is essential to maintaining regulatory compliance. However, it also needs to be intuitive and efficient to use to meet core business needs. When choosing a document sharing solution, an organization should not need to compromise on security, usability, or performance.

The Hidden Costs of a Data Breach

The Growing Threat of Breaches

Companies are collecting and storing ever-increasing amounts of customers’ personal data. While some organizations are doing so to perform mass-scale data mining, the average business is collecting data simply to carry out its core business practices. It’s pretty difficult to keep track of a user’s account without an email address or send a parcel without a shipping address.

Unfortunately, the troves of sensitive data that companies are collecting are a major target for hackers. An individual’s personal data can be used for a variety of malicious purposes (identity theft, spear phishing, and blackmail to name a few), so this information can fetch a pretty good price on the black market. The collections of sensitive data held by businesses are a treasure trove to any hacker who can access them.

As a result, data breaches are becoming increasingly common. In 2018, 5 billion records (or individual customers’ data collected by a business) were stolen by hackers. Since the cost of a data breach to the organization is often proportional to the number of records stolen, the impact of these breaches on the global economy is significant.

Hidden Costs of Data Breaches

Data breaches are expensive for the attacked organization. Some costs are directly related to managing the impacts of the breach (investigating the incident, paying fines, etc.), while some are more indirect. In 2019, a data breach cost a US company an average of $8.19 million. These costs are spread over a variety of different impacts.

Investigation and Remediation

After a data breach has occurred, the organization needs to perform an investigation to determine the scope of the breach and remove any traces of the attacker from the network. This can be a complicated and expensive proposition since:

  1. Many organizations don’t have the expertise in-house to investigate.

  2. Attackers cover their tracks to make investigation more difficult.

Once the incident investigation has been completed, the organization needs to pay the costs of remediation. This not only includes the price of implementing all of the cybersecurity protections that were lacking in the first place (allowing the breach to occur) but also the price of fixing any damage caused by the attacker while they had access to the system. The need to perform these investigations and mitigations quickly can also add to the price tag.

Reputational Damage

One of the hardest costs of a breach to quantify is the damage to an organization’s reputation after a breach. Their customers have trusted them to properly protect the sensitive data entrusted to them and the organization has failed to do so. After a breach, an organization has to endure numerous news reports and articles dissecting what they did wrong and how their security processes were inadequate.

A great example of the impact of reputational damage to a company due to a breach is the general feeling of the American public towards Equifax. The Equifax breach was over a year ago, yet people are still annoyed with the company. The Equifax breach may be an extreme case, since no one gave their data directly to Equifax (so they’re not happy with it being lost) and the breach was caused by gross negligence by the company (failure to patch a vulnerability that was being actively exploited for months before the breach). Even so, the current ill feelings toward the company demonstrate how an organization’s reputation can suffer after a breach.

Compliance Reporting and Penalties

In recent years, governments have been increasingly focused on protecting the personal data of their constituents. The EU’s General Data Protection Regulation (GDPR) is the most famous of these, but a variety of different nations and states have passed data privacy regulations to protect their citizens. These new regulations and standards are in addition to those already in effect, including PCI DSS, HIPAA, FISMA, SOX, and others.

As a result, the average organization may be required to achieve, maintain, and demonstrate compliance with several different regulations. In the event of a breach, this involves determining if the breach is reportable, whom to report it to, and how the report needs to be made. Once a report is filed, the organization needs to cooperate with regulators and may be fined for negligence as a result of the breach. The manpower and fines associated with this can dramatically increase the cost of a breach. For example, UK regulators announced their intention to fine British Airways roughly $230 million under GDPR for a 2018 data breach.

Notification and Compensation

After a data breach has occurred, the breached organization may be compelled to notify affected parties and provide compensation. While these notifications are commonly performed by email and have minimal cost to the organization, determining who needs to be notified can require significant effort.

Compensation costs can also be significant to an organization. Breached businesses commonly offer identity monitoring, but it is not uncommon for affected parties to file lawsuits against the organization for damages. The costs of litigation, settlements, and/or any damages awarded can add up quickly.

Loss of Future Revenue

The impacts of a breach in terms of lost future revenue are difficult to determine. Multiple surveys have found that as many as 70% of customers will stop buying from a company after a breach. However, the fact that the #deletefacebook movement fizzled despite all of the missteps that Facebook has made regarding properly using and protecting customer data demonstrates that this may not be the case. If customers have no viable alternative, they may stick with a breached company, but many organizations will see a drop off in sales after reporting a breach.

Protecting Against a Breach

The simplest way to prevent data breaches is to ensure that the organization’s defenses are never breached by a hacker. However, this expectation is unrealistic. Every company is likely to suffer at least one successful cyberattack, and most companies will be the victim of several.

Protecting against data breaches requires ensuring that, even if a hacker can breach the organization’s defenses, they can’t steal any valuable data. The best way to accomplish this is by encrypting data at rest.

Inside the Encryption Backdoor Debate

Encryption technology is designed to protect sensitive data from unauthorized access.  Communications applications with end-to-end encryption, like Telegram or Signal, are designed to ensure that an eavesdropper who intercepts the traffic en route doesn’t have the capability to read it.

Increasingly, government officials have been calling for backdoors to be built into all encryption algorithms.  The most famous and vocal proponent of the idea is the US Attorney General, William Barr; however, a meeting of representatives from the Five Eyes countries (Australia, Canada, New Zealand, UK, and US) recently expressed support for the proposal.

Despite the security concerns associated with them, encryption backdoors are required in some countries.  In December 2018, Australia ignored the advice of security and encryption experts and passed a law mandating backdoors in all encryption.

The Call for Backdoors

The argument for encryption backdoors essentially boils down to one of law enforcement.  Encryption is designed to keep all eavesdroppers out of a private conversation, including the police.  Encryption backdoor supporters, like Barr, believe that this ability to “go dark” has been causing an increase in unsolved criminal investigations.

As a result, enterprises may be required to install backdoors in encryption algorithms, regardless of the damage to personal privacy.  Barr has expressed the opinion that the general public doesn’t really need strong encryption since they’re only protecting personal emails and selfies and not the nuclear launch codes.

The problem with Barr’s justification for breaking encryption is that crime rates aren’t rising.  In fact, they’ve been steadily dropping since 1993.  While encryption denies law enforcement access to some forms of evidence, they still have all of the investigative techniques that existed before the rise of computers, and these techniques have consistently proven to be effective.

Backdoored Cryptography: Data Encryption Standard

This isn’t the first time that the US government has tried to weaken encryption for the purposes of law enforcement and/or national security.  In the past, the National Security Agency (NSA) was even successful in weakening one encryption algorithm: the Data Encryption Standard.

In 1973 and 1974, the National Bureau of Standards, the forerunner of the current National Institute of Standards and Technology (NIST), issued calls for an encryption algorithm that would become the official Data Encryption Standard of the US government.  One submission, developed by IBM, was eventually selected as the winner of the contest.

As part of the review process, the NSA had the ability to review and make suggestions regarding the encryption algorithm.  In the end, they made two major modifications that changed the substitution boxes (S-Boxes) used by the cipher and the key length.

The NSA’s change to DES’s S-Boxes was actually intended to improve the security of the cipher.  At the time, a particular method of cryptanalysis, called differential cryptanalysis, was not yet publicly known.  The original S-Boxes of the proposed algorithm were vulnerable to this attack, and the NSA’s modifications corrected this.

The other main change that the NSA made to the cipher was a reduction of the secret key length from 64 to 56 bits (though the NSA wanted 48-bit keys).  Removing 8 bits of key shrank the keyspace by a factor of 256 (2^8), making a brute-force search 256 times easier than against the original cipher, and led to DES first being broken by brute force in 1997. The original DES submission would have remained secure for a longer time.
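The factor of 256 follows directly from the key lengths. The short Python sketch below works through the arithmetic; the function names and the key-testing rate passed to `worst_case_days` are illustrative assumptions, not figures from any real attack.

```python
def keyspace(bits):
    """Number of possible keys for a key of the given length in bits."""
    return 2 ** bits

def weakening_factor(original_bits, reduced_bits):
    """How many times smaller the keyspace becomes when the key is shortened."""
    return keyspace(original_bits) // keyspace(reduced_bits)

def worst_case_days(bits, keys_per_second):
    """Days needed to try every key at a given (hypothetical) testing rate."""
    return keyspace(bits) / keys_per_second / 86_400

# Shortening the key from 64 to 56 bits removes 8 bits, i.e. a factor of 2**8.
print(weakening_factor(64, 56))

# Assumed rate of one billion keys per second, chosen only for illustration.
print(worst_case_days(56, 1e9), worst_case_days(64, 1e9))
```

Whatever rate is assumed, the 64-bit search always takes 256 times as long as the 56-bit search.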

The Challenge of “Secure” Backdoors

The main challenge with backdoors in encryption algorithms is that no one knows how to build them in a secure way.  In his calls for encryption backdoors, Barr pointed to three previous proposals for accomplishing it:

  1. Allowing law enforcement to silently enter as a participant in a conversation

  2. A hardware chip in phones that stores encryption keys that only law enforcement can access

  3. A multi-layer encryption algorithm that allows law enforcement access to the underlying data

These proposals are nothing new.  The first option has been tried in the past with unfortunate results, since it is difficult to ensure that only law enforcement has access to a backdoor.  The problem with the other two options is that they seem like a good idea, but no one knows how to implement them in a secure fashion.

The Bottom Line

The encryption backdoor debate is troubling since it involves lawmakers deliberately ignoring statements from security and cryptography experts that their requests are impossible to implement in a secure fashion.  If the backdoor proponents succeed, individual privacy could take a major hit.

And the crazy thing is that it won’t even work.  The primary difference between apps like WhatsApp and Telegram and messaging apps like Signal is that WhatsApp and Telegram are operated and maintained by companies and Signal is open-source.

This means that, if companies like Facebook and Telegram comply with the law, all of their users will have potential privacy violations.  However, the criminals that the law is designed to address (and who couldn’t care less about the law) will switch to Signal or other open-source apps where the backdoor can be removed from the code before using it.

The encryption backdoor law boils down to limiting people (and criminals) to only using certain apps on their phones and computers.  The number of jailbroken iPhones, rooted Androids, and unlicensed copies of Windows in use demonstrates that this is another problem that device manufacturers don’t know how to solve.

The Story of Cryptography Part 3: Modern Cryptography

This is the final post in a three-part series on the history of cryptography.  In the first post, we explored encryption algorithms that pioneered some of the core components of encryption: Caesar’s Box and the Vigenere Cipher.  From there, the second post described some of the historical cryptographic milestones of the 20th century: the Enigma machine, the Data Encryption Standard (DES), and the invention of asymmetric encryption.

In this post, we’ll explore how cryptography has evolved in the 21st century.  Beginning with DES’s replacement, AES, this post explores the use of Elliptic Curve Cryptography (ECC) and two areas of active cryptographic research: homomorphic encryption and post-quantum cryptography.

The Advanced Encryption Standard

The Data Encryption Standard (DES) was the first encryption algorithm publicly endorsed by the US government.  As a result, it was rapidly adopted in the US and beyond and also subjected to a great deal of scrutiny by cryptographers.  When the DES was defeated in 1997 by a brute-force key guessing attack, it was time for an update.

In 1997, NIST put out a call for proposals for the Advanced Encryption Standard (AES).  This resulted in the submission of fifteen candidate ciphers from around the world and a three-year selection process where cryptographers debated and attempted to break the various ciphers.

In the end, three ciphers from the Rijndael family of ciphers (developed by Belgian cryptographers) were selected as the AES.

The three variants have a 128-bit, 192-bit, or 256-bit secret key.  All three variants use a 128-bit block size (organized as a 4×4 grid of bytes) and multiple rounds.  Each round but the last includes four steps:

  • SubBytes: Application of S-Boxes to transform the state

  • ShiftRows: Cyclic shifting of the last three rows by 1, 2, and 3 bytes, respectively

  • MixColumns: Combination of the four bytes in each column of the state

  • AddRoundKey: Exclusive-or (XOR) of the state with the round key

Before the rounds begin, an AddRoundKey operation is performed, and the last round does not include MixColumns.  One difference between the different variants is the number of rounds: 10, 12, or 14. The other significant difference is the key schedule used to derive the 128-bit round keys from the 128-bit, 192-bit, or 256-bit secret key.
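The round structure described above can be sketched in a few lines of Python. This is a partial illustration rather than a usable AES implementation: it shows only ShiftRows, AddRoundKey, and the order of the steps, and leaves out SubBytes, MixColumns, and the key schedule. The function names and the list-of-rows state layout are assumptions made for this example.

```python
# Number of rounds for each secret-key length in bits, as given in the text.
ROUNDS = {128: 10, 192: 12, 256: 14}

def shift_rows(state):
    """Rotate row r of the 4x4 state left by r bytes (row 0 is unchanged)."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]

def add_round_key(state, round_key):
    """XOR every byte of the state with the matching byte of the round key."""
    return [[s ^ k for s, k in zip(s_row, k_row)]
            for s_row, k_row in zip(state, round_key)]

def round_plan(key_bits):
    """List the step names performed for a given key length, in order."""
    plan = [["AddRoundKey"]]  # initial AddRoundKey before the rounds begin
    full_round = ["SubBytes", "ShiftRows", "MixColumns", "AddRoundKey"]
    plan += [list(full_round) for _ in range(ROUNDS[key_bits] - 1)]
    plan.append(["SubBytes", "ShiftRows", "AddRoundKey"])  # last round: no MixColumns
    return plan

state = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
print(shift_rows(state))
print(len(round_plan(256)) - 1, "rounds after the initial AddRoundKey")
```

Because AddRoundKey is a plain XOR, applying it twice with the same round key returns the original state, which is what makes the step easy to undo during decryption.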

Security of the AES

AES is currently considered a secure cipher, as the only feasible attacks against the full version of AES use side-channel analysis, where power consumption, electromagnetic emissions, etc. are measured and used to infer internal values of the state of the cipher.  Other attacks against AES either operate on a reduced number of rounds or have a limited effect on the security.

The best known attack reduces the effective key lengths of the three variants to 126, 189.9, and 254.3 bits.  All of these key lengths are infeasible to brute-force on modern computing hardware.

Elliptic Curve Cryptography

In the previous post in this series, we discussed the invention of asymmetric encryption and how algorithms using it are still considered secure.  However, the security of these algorithms is based upon the key length used. As computing improves, brute force attacks on longer keys become feasible, forcing a move to even larger secret keys.

Elliptic curve cryptography (ECC) provides a limited solution to this problem.  An elliptic curve is the set of points satisfying a particular equation, together with a rule for “adding” two points to get a third point on the curve.  These point operations map to operations over the integers: addition of two points on an elliptic curve plays the role of multiplication of two integers, and multiplying a point by an integer (repeated addition) plays the role of exponentiation in the integers.  As a result, it is possible to construct the same kind of “hard” mathematical problems used in asymmetric cryptography using elliptic curves.
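To make the point operations concrete, here is a minimal Python sketch of point addition and scalar multiplication. It uses a tiny textbook-sized curve, y^2 = x^3 + 2x + 2 over the integers modulo 17, chosen only so the numbers are easy to follow; real ECC uses standardized curves over fields of roughly 256 bits. The names `point_add` and `scalar_mult` are assumptions for this example, and `None` stands in for the point at infinity.

```python
P_MOD, A = 17, 2   # toy curve y^2 = x^3 + 2x + 2 over the integers mod 17
G = (5, 1)         # a point on the toy curve, used as the generator

def point_add(p, q):
    """Add two curve points; None represents the point at infinity."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # a point plus its mirror image gives infinity
    if p == q:
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD)  # tangent line
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)         # chord line
    slope %= P_MOD
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)

def scalar_mult(k, point):
    """Compute k * point by double-and-add (the analogue of exponentiation)."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result

# Both parties reach the same point: 3 * (5 * G) equals 5 * (3 * G).
print(scalar_mult(3, scalar_mult(5, G)) == scalar_mult(5, scalar_mult(3, G)))
```

Computing `scalar_mult(k, G)` is fast, while recovering `k` from the resulting point is the elliptic curve version of the discrete logarithm problem. The modular inverse via `pow(x, -1, m)` requires Python 3.8 or later.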

The benefit of using elliptic curve cryptography instead of traditional integer-based algorithms like RSA and Diffie-Hellman is an increased level of security with shorter key lengths.  According to NIST, a 256-bit ECC private key provides equivalent security to a 3072-bit RSA key. ECC-based asymmetric algorithms also consume less energy than their integer-based counterparts, making them more efficient and usable as well.

However, the security advantages of ECC only apply to brute force attacks exploiting key lengths.  ECC algorithms rely on an elliptic curve version of the same kind of mathematical “hard” problem used by Diffie-Hellman (the discrete logarithm), meaning that an attacker who finds an “easy” solution to these “hard” problems can attack ECC-based encryption as well.

Next-Gen Cryptography

The Advanced Encryption Standard (AES) and Elliptic Curve Cryptography are in common use today, but they’re largely “solved” problems.  Some of the cryptographic algorithms currently in development today, like homomorphic encryption and post-quantum cryptography, are designed to expand the applications of cryptography or solve problems created by improvements in computing systems.

Homomorphic Encryption

Modern cryptography works very well at protecting sensitive data against attack with modern systems. However, it has one very significant shortcoming: you can’t process encrypted data.  Adding or multiplying the ciphertexts of two AES-encrypted values doesn’t produce the sum or product of their corresponding plaintexts. This means that data must be decrypted in order to be processed, and encryption only protects data at rest and data in transit.

Homomorphic encryption is designed to change this.  A fully homomorphic encryption algorithm allows the user to perform arbitrary computations on a ciphertext and then decrypt the result to the correct plaintext (with all computations applied).

The first fully homomorphic encryption algorithm was proposed by Craig Gentry in his 2009 PhD thesis.  Since then, the field has been actively expanded by Gentry and others to help improve the efficiency, security, and usability of FHE algorithms.
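A small example helps show what "computing on ciphertexts" means. The Python sketch below is not fully homomorphic encryption and is not Gentry's scheme; it uses unpadded "textbook" RSA, which happens to be homomorphic for multiplication only. The tiny primes and the function names are assumptions chosen for readability, and unpadded RSA with keys this small is completely insecure.

```python
# Toy textbook-RSA parameters (illustration only, never use sizes like this).
p, q = 61, 53
n = p * q                      # public modulus
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent (Python 3.8+ modular inverse)

def encrypt(m):
    return pow(m, e, n)

def decrypt(c):
    return pow(c, d, n)

def multiply_encrypted(c1, c2):
    """Multiply two ciphertexts without ever decrypting them."""
    return (c1 * c2) % n

c = multiply_encrypted(encrypt(7), encrypt(3))
print(decrypt(c))  # the product of the two plaintexts, as long as it is below n
```

The party doing the multiplication never sees 7, 3, or their product. A fully homomorphic scheme extends this idea to support both addition and multiplication, and therefore arbitrary computations.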

Post-Quantum Cryptography

Another area that is getting a lot of attention in the cryptography space is that of post-quantum cryptography.  In previous posts, we discussed how asymmetric encryption is based upon mathematical “hard” problems. With these problems, legitimate operations (multiplication and exponentiation) are polynomially difficult, but their inverses (factoring and logarithms) have a difficulty that grows exponentially with the length of the numbers used.

This asymmetry allows asymmetric cryptosystems to be developed that are usable but can achieve an arbitrary level of security against attack.  However, this only works if these problems remain “hard”, and an algorithm developed by Peter Shor in 1994 makes it possible for quantum computers to break this assumption.  When quantum computers grow powerful enough, traditional asymmetric encryption will be broken.

In anticipation of this day, cryptographers are actively developing post-quantum asymmetric cryptography algorithms.  These algorithms are based on mathematical problems that are “hard” for quantum computers as well. Several of these problems and algorithms exist, but there is still active research in the field to develop new ones and test existing ones to ensure that secure alternatives are available when traditional asymmetric cryptography breaks.

Wrapping Up

The goal of this three-part series is to provide an introduction to the history of cryptography.  Beginning with historical ciphers, moving to 20th century encryption systems, and concluding with modern cryptographic algorithms, this series highlights the inventions with the greatest impact on the cryptographic algorithms that we use today.

The Story of Cryptography Part 2: 20th Century Cryptography

In the first post of this series, we described the early history of cryptography.  This included the development and use of Caesar’s Box and the Vigenere cipher, encryption algorithms that pioneered the use of encryption and the encryption key.

It wasn’t until the 20th century that cryptography came into widespread use.  Notable encryption milestones from the 20th century include the development and use of the Enigma machine, the creation of the US Data Encryption Standard (DES), and the development of asymmetric encryption.

The Enigma Machine

The Enigma machine is the most famous electromechanical encryption machine in existence.  Developed by Arthur Scherbius in 1918, it implemented a substitution cipher whose mapping changed with every keypress.  Its several rotors and reconfigurable wiring allowed the machine to operate in many different configurations, and the chosen configuration acted as the secret key.

When used, the Enigma operator would type a letter on the keyboard and the corresponding letter of the ciphertext would light up.  Decryption was accomplished via the same process, which made Enigma practical for encrypting radio communications.  The Enigma machines were famously used by the Germans in WWII, whose strategy depended on rapid troop movements coordinated by radio, but versions were also employed by the Japanese and Italians.

Cracking the German Code

Breaking the encryption of the Enigma machine was a multistage process.  The model in use by the Germans had cryptographic vulnerabilities that gave a Polish cryptanalyst, Marian Rejewski, a way to attack the encryption keys in use.  However, he did not know the internal wiring of the machines, so he could not yet use this approach to break the codes.

This information was obtained from Hans-Thilo Schmidt, a German who spied for the French, along with the codes used for encryption in September and October 1932.  Using captured ciphertexts from this period, the Poles were able to build a working Enigma machine and decrypt German communications starting the following January.  This began a cat-and-mouse game between the Germans and the Poles until 1938, when German improvements made decryption too resource-intensive for the Poles to maintain.

In July of the following year, the Poles disclosed their Enigma decryption efforts to French and British military intelligence.  This partnership allowed the British to develop their Enigma decryption efforts at Bletchley Park and continue decryption of German communications for the rest of the war.  In the end, one of the deciding factors in their ability to decrypt Enigma communications was procedural errors, like failure to update encryption keys regularly.

Lucifer: The Data Encryption Standard

Before the early 1970s, the use of cryptography was primarily restricted to the military.  However, in 1971, the first encryption algorithms for private use were published. Companies had seen the effectiveness of encryption and were interested in applying it to the protection of their intellectual property.

The first set of civilian encryption algorithms was developed by IBM under the name Lucifer.  Several different variants were developed internally at IBM, and one of the variants was submitted to the US National Bureau of Standards (the precursor of NIST) in response to a call for proposals for the national Data Encryption Standard.

The submitted version of Lucifer was vulnerable to differential cryptanalysis, an attack only known to the NSA at the time.  However, after some modifications by the NSA, IBM’s Lucifer submission was accepted as the Data Encryption Standard (DES).

The accepted version of DES was a Feistel network, a multi-round encryption algorithm whose structure is shown in the images above, where the right image shows the operations performed in the boxes labeled F (for Feistel function) in the left image.  As shown, the message undergoes a permutation at the beginning and the end of the encryption process (IP and FP) and in each round of the Feistel function (labeled P). The E box in the right image represents an expansion of the half block from 32 to 48 bits, and each of the S-Boxes represents a substitution table that reduces 6 input bits to 4 output bits.  The cipher also has a key schedule, which generates 16 48-bit round keys from the original 56-bit secret key.
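The defining property of a Feistel network is that decryption uses exactly the same routine as encryption, just with the round keys in reverse order. The Python sketch below illustrates that structure on an 8-byte block. It is not DES: the round function here is a hypothetical stand-in built from SHA-256 rather than DES's expansion, S-Boxes, and permutation, and the round keys are supplied directly instead of coming from a key schedule.

```python
import hashlib

def round_function(half, round_key):
    """Hypothetical stand-in for F: hash the round key and half block together."""
    return hashlib.sha256(round_key + half).digest()[:len(half)]

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def feistel(block, round_keys):
    """Run an 8-byte block through one Feistel round per round key."""
    left, right = block[:4], block[4:]
    for key in round_keys:
        # The right half passes through unchanged; the left half is mixed with F(right).
        left, right = right, xor_bytes(left, round_function(right, key))
    return right + left  # final swap lets decryption reuse this same routine

def encrypt(block, round_keys):
    return feistel(block, round_keys)

def decrypt(block, round_keys):
    return feistel(block, list(reversed(round_keys)))

keys = [b"k1", b"k2", b"k3", b"k4"]   # illustrative round keys
ciphertext = encrypt(b"8 bytes!", keys)
print(decrypt(ciphertext, keys))
```

Note that `round_function` never has to be inverted, which is why the S-Boxes in DES can shrink 6 bits to 4 without making decryption impossible.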

Brute-Forcing the Data Encryption Standard

The NSA made several different modifications to the Lucifer cipher before it was accepted as the DES.  Some modifications, like changing the S-Boxes to protect against differential cryptanalysis, were designed to improve the security of the cipher.

Others, like changing the key length from 128 bits to 56 bits, may have been designed to ensure that the NSA still had the ability to break the cipher at need.  In the end, DES wasn’t broken based upon cryptographic vulnerabilities (though some have been identified), but by a brute force attack taking advantage of the limited keyspace.  DES was first broken in 1997, paving the way for the Advanced Encryption Standard.

DH and RSA: Asymmetric Encryption

The original encryption algorithms are all symmetric encryption algorithms.  These systems require the sender and recipient of a message to both have access to the same encryption key.  This requires that the users have the ability to set up a secure channel to transmit this key before encryption can begin.

However, in the 1970s, the invention of asymmetric cryptography changed this.  A series of encryption algorithms developed by the UK’s GCHQ made use of mathematical one-way functions to implement asymmetric encryption.  These algorithms allowed two parties to create a shared private key over a public channel or let a user generate a private/public keypair where the public key can be used for encryption of messages that can only be read using the private key.

The one-way functions used in most asymmetric cryptography are the factorization and discrete logarithm problems.  The security of these schemes relies on the fact that operations like multiplication and exponentiation are “easy” (with polynomial difficulty) and the inverse operations, factoring and logarithms, are “hard” (with exponential difficulty).  Asymmetric encryption algorithms are designed to allow legitimate users to perform “easy” operations while forcing attackers to solve “hard” problems. The asymmetric relationship between the attacker and the defender means that it’s possible to create algorithms that are usable but have an arbitrary level of security against brute-force attacks.
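A toy key exchange shows the "easy" and "hard" directions side by side. The Python sketch below uses deliberately tiny public parameters (modulus 23, base 5) and made-up private values so the numbers stay readable; the function names are assumptions for this example, and real deployments use moduli thousands of bits long.

```python
P, G = 23, 5   # toy public parameters: prime modulus and base

def public_value(private_key):
    """The 'easy' direction: modular exponentiation."""
    return pow(G, private_key, P)

def shared_secret(their_public, my_private):
    """Each party raises the other's public value to their own private key."""
    return pow(their_public, my_private, P)

def brute_force_log(public):
    """The 'hard' direction: try every exponent until one matches."""
    for x in range(P - 1):
        if pow(G, x, P) == public:
            return x
    return None

alice_private, bob_private = 6, 15          # illustrative secrets
alice_public = public_value(alice_private)  # sent over the public channel
bob_public = public_value(bob_private)      # sent over the public channel

# Both sides derive the same secret without ever transmitting it.
print(shared_secret(bob_public, alice_private) == shared_secret(alice_public, bob_private))

# An eavesdropper must search for the exponent; trivial here, infeasible at real sizes.
print(brute_force_log(alice_public))
```

With a 23-element search space the attacker wins instantly, but the loop in `brute_force_log` grows exponentially with the bit length of the modulus while the cost of `pow` grows only polynomially.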

While GCHQ cryptographers originally developed asymmetric cryptography in the early 1970s, their results and algorithms remained classified until 1997.  As a result, the algorithms were independently discovered by private researchers and are named after these parties.

The original key exchange protocol was invented by Malcolm J. Williamson of GCHQ in 1974 and is named after Whitfield Diffie and Martin Hellman, who discovered it in 1976.  The RSA algorithm for asymmetric encryption and decryption was discovered by Clifford Cocks in 1973, and a more general version was invented by Ron Rivest, Adi Shamir, and Leonard Adleman in 1977.  Since then, several different asymmetric encryption algorithms have been invented.

Coming Up: Modern Cryptography

The encryption algorithms presented in this article are largely broken, with the exception of the asymmetric algorithms Diffie-Hellman and RSA.  However, even the security of these algorithms is threatened by advances in computers. The final post in this series describes some of the modern encryption algorithms currently in use or development that are designed to replace these algorithms as they are broken.

The Story of Cryptography Part 1: Historical Cryptography

Cryptography is the science of secrets.  Literally meaning “secret writing”, cryptography is designed to hide information from all but its intended recipients.  Modern cryptography is essential to the secure Internet, corporate cybersecurity, and blockchain technology.

However, cryptography has a very long history before the modern ciphers were invented.  In this three-part series, we’ll explore the history of cryptography before the 20th century, in the 20th century, and in the modern day.

Caesar’s Box

Caesar’s Box is one of the earliest known ciphers.  Dating to the first century BC, it was used by Julius Caesar to send secret messages to his generals in the field.  Since messengers could easily be waylaid by the enemy en route, the use of even a simple cryptographic algorithm to encode his orders and his generals’ responses could give him a significant strategic advantage: he could intercept and read his opponents’ messages, but they couldn’t read his.

Caesar’s Box, as described here, is a particular implementation of a shift cipher (which is a specific type of substitution cipher) and is more commonly called the Caesar cipher.  The encryption algorithm involved shifting each letter in the message three letters to the right to produce the ciphertext.  For example, A became D, B became E, and X became A. Decryption involved reversing this process by shifting each letter of the ciphertext three steps to the left.

In Caesar’s Box, the secret step amount was three, but this isn’t the only option.  Other shift ciphers (like ROT13) use different numbers of steps (like 13). However, all shift ciphers are insecure and can be easily broken.
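The shift cipher described above takes only a few lines of Python. This is a minimal sketch that assumes the 26-letter English alphabet, converts everything to uppercase, and passes spaces and punctuation through unchanged; the function names are chosen for this example.

```python
import string

ALPHABET = string.ascii_uppercase

def shift_cipher(text, shift):
    """Shift each letter by `shift` places, wrapping around at the end of the alphabet."""
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)  # leave spaces and punctuation alone
    return "".join(out)

def caesar_encrypt(plaintext):
    return shift_cipher(plaintext, 3)    # shift of three, as in the text

def caesar_decrypt(ciphertext):
    return shift_cipher(ciphertext, -3)  # undo by shifting three to the left

print(caesar_encrypt("ABX"))
print(caesar_decrypt(caesar_encrypt("ATTACK AT DAWN")))
```

ROT13 is the same function with a shift of 13. Because 13 is half of 26, applying it twice returns the original text.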

Frequency Analysis: Breaking Caesar’s Box

It turns out that the hardest part of breaking Caesar’s Box is knowing the language of the message that it encodes.  Once you know that, it’s easy to break the cipher using frequency analysis. If that fails, a brute force attack against the cipher can work, since there are only 26 possible shifts in English.

Frequency analysis attacks take advantage of the fact that not all of the letters in the English alphabet are used equally.  E is the most commonly used letter, and you hardly ever see a word using z. In fact, this is the first time it appears in this article.  The relative frequencies of each letter in the English language are shown in the graph below.

The frequency of letter usage in English is important because Caesar’s Box does nothing to change it.  In a ciphertext encrypted with Caesar’s Box, the most common letter is likely to be h, which is e shifted three places.

Knowing this, it’s easy to determine the shift factor for Caesar’s Box or any other shift cipher out there.  With this information, the rest of the ciphertext can be easily decrypted. In the event that the shift factor is incorrect (a sentence may have more t’s than e’s for example), there are few enough options for a shift cipher that it’s easy to try them all.
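The attack above can be sketched in a few lines of Python.  This is only an illustration under simple assumptions: the message is English, the most common ciphertext letter stands for E, and the ciphertext contains at least one letter.  The helper names (shift_text, guess_shift, brute_force) are hypothetical.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase


def shift_text(text: str, shift: int) -> str:
    """Shift every letter by `shift` places (negative shifts move left)."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) + shift) % 26] if c in ALPHABET else c
        for c in text.upper()
    )


def guess_shift(ciphertext: str) -> int:
    """Assume the most common ciphertext letter is an encrypted E."""
    letters = [c for c in ciphertext.upper() if c in ALPHABET]
    most_common = Counter(letters).most_common(1)[0][0]
    return (ALPHABET.index(most_common) - ALPHABET.index("E")) % 26


def brute_force(ciphertext: str) -> list:
    """Fallback: the candidate plaintext for each of the 26 shifts."""
    return [shift_text(ciphertext, -s) for s in range(26)]
```

If the guess from guess_shift produces gibberish (for example, because the message has more t’s than e’s), the list returned by brute_force is short enough to scan by eye.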

Vigenere’s Cipher

Caesar’s Box may be the first famous cipher in existence, but it’s missing something rather important for modern ciphers: a secret key.  While the number of steps to shift may be considered the secret key for a general shift cipher, this value is hardcoded as three for Caesar’s Box.  As a result, the security of Caesar’s Box relies on the fact that no one knows how it works, a practice called “security through obscurity” that violates Kerckhoffs’s principle.

Vigenere’s cipher was created in the 16th century and introduced the concept of a secret key.  In Vigenere’s cipher, the secret key is another word or phrase that may be shorter than the plaintext to be encrypted.  If this is the case, the key is repeated until it matches the length of the plaintext.

To encrypt using Vigenere’s cipher, you convert the letters in the plaintext and the key to numbers in the range 0-25, add each pair of numbers together, and calculate the result modulo 26 (the remainder after dividing by 26).  The output of this calculation is mapped back to a letter and used as a character of the ciphertext.

The image above shows a lookup table for performing encryption with the Vigenere cipher.  The columns are the letters of the secret key and the rows are the corresponding letters of plaintext.  Their intersection is the letter of ciphertext created from a given pair of plaintext and key letters.
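The add-and-reduce step can be written directly in code.  The following is a minimal sketch, not a hardened implementation: the function name vigenere and its decrypt flag are choices made for this example, and it assumes both the text and the key contain only the letters A-Z.

```python
import string

ALPHABET = string.ascii_uppercase


def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Add (or subtract, when decrypting) key letters to text letters mod 26."""
    sign = -1 if decrypt else 1
    key = key.upper()
    out = []
    for i, ch in enumerate(text.upper()):
        k = ALPHABET.index(key[i % len(key)])  # repeat the key as needed
        p = ALPHABET.index(ch)
        out.append(ALPHABET[(p + sign * k) % 26])
    return "".join(out)
```

For example, encrypting ATTACKATDAWN with the key LEMON gives LXFOPVEFRNHR, and calling the function again with decrypt=True recovers the plaintext.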

Cryptanalysis of Vigenere’s Cipher

Like Caesar’s box, the Vigenere cipher can be broken using frequency analysis.  However, it takes a little more work. In order to break Vigenere’s cipher, it’s necessary to first determine the period of the cipher and then apply frequency analysis.

The period of the Vigenere cipher is the length of the secret key used for encryption.  For example, encryption of a 36-letter plaintext with the key CIPHER would actually use a key of CIPHERCIPHERCIPHERCIPHERCIPHERCIPHER.

Once you know the period of the cipher, it’s possible to decrypt it just like a Caesar cipher.  Note from the image above that each particular letter of the key just creates a shift of a certain amount.  If you know the letters of the ciphertext that used the same shift value, you can apply frequency analysis to the cipher.

This is why the period of the cipher is important.  Note that the first, seventh, etc. letters of the sample key are all C (a shift of 2).  Through either guess and check or calculation of the Index of Coincidence, it’s possible to determine the actual period, shift values, and plaintext of the cipher.
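One way to test a candidate period is to split the ciphertext into that many columns (every letter encrypted with the same key letter lands in the same column) and compute the Index of Coincidence of each column.  The sketch below assumes the ciphertext has already been reduced to letters only; the function names are illustrative.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """Probability that two letters drawn at random from `text` match."""
    n = len(text)
    if n < 2:
        return 0.0
    counts = Counter(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def average_ic(ciphertext: str, period: int) -> float:
    """Split the ciphertext into `period` columns and average their IC."""
    columns = [ciphertext[i::period] for i in range(period)]
    return sum(index_of_coincidence(col) for col in columns) / period
```

In practice, you would compute average_ic for each candidate period and look for values near 0.067, the figure usually quoted for English text; uniformly random letters give about 0.038.  Each column at the correct period is then an ordinary shift cipher.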

Coming Up: 20th Century Ciphers

Caesar’s Box and the Vigenere cipher are two of the earliest known ciphers.  They pioneered the use of encryption to protect sensitive communications data and the use of a secret key in encryption.  In the next post in this series, we’ll move forward to the 20th century. There, we’ll see how cryptography evolved when driven both by military interests and organizations protecting their intellectual property.

Blockchain Security vs. Crypto Hacks

The blockchain is an amazing technology that enables a whole host of applications that were not previously possible.  The ability to create and maintain a secure, decentralized digital ledger allows organizations to function without requiring trust in a centralized authority.

However, the big question in blockchain is “is it secure?”  Blockchain apologists will assure you that the blockchain is completely secure and unhackable; however, the news cycle almost constantly has stories of blockchains being hacked and cryptocurrency assets being stolen, demonstrating that the blockchain is most certainly not secure.

These positions seem mutually exclusive; however, there is a lot of truth in both of them.  Understanding how the blockchain is secure and how it is still hackable is a vital step when deciding (or continuing) to use this new technology.

Blockchain: Secure by Design

Blockchain is designed to be a completely secure protocol for implementing a decentralized and distributed ledger.  This allows the network to maintain a trustworthy record of its history without relying upon a centralized authority to do so.  In order to accomplish this, the system needs a means to authenticate its users and a way to maintain the integrity and immutability of the ledger.

Authentication in the blockchain is dependent on public key cryptography.  This type of cryptography (also called asymmetric key cryptography) uses a keypair consisting of a related private and public key.  The private key can be used to decrypt messages encrypted with the corresponding public key and to generate digital signatures that can be verified with the public key.  These digital signatures are an essential part of blockchain technology’s security since they prove that someone with access to a given private key generated a given transaction or block.

The other important feature of the blockchain is its immutability: it’s difficult or impossible to change the ledger after the fact.  This is enabled by a combination of a few different features: digital signatures protect the integrity of transactions and blocks, blockchain consensus algorithms make it difficult for an attacker to create a fake version of a block, and hash functions link the blocks in the ledger together and ensure that any modifications to blocks are detectable.  All together, these components create a ledger that is extremely resilient to modification.
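To make the role of hash functions concrete, here is a toy Python sketch of a hash-linked ledger.  It deliberately leaves out signatures and consensus, and the record format and function names are assumptions made for this example rather than any real blockchain’s block structure.

```python
import hashlib

GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(prev_hash: str, data: str) -> str:
    """Hash a block's contents together with its predecessor's hash."""
    return hashlib.sha256((prev_hash + data).encode()).hexdigest()


def build_chain(records):
    """Return a list of (prev_hash, data, hash) tuples, one per record."""
    chain, prev = [], GENESIS
    for data in records:
        h = block_hash(prev, data)
        chain.append((prev, data, h))
        prev = h
    return chain


def is_valid(chain) -> bool:
    """Recompute every hash and check each block points at the one before."""
    prev = GENESIS
    for stored_prev, data, stored_hash in chain:
        if stored_prev != prev or block_hash(stored_prev, data) != stored_hash:
            return False
        prev = stored_hash
    return True
```

Changing the data in any block changes its hash, so either that block’s stored hash no longer matches or the next block’s link is broken.  Either way, is_valid reports the modification.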

The underlying design of the blockchain works, and it works well.  While some blockchains have been exploited due to built-in vulnerabilities, these vulnerabilities have always either been deviations from the original blockchain design or created by poor implementations of the blockchain protocol.  The underlying design of the blockchain is secure against attacks using modern technology.

The Cryptocurrency Security Paradox

The blockchain is theoretically secure; however, you can hardly go a month without hearing about some cryptocurrency being hacked.  So what happened? Either blockchain is secure or it isn’t.

Robbing the Bank

Cryptocurrency hacks occur because of factors outside the control of the blockchain network.  Blockchain is designed to create decentralized systems and allow people to “be their own bank”.  However, this isn’t what many cryptocurrency users do. Instead, they take advantage of cryptocurrency exchanges and online crypto wallets.

Cryptocurrency exchanges are enticing because they provide a lot of convenience to their users.  Instead of managing their own private keys, users store them with the exchange. In turn, the exchange simplifies the process of performing transactions, making trades, etc.

The downside of exchanges is that they provide a centralized target for hackers attempting to steal these valuable private keys.  Like any other website, access to your exchange account is probably managed by a username and password and maybe a two-factor authentication (2FA) system.  Usernames and passwords can be guessed or phished and many 2FA solutions can be easily defeated. If this occurs, the hacker then has access to your account and your private key.

The security model of public key cryptography is based upon your private key remaining secret.  Anyone with that key is essentially “you” on the blockchain and can perform transactions in your name (including draining your account).  As a result, hacking an exchange enables the hacks that make headlines.

Bandits on the Loose

Exchange hacks are bad enough, but they don’t account for every hack on the blockchain.  A variety of other attack vectors exist that essentially boil down to cryptocurrency users doing something unwise.

The “Blockchain Bandit” is a cryptocurrency thief who took advantage of poor habits when generating private keys for use on the blockchain.  Whether to help them remember their keys or for other reasons, users generated weak keys and used them to store value on the blockchain. The “Bandit” caught on and began automatically scanning for and stealing from these weak addresses, netting them almost 45,000 Ether.

Another mistake that has lost cryptocurrency users some Ether is misconfiguration of their blockchain software.  A setting (disabled by default) allowed programs to interact with the wallet software via Remote Procedure Call (RPC) and perform transactions.  Users who enabled this setting and didn’t protect it properly were prey to an attacker who scanned for open RPC ports and stole over $20 million in Ether.

Using Blockchain Securely

Blockchain technology is designed to be secure, but, like any system, it doesn’t stay secure if you give the keys away or leave the front door unlocked.  The paradox between the security of the blockchain protocol and the rampant hacks of cryptocurrency is caused by people misusing the system, not any problem with the system itself.

Blockchain security is a complicated issue, but, from a user perspective, there are simple things that can be done to improve personal security on the blockchain.  Even steps as simple as properly protecting your own secret keys (i.e. not entrusting them to an exchange) and ensuring that any blockchain software that you use is secure and updated can make a huge difference in your ability to protect your cryptocurrency assets.

Threat Modeling for the Blockchain

Blockchain technology is an exciting new technology with a great deal of potential. With this potential comes the need to explore the security of this new technology. There has been a great deal of work in this space; however, no comprehensive threat model exists that classifies all potential threats and attack vectors within the blockchain ecosystem. When discussing potential security threats to a system and attempting to analyze whether a system is secure by design, it is extremely useful to have a framework for classifying known attacks and pointing out ones that may have been overlooked. In this post, blockchain security threats are mapped to STRIDE, a well-known threat model developed by Microsoft, to create an effective threat model for the blockchain.

STRIDE and the Blockchain

The STRIDE framework was developed by Microsoft to help in threat modeling. Each letter in the STRIDE acronym is designed to refer to one of the most common threats in cybersecurity:

  • Spoofing: Spoofing refers to the ability of the attacker to masquerade as another on the system.

  • Tampering: Tampering attacks violate the integrity of the data stored on the protected system.

  • Repudiation: Repudiation is the ability of a user to deny that they have taken a certain action.

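  • Information Disclosure: Breaches of confidentiality fall under information disclosure.

  • Denial of Service: Denial of Service attacks degrade or deny legitimate users’ access to the system.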

  • Elevated Privileges: If a user manages to gain unauthorized levels of control over the system, this is a privilege escalation attack.

    • In the context of the blockchain, we can break up elevated privileges based upon whether the attacker has unauthorized access to a user’s account, an elevated level of control over the blockchain system (i.e. in a 51% attack), or unauthorized permissioned access to a smart contract.

The STRIDE framework is useful for defining the potential effects that certain vulnerabilities or attacks can have on the security of a system. However, blockchain systems are a complete environment, including everything from the cryptographic primitives that underpin their security to the smart contracts that extend the functionality of the blockchain system.

In order to have a meaningful discussion about a blockchain threat model, it’s useful to break up the blockchain ecosystem into its various levels. For the purposes of this post, the following breakdown is used:

  • Fundamentals: The underlying components used to build the blockchain.

    • Cryptographic Primitives: The hash functions and public key cryptography used to ensure data integrity and provide user authentication.

    • Data Structures: The structure of the blocks used to store transaction data and the hash functions used to chain them together.

  • Protocols: The definitions of how blockchain nodes should interact when working to maintain the shared distributed ledger.

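    • Consensus: The blockchain consensus algorithm defines how the blockchain is updated in a decentralized fashion.

    • Block Creation: The block creation process defines how the selected block creator creates new blocks and ensures their validity.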

  • Infrastructure: The nodes that work to maintain the distributed ledger and the network that they use to communicate.

    • Nodes: Computers running the blockchain software and maintaining a copy of the distributed ledger.

    • Network: The underlying network that the nodes use to communicate and the protocols that define how communications occur within the blockchain ecosystem.

  • Advanced: Many blockchain solutions do not limit themselves to the basic blockchain protocol defined in the Bitcoin whitepaper. These advanced components are an important part of these blockchains’ security and their threat model.

    • Smart Contracts: Smart contracts allow third-party code to be uploaded to and executed on the distributed ledger.

    • Blockchain Extensions: The basic blockchain technology can be extended by systems built either on top of it (state channels, side chains, etc.) or through connections to external systems via APIs.

With the STRIDE threat model and the framework of the blockchain ecosystem, we have what we need to begin threat modeling for the blockchain.

Blockchain Threat Modeling

The blockchain threat model is presented below. Using the STRIDE model and the levels of the blockchain ecosystem, it’s possible to classify each attack vector based upon its potential effects. Under each component, attacks that share a STRIDE effect are grouped together, and each attack vector is followed by a description of how that effect can be accomplished by the attack.

Threat categories (STRIDE effect and the property it concerns):

  • Spoofing: Authenticity
  • Tampering: Integrity
  • Repudiation: Non-Repudiation
  • Information Disclosure: Confidentiality
  • Denial of Service: Denial of Service
  • Elevated Privileges: Privilege Escalation

    • Account: Attacker has unauthorized access to blockchain account.
    • Blockchain: Attacker has unauthorized level of control over blockchain.
    • Smart Contract: Attacker has unauthorized access to protected smart contract functionality.

Attack vectors by ecosystem level:

  • Fundamentals: Blockchain is based upon cryptographic primitives and the block and chain data structures.

    • Cryptographic Primitives: Hash functions and public key cryptography are essential to access control and data integrity on the blockchain.

      • Private Key: Compromising a user's private key allows an attacker to generate transactions on their behalf.
      • Phishing: Phishing emails can be used to steal private keys, which allows the attacker to masquerade as a legitimate user.
      • Shor's Algorithm: Shor's algorithm breaks traditional asymmetric cryptography, allowing an attacker to forge digital signatures on transactions and blocks.

      • Grover's Algorithm: Grover's algorithm decreases the security of hash functions, making it easier for an attacker to find collisions and break blockchain immutability.

      • Private Key: Compromising a user's private key allows an attacker to read any encrypted data meant for them.
      • Shor's Algorithm: Shor's algorithm breaks traditional asymmetric cryptography, allowing an attacker to decrypt encrypted messages.

      • Private Key: Compromising a user's private key gives an attacker unauthorized access to their account.
      • Shor's Algorithm: Shor's algorithm breaks traditional asymmetric cryptography, allowing an attacker to guess a user's private key and access their account.

    • Data Structure: Blockchain has defined formats for transactions and blocks. Vulnerabilities in these data structures or how they are processed can impact blockchain security.

      • Transaction Malleability: The hash of a transaction depends upon the transaction's digital signature. This can be regenerated by the original signer, creating an identical transaction with a different hash.

  • Protocol: Blockchain protocols like consensus algorithms and the block creation process codify how the network interacts and maintains a decentralized, distributed ledger.

    • Consensus: The blockchain consensus algorithm defines how the blockchain is updated in a decentralized fashion.

      • 51%: A 51% attack allows the attacker to rewrite the history of the blockchain, breaking its integrity.
      • Long-Range: In a long-range attack, the attacker generates a conflicting version of a Proof of Stake blockchain and gets it accepted, breaking the integrity of the distributed ledger.
      • Nothing at Stake: In a Nothing at Stake attack, a Proof of Stake block forger signs two conflicting versions of the blockchain.

      • 51%: In a 51% attack, the attacker rewrites the history of the blockchain, allowing them to deny that past transactions are part of the official ledger.
      • Long-Range: In a long-range attack, the attacker rewrites the history of the blockchain, allowing them to deny that past transactions are part of the official ledger.

      • 51%: A 51% attacker controls the blockchain and can refuse to add transactions to it, performing a DoS attack against its users.
      • Artificial Difficulty Increases: If an attacker suddenly withdraws a large percentage of a Proof of Work network's mining resources, the block difficulty target is too high for the remaining nodes. Since blocks cannot be found at the desired block rate, this implements a DoS attack.
      • Long-Range: A long-range attacker controls the blockchain and can refuse to add transactions to it, performing a DoS attack against its users.

      • 51%: A 51% attack gives the attacker control of the distributed ledger.
      • Long-Range: A long-range attack gives the attacker control of the distributed ledger.
      • Selfish Mining: Selfish mining allows the attacker to create more blocks than their percentage of mining power should allow. This increases their level of control over the distributed ledger.
      • SPV Mining: SPV mining allows the attacker to create more blocks than their percentage of mining power should allow. This increases their level of control over the distributed ledger.

    • Block Creation: The block creation process defines how the selected block creator creates new blocks and ensures their validity.

      • Frontrunning: Blockchains publish transactions to the entire network before adding them to the distributed ledger. An attacker who sees a transaction can create a competing one with a higher transaction fee so that it is processed before the transaction that was created first.

      • Transaction Flooding: By flooding the blockchain network with spam transactions, an attacker uses up the blockchain's capacity, delaying the addition of other blocks to the ledger. Also, any spam transactions that are included in the ledger are retained forever, consuming storage and processing resources on the nodes.

      • SPV Mining: SPV mining allows the attacker to create more blocks than their percentage of mining power should allow. This increases their level of control over the distributed ledger.

  • Infrastructure: Blockchain infrastructure consists of the endpoints running blockchain software and the network that connects them.

    • Nodes: Exploitation of the computers running the blockchain software.

      • Malware: Malware can be used to steal private keys, which allows the attacker to masquerade as a legitimate user.

      • Malware: Malware can be used to perform eclipse and routing attacks. It can also be used to steal private keys, allowing the attacker to create fake transactions on the user's behalf.

      • Malware: Malware can be used to intercept communications or steal private keys, allowing an attacker to view private or permissioned data without authorization.

      • Failure to Update: Failing to update blockchain software could mean that a user does not follow a hard fork and cannot access the blockchain.
      • Malware: Malware on a user's computer can impede access to the blockchain at a variety of levels, including filtering or blocking traffic and terminating blockchain processes. This both denies access to them and degrades the efficiency of the blockchain since the user cannot contribute to block creation.

      • MSP Misconfig: A misconfigured Membership Services Provider (MSP) could allow an attacker to grant themselves unauthorized permissions on the blockchain.

    • Network: The blockchain runs on traditional networking. Attacking this network can impact the security of the blockchain.

      • Eclipse/Routing: Eclipse and routing attacks rely on isolating users, which can be accomplished by attacking the network level. An attacker can perform double-spend against users in different isolated pieces of an eclipsed network.
      • Network Design: A poorly designed network can enable an eclipse or routing attack by limiting the number of connections between different groups of users in the network. Overwhelming communication links can also essentially isolate different portions of the network.

      • Network Design: If a private or permissioned blockchain relies on the security of the underlying network to manage access, an attacker may be able to gain visibility by compromising network components (routers, etc.).

      • Eclipse/Routing: Eclipse and routing attacks can be performed at the network level by destroying or filtering communication links. Isolating portions of the network from one another decreases the block rate and causes the shorter chain to be discarded when the network reconnects.
      • Network Design: A poorly designed network may not be capable of managing the overhead necessary for a blockchain system, so bandwidth limitations could impact functionality.
      • Physical Attacks: An attacker physically severing communication links or tampering with devices (routers, etc.) could cause the functionality of the blockchain solution to be degraded.
      • PoS DoS: A Denial of Service attack against the legitimate block creator in a Proof of Stake blockchain means that an opportunity to create a block may be missed. This decreases the efficiency and capacity of the blockchain.
      • MSP DoS: A Denial of Service attack against a Membership Services Provider (MSP) may deny legitimate users access to the blockchain system.

      • Eclipse/Routing: An eclipse or routing attack allows an attacker to corrupt a user's view of the blockchain and get them to act in the attacker's interests. This can give the attacker a level of control over the blockchain greater than they should have based on their percentage of the scarce resource (computational power, stake, etc.).

  • Advanced: The basic blockchain protocol has been extended by the creation of smart contract platforms and allowing connections to external software and devices through APIs.

    • Smart Contracts: Smart contracts extend the functionality of the basic blockchain protocol by allowing third-party code to run on the distributed ledger.

      • Delegatecall: Delegatecall allows a smart contract to run in the scope of another smart contract. This can give the attacker unauthorized access to protected functionality within the smart contract.

      • Arithmetic: Integer overflow and underflow vulnerabilities can be exploited to bypass checks on transactions and other protected operations, allowing the attacker to perform unauthorized actions.
      • Bad Randomness: Generating strong randomness is difficult in smart contracts, making it possible for attackers to cause smart contracts to take unanticipated actions.
      • Reentrancy: Reentrancy vulnerabilities allow malicious smart contracts to force vulnerable ones to take unauthorized actions.
      • Short Addresses: Short address vulnerabilities trick vulnerable smart contracts into performing transactions with a greater amount of value than was authorized.
      • Timestamp Dependence: Some smart contracts are designed to take action before or after a specific time. Since time on the blockchain is flexible and dependent on block creators, a malicious block creator can force unanticipated behavior.
      • Unchecked Returns: In Ethereum, some low-level functions throw an exception and others return false and continue running upon failure. Failing to check return values may cause a smart contract to continue executing after an unexpected failure.

      • Access Control: Some smart contracts have protected kill switches. A failure in controlling access to these functions can allow a DoS attack against these contracts.
      • Out of Gas: Ethereum limits the amount of gas that a transaction can use. Forcing a smart contract into a state where it needs more gas than the limit to run can make it incapable of running.

      • Access Control: Poor management of access control within a smart contract can give an attacker elevated privileges within the contract.
      • Delegatecall: The use of delegatecall allows a called smart contract to run code with the privileges of the calling smart contract.

    • Blockchain Extensions: Blockchain extensions build on top of the blockchain protocol (like state channels and side chains) or connect blockchains to external software via APIs.

      • Insecure APIs: Exploitation of external software or hardware with access to a blockchain account may allow an attacker to perform actions masquerading as that account's owner.

      • Insecure APIs: Exploitation of external software or hardware with access to a blockchain account may allow an attacker to gain access to protected functionality available to that account's owner.

This blockchain threat model represents my personal attempt to classify the currently known attack vectors against blockchain systems and is designed to be a constant work in progress as new attack vectors are discovered against blockchain systems. I plan to continue to update and refine this model and would appreciate any comments or input.

Introduction to Blockchain: Extensions

Blockchain technology provides users with a number of advantages not present in traditional systems.  Blockchains are the first fully distributed and decentralized system that is capable of maintaining a shared, trusted ledger.  This allows a network to keep a record of its history and be confident that malicious users are not capable of modifying this history to their own benefit.

However, blockchain technology isn’t perfect.  Bitcoin was originally designed to replace traditional payment systems (like credit cards); however, by itself, it doesn’t have the ability to do so.  Blockchain technology has limitations, and blockchain extensions have been developed to help mitigate or eliminate these.

Limitations of Blockchain

Blockchains have a very specific structure.  Due to the need for the network to remain synchronized and for the network to validate all transactions, transactions cannot be continuously added to the distributed ledger.  Instead, transactions are organized into blocks, which are added to the distributed ledger at regular intervals. This design limits the speed and capacity of the blockchain solution.

The speed at which transactions are added to the distributed ledger is severely limited on the blockchain.  Blockchains typically have a target block rate that is enforced at some level by their consensus algorithm.  For example, Bitcoin targets one block every 10 minutes, meaning that, with the three block rule, you may have to wait half an hour before a transaction is considered trustworthy.  This compares unfavorably with credit cards, where a “slow” transaction is done in a minute.

Blockchains also have an issue with a maximum capacity.  In addition to the set block rate, many blockchains have a set maximum block size designed to protect against Denial of Service (DoS) attacks.  With fixed-size blocks created at fixed intervals, the blockchain can only process so many transactions in a time period, and this capacity is often far below that of the payment card system.
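The arithmetic behind both limits is simple enough to write down.  The sketch below is a back-of-the-envelope calculation, not a model of any real network: the function names are made up, and the block and transaction sizes in the usage lines are hypothetical figures chosen only for illustration.

```python
def confirmation_wait(block_interval_min: float, confirmations: int) -> float:
    """Minutes until a transaction is buried under `confirmations` blocks."""
    return block_interval_min * confirmations


def max_tx_per_second(max_block_bytes: int, avg_tx_bytes: int,
                      block_interval_s: float) -> float:
    """Upper bound on throughput when fixed-size blocks arrive at a fixed rate."""
    tx_per_block = max_block_bytes // avg_tx_bytes
    return tx_per_block / block_interval_s


# A 10-minute block rate and the three block rule, as described above.
print(confirmation_wait(10, 3))
# Hypothetical sizes: 1,000,000-byte blocks and 250-byte transactions.
print(round(max_tx_per_second(1_000_000, 250, 600), 2))
```

Under these assumed sizes, a block holds 4,000 transactions, so one block every 600 seconds works out to fewer than seven transactions per second.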

Blockchain Extensions

Some distributed ledger technologies have abandoned the blockchain data structure in order to address these problems.  For example, IOTA uses a directed acyclic graph (DAG) as its underlying data structure, which greatly increases its transaction speed and capacity.  Some blockchains make small protocol tweaks (like increasing the block rate) to improve transaction speed and capacity. And some blockchains have begun leveraging blockchain extensions to help address these issues while maintaining the original design of the blockchain.

Sidechains

Sidechains are primarily designed to increase the capacity of the blockchain by offloading some transactions to a standalone blockchain.  There are a few different implementations of sidechains, but a common one is to “peg” a sidechain to a parent blockchain. With pegged blockchains, a user on one blockchain can send tokens to an “output address”, and the equivalent amount of tokens will be released onto the sidechain.  Pegs are bidirectional, so the user can return to the original blockchain at will.
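The bookkeeping behind a two-way peg can be sketched as follows.  This is a toy model under strong assumptions: there are no real chains, proofs, or waiting periods, and the class and method names (TwoWayPeg, peg_in, peg_out) are hypothetical.

```python
class TwoWayPeg:
    """Toy ledger for tokens moved between a main chain and a sidechain."""

    def __init__(self) -> None:
        self.locked_on_main = 0  # tokens held at the main chain's output address
        self.issued_on_side = 0  # equivalent tokens released on the sidechain

    def peg_in(self, amount: int) -> None:
        """Lock tokens on the main chain and release the same amount on the sidechain."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.locked_on_main += amount
        self.issued_on_side += amount

    def peg_out(self, amount: int) -> None:
        """Retire sidechain tokens and unlock the same amount on the main chain."""
        if amount <= 0 or amount > self.issued_on_side:
            raise ValueError("cannot return more than was released")
        self.issued_on_side -= amount
        self.locked_on_main -= amount
```

The point of the sketch is the invariant: the tokens circulating on the sidechain always equal the tokens locked on the main chain, which is what lets users move back at will.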

One benefit of sidechains is the increase in capacity for the original blockchain.  Since transactions performed on the sidechain are not recorded in the blocks of the main blockchain, the total capacity of the system is increased.

Sidechains can also be used to address specific deficiencies of the parent blockchain.  For example, a sidechain could have a faster block rate than the parent chain, increasing the system’s transaction speed.  Alternatively, sidechains can increase the capabilities of the system, like the Rootstock sidechain that plans to add smart contract functionality to Bitcoin.

The main security consideration of sidechains is that the sidechain is a completely distinct system from the main chain.  It needs to have its own means of securing consensus, through a large pool of miners, stakers, etc. Otherwise, a hack of the sidechain could affect the quality of its peg with the main chain and its users’ ability to switch back and forth.

State Channels

Another blockchain extension that has been getting a lot of press is the state channel.  The most famous state channel system is probably the Lightning Network running on the Bitcoin network.  However, other state channel implementations run on other blockchains under different names.

State channels function as a second-level protocol that runs on top of a traditional blockchain implementation.  A state channel is a direct connection between users of the blockchain network. They establish the channel using a traditional blockchain transaction that establishes the balance that each has contributed to the channel (e.g. 1 BTC apiece).  After the channel is established, payments are made by creating mutually signed assertions regarding the balance of value in the channel (e.g. 0.75 BTC and 1.25 BTC). The channel can be closed down at any time, and another blockchain transaction is created using the most recent balance assertion to place the correct amount of cryptocurrency in each participant’s blockchain account.
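The open, update, and close lifecycle can be sketched as a small Python class.  This is a simplified illustration: signatures and the on-chain opening and closing transactions are omitted, balances are whole units (100 units standing in for 1 BTC), and the class and method names are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class StateChannel:
    """Toy two-party channel between participants "a" and "b"."""
    balance_a: int
    balance_b: int
    version: int = 0  # a higher version is a more recent balance assertion

    def pay(self, sender: str, amount: int) -> None:
        """Record a new off-chain balance assertion for a payment."""
        if sender not in ("a", "b"):
            raise ValueError("sender must be 'a' or 'b'")
        if amount <= 0:
            raise ValueError("amount must be positive")
        available = self.balance_a if sender == "a" else self.balance_b
        if amount > available:
            raise ValueError("channel too unbalanced for this payment")
        delta = -amount if sender == "a" else amount
        self.balance_a += delta
        self.balance_b -= delta
        self.version += 1

    def close(self) -> dict:
        """Balances the closing blockchain transaction would pay out."""
        return {"a": self.balance_a, "b": self.balance_b}
```

Opening with StateChannel(100, 100) and calling pay("a", 25) leaves balances of 75 and 125, mirroring the 0.75 BTC and 1.25 BTC example.  The total value in the channel never changes; only its split does.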

The main advantages of state channels are transaction speed, scalability, and privacy.  Transactions only require the channel participants and can be completed near-instantaneously.  However, if a channel becomes too unbalanced, it may be impossible to make a payment. This is where the network of state channels can be very useful since transactions can pass through different paths to rebalance channels or perform transfers between unconnected parties.

The main security consideration of state channels is that transactions are backed by the blockchain but not recorded on it.  State channel transactions are private to the recipients, and the blockchain network has to trust that all transactions made over them are legitimate.  However, the point-to-point nature of state channels protects against double-spend attacks since the value stored in one channel is unique to that channel and cannot be used to open and perform transactions in other channels.

The Distributed Ledger Universe

This series was designed to be an introduction to blockchain technology with a focus on blockchain security.  However, the solutions discussed in these articles are not the only ones out there. Other distributed ledger implementations (like DAGs and hashgraphs) use different data structures and have different security properties.  Also, the blockchain can be extended using external devices that interact via APIs or smart contracts. When designing a distributed ledger solution, it’s important to consider all of the available technology and the security considerations associated with it.

Introduction to Blockchain: Smart Contracts

In the beginning, blockchain was designed to replace the financial system.  The distributed, decentralized ledger maintained by the blockchain network is used to record the transactions performed using the blockchain financial system.  As a result, cryptocurrencies like Bitcoin can implement complete, trustworthy financial systems without a central authority (like a bank).

The distributed, decentralized ledger offered by blockchain technology is useful for more than just recording financial transactions.  Smart contract platforms are designed to run a Turing-complete computer on top of the blockchain, allowing smart contracts to fulfill a variety of different functions.

Introduction to Smart Contracts

Smart contract platforms take the underlying blockchain technology and modify it to run arbitrary, third-party programs on top of it.  Instead of transactions including actual financial transactions, they include computer instructions designed to be run by the blockchain’s virtual machine.

Since the blockchain network is distributed and decentralized, there is no central computing platform that runs the code and updates the state of the smart contract platform with the result.  Instead, each node in the network runs its own copy of the virtual machine and executes the code contained in the transactions in each block of the blockchain. Since code is designed to be deterministic and is organized into a block before execution, the network is able to remain synchronized at all times.

Smart Contract Security

The blockchain landscape is fragmented, and so is the landscape of smart contract blockchains.  The basic blockchain solution has been adopted and adapted in many different ways, making it difficult to create a definitive list of smart contract vulnerabilities.

Since Ethereum is the best-established smart contract platform, many lists of smart contract vulnerabilities focus on this.  The Decentralized Application Security Project has compiled a list of the most common smart contract vulnerabilities on the Ethereum platform, which are explored here.

Reentrancy

Reentrancy is probably the most famous of the Ethereum smart contract vulnerabilities.  Exploitation of this vulnerability in The DAO smart contract caused Ethereum to break its blockchain’s immutability and rewrite history to erase the attack.  This controversial decision caused a split in the Ethereum network that created the Ethereum Classic cryptocurrency.

Reentrancy is possible in Ethereum due to the existence of payable fallback functions.  These functions are designed to run when value is sent to the smart contract, allowing it to update its internal ledger, perform some functions, etc.

The issue with this setup is when a vulnerable smart contract function that the attacker can control (like a refund function) can be forced to call a malicious fallback function.  Vulnerable functions use the following control flow:

  1. Check transaction validity

  2. Perform send

  3. Update internal ledger

With this control flow, the malicious fallback function is run as a part of step 2, before the vulnerable function updates state.  If it calls the vulnerable function again, the transaction will still be considered valid (since the state isn’t updated) and run again, allowing the attacker to withdraw twice as much value as approved.
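
The three-step control flow above can be simulated in plain Python to show why the ordering matters. This is a sketch, not Ethereum code: the classes VulnerableContract, SaferContract, and Attacker are hypothetical, and the on_receive method stands in for a payable fallback function.

```python
class VulnerableContract:
    def __init__(self, funds: int):
        self.funds = funds  # total value held by the contract
        self.owed = {}      # internal ledger: account -> refundable amount

    def refund(self, account) -> None:
        amount = self.owed.get(account, 0)
        if amount == 0 or amount > self.funds:  # 1. check transaction validity
            return
        self.funds -= amount
        account.on_receive(self, amount)        # 2. perform send (runs the fallback)
        self.owed[account] = 0                  # 3. update internal ledger (too late)


class SaferContract(VulnerableContract):
    def refund(self, account) -> None:
        amount = self.owed.get(account, 0)
        if amount == 0 or amount > self.funds:
            return
        self.owed[account] = 0                  # ledger is updated before the send
        self.funds -= amount
        account.on_receive(self, amount)


class Attacker:
    def __init__(self):
        self.received = 0
        self.reentered = False

    def on_receive(self, contract, amount: int) -> None:
        # Malicious "fallback": call back into refund once before it finishes.
        self.received += amount
        if not self.reentered:
            self.reentered = True
            contract.refund(self)


def run_attack(contract_class) -> int:
    contract, attacker = contract_class(funds=10), Attacker()
    contract.owed[attacker] = 3
    contract.refund(attacker)
    return attacker.received
```

Against the vulnerable ordering the attacker, who is owed 3, walks away with 6. Moving the ledger update ahead of the send limits the payout to the approved amount.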

Access Control

Some smart contracts are designed to have protected functionality.  For example, you can implement a wallet as a smart contract where anyone can send value to it but only the owner can extract value from it.

Some of these smart contracts have a function for claiming ownership, where the owner of the smart contract (and the one permitted to call protected functionality) is set to the person who called the function.  The issue with these functions is that sometimes people forget to set the function to check that this is the first time the function is called. If they fail to do so, the owner is whoever called the function last, not first.
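
A minimal Python sketch of the missing check follows. The Wallet class and its method names are hypothetical and only model the logic; a real smart contract would identify the caller from the transaction rather than from an argument.

```python
class Wallet:
    def __init__(self):
        self.owner = None

    def claim_ownership_vulnerable(self, caller: str) -> None:
        # No "first call only" check: whoever calls last becomes the owner.
        self.owner = caller

    def claim_ownership(self, caller: str) -> None:
        # Ownership can be claimed exactly once.
        if self.owner is not None:
            raise PermissionError("owner has already been set")
        self.owner = caller
```

With the vulnerable version, an attacker who calls the function after the legitimate owner simply replaces them. The checked version rejects the second call.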

Arithmetic

Arithmetic vulnerabilities like integer overflows and underflows are nothing new with blockchain.  They’ve existed in software programming for some time and have only become less common due to the existence of programming languages that make them impossible (like Python).  Unfortunately, some smart contract programming languages are now vulnerable.

Arithmetic vulnerabilities occur when certain variable types are misused.  Integer overflow vulnerabilities occur when a programmer uses a variable type that is too small to store a value.  Underflow vulnerabilities occur during switches between signed variables (where a one in the most significant bit means negative) and unsigned variables (where a one in the most significant bit means a large, positive number).  Performing subtraction with unsigned values never produces a negative number, which can be problematic since the result of a subtraction is often tested to check the validity of a transaction.
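
Python integers do not overflow, so the sketch below imitates a fixed-width unsigned integer by reducing every result modulo 2**256. The helper names uint256_add and uint256_sub are hypothetical and the 256-bit width is an assumption chosen for illustration.

```python
UINT256_MODULUS = 2 ** 256
UINT256_MAX = UINT256_MODULUS - 1


def uint256_add(a: int, b: int) -> int:
    # Wraps around to zero past the maximum value (overflow).
    return (a + b) % UINT256_MODULUS


def uint256_sub(a: int, b: int) -> int:
    # Wraps around to a huge value when b > a (underflow).
    return (a - b) % UINT256_MODULUS


def naive_can_withdraw(balance: int, amount: int) -> bool:
    # Flawed validity test: an unsigned result is never negative.
    return uint256_sub(balance, amount) >= 0


def safe_can_withdraw(balance: int, amount: int) -> bool:
    # Compare the operands directly instead of inspecting the difference.
    return amount <= balance
```

With a balance of 5 and a withdrawal of 10, the naive test passes because the subtraction wraps to an enormous positive number, while the direct comparison correctly rejects the withdrawal.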

Unchecked Return Values

An Ethereum-specific feature that can trip up novice smart contract developers is the fact that it does not have a consistent means of indicating when low-level functions fail.  Some low-level functions throw an error if they fail, which terminates execution. Other ones return a value of false and allow the code to continue running. If a programmer assumes the first case for a certain function and doesn’t check function return codes, it’s possible to put their code in an unexpected (and potentially invalid) state.
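
The difference between the two failure styles can be sketched in Python. The function low_level_send is a hypothetical stand-in for a low-level call that reports failure by returning False instead of raising an error; it is not a real Ethereum API.

```python
def low_level_send(recipient_accepts: bool) -> bool:
    # Hypothetical low-level call: signals failure through its return value.
    return recipient_accepts


def payout_unchecked(ledger: dict, user: str, recipient_accepts: bool) -> None:
    low_level_send(recipient_accepts)  # return value ignored
    ledger[user] = 0                   # marked as paid even if the send failed


def payout_checked(ledger: dict, user: str, recipient_accepts: bool) -> None:
    if not low_level_send(recipient_accepts):
        raise RuntimeError("send failed; ledger left unchanged")
    ledger[user] = 0
```

When the send fails, the unchecked version still zeroes the user's balance, leaving the ledger in an invalid state. The checked version stops before the ledger is touched.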

Denial of Service

Just like the underlying blockchain can be vulnerable to Denial of Service attacks, smart contracts can also be rendered non-operational by a malicious (or benign) user.  Denial of Service attacks on smart contracts can be accomplished in a variety of different ways.

One way to attack a smart contract is to exploit an access control vulnerability.  Well-designed smart contracts often include a self-destruct function, and an attacker who gains access to this function and executes it renders the contract unusable.

Bad Randomness

Smart contracts often need access to random numbers.  In fact, many smart contracts are designed to implement gambling games, so they need the ability to generate secret random numbers.

There are several means of generating random numbers in code, and many of these are considered “best practice” in traditional applications.  However, the blockchain environment is different, making the following means of generating randomness insecure:

  • “Secret” Values: Like seeding a pseudo-random number generator, some smart contracts use “secret” values to create randomness.  However, everything is public on the blockchain, so an attacker can observe this value and predict the “random” values.

  • “Secret Code”: Using a proprietary algorithm for generating random numbers is not a great idea but it often works.  However, it fails on blockchain for the same reason as the “secret” values.

  • External Input: Using external sources of entropy is a common method of generating randomness in traditional applications.  However, any source of entropy visible to one smart contract is visible to any other, making it easy to observe and exploit.

In the end, the best way to generate random numbers on the blockchain is to use an external source of randomness that the smart contract can query.  However, this has to be done carefully to ensure that malicious smart contracts can’t see it as well.
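
The weakness of “secret” values can be illustrated with a short Python sketch. The function contract_random is hypothetical: it derives a die roll from a stored seed and a round number, which is one plausible way a contract might try to generate randomness.

```python
import hashlib


def contract_random(stored_seed: bytes, round_number: int, sides: int = 6) -> int:
    # Deterministic: the same seed and round always give the same roll.
    data = stored_seed + round_number.to_bytes(8, "big")
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") % sides + 1


def attacker_prediction(observed_seed: bytes, round_number: int) -> int:
    # The attacker reads the seed from public contract state and re-runs
    # exactly the same computation before placing a bet.
    return contract_random(observed_seed, round_number)
```

Because every node must be able to reproduce the computation, nothing in it can stay hidden, so the attacker's prediction always matches the contract's roll.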

Race Conditions

In traditional programming, race conditions are when two or more threads are competing for resources, and the behavior of the program is dependent on which one gets there first.  In the blockchain, multiple transactions may be competing for recognition by a smart contract and the result depends on whichever transaction is processed first.

For example, a contest may exist where the first person to solve a puzzle wins some prize.  A benign user solves the puzzle and submits their solution to the smart contract as a transaction.  However, transactions are not instantly processed, are publicly visible before processing, and are organized for processing based upon transaction fees.  If an attacker sees the user’s solution, copies it, and submits a transaction with a higher transaction fee, the attacker is likely to win the contest without doing any work to solve the puzzle.
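
The contest scenario can be sketched as follows. The transaction layout (a dict with sender, answer, and fee) and the function names are assumptions made for illustration; real block creators are free to order transactions however they like, and fee-first ordering is simply the common case described above.

```python
def order_for_block(pending: list) -> list:
    # Assume the block creator processes the highest-fee transactions first.
    return sorted(pending, key=lambda tx: tx["fee"], reverse=True)


def contest_winner(pending: list, solution: str):
    # The first processed transaction carrying the right answer wins.
    for tx in order_for_block(pending):
        if tx["answer"] == solution:
            return tx["sender"]
    return None


pending_transactions = [
    {"sender": "alice", "answer": "42", "fee": 1},    # solved the puzzle
    {"sender": "mallory", "answer": "42", "fee": 5},  # copied the answer, paid more
]
```

Even though alice submitted first, mallory's higher fee moves the copied answer to the front of the block, so mallory wins.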

Timestamp Dependence

Another way that these contests can go wrong is if they depend on the current time on the blockchain as a condition for the contest.  For example, a smart contract may run a contest where the first submission after midnight on a certain day is the winner.

On the blockchain, the current time is the time of the most recent block, and this is set by the block creator.  Further, there is some wiggle room (often up to two hours) in timestamps to deal with propagation delay, non-synchronized clocks, etc. (in fact block timestamps don’t even have to be in order).  An attacker who manages to create a valid block with a timestamp of midnight before midnight but within the acceptable window can win this contest before anyone else tries to play.
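
A small Python sketch makes the timing window concrete. The two-hour tolerance comes from the text above; the function names and the use of plain integer seconds are assumptions made for illustration.

```python
MAX_FUTURE_DRIFT_SECONDS = 2 * 60 * 60  # "up to two hours" of wiggle room


def timestamp_accepted(block_time: int, network_time: int) -> bool:
    # Nodes tolerate block timestamps somewhat ahead of their own clocks.
    return block_time <= network_time + MAX_FUTURE_DRIFT_SECONDS


def contest_is_open(block_time: int, midnight: int) -> bool:
    # The contract only sees the block timestamp, not the real time.
    return block_time >= midnight
```

If the real time is one hour before midnight, a block stamped with midnight is still accepted by the network, and the contract treats the contest as open.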

Short Addresses

Short address vulnerabilities in Ethereum are caused by variable sizes, how arguments to a function are stored in memory, and how Ethereum pads arguments that are too short.

In this attack, the attacker calls a vulnerable smart contract function designed to send value to a certain address (like the refund function from the reentrancy vulnerability).  In this call, the attacker deliberately sends a destination address that is one byte too short and a value of the correct size. The function checks the value and, if the transaction is valid, calls a function to transfer the value.

This transfer function specifies the size of its arguments and expects a destination address of a given size.  As a result, it fills the address variable with the provided address and the first byte of the provided value. Now, the value is too small, so Ethereum zero-pads it on the right, effectively multiplying it by 256.  As a result, if the new destination address is controlled by the attacker (which they can assure before they perform the attack), they receive 256 times more value than the vulnerable function authorized.
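
The byte shifting can be reproduced in Python. This sketch assumes a simplified argument layout (a 20-byte address left-padded to 32 bytes, followed by a 32-byte value) and a careless encoder that does not validate the address length; encode_call and decode_call are hypothetical helpers, not a real Ethereum library.

```python
def encode_call(address: bytes, value: int) -> bytes:
    # Careless encoder: never checks that the address is 20 bytes long.
    return bytes(12) + address + value.to_bytes(32, "big")


def decode_call(calldata: bytes) -> tuple:
    # Missing trailing bytes are read as zeros (padding on the right).
    padded = calldata.ljust(64, b"\x00")
    address = padded[12:32]
    value = int.from_bytes(padded[32:64], "big")
    return address, value


short_address = bytes.fromhex("aa" * 19)  # one byte too short
decoded_address, decoded_value = decode_call(encode_call(short_address, 100))
```

The decoded address is the 19 supplied bytes followed by a zero byte borrowed from the value, and the decoded value is 25600, which is 256 times the 100 that was authorized.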

Unknown Unknowns

The final smart contract vulnerability included in the DASP list was unknown unknowns.  Blockchains in general, and smart contract platforms in particular, are relatively new technologies.  It is extremely likely that new vulnerabilities will be discovered and take the top slots for smart contract vulnerabilities in future years.

The Final Chapter: Blockchain Extensions

The articles to date have covered the topics used by most blockchain and smart contract platforms.  However, some extensions and second-level protocols have been developed to fix the deficiencies of the blockchain.  The final article in this series will discuss some of the security considerations for these technologies.

Introduction to Blockchain: Consensus

Blockchains are designed to be distributed, decentralized networks. Part of this includes removing the central authority used in many other systems.  In a traditional financial system, banks centralize power by maintaining control of the ledger that states how much value is stored in each account. If a dispute arises over the ledger, the bank has the final authority to decide what the authoritative version is.

Blockchain is designed to remove centralizing authorities like banks.  Instead, the blockchain network maintains a shared, decentralized ledger with each node in the network maintaining a copy and updating it as each new block is created.

The challenge with this is ensuring that all nodes make the same updates to their copies of the ledger with each block.  Since the network does not have a consistent authority to create the official version of the ledger, it chooses a temporary authority to create and share each block.  The mechanism for accomplishing this is called the blockchain consensus algorithm.

Fundamentals of Consensus

The job of the consensus algorithm is to ensure that control of the blockchain is decentralized so that no one user has the ability to control the network.  The means by which this is accomplished is through making control of the blockchain network dependent on control of a scarce resource.

No matter what consensus algorithm you choose, it boils down to the fact that control of a scarce resource equals power on the blockchain.  In Proof of Work, this resource is computational power. In Proof of Stake, it’s the blockchain’s cryptocurrency.

The logic behind using a scarce resource as an analog to power on the blockchain is that it enables the use of economic incentives to protect the blockchain.  The Law of Supply and Demand says that, if there is increased demand for a resource with a limited supply, then the price increases.

When an attacker tries to gain control of a blockchain network (to perform a 51% attack or similar), they need to acquire more of the scarce resource to do so.  As a result, they increase the demand for the resource, which increases the price to acquire it. Hopefully, the cost to acquire enough of the resource to perform a successful attack will be beyond the attacker’s resources.  If not, we have successful 51% attacks against blockchains, which has certainly happened on smaller cryptocurrency networks.

How Common Algorithms Implement Consensus

When Satoshi Nakamoto created Bitcoin, it was the only blockchain in existence.  The Bitcoin whitepaper described the Proof of Work consensus algorithm used on the Bitcoin network.  Since then, many other consensus algorithms have been developed for different blockchain implementations.  Of these, Proof of Stake also receives a lot of attention, partly due to its presence on the Ethereum roadmap.

Proof of Work

Proof of Work is the original consensus algorithm, and, as its name suggests, it involves making people do work.  In Proof of Work, miners are the ones attempting to create a new block.  The way that the block creator is selected is by implementing a race where the winner creates the block (and earns the associated rewards).

This race involves creating a valid block, where the condition for validity is that the header of the block hashes to a value less than a given threshold.  Due to the properties of hash functions, the best way of accomplishing this is by random guessing. As a result, the miners in the network try different nonces until one stumbles across a nonce that creates the desired hash output.  The first miner to find a valid block then transmits it to the rest of the network to build the next block on top of.
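
The search for a valid nonce can be sketched with Python's hashlib. This is a simplified illustration: it uses a single SHA-256 over a made-up header, walks through nonces in order rather than guessing at random, and the names block_hash and mine are hypothetical.

```python
import hashlib


def block_hash(header: bytes, nonce: int) -> int:
    # Hash the header together with the nonce and read it as a number.
    digest = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def mine(header: bytes, threshold: int, max_tries: int = 1_000_000):
    # Try nonces until the hash falls below the threshold.
    for nonce in range(max_tries):
        if block_hash(header, nonce) < threshold:
            return nonce
    return None  # gave up without finding a valid block


EASY_THRESHOLD = 2 ** 248  # roughly 1 in 256 hashes qualifies
found_nonce = mine(b"example block header", EASY_THRESHOLD)
```

Lowering the threshold makes valid hashes rarer, so more guesses are needed on average. There is no shortcut other than trying more nonces, which is why computational power is the scarce resource.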

The main issue with Proof of Work is that the only criterion for block creation is the ability to create a valid block.  There is nothing to say that two different miners can’t find different versions of the block around the same time. If this occurs, a divergent blockchain may be created with different parts of the network building on top of different blocks.  Blockchain resolves this using the longest block rule, which says that, in a conflict between two versions of the blockchain, the longer one should be accepted.

Proof of Work also tries to minimize the probability of divergent blockchains using the concept of  difficulty.  The threshold value that a valid block header’s hash must be less than can be updated in a distributed fashion.  The difficulty is updated at regular intervals so that the creation of blocks (with the current computational power of the blockchain network) occurs at the desired block rate.
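
One way to picture the update is a rule that scales the threshold by how long the last batch of blocks actually took compared with how long it should have taken. The sketch below assumes that simple proportional rule; the function name retarget is hypothetical, and real networks add details, such as limits on how far a single adjustment can move, that are left out here.

```python
def retarget(threshold: int, actual_seconds: int, expected_seconds: int) -> int:
    # Blocks arrived too quickly -> smaller threshold -> harder puzzle.
    # Blocks arrived too slowly  -> larger threshold  -> easier puzzle.
    return threshold * actual_seconds // expected_seconds
```

If the last interval took half the expected time, the threshold is halved, so a valid hash becomes about half as likely per guess and the block rate drifts back toward the target.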

Proof of Stake

Proof of Stake takes a different approach to securing the blockchain using a scarce resource.  Instead of using scarce computational power (like Proof of Work), Proof of Stake uses the blockchain’s scarce cryptocurrency.

Proof of Stake works a lot like investing in a company.  By giving some of your money to a company, you have the right to receive investor dividends.  In Proof of Stake, you promise not to spend a portion of your cryptocurrency (or stake it) in exchange for the chance to be a block creator (and earn the associated rewards).

The mechanics of how block creators are selected based on stakes varies based upon the implementation.  In some implementations, the probability of being selected is directly proportional to the size of the user’s stake.  In others, the concept of coin age is introduced, where stakers who have not been selected to create a block in some time have an increased probability of being selected.  Regardless, control of more staked cryptocurrency in Proof of Stake equates to increased control over the blockchain.
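
The simplest variant, where the chance of selection is directly proportional to stake, can be sketched with Python's random.choices. The function name pick_block_creator and the stake figures are hypothetical, and a real blockchain would need a source of randomness that every node can verify rather than a local random number generator.

```python
import random


def pick_block_creator(stakes: dict, rng: random.Random) -> str:
    # Selection probability is proportional to each staker's stake.
    names = list(stakes)
    weights = [stakes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def selection_counts(stakes: dict, rounds: int, seed: int = 0) -> dict:
    # Tally how often each staker is chosen over many rounds.
    rng = random.Random(seed)
    counts = {name: 0 for name in stakes}
    for _ in range(rounds):
        counts[pick_block_creator(stakes, rng)] += 1
    return counts
```

Over many rounds, a staker holding three quarters of the stake is chosen for roughly three quarters of the blocks, which is the sense in which more stake means more control.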

One issue with Proof of Stake is the potential for a user to create multiple versions of the same block. Since the only criterion for a block to be valid is a signature by the chosen block creator, it’s possible for a user to sign multiple versions of the same block.  In fact, this is one place where blockchain incentives break down since, if presented with two versions of the blockchain to build upon, it’s in the block creator’s best interest to build on both to ensure that whichever version eventually wins out includes the block that pays them their block reward.

Attacking Consensus

Consensus mechanisms are the key to controlling the blockchain. As a result, many attacks on the blockchain are based upon gaining this control.  If successful, an attacker can perform a double-spend attack, which allows them to complete one transaction and then remove it from the ledger at a later date.  Some attacks against consensus have been known from the beginning (like the 51% attack), while others (like long-range attacks) were developed later.

51% Attacks

51% attacks are probably the simplest way to attack a Proof of Work blockchain and occur when the economic incentives of the blockchain don’t work.  Under the longest block rule, every benign node is obligated to choose the longer option when presented with two contradictory versions of the blockchain.  If an attacker has the ability to create the longer version at will, then they control the blockchain.

In Proof of Work, this is accomplished by controlling over half of the computational power of the blockchain network. Since creation of valid blocks requires randomly searching the space of potential options, whoever can search the space more quickly can create blocks faster.

Similar attacks are possible on Proof of Stake, but they require a greater level of control over the scarce resource.  In Proof of Work, an attacker with more than 50% of the computational power can, given enough time, always build the longer chain. In Proof of Stake, you need 100% of the staked cryptocurrency to have a 100% chance of forging the next block.  Since this is unlikely, an attacker trying to control a Proof of Stake blockchain needs to accept the possibility of failure.
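
For Proof of Work, the Bitcoin whitepaper models an attacker trying to catch up with the honest chain as a gambler's-ruin problem. The sketch below implements that simplified model; the function name catch_up_probability is hypothetical, and the model ignores real-world factors such as network delay.

```python
def catch_up_probability(attacker_share: float, blocks_behind: int) -> float:
    # q = attacker's share of computational power, p = everyone else's share.
    q = attacker_share
    p = 1.0 - q
    if q >= p:
        return 1.0  # with at least half the power, catching up is certain
    return (q / p) ** blocks_behind
```

With a quarter of the computational power and a two-block deficit, the attacker succeeds about 11% of the time, and the odds shrink quickly with each additional block. Once the attacker's share passes one half, the probability becomes 1 regardless of the deficit.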

Long-Range Attacks

Long-range attacks can be used on Proof of Stake blockchains to give an attacker the controlling portion of the staked cryptocurrency necessary to attack the consensus algorithm.  In this attack, the attacker creates a divergent version of the blockchain all the way back to the genesis block (this assumes that they have a stake in the genesis block).

On their divergent blockchain, the attacker creates a new block whenever they are selected as the block creator. Since they are the only ones creating blocks, they’re the only ones receiving block rewards.  Over time, the attacker has the controlling stake in the divergent blockchain.

However, the divergent blockchain will only be accepted if it is longer than the “true” version of the blockchain. Since the attacker can only create blocks on their version when it’s their turn, their divergent blockchain will fall behind the main chain whenever a benign user is selected to create a block.  While this happens less frequently as they control more of the stake, the attacker’s chain is significantly behind in the beginning.

In order to catch up, the attacker deliberately passes up their opportunities to create blocks on the main chain.  Between these missed blocks and ones missed by natural causes (or due to a Denial of Service attack on the chosen block creator), the attacker’s chain has the opportunity to slowly catch up to the main chain.  When this occurs, the attacker can publish their malicious divergent blockchain and gain control of the blockchain.

Up Next: Smart Contracts

At this point, we’ve examined the security implications of each level of the original blockchain.  However, blockchain technology has advanced since the publication of the Bitcoin whitepaper.  The remaining two articles in this series are devoted to technology built on top of the original blockchain: smart contracts and blockchain extensions.

Introduction to Blockchain: Network


The blockchain is designed to store a trusted, shared distributed ledger.  This ledger represents the history of the blockchain network, so the network level is an important one when discussing the blockchain ecosystem.

In the previous post, we discussed the nodes and how they each maintain their own copy of the distributed ledger.  Since the blockchain is designed to be trustless, no node is going to implicitly trust any other node’s copy.  They need a way to agree on the state of the ledger (consensus), and, for that, they need a way to communicate: the network.

The Blockchain Peer-to-Peer Network

Blockchains use a different network architecture than most of the web services that we’re used to.  These services use a client-server architecture, where the server acts as a single source of ground truth, and the clients connect directly to it to upload or download application data.  For example, when you use a webmail client like Gmail, your email doesn’t go directly from your computer to the recipient’s.  Instead, you upload it to the Gmail servers and the recipient downloads it from a Gmail server to read.

This system is simple and effective, yet it relies on the Gmail server to be a trusted middleman in the process.  Blockchain isn’t big on trusted middlemen, so it uses a peer-to-peer network, where each node in the network communicates directly with other nodes.  Most blockchain networks use a broadcast system where, if a node has five peers, every message that is received from one is sent to the other four.  This way, messages percolate across the network over many paths, and no one has complete control over communications.
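
The broadcast behavior can be sketched as a flood over a peer graph. The peer map below is made up for illustration and the function name flood is hypothetical; real nodes also track which individual messages they have already seen so they do not forward them again.

```python
from collections import deque


def flood(peers: dict, origin: str) -> set:
    # Each node forwards a newly received message to all of its peers.
    reached = {origin}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for peer in peers[node]:
            if peer not in reached:
                reached.add(peer)
                queue.append(peer)
    return reached


example_peers = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c"],
    "e": [],  # isolated node with no peers
}
```

A message from node a reaches d over two independent paths (through b or through c), so no single relay can block it. The isolated node e never receives the message, which is the situation an Eclipse attack tries to engineer.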

The main implication of the peer-to-peer model for blockchain networking is that the underlying network needs to be able to support it.  Since every peer needs to be capable of connecting to every other peer, you can’t effectively have a blockchain network distributed across a network with varying trust levels without compromising either blockchain or network security.

Also, the “broadcast” communication style of the blockchain means that it requires a large amount of bandwidth to function properly.  The inability to support this can have negative impacts on blockchain security and effectiveness.

Attacking the Blockchain Network

Many of the best-known attacks against blockchain systems are at the network level.  Many people know that private key management is a problem and that smart contract vulnerabilities exist, but they’d be hard-pressed to even name the top ten most common smart contract vulnerabilities.  On the other hand, Sybil attacks and 51% attacks are commonly mentioned in blockchain security-related posts.

In this section, we’ll discuss three network-level attacks on the blockchain: Denial of Service (DoS), Eclipse, and Sybil attacks.  A 51% attack can also be considered a network-level attack, but we’ll talk about it in the next post since it’s most closely related to consensus.

Denial of Service Attacks

Blockchains are distributed, decentralized networks, so it seems like Denial of Service (DoS) attacks should be impossible.  DoS attacks target a single point of failure (like a webserver) or a bottleneck in a system and attempt to overwhelm it in order to degrade the operations of the system.  Since blockchain (theoretically) has no single points of failure, DoS attacks shouldn’t be an issue.  In practice, DoS attacks against the blockchain exist, but they attack temporary single points of failure or system bottlenecks.

One such bottleneck is the transaction capacity of the blockchain.  Most blockchains create blocks with a fixed maximum size at a fixed rate.  An attacker can create a large number of spam transactions and transmit them to the network (similar to a DoS attack on a webserver).  If the network can’t reliably identify them as spam transactions and ignore them, they’ll be added to the blockchain, taking up space that could have been used by legitimate transactions. Worse, blockchains are “forever”, so these spam transactions that make it onto the blockchain can take up storage space on nodes for the life of the blockchain.

An example of a temporary single point of failure is the creator of a given block.  Different blockchains have different methods of choosing this person, but in the end, one node puts a block together, signs it, and transmits it to the network.  In some schemes (like Proof of Stake), if a block creator misses their “slot” for creating a block, they forfeit it.  If you can force someone to forfeit a block (e.g. by a traditional DoS attack), that block is never created and the network loses some of its potential capacity.

Eclipse/Routing Attacks

Eclipse and Routing attacks are two names for essentially the same attack.  In an Eclipse attack, an attacker isolates a single node from the rest of the network by controlling all of its peer connections.  In a Routing attack, the network is split up into two or more isolated groups.  Both attacks can be used to facilitate a double-spend attack (by sending a different transaction to each isolated individual/group) or a 51% attack (by filtering the victim’s view of the network state so that they mine a version of history that’s in the attacker’s favor).

Eclipse and routing attacks can be performed by a variety of different means.  External to the blockchain ecosystem, an attacker can control a node’s connection to the network using malware or any other traditional means of performing a Man-in-the-Middle (MitM) attack.  A study found that Bitcoin is especially vulnerable to BGP routing attacks, where the attacker convinces computers that the best route from A to B is through them.

Internal to the blockchain ecosystem, an attacker can perform these attacks by controlling all of a node’s connections to their peers.  Since blockchain networks are not fully connected (nodes only connect to a small number of other nodes), it’s possible that a node can only be connected to peers controlled by an attacker.

It doesn’t matter how the attacker controls the node’s connection to the blockchain network as long as control is absolute.  If this is true, the attacker may be able to selectively drop packets from other users or send mutually exclusive versions of transactions from their own addresses to drive the isolated groups’ versions of history apart.  When the attack is completed, the longest block rule means that whichever version of the blockchain is shorter will be discarded (which is perfect for a double-spend attack).

Sybil Attacks

A Sybil attack is a simple network-level attack used to facilitate other attacks.  In a Sybil attack, the attacker creates and maintains a large number of accounts on the blockchain network.  This can be useful when performing an Eclipse/Routing attack since, if the attacker controls most of the nodes accepting connections when a node is looking for one, there is a high probability that the node will only choose attacker-controlled connections.

Up Next: Consensus

Up to this point, we’ve talked about the fundamentals of the blockchain protocol and how it works and is attacked at a node and network level.  In the next post, we’ll be discussing consensus: the way that these distributed and decentralized networks of nodes agree on their shared history.

Introduction to Blockchain: Nodes

Traditional network services work on a client-server model. To access the shared resource, you (the client) connect to a server and request the official version of the file. This makes synchronization easy (since the server knows the most recent version) but is very centralized. This can be problematic because it requires trust in the server and the server is vulnerable to Denial of Service (DoS) attacks.

Blockchain is designed to be a completely decentralized system. Every node in the blockchain network has the ability to keep a copy of the distributed ledger, and the official version of the shared ledger is updated via blockchain consensus mechanisms (covered in detail in the fourth section of this series).

What Do Nodes Do?

Nodes are a vital part of the blockchain ecosystem because they’re the ones that do everything. As a decentralized peer-to-peer system, everyone acts as a combined client and server. As a result, the duties of nodes are protocol-specific (rather than software-specific) and numerous.

Protocol Not Software

Like many other Internet applications, a blockchain is a protocol rather than a specific piece of software. Instead of mandating that everyone run the same executable to use a service (like Skype), the only requirement is that nodes communicate based upon the rules of the service.

An example is HTTP, the protocol that defines how websites work. The structure and ordering of packets on the network is defined by the protocol, but no one cares which software you’re running. As a result, there are several different web servers (Apache, IIS, etc.) and many different web browsers (Chrome, Firefox, Safari, etc.). These servers and browsers have agreed to follow the protocol, so they’re able to communicate with one another with no issues.

Some blockchains are implemented using different software, while others have only one. When choosing blockchain software to run, it’s always a good idea to cross-compare the options.

Common Node Tasks

The purpose of the node is to implement and operate the blockchain. Each node has the ability to store a complete copy of the distributed ledger, and, if they do, to update it based upon the consensus of the network as a whole. As a result, nodes can participate in a variety of activities including transaction processing, block creation, and ledger management.

Transaction Processing

One of the most common tasks that nodes have is transaction processing. Anyone connected to the blockchain network through the node will send their transactions to the node to be added to the distributed ledger. The node is responsible for sending these transactions on to the rest of the network as well as forwarding on any transactions that it receives from other nodes to its peers in the network.

Block Creation

The blockchain is updated by adding new blocks to the existing chain. These blocks contain the data stored on the distributed ledger, and someone needs to collect this information into the block and distribute it to the rest of the network. Since there is no centralized server in blockchain, this means that the nodes are responsible for this as well. Using a blockchain consensus algorithm, a node is selected as the next block creator. They perform the tasks of creating the next block and starting its distribution (and are rewarded for their trouble).

Ledger Management

Finally, nodes are responsible for ensuring that the distributed ledger is properly stored and accessible. Every node has the potential to store a complete copy of the distributed ledger. Since not all users of the blockchain network are nodes (e.g. some people just use Bitcoin for performing transactions or investments), these nodes may occasionally be asked to send a copy of certain parts of the blockchain to a user in order to verify that a transaction made it onto the distributed ledger.

Types of Blockchain Nodes

The role distinctions in the blockchain network aren’t even as simple as node and not-node. In some cases, it’s possible to have different types of nodes. For example, Hyperledger permits a huge amount of role specialization, allowing nodes to only do the portion of the work that they are most suited to.

One of the more common distinctions between nodes on the blockchain is between full and lite nodes. As the name suggests, full nodes perform all of the job roles associated with being a node. These guys store a complete copy of the ledger and participate in consensus and block creation. A blockchain network needs a certain critical mass of full nodes in order to maintain its security and decentralization.

Lite nodes are designed to make it easy for someone to perform and verify transactions without doing everything that a full node does. In the previous post in this series, we talked about how the block headers are “chained” together using hash values. Since these headers summarize all of the transactions contained in a block, they are all you need for verification of blockchain integrity. Lite nodes download the headers and only request the actual transaction data if they want to verify that a certain transaction was included in the block. This reduces the storage and communications requirements of lite nodes at the cost of a bit of decentralization.
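A rough sketch of what a lite node does with headers might look like the following. The header layout (a prev_hash field plus a tx_summary field) and every function name are assumptions made for illustration. In particular, the summary here is a plain hash over the whole transaction list, which forces the lite node to fetch every transaction in the block; real blockchains use more efficient structures (Bitcoin, for instance, uses a Merkle tree so that only a short proof is needed).

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def tx_summary(transactions):
    # Stand-in for the transaction summary stored in a header.
    # Assumption: a plain hash over the ordered list of transactions.
    return sha256_hex(json.dumps(transactions).encode())


def make_header(prev_hash, transactions):
    return {"prev_hash": prev_hash, "tx_summary": tx_summary(transactions)}


def header_hash(header):
    return sha256_hex(json.dumps(header, sort_keys=True).encode())


def headers_are_linked(headers):
    # Each header must reference the hash of the one before it.
    return all(
        headers[i]["prev_hash"] == header_hash(headers[i - 1])
        for i in range(1, len(headers))
    )


def block_contains(header, transactions, tx):
    # Check the transactions supplied by a full node against the header
    # before trusting that tx really is in the block.
    return tx_summary(transactions) == header["tx_summary"] and tx in transactions


# A full node holds the blocks; the lite node keeps only the headers.
block_txs = [["a->b 1"], ["b->c 2", "c->d 3"]]
headers = []
prev = "0" * 64
for txs in block_txs:
    header = make_header(prev, txs)
    headers.append(header)
    prev = header_hash(header)

print(headers_are_linked(headers))
print(block_contains(headers[1], block_txs[1], "b->c 2"))
print(block_contains(headers[1], ["b->c 2", "x->y 9"], "b->c 2"))
```

The last call shows the point of the check: a full node that hands over a doctored transaction list is caught because the list no longer matches the summary in the header the lite node already holds.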

Security of Blockchain Nodes

Nodes are the targets of most attacks on blockchain networks. While other attacks may have more name recognition (like 51% attacks), many attackers have found that it’s more profitable to target the individual users. Some threats at the node level are security misconfigurations, phishing, and malware.

Security misconfiguration vulnerabilities occur when users modify the settings on their blockchain software without understanding the potential impacts. One example is a setting on a common Ethereum client that allowed external applications to communicate with wallet software via Remote Procedure Call (RPC). Attackers scanning for port 8545 were able to connect to the software and steal $20 million in Ether.

Phishing attacks are also extremely common for blockchain users. The Electrum wallet is especially known for being a target of phishing attacks, with over $1 million in Bitcoin being stolen by just one attacker in a matter of hours.

Finally, malware can be used on blockchain nodes for a variety of different purposes. Many of the attacks described in the remaining articles in this series can be performed using malware that targets the blockchain software on a node.

Securing Your Node

If you run a node on the blockchain, its security is completely under your control. Taking the appropriate steps to secure it, like installing antivirus software, configuring it properly, and being aware of phishing scams, can make a huge difference for your security and that of the blockchain network. The decentralization of a blockchain network makes it more difficult to defend against certain network-level attacks, but every secure node contributes to the health and security of the network.

Introduction to Blockchain: Fundamentals

This article is the first in a multi-part series describing how blockchain works and some of the security assumptions associated with it. Each article will describe a different level of how the blockchain’s distributed ledger operates, starting with the fundamentals.

At the most basic level, blockchain technology is composed of cryptographic algorithms. The creator of blockchain, Satoshi Nakamoto, developed a system in which the trust that we traditionally place in organizations (like banks) to maintain trusted records is transferred to the blockchain and the cryptographic algorithms that it uses.

The Cryptography Behind Blockchain

The goal of the blockchain is to create a distributed, decentralized, and trusted record of the history of the system. The most famous blockchain, Bitcoin, uses this record to store the history of transactions, so people can make and receive payments on the Bitcoin blockchain and trust that their money won’t be lost or stolen.

In order to achieve this level of trust, the blockchain uses a couple of cryptographic algorithms as building blocks. Hash functions and public key cryptography are crucial to both the functionality and security of the blockchain ecosystem.

Hash Functions

A hash function is a mathematical function that can take any number as an input and produces an output in a fixed range of numbers. For example, 256-bit hash functions (which are commonly used in blockchain) produce outputs in the range 0 to 2^256 - 1.
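As a quick illustration of the fixed output range, the sketch below uses SHA-256 from Python's standard hashlib module. SHA-256 is chosen here only as a representative 256-bit hash function, and the inputs are arbitrary.

```python
import hashlib

for message in (b"", b"a", b"a much longer input " * 1000):
    digest = hashlib.sha256(message).digest()
    as_number = int.from_bytes(digest, "big")
    # The digest is always 32 bytes (256 bits), whatever the input size,
    # so read as a number it always falls between 0 and 2^256 - 1.
    print(len(message), len(digest), 0 <= as_number < 2**256)
```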

In order to be considered secure, a hash function needs to be collision-resistant, meaning that it’s extremely difficult (to the point of being nearly impossible) to find two inputs that create the same hash output. Accomplishing this requires a few different features:

  • No weaknesses in the hash function

  • A large number of possible outputs

  • A one-way hash function (can’t derive the input from the output)

  • Similar inputs produce very different outputs

If a hash function meets these requirements, it can be used in blockchain. However, if any of these requirements are violated, then the security of the blockchain is at risk. Blockchain relies heavily on secure hash functions to ensure that transactions cannot be modified after being stored in the ledger.
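The last requirement in the list above (similar inputs producing very different outputs) is easy to observe for yourself. The sketch below counts how many of the 256 output bits differ between the SHA-256 hashes of two nearly identical inputs. The differing_bits helper and the sample messages are made up for this example; for a well-behaved hash function you would expect roughly half of the bits to change.

```python
import hashlib


def differing_bits(a: bytes, b: bytes) -> int:
    # Count bit positions where the two SHA-256 digests disagree.
    x = int.from_bytes(hashlib.sha256(a).digest(), "big")
    y = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(x ^ y).count("1")


# The two inputs differ only in their final character.
print(differing_bits(b"pay Bob 10", b"pay Bob 11"))
```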

Public Key Cryptography

The other cryptographic algorithm used in blockchain technology is public key cryptography. This type of cryptography is also widely used on the Internet since it has so many useful properties. With public key cryptography, you can:

  • Encrypt a message so that only the intended recipient can read it

  • Generate a digital signature proving that you sent a given message

  • Use a digital signature to verify that a message was not modified in transit

In public key cryptography, everyone has two different encryption keys: a private one and a public one. Your private key is a random number that you generate and keep secret. It is used for decrypting messages and generating digital signatures.

Your public key is derived from your private key and, as the name suggests, is designed to be publicly distributed. It’s used for encrypting messages to you and verifying your digital signatures. Your address (where people send transactions to) on the blockchain is typically derived from your public key.

The security of public key cryptography is based on two things. The first is the secrecy of your private key. If someone can guess or steal your private key, they have complete control of your account on the blockchain. This allows them to perform transactions on your behalf and decrypt data meant for you. The most common way that blockchain is “hacked” is people failing to protect their private key.

The other main assumption of public key cryptography is that the algorithms used are secure. Public key cryptography is based on mathematically “hard” problems, where performing an operation is much easier than reversing it. For example, it’s relatively easy to multiply two numbers together but hard to factor the result. Similarly, it’s easy to perform exponentiation but hard to calculate logarithms. As a result, it’s possible to create schemes where computers are capable of performing the easy operation but not the hard one.
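The gap between the easy and the hard direction shows up even at toy scale. In the sketch below, multiplying two primes takes a single operation, while recovering them by naive trial division takes tens of thousands of steps. The smallest_factor helper is an illustrative assumption, not how real attacks work, and the numbers are far smaller than anything used in practice, where the gap grows to the point that the hard direction is infeasible.

```python
def smallest_factor(n: int) -> int:
    # Trial division: the naive way to undo a multiplication.
    if n % 2 == 0:
        return 2
    candidate = 3
    while candidate * candidate <= n:
        if n % candidate == 0:
            return candidate
        candidate += 2
    return n  # no divisor found, so n itself is prime


p, q = 104729, 1299709   # two known primes (the 10,000th and the 100,000th)
n = p * q                # easy direction: one multiplication

factor = smallest_factor(n)   # hard direction: tens of thousands of divisions
print(factor, n // factor)
```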

The security of these “hard” problems is why you’ll often see articles about quantum computers breaking blockchain. On a sufficiently powerful quantum computer, factoring and logarithms aren’t much harder than multiplication and exponentiation, so traditional public key cryptography would no longer be secure. However, other problems exist that are still believed to be “hard” for quantum computers, so the threat of quantum computers to blockchain can be addressed by upgrading to algorithms based on those problems.

How Blockchains are Put Together

As its name suggests, the blockchain is a collection of blocks that are chained together to create a continuous whole. In this section, we explore how this works.

Blocks

The purpose of the blockchain is to act as a distributed ledger that stores data in a secure fashion. The blocks are the place where this data is stored.

blocks.png

The image above illustrates the basic structure of a block in a blockchain. We’ll talk about every part of this image throughout the series, but for now focus on the green sections. Each green piece represents a transaction within the block. While a transaction may represent a literal transaction (i.e. a transfer of value) on blockchains like Bitcoin, this is not the only option. As we’ll see later, smart contract platforms store other things (like computer code) as transactions as well.

The security of the blocks in the digital ledger depends on the security of public key cryptography. Every transaction and block in the blockchain is digitally signed by its creator. This allows anyone with access to the blockchain to easily validate that every transaction is authenticated (i.e. sent by someone who owns the associated account) and has not been modified since creation. The integrity and authenticity of the blocks in the chain is also assured by the digital signature of the block creator.
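To show how signing and verification fit together, here is a toy sketch using textbook RSA built from Python's standard library. It is a stand-in for whatever signature scheme a given blockchain actually uses (Bitcoin, for example, uses ECDSA rather than RSA), and the key is far too small and too structured to be secure. The function names and the sample transaction are assumptions made for this example.

```python
import hashlib

# Toy key pair built from two Mersenne primes. Far too small and too
# structured for real use; it only illustrates the sign/verify flow.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                                   # public modulus
E = 65537                                   # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))           # private exponent (Python 3.8+)


def digest_as_int(message: bytes) -> int:
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    # Only the holder of the private exponent D can produce this value.
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    # Anyone with the public values (N, E) can run this check.
    return pow(signature, E, N) == digest_as_int(message)


tx = b"alice pays bob 5"
sig = sign(tx)
print(verify(tx, sig))                      # original transaction
print(verify(b"alice pays bob 500", sig))   # modified after signing
```

The second check fails because the signature was computed over the hash of the original message, so changing even one character of the transaction invalidates it.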

Chaining

Each block is equivalent to a single page in a bank’s account ledger; it only represents a slice of the network’s history. In order to combine these slices into a continuous whole, the blockchain makes use of hash functions.

In the image above, you can see the hash functions linking each block together. Each block contains the hash of the previous block as part of its block header (the section not containing transaction data).

The fact that every block is dependent on the previous one is significant because of the collision-resistance of hash functions. If someone wanted to forge block 51 in the image, they would have two options: find another version of block 51 that has the same hash, or forge every block from 52 onward as well. The first is supposed to be impossible (due to collision resistance), and the second should be difficult or impossible since the blockchain is designed to make forging even a single block difficult (more on this later).

The security of the “chain” part of blockchain is based upon the collision resistance of the hash function that it uses. If someone can find a way to generate another version of block 51 that has the same hash, the immutability assumptions of blockchain break down and you can’t trust that any transaction will remain in the distributed ledger.
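The effect of tampering with a block can be sketched in a few lines. The block layout below (a number, a prev_hash, and a data field hashed as JSON) is an assumption made for illustration and is much simpler than a real block, but the linking logic is the same idea: changing block 51 changes its hash, so the link stored in block 52 no longer matches.

```python
import hashlib
import json


def block_hash(block) -> str:
    # Hash a canonical JSON encoding of the block (an assumption of this sketch).
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()


def build_chain(payloads):
    chain, prev = [], "0" * 64
    for number, data in enumerate(payloads, start=50):
        block = {"number": number, "prev_hash": prev, "data": data}
        chain.append(block)
        prev = block_hash(block)
    return chain


def first_broken_link(chain):
    # Return the number of the first block whose prev_hash no longer matches.
    for earlier, later in zip(chain, chain[1:]):
        if later["prev_hash"] != block_hash(earlier):
            return later["number"]
    return None


chain = build_chain(["tx set A", "tx set B", "tx set C", "tx set D"])  # blocks 50-53
print(first_broken_link(chain))      # None: every link checks out

chain[1]["data"] = "forged tx set"   # tamper with block 51
print(first_broken_link(chain))      # 52: its stored hash of block 51 is now wrong
```

To hide the change, the forger would have to rewrite the prev_hash in block 52, which changes block 52's own hash and breaks the link in block 53, and so on to the end of the chain.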

What's Next...

At this point, we’ve covered the fundamental cryptographic algorithms and data structures used in blockchain. The next two articles in this series are focused on the infrastructure behind the blockchain: the nodes and how they’re networked together.