What is hashing in cybersecurity?
Hashing is a one-way mathematical function that turns data into a fixed-length string of nondescript text that is designed to be infeasible to reverse or decode.
In the context of cybersecurity, hashing is a way to keep sensitive information and data (including passwords, messages, and documents) secure. Once this content is converted via a hashing algorithm, the resulting value (or hash code) is unreadable to humans and extremely difficult to reverse, even with the help of advanced technology.
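As a minimal sketch of what a hash value looks like in practice, the snippet below uses SHA-256 from Python's standard `hashlib` module. SHA-256 is an assumption chosen for illustration (this article does not name it), and `sha256_hex` is a hypothetical helper name.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


short_digest = sha256_hex(b"hunter2")
long_digest = sha256_hex(b"a much longer message " * 1000)

# The same input always produces the same hash value.
print(short_digest == sha256_hex(b"hunter2"))

# The output length is fixed (64 hex characters for SHA-256),
# no matter how long the input is.
print(len(short_digest), len(long_digest))
```

Nothing in the digest reveals the input's length or content, which is what makes the hash code unreadable to a human observer.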
Hashing has become an important cybersecurity tool for organizations, especially given the rise in remote work and use of personal devices. Both of these trends require organizations to leverage single sign-on (SSO) technology to enable a remote workforce and reduce friction within the user experience. Though this is a necessary part of modern business, attackers have come to recognize the inherent vulnerability of stored passwords and user credentials, which means that companies must take additional steps to keep that information secure.
Components of hashing
Hashing has three main components:
- The input key
- The hash function
- The hash table
Input Key | Hash Function | Hash Table |
---|---|---|
The input data or message that is processed by the hash function and converted into the hash code. | A mathematical function that converts the input data, or key, into a fixed-length hash value. | A data structure that stores data and maps keys to values. |
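The way the three components fit together can be sketched as follows. This is an illustration under stated assumptions rather than a production design: the table size, the use of SHA-256 as the hash function, and the names `TABLE_SIZE` and `bucket_index` are all hypothetical.

```python
import hashlib

TABLE_SIZE = 8  # hypothetical and deliberately tiny


def bucket_index(key: str, table_size: int = TABLE_SIZE) -> int:
    """Hash function: map an input key to a slot number in the table."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap it
    # into the range of valid slots.
    return int.from_bytes(digest[:8], "big") % table_size


# Hash table: a fixed number of slots, all empty to begin with.
table = [None] * TABLE_SIZE

# Input key and the value it maps to.
key, value = "alice", "alice@example.com"
table[bucket_index(key)] = (key, value)

# Looking the key up again hashes it to the same slot.
print(table[bucket_index("alice")])
```

This sketch ignores what happens when two keys land in the same slot.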
Types of hashing
There is no single hashing algorithm. A variety of algorithms exist, each suited to different use cases. Here, we explore four common hashing algorithms:
- LANMAN
- NTLM
- Scrypt
- Ethash
Hashing algorithms
LANMAN: The Microsoft LAN Manager hashing algorithm, or LANMAN, is a legacy scheme that Microsoft's LAN Manager network operating system used primarily for storing passwords. Introduced in the 1980s, LANMAN is weak by modern standards and is now considered largely obsolete, though it remains a well-known example of a hashing algorithm.
NTLM: Windows New Technology LAN Manager (NTLM) is a suite of security protocols offered by Microsoft to authenticate users’ identities and protect the integrity and confidentiality of their activity. At its core, NTLM is an SSO tool that relies on a challenge-response protocol to confirm the user’s identity without sending the password itself across the network, a process known as NTLM authentication.
Scrypt: Scrypt is a password-based key derivation function (KDF) that converts passwords or other data into cryptographic keys. It is deliberately expensive in both computation and memory, which provides a more robust defense against cryptographic attacks such as brute-force attacks.
Ethash: Ethash is a “proof of work” hashing algorithm developed for the Ethereum network. The algorithm was custom built to secure the Ethereum cryptocurrency network and its blockchain and to validate platform transactions. Ethereum retired it when the network moved to proof of stake in 2022.
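Of the four algorithms above, scrypt is the one exposed directly by Python's standard library, as `hashlib.scrypt` (available when Python is built against OpenSSL 1.1 or later). The sketch below derives a key from a password. The cost parameters `n`, `r`, and `p` and the helper name `derive_key` are illustrative assumptions, not recommendations from this article.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte key from a password with scrypt."""
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=2**14,  # CPU/memory cost (assumed value, must be a power of two)
        r=8,      # block size (assumed value)
        p=1,      # parallelization factor (assumed value)
        dklen=32,
    )


salt = os.urandom(16)  # a random salt, stored alongside the derived key
key = derive_key("correct horse battery staple", salt)

print(len(key))                                                  # 32
print(key == derive_key("correct horse battery staple", salt))  # True
```

Raising `n` makes each guess more expensive in time and memory, and that cost is what slows down a brute-force attack.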
Hashing use cases in cybersecurity
Hashing plays an important role in many cybersecurity algorithms and protocols. At the most basic level, hashing is a way to encode sensitive data or text into an indecipherable value that is incredibly difficult to decode.
Below, we discuss three of the most common hashing use cases in cybersecurity:
- Password storage
- Digital signatures
- File and document management
Password storage
Storing passwords as plain text within a system, application, or device is extremely risky. A password storage solution can use hashing to encode and save login credentials as a hashed value. When users attempt to access the system in the future, the solution authenticates them by hashing the password that was entered and comparing the result with the hashed value stored in the database.
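A minimal sketch of that store-then-verify flow is shown below, using PBKDF2 from Python's standard library. Several details are assumptions that go beyond this article: the per-user random salt, the iteration count, and the function names `hash_password` and `verify_password`.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune to current guidance


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) to store in place of the plain-text password."""
    salt = os.urandom(16)  # assumption: a fresh random salt per user
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Hash the login attempt the same way and compare with the stored value."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, expected)


salt, stored = hash_password("s3cret-passphrase")
print(verify_password("s3cret-passphrase", salt, stored))  # True
print(verify_password("wrong-guess", salt, stored))        # False
```

The system never needs to keep the original password. It stores only the salt and the digest.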
Digital signatures
A digital signature is a cryptographic technique used to verify the origin, authenticity, and integrity of a message, document, or transaction.
To create a digital signature using hashing:
- A hash function is applied to the original message to create a secure hash value
- The hash value is then encrypted using a private key that belongs to the sender; this process creates the digital signature
- The recipient uses the sender’s public key to decrypt the digital signature, which recovers the hash value the sender computed
- The recipient then applies the same hash function to the received message and compares the result with the recovered hash value; if the two values match, it proves that the message has not been altered and that it originated with the designated sender
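The steps above can be sketched with textbook RSA and SHA-256. This is a toy for illustration only and is not secure: the key is built from two small, publicly known primes, there is no padding, and every name here (`P`, `Q`, `sign`, `verify`, `message_hash`) is a hypothetical choice. Real systems should use a vetted cryptography library.

```python
import hashlib

# Toy key pair. ASSUMPTION: two known Mersenne primes, far too small
# and too predictable for real use.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (Python 3.8+)


def message_hash(message: bytes) -> int:
    """Hash the message and reduce it so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sender: hash the message, then transform the hash with the private key."""
    return pow(message_hash(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Recipient: recover the hash with the public key and compare it
    with a fresh hash of the message that was received."""
    return pow(signature, E, N) == message_hash(message)


message = b"Transfer approved"
signature = sign(message)

print(verify(message, signature))               # original message
print(verify(b"Transfer denied", signature))    # altered message
```

The first check succeeds because the recipient's hash matches the one recovered from the signature. The second fails because the altered message hashes to a different value.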
File and document management
Though digital signatures are often used to secure email and other digital communications, they can also authenticate and verify any kind of electronic transaction or document. Two main use cases are:
- Document comparisons: During hashing, the hash function will produce a fixed-length character string that serves as an effectively unique identifier for any type of document or file. If the document is altered in any way, even minutely, the hash value will also change. As a result, hashing provides a fast and effective way to compare files and confirm they have not been altered or compromised.
- Data integrity verification: To verify the integrity of the data within a file or document, a hashing algorithm can be used to produce a checksum, which is a hash value computed over the entire dataset. This checksum accompanies the data as it is shared or transmitted. Upon receipt, the user can create a new checksum and compare it to the original. If the two values match, then the data is considered intact.
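Both use cases come down to computing a checksum and comparing it. The sketch below reads data in chunks so that large files do not have to fit in memory. It uses in-memory streams in place of real files, and the sample data, the `checksum` helper, and the choice of SHA-256 are assumptions for illustration.

```python
import hashlib
import io


def checksum(stream, chunk_size: int = 65536) -> str:
    """Compute a SHA-256 checksum of a binary stream, one chunk at a time."""
    hasher = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        hasher.update(chunk)
    return hasher.hexdigest()


original = b"Quarterly report: revenue 1,000,000"
tampered = original.replace(b"1,000,000", b"1,000,001")

published = checksum(io.BytesIO(original))      # sent along with the data
received_ok = checksum(io.BytesIO(original))    # recomputed on receipt
received_bad = checksum(io.BytesIO(tampered))   # one character changed

print(published == received_ok)    # True: data arrived unaltered
print(published == received_bad)   # False: data was changed
```

With a real file, the same function could be called on a handle opened with `open(path, "rb")`.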
Hashing benefits in cybersecurity
Hashing is an essential component within many cybersecurity practices and protocols. Benefits include:
- Strong password security: Hashing is a critical part of identity and access management (IAM) tools. By storing hashed values rather than plain-text passwords, organizations can verify the identity of their users and maintain proper access controls. This helps limit the damage of password-based attacks, such as credential theft.
- File and data integrity: Once a file, document, or dataset is hashed, any change will result in a new, completely different hash value. This helps organizations track and identify if and when assets have been altered and immediately stop using any asset that may be compromised.
- Data security: A hashed value is virtually useless to cybercriminals and bad actors because it is extremely challenging to decode a one-way hash function. This means that a data breach may not necessarily result in the loss of sensitive data if it has been properly hashed.
- Secure communications: Hashing is a critical part of digital signatures, which is the primary way to authenticate both the contents of a message and the identity of the sender.
- Secure downloads: When downloading software or any large file, systems can compare the hash value of the download with the value published by the provider to ensure the file has not been corrupted or tampered with, for example by an attacker inserting malware.
- Improved threat detection: Hash codes are a fast and effective way to scan and discover known threats on a system or within the network.
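The threat detection benefit can be sketched as a set lookup: hash each file and check whether the value appears in a list of hashes of known threats. Everything concrete below is hypothetical, including the sample "malicious" bytes, the `KNOWN_BAD_HASHES` set, and the function names.

```python
import hashlib


def file_hash(file_bytes: bytes) -> str:
    """Return the SHA-256 hash of a file's contents as hex."""
    return hashlib.sha256(file_bytes).hexdigest()


# Hypothetical blocklist. In practice this would come from a threat
# intelligence feed rather than being computed locally.
known_bad_sample = b"pretend this is a malicious payload"
KNOWN_BAD_HASHES = {file_hash(known_bad_sample)}


def is_known_threat(file_bytes: bytes) -> bool:
    """Check a file against the blocklist with a constant-time set lookup."""
    return file_hash(file_bytes) in KNOWN_BAD_HASHES


print(is_known_threat(known_bad_sample))          # True
print(is_known_threat(b"an ordinary document"))   # False
```

Because any change to a file changes its hash, an exact-match lookup like this finds only threats that are already known byte for byte.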
Limitations of hashing
Though hashing is a useful tool, it has its limitations. In this section, we explore a few challenges and drawbacks of using hashing in cybersecurity:
- Collision: A collision is when two or more inputs result in the same hash value. Because a hash function maps an unlimited range of inputs to a fixed-size output, collisions are unavoidable in principle. Large enterprises or companies that store significant amounts of data in hash tables will face this challenge and must implement a solution to handle such collisions. For example, a company might implement a chaining strategy, in which every entry that hashes to the same slot is added to a linked list at that slot of the hash table.
- Performance: Hashing can be a bit of a balancing act. Algorithms are designed to optimize both speed and memory use; they must also be able to support the level of data input needed by the company. This creates significant complexity in terms of algorithm design and evolution. With respect to cybersecurity, a slow algorithm or one facing significant backlogs can translate into higher risk.
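- Security risks: By definition, hashing is a one-way conversion of data into an indecipherable string of text. Generally, it is infeasible to reconvert the hash value back to data. However, hash functions are usually public, so adversaries do not need to reverse the function itself: they can hash large numbers of guessed inputs (brute-force or dictionary attacks, or precomputed tables) until one matches a stolen hash value. Weaknesses in older algorithms can also allow sophisticated adversaries to tamper with a dataset by crafting fake inputs that produce the same hash value.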
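The chaining strategy mentioned under collisions can be sketched as follows. Two simplifications are assumed: each chain is a Python list rather than a true linked list, and the table is kept deliberately small so that collisions are guaranteed. The class and method names are hypothetical.

```python
import hashlib


class ChainedHashTable:
    """A hash table that resolves collisions by chaining entries per slot."""

    def __init__(self, size: int = 4):
        # One chain (here, a list of (key, value) pairs) per slot.
        self.buckets = [[] for _ in range(size)]

    def _index(self, key: str) -> int:
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.buckets)

    def put(self, key: str, value) -> None:
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))       # new key: extend the chain

    def get(self, key: str):
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        raise KeyError(key)


# Five keys in two slots: at least one slot must hold several entries.
table = ChainedHashTable(size=2)
for name in ["alice", "bob", "carol", "dave", "erin"]:
    table.put(name, name.upper())

print(table.get("carol"))
print([len(bucket) for bucket in table.buckets])
```

Every key remains retrievable despite the collisions, at the cost of a short scan along the chain.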
Hashing vs. encryption
Though hashing and encryption may seem to result in the same outcome, they are actually two different functions.
The main difference is that hashing is always intended to be a one-way conversion of data. The hash value is a string of text that has no corresponding decoding key. An adversary can recover the data input only by guessing candidate inputs and hashing them until one produces the same value.
Data encryption, on the other hand, is a two-way process. Though encryption also uses cryptographic algorithms to convert plain text into an encoded format, it has a corresponding decoding key that allows users to decrypt the data.
Another key difference is that hashing provides you with the ability to authenticate data, messages, files, or other assets. Users can confirm that data sent from one user to another has not been intercepted and altered by comparing the original hash value with the one produced by the recipient. Encryption on its own, on the other hand, protects confidentiality but gives no way to validate the data or tell if it has been changed, which is why hashing is preferred for authentication purposes.
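The one-way versus two-way distinction can be sketched with the standard library alone. For the encryption side, the example uses a one-time pad (XOR with a random key as long as the message), chosen only because it needs no third-party package. The names `xor_bytes`, `plaintext`, and `key` are hypothetical, and real systems would use a vetted cipher.

```python
import hashlib
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte (one-time pad)."""
    return bytes(a ^ b for a, b in zip(data, key))


plaintext = b"wire $500 to account 12345"

# Encryption: two-way. The same key that encodes the data decodes it.
key = secrets.token_bytes(len(plaintext))
ciphertext = xor_bytes(plaintext, key)
recovered = xor_bytes(ciphertext, key)
print(recovered == plaintext)  # True

# Hashing: one-way. There is no key and no decode step, only a
# fixed-length value that can be recomputed and compared.
digest = hashlib.sha256(plaintext).hexdigest()
print(len(digest))             # 64
```

Anyone holding `key` can recover the plaintext from the ciphertext, whereas nothing stored alongside `digest` can turn it back into the original message.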