The hashlib module provides cryptographic hash functions such as SHA-256, alongside legacy ones like MD5 and SHA-1. Use it for checksums, data integrity, and (with care) password hashing.

Basic Hashing

import hashlib
 
# Hash a string
text = "Hello, World!"
hash_obj = hashlib.sha256(text.encode())
 
# Get the digest
print(hash_obj.hexdigest())
# dffd6021bb2bd5b0af676290809ec3a53191dd81c7f70a4b28688a362182986f
 
print(hash_obj.digest())  # Raw bytes

Common Algorithms

data = b"Hello, World!"  # hash constructors take bytes, not str

# SHA-256 (recommended for most uses)
hashlib.sha256(data)
 
# SHA-512 (longer hash, more security margin)
hashlib.sha512(data)
 
# SHA-1 (legacy, avoid for security)
hashlib.sha1(data)
 
# MD5 (broken for security; acceptable only for non-adversarial checksums)
hashlib.md5(data)
 
# List all available
print(hashlib.algorithms_available)
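
To make the size differences concrete, here is a small sketch that builds each of the algorithms above by name and records its digest length. The `digest_bits` dictionary is a name chosen for illustration; on a FIPS-restricted Python build, `md5` may be unavailable and `hashlib.new` would raise `ValueError`.

```python
import hashlib

data = b"Hello, World!"

# Map each algorithm name to its digest length in bits.
digest_bits = {}
for name in ("md5", "sha1", "sha256", "sha512"):
    h = hashlib.new(name, data)
    digest_bits[name] = h.digest_size * 8  # digest_size is in bytes
    print(f"{name:8} {digest_bits[name]:4} bits  {h.hexdigest()[:16]}...")

# algorithms_guaranteed is the portable subset; algorithms_available
# can include extras that depend on the local OpenSSL build.
print(sorted(hashlib.algorithms_guaranteed))
```

If code has to run on machines you do not control, picking from `algorithms_guaranteed` is the safer choice.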

File Checksums

For large files, read in chunks:

def file_hash(filepath: str, algorithm: str = 'sha256') -> str:
    """Calculate hash of a file."""
    h = hashlib.new(algorithm)
    
    with open(filepath, 'rb') as f:
        while chunk := f.read(8192):
            h.update(chunk)
    
    return h.hexdigest()
 
# Usage
checksum = file_hash('large_file.iso')
print(checksum)
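
On Python 3.11 and later, the standard library ships `hashlib.file_digest`, which does the chunked reading for you. The sketch below prefers it when present and otherwise falls back to a manual loop; `file_hash_compat` is a hypothetical name, and the 64 KiB chunk size is an arbitrary choice.

```python
import hashlib

def file_hash_compat(filepath: str, algorithm: str = "sha256") -> str:
    """Hash a file, using hashlib.file_digest when it is available."""
    with open(filepath, "rb") as f:
        if hasattr(hashlib, "file_digest"):  # added in Python 3.11
            return hashlib.file_digest(f, algorithm).hexdigest()
        # Fallback for older interpreters: read fixed-size chunks.
        h = hashlib.new(algorithm)
        while chunk := f.read(65536):
            h.update(chunk)
        return h.hexdigest()
```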

Verify a download:

def verify_checksum(filepath: str, expected: str, algorithm: str = 'sha256') -> bool:
    """Verify file matches expected checksum."""
    actual = file_hash(filepath, algorithm)
    return actual.lower() == expected.lower()
 
# Usage
if verify_checksum('download.zip', 'abc123...'):
    print("File integrity verified")
else:
    print("Checksum mismatch!")
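
Projects often publish checksums as a listing in the format printed by `sha256sum`: a hex digest, whitespace (optionally followed by `*`), then the filename. The following is a rough sketch of checking data against such a listing; both function names are hypothetical, and the parser ignores edge cases such as filenames that begin with `*`.

```python
import hashlib

def parse_checksum_line(line: str) -> tuple[str, str]:
    """Split one 'digest  filename' line into (digest, filename)."""
    digest, _, name = line.strip().partition(" ")
    # The filename may be preceded by a space (text mode) or '*' (binary mode).
    return digest.lower(), name.lstrip(" *")

def matches_listing(data: bytes, filename: str, listing: str) -> bool:
    """Check data against the SHA-256 entry for filename in a listing."""
    actual = hashlib.sha256(data).hexdigest()
    for line in listing.splitlines():
        if not line.strip():
            continue  # skip blank lines
        digest, name = parse_checksum_line(line)
        if name == filename:
            return actual == digest
    return False  # no entry for this file

# Usage with a listing built on the spot
payload = b"Hello, World!"
listing = f"{hashlib.sha256(payload).hexdigest()}  hello.txt\n"
print(matches_listing(payload, "hello.txt", listing))      # True
print(matches_listing(b"tampered", "hello.txt", listing))  # False
```

A checksum only proves the file matches the listing. If the listing comes from the same server as the download, it protects against corruption but not against a compromised server.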

Incremental Hashing

Build up a hash over multiple updates:

h = hashlib.sha256()
h.update(b"Hello, ")
h.update(b"World!")
print(h.hexdigest())
# Same as sha256(b"Hello, World!")

Useful for streaming data:

import hashlib
 
def hash_stream(stream) -> str:
    """Hash data from any iterable."""
    h = hashlib.sha256()
    for chunk in stream:
        h.update(chunk)
    return h.hexdigest()
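
As a quick check that chunking does not change the result, the sketch below feeds a payload through a generator and compares it with a one-shot hash. It also shows `copy()`, which snapshots a hash object's state so that several inputs sharing a prefix can reuse the work. The `chunks` helper and the variable names are illustrative only.

```python
import hashlib

def chunks(data: bytes, size: int):
    """Yield successive slices of data (hypothetical helper)."""
    for i in range(0, len(data), size):
        yield data[i:i + size]

payload = b"Hello, World!" * 1000

# Feeding 64-byte chunks gives the same digest as hashing all at once.
streamed = hashlib.sha256()
for chunk in chunks(payload, 64):
    streamed.update(chunk)
print(streamed.hexdigest() == hashlib.sha256(payload).hexdigest())  # True

# copy() clones the internal state, so a shared prefix is hashed only once.
prefix = hashlib.sha256(b"header:")
first = prefix.copy()
first.update(b"first")
second = prefix.copy()
second.update(b"second")
print(first.hexdigest() == hashlib.sha256(b"header:first").hexdigest())  # True
```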

Password Hashing (Don't Do This)

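A single fast, unsalted hash is not secure for passwords. Identical passwords produce identical hashes, and general-purpose hash functions are fast enough that an attacker can test guesses at a very high rate:

# BAD: unsalted and fast, so vulnerable to rainbow tables and brute force
password = "correct horse battery staple"  # example value
password_hash = hashlib.sha256(password.encode()).hexdigest()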

Use hashlib.pbkdf2_hmac or, better, bcrypt or argon2:

import hashlib
import secrets
 
def hash_password(password: str) -> tuple[str, str]:
    """Hash password with PBKDF2 (stdlib option)."""
    salt = secrets.token_hex(16)
    key = hashlib.pbkdf2_hmac(
        'sha256',
        password.encode(),
        salt.encode(),
        iterations=100_000
    )
    return key.hex(), salt
 
def verify_password(password: str, key_hex: str, salt: str) -> bool:
    """Verify password against stored hash."""
    new_key = hashlib.pbkdf2_hmac(
        'sha256',
        password.encode(),
        salt.encode(),
        iterations=100_000
    )
    return secrets.compare_digest(new_key.hex(), key_hex)
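
One practical wrinkle: if the iteration count is hard-coded in both functions, it cannot be raised later without breaking existing hashes. A common approach is to store the parameters next to the hash. The sketch below uses a made-up `pbkdf2_<algo>$iterations$salt$key` record layout; the layout, the function names, and the 600_000 default are assumptions for illustration, not a standard.

```python
import hashlib
import secrets

ALGORITHM = "sha256"
ITERATIONS = 600_000  # assumption: tune for your hardware and current guidance

def make_record(password: str, iterations: int = ITERATIONS) -> str:
    """Return a self-describing record: scheme$iterations$salt$key."""
    salt = secrets.token_bytes(16)  # fresh random salt per password
    key = hashlib.pbkdf2_hmac(ALGORITHM, password.encode(), salt, iterations)
    return f"pbkdf2_{ALGORITHM}${iterations}${salt.hex()}${key.hex()}"

def check_record(password: str, record: str) -> bool:
    """Recompute the key using the parameters stored in the record."""
    scheme, iterations, salt_hex, key_hex = record.split("$")
    algorithm = scheme[len("pbkdf2_"):]
    key = hashlib.pbkdf2_hmac(
        algorithm, password.encode(), bytes.fromhex(salt_hex), int(iterations)
    )
    # Constant-time comparison on the raw bytes.
    return secrets.compare_digest(key, bytes.fromhex(key_hex))

# Usage
record = make_record("correct horse battery staple")
print(check_record("correct horse battery staple", record))  # True
print(check_record("wrong guess", record))                   # False
```

Because each record carries its own iteration count, old records keep verifying after the default is raised, and they can be re-hashed at the next successful login.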

Better: use the bcrypt or argon2-cffi packages.

HMAC for Message Authentication

Verify both integrity and authenticity:

import hmac
import hashlib
 
def sign_message(message: bytes, secret: bytes) -> str:
    """Create HMAC signature."""
    return hmac.new(secret, message, hashlib.sha256).hexdigest()
 
def verify_signature(message: bytes, signature: str, secret: bytes) -> bool:
    """Verify HMAC signature."""
    expected = sign_message(message, secret)
    return hmac.compare_digest(signature, expected)
 
# Usage
secret = b'my-secret-key'
message = b'{"user": "alice", "action": "transfer"}'
 
sig = sign_message(message, secret)
print(verify_signature(message, sig, secret))  # True
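
To see what the signature actually protects against, this sketch repeats the setup above and then checks a modified message and a wrong key. The `is_valid` helper is a hypothetical name that mirrors the verification logic.

```python
import hashlib
import hmac

def is_valid(message: bytes, signature: str, secret: bytes) -> bool:
    """Recompute the HMAC-SHA256 tag and compare in constant time."""
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(signature, expected)

secret = b"my-secret-key"
message = b'{"user": "alice", "action": "transfer"}'
sig = hmac.new(secret, message, hashlib.sha256).hexdigest()

# Changing any byte of the message invalidates the signature.
tampered = message.replace(b"alice", b"mallory")

print(is_valid(message, sig, secret))        # True
print(is_valid(tampered, sig, secret))       # False
print(is_valid(message, sig, b"wrong-key"))  # False
```

Unlike a bare hash, the tag cannot be recomputed by someone who changes the message, because they do not hold the secret.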

Content Addressing

Use hashes as identifiers:

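def content_address(data: bytes) -> str:
    """Generate content-based identifier."""
    # Truncating to 16 hex characters keeps only 64 bits, which raises the
    # chance of collisions; keep the full digest if inputs are untrusted.
    return hashlib.sha256(data).hexdigest()[:16]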
 
# Same content always gets same address
addr1 = content_address(b"Hello")
addr2 = content_address(b"Hello")
assert addr1 == addr2
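
Building on that idea, here is a minimal sketch of an in-memory content-addressed store. The `ContentStore` class is hypothetical and uses the full SHA-256 digest as the key rather than a truncated one, trading longer identifiers for a negligible collision risk.

```python
import hashlib

class ContentStore:
    """Toy in-memory store keyed by the SHA-256 of each blob."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        """Store data and return its address; duplicates are stored once."""
        key = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        """Fetch a blob and confirm it still matches its address."""
        data = self._blobs[key]
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("stored blob does not match its address")
        return data

    def __len__(self) -> int:
        return len(self._blobs)

# Usage: identical content is deduplicated automatically
store = ContentStore()
a = store.put(b"Hello")
b = store.put(b"Hello")
c = store.put(b"World")
print(a == b, len(store))  # True 2
```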

Quick Reference

Algorithm   Output Size   Use Case
MD5         128 bits      File checksums (non-security)
SHA-1       160 bits      Legacy compatibility only
SHA-256     256 bits      General purpose, recommended
SHA-512     512 bits      Extra security margin
BLAKE2b     Variable      Fast, secure alternative

Function                 Purpose
hashlib.sha256(data)     Create hash object
hash.update(data)        Add more data
hash.hexdigest()         Get hex string
hash.digest()            Get raw bytes
hashlib.pbkdf2_hmac()    Key derivation
hashlib.new(name)        Dynamic algorithm selection
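
BLAKE2b appears in the table but not in the examples so far. The sketch below shows its two distinguishing options in hashlib: a configurable `digest_size` (1 to 64 bytes) and a built-in `key` parameter for keyed hashing. The variable names and the key value are placeholders.

```python
import hashlib

data = b"Hello, World!"

# Ask for a 16-byte (128-bit) digest instead of the default 64 bytes.
short = hashlib.blake2b(data, digest_size=16)
print(len(short.digest()), short.hexdigest())

# Keyed mode produces a MAC-like tag without a separate HMAC construction.
keyed = hashlib.blake2b(data, key=b"my-secret-key", digest_size=32)
unkeyed = hashlib.blake2b(data, digest_size=32)
print(keyed.hexdigest() != unkeyed.hexdigest())  # True
```

A shorter `digest_size` is not a truncation of the longer digest: the size is mixed into the computation, so different sizes give unrelated outputs.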

When to Use What

Task                     Solution
File integrity           SHA-256
Download verification    SHA-256/SHA-512
Password storage         bcrypt or argon2
API signatures           HMAC-SHA256
Content addressing       SHA-256
Quick deduplication      MD5 (speed over security)

hashlib is foundational. Know when hashing alone is enough and when you need HMAC, KDFs, or specialized password hashing.
