Skip to content

SevenZip

SevenZipArchiver(path: Path)

Bases: Archiver

7-Zip archiver implementation using py7zr library.

This class provides comprehensive 7-Zip archive support following the Archiver interface. It handles reading, writing, and management of 7-Zip archives (.cb7) using the py7zr library with efficient filename caching and batch operations.

Features
  • Full read/write support for 7-Zip archives
  • LZMA compression for optimal file size reduction
  • Filename list caching for improved performance
  • Batch operations for multiple files
  • Memory-efficient streaming operations
  • Context manager support for proper resource cleanup
Limitations
  • Requires py7zr >= 1.0.0 to be installed
  • Write operations require full archive reconstruction
  • No password protection support
  • Some advanced 7-Zip features may not be available
  • Performance may be slower than native 7-Zip for very large archives
Performance Notes
  • Filename lists are cached to avoid repeated archive parsing
  • Write operations rewrite the entire archive (7-Zip format limitation)
  • Use batch operations (remove_files) when possible
Thread Safety

This class is NOT thread-safe. Use separate instances for concurrent access or implement external synchronization.

Error Handling

All operations can raise ArchiverReadError or ArchiverWriteError exceptions. The class includes comprehensive error handling and logging for troubleshooting.

Examples:

Writing to a cbz archive:

Python Console Session
>>> from pathlib import Path
>>> from darkseid.archivers.sevenzip import SevenZipArchiver
>>>
>>> # Add files
>>> with SevenZipArchiver(Path("comic.cb7")) as archive:
...     # Add text file
...     archive.write_file("metadata.txt", "Comic metadata here")
...
...     # Add binary file (e.g., image)
...     with open("page1.jpg", "rb") as f:
...         archive.write_file("pages/page1.jpg", f.read())
...
...     # Verify files were added
...     files = archive.get_filename_list()
...     print(f"Archive contains: {files}")

Reading from an existing 7z archive:

Python Console Session
>>> # Read from existing archive
>>> with SevenZipArchiver(Path("existing.cb7")) as archive:
...     # Check if archive is valid
...     if archive.test():
...         # List all files
...         files = archive.get_filename_list()
...         print(f"Found {len(files)} files")
...
...         # Read specific file
...         if "metadata.txt" in files:
...             content = archive.read_file("metadata.txt")
...             print(f"Metadata: {content.decode()}")
...
...         # Process all files
...         for filename in files:
...             if filename.endswith('.jpg'):
...                 image_data = archive.read_file(filename)
...                 print(f"Image {filename}: {len(image_data)} bytes")

Batch operations for better performance:

Python Console Session
>>> with SevenZipArchiver(Path("batch.cb7")) as archive:
...     # Remove multiple files at once
...     files_to_remove = ["temp1.txt", "temp2.txt", "old_data.json"]
...     success = archive.remove_files(files_to_remove)
...
...     if success:
...         print("Batch removal successful")
...     else:
...         print("Some files could not be removed")

Error handling:

Python Console Session
>>> from darkseid.archivers import ArchiverReadError, ArchiverWriteError
>>>
>>> try:
...     with SevenZipArchiver(Path("nonexistent.cb7")) as archive:
...         content = archive.read_file("missing.txt")
... except ArchiverReadError as e:
...     print(f"Read error: {e}")
... except ArchiverWriteError as e:
...     print(f"Write error: {e}")
ATTRIBUTE DESCRIPTION
path

Path to the 7-Zip archive file

TYPE: Path

_filename_list_cache

Cached list of filenames

TYPE: list[str] | None

Initialize SevenZipArchiver.

Creates a new SevenZipArchiver instance for the specified archive file. The archive file doesn't need to exist yet - it will be created when first written to.

PARAMETER DESCRIPTION
path

Path to the 7-Zip archive file. Can be any extension, but typically .cb7 for comic book archives.

TYPE: Path

Note

The parent directory will be created automatically when writing if it doesn't exist.

Examples:

Python Console Session
>>> from pathlib import Path
>>> archiver1 = SevenZipArchiver(Path("my_archive.cb7"))
>>> archiver2 = SevenZipArchiver(Path("comics/issue1.cb7"))

Functions

__enter__() -> Self

Context manager entry for 7-Zip archive operations.

RETURNS DESCRIPTION
Self

The archiver instance for use in the context.

TYPE: Self

Examples:

Python Console Session
>>> with SevenZipArchiver(Path("archive.cb7")) as archive:
...     # Use archive here
...     files = archive.get_filename_list()

__exit__(*_: object) -> None

Context manager exit with proper cleanup.

Ensures the archive caches are cleared to prevent memory leaks and resource issues.

PARAMETER DESCRIPTION
*_

Exception information (ignored)

TYPE: object DEFAULT: ()

Note

This method is called automatically when exiting a 'with' block. It handles exceptions gracefully and always cleans up resources.

copy_from_archive(other_archive: Archiver) -> bool

Copy files from another archive to this 7z archive.

Note

This operation is not supported for 7-Zip archives and will always return False. Converting other archive formats to 7-Zip is not recommended as it may not provide meaningful benefits and could reduce compatibility.

PARAMETER DESCRIPTION
other_archive

The source archive to copy files from.

TYPE: Archiver

RETURNS DESCRIPTION
False

This operation is not supported for 7-Zip archives.

TYPE: bool

Warning

A warning will be logged when this method is called, indicating that the copy operation was attempted on a 7-Zip archive.

Examples:

Python Console Session
>>> seven_zip_archive = SevenZipArchiver(Path("target.cb7"))
>>> zip_archive = ZipArchiver(Path("source.cbz"))
>>> result = seven_zip_archive.copy_from_archive(zip_archive)
>>> print(f"Copy successful: {result}")  # Will print: Copy successful: False

get_filename_list() -> list[str]

Get list of all files in the 7z archive.

Returns a sorted list of all file paths contained in the archive. The list is cached for performance and updated when files are added or removed.

RETURNS DESCRIPTION
list[str]

List of file paths in the archive, sorted alphabetically. Returns empty list if archive doesn't exist or is empty.

Examples:

Python Console Session
>>> with SevenZipArchiver(Path("archive.cb7")) as archive:
...     files = archive.get_filename_list()
...     print(f"Archive contains {len(files)} files:")
...     for file in files:
...         print(f"  - {file}")
...
...     # Check if specific file exists
...     if "config.txt" in files:
...         print("Config file found")
Performance

The filename list is cached after first access, so subsequent calls are very fast. Cache is invalidated when files are added or removed.

Note

File paths use forward slashes regardless of platform.

read_file(archive_file: str) -> bytes

Read a file from the 7z archive.

Reads the specified file from the archive and returns its contents as bytes.

PARAMETER DESCRIPTION
archive_file

Path of the file within the archive. Use forward slashes for directory separators regardless of platform.

TYPE: str

RETURNS DESCRIPTION
bytes

The file contents as bytes. For text files, use .decode() to convert to string.

RAISES DESCRIPTION
ArchiverReadError

If the file cannot be read, doesn't exist, or the archive is corrupted.

Examples:

Python Console Session
>>> with SevenZipArchiver(Path("archive.cb7")) as archive:
...     # Read text file
...     text_data = archive.read_file("config.txt")
...     config = text_data.decode('utf-8')
...
...     # Read binary file
...     image_data = archive.read_file("images/photo.jpg")
...     print(f"Image size: {len(image_data)} bytes")
...
...     # Read file in subdirectory
...     data = archive.read_file("data/records/file.json")

remove_files(filename_list: list[str]) -> bool

Remove multiple files from the 7z archive.

Removes all specified files from the archive in a single operation.

PARAMETER DESCRIPTION
filename_list

List of file paths to remove from the archive.

TYPE: list[str]

RETURNS DESCRIPTION
bool

True if all files were removed successfully (or didn't exist), False if the operation failed.

Examples:

Python Console Session
>>> with SevenZipArchiver(Path("archive.cb7")) as archive:
...     # Remove multiple files at once
...     files_to_remove = ["temp1.txt", "temp2.txt", "old/data.json"]
...     success = archive.remove_files(files_to_remove)
...
...     if success:
...         print("All files removed successfully")
...     else:
...         print("Some files could not be removed")
Note

The operation is atomic - either all files are removed or none are. Files that don't exist are ignored (not treated as errors).

test() -> bool

Test whether the 7z archive is valid.

Checks if the archive file exists and is a valid 7-Zip archive that can be read. This is useful for validating archives before attempting to read from them.

RETURNS DESCRIPTION
bool

True if the archive is valid and can be read, False otherwise.

Examples:

Python Console Session
>>> archive = SevenZipArchiver(Path("archive.cb7"))
>>> if archive.test():
...     print("Archive is valid")
...     with archive:
...         files = archive.get_filename_list()
... else:
...     print("Archive is invalid or corrupted")
Note

This method does not require the archive to be opened with a context manager - it can be called on any instance.

write_file(archive_file: str, data: str | bytes) -> bool

Write a file to the 7z archive.

Writes the specified data to a file in the archive. If the archive already exists, it will be completely reconstructed with the new file added or replaced. This is a limitation of the 7-Zip format.

PARAMETER DESCRIPTION
archive_file

Path of the file within the archive. Use forward slashes for directory separators. Parent directories will be created automatically.

TYPE: str

data

Data to write. Can be a string (will be UTF-8 encoded) or bytes.

TYPE: str | bytes

RETURNS DESCRIPTION
bool

True if the write operation was successful.

RAISES DESCRIPTION
ArchiverWriteError

If the write operation fails due to permissions, disk space, or other I/O errors.

ArchiverReadError

If the archive cannot be read.

Examples:

Python Console Session
>>> with SevenZipArchiver(Path("archive.cb7")) as archive:
...     # Write text file
...     archive.write_file("readme.txt", "This is a readme file")
...
...     # Write binary data
...     with open("image.jpg", "rb") as f:
...         archive.write_file("images/photo.jpg", f.read())
...
...     # Write to subdirectory
...     archive.write_file("data/config.json", '{"setting": "value"}')
Performance

This operation rewrites the entire archive, so it can be slow for large archives. Consider batching multiple writes when possible.