SevenZip
SevenZipArchiver(path: Path)
Bases: Archiver
7-Zip archiver implementation using py7zr library.
This class provides comprehensive 7-Zip archive support following the Archiver interface. It handles reading, writing, and management of 7-Zip archives (.cb7) using the py7zr library with efficient filename caching and batch operations.
Features
- Full read/write support for 7-Zip archives
- LZMA compression for optimal file size reduction
- Filename list caching for improved performance
- Batch operations for multiple files
- Memory-efficient streaming operations
- Context manager support for proper resource cleanup
Limitations
- Requires py7zr >= 1.0.0 to be installed
- Write operations require full archive reconstruction
- No password protection support
- Some advanced 7-Zip features may not be available
- Performance may be slower than native 7-Zip for very large archives
Performance Notes
- Filename lists are cached to avoid repeated archive parsing
- Write operations rewrite the entire archive (7-Zip format limitation)
- Use batch operations (remove_files) when possible
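The cache-then-invalidate behaviour described in these notes can be pictured with a small stand-in. This is only a sketch: `CachedListing`, its `scan` callback, and the `scans` counter are hypothetical names introduced for illustration, and the real class may organise its cache differently.

```python
from __future__ import annotations

from collections.abc import Callable


class CachedListing:
    """Hypothetical helper: caches a sorted filename list until invalidated."""

    def __init__(self, scan: Callable[[], list[str]]) -> None:
        self._scan = scan  # stands in for the expensive archive parse
        self._cache: list[str] | None = None
        self.scans = 0  # counts how often the "archive" was actually parsed

    def get_filename_list(self) -> list[str]:
        if self._cache is None:
            self.scans += 1
            self._cache = sorted(self._scan())
        return list(self._cache)  # copy so callers cannot mutate the cache

    def invalidate(self) -> None:
        """Call after any write or removal."""
        self._cache = None


names = ["b.jpg", "a.jpg"]
listing = CachedListing(lambda: list(names))
first = listing.get_filename_list()
second = listing.get_filename_list()  # served from the cache, no re-scan
names.append("c.jpg")
listing.invalidate()  # simulates what a write or removal would trigger
third = listing.get_filename_list()
```

Only two scans happen for three calls here, which is the saving the notes refer to.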
Thread Safety
This class is NOT thread-safe. Use separate instances for concurrent access or implement external synchronization.
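One way to implement external synchronization is to guard every call on a shared instance with a single lock. The sketch below uses a hypothetical `FakeArchiver` stand-in so that it runs without darkseid installed; with the real class, the same `with archive_lock:` pattern would wrap each archiver call.

```python
from __future__ import annotations

import threading


class FakeArchiver:
    """Stand-in for an archiver; NOT the darkseid class."""

    def __init__(self) -> None:
        self.files: dict[str, bytes] = {}

    def write_file(self, name: str, data: bytes) -> bool:
        self.files[name] = data
        return True


archive = FakeArchiver()
archive_lock = threading.Lock()  # one lock guards every call on the shared instance


def worker(index: int) -> None:
    # Only one thread at a time may touch the archiver.
    with archive_lock:
        archive.write_file(f"page{index}.txt", f"page {index}".encode())


threads = [threading.Thread(target=worker, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```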
Error Handling
All operations can raise ArchiverReadError or ArchiverWriteError exceptions. The class includes comprehensive error handling and logging for troubleshooting.
Examples:
Writing to a cb7 archive:
>>> from pathlib import Path
>>> from darkseid.archivers.sevenzip import SevenZipArchiver
>>>
>>> # Add files
>>> with SevenZipArchiver(Path("comic.cb7")) as archive:
...     # Add text file
...     archive.write_file("metadata.txt", "Comic metadata here")
...
...     # Add binary file (e.g., image)
...     with open("page1.jpg", "rb") as f:
...         archive.write_file("pages/page1.jpg", f.read())
...
...     # Verify files were added
...     files = archive.get_filename_list()
...     print(f"Archive contains: {files}")
Reading from an existing 7z archive:
>>> # Read from existing archive
>>> with SevenZipArchiver(Path("existing.cb7")) as archive:
...     # Check if archive is valid
...     if archive.test():
...         # List all files
...         files = archive.get_filename_list()
...         print(f"Found {len(files)} files")
...
...         # Read specific file
...         if "metadata.txt" in files:
...             content = archive.read_file("metadata.txt")
...             print(f"Metadata: {content.decode()}")
...
...         # Process all files
...         for filename in files:
...             if filename.endswith('.jpg'):
...                 image_data = archive.read_file(filename)
...                 print(f"Image {filename}: {len(image_data)} bytes")
Batch operations for better performance:
>>> with SevenZipArchiver(Path("batch.cb7")) as archive:
...     # Remove multiple files at once
...     files_to_remove = ["temp1.txt", "temp2.txt", "old_data.json"]
...     success = archive.remove_files(files_to_remove)
...
...     if success:
...         print("Batch removal successful")
...     else:
...         print("Some files could not be removed")
Error handling:
>>> from darkseid.archivers import ArchiverReadError, ArchiverWriteError
>>>
>>> try:
...     with SevenZipArchiver(Path("nonexistent.cb7")) as archive:
...         content = archive.read_file("missing.txt")
... except ArchiverReadError as e:
...     print(f"Read error: {e}")
... except ArchiverWriteError as e:
...     print(f"Write error: {e}")
| ATTRIBUTE | DESCRIPTION |
|---|---|
| path | Path to the 7-Zip archive file |
| _filename_list_cache | Cached list of filenames |
Initialize SevenZipArchiver.
Creates a new SevenZipArchiver instance for the specified archive file. The archive file doesn't need to exist yet - it will be created when first written to.
| PARAMETER | DESCRIPTION |
|---|---|
| path | Path to the 7-Zip archive file. Can be any extension, but typically .cb7 for comic book archives. |
Note
The parent directory will be created automatically when writing if it doesn't exist.
Examples:
>>> from pathlib import Path
>>> archiver1 = SevenZipArchiver(Path("my_archive.cb7"))
>>> archiver2 = SevenZipArchiver(Path("comics/issue1.cb7"))
Functions
__enter__() -> Self
Context manager entry for 7-Zip archive operations.
| RETURNS | DESCRIPTION |
|---|---|
| Self | The archiver instance for use in the context. |
Examples:
>>> with SevenZipArchiver(Path("archive.cb7")) as archive:
...     # Use archive here
...     files = archive.get_filename_list()
__exit__(*_: object) -> None
Context manager exit with proper cleanup.
Ensures the archive caches are cleared to prevent memory leaks and resource issues.
| PARAMETER | DESCRIPTION |
|---|---|
| *_ | Exception information (ignored) |
Note
This method is called automatically when exiting a 'with' block. It handles exceptions gracefully and always cleans up resources.
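The enter/exit contract can be sketched with a stand-in class. `CacheClearingArchiver` is a hypothetical name, and the sketch assumes only what the signatures above state: `__enter__` returns the instance, and `__exit__` returns `None`, so an exception raised inside the block still reaches the caller after cleanup.

```python
from __future__ import annotations


class CacheClearingArchiver:
    """Hypothetical stand-in showing the enter/exit contract only."""

    def __init__(self) -> None:
        self._filename_list_cache: list[str] | None = None

    def __enter__(self) -> "CacheClearingArchiver":
        return self

    def __exit__(self, *_: object) -> None:
        # Runs on normal exit and when the block raises. Returning None
        # means the exception is not suppressed.
        self._filename_list_cache = None


archiver = CacheClearingArchiver()
propagated = False
try:
    with archiver as archive:
        archive._filename_list_cache = ["a.jpg"]
        raise ValueError("simulated failure inside the block")
except ValueError:
    propagated = True  # the error reached the caller after cleanup ran
```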
copy_from_archive(other_archive: Archiver) -> bool
Copy files from another archive to this 7z archive.
Note
This operation is not supported for 7-Zip archives and will always return False. Converting other archive formats to 7-Zip is not recommended as it may not provide meaningful benefits and could reduce compatibility.
| PARAMETER | DESCRIPTION |
|---|---|
| other_archive | The source archive to copy files from. |

| RETURNS | DESCRIPTION |
|---|---|
| False | This operation is not supported for 7-Zip archives. |
Warning
A warning will be logged when this method is called, indicating that the copy operation was attempted on a 7-Zip archive.
Examples:
>>> seven_zip_archive = SevenZipArchiver(Path("target.cb7"))
>>> zip_archive = ZipArchiver(Path("source.cbz"))
>>> result = seven_zip_archive.copy_from_archive(zip_archive)
>>> print(f"Copy successful: {result}") # Will print: Copy successful: False
get_filename_list() -> list[str]
Get list of all files in the 7z archive.
Returns a sorted list of all file paths contained in the archive. The list is cached for performance and updated when files are added or removed.
| RETURNS | DESCRIPTION |
|---|---|
| list[str] | List of file paths in the archive, sorted alphabetically. Returns empty list if archive doesn't exist or is empty. |
Examples:
>>> with SevenZipArchiver(Path("archive.cb7")) as archive:
...     files = archive.get_filename_list()
...     print(f"Archive contains {len(files)} files:")
...     for file in files:
...         print(f" - {file}")
...
...     # Check if specific file exists
...     if "config.txt" in files:
...         print("Config file found")
Performance
The filename list is cached after first access, so subsequent calls are very fast. Cache is invalidated when files are added or removed.
Note
File paths use forward slashes regardless of platform.
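When a path comes from platform-specific code, it may need converting before it is compared with entries in the filename list. A minimal sketch using only `pathlib` follows; `to_archive_path` is a hypothetical helper, not part of the archiver API.

```python
from pathlib import PurePosixPath, PureWindowsPath


def to_archive_path(native: str) -> str:
    """Convert a path that may contain backslashes to forward-slash form.

    Caveat: this treats every backslash as a separator, which is wrong for
    the rare POSIX filename that contains a literal backslash.
    """
    return PureWindowsPath(native).as_posix()


windows_style = to_archive_path("pages\\chapter1\\page1.jpg")
posix_style = to_archive_path("pages/chapter1/page1.jpg")  # already normalized
parent = PurePosixPath(windows_style).parent.as_posix()
```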
read_file(archive_file: str) -> bytes
Read a file from the 7z archive.
Reads the specified file from the archive and returns its contents as bytes.
| PARAMETER | DESCRIPTION |
|---|---|
| archive_file | Path of the file within the archive. Use forward slashes for directory separators regardless of platform. |

| RETURNS | DESCRIPTION |
|---|---|
| bytes | The file contents as bytes. For text files, use .decode() to convert to string. |

| RAISES | DESCRIPTION |
|---|---|
| ArchiverReadError | If the file cannot be read, doesn't exist, or the archive is corrupted. |
Examples:
>>> with SevenZipArchiver(Path("archive.cb7")) as archive:
...     # Read text file
...     text_data = archive.read_file("config.txt")
...     config = text_data.decode('utf-8')
...
...     # Read binary file
...     image_data = archive.read_file("images/photo.jpg")
...     print(f"Image size: {len(image_data)} bytes")
...
...     # Read file in subdirectory
...     data = archive.read_file("data/records/file.json")
remove_files(filename_list: list[str]) -> bool
Remove multiple files from the 7z archive.
Removes all specified files from the archive in a single operation.
| PARAMETER | DESCRIPTION |
|---|---|
| filename_list | List of file paths to remove from the archive. |

| RETURNS | DESCRIPTION |
|---|---|
| bool | True if all files were removed successfully (or didn't exist), False if the operation failed. |
Examples:
>>> with SevenZipArchiver(Path("archive.cb7")) as archive:
...     # Remove multiple files at once
...     files_to_remove = ["temp1.txt", "temp2.txt", "old/data.json"]
...     success = archive.remove_files(files_to_remove)
...
...     if success:
...         print("All files removed successfully")
...     else:
...         print("Some files could not be removed")
Note
The operation is atomic - either all files are removed or none are. Files that don't exist are ignored (not treated as errors).
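The all-or-nothing contract can be modelled on a plain dictionary. This sketch only illustrates the semantics described in the note; `remove_all_or_nothing` is a hypothetical function and says nothing about how the archiver implements removal internally.

```python
from __future__ import annotations


def remove_all_or_nothing(
    members: dict[str, bytes], names: list[str]
) -> tuple[dict[str, bytes], bool]:
    """Return (members_after, success) without mutating `members`."""
    try:
        unwanted = set(names)  # raises TypeError on unhashable entries
    except TypeError:
        return members, False  # failure: caller keeps the original mapping
    # Names that are absent simply match nothing, so they are not errors.
    remaining = {k: v for k, v in members.items() if k not in unwanted}
    return remaining, True


original = {"temp1.txt": b"a", "page1.jpg": b"b"}
after, ok = remove_all_or_nothing(original, ["temp1.txt", "missing.txt"])
# A malformed request fails as a whole and leaves everything in place.
unchanged, failed_ok = remove_all_or_nothing(original, [["not", "a", "name"]])
```

Because the result is built as a new mapping and only swapped in on success, a failure never leaves a half-updated state.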
test() -> bool
Test whether the 7z archive is valid.
Checks if the archive file exists and is a valid 7-Zip archive that can be read. This is useful for validating archives before attempting to read from them.
| RETURNS | DESCRIPTION |
|---|---|
| bool | True if the archive is valid and can be read, False otherwise. |
Examples:
>>> archive = SevenZipArchiver(Path("archive.cb7"))
>>> if archive.test():
...     print("Archive is valid")
...     with archive:
...         files = archive.get_filename_list()
... else:
...     print("Archive is invalid or corrupted")
Note
This method does not require the archive to be opened with a context manager - it can be called on any instance.
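For a cheap pre-check before calling `test()`, the six-byte 7z file signature can be inspected with the standard library alone. This is an independent sketch, not a description of how `test()` works: `looks_like_7z` is a hypothetical helper, and a matching signature does not prove that the rest of the archive is intact.

```python
import tempfile
from pathlib import Path

SEVEN_ZIP_MAGIC = b"7z\xbc\xaf\x27\x1c"  # first six bytes of a 7z file


def looks_like_7z(path: Path) -> bool:
    """Return True only if the file starts with the 7z signature."""
    try:
        with path.open("rb") as handle:
            return handle.read(len(SEVEN_ZIP_MAGIC)) == SEVEN_ZIP_MAGIC
    except OSError:
        return False  # missing or unreadable file


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.cb7"
    good.write_bytes(SEVEN_ZIP_MAGIC + b"\x00" * 26)  # signature plus padding
    bad = Path(tmp) / "bad.cb7"
    bad.write_bytes(b"PK\x03\x04 not a 7z file")
    missing = Path(tmp) / "missing.cb7"
    results = (looks_like_7z(good), looks_like_7z(bad), looks_like_7z(missing))
```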
write_file(archive_file: str, data: str | bytes) -> bool
Write a file to the 7z archive.
Writes the specified data to a file in the archive. If the archive already exists, it will be completely reconstructed with the new file added or replaced. This is a limitation of the 7-Zip format.
| PARAMETER | DESCRIPTION |
|---|---|
| archive_file | Path of the file within the archive. Use forward slashes for directory separators. Parent directories will be created automatically. |
| data | Data to write. Can be a string (will be UTF-8 encoded) or bytes. |

| RETURNS | DESCRIPTION |
|---|---|
| bool | True if the write operation was successful. |

| RAISES | DESCRIPTION |
|---|---|
| ArchiverWriteError | If the write operation fails due to permissions, disk space, or other I/O errors. |
| ArchiverReadError | If the archive cannot be read. |
Examples:
>>> with SevenZipArchiver(Path("archive.cb7")) as archive:
...     # Write text file
...     archive.write_file("readme.txt", "This is a readme file")
...
...     # Write binary data
...     with open("image.jpg", "rb") as f:
...         archive.write_file("images/photo.jpg", f.read())
...
...     # Write to subdirectory
...     archive.write_file("data/config.json", '{"setting": "value"}')
Performance
This operation rewrites the entire archive, so it can be slow for large archives. Consider batching multiple writes when possible.
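To see why each write costs a full rebuild, and why grouping writes helps, here is a sketch of the read-everything, write-everything pattern. It deliberately uses the standard library's `zipfile` as a stand-in so that it runs without py7zr; `rewrite_with` is a hypothetical helper and is not how `write_file` is implemented.

```python
from __future__ import annotations

import os
import tempfile
import zipfile
from pathlib import Path


def rewrite_with(archive: Path, updates: dict[str, str | bytes]) -> None:
    """Rebuild `archive` once, adding or replacing every entry in `updates`."""
    merged: dict[str, bytes] = {}
    if archive.exists():
        with zipfile.ZipFile(archive) as src:
            merged = {name: src.read(name) for name in src.namelist()}
    for name, data in updates.items():
        # Strings are UTF-8 encoded, bytes are stored unchanged.
        merged[name] = data.encode("utf-8") if isinstance(data, str) else data

    archive.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file first so a failure leaves the original intact.
    fd, tmp_name = tempfile.mkstemp(dir=archive.parent, suffix=".tmp")
    os.close(fd)
    try:
        with zipfile.ZipFile(tmp_name, "w", zipfile.ZIP_DEFLATED) as dst:
            for name, payload in merged.items():
                dst.writestr(name, payload)
        os.replace(tmp_name, archive)
    except BaseException:
        os.unlink(tmp_name)
        raise


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "comics" / "demo.zip"
    # Two entries, one rebuild.
    rewrite_with(target, {"readme.txt": "first", "pages/p1.bin": b"\x00\x01"})
    rewrite_with(target, {"readme.txt": "second"})  # replaces one member
    with zipfile.ZipFile(target) as check:
        names = sorted(check.namelist())
        readme = check.read("readme.txt")
```

Passing several entries in one call pays the rebuild cost once, whereas one call per file pays it every time.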