Archiver

ArchiverError

Bases: Exception

Base exception for archiver operations.

This is the parent class for all archiver-related exceptions. Concrete implementations should raise more specific exceptions that inherit from this base class.

Examples:

Python Console Session
>>> try:
...     archive.read_file("nonexistent.txt")
... except ArchiverError as e:
...     print(f"Archive operation failed: {e}")
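Because the specific exceptions derive from ArchiverError, handlers can be ordered from most specific to most general. A minimal sketch, using local stand-in classes that mirror the documented hierarchy (they are redefined here only to keep the example self-contained) and a hypothetical `describe_failure` helper:

```python
# Stand-in classes mirroring the documented hierarchy; in real code,
# import them from the library instead of redefining them.
class ArchiverError(Exception):
    """Base exception for archiver operations."""


class ArchiverReadError(ArchiverError):
    """Raised when reading from an archive fails."""


class ArchiverWriteError(ArchiverError):
    """Raised when writing to an archive fails."""


def describe_failure(exc: Exception) -> str:
    """Hypothetical helper: classify an exception, most specific first."""
    try:
        raise exc
    except ArchiverReadError:
        return "read"
    except ArchiverWriteError:
        return "write"
    except ArchiverError:
        return "archiver"
    except Exception:
        return "other"


print(describe_failure(ArchiverReadError("missing.txt")))  # read
print(describe_failure(ArchiverError("generic failure")))  # archiver
```

Placing the base-class handler last keeps it as a catch-all for any archiver failure that has no dedicated handler.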

ArchiverReadError

Bases: ArchiverError

Raised when reading from archive fails.

This exception is raised when:

  • A file doesn't exist in the archive
  • The archive is corrupted or unreadable
  • Permission issues prevent reading
  • The archive format is unsupported

Examples:

Python Console Session
>>> try:
...     content = archive.read_file("missing.txt")
... except ArchiverReadError as e:
...     print(f"Failed to read file: {e}")

ArchiverWriteError

Bases: ArchiverError

Raised when writing to archive fails.

This exception is raised when:

  • The archive is read-only
  • Disk space is insufficient
  • Permission issues prevent writing
  • The archive format doesn't support the operation

Examples:

Python Console Session
>>> try:
...     archive.write_file("new.txt", "content")
... except ArchiverWriteError as e:
...     print(f"Failed to write file: {e}")

Archiver(path: Path)

Bases: ABC

Abstract base class for archive operations.

Provides a common interface for reading, writing, and managing files within different archive formats. This class defines the contract that all concrete archive implementations must follow.

The class is designed to work with various archive formats, such as ZIP, RAR, and TAR. Each concrete implementation should handle the specifics of its archive format while maintaining the same public interface.

Features
  • Context manager support for automatic resource cleanup
  • Unified error handling with specific exception types
  • Support for both text and binary data
  • Batch operations for multiple files
  • Archive-to-archive copying functionality
  • Built-in file existence checking

Thread Safety

The base class does not provide thread safety guarantees. Concrete implementations should document their thread safety characteristics and implement appropriate locking if needed.
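One way a concrete implementation might add locking is to guard every access to shared state with a single lock. The sketch below is an assumption for illustration: `LockedStore` is a hypothetical in-memory stand-in, not a class from this library.

```python
import threading


class LockedStore:
    """Hypothetical in-memory stand-in for a concrete archiver."""

    def __init__(self) -> None:
        self._files: dict[str, bytes] = {}
        self._lock = threading.Lock()  # guards every access to _files

    def write_file(self, archive_file: str, data: bytes) -> bool:
        with self._lock:
            self._files[archive_file] = data
            return True

    def get_filename_list(self) -> list[str]:
        with self._lock:
            return sorted(self._files)


store = LockedStore()
threads = [
    threading.Thread(target=store.write_file, args=(f"page{i}.txt", b"x"))
    for i in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(store.get_filename_list()))  # 8
```

A single coarse lock is the simplest option; whether finer-grained locking is worthwhile depends on the archive format and access pattern.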

Performance Considerations
  • File operations are performed individually; batch operations may be more efficient for large numbers of files
  • The get_filename_list() method may be expensive for large archives
  • Consider caching file lists in concrete implementations
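The caching suggestion above could look like the following inside a concrete implementation. This is a sketch under stated assumptions: `CachingStore` and its `scans` counter are hypothetical, and the dictionary stands in for a real archive whose listing is expensive to build.

```python
class CachingStore:
    """Hypothetical stand-in showing a cached file list."""

    def __init__(self) -> None:
        self._files: dict[str, bytes] = {}
        self._name_cache = None  # None means "not built yet"
        self.scans = 0  # counts how often the "expensive" listing runs

    def get_filename_list(self) -> list[str]:
        if self._name_cache is None:
            self.scans += 1
            self._name_cache = sorted(self._files)
        return list(self._name_cache)  # copy so callers cannot mutate the cache

    def write_file(self, archive_file: str, data: bytes) -> bool:
        self._files[archive_file] = data
        self._name_cache = None  # contents changed, so drop the cache
        return True


store = CachingStore()
store.write_file("a.txt", b"1")
store.get_filename_list()
store.get_filename_list()
print(store.scans)  # 1
```

The important detail is invalidation: any operation that changes the archive contents must clear the cached list.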
ATTRIBUTE DESCRIPTION
IMAGE_EXT_RE

Compiled regex for matching image file extensions. Matches: .jpg, .jpeg, .png, .webp, .gif (case-insensitive)
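The exact pattern is not shown here, so the regex below is an assumed equivalent that matches the listed extensions case-insensitively; the file names are made up for the example.

```python
import re

# Assumed pattern: the documentation lists which extensions match,
# not the regex itself.
IMAGE_EXT_RE = re.compile(r"\.(jpe?g|png|webp|gif)$", re.IGNORECASE)

names = ["cover.JPG", "page01.webp", "notes.txt", "ComicInfo.xml", "art.jpeg"]
images = [name for name in names if IMAGE_EXT_RE.search(name)]
print(images)  # ['cover.JPG', 'page01.webp', 'art.jpeg']
```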

Examples:

Implementing a concrete archiver:

Python Console Session
>>> class MyArchiver(Archiver):
...     def read_file(self, archive_file: str) -> bytes:
...         # Implementation specific to your archive format
...         pass
...
...     def write_file(self, archive_file: str, data: str | bytes) -> bool:
...         # Implementation specific to your archive format
...         pass
...
...     # ... implement other abstract methods

Initialize an Archiver with the specified path.

Creates a new archiver instance that will operate on the specified archive file. Path validation is performed during initialization, but actual archive operations are deferred until methods are called.

PARAMETER DESCRIPTION
path

Path to the archive file. Can be an existing file or a path where a new archive will be created. The path should include the appropriate file extension for the archive format.

TYPE: Path

RAISES DESCRIPTION
FileNotFoundError

If the archive file doesn't exist and the archiver is expected to perform read operations on an existing archive. This is determined by the is_write_operation_expected() method.

Note

The constructor does not immediately open or create the archive. The actual file operations are performed when methods are called. This allows for lazy initialization and better error handling.

Examples:

Python Console Session
>>> from pathlib import Path
>>>
>>> # For existing archives
>>> archiver = MyArchiver(Path("existing.cbz"))
>>>
>>> # For new archives to be created
>>> archiver = MyArchiver(Path("new_archive.cbz"))
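The interaction between the constructor and is_write_operation_expected() described in the Raises entry can be sketched as follows. This is a hypothetical stand-in written from that description, not the library's actual constructor; `SketchArchiver` and `ReadOnlySketchArchiver` are names invented for the example.

```python
from pathlib import Path


class SketchArchiver:
    """Hypothetical stand-in; not the library's actual constructor."""

    def __init__(self, path: Path) -> None:
        self._path = path
        # Only the filesystem is checked; the archive itself is not opened here.
        if not self.is_write_operation_expected() and not path.exists():
            raise FileNotFoundError(f"Archive not found: {path}")

    @property
    def path(self) -> Path:
        return self._path

    def is_write_operation_expected(self) -> bool:
        return True  # documented default


class ReadOnlySketchArchiver(SketchArchiver):
    def is_write_operation_expected(self) -> bool:
        return False  # expects an existing archive


missing = Path("no_such_archive_for_sketch.cbz")
writer = SketchArchiver(missing)  # accepted: the archive may be created later
try:
    ReadOnlySketchArchiver(missing)
except FileNotFoundError as exc:
    print(f"rejected: {exc}")
```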

Attributes

path: Path property

Get the path associated with this archiver.

RETURNS DESCRIPTION
Path

The Path object representing the archive file location.

Examples:

Python Console Session
>>> archiver = MyArchiver(Path("example.cbz"))
>>> print(archiver.path)  # Output: example.cbz

Functions

__enter__()

Context manager entry.

Enables the archiver to be used in a 'with' statement for automatic resource management. The archiver will be properly initialized and any necessary resources will be acquired.

Examples:

Python Console Session
>>> with MyArchiver(Path("archive.cbz")) as archive:
...     content = archive.read_file("file.txt")
...     # Archive is automatically closed when exiting the block

__exit__(exc_type: object, exc_val: object, exc_tb: object) -> None

Context manager exit.

Performs cleanup when exiting a 'with' statement. This method should be overridden by concrete implementations to release any resources such as file handles, network connections, or temporary files.

PARAMETER DESCRIPTION
exc_type

The exception type if an exception was raised, None otherwise.

TYPE: object

exc_val

The exception value if an exception was raised, None otherwise.

TYPE: object

exc_tb

The exception traceback if an exception was raised, None otherwise.

TYPE: object

Note

This base implementation does nothing. Concrete implementations should override this method to perform proper cleanup.
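An override that releases a file handle might look like the sketch below. `HandleOwningArchiver` is a hypothetical stand-in built on the standard-library `zipfile` module and an in-memory buffer; a real implementation would manage whatever resources its format requires.

```python
import io
import zipfile


class HandleOwningArchiver:
    """Hypothetical stand-in that owns an open ZIP handle."""

    def __init__(self, buffer: io.BytesIO) -> None:
        self._buffer = buffer
        self._zip = None

    def __enter__(self):
        self._zip = zipfile.ZipFile(self._buffer, mode="w")
        return self

    def __exit__(self, exc_type: object, exc_val: object, exc_tb: object) -> None:
        # Release the handle whether or not the block raised.
        if self._zip is not None:
            self._zip.close()
            self._zip = None

    def write_file(self, archive_file: str, data: bytes) -> bool:
        self._zip.writestr(archive_file, data)
        return True


buffer = io.BytesIO()
with HandleOwningArchiver(buffer) as archive:
    archive.write_file("file.txt", b"hello")

# The handle was closed on exit, so the ZIP data is complete and readable.
with zipfile.ZipFile(buffer) as check:
    print(check.namelist())  # ['file.txt']
```

Returning None from `__exit__` means exceptions raised inside the block still propagate to the caller.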

copy_from_archive(other_archive: Archiver) -> bool abstractmethod

Copy files from another archive to this archive.

Copies all files from the source archive to this archive. This is useful for converting between archive formats, merging archives, or creating backups.

PARAMETER DESCRIPTION
other_archive

Source archive to copy files from. Must be a valid Archiver instance that can read files.

TYPE: Archiver

RETURNS DESCRIPTION
bool

True if all files were successfully copied, False if any copy operation failed. The operation may be partially successful.

Examples:

Python Console Session
>>> # Copy from ZIP to 7Z
>>> with ZipArchiver(Path("source.cbz")) as source:
...     with SevenZipArchiver(Path("destination.7z")) as dest:
...         success = dest.copy_from_archive(source)
>>>
>>> # Merge two archives
>>> with ZipArchiver(Path("archive1.cbz")) as arch1:
...     with ZipArchiver(Path("archive2.cbz")) as arch2:
...         success = arch2.copy_from_archive(arch1)

Note
  • Files with the same name will be overwritten in the destination
  • The source archive must be readable and the destination writable
  • Large archives may take significant time to copy
  • Consider implementing progress callbacks in concrete classes
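The copy semantics described above (overwrite on name clash, keep going after a failure, report overall success) can be sketched with the other interface methods. `MemoryArchive` is a hypothetical in-memory stand-in, and its `read_only` flag exists only to simulate a failing destination.

```python
class MemoryArchive:
    """Hypothetical in-memory stand-in for a concrete archiver."""

    def __init__(self, files=None, read_only=False):
        self._files = dict(files or {})
        self._read_only = read_only

    def get_filename_list(self):
        return sorted(self._files)

    def read_file(self, archive_file):
        return self._files[archive_file]

    def write_file(self, archive_file, data):
        if self._read_only:
            return False
        self._files[archive_file] = data  # same name: overwrite
        return True

    def copy_from_archive(self, other_archive):
        all_copied = True
        for name in other_archive.get_filename_list():
            data = other_archive.read_file(name)
            if not self.write_file(name, data):
                all_copied = False  # remember the failure, keep copying
        return all_copied


source = MemoryArchive({"a.txt": b"new", "b.txt": b"b"})
dest = MemoryArchive({"a.txt": b"old"})
print(dest.copy_from_archive(source))  # True
print(dest.read_file("a.txt"))         # b'new'
```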

exists(archive_file: str) -> bool

Check if a file exists in the archive.

Determines whether a specific file exists within the archive without actually reading its contents. This is useful for conditional operations and avoiding exceptions when checking for file presence.

PARAMETER DESCRIPTION
archive_file

Path of the file to check within the archive. Should use forward slashes as path separators.

TYPE: str

RETURNS DESCRIPTION
bool

True if the file exists in the archive, False otherwise.

Examples:

Python Console Session
>>> # Check before reading
>>> if archive.exists("config.txt"):
...     content = archive.read_file("config.txt")
... else:
...     print("Config file not found")
>>>
>>> # Conditional writing
>>> if not archive.exists("backup.txt"):
...     archive.write_file("backup.txt", "backup data")

Performance Note

This method calls get_filename_list() internally, which may be expensive for large archives. Consider caching the file list if you need to check existence of multiple files.
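When checking many names, a caller can fetch the list once and test membership against a set. In this sketch, `ListingArchive` is a hypothetical stand-in whose `exists()` follows the behavior described in the note above, and `listing_calls` is a counter added only to make the cost visible.

```python
class ListingArchive:
    """Hypothetical stand-in that counts how often the listing is built."""

    def __init__(self, names):
        self._names = list(names)
        self.listing_calls = 0

    def get_filename_list(self):
        self.listing_calls += 1
        return list(self._names)

    def exists(self, archive_file):
        # Mirrors the documented behavior: one listing per call.
        return archive_file in self.get_filename_list()


archive = ListingArchive(["config.txt", "data/users.json"])
wanted = ["config.txt", "backup.txt", "data/users.json"]

# One listing, reused for every membership test.
known = set(archive.get_filename_list())
present = [name for name in wanted if name in known]
print(present)                # ['config.txt', 'data/users.json']
print(archive.listing_calls)  # 1
```

Calling `exists()` for each of the three names would instead build the listing three times.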

get_filename_list() -> list[str] abstractmethod

Get a list of all files in the archive.

Returns a complete list of all files contained in the archive. The returned paths use forward slashes as separators and are relative to the archive root.

RETURNS DESCRIPTION
list[str]

List of file paths within the archive. The list is sorted alphabetically by most implementations. Returns an empty list if the archive is empty or doesn't exist.

Examples:

Python Console Session
>>> files = archive.get_filename_list()
>>> print(files)
>>> # Output: ['config.txt', 'data/users.json', 'images/logo.png']
>>>
>>> # Check if archive is empty
>>> if not files:
...     print("Archive is empty")
>>>
>>> # Filter for specific file types
>>> text_files = [f for f in files if f.endswith('.txt')]
>>> image_files = [f for f in files if archive.IMAGE_EXT_RE.search(f)]

Performance Note

This method may be expensive for large archives as it typically requires reading the archive's central directory. Consider caching the result if you need to call this method multiple times.
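Because archive paths use forward slashes regardless of platform, native Windows-style paths need converting before they are compared with the returned names. A small sketch using the standard-library `pathlib`; the `to_archive_path` helper is hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath


def to_archive_path(native_path: str) -> str:
    """Hypothetical helper: convert a Windows-style path to archive form."""
    return PureWindowsPath(native_path).as_posix()


print(to_archive_path("data\\users.json"))     # data/users.json
print(PurePosixPath("images/logo.png").parts)  # ('images', 'logo.png')
```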

is_write_operation_expected() -> bool

Check if this archiver is expected to be used for write operations.

This method helps determine whether the archiver will be used primarily for writing (creating new archives) or reading (accessing existing archives). It's used during path validation to determine if a missing file should trigger a warning.

RETURNS DESCRIPTION
bool

True if write operations are expected (default), False if the archiver is read-only or primarily intended for reading existing archives.

Note

Override this method in read-only implementations to return False. This will change the validation behavior to expect existing files.

Examples:

Python Console Session
>>> class ReadOnlyArchiver(Archiver):
...     def is_write_operation_expected(self) -> bool:
...         return False  # This archiver only reads existing archives

read_file(archive_file: str) -> bytes abstractmethod

Read the contents of a file from the archive.

Extracts and returns the complete contents of the specified file from the archive. The file path should use forward slashes as separators regardless of the operating system.

PARAMETER DESCRIPTION
archive_file

Path of the file within the archive. Should use forward slashes as path separators (e.g., "folder/file.txt"). The path is relative to the archive root.

TYPE: str

RETURNS DESCRIPTION
bytes

The complete file contents as bytes. For text files, you'll need to decode the bytes using the appropriate encoding.

RAISES DESCRIPTION
ArchiverReadError

If the file cannot be read. This includes cases where the file doesn't exist, the archive is corrupted, or there are permission issues.

Examples:

Python Console Session
>>> # Reading a text file
>>> content = archive.read_file("config.txt")
>>> config_text = content.decode('utf-8')
>>>
>>> # Reading a binary file
>>> image_data = archive.read_file("images/photo.jpg")
>>> with open("extracted_photo.jpg", "wb") as f:
...     f.write(image_data)

Note

The entire file is loaded into memory. For very large files, consider implementing streaming read methods in concrete classes.
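A streaming read for a ZIP-backed implementation could be built on `zipfile.ZipFile.open`, which returns a file-like object that can be read in pieces. The `iter_file_chunks` helper and its default chunk size are assumptions for illustration, not part of the Archiver interface.

```python
import io
import zipfile
from collections.abc import Iterator


def iter_file_chunks(zf: zipfile.ZipFile, archive_file: str,
                     chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Hypothetical helper: yield a member's contents in chunks."""
    with zf.open(archive_file) as member:
        while chunk := member.read(chunk_size):
            yield chunk


# Build a small in-memory archive to read back.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("images/photo.jpg", b"\x00" * 150_000)

with zipfile.ZipFile(buffer) as zf:
    sizes = [len(chunk) for chunk in iter_file_chunks(zf, "images/photo.jpg")]
print(sum(sizes))  # 150000
```

Only one chunk is held in memory at a time, at the cost of the caller having to process data incrementally.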

remove_files(filename_list: list[str]) -> bool abstractmethod

Remove multiple files from the archive.

Batch operation to remove multiple files from the archive in a single call.

PARAMETER DESCRIPTION
filename_list

List of file paths to remove from the archive. Each path should use forward slashes as separators.

TYPE: list[str]

RETURNS DESCRIPTION
bool

True if ALL files were successfully removed (or didn't exist), False if ANY file removal failed. The operation may be partially successful: some files may be removed even if the method returns False.

Examples:

Python Console Session
>>> # Remove multiple files at once
>>> files_to_remove = ["old1.txt", "old2.txt", "temp/cache.dat"]
>>> success = archive.remove_files(files_to_remove)
>>> if success:
...     print("All files removed successfully")
... else:
...     print("Some files may not have been removed")

Note

Whether this operation is atomic depends on the archive format and implementation. Some formats may support atomic batch operations, while others may process files individually.
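One possible approach for a ZIP-backed implementation is to rewrite the archive without the unwanted members, which removes them all in a single pass. The `remove_members` helper below is hypothetical and works on in-memory bytes to stay self-contained; names that are not present are simply ignored.

```python
import io
import zipfile


def remove_members(data: bytes, filename_list: list[str]) -> bytes:
    """Hypothetical helper: return a new ZIP without the listed members."""
    to_remove = set(filename_list)
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src:
        with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
            for info in src.infolist():
                if info.filename not in to_remove:
                    dst.writestr(info.filename, src.read(info.filename))
    return out.getvalue()


buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    for name in ("old1.txt", "old2.txt", "keep.txt"):
        zf.writestr(name, b"data")

rewritten = remove_members(buffer.getvalue(), ["old1.txt", "old2.txt", "absent.txt"])
with zipfile.ZipFile(io.BytesIO(rewritten)) as zf:
    print(zf.namelist())  # ['keep.txt']
```

This sketch copies only names and contents; a fuller version would also carry over per-member metadata such as timestamps.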

test() -> bool abstractmethod

Test whether the archive is valid.

RETURNS DESCRIPTION
bool

True if the file is a valid archive, False otherwise.

TYPE: bool
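For a ZIP-backed implementation, a validity check could combine `zipfile.is_zipfile` with `ZipFile.testzip`, both from the standard library. The `looks_like_valid_zip` function is a hypothetical sketch of such a check, not the behavior of any particular concrete class.

```python
import io
import zipfile


def looks_like_valid_zip(data: bytes) -> bool:
    """Hypothetical check: ZIP structure found and every member passes its CRC."""
    stream = io.BytesIO(data)
    if not zipfile.is_zipfile(stream):
        return False
    try:
        with zipfile.ZipFile(stream) as zf:
            # testzip() returns the name of the first bad member, or None.
            return zf.testzip() is None
    except zipfile.BadZipFile:
        return False


buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("page01.png", b"pixels")

print(looks_like_valid_zip(buffer.getvalue()))  # True
print(looks_like_valid_zip(b"not an archive"))  # False
```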

write_file(archive_file: str, data: str | bytes) -> bool abstractmethod

Write data to a file in the archive.

Creates or overwrites a file in the archive with the provided data. If the file already exists, it will be replaced. Directory structure within the archive is created automatically as needed.

PARAMETER DESCRIPTION
archive_file

Path of the file within the archive. Should use forward slashes as path separators (e.g., "folder/file.txt"). The path is relative to the archive root.

TYPE: str

data

Data to write to the file. Can be either a string (which will be encoded as UTF-8) or bytes for binary data.

TYPE: str | bytes

RETURNS DESCRIPTION
bool

True if the write operation was successful, False otherwise. Note that returning False doesn't necessarily mean an error occurred; check the logs for details.

RAISES DESCRIPTION
ArchiverWriteError

If the write operation fails due to serious errors such as disk full, permission denied, or archive format limitations.

Examples:

Python Console Session
>>> # Writing text content
>>> success = archive.write_file("config.txt", "setting=value")
>>>
>>> # Writing binary content
>>> with open("image.jpg", "rb") as f:
...     image_data = f.read()
>>> success = archive.write_file("images/photo.jpg", image_data)
>>>
>>> # Writing JSON data
>>> import json
>>> data = {"name": "example", "version": "1.0"}
>>> success = archive.write_file("data.json", json.dumps(data))

Note

String data is automatically encoded as UTF-8 bytes. For other encodings, encode the string manually before passing it to this method.