File Operations¶
Functions for working with files and directories.
ArchiveType ¶
Bases: Enum
Enumeration of supported archive formats.
Maps to formats recognized by shutil.make_archive and shutil.unpack_archive.
from_path classmethod ¶
from_path(path)
Infer the archive type from a file path.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | Path | Path to an existing archive file. | required |
Returns:
| Type | Description |
|---|---|
| ArchiveType | The inferred archive type. |
Raises:
| Type | Description |
|---|---|
| ValueError | If the path does not exist, is a directory, or is not a recognized archive. |
Source code in src/aibs_informatics_core/utils/file_operations.py
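As a rough illustration of suffix-based inference, the sketch below looks up a file name against the extensions registered with `shutil`. The helper name `infer_unpack_format` is hypothetical and this is not the library's `ArchiveType.from_path`: it inspects the name only and, unlike `from_path`, does not check that the path exists.

```python
import shutil
from pathlib import Path
from typing import Optional


def infer_unpack_format(path: Path) -> Optional[str]:
    """Return the shutil format name whose extension matches `path`, else None.

    Hypothetical helper for illustration; it only inspects the file name.
    """
    name = path.name.lower()
    best: Optional[str] = None
    best_len = 0
    # get_unpack_formats() yields (name, extensions, description) tuples.
    for fmt_name, extensions, _description in shutil.get_unpack_formats():
        for ext in extensions:
            # Prefer the longest matching extension in case several match.
            if name.endswith(ext) and len(ext) > best_len:
                best, best_len = fmt_name, len(ext)
    return best
```

A caller could treat a `None` result the way `from_path` treats an unrecognized archive, for example by raising `ValueError`.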
is_archive classmethod ¶
is_archive(path)
Check if a path is any supported archive type.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | Path | Path to check. | required |
Returns:
| Type | Description |
|---|---|
| bool | True if the path is any supported archive type. |
is_archive_type ¶
is_archive_type(path)
Check if a path matches this archive type.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | Path | Path to check. | required |
Returns:
| Type | Description |
|---|---|
| bool | True if the path matches this archive type. |
CannotAcquirePathLockError ¶
Bases: Exception
Raised when a path lock cannot be acquired.
PathLock dataclass ¶
PathLock(path, lock_root=None, raise_if_locked=False)
A context manager for acquiring and releasing locks on a file or directory path.
If lock_root is provided, the lock file is created in that directory and named after the hash of the path. Otherwise, a lock file with the same name as the path plus a .lock extension is created.
Providing an explicit lock root is useful if you don't want processes to read the lock file from the same directory as the file being locked.
Attributes:
| Name | Type | Description |
|---|---|---|
| path | Union[str, Path] | The path to the file. |
| lock_root | Optional[Union[str, Path]] | The root directory for lock files. If provided, a lock file will be created in this directory with the name of the hash of the path. Otherwise, a lock file with the same name as the path and a .lock extension will be created. Defaults to None. |
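The naming rule for the lock file can be sketched as below. The helper name `lock_file_for` is hypothetical, and the choice of SHA-256 is an assumption, since the documentation does not say which hash is used. Appending `.lock` to the full file name (rather than replacing the suffix) is likewise an assumption.

```python
import hashlib
from pathlib import Path
from typing import Optional, Union


def lock_file_for(
    path: Union[str, Path], lock_root: Optional[Union[str, Path]] = None
) -> Path:
    """Return where a lock file for `path` would live (illustrative only)."""
    p = Path(path)
    if lock_root is None:
        # No lock root: place "<name>.lock" next to the locked path.
        return p.with_name(p.name + ".lock")
    # Lock root given: name the lock file after a hash of the path.
    # SHA-256 is an assumption; the real hash function is not documented here.
    digest = hashlib.sha256(str(p).encode("utf-8")).hexdigest()
    return Path(lock_root) / digest
```

Hashing the path keeps lock file names flat and unique inside a shared lock directory, whatever the depth of the original path.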
acquire ¶
acquire()
Acquire the file lock.
Raises:
| Type | Description |
|---|---|
| CannotAcquirePathLockError | If the lock cannot be acquired. |
release ¶
release()
Release the file lock and remove the lock file.
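To make the acquire/release lifecycle concrete, here is a minimal sketch of a lock-file context manager. It is not the library's PathLock: the class `SimplePathLock` and the exception `LockUnavailableError` are hypothetical names, and exclusive file creation (`O_EXCL`) is assumed as the locking mechanism because the documentation does not state which one is used.

```python
import os
from pathlib import Path
from typing import Optional


class LockUnavailableError(Exception):
    """Hypothetical stand-in for CannotAcquirePathLockError."""


class SimplePathLock:
    """Illustrative lock based on exclusive creation of a lock file."""

    def __init__(self, lock_file: Path) -> None:
        self.lock_file = Path(lock_file)
        self._fd: Optional[int] = None

    def acquire(self) -> None:
        try:
            # O_EXCL makes creation fail if the lock file already exists.
            self._fd = os.open(
                self.lock_file, os.O_CREAT | os.O_EXCL | os.O_WRONLY
            )
        except FileExistsError as e:
            raise LockUnavailableError(f"{self.lock_file} is locked") from e

    def release(self) -> None:
        # Close the descriptor and remove the lock file, if we hold the lock.
        if self._fd is not None:
            os.close(self._fd)
            self._fd = None
            self.lock_file.unlink()

    def __enter__(self) -> "SimplePathLock":
        self.acquire()
        return self

    def __exit__(self, *exc_info) -> None:
        self.release()
```

Used as `with SimplePathLock(lock_file): ...`, the lock file exists for the duration of the block and is removed on exit, mirroring the acquire and release steps described above.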
copy_path ¶
copy_path(source_path, destination_path, exists_ok=False)
Copy a file or directory from the source path to the destination path.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source_path | Path | source path | required |
| destination_path | Path | destination path | required |
| exists_ok | bool | if true, overwrites destination. Defaults to False. | False |
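A minimal sketch of what a copy with an `exists_ok` flag might look like using only the standard library. The function name `copy_path_sketch` is hypothetical, and raising `FileExistsError` when the destination exists and `exists_ok` is false is an assumption; the documentation only says that `exists_ok=True` overwrites.

```python
import shutil
from pathlib import Path


def copy_path_sketch(
    source: Path, destination: Path, exists_ok: bool = False
) -> None:
    """Copy a file or directory tree (illustrative, not the library code)."""
    if destination.exists() and not exists_ok:
        # Assumed behavior: refuse to overwrite unless explicitly allowed.
        raise FileExistsError(f"{destination} already exists")
    if source.is_dir():
        # dirs_exist_ok (Python 3.8+) lets copytree write into an existing tree.
        shutil.copytree(source, destination, dirs_exist_ok=exists_ok)
    else:
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, destination)
```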
extract_archive ¶
extract_archive(source_path, destination_path=None)
Untar/unzip a data batch into a dedicated folder.
Example: batch_of_samples.tar.gz -> batch_of_samples
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source_path | Path | archived batch data | required |
| destination_path | Optional[Path] | Optional destination path for extracted data. If None, the source path name is used with the extension removed. | None |

Raises:
| Type | Description |
|---|---|
| RuntimeError | If extraction fails |

Returns:
| Type | Description |
|---|---|
| Path | path to the untarred data |
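The default-destination rule (archive name with its extension removed) can be sketched with `shutil.unpack_archive`. The names `extract_archive_sketch` and `_strip_archive_extension` are hypothetical, and the set of exceptions wrapped into `RuntimeError` is an assumption about what "extraction fails" covers.

```python
import shutil
from pathlib import Path
from typing import Optional


def _strip_archive_extension(name: str) -> str:
    """Remove a registered archive extension such as .tar.gz or .zip."""
    for _fmt, extensions, _desc in shutil.get_unpack_formats():
        for ext in extensions:
            if name.endswith(ext):
                return name[: -len(ext)]
    return name


def extract_archive_sketch(
    source: Path, destination: Optional[Path] = None
) -> Path:
    """Unpack `source` and return the extraction directory (illustrative)."""
    if destination is None:
        # batch_of_samples.tar.gz -> <same folder>/batch_of_samples
        destination = source.with_name(_strip_archive_extension(source.name))
    try:
        shutil.unpack_archive(str(source), str(destination))
    except (OSError, ValueError) as e:
        # shutil.ReadError is an OSError subclass, so it is covered here too.
        raise RuntimeError(f"Failed to extract {source}") from e
    return destination
```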
find_filesystem_boundary ¶
find_filesystem_boundary(starting_path)
Given some starting Path, determine the nearest filesystem boundary (mount point). If no mount is found, then this function will return the first parent directory PRIOR to the filesystem anchor.
Examples:

    >>> find_filesystem_boundary(Path("/allen/scratch/aibstemp"))
    PosixPath('/allen/scratch')
    >>> find_filesystem_boundary(Path("/tmp/random_file.txt"))
    PosixPath('/tmp')
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| starting_path | Path | The starting Path | required |

Raises:
| Type | Description |
|---|---|
| RuntimeError | If the provided starting_path cannot resolve to a real existing path |

Returns:
| Type | Description |
|---|---|
| Path | The path of the nearest filesystem boundary OR the first parent directory prior to the filesystem anchor (example anchors: "/", "c:\") |
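The upward walk described above can be sketched with `os.path.ismount`. This is an illustration under assumptions, not the library's code: the function name `find_boundary_sketch` is hypothetical, and using strict resolution to detect a non-existent starting path is one plausible reading of the documented RuntimeError.

```python
import os
from pathlib import Path


def find_boundary_sketch(starting_path: Path) -> Path:
    """Walk up from `starting_path` to the nearest mount point (illustrative)."""
    try:
        # strict=True raises if the path does not exist.
        current = starting_path.resolve(strict=True)
    except OSError as e:
        raise RuntimeError(f"{starting_path} does not resolve to a real path") from e
    anchor = Path(current.anchor)
    while current != anchor:
        if os.path.ismount(current):
            return current
        if current.parent == anchor:
            # No mount found: return the first directory below the anchor.
            return current
        current = current.parent
    return current
```

The result depends on how the host's filesystems are mounted, so the same input path can give different answers on different machines.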
find_paths ¶
find_paths(
root,
include_dirs=True,
include_files=True,
includes=None,
excludes=None,
)
Find paths under root that match the given criteria.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| root | Union[str, Path] | root path | required |
| include_dirs | bool | whether to include directories. Defaults to True. | True |
| include_files | bool | whether to include files. Defaults to True. | True |
| includes | Sequence[str] | list of regex patterns to include. Defaults to all. | None |
| excludes | Sequence[str] | list of regex patterns to exclude. Defaults to None. | None |

Returns:
| Type | Description |
|---|---|
| list[str] | list of paths matching criteria |
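A minimal sketch of include/exclude filtering over a directory walk. The function name `find_paths_sketch` is hypothetical, and several details are assumptions because the documentation does not specify them: patterns are applied with `re.search` against the full path, the root itself is not returned, and results are sorted.

```python
import os
import re
from pathlib import Path
from typing import List, Optional, Sequence, Union


def find_paths_sketch(
    root: Union[str, Path],
    include_dirs: bool = True,
    include_files: bool = True,
    includes: Optional[Sequence[str]] = None,
    excludes: Optional[Sequence[str]] = None,
) -> List[str]:
    """Return paths under `root` that pass the filters (illustrative)."""
    # No include patterns means "match everything".
    include_res = [re.compile(p) for p in (includes or [r".*"])]
    exclude_res = [re.compile(p) for p in (excludes or [])]
    matches: List[str] = []
    for dirpath, dirnames, filenames in os.walk(root):
        candidates: List[str] = []
        if include_dirs:
            candidates.extend(dirnames)
        if include_files:
            candidates.extend(filenames)
        for name in candidates:
            full = os.path.join(dirpath, name)
            if not any(r.search(full) for r in include_res):
                continue
            if any(r.search(full) for r in exclude_res):
                continue
            matches.append(full)
    return sorted(matches)
```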
get_path_size_bytes ¶
get_path_size_bytes(path)
Calculate the total size in bytes of all files under a path.
Handles FileNotFoundError and stale NFS file handles gracefully.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | Path | A file or directory path. | required |

Returns:
| Type | Description |
|---|---|
| int | The total size in bytes. |
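One way to tolerate vanished files and stale NFS handles while summing sizes is sketched below. The function name `path_size_bytes_sketch` is hypothetical, and treating such files as contributing zero bytes is an assumption about what "gracefully" means here.

```python
import errno
import os
from pathlib import Path


def _safe_size(path: str) -> int:
    """Return the file size, or 0 if the file vanished or its handle is stale."""
    try:
        return os.stat(path).st_size
    except FileNotFoundError:
        return 0
    except OSError as e:
        # ESTALE signals a stale NFS file handle.
        if e.errno == getattr(errno, "ESTALE", None):
            return 0
        raise


def path_size_bytes_sketch(path: Path) -> int:
    """Sum the sizes of all files under `path` (illustrative)."""
    if path.is_file():
        return _safe_size(str(path))
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            total += _safe_size(os.path.join(dirpath, name))
    return total
```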
get_path_with_root ¶
get_path_with_root(path, root)
Ensure a path is rooted under the given directory.
If path is already relative to root, it is returned unchanged.
Otherwise, the path is made relative and joined with root.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str \| Path | The path to adjust. | required |
| root | str \| Path | The root directory. | required |
Returns:
| Type | Description |
|---|---|
| str | The normalized path string under root. |
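The two cases described above (already under root, or re-rooted) can be sketched as follows. The function name `path_with_root_sketch` is hypothetical, and the sketch uses `PurePosixPath` so that it behaves the same on every platform, which is a simplification of whatever the library does.

```python
from pathlib import PurePosixPath
from typing import Union


def path_with_root_sketch(
    path: Union[str, PurePosixPath], root: Union[str, PurePosixPath]
) -> str:
    """Return `path` placed under `root`, POSIX-style (illustrative)."""
    p, r = PurePosixPath(path), PurePosixPath(root)
    try:
        # Already under root: return it unchanged.
        p.relative_to(r)
        return str(p)
    except ValueError:
        pass
    # Drop the leading "/" so the join does not discard root.
    relative = p.relative_to(p.anchor) if p.is_absolute() else p
    return str(r / relative)
```

For instance, with root `/mnt/efs`, the path `/data/x` becomes `/mnt/efs/data/x`, while `/mnt/efs/data` is returned as given.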
make_archive ¶
make_archive(
source_path,
destination_path=None,
archive_type=ArchiveType.TAR_GZ,
)
Tar/zip a data batch from a folder.
Example: batch_of_samples -> batch_of_samples.tar.gz
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source_path | Path | folder of data to archive | required |
| destination_path | Optional[Path] | Optional destination path for the archive file. If None, a temporary file is created and used. | None |

Raises:
| Type | Description |
|---|---|
| RuntimeError | If the archiving operation fails |

Returns:
| Type | Description |
|---|---|
| Path | path to the created archive |
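A sketch of archiving a folder with `shutil.make_archive`, which expects the target name without its extension. The function name `make_archive_sketch`, the `fmt` string parameter (standing in for `ArchiveType`), and the small `_EXTENSIONS` table are all hypothetical; placing the default archive in a fresh temporary directory is an assumption about the "tmp file" behavior.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional

# Hypothetical mapping from shutil format names to file extensions.
_EXTENSIONS = {"gztar": ".tar.gz", "tar": ".tar", "zip": ".zip"}


def make_archive_sketch(
    source: Path, destination: Optional[Path] = None, fmt: str = "gztar"
) -> Path:
    """Archive the folder `source` and return the archive path (illustrative)."""
    ext = _EXTENSIONS[fmt]
    if destination is None:
        # Assumed default: a new temporary directory holds the archive.
        destination = Path(tempfile.mkdtemp()) / f"{source.name}{ext}"
    base_name = str(destination)
    if base_name.endswith(ext):
        # shutil.make_archive appends the extension itself.
        base_name = base_name[: -len(ext)]
    try:
        created = shutil.make_archive(base_name, fmt, root_dir=str(source))
    except (OSError, ValueError) as e:
        raise RuntimeError(f"Failed to archive {source}") from e
    return Path(created)
```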
move_path ¶
move_path(source_path, destination_path, exists_ok=False)
Move a path from source to destination, like a simple mv command.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source_path | Path | source path | required |
| destination_path | Path | destination path | required |
| exists_ok | bool | if true, overwrites destination. Defaults to False. | False |
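A move with an overwrite flag could look roughly like this with `shutil.move`. The function name `move_path_sketch` is hypothetical; raising `FileExistsError` when the destination exists and `exists_ok` is false, and deleting the old destination before moving, are assumptions rather than documented behavior.

```python
import shutil
from pathlib import Path


def move_path_sketch(
    source: Path, destination: Path, exists_ok: bool = False
) -> None:
    """Move `source` to `destination` (illustrative, not the library code)."""
    if destination.exists():
        if not exists_ok:
            # Assumed behavior: refuse to overwrite unless allowed.
            raise FileExistsError(f"{destination} already exists")
        # Remove the old destination first; otherwise shutil.move would
        # place the source inside an existing destination directory.
        if destination.is_dir() and not destination.is_symlink():
            shutil.rmtree(destination)
        else:
            destination.unlink()
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(destination))
```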
remove_path ¶
remove_path(path, ignore_errors=True)
Remove a file or directory at the given path.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | Path | The path to remove. | required |
| ignore_errors | bool | If True, errors are logged but not raised. Defaults to True. | True |
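The "log but do not raise" behavior of `ignore_errors` can be sketched as below. The function name `remove_path_sketch` is hypothetical, and the log level and the restriction to `OSError` are assumptions.

```python
import logging
import shutil
from pathlib import Path

logger = logging.getLogger(__name__)


def remove_path_sketch(path: Path, ignore_errors: bool = True) -> None:
    """Remove a file or directory tree (illustrative, not the library code)."""
    try:
        if path.is_dir() and not path.is_symlink():
            shutil.rmtree(path)
        else:
            path.unlink()
    except OSError as e:
        if not ignore_errors:
            raise
        # With ignore_errors=True the failure is only logged.
        logger.warning("Could not remove %s: %s", path, e)
```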
strip_path_root ¶
strip_path_root(path, root=None)
Strip the root from the path if path is absolute
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | Union[str, Path] | path to strip root from | required |
| root | Optional[Union[str, Path]] | optionally specify root. If no root specified, uses "/". | None |

Returns:
| Type | Description |
|---|---|
| str | a relative path |
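A minimal POSIX-only sketch of stripping a root from an absolute path. The function name `strip_root_sketch` is hypothetical. Note one assumption: if the path is absolute but not under `root`, this sketch lets `ValueError` propagate, which may differ from the library's behavior.

```python
from pathlib import PurePosixPath
from typing import Optional, Union


def strip_root_sketch(
    path: Union[str, PurePosixPath],
    root: Optional[Union[str, PurePosixPath]] = None,
) -> str:
    """Return `path` relative to `root` if it is absolute (illustrative)."""
    p = PurePosixPath(path)
    if not p.is_absolute():
        # Relative paths are returned as they are.
        return str(p)
    r = PurePosixPath(root) if root is not None else PurePosixPath("/")
    # Raises ValueError if `p` is not located under `r`.
    return str(p.relative_to(r))
```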