pyseq - Python sequence string module

PySeq is a Python module that finds groups of items that follow a naming convention containing a numerical sequence index, e.g.

fileA.001.png, fileA.002.png, fileA.003.png...

and serializes them into a compressed sequence string representing the entire sequence, e.g.

fileA.1-3.png

It should work regardless of where the numerical sequence index is embedded in the name.

Source Code

PySeq’s git repo is available on GitHub and can be cloned using:

$ git clone git://github.com/rsgalloway/pyseq.git pyseq

Installation

Installing PySeq is easily done using pip. Assuming pip is installed, run the following from the command line:

$ pip install pyseq

Alternatively, you can install from the distribution using the setup.py script:

$ python setup.py install

Overview

PySeq comes with a command-line script called lss.

$ lss [path] [-f format] [-d]

Using the “z1” file sequence example in the “tests” directory:

$ ls tests/files/z1*
tests/files/z1_001_v1.1.png tests/files/z1_002_v1.3.png
tests/files/z1_001_v1.2.png tests/files/z1_002_v1.4.png
tests/files/z1_001_v1.3.png tests/files/z1_002_v2.1.png
tests/files/z1_001_v1.4.png tests/files/z1_002_v2.2.png
tests/files/z1_002_v1.1.png tests/files/z1_002_v2.3.png
tests/files/z1_002_v1.2.png tests/files/z1_002_v2.4.png

$ lss tests/files/z1*
   4 z1_001_v1.%d.png 1-4
   4 z1_002_v1.%d.png 1-4
   4 z1_002_v2.%d.png 1-4

$ lss tests/files/z1* -f "%h%r%t"
z1_001_v1.1-4.png
z1_002_v1.1-4.png
z1_002_v2.1-4.png

API Examples

Sequence compression

Example using get_sequences to compress filesystem sequences starting with “bnc”. The get_sequences function returns a list of all sequences found in the given input, which can be either a path or a list.

>>> import pyseq
>>> seqs = pyseq.get_sequences('./tests/files/bnc*')
>>> for s in seqs:
...     print(s.format('%h%p%t %r'))
...
bnc01_TinkSO_tx_0_ty_0.%04d.tif 101-105
bnc01_TinkSO_tx_0_ty_1.%04d.tif 101-105
bnc01_TinkSO_tx_1_ty_0.%04d.tif 101-105
bnc01_TinkSO_tx_1_ty_1.%04d.tif 101-105

Example using the Sequence class with a list as input. The Sequence constructor returns a single Sequence instance of sequential items, skipping any items in the list that are not part of the sequence.

>>> s = pyseq.Sequence(['file.0001.jpg', 'file.0002.jpg', 'file.0003.jpg'])
>>> print(s)
file.1-3.jpg
>>> s.append('file.0006.jpg')
>>> print(s.format("%h%p%t %R"))
file.%04d.jpg 1-3 6
>>> s.includes('file.0009.jpg')
True
>>> s.contains('file.0009.jpg')
False
>>> s.includes('file.0009.pic')
False
>>> s.contains('file.0009.pic')
False

Iterate over the Sequence members:

>>> for i in s:
...     # file name, unpadded frame number, True if file exists on disk
...     print(i.name, i.frame, i.exists)
...
file.0001.jpg 1 False
file.0002.jpg 2 False
file.0003.jpg 3 False

Sequence expansion

>>> s = pyseq.uncompress('012_vb_110_v002.1-150.dpx', fmt="%h%r%t")
>>> len(s)
150
>>> seq = pyseq.uncompress('./tests/012_vb_110_v001.%04d.png 1-10', fmt='%h%p%t %r')
>>> print(seq.format('%04l %h%p%t %R'))
  10 012_vb_110_v001.%04d.png 1-10
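
The expansion step can be illustrated without PySeq itself. The following is a minimal sketch, not PySeq's implementation: the helper names parse_ranges and expand are hypothetical, and it assumes non-negative frame numbers and a printf-style padding pattern such as %04d.

```python
import re


def parse_ranges(text):
    """Turn a range string such as '[1-3, 10, 12-14]' or '1-3 6' into frames."""
    frames = []
    # Drop optional surrounding brackets, then split on commas and whitespace.
    for part in re.split(r"[,\s]+", text.strip("[] ")):
        if not part:
            continue
        start, _, end = part.partition("-")
        # A lone number such as '10' is treated as the range 10-10.
        frames.extend(range(int(start), int(end or start) + 1))
    return frames


def expand(pattern, ranges):
    """Apply a padding pattern such as 'a.%03d.tga' to every frame in ranges."""
    return [pattern % frame for frame in parse_ranges(ranges)]


if __name__ == "__main__":
    for name in expand("file.%04d.jpg", "1-3 6"):
        print(name)
```

Called with 'a.%03d.tga' and '[1-3, 10, 12-14]', the sketch yields seven names, from a.001.tga to a.014.tga.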

API Reference

exception pyseq.FormatError

Special exception for Sequence format errors.

class pyseq.Item(item)

Represents a file in a sequence.

property digits

Returns the numerical components of the Item as a list of strings.

Returns:

The numerical components.

property dirname

Gets the directory name of the Item, if it is a filesystem item.

Returns:

The directory name.

property exists

Checks if this Item exists on disk.

Returns:

True if the Item exists, False otherwise.

is_sibling(item)

Determines if this Item and another Item are part of the same sequence.

Parameters:

item – Another Item instance.

Returns:

True if this Item and the other Item are sequential siblings, False otherwise.

property mtime

Returns the modification time of the Item.

Returns:

The modification time.

property name

Gets the base name of the Item.

Returns:

The base name.

property number_matches

Returns the numerical components of the Item as a list of regex match objects.

Returns:

The numerical components.

property parts

Returns the non-numerical components of the Item.

Returns:

The non-numerical components.

property path

Gets the absolute path of the Item, if it is a filesystem item.

Returns:

The absolute path.

property size

Returns the size of the Item, reported by os.stat.

Returns:

The size of the Item.

property stat

Returns the os.stat object for this file.

Returns:

The os.stat object.

class pyseq.Sequence(items)

Extends list class with methods that handle item sequentialness.

For example:

>>> s = Sequence(['file.0001.jpg', 'file.0002.jpg', 'file.0003.jpg'])
>>> print(s)
file.1-3.jpg
>>> s.append('file.0006.jpg')
>>> print(s.format('%4l %h%p%t %R'))
   4 file.%04d.jpg 1-3 6
>>> s.includes('file.0009.jpg')
True
>>> s.includes('file.0009.pic')
False
>>> s.contains('file.0006.jpg')
False
>>> print(s.format('%h%p%t %r (%R)'))
file.%04d.jpg 1-6 (1-3 6)

append(item, check_membership=True)

Adds another member to the sequence.

Parameters:
  • item – pyseq.Item object.

  • check_membership – Check that item is a sequence member before adding it. Set to False to skip the check when membership has already been verified.

Raises:

SequenceError if item is not a sequence member.

contains(item)

Checks for sequence membership. Calls Item.is_sibling() and returns True if item is part of the sequence.

For example:

>>> s = Sequence(['fileA.0001.jpg', 'fileA.0002.jpg'])
>>> print(s)
fileA.1-2.jpg
>>> s.contains('fileA.0003.jpg')
False
>>> s.contains('fileB.0003.jpg')
False

Parameters:

item – pyseq.Item class object.

Returns:

True if item is a sequence member.

end()
Returns:

Last index number in sequence.

extend(items, check_membership=True)

Add members to the sequence.

Parameters:
  • items – List of pyseq.Item objects.

  • check_membership – Check that each item is a sequence member before adding it. Set to False to skip the check when membership has already been verified.

Raises:

SequenceError if any item is not a sequence member.

format(fmt='%4l %h%p%t %R')

Formats the sequence as a string, for example for printing to stdout.

The following directives can be embedded in the format string. Format directives support padding, for example: “%04l”.

Directive  Meaning
---------  -------
%s         sequence start
%e         sequence end
%l         sequence length
%f         list of found files
%m         list of missing files
%M         explicit missing files, e.g. [11-14,19-21]
%p         padding, e.g. %06d
%r         implied range, start-end
%R         explicit broken range, e.g. [1-10, 15-20]
%d         disk usage
%H         disk usage (human readable)
%D         parent directory
%h         string preceding sequence number
%t         string after the sequence number

Parameters:

fmt – Format string. Default is ‘%4l %h%p%t %R’.

Returns:

Formatted string.
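
To make the difference between the implied range (%r), the explicit broken range (%R) and the missing frames concrete, here is a small sketch that derives all three from a list of frame numbers. It is an illustration only: the function names are hypothetical, and the space-separated output style follows the "1-3 6" output shown in the examples, not PySeq's internal code.

```python
def implied_range(frames):
    """First and last frame only, ignoring any gaps (like %r)."""
    return "%d-%d" % (min(frames), max(frames))


def explicit_range(frames):
    """Consecutive runs of frames, so gaps stay visible (like %R)."""
    ordered = sorted(set(frames))
    runs = []
    start = prev = ordered[0]
    for frame in ordered[1:]:
        if frame != prev + 1:
            # A gap ends the current run and starts a new one.
            runs.append((start, prev))
            start = frame
        prev = frame
    runs.append((start, prev))
    return " ".join(
        str(a) if a == b else "%d-%d" % (a, b) for a, b in runs
    )


def missing_frames(frames):
    """Frames inside the implied range that are not present."""
    present = set(frames)
    return [f for f in range(min(frames), max(frames) + 1) if f not in present]


if __name__ == "__main__":
    frames = [1, 2, 3, 6]
    print(implied_range(frames))
    print(explicit_range(frames))
    print(missing_frames(frames))
```

For the frames 1, 2, 3 and 6 this gives "1-6", "1-3 6" and [4, 5] respectively.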

frames()
Returns:

List of files in sequence.

head()
Returns:

String before the sequence index number.

property human

Returns the size of all items in human-readable format.

includes(item)

Checks if the item can be contained in this sequence, i.e. if it is a sibling of any of the items in the list.

For example:

>>> s = Sequence(['fileA.0001.jpg', 'fileA.0002.jpg'])
>>> print(s)
fileA.1-2.jpg
>>> s.includes('fileA.0003.jpg')
True
>>> s.includes('fileB.0003.jpg')
False

Parameters:

item – pyseq.Item class object.

Returns:

True if item could be a member of the sequence.

insert(index, item, check_membership=True)

Add another member to the sequence at the given index.

Parameters:
  • index – Index at which to insert the item.

  • item – pyseq.Item object.

  • check_membership – Check that item is a sequence member before adding it. Set to False to skip the check when membership has already been verified.

Raises:

SequenceError if item is not a sequence member.

length()
Returns:

The length of the sequence.

missing()
Returns:

List of missing files.

property mtime

Returns the latest mtime of all items.

path()
Returns:

Absolute path to sequence.

reIndex(offset, padding=None)

Renames and reindexes the items in the sequence, e.g.

>>> seq.reIndex(offset=100)

will add a 100 frame offset to each Item in seq, and rename the files on disk.

Parameters:
  • offset – The frame offset to apply to each item.

  • padding – Change the padding.
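
Because reIndex renames files on disk, it can help to preview the effect of an offset first. The sketch below only computes old and new names and never touches the filesystem. It is not PySeq's implementation: plan_reindex is a hypothetical helper, and it assumes the frame number is the last run of digits in each name.

```python
import re

# Last run of digits in the name, followed by a digit-free tail.
FRAME_RE = re.compile(r"(\d+)(\D*)$")


def plan_reindex(names, offset, padding=None):
    """Return (old_name, new_name) pairs for applying a frame offset."""
    plan = []
    for name in names:
        match = FRAME_RE.search(name)
        if match is None:
            raise ValueError("no frame number found in %r" % name)
        digits, tail = match.groups()
        frame = int(digits) + offset
        if frame < 0:
            raise ValueError("offset moves %r below frame 0" % name)
        # Keep the existing zero padding unless a new width is requested.
        width = len(digits) if padding is None else padding
        new_name = "%s%0*d%s" % (name[:match.start()], width, frame, tail)
        plan.append((int(digits), name, new_name))
    # With a positive offset, handle the highest frame first so that a
    # rename never targets a name that is still in use by a later item.
    plan.sort(reverse=offset > 0)
    return [(old, new) for _, old, new in plan]


if __name__ == "__main__":
    for old, new in plan_reindex(["file.0001.jpg", "file.0002.jpg"], 100):
        print(old, "->", new)
```

Ordering the pairs by frame number is a design choice of this sketch. It matters when the offset is smaller than the length of the sequence, because old and new names then overlap.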

property size

Returns the total size of all items in bytes.

start()
Returns:

First index number in sequence.

tail()
Returns:

String after the sequence index number.

exception pyseq.SequenceError

Special exception for Sequence errors.

pyseq.diff(f1, f2)

Examines diffs between f1 and f2 and deduces numerical sequence number.

For example:

>>> diff('file01_0040.rgb', 'file01_0041.rgb')
[{'frames': ('0040', '0041'), 'start': 7, 'end': 11}]

>>> diff('file3.03.rgb', 'file4.03.rgb')
[{'frames': ('3', '4'), 'start': 4, 'end': 5}]

Parameters:
  • f1 – pyseq.Item object.

  • f2 – pyseq.Item object to diff.

Returns:

A list of dictionaries, each with keys ‘frames’, ‘start’, and ‘end’.
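
The comparison can be approximated with a regular expression over the digit runs of both names. This is a sketch under assumptions, not the library's code: diff_sketch is a hypothetical name, it works on plain strings instead of pyseq.Item objects, and it only compares digit runs that start at the same position in both names.

```python
import re

DIGITS_RE = re.compile(r"\d+")


def diff_sketch(f1, f2):
    """Return one dict per digit run that differs between f1 and f2."""
    runs1 = list(DIGITS_RE.finditer(f1))
    runs2 = list(DIGITS_RE.finditer(f2))
    # Names with a different number of digit runs are not compared.
    if len(runs1) != len(runs2):
        return []
    found = []
    for a, b in zip(runs1, runs2):
        if a.start() == b.start() and a.group() != b.group():
            found.append({
                "frames": (a.group(), b.group()),
                "start": a.start(),
                "end": a.end(),
            })
    return found


if __name__ == "__main__":
    print(diff_sketch("file01_0040.rgb", "file01_0041.rgb"))
    print(diff_sketch("file3.03.rgb", "file4.03.rgb"))
```

For the two documented examples the sketch returns the same lists as shown above.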

pyseq.get_sequences(source)

Returns a list of Sequence objects given a directory or list that contain sequential members.

Get sequences in a directory:

>>> seqs = get_sequences('tests/files/')
>>> for s in seqs: print(s)
...
012_vb_110_v001.1-10.png
012_vb_110_v002.1-10.png
a.1-14.tga
alpha.txt
bnc01_TinkSO_tx_0_ty_0.101-105.tif
bnc01_TinkSO_tx_0_ty_1.101-105.tif
bnc01_TinkSO_tx_1_ty_0.101-105.tif
bnc01_TinkSO_tx_1_ty_1.101-105.tif
file.1-2.tif
file.info.03.rgb
file01_40-43.rgb
file02_44-47.rgb
file1-4.03.rgb
file_02.tif
z1_001_v1.1-4.png
z1_002_v1.1-4.png
z1_002_v2.1-4.png

Get sequences from a list of file names:

>>> seqs = get_sequences(['fileA.1.rgb', 'fileA.2.rgb', 'fileB.1.rgb'])
>>> for s in seqs: print(s)
...
fileA.1-2.rgb
fileB.1.rgb

Parameters:

source – Can be directory path, list of strings, or sortable list of objects.

Returns:

List of pyseq.Sequence class objects.
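
The grouping idea behind get_sequences can be sketched in plain Python. This is a simplified illustration, not PySeq's algorithm: sibling_key, group_names and compress are hypothetical names, and two names are treated as siblings only when they differ in exactly one run of digits.

```python
import re

DIGITS = re.compile(r"\d+")


def sibling_key(a, b):
    """Return (head, tail) if a and b differ in exactly one digit run."""
    runs_a = list(DIGITS.finditer(a))
    runs_b = list(DIGITS.finditer(b))
    if len(runs_a) != len(runs_b):
        return None
    diffs = [(x, y) for x, y in zip(runs_a, runs_b) if x.group() != y.group()]
    if len(diffs) != 1:
        return None
    x, y = diffs[0]
    head, tail = a[:x.start()], a[x.end():]
    # Everything around the differing run must be identical.
    if head != b[:y.start()] or tail != b[y.end():]:
        return None
    return head, tail


def group_names(names):
    """Group sorted names into (key, members); key is None for lone items."""
    groups = []
    for name in sorted(names):
        if groups:
            key, members = groups[-1]
            new_key = sibling_key(members[-1], name)
            # Join the current group only if the same digit run varies.
            if new_key is not None and key in (None, new_key):
                groups[-1] = (new_key, members + [name])
                continue
        groups.append((None, [name]))
    return groups


def compress(key, members):
    """Render a group as a compressed string such as 'fileA.1-2.rgb'."""
    if key is None:
        return members[0]
    head, tail = key
    frames = [int(m[len(head):len(m) - len(tail)]) for m in members]
    return "%s%d-%d%s" % (head, min(frames), max(frames), tail)


if __name__ == "__main__":
    names = ["fileA.1.rgb", "fileA.2.rgb", "fileB.1.rgb"]
    for key, members in group_names(names):
        print(compress(key, members))
```

Comparing each name only with the last member of the current group keeps the sketch short. It relies on sorting to place siblings next to each other.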

pyseq.uncompress(seq_string, fmt='%4l %h%p%t %R')

Basic uncompression or deserialization of a compressed sequence string.

For example:

>>> seq = uncompress('./tests/files/012_vb_110_v001.%04d.png 1-10', fmt='%h%p%t %r')
>>> print(seq)
012_vb_110_v001.1-10.png
>>> len(seq)
10
>>> seq2 = uncompress('./tests/files/a.%03d.tga [1-3, 10, 12-14]', fmt='%h%p%t %R')
>>> print(seq2)
a.1-14.tga
>>> len(seq2)
7
>>> seq3 = uncompress('a.%03d.tga 1-14 ([1-3, 10, 12-14])', fmt='%h%p%t %r (%R)')
>>> print(seq3)
a.1-14.tga
>>> len(seq3)
7

See unit tests for more examples.

Parameters:
  • seq_string – Compressed sequence string.

  • fmt – Format of sequence string.

Returns:

Sequence instance.

pyseq.walk(source, level=-1, topdown=True, onerror=None, followlinks=False, hidden=False)

Generator that traverses a directory structure starting at source looking for sequences.

Parameters:
  • source – Valid folder path to traverse.

  • level – Integer depth limit. If < 0, traverse the entire structure; otherwise traverse to the given depth.

  • topdown – Walk from the top down.

  • onerror – Callable to handle os.listdir errors.

  • followlinks – Whether to follow links.

  • hidden – Include hidden files and dirs.
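
The level and hidden parameters can be pictured as a thin layer over os.walk. The sketch below is an assumption-laden illustration, not the pyseq.walk implementation: walk_sketch is a hypothetical name, it yields plain file name lists instead of Sequence objects, and it interprets level=0 as "the source directory only".

```python
import os


def walk_sketch(source, level=-1, hidden=False):
    """Yield (directory, file names) pairs, optionally limited in depth."""
    source = os.path.abspath(source)
    for root, dirs, files in os.walk(source):
        if not hidden:
            # Editing dirs in place stops os.walk from entering hidden dirs.
            dirs[:] = [d for d in dirs if not d.startswith(".")]
            files = [f for f in files if not f.startswith(".")]
        rel = os.path.relpath(root, source)
        depth = 0 if rel == os.curdir else rel.count(os.sep) + 1
        if level >= 0 and depth >= level:
            # Depth limit reached: do not descend any further.
            dirs[:] = []
        yield root, sorted(files)


if __name__ == "__main__":
    for directory, names in walk_sketch(".", level=1):
        print(directory, len(names))
```

Pruning works here because os.walk runs top-down by default and consults the dirs list after each yield.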
