malstruct package

Subpackages

Submodules

malstruct.adapters module

Adapters and validators

class malstruct.adapters.ExprAdapter(subcon: Construct, decoder: Callable[[Any, Any], Any], encoder: Callable[[Any, Any], Any])

Bases: Adapter

Generic adapter that takes decoder and encoder lambdas as parameters. You can use ExprAdapter instead of writing a full-blown class deriving from Adapter when only a simple lambda is needed.

Parameters:
  • subcon – Construct instance, subcon to adapt

  • decoder – lambda that takes (obj, context) and returns a decoded version of obj

  • encoder – lambda that takes (obj, context) and returns an encoded version of obj

Example:

>>> d = ExprAdapter(Byte, obj_+1, obj_-1)
>>> d.parse(b'\x04')
5
>>> d.build(5)
b'\x04'
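
The decoder/encoder pairing can be modelled in plain Python without importing malstruct. This is only a sketch of the idea, not the library's implementation; SimpleExprAdapter, parse_byte and build_byte are hypothetical names that stand in for ExprAdapter and the Byte subcon.

```python
# Plain-Python model of an expression adapter (illustrative only).
# parse_byte/build_byte are hypothetical stand-ins for the Byte subcon.

def parse_byte(data: bytes) -> int:
    return data[0]

def build_byte(value: int) -> bytes:
    return bytes([value])

class SimpleExprAdapter:
    def __init__(self, decoder, encoder):
        self.decoder = decoder  # applied to the value the subcon parsed
        self.encoder = encoder  # applied to the value before the subcon builds it

    def parse(self, data: bytes, **context):
        return self.decoder(parse_byte(data), context)

    def build(self, obj, **context) -> bytes:
        return build_byte(self.encoder(obj, context))

d = SimpleExprAdapter(lambda obj, ctx: obj + 1, lambda obj, ctx: obj - 1)
parsed = d.parse(b"\x04")   # decoder adds 1 to the parsed byte
built = d.build(5)          # encoder subtracts 1 before building
```

The two lambdas are expected to be inverses of each other, so that a parsed value builds back into the original bytes.
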
class malstruct.adapters.ExprSymmetricAdapter(subcon: Construct, encoder: Callable[[Any, Any], Any])

Bases: ExprAdapter

Macro around ExprAdapter.

Parameters:
  • subcon – Construct instance, subcon to adapt

  • encoder – lambda that takes (obj, context) and returns both encoded version and decoded version of obj

Example:

>>> d = ExprSymmetricAdapter(Byte, obj_ & 0b00001111)
>>> d.parse(b"\xff")
15
>>> d.build(255)
b'\x0f'
class malstruct.adapters.ExprValidator(subcon: Construct, validator: Callable[[Any, Any], bool])

Bases: Validator

Generic adapter that takes validator lambda as parameter. You can use ExprValidator instead of writing a full-blown class deriving from Validator when only a simple lambda is needed.

Parameters:
  • subcon – Construct instance, subcon to adapt

  • validator – lambda that takes (obj, context) and returns a bool

Example:

>>> d = ExprValidator(Byte, obj_ & 0b11111110 == 0)
>>> d.build(1)
b'\x01'
>>> d.build(88)
ValidationError: object failed validation: 88
malstruct.adapters.OneOf(subcon: Construct, valids)

Validates that the object is one of the listed values, both during parsing and building.

Note

For performance, valids should be a set or frozenset.

Parameters:
  • subcon – Construct instance, subcon to validate

  • valids – collection implementing __contains__, usually a list or set

Raises:

ValidationError – parsed or built value is not among valids

Example:

>>> d = OneOf(Byte, [1,2,3])
>>> d.parse(b"\x01")
1
>>> d.parse(b"\xff")
malstruct.core.ValidationError: object failed validation: 255
malstruct.adapters.NoneOf(subcon: Construct, invalids)

Validates that the object is none of the listed values, both during parsing and building.

Note

For performance, invalids should be a set or frozenset.

Parameters:
  • subcon – Construct instance, subcon to validate

  • invalids – collection implementing __contains__, usually a list or set

Raises:

ValidationError – parsed or built value is among invalids
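
No example is listed for NoneOf. The check it describes amounts to a membership test applied to the value in both directions, which can be sketched in plain Python as follows. check_none_of is a hypothetical helper, and ValueError stands in for ValidationError; neither is part of malstruct.

```python
def check_none_of(obj, invalids):
    """Return obj unchanged, or raise if it is one of the rejected values."""
    if obj in invalids:
        # malstruct would raise ValidationError here; ValueError is a stand-in
        raise ValueError(f"object failed validation: {obj!r}")
    return obj

# A frozenset keeps the membership test constant-time on average.
INVALIDS = frozenset({0, 255})

accepted = check_none_of(7, INVALIDS)
try:
    check_none_of(255, INVALIDS)
    rejected = False
except ValueError:
    rejected = True
```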

malstruct.adapters.Filter(predicate: Callable[[Any, Any], bool], subcon: Construct)

Filters a list leaving only the elements that passed through the predicate.

Parameters:
  • subcon – Construct instance, usually Array, GreedyRange, or Sequence

  • predicate – lambda that takes (obj, context) and returns a bool

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = Filter(obj_ != 0, Byte[:])
>>> d.parse(b"\x00\x02\x00")
[2]
>>> d.build([0,1,0,2,0])
b'\x01\x02'
class malstruct.adapters.Slicing(subcon: Construct, count: int, start: int, stop: int, step: int = 1, empty: Any = None)

Bases: Adapter

Adapter for slicing a list. Works with GreedyRange and Sequence.

Parameters:
  • subcon – Construct instance, subcon to slice

  • count – integer, expected number of elements, needed during building

  • start – integer for start index (or None for entire list)

  • stop – integer for stop index (or None for up-to-end)

  • step – integer, step (or 1 for every element)

  • empty – object, value to fill the list with, during building

Example:

d = Slicing(Array(4,Byte), 4, 1, 3, empty=0)
assert d.parse(b"\x01\x02\x03\x04") == [2,3]
assert d.build([2,3]) == b"\x00\x02\x03\x00"
assert d.sizeof() == 4
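
How count and empty come into play during building can be pictured with ordinary list slicing. The sketch below is a plain-Python model of the behaviour shown in the example, not the library's code; slice_decode and slice_encode are hypothetical names.

```python
def slice_decode(items, start, stop, step=1):
    # Parsing: keep only the selected window of the parsed list.
    return items[start:stop:step]

def slice_encode(selected, count, start, stop, step=1, empty=None):
    # Building: start from `count` filler values, then place the
    # caller's elements into the sliced positions.
    full = [empty] * count
    full[start:stop:step] = selected
    return full

decoded = slice_decode([1, 2, 3, 4], 1, 3)
encoded = slice_encode([2, 3], count=4, start=1, stop=3, empty=0)
```

This is why count is needed only for building: the parsed list already has its full length, whereas the built list has to be reconstructed around the slice.
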
class malstruct.adapters.Indexing(subcon: Construct, count: int, index: int, empty: Any = None)

Bases: Adapter

Adapter for indexing a list (getting a single item from that list). Works with Range and Sequence and their lazy equivalents.

Parameters:
  • subcon – Construct instance, subcon to index

  • count – integer, expected number of elements, needed during building

  • index – integer, index of the list to get

  • empty – object, value to fill the list with, during building

Example:

d = Indexing(Array(4,Byte), 4, 2, empty=0)
assert d.parse(b"\x01\x02\x03\x04") == 3
assert d.build(3) == b"\x00\x00\x03\x00"
assert d.sizeof() == 4

malstruct.alignment module

Alignment and padding constructs

malstruct.alignment.Padding(length, pattern=b'\x00')

Appends null bytes.

Parsing consumes specified amount of bytes and discards it. Building writes specified pattern byte multiplied into specified length. Size is same as specified.

Parameters:
  • length – integer or context lambda, length of the padding

  • pattern – b-character, padding pattern, default is \x00

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • PaddingError – length was negative

  • PaddingError – pattern was not bytes (b-character)

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = Padding(4)  # equivalent to Padded(4, Pass)
>>> d.build(None)
b'\x00\x00\x00\x00'
>>> d.parse(b"****")
None
>>> d.sizeof()
4
class malstruct.alignment.Padded(length, subcon, pattern=b'\x00')

Bases: Subconstruct

Appends additional null bytes to achieve a length.

Parsing first parses the subcon, then uses stream.tell() to measure how many bytes were read and consumes additional bytes accordingly. Building first builds the subcon, then uses stream.tell() to measure how many bytes were written and produces additional bytes accordingly. Size is same as length, but negative amount results in error. Note that subcon can actually be variable size, it is the eventual amount of bytes that is read or written during parsing or building that determines actual padding.

Parameters:
  • length – integer or context lambda, length of the padding

  • subcon – Construct instance

  • pattern – optional, b-character, padding pattern, default is \x00

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • PaddingError – length is negative

  • PaddingError – subcon read or written more than the length (would cause negative pad)

  • PaddingError – pattern is not bytes of length 1

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = Padded(4, Byte)
>>> d.build(255)
b'\xff\x00\x00\x00'
>>> d.parse(_)
255
>>> d.sizeof()
4

>>> d = Padded(4, VarInt)
>>> d.build(1)
b'\x01\x00\x00\x00'
>>> d.build(70000)
b'\xf0\xa2\x04\x00'
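
The padding arithmetic described above can be sketched as follows. This is a plain-Python illustration that assumes the subcon has already produced its bytes; pad_to_length is a hypothetical helper and ValueError stands in for PaddingError.

```python
def pad_to_length(payload: bytes, length: int, pattern: bytes = b"\x00") -> bytes:
    if len(pattern) != 1:
        raise ValueError("pattern must be bytes of length 1")
    pad = length - len(payload)  # bytes still missing after the subcon
    if pad < 0:
        raise ValueError("payload is longer than the padded length")
    return payload + pattern * pad

one_byte = pad_to_length(b"\xff", 4)              # Byte wrote 1 byte
three_bytes = pad_to_length(b"\xf0\xa2\x04", 4)   # VarInt wrote 3 bytes
```

The amount of padding depends on how many bytes were actually written, which is why a variable-size subcon such as VarInt still yields a fixed total length.
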
class malstruct.alignment.Aligned(modulus, subcon, pattern=b'\x00')

Bases: Subconstruct

Appends additional null bytes to achieve a length that is the smallest sufficient multiple of a modulus.

Note that subcon can actually be variable size, it is the eventual amount of bytes that is read or written during parsing or building that determines actual padding.

Parsing first parses subcon, then consumes an amount of bytes to sum up to specified length, and discards it. Building first builds subcon, then writes specified pattern byte to sum up to specified length. Size is subcon size plus modulo remainder, unless SizeofError was raised.

Parameters:
  • modulus – integer or context lambda, modulus to final length

  • subcon – Construct instance

  • pattern – optional, b-character, padding pattern, default is \x00

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • PaddingError – modulus was less than 2

  • PaddingError – pattern was not bytes (b-character)

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = Aligned(4, Int16ub)
>>> d.parse(b'\x00\x01\x00\x00')
1
>>> d.sizeof()
4
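
The "subcon size plus modulo remainder" rule can be written out as a short calculation. The helpers below are hypothetical and only illustrate the arithmetic; they are not malstruct functions.

```python
def alignment_padding(size: int, modulus: int) -> int:
    # Number of pad bytes needed to reach the next multiple of modulus.
    if modulus < 2:
        raise ValueError("modulus must be at least 2")
    return -size % modulus

def aligned_size(size: int, modulus: int) -> int:
    return size + alignment_padding(size, modulus)

int16_aligned = aligned_size(2, 4)   # a 2-byte field aligned to 4
already_aligned = aligned_size(4, 4) # no padding when size is a multiple
```
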
malstruct.alignment.AlignedStruct(modulus, *subcons, **subconskw)

Makes a structure where each field is aligned to the same modulus (it is a struct of aligned fields, NOT an aligned struct).

See Aligned and Struct for semantics and raisable exceptions.

Parameters:
  • modulus – integer or context lambda, passed to each member

  • *subcons – Construct instances, list of members, some can be anonymous

  • **subconskw – Construct instances, list of members (requires Python 3.6)

Example:

>>> d = AlignedStruct(4, "a"/Int8ub, "b"/Int16ub)
>>> d.build(dict(a=0xFF,b=0xFFFF))
b'\xff\x00\x00\x00\xff\xff\x00\x00'
malstruct.alignment.Pass

No-op construct, useful as default cases for Switch and Enum.

Parsing returns None. Building does nothing. Size is 0 by definition.

Example:

>>> Pass.parse(b"")
None
>>> Pass.build(None)
b''
>>> Pass.sizeof()
0

malstruct.analysis module

class malstruct.analysis.TimestampAdapter(subcon)

Bases: Adapter

Used internally.

malstruct.analysis.Timestamp(subcon, unit, epoch)

Datetime, represented as Arrow object.

Note that accuracy is not guaranteed, because building rounds the value to integer (even when Float subcon is used), due to floating-point errors in general, and because MSDOS scheme has only 5-bit (32 values) seconds field (seconds are rounded to multiple of 2).

Unit is a fraction of a second: 1 is second resolution, 10**-3 is millisecond resolution, 10**-6 is microsecond resolution, and so on. It is usually 1 on Unix and MacOSX, and 10**-7 on Windows. Epoch is a year (if integer) or a specific day (if Arrow object). It is usually 1970 on Unix, 1904 on MacOSX, and 1601 on Windows. The MSDOS format does not support a custom unit or epoch; it uses 2-second resolution and a 1980 epoch.

Parameters:
  • subcon – Construct instance like Int* Float*, or Int32ub with msdos format

  • unit – integer or float, or msdos string

  • epoch – integer, or Arrow instance, or msdos string

Raises:
  • ImportError – arrow could not be imported during ctor

  • TimestampError – subcon is not a Construct instance

  • TimestampError – unit or epoch is a wrong type
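
The unit and epoch arithmetic, and the MSDOS bit layout, can be checked with the standard library alone. The sketch below uses datetime rather than Arrow and assumes the conventional MSDOS packing (date in the high 16 bits, time in the low 16 bits of a big-endian 32-bit value). decode_timestamp and decode_msdos are hypothetical helpers, not malstruct functions; the inputs are the byte strings used in this function's example.

```python
from datetime import datetime, timedelta, timezone

def decode_timestamp(value, unit, epoch_year):
    # value counts `unit`-second ticks since 1 January of epoch_year (UTC).
    epoch = datetime(epoch_year, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(seconds=value * unit)

def decode_msdos(value):
    # Assumed layout: 7-bit year offset, 4-bit month, 5-bit day,
    # then 5-bit hour, 6-bit minute, 5-bit count of 2-second steps.
    date, time = value >> 16, value & 0xFFFF
    return datetime(
        1980 + (date >> 9), (date >> 5) & 0x0F, date & 0x1F,
        time >> 11, (time >> 5) & 0x3F, (time & 0x1F) * 2,
        tzinfo=timezone.utc,
    )

unix = decode_timestamp(int.from_bytes(b"\x00\x00\x00\x00ZIz\x00", "big"), 1.0, 1970)
dos = decode_msdos(int.from_bytes(b'H9\x8c"', "big"))
```

The 5-bit seconds field is the reason MSDOS timestamps can only represent even seconds.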

Example:

>>> d = Timestamp(Int64ub, 1., 1970)
>>> d.parse(b'\x00\x00\x00\x00ZIz\x00')
<Arrow [2018-01-01T00:00:00+00:00]>
>>> d = Timestamp(Int32ub, "msdos", "msdos")
>>> d.parse(b'H9\x8c"')
<Arrow [2016-01-25T17:33:04+00:00]>
class malstruct.analysis.EpochTimeAdapter(subcon, tz=None)

Bases: Adapter

Adapter that converts a time_t (epoch time) value to an ISO 8601 formatted string.

Example:

>>> EpochTimeAdapter(Int32ul, tz=datetime.timezone.utc).parse(b'\xff\x93\x37\x57')
'2016-05-14T21:09:19+00:00'
>>> EpochTimeAdapter(Int32ul).parse(b'\xff\x93\x37\x57')
'2016-05-14T17:09:19'
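
The conversion can be reproduced with the standard library, which is a convenient way to check expected values. This is a sketch under the assumption of a little-endian unsigned 32-bit time_t, as in the example; epoch_to_iso is a hypothetical helper, not part of malstruct.

```python
import struct
from datetime import datetime, timezone

def epoch_to_iso(raw: bytes, tz=None) -> str:
    # Read a little-endian unsigned 32-bit time_t and format it.
    (seconds,) = struct.unpack("<I", raw)
    return datetime.fromtimestamp(seconds, tz=tz).isoformat()

utc_text = epoch_to_iso(b"\xff\x93\x37\x57", tz=timezone.utc)
```

With tz=None, datetime.fromtimestamp returns naive local time, so the result depends on the machine's time zone. That would be consistent with the second output in the example, which lacks a UTC offset.
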
class malstruct.analysis.Hex(subcon)

Bases: Adapter

Adapter for displaying hexadecimal/hexlified representation of integers/bytes/RawCopy dictionaries.

Parsing results in an int-like, bytes-like, or dict-like object whose only difference from the original is pretty-printing. If you look at the result, you will be presented with its repr, which remains as-is. If you print it, you will see its str, which is a hexlified representation. Building and sizeof defer to subcon.

To obtain a hexlified string (as before Hex and HexDump changed semantics), use binascii.(un)hexlify on parsed results.

Example:

>>> d = Hex(Int32ub)
>>> obj = d.parse(b"\x00\x00\x01\x02")
>>> obj
258
>>> print(obj)
0x00000102

>>> d = Hex(GreedyBytes)
>>> obj = d.parse(b"\x00\x00\x01\x02")
>>> obj
b'\x00\x00\x01\x02'
>>> print(obj)
unhexlify('00000102')

>>> d = Hex(RawCopy(Int32ub))
>>> obj = d.parse(b"\x00\x00\x01\x02")
>>> obj
{'data': b'\x00\x00\x01\x02',
 'length': 4,
 'offset1': 0,
 'offset2': 4,
 'value': 258}
>>> print(obj)
unhexlify('00000102')
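
The split between repr and str can be illustrated with a small int subclass. This is a hypothetical model of the idea, not the class malstruct returns; the fixed 8-digit width is an assumption chosen to match the Int32ub example.

```python
class HexInt(int):
    # repr() is inherited from int; only str() is overridden.
    def __str__(self):
        return "0x%08x" % int(self)

obj = HexInt(258)
as_repr = repr(obj)   # plain integer representation
as_str = str(obj)     # hexlified representation used by print()
```

Because the object is still an int, it can be used in arithmetic and passed back to build unchanged.
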
class malstruct.analysis.HexDump(subcon)

Bases: Adapter

Adapter for displaying hexlified representation of bytes/RawCopy dictionaries.

Parsing results in a bytes-like or dict-like object whose only difference from the original is pretty-printing. If you look at the result, you will be presented with its repr, which remains as-is. If you print it, you will see its str, which is a hexlified representation. Building and sizeof defer to subcon.

To obtain a hexlified string (as before Hex and HexDump changed semantics), use malstruct.lib.hexdump on parsed results.

Example:

>>> d = HexDump(GreedyBytes)
>>> obj = d.parse(b"\x00\x00\x01\x02")
>>> obj
b'\x00\x00\x01\x02'
>>> print(obj)
hexundump('''
0000   00 00 01 02                                       ....
''')

>>> d = HexDump(RawCopy(Int32ub))
>>> obj = d.parse(b"\x00\x00\x01\x02")
>>> obj
{'data': b'\x00\x00\x01\x02',
 'length': 4,
 'offset1': 0,
 'offset2': 4,
 'value': 258}
>>> print(obj)
hexundump('''
0000   00 00 01 02                                       ....
''')
class malstruct.analysis.HexString(subcon)

Bases: Adapter

Adapter used to convert an int into a hex string equivalent.

Example:

>>> HexString(Int32ul).build('0x123')
b'#\x01\x00\x00'
>>> HexString(Int32ul).parse(b'\x20\x01\x00\x00')
'0x120'
>>> HexString(Int16ub).parse(b'\x12\x34')
'0x1234'
>>> HexString(BytesInteger(20)).parse(b'\x01' * 20)
'0x101010101010101010101010101010101010101'
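
The same conversions can be reproduced with built-in integer methods, which helps when predicting the output for a given byte order. hexstring_decode and hexstring_encode are hypothetical helpers written for this sketch, not malstruct functions.

```python
def hexstring_decode(raw: bytes, byteorder: str) -> str:
    # Interpret the bytes as an unsigned integer and format it as hex.
    return hex(int.from_bytes(raw, byteorder))

def hexstring_encode(text: str, size: int, byteorder: str) -> bytes:
    # Accepts strings with or without the 0x prefix.
    return int(text, 16).to_bytes(size, byteorder)

little = hexstring_decode(b"\x20\x01\x00\x00", "little")
big = hexstring_decode(b"\x12\x34", "big")
built = hexstring_encode("0x123", 4, "little")
```

Note that hex() drops leading zeros, so the string does not reveal the width of the underlying field.
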
class malstruct.analysis.UUIDAdapter(subcon, le=True)

Bases: Adapter

Adapter used to convert parsed bytes to a string representing the UUID. The 16 bytes are decoded in little-endian order when le=True (the default) and as-is when le=False.

Example:

>>> UUIDAdapter(Bytes(16)).build('{12345678-1234-5678-1234-567812345678}')
b'xV4\x124\x12xV\x124Vx\x124Vx'
>>> UUIDAdapter(Bytes(16), le=False).build('{12345678-1234-5678-1234-567812345678}')
b'\x124Vx\x124Vx\x124Vx\x124Vx'
>>> UUIDAdapter(Bytes(16)).parse(b'xV4\x124\x12xV\x124Vx\x124Vx')
'{12345678-1234-5678-1234-567812345678}'
malstruct.analysis.UUID(le=True)

A convenience function for using the UUIDAdapter with 16 bytes.

Parameters:

le – Whether to use “bytes_le” or “bytes” when constructing the UUID.

Example:

>>> UUID().build('{12345678-1234-5678-1234-567812345678}')
b'xV4\x124\x12xV\x124Vx\x124Vx'
>>> UUID(le=False).build('{12345678-1234-5678-1234-567812345678}')
b'\x124Vx\x124Vx\x124Vx\x124Vx'
>>> UUID().parse(b'xV4\x124\x12xV\x124Vx\x124Vx')
'{12345678-1234-5678-1234-567812345678}'
>>> UUID(le=False).parse(b'\x124Vx\x124Vx\x124Vx\x124Vx')
'{12345678-1234-5678-1234-567812345678}'
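
The difference between the two byte orders can be seen with the standard uuid module, whose bytes and bytes_le attributes are the ones named in the le parameter description. The helpers below are a hypothetical sketch and may not match the adapter's internals.

```python
import uuid

def uuid_decode(raw: bytes, le: bool = True) -> str:
    u = uuid.UUID(bytes_le=raw) if le else uuid.UUID(bytes=raw)
    return "{%s}" % u

def uuid_encode(text: str, le: bool = True) -> bytes:
    u = uuid.UUID(text)  # braces around the text are accepted
    return u.bytes_le if le else u.bytes

TEXT = "{12345678-1234-5678-1234-567812345678}"
le_bytes = uuid_encode(TEXT)
be_bytes = uuid_encode(TEXT, le=False)
```

In bytes_le only the first three fields (4, 2 and 2 bytes) are byte-swapped; the final 8 bytes are identical in both encodings.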

malstruct.bytes_ module

Bytes and bits

class malstruct.bytes_.Bytes(length)

Bases: Construct

Field consisting of a specified number of bytes.

Parses into a bytes (of given length). Builds into the stream directly (but checks that given object matches specified length). Can also build from an integer for convenience (although BytesInteger should be used instead). Size is the specified length.

Can also build from a bytearray.

Parameters:

length – integer or context lambda

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • StringError – building from non-bytes value, perhaps unicode

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = Bytes(4)
>>> d.parse(b'beef')
b'beef'
>>> d.build(b'beef')
b'beef'
>>> d.build(0)
b'\x00\x00\x00\x00'
>>> d.sizeof()
4

>>> d = Struct(
...     "length" / Int8ub,
...     "data" / Bytes(this.length),
... )
>>> d.parse(b"\x04beef")
Container(length=4, data=b'beef')
>>> d.sizeof()
malstruct.core.SizeofError: cannot calculate size, key not found in context
malstruct.bytes_.GreedyBytes

Field consisting of unknown number of bytes.

Parses the stream to the end. Builds into the stream directly (without checks). Size is undefined.

Can also build from a bytearray.

Raises:
  • StreamError – stream failed when reading until EOF

  • StringError – building from non-bytes value, perhaps unicode

Example:

>>> GreedyBytes.parse(b"asislight")
b'asislight'
>>> GreedyBytes.build(b"asislight")
b'asislight'

malstruct.conditional module

Conditional constructs

class malstruct.conditional.Union(parsefrom=None, *subcons, **subconskw)

Bases: Construct

Treats the same data as multiple constructs (similar to C union) so you can look at the data in multiple views. Fields are usually named (so parsed values are inserted into dictionary under same name).

Parses subcons in sequence, and reverts the stream back to original position after each subcon. Afterwards, advances the stream by selected subcon. Builds from first subcon that has a matching key in given dict. Size is undefined (because parsefrom is not used for building).

This class does context nesting, meaning its members are given access to a new dictionary where the “_” entry points to the outer context. When parsing, each member gets parsed and the subcon's return value is inserted into the context under the matching key, but only if the member was named. When building, the matching entry gets inserted into the context before the subcon gets built, and if the subcon build returns a new value (not None), that value replaces the entry in the context.

This class exposes subcons as attributes. You can refer to subcons that were inlined (and therefore do not exist as variable in the namespace) by accessing the struct attributes, under same name. Also note that compiler does not support this feature. See examples.

This class exposes subcons in the context. You can refer to subcons that were inlined (and therefore do not exist as variable in the namespace) within other inlined fields using the context. Note that you need to use a lambda (this expression is not supported). Also note that compiler does not support this feature. See examples.

Warning

If you skip parsefrom parameter then stream will be left back at starting offset, not seeked to any common denominator.

Parameters:
  • parsefrom – how to leave stream after parsing, can be integer index or string name selecting a subcon, or None (leaves stream at initial offset, the default), or context lambda

  • *subcons – Construct instances, list of members, some can be anonymous

  • **subconskw – Construct instances, list of members (requires Python 3.6)

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • StreamError – stream is not seekable and tellable

  • UnionError – selector does not match any subcon, or dict given to build does not contain any keys matching any subcon

  • IndexError – selector does not match any subcon

  • KeyError – selector does not match any subcon

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = Union(0,
...     "raw" / Bytes(8),
...     "ints" / Int32ub[2],
...     "shorts" / Int16ub[4],
...     "chars" / Byte[8],
... )
>>> d.parse(b"12345678")
Container(raw=b'12345678', ints=[825373492, 892745528], shorts=[12594, 13108, 13622, 14136], chars=[49, 50, 51, 52, 53, 54, 55, 56])
>>> d.build(dict(chars=range(8)))
b'\x00\x01\x02\x03\x04\x05\x06\x07'
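
The "multiple views of the same data" idea can be reproduced with the struct module by decoding every view from the same starting offset. This is a plain-Python model of the parse result above, not how Union is implemented; parse_views is a hypothetical helper.

```python
import struct

def parse_views(data: bytes) -> dict:
    # Every view decodes the same 8 bytes, starting again from offset 0.
    return {
        "raw": data[:8],
        "ints": list(struct.unpack_from(">2I", data, 0)),
        "shorts": list(struct.unpack_from(">4H", data, 0)),
        "chars": list(struct.unpack_from(">8B", data, 0)),
    }

views = parse_views(b"12345678")
```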

>>> d = Union(None,
...     "animal" / Enum(Byte, giraffe=1),
... )
>>> d.animal.giraffe
'giraffe'
>>> d = Union(None,
...     "chars" / Byte[4],
...     "data" / Bytes(lambda this: this._subcons.chars.sizeof()),
... )
>>> d.parse(b"\x01\x02\x03\x04")
Container(chars=[1, 2, 3, 4], data=b'\x01\x02\x03\x04')

Alternative syntax, but requires Python 3.6 or any PyPy:
>>> Union(0, raw=Bytes(8), ints=Int32ub[2], shorts=Int16ub[4], chars=Byte[8])
class malstruct.conditional.Select(*subcons, **subconskw)

Bases: Construct

Selects the first matching subconstruct.

Parses and builds by literally trying each subcon in sequence until one of them parses or builds without exception. Stream gets reverted back to original position after each failed attempt, but not if parsing succeeds. Size is not defined.

Parameters:
  • *subcons – Construct instances, list of members, some can be anonymous

  • **subconskw – Construct instances, list of members (requires Python 3.6)

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • StreamError – stream is not seekable and tellable

  • SelectError – none of the subcons succeeded when parsing or building

Example:

>>> d = Select(Int32ub, CString("utf8"))
>>> d.build(1)
b'\x00\x00\x00\x01'
>>> d.build(u"Афон")
b'\xd0\x90\xd1\x84\xd0\xbe\xd0\xbd\x00'
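
The try-each-in-order behaviour can be modelled with a loop over candidate builders. This is a sketch of the control flow only; build_int32ub, build_cstring_utf8 and select_build are hypothetical stand-ins, and ValueError takes the place of SelectError.

```python
import struct

def build_int32ub(obj) -> bytes:
    return struct.pack(">I", obj)        # raises struct.error for non-integers

def build_cstring_utf8(obj) -> bytes:
    return obj.encode("utf8") + b"\x00"  # raises AttributeError for non-strings

def select_build(obj, builders) -> bytes:
    # Return the output of the first builder that does not raise.
    for builder in builders:
        try:
            return builder(obj)
        except (struct.error, AttributeError, TypeError):
            continue
    raise ValueError("no subcon succeeded")

BUILDERS = [build_int32ub, build_cstring_utf8]
number = select_build(1, BUILDERS)
text = select_build("Афон", BUILDERS)
```

Order matters: a value that several subcons could handle is always taken by the earliest one in the list.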

Alternative syntax, but requires Python 3.6 or any PyPy:
>>> Select(num=Int32ub, text=CString("utf8"))
malstruct.conditional.Optional(subcon)

Makes an optional field.

Parsing attempts to parse subcon. If sub-parsing fails, returns None and reports success. Building attempts to build subcon. If sub-building fails, writes nothing and reports success. Size is undefined, because whether bytes would be consumed or produced depends on actual data and actual context.

Parameters:

subcon – Construct instance

Example:

Optional  <-->  Select(subcon, Pass)

>>> d = Optional(Int64ul)
>>> d.parse(b"12345678")
4050765991979987505
>>> d.parse(b"")
None
>>> d.build(1)
b'\x01\x00\x00\x00\x00\x00\x00\x00'
>>> d.build(None)
b''
malstruct.conditional.If(condfunc, subcon)

If-then conditional construct.

Parsing evaluates the condition: if True, the subcon is parsed; otherwise None is returned. Building also evaluates the condition: if True, the subcon is built; otherwise nothing is written. Size is either the same as the subcon's or 0, depending on how condfunc evaluates.

Parameters:
  • condfunc – bool or context lambda (or a truthy value)

  • subcon – Construct instance, used if condition indicates True

Raises:

StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

If <--> IfThenElse(condfunc, subcon, Pass)

>>> d = If(this.x > 0, Byte)
>>> d.build(255, x=1)
b'\xff'
>>> d.build(255, x=0)
b''
class malstruct.conditional.IfThenElse(condfunc, thensubcon, elsesubcon)

Bases: Construct

If-then-else conditional construct, similar to ternary operator.

Parsing and building evaluates condition, and defers to either subcon depending on the value. Size is computed the same way.

Parameters:
  • condfunc – bool or context lambda (or a truthy value)

  • thensubcon – Construct instance, used if condition indicates True

  • elsesubcon – Construct instance, used if condition indicates False

Raises:

StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = IfThenElse(this.x > 0, VarInt, Byte)
>>> d.build(255, x=1)
b'\xff\x01'
>>> d.build(255, x=0)
b'\xff'
class malstruct.conditional.Switch(keyfunc, cases, default=None)

Bases: Construct

A conditional branch.

Parsing and building evaluate keyfunc and select a subcon based on the value and dictionary entries. Dictionary (cases) maps values into subcons. If no case matches then default is used (that is Pass by default). Note that default is a Construct instance, not a dictionary key. Size is evaluated in same way as parsing and building, by evaluating keyfunc and selecting a field accordingly.

Parameters:
  • keyfunc – context lambda or constant, that matches some key in cases

  • cases – dict mapping keys to Construct instances

  • default – optional, Construct instance, used when keyfunc is not found in cases, Pass is default value for this parameter, Error is a possible value for this parameter

Raises:

StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = Switch(this.n, { 1:Int8ub, 2:Int16ub, 4:Int32ub })
>>> d.build(5, n=1)
b'\x05'
>>> d.build(5, n=4)
b'\x00\x00\x00\x05'

>>> d = Switch(this.n, {}, default=Byte)
>>> d.parse(b"\x01", n=255)
1
>>> d.build(1, n=255)
b"\x01"
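
The dispatch amounts to a dictionary lookup with a fallback. The sketch below models it with struct format strings standing in for subcons; CASES and switch_build are hypothetical names, and returning empty bytes imitates the Pass default.

```python
import struct

# key -> struct format, standing in for Int8ub, Int16ub and Int32ub
CASES = {1: ">B", 2: ">H", 4: ">I"}

def switch_build(obj, key, cases, default=None) -> bytes:
    fmt = cases.get(key, default)
    if fmt is None:
        return b""  # Pass-like default: write nothing
    return struct.pack(fmt, obj)

one_byte = switch_build(5, 1, CASES)
four_bytes = switch_build(5, 4, CASES)
fallback = switch_build(1, 255, {}, default=">B")
no_match = switch_build(5, 3, CASES)
```

Because the default is itself a construct rather than a key, an unmatched key silently builds nothing unless a stricter default is supplied.
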
class malstruct.conditional.StopIf(condfunc)

Bases: Construct

Checks for a condition, and stops certain classes (Struct, Sequence, GreedyRange) from parsing or building further.

Parsing and building check the condition, and raise StopFieldError if indicated. Size is undefined.

Parameters:

condfunc – bool or context lambda (or truthy value)

Raises:

StopFieldError – used internally

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> Struct('x'/Byte, StopIf(this.x == 0), 'y'/Byte)
>>> Sequence('x'/Byte, StopIf(this.x == 0), 'y'/Byte)
>>> GreedyRange(FocusedSeq(0, 'x'/Byte, StopIf(this.x == 0)))

malstruct.core module

Internal methods, abstract constructs, structures, sequences, and arrays

malstruct.core.mergefields(*subcons)
malstruct.core.hyphenatedict(d)
malstruct.core.hyphenatelist(l)
malstruct.core.extractfield(sc)
malstruct.core.evaluate(param, context)
class malstruct.core.Construct

Bases: object

The mother of all constructs.

This object is generally not directly instantiated, and it does not directly implement parsing and building, so it is largely only of interest to subclass implementors. There are also other abstract classes sitting on top of this one.

The external user API:

  • parse

  • parse_stream

  • parse_file

  • build

  • build_stream

  • build_file

  • sizeof

  • compile

  • benchmark

Subclass authors should not override the external methods. Instead, another API is available:

  • _parse

  • _build

  • _sizeof

  • _actualsize

  • _emitparse

  • _emitbuild

  • _emitseq

  • _emitprimitivetype

  • _emitfulltype

  • __getstate__

  • __setstate__

Attributes and Inheritance:

All constructs have a name and flags. The name is used for naming struct members and context dictionaries. Note that the name can be a string, or None by default. A single underscore “_” is a reserved name, used as up-level in nested containers. The name should be descriptive, short, and valid as a Python identifier, although these rules are not enforced. The flags specify additional behavioral information about this construct. Flags are used by enclosing constructs to determine a proper course of action. Flags are often inherited from inner subconstructs but that depends on each class.

parse(data, **contextkw)

Parse an in-memory buffer (often bytes object). Strings, buffers, memoryviews, and other complete buffers can be parsed with this method.

Whenever data cannot be read, ConstructError or its derivative is raised. This method is NOT ALLOWED to raise any other exceptions although (1) user-defined lambdas can raise arbitrary exceptions which are propagated (2) external libraries like numpy can raise arbitrary exceptions which are propagated (3) some list and dict lookups can raise IndexError and KeyError which are propagated.

Context entries are passed only as keyword parameters **contextkw.

Parameters:

**contextkw – context entries, usually empty

Returns:

some value, usually based on bytes read from the stream, but sometimes computed from nothing or from the context dictionary; sometimes it is non-deterministic

Raises:

ConstructError – raised for any reason

parse_stream(stream, **contextkw)

Parse a stream. Files, pipes, sockets, and other streaming sources of data are handled by this method. See parse().

parse_file(filename, **contextkw)

Parse a closed binary file. See parse().

build(obj, **contextkw)

Build an object in memory (a bytes object).

Whenever data cannot be written, ConstructError or its derivative is raised. This method is NOT ALLOWED to raise any other exceptions although (1) user-defined lambdas can raise arbitrary exceptions which are propagated (2) external libraries like numpy can raise arbitrary exceptions which are propagated (3) some list and dict lookups can raise IndexError and KeyError which are propagated.

Context entries are passed only as keyword parameters **contextkw.

Parameters:

**contextkw – context entries, usually empty

Returns:

bytes

Raises:

ConstructError – raised for any reason

build_stream(obj, stream, **contextkw)

Build an object directly into a stream. See build().

build_file(obj, filename, **contextkw)

Build an object into a closed binary file. See build().

sizeof(**contextkw)

Calculate the size of this object, optionally using a context.

Some constructs have fixed size (like FormatField), some have variable-size and can determine their size given a context entry (like Bytes(this.otherfield1)), and some cannot determine their size (like VarInt).

Whenever size cannot be determined, SizeofError is raised. This method is NOT ALLOWED to raise any other exception, even if, e.g., the context dictionary is missing a key or a subcon propagates a ConstructError-derived exception.

Context entries are passed only as keyword parameters **contextkw.

Parameters:

**contextkw – context entries, usually empty

Returns:

integer if computable; otherwise SizeofError is raised

Raises:

SizeofError – size could not be determined in actual context, or is impossible to be determined

class malstruct.core.Subconstruct(subcon)

Bases: Construct

Abstract subconstruct (wraps an inner construct, inheriting its name and flags). Parsing and building is by default deferred to subcon, same as sizeof.

Parameters:

subcon – Construct instance

class malstruct.core.Adapter(subcon)

Bases: Subconstruct

Abstract adapter class.

Needs to implement _decode() for parsing and _encode() for building.

Parameters:

subcon – Construct instance

class malstruct.core.SymmetricAdapter(subcon)

Bases: Adapter

Abstract adapter class.

Needs to implement _decode() only, for both parsing and building.

Parameters:

subcon – Construct instance

class malstruct.core.Validator(subcon)

Bases: SymmetricAdapter

Abstract class that validates a condition on the encoded/decoded object.

Needs to implement _validate() that returns a bool (or a truthy value)

Parameters:

subcon – Construct instance

class malstruct.core.Tunnel(subcon)

Bases: Subconstruct

Abstract class that allows other constructs to read part of the stream as if they were reading the entire stream. See Prefixed for example.

Needs to implement _decode() for parsing and _encode() for building.

class malstruct.core.Computed(func)

Bases: Construct

Field computing a value from the context dictionary or some outer source like os.urandom or random module. Underlying byte stream is unaffected. The source can be non-deterministic.

Parsing and Building return the value returned by the context lambda (although a constant value can also be used). Size is defined as 0 because parsing and building does not consume or produce bytes into the stream.

Parameters:

func – context lambda or constant value

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = Struct(
...     "width" / Byte,
...     "height" / Byte,
...     "total" / Computed(this.width * this.height),
... )
>>> d.build(dict(width=4,height=5))
b'\x04\x05'
>>> d.parse(b"12")
Container(width=49, height=50, total=2450)
>>> d = Computed(7)
>>> d.parse(b"")
7
>>> d = Computed(lambda ctx: 7)
>>> d.parse(b"")
7
>>> import os
>>> d = Computed(lambda ctx: os.urandom(10))
>>> d.parse(b"")
b'\x98\xc2\xec\x10\x07\xf5\x8e\x98\xc2\xec'
class malstruct.core.Struct(*subcons, **subconskw)

Bases: Construct

Sequence of usually named constructs, similar to structs in C. The members are parsed and built in the order they are defined. If a member is anonymous (its name is None) then it gets parsed and the value discarded, or it gets built from nothing (from None).

Some fields do not need to be named, since they are built without value anyway. See Const, Padding, Check, Error, Pass, Terminated, Seek, and Tell for examples of such fields.

Operator + can also be used to make Structs (although not recommended).

Parses into a Container (dict with attribute and key access) where keys match subcon names. Builds from a dict (not necessarily a Container) where each member gets a value from the dict matching the subcon name. If a field has the build-from-none flag, it gets built even when there is no matching entry in the dict. Size is the sum of all subcon sizes, unless any subcon raises SizeofError.

This class does context nesting, meaning its members are given access to a new dictionary where the “_” entry points to the outer context. When parsing, each member gets parsed and the subcon parse return value is inserted into the context under the matching key only if the member was named. When building, the matching entry gets inserted into the context before the subcon gets built, and if the subcon build returns a new value (not None), that value replaces the entry in the context.

This class exposes subcons as attributes. You can refer to subcons that were inlined (and therefore do not exist as variable in the namespace) by accessing the struct attributes, under same name. Also note that compiler does not support this feature. See examples.

This class exposes subcons in the context. You can refer to subcons that were inlined (and therefore do not exist as variable in the namespace) within other inlined fields using the context. Note that you need to use a lambda (this expression is not supported). Also note that compiler does not support this feature. See examples.

This class supports stopping. If StopIf field is a member, and it evaluates its lambda as positive, this class ends parsing or building as successful without processing further fields.

Parameters:
  • *subcons – Construct instances, list of members, some can be anonymous

  • **subconskw – Construct instances, list of members (requires Python 3.6)

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • KeyError – building a subcon but found no corresponding key in dictionary

Example:

>>> d = Struct("num"/Int8ub, "data"/Bytes(this.num))
>>> d.parse(b"\x04DATA")
Container(num=4, data=b"DATA")
>>> d.build(dict(num=4, data=b"DATA"))
b"\x04DATA"

>>> d = Struct(Const(b"MZ"), Padding(2), Pass, Terminated)
>>> d.build({})
b'MZ\x00\x00'
>>> d.parse(_)
Container()
>>> d.sizeof()
4

>>> d = Struct(
...     "animal" / Enum(Byte, giraffe=1),
... )
>>> d.animal.giraffe
'giraffe'
>>> d = Struct(
...     "count" / Byte,
...     "data" / Bytes(lambda this: this.count - this._subcons.count.sizeof()),
... )
>>> d.build(dict(count=3, data=b"12"))
b'\x0312'

Alternative syntax (not recommended):
>>> ("a"/Byte + "b"/Byte + "c"/Byte + "d"/Byte)

Alternative syntax, but requires Python 3.6 or any PyPy:
>>> Struct(a=Byte, b=Byte, c=Byte, d=Byte)
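
The context nesting and the handling of anonymous members described above can be sketched in plain Python. This is an illustrative model only, not the library's code: `parse_struct` and the `(name, reader)` pair convention are hypothetical, and each reader is assumed to be called as `reader(stream, context)`.

```python
import io


def parse_struct(fields, stream, outer_context=None):
    """Parse (name, reader) pairs in order, sharing one nested context."""
    context = {"_": outer_context}  # "_" points at the outer context
    for name, reader in fields:
        value = reader(stream, context)
        if name is not None:  # anonymous members are parsed, then discarded
            context[name] = value
    return {key: value for key, value in context.items() if key != "_"}


fields = [
    (None, lambda s, ctx: s.read(2)),             # anonymous 2-byte magic
    ("num", lambda s, ctx: s.read(1)[0]),         # one-byte length
    ("data", lambda s, ctx: s.read(ctx["num"])),  # length taken from the context
]
parsed = parse_struct(fields, io.BytesIO(b"MZ\x04DATA"))
print(parsed)  # {'num': 4, 'data': b'DATA'}
```

Later readers see earlier named values through the shared context, which is what lets the length of `data` depend on `num`.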
class malstruct.core.Sequence(*subcons, **subconskw)

Bases: Construct

Sequence of usually unnamed constructs. The members are parsed and built in the order they are defined. If a member is named, its parsed value gets inserted into the context. This allows using members that refer to previous members.

Operator >> can also be used to make Sequences (although not recommended).

Parses into a ListContainer (list with pretty-printing) where values are in same order as subcons. Builds from a list (not necessarily a ListContainer) where each subcon is given the element at respective position. Size is the sum of all subcon sizes, unless any subcon raises SizeofError.

This class does context nesting, meaning its members are given access to a new dictionary where the “_” entry points to the outer context. When parsing, each member gets parsed and the subcon parse return value is inserted into the context under the matching key only if the member was named. When building, the matching entry gets inserted into the context before the subcon gets built, and if the subcon build returns a new value (not None), that value replaces the entry in the context.

This class exposes subcons as attributes. You can refer to subcons that were inlined (and therefore do not exist as variable in the namespace) by accessing the sequence attributes, under same name. Also note that compiler does not support this feature. See examples.

This class exposes subcons in the context. You can refer to subcons that were inlined (and therefore do not exist as variable in the namespace) within other inlined fields using the context. Note that you need to use a lambda (this expression is not supported). Also note that compiler does not support this feature. See examples.

This class supports stopping. If StopIf field is a member, and it evaluates its lambda as positive, this class ends parsing or building as successful without processing further fields.

Parameters:
  • *subcons – Construct instances, list of members, some can be named

  • **subconskw – Construct instances, list of members (requires Python 3.6)

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • KeyError – building a subcon but found no corresponding key in dictionary

Example:

>>> d = Sequence(Byte, Float32b)
>>> d.build([0, 1.23])
b'\x00?\x9dp\xa4'
>>> d.parse(_)
[0, 1.2300000190734863] # a ListContainer

>>> d = Sequence(
...     "animal" / Enum(Byte, giraffe=1),
... )
>>> d.animal.giraffe
'giraffe'
>>> d = Sequence(
...     "count" / Byte,
...     "data" / Bytes(lambda this: this.count - this._subcons.count.sizeof()),
... )
>>> d.build([3, b"12"])
b'\x0312'

Alternative syntax (not recommended):
>>> (Byte >> Byte >> "c"/Byte >> "d"/Byte)

Alternative syntax, but requires Python 3.6 or any PyPy:
>>> Sequence(a=Byte, b=Byte, c=Byte, d=Byte)
class malstruct.core.Array(count, subcon, discard=False)

Bases: Subconstruct

Homogeneous array of elements, similar to C# generic T[].

Parses into a ListContainer (a list). Parsing and building process an exact number of elements. If the given list has more or fewer than count elements, raises RangeError. Size is defined as count multiplied by subcon size, but only if subcon is fixed size.

Operator [] can be used to make Array instances (recommended syntax).

Parameters:
  • count – integer or context lambda, strict amount of elements

  • subcon – Construct instance, subcon to process individual elements

  • discard – optional, bool, if set then parsing returns empty list

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • RangeError – specified count is not valid

  • RangeError – given object has different length than specified count

Can propagate any exception from the lambdas, possibly non-ConstructError.

Example:

>>> d = Array(5, Byte) or Byte[5]
>>> d.build(range(5))
b'\x00\x01\x02\x03\x04'
>>> d.parse(_)
[0, 1, 2, 3, 4]
class malstruct.core.Range(min, max, subcon)

Bases: Subconstruct

A homogeneous array of elements. The subcon is repeated between min and max times. If an exception occurs (EOF, validation error), the repeater exits cleanly. If fewer than min elements have been successfully parsed, a RangeError is raised.

See also

Analog GreedyRange() that parses until end of stream.

Note

This object requires a seekable stream for parsing.

Parameters:
  • min – the minimal count

  • max – the maximal count

  • subcon – the subcon to process individual elements

Example:

>>> Range(3, 5, Byte).build([1,2,3,4])
b'\x01\x02\x03\x04'
>>> Range(3, 5, Byte).parse(_)
ListContainer([1, 2, 3, 4])

>>> Range(3, 5, Byte).build([1,2])
Traceback (most recent call last):
    ...
RangeError: expected from 3 to 5 elements, found 2
>>> Range(3, 5, Byte).build([1,2,3,4,5,6])
Traceback (most recent call last):
    ...
RangeError: expected from 3 to 5 elements, found 6
min
max
class malstruct.core.GreedyRange(subcon, discard=False)

Bases: Subconstruct

Homogeneous array of elements, similar to C# generic IEnumerable<T>, but works with unknown count of elements by parsing until end of stream.

Parses into a ListContainer (a list). Parsing stops when an exception occurs while parsing the subcon, either due to EOF or because the subcon cannot parse the data. In either case, GreedyRange seeks the stream back to the position after the last successful subcon parse. Builds from an enumerable, each element as-is. Size is undefined.

This class supports stopping. If StopIf field is a member, and it evaluates its lambda as positive, this class ends parsing or building as successful without processing further fields.

Parameters:
  • subcon – Construct instance, subcon to process individual elements

  • discard – optional, bool, if set then parsing returns empty list

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • StreamError – stream is not seekable and tellable

Can propagate any exception from the lambdas, possibly non-ConstructError.

Example:

>>> d = GreedyRange(Byte)
>>> d.build(range(8))
b'\x00\x01\x02\x03\x04\x05\x06\x07'
>>> d.parse(_)
[0, 1, 2, 3, 4, 5, 6, 7]
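
The stop-and-rewind behaviour described above can be sketched with the standard library alone. This is a simplified model under stated assumptions, not the library's implementation: `greedy_parse_uint16` is a hypothetical helper whose element type is fixed to a big-endian 16-bit integer, and the only failure it models is running out of bytes.

```python
import io
import struct


def greedy_parse_uint16(stream):
    """Read big-endian 16-bit integers until one cannot be parsed."""
    items = []
    while True:
        last_good = stream.tell()
        chunk = stream.read(2)
        if len(chunk) < 2:
            # Failure: rewind to just after the last successful element.
            stream.seek(last_good)
            return items
        items.append(struct.unpack(">H", chunk)[0])


stream = io.BytesIO(b"\x00\x01\x00\x02\xff")
values = greedy_parse_uint16(stream)
print(values)         # [1, 2]
print(stream.tell())  # 4, so the trailing byte is left unread
```

Because the stream is rewound, a following field can still consume the leftover byte, which is why a seekable stream is needed.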
class malstruct.core.RepeatUntil(predicate, subcon, discard=False)

Bases: Subconstruct

Homogeneous array of elements, similar to C# generic IEnumerable<T>, that repeats until the predicate indicates it to stop. Note that the last element (the one for which the predicate returned True) is included in the returned list.

Parse iterates indefinitely until an element passes the predicate. Build iterates over the given list until an element passes the predicate (or raises RepeatError if no element passes it). Size is undefined.

Parameters:
  • predicate – lambda that takes (obj, list, context) and returns True to break or False to continue (or a truthy value)

  • subcon – Construct instance, subcon used to parse and build each element

  • discard – optional, bool, if set then parsing returns empty list

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • RepeatError – consumed all elements in the stream but none passed the predicate

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = RepeatUntil(lambda x,lst,ctx: x > 7, Byte)
>>> d.build(range(20))
b'\x00\x01\x02\x03\x04\x05\x06\x07\x08'
>>> d.parse(b"\x01\xff\x02")
[1, 255]

>>> d = RepeatUntil(lambda x,lst,ctx: lst[-2:] == [0,0], Byte)
>>> d.parse(b"\x01\x00\x00\xff")
[1, 0, 0]
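
The loop semantics, including the fact that the terminating element is kept, can be modelled in a few lines of plain Python. This is a sketch only: `repeat_until` is a hypothetical function operating on an iterable of already-decoded values, and `ValueError` stands in for the library's own error.

```python
def repeat_until(predicate, values, context=None):
    """Collect values until predicate(value, collected, context) is truthy."""
    collected = []
    for value in values:
        collected.append(value)  # the terminating element is included
        if predicate(value, collected, context):
            return collected
    # Stand-in for the library's error when nothing passes the predicate.
    raise ValueError("no element satisfied the predicate")


first = repeat_until(lambda x, lst, ctx: x > 7, range(20))
second = repeat_until(lambda x, lst, ctx: lst[-2:] == [0, 0], b"\x01\x00\x00\xff")
print(first)   # [0, 1, 2, 3, 4, 5, 6, 7, 8]
print(second)  # [1, 0, 0]
```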
class malstruct.core.Renamed(subcon, newname=None, newdocs=None, newparsed=None)

Bases: Subconstruct

Special wrapper that allows a Struct (or other similar class) to see a field as having a name (or a different name) or having a parsed hook. Library classes do not have names (the name is None). Renamed does not change a field, only wraps it like a candy with a label. Used internally by the / and * operators.

Also this wrapper is responsible for building a path info (a chain of names) that gets attached to error message when parsing, building, or sizeof fails. Fields that are not named do not appear in the path string.

Parsing, building, and size are deferred to subcon.

Parameters:
  • subcon – Construct instance

  • newname – optional, string

  • newdocs – optional, string

  • newparsed – optional, lambda

Example:

>>> "number" / Int32ub
<Renamed: number>

malstruct.debug module

class malstruct.debug.Probe(into=None, lookahead=128, name=None)

Bases: Construct

Probe that dumps the context, and some stream content (peeks into it) to the screen to aid the debugging process. It can optionally limit itself to a single context entry, instead of printing entire context.
  • The lookahead stream is enabled by default

  • Use hexdump instead of hexlify to display lookahead stream

  • Allows for setting a name

Parameters:
  • into – optional, None by default, or context lambda

  • lookahead – optional, integer, number of bytes to dump from the stream

Example:

>>> d = Struct(
...     "count" / Byte,
...     "items" / Byte[this.count],
...     Probe(lookahead=32),
... )
>>> d.parse(b"\x05abcde\x01\x02\x03")

--------------------------------------------------
Probe, path is (parsing), into is None
Stream peek: (hexlified) b'010203'...
Container:
    count = 5
    items = ListContainer:
        97
        98
        99
        100
        101
--------------------------------------------------
>>> d = Struct(
...     "count" / Byte,
...     "items" / Byte[this.count],
...     Probe(this.count),
... )
>>> d.parse(b"\x05abcde\x01\x02\x03")

--------------------------------------------------
Probe, path is (parsing), into is this.count
5
--------------------------------------------------
printout(stream, context, path)
class malstruct.debug.Debugger(subcon)

Bases: Subconstruct

PDB-based debugger. When an exception occurs in the subcon, a debugger will appear and allow you to debug the error (and even fix it on-the-fly).

Parameters:

subcon – Construct instance, subcon to debug

Example:

>>> Debugger(Byte[3]).build([])

--------------------------------------------------
Debugging exception of <Array: None>
path is (building)
  File "/media/ciphertechsolutions/MAIN/GitHub/ciphertechsolutions/malstruct/debug.py", line 192, in _build
    return self.subcon._build(obj, stream, context, path)
  File "/media/ciphertechsolutions/MAIN/GitHub/ciphertechsolutions/malstruct/core.py", line 2149, in _build
    raise RangeError("expected %d elements, found %d" % (count, len(obj)))
malstruct.core.RangeError: expected 3 elements, found 0

> /media/ciphertechsolutions/MAIN/GitHub/ciphertechsolutions/malstruct/core.py(2149)_build()
-> raise RangeError("expected %d elements, found %d" % (count, len(obj)))
(Pdb) q
--------------------------------------------------
handle_exc(path, msg=None)

malstruct.exceptions module

exception malstruct.exceptions.ConstructError(message='', path=None)

Bases: Exception

This is the root of all exceptions raised by parsing classes in this library. Note that the helper functions in lib module can raise standard ValueError (but parsing classes are not allowed to).

exception malstruct.exceptions.SizeofError(message='', path=None)

Bases: ConstructError

Parsing classes' sizeof() methods are only allowed to either return an integer or raise SizeofError. Note that this exception can mean the parsing class cannot be measured a priori in principle, but it can also mean that it just cannot be measured in these particular circumstances (e.g. there is a key missing in the context dictionary at this time).

exception malstruct.exceptions.AdaptationError(message='', path=None)

Bases: ConstructError

Currently not used.

exception malstruct.exceptions.ValidationError(message='', path=None)

Bases: ConstructError

Parsing classes derived from Validator or ExprValidator can raise this exception, for example OneOf and NoneOf. It can mean that the parse or build value is or is not one of the specified values.

exception malstruct.exceptions.StreamError(message='', path=None)

Bases: ConstructError

Almost all parsing classes can raise this exception: it can mean a variety of things. Maybe requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, could not write all bytes, stream is not seekable, stream is not tellable, etc. Note that there are a few parsing classes that do not use the stream to compute output and therefore do not raise this exception.

exception malstruct.exceptions.FormatFieldError(message='', path=None)

Bases: ConstructError

Only one parsing class can raise this exception: FormatField. It can either mean the format string is invalid or the value is not valid for provided format string. See standard struct module for what is acceptable.

exception malstruct.exceptions.IntegerError(message='', path=None)

Bases: ConstructError

Only some numeric parsing classes can raise this exception: BytesInteger BitsInteger VarInt ZigZag. It can mean either the length parameter is invalid, the value is not an integer, the value is negative or too low or too high for given parameters, or the selected endianness cannot be applied.

exception malstruct.exceptions.StringError(message='', path=None)

Bases: ConstructError

Almost all parsing classes can raise this exception: it can mean a unicode string was passed instead of bytes, or bytes were passed instead of a unicode string. Also some classes can raise it explicitly: PascalString CString GreedyString. It can mean no encoding or invalid encoding was selected. Note that currently, if the data cannot be encoded or decoded with the selected encoding, then UnicodeEncodeError or UnicodeDecodeError is raised, and these are not rooted at ConstructError.

exception malstruct.exceptions.MappingError(message='', path=None)

Bases: ConstructError

Few parsing classes can raise this exception: Enum FlagsEnum Mapping. It can mean the build value is not recognized and therefore cannot be mapped onto bytes.

exception malstruct.exceptions.RangeError(message='', path=None)

Bases: ConstructError

Few parsing classes can raise this exception: Array PrefixedArray LazyArray. It can mean the count parameter is invalid, or the build object has too little or too many elements.

exception malstruct.exceptions.RepeatError(message='', path=None)

Bases: ConstructError

Only one parsing class can raise this exception: RepeatUntil. It can mean none of the elements in the build object passed the given predicate.

exception malstruct.exceptions.ConstError(message='', path=None)

Bases: ConstructError

Only one parsing class can raise this exception: Const. It can mean the wrong data was parsed, or wrong object was built from.

exception malstruct.exceptions.IndexFieldError(message='', path=None)

Bases: ConstructError

Only one parsing class can raise this exception: Index. It can mean the class was not nested in an array parsing class properly and therefore cannot access the _index context key.

exception malstruct.exceptions.CheckError(message='', path=None)

Bases: ConstructError

Only one parsing class can raise this exception: Check. It can mean the condition lambda failed during a routine parsing building check.

exception malstruct.exceptions.ExplicitError(message='', path=None)

Bases: ConstructError

Only one parsing class can raise this exception: Error. It can mean the parsing class was merely parsed or built with.

exception malstruct.exceptions.NamedTupleError(message='', path=None)

Bases: ConstructError

Only one parsing class can raise this exception: NamedTuple. It can mean the subcon is not of a valid type.

exception malstruct.exceptions.TimestampError(message='', path=None)

Bases: ConstructError

Only one parsing class can raise this exception: Timestamp. It can mean the subcon unit or epoch are invalid.

exception malstruct.exceptions.UnionError(message='', path=None)

Bases: ConstructError

Only one parsing class can raise this exception: Union. It can mean none of given subcons was properly selected, or trying to build without providing a proper value.

exception malstruct.exceptions.SelectError(message='', path=None)

Bases: ConstructError

Only one parsing class can raise this exception: Select. It can mean none of the subcons succeeded when parsing or building.

exception malstruct.exceptions.SwitchError(message='', path=None)

Bases: ConstructError

Currently not used.

exception malstruct.exceptions.StopFieldError(message='', path=None)

Bases: ConstructError

Only one parsing class can raise this exception: StopIf. It can mean the given condition was met during parsing or building.

exception malstruct.exceptions.PaddingError(message='', path=None)

Bases: ConstructError

Multiple parsing classes can raise this exception: PaddedString Padding Padded Aligned FixedSized NullTerminated NullStripped. It can mean multiple issues: the encoded string or bytes takes more bytes than padding allows, length parameter was invalid, pattern terminator or pad is not a proper bytes value, modulus was less than 2.

exception malstruct.exceptions.TerminatedError(message='', path=None)

Bases: ConstructError

Only one parsing class can raise this exception: Terminated. It can mean EOF was not found as expected during parsing.

exception malstruct.exceptions.RawCopyError(message='', path=None)

Bases: ConstructError

Only one parsing class can raise this exception: RawCopy. It can mean it cannot build as both data and value keys are missing from build dict object.

exception malstruct.exceptions.RotationError(message='', path=None)

Bases: ConstructError

Only one parsing class can raise this exception: ProcessRotateLeft. It can mean the specified group is less than 1, data is not of valid length.

exception malstruct.exceptions.ChecksumError(message='', path=None)

Bases: ConstructError

Only one parsing class can raise this exception: Checksum. It can mean expected and actual checksum do not match.

exception malstruct.exceptions.CancelParsing(message='', path=None)

Bases: ConstructError

This exception can only be raised explicitly by the user, and it causes the parsing class to stop what it is doing (interrupts parsing or building).

exception malstruct.exceptions.CipherError(message='', path=None)

Bases: ConstructError

Two parsing classes can raise this exception: EncryptedSym EncryptedSymAead. It can mean no cipher object or an invalid cipher object was provided.

exception malstruct.exceptions.IterError(message='', path=None)

Bases: ConstructError

malstruct.expr module

class malstruct.expr.ExprMixin

Bases: object

class malstruct.expr.UniExpr(op, operand)

Bases: ExprMixin

class malstruct.expr.BinExpr(op, lhs, rhs)

Bases: ExprMixin

class malstruct.expr.Path(name, field=None, parent=None)

Bases: ExprMixin

class malstruct.expr.Path2(name, index=None, parent=None)

Bases: ExprMixin

class malstruct.expr.FuncPath(func, operand=None)

Bases: ExprMixin

malstruct.helpers module

malstruct.helpers.singleton(arg)

Instantiate the decorated class/function and tag the instance for autodoc.

malstruct.helpers.chunk(seq, size)

Returns an iterator that splits seq into chunks of length size, yielding only full chunks (a trailing partial chunk is dropped).

>>> list(chunk('hello', 2))
[('h', 'e'), ('l', 'l')]
>>> list(chunk('hello!', 2))
[('h', 'e'), ('l', 'l'), ('o', '!')]
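
One common way to get this behaviour is to zip several references to the same iterator. The sketch below is an assumption about how such a helper could be written, not a copy of the library's code; `chunk_sketch` is a hypothetical name.

```python
def chunk_sketch(seq, size):
    """Yield tuples of exactly `size` items; a short final group is dropped."""
    iterator = iter(seq)
    # zip pulls `size` items from the same iterator for every tuple and
    # stops as soon as it cannot fill a complete tuple.
    return zip(*[iterator] * size)


print(list(chunk_sketch("hello", 2)))   # [('h', 'e'), ('l', 'l')]
print(list(chunk_sketch("hello!", 2)))  # [('h', 'e'), ('l', 'l'), ('o', '!')]
```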
malstruct.helpers.stream_read(stream: BytesIO, length, path=None)
malstruct.helpers.stream_read_entire(stream: BytesIO, path=None)
malstruct.helpers.stream_write(stream: BytesIO, data: bytes, length: int = None, path: str = None)
malstruct.helpers.stream_seek(stream: BytesIO, offset: int, whence: int = 0, path: str = None)
malstruct.helpers.stream_tell(stream: BytesIO, path: str = None)
malstruct.helpers.stream_size(stream: BytesIO)
malstruct.helpers.stream_iseof(stream: BytesIO)
class malstruct.helpers.BytesIOWithOffsets(contents: bytes, parent_stream: BytesIO, offset: int)

Bases: BytesIO

static from_reading(stream: BytesIO, length: int, path: str) BytesIOWithOffsets | BytesIO

Creates a new BytesIOWithOffsets instance from an existing stream

Parameters:
  • stream (io.BytesIO) – Existing stream

  • length (int) – Number of bytes to read

  • path (str) – Path for error reporting

Returns:

BytesIOWithOffsets instance

tell() int

Obtain the current offset from within the parent stream

Returns:

Current offset

Return type:

int

seek(offset: int, whence: int = 0) int

Move the current position to the specified offset

Parameters:
  • offset (int) – Offset to seek to

  • whence (int) – Reference point for offset (default is SEEK_SET)

Returns:

Updated offset

Return type:

int
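
The idea of a sub-stream that reports positions in the parent stream's coordinates can be sketched as follows. This is a simplified model, not the library class: `OffsetBytesIO` is a hypothetical name, it does not keep a reference to the parent stream, and it assumes that absolute seek offsets are given in parent coordinates.

```python
import io


class OffsetBytesIO(io.BytesIO):
    """Hypothetical sketch: a sub-stream that reports parent-stream offsets."""

    def __init__(self, contents, parent_offset):
        super().__init__(contents)
        self.parent_offset = parent_offset

    def tell(self):
        # Local position shifted by where the contents started in the parent.
        return super().tell() + self.parent_offset

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            # Assumption: absolute offsets are expressed in parent coordinates.
            super().seek(offset - self.parent_offset)
        else:
            super().seek(offset, whence)
        return self.tell()


parent = io.BytesIO(b"0123456789")
parent.seek(4)
start = parent.tell()
sub = OffsetBytesIO(parent.read(3), start)  # holds b"456"
print(sub.tell())   # 4
print(sub.read(1))  # b'4'
print(sub.seek(6))  # 6
print(sub.read())   # b'6'
```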

malstruct.helpers.find_constructs(struct, data)

Generator that yields the results of successful parsings of the given construct. Note: the construct must attempt to read something, i.e. don’t use a Peek as your first subconstruct.

Also, it’s best if you have some type of validation (Const, OneOf, NoneOf, Check, etc) within your struct. Otherwise, it makes more sense to use a GreedyRange (the ‘[:]’ notation) instead of this function.

Example:

>>> struct = Struct(
...     Const(b'MZ'),
...     'int' / Int16ul,
...     'string' / CString())
>>> list(find_constructs(struct, b'\x01\x02\x03MZ\x0A\x00hello\x00\x03\x04MZ\x0B\x00world\x00\x00'))
[(3, Container(int=10, string=u'hello')), (15, Container(int=11, string=u'world'))]
>>> list(find_constructs(struct, b'nope'))
[]
Parameters:
  • struct – construct to apply (instance of construct.Construct)

  • data – byte string of data to search.

Yield:

tuple containing (offset within data, resulting Container)
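
The scanning idea can be illustrated with the standard library for the specific record layout used in the example (magic `MZ`, a little-endian 16-bit integer, then a NUL-terminated string). This is a sketch under assumptions, not the library's algorithm: `find_records` is a hypothetical function, and resuming the search after a matched record is a design choice made here.

```python
import struct


def find_records(data, magic=b"MZ"):
    """Yield (offset, (number, text)) for each magic + uint16le + C-string record."""
    offset = 0
    while True:
        offset = data.find(magic, offset)
        if offset == -1:
            return
        body = offset + len(magic)
        end = data.find(b"\x00", body + 2)  # terminator of the string part
        if len(data) >= body + 2 and end != -1:
            (number,) = struct.unpack_from("<H", data, body)
            yield offset, (number, data[body + 2:end].decode("latin-1"))
            offset = end + 1  # assumption: resume after the matched record
        else:
            offset += 1  # not a complete record here; try the next byte


blob = b"\x01\x02\x03MZ\x0A\x00hello\x00\x03\x04MZ\x0B\x00world\x00\x00"
print(list(find_records(blob)))     # [(3, (10, 'hello')), (15, (11, 'world'))]
print(list(find_records(b"nope")))  # []
```

The magic value plays the role of the validation mentioned above: without it, almost every offset would parse successfully.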

malstruct.html module

This module is used to convert malstructs to an HTML document.

To use, run the html_hex with a malstruct and data:

print(html_hex(malstruct, data))

malstruct.html.brightness(hexcode)

Calculates brightness for the given HTML hex code of the format #xxxxxx

malstruct.html.grouper(n, iterable, fillvalue=None)

Groups iterable into n length chunks. If the last chunk doesn’t have n items, the remaining is filled with fillvalue.

>>> list(grouper(3, 'ABCDEFG', fillvalue='x'))
[('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]
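
This matches the well-known grouper recipe built on `itertools.zip_longest`. The sketch below shows that recipe with the argument order used here; it is an assumption that the library follows the same approach, and `grouper_sketch` is a hypothetical name.

```python
from itertools import zip_longest


def grouper_sketch(n, iterable, fillvalue=None):
    """Group items into n-tuples, padding the last tuple with fillvalue."""
    iterators = [iter(iterable)] * n  # n references to one shared iterator
    return zip_longest(*iterators, fillvalue=fillvalue)


print(list(grouper_sketch(3, "ABCDEFG", fillvalue="x")))
# [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]
```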
class malstruct.html.Member(member_map, subcon)

Bases: RawCopy

This is a subconstruct that collects offset, data, and size information into the given member table, but then returns the original parsed value, as if nothing happened. (This allows the callbacks to work as they originally did.)

class malstruct.html.MemberMap(subcon)

Bases: Adapter

Wraps a Subconstruct to produce a member map of all the parsed objects and their offsets:

{offset: [list of parsed Containers in order of descending depth]}

Needs to implement _decode() and _encode().

Parameters:

subcon – the malstruct to wrap

malstruct.html.html_hex(struct, data, width=16, depth=None, member_callback=None)

Uses malstruct to parse data and creates a user-friendly html hex dump.

Parameters:
  • struct – A malstruct object to parse.

  • data – Data to dump.

  • width – The number of bytes displayed for each line.

  • depth – The number of levels deep to display in table (defaults to all levels)

  • member_callback

    Optional callback function that can be used to tweak the member name or value in the variable table. The function must accept two parameters (name, value) and return a tuple of (name, value), or None to make no change, e.g.

        def edit_member(name, value):
            if name == 'data':
                # hex-encode raw bytes for display
                return name, value.hex()

Return type:

str

Returns:

returns unicode string of html data.

Raises:

ConstructError – If given struct fails to parse given data.

malstruct.integers module

Integers and floats

class malstruct.integers.FormatField(endianity, format)

Bases: Construct

Field that uses struct module to pack and unpack CPU-sized integers and floats and booleans. This is used to implement most Int* Float* fields, but for example cannot pack 24-bit integers, which is left to BytesInteger class. For booleans I also recommend using Flag class instead.

See struct module documentation for instructions on crafting format strings.

Parses into an integer or float or boolean. Builds from an integer or float or boolean into specified byte count and endianness. Size is determined by struct module according to specified format string.

Parameters:
  • endianity – string, character like: < > =

  • format – string, character like: B H L Q b h l q e f d ?

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • FormatFieldError – wrong format string, or struct.(un)pack complained about the value

Example:

>>> d = FormatField(">", "H") or Int16ub
>>> d.parse(b"\x01\x00")
256
>>> d.build(256)
b"\x01\x00"
>>> d.sizeof()
2
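
Since this field delegates to the standard `struct` module, the same format strings can be tried directly there. The snippet below uses only `struct`; combining the endianness character with the format character into one string is how `struct` itself expects them.

```python
import struct

big_endian_u16 = ">H"  # endianness ">" followed by format "H"
parsed = struct.unpack(big_endian_u16, b"\x01\x00")[0]
built = struct.pack(big_endian_u16, 256)
size = struct.calcsize(big_endian_u16)
little = struct.pack("<H", 256)  # same value, little endian

print(parsed)  # 256
print(built)   # b'\x01\x00'
print(size)    # 2
print(little)  # b'\x00\x01'

# Out-of-range values make struct raise struct.error.
try:
    struct.pack(big_endian_u16, 70000)
    out_of_range_rejected = False
except struct.error:
    out_of_range_rejected = True
print(out_of_range_rejected)  # True
```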
class malstruct.integers.BytesInteger(length, signed=False, swapped=False)

Bases: Construct

Field that packs integers of arbitrary size. Int24* fields use this class.

Parses into an integer. Builds from an integer into specified byte count and endianness. Size is specified in ctor.

Analog to BitsInteger which operates on bits. In fact:

BytesInteger(n) <--> Bitwise(BitsInteger(8*n))
BitsInteger(8*n) <--> Bytewise(BytesInteger(n))

Byte ordering refers to bytes (chunks of 8 bits) so, for example:

BytesInteger(n, swapped=True) <--> Bitwise(BitsInteger(8*n, swapped=True))
Parameters:
  • length – integer or context lambda, number of bytes in the field

  • signed – bool, whether the value is signed (two’s complement), default is False (unsigned)

  • swapped – bool or context lambda, whether to swap byte order (little endian), default is False (big endian)

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • IntegerError – length is negative or zero

  • IntegerError – value is not an integer

  • IntegerError – number does not fit given width and signed parameters

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = BytesInteger(4) or Int32ub
>>> d.parse(b"abcd")
1633837924
>>> d.build(1)
b'\x00\x00\x00\x01'
>>> d.sizeof()
4
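
The arithmetic behind arbitrary-width integers can be checked with Python's built-in `int.from_bytes` and `int.to_bytes`. This is only an illustration of the byte layout, not the library's code; mapping `swapped=True` to the `"little"` byte order follows the parameter description above.

```python
# Big-endian (swapped=False) interpretation of four bytes.
parsed = int.from_bytes(b"abcd", byteorder="big", signed=False)
built = (1).to_bytes(4, byteorder="big")

# Little-endian, which corresponds to swapped=True.
built_swapped = (1).to_bytes(4, byteorder="little")

# A signed 24-bit value, a width the struct module has no format for.
int24 = int.from_bytes(b"\xff\xff\xfe", byteorder="big", signed=True)

print(parsed)         # 1633837924
print(built)          # b'\x00\x00\x00\x01'
print(built_swapped)  # b'\x01\x00\x00\x00'
print(int24)          # -2
```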
class malstruct.integers.BitsInteger(length, signed=False, swapped=False)

Bases: Construct

Field that packs arbitrarily large (or small) integers. Some fields (Bit Nibble Octet) use this class. Must be enclosed in Bitwise context.

Parses into an integer. Builds from an integer into specified bit count and endianness. Size (in bits) is specified in ctor.

Analog to BytesInteger which operates on bytes. In fact:

BytesInteger(n) <--> Bitwise(BitsInteger(8*n))
BitsInteger(8*n) <--> Bytewise(BytesInteger(n))

Note that little-endianness is only defined for multiples of 8 bits.

Byte ordering (i.e. swapped parameter) refers to bytes (chunks of 8 bits) so, for example:

BytesInteger(n, swapped=True) <--> Bitwise(BitsInteger(8*n, swapped=True))

Swapped argument was recently fixed. To obtain previous (faulty) behavior, you can use ByteSwapped, BitsSwapped and Bitwise in whatever particular order (see examples).

Parameters:
  • length – integer or context lambda, number of bits in the field

  • signed – bool, whether the value is signed (two’s complement), default is False (unsigned)

  • swapped – bool or context lambda, whether to swap byte order (little endian), default is False (big endian)

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • IntegerError – length is negative or zero

  • IntegerError – value is not an integer

  • IntegerError – number does not fit given width and signed parameters

  • IntegerError – little-endianness selected but length is not multiple of 8 bits

Can propagate any exception from the lambda, possibly non-ConstructError.

Examples:

>>> d = Bitwise(BitsInteger(8)) or Bitwise(Octet)
>>> d.parse(b"\x10")
16
>>> d.build(255)
b'\xff'
>>> d.sizeof()
1

Obtaining other byte or bit orderings:

>>> d = BitsInteger(2)
>>> d.parse(b'\x01\x00') # Bit-Level Big-Endian
2
>>> d = ByteSwapped(BitsInteger(2))
>>> d.parse(b'\x01\x00') # Bit-Level Little-Endian
1
>>> d = BitsInteger(16) # Byte-Level Big-Endian, Bit-Level Big-Endian
>>> d.build(5 + 19*256)
b'\x00\x00\x00\x01\x00\x00\x01\x01\x00\x00\x00\x00\x00\x01\x00\x01'
>>> d = BitsInteger(16, swapped=True) # Byte-Level Little-Endian, Bit-Level Big-Endian
>>> d.build(5 + 19*256)
b'\x00\x00\x00\x00\x00\x01\x00\x01\x00\x00\x00\x01\x00\x00\x01\x01'
>>> d = ByteSwapped(BitsInteger(16)) # Byte-Level Little-Endian, Bit-Level Little-Endian
>>> d.build(5 + 19*256)
b'\x01\x00\x01\x00\x00\x00\x00\x00\x01\x01\x00\x00\x01\x00\x00\x00'
>>> d = ByteSwapped(BitsInteger(16, swapped=True)) # Byte-Level Big-Endian, Bit-Level Little-Endian
>>> d.build(5 + 19*256)
b'\x01\x01\x00\x00\x01\x00\x00\x00\x01\x00\x01\x00\x00\x00\x00\x00'
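
The four orderings above can be reproduced in plain Python. This is a minimal sketch that does not use malstruct: `int_to_bits` is a hypothetical helper, and treating a full reversal of the bit buffer as the equivalent of ByteSwapped inside a bitwise context is an assumption inferred from the outputs shown above.

```python
def int_to_bits(value: int, length: int, swapped: bool = False) -> bytes:
    """Return `value` as `length` bytes, each byte holding a single bit (0 or 1)."""
    if swapped and length % 8:
        raise ValueError("byte swapping needs a length that is a multiple of 8")
    # Most significant bit first (bit-level big-endian).
    bits = [(value >> shift) & 1 for shift in range(length - 1, -1, -1)]
    if swapped:
        # Reverse the order of the 8-bit chunks, keeping the bits inside each chunk.
        chunks = [bits[i:i + 8] for i in range(0, length, 8)]
        bits = [bit for chunk in reversed(chunks) for bit in chunk]
    return bytes(bits)


value = 5 + 19 * 256
big = int_to_bits(value, 16)                   # byte-level big-endian
little = int_to_bits(value, 16, swapped=True)  # byte-level little-endian
reversed_bits = big[::-1]                      # whole buffer reversed (assumed ByteSwapped effect)
```

Swapping whole 8-bit chunks is also why a swapped length has to be a multiple of 8.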
malstruct.integers.Bit

A 1-bit integer, must be enclosed in a Bitwise (eg. BitStruct)

malstruct.integers.Nibble

A 4-bit integer, must be enclosed in a Bitwise (eg. BitStruct)

malstruct.integers.Octet

An 8-bit integer, must be enclosed in a Bitwise (eg. BitStruct)

malstruct.integers.Int8ub

Unsigned, big endian 8-bit integer

malstruct.integers.Int16ub

Unsigned, big endian 16-bit integer

malstruct.integers.Int32ub

Unsigned, big endian 32-bit integer

malstruct.integers.Int64ub

Unsigned, big endian 64-bit integer

malstruct.integers.Int8sb

Signed, big endian 8-bit integer

malstruct.integers.Int16sb

Signed, big endian 16-bit integer

malstruct.integers.Int32sb

Signed, big endian 32-bit integer

malstruct.integers.Int64sb

Signed, big endian 64-bit integer

malstruct.integers.Int8ul

Unsigned, little endian 8-bit integer

malstruct.integers.Int16ul

Unsigned, little endian 16-bit integer

malstruct.integers.Int32ul

Unsigned, little endian 32-bit integer

malstruct.integers.Int64ul

Unsigned, little endian 64-bit integer

malstruct.integers.Int8sl

Signed, little endian 8-bit integer

malstruct.integers.Int16sl

Signed, little endian 16-bit integer

malstruct.integers.Int32sl

Signed, little endian 32-bit integer

malstruct.integers.Int64sl

Signed, little endian 64-bit integer

malstruct.integers.Int8un

Unsigned, native endianity 8-bit integer

malstruct.integers.Int16un

Unsigned, native endianity 16-bit integer

malstruct.integers.Int32un

Unsigned, native endianity 32-bit integer

malstruct.integers.Int64un

Unsigned, native endianity 64-bit integer

malstruct.integers.Int8sn

Signed, native endianity 8-bit integer

malstruct.integers.Int16sn

Signed, native endianity 16-bit integer

malstruct.integers.Int32sn

Signed, native endianity 32-bit integer

malstruct.integers.Int64sn

Signed, native endianity 64-bit integer

malstruct.integers.Byte

Unsigned, big endian 8-bit integer

malstruct.integers.Short

Unsigned, big endian 16-bit integer

malstruct.integers.Int

Unsigned, big endian 32-bit integer

malstruct.integers.Long

Unsigned, big endian 64-bit integer

malstruct.integers.Float16b

Big endian, 16-bit IEEE 754 floating point number

malstruct.integers.Float16l

Little endian, 16-bit IEEE 754 floating point number

malstruct.integers.Float16n

Native endianity, 16-bit IEEE 754 floating point number

malstruct.integers.Float32b

Big endian, 32-bit IEEE floating point number

malstruct.integers.Float32l

Little endian, 32-bit IEEE floating point number

malstruct.integers.Float32n

Native endianity, 32-bit IEEE floating point number

malstruct.integers.Float64b

Big endian, 64-bit IEEE floating point number

malstruct.integers.Float64l

Little endian, 64-bit IEEE floating point number

malstruct.integers.Float64n

Native endianity, 64-bit IEEE floating point number

malstruct.integers.Half

Big endian, 16-bit IEEE 754 floating point number

malstruct.integers.Single

Big endian, 32-bit IEEE floating point number

malstruct.integers.Double

Big endian, 64-bit IEEE floating point number

malstruct.integers.Int24ub

A 3-byte big-endian unsigned integer, as used in ancient file formats.

malstruct.integers.Int24ul

A 3-byte little-endian unsigned integer, as used in ancient file formats.

malstruct.integers.Int24un

A 3-byte native-endian unsigned integer, as used in ancient file formats.

malstruct.integers.Int24sb

A 3-byte big-endian signed integer, as used in ancient file formats.

malstruct.integers.Int24sl

A 3-byte little-endian signed integer, as used in ancient file formats.

malstruct.integers.Int24sn

A 3-byte native-endian signed integer, as used in ancient file formats.

malstruct.integers.VarInt

VarInt encoded unsigned integer. Each 7 bits of the number are encoded in one byte of the stream, where leftmost bit (MSB) is unset when byte is terminal. Scheme is defined at Google site related to Protocol Buffers.

Can only encode non-negative numbers.

Parses into an integer. Builds from an integer. Size is undefined.

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • IntegerError – given a negative value, or not an integer

Example:

>>> VarInt.build(1)
b'\x01'
>>> VarInt.build(2**100)
b'\x80\x80\x80\x80\x80\x80\x80\x80\x80\x80\x80\x80\x80\x80\x04'
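
The encoding described above (7 payload bits per byte, least significant group first, MSB cleared on the terminal byte) can be sketched in plain Python. The helper names `varint_encode` and `varint_decode` are hypothetical and are not part of malstruct.

```python
def varint_encode(value: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, least significant group first."""
    if value < 0:
        raise ValueError("VarInt cannot encode negative numbers")
    out = bytearray()
    while value > 0x7F:
        out.append(0x80 | (value & 0x7F))  # MSB set: more bytes follow
        value >>= 7
    out.append(value)                      # MSB unset: terminal byte
    return bytes(out)


def varint_decode(data: bytes) -> int:
    """Decode one VarInt from the start of `data`."""
    result = 0
    for index, byte in enumerate(data):
        result |= (byte & 0x7F) << (7 * index)
        if not byte & 0x80:
            return result
    raise ValueError("data ended before the terminal byte")
```

Because the length depends on the value, a decoder has to read until the terminal byte, which is why the size is undefined.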
malstruct.integers.VarIntl

VarInt encoded unsigned integer. Each 7 bits of the number are encoded in one byte of the stream, where leftmost bit (MSB) is unset when byte is terminal. Scheme is defined at Google site related to Protocol Buffers.

Can only encode non-negative numbers.

Parses into an integer. Builds from an integer. Size is undefined.

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • IntegerError – given a negative value, or not an integer

Example:

>>> VarInt.build(1)
b'\x01'
>>> VarInt.build(2**100)
b'\x80\x80\x80\x80\x80\x80\x80\x80\x80\x80\x80\x80\x80\x80\x04'
malstruct.integers.VarIntb

VarInt encoded unsigned integer (big-endian). Each 7 bits of the number are encoded in one byte of the stream.

Can only encode non-negative numbers.

Parses into an integer. Builds from an integer. Size is undefined.

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • IntegerError – given a negative value, or not an integer

Example:

>>> VarIntb.build(4609)
b"\x81\xa4\x00"
>>> VarIntb.build(2**100)
b'\x84\x80\x80\x80\x80\x80\x80\x80\x80\x80\x80\x80\x80\x80\x00'
malstruct.integers.ZigZag

ZigZag encoded signed integer. This is a variant of VarInt encoding that also can encode negative numbers. Scheme is defined at Google site related to Protocol Buffers.

Can also encode negative numbers.

Parses into an integer. Builds from an integer. Size is undefined.

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • IntegerError – given value is not an integer

Example:

>>> ZigZag.build(-3)
b'\x05'
>>> ZigZag.build(3)
b'\x06'
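
The ZigZag step itself is a mapping from signed to unsigned integers, applied before the VarInt encoding. A minimal sketch in plain Python, with hypothetical helper names:

```python
def zigzag_map(value: int) -> int:
    """Map a signed integer to an unsigned one: 0, -1, 1, -2, 2 become 0, 1, 2, 3, 4."""
    return value * 2 if value >= 0 else -value * 2 - 1


def zigzag_unmap(encoded: int) -> int:
    """Inverse of zigzag_map."""
    return encoded // 2 if encoded % 2 == 0 else -(encoded + 1) // 2
```

Mapped values below 128 fit in a single VarInt byte, which is why both examples above produce one byte.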

malstruct.lazy module

lazy equivalents

class malstruct.lazy.Lazy(subcon)

Bases: Subconstruct

Lazyfies a field.

This wrapper allows you to do lazy parsing of individual fields inside a normal Struct (without using LazyStruct which may not work in every scenario). It is also used by KaitaiStruct compiler to emit instances because those are not processed greedily, and they may refer to other not yet parsed fields. Those are 2 entirely different applications but semantics are the same.

Parsing saves the current stream offset and returns a lambda. If and when that lambda is evaluated, it seeks the stream to the saved offset, parses the subcon, and seeks the stream back to its previous position. Building evaluates that lambda into an object (if needed), then defers to subcon. Size also defers to subcon.

Parameters:

subcon – Construct instance

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • StreamError – stream is not seekable and tellable

Example:

>>> d = Lazy(Byte)
>>> x = d.parse(b'\x00')
>>> x
<function malstruct.core.Lazy._parse.<locals>.execute>
>>> x()
0
>>> d.build(0)
b'\x00'
>>> d.build(x)
b'\x00'
>>> d.sizeof()
1
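
The save-offset-and-defer behaviour can be sketched with the standard library alone. `lazy_field` is a hypothetical helper that uses `struct` format strings in place of a subcon. Skipping over the field at parse time is an assumption here, since the description above only mentions saving the offset.

```python
import io
import struct


def lazy_field(stream, fmt):
    """Remember where a fixed-size field starts and return a callable that parses it later."""
    offset = stream.tell()
    size = struct.calcsize(fmt)
    stream.seek(size, io.SEEK_CUR)  # assumption: the field is skipped, not read

    def execute():
        fallback = stream.tell()     # position at the time of evaluation
        stream.seek(offset)          # go back to the saved offset
        (value,) = struct.unpack(fmt, stream.read(size))
        stream.seek(fallback)        # restore the position
        return value

    return execute


stream = io.BytesIO(b"\x2a\x07")
first = lazy_field(stream, "B")  # nothing parsed yet
second = stream.read(1)          # parsing continues after the skipped field
```

Calling `first()` later returns 42 and leaves the stream position untouched, which is why the stream must be seekable and tellable.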
class malstruct.lazy.LazyContainer(struct, stream, offsets, values, context, path)

Bases: dict

Used internally.

keys() → a set-like object providing a view on D's keys
values() → an object providing a view on D's values
items() → a set-like object providing a view on D's items
class malstruct.lazy.LazyStruct(*subcons, **subconskw)

Bases: Construct

Equivalent to Struct, but when this class is parsed, most fields are not parsed (they are skipped if their size can be measured by _actualsize or _sizeof method). See its docstring for details.

Fields are parsed depending on some factors:

  • Some fields like Int* Float* Bytes(5) Array(5,Byte) Pointer are fixed-size and are therefore skipped. Stream is not read.

  • Some fields like Bytes(this.field) are variable-size but their size is known during parsing when there is a corresponding context entry. Those fields are also skipped. Stream is not read.

  • Some fields like Prefixed PrefixedArray PascalString are variable-size but their size can be computed by partially reading the stream. Only first few bytes are read (the lengthfield).

  • Other fields like VarInt need to be parsed. Stream position that is left after the field was parsed is used.

  • Some fields may not work properly, due to the fact that this class attempts to skip fields, and parses them only out of necessity. Miscellaneous fields often have size defined as 0, and fixed sized fields are skippable.

Note there are restrictions:

  • If a field like Bytes(this.field) references another field in the same struct, you need to access the referenced field first (to trigger its parsing) and then you can access the Bytes field. Otherwise it would fail due to missing context entry.

  • If a field references another field within inner (nested) or outer (super) struct, things may break. Context is nested, but this class was not rigorously tested in that manner.

Building and sizeof are greedy, like in Struct.

Parameters:
  • *subcons – Construct instances, list of members, some can be anonymous

  • **subconskw – Construct instances, list of members (requires Python 3.6)
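
The idea of skipping fixed-size fields and parsing them only on access can be illustrated with a small stand-in class. This sketch is not how LazyStruct is implemented. `LazyRecord` and its `(name, format)` field list are hypothetical, and only the fixed-size case from the list above is covered.

```python
import struct


class LazyRecord:
    """Computes field offsets up front, unpacks a field only when it is accessed."""

    def __init__(self, data, fields):
        self._data = data
        self._formats = dict(fields)
        self._offsets = {}
        self._cache = {}
        offset = 0
        for name, fmt in fields:
            self._offsets[name] = offset      # the field is skipped, not read
            offset += struct.calcsize(fmt)

    def __getitem__(self, name):
        if name not in self._cache:
            (self._cache[name],) = struct.unpack_from(
                self._formats[name], self._data, self._offsets[name]
            )
        return self._cache[name]


record = LazyRecord(
    b"\x01\x02\x00\x03\x00\x00\x00",
    [("a", "<B"), ("b", "<H"), ("c", "<I")],
)
c_value = record["c"]  # only "c" has been unpacked at this point
```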

class malstruct.lazy.LazyListContainer(subcon, stream, count, offsets, values, context, path)

Bases: list

Used internally.

class malstruct.lazy.LazyArray(count, subcon)

Bases: Subconstruct

Equivalent to Array, but the subcon is not parsed when possible (it gets skipped if the size can be measured by _actualsize or _sizeof method). See its docstring for details.

Fields are parsed depending on some factors:

  • Some fields like Int* Float* Bytes(5) Array(5,Byte) Pointer are fixed-size and are therefore skipped. Stream is not read.

  • Some fields like Bytes(this.field) are variable-size but their size is known during parsing when there is a corresponding context entry. Those fields are also skipped. Stream is not read.

  • Some fields like Prefixed PrefixedArray PascalString are variable-size but their size can be computed by partially reading the stream. Only first few bytes are read (the lengthfield).

  • Other fields like VarInt need to be parsed. Stream position that is left after the field was parsed is used.

  • Some fields may not work properly, due to the fact that this class attempts to skip fields, and parses them only out of necessity. Miscellaneous fields often have size defined as 0, and fixed sized fields are skippable.

Note there are restrictions:

  • If a field references another field within inner (nested) or outer (super) struct, things may break. Context is nested, but this class was not rigorously tested in that manner.

Building and sizeof are greedy, like in Array.

Parameters:
  • count – integer or context lambda, strict amount of elements

  • subcon – Construct instance, subcon to process individual elements

class malstruct.lazy.LazyBound(subconfunc)

Bases: Construct

Field that binds to the subcon only at runtime (during parsing and building, not ctor). Useful for recursive data structures, like linked-lists and trees, where a construct needs to refer to itself (while it does not exist yet in the namespace).

Note that it is possible to obtain the same effect without using this class, using a loop. However, there are use cases where that is not possible (if the remaining nodes cannot be sized up, and there is data following the recursive structure). There is also a significant difference, namely that LazyBound actually does greedy parsing while the loop does lazy parsing. See examples.

To break recursion, use If field. See examples.

Parameters:

subconfunc – parameter-less lambda returning Construct instance, can also return itself

Example:

d = Struct(
    "value" / Byte,
    "next" / If(this.value > 0, LazyBound(lambda: d)),
)
>>> print(d.parse(b"\x05\x09\x00"))
Container:
    value = 5
    next = Container:
        value = 9
        next = Container:
            value = 0
            next = None
d = Struct(
    "value" / Byte,
    "next" / GreedyBytes,
)
data = b"\x05\x09\x00"
while data:
    x = d.parse(data)
    data = x.next
    print(x)
# print outputs
Container:
    value = 5
    next = \t\x00 (total 2)
# print outputs
Container:
    value = 9
    next = \x00 (total 1)
# print outputs
Container:
    value = 0
    next =  (total 0)

malstruct.mappings module

Mappings constructs and Adapters

malstruct.mappings.Flag

One byte (or one bit) field that maps to True or False. Other non-zero bytes are also considered True. Size is defined as 1.

Raises:

StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

Example:

>>> Flag.parse(b"\x01")
True
>>> Flag.build(True)
b'\x01'
class malstruct.mappings.Boolean(subcon)

Bases: Adapter

Adapter used to convert a parsed value into a boolean. NOTE: While similar to construct.Flag, this adapter accepts any value other than 0 or '' as true, and it works with more than just construct.Byte.

WARNING: Due to the lossy nature, this can’t be used to build.

Example:

>>> Boolean(Int32ul).parse(b'\x01\x02\x03\x04')
True
>>> Boolean(Int32ul).parse(b'\x00\x00\x00\x00')
False
>>> Boolean(CString()).parse(b'hello\x00')
True
>>> Boolean(CString()).parse(b'\x00')
False
class malstruct.mappings.EnumInteger

Bases: int

Used internally.

class malstruct.mappings.EnumIntegerString

Bases: str

Used internally.

static new(intvalue, stringvalue)
class malstruct.mappings.Enum(subcon, *merge, **mapping)

Bases: Adapter

Translates unicode label names to subcon values, and vice versa.

Parses the integer subcon, then uses that value to look up the mapping dictionary. Returns an integer-convertible string (if a mapping is found) or an integer (otherwise). Building is the reverse process. Can build from an integer flag or string label. Size is same as subcon, unless it raises SizeofError.

There is no default parameter, because if no mapping is found, it parses into an integer without error.

This class supports enum module. See examples.

This class supports exposing member labels as attributes, as integer-convertible strings. See examples.

Parameters:
  • subcon – Construct instance, subcon to map to/from

  • *merge – optional, list of enum.IntEnum and enum.IntFlag instances, to merge labels and values from

  • **mapping – dict, mapping string names to values

Raises:

MappingError – building from string but no mapping found

Example:

>>> d = Enum(Byte, one=1, two=2, four=4, eight=8)
>>> d.parse(b"\x01")
'one'
>>> int(d.parse(b"\x01"))
1
>>> d.parse(b"\xff")
255
>>> int(d.parse(b"\xff"))
255

>>> d.build(d.one or "one" or 1)
b'\x01'
>>> d.one
'one'

import enum
class E(enum.IntEnum or enum.IntFlag):
    one = 1
    two = 2

Enum(Byte, E) <--> Enum(Byte, one=1, two=2)
FlagsEnum(Byte, E) <--> FlagsEnum(Byte, one=1, two=2)
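
The "integer-convertible string" behaviour can be sketched in plain Python. `LabelledInt`, `decode` and `encode` are hypothetical names used only for illustration. The sketch covers the lookup logic, not the byte-level parsing done by the subcon.

```python
class LabelledInt(str):
    """A string label that also converts to its integer value via int()."""

    def __new__(cls, label, value):
        obj = super().__new__(cls, label)
        obj.value = value
        return obj

    def __int__(self):
        return self.value


MAPPING = {"one": 1, "two": 2, "four": 4, "eight": 8}
REVERSE = {value: label for label, value in MAPPING.items()}


def decode(number):
    """Return a labelled string if a mapping exists, otherwise the plain integer."""
    label = REVERSE.get(number)
    return LabelledInt(label, number) if label is not None else number


def encode(obj):
    """Accept a labelled string, a plain label, or an integer."""
    if isinstance(obj, LabelledInt):
        return int(obj)
    if isinstance(obj, str):
        return MAPPING[obj]  # raises KeyError when no mapping is found
    return obj
```

Returning the plain integer for unmapped values is what makes a default parameter unnecessary.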
class malstruct.mappings.BitwisableString

Bases: str

Used internally.

class malstruct.mappings.FlagsEnum(subcon, *merge, **flags)

Bases: Adapter

Translates unicode label names to subcon integer (sub)values, and vice versa.

Parses integer subcon, then creates a Container, where flags define each key. Builds from a container by bitwise-oring of each flag if it matches a set key. Can build from an integer flag or string label directly, as well as | concatenations thereof (see examples). Size is same as subcon, unless it raises SizeofError.

This class supports enum module. See examples.

This class supports exposing member labels as attributes, as bitwisable strings. See examples.

Parameters:
  • subcon – Construct instance, must operate on integers

  • *merge – optional, list of enum.IntEnum and enum.IntFlag instances, to merge labels and values from

  • **flags – dict, mapping string names to integer values

Raises:
  • MappingError – building from object not like: integer string dict

  • MappingError – building from string but no mapping found

Can raise arbitrary exceptions when computing | and & and value is non-integer.

Example:

>>> d = FlagsEnum(Byte, one=1, two=2, four=4, eight=8)
>>> d.parse(b"\x03")
Container(one=True, two=True, four=False, eight=False)
>>> d.build(dict(one=True,two=True))
b'\x03'

>>> d.build(d.one|d.two or "one|two" or 1|2)
b'\x03'

import enum
class E(enum.IntEnum or enum.IntFlag):
    one = 1
    two = 2

Enum(Byte, E) <--> Enum(Byte, one=1, two=2)
FlagsEnum(Byte, E) <--> FlagsEnum(Byte, one=1, two=2)
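
The flag translation can be sketched as bitwise tests and bitwise OR over a name-to-bit table. The helper names `decode_flags` and `encode_flags` are hypothetical, and the sketch works on integers rather than on a stream.

```python
FLAGS = {"one": 1, "two": 2, "four": 4, "eight": 8}


def decode_flags(number):
    """Map every known flag name to whether its bit is set."""
    return {name: bool(number & bit) for name, bit in FLAGS.items()}


def encode_flags(obj):
    """Build an integer from an int, a 'a|b' string, or a dict of name -> bool."""
    if isinstance(obj, int):
        return obj
    if isinstance(obj, str):
        names = obj.split("|")
    elif isinstance(obj, dict):
        names = [name for name, is_set in obj.items() if is_set]
    else:
        raise TypeError("expected an integer, string or dict")
    number = 0
    for name in names:
        number |= FLAGS[name]  # raises KeyError when no mapping is found
    return number
```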
class malstruct.mappings.Mapping(subcon, dec_mapping, enc_mapping=None)

Bases: Adapter

Adapter that maps objects to other objects. Translates objects after parsing and before building. Can, for example, be used to translate between enum objects and strings, but the Enum class supports the enum module already and is recommended. Mappings have been reversed compared to the original Construct.

Parameters:
  • subcon – Construct instance

  • dec_mapping – dict, mapping used for decoding (parsing)

  • enc_mapping – optional dict, mapping used for encoding (building); if omitted, the reversed decoding mapping is used

Example:

>>> spec = Mapping(Byte, {0: u'a', 1: u'b', 2: u'b'})
>>> spec.parse(b'\x02')
u'b'

# Reverse mapping is sorted so 1 will be used instead of 2.
>>> spec.build(u'b')
'\x01'
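
One way to derive a reverse mapping in which the lowest key wins for duplicate values is shown below. This is a sketch consistent with the note above, not necessarily how the library builds its encoding mapping.

```python
dec_mapping = {0: "a", 1: "b", 2: "b"}

# Walk the keys from highest to lowest so that, for duplicate values,
# the lowest key is written last and therefore wins.
enc_mapping = {}
for key in sorted(dec_mapping, reverse=True):
    enc_mapping[dec_mapping[key]] = key
```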

malstruct.miscellaneous module

Miscellaneous constructs

class malstruct.miscellaneous.Const(value, subcon=None)

Bases: Subconstruct

Field enforcing a constant. It is used for file signatures, to validate that the given pattern exists. Data in the stream must strictly match the specified value.

Note that a variable sized subcon may still provide positive verification. Const does not consume a precomputed amount of bytes, but depends on the subcon to read the appropriate amount (eg. VarInt is acceptable). Whatever subcon parses into, gets compared against the specified value.

Parses using subcon and return its value (after checking). Builds using subcon from nothing (or given object, if not None). Size is the same as subcon, unless it raises SizeofError.

Parameters:
  • value – expected value, usually a bytes literal

  • subcon – optional, Construct instance, subcon used to build value from, assumed to be Bytes if value parameter was a bytes literal

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • ConstError – parsed data does not match specified value, or building from wrong value

  • StringError – building from non-bytes value, perhaps unicode

Example:

>>> d = Const(b"IHDR")
>>> d.build(None)
b'IHDR'
>>> d.parse(b"JPEG")
malstruct.core.ConstError: expected b'IHDR' but parsed b'JPEG'

>>> d = Const(255, Int32ul)
>>> d.build(None)
b'\xff\x00\x00\x00'
malstruct.miscellaneous.Index

Indexes a field inside outer Array GreedyRange RepeatUntil context.

Note that you can use this class, or use the this._index expression instead, depending on how it's used. See the examples.

Parsing and building pulls _index key from the context. Size is 0 because stream is unaffected.

Raises:

IndexFieldError – did not find either key in context

Example:

>>> d = Array(3, Index)
>>> d.parse(b"")
[0, 1, 2]
>>> d = Array(3, Struct("i" / Index))
>>> d.parse(b"")
[Container(i=0), Container(i=1), Container(i=2)]

>>> d = Array(3, Computed(this._index+1))
>>> d.parse(b"")
[1, 2, 3]
>>> d = Array(3, Struct("i" / Computed(this._._index+1)))
>>> d.parse(b"")
[Container(i=1), Container(i=2), Container(i=3)]
class malstruct.miscellaneous.Default(subcon, value)

Bases: Subconstruct

Field where building does not require a value, because the value gets taken from default. Comes in handy when building a Struct from a dict with missing keys.

Parsing defers to subcon. Building also defers to subcon, but it builds from the default (if the given object is None) or from the given object. Building does not require a value, but can accept one. Size is the same as subcon, unless it raises SizeofError.

The difference between Default and Rebuild is that with Default the build value is optional, while with Rebuild the build value is ignored.

Parameters:
  • subcon – Construct instance

  • value – context lambda or constant value

Raises:

StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = Struct(
...     "a" / Default(Byte, 0),
... )
>>> d.build(dict(a=1))
b'\x01'
>>> d.build(dict())
b'\x00'
class malstruct.miscellaneous.Check(func)

Bases: Construct

Checks for a condition, and raises CheckError if the check fails.

Parsing and building return nothing (but check the condition). Size is 0 because stream is unaffected.

Parameters:

func – bool or context lambda, that gets run on parsing and building

Raises:

CheckError – lambda returned false

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

Check(lambda ctx: len(ctx.payload.data) == ctx.payload_len)
Check(len_(this.payload.data) == this.payload_len)
malstruct.miscellaneous.Error

Raises ExplicitError, unconditionally.

Parsing and building always raise ExplicitError. Size is undefined.

Raises:

ExplicitError – unconditionally, on parsing and building

Example:

>>> d = Struct("num"/Byte, Error)
>>> d.parse(b"data...")
malstruct.core.ExplicitError: Error field was activated during parsing
class malstruct.miscellaneous.ErrorMessage(message='Error field was activated.')

Bases: Construct

Raises an exception when triggered by parse or build. Can be used as a sentinel that blows a whistle when a conditional branch goes the wrong way, or to raise an error explicitly the declarative way. This variant of Error lets you supply a custom message.

Example:

>>> d = "x"/Int8sb >> IfThenElse(this.x > 0, Int8sb, ErrorMessage('Failed if statement'))
>>> d.parse(b"\xff\x05")
Traceback (most recent call last):
    ...
construct.core.ExplicitError: Failed if statement
class malstruct.miscellaneous.Iter(iterable, cases, default=None)

Bases: Construct

Class that allows iterating over an object and acting on each item.

Example:

>>> spec = Struct(
...     'types' / Byte[3],
...     'entries' / Iter(this.types, {
...        1: Int32ul,
...        2: Int16ul,
...     },
...     default=Pass
...     )
... )
>>> spec.parse(b'\x01\x02\x09\x03\x03\x03\x03\x06\x06')
Container(types=ListContainer([1, 2, 9]), entries=ListContainer([50529027, 1542, None]))
>>> C = _
>>> spec.build(C)
b'\x01\x02\t\x03\x03\x03\x03\x06\x06'
>>> spec.sizeof(**C)
9

>>> spec = Struct(
...     'sizes' / Int16ul[4],
...     'entries' / Iter(this.sizes, Bytes)  # equivalent to Iter(this.sizes, lambda size: Bytes(size))
... )
>>> spec.parse(b'\x01\x00\x03\x00\x00\x00\x05\x00abbbddddd')
Container(sizes=ListContainer([1, 3, 0, 5]), entries=ListContainer([b'a', b'bbb', b'', b'ddddd']))
>>> C = _
>>> spec.build(C)
b'\x01\x00\x03\x00\x00\x00\x05\x00abbbddddd'
>>> Iter(this.sizes, Bytes).sizeof(sizes=[1,2,3,0])
6
>>> spec.sizeof(**C)
17
Parameters:
  • iterable – iterable items to act upon

  • cases – A dictionary of cases or a function that takes a key and returns a construct spec.

  • default – The default case (only if cases is a dict)
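
The dictionary form of `cases` amounts to a per-item dispatch. The sketch below mirrors the first example with `struct` format strings standing in for constructs. `iter_parse` is a hypothetical helper, and a missing case is treated like Pass (it consumes nothing and yields None).

```python
import struct


def iter_parse(data, keys, cases, default=None):
    """Parse one value per key, choosing the format from `cases` (or `default`)."""
    offset = 0
    results = []
    for key in keys:
        fmt = cases.get(key, default)
        if fmt is None:
            results.append(None)  # behaves like Pass: nothing is consumed
            continue
        (value,) = struct.unpack_from(fmt, data, offset)
        offset += struct.calcsize(fmt)
        results.append(value)
    return results


types = [1, 2, 9]
entries = iter_parse(b"\x03\x03\x03\x03\x06\x06", types, {1: "<I", 2: "<H"})
```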

malstruct.miscellaneous.Pickled

Preserves arbitrary Python objects.

Parses using pickle.load() and builds using pickle.dump() functions, using default Pickle binary protocol. Size is undefined.

Raises:

StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

Can propagate pickle.load() and pickle.dump() exceptions.

Example:

>>> x = [1, 2.3, {}]
>>> Pickled.build(x)
b'\x80\x03]q\x00(K\x01G@\x02ffffff}q\x01e.'
>>> Pickled.parse(_)
[1, 2.3, {}]
malstruct.miscellaneous.Numpy

Preserves numpy arrays (shape, dtype and values).

Parses using numpy.load() and builds using numpy.save() functions, using Numpy binary protocol. Size is undefined.

Raises:
  • ImportError – numpy could not be imported during parsing or building

  • ValueError – could not read enough bytes, or so

Can propagate numpy.load() and numpy.save() exceptions.

Example:

>>> import numpy
>>> a = numpy.asarray([1,2,3])
>>> Numpy.build(a)
b"\x93NUMPY\x01\x00F\x00{'descr': '<i8', 'fortran_order': False, 'shape': (3,), }            \n\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x03\x00\x00\x00\x00\x00\x00\x00"
>>> Numpy.parse(_)
array([1, 2, 3])
class malstruct.miscellaneous.NamedTuple(tuplename, tuplefields, subcon)

Bases: Adapter

Arrays, structs, and sequences can all be mapped to a namedtuple from the collections module. To create a named tuple, you need to provide a name and a sequence of fields, either a string with space-separated names or a list of string names, like the standard namedtuple.

Parses into a collections.namedtuple instance, and builds from such instance (although it also builds from lists and dicts). Size is undefined.

Parameters:
  • tuplename – string

  • tuplefields – string or list of strings

  • subcon – Construct instance, either Struct Sequence Array GreedyRange

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • NamedTupleError – subcon is neither Struct Sequence Array GreedyRange

Can propagate collections exceptions.

Example:

>>> d = NamedTuple("coord", "x y z", Byte[3])
>>> d = NamedTuple("coord", "x y z", Byte >> Byte >> Byte)
>>> d = NamedTuple("coord", "x y z", "x"/Byte + "y"/Byte + "z"/Byte)
>>> d.parse(b"123")
coord(x=49, y=50, z=51)
class malstruct.miscellaneous.Delimited(delimiter, *subcons)

Bases: Construct

A construct used to parse delimited data.

NOTE: The parsed constructs will be buffered.

Example:

>>> spec = Delimited(b'|',
...     'first' / CString(),
...     'second' / Int32ul,
...     # When using a Greedy construct, either all data till EOF or the next delimiter will be consumed.
...     'third' / GreedyBytes,
...     'fourth' / Byte
... )
>>> spec.parse(b'Hello\x00\x00|\x01\x00\x00\x00|world!!\x01\x02|\xff')
Container(first=u'Hello', second=1, third=b'world!!\x01\x02', fourth=255)
>>> spec.build(dict(first=u'Hello', second=1, third=b'world!!\x01\x02', fourth=255))
b'Hello\x00|\x01\x00\x00\x00|world!!\x01\x02|\xff'

# If you don't care about a particular element, you can leave it nameless just like in Structs.
# NOTE: You can't build unless you have supplied every attribute.

>>> spec = Delimited(b'|',
...     'first' / CString(),
...     'second' / Int32ul,
...     Pass,
...     'fourth' / Byte
... )
>>> spec.parse(b'Hello\x00\x00|\x01\x00\x00\x00|world!!\x01\x02|\xff')
Container(first=u'Hello', second=1, fourth=255)

# It may also be useful to use Pass or Optional for fields that may not exist.

>>> spec = Delimited(b'|',
...     'first' / CString(),
...     'second' / Pass,
...     'third' / Optional(Int32ul)
... )
>>> spec.parse(b'Hello\x00\x00|dont care|\x01\x00\x00\x00')
Container(first=u'Hello', second=None, third=1)
>>> spec.parse(b'Hello\x00\x00||')
Container(first=u'Hello', second=None, third=None)

# Delimiters may have a length > 1.

>>> spec = Delimited(b'YOYO',
...     'first' / CString(),
...     'second' / Int32ul,
...     # When using a Greedy construct, either all data till EOF or the next delimiter will be consumed.
...     'third' / GreedyBytes,
...     'fourth' / Byte
... )
>>> spec.parse(b'Hello\x00\x00YOYO\x01\x00\x00\x00YOYOworld!!YO!!\x01\x02YOYO\xff')
Container(first=u'Hello', second=1, third=b'world!!YO!!\x01\x02', fourth=255)
>>> spec.build(dict(first=u'Hello', second=1, third=b'world!!YO!!\x01\x02', fourth=255))
b'Hello\x00YOYO\x01\x00\x00\x00YOYOworld!!YO!!\x01\x02YOYO\xff'
class malstruct.miscellaneous.Regex(regex, *subcon, **group_subcons)

Bases: Construct

A construct designed to look for the first match of the given regex, then parse the data collected in the groups. Returns the matched capture groups in attributes based on their respective names. If a subconstruct is defined for a group, it will run that construct on that particular piece of data.

NOTE: The subconstruct will run on the data as if it is the only data that exists. Therefore, using Seek and Tell will be purely relative to that piece of data only. This was done to ensure you are only parsing what has been captured. (If you need to use Seek or Tell, you will have to instead make a capture group that collects no data.)

NOTE: If you supply a string as the regular expression, the re.DOTALL flag will be automatically specified. If you need to use different flags, you must pass a compiled regex.

Example:

# The seek position is left at the end of the successful match (match.end()).
>>> regex = re.compile(b'\x01\x02(?P<size>.{4})\x03\x04(?P<path>[A-Za-z].*\x00)', re.DOTALL)
>>> data = b'GARBAGE!\x01\x02\x0A\x00\x00\x00\x03\x04C:\\Windows\x00MORE GARBAGE!'
>>> r = Regex(regex, size=Int32ul, path=CString()).parse(data)
>>> r == Container(path='C:\\Windows', size=10)
True
>>> r = Regex(regex).parse(data)
>>> r == Container(path=b'C:\\Windows\x00', size=b'\n\x00\x00\x00')
True
>>> r = Struct(
...     're' / Regex(regex, size=Int32ul, path=CString()),
...     'after_re' / Tell,
...     'garbage' / GreedyBytes
... ).parse(data)
>>> r == Container(re=Container(path='C:\\Windows', size=10), after_re=27, garbage=b'MORE GARBAGE!')
True

>>> Struct(
...     *Regex(regex, size=Int32ul, path=CString()),
...     'after_re' / Tell,
...     'garbage' / GreedyBytes
... ).parse(data)
Container(size=10, path=u'C:\\Windows', after_re=27, garbage=b'MORE GARBAGE!')

# You can use Regex as a trigger to find a particular piece of data before you start parsing.
>>> Struct(
...     Regex(b'TRIGGER'),
...     'greeting' / CString()
... ).parse(b'\x01\x02\x04GARBAGE\x05TRIGGERhello world\x00')
Container(greeting=u'hello world')

# If no data is captured, the associated subcon will receive a stream with the position set at the location
# of that captured group. This allows you to use it as an anchor point.
>>> r = Regex(b'hello (?P<anchor>)world(?P<extra_data>.*)', anchor=Tell).parse(b'hello world!!!!')
>>> r == Container(extra_data=b'!!!!', anchor=6)
True

# If no named capture groups are used, you can instead parse the entire matched string by supplying
# a subconstruct as a positional argument. (If no subcon is provided, the raw bytes are returned instead.)
>>> Regex(b'hello world\x00', CString()).parse(b'GARBAGE\x01\x03hello world\x00\x04')
'hello world'
>>> Regex(b'hello world\x00').parse(b'GARBAGE\x01\x03hello world\x00\x04')
b'hello world\x00'

# You can also set the regular expression to match in-place (instead of searching the data)
# by setting the keyword argument _match to True.

>>> Regex('hello', _match=True).parse(b'hello world!')
b'hello'
>>> Regex('hello').parse(b'bogus hello world')
b'hello'
>>> Regex('hello', _match=True).parse(b'bogus hello world')
Traceback (most recent call last):
    ...
construct.core.ConstructError: [(parsing)] regex did not match
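
What the first example does can be approximated with the standard `re` and `struct` modules: search for the pattern, convert each named group, and note where the match ends. This is a sketch of the idea, not the library's implementation. The manual conversions stand in for the Int32ul and CString subcons.

```python
import re
import struct

pattern = re.compile(
    rb"\x01\x02(?P<size>.{4})\x03\x04(?P<path>[A-Za-z].*\x00)", re.DOTALL
)
data = b"GARBAGE!\x01\x02\x0a\x00\x00\x00\x03\x04C:\\Windows\x00MORE GARBAGE!"

match = pattern.search(data)                         # search, not match-in-place
(size,) = struct.unpack("<I", match.group("size"))   # little-endian 32-bit integer
path = match.group("path").split(b"\x00", 1)[0].decode("ascii")  # null-terminated string
after_re = match.end()                               # where parsing would continue

# Matching in place only succeeds when the pattern starts at the current position.
in_place = re.compile(b"hello").match(b"bogus hello world")
```

Here `size` is 10, `path` is `C:\Windows`, `after_re` is 27, and `in_place` is None because the data does not start with the pattern.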
regex
match
subcons
subcon
malstruct.miscellaneous.RegexSearch(regex, *subcon, **group_subcons) → Regex

Performs a search for the given regex pattern starting at the current stream position and then parses the match groups.

malstruct.miscellaneous.RegexMatch(regex, *subcon, **group_subcons) Regex

Performs match of given regex pattern at current stream position and then parses match groups.
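
The difference between searching and matching in place can be illustrated with the standard library `re` module. This is only a sketch of the two regex modes on raw bytes; it does not use malstruct, and how RegexSearch and RegexMatch are implemented internally is not shown here.

```python
import re

data = b"bogus hello world"

# Searching scans forward from the current position until the pattern is found.
found = re.search(rb"hello", data)

# Matching requires the pattern to start exactly at the current position.
anchored = re.match(rb"hello", data)

print(found.group(), found.start())
print(anchored)
```

In this sketch the search succeeds at offset 6, while the in-place match fails because the data begins with `bogus`.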

class malstruct.miscellaneous.BytesTerminated(subcon, term=b'\x00', include=False, consume=True, require=True, absolute=False)

Bases: NullTerminated

BytesTerminated is the same as NullTerminated except that it is targeted for binary data and not strings, and therefore the terminator can be an arbitrary length (as opposed to having length equal to the character width). See the NullTerminated documentation for the remainder of the functionality and options.

>>> BytesTerminated(GreedyBytes, term=b'TERM').parse(b'helloTERM')
b'hello'
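
As a rough illustration of an arbitrary-length terminator, the following plain-Python sketch cuts a byte string at the first occurrence of a terminator. `split_at_terminator` is a hypothetical helper written for this example; it is not part of malstruct and ignores the `include`, `consume`, and `require` options.

```python
def split_at_terminator(data: bytes, term: bytes = b"\x00"):
    """Return (payload before the terminator, remainder after it)."""
    index = data.find(term)
    if index == -1:
        raise ValueError("terminator not found")
    # The terminator itself is consumed and discarded.
    return data[:index], data[index + len(term):]


payload, rest = split_at_terminator(b"helloTERMtail", term=b"TERM")
print(payload, rest)
```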
class malstruct.miscellaneous.Stripped(subcon, pad=None)

Bases: Adapter

An adapter that strips characters/bytes from the right of the parsed results.

NOTE: While this may look similar to Padded() this is different because this doesn’t take a length and instead strips out the nulls from within the already parsed subconstruct.

Parameters:
  • subcon – The sub-construct to wrap.

  • pad – The character/bytes to use for stripping. Defaults to null character.

Example:

>>> Stripped(GreedyBytes).parse(b'hello\x00\x00\x00')
b'hello'
>>> Stripped(Bytes(10)).parse(b'hello\x00\x00\x00\x00\x00')
b'hello'
>>> Stripped(Bytes(14), pad=b'PAD').parse(b'helloPADPADPAD')
b'hello'
>>> Stripped(Bytes(14), pad=b'PAD').build(b'hello')
b'helloPADPADPAD'
>>> Stripped(CString(), pad=u'PAD').parse(b'helloPADPAD\x00')
'hello'
>>> Stripped(String(14), pad=u'PAD').parse(b'helloPADPAD\x00\x00\x00')
'hello'

# WARNING: If padding doesn't fit in the prescribed data it will not strip it!
>>> Stripped(Bytes(13), pad=b'PAD').parse(b'helloPADPADPA')
b'helloPADPADPA'
>>> Stripped(Bytes(13), pad=b'PAD').build(b'hello')
Traceback (most recent call last):
    ...
construct.core.StreamError: Error in path (building)
bytes object of wrong length, expected 13, found 5

# If the wrapped subconstruct's size can't be determined, it defaults to not providing a pad.
>>> Stripped(CString(), pad=u'PAD').build(u'hello')
b'hello\x00'
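
The stripping behavior shown above, including the warning about padding that does not fit, can be sketched in plain Python. `strip_right` is a hypothetical helper for illustration and is an assumption about the observable behavior, not malstruct's actual code.

```python
def strip_right(data: bytes, pad: bytes = b"\x00") -> bytes:
    """Remove whole copies of pad from the right end of data."""
    while pad and data.endswith(pad):
        data = data[:-len(pad)]
    return data


print(strip_right(b"hello\x00\x00\x00"))
print(strip_right(b"helloPADPADPAD", pad=b"PAD"))
# A trailing partial pad (b"PA") blocks stripping entirely.
print(strip_right(b"helloPADPADPA", pad=b"PAD"))
```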
class malstruct.miscellaneous.Base64(subcon, custom_alpha=None)

Bases: Adapter

Adapter used to Base64 encode/decode a value.

WARNING: This adapter must be used on a unicode string value.

Parameters:
  • subcon – the construct to wrap

  • custom_alpha – optional custom alphabet to use

Example:

>>> Base64(GreedyString()).build(b'hello')
b'aGVsbG8='
>>> Base64(GreedyString()).parse(b'aGVsbG8=')
b'hello'
>>> Base64(GreedyBytes).build(b'\x01\x02\x03\x04')
b'AQIDBA=='
>>> Base64(GreedyBytes).parse(b'AQIDBA==')
b'\x01\x02\x03\x04'

NOTE: String size is based on the encoded version.

>>> Base64(String(16)).build('hello world')
b'aGVsbG8gd29ybGQ='
>>> Base64(String(16)).parse(b'aGVsbG8gd29ybGQ=')
b'hello world'

Supplying a custom alphabet is also supported.

>>> spec = Base64(String(16), custom_alpha=b'EFGHQRSTUVWefghijklmnopIJKLMNOPABCDqrstuvwxyXYZabcdz0123456789+/=')
>>> spec.build('hello world')
b'LSoXMS8BO29dMSj='
>>> spec.parse(b'LSoXMS8BO29dMSj=')
b'hello world'
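
One common way to support a custom alphabet is to translate between the standard Base64 alphabet and the custom one. The sketch below uses only the standard library and reproduces the output shown above; whether malstruct uses this exact approach is an assumption, and the helper names are hypothetical.

```python
import base64

STANDARD = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/="
CUSTOM = b"EFGHQRSTUVWefghijklmnopIJKLMNOPABCDqrstuvwxyXYZabcdz0123456789+/="


def b64encode_custom(data: bytes, alphabet: bytes = CUSTOM) -> bytes:
    # Encode normally, then map each standard character to its custom counterpart.
    return base64.b64encode(data).translate(bytes.maketrans(STANDARD, alphabet))


def b64decode_custom(data: bytes, alphabet: bytes = CUSTOM) -> bytes:
    # Map back to the standard alphabet before decoding.
    return base64.b64decode(data.translate(bytes.maketrans(alphabet, STANDARD)))


print(b64encode_custom(b"hello world"))
print(b64decode_custom(b"LSoXMS8BO29dMSj="))
```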
class malstruct.miscellaneous.Backwards(subcon)

Bases: Subconstruct

Subconstruct used to parse a given subconstruct backwards in the stream. This is a macro for seeking backwards before parsing the construct. (This will not work for subcons that don’t have a valid sizeof, except for GreedyBytes and GreedyString.)

The stream will be left off at the start of the parsed result by design. Therefore, doing something like Int32ul >> Backwards(Int32ul) >> Int32ul will parse the same data 3 times.

Example:

>>> (Bytes(14) >> Backwards(Int32ul) >> Tell).parse(b'junk stuff\x01\x02\x00\x00')
ListContainer([b'junk stuff\x01\x02\x00\x00', 513, 10])
>>> spec = Struct(Seek(0, os.SEEK_END), 'name' / Backwards(String(9)), 'number' / Backwards(Int32ul))
>>> spec.parse(b'A BUNCH OF JUNK DATA\x01\x00\x00\x00joe shmoe')
Container(name=u'joe shmoe', number=1)

# WARNING: This will break if the subcon doesn't have a valid sizeof.
>>> spec = Struct(Seek(0, os.SEEK_END), 'name' / Backwards(CString()), 'number' / Backwards(Int32ul))
>>> spec.parse(b'A BUNCH OF JUNK DATA\x01\x00\x00\x00joe shmoe\x00')
Traceback (most recent call last):
  ...
construct.core.SizeofError: Error in path (parsing) -> name


# However, GreedyBytes and GreedyString are allowed.
>>> spec = Struct(Seek(0, os.SEEK_END), 'name' / Backwards(String(9)), 'rest' / Backwards(GreedyBytes))
>>> spec.parse(b'A BUNCH OF JUNK DATA\x01\x00\x00\x00joe shmoe')
Container(name=u'joe shmoe', rest=b'A BUNCH OF JUNK DATA\x01\x00\x00\x00')
>>> spec = Struct(Seek(0, os.SEEK_END), 'name' / Backwards(String(9)), 'rest' / Backwards(GreedyString(encoding='utf-16-le')))
>>> spec.parse(b'h\x00e\x00l\x00l\x00o\x00joe shmoe')
Container(name=u'joe shmoe', rest=u'hello')

# WARNING: This will also break if you read more data than is behind the current position.
>>> (Seek(0, os.SEEK_END) >> Backwards(String(10))).parse(b'yo')
Traceback (most recent call last):
  ...
construct.core.FormatFieldError: could not read enough bytes, expected 10, found 2
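
The seek-back-then-parse idea can be sketched with `io.BytesIO` from the standard library. `read_backwards` is a hypothetical helper; the sketch assumes a fixed size and mirrors the first example above, leaving the stream at the start of the data it read.

```python
import io
import struct


def read_backwards(stream, size: int) -> bytes:
    """Read the size bytes that precede the current position."""
    start = stream.tell() - size
    if start < 0:
        raise ValueError("not enough data behind the current position")
    stream.seek(start)
    data = stream.read(size)
    # Leave the stream at the start of the parsed data, by design.
    stream.seek(start)
    return data


stream = io.BytesIO(b"junk stuff\x01\x02\x00\x00")
stream.seek(0, io.SEEK_END)
(number,) = struct.unpack("<I", read_backwards(stream, 4))
print(number, stream.tell())
```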

malstruct.network module

Network constructs

malstruct.stream module

class malstruct.stream.Pointer(offset, subcon, stream=None, relativeOffset=False)

Bases: Subconstruct

Jumps in the stream forth and back for one field.

Parsing and building seeks the stream to new location, processes subcon, and seeks back to original location. Size is defined as 0 but that does not mean no bytes are written into the stream.

Offset can be positive, indicating a position from stream beginning forward, or negative, indicating a position from EOF backwards. Alternatively the offset can be interpreted as relative to the current stream position.

Parameters:
  • offset – integer or context lambda, positive or negative

  • subcon – Construct instance

  • stream – None to use original stream (default), or context lambda to provide a different stream

  • relativeOffset – True to interpret the offset as relative to the current stream position

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • StreamError – stream is not seekable and tellable

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = Pointer(8, Bytes(1))
>>> d.parse(b"abcdefghijkl")
b'i'
>>> d.build(b"Z")
b'\x00\x00\x00\x00\x00\x00\x00\x00Z'
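
The jump, process, and jump-back sequence can be sketched on a plain `io.BytesIO` stream. `read_at` is a hypothetical helper covering only absolute offsets (positive from the start, negative from EOF); the relative-offset mode is not modelled.

```python
import io


def read_at(stream, offset: int, size: int) -> bytes:
    """Read size bytes at offset, then restore the original position."""
    saved = stream.tell()
    # Negative offsets count backwards from EOF.
    stream.seek(offset, io.SEEK_SET if offset >= 0 else io.SEEK_END)
    try:
        return stream.read(size)
    finally:
        stream.seek(saved)


stream = io.BytesIO(b"abcdefghijkl")
print(read_at(stream, 8, 1), stream.tell())
print(read_at(stream, -1, 1), stream.tell())
```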
class malstruct.stream.Peek(subcon)

Bases: Subconstruct

Peeks at the stream.

Parsing sub-parses (and returns None if failed), then reverts stream to original position. Building does nothing (it's NOT deferred). Size is defined as 0 because there is no building.

This class is used in Union class to parse each member.

Parameters:

subcon – Construct instance

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • StreamError – stream is not seekable and tellable

Example:

>>> d = Sequence(Peek(Int8ub), Peek(Int16ub))
>>> d.parse(b"\x01\x02")
[1, 258]
>>> d.sizeof()
0
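
A minimal sketch of peeking, assuming a seekable stream and using `struct` format strings in place of subcons. `peek` is a hypothetical helper; it returns None on a short read and always restores the position.

```python
import io
import struct


def peek(stream, fmt: str):
    """Parse one value described by fmt without advancing the stream."""
    saved = stream.tell()
    try:
        size = struct.calcsize(fmt)
        data = stream.read(size)
        if len(data) != size:
            return None  # a failed sub-parse yields None
        return struct.unpack(fmt, data)[0]
    finally:
        stream.seek(saved)


stream = io.BytesIO(b"\x01\x02")
print(peek(stream, ">B"), peek(stream, ">H"), peek(stream, ">I"), stream.tell())
```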
class malstruct.stream.OffsettedEnd(endoffset, subcon, absolute=False)

Bases: Subconstruct

Parses all bytes in the stream till EOF plus a negative endoffset is reached.

This is useful when GreedyBytes (or any other greedy construct) is followed by a fixed-size footer.

Parsing determines the length of the stream and reads all bytes till EOF plus endoffset is reached, then defers to subcon using new BytesIO with said bytes. Building defers to subcon as-is. Size is undefined.

Parameters:
  • endoffset – integer or context lambda, only negative offsets or zero are allowed

  • subcon – Construct instance

  • absolute – Seek relative to the start of the stream rather than relative to the last occurrence of a subconstruct

Raises:
  • StreamError – could not read enough bytes

  • StreamError – reads behind the stream (if endoffset is positive)

Example:

>>> d = Struct(
...     "header" / Bytes(2),
...     "data" / OffsettedEnd(-2, GreedyBytes),
...     "footer" / Bytes(2),
... )
>>> d.parse(b"\x01\x02\x03\x04\x05\x06\x07")
Container(header=b'\x01\x02', data=b'\x03\x04\x05', footer=b'\x06\x07')
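
The arithmetic behind a negative end offset can be sketched with plain slicing. `split_with_footer` is a hypothetical helper that mirrors the header/data/footer layout in the example above.

```python
def split_with_footer(data: bytes, header_size: int, endoffset: int):
    """Split data into (header, body, footer) where the body stops at EOF + endoffset."""
    if endoffset > 0:
        raise ValueError("only negative offsets or zero are allowed")
    end = len(data) + endoffset
    if end < header_size:
        raise ValueError("could not read enough bytes")
    return data[:header_size], data[header_size:end], data[end:]


print(split_with_footer(b"\x01\x02\x03\x04\x05\x06\x07", 2, -2))
```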
class malstruct.stream.Seek(at, whence=0)

Bases: Construct

Seeks the stream.

Parsing and building seek the stream to given location (and whence), and return stream.seek() return value. Size is not defined.

See also

Analog Pointer wrapper that has same side effect but also processes a subcon, and also seeks back.

Parameters:
  • at – integer or context lambda, where to jump to

  • whence – optional, integer or context lambda, is the offset from beginning (0) or from current position (1) or from EOF (2), default is 0

Raises:

StreamError – stream is not seekable

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = (Seek(5) >> Byte)
>>> d.parse(b"01234x")
[5, 120]

>>> d = (Bytes(10) >> Seek(5) >> Byte)
>>> d.build([b"0123456789", None, 255])
b'01234\xff6789'
malstruct.stream.Tell

Tells the stream.

Parsing and building return current stream offset using stream.tell(). Size is defined as 0 because parsing and building does not consume or add into the stream.

Tell is useful for adjusting relative offsets to absolute positions, or to measure sizes of Constructs. To get an absolute pointer, use a Tell plus a relative offset. To get a size, place two Tells and measure their difference using a Compute field. However, it's recommended to use RawCopy instead of manually extracting two positions and computing difference.

Raises:

StreamError – stream is not tellable

Example:

>>> d = Struct("num"/VarInt, "offset"/Tell)
>>> d.parse(b"X")
Container(num=88, offset=1)
>>> d.build(dict(num=88))
b'X'
malstruct.stream.Terminated

Asserts end of stream (EOF). You can use it to ensure no more unparsed data follows in the stream.

Parsing checks if stream reached EOF, and raises TerminatedError if not. Building does nothing. Size is defined as 0 because parsing and building does not consume or add into the stream, as far as other constructs see it.

Raises:

TerminatedError – stream not at EOF when parsing

Example:

>>> Terminated.parse(b"")
None
>>> Terminated.parse(b"remaining")
malstruct.core.TerminatedError: expected end of stream

malstruct.strings module

malstruct.strings.possiblestringencodings = {'ascii': 1, 'u16': 2, 'u32': 4, 'u8': 1, 'utf16': 2, 'utf32': 4, 'utf8': 1, 'utf_16': 2, 'utf_16_be': 2, 'utf_16_le': 2, 'utf_32': 4, 'utf_32_be': 4, 'utf_32_le': 4, 'utf_8': 1}

Explicitly supported encodings (by PaddedString and CString classes).

malstruct.strings.encodingunit(encoding)
>>> encodingunit('utf-8')
b'\x00'
>>> encodingunit('utf-16le')
b'\x00\x00'
>>> encodingunit('utf-16')
b'\x00\x00'
>>> encodingunit('utf-32')
b'\x00\x00\x00\x00'
>>> encodingunit('cp950')
b'\x00'
class malstruct.strings.StringEncoded(subcon, encoding)

Bases: Adapter

Used internally.

malstruct.strings.PaddedString(length, encoding='utf-8')

Configurable, fixed-length or variable-length string field.

When parsing, the byte string is stripped of null bytes (per encoding unit), then decoded. Length is an integer or context lambda. When building, the string is encoded and then padded to specified length. If encoded string is larger than the specified length, it fails with PaddingError. Size is same as length parameter.

Warning

PaddedString and CString only support encodings explicitly listed in possiblestringencodings .

Parameters:
  • length – integer or context lambda, length in bytes (not unicode characters)

  • encoding – string like: utf8 utf16 utf32 ascii

Raises:

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = PaddedString(10, "utf8")
>>> d.build(u"Афон")
b'\xd0\x90\xd1\x84\xd0\xbe\xd0\xbd\x00\x00'
>>> d.parse(_)
u'Афон'
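
The encode-then-pad and strip-then-decode steps can be sketched in plain Python. The helper names are hypothetical, and the byte-wise `rstrip` is only valid for encodings with a 1-byte unit such as UTF-8 or ASCII; wider encodings would need stripping per encoding unit.

```python
def build_padded(text: str, length: int, encoding: str = "utf-8") -> bytes:
    raw = text.encode(encoding)
    if len(raw) > length:
        raise ValueError("encoded string is larger than the specified length")
    # Pad with null bytes up to the fixed length (length is in bytes, not characters).
    return raw.ljust(length, b"\x00")


def parse_padded(data: bytes, encoding: str = "utf-8") -> str:
    # Strip trailing nulls, then decode (1-byte encoding units only).
    return data.rstrip(b"\x00").decode(encoding)


built = build_padded("Афон", 10)
print(built)
print(parse_padded(built))
```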
malstruct.strings.PascalString(lengthfield, encoding)

Length-prefixed string. The length field can be variable length (such as VarInt) or fixed length (such as Int64ub). VarInt is recommended when designing new protocols. Stored length is in bytes, not characters. Size is not defined.

Parameters:
  • lengthfield – Construct instance, field used to parse and build the length (like VarInt Int64ub)

  • encoding – string like: utf8 utf16 utf32 ascii

Raises:

StringError – building a non-unicode string

Example:

>>> d = PascalString(VarInt, "utf8")
>>> d.build(u"Афон")
b'\x08\xd0\x90\xd1\x84\xd0\xbe\xd0\xbd'
>>> d.parse(_)
u'Афон'
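
A length-prefixed string can be sketched as follows. The helpers are hypothetical and use a single-byte prefix, which is an assumption that lines up with the VarInt output above only for encoded lengths below 128.

```python
def build_pascal(text: str, encoding: str = "utf-8") -> bytes:
    raw = text.encode(encoding)
    if len(raw) > 0x7F:
        raise ValueError("this sketch only handles single-byte length prefixes")
    # The stored length counts bytes, not characters.
    return bytes([len(raw)]) + raw


def parse_pascal(data: bytes, encoding: str = "utf-8") -> str:
    length = data[0]
    return data[1:1 + length].decode(encoding)


built = build_pascal("Афон")
print(built)
print(parse_pascal(built))
```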
malstruct.strings.String(length, encoding='utf-8')

Configurable, fixed-length or variable-length string field.

When parsing, the byte string is stripped of null bytes (per encoding unit), then decoded. Length is an integer or context lambda. When building, the string is encoded and then padded to specified length. If encoded string is larger than the specified length, it fails with PaddingError. Size is same as length parameter.

Warning

PaddedString and CString only support encodings explicitly listed in possiblestringencodings .

Parameters:
  • length – integer or context lambda, length in bytes (not unicode characters)

  • encoding – string like: utf8 utf16 utf32 ascii

Raises:

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = PaddedString(10, "utf8")
>>> d.build(u"Афон")
b'\xd0\x90\xd1\x84\xd0\xbe\xd0\xbd\x00\x00'
>>> d.parse(_)
u'Афон'
malstruct.strings.CString(encoding='utf-8')

String ending in a terminating null byte (or null bytes in case of UTF16 UTF32).

Warning

String and CString only support encodings explicitly listed in possiblestringencodings .

Parameters:

encoding – string like: utf8 utf16 utf32 ascii

Raises:

Example:

>>> d = CString("utf8")
>>> d.build(u"Афон")
b'\xd0\x90\xd1\x84\xd0\xbe\xd0\xbd\x00'
>>> d.parse(_)
u'Афон'
malstruct.strings.GreedyString(encoding='utf-8')

String that reads entire stream until EOF, and writes a given string as-is. Analog to GreedyBytes but also applies unicode-to-bytes encoding.

Parameters:

encoding – string like: utf8 utf16 utf32 ascii

Raises:

Example:

>>> d = GreedyString("utf8")
>>> d.build(u"Афон")
b'\xd0\x90\xd1\x84\xd0\xbe\xd0\xbd'
>>> d.parse(_)
u'Афон'
malstruct.strings.String16(length)

Creates UTF-16 (little endian) encoded string.

>>> String16(10).build(u'hello')
b'h\x00e\x00l\x00l\x00o\x00'
>>> String16(10).parse(b'h\x00e\x00l\x00l\x00o\x00')
'hello'
>>> String16(16).parse(b'h\x00e\x00l\x00l\x00o\x00\x00\x00\x00\x00\x00\x00')
'hello'
malstruct.strings.String32(length)

Creates UTF-32 (little endian) encoded string.

>>> String32(20).build(u'hello')
b'h\x00\x00\x00e\x00\x00\x00l\x00\x00\x00l\x00\x00\x00o\x00\x00\x00'
>>> String32(20).parse(b'h\x00\x00\x00e\x00\x00\x00l\x00\x00\x00l\x00\x00\x00o\x00\x00\x00')
'hello'
class malstruct.strings.Printable(subcon)

Bases: Validator

Validator used to validate that a parsed String (or Bytes) is a printable (ascii) string.

NOTE: A ValidationError is a type of ConstructError and will be caught when catching ConstructError.

>>> Printable(String(5)).parse(b'hello')
'hello'
>>> Printable(String(5)).parse(b'he\x11o!')
Traceback (most recent call last):
    ...
construct.core.ValidationError: Error in path (parsing)
object failed validation: heo!
>>> Printable(Bytes(3)).parse(b'\x01NO')
Traceback (most recent call last):
    ...
construct.core.ValidationError: Error in path (parsing)
object failed validation: b'\x01NO'
>>> Printable(Bytes(3)).parse(b'YES')
b'YES'
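
A printable check along these lines can be sketched with the standard library. Using `string.printable` as the accepted character set is an assumption made for this example, not a confirmed detail of malstruct, and `is_printable` is a hypothetical helper.

```python
import string

# Assumed character set: everything in string.printable, as ASCII byte values.
PRINTABLE = frozenset(string.printable.encode("ascii"))


def is_printable(value) -> bool:
    """Accept either str or bytes and test every byte against PRINTABLE."""
    raw = value.encode("utf-8") if isinstance(value, str) else value
    return all(byte in PRINTABLE for byte in raw)


print(is_printable("hello"), is_printable("he\x11o!"))
print(is_printable(b"YES"), is_printable(b"\x01NO"))
```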

malstruct.transforms module

Subconstruct transforms

malstruct.transforms.Bitwise(subcon)

Converts the stream from bytes to bits, and passes the bitstream to underlying subcon. Bitstream is a stream that contains 8 times as many bytes, and each byte is either \x00 or \x01 (in documentation those bytes are called bits).

Parsing, building, and size are deferred to subcon, although size gets divided by 8 (therefore the subcon’s size must be a multiple of 8).

Note that by default the bit ordering is from MSB to LSB for every byte (ie. bit-level big-endian). If you need it reversed, wrap this subcon with malstruct.core.BitsSwapped.

Parameters:

subcon – Construct instance, any field that works with bits (like BitsInteger) or is bit-byte agnostic (like Struct or Flag)

See Transformed and Restreamed for raisable exceptions.

Example:

>>> d = Bitwise(Struct(
...     'a' / Nibble,
...     'b' / Bytewise(Float32b),
...     'c' / Padding(4),
... ))
>>> d.parse(bytes(5))
Container(a=0, b=0.0, c=None)
>>> d.sizeof()
5

Obtaining other byte or bit orderings:

>>> d = Bitwise(Bytes(16))
>>> d.parse(b'\x01\x03')
b'\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x01\x01'
>>> d = BitsSwapped(Bitwise(Bytes(16)))
>>> d.parse(b'\x01\x03')
b'\x01\x00\x00\x00\x00\x00\x00\x00\x01\x01\x00\x00\x00\x00\x00\x00'
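
The bitstream representation described above (one byte per bit, MSB first) can be sketched in plain Python. The helper names are hypothetical stand-ins written for this example.

```python
def bytes_to_bits(data: bytes) -> bytes:
    """Expand each byte into eight bytes of value 0 or 1, MSB first."""
    return bytes((byte >> shift) & 1 for byte in data for shift in range(7, -1, -1))


def bits_to_bytes(bits: bytes) -> bytes:
    """Pack groups of eight 0/1 bytes back into single bytes."""
    if len(bits) % 8:
        raise ValueError("bit count must be a multiple of 8")
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


bits = bytes_to_bits(b"\x01\x03")
print(bits)
print(bits_to_bytes(bits))
```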
malstruct.transforms.Bytewise(subcon)

Converts the bitstream back to normal byte stream. Must be used within Bitwise.

Parsing, building, and size are deferred to subcon, although size gets multiplied by 8.

Parameters:

subcon – Construct instance, any field that works with bytes or is bit-byte agnostic

See Transformed and Restreamed for raisable exceptions.

Example:

>>> d = Bitwise(Struct(
...     'a' / Nibble,
...     'b' / Bytewise(Float32b),
...     'c' / Padding(4),
... ))
>>> d.parse(bytes(5))
Container(a=0, b=0.0, c=None)
>>> d.sizeof()
5
malstruct.transforms.BitStruct(*subcons, **subconskw)

Makes a structure inside a Bitwise.

See Bitwise and Struct for semantics and raisable exceptions.

Parameters:
  • *subcons – Construct instances, list of members, some can be anonymous

  • **subconskw – Construct instances, list of members (requires Python 3.6)

Example:

BitStruct  <-->  Bitwise(Struct(...))

>>> d = BitStruct(
...     "a" / Flag,
...     "b" / Nibble,
...     "c" / BitsInteger(10),
...     "d" / Padding(1),
... )
>>> d.parse(b"\xbe\xef")
Container(a=True, b=7, c=887, d=None)
>>> d.sizeof()
2
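
To see where the values in the example come from, the same fields can be extracted by hand with shifts and masks. This sketch assumes the MSB-first ordering described for Bitwise; `unpack_fields` is a hypothetical helper.

```python
def unpack_fields(data: bytes) -> dict:
    """Split 16 bits into a 1-bit flag, a 4-bit nibble, a 10-bit integer and 1 padding bit."""
    value = int.from_bytes(data, "big")  # 0xbeef -> 1011 1110 1110 1111
    return {
        "a": bool((value >> 15) & 0x1),   # first bit
        "b": (value >> 11) & 0xF,         # next 4 bits
        "c": (value >> 1) & 0x3FF,        # next 10 bits
    }                                      # the last bit is padding


print(unpack_fields(b"\xbe\xef"))
```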
class malstruct.transforms.RawCopy(subcon)

Bases: Subconstruct

Used to obtain the byte representation of a field (alongside the object value).

Returns a dict containing both parsed subcon value, the raw bytes that were consumed by subcon, starting and ending offset in the stream, and amount in bytes. Builds either from raw bytes representation or a value used by subcon. Size is same as subcon.

Object is a dictionary with either “data” or “value” keys, or both.

When building, if both the “value” and “data” keys are present, then the “data” key is used and the “value” key is ignored. This is undesirable in the case that you parse some data for the purpose of modifying it and writing it back; in this case, delete the “data” key when modifying the “value” key to correctly rebuild the former.

Parameters:

subcon – Construct instance

Raises:
  • StreamError – stream is not seekable and tellable

  • RawCopyError – building and neither data or value was given

  • StringError – building from non-bytes value, perhaps unicode

Example:

>>> d = RawCopy(Byte)
>>> d.parse(b"\xff")
Container(data=b'\xff', value=255, offset1=0, offset2=1, length=1)
>>> d.build(dict(data=b"\xff"))
b'\xff'
>>> d.build(dict(value=255))
b'\xff'
malstruct.transforms.ByteSwapped(subcon)

Swaps the byte order within boundaries of given subcon. Requires a fixed sized subcon.

Parameters:

subcon – Construct instance, subcon on top of byte swapped bytes

Raises:

SizeofError – ctor or compiler could not compute subcon size

See Transformed and Restreamed for raisable exceptions.

Example:

Int24ul <--> ByteSwapped(Int24ub) <--> BytesInteger(3, swapped=True) <--> ByteSwapped(BytesInteger(3))
malstruct.transforms.BitsSwapped(subcon)

Swaps the bit order within each byte within boundaries of given subcon. Does NOT require a fixed sized subcon.

Parameters:

subcon – Construct instance, subcon on top of bit swapped bytes

Raises:

SizeofError – compiler could not compute subcon size

See Transformed and Restreamed for raisable exceptions.

Example:

>>> d = Bitwise(Bytes(8))
>>> d.parse(b"\x01")
b'\x00\x00\x00\x00\x00\x00\x00\x01'
>>> BitsSwapped(d).parse(b"\x01")
b'\x01\x00\x00\x00\x00\x00\x00\x00'
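
Reversing the bit order within each byte can be sketched directly on bytes. `swap_bits` is a hypothetical helper; it operates on the packed bytes, before any expansion into a bitstream.

```python
def swap_bits(data: bytes) -> bytes:
    """Reverse the order of the eight bits inside every byte."""
    return bytes(int(f"{byte:08b}"[::-1], 2) for byte in data)


print(swap_bits(b"\x01"))
print(swap_bits(b"\x03"))
```

Applying the swap twice returns the original data, which is why the same transform serves for both parsing and building.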
class malstruct.transforms.FocusedSeq(parsebuildfrom, *subcons, **subconskw)

Bases: Construct

Allows constructing more elaborate “adapters” than Adapter class.

Parse does parse all subcons in sequence, but returns only the element that was selected (discards other values). Build does build all subcons in sequence, where each gets built from nothing (except the selected subcon which is given the object). Size is the sum of all subcon sizes, unless any subcon raises SizeofError.

This class does context nesting, meaning its members are given access to a new dictionary where the “_” entry points to the outer context. When parsing, each member gets parsed and subcon parse return value is inserted into context under matching key only if the member was named. When building, the matching entry gets inserted into context before subcon gets built, and if subcon build returns a new value (not None) that gets replaced in the context.

This class exposes subcons as attributes. You can refer to subcons that were inlined (and therefore do not exist as variable in the namespace) by accessing the struct attributes, under same name. Also note that compiler does not support this feature. See examples.

This class exposes subcons in the context. You can refer to subcons that were inlined (and therefore do not exist as variable in the namespace) within other inlined fields using the context. Note that you need to use a lambda (this expression is not supported). Also note that compiler does not support this feature. See examples.

This class is used internally to implement PrefixedArray.

Parameters:
  • parsebuildfrom – string name or context lambda, selects a subcon

  • *subcons – Construct instances, list of members, some can be named

  • **subconskw – Construct instances, list of members (requires Python 3.6)

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • UnboundLocalError – selector does not match any subcon

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = FocusedSeq("num", Const(b"SIG"), "num"/Byte, Terminated)
>>> d.parse(b"SIG\xff")
255
>>> d.build(255)
b'SIG\xff'

>>> d = FocusedSeq("animal",
...     "animal" / Enum(Byte, giraffe=1),
... )
>>> d.animal.giraffe
'giraffe'
>>> d = FocusedSeq("count",
...     "count" / Byte,
...     "data" / Padding(lambda this: this.count - this._subcons.count.sizeof()),
... )
>>> d.build(4)
b'\x04\x00\x00\x00'

PrefixedArray <--> FocusedSeq("items",
    "count" / Rebuild(lengthfield, len_(this.items)),
    "items" / subcon[this.count],
)
malstruct.transforms.FocusLast(*subcons, **kw)

A helper for performing the common technique of using FocusedSeq to parse a bunch of subconstructs and then grab the last element.

Example:

>>> FocusLast(Byte, Byte, String(2)).parse(b'\x01\x02hi')
'hi'

>>> spec = FocusLast(
...     'a' / Byte,
...     'b' / Byte,
...     String(this.a + this.b),
... )
>>> spec.parse(b'\x01\x02hi!')
'hi!'
>>> spec.build(u'hi!', a=1, b=2)
b'\x01\x02hi!'

# Simplifies this:
>>> FocusedSeq(
...     'value',
...     're' / construct.Regex(.., offset=construct.Int32ul, size=construct.Byte),
...     'value' / construct.PEPointer(this.re.offset, construct.Bytes(this.re.size))
... )
# To this:
>>> FocusLast(
...     're' / construct.Regex(.., offset=construct.Int32ul, size=construct.Byte),
...     construct.PEPointer(this.re.offset, construct.Bytes(this.re.size))
... )
class malstruct.transforms.Rebuild(subcon, func)

Bases: Subconstruct

Field where building does not require a value, because the value gets recomputed when needed. Comes in handy when building a Struct from a dict with missing keys. Useful for length and count fields when Prefixed and PrefixedArray cannot be used.

Parsing defers to subcon. Building is deferred to subcon, but it builds from a value provided by the context lambda (or constant). Size is the same as subcon, unless it raises SizeofError.

The difference between Default and Rebuild is that with Default the build value is optional, whereas with Rebuild the build value is ignored.

Parameters:
  • subcon – Construct instance

  • func – context lambda or constant value

Raises:

StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = Struct(
...     "count" / Rebuild(Byte, len_(this.items)),
...     "items" / Byte[this.count],
... )
>>> d.build(dict(items=[1,2,3]))
b'\x03\x01\x02\x03'
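
The idea of recomputing a field at build time can be sketched without any library. `build_counted` is a hypothetical helper that derives the count byte from the items, so the caller never supplies it.

```python
def build_counted(items) -> bytes:
    """Emit a one-byte count followed by the items, each as one byte."""
    count = len(items)  # recomputed on every build; any caller-provided count is irrelevant
    if count > 0xFF:
        raise ValueError("count does not fit in one byte")
    return bytes([count]) + bytes(items)


print(build_counted([1, 2, 3]))
```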
class malstruct.transforms.Prefixed(lengthfield, subcon, includelength=False, absolute=False)

Bases: Subconstruct

Prefixes a field with byte count.

Parses the length field. Then reads that amount of bytes, and parses subcon using only those bytes. Constructs that consume entire remaining stream are constrained to consuming only the specified amount of bytes (a substream). When building, data gets prefixed by its length. Optionally, length field can include its own size. Size is the sum of both fields sizes, unless either raises SizeofError.

Analog to PrefixedArray which prefixes with an element count, instead of byte count. Semantics is similar but implementation is different.

VarInt is recommended for new protocols, as it is more compact and never overflows.

Parameters:
  • lengthfield – Construct instance, field used for storing the length

  • subcon – Construct instance, subcon used for storing the value

  • includelength – optional, bool, whether length field should include its own size, default is False

  • absolute – Seek relative to the start of the stream rather than relative to the last occurrence of a subconstruct

Raises:

StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

Example:

>>> d = Prefixed(VarInt, GreedyRange(Int32ul))
>>> d.parse(b"\x08abcdefgh")
[1684234849, 1751606885]

>>> d = PrefixedArray(VarInt, Int32ul)
>>> d.parse(b"\x02abcdefgh")
[1684234849, 1751606885]
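
The distinction between a byte-count prefix and an element-count prefix can be sketched with `struct`. Both helpers are hypothetical and assume a single-byte prefix, which coincides with VarInt only for values below 128.

```python
import struct


def parse_byte_prefixed(data: bytes) -> list:
    """Prefix is a byte count; parse as many uint32 values as fit in that substream."""
    length = data[0]
    body = data[1:1 + length]
    if len(body) != length:
        raise ValueError("could not read enough bytes")
    usable = len(body) - len(body) % 4  # ignore a trailing partial element
    return [value for (value,) in struct.iter_unpack("<I", body[:usable])]


def parse_count_prefixed(data: bytes) -> list:
    """Prefix is an element count; parse exactly that many uint32 values."""
    count = data[0]
    return list(struct.unpack_from(f"<{count}I", data, 1))


print(parse_byte_prefixed(b"\x08abcdefgh"))
print(parse_count_prefixed(b"\x02abcdefgh"))
```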
malstruct.transforms.PrefixedArray(countfield, subcon)

Prefixes an array with item count (as opposed to prefixed by byte count, see Prefixed).

VarInt is recommended for new protocols, as it is more compact and never overflows.

Parameters:
  • countfield – Construct instance, field used for storing the element count

  • subcon – Construct instance, subcon used for storing each element

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • RangeError – consumed or produced too few elements

Example:

>>> d = Prefixed(VarInt, GreedyRange(Int32ul))
>>> d.parse(b"\x08abcdefgh")
[1684234849, 1751606885]

>>> d = PrefixedArray(VarInt, Int32ul)
>>> d.parse(b"\x02abcdefgh")
[1684234849, 1751606885]
class malstruct.transforms.FixedSized(length, subcon, absolute=False)

Bases: Subconstruct

Restricts parsing to specified amount of bytes.

Parsing reads length bytes, then defers to subcon using new BytesIO with said bytes. Building builds the subcon using new BytesIO, then writes said data and additional null bytes accordingly. Size is same as length, although negative amount raises an error.

Parameters:
  • length – integer or context lambda, total amount of bytes (both data and padding)

  • subcon – Construct instance

  • absolute – Seek relative to the start of the stream rather than relative to the last occurrence of a subconstruct

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • PaddingError – length is negative

  • PaddingError – subcon written more bytes than entire length (negative padding)

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = FixedSized(10, Byte)
>>> d.parse(b'\xff\x00\x00\x00\x00\x00\x00\x00\x00\x00')
255
>>> d.build(255)
b'\xff\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> d.sizeof()
10
class malstruct.transforms.NullTerminated(subcon, term=b'\x00', include=False, consume=True, require=True, absolute=False)

Bases: Subconstruct

Restricts parsing to bytes preceding a null byte.

Parsing reads one byte at a time and accumulates it with previous bytes. When term was found, (by default) consumes but discards the term. When EOF was found, (by default) raises same StreamError exception. Then subcon is parsed using new BytesIO made with said data. Building builds the subcon and then writes the term. Size is undefined.

The term can be multiple bytes, to support string classes with UTF16/32 encodings for example. Be warned however: as reported in Issue 1046, the data read must be a multiple of the term length and the term must start at a unit boundary, otherwise strange things happen when parsing.

Parameters:
  • subcon – Construct instance

  • term – optional, bytes, terminator byte-string, default is \x00 (a single null byte)

  • include – optional, bool, if to include terminator in resulting data, default is False

  • consume – optional, bool, if to consume terminator or leave it in the stream, default is True

  • require – optional, bool, if EOF results in failure or not, default is True

  • absolute – Seek relative to the start of the stream rather than relative to the last occurrence of a subconstruct

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • StreamError – encountered EOF but require is not disabled

  • PaddingError – terminator is less than 1 byte in length

Example:

>>> d = NullTerminated(Byte)
>>> d.parse(b'\xff\x00')
255
>>> d.build(255)
b'\xff\x00'
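
The unit-by-unit reading described above, including why the terminator must start at a unit boundary, can be sketched on an `io.BytesIO` stream. `read_until` is a hypothetical helper; its handling of EOF when `require` is False (returning whatever was read) is an assumption made for this sketch.

```python
import io


def read_until(stream, term=b"\x00", include=False, consume=True, require=True):
    """Accumulate term-sized units until one equals the terminator."""
    unit = len(term)
    if unit < 1:
        raise ValueError("terminator must be at least 1 byte long")
    data = b""
    while True:
        position = stream.tell()
        chunk = stream.read(unit)
        if chunk == term:
            if include:
                data += chunk
            if not consume:
                stream.seek(position)  # leave the terminator in the stream
            return data
        if len(chunk) < unit:  # hit EOF before finding the terminator
            if require:
                raise EOFError("terminator not found before end of stream")
            return data + chunk
        data += chunk


stream = io.BytesIO(b"h\x00i\x00\x00\x00rest")
text = read_until(stream, term=b"\x00\x00").decode("utf-16-le")
print(text, stream.read())
```

Because the data is compared one two-byte unit at a time, the null high byte of `i\x00` followed by the first terminator byte is never mistaken for the terminator.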
class malstruct.transforms.NullStripped(subcon, pad=b'\x00', absolute=False)

Bases: Subconstruct

Restricts parsing to bytes except padding left of EOF.

Parsing reads entire stream, then strips the data from right to left of null bytes, then parses subcon using new BytesIO made of said data. Building defers to subcon as-is. Size is undefined, because it reads till EOF.

The pad can be multiple bytes, to support string classes with UTF16/32 encodings.

Parameters:
  • subcon – Construct instance

  • pad – optional, bytes, padding byte-string, default is \x00 (a single null byte)

  • absolute – Seek relative to the start of the stream rather than relative to the last occurrence of a subconstruct

Raises:

PaddingError – pad is less than 1 byte in length

Example:

>>> d = NullStripped(Byte)
>>> d.parse(b'\xff\x00\x00')
255
>>> d.build(255)
b'\xff'
class malstruct.transforms.RestreamData(datafunc, subcon)

Bases: Subconstruct

Parses a field on external data (but does not build).

Parsing defers to subcon, but provides it a separate BytesIO stream based on data provided by datafunc (a bytes literal, another BytesIO stream, a Construct instance that returns bytes, or a context lambda). Building does nothing. Size is 0 because as far as other fields see it, this field does not produce or consume any bytes from the stream.

Parameters:
  • datafunc – bytes or BytesIO or Construct instance (that parses into bytes) or context lambda, provides data for subcon to parse from

  • subcon – Construct instance

Can propagate any exception from the lambdas, possibly non-ConstructError.

Example:

>>> d = RestreamData(b"\x01", Int8ub)
>>> d.parse(b"")
1
>>> d.build(0)
b''

>>> d = RestreamData(NullTerminated(GreedyBytes), Int16ub)
>>> d.parse(b"\x01\x02\x00")
0x0102
>>> d = RestreamData(FixedSized(2, GreedyBytes), Int16ub)
>>> d.parse(b"\x01\x02\x00")
0x0102
class malstruct.transforms.Transformed(subcon, decodefunc, decodeamount, encodefunc, encodeamount)

Bases: Subconstruct

Transforms bytes between the underlying stream and the (fixed-sized) subcon.

Parsing reads a specified amount (or till EOF), processes the data using a bytes-to-bytes decoding function, then parses subcon using that data. Building builds subcon into separate bytes, processes them using a bytes-to-bytes encoding function, then writes the result into the main stream. Size is reported as decodeamount or encodeamount if those are equal, otherwise SizeofError is raised.

Used internally to implement Bitwise, Bytewise, ByteSwapped, and BitsSwapped.

Possible use-cases include encryption, obfuscation, byte-level encoding.

Warning

Remember that subcon must consume (or produce) exactly as many bytes as decodeamount (or encodeamount).

Warning

Do NOT use seeking/telling classes inside Transformed context.

Parameters:
  • subcon – Construct instance

  • decodefunc – bytes-to-bytes function, applied before parsing subcon

  • decodeamount – integer, amount of bytes to read

  • encodefunc – bytes-to-bytes function, applied after building subcon

  • encodeamount – integer, amount of bytes to write

Raises:
  • StreamError – requested reading negative amount, could not read enough bytes, requested writing different amount than actual data, or could not write all bytes

  • StreamError – subcon build and encoder transformed more or less than encodeamount bytes, if amount is specified

  • StringError – building from non-bytes value, perhaps unicode

Can propagate any exception from the lambdas, possibly non-ConstructError.

Example:

>>> d = Transformed(Bytes(16), bytes2bits, 2, bits2bytes, 2)
>>> d.parse(b"\x00\x00")
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

>>> d = Transformed(GreedyBytes, bytes2bits, None, bits2bytes, None)
>>> d.parse(b"\x00\x00")
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
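
The examples above rely on a bytes-to-bits transform, where each input byte becomes eight bytes holding 0 or 1. The sketch below shows that idea with the standard library. The names to_bits, from_bits, and transformed_parse are hypothetical stand-ins, and the most-significant-bit-first ordering is an assumption.

```python
import io


def to_bits(data: bytes) -> bytes:
    """Expand each byte into eight bytes holding 0 or 1, most significant bit first."""
    return bytes((byte >> shift) & 1 for byte in data for shift in range(7, -1, -1))


def from_bits(bits: bytes) -> bytes:
    """Inverse of to_bits; the input length must be a multiple of 8."""
    if len(bits) % 8:
        raise ValueError("bit string length must be a multiple of 8")
    out = bytearray()
    for start in range(0, len(bits), 8):
        value = 0
        for bit in bits[start:start + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def transformed_parse(stream, decodefunc, decodeamount, parse_subcon):
    """Read a fixed amount (or everything when None), decode it, then parse it."""
    raw = stream.read() if decodeamount is None else stream.read(decodeamount)
    if decodeamount is not None and len(raw) != decodeamount:
        raise EOFError("could not read enough bytes")
    return parse_subcon(decodefunc(raw))


# The identity function stands in for a subcon such as Bytes(16).
result = transformed_parse(io.BytesIO(b"\x00\x00"), to_bits, 2, lambda data: data)
```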
class malstruct.transforms.Restreamed(subcon, decoder, decoderunit, encoder, encoderunit, sizecomputer)

Bases: Subconstruct

Transforms bytes between the underlying stream and the (variable-sized) subcon.

Used internally to implement Bitwise, Bytewise, ByteSwapped, and BitsSwapped.

Warning

Remember that subcon must consume or produce an amount of bytes that is a multiple of encoding or decoding units. For example, in a Bitwise context you should process a multiple of 8 bits or the stream will fail during parsing/building.

Warning

Do NOT use seeking/telling classes inside Restreamed context.

Parameters:
  • subcon – Construct instance

  • decoder – bytes-to-bytes function, used on data chunks when parsing

  • decoderunit – integer, decoder takes chunks of this size

  • encoder – bytes-to-bytes function, used on data chunks when building

  • encoderunit – integer, encoder takes chunks of this size

  • sizecomputer – function that computes the number of bytes output

Can propagate any exception from the lambda, possibly non-ConstructError. Can also raise arbitrary exceptions in RestreamedBytesIO implementation.

Example:

Bitwise  <--> Restreamed(subcon, bits2bytes, 8, bytes2bits, 1, lambda n: n//8)
Bytewise <--> Restreamed(subcon, bytes2bits, 1, bits2bytes, 8, lambda n: n*8)
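
Unlike a fixed-size transform, a restreaming wrapper decodes the underlying stream lazily, one unit at a time, so a variable-sized subcon only consumes what it needs. A minimal sketch of the reading side follows. The class ChunkDecodingReader and the helper byte_to_bits are hypothetical and do not reflect the actual RestreamedBytesIO implementation.

```python
import io


def byte_to_bits(chunk: bytes) -> bytes:
    """Decoder for one unit: expand each byte into eight 0/1 bytes, MSB first."""
    return bytes((byte >> shift) & 1 for byte in chunk for shift in range(7, -1, -1))


class ChunkDecodingReader:
    """Hypothetical reader that decodes the underlying stream one unit at a time."""

    def __init__(self, stream, decoder, decoderunit):
        self.stream = stream
        self.decoder = decoder
        self.decoderunit = decoderunit
        self.pending = b""  # decoded bytes not yet handed out

    def read(self, count: int) -> bytes:
        # Pull only as many units as are needed to satisfy this read.
        while len(self.pending) < count:
            chunk = self.stream.read(self.decoderunit)
            if len(chunk) < self.decoderunit:
                raise EOFError("underlying stream ended before a full unit was read")
            self.pending += self.decoder(chunk)
        out, self.pending = self.pending[:count], self.pending[count:]
        return out


raw = io.BytesIO(b"\xa5\xff")
reader = ChunkDecodingReader(raw, byte_to_bits, 1)
first_nibble = reader.read(4)  # decodes only the first byte of raw
consumed = raw.tell()
```

Leftover decoded data stays in the pending buffer, which is why a subcon inside such a context should end on a unit boundary.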
class malstruct.transforms.ProcessXor(padfunc, subcon, absolute=False)

Bases: Subconstruct

Transforms bytes between the underlying stream and the subcon.

Used internally by KaitaiStruct compiler, when translating process: xor tags.

Parsing reads till EOF, xors data with the pad, then feeds that data into subcon. Building first builds the subcon into separate BytesIO stream, xors data with the pad, then writes that data into the main stream. Size is the same as subcon, unless it raises SizeofError.

Parameters:
  • padfunc – integer or bytes or context lambda, single or multiple bytes to xor data with

  • subcon – Construct instance

  • absolute – Seek relative to the start of the stream rather than relative to the last occurrence of a subconstruct

Raises:

StringError – pad is not integer or bytes

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = ProcessXor(0xf0 or b'\xf0', Int16ub)
>>> d.parse(b"\x00\xff")
0xf00f
>>> d.sizeof()
2
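
The XOR step itself can be sketched in a few lines. This is a stdlib-only illustration, assuming a multi-byte pad is repeated cyclically over the data; the function name xor_bytes is hypothetical.

```python
from itertools import cycle


def xor_bytes(data: bytes, pad) -> bytes:
    """XOR `data` with an integer (0..255) or a bytes pad repeated cyclically."""
    if isinstance(pad, int):
        if not 0 <= pad <= 255:
            raise ValueError("integer pad must fit in one byte")
        pad = bytes([pad])
    if not isinstance(pad, bytes) or not pad:
        raise TypeError("pad must be an integer or a non-empty bytes object")
    return bytes(byte ^ key for byte, key in zip(data, cycle(pad)))


decoded = xor_bytes(b"\x00\xff", 0xF0)
# XOR is its own inverse, so the same call undoes the transform.
restored = xor_bytes(decoded, 0xF0)
```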
class malstruct.transforms.ProcessRotateLeft(amount, group, subcon)

Bases: Subconstruct

Transforms bytes between the underlying stream and the subcon.

Used internally by KaitaiStruct compiler, when translating process: rol/ror tags.

Parsing reads till EOF, rotates (shifts) the data left by amount in bits, then feeds that data into subcon. Building first builds the subcon into separate BytesIO stream, rotates right by negating amount, then writes that data into the main stream. Size is the same as subcon, unless it raises SizeofError.

Parameters:
  • amount – integer or context lambda, shift by this amount in bits, treated modulo (group x 8)

  • group – integer or context lambda, shifting is applied to chunks of this size in bytes

  • subcon – Construct instance

Can propagate any exception from the lambda, possibly non-ConstructError.

Example:

>>> d = ProcessRotateLeft(4, 1, Int16ub)
>>> d.parse(b'\x0f\xf0')
0xf00f
>>> d = ProcessRotateLeft(4, 2, Int16ub)
>>> d.parse(b'\x0f\xf0')
0xff00
>>> d.sizeof()
2
precomputed_single_rotations = {1: [0, 2, 4, ..., 251, 253, 255], 2: [...], ..., 7: [0, 128, 1, 129, ..., 254, 127, 255]}

(Seven lookup tables of 256 entries each, keyed by rotation amount in bits; entry i of table n is the byte value i rotated left by n bits.)
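
The group-wise rotation can be sketched with integer arithmetic. This is a minimal illustration, assuming each group is treated as one big-endian integer, which reproduces the two example results above; the function name rotate_left is hypothetical and no lookup tables are used.

```python
def rotate_left(data: bytes, amount: int, group: int) -> bytes:
    """Rotate every `group`-byte chunk of `data` left by `amount` bits."""
    if group < 1:
        raise ValueError("group must be at least 1")
    if len(data) % group:
        raise ValueError("data length must be a multiple of group")
    bits = group * 8
    amount %= bits  # treated modulo (group x 8); negative amounts rotate right
    mask = (1 << bits) - 1
    out = bytearray()
    for start in range(0, len(data), group):
        value = int.from_bytes(data[start:start + group], "big")
        rotated = ((value << amount) | (value >> (bits - amount))) & mask
        out += rotated.to_bytes(group, "big")
    return bytes(out)


per_byte = rotate_left(b"\x0f\xf0", 4, 1)  # each byte rotated on its own
per_word = rotate_left(b"\x0f\xf0", 4, 2)  # both bytes rotated as one 16-bit unit
```

Rotating by the negated amount restores the original data, which matches how building is described above.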
class malstruct.transforms.Checksum(checksumfield, hashfunc, bytesfunc)

Bases: Construct

Field that is built or validated by a hash of a given byte range. Usually used with RawCopy.

Parsing compares the parsed checksumfield subcon with a context entry provided by bytesfunc and transformed by hashfunc. Building fetches the context entry, transforms it, then writes it using subcon. Size is the same as subcon.

Parameters:
  • checksumfield – a subcon field that reads the checksum, usually Bytes(int)

  • hashfunc – function that takes bytes and returns whatever checksumfield takes when building, usually from hashlib module

  • bytesfunc – context lambda that returns bytes (or object) to be hashed, usually like this.rawcopy1.data

Raises:

ChecksumError – parsed checksum does not match the checksum computed from the actual data

Can propagate any exception from the lambdas, possibly non-ConstructError.

Example:

import hashlib
d = Struct(
    "fields" / RawCopy(Struct(
        Padding(1000),
    )),
    "checksum" / Checksum(Bytes(64),
        lambda data: hashlib.sha512(data).digest(),
        this.fields.data),
)
d.build(dict(fields=dict(value={})))

import hashlib
d = Struct(
    "offset" / Tell,
    "checksum" / Padding(64),
    "fields" / RawCopy(Struct(
        Padding(1000),
    )),
    "checksum" / Pointer(this.offset, Checksum(Bytes(64),
        lambda data: hashlib.sha512(data).digest(),
        this.fields.data)),
)
d.build(dict(fields=dict(value={})))
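
The build and validate steps can be sketched without the library. This is a stdlib-only illustration of a payload followed by its SHA-512 digest; the function names are hypothetical and ValueError stands in for ChecksumError.

```python
import hashlib

DIGEST_SIZE = hashlib.sha512().digest_size  # 64 bytes, matching Bytes(64) above


def build_with_checksum(payload: bytes) -> bytes:
    """Append the SHA-512 digest of the payload."""
    return payload + hashlib.sha512(payload).digest()


def parse_with_checksum(blob: bytes) -> bytes:
    """Split off the stored digest and compare it with a freshly computed one."""
    payload, stored = blob[:-DIGEST_SIZE], blob[-DIGEST_SIZE:]
    if hashlib.sha512(payload).digest() != stored:
        raise ValueError("checksum does not match data")
    return payload


blob = build_with_checksum(bytes(1000))
payload = parse_with_checksum(blob)
```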
class malstruct.transforms.Compressed(subcon, lib, wrap_exception=True, encode_args={}, decode_args={})

Bases: Adapter

Compresses and decompresses underlying stream when processing subcon. When parsing, entire stream is consumed. When building, it puts compressed bytes without marking the end. This construct should be used with Prefixed.

Parsing and building transform all bytes using a specified codec. Since data is processed until EOF, it behaves similarly to GreedyBytes. Size is undefined.

Compared to the original implementation, this version:
  • supports providing a custom encoding module or object.

  • (provide any object that has a “decompress” and “compress” function in the lib parameter.)

  • produces a ConstructError if compression/decompression fails.
    • (You can turn this off by setting wrap_exception=False)

  • uses Adapter instead of Tunnel in order to allow it to be embedded within other constructs.
    • (The original one read the entire stream, no matter the subcon you provide.)

Parameters:
  • subcon – Construct instance, subcon used for storing the value

  • lib – string, any of module names like zlib/gzip/bzip2/lzma, otherwise any of codecs module bytes<->bytes encodings, or any object that has compress and decompress functions; each codec usually requires some Python version

  • level – optional, integer between 0..9, although lzma discards it, some encoders allow different compression levels

Raises:
  • ImportError – needed module could not be imported by ctor

  • StreamError – stream failed when reading until EOF

Example:

>>> d = Prefixed(VarInt, Compressed(GreedyBytes, "zlib"))
>>> d.build(bytes(100))
b'\x0cx\x9cc`\xa0=\x00\x00\x00d\x00\x01'
>>> len(_)
13
wrap_exception
lib
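
Because compressed data carries no end marker of its own in this scheme, a length prefix tells the parser where it stops. The sketch below shows that framing with the standard library, assuming a LEB128-style variable-length integer as the prefix and any object with compress and decompress functions as the codec. All function names are hypothetical.

```python
import bz2
import zlib


def encode_varint(number: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, low bits first."""
    if number < 0:
        raise ValueError("varint cannot encode negative numbers")
    out = bytearray()
    while number >= 0x80:
        out.append((number & 0x7F) | 0x80)  # high bit marks continuation
        number >>= 7
    out.append(number)
    return bytes(out)


def decode_varint(data: bytes):
    """Return (value, number of bytes consumed)."""
    value = shift = 0
    for offset, byte in enumerate(data):
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, offset + 1
        shift += 7
    raise ValueError("truncated varint")


def build_prefixed(payload: bytes, codec=zlib) -> bytes:
    body = codec.compress(payload)
    return encode_varint(len(body)) + body


def parse_prefixed(data: bytes, codec=zlib):
    """Return (decompressed payload, bytes left over after the frame)."""
    length, offset = decode_varint(data)
    body = data[offset:offset + length]
    return codec.decompress(body), data[offset + length:]


framed = build_prefixed(bytes(100)) + b"tail"
payload, rest = parse_prefixed(framed)
```

Passing bz2 as the codec works the same way, since that module also exposes compress and decompress.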
class malstruct.transforms.ZLIB(subcon, wbits=None, bufsize=None, level=None)

Bases: Adapter

Adapter that zlib-compresses or decompresses a data buffer.

Parameters:
  • subcon – The construct to wrap

  • level (int) – The zlib compression level

  • wbits (int) – The zlib decompression window size

  • bufsize (int) – The initial output buffer size

>>> ZLIB(Bytes(12)).build(b'data')
b'x\x9cKI,I\x04\x00\x04\x00\x01\x9b'
>>> ZLIB(GreedyBytes, level=0).build(b'data')
b'x\x01\x01\x04\x00\xfb\xffdata\x04\x00\x01\x9b'
>>> ZLIB(GreedyBytes).parse(b'x^KI,I\x04\x00\x04\x00\x01\x9b')
b'data'
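
The level, wbits, and bufsize parameters correspond to arguments of the standard zlib module. The sketch below shows what they control, assuming they are forwarded more or less directly; the wrapper names zlib_build and zlib_parse are hypothetical.

```python
import zlib


def zlib_build(data: bytes, level=None) -> bytes:
    """Compress with the default level unless one is given."""
    return zlib.compress(data) if level is None else zlib.compress(data, level)


def zlib_parse(data: bytes, wbits=None, bufsize=None) -> bytes:
    """Decompress, forwarding only the options that were supplied."""
    kwargs = {}
    if wbits is not None:
        kwargs["wbits"] = wbits
    if bufsize is not None:
        kwargs["bufsize"] = bufsize
    return zlib.decompress(data, **kwargs)


stored = zlib_build(b"data", level=0)  # level 0 keeps the input bytes verbatim
from_doc = zlib_parse(b"x^KI,I\x04\x00\x04\x00\x01\x9b")

# A negative wbits value selects a raw deflate stream without the zlib header.
compressor = zlib.compressobj(wbits=-15)
raw_deflate = compressor.compress(b"data") + compressor.flush()
from_raw = zlib_parse(raw_deflate, wbits=-15)
```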
class malstruct.transforms.CompressedLZ4(subcon)

Bases: Tunnel

Compresses and decompresses underlying stream before processing subcon. When parsing, entire stream is consumed. When building, it puts compressed bytes without marking the end. This construct should be used with Prefixed.

Parsing and building transforms all bytes using LZ4 library. Since data is processed until EOF, it behaves similar to GreedyBytes. Size is undefined.

Parameters:

subcon – Construct instance, subcon used for storing the value

Raises:
  • ImportError – needed module could not be imported by ctor

  • StreamError – stream failed when reading until EOF

Can propagate lz4.frame exceptions.

Example:

>>> d = Prefixed(VarInt, CompressedLZ4(GreedyBytes))
>>> d.build(bytes(100))
b'"\x04"M\x18h@d\x00\x00\x00\x00\x00\x00\x00#\x0b\x00\x00\x00\x1f\x00\x01\x00KP\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> len(_)
35
class malstruct.transforms.EncryptedSym(subcon, cipher)

Bases: Tunnel

Perform symmetrical encryption and decryption of the underlying stream before processing subcon. When parsing, entire stream is consumed. When building, it puts encrypted bytes without marking the end.

Parsing and building transforms all bytes using the selected cipher. Since data is processed until EOF, it behaves similar to GreedyBytes. Size is undefined.

The key for encryption and decryption should be passed via contextkw to build and parse methods.

This construct is heavily based on the cryptography library, which supports the following algorithms and modes. For more details please see the documentation of that library.

Algorithms: AES, Camellia, ChaCha20, TripleDES, CAST5, SEED, SM4, Blowfish (weak cipher), ARC4 (weak cipher), IDEA (weak cipher)

Modes: CBC, CTR, OFB, CFB, CFB8, XTS, ECB (insecure)

Note

Keep in mind that some of the algorithms require padding of the data. This can be done e.g. with Aligned.

Note

For GCM mode use EncryptedSymAead.

Parameters:
  • subcon – Construct instance, subcon used for storing the value

  • cipher – Cipher object or context lambda from cryptography.hazmat.primitives.ciphers

Raises:
  • ImportError – needed module could not be imported

  • StreamError – stream failed when reading until EOF

  • CipherError – no cipher object is provided

  • CipherError – an AEAD cipher is used

Can propagate cryptography.exceptions exceptions.

Example:

>>> import os
>>> from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
>>> d = Struct(
...     "iv" / Default(Bytes(16), os.urandom(16)),
...     "enc_data" / EncryptedSym(
...         Aligned(16,
...             Struct(
...                 "width" / Int16ul,
...                 "height" / Int16ul,
...             )
...         ),
...         lambda ctx: Cipher(algorithms.AES(ctx._.key), modes.CBC(ctx.iv))
...     )
... )
>>> key128 = b"\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f"
>>> d.build({"enc_data": {"width": 5, "height": 4}}, key=key128)
b"o\x11i\x98~H\xc9\x1c\x17\x83\xf6|U:\x1a\x86+\x00\x89\xf7\x8e\xc3L\x04\t\xca\x8a\xc8\xc2\xfb'\xc8"
>>> d.parse(b"o\x11i\x98~H\xc9\x1c\x17\x83\xf6|U:\x1a\x86+\x00\x89\xf7\x8e\xc3L\x04\t\xca\x8a\xc8\xc2\xfb'\xc8", key=key128)
Container:
    iv = b'o\x11i\x98~H\xc9\x1c\x17\x83\xf6|U:\x1a\x86' (total 16)
    enc_data = Container:
        width = 5
        height = 4
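
The Aligned(16, ...) wrapper in the example exists because AES in CBC mode works on whole 16-byte blocks, while the two Int16ul fields occupy only 4 bytes. The sketch below shows that padding arithmetic with the standard library, assuming zero bytes are used as filler; the function name align is hypothetical.

```python
import struct


def align(data: bytes, modulus: int = 16, pad: bytes = b"\x00") -> bytes:
    """Pad `data` on the right up to the next multiple of `modulus`."""
    remainder = len(data) % modulus
    if remainder:
        data += pad * (modulus - remainder)
    return data


fields = struct.pack("<HH", 5, 4)  # width and height as little-endian 16-bit integers
block = align(fields)              # 4 bytes of fields followed by 12 bytes of padding
```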
class malstruct.transforms.EncryptedSymAead(subcon, cipher, nonce, associated_data=b'')

Bases: Tunnel

Perform symmetrical AEAD encryption and decryption of the underlying stream before processing subcon. When parsing, entire stream is consumed. When building, it puts encrypted bytes and tag without marking the end.

Parsing and building transforms all bytes using the selected cipher and also authenticates the associated_data. Since data is processed until EOF, it behaves similar to GreedyBytes. Size is undefined.

The key for encryption and decryption should be passed via contextkw to build and parse methods.

This construct is heavily based on the cryptography library, which supports the following AEAD ciphers. For more details please see the documentation of that library.

AEAD ciphers: AESGCM, AESCCM, ChaCha20Poly1305

Parameters:
  • subcon – Construct instance, subcon used for storing the value

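  • cipher – Cipher object or context lambda from cryptography.hazmat.primitives.ciphers

  • nonce – bytes or context lambda, nonce passed to the cipher for encryption and decryption

  • associated_data – optional, bytes or context lambda, data that is authenticated but not encrypted, default is b''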

Raises:
  • ImportError – needed module could not be imported

  • StreamError – stream failed when reading until EOF

  • CipherError – unsupported cipher object is provided

Can propagate cryptography.exceptions exceptions.

Example:

>>> import os
>>> from cryptography.hazmat.primitives.ciphers import aead
>>> d = Struct(
...     "nonce" / Default(Bytes(16), os.urandom(16)),
...     "associated_data" / Bytes(21),
...     "enc_data" / EncryptedSymAead(
...         GreedyBytes,
...         lambda ctx: aead.AESGCM(ctx._.key),
...         this.nonce,
...         this.associated_data
...     )
... )
>>> key128 = b"\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f"
>>> d.build({"associated_data": b"This is authenticated", "enc_data": b"The secret message"}, key=key128)
b'\xe3\xb0"\xbaQ\x18\xd3|\x14\xb0q\x11\xb5XZ\xeeThis is authenticated\x88~\xe5Vh\x00\x01m\xacn\xad k\x02\x13\xf4\xb4[\xbe\x12$\xa0\x7f\xfb\xbf\x82Ar\xb0\x97C\x0b\xe3\x85'
>>> d.parse(b'\xe3\xb0"\xbaQ\x18\xd3|\x14\xb0q\x11\xb5XZ\xeeThis is authenticated\x88~\xe5Vh\x00\x01m\xacn\xad k\x02\x13\xf4\xb4[\xbe\x12$\xa0\x7f\xfb\xbf\x82Ar\xb0\x97C\x0b\xe3\x85', key=key128)
Container:
    nonce = b'\xe3\xb0"\xbaQ\x18\xd3|\x14\xb0q\x11\xb5XZ\xee' (total 16)
    associated_data = b'This is authenti'... (truncated, total 21)
    enc_data = b'The secret messa'... (truncated, total 18)
class malstruct.transforms.Rebuffered(subcon, tailcutoff=None)

Bases: Subconstruct

Caches bytes from underlying stream, so it becomes seekable and tellable, and also becomes blocking on reading. Useful for processing non-file streams like pipes, sockets, etc.

Warning

Experimental implementation. May not be mature enough.

Parameters:
  • subcon – Construct instance, subcon which will operate on the buffered stream

  • tailcutoff – optional, integer, amount of bytes kept in buffer, by default buffers everything

Can also raise arbitrary exceptions in its implementation.

Example:

Rebuffered(..., tailcutoff=1024).parse_stream(nonseekable_stream)
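
The caching idea can be sketched as a small wrapper that remembers everything it has read, so earlier positions can be revisited. This is a simplified illustration, not the library's implementation: the class names are hypothetical, only absolute seeks are handled, and the tailcutoff trimming is left out.

```python
import io


class ForwardOnly:
    """Stand-in for a pipe or socket: it can be read but not seeked."""

    def __init__(self, data: bytes):
        self._inner = io.BytesIO(data)

    def read(self, count: int) -> bytes:
        return self._inner.read(count)


class CachingReader:
    """Makes a forward-only source seekable by caching what was read."""

    def __init__(self, source):
        self.source = source
        self.cache = b""
        self.offset = 0

    def read(self, count: int) -> bytes:
        end = self.offset + count
        missing = end - len(self.cache)
        if missing > 0:
            # Fetch only the part that is not cached yet.
            self.cache += self.source.read(missing)
        out = self.cache[self.offset:end]
        self.offset += len(out)
        return out

    def seek(self, position: int) -> int:
        self.offset = position  # absolute positions only in this sketch
        return position

    def tell(self) -> int:
        return self.offset


reader = CachingReader(ForwardOnly(b"abcdef"))
head = reader.read(3)
reader.seek(1)
again = reader.read(4)  # served partly from the cache, partly from the source
```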

Module contents