array(buffer=None, formats=None, shape=0, names=None, byteorder=sys.byteorder)
array is, for most practical purposes, all a user needs to know to construct a record array.
formats is a string containing the format information of all fields.
Each format can be a letter code, such as i2, or a longer name, such as
Int16. For a list of letter codes and longer names, see Table 4.1 or use the
letterCode() function. A field of strings is specified by the letter
a, followed by an integer giving the maximum length; thus
a5 is the format for a field of strings of (maximum) length 5.
The formats are separated by commas, and each cell
(element in a field) can be a numarray itself, by attaching a number or a
tuple in front of the format specification. So if
formats='i4,Float64,a5,3i2,(2,3)f4,Complex64,b1', the record array will have:
1st field: (4-byte) integers
2nd field: double precision floating point numbers
3rd field: strings of length 5
4th field: short (2-byte) integers, each element is an array of shape=(3,)
5th field: single precision floating point numbers, each element is an array of shape=(2,3)
6th field: double precision complex numbers
7th field: (1-byte) Booleans
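The way such a formats string breaks down into fields can be sketched in plain Python. This is only an illustration of the syntax described above, not numarray's own parser; the helper name split_formats and the regular expression are assumptions made for this example.

```python
import ast
import re

# One format token: an optional repeat count or shape tuple, then a type name.
# This pattern is an illustrative assumption, not numarray's own parser.
_TOKEN = re.compile(r"(\([\d,\s]*\)|\d+)?\s*([A-Za-z]\w*)")


def split_formats(formats):
    """Return a list of (cell_shape, type_name) pairs for a formats string."""
    fields = []
    for prefix, type_name in _TOKEN.findall(formats):
        if not prefix:
            cell_shape = ()  # plain scalar cell
        else:
            value = ast.literal_eval(prefix)  # "3" -> 3, "(2,3)" -> (2, 3)
            cell_shape = value if isinstance(value, tuple) else (value,)
        fields.append((cell_shape, type_name))
    return fields


fields = split_formats("i4,Float64,a5,3i2,(2,3)f4,Complex64,b1")
for number, (cell_shape, type_name) in enumerate(fields, start=1):
    print(number, type_name, cell_shape)
```

The fourth and fifth entries come out with cell shapes (3,) and (2, 3), matching the field list above; every other field has an empty cell shape.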
The formats specification takes precedence over the data. For example, if a field is specified as integers in
buffer, but is specified as floats in
formats, it will be floats in the record array. If a field in the
buffer is not convertible to the corresponding data type in the
formats specification, e.g. from strings to numbers (integers, floats, Booleans) or vice versa, an exception will be raised.
shape is the shape of the record array. It can be an integer,
in which case it is equivalent to the number of rows in a table.
It can also be a tuple, in which case the record array is an N-D array with
Records as its elements.
shape must be consistent with the
buffer for buffer types (5) and (6), explained below.
names is a string containing the names of the fields, separated by
commas. If there are more formats specified than names, then default
names will be used. If there are five fields specified in formats and
names=None (default), then the field names will be:
c1, c2, c3, c4, c5. If
names="a,b", then the field
names will be:
a, b, c3, c4, c5.
If more names have been specified than there are formats, the extra names
will be discarded. If duplicate names are specified, an exception
will be raised. Field names are case sensitive: a column will
not be found if it is referred to by a name that differs only in letter case
(for example) when looking up a field by name.
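The default-name rule can be sketched in a few lines of plain Python. The helper resolve_names is hypothetical and only mirrors the padding and trimming behaviour described above; it does not check for duplicate names.

```python
def resolve_names(names, nfields):
    """Pad or trim a comma-separated names string to nfields entries."""
    given = names.split(",") if names else []
    given = given[:nfields]  # extra names are discarded
    # Missing positions get the default name c<N>, counted from 1.
    defaults = ["c%d" % (i + 1) for i in range(len(given), nfields)]
    return given + defaults


print(resolve_names(None, 5))     # ['c1', 'c2', 'c3', 'c4', 'c5']
print(resolve_names("a,b", 5))    # ['a', 'b', 'c3', 'c4', 'c5']
print(resolve_names("a,b,c", 2))  # ['a', 'b']
```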
byteorder is a string with the value 'big' or 'little',
referring to big endian or little endian. This is useful when reading
(binary) data from a string or a file. If not specified, it will use the
sys.byteorder value and the result will be platform dependent for
string or file input.
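To see why the byte order matters for binary input, the same nine bytes can be decoded both ways with Python's standard struct module. This is a stand-in illustration rather than numarray code: the struct format h3si is assumed here to correspond to a record laid out as 'i2,a3,i4'.

```python
import struct

data = b"abcdefg" * 100
record = data[:9]  # one record of 'i2,a3,i4' is 2 + 3 + 4 = 9 bytes

# '>' and '<' select big and little endian, with no padding between items.
big = struct.unpack(">h3si", record)
little = struct.unpack("<h3si", record)

print(big)     # (24930, b'cde', 1718051170)
print(little)  # (25185, b'cde', 1650550630)
```

The string field is unaffected by the byte order; only the multi-byte integers change.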
The first argument,
buffer, may be any one of the following:
(1) None (default). The data block in the record array will not be
initialized. The user must assign valid data before trying to read the
contents or before writing the record array to a disk file.
(2) a Python string containing binary data. For example:
>>> r=rec.array('abcdefg'*100, formats='i2,a3,i4', shape=3, byteorder='big')
>>> print r
RecArray[
(24930, 'cde', 1718051170),
(25444, 'efg', 1633837924),
(25958, 'gab', 1667523942)
]
(3) a Python file object for an open file. The data will be copied from
the file, starting at the current position of the read pointer, with
byte order as specified in byteorder.
(4) a record array. This results in a deep copy of the input record array;
any other arguments to
array() will be silently ignored.
(5) a list of numarrays. There must be one such numarray for each field.
The formats and shape arguments to
array() are not
required, but if they are specified, they need to be consistent with the
input arrays. The shapes of all the input numarrays also need to be
consistent with one another.
# this will have 3 rows, each cell in the 2nd field is an array of 4 elements
# note that the formats specification needs to reflect the data shape
>>> arr1=numarray.arange(3)
>>> arr2=numarray.arange(12,shape=(3,4))
>>> r=rec.array([arr1, arr2],formats='i2,4f4')
In this example,
arr2 is cast up to float.
(6) a list of sequences. Each sequence contains the number(s)/string(s) of a record. The example in the introduction uses such input, sometimes called longhand input. The data types are automatically determined after comparing all input data. Data of the same field will be cast to the highest type:
# the first field uses the highest data type: Float64
>>> r=rec.array([[1,'abc'],(3.5, 'xx')]); print r
RecArray[
(1.0, 'abc'),
(3.5, 'xx')
]

# overrule the first field to short integers, second field to shorter strings
>>> r=rec.array([[1,'abc'],(3.5, 'xx')],formats='i2,a1'); print r
RecArray[
(1, 'a'),
(3, 'x')
]

>>> r=rec.array([[1,'abc'],('a', 'xx')])
ValueError: inconsistent data at row 1,field 0
A record array with multi-dimensional numarray cells in a field can also be constructed by using nested sequences:
>>> r=rec.array([[(11,12,13),'abc'],[(2,3,4), 'xx']]); print r
RecArray[
(array([11, 12, 13]), 'abc'),
(array([2, 3, 4]), 'xx')
]
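The type resolution described for list-of-sequences input in item (6) can be approximated in plain Python. The sketch below is a simplified illustration, not numarray's actual algorithm: the helper infer_field_kinds is hypothetical, it handles only scalar int, float and string cells, and it reuses the wording of the error message shown above.

```python
def infer_field_kinds(rows):
    """Guess one kind per field: 'int', 'float', or ('str', max_length)."""
    kinds = []
    for col, first in enumerate(rows[0]):
        is_text = isinstance(first, str)
        kind = ("str", 0) if is_text else "int"
        for r, row in enumerate(rows):
            value = row[col]
            # Strings and numbers may not be mixed within one field.
            if isinstance(value, str) != is_text:
                raise ValueError(
                    "inconsistent data at row %d,field %d" % (r, col))
            if is_text:
                kind = ("str", max(kind[1], len(value)))
            elif isinstance(value, float):
                kind = "float"  # any float promotes the whole field
        kinds.append(kind)
    return kinds


print(infer_field_kinds([[1, "abc"], (3.5, "xx")]))  # ['float', ('str', 3)]
```

Passing [[1, 'abc'], ('a', 'xx')] to this helper raises a ValueError for row 1, field 0, in line with the failing example in item (6).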