numarray Manual

8. Array Functions

Most of the useful manipulations on arrays are done with functions. This might be surprising given Python's object-oriented framework, since many of these functions could have been implemented as methods instead. Choosing functions means that the same procedures can be applied to arbitrary Python sequences, not just to arrays. For example, while transpose([[1,2],[3,4]]) works just fine, [[1,2],[3,4]].transpose() does not. This approach also allows a uniform interface between functions defined in the numarray Python system, whether implemented in C or in Python, and functions defined in extension modules. We've already covered two functions which operate on arrays: reshape and resize.

take( array, indices, axis=0, clipmode=CLIP)

The function take is a generalized indexing/slicing of the array. In its simplest form, it is equivalent to indexing:

>>> a1 = array([10,20,30,40])
>>> print a1[[1,3]]
[20 40]
>>> print take(a1,[1,3])
[20 40]

See the description of index arrays in the Array Basics section for an alternative to take and put. take selects the elements of the array (the first argument) based on the indices (the second argument). Unlike slicing, however, the array returned by take has the same rank as the input array. Some 2-D examples:

>>> print a2
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]
>>> print take(a2, (0,))                 # first row
[[0 1 2 3 4]]
>>> print take(a2, (0,1))                # first and second row
[[0 1 2 3 4]
 [5 6 7 8 9]]
>>> print take(a2, (0, -1))              # index relative to dimension end
[[ 0  1  2  3  4]
 [15 16 17 18 19]]

The optional third argument specifies the axis along which the selection occurs, and the default value (as in examples above) is 0, the first axis. If you want a different axis, then you need to specify it:

>>> print take(a2, (0,), axis=1)         # first column
[[ 0]
 [ 5]
 [10]
 [15]]
>>> print take(a2, (0,1), axis=1)        # first and second column
[[ 0  1]
 [ 5  6]
 [10 11]
 [15 16]]
>>> print take(a2, (0,4), axis=1)        # first and last column
[[ 0  4]
 [ 5  9]
 [10 14]
 [15 19]]

This is considered to be a structural operation, because its result depends only on the structure of the array, not on the content of the array or on the result of a computation on that content. Like all such structural operations, the default axis is 0 (the first axis). Later in this tutorial, we will see other functions with a default axis of -1.

take is often used to create multidimensional arrays by indexing into a rank-1 array. As in the earlier examples, the shape of the array returned by take is a combination of the shape of its second argument (the indices) and the shape of the array that elements are "taken" from. When that array is rank-1, the returned array has the same shape as the index sequence. This, as with many other facets of numarray, is best understood by experiment.

>>> x = arange(10) * 100
>>> print x
[  0 100 200 300 400 500 600 700 800 900]
>>> print take(x, [[2,4],[1,2]])
[[200 400]
 [100 200]]
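
The shape rule for a rank-1 source can be checked without numarray. The following is a minimal pure-Python sketch on plain lists; take_rank1 is a hypothetical helper written for illustration, not a numarray function, and it ignores the axis and clipmode arguments.

```python
def take_rank1(values, indices):
    """Mimic take() for a rank-1 source: the result has the shape of indices."""
    if isinstance(indices, (list, tuple)):
        # Recurse into nested index sequences, preserving their nesting.
        return [take_rank1(values, i) for i in indices]
    # A single integer selects one element (negative values count from the end).
    return values[indices]

x = [i * 100 for i in range(10)]           # same values as arange(10) * 100
result = take_rank1(x, [[2, 4], [1, 2]])   # [[200, 400], [100, 200]]
```

The nesting of the index argument is carried over unchanged to the result, which is why a 2-D index sequence applied to a rank-1 source yields a 2-D result.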

A typical example of using take is to replace the grey values in an image according to a "translation table." For example, suppose m51 is a 2-D array of type UInt8 containing a greyscale image. We can create a table mapping the integers 0-255 to integers 0-255 using a "compressive nonlinearity":

>>> table = (255 - arange(256)**2 / 256).astype(UInt8)

Then we can perform the take() operation:

>>> m51b = take(table, m51)

The numarray version of take can also take a sequence as its axis value:

>>> print take(a2, [0,1], [0,1])
1
>>> print take(a2, [0,1], [1,0])
5

In this case, it functions like indexing: a2[0,1] and a2[1,0] respectively.

put( array, indices, values, axis=0, clipmode=CLIP)

put is the opposite of take. The values of array at the locations specified in indices are set to the corresponding values. The array must be a contiguous array. The indices can be any integer sequence object with values suitable for indexing into the flat form of array. The values must be any sequence of values that can be converted to the type of array.

>>> x = arange(6)
>>> put(x, [2,4], [20,40])
>>> print x
[ 0  1 20  3 40  5]

Note that the target array is not required to be one-dimensional. Since array is contiguous and stored in row-major order, the indices can be treated as indexing array's elements in storage order. The routine put is thus equivalent to the following (although the loop is in C for speed):

# a is the target array passed as the first argument
ind = array(indices, copy=0)
v = array(values, copy=0).astype(a.type())
for i in range(len(ind)): a.flat[ind[i]] = v[i]

putmask( array, mask, values)

putmask sets those elements of array for which mask is true to the corresponding value in values. The array array must be contiguous. The argument mask must be an integer sequence of the same size (but not necessarily the same shape) as array. The argument values will be repeated as necessary; in particular it can be a scalar. The array values must be convertible to the type of array.

>>> x=arange(5)
>>> putmask(x, [1,0,1,0,1], [10,20,30,40,50])
>>> print x
[10  1 30  3 50]
>>> putmask(x, [1,0,1,0,1], [-1,-2])
>>> print x
[-1  1 -1  3 -1]

Note how in the last example, the third argument was treated as if it were [-1, -2, -1, -2, -1].
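
The repetition of values can be reproduced with a short pure-Python sketch. It assumes, as the results above suggest, that the value used at position i is values[i % len(values)]; putmask_list is a hypothetical helper operating on plain lists, not part of numarray.

```python
def putmask_list(target, mask, values):
    """Set target[i] wherever mask[i] is true, cycling through values as needed."""
    if not isinstance(values, (list, tuple)):
        values = [values]              # a scalar is treated as a length-1 sequence
    for i, flag in enumerate(mask):
        if flag:
            # Position i receives values[i % len(values)], i.e. values repeated.
            target[i] = values[i % len(values)]

x = list(range(5))
putmask_list(x, [1, 0, 1, 0, 1], [10, 20, 30, 40, 50])
first = list(x)                        # [10, 1, 30, 3, 50]
putmask_list(x, [1, 0, 1, 0, 1], [-1, -2])
second = list(x)                       # [-1, 1, -1, 3, -1]
```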

transpose( array, axes=None)

transpose takes an array array and returns a new array which corresponds to array with its axes in the order given by the second argument axes, a sequence of dimension indices. The default is to reverse the order of all axes, i.e. axes=arange(array.rank)[::-1].

>>> a2=arange(6,shape=(2,3)); print a2
[[0 1 2]
 [3 4 5]]
>>> print transpose(a2)  # same as transpose(a2, axes=(1,0))
[[0 3]
 [1 4]
 [2 5]]
>>> a3=arange(24,shape=(2,3,4)); print a3
[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]
>>> print transpose(a3)  # same as transpose(a3, axes=(2,1,0))
[[[ 0 12]
  [ 4 16]
  [ 8 20]]

 [[ 1 13]
  [ 5 17]
  [ 9 21]]

 [[ 2 14]
  [ 6 18]
  [10 22]]

 [[ 3 15]
  [ 7 19]
  [11 23]]]
>>> print transpose(a3, axes=(1,0,2))
[[[ 0  1  2  3]
  [12 13 14 15]]

 [[ 4  5  6  7]
  [16 17 18 19]]

 [[ 8  9 10 11]
  [20 21 22 23]]]
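
The effect of axes on the shape of the result can be sketched in a few lines of plain Python. This only computes the resulting shape, under the rule that output axis i takes the length of input axis axes[i], which is consistent with the printed results above; transposed_shape is a hypothetical helper, not a numarray function.

```python
def transposed_shape(shape, axes=None):
    """Return the shape produced by permuting the axes of an array of the given shape."""
    if axes is None:
        # Default: reverse the order of all axes.
        axes = tuple(reversed(range(len(shape))))
    return tuple(shape[ax] for ax in axes)

shape_2d = transposed_shape((2, 3))                 # (3, 2)
shape_3d = transposed_shape((2, 3, 4))              # (4, 3, 2)
shape_swapped = transposed_shape((2, 3, 4), (1, 0, 2))   # (3, 2, 4)
```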

repeat( array, repeats, axis=0)

repeat takes an array array and returns an array with each element in the input array repeated as often as indicated by the corresponding elements in the second array. It operates along the specified axis. So, to stretch an array evenly, one needs the repeats array to contain as many instances of the integer scaling factor as the size of the specified axis:

>>> print a
[[0 1 2]
 [3 4 5]]
>>> print repeat(a, 2*ones(a.shape[0]))   # i.e. repeat(a, (2,2)), broadcast 
                   # rules apply, so this is also equivalent to repeat(a, 2)
[[0 1 2]
 [0 1 2]
 [3 4 5]
 [3 4 5]]
>>> print repeat(a, 2*ones(a.shape[1]), 1)  # i.e. repeat(a, (2,2,2), 1), or
                                            # repeat(a, 2, 1)
[[0 0 1 1 2 2]
 [3 3 4 4 5 5]]
>>> print repeat(a, (1,2))
[[0 1 2]
 [3 4 5]
 [3 4 5]]
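
For readers without numarray at hand, repetition along the first axis can be sketched on a list of rows. repeat_rows is a hypothetical helper for illustration; it handles axis 0 only and accepts either one count per row or a single integer applied to every row.

```python
def repeat_rows(rows, repeats):
    """Repeat each row of a list of lists along axis 0."""
    if isinstance(repeats, int):
        repeats = [repeats] * len(rows)    # a scalar count applies to every row
    out = []
    for row, count in zip(rows, repeats):
        # Append 'count' independent copies of this row.
        out.extend([list(row) for _ in range(count)])
    return out

a = [[0, 1, 2], [3, 4, 5]]
stretched = repeat_rows(a, 2)       # every row twice
uneven = repeat_rows(a, (1, 2))     # first row once, second row twice
```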

where( condition, x, y, out=None)

The where function creates an array whose values are those of x at those indices where condition is true, and those of y otherwise. The shape of the result is the shape of condition. The type of the result is determined by the types of x and y. Either x or y (or both) can be a scalar, which is then used for all appropriate elements of condition. out can be used to specify an output array.

>>> where(arange(10) >= 5, 1, 2)
array([2, 2, 2, 2, 2, 1, 1, 1, 1, 1])

Starting with numarray-0.6, where supports a one parameter form that is equivalent to the nonzero function but reads better:

>>> where(arange(10) % 2)
(array([1, 3, 5, 7, 9]),)   # indices where expression is true

Note that in this case, the output is a tuple.

Like nonzero, the one parameter form of where can be used to do array indexing:

>>> a = arange(10,20)
>>> a[ where( a % 2 ) ]
array([11, 13, 15, 17, 19])

Note that for array indices which are boolean arrays, using where is not necessary but is still OK:

>>> a[(a % 2) != 0]
array([11, 13, 15, 17, 19])
>>> a[where((a%2) != 0)]
array([11, 13, 15, 17, 19])

choose( selector, population, outarr=None, clipmode=RAISE)

The function choose provides a more general mechanism for selecting members of a population based on a selector array. Unlike where, choose can select values from more than two arrays. selector is an array of integers between 0 and n. The resulting array will have the same shape as selector, with each element selected from population=(b0, ..., bn) as indicated by the value of the corresponding element in selector. Assume a is an array that you want to "clip" so that no values are greater than 100.0.

>>> choose(greater(a, 100.0), (a, 100.0))

Everywhere that greater(a, 100.0) is false (i.e. 0) this will "choose" the corresponding value in a. Everywhere else it will "choose" 100.0. This works as well with arrays. Try to figure out what the following does:

>>> ret = choose(greater(a,b), (c,d))
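
The selection rule can be traced by hand with a pure-Python sketch on flat lists. choose_flat is a hypothetical helper, not part of numarray; it ignores outarr and clipmode, and the sample values are made up for illustration.

```python
def choose_flat(selector, population):
    """For each position i, pick element i from population[selector[i]]."""
    out = []
    for i, k in enumerate(selector):
        source = population[k]
        # A scalar in the population stands for that value at every position.
        out.append(source[i] if isinstance(source, (list, tuple)) else source)
    return out

# Clipping: the list comprehension plays the role of greater(a, 100.0).
a = [50.0, 150.0, 99.0, 101.0]
clipped = choose_flat([int(v > 100.0) for v in a], (a, 100.0))

# Two-array case: take from d where p > q, otherwise from c.
p, q = [1, 5, 3], [2, 4, 3]
c, d = [10, 20, 30], [-1, -2, -3]
ret = choose_flat([int(u > v) for u, v in zip(p, q)], (c, d))
```

Here clipped is [50.0, 100.0, 99.0, 100.0] and ret is [10, -2, 30]: a selector value of 0 picks from the first member of the population and 1 picks from the second.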

ravel( array)

Returns the argument array as a 1-D array. It is equivalent to reshape(a, (-1,)). There is a ravel method which reshapes the array in-place. Unlike a.ravel(), however, the ravel function works with non-contiguous arrays.

>>> a=arange(25)
>>> a.setshape(5,5)
>>> a.transpose()
>>> a.iscontiguous()
0
>>> a
array([[ 0,  5, 10, 15, 20],
       [ 1,  6, 11, 16, 21],
       [ 2,  7, 12, 17, 22],
       [ 3,  8, 13, 18, 23],
       [ 4,  9, 14, 19, 24]])
>>> a.ravel()
Traceback (most recent call last):
...
TypeError: Can't reshape non-contiguous arrays
>>> ravel(a)
array([ 0,  5, 10, 15, 20,  1,  6, 11, 16, 21,  2,  7, 12, 17, 22,  3,
        8, 13, 18, 23,  4,  9, 14, 19, 24])

nonzero( a)

nonzero returns a tuple of arrays containing the indices of the elements in a that are nonzero.

>>> a = array([-1, 0, 1, 2])
>>> nonzero(a)
(array([0, 2, 3]),)
>>> print a2
[[-1  0  1  2]
 [ 9  0  4  0]]
>>> print nonzero(a2)
(array([0, 0, 0, 1, 1]), array([0, 2, 3, 0, 2]))
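
The layout of the returned tuple (one index array per axis) can be reproduced for the 2-D case with plain lists. nonzero_2d is a hypothetical helper for illustration, not a numarray function.

```python
def nonzero_2d(rows):
    """Return (row_indices, column_indices) of the nonzero entries, in row-major order."""
    row_idx, col_idx = [], []
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            if value != 0:
                row_idx.append(r)
                col_idx.append(c)
    return row_idx, col_idx

a2 = [[-1, 0, 1, 2],
      [9, 0, 4, 0]]
rows_found, cols_found = nonzero_2d(a2)
```

Pairing the two lists element by element gives the coordinates (0,0), (0,2), (0,3), (1,0), (1,2) of the nonzero entries.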

compress( condition, a, axis=0)

Returns those elements of a corresponding to those elements of condition that are nonzero. condition must be the same size as the given axis of a. Alternately, condition may match a in shape; in this case the result is a 1D array and axis should not be specified.

>>> print x
[1 0 6 2 3 4]
>>> print greater(x, 2)
[0 0 1 0 1 1]
>>> print compress(greater(x, 2), x)
[6 3 4]
>>> print a2
[[-1  0  1  2]
 [ 9  0  4  0]]
>>> print compress(a2>1, a2)
[2 9 4]
>>> a = array([[1,2],[3,4]])
>>> print compress([1,0], a, axis = 1)
[[1]
 [3]]
>>> print compress([[1,0],[0,1]], a)
[1 4]

diagonal( a, offset=0, axis1=0, axis2=1)

Returns the entries along the diagonal of a specified by offset. The offset is relative to the axis2 axis. This is designed for 2-D arrays. For arrays of higher ranks, it will return the diagonal of each 2-D sub-array. The 2-D array does not have to be square.

Warning: in Numeric (and numarray 0.7 or before), there is a bug in the diagonal function which will give erroneous results for arrays of 3-D or higher.

>>> print x
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]
>>> print diagonal(x)
[ 0  6 12 18 24]
>>> print diagonal(x, 1)
[ 1  7 13 19]
>>> print diagonal(x, -1)
[ 5 11 17 23]

trace( a, offset=0, axis1=0, axis2=1)

Returns the sum of the elements in a along the diagonal specified by offset.

Warning: in Numeric (and numarray 0.7 or before), there is a bug in the trace function which will give erroneous results for arrays of 3-D or higher.

>>> print x
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]
>>> print trace(x)                      # 0 + 6 + 12 + 18 + 24
60
>>> print trace(x, -1)                  # 5 + 11 + 17 + 23
56
>>> print trace(x, 1)                   # 1 + 7 + 13 + 19
40

searchsorted( bin, values)

Called with a rank-1 array sorted in ascending order, searchsorted will return the indices of the positions in bin where the corresponding values would fit.

>>> print bin_boundaries
[ 0.   0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1. ]
>>> print data
[ 0.31  0.79  0.82  5.  -2.  -0.1 ]
>>> print searchsorted(bin_boundaries, data)
[4 8 9 11 0 0]

This can be used for example to write a simple histogramming function:

>>> def histogram(a, bins):
...         # Note that the argument names below are reverse of the 
...         # searchsorted argument names
...         n = searchsorted(sort(a), bins)
...         n = concatenate([n, [len(a)]])
...         return n[1:]-n[:-1]
...
>>> print histogram([0,0,0,0,0,0,0,.33,.33,.33], arange(0,1.0,.1))
[7 0 0 3 0 0 0 0 0 0]
>>> print histogram(sin(arange(0,10,.2)), arange(-1.2, 1.2, .1))
[0 0 4 2 2 2 0 2 1 2 1 3 1 3 1 3 2 3 2 3 4 9 0 0]
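
The same histogramming idea can be sketched with the standard library alone. This assumes searchsorted returns the leftmost insertion point, which is consistent with the first histogram result above; bisect_left is the standard-library counterpart, and histogram_sketch is a hypothetical name.

```python
from bisect import bisect_left

def histogram_sketch(data, bins):
    """Count how many values of data fall at or above each bin edge and below the next."""
    ordered = sorted(data)
    # For each bin edge, the number of data values strictly below it.
    edges = [bisect_left(ordered, b) for b in bins]
    edges.append(len(ordered))          # everything left over goes in the last bin
    return [hi - lo for lo, hi in zip(edges[:-1], edges[1:])]

bins = [i / 10.0 for i in range(10)]    # 0.0, 0.1, ..., 0.9
counts = histogram_sketch([0] * 7 + [0.33] * 3, bins)
```

Note that values below the first bin edge are not counted at all, while values above the last edge are counted in the last bin.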

sort( array, axis=-1)

This function returns an array containing a copy of the data in array, with the same shape as array, but with the elements sorted along the specified axis. Thus, sort(array, 3) will be an array of the same shape as array, where the elements of array have been sorted along the fourth axis.

>>> print data
[[5 0 1 9 8]
 [2 5 8 3 2]
 [8 0 3 7 0]
 [9 6 9 5 0]
 [9 0 9 7 7]]
>>> print sort(data)                    # Axis -1 by default
[[0 1 5 8 9]
 [2 2 3 5 8]
 [0 0 3 7 8]
 [0 5 6 9 9]
 [0 7 7 9 9]]
>>> print sort(data, 0)
[[2 0 1 3 0]
 [5 0 3 5 0]
 [8 0 8 7 2]
 [9 5 9 7 7]
 [9 6 9 9 8]]

argsort( array, axis=-1)

argsort will return the indices of the elements of the array needed to produce sort(array). In other words, for a 1-D array, take(a.flat, argsort(a)) is the same as sort(a)... but slower.

>>> print data
[5 0 1 9 8]
>>> print sort(data)
[0 1 5 8 9]
>>> print argsort(data)
[1 2 0 4 3]
>>> print take(data, argsort(data))
[0 1 5 8 9]
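
The relationship between argsort, take and sort can be checked on a plain list. argsort_sketch is a hypothetical pure-Python stand-in; how numarray orders equal elements is not specified here, so the sketch only uses distinct values.

```python
def argsort_sketch(values):
    """Return the indices that would put values into ascending order."""
    return sorted(range(len(values)), key=lambda i: values[i])

data = [5, 0, 1, 9, 8]
order = argsort_sketch(data)            # [1, 2, 0, 4, 3]
taken = [data[i] for i in order]        # same as sorted(data)
```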

argmax( array, axis=-1)

argmin( array, axis=-1)

The argmax function returns an array (or scalar for a 1D array) with the index(es) of the maximum value(s) of its input array along the given axis. The returned array will have one less dimension than array. argmin is just like argmax, except that it returns the indices of the minima along the given axis.

>>> print data
[[9 6 1 3 0]
 [0 0 8 9 1]
 [7 4 5 4 0]
 [5 2 7 7 1]
 [9 9 7 9 7]]
>>> print argmax(data)
[0 3 0 3 1]
>>> print argmax(data, 0)
[4 4 1 4 4]
>>> print argmin(data)
[4 1 4 4 4]
>>> print argmin(data, 0)
[1 1 0 0 2]

fromstring( datastring, type, shape=None): Will return the array formed by the binary data given in datastring, of the specified type. This is mainly used for reading binary data to and from files; it can also be used to exchange binary data with other modules that use Python strings as storage (e.g. PIL). Note that this representation is dependent on the byte order. To find out the byte ordering used, use the isbyteswapped method. If shape is not specified, the created array will be one dimensional.

fromfile( file, type, shape=None): If file is a string then it is interpreted as the name of a file which is opened and read. Otherwise, file must be a Python file object which is read as a source of binary array data. fromfile reads directly into the newly created array buffer with no intermediate string, but otherwise is similar to fromstring, treating the contents of the specified file as a binary data string.

dot( a, b)

The dot function returns the dot product of a and b. This is equivalent to matrix multiply for rank-2 arrays (without the transposition). This function is identical to the matrixmultiply function.

>>> print a
[1 2]
>>> print b
[10 11]
# kind of like vector inner product with implicit transposition 
>>> print dot(a,b), dot(b,a) 
32 32
>>> print a
[[1 2]
 [5 7]]
>>> print b
[[  0   1]
 [ 10 100]]
>>> print dot(a,b)
[[ 20 201]
 [ 70 705]]
>>> print dot(b,a)
[[  5   7]
 [510 720]]

matrixmultiply( a, b)

This function multiplies matrices or matrices and vectors as matrices rather than elementwise. This function is identical to dot. Compare:

>>> print a
[[0 1 2]
 [3 4 5]]
>>> print b
[1 2 3]
>>> print a*b
[[ 0  2  6]
 [ 3  8 15]]
>>> print matrixmultiply(a,b)
[ 8 26]

clip( m, m_min, m_max)

The clip function creates an array with the same shape and type as m, but where every entry in m that is less than m_min is replaced by m_min, and every entry greater than m_max is replaced by m_max. Entries within the range [m_min, m_max] are left unchanged.

>>> a = arange(9, type=Float32)
>>> print clip(a, 1.5, 7.5)
[1.5 1.5 2. 3. 4. 5. 6. 7. 7.5]

indices( shape, type=None)

The indices function returns an array of index values for an array of the given shape. The returned array has one more dimension than shape describes: its first axis has length len(shape), and the remaining axes have the given shape. For example, if shape=(3,4), the shape of the returned array will be (2,3,4), since the length of (3,4) is 2; if shape=(5,6,7), the returned array's shape will be (3,5,6,7). The ith subarray (along index 0, the first dimension) contains, for every element position, the index of that position along axis i. An example makes things clearer:

>>> i = indices((4,3))
>>> i.getshape()
(2, 4, 3)
>>> print i[0]
[[0 0 0]
 [1 1 1]
 [2 2 2]
 [3 3 3]]
>>> print i[1]
[[0 1 2]
 [0 1 2]
 [0 1 2]
 [0 1 2]]

So, i[0] has an array of the specified shape, and each element in that array specifies the index of that position in the subarray for axis 0. Similarly, each element in the subarray in i[1] contains the index of that position in the subarray for axis 1.
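
The contents of the two subarrays can be built directly for the 2-D case. indices_2d is a hypothetical pure-Python helper that returns nested lists rather than a numarray array.

```python
def indices_2d(nrows, ncols):
    """Return [row_index_grid, column_index_grid] for an nrows x ncols array."""
    # Every entry of the first grid holds its own row number.
    row_index = [[r for _ in range(ncols)] for r in range(nrows)]
    # Every entry of the second grid holds its own column number.
    col_index = [[c for c in range(ncols)] for _ in range(nrows)]
    return [row_index, col_index]

i = indices_2d(4, 3)
```

With these grids, i[0][r][c] is r and i[1][r][c] is c for every position (r, c).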

swapaxes( array, axis1, axis2)

Returns a new array which shares the data of array, but has the two axes specified by axis1 and axis2 swapped. If array is of rank 0 or 1, swapaxes simply returns a new reference to array.

>>> x = arange(10)
>>> x.setshape((5,2,1))
>>> print x
[[[0]
  [1]]

 [[2]
  [3]]

 [[4]
  [5]]

 [[6]
  [7]]

 [[8]
  [9]]]
>>> y = swapaxes(x, 0, 2)
>>> y.getshape()
(1, 2, 5)
>>> print y
[[[0 2 4 6 8]
  [1 3 5 7 9]]]

concatenate( arrs, axis=0)

Returns a new array containing copies of the data contained in all arrays of arrs= (a0, a1, ... an). The arrays ai will be concatenated along the specified axis (default=0). All arrays ai must have the same shape along every axis except for the one specified in axis. To concatenate arrays along a newly created axis, you can use array((a0, ..., an)), as long as all arrays have the same shape.

>>> print x
[[ 0  1  2  3]
 [ 5  6  7  8]
 [10 11 12 13]]
>>> print concatenate((x,x))
[[ 0  1  2  3]
 [ 5  6  7  8]
 [10 11 12 13]
 [ 0  1  2  3]
 [ 5  6  7  8]
 [10 11 12 13]]
>>> print concatenate((x,x), 1)
[[ 0  1  2  3  0  1  2  3]
 [ 5  6  7  8  5  6  7  8]
 [10 11 12 13 10 11 12 13]]
>>> print array((x,x))   # Note: one extra dimension
[[[ 0  1  2  3]
  [ 5  6  7  8]
  [10 11 12 13]]
 [[ 0  1  2  3]
  [ 5  6  7  8]
  [10 11 12 13]]]
>>> print a
[[1 2]]
>>> print b
[[3 4 5]]
>>> print concatenate((a,b),1)
[[1 2 3 4 5]]
>>> print concatenate((a,b),0)
ValueError: _concat array shapes must match except 1st dimension

innerproduct( a, b): innerproduct produces the inner product of arrays a and b. It is equivalent to matrixmultiply(a, transpose(b)).

outerproduct( a,b): outerproduct produces the outer product of vectors a and b, that is result[i, j] = a[i] * b[j].

array_repr( a, max_line_width=None, precision=None, suppress_small=None): See section on Textual Representations of arrays.

array_str( a, max_line_width=None, precision=None, suppress_small=None)

See section Textual Representations of arrays.

>>> print a
[  1.00000000e+00   1.10000000e+00   1.11600000e+00   1.11380000e+00
   1.20000000e-02   1.34560000e-04]
>>> print array_str(a,precision=4,suppress_small=1)
[ 1.      1.1     1.116   1.1138  0.012   0.0001]
>>> print array_str(a,precision=3,suppress_small=1)
[ 1.     1.1    1.116  1.114  0.012  0.   ]
>>> print array_str(a,precision=3)
[  1.000e+00   1.100e+00   1.116e+00   1.114e+00   1.200e-02
   1.346e-04]

resize( array, shape)

The resize function takes an array and a shape, and returns a new array with the specified shape, filled with the data in the input array. Unlike the reshape function, the new shape does not have to yield the same size as the original array. If the new size is less than that of the input array, the returned array contains the appropriate data from the "beginning" of the old array. If the new size is greater than that of the input array, the data in the input array is repeated as many times as needed to fill the new array.

>>> x = arange(10)
>>> y = resize(x, (4,2))                # note that 4*2 < 10
>>> print x
[0 1 2 3 4 5 6 7 8 9]
>>> print y
[[0 1]
 [2 3]
 [4 5]
 [6 7]]
>>> print resize(array((0,1)), (5,5))   # note that 5*5 > 2
[[0 1 0 1 0]
 [1 0 1 0 1]
 [0 1 0 1 0]
 [1 0 1 0 1]
 [0 1 0 1 0]]
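
Both the truncating and the repeating case follow from one rule: fill the new array in storage order, cycling through the input data. A minimal pure-Python sketch for a rank-1 input and a 2-D target shape (resize_sketch is a hypothetical helper, not the numarray function):

```python
def resize_sketch(values, shape):
    """Fill an nrows x ncols list of lists from values, wrapping around as needed."""
    nrows, ncols = shape
    # Element k of the flattened result is values[k % len(values)].
    flat = [values[k % len(values)] for k in range(nrows * ncols)]
    return [flat[r * ncols:(r + 1) * ncols] for r in range(nrows)]

smaller = resize_sketch(list(range(10)), (4, 2))   # uses only the first 8 values
larger = resize_sketch([0, 1], (5, 5))             # repeats 0, 1 to fill 25 slots
```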

identity( n, type=None)

The identity function returns an n by n array where the diagonal elements are 1, and the off-diagonal elements are 0.

>>> print identity(5)
[[1 0 0 0 0]
 [0 1 0 0 0]
 [0 0 1 0 0]
 [0 0 0 1 0]
 [0 0 0 0 1]]

sum( a, axis=0)

The sum function is a synonym for the reduce method of the add ufunc. It returns the sum of all of the elements in the sequence given along the specified axis (first axis by default).

>>> print x
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]
 [16 17 18 19]]
>>> print sum(x)
[40 45 50 55]                           # 0+4+8+12+16, 1+5+9+13+17, 2+6+10+14+18, ...
>>> print sum(x, 1)
[ 6 22 38 54 70]                        # 0+1+2+3, 4+5+6+7, 8+9+10+11, ...
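
The two reductions can be reproduced on a list of rows. Note that the sum used in this sketch is Python's built-in sum over a flat sequence, not the numarray function; the names axis0 and axis1 are chosen for illustration.

```python
# Same values as the 5x4 array above.
x = [[4 * r + c for c in range(4)] for r in range(5)]

# Reducing along axis 0 adds the rows together, giving one total per column.
axis0 = [sum(column) for column in zip(*x)]

# Reducing along axis 1 adds within each row, giving one total per row.
axis1 = [sum(row) for row in x]
```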

cumsum( a, axis=0): The cumsum function is a synonym for the accumulate method of the add ufunc.

product( a, axis=0): The product function is a synonym for the reduce method of the multiply ufunc.

cumproduct( a, axis=0): The cumproduct function is a synonym for the accumulate method of the multiply ufunc.

alltrue( a, axis=0): The alltrue function is a synonym for the reduce method of the logical_and ufunc.

sometrue( a, axis=0): The sometrue function is a synonym for the reduce method of the logical_or ufunc.

all( a): all is a synonym for the reduce method of the logical_and ufunc, preceded by a ravel which converts arrays with rank > 1 to rank 1. Thus, all tests that all the elements of a multidimensional array are nonzero.

any( a): The any function is a synonym for the reduce method of the logical_or ufunc, preceded by a ravel which converts arrays with rank > 1 to rank 1. Thus, any tests that at least one of the elements of a multidimensional array is nonzero.

allclose( a, b, rtol=1.e-5, atol=1.e-8)

This function tests whether or not arrays a and b of an integer or real type are equal subject to the given relative and absolute tolerances: rtol, atol. The formula used is:

$\displaystyle \left\vert a - b \right\vert < atol + rtol \cdot \left\vert b \right\vert$ (8.1)

This means essentially that either both elements are small compared to atol, or their difference divided by b's value is small compared to rtol.
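
The test in equation (8.1) can be written out elementwise in plain Python. This is a sketch of the formula as stated, assuming two equal-length flat sequences; allclose_sketch is a hypothetical name and not the numarray implementation.

```python
def allclose_sketch(a, b, rtol=1.e-5, atol=1.e-8):
    """Return True if every pair of elements satisfies the tolerance test (8.1)."""
    # The magnitude of the second argument scales the relative tolerance,
    # so the test is not symmetric in its two arguments.
    return all(abs(p - q) < atol + rtol * abs(q) for p, q in zip(a, b))

close = allclose_sketch([1.0, 2.0], [1.0, 2.0000001])   # differences within tolerance
far = allclose_sketch([1.0], [1.1])                     # difference of 0.1 is too large
```

Because only the second argument appears on the right-hand side, swapping the arguments can change the outcome for values near the tolerance boundary.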

See Also:

Module numarray.convolve: The convolve function is implemented in the optional numarray.convolve package.

Module numarray.convolve: The correlation function is implemented in the optional numarray.convolve package.
