Merging arrays in NumPy

When processing data, we often need to join multi-dimensional arrays.
The common cases are as follows (below, *.shape denotes the shape of a tensor):

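  • Extending along an existing dimension. For example, given a.shape == (32, 3, 5000) and b.shape == (32, 1, 5000), we want to merge a and b into a new tensor c with c.shape == (32, 4, 5000);
  • Adding a new column. For example, given a.shape == (32, 3, 5000) and b.shape == (32, 5000), we want to merge a and b into a new tensor c with c.shape == (32, 4, 5000).
    In short, the first case can be handled with NumPy's concatenate. For the second case, use expand_dims to add the missing dimension first and then proceed as in the first case. The usage of each function is described below.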
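The two cases can be sketched as follows. This is a minimal illustration using the shapes from the list above; the variable names b2, b2_expanded and c2 are introduced here for the example only.

```python
import numpy as np

# Case 1: both arrays are 3-D and differ only along axis 1.
a = np.zeros((32, 3, 5000))
b = np.ones((32, 1, 5000))
c = np.concatenate((a, b), axis=1)
print(c.shape)  # (32, 4, 5000)

# Case 2: b2 lacks the middle axis, so insert it before concatenating.
b2 = np.ones((32, 5000))
b2_expanded = np.expand_dims(b2, axis=1)  # shape (32, 1, 5000)
c2 = np.concatenate((a, b2_expanded), axis=1)
print(c2.shape)  # (32, 4, 5000)
```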

1. Using concatenate

Function prototype:
  • numpy.concatenate((a1, a2, …), axis=0, out=None)
    Join a sequence of arrays along an existing axis.
    Parameters:	

    a1, a2, … : sequence of array_like

    The arrays must have the same shape, except in the dimension corresponding to axis (the first, by default).
    axis : int, optional

    The axis along which the arrays will be joined. If axis is None, arrays are flattened before use. Default is 0.
    out : ndarray, optional

    If provided, the destination to place the result. The shape must be correct, matching that of what concatenate would have returned if no out argument were specified.

    Returns:

    res : ndarray

    The concatenated array.
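As a small illustration of the out parameter described above, the sketch below preallocates a destination array and lets concatenate write into it. The name buf is an assumption made for this example; the buffer must already have the shape the result would have.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])

# Preallocate a buffer with the shape and dtype of the expected result.
buf = np.empty((3, 2), dtype=a.dtype)
res = np.concatenate((a, b), axis=0, out=buf)

# The result is written into buf, so both names refer to the same data.
print(np.shares_memory(res, buf))  # True
print(buf.tolist())                # [[1, 2], [3, 4], [5, 6]]
```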

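Points to note:

  • Arrays can only be concatenated if they have the same shape in every dimension except the one being joined. That is, a.shape == (32, 3, 5000) and b.shape == (32, 1, 5000) can be concatenated along axis=1, whereas a.shape == (32, 3, 5000) and b.shape == (32, 5000) cannot be concatenated.
    For example: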
    >>> a = np.array([[1, 2], [3, 4]])
    >>> b = np.array([[5, 6]])
    >>> np.concatenate((a, b), axis=0)
    array([[1, 2],
           [3, 4],
           [5, 6]])
    >>> np.concatenate((a, b.T), axis=1)
    array([[1, 2, 5],
           [3, 4, 6]])
    >>> np.concatenate((a, b), axis=None)
    array([1, 2, 3, 4, 5, 6])
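The shape rule can also be checked directly. The sketch below uses scaled-down shapes (an assumption made to keep the example small) in place of (32, 3, 5000), and shows that arrays with a different number of dimensions are rejected with a ValueError.

```python
import numpy as np

a = np.zeros((2, 3, 4))  # stands in for shape (32, 3, 5000)
b = np.zeros((2, 1, 4))  # same as a except along axis 1
c = np.zeros((2, 4))     # one dimension short

ok = np.concatenate((a, b), axis=1)
print(ok.shape)  # (2, 4, 4)

error_message = None
try:
    np.concatenate((a, c), axis=1)
except ValueError as exc:
    # NumPy refuses to join arrays with different numbers of dimensions.
    error_message = str(exc)
    print("cannot concatenate:", error_message)
```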

2. Using column_stack

As the name suggests, this function is used to add new columns.
Function prototype:

  • numpy.column_stack(tup)
    Stack 1-D arrays as columns into a 2-D array.
    Take a sequence of 1-D arrays and stack them as columns to make a single 2-D array. 2-D arrays are stacked as-is, just like with hstack. 1-D arrays are turned into 2-D columns first.
    Parameters:	

    tup : sequence of 1-D or 2-D arrays.

    Arrays to stack. All of them must have the same first dimension.

    Returns:

    stacked : 2-D array

    The array formed by stacking the given arrays.

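According to this description, we pass in a tuple and get back a merged 2-D array. That wording is actually quite vague. Let's look at an example first.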

>>> a = np.array((1,2,3))
>>> b = np.array((2,3,4))
>>> np.column_stack((a,b))
array([[1, 2],
       [2, 3],
       [3, 4]])

The description above says the input is a tuple, so let's look at another example, this time with 3-D arrays:

import numpy as np

a = np.zeros((32,3,1000))
b = np.zeros((32,1,1000))
c = np.zeros((32,1,1000))

print(a.shape)
print(b.shape)
print(c.shape)
d = np.column_stack((a,b,c))

print(d.shape)

The output is:

(32, 3, 1000)
(32, 1, 1000)
(32, 1, 1000)
(32, 5, 1000)
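The result above suggests that, for inputs with two or more dimensions, column_stack behaves like concatenate along axis=1, while 1-D inputs are first turned into columns. A minimal check of that reading (the names cols, stacked and joined are introduced here for illustration):

```python
import numpy as np

# 1-D inputs become the columns of a 2-D result.
x = np.array([1, 2, 3])
y = np.array([2, 3, 4])
cols = np.column_stack((x, y))
print(cols.shape)  # (3, 2)

# Inputs with two or more dimensions are joined along axis 1 as they are.
a = np.zeros((32, 3, 1000))
b = np.ones((32, 1, 1000))
stacked = np.column_stack((a, b))
joined = np.concatenate((a, b), axis=1)
print(stacked.shape)                    # (32, 4, 1000)
print(np.array_equal(stacked, joined))  # True
```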

3. Using vstack

Function prototype:

  • numpy.vstack(tup)
    Stack arrays in sequence vertically (row wise).
    Parameters:	

    tup : sequence of ndarrays

    The arrays must have the same shape along all but the first axis. 1-D arrays must have the same length.

    Returns:

    stacked : ndarray

    The array formed by stacking the given arrays, will be at least 2-D.

Let's go straight to the examples!

>>> a = np.array([1, 2, 3])
>>> b = np.array([2, 3, 4])
>>> np.vstack((a,b))
array([[1, 2, 3],
       [2, 3, 4]])

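In other words, for 1-D inputs this function first promotes each array to 2-D and then concatenates along axis=0. Here a.shape == (3,) and b.shape == (3,) are each treated as (1, 3), and the result has shape (2, 3).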

>>> a = np.array([[1], [2], [3]])
>>> b = np.array([[2], [3], [4]])
>>> np.vstack((a,b))
array([[1],
       [2],
       [3],
       [2],
       [3],
       [4]])
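To make the 1-D case concrete, the sketch below reproduces vstack by hand: each 1-D array is promoted to shape (1, 3) with np.atleast_2d and the results are joined along axis=0. The name manual is introduced here for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([2, 3, 4])

# Promote each 1-D array to shape (1, 3), then join along axis 0.
manual = np.concatenate((np.atleast_2d(a), np.atleast_2d(b)), axis=0)
stacked = np.vstack((a, b))

print(stacked.shape)                    # (2, 3)
print(np.array_equal(stacked, manual))  # True
```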

4. Using hstack

Function prototype:

  • numpy.hstack(tup)
    Let's go straight to the examples!
    >>> a = np.array((1,2,3))
    >>> b = np.array((2,3,4))
    >>> np.hstack((a,b))
    array([1, 2, 3, 2, 3, 4])
    >>> a = np.array([[1],[2],[3]])
    >>> b = np.array([[2],[3],[4]])
    >>> np.hstack((a,b))
    array([[1, 2],
           [2, 3],
           [3, 4]])

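For 1-D arrays this function concatenates along axis=0: a.shape == (3,) and b.shape == (3,) give a result of shape (6,). For arrays with two or more dimensions it concatenates along axis=1, so two arrays of shape (3, 1) give a result of shape (3, 2).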
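The axis that hstack joins along can be verified by comparing it with concatenate. This is a small sketch; the names flat, wide, m and n are introduced here for illustration.

```python
import numpy as np

# 1-D inputs are joined along axis 0.
a = np.array([1, 2, 3])
b = np.array([2, 3, 4])
flat = np.hstack((a, b))
print(flat.shape)  # (6,)
print(np.array_equal(flat, np.concatenate((a, b), axis=0)))  # True

# Inputs with two or more dimensions are joined along axis 1.
m = np.array([[1], [2], [3]])
n = np.array([[2], [3], [4]])
wide = np.hstack((m, n))
print(wide.shape)  # (3, 2)
print(np.array_equal(wide, np.concatenate((m, n), axis=1)))  # True
```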

5. Using expand_dims

Function prototype:

  • numpy.expand_dims(a, axis)
    Expand the shape of an array.

Insert a new axis that will appear at the axis position in the expanded array shape.

Parameters:	

a : array_like

Input array.
axis : int

Position in the expanded axes where the new axis is placed.

Returns:

res : ndarray

Output array. The number of dimensions is one greater than that of the input array.

First we construct a vector of length 2:

>>> x = np.array([1,2])
>>> x.shape
(2,)

We insert a new axis at axis=0, which turns x into a single row.

>>> y = np.expand_dims(x, axis=0)
>>> y
array([[1, 2]])
>>> y.shape
(1, 2)

As we can see, the shape is now (1, 2).
Next we insert a new axis at axis=1, which turns x into a single column.

>>> y = np.expand_dims(x, axis=1)  # Equivalent to x[:,np.newaxis]
>>> y
array([[1],
       [2]])
>>> y.shape
(2, 1)

As we can see, the shape is now (2, 1).
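For comparison, the sketch below shows a few equivalent ways of writing the same operations, including indexing with np.newaxis, reshape, and a negative axis. The names row, col and last are introduced here for illustration.

```python
import numpy as np

x = np.array([1, 2])

row = np.expand_dims(x, axis=0)    # shape (1, 2)
col = np.expand_dims(x, axis=1)    # shape (2, 1)
last = np.expand_dims(x, axis=-1)  # a negative axis counts from the end

print(np.array_equal(row, x[np.newaxis, :]))  # True
print(np.array_equal(col, x[:, np.newaxis]))  # True
print(np.array_equal(col, x.reshape(2, 1)))   # True
print(last.shape)                             # (2, 1)
```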

  • Author of this article:zfish
  • Link to this article: archives/26dfb48.html
  • Copyright Notice: All articles in this blog, except for special statements, please indicate the source!