How to split the array for every batch?
Could you provide a Python solution for splitting my dataset array (for example, the list of training image paths) into batches?
Maybe you could try this function.
from sklearn.utils import shuffle


def batch_data(data_X, data_Y, batch_size=256):
    """
    :param data_X: source data of the training set, type: list
    :param data_Y: labels for the source data, type: list
    :param batch_size: number of samples per batch (default 256), type: int
    :return: generator yielding (x_batch, y_batch) pairs
    """
    # Shuffle first; sklearn's shuffle keeps data_X and data_Y aligned.
    shuffled_X, shuffled_Y = shuffle(data_X, data_Y)
    # Integer division: leftover samples that do not fill a batch are skipped.
    for idx in range(len(shuffled_X) // batch_size):
        x_batch = shuffled_X[batch_size * idx: batch_size * (idx + 1)]
        y_batch = shuffled_Y[batch_size * idx: batch_size * (idx + 1)]
        yield x_batch, y_batch
To get a fresh set of batches for every epoch, call this function inside your epoch loop. Because it is a generator, each call reshuffles the data and yields the batches one at a time.
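As a rough sketch of what that epoch loop could look like, here is a self-contained variant that uses only the standard library's random module instead of sklearn. The name iter_batches, the toy paths and labels, and the choice to keep the final partial batch are assumptions for illustration, not part of the function above (which skips any leftover samples).

```python
import random


def iter_batches(data_x, data_y, batch_size=256, seed=None):
    """Yield shuffled (x_batch, y_batch) pairs, including the last partial batch."""
    if len(data_x) != len(data_y):
        raise ValueError("data_x and data_y must have the same length")
    # Shuffle indices once so samples and labels stay aligned.
    indices = list(range(len(data_x)))
    random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        chunk = indices[start:start + batch_size]
        yield [data_x[i] for i in chunk], [data_y[i] for i in chunk]


# Hypothetical toy dataset: 10 image paths with integer labels.
image_paths = [f"img_{i}.png" for i in range(10)]
labels = list(range(10))

num_epochs = 2
for epoch in range(num_epochs):
    # A fresh generator per epoch gives a new shuffle each time.
    for x_batch, y_batch in iter_batches(image_paths, labels, batch_size=4):
        pass  # load the images in x_batch and run one training step here
```

With 10 samples and a batch size of 4, each epoch produces two full batches and one batch of 2, so no sample is left out.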
import numpy as np


def mini_batch(array, batch_size=10):
    array = np.array(array)
    # Nothing to yield for an empty array or a zero batch size.
    if not array.shape[0] or not batch_size:
        return
    # Yield every full batch.
    for start_idx in range(0, array.shape[0] - batch_size + 1, batch_size):
        excerpt = slice(start_idx, start_idx + batch_size)
        yield array[excerpt]
    # Yield the leftover samples as a final, smaller batch.
    if array.shape[0] % batch_size != 0:
        yield array[-(array.shape[0] % batch_size):]
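If the data is already a NumPy array, the same batching, leftover samples included, can also be sketched with np.split and explicit cut points. This is an alternative to the generator above, not a replacement for it, and it builds all batches at once rather than lazily. The paths and batch size here are made up for illustration.

```python
import numpy as np

# Hypothetical list of 7 training image paths.
paths = np.array([f"img_{i}.png" for i in range(7)])
batch_size = 3

# Cut points at 3 and 6 give the slices [0:3], [3:6] and [6:7].
cut_points = np.arange(batch_size, len(paths), batch_size)
batches = np.split(paths, cut_points)

for batch in batches:
    print(len(batch), batch.tolist())
```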