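- Data processing: feature_columns = [sparseFeature(feat, len(fea_map[feat]) + 1 if feat in sparse_features else n_bins + 1, embed_dim=embed_dim) for feat in features]. Dense features are discretized first, so pass the number of buckets (n_bins + 1 here) as their feature size.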
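A minimal sketch of the idea above, under several assumptions: the notes do not show `sparseFeature`, so the dict-returning version here is a hypothetical stand-in; the equal-width binning helpers (`equal_width_edges`, `bucketize`) and the toy `rows` are also made up for illustration. Index 0 is assumed to be reserved (for example for unknown values), which is one way to read the `+ 1` in the original expression.

```python
from bisect import bisect_right


def sparseFeature(feat, feat_num, embed_dim=4):
    # Hypothetical stand-in for the helper named in the notes.
    return {"feat": feat, "feat_num": feat_num, "embed_dim": embed_dim}


def equal_width_edges(values, n_bins):
    # Interior boundaries of n_bins equal-width buckets (assumed binning scheme).
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_bins or 1.0
    return [lo + step * i for i in range(1, n_bins)]


def bucketize(value, edges):
    # Bucket ids run from 1 to n_bins; 0 stays reserved.
    return bisect_right(edges, value) + 1


# Toy data, purely illustrative.
rows = [
    {"city": "bj", "age": 18.0},
    {"city": "sh", "age": 35.0},
    {"city": "bj", "age": 60.0},
]
sparse_features = ["city"]
dense_features = ["age"]
features = sparse_features + dense_features
n_bins, embed_dim = 4, 8

# Category value -> integer id (starting at 1) for each sparse feature.
fea_map = {
    f: {v: i + 1 for i, v in enumerate(sorted({r[f] for r in rows}))}
    for f in sparse_features
}
# Bucket boundaries for each dense feature.
edges = {f: equal_width_edges([r[f] for r in rows], n_bins) for f in dense_features}

# Every feature is now an integer id, so all of them can go through embeddings.
encoded = [
    {
        f: fea_map[f][r[f]] if f in sparse_features else bucketize(r[f], edges[f])
        for f in features
    }
    for r in rows
]

feature_columns = [
    sparseFeature(
        feat,
        len(fea_map[feat]) + 1 if feat in sparse_features else n_bins + 1,
        embed_dim=embed_dim,
    )
    for feat in features
]
```

After discretization a dense feature behaves like a sparse one with `n_bins` distinct values, which is why the same constructor can be reused for both kinds.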
- Dataset handling:
import numpy as np
import tensorflow as tf

def mnist_dataset(batch_size):
    (x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
    # The x arrays are in uint8 and have values in the [0, 255] range.
    # You need to convert them to float32 with values in the [0, 1] range.
    x_train = x_train / np.float32(255)
    y_train = y_train.astype(np.int64)
    # Shuffle over the full training set, repeat indefinitely, then batch.
    train_dataset = tf.data.Dataset.from_tensor_slices(
        (x_train, y_train)).shuffle(60000).repeat().batch(batch_size)
    return train_dataset
A tf.data.Dataset is friendlier to distributed training, especially with a custom train step.
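One way to see why: a `tf.distribute` strategy can take a `tf.data.Dataset` and split each global batch across replicas, so the custom step only has to describe the work for one replica. The sketch below is not from the notes. It assumes `tf.distribute.MirroredStrategy`, uses random MNIST-shaped data (`synthetic_dataset` is a hypothetical helper, chosen to avoid a download), and uses a deliberately tiny model.

```python
import numpy as np
import tensorflow as tf


def synthetic_dataset(batch_size, n=256):
    # Random stand-in for MNIST-shaped data (assumption, avoids a download).
    rng = np.random.default_rng(0)
    x = rng.random((n, 28, 28), dtype=np.float32)
    y = rng.integers(0, 10, size=n).astype(np.int64)
    return tf.data.Dataset.from_tensor_slices((x, y)).shuffle(n).repeat().batch(batch_size)


strategy = tf.distribute.MirroredStrategy()
global_batch_size = 32 * strategy.num_replicas_in_sync

with strategy.scope():
    # Variables created here are mirrored across replicas.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10),
    ])
    optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
    # Per-example losses; averaging over the global batch happens explicitly below.
    loss_obj = tf.keras.losses.SparseCategoricalCrossentropy(
        from_logits=True, reduction="none")


def replica_step(inputs):
    # Runs once per replica on that replica's slice of the global batch.
    x, y = inputs
    with tf.GradientTape() as tape:
        logits = model(x, training=True)
        per_example = loss_obj(y, logits)
        loss = tf.nn.compute_average_loss(
            per_example, global_batch_size=global_batch_size)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss


@tf.function
def train_step(inputs):
    per_replica_loss = strategy.run(replica_step, args=(inputs,))
    # Sum the per-replica contributions to get the global-batch mean loss.
    return strategy.reduce(tf.distribute.ReduceOp.SUM, per_replica_loss, axis=None)


# The strategy shards each global batch across replicas.
dist_dataset = strategy.experimental_distribute_dataset(
    synthetic_dataset(global_batch_size))
iterator = iter(dist_dataset)
losses = [float(train_step(next(iterator))) for _ in range(5)]
```

Because the dataset ends with `.repeat()`, the loop is bounded by a step count rather than by exhausting the iterator.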