Improve intermediate layer extraction explanation #1338

Open
palonso wants to merge 3 commits into MTG:master from palonso:intermediate-layer-extraction-doc

Contributor

@palonso palonso commented May 26, 2023

TensorToVectorReal converts tensors to 2D arrays by flattening all axes but the last one into the first dimension.
Model-specific prediction algorithms (e.g., TensorflowPredictVGGish) use this algorithm to return 2D arrays, since they are primarily intended for time-wise predictions or embeddings. However, these algorithms can also be used to extract intermediate layers of the models, which may have more than two dimensions. In that case, all dimensions but the last one are flattened. To address this:

  • TensorToVectorReal emits a warning when it flattens a dimension.
  • We added notes explaining this behavior to the algorithms potentially affected.
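
The flattening described above can be illustrated with a small standalone sketch. It does not use Essentia's types: `Tensor4D` and `flattenToFrames` are hypothetical names, and the [batch, channel, time, feature] axis order is an assumption taken from the warning message in this PR. Every axis but the last is collapsed into the first output dimension, and a warning is printed once if the channel axis has more than one element.

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

// Hypothetical stand-in for a 4D tensor stored in row-major order with
// axes [batch, channel, time, feature]. This is not an Essentia type.
struct Tensor4D {
  std::size_t batch, channels, time, feats;
  std::vector<float> data;  // size == batch * channels * time * feats

  Tensor4D(std::size_t b, std::size_t c, std::size_t t, std::size_t f)
      : batch(b), channels(c), time(t), feats(f), data(b * c * t * f, 0.f) {}
};

// Collapse every axis but the last into the first output dimension.
// The result has batch * channels * time rows of `feats` values each.
std::vector<std::vector<float>> flattenToFrames(const Tensor4D& t,
                                                bool& warned) {
  if (t.channels != 1 && !warned) {
    std::cerr << "warning: channel axis has size " << t.channels
              << "; batch, channel, and time axes will be flattened\n";
    warned = true;  // warn only once per caller-owned flag
  }

  const std::size_t rows = t.batch * t.channels * t.time;
  std::vector<std::vector<float>> frames;
  frames.reserve(rows);
  for (std::size_t r = 0; r < rows; ++r) {
    // In row-major storage, each output row is one contiguous slice.
    auto first = t.data.begin() + static_cast<std::ptrdiff_t>(r * t.feats);
    frames.emplace_back(first, first + static_cast<std::ptrdiff_t>(t.feats));
  }
  return frames;
}
```

For a tensor of shape [1, 64, 10, 8], this would yield 640 rows of 8 values, so the channel structure can no longer be recovered from the output shape alone.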

Note that it is also possible to retrieve intermediate layers with their original shape using TensorflowPredict as discussed here.

@palonso palonso requested a review from dbogdanov May 26, 2023 08:51
Member

@dbogdanov dbogdanov left a comment

This looks good! I've left a proposal to improve the description of the algorithms' output in the DOC string.

"Note: The output of this algorithm is 2D, which is suitable for extracting embeddings or "
"class activations (the output shape is, e.g., [time, number of classes]). If the output "
"parameter is set to an intermediate layer with more dimensions, the output will be "
"flattened to 2D.\n"

Rephrased version (trying to simplify):

Note: The algorithm outputs a time series of class activations or embedding vectors, with a 2D shape [time, feature vector]. Feature vector values will be flattened if the output parameter is set to extract an intermediate layer with multiple dimensions.

"class activations (the output shape is, e.g., [time, number of classes]). If the output "
"parameter is set to an intermediate layer with more dimensions, the output will be "
"flattened to 2D.\n"
"\n"

Same comments as for TensorflowPredictEffnetDiscogs

"Note: The output of this algorithm is 2D, which is suitable for extracting embeddings or "
"class activations (the output shape is, e.g., [time, number of classes]). If the output "
"parameter is set to an intermediate layer with more dimensions, the output will be "
"flattened to 2D.\n"

Same comment as for TensorflowPredictEffnetDiscogs

"Note: The output of this algorithm is 2D, which is suitable for extracting embeddings or "
"class activations (the output shape is, e.g., [time, number of classes]). If the output "
"parameter is set to an intermediate layer with more dimensions, the output will be "
"flattened to 2D.\n"

Same comment as for TensorflowPredictEffnetDiscogs

_featsSize = tensor.dimension(3);

if (_channels != 1 && !_warned) {
  E_WARNING("TensorToVectorReal: The channel axis (dimension 1) of the input tensor has size larger than 1, but the output of this algorithm is 2D. The batch, channel, and time axes (dimensions 0, 1, 2) will be flattened to the first dimension of the output matrix.");

We output a vector of vectors of reals, so the "matrix" terminology may be misleading.
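
To illustrate the distinction, here is a minimal sketch. It assumes `Real` is an alias for `float`, and `Frames` and `isRectangular` are hypothetical names introduced only for this example. A vector of vectors carries no fixed column count, so each row is a separate allocation that could in principle have a different length. The helper shows the check that a true matrix type would make unnecessary.

```cpp
#include <cstddef>
#include <vector>

using Real = float;  // assumption for this sketch
using Frames = std::vector<std::vector<Real>>;

// True when every row has the same length (vacuously true if empty).
bool isRectangular(const Frames& frames) {
  for (const auto& row : frames) {
    if (row.size() != frames.front().size()) return false;
  }
  return true;
}
```

Wording such as "the first dimension of the output" avoids implying a contiguous, fixed-width matrix.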
