[ET-VK] Pass detailed op information to event tracer #16266
## Context
Currently, when the event tracer API is used to log shader execution times, only the shader name is recorded as the event name. This provides very little context for interpreting profiling data. For example, if a convolution shader is running slowly, it is impossible to know from the profiling data alone what the input/output sizes were, what convolution parameters (e.g. stride/padding/dilation) were used, etc.
## Changes
With this diff, each shader dispatch records a JSON string as the event name instead of the bare shader name. The JSON contains the complete details of all the arguments the operator was invoked with.
The JSON will look something like:
```json
{
  "name": "aten.where.self",
  "args": [
    {
      "type": "TENSOR",
      "value_ref": 25,
      "dtype": "Bool",
      "sizes": [1, 1, 1, 8],
      "storage": "TEXTURE_3D",
      "packed_dim": 2
    },
    {
      "type": "TENSOR",
      "value_ref": 30,
      "dtype": "Float",
      "sizes": [1, 6, 43, 8],
      "storage": "TEXTURE_3D",
      "packed_dim": 2
    },
    {
      "type": "TENSOR",
      "value_ref": 32,
      "dtype": "Float",
      "sizes": [1, 6, 43, 8],
      "storage": "TEXTURE_3D",
      "packed_dim": 2
    },
    {
      "type": "TENSOR",
      "value_ref": 33,
      "dtype": "Float",
      "sizes": [1, 6, 43, 8],
      "storage": "TEXTURE_3D",
      "packed_dim": 2
    }
  ]
}
```
Then, when processing the profiling data, the JSON can be post-processed to provide useful information to contextualize the shader execution times, for example:
* Memory Throughput
* GFLOPS
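
As a rough illustration of that post-processing, the sketch below parses an event name of the shape shown above and estimates memory throughput. It is not part of this diff: the function names, the `DTYPE_BYTES` table, and the `duration_ns` input are hypothetical, and it assumes every tensor argument is read or written exactly once, which may not hold for a given shader.

```python
import json
from math import prod

# Assumed element sizes in bytes; only the dtypes seen in the example are listed.
DTYPE_BYTES = {"Bool": 1, "Float": 4}


def tensor_bytes(arg):
    """Bytes occupied by one TENSOR argument (element count * element size)."""
    return prod(arg["sizes"]) * DTYPE_BYTES[arg["dtype"]]


def total_tensor_bytes(event_name):
    """Sum the sizes of all tensor arguments recorded in a JSON event name."""
    op = json.loads(event_name)
    return sum(tensor_bytes(a) for a in op["args"] if a.get("type") == "TENSOR")


def throughput_gb_per_s(event_name, duration_ns):
    """Estimated throughput: bytes per nanosecond equals 1e9 bytes per second."""
    return total_tensor_bytes(event_name) / duration_ns


# A shortened event name in the same shape as the JSON above.
example_event = json.dumps({
    "name": "aten.where.self",
    "args": [
        {"type": "TENSOR", "dtype": "Bool", "sizes": [1, 1, 1, 8]},
        {"type": "TENSOR", "dtype": "Float", "sizes": [1, 6, 43, 8]},
    ],
})

# duration_ns would come from the profiler; 1000 ns is a made-up value.
estimate = throughput_gb_per_s(example_event, duration_ns=1000)
```

A GFLOPS estimate would follow the same pattern, with a per-operator FLOP count derived from the recorded sizes in place of the byte count.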
Differential Revision: [D84646748](https://our.internmc.facebook.com/intern/diff/D84646748/)
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):
* __->__ #16267
* #16266

## Context
When debugging correctness issues in ET-VK, it can be helpful to extract a subgraph of the model and test on the subgraph.
## Changes
This diff/PR introduces some test utilities that can be used to extract all nodes tagged with a specified field in the `node.meta["custom"]` map into a separate `ExportedProgram`.
Differential Revision: [D89216531](https://our.internmc.facebook.com/intern/diff/D89216531/)

Co-authored-by: ssjia <ssjia@devvm1479.ncg0.facebook.com>