Hi, thank you for such nice work! I wanted to ask whether by any chance you have a TorchScript version of the feature extractor available? It would make it much easier to incorporate into existing pipelines (I would only need to replace I3D with V-JEPA and the Fréchet distance with MMD). I tried to quickly prepare a TorchScript version myself (roughly along the lines of the sketch below the list), but stumbled upon some issues:
- I noticed that the positional embeddings for e.g. `vith16` do not load correctly if I use the V-JEPA model imported from your `vjepa` package, since the positional embeddings are ignored. From the tensor shapes, it looks like only the first-frame positional embeddings are kept (see the quick shape check below the list). Is that the intended behaviour?
RuntimeError: Error(s) in loading state_dict for VisionTransformer:
size mismatch for pos_embed: copying a param with shape torch.Size([1, 1568, 1280]) from checkpoint, the shape in current model is torch.Size([1, 196, 1280]).
size mismatch for patch_embed.proj.weight: copying a param with shape torch.Size([1280, 3, 2, 16, 16]) from checkpoint, the shape in current model is torch.Size([1280, 3, 16, 16]).
- I couldn't find the source repository for the `vjepa` package. Is it public? From my understanding, the published PyPI package is different from the jepa repo.
- Judging by Fig 6, the vanilla pretrained version of V-JEPA works better, right?
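
For reference, this is roughly what my export attempt looks like. `VideoEncoderStub` is only a stand-in with the input contract I assume for the V-JEPA backbone (a `[B, C, T, H, W]` clip in, patch tokens out); the real encoder built from the `vjepa` package would replace it once the checkpoint loads correctly.

```python
import torch
import torch.nn as nn

class VideoEncoderStub(nn.Module):
    """Stand-in for the V-JEPA video backbone, just to illustrate the trace."""
    def __init__(self, embed_dim=1280, tubelet_size=2, patch_size=16):
        super().__init__()
        # Conv3d patch embedding matching the checkpoint shape [1280, 3, 2, 16, 16]
        self.proj = nn.Conv3d(3, embed_dim,
                              kernel_size=(tubelet_size, patch_size, patch_size),
                              stride=(tubelet_size, patch_size, patch_size))

    def forward(self, x):                        # x: [B, 3, T, H, W]
        x = self.proj(x)                         # [B, D, T/2, H/16, W/16]
        return x.flatten(2).transpose(1, 2)      # [B, N, D] patch tokens

encoder = VideoEncoderStub().eval()              # swap in the real V-JEPA encoder here
dummy_clip = torch.randn(1, 3, 16, 224, 224)     # 16-frame, 224x224 clip
with torch.no_grad():
    ts_module = torch.jit.trace(encoder, dummy_clip)
ts_module.save("vjepa_vith16_ts.pt")
print(ts_module(dummy_clip).shape)               # torch.Size([1, 1568, 1280])
```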
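
And this is the quick check I used to make sense of the size mismatch above. The checkpoint filename and the `"encoder"` key are my assumptions about how the weights are stored, so adjust as needed.

```python
import torch

# Assumed checkpoint path and nesting under an "encoder" key; adjust to the real file.
ckpt = torch.load("vith16.pth", map_location="cpu")
state = ckpt.get("encoder", ckpt) if isinstance(ckpt, dict) else ckpt

pos_embed = state["pos_embed"]       # torch.Size([1, 1568, 1280]) in the checkpoint
n_spatial = 224 // 16                # 14 patches per side
n_temporal = 16 // 2                 # 8 tubelets from a 16-frame clip (tubelet_size=2)
assert pos_embed.shape[1] == n_temporal * n_spatial ** 2   # 8 * 196 = 1568

# The model I get from the package only allocates 196 = 14 * 14 positions,
# i.e. the spatial grid of a single frame, which is what triggers the mismatch.
```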