I tried using TorchStudio in our project, and it was a rather pleasant experience overall, except that my model's training behaves wildly differently from my parallel vanilla PyTorch implementation.
In vanilla PyTorch, the model has issues with class imbalance and missing augmentation, but it still learns to 99+% accuracy, for whatever that is worth. The same model with identical hyperparameters (including optimizer and loss parameters) in TorchStudio either:
- trains to 97% accuracy, but ONLY if the loss computes to NaN after about 2 epochs (which it does), or
- trains to <80% accuracy if the loss does not become NaN (which requires weight decay or a very small learning rate).
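For context, the NaN symptom in the first bullet is easy to detect in a vanilla PyTorch loop with a simple guard. This is an illustrative sketch using a made-up tiny model, not the actual project code:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in model; the real architecture is not shown here.
model = nn.Linear(10, 2)
criterion = nn.CrossEntropyLoss()

x = torch.randn(8, 10)
y = torch.randint(0, 2, (8,))

loss = criterion(model(x), y)
print(torch.isnan(loss).item())  # healthy batch: False

# Corrupt one input value to simulate a divergence: any NaN produced
# in the forward pass propagates into the (batch-averaged) loss.
x[0, 0] = float("nan")
bad_loss = criterion(model(x), y)
print(torch.isnan(bad_loss).item())  # True

# In a training loop, the guard would be:
#     if torch.isnan(loss):
#         break  # stop before backprop poisons the weights
```

Logging the step at which the guard fires in both tools would help pin down where the two runs start to diverge.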
An epoch in TorchStudio is also faster, though that may simply be due to better caching.
Overall, I conclude that I have run into a bug in TorchStudio. More details can be found in the corresponding discussion on Discord: https://discord.com/channels/702624558536065165/1052003784500453380