The sharp edges

This section describes Python and PyTorch features that Thunder does not (yet) support, along with their workarounds. You might encounter these when compiling a module or function with Thunder. The examples should give you a good idea of how to change your program so that it works.

Note that something being unsupported today doesn’t mean it won’t be supported at some point in the future. Feel free to reach out to help us prioritize.

In-place operations

In-place PyTorch operations like t.add_(1.0) are not supported in Thunder yet; use the out-of-place equivalent instead, as shown below. Support for in-place operations is coming soon.
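
A minimal sketch of the workaround, assuming the thunder.jit entry point: rewrite the in-place update as its out-of-place equivalent, which Thunder can trace.

```python
import torch
import thunder

def increment(t):
    # t.add_(1.0)    # in-place update: not supported by Thunder yet
    return t + 1.0   # out-of-place equivalent: traced and optimized normally

jitted = thunder.jit(increment)
out = jitted(torch.randn(3))
```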

Tensor subclasses

Thunder currently supports Python data types and PyTorch tensors as inputs to functions and models.

Subclasses of these types, e.g. lazy tensors, nested tensors, or sparse tensors, are not supported today.
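
Where the data allows it, one workaround is to convert such an input to a plain dense tensor before calling the compiled function. A minimal sketch, again assuming the thunder.jit entry point:

```python
import torch
import thunder

def scale(t):
    return t * 2.0

jitted = thunder.jit(scale)

sparse = torch.eye(3).to_sparse()
# jitted(sparse)                 # sparse inputs are not supported
out = jitted(sparse.to_dense())  # pass a plain dense tensor instead
```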

Tracing Python builtins, standard library operations, and functions that call other languages

Calling a Python builtin, a standard library operation, or a function that calls into another language is safe to trace, so long as the following rules are observed (see the sketch after this list):

  1. The function should not have side effects. For example, print() executes while tracing, but since it’s not a Thunder operation it will not appear in the trace, so future cached executions will not run it.

  2. The function must not manipulate tensor data or metadata. Since the operation won’t appear in a trace, these manipulations won’t be repeated by Thunder, and may even cause a crash while tracing. To implement such operations, see Adding Custom Operators.

  3. The function must not produce different results across invocations. Again, since the operation won’t appear in traces, Thunder cannot replicate an operation like random.random() that returns a different value each time it’s invoked.
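
A sketch illustrating rules 1 and 3, assuming the thunder.jit entry point and that torch.rand_like is an operation Thunder supports: the Python-level print() and random.random() calls run only once, at trace time, while a PyTorch operation is recorded in the trace and re-executed on every call.

```python
import random
import torch
import thunder

def risky(t):
    print("tracing")         # rule 1: side effect, runs only while tracing
    scale = random.random()  # rule 3: sampled once at trace time, then baked
                             # into the trace as a constant
    return t * scale

def safe(t):
    # Express the randomness as a PyTorch operation so it appears in the
    # trace and produces fresh values on every cached execution.
    return t * torch.rand_like(t)

jitted = thunder.jit(safe)
out = jitted(torch.randn(4))
```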

Using Thunder-optimized Modules

Compiling a module produces a “Thunder-optimized module”. A Thunder-optimized module is less dynamic than the original module, which facilitates tracing and optimization. It keeps a reference to the original module and shares its parameters with it.

While modifications to the original module’s parameters are reflected in the Thunder-optimized module, other changes to the original module are not. In particular:

  • Whether the module is in train or eval mode is captured at compilation time and treated as constant (see the sketch after this list)

  • The structure of the module is captured at compilation time, and changing the original module’s structure will likely break the Thunder-optimized module

  • Non-parameter attributes of the module may or may not be captured at compile time and treated as constants
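
A minimal sketch of the train/eval point, assuming thunder.jit accepts modules: the mode is frozen at compilation time, so switching the original module afterwards has no effect on the already-compiled module, and recompiling is the way to pick up the new mode.

```python
import torch
import thunder

model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.Dropout(p=0.5))

model.train()
train_jitted = thunder.jit(model)  # train mode is captured here

model.eval()                       # does not affect train_jitted
eval_jitted = thunder.jit(model)   # compile again to capture eval mode

x = torch.randn(2, 4)
y_train = train_jitted(x)  # dropout active
y_eval = eval_jitted(x)    # dropout disabled
```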

Not all features of PyTorch modules are currently supported, either. Module hooks are not supported, and adding new module attributes in a module’s forward() method is only partially supported.