---
license: bsd-3-clause
tags:
- kernel
---
# Triton layer normalization kernels.

This kernel implements layers normalization using Triton. This kernel is from
the [flash-attention](https://github.com/Dao-AILab/flash-attention) project.

## Functions

### Function `layer_norm`

`(x: torch.Tensor, weight: torch.Tensor, bias: torch.Tensor, residual: Optional[torch.Tensor] = None, x1: Optional[torch.Tensor] = None, weight1: Optional[torch.Tensor] = None, bias1: Optional[torch.Tensor] = None, eps: float = 1e-06, dropout_p: float = 0.0, rowscale=None, prenorm: bool = False, residual_in_fp32: bool = False, is_rms_norm: bool = False, return_dropout_mask: bool = False, out: Optional[torch.Tensor] = None, residual_out: Optional[torch.Tensor] = None)`

Apply layer normalization to the input tensor with Triton acceleration.

### Parameters

- **x** (*torch.Tensor*) --
  Input tensor to normalize.
- **weight** (*torch.Tensor*) --
  Scale parameter for normalization.
- **bias** (*torch.Tensor*) --
  Shift parameter for normalization.
- **residual** (*torch.Tensor*, *optional*) --
  Optional residual tensor to add to the input before normalization.
- **x1** (*torch.Tensor*, *optional*) --
  Optional second input tensor to combine with *x*. When provided, the function
  first adds *x1* to *x* and then applies normalization.
- **weight1** (*torch.Tensor*, *optional*) --
  Scale parameter for the second normalization.
- **bias1** (*torch.Tensor*, *optional*) --
  Shift parameter for the second normalization.
- **eps** (*float*, *optional*, defaults to 1e-6) --
  Small constant added for numerical stability in normalization.
- **dropout_p** (*float*, *optional*, defaults to 0.0) --
  Dropout probability. If greater than 0, applies dropout to the input before
  normalization and residual addition.
- **rowscale** (*torch.Tensor*, *optional*) --
  Optional scaling factor applied to each row of the input tensor.
  Not compatible with the use of *x1*.
- **prenorm** (*bool*, *optional*, defaults to False) --
  If True, returns both the normalized output and the unnormalized input+residual.
- **residual_in_fp32** (*bool*, *optional*, defaults to False) --
  If True, performs the residual connection in FP32 precision.
- **is_rms_norm** (*bool*, *optional*, defaults to False) --
  If True, uses RMS normalization instead of layer normalization.
- **return_dropout_mask** (*bool*, *optional*, defaults to False) --
  If True, returns the dropout mask used for the computation.
- **out** (*torch.Tensor*, *optional*) --
  Output tensor for the normalized result. If *None*, a new tensor is allocated.
- **residual_out** (*torch.Tensor*, *optional*) --
  Output tensor for the residual result when using prenorm. If *None*, a new tensor
  is allocated when needed.

### Returns

**Type**: *torch.Tensor* or tuple of *torch.Tensor*

- The normalized input.
- The second normalization of the input if *weight1* is provided.
- The residual tensor if *prenorm* is set.
- The dropout mask if *return_dropout_mask* is set.
- The dropout mask for *x1* if *x1* is provided and *return_dropout_mask* is set.

## Layers

### Class `LlamaRMSNorm`

No documentation available.

#### Methods

##### Method `forward`

`(self, hidden_states: torch.Tensor) -> torch.Tensor`

No documentation available.