ttnn.copy_host_to_device_tensor

ttnn.copy_host_to_device_tensor(host_tensor: ttnn.Tensor, device_tensor: ttnn.Tensor, cq_id: ttnn.QueueId | None = None) -> None

Copies a tensor from host to device.

Parameters:
  • host_tensor (ttnn.Tensor) – the tensor to be copied from host to device.

  • device_tensor (ttnn.Tensor) – the tensor to be copied to.

  • cq_id (ttnn.QueueId, optional) – the command queue id to use for the copy. Defaults to None.

Note

This operation supports tensors with the following data types and layouts:

host/device tensor dtype, by layout:

  • TILE – BFLOAT16, BFLOAT8_B, BFLOAT4_B, FLOAT32, UINT32, INT32, UINT16, UINT8

  • ROW_MAJOR – BFLOAT16, FLOAT32, UINT32, INT32, UINT16, UINT8

Memory Support:
  • Interleaved: DRAM and L1

  • Height, Width, Block, and ND Sharded: DRAM and L1

Limitations:
  • Host and Device tensors must be the same shape, have the same datatype, and have the same data layout (ROW_MAJOR or TILE).
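The compatibility rule above can be sketched as a plain-Python check. This is a hypothetical helper for illustration only, not part of the ttnn API; it models tensors as dicts of their relevant attributes:

```python
# Hypothetical helper, not part of ttnn: illustrates the documented
# precondition that the host and device tensors must agree on shape,
# dtype, and layout before copy_host_to_device_tensor is called.
def tensors_copy_compatible(host, device):
    return (host["shape"] == device["shape"]
            and host["dtype"] == device["dtype"]
            and host["layout"] == device["layout"])

host = {"shape": (10, 64, 32), "dtype": "bfloat16", "layout": "ROW_MAJOR"}
dev_ok = dict(host)                      # same shape, dtype, and layout
dev_bad = {**host, "layout": "TILE"}     # layout mismatch: copy would be rejected
```

With these inputs, `tensors_copy_compatible(host, dev_ok)` holds while `tensors_copy_compatible(host, dev_bad)` does not.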

Example

# Create a host tensor and copy it to a pre-allocated device tensor
dtype = ttnn.bfloat16
layout = ttnn.ROW_MAJOR_LAYOUT

tensor = ttnn.rand((10, 64, 32), device=device, dtype=dtype, layout=layout)
host_tensor = ttnn.from_device(tensor)
device_tensor_copy = ttnn.allocate_tensor_on_device(host_tensor.spec, device)
ttnn.copy_host_to_device_tensor(host_tensor, device_tensor_copy)

logger.info(f"TT-NN tensor shape after copying to device: {device_tensor_copy.shape}")