How to Design Tensor Pipelines for Deep Learning Using Einops with Perception, Attention, and Multimodal Models
section("6) pack unpack")

B, Cemb = 2, 128
class_token = torch.randn(B, 1, Cemb, device=device)
image_tokens = torch.randn(B, 196, Cemb, device=device)
text_tokens = torch.randn(B, 32, Cemb, device=device)

show_shape("class_token", class_token)
show_shape("image_tokens", image_tokens)
show_shape("text_tokens", text_tokens)

packed, ps = pack([class_token, image_tokens, text_tokens], "b * c")
show_shape("packed", packed)
print("packed_shapes (ps):", ps)

mixer = nn.Sequential(
    nn.LayerNorm(Cemb),
    nn.Linear(Cemb, 4 * Cemb),
    nn.GELU(),
    …