A Hands-On Coding Tutorial on Qualcomm AI Hub Models for Classification, Object Detection, and Hardware Deployment

In this tutorial, we walk through an end-to-end workflow with Qualcomm AI Hub Models. We start by installing the required package, listing the available model collection, and loading MobileNet-V2 for local PyTorch inference. We also handle the important issue of input layout by converting NHWC image tensors into the NCHW format the model expects. From there, we run inference on both the model's built-in sample input and a real image, inspect the top predictions, run the official Qualcomm AI Hub CLI demo, and extend the workflow with a YOLOv7 object detection example. Finally, we include an optional cloud-device section in which we compile, profile, and run the model on a real Qualcomm-powered device if an API token is available.

import subprocess, sys, os, glob, textwrap, traceback
import numpy as np, torch
from PIL import Image
import matplotlib.pyplot as plt
def pip_install(*pkgs):
   subprocess.run([sys.executable, "-m", "pip", "install", "-q", *pkgs], check=True)
pip_install("qai_hub_models")
OUT_DIR = "/content/qaihm_out"; os.makedirs(OUT_DIR, exist_ok=True)
torch.set_grad_enabled(False)
def to_nchw(value):
   arr = value[0] if isinstance(value, (list, tuple)) else value
   t = torch.from_numpy(np.asarray(arr, dtype=np.float32))
   if t.ndim == 3:
       t = t.unsqueeze(0)
   if t.ndim == 4 and t.shape[1] != 3 and t.shape[-1] == 3:
       t = t.permute(0, 3, 1, 2).contiguous()
   return t

We start by importing the libraries and defining a small helper that installs packages from within Colab. We install qai_hub_models, create an output directory, and disable gradient tracking because we only run inference. We also define a to_nchw() function that converts an input image tensor to the channels-first (NCHW) layout the model expects: it adds a batch dimension to 3-D inputs and permutes the axes when the channel dimension comes last.
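To make the layout change concrete, here is a minimal pure-Python sketch of the same NHWC to NCHW reordering on nested lists. The function name nhwc_to_nchw and the toy 2x2 image are illustrative assumptions, not part of the tutorial code, which uses torch's permute() instead.

```python
# Pure-Python sketch of the NHWC -> NCHW reordering (illustrative only).
def nhwc_to_nchw(batch):
    """Reorder nested lists from [N][H][W][C] to [N][C][H][W]."""
    n_imgs = len(batch)
    height = len(batch[0])
    width = len(batch[0][0])
    channels = len(batch[0][0][0])
    return [
        [
            [[batch[n][h][w][c] for w in range(width)] for h in range(height)]
            for c in range(channels)
        ]
        for n in range(n_imgs)
    ]

# One hypothetical 2x2 RGB image; each pixel is [R, G, B].
nhwc = [[
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]]
nchw = nhwc_to_nchw(nhwc)
print(nchw[0][0])  # red plane: [[1, 4], [7, 10]]
```

Each output element satisfies out[n][c][h][w] == in[n][h][w][c], which is the same index mapping that permute(0, 3, 1, 2) applies to a tensor.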

import pkgutil, qai_hub_models.models as _m
model_ids = sorted(n for _, n, p in pkgutil.iter_modules(_m.__path__)
                  if p and not n.startswith("_"))
print(f">>> {len(model_ids)} models available. First 40:\n")
print(textwrap.fill(", ".join(model_ids[:40]), 100), "\n")
from qai_hub_models.models.mobilenet_v2 import Model as MobileNetV2
model = MobileNetV2.from_pretrained().eval()
spec = model.get_input_spec()
input_name = list(spec.keys())[0]
print(">>> Input:", input_name, spec[input_name].shape, spec[input_name].dtype)
from torchvision.models import MobileNet_V2_Weights
IMAGENET_CLASSES = MobileNet_V2_Weights.IMAGENET1K_V1.meta["categories"]
def top5(logits):
   if logits.ndim == 1: logits = logits.unsqueeze(0)
   probs = torch.softmax(logits, dim=1)[0]
   conf, idx = probs.topk(5)
   return [(IMAGENET_CLASSES[i], float(c)) for c, i in zip(conf, idx)]

We enumerate the Qualcomm AI Hub model packages and print the first 40 model IDs to see what is available. We then load the pretrained MobileNet-V2 model, read its input specification, and identify the input name. We also load the ImageNet class labels from torchvision and define a top5() function that converts model logits into readable top-5 predictions.
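For readers who want to see what top5() does numerically, the following is a rough standard-library sketch of softmax followed by top-k selection. The helper names softmax and top_k, and the toy labels and logits, are assumptions made up for illustration; the tutorial itself relies on torch.softmax and Tensor.topk.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(logits, labels, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(labels[i], probs[i]) for i in ranked[:k]]

# Hypothetical logits and labels, for illustration only.
toy_labels = ["cat", "dog", "fox", "owl"]
toy_logits = [2.0, 0.5, 3.0, -1.0]
toy_preds = top_k(toy_logits, toy_labels, k=3)
for label, prob in toy_preds:
    print(f"{prob:6.2%}  {label}")
```

Softmax preserves the ordering of the logits, so the ranking is the same as sorting the raw logits; the conversion matters only for reporting confidences as percentages.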

sample = model.sample_inputs()
x = to_nchw(sample[input_name])
print(">>> fed tensor shape:", tuple(x.shape))
print("\n>>> Top-5 for the built-in sample input:")
for label, conf in top5(model(x)):
   print(f"    {conf:6.2%}  {label}")
from torchvision import transforms
preprocess = transforms.Compose([
   transforms.Resize(256), transforms.CenterCrop(224), transforms.ToTensor(),
])
img = None
try:
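   import urllib.request
   p = os.path.join(OUT_DIR, "input.jpg")
   # NOTE: the image URL is missing from the source text; set IMAGE_URL before running.
   # With an empty URL the download fails and this section is skipped.
   IMAGE_URL = ""
   urllib.request.urlretrieve(IMAGE_URL, p)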
   img = Image.open(p).convert("RGB")
except Exception as e:
   print(">>> photo download skipped:", e)
if img is not None:
   preds = top5(model(preprocess(img).unsqueeze(0)))
   print("\n>>> Top-5 for the downloaded photo:")
   for label, conf in preds: print(f"    {conf:6.2%}  {label}")
   plt.figure(figsize=(5,5)); plt.imshow(img); plt.axis("off")
   plt.title(f"{preds[0][0]}  ({preds[0][1]:.1%})"); plt.show()

We first run inference on the model's built-in sample input, using to_nchw() to adjust the tensor layout before passing it to MobileNet-V2. We then download a real photo, preprocess it with the standard resize, center-crop, and tensor-conversion steps, and run another prediction. Finally, we display the image with its top predicted label as the title, linking the model output to the input image.
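The geometry behind Resize(256) followed by CenterCrop(224) can be sketched without any imaging library. The helpers below are hypothetical names, and the rounding follows our understanding of torchvision's behavior (truncation for the resized long side, round-half-even for the crop offset); exact pixel values may differ between versions.

```python
def resize_shorter_side(width, height, target=256):
    """Scale so the shorter side equals `target`, keeping the aspect ratio."""
    if width <= height:
        return target, int(target * height / width)
    return int(target * width / height), target

def center_crop_box(width, height, crop=224):
    """Return (left, top, right, bottom) of a centered crop x crop window."""
    left = int(round((width - crop) / 2.0))
    top = int(round((height - crop) / 2.0))
    return left, top, left + crop, top + crop

# A hypothetical 640x480 photo.
w, h = resize_shorter_side(640, 480)
box = center_crop_box(w, h)
print((w, h), box)
```

The crop always yields a 224x224 window regardless of the original aspect ratio, which is why the tensor passed to the model has a fixed spatial size.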

def run_demo(module, extra=None, timeout=900):
   cmd = [sys.executable, "-m", module, "--eval-mode", "fp",
          "--output-dir", OUT_DIR] + (extra or [])
   print(f"\n>>> {' '.join(cmd)}")
   try:
       r = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
       print("\n".join((r.stdout + r.stderr).strip().splitlines()[-25:]))
   except Exception as e:
       print(">>> demo skipped:", e)
run_demo("qai_hub_models.models.mobilenet_v2.demo")
try:
   pip_install("qai_hub_models[yolov7]")
   run_demo("qai_hub_models.models.yolov7.demo")
   imgs = sorted(glob.glob(OUT_DIR + "/*.png") + glob.glob(OUT_DIR + "/*.jpg"),
                 key=os.path.getmtime)
   if imgs:
       plt.figure(figsize=(9,9)); plt.imshow(Image.open(imgs[-1]).convert("RGB"))
       plt.axis("off"); plt.title("YOLOv7 detections"); plt.show()
   else:
       print(">>> no output image found (results may have printed instead).")
except Exception:
   print(">>> YOLOv7 section skipped:\n", traceback.format_exc())

We define a run_demo() helper that runs the official Qualcomm AI Hub model demos from the command line and prints the last 25 lines of their output. We use it to run the MobileNet-V2 demo, then install the YOLOv7 extra for object detection. We run the YOLOv7 demo, search the output directory for the most recently modified image, and display the detections if an image was produced.
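The capture-and-tail pattern used by run_demo() can be tried in isolation with a harmless child process. This is a minimal sketch: run_and_tail is a hypothetical helper, and the child command is a stand-in that prints five lines, not a Qualcomm demo.

```python
import subprocess
import sys

def run_and_tail(args, tail=3, timeout=60):
    """Run a command, merge stdout and stderr, and return the last lines."""
    result = subprocess.run(args, capture_output=True, text=True, timeout=timeout)
    lines = (result.stdout + result.stderr).strip().splitlines()
    return result.returncode, lines[-tail:]

# A stand-in child process that prints five lines.
code, last_lines = run_and_tail(
    [sys.executable, "-c", "for i in range(5): print('line', i)"]
)
print(code, last_lines)
```

Unlike run_demo(), this variant also returns the exit code, which can help distinguish a demo that failed from one that simply printed little output.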

try:
   import qai_hub as hub
   devices = hub.get_devices()
   print(f"\n>>> Authenticated. {len(devices)} cloud devices available.")
   device = hub.Device("Samsung Galaxy S24 (Family)")
   sample = model.sample_inputs()
   nchw = to_nchw(sample[input_name])
   traced = torch.jit.trace(model, [nchw])
   cloud_inputs = {input_name: [nchw.numpy()]}
   cj = hub.submit_compile_job(model=traced, device=device,
                               input_specs=model.get_input_spec(),
                               options="--target_runtime tflite")
   target = cj.get_target_model(); print(">>> compiled:", cj.url)
   pj = hub.submit_profile_job(model=target, device=device); print(">>> profiling:", pj.url)
   ij = hub.submit_inference_job(model=target, device=device, inputs=cloud_inputs)
   out = ij.download_output_data()
   dev_logits = torch.from_numpy(np.asarray(list(out.values())[0][0]))
   print(">>> Top-5 from the REAL device:")
   for label, conf in top5(dev_logits): print(f"    {conf:6.2%}  {label}")
   target.download(os.path.join(OUT_DIR, "mobilenet_v2.tflite"))
   print(">>> saved compiled .tflite to", OUT_DIR)
except Exception as e:
   print("\n>>> Cloud (on-device) section skipped: no API token configured.")
   print("    Get one at workbench.aihub.qualcomm.com, then:")
   print("    !qai-hub configure --api_token YOUR_TOKEN")
   print("    detail:", (str(e).splitlines() or [type(e).__name__])[0])
print("\n>>> Tutorial complete. Outputs in:", OUT_DIR)

We include a Qualcomm AI Hub cloud workflow that runs only when an API token is configured. We list the available cloud devices, trace the PyTorch model, compile it to TFLite, profile it on the target device, and submit an inference job. We then download the device output, print the top predictions, save the compiled TFLite model, and finish by showing where all the tutorial outputs are stored.
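One natural follow-up, not part of the original notebook, is to check that the on-device result agrees with the local PyTorch result. The sketch below assumes both outputs have been flattened to plain lists of floats; compare_outputs and the sample values are hypothetical.

```python
def compare_outputs(local_logits, device_logits):
    """Return (top1_match, max_abs_diff) for two equal-length logit lists."""
    if len(local_logits) != len(device_logits):
        raise ValueError("logit vectors must have the same length")
    top_local = max(range(len(local_logits)), key=lambda i: local_logits[i])
    top_device = max(range(len(device_logits)), key=lambda i: device_logits[i])
    max_diff = max(abs(a - b) for a, b in zip(local_logits, device_logits))
    return top_local == top_device, max_diff

# Hypothetical values standing in for model(x) and the downloaded device output.
local = [0.10, 2.50, -1.00, 0.70]
device = [0.12, 2.46, -1.03, 0.71]
match, diff = compare_outputs(local, device)
print(f"top-1 match: {match}, max abs diff: {diff:.3f}")
```

Small numerical differences are expected after compilation to a different runtime, so agreement on the top-1 class is usually a more meaningful check than exact equality.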

In conclusion, we have a fully functional implementation of Qualcomm AI Hub Models within Colab. We learned how to load pretrained models, prepare inputs correctly, run local inference, visualize classification and detection results, and use the official demos as a reproducible reference point. We also saw how the same model can move from local PyTorch inference to Qualcomm's cloud-device pipeline for compilation, profiling, and on-device inference. Together, these steps provide a path from simple experimentation to hardware-aware deployment with the Qualcomm AI Hub.



The post A Hands-On Coding Tutorial on Qualcomm AI Hub Models for Classification, Object Detection, and Hardware Deployment appeared first on MarkTechPost.
