Vision Transformer model that outperforms CNN models while using just one quarter of the target data.

Technologies:

  • C/C++, Python
  • Linux Ubuntu
  • Zero-Shot Transformer Model
  • CLIP
  • Open Vocabulary Detection and Classification without prior training
  • Areas of Interest and Object Detection
  • TensorRT

Others:

  • GenAI
  • AI Model Selection
  • Linux Configuration
  • Containerization (Docker)