Role overview
Hidonix is seeking an AI / Machine Learning Engineer to help design and implement intelligent systems that extract meaning and predictive value from computer vision and behavioral datasets. This is a junior-level, in-person role suited for candidates with 2–3 years of experience and a solid foundation in deep learning, embeddings, and modern neural architectures.
As a member of the AI team, you will work on projects that leverage CNNs, transformer models, and embedding architectures to encode and reason over pose, facial, and action-based visual data. These systems support downstream tasks such as future action prediction, semantic matching, and similarity-based inference.
What you'll work on
- Design and implement machine learning pipelines that encode visual input (pose, face, object/classification) into shared embedding spaces for similarity and predictive tasks
- Build and fine-tune convolutional and transformer-based neural architectures optimized for visual recognition and representation learning
- Develop encoding and embedding techniques that allow consistent comparison across multiple data types (e.g., pose vectors, facial landmarks, class labels)
- Apply techniques such as cosine similarity, distance metrics, and latent clustering to perform behavioral inference and action prediction
- Contribute to model training, evaluation, and deployment workflows including data preprocessing, augmentation, hyperparameter tuning, and performance profiling
- Collaborate closely with engineers in computer vision, embedded systems, software, and UI/UX to ensure seamless integration of AI pipelines into real-time systems
- Produce clean, well-documented code and maintain version-controlled model artifacts and experiment logs
- Write technical documentation for models, training procedures, evaluation criteria, and system integration
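To give a flavor of the similarity-based inference mentioned above, here is a minimal sketch of comparing embeddings with cosine similarity. The embedding dimension, variable names, and random vectors are illustrative assumptions, not part of Hidonix's actual pipeline:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical pose/face embeddings already projected into a shared space
rng = np.random.default_rng(0)
query = rng.normal(size=128)            # embedding of the observed pose
candidates = rng.normal(size=(10, 128))  # embeddings of known actions

# Rank candidate actions by similarity to the query embedding
scores = np.array([cosine_similarity(query, c) for c in candidates])
best = int(np.argmax(scores))            # index of the closest match
```

In practice the same ranking step applies whether the vectors come from pose estimators, facial landmarks, or class-label embeddings, which is what makes a shared embedding space useful for cross-modal comparison.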
What we're looking for
- Experience integrating vision-based AI models into embedded or robotics systems
- Familiarity with ONNX or TensorRT for model optimization and deployment
- Background in sequence modeling, recurrent architectures, or video-based action recognition
- Exposure to multimodal AI systems that blend image, pose, and metadata representations
- Familiarity with techniques like CLIP, DINO, or self-supervised representation learning
- Experience with MLOps or training orchestration tools such as MLflow, Weights & Biases, or DVC