Role overview
Job Description:
We are seeking a highly skilled and innovative Data Scientist to join a dynamic analytics team. In this role, you will build, train, and deploy large-scale, self-supervised "foundation" models that learn rich representations of time series and sequential sensor data, as well as text and vision data. These models are then fine-tuned for tasks such as anomaly/event detection, predictive maintenance, forecasting, classification, and multi-modal sensor fusion in industrial and scientific applications.
Details:
- $130k - $210k per year salary (plus stock options and other incentives)
- Full-time
- Hybrid in Houston
What you'll work on
- Design, develop, and optimize machine learning models for time series and multi-modal data
- Process, clean, augment, and engineer features from large-scale sequential and sensor datasets
- Build and integrate multi-modal deep learning architectures for heterogeneous data sources
- Develop and implement self-supervised and semi-supervised learning approaches
- Fine-tune and adapt foundation models for domain-specific applications
- Evaluate model performance using appropriate statistical, time-series, and business metrics
- Implement signal processing techniques for noise reduction, alignment, and feature extraction
- Develop scalable data pipelines for ingesting and synchronizing multi-sensor data
- Train and deploy models in distributed, multi-GPU environments
- Optimize model performance through hyperparameter tuning and architectural improvements
- Collaborate with cross-functional teams to translate domain requirements into technical solutions
- Communicate model behavior, interpretability insights, and performance results to stakeholders
What we're looking for
- MS or Ph.D. in Computer Science, Data Science, AI, or related field
- 3+ years of experience in machine learning, AI, or data science
- Strong experience with time series, sequential, and multi-sensor data
- Expertise in signal processing (Fourier/wavelet analysis, filtering, noise modeling)
- Experience with multi-modal learning (time series, images, text, audio)
- Proficiency in deep learning architectures (RNN/LSTM/GRU, CNNs, Transformers, GNNs, generative models)
- Experience with self-/semi-supervised learning and transfer learning
- Strong model evaluation skills (MSE, F1, AUC, DTW, IoU)
- Expert Python; experience with PyTorch, TensorFlow, or JAX; C++/CUDA a plus
- Experience with distributed, large-scale model training
- Solid foundation in linear algebra, probability, statistics, and optimization
- Strong collaboration and communication skills