Description
- Data Science with Python — core-to-advanced skills
- Role summary — Build end‑to‑end analytics and ML solutions using Python ecosystems for data ingestion, exploration, modeling, and deployment.
- Python fundamentals — Strong mastery of language features, OOP, functional patterns, generators, context managers, and performance profiling.
- Data manipulation — Expert use of Pandas and NumPy for cleaning, reshaping, joins, group‑bys, window ops, and memory‑efficient workflows.
- Data ingestion — Connectors and patterns for CSV/JSON, SQL, NoSQL, APIs, cloud object stores, and streaming sources with robust error handling.
- Exploratory analysis — Statistical summaries, hypothesis testing, correlation analysis, and interactive visualization with Matplotlib, Seaborn, and Plotly.
- Feature engineering — Encoding, scaling, time‑series features, embeddings, target encoding, and automated feature stores for reproducible pipelines.
- Machine learning basics — Supervised and unsupervised modeling with scikit‑learn, cross‑validation, hyperparameter tuning, and model evaluation metrics.
- Advanced modeling — Deep learning with TensorFlow/PyTorch, transfer learning, sequence models, and custom training loops for production workloads.
- Model explainability — SHAP/LIME, partial dependence, and counterfactual analysis to validate and communicate model behavior.
- MLOps and deployment — Containerized models, REST/gRPC serving, model registries, A/B testing, monitoring, and rollback strategies.
- Scalable compute — Parallelism with Dask, Spark (PySpark), GPU acceleration, and distributed training patterns for large datasets.
- Time series and forecasting — Stationarity checks, ARIMA/Prophet, state‑space models, and deep learning approaches for irregular and multivariate series.
- Probabilistic modeling — Bayesian inference, PyMC, uncertainty quantification, and decision‑aware modeling for high‑risk domains.
- Data quality and governance — Validation frameworks, lineage, metadata, privacy-preserving techniques, and reproducible notebooks/workflows.
- Production readiness — Logging, observability, drift detection, retraining pipelines, and cost/performance optimization for sustained SLAs.
- Senior expectations — Lead experiment design, own MLOps, optimize pipelines, and mentor cross‑functional teams on model lifecycle.
- Lead expectations — Architect data science strategy, enforce governance, align models to business KPIs, and drive platform‑level decisions.




