Welcome to my GitHub! I'm a Data guy (analytics/engineering/science) with a Master's in Advanced Data Analytics and a solid foundation in Data Analytics, Data Science, Data Engineering, MLOps, and Business Analytics. I'm passionate about building data-driven solutions that drive growth, innovation, and operational efficiency. My background spans data architecture, scalable ML pipelines, cloud computing, and actionable insights that help teams make strategic decisions.
- ⚡ Former Product Lead at Cirrus Nexus (Cumulus Nexus India Pvt Ltd)
- 👨‍💻 Experienced in Python, R, SQL, Rust, C++, Go, Terraform, and advanced ML frameworks like TensorFlow, PyTorch, and Scikit-Learn
- ☁️ Proficient in Cloud Platforms: AWS (SageMaker, Glue, Redshift, Lambda), Azure (Data Factory, Synapse, HDInsight, ML Studio), GCP (BigQuery, Looker, Vertex AI Platform); Certified in AWS, Azure, GCP, and Kubernetes
- 📊 Skilled in Data Engineering (ETL, Data Modeling, Real-Time Streaming), MLOps (CI/CD, Model Deployment), and Data Science (Predictive Modeling, NLP, Computer Vision)
- 💬 Advocate for Cloud Cost Optimization strategies, helping companies cut costs while improving performance through structured planning
- Data Engineering & Big Data Pipelines – Architecting and optimizing ETL pipelines for large-scale data processing with Apache Spark, Flink, Superset, Dagster, Druid, Delta Lake, dbt, Airflow, Snowflake, and Fivetran (a minimal Airflow sketch follows this list)
- MLOps Pipelines – Building end-to-end ML pipelines with Kubernetes, Docker, Jenkins, and Kubeflow to automate model training and deployment, with a focus on scalability and CI/CD workflows
- Generative AI & NLP Models – Developing cutting-edge models for NLP, including language models and sentiment analysis, using transformer architectures
- Cloud Infrastructure Optimization – Implementing efficient infrastructure using Terraform and IaC (Infrastructure as Code) to optimize cloud resources on AWS, Azure, and GCP
- Scaling Machine Learning Operations – Expanding knowledge in MLflow, Argo, and advanced MLOps for seamless deployment and monitoring of ML models
- Distributed Systems & Real-Time Analytics – Exploring Apache Flink, Kafka, and Delta Lake for real-time analytics and streaming solutions
- Advanced Data Engineering – Diving deeper into data warehouse and data lake architecture, leveraging platforms like Snowflake and Databricks
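
To give a concrete flavor of the pipeline work above, here is a minimal, hypothetical Airflow DAG sketch. The DAG name, task logic, and data are placeholders for illustration, not one of my production pipelines:

```python
# Minimal, hypothetical Airflow DAG: extract -> transform -> load.
# Names, data, and the transform logic are placeholders.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False, tags=["etl"])
def example_etl():
    @task
    def extract():
        # A real pipeline would pull from an API, S3, or a source database.
        return [{"order_id": 1, "amount": 120.0}, {"order_id": 2, "amount": 75.5}]

    @task
    def transform(rows):
        # Example transformation: drop invalid rows and add a derived field.
        return [{**r, "amount_usd": round(r["amount"], 2)} for r in rows if r["amount"] > 0]

    @task
    def load(rows):
        # A real load step would write to Snowflake, Redshift, or a Delta table.
        print(f"Loaded {len(rows)} rows")

    load(transform(extract()))


example_etl()
```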
- Tools & Platforms: Apache Spark, Kafka, Hadoop, Snowflake, Databricks, Apache Airflow, Fivetran, dbt
- Cloud & Big Data: AWS (Lambda, Glue, RDS, S3, EMR, Redshift), Azure Data Factory, Azure Databricks, Azure Synapse, GCP BigQuery, Snowflake
- Skills: Data Pipeline Design, ETL Optimization, Data Modeling, Real-Time Data Streaming
- Languages & Libraries: Python, R, Julia, Scala, Java, SQL, Scikit-Learn, TensorFlow, PyTorch, PySpark, Keras, Pandas, Dask
- Specializations: Predictive Modeling, Time Series, NLP, Deep Learning, Hyperparameter Tuning, Computer Vision
- MLOps Tools: Docker, Kubernetes, Jenkins, MLflow, Kubeflow, Argo, Terraform, GitHub Actions (a minimal MLflow tracking sketch follows this skills list)
- CI/CD & Automation: CI/CD Pipelines, Model Versioning, Model Deployment, Monitoring & Logging
- Visualization Tools: Power BI, Tableau, Plotly, Matplotlib, ggplot2
- Business Tools: JIRA, Confluence, Lucidchart, Microsoft Visio, Business Process Mapping, Requirements Analysis
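
As referenced in the MLOps tools above, here is a minimal MLflow experiment-tracking sketch, assuming a default local tracking setup. The experiment name, model, and synthetic data are illustrative only:

```python
# Minimal, illustrative MLflow tracking sketch: log params, metrics, and a model per run.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data for the demo.
X, y = make_classification(n_samples=1_000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

mlflow.set_experiment("demo-churn-baseline")  # hypothetical experiment name

with mlflow.start_run():
    params = {"C": 1.0, "max_iter": 500}
    model = LogisticRegression(**params).fit(X_train, y_train)

    mlflow.log_params(params)
    mlflow.log_metric("accuracy", accuracy_score(y_test, model.predict(X_test)))
    mlflow.sklearn.log_model(model, artifact_path="model")  # versioned artifact per run
```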
- Data Engineering & Cloud: AWS Cloud Data Engineer, Azure Data Engineer, Google Cloud Professional Data Engineer, SnowPro Core, Meta Database Engineer
- Machine Learning & Data Science: TensorFlow Developer, AWS Certified Machine Learning Specialty, IBM Data Science Professional
- MLOps & DevOps: Certified Kubernetes Administrator, Terraform Associate, Databricks Certified for Apache Spark
- Tools: R, SQL, Tableau, ETL
- Summary: Advanced to Round 2 among 400 teams by designing KPIs to track healthcare patient engagement, surfacing insights for targeted health-improvement initiatives.
- Tools: Kafka, AWS Lambda, Spark
- Summary: Built a real-time data streaming architecture to process and analyze data instantly, achieving 99.9% system availability and reducing latency for business-critical decisions.
- Tools: Python, Scikit-Learn, AWS
- Summary: Developed a predictive model with 86.2% accuracy to forecast customer churn, enabling proactive retention strategies and improving customer engagement (a hedged sketch of this kind of classifier appears after the project list).
- Tools: Python, Apache Airflow, AWS SageMaker
- Summary: Created an ML pipeline automating data preprocessing, model training, and deployment, reducing operational costs by 14% while maintaining high model performance.
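
As referenced in the churn project above, the sketch below shows the general shape of such a scikit-learn churn classifier. The synthetic customer table, features, and choice of gradient boosting are stand-ins for illustration, not the original project's dataset or the reported 86.2% model:

```python
# Minimal, illustrative churn-prediction sketch with scikit-learn.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical customer table: tenure, monthly spend, contract type, churn label.
df = pd.DataFrame({
    "tenure_months": [1, 24, 5, 60, 3, 36, 12, 48],
    "monthly_spend": [70.0, 45.5, 89.9, 30.0, 99.0, 55.0, 65.0, 40.0],
    "contract": ["monthly", "annual", "monthly", "annual", "monthly", "annual", "monthly", "annual"],
    "churned": [1, 0, 1, 0, 1, 0, 0, 0],
})

X, y = df.drop(columns="churned"), df["churned"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Preprocess numeric and categorical features, then fit a boosted-tree classifier.
model = Pipeline([
    ("prep", ColumnTransformer([
        ("num", StandardScaler(), ["tenure_months", "monthly_spend"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["contract"]),
    ])),
    ("clf", GradientBoostingClassifier(random_state=42)),
])

model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```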
- 📫 Email: [email protected]
- 💼 LinkedIn: linkedin.com/in/chaitanyavankadaru
- 📝 Blog: Coming soon, where I'll share insights on data engineering, MLOps, and AI-driven strategies!
- ☕ Tea over Coffee! Extra fuel for complex problem-solving.
- 🎲 Avid puzzle solver and lover of challenging data problems.
- 💾 I enjoy exploring the latest in Generative AI and contributing to open-source projects.
Thanks for stopping by my profile! Feel free to explore my repos, and let's collaborate if you share similar interests or need insights on cloud and AI solutions.