About

Bio: A Data Scientist with a strong background in Data Engineering.

Through rigorous coursework, I have deepened my understanding of Statistics, Experimental Design, Machine Learning, and Natural Language Processing. I have completed several projects that combine Data Engineering and Data Science skills to answer questions with data.

I consider myself a hybrid of a Data Engineer and a Data Scientist.

Data Engineering skills and tools

  • Languages: Python, SQL, Java
  • Databases: Postgres, MySQL, Oracle, AWS RDS, DynamoDB, Redshift
  • Modeling: Dimensional data modeling
  • Batch ETL: Python, SparkSQL
  • Streaming ETL: Kafka, Faust, KSQL, Spark Streaming
  • Workflow Management: Airflow

Data Engineering Projects

Data Science skills and tools

  • Languages: Python, R, SQL, MATLAB, SAS
  • Deep Learning: PyTorch, TensorFlow
  • Machine Learning: sklearn, pandas, numpy
  • NLP: spaCy, nltk, pattern, gensim
  • Deployment: AWS SageMaker, Docker, Kubernetes
  • Visualization: Plotly, D3.js, p5.js, Tableau

NLP Projects | Machine Learning Projects | Deep Learning Projects

Other skills and tools

  • Cloud: AWS, OpenStack, Pivotal Cloud Foundry
  • CI/CD: Concourse, Jenkins, CircleCI
  • IaC: Boto3, Terraform

Certifications

  • AWS Certified Developer - Associate (Expires: July 2023)

Publications