Well-versed in digging through data to find key insights and curating a compelling story from complex analyses; passionate about delving into data from different systems, at different timescales, and in complex formats to uncover hidden relationships.
Machine Learning with Spark: Linear / Logistic Regression, Decision Trees, NaiveBayes, Alternating Least Squares (Recommender Systems), TF-IDF, Frequent Pattern Mining (a small ALS sketch follows at the end of this post)
Professional Background (formerly): ETL Developer / Traditional DWHs / Kimball's Methodology
Computer Science Skills / Core: Data Structures, Algorithms, Functional Programming Paradigm, Relational Databases
Big Data / Core Skill: Apache Spark
Big Data / Core Skill: Apache Cassandra (Data Modeling)
Big Data / Core Skill: Graph Modeling / Algorithms / Queries (with Spark GraphFrames and Neo4J)
Big Data / Other: Apache Kafka (incl. KafkaConnect) => Spark Streaming from Kafka topics (see the sketch after this list)
Big Data / Other: ElasticSearch, RedShift
Source Control: GitHub
Source Control / Other: BitBucket
DevOps / Other: Docker / DockerHub
Programming Languages / Core: Scala, Python
Programming Language / Other: Haskell
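To make the "Spark Streaming from Kafka topics" item above concrete, here is a minimal Structured Streaming sketch in Scala. The broker address and topic name are placeholder assumptions, the spark-sql-kafka connector has to be on the classpath, and the same thing could equally be done with the older DStream-based API.

    // Paste into spark-shell started with the Kafka connector, e.g.:
    //   spark-shell --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.5.0
    // (match the artifact to your Spark / Scala version)

    // Subscribe to a Kafka topic; broker and topic name below are placeholders.
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "events")
      .load()

    // Kafka delivers key/value as binary; cast to strings and count messages per key.
    val counts = raw
      .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value")
      .groupBy($"key")
      .count()

    // Stream the running counts to the console and block until terminated.
    val query = counts.writeStream
      .outputMode("complete")
      .format("console")
      .start()

    query.awaitTermination()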
Keen interest in experimenting with open-source Big Data technologies.
E-mail address in the profile.
I'm currently doing work through the freelancing site upwork.com; if you feel more comfortable "trying before you buy", we can start with a limited-scope, fixed-budget project there.
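And, to illustrate the Alternating Least Squares item from the machine-learning list at the top of this post, a minimal recommender sketch with Spark ML in Scala; the toy ratings and hyperparameters are made up for illustration only.

    // Paste into spark-shell; the ratings below are a toy stand-in for real data.
    import org.apache.spark.ml.recommendation.ALS

    // Explicit ratings: (userId, itemId, rating).
    val ratings = Seq(
      (0, 10, 4.0), (0, 11, 1.0),
      (1, 10, 5.0), (1, 12, 2.0),
      (2, 11, 3.0), (2, 12, 4.0)
    ).toDF("userId", "itemId", "rating")

    val als = new ALS()
      .setUserCol("userId")
      .setItemCol("itemId")
      .setRatingCol("rating")
      .setRank(8)                     // size of the latent factor vectors
      .setMaxIter(10)
      .setRegParam(0.1)
      .setColdStartStrategy("drop")   // drop NaN predictions for unseen users/items

    val model = als.fit(ratings)

    // Top-3 item recommendations per user.
    model.recommendForAllUsers(3).show(truncate = false)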
I have 5+ years of Big Data and Data Science experience. I'm a Databricks Certified Apache Spark Developer, MapR Certified Hadoop Developer, Cloudera Certified Hadoop and Spark Developer, Cloudera Certified Hadoop Administrator, and DataStax Certified Apache Cassandra Developer, and I have solid experience working with US clients.
My profile: https://in.linkedin.com/in/sandishkumar
SEEKING WORK, Primarily Remote (based in Eindhoven, NL)
I am a strong data engineer who is passionate about large-scale distributed systems and streaming pipelines, and who cares about producing clean, elegant, maintainable, robust, well-tested Scala / Spark code.
Well-rounded Scala data engineer with deep knowledge of the internals of distributed datastores. Solid experience working remotely and with geographically distributed teams. I typically work Pacific Time hours.
Core Skills:
- Cassandra (data modeling, troubleshooting performance and operational issues)
- Stream processing at scale: Kafka, Flink, Spark Streaming
- Custom-crafted TF-IDF / embeddings, vector-based semantic search, deep intent recognition in search-engine queries (a minimal TF-IDF sketch follows below)
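Since the last bullet is the most concrete claim in the list, here is a minimal "custom-crafted" TF-IDF plus cosine-similarity ranking sketch in plain Scala. The toy corpus and query are assumptions; a production version would handle tokenization, stemming, and scale (e.g. on Spark) very differently.

    // Tiny TF-IDF + cosine-similarity ranking, runnable with plain `scala`.
    object TfIdfSketch {
      def tokenize(text: String): Seq[String] =
        text.toLowerCase.split("\\W+").filter(_.nonEmpty).toSeq

      def main(args: Array[String]): Unit = {
        // Toy corpus; real documents would come from a datastore or search index.
        val corpus = Seq(
          "kafka streams power realtime pipelines",
          "cassandra data modeling for time series",
          "semantic search with embeddings and tf idf"
        )
        val docs = corpus.map(tokenize)
        val n = docs.size.toDouble

        // Smoothed inverse document frequency per term.
        val idf: Map[String, Double] = docs.flatMap(_.distinct).groupBy(identity)
          .map { case (term, occ) => term -> math.log((n + 1.0) / (occ.size + 1.0)) }

        // TF-IDF weights for one document, as a sparse term -> weight map.
        def vectorize(tokens: Seq[String]): Map[String, Double] = {
          val tf = tokens.groupBy(identity).map { case (t, ts) => t -> ts.size.toDouble / tokens.size }
          tf.map { case (t, f) => t -> f * idf.getOrElse(t, math.log(n + 1.0)) }
        }

        // Cosine similarity between two sparse vectors.
        def cosine(a: Map[String, Double], b: Map[String, Double]): Double = {
          val dot = a.keySet.intersect(b.keySet).toSeq.map(k => a(k) * b(k)).sum
          def norm(m: Map[String, Double]) = math.sqrt(m.values.map(v => v * v).sum)
          if (norm(a) == 0 || norm(b) == 0) 0.0 else dot / (norm(a) * norm(b))
        }

        // Rank documents against a query by similarity.
        val docVecs = docs.map(vectorize)
        val query = vectorize(tokenize("tf idf semantic search"))
        corpus.zip(docVecs.map(cosine(query, _)))
          .sortBy(-_._2)
          .foreach { case (doc, score) => println(f"$score%.3f  $doc") }
      }
    }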
I have experience designing and implementing data processing systems in the Hadoop ecosystem, as well as developing algorithms for distributed systems in a telecommunications environment.
Experienced Data Scientist.
Keywords: Apache Spark, scaling algorithms.