Primary Skills

Expert
  • Developing Java Classes: 15
  • NoSQL Database: 9
  • Spark: 2
  • Storm: 6
  • Kafka: 8
  • Spring: 10
  • Maven: 12
  • REST: 15
  • HTTP: 15
  • HDFS: 8
  • MapReduce: 8
  • Hadoop: 8

Developer
  • Using threads: 12
  • SQL: 10
  • Writing SQL Statements: 10
  • Implementing Data Access Classes: 10
  • Creating ERD: 12
  • Search (Solr/Lucene, ElasticSearch)
  • Linux OS - desktop (end user) / server (administration) experience: 10
  • Spring Data: 3
  • EJB / EJB3: 15
  • JDBC: 15
  • Scala: 4
  • Linux OS - power user: 10
  • JBoss: 15
  • Tomcat: 15
  • MySQL: 10
  • ANT: 15
  • Shell Script: 15
  • Multi-threading: 15
  • AWS: 1
  • Couchbase: 2
  • Redis: 3
  • MongoDB: 3
  • ElasticSearch: 4
  • Functional: 4
  • Spark Streaming: 2
  • Spark ML: 1

Professional experience

Big Data Architect
Feedvisor
  • Mission: to build a new, scalable data architecture.
  • As the company expanded its customer base, data collection had to scale accordingly.
  • Added the Apache Spark ecosystem and introduced it into several of the company's data pipelines.
  • Built the complete architecture for the big data solution.
  • The challenge was to define the data architecture, choose the technical Big Data solution, and then implement it.
  • Worked on the AWS stack with S3, EC2, and Spark on EMR.
  • Languages: Java, Scala
Big Data Architect
Playtika
  • Mission: to introduce Apache Spark into the company in order to scale out the Data Platform.
  • The project collected data from a large number of clients and had to incorporate it into one uniform data lake.
  • Integrated Apache Spark into the company's Big Data life-cycle.
  • The challenge was to scale out the Data Platform with new, scalable technologies; helped make the decision to adopt Spark for the data pipeline and then implemented it.
  • Technologies: Spark, Kafka, Hadoop, Hive, Couchbase.
  • Languages: Java, Scala
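The Kafka-based pipeline above is proprietary, but the underlying pattern can be sketched with nothing but the JDK: a bounded queue stands in for a topic, decoupling a producer thread from a consumer thread. Class and method names below are illustrative, not from the actual project:

```java
import java.util.*;
import java.util.concurrent.*;

public class MiniPipeline {
    // Toy stand-in for a broker topic: a bounded queue decoupling producer and consumer.
    static List<String> run() throws InterruptedException {
        BlockingQueue<String> topic = new ArrayBlockingQueue<>(100);
        List<String> sink = Collections.synchronizedList(new ArrayList<>());

        Thread producer = new Thread(() -> {
            for (int i = 0; i < 5; i++) topic.add("event-" + i); // publish events
        });
        Thread consumer = new Thread(() -> {
            try {
                // take() blocks until an event is available, then transform and store it
                for (int i = 0; i < 5; i++) sink.add(topic.take().toUpperCase());
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        producer.start(); consumer.start();
        producer.join(); consumer.join();
        return sink;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run()); // [EVENT-0, EVENT-1, EVENT-2, EVENT-3, EVENT-4]
    }
}
```

A real broker adds persistence, partitioning, and consumer groups on top of this producer/consumer decoupling; the sketch only shows the shape of the data flow.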
Java & Big Data Architect
Fortscale
  • Mission: to assess the company's Big Data technologies and improve performance.
  • Security events are collected into Big Data platforms.
  • The task was to validate the technologies and help decide what to use and how.
  • Technologies: Cloudera distribution (Hadoop, Hive, Impala, Cloudera Manager); also MongoDB and Apache Samza for streaming.
  • Helped implement Java development best practices: Spring Inversion of Control, automation, and unit-testing methodologies. Also helped introduce the Apache Drools rule engine.
  • Languages: Java, JavaScript

LATEST ARTICLES

by rans

Agile is not as "Hot" as it used to be - true or false? Recently, I have encountered a lot of negative attitude toward Agile Development. In many organizations where I was consulting, people use Scrum, but complain about it. Some of the complaints I hear are: there are too many meetings...

by rans

Data Processing - Streaming Vs. Batch In today's big data world, we need to process a lot of data in high volumes. There are some very good frameworks for data processing. Just to name a few: Apache Hadoop for batch processing, Apache Storm for data streaming, and Apache Spark, which can do...
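The batch/streaming distinction the article draws can be illustrated without any framework: a batch job makes one pass over a complete dataset, while a streaming job maintains running state that is updated per event and can be queried at any time. A minimal plain-Java sketch (class and method names are illustrative, not from the article):

```java
import java.util.*;

public class RunningAverage {
    private long count = 0;
    private double sum = 0.0;

    // Streaming style: update state per incoming event; query at any time.
    void add(double value) { count++; sum += value; }
    double average() { return count == 0 ? 0.0 : sum / count; }

    // Batch style: one pass over the complete dataset.
    static double batchAverage(List<Double> data) {
        return data.stream().mapToDouble(Double::doubleValue).average().orElse(0.0);
    }

    public static void main(String[] args) {
        List<Double> data = List.of(1.0, 2.0, 3.0, 4.0);
        RunningAverage ra = new RunningAverage();
        data.forEach(ra::add);           // events arriving one by one
        System.out.println(ra.average());        // 2.5
        System.out.println(batchAverage(data));  // 2.5
    }
}
```

Both styles reach the same answer here; the difference is that the streaming version had a valid partial answer after every event, which is exactly what systems like Storm and Spark Streaming exploit.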

by rans

This workshop is given by Ran Silberman. Here is the agenda for this session: Presentation: short introduction to Hadoop, the Hadoop ecosystem in brief, the future of Hadoop; Hands-on: installing Hadoop, writing a MapReduce program, debugging MapReduce in Hadoop, deploying the program on the cluster, working with management and monitoring tools. Prerequisites for...

by rans

Calculate Average value in WordCount MapReduce on Hadoop In this post I show how to calculate the average value of counters in a Java program that runs MapReduce over Hadoop. The famous Word Count example, which can be found here, shows a simple MapReduce that sets counters of words. Here...
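The full Hadoop version is behind the link, but the core idea, counting word occurrences and then averaging those counters, can be sketched in plain Java without a cluster. Class and method names here are illustrative, not the article's actual code:

```java
import java.util.*;
import java.util.stream.*;

public class WordCountAverage {
    // "Map + reduce" in miniature: split into words, then count occurrences per word.
    static Map<String, Long> wordCounts(String text) {
        return Arrays.stream(text.toLowerCase().split("\\W+"))
                     .filter(w -> !w.isEmpty())
                     .collect(Collectors.groupingBy(w -> w, Collectors.counting()));
    }

    // Average of the counters: total words divided by distinct words.
    static double averageCount(Map<String, Long> counts) {
        return counts.values().stream().mapToLong(Long::longValue).average().orElse(0.0);
    }

    public static void main(String[] args) {
        Map<String, Long> counts = wordCounts("to be or not to be");
        System.out.println(counts.get("to"));      // 2
        System.out.println(averageCount(counts));  // 1.5 (6 words / 4 distinct words)
    }
}
```

On Hadoop the same averaging step typically lands in a final reducer (or a second MapReduce pass), since each reducer only sees the counters for its own keys.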

by rans

A benchmarking tool for Streaming systems Yahoo! built a benchmarking tool to compare different open-source stream processing systems. They open-sourced it on GitHub for anyone to use in their own environment: github/yahoo/streaming-benchmark Currently this benchmark supports three streaming systems: Apache Storm, Apache Flink, and Apache Spark. Storm...

by rans

Hadoop Ecosystem – How to find your way out there? Hadoop and HDFS are the pillars of today’s big data systems. They are the cornerstone of almost every big data project. But in order to use them properly and efficiently, one needs to know their ecosystem. Today there are dozens...