by yoavn

Following my successful lecture, I’d like to share the slides with you. In this session we learn how to package your code using Java's new module system, benefit from cool features such as JShell, and find out what else is new in Java 9. This session will explain how these changes will...
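
As a small taste of the module system the slides cover, here is a minimal, hypothetical `module-info.java`; the module and package names are made up for illustration and are not from the talk:

```java
// module-info.java -- a minimal Java 9 module declaration (names are hypothetical)
module com.example.orders {
    // make this package's public types visible to other modules
    exports com.example.orders.api;
    // declare a dependency on a module that ships with the JDK
    requires java.sql;
}
```

You can also experiment with such an API interactively by launching the `jshell` tool bundled with JDK 9.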


AppEngine Dataflow SpringBoot In a past blog post I wrote about scheduling Dataflow pipelines. There I described how to leverage Google's App Engine and its cron service to schedule Dataflow pipelines. For simplicity I had used Java Spark as the web server, though now I have decided that Spring Boot is more...
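
To illustrate the approach, here is a hedged sketch (not the post's actual code) of a Spring Boot endpoint that an App Engine cron job, configured in cron.yaml, could call to launch a Beam pipeline on Dataflow; the project id, bucket, and URL path are hypothetical:

```java
import org.apache.beam.runners.dataflow.DataflowRunner;
import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class DataflowCronController {

    @GetMapping("/cron/run-pipeline")  // the URL referenced from cron.yaml
    public String runPipeline() {
        DataflowPipelineOptions options =
                PipelineOptionsFactory.create().as(DataflowPipelineOptions.class);
        options.setRunner(DataflowRunner.class);
        options.setProject("my-gcp-project");          // hypothetical project id
        options.setTempLocation("gs://my-bucket/tmp"); // hypothetical bucket

        Pipeline pipeline = Pipeline.create(options);
        // Placeholder transform; the real ETL steps would go here
        pipeline.apply("CreateSample", Create.of("hello", "dataflow"));

        pipeline.run();  // submits the job to the Dataflow service asynchronously
        return "Pipeline submitted";
    }
}
```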


Apache Beam Testing So you have decided to jump on the Apache Beam wagon. Now that you have written your pipeline, you would like to test it. Apache Beam has a full framework for testing your code. You can test each function, composite transforms, and even a full pipeline. For the full...
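
As a minimal sketch of what such a test can look like, the following uses Beam's TestPipeline and PAssert utilities; the upper-casing DoFn is an invented transform for illustration, not code from the post:

```java
import java.io.Serializable;
import org.apache.beam.sdk.testing.PAssert;
import org.apache.beam.sdk.testing.TestPipeline;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;
import org.junit.Rule;
import org.junit.Test;

// Serializable so the anonymous DoFn (which captures the test instance) can be serialized
public class UppercaseFnTest implements Serializable {

    @Rule
    public final transient TestPipeline pipeline = TestPipeline.create();

    @Test
    public void testUppercase() {
        PCollection<String> input = pipeline.apply(Create.of("beam", "rocks"));

        // The transform under test: a trivial DoFn that upper-cases strings
        PCollection<String> output = input.apply(ParDo.of(new DoFn<String, String>() {
            @ProcessElement
            public void processElement(ProcessContext c) {
                c.output(c.element().toUpperCase());
            }
        }));

        // Assert on the contents of the output PCollection
        PAssert.that(output).containsInAnyOrder("BEAM", "ROCKS");

        pipeline.run().waitUntilFinish();
    }
}
```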


Apache Beam Good Bad and the Ugly I would like to share with you my experience with Apache Beam on the latest project I have worked on. Scenario My project was an ETL from Salesforce to BigQuery. Since we needed to do this with a lot of tables and...


BigQuery I have been working with BigQuery for a few months and would like to share what I have learnt. What is BigQuery? BigQuery is a SaaS database platform by Google. BigQuery is very similar to an RDBMS, and you use SQL to work with it. The main advantage of BigQuery...
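
As a small illustration of this SQL-over-a-service model, here is a hedged sketch of running a standard SQL query with the Google Cloud Java client library; the project, dataset, and table names are made up for the example:

```java
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.FieldValueList;
import com.google.cloud.bigquery.QueryJobConfiguration;
import com.google.cloud.bigquery.TableResult;

public class BigQueryQueryExample {
    public static void main(String[] args) throws Exception {
        // Uses application default credentials
        BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

        // Hypothetical project, dataset, and table names
        QueryJobConfiguration query = QueryJobConfiguration
                .newBuilder("SELECT name, COUNT(*) AS cnt "
                        + "FROM `my_project.my_dataset.accounts` "
                        + "GROUP BY name")
                .setUseLegacySql(false)
                .build();

        // Run the query and iterate over the result rows
        TableResult result = bigquery.query(query);
        for (FieldValueList row : result.iterateAll()) {
            System.out.println(row.get("name").getStringValue()
                    + " -> " + row.get("cnt").getLongValue());
        }
    }
}
```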


Scheduling Dataflow pipelines Google has a product called Dataflow. Dataflow is an engine for processing big data. Google took the Dataflow model and externalized it as the Apache Beam model (see https://beam.apache.org). The idea behind Apache Beam is a generic model that deals with both streaming and batch processing, including windowing with...
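
To make the model concrete, here is a minimal, hypothetical Beam pipeline in Java that assigns elements to fixed windows and counts them; the same code can be submitted to Dataflow or another runner simply by changing the pipeline options:

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Count;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.TimestampedValue;
import org.joda.time.Duration;
import org.joda.time.Instant;

public class WindowedCount {
    public static void main(String[] args) {
        // The runner (DirectRunner, DataflowRunner, ...) is chosen via the options
        Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

        // A tiny hard-coded input with explicit event timestamps
        PCollection<String> events = p.apply(Create.timestamped(
                TimestampedValue.of("click", new Instant(0L)),
                TimestampedValue.of("view", new Instant(30_000L)),
                TimestampedValue.of("click", new Instant(90_000L))));

        // Assign elements to 1-minute fixed windows, then count each element per window
        events.apply(Window.<String>into(FixedWindows.of(Duration.standardMinutes(1))))
              .apply(Count.perElement());

        p.run().waitUntilFinish();
    }
}
```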

by haimc

Following my successful lecture, I’d like to share the slides with you. If you are already familiar with the Spark API, it's time to take your code to the next level and gain performance. In this session we go over best practices for handling data, improving code, and configuring a Spark cluster. Hope...


Following our successful Fullstack event with over 350 RSVPs, we are happy to share the slides with you. Machine Learning is more accessible today than ever, thanks to APIs offered as cloud services. These are mostly APIs for neural networks, but other kinds are available as well. But Machine Learning also...

by yanai

Introduction I would like to post a short description of a simple design change I just made for one of Tikal’s customers, which greatly improved the throughput of their Spark processing on their BigData lake. Background In the last few months I had to build a BigData infrastructure for...


Salesforce Data Extractor Conundrum Many companies use Salesforce as their CRM. What is sorely lacking in CRM systems, though, is analytics, so many companies need to export their data from Salesforce to other systems like BigQuery. If you need to do this and do not...