Google makes big data services Cloud Dataflow and Cloud Pub/Sub generally available

Google has removed the beta label from two of its big data services, Cloud Dataflow and Cloud Pub/Sub, and is making them generally available.

Cloud Dataflow is designed to remove the complexity of developing separate systems for batch and streaming data sources by providing a unified programming model. "Based on MapReduce, FlumeJava, and MillWheel, the Cloud Dataflow is built to free you from the operational overhead related to large scale cluster management and optimization," Cloud Dataflow product manager Eric Schmidt said in a blog post.
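
To illustrate what a unified programming model means in practice, here is a minimal sketch in plain Python. It is not the Cloud Dataflow SDK; the function and source names are hypothetical and chosen only for illustration. The same word-count logic runs unchanged over a bounded (batch) list and over a generator that stands in for a stream.

```python
from collections import Counter
from typing import Iterable, Iterator, List


def word_counts(lines: Iterable[str]) -> Iterator[Counter]:
    """Yield a running word count after each input line.

    The function only assumes it can iterate over `lines`, so it does not
    care whether the input is a finite batch or an open-ended stream.
    """
    totals: Counter = Counter()
    for line in lines:
        totals.update(line.lower().split())
        yield totals.copy()


def batch_source() -> List[str]:
    # Hypothetical bounded input: everything is available up front.
    return ["to be or not to be", "that is the question"]


def streaming_source() -> Iterator[str]:
    # Hypothetical stream: lines arrive one at a time.
    for line in batch_source():
        yield line


# Batch: consume everything, keep the final snapshot.
batch_result = list(word_counts(batch_source()))[-1]

# Streaming: handle each snapshot as it is produced.
stream_result = Counter()
for snapshot in word_counts(streaming_source()):
    stream_result = snapshot
```

Both runs end with the same totals, which is the point of writing the logic once and choosing the source separately.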

Cloud Pub/Sub is a messaging service that lets applications send and receive data in the form of "messages."
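
The publish/subscribe pattern behind such a service can be sketched with a small in-memory broker. This is an illustration of the pattern only, not the Cloud Pub/Sub API; the class, method, and topic names are assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[bytes], None]


class InMemoryBroker:
    """Toy broker: publishers and subscribers only share a topic name."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # Register a callback to run for every message on `topic`.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: bytes) -> int:
        # Deliver the message to each subscriber; return how many got it.
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = InMemoryBroker()
inbox: List[bytes] = []
broker.subscribe("orders", inbox.append)       # hypothetical topic name
delivered = broker.publish("orders", b"order-1001 created")
```

The publisher never references the subscriber directly, which is what lets the two applications evolve independently.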

Cloud Dataflow is a fully managed, fault-tolerant, SLA-backed service for batch and stream processing. The company claimed the service is two to three times faster and cheaper than Hadoop on classic MapReduce-based pipelines such as PageRank and WordCount. With dynamic work rebalancing, Cloud Dataflow also optimizes resource utilization, which provides additional performance gains without manual intervention.

Google has also expanded its integration efforts with technology partners, third-party connectors, and service providers, including Tamr, Salesforce, ClearStory, springML, Cloudera, and data Artisans, and continues to support alternate runners for Apache Spark and Apache Flink.

"Native Google Cloud Platform integration for Cloud Storage, Cloud Datastore, BigQuery, and Cloud Pub/Sub. You now get full query support for our BigQuery source. Our integration with Cloud Pub/Sub now provides source timestamp processing in addition to arrival time processing. Source timestamps, when combined with flexible Windowing and Triggering primitives, enable developers to produce more accurate windows of data output."
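
The difference between source-timestamp and arrival-time processing can be shown with a small sketch in plain Python. It does not use the Dataflow SDK; the event fields, the 60-second fixed window, and the sample values are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List

WINDOW_SECONDS = 60  # assumed fixed window size


def window_start(timestamp: int) -> int:
    """Return the start of the fixed window containing `timestamp`."""
    return timestamp - timestamp % WINDOW_SECONDS


def sum_by_window(events: List[Dict[str, int]], time_field: str) -> Dict[int, int]:
    """Sum event values per window, using `time_field` to pick the window."""
    totals: Dict[int, int] = defaultdict(int)
    for event in events:
        totals[window_start(event[time_field])] += event["value"]
    return dict(totals)


# Hypothetical events: the second one was produced at t=50 but arrived at t=65.
events = [
    {"source_ts": 10, "arrival_ts": 12, "value": 1},
    {"source_ts": 50, "arrival_ts": 65, "value": 2},
    {"source_ts": 70, "arrival_ts": 71, "value": 4},
]

by_source = sum_by_window(events, "source_ts")    # {0: 3, 60: 4}
by_arrival = sum_by_window(events, "arrival_ts")  # {0: 1, 60: 6}
```

Grouping by arrival time puts the delayed event in the wrong window, while grouping by source timestamp keeps it with the minute in which it actually occurred.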

In addition, Google Cloud Pub/Sub can help integrate applications and services reliably, as well as analyse big data streams in real time. According to Google, Cloud Pub/Sub addresses a broad range of scenarios with a single API, is a managed service that eliminates the tradeoffs that come with running separate messaging systems, and remains cost-effective as usage grows, with pricing as low as 5¢ per million message operations for sustained usage.
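
As a rough guide to what that lowest quoted rate implies, the arithmetic can be sketched as follows. The function name and the sample volume are hypothetical, and the result is only a lower bound, since the article gives 5¢ as the lowest price rather than the rate for every usage level.

```python
from decimal import Decimal

# Lowest rate quoted in the article: 5 cents per million message operations.
LOWEST_RATE_USD_PER_MILLION = Decimal("0.05")


def minimum_cost_usd(message_operations: int) -> Decimal:
    """Lower-bound cost if every operation were billed at the lowest rate."""
    millions = Decimal(message_operations) / Decimal(1_000_000)
    return millions * LOWEST_RATE_USD_PER_MILLION


# Hypothetical workload: 2 billion message operations.
estimate = minimum_cost_usd(2_000_000_000)
print(f"At least ${estimate:.2f}")
```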

Updated Date: Aug 14, 2015 16:07:26 IST