Big data in the cloud is an area of the market where Google, Amazon Web Services (AWS) and Azure are attracting some interesting startup companies.
AWS has a broad spectrum of big data services. Amazon Elastic MapReduce, for example, runs Hadoop and Spark, while Kinesis Firehose and Kinesis Streams provide a way to stream large data sets into AWS. Users can store data in Redshift, a petabyte-scale data warehouse, with data compression to help reduce costs. Amazon Elasticsearch is a service that deploys the open source Elasticsearch tool in AWS for analytics such as click-through and log monitoring. Kinesis Analytics complements these services by analyzing data streams.
For analytics, Azure has Data Lake Analytics, as well as HDInsight, a Hadoop-based service. There is also an Azure Stream Analytics service, a Data Catalog that identifies data assets using a global metadata system, and Data Factory, which interlinks on-premises and cloud data sources and manages data pipelines.
Google’s BigQuery data service uses a SQL-like interface that is intuitive for most users. It supports petabyte-scale databases and can stream data in at 100,000 rows per second as an alternative to loading data from cloud storage. BigQuery also supports geographic replication, and users can select where they store their data.
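To put the quoted streaming rate in perspective, the short Python sketch below works out what 100,000 rows per second means over a day and how long a large load would take at that rate. It is back-of-the-envelope arithmetic only: the function names are hypothetical, it assumes the rate is sustained without interruption, and it does not call any BigQuery API.

```python
# Rough arithmetic on the streaming rate quoted in the article.
# Assumption: the rate is sustained continuously with no throttling or retries.
ROWS_PER_SECOND = 100_000
SECONDS_PER_DAY = 24 * 60 * 60


def rows_per_day(rows_per_second: int = ROWS_PER_SECOND) -> int:
    """Rows that could be streamed in one day at a constant rate."""
    return rows_per_second * SECONDS_PER_DAY


def seconds_to_stream(total_rows: int, rows_per_second: int = ROWS_PER_SECOND) -> float:
    """Time in seconds needed to stream total_rows at a constant rate."""
    if rows_per_second <= 0:
        raise ValueError("rows_per_second must be positive")
    return total_rows / rows_per_second


if __name__ == "__main__":
    print(f"Rows per day: {rows_per_day():,}")
    hours = seconds_to_stream(1_000_000_000) / 3600
    print(f"Hours to stream one billion rows: {hours:.2f}")
```

At that rate, a billion-row data set would take a little under three hours to stream, which is one way to weigh streaming against a bulk load from cloud storage.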