The fastest-growing data set in the energy industry is time-series data, which is typically loaded into a SCADA system from IoT devices and sensors in the field. SCADA data is a key asset energy companies use to drive digital transformation initiatives, such as advanced prescriptive and predictive analytics via machine learning (ML) solutions.
The challenge lies in capturing and retaining these large data sets for use by data consumers in today’s energy companies. In the new reality of IoT, ML, and AI solutions, where data is used through multiple reporting and analytics tools, data consumers prefer to use historian data together with other analytical solutions (e.g., Spotfire).
Current Approach to SCADA Data for Analytics
Traditionally, data from IoT devices or sensors is sent to PLCs and written into a historian. Historians have live analysis capabilities and can store large data volumes in either their proprietary format or a file format. Historians are certainly good for their intended purpose. When there is a need to analyze historical data, however, historians do not have the functionality to meet the analytical needs of oil and gas data consumers.
This current approach creates disadvantages:
- Limited amount of data that can be retained in historians
- High cost of historian software and storage
- Decreased performance as data volume increases on traditional hardware platforms
- Lack of data availability to drive advanced AI analytics using multiple platforms
- Lack of integration with downstream systems such as production accounting and asset management
Generally speaking, the central challenge of the traditional approach is data movement. If the data has to be moved from the historian to a time-series database or a relational database for integration with downstream systems, it has to move through a pipeline. Depending on the frequency of data movement, storage and capacity can become issues, because adequate resources have to be provisioned to store data at an interval of a minute or less.
Let’s look at some estimated numbers based on data from an oil and gas client of ours.
- Tags/Sensors = 30 Per Well
- Number of Tags x Readings per Day at a one-minute interval x 1,000 wells = 30 x 1,440 x 1,000 = about 43 million records per day
- Total Cost = ∑ (Total Tags) x Interval + Servers x (α Maintenance + Storage Hardware Cost)
These estimates illustrate how the cost of storage becomes a function of the time interval and the number of tags. For a traditional storage system, growth means adding more hardware. As a result, oil and gas companies either keep playing catch-up or decide to retain only a small time slice of data (e.g., 30-90 days). The downside is that, for predictive analysis, 30, 60, or 90 days of data is not always sufficient.
New cloud-based data warehouse solutions, such as Snowflake, offer an innovative approach to SCADA data management. A cloud-architected system has the capacity and capability to process ever-growing SCADA data at a small marginal cost for storage.
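The record count above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name and its parameters are illustrative, and it assumes every tag reports exactly once per interval with no gaps.

```python
MINUTES_PER_DAY = 24 * 60


def records_per_day(tags_per_well: int, wells: int, interval_minutes: float = 1.0) -> int:
    """Readings generated per day when every tag is sampled once per interval."""
    samples_per_tag = MINUTES_PER_DAY / interval_minutes
    return int(tags_per_well * wells * samples_per_tag)


daily = records_per_day(tags_per_well=30, wells=1000)
print(f"{daily:,} records per day")         # 43,200,000 records per day
print(f"{daily * 365:,} records per year")  # 15,768,000,000 records per year
```

Changing `interval_minutes` shows how quickly volume moves with sampling frequency: a five-minute interval cuts the daily count to a fifth, while sub-minute sampling multiplies it.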
Snowflake Solution for SCADA Data
Snowflake uses cloud storage tiers that range from hot to cold, at roughly $4 to $25 per TB. Cloud storage not only solves the storage cost problem, but also eliminates the associated hardware, labor, and maintenance costs. What’s more, Snowflake provides straightforward compute resources that run analysis and process data when needed. Clients are billed for compute only when they use the system, which creates further savings.
Beyond cost savings, Snowflake is designed to handle large data sets, and SCADA data hits all of its sweet spots. Snowflake also does not restrict users to any particular BI tool.
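To get a feel for the storage side, the per-TB price range quoted above can be applied to a year of one-minute data from 1,000 wells. This is a rough sketch, not a pricing model: the 50 bytes per record is a hypothetical uncompressed size chosen for illustration, the function name is made up, and the figures are simply cost per TB stored, since no billing period is given for the rate.

```python
BYTES_PER_TB = 10**12


def storage_cost_range(records: int, bytes_per_record: int,
                       low_per_tb: float = 4.0, high_per_tb: float = 25.0):
    """Return (terabytes, low cost, high cost) for a given record count.

    bytes_per_record is an assumed average size, not a measured value.
    """
    terabytes = records * bytes_per_record / BYTES_PER_TB
    return terabytes, terabytes * low_per_tb, terabytes * high_per_tb


# 30 tags x 1,440 minutes x 1,000 wells x 365 days
records_per_year = 30 * 1440 * 1000 * 365
tb, low, high = storage_cost_range(records_per_year, bytes_per_record=50)
print(f"{tb:.2f} TB, between ${low:.2f} and ${high:.2f} at the quoted per-TB rates")
```

Under these assumptions a full year of data stays below one terabyte, which is why retaining years of history rather than 30 to 90 days becomes practical.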
Snowflake provides a modern cloud data warehouse platform that can grow and scale on demand utilizing three key architectural innovations:
- Decoupling of storage and compute
- Multi-tenant metadata, security and transactional management
- Unlimited concurrency
The total benefits of Snowflake’s cloud data warehouse solution include:
- Low cost
- Simplified architecture
- Scalable
- Integrated data
- ML and AI process ready
- Works with all major BI Tools
- SQL Based interface
Summary
Snowflake’s cloud-engineered solution provides a unique capability for SCADA data management with its robust and fast data retrieval and storage. Energy companies can save money and gain new flexibility to use their data assets to the fullest. They can realize new value in SCADA, one of the most crucial operations in the energy industry, where data is constantly growing and needs to be analyzed for asset optimization and failure prediction. That value can positively impact energy companies’ bottom line.
To learn how Stonebridge can help you take advantage of Snowflake’s cloud data warehouse platform, contact us.