Volume is one thing. Velocity is another.
A big data platform is for storing and processing lots of data, later.
What about storing and processing a subset of that data, now, by…
- reducing the length of time it takes to store the data.
- reducing the length of time it takes to process the data.
Store the data in a data grid, in memory.
Process the data within a data grid, in memory.
Big Data Stack
The data can be stored in both a data grid and a big data platform: the application stores the data in the data grid, and the data grid persists it to the big data platform. The data grid can persist asynchronously to maintain performance, or synchronously to guarantee consistency.
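The trade-off between the two persistence modes can be sketched in a few lines. This is a toy model, not the API of any particular data grid: the `DataGrid` class, its `flush` method, and the plain dictionaries standing in for the big data platform are all assumptions made for illustration.

```python
from collections import deque


class DataGrid:
    """Toy in-memory grid that persists entries to a backing store (hypothetical)."""

    def __init__(self, store, synchronous):
        self.memory = {}                # entries held in memory
        self.store = store              # stands in for the big data platform
        self.synchronous = synchronous
        self.pending = deque()          # writes not yet persisted (async mode)

    def put(self, key, value):
        self.memory[key] = value
        if self.synchronous:
            # Synchronous: the caller waits until the store has the entry.
            self.store[key] = value
        else:
            # Asynchronous: the caller returns at once; the write is queued.
            self.pending.append((key, value))

    def flush(self):
        """Persist queued writes; a real grid would do this in the background."""
        while self.pending:
            key, value = self.pending.popleft()
            self.store[key] = value


sync_store, async_store = {}, {}
sync_grid = DataGrid(sync_store, synchronous=True)
async_grid = DataGrid(async_store, synchronous=False)

sync_grid.put("trade-1", 100)
async_grid.put("trade-1", 100)

# The asynchronous store lags behind the grid until the queue is flushed.
stale_before_flush = "trade-1" not in async_store
async_grid.flush()
```

In the asynchronous case the write returns sooner, but there is a window in which the grid and the platform disagree. The synchronous case closes that window at the cost of a slower write.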
The data grid can rely on expiration to maintain a finite set of data. For example, the data grid can store current data (e.g. the last five business days) and the big data platform can store the historic data.
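A minimal sketch of expiration, assuming a fixed lifespan per entry. The `ExpiringGrid` class and the `archive` dictionary (standing in for the big data platform) are hypothetical, the clock is passed in explicitly to keep the example deterministic, and the lifespan is counted in calendar days rather than business days for simplicity.

```python
class ExpiringGrid:
    """Toy grid whose entries expire after a fixed lifespan (hypothetical)."""

    def __init__(self, lifespan_seconds, archive):
        self.lifespan = lifespan_seconds
        self.archive = archive      # stands in for the big data platform
        self.entries = {}           # key -> (value, stored_at)

    def put(self, key, value, now):
        self.entries[key] = (value, now)
        self.archive[key] = value   # the historic copy outlives the grid entry

    def get(self, key, now):
        item = self.entries.get(key)
        if item is None:
            return None
        value, stored_at = item
        if now - stored_at >= self.lifespan:
            del self.entries[key]   # expired: drop it from memory
            return None
        return value


DAY = 24 * 60 * 60
archive = {}
grid = ExpiringGrid(lifespan_seconds=5 * DAY, archive=archive)
grid.put("px:ACME", 42.0, now=0)

fresh = grid.get("px:ACME", now=4 * DAY)    # still within the lifespan
expired = grid.get("px:ACME", now=6 * DAY)  # past the lifespan
```

After the entry expires, the grid no longer holds it, but the archive still does. That keeps the in-memory data set finite without losing history.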
If an analyst needs to analyze current data…
the analyst can submit a map / reduce job to the data grid.
If an analyst needs to analyze historic data…
the analyst can schedule a map / reduce job with the big data platform.
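The shape of such a map / reduce job can be sketched as follows, assuming the data is split into partitions the way a grid spreads entries across nodes. The partitions, the trade records, and the function names are all invented for illustration. A real grid or big data platform would run the map phase on each node in parallel.

```python
from collections import Counter
from functools import reduce


def map_partition(partition):
    """Map phase: count trades per symbol within one partition."""
    return Counter(trade["symbol"] for trade in partition)


def reduce_counts(left, right):
    """Reduce phase: merge two partial results into one."""
    return left + right


# Each inner list stands in for the entries held by one node.
partitions = [
    [{"symbol": "ACME"}, {"symbol": "INIT"}],
    [{"symbol": "ACME"}],
]

partials = [map_partition(p) for p in partitions]
totals = reduce(reduce_counts, partials, Counter())
```

The job itself is the same in both cases. What differs is where the data lives and, as a result, how long the analyst waits for the result.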
Investment Banking & Risk Management
The goal is to analyze portfolio data at the close of business, every day.
The working set is portfolio data for the day. It does not include, for example, portfolio data for the year. The sooner the data is analyzed, the better. As a result, it does not make sense to analyze all of the portfolio data in a big data platform or to schedule a map / reduce job with batch processing. It is more efficient to limit analysis to the working set, in memory, and to submit a map / reduce job for near real-time processing.
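As a rough illustration of limiting analysis to the working set, the sketch below filters positions down to a single day before aggregating them. The records, field names, dates, and values are made up for the example, and summing market value per portfolio is a stand-in for whatever risk analysis is actually run.

```python
from collections import defaultdict
from datetime import date

# Made-up positions: two days of data for two portfolios.
positions = [
    {"portfolio": "A", "as_of": date(2024, 1, 2), "market_value": 100.0},
    {"portfolio": "A", "as_of": date(2024, 1, 2), "market_value": 50.0},
    {"portfolio": "B", "as_of": date(2024, 1, 2), "market_value": 75.0},
    {"portfolio": "A", "as_of": date(2024, 1, 1), "market_value": 999.0},
]


def working_set(all_positions, day):
    """Keep only the positions for the given day."""
    return [p for p in all_positions if p["as_of"] == day]


def exposure_by_portfolio(day_positions):
    """Sum market value per portfolio."""
    totals = defaultdict(float)
    for p in day_positions:
        totals[p["portfolio"]] += p["market_value"]
    return dict(totals)


today = working_set(positions, date(2024, 1, 2))
exposure = exposure_by_portfolio(today)
```

The position from the previous day never enters the calculation, so the cost of the analysis grows with the size of the day's data, not with the size of the history.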