Global Data Distribution
“Data entry is feasibly one of the most important factors in different business verticals for increasing productivity & mitigating repetitive business task” (Wixon, N.d.). This raises three questions: where should the data be stored, how often should it be updated, and how should it be partitioned?
Partitioning is an important factor in database design because it improves performance (Partitioning, N.d.). It splits the data into smaller segments so that the system can search a smaller area and return results faster. “Partitioning is a general term used to describe the act of breaking up your logical data elements into multiple entities for the purpose of performance, availability, or maintainability” (Bako, N.d.).
Two common partitioning schemes are vertical and horizontal partitioning: vertical partitioning splits a table by columns, while horizontal partitioning splits it by rows. “Determining how to partition the tables horizontally depends on how data is analyzed. You should partition the tables so that queries reference as few tables as possible. Otherwise, excessive UNION queries, used to merge the tables logically at query time, can affect performance” (Partitioning, N.d.).
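The effect of horizontal partitioning on query cost can be illustrated with a minimal Python sketch. The `orders` rows, the `region` partition key, and the helper names `partition_rows` and `query` are hypothetical and chosen only for illustration; a real DBMS handles partitioning internally. The sketch shows that a query that names the partition key scans one partition, while a query that does not must merge every partition, which is the in-memory analogue of the UNION queries mentioned above.

```python
from collections import defaultdict


def partition_rows(rows, key):
    """Horizontally partition rows: each partition holds whole rows sharing one key value."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[key]].append(row)
    return dict(partitions)


def query(partitions, predicate, key_value=None):
    """Return the matching rows and the number of partitions that were scanned."""
    if key_value is not None:
        # The partition key is known, so only one partition is read.
        targets = [partitions.get(key_value, [])]
    else:
        # No partition key: every partition must be read and merged.
        targets = list(partitions.values())
    matches = [row for part in targets for row in part if predicate(row)]
    return matches, len(targets)


# Hypothetical sample data.
orders = [
    {"id": 1, "region": "EU", "total": 120},
    {"id": 2, "region": "US", "total": 80},
    {"id": 3, "region": "EU", "total": 45},
    {"id": 4, "region": "APAC", "total": 200},
]
by_region = partition_rows(orders, "region")

eu_large, scanned_one = query(by_region, lambda r: r["total"] > 100, key_value="EU")
all_large, scanned_all = query(by_region, lambda r: r["total"] > 100)
```

Here the first query scans a single partition and the second scans all three, which is why the partition key should match how the data is most often analyzed.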
Many organizations run different database applications, but this discussion assumes that the two merging companies have “the same underlying hardware and run the same operating systems and database applications; the database will be a homogenous distributed database system” (Rouse, 2014). Such “homogeneous system(s) are much easier to design and manage. This approach provides incremental growth, making the addition of a new site to the DDBMS easy, and allows increased performance by exploiting the parallel processing capability of multiple sites” (Thakur, N.d.). Not all systems are this simple, however: when the sites run different database products, it may be necessary “to translate the query language used”, for example “SQL SELECT statements such as the FIND and GET statements” (Thakur, N.d.). So, how does one keep track of sensitive data within a database?
“Data flow maps are a recognized method of tracing the flow of data through a process or physically through a network” (Hayes, N.d.). “For organizations where sensitive data is housed at multiple sites, bird's eye (high-level) and in-the-weeds (detailed) diagrams will be needed. This approach helps to make the flow of sensitive information more comprehensible without a high degree of abstraction” (Hayes, N.d.).
Overall, a database must be secure and maintain data quality, or the system loses its value. It is suggested that “an annual review and realignment of objectives with corresponding presentations and an updated business case for data quality would be reasonable” (Loshin, N.d.). With regard to data quality, how often should the data be refreshed?
“Batch data processing is an efficient way of processing high volumes of data where a group of transactions is collected over a period of time. Data is collected, entered, processed and then the batch results are produced” (Walker, 2013). The approach for retrieving the data efficiently will be a combination of “ETL and Analytical processing”, or in other words, a “Time-delayed batch” (Walker, 2013).
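The collect-then-process pattern described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method of any cited source: the transaction format `(timestamp_seconds, account, amount)`, the 60-second window, and the function names `group_into_batches` and `process_batch` are all hypothetical. Transactions are first grouped by the time window in which they arrived, and each group is then processed as one batch.

```python
from collections import defaultdict


def group_into_batches(transactions, window_seconds):
    """Collect transactions into batches keyed by the start of their time window."""
    batches = defaultdict(list)
    for timestamp, account, amount in transactions:
        window_start = (timestamp // window_seconds) * window_seconds
        batches[window_start].append((account, amount))
    # Return the batches in chronological order.
    return dict(sorted(batches.items()))


def process_batch(batch):
    """Process one batch: total the amounts per account."""
    totals = defaultdict(int)
    for account, amount in batch:
        totals[account] += amount
    return dict(totals)


# Hypothetical transactions: (timestamp in seconds, account, amount in cents).
transactions = [
    (5, "acct-a", 100),
    (42, "acct-b", 250),
    (61, "acct-a", -30),
    (119, "acct-a", 10),
    (130, "acct-b", 75),
]

batches = group_into_batches(transactions, window_seconds=60)
results = {start: process_batch(batch) for start, batch in batches.items()}
```

A longer window produces fewer, larger batches at the cost of staler results, so the window length is where the refresh-frequency question raised earlier is decided.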