Hi.

Welcome.

I have documented all my discussions, projects and different trends from obtaining my Master's in Information Assurance and my Doctorate in Information Technology.

 The key is - Educate

It is crucial that people are educated about security; how to deter from intrusion at ones organization; and most importantly protecting one self.

Building systems and educating on information assurance have been my exposure more than half of my life.

This is my journey of accomplishments, even more Struggles and a few solutions.

I hope you Enjoy!
data warehouse

data warehouse

“Data warehousing is a business analyst's dream—all the information about the organization's activities gathered in one place, open to a single set of analytical tools. But how do you make the dream a reality (Wall & Scott, N.d.)?” There are the basics of building a data warehouse such as “understanding the business processes, the goals, determining the business objectives and then after constructing a data model, figuring out the data sources”, the decision on how to handle growth within the organization should be the next set of determining factors (Wall & Scott, N.d.). 

            When a company grows the storage of data must also increase or “expand”.  Prior to expanding or “merging data” one must “explore new ways to grow, cut costs and increase profits”.  Below are more details for a successful “warehouse construction”. 

External Feeds

External feeds are very important but is usually neglected in the data warehouse.  The “external data type is external to your company and is usually not in the data warehouse because no one has requested it yet (Mould, 2013)”. 

This type of feed is can be based on “statistics and trends”.  They  also “can have a major impact upon ones business, so they should be tracked over time” (Mould, 2013). This is the area for increasing profits as external feeds allow for “surveys and ROI analysis” which allow the benefits of using this data (score uplift) outweigh the cost (Mould, 2013)”. 

External data also allows for the company “to ensure one is making the right business decisions as it allows the company to see the big picture (Hammergren, N.d.)”.   Not only is the big picture an important factor within a data warehouse so is having “clean data”.

Ensure Only Clean Data

There are a few ways to ensure for data accuracy and one is a little acronym named “ETL” which stand for “Extract/Cleans -Transform-Load” (ETL, N.d.).

The extraction is the part that pulls and technically “cleans” the data.  Prior to merging the data it is usually reviewed to ensure there are no duplication's.  When one extracts the data the following is performed: 

  “Extracts

  • Update notification – notification a record has been changed

  • Incremental extract - identify which records have been modified

  • Full extract – keeps a copy of the original to determine or identify changes

 Clean

  •  Making identifiers unique

  • Standardizes null values

  • Validates address fields

  • Validates numbers

Transform

  • Applies the rules

  • Converts measured data

  • Joining sources

Load

  •  Disable constraints and indexes – pre load

  • Enable constraints and indexes – post load (ETL, N.d.)”.

Timing

So how often should one update the data in the data warehouse?  “You must update your data warehouse on a regular basis to ensure that the information derived from it is current. This process of updating the data is called the refresh process (Defining, N.d.)”. 

The ETL is usually put on a scheduled time and can be completed after hours, in the middle of the night or on weekends – this all depends on the company’s urgency of the data (10, N.d.).  “Once all of this data has been loaded into the data warehouse, the materialized views must be updated to reflect the latest data (10, N.d.)”.   While timing and scheduling the “refresh” is crucial the sequence and “parallel” is even more important to a data warehouse.

Parallel and Sequential Order

Order is key in a data warehouse as a system is literal.  It only does as it is told.  If a specific table is needed to pull data prior to another table the sequence is key and must be orderly.  “Choose the partitioning strategy of the materialized view in sync with a table's partitioning strategy (10, N.d.)”.  The following is an example of “refreshing of a materialized view and parallel updates” (10, N.d.). 

1.      “Table that will be the basis for the materialized view including partitioning by range

2.      Load the initial table data

3.      Create materialized view

4.      Load a separate table to be exchanged with the new partition.

5.      Create and load a separate table to be exchanged with a partition in the materialized view

a.       INSERT /*+ PARALLEL

6.      Gather statistics

7.      Exchange the partitions

8.      Inform the database that the materialized view is fresh again (10, N.d.)”.

The information given is a quick snapshot of how one can explore new ways to grow, cut costs, and increase profits in building a data warehouse.  There are many techniques and rules to review but given the demonstration should allow one to determine how important data can be to an organization; how frequent to refresh the data in the data warehouse; and how external resources can help ones company.

global data distribution

global data distribution

pointing the "finger"

pointing the "finger"