What is Data Warehouse?

Data Warehouse

  • A data warehouse is a central repository of data that is designed for reporting and data analysis rather than for transaction processing
  •  It usually contains data from OLTP applications and other diverse data sources.
  • It separates analysis workload from transaction workload and enables an organization to consolidate data from several sources.
  • Data warehouse environment includes ETL solution, an online analytical processing (OLAP) engine, client analysis tools, and other applications that manage the process of gathering data and delivering it to business users.

Data warehouses vs. OLTP

  • Data warehouses are not usually in 3NF, while in OLTP environments the data is usually normalized.

  • A data warehouse is updated on a regular basis by the ETL process (run nightly or weekly) using bulk data modification techniques. The end users of a data warehouse do not directly update the data warehouse.
  • In OLTP systems, end users routinely issue individual data modification statements to the database. The OLTP database is always up to date, and reflects the current state of each business transaction.
  • A data warehouse often uses star schema design (Denormalized), while OLTP systems often use fully normalized schemas.
  • Data warehouses usually store many months or years of data. This is to support historical analysis, while OLTP systems usually store data from only a few weeks or months.


Why do we need a data warehouse?

 
Data warehouses are used extensively in the largest and most complex businesses around the world. In demanding situations, good decision making becomes critical. Significant and relevant data is required to make decisions. This is possible only with the help of a well-designed data warehouse.
 

Enhancing the turnaround time for analysis and reporting:

 
Data warehouse allows business users to access critical data from a single source enabling them to take quick decisions. They need not waste time retrieving data from multiple sources.
The business executives can query the data themselves with minimal or no support from IT which in turn saves money and time.
 

Improved Business Intelligence:

 
Data warehouse helps in achieving the vision for the managers and business executives. Outcomes that affect the strategy and procedures of an organization will be based on reliable facts and supported with evidence and organizational data.
 

Benefit of historical data:

 
Transactional data stores data on a day to day basis or for a very short period of duration without the inclusion of historical data. In comparison, a data warehouse stores large amounts of historical data which enables the business to include time-period analysis, trend analysis, and trend forecasts.
 

Standardization of data:

The data from heterogeneous sources are available in a single format in a data warehouse. This simplifies the readability and accessibility of data. For example, gender is denoted as Male/ Female in Source 1 and m/f in Source 2 but in a data warehouse the gender is stored in a format which is common across all the businesses i.e. M/F.

 

Immense ROI (Return On Investment): 

 
Return On Investment refers to the additional revenues or reduces expenses a business will be able to realize from any project. According to a 2002 International Data Corporation (IDC) study “The Financial Impact of Business Analytics”, analytics projects have been achieving a substantial impact on a business’ financial status.


Components of Data warehouse

 

Data Sources

A flat file database stores data in a normal text format. Contrary to a relational database where the data is stored in the form of tables, in a flat file database the data stored does not have a folders or paths related to them. No manipulations are performed on the data. Delimiters are used in flat files to separate the data columns.
Excel spreadsheets are regularly used in data warehousing operations. They are impressive, low-priced, and flexible tolls that many decision-makers find convenient to use. Excel also provides graphing features that allow the end-user to present the required data in chart and graph formats. These formats can be easily integrated into MS Word and Power Point presentations.
Operational systems of a business contain the day to day transactions of the data at a low-level. For example, the sales data, HR data, marketing data are used as input sources for a data warehouse.
Legacy systems are the applications of the yesteryear. They mirror the requirements of a business that might be twenty to twenty five year old. They are use till date since over years these systems have captured the business knowledge and rules that are exceptionally difficult to translate to a new platform/application.


Staging Area

 
The first part of the staging area is the most challenging process of extraction. Depending on how accurately the data is extracted the subsequent operations succeed or fail. The source systems might be complicated or poorly documented due to which the process becomes all the more difficult. The data may be extracted not only once but also periodically when changes occur at the source side.
The second stage is the transformation where the data is converted from one format to another. Since data often exists in different locations and formats across the enterprises, data conversion is mandatory to ensure that data from one application is comprehensible to other applications and databases.
The third stage is the loading where the extracted and transformed data is loaded into a data mart or a data warehouse depending on the business. The populated data is used for presentation applications by the end users.


Data Repository

 
The data is loaded into a data warehouse in the form of facts and dimensions
 

Users

 
The loaded data is accessed for reporting, analysis, and mining. The reporting tools like Business Objects and Cognos are used by users to generate reports. The data is also used for predicting trends



HTTP Post Request in Android Example

HTTP post request are executed in android in a very simple way. You need just 4 steps.

1. Declare Internet permissions in the manifest by adding the following line to AndroidManifest.xml.


<uses-permission android:name="android.permission.INTERNET" />


2. Create your HttpClient and HttpPost objects to execute the POST request.


HttpClient client = new DefaultHttpClient();
HttpPost post = new HttpPost(address);


String Object like String address = "www.google.com";

3. Set your POST data

List<NameValuePair> pairs = new ArrayList<NameValuePair>();
pairs.add(new BasicNameValuePair("key1", "value1"));
pairs.add(new BasicNameValuePair("key2", "value2"));
post.setEntity(new UrlEncodedFormEntity(pairs));


4. Execute the POST request

This returns an HttpResponse object, whose data can be extracted and parsed.

HttpResponse response = client.execute(post);




A closer look into struts-config.xml in Struts 1.x

A web application uses a deployment descriptor to initialize resources like servlets and taglibs. This deployment descriptor is formatted as an XML file and named "web.xml". Likewise, the framework uses a configuration file to initialize its own resources. These resources include ActionForms to collect input from users, ActionMappings to direct input to server-side Actions, and ActionForwards to select output pages.

Here's a simple struts-config.xml file and its different elements:

<struts-config>

<form-beans> 
  <form-bean 
       name="helloWorldForm" 
     type="com.naeemgik.common.form.HelloWorldForm"/>
</form-beans>

<action-mappings>
  <action 
       path="/helloWorld"
       input="/ByeWorld.jsp"
type="com.naeemgik.common.action.HelloWorldAction"
name="helloWorldForm"
       attribute="helloWorldForm"
       validate="true"
       scope="request"
       parameter="operation">

<forward name="success" path="/HelloWorld.jsp"/>
       <forward name="error" path="/ByeWorld.jsp"/>

  </action>
</action-mappings>

</struts-config>


In the above struts-config.xml file I have included most of the attributes of the action tag for the understanding point of view. Now let discuss each one.

The form-bean tag have HelloWorldForm which is used to hold the data. This class is extended from ActionForm.java

The acion tag attributes:

path: The action will be only and only called when the the user hit the URL of named /helloWorld

input: If vaidate is true and the form have some validation errors then the default page to display will be mentioned in this attribute.

type: Type is the Action class that will be called. It may be extended from Action.java or DispatchAction.java depending upon user scenarios. For more details may refer to Difference between Action and DispatchAction

name: The “name” attribute is actually the form-bean name.

attribute: attribute have same meaning as that of name attribute.

validate: This attribute is used whether this action needs some sort of validators or not? In case validate="false" it skips any validations.

scope: This actaully describe the action scope whether it will be request or session.

parameter: This is used only in case of DispatchAction. As dispatch Action have multiple action methods so programmer must have explicitly mention which action needs to be triggered. If you are new to Action and DispatchAction must read Difference between Action and DispatchAction

forward: This tag actually launches the corresponding JSP,page or another action depending on the forward name attribute.




Difference between Action and DispatchAction


Below are the main differences between Struts Action and Dispatch Actions

  • Grouping related actions into one class is possible using DispatchAction class. But in Action class a group related action into one class is not possible..Several changes need to be done, when one plans to group the related action using Action class.
  • Action class is used to perform single functionality. To perform multiple functionalities, we need to write an additional Action class.


  • The complexity rises if Action class is used for multiple functionalities, whereas by using DispatchAction, the code complexity will noticeably be reduced.
  • An Action class acts as a controller and is an adapter class between the web tier and the business tier.
  • Dispatch class is used to combine the related operations, thus share the common resources, helper classes for instance.
  • An Action is invoked by the ActionServlet to perform an operation that is depending on the URL request.
  • A DispatchAction selects a method depending on the value of the request parameter that is configured in the xml file.