Imagine that the CIO has been impressed with your past performance and has promoted you to database administrator. Your first project is to create a data warehouse project plan that will merge five (5) disparate operational data sources into a common warehouse and associated data marts. The project plan will address the technical design for the future implementation of the data warehouse. Your company has realized that it holds a great amount of data that you should use or test before you can recommend or propose any intelligent business solutions. Your approach to better understanding the company's business needs is to set up Hadoop independently of its business and systems environment and extract data files to experiment with first. Simply put, the project:
- Gathers data from three (3) internal organizational systems and two (2) external sources.
- Combines the data into one database.
- Manipulates and conforms it to common and organizational standards.
- Restructures it for easy exploration, reporting, and data mining.
- Uses any of the tools recommended in class (e.g., Hadoop, Google Analytics) to analyze this data.
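The gather, combine, conform, and restructure steps listed above can be sketched in a few lines of Python. This is only a minimal illustration, not a required approach: the five source names, their field names, and the sample values are all hypothetical, and a real project would read extracted files rather than in-memory records.

```python
from collections import defaultdict

# Hypothetical sample records: three internal systems and two external
# sources, each with its own field names and formats (all assumed).
SOURCES = {
    "crm":       [{"cust": "A1", "amt": "100.50", "day": "2024-01-05"}],
    "billing":   [{"customer_id": "a1", "total": 20, "date": "2024-01-06"}],
    "web_store": [{"CustomerID": "B2", "Amount": "15", "OrderDate": "2024-01-06"}],
    "partner":   [{"client": "b2", "value": "7.25", "when": "2024-01-07"}],
    "market":    [{"id": "C3", "spend": "3", "ts": "2024-01-07"}],
}

# Mapping from each source's field names to the common (conformed) names.
FIELD_MAP = {
    "crm":       {"cust": "customer_id", "amt": "amount", "day": "order_date"},
    "billing":   {"customer_id": "customer_id", "total": "amount", "date": "order_date"},
    "web_store": {"CustomerID": "customer_id", "Amount": "amount", "OrderDate": "order_date"},
    "partner":   {"client": "customer_id", "value": "amount", "when": "order_date"},
    "market":    {"id": "customer_id", "spend": "amount", "ts": "order_date"},
}


def conform(source, record):
    """Rename fields and normalize types to one organizational standard."""
    mapping = FIELD_MAP[source]
    row = {mapping[key]: value for key, value in record.items()}
    row["customer_id"] = str(row["customer_id"]).upper()
    row["amount"] = float(row["amount"])
    row["source"] = source  # keep lineage so each row can be traced back
    return row


def build_warehouse(sources):
    """Gather every source and combine the conformed rows into one table."""
    return [conform(name, rec) for name, recs in sources.items() for rec in recs]


def totals_by_customer(rows):
    """Restructure the combined rows into a simple reporting view."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["customer_id"]] += row["amount"]
    return dict(totals)


warehouse = build_warehouse(SOURCES)
summary = totals_by_customer(warehouse)
```

Keeping the field mapping in a table separate from the conforming logic means a new source can be added by describing its fields, without changing the functions.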
Using your knowledge of requirements gathering, architectural frameworks, the Hadoop system, and testing methodologies for data warehouse projects, develop your report on the final analytical findings of perceived trends.
Specifically, write a two to three (2-3) page report in which you:
- Write an objective statement that includes an introduction to the purpose of this activity.
- Construct a requirements statement that details the business user and technical requirements.
- Use Excel, Visio, MS Project, or one of their equivalents such as Open Project, Dia, or OpenOffice to complete the following. Note: The graphically depicted solution may not exceed two (2) pages. Ensure that each finding has a written explanation or narrative.
- Construct the business and technical metadata that would be necessary for the data files used.
- Create a schema for the data files used.
- Construct the necessary fact table(s) and dimensional table(s).
- Illustrate the flow of data including both inputs and outputs for Hadoop.
- Recommend and justify a business intelligence (BI) solution; then, depict the probable dashboard that could benefit users long term.
- Use at least five (5) quality resources in this assignment. Note: Wikipedia and similar Websites do not qualify as quality resources.
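For the schema, fact table, and dimensional table items above, a star schema is one common shape. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for the warehouse database; the table names, columns, and sample rows are hypothetical and should be replaced with those that fit the data files you actually use.

```python
import sqlite3

# In-memory database used only for illustration (an assumption, not a
# recommendation for the production warehouse platform).
conn = sqlite3.connect(":memory:")

# Two dimension tables describe the "who" and "when"; the fact table
# stores the measurable event and points at the dimensions by key.
conn.executescript("""
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY,
    customer_id  TEXT,
    region       TEXT
);
CREATE TABLE dim_date (
    date_key  INTEGER PRIMARY KEY,
    full_date TEXT,
    month     TEXT
);
CREATE TABLE fact_sales (
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    date_key     INTEGER REFERENCES dim_date(date_key),
    amount       REAL
);
""")

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO dim_customer VALUES (?, ?, ?)",
    [(1, "A1", "East"), (2, "B2", "West")],
)
conn.executemany(
    "INSERT INTO dim_date VALUES (?, ?, ?)",
    [(20240105, "2024-01-05", "2024-01"), (20240106, "2024-01-06", "2024-01")],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?)",
    [(1, 20240105, 100.5), (1, 20240106, 20.0), (2, 20240106, 15.0)],
)

# A typical reporting query: join the fact table to its dimensions and
# aggregate the measure by descriptive attributes.
rows = conn.execute("""
SELECT c.region, d.month, SUM(f.amount)
FROM fact_sales AS f
JOIN dim_customer AS c ON c.customer_key = f.customer_key
JOIN dim_date AS d ON d.date_key = f.date_key
GROUP BY c.region, d.month
ORDER BY c.region
""").fetchall()
```

The same query pattern is what a BI dashboard would typically issue, which is why the dimensions carry descriptive attributes such as region and month.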
Section 2: Create a PPT for the class on your analysis and findings
I expect you and your teammate to present your slides in class. There is a separate upload area for the PPT.
Your assignment must follow these formatting requirements:
- Be typed, double spaced, using Times New Roman font (size 12), with one-inch margins on all sides; citations and references must follow APA or school-specific format. Check with your professor for any additional instructions.
- Include a cover page containing the title of the assignment, the student’s name, the professor’s name, the course title, and the date. The cover page and the reference page are not included in the required assignment page length.
- Include charts or diagrams created in Excel, Visio, MS Project, or one of their equivalents such as Open Project, Dia, or OpenOffice. The completed diagrams/charts must be imported into the Word document before the paper is submitted.