Understanding Data Warehousing and Business Intelligence Evolution
31) What is the Data Warehouse and Its Characteristics?
A data warehouse is a centralized repository designed to store large volumes of structured, semi-structured, and sometimes unstructured data.
Characteristics of a Data Warehouse
- Subject-Oriented
- Organized around major subjects like customers, sales, or products, rather than business operations.
- Focuses on specific areas of analysis rather than the detailed operations of individual systems.
- Integrated
- Data from different sources is cleaned, transformed, and standardized to ensure consistency.
- Resolves data inconsistencies in formats, naming conventions, and units of measure.
- Time-Variant
- Stores historical data, making it easier to analyze trends over time.
- Data is timestamped to track changes and allow comparisons over specific periods.
- Differentiate among a DM, ODS, and EDW.
- 3) Describe major components of a data warehouse.
1. Data Source
- Definition: The origin of raw data to be integrated into the data warehouse.
- Role: Supplies raw transactional and operational data for analysis.
2. Data Extraction
- Definition: The process of retrieving raw data from multiple heterogeneous sources.
- Purpose: Collect data efficiently without disrupting source operations.
3. Data Loading
- Definition: The process of transferring processed data into the data warehouse or staging area.
- Purpose: Ensures that the warehouse has up-to-date and accurate information for analysis.
4. Middleware Tools
- Definition: Software that connects the data warehouse with front-end tools or applications for data access and analysis.
- Purpose: Provides a bridge between the data warehouse and user interfaces.
4) List and briefly describe Simon’s four phases of DM.
1. Intelligence Phase
- Definition: In this phase, the decision-maker identifies and defines the problem or opportunity.
- Purpose: Ensure a clear understanding of the problem to set the stage for effective decision-making.
2. Design Phase
- Definition: In this phase, potential solutions or courses of action are developed and analyzed.
- Purpose: Explore all possible options to address the problem effectively.
3. Choice Phase
- Definition: The decision-maker selects the best solution or course of action from the available alternatives.
- Purpose: Make a rational and informed decision that aligns with objectives.
4. Implementation Phase
- Definition: The chosen solution is put into action, and its outcomes are monitored.
Purpose: Translate the decision into tangible results and ensure its success.
5) Explain the evolution of BI.
Evolution of Business Intelligence (BI)
Business Intelligence (BI) has evolved significantly over the years.
1. Early Days: Data Collection and Reporting (1950s–1970s)
- Focus: Basic data collection and manual reporting.
- Technology: Early mainframe systems were used for data storage and basic computational tasks.
2. Emergence of Decision Support Systems (DSS) (1970s–1980s)
- Focus: Supporting management decisions with structured data and analytical tools.
- Technology: Development of relational databases and DSS tools.
3. Birth of Modern BI (1980s–1990s)
- Focus: Integrated tools for data analysis and visualization.
- Technology: Emergence of data warehouses, OLAP (Online Analytical Processing), and ETL processes.
4. Self-Service BI (2000s)
- Focus: Empowering end-users with tools to analyze and visualize data without IT assistance.
- Technology: Rise of self-service BI tools like Tableau, Power BI, and Qlik.
5. Big Data and Advanced Analytics (2010s)
- Focus: Managing and analyzing vast volumes of structured and unstructured data.
- Technology: Big data platforms like Hadoop, Spark, and cloud-based solutions.
6. Modern BI and Augmented Analytics (2020s and Beyond)
- Focus: Automating insights with AI and enhancing decision-making with augmented analytics.
- Technology: Cloud-native BI platforms, AI-driven tools, and natural language processing (NLP).
6) Explain three types of analysis.
Descriptive Analytics
- Definition: Focuses on summarizing and interpreting historical data to understand what has happened.
- Purpose: Provides insights into past performance and trends.
- Output: Answers questions like “What happened?” and “What are the trends?”
2. Predictive Analytics
- Definition: Uses statistical models, machine learning, and algorithms to analyze historical data and predict future outcomes.
- Purpose: Helps forecast what is likely to happen.
- Output: Answers questions like “What is likely to happen?” and “What are the potential risks?”
3. Prescriptive Analytics
- Definition: Provides actionable recommendations by analyzing data and evaluating possible outcomes.
- Purpose: Guides decision-making by suggesting optimal actions.
- Output: Answers questions like “What should we do?” and “What is the best course of action?”