Software Engineering Fundamentals: Estimation, Design, Risk, and Quality Assurance

Software Project Estimation and Decomposition

Decomposition Techniques in Estimation

Software project estimation is a form of problem solving. When the problem to be solved is too complex to handle as a whole, we decompose it into a set of smaller, more manageable problems.

The decomposition can be done using two approaches:

  • Decomposition of the problem (e.g., breaking down features).
  • Decomposition of the process (e.g., breaking down tasks).

Estimation uses one or both forms of decomposition (partitioning).

Function Point (FP) Based Estimation

Function Point (FP) Based Estimation is a widely used software estimation technique that measures the size of a software application in terms of its functionality, independent of the technology used. It is particularly effective for estimating the effort, cost, and time required for software development projects.

Key Components of Function Point Counting

  1. External Inputs (EI): User inputs into the system (e.g., forms, data entry screens).
  2. External Outputs (EO): Outputs provided to the user (e.g., reports, messages).
  3. External Queries (EQ): Interactive inputs and outputs (e.g., search or lookup functionality).
  4. Internal Logical Files (ILF): Files or data tables maintained within the system.
  5. External Interface Files (EIF): Files or tables used for reference that are maintained by external systems.
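
As an illustration only, the short sketch below (in Python) turns hypothetical counts of these five components into a function point value, assuming the commonly quoted average complexity weights and the standard value adjustment factor 0.65 + 0.01 × ΣFi; all counts and ratings are invented for the example.

    # Hypothetical counts for the five function types (average complexity assumed).
    counts  = {"EI": 24, "EO": 16, "EQ": 22, "ILF": 4, "EIF": 2}
    weights = {"EI": 4, "EO": 5, "EQ": 4, "ILF": 10, "EIF": 7}   # commonly quoted average weights

    ufp = sum(counts[k] * weights[k] for k in counts)            # unadjusted function points

    # Fourteen general system characteristics, each rated 0..5 (hypothetical ratings).
    gsc_ratings = [3, 4, 2, 5, 3, 4, 2, 3, 4, 2, 3, 4, 3, 2]
    vaf = 0.65 + 0.01 * sum(gsc_ratings)                         # value adjustment factor

    fp = ufp * vaf
    print(f"UFP = {ufp}, VAF = {vaf:.2f}, FP = {fp:.1f}")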

Methods for Reliable Cost and Effort Estimation

Software project estimation can be carried out in systematic steps that keep risk at an acceptable level. To obtain reliable cost and effort estimates, several alternatives can be used:

  1. Delay estimation until as late in the project as possible.
  2. Base estimates on similar projects that have already been completed (analogy-based estimation).
  3. Use simple decomposition techniques to estimate project cost and effort.
  4. Empirical models can be used for project cost and effort estimation.

The first two alternatives are simple. The third and fourth alternatives suggest the use of specific, detailed project estimation techniques.

Empirical Estimation Models

The empirical estimation technique involves an educated guess of project parameters, often based on common sense and historical data. Values for LOC (Lines of Code) or FP (Function Points), often derived using Three-Point or Expected Value Estimates, are input into the estimation model.

Since empirical data are derived from a limited sample of projects, these models may not be appropriate for all classes of software products. Therefore, the results obtained from such models must be used judiciously.

Structure of the Estimation Model

A typical estimation model is derived using regression analysis on data collected from past software projects. The estimated effort ($E$) can be denoted by the following formula:

$E = A + B \times (ev)^C$

Where:

  • A, B, and C are constants derived empirically.
  • E is the effort in person-months.
  • ev is the estimation variable derived from either LOC or FP.

Many estimation models include project adjustment components that allow the value of E to be adjusted based on other project characteristics.
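
As an illustration of the general form above, the sketch below plugs a hypothetical size estimate into the constants commonly quoted for the Walston-Felix LOC-based model (A = 0, B = 5.2, C = 0.91); both the constants and the 33 KLOC figure should be treated as assumptions.

    def effort(ev, A=0.0, B=5.2, C=0.91):
        """General empirical form E = A + B * (ev ** C), in person-months."""
        return A + B * (ev ** C)

    # Hypothetical size estimate of 33 KLOC fed into the Walston-Felix constants.
    print(f"Estimated effort: {effort(33):.1f} person-months")   # roughly 125 person-months here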

Examples of Estimation Models

LOC-Based Estimation Models:
  1. Walston-Felix model
  2. Bailey-Basili model
  3. Boehm simple model
  4. Doty model
FP-Oriented Estimation Models:
  1. Albrecht and Gaffney model
  2. Kemerer model
  3. Small project regression model

Estimation for Object-Oriented Projects

Lorenz and Kidd suggested this estimation technique specifically for object-oriented projects. The steps for this estimation technique are as follows:

  1. Using the requirement model, develop use cases and determine their count. (Note: As the project proceeds, the number of use cases may change.)
  2. Using the requirement model, determine the number of key classes.
  3. Categorize the type of interface and develop the multiplier. Examples include:
    • No GUI
    • Text-based interface
    • Graphical User Interface (GUI)
    • Complex GUI
    Multiply the key classes by these multipliers to obtain the number of support classes.
  4. Multiply the key and support classes by the average number of work units per class. (Approximately 15–20 person-days per class.)
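
A minimal sketch of steps 3 and 4, using interface multipliers commonly quoted for this technique; the multipliers, the class count, and the productivity figure are assumptions chosen only for illustration.

    # Commonly quoted interface multipliers (treat as assumptions, not fixed values).
    interface_multiplier = {
        "no GUI": 2.0,
        "text-based": 2.25,
        "GUI": 2.5,
        "complex GUI": 3.0,
    }

    key_classes = 20                      # hypothetical count from the requirement model
    support = key_classes * interface_multiplier["GUI"]   # estimated support classes
    total_classes = key_classes + support

    person_days_per_class = 18            # within the 15-20 person-day range cited above
    effort_person_days = total_classes * person_days_per_class
    print(f"{total_classes:.0f} classes -> {effort_person_days:.0f} person-days")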

Software Project Management: Resources and Scheduling

Resource Management

The first task in project planning is to identify the scope and feasibility of the project. The second task is to estimate the resources required to accomplish the software development effort.

Three major categories of software engineering resources are:

  1. Human Resources (People)
  2. Reusable Software Components
  3. Development Environment (hardware and software tools)

Human Resources

The Project Planner begins by evaluating the software scope and selecting the necessary skills required to complete development. Both organizational position (such as manager, senior software engineer, and so on) and the specialty (e.g., database, telecommunication, client/server) are specified. For a relatively small project, a single individual may perform all software engineering tasks, consulting with specialists whenever needed.

Reusable Software Resources

Component-Based Software Engineering (CBSE) emphasizes reusability. Software development is achieved by creating and reusing building blocks, which are called components.

Environmental Resources

The Software Engineering Environment (SEE) incorporates both hardware and software. Hardware provides a platform for using the tools required to produce work products, which are the outcome of good software engineering practice.

Project Scheduling

When scheduling a project, the manager must estimate the time and resources required, and all project activities must be arranged in a coherent sequence. Schedules must be continually updated because unanticipated problems may occur during the project life cycle.

For new projects, initial estimates can be made optimistically. During scheduling, the total work is separated into various small activities, and the time required for each activity must be determined by the project manager. For efficient performance, some activities are conducted in parallel.

The project manager should be aware that every stage of the project may not be problem-free. To accomplish the project within the given schedule, the required resources must be available when needed. Project schedules are typically represented using charts that illustrate the work breakdown structure (WBS) and dependencies among various activities.
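
As a small illustration of how activity dependencies drive a schedule, the sketch below computes the earliest finish day for a few invented activities; the task names, durations, and dependencies are hypothetical.

    # activity -> (duration in days, list of predecessor activities); all values hypothetical
    activities = {
        "requirements": (5, []),
        "design":       (8, ["requirements"]),
        "coding":       (12, ["design"]),
        "test plan":    (4, ["requirements"]),       # can run in parallel with design/coding
        "testing":      (6, ["coding", "test plan"]),
    }

    finish = {}
    def earliest_finish(name):
        if name not in finish:
            duration, preds = activities[name]
            start = max((earliest_finish(p) for p in preds), default=0)
            finish[name] = start + duration
        return finish[name]

    for name in activities:
        print(f"{name}: earliest finish on day {earliest_finish(name)}")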

Core Software Design Concepts

Key design concepts include:

  • Abstraction
  • Modularity
  • Architecture
  • Refinement
  • Pattern
  • Information Hiding
  • Functional Independence
  • Refactoring
  • Design Classes

Abstraction

Abstraction is the ability to cope with complexity by focusing on essential details while suppressing unnecessary ones. Software design occurs at different levels of abstraction:

  • At the higher level, the solution is stated in broad terms.
  • At the lower level, a more detailed description of the solution is provided.

Moving through different levels of abstraction involves creating procedural abstraction and data abstraction. In data abstraction, a collection of data objects is represented. For example, for a search procedure, the data abstraction might be a record, consisting of attributes such as record ID, name, address, and designation.
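
A minimal sketch of the record example above, assuming Python for illustration; the Record class and the search procedure are hypothetical.

    from dataclasses import dataclass
    from typing import List, Optional

    @dataclass
    class Record:
        """Data abstraction for the search example: only the essential attributes are kept."""
        record_id: int
        name: str
        address: str
        designation: str

    # Procedural abstraction: callers know *what* search does, not *how* it is implemented.
    def search(records: List[Record], record_id: int) -> Optional[Record]:
        return next((r for r in records if r.record_id == record_id), None)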

Modularity

Modularity involves dividing the software into separately named and addressable components called modules. Monolithic software is difficult for software engineers to grasp; hence, the trend is to divide the software into a number of manageable modules.

There is a correlation between the number of modules and the overall cost of the software product. Both overmodularity (too many small modules) and undermodularity (too few large modules) must be avoided during software development.

Architecture

Architecture is the representation of the overall structure of an integrated system. It defines how various components (called system elements) interact and how data is structured and used by these components. Architecture provides the basic framework for the software system, ensuring that important framework activities can be conducted systematically.

In architectural design, various system models can be used:

  1. Structural model
  2. Framework model
  3. Dynamic model
  4. Process model
  5. Functional model

Refinement

Refinement is a process of elaboration. Stepwise refinement is a top-down design strategy proposed by Niklaus Wirth. The architecture of a program is developed by successively refining levels of procedural detail.

The process of program refinement is analogous to the refinement and partitioning used during requirements analysis. Abstraction and refinement are complementary concepts:

  • In abstraction, low-level details are suppressed.
  • Refinement helps the designer to elaborate low-level details.

Design Patterns

According to Brad Appleton, a design pattern is defined as: “A named nugget (something valuable) of insight which conveys the essence of a proven solution to a recurring problem within a certain context.”

In other words, a design pattern acts as a reusable solution for a particular problem occurring in a specific domain. Using design patterns, a designer can determine:

  • If the pattern is reusable.
  • If the pattern can be used for the current work.
  • If the pattern can solve a similar kind of problem with different functionality.

Information Hiding

Information hiding is an important property of effective modular design. It means that modules should be designed so that the information (data and procedures) contained within a module is inaccessible to other modules that have no need for it; only the information necessary for modules to accomplish their functions is passed between them.

The primary benefits of information hiding appear during testing and maintenance. Because the data and procedures of one module are hidden from others, errors are less likely to be introduced inadvertently, and changes can be made within a specific module without affecting the functionality of other modules.
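
A minimal sketch of information hiding, assuming Python for illustration: the module's internal data structure stays private behind a small interface, so other modules cannot depend on it. The AccountRegistry class is hypothetical.

    class AccountRegistry:
        def __init__(self):
            self._accounts = {}          # internal data structure, hidden from other modules

        def add(self, account_id, balance):
            self._accounts[account_id] = balance

        def balance_of(self, account_id):
            return self._accounts[account_id]

    # Other modules use only add() and balance_of(); the internal dict could later be
    # replaced by a database table without affecting them.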

Functional Independence

Functional independence is achieved by developing functional modules with a single-minded approach. Using functional independence, functions can be compartmentalized and interfaces simplified. Independent modules are easier to maintain, leading to reduced error propagation.

Functional independence is key to good design, and good design is key to software quality. The major benefit is achieving effective modularity. Functional independence is assessed using two qualitative criteria: Cohesion and Coupling.

Cohesion

Cohesion helps facilitate information hiding. A cohesive module performs only one task within a software procedure with little interaction with other modules. In other words, a cohesive module performs a single, well-defined function.

Types of Cohesion (from lowest to highest desirability):

  1. Coincidental Cohesion: Tasks within the module are loosely related to each other.
  2. Logical Cohesion: A module performs tasks that are logically related (e.g., a module handling all input/output operations).
  3. Temporal Cohesion: Tasks within the module need to be executed within a specific time span (e.g., initialization routines).
  4. Procedural Cohesion: Processing elements of a module are related and must be executed in a specific order.
  5. Communicational Cohesion: Processing elements of a module share the same input or output data.
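
A brief sketch contrasting two of the cohesion levels listed above; the module contents are hypothetical.

    # Coincidental cohesion: unrelated tasks grouped together only for convenience.
    class MiscUtils:
        def format_date(self, d): ...
        def send_email(self, to, body): ...
        def compute_tax(self, amount): ...

    # Communicational cohesion: every operation works on the same input data (an order record).
    class OrderReport:
        def __init__(self, order):
            self.order = order
        def total(self):
            return sum(item["price"] * item["qty"] for item in self.order["items"])
        def summary(self):
            return f"Order {self.order['id']}: {len(self.order['items'])} items, total {self.total()}"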

Coupling

Coupling represents how modules are “connected” with other modules or with the outside world. It is a measure of interconnection among modules in a program structure and depends heavily on the interface complexity between modules.

The goal in software design is to strive for the lowest possible coupling among modules. Low coupling reduces or avoids ripple effects when changes are made, thereby reducing the cost of program changes, testing, and maintenance.

Various types of coupling (from lowest to highest undesirability):

  1. Data Coupling: Achieved by parameter passing or data interaction (passing only necessary data).
  2. Control Coupling: Modules share related control data (e.g., flags or switches).
  3. Common Coupling: Common data or global data is shared among the modules.
  4. Content Coupling: Occurs when one module makes use of data or control information maintained within the internal structure of another module.
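
A short sketch contrasting data coupling with control coupling; the payroll functions are hypothetical.

    # Data coupling: the caller passes only the data the callee needs.
    def gross_pay(hours_worked, hourly_rate):
        return hours_worked * hourly_rate

    # Control coupling: the caller passes a flag that steers the callee's internal logic.
    def pay(hours_worked, hourly_rate, mode):
        if mode == "overtime":
            return hours_worked * hourly_rate * 1.5
        return hours_worked * hourly_rate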

Object-Oriented Design (OOD) Concepts

The object is the fundamental element of Object-Oriented Design (OOD), and objects and their interactions are often represented graphically in OOD. An object is an instance of its class and possesses attributes and operations.

Three main strategies are applied in the object-oriented paradigm:

  1. Object-Oriented Analysis (OOA): The problem is analyzed by decomposing it into objects. The objects in this model reflect entities and operations of the problem domain.
  2. Object-Oriented Design (OOD): The design is developed by modeling the identified requirements.
  3. Object-Oriented Programming (OOP): The object-oriented design is implemented using an object-oriented programming language such as C++ or Java.
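
A tiny illustration of an object obtained from its class, with attributes and operations; Python is used here only for brevity, and the BankAccount class is hypothetical.

    class BankAccount:                       # the class describes attributes and operations
        def __init__(self, owner, balance=0.0):
            self.owner = owner               # attributes
            self.balance = balance

        def deposit(self, amount):           # an operation
            self.balance += amount

    account = BankAccount("Asha")            # an object created from its class
    account.deposit(500.0)
    print(account.owner, account.balance)    # Asha 500.0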

Component-Level Design Guidelines (Ambler’s Suggestions)

Ambler suggested the following guidelines for conducting component-level design:

  • Components: Components are part of the architectural model. Naming conventions should be established, and component names should be meaningful and specified from the problem domain.
  • Interfaces: Interfaces play an important role in communication and collaboration. The interface should be drawn as a circle with a solid line connecting it to another element, indicating that the connected element offers the interface. Interfaces should typically flow from the left side of the component. Only important interfaces must be drawn.
  • Dependencies and Interfaces: Dependencies must be shown from left to right. Inheritance should be shown from bottom to top (i.e., from derived to base class). Component interdependencies should be represented via interfaces.

Conducting Component-Level Design

  1. Identify all the design classes from the problem domain.
  2. Identify all the design classes from the infrastructure domain.
  3. Detail out all the design classes that are not reusable components:
    1. When components communicate or collaborate, represent their communication using messages.
    2. Identify interfaces for each component.
    3. Define the attributes of the classes by specifying the data types and deciding the structures.
    4. Specify all the operations by describing the processing flow within each operation.
  4. Specify the databases and files and identify the classes involved in handling them.
  5. Describe the behavioral representation for the class.
  6. Design the deployment diagram to provide additional implementation context.
  7. Factor the component-level design.
  8. Find alternative design solutions as well.

Software Architectural Design and Styles

Architectural Design Definition

Architectural design is the process of identifying the subsystems that make up the system and defining the framework for subsystem control and communication. The goal of architectural design is to establish the overall structure of the software system. Architectural design represents the critical link between requirements engineering and the design process.

Common Architectural Styles

An architectural model or style is a pattern for creating the system architecture for a given problem. However, most large systems are heterogeneous and do not follow a single architectural style.

  1. Data-Centered Architectures
  2. Data Flow Architectures
  3. Call and Return Architectures
  4. Object-Oriented Architectures
  5. Layered Architectures

Data-Centered Architectures

In this architecture, a data store lies at the center, and other components frequently access it to add, delete, and modify data. Client software requests data from this central repository; in some cases the repository is passive, meaning clients access the data independently of any changes to the data or of the actions of other clients.

Data-centered architecture possesses the property of interchangeability, meaning any component can be replaced by a new component without affecting the working of other components.

Key characteristics:

  • Components: Database elements such as tables and queries.
  • Communication: Primarily via relationships defined in the data store.
  • Constraints: Client software must request information from the central data store.

Data Flow Architectures

In this architecture, a series of transformations is applied to input data to produce the output data. A set of components, called filters, are connected by pipes that transfer data from one component to the next. Each filter works independently, without concern for the workings of its neighboring filters. If the data flow degenerates into a single line of transforms, the style is termed batch sequential.
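
A small sketch of the pipes-and-filters idea, where each filter is an independent transformation and the pipe is simple composition; the filters are hypothetical.

    # Each filter transforms a stream of items without knowing about its neighbours.
    def read_lines(text):
        yield from text.splitlines()

    def strip_blanks(lines):
        return (line for line in lines if line.strip())

    def to_upper(lines):
        return (line.upper() for line in lines)

    # The "pipe": filters composed into a single line of transforms (batch sequential).
    pipeline = to_upper(strip_blanks(read_lines("alpha\n\nbeta\ngamma")))
    print(list(pipeline))   # ['ALPHA', 'BETA', 'GAMMA']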

Layered Architecture

The layered architecture is composed of different layers, where each layer is intended to perform specific operations, often building upon the services of the layer beneath it. Various components in each layer perform specific operations.

Typically:

  • The outer layer is responsible for user interface operations.
  • The inner layer components perform operating system interfaces or core services.
  • The intermediate layers perform utility services and application software functions.
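
A minimal sketch of the layering described above, with an invented user-interface function delegating to an application-level function, which in turn calls a storage-facing service; all names and data are hypothetical.

    # Outer (user interface) layer: presentation only, delegates to the layer below.
    def show_balance(user_id):
        print("Balance:", get_balance(user_id))

    # Intermediate (application/utility) layer: business rules, no UI and no storage details.
    def get_balance(user_id):
        return sum(read_transactions(user_id))

    # Inner layer: storage / operating-system-facing services.
    _FAKE_STORE = {"u1": [100.0, -25.5, 40.0]}
    def read_transactions(user_id):
        return _FAKE_STORE.get(user_id, [])

    show_balance("u1")   # prints: Balance: 114.5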

Software Risk Management (RMMM)

Defining Risk and Risk Management

Risk denotes uncertainty about future events or choices that, if they occur, may lead to loss. Risk Management refers to the process of making decisions based on an evaluation of the factors that pose threats to the business or project.

Reactive Risk Strategy

Reactive risk management is a strategy in which corrective action is taken only after the project runs into trouble. As unmanaged risks materialize and new risks surface one after another, the software team engages in rapid problem correction, often called “firefighting.” Resources are diverted to managing these immediate risks, and if risks remain unmanaged, the project is endangered.

In this older approach, no preventive action is taken; risks are handled only after they occur.

Proactive Risk Strategy

Proactive risk management begins before technical activity starts by considering probable risks. In this strategy:

  1. Potential risks are identified.
  2. Their probability and impact are analyzed.
  3. Risks are prioritized (high-priority risks are managed first).
  4. The software team prepares a plan for managing these risks.

The objective is risk avoidance (prevention is better than cure). Since it is not possible to avoid all risks, the team prepares the risk management plan to ensure efficient risk control. This is the intelligent strategy widely used by modern IT industries.

Risk Identification

Risk identification is based on two approaches:

  1. Generic Risk Identification: Includes identifying potential threats common to software projects.
  2. Product-Specific Risk Identification: Includes identifying threats specific to the product by understanding the people, technology, and working environment in which the product is built.

The project manager typically identifies risk items using the following known and predictable components:

  • Product Size: Risks based on the overall size of the software product.
  • Business Impact: Risks related to the marketplace or management decisions.
  • Customer Characteristics: Risks associated with customer-developer communication.
  • Process Definition: Risks arising from the definition of the software process. This category is crucial because the defined process is followed by the entire team.
  • Development Environment: Risks associated with the technology and tools used for development.
  • Staff Size and Experience: Risks associated with having sufficient, highly experienced, and skilled staff for development.
  • Technology to be Built: Risks related to the complexity of the system being developed.

Risk Projection (Risk Estimation)

Risk projection, also called risk estimation, involves rating risks based on two factors:

  1. The probability that the risk is real.
  2. The consequences of problems associated with the risk (impact).

The project planner, technical staff, and project manager perform the following steps for risk projection:

  • Establish a scale that indicates the probability of the risk being real.
  • Enlist the consequences of the risk.
  • Estimate the impact of the risk on the project and product.
  • Maintain the overall accuracy of the risk projection to ensure a clear understanding of the software to be built.

These steps help to prioritize the risks, making it easier to allocate resources for handling them.
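
One common way to turn the probability and consequence ratings into a priority order is the risk exposure product RE = probability × cost; the sketch below applies it to a few invented risk items.

    # (risk description, probability 0..1, estimated cost impact) -- all values hypothetical
    risks = [
        ("Key staff member leaves mid-project", 0.30, 60_000),
        ("Reusable components fail to integrate", 0.50, 25_000),
        ("Requirements change late in the cycle", 0.70, 40_000),
    ]

    # Risk exposure RE = probability * cost; higher exposure => higher priority.
    for desc, p, cost in sorted(risks, key=lambda r: r[1] * r[2], reverse=True):
        print(f"RE = {p * cost:>8,.0f}  {desc}")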

Risk Refinement (CTC Format)

Risk refinement is the process of specifying a risk in more detail. It can be represented using the CTC format (Condition-Transition-Consequence) suggested by D. P. Gluch.

The process involves:

  1. Stating the initial condition.
  2. Deriving sub-conditions based on the initial condition.
  3. Determining the effects (consequences) of these sub-conditions to refine the risk.

This refinement helps expose underlying risks and allows the project manager to analyze the risk in greater detail.

RMMM: Mitigation, Monitoring, and Management

RMMM stands for Risk Mitigation, Monitoring, and Management. The strategy for handling risk involves three core components:

  1. Risk Avoidance/Mitigation
  2. Risk Monitoring
  3. Risk Management (Contingency Planning)

Risk Mitigation (Avoidance)

Risk mitigation means preventing risks from occurring. Steps for mitigating risks include:

  1. Communicate with concerned staff to identify probable risks.
  2. Identify and eliminate all causes that could create risk before the project starts.
  3. Develop organizational policies to help the project continue even if staff members leave.
  4. Ensure every project team member is acquainted with the current development activity.
  5. Maintain corresponding documents in a timely manner, strictly adhering to organizational standards.
  6. Conduct timely reviews to speed up the work.
  7. Provide additional staff, if required, for critical software development activities.

Risk Monitoring

In the risk monitoring process, the project manager must track the following:

  • The approach or behavior of team members as project pressure varies.
  • The degree to which the team performs with a spirit of “teamwork.”
  • The type of cooperation among team members.
  • The types of problems that are occurring.
  • Availability of jobs within and outside the organization (staff retention risk).

The project manager should continuously monitor mitigation steps. For example, continuous monitoring of current development activity ensures everyone on the team remains acquainted with the project status.

The objectives of risk monitoring are:

  1. To check whether the predicted risks actually occur.
  2. To ensure the steps defined to avoid the risk are applied properly.
  3. To gather information useful for analyzing future risks.

Risk Management (Contingency)

The project manager performs this task when a risk becomes a reality. If risk mitigation has been applied effectively, managing the realized risk becomes much easier. For example, if sufficient additional staff is available, if the current development activity is known to everybody on the team, and if systematic documentation is available, any newcomer can quickly understand the current development activity, which ultimately helps the work continue without interruption.

Software Configuration Management (SCM)

Software Configuration Management (SCM) is a set of activities carried out for identifying, organizing, and controlling changes throughout the lifecycle of computer software. Change must be managed and controlled during development to improve quality and reduce errors. Hence, SCM is a quality assurance activity applied throughout the software process.

SCM is concerned with managing evolving software systems. It is a set of tracking and control activities that begin when a software development project starts and terminate only when the software is retired.

Origins of Software Change Requests

  1. New business or market positions cause changes in requirements.
  2. New stakeholders may require changes in existing requirements.
  3. Business growth or project extension necessitates changes in the project.
  4. Schedule or budget constraints sometimes mandate changes in the project.

The SCM Repository

Software Configuration Items (SCI) are maintained in a project repository or library. The software repository is essentially a database that acts as a center for both accumulation and storage of software engineering information. Software engineers interact with the repository using integrated tools.

Role of the Project Repository

The software repository is a collection of information accessed by software engineers to make appropriate changes. It is handled using modern database management functions and must maintain important properties such as:

  • Data integrity
  • Sharing
  • Integration

The repository must also maintain a uniform structure and format for all software engineering work products.

Software Testing and Quality Assurance

Verification vs. Validation

  • Verification asks “Are we building the product right?”, whereas validation asks “Are we building the right product?”
  • Verification refers to the set of activities that ensure the software correctly implements a specific function; validation refers to the set of activities that ensure the software built is traceable to customer requirements.
  • Verification starts after a valid and complete specification is available; validation begins as soon as the project starts.
  • Verification aims at the prevention of errors; validation aims at the detection of errors.
  • Verification is conducted using reviews, walkthroughs, inspections, and audits; validation is conducted using system testing, user interface testing, and stress testing.
  • Verification is also termed white box or static testing, as the work product is reviewed rather than executed; validation is termed black box or dynamic testing, as the work product is executed.
  • Verification finds about 50% to 60% of the defects; validation finds about 20% to 30% of the defects.
  • Verification is based on the opinion of the reviewer and may vary from person to person; validation is based on facts and is therefore more stable.
  • Verification is concerned with the process, standards, and guidelines; validation is concerned with the product.

Unit Testing vs. Integration Testing

  • Unit testing tests individual components or units of code in isolation; integration testing tests the interactions between integrated units or modules.
  • Unit testing verifies that a single function, method, or class works correctly; integration testing ensures that modules work together as expected.
  • Unit testing focuses on the functionality and correctness of a specific unit; integration testing focuses on data flow and interaction between modules.
  • Unit testing is typically done by developers; integration testing can be performed by developers or dedicated testers.
  • Unit testing has no external dependencies, since units are tested in isolation, often using mocks or stubs; integration testing relies on the integrated components working together.
  • Unit testing uses unit testing frameworks such as JUnit, NUnit, or PyTest; integration testing uses tools such as Postman, Selenium, or custom test scripts.
  • Unit testing focuses on internal logic and specific edge cases of the unit; integration testing focuses on interfaces, interactions, and integration scenarios.
  • Unit testing identifies bugs in specific functions or methods; integration testing identifies issues in module interactions or data flow.
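
As a small illustration of the unit-testing side of this comparison, the sketch below tests one isolated function with pytest (one of the frameworks named above); the function under test is hypothetical.

    # test_discount.py -- run with: pytest test_discount.py
    import pytest

    def apply_discount(price, percent):
        """Unit under test: a single, isolated function (hypothetical)."""
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        return round(price * (1 - percent / 100), 2)

    def test_apply_discount_normal_case():
        assert apply_discount(200.0, 25) == 150.0

    def test_apply_discount_rejects_bad_percent():
        with pytest.raises(ValueError):
            apply_discount(100.0, 120)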

Integration Testing Approaches

Top-Down Integration Testing

Top-down testing is an incremental approach in which modules are integrated by moving down through the control structure. Modules subordinate to the main control module are incorporated into the system in either a depth-first or breadth-first manner. The integration process is performed using the following steps:

  1. The main control module is used as a test driver, and stubs are substituted for all modules directly subordinate to the main control module.
  2. Subordinate stubs are replaced one at a time with actual modules using either the depth-first or breadth-first method.
  3. Tests are conducted as each module is integrated.
  4. Upon completion of each set of tests, another stub is replaced with the real module.
  5. Regression testing is conducted to prevent the introduction of new errors.
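
A minimal sketch of step 1, in which a stub temporarily stands in for a subordinate module so the main control module can be exercised early; the module names and values are hypothetical.

    # Stub: stands in for the (not yet integrated) subordinate billing module.
    def calculate_invoice_stub(order_id):
        return 100.0                                   # canned answer, enough to drive the control flow

    # Main control module, exercised early by wiring it to the stub.
    def process_order(order_id, calculate_invoice=calculate_invoice_stub):
        total = calculate_invoice(order_id)            # later, the real module replaces the stub
        return "Order %s: invoiced %.2f" % (order_id, total)

    print(process_order("A-17"))                       # Order A-17: invoiced 100.00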

Bottom-Up Integration Testing

In bottom-up integration, the modules at the lowest levels are integrated first, and then integration moves upward through the control structure. The bottom-up integration process is carried out using the following steps:

  1. Low-level modules are combined into clusters that perform a specific software subfunction.
  2. A driver program is written to coordinate test case input and output.
  3. The whole cluster is tested.
  4. Drivers are removed, and clusters are combined, moving upward in the program structure.
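
A matching sketch of step 2, in which a simple driver feeds test cases into a low-level cluster before the higher-level modules exist; the modules and test data are hypothetical.

    # Low-level cluster under test: two small modules that form one subfunction.
    def parse_amount(text):
        return float(text.strip())

    def add_tax(amount, rate=0.10):
        return round(amount * (1 + rate), 2)

    # Driver: coordinates test-case input and checks the cluster's output.
    def driver():
        cases = [("100", 110.0), ("40", 44.0)]
        for raw, expected in cases:
            result = add_tax(parse_amount(raw))
            assert result == expected, "%r: got %s, expected %s" % (raw, result, expected)
        print("cluster tests passed")

    driver()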

Alpha Testing vs. Beta Testing

  • Alpha testing is performed at the developer’s site; beta testing is performed at the end user’s site.
  • Alpha testing takes place in a controlled environment, as the developer is present; beta testing takes place in an uncontrolled environment, as the developer is not present.
  • Alpha testing has a lower probability of finding errors, as it is driven by the developers; beta testing has a higher probability of finding errors, as end users exercise the software in unexpected ways.
  • Alpha testing is done during the implementation phase of the software; beta testing is done as a pre-release of the software.
  • Alpha testing is less time-consuming, as the developer can make the necessary changes quickly; beta testing is more time-consuming, as users have to report bugs through an appropriate channel.
  • Alpha testing is not considered a live application of the software; beta testing is considered a live application in a real-world context.

Black Box Testing vs. White Box Testing

  • Black box (behavioral) testing examines the fundamental aspects of the system with little regard for the internal logical structure of the software; white box (glass box) testing closely examines procedural details, all logical paths, and all internal data structures.
  • With black box testing, the program cannot be tested 100 percent; white box testing leads to testing the program thoroughly.
  • Black box testing is suitable for large projects; white box testing is suitable for small projects.

Software Testing Life Cycle (STLC)

The Software Testing Life Cycle (STLC) is a structured process that defines the various stages of testing activities to ensure software quality. It is an integral part of the Software Development Life Cycle (SDLC) and focuses specifically on testing-related tasks.

  1. Requirement Analysis: Understand the testing requirements and identify what needs to be tested. This involves analyzing functional and non-functional requirements, identifying testable requirements, and collaborating with stakeholders.
  2. Test Planning: Create a detailed test strategy and plan to define the scope and approach for testing. This includes defining test objectives, scope, and deliverables; estimating effort, resources, and timeline; identifying risks and preparing mitigation plans; and selecting testing tools and environments.
  3. Test Case Development: Create and document detailed test cases and prepare test data. This involves writing test cases based on requirements and creating reusable and modular test scripts.
  4. Environment Setup: Prepare the environment where testing will be conducted.
  5. Test Execution: Execute test cases and log results to identify defects.