Core Concepts in Data Mining and Machine Learning
Association Rule Mining Fundamentals
Association Rule Mining (ARM) is a data mining technique used to uncover interesting relationships, patterns, or associations among items in large datasets. It is commonly applied in market basket analysis to discover how items are purchased together.
Understanding Association Rules
An association rule is typically written in the form:
A → B
Where A and B are disjoint itemsets. The rule states that transactions containing itemset A tend to also contain itemset B.
Key Measures for Rule Evaluation
To evaluate the strength and usefulness of an association rule, three important measures are used:
Support
Support measures how frequently an itemset appears in the dataset. For a rule A → B, it is the proportion of transactions that contain both A and B: Support(A → B) = (transactions containing both A and B) / (total transactions). Higher support means the rule applies to a larger share of the data.
Confidence
Confidence is a measure of the reliability of the rule. It tells how often B appears in transactions that contain A: Confidence(A → B) = Support(A and B) / Support(A). A higher confidence means a stronger association between A and B.
Lift
Lift measures how much more likely B is purchased when A is purchased, compared to how often B is purchased overall: Lift(A → B) = Confidence(A → B) / Support(B). It helps distinguish a real association between A and B from co-occurrence that happens by chance.
- Lift > 1: A and B occur together more often than expected if they were independent (positive association).
- Lift = 1: A and B are independent.
- Lift < 1: A and B occur together less often than expected (negative association).
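The three measures above can be computed directly from a list of transactions. A minimal sketch, using a made-up basket dataset and the hypothetical rule {bread} → {butter}:

```python
# Made-up transactions for illustration; each transaction is a set of items.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "jam"},
    {"bread", "butter", "jam"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in the itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

A, B = {"bread"}, {"butter"}
supp_ab = support(A | B, transactions)      # support of the rule A -> B
conf = supp_ab / support(A, transactions)   # confidence = support(A and B) / support(A)
lift = conf / support(B, transactions)      # lift = confidence / support(B)

print(f"support={supp_ab:.2f} confidence={conf:.2f} lift={lift:.2f}")
# -> support=0.60 confidence=0.75 lift=0.94
```

Here the lift is below 1, so in this toy dataset bread buyers are actually slightly less likely than average to buy butter.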
These metrics help in identifying the most significant and interesting rules that can be used for decision-making, product placement, and targeted marketing.
Reinforcement Learning (RL) Principles
Reinforcement Learning is a type of machine learning where an agent learns by interacting with its environment to achieve a specific goal. The agent receives feedback in the form of rewards or penalties based on the actions it takes. Over time, the agent learns the best strategy (policy) to maximize the total reward.
Reinforcement Learning is based on the idea of trial and error, where the agent improves its performance by continuously exploring the environment and learning from its past actions. It is widely used in robotics, game playing, autonomous vehicles, recommendation systems, and more.
Main Components of Reinforcement Learning
Agent
The agent is the learner or decision-maker. It performs actions in the environment to achieve a goal. The agent chooses actions based on a policy.
Example: A robot learning to walk or a program learning to play chess.
Environment
The environment is everything outside the agent. It provides the state to the agent and gives a reward after each action. The environment reacts to the agent’s actions and changes accordingly.
Example: A grid maze, a game board, or a traffic system.
Policy (π)
A policy is the agent’s strategy or rule for selecting actions. It can be deterministic (fixed action for each state) or stochastic (probability-based). The goal of the agent is to find the optimal policy that gives the maximum reward over time.
Example: “If traffic light is red, then stop.”
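The deterministic/stochastic distinction can be sketched in a few lines. The states and actions below are invented for illustration, extending the traffic-light example:

```python
import random

# Deterministic policy: a fixed action for each state.
deterministic_policy = {"red": "stop", "green": "go", "yellow": "slow"}

def act_deterministic(state):
    return deterministic_policy[state]

# Stochastic policy: a probability distribution over actions per state
# (here, slow down on yellow 80% of the time, stop 20% of the time).
stochastic_policy = {"yellow": [("slow", 0.8), ("stop", 0.2)]}

def act_stochastic(state):
    actions, probs = zip(*stochastic_policy[state])
    return random.choices(actions, weights=probs, k=1)[0]

print(act_deterministic("red"))   # always "stop"
print(act_stochastic("yellow"))   # "slow" about 80% of the time
```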
State (s)
A state is the current situation or condition of the agent in the environment. It contains all the necessary information for decision-making. The state changes when the agent takes an action.
Example: The location of the agent in a maze or the score in a game.
Reward (R)
A reward is a numeric value that the agent receives after performing an action. It tells the agent how good or bad its action was. The main objective of the agent is to maximize the total cumulative reward.
Example: +10 for winning a game, -1 for losing a life.
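The "total cumulative reward" the agent maximizes is often written as a discounted sum, where a discount factor gamma (an assumption added here, not stated above) weighs near-term rewards more heavily than distant ones. A minimal sketch with a made-up episode:

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of gamma**t * r_t over an episode; gamma=1 gives the plain sum."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))

episode = [-1, -1, 10]   # e.g. two small penalties, then a win
print(discounted_return(episode))          # -1 + 0.9*(-1) + 0.81*10 = 6.2
print(discounted_return(episode, gamma=1)) # undiscounted total: 8
```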
Working Process of Reinforcement Learning
- The agent observes the current state from the environment.
- It takes an action based on its policy.
- The environment responds by giving a reward and a new state.
- The agent uses this feedback to update its policy for better future decisions.
RL Interaction Diagram
[Agent] → performs Action → [Environment]
← gets Reward & New State ←
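The observe → act → reward → update loop above can be sketched with tabular Q-learning, one standard RL algorithm (the notes above do not name a specific one). The environment is a made-up 1-D corridor: states 0 to 4, the agent starts at 0, earns +10 for reaching state 4 and -1 per step otherwise.

```python
import random

N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]  # move left / move right

def step(state, action):
    """Environment: reacts to the action with (new_state, reward, done)."""
    new_state = min(max(state + action, 0), N_STATES - 1)
    if new_state == GOAL:
        return new_state, 10.0, True
    return new_state, -1.0, False

# Q-table: expected return for each (state, action) pair, initially 0.
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.1  # learning rate, discount, exploration

random.seed(0)
for episode in range(200):
    state, done = 0, False
    while not done:
        # Agent: epsilon-greedy trial and error over its current policy.
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        new_state, reward, done = step(state, action)
        # Update: nudge Q(s, a) toward reward + gamma * best future value.
        best_next = max(Q[(new_state, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = new_state

# The learned greedy policy should move right toward the goal.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES)}
print(policy)
```

After training, the greedy policy in states 0 through 3 is "move right", i.e. the agent has learned the shortest path to the goal purely from reward feedback.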
