Reinforcement Learning

Classical Reinforcement Learning

Markov Decision Process

  • Introduction
  • What is Reinforcement Learning?
  • Agent-Environment Interaction
  • State Vectors
  • Objective of RL Agent
  • Actions & Policy
  • Exploration vs Exploitation
  • Markov State
  • Markov Decision Process (MDP)
  • Value Function
  • Optimal Policy
  • Model of the Environment
  • RL vs Supervised Learning
  • Inventory Management (MDP)

Fundamental Equations in RL

  • Introduction
  • RL Equations – State Value Function
  • RL Equations – Action Value Function
  • Understanding the RL Equations
  • Bellman Equations of Optimality
  • Policy Improvement
  • Introduction
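
For orientation, the equations this module names can be written as follows. This is a sketch in the common textbook notation, which is an assumption here: $\pi(a \mid s)$ is the policy, $p(s', r \mid s, a)$ the environment dynamics, and $\gamma$ the discount factor. The course may use different symbols.

```latex
\begin{align}
v_\pi(s)   &= \sum_{a} \pi(a \mid s) \sum_{s', r} p(s', r \mid s, a)\,\bigl[r + \gamma\, v_\pi(s')\bigr] \\
q_\pi(s,a) &= \sum_{s', r} p(s', r \mid s, a)\,\Bigl[r + \gamma \sum_{a'} \pi(a' \mid s')\, q_\pi(s', a')\Bigr] \\
v_*(s)     &= \max_{a} \sum_{s', r} p(s', r \mid s, a)\,\bigl[r + \gamma\, v_*(s')\bigr] \\
q_*(s,a)   &= \sum_{s', r} p(s', r \mid s, a)\,\Bigl[r + \gamma \max_{a'} q_*(s', a')\Bigr] \\
\pi'(s)    &= \arg\max_{a}\, q_\pi(s, a)
\end{align}
```

The first two lines are the Bellman expectation equations for the state and action value functions, the next two are the Bellman equations of optimality, and the last line is the greedy policy improvement step.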

Model-Based Method – Dynamic Programming

  • Dynamic Programming
  • Policy Iteration – Algorithm
  • Policy Evaluation – Prediction
  • Policy Improvement – Control
  • Policy Iteration – GridWorld
  • Value Iteration
  • Generalised Policy Iteration (GPI)
  • Ad Placement Optimization (Demo)
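
As a rough illustration of the value iteration topic above, here is a minimal sketch on a hand-made three-state MDP. The `TRANSITIONS` table, the helper names, and the reward values are hypothetical and chosen only for this example; they are not taken from the course demos.

```python
import numpy as np

# Hypothetical MDP: TRANSITIONS[state][action] = [(prob, next_state, reward), ...]
TRANSITIONS = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 0.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 2, 1.0)]},
    2: {0: [(1.0, 2, 0.0)], 1: [(1.0, 2, 0.0)]},  # absorbing state, no further reward
}


def action_values(values, state, gamma):
    """One-step lookahead: expected return of each action under the model."""
    return {
        action: sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)
        for action, outcomes in TRANSITIONS[state].items()
    }


def value_iteration(gamma=0.9, tol=1e-8):
    """Apply the Bellman optimality backup in place until the values stop changing."""
    values = np.zeros(len(TRANSITIONS))
    while True:
        delta = 0.0
        for state in TRANSITIONS:
            best = max(action_values(values, state, gamma).values())
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:
            break
    # Extract the greedy policy from the converged values.
    policy = {}
    for state in TRANSITIONS:
        q = action_values(values, state, gamma)
        policy[state] = max(q, key=q.get)
    return values, policy


values, policy = value_iteration()
```

Because the method needs the full transition table, it is a model-based approach; the model-free methods in the next module drop that requirement.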

Model-Free Methods

  • Introduction
  • Intuition behind Monte-Carlo Methods
  • Monte-Carlo Prediction & Demo
  • Monte-Carlo Control
  • Off Policy
  • Temporal Difference
  • Q-Learning with Pseudocode
  • Cliff Walking Demo
  • Ad Placement Optimization Demo – Q-Learning
  • OpenAI Gym – Taxi-v2
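
The Q-learning topic above can be sketched in a few lines of tabular code. To stay self-contained, this example uses a hypothetical five-state chain instead of the Gym Taxi environment; the environment, function names, and hyperparameters are all assumptions made for illustration.

```python
import numpy as np

N_STATES = 5      # hypothetical chain with states 0..4; state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic chain dynamics: reward 1 only when the last state is reached."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def epsilon_greedy(q_row, epsilon, rng):
    """Explore with probability epsilon, otherwise act greedily (ties broken at random)."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_row)))
    best = np.flatnonzero(q_row == q_row.max())
    return int(rng.choice(best))


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, max_steps=100, seed=0):
    rng = np.random.default_rng(seed)
    q = np.zeros((N_STATES, len(ACTIONS)))
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = epsilon_greedy(q[state], epsilon, rng)
            next_state, reward, done = step(state, action)
            # Off-policy target: bootstrap from the best next action, not the one taken.
            target = reward + (0.0 if done else gamma * q[next_state].max())
            q[state, action] += alpha * (target - q[state, action])
            state = next_state
            if done:
                break
    return q


q_table = q_learning()
greedy_policy = q_table.argmax(axis=1)
```

The update never consults a transition model, only sampled `(state, action, reward, next_state)` tuples, which is what makes the method model-free.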

Inventory Management Demo

  • Introduction
  • Problem Statement
  • MDP code
  • Q-Learning code
  • Results

Assignment – Classical Reinforcement Learning

Assignment – Tic-Tac-Toe

Deep Reinforcement Learning

Introduction

Want to train an agent that plays Atari games? Learn how to approximate the Q-function or the policy with deep reinforcement learning algorithms: Deep Q-Learning, policy gradient methods, and actor-critic methods.

Architectures of Deep Q Learning

  • Architectures of Deep Q Network
  • DQN Architecture II – Visualisation
  • DQN Demo – Cartpole Environment
  • Double DQN – A DQN Variation

Deep Q Learning

  • Introduction
  • Why Deep Reinforcement Learning?
  • Parameterised Representation
  • Generalizability in Deep RL
  • Deep Q Learning
  • Training in Deep Reinforcement Learning
  • Replay Buffer
  • Generate Data for Training
  • Target in DQN
  • When to stop training?
  • Atari Game
  • Introduction
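
Two of the topics above, the replay buffer and the DQN target, can be sketched without a deep learning framework. The class and function names below are hypothetical, and the Q-values are passed in as plain NumPy arrays standing in for the output of a target network.

```python
import random
from collections import deque

import numpy as np


class ReplayBuffer:
    """Fixed-size store of transitions; the oldest entries are dropped first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling breaks the correlation between consecutive steps.
        batch = random.sample(list(self.buffer), batch_size)
        states, actions, rewards, next_states, dones = map(np.array, zip(*batch))
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)


def dqn_targets(rewards, next_q_values, dones, gamma=0.99):
    """y = r + gamma * max_a' Q_target(s', a'), with no bootstrapping on terminal steps."""
    not_done = 1.0 - np.asarray(dones, dtype=np.float64)
    return np.asarray(rewards, dtype=np.float64) + gamma * next_q_values.max(axis=1) * not_done


# Usage with made-up numbers: the second transition is terminal.
targets = dqn_targets(
    rewards=np.array([1.0, 0.0]),
    next_q_values=np.array([[1.0, 2.0], [3.0, 4.0]]),
    dones=np.array([False, True]),
    gamma=0.5,
)
```

In a full DQN, `next_q_values` would come from a periodically updated copy of the network, and the online network would be regressed toward `targets` for the actions actually taken.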

Policy Gradient Methods

  • Introduction to Policy Gradient Methods
  • The Intuition of Policy-Based Methods
  • Comparing DQN and Policy-Based Methods
  • Path Probability
  • Objective Function
  • Gradient of the Objective Function
  • The Update Rule
  • Step-by-Step Update
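
The chain of topics above (path probability, objective, gradient, update rule) can be summarised in one short derivation. The notation is an assumption: $\tau = (s_0, a_0, \dots, s_T)$ is a trajectory, $R(\tau)$ its total reward, $\pi_\theta$ the parameterised policy, and $\alpha$ the learning rate.

```latex
\begin{align}
P(\tau; \theta) &= p(s_0) \prod_{t=0}^{T-1} \pi_\theta(a_t \mid s_t)\, p(s_{t+1} \mid s_t, a_t) \\
J(\theta) &= \mathbb{E}_{\tau \sim P(\cdot\,; \theta)}\bigl[R(\tau)\bigr] \\
\nabla_\theta J(\theta) &= \mathbb{E}_{\tau \sim P(\cdot\,; \theta)}\Bigl[R(\tau) \sum_{t=0}^{T-1} \nabla_\theta \log \pi_\theta(a_t \mid s_t)\Bigr] \\
\theta &\leftarrow \theta + \alpha\, \widehat{\nabla_\theta J(\theta)}
\end{align}
```

The environment dynamics $p(s_{t+1} \mid s_t, a_t)$ do not depend on $\theta$, so they vanish when the log of the path probability is differentiated. That is why the gradient can be estimated from sampled trajectories alone.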

Actor-Critic Methods

  • Introduction
  • The Need for Actor-Critic Methods
  • Addressing the Problem of Variance
  • Justification for Adding the Baseline
  • Reducing Variance Using the Baseline
  • Appropriate Choice of the Baseline
  • Policy Gradient (REINFORCE)
  • Actor-Critic Methods: Training
  • Training Process: Summary
  • Illustration: Defining the State Space
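
The baseline argument listed above can be stated compactly. This is a sketch in assumed notation: $G_t$ is the return from step $t$, $b(s)$ a baseline that depends only on the state, $V_w$ a learned critic with parameters $w$, and $\alpha$, $\beta$ the actor and critic learning rates.

```latex
\begin{align}
\nabla_\theta J(\theta) &= \mathbb{E}\Bigl[\sum_{t} \nabla_\theta \log \pi_\theta(a_t \mid s_t)\,\bigl(G_t - b(s_t)\bigr)\Bigr] \\
\mathbb{E}_{a \sim \pi_\theta(\cdot \mid s)}\bigl[\nabla_\theta \log \pi_\theta(a \mid s)\, b(s)\bigr]
  &= b(s)\, \nabla_\theta \sum_{a} \pi_\theta(a \mid s) = b(s)\, \nabla_\theta 1 = 0 \\
\delta_t &= r_{t+1} + \gamma\, V_w(s_{t+1}) - V_w(s_t) \\
\theta &\leftarrow \theta + \alpha\, \delta_t\, \nabla_\theta \log \pi_\theta(a_t \mid s_t) \\
w &\leftarrow w + \beta\, \delta_t\, \nabla_w V_w(s_t)
\end{align}
```

The second line shows that subtracting a state-dependent baseline leaves the gradient unbiased, so it can only change the variance. Choosing $b(s) = V_w(s)$ and replacing $G_t - V_w(s_t)$ with the one-step TD error $\delta_t$ gives a basic actor-critic update, with the policy as actor and $V_w$ as critic.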

Reinforcement Learning Project

Problem Statement

Improve ride recommendations for cab drivers by building an RL-based algorithm with vanilla Deep Q-Learning (DQN). The goal is to maximize each driver’s profits, which in turn helps the cab aggregator service retain its drivers.

Dr. Hari Thapliyaal