
6CCS3AIN  Pacman MDP-solver

Posted: 2024-11-28 19:56:05
Tags: code, Pacman, py, your, will, games, 6CCS3AIN, MDP

6CCS3AIN  Coursework

1 Introduction

This coursework exercise asks you to write code to create an MDP-solver to work in the Pacman environment that we used for the practical exercises. Read all these instructions before starting. This exercise will be assessed.

2 Getting started

You should download the file pacman-cw.zip from KEATS. This contains a familiar set of files that implement Pacman, and version 6 of api.py, which defines the observability of the environment that you will have to deal with and the same non-deterministic motion model that the practicals used. Version 6 of api.py further extends what Pacman can know about the world. In addition to knowing the location of all the objects in the world (walls, food, capsules, ghosts), Pacman can now see what state the ghosts are in, and so can decide whether they have to be avoided or not.

3 What you need to do

3.1 Write code

This coursework requires you to write code to control Pacman and win games using an MDP-solver. For each move, you will need to have the model of Pacman’s world, which consists of all the elements of a Markov Decision Process, namely:

  • A finite set of states S;
  • A finite set of actions A;
  • A state-transition function P(s' | s, a);
  • A reward function R;
  • A discount factor γ ∈ [0, 1].

Following this you can then compute the action to take, either via Value Iteration, Policy Iteration or Modified Policy Iteration. It is expected that you will correctly implement such a solver and optimize the choice of the parameters. There is a (rather familiar) skeleton piece of code to take as your starting point in the file mdpAgents.py. This code defines the class MDPAgent.

Mallmann-Trenn / McBurney / 6ccs3ain-cw

There are two main aims for your code:

(a) Win hard in smallGrid
(b) Win hard in mediumClassic

To win games, Pacman has to be able to eat all the food. In this coursework, for these objectives, “winning” just means getting the environment to report a win. Score is irrelevant.
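To make the MDP ingredients listed above concrete, here is a minimal value-iteration sketch on a toy three-cell corridor. It does not use api.py or any Pacman code. The 0.8/0.1/0.1 motion model, the reward values, the discount factor and every function name are assumptions chosen for illustration, not values taken from the coursework.

```python
# Toy value iteration on a three-cell corridor; NOT the Pacman API.
# Assumed motion model: 0.8 intended direction, 0.1 each perpendicular one.
MOVES = {'N': (0, 1), 'S': (0, -1), 'E': (1, 0), 'W': (-1, 0)}
SIDEWAYS = {'N': 'EW', 'S': 'EW', 'E': 'NS', 'W': 'NS'}


def transitions(state, action, states):
    """Return (probability, next_state) pairs, i.e. P(s' | s, a)."""
    outcomes = [(0.8, action)] + [(0.1, side) for side in SIDEWAYS[action]]
    pairs = []
    for prob, direction in outcomes:
        dx, dy = MOVES[direction]
        nxt = (state[0] + dx, state[1] + dy)
        # Anything outside the state set is treated as a wall: stay put.
        pairs.append((prob, nxt if nxt in states else state))
    return pairs


def expected_value(state, action, states, values):
    """Expected utility of the successor state when taking `action`."""
    return sum(p * values[s2] for p, s2 in transitions(state, action, states))


def value_iteration(states, rewards, terminals, gamma=0.9, tol=1e-6):
    """Repeat Bellman updates until the largest change falls below tol."""
    values = dict((s, 0.0) for s in states)
    while True:
        updated = {}
        for s in states:
            if s in terminals:
                updated[s] = rewards[s]
            else:
                best = max(expected_value(s, a, states, values) for a in MOVES)
                updated[s] = rewards[s] + gamma * best
        delta = max(abs(updated[s] - values[s]) for s in states)
        values = updated
        if delta < tol:
            return values


def best_action(state, states, values):
    """Greedy policy: the action with the highest expected utility."""
    return max(sorted(MOVES),
               key=lambda a: expected_value(state, a, states, values))


# Hypothetical example: food in the last cell, a small cost per step elsewhere.
states = set([(0, 0), (1, 0), (2, 0)])
rewards = {(0, 0): -1.0, (1, 0): -1.0, (2, 0): 10.0}
values = value_iteration(states, rewards, terminals=set([(2, 0)]))
policy_at_start = best_action((0, 0), states, values)
```

In this toy setting the utilities increase towards the food cell, so the greedy action from the first cell is to head east. The reward values and γ are the parameters the coursework asks you to tune for the real layouts.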

3.1.1 Getting Excellence points

There is a difference between winning a lot and winning well. This is why completing aims (a) and (b) from the previous section allows you to collect up to 80 points in the Coursework. The remaining 20 points are obtained by having a high Excellence Score Difference in the mediumClassic layout, a metric that comes directly from having a high average winning score. This can be done through different strategies, for example by chasing edible ghosts.

A couple of things to be noted. Let $W$ be the set of games won, so $|W| \in [0, 25]$. For any won game $i \in W$, define $s_w(i)$ to be the score obtained in game/run $i$.

  • $\Delta S_e$ in the marksheet is the Excellence Score Difference. You can use the following formula to calculate it when you test your code and compare the result against the values in Table 3:

    $$\Delta S_e = \sum_{i \in W} \bigl(s_w(i) - 1500\bigr) \tag{1}$$

    Losses count as 0 score and are not considered. If $\Delta S_e < 0$, we set it to 0 (you cannot have a negative excellence score difference).
  • Because smallGrid does not have room for score improvement, we will only look at the mediumClassic layout.
  • You can still get excellence points if your code performs poorly in the number of wins; marking points are assigned independently in the two sections.
  • Note however that marking points are assigned such that it is not convenient for you to directly aim for a higher average winning score without securing the previous section’s aims (a) and (b) first.
  • We will use the same runs in mediumClassic to derive the marks for Table 2 and Table 3.
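The Excellence Score Difference from formula (1) can be checked with a few lines of Python when you test your own runs. This is only a sketch: the function name and the (won, score) input format are assumptions, and you would have to collect the per-game results yourself.

```python
def excellence_score_difference(results, baseline=1500):
    """Compute the Excellence Score Difference from formula (1).

    `results` is a list of (won, score) pairs, one per game. Lost games
    are skipped, and a negative total is clamped to 0.
    """
    total = sum(score - baseline for won, score in results if won)
    return max(0, total)


# Hypothetical results: two wins and one loss.
example = [(True, 1700), (False, -300), (True, 1400)]
difference = excellence_score_difference(example)  # (1700-1500) + (1400-1500) = 100
```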

3.2 Things to bear in mind

Some things that you may find helpful:

(a) We will evaluate whether your code can win games in smallGrid by running:

    python pacman.py -q -n 25 -p MDPAgent -l smallGrid

    -l is shorthand for --layout. -p is shorthand for --pacman. -q runs the game without the interface (making it faster).

(b) We will evaluate whether your code can win games in mediumClassic by running:

    python pacman.py -q -n 25 -p MDPAgent -l mediumClassic

    The -n 25 runs 25 games in a row.

(c) The time limit for evaluation is 25 minutes for mediumClassic and 5 minutes for smallGrid. It will run on a high performance computer with 26 cores and 192 GB of RAM. The time constraints were chosen after repeated practical experience and reflect a fair time bound.

(d) When using the -n option to run multiple games, the same agent (the same instance of MDPAgent.py) is run in all the games. That means you might need to change the values of some of the state variables that control Pacman’s behaviour in between games. You can do that using the final() function.

(e) There is no requirement to use any of the methods described in the practicals, though you can use these if you wish.

(f) If you wish to use the map code I provided in MapAgent, you may do this, but you need to include comments that explain what you used and where it came from (just as you would for any code that you make use of but don’t write yourself).

(g) You can only use libraries that are part of the standard Python 2.7 distribution. This ensures that (a) everyone has access to the same libraries (since only the standard distribution is available on the lab machines) and (b) we don’t have trouble running your code due to some library incompatibilities.

(h) You should comment your code and have a consistent style all over the file.

3.3 Limitations

There are some limitations on what you can submit.

(a) Your code must be in Python 2.7. Code written in a language other than Python will not be marked. Code written in Python 3.X is unlikely to run with the clean copy of pacman-cw that we will test it against. If it doesn’t run, you will lose marks. Code using libraries that are not in the standard Python 2.7 distribution will not run (in particular, NumPy is not allowed). If you choose to use such libraries and your code does not run as a result, you will lose marks.

(b) Your code must only interact with the Pacman environment by making calls through functions in Version 6 of api.py. Code that finds other ways to access information about the environment will lose marks. The idea here is to have everyone solve the same task, and have that task explore issues with non-deterministic actions.

(c) You are not allowed to modify any of the files in pacman-cw.zip except mdpAgents.py. Similar to the previous point, the idea is that everyone solves the same problem; you can’t change the problem by modifying the base code that runs the Pacman environment. Therefore, you are not allowed to modify the api.py file.

(d) You are not allowed to copy, without credit, code that you might get from other students or find lying around on the Internet. We will be checking. This is the usual plagiarism statement. When you submit work to be marked, you should only seek to get credit for work you have done yourself. When the work you are submitting is code, you can use code that other people wrote, but you have to say clearly that the other person wrote it. You do that by putting in a comment that says who wrote it. That way we can adjust your mark to take account of the work that you didn’t do.

(e) Your code must be based on solving the Pacman environment as an MDP. If you don’t submit a program that contains a recognisable MDP solver, you will lose marks.

(f) The only MDP solvers we will allow are the ones presented in the lecture, i.e., Value iteration, Policy iteration and Modified policy iteration. In particular, Q-Learning is unacceptable.

(g) Your code must only use the results of the MDP solver to decide what to do. If you submit code which makes decisions about what to do that uses other information in addition to what the MDP-solver generates (like ad-hoc ghost avoiding code, for example), you will lose marks. This is to ensure that your MDP-solver is the thing that can win enough games to pass the functionality test.

4 What you have to hand in

Your submission should consist of a single ZIP file. (KEATS will be configured to only accept a single file.) This ZIP file must include a single Python .py file (your code).

The ZIP file must be named:

    cw <lastname> <firstname>.zip

so my ZIP file would be named cw mallmann-trenn frederik.zip.

Remember that we are going to evaluate your code by running it using variations on

    python pacman.py -p MDPAgent

(see Section 5 for the exact commands we will use) and we will do this in a vanilla copy of the pacman-cw folder, so the base class for your MDP-solving agent must be called MDPAgent.

To streamline the marking of the coursework, you must put all your code in one file, and this file must be called mdpAgents.py.

Do not just include the whole pacman-cw folder. You should only include the one file that includes the code you have written.

Submissions that do not follow these instructions will lose marks. That includes submissions which are RAR files. RAR is not ZIP.

5 How your work will be marked

See cw-marksheet.pdf for more information about the marking.

There will be six components of the mark for your work:

(a) Functionality

    We will test your code by running your .py file against a clean copy of pacman-cw. As discussed above, the number of games you win determines the number of marks you get. Since we will check it this way, you may want to reset any internal state in your agent using final() (see Section 3.2). For the excellence marks, we will look at the winning scores for the mediumClassic layout.

    Since we have a lot of coursework to mark, we will limit how long your code has to demonstrate that it can win. We will terminate the run of the 25 smallGrid games after 5 minutes, and will terminate the run of the 25 mediumClassic games after 25 minutes. If your code has failed to win enough games within these times, we will mark it as if it lost. Note that we will use the -q option, which runs Pacman without the interface, to speed things up.

(b) Code not written in Python will not be marked.

(c) Code that does not run in our test setting will receive 0 marks, regardless of the reason.

(d) We will release the random seed that we use for marking. Say the seed is 42, then you need to do the following to verify our marking is correct:

  1. Fix the random seed to 42 (an int, not the string ’42’) at line 541 of pacman.py.
  2. Download a fresh copy of the new api (to avoid using files you modified yourself).
  3. Run python pacman.py -q -f -n 25 -p MDPAgent -l mediumClassic
  4. You should get the same result as us. If not, repeat step 3. Should the outcome be different, then you didn’t fix the random seed correctly; go back to step 1.

A copy of the marksheet, which shows the distribution of marks across the different elements of the coursework, will be available from KEATS.
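Sections 3.2 and 5 both point out that one agent instance is reused across all 25 games and suggest resetting its state in final(). A minimal sketch of that pattern follows. The class here is a stand-in, not the real MDPAgent from mdpAgents.py; the attribute names are hypothetical, and the getAction(state) and final(state) signatures are assumed to match the skeleton, so check them against your copy.

```python
class MDPAgentSketch(object):
    """Stand-in for MDPAgent; the real class extends the skeleton's base class."""

    def __init__(self):
        self._reset()

    def _reset(self):
        # Per-game state (hypothetical names): anything cached here would
        # otherwise leak from one game into the next.
        self.game_map = None
        self.utilities = {}
        self.moves_made = 0

    def getAction(self, state):
        # Placeholder: a real agent would run its MDP solver here and
        # return the best legal move through api.py.
        self.moves_made += 1
        return 'Stop'

    def final(self, state):
        # Assumed to be called once at the end of every game.
        self._reset()


agent = MDPAgentSketch()
agent.getAction(None)
agent.utilities[(1, 1)] = 0.5
agent.final(None)  # state from the finished game is cleared
```

Keeping all per-game fields inside one reset helper means the constructor and final() cannot drift apart as the agent grows.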

From: https://www.cnblogs.com/CSE231/p/18574048
