
CITS1401 Computational Thinking with Python

Posted: 2024-09-12 12:24:26

CITS1401 Computational Thinking with Python

Project 1, Semester 2, 2024

Department of Computer Science and Software Engineering

The University of Western Australia

CITS1401 Computational Thinking with Python

Project 1, Semester 2, 2024 (Individual project)

Submission deadline: 23:59, 13 September 2024.

Total Marks: 30

Project Submission Guidelines:

You should construct a Python 3 program containing your solution to the given problem and submit your program electronically on Moodle. The name of the file containing your code should be your student ID, e.g. 12345678.py. No other method of submission is allowed. Please note that this is an individual project.

  • Your program will be automatically run on Moodle for sample test cases provided in the project sheet if you click the “check” link. However, this does not test all required criteria and your submission will be thoroughly tested manually for grading purposes after the due date. Remember you need to submit the program as a single file and copy-paste the same program in the provided text box.
  • You have only one attempt to submit, so don’t submit until you are satisfied with your attempt.
  • All open submissions at the time of the deadline will be automatically submitted. There is no way in the system to open/modify/reverse your submission.
  • You must submit your project before the deadline listed above. Following UWA policy, a late penalty of 5% will be deducted for each day (or part day, i.e., each 24-hour period) that the assignment is submitted after the deadline.
  • No submissions will be allowed after 7 days following the deadline except approved special consideration cases.

You are expected to have read and understood the University's guidelines on academic conduct.

In accordance with this policy, you may discuss with other students the general principles required to understand this project, but the work you submit must be the result of your own effort. Plagiarism detection, and other systems for detecting potential malpractice, will therefore be used. Besides, if what you submit is not your own work then you will have learnt little and will therefore, likely, fail the final exam.

Project Overview:

In the rapidly expanding world of e-commerce, platforms like Amazon provide vast amounts of data that can offer valuable insights into various aspects of product performance. This project aims to analyze Amazon data for different products within specific categories, utilizing key parameters such as product ID, product name, category, discounted price, actual price, ratings, rating count, etc. The data set includes a diverse range of categories, each with multiple products, allowing us to identify trends and patterns specific to each category.

You are required to write a Python 3 program that will read two different files: a CSV file and a TXT file. Your program will perform four different tasks outlined below. While the CSV file is required to solve all the tasks (Tasks 1-4), the TXT file is only required for the last task (Task 4).

After reading the CSV file, your program is required to complete the following:

  • Task 1: Identify Extreme Discount Prices
    Find the product ID with the highest discounted price and the product ID with the lowest discounted price for a specific category.

  • Task 2: Summarize Price Distribution
    Provide a summary of the ‘actual price’ distribution, i.e., mean, median and mean absolute deviation of products for a specific category, considering only the products with a rating count higher than 1000.

  • Task 3: Calculate Standard Deviation of Discounted Percentages
    Calculate the standard deviation of the discounted percentages for products with a rating in the range 3.3 ≤ rating ≤ 4.3, for each category.

  • Task 4: Correlate Sales Data
    Find the correlation between the sales of the products identified in Task 1 (products with the highest and lowest discounted prices for a specific category).

Steps:

o Read the TXT file, which contains the sales data for several years (for example, from 1998 onwards). Each line lists product IDs and the units sold for that year. If a product ID is not mentioned in a line, it means zero units were sold in that year.


o Create two lists, one for the sales of the product with the highest discounted price and another for the sales of the product with the lowest discounted price identified in Task 1.

o Process each line of the TXT file to determine the number of units sold each year.

o Each list should have one entry per year, with the total number of entries matching the number of lines in the TXT file.

Finally, calculate the correlation coefficient between the two sales lists.
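The last two steps can be sketched as follows. This is only an illustration under stated assumptions: the TXT file is taken to be already parsed into one dictionary per line (the structure `yearly_sales` and the helper names `sales_series` and `correlation` are hypothetical), and the coefficient is assumed to be the standard Pearson correlation. If the formula supplied with the project sheet differs, that formula takes precedence.

```python
def sales_series(yearly_sales, product_id):
    """Units sold per year for one product; a missing ID counts as zero."""
    key = product_id.lower()  # IDs are case-insensitive
    return [year.get(key, 0) for year in yearly_sales]


def correlation(xs, ys):
    """Pearson correlation of two equal-length lists (assumed formula)."""
    n = len(xs)
    if n == 0 or n != len(ys):
        print("Correlation needs two non-empty lists of equal length.")
        return 0
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    denom = (var_x * var_y) ** 0.5  # square root without the math module
    if denom == 0:
        print("Correlation undefined: one series has no variation.")
        return 0
    return cov / denom


# Hypothetical parsed data: one dict per line (year), keys already lowercased.
yearly_sales = [
    {"b07vtfn6hm": 10, "b08y5kxr6z": 2},
    {"b07vtfn6hm": 20},                      # second product absent: 0 units
    {"b07vtfn6hm": 30, "b08y5kxr6z": 6},
]
high = sales_series(yearly_sales, "B07VTFN6HM")
low = sales_series(yearly_sales, "b08y5kxr6z")
op4 = round(correlation(high, low), 4)  # round only when storing the output
```

Each list has exactly one entry per line of the file, and the zero-denominator branch returns 0 with a message instead of raising an exception.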

Requirements:

1) You are not allowed to import any external or internal module in Python. While use of many of these modules, e.g., csv or math, is a perfectly sensible thing to do in a production setting, it takes away much of the point of different aspects of the project, which is about getting practice opening text files, processing text file data, and use of basic Python structures, in this case lists and loops.

2) Ensure your program does NOT call the input() function at any time. Calling the input() function will cause your program to hang, waiting for input that the automated testing system will not provide (in fact, what will happen is that if the marking program detects the call(s), it will not test your code at all, which may result in a zero grade).

3) Your program should also not call the print() function at any time except for the case of graceful termination (if needed). If your program encounters an error state and exits gracefully, it should return a correlation/standard deviation/mean/median value of zero and print an appropriate error message. At no point should you print the program’s outputs or provide a printout of the program’s progress in calculating such outputs. Outputs should be returned by the program instead.

4) Do not assume that the input file names will end in .csv or .txt. File name suffixes such as .csv and .txt are not mandatory in systems other than Microsoft Windows. Do not enforce within your program that the file must end with a specific extension, nor should you attempt to add an extension to the provided file name. Doing so can result in loss of marks.
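Requirements 1 and 4 together mean that each file is opened under exactly the name passed in and parsed by hand. A minimal sketch of that idea is given below. The helper name `read_rows`, the demo file name and the column headings are hypothetical, and the sketch assumes that no field contains an embedded comma or quoted text; real data may need more careful splitting.

```python
def read_rows(file_name):
    """Return (header, rows); every field is stripped and lowercased."""
    try:
        # Open the name exactly as given: no extension check, none appended.
        with open(file_name, "r") as handle:
            lines = handle.read().splitlines()
    except OSError:
        print("Could not open file:", file_name)
        return [], []
    rows = [[field.strip().lower() for field in line.split(",")]
            for line in lines if line.strip()]
    if not rows:
        print("File is empty:", file_name)
        return [], []
    return rows[0], rows[1:]


# Hypothetical demo file; note that its name has no .csv suffix.
with open("demo_products", "w") as handle:
    handle.write("Product ID,Category,Discounted Price\n")
    handle.write("B001,Computers&Accessories,499\n")
    handle.write("b002,computers&accessories,149\n")

header, rows = read_rows("demo_products")
cat_col = header.index("category")  # locate columns by heading, not position
wanted = "Computers&Accessories".lower()
matching = [row for row in rows if row[cat_col] == wanted]
```

Lowercasing once while reading means later comparisons of categories and product IDs are case-insensitive without further work.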


Input:

Your program must define the function main with the following syntax:

def main(CSVfile, TXTfile, category):

The input arguments for this function are:

  • CSVfile: The name of the CSV file (as a string) containing the record of Amazon’s product data.
  • TXTfile: The name of the TXT file (as a string) containing the record of Amazon’s product sales.
  • category: A string representing the category to be analysed. Amazon’s product data contains multiple categories.

Output:

The following four outputs are expected:

i) OP1 = [Product ID1, Product ID2]: A list that contains two items: the ID of the product with the highest discounted price and the ID of the product with the lowest discounted price. Your output should be stored in a list in the following order: [highest discounted price product ID, lowest discounted price product ID]

For example: ['b07vtfn6hm', 'b08y5kxr6z']

Note: If multiple products have the same highest discounted price, select the product ID that comes first when the product IDs are sorted in ascending order. Apply the same rule for the lowest discounted price.

ii) OP2 = [mean, median, mean absolute deviation]: A list containing three statistical measures, i.e., mean, median, and mean absolute deviation of the actual price for products within a given category, considering only those products with a rating count higher than 1000. The output should be stored in a list in the following order: [mean, median, mean absolute deviation]

For example: [2018.8, 800.0, 2132.48]

iii) OP3 = [STD1, STD2, ..., STDN]: A list containing the standard deviation of the discounted percentages for products with a rating in the range 3.3 to 4.3 (3.3 ≤ rating ≤ 4.3) for each category. The expected output is a list with values sorted in descending order.

For example: [0.297, 0.2654, 0.2311, 0.198, 0.1701, 0.1596, 0.0071]

iv) OP4 = Correlation: A numeric value representing the correlation between the sales of the product with the highest discounted price and the product with the lowest discounted price found in Task 1 above. The expected output is a single float value.

For example: -0.0232

All returned numeric outputs (both in lists and individual) must contain values rounded to four decimal places (if required to be rounded off). Do not round the values during calculations. Instead, round them only at the time when you save them into the final output variables.
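The tie-break rule for OP1 can be handled in a single pass over the products of the chosen category. The sketch below is one possible approach, not a prescribed one: the function name `extreme_price_ids` and the input shape (a list of `(product_id, discounted_price)` pairs with IDs already lowercased) are assumptions made for illustration.

```python
def extreme_price_ids(products):
    """Return [highest-price ID, lowest-price ID] for one category.

    products: list of (product_id, discounted_price) pairs.
    Ties are resolved in favour of the ID that sorts first.
    """
    if not products:
        print("No products found for the requested category.")
        return []
    high_id, high_price = products[0]
    low_id, low_price = products[0]
    for pid, price in products[1:]:
        if price > high_price or (price == high_price and pid < high_id):
            high_id, high_price = pid, price
        if price < low_price or (price == low_price and pid < low_id):
            low_id, low_price = pid, price
    return [high_id, low_id]


# Hypothetical data: two products tie at the top and two at the bottom.
sample = [("b003", 999.0), ("b001", 999.0), ("b002", 49.0), ("b004", 49.0)]
op1 = extreme_price_ids(sample)  # ['b001', 'b002']
```

Comparing IDs inside the loop avoids a separate sort of the tied products.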

Examples:

Download Amazon_products.csv and Amazon_sales.txt from the folder of Project 1 on LMS or Moodle. An example of how you can call your program from the Python shell (and examine the results it returns) is provided below:

>>> OP1, OP2, OP3, OP4 = main('Amazon_products.csv', 'Amazon_sales.txt', 'Computers&Accessories')
>>> OP1
['b07vtfn6hm', 'b08y5kxr6z']
>>> OP2
[2018.8, 800.0, 2132.48]
>>> OP3
[0.297, 0.2654, 0.2311, 0.198, 0.1701, 0.1596, 0.0071]
>>> OP4
-0.0232


Assumptions:

Your program can assume the following:

  1. Anything that is meant to be a string (e.g., header) will be a string, and anything that is meant to be numeric will be numeric.
  2. All string data in the CSV file and TXT file is case-insensitive, which means “Computers&accessories” is the same as “Computers&Accessories” and “B08Y5KXR6Z” is the same as “b08y5kxr6z”. Your program needs to treat such strings as equal.
  3. In the CSV file, the order of columns in each row will follow the order of the headings provided in the first row. However, rows can be in random order, except the first row, which contains the headings.
  4. No data will be missing in the CSV file; however, values can be zero and must be accounted for when calculating averages and standard deviations.
     [In case any part of the calculation cannot be performed due to zero values or other boundary conditions, do a graceful termination by printing an error message and returning a zero value (for numbers), None (for strings) or an empty list, depending on the expected outcome. Your program must not crash.]
  5. Each line in the TXT file will correspond to a unique year, with no repetition of years. The number of years may vary, so avoid hard coding.
  6. All the product IDs in the CSV file will be unique.
  7. The main() will always be provided with valid input parameters.
  8. The necessary formulas are provided at the end of this document.
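Assumption 4 and the rounding rule can be illustrated with the three OP2 statistics. The sketch below is an illustration only: the function name `summarise` is hypothetical, and mean absolute deviation is assumed here to mean the average absolute distance from the mean. The formulas supplied with the project sheet take precedence if they differ.

```python
def summarise(prices):
    """Return [mean, median, mean absolute deviation] of a list of prices."""
    n = len(prices)
    if n == 0:
        # Graceful termination: report the problem and return zeros.
        print("No products satisfy the filter; returning zeros.")
        return [0, 0, 0]
    mean = sum(prices) / n
    ordered = sorted(prices)
    mid = n // 2
    if n % 2 == 1:
        median = ordered[mid]
    else:
        median = (ordered[mid - 1] + ordered[mid]) / 2
    # Assumed definition: average absolute distance from the mean.
    mad = sum(abs(p - mean) for p in prices) / n
    # Round only here, when the final output list is built.
    return [round(mean, 4), round(median, 4), round(mad, 4)]


op2 = summarise([100.0, 200.0, 300.0, 1000.0])  # [400.0, 250.0, 300.0]
empty = summarise([])                           # prints a message, [0, 0, 0]
```

Zero prices are ordinary values in this sketch and still count towards `n`; only an empty selection triggers the graceful-termination branch.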

Important grading instruction:

Note that you have not been asked to write specific functions. The task has been left to you. However, it is essential that your program defines the top-level function main(CSVfile, TXTfile, category) (commonly referred to as ‘main()’ in the project documents to save space when writing it; note that when main() is written it still implies that it is defined with its three input arguments). The idea is that within main(), the program calls the other functions. (Of course, these functions may then call further functions.) This is important because when your code is tested on Moodle, the testing program will call your main() function. So, if you fail to define main(), the testing program will not be able to test your code and your submission will be graded zero. Don’t forget the submission guidelines provided at the start of this document.

Marking rubric:

Your program will be marked out of 30 (later scaled to be out of 15% of the final mark).

24 out of 30 marks will be awarded automatically based on how well your program completes a number of tests, reflecting normal use of the program, and how the program handles various states including, but not limited to, different numbers of rows in the input file and/or any error states. You need to think creatively about what your program may face. Your submission will be graded using data files other than the provided data file. Therefore, you need to be creative in investigating corner or worst cases. I have provided a few guidelines from the ACS Accreditation manual at the end of the project sheet which will help you to understand the expectations.

6 out of 30 marks will be awarded on style (3/6), “the code is clear to read”, and efficiency (3/6), “your program is well constructed and runs efficiently”. For style, think about use of comments, sensible variable names, your name at the top of the program, student ID, etc. (Please watch the lectures where this is discussed.)

Style Rubric:

0  Gibberish, impossible to understand
1  Style is really poor or fair
2  Style is good or very good, with small lapses
3  Excellent style, really easy to read and follow

Your program will be traversing text files of various sizes (possibly including large csv files), so you need to minimise the number of times your program looks at the same data items.
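One way to keep the number of passes low is to group values while the rows are read once, and only then compute the statistic per group. The sketch below applies that idea to the per-category standard deviation. Everything named here is hypothetical: the function `std_by_category`, the row shape `(category, rating, discount_fraction)`, and above all the choice of the population form of the standard deviation (dividing by N). If the formula supplied with the project sheet divides by N - 1, the divisor should change accordingly.

```python
def std_by_category(rows):
    """Standard deviation of discounts per category, sorted descending.

    rows: iterable of (category, rating, discount_fraction) tuples.
    Uses the population form (divide by N), which is an assumption.
    """
    groups = {}
    for category, rating, discount in rows:  # single pass over the data
        if 3.3 <= rating <= 4.3:
            groups.setdefault(category.lower(), []).append(discount)

    results = []
    for values in groups.values():
        n = len(values)
        mean = sum(values) / n
        variance = sum((v - mean) ** 2 for v in values) / n
        results.append(variance ** 0.5)  # square root without math module

    results.sort(reverse=True)
    return [round(value, 4) for value in results]  # round at the very end


# Hypothetical rows; the last one falls outside the rating range.
sample_rows = [
    ("Electronics", 4.0, 0.2),
    ("electronics", 4.2, 0.4),
    ("Toys", 3.5, 0.1),
    ("Toys", 3.9, 0.5),
    ("Toys", 4.8, 0.9),
]
op3 = std_by_category(sample_rows)  # [0.2, 0.1]
```

The dictionary is filled during one traversal of the rows, so no row is examined again when the per-category values are summarised.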

Efficiency rubric:

0  Code too complicated to judge efficiency, or wrong problem tackled
1  Very poor efficiency, additional loops, inappropriate use of readline()
2  Acceptable or good efficiency with some lapses
3  Excellent efficiency, should have no problem on large files, etc.

Automated testing is being used so that all submitted programs are being tested the same way. Sometimes it happens that there is one mistake in the program that means that no tests are passed. If the marker can spot the cause and fix it readily, then they are allowed to do that and your - now fixed - program will score whatever it scores from the tests, minus 4 marks, because other students will not have had the benefit of marker intervention. Still, that's way better than getting zero. On the other hand, if the bug is hard to fix, the marker needs to move on to other submissions.

Extract from Australian Computer Society Accreditation manual 2019:

As per Seoul Accord section D, a complex computing problem will normally have some or all of the following criteria:

- involves wide-ranging or conflicting technical, computing, and other issues.
- has no obvious solution and requires conceptual thinking and innovative analysis to formulate suitable abstract models.
- a solution requires the use of in-depth computing or domain knowledge and an analytical approach that is based on well-founded principles.
- involves infrequently encountered issues.
- are outside problems encompassed by standards and standard practice for professional computing.
- involves diverse groups of stakeholders with widely varying needs.

From: https://www.cnblogs.com/qq---99515681/p/18409930
