首页 > 其他分享 >comp10002 Foundations of Algorithms

comp10002 Foundations of Algorithms

时间:2024-09-09 17:25:27浏览次数:8  
标签:2024 your will program Algorithms Foundations data comp10002 Stage

School of Computing and Information Systems

comp10002 Foundations of Algorithms

Semester 2, 2024

Assignment 1

Learning Outcomes

In this assignment you will demonstrate your understanding of arrays,  strings, functions, and the typedef facility.  You may also (but are not required to) use structs (see Chapter 8) if you wish. You must not make any use of malloc() (Chapter 10) or file operations (Chapter 11) in this project.

TSV Files

The tab separated format is a common and convenient way of storing structured data in a text file. The first line of the file stores a set of c “column header” entries all represented as text strings that may contain alphanumeric, blank characters and punctuation characters, but no tab (’\t’) or newline (’\n’) characters, stored with c − 1 tab characters between the c entries to denote the breaks between the strings, and with a final newline character at the end of the line. A set of r following rows each also store c data items, represented in the same format.  For example, consider the data file test0 .tsv, with tabs and newline characters shown explicitly:

Year\tEvent\tGender\tCountry\tMedal\n

2024\tSwimming\tWomens\tNew  Zealand\tfirst:  gold\n 2024\tSwimming\tWomens\tChina\tsecond:  silver\n

2024\tSwimming\tWomens\tIndonesia\tthird:  bronze\n 2024\tCycling\tWomens\tChina\tfirst:  gold\n

2024\tCycling\tWomens\tNew  Zealand\tsecond:  silver\n 2024\tCycling\tWomens\tNew  Zealand\tthird:  bronze\n

This test file has c  =  5 columns, headed “Year”, “Event”, “Gender”, “Country”, and “Medal” respectively;  and contains r  =  6 data  rows  that  show (for example) that in “2024” the country  “New  Zealand” is associated with the medals “second:  silver” and “third:  bronze” in the  sport “Cycling” in connection with the gender “Womens” .  Note that r and c are not stored, and are  implicit via the appearance of the newline and tab characters.

Before doing anything else, you should copy the skeleton program ass1-skel .c from the LMS Assignment 1 page, and read through the code.  Check that you can compile it via either Grok or a terminal shell and gcc. Once you have it compiled, you should see this when you run it:

mac:  ./ass1-skel ta  daa!

You are now ready to start adding functionality.

Stage 1 – Reading and Printing (12/20 marks)

Extend the skeleton program so that it that reads a TSV input stream from stdin and builds a corre- sponding internal data structure using a 2d array of strings. You may assume that at most 1,000 input lines will be presented, that each input line 代 写comp10002 Foundations of Algorithms contains at most 30 columns of information, and that each entry contains at most 50 characters.

The required output for this first stage is a summary of what was read, here is an example:

mac:  ./ass1-soln  < test0 .tsv Stage 1

input  tsv  data  has  6  rows  and  5  columns

row  6  is:

1:  Year 2024

2:  Event Cycling

3:  Gender Womens

4:  Country New  Zealand

5:  Medal third:  bronze

To be eligible for full marks your output must exactly match.  Other input files and example output files are linked from the LMS page. Note that the columns are labeled from 1, and that the data rows are also labeled starting with 1 – this information is for non-computer scientists!

You may assume throughout that all input files you will be provided with will be “correct”,with all rows having the same number of entries; the number of rows and columns being within the specified bounds; the length of each entry being within the specified bound; and with no additional (or missing) newline or tab characters. The last character of all valid input files will be a ’\n’ character, and if you create your own test files, make sure you do the same.

Note that the getfield() function handles the PC-versus-Unix newline differences.  These dif- ference have the potential to be very frustrating if you don’t allow for them (see the section about this in the “Programming Tips” page linked from the LMS).

Temptations to avoid: use the getfield() function provided in the skeleton (rather than try and do it yourself with scanf() or etc); do not try and use malloc() or the other similar functions from Chapter 10; do not use any of the file operations described in Chapter 11 (except for fflush(stdout) when debugging, see the FAQ page); and do not print any prompts of any sort.

Stage 2 – Sorting (16/20 marks)

Now modify your program so that it accesses a sequence of integers from the command-line, each of them a value between 1 and c, indicating which column(s) should be used as sort keys. For example, for test0.tsv, if columns 4 and then 2 are specified, the r data rows in your internal format are to be reordered so that they are sorted first by Country (as a string, using strcmp()), and then where the Country fields are equal, using Event as a secondary key. Where two rows have the same values in the selected columns, they should be retained in the same order as they were originally; that is, your sort should be stable. The header row should not be sorted. For test0.tsv, sorting by the program arguments “4 2” gives:

Year

Event

Gender

Country

Medal

2024

Cycling

Womens

China

first: gold

2024

Swimming

Womens

China

second: silver

2024

Swimming

Womens

Indonesia

third: bronze

2024

Cycling

Womens

New Zealand

second: silver

2024

Cycling

Womens

New Zealand

third: bronze

2024

Swimming

Womens

New Zealand

first: gold

Think of that (hint, hint) as being “debugging” output. Then the actual Stage 2 output requires that the first data row, the middle data row (that is, row「r/2⌉), and the last data row:

mac:  ./ass1-soln  4  2  < test0 .tsv

<<>> Stage  2

sorting  by  "Country",

then  by  "Event"

row  1  is:

<<<5 1="" utput="" lines="" for="" row="">>>

row  3  is:

<<<5 3="" utput="" lines="" for="" row="">>>

row  6  is:

<<<5 6="" utput="" lines="" for="" row="">>>

Full examples are provided at the LMS page. If there are no column numbers supplied on the command line, your program should exit after completing Stage 1, without any Stage 2 output. You may assume that each specified integer will be in the range 1 to c inclusive, and that no values will be repeated.

Use insertionsort rather than quicksort (this project is about programming inC, not about algorithm implementation, and there won’t be a penalty for using insertionsort). And keep it simple and elegant – use functions sensibly, including a suitable comparison function and a suitable swapping function.

Stage 3 – Tabulation (20/20 marks)

Now use the sorted TSV values to generate a report that shows counts of rows matching the same selected column combination:

mac:  ./ass1-soln  4  2  < test0 .tsv

<<>>

Stage  3

Country

Event Count

------------------

China

Cycling 1

Swimming 1

Indonesia

Swimming 1

New Zealand

Cycling 2

Swimming 1

The last column starts one space to the right of the longest entry (including the column header) in the last column getting printed, with the counts then printed as five-digit numbers under the word “Count” . There are no tabsinthe Stage 3 output, and the formatting is based solely on spaces. Marks will be deducted for layout discrepancies. The format descriptor * will be your friend to help achieve this goal, use printf("%-*s",  num,  str) to print the string str left-justified in num columns, where num is a variable/value calculated while your program is executing.

General Tips...

You will probably find it helpful to include a DEBUG mode in your program that prints out interme- diate data and variable values. Use #if   (DEBUG) and #endif around such blocks of code, and then #define  DEBUG  1 or #define  DEBUG  0 at the top.  Turn off the debug mode when making your final submission, but leave the debug code in place. The FAQ page has more information about this.

The sequence of stages described in this handout is deliberate – it represents a sensible path through to the final program.  You can, of course, ignore the advice and try and write final program in a single effort, without developing it incrementally and testing it in phases.  You might even get away with it, this time and at this somewhat limited scale, and develop a program that works. But in general, one of the key things that makes some people better at programming than others is the ability to see a design path through simple programs, to more comprehensive programs, to final programs, that keeps the complexity under control at all times. That is one of the skills this subject is intended to teach you. And if you submit each of the stages as you complete it, you’ll know that you are accumu- lating evidence should you need to demonstrate your progress in the event of a special consideration application becoming necessary.

Boring But Important...

This project is worth 20% of your final mark, and is due at 6:00pm on Friday 13 September.

Submissions that are made after the deadline will incur penalty marks at the rate of two marks per (part or all) working day. For example, submissions made after 6:00pm Friday and before 6:00pm Monday will lose two marks.  Multiple submissions may be made; only the last submission that you make before the deadline will be marked. If you make any late submission at all, your on-time submissions will be ignored, and if you have not been granted an extension, the late penalty will be applied.

A rubric explaining the marking expectations is linked from the LMS, and you should study it carefully. Marks and feedback will be provided approximately two weeks after submissions close.

You need to submit your program for assessment via the link to GradeScope at the bottom of the LMS Assignment page. Submission is not possible through Grok.

 

 

标签:2024,your,will,program,Algorithms,Foundations,data,comp10002,Stage
From: https://www.cnblogs.com/qq--99515681/p/18404959

相关文章

  • Study Plan For Algorithms - Part26
    1.礼物的最大价值在一个m*n的棋盘的每一格都放有一个礼物,每个礼物都有一定的价值(价值大于0)。你可以从棋盘的左上角开始拿格子里的礼物,并每次向右或者向下移动一格、直到到达棋盘的右下角。给定一个棋盘及其上面的礼物的价值,请计算你最多能拿到多少价值的礼物?方法一:de......
  • Study Plan For Algorithms - Part27
    1.最长不含重复字符的子字符串请从字符串中找出一个最长的不包含重复字符的子字符串,计算该最长子字符串的长度。方法一:deflengthOfLongestSubstring(s):n=len(s)max_len=0start=0char_index={}forendinrange(n):ifs[en......
  • Study Plan For Algorithms - Part24
    1.全排列II题目链接:https://leetcode.cn/problems/permutations-ii/给定一个可包含重复数字的序列nums,按任意顺序返回所有不重复的全排列。classSolution:defpermuteUnique(self,nums:List[int])->List[List[int]]:defbacktrack(nums,path,res,us......
  • Study Plan For Algorithms - Part25
    1.栈的压入、弹出序列输入两个整数序列,第一个序列表示栈的压入顺序,请判断第二个序列是否为该栈的弹出顺序。假设压入栈的所有数字均不相等。方法一:defvalidateStackSequences(pushed,popped):stack=[]len_pushed=len(pushed)stack_index=0i......
  • Study Plan For Algorithms - Part23
    1.跳跃游戏II题目链接:https://leetcode.cn/problems/jump-game-ii/给定一个长度为n的0索引整数数组nums。初始位置为nums[0]。每个元素nums[i]表示从索引i向前跳转的最大长度:0<=j<=nums[i]i+j<n返回到达nums[n-1]的最小跳跃次数。classSolutio......
  • Study Plan For Algorithms - Part22
    1.字符串相乘题目链接:https://leetcode.cn/problems/multiply-strings/给定两个以字符串形式表示的非负整数num1和num2,返回num1和num2的乘积,它们的乘积也表示为字符串形式。classSolution:defmultiply(self,num1:str,num2:str)->str:ifnum1==......
  • Study Plan For Algorithms - Part21
    1.缺失的第一个正数题目链接:https://leetcode.cn/problems/first-missing-positive/给定一个未排序的整数数组nums,请找出其中没有出现的最小的正整数。classSolution:deffirstMissingPositive(self,nums:List[int])->int:n=len(nums)forii......
  • Study Plan For Algorithms - Part20
    1.组合总和题目链接:https://leetcode.cn/problems/combination-sum/给定一个无重复元素的整数数组candidates和一个目标整数target,找出candidates中可以使数字和为目标数target的所有不同组合,并以列表形式返回。classSolution:defcombinationSum(self,ca......
  • Study Plan For Algorithms - Part20
    1.树的子结构输入两棵二叉树A和B,判断B是不是A的子结构。方法一:classTreeNode:def__init__(self,val=0,left=None,right=None):self.val=valself.left=leftself.right=rightdefisSubStructure(A,B):ifnotAornot......
  • Study Plan For Algorithms - Part18
    1.搜索插入位置题目链接:https://leetcode.cn/problems/search-insert-position/给定一个排序数组和一个目标值,在数组中找到目标值,并返回其索引。如果目标值不存在于数组中,返回它将会被按顺序插入的位置。classSolution:defsearchInsert(self,nums:List[int],target:......