COMM5000 Data Literacy

Posted: 2024-09-29 17:11:51

COMM5000 Data Literacy

Case Study Project Milestone 1 Information

Term 3, 2024

Case Study Information

Business context: In recent years, the growing interest in wine has fuelled the expansion of the wine industry. As a result, companies are investing in new technologies to enhance both wine production and sales. Quality certification plays a vital role in these processes and currently relies heavily on wine tasting by human experts.

Case/Scenario: You are consulting for a winery, helping the company to predict or estimate human wine taste preferences at the certification step. Knowing the wine quality will leave the winery better positioned to predict available amounts and yearly sales. It will also support the oenologists' wine tasting evaluations by potentially improving the quality and speed of their decisions, and improve wine production. Furthermore, similar techniques can help in target marketing by modelling consumer tastes from niche markets. To predict wine quality, you will use a dataset consisting of 4898 white and 1599 red vinho verde samples from Portugal's northwest region, together with the statistical methods covered in this course.

Project objectives

1. Looking at the provided dataset, is there any relationship (positive or negative) that can be used between wine quality and any of the variables in the dataset? Does wine type (red or white) have an impact on our predictions of wine quality? If so, how can these relationships be used to predict wine quality, using the methods used in this class? The data provided is everything we know about these wines, and no external data sources are to be used. Moreover, no knowledge about the chemical properties of wines is assumed or required: this project should be seen as a business / statistical exercise. Please provide both quantitative and qualitative analysis supporting any findings.

2. In addition, based on your analysis for Objective (1), identify weaknesses and limitations of the chosen approach and propose, at least in broad terms, a better approach. This proposed approach could include additional data, or a methodology that deals better with the limitations of the data. Please also provide supporting analysis for these additional considerations.

COMM5000 Context

This is a business question that is based on real data, although simulated for assessment purposes. The role you are to play is one of a consultant contracted by a winery to assist with the analysis of data using the COMM5000 data analysis tools, which include descriptive and inferential statistics.

The work will be scaffolded into two milestones M1 (20%) & M2 (20%) and a final project report (60%). Every milestone will require you to use what you have learned to address specific aspects of the data. Generally, M1 consists of an exploratory data analysis, whereas M2 is concerned with identifying hypotheses and formulating key inferential questions. In the final project report, all the insights gathered from M1 and M2 are used to model the data to answer the project questions.

M1 is a peer-reviewed assessment, which means that your assessment will be assessed by randomly assigned peer students. More details on this process are given below.

Schedule of engagement for the entire course

Upon request, and as additional support for the assessments, a member of the course teaching team may hold consultation sessions throughout the term. It is very important that you attend these live, synchronous sessions, which provide more detailed information about the case study. During these sessions, you are free to ask questions and discuss any aspect of the project.

Milestone 1: Preliminary Insight Development

Description of assessment task

This first milestone aims to give you a better understanding of the datasets, variables, and questions in this Case Study. This exploratory data analysis seeks to get the necessary insights so that a development plan can be formulated to address the following key points of the case study project:

1) Data analysis: an in-depth description of the variables included in the dataset and the relationship between wine quality and alcohol.

2) Effect of Wine Type on estimated wine quality.

You must submit a written development plan summarising the findings from your data exploration, describing any patterns that emerge from comparing summary statistics of the variables of interest, and setting out how you may address key points (1) and (2). The report should be concise and well written.

Please note that you are not required to fully answer (1) and (2) in this milestone. Instead, you are required to develop insights and understand the problem, as well as the datasets for your final project.

As a style guide, you may include some or all tables/graphs as an appendix and refer to them as appropriate in your report. You should only include graphics and tables that support your analysis, conclusions, and findings. While preparing your paper, you will encounter numerous tables and graphs, many of which are irrelevant to the analysis. So be very selective and make good use of the page limit!

Approach to the assessment task

In week 1, we learnt how to represent the data using graphical tools, as well as numerical summaries. All these tools are meant to give us an idea of what the data are ‘trying to tell us’. Can we make sense of the large numbers of observations and tell a simple story or pick up a trend? This is what you will do in this milestone: understand the data and what we are trying to find out from the data.
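One way to check whether a scatter plot is hinting at a trend is to put a single number on it. The sketch below is an illustration only: the course itself uses Excel, and the correlation coefficient may or may not be the tool expected at this milestone. It computes a Pearson correlation between alcohol and quality, for all wines and for each wine type. The tuple layout and the values in `toy_rows` are hypothetical and are not taken from the provided dataset.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical toy rows (wine type, alcohol, quality); NOT real data.
toy_rows = [
    ("red", 9.4, 5), ("red", 10.2, 5), ("red", 11.0, 6), ("red", 12.5, 7),
    ("white", 8.8, 5), ("white", 9.5, 6), ("white", 10.8, 6), ("white", 12.4, 8),
]


def r_by_group(rows, wine_type=None):
    """Correlation of alcohol with quality, optionally for one wine type only."""
    selected = [r for r in rows if wine_type is None or r[0] == wine_type]
    alcohol = [r[1] for r in selected]
    quality = [r[2] for r in selected]
    return pearson_r(alcohol, quality)


for label in (None, "red", "white"):
    print(label or "all", round(r_by_group(toy_rows, label), 3))
```

A value near +1 or -1 suggests a strong linear association and a value near 0 a weak one. It says nothing about causation, and it should be read alongside the scatter plot rather than in place of it.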

(A) Expected Tasks

(i) Download the data. This assessment requires the download of the Excel file provided on your course Moodle page (file name: “Vinho_Verde.xlsm”). The dataset is related to red and white variants of the Portuguese "Vinho Verde" wine. For more details, consult: https://www.vinhoverde.pt/en/homepage or the reference [Cortez et al., 2009] which can be accessed from https://repositorium.sdum.uminho.pt/bitstream/1822/10029/1/wine5.pdf.

(ii) Data preparation. For this class, the data are already cleaned and complete. No cleaning is needed. However, you must explain any data manipulations you perform and provide a rationale for them.

(iii) Variables of interest. Consider (1) and (2) above and focus only on the variables included in the provided dataset.

(B) The expected outcomes

The written work must provide a brief description of the Case Study problem and a clear plan of how the dataset provided will address the key questions raised in the project description. You will have the opportunity to adjust, revise and review this plan as we progress throughout the term. M1 analysis is based on COMM5000 content covered in weeks 1, 2 and 3.

(i) Numerical summaries of the key variables of interest: present descriptive statistics of the variables in the data. You may represent these results in the form of tables.

These numerical summaries must be presented for 1) the entire sample, 2) red wines only and 3) white wines only.

For example, for each variable:

                   Mean    Mode    Median    SD    Min    Max
Variable name 1

(ii) Graphical representations of some variables, if you deem them important for capturing a trend or some interesting patterns in the data.

(iii) Analysis of the relationship between wine quality and alcohol content. Use scatter plots and describe your findings.

(iv) What conclusions can you make from the inspection of these data summaries in the form of tables and graphs? For example, is there a pattern that you can identify?

(v) Your analysis should inform your development plan to address points (1) and (2) in Milestone 2 and in the final report. This plan may be revised later during your work on Milestone 2.
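The numerical summaries described in outcome (i) can be sketched in a few lines. The example below is a hedged illustration rather than the required method: the course works in Excel, and Python's standard `statistics` module is used here only to make the calculation explicit. The column names `type`, `alcohol` and `quality`, and every value in `toy_rows`, are hypothetical and are not taken from the provided dataset.

```python
from statistics import mean, median, multimode, stdev


def summarise(values):
    """Return the six statistics from the example table for one variable."""
    return {
        "Mean": mean(values),
        "Mode": multimode(values),  # a list, since a variable can have several modes
        "Median": median(values),
        "SD": stdev(values),        # sample standard deviation (n - 1 denominator)
        "Min": min(values),
        "Max": max(values),
    }


# Hypothetical toy rows; NOT real data. Column names are assumptions.
toy_rows = [
    {"type": "red", "alcohol": 9.4, "quality": 5},
    {"type": "red", "alcohol": 10.2, "quality": 5},
    {"type": "red", "alcohol": 11.0, "quality": 6},
    {"type": "red", "alcohol": 12.5, "quality": 7},
    {"type": "white", "alcohol": 8.8, "quality": 5},
    {"type": "white", "alcohol": 9.5, "quality": 6},
    {"type": "white", "alcohol": 10.8, "quality": 6},
    {"type": "white", "alcohol": 12.4, "quality": 8},
]


def summary_table(rows, variable):
    """Summarise one variable for the entire sample, red only, and white only."""
    groups = {
        "All wines": rows,
        "Red only": [r for r in rows if r["type"] == "red"],
        "White only": [r for r in rows if r["type"] == "white"],
    }
    return {
        name: summarise([r[variable] for r in subset])
        for name, subset in groups.items()
    }


table = summary_table(toy_rows, "quality")
for name, stats in table.items():
    print(name, stats)
```

Placing the three groups side by side makes it easier to see whether red and white wines differ on a variable before any formal comparison is attempted.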

Structure of the report

* Introduction: You should briefly introduce the topic and summarise the purpose and importance of this project for the client. Then outline how this preliminary insight development plan will be structured. It is important to provide some background information on this topic. You can find relevant information from https://www.vinhoverde.pt/en/homepage or the reference [Cortez et al., 2009] which can be accessed from https://repositorium.sdum.uminho.pt/bitstream/1822/10029/1/wine5.pdf

* Data Summaries and Descriptive Statistics: Provide the necessary analysis to explore the variables. Describe the trends and stories that emerge from the data summaries. Are any patterns emerging from the graphs or tables you have constructed so far? Note: now that you have completed the first stage of data summaries, you have some basic insight into the dataset. You can use this information to develop some plans of action to address points (1) and (2) in Milestone 2 and in the final report.

* Conclusion: The conclusion should summarise the findings of your investigation and any concluding comments. It should also provide your plan for the next step of the analysis.

* References: Every external document you use needs to be properly referenced. Please include the page number and link so that your lecturer and tutor can efficiently check your work.

* It is suggested that you limit your report to a maximum of 8 pages including tables, graphs, and references.

Schedule of engagement for M1

Below you can find a summary of the deadlines related to M1. M1 requires not only the submission of the report (15%), but also a high-quality, constructive contribution to the assessment of other students’ work (5%). For this peer assessment, you will be randomly assigned a number of other students’ submissions to review critically. You cannot choose which students to assess. You will 1) mark these submissions using a marking rubric provided to facilitate the task and 2) leave constructive feedback. It is important that this feedback is of high quality. Your feedback will be assessed, and you will also have the opportunity to evaluate the feedback you have received.

1. Week 4, Friday, deadline for submission of M1

2. Week 6, deadline for marking the submissions you were assigned to

3. Week 7, deadline for evaluating the feedback you have received

Submission instructions

• Via Moodle course site.

Supporting resources and links

- Dataset files: The Excel dataset file is available on Moodle. You only need to analyse the data that is included in this file.

- Weekly seminar: The seminar coordinator will cover relevant project aspects using Excel during the SEM session.

- Background information: See https://www.vinhoverde.pt/en/homepage or the reference [Cortez et al., 2009] which can be accessed from https://repositorium.sdum.uminho.pt/bitstream/1822/10029/1/wine5.pdf

 

From: https://www.cnblogs.com/comp9021/p/18440431
