Medical University of South Carolina Wine Recognition Dataset Discussion
The dataset used for this assignment is the Wine Recognition Dataset. Save the data file in a directory on your hard drive that SAS can access.
Information on the dataset can be found at this website.
The data are the results of a chemical analysis of wines grown in the same region but derived from three different cultivars (plant varieties).
- The analysis determined the quantities of 13 constituents found in each of the three types of wines.
- There are 178 instances in total with 59, 71, and 48 instances in class 1, class 2, and class 3, respectively.
- The first attribute in the data is the class identifier (1-3); see the dataset information for the other attributes.
- Randomly select two variables and plot the data, using a different color or marker for the points belonging to each of the three classes.
- What can you say about the class separability in this space?
- Repeat the plot, this time using the two variables that you found to be the most relevant for classification.
- Explain the approach you applied to select these two variables and include the analysis you performed in your answer.
- With your write-up, submit screenshots of the steps you took and the scatterplots.
- Your write-up must include the exact steps you took to complete this assignment.
- If you encountered any challenges, state the issue(s) as well.
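One possible approach to the variable-selection step is to rank each variable by how well it separates the three classes and then plot the top two. The following is a minimal sketch in Python (not SAS) that scores each variable with a one-way ANOVA F statistic: the ratio of between-class to within-class variance. This is only one of several reasonable criteria, not a prescribed method. The function names `load_rows`, `f_statistic`, and `rank_variables` are hypothetical, the file layout assumed by `load_rows` (comma-separated, class identifier first) is an assumption, and `toy_rows` contains made-up numbers for illustration, not wine measurements.

```python
import csv
from collections import defaultdict


def load_rows(path):
    """Read rows as [class_id, x1, ..., xp].

    Assumption: the file is comma-separated with the class identifier
    in the first column and no header row.
    """
    rows = []
    with open(path, newline="") as handle:
        for record in csv.reader(handle):
            if record:  # skip blank lines
                rows.append([int(record[0])] + [float(x) for x in record[1:]])
    return rows


def f_statistic(values, labels):
    """One-way ANOVA F statistic of a single variable across classes."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    n, k = len(values), len(groups)
    if k < 2:
        raise ValueError("need at least two classes")
    grand_mean = sum(values) / n
    means = {label: sum(g) / len(g) for label, g in groups.items()}
    # Between-class sum of squares: spread of class means around the grand mean.
    between = sum(len(g) * (means[label] - grand_mean) ** 2
                  for label, g in groups.items())
    # Within-class sum of squares: spread of points around their own class mean.
    within = sum((v - means[label]) ** 2
                 for label, g in groups.items() for v in g)
    if within == 0:
        return float("inf")  # classes are perfectly tight on this variable
    return (between / (k - 1)) / (within / (n - k))


def rank_variables(rows, names):
    """Return (name, F) pairs sorted from most to least separating."""
    labels = [row[0] for row in rows]
    scores = {name: f_statistic([row[j + 1] for row in rows], labels)
              for j, name in enumerate(names)}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up toy rows (NOT wine data): var_a differs by class, var_b does not.
toy_rows = [
    [1, 1.0, 5.0], [1, 1.2, 3.0],
    [2, 3.0, 4.0], [2, 3.1, 6.0],
    [3, 5.0, 5.5], [3, 5.2, 2.5],
]
ranking = rank_variables(toy_rows, ["var_a", "var_b"])
print(ranking)
```

With the real data, the two highest-ranked names would be the candidates for the second scatterplot. Note that a per-variable score ignores correlation between variables, so the top two by F statistic are not guaranteed to be the best pair jointly.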
Your write-up should meet the following requirements:
- Three to five pages (750-1,250 words) in length
- Include the screenshot diagram(s) of the BI solution framework
- Formatted according to APA guidelines as explained in the CSU-Global Guide to Writing & APA (subheadings, 1” margins, and double spacing)
- Supported by three credible, academic outside sources in addition to course materials