Python – Remove Duplicate Data
Question Description
You are working as an analytics developer for an animal discovery channel. Your helpdesk person has received various reports via fax and compiled everything into a CSV file. Because the data was entered manually, the file has various data quality issues. You need to count the number of shark attacks by country and report it to the TV show anchor.
Use the raw data file – SharkAttack.csv (attached)
You need to write a Python script that performs the following tasks.
Directions:
- Read the file using Python libraries
- Identify duplicate records
- Remove all the duplicate records
- Write the output to a new file
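The directions above could be sketched roughly as follows, using only the standard library's csv module. This is one possible approach under stated assumptions, not a reference solution: the function names are hypothetical, the file is assumed to be UTF-8 with a header row, and two records are treated as duplicates when every cell matches after trimming whitespace and ignoring case. The column name passed to count_by_column (for example "Country") is also an assumption, since the layout of SharkAttack.csv is not described here.

```python
import csv
from collections import Counter


def read_rows(path):
    """Read a CSV file and return (header, data_rows).

    Assumes UTF-8 encoding and a header in the first line.
    """
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])
        return header, list(reader)


def normalize(row):
    """Comparison key for a row: trimmed, case-insensitive cell values.

    This definition of "duplicate" is an assumption chosen to tolerate
    manual data-entry differences such as stray spaces or capitalization.
    """
    return tuple(cell.strip().lower() for cell in row)


def split_duplicates(rows):
    """Return (unique_rows, duplicate_rows), keeping each first occurrence."""
    seen = set()
    unique, duplicates = [], []
    for row in rows:
        key = normalize(row)
        if key in seen:
            duplicates.append(row)   # identified as a duplicate record
        else:
            seen.add(key)
            unique.append(row)
    return unique, duplicates


def write_rows(path, header, rows):
    """Write the header and rows to a new CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


def count_by_column(header, rows, column):
    """Count rows per value of the named column (hypothetical column name)."""
    idx = header.index(column)
    return Counter(row[idx].strip().upper() for row in rows if len(row) > idx)
```

To use it, call read_rows("SharkAttack.csv"), pass the rows to split_duplicates, print the duplicates that were found, and hand the unique rows to write_rows with a new file name of your choice. If the file does have a country column, count_by_column on the unique rows gives the per-country totals for the anchor.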
Submission
- Attach a Python script
- Output screenshot