[909] Remove duplicated rows based on multiple columns in Pandas

Date: 2023-10-17 13:22:51

In a Pandas DataFrame, you can remove duplicated rows based on multiple columns using the drop_duplicates() method. Here's how you can do it:

import pandas as pd

# Sample DataFrame
data = {
    'A': [1, 2, 3, 2, 1],
    'B': ['apple', 'banana', 'cherry', 'banana', 'apple'],
    'C': [10, 20, 30, 20, 10]
}

df = pd.DataFrame(data)

# Remove duplicates based on columns A and B
df = df.drop_duplicates(subset=['A', 'B'])

# Display the resulting DataFrame
print(df)

In this example, we have a DataFrame with three columns, and we want to remove duplicates based on columns 'A' and 'B'. The subset parameter takes a list of column names (['A', 'B']) that specifies which columns are compared when checking for duplicates; column 'C' is ignored. With the sample data, rows 3 and 4 repeat the (A, B) pairs of rows 1 and 0, so they are dropped and rows 0 to 2 remain.
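To see what subset changes, here is a small sketch that compares de-duplicating on 'A' and 'B' with de-duplicating on all columns. It reuses the sample data, except that the last value of 'C' is changed to 99 (an assumption made only for illustration) so that the two calls give different results.

```python
import pandas as pd

# Same sample data, except the last value of C is changed to 99
# (illustrative assumption) so rows 0 and 4 differ only in column C.
df = pd.DataFrame({
    'A': [1, 2, 3, 2, 1],
    'B': ['apple', 'banana', 'cherry', 'banana', 'apple'],
    'C': [10, 20, 30, 20, 99],
})

# Only A and B are compared: row 4 counts as a duplicate of row 0.
by_ab = df.drop_duplicates(subset=['A', 'B'])

# No subset: all columns are compared, so row 4 is kept (its C differs).
by_all = df.drop_duplicates()

print(by_ab.index.tolist())   # [0, 1, 2]
print(by_all.index.tolist())  # [0, 1, 2, 4]
```

The original index labels are preserved after dropping rows; calling reset_index(drop=True) on the result renumbers them from 0 if that is preferred.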

You can also use the keep parameter to control which of the duplicated rows is kept. By default, it's set to 'first', which keeps the first occurrence and removes subsequent duplicates. You can set it to 'last' to keep the last occurrence and remove earlier duplicates, or to False (the boolean, not the string 'False') to remove every row that has a duplicate. For example:

# Remove duplicates based on columns A and B, keeping the last occurrence
df = df.drop_duplicates(subset=['A', 'B'], keep='last')

This code will keep the last occurrence of a duplicated row based on columns 'A' and 'B'. Adjust the subset and keep parameters according to your specific requirements.
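As a rough side-by-side comparison, the sketch below applies each of the three keep options to a fresh copy of the sample data and prints which original row labels survive. The variable names (first, last, unique_only) are chosen here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 2, 1],
    'B': ['apple', 'banana', 'cherry', 'banana', 'apple'],
    'C': [10, 20, 30, 20, 10],
})

# keep='first' (the default): keep the earliest row of each (A, B) pair.
first = df.drop_duplicates(subset=['A', 'B'], keep='first')

# keep='last': keep the latest row of each (A, B) pair.
last = df.drop_duplicates(subset=['A', 'B'], keep='last')

# keep=False: drop every row whose (A, B) pair occurs more than once.
unique_only = df.drop_duplicates(subset=['A', 'B'], keep=False)

print(first.index.tolist())        # [0, 1, 2]
print(last.index.tolist())         # [2, 3, 4]
print(unique_only.index.tolist())  # [2]
```

Each call starts from the original df rather than from an already de-duplicated result; once duplicates have been removed, a second drop_duplicates call on the same columns has nothing left to drop.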

From: https://www.cnblogs.com/alex-bn-lee/p/17769457.html

