[909] Remove duplicated rows based on multiple columns in Pandas

Date: 2023-10-17 13:22:51
Tags: rows, based, duplicates, 909, DataFrame, keep, df, multiple, columns

In a Pandas DataFrame, you can remove duplicated rows based on multiple columns using the drop_duplicates() method. Here's how you can do it:

import pandas as pd

# Sample DataFrame
data = {
    'A': [1, 2, 3, 2, 1],
    'B': ['apple', 'banana', 'cherry', 'banana', 'apple'],
    'C': [10, 20, 30, 20, 10]
}

df = pd.DataFrame(data)

# Remove duplicates based on columns A and B
df = df.drop_duplicates(subset=['A', 'B'])

# Display the resulting DataFrame
print(df)

In this example, we have a DataFrame with three columns, and we want to remove duplicates based on columns 'A' and 'B'. The subset parameter is set to a list of column names ('A' and 'B') to specify which columns should be considered when checking for duplicates; column 'C' is ignored in the comparison. Rows 3 and 4 repeat the ('A', 'B') pairs of rows 1 and 0, so they are dropped and the resulting DataFrame keeps rows 0, 1 and 2 with their original index labels.
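To check which rows will be treated as duplicates before dropping anything, one option is DataFrame.duplicated() with the same subset, which returns a boolean mask. The following is a minimal sketch using the sample data from this post; the variable names mask, deduped and renumbered are only illustrative.

```python
import pandas as pd

data = {
    'A': [1, 2, 3, 2, 1],
    'B': ['apple', 'banana', 'cherry', 'banana', 'apple'],
    'C': [10, 20, 30, 20, 10],
}
df = pd.DataFrame(data)

# True marks a row whose (A, B) pair already appeared in an earlier row
mask = df.duplicated(subset=['A', 'B'])
print(mask.tolist())  # [False, False, False, True, True]

# drop_duplicates removes exactly the rows flagged above
deduped = df.drop_duplicates(subset=['A', 'B'])
print(deduped.index.tolist())  # [0, 1, 2]

# The original index labels are kept; reset_index(drop=True) renumbers from 0
renumbered = deduped.reset_index(drop=True)
print(renumbered.index.tolist())  # [0, 1, 2]
```

Inspecting the mask first can be useful on real data, where it is less obvious which rows will disappear.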

You can also use the keep parameter to control which duplicate rows to keep. By default, it's set to 'first', which keeps the first occurrence and removes subsequent duplicates. You can set it to 'last' to keep the last occurrence and remove earlier duplicates, or to False (the boolean, not the string 'False') to remove every row that has a duplicate. For example:

# Remove duplicates based on columns A and B, keeping the last occurrence
# (rebuild from the original data, because df above was already deduplicated)
df = pd.DataFrame(data).drop_duplicates(subset=['A', 'B'], keep='last')

This code will keep the last occurrence of a duplicated row based on columns 'A' and 'B'. Adjust the subset and keep parameters according to your specific requirements.
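To compare the three keep options side by side, the sketch below applies each one to a fresh copy of the sample data and records which index labels survive. The kept dictionary is just a helper for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 2, 1],
    'B': ['apple', 'banana', 'cherry', 'banana', 'apple'],
    'C': [10, 20, 30, 20, 10],
})

# Collect the surviving index labels for each keep option
kept = {
    keep: df.drop_duplicates(subset=['A', 'B'], keep=keep).index.tolist()
    for keep in ('first', 'last', False)
}

print(kept['first'])  # [0, 1, 2]
print(kept['last'])   # [2, 3, 4]
print(kept[False])    # [2]
```

With keep=False only the ('A', 'B') pair that occurs exactly once (3, 'cherry') remains, since every row belonging to a duplicated group is removed.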

From: https://www.cnblogs.com/alex-bn-lee/p/17769457.html
