
Pandas DataFrame duplicated() Method



Example

Check which rows are duplicates of an earlier row:

import pandas as pd

data = {
  "name": ["Sally", "Mary", "John", "Mary"],
  "age": [50, 40, 30, 40]
}

df = pd.DataFrame(data)

# True marks a row that repeats an earlier row in every column
s = df.duplicated()

print(s)

Definition and Usage

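The duplicated() method returns a Series of True and False values that indicates, for each row in the DataFrame, whether the row is a duplicate of another row. By default, the first occurrence of a repeated row is marked False and every later occurrence is marked True.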

Use the subset parameter to limit the comparison to specific columns. By default, all columns are compared.
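As a small illustration of subset, the sketch below compares rows on one column only. The data is made up for this example (the ages differ so that no two rows match in full); subset names the columns that are compared.

```python
import pandas as pd

# Illustrative data: the two "Mary" rows differ in age
df = pd.DataFrame({
    "name": ["Sally", "Mary", "John", "Mary"],
    "age": [50, 40, 30, 41],
})

# All columns compared: no row repeats another row in full
all_cols = df.duplicated()
print(all_cols.tolist())   # [False, False, False, False]

# Only "name" compared: the second "Mary" is flagged
by_name = df.duplicated(subset="name")
print(by_name.tolist())    # [False, False, False, True]
```

A list of labels, such as subset=["name", "age"], compares rows on that combination of columns.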


Syntax

dataframe.duplicated(subset, keep)

Parameters

Both parameters are optional and are usually passed as keyword arguments.

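Parameter  Value            Description
subset     column label(s)  Optional. A column label, or a list of labels, naming the columns to compare. By default, all columns are compared
keep       'first'          Optional, default. Marks every duplicate except the first occurrence as True
           'last'           Optional. Marks every duplicate except the last occurrence as True
           False            Optional. Marks ALL duplicates as True, including the first occurrence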
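To make the three keep settings concrete, here is a minimal sketch using the same data as the example at the top of the page. The variable names are chosen for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Sally", "Mary", "John", "Mary"],
    "age": [50, 40, 30, 40],
})

# 'first': the first "Mary" row is left unmarked, the later one is flagged
keep_first = df.duplicated(keep="first")
print(keep_first.tolist())  # [False, False, False, True]

# 'last': the last "Mary" row is left unmarked, the earlier one is flagged
keep_last = df.duplicated(keep="last")
print(keep_last.tolist())   # [False, True, False, False]

# False: every row that has a duplicate is flagged
keep_none = df.duplicated(keep=False)
print(keep_none.tolist())   # [False, True, False, True]
```

Note that duplicated() only marks rows. It does not remove anything from the DataFrame.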

Return Value

A Series with a boolean value for each row in the DataFrame.
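Because the result is a boolean Series aligned with the DataFrame's index, it can be used as a mask. The sketch below shows one common way to use it, with illustrative variable names: selecting the flagged rows, selecting the rest, and counting the duplicates.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Sally", "Mary", "John", "Mary"],
    "age": [50, 40, 30, 40],
})

mask = df.duplicated()

# Rows flagged as duplicates (here, the second "Mary" row at index 3)
duplicates = df[mask]

# Rows that are not flagged; ~ inverts the boolean mask
unique_rows = df[~mask]

# True counts as 1, so the sum is the number of duplicate rows
duplicate_count = int(mask.sum())
print(duplicate_count)  # 1
```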

