Generating the unique values of every column in a dataframe and storing them in a Python dictionary.
Each dataframe column (a pandas Series) has a "unique" method, which returns the unique values of that column.
The dataframe "columns" property is also used to get the column names, which are iterated over to collect the unique values of each column.
Creating a new dataframe from a dictionary (which has duplicate data)
# importing pandas
import pandas as pd
# animal_data_with_duplicates dictionary
animal_data_with_duplicates = {
    "Name": ["Cat", "Dog", "Cow", "Tiger", "Cat"],
    "Speed": [15, 12, 10, 20, 15],
    "Sound": ["Meow", "Woof", "Mooo", "Roar", "Meow"],
}
#creating a dataframe using the animal_data_with_duplicates dictionary
animal_with_duplicates_df = pd.DataFrame(animal_data_with_duplicates)
#printing animal_with_duplicates_df
print("animal_with_duplicates_df \n", animal_with_duplicates_df)
animal_with_duplicates_df
    Name  Speed Sound
0    Cat     15  Meow
1    Dog     12  Woof
2    Cow     10  Mooo
3  Tiger     20  Roar
4    Cat     15  Meow
#generating unique values from "Name" column
unique_names = animal_with_duplicates_df["Name"].unique()
print("unique_names \n", unique_names)
unique_names ['Cat' 'Dog' 'Cow' 'Tiger']
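The result of "unique" is an array-like object rather than a plain Python list, and the values come back in order of first appearance. A minimal sketch, assuming a plain list is wanted for further processing (the variable names here are illustrative and not part of the original example):

```python
import pandas as pd

# Illustrative Series with one duplicated value ("Cat")
names = pd.Series(["Cat", "Dog", "Cow", "Tiger", "Cat"])

# unique() keeps the order in which values first appear
unique_names = names.unique()

# tolist() converts the array-like result into a plain Python list
unique_name_list = unique_names.tolist()
print(unique_name_list)
```

A plain list is usually easier to serialize (for example to JSON) than the array returned by "unique".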
#getting the column names of dataframe
column_names = animal_with_duplicates_df.columns
print("column_names \n" , column_names)
#dictionary to store unique values for each column
unique_dict={}
# iterating over the column names
# getting unique values for each column and storing it to dictionary
for column in column_names:
    unique_items = animal_with_duplicates_df[column].unique()
    unique_dict[column] = unique_items
print("\n unique_dict \n", unique_dict)
column_names
Index(['Name', 'Speed', 'Sound'], dtype='object')
unique_dict
{'Name': array(['Cat', 'Dog', 'Cow', 'Tiger'], dtype=object),
'Speed': array([15, 12, 10, 20]),
'Sound': array(['Meow', 'Woof', 'Mooo', 'Roar'], dtype=object)}
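The same loop can also be written as a dictionary comprehension. The sketch below is one possible variant, not the only way to do it: the helper name unique_values_by_column is hypothetical, and it assumes plain Python lists are preferred over arrays as the dictionary values.

```python
import pandas as pd


def unique_values_by_column(df):
    """Hypothetical helper: map each column name to a list of its unique values."""
    # One dictionary entry per column; tolist() turns each array into a plain list
    return {column: df[column].unique().tolist() for column in df.columns}


# Same data as the example above, with the first row repeated at the end
animal_df = pd.DataFrame(
    {
        "Name": ["Cat", "Dog", "Cow", "Tiger", "Cat"],
        "Speed": [15, 12, 10, 20, 15],
        "Sound": ["Meow", "Woof", "Mooo", "Roar", "Meow"],
    }
)

unique_lists = unique_values_by_column(animal_df)
print(unique_lists)
```

Keeping the logic in a small function makes it reusable for any dataframe, while the explicit loop shown earlier may be easier to follow step by step.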