
Drop Missing Data

Difficulty: Easy


Problem Description

Some rows of the DataFrame have missing values in the name column. Write a solution that removes those rows.


Key Insights

  • The goal is to filter out rows where the name column contains missing (null) values.
  • Each row of the DataFrame represents a student with three attributes: student_id, name, and age.
  • Only rows with a missing name should be removed; every other row, and every column, must be left unchanged.

Space and Time Complexity

Time Complexity: O(n), where n is the number of rows in the DataFrame, since we need to check each row for the presence of a missing value in the name column.

Space Complexity: O(n), as we create a new DataFrame to store the filtered results.


Solution

To solve the problem, we check the name column for null (missing) values and keep only the rows where the name is present. Rather than looping over rows explicitly, we build a boolean mask over the name column and use it to index the DataFrame. This produces a new DataFrame containing only the rows with a non-null name, and leaves all remaining rows and columns unchanged.
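As a minimal sketch of the filtering step, the snippet below prints the intermediate boolean mask before applying it. The variable names students, mask, and kept are illustrative assumptions, and the sample data mirrors the example used in this write-up.

```python
import pandas as pd

# Sample data (illustrative): one student has a missing name
students = pd.DataFrame({
    "student_id": [32, 217, 779, 849],
    "name": ["Piper", None, "Georgia", "Willow"],
    "age": [5, 19, 20, 14],
})

# Boolean mask: True where 'name' is present, False where it is missing
mask = students["name"].notna()
print(mask.tolist())  # [True, False, True, True]

# Indexing with the mask keeps only the rows marked True.
# The original `students` DataFrame is not modified.
kept = students[mask]
print(kept["student_id"].tolist())  # [32, 779, 849]
```

Because the mask is computed on the name column only, values in student_id and age play no part in deciding which rows are kept.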


Code Solutions

import pandas as pd

# Create a DataFrame with student data
df = pd.DataFrame({
    'student_id': [32, 217, 779, 849],
    'name': ['Piper', None, 'Georgia', 'Willow'],
    'age': [5, 19, 20, 14]
})

# Remove rows where 'name' is missing
df_cleaned = df[df['name'].notnull()]

# Display the cleaned DataFrame
print(df_cleaned)
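
An alternative, sketched here under the assumption that a reusable function is wanted, is DataFrame.dropna with the subset parameter. The function name drop_missing_names and the sample data with missing ages are hypothetical and chosen only to show that nulls outside the name column do not cause a row to be dropped.

```python
import pandas as pd

def drop_missing_names(students: pd.DataFrame) -> pd.DataFrame:
    """Return a new DataFrame without the rows whose 'name' is missing.

    Hypothetical helper for illustration; not part of pandas.
    """
    # subset=["name"] restricts the null check to the name column,
    # so missing values in other columns do not remove a row
    return students.dropna(subset=["name"])

# Illustrative data: two ages are missing in addition to one name
students = pd.DataFrame({
    "student_id": [32, 217, 779, 849],
    "name": ["Piper", None, "Georgia", "Willow"],
    "age": [5, None, 20, None],
})

result = drop_missing_names(students)
print(result)
```

Only the row with student_id 217 is removed here; the row for Willow stays even though its age is missing. Calling dropna() without subset would drop that row as well, which is not what the problem asks for.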