How To Add, Update, And Delete Rows & Columns In A Pandas DataFrame

This article shows examples of how to add, update, and delete rows and columns in a pandas DataFrame object.

1. The Example Original DataFrame Data.

  1. The Python source code below generates the original DataFrame object that the later examples modify.
    import pandas as pd
    
    '''
    This function creates a pandas DataFrame object from a 2-dimensional list.
    '''
    def create_dataframe_from_2_dimension_array():
        
        pd.set_option('display.unicode.east_asian_width', True)
        
        ''' Define a 2-dimensional list; each element of the outer list is one row.
            
            Each row contains the position number, programming language, and operating system. '''
        data = [[1, 'python', 'Windows'], [5, 'java', 'Linux'],[8, 'c++', 'macOS']]
        
        # Define the column list, each element in the list is the column label.
        columns = ['Position', 'Programming Language', 'Operating System']
        
        # Define the row index labels, one label per row.
        name = ['USA', 'UK', 'CA']
        
        df = pd.DataFrame(data=data, index=name, columns=columns)
        
        print('========================== original DataFrame data ================================')
        
        print(df, '\r\n')
        
        # Return the python pandas DataFrame object.
        return df
    
    if __name__ == '__main__':
        
        create_dataframe_from_2_dimension_array()
    
  2. When you run the above code, it prints the rows and columns of the DataFrame object.
    ========================== original DataFrame data ================================
         Position Programming Language Operating System
    USA         1               python          Windows
    UK          5                 java            Linux
    CA          8                  c++            macOS 
    

2. Add Columns & Rows In Pandas DataFrame Object.

2.1 Add Columns.

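  1. You can add a column to the DataFrame object in the following 3 ways.
  2. Assign the column directly with df['column name'], which appends it after the last column.
  3. Assign the column with the DataFrame loc attribute, which also appends it after the last column.
  4. Call the DataFrame object’s insert method, which places the column at a given position.
  5. Below is the example source code. The function add_columns_in_dataframe() reuses the create_dataframe_from_2_dimension_array() function from section 1, so both must be in the same file.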
    import pandas as pd
    
    def add_columns_in_dataframe():
        
        df = create_dataframe_from_2_dimension_array()
        
        print('\r\n========================== add column Database-1 at the end of the DataFrame columns ================================\r\n')
        
        # add a column by providing the column name and data list.
        df['Database-1'] = ['Oracle', 'MySQL', 'SQLite']
        
        print(df)   
        
        
        
        print('\r\n========================== use DataFrame loc attribute to add column Database-2 at the end of the DataFrame columns ================================\r\n')
        
        # add column with the DataFrame loc attribute.
        df.loc[:, 'Database-2'] = ['MongoDB', 'SQL Server', 'Access']
        
        print(df)   
        
           
        
        print('\r\n========================== insert column Coding-Language after DataFrame object first column ================================\r\n')
        
        # add column with the DataFrame insert method.
        df.insert(1, 'Coding-Language', ['C++', 'C', 'C#'])
        
        print(df)    
    
    
    if __name__ == '__main__':
        
        add_columns_in_dataframe()
    
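  6. Below is the output of the above example. In the last block, pandas abbreviates the middle columns with ... because the DataFrame is wider than the display.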
    ========================== original DataFrame data ================================
         Position Programming Language Operating System
    USA         1               python          Windows
    UK          5                 java            Linux
    CA          8                  c++            macOS 
    
    
    
    
    ========================== add column Database-1 at the end of the DataFrame columns ================================
    
    
         Position Programming Language Operating System Database-1
    USA         1               python          Windows     Oracle
    UK          5                 java            Linux      MySQL
    CA          8                  c++            macOS     SQLite
    
    
    ========================== use DataFrame loc attribute to add column Database-2 at the end of the DataFrame columns ================================
    
    
         Position Programming Language Operating System Database-1  Database-2
    USA         1               python          Windows     Oracle     MongoDB
    UK          5                 java            Linux      MySQL  SQL Server
    CA          8                  c++            macOS     SQLite      Access
    
    
    ========================== insert column Coding-Language after DataFrame object first column ================================
    
    
         Position Coding-Language  ... Database-1  Database-2
    USA         1             C++  ...     Oracle     MongoDB
    UK          5               C  ...      MySQL  SQL Server
    CA          8              C#  ...     SQLite      Access
    
    [3 rows x 6 columns]
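
The approaches above differ mainly in where the new column lands. The following is a minimal sketch of that difference, rebuilding the sample DataFrame inline so it runs on its own; the variable names column_order and insert_error are illustrative only. It also shows that insert raises a ValueError when the column label already exists (unless allow_duplicates=True is passed).

```python
import pandas as pd

# Rebuild the sample DataFrame from section 1 so this sketch is self-contained.
df = pd.DataFrame(
    [[1, 'python', 'Windows'], [5, 'java', 'Linux'], [8, 'c++', 'macOS']],
    index=['USA', 'UK', 'CA'],
    columns=['Position', 'Programming Language', 'Operating System'],
)

# Direct assignment appends the new column after the last existing column.
df['Database-1'] = ['Oracle', 'MySQL', 'SQLite']

# insert places the new column at integer position 1 (the second column).
df.insert(1, 'Coding-Language', ['C++', 'C', 'C#'])

column_order = list(df.columns)
print(column_order)

# Inserting a label that already exists raises ValueError by default.
insert_error = None
try:
    df.insert(0, 'Position', [0, 0, 0])
except ValueError as exc:
    insert_error = str(exc)

print(insert_error)
```

If you only need the column at the end, direct assignment is the shortest form; insert is useful when the column order matters for display or export.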
    

2.2 Add Rows.

  1. There are 2 ways to add rows to a pandas DataFrame object.
  2. Use the DataFrame object’s loc attribute with a new row label.
  3. Use the pandas concat function to append another DataFrame. (Older tutorials use the DataFrame append method, which was deprecated in pandas 1.4 and removed in pandas 2.0.)
  4. Below is the example function add_rows_in_dataframe().
    import pandas as pd
    
    def add_rows_in_dataframe():
        
        df = create_dataframe_from_2_dimension_array()
        
        print('\r\n========================== add one row in DataFrame ================================\r\n')
        
        # add DataFrame row with the loc attribute; use the integer 2 so the Position column stays numeric.
        df.loc['CN'] = [2, 'Python', 'Windows']
        
        print(df)    
        
        print('\r\n========================== add multiple rows in DataFrame ================================\r\n')
        
        # create a python dictionary object.
        dict_insert = {'Position':[3, 6, 9], 'Programming Language':['Go', 'R', 'Php'], 'Operating System':['Linux', 'macOS', 'Windows']}
        
        name = ['JP', 'TW', 'KO']
        
        df_1  = pd.DataFrame(data = dict_insert, index = name)
        
        # append the new DataFrame object to the existing one; concat returns a new DataFrame.
        df = pd.concat([df, df_1])
        
        print(df)
    
    if __name__ == '__main__':
           
        add_rows_in_dataframe()
  5. Below is the above example execution result.
    ========================== original DataFrame data ================================
         Position Programming Language Operating System
    USA         1               python          Windows
    UK          5                 java            Linux
    CA          8                  c++            macOS 
    
    
    
    
    ========================== add one row in DataFrame ================================
    
    
        Position Programming Language Operating System
    USA        1               python          Windows
    UK         5                 java            Linux
    CA         8                  c++            macOS
    CN         2               Python          Windows
    
    
    ========================== add multiple rows in DataFrame ================================
    
    
        Position Programming Language Operating System
    USA        1               python          Windows
    UK         5                 java            Linux
    CA         8                  c++            macOS
    CN         2               Python          Windows
    JP         3                   Go            Linux
    TW         6                    R            macOS
    KO         9                  Php          Windows
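
A note on combining DataFrames: DataFrame.append was removed in pandas 2.0, and pd.concat is the supported way to stack rows. The sketch below is a minimal illustration of how concat treats row labels, with and without ignore_index=True. It rebuilds a small version of the sample data inline, and the names new_rows, combined, and renumbered are illustrative only.

```python
import pandas as pd

# Sample DataFrame from section 1, rebuilt so this sketch runs on its own.
df = pd.DataFrame(
    [[1, 'python', 'Windows'], [5, 'java', 'Linux'], [8, 'c++', 'macOS']],
    index=['USA', 'UK', 'CA'],
    columns=['Position', 'Programming Language', 'Operating System'],
)

# The rows to add, with the same column labels as df.
new_rows = pd.DataFrame(
    {
        'Position': [3, 6],
        'Programming Language': ['Go', 'R'],
        'Operating System': ['Linux', 'macOS'],
    },
    index=['JP', 'TW'],
)

# concat returns a new DataFrame and keeps the row labels of both inputs.
combined = pd.concat([df, new_rows])
print(combined)

# ignore_index=True discards the labels and renumbers the rows from 0.
renumbered = pd.concat([df, new_rows], ignore_index=True)
print(renumbered)
```

Neither call modifies df or new_rows, so assign the result to a variable if you want to keep it.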
    

3. Update DataFrame Object Data.

3.1 Update DataFrame Object Column Title Labels.

  1. You can update the DataFrame object’s column labels through its columns attribute or its rename method.
    import pandas as pd
    
    def update_column_title_in_dataframe():
        
        df = create_dataframe_from_2_dimension_array()
        
        print('\r\n========================== update DataFrame column title ================================\r\n')
        
        # update all the DataFrame column names at once through its columns attribute.
        df.columns=['Pos', 'PL','OS']
        
        print(df)
        
        
        print('\r\n========================== update DataFrame multiple columns title by rename method ================================\r\n')
        
        dict_column_title_change = {'Pos':'Order', 'PL':'Coding Language','OS':'Operating System'}
        
        # update the DataFrame columns with its rename method; the dictionary maps old names to new names.
        df.rename(columns=dict_column_title_change, inplace=True)
        
        print(df)
    
    if __name__ == '__main__':
        
        update_column_title_in_dataframe()
  2. Below is the above example execution output.
    ========================== original DataFrame data ================================
         Position Programming Language Operating System
    USA         1               python          Windows
    UK          5                 java            Linux
    CA          8                  c++            macOS 
    
    
    
    
    ========================== update DataFrame column title ================================
    
    
         Pos      PL       OS
    USA    1  python  Windows
    UK     5    java    Linux
    CA     8     c++    macOS
    
    
    ========================== update DataFrame multiple columns title by rename method ================================
    
    
         Order Coding Language Operating System
    USA      1          python          Windows
    UK       5            java            Linux
    CA       8             c++            macOS
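
The two approaches behave differently when the labels do not line up. The following is a minimal sketch, with the sample DataFrame rebuilt inline and the variable names renamed and length_error chosen only for illustration. It shows that rename without inplace=True returns a new DataFrame and silently skips keys that match no column, while assigning a list of the wrong length to the columns attribute raises a ValueError.

```python
import pandas as pd

# Sample DataFrame from section 1, rebuilt so this sketch runs on its own.
df = pd.DataFrame(
    [[1, 'python', 'Windows'], [5, 'java', 'Linux'], [8, 'c++', 'macOS']],
    index=['USA', 'UK', 'CA'],
    columns=['Position', 'Programming Language', 'Operating System'],
)

# rename returns a new DataFrame; 'No Such Column' matches nothing and is skipped.
renamed = df.rename(columns={'Position': 'Order', 'No Such Column': 'X'})
print(list(renamed.columns))

# The original DataFrame is unchanged because inplace was not set.
print(list(df.columns))

# The columns attribute needs exactly one label per existing column.
length_error = None
try:
    df.columns = ['Pos', 'PL']
except ValueError as exc:
    length_error = str(exc)

print(length_error)
```

Use rename when you want to change only some of the labels, and the columns attribute when you are replacing all of them.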
    

3.2 Update DataFrame Object Row Index Labels.

  1. You can use the DataFrame object’s index attribute or rename method to update the DataFrame object’s row index labels.
    import pandas as pd
    
    def update_row_index_title_in_dataframe():
        
        df = create_dataframe_from_2_dimension_array()
        
        print('\r\n========================== update DataFrame row index label ================================\r\n')
        
        # update all the DataFrame object's row index labels at once through its index attribute.
        df.index = list('123')
        
        print(df)
        
        
        # get the original DataFrame object again.
        df = create_dataframe_from_2_dimension_array()
        
        print('\r\n========================== update DataFrame multiple rows index title by rename method ================================\r\n')
        
        dict_row_index_title_change = {'USA':'7', 'UK':'8','CA':'9'}
        
        # call DataFrame object's rename method to update the row label, axis=0 means update rows.
        df.rename(dict_row_index_title_change, axis=0, inplace=True)
        
        print(df)   
    
    
    if __name__ == '__main__':
           
        update_row_index_title_in_dataframe()
    
  2. Below is the above example execution result.
    ========================== original DataFrame data ================================
         Position Programming Language Operating System
    USA         1               python          Windows
    UK          5                 java            Linux
    CA          8                  c++            macOS 
    
    
    
    
    ========================== update DataFrame row index label ================================
    
    
       Position Programming Language Operating System
    1         1               python          Windows
    2         5                 java            Linux
    3         8                  c++            macOS
    
    
    
    
    
    ========================== original DataFrame data ================================
         Position Programming Language Operating System
    USA         1               python          Windows
    UK          5                 java            Linux
    CA          8                  c++            macOS 
    
    
    ========================== update DataFrame multiple rows index title by rename method ================================
    
    
       Position Programming Language Operating System
    7         1               python          Windows
    8         5                 java            Linux
    9         8                  c++            macOS
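
The rename method also accepts the row mapping through its index keyword, and the mapping can be a function instead of a dictionary. The sketch below is a small illustration under those assumptions; the sample DataFrame is rebuilt inline and the names relabelled and lowered are hypothetical.

```python
import pandas as pd

# Sample DataFrame from section 1, rebuilt so this sketch runs on its own.
df = pd.DataFrame(
    [[1, 'python', 'Windows'], [5, 'java', 'Linux'], [8, 'c++', 'macOS']],
    index=['USA', 'UK', 'CA'],
    columns=['Position', 'Programming Language', 'Operating System'],
)

# index= is equivalent to passing the mapping with axis=0; only 'USA' changes.
relabelled = df.rename(index={'USA': 'US'})
print(list(relabelled.index))

# A function is applied to every row label.
lowered = df.rename(index=str.lower)
print(list(lowered.index))
```

Both calls return new DataFrame objects and leave df untouched.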
    

3.3 Update DataFrame Object Rows & Columns Data.

  1. You can use the DataFrame object’s loc or iloc attribute to update the DataFrame object’s data.
    import pandas as pd
    
    def update_row_column_data_in_dataframe():
        
        df = create_dataframe_from_2_dimension_array()
        
        print('\r\n========================== update DataFrame entire row data ================================\r\n')
        
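        # loc selects the row by its label.
        df.loc['CA'] = [1, 'Python','Linux']
        
        # iloc selects the row by its integer position (1 is the second row, 'UK').
        df.iloc[1,:] = [2, 'Java & Python', 'Unix']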
        
        print(df)
        
        
        print('\r\n========================== update DataFrame entire column data ================================\r\n')
        
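        # loc selects the column by its label.
        df.loc[:, 'Operating System'] = ['macOS', 'Linux', 'Windows']
        
        # iloc selects the column by its integer position (1 is 'Programming Language').
        df.iloc[:, 1] = ['Java', 'Python', 'C++'] 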
        
        print(df)
        
        
        print('\r\n========================== update DataFrame data by row and column index ================================\r\n')
        
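        # loc addresses a single cell by row label and column label.
        df.loc['CA', 'Programming Language'] = 'Python & JavaScript'
        
        # iloc addresses a single cell by row position and column position.
        df.iloc[1, 1] = 'Python & R'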
        
        print(df)
    
    
    if __name__ == '__main__':
        
        update_row_column_data_in_dataframe()
  2. Below is the above example execution output.
    ========================== original DataFrame data ================================
         Position Programming Language Operating System
    USA         1               python          Windows
    UK          5                 java            Linux
    CA          8                  c++            macOS 
    
    
    ========================== update DataFrame entire row data ================================
    
    
         Position Programming Language Operating System
    USA         1               python          Windows
    UK          2        Java & Python             Unix
    CA          1               Python            Linux
    
    
    ========================== update DataFrame entire column data ================================
    
    
         Position Programming Language Operating System
    USA         1                 Java            macOS
    UK          2               Python            Linux
    CA          1                  C++          Windows
    
    
    ========================== update DataFrame data by row and column index ================================
    
    
         Position Programming Language Operating System
    USA         1                 Java            macOS
    UK          2           Python & R            Linux
    CA          1  Python & JavaScript          Windows
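
The main difference between the two attributes is that loc works with labels and iloc works with integer positions. The following is a minimal sketch of one consequence of that, with the sample DataFrame rebuilt inline and the variable name iloc_error chosen for illustration: loc adds a new row when the label does not exist yet, while iloc raises an IndexError for a position outside the DataFrame.

```python
import pandas as pd

# Sample DataFrame from section 1, rebuilt so this sketch runs on its own.
df = pd.DataFrame(
    [[1, 'python', 'Windows'], [5, 'java', 'Linux'], [8, 'c++', 'macOS']],
    index=['USA', 'UK', 'CA'],
    columns=['Position', 'Programming Language', 'Operating System'],
)

# Label-based update of one cell.
df.loc['UK', 'Position'] = 2

# Position-based update of one cell: third row, first column.
df.iloc[2, 0] = 1

# iloc cannot address a row position that does not exist.
iloc_error = None
try:
    df.iloc[3, 0] = 9
except IndexError as exc:
    iloc_error = str(exc)

print(iloc_error)

# loc with an unknown label adds a new row instead of failing.
df.loc['CN'] = [9, 'Go', 'Linux']
print(df)
```

Because loc silently adds rows, a typo in a row label creates a new row rather than raising an error, so check the index after label-based updates.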

4. Delete DataFrame Object Rows & Columns Data.

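  1. Call the DataFrame object’s drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise') method.
  2. labels: a single row or column label, or a list of labels, to drop; the axis parameter decides whether they are row or column labels.
  3. axis: axis=0 means drop by row, axis=1 means drop by column, the default value is 0.
  4. index: the row labels to drop, equivalent to passing labels with axis=0.
  5. columns: the column labels to drop, equivalent to passing labels with axis=1.
  6. inplace: True means changing the original DataFrame object directly (the method then returns None), False means returning a new DataFrame object.
  7. errors: 'raise' (the default) raises a KeyError when a label is not found, 'ignore' skips the missing labels.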
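
The inplace parameter described above changes what drop returns. The sketch below is a minimal illustration, with the sample DataFrame rebuilt inline; the variable names trimmed, result, missing_error, and ignored are illustrative only. It also shows the errors parameter, which controls what happens when a label does not exist.

```python
import pandas as pd

# Sample DataFrame from section 1, rebuilt so this sketch runs on its own.
df = pd.DataFrame(
    [[1, 'python', 'Windows'], [5, 'java', 'Linux'], [8, 'c++', 'macOS']],
    index=['USA', 'UK', 'CA'],
    columns=['Position', 'Programming Language', 'Operating System'],
)

# inplace=False (the default): df is unchanged and a new DataFrame is returned.
trimmed = df.drop(columns='Position')
print(list(trimmed.columns))

# inplace=True: df itself is modified and the method returns None.
result = df.drop(index='USA', inplace=True)
print(result, list(df.index))

# A missing label raises KeyError by default.
missing_error = False
try:
    df.drop(columns='No Such Column')
except KeyError:
    missing_error = True

# errors='ignore' skips the missing label and returns the data unchanged.
ignored = df.drop(columns='No Such Column', errors='ignore')
print(ignored)
```

A common mistake is writing df = df.drop(..., inplace=True), which replaces df with None; use either the assignment or inplace=True, not both.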

4.1 Drop DataFrame Object Columns Data.

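  1. The example function delete_column_in_dataframe() drops the three columns one at a time, using a different form of the drop call each time.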
    import pandas as pd
    
    def delete_column_in_dataframe():    
            
        df = create_dataframe_from_2_dimension_array()
        
        print('\r\n========================== drop DataFrame "Position" column data ================================\r\n')
        
        # axis=1 means to drop a column, so the first parameter is the column name.
        df.drop('Position', axis=1, inplace=True)
        
        print(df)
        
        print('\r\n========================== drop DataFrame "Programming Language" column data ================================\r\n')
        
        # specify the dropped column name to the columns parameter to drop it. 
        df.drop(columns='Programming Language', inplace=True)
       
        print(df)
    
        print('\r\n========================== drop DataFrame "Operating System" column data ================================\r\n')
        
        # pass the column name to the labels parameter and set axis=1 to drop it.
        df.drop(labels='Operating System', axis=1, inplace=True)
       
        print(df)
    
    
    if __name__ == '__main__':
        
        delete_column_in_dataframe()
  2. Below is the above example execution result.
    ========================== original DataFrame data ================================
         Position Programming Language Operating System
    USA         1               python          Windows
    UK          5                 java            Linux
    CA          8                  c++            macOS 
    
    
    
    
    ========================== drop DataFrame "Position" column data ================================
    
    
        Programming Language Operating System
    USA               python          Windows
    UK                  java            Linux
    CA                   c++            macOS
    
    
    ========================== drop DataFrame "Programming Language" column data ================================
    
    
        Operating System
    USA          Windows
    UK             Linux
    CA             macOS
    
    
    ========================== drop DataFrame "Operating System" column data ================================
    
    
    Empty DataFrame
    Columns: []
    Index: [USA, UK, CA]

4.2 Drop DataFrame Object Rows Data.

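  1. The example function delete_row_in_dataframe() drops the three rows one at a time, using a different form of the drop call each time.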
    import pandas as pd
    
    def delete_row_in_dataframe():    
            
        df = create_dataframe_from_2_dimension_array()
        
        print('\r\n========================== drop DataFrame "USA" row data ================================\r\n')
        
        # axis=0 means to drop a row, so the first parameter is the row label.
        df.drop('USA', axis=0, inplace=True)
        
        print(df)
        
        print('\r\n========================== drop DataFrame "UK" row data ================================\r\n')
        
        # pass the row label to the labels parameter; axis defaults to 0, so a row is dropped.
        df.drop(labels='UK', inplace=True)
       
        print(df)
    
        print('\r\n========================== drop DataFrame "CA" row data ================================\r\n')
        
        # pass the row label to the labels parameter and set axis=0 explicitly to drop it.
        df.drop(labels='CA', axis=0, inplace=True)
       
        print(df)
    
    
    if __name__ == '__main__':
        
        delete_row_in_dataframe()
  2. Below is the above source code execution output.
    ========================== original DataFrame data ================================
         Position Programming Language Operating System
    USA         1               python          Windows
    UK          5                 java            Linux
    CA          8                  c++            macOS 
    
    
    ========================== drop DataFrame "USA" row data ================================
    
    
        Position Programming Language Operating System
    UK         5                 java            Linux
    CA         8                  c++            macOS
    
    
    ========================== drop DataFrame "UK" row data ================================
    
    
        Position Programming Language Operating System
    CA         8                  c++            macOS
    
    
    ========================== drop DataFrame "CA" row data ================================
    
    
    Empty DataFrame
    Columns: [Position, Programming Language, Operating System]
    Index: []
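
The examples above drop one label per call. The drop method also accepts lists of labels, and the index and columns keywords can be combined in a single call. A minimal sketch under those assumptions, with the sample DataFrame rebuilt inline and the variable name smaller chosen for illustration:

```python
import pandas as pd

# Sample DataFrame from section 1, rebuilt so this sketch runs on its own.
df = pd.DataFrame(
    [[1, 'python', 'Windows'], [5, 'java', 'Linux'], [8, 'c++', 'macOS']],
    index=['USA', 'UK', 'CA'],
    columns=['Position', 'Programming Language', 'Operating System'],
)

# Drop two rows and one column in a single call; df itself is not modified.
smaller = df.drop(index=['USA', 'CA'], columns=['Position'])
print(smaller)
```

Using the index and columns keywords avoids having to remember which axis number means rows and which means columns.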
