How To Use DataFrame In Pandas

A pandas DataFrame is a two-dimensional data structure that resembles a table: it has labeled rows and labeled columns. All values in a single column share one data type, while different columns may have different types. You can read a column's elements one by one through the row index. This article shows how to create a pandas DataFrame object and how to read its column and row data.

1. How To Create Pandas DataFrame Object.

  1. Call the pandas DataFrame(data, index=index, columns=columns) constructor to create a DataFrame object.
  2. The data parameter holds the DataFrame's data. It can be a two-dimensional array (for example a list of lists) or a Python dictionary.
  3. The index parameter holds the row labels. It is a list-like object such as a Python list; if you omit it, pandas numbers the rows starting from 0.
  4. The columns parameter holds the column labels. You can use a column label to get that column's data as a pandas.Series object.
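
As a minimal sketch of the parameters above, the snippet below builds one DataFrame without labels and one with explicit labels. The sample values and the names default_df and labeled_df are made up for illustration and are not part of the original examples.

```python
import pandas as pd

# Hypothetical sample data: two rows, two columns.
data = [[1, 'python'], [5, 'java']]

# Without index and columns, pandas assigns default integer labels 0, 1, ...
default_df = pd.DataFrame(data)
print(list(default_df.index))    # default row labels
print(list(default_df.columns))  # default column labels

# With explicit labels, rows and columns can be addressed by those labels.
labeled_df = pd.DataFrame(data, index=['a', 'b'], columns=['Position', 'Language'])
print(labeled_df['Language'])    # one column as a pandas Series
```

Passing explicit labels is optional, but it tends to make later lookups easier to read than positional numbers.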

1.1 Create Pandas DataFrame Object From A Two-Dimensional Array.

  1. The example below creates a pandas DataFrame object from a two-dimensional array (a list of lists).
    import pandas as pd
    
    
    def create_dataframe_from_2_dimension_array():
        '''Create a pandas DataFrame object from a 2 dimension array (a list of lists).'''
        
        # Align columns correctly when the data contains East Asian wide characters.
        pd.set_option('display.unicode.east_asian_width', True)
        
        # Define a 2 dimension array: each inner list is one row that holds
        # the position number, programming language and operating system.
        data = [[1, 'python', 'Windows'], [5, 'java', 'Linux'], [8, 'c++', 'macOS']]
        
        # Define the column list, each element in the list is the column label.
        columns = ['Position', 'Programming Language', 'Operating System']
        
        # Define the row index list.
        index = [1, 2, 3]
        
        # Create the python pandas DataFrame object.
        df = pd.DataFrame(data, index=index, columns=columns)
        
        # Print out the DataFrame object data.
        print(df)
        
        # Return the python pandas DataFrame object.
        return df
    
  2. Running the function above prints the following output to the console.
       Position Programming Language Operating System
    1         1               python          Windows
    2         5                 java            Linux
    3         8                  c++            macOS
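
To check what the constructor produced, a short sketch like the following can help. It rebuilds the same DataFrame and inspects its shape, its per-column data types, and a single row. The variable name row is an illustrative choice, and the exact dtype names printed may differ between pandas versions.

```python
import pandas as pd

data = [[1, 'python', 'Windows'], [5, 'java', 'Linux'], [8, 'c++', 'macOS']]
columns = ['Position', 'Programming Language', 'Operating System']
df = pd.DataFrame(data, index=[1, 2, 3], columns=columns)

# (number of rows, number of columns)
print(df.shape)

# One data type per column; the dtype names can vary with the pandas version.
print(df.dtypes)

# Select the row whose index label is 2 (a label, not a position).
row = df.loc[2]
print(row['Programming Language'])
```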
    

1.2 Create Pandas DataFrame Object By Python Dictionary Object.

  1. The example below creates a pandas DataFrame object from a Python dictionary object.
    import pandas as pd
    
    def create_dataframe_from_dictionary_object():
        '''Create a pandas DataFrame object from a Python dictionary object.'''
        
        pd.set_option('display.unicode.east_asian_width', True)
        
        # Define a Python dictionary object: each key is a column name and each
        # value is a list that holds that column's value for every row.
        dict_obj = {'Position':[1, 5, 8], 'Programming Language':['python', 'java', 'c++'], 'Operating System':['Windows', 'Linux', 'macOS']}
        
        # Create a list object to store the row index number.
        index = [1, 2, 3]
        
        # Create the python pandas DataFrame object 
        df = pd.DataFrame(dict_obj, index=index)
        
        # Print the DataFrame object's data in the console.
        print(df)
        
        # Return the created DataFrame object.
        return df
  2. The function above prints the following result to the console.
       Position Programming Language Operating System
    1         1               python          Windows
    2         5                 java            Linux
    3         8                  c++            macOS
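
The two outputs above look identical. As a rough check, and assuming the same sample data as in both examples, the sketch below builds the DataFrame both ways and compares them with the DataFrame equals() method. The names df_from_rows and df_from_dict are illustrative only.

```python
import pandas as pd

columns = ['Position', 'Programming Language', 'Operating System']

# Row-oriented input: each inner list is one row.
rows = [[1, 'python', 'Windows'], [5, 'java', 'Linux'], [8, 'c++', 'macOS']]

# Column-oriented input: each key is a column label, each value is that column.
dict_obj = {
    'Position': [1, 5, 8],
    'Programming Language': ['python', 'java', 'c++'],
    'Operating System': ['Windows', 'Linux', 'macOS'],
}

df_from_rows = pd.DataFrame(rows, index=[1, 2, 3], columns=columns)
df_from_dict = pd.DataFrame(dict_obj, index=[1, 2, 3])

# equals() compares the labels, the values and the column data types.
print(df_from_rows.equals(df_from_dict))
```

Which input form to use mostly depends on how the data arrives: a list of rows suits record-like data, while a dictionary suits data that is already grouped by column.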

2. How To Iterate Python Pandas DataFrame Object.

2.1 Iterate DataFrame Columns.

  1. The DataFrame object's columns attribute returns all of the DataFrame's column labels in a pandas Index object.
  2. You can iterate over the returned column labels and use each one to get that column's data as a pandas Series object. Below is an example.
    def iterate_dataframe_object(dataframe_object):
        '''Iterate the DataFrame object's columns and print each column's data.'''
        
        
        print('=================== iterate_dataframe_object ======================')
        
        
        # Loop the DataFrame object's columns.
        for column in dataframe_object.columns:
            
            # Print out the column name.
            print(column)
            
            # Get the column data in a pandas Series object.
            column_data_series = dataframe_object[column]
            
            # Print out the column data Series object.
            print(column_data_series)
    
            print('=======================================')
    
    
    if __name__ == '__main__':
        
        #create_dataframe_from_2_dimension_array()
        
        df = create_dataframe_from_dictionary_object()
        
        iterate_dataframe_object(df)
    
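  3. Below is the execution result of the above example. The DataFrame table that create_dataframe_from_dictionary_object() prints first is omitted here.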
    =================== iterate_dataframe_object ======================
    Position
    1    1
    2    5
    3    8
    Name: Position, dtype: int64
    =======================================
    Programming Language
    1    python
    2      java
    3       c++
    Name: Programming Language, dtype: object
    =======================================
    Operating System
    1    Windows
    2      Linux
    3      macOS
    Name: Operating System, dtype: object
    =======================================
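
The loop above looks up each column by its label. An alternative sketch, which is not taken from the original example, uses the DataFrame items() method to get the label and the column Series together. The smaller two-column sample is an assumption made to keep the snippet short.

```python
import pandas as pd

df = pd.DataFrame(
    {'Position': [1, 5, 8], 'Programming Language': ['python', 'java', 'c++']},
    index=[1, 2, 3],
)

# items() yields one (column label, column Series) pair per column.
for label, series in df.items():
    # tolist() converts the Series values to a plain Python list.
    print(label, series.tolist())
```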

2.2 Iterate DataFrame Rows.

  1. Call the DataFrame object's iterrows() method to get a row iterator. Each item it yields is a tuple of the row's index label and the row's data as a pandas Series object.
  2. Then call Python's built-in next() function to step through the iterator and get each row's data. Below is the example source code.
    '''
    Created on Oct 23, 2021
    
    @author: songzhao
    '''
    
    import pandas as pd
    
    def create_dataframe_from_dictionary_object():
        '''Create a pandas DataFrame object from a Python dictionary object.'''
        
        pd.set_option('display.unicode.east_asian_width', True)
        
        # Define a Python dictionary object: each key is a column name and each
        # value is a list that holds that column's value for every row.
        dict_obj = {'Position':[1, 5, 8], 'Programming Language':['python', 'java', 'c++'], 'Operating System':['Windows', 'Linux', 'macOS']}
        
        # Create a list object to store the row index number.
        index = [1, 2, 3]
        
        # Create the python pandas DataFrame object 
        df = pd.DataFrame(dict_obj, index=index)
        
        # Print the DataFrame object's data in the console.
        print(df)
        
        # Return the created DataFrame object.
        return df
            
        
    def iterate_dataframe_rows(df_obj):
        '''Iterate the DataFrame object's rows and print each row's data.'''
        
        print('=================== iterate_dataframe_rows ======================')
        
        # Call the DataFrame object's iterrows() function to get row iterator.
        iterator = df_obj.iterrows()
        
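        # Get the first item from the iterator, or None if the DataFrame has no rows.
        row = next(iterator, None)
       
        # Loop while the iterator still returns rows.
        while row is not None:
            
            # Each item is a tuple: (row index label, row data as a Series).
            row_number = row[0]
            
            series_obj = row[1]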
            
            print('row number = ', row_number)
            
            print(series_obj.index)
            
            print(series_obj.values)
            
            print('\r\n')
            
            # Get the next row from the iterator.
            row = next(iterator, None)
                
    
    if __name__ == '__main__':
        
        df = create_dataframe_from_dictionary_object()
        
        iterate_dataframe_rows(df)
    
  3. When you run the above example source code, you will get the below output.
       Position Programming Language Operating System
    1         1               python          Windows
    2         5                 java            Linux
    3         8                  c++            macOS
    =================== iterate_dataframe_rows ======================
    row number =  1
    Index(['Position', 'Programming Language', 'Operating System'], dtype='object')
    [1 'python' 'Windows']
    
    
    row number =  2
    Index(['Position', 'Programming Language', 'Operating System'], dtype='object')
    [5 'java' 'Linux']
    
    
    row number =  3
    Index(['Position', 'Programming Language', 'Operating System'], dtype='object')
    [8 'c++' 'macOS']
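
The explicit next() calls above can also be written as a for loop. The sketch below is an alternative and not the original author's code: it iterates the same kind of DataFrame with iterrows() and then with itertuples(), which yields one tuple per row. The smaller two-column sample and the variable names are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {'Position': [1, 5, 8], 'Programming Language': ['python', 'java', 'c++']},
    index=[1, 2, 3],
)

# A for loop consumes the iterrows() iterator without explicit next() calls.
for row_label, row_series in df.iterrows():
    print(row_label, row_series['Programming Language'])

# itertuples() yields one named tuple per row. With index=False the index
# label is left out, so field 0 is the first column.
for row_tuple in df.itertuples(index=False):
    print(row_tuple[0], row_tuple[1])
```

The values are read by position here because a column label that contains a space, such as 'Programming Language', cannot be used as a named tuple attribute.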
    
