Convert Numpy Array to Pandas DataFrame

In data analysis and manipulation, two popular Python libraries are Numpy and Pandas. Numpy provides fast numerical operations on n-dimensional arrays, while Pandas provides labeled data structures and functions for efficient data manipulation. Converting a Numpy array to a Pandas DataFrame adds row and column labels to the data and gives access to Pandas features such as label-based selection, grouping, and missing-data handling.

In this article, we will explore various ways to convert a Numpy array to a Pandas DataFrame and provide code examples along with their execution results.

Method 1: Using pd.DataFrame()

The simplest way to convert a Numpy array to a Pandas DataFrame is to pass it to the pd.DataFrame() constructor. Let’s see an example:

import numpy as np
import pandas as pd

# Creating a Numpy array
arr = np.array([[1, 2, 3], [4, 5, 6]])

# Converting Numpy array to Pandas DataFrame
df = pd.DataFrame(arr)

# Displaying the DataFrame
df

Output:

   0  1  2
0  1  2  3
1  4  5  6

This method takes the Numpy array as an argument and creates a DataFrame with the same values. Because no labels were given, the rows and columns receive default integer labels starting at 0.
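The same constructor also accepts a one-dimensional array, which becomes a single column. The following is a minimal sketch; the array values and the names arr_1d and df_1d are illustrative and not part of the original example.

```python
import numpy as np
import pandas as pd

# A 1D array is treated as one column with a default integer label
arr_1d = np.array([10, 20, 30])
df_1d = pd.DataFrame(arr_1d)

print(df_1d.shape)          # (number of rows, number of columns)
print(list(df_1d.columns))  # default column labels
print(df_1d.to_numpy())     # convert back to a Numpy array
```

Calling to_numpy() on the result returns the values as a Numpy array again, which is handy for checking that nothing changed during the conversion.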

Method 2: Using pd.DataFrame.from_records()

Another way to convert a Numpy array to a Pandas DataFrame is by using the pd.DataFrame.from_records() function. This method is useful when dealing with structured arrays or if we want to specify column names. Let’s see an example:

import numpy as np
import pandas as pd

# Creating a structured Numpy array
arr = np.array([(1, 'numpywhere.com'), (2, 'geek-docs.com'), (3, 'deepinout.com')], dtype=[('id', int), ('web', object)])

# Converting Numpy structured array to Pandas DataFrame
df = pd.DataFrame.from_records(arr)

# Displaying the DataFrame
df

Output:

   id             web
0   1  numpywhere.com
1   2   geek-docs.com
2   3   deepinout.com

Here, we have a structured Numpy array with two fields: id and web. The pd.DataFrame.from_records() method automatically assigns column names based on the field names in the structured array.
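from_records() also takes an index parameter, which can name a field of the structured array to use as the row index instead of a regular column. A minimal sketch, reusing the array from the example above (the variable name df_indexed is illustrative):

```python
import numpy as np
import pandas as pd

# Structured array with an integer field and an object field
arr = np.array(
    [(1, 'numpywhere.com'), (2, 'geek-docs.com'), (3, 'deepinout.com')],
    dtype=[('id', int), ('web', object)],
)

# Use the 'id' field as the row index; only 'web' remains as a column
df_indexed = pd.DataFrame.from_records(arr, index='id')

print(df_indexed.index.name)
print(list(df_indexed.columns))
print(df_indexed.loc[2, 'web'])  # look up a row by its id
```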

Method 3: Using pd.DataFrame.from_dict()

If we have a dictionary with column names as keys and Numpy arrays as values, we can directly convert it to a Pandas DataFrame using the pd.DataFrame.from_dict() method. Let’s see an example:

import numpy as np
import pandas as pd

# Creating a dictionary with column names as keys and Numpy arrays as values
data_dict = {
    'web': np.array(['numpywhere.com', 'geek-docs.com', 'deepinout.com']),
    'year': np.array([2020, 2022, 2023]),
    'city': np.array(['New York', 'Beijing', 'Chengdu'])
}

# Converting dictionary to Pandas DataFrame
df = pd.DataFrame.from_dict(data_dict)

# Displaying the DataFrame
df

Output:

              web  year      city
0  numpywhere.com  2020  New York
1   geek-docs.com  2022   Beijing
2   deepinout.com  2023   Chengdu

In this example, each key in the dictionary represents a column name, and the corresponding Numpy array contains the values for that column.
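When each Numpy array holds a row rather than a column, from_dict() can be called with orient='index' so that the dictionary keys become row labels. A minimal sketch; the dictionary rows and the column names a, b, and c are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Each key is a row label and each array holds that row's values
rows = {
    'row1': np.array([1, 2, 3]),
    'row2': np.array([4, 5, 6]),
}

# orient='index' treats keys as rows; columns names the positions
df_rows = pd.DataFrame.from_dict(rows, orient='index', columns=['a', 'b', 'c'])

print(df_rows)
```

With the default orient='columns', the same dictionary would instead produce two columns named row1 and row2 with three rows each.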

Method 4: Using pd.DataFrame.from_records() with Column Names

In Method 2, we saw how to convert a structured Numpy array to a Pandas DataFrame. For a structured array, the columns parameter of pd.DataFrame.from_records() does not rename the fields: the Pandas documentation describes it as setting the column order when the data already carry names, with names not found in the data becoming all-NA columns. To assign custom column names, we can rename the columns after the conversion. Let’s see an example:

import numpy as np
import pandas as pd

# Creating a structured Numpy array
arr = np.array([(2024, 'numpywhere.com'), (2023, 'geek-docs.com'), (2022, 'deepinout.com')], dtype=[('year', int), ('web', object)])

# Converting Numpy structured array to Pandas DataFrame, then renaming the columns
df = pd.DataFrame.from_records(arr).rename(columns={'year': 'Year', 'web': 'Website'})

# Displaying the DataFrame
df

Output:

   Year         Website
0  2024  numpywhere.com
1  2023   geek-docs.com
2  2022   deepinout.com

By mapping each field name to a new label with rename(), we can assign custom names to the columns.
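Column labels can also be replaced positionally after the conversion by assigning a list to df.columns. A minimal sketch; the labels Year and Website and the variable name df_labeled are hypothetical names chosen for illustration.

```python
import numpy as np
import pandas as pd

# Structured array whose field names become the initial column labels
arr = np.array(
    [(2024, 'numpywhere.com'), (2023, 'geek-docs.com'), (2022, 'deepinout.com')],
    dtype=[('year', int), ('web', object)],
)

df_labeled = pd.DataFrame.from_records(arr)

# Replace the labels in order; the list length must match the column count
df_labeled.columns = ['Year', 'Website']

print(df_labeled)
```

Positional assignment depends on the field order of the array, so a mapping from old names to new names is safer when that order might change.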

Method 5: Using pd.DataFrame() with Index and Column Names

We can also convert a Numpy array to a Pandas DataFrame while specifying both the index and column names using the pd.DataFrame() function. Let’s see an example:

import numpy as np
import pandas as pd

# Creating a Numpy array
arr = np.array([[1, 2, 3], [4, 5, 6]])

# Converting Numpy array to Pandas DataFrame with custom index and column names
df = pd.DataFrame(arr, index=['row1', 'row2'], columns=['col1', 'col2', 'col3'])

# Displaying the DataFrame
df

Output:

      col1  col2  col3
row1     1     2     3
row2     4     5     6

In this example, we provided custom index names (row1 and row2) and column names (col1, col2, and col3) while converting the Numpy array to a Pandas DataFrame.
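Custom labels make it possible to select values by name with .loc, and the number of labels has to match the shape of the array. A minimal sketch of both points, reusing the array from the example above (the names value and mismatch_error are illustrative):

```python
import numpy as np
import pandas as pd

arr = np.array([[1, 2, 3], [4, 5, 6]])
df = pd.DataFrame(arr, index=['row1', 'row2'], columns=['col1', 'col2', 'col3'])

# Label-based lookup: row 'row2', column 'col3'
value = df.loc['row2', 'col3']
print(value)

# Passing two column names for a three-column array raises ValueError
mismatch_error = None
try:
    pd.DataFrame(arr, columns=['col1', 'col2'])
except ValueError as exc:
    mismatch_error = exc
    print('ValueError:', exc)
```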

Method 6: Using pd.DataFrame() with a 2D Numpy Array

If we have a two-dimensional Numpy array, we can convert it to a Pandas DataFrame by specifying a name for each column. Arrays with more than two dimensions must be reshaped to 2D before they can be passed to pd.DataFrame(). Let’s see an example:

import numpy as np
import pandas as pd

# Creating a 2D Numpy array
arr = np.array([[1, 2], [3, 4], [5, 6]])

# Converting Numpy array to Pandas DataFrame with custom column names
df = pd.DataFrame(arr, columns=['col1', 'col2'])

# Displaying the DataFrame
df

Output:

   col1  col2
0     1     2
1     3     4
2     5     6

Here, we have a 2D Numpy array with three rows and two columns. We provided custom column names while converting it to a Pandas DataFrame.
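pd.DataFrame() rejects arrays with more than two dimensions, so a 3D array has to be flattened to 2D first. One possible approach, sketched below, stacks the blocks of the 3D array on top of each other and records the original positions in a MultiIndex. The array contents and the names block, row, and c0 to c3 are hypothetical and used only for illustration.

```python
import numpy as np
import pandas as pd

# A 3D array: 2 blocks, each with 3 rows and 4 columns
arr_3d = np.arange(24).reshape(2, 3, 4)

# Passing the 3D array directly raises ValueError
raised = False
try:
    pd.DataFrame(arr_3d)
except ValueError:
    raised = True

# Collapse the first two axes so each row of the result is one (block, row) pair
arr_2d = arr_3d.reshape(-1, arr_3d.shape[-1])
index = pd.MultiIndex.from_product([range(2), range(3)], names=['block', 'row'])
df_3d = pd.DataFrame(arr_2d, index=index, columns=['c0', 'c1', 'c2', 'c3'])

print(df_3d)
```

The MultiIndex keeps the block and row positions, so a value such as arr_3d[1, 2, 3] can still be found with df_3d.loc[(1, 2), 'c3'].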

Method 7: Using pd.DataFrame() with DateTime Index

If our Numpy array represents time-series data, we can convert it to a Pandas DataFrame with a DateTime index, which can be generated with the pd.date_range() function. Let’s see an example:

import numpy as np
import pandas as pd

# Creating a Numpy array
arr = np.array([[1, 2], [3, 4], [5, 6]])

# Generating DateTime index
datetime_index = pd.date_range('2024-01-01', periods=3, freq='D')

# Converting Numpy array to Pandas DataFrame with DateTime index
df = pd.DataFrame(arr, index=datetime_index, columns=['col1', 'col2'])

# Displaying the DataFrame
df

Output:

            col1  col2
2024-01-01     1     2
2024-01-02     3     4
2024-01-03     5     6

In this example, we generated a DateTime index using the pd.date_range() function and used it to create a Pandas DataFrame from the Numpy array.
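When the dates already exist as a Numpy array of strings, pd.to_datetime() can convert them into a DatetimeIndex for the same purpose. A minimal sketch; the dates array and the name subset are illustrative assumptions, not part of the original example.

```python
import numpy as np
import pandas as pd

# Dates stored as strings in a Numpy array, one per row of the data
dates = np.array(['2024-01-01', '2024-01-02', '2024-01-03'])
arr = np.array([[1, 2], [3, 4], [5, 6]])

# pd.to_datetime() parses the strings into a DatetimeIndex
df_ts = pd.DataFrame(arr, index=pd.to_datetime(dates), columns=['col1', 'col2'])

# Date strings can then be used to slice rows (both endpoints included)
subset = df_ts.loc['2024-01-02':'2024-01-03']
print(subset)
```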

Method 8: Using pd.DataFrame() with Categorical Data

If our Numpy array contains categorical data, we can wrap it in pd.Categorical() when building the DataFrame so that the column is stored with the category dtype. Let’s see an example:

import numpy as np
import pandas as pd

# Creating a Numpy array with categorical data
arr = np.array(['numpywhere.com', 'geek-docs.com', 'deepinout.com', 'csdn.net', 'cnblogs.com'])

# Converting Numpy array to Pandas DataFrame with categorical data
df = pd.DataFrame({'Web': pd.Categorical(arr)})

# Displaying the DataFrame
df

Output:

              Web
0  numpywhere.com
1   geek-docs.com
2   deepinout.com
3        csdn.net
4     cnblogs.com
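pd.Categorical() also accepts an explicit list of categories and an ordered flag, which is useful when the values have a natural order. A minimal sketch; the sizes array and the column name size are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Repeated labels with a natural order: small < medium < large
sizes = np.array(['small', 'large', 'medium', 'small'])

# Fix the category order explicitly and mark the categorical as ordered
cat = pd.Categorical(sizes, categories=['small', 'medium', 'large'], ordered=True)
df_cat = pd.DataFrame({'size': cat})

print(df_cat['size'].dtype)               # the column uses the category dtype
print(df_cat['size'].cat.codes.tolist())  # integer code of each value
print(df_cat['size'].max())               # comparisons follow the category order
```

Each value is stored as an integer code that points into the list of categories, which can reduce memory use when a column contains many repeated labels.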
