## When I convert my NumPy array to a DataFrame, the values become NaN

    import impyute.imputation.cs as imp

    print(Data)
    Data = pd.DataFrame(data = imp.em(Data), columns = columns)
    print(Data)

When I run the code above, all my values are converted to NaN, as shown below. Can someone help me see where I am going wrong?

Before

       Time  LymphNodeStatus  ...  MeanPerimeter  TumorSize
    0    31              5.0  ...         117.50        5.0
    1    61              2.0  ...         122.80        3.0
    2   116              0.0  ...         137.50        2.5
    3   123              0.0  ...          77.58        2.0
    4    27              0.0  ...         135.10        3.5
    5    77              0.0  ...          84.60        2.5

After

       Time  LymphNodeStatus  ...  MeanPerimeter  TumorSize
    0   NaN              NaN  ...            NaN        NaN
    1   NaN              NaN  ...            NaN        NaN
    2   NaN              NaN  ...            NaN        NaN
    3   NaN              NaN  ...            NaN        NaN
    4   NaN              NaN  ...            NaN        NaN
    5   NaN              NaN  ...            NaN        NaN

**Edited**

**Solution first**

Instead of passing `columns` to `pd.DataFrame`, just assign the column names manually:

    data = pd.DataFrame(imp.em(data))
    data.columns = columns
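For anyone who wants to try the fix without installing `impyute`, here is a minimal sketch. `fake_impute` is a hypothetical stand-in I made up for illustration: it fills gaps with column means (not expectation-maximization) and, like `imp.em` as described in this answer, hands back a DataFrame with default integer column labels. The sample values are invented as well.

```python
import numpy as np
import pandas as pd


def fake_impute(frame):
    """Hypothetical stand-in for imp.em (assumption: NOT the real EM algorithm).

    Fills each missing cell with its column mean and returns a DataFrame
    whose columns are the default labels 0..n-1.
    """
    values = frame.to_numpy(dtype=float)
    col_means = np.nanmean(values, axis=0)           # one mean per column
    filled = np.where(np.isnan(values), col_means, values)
    return pd.DataFrame(filled)                      # labels are 0, 1, ...


raw = pd.DataFrame({"Time": [31.0, 61.0, np.nan], "TumorSize": [5.0, np.nan, 2.5]})
columns = list(raw.columns)

result = fake_impute(raw)
result.columns = columns   # relabel afterwards instead of passing columns=
print(result)
```

Assigning to `.columns` replaces the labels by position, so the values are left untouched.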

**Cause**

The error lies in `Data = pd.DataFrame(data = imp.em(Data),columns = columns)`.

`imp.em` has a decorator, `@preprocess`, which converts the input into a `numpy.array` if it is a `pandas.DataFrame`.

    ...
    if pd_DataFrame and isinstance(args[0], pd_DataFrame):
        args[0] = args[0].as_matrix()
        return pd_DataFrame(fn(*args, **kwargs))

It therefore returns a dataframe reconstructed from a matrix, with `range(data.shape[1])` as column names.

And, as I point out below, when a `pd.DataFrame` is instantiated from another `pd.DataFrame` with *mismatching* `columns`, all the contents become `NaN`.

You can test this with:

    from impyute.util import preprocess

    @preprocess
    def test(data):
        return data

    data = pd.DataFrame({"time": [1,2,3], "size": [3,2,1]})
    columns = data.columns
    data = pd.DataFrame(test(data), columns = columns)
    print(data)

       size  time
    0   NaN   NaN
    1   NaN   NaN
    2   NaN   NaN

When you instantiate a `pd.DataFrame` from an existing `pd.DataFrame`, the `columns` argument specifies which columns of the original dataframe you want to use.

It **does not** relabel the dataframe. That is not odd; it is simply how `pandas` handles reindexing:

By default values in the new index that do not have corresponding records in the dataframe are assigned NaN.

    # Make a new pseudo dataset
    data = pd.DataFrame({"time": [1,2,3], "size": [3,2,1]})
    data

       size  time
    0     3     1
    1     2     2
    2     1     3

    # Make a new dataset from the original `data`
    data = pd.DataFrame(data, columns = ["a", "b"])
    data

        a   b
    0 NaN NaN
    1 NaN NaN
    2 NaN NaN
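To make the select-versus-relabel distinction concrete, here is a small sketch using the same toy data. The variable names `subset`, `mixed`, and `renamed` are mine; `rename` is shown as one possible way to relabel, alongside assigning to `.columns` as in the solution.

```python
import pandas as pd

data = pd.DataFrame({"time": [1, 2, 3], "size": [3, 2, 1]})

# With a DataFrame as input, columns= looks labels up in the existing frame:
# an existing label is selected ...
subset = pd.DataFrame(data, columns=["time"])

# ... and a label that does not exist yet becomes a new column full of NaN.
mixed = pd.DataFrame(data, columns=["time", "a"])

# To relabel, map old names to new ones instead.
renamed = data.rename(columns={"time": "a", "size": "b"})

print(subset)
print(mixed)
print(renamed)
```

Here `mixed` keeps the `time` values but gets an all-NaN `a` column, which is the same effect as in the question, just limited to the one label that did not match.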

There may be a bug in the `impyute` library. You are using the `em` function, which is simply a way to fill in missing values with the expectation-maximization algorithm. You can try without that function:

`df = pd.DataFrame(data = Data ,columns = columns)`

You can raise this issue here after confirming. To confirm, first load the data as in the example above, then check whether it contains any null values using the `df.isnull()` method.
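A quick sketch of that check, assuming a small made-up frame in place of the real data (the column names are borrowed from the question, the values are not):

```python
import numpy as np
import pandas as pd

# Made-up sample with one missing cell, for illustration only.
df = pd.DataFrame({"Time": [31, 61, 116], "TumorSize": [5.0, np.nan, 2.5]})

missing_per_column = df.isnull().sum()          # number of NaN cells per column
has_missing = bool(df.isnull().values.any())    # True if any cell is NaN

print(missing_per_column)
print(has_missing)
```

If every count is zero before the conversion and the frame is all NaN afterwards, the NaNs come from the conversion step and not from the data.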

    Data = pd.DataFrame(data = np.array(imp.em(Data)), columns = columns)

Doing this solved the issue I was facing. I guess the `em` function doesn't return a NumPy array.
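A likely reason the `np.array(...)` wrapper helps: an array carries no column labels, so `columns` is applied by position and there is nothing to mismatch. The sketch below assumes `imputed` is a stand-in for what `imp.em` returns (a DataFrame with default integer labels, as described in the answer above); the values are invented.

```python
import numpy as np
import pandas as pd

columns = ["Time", "TumorSize"]

# Stand-in for the DataFrame returned by imp.em: column labels are 0 and 1.
imputed = pd.DataFrame([[31.0, 5.0], [61.0, 3.0]])

# DataFrame input: "Time" and "TumorSize" are looked up, and neither exists.
from_frame = pd.DataFrame(imputed, columns=columns)

# ndarray input: the labels are simply attached to columns 0 and 1.
from_array = pd.DataFrame(np.array(imputed), columns=columns)

print(from_frame)
print(from_array)
```

`imputed.to_numpy()` should work in place of `np.array(imputed)` on recent pandas versions.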

##### Comments

- My column names are correct, as I used Data.columns and stored the result in a list named 'columns'.
- @JACK Happy to help. If any answer solved your issue, please mark it as accepted.
- Yes, sure. Sorry, I didn't know about that, my bad.