Pandas - get the DataFrame column names as a single string

I have the following DataFrame:

import pandas as pd

dtf = pd.DataFrame({'col1': ['howdy_dude'],
                    'col2': ["HI"],
                    'col3': ["3"]})

I need to get just the column headers as a single string, something like: 'col1 + col2 + col3'

This sample has 3 columns, but the number of columns varies: sometimes there are more, sometimes fewer.

Thank you.

As I understand your question, you want to concatenate the strings from all columns, regardless of how many columns there are.

Here is one way to do it:

import pandas as pd

dtf = pd.DataFrame({'col1': ['howdy_dude'],
                    'col2': ["HI"],
                    'col3': ["3"]})

# Join the string values of each row, separated by a space
dtf['new'] = dtf.apply(' '.join, axis=1)
dtf

The new column holds, for each row, the concatenation of the strings from all columns (use '' instead of ' ' in the join if you do not want a space between them).
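Note that ' '.join only accepts strings, so the row-wise join above relies on every column holding strings. As a minimal sketch, assuming a hypothetical variant of the frame in which col3 holds an integer, casting with astype(str) first keeps the same approach working:

```python
import pandas as pd

# Hypothetical variant of the sample frame: col3 is an int, not a str,
# so ' '.join on the raw row would raise a TypeError.
dtf = pd.DataFrame({'col1': ['howdy_dude'],
                    'col2': ['HI'],
                    'col3': [3]})

# Cast every value to str, then join each row's values with a space.
dtf['new'] = dtf.astype(str).apply(' '.join, axis=1)

print(dtf['new'].iloc[0])
```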

If instead you want the column names joined into a single string, apply join to dtf.columns:

import pandas as pd

dtf = pd.DataFrame({'col1': ['howdy_dude'],
                    'col2': ["HI"],
                    'col3': ["3"]})

# Join the column names; change the separator (e.g. " + ") as needed
result = " ".join(dtf.columns)
print(type(result))
result

Hope this helps


The exact answer, as mentioned in Harry_pb's answer, is:

" + ".join(dtf.columns)

Note that wrapping the columns in list() is unnecessary: str.join accepts any iterable of strings, and dtf.columns is one.


However, it won't work if your column names are integers. You need to first convert them into strings, for example:

dtf.columns = [1,2,3]
" + ".join( dtf.columns.astype(str) )

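Also, this method won't work if the columns are a MultiIndex, because its labels are tuples rather than strings. A more general option, which is also quicker to write, is Index.format(), which returns the labels already rendered as strings (note that it is deprecated as of pandas 2.2):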

" + ".join( dtf.columns.format() )

If you need more control over the MultiIndex format, I would use a list comprehension. For a fancy example:

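# Named idx rather than id to avoid shadowing the built-in id()
idx = pd.MultiIndex.from_tuples((('A', 'X', 0), ('B', 'Y', 0), ('C', 'X', 0)))

# Print the first level as-is and each deeper level as an indented branch
'\n'.join([
    str(level) if i == 0 else '{}|_{}'.format('  '*(i-1), level)
    for elmt in idx
    for i, level in enumerate(elmt)
])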

Out:

A
|_X
  |_0
B
|_Y
  |_0
C
|_X
  |_0
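The integer-label and MultiIndex cases discussed in this answer can also be covered by one small function that does not rely on Index.format(). This is only a sketch: columns_as_string and its sep and level_sep parameters are hypothetical names introduced here for illustration, not part of pandas.

```python
import pandas as pd

def columns_as_string(df, sep=" + ", level_sep="_"):
    """Join column labels into one string (hypothetical helper)."""
    labels = []
    for label in df.columns:
        if isinstance(label, tuple):
            # MultiIndex labels are tuples: join their levels first.
            labels.append(level_sep.join(map(str, label)))
        else:
            # Plain labels may be ints or other objects: cast to str.
            labels.append(str(label))
    return sep.join(labels)

# Integer column names
int_df = pd.DataFrame([[1, 2, 3]], columns=[1, 2, 3])
print(columns_as_string(int_df))

# MultiIndex column names
multi_cols = pd.MultiIndex.from_tuples([('A', 'X', 0), ('B', 'Y', 0), ('C', 'X', 0)])
multi_df = pd.DataFrame([[1, 2, 3]], columns=multi_cols)
print(columns_as_string(multi_df))
```

Casting each label explicitly keeps the behaviour predictable across pandas versions, at the cost of a few more lines than a one-line join.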


If you specifically need the 'col1 + col2 + col3' format, then:

" + ".join(list(dtf.columns))


To get all the column names, use:

dtf.columns.tolist()

You then have a list of them, which you can concatenate however you like.
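To make the difference between the list and the final string concrete, here is a short sketch using the sample frame from the question. The str.cat variant is the one suggested in the comments; the variable names are chosen here for illustration.

```python
import pandas as pd

dtf = pd.DataFrame({'col1': ['howdy_dude'],
                    'col2': ['HI'],
                    'col3': ['3']})

# tolist() gives a Python list, which prints with brackets
names = dtf.columns.tolist()

# Joining the list produces the single string the question asks for
as_string = ' + '.join(names)

# Alternative from the comments: concatenate directly on the Index
as_csv = dtf.columns.str.cat(sep=',')

print(names)
print(as_string)
print(as_csv)
```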


Comments
  • ','.join(list(dtf))
  • I'd start with dtf.columns
  • dtf.columns.str.cat(sep=",") ??
  • list(dtf) brings this result :['col1', 'col2', 'col3']
  • Thanks Anky_91. Works fine
  • You're giving the sum of the content of the columns. What I need is a string with the columns names 'col1 step col2 step col3' for instance
  • @CarlosCarvalho I have added, please let me know if this is what you are looking
  • I understand, but dtf.columns.tolist() gives me the list of columns inside brackets.
  • Thanks anky. This works perfectly fine : dtf.columns.str.cat(sep=",")