pandas-percentage count of categorical variable

I have a pandas df like

df_test = pd.DataFrame({'A': 'a a a b b'.split(), 'B': ['Y','N','Y','Y','N']})

and my desired output is

df_test2 = pd.DataFrame({'A': 'a b'.split(), 'B': [2/3, 1/2]})

How would you do a groupby().apply by column A to get the percentage of 'Y' in column B?

I have been searching for groupby.apply() examples, but nothing has worked so far. Thank you!

One approach could be

In [10]: df_test.groupby('A').B.apply(lambda x: (x == 'Y').mean())
Out[10]:
A
a    0.666667
b    0.500000

or, if you don't mind changing df_test in the process,

In [15]: df_test['C'] = df_test.B == 'Y'
In [17]: df_test.groupby('A').C.mean()
Out[17]:
A
a    0.666667
b    0.500000
Name: C, dtype: float64
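
If the share of every value in B is wanted per group, and not only the share of 'Y', a minimal sketch using value_counts(normalize=True) on the grouped column might look like the following. The names shares, table and pct_y are made up for illustration; the data is the df_test from the question.

```python
import pandas as pd

df_test = pd.DataFrame({'A': 'a a a b b'.split(), 'B': ['Y', 'N', 'Y', 'Y', 'N']})

# Share of each B value within each group of A (a Series with an (A, B) MultiIndex).
shares = df_test.groupby('A')['B'].value_counts(normalize=True)

# Move B's values into columns; a value that never occurs in a group becomes 0.
table = shares.unstack(fill_value=0)

# Keep only the 'Y' share, which matches the layout asked for in the question.
pct_y = table['Y'].reset_index(name='B')
print(pct_y)
```

This does not modify df_test, and the intermediate table also holds the 'N' shares if they are needed later.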

The pandas Series.value_counts() function returns a Series containing counts of unique values. The resulting object is in descending order, so the first element is the most frequently occurring one. Just as means and variances are descriptive measures for metric variables, frequencies are the descriptive measure for qualitative ones.

Use GroupBy.mean with a boolean mask, where True values are treated as 1. No new column is necessary, because the Series df_test["A"] is passed directly to groupby.

Note: eq is used instead of == for cleaner syntax.

df = df_test["B"].eq('Y').groupby(df_test["A"]).mean().reset_index()
print(df)
   A         B
0  a  0.666667
1  b  0.500000

How do I count values in a column in pandas? One useful layout is a table with a two-level index: the first level is the variable name (e.g. 'grade') and the second level is the levels within the variable (e.g. 'A', 'B', 'C'). One column contains 'n', a count of the number of times the level appears; a second column contains 'proportion', the proportion represented by this level.

My personal favorite way:

df.column_name.value_counts() / len(df)

This gives a Series with the column's values as the index and the proportions as the values.
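
One thing worth checking with this approach is how missing values are handled. The sketch below compares dividing by len() with the normalize=True option of value_counts(). The Series s and its contents are hypothetical, chosen only to include one missing value.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value, for illustration only.
s = pd.Series(['Y', 'N', 'Y', np.nan], name='B')

# Dividing by len() keeps the missing row in the denominator (4 rows here).
by_len = s.value_counts() / len(s)

# normalize=True divides by the number of non-missing rows (3 rows here).
normalized = s.value_counts(normalize=True)

# dropna=False counts NaN as its own level, so the shares sum to 1.
with_nan = s.value_counts(normalize=True, dropna=False)

print(by_len, normalized, with_nan, sep='\n\n')
```

When the column has no missing values, all three give the same proportions.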

Often, while working with a pandas DataFrame, you have a column of categorical values (strings or characters) and want the frequency count of each unique element in that column. The pandas value_counts() method gives these counts directly, and from them the percentage of each value (as a fraction or on a 0 to 100 scale) can be obtained at the same time.
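
As a rough sketch of getting the count and the percentage in one table, the following builds both from a single value_counts() call. The DataFrame, the column name grade and the output column names n, proportion and percent are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({'grade': ['A', 'B', 'A', 'C', 'A', 'B']})

# Count each level once, then derive the shares from the same counts.
counts = df['grade'].value_counts()
summary = pd.DataFrame({
    'n': counts,
    'proportion': counts / counts.sum(),
})

# Same information on a 0 to 100 scale.
summary['percent'] = summary['proportion'] * 100
print(summary)
```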

Series.value_counts() returns a Series containing counts of unique values. Its bins option can be useful for going from a continuous variable to a categorical variable. Generally, a categorical column in pandas behaves much like a column of plain strings or numbers. With ordered categorical dtypes, however, a few small differences can affect a typical workflow: sorting, and calculating the minimum and maximum values in a column.
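
Both points can be tried on small made-up data. The sketch below assumes a numeric Series called scores and an ordered categorical called sizes; neither comes from the original question.

```python
import pandas as pd

# Hypothetical numeric scores, for illustration only.
scores = pd.Series([1, 2, 2, 8, 9, 10])

# bins=3 groups the values into three equal-width intervals before counting.
binned = scores.value_counts(bins=3, sort=False)
print(binned)

# An ordered categorical sorts and compares by category order, not alphabetically.
sizes = pd.Series(pd.Categorical(
    ['medium', 'small', 'large', 'small'],
    categories=['small', 'medium', 'large'],
    ordered=True,
))
print(sizes.min(), sizes.max())
print(sizes.sort_values().tolist())
```

With plain strings, min() and sort_values() would use alphabetical order instead, so 'large' would come before 'small'.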

The percentage of a column in pandas can also be computed with the sum() function, by dividing each value by the column total.
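
For a numeric column, a percentage of the column total based on sum() could be sketched as follows. The DataFrame and the column names food, portion and percent are hypothetical.

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({'food': ['rice', 'beans', 'corn'], 'portion': [50, 30, 20]})

# Each row's share of the column total, scaled to 0-100.
df['percent'] = df['portion'] / df['portion'].sum() * 100
print(df)
```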