## Generating a weighted average value with data from pandas and a dictionary?

I have a dataframe:

                 SALES
    Date
    2018-03-31  123090
    2018-04-30  116591
    2018-05-31  119581
    2018-06-30  117544
    2018-07-31  129574
    2018-08-31  118876
    2018-09-30  129467
    2018-10-31  126062
    2018-11-30  128552
    2018-12-31  104994
    2019-01-31  149188
    2019-02-28  118204

And a dictionary, *price*:

    {Oct: 11, Nov: 23, Dec: 34, Jan: 20, Feb: 30, Mar: 31, Apr: 22, May: 23, Jun: 34, Jul: 20, Aug: 30, Sep: 31}

I want to calculate a weighted average price by multiplying each sales figure in the DataFrame by the price for the corresponding month in the dictionary, and then dividing by the total sales. For example, taking the sales of *126062* for October from the DataFrame and multiplying it by 11 (*Oct*) from the dictionary.

I have tried adding a month column, re-ordering the dataframe, and then using an ordered dictionary, but I feel like I am using the proverbial sledgehammer for this problem.

                     SUM  MONTH
    Date
    2019-01-31  129188.1      1
    2019-02-28  118304.5      2
    2018-03-31  123090.6      3
    2018-04-30  116591.2      4
    2018-05-31  119581.5      5
    2018-06-30  117544.0      6
    2018-07-31  129574.9      7
    2018-08-31  118876.2      8
    2018-09-30  109467.5      9
    2018-10-31  126062.0     10
    2018-11-30  128552.9     11
    2018-12-31  104994.2     12

I have also tried to look at zip and iterating over both the dataframe and dictionary but I'm struggling to find the best way to map the two datasets together.

I am happy to convert the dictionary to another dataframe if that makes it easier?

Any help would be appreciated.

You can use `map` with the DatetimeIndex method `strftime`.

Here `df` (the DataFrame) and `dd` (the dictionary of weights) are defined as:

    import pandas as pd

    d = {'SALES': {pd.Timestamp('2018-03-31 00:00:00'): 123090,
                   pd.Timestamp('2018-04-30 00:00:00'): 116591,
                   pd.Timestamp('2018-05-31 00:00:00'): 119581,
                   pd.Timestamp('2018-06-30 00:00:00'): 117544,
                   pd.Timestamp('2018-07-31 00:00:00'): 129574,
                   pd.Timestamp('2018-08-31 00:00:00'): 118876,
                   pd.Timestamp('2018-09-30 00:00:00'): 129467,
                   pd.Timestamp('2018-10-31 00:00:00'): 126062,
                   pd.Timestamp('2018-11-30 00:00:00'): 128552,
                   pd.Timestamp('2018-12-31 00:00:00'): 104994,
                   pd.Timestamp('2019-01-31 00:00:00'): 149188,
                   pd.Timestamp('2019-02-28 00:00:00'): 118204}}

    df = pd.DataFrame(d)

    dd = {'Oct': 11, 'Nov': 23, 'Dec': 34, 'Jan': 20, 'Feb': 30, 'Mar': 31,
          'Apr': 22, 'May': 23, 'Jun': 34, 'Jul': 20, 'Aug': 30, 'Sep': 31}

Then use:

    df['Adj Sales'] = df.index.strftime('%b').map(dd) * df['SALES']

Output:

                 SALES  Adj Sales
    2018-03-31  123090    3815790
    2018-04-30  116591    2565002
    2018-05-31  119581    2750363
    2018-06-30  117544    3996496
    2018-07-31  129574    2591480
    2018-08-31  118876    3566280
    2018-09-30  129467    4013477
    2018-10-31  126062    1386682
    2018-11-30  128552    2956696
    2018-12-31  104994    3569796
    2019-01-31  149188    2983760
    2019-02-28  118204    3546120
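The answer above stops at the adjusted sales column, while the question asks for a single weighted average price. A minimal sketch of the remaining step, assuming the same data as in the question, is to divide the sum of the adjusted sales by the total sales. The names `prices` and `weighted_avg` are illustrative, and `'%b'` is assumed to yield English month abbreviations (this depends on the locale).

```python
import pandas as pd

# Sales indexed by month-end dates (values copied from the question).
index = pd.to_datetime([
    "2018-03-31", "2018-04-30", "2018-05-31", "2018-06-30",
    "2018-07-31", "2018-08-31", "2018-09-30", "2018-10-31",
    "2018-11-30", "2018-12-31", "2019-01-31", "2019-02-28",
])
sales = [123090, 116591, 119581, 117544, 129574, 118876,
         129467, 126062, 128552, 104994, 149188, 118204]
df = pd.DataFrame({"SALES": sales}, index=index)

dd = {"Oct": 11, "Nov": 23, "Dec": 34, "Jan": 20, "Feb": 30, "Mar": 31,
      "Apr": 22, "May": 23, "Jun": 34, "Jul": 20, "Aug": 30, "Sep": 31}

# Look up the price for each row from the month abbreviation of its date.
prices = df.index.strftime("%b").map(dd)
df["Adj Sales"] = df["SALES"] * prices

# Weighted average price = sum(price * sales) / sum(sales).
weighted_avg = df["Adj Sales"].sum() / df["SALES"].sum()
print(round(weighted_avg, 4))  # 25.4717
```

This agrees with the value printed by the step-by-step answer further down the thread.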


Try this to get the weights column:

    my_dict = {'Oct': 11, 'Nov': 23, 'Dec': 34, 'Jan': 20, 'Feb': 30, 'Mar': 31,
               'Apr': 22, 'May': 23, 'Jun': 34, 'Jul': 20, 'Aug': 30, 'Sep': 31}
    weights = pd.Series(my_dict)
    df.Date = pd.to_datetime(df.Date)
    df.set_index(df.Date.dt.strftime("%b"), inplace=True)
    df['Weights'] = weights
    df.reset_index(drop=True, inplace=True)

Then `df` is:

             Date   SALES  Weights
    0  2018-03-31  123090       31
    1  2018-04-30  116591       22
    2  2018-05-31  119581       23
    3  2018-06-30  117544       34
    4  2018-07-31  129574       20
    5  2018-08-31  118876       30
    6  2018-09-30  129467       31
    7  2018-10-31  126062       11
    8  2018-11-30  128552       23
    9  2018-12-31  104994       34
    10 2019-01-31  149188       20
    11 2019-02-28  118204       30
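Once both columns sit side by side, one way to get the final figure is `numpy.average`, which takes the values to average and a separate `weights` argument. The sketch below uses only the first three rows of the table for brevity. Note that, for the quantity the question asks for, the `Weights` column (the prices) is what gets averaged, and `SALES` plays the role of the weights.

```python
import numpy as np
import pandas as pd

# First three rows of the table above; "Weights" holds the per-month price.
df = pd.DataFrame({
    "SALES": [123090, 116591, 119581],
    "Weights": [31, 22, 23],
})

# Average the prices, weighting each one by that month's sales.
weighted_avg = np.average(df["Weights"], weights=df["SALES"])

# The explicit formula sum(price * sales) / sum(sales) gives the same number.
manual = (df["Weights"] * df["SALES"]).sum() / df["SALES"].sum()

print(weighted_avg, manual)
```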


I would do it like this. First create the `'weight'` column:

    df['weight'] = [month[ind_month] for ind_month in df.index.month_name().str[:3].values]

    Out[48]:
                Sales  weight
    2018-03-31    100      31
    2018-04-30    101      22
    2018-05-31    102      23
    2018-06-30    103      34
    2018-07-31    104      20
    2018-08-31    105      30
    2018-09-30    106      31
    2018-10-31    107      11
    2018-11-30    108      23
    2018-12-31    109      34
    2019-01-31    110      20
    2019-02-28    111      30
    2019-03-31    112      31
    2019-04-30    113      22

where:

    month = {'Oct': 11, 'Nov': 23, 'Dec': 34, 'Jan': 20, 'Feb': 30, 'Mar': 31,
             'Apr': 22, 'May': 23, 'Jun': 34, 'Jul': 20, 'Aug': 30, 'Sep': 31}

and then multiply the columns:

    df['weighted_Sales'] = df.weight * df.Sales

which produces:

    Out[50]:
                Sales  weight  weighted_Sales
    2018-03-31    100      31            3100
    2018-04-30    101      22            2222
    2018-05-31    102      23            2346
    2018-06-30    103      34            3502
    2018-07-31    104      20            2080
    2018-08-31    105      30            3150
    2018-09-30    106      31            3286
    2018-10-31    107      11            1177
    2018-11-30    108      23            2484
    2018-12-31    109      34            3706
    2019-01-31    110      20            2200
    2019-02-28    111      30            3330
    2019-03-31    112      31            3472
    2019-04-30    113      22            2486


Step 1. Create a price DataFrame from the dictionary

    import pandas as pd

    dict_p = {"Oct": 11, "Nov": 23, "Dec": 34, "Jan": 20, "Feb": 30, "Mar": 31,
              "Apr": 22, "May": 23, "Jun": 34, "Jul": 20, "Aug": 30, "Sep": 31}
    dict_m = {"Oct": 10, "Nov": 11, "Dec": 12, "Jan": 1, "Feb": 2, "Mar": 3,
              "Apr": 4, "May": 5, "Jun": 6, "Jul": 7, "Aug": 8, "Sep": 9}

    price = pd.DataFrame.from_dict(dict_p, orient="index", columns=["price"])
    month = pd.DataFrame.from_dict(dict_m, orient="index", columns=["month"])
    df_price = pd.concat([price, month], axis=1)
    print(df_price)

Produces:

         price  month
    Oct     11     10
    Nov     23     11
    Dec     34     12
    Jan     20      1
    Feb     30      2
    Mar     31      3
    Apr     22      4
    May     23      5
    Jun     34      6
    Jul     20      7
    Aug     30      8
    Sep     31      9

Step 2. Merge price and sales data

    df_sales = pd.DataFrame(d)
    df_sales["month"] = df_sales.index.month
    df = df_sales.merge(df_price)
    print(df)

Produces:

         SALES  month  price
    0   123090      3     31
    1   116591      4     22
    2   119581      5     23
    3   117544      6     34
    4   129574      7     20
    5   118876      8     30
    6   129467      9     31
    7   126062     10     11
    8   128552     11     23
    9   104994     12     34
    10  149188      1     20
    11  118204      2     30

Step 3. Calculate weights and compute weighted average price

    df["weight"] = df.SALES / df.SALES.sum()
    price_weighted_ave = sum(df.price * df.weight)
    print(price_weighted_ave)

Produces:

    25.471658332900283


##### Comments

- Thanks Scott, that works great. Out of interest, if I wanted to do the same for future prices, so that my price dictionary contained the month and future year (`{Oct-19: 12}` etc.), how could I use `map` but extract the month from the dictionary? I don't think I could strip it or use `strftime`.
- You can still use `strftime`. `strftime('%b-%y')` should work.
- `dd = {'Oct-19': 11, 'Nov-19': 23, ... ,'Sep-20': 31}`. If I try to use `df['Adj Sales'] = df.index.strftime('%b').map(dd.strftime('%b') * df['SALES']` it throws `'dict' object has no attribute 'strftime'`?
- `df.index.strftime('%b-%y').map(dd)` ... you don't need to do anything to the dictionary. Just modify the index side.
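The suggestion in these comments can be sketched as follows. The sales figures and prices here are made up purely for illustration, and one key is left out of the dictionary on purpose to show that `map` yields `NaN` for dates with no matching entry rather than raising an error. As before, `'%b-%y'` is assumed to produce English abbreviations such as `Oct-19`.

```python
import pandas as pd

# Hypothetical future sales, for illustration only.
df = pd.DataFrame(
    {"SALES": [100, 200, 300]},
    index=pd.to_datetime(["2019-10-31", "2019-11-30", "2019-12-31"]),
)

# Prices keyed by month and two-digit year; "Dec-19" is deliberately missing.
dd = {"Oct-19": 11, "Nov-19": 23}

# Format the index to match the dictionary keys, then look each key up.
df["price"] = df.index.strftime("%b-%y").map(dd)
df["Adj Sales"] = df["SALES"] * df["price"]

# Rows whose key was not found in the dictionary end up with NaN.
missing = df[df["price"].isna()]
print(df)
print(missing.index.strftime("%b-%y").tolist())
```

Checking for `NaN` before summing is worthwhile, since `sum()` skips missing values by default and the weighted average would silently omit those months.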