## How to efficiently calculate prefix sum of frequencies of characters in a string?

Say I have a string

    s = 'AAABBBCAB'

How can I efficiently calculate the prefix sum of frequencies of each character in the string, i.e.:

    psum = [{'A': 1}, {'A': 2}, {'A': 3},
            {'A': 3, 'B': 1}, {'A': 3, 'B': 2}, {'A': 3, 'B': 3},
            {'A': 3, 'B': 3, 'C': 1}, {'A': 4, 'B': 3, 'C': 1},
            {'A': 4, 'B': 4, 'C': 1}]

You can do it in one line using `itertools.accumulate` and `collections.Counter`:

    from collections import Counter
    from itertools import accumulate

    s = 'AAABBBCAB'
    psum = list(accumulate(map(Counter, s)))

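This gives you a list of `Counter` objects. Now, to get the frequencies for any substring of `s`, you can simply subtract two counters. The cost depends on the number of distinct characters rather than on the length of the substring, so it is effectively O(1) for a fixed alphabet, e.g.:

    >>> psum[6] - psum[1]  # get frequencies for s[2:7]
    Counter({'B': 3, 'A': 1, 'C': 1})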
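The subtraction above needs a little care at the edges (a substring that starts at index 0, or an empty one). A minimal sketch of how this could be wrapped up follows; `prefix_counts` and `substring_counts` are hypothetical helper names, not part of the original answer, and the sketch assumes `0 <= start <= stop <= len(s)`.

```python
from collections import Counter
from itertools import accumulate


def prefix_counts(s):
    """Return a list whose entry i holds the character counts of s[:i + 1]."""
    return list(accumulate(map(Counter, s)))


def substring_counts(psum, start, stop):
    """Character counts of s[start:stop], given psum = prefix_counts(s).

    Hypothetical helper; assumes 0 <= start <= stop <= len(psum).
    """
    if start >= stop:
        # Empty substring: nothing to count.
        return Counter()
    upper = psum[stop - 1]          # counts of s[:stop]
    if start == 0:
        return Counter(upper)       # copy, so callers cannot mutate psum
    return upper - psum[start - 1]  # remove the counts of s[:start]


s = 'AAABBBCAB'
psum = prefix_counts(s)
print(substring_counts(psum, 2, 7))
```

Note that `Counter` subtraction drops entries whose count falls to zero, so characters absent from the substring do not appear in the result.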


This is an option:

    from collections import Counter

    c = Counter()
    s = 'AAABBBCAB'

    psum = []
    for char in s:
        c.update(char)
        psum.append(dict(c))

    # [{'A': 1}, {'A': 2}, {'A': 3}, {'A': 3, 'B': 1}, {'A': 3, 'B': 2},
    #  {'A': 3, 'B': 3}, {'A': 3, 'B': 3, 'C': 1}, {'A': 4, 'B': 3, 'C': 1},
    #  {'A': 4, 'B': 4, 'C': 1}]

I use `collections.Counter` in order to keep a 'running sum' and add (a copy of the result) to the list `psum`. This way I iterate only once over the string `s`.

If you prefer to have `collections.Counter` objects in your result, you could change the last line to

    psum.append(c.copy())

in order to get

    [Counter({'A': 1}), Counter({'A': 2}), ...
     Counter({'A': 4, 'B': 4, 'C': 1})]

The same result could also be achieved with this (using `accumulate` was first proposed in Eugene Yarmash's answer; I just avoid `map` in favour of a generator expression):

    from itertools import accumulate
    from collections import Counter

    s = "AAABBBCAB"
    psum = list(accumulate(Counter(char) for char in s))

Just for completeness (as there is no 'pure `dict`' answer here yet): if you do not want to use `Counter` or `defaultdict`, you could use this as well:

    c = {}
    s = 'AAABBBCAB'

    psum = []
    for char in s:
        c[char] = c.get(char, 0) + 1
        psum.append(c.copy())

although `defaultdict` is usually more performant than `dict.get(key, default)`.
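The performance remark above is easy to check on your own machine. The following is only a rough sketch using `timeit`; the function names, the sample string and the repeat count are arbitrary choices made for illustration, and the measured gap (if any) will vary with the Python version and the input.

```python
import timeit
from collections import defaultdict


def count_with_get(s):
    """Count characters with a plain dict and dict.get."""
    c = {}
    for ch in s:
        c[ch] = c.get(ch, 0) + 1
    return c


def count_with_defaultdict(s):
    """Count characters with defaultdict(int)."""
    c = defaultdict(int)
    for ch in s:
        c[ch] += 1
    return dict(c)


# Arbitrary sample input: the string from the question, repeated.
sample = 'AAABBBCAB' * 1000

t_get = timeit.timeit(lambda: count_with_get(sample), number=50)
t_dd = timeit.timeit(lambda: count_with_defaultdict(sample), number=50)
print(f"dict.get: {t_get:.4f}s  defaultdict: {t_dd:.4f}s")
```

Both functions only measure the counting step; the per-character copy into `psum` is left out so that it does not dominate the timing.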


You actually don't even need a counter for this, just a `defaultdict` would suffice!

    from collections import defaultdict

    c = defaultdict(int)
    s = 'AAABBBCAB'

    psum = []

    # Iterate through the characters
    for char in s:
        # Update the count for each character
        c[char] += 1
        # Add a copy of the updated dictionary to the output list
        psum.append(dict(c))

    print(psum)

The output looks like

    [{'A': 1}, {'A': 2}, {'A': 3}, {'A': 3, 'B': 1}, {'A': 3, 'B': 2},
     {'A': 3, 'B': 3}, {'A': 3, 'B': 3, 'C': 1}, {'A': 4, 'B': 3, 'C': 1},
     {'A': 4, 'B': 4, 'C': 1}]


Simplest would be to use the `Counter` object from `collections`, counting each prefix of the string separately:

    from collections import Counter

    s = 'AAABBBCAB'
    [dict(Counter(s[:i])) for i in range(1, len(s) + 1)]

Yields:

    [{'A': 1}, {'A': 2}, {'A': 3}, {'A': 3, 'B': 1}, {'A': 3, 'B': 2},
     {'A': 3, 'B': 3}, {'A': 3, 'B': 3, 'C': 1}, {'A': 4, 'B': 3, 'C': 1},
     {'A': 4, 'B': 4, 'C': 1}]
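Because every prefix is re-counted from scratch, this approach does work proportional to the square of the string length, whereas a running counter touches each character once (plus one copy per position). A small sketch that checks the two give the same list, assuming the slice bound runs through `len(s)` so the final prefix is included; the variable names are illustrative only.

```python
from collections import Counter

s = 'AAABBBCAB'

# Re-count every prefix s[:i]; simple, but quadratic in len(s).
by_slicing = [dict(Counter(s[:i])) for i in range(1, len(s) + 1)]

# Keep one running counter and snapshot it after each character.
running = Counter()
by_running = []
for ch in s:
    running[ch] += 1
    by_running.append(dict(running))

print(by_slicing == by_running)
```

For a string as short as the one in the question the difference is negligible; it only starts to matter for long inputs.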


In Python 3.8 and later you can use a list comprehension with an assignment expression (aka "the walrus operator"):

    >>> from collections import Counter
    >>> s = 'AAABBBCAB'
    >>> c = Counter()
    >>> [c := c + Counter(x) for x in s]
    [Counter({'A': 1}), Counter({'A': 2}), Counter({'A': 3}), Counter({'A': 3, 'B': 1}), Counter({'A': 3, 'B': 2}), Counter({'A': 3, 'B': 3}), Counter({'A': 3, 'B': 3, 'C': 1}), Counter({'A': 4, 'B': 3, 'C': 1}), Counter({'A': 4, 'B': 4, 'C': 1})]


##### Comments

- Finally you want one dict or you want a list of dicts for each char while reading?
- @Vanjith I want a running counter of character frequencies.
- We don't even need `Counter` here, a simple `defaultdict` will do @hiro-protagonist, check my answer below!
- What makes you say `defaultdict` is 'simpler' than `Counter`? Simpler in what way?
- @DeveshKumarSingh they are both subclasses of `dict`; the data structure of a counter is not more complicated than that of a `dict`. Or what am I missing?
- @DeveshKumarSingh, these considerations are misplaced. I've pointed out the time performance difference, but the OP should make their own decision.
- @DeveshKumarSingh: Your answer came later than this one, it is the exact same structure with a slightly different type, it has the same complexity but with a more verbose output. You shouldn't advertise it here.
- Just to note, `Counter` is a subclass of `dict`, so there's little reason to replace the `Counter` with a plain `dict`.
- I agree, but it's more in line with what the user specified as output. I would keep the `Counter` objects myself as they have useful functions in addition to being a dict.
- This is an elegant 1-liner so +1, but is quadratic rather than linear. I suspect that the similar solution by hiro protagonist is more efficient.