REGEXEXTRACT - matching one letter represented by a special character within a string

I'm having trouble with the following:

I have some text in cells (in Google Sheets), and I want to extract what comes after "Question 1:". More precisely, I want the missing letter that is represented by the special character ●.

The cell might contain one of the following strings:

Question 1: ●BCD

Question 1: A●CD

Question 1: AB●D

Question 1: ABC●

The extracted letter can only be a capital A, B, C, or D. So in the first example I should extract the letter A, in the second the letter B, in the third the letter C, and in the last the letter D.

After searching for a while I was able to write the following:

=IFERROR(trim(upper(regexextract(trim(clean(substitute(B2,char(160)," "))),"Question 1:(\s?[a-dA-D])"))),"??")

But this always extracts the letter A, and if the special character is at the beginning (example 1) I get an error.

A similar scenario is to extract the letter T or F in the following:

Question 2 : ●F (here we should extract T)

Question 2 : T● (here we should extract F)

Thank you for shedding some light on these issues.


try:

=ARRAYFORMULA(TRANSPOSE(IFNA(VLOOKUP(TRIM(SUBSTITUTE(QUERY(TRANSPOSE(
 REGEXREPLACE(SPLIT(B2, CHAR(10)), "Question \d+: ", "♦")), 
 "where Col1 contains '♦'"), "♦", )), 
 {"BCD",  "A";
  "A CD", "B";
  "AB D", "C";
  "ABC",  "D";
  "F",    "T";
  "T",    "F"}, 2, 0))))



You don't need regex to achieve this. See the following example: https://docs.google.com/spreadsheets/d/1iMt0pUeyenIAzdHAw2f5jhT41B-Tz86Ab3YfZgXhKk0/edit#gid=0

Question 1 formula:

=MID("ABCD", FIND("●", A2) - FIND(":", A2) - 1, 1)

Question 2 formula:

=MID("TF", FIND("●", A6) - FIND(":", A6) - 1, 1)

The linked sheet breaks down, step by step, how to arrive at this:

  1. Find the position of the character you are looking for in the original text: FIND("●", A2)
  2. Convert that to a position relative to the colon: <position> - FIND(":", A2) - 1
  3. Return the character at that position in the list of possible answers: MID("ABCD", <relative position>, 1)
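The three steps above can be sketched outside of Sheets as well. The following Python function is only an illustration of the same arithmetic, not part of the linked sheet: the name missing_letter and its parameters are made up for this example, and it assumes exactly one space between the colon and the answer slots, as in the sample strings.

```python
def missing_letter(cell: str, alphabet: str = "ABCD", marker: str = "●") -> str:
    """Mirror of =MID(alphabet, FIND(marker, cell) - FIND(":", cell) - 1, 1).

    Hypothetical helper for illustration; assumes one space after the colon.
    """
    # str.find is 0-based, FIND is 1-based, so add 1 to match the formula.
    marker_pos = cell.find(marker) + 1
    colon_pos = cell.find(":") + 1
    if marker_pos == 0 or colon_pos == 0:
        raise ValueError("cell must contain both ':' and the marker")

    # Step 2: position of the marker relative to the start of the answer slots.
    relative = marker_pos - colon_pos - 1
    if not 1 <= relative <= len(alphabet):
        raise ValueError("marker is outside the expected answer slots")

    # Step 3: pick the letter at that 1-based position, like MID(..., relative, 1).
    return alphabet[relative - 1]


if __name__ == "__main__":
    for text in ["Question 1: ●BCD", "Question 1: A●CD",
                 "Question 1: AB●D", "Question 1: ABC●"]:
        print(text, "->", missing_letter(text))
    for text in ["Question 2 : ●F", "Question 2 : T●"]:
        print(text, "->", missing_letter(text, alphabet="TF"))
```

Because the position is measured from the colon, the same function handles both "Question 1:" and "Question 2 :"; only the list of possible answers changes.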



try:

=ARRAYFORMULA(IFERROR(CHAR(FINDB("●", A1:A)-FINDB(":", A1:A)+63)))

or:

=ARRAYFORMULA(IFERROR(CHAR(FINDB("●", A1:A)-LEN(REGEXEXTRACT(A1:A, "(.+: )"))+64)))


for the T/F questions:

=ARRAYFORMULA(SUBSTITUTE(IFERROR(CHAR(FINDB("●", A1:A)-FINDB(":", A1:A)+82)), "U", "F"))

or:

=ARRAYFORMULA(SUBSTITUTE(IFERROR(CHAR(FINDB("●", A1:A)-
 LEN(REGEXEXTRACT(A1:A, "(.+: )"))+83)), "U", "F"))
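The constants in these formulas may look arbitrary, so here is a small Python sketch of the arithmetic behind them. It is an illustration only: the function names are hypothetical, Python's str.find stands in for FINDB (the two agree for the sample strings because every character before ● is a single byte), and one space after the colon is assumed.

```python
def letter_from_code(cell: str, offset: int, marker: str = "●") -> str:
    """Mirror of IFERROR(CHAR(FINDB(marker, cell) - FINDB(":", cell) + offset)).

    Hypothetical helper; returns "" where the formula's IFERROR would
    return a blank.
    """
    if marker not in cell or ":" not in cell:
        return ""
    # The distance from the colon to the marker is 2 for the first slot
    # (colon, space, marker), 3 for the second slot, and so on.
    distance = cell.find(marker) - cell.find(":")
    return chr(distance + offset)


def abcd(cell: str) -> str:
    # 63 + 2 = 65 = "A", so the first slot maps to A, the second to B, etc.
    return letter_from_code(cell, 63)


def true_false(cell: str) -> str:
    # 82 + 2 = 84 = "T"; the second slot gives 85 = "U", which is swapped for "F".
    return letter_from_code(cell, 82).replace("U", "F")


if __name__ == "__main__":
    print(abcd("Question 1: AB●D"))
    print(true_false("Question 2 : T●"))
```

The design relies on A, B, C, D being consecutive character codes. T and F are not consecutive, which is why the second slot needs the extra substitution of "U" with "F".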


UPDATE:

I need to transpose the results horizontally, as I'm aiming to extract the letter for each question in a different cell

=ARRAYFORMULA(TRANSPOSE(ARRAY_CONSTRAIN(IFERROR(IF(REGEXMATCH(A2:A, "T●|●F"), 
 SUBSTITUTE(CHAR(FIND("●", A2:A)-FIND(":", A2:A)+82), "U", "F"), 
            CHAR(FIND("●", A2:A)-FIND(":", A2:A)+63))), COUNTA(A2:A), 1)))
