I'm trying to find a word from its known letters and their positions (similar to a crossword).

Example :

B E _ K

possible words: 

I have all possible words (of the same length) in a list. The problem is that I can't find a proper way to compare user_input against my list.

Comparing each index of each word in the dictionary to the letters in user_input seems to be a solution; however, it is not efficient at all.

Is there any other way to approach this problem?

Thank you in advance.

EDIT: I should add that regex cannot be used as a solution because I'm working with Persian (Farsi) words, which use the Persian alphabet (similar to Arabic).

User input is taken letter by letter and stored as a list. There might be more than one missing letter, and the word length can be anything from 1 to 10.

A quick hack

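# Save pattern as (char, position) where position starts at 0
pattern = [("B", 0), ("E", 1), ("K", 3)]

dictionary = ["BEAK", "BECK", "BELK", "BERK"]

def match(word, pattern):
    # A word too short to contain every known position cannot match
    if any(pos >= len(word) for (c, pos) in pattern):
        return False

    # Every known letter must sit at its position in the word
    return all(word[pos] == c for (c, pos) in pattern)

def list_matches(pattern, dictionary):
    # Print each word that agrees with all known letters
    for word in dictionary:
        if match(word, pattern):
            print(word)

list_matches(pattern, dictionary)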
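
The question says the input arrives letter by letter as a list, possibly with several unknown letters. A minimal sketch of feeding such a list into the same check, assuming unknown letters are stored as None (the question does not say how they are marked) and using the hypothetical helper names pattern_from_input and matches:

```python
def pattern_from_input(letters):
    """Turn ['B', 'E', None, 'K'] into [('B', 0), ('E', 1), ('K', 3)]."""
    return [(c, pos) for pos, c in enumerate(letters) if c is not None]

def matches(word, letters):
    # The word must have exactly one letter per input slot.
    if len(word) != len(letters):
        return False
    # Every known letter must sit at its position in the word.
    return all(word[pos] == c for c, pos in pattern_from_input(letters))

user_input = ["B", "E", None, "K"]   # None marks an unknown letter (assumption)
words = ["BEAK", "BECK", "BELT", "BEAKS"]
print([w for w in words if matches(w, user_input)])  # ['BEAK', 'BECK']
```

Python strings are indexed by Unicode code point, so the same comparison should work for Persian letters as long as the word list and the input use the same code points for each letter.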

You can use a trie data structure, which will be much more efficient than scanning every word.

I suggest that you build a tree with your list of words.

root
  |
  +-B-+-E-+-A-+-K-x ("BEAK")

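Searching is fast because each known letter narrows the search to a single branch, and memory consumption is low because words with a common prefix share nodes.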

If you don't want to start from scratch, you could use the module anytree.
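
If you do want to build the tree yourself, here is a minimal sketch using plain nested dicts rather than anytree. The names build_trie, search and END are made up for illustration, unknown letters are assumed to be stored as None, and "$" is assumed never to occur as a letter in the word list:

```python
END = "$"  # marker key meaning "a word ends at this node" (assumed not a letter)

def build_trie(words):
    root = {}
    for word in words:
        node = root
        for ch in word:
            # Reuse the child for this letter, or create it.
            node = node.setdefault(ch, {})
        node[END] = word
    return root

def search(node, letters, pos=0):
    """Yield every stored word matching letters; None means 'any letter'."""
    if pos == len(letters):
        if END in node:
            yield node[END]
        return
    ch = letters[pos]
    if ch is None:
        # Unknown letter: descend into every child branch.
        for key, child in node.items():
            if key != END:
                yield from search(child, letters, pos + 1)
    elif ch in node:
        # Known letter: follow exactly one branch.
        yield from search(node[ch], letters, pos + 1)

trie = build_trie(["BEAK", "BECK", "BELT", "BERK", "BEAKS"])
print(list(search(trie, ["B", "E", None, "K"])))  # ['BEAK', 'BECK', 'BERK']
```

The search only fans out at unknown positions, so a pattern with few gaps touches a small part of the tree.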

Have a look at the regular expression package (re).

Something like:

import re
pattern = re.compile('BE.K')
possible_words = [word for word in all_words if re.match(pattern, word)]

would work.
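
Since the input is a list of letters rather than a ready-made pattern, the regex can be built from it. A rough sketch, assuming unknown letters are stored as None; regex_from_input is a hypothetical helper name and the Persian words are only sample data. In Python 3, str patterns match Unicode text, so "." also stands for a Persian letter:

```python
import re

def regex_from_input(letters):
    # None marks an unknown letter (assumption). re.escape protects any
    # input character that would otherwise be special in a regex.
    parts = ("." if c is None else re.escape(c) for c in letters)
    return re.compile("".join(parts))

all_words = ["BEAK", "BECK", "BEAKS", "کتاب", "کباب"]

pattern = regex_from_input(["B", "E", None, "K"])
# fullmatch requires the whole word to match, so longer words are rejected.
print([w for w in all_words if pattern.fullmatch(w)])  # ['BEAK', 'BECK']

# Same idea with Persian letters: second letter unknown.
persian = regex_from_input(["ک", None, "ا", "ب"])
print([w for w in all_words if persian.fullmatch(w)])
```

One caveat: "." matches one code point, so a word containing a zero-width non-joiner counts that character as a position too.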

  • Is there always just a single unknown letter in the input? How many words are there in your list of possible words?
  • Could you please add how your user input is stored? How does the user input the undefined character _?
  • Regex can be used with Unicode. Which version of Python are you using?
  • Use r'BE.K' for the regex pattern, and it's "word for word", not "word for work", in your list comprehension.
  • BE.K was an example. Regex could be a good solution; however, my dictionary contains Farsi words (similar to the Arabic alphabet), so I cannot use regex.
  • @A.Aminidad The point was to use a raw string to save you from backslash hell in your regular expression. For example, to match a backslash you need to write "\\\\" or just r"\\", because you have to escape the backslash both for the regex and for the non-raw Python string. The raw string is much easier to read.
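
The raw-string point from the last comment can be checked directly. A small sketch (the sample path is made up):

```python
import re

# Both literals spell the same two-character regex: backslash, backslash.
assert "\\\\" == r"\\"
assert len(r"\\") == 2

# That regex matches one literal backslash in the text.
path = "C:\\temp"   # the seven-character string C:\temp
print(re.findall(r"\\", path))  # ['\\'], a list holding one backslash
```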