An alphabet is a finite set of symbols. A word
of length
over an alphabet
is a sequence of symbols from
, i.e.,
Alternatively, we can view a word of length
over the alphabet
as an
-tuple (i.e., an ordered list with
elements) over
, i.e.,
. The set of all finite words over
, including the empty word
, is denoted by
, where
is the Kleene star.
We thus encounter the following problem:
Problem: Given an alphabet , how do we generate all the words over
? In other words, given
, how do we generate the Kleene closure
?
The following Haskell script solves this problem:
g :: [a] -> [[a]] -> [[a]] g alphabet = concat . map (\xs -> [ xs ++ [s] | s <- alphabet]) allwords :: [a] -> [[a]] allwords alphabet = concat $ iterate (g alphabet) [[]]
where alphabet must be a finite list, otherwise the execution of the script won’t ever terminate. The script above, although short, uses a lot of machinery: list comprehension, higher-order functions map and iterate, concatenation of lists of lists, and partial function application. Let us test this script. First, we load it into GHCi. Then we can run a GHCi session:
*Main> -- define alphabet *Main> let alphabet = [0,1] *Main> -- define function f *Main> let f = g alphabet *Main> -- check type *Main> :t f f :: [[Integer]] -> [[Integer]] *Main> -- apply f to several lists of lists *Main> f [[]] [[0],[1]] *Main> (f . f) [[]] [[0,0],[0,1],[1,0],[1,1]] *Main> (f . f . f) [[]] [[0,0,0],[0,0,1],[0,1,0],[0,1,1],[1,0,0],[1,0,1],[1,1,0],[1,1,1]]
So far, so good. Suppose that we would like to find all words over alphabet whose length is less than or equal to
. How many such words are there? There are
words of length
. Hence, there are
binary words of length less than or equal to . Hence, if we make
, then we obtain
. Continuing our GHCi session:
*Main> take 31 $ allwords alphabet [[],[0],[1], [0,0],[0,1],[1,0],[1,1], [0,0,0],[0,0,1],[0,1,0],[0,1,1], [1,0,0],[1,0,1],[1,1,0],[1,1,1], [0,0,0,0],[0,0,0,1],[0,0,1,0],[0,0,1,1], [0,1,0,0],[0,1,0,1],[0,1,1,0],[0,1,1,1], [1,0,0,0],[1,0,0,1],[1,0,1,0],[1,0,1,1], [1,1,0,0],[1,1,0,1],[1,1,1,0],[1,1,1,1]]
Very nice! We could use this in Coding Theory! For example, suppose that we would like to find all binary words of length whose Hamming weight is equal to
. There are
such words. We can find them as follows:
*Main> -- take all words of length less than or equal to 8 *Main> let words = take 511 $ allwords alphabet *Main> -- filter out binary words of length 8 *Main> let wordsL8 = filter (\xs -> length xs == 8) words *Main> length wordsL8 256 *Main> -- filter out binary words of length 8 *Main> -- and also of Hamming weight equal to 5 *Main> let wordsL8H5 = filter (\xs -> sum xs == 5) wordsL8 *Main> length wordsL8H5 56 *Main> wordsL8H5 [[0,0,0,1,1,1,1,1],[0,0,1,0,1,1,1,1],[0,0,1,1,0,1,1,1], [0,0,1,1,1,0,1,1],[0,0,1,1,1,1,0,1],[0,0,1,1,1,1,1,0], [0,1,0,0,1,1,1,1],[0,1,0,1,0,1,1,1],[0,1,0,1,1,0,1,1], [0,1,0,1,1,1,0,1],[0,1,0,1,1,1,1,0],[0,1,1,0,0,1,1,1], [0,1,1,0,1,0,1,1],[0,1,1,0,1,1,0,1],[0,1,1,0,1,1,1,0], [0,1,1,1,0,0,1,1],[0,1,1,1,0,1,0,1],[0,1,1,1,0,1,1,0], [0,1,1,1,1,0,0,1],[0,1,1,1,1,0,1,0],[0,1,1,1,1,1,0,0], [1,0,0,0,1,1,1,1],[1,0,0,1,0,1,1,1],[1,0,0,1,1,0,1,1], [1,0,0,1,1,1,0,1],[1,0,0,1,1,1,1,0],[1,0,1,0,0,1,1,1], [1,0,1,0,1,0,1,1],[1,0,1,0,1,1,0,1],[1,0,1,0,1,1,1,0], [1,0,1,1,0,0,1,1],[1,0,1,1,0,1,0,1],[1,0,1,1,0,1,1,0], [1,0,1,1,1,0,0,1],[1,0,1,1,1,0,1,0],[1,0,1,1,1,1,0,0], [1,1,0,0,0,1,1,1],[1,1,0,0,1,0,1,1],[1,1,0,0,1,1,0,1], [1,1,0,0,1,1,1,0],[1,1,0,1,0,0,1,1],[1,1,0,1,0,1,0,1], [1,1,0,1,0,1,1,0],[1,1,0,1,1,0,0,1],[1,1,0,1,1,0,1,0], [1,1,0,1,1,1,0,0],[1,1,1,0,0,0,1,1],[1,1,1,0,0,1,0,1], [1,1,1,0,0,1,1,0],[1,1,1,0,1,0,0,1],[1,1,1,0,1,0,1,0], [1,1,1,0,1,1,0,0],[1,1,1,1,0,0,0,1],[1,1,1,1,0,0,1,0], [1,1,1,1,0,1,0,0],[1,1,1,1,1,0,0,0]]
where we used function take (again) and higher-order function filter. The two predicates were built using anonymous functions.
I am happy with this script. Please let me know in case you are not.
__________
Related:
- Distance between two words (2009)


