Asp Forum
Home
|
Login
|
Register
|
Search
Forums
>
comp.lang.ruby
[QUIZ] Cryptograms (#13
James Gray
12/31/2004 6:27:00 PM
The three rules of Ruby Quiz:
1. Please do not post any solutions or spoiler discussion for this quiz until
48 hours have passed from the time on this message.
2. Support Ruby Quiz by submitting ideas as often as you can:
http://www.grayproductions.net/...
3. Enjoy!
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
by Glenn Parker
GOAL: Given a cryptogram and a dictionary of known words, find the best
possible solution(s) to the crytogram. Extra points for speed. Coding
a brute force solution is relatively trivial, but there are many
opportunities for the clever optimizer.
A cryptogram is piece of text that has been passed through a simple
cipher that maps all instances of one letter to a different letter. The
familiar rot13 encoding is a trivial example.
A solution to a cryptogram is a one-to-one mapping between two sets of
(up to) 26 letters, such that applying the map to the cryptogram yields
the greatest possible number from words in the dictionary.
Both the dictionary and the crytogram are presented as a set of words,
one per line. The script should output one or more solutions and the
full or partial mapping for each solution. An example cryptogram and
solution are provided, below.
Three unsolved cryptograms given. Each cryptogram uses a different
mapping. The cryptograms may contain a few words that are not in the
dictionary, e.g. an author's name is commonly appended to quoted text
in cryptograms. Many published cryptograms also contain punctuation in
plaintext as a clue to the solver. The cryptograms below contain no
punctuation, since it just confuses dictionary-based searches.
The dictionary I used was 2of4brif.txt, available as part of the
12Dicts package at
http://prdownloads.sourceforge.net/wordlist/12dic...
. Given the
size of the file (~ 1Mb), it is not included in this message, but I do
think it would be best if participants all used the same dictionary.
EXAMPLE:
gebo
tev
e
cwaack
cegn
gsatkb
ussyk
SOLUTION:
mary
had
a
little
lamb
mother
goose
in: abcdefghijklmnopqrstuvwxyz
out: trl.a.m...e..by...ohgdi.s.
Note: dots in the "out" side of the mapping indicate unused input
letters.
CRYPTOGRAMS:
-----
crypto1.txt
-----
zfsbhd
bd
lsf
xfe
ofsr
bsdxbejrbls
sbsfra
sbsf
xfe
ofsr
xfedxbejrbls
rqlujd
jvwj
fpbdls
-----
crypto2.txt
-----
mkr
ideerqruhr
nrmsrru
mkr
ozgcym
qdakm
scqi
oui
mkr
qdakm
scqi
dy
mkr
ideerqruhr
nrmsrru
mkr
zdakmudua
nja
oui
mkr
zdakmudua
goqb
msodu
-----
crypto3.txt
-----
ftyw
uwmb
yw
ilwwv
qvb
bjtvi
fupxiu
t
dqvi
tv
yj
huqtvd
mtrw
fuw
dwq
bjmqv
fupyqd
1 Answer
Glenn Parker
1/5/2005 2:58:00 AM
0
Here's my "solution". I must admit that it does not perform very well
on crypto3.txt, although based on copious diagnostic output I think it
would eventually come up with the correct solution after a few days.
The comments in my code may help illuminate some of my thinking, which
was very similar to Michael C. Libby's, although the implementation
details are quite different.
I'll provide a summary of all (two, so far) solutions and a discussion
of my approach on Thursday.
--
Glenn Parker | glenn.parker-AT-comcast.net | <
http://www.tetrafoi...
#!ruby
STDOUT.sync = true
# A utility class to read files containing words one-per-line
class WordReader
include Enumerable
def initialize(filename)
@filename = filename
end
def each
File.open(@filename) do |file|
file.each do |word|
word.chomp!
next if word.length == 0
yield word
end
end
end
end
# A copy of the dictionary, with words grouped by "signature".
# A signature simplifies a word to its repeating letter patterns.
# The signature for "cat" is 1.2.3 because each successive letter
# in cat is unique. The signature for "banana" is 1.2.3.2.3.2,
# where letter 2, "a", is repeated three times and letter 3, "n"
# is repeated twice.
class Dictionary
def initialize(filename)
@all = {}
@sigs = {}
@max_sig = nil
@max_sig_len = nil
WordReader.new(filename).each { |word|
word.downcase!
word.gsub!(/[^a-z]/, '')
next if word.empty?
@all[word] = true
sig = signature(word)
@sigs[sig] ||= []
@sigs[sig].push(word)
}
self.freeze
end
def lookup(word)
@all[word]
end
def candidates(cipher)
@sigs[signature(cipher)]
end
private
def signature(word)
seen = {}
u = 0
sig = []
word.each_byte do |b|
if not seen[b]
u += 1
seen[b] = u
end
sig.push(seen[b])
end
sig.join('.')
end
end
# CMap maintains the mapping from ciphertext to plaintext and the
# some state related to the solution. @map is the actual mapping.
# @solved is just a string with all the solved words. @shelved
# is an array of ciphertext words that cannot be solved because
# the current mapping resolves all their letters and the result
# is not found in the dictionary.
class CMap
attr_reader :map, :solved, :shelved
def initialize(arg = nil, newmap = nil, dword = nil)
case
when arg.kind_of?(String)
@map = arg.dup
when arg.kind_of?(CMap)
@map = newmap || arg.map.dup
@solved = arg.solved.dup
@shelved = arg.shelved.dup
append_solved(dword) if dword
else
@map = '.' * 26
@solved = ''
@shelved = []
end
end
def dup
CMap.new(self)
end
# Attempt to update the map to include all letter combinations
# needed to map cword into dword. Return nil if a conflict is found.
def learn(cword, dword)
newmap = @map.dup
(0...cword.length).each do |i|
c = cword[i] - ?a
p = newmap[c]
# check for correct mapping
next if p == dword[i]
# check for incorrect mapping
return nil if (p != ?.) || newmap.include?(dword[i])
# create new mapping
newmap[c] = dword[i]
end
CMap.new(self, newmap, dword)
end
def append_solved(dword)
@solved += ' ' unless @solved.empty?
@solved += dword
end
def shelve(cword)
@shelved << cword
end
def convert(cword)
pattern = ''
cword.each_byte do |c|
pattern << @map[c - ?a]
end
pattern
end
end
# Pruner keeps track of maps that have already been tried.
# Some maps are subsets of previously tried maps, allowing us to skip them.
# This may consume more RAM than it is worth.
class Pruner
def initialize(list)
@pruned = {}
letter = 'A'
@abbrev = {}
list.each {|w| @abbrev[w] = letter.clone; letter.succ!}
end
def genkey(list)
list.collect{|w|@abbrev[w]}.sort.join('.')
end
def test(list, cmap)
key = genkey(list)
return false unless prunelist = @pruned[key]
prunelist.each do |prunemap|
if /#{prunemap}/ =~ cmap.map
return true
end
end
return false
end
def store(list, cmap)
key = genkey(list)
#puts "Pruner.store #{cmap.map} #{key}"
@pruned[key] ||= []
@pruned[key] = @pruned[key].reject do |prunemap|
/#{cmap.map}/ =~ prunemap
end << cmap.map
end
end
class Cryptogram
def initialize(filename, dict)
@dict = dict
@words = WordReader.new(filename).to_a
# clist is the input cipher with no duplicated words
# and no unrecognized input characters
@clist = []
@words.each do |word|
word.downcase!
word.gsub!(/[^a-z]/, '')
next if word.empty? || @clist.include?(word)
@clist.push(word)
end
# Sort by increasing size of candidate list
@clist = @clist.sort_by {|w| @dict.candidates(w).length}
end
def solve(max_unsolved = 0, stop_on_first = true)
@max_unsolved = max_unsolved
@stop_on_first = stop_on_first
@checks = 0
@solutions = {}
@partials = {}
# @pruner = Pruner.new(@clist)
solve_p(@clist, CMap.new, 0)
end
def solve_p(list, cmap, depth)
# Simplify list if possible
list = prescreen(list, cmap)
return if check_solution(list, cmap)
# return if @pruner.test(list, cmap)
solve_r(list, cmap, depth)
return if done?
# @pruner.store(list, cmap)
end#solve_p
def solve_r(start_list, start_cmap, depth)
for i in (0...start_list.length)
# Pull a cword out of start_list
list = start_list.dup
cword = list.delete_at(i)
pattern = start_cmap.convert(cword)
search(cword, pattern) do |dword|
# Try to make a new cmap by learning dword for cword
next unless cmap = start_cmap.learn(cword, dword)
# Recurse on remaining words
solve_p(list, cmap, depth + 1)
return if done?
end#search
end#for
end#solve_r
def done?
@stop_on_first && @solutions.length > 0
end
# Return the subset of cwords in list that are not fully solved by cmap.
# Update cmap with learned and shelved words.
def prescreen(list, cmap)
start_list = []
list.each do |cword|
pattern = cmap.convert(cword)
if pattern.include?(?.)
# cword was not fully resolved.
start_list << cword
elsif @dict.lookup(pattern)
# cword was resolved and is a known word.
cmap.learn(cword, pattern)
else
# cword cannot be solved.
cmap.shelve(cword)
end
end
start_list
end
# Generate dictionary words matching the pattern
def search(cword, pattern)
# the pattern will normally have at least one unknown character
if pattern.include? ?.
re = Regexp.new("^#{pattern}$")
@dict.candidates(cword).each do |dword|
yield dword if re =~ dword
end
# otherwise, just check that the pattern is actually a known word.
elsif @dict.lookup(pattern)
yield pattern
end
end
def check_solution(list, cmap)
@checks += 1
unsolved = list.length + cmap.shelved.length
# Did we get lucky?
if unsolved == 0
if not @solutions.has_key?(cmap.map)
@solutions[cmap.map] = true
if not @stop_on_first
puts "\nfound complete solution \##{@solutions.length}"
puts "performed #{@checks} checks"
show_cmap(cmap)
end
end
return true
end
# Give up if too many words cannot be solved
if cmap.shelved.length > @max_unsolved
return true
end
# Check for satisfactory partial solution
if unsolved <= @max_unsolved
if not @partials.has_key?(cmap.map)
@partials[cmap.map] = true
puts "\nfound partial \##{@partials.length} with #{unsolved} unsolved"
puts "performed #{@checks} checks"
puts Time.now
show_cmap(cmap)
end
end
return false
end
def show
puts "Performed #{@checks} checks"
puts "Found #{@solutions.length} solutions"
@solutions.each_key do |sol|
show_cmap(CMap.new(sol))
end
puts
puts "Found #{@partials.length} partial solutions"
@partials.each_key do |sol|
show_cmap(CMap.new(sol))
end
end
def show_cmap(cmap)
puts(('a'..'z').to_a.join(''))
puts cmap.map
puts
@words.each do |word|
pattern = cmap.convert(word)
printf "%-20s %s %-20s\n",
word, (@dict.lookup(pattern) ? ' ' : '*'), pattern
end
puts '-' * 42
end
end
DICTFILE = ARGV[0]
PARTIAL = ARGV[1].to_i
puts "Reading dictionary #{DICTFILE}"
dict = Dictionary.new(DICTFILE)
ARGV[2..-1].each do |filename|
puts "Solving cryptogram #{filename} allowing #{PARTIAL} unknowns", Time.now
cryp = Cryptogram.new(filename, dict)
cryp.solve PARTIAL
puts "Cryptogram solution", Time.now
cryp.show
end
Servizio di avviso nuovi messaggi
Ricevi direttamente nella tua mail i nuovi messaggi per
[QUIZ] Cryptograms (#13
Inserendo la tua e-mail nella casella sotto, riceverai un avviso tramite posta elettronica ogni volta che il motore di ricerca troverà un nuovo messaggio per te
Il servizio è completamente GRATUITO!
x
Login to ForumsZone
Login with Google
Login with E-Mail & Password