30 August 2015

Typoglycemia: The PowerShell and the SQL

Typoglycemia is the ironic name (derived from hypoglycemia) given to the phenomenon that many readers can understand the meaning of words in a sentence even when the interior letters of each word are scrambled. They appear to recognize words by the outermost letters, the length, the letters used and the context. As long as all the necessary letters are present, and the first and last letters stay in place, they appear to have little trouble reading the text, although reading slows down because the brain has more work to do.

Scrambling the text between the first and last letter of every word has become a common homework puzzle for people learning computing. There is a Perl one-liner that is said to perform this scrambling on a string:

perl -ne's@\b(\w)(\w+)(\w)\b@"$1".(join"",sort{(rand)<=>0.7}split//,$2).$3@gex&&print;'

Can it be done in PowerShell? Well, yes. It looks like one of those problems for which regular expressions are ideal. In fact, I suspect that it was invented purely to put regexes in a good light. I’d be interested to see whether you can do it with a shorter algorithm.

function Obfuscate-String {
<#
.SYNOPSIS
This function takes a string and shuffles all but the first and last characters of each word, leaving everything else untouched. It returns the obfuscated string.

.DESCRIPTION
if you pass a string into this function, it will shuffle randomly every word,
but retain the first and last character intact, so that what I've just said
becomes "if you psas a stirng into this ftonucin, it will sfhfule rldamony
eevry wrod, but riaetn the frsit and lsat catearhcr inactt, so taht waht I've
jsut said beomces".

.EXAMPLE
Obfuscate-String 'whatever I type seems to me made rather hard to read!'

.EXAMPLE
'This seems rather trivial: or should I be amused?'|Obfuscate-String

.EXAMPLE
'Dave','Dee','Dozy','Beaky','Mick','Tich'|%{Obfuscate-String $_}

.PARAMETER TheStringToShuffle
The string you want to shuffle
#>
[CmdletBinding()]
param
(<# the only parameter is the string to mangle #>
[Parameter(Mandatory=$True,
ValueFromPipeline=$True,
ValueFromPipelineByPropertyName=$True,
HelpMessage='what string should I shuffle?')]
[string]$TheStringToShuffle
)

process { # a process block, so that every string arriving down the pipeline is handled
#have three separate capture groups for each word of four or more characters:
#the first character, the interior and the last character
$words = ([regex] '\b(\w)(\w{2,})(\w)\b').Match($TheStringToShuffle)
$obscured=''
$previous=0 #index just past the end of the previous match
while ($words.Success) #for each match in turn
{
$gap=$words.Index-$previous #length of the unmatched text before this match
#shuffle the interior of the word by sorting its characters on a random key
$shuffled=-join ($words.Groups[2].Value.ToCharArray() | Sort-Object {Get-Random})
# Add everything since the end of the last match up to this match,
# then the start character, the mangled middle of the word
# and the final character (all three capture groups)
$obscured+=$TheStringToShuffle.Substring($previous,$gap)+$words.Groups[1].Value+$shuffled+$words.Groups[3].Value
$previous=$words.Index+$words.Length #remember the index of the end of the matched string
$words = $words.NextMatch() #and get the next match
}
# and now we do the remains of the string (if there is any more)
$obscured+=$TheStringToShuffle.Substring($previous,$TheStringToShuffle.Length-$previous)
"$obscured" #and return the string
}
}

@'
There was a young lady of Natchez
Whose garments were always in patchez.
When comment arose
On the state of her clothes,
She drawled, When Ah itchez, Ah scratchez!
'@ |Obfuscate-String

<# gives something like ...
Trehe was a yonug lday of Nceathz
Wsohe gmtneras wree aywals in peahctz.
When cnemmot aorse
On the satte of her clhtoes,
She draweld, When Ah ietchz, Ah shtccearz!
#>
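
As one possible answer to the "shorter algorithm" question, here is a minimal sketch that hands the splicing over to [regex]::Replace, so that the loop and the Substring bookkeeping disappear. It assumes PowerShell's usual conversion of a script block into a MatchEvaluator delegate. Scramble-Words is a hypothetical name chosen for this illustration and is not part of the original function.

```powershell
# Sketch only: Scramble-Words is a hypothetical name used for illustration.
function Scramble-Words {
    param([Parameter(Mandatory = $true)][string]$Text)

    # Same pattern as Obfuscate-String: first character, an interior of two
    # or more characters, then the last character.
    $pattern = '\b(\w)(\w{2,})(\w)\b'

    # The script block is converted to a MatchEvaluator delegate. It is
    # called once per match, and the string it returns replaces that match.
    [regex]::Replace($Text, $pattern, {
        param($match)
        # Shuffle the interior by sorting its characters on a random key
        $interior = $match.Groups[2].Value.ToCharArray() | Sort-Object { Get-Random }
        $match.Groups[1].Value + (-join $interior) + $match.Groups[3].Value
    })
}

Scramble-Words 'whatever I type seems to me made rather hard to read!'
```

Because the output is random, the easiest way to check it is by its invariants: the result has the same length as the input, and each word keeps its first letter, its last letter and its set of letters.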

Now what about T-SQL? Is it possible to do the same thing in Transact-SQL? Well, it is, but is it pretty? Is it fast? Now, there’s a challenge!

added later…

OK, so there was only one entry for the SQL version (at the time of writing), so I thought I ought to add my own attempt at a SQL Typoglycemia. It is done as a stored procedure because a user-defined function is not allowed to call NEWID(), and this routine is meant to produce a different result every time!

The trick I used was to find every word with four or more letters using a wildcard search (PATINDEX with LIKE-style patterns), and then to use this information to extract the interior letters and shuffle them using the NEWID() trick. Easy really. I bet you could write something faster or simpler!
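
The NEWID() trick amounts to giving every row a freshly generated random key and sorting on it. For readers more at home in the PowerShell half of this article, the same principle can be sketched there with [guid]::NewGuid() as the sort key. This is only an illustration of the idea, not a translation of the procedure, and the variable names are hypothetical.

```powershell
# Illustration only: the variable names here are hypothetical.
$letters = 'ypoglycemi'.ToCharArray()   # the interior of 'Typoglycemia'

# Give every letter a fresh random key and sort on that key, which is
# what ORDER BY NEWID() does to the rows of a query in T-SQL.
$shuffled = -join ($letters | Sort-Object { [guid]::NewGuid() })

# Same letters, in what is very probably a different order
$shuffled
```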

IF OBJECT_ID ('Obfuscate_string') IS NOT NULL
  DROP PROCEDURE Obfuscate_string
GO
CREATE PROCEDURE Obfuscate_string @sentence VARCHAR(MAX) OUTPUT
AS
BEGIN
  DECLARE @start INT, @WordLength INT, @LenSentence INT, @index INT;
  SELECT @LenSentence=len(@sentence),@start=100,@index =0;
  /* we find each word of four or more letters in a WHILE loop, and having found the
  start and end position in the string, we shuffle the characters using the NEWID()
  trick. We insert them back into the string using STUFF.*/
  WHILE (@start>0 AND @index+4<=@LenSentence) -- <= so that a final four-letter word is not skipped
    BEGIN
    SELECT
      -- position, within the unprocessed remainder, of the next word of four or more letters
      @start= patindex('%[^a-z][a-z][a-z][a-z][a-z]%',' '+right(@sentence,@LenSentence-@index)),
      -- length of that word, found from the position of its last letter
      @WordLength=
        CASE WHEN @start=0 THEN 0
        ELSE patindex('%[a-z][^a-z]%',substring(right(@sentence,@LenSentence-@index),@start,100))END,
      -- no trailing delimiter means that the word runs to the end of the string
      @WordLength=CASE WHEN @WordLength=0 THEN @LenSentence-@index-@start+1 ELSE @WordLength END
    IF @start>0
      BEGIN
      -- replace the interior of the word with its own letters in random order
      SELECT @sentence=stuff(@sentence, @index+@start+1,@WordLength-2,
        (SELECT substring(substring(@sentence,@index+@start+1, @WordLength-2), n,1)
         FROM (SELECT TOP (@WordLength-2) row_number() OVER (ORDER BY (SELECT NULL))
               FROM(VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) a(n) -- 10 rows
               CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) b(n))x(n) -- 100 rows
         ORDER BY ABS(CHECKSUM(NewId()))
         FOR Xml PATH (''), TYPE).value('.', 'varchar(max)'))
      END
    SELECT @index=@index+@start+@WordLength
    END
END
GO

--and now we test it.
DECLARE @sentence varchar(max);
SELECT @sentence=
'There was a young lady of Natchez
Whose garments were always in patchez.
When comment arose
On the state of her clothes,
She drawled, ''When Ah itchez, Ah scratchez''';
EXECUTE Obfuscate_string @sentence OUTPUT
SELECT @sentence

SELECT @sentence= 'whatever I type seems to me made rather hard to read!'
EXECUTE Obfuscate_string @sentence OUTPUT
SELECT @sentence

SELECT @sentence= 'This seems rather trivial: or should I be amused?'
EXECUTE Obfuscate_string @sentence OUTPUT
SELECT @sentence

Phil Factor
