Arbitrary-Length Hashing

Suppose you have a hash function $\mathcal{H}$ which takes strings of length $2n$ and returns strings of length $n$, and which has the nice property that it is collision resistant, i.e. it is hard to find two different strings $s \neq s'$ with the same hash $\mathcal{H}(s) = \mathcal{H}(s')$.

You would now like to build a new hash function $\mathcal{H'}$ which takes strings of arbitrary length and maps them to strings of length $n$, while still being collision resistant.

Luckily for you, a method now known as the Merkle–Damgård construction, which achieves exactly this, was already published in 1979.

The task of this challenge is to implement this algorithm, so we'll first look at a formal description of the Merkle–Damgård construction before going through a step-by-step example, which should show that the approach is simpler than it might appear at first.

Given some integer $n > 0$, a hash function $\mathcal{H}$ as described above and an input string $s$ of arbitrary length, the new hash function $\mathcal{H'}$ does the following:

Set $l = |s|$, the length of $s$, and split $s$ into chunks of length $n$, filling up the last chunk with trailing zeros if necessary. This yields $m = \lceil \frac{l}{n} \rceil$ chunks, which are labeled $c_1, c_2, \dots, c_m$.

Add a leading and a trailing chunk $c_0$ and $c_{m+1}$, where $c_0$ is a string consisting of $n$ zeros and $c_{m+1}$ is $n$ in binary, padded with leading zeros to length $n$.

Now iteratively apply $\mathcal{H}$ to the current chunk $c_i$ appended to the previous result $r_{i-1}$: $r_i = \mathcal{H}(r_{i-1}c_i)$, where $r_0 = c_0$. (This step might be clearer after looking at the example below.)

The output of $\mathcal{H'}$ is the final result $r_{m+1}$.
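The steps above can be sketched as an ungolfed reference implementation. This is only an illustration in Python, not a competing answer; the names `merkle_damgard` and `every_second` are made up for this sketch, and `H` is assumed to be any callable that maps a string of length $2n$ to a string of length $n$.

```python
from math import ceil


def merkle_damgard(n, H, s):
    """Ungolfed sketch of H' as described above (names are illustrative)."""
    m = ceil(len(s) / n)                    # number of message chunks
    padded = s.ljust(m * n, "0")            # trailing zeros fill the last chunk
    chunks = [padded[i:i + n] for i in range(0, m * n, n)]  # c_1 .. c_m
    chunks.append(format(n, "b").zfill(n))  # c_{m+1}: n in binary, width n
    r = "0" * n                             # r_0 = c_0
    for c in chunks:
        r = H(r + c)                        # r_i = H(r_{i-1} c_i)
    return r


def every_second(t):
    # toy black box: keeps the characters at odd (0-based) positions
    return t[1::2]


print(merkle_damgard(5, every_second, "Programming Puzzles"))  # Pe011
```

The toy `every_second` function is the "every second character" black box from the example section, so the printed value matches the expected output given there.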

Task

Write a program or function which takes as input a positive integer $n$, a hash function $\mathcal{H}$ as a black box, and a non-empty string $s$, and returns the same result as $\mathcal{H'}$ on the same inputs.

This is code-golf, so the shortest answer in each language wins.

Example

Let's say $n = 5$, so our given hash function $\mathcal{H}$ takes strings of length 10 and returns strings of length 5.

Given an input of $s = \texttt{"Programming Puzzles"}$, we get the following chunks: $c_1 = \texttt{"Progr"}$, $c_2 = \texttt{"ammin"}$, $c_3 = \texttt{"g Puz"}$ and $c_4 = \texttt{"zles0"}$. Note that $c_4$ needed to be padded to length 5 with one trailing zero.
$c_0 = \texttt{"00000"}$ is just a string of five zeros, and $c_5 = \texttt{"00101"}$ is five in binary ($\texttt{101}$), padded with two leading zeros.
Now the chunks are combined with $\mathcal{H}$:
$r_0 = c_0 = \texttt{"00000"}$
$r_1 = \mathcal{H}(r_0c_1) = \mathcal{H}(\texttt{"00000Progr"})$
$r_2 = \mathcal{H}(r_1c_2) = \mathcal{H}(\mathcal{H}(\texttt{"00000Progr"})\texttt{"ammin"})$
$r_3 = \mathcal{H}(r_2c_3) = \mathcal{H}(\mathcal{H}(\mathcal{H}(\texttt{"00000Progr"})\texttt{"ammin"})\texttt{"g Puz"})$
$r_4 = \mathcal{H}(r_3c_4) = \mathcal{H}(\mathcal{H}(\mathcal{H}(\mathcal{H}(\texttt{"00000Progr"})\texttt{"ammin"})\texttt{"g Puz"})\texttt{"zles0"})$
$r_5 = \mathcal{H}(r_4c_5) = \mathcal{H}(\mathcal{H}(\mathcal{H}(\mathcal{H}(\mathcal{H}(\texttt{"00000Progr"})\texttt{"ammin"})\texttt{"g Puz"})\texttt{"zles0"})\texttt{"00101"})$
$r_5$ is our output.

Let's see what this output looks like for some choices¹ of $\mathcal{H}$:

If $\mathcal{H}(\texttt{"0123456789"}) = \texttt{"13579"}$ , i.e. $\mathcal{H}$ just returns every second character, we get:
$r_1 = \mathcal{H}(\texttt{"00000Progr"}) = \texttt{"00Por"}$
$r_2 = \mathcal{H}(\texttt{"00Porammin"}) = \texttt{"0oamn"}$
$r_3 = \mathcal{H}(\texttt{"0oamng Puz"}) = \texttt{"omgPz"}$
$r_4 = \mathcal{H}(\texttt{"omgPzzles0"}) = \texttt{"mPze0"}$
$r_5 = \mathcal{H}(\texttt{"mPze000101"}) = \texttt{"Pe011"}$
So $\texttt{"Pe011"}$ needs to be the output if such a $\mathcal{H}$ is given as black box function.
If $\mathcal{H}$ simply returns the first 5 chars of its input, the output of $\mathcal{H'}$ is $\texttt{"00000"}$ . Similarly if $\mathcal{H}$ returns the last 5 chars, the output is $\texttt{"00101"}$ .
If $\mathcal{H}$ multiplies the character codes of its input and returns the first five digits of this number, e.g. $\mathcal{H}(\texttt{"PPCG123456"}) = \texttt{"56613"}$ , then $\mathcal{H}'(\texttt{"Programming Puzzles"}) = \texttt{"91579"}$ .
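For testing a submission, the sample black boxes above can be written down directly. The following Python sketch is only a test harness under stated assumptions: `md` is a compact restatement of the construction, and the four function names are invented here. The last function follows the "multiply the character codes, keep the first five digits" description.

```python
from functools import reduce


def md(n, H, s):
    # compact restatement of the construction, used only as a test harness
    body = s + "0" * (-len(s) % n)          # pad the last chunk with zeros
    chunks = [body[i:i + n] for i in range(0, len(body), n)]
    chunks.append(format(n, "b").zfill(n))  # trailing chunk: n in binary
    return reduce(lambda r, c: H(r + c), chunks, "0" * n)


def every_second(t):
    return t[1::2]          # "0123456789" -> "13579"


def first_five(t):
    return t[:5]


def last_five(t):
    return t[-5:]


def code_product(t):
    p = 1
    for ch in t:
        p *= ord(ch)        # multiply the character codes
    return str(p)[:5]       # keep the first five digits


s = "Programming Puzzles"
for H in (every_second, first_five, last_five, code_product):
    print(H.__name__, md(5, H, s))
```

Python integers have arbitrary precision, so the product in `code_product` does not overflow; in languages with fixed-width integers that black box would need a big-integer type.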

¹ For simplicity, those $\mathcal{H}$ are actually not collision resistant, though this does not matter for testing your submission.

code-golf function hashing

— Laikoni

Sandbox (deleted)

— Laikoni

I must say it's fun that the example given has the last 'full' hash be of "OMG Puzzles!" effectively omgPzzles0. Well chosen example input!

— LambdaBeta

Can we assume some flexibility on the input format for H (e.g. it takes two strings of length n, or a longer string of which it only considers the first 2n characters)?

— Delfad0r

Are space characters, e.g., between "g P" valid output?

— guest271314

@guest271314 If the space is part of the resulting hash, it needs to be outputted. If the hash is actually "gP", you may not output a space inbetween.

— Laikoni

Haskell, 91 90 86 bytes

-1 byte thanks to Laikoni
-4 bytes thanks to xnor

n!h|let a='0'<$[1..n];c?""=c;c?z=h(c++take n(z++a))?drop n z=h.(++mapM(:"1")a!!n).(a?)

Try it online!

Explanation

a='0'<$[1..n]

Just assigns the string "00...0" ('0' $n$ times) to a

c?""=c
c?z=h(c++take n(z++a))?drop n z

The function ? implements the recursive application of h: c is the hash we have obtained so far (length $n$ ), z is the rest of the string. If z is empty then we simply return c, otherwise we take the first $n$ characters of z (possibly padding with zeros from a), prepend c and apply h. This gives the new hash, and then we call ? recursively on this hash and the remaining characters of z.

n!h=h.(++mapM(:"1")a!!n).(a?)

The function ! is the one actually solving the challenge. It takes n, h and s (implicit) as inputs. We compute a?s, and all we have to do is append n in binary and apply h once more. mapM(:"1")a!!n returns the binary representation of $n$ .
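The `mapM(:"1")a!!n` trick can be restated outside Haskell: listing all length-$n$ strings over `0` and `1` in lexicographic order and taking element number $n$ (0-based) gives $n$ in binary, padded to $n$ digits. A Python sketch of the same idea, with the function name chosen here for illustration:

```python
from itertools import islice, product


def binary_via_product(n):
    # all length-n strings over "01", in lexicographic order
    words = ("".join(w) for w in product("01", repeat=n))
    # element number n (0-based) is n in binary, padded to n digits
    return next(islice(words, n, None))


print(binary_via_product(5))  # 00101
```

The index is always in range because $n < 2^n$ for every positive $n$.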

— Delfad0r

let in a guard is shorter than using where: Try it online!

— Laikoni

It looks like mapM(\_->"01")a can be mapM(:"1")a.

— xnor

R, 159 154 bytes

function(n,H,s,`?`=paste0,`*`=strrep,`/`=Reduce,`+`=nchar,S=0*n?s?0*-(+s%%-n)?"?"/n%/%2^(n:1-1)%%2)(function(x,y)H(x?y))/substring(S,s<-seq(,+S,n),s--n-1)

Try it online!

Ugh! Answering string challenges in R is never pretty, but this is horrendous. This is an instructive answer on how not to write "normal" R code...

Thanks to nwellnhof for fixing a bug, at a cost of 0 bytes!

Thanks to J.Doe for swapping the operator aliasing to change the precedence, good for -4 bytes.

The explanation below is for the previous version of the code, but the principles remain the same.

function(n,H,s,               # harmless-looking function arguments with horrible default arguments 
                              # to prevent the use of {} and save two bytes
                              # then come the default arguments,
                              # replacing operators as aliases for commonly used functions:
 `+`=paste0,                  # paste0 with binary +
 `*`=strrep,                  # strrep for binary *
 `/`=Reduce,                  # Reduce with binary /
 `?`=nchar,                   # nchar with unary ?
 S=                           # final default argument S, the padded string:
  0*n+                        # rep 0 n times
  s+                          # the original string
  0*-((?s)%%-n)+              # 0 padding as a multiple of n
  "+"/n%/%2^(n:1-1)%%2)       # n as an n-bit number
                              # finally, the function body:
 (function(x,y)H(x+y)) /      # Reduce/Fold (/) by H operating on x + y
  substring(S,seq(1,?S,n),seq(n,?S,n))  # operating on the n-length substrings of S
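The padding expression `-((?s)%%-n)` relies on R's `%%` taking the sign of the divisor. Python's `%` behaves the same way, so the idea can be sketched there; both function names below are made up for the comparison.

```python
def pad_len(length, n):
    # length % -n lies in (-n, 0], so negating it gives the number of
    # zeros still needed to reach a multiple of n
    return -(length % -n)


def naive_pad_len(length, n):
    # yields n instead of 0 when n divides length, adding a spurious chunk
    return n - length % n


for length in (16, 19, 20):
    print(length, pad_len(length, 5), naive_pad_len(length, 5))
# 16 4 4
# 19 1 1
# 20 0 5
```

The two versions differ only when $n$ divides the length of $s$, where the naive form would append a whole extra chunk of zeros.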

— Giuseppe

I think 0*(n-(?s)%%n) doesn't work if n divides s evenly. But 0*-((?s)%%-n) should work.

— nwellnhof

@nwellnhof ah, of course, thank you, fixed.

— Giuseppe

Minor changes, 155 bytes

— J.Doe

@J.Doe nice! I saved another byte since seq has 1 as its from argument by default.

— Giuseppe

C (gcc), 251 bytes

#define P sprintf(R,
b(_){_=_>1?10*b(_/2)+_%2:_;}f(H,n,x)void(*H)(char*);char*x;{char R[2*n+1],c[n+1],*X=x;P"%0*d",n,0);while(strlen(x)>n){strncpy(c,x,n);x+=n;strcat(R,c);H(R);}P"%s%s%0*d",R,x,n-strlen(x),0);H(R);P"%s%0*d",R,n,b(n));H(R);strcpy(X,R);}

Try it online!

Not as clean as the bash solution, and highly improvable.

The function is f taking H as a function that replaces its string input with that string's hash, n as in the description, and x the input string and output buffer.

Description:

#define P sprintf(R,     // Replace P with sprintf(R, leading to unbalanced parenthesis
                         // This is replaced and expanded for the rest of the description
b(_){                    // Define b(x). It will return the integer binary expansion of _
                         // e.g. 5 -> 101 (still as integer)
  _=_>1?                 // If _ is greater than 1
    10*b(_/2)+_%2        // return 10*binary expansion of _/2 + last binary digit
    :_;}                 // otherwise just _
f(H,n,x)                 // Define f(H,n,x)
  void(*H)(char*);       // H is a function taking a string
  char*x; {              // x is a string
  char R[2*n+1],c[n+1],  // Declare R as a 2n-length string and c as a n-length string
  *X=x;                  // save x so we can overwrite it later
  sprintf(R,"%0*d",n,0); // print 'n' 0's into R
  while(strlen(x)>n){    // while x has more than n characters
    strncpy(c,x,n);x+=n; // 'move' the first n characters of x into c
    strcat(R,c);         // append c to R
    H(R);}               // Hash R
  sprintf(R,"%s%s%0*d",  // set R to two strings concatenated followed by some zeroes
    R,x,                 // the two strings being R and (what's left of) x
    n-strlen(x),0);      // and n-len(x) zeroes
  H(R);                  // Hash R
  sprintf(R,"%s%0*d",R,n, // append to R the decimal number, 0 padded to width n
    b(n));               // The binary expansion of n as a decimal number
  H(R);strcpy(X,R);}     // Hash R and copy it into where x used to be
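The helper `b` encodes the binary digits of its argument as a decimal integer, which is then zero-padded by the `%0*d` format. A Python sketch of the same recursion, kept under the same name for comparison:

```python
def b(x):
    # decimal integer whose digits are the binary digits of x, e.g. 5 -> 101
    return 10 * b(x // 2) + x % 2 if x > 1 else x


print(b(5))           # 101
print("%05d" % b(5))  # 00101
```

In Python the integers cannot overflow. In the C version, assuming a 32-bit `int`, the result would stop fitting once the binary expansion has more than ten digits, i.e. for $n \geq 1024$.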

— LambdaBeta

229 bytes

— ceilingcat

I think: 227 bytes (going off of ceilingcat's comment)

— Zacharý

Ruby, 78 bytes

->n,s,g{(([?0*n]*2*s).chop.scan(/.{#{n}}/)+["%0#{n}b"%n]).reduce{|s,x|g[s+x]}}

Try it online!

How it works:

([?0*n]*2*s).chop    # Padding: add leading and trailing 
                     # zeros, then remove the last one
.scan(/.{#{n}}/)     # Split the string into chunks
                     # of length n
+["%0#{n}b"%n]       # Add the trailing block
.reduce{|s,x|g[s+x]} # Apply the hashing function
                     # repeatedly
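The padding step avoids computing how many zeros are missing: it adds $n$ zeros in front and $n-1$ behind, then keeps only the complete chunks. A Python sketch of that idea follows; the function name is illustrative, and `re.S` is an addition here so that `.` also matches newlines.

```python
import re


def chunks_by_overpadding(n, s):
    # n leading zeros, then s, then n-1 trailing zeros
    # (the Ruby code adds n trailing zeros and chops one off)
    padded = "0" * n + s + "0" * (n - 1)
    # findall keeps only complete chunks of length n; leftovers are dropped
    return re.findall(".{%d}" % n, padded, flags=re.S)


print(chunks_by_overpadding(5, "Programming Puzzles"))
# ['00000', 'Progr', 'ammin', 'g Puz', 'zles0']
```

With $n-1$ trailing zeros the leftover after the last complete chunk is always shorter than $n$, so no extra all-zero chunk can appear, even when $n$ divides the length of $s$.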

— G B

Jelly, 23 bytes

0Ṿ;s;BṾ€ṚWƲ}z”0ZU0¦;Ç¥/

Try it online!

Accepts $\mathcal H$ at the line above it, $s$ as its left argument, and $n$ as its right argument.

— Erik the Outgolfer

Bash, 127-ε bytes

Z=`printf %0*d $1` R=$Z
while IFS= read -rn$1 c;do R=$R$c$Z;R=`H<<<${R::2*$1}`;done
H< <(printf $R%0*d $1 `bc <<<"obase=2;$1"`)

Try it online!

This works as a program/function/script/snippet. H must be resolvable to a program or function that will perform the hashing. n is the argument. Example call:

$ H() {
>   sed 's/.\(.\)/\1/g'
> }
$ ./wherever_you_put_the_script.sh 5 <<< "Programming Puzzles"  # if you add a shebang
Pe011

Description:

Z=`printf %0*d $1`

This creates a string of $1 zeroes. This works by calling printf and telling it to print an integer padded to extra argument width. That extra argument we pass is $1, the argument to the program/function/script which stores n.

R=$Z

This merely copies Z, our zero string, to R, our result string, in preparation for the hashing loop.

while IFS= read -rn$1 c; do

This loops over the input every $1 (n) characters loading the read characters into c. If the input ends then c merely ends up too short. The r option ensures that any special characters in the input don't get bash-interpreted. This is the -ε in the title - that r isn't strictly necessary, but makes the function more accurately match the input.

R=$R$c$Z

This concatenates the n characters read from input to R along with zeroes for padding (too many zeroes for now).

R=`H<<<${R::2*$1}`;done

This uses a here string as input to the hash function. The contents ${R::2*$1} are a somewhat esoteric bash parameter substitution which reads: R, starting from 0, only 2n characters.

Here the loop ends and we finish with:

H< <(printf $R%0*d $1 `bc <<<"obase=2;$1"`)

Here the same format string trick is used to 0 pad the number. bc is used to convert it to binary by setting the output base (obase) to 2. The result is passed to the hash function/program whose output is not captured and thus is shown to the user.
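The loop's "append too many zeros, then truncate to $2n$ characters" idea can be sketched in Python as follows. This mirrors the structure of the script rather than its exact I/O behaviour, and the function names are invented for the sketch.

```python
def md_overpad(n, H, s):
    zeros = "0" * n                        # plays the role of Z
    r = zeros                              # plays the role of R
    for i in range(0, len(s), n):
        c = s[i:i + n]                     # up to n characters per iteration
        r = H((r + c + zeros)[:2 * n])     # over-pad, then keep 2n characters
    return H(r + format(n, "b").zfill(n))  # final block: n in binary


def every_second(t):
    # same toy hash as the sed call in the example invocation
    return t[1::2]


print(md_overpad(5, every_second, "Programming Puzzles"))  # Pe011
```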

— LambdaBeta

Why "127-ε"? Why not just "127"?

— Solomon Ucko

I don't know. I was on the fence about the necessity of the r flag. I figured 1 byte doesn't really matter, but if pushed I could shave it.

— LambdaBeta

For the read command?

— Solomon Ucko

Because without it a `\` in the input will be interpreted instead of ignored, so they'd have to be escaped.

— LambdaBeta

Maybe add a note about that?

— Solomon Ucko

Pyth, 24 bytes

Since Pyth doesn't allow H to be used for a function name, I use y instead.

uy+GH+c.[E=`ZQQ.[ZQ.BQ*Z

Try it online! Example is with the "every second character" version of H.

— Steven H.

Perl 6, 79 68 bytes

{reduce &^h o&[~],comb 0 x$^n~$^s~$n.fmt("%.{$n-$s.comb%-$n}b"): $n}

Try it online!

Explanation

{
  reduce         # Reduce with
    &^h o&[~],   # composition of string concat and hash function
    comb         # Split string
      0 x$^n     # Zero repeated n times
      ~$^s       # Append input string s
      ~$n.fmt("  # Append n formatted
        %.       # with leading zeroes,
        {$n             # field width n for final chunk
         -$s.comb%-$n}  # -(len(s)%-n) for padding,
        b")      # as binary number
      :          # Method call with colon syntax
      $n         # Split into substrings of length n
}
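The format trick merges two paddings into one: writing $n$ in binary with a minimum of $n$ plus the missing number of digits produces both the zeros that complete the last message chunk and the leading zeros of the final chunk. A Python sketch of the resulting string, with a function name chosen for illustration:

```python
def padded_message(n, s):
    pad = -(len(s) % -n)                      # zeros missing from the last chunk
    tail = format(n, "0{}b".format(n + pad))  # n in binary, at least n + pad digits
    return "0" * n + s + tail


msg = padded_message(5, "Programming Puzzles")
print(msg)                                           # 00000Programming Puzzles000101
print([msg[i:i + 5] for i in range(0, len(msg), 5)])
# ['00000', 'Progr', 'ammin', 'g Puz', 'zles0', '00101']
```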

— nwellnhof

Clean, 143 bytes

import StdEnv
r=['0':r]
$n h s=foldl(\a b=h(a++b))(r%(1,n))([(s++r)%(i,i+n-1)\\i<-[0,n..length s]]++[['0'+toChar((n>>(n-p))rem 2)\\p<-[1..n]]])

Try it online!

— Οurous

Python 2, 126 113 bytes

lambda n,H,s:reduce(lambda x,y:H(x+y),re.findall('.'*n,'0'*n+s+'0'*(n-len(s)%n))+[bin(n)[2:].zfill(n)])
import re

Try it online!

-13 thanks to Triggernometry.

Yes, this is an abomination. Why can't I just use a built-in for splitting a string into chunks...? :-(

— Erik the Outgolfer

codegolf.stackexchange.com/a/173952/55696 A while loop is the best built-in I could hope for. 104 bytes

— Steven H.

@StevenH. Yeah, especially if you're actually focusing on the golfing itself. >_>

— Erik the Outgolfer

'0'*~-n instead of '0'*(len(s)%n) is shorter (and actually correct for shorter inputs).

— nwellnhof

@nwellnhof Yeah, but it's definitely not the same thing.

— Erik the Outgolfer

Maybe I wasn't clear enough. Your solution gives the wrong answer for strings like Programming Puzz (16 characters). Replacing '0'*(len(s)%n) with '0'*~-n fixes that and saves 7 bytes.

— nwellnhof

Python 2, 106 102 bytes

This time, a function beats a lambda. -4 bytes for simple syntax manipulation, thanks to Jo King.

def f(n,H,s):
 x='0'*n;s+='0'*(n-len(s)%n)+bin(n)[2:].zfill(n)
 while s:x=H(x+s[:n]);s=s[n:]
 return x

Try it online!

— Steven H.

Shouldn't the result be "Pe011", not "e011"?

— Triggernometry

It should be. Fixed!

— Steven H.

Use semicolons instead of newlines. -4 bytes

— Jo King

I didn't realize that worked for while loops too, thanks!

— Steven H.

Japt, 27 bytes

òV ú'0 pV¤ùTV)rÈ+Y gOvW}VçT

Try it!

I haven't found any way for Japt to take functions directly as input, so this takes a string which is interpreted as Japt code and expects it to define a function. Specifically, OvW takes the third input and interprets it as Japt, then g calls it. Replacing that with OxW allows input as a Javascript function instead, or, if the function were (somehow) already stored in W, it could just be W and save 2 bytes. The link above has the worked example of $\mathcal{H}$ that takes the characters at odd indexes, while this one is the "multiply the character codes and take the 5 highest digits" example.

Due to the way Japt takes inputs, $s$ will be U, $n$ will be V, and $\mathcal{H}$ will be W.

Explanation:

òV                             Split U into segments of length V
   ú'0                         Right-pad the short segment with "0" to the same length as the others
       p     )                 Add an extra element:
        V¤                       V as a base-2 string
          ùTV                    Left-pad with "0" until it is V digits long
              r                Reduce...
                        VçT          ...Starting with "0" repeated V times...
               È       }                                                  ...By applying:
                +Y               Combine with the previous result
                   gOvW          And run W as Japt code

— Kamil Drakari

GolfScript, 47 bytes

~:π'0'*:s\+s+π/);[sπ2base{48+}%+0π->]+{+1$~}*\;

Try it online!

— Wastl

oK, 41 bytes

{(x#48)(y@,)/(0N,x)#z,,/$((x+x!-#z)#2)\x}

Try it online!

{                                       } /x is n, y is H, z is s.
                          (x+x!-#z)       /number of padding 0's needed + x
                         (         #2)\x  /binary(x) with this length
                      ,/$                 /to string
                    z,                    /append to z
             (0N,x)#                      /split into groups of length x
       (y@,)/                             /foldl of y(concat(left, right))...
 (x#48)                                   /...with "0"*x as the first left string

— zgrep