Имитация NFA


15

Недетерминирован конечный автомат является конечным автоматом , где кортеж отображаются в нескольких штатов. То есть. мы заменяем обычную функцию перехода DFA другой функцией .(state,symbol)δ:Q×ΣQ Δ:Q×ΣP(Q)

Если вы знаете, что такое NFA, вы можете пропустить следующий раздел.

Формальное определение

НФА однозначно описывается

  • Q конечный набор состояний
  • Σ конечный набор символов
  • Δ:Q×ΣP(Q) функция перехода
  • q0Q начальное состояние
  • FQ a set of final states

The machine starts out in q0 and reads a finite string of symbols wΣ, for each symbol it will simultaneously apply the transition function function with a current state and add each new set of states to the set of current states.

Challenge

For this challenge we will ignore F  to simplify it, furthermore the alphabet will always be the (lower-case) letters a  to z  and the set of states will be {0N} for some non-negative integer N. The initial state will always be 0.

Given a word w{az} and a description of the NFA, your task is to determine all the final states.

Example

Consider the string abaab and the following description:

state, symbol, new-states
0, 'a', [1]
1, 'a', [0]
1, 'b', [1,2]

The machine will start in q0=0:

  1. read an a: new states {1}
  2. read a b: new states {1,2}
  3. read an a: new states {0}
  4. read an a: new states {1}
  5. read a b: new states {1,2}

So the final states and thus the output would be {1,2}.

Note: In step (2) the transition of state 2 maps to as the description only includes transitions to non-empty sets.

Rules

The input will consist of a string and some kind of description of the NFA (without ϵ-transitions):

  • the input string will always be element of {az}
  • valid inputs (not limited to):
    • list/array of tuples/lists
    • new-line separated input
  • the description of the NFA will only contain transitions with non-empty sets as result
    • you may abbreviate rules with the same characters if their result is the same (eg. rules 0,'a',[1,2] and 0,'b',[1,2] could be abbreviated with 0,"ab",[1,2]
    • you may take each rule separate (eg. rule 0,'a',[1,2] can be 0,'a',[1] and 0,'a',[2])
  • you may choose upper-case letters if you want
  • you may take the number of states as input
  • you may assume some kind of ordering of the inputs (eg. ordered by state or symbols)

The output will be a list/set/new-line separated output etc. of the final states

  • order doesn't matter
  • no duplicates (as it's a set)

Test cases

These examples will be in the format description word -> states where description is a list of tuples (state,symbol,new-states):

[]  "x" -> []
[]  "" -> [0]
[(0,'a',[1]),(1,'a',[0]),(1,'b',[1,2])]  "abaab" -> [1,2]
[(0,'a',[1]),(1,'a',[0]),(1,'b',[1,2])]  "abc" -> []
[(0,'p',[0,1]),(0,'g',[2]),(1,'c',[1]),(1,'g',[4]),(1,'p',[2]),(2,'c',[0])]  "ppcg" -> [2,4]
[(0,'f',[1]),(1,'o',[1,2]),(2,'b',[3]),(3,'a',[4]),(4,'r',[0,4])]  "foobar" -> [0,4]
[(0,'f',[1]),(1,'o',[1,2]),(2,'b',[3]),(3,'a',[4]),(4,'r',[0,4])]  "fooooooobar" -> [0,4]
[(0,'f',[1]),(1,'o',[1,2]),(2,'b',[3]),(3,'a',[4]),(4,'r',[0,4])]  "fobarfo" -> [1,2]
[(0,'f',[1]),(1,'o',[1,2]),(2,'b',[3]),(3,'a',[4]),(4,'r',[0,4])]  "foobarrf" -> [1]
[(0,'d',[1,2]),(1,'u',[2]),(2,'u',[2,3]),(2,'p',[3]),(3,'p',[3])]  "dup" -> [3]
[(0,'a',[0,2]),(0,'b',[3]),(1,'a',[1]),(1,'b',[1]),(2,'b',[1,4]),(4,'b',[2])]  "aab" -> [3,1,4]
[(0,'a',[0,2]),(0,'b',[3]),(1,'a',[1]),(1,'b',[1]),(2,'b',[1,4]),(4,'b',[2])]  "abb" -> [1,2]


3
This brings back scary memories from my automaton course.
Don Thousand

Can we take input with individual lines for each new state, e.g. this for worked example?
ovs

@ovs: Sure go ahead!
ბიმო

Ответы:


7

Haskell, 66 bytes

import Data.List
f d=foldl(\s c->nub[r|(y,r)<-d,g<-s,(g,c)==y])[0]

Try it online!


You can get rid of the import for nub if you assume the states to be [Int], then you can use check each [0..] which is finite: 60 bytes
ბიმო

@BWO This iterates over all Ints and over all current states and therefore still produces duplicate states. Example (changed [0..] to [0..3] for testing purposes, but this should not make a difference, right?)
ovs

Yeah, not sure what I was thinking.. Nevermind..
ბიმო

4

Brachylog, 42 bytes

,0{hẸ&t|∋₁B∋IhJ&tJ&hhC∧I∋₁C∧It∋S&hb;B,S↰}ᵘ

input as [string, nfa] where nfa is a list of state transitions [ initial state, letter, new states as list ]

Explanation

,0                                              # Append 0 to the input (initial state)
  {                                      }ᵘ     # Find all unique outputs
   h                                            # if first element (string)
    Ẹ                                           #   is empty
     &t                                         #   then: return last element (current state)
       |                                        #   else:
        ∋₁B                                     #       save the state transitions in "B"
           ∋I                                   #       take one of these transitions, save in "I"
             hJ                                 #       take the initial state requirement, store in "J"
               &tJ                              #       make sure "J" is actually the current state
                  &hhC                          #       Save first char of string in C
                      ∧I∋₁C                     #       make sure the char requirement for the state transition is the current char
                           ∧It∋S                #       Make "S" equal to one of the new states
                                &hb             #       Behead the string (remove first char)
                                   ;B,S         #       Add B (the state transitions) and S (the new state)
                                       ↰        #       recur this function

Try it online!


4

Brachylog v2, 31 bytes

{b,Ȯ,Ȯ\c↔,0↔ġ₃kH&hg;Hz{∋ᵈ}ᵐtt}ᵘ

Try it online! (or with a more complex example)

Brachylog is really good at this sort of problem, and really bad at problems that require two separate inputs and an output. Almost all of this program is just plumbing.

Input format is a list containing two elements: the first is the list of state transitions ([oldState, symbol, newState]), and the second is the list of symbols. I originally planned this program to work with character codes for symbols (because Brachylog's string handling can be a bit weird sometimes), but it turns out that characters work too (although you have to write the input string as a list of characters, not as a string). If a state-symbol pair can transition to multiple different states, you write multiple transitions to deal with that.

Explanation

{b,Ȯ,Ȯ\c↔,0↔ġ₃kH&hg;Hz{∋ᵈ}ᵐtt}ᵘ
{                            }ᵘ   Find all distinct outputs that can result from:
 b                                  taking the input minus its first element,
  ,Ȯ                                appending a singleton list (i.e. an element)
    ,Ȯ                              then appending that same element again
      \                             and transposing;
       c                            then concatenating the resulting lists,
        ↔,0↔                        prepending a 0,
            ġ₃                      grouping into blocks of 3 elements
              k                       (and discarding the last, incomplete, block),
               H&                   storing that while we
                 h                  take the first input element,
                  g  z              pair a copy of it with each element of
                   ;H                 the stored value,
                      {  }ᵐ         assert that for each resulting element
                       ∋ᵈ             its first element contains the second,
                        ᵈ ᵐ           returning the list of second elements,
                            t       then taking the last element of
                           t          the last element.

It's probably easier to follow this by looking at what some partial versions of the program would produce. Using the following input each time:

[[[0,97,1],[1,97,0],[1,98,1],[1,98,2]],[97,98,97,97,98]]

we can observe the outputs of some prefixes of this program:

[[[0,97,1],[1,97,0],[1,98,1],[1,98,2]],[97,98,97,97,98]]b,Ȯ,Ȯ
[[97,98,97,97,98],L,L]

[[[0,97,1],[1,97,0],[1,98,1],[1,98,2]],[97,98,97,97,98]]b,Ȯ,Ȯ\
[[97,A,A],[98,B,B],[97,C,C],[97,D,D],[98,E,E]]

[[[0,97,1],[1,97,0],[1,98,1],[1,98,2]],[97,98,97,97,98]]b,Ȯ,Ȯ\c↔,0↔
[0,97,A,A,98,B,B,97,C,C,97,D,D,98,E,E]

[[[0,97,1],[1,97,0],[1,98,1],[1,98,2]],[97,98,97,97,98]]b,Ȯ,Ȯ\c↔,0↔ġ₃k
[[0,97,A],[A,98,B],[B,97,C],[C,97,D],[D,98,E]]

[[[0,97,1],[1,97,0],[1,98,1],[1,98,2]],[97,98,97,97,98]]b,Ȯ,Ȯ\c↔,0↔ġ₃kH&hg;Hz
[[[[0,97,1],[1,97,0],[1,98,1],[1,98,2]],[0,97,A]],
 [[[0,97,1],[1,97,0],[1,98,1],[1,98,2]],[A,98,B]],
 [[[0,97,1],[1,97,0],[1,98,1],[1,98,2]],[B,97,C]],
 [[[0,97,1],[1,97,0],[1,98,1],[1,98,2]],[C,97,D]],
 [[[0,97,1],[1,97,0],[1,98,1],[1,98,2]],[D,98,E]]]

[[[0,97,1],[1,97,0],[1,98,1],[1,98,2]],[97,98,97,97,98]]b,Ȯ,Ȯ\c↔,0↔ġ₃kH&hg;Hz{∋ᵈ}ᵐ
e.g. [[0,97,1],[1,98,1],[1,97,0],[0,97,1],[1,98,1]]

For the first example here, L is initially an unknown element, but when we transpose it via \, Brachylog realises that the only possibility is a list with the same length as the input. The last example here is nondeterministic; we're modelling nondeterminism in the NFA using the nondeterminism in Brachylog itself.

Possible improvements

Some of the syntax here, like ↔,0↔ and especially the mess with H&hg;Hz{…ᵈ}ᵐ, is fairly clunky. It wouldn't surprise me if there were a terser way to phrse this.

{∋ᵈ}ᵐ is in its own right a fairly suspicious structure – you'd expect to just be able to write ∋ᵈᵐ – but it doesn't parse for some reason.


∋ᵈᵐ doesn't parse because it is implemented such that multi-character meta-predicate names could in theory be used (if we ran out of single-symbol possibilities). In practice it's not currently used.
Fatalize

3

Python 3, 103 80 bytes

thanks to @BWO

w=lambda n,f,a={0}:w(n,f[1:],{y for(x,c,y)in n if c==f[0]and{x}&a})if''<f else a

TIO

Previous "elegant" list comprehension(103 bytes):

def w(a,b):
    q=[0]
    for c in b:q=[j for s in q for i in a if s in i if i[1]==c for j in i[2]]
    return q

Shame that Python 3 lacks reduce.. But using recursion and actual sets still brings you down to 80 bytes.
ბიმო

@BWO nice, thanks, haha btw the above is my new favorite example python code to show... one line giant list comprehensions amuse me way more than they should
Quintec

I think you can save 2 bytes by replacing if''<f with if f.
Chas Brown

@Chas Brown that fails if f is a falsy value such as empty string
Quintec

Actually, what am I saying, ignore that
Quintec

3

JavaScript (ES6), 99 bytes

Takes input as (nfa)(string). Returns a Set.

a=>g=([c,...b],s=[0])=>c?g(b,a.reduce((p,[x,y,z])=>s.includes(x)&y==c?[...p,...z]:p,[])):new Set(s)

Try it online!


3

R, 81 bytes

function(a,b,e,s)Reduce(function(A,x)unique(e[a%in%A&b==x]),el(strsplit(s,"")),0)

Try it online!

Straightforward answer using Reduce. Takes rules as three vectors of state, symbol, new-states called a,b,e.

Rules are separate (eg. rule 0,'a',[1,2] is 0,'a',1 and 0,'a',2).



2

Clean, 68 bytes

This one based on ovs's Haskell solution is a bit shorter than my initial approach was.

now includes a test harness

import StdEnv
?d=foldl(\s c=removeDup[r\\(y,r)<-d,g<-s|(g,c)==y])[0]

Try it online!


1
@BWO Test harness added
Οurous

1

Charcoal, 44 bytes

⊞υ⁰Fη«≔υζ≔⟦⟧υFζFθ¿∧⁼§λ⁰κ⁼§λ¹ιF§λ²¿¬№υμ⊞υμ»Iυ

Try it online! Link is to verbose version of code. Explanation:

⊞υ⁰

Push 0 to the predefined empty list to set the inital state to {0}.

Fη«

Loop over the input.

≔υζ

Copy the state.

≔⟦⟧υ

Reset the state.

Fζ

Loop over the copy of the state.

Fθ

Loop over the NFA entries.

¿∧⁼§λ⁰κ⁼§λ¹ι

If the entry matches, then...

F§λ²

... loop over the new states...

¿¬№υμ

.... if they are not already in the list...

⊞υμ»

... add them to the list.

Iυ

Cast the list of states to string for implicit output on separate lines.



1

Japt, 31 bytes

W=[W]c;Ê?ßUÅVVf!øW føUg)mÌc):Wâ

Try it!

Saved 2 bytes with better use of Japt's ability to implicitly form a function out of some inputs

Explanation:

W=[W]c;                            Initialize the state set to [0] on the first run
       Ê?                   :Wâ    If the input is empty return the unique states; else...
             Vf!øW                 Get the transitions valid for one of the current states
                   føUg)           Of those, get the ones valid for the current character
                        mÌc)       Merge the states of the remaining transitions
         ßUÅV                      Repeat with the remaining characters as input

The new "initialize states" code could use a bit more detail. Japt initializes W to 0 if there are fewer than 3 inputs, so on the first run [W] is [0], and c "flattens" an array. [0] is already as flat as it gets, so it isn't changed. On subsequent runs W has a different value, for example [1,2]. In that case [W] becomes [[1,2]], a single-element array where that element is an array. This time c unwraps that and gets back to [1,2]. Thus, on the first run it's W=[0] and on subsequent runs it's W=W.

Используя наш сайт, вы подтверждаете, что прочитали и поняли нашу Политику в отношении файлов cookie и Политику конфиденциальности.
Licensed under cc by-sa 3.0 with attribution required.