Hello everyone! This article is the third chapter on the series “Exploring Automata Theory” that I am publishing on. This article gives you an gentle introduction on what DFA is about. We will also be touching a tiny bit of regular languages.
I won’t be stressing on mathematical proofs here. Any one who is interested in learning them can refer to the references section I have mentioned below. If you have any questions or comments, do post them in the comments section below. Now, let’s get started.
Deterministic Finite State Automata
A deterministic finite automaton M is a 5-tuple, (Q, Σ , δ , q0, F), consisting of a finite set of states Q, a finite set of input symbols called the alphabet Σ, a transition function δ : Q x Σ -> Q, an initial or start state q0( pronounced as q not ) and a set of accept states denoted by F.
Let us understand this with the following examples.
Problem Statement :
Construct a DFA for the language L1 , where L1 is set of all strings that start with ‘0’.
Solution :
Okay, let’s try to solve this question. First, we shall expand and see what L1 actually consists of.
L1 = { 0, 00, 01, 000, 001, 010, 011, 0000, …. }
In order to construct the DFA, we have to identify the elements of the 5-tuple as mentioned in the above definition.
What can be the initial state ?
Well, here the binary string given by the user can be thought of as the starting point. From that you got to predict whether it is present in L1 or not.
Let us denote that initial state by A.Therefore, your q0= A.
What can be the possible outcomes we can obtain ?
Either the string satisfies our condition i.e starts with 0 or fails the condition. Hence only two final states we can have. Let us name them B and C respectively.
Note however that in either of the above cases, once we enter B or C, irrespective of what inputs we give, we stay there forever. What I mean is, no more checking is required. Irrespective of what follows after the first character, we stay in one of these states.
Keeping this in mind, we can come up with the below transition state diagram for the DFA given.
I have mentioned few terms here :
- A denotes the “Start State” i.e it is the initial state in which our machine or model exists before any inputs are given.
- Input here refer to the next characters of the string that are processed except the first character.
- The arrows indicate how the transition occurs when the input specified is supplied.
- B is the state we wish to achieve. Hence it is known as an “Final State”.
- C is the state that is of no interest to us. Also note that once we enter C, there is no way of going back. Hence it is known as “Dead State”.
Problem Statement :
Construct a DFA that accepts strings made up of {0, 1} and are of length 2.
Solution :
First identify your alphabet and the string. Here our alphabet is {0, 1} and the possible strings are { 00, 01, 10, 11}. Second, start guessing on what are the various states.
Let us have our initial state as A. If the inputs are 0 or 1, then we can move further. Let that state be B. Again, if the inputs are 0 or 1, we can move further. Let that state be C.
Are we done ? Well, the answer is NO. What if the string is not of length 2. Then clearly, one more state is required. Let that state be D. In case state C receives inputs, it leads to D, which acts as a dead state or trap state.
The above guesses leads us to the below transition diagram.
Problem Statement :
Construct a DFA that accepts strings made up of {a,b} and that does not contain the string aabb.
Solution :
Hmm…The problem looks quite difficult to solve isn’t it? I mean it is kinda hard to understand or visualize what type of states are possible. Don’t worry.
Whenever you find it difficult to solve a problem, the try solving its opposite version. Reversing the process gives the required problem answer.
So let’s try solving this problem statement instead of the above one.
Revised Problem Statement :
Construct a DFA that accepts strings made up of {a,b} and that contains the string aabb.
Solution :
Well again, first define your alphabet and language. Here they are {a,b} and strings that contain ‘aabb’ respectively.
Let the initial state be A. When you give the input ‘a’ we got to accept and proceed further. Else, what ? Well you just ignore this character and keep checking in the rest of the string if ‘aabb’ is available. That is, you go back to state A and again do the checking process.
Let the next state be B. So again two cases : If the letter is ‘a’, we proceed further. If the letter is ‘b’, you go back to A.
Let the next state be C. If the input is ‘b’, we proceed. Else, we remain in state C ( Why ? ). U do so because as of now the string you have is aaa. You just require two more b’s to proceed further. This can be achieved only if you stay in ‘C’. Returning back to states A or B, will eliminate the strings aaabb kinds.
Let the next state be D. If the input is ‘b’, we proceed. Else, we return to state B. This is because as of now, the strings are aaba. You have to start your checking back from the last letter a i.e taking ‘a’ as the first input. Where does our DFA be when it receives letter ‘a’ as the first input ? The answer is ‘B’.
Finally the state is E. This is our final state. Any inputs will be reverted here.
Phew! WE FOUND THE SOLUTION. But, we are supposed to find the answer to the opposite of this question. Well, it’s pretty simple. Just flip the states in the above diagram i.e all the final states will become non-final states and all the non-final states will become final states.
Our final solution thus is
Regular Language
A language is said to be a regular language if and only if some finite state machine recognizes it.
So what languages are not REGULAR ?
- If they are not recognized by a FSM
- If they require memory
We say FSM has limited memory i.e it stores the state information only. It only knows what state it is currently present in. It cannot store or count strings.
‘ababbababb’ — this cannot be recognized by a FSM since your FSM has to store extra information on what string we have till the present state and look for the repetition.
similarly strings of length A^N and B^N cannot be recognized by a FSM as here we r supposed to store the count of A’s and B’s.
More of this we will discuss in the upcoming articles.
REFERENCES :
John E. Hopcroft, Rajeev Motwani, Jeffrey D. Ullman-Introduction to Automata Theory, Languages, and Computations-Prentice Hall (2006)
If you liked this article, do share it with your friends and don’t forget to give claps. If you did not, then I am happy to hear your thoughts and suggestions. Any type of feedback and questions are most welcome. Do provide them in the comments section below.